[dcdev] Searching
Carl-Adam Brengesjö <[email protected]>
2004-01-14 8:34
[email protected]

A lot of discussion has focused on the search command and its features.
I don't think we need to (or should) make it too complex or give it too many features.

The example I'm going to show is simple: the messages are short and take few arguments. And if you'd like to get a quick overview of the traffic in a telnet client, it's human-readable.

Note that I use IRC-style messages for faster parsing: only one string containing spaces is allowed in each message. It is easily found, since it starts after the first " :" and continues to the end of the message (CRLF).
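
The " :" convention can be illustrated with a few lines of Python. This is only a sketch: the function name split_message is hypothetical, and it assumes that arguments before the " :" are separated by single spaces and that the message may or may not still carry its CRLF.

```python
from typing import List, Optional, Tuple


def split_message(line: str) -> Tuple[List[str], Optional[str]]:
    """Split an IRC-style message into its plain arguments and trailing string."""
    # Drop the CRLF terminator if it is still attached.
    if line.endswith("\r\n"):
        line = line[:-2]
    # Everything after the first " :" is the one string that may contain spaces.
    head, separator, trailing = line.partition(" :")
    arguments = head.split(" ")
    return arguments, (trailing if separator else None)
```

Because only the first " :" is significant, a trailing string that itself contains " :" is passed through untouched.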

Here follows the syntax:

"SEARCH <dest[:port]> <id> <mime> <size|#[type#]hash> :<pattern>\r\n"

The first 3 arguments are a given, no discussion there.
(The destination must be an IP with port for an active search; for a passive search it is the client's GUID or name... whatever will be used.)
The fourth argument has two uses: either a size (range) or a hash. If it's a hash, it must begin with a # char.
Next comes the pattern (which can contain spaces, hence the : to mark where it begins).
The pattern can also have two meanings: either it is a regex (POSIX or Perl, doesn't matter) or it is the exact (or wildcarded) name of the file. Clients can decide, upon submitting, whether to replace spaces with a * to work the way searching currently does.
Telling a regex from a name is simple: no filename on a filesystem can contain directory delimiters, so no one can search for one, and we simply add them at the beginning and end of the regex. And voilà, we still have a valid regex: "/pattern goes here/" :) (Some regex libs can't handle the / /, but the end client can easily strip them.)
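
As a rough sketch of how a receiving client might evaluate the size form of the fourth argument and the pattern, here is some Python. The names size_matches and compile_pattern are hypothetical. Assumptions: sizes are in bytes; besides ">", "<=" and "=" the operators "<" and ">=" are accepted; Python's re module stands in for the regex flavour; and fnmatch is used for wildcards, which also gives "?" and "[...]" a special meaning beyond the "*" mentioned here.

```python
import fnmatch
import operator
import re
from typing import Callable

# Assumed operator set; the message itself does not define one.
SIZE_OPERATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
    "=": operator.eq,
}
SIZE_CONDITION = re.compile(r"(>=|<=|>|<|=)(\d+)")


def size_matches(condition: str, size: int) -> bool:
    """Check a file size in bytes against a condition such as "=15000000"."""
    parsed = SIZE_CONDITION.fullmatch(condition)
    if parsed is None:
        raise ValueError("unsupported size condition: %r" % condition)
    compare = SIZE_OPERATORS[parsed.group(1)]
    return compare(size, int(parsed.group(2)))


def compile_pattern(pattern: str) -> Callable[[str], bool]:
    """Return a predicate that tests a file name against a search pattern."""
    # A pattern wrapped in "/" is a regex, since file names cannot contain "/".
    if len(pattern) >= 2 and pattern.startswith("/") and pattern.endswith("/"):
        regex = re.compile(pattern[1:-1])
        return lambda name: regex.search(name) is not None
    # Anything else is an exact or wildcarded name that must match in full.
    wildcard = re.compile(fnmatch.translate(pattern))
    return lambda name: wildcard.match(name) is not None
```

With this sketch, compile_pattern("/^foo\\.part[0-9]{2,3}\\.rar$/") accepts "foo.part01.rar", while compile_pattern("*bar*") accepts any name containing "bar".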

No need for fancy if-statements and such in the search query. A user mostly just searches for a name anyway and goes through the results manually; that's the quickest and most user-friendly way. OK, it's simple to write an if-statement yourself, but imagine a GUI for that: letting the user build a complex if-statement without actually writing it would require a lot of controls. With my example you only need a single text field. Perhaps you'd like some checkboxes to say whether you want a regex, a wildcard or a plain name, so the syntax can be checked before the request is sent and traffic is saved, but this is optional.
I know we are here to discuss the actual protocol, but what's the point of making a protocol that is not usable?

Also, you want to find something quickly. You don't want to sit 2 minutes configuring criteria when it only takes 30 seconds to scroll through 100+ results.

With if-statement I mean something like this:
"( N =~ avi$ || N =~ ogm$ ) && ( T =~ ^video/ ) && ( S > 450000000 )"

Another thing: using a particular type of hashing. Simply set the hash/size argument to "#MD5#615b3de4b7a572679b457de271305229" to use the MD5 algorithm (of course, this requires a shared list of names of hashing types, but there shouldn't be too many to choose from).
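
Splitting the "#[type#]hash" form could look roughly like this in Python. The function name parse_hash_argument is hypothetical, and returning None when no type is given (leaving the default algorithm to the client) is an assumption, since no default is named here.

```python
from typing import Optional, Tuple


def parse_hash_argument(argument: str) -> Tuple[Optional[str], str]:
    """Split "#[type#]hash" into (hash type or None, hash value)."""
    if not argument.startswith("#"):
        raise ValueError("not a hash argument: %r" % argument)
    body = argument[1:]
    # An optional hash type sits between the leading "#" and a second "#".
    hash_type, separator, digest = body.partition("#")
    if separator:
        return hash_type, digest
    return None, body
```

A client would then look the returned type name up in the shared list of hashing types and reject names it does not know.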

Some examples (the <dest> argument is left out of these):

A file named "foo.mpg" of type video/* and greater than 4 MB (4194304 bytes) in size.
"SEARCH 01 video/* >4194304 :foo.mpg\r\n"

A text file named "*bar*" of at most 3000 bytes in size.
"SEARCH 02 text/* <=3000 :*bar*\r\n"

A part of a WinRAR-packed file named "foo" with an exact size of 15000000 bytes.
"SEARCH 03 text/* =15000000 :/^foo\.part[0-9]{2,3}\.rar$/\r\n"

A text file named "why_dc.txt" with hash "185410fd".
"SEARCH 04 text/plain #185410fd :why_dc.txt\r\n"

/Carl-Adam Brengesjö

PS. New to this list. Hello everyone :)
I'm working with Fabrice on ODCH, if you wondered.

DC Developers mailinglist