[dcdev] New feats...
"Jacek Sieka" <[email protected]>
2004-02-16 11:48
"'Direct Connect developers'" <[email protected]>

Ok, it's time for the dc world to take a step up on the ladder, and since I
happen to own the biggest client I guess if I don't do it, it won't have any

The next dc++ will support a few new things, and if you have any comments on
these, this is a good time to speak up...

1) Hashing. I've chosen TTH, as it has a few nice properties, the most
important being the ability to check individual file parts. The choice of
Tiger as hash function for the Merkle trees is quite arbitrary, but was
made because bitzi and bcdc++ already use it, and I don't think the world
needs another hash type...SHA1 could have been used to save a few bytes and
cpu cycles, but the difference is silly since hashing speed is capped by
shitty disk drive speed anyway. The only thing speaking against Tiger is
that SHA1 has been researched for weaknesses much more thoroughly, and if
anyone knows of anything that would compromise Tiger (its security, not its
speed), speak up now citing the source. Searches are done using type 9, with
TTH:xxxx as prefix to the search string. Hash support is sent to the hub
using $Supports, so that hub owners can nuke clients that don't support TTH
/ have it turned off...no use hashing your own files unless everyone else
does it as well...
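
To illustrate why a hash tree makes it possible to check individual file
parts, here is a rough sketch, not the dc++ code. Assumptions on my part:
1024-byte leaf blocks and the 0x00/0x01 leaf/node prefixes (as I understand
the THEX draft), and SHA-1 standing in for Tiger, because Python's hashlib
does not ship Tiger. All function names are made up for the example.

```python
import hashlib

LEAF_SIZE = 1024  # assumed leaf block size


def _h(data: bytes) -> bytes:
    # Stand-in hash: a real TTH would use Tiger here, not SHA-1.
    return hashlib.sha1(data).digest()


def leaf_hashes(data: bytes) -> list:
    # One hash per LEAF_SIZE block, each prefixed with 0x00 (assumed).
    if not data:
        return [_h(b"\x00")]
    return [_h(b"\x00" + data[i:i + LEAF_SIZE])
            for i in range(0, len(data), LEAF_SIZE)]


def build_levels(leaves: list) -> list:
    # levels[0] is the leaf row, levels[-1] holds only the root.
    levels = [leaves]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        nxt = []
        for i in range(0, len(prev), 2):
            if i + 1 < len(prev):
                # Internal node: hash of 0x01 + left + right (assumed).
                nxt.append(_h(b"\x01" + prev[i] + prev[i + 1]))
            else:
                # A node without a sibling is promoted unchanged.
                nxt.append(prev[i])
        levels.append(nxt)
    return levels


def tree_root(levels: list) -> bytes:
    return levels[-1][0]


def find_bad_leaves(data: bytes, trusted_leaves: list) -> list:
    # Indices of blocks whose hash differs from the trusted leaf row.
    # Assumes data has the same number of blocks as the trusted row.
    pairs = zip(leaf_hashes(data), trusted_leaves)
    return [i for i, (got, want) in enumerate(pairs) if got != want]
```

With the leaf row from a trusted source (verified against the root), a
downloader only has to refetch the blocks that find_bad_leaves reports
instead of the whole file.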

2) XML File lists. The file list of my documents folder grew ~3.5% (55k
vs 57k) compared to the nmdc style file list, both compressed with
bz2...adding hashes to it will obviously grow it more. You can argue all
you want about binary formats and whatnot, I don't care.
Also, dc++ will not offer uncompressed xml lists (at least not in the first
test release), as these can easily grow to silly big files (my 55k list with
8400 files is ~355k uncompressed).
The format is more or less what I wrote earlier, but without the columns (I
don't feel like defining data types for the columns at the moment...version
2 perhaps...)
The file list contains TTH roots for all files that have been hashed so far.
The file lists are encoded in utf-8, which also means new $get commands
that take utf-8 filenames (yes, this is unfortunately necessary). You'll
find the specs in extensions.txt or the wiki "soon", but basically they're
called $UGetBlock and $UGetZBlock and have the same parameters as
$GetZBlock...DC++ will not accept any other encoding for the xml file
list.
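
A small sketch of what producing and reading such a list could look like:
utf-8 XML, bz2 compressed, with a TTH attribute only on files that have
already been hashed. The element and attribute names (FileListing, File,
Name, Size, TTH) and the function names are placeholders I made up for
illustration; the real format is whatever ends up in the spec.

```python
import bz2
import xml.etree.ElementTree as ET


def build_file_list(files) -> bytes:
    # files: iterable of (name, size, tth_or_None) tuples.
    # Element/attribute names are hypothetical, not the real format.
    root = ET.Element("FileListing")
    for name, size, tth in files:
        attrs = {"Name": name, "Size": str(size)}
        if tth is not None:
            # Only files that have been hashed carry a TTH root.
            attrs["TTH"] = tth
        ET.SubElement(root, "File", attrs)
    # Serialise as utf-8 bytes with an XML declaration (Python 3.8+).
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def compress_file_list(xml_bytes: bytes) -> bytes:
    return bz2.compress(xml_bytes)


def read_file_list(blob: bytes) -> ET.Element:
    xml_bytes = bz2.decompress(blob)
    # Reject anything that is not valid utf-8 (raises UnicodeDecodeError).
    xml_bytes.decode("utf-8")
    return ET.fromstring(xml_bytes)
```

Non-ASCII filenames survive the round trip unchanged, which is the point of
fixing the encoding to utf-8.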

3) $GetZBlock changed slightly to accommodate $UGetBlock and
$UGetZBlock. DC++ is also leaving the $GetTestZBlock phase, so the new
version is final.
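
For illustration only, here is how a client might put together one of the
new requests. I am assuming the $GetZBlock parameter order is
<start> <numbytes> <filename> and that commands end with "|"; check
extensions.txt for the authoritative definition. The function name is
hypothetical.

```python
def build_block_request(command: str, start: int, num_bytes: int,
                        filename: str) -> bytes:
    # Assumed layout: "<command> <start> <numbytes> <filename>|"
    if command not in ("$UGetBlock", "$UGetZBlock"):
        raise ValueError("unsupported command: " + command)
    if "|" in filename:
        # "|" would terminate the command early.
        raise ValueError("filename must not contain '|'")
    line = "{} {} {} {}|".format(command, start, num_bytes, filename)
    # The U-variants carry the filename as utf-8 on the wire.
    return line.encode("utf-8")
```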


PS: On a side note, dc++ finished hashing the 8400 files (~4 GB on a 5400rpm
notebook drive) when I was halfway through this mail, so I don't see any
particular speed issues either...the file list grew to 246k compressed and
770k uncompressed...

DC Developers mailinglist