About hashing

Archived discussion about features (predating the use of Bugzilla as a bug and feature tracker)

Franky
Posts: 1
Joined: 2003-07-09 03:20

About hashing

Post by Franky » 2003-07-09 03:38

Just a little thought about the whole hashing thing. Why not just add the same hashing method used in the edonkey network? eMule is open source, so it shouldn't be very difficult to implement, and the big advantage is of course that you could use ed2k-links to find files on the DC network. This is actually the main reason I stick with edonkey: the ability to be absolutely sure *before I even start the download* that the file is what it looks like. In many cases I can get a lot of information about it beforehand too, e.g. for a movie the resolution, codecs used, language etc.

sandos
Posts: 186
Joined: 2003-01-05 10:16

Re: About hashing

Post by sandos » 2003-07-09 03:53

Franky wrote: Just a little thought about the whole hashing thing. Why not just add the same hashing method used in the edonkey network? eMule is open source, so it shouldn't be very difficult to implement, and the big advantage is of course that you could use ed2k-links to find files on the DC network. This is actually the main reason I stick with edonkey: the ability to be absolutely sure *before I even start the download* that the file is what it looks like. In many cases I can get a lot of information about it beforehand too, e.g. for a movie the resolution, codecs used, language etc.
Hashing has been extensively discussed, and I think the general consensus is that we should use something a bit more secure than ed2k hashes. Bitzi.com can also be used to some extent to convert between hashes. BCDC already has tree hashes implemented; it's mostly a matter of finishing it up and somehow making people use hashing.

GargoyleMT
DC++ Contributor
Posts: 3212
Joined: 2003-01-07 21:46
Location: .pa.us

Post by GargoyleMT » 2003-07-09 06:49

Eventually, MD4 hashes might well be supported, as well as SHA1 magnet links (for Gnutella, etc.). Making use of the many web-based hash release sites has crossed my mind, and the minds of the others who've talked about it (including sandos, I believe :)). However, the first step is just to get a good form of hashing - and the tiger tree hashes in BCDC's code are the best.

ender
Posts: 224
Joined: 2003-01-03 17:47

Post by ender » 2003-07-09 09:43

While the tiger tree hashes might be the best, the disadvantage is that there aren't any hash databases available on the 'net for them... OTOH, there are some extensive ed2k MD4 databases out there...

Besides, how much practical advantage does tigertree have over MD4? Unless the hashing process is significantly faster than MD4, I don't really see any... I've had 1 ed2k hash collision in my share (30000 files), and it was between a 74 kB JPEG and a 320 MB AVI...

GargoyleMT
DC++ Contributor
Posts: 3212
Joined: 2003-01-07 21:46
Location: .pa.us

Post by GargoyleMT » 2003-07-09 10:34

Well, the real importance of TTH is that you can control the resolution at which you verify segments of a file. I'm sure you're as tired as I am of finding a corrupt 9 MB segment of an eDonkey downloaded file, and having to fetch the whole block again.
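
To illustrate the resolution point, here is a minimal sketch of a hash tree, not BCDC's actual code. It assumes THEX-style conventions (1024-byte leaves, a 0x00 prefix for leaf hashes, a 0x01 prefix for inner nodes, unpaired nodes promoted unchanged) and uses SHA-256 as a stand-in because Python's hashlib does not ship Tiger. The names `build_levels` and `corrupt_segments` are made up for this example.

```python
import hashlib

LEAF_SIZE = 1024  # bytes per leaf block; an assumption for this sketch


def _h(data: bytes) -> bytes:
    # SHA-256 stands in for Tiger, which hashlib does not provide.
    return hashlib.sha256(data).digest()


def build_levels(data: bytes) -> list[list[bytes]]:
    """Return the tree levels, from leaf hashes (index 0) up to the root (last)."""
    blocks = [data[i:i + LEAF_SIZE] for i in range(0, len(data), LEAF_SIZE)] or [b""]
    level = [_h(b"\x00" + block) for block in blocks]  # 0x00 marks a leaf
    levels = [level]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                nxt.append(_h(b"\x01" + level[i] + level[i + 1]))  # 0x01 marks an inner node
            else:
                nxt.append(level[i])  # unpaired node is promoted unchanged
        level = nxt
        levels.append(level)
    return levels


def corrupt_segments(good_levels, received: bytes, depth: int):
    """Compare trees `depth` levels above the leaves.

    Returns the (start, end) byte ranges whose hashes differ. A larger
    depth means fewer hashes to exchange but bigger segments to re-fetch.
    """
    got_levels = build_levels(received)
    seg = LEAF_SIZE * (2 ** depth)
    return [
        (i * seg, min((i + 1) * seg, len(received)))
        for i, (a, b) in enumerate(zip(good_levels[depth], got_levels[depth]))
        if a != b
    ]


if __name__ == "__main__":
    original = bytes(range(256)) * 64      # 16 KiB of sample data, 16 leaves
    damaged = bytearray(original)
    damaged[5000] ^= 0xFF                  # flip one byte
    tree = build_levels(original)
    print(corrupt_segments(tree, bytes(damaged), 0))  # [(4096, 5120)]
    print(corrupt_segments(tree, bytes(damaged), 2))  # [(4096, 8192)]
```

The same single-byte error costs a 1 KiB re-download at depth 0 and a 4 KiB one at depth 2, while the root hash stays the same in both cases.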

Once hashing is in place, those with the full file can calculate both the SHA1 (magnet) and MD4 (edonkey) links for it. If you tie that together with a search-by-type-of-hash and client to client hash exchange command, you could convert the ed2k hash to SHA1/root TTH to do a search on DC for more sources... Bitzi, unfortunately, only keys on SHA1 and TTH, or SHA1 alone. Otherwise, they'd be a great hash converter - for known files, of course. ;)
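
A rough sketch of what building those links from a complete file could look like. It assumes the common `magnet:?xt=urn:sha1:<base32>` and `ed2k://|file|<name>|<size>|<md4 hex>|/` link shapes. The function names are invented for illustration, and the MD4 value is passed in rather than computed, since MD4 is not reliably available through Python's hashlib.

```python
import base64
import hashlib
from urllib.parse import quote


def sha1_base32(data: bytes) -> str:
    """SHA1 digest in Base32, the form used in magnet links (32 characters)."""
    return base64.b32encode(hashlib.sha1(data).digest()).decode("ascii")


def magnet_link(name: str, data: bytes) -> str:
    # xt = exact topic (the hash), dn = display name
    return f"magnet:?xt=urn:sha1:{sha1_base32(data)}&dn={quote(name)}"


def ed2k_link(name: str, size: int, md4_hex: str) -> str:
    # md4_hex is assumed to be the ed2k hash computed elsewhere.
    return f"ed2k://|file|{quote(name)}|{size}|{md4_hex}|/"


# A toy "hash converter": once a client knows all hashes of a file it holds,
# it can answer a lookup by one hash with the others (hypothetical structure).
def make_index(entries):
    return {e["ed2k"]: e for e in entries}


if __name__ == "__main__":
    data = b""  # an empty file keeps the example tiny
    md4_of_empty = "31d6cfe0d16ae931b73c59d7e0c089c0"  # MD4 of zero bytes
    print(magnet_link("empty.bin", data))
    print(ed2k_link("empty.bin", len(data), md4_of_empty))
    index = make_index([{"ed2k": md4_of_empty, "sha1": sha1_base32(data)}])
    print(index[md4_of_empty]["sha1"])  # ed2k hash -> SHA1 for a DC search
```

The lookup only works for files somebody has already hashed both ways, which is the same "known files" limit mentioned above.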

cologic
Programmer
Posts: 337
Joined: 2003-01-06 13:32

Post by cologic » 2003-07-09 10:43

Bitzi maintains such a database.

As far as practical advantages go, using a Merkle hash tree (of whatever hash algorithm, e.g. Tiger) ensures that a network doesn't lock itself into a fragment size, as eDonkey/eMule has done.

sandos
Posts: 186
Joined: 2003-01-05 10:16

Post by sandos » 2003-07-10 01:15

There is also http://www.sharelive.com (sharelive seems to be down atm) and http://www.peerweb.org. These both use Magnets with bitprints. It's possible to discard the SHA1 and search using only the TTH, if that's what gets implemented. These sites aren't that big, but with more applications supporting Magnets, they could become bigger.

volkris
Posts: 121
Joined: 2003-02-02 18:07

Post by volkris » 2003-07-14 15:13

The HUGE benefit to tree hashing methods is that clients can share blocks from incomplete transfers.
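
The reason this works is that a single block can be checked against the trusted root hash using only a handful of sibling hashes, so a downloader does not need the whole file to trust a block from a partial source. A minimal sketch, assuming THEX-style prefixes and SHA-256 standing in for Tiger (all function names here are made up for the example):

```python
import hashlib


def leaf_hash(block: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + block).digest()  # SHA-256 as a Tiger stand-in


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def tree_levels(blocks):
    """Levels of the tree, from leaf hashes up to the single root hash."""
    level = [leaf_hash(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        level = [
            node_hash(level[i], level[i + 1]) if i + 1 < len(level) else level[i]
            for i in range(0, len(level), 2)  # an unpaired node is promoted as-is
        ]
        levels.append(level)
    return levels


def proof_for(levels, index):
    """Sibling hashes needed to recompute the root from block `index`."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling < len(level):  # no sibling means the node was promoted
            proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify_block(block: bytes, proof, root: bytes) -> bool:
    """Check one block against the root without having the rest of the file."""
    h = leaf_hash(block)
    for sibling, sibling_is_left in proof:
        h = node_hash(sibling, h) if sibling_is_left else node_hash(h, sibling)
    return h == root


if __name__ == "__main__":
    blocks = [bytes([i]) * 1024 for i in range(5)]
    levels = tree_levels(blocks)
    root = levels[-1][0]
    proof = proof_for(levels, 2)
    print(verify_block(blocks[2], proof, root))       # True
    print(verify_block(b"garbage", proof, root))      # False
```

A client holding only block 2 plus its proof can serve it, and the receiver can accept or reject it on the spot.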

Wisp
Posts: 218
Joined: 2003-04-01 10:58

Post by Wisp » 2003-07-14 15:21

I don't like hashing.. if you wanna use that method, why not stick with edonkey? it's an excellent program

but edonkey-links are very useful, i would like them too, it would be great if the same file identification system would be implemented (can't be too hard?)

anyway, i used edonkey for a while and i hated it, it's always downloading with 0,14 or something, after 2 weeks it only had downloaded about 20mb :(

so i stick with dc++ :)

ender
Posts: 224
Joined: 2003-01-03 17:47

Post by ender » 2003-07-14 16:06

Hashing is an absolute must: I've had just too many corrupted transfers with DC, so lately I'm downloading only off xmule (linux port of emule; constantly downloads around 60 kB/s for me) - I only use DC if I need a missing episode quickly (==I'm impatient), when I don't care if I get some corruption - I don't keep these files though, they're deleted as soon as xmule's download completes.

GargoyleMT
DC++ Contributor
Posts: 3212
Joined: 2003-01-07 21:46
Location: .pa.us

Post by GargoyleMT » 2003-07-19 13:19

Wisp wrote: I don't like hashing.. if you wanna use that method, why not stick with edonkey? it's an excellent program
Hashing will come to DC. And DC will be much better for it.
Wisp wrote: but edonkey-links are very useful, i would like them too, it would be great if the same file identification system would be implemented (can't be too hard?)
You might want to re-read the above posts... this was mentioned. I think an ideal solution would be one that allows direct file links (such as the ed2kfile and magnet ones) to be used to start/search for downloads in DC++.
