File hash identification

Archived discussion about features (predating the use of Bugzilla as a bug and feature tracker)

Wisp
Posts: 218
Joined: 2003-04-01 10:58

File hash identification

Post by Wisp » 2003-11-28 21:32

I know there have been other discussions about this issue, but I couldn't find them right now.

I think the most important feature for dc++ is the identification of files: every file should get a unique code by hashing the file. I think the hash method should be the same as in other known file networks like BitTorrent or eMule; that way dc++ users can also use the 'sharereactor' sites. Hashing has the following advantages:

- the 'automatic search for alternatives' would work much better because the filename is not important
- 'match queue' would work much better for the same reason
- you could add files by entering a URL, just like edonkey
- you can search manually for the hash code instead of browsing and searching through a lot of results to find the right file
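
As a rough illustration of the points above, here is a minimal sketch of matching files by content instead of by name. SHA-1 is only a stand-in for whatever hash dc++ would end up using, and the function names and the example share are hypothetical.

```python
import hashlib
from collections import defaultdict


def file_id(data: bytes) -> str:
    """Return a content-based ID (SHA-1 is a stand-in, not a dc++ choice)."""
    return hashlib.sha1(data).hexdigest()


def group_by_content(shared: dict) -> dict:
    """Map each content ID to every filename that carries that content."""
    index = defaultdict(list)
    for name, data in shared.items():
        index[file_id(data)].append(name)
    return dict(index)


# Hypothetical share: two users gave the same bytes different names.
shared = {
    "holiday_video.avi": b"same bytes",
    "Holiday.Video.2003.avi": b"same bytes",
    "other_file.avi": b"different bytes",
}

index = group_by_content(shared)

# Searching by hash finds both names, whatever they are called.
alternatives = index[file_id(b"same bytes")]
```

A real client would hash files from disk in chunks rather than holding them in memory; the in-memory dictionary just keeps the sketch short.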

The hash ID should be included in the search results, file lists, and the download queue.

So I was wondering if this feature is still being developed, and what its priority is. I think this is by far the most important feature dc++ needs, and (although I'm not a C programmer) I don't think it's a very difficult feature to make. It needs some protocol changes, but I don't think users would mind.

:)

sandos
Posts: 186
Joined: 2003-01-05 10:16

Re: File hash identification

Post by sandos » 2003-11-28 23:23

Wisp wrote:I think the hash method should be the same as in other known file networks like BitTorrent or eMule; that way dc++ users can also use the 'sharereactor' sites.
This is good thinking; too bad edonkey and bittorrent use different hashes. There are almost as many hashes as there are p2p networks out there (md4, md5, SHA1, TTH, sig2dat, uuhash, various truncated variants).

The hashes that DC will probably end up with are TTH (Tiger Tree Hash), which are found in bitprints (http://www.bitzi.com/ lists a few apps that use them). I only know of two sites publishing magnets with bitprints in them:

http://www.sharelive.com
http://www.peerweb.org
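
For readers who have not seen such a link: a bitprint magnet carries a Base32 SHA-1 and a Base32 Tiger Tree root joined by a dot, in the form `urn:bitprint:<sha1>.<tth>`. The sketch below pulls the two apart. The function name is hypothetical, and the hashes in the example link are placeholders of the right length, not real file hashes.

```python
from urllib.parse import parse_qs


def parse_bitprint_magnet(link: str) -> dict:
    """Extract the display name, SHA-1 and TTH parts from a magnet link."""
    scheme, _, query = link.partition("?")
    if scheme != "magnet:":
        raise ValueError("not a magnet link")
    params = parse_qs(query)
    result = {"name": params.get("dn", [None])[0], "sha1": None, "tth": None}
    prefix = "urn:bitprint:"
    for xt in params.get("xt", []):
        if xt.lower().startswith(prefix):
            # Bitprint layout: <Base32 SHA-1>.<Base32 Tiger Tree root>
            sha1, _, tth = xt[len(prefix):].partition(".")
            result["sha1"], result["tth"] = sha1, tth
    return result


# Placeholder hashes (32 and 39 Base32 characters), for illustration only.
example = "magnet:?xt=urn:bitprint:" + "A" * 32 + "." + "B" * 39 + "&dn=example.bin"
info = parse_bitprint_magnet(example)
```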

About the hashtree itself:

http://open-content.net/specs/draft-jch ... ex-02.html
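
A minimal sketch of the tree construction described in that draft, as far as I understand it: the file is cut into 1024-byte blocks, each leaf is the hash of a 0x00 byte plus the block, each inner node is the hash of a 0x01 byte plus its two children, and an unpaired node moves up unchanged. Real TTH uses the Tiger hash; Python's hashlib does not ship Tiger, so SHA-1 stands in here and the resulting values are NOT real TTH roots.

```python
import hashlib

BLOCK_SIZE = 1024  # leaf block size used by the hash tree draft


def _h(data: bytes) -> bytes:
    # Stand-in hash: real TTH uses Tiger, which hashlib does not provide.
    return hashlib.sha1(data).digest()


def tree_root(data: bytes) -> bytes:
    """Return the root of a hash tree built over 1024-byte blocks."""
    # An empty input is treated as a single empty block.
    blocks = [data[i:i + BLOCK_SIZE]
              for i in range(0, len(data), BLOCK_SIZE)] or [b""]
    # Leaf nodes are prefixed with 0x00 before hashing.
    level = [_h(b"\x00" + block) for block in blocks]
    while len(level) > 1:
        nxt = []
        # Inner nodes hash 0x01 plus the two child hashes.
        for i in range(0, len(level) - 1, 2):
            nxt.append(_h(b"\x01" + level[i] + level[i + 1]))
        if len(level) % 2:
            nxt.append(level[-1])  # unpaired node is promoted unchanged
        level = nxt
    return level[0]


root = tree_root(b"x" * 3000)  # three blocks: 1024 + 1024 + 952 bytes
```

The point of the tree over a flat hash is that a client can verify one block at a time against the inner nodes, instead of needing the whole file before it can check anything.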

You might be able to get a client supporting this here:

http://wza.digitalbrains.com/DC/BCDCpp/

Wisp
Posts: 218
Joined: 2003-04-01 10:58

Re: File hash identification

Post by Wisp » 2003-11-29 05:28

sandos wrote:The hashes that DC probably will end up with are TTH (TigerTree Hash) which are found in bitprints (http://www.bitzi.com/, lists a few apps that use them), and I only know of two sites publishing magnets with bitprints in them:

http://www.sharelive.com
http://www.peerweb.org

About the hashtree itself:

http://open-content.net/specs/draft-jch ... ex-02.html

You might be able to get a client supporting this here:

http://wza.digitalbrains.com/DC/BCDCpp/
Hmm, why can't DC use the same hash as the eDonkey network? eMule is also open source.

deesee++
Posts: 9
Joined: 2003-11-01 06:17

Post by deesee++ » 2003-11-29 08:03

I know there have been other discussion about this issue, but I couldn't find them right now.
Yeah, sometimes they aren't available "right now".. you have to search between 2:37am - 7:28am EST, only on Tuesday and Sunday.........yeah....

Hashing is being worked on..
After hashing will come multi source downloads..
and sharereactor style websites too..
then what will you have...?

Just another kazaa, emule, torrent, _____,.. whatever.
Only you have hubowners getting sued/shutdown.
You have filelists for 'the man' to see just what you are illegally sharing (they don't post home movies on sharereactor, do they?).
Plus with hashing it makes it very simple to verify that the files you are sharing are indeed illegal.

dc++ is NOT like all the other p2p apps
and THAT is dc's most important feature...

Wisp
Posts: 218
Joined: 2003-04-01 10:58

Post by Wisp » 2003-11-29 10:50

deesee++ wrote:
I know there have been other discussion about this issue, but I couldn't find them right now.
Yeah, sometimes they aren't available "right now".. you have to search between 2:37am - 7:28am EST, only on Tuesday and Sunday.........yeah....
you're funny :?
Hashing is being worked on..
After hashing will come multi source downloads..
and sharereactor style websites too..
then what will you have...?

Just another kazaa, emule, torrent, _____,.. whatever.
Only you have hubowners getting sued/shutdown.
You have filelists for 'the man' to see just what you are illegally sharing (they don't post home movies on sharereactor, do they?).
Plus with hashing it makes it very simple to verify that the files you are sharing are indeed illegal.

dc++ is NOT like all the other p2p apps
and THAT is dc's most important feature...
I don't want multisource downloading. File hashing is just something that dc++ needs. It's not an 'extra' feature; it would be an improvement of the basic dc++ functions like searching, resuming, etc.

Twink
Posts: 436
Joined: 2003-03-31 23:31
Location: New Zealand

Post by Twink » 2003-11-29 18:09

hmmm i can see this post going nowhere.... cologic is working on hashing for BCDC. If this feature works well then he might submit the code to arnie; until then you can use BCDC to find problems with the hashing and maybe speed the process up.

GargoyleMT
DC++ Contributor
Posts: 3212
Joined: 2003-01-07 21:46
Location: .pa.us

Post by GargoyleMT » 2003-12-01 14:12

Shareaza uses the ed2k hashing only to supplement its TTH hashing... Any well thought out DC client should do the same.

However, if the release sites Sandos mentions are good, there's no need to ever add the ed2k hashes, except for legacy support.

Wisp
Posts: 218
Joined: 2003-04-01 10:58

Post by Wisp » 2003-12-02 12:03

GargoyleMT wrote:Shareaza uses the ed2k hashing only to supplement its TTH hashing... Any well thought out DC client should do the same.

However, if the release sites Sandos mentions are good, there's no need to ever add the ed2k hashes, except for legacy support.
importing ed2k hashes isn't essential, but it would be a nice extra feature, dc users could also use sharereactor sites and exchange hashes

the important thing is that dc++ has hashes so it can identify files better, even if they have a different name.

But I'm glad it is being worked on, keep up the good work ;)
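
For the ed2k import idea mentioned above, a minimal sketch of reading an ed2k file link, which has the form `ed2k://|file|<name>|<size>|<MD4 hex>|/`. The function name is hypothetical and the hash in the example is a placeholder, not a real file hash.

```python
from urllib.parse import unquote


def parse_ed2k_link(link: str) -> dict:
    """Split an ed2k file link into name, size and MD4-based hash."""
    prefix = "ed2k://|file|"
    if not link.startswith(prefix):
        raise ValueError("not an ed2k file link")
    fields = link[len(prefix):].split("|")
    if len(fields) < 3:
        raise ValueError("ed2k link is missing fields")
    name, size, md4_hex = fields[0], fields[1], fields[2]
    if len(md4_hex) != 32:
        raise ValueError("ed2k hash must be 32 hex characters")
    int(md4_hex, 16)  # raises ValueError if the hash is not valid hex
    return {"name": unquote(name), "size": int(size), "md4": md4_hex.lower()}


# Placeholder hash, for illustration only.
link = "ed2k://|file|example.bin|1048576|0123456789ABCDEF0123456789ABCDEF|/"
entry = parse_ed2k_link(link)
```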

Wisp
Posts: 218
Joined: 2003-04-01 10:58

Post by Wisp » 2003-12-02 12:04

btw, why is the "edit" button removed at the forum?

GargoyleMT
DC++ Contributor
Posts: 3212
Joined: 2003-01-07 21:46
Location: .pa.us

Post by GargoyleMT » 2003-12-02 13:38

Actually, cologic has brought up a lot of research papers showing that collisions for MD4 (which is what ed2k links are based on) are pretty easy to make. It's better than the kazaa hash, but if kazaa dies, the **AA can construct fake files for ed2k hashes without too much trouble... (so I'm not convinced that it is a good idea at all)


Regular users have always only been able to edit their posts in Help/Support and Programmer's Help...

joakim_tosteberg
Forum Moderator
Posts: 587
Joined: 2003-05-07 02:38
Location: Sweden, Linkoping

Post by joakim_tosteberg » 2003-12-02 13:39

Wisp wrote:btw, why is the "edit" button removed at the forum?
You can edit your posts, but only in "help and support" and "programmers help"
The reason is that arne has configured it like that :wink:

sandos
Posts: 186
Joined: 2003-01-05 10:16

Post by sandos » 2003-12-03 17:25

I actually found another site with magnets+bitprints:

http://gamephilez.us/

Not a very big one, but..

clowne
Posts: 1
Joined: 2003-12-22 14:35

Post by clowne » 2003-12-25 01:28

agree, hashing is the most important feature dc++ can implement
