Unified TTH

SubversiveAgent
Posts: 53
Joined: 2006-03-27 06:11

Post by SubversiveAgent » 2006-03-30 10:19

I've been dabbling with other programs lately and it seems to me that there's no reason why a unified TTH can't be created (apart from the "this is my little castle, you can't play with my little castle").

Shareaza offers two distinct IDs for every file, for example (the ed2k network ID and another one I forget).

Any chance we can all get on the same page, create a unified TTH and simplify cross-client sharing?
I want your slot.

cologic
Programmer
Posts: 337
Joined: 2003-01-06 13:32

Post by cologic » 2006-03-30 11:29

What's a unified TTH?

Gnutella uses compatible hashes, in any case, and Bitzi indexes them.

SubversiveAgent
Posts: 53
Joined: 2006-03-27 06:11

Post by SubversiveAgent » 2006-03-30 11:50

Well I don't know the appropriate technical term, but unified TTH sounded good.

The unique code that (most) files have on DC++ is a TTH, right? Or a hash number? Something like that.

If the same "hash generator" was used to generate a hash code across all the networks, then every file would have a unique but identical code in different networks. And that would simplify spreading the file around. I think.

cologic
Programmer
Posts: 337
Joined: 2003-01-06 13:32

Post by cologic » 2006-03-30 11:51

Did you read my whole post?

SubversiveAgent
Posts: 53
Joined: 2006-03-27 06:11

Post by SubversiveAgent » 2006-03-30 12:22

Finally got around to trying to understand what Bitzi is... But the magnet string for file X doesn't look anything like the TTH string I see on a TTH search, hence my confusion :(

Oops, I figured it out.. my bad :)

xt=urn:sha1:PTMEVHYUMYJ22X46WZLECONNW3JOFW4K&
xt=urn:kzhash:4da29bc8709ff396336f343985d0e2d7ff1fffff37e20d415b0486a489b247d3c20458d4&
xt=urn:tree:tiger:GF5O4AT3DZ7VIOYP4W2XVH3LZTOGZJS66AW4ZLY&
xt=urn:ed2k:ce01e5b2829de20e336a75fe181dcd11&
xl=57344&dn=bitcollider.exe

Tree tiger's for DC++, ed2k's for edonkey.. what are the others for?

ullner
Forum Moderator
Posts: 333
Joined: 2004-09-10 11:00

Post by ullner » 2006-03-30 12:48

They are different hashes.

The first is SHA-1, one of the most used hashing algorithms around. The second is from the Kazaa and/or FastTrack network, if I remember correctly. The last seems to be just the name of the file. :)

'Tree Tiger' is not explicitly for DC++. DC++ can handle tree:tiger:, bitprint:, tree:tiger/: and tree:tiger/1024:
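
For anyone who wants to poke at this, here is a rough Python sketch of pulling the TTH out of a magnet link like the one quoted above. It is only an illustration, not how DC++ does it: the `magnet:?` prefix, the joined-up `MAGNET` string and the helper names `extract_tth` and `b32_decode` are assumptions made for the example. The prefixes are the ones listed in this post, and a `bitprint` is taken to be a SHA-1 and a TTH joined by a dot.

```python
import base64
from urllib.parse import parse_qs

# The parameters quoted earlier in the thread, joined into one link.
# The "magnet:?" prefix is assumed; the post only showed the parameters.
MAGNET = (
    "magnet:?xt=urn:sha1:PTMEVHYUMYJ22X46WZLECONNW3JOFW4K"
    "&xt=urn:kzhash:4da29bc8709ff396336f343985d0e2d7ff1fffff37e20d415b0486a489b247d3c20458d4"
    "&xt=urn:tree:tiger:GF5O4AT3DZ7VIOYP4W2XVH3LZTOGZJS66AW4ZLY"
    "&xt=urn:ed2k:ce01e5b2829de20e336a75fe181dcd11"
    "&xl=57344&dn=bitcollider.exe"
)

# URN prefixes that carry a bare Tiger tree root.
TTH_PREFIXES = ("urn:tree:tiger:", "urn:tree:tiger/:", "urn:tree:tiger/1024:")
BITPRINT_PREFIX = "urn:bitprint:"


def extract_tth(magnet):
    """Return the base32 TTH found in a magnet link, or None if there is none."""
    query = magnet.split("?", 1)[1]
    for xt in parse_qs(query).get("xt", []):
        for prefix in TTH_PREFIXES:
            if xt.startswith(prefix):
                return xt[len(prefix):]
        if xt.startswith(BITPRINT_PREFIX):
            # A bitprint is "<SHA-1>.<TTH>"; the TTH is the part after the dot.
            return xt[len(BITPRINT_PREFIX):].split(".", 1)[1]
    return None


def b32_decode(text):
    """Decode the unpadded base32 used in magnet links into raw bytes."""
    padding = "=" * (-len(text) % 8)
    return base64.b32decode(text.upper() + padding)


tth = extract_tth(MAGNET)
print(tth, len(b32_decode(tth)), "bytes")
```

The 39-character TTH decodes to 24 bytes (Tiger is a 192-bit hash), while the 32-character SHA-1 value decodes to 20 bytes, which is a quick way to tell the two apart by eye.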

SubversiveAgent
Posts: 53
Joined: 2006-03-27 06:11

Post by SubversiveAgent » 2006-03-30 13:31

Interesting.. now to figure out how to make it work for me

ullner
Forum Moderator
Posts: 333
Joined: 2004-09-10 11:00

Post by ullner » 2006-03-30 13:36

Make *what* work?

Todi
Forum Moderator
Posts: 699
Joined: 2003-03-04 12:16

Post by Todi » 2006-03-31 01:55

Magnets are made to be unified; I think that's the whole point. I.e., you can post a magnet link, and many different programs can react to it, depending on what hash it contains. However, since most filesharing programs have chosen to implement their own hashes, which are of course not compatible with others, you can't use just one link for every network, since the hash is what you use to find the file. This is very unlikely to change imho.

That's where Bitzi comes in, in a way.

SubversiveAgent
Posts: 53
Joined: 2006-03-27 06:11

Post by SubversiveAgent » 2006-03-31 06:23

Well, that's basically my point. Why not unify hash programs? "One hash to bind them all" and all that :)

It'd make it simpler to look for files across different networks, and maybe even allow DC++ to access them. Unless you want to keep DC++ closed. Which is also good. :)

cologic
Programmer
Posts: 337
Joined: 2003-01-06 13:32

Post by cologic » 2006-03-31 08:17

I answered this already. Gnutella is (sometimes; it can use SHA1 too, but its Merkle trees are of Tiger form) compatible. The Bitzi link shows that TTHes appearing isn't theoretical, but that someone does, in fact, retain a database of files indexed by TTH accessible for both Gnutella and DC. This might not be quite as portable between P2P clients as you're seeking, but it's _not_ as if DC++ is a closed world hashwise.

Again, since this was missed several times already: DC hashes can and do interoperate.

GargoyleMT
DC++ Contributor
Posts: 3212
Joined: 2003-01-07 21:46
Location: .pa.us

Re: Unified TTH

Post by GargoyleMT » 2006-03-31 08:33

SubversiveAgent wrote: apart from the "this is my little castle, you can't play with my little castle"
That's an odd sentence, since you don't appear too knowledgeable about file hashes. DC++ uses "Tiger Tree Hashes", which is a name for a specific type of hash. We didn't make it up - it was in use by a couple Gnutella clients before we decided to use it.

There are a lot of hashes out there, and they're not all good. eDonkey's hash is pretty loose - it's based on the MD4 algorithm, and there's a good chance that someone can construct two files with the same hash. The hash is relatively unique in that it allows verification of portions of the file, but the size of the portions is fixed: 9,728,000 bytes (about 9.28 MiB).

SHA1 is used by most of the Gnutella clients, but it doesn't support incremental verification. More recently, there have been some (at least theoretical) attacks on it. Ditto for MD5, though it's not in use on any network as far as I know. And its attacks are less theoretical.
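
To make "incremental verification" a bit more concrete, here is a minimal Python sketch of a THEX-style hash tree, the construction behind Tiger Tree Hashes. Big caveat: Python's `hashlib` does not reliably ship Tiger, so SHA-256 stands in for it here, which means the roots this produces are NOT real TTH values. The function names are made up for the example. The tree shape follows the THEX layout as I understand it: 1024-byte leaves, a `0x00` prefix on leaf hashes, a `0x01` prefix on internal nodes, and an unpaired node promoted unchanged to the next level.

```python
import hashlib

BLOCK = 1024  # leaf size used by THEX-style trees


def H(data):
    # Stand-in hash (assumption): real TTH uses Tiger, not SHA-256.
    return hashlib.sha256(data).digest()


def leaf_hashes(data):
    """Hash each 1024-byte block separately, with the 0x00 leaf prefix."""
    blocks = [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)] or [b""]
    return [H(b"\x00" + block) for block in blocks]


def next_level(level):
    """Combine neighbouring hashes pairwise, with the 0x01 internal prefix."""
    out = [H(b"\x01" + level[i] + level[i + 1])
           for i in range(0, len(level) - 1, 2)]
    if len(level) % 2:
        out.append(level[-1])  # an unpaired node moves up unchanged
    return out


def root(level):
    """Reduce a list of leaf hashes to the single root hash."""
    while len(level) > 1:
        level = next_level(level)
    return level[0]


def bad_blocks(data, trusted_leaves):
    """Indices of blocks whose hash differs from the trusted leaf hashes."""
    return [i for i, (got, want)
            in enumerate(zip(leaf_hashes(data), trusted_leaves))
            if got != want]


original = bytes(range(256)) * 20          # 5120 bytes, i.e. 5 blocks
trusted = leaf_hashes(original)
damaged = bytearray(original)
damaged[2048] ^= 0xFF                      # flip one byte inside block 2
print(bad_blocks(bytes(damaged), trusted)) # only block 2 is reported
```

With a flat hash such as plain SHA1 over the whole file, the same corruption only tells you that something, somewhere, is wrong. With the tree you can pin it down to one block and re-fetch just that block.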


I put (rudimentary) bitzi.com support and MAGNET link support into DC++ precisely for the point you address. I don't think DC++ should take up more CPU time on hashing to create "compatibility" hashes. Doing so would encourage non-TTH hashes, nearly all of which would probably return no results. By accepting only TTH hashes, I think we have a better chance of finding content that people specifically wanted to share on a TTH enabled network (i.e. DC).

SubversiveAgent
Posts: 53
Joined: 2006-03-27 06:11

Post by SubversiveAgent » 2006-03-31 09:47

Compatibility hashes would be cool, but since DC++ has become more CPU hungry in the recent versions, adding that to the workload does seem like a bad idea.

Pity everyone can't move to the same hash process, using the same algorithm though :(

cologic
Programmer
Posts: 337
Joined: 2003-01-06 13:32

Post by cologic » 2006-03-31 10:12

(1) "compatibility hashes" tend to reduce functionality to a lowest common denominator; for example, DC++ is apparently moving towards replacing rollback with TTH verification (see the advanced TTH resume), which requires hash constructions of a certain type and granularity to function tolerably. If compatibility hashes were used, they'd undermine such and thus either (a) limit DC by their lack of support for functionality and/or (b) not be fully usable in DC, and made (almost) purely for external software's use, which seems of dubious value to DC.

(2) Diversity in hashes is good; monocultures are inherently more vulnerable than diverse (eco/computer-)systems.

Todi
Forum Moderator
Posts: 699
Joined: 2003-03-04 12:16

Post by Todi » 2006-03-31 10:19

Same algorithm = Bad (see above).
Everyone supporting all algorithms = Overkill, and never happening (as far as I see it). If the users whine over hashing their files once (which they do), imagine how much whining they'd do over hashing each file 5+ times.
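
As a rough illustration of what "compatibility hashes" would cost, here is a small Python sketch that feeds one read pass through several hash functions at once. It is not anything DC++ does; `multi_hash` is a hypothetical helper, and SHA-1 and SHA-256 are just algorithms that `hashlib` is known to provide. The point is that the disk is only read once, but every extra algorithm still has to process every byte, so the CPU time adds up.

```python
import hashlib
import io


def multi_hash(stream, algorithms=("sha1", "sha256"), chunk_size=64 * 1024):
    """Read the stream once and update every requested hash with each chunk."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        for hasher in hashers.values():
            hasher.update(chunk)  # one full pass of CPU work per algorithm
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


digests = multi_hash(io.BytesIO(b"example file contents"))
for name, value in digests.items():
    print(name, value)
```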

SubversiveAgent
Posts: 53
Joined: 2006-03-27 06:11

Post by SubversiveAgent » 2006-03-31 11:15

(2) Safety in chaos? ;)

But okay, I see your point(s).

cologic
Programmer
Posts: 337
Joined: 2003-01-06 13:32

Post by cologic » 2006-03-31 11:34

Less metaphorically: hash functions tend to be relatively poorly understood among cryptographic constructions, and as such have been broken fairly frequently. The Hashing Function Lounge devotes a column to showing which are broken. Note that MD4 (which eMule has used, though they're transitioning to SHA1 through AICH), introduced fifteen years ago, is essentially totally broken. MD5 is as well, though it has found, and still finds, wide application in existing systems (I'm not sure if any P2P applications have employed it). SHA1, which is the basis of the replacement eMule's been moving to, is under attack, though it has life left yet. Given the frequency of breaks in this class of algorithms, standardizing on a single one across all applications would mean that a single break causes more harm than it would currently.

GargoyleMT
DC++ Contributor
Posts: 3212
Joined: 2003-01-07 21:46
Location: .pa.us

Post by GargoyleMT » 2006-04-02 11:47

SubversiveAgent wrote: Pity everyone can't move to the same hash process, using the same algorithm though :(
They were designed at different times with different goals. I had hoped that you would get that idea from my post.

Now, all the networks use the hashes they first adopted because it would cause much complaining from end-users (such as yourself) if they switched. Or, switching just isn't feasible, since the protocol is already built around their hash.

Yes, it would be nice, but let's also be practical here. We've done all we can (reasonably) do on the subject. And we're cognizant of it, so... what more could you possibly want?

SubversiveAgent
Posts: 53
Joined: 2006-03-27 06:11

Post by SubversiveAgent » 2006-04-04 09:34

I want coffee making features on DC++ :wink:

At this point I'm pretty happy with this proggie. Oh wait, there's something I'm curious about (opens new post)
