Autosearching by TTH only in .403

Archived discussion about features (predating the use of Bugzilla as a bug and feature tracker)

Locked
Gasp
Posts: 7
Joined: 2004-06-20 11:28

Autosearching by TTH only in .403

Post by Gasp » 2004-07-09 15:14

Don't you think that there should be an option to ignore the hashes when autosearching? (I'm not sure how it's done in .401.) Among people's shares on most of the hubs I'm on, many files do not have hashes at all and, most of all, even if they have, the number of different hashes for the "same" files is dramatic.
Therefore, I'd rather risk a little corruption, especially as most common file formats have error correction built in, than limit my sources by 70% or more. In the year I've been using DC, it has happened only once or twice that I downloaded a corrupted file. It's gonna take some time for hashes to spread so widely that one could rely on them solely.
However, I understand that if users are not forced to rely on them just now, the process of "unifying" the hashes for most of the copies is going to take longer.

I hope I didn't miss something here, but I couldn't find any such option.

TheParanoidOne
Forum Moderator
Posts: 1420
Joined: 2003-04-22 14:37

Re: Autosearching by TTH only in .403

Post by TheParanoidOne » 2004-07-09 15:21

Gasp wrote:on most of the hubs I'm on, many files do not have hashes at all
In this case, DC++ will use the older method of matching by file name and size. Any hashes encountered will be ignored (as far as I can remember).
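A minimal sketch of the fallback described above, assuming a queued file either carries a TTH or does not. The `FileInfo` type and `is_candidate_source` function are hypothetical names for illustration, not DC++ code, and the case-insensitive name comparison is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FileInfo:
    """Hypothetical record for a queued file or a search result."""
    name: str
    size: int
    tth: Optional[str] = None  # root hash as text, or None if unhashed


def is_candidate_source(queued: FileInfo, result: FileInfo) -> bool:
    """Decide whether a search result may be added as a source."""
    if queued.tth is not None:
        # The queued file has a hash: only an identical hash is accepted.
        return result.tth == queued.tth
    # No hash on the queued file: fall back to the older name-and-size
    # match and ignore any hash the result happens to carry.
    return (result.name.lower() == queued.name.lower()
            and result.size == queued.size)
```

With this rule, an unhashed queue entry accepts any result of the same name and size, while a hashed entry rejects results whose hash is missing or different.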
The world is coming to an end. Please log off.


Gasp
Posts: 7
Joined: 2004-06-20 11:28

Re: Autosearching by TTH only in .403

Post by Gasp » 2004-07-09 17:34

TheParanoidOne wrote:
Gasp wrote:on most of the hubs I'm on, many files do not have hashes at all
In this case, DC++ will use the older method of matching by file name and size. Any hashes encountered will be ignored (as far as I can remember).
I supposed so but the issue is still there considering how much the hashes vary today.

GargoyleMT
DC++ Contributor
Posts: 3212
Joined: 2003-01-07 21:46
Location: .pa.us

Re: Autosearching by TTH only in .403

Post by GargoyleMT » 2004-07-09 19:59

Gasp wrote:I supposed so but the issue is still there considering how much the hashes vary today.
They don't vary; there are a lot of corrupted versions, each with its own TTH.

This is one of the benefits of hashing, to see the corruption inherent in the system.

Gasp
Posts: 7
Joined: 2004-06-20 11:28

Re: Autosearching by TTH only in .403

Post by Gasp » 2004-07-10 05:02

GargoyleMT wrote:
Gasp wrote:I supposed so but the issue is still there considering how much the hashes vary today.
They don't vary; there are a lot of corrupted versions, each with its own TTH.

This is one of the benefits of hashing, to see the corruption inherent in the system.
Of course they are corrupted versions. But as I said
I'd rather risk a little corruption, especially as most common file formats have error correction built in, than limit my sources by 70% or more.
That's the point. A TTH difference does not necessarily mean that the file is totally screwed up; usually the corruption is minor. In some cases matching only by TTH would sabotage autosearching as such: I've just searched for a quite common avi and got about 15 sources (all the same size etc.). ALL of them with different hashes or no hashes at all.
I want an option for all these people who can live with those usually unnoticeable errors.

PseudonympH
Forum Moderator
Posts: 366
Joined: 2004-03-06 02:46

Post by PseudonympH » 2004-07-10 22:56

Yes, but there are those of us (me) that want to prevent the 15 hashes for the "same" file situation. Hence, you get it from one guy and keep the hash consistent, so we have 14 singles and 2 of the same. Then the next person gets the version that 2 of you have because more sources is better, and we now have only one popular copy out there instead of many different ones.

cologic
Programmer
Posts: 337
Joined: 2003-01-06 13:32

Post by cologic » 2004-07-10 23:39

I want an option for all these people who can live with those usually unnoticeable errors.
Not only don't I want that option for myself, but I don't want anyone else to have it either. I value file integrity, as it increases the health of the DC network. (Well, more precisely, I don't want others to reshare such files, but since by the time it's in a potential share preventing such is unfeasible, I'll settle for the TTH checking during queuing.)

average joe
Posts: 1
Joined: 2003-12-20 12:15

Post by average joe » 2004-07-11 08:39

TTH is awesome, only a fool would wanna turn that off. I hope it cleans up the shares after a while. Now if people left releases intact and did an SFV check on their downloads I'd be really happy.

Only in a perfect world, right?

GargoyleMT
DC++ Contributor
Posts: 3212
Joined: 2003-01-07 21:46
Location: .pa.us

Re: Autosearching by TTH only in .403

Post by GargoyleMT » 2004-07-11 19:52

Gasp wrote:ALL of them with different hashes or no hashes at all.
I want an option for all these people who can live with those usually unnoticeable errors.
Eventually, DC++ may have the option to add users whose copy has the exact same size but a different hash as sources for a file, if it already has the TTH leaves to verify them. In that case, as soon as a source sent data that didn't match the TTH leaves, they'd be removed (they can't really be a source unless they have the same file).
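A rough sketch of how such a leaf check could work, not actual DC++ code. It assumes the downloader already holds a trusted list of leaf hashes and fetches one leaf-sized segment at a time. SHA-256 stands in for the Tiger hash that TTH really uses (Tiger is not among Python's guaranteed `hashlib` algorithms), and `leaf_hashes` and `fetch_verified` are hypothetical names.

```python
import hashlib
from typing import Dict, List, Tuple


def leaf_hashes(data: bytes, leaf_size: int) -> List[bytes]:
    """Hash each leaf-sized segment (SHA-256 as a stand-in for Tiger)."""
    return [hashlib.sha256(data[i:i + leaf_size]).digest()
            for i in range(0, len(data), leaf_size)]


def fetch_verified(leaves: List[bytes], leaf_size: int,
                   sources: Dict[str, bytes]) -> Tuple[bytes, List[str]]:
    """Assemble a file leaf by leaf, dropping sources that send bad data.

    `sources` maps a user name to that user's copy of the file, which
    simulates requesting a segment from them.
    """
    active = dict(sources)
    dropped = []
    out = bytearray()
    for index, expected in enumerate(leaves):
        start = index * leaf_size
        for name in list(active):
            segment = active[name][start:start + leaf_size]
            if hashlib.sha256(segment).digest() == expected:
                out += segment
                break
            # Segment does not match the trusted leaf: remove the source.
            del active[name]
            dropped.append(name)
        else:
            raise RuntimeError(f"no remaining source has a valid leaf {index}")
    return bytes(out), dropped
```

A source with a slightly corrupted copy can still contribute the segments that verify, and it is removed only at the first segment that fails.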

If this happens at all, it is a long way off.

What you want is to go back to pre-0.307, where there are no hashes. This is not the direction that the DC network (or at least DC++ and ADC) is headed.

bode_jr
Posts: 1
Joined: 2004-05-10 12:32

Post by bode_jr » 2004-08-01 18:10

I am inexperienced in the DC use but I am facing a problem that seems to be of definition of the program:
if an archive that you have partially copied will have the using rejection of hub you does not obtain to continue the copy of this archive, exactly that it has located it in another one hub and another user.
In my opinion this is not good.

Todi
Forum Moderator
Posts: 699
Joined: 2003-03-04 12:16

Post by Todi » 2004-08-02 01:14

Could you rephrase that?

GargoyleMT
DC++ Contributor
Posts: 3212
Joined: 2003-01-07 21:46
Location: .pa.us

Post by GargoyleMT » 2004-08-02 10:58

bode_jr wrote:if an archive that you have partially copied will have the using rejection of hub you does not obtain to continue the copy of this archive, exactly that it has located it in another one hub and another user.
In my opinion this is not good.
Ok, here's my translation of what you said:
If I have a partially downloaded archive, and no user on the hub has the exact same file so that I might continue it, my only option is to find another user with the exact file in another hub. In my opinion this is not good.
How did I do?

cyberal
Posts: 360
Joined: 2003-05-16 05:42

Post by cyberal » 2004-08-03 02:47

average joe wrote:Now if people left releases intact and did an SFV check on their downloads I'd be really happy.

Only in a perfect world, right?
me likes you :P
http://whyrar.omfg.se - Guide to RAR and DC behaviour!
http://bodstrom.omfg.se - Bodströmsamhället, a link collection about the threats to our personal privacy

Guitarm
Forum Moderator
Posts: 385
Joined: 2004-01-18 15:38

Post by Guitarm » 2004-08-03 03:38

GargoyleMT wrote:
bode_jr wrote:if an archive that you have partially copied will have the using rejection of hub you does not obtain to continue the copy of this archive, exactly that it has located it in another one hub and another user.
In my opinion this is not good.
Ok, here's my translation of what you said:
If I have a partially downloaded archive, and no user on the hub has the exact same file so that I might continue it, my only option is to find another user with the exact file in another hub. In my opinion this is not good.
How did I do?
Hehe, I hope you did good because I actually understood what you translated it to, let's see what the answer might be....
"Nothing really happens fast. Everything happens at such a rate that by the time it happens, it all seems normal."

cyberal
Posts: 360
Joined: 2003-05-16 05:42

Post by cyberal » 2004-08-03 06:02

The answer is... yes.. this is the way it has to be at this time.. the situation will improve later on when DC++ can match separate TTH leaves and piece together parts from different incomplete files.

GargoyleMT
DC++ Contributor
Posts: 3212
Joined: 2003-01-07 21:46
Location: .pa.us

Post by GargoyleMT » 2004-08-03 07:55

cyberal wrote:later on when DC++ can match separate TTH leaves and piece together parts from different incomplete files.
DC++ automatically matching leaves? An average ~147 MB file has nearly 600 leaves, but many more intermediate hashes. Searching for non-root hashes and matching them is not feasible.
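The figure above can be checked with a little arithmetic. The 256 KiB stored block size below is an assumption, chosen only because it reproduces "nearly 600" for a 147 MiB file. The 1024-byte base segment is the size assumed for the bottom level of the tree. `block_count` is a hypothetical helper.

```python
def block_count(file_size: int, block_size: int) -> int:
    """Number of blocks needed to cover file_size bytes (rounded up)."""
    return -(-file_size // block_size)


FILE_SIZE = 147 * 1024 * 1024   # ~147 MiB, the example size from the post
STORED_BLOCK = 256 * 1024       # assumed size covered by each stored leaf
BASE_SEGMENT = 1024             # assumed base segment of the hash tree

stored_leaves = block_count(FILE_SIZE, STORED_BLOCK)
base_segments = block_count(FILE_SIZE, BASE_SEGMENT)

print(stored_leaves)   # 588
print(base_segments)   # 150528
```

Under these assumptions a single file contributes 588 stored leaves and over 150,000 bottom-level hashes, which shows how quickly the number of non-root hashes to search for would grow.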

I've talked about adding non-matching files in the past, but I've always intended it to be a solely manual undertaking.

cyberal
Posts: 360
Joined: 2003-05-16 05:42

Post by cyberal » 2004-08-03 09:18

What I meant was, add sources based upon the old system with filename and size.. and then download the parts where the TTH leaves match with the partially downloaded file..

GargoyleMT
DC++ Contributor
Posts: 3212
Joined: 2003-01-07 21:46
Location: .pa.us

Post by GargoyleMT » 2004-08-03 20:02

cyberal wrote:What I meant was, add sources based upon the old system with filename and size.. and then download the parts where the TTH leaves match with the partially downloaded file..
If it's manual, that's fine. Otherwise, you've just doubled the amount of autosearches. Or halved the effective number of searches (if they're spread out identically).

:)

311Sam
Posts: 1
Joined: 2004-08-04 15:57

Post by 311Sam » 2004-08-17 20:02

The problem I have is that if you change a song title in an mp3, it changes the hash...

Xan1977
Forum Moderator
Posts: 627
Joined: 2003-06-05 20:15

Post by Xan1977 » 2004-08-17 20:52

Then don't change the song title...

Or, if someone downloads an .MP3 that has a completely incorrect ID3 tag, then change it, hash it with the proper tag and spread it around with its new hash. That way, the mislabeled file will eventually drop out of circulation.

I don't download music, so I don't know if they're still using ID3 or if some other header information scheme has replaced it, but the concept is still valid.

GargoyleMT
DC++ Contributor
Posts: 3212
Joined: 2003-01-07 21:46
Location: .pa.us

Post by GargoyleMT » 2004-08-20 11:53

Yes, if you change the tag, the hash changes. The Gnutella network has experimented a bit with a possible solution to this: they have an audio-only SHA1 (one of their hash types). They strip out all tag data, and only hash the valid MPEG frames in the stream.
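A simplified sketch of the idea, not Gnutella's actual algorithm: instead of parsing MPEG frames, it strips only an ID3v2 block at the start and an ID3v1 block at the end, then hashes what is left. The function names `strip_id3` and `audio_only_sha1` are hypothetical.

```python
import hashlib


def strip_id3(data: bytes) -> bytes:
    """Return the data without a leading ID3v2 tag or a trailing ID3v1 tag."""
    start, end = 0, len(data)
    # ID3v2: "ID3", 2 version bytes, 1 flags byte, 4-byte synchsafe size.
    if end >= 10 and data[:3] == b"ID3":
        size_bytes = data[6:10]
        if all(b < 0x80 for b in size_bytes):
            size = ((size_bytes[0] << 21) | (size_bytes[1] << 14)
                    | (size_bytes[2] << 7) | size_bytes[3])
            footer = 10 if data[5] & 0x10 else 0  # optional v2.4 footer
            start = min(10 + size + footer, end)
    # ID3v1: fixed 128-byte block at the end, starting with "TAG".
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128
    return data[start:end]


def audio_only_sha1(data: bytes) -> str:
    """SHA-1 over the file contents with the ID3 tag blocks removed."""
    return hashlib.sha1(strip_id3(data)).hexdigest()
```

Two copies of the same audio with different song titles then produce the same `audio_only_sha1` value, while their whole-file hashes still differ.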

This *might* get into DC++ eventually, with ADC. It is a pretty ugly hack, and though it can be done for any audio file, only mp3s are in any way likely to get the treatment.
