Why should I like/use hashing? (split)

Archived discussion about features (predating the use of Bugzilla as a bug and feature tracker)

Djungelurban
Posts: 5
Joined: 2004-11-10 18:00

Post by Djungelurban » 2004-11-10 18:28

Why can't there be an option that, if you want, stops hashing from being taken into consideration when downloading/searching? I'm still not sure why it's being implemented, because so far it has only managed to piss me off royally AND screw up a handful of my downloads. It hasn't helped me at all so far. You could start by explaining why hashing is so great and why I'm forced to use it...
I have no problem with hashing my share. If people for some reason like the function, I'm more than happy to help them. But for me it hasn't helped at all.
Here's a frequent scenario:
I like to watch small obscure movies. Since small obscure movies are pretty hard to come by, you're pretty much forced to take what is available, and it also means that most movies have an average download time of a month or two (obviously not all the time). So let's say I find a movie and start the download. Only one user has that movie. So I put it in my queue and hope for the best. Let's say I'm lucky and it starts to download immediately. And then after about 25% has been downloaded that guy leaves. I search and search and search for the movie and it's nowhere to be found. Finally, after 2 weeks, another user appears with the movie. I right click, go to "Download to..." and to my surprise there's nothing there. I check the sizes and they are the same. The problem is, the hash codes aren't exactly the same. Now I'm faced with two options, either start a new download and probably see the whole thing repeat. OR keep searching and hope that someone will come around with a file with an identical hash code. And while deciding what to do I'm completely aware that if it wasn't for the hash code I could add this file to the previous download without any problems at all and have it continue the download. Sure, there might be some graphical errors for a second there, but at least I get to see the thing...
The above scenario happens pretty much every single time I use a DC++ version with hashing. That's why I've been avoiding it for the longest time and used an old version. But now almost all of my favorite hubs are beginning to force me to use a 0.4XX version, so I have no choice anymore.
So with that in mind, I ask you once again: why should I join the hashing pride parade? Cause right now it's just making me crabby.

Xan1977
Forum Moderator
Posts: 627
Joined: 2003-06-05 20:15

Post by Xan1977 » 2004-11-10 21:21

BSOD2600 wrote:There are several benefits of file hashing:

1. No longer does one need to pay attention to the name of the file when looking for alternative sources. If the files are the same, they will have the same hash and thusly be chosen as an alternative source.

2. Magnet links. Implemented in DC++ 0.4032. More information in this FAQ.

3. Segmented (aka multisource) downloading. While it is currently not implemented, now there is a safe way to implement downloading files from multiple sources. All clients at this point have been implementing segmented downloading in cowboy fashion. They do not verify the files are the same (except for the size and partial name) which does result in corrupt files. A file hash ensures the files are identical.
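
To make points 1 and 3 concrete, here is a minimal Python sketch of picking alternative sources by content hash instead of by name. It is not DC++ code: the `SearchResult` type and the `alternative_sources` function are invented for the example, and SHA-256 from the standard library stands in for the Tiger Tree Hash (TTH) that DC++ uses.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchResult:
    """Hypothetical search hit: who shares the file, under what name, with what hash."""
    nick: str
    filename: str
    size: int
    content_hash: str


def content_hash(data: bytes) -> str:
    # SHA-256 is only a stand-in here; DC++ itself uses TTH.
    return hashlib.sha256(data).hexdigest()


def alternative_sources(wanted_hash: str, results: list[SearchResult]) -> list[SearchResult]:
    """Keep only the results whose content hash equals the queued file's hash."""
    return [r for r in results if r.content_hash == wanted_hash]


original = b"movie bytes" * 1000
damaged = original[:-1] + b"X"  # same size, but the last byte differs

results = [
    SearchResult("alice", "Obscure.Movie.avi", len(original), content_hash(original)),
    SearchResult("bob", "obscure_movie_renamed.avi", len(original), content_hash(original)),
    SearchResult("carol", "Obscure.Movie.avi", len(damaged), content_hash(damaged)),
]

matches = alternative_sources(content_hash(original), results)
print([r.nick for r in matches])
```

In this toy data the renamed copy is accepted because its content is identical, while the copy with the same name and size but one different byte is rejected. Name and size alone could not tell those two cases apart.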

The scenario you described is unfortunate, to be sure, but ignoring the differences is not the solution. Think of it this way. If the first person to share that file had hashed it, and then the second person downloaded it, verifying it by hash, then you would have been able to use both sources.

It's not going away, it won't be optional, so just think to the future when it will help eliminate the similar file disparity problem.

TheParanoidOne
Forum Moderator
Posts: 1420
Joined: 2003-04-22 14:37

Post by TheParanoidOne » 2004-11-11 00:18

Djungelurban wrote:Now I'm faced with two options, either start a new download and probably see the whole thing repeat. OR keep searching and hope that someone will come around with a file with an identical hash code.
Third option. Keep them both in your queue until one of them completes.
Djungelurban wrote:if it wasn't for the hash code I could add this file to the previous download without any problems at all and have it continue the download.
You would have no guarantee that it would resume.

People have been continuously asking questions about hashing since it was first introduced in 0.307 (released 2004-03-10). There are therefore numerous threads here that cover the pros and cons of hashing. I suggest you have a look through some of them.
The world is coming to an end. Please log off.

DC++ Guide | Words

Djungelurban
Posts: 5
Joined: 2004-11-10 18:00

Post by Djungelurban » 2004-11-11 06:38

TheParanoidOne wrote:
Djungelurban wrote:Now I'm faced with two options, either start a new download and probably see the whole thing repeat. OR keep searching and hope that someone will come around with a file with an identical hash code.
Third option. Keep them both in your queue until one of them completes.
Yeah sure, I can do that as well. The thing is just that even if I do that, odds are that the new file will be just as rare as the old file. And then maybe another person comes along with the same movie and the same size but a different hash code, and so on. If that continues, there's unfortunately a good chance that I'll never get my movie at all. Right now I've got several movies that have been in my queue for over a year. And that's without the hashing problems. With hashing I might as well just give up...

Djungelurban
Posts: 5
Joined: 2004-11-10 18:00

Post by Djungelurban » 2004-11-11 06:41

Xan1977 wrote:
BSOD2600 wrote:There are several benefits of file hashing:

1. No longer does one need to pay attention to the name of the file when looking for alternative sources. If the files are the same, they will have the same hash and thusly be chosen as an alternative source.

2. Magnet links. Implemented in DC++ 0.4032. More information in this FAQ.

3. Segmented (aka multisource) downloading. While it is currently not implemented, now there is a safe way to implement downloading files from multiple sources. All clients at this point have been implementing segmented downloading in cowboy fashion. They do not verify the files are the same (except for the size and partial name) which does result in corrupt files. A file hash ensures the files are identical.

The scenario you described is unfortunate, to be sure, but ignoring the differences is not the solution. Think of it this way. If the first person to share that file had hashed it, and then the second person downloaded it, verifying it by hash, then you would have been able to use both sources.

It's not going away, it won't be optional, so just think to the future when it will help eliminate the similar file disparity problem.
But why can't it be optional? As long as I'm hashing my own stuff then everything should be alright. Because I don't want TTH searching, magnet links are utterly useless and I can honestly say that I have basically no use for multi-source downloading.
And it will never help with the "file disparity problems". There will always be file corruption and according to Murphy's Law the files that will be afflicted most of the time will be the ones that will result in the most damage.

Twink
Posts: 436
Joined: 2003-03-31 23:31
Location: New Zealand

Post by Twink » 2004-11-11 08:42

Isn't the hash for the file stored in the queue.xml file? Meaning that you could likely shut down DC++, open the queue in Notepad and delete the hash. Then when you next start DC++ you could add any source you want to the file.
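
A scripted version of that manual edit might look like the Python sketch below. It assumes, without having been checked against any particular DC++ version, that the queue file is XML with one `Download` element per queued file and that the hash sits in a `TTH` attribute. The function name `strip_hash` and the sample data are made up. As with the manual approach, DC++ should be shut down before the real file is touched.

```python
import xml.etree.ElementTree as ET

# Invented sample; the element and attribute names are assumptions about the file layout.
SAMPLE_QUEUE = """<Downloads>
  <Download Target="C:\\Downloads\\Obscure.Movie.avi" Size="734003200" TTH="EXAMPLEHASHVALUE">
    <Source Nick="alice"/>
  </Download>
  <Download Target="C:\\Downloads\\Other.File.zip" Size="1024" TTH="ANOTHEREXAMPLEHASH">
    <Source Nick="bob"/>
  </Download>
</Downloads>"""


def strip_hash(queue_xml: str, target_suffix: str) -> str:
    """Remove the assumed TTH attribute from downloads whose Target ends with target_suffix."""
    root = ET.fromstring(queue_xml)
    for download in root.iter("Download"):
        if download.get("Target", "").endswith(target_suffix):
            # pop with a default so entries that have no hash are left alone
            download.attrib.pop("TTH", None)
    return ET.tostring(root, encoding="unicode")


edited = strip_hash(SAMPLE_QUEUE, "Obscure.Movie.avi")
print(edited)
```

Only the matching entry loses its hash; every other queued file keeps its hash and stays protected against mismatched sources.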

However, if you stop and think about it, hashing is actually made to stop your problem in the long run. If you download part from each of those two people with slightly different files, chances are that you now have a file different from both of the files that they have. Meaning the next person to come along wanting the file has to pick between 3 files. If you downloaded from just one, then there would be 2 sources for the video. Yeah, I know they're not common videos, but still I don't see how adding a third video for the same thing is a good idea.
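
The "third file" effect is easy to show with a toy example. In the sketch below, two same-sized variants each have one bad byte in a different place, and a download that takes its first half from one and its second half from the other matches neither. The byte strings are invented and SHA-256 stands in for TTH.

```python
import hashlib


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


base = bytes(range(256)) * 4         # 1024 bytes of made-up "movie" data

first_user = bytearray(base)
first_user[100] ^= 0xFF              # this copy differs near the beginning

second_user = bytearray(base)
second_user[900] ^= 0xFF             # this copy differs near the end

half = len(base) // 2
# First half downloaded from the first user, the rest from the second user.
mixed = bytes(first_user[:half]) + bytes(second_user[half:])

hashes = {
    "first_user": digest(bytes(first_user)),
    "second_user": digest(bytes(second_user)),
    "mixed": digest(mixed),
}
for name, value in hashes.items():
    print(name, value[:16])

print("distinct files:", len(set(hashes.values())))
```

All three sizes are identical, so a size-only check would treat them as one file even though their contents differ.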

Instead of suggesting that DC++ make hashing optional, I think it would make more sense to add the ability to download sections of a file from another source where the hashes of those sections are the same. If you've downloaded half a file and the only differences between the two sources are in the part you've already downloaded, then it should be safe to resume from that source.
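
One way to picture that suggestion: split each file into fixed-size blocks, hash every block, and treat a second source as safe to resume from only if every block you still need has the same hash in both files. The Python sketch below does this with a flat list of block hashes. A tree hash such as TTH carries comparable per-block information, but whether DC++ exposes it in this form is not something the example claims. The block size, the function names and the use of SHA-256 are all assumptions for illustration.

```python
import hashlib

BLOCK_SIZE = 64  # deliberately tiny so the example is easy to follow


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Hash each fixed-size block of the data separately."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def safe_to_resume(queued: list[str], candidate: list[str],
                   bytes_done: int, block_size: int = BLOCK_SIZE) -> bool:
    """True if every block that is not yet completely downloaded matches in both sources."""
    if len(queued) != len(candidate):
        return False
    # A partially downloaded block is compared too, so it must match as well.
    first_needed = bytes_done // block_size
    return queued[first_needed:] == candidate[first_needed:]


queued_file = bytes(range(256)) * 4      # the file already half downloaded

early_diff = bytearray(queued_file)
early_diff[10] ^= 0xFF                   # differs only in the part already downloaded

late_diff = bytearray(queued_file)
late_diff[700] ^= 0xFF                   # differs in the part still missing

done = 512
queued_hashes = block_hashes(queued_file)
print(safe_to_resume(queued_hashes, block_hashes(bytes(early_diff)), done))
print(safe_to_resume(queued_hashes, block_hashes(bytes(late_diff)), done))
```

The finished file would still be identical to the first source, so no third variant is created.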

Djungelurban
Posts: 5
Joined: 2004-11-10 18:00

Post by Djungelurban » 2004-11-11 11:03

Twink wrote:Isn't the hash for the file stored in the queue.xml file? Meaning that you could likely shut down DC++, open the queue in Notepad and delete the hash. Then when you next start DC++ you could add any source you want to the file.
You've just solved my problem. I mean, it will be pretty time consuming but I feel it will be worth it.

Edit: After trying it, no... the problem isn't solved. I've deleted the hash from the queue file but it still appears in DC++.

TheParanoidOne
Forum Moderator
Posts: 1420
Joined: 2003-04-22 14:37

Post by TheParanoidOne » 2004-11-11 12:01

Did you do the editing while DC++ was shut down?

Djungelurban
Posts: 5
Joined: 2004-11-10 18:00

Post by Djungelurban » 2004-11-11 12:07

TheParanoidOne wrote:Did you do the editing while DC++ was shut down?
I really thought I did, but apparently I didn't... It works now, thanks for the help :)

Twink
Posts: 436
Joined: 2003-03-31 23:31
Location: New Zealand

Post by Twink » 2004-11-11 19:11

Dear god, what have I done.

TheParanoidOne
Forum Moderator
Posts: 1420
Joined: 2003-04-22 14:37

Post by TheParanoidOne » 2004-11-11 19:18

Twink wrote:Dear god, what have I done.
Unleashed the tides of darkness.