Hashing suxx...
Moderator: Moderators
-
- Posts: 18
- Joined: 2004-10-05 08:26
- Location: Want my ip? just ask!
Hashing suxx...
.. for small files. Otherwise it is great!
But how about not including the TTH in the filelists for files that are really small?
Nobody ever searches for them via TTH.
So how about just not hashing anything in the small-file category, or hashing it on the fly when the small file is requested, to ensure integrity? Or how about keeping one filelist with the TTHs of the small files for answering searches, while the filelist uploaded to other users simply omits the TTHs for small files?
Maybe even files smaller than 500 KB could be left out of hashing, so filelists for shares with a lot of JPEGs would get smaller.
I think that could shrink a lot of huge filelists without too strong side effects.
Please think about it and comment.
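A minimal sketch of the two-filelist idea, assuming a hypothetical in-memory share index. `SharedFile`, `render_filelist`, and the 64 KiB cutoff are illustrative names and values rather than DC++ internals, and the XML-like line format is only loosely modelled on a filelist entry.

```python
from dataclasses import dataclass
from typing import List

SMALL_FILE_LIMIT = 64 * 1024  # assumed cutoff in bytes, not a DC++ constant


@dataclass
class SharedFile:
    name: str
    size: int
    tth: str  # hash root as text; placeholder values in this sketch


def render_filelist(files: List[SharedFile], omit_small_tth: bool) -> str:
    """Render one line per file, optionally leaving out the TTH of small files."""
    lines = []
    for f in files:
        if omit_small_tth and f.size < SMALL_FILE_LIMIT:
            lines.append(f'<File Name="{f.name}" Size="{f.size}"/>')
        else:
            lines.append(f'<File Name="{f.name}" Size="{f.size}" TTH="{f.tth}"/>')
    return "\n".join(lines)


share = [
    SharedFile("holiday.jpg", 48_000, "PLACEHOLDERTTHROOT1"),
    SharedFile("movie.avi", 700_000_000, "PLACEHOLDERTTHROOT2"),
]

full_list = render_filelist(share, omit_small_tth=False)   # kept locally for searches
upload_list = render_filelist(share, omit_small_tth=True)  # sent to other users
```

Both views come from the same hashed index, so local TTH searches still work while the uploaded list carries fewer bytes per small file.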
Imagination sets the spirit free,
Into distant lands of fantasy,
Close your eyes and you will see,
Within your mind there lies the key.
-
- Posts: 18
- Joined: 2004-10-05 08:26
- Location: Want my ip? just ask!
Have you ever searched for an alternate source of a small file?
I mean these little files you get an extra slot for; you don't need alternates for them!
Also, identifying duplicates is not the main function of DC++. But even for this there would be a workaround: keep one filelist with the hashed small files and one without. That way DC can identify duplicates but does not have to upload the list with the TTHs for small files.
Imagination sets the spirit free,
Into distant lands of fantasy,
Close your eyes and you will see,
Within your mind there lies the key.
-
- Forum Moderator
- Posts: 1420
- Joined: 2003-04-22 14:37
-
- Posts: 18
- Joined: 2004-10-05 08:26
- Location: Want my ip? just ask!
OK, then let's assume you are not the only one doing this. I still think we would benefit from it. Searching for small files is rare, but downloading filelists is done quite often. And huge loads of JPEGs are as much a problem for filelist size as rar sharing.
But getting rid of rar sharing will need a lot of features (multiple-source download, upload from unfinished files).
Getting rid of the space that small files' TTHs take in filelists is much easier.
Hash them, but don't include their TTHs in the filelist when it is uploaded!
Imagination sets the spirit free,
Into distant lands of fantasy,
Close your eyes and you will see,
Within your mind there lies the key.
-
- Forum Moderator
- Posts: 366
- Joined: 2004-03-06 02:46
-
- Posts: 18
- Joined: 2004-10-05 08:26
- Location: Want my ip? just ask!
No, not sets. Mangas here are normally also rared, in easy-to-handle 300 MB packages with 5 books. But if I look at my share, let me see:
Videos in rar file: 4000 files
Normal videos: 1200 files
Mp3: 2000 files
Progs: 400 files
Games (partially rar): 2400 files
Images: 2000 files
Docs: 1300 files
Misc: 200 files
So not sharing the TTHs would cover a lot of the docs and images, if we assume that TTHs make up one third of the filelist after compression. Well, it doesn't look like a big impact on my share, but imagine someone with fewer videos.
It still amazes me that my images and docs make up about 0.5% of my share but consume 25% of my filelist.
Well, this feature request would make my list about 10% smaller; other people's filelists might shrink by 15% or even more.
10% doesn't sound like much? I think it is a lot!
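The estimate above can be checked with a few lines of arithmetic. This is only a rough sketch under two assumptions that are not in the post: every filelist entry takes about the same space, and all images and docs fall under the size cutoff. The one-third TTH share is the poster's own guess.

```python
# File counts per category, taken from the list in the post.
share = {
    "videos_in_rar": 4000,
    "normal_videos": 1200,
    "mp3": 2000,
    "progs": 400,
    "games": 2400,
    "images": 2000,
    "docs": 1300,
    "misc": 200,
}
TTH_FRACTION = 1 / 3  # poster's estimate of the TTH share of the compressed list

total_files = sum(share.values())
small_entries = share["images"] + share["docs"]
entry_fraction = small_entries / total_files       # share of filelist entries
estimated_saving = entry_fraction * TTH_FRACTION   # share of filelist size saved

print(f"{small_entries} of {total_files} entries = {entry_fraction:.1%}")
print(f"estimated filelist saving: {estimated_saving:.1%}")
```

Under these assumptions, images and docs account for roughly a quarter of the entries and the saving comes out a little under 10%, which is in the same range as the figures quoted in the post.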
Imagination sets the spirit free,
Into distant lands of fantasy,
Close your eyes and you will see,
Within your mind there lies the key.
-
- Posts: 38
- Joined: 2003-11-15 22:10
- Location: Hell
Think About It ....
Quicksilver, if your whole argument is about smaller filelist sizes, then it fails.
Filelist size is nothing. Hashing is the way things are, and the way they will be. More and more hubs will be banning the older, non-TTH clients (thankfully!)
You talk about being able to match small files (like JPGs) - I will *guarantee* you that the error rate - meaning the *wrong* file is downloaded - is FAR greater without hashing - it's only going by filesize and name. Hashing makes "exact" files "exact".
People complain about the speed of hashing - DC will hash ~200 GB an hour on a P4 2.8 GHz system (your mileage may vary). That is *not* an unreasonable speed.
I, for one, use Match TTH on *every* search initially. Maybe YOU don't search for files (even small ones) with TTH, but the vast majority of users do.
In addition, since the newer clients can no longer connect to the older (306 and back) clients, they will be gotten rid of even faster.
Bottom line, live with it or find another P2P program.
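The difference between matching by name and size and matching by hash can be illustrated with a small sketch. SHA-256 is used here purely as a stand-in for TTH, since Tiger is not among the algorithms `hashlib` guarantees; the helper names are hypothetical.

```python
import hashlib
from typing import Tuple


def name_size_key(name: str, data: bytes) -> Tuple[str, int]:
    """Identity as seen by a client without hashes: file name plus size."""
    return (name.lower(), len(data))


def content_key(data: bytes) -> str:
    """Identity by content hash (SHA-256 standing in for a TTH root)."""
    return hashlib.sha256(data).hexdigest()


original = b"JPEG-DATA-" + bytes(range(50))
# Same length, first byte flipped: a different file with the same name and size.
corrupted = bytes([original[0] ^ 0xFF]) + original[1:]

same_by_name_size = name_size_key("pic.jpg", original) == name_size_key("pic.jpg", corrupted)
same_by_hash = content_key(original) == content_key(corrupted)

print("match by name+size:", same_by_name_size)
print("match by hash:", same_by_hash)
```

The two byte strings are treated as the same file by the name-and-size key, while the content hash tells them apart.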
-
- Posts: 18
- Joined: 2004-10-05 08:26
- Location: Want my ip? just ask!
Hmm, maybe you misunderstood the "Hashing suxx" title; that was just meant as provocation.
1. In my hub I set a date of 1.1.2005, after which all clients without TTH are banned, so yes, I want TTH.
2. A small file normally doesn't need alternates to be found, because you automatically get a slot for it.
3. I think you have a better chance of matching a properly named small file than a big movie, because from old DC times there are movies around with the same name, the same size, and 10 different TTHs.
4. Also, not uploading the TTHs for the smaller files doesn't make it impossible to search for them via TTH.
5. I never complained about hashing speed. Was that ever a topic of this thread?
I think you just read the headline and didn't really read the thread!
So please read before you post! I am not a newb who complains about hashing.
I just want slightly smaller filelists without too many negative side effects.
Imagination sets the spirit free,
Into distant lands of fantasy,
Close your eyes and you will see,
Within your mind there lies the key.
-
- Posts: 38
- Joined: 2003-11-15 22:10
- Location: Hell
Understood =)
OK, maybe I did go a bit overboard in my response (been "battling" the hashing issue in too many hubs) - I know you didn't mention the hashing time, but I thought I'd throw that in for good measure.
I congratulate you on the non-TTH ban - I wish more hubs would implement them that quickly.
I understand your issue with the small files more clearly now - and with that understanding, I agree: if the filesize falls below the threshold for mini-slots, then indeed, it should not be hashed.
As I understand it, the mini-slot size for DC++ is 64 KiB (fulDC allows you to adjust that to whatever size you wish). If this were adjusted to maybe 100 KiB, I think that would allow the automatic download of a fairly good percentage of image/doc/html files.
Personally, I've not done a breakdown of filelists to see how much of the size is taken by the hash signatures.
In retrospect, I think a better, more appropriate Subject could have been chosen.
I think this is an issue that should definitely be posted to Bugzilla.
Respectfully,
--- DeathStalker
-
- Posts: 18
- Joined: 2004-10-05 08:26
- Location: Want my ip? just ask!
1. I would have done this non-TTH client ban earlier, but I wanted a stable alternative for my users. Now that 0.667 is really stable, also for passive users, I think it is time for every hub to do this.
2. Basically, I wanted some experts here who know the protocol very well.
I wanted a statement from them on whether it is possible to use two filelists, one with TTHs and one without TTHs for smaller files.
For small files it would already be great not to upload the TTH each time.
But the idea is to use this for somewhat bigger files too (how about everything below 250 KiB?) and upload the TTHs of these files on request, to keep them searchable via TTH.
I don't know if this is possible without protocol changes.
Or can a TTH root be requested for a file from a user?
Leaving TTHs out only for <64 KiB files would help a bit, but most JPEGs are between 50 KiB and 300 KiB, so their TTHs also enlarge filelists a lot. That's what I want to discuss here before sending a request to Bugzilla.
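The "hash on request" idea can be sketched as client-side bookkeeping. Everything here is hypothetical: `ShareIndex` and `request_hash` are made-up names, SHA-256 stands in for a TTH root, and nothing below says whether the DC protocol actually offers such a request.

```python
import hashlib
from typing import Dict, Optional

ON_REQUEST_LIMIT = 250 * 1024  # the 250 KiB cutoff floated in the post


class ShareIndex:
    """Hypothetical share index that hands out small-file hashes only when asked."""

    def __init__(self) -> None:
        self._files: Dict[str, bytes] = {}
        self._hash_cache: Dict[str, str] = {}

    def add(self, name: str, data: bytes) -> None:
        self._files[name] = data

    def _hash(self, name: str) -> str:
        # SHA-256 stands in for a TTH root; computed once, then cached.
        if name not in self._hash_cache:
            self._hash_cache[name] = hashlib.sha256(self._files[name]).hexdigest()
        return self._hash_cache[name]

    def filelist_entry(self, name: str) -> Dict[str, object]:
        """Entry as it would appear in the uploaded list."""
        size = len(self._files[name])
        entry: Dict[str, object] = {"name": name, "size": size}
        if size >= ON_REQUEST_LIMIT:
            entry["hash"] = self._hash(name)
        return entry

    def request_hash(self, name: str) -> Optional[str]:
        """Answer a hypothetical 'send me the hash of this file' request."""
        if name not in self._files:
            return None
        return self._hash(name)


index = ShareIndex()
index.add("scan.jpg", b"\x00" * 1000)
print(index.filelist_entry("scan.jpg"))  # no "hash" key for the small file
print(index.request_hash("scan.jpg"))    # hash delivered only on request
```

The sketch models only the bookkeeping on the sharing side; how such a request would travel between clients is exactly the open protocol question raised in the post.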
Imagination sets the spirit free,
Into distant lands of fantasy,
Close your eyes and you will see,
Within your mind there lies the key.
-
- Posts: 506
- Joined: 2003-01-03 07:33
I must say this thread is stupid.
1) DC++ does support a filelist without TTHs; if you want it, download it.
2) DC++ does support a filelist with TTHs; if you want it, download it.
What is your problem? Almost everything you request is available.
Should DC++ also support something in between? Hell no.
Everyone is supposed to download from the hubs, - I don't know why, but I never do anymore.
-
- Posts: 18
- Joined: 2004-10-05 08:26
- Location: Want my ip? just ask!
OK, I never thought about sending the old raw to get a filelist; is that what you are talking about?
OK, nice idea, but no benefit for the average user. I also don't know whether DC will automatically get the TTH from the other person if I downloaded the list without TTHs, so that I can search for alternates.
Now imagine: the average user in DC doesn't even know what TTH is,
and even more users don't know how to download a filelist without TTHs from someone (me included)!
[Please don't tell me about educating them; there is a limit to education, and a filesharing program should be easy to use, IMHO.]
So this thread was meant for everyone's benefit, not just for the experts. And yes, I think an in-between way would be nice, if you want to call it that: an easy way in between.
You might think of this thread as stupid; I would think it more stupid to post at Bugzilla before discussing it. Also, there are stupider threads available.
Imagination sets the spirit free,
Into distant lands of fantasy,
Close your eyes and you will see,
Within your mind there lies the key.
-
- Posts: 506
- Joined: 2003-01-03 07:33
Maybe not everyone knows how to, but... if you want it enough, you will.
It's better if file integrity is preserved between file transfers than the opposite. This is of more importance to the community as a whole. TTH helps with this.
Everyone is supposed to download from the hubs, - I don't know why, but I never do anymore.
-
- Posts: 18
- Joined: 2004-10-05 08:26
- Location: Want my ip? just ask!
Right, file integrity is important. But file integrity can also be assured via TTH even if the TTH is not in the filelist, by sending it for the smaller files on request.
Well, I am beginning to see another negative side effect, though: sending requests for TTHs that are not in the filelist would hurt the hub's performance (upload), and it is even more important to keep that load low.
Do you have any reasons against dropping the TTH just for files <64 KiB? For this size, TCP error protection seems good enough to ensure integrity without TTH, or not?
Imagination sets the spirit free,
Into distant lands of fantasy,
Close your eyes and you will see,
Within your mind there lies the key.
-
- Posts: 506
- Joined: 2003-01-03 07:33
-
- DC++ Contributor
- Posts: 3212
- Joined: 2003-01-07 21:46
- Location: .pa.us
-
- Forum Moderator
- Posts: 366
- Joined: 2004-03-06 02:46
RFC 793 wrote:
The checksum field is the 16 bit one's complement of the one's
complement sum of all 16 bit words in the header and text. If a
segment contains an odd number of header and text octets to be
checksummed, the last octet is padded on the right with zeros to
form a 16 bit word for checksum purposes. The pad is not
transmitted as part of the segment. While computing the checksum,
the checksum field itself is replaced with zeros.
The checksum also covers a 96 bit pseudo header conceptually
prefixed to the TCP header. This pseudo header contains the Source
Address, the Destination Address, the Protocol, and TCP length.
This gives the TCP protection against misrouted segments. This
information is carried in the Internet Protocol and is transferred
across the TCP/Network interface in the arguments or results of
calls by the TCP on the IP.
Cliff's notes: no, Garg
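A minimal sketch of the one's-complement checksum described in the RFC 793 excerpt, applied to a payload only. The pseudo header and TCP header are left out for brevity, and `internet_checksum` is an illustrative name. The sketch also shows one limitation of this kind of sum: it does not depend on word order, so swapping two 16-bit words leaves the checksum unchanged.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad an odd-length input on the right with a zero octet
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


payload = b"ABCD"
reordered = b"CDAB"  # the same two 16-bit words, swapped
altered = b"ABCE"    # last byte changed

print(internet_checksum(payload) == internet_checksum(reordered))
print(internet_checksum(payload) == internet_checksum(altered))
```

A single changed byte is caught, but the reordered payload passes, which is one reason a 16-bit checksum is a weaker integrity guarantee than a content hash.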
-
- DC++ Contributor
- Posts: 3212
- Joined: 2003-01-07 21:46
- Location: .pa.us
PseudonympH wrote:
Cliff's notes: no, Garg
Thanks. The RFC seems to have left "text" undefined, though it seems that if it's not the header, it would naturally mean the data payload.
ZoneAlarm must have been corrupting packets and touching up the checksum, from all of the corruption reports we've experienced.
-
- Forum Moderator
- Posts: 366
- Joined: 2004-03-06 02:46