# Hashing--TTH
Some observations for your clarification.
1. Unfinished file names. In previous versions, the unfinished file name had an identity and could be tracked easily. Now, if a file has a TTH, its name is hidden inside a junk string. Moreover, since the extension is concealed, the icon is missing! It is difficult to run the file through, say, RealPlayer. Since the TTH has relevance only within the DC++ program, why not continue the previous naming scheme?
2. Let us say I have a file in the queue with a TTH. Will DC++ allow an alternate download from a user who has the exact same file but has not hashed it? If not, the purpose is defeated unless everyone hashes the file.
Nagan
-
- Posts: 25
- Joined: 2004-02-26 22:14
- Location: Wisconsin, USA
Re: Hashing--TTH
nagan wrote: If not, the purpose is defeated unless everyone hashes the file?
Everyone should have their files hashed. That's why it's mandatory in v. 0.401, and v. 0.4032 only shares hashed files.
-
- Forum Moderator
- Posts: 366
- Joined: 2004-03-06 02:46
No, but you shouldn't be downloading from them anyway since it can often cause corruption. My small private hub recently enforced a 0.401-minimum-version rule, and I'd like to see larger public hubs do the same. And why are you using realplayer when you could use a good player that ignores the filenames and looks at the actual header of the file to determine what it is, like VLC?
1. Until hubs enforce hashing or a minimum DC++ version, I think the later versions should give the user a choice: if the file name and size match and the rollback check works, DC++ should be able to download the file even when the other user has not hashed it (as was done prior to 0.401). Hashing is just another advantage and should not act as a drawback.
2. If a user opts for the later version, what happens to the many files in the old download queue that do not have a TTH?
3. The previous naming scheme for incomplete files was ideal. Putting 38 characters of junk in front of the original file name and concealing the file extension is an irritant. Let the file name come first. When I want to check a file in the incomplete download folder, I have trouble finding it. Since the TTH information resides in the DC++ queue, copying it into the file name seems unnecessary (is it really needed?). I thought the old naming scheme would still work. The TTH is there basically to identify the right files within the hub! Once they are identified, the download could proceed as per the old procedure. Even prior to 0.401, each incomplete file in the queue had a unique 8-letter junk string (after the file name) in the incomplete download folder. So adding the TTH should not matter, because the main comparison and identification take place within the program and not outside it!
-
- DC++ Contributor
- Posts: 3212
- Joined: 2003-01-07 21:46
- Location: .pa.us
OK, sir! Let us assume that queue.xml is lost. How do you think the incomplete files could be resumed with the TTH? Would the TTH have to be manually copied from the incomplete file name and searched for from the program?
What prevents the junk from following the original file name (as was done prior to version 0.401)? Checking the files being downloaded would become easy. Leaving the file extensions intact, without a .dctmp extension, would also be an improvement.
What about query 1 of my previous post?
-
- Posts: 506
- Joined: 2003-01-03 07:33
-
- DC++ Contributor
- Posts: 3212
- Joined: 2003-01-07 21:46
- Location: .pa.us
nagan wrote: OK Sir! Let us assume that the queue.xml is lost!! How do you think the incomplete files could be resumed with the tth.
Since this change, no. Since one of the latest posts was about a poor user losing his queue and manually trying to find the right source for it (if he's downloading a video, there may be a lot of sources for it), I decided to change the name of the incomplete temp file, and then write a feature that will scan a given directory, pick out the incomplete DC++ files, and make a queue of their original names and hashes. This will be implemented in an upcoming version of DC++, as will the "Preview partial file" menu choice; together they address the two biggest concerns about my change.
GargoyleMT wrote: I decided to change the name of the incomplete temp file, and then write a feature that will scan a given directory, pick out the incomplete DC++ files, and make a queue of their original names and hashes. This will be implemented in an upcoming version of DC++, as will the "Preview partial file" menu choice, which address the two biggest concerns about my change.
Methinks a file name at the beginning with a junk tail is better for checking files as they download! Since you are writing a well-defined program for it, you could just as well leave the file extensions intact.
-
- DC++ Contributor
- Posts: 3212
- Joined: 2003-01-07 21:46
- Location: .pa.us
nagan wrote: Since you are writing a well defined program for it you could as well leave the file extensions intact.
.dctmp will stay on the end - it prevents incomplete files, if shared, from being returned with regular search hits (it's the extension that determines the type - an .avi.dctmp file will not be returned for Video searches, but a *dctmp.*.avi file will be). The preview function will work on the embedded extension, so I don't see any reason why incompletes should be changed in the filesystem as well.
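To illustrate the point about the extension deciding the search type, here is a minimal sketch. It is not DC++'s actual search code: `finalExtension`, `looksLikeVideo`, and the short list of video extensions are hypothetical names and values chosen for the example. It only assumes, as the post says, that the type is taken from the last extension of the file name.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: returns the text after the last '.', lower-cased.
// Returns an empty string when the name has no dot at all.
std::string finalExtension(const std::string& name) {
    const std::string::size_type dot = name.rfind('.');
    if (dot == std::string::npos) {
        return "";
    }
    std::string ext = name.substr(dot + 1);
    std::transform(ext.begin(), ext.end(), ext.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return ext;
}

// Hypothetical, deliberately tiny list of "video" extensions.
// A real client would use a much longer list.
bool looksLikeVideo(const std::string& name) {
    const std::string ext = finalExtension(name);
    return ext == "avi" || ext == "mpg" || ext == "mpeg";
}
```

Under this assumption, `looksLikeVideo("movie.avi.dctmp")` is false because the final extension is `dctmp`, while a name ending in `.avi` is treated as video no matter what precedes it.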
-
- DC++ Contributor
- Posts: 3212
- Joined: 2003-01-07 21:46
- Location: .pa.us
nagan wrote: Thanks Gargoyle for the reply. But in future versions, can't you make an incomplete folder mandatory and add an algorithm to prevent it from being shared? Simple?
DC++ already has a mandatory incomplete folder and refuses to share your current incomplete folder.
Code: Select all
-- 0.4032 2004-08-08 --
* Unfinished files now have a slightly different naming scheme (thanks garg)
* Added default unfinished folder (thanks garg)
-- 0.302 2003-11-14 --
* Temporary downloads folder no longer shared
OK, nailed! But you may consider having the file name at the beginning.
While on this subject: it is common practice to sort downloaded files for convenience, which means moving them from the default download folder into different directories. Of course, they will still be shared. Now comes the issue! DC++ normally checks for duplicates only in the default download directory. I may end up downloading ABC.mpeg again (when it has already been downloaded and moved to another folder). With TTH in force, can you perform a check on all the shared directories so as to quell duplicates? A Herculean task?
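The check being asked for can be sketched as a lookup by hash rather than by folder and name. This is only an illustration, not DC++ code: `SharedFile`, `findDuplicates`, and `alreadyShared` are hypothetical names, and the sketch assumes the client already holds a TTH string for every shared file.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical record: a shared file's path plus its already-computed TTH root.
struct SharedFile {
    std::string path;
    std::string tth;
};

// Groups paths by hash and keeps only the hashes that occur more than once.
std::map<std::string, std::vector<std::string>>
findDuplicates(const std::vector<SharedFile>& files) {
    std::map<std::string, std::vector<std::string>> byHash;
    for (const SharedFile& f : files) {
        byHash[f.tth].push_back(f.path);
    }
    for (auto it = byHash.begin(); it != byHash.end();) {
        if (it->second.size() < 2) {
            it = byHash.erase(it);  // unique content, not a duplicate
        } else {
            ++it;
        }
    }
    return byHash;
}

// True if a file with this hash is already shared, whatever folder it is in.
bool alreadyShared(const std::vector<SharedFile>& files, const std::string& tth) {
    for (const SharedFile& f : files) {
        if (f.tth == tth) {
            return true;
        }
    }
    return false;
}
```

Because the comparison uses the hash, a file moved from the download folder to another shared directory would still be recognised before it is queued again.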
-
- Forum Moderator
- Posts: 366
- Joined: 2004-03-06 02:46
nagan wrote: Now comes the issue! DC++ normally checks for duplicates only in the default download directory. I may end up downloading ABC.mpeg again (when it has already been downloaded and moved to another folder). With TTH in force, can you perform a check on all the shared directories so as to quell duplicates? A Herculean task?
This has been requested before, and is probably already in the bug tracker. If for some reason it's not, feel free to add it; I'll vote for it.
If I'm reading the recent CVS code right, the incomplete file naming scheme has indeed been switched around to $filename.$ext.$tth.dctmp
Code: Select all
string QueueManager::getTempName(const string& aFileName, const TTHValue* aRoot) {
    string tmp(aFileName);
    if(aRoot != NULL) {
        TTHValue tmpRoot(*aRoot);
        tmp += "." + tmpRoot.toBase32();
    }
    tmp += TEMP_EXTENSION;
    return tmp;
}
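For readers without the DC++ source at hand, the naming scheme can be sketched in a self-contained form, together with the reverse step that a "scan the folder and rebuild the queue" feature would need. This is a sketch under stated assumptions, not the project's implementation: `makeTempName`, `parseTempName`, `TempNameParts`, and `kTempExtension` are hypothetical names, the hash is passed in as a ready-made Base32 string, and the TTH root is assumed to be 39 Base32 characters (192 bits, using A-Z and 2-7).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Assumption: stands in for the TEMP_EXTENSION constant in the quoted snippet.
const std::string kTempExtension = ".dctmp";

// Builds "<filename>.<tth>.dctmp"; an empty tthBase32 means no hash is known.
std::string makeTempName(const std::string& fileName, const std::string& tthBase32) {
    std::string tmp = fileName;
    if (!tthBase32.empty()) {
        tmp += "." + tthBase32;
    }
    return tmp + kTempExtension;
}

// Result of splitting a temp name; valid is false when the name does not match.
struct TempNameParts {
    bool valid = false;
    std::string fileName;
    std::string tthBase32;
};

// Splits "<filename>.<tth>.dctmp" back into the original name and the hash.
TempNameParts parseTempName(const std::string& tempName) {
    TempNameParts parts;
    const std::size_t kTthLength = 39;  // assumed length of a Base32 TTH root
    // Shortest possible match: 1-char name + '.' + hash + ".dctmp".
    if (tempName.size() < 2 + kTthLength + kTempExtension.size()) {
        return parts;
    }
    const std::size_t stemLength = tempName.size() - kTempExtension.size();
    if (tempName.compare(stemLength, kTempExtension.size(), kTempExtension) != 0) {
        return parts;
    }
    const std::string stem = tempName.substr(0, stemLength);
    const std::size_t dot = stem.size() - kTthLength - 1;
    if (stem[dot] != '.') {
        return parts;
    }
    const std::string tth = stem.substr(dot + 1);
    for (char c : tth) {
        const bool isBase32 = (c >= 'A' && c <= 'Z') || (c >= '2' && c <= '7');
        if (!isBase32) {
            return parts;
        }
    }
    parts.valid = true;
    parts.fileName = stem.substr(0, dot);
    parts.tthBase32 = tth;
    return parts;
}
```

With the file name first and the hash before `.dctmp`, both the original name (including its embedded extension) and the hash can be read back from the file system alone, which is what recovering from a lost queue.xml requires.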
-
- DC++ Contributor
- Posts: 3212
- Joined: 2003-01-07 21:46
- Location: .pa.us
PseudonympH wrote: If I'm reading the recent CVS code right, the incomplete file naming scheme has indeed been switched around to $filename.$ext.$tth.dctmp
True, it was changed last Saturday.
" CVS -- or Concurrent Versioning System -- is a system for managing simultaneous development of files. It is in common use in large programming projects, and is also useful to system administrators, technical writers, and anyone who needs to manage files."
"CVS stores files in a central repository, set (using standard Unix permissions) to be accessible to all users of the files. Commands are given to "check out" a copy of a file for development, and "commit" changes back to the repository. It also scans the files as they are moved to and from the repository, to prevent one person's work from overwriting another's."
"This system ensures that a history of the file is retained..."
You can browse the CVS here: http://cvs.sourceforge.net/viewcvs.py/d ... cplusplus/
-
- Forum Moderator
- Posts: 587
- Joined: 2003-05-07 02:38
- Location: Sweden, Linkoping
-
- DC++ Contributor
- Posts: 3212
- Joined: 2003-01-07 21:46
- Location: .pa.us
While on this subject:
1. I find that files hashed when DC++ starts do not appear in the file list, because the file list is refreshed later. Embarrassing! If I sort files and rearrange them into directories and then start DC++, hashing takes place and excludes those files from the share, which can affect the share size. The same happens when a user runs 403 for the first time: after hashing, the file list will show zero bytes! He has to restart DC++ to get all the files reflected in the share. Why not make DC++ courteous enough to ask the user at startup whether he intends to refresh the file list when hashing has taken place (necessitated by the reasons above)?
2. Is there some sort of log file for hashing that shows only which files are duplicates, so that they can be removed from the disk as well?
3. Can hashing (refreshing) be triggered by the user, the way refreshing the user list is?
-
- DC++ Contributor
- Posts: 3212
- Joined: 2003-01-07 21:46
- Location: .pa.us
nagan wrote: 1. I find that files hashed on starting DC++ don't find a place in the file list because the file list is refreshed later ... He has to restart it to get all the files reflected in the share.
Any refreshes in >= 0.4032 will list files that have been hashed since the last refresh; there's no need to restart. The development version updates your file list up to every 15 minutes to include files that have been hashed recently. So, there's nothing more to be done for the next version on that front.
nagan wrote: 2. Is there some sort of a log file for hashing to know only which files are indeed duplicates, to have them removed from the disk as well?
Yes, this has been in DC++ for a while now - enable "Log System Messages" in Logs and Sounds.
nagan wrote: 3. Can hashing (refreshing) be done as user list refreshing by the user?
Yes, see point 1.
GargoyleMT wrote: Any refreshes in >= 0.4032 will list files that have been hashed since the last refresh, there's no need to restart. The development version updates your file list up to every 15 minutes to include files that have been hashed recently. So, there's nothing more to be done for the next version on that front.
Well, what is the point if the user is not able to connect to hubs for 15 minutes when he runs it for the first time, or when he rearranges files in a way that could affect the share until it is refreshed?
-
- Forum Moderator
- Posts: 366
- Joined: 2004-03-06 02:46
-
- DC++ Contributor
- Posts: 3212
- Joined: 2003-01-07 21:46
- Location: .pa.us
nagan wrote: Well what is the point if the user is not able to connect to the hubs for 15 minutes in case he runs it for the first time or he happens to rearrange files that could affect the share until refreshed?
Given that DC++ is sharing only hashed files, what do you expect the solution to be? At 10-30 megabytes per second, 9,000 - 27,000 MB will be done in those initial minutes.
I'd suggest writing an email to an old friend, going outside for a breath of fresh air, or registering to vote.
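The 9,000 - 27,000 MB figure follows directly from the quoted hashing speed and the 15-minute refresh interval. A small sketch of the arithmetic, using hypothetical helper names and treating the 10-30 MB/s rate as a sustained average:

```cpp
#include <cassert>

// Megabytes hashed after `minutes` at a sustained rate of `mbPerSecond`.
constexpr long hashedMegabytes(long mbPerSecond, long minutes) {
    return mbPerSecond * 60 * minutes;
}

// Minutes needed to hash a share of `shareMegabytes` at `mbPerSecond`.
constexpr double minutesToHash(double shareMegabytes, double mbPerSecond) {
    return shareMegabytes / mbPerSecond / 60.0;
}
```

At 10 MB/s, 15 minutes covers 10 x 60 x 15 = 9,000 MB; at 30 MB/s it covers 27,000 MB. A smaller share finishes proportionally sooner.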
-
- DC++ Contributor
- Posts: 3212
- Joined: 2003-01-07 21:46
- Location: .pa.us
GargoyleMT wrote: Given that DC++ is sharing only hashed files, what do you expect the solution to be? At 10-30 megabytes per second, 9,000 - 27,000 mb will be done in those initial minutes. I'd suggest writing an email to an old friend, going outside for a breath of fresh air, or registering to vote.
I appreciate your way of making it clear that this point is not to be considered now!
Well, do shares need to be as large as 9,000 - 27,000 MB? Hmm...
-
- DC++ Contributor
- Posts: 3212
- Joined: 2003-01-07 21:46
- Location: .pa.us
nagan wrote: Well shares need to be so large 9000 - 27000 mb? Hmm...
GargoyleMT wrote: 9 - 27 gigabytes is not a large share, and that's what DC++ will hash in 15 minutes on an average system (at 10-30 MB/s).
Well, people with smaller shares, whose hashing completes in 5 to 10 minutes, might have to wait for some time or restart DC++.
-
- Forum Moderator
- Posts: 366
- Joined: 2004-03-06 02:46
-
- DC++ Contributor
- Posts: 3212
- Joined: 2003-01-07 21:46
- Location: .pa.us
-
- Forum Moderator
- Posts: 366
- Joined: 2004-03-06 02:46
Posted by gargoyle on 22-09-04
nagan wrote: 2. Is there some sort of a log file for hashing to know only which files are indeed duplicates, to have them removed from the disk as well?
GargoyleMT wrote: Yes, this has been in DC++ for a while now - enable the "Log System Messages" in Logs and Sounds.
System logs grow daily with lots of routine data, and duplicate files have to be searched for in them. Wouldn't a log showing only the duplicate files be more useful? Rather than a report (which may get deleted), wouldn't an interactive list, built on request from the details in the "Hash Data", be more useful?
-
- DC++ Contributor
- Posts: 3212
- Joined: 2003-01-07 21:46
- Location: .pa.us
PseudonympH wrote: Maybe because it's already a menu option, a /command, and a keyboard shortcut?
Hear, hear.
nagan wrote: Some log showing only the duplicate files will be more purposeful? Rather than a report (which may get deleted)
If you delete the file that contains the information you want, that's too bad.
Eventually, the system log may have "areas" within it (anyone who has read /var/log/messages knows what I'm talking about), but splitting duplicates out into a separate log file is absolutely ludicrous - it's the exact opposite of the purpose of system.log.
PseudonympH wrote: Maybe because it's already a menu option, a /command, and a keyboard shortcut?
Gargoyle wrote: Hear, hear.
Might as well hear the comment (with proof) just 2 posts earlier.
Gargoyle wrote: If you delete the file that contains the information you want, that's too bad. Eventually, the system log may have "areas" within it (anyone who has read /var/log/messages knows what I'm talking about), but splitting duplicates out into a separate log file is absolutely ludicrous - it's the exact opposite of the purpose of system.log.
Agreed. After every refresh, all the duplicates are listed at the end. What I was talking about was a function that lists all the duplicate files, each with a checkbox. The user can check them, and DC++ can send them directly to the Recycle Bin.
As for the log messages: rather than listing all the files hashed (which is expected and obvious), can only those that went wrong be listed, along with the other necessary log entries, of course? If a user can read any useful data out of a log of thousands of hashed files, well, no complaints.
-
- DC++ Contributor
- Posts: 3212
- Joined: 2003-01-07 21:46
- Location: .pa.us
nagan wrote: Might as well hear the comment (with proof) just 2 posts earlier.
I saw nothing that would convince me. Want to repeat it?
nagan wrote: What I was talking of was a function to list all the duplicate files along with a checkbox. The user can check them and DC can directly consign it to recycle bin.
Why add another feature to DC++ when specialized applications do the same thing, and do a better job of it?
Duplic8 - Duplicate File finder and manager
If users wanted to remove the duplicates from their share, a proper tool is only a google search away...
GargoyleMT wrote: I saw nothing that would convince me. Want to repeat it?
Well, the post from "Cologic" on 6-10-04 (dd-mm-yy).
GargoyleMT wrote: Why add another feature to DC++ when specialized applications do the same thing, and do a better job of it? If users wanted to remove the duplicates from their share, a proper tool is only a google search away...
Gee, thanks, Gargoyle! So long as the technology employed by the other tool is as perfect as hashing, so that it does not contradict DC++.
Meanwhile, I posted a request to Bugzilla titled "Resume function for file lists", as DC++ always starts downloading file lists from the beginning, even after an interruption or disconnection. I am not able to locate its status.
-
- DC++ Contributor
- Posts: 3212
- Joined: 2003-01-07 21:46
- Location: .pa.us
nagan wrote: Well, the post from "Cologic" on 6-10-04 (dd-mm-yy).
He was mocking you, in the hope of making you see how ridiculous your request was.
nagan wrote: Gee thanks Gargoyle! So long as the technology employed by the other tool is as perfect as hashing, so that it does not contradict DC++.
Gee, if you want more bloat added to DC++, feel free to try to get arne to code that feature for you.
nagan wrote: Meanwhile I posted a request to Bugzilla with the theme "Resume function for file lists" as DC++ always starts downloading file lists from the beginning, even in case of interruption or disconnection. Am not able to locate its status.
Bug 194: Resume file list transfers?
It's in bugzilla. What more do you want to know?