report dupes that has been removed

cyberal
Posts: 360
Joined: 2003-05-16 05:42

Post by cyberal » 2004-03-29 07:47

Many users have had their shares shrink in DC++ 0.401.
I guess this is because of the improved dupe-removal function, which is good.

However, this causes some problems for users who (apparently) were sharing dupes without being aware of it.

Some kind of reporting mechanism for the dupe removing would be most welcome. A "dupes.txt" in the DC++ directory would do just fine.
http://whyrar.omfg.se - Guide to RAR and DC behaviour!
http://bodstrom.omfg.se - Bodströmsamhället, a collection of links about the threats to our personal privacy

Wisp
Posts: 218
Joined: 2003-04-01 10:58

Post by Wisp » 2004-03-29 08:37

Are dupes really removed, or just not counted in the share size?

cyberal
Posts: 360
Joined: 2003-05-16 05:42

Post by cyberal » 2004-03-29 09:04

removed if you don't check the "include dupes" option.

TheParanoidOne
Forum Moderator
Posts: 1420
Joined: 2003-04-22 14:37

Post by TheParanoidOne » 2004-03-29 09:08

The reporting already exists.

Settings --> Logs and Sound --> Log System Messages
The world is coming to an end. Please log off.

DC++ Guide | Words

cyberal
Posts: 360
Joined: 2003-05-16 05:42

Post by cyberal » 2004-03-29 09:11

so there is a system message for each file that is removed then?

TheParanoidOne
Forum Moderator
Posts: 1420
Joined: 2003-04-22 14:37

Post by TheParanoidOne » 2004-03-29 09:18

Yes.

Try it. Enable the system log and do a /refresh. The log should show you the duplicates.

Todi
Forum Moderator
Posts: 699
Joined: 2003-03-04 12:16

Post by Todi » 2004-03-29 09:28

Indeed.. and it gets quite big if you have many dupe files ;) (they're checked every time you /refresh, after all, and with automatic refresh on.. I never realised I had so many dupes.. but it's a great inspiration to sort your share properly.. I only wish both locations of the file were written to the logfile.. *hint hint*)
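
For anyone whose log has grown large from repeated refreshes, here is a minimal sketch that collapses it to one entry per distinct duplicate. It assumes the line format quoted later in this thread (`Duplicate file will not be shared: NAME (Size: ...) (Directory: "DIR")`); the regular expression and the name `summarize_dupes` are illustrative assumptions, not anything shipped with DC++.

```python
import re
from collections import Counter

# Assumed line format, taken from the log excerpt quoted in this thread:
#   2004-03-29 15:55:57: Duplicate file will not be shared: test.txt (Size: 14 B) (Directory: "test")
DUPE_LINE = re.compile(
    r'^(?P<stamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): '
    r'Duplicate file will not be shared: (?P<name>.+) '
    r'\(Size: (?P<size>[^)]*)\) \(Directory: "(?P<directory>.*)"\)\s*$'
)


def summarize_dupes(lines):
    """Count how often each (directory, name, size) was reported as a duplicate."""
    counts = Counter()
    for line in lines:
        match = DUPE_LINE.match(line)
        if match:  # every other kind of log line is ignored
            key = (match["directory"], match["name"], match["size"])
            counts[key] += 1
    return counts
```

Feeding it the lines of the system log (for example `summarize_dupes(open(path, encoding="utf-8"))`, with `path` pointing at wherever your log lives) gives a counter keyed by directory, name and size, so a file reported on every automatic refresh shows up once with a hit count.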

TheParanoidOne
Forum Moderator
Posts: 1420
Joined: 2003-04-22 14:37

Post by TheParanoidOne » 2004-03-29 09:42

Todi wrote: it's a great inspiration to sort your share properly.. I only wish both locations of the file were written to the logfile.. *hint hint*)
Indeed, to both.

cyberal
Posts: 360
Joined: 2003-05-16 05:42

Post by cyberal » 2004-03-29 09:44

hm, I think that the dupe feature is broken... it seems that if the dupe files have the same name... they are not recognized as dupes...

TheParanoidOne
Forum Moderator
Posts: 1420
Joined: 2003-04-22 14:37

Post by TheParanoidOne » 2004-03-29 09:53

Duplicates are now identified by TTH. If a TTH doesn't exist, I don't think it falls back to the old system. At least that's what my brief tests show me.
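
To make that concrete, here is a rough sketch of hash-only duplicate detection as described above. It is not DC++ code: SHA-256 from Python's standard library stands in for TTH, and `content_hash` and `find_dupes` are made-up names. A file with no known hash (`None`) is never matched, which mirrors the missing fallback to name and size.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Stand-in for TTH: SHA-256 of the file contents (assumption for illustration)."""
    return hashlib.sha256(data).hexdigest()


def find_dupes(known_hashes):
    """Return (duplicate_path, first_path) pairs.

    known_hashes maps path -> hash, or None when the file has not been hashed yet.
    Unhashed files are skipped entirely; there is no fallback to name and size.
    """
    first_seen = {}  # hash -> first path that carried it
    dupes = []
    for path in sorted(known_hashes):
        digest = known_hashes[path]
        if digest is None:
            continue  # no hash available, so the file cannot be flagged
        if digest in first_seen:
            dupes.append((path, first_seen[digest]))
        else:
            first_seen[digest] = path
    return dupes
```

Remembering the first path for each hash is also what would let a log entry name both locations of a duplicate.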

Todi
Forum Moderator
Posts: 699
Joined: 2003-03-04 12:16

Post by Todi » 2004-03-29 09:55

Do they have the same hash then?

TheParanoidOne
Forum Moderator
Posts: 1420
Joined: 2003-04-22 14:37

Post by TheParanoidOne » 2004-03-29 09:57

At this point I would like to point out that everything discussed in this thread so far is actually in the changelog.

cyberal
Posts: 360
Joined: 2003-05-16 05:42

Post by cyberal » 2004-03-29 10:33

yes, I know it works by comparing hashes.. but the files are already hashed.. I had a folder shared... with hashes.. then I copied that folder and shared that too.. since it's the same files.. DC++ does no new hashing.. and no dupe removing either... cause it has no hashes to compare I guess.. ?

Wisp
Posts: 218
Joined: 2003-04-01 10:58

Post by Wisp » 2004-03-29 10:50

What's the point of removing duplicates? Why not exclude them from the share size?

TheParanoidOne
Forum Moderator
Posts: 1420
Joined: 2003-04-22 14:37

Post by TheParanoidOne » 2004-03-29 10:56

cyberal wrote:cause it has no hashes to compare I guess.. ?
Indeed.

Here's the teeny tiny test I did:

Three files shared:
C:\test\test.txt containing "This is a test"
C:\test\Folder\test.txt containing "This is b test"
C:\test\Copy of test.txt containing "This is a test"
On startup, system.log shows:
2004-03-29 15:55:44: File list refresh initiated
2004-03-29 15:55:44: Finished hashing C:\test\Copy of test.txt
2004-03-29 15:55:44: Finished hashing C:\test\Folder\test.txt
2004-03-29 15:55:44: Finished hashing C:\test\test.txt
2004-03-29 15:55:44: File list refresh finished
No dupes were removed despite two files with the same size and name, due to no available hashing information.

On /refresh, system.log shows:
2004-03-29 15:55:57: File list refresh initiated
2004-03-29 15:55:57: Duplicate file will not be shared: test.txt (Size: 14 B) (Directory: "test")
2004-03-29 15:55:57: File list refresh finished
One of the files is seen as a duplicate because of the available file hashing information. Unfortunately, due to my lousy naming scheme, it is unclear what the names refer to. :sigh:
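
The two log excerpts above fit a simple model, sketched below. It is only one reading of the observed behaviour, not DC++ source: a refresh compares files using hashes that already exist, and anything without a hash is queued for hashing and shared as-is until the next refresh. `Share`, `refresh` and `finish_hashing` are hypothetical names, and SHA-256 stands in for TTH.

```python
import hashlib


class Share:
    """Toy model: dupes are only detected for files whose hash is already stored."""

    def __init__(self):
        self.hashes = {}   # path -> digest of files hashed so far
        self.pending = {}  # path -> contents waiting to be hashed

    def refresh(self, files):
        """files maps path -> bytes. Returns the paths dropped as duplicates."""
        seen, dropped = set(), []
        for path in sorted(files):
            digest = self.hashes.get(path)
            if digest is None:
                self.pending[path] = files[path]  # hash later, cannot compare now
            elif digest in seen:
                dropped.append(path)
            else:
                seen.add(digest)
        return dropped

    def finish_hashing(self):
        """Hash everything queued by the last refresh (SHA-256 stands in for TTH)."""
        for path, data in self.pending.items():
            self.hashes[path] = hashlib.sha256(data).hexdigest()
        self.pending.clear()


if __name__ == "__main__":
    files = {
        r"C:\test\test.txt": b"This is a test",
        r"C:\test\Folder\test.txt": b"This is b test",
        r"C:\test\Copy of test.txt": b"This is a test",
    }
    share = Share()
    print(share.refresh(files))  # first pass: nothing hashed yet
    share.finish_hashing()
    print(share.refresh(files))  # second pass: hashes are available
```

In this model the first refresh can never report a duplicate, whatever the names and sizes are, which matches the startup log; only the refresh after hashing has something to compare.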

TheParanoidOne
Forum Moderator
Posts: 1420
Joined: 2003-04-22 14:37

Post by TheParanoidOne » 2004-03-29 10:58

Wisp wrote: What's the point of removing duplicates? Why not exclude them from the share size?
The "include duplicate files" option has been explained to you before. Why do you keep asking about it?

Wisp
Posts: 218
Joined: 2003-04-01 10:58

Post by Wisp » 2004-03-29 11:00

TheParanoidOne wrote:
Wisp wrote: What's the point of removing duplicates? Why not exclude them from the share size?
The "include duplicate files" option has been explained to you before. Why do you keep asking about it?
I assume you mean this thread?

There's no clear answer there, or maybe I'm overlooking something.

TheParanoidOne
Forum Moderator
Posts: 1420
Joined: 2003-04-22 14:37

Post by TheParanoidOne » 2004-03-29 11:21

Partially that thread, but what sparked off my comment was the question you asked earlier in this thread. Having re-read that question though, I see that you are actually asking something different.

I apologise.

cyberal
Posts: 360
Joined: 2003-05-16 05:42

Post by cyberal » 2004-03-29 11:57

Wisp wrote: What's the point of removing duplicates? Why not exclude them from the share size?
and that was not the point... DC++ does not see them as dupes at all!

cyberal
Posts: 360
Joined: 2003-05-16 05:42

Post by cyberal » 2004-03-29 12:03

2004-03-29 19:04:43: Finished hashing D:\Download\test1\ORiON-DiABLO_keygen_Nero.exe
2004-03-29 19:04:43: Finished hashing D:\Download\test1\ORiON-keygen-GameSpy.exe
2004-03-29 19:04:43: Finished hashing D:\Download\test1\ORiON-keygen-Nero-again.exe
2004-03-29 19:04:43: Finished hashing D:\Download\test1\ORiON-keygen-Nero.exe
2004-03-29 19:04:43: Finished hashing D:\Download\test1\ORiON-keygen-Tapptoons.exe
2004-03-29 19:04:43: Finished hashing D:\Download\test1\TMG-keygen-CloneCD.exe
2004-03-29 19:04:43: Finished hashing D:\Download\test1\TMG-keygen-FireBurner.exe
2004-03-29 19:04:43: Finished hashing D:\Download\test1\TMG-keygen-Norton2004.exe
2004-03-29 19:04:43: Finished hashing D:\Download\test1\TMG-keygen-TVtool.exe
2004-03-29 19:04:43: Finished hashing D:\Download\test1\TMG-keygen-WinRar.exe
2004-03-29 19:04:43: Finished hashing D:\Download\test1\TMG-NFO.exe
2004-03-29 19:04:45: File list refresh initiated
2004-03-29 19:04:45: File list refresh finished
2004-03-29 19:05:31: Finished hashing D:\Download\test2\ORiON-DiABLO_keygen_Nero.exe
2004-03-29 19:05:31: Finished hashing D:\Download\test2\ORiON-keygen-GameSpy.exe
2004-03-29 19:05:31: Finished hashing D:\Download\test2\ORiON-keygen-Nero-again.exe
2004-03-29 19:05:31: Finished hashing D:\Download\test2\ORiON-keygen-Nero.exe
2004-03-29 19:05:31: Finished hashing D:\Download\test2\ORiON-keygen-Tapptoons.exe
2004-03-29 19:05:31: Finished hashing D:\Download\test2\TMG-keygen-CloneCD.exe
2004-03-29 19:05:31: Finished hashing D:\Download\test2\TMG-keygen-FireBurner.exe
2004-03-29 19:05:31: Finished hashing D:\Download\test2\TMG-keygen-Norton2004.exe
2004-03-29 19:05:31: Finished hashing D:\Download\test2\TMG-keygen-TVtool.exe
2004-03-29 19:05:31: Finished hashing D:\Download\test2\TMG-keygen-WinRar.exe
2004-03-29 19:05:31: Finished hashing D:\Download\test2\TMG-NFO.exe
2004-03-29 19:05:32: File list refresh initiated
2004-03-29 19:05:32: File list refresh finished
2004-03-29 19:06:02: File list refresh initiated
2004-03-29 19:06:02: File list refresh finished
A. I added test1 and let it hash the files
B. Added test2 (which is an exact copy of test1) and let that hash
C. Did another manual refresh

GargoyleMT
DC++ Contributor
Posts: 3212
Joined: 2003-01-07 21:46
Location: .pa.us

Post by GargoyleMT » 2004-03-30 21:19

Todi wrote:I only wish both locations of the file was written to the logfile.
When it was doing duplicate checks by filename and size, this wasn't easily possible without reworking the code. =) It's probably a different story now, though.
cyberal wrote:since it's the same files.. DC++ does no new hashing
FYI, it will always hash the file again, even if the file was moved from elsewhere in your share. DC++ cannot know that the file hasn't changed without hashing.
cyberal wrote:A. I added test1 and let it hash the files
B. Added test2 (which is an exact copy of test1) and let that hash
C. Did another manual refresh
I took an already hashed file, copied it into a New Folder (in my share) (so it has the same name), /refreshed, let it finish hashing, then did another /refresh and got the duplicate-by-hash warning.

Do you still get the same behavior?

If you open your file list, are the hashes the same?

cyberal
Posts: 360
Joined: 2003-05-16 05:42

Post by cyberal » 2004-07-25 11:06

*cough*
this bug still exists..

when a file has the same hash AND filename.. it is not "dupelized" ;)

TheParanoidOne
Forum Moderator
Posts: 1420
Joined: 2003-04-22 14:37

Post by TheParanoidOne » 2004-07-25 12:30

Indeed.
CyberAL wrote:[2004-07-25 16:41] <CyberAL> I share the same albums, 4 times, cause of the junction link mp3-sorting... my reported total share size should be 1/4 of what the filelist says.. but it isn't
The use of the word "junction" made me think that you were creating NTFS hard links for directories, but reading that line again, I'm not so sure. Can you explain what you meant?

GargoyleMT
DC++ Contributor
Posts: 3212
Joined: 2003-01-07 21:46
Location: .pa.us

Post by GargoyleMT » 2004-07-25 22:09

Re-testing with the CVS code I have handy seems to confirm that files with the same name and hash are removed from the share.

You're using DC++ 0.403, not fulDC? Trem purposefully removed the dupe check in that mod.

I was going to download your whole share to see if the same happened on my end, but I figured that might be a bit wasteful, even if you do have free slots.

cyberal
Posts: 360
Joined: 2003-05-16 05:42

Post by cyberal » 2004-07-26 11:20

TheParanoidOne> you are correct, it's NTFS junction links

GargoyleMT> I use fulDC 6.49, you say Trem removed the dupe check? You know why?
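
To illustrate why junction links inflate the total when no dupe check runs, here is a small sketch that compares the size seen by a plain directory walk with the size after collapsing paths that resolve to the same underlying file. It assumes Python's `os.path.realpath` resolves the links in question (it does for symlinks; junction handling on Windows depends on the Python version), and `share_sizes` is a made-up helper, nothing from DC++ or fulDC.

```python
import os


def share_sizes(root):
    """Return (walked_bytes, unique_bytes) for everything under root.

    walked_bytes counts every path the walk reaches, links included;
    unique_bytes counts each underlying file once, keyed by its resolved path.
    Note: followlinks=True will loop forever if a link points at its own parent.
    """
    walked = 0
    unique = {}  # resolved path -> size in bytes
    for dirpath, _dirnames, filenames in os.walk(root, followlinks=True):
        for name in filenames:
            path = os.path.join(dirpath, name)
            size = os.path.getsize(path)
            walked += size
            unique[os.path.realpath(path)] = size
    return walked, sum(unique.values())
```

With one real album folder and three links pointing at it, the first number comes out four times the second, which is the 1/4 ratio described above.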

GargoyleMT
DC++ Contributor
Posts: 3212
Joined: 2003-01-07 21:46
Location: .pa.us

Post by GargoyleMT » 2004-07-26 12:37

cyberal wrote:I use fulDC 6.49, you say Trem removed the dupe check? You know why?
Paraphrasing him, it's something like "it took too long to dupe check for people with big shares."

I'm not sure if he said it here, or in one of the DCDev hubs. If exact wording is important, I can find it later.

cyberal
Posts: 360
Joined: 2003-05-16 05:42

Post by cyberal » 2004-07-27 01:35

k, thanks for the help :)
