ID3 independent checksum [possible?]

Archived discussion about features (predating the use of Bugzilla as a bug and feature tracker)


Locked
grenavitar
Posts: 8
Joined: 2003-04-27 18:33


Post by grenavitar » 2004-05-25 09:25

For a while now I have noticed many songs whose sizes differ by only a few bytes, which leads me to believe they are the same encoding with different tags. I was wondering whether there is a way of hashing that would be ID3 independent (just as hashing is filename independent now), and whether it makes sense to incorporate such a thing into DC++. If you selected such a file you might have to fix the ID3 tags, but you would have the file. I do not know much about the viability of this, but I figure I can at least raise the topic and get some responses.

While searching for more information about this, I found a Hydrogen Audio thread and links to GetID3 and MORG, which apparently do this kind of hashing.

GargoyleMT
DC++ Contributor
Posts: 3212
Joined: 2003-01-07 21:46
Location: .pa.us

Post by GargoyleMT » 2004-05-25 10:16

This same discussion has happened on the Gnutella mailing list[1]. The decision there seems to be either to do full-file hashes and leave things the way they are, or to virtually strip the metadata and hash the resulting audio-only data. The metadata, I believe, is sent separately and could in theory be reconstituted on the remote end.
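To make the "virtually strip the metadata, then hash" idea concrete, here is a minimal Python sketch. It is not how DC++ or any Gnutella client does it; the function names `synchsafe_to_int` and `audio_only_sha1` are made up for illustration, and SHA-1 is used only because the threads talk about audio-only SHA-1s. It assumes the only metadata is an ID3v2 tag at the start of the file (10-byte header with a synchsafe size) and/or a 128-byte ID3v1 tag at the end, and it ignores APEv2, Lyrics3, and ID3v2 tags appended at the end.

```python
import hashlib


def synchsafe_to_int(b: bytes) -> int:
    """Decode a 4-byte ID3v2 synchsafe integer (7 useful bits per byte)."""
    return (b[0] << 21) | (b[1] << 14) | (b[2] << 7) | b[3]


def audio_only_sha1(data: bytes) -> str:
    """Hash the bytes between a leading ID3v2 tag and a trailing ID3v1 tag.

    Hypothetical helper: handles only the two most common tag layouts.
    """
    start, end = 0, len(data)

    # ID3v2: "ID3", 2 version bytes, 1 flags byte, 4-byte synchsafe size.
    if len(data) >= 10 and data[:3] == b"ID3":
        size = synchsafe_to_int(data[6:10])
        footer = 10 if data[5] & 0x10 else 0  # footer flag (ID3v2.4)
        start = min(10 + size + footer, end)

    # ID3v1: the last 128 bytes of the file, starting with "TAG".
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128

    return hashlib.sha1(data[start:end]).hexdigest()


if __name__ == "__main__":
    # Fake "audio" payload; real MPEG frame parsing is out of scope here.
    audio = b"\xff\xfb\x90\x00" + bytes(range(256)) * 4
    id3v2 = b"ID3\x03\x00\x00" + b"\x00\x00\x00\x14" + b"\x00" * 20
    id3v1 = b"TAG" + b" " * 125
    tagged = id3v2 + audio + id3v1
    print(audio_only_sha1(audio) == audio_only_sha1(tagged))
```

Two copies of the same encoding with different tags would then share one hash, while their full-file hashes would still differ. The sketch also shows the cost mentioned in this thread: the hasher has to understand the file format, and every other format would need its own stripping rules.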

The strip-and-hash approach is pretty ugly: if MP3s deserve this treatment, pretty much all other audio formats (Ogg, FLAC) do too.

This also makes things more complicated, since DC++ doesn't know anything about the format of certain files. Once it does, updating becomes much more important for the end-users (and developers).

I'm not sure how this will be addressed (including not at all) in DC++. There's no obvious clean solution.

Please read at least some of the mailing list (below); I'd like to have a real, informed conversation on this topic.

[1] - http://groups.yahoo.com/group/the_gdf/

reallyjoel
Posts: 5
Joined: 2003-07-04 06:00

Post by reallyjoel » 2004-05-29 07:31

from the site you linked:
"... Okee dockeye. Donkey's?! Comes with kikiriki. Let me try add to the convers
Har har... =) Just call us AquaBear... -dave- . . . (...horrible joke, I know,
... LOL I don't think we need to change it either. No one visits the URL with"

informed conversation..?

GargoyleMT
DC++ Contributor
Posts: 3212
Joined: 2003-01-07 21:46
Location: .pa.us

Post by GargoyleMT » 2004-05-29 08:31

reallyjoel wrote: informed conversation..?
Yes, once you find the threads covering the stripping of tag data from MP3 files and the possibility of audio-only SHA-1 hashes, it is informative.

(These are the first messages in the threads)
http://groups.yahoo.com/group/the_gdf/message/19314
http://groups.yahoo.com/group/the_gdf/message/19069
http://groups.yahoo.com/group/the_gdf/message/17549

The GDF is the Gnutella developer mailing list, so some of the posts will be specific to Gnutella, but I think the audio-only hash thread was a fairly good generic one.
