For a while now I have noticed many songs whose sizes differ by only a few bytes, which leads me to believe they are the same encoding with different tags. I was wondering whether there is a way of hashing that would be ID3-independent (just as it is filename-independent now), and whether it makes sense to incorporate such a thing into DC++, so that you might have to fix the ID3 tags afterwards but you would still have the files. I do not know much about the viability of this, but I figure I can at least put the topic out and get some responses.
While searching for more information about this, I found a Hydrogen Audio thread and links to GetID3 and MORG, which apparently do this kind of hashing.
ID3 independent checksum [possible?]
-
- Posts: 8
- Joined: 2003-04-27 18:33
- Contact:
-
- DC++ Contributor
- Posts: 3212
- Joined: 2003-01-07 21:46
- Location: .pa.us
This same discussion has happened on the Gnutella mailing list[1]. The decision there seems to be either to do full-file hashes and let things be the way they are, or to virtually strip the metadata and hash the resulting audio-only data. The metadata, I believe, is sent separately and could (in theory) be reconstituted on the remote end.
This is pretty ugly - if mp3s deserve this treatment, pretty much all audio subtypes (OGGs, FLACs) also do.
This also makes things more complicated, since DC++ doesn't know anything about the format of certain files. Once it does, updating becomes much more important for the end-users (and developers).
I'm not sure how this will be addressed (including not at all) in DC++. There's no obvious clean solution.
Please read at least some of the mailing list (below); I'd like to have a real, informed conversation on this topic.
[1] - http://groups.yahoo.com/group/the_gdf/
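To make the "virtually strip the metadata, then hash" idea above concrete, here is a minimal Python sketch. It is only an illustration, not how DC++ or any Gnutella client does it: the function names are made up for this example, it assumes a tag is either an ID3v2 block at the very start of the file or a 128-byte ID3v1 block at the very end, and it ignores other metadata (APEv2, Lyrics3, appended ID3v2 tags) entirely.

```python
import hashlib


def id3v2_size(data: bytes) -> int:
    """Return the byte length of a leading ID3v2 tag, or 0 if there is none."""
    if len(data) < 10 or data[:3] != b"ID3":
        return 0
    # The tag size is a 28-bit "synchsafe" integer: four bytes, 7 bits each.
    size = 0
    for b in data[6:10]:
        size = (size << 7) | (b & 0x7F)
    # 10-byte header, plus a 10-byte footer when flag bit 0x10 is set (ID3v2.4).
    footer = 10 if data[5] & 0x10 else 0
    return 10 + size + footer


def audio_only_sha1(data: bytes) -> str:
    """SHA-1 of the bytes left after skipping ID3v2 (front) and ID3v1 (back)."""
    start = id3v2_size(data)
    end = len(data)
    # ID3v1 is a fixed 128-byte trailer that begins with the ASCII bytes "TAG".
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128
    return hashlib.sha1(data[start:end]).hexdigest()
```

Two copies of the same encoding with different tags should then give the same digest, e.g. audio_only_sha1(open(path, "rb").read()) for each file. Every other format (OGG, FLAC) would need its own stripping rules, which is exactly the maintenance problem mentioned above.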
-
- Posts: 5
- Joined: 2003-07-04 06:00
-
- DC++ Contributor
- Posts: 3212
- Joined: 2003-01-07 21:46
- Location: .pa.us
reallyjoel wrote: informed conversation..?
Yes, once you find the threads covering stripping of tag data from mp3 files, and the possibility of audio-only SHA-1s, it is informative.
(These are the first messages in the threads)
http://groups.yahoo.com/group/the_gdf/message/19314
http://groups.yahoo.com/group/the_gdf/message/19069
http://groups.yahoo.com/group/the_gdf/message/17549
The GDF is the Gnutella developer mailing list, so some of the posts will be specific to Gnutella, but I think the audio-only hash thread was a fairly good generic one.