Hmm. Well, we definitely need to be able to return hashes with $Search results at some point, to get the hash to passive clients. Or do we? I suppose a passive user might be the only one of many sources of a specific file online at a given time, and you'd need the hash from him/her to add the file to your queue. Is this a reasonable scenario?
What you mean is that we can get the hash from someone other than the passive user, right? Well, in that case, let's ignore the passive user as a source for hashes. Problem solved (and yes, I have an "active" connection myself).
GargoyleMT wrote:When searching by HASH>[value], do we need to return hashes? The client can just request the hash when we connect to that source and verify that it's the right file. And non-hash enabled clients shouldn't send back results at all.
If we search by hash, we should never get a different hash back. That would be like searching for "Britney Spears" and getting results without "Britney Spears" anywhere in their filenames/paths - an indication of a faulty client. Thus, we can assume that the returned search results did, indeed, match the hash we specified, but since we'll check this when we connect and retrieve the file anyway, it's no biggie.
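To make the "check this when we connect and retrieve the file" step concrete, here is a minimal sketch of re-hashing a completed download and comparing it against the hash we searched for. It uses SHA-1 as a stand-in, since TigerTree is not in Python's standard library; the function name is invented for illustration.

```python
import hashlib

def verify_download(path, expected_hex, algo="sha1"):
    """Re-hash a finished download and compare it to the hash we searched for.
    Illustrative only: real TigerTree hashing isn't in the stdlib, so SHA-1
    stands in here."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        # Stream the file in chunks so large downloads don't fill memory.
        for chunk in iter(lambda: f.read(64 * 1024), b""):
            h.update(chunk)
    return h.hexdigest() == expected_hex
```

A faulty client returning a bogus result would simply fail this check, and we could drop it as a source at that point.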
GargoyleMT wrote:Do we need to overload $Search at all? If we just use the notation above in keywords...
We might just put this within the size field and set T and F (or was it F and T?) in the size flags, thus allowing other clients to simply ignore it. I'd recommend using a new filetype instead (0xFF would be good) - that would make it really easy for clients to ignore any filetypes they do not recognize.
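If memory serves, an NMDC $Search is a handful of '?'-separated fields (size-restricted flag, max-size flag, size, filetype, pattern). A rough sketch of what the proposed hash filetype could look like on the wire - the 255/0xFF type and the "HASH:" pattern prefix are this thread's proposal, not anything in the existing protocol:

```python
def build_hash_search(source, hash_value, filetype=255):
    # NMDC $Search fields: size-restricted?, is-max-size?, size, type, pattern.
    # Filetype 255 (0xFF) is the *proposed* hash type from this thread, not a
    # real NMDC type; old clients that only handle known types would skip it.
    return "$Search {} F?F?0?{}?HASH:{}".format(source, filetype, hash_value)

def parse_search(line):
    # Split off the command, then the source address, then the five fields.
    _, rest = line.split(" ", 1)
    source, fields = rest.split(" ", 1)
    restricted, is_max, size, ftype, pattern = fields.split("?", 4)
    return source, int(ftype), pattern
```

A client that whitelists the filetypes it understands never even has to see the hash syntax.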
GargoyleMT wrote:This is the problem with the one-true-hash idea, which of the web direct file links have a TT hash as their key?
None, as far as I've heard.
GargoyleMT wrote:[snip]True, but I think saving the "bad" sources is probably a good idea. It could be reused in at least another feature I've seen, to "switch" to another source if the speed is below a certain threshold. It also seems like the right thing to do the first time... If someone has a limited number of download slots, then they would probably not appreciate one of those being used to connect to a user that isn't a valid source anyway.
There has already been a DC++ clone/branch which implemented a "blacklist" (Opera's version, perhaps?) - a variant of that could be used.
GargoyleMT wrote:I was kinda hoping that someone else would jump into this conversation, but if nobody is, we must have our heads on straight.
Well, I'm jumping in now, but only since you asked so nicely.
The reason I've not commented on this thread before is that there has not been anything for me to say something about - I have close to no knowledge about hashes, and since the collective "we" decided to dive headlong into TigerTree hashes, Merkle hashes and SHA-1 and... well, you get the idea.
sandos wrote:I would much like to see metadata like this transferred over the hub, actually. I, as a passive mode user, might want to get the TTH for a file, or even SHA1, from another passive user, so that I can use that to search.
If metadata were passed over the hub... woweee! Has anyone done any calculations of how much this would increase the bandwidth usage on the hub? Since most hubs already seem to bring broadband connections to their knees with simple chat and commands, getting them to shuffle XML metadata would be a death stroke - to either them or the clients that try to get the hub to do things it doesn't want to.
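Since nobody seems to have run the numbers, here is a back-of-envelope estimate. Every figure below is an assumption picked purely for illustration (hub population, share sizes, and bytes of XML per file), not a measurement:

```python
# Back-of-envelope: extra traffic if per-file XML metadata went over the hub.
# All three numbers are assumptions for illustration, not measurements.
users = 1000            # assumed users on a busy hub
files_per_user = 500    # assumed average share size, in files
bytes_per_entry = 150   # assumed XML per file (name, size, hash, a few tags)

total_bytes = users * files_per_user * bytes_per_entry
print("~%.0f MB through the hub per full exchange" % (total_bytes / 1e6))
```

Even with these modest assumptions that's on the order of 75 MB per full exchange - exactly the kind of load that would finish off an already-struggling hub.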
sandos wrote:There's also the option to add metadata to the filelist. I heard Arne mention xml-based filelists, and that sounds very good to me, should make it possible to add arbitrary metadata.
A much better way of handling things... this way one could even add alternates (based on the possible "hash" metadata) in an efficient way.
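As a rough sketch of why XML filelists make arbitrary metadata cheap: each file entry can simply grow optional attributes. The tag and attribute names below (FileListing, File, TTH) are invented for illustration, not a real schema:

```python
import xml.etree.ElementTree as ET

def make_filelist(entries):
    """Build a hypothetical XML filelist where each <File> can carry optional
    metadata, e.g. a TTH root. Tag/attribute names are invented for
    illustration, not a real DC++ schema."""
    root = ET.Element("FileListing")
    for name, size, tth in entries:
        f = ET.SubElement(root, "File", Name=name, Size=str(size))
        if tth:  # arbitrary metadata is easy to bolt on per-entry
            f.set("TTH", tth)
    return ET.tostring(root, encoding="unicode")
```

A client browsing such a list could collect the hash attributes and use them to find alternate sources, without the hub ever being involved.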
sandos wrote:[snip quote about segment hashing]
This also makes partial filesharing and "swarming" possible using hashes.
AFAIK swarming is about downloading segments from multiple users... since users in our system are very likely to have a complete file (or at least very unlikely to just have loads of segments lying around on their storage media), we might as well search for complete files. Complete files have slightly better "longevity", too: unfinished/temporary files are deleted with impunity, while people are more likely to store the "MegaLegal MusicMovie from the Group With No Name.avi" which they spent a while collecting, but not the 33rd, 48th and 75th segments of it.
My main reason to support ED2K/sig2dat/mumbojumbo links is that they are useful in and of themselves - with such a link, I could have DC++ in its current form search for alternates and download the latest releases automatically. Sure, it would be nice to support the hashes they use, but that's not necessary for the links to be useful. Merely helpful.
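For the curious, the ed2k file-link format is simple enough that pulling out the searchable bits takes a few lines. A minimal sketch, handling only the plain |file| form and doing no URL-unescaping:

```python
def parse_ed2k(link):
    """Extract filename, size, and MD4 hash from an ed2k:// file link, so a
    client could use them to search for alternates. Minimal sketch: only the
    plain |file| form is handled, and no URL-unescaping is done."""
    parts = link.split("|")
    if len(parts) < 5 or parts[0] != "ed2k://" or parts[1] != "file":
        raise ValueError("not an ed2k file link")
    name, size, md4 = parts[2], int(parts[3]), parts[4]
    return name, size, md4
```

Whether the client then searches by that hash or just by the filename is a separate question - the link is helpful either way.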
Hope this helps confuse the issue further.
When you have had all that you can take, put the rest back.