Improved sources search
I am not really sure whether this option isn't already included, but it seems to me that it isn't.
I would like to see that when DC++ itself searches for an alternative source for a download, it looks for the one (if there is any) which has the most free slots out of its total number of slots and the fastest connection type.
A short example: there are 4 sources available:
1. 4 free slots out of 10, DSL
2. 5 free slots out of 5, T3
3. 1 free slot out of 15, T1
4. 6 free slots out of 6, Cable
It would automatically select source 2, because the download from it would be the fastest.
If that's possible.
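The ranking described above can be sketched roughly as follows. This is only an illustration: the connection-speed table, the `Source` fields and `pick_best` are hypothetical names, not DC++'s actual data structures.

```python
# Hypothetical sketch of the proposed source ranking; the speed table and
# Source fields are illustrative, not DC++'s real internals.
from dataclasses import dataclass

# Assumed nominal speeds (Mbit/s) for the advertised connection types.
SPEED = {"T3": 44.7, "Cable": 4.0, "DSL": 1.5, "T1": 1.544}

@dataclass
class Source:
    name: str
    free_slots: int
    total_slots: int
    connection: str

def pick_best(sources):
    """Prefer sources with free slots, then the best free/total ratio,
    then the fastest advertised connection type."""
    usable = [s for s in sources if s.free_slots > 0]
    if not usable:
        return None
    return max(usable,
               key=lambda s: (s.free_slots / s.total_slots,
                              SPEED.get(s.connection, 0.0)))

# The four example sources from the post above.
sources = [
    Source("1", 4, 10, "DSL"),
    Source("2", 5, 5, "T3"),
    Source("3", 1, 15, "T1"),
    Source("4", 6, 6, "Cable"),
]
```

With these four sources, the slot-ratio tie between sources 2 and 4 is broken by the faster advertised connection type, so source 2 wins, matching the example.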
The "fastest connection type" is unreliable, to say the least, since this is something each user sets on their own. It has nothing to do with their actual speed or their connection.
A problem that comes to mind with the "most free slots first" idea is that the autosearch for alternatives often adds these sources quite a while before they are actually needed, so the number of free slots indicated is no longer accurate.
-
- Forum Moderator
- Posts: 587
- Joined: 2003-05-07 02:38
- Location: Sweden, Linkoping
Re: Improved sources search
For picking the fastest source, I think my suggestion is better.
I have read your suggestion, but I think the biggest problem is that the information would not be current, and the process would then have to be repeated over and over again.
For example, DC++ would download 10-20 file lists or even more, and that would take a remarkable amount of time; in the meantime the speeds would also change, because new uploads would be started.
If the selected source were dropped because of low speed, and the next sources were also slow, the whole process might never end at all.
Perhaps a possible solution is simply dropping very slow downloads and selecting alternatives on my principle (if there are any), without calculating.
The performance would be better and there would be no threat of an unending chain.
I think that Wisp's suggestion is a really good one, and that it would improve the program. It's true that the speeds at which DC++ got the file lists, and sorted the users accordingly, could have changed considerably in the time between the program getting the file lists and actually switching to another source.
But this kind of sorting would at least be a qualified guess as to which users are fast, which in my opinion is better than the current system. At least it sorts out (or places last in the available sources list) the really slow users.
I also think that the suggestion is good, but it has to be set up so that it doesn't lead to an unending chain or a process which lasts too long.
The program should select just the sources with the most free slots (for example 5), calculate the speed from all of them, then check the slot ratios again at the end and pick the appropriate connection. By doing so it would pick the best connection currently available.
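The probe-then-pick procedure described above can be sketched like this. `select_source` and `measure_speed` are hypothetical names; the speed measurement (e.g. a short test transfer) is an assumed callback, not an existing DC++ facility.

```python
# Sketch of the refinement discussed above: probe only the few sources
# with the best slot ratios, then keep the fastest measured one.
# measure_speed is a hypothetical callback, not a DC++ API.

def select_source(sources, measure_speed, probe_limit=5):
    """sources: list of (name, free_slots, total_slots) tuples.
    Probe at most probe_limit sources, best slot ratio first."""
    open_sources = [s for s in sources if s[1] > 0]
    open_sources.sort(key=lambda s: s[1] / s[2], reverse=True)
    candidates = open_sources[:probe_limit]
    if not candidates:
        return None
    # Keep the candidate with the highest measured speed.
    return max(candidates, key=lambda s: measure_speed(s[0]))

# Hypothetical measured speeds (kB/s) for three probed sources.
speeds = {"a": 50, "b": 120, "c": 80}
best = select_source([("a", 2, 4), ("b", 3, 3), ("c", 1, 10), ("d", 0, 5)],
                     speeds.get)
```

Source "d" has no free slots and is never probed; of the remaining three, the fastest measured one is kept, limiting the probing work to a handful of sources per file.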
-
- Forum Moderator
- Posts: 366
- Joined: 2004-03-06 02:46
Re: Improved sources search
Wisp wrote: For picking the fastest source, I think my suggestion is better.
For picking the fastest source, I think your second idea is retarded.
Re: Improved sources search
PseudonympH wrote: For picking the fastest source, I think your second idea is retarded.
I don't have a second idea yet.
PseudonympH wrote: My mistake; I skimmed it and thought second paragraph meant second idea. It's still a stupid kluge.
I'm totally new to this forum, but I have already seen you three times saying that someone's idea is stupid without explaining why or giving any kind of good argument.
Is that some kind of bad habit of yours?
-
- DC++ Contributor
- Posts: 3212
- Joined: 2003-01-07 21:46
- Location: .pa.us
DolphinSI wrote: For example, DC++ would download 10-20 file lists or even more, and that would take a remarkable amount of time; in the meantime the speeds would also change, because new uploads would be started.
If the file lists were downloaded anyway, for queue matching, that would be OK. But intentionally downloading a file list just to see if I'm the fastest user is wasteful. (Plus, imagine if everyone was doing that and searched at [nearly] the same time.)
Of course, I'm well aware of it. But I'm also aware that DC++ already downloads too many file lists which don't end up being used.
For example: when I search for an alternative source, the program starts to download a great number of file lists without being told to. In my case the file lists are being downloaded for 5-10 minutes, which is a remarkable amount of time turned to waste. Imagine now that you would like to search for the fastest downloads for 10 files; the file lists would be downloaded for 50-100 minutes. That's a waste of time, in my opinion.
The biggest problem is with users with large shares and slow upload speed, because almost all of their slots are in use.
I myself (just like anybody else, I think) pick just the sources that have the most free slots and turn out to be the fastest.
The program could do that as well. It should take fewer sources (3-5 max) with the best slot ratio, calculate their speed and pick the best choice. It would be faster, because the program would pick fewer sources than it would in my original suggestion. If the source were cut, it would pick a free source on the same principle and there would be still less downloading.
PseudonympH wrote: a) nobody has taken me up on it yet
b) it's already been discussed ten thousand times
a) Calling someone's idea stupid is still a bad habit. The idea is just not good.
b) Discussion leads to answers, or at least to knowledge and more understanding of the problem discussed. You find a solution to a problem by trying to find the right way, not by picking the right way at once.
I think this feature is a hack to try to do what multisource downloading accomplishes without any work (besides the complete overhaul required of many of DC++'s internals).
I'm more willing to work on a long-term solution (multisource) than a short-term one (speed finding on existing sources). If you feel otherwise, feel free to contribute, just as I have.
GargoyleMT wrote: I think this feature is a hack to try to do what multisource downloading accomplishes without any work (besides the complete overhaul required of many of DC++'s internals). I'm more willing to work on a long-term solution (multisource) than a short-term one (speed finding on existing sources). If you feel otherwise, feel free to contribute, just as I have.
I'll agree on the multisource idea. If I understand it correctly, there are some mods that have multisource implemented already. I don't have the knowledge to judge whether these are good implementations or not, but maybe some of you guys can shed some light on this subject: what algorithms to use, how to implement it, and whether there are arguments for why it hasn't been implemented before.
"Nothing really happens fast. Everything happens at such a rate that by the time it happens, it all seems normal."
Guitarm wrote: I'll agree on the multisource idea. If I understand it correctly, there are some mods that have multisource implemented already.
As far as I understand, all of the multisource clients use the reverse connect code. I'm not sure if that's 100% true. And I'm not sure if any of the users of that code have altered it significantly.
Guitarm wrote: What algorithms to use, how to implement it, and whether there are arguments for why it hasn't been implemented before
Well, DC++ would reimplement it, regardless of the existing code. The Queue and Download Managers must be changed so that they can cope with ranges of a file being complete, instead of assuming 0 to (current file size) is contiguous. The files being downloaded will have to be open, yet written to in several different places, at different rates. The actual download requests will need to be either chunked into fixed sizes (using ADCGET or $?Get*Block), or determined more dynamically. This will make people's log files fairly ugly, so some better method of logging might be desired (or a default format that specifically includes the start position and range).
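The non-contiguous bookkeeping described above can be sketched as fixed-size chunks plus a per-file record of which chunks are complete. The 64 KiB chunk size and the `PartialFile` class are assumptions for illustration, not DC++'s actual design.

```python
# Sketch of non-contiguous download bookkeeping: the file is split into
# fixed-size chunks, and a per-file set records which chunks are done.
CHUNK = 64 * 1024  # hypothetical fixed request size (e.g. one chunked GET)

class PartialFile:
    """Tracks which fixed-size chunks of a downloading file are complete."""
    def __init__(self, size, chunk=CHUNK):
        self.chunk = chunk
        self.nchunks = (size + chunk - 1) // chunk
        self.done = set()

    def mark_done(self, index):
        self.done.add(index)

    def next_request(self):
        """Lowest incomplete chunk, so the file fills from the front."""
        for i in range(self.nchunks):
            if i not in self.done:
                return i
        return None

    def complete(self):
        return len(self.done) == self.nchunks

# A 200 KiB file splits into 4 chunks; chunks can finish out of order.
f = PartialFile(200 * 1024)
f.mark_done(0)
f.mark_done(2)
```

With this shape, the Queue Manager never has to assume that bytes 0 to (current file size) are contiguous; it only consults the chunk set.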
Multi-source downloading has 2 major drawbacks:
1) much more space is needed for incomplete files (unless you store the segments in separate files),
2) a higher total number of slots will be needed for the same total bandwidth the users have, which is pointless and potentially increases the number of timeouts when more connections are used at the same time.
Of course I don't have to use it, but I like the idea of choosing the best source a lot more than multi-source downloading.
paka wrote: 1) much more space is needed for incomplete files (unless you store the segments in separate files)
It requires no more space than the completed file + some bitmap that tells which chunks have been completed (which would probably be roughly the size of the TTHL, 24 kB).
paka wrote: 2) a higher total number of slots will be needed for the same total bandwidth the users have, which is pointless and potentially increases the number of timeouts when more connections are used at the same time.
I don't see this. If the slots remain the same, they'll be occupied for less time, because otherwise unused slots will reduce the amount of time needed to transfer a given file.
paka wrote: Of course I don't have to use it, but I like the idea of choosing the best source a lot more than multi-source downloading.
That's a quicker fix, but it will probably result in modifications to the existing code that may be hard to navigate or understand. With a clean multisource implementation, the code should be more understandable.
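The bitmap overhead GargoyleMT describes can be estimated with a quick calculation; the 64 KiB chunk size is an assumption, chosen only to show the order of magnitude.

```python
# One bit per chunk is enough to record which chunks are complete.
def bitmap_bytes(file_size, chunk_size=64 * 1024):
    """Size in bytes of a completed-chunk bitmap for a file."""
    chunks = (file_size + chunk_size - 1) // chunk_size
    return (chunks + 7) // 8

# A 700 MB file splits into 11200 chunks of 64 KiB, so the bitmap is
# only 1400 bytes: tiny next to the file and well under 24 kB.
print(bitmap_bytes(700 * 1024 * 1024))  # prints 1400
```

So the extra per-file bookkeeping is negligible; the disk-space concern is about preallocating the file itself, not about tracking the chunks.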
GargoyleMT wrote: It requires no more space than the completed file + some bitmap that tells which chunks have been completed (which would probably be roughly the size of the TTHL, 24 kB).
That's exactly what I meant. Maybe I wasn't precise enough: I have a queue of about 150 GB, about 10-20 GB of free space on the download partition, and I'm able to download successfully at the moment. I suppose you get the point now. With multi-segment downloading it's going to be impossible.
GargoyleMT wrote: I don't see this. If the slots remain the same, they'll be occupied for less time, because otherwise unused slots will reduce the amount of time needed to transfer a given file.
Usually the better slots (with higher speed, of course) are occupied anyway. What I'm afraid of is the situation when hub admins change the slot rules to higher minimal numbers because the demand for slots may increase.
GargoyleMT wrote: That's a quicker fix, but it will probably result in modifications to the existing code that may be hard to navigate or understand. With a clean multisource implementation, the code should be more understandable.
I suppose so. Also, it seems to me that implementing a best-source selection algorithm should be easier after multi-source downloads are added. When these features are (hopefully) present, it would be great if multi-source and/or best-source selection were optional, because of the disk space consumption with multi-source enabled (as in the situation above). The solution could be to set the maximal number of segments to 1 and, in that case, not create a file of its total size at the beginning of a download/when queueing.
BTW, the diffs of JDC++ 0.401 (http://jdcpp.free.fr/) with the best-source selection mod have a total size of 19 kB. That is something, considering that it's 14 files modified, but still it's not that much. The work's been done, but the patch wasn't accepted due to formal problems (see the bottom of the site). I understand these, but it's a pity that no one has made a successful attempt to comply with vanilla's code standard since then (I don't know C++ well enough myself).
paka wrote: That's exactly what I meant. Maybe I wasn't precise enough: I have a queue of about 150 GB, about 10-20 GB of free space on the download partition, and I'm able to download successfully at the moment. I suppose you get the point now. With multi-segment downloading it's going to be impossible.
However, with multisource, individual files will complete faster, so they spend less time in the temp directory. Besides, the multisource algorithm could be implemented in leapfrog fashion from the beginning of the file instead of in random order, so we don't need to allocate the whole file up front.
PseudonympH wrote: However, with multisource, individual files will complete faster, so they spend less time in the temp directory.
You're probably right. The question is: how much faster? 2 times? I don't think so. The total upload bandwidth will remain the same (unless ISPs change it), so multi-source can only optimise its use. With some slight manual management I usually have no problems using up my 1 Mb/s.
Still, I would have to reserve 150 GB for a 150 GB queue (or even 50 GB, if it were smaller due to faster downloads), and now I don't have to.
PseudonympH wrote: Besides, the multisource algorithm could be implemented in leapfrog fashion from the beginning of the file instead of in random order, so we don't need to allocate the whole file up front.
Yup, so there is a solution. The question is: how difficult will it be to implement?
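The leapfrog order can be sketched as always claiming the lowest chunk that is neither finished nor already being downloaded, so completed data stays near the front of the file and the incomplete file never needs to be preallocated to its full size. `leapfrog_assign` is a hypothetical helper, not the actual patch's algorithm.

```python
def leapfrog_assign(nchunks, done, in_progress):
    """Lowest chunk index that is neither done nor being downloaded,
    so the file grows from the beginning rather than in random order."""
    for i in range(nchunks):
        if i not in done and i not in in_progress:
            return i
    return None

# Three sources each claim the next-lowest free chunk of a 6-chunk file
# whose first chunk is already complete.
claimed = set()
order = []
for _ in range(3):
    c = leapfrog_assign(6, done={0}, in_progress=claimed)
    claimed.add(c)
    order.append(c)
```

The three sources "leapfrog" each other down the file, claiming chunks 1, 2 and 3; whichever finishes first claims the next free chunk.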
paka wrote: I suppose you get the point now. With multi-segment downloading it's going to be impossible.
No, I don't understand. You've queued 150 GiB of information, but don't have space for all of it. If all of your sources were available now, you'd also run out of space...
Also, google for Sparse Files on NTFS disks...
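A small sketch of the trick GargoyleMT alludes to: seeking past the end of a file before writing gives it its full logical size while, on sparse-capable filesystems (NTFS 5, ext4 and others), the unwritten hole occupies almost no disk blocks. The file name and size here are arbitrary.

```python
import os
import tempfile

# Give a file a 10 MiB logical size by writing a single byte at the end;
# on sparse-capable filesystems the hole consumes (almost) no disk blocks.
size = 10 * 1024 * 1024
path = os.path.join(tempfile.mkdtemp(), "sparse.bin")
with open(path, "wb") as f:
    f.seek(size - 1)
    f.write(b"\0")

# The logical size is the full 10 MiB on any filesystem; only the
# physical allocation differs.
logical = os.path.getsize(path)
```

This is why preallocating queued files need not consume the whole queue's worth of disk space up front on such filesystems, which is the crux of the disagreement below.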
paka wrote: Usually the better slots (with higher speed, of course) are occupied anyway. What I'm afraid of is the situation when hub admins change the slot rules to higher minimal numbers because the demand for slots may increase.
That might happen, but I don't see why it would - hub owners are just as interested in users downloading quickly as the users themselves are.
paka wrote: I understand these, but it's a pity that no one has made a successful attempt to comply with vanilla's code standard since then.
Only the original coder can submit the patch to arne, since he's transferring copyright on it, so that DC++'s source is under only one person's copyright.
GargoyleMT wrote: No, I don't understand. You've queued 150 GiB of information, but don't have space for all of it. If all of your sources were available now, you'd also run out of space...
But they aren't - this is the reality, and this is why it is possible. Limited bandwidth also keeps disk space from being consumed too quickly.
GargoyleMT wrote: Also, google for Sparse Files on NTFS disks...
Yeah, my fault that I'm not using NTFS5 but NTFS3, which isn't capable of handling sparse files. What if I had to use FAT32 for some reason? But OK, that's true, we shouldn't support obsolete solutions.
GargoyleMT wrote: That might happen, but I don't see why it would - hub owners are just as interested in users downloading quickly as the users themselves are.
But they do get influenced by users who demand more slots; I've simply seen that many times. And DC++ has a limit on transfer connection attempts made per unit of time, AFAIK.
GargoyleMT wrote: Only the original coder can submit the patch to arne, since he's transferring copyright on it, so that DC++'s source is under only one person's copyright.
True, but the author of this patch encourages resubmission of the patch (see the bottom of the JDC++ webpage). This is an implicit agreement to transfer the copyright to arne when someone else submits the modified patch.
GargoyleMT wrote: If it were challenged in a court of law, would that hold up? I suspect it would not.
You're probably right that if the JDC++ mod's author really wanted to commit some legal abuse, he might win a lawsuit. I read the statement as good will, to put the code he created into the main version of DC++.