Multiple Source Downloading.

Archived discussion about features (predating the use of Bugzilla as a bug and feature tracker)

Moderator: Moderators

Blade^^
Posts: 9
Joined: 2004-03-30 11:24

Post by Blade^^ » 2004-05-05 18:36

I still don't see where you got the flaming or trash talking from, since I didn't do that (at least not as far as I can see, let alone intended to).

What else is a Features forum for than to discuss new and upcoming features of DC++? I only gave my point of view, along with some questions and ideas. I don't see how that could possibly be seen as trash talking the DC++ team, or why the answer should be that if I don't like it I should go code it myself. By that logic you could just as well close the Features forum, since you could say that to anyone who comes up with an idea.

And as I said before, I think DC++ is a great program, and it's by far the best DC client on the net. I'm sure multi-source downloading will be a great benefit to DC++. But you must understand that implementing something like multi-source downloading in such a popular client will make a big impact on everyone using it, and even on people not using DC++. I only stated the questions and some ideas I had about the issue, nothing more...

If I offended you, I am very sorry; I had no intention of doing so.

Anyway, I'm sure the DC++ team will find a good way to implement this feature, and I just wanted to say: keep up the good work...
// Blade

BSOD2600
Forum Moderator
Posts: 503
Joined: 2003-01-27 18:47
Location: USA
Contact:

Post by BSOD2600 » 2004-05-05 19:08

I misread one of your posts in a negative manner... which just so happened to coincide with the mail system at my work going down; I was being flooded by angry people, which in turn made me all flustered. The mistake is mine :oops:

*fonetik*
Posts: 1
Joined: 2004-05-06 16:42

Post by *fonetik* » 2004-05-06 17:05

Just had an idea... what if the slot size was fixed?

For example 5k=1 slot.

So a dial-up user could only upload one file at a time, but it would be at full speed. They could also only download one file at a time, again at full speed. (This would probably restrict them to one hub, maybe?)

For a 512/256 ADSL/cable user, you would get, say, the option of enabling 4 slots (if you needed to upload some other data) or 6 slots, all running at 5k.
You could then multi-source download, for example, 1 file from 12 users at 5k each, giving you a 60k download; or 12 files from 12 other users at the minimum of 5k each; or any combination in between. (E.g. if you had 6 files in your queue, they would download at 10k from 2 users each. Then, when say 3 files were left to download, the client would search for more sources and download at 30k each from 6 users, etc.)

You could only download a maximum of, say, 12 files at a time, stopping 1 user from connecting to 30 other users at 1k and using up their free slots. The downloads would also finish faster, so you could move on to the next download in your queue.

This would stop slow downloads, keep slots open, and allow multi-source downloads, while setting the maximum number of slots depending on your bandwidth, etc.
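The fixed-slot arithmetic above can be sketched in a few lines. This is only an illustration of the proposal, not anything in DC++; the function names, the 5 kB/s slot rate, and the 12-connection cap are taken straight from the post's examples.

```python
# Sketch of the fixed-slot idea: every slot is a fixed 5 kB/s, so the
# slot count follows from upload bandwidth, and a fixed cap bounds how
# many simultaneous source connections one downloader may hold.
# All names and numbers are illustrative, not DC++ API.

SLOT_RATE_KBPS = 5      # fixed bandwidth per slot, per the post
MAX_SOURCES = 12        # cap on simultaneous download connections

def upload_slots(upload_kbps: int) -> int:
    """One 5 kB/s slot per 5 kB/s of upload bandwidth, at least one."""
    return max(1, upload_kbps // SLOT_RATE_KBPS)

def sources_per_file(files_in_queue: int) -> int:
    """Spread the 12-connection budget evenly over the queued files."""
    if files_in_queue == 0:
        return 0
    return max(1, MAX_SOURCES // files_in_queue)

# A dial-up user (~5 kB/s up) gets a single full-speed slot:
assert upload_slots(5) == 1
# With 6 queued files, each file downloads from 2 sources (10k each):
assert sources_per_file(6) == 2
# With 1 file left, all 12 connections go to it: 60k at 5k each.
assert sources_per_file(1) == 12
```

The follow-up post raises the obvious catch: this only works if the client can discover a user's real bandwidth, which DC has no reliable way to do.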

Hope that makes sense?!? :?

joakim_tosteberg
Forum Moderator
Posts: 587
Joined: 2003-05-07 02:38
Location: Sweden, Linkoping

Post by joakim_tosteberg » 2004-05-07 00:14

Problem: how can you know what bandwidth a user has?
Search around a bit on the forum and you'll find the problems.

Blade^^
Posts: 9
Joined: 2004-03-30 11:24

Post by Blade^^ » 2004-05-07 18:40

BSOD2600 wrote:I misread one of your posts in a negative manner... which just so happened to coincide with the mail system at my work going down; I was being flooded by angry people, which in turn made me all flustered. The mistake is mine :oops:
That's ok :) but don't scare me like that again :p
joakim_tosteberg wrote:Problem: how can you know what bandwidth a user has?
Search around a bit on the forum and you'll find the problems.
Well, to start with, download speed should be proportional to upload speed; that would also work against slot blockers and bandwidth limiters. But then again, that's not really an option for an open-source client...
// Blade

vec
Posts: 4
Joined: 2004-02-25 22:23

Post by vec » 2004-05-07 21:12

Overreacting isn't a good way to keep a discussion going. Period.

Cyborg
Posts: 41
Joined: 2004-01-23 18:15

Post by Cyborg » 2004-05-13 23:07

I have an idea of how to make this work:
When downloading from multiple sources, the 'extra' sources should only download a really small part of the file at a time (like 1-5%, maybe based on the uploader's speed?), then wait a few seconds and then connect again. This will make some room for other users to download from this person too.
But one of the sources should always keep downloading.

Remember that 1% of a 700 MB movie is 7 MB, so it's not that small an amount.
During those few seconds of waiting before connecting again, another user might take that slot, but then you might start downloading from a user in another hub (if you are connected to several hubs) that the other downloader is not connected to.

With this feature, the used slots will be more evenly spread over the hubs, so you won't always need to search through several hubs before finding an open slot for the file you were looking for.
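The rotation rule described above (one primary source downloads continuously, extras grab a small percentage and then yield their slot) can be sketched as follows. Everything here is invented for illustration; the 1% figure is the post's own example.

```python
# Sketch of the rotation idea: extras fetch a small, percentage-sized
# piece per connection and then give the slot back; the primary source
# keeps downloading. Hypothetical names, not DC++ code.

def chunk_bytes(file_size: int, percent: float = 1.0) -> int:
    """Size of the small piece an 'extra' source fetches per connection."""
    return int(file_size * percent / 100)

def keep_slot(is_primary: bool, bytes_this_turn: int, chunk: int) -> bool:
    """The primary source always keeps its slot; an extra releases it
    once it has fetched its chunk for this turn."""
    return is_primary or bytes_this_turn < chunk

# 1% of a 700 MB movie really is 7 MB, as the post notes:
assert chunk_bytes(700 * 1024 * 1024, 1.0) == 7 * 1024 * 1024
# An extra that finished its chunk hands the slot back:
assert not keep_slot(False, chunk_bytes(700 * 1024 * 1024), 7 * 1024 * 1024)
```

As the replies point out, the cost of this scheme is latency: every pause-and-reconnect risks losing the slot to someone else.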

ojmyster
Posts: 8
Joined: 2004-04-19 11:52

Post by ojmyster » 2004-05-15 03:00

Hmm, that seems like an incredibly slow way of doing things. Perhaps even problematic.

Cyborg
Posts: 41
Joined: 2004-01-23 18:15

Post by Cyborg » 2004-05-15 09:08

Yes, you're right, but there are some advantages.

ojmyster
Posts: 8
Joined: 2004-04-19 11:52

Post by ojmyster » 2004-05-16 03:04

Such as?

GargoyleMT
DC++ Contributor
Posts: 3212
Joined: 2003-01-07 21:46
Location: .pa.us

Post by GargoyleMT » 2004-05-16 16:01

Cyborg wrote:When downloading from multiple sources, the 'extra' sources should only download a really small part of the file at a time (like 1-5%, maybe based on the uploader's speed?)
Sounds like swarming - downloading really small bits from many, many people. At least, that's how BitTorrent implements it.


As someone pointed out, don't assume that multi-source clients will be well behaved - if you're interested in download speeds, being well behaved comes into conflict with that goal. Users and clients should be assumed to be greedy... it's just the safer assumption.

ojmyster
Posts: 8
Joined: 2004-04-19 11:52

Post by ojmyster » 2004-05-17 04:27

Agreed

Blade^^
Posts: 9
Joined: 2004-03-30 11:24

Post by Blade^^ » 2004-05-17 06:57

I've read a lot of ideas and comments about this from users, but I was wondering if one of the DC++ team could post their ideas about multi-source downloading and how they think it should be implemented. :)
// Blade

PseudonympH
Forum Moderator
Posts: 366
Joined: 2004-03-06 02:46

Post by PseudonympH » 2004-05-17 20:42

I'd say 8 to 16 MB would be a good starting chunk size. Files smaller than that don't need to be multi-sourced, since by the time you get a second slot the download will already be done, unless the person has really bad upload speed.

A lot of the problems with multi-source come from still having to work with old clients, rather than working in a "clean room" environment.

If and when we get partial file sharing working, we (and by "we" I mean people who aren't all talk, like me) might want to switch to downloading random chunks, like BitTorrent, to get complete distributed copies out there quickly. But for now, chunks should be downloaded sequentially, meaning that when you get a slot it requests the next chunk that's not currently being downloaded. Chunks that were only partially downloaded because a user left or disconnected you should be completed before new ones are started, of course.
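The chunk-selection rule just described (finish abandoned partial chunks first, otherwise take the lowest sequential chunk nobody is working on) can be sketched directly. The data structures and function name are invented for illustration.

```python
# Sketch of sequential chunk selection with partial-chunk priority,
# as described in the post above. Hypothetical names, not DC++ code.

def next_chunk(total_chunks, done, partial, in_progress):
    """Pick the chunk a newly granted slot should request, or None."""
    # Chunks a dropped source left half-finished come first.
    for c in sorted(partial):
        if c not in in_progress:
            return c
    # Otherwise: lowest-numbered chunk not done and not being fetched.
    for c in range(total_chunks):
        if c not in done and c not in in_progress and c not in partial:
            return c
    return None  # everything is done or claimed

# A half-finished chunk 3 is completed before a fresh chunk is started:
assert next_chunk(8, done={0, 1}, partial={3}, in_progress={2}) == 3
# With no partials, the next sequential unclaimed chunk is chosen:
assert next_chunk(8, done={0, 1}, partial=set(), in_progress={2}) == 3
```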

I'm not too great at just talking about it; I prefer to complain about how other people say they'd do it instead. :)

Bass(Fish)
Posts: 4
Joined: 2004-05-25 15:59

Post by Bass(Fish) » 2004-05-26 01:23

IMHO multi-sourcing would be beneficial only if it is done in the same manner as BitTorrent. That said, I don't think it will fit well in the DC community, mainly because DC traffic is governed by hubs instead of trackers that have all the info on every available file. Conversely, DC has the benefit of giving the user free searching abilities (thus leading to some very interesting results: rare albums, movies, etc.).

I think the only real problem with DC is that a user can disconnect from the hub and never be seen again... well, it's damn frustrating, especially if you've downloaded 350 MB of a 700 MB file, you know? What I'm thinking is, admittedly, an odd proposition: a multi-hub search. Here's how: the DC client would monitor the number of available sources for your files, as it does now... when the number of sources drops to zero for a certain amount of time, say 1-2 hours (maybe longer), the client contacts other hubs (either from the public hub list or the favorites list) to search for the file(s) that have no sources left. Of course, theoretically and technically this obviously leads to a huge number of abuse opportunities (well, probably; I'm not sure) and most certainly needs a hefty amount of bandwidth. But these problems could be mitigated by limiting the search to 2-3 files at a time, etc.

Hmmm... would this work, at all?

EDIT: Oh, btw, I didn't mean that once sources have been found on other hubs the client would automatically connect to those hubs/sources; I was just thinking it would be helpful if one could easily find sources when current hubs run dry... from there, users would have the option of changing hubs to get the files they want. It would also be a useful addition (well, it'd be useful as it is) to see only those hubs whose share limit you can meet, i.e. if you're sharing 30 GB, you only get hits from hubs with a share limit of up to 30 GB.

GargoyleMT
DC++ Contributor
Posts: 3212
Joined: 2003-01-07 21:46
Location: .pa.us

Post by GargoyleMT » 2004-05-26 11:27

Bass(Fish) wrote:IMHO multi-sourcing would be beneficial only if it is done in the same manner as BitTorrent. That said, I don't think it will fit well in the DC community, mainly because DC traffic is governed by hubs instead of trackers that have all the info on every available file. Conversely, DC has the benefit of giving the user free searching abilities (thus leading to some very interesting results: rare albums, movies, etc.).
This is why, if we implement partial file sharing, source exchange will only expose sources seen on the hub the user is on. This avoids leaking data about private hubs, and avoids cluttering a client with sources it may never see.

DC hubs control traffic only in the loosest sense - they're common meeting rooms for clients, and they "limit" traffic only because not all users are in the same room...
Hmmm... would this work, at all?
It could be programmed, but it's contrary to DC's decentralized layout. Clients shouldn't wander out onto other hubs to search - leave "global" searching to specialized external tools such as MoGLO.

ojmyster
Posts: 8
Joined: 2004-04-19 11:52

Post by ojmyster » 2004-05-26 18:45

It would be good to see a member of the DC++ team sum up this thread. By this I mean it would be good to see whether DC++ intends to take any action towards multi-source downloading, whether or not it will be implemented in the future.
Please give us a word, DC++ team.

Bass(Fish)
Posts: 4
Joined: 2004-05-25 15:59

Post by Bass(Fish) » 2004-05-27 02:04

GargoyleMT wrote: It could be programmed, but is contrary to DC's decentralized layout. Clients shouldn't wander out onto other hubs to search - leave "global" searching to specialized, external, tools such as MoGLO.
But clients do wander out onto other hubs; I've done it a number of times myself, and I've seen plenty of people migrate to other hubs, whether because they've attained a new share limit and want to move on to bigger venues, or because they realize they can't find or finish downloading the files they're after, such as certain movies, albums, etc. There is no need to change hubs if you're downloading contemporary hit albums or whatnot, but if you want to get Koyaanisqatsi's 1992 remake by Clint Eastwood, it's another story...

And since there is no search function available for getting what you want (and note: I'm not an impatient man, but not forever patient either), you have to disconnect from one hub, connect to another, search for the file, and repeat this process until you find it. The real problem is that this kind of behaviour leads to exactly what I'm talking about: hub-migration.

Person A enters a hub to look for something. Person B starts downloading something from person A. Person A doesn't find what he/she is after and leaves the hub (maybe disconnecting B straight away, but in any case effectively cutting off B's download at some later time). B realizes that A left and is probably not coming back (you know, you spend 8-10 hours a day on one hub, and if they're not back in three days...), so B leaves the hub, connects to a new one, and searches. Now person C starts downloading something from B, and the cycle continues from here on.

Well, maybe I'm just frustrated because I look up strange shit; the majority of users probably wouldn't need this function. And besides, there is the chat option, so people probably should just ask if they can resume the downloading later on...

GargoyleMT
DC++ Contributor
Posts: 3212
Joined: 2003-01-07 21:46
Location: .pa.us

Post by GargoyleMT » 2004-05-27 10:07

Bass(Fish) wrote:But clients do wander out onto other hubs; I've done it a number of times myself, and I've seen plenty of people migrate to other hubs, whether because they've attained a new share limit and want to move on to bigger venues, or because they realize they can't find or finish downloading the files they're after, such as certain movies, albums, etc.
Well, sure. I used "client" to mean client software, not "user." I think users should go out to hubs and find the ones that fit them best at a given time. However, there's already an application (a pseudo-abusive one) to go out and do a global search: MoGLO.

Adding MoGLO like search features to DC++ is not desirable, in my opinion.

If there's a better way to track (or 'stalk' - if the user left the hub to avoid you) users across hubs, I'm all for that.
Bass(Fish) wrote:There is no need to change hubs if you're downloading contemporary hit albums or whatnot, but if you want to get Koyaanisqatsi's 1992 Remake by Clint Eastwood, it's another story...
Koyaanisqatsi remake? By Clint Eastwood? The original was very nice, I even got the Philip Glass Soundtrack back when cdnow was still cdnow. I haven't seen the two other movies in the series yet, though they're on my to-watch list.
Bass(Fish) wrote:The real problem is that this kind of behaviour leads to exactly what I'm talking about: hub-migration.

Well, maybe I'm just frustrated because I look up strange shit; the majority of users probably wouldn't need this function.
Well, those are valid concerns, and I'm not sure there are good technical solutions to them. In particular, any solution has to make it easy for a private hub to stay private - those are part of what makes DC a good network.

Bass(Fish)
Posts: 4
Joined: 2004-05-25 15:59

Post by Bass(Fish) » 2004-05-28 01:33

Hehe, that Koyaanisqatsi thingy was just thrown in as a decent example concerning rare stuff, and how hard it really is to get the REALLY rare stuff... Oh well. While I do understand that "stalking" users is not very nice at all, I still think people should be able to find the files they're after more easily, because the majority (OK, this is just a theory, which I base on basic human behaviour) of DC users don't really give a shit about the communal aspect of file sharing, especially when it comes to rare movies or albums, because then they wouldn't be rare anymore, and certain types of people don't like their fave stuff being "out there"... which annoys the hell out of me.

*taking a breather*

And I don't really know what the problem is concerning private hubs anyway. Private hubs stay private if they don't advertise their address, I believe? Also, implementing an omni-hub search function would only mean a hub-side option controlling whether the hub shows up in these kinds of searches. Besides, people would find "the right kind of hubs" more easily this way, and the community would be saved!! ...ahem...


Hmm... maybe I am impatient... LOL sorry about the rant, but sometimes I can't let go :D

GargoyleMT
DC++ Contributor
Posts: 3212
Joined: 2003-01-07 21:46
Location: .pa.us

Post by GargoyleMT » 2004-05-28 09:15

Bass(Fish) wrote:I still think people should be able to find the files they're after more easily, because the majority (OK, this is just a theory, which I base on basic human behaviour) of DC users don't really give a shit about the communal aspect of file sharing, especially when it comes to rare movies or albums, because then they wouldn't be rare anymore, and certain types of people don't like their fave stuff being "out there"... which annoys the hell out of me.
Direct Connect isn't the same as all the other networks - Gnutella, eDonkey/Overnet, and FastTrack all have global searches. DC isn't designed for that (and in particular, clients with large shares can't stand up to the amount of traffic global searching would generate - ask the forum user fusbar). OpenNap is similar: searches are per host or per host cluster. There's no global search covering all the OpenNap servers, even though there are lists of all the networks and hosts.
Bass(Fish) wrote:And I don't really know what the problem is concerning private hubs anyway. Private hubs stay private if they don't advertise their address, I believe?
The logic goes this way: files will be more available if users can share their downloads in progress. Everyone who's downloading a certain file can then implement a feature that tells other downloaders that they have a partial (partial file sharing, PFS) and tells them about their known sources. This source exchange must be sure not to leak download locations on other hubs (because you don't know which are private).
Bass(Fish) wrote:Hmm... maybe I am impatient... LOL sorry about the rant, but sometimes I can't let go :D
I don't know if that's something you should be sorry about. This is... a conversation of sorts, is it not?

Bass(Fish)
Posts: 4
Joined: 2004-05-25 15:59

Post by Bass(Fish) » 2004-05-29 09:33

Yep, I guess you're right about this after all... I would use MoGLO if it weren't for the fact that using it can effectively get the IP I'm using banned on many hubs, which is doubly unfair because I'm on a dynamic IP... It's not good etiquette to go around MoGLOin' and banning people by association... Maybe I just have to find some hubs-dé-extraordinaire with good people...

GargoyleMT
DC++ Contributor
Posts: 3212
Joined: 2003-01-07 21:46
Location: .pa.us

Post by GargoyleMT » 2004-05-29 11:38

Bass(Fish) wrote:I would use MoGLO if it weren't for the fact that using it can effectively get the IP I'm using banned on many hubs, which is doubly unfair because I'm on a dynamic IP...
A better MoGLO would be welcome, with an accepted way for hub owners to opt into global searches (not opt out, as they can do currently). Having this in a separate tool is pretty important to me.

PseudonympH
Forum Moderator
Posts: 366
Joined: 2004-03-06 02:46

Post by PseudonympH » 2004-06-08 10:11

I've been putting off typing this up for weeks now, and as I have nothing else to do in programming class I guess I'll give it a shot.

Basically, I believe that with a good multi-sourcing algorithm, it's possible to get most of the benefits of rotating upload queues along with the download speeds multi-source promises. The goals would be to:
  • give rare files more priority when it comes to keeping a slot with an uploader
  • allow files to be fetched from multiple sources, but not from so many that it causes problems
  • fetch a larger percentage of the file from faster uploaders
  • rotate through as many sources as possible for popular files, so those getting rare ones have more of a chance at a slot
The behavior is controlled by several variables: timeSlice, minSources, and maxSources. timeSlice is how long the client waits before reevaluating whether to keep a slot or give it up so another user gets a chance to download; 30 minutes sounds about right. minSources is the threshold below which the client keeps its slots instead of giving them up when its time slice is done; it should probably be 1 or 2. maxSources is where the client stops looking for open slots, and should probably be 4 to 10.

When a downloader gets a slot, a counter is started and the client requests the next 4 MiB chunk that is not already being downloaded (4 MiB is just a guess at what it should be...). When the chunk is done, the client checks whether it has used half of its time slice yet. If it has, it simply requests the next chunk; if not, it doubles the chunk size and then requests the next chunk. If it has used up its whole time slice, it checks how many other active sources there are; if that count is greater than or equal to minSources, it disconnects.

The reasoning behind doubling the chunk size instead of keeping it constant is that transfers usually start out slow and then ramp up to full speed. Doubling the chunk size lets the transfer stay at full speed for longer stretches (though this doesn't really matter unless the uploader is on a T3).
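The slot policy just described can be written out as a small decision function. The constants follow the post (30-minute time slice, 4 MiB starting chunk, minSources of 2); the function name and return convention are invented for illustration.

```python
# Sketch of the timeSlice/minSources slot policy described above:
# double the chunk size while under half the time slice, keep it
# constant afterwards, and after a full slice yield the slot if enough
# other active sources remain. Hypothetical names, not DC++ code.

MIB = 1024 * 1024
TIME_SLICE = 30 * 60        # seconds, per the post
START_CHUNK = 4 * MIB
MIN_SOURCES = 2             # keep the slot below this many other sources

def after_chunk(elapsed, chunk_size, other_active_sources):
    """Decide what to do when a chunk finishes.

    Returns ('disconnect', None) or ('request', next_chunk_size)."""
    if elapsed >= TIME_SLICE and other_active_sources >= MIN_SOURCES:
        return ('disconnect', None)
    if elapsed < TIME_SLICE / 2:
        # Early in the slice: ramp up so the transfer spends more
        # time at full speed, as the post argues.
        return ('request', chunk_size * 2)
    return ('request', chunk_size)

# Ramp-up early in the slice:
assert after_chunk(5 * 60, START_CHUNK, 3) == ('request', 8 * MIB)
# Past half the slice: keep the current chunk size.
assert after_chunk(20 * 60, 8 * MIB, 3) == ('request', 8 * MIB)
# Slice used up, but we're one of only two sources: keep the slot.
assert after_chunk(31 * 60, 8 * MIB, 1) == ('request', 8 * MIB)
# Slice used up and plenty of other sources: yield the slot.
assert after_chunk(31 * 60, 8 * MIB, 4) == ('disconnect', None)
```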

The only thing I'm unsure about is how to count active sources. If you count per individual file, then people downloading albums or series will wind up taking many more slots than people who are just getting one movie.

I plan on posting a bit more on the active-sources part, but I'm out of time. I think I got the basic idea down well enough. :)

misterhopman
Posts: 3
Joined: 2003-05-02 14:23

Post by misterhopman » 2004-06-18 02:33

Hm.

It seems the argument about whether multi-source downloading is a good feature has somewhat died down, but my justification for it is a good introduction to my other ideas. I will immediately note that I know little to nothing about coding; I'm a math guy, not a programming guy. However, that means I do know math, which is the basis of my ideas. Anyway, the argument for multi-source downloading, in my view, is not about better speeds; it's about better efficiency. The goal should be to use the available resources to transfer as much as possible. As I see it, the only limiting factor here is bandwidth; I doubt any implementation will stress the available processing power, storage space, or anything but the available bandwidth.

The first, and most obvious, way to improve efficiency is to use as much of the available bandwidth as possible. To facilitate that, it would be very useful to know how much bandwidth is available; the more accurate this value, the better. I can see a couple of ways to do this, each with its own benefits and drawbacks. Up to now, I believe there has been no sufficient reason to go through the trouble of implementing a method of finding these values, and in fact there may never be, but it would allow for more efficiency. One way is a speed test, which would in effect waste some small amount of available bandwidth (bad), but would be easy to implement and fairly accurate (good). A way that I assume is harder to implement would be to make the program "learn" these values: you could have it simply remember the fastest speed at which uploads and downloads have ever gone. This would work as long as those values aren't used as actual caps.
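The "learning" approach suggested above amounts to tracking the fastest rate ever observed and treating it as a capacity estimate (never as a cap). A minimal sketch, with invented names:

```python
# Sketch of passive bandwidth "learning": remember the fastest
# transfer rate ever seen as an estimate of link capacity, to be used
# only for scheduling decisions, never as an enforced limit.
# Hypothetical class, not part of DC++.

class BandwidthEstimate:
    def __init__(self):
        self.max_up_kbps = 0.0
        self.max_down_kbps = 0.0

    def observe_upload(self, kbps: float) -> None:
        self.max_up_kbps = max(self.max_up_kbps, kbps)

    def observe_download(self, kbps: float) -> None:
        self.max_down_kbps = max(self.max_down_kbps, kbps)

est = BandwidthEstimate()
for rate in (12.0, 45.5, 38.0):
    est.observe_download(rate)
assert est.max_down_kbps == 45.5  # the fastest rate seen so far
```

The obvious weakness, which the post acknowledges, is that the estimate only converges if the link is ever actually saturated.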

Anyway, multi-source downloading helps very much in using as much of the available bandwidth as possible; this is, presumably, clear: when there is available bandwidth from an additional source that normally wouldn't be used, with multi-source downloading it is used. Knowing how much upload/download bandwidth is available could also help: once you have reached your maximum download speed there is no reason to connect to more sources, and by comparing the available upload speeds of the different sources you could determine the most effective way to use that capacity, i.e. download as fast as possible from those with the most upload bandwidth available, and even disconnect from those whose upload bandwidth is fully used.

As described before, "hot swapping" could be done, though it would work somewhat differently. Basically, when you are downloading something and there is a maximum number of clients you can download from, knowing how much bandwidth each person has available would let you connect to those with the most available bandwidth. In fact, the same method could be used as an uploader: given a maximum number of upload slots, you would upload to the set of others who together would use as much of your bandwidth as possible.

Once you are using as much bandwidth as you possibly can with the current implementation, the next thing to do is use the available bandwidth more efficiently. One good way is to make the bandwidth allotted to any given file depend on the supply and demand for that file, i.e. the upload bandwidth of those with the file and the download bandwidth of those who want it. This is much easier when all "files" are the same size, i.e. if partial downloads are available as in BitTorrent; in fact it would be best if file segments did not need to be downloaded sequentially (and, to an extent, the smaller the segment size, the better).

A more complicated method of increasing overall efficiency would both use more of the available bandwidth and use it more efficiently. It would only work in certain cases; specifically, it would not help when only one client is attempting to download a certain file. It would go something like this: when the demand for a "file" (actually just a file segment) is greater than the supply, a client uploading that file would find an uninvolved client with available download bandwidth and more available upload bandwidth than the uploader has, send the segment to them, and let them put their unused upload bandwidth to work. This may seem somewhat intrusive, in that it uses an uninvolved party's bandwidth, but I don't see how it differs from how a client's unused upload bandwidth is used already, except that here you also use some of their unused download bandwidth.

This last method is currently more likely to be used in a program such as BitTorrent, where you will often have more than one person downloading a given file from each person at a time. But that is simply because it is complicated-sounding and, even I can tell, more difficult to implement than something as relatively simple as multi-source downloading. Once you do implement multi-source downloading, though, if you are to continue improving how efficiently the available resources are used, the road leads to methods like those I described above. In fact, some BitTorrent clients, a relatively new breed of p2p software, are already implementing similar methods of optimizing bandwidth usage.

As DC++ matures, these methods could become more suitable for it. The simple addition of multi-source downloading will, as time goes by, become more and more useful. As things are now, there is little to no reason to download "SomeFile(1)" rather than "SomeFile(2)" just because two people have the first and only one has the second. Multi-source downloading will slowly make itself more useful: in the case above, you would be more likely to download the copy that two people have, and then three people would have that version, and it would spread even faster. Once more and more people have identical copies of more and more things, the other methods I described become more and more realistically useful.

Finally, if you could sit back in your chair a moment and tough out the rest of this much-longer-than-expected post, I will describe some of what a "smart" system could accomplish. Once you find that rare, rare file you have been looking for, you would not have to wait while the person with it had all their slots wasted on something that could be gotten from 50 other people; anyone downloading from them would simply be disconnected (and could download what they want from others), and you in turn would be given either all that person's available bandwidth, or as much of it as you could handle. In this situation, there is no loser.

Now, consider two people having a 2000 kB file that 4 other people want, where everyone has 20 kB/s up and 40 kB/s down (if these values differ between people, the "smart" system is even more impressive). In the current system, the best case (each person uploading to one person at a time) would take 200 seconds for everyone to get the file, an aggregate average of 40 kB/s. With multi-source downloading (without partials), the best case (each uploader serving one person as fast as possible each time) would be that everyone has the file in 150 seconds, 53.3 kB/s. With multi-source downloading and 500 kB partial files (without smart bandwidth distribution), the best case would be 116 seconds, 68.6 kB/s. With multi-source, 500 kB partials, and smart distribution, the best case is 87.5 seconds, 91.4 kB/s. Note that that is more than twice as fast as the current implementation, yet still not nearly as fast as possible: in a perfect system it would take 66.7 seconds and the average transfer rate would be 120 kB/s. And just as a mention: if there were another client who was not interested in the file but had 20 kB/s down and 60 kB/s up available, a perfect system would take only 50 seconds, with 160 kB/s flowing to the interested parties.
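The "perfect system" figures at the end of this post can be checked against a simple capacity bound: with full partial-file sharing, aggregate throughput is limited by the smaller of everyone's total upload capacity and the downloaders' total download capacity. This is only the theoretical bound the post appeals to, sketched with invented names; it is not a DC++ algorithm.

```python
# Capacity-bound check of the post's "perfect system" numbers:
# 4 downloaders each need a 2000 kB file, so 8000 kB must be delivered.
# Throughput cannot exceed min(total upload, downloaders' total download).

def perfect_time(file_kb, downloaders, upload_rates_kbps, download_cap_kbps):
    """Lower bound on the time (seconds) for all downloaders to finish."""
    total_kb = file_kb * downloaders
    throughput = min(sum(upload_rates_kbps), downloaders * download_cap_kbps)
    return total_kb / throughput

# 2 seeders + 4 downloaders, all 20 kB/s up / 40 kB/s down:
# the 6 x 20 = 120 kB/s total upload is the binding limit.
t = perfect_time(2000, 4, [20] * 6, 40)
assert round(t, 1) == 66.7          # matches the post's 66.7 s, 120 kB/s

# Add the uninvolved helper with 60 kB/s up: now the downloaders'
# 4 x 40 = 160 kB/s receive capacity binds instead.
t = perfect_time(2000, 4, [20] * 6 + [60], 40)
assert t == 50.0                    # matches the post's 50 s, 160 kB/s
```

Both of the post's perfect-case figures fall out of the same bound; only the binding constraint changes when the helper is added.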

PseudonympH
Forum Moderator
Posts: 366
Joined: 2004-03-06 02:46

Post by PseudonympH » 2004-06-18 08:13

Wow. That was impressive. And I thought I wrote a lot. :)

Anyway, note that partial file sharing is a planned feature. Multi-source is also fairly useless for files this small; who cares if it takes a minute longer to get something? Multi-source only really becomes useful when files are larger than 100 MB, or when all the sources have very low upload speed, so that with multi-source the transfer reaches a reasonable speed (like 10-20 kB/s).

Also, the situation you describe could pretty much never happen except on a small hub, and even then only on one where people hang out to chat instead of just to download. On large public hubs it's pretty much impossible to find people with open slots, especially people with no slots taken.

While this system does seem like it would work great in a clean-room environment, in the real DC world things never work out that simply, and anyway, people's upload is almost always maxed out. Even if only one slot is taken, they are probably uploading at max, and more slots being taken only spreads the speed out further. And even if they aren't maxed out, that's what the "open a new slot if speed is below" feature is for. While it may work great in a perfect world, I do think we should Keep It Simple, Stupid, to an extent. Uploading a file to someone who doesn't even want it is a definite no-no.

The idea behind multi-source is not to give higher download speeds to those who have 100+ files in their queue; it's to let people who want one file, and only one file, get speeds as good as those getting a large set. For this reason, I think multi-source should only kick in for people who are currently downloading a small number of files. This means that the way to count "active" sources (see my previous post) is to count all active downloads, to ensure that speeds are consistent whether someone is getting one file or 300.

misterhopman
Posts: 3
Joined: 2003-05-02 14:23

Post by misterhopman » 2004-06-18 13:18

Yeah, I tend to go on and on and on... I'll try to follow the form of your post here; just imagine that your paragraphs are quoted before mine, as I'm not a big fan of them quote tags.

But anyway, the way that I see it, multi-source downloading for files that small makes little sense to the one downloading it, true. But even for files that small, multi-source downloading does help out the entire system: it uses bandwidth that would not normally be used. Of course, in the example I used, the numbers were simply meant to make it easier on me, and I still got lost in the middle (I blame it on the fact that it was 3am). And true, the situation that I described would not likely happen in the real world.

In fact, currently a situation that would benefit from some of the control that I had described is far from likely. However, with the introduction of multi-source downloading this may slowly change. Clearly, multi-source downloading is a simple optimization for when you want a file that multiple others have, and in fact it works very well in that situation. What multi-source downloading does not help with is the situation where you have a file that multiple others want. This would be where other optimizations come into play, the simplest of which would be partial file sharing (Note that partial file sharing does not help in the case that only one person wants the file).

Consider the "super-seed mode" that some bittorrent clients use (info here: http://home.elp.rr.com/tur/superseed.txt). There you can see that some of these optimizations do show results in the real world. Admittedly, bittorrent is a much more specialized system and more suited to certain optimizations; even so, that "super-seed mode" is a rather hacked-together solution, as they had to stay within the confines of that standard. As for people usually having their upload speed maxed, this may change as download optimizations, i.e. multi-source downloading, are implemented.

Now, a point of yours that I can't accept without justification is that "Uploading a file to someone who doesn't even want it is a definite no-no". I believe we are looking at this from different angles: you see the system as a way to primarily share files, I see it as a way to primarily share bandwidth. The way I see it, whenever I'm looking for a certain file, I have no problem finding someone who is willing to share that file; there is an abundance of such people. What I'm looking for is someone who is willing to lend me some of their bandwidth to get that file. Considered this way, there is really no difference between sharing your unused upload bandwidth and sharing your download bandwidth. Consider, of course, that you would not be sending the entire file to these third parties, merely a few segments at most.

Here again, I think that we are looking at things from a different angle. You are seeing it from the view of the individual, how each particular person is helped by multi-source downloading. I look at it with the goal of improving the efficiency of the entire system rather than just improving the download speed of an individual. In this situation the goal, first and foremost, is to use as much of the available bandwidth as possible, regardless of how it is being used. Once that is done, you want to start using bandwidth smarter: don't waste upload bandwidth on common files (a file with available bandwidth elsewhere) when it can be used for rare files; upload to clients that will be able to spread the file faster, i.e. if a file is in high demand, upload first to those who will themselves be able to upload; etc. The point is, with this view, it does in fact help for even that guy downloading a thousand files to use multi-source downloading. Though, still, it is generally most helpful for each client to be uploading and downloading one file at maximum speed at a time, as this allows the file to be rebroadcast more quickly.

Now, when your goal is to improve the efficiency of the system, this in the end does help the end user. A more efficient system will transfer more data in the same amount of time, so in the end either more data is shared or the same data is shared faster.

PseudonympH
Forum Moderator
Posts: 366
Joined: 2004-03-06 02:46

Post by PseudonympH » 2004-06-18 14:01

misterhopman wrote:Yeah, I tend to go on and on and on... I'll try to follow the form of your post here, just imagine that your paragraphs are quoted before mine as I'm not a big fan of them quote tags.
Well, I just hope you don't mind that I do. :)
But anyway, the way that I see it, multi-source downloading for files that small makes little sense to the one downloading it, true. But even for files that small, multi-source downloading does help out the entire system: it uses bandwidth that would not normally be used. Of course, in the example I used, the numbers were simply meant to make it easier on me, and I still got lost in the middle (I blame it on the fact that it was 3am). And true, the situation that I described would not likely happen in the real world.
As I already stated, the case of bandwidth not being used is a very rare one. The average download speed is probably like six times the average upload, so it's very difficult for transfer speeds to be limited by the downloader. And no, I don't blame you at all for the numbers, that was just a silly nitpick. I just wanted to point out that it is only worth the overhead for large files.
In fact, currently a situation that would benefit from some of the control that I had described is far from likely. However, with the introduction of multi-source downloading this may slowly change. Clearly, multi-source downloading is a simple optimization for when you want a file that multiple others have, and in fact it works very well in that situation. What multi-source downloading does not help with is the situation where you have a file that multiple others want. This would be where other optimizations come into play, the simplest of which would be partial file sharing (Note that partial file sharing does not help in the case that only one person wants the file).
Note that "simple" is a very relative term. There comes a point where the performance gains become logarithmic in the amount of effort put into the system, and that is the point to stop complicating things. As a side note, between hashing and PFS, we no longer have a need for RAR-sets, which I think is a good thing.
Consider the "super-seed mode" that some bittorrent clients use (info here). There you can see that some of these optimizations do show results in the real world. Admittedly, bittorrent is a much more specialized system and more suited to certain optimizations; even so, that "super-seed mode" is a rather hacked-together solution, as they had to stay within the confines of that standard. As for people usually having their upload speed maxed, this may change as download optimizations, i.e. multi-source downloading, are implemented.
I am quite familiar with super-seed mode, and I think it is a good idea that should be implemented once we get PFS up and running. However, I disagree about upload speed no longer being maxed: we're still going to have the same amount uploaded and downloaded, no matter how it's distributed. People might even wind up downloading more once they find out that they can get good speeds when only getting one thing.
Now, a point of yours that I can't accept without justification is that "Uploading a file to someone who doesn't even want it is a definite no-no". I believe we are looking at this from different angles: you see the system as a way to primarily share files, I see it as a way to primarily share bandwidth. The way I see it, whenever I'm looking for a certain file, I have no problem finding someone who is willing to share that file; there is an abundance of such people. What I'm looking for is someone who is willing to lend me some of their bandwidth to get that file. Considered this way, there is really no difference between sharing your unused upload bandwidth and sharing your download bandwidth. Consider, of course, that you would not be sending the entire file to these third parties, merely a few segments at most.
Yes, and here I think we will always fail to see eye-to-eye. I also think that generating more traffic, which is what your plan calls for, is something that should be avoided. I'm curious what kinds of hubs you hang out on, as I pretty much never see this unused upload bandwidth you keep talking about.
Here again, I think that we are looking at things from a different angle. You are seeing it from the view of the individual, how each particular person is helped by multi-source downloading. I look at it with the goal of improving the efficiency of the entire system rather than just improving the download speed of an individual. In this situation the goal, first and foremost, is to use as much of the available bandwidth as possible, regardless of how it is being used. Once that is done, you want to start using bandwidth smarter: don't waste upload bandwidth on common files (a file with available bandwidth elsewhere) when it can be used for rare files; upload to clients that will be able to spread the file faster, i.e. if a file is in high demand, upload first to those who will themselves be able to upload; etc. The point is, with this view, it does in fact help for even that guy downloading a thousand files to use multi-source downloading. Though, still, it is generally most helpful for each client to be uploading and downloading one file at maximum speed at a time, as this allows the file to be rebroadcast more quickly.
I agree that rare files should have precedence over common ones, and that was something I thought about when thinking how to do a good upload queue. However, my idea was to keep it lightweight and conservative, so that it wouldn't require a complete re-engineering of the Direct Connect system. Swapping slots a lot when getting common files, as I described above, gives people getting rare files a better chance to get a slot so they can hold onto it.
Now, when your goal is to improve the efficiency of the system, this in the end does help the end user. A more efficient system will transfer more data in the same amount of time, so in the end either more data is shared or the same data is shared faster.
My idea was to try to keep true to the current DC system and not break backwards compatibility. What you are describing is something that feels fundamentally different from the DC world, and therefore it's a path I don't think we should venture down.

I've been half asleep the whole day, so don't kill me if this is incoherent.

misterhopman
Posts: 3
Joined: 2003-05-02 14:23

Post by misterhopman » 2004-06-18 16:31

Ah. Please, use quotes to your heart's content; it makes it that much easier for me to follow. I just don't like to take the time to do it myself, as I already spend enough time typing this stuff.

Well, I believe that we have come to an agreement (I believe we actually started with this agreement) on the implementation of both multi-source downloading and partial file sharing. And, in my opinion those are the two best things to implement, in that respective order (though I assume they would be done together, anyway, as the second would be kinda odd without the first, and the first benefits greatly from the second...). And, it seems we agree on the precedence of rare files over common files, at least to some extent.

Anyway, as far as what hubs I am on with this abundance of bandwidth: well, I'm not. I'm on hubs that need upload bandwidth. In fact, I believe every hub that I'm on would benefit more from another user with 768kB/s up who is sharing nothing than from another user with 256kB/s up who is sharing 100GB, if there were a way for that first person's upload bandwidth to be used. The simple fact is, there is more than enough shared that another person sharing 100GB isn't gonna make a difference; what is needed is more upload bandwidth. But right now there is no reason to allow people into a hub based on their upload bandwidth alone; in fact, the most effective thing right now is to base it upon how much actual data they are sharing, though at a certain point that becomes moot.

As far as what I'm suggesting being somewhat, ah, proprietary... it is true that this is how things evolve, isn't it? I mean, take TTH for example: I don't really know how its use came about, but it is limited to only certain clients, isn't it? And if it is used for multi-source downloading, that too becomes proprietary. As for whether what I'm describing is fundamentally different, I cannot say; I don't know what clients can be made to communicate and what they can't. All that it would truly require is that clients be able to communicate their upload and download bandwidths, and they may not even be able to do that...

ATV
Posts: 6
Joined: 2004-06-20 18:51
Location: Brazil

Post by ATV » 2004-06-20 19:25

Would multi-source downloading support be good for DC++ at all? Well, in my opinion, the way this amazing P2P software currently stands is just perfect! No need for multi-source downloading support, nor for resuming support.

At least on my dial-up connection, a single source for each download already brings astonishing transfer speeds! So I think DC++ doesn't really need this kind of feature. :)

GargoyleMT
DC++ Contributor
Posts: 3212
Joined: 2003-01-07 21:46
Location: .pa.us

Post by GargoyleMT » 2004-06-21 19:15

ATV wrote:Would multi-source downloading support be good for DC++ at all?
This has been discussed many a time before. You won't know until you get a good (i.e. non-corrupting) implementation and try it on a limited basis. (There are all kinds of flavors of justifying both possible answers if you search the forum.)

ATV
Posts: 6
Joined: 2004-06-20 18:51
Location: Brazil

Post by ATV » 2004-06-24 22:11

Ok, I'll go over this subject on the forum then :D

Don
Posts: 1
Joined: 2004-06-30 12:09
Contact:

multisource downloading feature

Post by Don » 2004-06-30 13:11

Hi guys, my 2 cents about the multisource feature in DC++. I think this feature should be available, and I vote for it, because if implemented properly it can help a lot to lower traffic, especially when you can control slots/connections on a hub/client. Personally I like a lot how it is implemented in recent versions of eMule. I like that you can see there:
1) what file parts a certain user has available
2) when the file was last seen complete (on the current client/hub, for example)
3) corrupt chunk checks
4) priority for which part to download first/last
5) comments/ratings on files
6) your number in the queue for each user
I miss these features a lot in DC++, especially when you have a slow connection and want to share something big with others: without multisource/partial downloading, it can take much longer before everyone gets your files if they all start to download only from you simultaneously.
In general I like the idea of DC; for certain purposes it is better than eMule (and other) networks. Thanks a lot for it!!!

GargoyleMT
DC++ Contributor
Posts: 3212
Joined: 2003-01-07 21:46
Location: .pa.us

Re: multisource downloading feature

Post by GargoyleMT » 2004-06-30 22:19

Don wrote:1) what file parts certain user has available,
Partial File Sharing (PFS) is a separate feature, but is sometimes brought up in conjunction with an upload queue. If you read past discussions, you can see that at least the upload queue won't go into DC++. Karma may not either, since without an upload queue, it's rather useless.

Decent points, and it doesn't hurt to keep the topic semi-fresh (or rehash old points).

PseudonympH
Forum Moderator
Posts: 366
Joined: 2004-03-06 02:46

Re: multisource downloading feature

Post by PseudonympH » 2004-06-30 22:20

Don wrote:Hi guys, my 2 cents about multisource feature in DC++, I think that this features must be available, I vote for it, because if implemented properly - it can help a lot to make a traffic lower, especially when you can control slots/connections on a hub/client. Personally I like a lot how it is implemented in last versions of eMule. I like that you can see there:
1) what file parts certain user has available,
Do you mean the client software or the actual person? Because I don't really see the point in the user being able to see that some guy has chunk #213 but not chunk #125.
2) check when file was last time seen complete (on current client/hub, for example)
Well, it's my thinking at least that you definitely shouldn't be able to queue except from a complete source, which should help prevent orphaned downloads.
3) corrupt chunk checks
Already in DC++ 0.403
4) priority which part to download first/last
I don't see any point in doing this in a way other than prioritizing rare chunks...
5) put comments/ratings on files,
That's what Bitzi is for.
6) check your number in a queue to each user
DC clients do not have upload queues, and seeing as it's a rejected feature I don't think we'll be seeing this anytime soon.
I miss these features a lot in DC++, especially when you have slow connection and want to share something big with others - it can take much longer without multisource/parts downloading before everyone will get your files if all will start to download only from you simultaneously.
In general I like idea of DC, for certain purposes it is better then eMule (other) networks, thanks a lot for it!!!
Nothing to complain about here. :)

odstom2
Posts: 1
Joined: 2004-07-02 17:29

Multisourse Will give faster download for everyone

Post by odstom2 » 2004-07-02 17:36

Shroom wrote:I have to agree with St0ry here. If multisource downloading is implemented, it will have a very negative impact on the current slot system. But I'm sure you guys have thought of this.

Maybe dc++ could figure out your maximum download speed (excluding the realtime compression that is), and only add new sources to the multisource download if your current total download speed is less than say 90 % of your maximum download speed. Multisource downloading would also have to be limited to say 3 sources at once. Anything more would be a huge waste of slots. (in my opinion of course).

:D
Well, it seems like everyone who is against multisource here forgets one thing. They claim there will be too many user slots in use, so it will be a huge waste of slots and of users' waiting time. But they all seem to forget that everyone will get their files 100 times faster if they get 100 slots, and then this file will have 101 sources for the next person who wants to download it.

This again will bring downloading time down for everyone, not the other way around, as they try to indicate.

This is logical to me...

Or explain to me why I'm wrong...

GargoyleMT
DC++ Contributor
Posts: 3212
Joined: 2003-01-07 21:46
Location: .pa.us

Re: Multisourse Will give faster download for everyone

Post by GargoyleMT » 2004-07-02 18:53

odstom2 wrote:This is logical to me...
Or explain to me why I'm wrong...
The counter argument, that users will open more slots and the speed per slot will be lower, is also logical. We won't know whether multisource will be good for DC until we experiment with a good, non-corrupting client (which can only happen now that 0.402/3 is out).

Arguments pro/con here are irrelevant, because they're based upon users' opinions, not experimental data.

synOs
Posts: 7
Joined: 2004-02-14 18:04

Re: Multisourse Will give faster download for everyone

Post by synOs » 2004-08-18 16:24

GargoyleMT wrote:The counter argument, that users will open more slots and the speed per slot will be lower, is also logical. We won't know whether multisource will be good for DC until we experiment with a good, non-corrupting client (which can only happen now that 0.402/3 is out).

Arguments pro/con here are irrelevant, because they're based upon users' opinions, not experimental data.
It would be impossible, however, to deny that download speed, in aggregate, would increase, considering that bandwidth unused in the current single-source implementation of DC++ would now be used. On a per-download basis, if all other variables, such as the number of downloads a user typically makes per day, stay constant, the download speed will invariably increase. Again, with all other variables held constant, users will be able to download faster if multi-source downloading is implemented. Whether or not a user increases his/her number of slots is irrelevant to the bandwidth that user is supplying to the hub. If a user typically supplies 100KiB/s with 5 slots open, and hence an average of 20KiB/s per slot, it will not matter if the average drops to 5KiB/s with 20 slots, considering that the total bandwidth will still be utilized. Where multi-source downloading adds to the current bandwidth aggregate is in its utilization of other users' bandwidth not yet tapped by the current single-source implementation.
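synOs's slot arithmetic is easy to check. In this sketch (illustrative numbers only, not anything measured on a real hub), the per-slot average changes with the slot count, but the total supplied to the hub does not:

```python
def per_slot_kib(total_upload_kib: float, slots: int) -> float:
    """Average speed per slot when one uploader's pipe is split evenly.
    The uploader's line, not the slot count, is the binding constraint."""
    return total_upload_kib / slots

# 5 slots at 20 KiB/s each, or 20 slots at 5 KiB/s each: either way the
# hub receives 100 KiB/s in aggregate from this uploader.
assert per_slot_kib(100.0, 5) == 20.0
assert per_slot_kib(100.0, 20) == 5.0
assert per_slot_kib(100.0, 5) * 5 == per_slot_kib(100.0, 20) * 20
```

This is exactly the "spreading the speed out more" effect described earlier in the thread: slot count redistributes the bandwidth, it doesn't create or destroy it.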

corona
Posts: 3
Joined: 2003-06-14 03:47
Contact:

Post by corona » 2004-08-19 09:13

When it comes down to it, multi-sourcing only really helps people who've got faster-than-average connections.
I've got a tunnel connection that I can download at up to 300KB/s on, but it drops out a lot and is passive only. Using a multi-source client (a non-corrupting one that is now available; it's been flawless for me) often lets me get a CD in 2-3 hours, simply because getting 10-20KB/s off 10 people sucks it down in no time.
But if I use my 512Kbit ADSL, I only have to download a few different files at once, getting 10-20KB/s on each of them, and I've maxed out my connection; there's absolutely nothing to gain from multi-source there.
The multi-source client I sometimes use has auto-dropping of slow users, which seems to get rid of pretty much every source if I have 20 sources @ 2KB/s.
I'm not sure if I should really mention the name of the client here or not.....

Corona
Windoze Sux still....
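The auto-dropping corona describes boils down to a per-source speed filter. A minimal sketch of the idea (the real client's thresholds, names, and behaviour will differ):

```python
def drop_slow_sources(source_speeds_kib: list[float],
                      min_kib_per_source: float) -> list[float]:
    """Keep only sources meeting the per-source minimum speed; with many
    2 KiB/s peers and any reasonable floor, this prunes nearly everything."""
    return [s for s in source_speeds_kib if s >= min_kib_per_source]

# 20 sources at 2 KiB/s all fall below a 5 KiB/s floor, as corona observed.
assert drop_slow_sources([2.0] * 20, 5.0) == []
assert drop_slow_sources([2.0, 15.0, 2.0], 5.0) == [15.0]
```

Which illustrates the tension in the thread: the slow sources a drop filter discards are precisely the ones -T_A-'s argument wants to aggregate.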

GargoyleMT
DC++ Contributor
Posts: 3212
Joined: 2003-01-07 21:46
Location: .pa.us

Re: Multisourse Will give faster download for everyone

Post by GargoyleMT » 2004-08-20 11:55

synOs wrote:It would be impossible, however, to deny that download speed, in aggregate, would increase, considering that bandwidth unused in the current single-source implementation of DC++ would now be used.
impossible? ;))

-T_A-
Posts: 3
Joined: 2004-02-18 17:54
Location: Israel

Post by -T_A- » 2004-08-23 04:49

I think what most people miss here is the fact that a big percentage of the sources in DC
are useless, simply because the upload of those sources is too low (lower than 10-15K).
With multisource you can use all those sources, since a lot of 5-10k sources can combine
into a fast download.
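-T_A-'s point is just addition, but it's worth making concrete. The downloader sees roughly the sum of the sources' upload speeds, capped by its own line; the numbers below are made up for illustration:

```python
def combined_speed_kib(source_speeds_kib: list[float],
                       downstream_cap_kib: float) -> float:
    """Segments pulled from each source in parallel add up, until the
    downloader's own connection becomes the bottleneck."""
    return min(sum(source_speeds_kib), downstream_cap_kib)

# Ten individually useless 8 KiB/s sources combine into a usable 80 KiB/s.
assert combined_speed_kib([8.0] * 10, 300.0) == 80.0
# A 512 Kbit line (~64 KiB/s) caps the same ten sources at the line speed,
# which is corona's ADSL case above.
assert combined_speed_kib([8.0] * 10, 64.0) == 64.0
```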

PseudonympH
Forum Moderator
Posts: 366
Joined: 2004-03-06 02:46

Post by PseudonympH » 2004-08-23 09:20

Thank you, Captain Obvious.

synOs
Posts: 7
Joined: 2004-02-14 18:04

Re: Multisourse Will give faster download for everyone

Post by synOs » 2004-08-27 17:30

GargoyleMT wrote:
synOs wrote:It would be impossible, however, to deny that download speed, in aggregate, would increase, considering that bandwidth unused in the current single-source implementation of DC++ would now be used.
impossible? ;))
Yes. Impossible... whatcha got? :D

madman2003
Posts: 7
Joined: 2003-11-22 14:11

Post by madman2003 » 2004-08-28 06:39

Doesn't DC++ already have multi-source downloading? (the ability to resume files from another user when the current one leaves) Isn't the feature under discussion here "segmented downloading"?

Segmented downloading will probably work great in private hubs, where slots aren't always full and people come back to share, and where a (small) group won't be presented with a massive increase in used upload bandwidth.

For those who have successfully used segmented-downloading clients in public hubs: keep in mind that only a small percentage of users use it. If everyone did, the situation would be quite different.

Segmented downloading could potentially lead to an overhaul of the slot system. We could get situations with 1 slot per 1 KiB of upload, or even less, which would make it look too much like other P2P systems. Direct Connect is a hub-based system and should be tuned to function best in one or a small number of hubs. In such a situation, segmented downloading isn't always beneficial.

Madman2003.

ilab
Posts: 1
Joined: 2004-09-05 16:32

x

Post by ilab » 2004-09-05 16:42

The sum of the slots of the users in a hub would not be less or more if DC++ supported multisource (or segmented, or whatever) downloading. You say that the slots are almost all being used even now. Then what would change? In fact, nothing: if somebody uses more than one slot for downloading a file, then that file will arrive sooner, so the slots will be used for a proportionally shorter time. The one exception is if many people decided to download much more stuff than now, but I don't think there are many such people.

CDGSatan
Posts: 1
Joined: 2004-09-11 17:48
Contact:

Post by CDGSatan » 2004-11-09 11:06

I think it will maximise the DC network more than slow it down. I mean, some other clients have had multi-source downloading for a long time now (I'm not sure if I can mention them or not, so I'll say nothing). I have tried them out, and it means I spend more time sharing than downloading. I can spend more time with an empty download queue because I got that new film in an hour rather than 2 days. I don't download more (and, admit it, none of you lot would either; you don't say, "aww, I gotta wait 2 days, I don't think I'll get that"). I just download the stuff I want, quicker.

I mean, surely 5000 users on a hub, each with 5/6 slots filled because people are multi-sourcing, is better than 2900 with 6/6 filled and 2100 with 1/6 filled as it is now? I think it will maximise the network because people will be uploading more often than they are downloading. It's a tricky topic, so I could be wrong, but whatever...

Locked