Distributed DC network

Technical discussion about the NMDC and ADC (http://dcpp.net/ADC.html) protocols. The NMDC protocol is documented in the Wiki (http://dcpp.net/wiki/), so feel free to refer to it.

Moderator: Moderators

A
Posts: 17
Joined: 2003-02-02 05:55

Post by A » 2003-05-01 09:16

If you want to make changes to the protocol, you should consider compatibility with all hub software.

Since the NMDC hub doesn't get updates anymore, that's one problem. Then DC V2 is supposed to come out soon, I heard (late April, they said); what about those compatibility problems?

You can't just discuss a change to the protocol in one topic. The way you're all talking, this is a whole new p2p network.

I know the protocol sucks, but it's really hard to change all of it now, because you need to change both the hub side and the client side.

BTW: just a thought, the PtokaX hub runs smoothly with 1500+ users.

A
Posts: 17
Joined: 2003-02-02 05:55

Post by A » 2003-05-01 09:27

But then again, bandwidth is what's being discussed.

Maybe it's possible to modify the client so that it can pass information through to other clients and build up some kind of tree.

The hub would only have to give the information to 10 users (more or less), and they would pass it on to other clients, and so on.

Marvin
Posts: 147
Joined: 2003-03-06 06:56
Location: France
Contact:

Post by Marvin » 2003-05-01 09:47

IMO, since the DC protocol has been published and new clients have been released, incompatibility with NeoModus software is not an issue. If the hub you want to connect to does not support the client you're using, it's up to you to get another client. If J. Hess wants his software to be compliant with the extended protocol (as you may know, the protocol has changed client-side since the release of NMDC), he is free to release as many versions as he wishes. If he wants to stick to his (original) version of the protocol, or extend it in his own way, I think I'll live with it, as I use neither NMDC nor NMDCHub.

As long as new clients keep backwards compatibility with old hub software (so that anybody with the latest client can join as many hubs as possible), I can't see the point of keeping hub software compatible with old clients (upgrading to a new client takes less than a minute when it comes to DC++).

aMutex
Posts: 10
Joined: 2003-02-07 15:04

Post by aMutex » 2003-05-01 15:10

OK, that PtokaX thing is really getting on my nerves. Just for your info, there are several hubs other than PtokaX that support a lot more users with fewer resources.
Please tell me which hub software has had 1700 users WITHOUT lag (and 1900 users at all) so far, besides PtokaX. BTW: I'd expect you to name some Windows hub software here, as you can't compare Linux hubs with Windows hubs. But I guess you're smart enough to understand that.
And if you had bothered actually reading this very interesting thread before spoiling it with garbage, you would have noticed...
The starter of this particular thread (seeSharp) wrote something about the background of his post. Have you read that? His opinion was that there is no hub software that can handle more than 900 users.
So I think it's only logical that someone posts here that PtokaX CAN handle more than 900 users. So what's your problem here?

Surely, there is a physical limit with the current DC protocol. But I don't think seeSharp expected some new protocol to be able to deal with 5000+ users. BTW: that would go against the spirit of DC anyway. But I guess you don't care about that one either.

So before you start another PtokaX-flaming post next time, think it through.

Anyway, good luck with the further discussion about protocol/core changes...

TasMan
Posts: 196
Joined: 2003-01-03 08:31
Location: Canada
Contact:

Post by TasMan » 2003-05-01 15:37

If you read ButterflySoul's entire post, what he's really annoyed about is the blatant posts about PtoKaX. Spamming is something nobody likes... (PtoKaX is mentioned in many posts where it doesn't really have any relevance, even here! =)

I agree with one point though. Stop the pointless flaming, and stop mentioning any particular hub software. We can flame in a thread specifically for flaming =)

Anyway, back to the proper topic...

I like the idea of distributing the traffic.....when the ideas become solid, I would gladly begin implementation so we can start getting an idea of how much bandwidth is really saved, etc.
Shadows Direct Connect Hub - Taking away the light from NMDCH, leaving only shadows.....

cologic
Programmer
Posts: 337
Joined: 2003-01-06 13:32
Contact:

Post by cologic » 2003-05-01 16:54

You can't just discuss a change to the protocol in one topic. The way you're all talking, this is a whole new p2p network.
Not if the extra messages among the hubs necessary to maintain the multihub network are filtered out before the users ever see them; in that event, the client can't distinguish this virtual hub from a single, real hub.
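
To make that filtering concrete, here is a minimal Python sketch (an editorial illustration, not code from any existing hub; the "$IH-" prefix for inter-hub messages is invented for the example):

```python
# Hypothetical sketch of the filtering described above: a linked hub consumes
# its own inter-hub control traffic and forwards only ordinary NMDC commands,
# so a connected client sees nothing but a normal single hub.

INTERHUB_PREFIX = "$IH-"   # made-up marker for hub-to-hub control messages

def relay_to_clients(message, clients):
    """Broadcast a protocol message to local clients unless it is inter-hub plumbing."""
    if message.startswith(INTERHUB_PREFIX):
        handle_interhub(message)          # consumed by the hub, never seen by users
        return
    for client in clients:
        client.send(message + "|")        # NMDC commands are pipe-terminated

def handle_interhub(message):
    # e.g. user-list sync, load reports, ban propagation between linked hubs
    pass
```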
Maybe it's possible to modify the client so that it can pass information through to other clients and build up some kind of tree.

The hub would only have to give the information to 10 users (more or less), and they would pass it on to other clients, and so on.
As the previous couple of people who brought this up failed to do, please suggest security mechanisms.

Alternatively, just give these few users a hub to run: it's more important that a hub, or relay station (*cough* a castrated hub, not worth writing/distributing separately), keeps running than that a client does, and I worry about those 10 users accidentally closing their clients. I'd avoid commingling client functionality with hub functionality, for at least this psychological reason.
Marvin wrote:I can't see the point to keep hub software compatible with old clients
Whilst it's not verboten, certainly, to change/add to the DC protocol, I avoid it if something's possible without doing so, such that the greatest number of existing participants in the DC network may benefit. This is one of those cases; no client change is necessary to quite noticeably improve the bandwidth situation, so I would avoid such a change.

volkris
Posts: 121
Joined: 2003-02-02 18:07
Contact:

Post by volkris » 2003-05-01 20:00

ButterflySoul wrote:If I remember properly, the few approaches of distributed hubs so far apparently suffer from an "out of sync" issue after a bit. Having one single hub keep track of all the users on all the slave hubs is the most simple way to fix the issue.
However, I'll definitely agree with you that it's not the most elegant =)
Yes, and I really wonder #1, why they are getting out of sync (it kind of sounds like an implementation problem/bug rather than a problem with the architecture or idea), and #2, exactly what specifically needs to be kept in sync, as I'm pretty sure the things that need sync don't need to be distributed, as you seem to agree.

HaArD
Posts: 147
Joined: 2003-01-04 02:20
Location: Canada http://hub-link.sf.net
Contact:

Post by HaArD » 2003-05-01 21:56

I'm glad I missed the campfire.... now that the flames have died down and we're back on topic..
volkris wrote:Yes, and I really wonder #1, why they are getting out of sync (it kind of sounds like an implementation problem/bug rather than a problem with the architecture or idea), and #2, exactly what specifically needs to be kept in sync, as I'm pretty sure the things that need sync don't need to be distributed, as you seem to agree.
Exactly!

In order to keep things in sync, the user list and the contents of the users' $MyINFO must be given priority in hub-hub exchanges. After that I would place chat messages (or it becomes impossible to follow a conversation), then search requests and PMs.

I would have placed search requests higher, but we are talking about linking a large hub to allow it to grow further; any delay in broadcasting the request to 'linked' hubs shouldn't cause a big problem, since the 'home' hub will have sent it to 'several hundred' users right away anyway.
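
As an illustration of that ordering, a minimal sketch (invented for this write-up; the priority values and the chat heuristic are assumptions, while the command names are standard NMDC):

```python
# Prioritised outbound queue for hub-hub traffic: user-list/$MyINFO changes first,
# then main chat, then searches and PMs.
import heapq
import itertools

PRIORITY = {"$MyINFO": 0, "$Hello": 0, "$Quit": 0,   # keep the user list in sync first
            "chat": 1,                               # then main chat
            "$Search": 2, "$To:": 2}                 # searches and PMs can lag a little

_counter = itertools.count()   # tie-breaker keeps FIFO order within a priority level
_queue = []

def classify(message):
    if message.startswith("<"):          # main chat lines look like "<nick> text"
        return PRIORITY["chat"]
    command = message.split(" ", 1)[0]
    return PRIORITY.get(command, 2)

def enqueue(message):
    heapq.heappush(_queue, (classify(message), next(_counter), message))

def next_message():
    return heapq.heappop(_queue)[2]
```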

One of the things I have noted is the tendency of some clients (tested with DC++ v.24 and oDC v5.1) to send $MyINFO whenever the user clicks OK on the settings, even if nothing in the $MyINFO string has changed.

It would be preferable for clients to only send $MyINFO when there is a change, but, failing that, hub-hub traffic should not include redundant $MyINFO strings and hubs should not broadcast them (NMDCH does; untested with others).
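
A hedged sketch of such a redundancy filter, assuming the standard NMDC $MyINFO layout (illustrative only, not taken from NMDCH or any existing linking script):

```python
# Keep the last $MyINFO seen per nick and only forward strings that actually changed.
last_myinfo = {}   # nick -> last $MyINFO string broadcast

def should_forward_myinfo(myinfo):
    # NMDC format: $MyINFO $ALL <nick> <description>$ $<speed>$<email>$<share>$
    parts = myinfo.split(" ", 3)
    if len(parts) < 3 or parts[0] != "$MyINFO":
        return True                     # not a $MyINFO, let it through untouched
    nick = parts[2]
    if last_myinfo.get(nick) == myinfo:
        return False                    # identical to what we already broadcast
    last_myinfo[nick] = myinfo
    return True
```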

Just some random thoughts on this... I'd like to hear more about the existing implementations of linking, the approach taken, and what worked/didn't with that approach... I haven't heard of Windows hub linking beyond NMDC, and I'm not a Linux user, so I'm unfamiliar with those...

HaArD

seeSharp
Posts: 24
Joined: 2003-04-19 10:03

Post by seeSharp » 2003-05-07 18:57

Back in Business :)

I'm happy to read many new and good ideas. I really like the master/slave HUB concept. That one could solve the bandwidth problems without any client side changes. And it would give a good solution to sync problems too.

Its extra bandwidth requirements over a hub-relay network should be estimated... (maybe I will do it some time later... :)

volkris: simple hub linking solutions do get out of sync because of bugs in the implementation. The point is that running a hub designed for stand-alone operation and hacking it to work in a server tree/pool isn't a nice idea. I remember the old days of Win3.x/Win9x, where you never got a really stable system, just because of the architecture. It wasn't designed to be safe/stable/etc...

That's why I would like to see a solution which isn't the simplest hack into the system to get it working, but a good one.

volkris
Posts: 121
Joined: 2003-02-02 18:07
Contact:

Post by volkris » 2003-05-09 09:44

So it seems to me that hub linking has just never been given a fair trial.

Basically, of the two solutions proposed here, linking of hubs and using clients as relays, hub linking is more in line with the needs of DC.

The most obvious difference between the two is the far greater complexity of client relaying, which is fine if it's justified, but I don't believe it is here.

What it boils down to is that with client-relays you'll have less dependability, less determinism, more security issues, more route flapping, more programming, and significant changes to the client side.

With hub linking you don't have any of these problems; the only real negatives are that people have to volunteer to run linked hubs and the RIAA has a slightly better idea of whose door to knock on. Neither of these seems to be a significant problem in DC, though (if the RIAA starts taking out hubs, it's not going to be stopped by the client relays anyway).

So hub linking is basically the answer.

seeSharp
Posts: 24
Joined: 2003-04-19 10:03

Post by seeSharp » 2003-05-09 14:15

I'm getting a bit confused now...

Are we talking about HUB linking or using clients as relays?

Or are we talking about 3 different things?
1.: clients as relays - seems to be disliked
2.: using RelayStations - one MasterHUB and some Slave-HUBs
a.: with client side changes to optimize the network load - client connects to Master and 1 Slave
b.: without client side changes - Client only connects to 1 Slave
3.: hub linking - traditional DC hubs, linked with some plugins/scripts. - that's what we have trouble with now.

About RIAA:
Doing version 2/b would make it harder for the RIAA: one master hub with a fairly low load - it could even be placed on a DSL line - hard to catch. The slave hubs know nothing. Can they sue Cisco? Their routers are also routing p2p traffic... :))
Well, larger hubs would be better targets for them, but it won't change that much. Hubs could grow from 1000 users to maybe 5000. Well under the millions on Kazaa...

agito
Posts: 1
Joined: 2003-05-10 19:18

Would it be possible to make it like IRC servers?

Post by agito » 2003-05-10 19:23

I'm not too familiar with IRC servers, but I do know that they have that weird netsplit thingy that results from servers losing connections with each other. It seems to do fairly well for chatting, though...

I wanted something like that for my LAN at school.

Something that would be like DC++ with file sharing, but with the chat distributed through multiple servers that are linked together and only allow, let's say, 50 connections each...

That way no single server takes all the bandwidth hits, and everyone can chat in the channel.

At this moment there could be infinite channels of the same name on different DC servers...

If we linked the servers then the channels would merge... well... in theory...

Thanks,
Agito

yilard
Posts: 66
Joined: 2003-01-11 06:04
Location: Slovakia

Re: Would it be possible to make it like IRC servers?

Post by yilard » 2003-05-11 07:39

agito wrote:I'm not too familiar with IRC servers, but I do know that they have that weird netsplit thingy that results from servers losing connections with each other. It seems to do fairly well for chatting, though...
This problem is not applicable to the idea with master and slave hubs as far as I understand it.

In IRC the problem is that all servers are peers, and when they cannot communicate they don't know whether there are users with the same names on several servers. When they rejoin, they have to decide whom to keep connected.

The IRC network is designed for fault tolerance and bandwidth conservation, but bandwidth conservation is enough for us. Slave hubs would be just empty shells without the master.
agito wrote: if we linked the servers then the channels would merge... well... in theory...
You are offering an analogy to the IRC network, but that architecture does not scale well to as many hubs as there are currently online.

Chat can be linked in smaller hub networks (I think there are already scripts that do it).
In the age of super-boredom/hype and mediocrity/celebrate relentlessness/menace to society --KMFDM

ButterflySoul
Posts: 210
Joined: 2003-01-23 17:24
Location: Nevada
Contact:

Post by ButterflySoul » 2003-05-12 16:25

This problem is not applicable to the idea with master and slave hubs as far as I understand it.
Well, I didn't think it would end up "competing" alongside the other ideas already presented in the thread, but since there are a few posts about it, I thought it might do some good to elaborate a bit.

The general idea is that what currently blocks the hubs is the bandwidth (when run on a decent CPU, anyway =), and what eats up most of the bandwidth is the searches (according to the stats posted so far).

So essentially, each slave would redirect all the searches directly to the other slaves, and send all the rest of the protocol data to the master hub. It would of course also take care of distributing "upstream" protocol to all the clients in its charge.
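
A rough sketch of that traffic split (the class and method names are invented for illustration, and only $Search/$SR are treated as search traffic):

```python
# Each slave holds sockets to its local clients, to the other slaves, and to the master.
SEARCH_COMMANDS = ("$Search", "$SR")    # search traffic stays between slaves

class SlaveHub:
    def __init__(self, master, peer_slaves, local_clients):
        self.master = master                # single link "up" to the master hub
        self.peer_slaves = peer_slaves      # links to the other slaves
        self.local_clients = local_clients  # clients this slave is responsible for

    def on_client_message(self, message):
        if message.startswith(SEARCH_COMMANDS):
            # searches never reach the master: fan them out to the other slaves,
            # which then broadcast to their own clients
            for slave in self.peer_slaves:
                slave.send(message)
            self.broadcast_local(message)
        else:
            # chat, $MyINFO, PMs, etc. go up to the master, which distributes
            # the resulting "upstream" protocol back down through every slave
            self.master.send(message)

    def broadcast_local(self, message):
        for client in self.local_clients:
            client.send(message)
```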

The master hub would therefore receive a lot of downstream (that's not a problem, there's usually plenty of it) and send very little upstream (since no client is actually "connected" to it; only the slave hubs talk with the master). Remember that it would never see any protocol data related to searches, so even though it would have to process a lot of "bulk" data, it would be nothing compared to the whole of the data being exchanged on the "group of hubs" (i.e. the distributed hub considered as a whole).

To limit the processing load on the master, you could also consider having the slave hubs do a minimum of "pre-validation" of clients when they want to connect. Say a client wants to log in: you could have the slave hub hold a copy of the banned IP list and do that check itself (that should help a bit). If the process goes as far as validating the nick, it would then ask the master hub "client xxxx wants to connect, do I need to ask for a pass?" and wait for an answer (either "no", or the pass, or "nick is already taken"). From there, the client can negotiate the rest of the login sequence with the slave hub, and the slave hub could enforce any slots / mldonkey / minimum share / etc. rules and the rest of the login sequence up to the $MyInfo. Once all this is done, if the client passed all the checks, a last "client successfully logged in (i.e. with the right pass, enough files shared, isn't a hacked client, etc.), please add him to the global user list" from the slave hub to the master hub would finish the login on the "distributed hub".
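
A hedged sketch of that login flow; the slave-to-master messages ("NICK?", "ADDUSER") and the helper methods are invented for illustration, while $ValidateDenide and $BadPass are the usual NMDC reject replies:

```python
# The slave pre-validates what it can locally (ban list, share rules) and only
# asks the master about the nick and password.
class SlaveLogin:
    def __init__(self, master_link, banned_ips, min_share_bytes):
        self.master = master_link
        self.banned_ips = banned_ips          # local copy of the ban list
        self.min_share = min_share_bytes

    def try_login(self, client):
        if client.ip in self.banned_ips:      # cheap check, no master round-trip
            client.close()
            return False

        answer = self.master.ask("NICK? " + client.nick)   # "ok", "pass <hash>" or "taken"
        if answer == "taken":
            client.send("$ValidateDenide " + client.nick + "|")
            return False
        if answer.startswith("pass") and not client.check_password(answer.split(" ", 1)[1]):
            client.send("$BadPass|")
            return False

        if client.share_size < self.min_share:             # local rules: share, slots, etc.
            client.close()
            return False

        # everything checked out locally: ask the master to add the user globally
        self.master.send("ADDUSER " + client.nick + " " + client.myinfo)
        return True
```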
The rest of the protocol data needs some processing too, of course, but the command holding the record for processing is the $MyInfo (I realise the chat and PMs can trigger much more complex commands, but those commands aren't executed each and every time a chat sentence or a PM arrives at the hub).

The solution is not scalable ad infinitum. Because the number of slave connections a master hub can handle is limited, the solution has a finite number of clients it can handle.
That said, an average connection nowadays can handle 500 clients. Often more than 500, of course, but let's consider that the slave hubs will have to send a bit more upstream than traditional hubs, and that the master hub will have to download more data from its own clients (i.e. the slave hubs)... Not sure how many clients each hub will be able to handle, but all in all 500 seems a reasonable, modest count, so let's go with that.
With 500 slave hubs plugged into the master hub, each slave hub handling 500 users... that's an architecture that supports 250,000 users total on the hub, if you really push it to the edge (or fairly close, anyway). Now consider the fact that your other 249,999 co-hubsters will be sending searches (which the master hub won't have to process, but your client will). Add to that the fact that it will also have to process chat, maybe a few PMs, and keep up to date a user list of 250k people with their lill icons and everything... chances are your CPU will scream bloody murder, even if you found a patch to run DC++ on that Cray sitting in your living room.
My point is, it's not as scalable as other solutions like ICQ, but it's most likely scalable as high as the client software will go. And so far, it seems that the distributed solutions requiring client modifications had terrible drawbacks (what if the client disconnects, how do we keep it compatible with older clients, what if older clients saturate the hub - since they'll still need to connect to the hub - and most importantly, how do we get all client devs AND hub devs to agree on a common standard code and protocol to handle that mess?).
[CoZ] Children of Zeus
-----
Shadows DC Hub - VBS and JS scripting at their best

cologic
Programmer
Posts: 337
Joined: 2003-01-06 13:32
Contact:

Post by cologic » 2003-05-12 18:50

each slave hub handling 500 users
Each slave must effectively handle 1,000 users - 500 clients, 499 other slaves, and the master. Thus, a connection which can handle 500 "users" will handle no end-user clients, but the 499 other slaves and the master, each of which will require as much bandwidth as an end-user client, leaving no bandwidth free to deal with the latter. This clearly doesn't accomplish much.

Thus, you could use a configuration with 250 hubs or something - each slave can then service 250 end-users, and you'd still have 250 slaves * 250 users/slave = 62,500 users. I would strongly suspect that a completely connected graph is not the most efficient one can do here, though; a minimal spanning tree should be optimal in terms of bandwidth usage, and any tree is more easily routable than a non-complete graph with cycles.
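
For reference, the arithmetic behind those numbers (a quick back-of-the-envelope script, not from the thread):

```python
# With a per-hub connection budget C and N fully-meshed slaves (each also holding one
# link to the master), a slave can serve C - (N - 1) - 1 = C - N end users,
# so the total is N * (C - N).
def total_users(budget, slaves):
    clients_per_slave = budget - (slaves - 1) - 1   # minus peer links and the master link
    return max(clients_per_slave, 0) * slaves

print(total_users(500, 500))   # 0      -> the 500-slave layout serves nobody
print(total_users(500, 250))   # 62500  -> the 250/250 split in the post
# N * (C - N) peaks at N = C/2, so 250 slaves is in fact the best a full mesh can do here.
```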

More broadly, I question the point of the master here. Why not just make all the slaves peers, with no single master among them, and connect them as I've described?

ButterflySoul
Posts: 210
Joined: 2003-01-23 17:24
Location: Nevada
Contact:

Post by ButterflySoul » 2003-05-13 18:32

Each slave must effectively handle 1,000 users
Ah yes, indeed. Make that 62k users instead of 250k.

Even so, I think that's still a lot of work for CPUs running the dc clients. I'm not sure how many fellow users a DC++ client can manage on an average CPU (and I know it depends a lot on the size of your share, and on the frequency of searches) but my guess is you will probably want to keep it way under 62k.

Independently from what the computer and the software can handle, you also have to take into account what the users, as human beings, can handle. Personally, I would never want to waste my time stepping into a hub with 5000 users, because the chat is probably going to be a chaotic mess impossible to follow. Either that, or it's going to be plain dead, with a majority of "I'm not talking, I'm just downloading in my lill corner" users, which is even worse, because DC is about the community inside each hub, not simply the files. For a simple download supermarket without any interaction, Kazaa or BitTorrent are much more appropriate. You can have a large community of several tens of thousands of users, but not with everyone in the same place all day long. It's not suited to chat rooms, or hubs, but rather to message boards, or concepts that allow users to "drag their group" to a quieter place. If you've ever played one of these text-based RPGs, you probably know what I mean. Big events with everyone together in the same place/same room are fun... for about half an hour. After that, you really want to go back to a quieter place with the people you feel close to, rather than fight the scroll and feel lost in the mass.
Here and there, a few people have managed to reach over a thousand users and keep a sense of community that goes beyond a simple "we're in the same hub" (which I think is amazing, considering the trouble I have remembering the real-life names, continents and general tastes of barely a hundred users). Obviously, there is a need for a tad more space and room in a few hubs; but while the technical limit to this is -in theory- infinite, you can only go so high before it plain stops being a hub at all. The essence of a DC hub is not suited to these massive (>10k) numbers of simultaneous users, unless you butcher away some parts of the hub concept as a whole and only consider the upload/download aspect. It's an issue that was barely mentioned in the thread so far, and I don't think it's trivial. So far, the general tone was "the more, the better", which is true from a purely technical point of view, but becomes false past a certain point from a global point of view.
The real question is : Where is that certain point ?
Obviously, the distributed hub concept doesn't need to be scalable beyond it -and my personal opinion is it actually shouldn't-

-----

On a side note, you'd probably also become the #1 target for the RIAA guys (I'm sure they'd prefer to have the whole community on 10 hubs of 50k users each rather than split all over the place like now; it would make their job much easier =)

More broadly, I question the point of the master here. Why not just make all the slaves peers, with no single master among them, and connect them as I've described?
OK, the real reason why is that beyond the pure concept discussion and nice ideals on a message board, at some point someone is going to have to write some code.
Now if that someone were ever to be me, I would pick a concept with a master hub, because it makes things easier to code; and while it does have limitations on the maximum number of users, it still lets you have "enough" of them, and even "more" than what would allow the final result to still be called a "true" hub (that is, unless you completely disagree with the huge paragraph above).

Having a master hub and a centralised user list means many things are easier:
- Take the case of 1 slave/node being full. With a master/slave concept, you can have the slave announce it's full, and the master assign a "redirect" spot for the next users. With a "no one is the boss" system, all the slaves have to keep track of each other's load.
- Take another example: the registered user list. If you want someone to be asked for their password when they connect to node #13, but they were registered on node #3, it means that on new registrations/unregistrations the info has to be propagated to all the slaves to keep things in sync. With a master/slave concept, you always know who to ask for the pass, and you're sure it will always be up to date.

You can give the relays some degree of independence, and have them relay to each other much more than just the searches. You can have them process many more things than just some pre-connection check. But if you want them to process everything 100% independently without any master hub, then you'll need to propagate a lot of "plumbing" info, and design a complete "plumbing for distributed DC" protocol.

Essentially, to keep the whole structure together with independent nodes and no master, you'd need an equivalent of the role ICMP plays on the internet today. With a master hub in the picture, you don't need to worry about it. As long as the master knows everything, you're fine. It will tell the slaves what they need, when they need it, and won't bother them the rest of the time.
[CoZ] Children of Zeus
-----
Shadows DC Hub - VBS and JS scripting at their best

HaArD
Posts: 147
Joined: 2003-01-04 02:20
Location: Canada http://hub-link.sf.net
Contact:

Post by HaArD » 2003-05-13 21:17

I think you have an excellent point about how much is too much. 50,000 users chatting at once would be a scrolling nightmare...

One thing I'd add to the Master/Slave concept is the requirement for a redundant MASTER.

A Primary MASTER and a Secondary MASTER. All SLAVES would know both addresses. The Primary would keep the Secondary up to date on all centralized info (Passwords/Bans etc) and provide a "heartbeat" to the Secondary.

If the Primary went down, the Secondary would "announce" that it has taken over the MASTER role. When the Primary comes back up, the Secondary would provide the required "catch up data" and the Primary would "announce" that it has taken over the MASTER role again.
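
A minimal sketch of that heartbeat/failover logic, with invented timing values and message wording (the state-sync methods are placeholders):

```python
# The Primary pings the Secondary; if the pings stop, the Secondary announces itself
# as MASTER, and hands the role back (after a catch-up sync) when the Primary returns.
import time

HEARTBEAT_TIMEOUT = 15      # seconds of silence before the Secondary takes over (assumption)

class SecondaryMaster:
    def __init__(self, slaves):
        self.slaves = slaves
        self.last_beat = time.monotonic()
        self.is_active = False

    def on_heartbeat(self, state_update):
        self.apply(state_update)             # stay in sync with passwords, bans, user list
        self.last_beat = time.monotonic()
        if self.is_active:
            # Primary is back: send it everything it missed, then step down
            self.send_catch_up_data()
            self.announce("Primary has resumed the MASTER role")
            self.is_active = False

    def check_timeout(self):
        if not self.is_active and time.monotonic() - self.last_beat > HEARTBEAT_TIMEOUT:
            self.is_active = True
            self.announce("Secondary taking over the MASTER role")

    def announce(self, text):
        for slave in self.slaves:
            slave.send(text)

    def apply(self, state_update): ...       # placeholder: merge centralized info
    def send_catch_up_data(self): ...        # placeholder: replay what the Primary missed
```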

HaArD

cologic
Programmer
Posts: 337
Joined: 2003-01-06 13:32
Contact:

Post by cologic » 2003-05-13 23:33

So far, the general tone was "the more, the better"; which is true from a purely technical point of view, but becomes false past a certain point from a global point of view.
The real question is : Where is that certain point ?
Obviously, the distributed hub concept doesn't need to be scalable beyond it -and my personal opinion is it actually shouldn't-
Agreed. I don't know where that point is, though. The two following objections to the masterless configuration also are good:
- Take the case of 1 slave/node being full. With a master/slave concept, you can have the slave announce it's full, and the master assign a "redirect" spot for the next users. With a "no one is the boss" system, all the slaves have to keep track of each other's load.
True, though my solution would be to have an external, redirect-only hub for that purpose. This hub handles user logons and sends users off to the hub that's least busy. The crucial difference between this method and the master hub method is that this one is much more robust against things like a single hub being taken down (e.g. the master, or both of them); new users simply wouldn't be able to log on, which is not nearly as bad.
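
A sketch of such a redirect-only frontend, assuming the linked hubs report their user counts somehow; $ForceMove is NMDC's standard redirect command:

```python
# The frontend accepts the initial connection, picks the least busy linked hub,
# and bounces the client there.
def pick_least_busy(hubs):
    # hubs: list of (address, current_user_count) reported by the linked hubs
    return min(hubs, key=lambda hub: hub[1])[0]

def redirect(client, hubs):
    target = pick_least_busy(hubs)
    client.send("$ForceMove " + target + "|")   # client reconnects to the chosen hub
    client.close()

# Example: redirect(client, [("slave1.example.org:411", 480), ("slave2.example.org:411", 120)])
# sends the client to slave2.
```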
- Take another example : the registered user list. If you want someone to be asked for their password when they connect on node#13, but were registered on node #3, it means at new registrations/unregistrations, the info has to be propagated to all the slaves, to keep things in sync. With a master/slave concept, you always know who to ask for the pass, and you're sure it will always be up to date.
Hrm, I'm not sure how I'd handle it - perhaps via this frontend, redirect-only hub as suggested above.
But if you want them to process everything 100% independantly without any master hub, then you'll need to propagate a lot of "plumbing" info, and design a complete "plumbing for distributed DC" protocol.
That's fine. None of it has to leak to clients, so it doesn't even have to be standardized.
Essentially, to keep the whole structure together with independant nodes and no master, you'd need to have an equivalent of the role ICMP plays on the internet today. With a master hub in the picture, you don't need to worry about it. As long as the master knows everything, you're fine. It will tell the slaves what they need, when they need it, and won't bother them the rest of the time.
Explain more?
One thing I'd add to the master Slave concept is the requirement for a redundant MASTER.
... and complexity is added back in... The reason for using a master was apparently to avoid complexity; given that a masterless topology is ideal, if one begins to approach its complexity anyway with patches to the master approach, it seems one should probably reconsider having masters. I'm not sure how far towards that point a secondary master goes, though.

Bizzy_D
Posts: 2
Joined: 2003-05-14 06:45

Post by Bizzy_D » 2003-05-14 07:14

Hey, I'm writing a new hub program, and I'm hoping to build distributed networking into it, as well as some new chat stuff.
I don't know how long it's going to take or how difficult it's going to be, but I'm going to try anyway.

Hopefully, this will allow for MUCH bigger hubs than 1000 users... if people co-operate, we could easily see hubs of 3, maybe 4000 people with 200+ TB shared.

anyway, any help would be loved :)

Bizzy

HaArD
Posts: 147
Joined: 2003-01-04 02:20
Location: Canada http://hub-link.sf.net
Contact:

Post by HaArD » 2003-05-14 07:15

The reason for using a master was apparently to avoid complexity; given that a masterless topology is ideal, if one begins to approach its complexity anyway with patches to the master approach, it seems one should probably reconsider having masters. I'm not sure how far towards that point a secondary master goes, though.
In a masterless setup you have to propagate "plumbing" to 'n' hubs. This could lead to synchronization issues and presents some challenges with load balancing.

In a Master/Slave setup there is no propagation, therefore no synchronization issues, and load balancing is simple. But you now have a single point of failure that could take down the whole network.

In a Primary/Secondary situation the Master propagates that "plumbing" to one other location regardless of the number of Slaves. You get the benefits of the Master/Slave model with some of the redundancy and stability of the serverless model at far less "cost/complexity".

HaArD

ButterflySoul
Posts: 210
Joined: 2003-01-23 17:24
Location: Nevada
Contact:

Post by ButterflySoul » 2003-05-14 13:12

That's fine. None of it has to leak to clients, so it doesn't even have to be standardized.
(...)
Explain more?
No, it doesn't need to be standardized. Whether you pick a master/slave setup or a masterless setup, you will need some plumbing data to flow inside the distributed structure anyway. The difference in complexity resides in propagating the info and keeping it in sync, as HaArD pointed out.

In a masterless setup, it needs to be managed by each hub for the entire network, and propagated to all the other hubs. Masterless example: you send a command to your 90 lill brother-hubs (the first, obvious drawback being that you use 90x the bandwidth of a master/slave setup). You also need to keep track of where you sent it and who acknowledged it (so you're also waiting for 90 bandwidth-eating acknowledgement commands, plus you have to manage a huge list of "who acknowledged, and who didn't" for each command).
In case for some reason nodes #24 and #25 are not answering after 5 re-sends, you would assume they are down... The problem is hub #25 went down about a minute earlier than hub #24, so you need to keep, for each hub, an individual custom list of the commands that specific hub missed while it was away (that's fine if it's down for the time of a reboot, but what if a contractor blew the line and the node won't show back up for a whole day? What if the node owner went on holiday for 3 weeks?)
What if you yourself, your own node, go down for 5 minutes... That's when nodes #24 and #25 show back up... Obviously, if you're the only one who had the info, it's pretty bad. So all the other nodes will have to keep track of what #24 and #25 missed while they were away (yay... we have 90 computers doing the job instead of one... what do you mean that's not what a "distributed" hub is about? =p)... Let's make it a bit more tricky and say that, because of provider choices, you're the one with a nice fast connection to node #25, and both your providers are using a common partner, which is what causes you to be temporarily down... So hub #25 and yourself are still cut off from the rest of the world, but hub #24 logs in fine... Hub #24 will receive its first update from whoever was fastest... and update everything... Now of course, it also receives a second update shortly after, and a third, (...), and an 87th... So what? Does it discard all the other updates altogether because "I already got my login update, thank you, but I'm acknowledging your packet anyway, because I'm online now"? Obviously, you can have the same user logged out 87 times, but you can't start processing each update as they come, or you'll have him logged in 87 times, which is going to be troublesome if you keep your records indexed...
Let's spice things up and go back to you and node #25. You finally got a hold of it, so you send your big "while you were away" update... but you won't have all the updates. You'll be missing 5 minutes' worth of key information. You simply don't know it. Now the common partner fixes his routers, and you're both back online for real. You get a nice update for while you were away. That's fine, it's the update for the right time frame, your node works fine... But what happens to node #25? Obviously, it already acknowledged a "while you were away" update (the one from you), but that one doesn't contain all the updates that would be required. So if it discards the next 88 updates, it will be out of sync... If it processes them (or at least processes one of them), there will be some operations made twice. For the thing to work properly, you'd need to do some serious cross-referencing of what you had, what you already updated, and determine what's left to update by cross-referencing n updates together... Come on... It's a nightmare! =p
With a master/slave setup the info doesn't need to propagate to n hubs. It doesn't need to propagate at all... The slaves can ask the master "is password so-and-so the right one for that account?" and the master answers yes or no. The slaves don't need to be kept up to date in real time on each password change. The slaves can ask the master "I spaced out for a sec, send me the user list" and the master serves a huge nick list. No need to cross-reference 87 nick lists and make a best guess.
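
A sketch of that "just ask" model, with invented request names, to contrast with the propagation scenario above:

```python
# Slaves hold no authoritative state, so nothing can go stale and nothing needs
# to be cross-referenced after an outage.
class Slave:
    def __init__(self, master_link):
        self.master = master_link

    def check_password(self, nick, password):
        # nothing is cached locally, so the answer can never be out of date
        return self.master.ask("CHECKPASS " + nick + " " + password) == "yes"

    def resync_after_outage(self):
        # "I spaced out for a sec, send me the user list": one authoritative answer,
        # no merging of 87 partial updates from peers ($$ is NMDC's nick separator)
        return self.master.ask("SENDUSERLIST").split("$$")
```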


Want another example? Say the hubs are masterless... What prevents a "rogue" node from announcing itself and asking to be served the complete registered user list with passwords and everything for each account on the distributed hub, including the "general hub superadmin" account (your gateway node, eventually, would do the trick, but then I'm concerned about the traffic distribution on each hub and the load balancing as people log off from the nodes)? What prevents it from being "disruptive"? Not even going as far as a major security issue; let's simply say it spams people with PMs about copying software being illegal. What are you going to do then? $NodeXxx $kick user blah blah blah? Yeah right, like it's going to process it =p Sure, you could set up a list of "trusted nodes", but then who is going to have a say about which node is trusted and which node isn't? How are you going to update that list? Who is going to be the "trusted source for adding trusted nodes to the structure"? How do you update the trusted source for trusted nodes (etc, etc, etc...)?

Yes, sure, it can be done... You can add hashes to the plumbing protocol to make it a bit safer, keep all the accounts and passwords on a gateway (and god help you if someone decides to connect directly to a node without going through the gateway first, because their friend at school gave them the address or whatever), and you can add complex cross-referencing mechanisms to keep all the updates in sync and determine what needs to be discarded and what needs to be processed.
But from a quick look at it, it seems a lot easier to go with a master and just "ask" when needed... instead of being constantly kept updated by independent nodes, receiving each piece of info n times, and making sure it propagates properly n times as well.
In a Primary, Secondary situation the Master propogates that "plumbing" to one other location regardless of the number of Slaves. You get the benefits of the Master/Slave model with some of the redundancy and stability of the serverless model at far less "cost/complexity"
Aye. Essentially like the primary and secondary of an NT network model. The secondary is just there to be a backup. It doesn't add all the complexity again, because there's still only one "known for sure" unique central trusted source (at a time) for everything. If the primary (or the secondary) falls, big deal. You just serve it the current state when it comes back.
There's no cross-referencing needed because there's only one source. There are no half-committed updates that have been processed in some places but not in others and still need to finish propagating, because there's only one place where the updates go. They stay stored there and don't propagate further. Either they make it there or they don't. You never end up with those ugly cases where they spread to two thirds of the network but not to the last third, and you need to figure out how to fix it from there =)
[CoZ] Children of Zeus
-----
Shadows DC Hub - VBS and JS scripting at their best

SBSoftEA
Posts: 20
Joined: 2003-01-03 12:29

Post by SBSoftEA » 2003-05-15 06:07

Well, I am currently working on a so-called MHN (multi-hub network) server for SBHub, which should also work with any other hub through a script. The server will make the following things possible:
1) MHC
2) Multi hub bans
3) Multi hub ops
4) Synchronization with a master op list located @ the Server
5) Synchronization with a master ban list located @ the Server
6) Redirect Address can also be synchronized with the server
7) MHS (multi-hub search) would also be possible, but as far as I know, DC++ won't let you download from users not in the user list (correct me if I am wrong).
If you want, I can post the protocol that I have created so far.
SBSoftEA Test Hub ---> seabass.no-ip.org:413
Not online 24/7 come by to help us test =)

seeSharp
Posts: 24
Joined: 2003-04-19 10:03

MHS

Post by seeSharp » 2003-06-09 04:33

Well, good news everybody. The new PtokaX (Testdrive v3) has some multihub features. They're not enabled yet. They have said it will work in the next stable release.

AFAIK it will be a Master/Slave solution, and won't require any client side changes.

Locked