Scalability issue: extreme user counts and searching

Technical discussion about the NMDC and ADC (http://dcpp.net/ADC.html) protocols. The NMDC protocol is documented in the Wiki (http://dcpp.net/wiki/), so feel free to refer to it.

Moderator: Moderators

Locked
Dj_Offset
Posts: 48
Joined: 2003-02-22 19:22
Location: Oslo, Norway

Scalability issue: extreme user counts and searching

Post by Dj_Offset » 2003-06-18 05:12

How many users can search my files?

I mean, if I join multiple hubs with 1000+ users each, things start to slow down because of the volume of incoming search requests.
Does anyone have any numbers here?

I saw from the screenshots of the intra dc client that it has built-in support for dropping search requests if the load is too high. Are there any plans for this in DC++
(or is it in DC++ already)?

I saw from the DC++ changelog that a Boyer-Moore search algorithm was implemented to speed up substring searches.
Any (un)official benchmark comparisons?
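
For readers unfamiliar with the technique, here is a minimal sketch of Horspool's simplification of Boyer-Moore, which keeps only the bad-character shift table. It is an illustration and not DC++'s actual code; the function names `build_shift_table` and `horspool_find` are hypothetical, and the match is case-sensitive for simplicity.

```python
def build_shift_table(pattern: str) -> dict:
    """Bad-character shifts for Horspool's simplification of Boyer-Moore.

    For every character except the last one, store the distance from its
    last occurrence to the end of the pattern.
    """
    m = len(pattern)
    return {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}


def horspool_find(text: str, pattern: str) -> int:
    """Return the index of the first match of pattern in text, or -1."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    if m > n:
        return -1
    shift = build_shift_table(pattern)
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            return pos
        # Skip ahead based on the text character aligned with the
        # last pattern position; unknown characters skip a full length.
        pos += shift.get(text[pos + m - 1], m)
    return -1
```

The gain over a naive scan comes from the skip: when the aligned character does not occur in the pattern at all, the window jumps by the whole pattern length, so longer search words tend to be cheaper per file name.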
I wrote QuickDC - A DC++ compatible client for Linux and FreeBSD.

Sedulus
Forum Moderator
Posts: 687
Joined: 2003-01-04 09:32

Re: Scalability issue: extreme user counts and searching

Post by Sedulus » 2003-06-18 07:39

Dj_Offset wrote:Any (un)official benchmark comparisons?
ivulfusbar wrote:I have finally started to test the CVS implementation.

Facts:
A FileList consisting of 58425 files (mp3, jpeg, .nfo and .sfv).
Compared 0.242 with the CVS implementation using the same filelist.
I faked 10 passive searches per second to the client using a bouncer, with a random nickname, random size and random search string.

The size was between 0 and 50 MiB.
The search string was one word of 5 to 14 characters.

On a 1.2 GHz machine, the CVS implementation used approximately 26% CPU. 0.242 consumed approximately 40%.
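
A test load like the one described above could be generated roughly as follows. This is a sketch under assumptions, not the bouncer that was actually used: the names `FakeSearch`, `make_search` and `schedule` are hypothetical, the nickname length range is invented, and the step of encoding each request as an NMDC search message is left out.

```python
import random
import string
from dataclasses import dataclass

MIB = 1024 * 1024


@dataclass
class FakeSearch:
    nick: str      # random requester nickname
    size: int      # size restriction in bytes, 0 to 50 MiB
    pattern: str   # one random word, 5 to 14 characters


def random_word(rng: random.Random, lo: int, hi: int) -> str:
    """Random lowercase word with a length between lo and hi inclusive."""
    length = rng.randint(lo, hi)
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def make_search(rng: random.Random) -> FakeSearch:
    return FakeSearch(
        nick=random_word(rng, 6, 12),      # assumed nickname length range
        size=rng.randint(0, 50 * MIB),
        pattern=random_word(rng, 5, 14),
    )


def schedule(rate_per_sec: int, duration_sec: int, seed: int = 0):
    """Return (send_time_in_seconds, FakeSearch) pairs at a fixed rate."""
    rng = random.Random(seed)
    interval = 1.0 / rate_per_sec
    count = rate_per_sec * duration_sec
    return [(i * interval, make_search(rng)) for i in range(count)]
```

Calling `schedule(10, 60)` gives one minute of requests at the 10 searches per second used in the test; a fixed seed makes the run repeatable, so two client versions see the same requests.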

oooh-i-like-it-so-much-yes-i-do-ly'ers ;))
http://dc.selwerd.nl/hublist.xml.bz2
http://www.b.ali.btinternet.co.uk/DCPlusPlus/index.html (TheParanoidOne's DC++ Guide)
http://www.dslreports.com/faq/dc (BSOD2600's Direct Connect FAQ)

Dj_Offset
Posts: 48
Joined: 2003-02-22 19:22
Location: Oslo, Norway

Post by Dj_Offset » 2003-06-19 09:27

And is the search handling running in its own thread in DC++?
I wrote QuickDC - A DC++ compatible client for Linux and FreeBSD.

arnetheduck
The Creator Himself
Posts: 296
Joined: 2003-01-02 17:15

Post by arnetheduck » 2003-06-20 05:36

The search runs in the same thread as the hub communication, i.e. the thread of the hub that sent the search. It does not drop searches under high load. I considered that at first, but in the end it would lead to search results differing on every search, and if users learn that their search might work better if they repeat it several times...
In any case, if the computer is loaded, the connection to the hubs will eventually be dropped, because the client won't read from the socket until its searches are done...
That said, the current implementation is quite effective, I think. As fusbar tested, 58000 files is no problem in a 10 searches/sec scenario (and it should in normal cases be more effective, as certain search words are handled more effectively than the random ones fus sent)...
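
To put rough numbers on the point about one thread both reading the socket and running searches, here is a small back-of-the-envelope simulation. It is only a sketch: `pending_after` is a hypothetical helper, and the 26 ms per search figure is an assumption derived by dividing the 26% CPU reported in the test by 10 searches per second.

```python
def pending_after(rate_per_sec: float, cost_sec: float, duration_sec: float) -> int:
    """Count searches not yet handled after duration_sec on one hub thread.

    Searches arrive at a fixed rate and are processed one at a time, in
    order, each taking cost_sec. Nothing is dropped, so unhandled searches
    simply pile up unread on the socket.
    """
    total = int(rate_per_sec * duration_sec)
    clock = 0.0   # time at which the thread becomes free again
    done = 0
    for i in range(total):
        arrival = i / rate_per_sec
        start = max(clock, arrival)
        finish = start + cost_sec
        if finish > duration_sec:
            break
        clock = finish
        done += 1
    return total - done
```

With `pending_after(10, 0.026, 10)` the thread keeps up and nothing is left over. With a much slower machine or a much larger share, say 0.3 s per search, `pending_after(10, 0.3, 10)` leaves 67 of 100 searches unread after ten seconds, which is the situation where the hub connection eventually gets dropped.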
