Wondering if someone has already made something that creates an index file of ALL files shared by the users in a particular hub.
(It probably needs to be a hub script/bot, or maybe a standalone program/script...)
What I want to do with the end result is parse the complete index into a MySQL database and put a nice PHP frontend on it. Users could then search/browse the index from a website, and the results in their browser would give a dchub:// link to the file they were searching for...
Any ideas??
/fREaK Out ...
Generating an Index..
First, you'd have to download the filelist of every user in the hub; not many scripts can do this, so you'd need a client/bot. After that, merging them shouldn't be too hard, though it would require some kind of custom program. The hard part would probably be handling duplicates and actually making the result useful (it would be pretty disorganized, and very, very hard to browse).
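The merging step could be sketched roughly as follows. This is a minimal Python sketch, assuming the DC++ XML filelist format (a bz2-compressed `files.xml` with nested `Directory`/`File` elements carrying `Name`, `Size`, and `TTH` attributes) and keying on TTH so duplicate files collapse into one entry with several sources; `parse_filelist` and `merge_index` are hypothetical helper names, not an existing tool:

```python
import bz2
import xml.etree.ElementTree as ET
from collections import defaultdict

def parse_filelist(path, nick):
    """Parse one user's filelist (files.xml.bz2) into (tth, path, size, nick) tuples."""
    with bz2.open(path, "rb") as f:
        root = ET.parse(f).getroot()
    entries = []
    def walk(node, prefix):
        for child in node:
            if child.tag == "Directory":
                walk(child, prefix + child.get("Name", "") + "/")
            elif child.tag == "File":
                entries.append((child.get("TTH"),
                                prefix + child.get("Name", ""),
                                int(child.get("Size", 0)),
                                nick))
    walk(root, "")
    return entries

def merge_index(filelists):
    """Merge per-user lists, keyed on TTH: duplicates become extra sources."""
    index = defaultdict(list)
    for path, nick in filelists:
        for tth, name, size, who in parse_filelist(path, nick):
            index[tth].append((name, size, who))
    return index
```

Keying on the hash rather than the filename is what handles the duplicate problem: the same file shared under ten different names still lands in one index entry.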
This site, it might give you a few ideas for the searching part (even if it's pretty different from your idea).
Actually, the result of that site (its search output) is almost exactly what I wanted the output to be.
Although this site apparently searches multiple hubs, I just need it to search my own hub.
i.e. A script/bot that downloads the filelist of every user connected (few at a time) and stores all possible info (hash, user, slots etc etc) in the database.
The rest isn't that hard to do (a PHP frontend to read the SQL database and present it to the end user). It's getting the file info into the database that I can't do.
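For the database side, here is a minimal sketch of a schema along those lines: one row per unique file (keyed on TTH) plus a sources table for the per-user info (nick, slots, etc.). The table and column names are my own invention, and I use SQLite so the example is self-contained; in MySQL you'd use `INSERT IGNORE` / `ON DUPLICATE KEY UPDATE` instead of the `OR IGNORE` / `OR REPLACE` forms:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # swap for a MySQL connection in production
conn.executescript("""
CREATE TABLE files (
    tth  TEXT PRIMARY KEY,   -- Tiger Tree Hash identifies the content
    name TEXT NOT NULL,
    size INTEGER NOT NULL
);
CREATE TABLE sources (
    tth   TEXT NOT NULL REFERENCES files(tth),
    nick  TEXT NOT NULL,
    slots INTEGER,           -- free upload slots reported by the client
    PRIMARY KEY (tth, nick)
);
""")

def add_file(tth, name, size, nick, slots):
    """Record one filelist entry; duplicate TTHs just gain extra sources."""
    conn.execute("INSERT OR IGNORE INTO files VALUES (?, ?, ?)",
                 (tth, name, size))
    conn.execute("INSERT OR REPLACE INTO sources VALUES (?, ?, ?)",
                 (tth, nick, slots))

def search(pattern):
    """Name search returning (name, tth, nick, slots) for a frontend to render."""
    return conn.execute(
        "SELECT f.name, f.tth, s.nick, s.slots FROM files f "
        "JOIN sources s ON s.tth = f.tth WHERE f.name LIKE ?",
        (pattern,)).fetchall()
```

The PHP frontend would then just run the same kind of JOIN and turn each row into a link for the user.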
GargoyleMT wrote: If you want to show even offline users' files, that could be a real pain. Otherwise, you could write a special DC client and have a web front end. There's another PHP project that did something similar - search in the forum archives and you'll find it.
Fortunately, at least if using YnHub (with SQL), you can always check a table in the database to see which users are online, and exclude offline users' filelists from the "all users" index.
I'll try to search for this PHP project you mentioned, thanks
I know the original question was posted a while back, but our hub and site provide web pages connected to a SQL Server database of the members and files in our hub, which is updated using MS Access:
1) Manually download all filelists periodically.
2) An MS Access routine, run daily, maintains 1) a filename table (keyed on TTH) and 2) a username table (user, folder, filename), both locally and on SQL Server:
* decompress all filelists
* optionally, update the two local Access tables from SQL Server
* replace all username records for each user who has a filelist, adding unique entries to the filename table as required (plus some other checks and basic processing to meet our specific requirements)
* update the filename table and replace the username table on SQL Server from Access
* run a stored procedure that does a few final bits, including a reindex
3) Publish the SQL data using various search and Dreamweaver web pages designed to meet our specific requirements.
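The replace-and-merge step above could be sketched like this, purely as an illustration of the logic. The real setup uses MS Access and SQL Server; I use Python with SQLite here so it runs standalone, the table names filename/username follow the description, and everything else (column names, helper names) is my own guess:

```python
import sqlite3

def make_tables(conn):
    # Illustrative layout: filename keyed on TTH, username holding one row
    # per (user, shared file), mirroring the two tables described above.
    conn.executescript("""
    CREATE TABLE IF NOT EXISTS filename (
        tth  TEXT PRIMARY KEY,
        name TEXT,
        size INTEGER
    );
    CREATE TABLE IF NOT EXISTS username (
        user     TEXT,
        folder   TEXT,
        filename TEXT,
        tth      TEXT REFERENCES filename(tth)
    );
    """)

def refresh_user(conn, nick, entries):
    """Replace all of one user's rows and add any new TTHs to the filename
    table (the 'replace username records / add unique entries' step)."""
    conn.execute("DELETE FROM username WHERE user = ?", (nick,))
    for tth, folder, fname, size in entries:
        conn.execute("INSERT OR IGNORE INTO filename VALUES (?, ?, ?)",
                     (tth, fname, size))
        conn.execute("INSERT INTO username VALUES (?, ?, ?, ?)",
                     (nick, folder, fname, tth))
    conn.commit()
```

Note that a refresh deletes and rewrites only that user's username rows, while the filename table only ever grows; that matches the replace-versus-update split described for the two tables.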
Note that the slightly strange update approach was chosen because it provides us with the best update performance.
Search results are published in about 3 seconds with 500,000 username records and 100,000 unique filename records.
The facility helps users identify unique (TTH) files meeting their criteria and not in their collection, identify duplicates cluttering their disk and see the various alternative names and sources by which a file is shared.
Let me know if this might help you or you want more info.