Thoughts on ADL Searches

Archived discussion about features (predating the use of Bugzilla as a bug and feature tracker)

mai9
Posts: 111
Joined: 2003-04-16 23:02

Thoughts on ADL Searches

Post by mai9 » 2003-04-17 00:09

I searched to see if it was already posted but I didn't find anything, so I post :?

I don't know if it's a feature, but I don't like (and I don't think it's correct) that different ADL Searches which overlap each other are presented with fewer results.

Imagine I have two ADL Searches, one for ".com" and another for "setup". If a user has a file "setup.com", that result will only be present in one of the two ADL Searches. So, if at that moment I am only looking for "setup" files, I may miss some because they fell into another ADL Search.

Talking about the same ADL Search, I'd like to have the "Download whole directory" option there. Either that, or an item to jump to where the file is really placed.


Thanks :wink:

Kenneth-Chile
Posts: 80
Joined: 2003-03-21 10:17
Location: Concepcion, Chile.

Re: Thoughts on ADL Searches

Post by Kenneth-Chile » 2003-04-17 20:03

I didn't understand. It's only one ADL Search (not 2), with two Items:
"Setup" and ".com"

If a user has setup.com and you look for "setup", the final result will be the same with or without the ".com" item.

Could you give another example? I really didn't understand.

mai9 wrote:Talking about the same ADL Search, I'd like to have the "Download whole directory" option there. Either that, or an item to jump to where the file is really placed.
That was discussed here:
http://dcplusplus.sourceforge.net/forum ... php?t=1621

mai9
Posts: 111
Joined: 2003-04-16 23:02

mmm...

Post by mai9 » 2003-04-18 10:14

Thanks for pointing to that thread :) I hope heen will read this as well :roll:


And no, it's TWO ADL Searches, both looking for filenames but with different text strings. (If one ADL Search looks for filenames and another for folders, they won't overlap.)

First ADL Search: "Setup"
This one catches: "setup.exe", "this is the setup I like.txt" and "100 ways to make a setup without programming.pdf"

Second ADL Search: ".com"
This one catches: "command.com", "doskey.com" and "hidden switches of format.com and how to use them.txt"

But if a file called "setup.com" is in the filelist, it will only be listed in the first ADL Search. So this means that having more than one ADL Search may result in having fewer results :(
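
The behaviour described above can be sketched roughly as follows. This is only an illustration, assuming case-insensitive substring matching; the function name `first_match_only` is hypothetical and not taken from the DC++ source.

```python
def first_match_only(filenames, searches):
    """Assign each file to the first search whose string it contains."""
    results = {s: [] for s in searches}
    for name in filenames:
        for s in searches:
            if s.lower() in name.lower():  # assumed: case-insensitive substring match
                results[s].append(name)
                break  # searches further down never see this file
    return results


files = ["setup.exe", "command.com", "setup.com"]
results = first_match_only(files, ["setup", ".com"])
print(results)
```

Here "setup.com" ends up only under "setup", so the ".com" search shows fewer results than it would on its own.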

heen
Posts: 14
Joined: 2003-03-31 10:50

Sorting results

Post by heen » 2003-04-18 15:39

It is very true, as you say, mai9: a file will only be found in one search (the first one encountered). The only exception is if ADLSearch has found a directory match that is replicated; then a file can be both in that directory and in a separate filename/full path search.

I did it that way because I was unsure about performance, that it could take too long to try to match all searches for every file. Now it seems it might be possible, because the search routine is very fast.

I see two possible ways here, and I would like your opinions if possible:

1) Always try to match all searches. Not for directories though, it would be very hard to implement.

2) A selectable option for individual searches, something like 'search always' if you know what I mean by that.

What do you think?
/Henrik

Kenneth-Chile
Posts: 80
Joined: 2003-03-21 10:17
Location: Concepcion, Chile.

Post by Kenneth-Chile » 2003-04-18 15:43

:D But... in my DC++ (0.241) there is only one ADL Search window, and when I open a user list, there is only one folder, <<<ADLSEARCH>>>.


Why are you talking about 2 ADL Searches? How do you differentiate between the first and the second ADLS?

For me it's one ADLSearch with 1, 2, 3... n search items, and if I have 2 filename items (one "setup" and the other ".com") the ADLSearch folder will contain setup.com. (The same would happen if I had only "setup" or only ".com".)

Or are you restricting the file size of one filename item, so that ADLS then overlaps the result of the other filename item?


:shock:

heen
Posts: 14
Joined: 2003-03-31 10:50

Search ordering

Post by heen » 2003-04-18 16:06

When I say 'search' I mean one 'line' in the ADLSearch window. The ordering is important, the first search (first line) will be tried first, then the second one etc. If a match is found no more of the below searches will be tried (as of today...).

Therefore you can 'Move Up' and 'Move Down' searches to alter the search order.

Come to think of it, there is a reason why I made the search order important. Let's say you are into movies, for example. You have some movies that you particularly look for. You therefore place those high up and maybe direct them to a special destination directory, something like 'Sought Movies'.

But you still want to look for other movies in general. Then you add one or more lower ranked searches ('avi', 'mpg', 'divx' etc) and direct them to a general destination directory like 'Movies'.

It can be useful in some ways I think.
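
A minimal sketch of that ordering idea, assuming each search is a (string, destination directory) pair tried from top to bottom. The name `route` and the movie title are made up for illustration.

```python
def route(filename, searches):
    """Return the destination of the first (topmost) search that matches, else None."""
    lowered = filename.lower()
    for pattern, destination in searches:
        if pattern.lower() in lowered:
            return destination
    return None


# Order matters: specific titles first, general file types below.
searches = [
    ("matrix", "Sought Movies"),  # hypothetical sought title
    ("avi", "Movies"),
    ("mpg", "Movies"),
    ("divx", "Movies"),
]

print(route("The Matrix.avi", searches))
print(route("holiday.mpg", searches))
```

With this ordering a sought title lands in 'Sought Movies' even though it would also match 'avi'.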

heen
Posts: 14
Joined: 2003-03-31 10:50

Doh

Post by heen » 2003-04-18 16:37

Sorry, I wasn't thinking straight (quite late here and I should be in bed...). The problem discussed here only arises if you have two different destination directories.

So, do you think the same file should be placed in all destination directories that it matches?

/Henrik

[KUN.NL]mepmuff
Posts: 73
Joined: 2003-01-06 09:32

Post by [KUN.NL]mepmuff » 2003-04-18 16:58

Yep, definitely.

The order specified in the ADL search might not be the order in which you look through the ADL folders. Overall I feel it would be correct to have each folder give the actual results of that search, not the results minus those which fall into other, earlier specified categories.

Kenneth-Chile
Posts: 80
Joined: 2003-03-21 10:17
Location: Concepcion, Chile.

Post by Kenneth-Chile » 2003-04-18 17:35

Now I understand mai9 :D :D :D

I didn't understand because I only have <<<ADLSearch>>> Folder for all search items :P

Sorry :oops:

:D

Kenneth-Chile
Posts: 80
Joined: 2003-03-21 10:17
Location: Concepcion, Chile.

Re: Sorting results

Post by Kenneth-Chile » 2003-04-18 21:46

heen wrote:So, do you think the same file should be placed in all destination directories that it matches?
Yes, I think so.
heen wrote: 1) Always try to match all searches. Not for directories though, it would be very hard to implement.

2) A selectable option for individual searches, something like 'search always' if you know what I mean by that.

What do you think?
/Henrik
1) Why is it hard to implement?
How does the ADLS algorithm basically work?
For each ADLS item, does it search the user's list for one or more matching names and then put them in the destination folder?

2) Mmm, I think that by implementing the 1) option well, it's not necessary.

heen
Posts: 14
Joined: 2003-03-31 10:50

ADLSearch implementation

Post by heen » 2003-04-19 04:58

Hope you understand my pseudo-code explanation:

1) There is a 'collection' of all the search items. It has got two interface functions; MatchesFile(...) and MatchesDirectory(...). The input to the first is a file name, and it returns true if there is a search item that matches the current file in the directory listing. The same for the second one but for directories instead.

2) When a directory listing is downloaded it is traversed one item at a time according to the user's directory structure (straightforward).

Here is some pseudo-code that is executed for all items (files + directories) in a directory listing, somewhat simplified:

if(current item is a directory)
{
if(already storing a matched directory)
{
PutItInMatchedDir(current item)
}
else if(MatchesDirectory(current item))
{
PutItInDestinationDir(MakeNewMatchedDir(current item))
}
}
if(current item is a file)
{
if(already storing a matched directory)
{
PutItInMatchedDir(current item)
}
if(MatchesFile(current item))
{
PutItInDestinationDir(current item)
}
}

The problem is it would be much harder (not impossible of course) to keep track of several 'matched' directories.

I will think about it and see how it is best solved. I just saw that the other changes I have made are in CVS, so they will probably be in the next release.

/Henrik

heen
Posts: 14
Joined: 2003-03-31 10:50

Tabs

Post by heen » 2003-04-19 05:01

Hmm... seems my indentation screwed up in the previous post. Let's try this:

Code:


if(current item is a directory) 
{ 
    if(already storing a matched directory) 
    { 
        PutItInMatchedDir(current item) 
    } 
    else if(MatchesDirectory(current item)) 
    { 
        PutItInDestinationDir(MakeNewMatchedDir(current item)) 
    } 
} 
if(current item is a file) 
{ 
    if(already storing a matched directory) 
    { 
        PutItInMatchedDir(current item) 
    } 
    if(MatchesFile(current item)) 
    { 
        PutItInDestinationDir(current item) 
    } 
} 
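
For readers who prefer something runnable, the pseudo-code above can be approximated as below. This is a sketch under assumptions, not the actual DC++ implementation: the listing is modelled as a nested dict (a dict value is a directory, `None` is a file), matching is a case-insensitive substring test, and `traverse` is a made-up name.

```python
def traverse(tree, dir_searches, file_searches):
    """Walk a listing once; return the top-level entries of the ADLSearch output."""
    results = []  # list of (name, subtree-or-None)

    def matches(name, patterns):
        return any(p.lower() in name.lower() for p in patterns)

    def walk(node, matched_dir):
        for name, child in node.items():
            if child is not None:  # current item is a directory
                if matched_dir is not None:
                    # already storing a matched directory: copy structure into it
                    sub = {}
                    matched_dir[name] = sub
                    walk(child, sub)
                elif matches(name, dir_searches):
                    # start storing a new matched directory
                    sub = {}
                    results.append((name, sub))
                    walk(child, sub)
                else:
                    walk(child, None)
            else:  # current item is a file
                if matched_dir is not None:
                    matched_dir[name] = None
                if matches(name, file_searches):
                    results.append((name, None))

    walk(tree, None)
    return results


listing = {
    "movies": {"a.avi": None, "setup.exe": None},
    "docs": {"setup.txt": None},
}
print(traverse(listing, ["mov"], ["setup"]))
```

Note that "setup.exe" appears twice, once inside the matched "movies" directory and once as a filename match, which is the exception mentioned earlier in the thread. Only one matched directory is tracked at a time, which is why several simultaneous directory matches are harder.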


Kenneth-Chile
Posts: 80
Joined: 2003-03-21 10:17
Location: Concepcion, Chile.

Post by Kenneth-Chile » 2003-04-19 16:12

heen wrote:Hmm... seems my indentation screwed up in the previous post. Let's try this:

Code:

if(current item is a directory) 
{ 
...
} 
if(current item is a file) 
{ 
...
} 
OK. I understood it. It's a problem of priority vs. redundancy of search info, and of algorithm efficiency.

I believe there are some possibilities:

1. Don't change anything, respecting the priority search. Each user is responsible for maintaining their own ADL search priorities.

2. Override the priority search. Then each item in the user list is compared with all the ADL search items. It will take longer than option 1, but search info will be present in all search folders. Users who use priorities will be confused when they see that priority has disappeared.

3. Create a "search always" option for each item (like you explained in an older post). But I think it's difficult to understand (especially for beginners).

4. Create a general option "Don't use priority search" or "search always" (not for each item, but for all of them in general).


I like option 1 (don't change anything) or option 4. :P
But I don't know what people like. I have already given my opinion.

:shock:

Well, I have just seen the interfaces in DC++ 0.242. Great work, congratulations!, It looks pretty good and much better than before.
Analyzing change by change:

:arrow: Buttons (Add, Edit, Remove, Move Up, Move Down, What's this?)
EXCELLENT! :) The "What's this?" explanation too.

:arrow: ADD/EDIT Items Interface
EXCELLENT :), good work. I have only one thing to say: I think the ACTIVE option should disappear, and a checkbox should be created at the beginning of each search item (like in the favorites window, take a look), eliminating the Active column. When the user creates the search item and presses OK, the checkbox is checked by default. What do you think?

:arrow: Bold text for folders in user list
EXCELLENT :).

ADL Search is more intuitive now! Thanks again!

Kenneth :D

mai9
Posts: 111
Joined: 2003-04-16 23:02

Re: Sorting results

Post by mai9 » 2003-04-19 16:57

Kenneth-Chile wrote:
heen wrote:So, do you think the same file should be placed in all destination directories that it matches?
Yes, I think so.
I feel the same.

You never thought two ADL Searches could end up in two folders, and I never thought they could end up in the same! :D

mai9
Posts: 111
Joined: 2003-04-16 23:02

Post by mai9 » 2003-04-19 17:41

Kenneth-Chile wrote:2. Override the priority search. Then each item in the user list is compared with all the ADL search items. It will take longer than option 1, but search info will be present in all search folders. Users who use priorities will be confused when they see that priority has disappeared.
I don't think anybody uses the priority 'feature' to eliminate results for the next ADL search. And it can confuse users, because "a search is a search". In my case, I thought that move up/down was a nice aesthetic feature, and only found out that it had a function when I saw that some searches lacked results (but that was in 0.241).

I didn't understand very well the pseudo-code of heen, so here's what I'd do (that coincides with Ken's option 2):

1. file
1.1. check first adl search
1.1.1 matches? -> add to folder
1.2. check second
1.2.1 matches? -> add to folder
1.3. check third
1.3.1 matches? -> add to folder
2. folder
2.1. check first adl search
2.1.1 matches? -> add to folder
2.2. check second
2.2.1 matches? -> add to folder
2.3. check third
2.3.1 matches? -> add to folder
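
The steps above could look something like this in code. It is only a sketch covering the file half of the list, assuming (string, destination folder) pairs and case-insensitive substring matching; `match_all` is a hypothetical name.

```python
def match_all(filenames, searches):
    """Check every file against every search; a file may land in several folders."""
    results = {dest: [] for _, dest in searches}
    for name in filenames:
        for pattern, dest in searches:
            # no break here: all searches are tried for every file
            if pattern.lower() in name.lower() and name not in results[dest]:
                results[dest].append(name)
    return results


searches = [("setup", "Setup files"), (".com", "COM files")]
files = ["setup.exe", "command.com", "setup.com"]
print(match_all(files, searches))
```

Unlike the first-match behaviour, "setup.com" now shows up in both folders.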
Kenneth-Chile wrote:I think the ACTIVE option should disappear, and a checkbox should be created at the beginning of each search item (like in the favorites window, take a look), eliminating the Active column. When the user creates the search item and presses OK, the checkbox is checked by default. What do you think?
yes, I like this idea!

I'd like a checkbox for favorite users too, for granting them a permanent slot. It's not a killer feature like ADL Search is, but useful too :wink:

heen
Posts: 14
Joined: 2003-03-31 10:50

New features

Post by heen » 2003-04-20 02:52

Really glad you liked the new interface. I think it is better too. Personally I've found the 'Go to directory' function most useful... it helps a lot.

I'll be away for a couple of days, which gives me time to think over your new suggestions and how to fix the multiple dest dir issue. I've got the feeling it can be solved quite nicely.

If I make an 'Active' checkbox in the main ADLS window, don't you think the dialog should remain as it is? I think so... to show what the checkbox means.

Thanks,
Henrik

mai9
Posts: 111
Joined: 2003-04-16 23:02

Post by mai9 » 2003-04-20 08:21

Yes, the checkbox in the "ADLSearch Properties" (shouldn't it be "ADL Search"?) is perfect to explain without words what that other checkbox in the main window does.

After some hours of use, I miss the context menu with 'Add', 'Remove' and 'Edit' :oops:

btw, wouldn't it be better if you used 'New' instead of 'Add', and 'Properties' instead of 'Edit' like the rest of the DC++ interface?

Thanks very much, heen, for reading our suggestions and accepting them!!!

Kenneth-Chile
Posts: 80
Joined: 2003-03-21 10:17
Location: Concepcion, Chile.

Re: New features

Post by Kenneth-Chile » 2003-04-20 18:43

heen wrote: If I make an 'Active' checkbox in the main ADLS window, don't you think the dialog should remain as it is? I think so... to show what the checkbox means.
I don't think so. To show what the checkbox means, you can use the same idea as the favorites window. (In the favorites window the first column is called Auto connect/Name.)

Then the first column should be named "Active/Search String", and that name explains what the checkbox means. Consider that in most cases users create a new ADLS item in order to activate it immediately, and it's less likely that users will uncheck that checkbox the first time.
Afterwards, when users want to deactivate the item, they just need to uncheck the checkbox, without editing the search item. So the checkbox in the Search Item Properties is not necessary.

Thanks!
Kenneth

Kenneth-Chile
Posts: 80
Joined: 2003-03-21 10:17
Location: Concepcion, Chile.

Post by Kenneth-Chile » 2003-04-21 19:21

Another thing about the ADLS interface: when I right-click on an item, a context menu could appear with: {NEW..., PROPERTIES, REMOVE} (if you change the button names like mai9 suggested)

Or {ADD, EDIT, REMOVE} (if you don't change the actual button's names)
In Theory, there is no difference between Theory and Practice. In Practice, there is.

Kenneth-Chile
Posts: 80
Joined: 2003-03-21 10:17
Location: Concepcion, Chile.

Post by Kenneth-Chile » 2003-05-21 01:40

Heen.....all perfect in dc++ v. 0.25.

Great and thanks!

Kenneth
In Theory, there is no difference between Theory and Practice. In Practice, there is.

heen
Posts: 14
Joined: 2003-03-31 10:50

Post by heen » 2003-05-21 12:07

You are very welcome. Please tell me if you come up with anything else that could improve ADLSearch.

Gratch06
Posts: 141
Joined: 2003-05-25 01:48
Location: USA

Post by Gratch06 » 2003-06-07 16:49

I just found an interesting use for the older style of ADLSearch, where it would put results in one folder (i.e. the top one) and not in another. I use ADLSearch to filter through file lists looking for material that violates hub rules, but unfortunately there are always false positives. If the precedence and only-showing-up-in-one-search behaviour applied, it could be used to create a filter per se and put them in a folder called, let's say, "False Positives", so I don't have to look through false positives to find the real deal. Could there be a checkbox for "only show once" or something of the like to facilitate filtering of false positives?

This would just be a streamlining issue for me, but my compliments on the ADLSearch...I'm really enjoying it!

- Gratch

heen
Posts: 14
Joined: 2003-03-31 10:50

Post by heen » 2003-06-08 03:23

I suppose I could make a global switch named something like

'Break after first match in ADLSearch' = true/false

that would emulate the old search style. It would end up in the settings->advanced dialog.

But if a directory type search is currently matched, a file could end up in two or more searches anyway.

Does it sound ok to you?

/Henrik

Gratch06
Posts: 141
Joined: 2003-05-25 01:48
Location: USA

Post by Gratch06 » 2003-06-08 03:50

That sounds like an excellent implementation concept for me, but I did have one question on it.
heen wrote:But if a directory type search is currently matched, a file could end up in two or more searches anyway.
Perhaps I don't understand quite how Directory matches work, as I haven't seen a directory in my ADLSearchResults yet. Does it yield all of the files in that directory or simply the directory name?

Thanks for the speedy reply!

- Gratch

heen
Posts: 14
Joined: 2003-03-31 10:50

Post by heen » 2003-06-08 05:41

If you set type = directory for an ADL search it will match against the directory name. If a match is found, the current directory and all of its subdirectories + files will be found in the ADL search result.

Try a directory type search with string = 'mov' as it should find you directories named 'movie' or 'movies', for example.
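
As a rough illustration of what a directory type search returns, here is a sketch that works on a flat list of file paths. It assumes forward-slash paths and case-insensitive substring matching on directory names only; `directory_search` is a made-up helper, not a DC++ function.

```python
from pathlib import PurePosixPath


def directory_search(paths, pattern):
    """Return the file paths lying in or below a directory whose name contains pattern."""
    hits = []
    for path in paths:
        parts = PurePosixPath(path).parts
        # every component except the last one is a directory name
        if any(pattern.lower() in part.lower() for part in parts[:-1]):
            hits.append(path)
    return hits


paths = [
    "share/movies/a.avi",
    "share/movies/extras/b.srt",
    "share/music/movie theme.mp3",  # file name contains 'mov', directory does not
]
print(directory_search(paths, "mov"))
```

The file in "music" is skipped because only directory names are matched, while everything under "movies", including its subdirectory, is returned.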

/Henrik

Gratch06
Posts: 141
Joined: 2003-05-25 01:48
Location: USA

Post by Gratch06 » 2003-06-08 16:50

That makes a lot of sense on the directory search. Thanks!
As I was thinking about this some more, perhaps a bit more efficient use of CPU would be to add a "filter" checkbox on each individual item. This way one could set items to filter without having to populate a <<<False Positives>>> folder, simply not list them in any folder (as that is the end result I want anyway). I did some playing with .241 and noticed that building a list with a bunch of false positives took a considerable amount of time. The method you mentioned above
heen wrote:I suppose I could make a global switch named something like
'Break after first match in ADLSearch' = true/false
that would emulate the old search style. It would end up in the settings->advanced dialog.
would work fine, but I'm just thinking about streamlining speed on the filter option. My 2 cents anyway :)

- Gratch

Gratch06
Posts: 141
Joined: 2003-05-25 01:48
Location: USA

Post by Gratch06 » 2003-06-09 01:24

Addition to the filter concept in the above post: this would speed up the search, as the filter would only need to be applied to results that match the main search items, and not to everything else in the file list. Fewer total scans would be the end result.

- Gratch

heen
Posts: 14
Joined: 2003-03-31 10:50

Post by heen » 2003-06-09 06:13

The way I proposed will make ADLSearch behave exactly like it did in the first version. I'm not sure it would make any difference the way you proposed, because the directory listing is traversed *once*, and not once for every search item. For every file/directory in the listing, all search items are checked while traversing. I think your proposal builds on the idea that the list is traversed once for every search item, am I right?

/Henrik

Gratch06
Posts: 141
Joined: 2003-05-25 01:48
Location: USA

Post by Gratch06 » 2003-06-09 12:20

It must traverse the entire listing of files once, that's correct. But at every single step, it must examine the files/directories and compare them to each search term. Obviously if you have one or two search terms, it isn't going to be a big deal. If/when you get up to 200 or so search terms, it takes a good deal of time to do 200 checks at each point in the traversal. Doing it with the "filter these results" checkbox would only add the additional checking if something triggered the initial ADLSearch.

Code:

Pseudocode:
while (!end of file list){
    check item against ADLSearch listing
    if (match is found){
        check item against Filter listing
        if (match is found on filter list)  
            skip to the next item in the traversal without doing anything else.
        else
            populate ADLSearch folder with results
    }
}
Example:
ADLSearch list of 200 items
Filter Listing of 50 items
Filelist containing one folder and 4 items.
Filename search.

Code:

Search starts and checks item #1 against each of the 200 items, finding no matches. ADLSearch does nothing. 200 checks total on file #1.
Goes on to item #2 and checks against each of the 200 items, then finds a match.
    - Scans the one matching file using the 50 items on the filter list. 250 total checks on file #2.
    - It matches the filter list, so no list is populated with it. Skip on to the next item in the traversal.
Goes on to item #3 and checks against each of the 200 items, again finding a match.
    - It does not match any of the 50 items on the filter list, so the list is populated as usual. 250 total checks on file #3.
Goes on to item #4 and checks against the 200 items, finding no matches. ADLSearch does nothing. 200 total checks on file #4.

OK, I hope the above example helps a little. I know that many times results will trigger earlier and not use the maximums; I just used the maximums to make it easy to see where the numbers came from. The advantages of adding the filter checkbox are that:
1) The additional 50 items on the filter list are only invoked if the item triggers on one of the first 200 search terms from the initial search, saving 50 checks (max) per item.
2) Saves time populating a list with the false positive results, as well as saving time because the results are not populated in the original ADLSearch. They are simply not outputted anywhere.
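
The two-stage idea and its check counts can be sketched like this. It is an illustration only: `filtered_search` is a hypothetical name, matching is assumed to be a case-insensitive substring test, and checks are counted at the maximum (every term in a list counted as tried), as in the walkthrough above.

```python
def filtered_search(filenames, search_terms, filter_terms):
    """Return (kept files, number of substring checks counted at the maximum)."""
    kept = []
    checks = 0
    for name in filenames:
        lowered = name.lower()
        checks += len(search_terms)  # worst case: every search term is tried
        if not any(t.lower() in lowered for t in search_terms):
            continue  # no hit, so the filter list is never consulted
        checks += len(filter_terms)  # filter list is only used after a hit
        if any(t.lower() in lowered for t in filter_terms):
            continue  # false positive: not listed anywhere
        kept.append(name)
    return kept, checks


# 200 search terms and 50 filter terms, mostly placeholders that match nothing.
search_terms = [f"term{i}" for i in range(199)] + ["britney"]
filter_terms = [f"filt{i}" for i in range(49)] + ["xxx"]
files = ["readme.txt", "britney xxx.avi", "britney song.mp3", "notes.doc"]

kept, checks = filtered_search(files, search_terms, filter_terms)
print(kept, checks)
```

For these four files the count is 200 + 250 + 250 + 200 = 900, compared with 4 x 250 = 1000 if all 250 strings were tried on every file.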

I would use the filter for many common terms. For example, if I wanted to use ADLSearch to find a Britney Spears song (oh my, what a sad state I'd be in then), but didn't want to get porn archives back, I could do an ADLSearch for "Britney Spears" and use the filter on xxx, pr0n, porn, and a few other common terms that people like to tack onto files. WithOUT the filter, you would be populating a list of false positives spanning several hundred files on some file lists. And yes, I am looking at using the filter on the scope/scale of filtering results such as xxx from the ADLSearch.

Hope that helps a little with where I'm coming from.
- Gratch

heen
Posts: 14
Joined: 2003-03-31 10:50

Post by heen » 2003-06-09 14:42

Ok, now I see what you are getting at. I have not heard of anyone using so many search items, so I have not thought so much about it.

Here is how it works in the current release:

For every file/directory in a directory listing ADLSearch goes through *all* ADL search items and tries to find a match. Since all are tried every time, the order does not matter.

Here is how it will work with the new 'break on first' switch:

If, for a specific file/directory, a match is found then the rest of the ADL search items will not be tried. Hence, the order *is* important. The actual order that is used is from the first/top ADL search item listed to the last/bottom one listed. You can re-arrange the order by using move up/down-buttons. They had only cosmetic function before the new switch.

I must say I am not too keen on adding the complete functionality as you described. It is not the actual copying of results to 'false positive' folders that takes time, it is the underlying substring search algorithm.

How about this idea, built on 'garbage' destinations:

No extra 'filter' checkbox is added. If one uses (a new special) Destination = 'Discard' for one or more search items, they will never be shown in the output.

Your example:

#1 string='nude', destination='discard'
#2 string='britney', destination='music?'

A file called 'britney spears nude.avi' would match #1. Since it is the *special* destination directory 'discard', it will not show up anywhere. But it is a match, and #2 will not be tried.

A file called 'britney spears lalala.mp3' would only match #2 and end up in 'music?'.
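
A small sketch of how such a special destination could behave together with break-on-first-match. The `DISCARD` constant and the `route` function are assumptions for illustration, not existing DC++ names.

```python
DISCARD = "discard"  # assumed spelling of the special destination


def route(filename, searches):
    """Return the destination for filename, or None if unmatched or discarded."""
    lowered = filename.lower()
    for pattern, destination in searches:
        if pattern.lower() in lowered:
            # first match wins; a discard match hides the file and stops the search
            return None if destination == DISCARD else destination
    return None


searches = [
    ("nude", DISCARD),      # item #1
    ("britney", "music?"),  # item #2
]

print(route("britney spears nude.avi", searches))
print(route("britney spears lalala.mp3", searches))
```

The first file matches item #1 and is dropped without item #2 being tried; the second only matches item #2 and goes to 'music?'.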

It seems to me we would meet halfway... what do you think, would it help you to a certain degree?

/Henrik

Gratch06
Posts: 141
Joined: 2003-05-25 01:48
Location: USA

Post by Gratch06 » 2003-06-09 15:30

heen wrote:Here is how it will work with the new 'break on first' switch
Where will this switch be located? on each individual file or a "master switch" in settings?
heen wrote:If one uses (a new special) Destination = 'Discard' for one or more search items, they will never be shown in the output.
The garbage file idea would achieve the exact ends I would like, and would work out quite well for my purposes. If this is the method that is implemented, I would be fully supportive of it. Now on to the efficiency...
heen wrote:It is not the actual copying of results to 'false positive' folders that takes time, it is the underlying substring search algorithm.
The whole reason for setting up the checkbox and the secondary "filter" list was to keep the filter substrings from being checked unless you found a match on a regular ADLSearch term. With the implementation you suggest here, you will check all of the filter strings as well as the collector strings for every single piece of data (200 + 50 = 250 checks). Applying the filter list only on matches reduces the total checks using that list significantly, assuming a decent number of filters and search terms. Just FYI, right now I'm using 221 ADLSearch terms (and I'd probably have 30-40 filters up in a few minutes if such an option were available) in the hub I OP in, to keep track of shared items that shouldn't be there.

- Gratch
