New log analysis tool: dcplusplusstats

obdob
Posts: 8
Joined: 2003-05-21 11:44

New log analysis tool: dcplusplusstats

Post by obdob » 2003-05-21 12:11

dcplusplusstats creates HTML files with statistics from either of the two DC++ log files. It can process the standard log format only. See http://www.geocities.com/marcoschmidt.g ... stats.html

TheParanoidOne
Forum Moderator
Posts: 1420
Joined: 2003-04-22 14:37

Post by TheParanoidOne » 2003-05-21 13:15

I like statistics :)
The world is coming to an end. Please log off.

DC++ Guide | Words

obdob
Posts: 8
Joined: 2003-05-21 11:44

Post by obdob » 2003-05-21 16:53

TheParanoidOne wrote:I like statistics :)
So do I :)

If anyone has problems with the program or other questions, please post them here. Feature requests are welcome too; if they're not too time-consuming, I may integrate them.

TheParanoidOne
Forum Moderator
Posts: 1420
Joined: 2003-04-22 14:37

Post by TheParanoidOne » 2003-05-21 17:25

I only used it once, on my download log and I liked what I saw.

I will experiment with it a bit more and see if I have any comments.

My first request would be dealing with non-standard log formats. I don't use them myself, but I'm sure that there are people out there who do. Without knowing how you have implemented the code though, I'm not sure how easy/difficult that would be.

Actually, I do have another suggestion: User0.html should be the first file displayed, and not User1.html.

obdob
Posts: 8
Joined: 2003-05-21 11:44

Post by obdob » 2003-05-21 20:39

TheParanoidOne wrote: My first request would be dealing with non-standard log formats. I don't use them myself, but I'm sure that there are people out there who do.
Could be, but why change the standard format? Are there any transfer properties that are not included in it?
TheParanoidOne wrote: Without knowing how you have implemented the code though, I'm not sure how easy/difficult that would be.
There is a method in the log file parser that has the standard format hard-coded in it. That method could be changed to allow arbitrary formats, maybe using regular expressions. But I guess it's quite some work to get that done right, so I'm not doing it before anyone asks for it.
TheParanoidOne wrote: Actually, I do have another suggestion: User0.html should be the first file displayed, and not User1.html.
I like it sorted by amount of data... But I could add links to all sort modes directly from the front page. For the time being, you could bookmark User0.html if you use the program often and find clicking twice inconvenient.
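
The regular-expression idea mentioned above could be sketched roughly as follows. This is not how dcplusplusstats is implemented; the `%[name]` placeholder syntax, the field names, and the sample log line are all assumptions made for illustration, since the thread does not show the actual DC++ log format.

```python
import re

# Hypothetical field names and patterns; the real DC++ log fields are not
# described in this thread.
FIELD_PATTERNS = {
    "date": r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}",
    "file": r".+?",
    "user": r"\S+",
    "size": r"\d+",
}


def compile_format(fmt):
    """Turn a format string with %[name] placeholders into a compiled regex."""
    parts = []
    pos = 0
    for match in re.finditer(r"%\[(\w+)\]", fmt):
        # Literal text between placeholders must match exactly.
        parts.append(re.escape(fmt[pos:match.start()]))
        name = match.group(1)
        parts.append("(?P<%s>%s)" % (name, FIELD_PATTERNS[name]))
        pos = match.end()
    parts.append(re.escape(fmt[pos:]))
    return re.compile("^" + "".join(parts) + "$")


def parse_line(regex, line):
    """Return a dict of field values, or None if the line does not match."""
    m = regex.match(line.rstrip("\n"))
    return m.groupdict() if m else None


# Usage with an invented format and log line:
fmt = "%[date] Upload of %[file] to %[user] complete, %[size] B"
regex = compile_format(fmt)
entry = parse_line(regex, "2003-05-21 12:11 Upload of C:\\share\\a b.rar to obdob complete, 1024 B")
```

Lines that do not fit the configured format come back as `None`, so a statistics pass could count and skip them instead of failing.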

BSOD2600
Forum Moderator
Posts: 503
Joined: 2003-01-27 18:47
Location: USA

Post by BSOD2600 » 2003-05-21 21:14

Hey, cool little program you made (actually works too!).

One little suggestion...for the "Bytes transferred by file format" section, I think it would make sense to include .r01, .r02, etc. and possibly .001, .002, etc (most often RAR files...could be ACE) in with plain RAR format.

obdob
Posts: 8
Joined: 2003-05-21 11:44

Post by obdob » 2003-05-21 22:33

BSOD2600 wrote:Hey, cool little program you made (actually works too!).
Thanks.
BSOD2600 wrote:One little suggestion...for the "Bytes transferred by file format" section, I think it would make sense to include .r01, .r02, etc. and possibly .001, .002, etc (most often RAR files...could be ACE) in with plain RAR format.
Good idea, I added that. The user pages are now also linked from the first page, so that everyone can pick their favourite sort mode. Version 1.0.1 is now available from the homepage: http://www.geocities.com/marcoschmidt.g ... stats.html
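
As a rough illustration of the grouping suggested above (not the tool's actual code), split-archive extensions can be folded into a single "rar" bucket before summing bytes. The function names and the exact extension patterns are assumptions; as noted in the suggestion, `.001`-style parts are usually RAR but could belong to another archiver such as ACE.

```python
import os
import re
from collections import defaultdict

# .r00, .r01, ... (multi-volume RAR parts)
_RAR_PART = re.compile(r"^\.r\d{2,}$")
# .001, .002, ... (numbered parts; assumed here to be RAR, which may be wrong)
_NUMBERED_PART = re.compile(r"^\.\d{3}$")


def format_key(filename):
    """Map a file name to the format bucket used for the statistics."""
    ext = os.path.splitext(filename)[1].lower()
    if ext == ".rar" or _RAR_PART.match(ext) or _NUMBERED_PART.match(ext):
        return "rar"
    return ext.lstrip(".") or "(none)"


def bytes_by_format(transfers):
    """Sum sizes per format; transfers is an iterable of (filename, size)."""
    totals = defaultdict(int)
    for filename, size in transfers:
        totals[format_key(filename)] += size
    return dict(totals)
```

For example, `a.rar`, `a.r01`, and `a.001` would all be counted under "rar", while `song.MP3` is counted under "mp3".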

TheParanoidOne
Forum Moderator
Posts: 1420
Joined: 2003-04-22 14:37

Post by TheParanoidOne » 2003-05-22 06:13

obdob wrote:I like it sorted by amount of data... But I could add links to all sort modes directly from the front page. For the time being, you could bookmark User0.html if you use the program often and find clicking twice inconvenient.
It's not a matter of inconvenience. It's a matter of logical consistency. 0 comes before 1. So to me, it just makes sense to have User0.html be the first item to be displayed regardless of what data it actually represents. Do you see what I mean?

Also, adding links to all the sort modes from the main page may not be a good idea as it will cause clutter and would not be scalable.

Personally I have no preference as to what data item it is initially sorted by, as I will look at them all anyway. Mmmm ... statistics :)

obdob
Posts: 8
Joined: 2003-05-21 11:44

Post by obdob » 2003-05-24 08:49

TheParanoidOne wrote: It's not a matter of inconvenience. It's a matter of logical consistency. 0 comes before 1. So to me, it just makes sense to have User0.html be the first item to be displayed regardless of what data it actually represents. Do you see what I mean?
Yes, but the order of the items that belong to a user (name, first / last date, size) has been chosen arbitrarily. I just found the size more interesting than names, so I linked to the version sorted by size.
TheParanoidOne wrote:Also, adding links to all the sort modes from the main page may not be a good idea as it will cause clutter and would not be scalable.
It's only five, and there won't be any more, so I think that's OK.
TheParanoidOne wrote: Personally I have no preference as to what data item it is initially sorted by, as I will look at them all anyway. Mmmm ... statistics :)
I remember! :)

Spencer
Posts: 3
Joined: 2003-05-28 20:58
Location: Vancouver

A question

Post by Spencer » 2003-05-28 21:10

First of all, thanks for making this! I've been waiting for a useful, working log stat system for a long time =) I certainly found some surprises in both of my logs.

Secondly, my question. I suppose this is also a question about the logging of file transfers in DC++ in general, but it's also about this program.
It seems to me that the numbers shown in the upload/download log can be quite inaccurate. I'm sure most people have noticed that. (For example, you can resume a 5.9 gigabyte file that you have 99.999% complete, and the upload log of the person you finish the file with shows 5.9 gigs. If you manually compare the transfer speed with the time, you can see that only a very small portion of the file was uploaded before completion, but DC++ logs the full 5.9 gigs nonetheless.)
So my question is:
1) Are the averaged transfer speeds and exact transfer times found in the log accurate? If so, could these numbers be used instead to get a much more accurate figure of transferred data?

2) Files which are not completed (as far as I know) are not included in the log anywhere, and therefore not in your stat reporting. Does DC++ handle unfinished downloads in any way that could be added to this reporting? For example, perhaps for unfinished downloads connection time and average speed are collected somewhere?

3) Taking 1&2 into account, you can probably get much better figures, in a perfect world. Are either #1 or #2 possible?

Don't get me wrong, your program is great...but it just got me thinking about how accurately DC++ reports on data transferred.

Spencer
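
Question 1 above comes down to simple arithmetic: bytes transferred is roughly average speed times duration, and it can never exceed the logged file size. A minimal sketch, assuming the log yields an average speed in bytes per second and a duration in seconds (both the units and the function name are assumptions, and the result is only as good as the logged speed):

```python
def estimate_bytes(logged_size, avg_speed, duration_seconds):
    """Estimate bytes actually transferred from speed and time.

    logged_size: size in bytes as written in the log (upper bound)
    avg_speed: average transfer speed in bytes per second (assumed unit)
    duration_seconds: transfer time in seconds (assumed unit)
    """
    estimate = int(avg_speed * duration_seconds)
    # Clamp to the range [0, logged_size]; the estimate cannot exceed the file.
    return max(0, min(logged_size, estimate))


# The 5.9 GB example: 12 seconds at 50,000 bytes/s is about 600 KB,
# far less than the full size the log reports.
finished_tail = estimate_bytes(5900000000, 50000, 12)
```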

BSOD2600
Forum Moderator
Posts: 503
Joined: 2003-01-27 18:47
Location: USA

Re: A question

Post by BSOD2600 » 2003-05-28 23:23

Spencer wrote:2) Files which are not completed (as far as I know) are not included in the log anywhere, and therefore not in your stat reporting. Does DC++ handle unfinished downloads in any way that could be added to this reporting? For example, perhaps for unfinished downloads connection time and average speed are collected somewhere?
DC++ can log partial uploads. Go to the advanced tab and check "partial upload logging". To my knowledge, that's all the partial up/downloading DC logs.

Marvin
Posts: 147
Joined: 2003-03-06 06:56
Location: France

Re: A question

Post by Marvin » 2003-05-28 23:45

BSOD2600 wrote:DC++ can log partial uploads. Go to the advanced tab and check "partial upload logging". To my knowledge, that's all the partial up/downloading DC logs.
Am I wrong, or are you speaking about BCDC? I can't see this option in vanilla DC++ (didn't try with my third eye). 8)

Spencer
Posts: 3
Joined: 2003-05-28 20:58
Location: Vancouver

Post by Spencer » 2003-05-29 04:04

Yeah, I'm pretty sure it's not in there... unless I'm just missing it somehow.

BSOD2600
Forum Moderator
Posts: 503
Joined: 2003-01-27 18:47
Location: USA

Post by BSOD2600 » 2003-05-29 09:18

Heh, you're right, only BCDC has this option. I just assumed that it was a feature normal DC++ had.

obdob
Posts: 8
Joined: 2003-05-21 11:44

Re: A question

Post by obdob » 2003-06-01 16:46

Spencer wrote:1) Are the averaged transfer speeds and exact transfer times found in the log accurate? If so, could these numbers be used instead to get a much more accurate figure of transferred data?
The transfer speed isn't entirely accurate for small files - at least in my experience. Some speed values in my log are just not possible.
Spencer wrote:2) Files which are not completed (as far as I know) are not included in the log anywhere, and therefore not in your stat reporting. Does DC++ handle unfinished downloads in any way that could be added to this reporting? For example, perhaps for unfinished downloads connection time and average speed are collected somewhere?
This has been answered already, which was very helpful. I've yet to find the 'log partial uploads' option in BCDC++, though. But I haven't looked hard, or my version is too old. Obviously, once partial uploads are logged, dcplusplusstats's Number of file transfers statistics get skewed, because it considers one line in the log as one transfer. On the other hand, the Number of bytes transferred section will become more accurate once partial transfers are included.
Spencer wrote:3) Taking 1&2 into account, you can probably get much better figures, in a perfect world. Are either #1 or #2 possible?

Don't get me wrong, your program is great...but it just got me thinking about how accurately DC++ reports on data transferred.
Sure, I have to think about how to deal with partial transfers. Also, I may add some speed statistics. Even if DC++ seems to measure a wrong value from time to time, it may be interesting in the long run.

Again, thanks everybody for the feedback so far!
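
The skew described above can be made visible by counting log lines and distinct (user, file) pairs side by side. This is only one possible way to deal with partial transfers, not something dcplusplusstats does; the tuple layout is hypothetical, and it assumes that a partial-upload line records the bytes actually sent, which the thread does not confirm.

```python
def summarize(entries):
    """Summarize log entries given as (user, filename, bytes_sent) tuples.

    Each tuple stands for one log line. With partial logging enabled, the
    same (user, filename) pair may appear on several lines.
    """
    log_lines = 0
    total_bytes = 0
    distinct = set()
    for user, filename, bytes_sent in entries:
        log_lines += 1
        total_bytes += bytes_sent
        distinct.add((user, filename))
    return {
        "log_lines": log_lines,            # what "one line = one transfer" counts
        "distinct_files": len(distinct),   # transfers with partials merged
        "bytes": total_bytes,
    }


# Two partial uploads of the same file plus one complete upload:
stats = summarize([
    ("alice", "big.iso", 400),
    ("alice", "big.iso", 600),
    ("bob", "song.mp3", 50),
])
```

Here the line count is 3 while only 2 distinct files were transferred, and the byte total is unaffected by how the lines are grouped.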

Spencer
Posts: 3
Joined: 2003-05-28 20:58
Location: Vancouver

Post by Spencer » 2003-06-01 21:27

Thanks, sounds like you are going to be adding/upgrading the tool in the future then?
Also, in regard to the incorrect times for very small files, I've noticed that too, and again, it's only noticeable on small files for me. Perhaps for files under a certain size, the program could just report them as they are being reported now, since partial downloads of very small files are unlikely.

Spencer

obdob
Posts: 8
Joined: 2003-05-21 11:44

Post by obdob » 2003-06-02 17:52

Spencer wrote:Thanks, sounds like you are going to be adding/upgrading the tool in the future then?
Yes, as time allows. I just released 1.1: http://www.geocities.com/marcoschmidt.g ... stats.html

Here is the change list I submitted to the tool's Freshmeat page http://freshmeat.net/projects/dcplusplusstats/
Added support for user speed statistics. Generated HTML pages now have more navigation options (LINK elements in the head section, direct links to user pages on the front page). Fixed bug with zero-sized table entries. More reliable recognition of RAR files.

Anyone who wants to get notified about dcplusplusstats's progress should go to that Freshmeat page and pick 'Subscribe to new releases'.
Spencer wrote:Also, in regard to the incorrect times for very small files, I've noticed that too, and again, it's only noticeable on small files for me. Perhaps for files under a certain size, the program could just report them as they are being reported now, since partial downloads of very small files are unlikely.
Right now I don't plan to change anything with regard to partial uploads. If people log them, they'll be considered normal transfers.