In the first sections of this series, I've talked about what goes into the standard log files, and how you can change the contents of those files. This week, we're looking at how to get meaningful information back out of those log files.

The Challenge

The problem is that although there is an enormous amount of information in the log files, it's not much good to the people who pay your salary. They want to know how many people visited your site, what they looked at, how long they stayed, and where they found out about your site. All of that information is (or might be) in your log files.

They also want to know the names, addresses, and shoe sizes of those people, and, hopefully, their credit card numbers. That information is not in there, and you need to know how to explain to your employer that not only is it not in there, but the only way to get it is to explicitly ask your visitors for it, and to be willing to be told 'no.'

What Your Log Files Can Tell You

There is a lot of information available to put in your log files, including the following: the address of the machine making the request; the identity and userid of the requester (seldom reliable); the time of the request; the request line itself; the status code returned; the number of bytes sent; and, if you use the combined format, the referring page and the visitor's browser.
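To make this concrete, here is a minimal sketch of pulling the simple fields out of a single Common Log Format entry with standard shell tools. The log line itself is made-up sample data, not taken from a real server:

```shell
# A sample entry in Apache's Common Log Format (hypothetical data):
line='192.168.1.10 - frank [18/Sep/2000:10:22:04 -0400] "GET /index.html HTTP/1.0" 200 2326'

# awk splits on whitespace, so the simple fields fall out by position:
host=$(echo "$line" | awk '{print $1}')      # client (or proxy) address
status=$(echo "$line" | awk '{print $9}')    # HTTP status code
bytes=$(echo "$line" | awk '{print $10}')    # bytes sent in the response

echo "host=$host status=$status bytes=$bytes"
# prints: host=192.168.1.10 status=200 bytes=2326
```

Note that the timestamp and request line contain embedded spaces, so anything beyond a quick count wants a real parser rather than naive whitespace splitting.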
What Your Log Files Don't Tell You

HTTP is a stateless, anonymous protocol. This is by design, and is not, at least in my opinion, a shortcoming of the protocol. If you want to know more about your visitors, you have to be polite, and actually ask them. And be prepared not to get reliable answers. This is amazingly frustrating for marketing types. They want to know the average income, number of kids, and hair color of their target demographic. Or something like that. And they don't like to be told that that information is not available in the log files. However, it is quite beyond your control to get this information out of the log files. Explain to them that HTTP is anonymous.

And even what the log files do tell you is occasionally suspect. For example, I have numerous entries in my log files that come from a single machine which is actually a caching proxy server, making requests on behalf of many different clients. The address recorded in the log is the proxy's address, not the address of the person actually viewing the page.

Another implication of this is that if, 10 minutes later, someone else sitting behind that same proxy requests the same page, they don't generate a log file entry at all. They type in the address, and that request goes to the proxy server. The proxy sees the request and thinks "I already have that document in memory. There's no point asking the web site for it again." And so instead of asking my web site for the page, it gives the copy that it already has to the client. So, not only is the address field suspect, but the number of requests is also suspect.

So, Um, What Good Are These Logs?

It might sound like the data that you receive is so suspect as to be useless. This is in fact not the case. It should just be taken with a grain of salt. The number of hits that your site receives is almost certainly not really the number of visitors that came to your site. But it's a good indication. And it still gives you some useful information. Just don't rely on it for exact numbers.

How Do I Get Useful Statistics?

So, to the real meat of all of this. How do you actually generate statistics from your Web-server logs?
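As one concrete illustration of the hit-versus-visitor gap, counting distinct client addresses gives a different number than counting raw hits, and because of proxies both numbers are only estimates. A sketch with standard shell tools, run against a made-up three-line log:

```shell
# Build a tiny sample access_log (entirely hypothetical entries):
cat > /tmp/sample_access_log <<'EOF'
10.0.0.1 - - [18/Sep/2000:10:00:00 -0400] "GET / HTTP/1.0" 200 512
10.0.0.1 - - [18/Sep/2000:10:00:05 -0400] "GET /a.gif HTTP/1.0" 200 100
10.0.0.2 - - [18/Sep/2000:10:01:00 -0400] "GET / HTTP/1.0" 200 512
EOF

hits=$(wc -l < /tmp/sample_access_log)                              # every request counts
addrs=$(awk '{print $1}' /tmp/sample_access_log | sort -u | wc -l)  # distinct client addresses

echo "hits: $hits  distinct addresses: $addrs"
```

Here three hits came from only two addresses; a caching proxy would skew both figures further, in opposite directions.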
There are two main approaches that you can take here. You can either do it yourself, or you can get one of the existing applications that are available to do it for you. Unless you have custom log files that don't look anything like the standard Apache log formats, one of the existing applications will probably do everything you need. So, without further ado, here are some of the great apps out there that can help you with this task.
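Even if you settle on an off-the-shelf package, a quick ad hoc report is often handy. For example, ranking the most-requested URLs takes a single pipeline, since field 7 of a Common Log Format entry is the URL (again using a made-up sample log):

```shell
# Build a tiny sample access_log (entirely hypothetical entries):
cat > /tmp/sample_access_log <<'EOF'
10.0.0.1 - - [18/Sep/2000:10:00:00 -0400] "GET /index.html HTTP/1.0" 200 512
10.0.0.2 - - [18/Sep/2000:10:01:00 -0400] "GET /index.html HTTP/1.0" 200 512
10.0.0.3 - - [18/Sep/2000:10:02:00 -0400] "GET /about.html HTTP/1.0" 200 256
EOF

# Extract the URL field, then count occurrences and rank, busiest first:
awk '{print $7}' /tmp/sample_access_log | sort | uniq -c | sort -rn
```

Dedicated packages do essentially this, plus date handling, referrer breakdowns, and pretty output.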
Or, You Can Do It Yourself

If you want to do your own log parsing and reporting, the best tool for the task is going to be Perl. In fact, Perl's name (Practical Extraction and Report Language) is a tribute to its ability to extract useful information from logs and generate reports. (In reality, the name "Perl" came before the expansion of it, but I suppose that does not detract from my point.)

There are CPAN modules for parsing Apache log files. For detailed information about how to use one, install it and read the documentation; once you have installed a module, you can get at its documentation by typing perldoc followed by the module name.

Trolling through the source code for wwwstat is another good way to learn about Perl log file parsing.

And That's About It

Not much more to say here. I'm sure that I've missed someone's favorite log parsing tool, and that's to be expected. There are hundreds of them on the market. It's really a question of how much you want to pay, and what sort of reports you need.

Thanks for Listening

Thanks for reading. Let me know if there are any subjects that you'd like to see articles on in the future.

--Rich