Apache Today

Apache Guide: Logging, Part 5: Advanced Logging Techniques and Tips
Sep 25, 2000, 12:52 UTC

By Rich Bowen

In this final article on logging, I'll attempt to touch on a few of the things that I've left out or skimped on. By now you're probably tired of hearing about logging, so we'll start on something new next week.

I'll start with a few additional comments about log-file parsing. Although I stated that I was not trying to be comprehensive in my treatment of log-file-parsing software, and that I was aware of many other programs for this purpose, I received no fewer than 20 email messages from various users and software vendors, either suggesting other packages or chastising me for not mentioning their favorite application.

There are dozens and dozens of software packages on the market for parsing HTTP server log files and generating useful statistics. I talked about the few that I have actually used and found useful, and about one other that had been highly recommended to me recently. I was not trying to suggest that these were the only ones available, or even that they are the best.

A quick search on Google for "apache log reporting", or something like that, will return hundreds of pages dedicated to this topic, along with various vendors selling their particular solution to this rather simple problem. These packages do everything from giving you a single number ("You had 12 visits to your web site.") all the way up to drawing detailed graphs that analyze your traffic by domain name and by how that particular company is doing on the stock market. ("20% of your traffic was from Fortune 500 companies. See the blue bar on graph 27.")

I tend to prefer something closer to the simple end, since I'm usually just trying to get the big picture anyway.

Logging to a process

You don't have to log to a file. You can log to a process. This is particularly useful if you want your logs to go to a database or to some process that will give some type of real-time statistics on your web site traffic.

Now, I need to be perfectly honest about this. I have never had any particular use for this ability. I have played with it from time to time, but have never found any actual practical use for it. Perhaps someone can tell me about some real-life situation where this has been of value.

Anyway, here's what you can do. With either the TransferLog or CustomLog directive, instead of specifying a file to which the log should be written, you can specify "|" followed by the name of a program that is to receive the logging information.

For example:

     CustomLog |/usr/bin/apachelog.pl common

where /usr/bin/apachelog.pl is some program that knows what to do with Apache log file entries. This may be as simple as a Perl program that processes the log entries in some fashion, or it may be something that writes entries to a database.
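
As a rough illustration of what such a program might look like, here is a minimal sketch in Perl. It assumes the entries arrive in the common log format, one per line on standard input, and it simply tallies requests by status code. The subroutine name parse_common and the idea of printing the tallies to standard error are my own assumptions for the example, not anything Apache requires.

```perl
#!/usr/bin/perl
# Hypothetical sketch of a program like /usr/bin/apachelog.pl.
# Assumes Apache sends entries in the "common" log format.
use strict;
use warnings;

# Split one common-format line into named fields.
# Returns an empty list when the line does not match.
sub parse_common {
    my ($line) = @_;
    my ($host, $ident, $user, $time, $request, $status, $bytes) =
        $line =~ /^(\S+) (\S+) (\S+) \[([^\]]+)\] "(.*)" (\d{3}) (\S+)\s*$/
        or return;
    return (
        host    => $host,
        user    => $user,
        time    => $time,
        request => $request,
        status  => $status,
        bytes   => $bytes,
    );
}

my %hits_by_status;

# Apache writes one entry per line to this program's standard input.
while (my $line = <STDIN>) {
    my %entry = parse_common($line) or next;
    $hits_by_status{ $entry{status} }++;
    # A real program might insert %entry into a database here.
}

# Standard input reaches end-of-file when Apache closes the pipe;
# report the tallies at that point.
for my $status (sort keys %hits_by_status) {
    print STDERR "$status: $hits_by_status{$status}\n";
}
```

If the program needs command-line arguments, the whole "|program arguments" string has to be enclosed in double quotes in the directive so that Apache reads it as a single argument.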

The main thing to be cautious about if you're going to do this is security. Log files are opened with the permissions of the user that starts the server. This is usually root. And this applies as well to logging to a process. Make sure that the process to which you are logging is secure. If you log to an insecure process (one that some non-root user can tinker with) you run the risk of having that process be replaced by another that does unsavory things. If, for example, /usr/bin/apachelog.pl is world-writable, any user could edit it to shut down your server, mail someone the password file, or delete important files. This would be done with root permissions.

If you want to log to a process of some kind, you might be better advised to look for a module that already implements the functionality that you are looking for. Check out http://modules.apache.org/ for a list of some of the modules available to do all sorts of cool things with Apache.

Rotating Your Log Files

Log files get big. If you're not careful, and if you're logging to somewhere like /var, you can actually fill up the partition and bring your server to a grinding halt. Yes, I've done this.

The way around this is to move your log files to some other place before they get too big. This can be accomplished in a number of different ways. Some Unix variants come with a logrotate script that handles this for you. Red Hat, for example, comes preconfigured to rotate your logs for you every few days, based on either their size or their age.
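
If your system has logrotate, a stanza along the following lines is one way to describe the rotation. This is only a sketch: the log and apachectl paths are the ones used elsewhere in this article and may differ on your system, and the weekly schedule and count of five are assumptions chosen to match the Perl example in this section. Such a file usually lives in /etc/logrotate.d/.

```
# Hypothetical /etc/logrotate.d/apache entry
/usr/local/apache/logs/access_log {
    weekly
    rotate 5
    compress
    missingok
    postrotate
        /usr/local/apache/bin/apachectl restart
    endscript
}
```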

If you want to do this yourself, you can use a Perl module (freely available from CPAN) called Logfile::Rotate. The following code, run periodically (perhaps once a week?) by cron, will rotate out your logfile, keeping five previous log files at any given time. Each backup log file will be gzipped to conserve space.

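     use strict;
     use Logfile::Rotate;

     # Keep five gzipped generations of the access log.
     my $logfile = Logfile::Rotate->new(
          File   => '/usr/local/apache/logs/access_log',
          Count  => 5,
          Gzip   => '/bin/gzip',
          # Newer releases of the module warn that Signal is
          # deprecated; use Post => sub { ... } there instead.
          Signal => sub {
               `/usr/local/apache/bin/apachectl restart`;
               }
          );

     # Perform the rotation; the constructor alone does not rotate.
     $logfile->rotate();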

This does not seem like much, but the Perl module takes care of all the details. You'll end up with files called things like access_log.1.gz, access_log.2.gz, and so on. Each file gets bumped up one number at every rotation, and the file that used to be access_log.5.gz is deleted.

This keeps you from running out of space on your log drive, and keeps as much of an archive as you like.
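
To have cron run the rotation once a week, an entry like the one below could go into root's crontab (root, because the script restarts the server). The script path is hypothetical; it stands for wherever you saved the Perl code above. The five fields mean minute 0, hour 4, any day of the month, any month, day of week 0 (Sunday).

```
# Hypothetical crontab entry: rotate the access log at 04:00 every Sunday
0 4 * * 0 /usr/local/apache/bin/rotate_access_log.pl
```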

Logging for Multiple Virtual Hosts

I had several people write to me asking about how to handle logging when you have more than one virtual host on the same machine. I assume that they are running all of their logs into one log file, and are then attempting to split that log file back out into its component parts in order to get meaningful reports per host.

The solution to this problem is not to log to one log file in the first place. I know that there are utilities out there that will take a mixed log file, and, based on your virtual host configurations, figure out what requests were for which virtual host, and generate reports appropriately. This all seems to be too much work, as far as I can tell.

In each of your VirtualHost sections, simply specify a log file for that host. You can then handle each log file separately when it comes time to run reports.
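
As a sketch, a configuration with two IP-based virtual hosts might look like this. The addresses, host names, and file names are placeholders I made up for the example, and it assumes that the "common" nickname is already defined by a LogFormat directive, as it is in the default configuration file.

```
<VirtualHost 192.0.2.1>
    ServerName www.example.com
    DocumentRoot /usr/local/apache/htdocs/example.com
    CustomLog /usr/local/apache/logs/example.com-access_log common
    ErrorLog /usr/local/apache/logs/example.com-error_log
</VirtualHost>

<VirtualHost 192.0.2.2>
    ServerName www.example.org
    DocumentRoot /usr/local/apache/htdocs/example.org
    CustomLog /usr/local/apache/logs/example.org-access_log common
    ErrorLog /usr/local/apache/logs/example.org-error_log
</VirtualHost>
```

Note that each host in this sketch opens two log files, which is worth keeping in mind when counting file handles.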

There are some concerns with available file handles. That is, if you are running hundreds of virtual hosts, and have a log file per host, you may encounter a situation where you run out of available file handles. This can cause system instability and can even cause your system to halt. However, this is primarily a concern on servers that are hosting a very large number of virtual hosts.

For those that asked this question, please let me know if I'm completely missing the point of your question.

Summary

In the last several weeks, we've talked about various aspects of logging with Apache. You should now be equipped to log whatever information you're interested in, and to get all sorts of useful statistics out of those log files.

If there are other topics that you'd like to see me cover in Apache Guide, please send me a note at

Talkback(s)

  Topics
Hi

I have read all these topics. Thanks very much!
Furthermore, if it is possible, could you introduce
something about the overall landscape of the Apache server?
Then it would be easier for us to read the source code or
documentation for the Apache server.
Thanks again.
  
  Sep 26, 2000, 01:50:36
  apache log script is broken.
hello ...

On the apache rotate script, using the rotate module I got this:

Signal is a deprecated argument, see Pre/Post at apache_access_rotate.pl line 11

running:

This is perl, version 5.005_03 built for i386-linux (redhat)

Thus, how can the signal argument be updated?

t.

  
  Sep 29, 2000, 19:57:52
  Comments for your apache log series
Simply -> excellent

  
  Oct 3, 2000, 11:46:11
  Log rotation - one of the weakest Apache features ?
One of the weakest Apache features, IMHO, is log rotation ! Why ? Well,
no-one's come up with a sensible solution yet that doesn't involve sending a
restart signal to the server (with hundreds of virtual hosts, Apache can
take 30 seconds to restart !).

Yes, I know about stuff like the "rotatelogs" program that comes with
Apache, but that's unusable with multiple virtual hosts logged to separate
files (the idea of sending all your virtual host logging to one file is
terrible !) because you need *two* copies of rotatelogs per virtual host
(one for access_log and another for error_log).

What's desperately needed with Apache is a proper logging/rotation module
that all other modules can call (a bit like the way you have that chunk of
Perl that calls a generic Perl rotation library). That way, if you have
stuff like SSL logs (2 more to rotate !), they can use the same routines.

What I'd suggest is the following at rotation time:

* Apache tells all its processes to stop logging and close the log files
(note: Apache may only have a file handle for each log file - it would have
to know the full pathname to each log file too, for re-opening of course).

* This does *not* stop Web serving - incoming entries can either be buffered
in memory as log lines (to be written out when logging re-activates) or
just thrown away if you don't consider them too important (a config option
I'd suggest).

* Apache moves the current log file to a configurable logfile.suffix file and
(maybe) creates a fresh zero-length logfile with the same owner/perms/
group as the just-moved logfile.

* Apache re-opens log files and flushes any buffered log lines to those
re-opened logs.

* A "post-rotate" command is optionally run on the logfile.suffix file
(e.g. gzip to compress them).

This way, we get no disruption of Web serving at rotation time and, potentially,
no loss of logging either. Infinitely superior to the current situation !
I can't be the only person to have thought of this, surely ?

At the moment, I have a script that kicks in and does a lot of this, but it
still has to restart the server otherwise the logging would just continue to
the logfile.suffix file of course.   
  Oct 10, 2000, 21:17:10
  Re: Log rotation
> At the moment, I have a script that kicks in and does a lot of this, but it
> still has to restart the server otherwise the logging would just continue to
> the logfile.suffix file of course.

Do you use 'kill -HUP'/'kill -1' or 'kill -USR1'? The USR1 is a bit cleaner since any currently serving children finish off their request before they exit.

You could also consider logging to a semi-smart process instead of a file. We added the host name to the beginning of the log format so the logger can store it in the proper file. On the up side:

* When you rotate logs, just kill or signal the logger -- the apache children will freeze at the logging stage until it returns, but apache will still accept requests.

* Make the logger a bit smarter and it can rotate the logs, automatically, on its own (including forking a child to do any post-rotation stuff like gzip).

On the down side:
* Never send apache a USR1 signal if it logs to a process -- any currently serving children will permanently hang at the logging stage. Use -HUP instead.

* Unless you edit the source code, you can't add the host name to error logs (not sure about SSL), which means you still need to restart apache to rotate those if you want to keep them separate. If you can merge all the error logs, you can log those to a process, too.

If you have an external script rotate the logs, you can kill the access-log logger -- apache will restart it. You must set up a signal handler for the ErrorLog one, though -- apache (on Linux, anyway) will NOT restart it.   
  Oct 14, 2000, 18:50:57
  Log Rotate without restart
I had this question, is it a bad idea to do this in a LogRotate script

cp /path/logs/access_log /path/logs/access_log_date
cp /dev/null /path/logs/access_log

This way no restart of apache is needed. I know you might lose some hits, but on a fairly loaded server, what's ten hits when you count 100K of them?

Is this a bad/good idea to do rotation this way? I really can't bring myself to restart apache in a cron process. I always feel like it will die and not come back up.   
  Oct 19, 2000, 14:04:49
  ErrorLog
It seems to me that if I escalate ErrorLog to LogLevel above error (I used emerg and alert), then it stops capturing errors? Anyone had this?   
  Oct 19, 2000, 14:16:49
  rotatelogs is good
rotatelogs never stops and restarts your httpd, so you never lose your connection, and you get your access_log file day by day or week by week. It would be better if the log file were named like access_log.1021 (Oct 21) rather than 0097200000. We can back up all these logs with a cron script.

Zhefu Fan   
  Oct 20, 2000, 19:00:42
   Re: ErrorLog
> It seems to me that if I escalate ErrorLog to LogLevel above error (I used emerg and alert), then it stops capturing errors? Anyone had this?

Yes, that's correct. That's the way that it's supposed to work. Setting it to, for example, emerg, means that you only want messages that are emergencies.

Rich   
  Nov 1, 2000, 01:30:24
  cronolog
I use cronolog. It is pretty good and does the trick.
You have to compile it, but that seems easy to do, and it installs in
/usr/local/sbin.
I think there is a link somewhere on the Apache web page for it.
I think it might even replace the rotatelogs script that comes with Apache in the future. I read that somewhere.
You can organise your logs in various ways using it.
You do not have to stop and start your web server, and it is a lot nicer than copying and moving log files from cron jobs. Yuk!
http://www.ford-mason.co.uk/resources/cronolog/

Andy Rae   
  Nov 7, 2000, 15:31:23
   Re: rotatelogs is good
Can rotatelogs be used with multiple vhosts though?   
  Nov 16, 2000, 16:09:08
   Re: Log Rotate without restart
> I had this question, is it a bad idea to do this in a LogRotate script

> cp /path/logs/access_log /path/logs/access_log_date
> cp /dev/null /path/logs/access_log

> This way no restart of apache needed.

Wouldn't this create an error (fighting for a fd) and cause Apache to stop logging?   
  Nov 16, 2000, 16:13:58
  Order of cookie fields?
This is an interesting series, but it didn't answer the question I'm currently
trying to answer, which is: cookies often contain several (or many!) constituent fields. Is there any way to instruct an Apache server to write the cookie fields in a particular order? For example, to place cookie A first?
  
  Nov 21, 2000, 21:46:17
All times are recorded in UTC.
Copyright 2002 INT Media Group, Incorporated All Rights Reserved.