Configuration Options - Reading Log Files

Veduta reads input from log files and NT event logs. To configure the reading of NT event logs, see Configuration Options - Reading NT Event Logs.

To set up reading from log files we need to configure Veduta via consumers.xml.

consumers.xml looks like:

<?xml version="1.0" encoding="UTF-8"?>
<consumers>
  <consumer>
    <file>
      <name>System Messages</name>
      ...
    </file>
  </consumer>
  <consumer>
    <file>
      <name>Error Messages</name>
      ...
    </file>
    ...
  </consumer>

</consumers>
Each file is configured in its own <consumer/> section. Each <consumer/> section is given a name so that it can be referenced from the report.xml configuration, which determines how each log file is displayed.

Configuring a Log File

A log file is configured using the following:
<?xml version="1.0" encoding="UTF-8"?>
<consumers>
  <consumer>
    <file>
      <name>System Messages</name>
      <interval>5s</interval>
      <!-- note Unix filename convention here -->
      <filename>/var/log/messages</filename>
      <parser type="..."/>
    </file>
  </consumer>
</consumers>
Here we give the log file a name for later reference, then specify how often to read the file, and the directory and filename of the log file to be read. So far we've specified which log file to read and how often. Now we need to specify how to read that log file.
Note: By default Veduta holds a file open whilst monitoring it. On Windows (NT) systems this may cause problems if a log file needs to be moved, renamed or deleted. Veduta can be configured to open the file only when required by setting:
<file mode="closing">
for each log file.

Getting Timestamps from the Log File

We tell Veduta how to read the log file by providing a parser. In the simple introductory example we specified a simple parser. However that is really only useful for the simplest demos. A timestamped parser is much more useful.
<parser type="timestamped">
  <pattern dateIndex="1" messageIndex="2">{regular expression}</pattern>
  <dateformat currentDate="false" currentYear="true">{date/time expression}</dateformat>
</parser>
This sets the parser to be a timestamped parser. We then have to specify how to read each log file line and how to split it into a timestamp and the remaining message. For example, in
12:34:56 - server up on host lon24cmp
we need to identify the timestamp (12:34:56) and the message (server up on host lon24cmp).

To identify the contents of each line, we specify a regular expression that captures each part of the line, and tell Veduta which captured part is the timestamp and which is the message. For the above example, we can provide the following:

<pattern dateIndex="1" messageIndex="2">^(.{8}) - (.*)$</pattern>
The regular expression is:
^(.{8}) - (.*)$
The first eight characters (which we know are the timestamp) are captured by the first expression in brackets, and the remainder of the line after the hyphen (which we know to be the message) is captured by the second expression. We tell Veduta that the timestamp is represented by the first expression by specifying dateIndex="1", and that the message is represented by the second expression by specifying messageIndex="2".
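The split described above can be tried outside Veduta with plain Java regular expressions. The following is a minimal sketch, not Veduta's own code: the class and method names are hypothetical, and it assumes the whole line must match the expression.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: class and method names are hypothetical, not part of Veduta.
final class TimestampedLineDemo {
    // The same expression as in the <pattern> element.
    private static final Pattern LINE = Pattern.compile("^(.{8}) - (.*)$");

    /** Returns {timestamp, message}, or null if the line does not match. */
    static String[] split(String line, int dateIndex, int messageIndex) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        // dateIndex and messageIndex select capture groups, counted from 1.
        return new String[] { m.group(dateIndex), m.group(messageIndex) };
    }

    public static void main(String[] args) {
        String[] parts = split("12:34:56 - server up on host lon24cmp", 1, 2);
        System.out.println("timestamp = " + parts[0]);
        System.out.println("message   = " + parts[1]);
    }
}
```

Testing an expression this way against a few real lines is a quick check before putting it into consumers.xml.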

See the Examples section below for regular expression examples.

We now have to tell Veduta how to interpret the timestamp by providing a date format expression. In the above example we can provide:

<dateformat currentDate="true">HH:mm:ss</dateformat>
    
This tells Veduta that the timestamp it has read (12:34:56) consists of hours, minutes and seconds separated by colons (:). The date format can specify dates as well (days, months and years).

If the timestamp doesn't contain a date, we can instruct Veduta to substitute the correct date by specifying currentDate="true". When Veduta then reads a time without any date information, it takes the date to be either the current day or the previous day, depending on the time read.

If the timestamp contains the day and month but not the year, we can instruct Veduta to substitute the correct year by specifying currentYear="true". When Veduta then reads a date/time without any year information, it takes the year to be either the current year or the previous year, depending on the date/time read. A surprising number of log files contain date and time information without the year.
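To make the "current or previous" choice concrete, here is a minimal sketch of one plausible rule: use the current day (or year) unless that would place the log entry in the future, in which case step back by one. This rule is an assumption made for illustration; the text above does not spell out Veduta's exact logic, and the class and method names are hypothetical.

```java
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.MonthDay;

// Hypothetical helper: one plausible reading of currentDate / currentYear.
final class MissingDateDemo {
    /** Time only: use today unless that would put the entry in the future. */
    static LocalDateTime withCurrentDate(LocalTime time, LocalDateTime now) {
        LocalDateTime candidate = now.toLocalDate().atTime(time);
        return candidate.isAfter(now) ? candidate.minusDays(1) : candidate;
    }

    /** Day and month only: use this year unless that would be in the future. */
    static LocalDateTime withCurrentYear(MonthDay day, LocalTime time, LocalDateTime now) {
        LocalDateTime candidate = day.atYear(now.getYear()).atTime(time);
        return candidate.isAfter(now)
                ? day.atYear(now.getYear() - 1).atTime(time)
                : candidate;
    }

    public static void main(String[] args) {
        LocalDateTime now = LocalDateTime.now();
        // A time-only timestamp such as the one in the example log line.
        System.out.println(withCurrentDate(LocalTime.parse("12:34:56"), now));
        // A day-and-month timestamp with no year.
        System.out.println(withCurrentYear(MonthDay.of(3, 8), LocalTime.parse("14:13:15"), now));
    }
}
```

Under this rule, a line stamped 23:59:00 that is read five minutes after midnight is attributed to the previous day rather than to a time that has not happened yet.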

See the Examples section below for date/time examples.

Now we've determined how to read a log file, and how to parse it, we can put the above together into a configuration thus:

<?xml version="1.0" encoding="UTF-8"?>
<consumers>
  <consumer>
    <file>
      <name>System Messages</name>
      <interval>5s</interval>
      <dir>/var/log/</dir>
      <filename>messages</filename>
      <parser type="timestamped">
        <pattern dateIndex="1" messageIndex="2">^(.{8}) - (.*)$</pattern>
        <dateformat currentDate="true" currentYear="false">HH:mm:ss</dateformat>
      </parser>
    </file>
  </consumer>
</consumers>

Filtering

You may only be interested in certain log file lines, for example those that contain a server name, or those that don't contain particular information. You can configure Veduta to handle this via filters. For example:
<?xml version="1.0" encoding="UTF-8"?>
<consumers>
  <consumer>
    <file>
      <name>System Messages</name>
      <dir>/var/log/</dir>
      <filename>messages</filename>
      <parser type="timestamped">
        <pattern dateIndex="1" messageIndex="2">^(.{8}) - (.*)$</pattern>
        <dateformat currentDate="true" currentYear="false">HH:mm:ss</dateformat>
        <filters>
          <accept caseSensitive="true">.*server1.*</accept>
        </filters>
      </parser>
    </file>
  </consumer>
</consumers>
The above will only accept lines that contain server1. Similarly, you can reject lines thus:
<filters>
  <reject caseSensitive="false">.*server1.*</reject>
</filters>
    
Note that the above is configured to be case insensitive and will thus reject server1, SERVER1 etc.
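The accept and reject behaviour can be sketched with Java regular expressions as follows. This is an illustration under assumptions: it treats a filter as a whole-line match (which is what the leading and trailing .* in the examples suggest) and maps caseSensitive="false" to Java's CASE_INSENSITIVE flag. The class and method names are hypothetical and are not Veduta's API.

```java
import java.util.regex.Pattern;

// Illustrative only: names are hypothetical, not part of Veduta.
final class LineFilterDemo {
    static Pattern compile(String regex, boolean caseSensitive) {
        return Pattern.compile(regex, caseSensitive ? 0 : Pattern.CASE_INSENSITIVE);
    }

    /** Accept filter: the line is kept only if the whole line matches. */
    static boolean keptByAccept(String regex, boolean caseSensitive, String line) {
        return compile(regex, caseSensitive).matcher(line).matches();
    }

    /** Reject filter: the line is kept only if it does not match. */
    static boolean keptByReject(String regex, boolean caseSensitive, String line) {
        return !compile(regex, caseSensitive).matcher(line).matches();
    }

    public static void main(String[] args) {
        String line = "restart of SERVER1 complete";
        // Case-sensitive accept does not see SERVER1 as server1.
        System.out.println(keptByAccept(".*server1.*", true, line));
        // Case-insensitive reject drops the line.
        System.out.println(keptByReject(".*server1.*", false, line));
    }
}
```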

Some log files contain blank lines. You can remove these by specifying a converter thus:

<?xml version="1.0" encoding="UTF-8"?>
<consumers>
  <consumer>
    <file>
      <name>System Messages</name>
      <dir>/var/log/</dir>
      <filename>messages</filename>
      <parser type="timestamped">
        <pattern dateIndex="1" messageIndex="2">^(.{8}) - (.*)$</pattern>
        <dateformat currentDate="true" currentYear="false">HH:mm:ss</dateformat>
        <converters>
          <converter type="blankLineRemover"/>
        </converters>
      </parser>
    </file>
  </consumer>
</consumers>
Other converter types are available. For example, Java stack traces run over many lines in a log file. These can be collapsed into one log file line by specifying an exceptionToLine converter.
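To illustrate the idea of collapsing a stack trace, here is a minimal sketch in Java. It is not the exceptionToLine implementation: the rule used to recognise continuation lines (indented "at ..." frames, indented "... n more" lines, and "Caused by:" lines) is an assumption, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: an assumed collapsing rule, not Veduta's converter.
final class StackTraceCollapseDemo {
    /** True for lines that look like the continuation of a Java stack trace. */
    static boolean isContinuation(String line) {
        boolean indented = line.startsWith("\t") || line.startsWith(" ");
        String trimmed = line.trim();
        return (indented && (trimmed.startsWith("at ") || trimmed.startsWith("...")))
                || trimmed.startsWith("Caused by:");
    }

    /** Appends each continuation line to the line that precedes it. */
    static List<String> collapse(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            if (!out.isEmpty() && isContinuation(line)) {
                int last = out.size() - 1;
                out.set(last, out.get(last) + " " + line.trim());
            } else {
                out.add(line);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "12:00:01 - java.lang.IllegalStateException: boom",
                "\tat com.example.Foo.bar(Foo.java:10)",
                "12:00:02 - server up");
        collapse(lines).forEach(System.out::println);
    }
}
```

Collapsing matters for a timestamped parser because the stack frame lines carry no timestamp of their own and would otherwise fail to match the line pattern.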

Examples

Note: Veduta uses Java regular expressions. These work in a very similar fashion to those of Perl and other utilities. For additional information on how these work, see the Java regular expression documentation.
Note: Veduta uses Java date/time parsing. See the Java date format documentation for comprehensive information on date/time parsing.
The following examples show how to parse some commonly found log files.

Linux /var/log/messages

Mar  8 14:13:15 dali kernel: tg3: eth0: Link is down.
Mar  8 14:13:18 dali kernel: tg3: eth0: Link is up at 100 Mbps, full duplex.
Mar  8 14:13:18 dali kernel: tg3: eth0: Flow control is on for TX and on for RX.
can be parsed using:
<parser type="timestamped">
  <pattern dateIndex="1" messageIndex="2">^(.{15}) (.*)$</pattern>
  <dateformat currentDate="false" currentYear="true">MMM dd HH:mm:ss</dateformat>
</parser>
    
(Note that the log file contains the day and month but not the year, hence currentYear="true".)
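The pattern and date format for this log can be checked with standard Java classes. The sketch below assumes the date format uses java.text.SimpleDateFormat pattern letters, which is consistent with the formats shown on this page but is not confirmed by it; the class and method names are hypothetical. The year is deliberately not inspected, since the line does not supply one.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Locale;
import java.util.TimeZone;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: names are hypothetical, not part of Veduta.
final class SyslogLineDemo {
    private static final Pattern LINE = Pattern.compile("^(.{15}) (.*)$");

    /** Returns {timestamp, message}, or null if the line does not match. */
    static String[] split(String line) {
        Matcher m = LINE.matcher(line);
        return m.matches() ? new String[] { m.group(1), m.group(2) } : null;
    }

    /** Parses the year-less timestamp; English month names and UTC are assumed. */
    static Calendar parseTimestamp(String text) {
        SimpleDateFormat format = new SimpleDateFormat("MMM dd HH:mm:ss", Locale.ENGLISH);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        Calendar calendar = Calendar.getInstance(TimeZone.getTimeZone("UTC"), Locale.ENGLISH);
        try {
            calendar.setTime(format.parse(text));
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable timestamp: " + text, e);
        }
        return calendar;
    }

    public static void main(String[] args) {
        String[] parts = split("Mar  8 14:13:15 dali kernel: tg3: eth0: Link is down.");
        Calendar when = parseTimestamp(parts[0]);
        System.out.println("day of month = " + when.get(Calendar.DAY_OF_MONTH));
        System.out.println("message      = " + parts[1]);
    }
}
```

Note that single-digit days are padded with a space in this log, which is why the timestamp is always fifteen characters wide.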

Apache /var/log/httpd/access_log

66.249.71.42 - - [27/Nov/2005:12:41:40 +0000] "GET /robots.txt HTTP/1.0" 404 992
66.249.71.42 - - [27/Nov/2005:12:41:41 +0000] "GET /software/xmltask/ HTTP/1.0" 200 65145
can be parsed using:
<parser type="timestamped">
  <pattern dateIndex="2" messageIndex="1">^(.* - - \[(.*)\] ".*)$</pattern>
  <dateformat currentDate="false" currentYear="false">dd/MMM/yyyy:HH:mm:ss Z</dateformat>
  <converters>
    <converter type="blankLineRemover"/>
  </converters>
</parser>
The regular expression in this example is relatively complicated because the date appears in the middle of the message. The first regular expression group (in brackets) encompasses the complete line, and the second, nested inside it, wraps the date/time stamp.
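The nested groups can be seen at work in the following sketch, which applies the same expression and date format using standard Java classes. As before, it assumes SimpleDateFormat-style pattern letters and English month names, and the class and method names are hypothetical.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: names are hypothetical, not part of Veduta.
final class AccessLogLineDemo {
    // Group 1 is the whole line; group 2 is the timestamp nested inside it.
    private static final Pattern LINE = Pattern.compile("^(.* - - \\[(.*)\\] \".*)$");

    /** Returns {timestamp, message} using dateIndex=2 and messageIndex=1. */
    static String[] split(String line) {
        Matcher m = LINE.matcher(line);
        return m.matches() ? new String[] { m.group(2), m.group(1) } : null;
    }

    /** Parses a timestamp such as 27/Nov/2005:12:41:40 +0000. */
    static Date parseTimestamp(String text) {
        try {
            return new SimpleDateFormat("dd/MMM/yyyy:HH:mm:ss Z", Locale.ENGLISH).parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable timestamp: " + text, e);
        }
    }

    public static void main(String[] args) {
        String line = "66.249.71.42 - - [27/Nov/2005:12:41:40 +0000] "
                + "\"GET /robots.txt HTTP/1.0\" 404 992";
        String[] parts = split(line);
        System.out.println("timestamp = " + parts[0]);
        System.out.println("epoch ms  = " + parseTimestamp(parts[0]).getTime());
    }
}
```

Because the timestamp carries its own date, year and zone offset, both currentDate and currentYear are set to false for this log.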