To read from log files, we configure Veduta via consumers.xml.
A consumers.xml file looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<consumers>
<consumer>
<file>
<name>System Messages</name>
...
</file>
</consumer>
<consumer>
<file>
<name>Error Messages</name>
...
</file>
...
</consumer>
</consumers>
<?xml version="1.0" encoding="UTF-8"?>
<consumers>
<consumer>
<file>
<name>System Messages</name>
<interval>5s</interval>
<!-- note Unix filename convention here -->
<filename>/var/log/messages</filename>
<parser type="..."/>
</file>
</consumer>
</consumers>
<file mode="closing">
for each log file.
<parser type="timestamped">
<pattern dateIndex="1" messageIndex="2">{regular expression}</pattern>
<dateformat currentDate="false" currentYear="true">{date/time expression}</dateformat>
</parser>
To identify the contents of the log file, we specify a regular expression that splits each line into groups, and tell Veduta which group is the timestamp (dateIndex) and which is the log message (messageIndex). For example, in the above example we can provide the following:
<pattern dateIndex="1" messageIndex="2">^(.{8}) - (.*)$</pattern>
See here for regular expression examples.
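Before committing a pattern to consumers.xml it can help to try it against a sample line. The sketch below uses Python purely for illustration (Veduta's own regular expression engine may differ in edge cases), and the sample line is a hypothetical one invented to fit the pattern above: an 8-character time, a " - " separator, then the message.

```python
import re

# The pattern from the <pattern> element above.
PATTERN = re.compile(r"^(.{8}) - (.*)$")

# Hypothetical log line, invented for illustration only.
sample = "14:13:15 - Link is down"

match = PATTERN.match(sample)
if match is None:
    print("no match: check the separator and the timestamp width")
else:
    timestamp = match.group(1)  # the group named by dateIndex="1"
    message = match.group(2)    # the group named by messageIndex="2"
    print(f"timestamp={timestamp!r} message={message!r}")
```

If the pattern does not match, the usual culprits are a timestamp of a different width or a different separator between timestamp and message.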
We now have to tell Veduta how to interpret the timestamp, which we do by providing a date format expression. For example, in the above example we can provide:
<dateformat currentDate="true">HH:mm:ss</dateformat>
If the timestamp doesn't contain a date, we can instruct Veduta to substitute the correct date by specifying currentDate="true". When Veduta reads a time without any date information, it will take the date to be either the current day or the previous day (determined by the time read).
If the timestamp contains the day and month but not the year, we can instruct Veduta to substitute the correct year by specifying currentYear="true". When Veduta reads a date/time without any year information, it will take the year to be either the current year or the previous year (determined by the date/time read). A surprising number of log files contain date and time information without the year!
See here for date/time examples.
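The current-or-previous choice described above can be sketched as follows. This is an illustration only, not Veduta's implementation: the function names are hypothetical, and the rule used (a timestamp that would lie in the future is moved back one day or one year) is an assumed reading of "determined by the time read". The sketch also ignores 29 February.

```python
from datetime import date, datetime, time, timedelta


def resolve_current_date(read_time: time, now: datetime) -> datetime:
    """Attach a date to a time-only stamp (currentDate="true" sketch).

    Assumption: a time later than the current clock time cannot belong
    to today, so it is assigned to the previous day.
    """
    candidate = datetime.combine(now.date(), read_time)
    if candidate > now:
        candidate -= timedelta(days=1)
    return candidate


def resolve_current_year(month: int, day: int, read_time: time,
                         now: datetime) -> datetime:
    """Attach a year to a stamp that lacks one (currentYear="true" sketch).

    Assumption: a date/time that would lie in the future belongs to the
    previous year.
    """
    candidate = datetime.combine(date(now.year, month, day), read_time)
    if candidate > now:
        candidate = candidate.replace(year=now.year - 1)
    return candidate


# Fixed "now" just after midnight on 2 January, so both rollovers apply.
now = datetime(2006, 1, 2, 0, 5, 0)
print(resolve_current_date(time(23, 59, 0), now))
print(resolve_current_year(12, 31, time(23, 59, 0), now))
```

The rollover matters mostly just after midnight or just after New Year, when the tail of a log still holds entries from the previous day or year.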
Now that we've determined how to read a log file and how to parse it, we can put the above together into a configuration:
<?xml version="1.0" encoding="UTF-8"?>
<consumers>
<consumer>
<file>
<name>System Messages</name>
<interval>5s</interval>
<dir>/var/log/</dir>
<filename>messages</filename>
<parser type="timestamped">
<pattern dateIndex="1" messageIndex="2">^(.{8}) - (.*)$</pattern>
<dateformat currentDate="true" currentYear="false">HH:mm:ss</dateformat>
</parser>
</file>
</consumer>
</consumers>
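A malformed consumers.xml is easy to produce by hand, so a quick well-formedness check before starting Veduta can save time. The sketch below uses Python's standard xml.etree.ElementTree to parse the configuration above and list each file consumer. The summarize helper is hypothetical and only checks that the XML parses; it does not validate the file against whatever rules Veduta itself applies.

```python
import xml.etree.ElementTree as ET

# The combined configuration from above, as bytes so that the
# encoding declaration is honoured by the parser.
CONFIG = b"""<?xml version="1.0" encoding="UTF-8"?>
<consumers>
<consumer>
<file>
<name>System Messages</name>
<interval>5s</interval>
<dir>/var/log/</dir>
<filename>messages</filename>
<parser type="timestamped">
<pattern dateIndex="1" messageIndex="2">^(.{8}) - (.*)$</pattern>
<dateformat currentDate="true" currentYear="false">HH:mm:ss</dateformat>
</parser>
</file>
</consumer>
</consumers>
"""


def summarize(config: bytes) -> list:
    """Return one dict per <file> consumer; raises ParseError if malformed."""
    root = ET.fromstring(config)
    summary = []
    for file_el in root.findall("./consumer/file"):
        parser_el = file_el.find("parser")
        summary.append({
            "name": file_el.findtext("name"),
            "dir": file_el.findtext("dir"),
            "filename": file_el.findtext("filename"),
            "parser": parser_el.get("type") if parser_el is not None else None,
        })
    return summary


for entry in summarize(CONFIG):
    print(entry)
```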
<?xml version="1.0" encoding="UTF-8"?>
<consumers>
<consumer>
<file>
<name>System Messages</name>
<dir>/var/log/</dir>
<filename>messages</filename>
<parser type="timestamped">
<pattern dateIndex="1" messageIndex="2">^(.{8}) - (.*)$</pattern>
<dateformat currentDate="true" currentYear="false">HH:mm:ss</dateformat>
<filters>
<accept caseSensitive="true">.*server1.*</accept>
</filters>
</parser>
</file>
</consumer>
</consumers>
<filters>
<reject caseSensitive="false">.*server1.*</reject>
</filters>
<?xml version="1.0" encoding="UTF-8"?>
<consumers>
<consumer>
<file>
<name>System Messages</name>
<dir>/var/log/</dir>
<filename>messages</filename>
<parser type="timestamped">
<pattern dateIndex="1" messageIndex="2">^(.{8}) - (.*)$</pattern>
<dateformat currentDate="true" currentYear="false">HH:mm:ss</dateformat>
<converters>
<converter type="blankLineRemover"/>
</converters>
</parser>
</file>
</consumer>
</consumers>
Linux /var/log/messages
Mar  8 14:13:15 dali kernel: tg3: eth0: Link is down.
Mar  8 14:13:18 dali kernel: tg3: eth0: Link is up at 100 Mbps, full duplex.
Mar  8 14:13:18 dali kernel: tg3: eth0: Flow control is on for TX and on for RX.
<parser type="timestamped">
<pattern dateIndex="1" messageIndex="2">^(.{15}) (.*)$</pattern>
<dateformat currentDate="false" currentYear="true">MMM dd HH:mm:ss</dateformat>
</parser>
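The fixed width of 15 in this pattern relies on the traditional syslog timestamp, which pads a single-digit day with an extra space ("Mar  8"). The following Python sketch, offered as an illustration rather than as Veduta's behaviour, checks the pattern against a padded line and shows that the same line with the padding collapsed no longer matches. The strptime format is an approximate Python counterpart of "MMM dd HH:mm:ss", not the expression Veduta reads.

```python
import re
from datetime import datetime

# The pattern from the <pattern> element above.
PATTERN = re.compile(r"^(.{15}) (.*)$")

# Traditional syslog line: note the two spaces before the day.
padded = "Mar  8 14:13:15 dali kernel: tg3: eth0: Link is down."
# The same line with the padding collapsed to one space.
collapsed = "Mar 8 14:13:15 dali kernel: tg3: eth0: Link is down."

match = PATTERN.match(padded)
stamp = match.group(1)    # dateIndex="1"
message = match.group(2)  # messageIndex="2"

# No year in the stamp, so Python falls back to 1900; supplying the
# real year is what currentYear="true" asks Veduta to do.
parsed = datetime.strptime(stamp, "%b %d %H:%M:%S")

print(repr(stamp), repr(message), parsed)
print("collapsed line matches:", PATTERN.match(collapsed) is not None)
```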
Apache /var/log/httpd/access_log
66.249.71.42 - - [27/Nov/2005:12:41:40 +0000] "GET /robots.txt HTTP/1.0" 404 992
66.249.71.42 - - [27/Nov/2005:12:41:41 +0000] "GET /software/xmltask/ HTTP/1.0" 200 65145
<parser type="timestamped">
<pattern dateIndex="2" messageIndex="1">^(.* - - \[(.*)\] ".*)$</pattern>
<dateformat currentDate="false" currentYear="false">dd/MMM/yyyy:HH:mm:ss Z</dateformat>
<converters>
<converter type="blankLineRemover"/>
</converters>
</parser>
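In this pattern the groups are nested: group 1 wraps the whole line (so the full request is kept as the message) and group 2 is the bracketed timestamp inside it, which is why dateIndex is 2 and messageIndex is 1. A small Python sketch, for illustration only, confirms the group numbering against the first sample line. The strptime format is an approximate Python counterpart of "dd/MMM/yyyy:HH:mm:ss Z".

```python
import re
from datetime import datetime

# The pattern from the <pattern> element above.
PATTERN = re.compile(r'^(.* - - \[(.*)\] ".*)$')

line = ('66.249.71.42 - - [27/Nov/2005:12:41:40 +0000] '
        '"GET /robots.txt HTTP/1.0" 404 992')

match = PATTERN.match(line)
message = match.group(1)  # messageIndex="1": the entire line
stamp = match.group(2)    # dateIndex="2": the text between [ and ]

# The stamp carries a full date and a UTC offset, so neither
# currentDate nor currentYear is needed here.
parsed = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")

print(repr(stamp))
print(parsed.isoformat())
```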