freshmeat.net: Tutorials - Writing Syslog Messages to MySQL with rsyslogd

Tue, May 13th

12:29 PDT

[Article]

add comment

[Article]

Writing Syslog Messages to MySQL with rsyslogd
by Rainer Gerhards, in Tutorials - Sat, Sep 29th 2007 00:00 PDT

In this paper, I describe how to write syslog messages to a MySQL database. Having syslog messages in a database is often handy, especially when you intend to set up a frontend for viewing them. This paper describes an approach with rsyslogd, an alternative enhanced syslog daemon natively supporting MySQL. I describe the components that need to be installed and how to configure them.

Copyright notice: All reader-contributed material on freshmeat.net is the property and responsibility of its author; for reprint rights, please contact the author directly.

Background

In many cases, syslog data is simply written to text files. This approach has some advantages, most notably that it is very fast and efficient. However, data stored in text files is not readily accessible for realtime viewing and analysis. For that, the messages need to be in a database.

There are various ways to store syslog messages in a database. For example, some people have the syslogd write text files which are later fed via a separate script into the database. Others have written scripts which take the data (via a pipe) from a non-database-aware syslogd and store them as they appear. Others use database-aware syslogds and make them write the data directly to the database. In this paper, I use that "direct write" approach. I think it is superior because the syslogd itself knows the status of the database connection and thus can handle it intelligently (hopefully). I use rsyslogd to accomplish this, simply because I initiated the rsyslog project with database-awareness as one goal.

One word of caution: while message storage in a database provides an excellent foundation for interactive analysis, it comes at a cost. Database I/O is considerably slower than text file I/O. As such, directly writing to a database makes sense only if your message volume is low enough to allow the syslogd, the network, and the database server to catch up with it. Some time ago, I wrote a paper on optimizing syslog server performance. While this paper talks about Window-based solutions, the ideas in it are generic enough to apply here, too, so it might be worth reading if you anticipate medium-high to high traffic. If you anticipate really high traffic (or very large traffic spikes), you should seriously consider forgetting about direct database writes. In my opinion, such a situation needs either a very specialized system or a different approach (the text-file-to-database approach might work better for you in this case).

Overall System Setup

If you decide to use a client-server system, it would be a good idea to do it securely, and you may also be interested in my paper on SSL-encrypting syslog message transfers.

In this paper, I concentrate on the server side. If you are thinking about interactive syslog message review, you probably want to centralize syslog. In such a scenario, you have multiple machines (the so-called clients) send their data to a central machine (called "server" in this context). While I expect such a setup to be typical when you are interested in storing messages in the database, I do not describe how to set it up. This is beyond the scope of this paper. If you search a little, you will probably find many good descriptions on how to centralize syslog.

No matter how the messages arrive at the server, their processing is always the same, so you can use this paper in combination with any description for centralized syslog reporting.

As I said, I use rsyslogd on the server. It has intrinsic support for talking to MySQL databases. For obvious reasons, we also need an instance of MySQL running. To keep us focused, the setup of MySQL itself is also beyond the scope of this paper. I assume that you have successfully installed MySQL and also have a frontend at hand to work with it (for example, phpMyAdmin). Please make sure that this is installed and actually working, and that you have a basic understanding of how to handle it.

Setting up the system

You need to download and install rsyslogd first. Obtain it from the rsyslog site. Make sure that you disable stock syslogd, or you will experience some difficulties.

It is important to understand how rsyslogd talks to the database. In rsyslogd, there is the concept of "templates". Basically, a template is a string which includes some replacement characters, which are called "properties" in rsyslog. Properties are accessed via the "Property Replacer". You access properties by including their name between percent signs inside the template. For example, if the syslog message is "Test", the template "%msg%" would be expanded to "Test". Rsyslogd supports sending template text as a SQL statement to MySQL. As such, the template must be a valid SQL statement. There is no limit to what the statement might be, but there are some obvious and not-so-obvious choices. For example, a template with "drop table xxx" is possible, but does not make much sense. In practice, you will always use an "insert" statement inside the template.

An example: If you would just like to store the message part of the full syslog message, you have probably created a table "syslog" with a single column "message". In such a case, a good template would be "insert into syslog(message) values ('%msg%')". With the example above, that would be expanded to "insert into syslog(message) values('Test')". This expanded string is then sent to the database. It's that easy, there's no special magic. The only thing you must ensure is that your template expands to a proper SQL statement and that this statement matches your database design.

Does that mean you need to create database schema yourself and also must fully understand rsyslogd's properties? No, that's not needed. Because we anticipated that folks are probably more interested in getting things going instead of designing them from scratch, we provided a default schema as well as build-in support for it. This schema also offers an additional benefit: rsyslog is part of Adiscon's MonitorWare product line (which includes Open Source and closed source members). All of these tools share the same default schema and know how to operate on it. For this reason, the default schema is also called the "MonitorWare Schema". If you use it, you can simply add phpLogCon (a GPLed syslog Web interface) to your system and have instant interactive access to your database. So, there are some benefits to using the provided schema.

The schema definition is contained in the file "createDB.sql", which comes with the rsyslog package. Review it to check that the database name is acceptable to you. Be sure to leave the table and field names unmodified, or you will need to customize rsyslogd's default SQL template, which we do not do in this paper. Then, run the script with your favorite MySQL tool. Double-check that the table was successfully created.

Next, we need to tell rsyslogd to write data to the database. Since we'll use the default schema, we do not need to define a template for this. We can use the hardcoded one (rsyslogd handles the proper template linking), so all we need to do is add a simple selector line to /etc/rsyslog.conf:

*.*      >database-server,database-name,database-userid,database-password

In many cases, MySQL will run on the local machine. In this case, you can simply use "127.0.0.1" for database-server. This can be especially advisable if you do not need to expose MySQL to any process outside of the local machine. In this case, you can simply bind it to 127.0.0.1, which provides a quite secure setup. Of course, there's also support for remote MySQL instances. In that case, use the remote server name (e.g., mysql.example.com) or IP address. The default database-name is "syslog". If you have modified the default, use your name here. database-userid and -password are the credentials used to connect to the database. As they are stored in clear text in rsyslog.conf, that user should have only the least possible privileges. It is sufficient to grant it only INSERT privileges to the systemevents table. As a side note, it is strongly advisable to make the rsyslog.conf file readable by root only. If you make it world-readable, everyone could obtain the password (and, eventually, other vital information from it). In our example, let's assume you have created a MySQL user named "syslogwriter" with a password of "topsecret". If your MySQL database is on the local machine, your rsyslog.conf line might look like this:

*.*      >127.0.0.1,syslog,syslogwriter,topsecret

Save rsyslog.conf, restart rsyslogd, and you should see syslog messages being stored in the "systemevents" table.

The example line stores every message to the database. If you have a high traffic volume, you will probably want to limit the number of messages being logged. This is easy to accomplish. The "write database" action is just a regular selector line, so you can apply normal selector-line filtering. If, for example, you are only interested in messages from the mail subsystem, you can use the following selector line:

mail.*      >127.0.0.1,syslog,syslogwriter,topsecret

Review the rsyslog.conf documentation for details on selector lines and their filtering.

You have now completed everything necessary to store syslog messages to the MySQL database. If you would like to try out a frontend, you might want to look at phpLogCon, which displays syslog data in a browser. As of this writing, phpLogCon is not yet a powerful tool, but it's Open Source, so it might be a starting point for your own solution.

On Reliability...

rsyslogd writes syslog messages directly to the database. This implies that the database must be available at the time of message arrival. If the database is offline, no space is left, or something else goes wrong, rsyslogd can not write the database record. If rsyslogd is unable to store a message, it performs one retry. This is helpful if the database server was restarted. In this case, the previous connection was broken, but a reconnect immediately succeeds. However, if the database is down for an extended period of time, an immediate retry does not help. While rsyslogd could retry until it finally succeeds, that would have negative impact. Syslog messages keep coming in. If rsyslogd were busy retrying the database, it would not be able to process these messages. Ultimately, this would lead to loss of newly-arrived messages.

In most cases, rsyslogd is configured not only to write to the database, but to perform other actions as well. In the always-retry scenario, that would mean no other actions would be carried out, so the design of rsyslogd is limited to a single retry. If that does not succeed, the current message will not be written to the database, and the MySQL database writer will be suspended for a short period of time. Obviously, this leads to the loss of the current message as well as all messages received during the suspension period, but they are only lost in regard to the database (all other actions are correctly carried out). While not perfect, we consider this to be a better approach than the potential loss of all messages in all actions.

In short: Try to avoid database downtime if you do not want to experience message loss.

Please note that this restriction is not rsyslogd-specific. All approaches to realtime database storage share this problem area.

Conclusion

With minimal effort, you can use rsyslogd to write syslog messages to a MySQL database. Once the messages arrive there, you can interactively review and analyze them. In usual practice, the messages are also stored in text files for longer-term archiving, and the databases are emptied after some time (to avoid becoming too slow). If you expect an extremely high syslog message volume, storing it in realtime to the database may outperform your database server. In such cases, either filter out some messages or think about alternate approaches involving non-realtime database writing (beyond the scope of this paper).

The method outlined in this paper provides an easy-to-setup and easy-to-maintain solution for most use cases, especially with low- and medium-sized syslog message volume (or fast database servers).

Copyright

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license can be viewed at http://www.gnu.org/copyleft/fdl.html.

Author's bio:

Rainer Gerhards can be reached at rgerhards@hq.adiscon.com.

T-Shirts and Fame!

We're eager to find people interested in writing articles on software-related topics. We're flexible on length, style, and topic, so long as you know what you're talking about and back up your opinions with facts. Anyone who writes an article gets a t-shirt from ThinkGeek in addition to 15 minutes of fame. If you think you'd like to try your hand at it, let jeff.covey@freshmeat.net know what you'd like to write about.

[Comments are disabled]

Referenced categories

Topic :: Database
Topic :: System :: Logging

Referenced projects

MySQL - A fast SQL database server.
phpLogCon - A Web front end for syslog and other network events.
phpMyAdmin - A tool that handles the basic administration of MySQL over the Web.
rsyslog - An enhanced syslog daemon.

Comments

[»] syslogd's replacement
by Andreanes Yosef Vanderlee - Dec 11th 2007 00:05:25

I think with all the functionality in Rsyslog, someday it'll replace the current syslogd. Because the original syslogd have a very limited functionality. And capability to store its log to database is a must for me.

Though Rsyslog still need improvements with its performance. If you use it in a production environment, it'll takes for more than 10 seconds before storing it to database.

Thanks for the good software and article.

[reply] [top]

[»] Good article.....
by Da Beave - Nov 13th 2007 00:47:51

Very good article, as it made me aware of rsyslogd. It sounds like the the setup is pretty straight forward, however, there's nothing I can't already do with syslog-ng (which ships with many Linux based distros). I could argue I can do more with syslog-ng (FIFO pipes, inserts into SQL databases, with a mix of perl/C/FIFO real time monitoring). Basically, rsyslogd reminds me a lot of a short lived project "msyslogd" (which I used).

However, don't get me wrong.... I did enjoy the article and it's always nice to know alternatives out there.

[reply] [top]

[»] Re: Good article.....
by Rainer Gerhards - Nov 27th 2007 02:55:41

> Very good article, as it made me aware

> of rsyslogd. It sounds like the the

> setup is pretty straight forward,

> however, there's nothing I can't

> already do with syslog-ng (which ships

> with many Linux based distros). I

> could argue I can do more with syslog-ng

> (FIFO pipes, inserts into SQL

> databases, with a mix of perl/C/FIFO

> real time monitoring). Basically,

> rsyslogd reminds me a lot of a short

> lived project "msyslogd"

> (which I used).

>

> However, don't get me wrong.... I

> did enjoy the article and it's always

> nice to know alternatives out there.

Hi, I was a bit on the road, thus I follow up somewhat late. I fully understand your concerns and I am aware of it. Rsyslog is there for quite a while and (of course) I think it won't go away that fast. But I think the big question is "why another syslogd?" - I have blogged about this not so long ago, and you may like to have a look at the answer:

http://rgerhards.blogspot.com/2007/08/why-does-world-need-another-syslogd.html

I would also like to mention that rsyslog has just become part of Fedora 8, so we are moving on. We already have the capability to work with backup destination (e.g. if one database server fails, try the next one) and there are other features syslog-ng does not yet offer. GSS-API has just been added, will be released soon. So I sincerely hope this will become a well-alive and useful project. Of course, there will always be differences to syslog-ng, as we do not to create a clone - why should that be useful... We try to do things a bit differently and hopefully hit the sweet pot ;)

Thanks again, Rainer

[reply] [top]