Logs are one thing that I think are severely underutilized by most systems administrators. Most of us have taken the first step by actually logging the data, but neglect organizing it into any sort of manageable form. You’ll probably argue that any hardcore *nix admin would be able to take the raw logs using grep, cut, awk, and a handful of other *nix power tools and turn it into consumable information, but that’ll only get you so far.

Several months ago we evaluated a bunch of log management solutions with several goals in mind. We wanted a solution that was agile enough to be able to take in a wide variety of log formats as well as configuration files. It needed to shield sensitive information (passwords, credit card information, etc) from unauthorized users. It needed to provide us with a customizable interface where we could report on all of the log data it gathered. Lastly, it needed to provide our customers (developers) with the ability to self-service their own log files. After evaluating most of the major players in the log management arena, we found our ideal solution in a product called Splunk.

The first thing I noticed when evaluating Splunk was that they’re not like everyone else. They’re not trying to sell you some sort of logging appliance and they offer their software free for customers with 100 MB/day or less worth of logging. Getting Splunk installed was a breeze. You can have it up and running in minutes. It truly is Log Management for Dummies in that respect, but under the hood there is a highly configurable and customizable tool with an API that you could use to write your own applications to examine log files.

At this point I’ve mucked around with Splunk for a few months and our configuration is pretty intense. I’ve added in custom indexes to make my custom dashboards load faster. I’ve set Splunk up to create queryable metadata fields based on information in the logs. I’ve added filters for custom timestamps and auditing so we can tell if a log file has been modified. I’ve even set up a “deployment server” to distribute Splunk’s configuration bundles to my various types of servers. This brings me to the one drawback of Splunk: Upgrading. Rumor has it that they are working on making it easier to upgrade from one version to the next, but for the time being it involves logging in to each server, stopping Splunk, upgrading the files, and restarting Splunk again. If you only had to upgrade every once in a while it would be fine, but they maintain a very active development team so I find myself constantly wanting to upgrade to get the latest bug fixes and features.

Other than that, Splunk does exactly what I tell it to do. It grabs all of our logs and presents them in a single intuitive interface. Think of it as a search engine for log and configuration files. Then, once I have the log data in front of me, I can create custom reports based on that data. If I want to, I can even alert based on information Splunk finds in my logs (send an e-mail to a developer every time their application throws an error message). Oh, did I mention that Splunk has a PCI Dashboard that you can install for free? Ask those other guys how much they charge for their PCI solution.

The next time you have some free time be sure to download Splunk and install it on one of your development servers. You won’t be disappointed.