Making sense of your logs with Logstash 1.2.2 and Kibana 3

The latest release of Logstash, v1.2.2, now ships with Kibana 3! An awesome combination to get your logs parsed, analyzed and visualized – for free!

What’s in the Jar?

By default, all you get is a JAR file to run, so you'll need Java installed (any JDK will do; I'd suggest the Sun/Oracle JDK as it's generally a bit faster, but OpenJDK will do just fine for now). Kibana is not a separate piece of software to install – it's all part of the Logstash JAR that you download from the above link.

Logstash uses some other open source software that you will need to install:

  • Elasticsearch
  • Redis

Elasticsearch comes embedded within Logstash; however, I strongly suggest running standalone Elasticsearch and Redis, using Redis as a "log queue" to ship log data in from other hosts (Redis is lightweight and very, very fast).
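
As a rough sketch of that setup (the host names and the Redis key below are assumptions; the Puppet module in the next section configures the real values), each shipping host pushes events into Redis and the indexer pulls them back out:

# shipper.conf on each client host - push events onto the Redis "log queue"
output {
  redis {
    host => "logstash-indexer.example.com"
    data_type => "list"
    key => "logstash"
  }
}

# indexer.conf on the indexer host - pull events off the queue
input {
  redis {
    host => "127.0.0.1"
    data_type => "list"
    key => "logstash"
  }
}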

Installation (via Puppet)

To make things easy for everyone, I've created a Puppet module you can use to set up a proof of concept in about 5 minutes… It will set up everything (Elasticsearch, Redis, Logstash indexer + web) on a single host: https://github.com/alex-leonhardt/logstash (any issues, please open a bug on GitHub).

Run puppet apply once or twice on the host you want to be the "logstash indexer" to ensure all services are up and running; you should then be able to get to the dashboard by going to:

http://host:9292/

You'll be greeted with the option to create either a blank dashboard or a minimal dashboard that will show a histogram of all logs received, and so on.

Important:
Ensure that you change the index to be per day – it will make searching the logs more efficient. To do this, follow these steps:

Dashboard -> click "Configure Dashboard" in the top right corner -> select the "Index" tab -> set "Timestamping" to "day" and accept the suggested format -> click "Close" (it's saved automatically).
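
For reference, the daily index names Kibana matches against come from the Logstash side. The indexer's output section looks roughly like this (the embedded flag reflects the single-host proof of concept, and the daily index pattern shown is the usual Logstash default):

output {
  elasticsearch {
    # run Elasticsearch embedded in the Logstash JVM (single-host PoC)
    embedded => true
    # one index per day - this is what Kibana's "day" timestamping expects
    index => "logstash-%{+YYYY.MM.dd}"
  }
}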

So now that you’re successfully indexing your log data (at least on the host you just installed using the puppet module above), it’s time to add some filters to parse your logs and make sense of the data that is being sent to logstash.

Filters / log parsing

Filters in Logstash come in a large variety of types; grok patterns are a very popular choice and work very well. Fortunately, there are a lot of pre-configured grok patterns we can use here:
https://github.com/logstash/logstash/tree/master/patterns
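
A pattern file is simply one named pattern per line, and patterns can be composed from other patterns; the combined Apache pattern, for example, is defined along these lines:

COMBINEDAPACHELOG %{COMMONAPACHELOG} %{QS:referrer} %{QS:agent}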

Put those files into /opt/logstash/patterns/ (you may have to create that directory; it doesn't come with the proof-of-concept module yet). Once done, update the file /opt/logstash/indexer.conf with the following:

filter {
  if [type] == "apache" {
    grok {
      patterns_dir => [ "/opt/logstash/patterns" ]
      match => [ "message", "%{COMBINEDAPACHELOG}" ]
    }
  }
}

The above filter identifies logs received by logstash that are tagged as type “apache”. It will parse these log messages using the existing Grok pattern for combined Apache logs.
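
The type itself is set on the shipper side. A minimal sketch of a file input that would tag Apache access logs accordingly (the log path here is an assumption; adjust it to your distribution):

input {
  file {
    # tag these events so the indexer's "apache" filter picks them up
    type => "apache"
    path => [ "/var/log/apache2/access.log" ]
  }
}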

You should take a look at indexer.conf and shipper.conf. You can (and should) edit the module's ERB templates to include whatever other log files you want sent to the Logstash host; by default it will send /var/log/*log and /var/log/messages. Restart Logstash on the client nodes and voilà – the logs will magically appear in the Kibana dashboard a few seconds later, readily indexed by Elasticsearch.

[Screenshot: logstash Kibana dashboard]

If you need more examples or help setting things up, try the #logstash IRC channel on Freenode – very helpful and friendly people there!

When it all works …

Once your business/manager/team has agreed to spend some resources on this, you should use the existing Puppet modules that install each of the necessary parts separately, which makes it easy to scale as and when needed. There are public modules for Elasticsearch, Logstash and Redis on Puppet Forge and GitHub.


9 thoughts on "Making sense of your logs with Logstash 1.2.2 and Kibana 3"

  1. Alex,

    Can you share your Kibana dashboard schema with us? I see that you are using some filter to graph times, and in your post you use Apache combined logs; I'd like to understand how to create a similar graph from my Apache logs too.

    Thanks
    Alejandro
    From Argentina

  2. Hi Alejandro! Que tal! :)

    Unfortunately, the COMBINEDAPACHELOG was used for illustration purposes; the screenshot of the dashboard is in fact using different logs. However, the combined Apache log should give you the processing time for each request (if not, you can amend the log format to add it). You will then simply need to match that field in the search bar with something similar to apache_process_time:[0 TO 5] – this should highlight all requests that took between 0s and 5s. With the logged process time, you will also be able to add the Min, Mean and Max timing graphs: simply choose the standard "Histogram" panel, change the mode to "min" and choose the appropriate field that contains your timing (e.g. apache_process_time).

    Please note that the above field names are picked at random and won't necessarily reflect the real field names available to you via Logstash/Kibana.

    Hope this helps!

    Regards,
    Alex

  3. Hi Alex,

    What would I need to do to have ES and Kibana run from different IPs? Right now I have Redis on one server and ES on another. I am using the embedded Kibana in Logstash 1.2.2 and I am running it from the same IP as ES.

    Thanks,

    M

  4. Hi Alex,

    Nice little overview.

    Did you have to change your ES mappings in order to get your range queries to work? (I’m having some issues here)
    As far as I understand, fields can be misinterpreted as strings, which will lead to range queries being done alphabetically instead of on the numeric value.

    Thanks,
    Derk

    • Hi Derk,

      I ran into the same issues. To fix it, you'll have to change your grok pattern to use something like

      %{NUMBER:responsetime:float}

      This will make sure that ES indexes it as a number, not as a string. The limitation is that currently only int and float are supported, but I think that'll still be "enough" for the time being ;)
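
      For context, here is roughly how such a typed capture sits inside a grok filter (the field names and the surrounding pattern are made up for illustration):

      filter {
        grok {
          # ":float" makes Logstash hand the value to ES as a number, not a string
          match => [ "message", "%{IP:client} %{NUMBER:responsetime:float} %{GREEDYDATA:rest}" ]
        }
      }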

      Also, you will have to delete your old indexes from at least the day you’re fixing the pattern, or the last hour, or whichever time frame you prefer. I got rid of the day’s logs as it was a test instance – in production, I’d wait until the next day when a new index is being created.

      Hope it helps!
      Alex

  5. Hi Alex Leonhardt:
    A bit confused about using "COMBINEDAPACHELOG", and hoping you might be able to shed a little light on the subject.

    We have a special Log Format in our Logs… example follows:
    { "@timestamp": "2014-02-24T18:12:40+0000", "@fields": { "client": "192.168.1.131", "duration_usec": 15105, "status": 200, "request": "/index.php", "method": "HEAD", "referrer": "-", "True-Client-IP": "-", "X-Forwarded-For": "-", "Host": "192.168.2.35:80", "Via": "-", "connection_status": "-", "Time-Taken-To-Serve-Req-sec": "-" } }

    But I cannot use this "filter" to pull it into our Elasticsearch (1.0):
    filter {
      grok {
        match => [ "message", "%{COMBINEDAPACHELOG}" ]
      }
    }

    Do I do something like this to get the rest of the 'fields' into the 'elasticsearch DB'?
    filter {
      grok {
        match => [ "message", "%{COMBINEDAPACHELOG}" ]
        add_field => [ "True-Client-IP", "%{True-Client-IP}" ]
      }
    }
    Appreciate any response(s) that you or another member of the community may have.

    • Hi,

      When using your own log format, you may use COMBINEDAPACHELOG as a guide, but I'd suggest going to http://grokdebug.herokuapp.com/ to try and match your logs using a custom pattern – COMBINEDAPACHELOG won't work for you here, as the fields may be arranged differently, additional fields won't be recognised, etc.

      You would then use something like this:

      if [type] == "custom_apache_log" {
        grok {
          patterns_dir => "./patterns"
          match => [ "message", "grokpatternthatyouworkedoutgoeshere" ]
        }
      }

      Hope it helps!
      Alex
