This post describes how you can extend the Graylog2 AlertCondition mechanism to raise alerts based on the contents of a field (instead of just a message count or a numeric field threshold).

Introduction

Currently Graylog2 only allows you to trigger AlarmCallbacks based on either

  • MESSAGE_COUNT
    Here you define how many messages must be assigned to that stream before an alarm will be triggered.

  • FIELD_VALUE
    In this scenario you define a numeric threshold that must be reached before an alarm will be triggered.

So out of the box it is not possible to check for the existence of a field or for an exact field value (neither numeric nor string). A custom filter plugin could check for such conditions, but a filter plugin cannot itself trigger an alert (as far as I know).

Implementation

However, there is a workaround:

  1. Implement a filter that runs after the StreamMatcherFilter filter (i.e. with a priority higher than 40).
    Within that filter we can perform whatever logic we need (such as checking for specific field content).

    We can implement the alarm-check logic via the plugin described in Creating a Graylog2 Filter Plugin; we only have to add a stream check inside the JavaScript code and add the respective field. The filter would then execute a script similar to this:

    // Filter script part 1
    
    var streams = message.getStreams();
    var astreams = streams.toArray();
    for(var c = 0; c < astreams.length; c++)
    {
      var stream = astreams[c];
      // skip all streams except the one we want to alert on
      if("routerStatus" != stream.title) 
      {
        continue;
      }
      // only continue when the field has exactly the content we look for
      var routerStatusMessage = message.getField("ROUTER_STATUS_MESSAGE");
      if("DOWN" != routerStatusMessage)
      {
        continue;
      }
      // ...
    }
    

    Note: checking for the stream id (instead of the title) might be more efficient.

  2. Add a custom field to the message with an arbitrary value
    This field (with a value of 1) is evaluated later by the built-in FIELD_VALUE AlertCondition feature.

    // Filter script part 2
    
      // ...
      message.addField("DF_STREAM_ALERT_" + stream.title, 1);
      break;
    }
    

  3. Create a regular AlertCondition
    This alert will check the field from the previous step and trigger an alarm when the value is larger than 0.

    Alert is triggered when the field DF_STREAM_ALERT_routerStatus 
    has a higher min value than 0 in the last 1 minutes. Grace 
    period: 0 minutes. Including last 5 messages in alert notification. 
    

  4. (Optional) Add a custom AlarmCallback
    Set up an AlarmCallback like biz.dfch.j.graylog2.plugin.alarm.execscript to further process the actual alarm (such as submitting another message into Graylog2 or sending out some custom REST request).
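
The two script parts from steps 1 and 2 can be read as one function, roughly as sketched below. This is only an illustration: the function name `markAlert` and the stub `createStubMessage` are hypothetical and exist solely so that the snippet runs outside of Graylog2. Inside the plugin the real `message` object is supplied by the filter, and only `getStreams()`, `toArray()`, `getField()`, `addField()` and `stream.title` are taken from the script above.

```javascript
// Sketch only: markAlert is a hypothetical helper, not part of Graylog2.
// It adds the marker field when the message belongs to the given stream
// and the given field has exactly the expected content.
function markAlert(message, streamTitle, fieldName, expectedValue) {
  var astreams = message.getStreams().toArray();
  for (var c = 0; c < astreams.length; c++) {
    var stream = astreams[c];
    if (streamTitle != stream.title) {
      continue; // not the stream we want to alert on
    }
    if (expectedValue != message.getField(fieldName)) {
      continue; // field content does not match
    }
    // marker field that the FIELD_VALUE AlertCondition checks later
    message.addField("DF_STREAM_ALERT_" + stream.title, 1);
    return true;
  }
  return false;
}

// Hypothetical stand-in for the message object, for trying the logic
// outside of Graylog2. It only mimics the calls used above.
function createStubMessage(streamTitles, fields) {
  var streams = [];
  for (var i = 0; i < streamTitles.length; i++) {
    streams.push({ title: streamTitles[i] });
  }
  return {
    fields: fields,
    getStreams: function () {
      return { toArray: function () { return streams; } };
    },
    getField: function (name) { return fields[name]; },
    addField: function (name, value) { fields[name] = value; }
  };
}

var down = createStubMessage(["routerStatus"], { ROUTER_STATUS_MESSAGE: "DOWN" });
markAlert(down, "routerStatus", "ROUTER_STATUS_MESSAGE", "DOWN");
```

After the call, the stub message carries the field `DF_STREAM_ALERT_routerStatus` with a value of 1, which is what the AlertCondition from step 3 evaluates.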

Notes

  • This is certainly a workaround, as the product itself does not support this.

  • Of course, script processing will not achieve very high performance, but for testing this should be fine.

  • Checking for the stream id instead of the stream title should improve performance as well.

  • Instead of combining a filter with an AlertCondition and an AlarmCallback, all of this could be done inside the filter itself (with asynchronous threads). However, you would then no longer see triggered alerts in the UI.

  • Another approach could be to assign the message to a (specific alarm) stream instead of creating a custom field. From that stream we could treat every message as an alarm and process the message via an output filter.

  • Adding these custom fields to a message has the advantage that you can later still identify messages that caused alarms (whereas the normal alarms only live in memory of that specific node).
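
The stream id check mentioned in the notes could look roughly like the sketch below. It assumes that the stream object exposes an `id` property in the same way it exposes `title`; I have not verified this against the plugin API. The id value, the function `isInStream` and the stub `createStubMessageWithIds` are hypothetical and only serve to make the snippet runnable outside of Graylog2.

```javascript
// Hypothetical placeholder; replace it with the id of your own stream.
var ROUTER_STATUS_STREAM_ID = "000000000000000000000000";

// Assumption: each stream object exposes an `id` property,
// just as it exposes `title` in the filter script.
function isInStream(message, streamId) {
  var astreams = message.getStreams().toArray();
  for (var c = 0; c < astreams.length; c++) {
    if (streamId == astreams[c].id) {
      return true;
    }
  }
  return false;
}

// Hypothetical stand-in for the message object, for trying the check
// outside of Graylog2. It only mimics getStreams().toArray().
function createStubMessageWithIds(streamIds) {
  var streams = [];
  for (var i = 0; i < streamIds.length; i++) {
    streams.push({ id: streamIds[i] });
  }
  return {
    getStreams: function () {
      return { toArray: function () { return streams; } };
    }
  };
}

var matched = isInStream(
  createStubMessageWithIds([ROUTER_STATUS_STREAM_ID]),
  ROUTER_STATUS_STREAM_ID
);
```

The marker field name can still be built from the stream title, so the AlertCondition itself does not need to change.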
