[NoBrainer] Sending Gelf messages to Graylog2 via PowerShell

Recently I was testing Graylog2 as a store for metering data and had to bulk-load a large amount of data into its database. My goal was to create synthetic, randomised metric information for a number of virtual servers (5'000) over a period of one year at a sampling interval of 5s. This makes a total of 31'536'000'000 (= 365 * 24 * 60 * (60/5) * 5000), or roughly 31.5 billion, entries. Inserting these records manually was clearly something to avoid. Instead of using curl as described in the Gelf Format Specification I thought I would give PowerShell a try…
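The arithmetic is easy to double-check in PowerShell itself. This is only a quick sketch; the variable names are made up for illustration, and the `[long]` cast is there so the result stays a 64-bit integer.

```powershell
# Figures taken from the text: 5'000 servers, one sample every 5 seconds, one year.
$servers          = 5000
$samplesPerMinute = 60 / 5     # 12 samples per minute

# Cast the first operand to [long] so the product is computed as a 64-bit integer.
[long]$totalEntries = [long]365 * 24 * 60 * $samplesPerMinute * $servers

# Print with thousands separators (separator character depends on the current culture).
'{0:N0} entries' -f $totalEntries
```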

After setting up the respective Gelf HTTP Input and starting to insert messages via Invoke-RestMethod (see Insert-GelfHttp.ps1) I noticed that this was way too slow. There had to be something faster than performing an HTTP POST for every single metric. So I switched to Gelf via UDP (using System.Net.Sockets.Socket, see Insert-GelfUdp.ps1). However, it soon became clear that running a Graylog2 docker instance within a virtualised CentOS 7 was overloading the VMware NAT stack on my Windows notebook (which lost loads of packets). So I finally settled on the Gelf TCP Input (see Insert-GelfTcp.ps1) and was eventually able to insert my test data (though it still took quite a while, i.e. days) …
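To give an idea of what the HTTP variant involves, here is a minimal sketch of building a Gelf message and posting it with Invoke-RestMethod. It is not the code from Insert-GelfHttp.ps1. The function names `New-GelfMessage` and `Send-GelfHttp`, the host `graylog.example.org`, the metric field and the port/path `12201/gelf` are assumptions for illustration; adjust them to your own input configuration.

```powershell
# Hypothetical helper: builds a GELF 1.1 payload as a compact JSON string.
function New-GelfMessage {
    param(
        [Parameter(Mandatory = $true)][string]$HostName,
        [Parameter(Mandatory = $true)][string]$ShortMessage,
        [datetime]$Timestamp = (Get-Date),
        [hashtable]$AdditionalFields = @{}
    )
    $epoch = [datetime]::SpecifyKind([datetime]'1970-01-01', [System.DateTimeKind]::Utc)
    $message = [ordered]@{
        version       = '1.1'
        host          = $HostName
        short_message = $ShortMessage
        # GELF expects seconds since the Unix epoch (decimals are allowed).
        timestamp     = [math]::Round(($Timestamp.ToUniversalTime() - $epoch).TotalSeconds, 3)
    }
    # Additional (custom) fields carry a leading underscore in GELF.
    foreach ($key in $AdditionalFields.Keys) {
        $message["_$key"] = $AdditionalFields[$key]
    }
    return ($message | ConvertTo-Json -Compress)
}

# Hypothetical helper: one HTTP POST per message, which is what makes this variant slow.
function Send-GelfHttp {
    param(
        [Parameter(Mandatory = $true)][string]$Uri,
        [Parameter(Mandatory = $true)][string]$Json
    )
    Invoke-RestMethod -Uri $Uri -Method Post -Body $Json -ContentType 'application/json'
}

# Example payload for one synthetic sample (server name and metric are made up).
$sampleTime = [datetime]::SpecifyKind([datetime]'2015-01-01', [System.DateTimeKind]::Utc)
$json = New-GelfMessage -HostName 'vm0001' -ShortMessage 'cpu sample' `
    -Timestamp $sampleTime -AdditionalFields @{ cpu_percent = 42 }

# Uncomment to actually send (assumed host, port and path):
# Send-GelfHttp -Uri 'http://graylog.example.org:12201/gelf' -Json $json
```

Every call pays for a full HTTP request and response, so with millions of samples the per-message overhead dominates quickly.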

Bulk loading data into Graylog2

Displaying metering data in Graylog2

You can find the complete source at our Gist or at the end of this Post.

Note

  • While inserting data for past dates I first fooled myself, as I was not able to see anything in the Graylog2 search dialogue (because it only shows the last couple of minutes by default). I had to switch to “show all messages” to actually see all my inserted data.

  • When searching for messages before or after a specific timestamp you have to supply the timestamp in Epoch milliseconds. This is counter-intuitive, as the timestamp in the Gelf messages is in Epoch seconds. See the discussion at GELF spec 1.2.

  • When using the current Graylog2 docker instance (id 387b6e81d77d) the container will not be able to start again once you have stopped it. See Docker container cannot restart services after being stopped for a possible workaround.

  • When sending messages via Gelf TCP or UDP you might have to increase the receive buffer size or add client-side throttling, otherwise you might lose some of them.

  • Keep in mind that you have to terminate each message with a delimiter (CRLF or a 00 byte) when inserting messages via TCP or UDP (depending on your input configuration).

  • When repeatedly opening and closing sockets to the server your client may quickly run out of ephemeral (client) ports unless you re-use sockets.

  • Not closing connections to the server may exhaust the server as well.

  • I had to increase ProcessBuffers and heap size (of the master cache) significantly, as my loader script was much faster than Graylog2/Elasticsearch (leaving millions of messages in the backlog).
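Several of the notes above (Epoch seconds versus milliseconds, the message delimiter, re-using one connection, client-side throttling) can be sketched in a few lines. This is a rough illustration and not the code from Insert-GelfTcp.ps1. All function names are hypothetical, port 12201 is an assumed input port, and the default 00 byte delimiter must match your input configuration (pass `-Delimiter 13,10` for CRLF).

```powershell
$UnixEpoch = [datetime]::SpecifyKind([datetime]'1970-01-01', [System.DateTimeKind]::Utc)

# Epoch seconds: the unit used for the timestamp inside a Gelf message.
function ConvertTo-EpochSeconds {
    param([Parameter(Mandatory = $true)][datetime]$Date)
    [long][math]::Floor(($Date.ToUniversalTime() - $UnixEpoch).TotalSeconds)
}

# Epoch milliseconds: the unit needed when searching by timestamp (see note above).
function ConvertTo-EpochMilliseconds {
    param([Parameter(Mandatory = $true)][datetime]$Date)
    [long][math]::Floor(($Date.ToUniversalTime() - $UnixEpoch).TotalMilliseconds)
}

# Encodes one JSON message as UTF-8 and appends the delimiter (default: a single 00 byte).
function ConvertTo-GelfFrame {
    param(
        [Parameter(Mandatory = $true)][string]$Json,
        [byte[]]$Delimiter = @(0)
    )
    $payload = [System.Text.Encoding]::UTF8.GetBytes($Json)
    # The unary comma keeps PowerShell from unrolling the byte array on output.
    return ,([byte[]]($payload + $Delimiter))
}

# Sends all messages over ONE connection instead of opening a socket per message,
# which avoids burning through ephemeral client ports.
function Send-GelfTcp {
    param(
        [Parameter(Mandatory = $true)][string]$Server,
        [int]$Port = 12201,                 # assumed port of the Gelf TCP input
        [Parameter(Mandatory = $true)][string[]]$Messages,
        [int]$ThrottleMilliseconds = 0      # optional client-side throttling
    )
    $client = New-Object System.Net.Sockets.TcpClient
    try {
        $client.Connect($Server, $Port)
        $stream = $client.GetStream()
        foreach ($json in $Messages) {
            $frame = ConvertTo-GelfFrame -Json $json
            $stream.Write($frame, 0, $frame.Length)
            if ($ThrottleMilliseconds -gt 0) {
                Start-Sleep -Milliseconds $ThrottleMilliseconds
            }
        }
        $stream.Flush()
    }
    finally {
        # Always close the connection so the server is not left with stale sockets.
        $client.Close()
    }
}

# Uncomment to send (assumed host name):
# Send-GelfTcp -Server 'graylog.example.org' -Messages @('{"version":"1.1","host":"vm0001","short_message":"cpu sample"}')
```

Keeping the connection open for a whole batch and closing it in `finally` addresses both the client-side port exhaustion and the server-side connection build-up mentioned above.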

Scripts

If you cannot see the script code below you can view it directly at our Gist.
