
Logstash “FILTER” in Action

This article is more practical than theoretical, so I hope you enjoy it as we go through things step by step. As we already covered in one of the ELK posts, Logstash supports many message transformation plugins inside its filter{} section. They help to transform differently formatted messages into JSON structured data.
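To picture where filter{} sits, here is a minimal pipeline skeleton (the file path and plugins below are just placeholders, not a working setup for your environment):

input {
        # read events line by line from a log file (path is only an example)
        file { path => "/var/log/app/app.log" }
}

filter {
        # message transformation plugins (grok, date, mutate, ...) go here
}

output {
        # print every transformed event to the console for testing
        stdout { codec => rubydebug }
}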

Let's get started by discovering the "GROK" plugin.

GROK:

For the first example, let's consider the following message.

<Aug 6, 2018 7:01:30 PM EST5EDT> <Warning> <ALSB Logging> <bo3xxxxxxxxx01.example.com> <MS5> <[ACTIVE] ExecuteThread: '14' for queue: 'weblogic.kernel.Default (self-tuning)'>

Now, before anything else, keep this in mind.

  • Every line that is input to INPUT{} is referred to as an "event", and the moment the message is read by the INPUT it becomes JSON (key-value pairs), as below.
"message" : "<Aug 6, 2018 7:01:30 PM EST5EDT> <Warning> <ALSB Logging> <bo3uslifxosb05.pearsoncmg.com> <MS5> <[ACTIVE] ExecuteThread: '14' for queue: 'weblogic.kernel.Default (self-tuning)'>"

 

Alright, now let's ask our FILTER{} to extract more FIELD values by parsing the message with the GROK plugin.

filter {
        grok {
              match => ["message", "<%{DATA:timestamp}> <%{WORD:logLevel}> ?%{GREEDYDATA:msg}"]
        }
}

As you can see, the message is now segregated into three FIELDs:

  • timestamp => Aug 6, 2018 7:01:30 PM EST5EDT
  • logLevel => Warning
  • msg => <ALSB Logging> <bo3xxxxxxxxx01.example.com> <MS5> <[ACTIVE] ExecuteThread: '14' for queue: 'weblogic.kernel.Default (self-tuning)'>

What happens here is that when Logstash starts and the event is parsed by the GROK plugin, it outputs the three fields with their corresponding values. In other words, this is now JSON structured data.

I am going to add a couple of grok patterns for different types of messages for your reference. Also, to check the validity of a written grok pattern, there are online grok debuggers available; one I could recommend is this one.

  • It’s important to note that the GROK pattern has to be adjusted to the format of the message being input; otherwise grok tags the event with _grokparsefailure whenever an unknown message format is processed (a quick way to catch such events is sketched right after this note). The following examples list a couple of log formats & their corresponding grok configurations for ease of understanding.
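Before moving on to the examples, here is a small sketch of one common way to surface events that did not match: grok automatically adds the _grokparsefailure tag, so you can check for it with a conditional (shown here on the output side, but it works in filter{} too):

output {
        # only print events that grok failed to parse
        if "_grokparsefailure" in [tags] {
                stdout { codec => rubydebug }
        }
}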

Message:

2018-06-16 11:38:40,056 ERROR [quartz.core.ErrorLogger] [org.xyz.solr.xyzcoCoreAdminHandler@5b38dd47_Worker-2] Job (Solr.CoreWatcher threw an exception

grok {
match => ["message", "%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{LOGLEVEL:loglevel} \[%{DATA:class}\] %{GREEDYDATA:class}"]
}
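Assuming the pattern above, the fields extracted from this message should come out roughly as:

     "timestamp" => "2018-06-16 11:38:40,056",
      "loglevel" => "ERROR",
         "class" => "quartz.core.ErrorLogger",
           "msg" => "[org.xyz.solr.xyzcoCoreAdminHandler@5b38dd47_Worker-2] Job (Solr.CoreWatcher threw an exception"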

 

Message:

07/17/2018 at 04:11:16 AM\nSite ID: \"154227\"\nSite Abbrev: \"\"\nSite ID: \"154227\"\nAuthenticating loginname et1qa_edu1 and password XXXXX from PI database.

grok {
match => ["message", "(?<origindate>%{DATE_US} [a-z]{2} %{TIME} %{SPACE}[A-Z]{2})\\n%{GREEDYDATA:msg}"]
}

 

Message:

2018-08-15 06:35:12 WARN xxxConsumerServiceBean:52 - [SxxxxxRxxxxx][018a356b-898f-463b-b963-e4c985d548ef][][PSREVENTS_20404]: NOT FOUND :5b73c911e4b0c801193c745e

grok {
match => ["message", "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:loglevel} %{SPACE} (?<origindate>[a-zA-Z]*:[0-9]*) - %{GREEDYDATA:msg}"]
}

 

Message:

requestMethod: [GET] , requestQuery: [null], errors: [{timestamp=Thu Aug 23 09:27:30 UTC 2018, status=404, error=Not Found, message=No message available, path=/}]

grok {
match => ["message", "(?<timestamp>((?<=\=).+.* [0-9]{4}))"]
}
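This pattern only cares about the embedded timestamp; applied to the message above it should capture roughly:

     "timestamp" => "Thu Aug 23 09:27:30 UTC 2018"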

 

DATE:

If you carefully check the GROK filters above, they always produce key-value pairs, a "timestamp" for example. The next important outcome of these filters is that those key-value pairs are available for further manipulation via another set of plugins. Let's look at an example of the "date" plugin, which parses the string found in the timestamp field into a date format that Elasticsearch understands.

Time:
2018-06-16 00:00:20,060

filter {

       grok { ..... match pattern here ...... }

       date {
             match => [ "timestamp", "yyyy-MM-dd HH:mm:ss,SSS Z", "MMM dd, yyyy HH:mm:ss a" ]
             timezone => “UTC”
       }
}
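By default the date plugin writes the parsed result into the @timestamp field, so for the time above (and with the timezone forced to UTC as configured) the event should end up with something like:

    "@timestamp" => 2018-06-16T00:00:20.060Z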

 

Time:
Aug 6, 2018 9:00:00 PM EST5EDT

filter { 
        grok { ..... match pattern here ...... } 
        date { 
              match => [ "timestamp", "MMM d, YYYY K:mm:ss aa ZZZ"] 
        }
}

 

Time:
Aug 6, 2018 9:00:00 PM EST5EDT
Aug 16, 2018 12:40:00 PM EST5EDT

date { 
      match => [ "timestamp", "MMM d, YYYY K:mm:ss aa ZZZ"
                               "MMM d, YYYY h:mm:ss aa ZZZ"] 
}

 

I hope you now have a clear idea about how these filters are called in Logstash to manipulate messages into JSON structured data. As I mentioned in my previous post, there are lots of plugins that support many different message transformation use cases. You can find them on the official Logstash page here.

 

Before wrapping up this post, I will show you below how message transformation outputs valid JSON. For startup & basic configuration, please see my previous Logstash documentation here.

Message to be worked on:

2018-06-16 11:38:40,056  ERROR [quartz.core.ErrorLogger] [org.xxxx.solr.xxxxAdminHandler@5b38dd47_Worker-2] Job (Solr.CoreWatcher threw an exception

 

Logstash configuration used:

input {
	file {
		path => "/home/testuser/test.log"
	}
}

filter {

        grok {
		match => {"message" => "%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{LOGLEVEL:loglevel} \[%{GREEDYDATA:msg}\]"}
        }
        
        date {
                match => [ "timestamp" , "ISO8601" ]
        }
        
        mutate { remove_field => [ "message" ] }
}


output {
	stdout { codec => rubydebug }
}

If you note, in the OUTPUT{} section I didn't use an Elasticsearch node IP; instead I marked it as standard output, which tells Logstash to simply show every message it outputs in its console (an example of an Elasticsearch output is shown after the sample output below).

Logstash Output:

{
           "msg" => "quartz.core.ErrorLogger] [org.xxxx.solr.xxxxAdminHandler@5b38dd47_Worker-2",
          "path" => "/home/testuser/test.log",
    "@timestamp" => 2018-06-16T06:08:40.056Z,
      "loglevel" => "ERROR",
      "@version" => "1",
          "host" => "archlinux",
     "timestamp" => "2018-06-16 11:38:40,056"
}

It’s perfectly valid JSON.
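As noted above, the stdout output was used here purely for testing. In a real pipeline you would typically point the output{} section at an Elasticsearch node instead, which would look roughly like this (the host and index name below are placeholders, not values from this setup):

output {
	elasticsearch {
		hosts => ["http://<elasticsearch-node-ip>:9200"]
		index => "test-logs-%{+YYYY.MM.dd}"
	}
}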

“Hope you can now start building your own Logstash environment for message transformation”
