
Log Analysis - Logstash Filter Plugin

Technical Blog Post


Abstract

Log Analysis - Logstash Filter Plugin

Body


If you are just getting started with Logstash, you may want to first read the previous post =>

/support/pages/node/1081509

 

There are basically three "sections" (plugin types) in a Logstash configuration - input, filter and output.

This blog entry will talk about the "filter" plugins =>

https://www.elastic.co/guide/en/logstash/2.2/filter-plugins.html

 

Technically speaking, it is almost impossible to tell you exactly what to do in the "filter" section, because it is where data manipulation happens. It is entirely up to you how you want to massage the data. Of course, if you are perfectly happy with the data as-is, you can always leave the "filter" section blank.
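
For instance, a complete (if not very useful) configuration can have an empty "filter" section. This is just a minimal sketch - stdin/stdout are placeholder plugins chosen for illustration:

input {
 stdin { }
}

filter {
 # nothing here - events pass through unchanged
}

output {
 stdout { }
}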

 

Since we are learning Logstash in the context of Log Analysis (IOALA), you need to know that almost all Insight Packs come with a sample Logstash configuration file, which you can find (after installing the pack) under <IOALA-DIR>/unity_content/<pack-name>/logstash

 

E.g.

[danielyeap@lahost WindowsOSEventsInsightPack_v1.1.0.6]$ pwd
/home/danielyeap/IBM/LogAnalysis/unity_content/WindowsOSEventsInsightPack_v1.1.0.6
[danielyeap@lahost WindowsOSEventsInsightPack_v1.1.0.6]$ ls -l
total 0
drwxrwxr-x 2 danielyeap danielyeap  59 Mar 10  2017 docs
drwxrwxr-x 2 danielyeap danielyeap  83 Mar 10  2017 dsv
drwxrwxr-x 4 danielyeap danielyeap  34 Mar 10  2017 extractors
drwxrwxr-x 2 danielyeap danielyeap  50 Mar 10  2017 i18n
drwxrwxr-x 2 danielyeap danielyeap  47 Mar 10  2017 lfa
drwxrwxr-x 2 danielyeap danielyeap  32 Mar 10  2017 logstash
drwxrwxr-x 2 danielyeap danielyeap 117 Mar 10  2017 metadata
drwxrwxr-x 5 danielyeap danielyeap  50 Mar 10  2017 unity_apps
[danielyeap@lahost WindowsOSEventsInsightPack_v1.1.0.6]$ cd logstash
[danielyeap@lahost logstash]$ ls -l
total 8
-rw-rw-r-- 1 danielyeap danielyeap 4674 Mar 10  2017 logstash-scala.conf
[danielyeap@lahost logstash]$ grep -v ^# logstash-scala.conf | head -15
input { 
 eventlog {
  type  => 'Win32-EventLog'
 }
}

filter{
   
    if [type] == "Win32-EventLog"{
  grok {
   match =>["TimeGenerated", "%{DATA:TIMEGEN_DATE} %{DATA:TIMEGEN_TIME} %{ISO8601_TIMEZONE:TZ}"]
   add_tag => ["eventlog_grokked"] 
  }# end grok
 
  if "eventlog_grokked" in [tags]{
[danielyeap@lahost logstash]$

 

If you need to do more than what the provided example does, you can refer to the Logstash documentation =>

https://www.elastic.co/guide/en/logstash/2.2/filter-plugins.html

 

There are several useful filters (a combined example follows this list):

(1) drop (use when you want to discard an event entirely)

https://www.elastic.co/guide/en/logstash/2.2/plugins-filters-drop.html

(2) grok (advanced parsing tool for unstructured data)

https://www.elastic.co/guide/en/logstash/2.2/plugins-filters-grok.html

[PATTERNS] https://github.com/elastic/logstash/blob/v1.4.2/patterns/grok-patterns

(3) mutate (modify data - add fields, update fields, etc.)

https://www.elastic.co/guide/en/logstash/2.2/plugins-filters-mutate.html
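
To give a feel for how these fit together, here is a minimal sketch combining all three. The log format, the field names and the "DEBUG" condition are assumptions made up for this example:

filter {
 # drop: discard events we do not care about (here, anything containing DEBUG)
 if [message] =~ /DEBUG/ {
  drop { }
 }

 # grok: parse the unstructured message into named fields
 grok {
  match => ["message", "%{TIMESTAMP_ISO8601:ts} %{LOGLEVEL:severity} %{GREEDYDATA:text}"]
 }

 # mutate: add a field and normalize another
 mutate {
  add_field => { "datasource" => "sample" }
  uppercase => ["severity"]
 }
}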

 

Most of the filters are pretty straightforward. But I would like to talk a little bit more about "grok".

 

When you use "grok", you will need to know regular expressions.

There is a set of default patterns provided =>

https://github.com/elastic/logstash/blob/v1.4.2/patterns/grok-patterns
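
For example, using only the default patterns, a web-access style line such as "55.3.244.1 GET /index.html 15824 0.043" can be parsed like this (client, method, request, bytes and duration are arbitrary field names chosen for the example):

filter {
 grok {
  match => ["message", "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}"]
 }
}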

 

As for IOALA, one of the most well-known grok patterns would be LFAMESSAGE, which is used to remove the special characters in LFA messages =>

https://developer.ibm.com/answers/questions/376292/how-do-i-remove-the-special-characters-produced-by.html
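
Assuming the LFAMESSAGE pattern is already available in your Logstash patterns directory (see the linked post for the pattern itself), a typical usage sketch looks like this, where the extracted clean text overwrites the original "message" field:

filter {
 grok {
  # assumes LFAMESSAGE is defined in a file under the patterns directory
  match => ["message", "%{LFAMESSAGE:message}"]
  overwrite => ["message"]
 }
}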

 

Certain Insight Packs will also provide their own grok patterns, which you need to copy into your Logstash patterns directory (refer to the pack documentation for details). The Oracle example below shows such a pattern file; a usage sketch follows the listing.

 

E.g.

[danielyeap@lahost logstash]$ pwd
/home/danielyeap/IBM/LogAnalysis/unity_content/Oracle_DBAlertInsightPack_v1.1.0.0/logstash
[danielyeap@lahost logstash]$ ls -l
total 0
drwxrwxr-x 2 danielyeap danielyeap 94 Sep 22 11:11 config
drwxrwxr-x 2 danielyeap danielyeap 21 Sep 22 11:11 patterns
[danielyeap@lahost logstash]$ cd patterns/
[danielyeap@lahost patterns]$ ls -l
total 4
-rw-rw-r-- 1 danielyeap danielyeap 1688 Sep 22 11:11 oracledb
[danielyeap@lahost patterns]$ grep -v ^# oracledb

ORACLEDBTIMESTAMP1 ^%{DAY} %{MONTH} %{MONTHDAY} %{TIME} %{YEAR}
ORACLEGENERIC (?m)%{ORACLEDBTIMESTAMP1:timestamp}\s(?=%{STARTING_INSTANCE:starting_instance})?(?=%{PSTART})?(?=%{TRACEFILE})?(?=%{ORAMSGS})?%{GREEDYDATA:text}%{CONSOLIDATED_FIELDS}

PSTART (?:%{PSTART_PROCESS:process} started with pid=%{NUMBER:pid}, OS id=%{NUMBER:osid})?
PSTART_PROCESS [A-Z0-9]{4}

ORAMSGID [A-Z]{3}\-(?:[0-9]{5})
ORAMSGS (?:.*?%{ORAMSGID:messageID1}:)?(?:.*?%{ORAMSGID:messageID2}:)?(?:.*?%{ORAMSGID:messageID3}:)?(?:.*?%{ORAMSGID:messageID4}:)?
TRACEFILE (?:Errors in file %{URIPATHPARAM:tracefilepath}:)?

STARTING_INSTANCE Starting ORACLE instance \(%{DATA:start_type}\)
STARTSTATUS %{ORACLEDBTIMESTAMP1}\s((?=Instance shutdown complete)(?:%{DATA:clean})|(?!Instance shutdown complete)(?:%{DATA:dirty}))\s%{ORACLEDBTIMESTAMP1:timestamp}\sStarting ORACLE instance \(%{DATA:starttype}\)


NOT_SEMICOLON [^;]*
CONSOLIDATED_FIELDS ;\s*%{NOT_SEMICOLON:ENVIRONMENTNAME}\s*;\s*%{NOT_SEMICOLON:HOSTNAME}\s*;\s*%{NOT_SEMICOLON:FUNCTIONALNAME}\s*;\s*%{NOT_SEMICOLON:INSTANCE}\s*;\s*%{NOT_SEMICOLON:LOGNAME}
[danielyeap@lahost patterns]$
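
Once the pattern file has been copied over, you point "grok" at that directory with patterns_dir and reference the patterns by name. The path below is hypothetical - use the directory where you actually copied the pack's pattern file:

filter {
 grok {
  # hypothetical patterns directory - adjust to your installation
  patterns_dir => ["/opt/logstash/patterns"]
  match => ["message", "%{ORACLEGENERIC}"]
 }
}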

 

In a nutshell, the "grok" filter allows you to extract data from an event based on regular expressions, and to use the extracted data for further processing.

 

That is all I would like to share here. Happy coding!

 

 

[{"Business Unit":{"code":"BU053","label":"Cloud & Data Platform"},"Product":{"code":"","label":""},"Component":"","Platform":[{"code":"","label":""}],"Version":"","Edition":"","Line of Business":{"code":"","label":""}}]

UID

ibm11081617