Tips and tricks for mapping unstructured log data into the ELK Stack using Grok in Logstash

Structuring unstructured data with GROK



If you are using the Elastic stack (ELK) and are interested in mapping custom Logstash logs to Elasticsearch, then this post is for you.





ELK Stack is an abbreviation for three open source projects: Elasticsearch, Logstash, and Kibana. Together they form a log management platform.



  • Elasticsearch is a search and analytics engine.
  • Logstash is a server-side data processing pipeline that ingests data from multiple sources at the same time, transforms it, and then sends it to a stash such as Elasticsearch.
  • Kibana lets users visualize Elasticsearch data with charts and graphs.


Beats came along later and is a lightweight data shipper. The introduction of Beats turned the ELK Stack into the Elastic Stack, but that is beside the point here.



This article focuses on Grok, which is a feature in Logstash that can transform your logs before they are sent to the stash. For our purposes, I will only talk about processing data from Logstash to Elasticsearch.





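Grok is a filter within Logstash that is used to parse unstructured data into something structured and queryable. It sits on top of regular expressions (regex) and uses text patterns to match lines in log files.

As we will see in the following sections, using Grok makes a big difference when it comes to effective log management.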



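Without Grok, your log data is unstructured

Without Grok, when logs are sent from Logstash to Elasticsearch and rendered in Kibana, they appear only in the message value.

Querying anything meaningful in this situation is difficult, because all of the log data is stored under a single key. It would be better if the log messages were better organized.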





localhost GET /v2/applink/5c2f4bb3e9fda1234edc64d 400 46ms 5bc6e716b5d6cb35fc9687c0


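If you look closely at the raw data, you can see that it actually consists of different parts, each separated by a space.

More experienced developers can probably guess what each part means, and that this is a log message from an API call. Each item is laid out below.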





  • localhost == environment
  • GET == method
  • /v2/applink/5c2f4bb3e9fda1234edc64d == url
  • 400 == response_status
  • 46ms == response_time
  • 5bc6e716b5d6cb35fc9687c0 == user_id


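As the structured view shows, there is an order to the unstructured logs. The next step is to process the raw data programmatically. This is where Grok shines.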
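
As a rough illustration of the mapping listed above, here is a minimal Python sketch that splits the sample line on whitespace and pairs each part with its field name. It is plain string handling, not what Logstash does internally, and the names FIELD_NAMES and split_log_line are made up for this example.

```python
# Field names taken from the structured view above, in log order.
FIELD_NAMES = ["environment", "method", "url",
               "response_status", "response_time", "user_id"]


def split_log_line(line):
    """Pair each space-separated part of a log line with its field name."""
    parts = line.split()
    if len(parts) != len(FIELD_NAMES):
        raise ValueError(f"expected {len(FIELD_NAMES)} parts, got {len(parts)}")
    return dict(zip(FIELD_NAMES, parts))


sample = ("localhost GET /v2/applink/5c2f4bb3e9fda1234edc64d "
          "400 46ms 5bc6e716b5d6cb35fc9687c0")
record = split_log_line(sample)
print(record["method"], record["response_status"])
```

A naive split like this breaks as soon as a field contains a space or a line has a different shape, which is one reason to prefer pattern matching.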



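Grok patterns

Built-in Grok patterns

Logstash ships with more than 100 built-in patterns for structuring unstructured data. You should take advantage of them wherever possible for common logs such as apache, linux, haproxy, aws, and so on.

But what happens when you have custom logs, as in the example above? You have to build your own Grok pattern.

Custom Grok patterns

Building your own Grok pattern takes some trial and error. I used the Grok Debugger and Grok Patterns.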



Note that the Grok pattern syntax is: %{SYNTAX:SEMANTIC}
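
To make the %{SYNTAX:SEMANTIC} idea concrete, the sketch below expands each placeholder into a named regex group: SYNTAX picks a pattern, SEMANTIC becomes the field name. The pattern table is a simplified stand-in written for this example, not Logstash's actual definitions, and grok_to_regex is a hypothetical helper.

```python
import re

# Simplified stand-ins for a few Grok patterns (assumption: the real
# Logstash definitions are more thorough than these).
SIMPLIFIED_PATTERNS = {
    "WORD": r"\b\w+\b",
    "NUMBER": r"[+-]?\d+(?:\.\d+)?",
    "URIPATH": r"(?:/[\w.~%-]*)+",
    "USERNAME": r"[a-zA-Z0-9._-]+",
}


def grok_to_regex(pattern):
    """Replace every %{SYNTAX:SEMANTIC} with a named regex group."""
    def expand(match):
        syntax, semantic = match.group(1), match.group(2)
        return f"(?P<{semantic}>{SIMPLIFIED_PATTERNS[syntax]})"
    return re.sub(r"%\{(\w+):(\w+)\}", expand, pattern)


grok = ("%{WORD:environment} %{WORD:method} %{URIPATH:url} "
        "%{NUMBER:response_status} %{WORD:response_time} %{USERNAME:user_id}")
line = ("localhost GET /v2/applink/5c2f4bb3e9fda1234edc64d "
        "400 46ms 5bc6e716b5d6cb35fc9687c0")

fields = re.search(grok_to_regex(grok), line).groupdict()
print(fields)
```

The sketch only handles the two-part %{SYNTAX:SEMANTIC} form and a flat pattern table; nested patterns are left out to keep it short.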



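The first thing I tried was the Discover tab in the Grok Debugger. I hoped the tool would generate a Grok pattern automatically, but it was not much help.

Starting from there, I began building my own pattern in the Grok Debugger, using the syntax found on the Elastic GitHub page.

After playing with different syntaxes, I was finally able to structure the log data the way I wanted.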





Grok Debugger link: https://grokdebug.herokuapp.com/



Raw text:



localhost GET /v2/applink/5c2f4bb3e9fda1234edc64d 400 46ms 5bc6e716b5d6cb35fc9687c0


Pattern:



%{WORD:environment} %{WORD:method} %{URIPATH:url} %{NUMBER:response_status} %{WORD:response_time} %{USERNAME:user_id}




{
  "environment": [
    [
      "localhost"
    ]
  ],
  "method": [
    [
      "GET"
    ]
  ],
  "url": [
    [
      "/v2/applink/5c2f4bb3e9fda1234edc64d"
    ]
  ],
  "response_status": [
    [
      "400"
    ]
  ],
  "BASE10NUM": [
    [
      "400"
    ]
  ],
  "response_time": [
    [
      "46ms"
    ]
  ],
  "user_id": [
    [
      "5bc6e716b5d6cb35fc9687c0"
    ]
  ]
}
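
Two details in this output are worth noting: every captured value is a string, and the extra BASE10NUM entry appears because NUMBER is itself built on the BASE10NUM pattern. As a sketch of what cleaning this up could look like outside Logstash, the hypothetical tidy helper below flattens the nested lists, drops the helper entry, and converts the numeric fields. It assumes the response time is always written as whole milliseconds, as in the sample.

```python
import re

# Captures in the nested-list shape shown in the debugger output above.
debugger_output = {
    "environment": [["localhost"]],
    "method": [["GET"]],
    "url": [["/v2/applink/5c2f4bb3e9fda1234edc64d"]],
    "response_status": [["400"]],
    "BASE10NUM": [["400"]],
    "response_time": [["46ms"]],
    "user_id": [["5bc6e716b5d6cb35fc9687c0"]],
}


def tidy(captures):
    """Flatten the captures, drop the helper entry, convert numeric fields."""
    flat = {name: values[0][0] for name, values in captures.items()
            if name != "BASE10NUM"}
    flat["response_status"] = int(flat["response_status"])
    # "46ms" -> 46 (assumption: always whole milliseconds with an "ms" suffix)
    raw_time = flat.pop("response_time")
    flat["response_time_ms"] = int(re.fullmatch(r"(\d+)ms", raw_time).group(1))
    return flat


event = tidy(debugger_output)
print(event)
```

With numeric types in place, fields such as the status code and response time can be compared and aggregated as numbers rather than text.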


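With the Grok pattern and the mapped data in hand, the last step is to add it to Logstash.

Updating the logstash.conf configuration file

On the server where you installed the ELK Stack, go to the Logstash configuration:

sudo vi /etc/logstash/conf.d/logstash.conf

Paste in the changes.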



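input {
  # Read every .log file in this directory (replace with your own path)
  file {
    path => "/your_logs/*.log"
  }
}
filter {
  # Apply the custom Grok pattern to the raw "message" field
  grok {
    match => { "message" => "%{WORD:environment} %{WORD:method} %{URIPATH:url} %{NUMBER:response_status} %{WORD:response_time} %{USERNAME:user_id}" }
  }
}
output {
  # Send the structured events to the local Elasticsearch node
  elasticsearch {
    hosts => [ "localhost:9200" ]
  }
}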


After saving the changes, restart Logstash and check its status to make sure it is still running.



sudo service logstash restart
sudo service logstash status


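Finally, to make sure the changes have taken effect, be sure to refresh the Elasticsearch index for Logstash in Kibana!

With Grok, your log data is structured!

As we can see, Grok can automatically map log data into Elasticsearch. This makes it easier to manage logs and to query information quickly. Instead of digging through log files to debug, you can simply filter by what you are looking for, such as an environment or a url.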



Try giving Grok expressions a shot! If you have another way of doing this, or have any problems with the examples above, just write a comment below to let me know.



Thanks for reading - and please follow me here on Medium for more interesting software engineering articles!



Resources

https://www.elastic.co/blog/do-you-grok-grok

https://github.com/elastic/logstash/blob/v1.4.2/patterns/grok-patterns

https://grokdebug.herokuapp.com/


