In the comments on my tutorial about parsing logs with Fluent Bit, two alternatives were suggested: Filebeat and Vector. This tutorial shows how to organize the collection and parsing of log messages using Filebeat.
The purpose of the tutorial: To organize the collection and parsing of log messages using Filebeat.
Disclaimer: the tutorial does not contain production-ready solutions; it was written to help those who are just getting started with Filebeat and to let the author consolidate what he has learned. It also does not compare log shippers; a comparison can be found here.
If this topic interests you, welcome under the cut :)
We will run the test application using docker-compose.
General information
Filebeat is a lightweight log shipper. It works by monitoring log files, collecting new log messages from them, and sending them to Elasticsearch or Logstash for indexing.
Filebeat consists of two key components:
- harvesters (collectors), which read the log files and send log messages to the configured output; a separate harvester is started for each log file;
- inputs, which are responsible for locating the sources of log messages and managing the harvesters.
You can read more about how it works in the official guide.
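To make the relationship between inputs and harvesters concrete, here is a minimal sketch of a filebeat.yml; the path and the Elasticsearch host below are placeholders for illustration, not part of this tutorial's setup:

filebeat.inputs:
  # the input searches the paths below for log files;
  # a separate harvester is started for each matched file
  - type: log
    paths:
      - /var/log/myapp/*.log   # placeholder path

# harvested log messages are sent to the output for indexing
output.elasticsearch:
  hosts: ["elasticsearch:9200"]   # assumed host:port

In this tutorial we will use the console output instead of Elasticsearch, so that the collected log messages can be inspected directly.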
Organizing the collection of log messages
Filebeat offers a variety of inputs for different sources of log messages. In this tutorial I propose to move from configuring collection manually to discovering sources of log messages in containers automatically. In my opinion, this approach gives a deeper understanding of Filebeat, and it is also the path I took myself.
So, let's get down to business.
The test application is written in FastAPI; it generates the log messages that we will be collecting.
Collecting log messages using a volume
First of all, we add Filebeat (the log-shipper service) to docker-compose.yml.
To collect log messages through a shared volume, we need to:
- configure the application to write log messages to a log file:
app/api/main.py
logger.add( "./logs/file.log", format="app-log - {level} - {message}", rotation="500 MB" )
- create a shared volume and mount it into the app and log-shipper containers:
docker-compose.yml
version: "3.8" services: app: ... volumes: # volume, - - app-logs:/logs log-shipper: ... volumes: # - ./filebeat.docker.yml:/usr/share/filebeat/filebeat.yml:ro # volume - - app-logs:/var/app/log volumes: app-logs:
- point Filebeat at the log files in this volume:
filebeat.docker.yml
filebeat.inputs:
  - type: log
    # path to the application's log files inside the log-shipper container
    paths:
      - /var/app/log/*.log

# print the collected log messages to the console
output.console:
  pretty: true
Run it, and we get a log message of the following form:
{ "@timestamp": "2021-04-01T04:02:28.138Z", "@metadata": { "beat": "filebeat", "type": "_doc", "version": "7.12.0" }, "ecs": { "version": "1.8.0" }, "host": { "name": "aa9718a27eb9" }, "message": "app-log - ERROR - [Item not found] - 1", "log": { "offset": 377, "file": { "path": "/var/app/log/file.log" } }, "input": { "type": "log" }, "agent": { "version": "7.12.0", "hostname": "aa9718a27eb9", "ephemeral_id": "df245ed5-bd04-4eca-8b89-bd0c61169283", "id": "35333344-c3cc-44bf-a4d6-3a7315c328eb", "name": "aa9718a27eb9", "type": "filebeat" } }
Collecting log messages using the container input
The container input collects log messages from the log files of Docker containers.
To switch to the container input, we replace the log input in the Filebeat config with the container one and point it at the Docker container log files. The configuration now looks like this:
filebeat.docker.yml
filebeat.inputs:
  - type: container
    # path to the docker container log files
    paths:
      - '/var/lib/docker/containers/*/*.log'

# print the collected log messages to the console
output.console:
  pretty: true
The app-logs volume shared between the app and log-shipper containers is no longer needed, so we remove it. Instead, we mount the directory with the Docker container log files, together with the Docker socket, into the log-shipper container:
docker-compose.yml
version: "3.8" services: app: ... log-shipper: ... volumes: # - ./filebeat.docker.yml:/usr/share/filebeat/filebeat.yml:ro - /var/lib/docker/containers:/var/lib/docker/containers:ro - /var/run/docker.sock:/var/run/docker.sock:ro
Finally, we configure the application to write log messages to stdout:
app/api/main.py
import sys

from loguru import logger

logger.add(
    sys.stdout,
    format="app-log - {level} - {message}",
)
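Since the application now writes its log messages to stdout, the container input could, if desired, be narrowed to that stream. A small sketch, assuming the config shown above with only the stream option added:

filebeat.inputs:
  - type: container
    # optional: read only the stdout stream of the containers (the default is all)
    stream: stdout
    paths:
      - '/var/lib/docker/containers/*/*.log'

# print the collected log messages to the console
output.console:
  pretty: true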
Note that with the container input, Filebeat collects log messages from every container on the host, so besides the application logs we also get log messages from Filebeat itself and from any other running containers. Let's fix that.
Collecting log messages using autodiscover
Containers that produce log messages get created, removed, restarted and so on, so configuring Filebeat for each of them by hand quickly becomes impractical. For such cases Filebeat has autodiscover, in which it finds the sources of log messages on its own. Autodiscover can be configured in two ways:
- with templates;
- with hints.
Let's start with templates. A template consists of a condition and a config: when a running container matches the condition (in our case, by container name), Filebeat applies the given config to it. The configuration looks like this:
filebeat.docker.yml
filebeat.autodiscover:
  providers:
    # use the docker provider
    - type: docker
      templates:
        - condition:
            contains:
              # the container name must contain fastapi_app
              docker.container.name: fastapi_app
          # config applied when the condition is met
          config:
            - type: container
              paths:
                - /var/lib/docker/containers/${data.docker.container.id}/*.log
              # drop the INFO messages of the asgi server
              exclude_lines: ["^INFO:"]

# print the collected log messages to the console
output.console:
  pretty: true
Run it, and Filebeat now collects log messages only from the fastapi_app container.
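A template is not limited to matching on the container name. As a sketch (the image name below is a hypothetical example, not the tutorial's actual image), the same config could be applied to every container whose image matches:

filebeat.autodiscover:
  providers:
    - type: docker
      templates:
        - condition:
            contains:
              # hypothetical image name used purely for illustration
              docker.container.image: fastapi
          config:
            - type: container
              paths:
                - /var/lib/docker/containers/${data.docker.container.id}/*.log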
Configuration using hints
Filebeat also supports hints-based autodiscover. It looks for hints in container labels (or in Kubernetes pod annotations). As soon as a container starts, Filebeat checks whether it contains any hints and launches the proper collection config for it.
Let's enable hints and collect log messages only from the app container:
filebeat.docker.yml
filebeat.autodiscover:
  providers:
    - type: docker
      hints.enabled: true

# print the collected log messages to the console
output.console:
  pretty: true
And disable the collection of log messages from the log-shipper container by adding a label to it:
docker-compose.yml
version: "3.8" services: app: ... log-shipper: ... labels: co.elastic.logs/enabled: "false"
Parsing log messages
Filebeat can process log messages before sending them to the output using processors. Processors can drop and add fields, add tags, parse a message into structured data, and so on; they are executed in the order in which they are defined in the configuration.
First, let's remove the fields we do not need from the log message using the drop_fields processor:
filebeat.docker.yml
processors:
  - drop_fields:
      fields: ["agent", "container", "ecs", "log", "input", "docker", "host"]
      ignore_missing: true
The log message now looks like this:
{ "@timestamp": "2021-04-01T04:02:28.138Z", "@metadata": { "beat": "filebeat", "type": "_doc", "version": "7.12.0" }, "message": "app-log - ERROR - [Item not found] - 1", "stream": ["stdout"] }
To distinguish our API's log messages from those of the asgi server, we add a tag to them with the add_tags processor:
filebeat.docker.yml
processors:
  - drop_fields:
      ...
  - add_tags:
      when:
        contains:
          "message": "app-log"
      tags: [test-app]
      target: "environment"
We split the message field of the log message into structured fields using the dissect processor and then remove the original field with drop_fields:
filebeat.docker.yml
processors:
  - drop_fields:
      ...
  - add_tags:
      ...
  - dissect:
      when:
        contains:
          "message": "app-log"
      tokenizer: 'app-log - %{log-level} - [%{event.name}] - %{event.message}'
      field: "message"
      target_prefix: ""
  - drop_fields:
      when:
        contains:
          "message": "app-log"
      fields: ["message"]
      ignore_missing: true
Now the log message looks like this:
{ "@timestamp": "2021-04-02T08:29:07.349Z", "@metadata": { "beat": "filebeat", "type": "_doc", "version": "7.12.0" }, "log-level": "ERROR", "event": { "name": "Item not found", "message": "Foo" }, "environment": [ "test-app" ], "stream": "stdout" }
Addition
Filebeat also has out-of-the-box solutions for collecting and parsing log messages for widely used tools such as Nginx, Postgres, etc.
They are called modules.
For example, to collect Nginx log messages, just add a label to its container:
co.elastic.logs/module: "nginx"
and include hints in the config file. After that, we will get a ready-made solution for collecting and parsing log messages + a convenient dashboard in Kibana.
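For illustration, a hypothetical nginx service in docker-compose.yml carrying such a label might look like this (the service definition is an assumption, not part of the tutorial's setup):

services:
  nginx:
    image: nginx:latest
    labels:
      # tell Filebeat to apply its nginx module to this container's log messages
      co.elastic.logs/module: "nginx"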
Thank you all for your attention!