Why use Beats if I can post directly to Elasticsearch?


Recently I have been reading about the Elastic Stack and came across Beats, which are described as lightweight data shippers.

So the question is: if my service can write directly to Elasticsearch, do I actually need Beats for it? From what I understand, it's just a kind of proxy (?)

Hopefully my question is clear enough


There are 3 best solutions below

0
On BEST ANSWER

Not sure which Beat you are referring to specifically, but let's take Filebeat as an example.

Suppose application logs need to be indexed into Elasticsearch. The options are:

  1. Post the logs directly to Elasticsearch
  2. Save the logs to a file, then use Filebeat to index logs
  3. Publish the logs to a message broker such as RabbitMQ or Kafka, then use Logstash input plugins to read from the broker and index into Elasticsearch

Option 2 Benefits

  • Filebeat ensures that each log message is delivered at least once. It can do this because it stores the delivery state of each event in a registry file. If the configured output is blocked and has not confirmed all events, Filebeat keeps trying to send them until the output acknowledges receipt.
  • Before shipping data to Elasticsearch, we can do additional processing or filtering. For example, we can drop logs based on text in the log message, or add a field (e.g. an application name on every log, so that logs from multiple applications can go into a single index and be filtered by application name on the consumption side).
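To make the two benefits above concrete, here is a minimal Python sketch of the idea. It is not Filebeat's actual implementation: the names `ship_new_lines`, `process`, `registry`, and `send` are hypothetical, the registry is just an in-memory dict mapping a file path to a byte offset, and the "drop DEBUG lines" rule is only an illustration. The point is that the offset advances only after the output acknowledges an event, so anything unacknowledged is retried on the next run.

```python
def process(line, app_name):
    """Drop DEBUG lines and tag the rest with an application name (illustrative rules)."""
    if "DEBUG" in line:
        return None  # filtered out before shipping
    return {"message": line, "application": app_name}


def ship_new_lines(path, registry, send, app_name="my-app"):
    """Send lines added since the last acknowledged offset.

    `registry` maps file path -> byte offset just past the last handled line.
    `send(event)` must return True when the output acknowledges the event.
    Returns the number of events acknowledged in this run.
    """
    acked = 0
    with open(path, "rb") as f:
        f.seek(registry.get(path, 0))
        while True:
            raw = f.readline()
            if not raw or not raw.endswith(b"\n"):
                break  # end of file, or a line that is still being written
            event = process(raw.decode("utf-8").rstrip("\n"), app_name)
            if event is not None:
                if not send(event):
                    break  # output blocked: keep the old offset and retry later
                acked += 1
            registry[path] = f.tell()  # advance only after an ack (or a drop)
    return acked
```

If the shipper crashed after an acknowledged send but before the offset was saved, the same line would be sent again on restart. That is why the guarantee is at-least-once rather than exactly-once.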

Essentially, Beats provide a reliable way of indexing data without adding much overhead to the system, since they are lightweight shippers.

Option 3 provides the same benefits as option 2. It may be more useful when we want to ship the logs directly to an external system instead of storing them in a file on the local system, for example for applications deployed in Docker/Kubernetes, where we may not have much access to, or enough space on, the local file system.

0
On

Beats are good as lightweight agents for collecting streaming data such as log files, OS metrics, etc., where you need some sort of agent to collect and send the data. If you have a service that wants to put things into Elasticsearch itself, then yes, by all means it can just use the REST API or a language client (Java, etc.) directly.
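As a rough sketch of what "use the REST API directly" can look like, the Python below builds a newline-delimited JSON payload for the Elasticsearch `_bulk` endpoint and posts it with the standard library. The helper names `build_bulk_body` and `post_bulk`, the index name, and the base URL are assumptions for illustration, and an unauthenticated cluster is assumed.

```python
import json
import urllib.request


def build_bulk_body(index, docs):
    """Return an NDJSON payload for the _bulk API: one action line per document."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # the bulk body must end with a newline


def post_bulk(base_url, index, docs, timeout=10):
    """POST documents to <base_url>/_bulk and return the parsed JSON response."""
    request = urllib.request.Request(
        url=f"{base_url}/_bulk",
        data=build_bulk_body(index, docs).encode("utf-8"),
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example call (assumes a cluster at this hypothetical address):
# result = post_bulk("http://localhost:9200", "app-logs", [{"message": "started"}])
# if result.get("errors"):
#     ...  # the service itself has to decide how to retry failed items
```

Going direct like this means buffering, retries, and backpressure become the service's own responsibility, which is the part a Beat would otherwise handle.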

0
On

Filebeat offers a way to centralize live logs from multiple servers.

Let's say you are running multiple instances of an application on different servers, and each one is writing logs.

You can ship all these logs to a single Elasticsearch index and analyze or visualize them from there.

A single static file doesn't need Filebeat to be moved into Elasticsearch.