Centreon (Nagios) to Prometheus: custom Bash checks


We have a Centreon server (based on Nagios) with several custom checks written in Bash.

For example, one script checks every 2 minutes whether there is an error in a CSV file:

  • If there is at least one error, the check returns WARNING
  • Otherwise, it returns OK

Something like :

count_files=$(find "$1" -name "*.csv" | wc -l)

if [[ $count_files -ne 0 ]]
then
    echo "WARNING - Error files found"
    exit 1
else
    echo "OK - No error file found"
    exit 0
fi

We are planning to migrate from Centreon to Prometheus/Grafana, and I am wondering how to run custom Bash checks with Prometheus.

Should I put my script on these VMs and expose the result in the Prometheus format? Like this:

# HELP csv_checker Check error in csv files
# TYPE csv_checker gauge
csv_checker 20

If so, how can I do that?

Will Prometheus then scrape this endpoint periodically? How can I manage that across ~300 VMs?

Best answer

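I suggest using the Node Exporter feature called the textfile collector.

  1. Set up Node Exporter on each VM with the textfile collector enabled (--collector.textfile.directory pointing at the directory the script writes to), and add a cron job that runs your Bash script every 2 minutes (*/2 * * * *):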

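    #!/usr/bin/env bash
    set -euo pipefail
    
    # Directory that Node Exporter reads via --collector.textfile.directory
    out_dir=/opt/node_exporter/textfile_collector
    
    # Count the .csv files under /mnt
    csv_files=$(find /mnt -name "*.csv" | wc -l)
    
    # Write to a temporary file first, then rename it, so that Node Exporter
    # never reads a partially written file
    echo "# HELP csv_files Number of .csv files
    # TYPE csv_files gauge
    csv_files $csv_files" > "$out_dir/csv_files.prom.$$"
    mv "$out_dir/csv_files.prom.$$" "$out_dir/csv_files.prom"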
    
  2. Set up Prometheus to scrape metrics from your VMs

  3. Set up Alertmanager

  4. Add an alerting rule such as:

      - alert: Errors in CSV files
        expr: csv_files != 0
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: Errors in CSV files on the instance {{ $labels.instance }}
          description: "VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
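
For step 2 with ~300 VMs, one option is to keep the host list in a file and let Prometheus read the targets through file-based service discovery (file_sd_configs) instead of listing every VM in the main configuration. The following is only a sketch under assumptions: the function name hosts_to_file_sd is made up for illustration, the hostnames come one per line on standard input, and every VM is assumed to run Node Exporter on its default port 9100.

```shell
#!/usr/bin/env bash
# Sketch: convert a list of hostnames (one per line on stdin) into a
# Prometheus file_sd target file in JSON format.
# hosts_to_file_sd is a hypothetical helper name, not part of Prometheus.

hosts_to_file_sd() {
    local port=${1:-9100}   # 9100 is Node Exporter's default port
    local first=1 host

    printf '[\n  {\n    "targets": ['
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while IFS= read -r host || [ -n "$host" ]; do
        if [ -z "$host" ]; then
            continue        # skip empty lines
        fi
        if [ "$first" -eq 1 ]; then
            first=0
        else
            printf ','
        fi
        printf '\n      "%s:%s"' "$host" "$port"
    done
    printf '\n    ]\n  }\n]\n'
}
```

It could be used as `hosts_to_file_sd < hosts.txt > node_targets.json` (both file names are placeholders), with a scrape job whose file_sd_configs entry points at the generated file. Prometheus picks up changes to that file without a restart, so adding a VM only means regenerating the file.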
    

Prometheus will scrape all your hosts at the configured interval, and Alertmanager will notify you about every instance that has .csv files in the monitored folder (/mnt in the script above).
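
If you would rather keep the existing Nagios-style checks unchanged during the migration, a small wrapper can run them and publish their exit code as a metric through the same textfile collector. This is a minimal sketch, not an established convention: the function name publish_check_status, the metric name nagios_check_status, and the check label are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch: run a Nagios-style check and publish its exit status as a gauge.
# publish_check_status and nagios_check_status are hypothetical names.
#
# Usage: publish_check_status <check_name> <output_dir> <command> [args...]
publish_check_status() {
    local name=$1 out_dir=$2
    shift 2

    # Run the check and keep its exit code (0=OK, 1=WARNING, 2=CRITICAL,
    # 3=UNKNOWN) without aborting the caller under `set -e`.
    local status=0
    "$@" > /dev/null 2>&1 || status=$?

    # Write to a temporary file, then rename it, so that Node Exporter
    # never reads a partially written file. The temporary name does not
    # end in .prom, so the collector ignores it.
    local tmp
    tmp=$(mktemp "$out_dir/.${name}.XXXXXX")
    {
        echo "# HELP nagios_check_status Exit status of a Nagios-style check"
        echo "# TYPE nagios_check_status gauge"
        echo "nagios_check_status{check=\"${name}\"} ${status}"
    } > "$tmp"
    chmod 644 "$tmp"    # mktemp creates the file readable by its owner only
    mv "$tmp" "$out_dir/${name}.prom"
}
```

A cron entry could then call, for example, `publish_check_status csv_checker /opt/node_exporter/textfile_collector /path/to/check.sh /mnt` (the script path is a placeholder), and an alert expression such as `nagios_check_status{check="csv_checker"} == 1` would correspond to the old WARNING state.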