Is there a way to set a Prometheus alert on storage.tsdb.retention.size? Let's say that if the retention size has been hit, I want an alert sent out.
Is there a way to set up alert with Prometheus storage retention size
Asked by floormind
gabo
I'm not sure about storage.tsdb.retention.size, but another way to accomplish this is to create an alert rule on the persistent volume that Prometheus uses as storage.
The example below would be triggered if disk usage > 90%:
- alert: HighDiskUsageOnPVC
  expr: 100 * sum(kubelet_volume_stats_used_bytes) by (persistentvolumeclaim) / sum(kubelet_volume_stats_capacity_bytes) by (persistentvolumeclaim) > 90
  for: 1m
  labels:
    severity: warning
  annotations:
    summary: High disk usage on Persistent Volume {{ $labels.persistentvolumeclaim }}
    description: "More than 90% of the disk was used."
According to the Prometheus operational flags, --storage.tsdb.retention.size sets the maximum number of bytes that can be stored for blocks, and you cannot set an alert on the flag itself. The supported units are B, KB, MB, GB, TB, PB, and EB (for example, "512MB"). Units are based on powers of 2, so 1KB is 1024B. Use this flag in server mode only.
Since an alert cannot be configured on it, you can instead use the --storage.tsdb.retention.time flag to remove old data and so keep storage free. The retention time defaults to 15d. Supported units: y, w, d, h, m, s, ms. Use this flag in server mode only.
Example: --storage.tsdb.retention.time=2h erases data older than 2 hours. This is the lowest supported retention time.
As this is a valid request, you can look for help on the official Prometheus site or open an issue with the Prometheus community on GitHub.
If you need to configure a Prometheus alert for disk usage, use the node_filesystem_avail_bytes metric. Refer to "How to configure alerts in Prometheus for diskspace" by Abraham Dhanyaraj Arumbaka.
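As a rough sketch of what such a disk-usage rule could look like, the fragment below compares node_filesystem_avail_bytes with node_filesystem_size_bytes, both exposed by node_exporter. The group name, alert name, the /prometheus mountpoint, and the 10% threshold are assumptions for illustration only; adjust them to wherever your Prometheus data directory is actually mounted.

```yaml
groups:
  - name: disk-space                            # hypothetical group name
    rules:
      - alert: LowDiskSpaceOnPrometheusVolume   # hypothetical alert name
        # Percentage of space still available on the filesystem.
        # The mountpoint value is an assumption; use your own data path.
        expr: 100 * node_filesystem_avail_bytes{mountpoint="/prometheus"} / node_filesystem_size_bytes{mountpoint="/prometheus"} < 10
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Low disk space on {{ $labels.instance }} ({{ $labels.mountpoint }})"
          description: "Less than 10% of the filesystem is still available."
```

This only works if node_exporter runs on the host (or node) that holds the Prometheus data and can see that filesystem; on Kubernetes, the kubelet_volume_stats_* approach from the other answer may be the simpler option.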