I am trying to send system metrics to Apache SkyWalking from a Node.js application.
My current setup is two Node.js Express apps: app1 calls app2, and I can see the traces for those calls in SkyWalking. The next step was to integrate metrics and logging as well. I started with metrics, following the guide below and configuring everything as it prescribes:
https://skywalking.apache.org/docs/main/v9.4.0/en/setup/backend/backend-vm-monitoring/
For reference, here are my config files:
vm.yaml
(default, no changes)
#filter: "{ tags -> tags.job_name == 'vm-monitoring' }" # The OpenTelemetry job name
expSuffix: service(['node_identifier_host_name'] , Layer.OS_LINUX)
metricPrefix: meter_vm
metricsRules:
  #node cpu
  - name: cpu_total_percentage
    exp: (node_cpu_seconds_total * 100).tagNotEqual('mode' , 'idle').sum(['node_identifier_host_name']).rate('PT1M')
  - name: cpu_average_used
    exp: (node_cpu_seconds_total * 100).sum(['node_identifier_host_name' , 'mode']).rate('PT1M')
  - name: cpu_load1
    exp: node_load1 * 100
  - name: cpu_load5
    exp: node_load5 * 100
  - name: cpu_load15
    exp: node_load15 * 100
  #node Memory
  - name: memory_total
    exp: node_memory_MemTotal_bytes
  - name: memory_available
    exp: node_memory_MemAvailable_bytes
  - name: memory_used
    exp: node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes
  - name: memory_swap_free
    exp: node_memory_SwapFree_bytes
  - name: memory_swap_total
    exp: node_memory_SwapTotal_bytes
  - name: memory_swap_percentage
    exp: 100 - ((node_memory_SwapFree_bytes * 100) / node_memory_SwapTotal_bytes)
  #node filesystem
  - name: filesystem_percentage
    exp: 100 - ((node_filesystem_avail_bytes * 100).sum(['node_identifier_host_name' , 'mountpoint']) / node_filesystem_size_bytes.sum(['node_identifier_host_name' , 'mountpoint']))
  #node disk
  - name: disk_read
    exp: node_disk_read_bytes_total.sum(['node_identifier_host_name']).rate('PT1M')
  - name: disk_written
    exp: node_disk_written_bytes_total.sum(['node_identifier_host_name']).rate('PT1M')
  #node network
  - name: network_receive
    exp: node_network_receive_bytes_total.sum(['node_identifier_host_name']).irate()
  - name: network_transmit
    exp: node_network_transmit_bytes_total.sum(['node_identifier_host_name']).irate()
  #node netstat
  - name: tcp_curr_estab
    exp: node_netstat_Tcp_CurrEstab
  - name: tcp_tw
    exp: node_sockstat_TCP_tw
  - name: tcp_alloc
    exp: node_sockstat_TCP_alloc
  - name: sockets_used
    exp: node_sockstat_sockets_used
  - name: udp_inuse
    exp: node_sockstat_UDP_inuse
  #node filefd
  - name: filefd_allocated
    exp: node_filefd_allocated
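For the custom widget I mention further down, I need the full metric names rather than the short rule names. As far as I understand SkyWalking's meter rules, the queried name is the metricPrefix joined to the rule name with an underscore. Here is a minimal sketch of that composition; the helper name `fullMetricNames` is hypothetical, and the naming rule itself is my assumption, not something I have confirmed against the OAP server:

```javascript
// Hypothetical helper: build the names a dashboard widget would query,
// ASSUMING SkyWalking joins metricPrefix and rule name with "_".
function fullMetricNames(metricPrefix, ruleNames) {
  return ruleNames.map((name) => `${metricPrefix}_${name}`);
}

// Rule names copied from the memory section of vm.yaml above.
const memoryRules = ['memory_total', 'memory_available', 'memory_used'];
console.log(fullMetricNames('meter_vm', memoryRules));
```

If that assumption holds, these are the names I would expect to type into the widget's metric field.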
And otel-collector-config.yaml
extensions:
  health_check:
# A receiver is how data gets into the OpenTelemetry Collector
receivers:
  # Set Prometheus Receiver to collect metrics from targets
  # It supports the full set of Prometheus configuration
  prometheus:
    config:
      scrape_configs:
        - job_name: 'otel-collector'
          scrape_interval: 5s
          static_configs:
            # Replace the IP with the IP of the VM that has Node Exporter installed
            - targets: [ '127.0.0.1:9100' ]
processors:
  batch:
# An exporter is how data gets sent to different systems/back-ends
exporters:
  # Exports metrics via gRPC using OpenCensus format
  otlp:
    endpoint: '127.0.0.1:11800' # The OAP Server address
    tls:
      insecure: true
  logging:
    logLevel: debug
service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [batch]
      exporters: [otlp, logging]
  extensions: [health_check]
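One thing I noticed while comparing the two files: the commented-out filter in vm.yaml refers to a job name ('vm-monitoring'), while my collector config scrapes under job_name 'otel-collector'. I do not know whether that matters, but here is a minimal sketch for comparing the two values. The function names are hypothetical, and it uses plain regexes instead of a YAML parser so it has no dependencies:

```javascript
// Hypothetical helper: collect every job_name declared in the collector config text.
function scrapeJobNames(collectorYaml) {
  const names = [];
  const re = /job_name:\s*['"]?([^'"\s#]+)['"]?/g;
  let m;
  while ((m = re.exec(collectorYaml)) !== null) names.push(m[1]);
  return names;
}

// Hypothetical helper: return the job name an ACTIVE filter line expects,
// or null when there is no filter or it is commented out ("#filter: ...").
function filterJobName(ruleYaml) {
  for (const line of ruleYaml.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed.startsWith('filter:')) continue;
    const m = /tags\.job_name\s*==\s*'([^']+)'/.exec(trimmed);
    return m ? m[1] : null;
  }
  return null;
}

// Snippets taken from the two config files in this post.
const collectorSnippet = "scrape_configs:\n  - job_name: 'otel-collector'\n    scrape_interval: 5s";
const ruleSnippet = "#filter: \"{ tags -> tags.job_name == 'vm-monitoring' }\"";

const expectedJob = filterJobName(ruleSnippet);
const declaredJobs = scrapeJobNames(collectorSnippet);
console.log(
  expectedJob === null
    ? `no active filter; declared jobs: ${declaredJobs.join(', ')}`
    : `filter expects '${expectedJob}'; declared: ${declaredJobs.includes(expectedJob)}`
);
```

With the filter commented out as in my vm.yaml, the sketch reports that no active filter exists, so the job name should not be excluding anything on that side.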
But when I open the SkyWalking UI, it shows only traces, not metrics. In fact, when I select an instance, it shows only the following:
I also tried to add a custom memory metric by adding an extra tab and widget, but it still shows nothing. See the screenshot below:
For information, I am on an M1 Mac, and I start both Express apps on localhost, on ports 3000 and 4000.
Below is the screenshot which shows the connected services:
Here is part of the node_exporter output, viewable at localhost:9100:
node_scrape_collector_duration_seconds{collector="boottime"} 2.75e-05
node_scrape_collector_duration_seconds{collector="cpu"} 0.0002835
node_scrape_collector_duration_seconds{collector="diskstats"} 0.002703042
node_scrape_collector_duration_seconds{collector="filesystem"} 0.000768834
node_scrape_collector_duration_seconds{collector="loadavg"} 0.00028
node_scrape_collector_duration_seconds{collector="meminfo"} 0.000278833
node_scrape_collector_duration_seconds{collector="netdev"} 0.003328583
node_scrape_collector_duration_seconds{collector="os"} 0.000646625
node_scrape_collector_duration_seconds{collector="powersupplyclass"} 0.000319625
node_scrape_collector_duration_seconds{collector="textfile"} 0.000167708
node_scrape_collector_duration_seconds{collector="thermal"} 0.004174959
node_scrape_collector_duration_seconds{collector="time"} 0.00033025
node_scrape_collector_duration_seconds{collector="uname"} 0.000171083
# HELP node_scrape_collector_success node_exporter: Whether a collector succeeded.
# TYPE node_scrape_collector_success gauge
node_scrape_collector_success{collector="boottime"} 1
node_scrape_collector_success{collector="cpu"} 1
node_scrape_collector_success{collector="diskstats"} 1
node_scrape_collector_success{collector="filesystem"} 1
node_scrape_collector_success{collector="loadavg"} 1
node_scrape_collector_success{collector="meminfo"} 1
node_scrape_collector_success{collector="netdev"} 1
node_scrape_collector_success{collector="os"} 1
node_scrape_collector_success{collector="powersupplyclass"} 1
node_scrape_collector_success{collector="textfile"} 1
node_scrape_collector_success{collector="thermal"} 0
node_scrape_collector_success{collector="time"} 1
node_scrape_collector_success{collector="uname"} 1
# HELP node_textfile_scrape_error 1 if there was an error opening or reading a file, 0 otherwise
# TYPE node_textfile_scrape_error gauge
node_textfile_scrape_error 0
# HELP node_time_seconds System time in seconds since epoch (1970).
# TYPE node_time_seconds gauge
node_time_seconds 1.6834000089577541e+09
# HELP node_time_zone_offset_seconds System time zone offset in seconds.
# TYPE node_time_zone_offset_seconds gauge
node_time_zone_offset_seconds{time_zone="IST"} 19800
# HELP node_uname_info Labeled system information as provided by the uname system call.
# TYPE node_uname_info gauge
node_uname_info{domainname="(none)",machine="x86_64",nodename="Bhupesh-SFIN261",release="22.4.0",sysname="Darwin",version="Darwin Kernel Version 22.4.0: Mon Mar 6 21:00:41 PST 2023; root:xnu-8796.101.5~3/RELEASE_ARM64_T8103"} 1
# HELP promhttp_metric_handler_errors_total Total number of internal errors encountered by the promhttp metric handler.
# TYPE promhttp_metric_handler_errors_total counter
promhttp_metric_handler_errors_total{cause="encoding"} 0
promhttp_metric_handler_errors_total{cause="gathering"} 0
# HELP promhttp_metric_handler_requests_in_flight Current number of scrapes being served.
# TYPE promhttp_metric_handler_requests_in_flight gauge
promhttp_metric_handler_requests_in_flight 1
# HELP promhttp_metric_handler_requests_total Total number of scrapes by HTTP status code.
# TYPE promhttp_metric_handler_requests_total counter
promhttp_metric_handler_requests_total{code="200"} 193
promhttp_metric_handler_requests_total{code="500"} 0
promhttp_metric_handler_requests_total{code="503"} 0
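Because the excerpt above is only part of the output, it does not show whether the metrics that vm.yaml references (for example node_memory_MemAvailable_bytes) are exposed on this machine at all. Here is a minimal sketch for checking that against the full text of the page; the helper names are hypothetical, and the parser only handles the simple `name{labels} value` lines seen above:

```javascript
// Hypothetical helper: parse Prometheus text exposition lines into samples.
function parseSamples(text) {
  const samples = [];
  for (const raw of text.split('\n')) {
    const line = raw.trim();
    if (line === '' || line.startsWith('#')) continue; // skip HELP/TYPE comments
    const m = /^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{.*\})?\s+(\S+)/.exec(line);
    if (m) samples.push({ name: m[1], labels: m[2] || '', value: Number(m[3]) });
  }
  return samples;
}

// Which of the metric names referenced by vm.yaml are absent from the scrape?
function missingMetrics(text, required) {
  const present = new Set(parseSamples(text).map((s) => s.name));
  return required.filter((name) => !present.has(name));
}

// Label sets of collectors that report node_scrape_collector_success 0.
function failedCollectors(text) {
  return parseSamples(text)
    .filter((s) => s.name === 'node_scrape_collector_success' && s.value === 0)
    .map((s) => s.labels);
}

// A few lines copied from the output above.
const scrapeSample = [
  '# TYPE node_scrape_collector_success gauge',
  'node_scrape_collector_success{collector="meminfo"} 1',
  'node_scrape_collector_success{collector="thermal"} 0',
  'node_time_seconds 1.6834000089577541e+09',
].join('\n');

console.log(failedCollectors(scrapeSample));
console.log(missingMetrics(scrapeSample, ['node_memory_MemAvailable_bytes', 'node_time_seconds']));
```

Running it over the complete page text, with the metric names from the exp lines of vm.yaml as the required list, would show which rules have no input data on macOS.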
So I need help understanding what I am doing wrong that prevents the system metrics from showing up in SkyWalking. Any help would be appreciated. Thanks.