otel-collector doesn't parse json logs


I'm installing the OpenTelemetry Collector with Helm, using the https://github.com/open-telemetry/opentelemetry-helm-charts/tree/main/charts/opentelemetry-collector chart.

These are the values that configure log collection and export to Loki:

presets:
  logsCollection:
    enabled: true
    includeCollectorLogs: true

exporters:
    loki:
      endpoint: {removed}
      tls:
        insecure: true
      timeout: 10s
      default_labels_enabled:
        exporter: true
        job: true
service:
    pipelines:
      logs:
        exporters:
          - loki
        processors:
          - memory_limiter
          - k8sattributes
          - resource
          - batch

The filelog receiver config gets rendered to:

      filelog:
        exclude: []
        include:
        - /var/log/pods/*/*/*.log
        include_file_name: false
        include_file_path: true
        operators:
        - id: get-format
          routes:
          - expr: body matches "^{.*}$"
            output: parser-json
          - expr: body matches "^\\{"
            output: parser-docker
          - expr: body matches "^[^ Z]+ "
            output: parser-crio
          - expr: body matches "^[^ Z]+Z"
            output: parser-containerd
          type: router
        - id: parser-json
          parse_from: body
          type: json_parser
        - id: parser-crio
          regex: ^(?P<time>[^ Z]+) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$
          timestamp:
            layout: 2006-01-02T15:04:05.999999999Z07:00
            layout_type: gotime
            parse_from: attributes.time
          type: regex_parser
        - combine_field: attributes.log
          combine_with: ""
          id: crio-recombine
          is_last_entry: attributes.logtag == 'F'
          max_log_size: 102400
          output: extract_metadata_from_filepath
          source_identifier: attributes["log.file.path"]
          type: recombine
        - id: parser-containerd
          regex: ^(?P<time>[^ ^Z]+Z) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$
          timestamp:
            layout: '%Y-%m-%dT%H:%M:%S.%LZ'
            parse_from: attributes.time
          type: regex_parser
        - combine_field: attributes.log
          combine_with: ""
          id: containerd-recombine
          is_last_entry: attributes.logtag == 'F'
          max_log_size: 102400
          output: extract_metadata_from_filepath
          source_identifier: attributes["log.file.path"]
          type: recombine
        - id: parser-docker
          output: extract_metadata_from_filepath
          timestamp:
            layout: '%Y-%m-%dT%H:%M:%S.%LZ'
            parse_from: attributes.time
          type: json_parser
        - id: extract_metadata_from_filepath
          parse_from: attributes["log.file.path"]
          regex: ^.*\/(?P<namespace>[^_]+)_(?P<pod_name>[^_]+)_(?P<uid>[a-f0-9\-]+)\/(?P<container_name>[^\._]+)\/(?P<restart_count>\d+)\.log$
          type: regex_parser

Two sample log lines from one of my pods:

  • {"@t":"2024-03-17T13:25:25.1565843Z","@mt":"HTTP {RequestMethod} {RequestPath} responded {StatusCode} in {Elapsed:0.0000} ms","@r":["63.9374"],"@tr":"873d379eebbe99df048a3c6be558b6e0","@sp":"fc8afde04266748a","RequestMethod":"GET","RequestPath":"/readyz","StatusCode":200,"Elapsed":63.937438,"SourceContext":"Serilog.AspNetCore.RequestLoggingMiddleware","RequestId":"0HN26GMQ2F7EU:00000001","ConnectionId":"0HN26GMQ2F7EU"}
  • A simplified test line (I suspected the '@' in the log above was the cause): {"some":"property","another":"propertValue"}

When I query the logs in Grafana, this is the final entry I get for the simplified line:

{
  "body": "{\"some\":\"property\",\"another\":\"propertValue\"}",
  "attributes": {
    "log.file.path": "/var/log/pods/todo_todo-api-deployment-854cfdfdb4-68vp8_1f5bd932-4557-41b4-8f12-5aa4226436cb/todo-api/0.log",
    "log.iostream": "stdout",
    "logtag": "F",
    "time": "2024-03-17T13:25:22.846758435Z"
  },
  "resources": {
    "k8s.container.restart_count": "0",
    "k8s.deployment.name": "todo-api-deployment",
    "k8s.node.name": "aks-agentpool-33519033-vmss00000k",
    "k8s.pod.start_time": "2024-03-17T13:25:21Z",
    "k8s.pod.uid": "1f5bd932-4557-41b4-8f12-5aa4226436cb"
  }
}

I expected the body to contain the original message parsed into a JSON object, not a raw, escaped string.
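To make that expectation concrete, here is a rough Python sketch of the two steps I have in mind: split the container-runtime prefix off the line, then decode the remaining message as JSON. The regex is copied from parser-containerd in the rendered config. The raw line is an assumption, reconstructed from the time, log.iostream and logtag attributes in the Grafana output; the function name parse_line is made up for this illustration, and Python's re and json modules only stand in for the collector's operators.

```python
import json
import re

# parser-containerd regex, copied from the rendered filelog config.
CONTAINERD = re.compile(
    r"^(?P<time>[^ ^Z]+Z) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$"
)

# Assumed raw line as written under /var/log/pods, rebuilt from the
# attributes shown in the Grafana output (not copied from the node).
RAW_LINE = (
    '2024-03-17T13:25:22.846758435Z stdout F '
    '{"some":"property","another":"propertValue"}'
)


def parse_line(raw):
    """Split the runtime prefix, then decode the message as JSON."""
    match = CONTAINERD.match(raw)
    if match is None:
        raise ValueError("not a containerd-format line")
    attributes = match.groupdict()
    # Step 1 leaves the message as a plain string in attributes["log"].
    # Step 2 is the part I expected: the string decoded into an object.
    body = json.loads(attributes.pop("log"))
    return body, attributes


body, attributes = parse_line(RAW_LINE)
print(body)        # a dict rather than an escaped string
print(attributes)  # time, stream and logtag from the runtime prefix
```

If this stand-in is faithful, the output I see in Grafana corresponds to stopping after step 1.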

I don't really understand the matcher for parser-docker in the filelog config (body matches "^\\{"), since it appears to require each line to start with a backslash. So I tried adding another route and a matching parser:

- expr: body matches "^{.*}$"
  output: parser-json
...
- id: parser-json
  parse_from: body
  type: json_parser

But that didn't resolve anything.
