Is there a human readable structured logging format?


I'd like my program to write to its stderr a log that is simultaneously human and machine readable.

Requirements:

  1. A log is a stream of messages. I.e. I can't write just one big xml or json document. Every message has to be parseable separately without requiring an incremental parser or generator.
  2. TTY detection and log format selection switches are considered cheating for the purposes of this question. I want the exact same output to be simultaneously human and machine readable.
  3. No postprocessing, for the same reasons as 2.
  4. No ad hoc formats. I don't want the consumer to have to write a parser. Not even a trivial one.
  5. No formats that are too obscure. There must be a library to parse this format in the top 10 most popular general purpose programming languages and the library must be able to parse the entire log into a stream of messages out of the box without requiring the consumer to massage the data.
  • Pretty JSON fails 5 - most JSON parsing APIs cannot parse multiple concatenated JSON documents.

  • JSON Lines is not human readable especially if it contains nested data because the entire log entry ends up on a single line.

  • It appears that application/json-seq (RFC 7464) does allow for the JSON texts to be pretty printed (human readable) while only requiring very simple parsing on top of a regular JSON decoder. This is the closest one yet.
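To make the "very simple parsing" concrete, here is a minimal Python sketch of the json-seq framing, in which each JSON text is preceded by the ASCII Record Separator (0x1E) and followed by a line feed. The helper names `encode_record` and `decode_stream` are made up for illustration, and the sketch ignores the RFC's rules for recovering from truncated records.

```python
import json

RS = "\x1e"  # ASCII Record Separator, placed before each JSON text

def encode_record(obj):
    """Return one record: RS, a pretty-printed JSON text, then a line feed."""
    # json.dumps escapes control characters inside strings by default,
    # so a raw RS never appears inside the JSON text itself.
    return RS + json.dumps(obj, indent=2) + "\n"

def decode_stream(text):
    """Yield each JSON value found in a json-seq string."""
    for chunk in text.split(RS):
        if chunk.strip():  # skip the empty piece before the first RS
            yield json.loads(chunk)

log = encode_record({"level": "INFO", "message": "first log message"})
log += encode_record({"level": "FATAL", "data": {"key1": ["nested", "data"]}})
records = list(decode_stream(log))
```

The splitting step is the only thing added on top of a regular JSON decoder, which is why the pretty-printed records can still span several lines.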

Post-mortem

I ended up rethinking my approach: log JSON Lines, which is the easiest format to generate and consume, and post-process with an external pretty-printer such as jq . for human consumption.
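A minimal sketch of that approach in Python, assuming a hypothetical `log_event` helper and an entry layout (level, message, data) chosen only for illustration. An in-memory buffer stands in for stderr so the output can be read back.

```python
import io
import json

def log_event(stream, level, message, **data):
    """Write one log entry as a single compact JSON line."""
    entry = {"level": level, "message": message, "data": data}
    stream.write(json.dumps(entry) + "\n")

def read_events(stream):
    """Parse a JSON Lines stream back into a list of dicts."""
    return [json.loads(line) for line in stream if line.strip()]

buffer = io.StringIO()  # stands in for sys.stderr in this sketch
log_event(buffer, "INFO", "first log message", items=["item1", "item2"])
log_event(buffer, "FATAL", "getting bored", code=0)
buffer.seek(0)
events = read_events(buffer)
```

In a real program the stream would be sys.stderr, and a human reader could pipe it through the pretty-printer, for example `myprogram 2>&1 | jq .` (where `myprogram` is a placeholder name).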


There are 2 best solutions below

BEST ANSWER

YAML seems to be a fine choice. You can emit the log as stand-alone YAML documents separated by ---[1]. Multiple documents in one stream are standard YAML, and at least the YAML parsers I know of support them. Basic structured data can be printed on one line in YAML while remaining fairly human readable.

https://yaml.org/ has a list of implementations for different programming languages. There is an implementation for each of the top ten programming languages in the Stack Overflow 2021 Developer Survey[2].

Example Log

{level: INFO, message: first log message, time: "2021-10-06 21:37", data: [item1, item2, item3]}
---  # document separator (and this is a comment btw)
{level: INFO, message: still logging, time: "2021-10-06 21:38", data: {key1: [nested, data], key2: whatever}}
---
{level: FATAL, message: getting bored, time: "2021-10-06 21:38", data: 0}

[1] OK, that means you either have to emit --- on every second line, or insert it between all lines before parsing the document. If you just emit --- on every second line, YAML satisfies your fifth requirement.
[2] Except for Kotlin, but Kotlin users can use the Java library as far as I can see.
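As a rough sketch of the emitting side, the Python below writes each entry as a one-line flow mapping preceded by a --- line, using only the standard library. It leans on an assumption worth stating: JSON-style mappings with quoted keys are accepted as YAML flow mappings (YAML 1.2 is designed as a superset of JSON), so the output is quoted more heavily than the hand-written example above. The function name `emit_yaml_documents` is hypothetical.

```python
import json

def emit_yaml_documents(entries):
    """Return the entries as one-line flow mappings, each preceded by '---'."""
    lines = []
    for entry in entries:
        lines.append("---")              # YAML document start marker
        lines.append(json.dumps(entry))  # JSON-style flow mapping on one line
    return "\n".join(lines) + "\n"

stream = emit_yaml_documents([
    {"level": "INFO", "message": "first log message", "data": ["item1", "item2"]},
    {"level": "FATAL", "message": "getting bored", "data": 0},
])
```

On the consuming side, a multi-document loader such as PyYAML's `yaml.safe_load_all` should return one object per document without any extra massaging.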


LIFECYCLE

Logging needs to be understood as a lifecycle in which JSON logs typically take 2 formats before being aggregated:

  • On a developer PC you use prettified output
  • In deployed systems you use a bare one-entry-per-line format that suits log shippers such as Filebeat
  • Logs are then aggregated from the bare format to a system where they are used by humans in a readable way
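The first two stages can be sketched as a single serializer with a switch. This is only an illustration in Python. The `format_entry` name and the `pretty` flag are assumptions, not part of any particular logging library.

```python
import json

def format_entry(entry, pretty):
    """Serialize a log entry, indented for developers or compact for shippers."""
    if pretty:
        return json.dumps(entry, indent=2)  # multi-line, easy to read on a PC
    return json.dumps(entry, separators=(",", ":"))  # one bare line per entry

entry = {"level": "INFO", "message": "still logging", "data": {"key2": "whatever"}}
dev_output = format_entry(entry, pretty=True)
prod_output = format_entry(entry, pretty=False)
```

Both outputs decode to the same data, so the choice only affects who reads the raw text: a developer at a terminal or a log shipper.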

The production logs remain human readable - tools such as Kibana allow you to view them in a readable format and ask questions of them.

When you think about the above flow it makes sense, since log storage is efficient and readability is also good. It requires a separation of concerns though.

ONLINE DEMO LOGGING SYSTEM

Feel free to log in to my cloud Elasticsearch system at the bottom of this page. Log in to my web app at the top of the page, then query your own API activity in Kibana like this.

My system is a logging design for a microservices platform. See my Effective API Logging blog post if you are interested in how it works. A good logging system usually needs an architectural design like this.