Skip to content

Structured Data Over Pipes

andychu edited this page Sep 27, 2017 · 11 revisions

Takeaway from http://www.oilshell.org/blog/2017/09/19.html

Minimal Solution, That Basically Exists

  • Use % format strings. This allows you to select fields.
  • Implement %x00 like git log or \0 like find to insert NUL bytes
  • Use utf-8 encoding for strings

Advantages:

  • Works with xargs -0 (which was meant for find -print0)
  • You can save serialization cost by selecting the fields you want
  • It's already a common practice, and %x00 is a trivial patch.

Disadvantages:

  • %s is not that readable. But this can be mitigated by Oil Metaprogramming. That is, turning it into "hash: $hash commit: $commit".

Other Solutions

  • JSON for structured (and proper escaping)
  • CSV for tabular data (and proper escaping)
    • Also need a foo.csv_schema for the types. JSON has types in the encoding.
  • Provide %#s for a length prefix, for truly binary data. What use cases exist?
    • Alternative: base64 encode
    • Alternative: pass the file system path of the file (could be in memory on tmpfs).
  • Netstrings -- for fixed formats
Clone this wiki locally