Introduction
This is the start of a short series about the JSON data format, and how
the command-line tool jq can be used to process such data. The plan is
to make an open series to which others may contribute their own
experiences using this tool.
The jq command is described on the GitHub page as follows:
“jq is a lightweight and flexible command-line JSON processor”
…and as:
“jq is like sed for JSON data - you can use
it to slice and filter and map and transform structured data with the
same ease that sed, awk, grep and
friends let you play with text.”
The jq tool is controlled by a programming language
(also referred to as jq), which is very powerful. This
series will mainly deal with that language.
JSON (JavaScript Object Notation)
To begin we will look at JSON itself. It is defined on
the Wikipedia page thus:
“JSON is an open standard file format and data
interchange format that uses human-readable text to store and transmit
data objects consisting of attribute–value pairs and arrays (or other
serializable values). It is a common data format with diverse uses in
electronic data interchange, including that of web applications with
servers.”
The syntax of JSON is defined by RFC 8259 and by
ECMA-404.
It is fairly simple in principle but has some complexity.
JSON’s basic data types are (edited from the Wikipedia page):
Number: a signed decimal number that may contain a
fractional part and may use exponential E notation, but cannot include
non-numbers. (NOTE: Unlike what I said in the audio,
there are two values representing non-numbers: 'nan' and
'infinity'. These are not part of standard JSON, since RFC 8259 does
not permit NaN or Infinity, so only some tools accept them.)
String: a sequence of zero or more Unicode characters.
Strings are delimited with double quotation marks and support a
backslash escaping syntax.
Boolean: either of the values true or false.
Array: an ordered list of zero or more elements, each of
which may be of any type. Arrays use square bracket notation with
comma-separated elements.
Object: a collection of name–value pairs where the names
(also called keys) are strings. Objects are delimited with curly
brackets and use commas to separate each pair, while within each pair
the colon ':' character separates the key or name from its
value.
null: an empty value, using the word null.
Examples
These are examples of the basic data types listed above, in the same order:
42
"HPR"
true
["Hacker","Public","Radio"]
{ "firstname": "John", "lastname": "Doe" }
null
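Each of the lines above is a complete JSON document in its own right. As a rough check, the following sketch uses Python's standard json module (not jq, and the variable names are only illustrative) to parse each example and report which Python type it is mapped to. The mapping shown is Python's, and other languages will name the types differently.

```python
import json

# The six example documents from this section, one per basic data type.
examples = [
    '42',
    '"HPR"',
    'true',
    '["Hacker","Public","Radio"]',
    '{ "firstname": "John", "lastname": "Doe" }',
    'null',
]

# Parse each document and record the name of the Python type it becomes.
parsed = [json.loads(text) for text in examples]
type_names = [type(value).__name__ for value in parsed]

for text, name in zip(examples, type_names):
    print(f"{text} -> {name}")
```

The array becomes a Python list and the object a dict, so the usual indexing works on the parsed values, for example parsed[4]["lastname"] for the object.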
jq
From the Wikipedia page:
jq was created by Stephen Dolan, and released in October
2012. It was described as being “like sed for JSON data”. Support for
regular expressions was added in jq version 1.5.
Obtaining jq
This tool is available in the repositories of most Linux distributions.
For example, on Debian and Debian-based releases you can install it
with: