How can I extract multiple fields in nushell?


I have a Nushell command:

cat airports.yaml | from yaml | sort-by "latitude_deg" | reverse | get 'latitude_deg,name' | first 20 | to json

How can I output both the name and latitude_deg fields in the table? The above does not work because it seems that get only gets a single value. I can get name and then the name is output, but was hoping to output both fields.

Here is an example record in the YAML file:

- id: '28118'
  ident: VTBS
  type: large_airport
  name: Suvarnabhumi Airport
  latitude_deg: '13.681099891662598'
  longitude_deg: '100.74700164794922'
  elevation_ft: '5'
  continent: AS
  iso_country: TH
  iso_region: TH-10
  municipality: Bangkok
  scheduled_service: 'yes'
  gps_code: VTBS
  iata_code: BKK
  local_code:
  home_link:
  wikipedia_link: http://en.wikipedia.org/wiki/Suvarnabhumi_Airport
  keywords:

There are 2 best solutions below

mb21

This works for me:

open airports.yaml | sort-by --reverse latitude_deg | select latitude_deg name | first 20 | to json

NotTheDr01ds

Short answer:

As @mb21 pointed out, this is a scenario where you'll use select instead of get.

Explanation:

The difference between the two is a common source of confusion (from what I've seen on the Nushell Discord).

In your case, to demonstrate let's start with:

let demoTable = "
- id: '28118'
  name: 'Suvarnabhumi Airport'
  latitude_deg: '13.681099891662598'
  longitude_deg: '100.74700164794922'
  wikipedia_link: http://en.wikipedia.org/wiki/Suvarnabhumi_Airport
- id: '2434'
  name: 'London Heathrow Airport'
  latitude_deg: '51.4706'
  longitude_deg: '-0.461941'
  wikipedia_link: https://en.wikipedia.org/wiki/Heathrow_Airport
  "
  | from yaml

First, we can see that the resulting Nushell data structure is a table:

> $demoTable
╭───┬───────┬─────────────────────────┬────────────────────┬────────────────────┬───────────────────────────────────────╮
│ # │  id   │          name           │    latitude_deg    │   longitude_deg    │            wikipedia_link             │
├───┼───────┼─────────────────────────┼────────────────────┼────────────────────┼───────────────────────────────────────┤
│ 0 │ 28118 │ Suvarnabhumi Airport    │ 13.681099891662598 │ 100.74700164794922 │ http://en.wikipedia.org/wiki/Suvarnab │
│   │       │                         │                    │                    │ humi_Airport                          │
│ 1 │ 2434  │ London Heathrow Airport │ 51.4706            │ -0.461941          │ https://en.wikipedia.org/wiki/Heathro │
│   │       │                         │                    │                    │ w_Airport                             │
╰───┴───────┴─────────────────────────┴────────────────────┴────────────────────┴───────────────────────────────────────╯
> $demoTable | describe
table<id: string, name: string, latitude_deg: string, longitude_deg: string, wikipedia_link: string>

With that in place, let's look at how get and select differ with that data:

get:

get unwraps the value stored at a given key or column. Since a table in Nu is a list of records, unwrapping a column returns a list of that column's values. E.g.,

> $demoTable | get name
╭───┬─────────────────────────╮
│ 0 │ Suvarnabhumi Airport    │
│ 1 │ London Heathrow Airport │
╰───┴─────────────────────────╯
> $demoTable | get name | describe
list<string>

We can drill in further as well using cell-paths. For instance, to retrieve just the string value of the first item (index 0):

> $demoTable | get wikipedia_link.0
http://en.wikipedia.org/wiki/Suvarnabhumi_Airport
> $demoTable | get wikipedia_link.0 | describe
string

select:

select, on the other hand, returns a subsection of the original input. For a table, that means that select will return a new table with only the selected rows or columns. In your case:

> $demoTable | select name
╭───┬─────────────────────────╮
│ # │          name           │
├───┼─────────────────────────┤
│ 0 │ Suvarnabhumi Airport    │
│ 1 │ London Heathrow Airport │
╰───┴─────────────────────────╯

Note the column heading, which wasn't present in the get version. It shows that select is returning a new table.

> $demoTable | select name | describe
table<name: string> (stream)

Back to your original case:

> $demoTable | select latitude_deg name
╭───┬────────────────────┬─────────────────────────╮
│ # │    latitude_deg    │          name           │
├───┼────────────────────┼─────────────────────────┤
│ 0 │ 13.681099891662598 │ Suvarnabhumi Airport    │
│ 1 │ 51.4706            │ London Heathrow Airport │
╰───┴────────────────────┴─────────────────────────╯
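Putting that together for the pipeline in the question, here is a minimal sketch. It assumes the two-row demo data rather than the real airports.yaml, and a Nushell version where let accepts a pipeline on the right-hand side, as in the demo above:

```nushell
# Two-row stand-in for airports.yaml (same values as the demo table above).
let demoTable = "
- name: 'Suvarnabhumi Airport'
  latitude_deg: '13.681099891662598'
- name: 'London Heathrow Airport'
  latitude_deg: '51.4706'
" | from yaml

# Sort descending, keep only the two columns of interest, then serialize.
# select (not get) keeps the result a table, so both fields survive.
$demoTable
| sort-by latitude_deg --reverse
| select latitude_deg name
| first 20
| to json
```

One caveat: latitude_deg is a string in this data, so the sort compares text rather than numbers. That happens to give the expected order for these two rows, but converting the column to a number before sorting is probably safer on the full file.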

Other notes:

  • You can simply use open:

    open airports.yaml | sort-by ...
    

    When using open, Nushell will use the file extension to automatically handle the from yaml conversion.

  • get operations can also typically be done by using cell-paths directly. For instance, the following will return the same results as the get examples above:

    > ($demoTable).name
    ╭───┬─────────────────────────╮
    │ 0 │ Suvarnabhumi Airport    │
    │ 1 │ London Heathrow Airport │
    ╰───┴─────────────────────────╯
    > ($demoTable).name | describe
    list<string>
    > ($demoTable).wikipedia_link.0 | describe
    string
    

    However, get is, IMHO, often more readable and (especially) composable. It also comes in handy if you need to specify the key/column name dynamically using a variable.

  • This is completely personal preference, but you don't have to quote single-word key/column names in most cases. latitude_deg and "latitude_deg" are the same thing to the parser. This goes for most strings when there's no ambiguous meaning.
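
As a sketch of the dynamic case mentioned in the notes above, the column name can come from a variable. The variable name column is hypothetical and only for illustration; the data is a cut-down copy of the demo table:

```nushell
# Cut-down copy of the demo table from earlier in this answer.
let demoTable = "
- name: 'Suvarnabhumi Airport'
  latitude_deg: '13.681099891662598'
- name: 'London Heathrow Airport'
  latitude_deg: '51.4706'
" | from yaml

# Hypothetical variable holding a column name chosen at runtime.
let column = "name"

# get takes the variable where a literal column name would otherwise go
# and returns that column's values as a list.
$demoTable | get $column
```

This is the kind of lookup that is awkward to express with a literal cell-path such as ($demoTable).name, since the column is not known when the command is written.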