How can I filter on the number of rows in the value column of a group-by in Nushell?

599 Views Asked by At

I want to do the equivalent of a HAVING clause in SQL. The real life description is I'm trying to find Elasticsearch aliases that point to two (or more) indexes and their index names. The data looks like this.

I first do a group by, then I pivot them to get rows, where the first column is the group key and the second column is a table with n rows. The command to get this table is open file.txt | lines | split-column " " --collapse-empty Alias Index | group by Alias | pivot

 ──┬───────────────┬────────────────
 # │ Column0       │ Column1
───┼───────────────┼────────────────
 0 │ abcd_20200430 │ [table 1 rows]
 1 │ abcd_20200501 │ [table 3 rows]
 2 │ abcd_20200502 │ [table 2 rows]
 3 │ abcd_20200503 │ [table 1 rows]
 4 │ abcd_20200504 │ [table 1 rows]

I want to filter this table by the rows where Column1 has more than 1 row. How would I do that?

| where Column1.count > 1 doesn't work

2

There are 2 best solutions below

0
Andrés N. Robalino On BEST ANSWER

You could try adding another column with their sizes as a workaround. Something like this:

open file.txt | lines | split-column " " --collapse-empty Alias Index | group by Alias | pivot | default size 0 | update size { get Column1 | count }
0
Reilly Wood On

In recent versions of Nushell you should be able to use where and length like so:

where ($it.Column1 | length) > 1