gawk - sorting values from array of arrays


I'm using gawk 4 to build arrays of arrays and need to compute percentile data from them. I need to sort the values in ascending order, which doesn't appear to be possible with asort when working with multidimensional arrays. Some of my values will be duplicate integers, and I need to keep all duplicates.

Here is what my data looks like. The element names for [a] and [b] end up being unique strings. Each [b] subarray then has elements named 1, 2, 3, etc. whose values are the data I need to sort on.

mArray[a][b][1]=3456
mArray[a][b][2]=1456
mArray[a][b][3]=1456
...
mArray[a][b][1]=9233
mArray[a][b][2]=9233
mArray[a][b][3]=1234
...
mArray[a][b][1]=4567
mArray[a][b][2]=4567
mArray[a][b][3]=3097

I figure I can create a regular array for each unique [a] element, insert the values from its corresponding [b][x], and then asort that, but then I lose whatever duplicate values exist. Right now I am hacking it by walking mArray, writing to different files based on the name of [a], printing out all values under [b][x], and then running sort. I'm curious whether there is a more elegant way of doing it.

Here is what I tried, using asort against mArray, to test getting proper output. After 30 minutes I get no output and no errors.

for ( a in mArray ) {
 for ( b in mArray[a] ) {
  n=asort(mArray[a][b][c])
  print n
 }
}

Background: I'm parsing CSV reports from a network monitoring system, grabbing throughput sample data, then aggregating those values across all interfaces to determine the 95th percentile for total throughput of a device.

Edit

Desired output format after sorting would be:

mArray[a][b][1]=1456
mArray[a][b][2]=1456
mArray[a][b][3]=3456
...
mArray[a][b][1]=1234
mArray[a][b][2]=9233
mArray[a][b][3]=9233
...
mArray[a][b][1]=3097
mArray[a][b][2]=4567
mArray[a][b][3]=4567

There are 1 best solutions below

Well, you have to sort mArray[a][b], not mArray[a][b][c], because c doesn't even exist ;)

If you don't want to sort in place, pass the destination array as a second argument to asort. This works in gawk 4; I don't know which version introduced it.

And then you have to print the sorted array one element at a time:

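for ( a in mArray ) {
 for ( b in mArray[a] ) {
  # sorted copy goes into "sorted"; asort returns the element count
  n = asort(mArray[a][b], sorted)
  # walk 1..n so values print in ascending order; "for (i in sorted)" has no guaranteed order
  for ( i = 1; i <= n; i++ ) print "mArray["a"]["b"]["i"]="sorted[i]
 }
}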