How can I sort by word frequency and then sort alphabetically within each frequency in Ruby?

wordfrequency = Hash.new(0)
splitfed.each { |word| wordfrequency[word] += 1 }

wordfrequency = wordfrequency.sort_by {|x,y| y }
wordfrequency.reverse!

puts wordfrequency

I have added the words to a hash and gotten it to sort by word frequency, but the order within each frequency is arbitrary, and I want it to be alphabetical. Any quick fixes? Thanks! Much appreciated.


There are 3 best solutions below

3
On BEST ANSWER

You can use:

wordfrequency = wordfrequency.sort_by{|x,y| [y, x] }

to sort by the value (the count) and then by the key (the word).

In your case,

splitfed = ["bye", "hi", "hi", "a", "a", "there", "alphabet"]


wordfrequency = Hash.new(0)
splitfed.each { |word| wordfrequency[word] += 1 }

wordfrequency = wordfrequency.sort_by{|x,y| [y, x] }
wordfrequency.reverse!

puts wordfrequency.inspect

will output:

[["hi", 2], ["a", 2], ["there", 1], ["bye", 1], ["alphabet", 1]]

which is in descending order of occurrence count and, within each count, in reverse alphabetical order of the word.
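
If the goal is descending frequency with the words in ascending alphabetical order inside each frequency, as the question asks, one option is to negate the count in the sort key instead of reversing the result. A minimal sketch, reusing the sample data above (the method name by_frequency_then_word is made up for illustration):

```ruby
# Hypothetical helper: counts the words, then sorts by count (descending)
# and by word (ascending) within each count.
def by_frequency_then_word(words)
  counts = Hash.new(0)
  words.each { |word| counts[word] += 1 }
  # Negating the count makes larger counts sort first; ties fall back to the word.
  counts.sort_by { |word, count| [-count, word] }
end

splitfed = ["bye", "hi", "hi", "a", "a", "there", "alphabet"]
puts by_frequency_then_word(splitfed).inspect
# => [["a", 2], ["hi", 2], ["alphabet", 1], ["bye", 1], ["there", 1]]
```

No reverse! call is needed here, because the negated count already puts the most frequent words first.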

Note (though it might be obvious) that wordfrequency is now an array of [word, count] pairs rather than a hash.
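
To illustrate the point about the return type: sort_by on a hash returns an array of [key, value] pairs. If a hash is needed again, to_h (available on arrays since Ruby 2.1) rebuilds one, and since Ruby hashes keep insertion order, the sorted order carries over. A small sketch with made-up counts (sort_counts is a hypothetical name):

```ruby
# Hypothetical helper name, for illustration only.
def sort_counts(counts)
  counts.sort_by { |word, count| [count, word] }
end

pairs = sort_counts({ "hi" => 2, "bye" => 1, "a" => 2 })
puts pairs.class              # prints Array, not Hash

sorted_hash = pairs.to_h      # back to a Hash, keeping the sorted order
puts sorted_hash.keys.inspect # prints ["bye", "a", "hi"]
```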

0
On

Ruby's group_by is the basis for this:

words = %w[foo bar bar baz]
words.group_by{ |w| w } 
# => {"foo"=>["foo"], "bar"=>["bar", "bar"], "baz"=>["baz"]}

words.group_by{ |w| w }.map{ |k, v| [k, v.size ] } 
# => [["foo", 1], ["bar", 2], ["baz", 1]]

If you want to sort by the words then by their frequency:

words.group_by{ |w| w }.map{ |k, v| [k, v.size ] }.sort_by{ |k, v| [k, v] } 
# => [["bar", 2], ["baz", 1], ["foo", 1]]

If you want to sort by the frequency then by the words:

words.group_by{ |w| w }.map{ |k, v| [k, v.size ] }.sort_by{ |k, v| [v, k] } 
# => [["baz", 1], ["foo", 1], ["bar", 2]]
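
On Ruby 2.7 or later, Enumerable#tally does the counting step directly, so the group_by/map pair can be shortened. A sketch, assuming that Ruby version, which sorts by descending frequency and then alphabetically (the method name tally_sorted is hypothetical):

```ruby
# Requires Ruby 2.7+ for Enumerable#tally.
def tally_sorted(words)
  # tally returns a Hash of word => count; negate the count for descending order.
  words.tally.sort_by { |word, count| [-count, word] }
end

words = %w[foo bar bar baz]
puts tally_sorted(words).inspect
# => [["bar", 2], ["baz", 1], ["foo", 1]]
```
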
0
On

Hashes are not necessarily kept in sorted order; that depends on the individual implementation. If you want to pretty-print a hash, sort the keys, then iterate over that sorted list of keys, outputting the value for each key as you go.

There are tricks for doing this on a single line, or for collecting the entries from the hash into a sorted array of arrays, but ultimately they all come back to sorting the keys and then retrieving the data for the sorted key list.
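
The sort-the-keys-then-look-up approach described above might look like this in Ruby. This is a minimal sketch with made-up data; lines_in_key_order is a hypothetical name:

```ruby
# Hypothetical helper: builds one "key: value" line per entry, in key order.
def lines_in_key_order(hash)
  # Sort the keys, then fetch each value through its key.
  hash.keys.sort.map { |key| "#{key}: #{hash[key]}" }
end

counts = { "there" => 1, "a" => 2, "hi" => 2 }
puts lines_in_key_order(counts)
# prints:
# a: 2
# hi: 2
# there: 1
```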

Some hashes maintain insertion order (Ruby's own Hash has done so since 1.9), and some maintain a sorted structure that you can traverse as you process the hash, but in general these are exceptions to the rule.