I'm trying to find the mode or value that occurs most frequently. I want a function like :
mode:' 'a list -> (''a * int) list
and it returns the mode and where it occurs, unless there is a tie then return all occurrences so something like:
mode([1,1,2,3,5,8]) ===> [(1,2)]
mode([1,3,5,2,3,5]) ===> [(3,2),(5,2)]
mode([true,false,true,true]) ====>[(true,3)]
I'm trying to do this without library functions in SML.
so far I got:
fun mode(L)=
if null L then nil
else if hd L= hd (tl L) then 1+mode(hd(tl L))
else mode(tl L);
I know this isn't right I guess I am curious on how you both keep the indices of where the mode occurs and what the mode is and return them as tuples in a list.
You're trying to solve an exercise in many parts with several easier exercises before it. Judging by your current progress, have you considered solving some very similar exercises that build up to the final goal? This is generally good advice when solving programming problems: Reduce your current problem to simpler problems and solve those.
I'd try and solve this problem first
Build a
histogram : ''a list -> (''a * int) listover the elements of a list:Do this by inserting each
xwith itscountinto a histogram.And write some tests that you can execute once in a while.
Find the
mode : ''a list -> ''aof a list:Do this by finding the element in the histogram with the biggest count:
and eventually try and solve this problem
When you have a good grasp of representing and navigating regular histograms recursively, you could create an annotated
histogram : (''a * int * int list) listthat doesn't just contain the frequency of each element, but also their positions in the input list:Do this by inserting each
xwith itscountand positionialong with previously found positionsisinto a histogram:Find the (possibly multiple)
mode : ''a list -> (''a * int list) listof a list:Do this by finding the (possibly multiple) element(s) in the histogram with the biggest count:
with
countMax : intbeing the frequency repeated intmpModes : (''a * int * int list) list. HerecountMaxandtmpModesare accumulating result parameters. Do this by determining whether(x, count, is)should be thrown away in favor of alltmpModes, or it should be added totmpModes, or it should be chosen in favor of alltmpNodesYes, this is not trivial. Using my suggested division into sub-problems, answering this depends on whether we are in the
histogramfunction or thefindMaxfunction:In
histogramyou can store the indices as part of the tuple that contains the element and the frequency. InfindMax, since you're potentially collecting multiple results, you need to keep track of both which frequency is the highest (countMax) and what the temporary modes of choice are (tmpModes); subject to replacement or addition in a later recursive call.So to answer your question: In an accumulating parameter.
and a little feedback to your code snippet
Use pattern matching instead of
null,hdandtl:Building a histogram is sort of an extension to this where instead of a fixed value like
4, or a fixed amount of them likeonesandtwos, you have a whole list of them, and you have to dynamically look for the one you've got,x, and determine if it needs to be added to the histogram or incremented in the histogram.The best way would be to do that in a helper function, so for example, if
count_abcwere made with a helper function,only instead of the histogram representation
you want
and
insertshould be recursive rather than howinsert_abcis repetitive.