How to sort a list using partial order in Haskell?

1.7k Views Asked by At

I have a procedural EDSL which uses blocks of statements.

These statements are added to the blocks in no particular order although there may be dependencies between statements.

During compilation of the EDSL, however, I need to make sure that the statements are ordered in the order of dependence, e.g.

B := A
C := B
E := D

Since not all statements have dependencies there is no total order (E.g. E := D above is independent and can be placed anywhere). There are no cyclic dependencies so list ordering should be possible.

I have tried to hack a solution by using Data.List.sortBy and defining Ordering which would return EQ to mean that the statements have no dependencies. This worked for some examples but not in the general case, e.g. ordering the following did nothing:

C := B                           B := A
D := C    = should produce =>    C := B
B := A                           D := C

This is because the default sort insertion sort and only makes sure the inserted item is smaller or equal to the next.

I have searched the internets for a Poset implementation but have not found anything applicable:

altfloat:Data.Poset defines Ordering = LT | GT | EQ | NC (NC for Non-comparable) which is good but the provided sort assumes NaN-like non-comparable items and just throws them away.

logfloat:Data.Number.PartialOrd is similar to the above except uses Maybe Ordering and I didn't see a sorting function anywhere in the package.

Math.Combinatorics.Poset I haven't figured out how to use it or whether it's applicable.

Below is a minimal example which has both binding and non-binding statements. The order of non-biniding statements matters and they must maintain the original order (i.e. sorting needs to be stable w.r.t. statements that don't have a dependence relation).

I hope there is a simple solution to this without using a full-blown dependence graph...

module Stmts where

import Data.List ( sortBy )

data Var = A | B | C | D | E | F | G | H deriving (Eq, Show)
data Stmt = Var := Var
          | Inc Var
  deriving (Show)

-- LHS variable
binds :: Stmt -> Maybe Var
binds (v := _) = Just v
binds _        = Nothing

-- RHS variables
references :: Stmt -> [Var]
references (_ := v) = [v]
references (Inc v)  = [v]

order :: [Stmt] -> [Stmt]
order = sortBy orderStmts

orderStmts :: Stmt -> Stmt -> Ordering
orderStmts s1 s2 = ord mbv1 mbv2
 where
  ord Nothing   Nothing   = EQ  -- No dep since they don't bind vars
  ord (Just v1) Nothing   = LT  -- Binding statements have precedence
  ord Nothing   (Just v2) = GT  -- ^^^
  ord (Just v1) (Just v2)       -- Both statements are binding:
    | v1 `elem` refs2 = LT      --  * s2 depends on s1
    | v2 `elem` refs1 = GT      --  * s1 depends on s2
    | otherwise       = EQ      --  * neither

  -- *Maybe* they bind variables
  mbv1  = binds s1
  mbv2  = binds s2

  -- Variables they reference  
  refs1 = references s1
  refs2 = references s2

-- The following should return [B := A, C := B, D := C, Inc F, Inc G]
test = order [Inc F, Inc G, C := B, D := C, B := A]
2

There are 2 best solutions below

9
On BEST ANSWER

The problem with your approach is that your orderStmts is neither an ordering nor a partial ordering. In particular, it's not transitive and this is why the attempts at using it for sorting fail.

What you are looking for is topological sorting. You have a graph of vertices (statements) that have edges between them (their dependencies) and you want to ensure that the ordering matches the edges.

I'll focus only on the declarations, as the non-binding statements are easy (we just need to split the list into two, sort the declarations and concatenate again).

Topological sorting is already implemented in Data.Graph, which makes the task very simple:

module Stmts where

import Data.Graph

data Var = A | B | C | D | E | F | G | H deriving (Eq, Ord, Show)

data Decl = Var := Var 
  deriving (Show, Eq)

data Stmt = Decl
          | Inc Var
  deriving (Show, Eq)

sortDecls :: [Decl] -> [SCC Decl]
sortDecls = stronglyConnComp . map triple
  where
    triple n@(x := y)   = (n, x, [y])

-- The following should return [B := A, C := B, D := C]
test = map flattenSCC . sortDecls $ [C := B, D := C, B := A]

Calling flattenSCC is only for testing, as SCC has no Show instance. You'll probably want to check the SCCs for cycles (a cycle would be a language compilation error), and if there is none, extract the sorted sequence.

2
On

I think the only way to sort your statements groups are walking from roots to children

import Data.List

data Var = A | B | C | D | E | F | G | H deriving (Eq, Show)
data Stmt = Var := Var deriving (Show)

parent :: Stmt -> Var
parent (_ := p) = p

child :: Stmt -> Var
child (c := _) = c

steps :: [Stmt] -> [[Stmt]]
steps st = step roots st
  where step _ [] = []
        step r s = let (a, b) = partition (flip elem r . parent) s
                       (v, u) = partition (flip elem (map child b) . child ) a
                   in  if null u then error "Cycle!"
                                 else u : step (r ++ (nub $ map child u)) (v ++ b)

        roots = let cs = map child st
                    rs = nub $ filter (not . flip elem cs) (map parent st)
                in  if null rs then error "No roots!"
                               else rs

main = mapM_ print $ steps [F := H, G := H, C := B, D := C, B := A]

with output

[F := H,G := H,B := A]
[C := B]
[D := C]

when "sort" is over groups (not statements).

(Stability is granted on this code, since is invariant through partition, map, ++, ...)

(Added)

If you really wish some stability property (sorting your statements) you must to add some other restriction (defining "stability").

Let two "sort" direct algorithms (simply reordering statements to front or to back)

orderToFront :: [Stmt] -> [Stmt]
orderToFront [] = []
orderToFront (s@(_ := p):xs) = let (l, r) = splitAtFirst ((==p).child) xs
                               in  if null r then s: orderToFront xs
                                             else head r: s: orderToFront (l ++ tail r)

orderToBack' :: [Stmt] -> [Stmt]
orderToBack' [] = []
orderToBack' (s@(c := _):xs) = let (l, r) = splitAtFirst ((==c).parent) xs
                               in  if null r then s: orderToBack' xs
                                             else orderToBack' (l ++ head r: s: tail r)
orderToBack = reverse . orderToBack'

splitAtFirst :: (a -> Bool) -> [a] -> ([a], [a])
splitAtFirst f xs = let rs = dropWhile (not.f) xs
                    in  (take (length xs - length rs) xs, rs)


main = do

    let q = [F := H, C := B, D := C, G := F, B := A]

    putStrLn "-- orderToFront"
    mapM_ print $ orderToFront q

    putStrLn "-- orderToBack"
    mapM_ print $ orderToBack q

with the same input, orderToFront output is different than orderToBack output but both are valid

-- orderToFront
F := H
B := A
C := B
D := C
G := F
-- orderToBack
B := A
F := H
G := F
C := B
D := C

(with only equality relation your algorithm cannot be lower than O(n^2) but if you define stability restriction it could be reduced)