Would it be possible to derive Data.Vector.Unbox via GHC's generic deriving?

It's possible to derive Storable via GHC's generic deriving mechanism: http://hackage.haskell.org/package/derive-storable (and https://hackage.haskell.org/package/derive-storable-plugin for performance). The only library I can find for deriving Data.Vector.Unbox, however, uses Template Haskell: http://hackage.haskell.org/package/vector-th-unbox. It also requires the user to write a little code; it's not entirely automatic.

My question is: could a library like derive-storable also exist for Unbox, or is this impossible because of some fundamental way in which Unbox differs from Storable? If the latter, does that also rule out a library that automatically derives Unbox for any Storable type? I could not find such a library.

I ask because ideally I'd like to avoid Template Haskell and the manual annotations necessary for using vector-th-unbox.

Best answer

Say we had some Generic_ class to convert between our own types and some uniform representation which happens to have an Unbox instance (which amounts to both MVector and Vector instances for the Unboxed variants):

-- Requires TypeFamilies and KindSignatures.
import Data.Kind (Type)

class Generic_ a where
  type Rep_ (a :: Type) :: Type
  to_ :: a -> Rep_ a
  from_ :: Rep_ a -> a

Then we can use that to obtain generic implementations of the methods of MVector/Vector:

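-- (auxiliary definitions of CMV and uncoercemv are at the end of this block)
-- Needs, among other extensions, TypeApplications, ScopedTypeVariables and ConstraintKinds.
-- vector imports (see gist at the end for a compilable sample)
import Data.Coerce (Coercible, coerce)
import Unsafe.Coerce (unsafeCoerce)
import qualified Data.Vector.Unboxed as U
import qualified Data.Vector.Unboxed.Mutable as UM
import Data.Vector.Generic.Mutable.Base (MVector(..))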



-- MVector

gbasicLength :: forall a s. CMV s a => UM.MVector s a -> Int
gbasicLength = basicLength @UM.MVector @(Rep_ a) @s . coerce

gbasicUnsafeSlice :: forall a s. CMV s a => Int -> Int -> UM.MVector s a -> UM.MVector s a
gbasicUnsafeSlice i j = uncoercemv . basicUnsafeSlice @UM.MVector @(Rep_ a) @s i j . coerce

-- etc.


-- idem Vector


-- This constraint holds when the UM.MVector data instance of a is
-- representationally equivalent to the data instance of its generic
-- representation (Rep_ a).
type CMV s a = (Coercible (UM.MVector s a) (UM.MVector s (Rep_ a)), MVector UM.MVector (Rep_ a))

-- Sadly coerce doesn't seem to want to solve this correctly so we use
-- unsafeCoerce as a workaround.
uncoercemv :: CMV s a => UM.MVector s (Rep_ a) -> UM.MVector s a
uncoercemv = unsafeCoerce

Now, given some data type:

data MyType = MyCons Int Bool ()

we can define a Generic_ instance through its isomorphism with a tuple:

instance Generic_ MyType where
  type Rep_ MyType = (Int, Bool, ())
  to_ (MyCons a b c) = (a, b, c)
  from_ (a, b, c) = MyCons a b c
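
The recipe only makes sense if to_ and from_ really are mutually inverse, since every element is converted on its way into and out of the underlying vector. As a minimal sketch (the roundTrips helper is a hypothetical name, not part of the answer or of any library), that property can be checked with nothing but base:

```haskell
{-# LANGUAGE TypeFamilies #-}

import Data.Kind (Type)

-- Same class as in the answer.
class Generic_ a where
  type Rep_ a :: Type
  to_ :: a -> Rep_ a
  from_ :: Rep_ a -> a

data MyType = MyCons Int Bool ()
  deriving (Eq, Show)

instance Generic_ MyType where
  type Rep_ MyType = (Int, Bool, ())
  to_ (MyCons a b c) = (a, b, c)
  from_ (a, b, c) = MyCons a b c

-- Hypothetical helper: converting to the representation and back
-- should give the original value.
roundTrips :: (Generic_ a, Eq a) => a -> Bool
roundTrips x = from_ (to_ x) == x
```

For example, roundTrips (MyCons 3 True ()) should evaluate to True.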

From there, a totally generic recipe gives its Unbox instance. If you have YourType instead, with its own Generic_ instance, you can take this and literally replace MyType with YourType.

newtype instance UM.MVector s MyType
  = MVMyType { unMVMyType :: UM.MVector s (Rep_ MyType) }

instance MVector UM.MVector MyType where
  basicLength = gbasicLength
  basicUnsafeSlice = gbasicUnsafeSlice
  -- etc.

-- idem (Vector U.Vector MyType)

-- MVector UM.MVector & Vector U.Vector   =   Unbox
instance U.Unbox MyType
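
For reference, here is a sketch of the same recipe with every required method written out by hand. It is not the code from the gist: instead of the coerce-based g* helpers it wraps and unwraps the newtypes with pattern matching, and it assumes the vector package is installed. The constructor name VMyType is made up for illustration. Method signatures have changed between vector releases, so they are left to inference here.

```haskell
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TypeFamilies #-}

import Data.Kind (Type)
import qualified Data.Vector.Generic as G
import qualified Data.Vector.Generic.Mutable as M
import qualified Data.Vector.Unboxed as U
import qualified Data.Vector.Unboxed.Mutable as UM

class Generic_ a where
  type Rep_ a :: Type
  to_ :: a -> Rep_ a
  from_ :: Rep_ a -> a

data MyType = MyCons Int Bool ()
  deriving (Eq, Show)

instance Generic_ MyType where
  type Rep_ MyType = (Int, Bool, ())
  to_ (MyCons a b c) = (a, b, c)
  from_ (a, b, c) = MyCons a b c

-- Both data family instances wrap the vectors of the representation.
newtype instance UM.MVector s MyType = MVMyType (UM.MVector s (Rep_ MyType))
newtype instance U.Vector MyType = VMyType (U.Vector (Rep_ MyType))

-- Structural methods delegate to the wrapped vector;
-- reads and writes convert elements with from_ and to_.
instance M.MVector UM.MVector MyType where
  basicLength (MVMyType v) = M.basicLength v
  basicUnsafeSlice i n (MVMyType v) = MVMyType (M.basicUnsafeSlice i n v)
  basicOverlaps (MVMyType a) (MVMyType b) = M.basicOverlaps a b
  basicUnsafeNew n = MVMyType <$> M.basicUnsafeNew n
  basicInitialize (MVMyType v) = M.basicInitialize v
  basicUnsafeRead (MVMyType v) i = from_ <$> M.basicUnsafeRead v i
  basicUnsafeWrite (MVMyType v) i x = M.basicUnsafeWrite v i (to_ x)

instance G.Vector U.Vector MyType where
  basicUnsafeFreeze (MVMyType v) = VMyType <$> G.basicUnsafeFreeze v
  basicUnsafeThaw (VMyType v) = MVMyType <$> G.basicUnsafeThaw v
  basicLength (VMyType v) = G.basicLength v
  basicUnsafeSlice i n (VMyType v) = VMyType (G.basicUnsafeSlice i n v)
  basicUnsafeIndexM (VMyType v) i = from_ <$> G.basicUnsafeIndexM v i

instance U.Unbox MyType
```

With these instances in scope, U.fromList and U.toList should work on lists of MyType like on any other unboxed element type. Nothing in the instance bodies is specific to MyType except the constructor names, which is what makes the boilerplate a candidate for automation.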

In theory all this boilerplate could be automated with internal language features (as opposed to TemplateHaskell or CPP). But there are various issues that get in the way in the current state of things.

First, Generic_ is essentially Generic from GHC.Generics. However, the uniform representation that gets derived by GHC is not in terms of tuples (,) but in terms of somewhat ad-hoc type constructors (:+:, :*:, M1, etc.), which lack Unbox instances.

  • Such Unbox instances could be added, so that Generic could be used directly.
  • The generics-eot package has a variant of Generic relying on tuples, which could be a direct replacement for Generic_ here.
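
To illustrate the gap between the two representations, the following sketch flattens the GHC.Generics representation of a single-constructor type into nested pairs, for which Unbox instances do exist when the fields are unboxable. The names GTuple, Tup, toTuple and fromTuple are hypothetical and belong to no library. Sum types (:+:) are deliberately not handled.

```haskell
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}

import Data.Kind (Type)
import GHC.Generics

-- Hypothetical class: maps a generic representation to a tuple-based type.
class GTuple (f :: Type -> Type) where
  type Tup f :: Type
  gto :: f p -> Tup f
  gfrom :: Tup f -> f p

-- A constructor without fields becomes ().
instance GTuple U1 where
  type Tup U1 = ()
  gto U1 = ()
  gfrom () = U1

-- A field is represented by its own type.
instance GTuple (K1 i c) where
  type Tup (K1 i c) = c
  gto (K1 x) = x
  gfrom = K1

-- A product of fields becomes a pair.
instance (GTuple f, GTuple g) => GTuple (f :*: g) where
  type Tup (f :*: g) = (Tup f, Tup g)
  gto (x :*: y) = (gto x, gto y)
  gfrom (x, y) = gfrom x :*: gfrom y

-- Metadata wrappers are dropped.
instance GTuple f => GTuple (M1 i m f) where
  type Tup (M1 i m f) = Tup f
  gto (M1 x) = gto x
  gfrom = M1 . gfrom

data MyType = MyCons Int Bool ()
  deriving (Eq, Show, Generic)

toTuple :: (Generic a, GTuple (Rep a)) => a -> Tup (Rep a)
toTuple = gto . from

fromTuple :: (Generic a, GTuple (Rep a)) => Tup (Rep a) -> a
fromTuple = to . gfrom
```

Here toTuple and fromTuple play the roles of to_ and from_, with Tup (Rep a) standing in for Rep_ a. The nesting of the pairs follows however GHC chooses to group the fields with :*:, so it is best treated as an implementation detail.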

Second, MVector and Vector have quite a few methods. To avoid listing them all, one might hope to leverage DerivingVia (or GeneralizedNewtypeDeriving); however, neither is applicable, because a couple of polymorphic monadic methods (e.g., basicUnsafeNew) prevent coercions. For now, the easiest way I can think of to abstract this is a CPP macro. In fact, the vector package uses that technique internally, and it might be reusable somehow. I believe properly addressing those issues requires a deep redesign of the Vector/MVector architecture.

Gist (not complete, but compilable): https://gist.github.com/Lysxia/c7bdcbba548ee019bf6b3f1e388bd660