# Discovering DataKinds at Runtime

*This post assumes a cursory knowledge of DataKinds.*

Something almost universal to introductions to “type-level programming” in Haskell is an example involving sized vectors. Something like this:

```
import qualified Data.Vector as V
data SizedVector (n :: Nat) a = SizedVector (V.Vector a)
deriving ( Eq
Show
, )
```

This definition carries a lot of promise. It can be used to show^{1} that
appending two vectors yields one as long as the sum of its parts:

```
(++) :: SizedVector n a -> SizedVector m a -> SizedVector (n + m) a
++) (SizedVector x) (SizedVector y) = SizedVector (x V.++ y) (
```

Mapping over a vector doesn’t change its length:

```
map :: (a -> b) -> SizedVector n a -> SizedVector n b
map f (SizedVector v) = SizedVector (V.map f v)
```

Nor does zipping:

```
zipWith :: (a -> b -> c)
-> SizedVector n a
-> SizedVector n b
-> SizedVector n c
zipWith f (Vector x) (Vector y) = Vector (V.zipWith f x y)
```

Folding doesn’t care about the length:

```
foldl :: (a -> b -> a) -> a -> SizedVector n b -> a
foldl f i (SizedVector x) = V.foldl f i x
```

We even get type-safe `head`

, `tail`

, `init`

, and `last`

:

```
head :: SizedVector (1 + n) a -> a
head (SizedVector x) = V.head x
tail :: SizedVector (1 + n) a -> SizedVector n a
tail (SizedVector x) = V.tail x
init :: SizedVector (n + 1) a -> SizedVector n a
init (SizedVector x) = V.init x
last :: SizedVector (n + 1) a -> a
last (SizedVector x) = V.last x
```

This type, these operations, and many more are available in the very nice vector-sized package.

Where the water gets murky (and where most of the type-level programming
tutorials I’ve found go silent) is actually *constructing* a
`SizedVector`

. `natVal`

from the `KnownNat`

class allows us to get ahold of the
`Nat`

vector length as a value. We could compare that value with the length of
some `V.Vector`

and return a `SizedVector`

if they match:

```
tryMakeSizedVector :: forall n a.
KnownNat n
=> V.Vector a
-> Maybe (SizedVector n a)
=
tryMakeSizedVector x let lenProxy = Proxy :: Proxy n
= fromIntegral (natVal lenProxy)
retLen in if V.length x == retLen
then Just (SizedVector x)
else Nothing
```

This function looks like it will work just fine with `map`

:

```
>>= :t map (+1) <$> (tryMakeSizedVector (V.fromList [1, 2, 3]))
map (+1) <$> (tryMakeSizedVector (V.fromList [1, 2, 3]))
:: (Num b, KnownNat n) => Maybe (SizedVector n b)
```

But GHC complains when we try to evaluate it:

```
>>= map (+1) <$> (tryMakeSizedVector (V.fromList [1, 2, 3]))
<interactive> error:
• No instance for (KnownNat n0) arising from a use of ‘it’
• In the first argument of ‘System.IO.print’, namely ‘it’
In a stmt of an interactive GHCi command: System.IO.print it
```

The issue is that we don’t really know which `n`

we’re going to get back from
`tryMakeSizedVector`

. In this simple case we know what it should be, and if we
use a type annotation it works:

```
>>= map (+1) <$>
(tryMakeSizedVector (V.fromList [1, 2, 3]) :: Maybe (SizedVector 3 Int))
Just (SizedVector [2,3,4])
```

But this isn’t that useful. If the vector comes from outside of our program (command line argument, read from a file, received from a socket, randomly generated, etc.) we won’t be able to provide the length in a type annotation.

## Unification

By encoding the length of vectors in their types, we’ve lost *something*. It
might seem like we’ve lost the ability to mention vectors whose size we don’t
know, but that’s not quite what’s going on. Consider the type of `map`

from
above, with explicit `forall`

:

`map :: forall n a b. (a -> b) -> SizedVector n a -> SizedVector n b`

In the definition of `map`

we’re clearly working with vectors whose size we
don’t “know.” `map`

works the same way for any `n`

, `a`

, and `b`

, no matter what
they are. And yet, the `n`

in the signature for `map`

is somehow different than
the `n`

in the signature for `tryMakeSizedVector`

:

```
tryMakeSizedVector :: forall n a.
KnownNat n
=> V.Vector a
-> Maybe (SizedVector n a)
```

The `n`

in the signature for `map`

expresses a relationship between the type of
its second argument and the type of its result. When we gain information about
either the second argument (by applying `map`

to something) or the result (by
applying something to `map`

), we gain information about the other. Consider
these expression fragments:

```
let x :: SizedVector 5 Int
= map (+1) x
y -- map :: forall n a b. (a -> b) -> SizedVector n a -> SizedVector n b
-- n, a, and b are the free type variables.
-- We learn from the type of x that:
-- n = 5
-- a = Int
-- And we learn from the type of (+1) that:
-- b = Int
-- Therefore:
-- y :: SizedVector 5 Int
-> let i :: Num c => SizedVector 10 c
\j = map fromIntegral j
i -- map :: forall n a b. (a -> b) -> SizedVector n a -> SizedVector n b
-- n, a, and b are the free type variables.
-- We learn from the type of i that:
-- n = 10
-- a = forall c. Num c => c
-- And we learn from the type of fromIntegral that:
-- b = forall d. Integral d => d
-- Therefore:
-- j :: forall d. Integral d => SizedVector 10 d
```

Even if the argument and/or result of `map`

aren’t explicitly annotated with
types, the type of `map`

can be used to gain information about the relationship
between the types of the arguments and the result. This process of discovering
information about terms by applying relationships is called
*unification*. There’s even an entire
programming language
whose evaluation model is based on unification.
Contrast this with the information we gain by applying `tryMakeSizedVector`

to
its argument:

```
let x :: V.Vector Int
= tryMakeSizedVector x
y -- tryMakeSizedVector :: forall n a.
-- KnownNat n
-- => V.Vector a
-- -> Maybe (SizedVector n a)
-- n and a are the free type variables.
-- We learn from the type of x that:
-- a = Int
-- Therefore:
-- y :: forall n. KnownNat n => Maybe (SizedVector n Int)
```

In the case of `tryMakeSizedVector`

, we can only gain information about `a`

from
the argument. `n`

is a new universally quantified type variable introduced by
`tryMakeSizedVector`

without any relationship to other terms. We can concretize
`n`

by annotating the result of `tryMakeSizedVector`

or assigning it to a term
with a more concrete type:

```
>>= map (+1) <$>
(tryMakeSizedVector (V.fromList [1, 2, 3]) :: Maybe (SizedVector 3 Int))
Just (SizedVector [2,3,4])
```

This reveals the source of the problem: in order to do this in a useful way, we
need access to the length of the argument vector at the type level, the opposite
of what `natVal`

does.

## Existential Quantification

`GHC.TypeNats`

contains this definition:

```
-- | This type represents unknown type-level natural numbers.
data SomeNat = forall n. KnownNat n => SomeNat (Proxy n)
```

By introducing a type variable on the right hand side of the definition (instead
of introducing it in the type constructor on the left hand side), we can “hide”
the type. The type is “hidden” in the sense that it does not appear in the type
constructor, and a function passed a value of type `SomeNat`

cannot learn the
specific `n`

that was hidden inside it, nor express any relationship between `n`

an another type. If `SomeNat`

is defined with GADT syntax, the analogy with
something like `tryMakeSizedVector`

becomes a bit clearer:

```
data SomeNat where
SomeNat :: forall n. KnownNat n => Proxy n -> SomeNat
tryMakeSizedVector :: forall n a.
KnownNat n
=> V.Vector a
-> Maybe (SizedVector n a)
```

`tryMakeSizedVector`

allows a new universally quantified type variable to be
introduced, while the `SomeNat`

constructor allows a universally quantified type
variable to be consumed or “forgotten.”

`GHC.TypeLits`

contains this handy definition:

`someNatVal :: Integer -> Maybe SomeNat`

This looks sort of like what we’re after. `someNatVal`

allows us to hoist an
arbitrary `Integer`

up to the type level, hidden inside of `SomeNat`

. `Maybe`

is
there simply because `Nat`

s must be positive, so `someNatVal`

will return
`Nothing`

if the argument is negative. The guts of `someNatVal`

aren’t too
interesting since `Nat`

is a bit magical in GHC. However, all the machinery is
available to define our own `SomeSizedVector`

and `someSizedVectorVal`

:

```
-- Definition with forall:
data SomeSizedVector a =
forall n. KnownNat n => SomeSizedVector (SizedVector n a)
-- Definition with GADT syntax:
data SomeSizedVector a where
SomeSizedVector :: KnownNat n => SizedVector n a -> SomeSizedVector a
someSizedVectorVal :: V.Vector a -> SomeSizedVector a
=
someSizedVectorVal x case someNatVal (fromIntegral (V.length x)) of
Nothing -> error "negative length?!"
Just (SomeNat (_ :: Proxy n)) ->
SomeSizedVector (SizedVector x :: SizedVector n a)
```

Note that we don’t need to hide the `a`

(the type of vector elements), just the
`n`

(the type-level vector length). Let’s see if `map`

works:

```
>>= case someSizedVectorVal (V.fromList [1, 2, 3]) of
SomeSizedVector sv -> map (+1) sv
<interactive> error:
• Couldn't match expected type ‘p’
with actual type ‘SizedVector
n Integer’
because type variable ‘n’ would escape its scope
This (rigid, skolem) type variable is bound by
a pattern with constructor:
SomeSizedVector :: forall a (n :: Nat).
KnownNat n =>
SizedVector n a -> SomeSizedVector a,
in a case alternative
at <interactive>:13:51-68
• In the expression: map (+ 1) sv
In a case alternative: SomeSizedVector sv -> map (+ 1) sv
In the expression:
case someSizedVectorVal (V.fromList [1, 2, 3]) of {
SomeSizedVector sv -> map (+ 1) sv }
• Relevant bindings include
sv :: SizedVector n integer-gmp-1.0.2.0:GHC.Integer.Type.Integer
(bound at <interactive>:13:67)
it :: p (bound at <interactive>:13:1)
```

What would the type of this expression look like if it somehow did typecheck?
The type of `someSizedVectorVal (V.fromList [1, 2, 3])`

is `Num a => SomeSizedVector a`

. The pattern match `SomeSizedVector sv`

yields an `sv`

with
the type `forall n. KnownNat n => SizedVector n a`

. `a`

is the same type from
`Num a => SomeSizedVector a`

, but `n`

is the type hidden by the
`SomeSizedVector`

constructor, and `map`

will leak it: the type of `map (+1) sv`

would be something like `exists n. KnownNat n => SizedVector n a`

. All we know
is that some `n`

existed when `SomeSizedVector`

was constructed. If this type
leaked out into the rest of the program, there’d be no way for it to unify with
anything. There’s no way to recover the type-level information; using the
`SomeSizedVector`

destroys the type.

The type `n`

needs to be “consumed” by something with a type like `forall n. KnownNat n => (forall r. SizedVector n a -> r)`

where `r`

is not allowed to
reference `n`

. `map`

clearly doesn’t fit this criteria: the returned vector is
known to have the same length as the argument, so `n`

will leak to the
result. However, `foldl`

can be used in a way that meets these requirements:

```
>>= case someSizedVectorVal (V.fromList [1, 2, 3]) of
SomeSizedVector sv -> foldl (+) 0 sv
6
```

In fact, `map`

can be used as part of the expression on the right hand side of
the pattern match, as long as the type of the whole expression doesn’t reference
the forbidden `n`

:

```
>>= case someSizedVectorVal (V.fromList [1, 2, 3]) of SomeSizedVector sv ->
foldl (+) 0 (map (*5) sv)
30
```

## Quantified Continuations

Every expression that uses a `SizedVector`

whose size isn’t known up front will
need to have this same structure:

`-> case someSizedVectorVal x of SomeSizedVector sv -> f sv \f x `

If the type of `x`

is `V.Vector a`

, then by applying the type of the
`SomeSizedVector`

constructor we know that the type of `f`

must be something
like `forall r. (forall n. KnownNat n => SizedVector n a) -> r`

. With the
`RankNTypes`

language extension, we can write a function that takes a function
like `f`

as an argument, short-circuiting the `SomeSizedVector`

type:

```
withSizedVector :: forall a r.
V.Vector a
-> (forall n. KnownNat n => SizedVector n a -> r)
-> r
=
withSizedVector x f case someNatVal (fromIntegral (V.length x)) of
Nothing -> error "negative length?!"
Just (SomeNat (Proxy :: Proxy n)) ->
SizedVector x :: SizedVector n a) f (
```

This is a bit easier to use than pattern matching on `SomeSizedVector`

:

```
>>= withSizedVector (V.fromList [1, 2, 3]) (foldl (+) 0 . map (*5))
30
```

The type of `withSizedVector`

provides a nice intuition too: it allows an
arbitrary vector to be injected into a type-safe bubble, as long as the
expression in the bubble can handle a sized vector of any length, and as long as
the type of the result doesn’t leak the length of the vector in its type. The
length can appear at the term-level, of course:

```
>>= withSizedVector (V.fromList [1, 2, 3, 4, 5]) (foldl (\i _ -> i+1) 0)
5
```

Programming with this pattern makes explicit the interface between a safe model with checked constraints and arbitrary values from the cold, harsh, outside world.

## Enforcing Model Invariants with DataKinds

I recently worked on a project involving software components that handle streams of video. Each component is modeled as a video transformer: it consumes video of some resolution, colorspace, framerate, etc. and produces a new video stream with possibly different properties. Some components can only produce video (e.g. reading a video stream from a file or capturing it from a camera), while others only consume video (e.g. encoding the video into a bytestream).

One obvious invariant for such an API is that two components can only be connected if the upstream component produces video that the downstream component knows how to consume. Let’s make a tiny version of this API that only checks if the components have compatible resolutions:

```
data VideoRes = VideoRes {
width :: Nat
height :: Nat
,
}| None -- ^ No video consumed/produced at this end.
data VideoStream (inVideo :: VideoRes) (outVideo :: VideoRes) where
VideoStream :: VideoStream inVideo outVideo
```

In reality the `VideoStream`

constructor would contain a record of functions
providing the stream component’s implementation. Type families can be used to
encode how a component modifies the resolution:

```
type family ResMult (n :: Nat) (res :: VideoRes) :: VideoRes
type instance ResMult n ('VideoRes w h) = 'VideoRes (n * w) (n * h)
type instance ResMult n 'None = 'None
```

Now we can mock a camera, video upscaler, and encoder:

```
camera :: Proxy ('VideoRes w h)
-- ^ Configure the camera to output this resolution.
-> VideoStream 'None ('VideoRes w h)
= undefined
camera
upscaler :: Proxy n
-- ^ Scale up by a factor of n.
-> VideoStream inVideo (ResMult n inVideo)
= undefined
upscaler
-- | An encoder should be able to accept video of any resolution.
encoder :: VideoStream ('VideoRes w h) 'None
= undefined encoder
```

Stream composition looks a bit like function composition:

```
(|>) :: VideoStream a b
-> VideoStream b c
-> VideoStream a c
|>) = undefined (
```

In fact, it looks so much like function composition that we can write a
`Category`

instance:

```
instance Category VideoStream where
id = undefined :: VideoStream a a
.) = flip (|>) (
```

Here’s a video processing pipeline that grabs video from a camera, doubles the
frame, then encodes it. The type is `VideoStream 'None 'None`

because this
composition of streams is closed; a `camera`

only produces video, and an
`encoder`

only consumes it.

```
doubleScaler :: Proxy ('VideoRes w h) -> VideoStream 'None 'None
= camera proxyRes
doubleScaler proxyRes |> upscaler (Proxy :: Proxy 2)
|> encoder
```

It’d be nice to be able to configure the `camera`

dynamically at runtime, so
we’ll need a function like `someNatVal`

and `someSizedVectorVal`

to hoist a term
level resolution up to the type level:

```
dynRes :: Integer -- ^ Width.
-> Integer -- ^ Height.
-> (forall w h. (KnownNat w, KnownNat h) =>
Maybe (Proxy ('VideoRes w h)) -> a)
-> a
=
dynRes w h f case someNatVal w of
Just (SomeNat (_ :: Proxy wn)) ->
case someNatVal h of
Just (SomeNat (_ :: Proxy hn)) ->
Just (Proxy :: Proxy ('VideoRes wn hn)))
f (Nothing -> f (Nothing :: Maybe (Proxy ('VideoRes 0 0)))
Nothing -> f (Nothing :: Maybe (Proxy ('VideoRes 0 0)))
```

Now we can provide a double scaler that has its model checked by the typechecker and is configurable at runtime:

```
dynDoubleScaler :: Integer -> Integer -> Maybe (VideoStream 'None 'None)
= dynRes w h (fmap doubleScaler) dynDoubleScaler w h
```

I hope that these examples have improved your intuition for how to use
`DataKinds`

in a dynamic way. When I first began using types like this I
struggled to find a good explanation of how to actually *use* them in a real
program.

I have a lot more to say about using quantification in Haskell
programs. Understanding how to use `forall`

to solve problems is the gateway to
many modern type system features, and the attendant intuitions offer guidance
even when working in less-mathematically-inspired languages; I’ve found it helps
to light the way when designing C++ templates.

Module containing this post’s code.

By “show” I really mean “promise.” These functions don’t constitute a proof that e.g. the length of

`x ++ y`

is the length of`x`

plus the length of`y`

. Instead, their types simply carry a promise that this is true, and indeed we could misbehave with the`SizedVector`

constructor and make a function with a bogus promise. We could use our own inductive natural type instead of`Nat`

to really prove that this is true, but something like Agda, Idris, or Coq would be better suited for this purpose, and I’ve found very little Haskell code in the wild that contains such proofs. The promise is a useful compromise.↩︎