# Discovering DataKinds at Runtime

*This post assumes a cursory knowledge of DataKinds.*

Something almost universal to introductions to “type-level programming” in Haskell is an example involving sized vectors. Something like this:

```
import qualified Data.Vector as V
data SizedVector (n :: Nat) a = SizedVector (V.Vector a)
deriving ( Eq
, Show
)
```

This definition carries a lot of promise. It can be used to show^{1} that appending two vectors yields one as long as the sum of its parts:

```
(++) :: SizedVector n a -> SizedVector m a -> SizedVector (n + m) a
(++) (SizedVector x) (SizedVector y) = SizedVector (x V.++ y)
```

Mapping over a vector doesn’t change its length:

```
map :: (a -> b) -> SizedVector n a -> SizedVector n b
map f (SizedVector v) = SizedVector (V.map f v)
```

Nor does zipping:

```
zipWith :: (a -> b -> c)
-> SizedVector n a
-> SizedVector n b
-> SizedVector n c
zipWith f (Vector x) (Vector y) = Vector (V.zipWith f x y)
```

Folding doesn’t care about the length:

We even get type-safe `head`

, `tail`

, `init`

, and `last`

:

```
head :: SizedVector (1 + n) a -> a
head (SizedVector x) = V.head x
tail :: SizedVector (1 + n) a -> SizedVector n a
tail (SizedVector x) = V.tail x
init :: SizedVector (n + 1) a -> SizedVector n a
init (SizedVector x) = V.init x
last :: SizedVector (n + 1) a -> a
last (SizedVector x) = V.last x
```

This type, these operations, and many more are available in the very nice vector-sized package.

Where the water gets murky (and where most of the type-level programming tutorials I’ve found go silent) is actually *constructing* a `SizedVector`

. `natVal`

from the `KnownNat`

class allows us to get ahold of the `Nat`

vector length as a value. We could compare that value with the length of some `V.Vector`

and return a `SizedVector`

if they match:

```
tryMakeSizedVector :: forall n a.
KnownNat n
=> V.Vector a
-> Maybe (SizedVector n a)
tryMakeSizedVector x =
let lenProxy = Proxy :: Proxy n
retLen = fromIntegral (natVal lenProxy)
in if V.length x == retLen
then Just (SizedVector x)
else Nothing
```

This function looks like it will work just fine with `map`

:

```
>>= :t map (+1) <$> (tryMakeSizedVector (V.fromList [1, 2, 3]))
map (+1) <$> (tryMakeSizedVector (V.fromList [1, 2, 3]))
:: (Num b, KnownNat n) => Maybe (SizedVector n b)
```

But GHC complains when we try to evaluate it:

```
>>= map (+1) <$> (tryMakeSizedVector (V.fromList [1, 2, 3]))
<interactive> error:
• No instance for (KnownNat n0) arising from a use of ‘it’
• In the first argument of ‘System.IO.print’, namely ‘it’
In a stmt of an interactive GHCi command: System.IO.print it
```

The issue is that we don’t really know which `n`

we’re going to get back from `tryMakeSizedVector`

. In this simple case we know what it should be, and if we use a type annotation it works:

```
>>= map (+1) <$>
(tryMakeSizedVector (V.fromList [1, 2, 3]) :: Maybe (SizedVector 3 Int))
Just (SizedVector [2,3,4])
```

But this isn’t that useful. If the vector comes from outside of our program (command line argument, read from a file, received from a socket, randomly generated, etc.) we won’t be able to provide the length in a type annotation.

## Unification

By encoding the length of vectors in their types, we’ve lost *something*. It might seem like we’ve lost the ability to mention vectors whose size we don’t know, but that’s not quite what’s going on. Consider the type of `map`

from above, with explicit `forall`

:

In the definition of `map`

we’re clearly working with vectors whose size we don’t “know.” `map`

works the same way for any `n`

, `a`

, and `b`

, no matter what they are. And yet, the `n`

in the signature for `map`

is somehow different than the `n`

in the signature for `tryMakeSizedVector`

:

The `n`

in the signature for `map`

expresses a relationship between the type of its second argument and the type of its result. When we gain information about either the second argument (by applying `map`

to something) or the result (by applying something to `map`

), we gain information about the other. Consider these expression fragments:

```
let x :: SizedVector 5 Int
y = map (+1) x
-- map :: forall n a b. (a -> b) -> SizedVector n a -> SizedVector n b
-- n, a, and b are the free type variables.
-- We learn from the type of x that:
-- n = 5
-- a = Int
-- And we learn from the type of (+1) that:
-- b = Int
-- Therefore:
-- y :: SizedVector 5 Int
\j -> let i :: Num c => SizedVector 10 c
i = map fromIntegral j
-- map :: forall n a b. (a -> b) -> SizedVector n a -> SizedVector n b
-- n, a, and b are the free type variables.
-- We learn from the type of i that:
-- n = 10
-- a = forall c. Num c => c
-- And we learn from the type of fromIntegral that:
-- b = forall d. Integral d => d
-- Therefore:
-- j :: forall d. Integral d => SizedVector 10 d
```

Even if the argument and/or result of `map`

aren’t explicitly annotated with types, the type of `map`

can be used to gain information about the relationship between the types of the arguments and the result. This process of discovering information about terms by applying relationships is called *unification*. There’s even an entire programming language whose evaluation model is based on unification. Contrast this with the information we gain by applying `tryMakeSizedVector`

to its argument:

```
let x :: V.Vector Int
y = tryMakeSizedVector x
-- tryMakeSizedVector :: forall n a.
-- KnownNat n
-- => V.Vector a
-- -> Maybe (SizedVector n a)
-- n and a are the free type variables.
-- We learn from the type of x that:
-- a = Int
-- Therefore:
-- y :: forall n. KnownNat n => Maybe (SizedVector n Int)
```

In the case of `tryMakeSizedVector`

, we can only gain information about `a`

from the argument. `n`

is a new universally quantified type variable introduced by `tryMakeSizedVector`

without any relationship to other terms. We can concretize `n`

by annotating the result of `tryMakeSizedVector`

or assigning it to a term with a more concrete type:

```
>>= map (+1) <$>
(tryMakeSizedVector (V.fromList [1, 2, 3]) :: Maybe (SizedVector 3 Int))
Just (SizedVector [2,3,4])
```

This reveals the source of the problem: in order to do this in a useful way, we need access to the length of the argument vector at the type level, the opposite of what `natVal`

does.

## Existential Quantification

`GHC.TypeNats`

contains this definition:

```
-- | This type represents unknown type-level natural numbers.
data SomeNat = forall n. KnownNat n => SomeNat (Proxy n)
```

By introducing a type variable on the right hand side of the definition (instead of introducing it in the type constructor on the left hand side), we can “hide” the type. The type is “hidden” in the sense that it does not appear in the type constructor, and a function passed a value of type `SomeNat`

cannot learn the specific `n`

that was hidden inside it, nor express any relationship between `n`

an another type. If `SomeNat`

is defined with GADT syntax, the analogy with something like `tryMakeSizedVector`

becomes a bit clearer:

```
data SomeNat where
SomeNat :: forall n. KnownNat n => Proxy n -> SomeNat
tryMakeSizedVector :: forall n a.
KnownNat n
=> V.Vector a
-> Maybe (SizedVector n a)
```

`tryMakeSizedVector`

allows a new universally quantified type variable to be introduced, while the `SomeNat`

constructor allows a universally quantified type variable to be consumed or “forgotten.”

`GHC.TypeLits`

contains this handy definition:

This looks sort of like what we’re after. `someNatVal`

allows us to hoist an arbitrary `Integer`

up to the type level, hidden inside of `SomeNat`

. `Maybe`

is there simply because `Nat`

s must be positive, so `someNatVal`

will return `Nothing`

if the argument is negative. The guts of `someNatVal`

aren’t too interesting since `Nat`

is a bit magical in GHC. However, all the machinery is available to define our own `SomeSizedVector`

and `someSizedVectorVal`

:

```
-- Definition with forall:
data SomeSizedVector a =
forall n. KnownNat n => SomeSizedVector (SizedVector n a)
-- Definition with GADT syntax:
data SomeSizedVector a where
SomeSizedVector :: KnownNat n => SizedVector n a -> SomeSizedVector a
someSizedVectorVal :: V.Vector a -> SomeSizedVector a
someSizedVectorVal x =
case someNatVal (fromIntegral (V.length x)) of
Nothing -> error "negative length?!"
Just (SomeNat (_ :: Proxy n)) ->
SomeSizedVector (SizedVector x :: SizedVector n a)
```

Note that we don’t need to hide the `a`

(the type of vector elements), just the `n`

(the type-level vector length). Let’s see if `map`

works:

```
>>= case someSizedVectorVal (V.fromList [1, 2, 3]) of
SomeSizedVector sv -> map (+1) sv
<interactive> error:
• Couldn't match expected type ‘p’
with actual type ‘SizedVector
n Integer’
because type variable ‘n’ would escape its scope
This (rigid, skolem) type variable is bound by
a pattern with constructor:
SomeSizedVector :: forall a (n :: Nat).
KnownNat n =>
SizedVector n a -> SomeSizedVector a,
in a case alternative
at <interactive>:13:51-68
• In the expression: map (+ 1) sv
In a case alternative: SomeSizedVector sv -> map (+ 1) sv
In the expression:
case someSizedVectorVal (V.fromList [1, 2, 3]) of {
SomeSizedVector sv -> map (+ 1) sv }
• Relevant bindings include
sv :: SizedVector n integer-gmp-1.0.2.0:GHC.Integer.Type.Integer
(bound at <interactive>:13:67)
it :: p (bound at <interactive>:13:1)
```

What would the type of this expression look like if it somehow did typecheck? The type of `someSizedVectorVal (V.fromList [1, 2, 3])`

is `Num a => SomeSizedVector a`

. The pattern match `SomeSizedVector sv`

yields an `sv`

with the type `forall n. KnownNat n => SizedVector n a`

. `a`

is the same type from `Num a => SomeSizedVector a`

, but `n`

is the type hidden by the `SomeSizedVector`

constructor, and `map`

will leak it: the type of `map (+1) sv`

would be something like `exists n. KnownNat n => SizedVector n a`

. All we know is that some `n`

existed when `SomeSizedVector`

was constructed. If this type leaked out into the rest of the program, there’d be no way for it to unify with anything. There’s no way to recover the type-level information; using the `SomeSizedVector`

destroys the type.

The type `n`

needs to be “consumed” by something with a type like `forall n. KnownNat n => (forall r. SizedVector n a -> r)`

where `r`

is not allowed to reference `n`

. `map`

clearly doesn’t fit this criteria: the returned vector is known to have the same length as the argument, so `n`

will leak to the result. However, `foldl`

can be used in a way that meets these requirements:

```
>>= case someSizedVectorVal (V.fromList [1, 2, 3]) of
SomeSizedVector sv -> foldl (+) 0 sv
6
```

In fact, `map`

can be used as part of the expression on the right hand side of the pattern match, as long as the type of the whole expression doesn’t reference the forbidden `n`

:

```
>>= case someSizedVectorVal (V.fromList [1, 2, 3]) of SomeSizedVector sv ->
foldl (+) 0 (map (*5) sv)
30
```

## Quantified Continuations

Every expression that uses a `SizedVector`

whose size isn’t known up front will need to have this same structure:

If the type of `x`

is `V.Vector a`

, then by applying the type of the `SomeSizedVector`

constructor we know that the type of `f`

must be something like `forall r. (forall n. KnownNat n => SizedVector n a) -> r`

. With the `RankNTypes`

language extension, we can write a function that takes a function like `f`

as an argument, short-circuiting the `SomeSizedVector`

type:

```
withSizedVector :: forall a r.
V.Vector a
-> (forall n. KnownNat n => SizedVector n a -> r)
-> r
withSizedVector x f =
case someNatVal (fromIntegral (V.length x)) of
Nothing -> error "negative length?!"
Just (SomeNat (Proxy :: Proxy n)) ->
f (SizedVector x :: SizedVector n a)
```

This is a bit easier to use than pattern matching on `SomeSizedVector`

:

```
>>= withSizedVector (V.fromList [1, 2, 3]) (foldl (+) 0 . map (*5))
30
```

The type of `withSizedVector`

provides a nice intuition too: it allows an arbitrary vector to be injected into a type-safe bubble, as long as the expression in the bubble can handle a sized vector of any length, and as long as the type of the result doesn’t leak the length of the vector in its type. The length can appear at the term-level, of course:

```
>>= withSizedVector (V.fromList [1, 2, 3, 4, 5]) (foldl (\i _ -> i+1) 0)
5
```

Programming with this pattern makes explicit the interface between a safe model with checked constraints and arbitrary values from the cold, harsh, outside world.

## Enforcing Model Invariants with DataKinds

I recently worked on a project involving software components that handle streams of video. Each component is modeled as a video transformer: it consumes video of some resolution, colorspace, framerate, etc. and produces a new video stream with possibly different properties. Some components can only produce video (e.g. reading a video stream from a file or capturing it from a camera), while others only consume video (e.g. encoding the video into a bytestream).

One obvious invariant for such an API is that two components can only be connected if the upstream component produces video that the downstream component knows how to consume. Let’s make a tiny version of this API that only checks if the components have compatible resolutions:

```
data VideoRes = VideoRes {
width :: Nat
, height :: Nat
}
| None -- ^ No video consumed/produced at this end.
data VideoStream (inVideo :: VideoRes) (outVideo :: VideoRes) where
VideoStream :: VideoStream inVideo outVideo
```

In reality the `VideoStream`

constructor would contain a record of functions providing the stream component’s implementation. Type families can be used to encode how a component modifies the resolution:

```
type family ResMult (n :: Nat) (res :: VideoRes) :: VideoRes
type instance ResMult n ('VideoRes w h) = 'VideoRes (n * w) (n * h)
type instance ResMult n 'None = 'None
```

Now we can mock a camera, video upscaler, and encoder:

```
camera :: Proxy ('VideoRes w h)
-- ^ Configure the camera to output this resolution.
-> VideoStream 'None ('VideoRes w h)
camera = undefined
upscaler :: Proxy n
-- ^ Scale up by a factor of n.
-> VideoStream inVideo (ResMult n inVideo)
upscaler = undefined
-- | An encoder should be able to accept video of any resolution.
encoder :: VideoStream ('VideoRes w h) 'None
encoder = undefined
```

Stream composition looks a bit like function composition:

In fact, it looks so much like function composition that we can write a `Category`

instance:

Here’s a video processing pipeline that grabs video from a camera, doubles the frame, then encodes it. The type is `VideoStream 'None 'None`

because this composition of streams is closed; a `camera`

only produces video, and an `encoder`

only consumes it.

```
doubleScaler :: Proxy ('VideoRes w h) -> VideoStream 'None 'None
doubleScaler proxyRes = camera proxyRes
|> upscaler (Proxy :: Proxy 2)
|> encoder
```

It’d be nice to be able to configure the `camera`

dynamically at runtime, so we’ll need a function like `someNatVal`

and `someSizedVectorVal`

to hoist a term level resolution up to the type level:

```
dynRes :: Integer -- ^ Width.
-> Integer -- ^ Height.
-> (forall w h. (KnownNat w, KnownNat h) =>
Maybe (Proxy ('VideoRes w h)) -> a)
-> a
dynRes w h f =
case someNatVal w of
Just (SomeNat (_ :: Proxy wn)) ->
case someNatVal h of
Just (SomeNat (_ :: Proxy hn)) ->
f (Just (Proxy :: Proxy ('VideoRes wn hn)))
Nothing -> f (Nothing :: Maybe (Proxy ('VideoRes 0 0)))
Nothing -> f (Nothing :: Maybe (Proxy ('VideoRes 0 0)))
```

Now we can provide a double scaler that has its model checked by the typechecker and is configurable at runtime:

```
dynDoubleScaler :: Integer -> Integer -> Maybe (VideoStream 'None 'None)
dynDoubleScaler w h = dynRes w h (fmap doubleScaler)
```

I hope that these examples have improved your intuition for how to use `DataKinds`

in a dynamic way. When I first began using types like this I struggled to find a good explanation of how to actually *use* them in a real program.

I have a lot more to say about using quantification in Haskell programs. Understanding how to use `forall`

to solve problems is the gateway to many modern type system features, and the attendant intuitions offer guidance even when working in less-mathematically-inspired languages; I’ve found it helps to light the way when designing C++ templates.

Module containing this post’s code.

By “show” I really mean “promise.” These functions don’t constitute a proof that e.g. the length of

`x ++ y`

is the length of`x`

plus the length of`y`

. Instead, their types simply carry a promise that this is true, and indeed we could misbehave with the`SizedVector`

constructor and make a function with a bogus promise. We could use our own inductive natural type instead of`Nat`

to really prove that this is true, but something like Agda, Idris, or Coq would be better suited for this purpose, and I’ve found very little Haskell code in the wild that contains such proofs. The promise is a useful compromise.↩︎