T••LBX: Blog

Elm type declarations demystified

While trying to learn Elm, I came across a few concepts that were not easy to grasp, especially when it comes to types and their syntax. I struggled with the difference between type and type alias. I also did not fully understand what was going on in a type declaration. I like to identify in the code what is a type, what is a value, which value is a function, etc. I will try to answer all these questions here.

Disclaimer: I would like to make it very clear that the purpose of this article is not to criticize the Elm language, its syntax, or its features. The purpose is to share a few things I've learned about the language so that people struggling with the same things can get past them and, in the process, enjoy Elm a little bit more.

What Type Declaration Really Does

I had a hard time figuring out where to start because all the questions I would like to answer are connected. Initially I wanted to start with the difference between type and type alias, but I think it is simpler to first concentrate on the type declaration and what is going on there.

I was already quite comfortable with type annotations and with seeing all these types, since I have used many typed languages before. But that might have been what confused me. At first I read a type declaration as a union type like the ones in Crystal. Let's say something could be a String or an Int.

type StringOrInt = String | Int -- Not what you probably expect !!!

I quickly realized that I was wrong. It did not take long. The above code compiles, but it probably does not do what you think it does. We'll see why further in this article.

Anyway I stopped thinking about them as union types when I saw things like this.

type State
  = Open
  | Closed

Then I thought I had it wrong the other way: what comes after the equal sign must be values, because I can use them in the code. I thought of them exactly like atoms or symbols in languages like Lisp, Ruby or Erlang. And that is actually true until this comes along.

type Visitor
  = Authorized String
  | Guest

To be honest I carried on for a long time without questioning this one. It is easy to have an idea of what it is used for. I was already familiar with the concept of Maybe from other languages. I kind of understood the mechanism without understanding the implementation. In my head it was that thing you could use like symbols or sometimes with arguments.

Spoiler alert: what should have struck me more is the fact that there is a mix of made-up words and types in there. Like I said, I lived with it, since most of the time it was used in case expressions, which did a good job of making me forget these are actually values. I only questioned it when one was passed as an argument. I was using it without really understanding what it was.

That is, until one day I had a bug that forced me to get my head around this. While tracking it down I realized a few things, thanks to the help of nice people on the Slack channel.

Here is the example again:

type Visitor
  = Authorized String
  | Guest

So when you use Guest as a value, it is of type Visitor. Like symbols, but with type safety. However, when you use Authorized alone as a value, it is a function which takes a String as an argument and returns a Visitor. Its type is String -> Visitor.

This was probably obvious for many people, but for me it was quite a shift. In most compiled languages I have used, there is a clear separation between types and values. Even Erlang, which is a functional language, would achieve this with atoms, a tuple or multiple function clauses with pattern matching.

Now that I know this, I feel a lot more comfortable. The way I see the type declaration now is that you declare a finite list of ways to get a value of this type. Some of them are direct values of this type. Some of them are functions to which you pass arguments until you get a value of this type. Once you pass a String to the function Authorized, it returns a value of type Visitor.
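To make this concrete, here is a minimal sketch of both kinds of constructor in use. The names anonymous, authorize and visitors are made up for this illustration; only the Visitor declaration comes from the example above.

```elm
-- Same declaration as in the example above.
type Visitor
  = Authorized String
  | Guest


-- Guest is already a value of type Visitor.
anonymous : Visitor
anonymous =
  Guest


-- Authorized on its own is a function from String to Visitor,
-- so it can be bound to another name like any other function.
authorize : String -> Visitor
authorize =
  Authorized


-- Because it is a function, it can also be passed to List.map
-- to turn each name into a Visitor.
visitors : List Visitor
visitors =
  anonymous :: List.map authorize [ "Alice", "Bob" ]
```

If the annotation on authorize were anything other than String -> Visitor, the compiler would reject it, which is a quick way to check what a constructor really is.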

The Difference Between type and type alias

Now that we have seen what the type declaration does in the previous section, it is easier to understand how type alias differs. Before I knew how type works, I sort of simplified it in my head and thought type alias was for records, a bit like defining a typedef struct in a language like C. That is, obviously, until I saw type aliases that were not records.

It may seem stupid now, but in some cases it all looked like a bunch of ModuleCase words (capitalized names), and the difference with or without alias was not obvious.

type alias Name = String
type UniqueState = NoOp

Here it is easy because you already know String is an existing type, not a made-up word. But when reading code as a beginner, you don't know all the existing types, and then the difference is a little blurrier, especially with types that are less obvious than core types like String or Int. Think about these two examples, each of which compiles on its own in the REPL.

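type Boolies = List Bool        -- New type; here List is a made-up constructor taking a Bool
type alias Boolies = List Bool  -- Another name for the existing type List Bool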

It turns out type alias is closer to what typedef does in a language like C. It is just a shortcut for another existing type, or more often an aggregate of existing types. It does not create any usable values, just a type name you can use in a type signature.

Well... except for the constructor function you get when the alias is a record.
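Here is a small sketch of both cases, assuming nothing beyond the core language. The names greet, greeting and john are invented for the example.

```elm
-- A non-record alias: Name and String are the same type.
type alias Name =
  String


-- A record alias: Person is a type and also a constructor function.
type alias Person =
  { first : Name
  , last : Name
  }


greet : Name -> String
greet name =
  "Hello " ++ name


-- A plain String is accepted where a Name is expected,
-- because the alias does not create a new type.
greeting : String
greeting =
  greet "Bob"


-- The record constructor takes the fields in declaration order:
-- first, then last.
john : Person
john =
  Person "John" "Doe"
```

Note that Name gives you nothing to call: there is no Name function, only the type name. Only the record alias comes with a constructor.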

Here is how I view the difference between type and type alias.

-- The name of the type is only a type
type Visitor
  -- After the equal sign, the first word of each line is a value
  = Guest              -- Direct value
  | Authorized String  -- Authorized is a function value
                       -- String is the type of the argument

-- The name is a type AND a constructor function (therefore a value)
type alias Person =
  -- After the equal sign are only existing types
  { first : String
  , last : String
  }

I am sure there are semantic errors in the way I describe things, but this is how I see it so far.

The Union Type Trap

If you use made-up words in a type alias declaration, it does not compile, since an alias can only refer to types that already exist.

type alias Binary = Right | Wrong -- This does not compile

However, using an existing type in a type declaration does compile as we've seen earlier.

type StringOrInt = String | Int

This does not create a type which could be either a String or an Int. It just creates two values named String and Int that have nothing to do with the built-in types of the same name.

import Html

type StringOrInt = String | Int

fun : StringOrInt -> String
fun _ =
  "Whatever"

main = Html.text <| fun String

String appears three times here, but only one occurrence is a type: the return type in the annotation of fun. In the other two places (the declaration of StringOrInt and the call fun String), String is just a word, like a symbol if you come from Ruby. It has no relation at all with the type String.

To be honest, my instinct is that the compiler should at least discourage us from using the name of a type here. Especially one that is in the core library. But there might be a good reason why it is not the case. I am too much of a newbie to make a fair assumption.

Anyway, here is how you really create a type which is either a String or an Int.

type StringOrInt
  = ActuallyString String
  | ActuallyInt Int

ensureString : StringOrInt -> String
ensureString val =
  case val of
    ActuallyString str ->
      str

    ActuallyInt num ->
      String.fromInt num

ensureString (ActuallyString "Hello")
-- "Hello"

ensureString (ActuallyInt 42)
-- "42"

The names ActuallyString and ActuallyInt can be whatever you want. I intentionally picked names that are not types but hint at them.

The Many ModuleCase Names

One aspect that can be overwhelming is the number of things that are written in ModuleCase but are actually different. Sometimes they are even spelled the same.

module Visitor exposing (..)
-- Visitor is a module name

-- Here Visitor is a type inside the Visitor module (Visitor.Visitor)
type Visitor 
  = Authorized String -- Authorized is a function, String is a type
                      -- Authorized returns a value of type Visitor 
                      -- once called with a String
  | Guest             -- Guest is a value of type Visitor

Authorized "Bob"  -- Authorized called with a String

type alias Person =   -- Person is a type AND a function 
  { first : String    -- String is a type
  , last : String     -- String is a type
  }

fullname : Person -> String  -- Person used as a type
  -- ...

Person "John" "Doe"  -- Person used as a constructor function

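The constructors look similar, but for a record alias the alias name itself is the constructor, whereas for a type declared with type it is not. There the name is only a type, and each line after the equal sign is a constructor: a function if it has arguments, a plain value if it has none.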

This example summarizes quite well what I struggled with. I like to identify in the code what is a type, what is a value, which value is a function, etc. Hopefully this article makes it clearer and not worse.

Like I said in the disclaimer, this is absolutely not a criticism of the language. Writing a language is bloody hard, and there are only so many things you can differentiate with text if you want to avoid line noise. I don't know any language which completely avoids using the same styling for different things. The rest comes from context.

A Big Thank You To The Elm Community

I have to say my progress was only recent and I owe it to people on the Elm Slack channel and others in real life. Everybody has been nice, helpful and beginner friendly, which is not the case in all communities. We've all been on forums and got answers like "Why would you do that?" or "RTFM".

It is quite remarkable considering that there are 2 main categories of people attracted to Elm. One is a group of people familiar with functional programming, coming from Haskell or ML. The other is a group of people coming from JavaScript who want to get more from what a functional language with type safety has to offer. I am personally from the latter group.

Actually, that might be why it works well. I had maths prodigies throwing category theory terms at me so kindly that I was comfortable enough to make them believe I was understanding what they were talking about. That says a lot.

Thank you.


Tags: elm functional programming types