Type Classes
Chapter Goals
This chapter will introduce a powerful form of abstraction which is enabled by PureScript's type system - type classes.
This motivating example for this chapter will be a library for hashing data structures. We will see how the machinery of type classes allow us to hash complex data structures without having to think directly about the structure of the data itself.
We will also see a collection of standard type classes from PureScript's Prelude and standard libraries. PureScript code leans heavily on the power of type classes to express ideas concisely, so it will be beneficial to familiarize yourself with these classes.
Project Setup
The source code for this chapter is defined in the file src/Data/Hashable.purs
.
The project has the following dependencies:
maybe
, which defines theMaybe
data type, which represents optional values.tuples
, which defines theTuple
data type, which represents pairs of values.either
, which defines theEither
data type, which represents disjoint unions.strings
, which defines functions which operate on strings.functions
, which defines some helper functions for defining PureScript functions.
The module Data.Hashable
imports several modules provided by these packages.
Show Me!
Our first simple example of a type class is provided by a function we've seen several times already: the show
function, which takes a value and displays it as a string.
show
is defined by a type class in the Prelude
module called Show
, which is defined as follows:
This code declares a new type class called Show
, which is parameterized by the type variable a
.
A type class instance contains implementations of the functions defined in a type class, specialized to a particular type.
For example, here is the definition of the Show
type class instance for Boolean
values, taken from the Prelude:
This code declares a type class instance called showBoolean
- in PureScript, type class instances are named to aid the readability of the generated JavaScript. We say that the Boolean
type belongs to the Show
type class.
We can try out the Show
type class in PSCi, by showing a few values with different types:
These examples demonstrate how to show
values of various primitive types, but we can also show
values with more complicated types:
If we try to show a value of type Data.Either
, we get an interesting error message:
The problem here is not that there is no Show
instance for the type we intended to show
, but rather that PSCi was unable to infer the type. This is indicated by the unknown type a
in the inferred type.
We can annotate the expression with a type, using the ::
operator, so that PSCi can choose the correct type class instance:
Some types do not have a Show
instance defined at all. One example of this is the function type ->
. If we try to show
a function from Int
to Int
, we get an appropriate error message from the type checker:
Exercises
(Easy) Use the
showShape
function from the previous chapter to define aShow
instance for theShape
type.
Common Type Classes
In this section, we'll look at some standard type classes defined in the Prelude and standard libraries. These type classes form the basis of many common patterns of abstraction in idiomatic PureScript code, so a basic understanding of their functions is highly recommended.
Eq
The Eq
type class defines the eq
function, which tests two values for equality. The ==
operator is actually just an alias for eq
.
Note that in either case, the two arguments must have the same type: it does not make sense to compare two values of different types for equality.
Try out the Eq
type class in PSCi:
Ord
The Ord
type class defines the compare
function, which can be used to compare two values, for types which support ordering. The comparison operators <
and >
along with their non-strict companions <=
and >=
, can be defined in terms of compare
.
The compare
function compares two values, and returns an Ordering
, which has three alternatives:
LT
- if the first argument is less than the second.EQ
- if the first argument is equal to the second.GT
- if the first argument is greater than the second.
Again, we can try out the compare
function in PSCi:
Field
The Field
type class identifies those types which support numeric operators such as addition, subtraction, multiplication and division. It is provided to abstract over those operators, so that they can be reused where appropriate.
Note: Just like the Eq
and Ord
type classes, the Field
type class has special support in the PureScript compiler, so that simple expressions such as 1 + 2 * 3
get translated into simple JavaScript, as opposed to function calls which dispatch based on a type class implementation.
The Field
type class is composed from several more general superclasses. This allows us to talk abstractly about types which support some but not all of the Field
operations. For example, a type of natural numbers would be closed under addition and multiplication, but not necessarily under subtraction, so that type might have an instance of the Semiring
class (which is a superclass of Num
), but not an instance of Ring
or Field
.
Superclasses will be explained later in this chapter, but the full numeric type class hierarchy is beyond the scope of this chapter. The interested reader is encouraged to read the documentation for the superclasses of Field
in prelude
.
Semigroups and Monoids
The Semigroup
type class identifies those types which support an append
operation to combine two values:
Strings form a semigroup under regular string concatenation, and so do arrays. Several other standard instances are provided by the prelude
package.
The <>
concatenation operator, which we have already seen, is provided as an alias for append
.
The Monoid
type class (provided by the prelude
package) extends the Semigroup
type class with the concept of an empty value, called mempty
:
Again, strings and arrays are simple examples of monoids.
A Monoid
type class instance for a type describes how to accumulate a result with that type, by starting with an "empty" value, and combining new results. For example, we can write a function which concatenates an array of values in some monoid by using a fold. In PSCi:
The prelude
package provides many examples of monoids and semigroups, which we will use in the rest of the book.
Foldable
If the Monoid
type class identifies those types which act as the result of a fold, then the Foldable
type class identifies those type constructors which can be used as the source of a fold.
The Foldable
type class is provided in the foldable-traversable
package, which also contains instances for some standard containers such as arrays and Maybe
.
The type signatures for the functions belonging to the Foldable
class are a little more complicated than the ones we've seen so far:
It is instructive to specialize to the case where f
is the array type constructor. In this case, we can replace f a
with Array a
for any a, and we notice that the types of foldl
and foldr
become the types that we saw when we first encountered folds over arrays.
What about foldMap
? Well, that becomes forall a m. Monoid m => (a -> m) -> Array a -> m
. This type signature says that we can choose any type m
for our result type, as long as that type is an instance of the Monoid
type class. If we can provide a function which turns our array elements into values in that monoid, then we can accumulate over our array using the structure of the monoid, and return a single value.
Let's try out foldMap
in PSCi:
Here, we choose the monoid for strings, which concatenates strings together, and the show
function which renders an Int
as a String
. Then, passing in an array of integers, we see that the results of show
ing each integer have been concatenated into a single String
.
But arrays are not the only types which are foldable. foldable-traversable
also defines Foldable
instances for types like Maybe
and Tuple
, and other libraries like lists
define Foldable
instances for their own data types. Foldable
captures the notion of an ordered container.
Functor, and Type Class Laws
The Prelude also defines a collection of type classes which enable a functional style of programming with side-effects in PureScript: Functor
, Applicative
and Monad
. We will cover these abstractions later in the book, but for now, let's look at the definition of the Functor
type class, which we have seen already in the form of the map
function:
The map
function (and its alias <$>
) allows a function to be "lifted" over a data structure. The precise definition of the word "lifted" here depends on the data structure in question, but we have already seen its behavior for some simple types:
How can we understand the meaning of the map
function, when it acts on many different structures, each in a different way?
Well, we can build an intuition that the map
function applies the function it is given to each element of a container, and builds a new container from the results, with the same shape as the original. But how do we make this concept precise?
Type class instances for Functor
are expected to adhere to a set of laws, called the functor laws:
map id xs = xs
map g (map f xs) = map (g <<< f) xs
The first law is the identity law. It states that lifting the identity function (the function which returns its argument unchanged) over a structure just returns the original structure. This makes sense since the identity function does not modify its input.
The second law is the composition law. It states that mapping one function over a structure, and then mapping a second, is the same thing as mapping the composition of the two functions over the structure.
Whatever "lifting" means in the general sense, it should be true that any reasonable definition of lifting a function over a data structure should obey these rules.
Many standard type classes come with their own set of similar laws. The laws given to a type class give structure to the functions of that type class and allow us to study its instances in generality. The interested reader can research the laws ascribed to the standard type classes that we have seen already.
Exercises
(Easy) The following newtype represents a complex number:
Define
Show
andEq
instances forComplex
.
Type Class Constraints
Types of functions can be constrained by using type classes. Here is an example: suppose we want to write a function which tests if three values are equal, by using equality defined using an Eq
type class instance.
The type declaration looks like an ordinary polymorphic type defined using forall
. However, there is a type class constraint Eq a
, separated from the rest of the type by a double arrow =>
.
This type says that we can call threeAreEqual
with any choice of type a
, as long as there is an Eq
instance available for a
in one of the imported modules.
Constrained types can contain several type class instances, and the types of the instances are not restricted to simple type variables. Here is another example which uses Ord
and Show
instances to compare two values:
Note that multiple constraints can be specified by using the =>
symbol multiple times, just like we specify curried functions of multiple arguments. But remember not to confuse the two symbols:
a -> b
denotes the type of functions from typea
to typeb
, whereasa => b
applies the constrainta
to the typeb
.
The PureScript compiler will try to infer constrained types when a type annotation is not provided. This can be useful if we want to use the most general type possible for a function.
To see this, try using one of the standard type classes like Semiring
in PSCi:
Here, we might have annotated this function as Int -> Int
, or Number -> Number
, but PSCi shows us that the most general type works for any Semiring
, allowing us to use our function with both Int
s and Number
s.
Overlapping Instances
PureScript has another rule regarding type class instances, called the overlapping instances rule. Whenever a type class instance is required at a function call site, PureScript will use the information inferred by the type checker to choose the correct instance. At that time, there should be exactly one appropriate instance for that type. If there are multiple valid instances, the compiler will issue a error.
To demonstrate this, we can try creating two conflicting type class instances for an example type. In the following code, we create two overlapping Show
instances for the type T
:
This module will not compile. The overlapping instances rule will be enforced, resulting in an error:
The overlapping instances rule is enforced so that automatic selection of type class instances is a predictable process. If we allowed two type class instances for a type to exist, then either could be chosen depending on the order of module imports, and that could lead to unpredictable behavior of the program at runtime, which is undesirable.
If it is truly the case that there are two valid type class instances for a type, satisfying the appropriate laws, then a common approach is to define newtypes which wrap the existing type. Since different newtypes are allowed to have different type class instances under the overlapping instances rule, there is no longer an issue. This approach is taken in PureScript's standard libraries, for example in maybe
, where the Maybe a
type has multiple valid instances for the Monoid
type class.
Instance Dependencies
Just as the implementation of functions can depend on type class instances using constrained types, so can the implementation of type class instances depend on other type class instances. This provides a powerful form of program inference, in which the implementation of a program can be inferred using its types.
For example, consider the Show
type class. We can write a type class instance to show
arrays of elements, as long as we have a way to show
the elements themselves:
If a type class instance depends on multiple other instances, those instances should be grouped in parentheses and separated by commas on the left hand side of the =>
symbol:
These two type class instances are provided in the prelude
library.
When the program is compiled, the correct type class instance for Show
is chosen based on the inferred type of the argument to show
. The selected instance might depend on many such instance relationships, but this complexity is not exposed to the developer.
Exercises
(Easy) The following declaration defines a type of non-empty arrays of elements of type
a
:Write an
Eq
instance for the typeNonEmpty a
which reuses the instances forEq a
andEq (Array a)
.(Medium) Write a
Semigroup
instance forNonEmpty a
by reusing theSemigroup
instance forArray
.(Medium) Write a
Functor
instance forNonEmpty
.(Medium) Given any type
a
with an instance ofOrd
, we can add a new "infinite" value which is greater than any other value:Write an
Ord
instance forExtended a
which reuses theOrd
instance fora
.(Difficult) Write a
Foldable
instance forNonEmpty
. Hint: reuse theFoldable
instance for arrays.(Difficult) Given a type constructor
f
which defines an ordered container (and so has aFoldable
instance), we can create a new container type which includes an extra element at the front:The container
OneMore f
also has an ordering, where the new element comes before any element off
. Write aFoldable
instance forOneMore f
:
Multi Parameter Type Classes
It's not the case that a type class can only take a single type as an argument. This is the most common case, but in fact, a type class can be parameterized by zero or more type arguments.
Let's see an example of a type class with two type arguments.
The Stream
module defines a class Stream
which identifies types which look like streams of elements, where elements can be pulled from the front of the stream using the uncons
function.
Note that the Stream
type class is parameterized not only by the type of the stream itself, but also by its elements. This allows us to define type class instances for the same stream type but different element types.
The module defines two type class instances: an instance for arrays, where uncons
removes the head element of the array using pattern matching, and an instance for String, which removes the first character from a String.
We can write functions which work over arbitrary streams. For example, here is a function which accumulates a result in some Monoid
based on the elements of a stream:
Try using foldStream
in PSCi for different types of Stream
and different types of Monoid
.
Functional Dependencies
Multi-parameter type classes can be very useful, but can easily lead to confusing types and even issues with type inference. As a simple example, consider writing a generic tail
function on streams using the Stream
class given above:
This gives a somewhat confusing error message:
The problem is that the genericTail
function does not use the element
type mentioned in the definition of the Stream
type class, so that type is left unsolved.
Worse still, we cannot even use genericTail
by applying it to a specific type of stream:
Here, we might expect the compiler to choose the streamString
instance. After all, a String
is a stream of Char
s, and cannot be a stream of any other type of elements.
The compiler is unable to make that deduction automatically, and cannot commit to the streamString
instance. However, we can help the compiler by adding a hint to the type class definition:
Here, stream -> element
is called a functional dependency. A functional dependency asserts a functional relationship between the type arguments of a multi-parameter type class. This functional dependency tells the compiler that there is a function from stream types to (unique) element types, so if the compiler knows the stream type, then it can commit to the element type.
This hint is enough for the compiler to infer the correct type for our generic tail function above:
Functional dependencies can be quite useful when using multi-parameter type classes to design certain APIs.
Nullary Type Classes
We can even define type classes with zero type arguments! These correspond to compile-time assertions about our functions, allowing us to track global properties of our code in the type system.
An important example is the Partial
class which we saw earlier when discussing partial functions. Take for example the functions head
and tail
defined in Data.Array.Partial
that allow us to get the head or tail of an array without wrapping them in a Maybe
, so they can fail if the array is empty:
Note that there is no instance defined for the Partial
type class! Doing so would defeat its purpose: attempting to use the head
function directly will result in a type error:
Instead, we can republish the Partial
constraint for any functions making use of partial functions:
We've already seen the unsafePartial
function, which allows us to treat a partial function as a regular function (unsafely). This function is defined in the Partial.Unsafe
module:
Note that the Partial
constraint appears inside the parentheses on the left of the function arrow, but not in the outer forall
. That is, unsafePartial
is a function from partial values to regular values.
Superclasses
Just as we can express relationships between type class instances by making an instance dependent on another instance, we can express relationships between type classes themselves using so-called superclasses.
We say that one type class is a superclass of another if every instance of the second class is required to be an instance of the first, and we indicate a superclass relationship in the class definition by using a backwards facing double arrow.
We've already seen some examples of superclass relationships: the Eq
class is a superclass of Ord
, and the Semigroup
class is a superclass of Monoid
. For every type class instance of the Ord
class, there must be a corresponding Eq
instance for the same type. This makes sense, since in many cases, when the compare
function reports that two values are incomparable, we often want to use the Eq
class to determine if they are in fact equal.
In general, it makes sense to define a superclass relationship when the laws for the subclass mention the members of the superclass. For example, it is reasonable to assume, for any pair of Ord
and Eq
instances, that if two values are equal under the Eq
instance, then the compare
function should return EQ
. In other words, a == b
should be true exactly when compare a b
evaluates to EQ
. This relationship on the level of laws justifies the superclass relationship between Eq
and Ord
.
Another reason to define a superclass relationship is in the case where there is a clear "is-a" relationship between the two classes. That is, every member of the subclass is a member of the superclass as well.
Exercises
(Medium) Define a partial function which finds the maximum of a non-empty array of integers. Your function should have type
Partial => Array Int -> Int
. Test out your function in PSCi usingunsafePartial
. Hint: Use themaximum
function fromData.Foldable
.(Medium) The
Action
class is a multi-parameter type class which defines an action of one type on another:An action is a function which describes how monoidal values can be used to modify a value of another type. There are two laws for the
Action
type class:act mempty a = a
act (m1 <> m2) a = act m1 (act m2 a)
That is, the action respects the operations defined by the
Monoid
class.For example, the natural numbers form a monoid under multiplication:
This monoid acts on strings by repeating an input string some number of times. Write an instance which implements this action:
Does this instance satisfy the laws listed above?
(Medium) Write an instance
Action m a => Action m (Array a)
, where the action on arrays is defined by acting on each array element independently.(Difficult) Given the following newtype, write an instance for
Action m (Self m)
, where the monoidm
acts on itself usingappend
:(Difficult) Should the arguments of the multi-parameter type class
Action
be related by some functional dependency? Why or why not?
A Type Class for Hashes
In the last section of this chapter, we will use the lessons from the rest of the chapter to create a library for hashing data structures.
Note that this library is for demonstration purposes only, and is not intended to provide a robust hashing mechanism.
What properties might we expect of a hash function?
A hash function should be deterministic, and map equal values to equal hash codes.
A hash function should distribute its results approximately uniformly over some set of hash codes.
The first property looks a lot like a law for a type class, whereas the second property is more along the lines of an informal contract, and certainly would not be enforceable by PureScript's type system. However, this should provide the intuition for the following type class:
with the associated law that a == b
implies hash a == hash b
.
We'll spend the rest of this section building a library of instances and functions associated with the Hashable
type class.
We will need a way to combine hash codes in a deterministic way:
The combineHashes
function will mix two hash codes and redistribute the result over the interval 0-65535.
Let's write a function which uses the Hashable
constraint to restrict the types of its inputs. One common task which requires a hashing function is to determine if two values hash to the same hash code. The hashEqual
relation provides such a capability:
This function uses the on
function from Data.Function
to define hash-equality in terms of equality of hash codes, and should read like a declarative definition of hash-equality: two values are "hash-equal" if they are equal after each value has been passed through the hash
function.
Let's write some Hashable
instances for some primitive types. Let's start with an instance for integers. Since a HashCode
is really just a wrapped integer, this is simple - we can use the hashCode
helper function:
We can also define a simple instance for Boolean
values using pattern matching:
With an instance for hashing integers, we can create an instance for hashing Char
s by using the toCharCode
function from Data.Char
:
To define an instance for arrays, we can map
the hash
function over the elements of the array (if the element type is also an instance of Hashable
) and then perform a left fold over the resulting hashes using the combineHashes
function:
Notice how we build up instances using the simpler instances we have already written. Let's use our new Array
instance to define an instance for String
s, by turning a String
into an array of Char
s:
How can we prove that these Hashable
instances satisfy the type class law that we stated above? We need to make sure that equal values have equal hash codes. In cases like Int
, Char
, String
and Boolean
, this is simple because there are no values of those types which are equal in the sense of Eq
but not equal identically.
What about some more interesting types? To prove the type class law for the Array
instance, we can use induction on the length of the array. The only array with length zero is []
. Any two non-empty arrays are equal only if they have equals head elements and equal tails, by the definition of Eq
on arrays. By the inductive hypothesis, the tails have equal hashes, and we know that the head elements have equal hashes if the Hashable a
instance must satisfy the law. Therefore, the two arrays have equal hashes, and so the Hashable (Array a)
obeys the type class law as well.
The source code for this chapter includes several other examples of Hashable
instances, such as instances for the Maybe
and Tuple
type.
Exercises
(Easy) Use PSCi to test the hash functions for each of the defined instances.
(Medium) Use the
hashEqual
function to write a function which tests if an array has any duplicate elements, using hash-equality as an approximation to value equality. Remember to check for value equality using==
if a duplicate pair is found. Hint: thenubByEq
function inData.Array
should make this task much simpler.(Medium) Write a
Hashable
instance for the following newtype which satisfies the type class law:The newtype
Hour
and itsEq
instance represent the type of integers modulo 12, so that 1 and 13 are identified as equal, for example. Prove that the type class law holds for your instance.(Difficult) Prove the type class laws for the
Hashable
instances forMaybe
,Either
andTuple
.
Conclusion
In this chapter, we've been introduced to type classes, a type-oriented form of abstraction which enables powerful forms of code reuse. We've seen a collection of standard type classes from the PureScript standard libraries, and defined our own library based on a type class for computing hash codes.
This chapter also gave an introduction to the notion of type class laws, a technique for proving properties about code which uses type classes for abstraction. Type class laws are part of a larger subject called equational reasoning, in which the properties of a programming language and its type system are used to enable logical reasoning about its programs. This is an important idea, and will be a theme which we will return to throughout the rest of the book.
Last updated
Was this helpful?