Wednesday, March 13, 2013

HASKELL ALGEBRAIC DATA TYPES

HASKELL ALGEBRAIC DATA TYPES

REVISED: Friday, October 25, 2024




Haskell Algebraic Data Types.

I.  DEFINING YOUR OWN HASKELL DATA TYPES

A Haskell data type is a "record".

data is the keyword used to start a type definition of a new type in Haskell. 

You cannot declare new data types inside functions.

GHCi will print the result of an I/O action if and only if:

1. The result type is an instance of Show.

2. The result type is not ().

Type and data constructor names must always start with a capital letter; variables, including names of functions, must always start with a lowercase letter. Infix symbol "operator" constructors start with a : colon. Having an additional : at the end of an infix symbol "operator" is optional.

In Haskell, the names of types and values are independent of each other. We only use a type constructor; i.e., the type's name in a type declaration or a type signature. We only use a value constructor in actual code. Because these uses are distinct, there is no ambiguity if we give a type constructor and a value constructor the same name. If we are writing a type signature, we must be referring to a type constructor. If we are writing an expression, we must be using the value constructor.

A. DEFINING A NEW DATA TYPE

data (Type constructor) = (Value constructor or data constructor or function name) (type components or fields or function arguments)

It is very important to distinguish between the type constructor and the value constructor. When declaring a data type, the part before the = is the type constructor and the constructors after it, possibly separated by pipes |'s, are value constructors.

Use record syntax when a constructor has several fields and it's not obvious which field is which. 

B. NAMING TYPES AND VALUES

It is legal and normal for a type constructor and a value constructor to have the same name.

C. TYPE SYNONYMS

To give a type a more descriptive name we can introduce a synonym at any time for an existing type. The type keyword introduces a type synonym. The new name is on the left of the = and the existing name is on the right. Type synonyms make code more readable. A type synonym can be used to create a shorter name for a type that has a long name. We still use the same value constructors to create a value of the type.

D. ALGEBRAIC DATA TYPES

Algebraic data types are types that can be modeled using simple algebra, namely addition and multiplication.

All of the data types that we define with the data keyword are algebraic data types.

Algebraic data types correspond to a struct in C or C++, and its components correspond to the fields of a struct. The fields in the Haskell type are anonymous and positional. Algebraic data types are sometimes referred to as enumeration types.

Adding distinct data type declarations will benefit you in both type safety and readability.

An algebraic data type can have more than one value constructor. Each value constructor is separated in the definition by a pipe | character, which can be read as “or”.

We will create the following data class type Shape which can be a Circle or a Rectangle:

data Shape = Circle Float Float Float | Rectangle Float Float Float Float deriving (Show)

The Circle value constructor has three fields, which take floats. When we write a value constructor, we can add types after it and those types define the values, the value constructor, will contain. Here, the first two fields are the coordinates of the center of the circle, the third field is the radius of the circle. 

The Rectangle value constructor has four fields which accept floats. The first two are the coordinates to the upper left corner of the rectangle and the second two are coordinates to the lower right corner of the rectangle.

Value constructors are functions that return a value of a data type.

Prelude>  :type Circle
Circle :: Float -> Float -> Float -> Shape
Prelude>

Prelude>  :type Rectangle
Rectangle :: Float -> Float -> Float -> Float -> Shape
Prelude>  

E. CLASS INSTANCE DECLARATIONS

To derive an instance, the syntax is:

instance «preconditions» => Class «type» where
  «method» = «definition»

The part before the "=>" is the context, while the part after the "=>" is the head of the instance declaration.

A type may not be declared an instance of a particular class more than once in a program.

A type class is a sort of an interface that defines some behavior. A type can be made an instance of a type class if it supports that behavior. First we make our data type and then we think about what it can act like. If it can act like something that can be equated, we make it an instance of the Eq type class. If it can act like something that can be ordered, we make it an instance of the Ord type class. We can make our types instances of type classes by implementing the functions defined by the type classes.

Haskell separates the definition of a type from the definition of the methods associated with that type.

There is no access control, such as public or private class constituents, built into the Haskell class system. Instead, the module system must be used to hide or reveal components of a type class.

With a type class defined, we proceed to make existing types instances of it.

Type classes are not types, but categories of types; therefore, the instances of a type class are not values, but types.

Type synonyms defined with the type keyword cannot be made instances of a type class.

If we add deriving (Show) at the end of a data declaration, Haskell automatically makes that type part of the Show type class.

Haskell can automatically make our type an instance of any of the following type classes: Eq, Ord, Enum, Bounded, Show, Read. Haskell can derive the behavior of our types in these contexts if we use the deriving keyword when making our data type.

1. Eq 

Equality operators == and /=

2. Ord 

Comparison operators < <= > >=; min, max and compare.

In order to declare a type to be an instance of Ord you must already have declared it an instance of Eq in other words Ord is a subclass of Eq.

3. Enum 

For enumerations only. Allows the use of list syntax such as [Blue .. Green].

4. Bounded 

Also for enumerations, but can also be used on types that have only one constructor. Provides minBound and maxBound, the lowest and highest values that the type can take.

5. Show 

Defines the function show, which converts a value into a string, and other related functions.

6. Read 

Defines the function read, which parses a string into a value of the type, and other related functions.

F. KINDS

Types are labels values carry so we can reason about the values. Types have their own labels, called kinds. A kind is the type of a type. We can examine the kind of a type by using the :k command in GHCi.

Prelude>  :kind Int
Int :: *
Prelude>  

A * means that the type kind is a concrete type. A concrete type kind is a type that does not take any type parameters. Values can only have types that are concrete types. A * is pronounced "star" or "type". 

Prelude>  :kind Maybe
Maybe :: * -> *
Prelude>  

The Maybe type constructor takes one concrete type, like Int, and then returns a concrete type like Maybe Int. And that is what its kind tells us. Just like Int -> Int means that a function takes an Int and returns an Int, * -> * means that the type constructor takes one concrete type kind and returns a concrete type kind.

Prelude>  :kind Maybe Int
Maybe Int :: *
Prelude>  

We use :k on a type to get its kind, just like we can use :t on a value to get its type. Types are the labels of values and kinds are the labels of types and there are parallels between the two.

Prelude>  :kind Either
Either :: * -> * -> *
Prelude>

Either takes two concrete type kinds as type parameters to produce a concrete type kind. It looks like a type declaration of a function that takes two values and returns something. Type constructors are curried, just like functions, so we can partially apply them.

Prelude>  :kind Either String 
Either String :: * -> *
Prelude>

Prelude>  :kind Either String Int
Either String Int :: *
Prelude> 

If we wanted to make Either a part of the Functor typeclass, we would have to partially apply it because Functor wants types that take only one parameter while Either takes two. In other words, Functor wants types of kind * -> * and so we would have to partially apply Either to get a type of kind * -> * instead of its original type of kind * -> * -> *.
 
class Functor f where   
    fmap :: (a -> b) -> f a -> f b


The f type variable is used as a type that takes one concrete type kind to produce a concrete type kind. We know it has to produce a concrete type kind because it is used as the type of a value in a function. And from that, we can deduce that, types that want to be friends with Functor, have to be of type kind * -> *.

A class and type must have the same kind.

II. NEWTYPE

newtype is for making a completely new type out of an existing type. A newtype declaration creates a new type in much the same way as dataExcept for the keyword, the newtype declaration uses the same syntax as a data declaration with a single constructor containing a single field. Both newtype and the single constructor data introduce a single value constructor, but the value constructor introduced by newtype is strict and the value constructor introduced by data is lazy.

Types declared with the data keyword are lifted. Haskell provides the newtype keyword, for the construction of unlifted types.

In Haskell, the newtype declaration creates a new type from an existing one. A type created by newtype differs from an algebraic data type in that the representation of an algebraic data type has an extra level of indirection. This difference may make access to the representation less efficient. The difference is reflected in different rules for pattern matching.

When you make a new type from an existing type by using the newtype keyword, you can only have one value constructor and that value constructor can only have one field. But with data, you can make data types that have several value constructors and each constructor can have zero or more fields. Therefore, newtype normally processes faster.

III. CONCLUSION

In this tutorial, you have received an introduction to Haskell algebraic data types.

IV. REFERENCES

Bird, R. (2015). Thinking Functionally with Haskell. Cambridge, England: Cambridge University Press.

Davie, A. (1992). Introduction to Functional Programming Systems Using Haskell. Cambridge, England: Cambridge University Press.

Goerzen, J. & O'Sullivan, B. &  Stewart, D. (2008). Real World Haskell. Sebastopol, CA: O'Reilly Media, Inc.

Hutton, G. (2007). Programming in Haskell. New York: Cambridge University Press.

Lipovača, M. (2011). Learn You a Haskell for Great Good!: A Beginner's Guide. San Francisco, CA: No Starch Press, Inc.

Thompson, S. (2011). The Craft of Functional Programming. Edinburgh Gate, Harlow, England: Pearson Education Limited.