I wrote up the highlights of ICFP 2022 for the Tarides blog. It was great to get back to in-person conferences and to have the chance to meet people. Thanks to my employer Tarides for covering the cost.
For me personally the OCaml Workshop was fantastic from beginning to end; read the blog post for the full details. Outside of OCaml I spent time in the Haskell Implementors Workshop, hearing about the new features for GHC and getting excited by the progress that Cabal is making.
My take-away research topics are:
I am revisiting my OCaml setup post from 2021 because I needed to set up a new macOS machine. The official OCaml site points newcomers to Visual Studio Code, which is a fine choice to get started. However, I use Emacs and have done so for over 20 years, and I did not find a good description of how to set things up with it. I could digress here into why Emacs, but I will just strongly encourage all developers to invest heavily in learning their editor, with Emacs being a fine choice.
On macOS I use the pre-compiled GUI version of Emacs from emacsformacosx, preferring that over compiling it by hand or using the version in Homebrew, both of which I have done previously. The emacsformacosx version saves me time and effort, plus the GUI version was removed from Homebrew at some point in the past.
Next, I choose to use an Emacs distro over the base Emacs setup. Again, this is a time-saving choice, and it is especially useful if you are new to Emacs. Use Prelude, an enhanced Emacs 25.1+ distribution that should make your experience with Emacs both more pleasant and more powerful. It gives a great modern setup for Emacs with minimal fuss. Once that is cloned and installed, the Lisp config begins.
Prelude provides a base experience of packages available with some configuration. The configuration goes into ~/.emacs.d/tsmc/prelude-modules.el, where tsmc is your macOS user; the same path applies on Linux. A sample prelude-modules.el is provided at https://github.com/bbatsov/prelude/blob/master/sample/prelude-modules.el
I chose the following modules to enable, with prelude-lsp and prelude-ocaml being the core OCaml-related choices. The other bits are optional but useful for editing Lisp or navigating code.
(require 'prelude-ivy) ;; A mighty modern alternative to ido
(require 'prelude-company)
(require 'prelude-emacs-lisp)
(require 'prelude-lisp) ;; Common setup for Lisp-like languages
(require 'prelude-lsp) ;; Base setup for the Language Server Protocol
(require 'prelude-ocaml)
Now for the customisation to get LSP working properly. There are three main pieces:
direnv is a small program to load/unload environment variables based on $PWD (the current working directory). It ensures that when you open an OCaml file, the correct opam switch is chosen and the tools installed in that switch are made available to Emacs. Opam is the OCaml package manager; it manages local sandboxes of packages called switches. Without direnv, Emacs will not find the correct tools and you would need to mess with Emacs' PATH to get it right. I have done that, and it is much simpler with direnv.
So brew install direnv, and create a .envrc file in an OCaml project with eval $(opam env --set-switch) inside. Compared to my previous post, I have been using local opam switches, which exist inside an OCaml project. They are created with opam switch create . 4.14.0 --with-test --deps-only -y and appear as an _opam directory in the project root. Next run direnv allow to tell direnv it is safe to use the .envrc file in this directory. The reason I have switched is that I often need to test different OCaml versions, so removing the _opam directory and recreating it is the simpler option.
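Putting those pieces together, a typical session for starting a new project looks something like this (a sketch; the directory name and OCaml version are examples):

```shell
$ cd my-project                                           # hypothetical project directory
$ opam switch create . 4.14.0 --with-test --deps-only -y  # local switch in _opam/
$ echo 'eval $(opam env --set-switch)' > .envrc
$ direnv allow
```

After this, direnv sets up the switch environment every time you enter the directory.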
The OCaml LSP server needs to be installed in the current switch, so run opam update && opam install ocaml-lsp-server -y; this will make ocaml-lsp-server available to Emacs via direnv.
There is an opportunity here to use Emacs Lisp to install ocaml-lsp-server if it is missing, or to allow lsp-mode to download and install it itself. I would like to have this working in the future. Now, back into Lisp.
Create a file init.el in ~/.emacs.d/tsmc/, substituting your Unix user name for tsmc. Thanks to emacs-prelude, the configuration is very small.
;;; init.el --- @tsmc configuration entry point.
(prelude-require-packages '(use-package direnv))
;; Use direnv to select the correct opam switch and set the path
;; that Emacs will use to run commands like ocamllsp, merlin or dune build.
(use-package lsp-mode
  :hook
  (tuareg-mode . lsp))
;; Attach lsp hook to modes that require it, here we bind to tuareg-mode rather than
;; prelude-ocaml. For unknown reasons the latter does not bind properly and does not
;; start lsp-mode
(provide 'tsmc)
;;; init.el ends here
We require a few packages, use-package and direnv, and then tell Emacs to start lsp-mode when tuareg-mode is started. Tuareg is one of the OCaml modes available for Emacs, the other being caml-mode, which I have not really used. Now quit and restart Emacs. Opening an ml file inside the project you started earlier should start ocaml-lsp.
The types for expressions and modules will display on mouse hover or beside the definition. Hovering the mouse over a function or type will display the type plus its documentation comments. A successful dune build for the project is required to generate the data used by ocaml-lsp-server. At this point in time, prelude relies on merlin, an assistant for editing OCaml code that is used by ocaml-lsp-server internally but is also available as a standalone tool. So I often have both installed; opam install merlin should be enough to get it installed too.
At this point I am mostly happy: the types and documentation display as required. Navigating using M-. shows a preview of the type or function under point, and return takes me to the definition. This is vastly improved in OCaml 4.14 (with the work on Shapes), which I have switched to for everything I can. Switching between ml and mli files is C-c C-a, and there is more; simply run M-x describe-mode to see everything available.
The annoyances are more fundamental to how LSP wants to work. It uses what I am calling a push-based interaction, where it generates the information for types and documentation in the background and pushes it into the Emacs buffer. You never need to ask what the type is; it displays for you. Sometimes I want to ask what a type is inside an expression; with LSP you are encouraged to mouse over something rather than having a key binding for it. So far I haven't found the Lisp function that drives the hover functionality, but when I do, I will bind it to a key. The second issue is also around mouse usage to drive LSP functionality like rename or annotating types; I would strongly prefer a key-chord-driven approach. Again, I will set this up once I find the right lsp functions. For now I use C-c C-t from merlin to summon the types for things.
Overall the experience is solid. Types and docs appear as required, navigation works, and the speed has been good so far. LSP mode is less janky than it was a year ago.
There is a fine alternative LSP mode for Emacs, Eglot. It takes a more minimal approach and uses a pull-based interaction, where you ask for information via key bindings rather than having it pushed at you via UI elements. For example, the type of a function is requested rather than shown by default.
The corresponding configuration I was using previously is:
(use-package eglot
  :config
  (define-key eglot-mode-map
    (kbd "C-c C-t") #'eldoc-print-current-symbol-info)
  :hook
  ((tuareg-mode . eglot-ensure)))
Again we use use-package to configure the mode; the hook triggers Eglot to load when tuareg-mode does, via the eglot-ensure function, which starts an Eglot session for the current buffer if there isn't one. No further configuration is needed in Emacs, as Eglot knows the LSP server is called ocamllsp and will look for it on the Unix PATH.
Getting started with OCaml using Emacs can be a struggle. Emacs is a fine editor, but the documentation can be difficult to navigate. Hopefully following through this setup will yield a working Emacs / LSP setup for OCaml.
In future I want to try binding more things to keys so I use the mouse less, and to streamline the installation of the OCaml LSP server. After that, adding support for more interesting code interactions, like extracting modules or hoisting let bindings, would be nice to have. Happy hacking!
OCaml is an awesome language with many fine features. I enjoy using it immensely!
Unfortunately, it suffers from a perceived weakness in how to get started. Like any new skill, there can be a learning curve. The tools are all there, but combining them for a good developer experience might seem difficult at first.
Often I’ve found that the barrier to getting into a new language is less about the new features of that language and more about learning the tools to become productive in it. The package managers, build tools, and editor integration of a new language can be confusing, making for an awful experience.
Perhaps my opinionated guide to getting started with OCaml in 2021 will help reduce any mental blocks against trying out this excellent language.
First it’s necessary to install OCaml and Opam. Opam is the default package manager for OCaml projects. Ignore the other options for now; once you know more about what you want, you can make an informed choice. For now, if you speak Opam, you’ll get the most out of the community.
On Linux, use your local package manager, e.g., apt-get install opam for Debian and apt install opam for Ubuntu. For macOS, use Homebrew: brew install opam. I’ll assume that if you run something else, you can handle looking up how to install things.
On my Mac I get Opam 2.1.0:
$ opam --version
2.1.0
Once you’ve got Opam installed, you should be able to move on to the next step.
I strongly recommend that you pick a single OCaml version that your project will compile against. Supporting multiple compiler versions is possible and usually not too difficult, but it complicates the process right now.
Running opam switch list-available will show you a long list of every possible OCaml compiler. Choose the latest mainline compiler, identified by Official release X.XX.X, where currently the latest is 4.13.0. Ignore the others.
opam switch list-available
...
ocaml-variants 4.12.0+domains OCaml 4.12.0, with support for multicore domains
ocaml-variants 4.12.0+domains+effects OCaml 4.12.0, with support for multicore domains and effects
ocaml-variants 4.12.0+options Official release of OCaml 4.12.0
ocaml-base-compiler 4.12.1 Official release 4.12.1
ocaml-variants 4.12.1+options Official release of OCaml 4.12.1
ocaml-variants 4.12.2+trunk Latest 4.12 development
ocaml-base-compiler 4.13.0~alpha1 First alpha release of OCaml 4.13.0
ocaml-variants 4.13.0~alpha1+options First alpha release of OCaml 4.13.0
ocaml-base-compiler 4.13.0~alpha2 Second alpha release of OCaml 4.13.0
ocaml-variants 4.13.0~alpha2+options Second alpha release of OCaml 4.13.0
ocaml-base-compiler 4.13.0~beta1 First beta release of OCaml 4.13.0
ocaml-variants 4.13.0~beta1+options First beta release of OCaml 4.13.0
ocaml-base-compiler 4.13.0~rc1 First release candidate of OCaml 4.13.0
ocaml-variants 4.13.0~rc1+options First release candidate of OCaml 4.13.0
ocaml-base-compiler 4.13.0~rc2 Second release candidate of OCaml 4.13.0
ocaml-variants 4.13.0~rc2+options Second release candidate of OCaml 4.13.0
ocaml-base-compiler 4.13.0 Official release 4.13.0
ocaml-variants 4.13.0+options Official release of OCaml 4.13.0
ocaml-variants 4.13.1+trunk Latest 4.13 development
ocaml-variants 4.14.0+trunk Current trunk
...
At this point, install the latest OCaml 4.13.0:
$ opam switch create 4.13.0
<><> Installing new switch packages <><><><><><><><><><><><><><><><><><><><> 🐫
Switch invariant: ["ocaml-base-compiler" {= "4.13.0"} | "ocaml-system" {= "4.13.0"}]
<><> Processing actions <><><><><><><><><><><><><><><><><><><><><><><><><><> 🐫
∗ installed base-bigarray.base
∗ installed base-threads.base
∗ installed base-unix.base
∗ installed ocaml-options-vanilla.1
⬇ retrieved ocaml-base-compiler.4.13.0 (https://opam.ocaml.org/cache)
∗ installed ocaml-base-compiler.4.13.0
∗ installed ocaml-config.2
∗ installed ocaml.4.13.0
Done.
You can start using this version by typing the following:
$ opam switch set 4.13.0
And verify which switch you are using:
$ opam switch show
4.13.0
When you work with several OCaml projects, it’s best to create a switch per project, as it keeps each project isolated and prevents issues with installing conflicting versions of libraries. For example, I use a naming scheme of ocaml-version-project-name, e.g., 4.13.0-ocurrent. Then in each project directory, run opam switch link 4.13.0-ocurrent to set up that named switch for that specific directory. Opam will take care of setting that switch in your shell when you change into that directory.
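As a sketch (the switch name and project directory are examples), the per-project flow is: create the named switch once, then link it from inside the project:

```shell
$ opam switch create 4.13.0-ocurrent 4.13.0   # named switch built from the 4.13.0 compiler
$ cd ~/src/ocurrent                           # hypothetical project directory
$ opam switch link 4.13.0-ocurrent
```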
For this step we need the Dune build tool, so go ahead and install it with opam install dune. Dune comes with a simple scaffolding command to create an empty project, which is really useful to get started. I’m calling my project box, so run:
$ dune init proj box
Success: initialized project component named box
In the project generated, we get a library component, a CLI, and a test component, which will all compile out of the box.
$ cd box
$ tree
.
├── bin
│ ├── dune
│ └── main.ml
├── box.opam
├── lib
│ └── dune
└── test
├── box.ml
└── dune
3 directories, 6 files
Let’s try a compile:
$ dune build @all
Info: Creating file dune-project with this contents:
| (lang dune 2.8)
| (name box)
Running the CLI:
$ dune exec bin/main.exe
Hello, World!
Each of the bin, lib, and test directories contains source code in the form of *.ml files, along with a dune file, which tells Dune how to build the source and on what libraries it depends. The bin/dune file declares an executable with the public name box, whose entry module is main and which depends on the box library.
(executable
 (public_name box)
 (name main)
 (libraries box))
CLI tools require command-line parsing, and Cmdliner is a common library that implements it. We need to add it in two places: first in the dune-project file, to get it installed, and then in bin/dune, to say where we’re using it.
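The bin/dune side is a one-line addition to the libraries field; a sketch, based on the generated file shown earlier:

```lisp
(executable
 (public_name box)
 (name main)
 (libraries box cmdliner))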
One small digression: when generating our project, dune created a box.opam file. This describes our project to Opam, telling it what libraries it requires and what the project does. You need this if you ever publish a package for other people to use. Newer versions of Dune can generate the box.opam file from the dune-project file. Having a single source of information is helpful, so let’s create that file:
(lang dune 2.8)
(name box)
(generate_opam_files true)

(package
 (name box)
 (depends
  (ocaml (>= 4.13.0))
  (cmdliner (>= 0.9.8)))
 (synopsis "Box cli"))
Remove the box.opam file (rm box.opam) to test the generation, then run dune build @all to regenerate the Opam file. This file should be checked in, and any further edits should be made in the top-level dune-project file. The generated file should look like this:
$ cat box.opam
# This file is generated by dune, edit dune-project instead
opam-version: "2.0"
synopsis: "Box cli"
depends: [
  "dune" {>= "2.8"}
  "ocaml" {>= "4.13.0"}
  "cmdliner" {>= "0.9.8"}
  "odoc" {with-doc}
]
build: [
  ["dune" "subst"] {dev}
  [
    "dune"
    "build"
    "-p"
    name
    "-j"
    jobs
    "@install"
    "@runtest" {with-test}
    "@doc" {with-doc}
  ]
]
The final step is to actually install the cmdliner library. Run opam install . --deps-only -ty, which will look at the *.opam files present and install just their dependencies with the correct version bounds.
The -y answers yes to installing the packages; drop it if you want to review what will be installed and confirm by pressing Y yourself.
The -t flag runs the package tests, which isn’t always necessary but is sometimes useful for certain packages with native C components.
Alternatively, you could run opam install cmdliner, but as this doesn’t look at the version constraints in the *.opam files, you might not get what you expect.
Finally, you’ll want to get comfy with your chosen editor. If you have a preference, you should use the native LSP support in that editor and install the server with opam install ocaml-lsp-server. OCaml is standardising on the LSP protocol for editor interaction. If you have no editor preference, then start with VSCode and install the OCaml LSP package from the Marketplace.
Personally, I’m using Emacs with the LSP mode eglot
, which works really nicely, along with some customisations to bind certain LSP actions to keys. I highly recommend getting into Emacs as an editor because the customisation via a fully-featured language, like Lisp, is fantastic if you live in your editor like I do.
This post is an update to an earlier post by Adam in 2017, and I hope this short tutorial helps get you started with OCaml!
I wanted to port my blog across from an old Jekyll setup to Hakyll. The Jekyll setup was out of date, and keeping the required Ruby tools installed when I swapped machines was a huge pain; I don’t write Ruby much anymore.
Considering my options, I looked at Hugo and Hakyll, discarding Hugo because I don’t want to keep up with the JS churn, even though it has lots of great resources and themes available. So Hakyll seemed like the best option: I already regularly write Haskell, so the tools will be up to date, and I can make it do everything I want by digging into the source code.
My requirements are:
First things first! I like the following layout when setting up a basic Haskell project:
$ tree -L 1
.
├── CNAME
├── LICENSE
├── README.md
├── css
├── drafts
├── images
├── index.html
├── lambdafoo.cabal
├── main
├── pages
├── posts
├── talks
└── templates
Initially I used cabal init --cabal-version=2.4 --license=BSD3 -p lambdafoo.com to get a skeleton project with a reasonable cabal file. Then I moved things around, making main/site.hs the entry point for running Hakyll and adding a TODO list of features into the README.md:
* ~~basic pages~~
* ~~about~~
* ~~talks~~
* ~~archive~~
* ~~individual post with code highlighting~~
* ~~rss/atom feed~~
* ~~add rss/atom feed to archive page~~
* ~~github action build and deploy~~
* ~~html url redirects to new url structure~~
* ~~serve js talks/slides directly from Hakyll~~
* configure dependabot for Haskell
* ~~add generated sitemap.xml~~
* ~~integrate Google Analytics~~
These directories are used for Hakyll content:
The trickiest part was getting a version of the cabal file that worked with GHC 8.10 and a recent version of Hakyll. I ended up pinning Hakyll as hakyll ^>= 4.13 and leaving the other dependencies floating.
executable site
  main-is: site.hs
  hs-source-dirs: main
  default-language: Haskell2010
  build-depends:
      base >= 4.6 && < 5
    , binary >= 0.5
    , directory >= 1.2
    , filepath >= 1.3
    , hakyll ^>= 4.13
    , blaze-html
    , lens
    , time
    , aeson
    , lens-aeson
    , containers
    , pandoc
    , process >= 1.6
    , text >= 1.2
At this point, I could have either continued setting up Hakyll or set up CI. I usually prefer setting up CI as early as possible in a project, so I started there. Here is what that looks like:
There are a few options for cloud CI, and my requirements were simple: no cost, easy setup, and integration with GitHub Pages, where I host my site. It was a toss-up between CircleCI and GitHub Actions; I’ve had good experience with CircleCI, but I decided to try GitHub Actions.
First, create a directory, mkdir -p .github/workflows/, with a ci.yml file:
name: CI

on:
  push:
    branches:
      - master
  pull_request:
    types:
      - opened
      - synchronize

jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        cabal: ["3.4.0.0"]
        ghc: ["8.10.7"]
The matrix section sets up a build for GHC 8.10.7 and cabal 3.4.0.0, which is enough for a simple blog, but it is where you’d add extra options for, say, a library. Next, we use some community GitHub Actions to checkout the repo and set up Haskell.
steps:
  - uses: actions/checkout@v2
  - uses: haskell/actions/setup@v1
    id: setup-haskell-cabal
    with:
      ghc-version: ${{ matrix.ghc }}
      cabal-version: ${{ matrix.cabal }}
Next we run cabal update to refresh our Hackage index and then set up some build caching for our dependencies. You can copy this directly and it should work:
- name: Cabal Update
  run: |
    cabal v2-update
    cabal v2-freeze $CONFIG
- uses: actions/cache@v2.1.4
  with:
    path: |
      ${{ steps.setup-haskell-cabal.outputs.cabal-store }}
      dist-newstyle
    key: ${{ runner.os }}-${{ matrix.ghc }}-${{ hashFiles('cabal.project.freeze') }}
    restore-keys: |
      ${{ runner.os }}-${{ matrix.ghc }}-
Then we run the cabal build and Hakyll site build.
- name: Build Site
  run: |
    cabal v2-build $CONFIG
    cabal exec site build
Adding that to your repo’s main branch should yield a working CI. On top of that, I added a dependabot configuration to check that my GitHub Actions config stays up to date.
Add a file dependabot.yml to .github:
version: 2
updates:
  - package-ecosystem: "github-actions"
    directory: "/"
    schedule:
      interval: "daily"
    commit-message:
      prefix: "GA"
      include: "scope"
    labels:
      - "CI"
This will check that your GitHub Actions use the latest versions and open a PR to bump them if they don’t. Something like this for Haskell would be super sweet.
Let’s quickly walk through the contents of main/site.hs; there are more in-depth tutorials on the main Hakyll site.
{-# LANGUAGE OverloadedStrings #-}
import Hakyll
main :: IO ()
main = hakyll $ do
Here we import Hakyll, set up overloaded strings, and create a main function:
match "images/*" $ do
  route idRoute
  compile copyFileCompiler

match "css/*" $ do
  route idRoute
  compile compressCssCompiler
These serve stylesheets and images from the css and images directories, respectively. This is standard code that can be copied directly; it simply copies the files into _site, the final static site directory.
Next I wanted to serve some old talk slides, written in HTML and JavaScript, directly from my site. I couldn’t find any posts about how to do this, but after thinking about it, I realised that I just wanted to serve static assets again, like the css and images above. So that’s exactly what was required!
Of course, I lie: I had to fix a few hard-coded paths in the HTML, but otherwise it worked.
The layout for talks looks like:
talks
├── erl-syd-2012-webmachine
├── fp-syd-freer-2016
├── fp-syd-higher-2015
├── lambda-jam-2014-raft
├── lambda-jam-2015-ocaml-functors
├── lambda-jam-2016-performance
├── roro-2012-riak
└── scala-syd-2015-modules
So I needed an extra wildcard in my match statement:
match "talks/**/*" $ do
  route idRoute
  compile copyFileCompiler
This content then gets served under lambdafoo.com/talks/scala-syd-2015-modules/.
In retrospect, this is an obvious solution to serving any static content generated outside of Hakyll,
but it did take me a while to realise it.
Next we load the individual blog posts:
match "posts/*" $ do
  route $ setExtension "html"
  compile $
    pandocCompiler
      >>= loadAndApplyTemplate "templates/post.html" postCtx
      -- Used by the RSS/Atom feed
      >>= saveSnapshot "content"
      >>= loadAndApplyTemplate "templates/default.html" postCtx
      >>= relativizeUrls
After getting a few simple things out of the way, my Markdown-based workflow already worked with Hakyll, so there’s nothing really to see there. Creating a Markdown file with the following YAML front matter and content is enough to get a simple post working.
---
title: Hakyll Blog setup
author: Tim McGilchrist
date: 2021-02-01 00:00
tags: haskell
description: How I setup my blog with Hakyll
---
Content of post
I have a domain, lambdafoo.com, that I use to serve my blog. GitHub Pages has up-to-date information on how to set this up with your DNS provider.
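For GitHub Pages custom domains, the CNAME file at the repository root (visible in the tree listing earlier) contains just the bare domain:

```text
lambdafoo.com
```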
Here is where choosing GitHub Actions really pays off: there is a community action to do it all! Assuming you’ve turned on GitHub Pages in the settings for your repo, add this to the end of ci.yml:
- name: Deploy 🚀
  uses: JamesIves/github-pages-deploy-action@4.1.5
  if: github.ref == 'refs/heads/master'
  with:
    token: ${{ secrets.GITHUB_TOKEN }}
    branch: gh-pages # The branch the action should deploy to.
    folder: _site # The folder the action should deploy.
    clean: true # Automatically remove deleted files from the deploy branch
This deploys the output of the Build Site step from the _site folder to the gh-pages branch on all master builds (controlled via if: github.ref == 'refs/heads/master').
On the first build, there is a bit of lag before the deploy. I had issues with my DNS setup and two personal repositories using the same CNAME values; apart from that, the process was smooth, and I quickly had a new version working. Again, if you set up dependabot, it will check that this action stays up to date.
I wanted to share a simple configuration for running OCaml projects in CircleCI. CircleCI is what I’m using at work, plus it supports a killer feature: you can re-run a failing build and get an SSH session into the machine. This one feature has saved me loads of time when debugging CI configuration and flaky tests. Most of the other features are similar to other cloud CI solutions; the documentation is solid, and setting up more advanced workflows is easy enough.
Our requirements are simple: build OCaml projects that use opam and have simple test requirements (just running unit tests).
First we add a file, .circleci/config.yml, with:
version: 2
jobs:
  build-4.10:
    docker:
      - image: ocaml/opam:ubuntu-18.04-ocaml-4.10
    steps:
      - checkout
      - run:
          name: Build
          command: ./bin/ci
workflows:
  version: 2
  build:
    jobs:
      - build-4.10
This creates a job, build-4.10, using the Docker image ocaml/opam:ubuntu-18.04-ocaml-4.10 published by the OCaml team. The steps section defines the commands to run: we use the built-in checkout command provided by CircleCI and then a run command that executes a shell script, ./bin/ci.
You could use your own Docker container in place of that image, maybe pre-installing some things or using a different Linux distro. The command could also be inlined rather than being its own file. I chose to make it a file for two reasons: when you SSH in to debug, you can just re-run ./bin/ci, and you can re-use the steps between local runs and CI.
Now to the shell script:
#!/bin/sh -eux
WORKING_DIR=$(pwd)

# Install some extras
sudo apt-get install m4 pkg-config -y

# Make sure opam is set up in your environment.
eval `opam config env`
opam update

# Install each package as a dev dependency
find . -type f -name '*.opam' | sort -d | while read P; do
  opam pin add -n "$(basename -s .opam ${P})" . -y --dev
  opam install --deps-only "$(basename -s .opam ${P})" -y
  eval `opam config env`
done

# Run the build and tests
dune build
dune runtest
This configuration is from a project with multiple opam files, so we use find to locate them all. One gotcha is that this sorts the file names, which may not match the dependency order; if that is the case, you will need to list them explicitly.
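To see the gotcha concretely, here is a small self-contained sketch (the package names app and zcore are hypothetical): file-name order puts app first even if it depends on zcore.

```shell
# Hypothetical project layout: app.opam depends on zcore.opam.
cd "$(mktemp -d)"
touch app.opam zcore.opam

# sort -d orders purely by file name, so app.opam is listed first,
# even though its dependency zcore would need to be pinned first.
find . -type f -name '*.opam' | sort -d
```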
If you have a single opam file, then replace the loop with the following (substituting your project name for project-name):
opam pin add -n "project-name" . -y --dev
opam install --deps-only "project-name" -y
Push that to your GitHub main branch, then Set Up Project in the CircleCI UI, and you should be off and building. From here, the CircleCI docs can help with setting up different builds based off branches. Adding other OCaml builds is as easy as duplicating the build-4.10 section in the YAML, pointing it at another Docker container, such as one for 4.08, and adding the new build name under jobs in workflows.
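As a sketch, a duplicated job for 4.08 might look like this (assuming the OCaml team publishes a matching ubuntu-18.04-ocaml-4.08 image tag):

```yaml
  build-4.08:
    docker:
      - image: ocaml/opam:ubuntu-18.04-ocaml-4.08
    steps:
      - checkout
      - run:
          name: Build
          command: ./bin/ci

# ...and the new job name added to the workflow:
workflows:
  version: 2
  build:
    jobs:
      - build-4.10
      - build-4.08
```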
There’s a working setup in my ocaml-bitbucket project. Good luck!
In choosing Haskell as a language, you sign up for a certain class of features and behaviours, e.g. lazy evaluation and static typing.
This gives you a general point in the design space for general-purpose languages, but like all languages, you are still left with a number of choices in building software. These choices are broad, diverse, and hotly debated; sometimes they get labelled Best Practices or the Right Way. Like any good engineer, you should recognise that everything involves trade-offs and that these labels try to hide that. There is not always one best way; an approach has positives and negatives. Knowing those trade-offs and deliberately choosing an approach based on them is good engineering.
In programming language communities there are always bikeshedding arguments, and Haskell is no different. I want to call out a particular point of view around using exceptions vs data types in Haskell when dealing with errors. Both are valid points in a wider error-handling design space. The exception path is widely associated with Snoyman, who has written much software and written extensively about this in Exceptions Best Practices in Haskell and in the safe-exceptions package.
I’d like to highlight the negatives, as I see them, of that approach and suggest a different set of trade-offs around modelling errors as data types using EitherT/ExceptT.
EitherT is a monad transformer built on the familiar Either data type.
data Either a b
  = Left a
  | Right b
where typically, in this context, Left represents some failure case and Right represents success.
Another formulation, from the OCaml community, is:
type ('a, 'b) result =
  | Ok of 'a
  | Error of 'b
which is more explicit about what the two constructors represent.
Back in the beginning, actually in 2000/01, asynchronous exceptions were added to Haskell. [2] Quoting Simon Marlow:
Basically it comes down to this: if we want to be able to interrupt purely functional code, asynchronous exceptions are the only way, because polling would be a side-effect.
So Haskell has async exceptions whether you like them or not, the ship has sailed.
This means that any code in IO can throw a runtime exception; further, any thread can receive an async exception.
So, how should we best deal with this reality and structure our code?
We have exceptions; let’s use them.
To start doing that you need to define your own custom exception type.
data VapourError
  = InsufficientFunds
  | ItemUnavailable Text
  | MachineMalfunction Text
  deriving Typeable

-- Write a reasonable Show instance for each error
-- (unpack, from Data.Text, converts the Text fields to String)
instance Show VapourError where
  show a = case a of
    InsufficientFunds    -> "Insufficient funds."
    ItemUnavailable i    -> "Item " ++ unpack i ++ " unavailable."
    MachineMalfunction e -> "Hardware malfunction " ++ unpack e ++ "."

instance Exception VapourError
The three steps we need are: define the error data type (deriving Typeable), write a reasonable Show instance, and declare an instance of Exception.
At this point you can use throw, catch, and handle with your custom error type.
runVendingMachine :: VendingMachineState -> Coin -> IO Product
runVendingMachine state coin = do
  unless (coin > 0) $ throw InsufficientFunds
  dispenseItem state coin

dispenseItem :: VendingMachineState -> Coin -> IO Product
dispenseItem = ....
Looking at the signature of runVendingMachine, you can see that it returns a Product by running a computation in IO. The problem when looking at that code is that the signature doesn’t give you any indication that it might fail, beyond being in IO, which we saw earlier can fail with anything.
So, as a consumer of this function, how are you to know what exceptions to catch? Your options are:
The first option is dangerous, as catching all exceptions includes asynchronous exceptions like stack/heap overflow, thread killed, and user interrupt. The documentation in safe-exceptions is particularly helpful here, and I recommend you read it thoroughly; it is well written. The short version is that you should only catch certain exceptions: trying to handle StackOverflow, HeapOverflow, and ThreadKilled could cause your program to crash or behave in unexpected ways.
The second option is error prone. The process for finding the possible exceptions involves reading the source code
and reading the haddock docs, with the goal of finding the set of sensible exceptions you need to put into
a catch
or handle
call. Have you found all the places an exception might be thrown? What about if you
pull in a new dependency, does it throw exceptions? What about a sub-dependency of a dependency?
What about the functions runVendingMachine
calls? And their functions? To me it feels like going
back to JavaScript or Ruby land and giving up on some of the benefits of a typed language. I want the types to
help me find the places I need to consider the errors, just like pattern matching does for data types.
The other, perhaps less obvious, issue is that you force the consumers of your function to know all the gory details of exceptions in Haskell: which ones are safe to catch and what to do with them. Getting this right is hard and tricky, and really belongs in a library so that it can be written once and reused.
Finally, the behaviour of a Haskell system in production is such that throwing an exception would yield you exactly
what the Show instance for VapourError
produces. It wouldn’t give you a classic stack trace (unless you set that up),
so you lose the context of where the exception was raised and what was happening around it. At a previous workplace we
spent many weeks tracking down SSL and connection-reset exceptions that occurred in a base library but bubbled
out through multiple layers of application code. It wasn’t fun.
This style is perfect for a quick script to munge some data, or an ICFP programming contest.
If you really need exceptions, use the bracket
pattern or a safe-exceptions
style library. Keep the
complexity contained; this code needs to be written very carefully.
We mentioned data types earlier, using data types to model your computation is the natural approach in Haskell.
You build a data type that accurately reflects the data or states that you want to model. We even
did it for the custom VapourError
type earlier.
Extending that, we will use a particular data type, EitherT
, to model errors. This is a monad transformer
wrapping an Either
, where the underlying monad could be anything.
In context it would look something like:
crankHandle :: Int -> EitherT VapourError IO Product
-- or
crankHandle :: Monad m => Int -> EitherT VapourError m Product
-- or
crankHandle :: MonadIO m => Int -> EitherT VapourError m Product
The type of our error is present in the type of our function, a familiar situation.
If the monad m
isn’t IO then we have a good degree of confidence that
none of the base exceptions
will be present.
-- Build a data type that represents the possible error states
data VapourError =
    InsufficientFunds
  | ItemUnavailable Text
  | MachineMalfunction Text

-- Provide a function for turning errors into text
renderVapourError :: VapourError -> Text
renderVapourError = ...

-- Usage site
runVendingMachine :: VendingMachineState -> Coin -> EitherT VapourError IO Product
runVendingMachine = ...
Examples of substantial pieces of code using EitherT
to organise errors.
Basically the compiler helps you handle the various states required using the type system.
Example of code using Exceptions
to organise errors
The main downsides, as I see them, to exception-oriented code are:
Here the compiler is less helpful in guiding you, giving little or no help with handling particular exceptions or giving compile errors for new exceptions that you might need to consider.
The supporting libraries for this pattern of error handling are:
type EitherT = ExceptT
plus additional operators. There is nothing revolutionary about transformers-either; you could roll your own version easily or use the ExceptT transformer provided in the transformers package (adding any helper functions you need). The value comes in a structured, conscious handling of errors, using the Haskell compiler to help.
The primary value of avoiding exceptions is that it makes error behavior explicit in the type of the function. If you’re in an environment where everything might fail, being explicit about it is probably a negative. But if most of your function calls are total, then knowing which ones might fail highlights places where you should consider what the correct behavior is in the case of that failure. Remember that the failure of an individual step in your program doesn’t generally mean the overall failure of your code.
It’s a little bit like null-handling in languages without options. If everything might be null, well, option types probably don’t help you. But if most of the values you encounter in your program are guaranteed to be there, then tracking which ones might be null by tagging them as options is enormously helpful, since it draws your attention to the cases where it might not be there, and so you get an opportunity to think about what the difference really is.
- Yaron Minsky
One thing that always comes up with your favourite language is how to use libraries written in another language. Typically this involves needing to talk to a particular C library, either because it’s faster than a native one or just because it is already written.
For OCaml there is the ctypes library for binding to C libraries using pure OCaml, written by the good people at OCaml Labs http://ocaml.io
The core of ctypes is a set of combinators for describing the structure of C types – numeric types, arrays, pointers, structs, unions and functions. You can use these combinators to describe the types of the functions that you want to call, then bind directly to those functions – all without writing or generating any C!
Let’s go through a simple example binding to libyaml. Here’s a declaration from libyaml to get the version string.
/**
* Get the library version as a string.
*
* @returns The function returns the pointer to a static string of the form
* @c "X.Y.Z", where @c X is the major version number, @c Y is a minor version
* number, and @c Z is the patch version number.
*/
YAML_DECLARE(const char *)
yaml_get_version_string(void);
To bind to this we need to declare a compatible signature for our OCaml code.
open Ctypes
open Foreign
let get_version_string =
  foreign "yaml_get_version_string" (void @-> returning string)
We’re pulling in Ctypes and Foreign. Then the let binding uses foreign with the name of the C function we want to call plus a type signature for that function.
Next we need some calling code to print out the version string.
open Core.Std

let () =
  let version_string = get_version_string () in
  printf "Version: %s\n" version_string
Assuming you’ve got opam installed, you can get the dependencies with opam install core ctypes
and compile the whole thing.
> corebuild -pkg ctypes.foreign -lflags -cclib,-lyaml version_string.native
...
./version_string.native
Version: 0.1.6
We’ve got bindings to a native C library without writing any C.
For a more complicated example involving passing an allocated string back from C, let’s
look at the proc_pidpath
call from OSX. This particular library call takes a
process id (PID), fills the supplied buffer with the path of that process’s executable, and returns the number of bytes written:
int proc_pidpath(int pid, void * buffer, uint32_t buffersize);
To bind to this call we again define a compatible signature.
let pidpath =
  foreign ~check_errno:true "proc_pidpath"
    (int @-> ptr char @-> int @-> returning int)
The arguments simply mirror those of the C library call, along with a new
argument check_errno
which indicates that the C library sets errno if it encounters
a problem.
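To round this out, the bound function can be wrapped so callers get back a plain OCaml string. This is a hedged sketch rather than code from the post: the name pid_path is mine, and the 4096-byte buffer size (PROC_PIDPATHINFO_MAXSIZE) is an assumption.

```ocaml
(* Sketch: wrap the raw binding so callers receive an OCaml string.
   pid_path is an illustrative name; the 4096-byte buffer size
   (PROC_PIDPATHINFO_MAXSIZE) is an assumption. *)
let pid_path pid =
  let maxlen = 4096 in
  (* Allocate a C char buffer managed by ctypes. *)
  let buf = allocate_n char ~count:maxlen in
  (* proc_pidpath returns the number of bytes written to the buffer. *)
  let len = pidpath pid buf maxlen in
  string_from_ptr buf ~length:len
```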
http://stackoverflow.com/questions/22651910/returning-a-string-from-a-c-library-to-ocaml-using-ctypes-and-foreign
Ctypes provides native bindings for most things you’ll need. There’s all sorts of pointers and types matching pretty much every native C type you’ll need here.
]]>I’ve been exceptionally fortunate in the past month to accomplish a long-held goal of mine. As of the 13th of April I’ve been employed full time as a functional programmer. In particular I’ve taken the deepest of dives into Haskell. I thought it might be interesting, at least for me, to write up my thoughts after completing a month of Haskell.
First, the depth of the dive has been overwhelming and the learning curve more like a vertical rock climb. But the entire time, no matter the exhaustion, and believe me there was a lot of that, has been extraordinary. When I take a moment to reflect, I’ve had a smile on my face.
First thing: the degree to which types are ingrained in Haskell. That might not seem surprising in itself, Haskell is after all a strongly typed language, but it was surprising to me. I’ve used Erlang, with dialyzer, and OCaml a great deal before starting, and both these languages have reasonable type systems. OCaml is even described as strongly typed. So what am I getting at?
Everything in Haskell feels typed to the nth degree. Every possible abstraction is pulled out into a common place, whether Applicatives, Monads, Bimaps or Monad Transformers. It’s great that you can abstract like that. Using any Haskell library will require you to know about some of these things.
Coming from a background where I’d done Lisp, Erlang and OCaml, and some Haskell I thought I was totally prepared to start working in Haskell full time.
Learning Haskell the language is a good first step, but knowing the syntax and being comfortable reading code is one thing. What really surprised me was that knowing Haskell isn’t sufficient, you need to learn the set of typical Haskell libraries before you can really start making progress and feel at home in Haskell. Of course I’d used things like Monads in OCaml and read
There is no equivalent to the Monad, monad transformer, lenses, applicative, traversable library ecosystem in OCaml. I naively wonder why this hasn’t been built before and whether it’s even a good idea. The parallels to Scala and scalaz are all too apparent to me. Scala is a mixed OO/FP language in a similar way to OCaml. It also doesn’t enforce the same level of strictness with respect to side effects that Haskell does. So in both languages, if you want, you can create a mess of side-effecting code if you’re not careful. Also both languages allow mutation, again another side effect, without tracking this via the type system.
I want to thank Mark Hibberd and Charles O’Farrell for being such great mentors over the last month, may they never grow tired of my endless questions.
]]>Being on the curious side of things I have been interested lately in the dualities between programming languages. Like how one feature, say Type Classes in Haskell, compares to what is available in Scala or OCaml. This has led to me reading a substantial amount of academic papers about the subject.
So with that in mind I would like to give a brief introduction to OCaml style modules. Perhaps another post will go into how you can encode something like rank-n types from Haskell in OCaml, which doesn’t natively support them.
Preface: the use of the word module can be confusing, and it sometimes seems that module is used interchangeably to refer to structures. I’ve tried to avoid that, but it’s helpful to keep in mind for further reading. Look at what’s on the right-hand side of the equals in the code. Let’s start.
OCaml is a member of the ML family of languages, sharing common features like modules, a Hindley-Milner type system and strict evaluation. OCaml as a language can be thought of as two distinct parts: a core language that revolves around values and types, and a module language that revolves around modules and signatures. While OCaml does provide some support for bridging these parts in the form of First Class Modules, I won’t cover them here.
The key parts of the module system in OCaml are:
Structures provide a way of grouping together related declarations, like data types and the functions that operate on them; they also provide the values in the module language. Below is a module for integer Sets:
module IntSet = struct
type t = int
type set = t list
let empty = []
let member i s = List.exists (fun x -> x = i) s
let insert i s = if member i s then s else (i::s)
end
This code defines a new structure using the struct
keyword and binds it to a name
using module. It’s useful to note that OCaml types are written in lowercase (t
,
list
and set
) and type variables are written with a single quote 'a
. Also
type constructors are written differently to Haskell, in Haskell you’d have
List a
while in OCaml the order is reversed: t list
.
Basically a struct is an opening struct
followed by a bunch of type
and
let
bindings, and closed with an end
.
At the call site exposed declarations are referred to by dot notation:
IntSet.t
IntSet.empty
If no module name is defined within a file, say you have a file called set.ml
with:
type t = int
type set = t list
let empty = []
let member i s = List.exists (fun x -> x = i) s
let insert i s = if member i s then s else (i::s)
It will implicitly be given a structure name derived from the file name, Set
, but
as you may have worked out, module names are not bound to file names. Further,
structures can be nested within other structures, leading to more freedom than
just having one file become one module.
module IntSet = struct
module Compare = struct
type t = int
let eql x y = x = y
end
end;;
The values within the nested module are referred to like so:
IntSet.Compare.eql 1 1;;
While it is great to have functions namespaced like so, it would become tedious if you needed to use the longer name to refer to a nested module. OCaml provides a couple of solutions, first local opens.
Rather than having an open
statement at the top of the file and bringing
everything into scope for that file, we can do a local open and restrict the
scope to between the two brackets.
IntSet.Compare.(eql 1 1);;
The other option available is aliasing the module name to something shorter
module X = IntSet.Compare;;
X.eql 1 1;;
I mentioned open
before without saying what it does. Simply, open brings the
contents of one module into scope within another module, so they can be referred to without
the module name prefix.
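A minimal sketch of open in action (the module and function names here are mine, purely for illustration):

```ocaml
(* A small module to open. *)
module Greeting = struct
  let message = "hello"
  let shout s = String.uppercase_ascii s ^ "!"
end

(* After the open, Greeting's bindings are usable unqualified. *)
open Greeting

let () = assert (shout message = "HELLO!")
```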
Signatures are the interfaces for structures; a signature defines what parts of a structure are visible from the outside. A signature can be used to hide components of a structure or to export some definitions with more general types.
A signature is introduced with the sig
keyword
module type Set =
sig
type elt
type t
val empty : t
val member : elt -> t -> bool
val insert : elt-> t -> t
end
As you can see looking at our definition of Set, it lists a type and function
signatures without specifying a concrete implementation. It’s also bound to a
name Set
using module type
.
As I mentioned before, signatures are typically used to hide or change the interface a module exposes. By default all types and functions are exported from a module. This is useful for things like hiding implementation details, or ensuring the data type can only be constructed via the invariant-preserving operations that the module provides.
Typically in OCaml you’ll define your struct
in one file set.ml
and then
create a second file set.mli
which contains the signature for the module set.
Only occasionally will you see the signature and structure defined together.
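As a sketch of that sealing in the inline style, here is the earlier IntSet structure ascribed a signature that hides the list representation (the name SealedIntSet is mine; in practice the signature would usually live in set.mli):

```ocaml
(* Sealing a structure: consumers can no longer see that t is an int list. *)
module SealedIntSet : sig
  type t
  val empty : t
  val insert : int -> t -> t
  val member : int -> t -> bool
end = struct
  type t = int list
  let empty = []
  let member i s = List.exists (fun x -> x = i) s
  let insert i s = if member i s then s else i :: s
end

let () =
  let s = SealedIntSet.(insert 2 (insert 1 empty)) in
  assert (SealedIntSet.member 2 s);
  assert (not (SealedIntSet.member 9 s))
```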
Now to the functors; they’re not exactly like Haskell’s, though they do perform a kind of mapping.
Functors are for lifting functions into the module language; put another way, they are functions from structures to structures. Which brings the abstract idea of functors from category theory back to two concrete examples: where Haskell functors are functions from types to types, OCaml’s functors are functions from structures to structures.
Following our set example, we can make the set operations abstract over both the type inside the set and the ordering comparison.
module type ORDERING =
sig
type t
val compare : t -> t -> int
end;;
module type Set =
sig
type elt
type t
val empty : t
val member : elt -> t -> bool
val insert : elt-> t -> t
end;;
module MkSet (Ord : ORDERING) : (Set with type elt := Ord.t) =
struct
  type elt = Ord.t
  type t = Empty | Node of t * elt * t
  let empty = Empty
  let rec insert x = function
    | Empty -> Node (Empty, x, Empty)
    | Node (a, y, b) when Ord.compare x y < 0 -> Node (insert x a, y, b)
    | Node (a, y, b) when Ord.compare x y > 0 -> Node (a, y, insert x b)
    | Node (_, _, _) as s -> s
  let rec member x = function
    | Empty -> false
    | Node (l, v, r) ->
      let c = Ord.compare x v in
      c = 0 || member x (if c < 0 then l else r)
end;;
module IntOrdering = struct
type t = int
let compare x y = Pervasives.compare x y
end;;
module IntSet' = MkSet(IntOrdering);;
Here we define ORDERING
and Set
as signatures, similar to our previous
definitions. Then a functor is defined, MkSet,
that takes the ORDERING
signature and defines the types and functions for set based off that interface.
So the definition of MkSet
is completely abstracted away from the type used in
the set and the functions used on those types. As long as it implements
ORDERING
.
The last part defines a particular ordering for int
, binding t to int
and compare to Pervasives.compare
.
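To see the payoff, the same functor can be instantiated with a different ordering to get, say, a set of strings for free. This is a self-contained sketch that repeats the MkSet definition from above; StringOrdering and StringSet are names I have chosen for illustration:

```ocaml
(* The signature and functor from above, repeated so this compiles standalone. *)
module type ORDERING = sig
  type t
  val compare : t -> t -> int
end

module MkSet (Ord : ORDERING) = struct
  type elt = Ord.t
  type t = Empty | Node of t * elt * t
  let empty = Empty
  let rec insert x = function
    | Empty -> Node (Empty, x, Empty)
    | Node (a, y, b) when Ord.compare x y < 0 -> Node (insert x a, y, b)
    | Node (a, y, b) when Ord.compare x y > 0 -> Node (a, y, insert x b)
    | Node (_, _, _) as s -> s
  let rec member x = function
    | Empty -> false
    | Node (l, v, r) ->
      let c = Ord.compare x v in
      c = 0 || member x (if c < 0 then l else r)
end

(* A different ordering instantiates a completely different set type. *)
module StringOrdering = struct
  type t = string
  let compare = String.compare
end

module StringSet = MkSet (StringOrdering)

let () =
  let s = StringSet.insert "b" (StringSet.insert "a" StringSet.empty) in
  assert (StringSet.member "a" s);
  assert (not (StringSet.member "z" s))
```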
After covering what is in the OCaml module system, what exactly do we use it for? At the most basic level we collect together types and functions, which is pretty much what all modules do. Beyond that we can:
module type SETFUNCTOR =
functor (O: ORDERING) ->
sig
type t = O.t (* concrete *)
type set (* abstract *)
val empty : set
val add : t -> set -> set
val member : t -> set -> bool
end;;
Here we expose the elements within the set via type t = O.t
so they’re a
concrete type, while set
isn’t given a definition so the consumers of this
module can’t look into that type without using the functions provided in the Set
module. This hiding using abstract types lets us swap out different
implementations for testing purposes or if requirements change.
Namespacing functions and types: all types and functions live within some module.
Extending existing modules in a type-safe way. You may want to extend a
module from a library with extra derived functions. For example, the Core
library from Jane Street extends the built-in OCaml library with a number of
new and different functions, e.g. say List didn’t provide a transpose
function.
Instantiating modules with state. OCaml allows modules to include mutable state; while we may not particularly like mutable things, sometimes it’s necessary, and you may want multiple instances of a particular module, each with their own state. Functors make doing this more succinct.
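A small sketch of this idea, with illustrative names of my own: a generative functor that mints counter modules, where each application produces a module with its own mutable state.

```ocaml
(* A generative functor: each application of MkCounter () creates a
   fresh module with its own ref cell. *)
module MkCounter () = struct
  let count = ref 0
  let incr_count () = incr count
  let current () = !count
end

(* Two independent instances, each with separate state. *)
module A = MkCounter ()
module B = MkCounter ()

let () =
  A.incr_count ();
  A.incr_count ();
  B.incr_count ();
  assert (A.current () = 2);
  assert (B.current () = 1)
```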
Collecting definitions and exporting them as a single module, e.g. Core.Std inside the Jane Street Core library.
The best reference is really Real World OCaml. If you’ve got some Haskell experience and don’t mind reading a paper then “ML Modules and Haskell Type Classes: A Constructive Comparison” by Stefan Wehr and Manuel Chakravarty gives a thorough coverage of how ML modules stack up to Type Classes.
]]>Lenses have been on my mind since encountering them last year in the context of Haskell. Much of the literature on lenses has a very Haskell slant, so I will show how they can be used in OCaml.
The theory of lenses, and their accompanying prisms and traversals, has been better described by other people. This article at FPComplete was a particularly good one. I’m just going to cover how to use ocaml-lens as a minimal lens implementation.
First since ocaml-lens isn’t in opam, clone the repo locally and open up
utop
. Then load the lens.ml
file into utop
.
{%codeblock lang:ocaml%} utop # #use "lens.ml";; .. {% endcodeblock %}
Starting with a few record types for a car, editor and book.
{%codeblock lang:ocaml%} type car = { make : string; model: string; mileage: int; };;
type editor = { name: string; salary: int; car: car; };;
type book = { name: string; author: string; editor: editor; };; {%endcodeblock%}
Creating a new book is as simple as:
{%codeblock lang:ocaml%} let scifi_novel = { name = "Metro 2033"; author = "Dmitry Glukhovsky"; editor = { name = "Vitali Gubarev"; salary = 1300; car = { make = "Lada"; model = "VAZ-2103"; mileage = 310000 } } };;
{% endcodeblock %}
Given our scifi_novel
we can access the editor’s car mileage:
{%codeblock lang:ocaml%} let mileage = scifi_novel.editor.car.mileage;; {% endcodeblock %}
Setting the mileage is a bit trickier, we need to unpack each record:
{%codeblock lang:ocaml%} let second_edition = { scifi_novel with editor = { scifi_novel.editor with car = { scifi_novel.editor.car with mileage = 1000 } } };; {% endcodeblock %}
That’s not really an appealing prospect, can we do better?
Enter lenses. At the most simple level a lens is a pair of functions for getting and setting a property.
{%codeblock lang:ocaml%} (** Lens type definition *) type ('a, 'b) t = { get : 'a -> 'b; (** Functional getter *) set : 'b -> 'a -> 'a (** Functional setter *) } {% endcodeblock %}
With this definition of a lens, modifying the mileage is now:
{%codeblock lang:ocaml%} let a = compose mileage_lens (compose car_lens editor_lens) in _set 10 scifi_novel a;; {% endcodeblock %}
In the background we need to define some lenses for the records above:
{%codeblock lang:ocaml%} let car_lens = { get = (fun x -> x.car); set = (fun v x -> { x with car = v }) };;
let editor_lens = { get = (fun x -> x.editor); set = (fun v x -> { x with editor = v }) };;
let mileage_lens = { get = (fun x -> x.mileage); set = (fun v x -> { x with mileage = v })
};; {% endcodeblock %}
Using these definitions, the original lens version of modifying the editor’s car mileage works.
The compose operator we used allows us to combine two lenses to go from the novel into the editor and then into the car. And compose can be combined with itself to build up arbitrarily deep lenses into a structure.
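To make the mechanics concrete, here is a self-contained sketch of the pair-of-functions lens and a compose, written from the type definition shown earlier (the type name lens and the trimmed record fields are mine, not from ocaml-lens):

```ocaml
(* The pair-of-functions lens from the definition above. *)
type ('a, 'b) lens = {
  get : 'a -> 'b;
  set : 'b -> 'a -> 'a;
}

(* compose inner outer: focus through outer first, then inner. *)
let compose inner outer = {
  get = (fun a -> inner.get (outer.get a));
  set = (fun v a -> outer.set (inner.set v (outer.get a)) a);
}

type car = { model : string; mileage : int }
type editor = { name : string; car : car }

let car_lens =
  { get = (fun e -> e.car); set = (fun v e -> { e with car = v }) }
let mileage_lens =
  { get = (fun c -> c.mileage); set = (fun v c -> { c with mileage = v }) }

let () =
  let ed = { name = "Vitali Gubarev";
             car = { model = "VAZ-2103"; mileage = 310000 } } in
  let l = compose mileage_lens car_lens in
  assert (l.get ed = 310000);
  let ed' = l.set 1000 ed in
  assert (l.get ed' = 1000);
  (* No mutation: the original record is untouched. *)
  assert (ed.car.mileage = 310000)
```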
{%codeblock lang:ocaml%} let editor_car_lens = compose car_lens editor_lens;; {% endcodeblock %}
This way of composing can seem backwards, you supply the inner lens first then
the outer lens. We can fix that by using the infix operators, open the Infix
module and define the same lens:
{%codeblock lang:ocaml%} let editor_car_lens = editor_lens |-- car_lens;; {% endcodeblock %}
This feels more intuitive, reading it left to right. Revisiting our original
_set
mileage example, we can now write it:
{%codeblock lang:ocaml%} _set 10 scifi_novel (editor_lens |-- car_lens |-- mileage_lens);; (* or even *) ((editor_lens |-- car_lens |-- mileage_lens) ^= 10) @@ scifi_novel;; {% endcodeblock %}
The infix module comes with some other helpful operators like
|.
for get and ^=
for set. All these operators avoid mutation so our
code remains pure and referentially transparent.
There are a heap more things that lenses can do, and while this ocaml-lens
package is pretty basic, looking at the hundreds of functions exported by
Control.Lens
in Haskell you can get a good idea of the possibilities. Control.Lens
includes
all the basic lens functions plus things like:
prisms, which are lenses but for sum types
traversals, which are lenses that focus on multiple targets simultaneously
I made use of the following resources to write this and took some of the examples and definitions from the following articles. All mistakes are my own and probably accidental.
]]>