A love letter to Fmt

OCaml is somewhat unique among languages in that the binaries produced by its compiler do not contain type information; there is no concept of "reflection", and no way to check the "type" of a value at runtime; most type information is discarded during compilation. A side effect of this is that if you want to print a value to the console, you have to instruct the program how to print it; there is nothing like the "%v" directive in Go's fmt.Printf.

For printing, the OCaml standard library provides the Format module. It is similar to printf-style libraries in other languages, and adds the concept of "boxes", which track and preserve indentation across line breaks within a box and automatically break long lines. It's a powerful library, but it's cumbersome. Take the record type below, as an example:

type color = {
  r : int;
  g : int;
  b : int;
}
let slate_grey = {r = 0x70; g = 0x80; b = 0x90}
let hot_pink   = {r = 0xff; g = 0x69; b = 0xb4}
let aquamarine = {r = 0x7f; g = 0xff; b = 0xd4}

Let's say we want to print the color as a color code suitable for use in a CSS style sheet. We could define a printer function, pp_color_hex, that would look like this:

let pp_color_hex out { r; g; b} = Format.fprintf out "#%02x%02x%02x" r g b

With a suitable printer function, we can print values of this type like so:

Format.asprintf "Color %a" pp_color_hex slate_grey ;;
- : string = "Color #708090"

In the simple case, it is relatively easy to use. What about the not-so-simple case? What if we want to print a list of these?

let pp_color_list out vl =
  List.iter (pp_color_hex out) vl

Format.asprintf "%a" pp_color_list [slate_grey; hot_pink; aquamarine] ;;
- : string = "#708090#ff69b4#7fffd4"

That's still not so bad; we can use currying in the first argument to List.iter to create a single-argument function concisely. But the values are all smushed together; what if we want to add delimiters?

let pp_color_list_delim out vl =
  (* print the separator before every element except the first *)
  let first = ref true in
  List.iter (fun elt ->
      if !first then first := false
      else Format.fprintf out ";@ ";
      pp_color_hex out elt) vl

That's a lot of code just to print a list. We can use the pp_print_list function to make things a bit simpler:

let pp_color_list_delim out vl =
  let pp_sep out () = Format.fprintf out "; " in
  Format.pp_print_list ~pp_sep pp_color_hex out vl

Format.asprintf "[%a]" pp_color_list_delim [slate_grey; hot_pink; aquamarine] ;;
- : string = "[#708090; #ff69b4; #7fffd4]"
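The "boxes" and break hints mentioned at the start only become visible once the output no longer fits on a line. Here is a minimal sketch, using only the standard library, that renders the same list at two margins. The names pp_int_list and render are made up for this illustration, and the narrow margin of 12 is an arbitrary choice.

```ocaml
(* A list printer wrapped in a box; "@ " marks a place where a break is allowed. *)
let pp_int_list out l =
  let pp_sep out () = Format.fprintf out ";@ " in
  Format.fprintf out "@[<hov 1>[%a]@]"
    (Format.pp_print_list ~pp_sep Format.pp_print_int) l

(* Hypothetical helper: render a list into a string with a given right margin. *)
let render ~margin l =
  let buf = Buffer.create 64 in
  let ppf = Format.formatter_of_buffer buf in
  Format.pp_set_margin ppf margin;
  Format.fprintf ppf "%a@?" pp_int_list l;  (* "@?" flushes the formatter *)
  Buffer.contents buf

let numbers = [1; 2; 3; 4; 5; 6; 7; 8; 9; 10]
let wide = render ~margin:78 numbers    (* fits on a single line *)
let narrow = render ~margin:12 numbers  (* too long: breaks at the hints *)

let () = print_endline wide; print_endline narrow
```

With the wide margin the break hints are rendered as plain spaces; with the narrow one the list is split over several lines, and the box keeps the continuation lines indented.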

It's manageable, but it's a lot to type, and a lot to remember. In languages with reflection, you have the ability to print values without knowing their specific structure. As mentioned earlier, OCaml does not have reflection, and much of the information about a type, like field names, is discarded after compilation. This means that printing something requires more thought than just typing out print(v) or fmt.Printf("%v", v) or whatever.

Enter the Fmt module, from Daniel Bünzli. It takes a compositional approach to pretty-printing, allowing you to build up complex pretty printers as a combination of simple ones. It is not a wild departure from the standard library's design, but its use of short, intuitive names for its API makes all the difference. This is where OCaml's "local open" syntax really shines. In the expression

Fmt.(pair string int)

the sub-expression pair string int is evaluated with all identifiers from the Fmt module in scope. This lets the Fmt module use intuitive, short names that could otherwise clash with names in the standard library or confuse the reader if they didn't know where they were defined. The above expression, for example, defines a pretty-printer which prints a tuple of a string and an integer. Let's go back to our color example. We can define pp_color_list like so:

let pp_color_list = Fmt.(brackets (list ~sep:semi pp_color_hex))

Fmt.str "%a" pp_color_list [hot_pink; aquamarine; slate_grey] ;;
- : string = "[#ff69b4; #7fffd4; #708090]"
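The same style works for the pair printer from the local-open example. The following is a small sketch, assuming the fmt package is installed; pp_entry and pp_entries are names invented here, and the separator is passed explicitly so the output does not depend on the combinator's default.

```ocaml
(* Requires the fmt package. *)

(* A (string * int) pair, printed as "(name, number)". *)
let pp_entry : (string * int) Fmt.t =
  Fmt.(parens (pair ~sep:comma string int))

(* A list of such pairs, printed in brackets with "; " between elements. *)
let pp_entries : (string * int) list Fmt.t =
  Fmt.(brackets (list ~sep:semi pp_entry))

let one = Fmt.str "%a" pp_entry ("answer", 42)
(* one = "(answer, 42)" *)

let many = Fmt.str "%a" pp_entries [("a", 1); ("b", 2)]
(* many = "[(a, 1); (b, 2)]" *)

let () = print_endline one; print_endline many
```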

The definitions feel more declarative, and describe the structure of the thing to print in an almost-English way. The library really shines as the structure of your data gets more complicated and you have to deal with multiple container types. Here is a more complicated example from a side project of mine, which prints a representation of a protocol buffers schema expression as an S-expression:

type t = term list
and term =
  | Int of int
  | Real of float
  | Str of string
  | Bool of bool
  | Id of string
  | Expr of t

let rec pp ppf =
  Fmt.(parens (list ~sep:sp pp_term)) ppf

and pp_term ppf = function
  | Int x -> Fmt.int ppf x
  | Real x -> Fmt.float ppf x
  | Str x -> Fmt.(quote string) ppf x
  | Bool x -> Fmt.bool ppf x
  | Id x -> Fmt.string ppf x
  | Expr e -> pp ppf e

Produces output like this:

(oneof  target
 ((oneof_field google.protobuf.Timestamp time 2 ())
  (oneof_field string snapshot 3
   ((google.api.resource_reference.type
     "pubsub.googleapis.com/Snapshot")))))

One thing I struggled with initially, however, is the brevity of the documentation. Here is an example:

val nop : 'a t
(** [nop] formats nothing. *)

val any : (unit, Format.formatter, unit) Stdlib.format -> 'a t
(** [any fmt ppf v] formats any value with the constant format [fmt]. *)

val using : ('a -> 'b) -> 'b t -> 'a t
(** [using f pp ppf v] is [pp ppf (f v)]. *)

val const : 'a t -> 'a -> 'b t
(** [const pp_v v] always formats [v] using [pp_v]. *)

For someone who is already familiar with OCaml, and perhaps other languages or libraries that encourage an applicative or point-free style where functions are built up using combinators, these definitions are efficient and mathematical. For a newcomer, however, they require more effort to decipher. You may understand, mechanically, that

Fmt.using f pp ppf v

is equivalent to

pp ppf (f v)

but you may not immediately understand what it means in practice, and why you'd use it. I'll walk through some examples of the more powerful combinators.
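As a warm-up, here is a minimal sketch of the four combinators quoted above, each applied to a throwaway value. It assumes the fmt package is installed; the s_* names and the sample values are invented for illustration.

```ocaml
(* Requires the fmt package. *)

(* nop ignores its argument and prints nothing. *)
let s_nop = Fmt.str "%a" Fmt.nop 42
(* s_nop = "" *)

(* any ignores its argument and prints a constant format string. *)
let s_any = Fmt.str "%a" (Fmt.any "<hidden>") 42
(* s_any = "<hidden>" *)

(* using converts the value first, then prints the result of the conversion. *)
let s_using = Fmt.str "%a" Fmt.(using String.length int) "hello"
(* s_using = "5" *)

(* const ignores its argument and always prints the value it was built with. *)
let s_const = Fmt.str "%a" Fmt.(const int 7) "ignored"
(* s_const = "7" *)

let () = List.iter print_endline [s_nop; s_any; s_using; s_const]
```

The common thread is that every one of these is a printer for some input type, even when the input is ignored, which is what lets them slot into larger combinator expressions.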

Printing records

Let's take an arbitrary record type, adapted from the Unix module:

type file_kind =
  | REG (** regular file *)
  | DIR (** directory *)

type file_perm = int

type xattr = string * bytes

type stats = {
  dev : int;         (** device number *)
  ino : int;         (** inode number *)
  kind : file_kind;  (** kind of file *)
  perm : file_perm;  (** file permissions *)
  uid : string;      (** file owner *)
  gid : string;      (** group owner *)
  xattr : xattr list (** extended attributes *)
}

How can we print a value of type stats? We can define printers for the types it depends on, starting with the file_kind type. We have probably already written a function to convert this value to a string:

let file_kind_to_string = function
  | REG -> "regular file"
  | DIR -> "directory"

We can use the Fmt.using combinator:

let pp_file_kind = Fmt.(using file_kind_to_string string)

You can read Fmt.using conv p as "use conv to convert the value into something that can be printed with p". This combinator takes as its first argument a function that converts the item to print into something else, and as a second argument a printer for the result of the conversion. The Fmt module provides printers for most well-known types. The printer for strings is conveniently named string.

To print the permission in the conventional 4-digit octal format, we can use the fmt combinator:

let pp_file_perm = Fmt.fmt "%04o"

To print an extended attribute, we can use the "pair" combinator, along with the printers for binary data:

let pp_xattr =
  let bindata = Fmt.(on_bytes (octets ~w:2 ())) in
  Fmt.(pair ~sep:(any "=") string bindata)

Now we're ready to print the record using the "record" combinator:

let pp_stats =
  Fmt.(braces (record [
    field "dev"   (fun v -> v.dev)   int;
    field "ino"   (fun v -> v.ino)   int;
    field "kind"  (fun v -> v.kind)  pp_file_kind;
    field "perm"  (fun v -> v.perm)  pp_file_perm;
    field "uid"   (fun v -> v.uid)   string;
    field "gid"   (fun v -> v.gid)   string;
    field "xattr" (fun v -> v.xattr) (list pp_xattr);
  ]))
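To see the pieces working together, here is a self-contained sketch that repeats the definitions above in condensed form and prints one value. It assumes the fmt package is installed; the sample value is made up, and no expected output is shown because the exact layout depends on the boxes Fmt opens.

```ocaml
(* Requires the fmt package. Types are repeated so the sketch stands alone. *)
type file_kind = REG | DIR

type stats = {
  dev : int; ino : int; kind : file_kind; perm : int;
  uid : string; gid : string; xattr : (string * bytes) list;
}

let file_kind_to_string = function
  | REG -> "regular file"
  | DIR -> "directory"

let pp_file_kind = Fmt.(using file_kind_to_string string)
let pp_file_perm = Fmt.fmt "%04o"

let pp_xattr =
  let bindata = Fmt.(on_bytes (octets ~w:2 ())) in
  Fmt.(pair ~sep:(any "=") string bindata)

let pp_stats =
  Fmt.(braces (record [
    field "dev"   (fun v -> v.dev)   int;
    field "ino"   (fun v -> v.ino)   int;
    field "kind"  (fun v -> v.kind)  pp_file_kind;
    field "perm"  (fun v -> v.perm)  pp_file_perm;
    field "uid"   (fun v -> v.uid)   string;
    field "gid"   (fun v -> v.gid)   string;
    field "xattr" (fun v -> v.xattr) (list pp_xattr);
  ]))

(* Hypothetical sample value, invented for this example. *)
let sample = {
  dev = 1; ino = 4242; kind = REG; perm = 0o644;
  uid = "alice"; gid = "staff";
  xattr = [("user.comment", Bytes.of_string "hi")];
}

let rendered = Fmt.str "%a" pp_stats sample
let () = print_endline rendered
```

The output should list each field with its label, with the permission rendered in octal and the file kind rendered through file_kind_to_string.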

Writing a record printer is one of the more tedious things to do in OCaml, because, as you see above, you have to tell the printer how to access each field. If you find yourself doing this a lot in a codebase, you may want to generate pretty-printers for records using a preprocessor like ppx_deriving.show.

Printing sequences

The program I use to build this blog represents an index of blog entries as a mapping of a file name to a set of key/value tuples, like so:

module SMap = Map.Make(String)

type attr = string * string

(* a set of key/value pairs, ordered with the polymorphic compare *)
module AttrSet = Set.Make(struct
  type t = attr
  let compare = compare
end)

type tuple = AttrSet.t
type index = tuple SMap.t

I serialize it to a text file, with a format like so:

path="about" title="about" hidden="true" date="2024-11-25T08:28:13Z"
path="tech/9p-from-scratch/part-1" title="Writing a 9P server from scratch" tags="plan9"

This format is emitted by the Fmt module with the help of printing functions that I've written. Again, we start from the base components and work our way up.

(* key=value, with the value quoted and escaped *)
let pp_attr = Fmt.(pair ~sep:(any "=") string (fmt "%S"))

To print a set, we can use the Set module's iter function with the Fmt.iter combinator:

let pp_tuple = Fmt.(iter ~sep:sp AttrSet.iter pp_attr)

To print a map, we can use the iter_bindings combinator:

(* the map key is the entry's path *)
let pp_key = Fmt.fmt "path=%S"
let pp_index =
  let sep = Fmt.any "@;" in
  Fmt.(iter_bindings ~sep SMap.iter (pair ~sep:sp pp_key pp_tuple))
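Here is a self-contained sketch of the whole pipeline, assuming the fmt package is installed. It is a reconstruction for illustration, not the exact code behind this blog: the AttrSet module, the "path=" prefix, and the sample index are assumptions, and the printer adds vbox and hbox (which the snippet above does not use) so that each entry reliably lands on its own line.

```ocaml
(* Requires the fmt package. *)
module SMap = Map.Make (String)

type attr = string * string

(* Assumed representation: a set of (key, value) pairs. *)
module AttrSet = Set.Make (struct
  type t = attr
  let compare = compare
end)

(* key="value" *)
let pp_attr : attr Fmt.t = Fmt.(pair ~sep:(any "=") string (fmt "%S"))

(* attributes separated by spaces, in the set's own order *)
let pp_tuple : AttrSet.t Fmt.t = Fmt.(iter ~sep:sp AttrSet.iter pp_attr)

(* the map key, printed as path="..." *)
let pp_key : string Fmt.t = Fmt.fmt "path=%S"

(* one entry per line: hbox keeps an entry on one line, vbox stacks entries *)
let pp_index : AttrSet.t SMap.t Fmt.t =
  Fmt.(vbox (iter_bindings ~sep:cut SMap.iter
               (hbox (pair ~sep:sp pp_key pp_tuple))))

(* Hypothetical sample data. *)
let index =
  SMap.empty
  |> SMap.add "about" (AttrSet.of_list [("title", "About"); ("hidden", "true")])
  |> SMap.add "tech/fmt" (AttrSet.of_list [("title", "Fmt")])

let rendered = Fmt.str "%a" pp_index index
let () = print_endline rendered
```

Note that a set iterates in sorted order, so in this sketch hidden is printed before title regardless of the order in which the attributes were added.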

Combining printers

Sometimes, when printing a large or multi-faceted type, you want to break up the work into smaller printers, and combine them together. The ++ operator can help you do that. For example, I have been working on a Netlink library and I needed to define a printer for a netlink message, which has a header, a body, and attributes. I can define printers for those separately:

type nlmsghdr = { ... }
type nlmsgbody =
  | RTM_GETLINK of ifinfomsg
  | RTM_GETADDR of ifaddrmsg
  (* .. and so on .. *)

type nlmsg = nlmsghdr * nlmsgbody

let pp_nlmsghdr: nlmsghdr Fmt.t =
  Fmt.(record
    [ field "nlmsg_len" (fun v -> v.nlmsg_len) int32
    ; field "nlmsg_type" (fun v -> v.nlmsg_type) (fmt "0x%04x")
    ; field "nlmsg_flags" (fun v -> v.nlmsg_flags) (array pp_nlmsg_flags)
    (* .. you get the idea *)
    ])

let pp_nlmsgpayload ppf = function
  | RTM_GETLINK ifi
  | RTM_NEWLINK ifi
  | RTM_DELLINK ifi -> pp_ifinfomsg ppf ifi
  (* ... and so on ... *)

And then combine them, inserting strings and line breaks, or wrapping parts in delimiters wherever I see fit:

let pp_nlmsg: nlmsg Fmt.t = Fmt.(
  const string "(Nlmsg)"
  ++ braces (using fst pp_nlmsghdr)
  ++ cut
  ++ braces (using snd pp_nlmsgpayload))
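Since the netlink types above are elided, here is a smaller self-contained sketch of the same pattern, assuming the fmt package is installed. The header, body and msg types are hypothetical stand-ins invented for this example; they are not the netlink definitions.

```ocaml
(* Requires the fmt package. *)

(* Hypothetical stand-ins for a message header and body. *)
type header = { len : int; kind : int }
type body = Ping | Data of string
type msg = header * body

let pp_header : header Fmt.t = fun ppf h ->
  Fmt.pf ppf "len=%d kind=0x%04x" h.len h.kind

let pp_body : body Fmt.t = fun ppf -> function
  | Ping -> Fmt.string ppf "ping"
  | Data s -> Fmt.pf ppf "data %S" s

(* Each operand of ++ receives the same value; using picks out the part
   that the operand knows how to print. *)
let pp_msg : msg Fmt.t =
  Fmt.(const string "Msg"
       ++ braces (using fst pp_header)
       ++ sp
       ++ braces (using snd pp_body))

let rendered = Fmt.str "%a" pp_msg ({ len = 20; kind = 0x12 }, Data "hi")
(* rendered = {|Msg{len=20 kind=0x0012} {data "hi"}|} *)

let () = print_endline rendered
```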

There is much more to this library, including colored output, paragraph rendering, conditional output, English ordinals, and more. What I really appreciate is the use of short, descriptive names, which is only really practical because of OCaml's local open syntax. What was just a nice library becomes essentially a little domain-specific language; after using it for a while, it feels really fluent to write printers like this.

There are quite a few programs which will automatically generate printing routines for your types during compilation, and I use them on occasion, but I still prefer the flexibility of writing my own.