The Standard ML Programming Language


1 Overview of Standard ML (SML)

SML is a procedural programming language with extremely strong support for higher-order functions and abstraction. It is often called a “functional” language, and it can be used that way, but SML is also an imperative language, with mutable storage and side effects.
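
As a minimal sketch of both styles, the following uses a mutable reference cell next to a higher-order function. The names counter, next, and squares are hypothetical, chosen only for this illustration.

```sml
(* Imperative style: a counter stored in a mutable reference cell. *)
val counter : int ref = ref 0

(* Each call updates the cell as a side effect and returns the new value. *)
fun next () = (counter := !counter + 1; !counter)

val a = next ()   (* a = 1 *)
val b = next ()   (* b = 2, because the cell was updated by the first call *)

(* Functional style: a higher-order function applied to an anonymous function. *)
val squares = List.map (fn x => x * x) [1, 2, 3]   (* [1, 4, 9] *)
```

Calling next twice with the same argument gives different results, which is exactly the kind of side effect a purely functional language would rule out.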

In syntax and semantics, SML stays fairly close to the λ-calculus. For example, just as in the λ-calculus, each function in SML takes exactly one argument, so to pass more than one value you must pack the values together into a data structure such as a tuple or a record.
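
A short sketch of what this looks like in practice, using hypothetical function names (add, area, addC). The curried form at the end is a related idiom that the text above does not mention: instead of packing the arguments, the function returns another one-argument function.

```sml
(* add takes ONE argument: a pair of type int * int. *)
fun add (x, y) = x + y                      (* add : int * int -> int *)

(* A record works the same way; the single argument has named fields. *)
fun area {width, height} = width * height   (* {height:int, width:int} -> int *)

(* Curried alternative: addC takes one int and returns a function int -> int. *)
fun addC x y = x + y                        (* addC : int -> int -> int *)
val inc = addC 1

val three  = add (1, 2)                     (* 3 *)
val twelve = area {width = 3, height = 4}   (* 12 *)
val six    = inc 5                          (* 6 *)
```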

SML is called a “strongly typed” language, because each type has a well-defined meaning which is enforced by SML's typing rules. Because the meanings of types exclude the possibility of ill-defined behavior, it is said that “a well-typed SML program cannot go wrong”. An SML program can still (correctly) exit due to an exception that is raised and not caught, but this behavior follows strictly from the rules rather than being unpredictable.
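
To illustrate, here is a small sketch in which errors surface as exceptions with well-defined behavior. The exception Negative and the function isqrt are hypothetical names invented for this example; Div is the standard exception raised by integer division by zero.

```sml
exception Negative of int

(* Integer square root by linear search; raises Negative on negative input. *)
fun isqrt n =
  if n < 0 then raise Negative n
  else
    let
      fun loop k = if (k + 1) * (k + 1) > n then k else loop (k + 1)
    in
      loop 0
    end

val r1 = isqrt 17                               (* 4 *)

(* A raised exception is an ordinary control transfer that can be caught. *)
val r2 = isqrt (~4) handle Negative _ => ~1     (* ~1 *)

(* Built-in operations follow the same rules. *)
val r3 = (10 div 0) handle Div => 0             (* 0 *)

(* Ill-typed code is rejected before the program runs, for example:
   val bad = 1 + "two"
*)
```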

An SML implementation automatically infers most of the types used by a program, so a programmer typically needs to write very few type annotations.
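
The following sketch shows a few unannotated definitions together with the types an implementation would infer for them (given in comments; the exact names of type variables may differ between implementations). All function names are hypothetical.

```sml
(* No annotations are written; the inferred types are shown in comments. *)
fun compose f g x = f (g x)
(* compose : ('a -> 'b) -> ('c -> 'a) -> 'c -> 'b *)

fun len [] = 0
  | len (_ :: rest) = 1 + len rest
(* len : 'a list -> int *)

fun swap (x, y) = (y, x)
(* swap : 'a * 'b -> 'b * 'a *)

(* One case where an annotation is useful: overloaded arithmetic defaults
   to int, so real must be requested explicitly. *)
fun double (x : real) = x + x
(* double : real -> real *)

val n     = len [10, 20, 30]                              (* 3 *)
val p     = swap (1, "one")                               (* ("one", 1) *)
val seven = compose (fn x => x + 1) (fn x => x * 2) 3     (* 7 *)
val d     = double 1.5                                    (* 3.0 *)
```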

You can learn about SML from the following resources.

2 Implementations of SML

  • NOTE: This list only includes SML implementations that have been made available to the public.
  • “REPL”: This means an interactive top-level “read-eval-print loop”, which allows the user to type individual top-level SML declarations and have them parsed, type checked, and executed (either interpreted, or compiled and then run as machine code), and then have the results pretty printed. After processing each declaration, the REPL will ask the user for more.
  • “FFI”: This means a “foreign function interface”, generally the ability to call functions in C libraries, but sometimes the ability to call other languages, and sometimes the ability to have SML called from other languages.
  • General-purpose non-proprietary implementations
    • MLton
    • Standard ML of New Jersey (SML/NJ)
      • REPL: yes
      • Compilation, building multi-file programs, and execution model
      • Support
      • FFI
        • SML/NJ-C

          This FFI has these advantages:

          • simple interface (relatively speaking)
          • C can call SML (after first being called from SML, I think, i.e., you can't have a C main)

          The disadvantages:

          • The SML run-time support code must have the list of C functions that can be called compiled into it. You have to edit some files in the SML/NJ source code and recompile SML/NJ if you need access to additional C functions.
          • The interface is simple, i.e., only a restricted set of data types can be passed across the SML/C boundary as function arguments or returned results.
          • Anything complex passed to C must be presumed to live forever and can not be garbage collected. This mainly means SML function closures (because this FFI can't really pass other complex things?).
        • NLFFI (“No-Longer-Foreign Function Interface”)

          This FFI has these advantages:

          • Can build and handle arbitrary C data structures in SML code.
          • Can load C libraries and find functions/variables in them at run-time. You never need to recompile SML/NJ (provided it was built with NLFFI support).
          • Can manage memory of C data structures. If you know something is no longer used, you can free it.

          The disadvantages:

          • Very complex types (but presumably a type error slicer will help a lot here).
          • No support for calling SML from C (I think). In principle, you could use the same method the other FFI uses to wrap SML closures so they can be called from C, but you would have to do it by hand.
          • Must manage memory of C data structures.
      • Variations to SML

        See the SML/NJ list of special features and the list of SML/NJ deviations on the MLton web site.

        • Limitations and errors
          • Disallows multiple uses of “where” on signatures with the same name.
          • Implements multiple simultaneous “withtype” incorrectly.
          • Disallows non-datatype uses of datatype replication.
          • Disallows certain cases of “sharing”.
          • Handles the value polymorphism restriction incorrectly in some cases.
          • Assigns overly liberal types to some things in the standard basis library.
        • Extensions
          • Allows some additional flexibility that is contrary to SML 1997.
          • Supports vector syntax and or-patterns.
          • Allows higher-order functors.
          • Supports quasiquotation.
          • Supports lazy evaluation of datatypes. You must do (Control.lazysml := true) to enable this. Then you have a “lazy” keyword that can be used with datatype definitions.
    • Poly/ML
    • Moscow ML
      • REPL: yes
      • Development support: apparently nothing special; use Emacs
      • FFI: yes
      • External library bindings
        • Comes with library bindings to Gdimage, PostgreSQL, MySQL, POSIX regular expressions, sockets (Socket, different from the SML Basis Library), GNU gdbm, etc.
      • Variations to SML
        • Limitations
          • Moscow ML does not strictly obey SML 1997 for the dynamic semantics, because it evaluates arguments in a different order.
          • In the Moscow ML library, the implementations of TextIO, Array, and Vector do not conform to the SML Basis Library standard (according to Norman Ramsey).
        • Extensions
          • Moscow ML extends SML 1997 with higher-order functors, first-class modules (functors and structures), and recursive signatures and structures.
          • Supports quasiquotation.
    • ML Kit (with Regions)
      • REPL: not verified
      • Development support: apparently nothing special; use Emacs
      • FFI: yes
      • Variations to SML
        • Supports quasiquotation (with the --quotation command-line option).
      • Support
      • ML Kit version 1 (1993)

        The ML Kit described above is “the ML Kit with Regions”, which first appeared around 1997. Its predecessor is now known as the ML Kit version 1, and its announcement described it as follows:

        “The ML Kit is a straight translation of the Definition of Standard ML into a collection of Standard ML modules. For example, every inference rule in the Definition is translated into a small piece of Standard ML code which implements it. The translation has been done with as little originality as possible – even variable conventions from the Definition are carried straight over to the Kit.

        If you are primarily interested in executing Standard ML programs efficiently, the ML Kit is not the system for you! (It uses a lot of space and is very slow.) The Kit is intended as a tool box for those people in the programming language community who may want a self-contained parser or type checker for full Standard ML but do not want to understand the clever bits of a high-performance compiler. We have tried to write simple code and module interfaces; we have not paid any attention to efficiency.”

    • SML# (change history)
    • Alice ML
      • REPL: yes
      • Development support
        • Comes with a GUI for the interactive top-level, including auto-updating data inspector windows.
        • Alice can also be used with SML Mode for Emacs with Alice patches.
      • External library bindings
        • GUI: supplied by these structures: Gtk, GLib, Pango, Atk, Gdk, Canvas
        • Database: structure SQLite
        • XML: structure XML, a binding for libxml2
      • Support
        • Mailing lists
        • Bug tracker (Bugzilla)
        • Wiki
      • Variations to SML

        See these documentation pages: limitations, incompatibilities, sugar, futures, packages, modules, types.

        • Limitations
          • No overloading of literal constants (e.g., 1 cannot have a type other than int).
          • Some things are missing from the standard basis library.
          • Valid SML 1997 code can fail in Alice ML due to variations in how functors, structures, and datatypes work, additional reserved words, extra material in the standard basis library, additional syntax for line comments, and the effect of concurrent futures on the exhaustiveness of pattern matching.
          • Datatypes are structural (like SML's records) rather than generative. This can be seen as a feature.
        • Extensions
          • Many additional syntax options (improvements).
          • Can create recursive/cyclic values without needing to use ref cells.
          • User-defined extensible sum types (like SML's single built-in “exn” type).
          • Futures (concurrent, lazy, promised).
          • Stronger modules: first-class, higher-order, local.
          • (De)serialization/(un)pickling/(un)marshalling of module values with dynamic type checking to allow storing them in files and sending them over the network.
      • Compilation, building multi-file programs, and execution model
        • Notion of “component”, with “imports” and “exports”.
        • SML files become “components” by prefixing them with import announcements, or these import announcements can go in a separate file (for compatibility with other SML compilers).
        • The component manager runs compiled code on the Alice virtual machine, which includes a JIT compiler that generates either x86 native code or bytecode.
        • Components are loaded lazily, on demand, and can even be computed at run-time as first-class values of the language.
  • Specialized implementations
  • Research implementations
    • HaMLet and HaMLet-S
    • TIL and TILT
    • Fox ML (a modified SML/NJ 0.93)
    • DML (Dependent ML): SML plus fancier types for verification (obsoleted by ATS, which is no longer SML). One description reads:

      “Conservative ML extension, has type system to enrich ML with restricted form of dependent types, to allow many interesting program properties: memory safety, termination can be captured in type system and thus be verified while compiling.”

    • Extensible ML

      Extensible ML (EML) is an ML-like programming language that adds support for object-oriented idioms in a functional setting. EML extends ML-style datatypes and functions with a class construct designed to be extended into hierarchies, thus allowing the programmer to seamlessly integrate the object-oriented programming paradigm with the traditional functional style.

      Extensible ML is related neither to the language Extended ML (other than being similarly derived from ML), nor to the markup language XML (eXtensible Markup Language), nor to extensible programming.

      Millstein, Bleckner, Chambers. Modular typechecking for hierarchically extensible datatypes and functions. Proceedings of the seventh ACM SIGPLAN international conference on Functional programming, ACM Press, 2002.

    • SML/E

      This is Dominic Duggan's extension of SML/NJ with type error explanation.

  • Limited implementations (not full SML, or pre-1997 SML, or no longer available, or obsolete, or not fully working, or dependent on proprietary software)

3 SML libraries

4 IDE (integrated development environment) support for SML

5 General guidance on writing SML

6 Books and reports on SML

7 SML communication channels

8 Programming patterns in SML that involve advanced use of types

Author: The Skalpel Team

Date: 2012-08-01 16:04:22 BST
