HereDoc : "Here Document" facility for OCaml

release 2001-07-13

1. Introduction

This file is doc.sections.tpl, last modified on 2001-07-14.

Without any doubt, OCaml is a great language for Web publishing. I have used it for a long time to produce my static web pages and, more recently, to write some CGI scripts.

HereDoc is an attempt to provide syntactic sugar for such applications.

In OCaml, string constants can span multiple lines. Of course, this is most useful for the applications we have in mind. But OCaml lacks some important features.

HereDoc relies on Camlp4, the powerful preprocessor for OCaml, to extend the concrete syntax of the language and provide convenient notations for all these operations. Most of the work is done at compile time, so the code is fast.

2. Installation

HereDoc has been developed on a Linux box.

To install HereDoc, just untar the distribution tarball and run:
make HereDoc
This will produce the files:
pa_HereDoc.cma, text.cmo, text.cmx
To compile a file with the HereDoc syntax, you must pass to the compiler the option:
-pp 'camlp4o pa_HereDoc.cmo'
(put the correct path to pa_HereDoc.cmo).

To link, you'll probably need text.cmo (for the bytecode compiler) or text.cmx (for the native compiler).

To gain a better understanding of HereDoc, you may want to see the plain OCaml code it produces. First, you should build a version of HereDoc that produces source code instead of an abstract syntax tree representation:

mkcamlp4 -o HereDoc_dump str.cma pa_o.cmo pr_o.cmo hereDoc_lexer.cmo pa_HereDoc.cmo
or simply:
make HereDoc_dump
Then, for instance:
./HereDoc_dump yourfile.ml
Camlp4 can't print everything, so the result of HereDoc_dump may not be usable as input for OCaml.

Have a look at the Makefile and the files doc.ml, doc.layout.tpl, doc.sections.tpl (the source code for this page). They illustrate the use of HereDoc.

3. The module Text: Convenient text manipulation

Before we describe the new syntax, we have to introduce a tiny module: Text (what an original name). With this module, every Caml value "is" a string. To understand what this means, let's define the "string content" S(x) of a value x. It is a string, defined by:

Similarly, the "string components" SC(x) are defined by:

The module defines an opaque type Text.t with a "universal constructor" Text.repr: 'a -> t. The function Text.iter: (string -> unit) -> t -> unit applies a given function to the string components of a Text.t value. There are also two functions to compute S(x) and its length.

So, for instance, we have (oh yes, this is not well-typed; you have to add Text.repr everywhere):

Typically, an application builds its output by concatenating strings. This involves many copy operations so it may be quite slow. With the module Text, just put your substrings in a list (or better, an array: only one big block), and see it as a string with Text.repr. When you want to finally output the "string", use Text.iter or Text.to_string on the result.
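The idea can be illustrated in plain OCaml, without HereDoc. What follows is only a minimal sketch: the type text and the functions iter, length and to_string are hypothetical stand-ins, not the actual Text module (which accepts any Caml value through Text.repr). It shows a nested structure of strings being traversed once at output time instead of being concatenated piece by piece.

```ocaml
(* Hypothetical stand-in for Text.t: a tree whose leaves are strings. *)
type text =
  | Str of string
  | Seq of text list
  | Arr of text array

(* Apply f to each string component, from left to right. *)
let rec iter f = function
  | Str s -> f s
  | Seq l -> List.iter (iter f) l
  | Arr a -> Array.iter (iter f) a

(* Total length of the string content, computed without building it. *)
let length t =
  let n = ref 0 in
  iter (fun s -> n := !n + String.length s) t;
  !n

(* Build the string content once, in a buffer sized in advance. *)
let to_string t =
  let b = Buffer.create (length t) in
  iter (Buffer.add_string b) t;
  Buffer.contents b

let page =
  Seq [ Str "<ul>"; Arr [| Str "<li>a</li>"; Str "<li>b</li>" |]; Str "</ul>" ]

(* Output the components directly; no intermediate string is built. *)
let () = iter print_string page
```

When the output goes straight to a channel, iter avoids building the final string at all; to_string is only needed when the whole content is required as one value.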

Another feature of the module Text is the manipulation of "postponed texts". Suppose you want to create a text with a hole to be filled later, when an extra piece of information is available. A simple way is to use a reference:

   let post = ref Text.empty in
   Text.repr [Text.repr "Total size :"; Text.repr post];
   (* ... *)
   post := Text.repr (string_of_int (total_size ()));
Sometimes it is better to put the expression to be computed later at the position of the postponed text, and to activate it later; also, a global reference is not easy to manage. You can do this:
   Text.repr [Text.repr "Total size :";
              Text.postponed "totalsize"
                (fun () -> Text.repr (string_of_int (total_size ())))];
   (* ... *)
   Text.activate "totalsize"
Here, "totalsize" identifies the delayed computation and Text.activate triggers the evaluation.
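To make the mechanism concrete, here is a minimal sketch of named postponed texts in plain OCaml. It is an assumption about how such a facility could work, not the actual implementation of Text.postponed and Text.activate: the type text, the table pending and the counter total are all hypothetical names introduced for the illustration.

```ocaml
(* Hypothetical stand-in for a text with named holes; not the real Text API. *)
type text =
  | Str of string
  | Seq of text list
  | Hole of text ref   (* filled when the computation is activated *)

(* Pending computations, indexed by name. *)
let pending : (string, unit -> unit) Hashtbl.t = Hashtbl.create 7

(* Create a hole and register the computation that will fill it. *)
let postponed name f =
  let cell = ref (Str "") in
  Hashtbl.replace pending name (fun () -> cell := f ());
  Hole cell

(* Run the computation registered under name, filling its hole. *)
let activate name =
  match Hashtbl.find_opt pending name with
  | Some run -> run (); Hashtbl.remove pending name
  | None -> ()

let rec to_buffer b = function
  | Str s -> Buffer.add_string b s
  | Seq l -> List.iter (to_buffer b) l
  | Hole cell -> to_buffer b !cell

let to_string t =
  let b = Buffer.create 64 in
  to_buffer b t;
  Buffer.contents b

(* The total is not known yet when the text is built. *)
let total = ref 0

let doc =
  Seq [ Str "Total size: ";
        postponed "totalsize" (fun () -> Str (string_of_int !total)) ]

let () =
  total := 42;            (* the information becomes available *)
  activate "totalsize";   (* the hole is filled now *)
  print_endline (to_string doc)
```

In this sketch a hole that is never activated simply contributes the empty string.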

4. Quotation

HereDoc defines the quotation <:here<...>>. It returns a value of type Text.t (see the Text module). HereDoc declares this quotation as the default quotation, so you can simply write <<...>>.

The quotation content is interpreted with the following rules:

5. Verbatim includes and templates

It is sometimes convenient to put some text in an external file. HereDoc extends the syntax to provide four kinds of includes: verbatim, templates, declaration includes and expression includes. To print a template, use:
Text.iter print_string TPL "file.txt"
You can put several chunks of text in a file. Use the notation:
"file.txt"."chunk"
to designate the section "chunk" in the file "file.txt", that is, all the lines between the line "==chunk==" (two equals signs, the name of the chunk, two equals signs, alone on their line) and the line "====" (four equals signs).

Templates and expression includes may take arguments: if the opening line is ==chunk arg1 arg2 .. argn== instead of ==chunk==, the chunk returns a function with labeled arguments arg1,...,argn. For instance, if the template file is:
 ==add x y==
 x + y
 ====

 ==link dest txt==
 <a href="$dest">$txt</a>
 ====

you can declare:
let add = EXPR "file.txt"."add"
let link = TPL "file.txt"."link"
and get:
val add : x:int -> y:int -> int
val link : dest:'a -> txt:'b -> Text.t
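The chunk layout described above can be sketched as a small reader in plain OCaml. This is only an illustration under stated assumptions: HereDoc resolves chunks at compile time through Camlp4, whereas the hypothetical functions parse_header and chunks below work at run time on a list of lines, and they ignore leading and trailing blanks around the delimiter lines.

```ocaml
(* Hypothetical run-time reader for the chunk layout; HereDoc itself
   resolves chunks at compile time. *)

(* If line is "==name arg1 .. argn==", return Some (name, args). *)
let parse_header line =
  let line = String.trim line in
  let n = String.length line in
  if n > 4 && String.sub line 0 2 = "==" && String.sub line (n - 2) 2 = "=="
  then
    let words =
      String.split_on_char ' ' (String.sub line 2 (n - 4))
      |> List.filter (fun w -> w <> "")
    in
    match words with
    | name :: args -> Some (name, args)
    | [] -> None
  else None

(* Collect (name, args, body lines) for every chunk in a list of lines. *)
let chunks lines =
  let rec outside acc = function
    | [] -> List.rev acc
    | l :: rest ->
      (match parse_header l with
       | Some (name, args) -> inside acc name args [] rest
       | None -> outside acc rest)
  and inside acc name args body = function
    | [] -> List.rev acc   (* unterminated chunk: dropped in this sketch *)
    | l :: rest when String.trim l = "====" ->
      outside ((name, args, List.rev body) :: acc) rest
    | l :: rest -> inside acc name args (l :: body) rest
  in
  outside [] lines

(* The template file from the example, as a list of lines. *)
let sample =
  [ "==add x y=="; "x + y"; "===="; "";
    "==link dest txt=="; "<a href=\"$dest\">$txt</a>"; "====" ]

let () =
  List.iter
    (fun (name, args, body) ->
       Printf.printf "%s(%s): %s\n" name (String.concat ", " args)
         (String.concat "\n" body))
    (chunks sample)
```

The argument names recovered from the header are the ones that become the labels of the generated function, as in the signatures of add and link.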

If many of the chunks you need are in the same file, you can use the toplevel phrase:

TEMPLATE_FILE "file.txt"
and then omit the file name, for instance:
TPL ."chunk"
When parsing an external template, the default file is the current file. By default, the template filename is "templates.tpl".

The commands "VERBATIM", "EXPR" and "TPL" are expressions; you can use them inside an "expression in quotation". So you can, for instance, include a template from a template:

 ==chunk==
 Want to see another template ?
 $$TPL ."another"$$
 ====



Author's mail: Alain.Frisch@ens.fr