λ

Journal manual

Table of Contents

[in package JOURNAL with nicknames JRN]

λ

1 The journal ASDF System

λ

2 Links

Here is the official repository and the HTML documentation for the latest version.

λ

3 Portability

Tested on ABCL, CCL, CLISP, CMUCL, ECL, and SBCL. AllegroCL Express edition runs out of heap while running the tests. On Lisps that seem to lack support for disabling and enabling of interrupts, such as ABCL and CLISP, durability is compromised, and any attempt to sync-journal (see Synchronization strategies and Safety) will be a runtime error.

λ

4 Background

Logging, tracing, testing, and persistence are about what happened during code execution. Recording machine-readable logs and traces can be repurposed for white-box testing. More, when the code is rerun, selected frames may return their recorded values without executing the code, which could serve as a mock object framework for writing tests. This ability to isolate external interactions and to reexecute traces is sufficient to reconstruct the state of a program, achieving simple persistence not unlike a journaling filesystem or event sourcing.

Journal is the library to log, trace, test and persist. It has a single macro at its heart: journaled, which does pretty much what was described. It can be thought of as generating two events around its body: one that records the name and an argument list (as in a function call), and another that records the return values. In Lisp-like pseudocode:

(defmacro journaled (name args &body body)
  `(progn
     (record-event `(:in ,name :args ,args))
     (let ((,return-values (multiple-value-list (progn ,@body))))
       (record-event `(:out ,name :values ,return-values))
       (values-list ,return-values))))

This is basically how recording works. When replaying events from a previous run, the return values of body can be checked against the recorded ones, or we may return the recorded values without even running body.

In summary, we can produce selective execution traces by wrapping code in journaled and use those traces for various purposes. The Journal library is this idea taken to its logical conclusion.

λ

5 Distinguishing features

As a logging facility
#68200.234: ("some-context")
#68200.234:   Informative log message
#68200.250: => NIL

See Logging for a complete example.

Compared to cl:trace
(FOO 2.1)
  (1+ 2.1)
  => 3.1
=E "SIMPLE-ERROR" "The assertion (INTEGERP 3.1) failed."

See Tracing for a complete example.

As a test framework
(define-file-bundle-test (test-user-registration :directory "registration")
  (let ((username (replayed ("ask-username")
                    (format t "Please type your username: ")
                    (read-line))))
    (add-user username)
    (assert (user-exists-p username))))

See Testing for a complete example.

As a solution for persistence
(defun my-resumable-autosaving-game-with-history ()
  (with-bundle (bundle)
    (play-guess-my-number)))

See Persistence for a complete example.

λ

6 Basics

The journaled macro does both recording and replaying of events, possibly at the same time. Recording is easy: events generated by journaled are simply written to a journal, which is a sequence of events much like a file. What events are generated is described in journaled. Replay is much more involved, thus it gets its own section. The journals used for recording and replaying are specified by with-journaling or by with-bundle.

The Journals reference is presented later, but for most purposes, creating them (e.g. with make-in-memory-journal, make-file-journal) and maybe querying their contents with list-events will suffice. Some common cases of journal creation are handled by the convenience function to-journal.

Built on top of journals, Bundles juggle repeated replay-and-record cycles focussing on persistence.

λ

6.1 In-events

Upon entering a block, journaled generates an in-event, which conceptually opens a new frame. These in-events are created from the name, version and args arguments of journaled. For example,

(journaled (name :version version :args args) ...)

creates an event like this:

`(:in ,name :version ,version :args ,args)

where :version and :args may be omitted if they are nil. Versions are used for Replay.

λ

6.2 Out-events

Upon leaving a block, journaled generates an out-event, closing the frame opened by the corresponding in-event. These out-events are property lists like this:

(:out foo :version 1 :values (42))

Their name and version (foo and 1 in the example) are the same as in the in-event: they come from the corresponding arguments of journaled. exit and outcome are filled in differently depending on how the block finished its execution.

There is a further grouping of outcomes into expected and unexpected.

λ

6.3 Working with unreadable values

The events recorded often need to be readable. This is always required with file-journals, often with in-memory-journals, but never with pprint-journals. By choosing an appropriate identifier or string representation of the unreadable object to journal, this is not a problem in practice. journaled provides the values hook for this purpose.

With external-events, whose outcome is replayed (see Replaying the outcome), we also need to be able to reverse the transformation of values, and this is what the replay-values argument of journaled is for.

Let's see a complete example.

(defclass user ()
  ((id :initarg :id :reader user-id)))

(defmethod print-object ((user user) stream)
  (print-unreadable-object (user stream :type t)
    (format stream "~S" (slot-value user 'id))))

(defvar *users* (make-hash-table))

(defun find-user (id)
  (gethash id *users*))

(defun add-user (id)
  (setf (gethash id *users*) (make-instance 'user :id id)))

(defvar *user7* (add-user 7))

(defun get-message ()
  (replayed (listen :values (values-> #'user-id)
                    :replay-values (values<- #'find-user))
    (values *user7* "hello")))

(jtrace user-id find-user get-message)

(let ((bundle (make-file-bundle "/tmp/user-example/")))
  (format t "Recording")
  (with-bundle (bundle)
    (get-message))
  (format t "~%Replaying")
  (with-bundle (bundle)
    (get-message)))
.. Recording
.. (GET-MESSAGE)
..   (USER-ID #<USER 7>)
..   => 7
.. => #<USER 7>, "hello"
.. Replaying
.. (GET-MESSAGE)
..   (FIND-USER 7)
..   => #<USER 7>, T
.. => #<USER 7>, "hello"
==> #<USER 7>
=> "hello"

To be able to journal the return values of get-message, the user object must be transformed to something readable. On the Recording run, (values-> #'user-id) replaces the user object with its id in the event-outcome recorded, but the original user object is returned.

When Replaying, the journaled out-event is replayed (see Replaying the outcome):

(:OUT GET-MESSAGE :VERSION :INFINITY :VALUES (7 "hello"))

The user object is looked up according to :replay-values and is returned along with "hello".

λ

6.4 Utilities

λ

6.5 Pretty-printing

Instead of collecting events and then printing them, events can be pretty-printed to a stream as they generated. This is accomplished with Pretty-printing journals, discussed in detail later, in the following way:

(let ((journal (make-pprint-journal)))
  (with-journaling (:record journal)
    (journaled (foo) "Hello")))
..
.. (FOO)
.. => "Hello"

Note that Pretty-printing journals are not tied to with-journaling and are most often used for Logging and Tracing.

λ

6.6 Error handling

λ

7 Logging

Before we get into the details, here is a self-contained example that demonstrates typical use.

(defvar *communication-log* nil)
(defvar *logic-log* nil)
(defvar *logic-log-level* 0)

(defun call-with-connection (port fn)
  (framed (call-with-connection :log-record *communication-log*
                                :args `(,port))
    (funcall fn)))

(defun fetch-data (key)
  (let ((value 42))
    (logged ((and (<= 1 *logic-log-level*) *logic-log*))
      "The value of ~S is ~S." key value)
    value))

(defun init-logging (&key (logic-log-level 1))
  (let* ((stream (open "/tmp/xxx.log"
                       :direction :output
                       :if-does-not-exist :create
                       :if-exists :append))
         (journal (make-pprint-journal
                   :stream (make-broadcast-stream
                            (make-synonym-stream '*standard-output*)
                            stream))))
    (setq *communication-log* journal)
    (setq *logic-log* journal)
    (setq *logic-log-level* logic-log-level)))

(init-logging)

(call-with-connection 8080 (lambda () (fetch-data :foo)))
..
.. (CALL-WITH-CONNECTION 8080)
..   The value of :FOO is 42.
.. => 42
=> 42

(setq *logic-log-level* 0)
(call-with-connection 8080 (lambda () (fetch-data :foo)))
..
.. (CALL-WITH-CONNECTION 8080)
.. => 42
=> 42

(ignore-errors
  (call-with-connection 8080 (lambda () (error "Something unexpected."))))
..
.. (CALL-WITH-CONNECTION 8080)
.. =E "SIMPLE-ERROR" "Something unexpected."
Default to muffling

Imagine a utility library called glib.

(defvar *glib-log* nil)
(defvar *patience* 1)

(defun sl33p (seconds)
  (logged (*glib-log*) "Sleeping for ~As." seconds)
  (sleep (* *patience* seconds)))

Glib follows the recommendation to have a special variable globally bound to nil by default. The value of *glib-log* is the journal to which glib log messages will be routed. Since it's nil, the log messages are muffled, and to record any log message, we need to change its value.

Routing logs to a journal

Let's send the logs to a pprint-journal:

(setq *glib-log* (make-pprint-journal
                  :log-decorator (make-log-decorator :time t)))
(sl33p 0.01)
..
.. 2020-08-31T12:45:23.827172+02:00: Sleeping for 0.01s.

That's a bit too wordy. For this tutorial, let's stick to less verbose output:

(setq *glib-log* (make-pprint-journal))
(sl33p 0.01)
..
.. Sleeping for 0.01s.

To log to a file:

(setq *glib-log* (make-pprint-journal
                  :stream (open "/tmp/glib.log"
                                :direction :output
                                :if-does-not-exist :create
                                :if-exists :append)))
Capturing logs in with-journaling's record-journal

If we were recording a journal for replay and wanted to include glib logs in the journal, we would do something like this:

(with-journaling (:record t)
  (let ((*glib-log* :record))
    (sl33p 0.01)
    (journaled (non-glib-stuff :version 1)))
  (list-events))
=> ((:LEAF "Sleeping for 0.01s.")
    (:IN NON-GLIB-STUFF :VERSION 1)
    (:OUT NON-GLIB-STUFF :VERSION 1 :VALUES (NIL)))

We could even (setq *glib-log* :record) to make it so that glib messages are included by default in the record-journal. In this example, the special *glib-log* acts like a log category for all the log messages of the glib library (currently one).

Rerouting a category

Next, we route *glib-log* to wherever *app-log* is pointing by binding *glib-log* to the symbol *app-log* (see :log-record).

(defvar *app-log* nil)

(let ((*glib-log* '*app-log*))
  (setq *app-log* nil)
  (logged (*glib-log*) "This is not written anywhere.")
  (setq *app-log* (make-pprint-journal :pretty nil))
  (sl33p 0.01))
..
.. (:LEAF "Sleeping for 0.01s.")

Note how pretty-printing was turned off, and we see the leaf-event generated by logged in its raw plist form.

Conditional routing

Finally, to make routing decisions conditional we need to change sl33p:

(defvar *glib-log-level* 1)

(defun sl33p (seconds)
  (logged ((and (<= 2 *glib-log-level*) *glib-log*))
          "Sleeping for ~As." (* *patience* seconds))
  (sleep seconds))

;;; Check that it all works:
(let ((*glib-log-level* 1)
      (*glib-log* (make-pprint-journal)))
  (format t "~%With log-level ~A" *glib-log-level*)
  (sl33p 0.01)
  (setq *glib-log-level* 2)
  (format t "~%With log-level ~A" *glib-log-level*)
  (sl33p 0.01))
..
.. With log-level 1
.. With log-level 2
.. Sleeping for 0.01s.
Nested log contexts

logged is for single messages. journaled, or in this example framed, can provide nested context:

(defun callv (var value symbol &rest args)
  "Call SYMBOL-FUNCTION of SYMBOL with VAR dynamically bound to VALUE."
  (framed ("glib:callv" :log-record *glib-log*
                        :args `(,var ,value ,symbol ,@args))
    (progv (list var) (list value)
      (apply (symbol-function symbol) args))))

(callv '*print-base* 2 'print 10)
..
.. ("glib:callv" *PRINT-BASE* 2 PRINT 10)
.. 1010 
.. => 10
=> 10

(let ((*glib-log-level* 2))
  (callv '*patience* 7 'sl33p 0.01))
..
.. ("glib:callv" *PATIENCE* 7 SL33P 0.01)
..   Sleeping for 0.07s.
.. => NIL

λ

7.1 Customizing logs

Customizing the output format is possible if we don't necessarily expect to be able to read the logs back programmatically. There is an example in Tracing, which is built on Pretty-printing journals.

Here, we discuss how to make logs more informative.

λ

7.2 :log-record

with-journaling and with-bundle control replaying and recording within their dynamic extent, which is rather a necessity because Replay needs to read the events in the same order as the journaled blocks are being executed. However, log-events do not affect replay, so we can allow more flexibility in routing them.

The log-record argument of journaled and logged controls where log-events are written both within with-journaling and without. The algorithm to determine the target journal is this:

  1. If log-record is :record, then the record-journal is returned.

  2. If log-record is nil, then it is returned.

  3. If log-record is a journal, then it is returned.

  4. If log-record is a symbol (other than nil), then the symbol-value of that symbol is assigned to log-record, and we go to step 1.

If the return value is nil, then the event will not be written anywhere, else it is written to the journal returned.

This is reminiscent of synonym-streams, also in that it is possible end up in cycles in the resolution. For this reason, the algorithm stop with a journal-error after 100 iterations.

Interactions

Events may be written to log-record even without an enclosing with-journaling, and it does not affect the journal-state. However, it is a journal-error to write to a :completed journal (see journal-state).

When multiple threads log to the same journal, it is guaranteed that individual events are written atomically, but frames from different threads do not necessarily nest. To keep the log informative, the name of thread may be added to the events as decoration.

Also, see notes on thread Safety.

λ

7.3 Logging with leaf-events

λ

8 Tracing

jtrace behaves similarly to cl:trace but deals with non-local exits gracefully.

Basic tracing
(defun foo (x)
  (sleep 0.12)
  (1+ x))

(defun bar (x)
  (foo (+ x 2))
  (error "xxx"))

(jtrace foo bar)

(ignore-errors (bar 1))
..
.. 0: (BAR 1)
..   1: (FOO 3)
..   1: FOO => 4
.. 0: BAR =E "SIMPLE-ERROR" "xxx"
Log-like output

It can also include the name of the originating thread and timestamps in the output:

(let ((*trace-thread* t)
      (*trace-time* t)
      (*trace-depth* nil)
      (*trace-out-name* nil))
  (ignore-errors (bar 1)))
..
.. 2020-09-02T19:58:19.415204+02:00 worker: (BAR 1)
.. 2020-09-02T19:58:19.415547+02:00 worker:   (FOO 3)
.. 2020-09-02T19:58:19.535766+02:00 worker:   => 4
.. 2020-09-02T19:58:19.535908+02:00 worker: =E "SIMPLE-ERROR" "xxx"
Profiler-like output
(let ((*trace-real-time* t)
      (*trace-run-time* t)
      (*trace-depth* nil)
      (*trace-out-name* nil))
  (ignore-errors (bar 1)))
..
.. #16735.736 !68.368: (BAR 1)
.. #16735.736 !68.369:   (FOO 3)
.. #16735.857 !68.369:   => 4
.. #16735.857 !68.369: =E "SIMPLE-ERROR" "xxx"
Customizing the content and the format

If these options are insufficient, the content and the format of the trace can be customized:

(let ((*trace-journal*
       (make-pprint-journal :pretty '*trace-pretty*
                     :prettifier (lambda (event depth stream)
                                   (format stream "~%Depth: ~A, event: ~S"
                                           depth event))
                     :stream (make-synonym-stream '*error-output*)
                     :log-decorator (lambda (event)
                                      (append event '(:custom 7))))))
  (ignore-errors (bar 1)))
..
.. Depth: 0, event: (:IN BAR :ARGS (1) :CUSTOM 7)
.. Depth: 1, event: (:IN FOO :ARGS (3) :CUSTOM 7)
.. Depth: 1, event: (:OUT FOO :VALUES (4) :CUSTOM 7)
.. Depth: 0, event: (:OUT BAR :ERROR ("SIMPLE-ERROR" "xxx") :CUSTOM 7)

In the above, *trace-journal* was bound locally to keep the example from wrecking the global default, but the same effect could be achieved by setfing pprint-journal-prettifier, pprint-journal-stream and journal-log-decorator.

λ

8.1 Slime integration

Slime, by default, binds C-c C-t to toggling cl:trace. To integrate jtrace into Slime, load src/mgl-jrn.el into Emacs.

Then, assuming the Elisp file is in the quicklisp directory, add this to your .emacs:

(load "~/quicklisp/mgl-jrn.el")

Since jtrace lacks some features of cl:trace, most notably that of tracing non-global functions, it is assigned a separate binding, C-c C-j.

λ

9 Replay

During replay, code is executed normally with special rules for blocks. There are two modes for dealing with blocks: replaying the code and replaying the outcome. When code is replayed, upon entering and leaving a block, the events generated are matched to events read from the journal being replayed. If the events don't match, replay-failure is signalled, which marks the record journal as having failed the replay. This is intended to make sure that the state of the program during the replay matches the state at the time of recording. In the other mode, when the outcome is replayed, a block may not be executed at all, but its recorded outcome is reproduced (i.e. the recorded return values are simply returned).

Replay can be only be initiated with with-journaling (or its close kin with-bundle). After the per-event processing described below, when with-journaling finishes, it might signal replay-incomplete if there are unprocessed non-log events left in the replay journal.

Replay is deemed successful or failed depending on whether all events are replayed from the replay journal without a replay-failure. A journal that records events from a successful replay can be used in place of the journal that was replayed, and so on. The logic of replacing journals with their successful replays is automated by Bundles. with-journaling does not allow replay from journals that were failed replays themselves. The mechanism, in terms of which tracking success and failure of replays is implemented, revolves around journal-state and event-versions, which we discuss next.

λ

9.1 Journaled for replay

The following arguments of journaled control behaviour under replay.

Built on top of journaled, the macros below record a pair of In-events and Out-events but differ in how they are replayed and the requirements on their blocks. The following table names the type of event produced (Event), how In-events are replayed (In-e.), whether the block is always run (Run), how Out-events are replayed (Out-e.), whether the block must be deterministic (Det) or side-effect free (sef).

|          | Event     | In-e.  | Run | Out-e. | Det | SEF |
|----------+-----------+--------+-----+--------+-----+-----|
| FRAMED   | log       | skip   | y   | skip   | n   | n   |
| CHECKED  | versioned | match  | y   | match  | y   | n   |
| REPLAYED | external  | match  | n   | replay | n   | y   |
| INVOKED  | versioned | replay | y   | match  | y   | n   |

Note that the replay-replay combination is not implemented because there is nowhere to return values from replay-triggered functions.

λ

9.2 Bundles

Consider replaying the same code repeatedly, hoping to make progress in the processing. Maybe based on the availability of external input, the code may error out. After each run, one has to decide whether to keep the journal just recorded or stick with the replay journal. A typical solution to this would look like this:

(let ((record nil))
  (loop
    (setq record (make-in-memory-journal))
    (with-journaling (:record record :replay replay)
      ...)
    (when (and
           ;; RECORD is a valid replay of REPLAY ...
           (eq (journal-state record) :completed)
           ;; ... and is also significantly different from it ...
           (journal-diverged-p record))
      ;; so use it for future replays.
      (setq replay record))))

This is pretty much what bundles automate. The above becomes:

(let ((bundle (make-in-memory-bundle)))
  (loop
    (with-bundle (bundle)
      ...)))

With file-journals, the motivating example above would be even more complicated, but file-bundles work the same way as in-memory-bundles.

λ

9.3 The replay strategy

The replay process for both In-events and Out-events starts by determining how the generated event (the new event from now on) shall be replayed. Roughly, the decision is based on the name and version of the new event and the replay event (the next event to be read from the replay). There are four possible strategies:

The strategy is determined by the following algorithm, invoked whenever an event is generated by a journaled block:

  1. Log events are not matched to the replay. If the new event is a log event or a replay-failure has been signalled before (i.e. the record journal's journal-state is :mismatched), then insert is returned.

  2. Else, log events to be read in the replay journal are skipped, and the next unread, non-log event is peeked at (without advancing the replay journal).

    • end of replay: If there are no replay events left, then:

      • If replay-eoj-error-p is nil in with-journaling (the default), insert is returned.

      • If replay-eoj-error-p is true, then end-of-journal is signalled.

    • mismatched name: Else, if the next unread replay event's name is not equal to the name of the new event, then:

      • For versioned-events, replay-name-mismatch is signalled if insertable is nil, else insert is returned.

      • For external-events, replay-name-mismatch is signalled.

    • matching name: Else, if the name of the next unread event in the replay journal is equal to the name of new event, then it is chosen as the replay event.

      • If the replay event's version is higher than the new event's version, then replay-version-downgrade is signalled.

      • If the two versions are equal, then match is returned.

      • If the new event's version is higher, then upgrade is returned.

      Where :infinity is considered higher than any integer and equal to itself.

In summary:

 | new event | end-of-replay     | mismatched name   | matching name |
 |-----------+-------------------+-------------------+---------------|
 | Log       | insert            | insert            | insert        |
 | Versioned | insert/eoj-error  | insert/name-error | match-version |
 | External  | insert/eoj-error  | insert/name-error | match-version |

Version matching (match-version above) is based on which event has a higher version:

 | replay event    | =     | new event |
 |-----------------+-------+-----------|
 | downgrade-error | match | upgrade   |

λ

9.4 Matching in-events

If the replay strategy is match, then, for in-events, the matching process continues like this:

λ

9.4.1 Replaying the outcome

So, if an in-event is triggered that matches the replay, event-version(0 1) is :infinity, then normal execution is altered in the following manner:

To be able to reproduce the outcome in the replay journal, some assistance may be required from replay-values and replay-condition:

with-replay-filter's no-replay-outcome can selectively turn off replaying the outcome. See Testing on multiple levels, for an example.

λ

9.5 Matching out-events

If there were no Replay failures during the matching of the in-event, and the conditions for Replaying the outcome were not met, then the block is executed. When the outcome of the block is determined, an out-event is triggered and is matched to the replay journal. The matching of out-events starts out as in The replay strategy with checks for event-name and event-version.

If the replay strategy is insert or upgrade, then the out-event is written to record-journal, consuming an event with a matching name from the replay-journal in the latter case. If the strategy is match, then:

Note that The replay strategy for the in-event and the out-event of the same frame may differ if the corresponding out-event is not present in replay-journal, which may be the case when the recording process failed hard without unwinding properly, or when an unexpected outcome triggered the transition to journal-state :logging.

λ

9.6 Replay failures

λ

9.7 Upgrades and replay

The replay mechanism is built on the assumption that the tree of frames is the same when the code is replayed as it was when the replay journal was originally recorded. Thus, non-deterministic control flow poses a challenge, but non-determinism can be isolated with external-events. However, when the code changes, we might find the structure of frames in previous recordings hard to accommodate. In this case, we might decide to alter the structure, giving up some of the safety provided by the replay mechanism. There are various tools at our disposal to control this tradeoff between safety and flexibility:

With that, let's see how with-replay-filter works.

λ

10 Testing

Having discussed the Replay mechanism, next are Testing and Persistence, which rely heavily on replay. Suppose we want to unit test user registration. Unfortunately, the code communicates with a database service and also takes input from the user. A natural solution is to create mock objects for these external systems to unshackle the test from the cumbersome database dependency and to allow it to run without user interaction.

We do this below by wrapping external interaction in journaled with :version :infinity (see Replaying the outcome).

(defparameter *db* (make-hash-table))

(defun set-key (key value)
  (replayed ("set-key" :args `(,key ,value))
    (format t "Updating db~%")
    (setf (gethash key *db*) value)
    nil))

(defun get-key (key)
  (replayed ("get-key" :args `(,key))
    (format t "Query db~%")
    (gethash key *db*)))

(defun ask-username ()
  (replayed ("ask-username")
    (format t "Please type your username: ")
    (read-line)))

(defun maybe-win-the-grand-prize ()
  (checked ("maybe-win-the-grand-prize")
    (when (= 1000000 (hash-table-count *db*))
      (format t "You are the lucky one!"))))

(defun register-user (username)
  (unless (get-key username)
    (set-key username `(:user-object :username ,username))
    (maybe-win-the-grand-prize)))

Now, we write a test that records these interactions in a file when it's run for the first time.

(define-file-bundle-test (test-user-registration
                          :directory (asdf:system-relative-pathname
                                      :journal "test/registration/"))
  (let ((username (ask-username)))
    (register-user username)
    (assert (get-key username))
    (register-user username)
    (assert (get-key username))))

;; Original recording: everything is executed
JRN> (test-user-registration)
Please type your username: joe
Query db
Updating db
Query db
Query db
Query db
=> NIL

On reruns, none of the external stuff is executed. The return values of the external journaled blocks are replayed from the journal:

;; Replay: all external interactions are mocked.
JRN> (test-user-registration)
=> NIL

Should the code change, we might want to upgrade carefully (see Upgrades and replay) or just rerecord from scratch:

JRN> (test-user-registration :rerecord t)
Please type your username: joe
Query db
Updating db
Query db
Query db
Query db
=> NIL

Thus satisfied that our test runs, we can commit the journal file in the bundle into version control. Its contents are:


(:IN "ask-username" :VERSION :INFINITY)
(:OUT "ask-username" :VERSION :INFINITY :VALUES ("joe" NIL))
(:IN "get-key" :VERSION :INFINITY :ARGS ("joe"))
(:OUT "get-key" :VERSION :INFINITY :VALUES (NIL NIL))
(:IN "set-key" :VERSION :INFINITY :ARGS ("joe" (:USER-OBJECT :USERNAME "joe")))
(:OUT "set-key" :VERSION :INFINITY :VALUES (NIL))
(:IN "maybe-win-the-grand-prize" :VERSION 1)
(:OUT "maybe-win-the-grand-prize" :VERSION 1 :VALUES (NIL))
(:IN "get-key" :VERSION :INFINITY :ARGS ("joe"))
(:OUT "get-key" :VERSION :INFINITY :VALUES ((:USER-OBJECT :USERNAME "joe") T))
(:IN "get-key" :VERSION :INFINITY :ARGS ("joe"))
(:OUT "get-key" :VERSION :INFINITY :VALUES ((:USER-OBJECT :USERNAME "joe") T))
(:IN "get-key" :VERSION :INFINITY :ARGS ("joe"))
(:OUT "get-key" :VERSION :INFINITY :VALUES ((:USER-OBJECT :USERNAME "joe") T))

Note that when this journal is replayed, new versioned-events are required to match the replay. So, after the original recording, we can check by eyeballing that the record represents a correct execution. Then on subsequent replays, even though maybe-win-the-grand-prize sits behind register-user and is hard to test with asserts, the replay mechanism verifies that it is called only for new users.

This record-and-replay style of testing is not the only possibility: direct inspection of a journal with the low-level events api (see Events reference) can facilitate checking non-local invariants.

λ

10.1 Testing on multiple levels

Nesting replayeds (that is, frames of external-events) is not obviously useful since the outer replayed will be replayed by outcome, and the inner one will be just echoed to the record journal. However, if we turn off Replaying the outcome for the outer, the inner will be replayed.

This is useful for testing layered communication. For example, we might have written code that takes input from an external system (read-line) and does some complicated processing (read-from-string) before returning the input in a form suitable for further processing. Suppose we wrap replayed around read-from-string for Persistence because putting it around read-line would expose low-level protocol details in the journal, making protocol changes difficult.

However, upon realizing that read-from-string was not the best tool for the job and switching to parse-integer, we want to test by replaying all previously recorded journals. For this, we prevent the outer replayed from being replayed by outcome with with-replay-filter:

(let ((bundle (make-in-memory-bundle)))
  ;; Original with READ-FROM-STRING
  (with-bundle (bundle)
    (replayed ("accept-number")
      (values (read-from-string (replayed ("input-number")
                                  (read-line))))))
  ;; Switch to PARSE-INTEGER and test by replay.
  (with-bundle (bundle)
    (with-replay-filter (:no-replay-outcome '("accept-number"))
      (replayed ("accept-number")
        ;; 1+ is our bug.
        (values (1+ (parse-integer (replayed ("input-number")
                                     (read-line)))))))))

The inner input-number block is replayed by outcome, and parse-integer is called with the string read-line returned in the original invocation. The outcome of the outer accept-number block checked as if it was a versioned-event and we get a replay-outcome-mismatch due to the bug.

λ

11 Persistence

λ

11.1 Persistence tutorial

Let's write a simple game.

(defun play-guess-my-number ()
  (let ((my-number (replayed (think-of-a-number)
                     (random 10))))
    (format t "~%I thought of a number.~%")
    (loop for i upfrom 0 do
      (write-line "Guess my number:")
      (let ((guess (replayed (read-guess)
                     (values (parse-integer (read-line))))))
        (format t "You guessed ~D.~%" guess)
        (when (= guess my-number)
          (checked (game-won :args `(,(1+ i))))
          (format t "You guessed it in ~D tries!" (1+ i))
          (return))))))

(defparameter *the-evergreen-game* (make-in-memory-bundle))
Original recording

Unfortunately, the implementation is lacking in the input validation department. In the transcript below, parse-integer fails with junk in string when the user enters not a number:

CL-USER> (handler-case
             (with-bundle (*the-evergreen-game*)
               (play-guess-my-number))
           (error (e)
             (format t "Oops. ~A~%" e)))
I thought of a number.
Guess my number:
7 ; real user input
You guessed 7.
Guess my number:
not a number ; real user input
Oops. junk in string "not a number"
Replay and extension

Instead of fixing this bug, we just restart the game from the beginning, Replaying the outcome of external interactions marked with replayed:

CL-USER> (with-bundle (*the-evergreen-game*)
           (play-guess-my-number))
I thought of a number.
Guess my number:
You guessed 7.
Guess my number: ; New recording starts here
5 ; real user input
You guessed 5.
Guess my number:
4 ; real user input
You guessed 4.
Guess my number:
2 ; real user input
You guessed 2.
You guessed it in 4 tries!
It's evergreen

We can now replay this game many times without any user interaction:

CL-USER> (with-bundle (*the-evergreen-game*)
           (play-guess-my-number))
I thought of a number.
Guess my number:
You guessed 7.
Guess my number:
You guessed 5.
Guess my number:
You guessed 4.
Guess my number:
You guessed 2.
You guessed it in 4 tries!
The generated events

This simple mechanism allows us to isolate external interactions and write tests in record-and-replay style based on the events produced:

CL-USER> (list-events *the-evergreen-game*)
((:IN THINK-OF-A-NUMBER :VERSION :INFINITY)
 (:OUT THINK-OF-A-NUMBER :VERSION :INFINITY :VALUES (2))
 (:IN READ-GUESS :VERSION :INFINITY)
 (:OUT READ-GUESS :VERSION :INFINITY :VALUES (7))
 (:IN READ-GUESS :VERSION :INFINITY :ARGS NIL)
 (:OUT READ-GUESS :VERSION :INFINITY :VALUES (5))
 (:IN READ-GUESS :VERSION :INFINITY :ARGS NIL)
 (:OUT READ-GUESS :VERSION :INFINITY :VALUES (4))
 (:IN READ-GUESS :VERSION :INFINITY :ARGS NIL)
 (:OUT READ-GUESS :VERSION :INFINITY :VALUES (2))
 (:IN GAME-WON :VERSION 1 :ARGS (4))
 (:OUT GAME-WON :VERSION 1 :VALUES (NIL)))

In fact, being able to replay this game at all already checks it through the game-won event that the number of tries calculation is correct.

In addition, thus being able to reconstruct the internal state of the program gives us persistence by replay. If instead of a in-memory-bundle, we used a file-bundle, the game would have been saved on disk without having to write any code for saving and loading the game state.

Discussion

Persistence by replay, also known as event sourcing, is appropriate when the external interactions are well-defined and stable. Storing events shines in comparison to persisting state when the control flow is too complicated to be interrupted and resumed easily. Resuming execution in deeply nested function calls is fraught with such peril that it is often easier to flatten the program into a state machine, which is as pleasant as manually managing continuations.

In contrast, the Journal library does not favour certain styles of control flow and only requires that non-determinism is packaged up in replayed, which allows it to reconstruct the state of the program from the recorded events at any point during its execution and resume from there.

λ

11.2 Synchronization to storage

In the following, we explore how journals can serve as a persistence mechanism and the guarantees they offer. The high-level summary is that journals with sync can serve as a durable and consistent storage medium. The other two ACID properties, atomicity and isolation, do not apply because Journal is single-client and does not need transactions.

λ

11.2.1 Synchronization strategies

When a journal or bundle is created (see make-in-memory-journal, make-file-journal, make-in-memory-bundle, make-file-bundle), the sync option determines when – as a record-journal – the recorded events and journal-state changes are persisted durably. For file-journals, persisting means calling something like fsync, while for in-memory-journals, a user defined function is called to persist the data.

λ

11.2.2 Synchronization with in-memory journals

Unlike file-journals, in-memory-journals do not have any built-in persistent storage backing them, but with sync-fn, persistence can be tacked on. If non-nil, sync-fn must be a function of a single argument, an in-memory-journal. sync-fn is called according to Synchronization strategies, and upon normal return the journal must be stored durably.

The following example saves the entire journal history when a new data event is recorded. Note how sync-to-db is careful to overwrite *db* only if it is called with a journal that has not failed the replay (as in Replay failures) and is sufficiently different from the replay journal as determined by journal-divergent-p.

(defparameter *db* ())

(defun sync-to-db (journal)
  (when (and (member (journal-state journal)
                     '(:recording :logging :completed))
             (journal-divergent-p journal))
    (setq *db* (journal-events journal))
    (format t "Saved ~S~%New events from position ~S~%" *db*
            (journal-previous-sync-position journal))))

(defun make-db-backed-record-journal ()
  (make-in-memory-journal :sync-fn 'sync-to-db))

(defun make-db-backed-replay-journal ()
  (make-in-memory-journal :events *db*))

(with-journaling (:record (make-db-backed-record-journal)
                  :replay (make-db-backed-replay-journal))
  (replayed (a)
    2)
  (ignore-errors
    (replayed (b)
      (error "Whoops"))))
.. Saved #((:IN A :VERSION :INFINITY)
..         (:OUT A :VERSION :INFINITY :VALUES (2)))
.. New events from position 0
.. Saved #((:IN A :VERSION :INFINITY)
..         (:OUT A :VERSION :INFINITY :VALUES (2))
..         (:IN B :VERSION :INFINITY)
..         (:OUT B :ERROR ("SIMPLE-ERROR" "Whoops")))
.. New events from position 2
..

In a real application, external events often involve unreliable or high-latency communication. In the above example, block b signals an error, say, to simulate some kind of network condition. Now, a new journal for replay is created and initialized with the saved events, and the whole process is restarted.

(defun run-with-db ()
  (with-journaling (:record (make-db-backed-record-journal)
                    :replay (make-db-backed-replay-journal))
    (replayed (a)
      (format t "A~%")
      2)
    (replayed (b)
      (format t "B~%")
      3)))

(run-with-db)
.. B
.. Saved #((:IN A :VERSION :INFINITY)
..         (:OUT A :VERSION :INFINITY :VALUES (2))
..         (:IN B :VERSION :INFINITY)
..         (:OUT B :VERSION :INFINITY :VALUES (3)))
.. New events from position 0
..
=> 3

Note that on the rerun, block a is not executed because external events are replayed simply by reproducing their outcome, in this case returning 2. See Replaying the outcome. Block b, on the other hand, was rerun because it had an unexpected outcome the first time around. This time it ran without error, a data event was triggered, and sync-fn was invoked.

If we were to invoke the now completed run-with-db again, it would simply return 3 without ever invoking sync-fn:

(run-with-db)
=> 3

With journal-replay-mismatch, sync-fn can be optimized to to reuse the sequence of events in the replay journal up until the point of divergence.

λ

11.2.3 Synchronization with file journals

For file-journals, sync determines when the events written to the record-journal and its journal-state will be persisted durably in the file. Syncing to the file involves two calls to fsync and is not cheap.

Syncing events to files is implemented as follows.

Note that this implementation assumes that after writing the start transaction marker, a crash cannot leave any kind of garbage bytes around: it must leave zeros. This is not true for all filesytems. For example, ext3/ext4 with data=writeback can leave garbage around.

λ

12 Safety

Thread safety

Changes to journals come in two varieties: adding an event and changing the journal-state. Both are performed by journaled only unless the low-level streamlet interface is used (see Streamlets reference). Using journaled wrapped in a with-journaling, with-bundle, or :log-record without with-journaling is thread-safe.

Process safety

Currently, there is no protection against multiple OS processes writing the same file-journal or file-bundle.

Signal safety

Journal is designed to be async-unwind safe but not reentrant. Interrupts are disabled only for the most critical cleanup forms. If a thread is killed without unwinding, that constitutes aborted execution, so guarantees about Synchronization to storage apply, but journal objects written by the thread are not safe to access, and the Lisp should probably be restarted.

λ

13 Events reference

Events are normally triggered upon entering and leaving the dynamic extent of a journaled block (see In-events and Out-events) and also by logged. Apart from being part of the low-level substrate of the Journal library, working with events directly is sometimes useful when writing tests that inspect recorded events. Otherwise, skip this entire section.

All events have event-name and event-version(0 1), which feature prominently in The replay strategy. After the examples in In-events and Out-events, the following example is a reminder of how events look in the simplest case.

(with-journaling (:record t)
  (journaled (foo :version 1 :args '(1 2))
    (+ 1 2))
  (logged () "Oops")
  (list-events))
=> ((:IN FOO :VERSION 1 :ARGS (1 2))
    (:OUT FOO :VERSION 1 :VALUES (3))
    (:LEAF "Oops"))

So, a journaled block generates an in-event and an out-event, which are simple property lists. The following reference lists these properties, their semantics and the functions to read them.

λ

13.1 Event versions

λ

13.2 In-events

λ

13.3 Out-events

λ

13.4 Leaf-events

λ

14 Journals reference

In Basics, we covered the bare minimum needed to work with journals. Here, we go into the details.

λ

14.1 Comparing journals

After replay finished (i.e. with-journaling completed), we can ask whether there were any changes produced. This is answered in the strictest sense by identical-journals-p and somewhat more functionally by equivalent-replay-journals-p.

Also see journal-divergent-p.

The rest of section is about concrete subclasses of journal.

λ

14.2 In-memory journals

λ

14.3 File journals

λ

14.4 Pretty-printing journals

λ

15 Bundles reference

In Bundles, we covered the repeated replay problem that with-bundle automates. Here, we provide a reference for the bundle classes.

λ

15.1 In-memory bundles

λ

15.2 File bundles

λ

16 Streamlets reference

This section is relevant mostly for implementing new kinds of journals in addition to file-journals and in-memory-journals. In normal operation, streamlets are not worked with directly.

λ

16.1 Opening and closing

λ

16.2 Reading from streamlets

λ

16.3 Writing to streamlets

λ

17 Glossary