jepsen.nemesis

bisect

(bisect coll)

Given a sequence, cuts it in half; smaller half first.
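As a rough illustration of "smaller half first", here is a minimal sketch in plain Clojure. The name `bisect-sketch` is hypothetical and this is not necessarily how Jepsen implements `bisect`; it only assumes that an odd-length sequence puts the extra element in the second half.

```clojure
(defn bisect-sketch
  "Splits coll into [first-half second-half]. With an odd count, the
  first half is the smaller one."
  [coll]
  (split-at (quot (count coll) 2) coll))

(bisect-sketch [:n1 :n2 :n3 :n4 :n5])
; => [(:n1 :n2) (:n3 :n4 :n5)]
```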

bitflip

(bitflip)

A nemesis which introduces random bitflips in files. Takes operations like:

{:f     :bitflip
 :value {"some-node" {:file         "/path/to/file or /path/to/dir"
                      :probability  1e-3}}}

This flips 1 x 10^-3 of the bits in /path/to/file, or a random file in /path/to/dir, on “some-node”.

bitflip-dir

The directory where the bitflip utility is installed.

bridge

(bridge nodes)

A grudge which cuts the network in half, but preserves a node in the middle which has uninterrupted bidirectional connectivity to both components.
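To make the shape of a bridge grudge concrete, the sketch below builds one in plain Clojure. It assumes a grudge is a map of each node to the set of nodes it should not talk to, and it arbitrarily picks the first node of the larger half as the bridge; both the name `bridge-sketch` and that choice are assumptions for illustration.

```clojure
(defn bridge-sketch
  "Cuts nodes into two halves, minus one bridge node which appears in
  nobody's grudge and holds no grudge itself."
  [nodes]
  (let [[a b]  (split-at (quot (count nodes) 2) nodes)
        bridge (first b)                 ; assumed choice of bridge node
        a      (set a)
        b      (disj (set b) bridge)]
    (merge (zipmap a (repeat b))         ; side a refuses side b
           (zipmap b (repeat a)))))      ; side b refuses side a

(bridge-sketch [:n1 :n2 :n3 :n4 :n5])
; => {:n1 #{:n4 :n5}, :n2 #{:n4 :n5}, :n4 #{:n1 :n2}, :n5 #{:n1 :n2}}
```

Here :n3 can still reach every node, while the two sides cannot reach each other.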

clock-scrambler

(clock-scrambler dt)

Randomizes the system clock of all nodes within a dt-second window.

complete-grudge

(complete-grudge components)

Takes a collection of components (collections of nodes), and computes a grudge such that no node can talk to any nodes outside its partition.
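The following is a minimal sketch of that computation, assuming a grudge maps each node to the set of nodes it must not talk to. `complete-grudge-sketch` is a hypothetical name, not Jepsen's function.

```clojure
(require '[clojure.set :as set])

(defn complete-grudge-sketch
  "For every node, the grudge is every node outside its own component."
  [components]
  (let [components (map set components)
        universe   (apply set/union components)]
    (into {}
          (for [component components
                node      component]
            [node (set/difference universe component)]))))

(complete-grudge-sketch [[:n1 :n2] [:n3 :n4 :n5]])
; => {:n1 #{:n3 :n4 :n5}, :n2 #{:n3 :n4 :n5},
;     :n3 #{:n1 :n2}, :n4 #{:n1 :n2}, :n5 #{:n1 :n2}}
```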

compose

(compose nemeses)

Combines multiple Nemesis objects into one. If all nemeses, or all but one, support Reflection, compose can simply take a collection of nemeses, and use (fs nem) to figure out what ops to send to which nemesis. Otherwise…

Takes a map of fs to nemeses and returns a single nemesis which, depending on (:f op), routes to the appropriate child nemesis. fs should be a function which takes (:f op) and returns either nil, if that nemesis should not handle that :f, or a new :f, which replaces the op’s :f, and the resulting op is passed to the given nemesis. For instance:

(compose {#{:start :stop} (partition-random-halves)
          #{:kill}        (process-killer)})

This routes :kill ops to process killer, and :start/:stop to the partitioner. What if we had two partitioners which both take :start/:stop?

(compose {{:split-start :start
           :split-stop  :stop} (partition-random-halves)
          {:ring-start  :start
           :ring-stop   :stop} (partition-majorities-ring)})

We turn :split-start into :start, and pass that op to partition-random-halves.
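The routing rule can be sketched in plain Clojure. Both sets and maps are functions of their keys, which is why either can serve as an fs. The name `route-sketch` is hypothetical, and the children here are plain keywords standing in for real nemeses.

```clojure
(defn route-sketch
  "Given a map of fs to children and an op, returns [child rewritten-op]
  for the first fs that returns non-nil for (:f op), or nil if none match."
  [routes op]
  (some (fn [[fs child]]
          (when-let [f' (fs (:f op))]
            [child (assoc op :f f')]))
        routes))

(def routes
  {#{:kill}                   :killer      ; a set passes :f through unchanged
   {:split-start :start
    :split-stop  :stop}       :splitter})  ; a map rewrites :f

(route-sketch routes {:f :split-start})  ; => [:splitter {:f :start}]
(route-sketch routes {:f :kill})         ; => [:killer {:f :kill}]
(route-sketch routes {:f :unknown})      ; => nil
```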

f-map

(f-map lift nem)

Remaps the :f values that a nemesis accepts. Takes a function (presumably injective) which transforms :f values: (lift f) -> g, and a nemesis which accepts operations like {:f f}. The nemesis must support Reflection/fs. Returns a new nemesis which takes {:f g} instead. For example:

(f-map (fn [f] [:foo f]) (partition-random-halves))

… yields a nemesis which takes ops like {:f [:foo :start] ...} and calls the underlying partitioner nemesis with {:f :start ...}. This is designed for symmetry with generator/f-map, so you can say:

(gen/f-map lift gen)
(nem/f-map lift nem)

and get a generator and nemesis that work together. Particularly handy for building up complex nemesis packages using nemesis.combined!

If you know all of your fs in advance, you can also do this with compose, but it turns out to be handy to have this as a separate function.
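One way to see why the nemesis must support Reflection/fs, and why lift should be injective, is that the wrapper needs to map each lifted g back to the original f. The sketch below does this in plain Clojure; `child-fs`, `unlift`, and `unlift-op` are hypothetical names, and this is not claimed to be Jepsen's implementation.

```clojure
(defn lift [f] [:foo f])

;; The fs the wrapped nemesis reports via Reflection (assumed here).
(def child-fs #{:start :stop})

;; Inverse table from lifted g back to original f. If lift were not
;; injective, two fs would collide on the same key.
(def unlift
  (into {} (map (juxt lift identity)) child-fs))
; => {[:foo :start] :start, [:foo :stop] :stop}

(defn unlift-op
  "Rewrites a lifted op for the wrapped nemesis, or returns nil if the
  op's :f is not one of the lifted fs."
  [op]
  (when-let [f (unlift (:f op))]
    (assoc op :f f)))

(unlift-op {:f [:foo :start], :value nil})
; => {:f :start, :value nil}
```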

hammer-time

(hammer-time process)
(hammer-time targeter process)

Responds to {:f :start} by pausing the given process name on a given node or nodes using SIGSTOP, and when {:f :stop} arrives, resumes it with SIGCONT. Picks the node(s) to pause using (targeter list-of-nodes), which defaults to rand-nth. Targeter may return either a single node or a collection of nodes.

invert-grudge

(invert-grudge nodes conns)

Takes a universe of nodes and a map of nodes to nodes they should be connected to, and returns a map of nodes to nodes they should NOT be connected to.
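A minimal sketch of the inversion, assuming both maps use sets of nodes as values. `invert-grudge-sketch` is a hypothetical name; it also removes each node from its own result, which is a choice made here and not something the description above specifies.

```clojure
(require '[clojure.set :as set])

(defn invert-grudge-sketch
  "Turns a map of node -> nodes it should reach into a map of
  node -> nodes it should not reach."
  [nodes conns]
  (let [universe (set nodes)]
    (into {}
          (for [[node connected] conns]
            [node (-> universe
                      (set/difference (set connected))
                      (disj node))]))))   ; a node never snubs itself

(invert-grudge-sketch [:n1 :n2 :n3] {:n1 #{:n2}, :n2 #{:n1}, :n3 #{}})
; => {:n1 #{:n3}, :n2 #{:n3}, :n3 #{:n1 :n2}}
```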

majorities-ring

(majorities-ring nodes)

A grudge in which every node can see a majority, but no node sees the same majority as any other. There are nice, exact solutions where the topology does look like a ring: these are possible for 4, 5, 6, 8, etc. nodes. Seven, however, does not work so cleanly: some nodes must be connected to four others. We therefore offer two algorithms: one which provides an exact ring for 5-node clusters (generally common in Jepsen), and a stochastic one which doesn’t guarantee efficient ring structures, but works for larger clusters.

Wow this actually is shockingly complicated. Wonder if there’s a better way?
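For the 5-node case the exact ring is easy to picture: each node sees itself and its two ring neighbours, which is a majority of three, and no two nodes see the same three. The sketch below illustrates only that case, in the order the nodes are given; `ring-grudge-5` is a hypothetical name and this is not Jepsen's algorithm.

```clojure
(require '[clojure.set :as set])

(defn ring-grudge-5
  "Grudge for exactly five nodes arranged in a ring: every node refuses
  the two nodes that are not its immediate neighbours."
  [nodes]
  (assert (= 5 (count nodes)) "this sketch only handles 5 nodes")
  (let [ring     (vec nodes)
        universe (set ring)]
    (into {}
          (for [i (range 5)]
            (let [visible #{(ring (mod (dec i) 5))
                            (ring i)
                            (ring (mod (inc i) 5))}]
              [(ring i) (set/difference universe visible)])))))

(ring-grudge-5 [:n1 :n2 :n3 :n4 :n5])
; => {:n1 #{:n3 :n4}, :n2 #{:n4 :n5}, :n3 #{:n1 :n5},
;     :n4 #{:n1 :n2}, :n5 #{:n2 :n3}}
```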

majorities-ring-perfect

(majorities-ring-perfect nodes)

The perfect variant of majorities-ring, used for 5-node clusters.

majorities-ring-stochastic

(majorities-ring-stochastic nodes)

The stochastic variant of majorities-ring, used for larger clusters.

Nemesis

protocol

members

invoke!

(invoke! this test op)

Apply an operation to the nemesis, which alters the cluster.

setup!

(setup! this test)

Set up the nemesis to work with the cluster. Returns the nemesis ready to be invoked.

teardown!

(teardown! this test)

Tear down the nemesis when work is complete.
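The lifecycle implied by these three members can be sketched with a stand-in protocol. `NemesisSketch` and `logging-nemesis` are hypothetical; the real protocol lives in jepsen.nemesis, and returning the op with :type :info from invoke! is an assumption based on the :info operations mentioned elsewhere in this namespace.

```clojure
;; A local stand-in with the same three members as the protocol above.
(defprotocol NemesisSketch
  (setup!    [this test])
  (invoke!   [this test op])
  (teardown! [this test]))

(defn logging-nemesis
  "A nemesis-shaped object that only records which calls it received."
  [log]
  (reify NemesisSketch
    (setup!    [this test]    (swap! log conj :setup) this)
    (invoke!   [this test op] (swap! log conj (:f op))
                              (assoc op :type :info))
    (teardown! [this test]    (swap! log conj :teardown) nil)))

(def log (atom []))

(let [nem (setup! (logging-nemesis log) {})]   ; setup! returns the nemesis
  (invoke! nem {} {:f :start})
  (invoke! nem {} {:f :stop})
  (teardown! nem {}))

@log
; => [:setup :start :stop :teardown]
```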

node-start-stopper

(node-start-stopper targeter start! stop!)

Takes a targeting function which, given a list of nodes, returns a single node or collection of nodes to affect, and two functions (start! test node) invoked on nemesis start, and (stop! test node) invoked on nemesis stop. Returns a nemesis which responds to :start and :stop by running the start! and stop! fns on each of the given nodes. During start! and stop!, binds the jepsen.control session to the given node, so you can just call (c/exec ...).

The targeter can take either (targeter test nodes) or, if that fails, (targeter nodes).

Re-selects a fresh node (or nodes) for each start; if targeter returns nil, skips the start. The return values from the start and stop fns will become the :values of the returned :info operations from the nemesis, e.g.:

{:value {:n1 [:killed "java"]}}
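The two targeter shapes described above can be illustrated with a small fallback wrapper. This sketch assumes that "fails" means the two-argument call throws an arity error; `call-targeter` is a hypothetical helper, not part of Jepsen.

```clojure
(defn call-targeter
  "Tries (targeter test nodes) first, and falls back to (targeter nodes)
  when the targeter does not accept two arguments."
  [targeter test nodes]
  (try
    (targeter test nodes)
    (catch clojure.lang.ArityException _
      (targeter nodes))))

;; A one-argument targeter that returns a single node.
(call-targeter first {} [:n1 :n2 :n3])
; => :n1

;; A two-argument targeter that returns a collection of nodes.
(call-targeter (fn [test nodes] (take 2 nodes)) {} [:n1 :n2 :n3])
; => (:n1 :n2)
```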

noop

Does nothing.

partition-halves

(partition-halves)

Responds to a :start operation by cutting the network into two halves (the first nodes together, in the smaller half), and to a :stop operation by repairing the network.

partition-majorities-ring

(partition-majorities-ring)

Every node can see a majority, but no node sees the same majority as any other. Randomly orders nodes into a ring.

partition-random-halves

(partition-random-halves)

Cuts the network into randomly chosen halves.

partition-random-node

(partition-random-node)

Isolates a single node from the rest of the network.

partitioner

(partitioner)
(partitioner grudge)

Responds to a :start operation by cutting network links as defined by (grudge nodes), and responds to :stop by healing the network. The grudge to apply is either taken from the :value of a :start op or, if that is nil, computed by calling (grudge (:nodes test)).
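The choice between the op's :value and the grudge function can be sketched as follows. `choose-grudge` and `isolate-first` are hypothetical names used only for illustration.

```clojure
(defn choose-grudge
  "Uses the op's :value as the grudge when present; otherwise computes
  one from the test's nodes."
  [grudge test op]
  (if-some [explicit (:value op)]
    explicit
    (grudge (:nodes test))))

;; A toy grudge function: isolate the first node from everyone else.
(defn isolate-first [nodes]
  (let [[loner & others] nodes]
    (into {loner (set others)}
          (for [n others] [n #{loner}]))))

(def test-map {:nodes [:n1 :n2 :n3]})

(choose-grudge isolate-first test-map {:f :start, :value nil})
; => {:n1 #{:n2 :n3}, :n2 #{:n1}, :n3 #{:n1}}

(choose-grudge isolate-first test-map {:f :start, :value {:n2 #{:n3}}})
; => {:n2 #{:n3}}
```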

Reflection

protocol

Optional protocol for reflecting on nemeses.

members

fs

(fs this)

What :f functions does this nemesis support? Returns a set. Helpful for composition.

set-time!

(set-time! t)

Set the local node time in POSIX seconds.

split-one

(split-one coll)
(split-one loner coll)

Split one node off from the rest.
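A minimal sketch of the two arities, assuming the single-argument form picks the loner at random and that the result is a pair of components. `split-one-sketch` is a hypothetical name.

```clojure
(defn split-one-sketch
  "Returns [[loner] everyone-else]. Without an explicit loner, picks one
  at random."
  ([coll]
   (split-one-sketch (rand-nth (vec coll)) coll))
  ([loner coll]
   [[loner] (remove #{loner} coll)]))

(split-one-sketch :n2 [:n1 :n2 :n3])
; => [[:n2] (:n1 :n3)]
```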

timeout

(timeout timeout-ms nemesis)

Sometimes nemeses are unreliable. If you wrap them in this nemesis, it’ll time out their operations with the given timeout, in milliseconds. Timed out operations have :value :timeout.
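The timeout behaviour can be approximated in plain Clojure with a worker thread and a promise. This is a sketch under stated assumptions, not Jepsen's implementation: `invoke-with-timeout` is a hypothetical helper, and f stands in for the wrapped nemesis's invoke step.

```clojure
(defn invoke-with-timeout
  "Runs (f op) on a daemon thread. If no result arrives within
  timeout-ms, interrupts the worker and returns op with :value :timeout."
  [timeout-ms f op]
  (let [result (promise)
        worker (doto (Thread.
                      (fn []
                        (try
                          (deliver result (f op))
                          (catch InterruptedException _ nil))))
                 (.setDaemon true)
                 (.start))
        res    (deref result timeout-ms ::timed-out)]
    (if (= ::timed-out res)
      (do (.interrupt worker)
          (assoc op :value :timeout))
      res)))

;; A fast operation completes normally.
(invoke-with-timeout 1000 #(assoc % :value :done) {:f :start})
; => {:f :start, :value :done}

;; A slow operation is cut off after 50 ms.
(invoke-with-timeout 50 (fn [op] (Thread/sleep 5000) op) {:f :start})
; => {:f :start, :value :timeout}
```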

truncate-file

(truncate-file)

A nemesis which responds to

{:f     :truncate
 :value {"some-node" {:file "/path/to/file or /path/to/dir"
                      :drop 64}}}

where the value is a map of nodes to {:file, :drop} maps. On each of those nodes, it drops the last :drop bytes from the given file, or from a random file in the given directory.

validate

(validate nemesis)

Wraps a nemesis, validating that it constructs responses to setup and invoke correctly.