jepsen.nemesis.membership

EXPERIMENTAL: provides standardized support for nemeses which add and remove nodes from a cluster.

This is a tricky problem. Even the concept of cluster state is complicated: there is Jepsen’s knowledge of the state, and each individual node’s understanding of the current state. Depending on which node you ask, you may get more or less recent (or, frequently, divergent) views of cluster state. Cluster state representation is highly variable across databases, which means our standardized state machine must allow for that variability.

We are guided by some principles that crop up repeatedly in writing these sorts of nemeses:

  1. We should avoid creating useless cluster states–e.g. those that can’t fulfill any requests–for very long.

  2. There are both safe and unsafe transitions. In general, commands like join/remove should always be safe. Removing data, however, is unsafe unless we can prove the node has been properly removed.

  3. We want to leave nodes running, with data files intact, after removing them. This is when interesting things happen.

  4. We must be safe in the presence of concurrent node kill/restart operations.

  5. Nodes tend to go down or fail to reach the rest of the cluster, but we want to continue making decisions during this time.

  6. Requested changes to the cluster may time out, or simply take a while to perform. We need to remember these ongoing operations, use them to constrain our choices of further changes (e.g. if four node removals are underway, don’t initiate a fifth), and find ways to resolve those ongoing changes, e.g. by confirming they took place.

Our general approach is to define a sort of state machine where the state is our representation of the cluster state, how all nodes view the cluster, and the set of ongoing operations, plus any auxiliary material (e.g. after completing a node removal, we can delete its data files). This state is periodically updated by querying individual nodes, and also by performing operations–e.g. initiating a node removal.

The generator constructs those operations by asking the nemesis what sorts of operations would be legal to perform at this time, and picking one of those. It then passes that operation back to the nemesis (via nemesis/invoke!), and the nemesis updates its local state and performs the operation.
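The way pending operations constrain the set of legal choices can be sketched in plain Clojure. This is a minimal illustration, not Jepsen's implementation: the state shape (`:nodes`, `:view`, `:pending`), the op maps, and the names `max-pending`, `legal-ops`, and `pick-op` are all assumptions made for this example.

```clojure
(def max-pending
  "Hypothetical cap on how many membership changes may be in flight at once."
  2)

(defn legal-ops
  "Returns a vector of operations that would be legal in the given state.
  Hypothetical rules: start nothing new once `max-pending` changes are in
  flight, never let removals shrink the view to a bare majority or below,
  and never target a node that already has a pending change."
  [{:keys [nodes view pending]}]
  (if (<= max-pending (count pending))
    []
    (let [free?    (complement (set (map :value pending)))
          removing (count (filter #(= :remove-node (:f %)) pending))
          majority (inc (quot (count nodes) 2))
          removes  (when (< majority (- (count view) removing))
                     (for [n (sort view) :when (free? n)]
                       {:type :info, :f :remove-node, :value n}))
          joins    (for [n (sort (remove (set view) nodes)) :when (free? n)]
                     {:type :info, :f :add-node, :value n})]
      (vec (concat removes joins)))))

(defn pick-op
  "Picks one legal operation at random, or nil when none is legal."
  [state]
  (when-let [ops (seq (legal-ops state))]
    (rand-nth (vec ops))))

;; Example state: five nodes, four of them currently in the cluster.
(def example-state
  {:nodes   ["n1" "n2" "n3" "n4" "n5"]
   :view    #{"n1" "n2" "n3" "n4"}
   :pending #{}})
```

In this sketch, a generator would call `pick-op` on the current state and hand the result back to the nemesis, which records it as pending before performing it.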

initial-state

(initial-state test)

Constructs an initial cluster state map for the given test.

node-view-future

(node-view-future test state running? opts node)

Spawns a future which keeps the given state atom updated with our view of this node.
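The general pattern of a background future that keeps polling until a `running?` flag is cleared might look like the following. This is a sketch under stated assumptions, not the actual body of node-view-future: `poll-future`, `interval-ms`, and `update!` are hypothetical names, and the real function takes a test, state atom, opts, and node instead.

```clojure
(defn poll-future
  "Spawns a future that calls the zero-argument function `update!` every
  `interval-ms` milliseconds for as long as the `running?` atom holds true.
  Exceptions from `update!` are swallowed so that one failed poll (e.g. an
  unreachable node) does not end the loop."
  [running? interval-ms update!]
  (future
    (while @running?
      (try
        (update!)
        (catch Exception _
          nil))
      (Thread/sleep (long interval-ms)))))

;; Usage: poll a counter a few times, then shut the loop down.
(def running? (atom true))
(def polls (atom 0))
(def poller (poll-future running? 10 #(swap! polls inc)))
(Thread/sleep 100)
(reset! running? false)
@poller ; blocks until the loop observes the cleared flag and exits
```

Dereferencing the future after clearing the flag gives a simple way to wait for a clean shutdown.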

node-view-interval

The number of seconds to wait between node view updates.

package

(package opts)

Constructs a nemesis and generator for membership operations. Options are a map like

{:faults #{:membership …} :membership membership-opts}.

Membership opts are:

{:state            A record satisfying the State protocol
 :log-resolve-op?  Whether to log the resolution of operations
 :log-resolve?     Whether to log each resolve step
 :log-node-views?  Whether to log changing node views
 :log-view?        Whether to log the entire cluster view}

The package includes a :state field, which is an atom of the current cluster state. You can use this (for example) to have generators which inspect the current cluster state and use it to target faults.
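As an illustration of the option shape described above, an options map might be written as follows. The name `example-opts` is hypothetical, and the `nil` under `:state` is only a placeholder: a real test would supply a record satisfying the State protocol there.

```clojure
(def example-opts
  {:faults     #{:membership}
   :membership {:state           nil   ; placeholder: supply a record satisfying State
                :log-resolve-op? true
                :log-resolve?    false
                :log-node-views? true
                :log-view?       false}})
```

A map of this shape would be passed to `package`; with a real State record in place, the returned package's `:state` atom could then be dereferenced by other generators.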

resolve

(resolve state test opts)

Resolves a state towards its final form by repeatedly calling the State protocol’s resolve and this namespace’s resolve-ops until the state stops changing.
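Resolving "until converged" is a fixed-point iteration: apply a step function until its output equals its input. The sketch below shows that technique on a toy state; `converge` and `toy-step` are hypothetical names, and the toy step stands in for whatever a real State implementation does.

```clojure
(defn converge
  "Applies `step` to `x` repeatedly until the result stops changing, then
  returns that stable value."
  [step x]
  (let [x' (step x)]
    (if (= x x')
      x
      (recur step x'))))

(defn toy-step
  "Hypothetical single resolve step: moves at most one pending node into
  the view, or returns the state unchanged when nothing is pending."
  [{:keys [view pending] :as state}]
  (if-let [n (first (sort pending))]
    (assoc state :view (conj view n) :pending (disj pending n))
    state))

(def converged
  (converge toy-step {:view #{"n1"} :pending #{"n2" "n3"}}))
;; converged is {:view #{"n1" "n2" "n3"}, :pending #{}}
```

Equality on immutable Clojure maps makes the convergence check cheap to express, which is one reason to keep the cluster state as plain data.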

resolve-ops

(resolve-ops state test opts)

Try to resolve any pending ops we can. Returns state with those ops resolved.
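One way to picture resolving pending ops is to drop each op whose effect is already visible in the cluster view. The following is a minimal sketch, assuming a hypothetical state shape with a `:view` set and a `:pending` set of op maps; `resolved?` and `resolve-pending` are illustrative names, not part of Jepsen.

```clojure
(defn resolved?
  "Hypothetical check: a join is done once the node appears in the view,
  and a removal is done once the node is absent from it. Unknown op types
  are never considered resolved."
  [view {:keys [f value]}]
  (case f
    :add-node    (contains? view value)
    :remove-node (not (contains? view value))
    false))

(defn resolve-pending
  "Returns the state with every resolvable pending op removed from :pending."
  [{:keys [view pending] :as state}]
  (assoc state :pending (set (remove #(resolved? view %) pending))))

(def after-resolve
  (resolve-pending
    {:view    #{"n1" "n2"}
     :pending #{{:f :add-node,    :value "n2"}    ; n2 is in the view: resolved
                {:f :add-node,    :value "n3"}    ; n3 not yet visible: stays
                {:f :remove-node, :value "n4"}}})) ; n4 is gone: resolved
```

Ops that cannot yet be confirmed simply stay in `:pending`, so they continue to constrain which further changes are legal.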

State

protocol

For convenience, a copy of the membership State protocol. This lets users implement the protocol without requiring the state namespace themselves.

update-node-view!

(update-node-view! state test node opts)

Takes an atom wrapping a State, a test, a node, and an options map. Gets the current view from that node’s perspective, and updates the state atom to reflect it.
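The atom-update pattern described here can be sketched without any Jepsen dependencies. In this illustration, `update-view!` and `fetch-view` are hypothetical names, and storing views under `[:node-views node]` is an assumption about the state shape rather than a documented layout.

```clojure
(defn update-view!
  "Calls `fetch-view` on `node` and, if it returns a non-nil view, records
  that view in `state-atom` under [:node-views node]. A nil view (e.g. the
  node was unreachable) leaves the state untouched. Returns the current
  value of the atom."
  [state-atom fetch-view node]
  (when-some [view (fetch-view node)]
    (swap! state-atom assoc-in [:node-views node] view))
  @state-atom)

;; Usage with a stubbed fetch function: n1 answers, n2 is unreachable.
(def cluster-state (atom {:node-views {}}))
(def stub-views {"n1" #{"n1" "n2" "n3"}})

(update-view! cluster-state stub-views "n1")
(update-view! cluster-state stub-views "n2")
```

Using `swap!` keeps each update atomic, so several per-node polling threads can write into the same state atom without losing one another's views.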