diff --git a/README.md b/README.md
index 76eec53d5f3e4830d27e08aed7da2d6d993f37e8..b4a925ed8c4bf7c27f1b390cdb27b8d15d1d3a96 100644
--- a/README.md
+++ b/README.md
@@ -1,20 +1,23 @@
 replds (v2)
 ===
-Minimalistic filesystem-like replicated database, with
-*last-write-wins* semantics, authentication and ACLs. It offers a
+Minimalistic filesystem-like replicated data distribution service,
+with *last-write-wins* semantics, authentication and ACLs. It offers a
 simple client to replicate locally (on the filesystem) parts of the
-database, and trigger the execution of shell scripts when they change.
+database, and trigger the execution of shell scripts when they change,
+which makes it particularly suited for the distribution of *secrets*
+that need to be modified by online automation.
 
 The last-write-wins model makes it suitable for scenarios where
 there is either a single writer, or the key space is completely
-partitioned. The absence of transactions makes it not quite a *real*
-database: it is best to think about it as a *data distribution*
+partitioned. The absence of transactions or other database-like
+features is why it is best to think of replds as a *data distribution*
 solution, with a familiar filesystem-like data model.
 
-The current limitations of the internal replication protocols makes it
-only suitable for small-ish datasets (i.e., fewer than 100k objects,
-less than 10G total size).
+The current limitations of the internal replication protocol make it
+only suitable for small-ish datasets (i.e., fewer than 100k objects, a
+few GBs total size). See the *Replication semantics* section below for
+further details.
 
 # Building
 
@@ -28,7 +31,22 @@ sufficiently recent Go environment (>= 1.14):
 go build ./cmd/replds
 ```
 
-# Quick tutorial
+# Running
+
+All the functionality is available in the *replds* binary, which has
+subcommands to run servers, clients, etc. You can run
+
+```
+replds help
+```
+
+to get a quick summary of the available commands and options.
+
+## Quick tutorial
+
+This tutorial will set up a local *replds* service cluster with
+multiple instances, without authentication or ACLs, and will show how
+to perform some simple operations.
 
 Let's start by running the server. We'll use the *test.sh* script to
 start 3 instances of "replds server", on different ports, that talk to
@@ -79,3 +97,120 @@ replds pull --store=./sub --server=localhost:12100 /data/sub
 The "replds pull" program will keep running to receive incremental
 updates, but now the *sub* local directory should have the expected
-contents (files *one* and *two).
+contents (files *one* and *two*).
+
+## Replication semantics
+
+The service implements a *last-write-wins* conflict resolution
+approach, which favors availability over consistency, to the extent
+that there is **no** consistency guarantee whatsoever: there are no
+transactions, and there is no serialization. The API contract is the
+following:
+
+* Each object in the data store (*node*, in internal terminology) has
+  a *version* associated with it.
+* Clients are guaranteed that the version of the objects they see will
+  always increase monotonically: that is, they will never revert to
+  seeing an "older" version of an object.
+* The above applies to each object separately: if you store objects A
+  and B in a single API call, there is no guarantee that clients will
+  see A and B updated at the same time (though the current
+  implementation tries really hard to make it so).
+
+Replds instances synchronize updates with each other using a
+gossip-like protocol, with periodic data exchanges.
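+
+As an illustration, here is a minimal sketch of what last-write-wins
+reconciliation of a single node looks like during such an exchange.
+This is hypothetical Go written for this README, not the actual
+replds internals: the `Node` type and `merge` function are made up.
+
+```
+package main
+
+import "fmt"
+
+// Node is one object in the data store; each node carries a version.
+type Node struct {
+	Path    string
+	Value   []byte
+	Version uint64
+}
+
+// merge applies a remote copy of a node to the local store, keeping
+// whichever copy has the higher version. A node's version never moves
+// backwards, which is what gives clients the monotonicity guarantee
+// described above.
+func merge(local map[string]Node, remote Node) {
+	cur, ok := local[remote.Path]
+	if !ok || remote.Version > cur.Version {
+		local[remote.Path] = remote
+	}
+}
+
+func main() {
+	store := map[string]Node{
+		"/data/one": {Path: "/data/one", Value: []byte("old"), Version: 3},
+	}
+	// A newer remote copy wins; an older one is ignored.
+	merge(store, Node{Path: "/data/one", Value: []byte("new"), Version: 5})
+	merge(store, Node{Path: "/data/one", Value: []byte("stale"), Version: 4})
+	fmt.Printf("%s (version %d)\n", store["/data/one"].Value,
+		store["/data/one"].Version)
+}
+```
+
+The actual wire format and storage layout differ, but this per-node
+version comparison is the essence of the last-write-wins model.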
+The current implementation of this gossip protocol is very simple and
+**highly** inefficient: the periodic data exchange includes a
+representation of the entire contents of the data store, so bandwidth
+usage scales with the number of objects in the database! Also, due to
+the randomness intrinsic to the gossip protocol, update propagation
+delays can only be expressed statistically, so it is best to limit the
+number of instances to a small value.
+
+There is no concept of consensus, so writes will always succeed: the
+service relies on internal synchronization to propagate the update to
+all instances. In the presence of a network partition, clients may not
+see these updates until the partition is resolved. This makes it
+pretty much impossible for clients to implement read / modify / write
+cycles, even with a single writer, unless they maintain local state
+(possibly using a Watcher).
+
+The instances store node data on disk, but keep all the nodes'
+metadata in memory for speed, albeit in a memory-efficient radix tree
+(compressed trie) data structure.
+
+The maximum data size is limited by the inefficiency of the
+synchronization protocol, and by the requirement to keep all the node
+metadata in memory at all times. The current implementation is also
+not very smart in that it needs to hold the whole dataset (including
+node data!) in memory when doing an initial synchronization pass, in
+order to serialize it into protobufs. All this places limits on
+dataset size, both in terms of number of entities (say, less than 10k)
+and total size of the data (a few GBs).
+
+## API
+
+The public replds API is a gRPC API, defined in
+[proto/replds.proto](proto/replds.proto), and consists of two RPC
+methods:
+
+* *Store*, to upload data to the service. The "replds store" command
+  implements a simple command-line client for this method.
+* *Watch*, for incremental synchronization of parts of the data
+  store. This is a long-lived streaming RPC that implements an
+  asynchronous replication protocol: the client sends a summary of the
+  initial state of its database, then listens for streaming
+  incremental updates. The "replds pull" command implements this
+  method.
+
+## Deployment
+
+To provide high availability, you should run multiple instances of
+*replds server*. The cluster model is static, i.e., each instance
+needs to be told explicitly about the others using the *--peer*
+command-line option.
+
+Clients can talk to any instance, for both read and write API calls.
+
+### Authentication
+
+Replds supports authentication, both of clients and of server
+instances, using mTLS (mutually-authenticated TLS). Clients, including
+server instances when they talk to other server instances, need a
+valid TLS client certificate, signed by a specific CA.
+
+Furthermore, the identity associated with the client certificate, in
+the form of the Common Name (CN) part of the X.509 subject, can be
+used in *access control lists* (ACLs) to limit which clients can
+access which parts of the data store. In replds, an ACL consists of
+three parts:
+
+* identity
+* path
+* operation (read / write)
+
+When ACLs are defined, access is **denied by default**, and ACL rules
+are used to allow specific accesses. Paths in ACL rules are prefixes:
+an ACL for /foo will also allow access to /foo/bar, etc.
+
+ACLs are configured in a text file, one rule per line, with the
+following format:
+
+> *identity* *path* *operation*
+
+where the operation can be either `read` or `write` (possibly
+abbreviated to `r` or `w`).
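+
+For example, a hypothetical ACL file could look like this (the
+identities and paths are made up):
+
+```
+# The deployment automation can update secrets.
+deploy-agent /secrets write
+# Front-end hosts only need to read their own credentials.
+frontend /secrets/frontend r
+```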
+Pass this file to "replds server" using the *--acls* option. Lines
+that start with a # are considered comments and are ignored.
+
+### Triggers
+
+The "replds pull" command can execute user-defined triggers (shell
+commands) when certain parts of the data store tree are updated. This
+can be useful, for instance, to restart services with new credentials.
+
+To load triggers, specify a directory with *--triggers-dir*. Replds
+will load every file in this directory whose name does not contain a
+dot (so no file extensions should be used) as a JSON-encoded trigger
+definition. Each file should contain a dictionary with two attributes:
+
+* *path*, specifying the path prefix that will activate this trigger;
+* *command*, the shell command to execute.
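+
+For instance, a hypothetical trigger file (say, a file named
+*reload-frontend* in the triggers directory; the path and command
+below are made up) could contain:
+
+```
+{
+  "path": "/secrets/frontend",
+  "command": "systemctl reload frontend"
+}
+```
+
+With this trigger loaded, whenever "replds pull" receives an update
+for an object under /secrets/frontend, it runs the configured shell
+command.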