replds (v2)
===
Minimalistic filesystem-like replicated data distribution service,
with *last-write-wins* semantics, authentication and ACLs. It offers a
simple client to replicate locally (on the filesystem) parts of the
database, and trigger the execution of shell scripts when they change,
which makes it particularly suited for the distribution of *secrets*
that need to be modified by online automation.

The last-write-wins model makes it suitable for scenarios where there
is either a single writer, or the key space is completely
partitioned. The absence of transactions or other database-like
features is why it is best to think of replds as a *data distribution*
solution, with a familiar filesystem-like data model.

The current limitations of the internal replication protocol make it
only suitable for small-ish datasets (i.e., fewer than 100k objects, a
few GBs total size). See the *Replication semantics* section below for
further details.
# Building

To build replds you need a sufficiently recent Go environment (>= 1.14):

```
go build ./cmd/replds
```

# Running
All the functionality is available in the *replds* binary, which has
subcommands to run servers, clients, etc. You can run
```
replds help
```
to get a quick summary of the available commands and options.
## Quick tutorial
This tutorial will set up a local *replds* service cluster with
multiple instances, without authentication or ACLs, and will show how
to perform some simple operations.

Let's start by running the server. We'll use the *test.sh* script to
start 3 instances of "replds server", on different ports, that talk to
each other.

Once some data has been stored, a subtree of the data store can be
replicated to a local directory with "replds pull":

```
replds pull --store=./sub --server=localhost:12100 /data/sub
```
The "replds pull" program will keep running to receive incremental
updates, but now the *sub* local directory should have the expected
contents (files *one* and *two*).
## Replication semantics
The service implements a *last-write-wins* conflict resolution
approach, which favors availability as opposed to consistency, to the
extent that there is **no** consistency guarantee whatsoever: there
are no transactions, there is no serialization. The API contract is
the following:
* Each object in the data store (*node*, in internal terminology) has
a *version* associated with it.
* Clients are guaranteed that the version of objects they see will
always increase monotonically: that is, they will never revert to
seeing an "older" version of an object.
* The above applies to each object separately: if you store objects A
and B in a single API call, there is no guarantee that clients will
see A and B updated at the same time (though the current
implementation tries really hard to make it so).
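
To make this contract concrete, here is a minimal sketch in Go (an
illustration only, not the actual replds implementation; the `Node`
and `Store` types are invented for the example) of how a receiver can
enforce the per-object monotonic guarantee: an incoming update is
applied only if it carries a higher version than the one currently
stored.

```
package main

import (
	"fmt"
	"sync"
)

// Node is a simplified stand-in for a replds object: a path, its
// contents, and a version used for last-write-wins resolution.
type Node struct {
	Path    string
	Value   []byte
	Version uint64
}

// Store keeps the latest version seen for every path.
type Store struct {
	mu    sync.Mutex
	nodes map[string]Node
}

func NewStore() *Store {
	return &Store{nodes: make(map[string]Node)}
}

// Apply merges an incoming update, keeping the higher version.
// It returns true if the update was accepted.
func (s *Store) Apply(n Node) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	cur, ok := s.nodes[n.Path]
	if ok && cur.Version >= n.Version {
		// Stale or duplicate update: ignore it, so clients never
		// observe a version going backwards.
		return false
	}
	s.nodes[n.Path] = n
	return true
}

func main() {
	s := NewStore()
	fmt.Println(s.Apply(Node{Path: "/data/one", Value: []byte("v1"), Version: 1})) // true
	fmt.Println(s.Apply(Node{Path: "/data/one", Value: []byte("v2"), Version: 2})) // true
	fmt.Println(s.Apply(Node{Path: "/data/one", Value: []byte("v0"), Version: 1})) // false (stale)
}
```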
Replds instances synchronize updates between each other using a
gossip-like protocol, with periodic data exchanges. The current
implementation of this protocol is very simple and **highly**
inefficient: the periodic data exchange includes a representation of
the entire contents of the data store, so bandwidth usage scales with
the number of objects in the database! Also, due to the randomness
intrinsic to the gossip protocol, update propagation delays can only
be expressed statistically, so it is best to limit the number of
instances to a small value.
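
The sketch below (invented names, not the actual protocol messages)
shows the core of such an exchange: a peer that receives the full
path-to-version summary can work out which nodes it needs to fetch.
Note how the summary itself grows linearly with the number of objects
in the store.

```
package main

import "fmt"

// versions maps each node path to its current version. In a full-state
// gossip exchange this entire summary is sent on every round, which is
// why bandwidth grows with the number of objects in the store.
type versions map[string]uint64

// needFrom returns the paths for which the remote peer has a newer
// version than we do, i.e. the nodes we should fetch from it.
func needFrom(local, remote versions) []string {
	var paths []string
	for path, v := range remote {
		if local[path] < v {
			paths = append(paths, path)
		}
	}
	return paths
}

func main() {
	local := versions{"/data/one": 2, "/data/two": 1}
	remote := versions{"/data/one": 2, "/data/two": 3, "/data/sub/new": 1}
	// In replds this comparison would run periodically against a
	// randomly chosen peer instance.
	fmt.Println("need to fetch:", needFrom(local, remote))
}
```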
There is no concept of consensus, so writes will always succeed: the
service relies on internal synchronization to propagate the update to
all instances. In the presence of a network partition, clients may not
see these updates until the partition is resolved. This makes it
pretty much impossible for clients to implement read / modify / update
cycles, even with a single writer, unless they maintain local state
(possibly using a Watcher).

The instances store node data on disk, but keep all node metadata in
memory for speed, using a highly efficient radix tree (compressed trie)
data structure.
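
Purely as an illustration of why a compressed trie is a good fit (the
library used below is an assumption for the example, not necessarily
what replds uses), keying the in-memory metadata by path makes prefix
queries such as "everything under /data/sub" cheap:

```
package main

import (
	"fmt"

	"github.com/armon/go-radix"
)

// meta is a stand-in for per-node metadata (version, size, etc.).
type meta struct {
	Version uint64
}

func main() {
	// Index the in-memory metadata by path in a radix tree, so that
	// prefix walks only touch the relevant subtree.
	t := radix.New()
	t.Insert("/data/one", meta{Version: 3})
	t.Insert("/data/two", meta{Version: 1})
	t.Insert("/data/sub/one", meta{Version: 2})
	t.Insert("/data/sub/two", meta{Version: 5})

	// Walk only the nodes below /data/sub.
	t.WalkPrefix("/data/sub", func(path string, v interface{}) bool {
		fmt.Printf("%s -> version %d\n", path, v.(meta).Version)
		return false // keep walking
	})
}
```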
The maximum data size is limited by the inefficiency of the
synchronization protocol, and the requirement to keep all the node
metadata in memory at all times. The current implementation is also
not very smart in that it needs to hold the whole dataset (including
node data!) in memory when doing an initial synchronization pass, to
serialize it into protobufs. All this places limits on dataset size
both in terms of number of entities (say, less than 10k) and total
size of the data (a few GBs).
## API
The public replds API is a GRPC API, defined in
[proto/replds.proto](proto/replds.proto), and consists of two RPC
methods:
* *Store*, to upload data to the service. The "replds store" command
implements a simple command-line client for this method.
* *Watch*, for incremental synchronization of parts of the data
store. This is a long-lived streaming RPC that implements an
asynchronous replication protocol: the client sends a summary of
the initial state of its database, then listens for streaming
incremental updates. The "replds pull" command implements this
method.
## Deployment
In order to provide high availability, you're supposed to run multiple
instances of *replds server*. The cluster model is static, i.e. each
instance needs to be told explicitly about the others using the
*--peer* command-line option.
Clients can talk to any instance, both for read and write API calls.
### Authentication
Replds supports authentication, both of clients and server instances,
using mTLS (mutually-authenticated TLS). Clients, including server
instances when they talk to other server instances, need a valid TLS
client certificate, signed by a specific CA.
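
The sketch below is generic background on what a mutually-authenticated
TLS GRPC client looks like in Go; it does not describe replds's own
configuration flags, and all file names and the server address are
made up for illustration.

```
package main

import (
	"crypto/tls"
	"crypto/x509"
	"log"
	"os"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials"
)

func main() {
	// Client certificate and key, signed by the service CA
	// (hypothetical file names).
	cert, err := tls.LoadX509KeyPair("client.pem", "client.key")
	if err != nil {
		log.Fatal(err)
	}

	// CA used to verify the server certificates.
	caPEM, err := os.ReadFile("ca.pem")
	if err != nil {
		log.Fatal(err)
	}
	pool := x509.NewCertPool()
	pool.AppendCertsFromPEM(caPEM)

	creds := credentials.NewTLS(&tls.Config{
		Certificates: []tls.Certificate{cert},
		RootCAs:      pool,
	})

	// Dial an instance with mutually-authenticated TLS.
	conn, err := grpc.Dial("localhost:12100", grpc.WithTransportCredentials(creds))
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()
}
```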
Furthermore, the identity associated with the client certificate, in
the form of the Common Name (CN) part of the X509 subject, can be used
in *access control lists* (ACLs), to limit which clients can access
parts of the data store. In replds, an ACL consists of three parts:
* identity
* path
* operation (read / write)

When ACLs are defined, access is **denied by default**, and ACL rules
are used to allow specific accesses. Paths in ACL rules are prefixes:
an ACL for /foo will also allow access to /foo/bar, etc.

ACLs are configured in a text file, one rule per line, with the
following format:

> *identity* *path* *operation*

where *operation* can be either `read` or `write` (possibly abbreviated
to `r` or `w`). Pass this file to "replds server" using the *--acls*
option. Lines that start with a # are considered comments and ignored.
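
For example, a hypothetical ACL file (the identities and paths below
are invented for illustration) could look like this:

```
# The web frontend may only read credentials below /secrets/web.
web-frontend /secrets/web read
# The certificate automation can write anywhere below /secrets.
cert-manager /secrets w
```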
### Triggers
The "replds pull" command can execute user-defined triggers (shell
commands) when certain parts of the data store tree are updated. This
can be useful, for instance, to restart services with new credentials.

To load triggers, specify a directory with *--triggers-dir*. Replds
will load all files in this directory whose names do not contain a dot
(so no extensions should be used) as JSON-encoded files; each one
should be a dictionary with two attributes:

* *path*, specifying the path prefix that will activate this trigger;
* *command*, the shell command to execute.
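
As an illustration, a hypothetical trigger file (named, say,
`web-certs`, with no dot in the name; the path and command are invented
for the example) could contain:

```
{
  "path": "/secrets/web",
  "command": "systemctl reload nginx"
}
```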