Commit e533b12b authored by ale

Initial import from old repository

.vagrant/
ansible.log
*.retry
FLOAT
====
*float* is a configuration management toolkit to manage
container-based services on bare-metal hardware. It is implemented as
a series of Ansible plugins and roles that you should use from your
own Ansible configuration.
# Documentation
More detailed documentation about *float* is available in the *docs/*
subdirectory:
* [Guide to Ansible integration](docs/ansible.md)
* [Configuration reference](docs/configuration.md)
* [Infrastructure services](docs/service_mesh.md)
* [HTTP router](docs/http_router.md)
Ansible
=======
This document describes how the infrastructure uses Ansible, how to
use it as a user, and how to extend it as a service developer.
# Features
The toolkit is implemented as a set of Ansible plugins and roles,
meant to be integrated into your own Ansible configuration. These
plugins and roles augment the Ansible functionality in a few useful
ways:
### GPG integration
You can use GPG to encrypt files containing host and group
variables. Such files must have a `.yml.gpg` extension. Ansible will
then decrypt them at runtime (the use of gpg-agent is advised). The same
holds for the *ansible-vault* password file.
This is useful when you're using GPG to manage the root trust repository.
This functionality is implemented by the
[gpg_vars](../plugins/vars/gpg_vars.py) plugin.
### Automated credentials management
Service-level credentials are automatically managed by the toolkit and
are encrypted with *ansible-vault*. This includes an internal X509 PKI
for TLS service authentication, an SSH PKI for hosts, and application
credentials defined in *passwords.yml*. See the *Credentials* section
of the [configuration reference](configuration.md) for specific
details.
All autogenerated credentials are stored in the *credentials_dir*
specified in the top-level Float configuration, which will usually
point at a separate git repository (or a temporary directory for test
environments).
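As a purely illustrative sketch of what a *passwords.yml* entry might
look like (the field names here are assumptions made for the example;
the authoritative schema is in the
[configuration reference](configuration.md)):
```
# Hypothetical passwords.yml entry (field names are illustrative).
- name: myservice_db_password
  description: Password for the myservice database account
  type: simple
  length: 32
```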
### Integration of services and Ansible roles
The toolkit defines Ansible host groups for each service, to make it
easy to customize services with Ansible roles. For instance, suppose
that a service *foo* is defined in *services.yml*, and you have
created your own *foo* Ansible role to go with it (with some
foo-specific host setup). You can then tie the two together in your
playbook by making use of the *foo* host group:
```
- hosts: foo
  roles:
    - foo
```
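For context, the *foo* service itself would be declared in
*services.yml*. A minimal, hedged sketch follows; attribute names such
as `num_instances`, `containers`, `image` and `port` are assumptions to
be checked against the [configuration reference](configuration.md):
```
# Hypothetical services.yml entry for the "foo" service.
foo:
  num_instances: 2
  containers:
    - name: foo
      image: registry.example.com/myproject/foo:latest
      port: 1234
```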
# Usage
The toolkit lets you define container-based services, which may not
require any configuration beyond the service definition, as well as
Ansible-based services. It does so primarily by taking over the
Ansible inventory (see the [configuration reference](configuration.md)
for details). It is expected that you will have your own Ansible
configuration, with service-specific roles and configuration,
extending the base roles and making use of the toolkit's features.
So, to use the toolkit, you will have to include it from your own
Ansible configuration, specifying the inventory and service
configuration in its own format.
There are some (minimal) requirements on how your Ansible environment
should be set up for this to work:
* you must have a *group_vars/all* directory (this is where we'll
write the autogenerated application credentials)
* you must include *playbooks/all.yml* from the toolkit source
directory at the beginning of your playbook
* you must use the *run-playbook* wrapper instead of running
*ansible-playbook*
## Ansible environment setup how-to
Let's walk through creating an example Ansible configuration for your
project.
First, check out the base *float* repository somewhere. We'll store
everything related to this project below a top-level directory called
*~/myproject*. We'll put the *float* repository in the *float*
subdirectory.
    $ mkdir ~/myproject
    $ git clone ... ~/myproject/float
Let's create the directory with our own Ansible configuration in the
*ansible* subdirectory:
    $ mkdir ~/myproject/ansible
And put a top-level Ansible configuration file (*ansible.cfg*) in
there that refers to the toolkit repository location:
```
$ cat >ansible.cfg <<EOF
[defaults]
roles_path = ../float/roles:./roles
inventory_plugins = ../float/plugins/inventory
action_plugins = ../float/plugins/action
vars_plugins = ../float/plugins/vars
[inventory]
enable_plugins = float
EOF
```
This will look for plugins and base roles in *~/myproject/float*, and
it will load our own Ansible roles and config from
*~/myproject/ansible*.
We're going to need a place to store global configuration. Since
*float* requires a *group_vars/all* directory anyway, we can use that
and put some global variables in
*~/myproject/ansible/group_vars/all/config.yml*:
```
$ mkdir -p ~/myproject/ansible/group_vars/all
$ cat > ~/myproject/ansible/group_vars/all/config.yml <<EOF
---
domain: internal.myproject.org
domain_public:
- myproject.org
EOF
```
Then you can create the main configuration file (*float.yml*), the
host inventory (*hosts.yml*), and the service definition
(*services.yml*). Check out the [configuration
reference](configuration.md) for details.
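As a rough, hedged sketch only (the key and attribute names below are
assumptions made for illustration; the
[configuration reference](configuration.md) is authoritative), these
files could look something like:
```
# Hypothetical float.yml -- points at the other configuration files.
services_file: services.yml
hosts_file: hosts.yml
passwords_file: passwords.yml
credentials_dir: ../credentials

# Hypothetical hosts.yml -- a single host in the "frontend" group.
hosts:
  host1:
    ip: 192.168.10.10
    groups: [frontend]
```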
Finally, we are going to set up a basic playbook in *site.yml* that
will just run all the playbooks in the main repository:
```
---
- import_playbook: ../float/playbooks/all.yml
```
Now you can create your own service-specific Ansible configuration and
roles based on this skeleton.
## Running playbooks
The *run-playbook* wrapper makes some necessary fixes to the
environment and invokes *ansible-playbook* with the same command-line
arguments, so you should use it whenever you would use
*ansible-playbook*.
The ansible-vault setup is mandatory, so you are going to have to pass
the location of the ansible-vault encryption passphrase to Ansible via
the environment. Just use the `ANSIBLE_VAULT_PASSWORD_FILE` variable
as you normally would, with one additional feature: if the filename
ends in *.gpg*, the passphrase will be decrypted using GPG.
Continuing the previous example:
```
$ echo secret > vault-pw
$ ANSIBLE_VAULT_PASSWORD_FILE=vault-pw \
../float/run-playbook -i config.yml site.yml
```
## Initialize the permanent credentials
Before you can run Ansible to set up the services in your config,
there is one more step that needs to be done. In order to bootstrap
the internal PKIs, and generate the application credentials (which are
also valid forever, or until revoked of course), you need to invoke
the playbook in *playbooks/init-credentials.yml*:
```
$ ANSIBLE_VAULT_PASSWORD_FILE=vault-pw \
../float/run-playbook -i config.yml ../float/playbooks/init-credentials.yml
```
This will write a bunch of files in your *credentials_dir*, including
the private keys for the various PKIs (X509, SSH, etc), and a
*secrets.yml* file containing the autogenerated application
credentials.
These files must, of course, be kept private when setting up a
production environment.
## Credentials
The system uses two major types of credentials:
* *managed* credentials for accounts, services, etc. - these are
generated automatically, based on the top-level description in
*passwords.yml*. They are stored encrypted with *ansible-vault*.
* *root* credentials, that need to be provided externally, including
for instance the *ansible-vault* password used for managed
credentials, and other third-party secrets (like credentials for a
private Docker registry, etc).
All services that require per-host secrets, such as SSH and the
internal X509 PKI, manage those secrets directly on the hosts
themselves, renewing them when necessary. Those secrets are not stored
in a central location.
This means that for normal usage (i.e. except when new credentials are
added), the credentials repository is read-only, which makes it easier
to integrate deployment with CI systems.
The expectation is that, for production environments, the credentials
repository will be kept in a private git repository. Temporary setups
such as test environments, on the other hand, need no persistence, as
managed credentials can simply be re-created every time.
# Implementation details
These are details of how parts of the infrastructure are implemented,
useful if you want to understand how a service is deployed, or how to
write a new one.
## Scheduler
The *float* service scheduler sets a large number of host variables and
global configuration parameters. In the Ansible host scope, the
following variables will be defined:
* `services` holds all the service metadata, in a dictionary indexed
by service name;
* `service_assignments` is a {service: [hosts]} dictionary with all
the service assignments;
* `enable_<service>`, defined for each service, evaluates to true on
the hosts assigned to that service (note: dashes in the service name
are converted to underscores);
* `enabled_services` contains the list of enabled services on this
host;
* `disabled_services` contains the list of disabled services on this host;
* `enabled_containers` contains a list of dictionaries describing the
containers that should be active on this host. The dictionaries have
the following attributes:
* `service` is the service metadata
* `container` is the container metadata
* `disabled_containers` has the same information as the above, but for
disabled containers;
* `<service>_master` is true on the host where the master instance is
scheduled, and false elsewhere. This variable is only defined for
services using static master election (i.e. where *master_election*
is true in the service metadata).
The scheduler also defines new dynamic Ansible groups based on
service assignments:
* For each service, create a host group named after the service, whose
members are the hosts assigned to the service.
* For each network overlay defined in the inventory, create a host
group named `overlay-<name>` whose members are the hosts on that
overlay.
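As a small illustration of how these variables and groups can be used
from your own playbooks (the service name *foo* and the path below are
made up for the example):
```
# Run only on hosts where the scheduler assigned the "foo" service.
- hosts: foo
  tasks:
    - name: Create the foo data directory
      file:
        path: /var/lib/foo
        state: directory
      when: enable_foo
```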
# Writing new service roles
Public HTTP Router
==================
The infrastructure provides a way for HTTP-based services to expose
themselves to the public Internet by
defining
[public endpoints](configuration.md#Global_HTTP_routing). This is done
via the so-called *public HTTP router*, which is simply a NGINX
reverse-proxy installation that is automatically configured based on
service metadata.
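As a hedged example of what such a definition might look like in
*services.yml* (the attribute names are assumptions here; the
authoritative list is in the
[configuration reference](configuration.md#Global_HTTP_routing)):
```
# Hypothetical service exposing an HTTP endpoint through the public router.
www:
  containers:
    - name: www
      image: registry.example.com/myproject/www:latest
      port: 8080
  public_endpoints:
    - name: www        # served as www.<public domain>
      port: 8080
      scheme: http
```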
The clients of this service are users (or, generically, external
clients), not other services: services should instead talk directly to
each other.
The public HTTP router will force all incoming requests to HTTPS.
## SSL Certificates
The public HTTP router will automatically generate SSL certificates
for the required domain names. It will generate self-signed
certificates on the first install, and then switch to using Let's
Encrypt in production environments.
## Cache
A global HTTP cache is available for services that require it. Its
location is */var/cache/nginx*, and it can be configured with the
following variables:
* `nginx_cache_keys_mem` is the memory size of the key buffer;
* `nginx_cache_fs_size` is the maximum on-disk size of the cache (note
that nginx might use as much as twice the amount specified here,
depending on the expiration policy).
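These are ordinary Ansible variables, so they can be set for instance
in *group_vars/all/config.yml*; the values below are only illustrative:
```
# Illustrative cache sizing for the public HTTP router.
nginx_cache_keys_mem: 64m
nginx_cache_fs_size: 10g
```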
Service mesh
============
The *service mesh* is a fancy name for the layer of glue that allows
services to find and talk to each other. Ours offers the following
features:
* The ability to set up *overlay* networks to isolate
service-to-service traffic from the public Internet.
* Services find each other with DNS A / AAAA lookups, so the client
must know the target port. As a consequence, each service must use a
globally unique port. This also implies that it's impossible to
schedule more than one instance of a service on each host.
* DNS views are used to provide topology-aware service resolution, so
that hosts sharing a network overlay will route service requests
over that network.
* Connections between services are direct, not mediated by proxies, so
there is no global load balancing and clients are expected to keep
track of the state of backends and implement retry policies.
* Services can securely authenticate each other by using credentials
automatically provided by the service mesh.
# Naming
Services are identified by their *name*, an alphanumeric string (it
can also include a dash '-' character).
All DNS entries are served under an internal domain, referred to below
as *domain*.
Every service has a number of *backends*, each pointing at a specific
service instance; a backend is identified simply by the IP address of
the host the instance is scheduled on. The specific address returned
depends on whether the source and target hosts share a network
overlay; if they do not, it will be the public IP address of the
backend host.
### Partitioned services
Patterns of access to backends vary depending on the service itself:
for instance, with services that are replicated for high-availability,
the client usually does not care which backend it talks to. In other
cases, such as with *partitioned* services, clients need to identify
individual backends. There are two ways of doing so:
Backends are numbered sequentially in the range *0*...*N*, but this
numbering is not permanently stable: every time the instances are
rescheduled to different hosts, the index assignments might
change. This type of identifier is useful to find a single backend for
a service, when you don't care about long-term consistency (a global,
partitioned cache, for instance).
Backends can also have permanent *shard* identifiers, that identify a
specific backend host, and that do not change on reschedules. These
are useful when a service is partitioned across multiple backends and
the hosts have state or data associated with them. A shard identifier
is an alphanumeric literal specific to the host (it's best not to use
plain numbers, as they would conflict with the indexed notation).
### DNS queries
The following DNS queries are supported:
*service.domain*
Returns one or more backend addresses for the service. This, as with
all other queries, refers to the *configured* backends, not
necessarily the ones that are alive at the time of the query: the
client is expected to do its own health checking. Note also that the
list may be incomplete, if the number of backends is large, due to
limitations of the DNS protocol; we can expect though that a
sufficient number of backends will be returned for high availability
purposes.
Long-running clients should periodically (say, every minute) re-check
the results of the DNS query to identify new and expired backends.
*index.service.domain*
Will return the address of the backend number *index*.
*shard.service.domain*
Will return the address of the backend shard named *shard*.
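For example, with a service *db* under the internal domain
*internal.myproject.org*, `db.internal.myproject.org` resolves to the
configured backends, `0.db.internal.myproject.org` to the backend
currently holding index 0, and `alpha.db.internal.myproject.org` to the
backend whose shard identifier is *alpha*.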
# Mutual authentication
Service communication should be encrypted, and communicating services
should authenticate each other. The standard way to do this is with
TLS as the transport layer. The service mesh provides its own *service
PKI* to automatically generate X509 credentials for all services.
The X509 certificates are deployed on the host filesystem, and access
to them is controlled via UNIX permissions. This provides an
attestation of UNIX identity across the whole infrastructure.
Each service, in *services.yml*, can define multiple credentials, each
with its own name and attributes: this can be useful for complex
services with multiple processes, but in most cases there will be just
a single credential, with the same name as the service.
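In *services.yml* this typically amounts to a short stanza along these
lines (a hedged sketch; verify the attribute names against the
[configuration](configuration.md#Credentials) page):
```
# Hypothetical credentials definition for the "foo" service.
foo:
  service_credentials:
    - name: foo
```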
Credentials are saved below `/etc/credentials/x509/<name>`, with
the following structure:
    /etc/credentials/x509/<name>/
        +-- ca.pem                CA certificate for the service PKI
        +-- client/
        |   +-- cert.pem          Client certificate
        |   \-- private_key.pem   Client private key
        \-- server/
            +-- cert.pem          Server certificate
            \-- private_key.pem   Server private key
Private keys are stored unencrypted, and are only readable by the
`<name>-credentials` group. The user that the service runs as must be
a member of this group.
Server certificates will include all the names and IP addresses that
service backends are reachable as. This includes:
* *name.domain*
* *name*
* *index.name.domain*
* *shard.name.domain* (if present)
* localhost
* *hostname*
* *fqdn*
* all public IP addresses of the host
* all IP addresses of the host on its network overlays
The purpose is to pass server name validation on the largest number of
clients possible, without forcing a specific implementation.
Client certificates have the following names:
* *name.domain*
* *name*
The service configuration for credentials is described in the
[configuration](configuration.md#Credentials) page.
# Network overlays
It is possible to define internal networks that span multiple hosts,
called *overlays*, which can then be used for service traffic,
ignoring the actual underlying network topology.
For now, only a single IPv4 address can be assigned to a host on each
overlay. In the future, it should be possible to assign an entire
subnet, so that individual IPs will be available to services.
The current implementation uses [tinc](https://www.tinc-vpn.org/) and
sets up a fully connected mesh.
See the [configuration](configuration.md#Host_variables) page for
details on the host configuration required to enable network overlays.
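As a hedged illustration only (the actual host attribute names are
documented in the [configuration](configuration.md#Host_variables)
page), enabling an overlay for a host might amount to assigning it an
address on that overlay in *hosts.yml*:
```
# Hypothetical host with an address on the "vpn0" network overlay.
hosts:
  host1:
    ip: 1.2.3.4          # public address
    ip_vpn0: 10.42.0.1   # address on the "vpn0" overlay
```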
# Usage
## Server implementation
Servers should use TLS 1.2, and they should require clients to provide
a certificate and validate that it is signed by the CA. Since
credentials are ECDSA certificates, servers should at least support
the ECDHE\_ECDSA\_WITH\_AES\_128\_GCM\_SHA256 cipher suite.
For authenticating the client, servers can look at the client
certificate subject, and apply ACLs based on it.
A reference HTTPS server implementation for Go is provided in the
[git.autistici.org/ai3/go-common/serverutil](https://git.autistici.org/ai3/go-common/blob/master/serverutil/) package.
HTTP servers should apply backpressure (when they detect overload) by
returning responses with HTTP status 429.
## Client implementation
Since the infrastructure provides little in terms of traffic control,
clients should be smart and well-behaved: they should expect failures
and handle them gracefully. At the very least this implies:
* Clients should use DNS to fetch the backend(s). The results should
be refreshed periodically to detect new and expired backends. This
is important, so that you don't need to restart all the clients when
a service is rescheduled.
* All outbound requests must have deadlines, and handle timeout
errors, so that a failure to reach a backend does not cause requests
to pile up indefinitely.
* Clients should retry failed requests as many times as necessary,
within the deadline, using exponential back-off. Retriable failures
should include transport-level errors and backpressure errors at the
application level.
Optionally, clients could implement an error-avoiding backend
selection policy, to minimize the latency impact of long-running
failures.
### Go
A reference HTTP(S) client implementation for Go is provided in the
[git.autistici.org/ai3/go-common/clientutil](https://git.autistici.org/ai3/go-common/blob/master/clientutil/) package.
### Python
A robust Python HTTPS client could be something as simple as this
(using the [requests](http://python-requests.org/)
and [backoff](https://pypi.python.org/pypi/backoff) packages):
```python
import json

import backoff
import requests


class StatusError(Exception):
    def __init__(self, code):
        self.code = code
        super(StatusError, self).__init__('HTTP status %d' % code)


class RetriableStatusError(StatusError):
    pass


# Retry on connection errors, timeouts, and retriable HTTP statuses,
# with exponential back-off and a bounded number of attempts.
@backoff.on_exception(backoff.expo,
                      (requests.exceptions.Timeout,
                       requests.exceptions.ConnectionError,
                       RetriableStatusError),
                      max_tries=10)
def json_request(uri, data, ssl_cert, ssl_key, ssl_ca, timeout=10):
    session = requests.Session()
    session.cert = (ssl_cert, ssl_key)
    session.verify = ssl_ca
    req = session.prepare_request(requests.Request(
        'POST',
        uri,
        data=json.dumps(data),
        headers={'Content-Type': 'application/json'},
    ))
    resp = session.send(req, timeout=timeout)
    # Treat backpressure (429) and server errors as retriable.
    if resp.status_code == 429 or resp.status_code >= 500:
        raise RetriableStatusError(resp.status_code)
    elif resp.status_code != 200:
        raise StatusError(resp.status_code)
    return resp.json()
```
---
# Setup base infra.
- import_playbook: infra.yml

- hosts: ldap
  roles:
    - ldap

- hosts: sso-server
  become: true
  roles:
    - sso-server
---
# Playbook that sets up the basic infrastructure.
- hosts: all
  roles:
    - base
    - credentials
    - vagrant-compat

- hosts: net-overlay
  roles:
    - net-overlay

- hosts: all
  roles:
    - docker

- hosts: nginx
  roles:
    - nginx

- hosts: frontend
  roles:
    - dns

- hosts: prometheus
  roles:
    - prometheus

- hosts: log-collector
  roles:
    - log-collector
---
# Generate *all* local credentials at once, on localhost.
# This is better than having tasks with 'local_action' and 'run_once',
# mostly because the semantics of 'run_once' are odd (the task may be
# executed multiple times, depending on grouping done by the Ansible
# scheduler). But it is a bit less elegant than automatically creating
# what is necessary close to where it is used... on the other hand,
# this closeness is already missing for services like SSO where the
# public and secret parts are handled by separate roles.

- hosts: localhost
  gather_facts: no
  tasks:

    # Create the paths below credentials_dir that we're going to use.
    - name: "Create paths below {{ credentials_dir }}"
      file:
        path: "{{ credentials_dir }}/{{ item }}"
        state: directory
      with_items:
        - ssh
        - sso
        - vars
        - x509

    # First of all, generate secrets from the passwords.yml file.
    - name: Initialize secrets
      local_action: command ../scripts/pwgen.py --vars "{{ credentials_dir }}/secrets.yml" "{{ passwords_file }}"

    - name: Link secrets.yml from the vars directory
      file:
        src: "{{ credentials_dir }}/secrets.yml"
        dest: "{{ vars_dir }}/secrets.yml"
        state: link

    # Generate the SSH CA.
    - name: Generate SSH CA
      local_action: sshca ca="{{ credentials_dir }}/ssh/key"

    # Generate the SSO ED25519 key pair.
    - name: Generate SSO credentials
      local_action: ed25519 privkey="{{ credentials_dir }}/sso/secret.key" pubkey="{{ credentials_dir }}/sso/public.key"

    # Generate all the X509 service credentials. The first time this
    # runs, the service CA will be initialized too.
    #- name: Generate X509 credentials for all services
    #  local_action: x509 ca_root="{{ credentials_dir }}/x509" ca_subject="{{ x509_ca_subject | default('') }}" domain="{{ domain }}"

    - name: Generate global DH params
      local_action: command openssl dhparam -out "{{ credentials_dir }}/x509/dhparam" "{{ dhparam_bits | default('2048') }}" creates="{{ credentials_dir }}/x509/dhparam"

    - name: Generate the X509 CA certificate
      local_action: x509_ca ca_subject="{{ x509_ca_subject | default('') }}" ca_cert_path="{{ credentials_dir }}/x509/ca.pem" ca_key_path="{{ credentials_dir }}/x509/ca_private_key.pem"
# Automatically create ED25519 credentials for services.
import contextlib
import os