Commit ee1a743e authored by ale
Add some Ansible coding guidelines

parent 2f18b762
@@ -17,14 +17,15 @@ surface between itself and Ansible as small as possible.
* We use Ansible to copy files in place, possibly using
templates. This is the fundamental feature of all configuration
-management systems. Most of our file copies are hooked into some
-sort of restart-the-associated-service loop.
+management systems. Most of our file copies are hooked with some
+kind of restart-the-associated-service task.
* We use a lot of verbose Ansible code to run loops over data
structures (credentials, host / DNS configs, etc), accounting for a
significant part of code bloat and execution time. This is a
particularly bad spot for float/Ansible integration, but the
-underlying algorithms are very simple (just nested loops).
+underlying algorithms are very simple (just nested loops). Some of
+these are slowly being unrolled using functional filters.
* We use a pattern of remote execution + data transfers to sign
credentials without the private key ever leaving the hosts. This is
@@ -32,17 +33,53 @@ surface between itself and Ansible as small as possible.
but the workflow is similar in all cases: generate private key on
the host, sign public key on the Ansible host, transfer the results;
the advantage is that the private key material never moves around.
+This logic represents the vast majority of our custom Python code,
+and it's largely the same functionality repeated for different CA
+models (X509, SSH).
* Some services require us to refer to other hosts by IP address
rather than just by name, so we access other hosts' metadata
variables (facts, which are part of the inventory).
-* In a couple of places we run a global collection pass in order to
-use the results in a subsequent Ansible task: for instance when
-collecting all *tinc* public keys to generate the tinc configuration
-(every host needs all the other peers' public keys). This pattern of
-accessing non-inventory facts about other hosts is quite problematic
-due to the requirement of all the hosts being up (or using a
-persistent cache for facts), and it is highly discouraged. There are
-probably alternative solutions that could be explored (like storing
-the tinc public keys in the repository), on a case-by-case basis.
+* We maintain a minimal dependency on "discovered" ansible_facts as a
+way to decouple the configuration templates from Ansible itself as
+much as possible.
+* There's [still one place where we run a global collection
+pass](https://git.autistici.org/ai3/float/-/issues/86) in order to
+use the results in a subsequent Ansible task: when collecting all
+*tinc* public keys to generate the tinc configuration (every host
+needs all the other peers' public keys). This pattern of accessing
+non-inventory facts about other hosts is quite problematic due to
+the requirement of all the hosts being up (or using a persistent
+cache for facts), and it is highly discouraged. There are probably
+alternative solutions that could be explored (like storing the tinc
+public keys in the repository), on a case-by-case basis.
+Also, some float limitations are directly derived from Ansible
+characteristics:
+* Float requires you to manually write a playbook tying your service
+group to your Ansible role, because it is currently impossible to
+parameterize the *import_role* Ansible directive.
+* We can't meaningfully use role dependencies (at least not the way
+the roles are currently written), because Ansible considers
+dependencies to be local to each playbook, so if every role depends
+on *float-base*, it will end up being run as many times as there are
+services.
+## Advice for writing Ansible roles for services
+If one is running services on float, and wishes to retain some level
+of flexibility to switch to other mechanisms in the future, here are
+some guidelines that could be followed to maximize that chance:
+* Do not use discovered *ansible_facts*, especially not from other
+hosts, but use facts from the host inventory instead. If you need
+information about hosts to be available to service configurations,
+define additional host attributes in the inventory.
+* Only use simple Ansible task modules such as "copy", "template",
+"file" and possibly "systemd" to restart containers. If you're using
+a container-based service deployment style, there should be no need
+for anything else. These types of tasks, along with their usually
+minimal logic, should be relatively simple to translate to other
+environments.
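
As a rough illustration of the guidelines added in this commit, the sketch below shows a role that relies only on an inventory attribute and on the `template` and `systemd` modules, plus the hand-written playbook that ties a service group to its role. Every name in it is hypothetical and not taken from float: the `myservice` group and role, the `datacenter` host attribute, the file paths and the `myservice.service` unit. The inventory uses the plain Ansible YAML inventory layout, which may differ from the layout float expects.

```yaml
# inventory/hosts.yml (assumed layout: plain Ansible YAML inventory)
all:
  children:
    myservice:
      hosts:
        host1.example.com:
          # Extra host attribute defined by hand in the inventory,
          # used instead of a discovered ansible_facts value.
          datacenter: dc1
---
# roles/myservice/tasks/main.yml (hypothetical role)
- name: Install the myservice configuration
  template:
    src: myservice.conf.j2   # the template may reference {{ datacenter }}
    dest: /etc/myservice/myservice.conf
    mode: "0644"
  notify: restart myservice
---
# roles/myservice/handlers/main.yml
- name: restart myservice
  systemd:
    name: myservice.service  # assumed unit name
    state: restarted
---
# playbook.yml: ties the service group to the role by hand
- hosts: myservice
  roles:
    - myservice
```

Keeping the restart in a handler means the unit is restarted only when the rendered file actually changes, which is the restart-the-associated-service pattern described earlier in the document.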