    minicluster

    An attempt at defining a minimum viable infrastructure for simple distributed services. It aims to provide the following features:

    • support modern service-oriented architecture
    • enforce robust security standards
    • require few overhead resources

    while taking into account the following limitations:

    • sacrifice performance for simplicity if necessary, so that a human can understand and operate it
    • tolerate higher latencies (for global deployments)

    These decisions obviously shape the type of application that can be built on top of this infrastructure, hence the simple qualifier above. The target is small self-managed projects, rather than multi-datacenter, 10K-node compute clusters.

    Overview

    It is assumed that you want to combine a set of identical machines (nodes) into a coherent entity (cluster) that provides a service. The provisioning of these machines is not covered. Machines being identical means they should have no distinguishing features to tell them apart, so their role assignments can be changed dynamically.

    Also, you should have a domain name to use for the purpose of running the infrastructure itself. This in most cases will be separate from the domain you intend to run the public application on.

    Service interface

    The applications built on this infrastructure will themselves consist of various services that talk to each other and to users via some kind of RPC.

    The infrastructure defines a well-specified interface for the application's services, which consists of a series of APIs:

    • service discovery
      • the ability to find endpoints for other services
      • the ability for a service to declare itself available
    • credentials management
      • the ability to authenticate from/to other services
      • the ability to authenticate users (if desired)
    • coordination primitives
      • master election
      • global (small) runtime data store
    • monitoring
      • centralized collection of metrics
    • logging
      • centralized collection of service logs

    Scheduling

    One of the points of infrastructure as a service is to allow easy horizontal scaling. The issue with that, however, is that not everything scales proportionally. Consider, for instance, consensus-based services: it isn't practical to grow these beyond a small odd number of instances (3, 5, ...), chosen based on the desired fault tolerance. Another case is master/standby setups.

    The allocation of service instances is usually controlled by an infrastructure component called a scheduler (Kubernetes is a good example), which quickly reacts at runtime to changes in node and service availability as needed to maintain the desired constraints. Schedulers are usually large and complex pieces of software.

    There is currently no runtime scheduler in this implementation. Instead, there is a simpler alternative, consisting of two parts:

    • all consensus-related services (all the core infrastructure services) run on a well-defined subset of cluster nodes with a fixed size chosen at configuration time. Any other nodes will not participate in the cluster consensus, but they will simply act as clients to the core infrastructure services.

    • a configuration-time service scheduler allocates services that need to run with a limited number of instances. These assignments only change when the nodes in the cluster change.

    The trade-off implied by this approach is that reacting to node failures is a manual administrative operation, as is a change in the cluster footprint (new nodes being added or removed). While this is a significant limitation, it is still possible to build reliable services this way, for instance with master election protocols using the provided coordination infrastructure.

    Service discovery

    There are two main mechanisms available for service discovery, depending on context. Let's break down the major use cases:

    • Find service backends at configuration time, based for example on membership in a specific host group. This is important, among other things, for monitoring, where an authoritative list of expected backends is required for presence alerts.

      Static service discovery is available in Ansible using global variables and groups:

        groups['foo-service']

      will contain the full list of hosts running a specific service (in this case, of course, the port must be known separately). Further attributes about those hosts can be accessed via the Ansible fact repository; see the sketch after this list.

    • Find service backends dynamically at runtime, based on availability. This will use some form of presence lookup (currently implemented with etcd/servicereg/SkyDNS), and it's normally used when you need to find at least one available backend for an RPC.

      Dynamic service discovery currently uses DNS: lookups for A or SRV records will return the list of available backends, and clients can then apply their own desired assignment policy.
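
    As a minimal sketch of the static mechanism (the foo-service group name and the port 1234 are purely illustrative), a playbook task could turn the group membership into a list of backend addresses:

    # Hedged sketch: build "host:1234" entries from the statically discovered
    # members of the foo-service group (the group is created by the scheduler).
    - name: Build the foo-service backend list
      set_fact:
        foo_backends: "{{ groups['foo-service'] | map('regex_replace', '$', ':1234') | list }}"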

    Service credentials

    Since service-to-service communication potentially goes over the open Internet, it needs to be encrypted and authenticated. The first step is to provide an X509 infrastructure for a trusted transport layer.

    X509 certificates

    The X509 infrastructure provides a pair of long-term certificates for each service (client and server usage are separate). Certificates will be signed by an offline CA (colocated with the Ansible host).

    The certificates for a given service can be found here:

    /etc/credentials/x509/service/client/cert.pem
    /etc/credentials/x509/service/client/private_key.pem
    /etc/credentials/x509/service/server/cert.pem
    /etc/credentials/x509/service/server/private_key.pem

    The public CA certificate will be available at:

    /etc/credentials/x509/ca.pem

    Client certificates must have a subject CN of the form 'service@host'. Server certificates must include the following SANs:

    • service.domain (name used for service discovery)
    • short hostname
    • fully qualified hostname
    • all IP addresses of the host
    • localhost

    This ensures that client-side validation will succeed.

    Certificate access on the local machine is controlled with standard UNIX access controls. The private keys will only be readable by the user the service runs as.

    For legacy servers, or for those cases where authorization is handled separately (OAuth, etc), it is sufficient to verify that client certificates are signed by the CA. Alternatively, servers can use the client certificate's subject as a trusted attestation of the peer's identity.

    OAuth2

    It is also possible to automatically create OAuth 2.0 clients for services that use client credentials to authenticate to the IDP and interact with user credentials.

    Usage

    Requirements

    You'll need the x509ca tool installed on the machine where you run Ansible, in order to generate SSL certificates. Install it with:

    $ sudo apt-get install golang-go
    $ go get git.autistici.org/ale/x509ca

    Then make sure that $GOPATH/bin is in your PATH.

    Ansible setup

    To integrate minicluster with your own Ansible configuration, you'll need to tell Ansible where to find it. Add this to your ansible.cfg file:

    [defaults]
    library = /path/to/ansible-minicluster/library
    roles_path = /path/to/ansible-minicluster/roles:./roles
    hostfile = /path/to/ansible-minicluster/inventory.py

    Note that the hostfile points at the dynamic inventory script, which in turn will look for a hostfile named hosts in the current directory. The hostfile should define at least a core group.

    Your site.yml Ansible playbook file should make sure that the core group of hosts gets assigned the core role. One way to do this explicitly is:

    - hosts: core
      become: true
      roles:
      - core

    All other hosts should include a dependency on the common role.
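
    A minimal sketch of this, assuming you apply common to every non-core host via a standard Ansible host pattern:

    - hosts: "all:!core"
      become: true
      roles:
      - common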

    Put your cluster configuration in your group_vars/all file:

    domain: infra.example.com
    domain_public: example.com
    ...

    Finally, and most importantly, cluster services should be documented in the services.yml file. This contains the high-level metadata about each service running in the cluster. Examples of such metadata include scheduling details, monitoring information, service credentials, and more.

    With the above setup, running

    $ ansible-playbook site.yml

    will set up the minicluster on your hosts.

    Storing secrets

    The Ansible roles generate a number of secrets, from service credentials to public SSL certificates and SSH keys. These are stored locally, on the host running Ansible, and distributed to the other hosts as required.

    By default, such secrets will be saved in the private subdirectory of your top-level Ansible config path. This can be configured by setting the private_dir variable (for example in your group_vars/all file). It is recommended to point it at a directory outside of your Ansible configuration repository, for instance a local copy of a separate encrypted git repository.
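
    For example (the path below is just an illustration), in your group_vars/all file:

    private_dir: /path/to/secrets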

    Writing a new service configuration

    There are two options for adding a service to the cluster:

    • For services which can be considered part of the basic feature set of the cluster, or that require a high level of system integration, it is best to use an Ansible-based setup;

    • Services that only use the cluster interfaces and are part of the application level should probably use a container-based setup. In this case, all configuration is done via services.yml, and there's no need for any Ansible-specific knowledge.

    In both cases, you will need to create a services.yml file.

    Service metadata

    The service metadata file is a dictionary encoded in YAML format, where keys are service names and values contain the associated metadata. This file is consumed by the static service scheduler that assigns services to hosts.

    The scheduler dynamically modifies the Ansible inventory in two ways; for an example service myservice, it will:

    • create a host group named myservice, for the purpose of static service discovery;
    • define a boolean variable enable_myservice on all hosts, which is True only on those hosts the service has been allocated to. This can be used for service turndowns when the service allocations change.

    Metadata for services that are part of the core infrastructure ships embedded with the software, so when writing your own services.yml file, you only need to add your services to it.

    Supported attributes include:

    scheduling_group: Only schedule the service on hosts of the specified host group. By default, schedule on all hosts.

    num_instances: Run a limited number of instances of the service (selected among the hosts identified by the scheduling_group). By default this is set to all, which will run an instance on every host.

    service_credentials: A list of dictionaries, one for each service credential that should be generated for this service.

    monitoring_endpoints: List of monitoring endpoints exported by the service.

    public_endpoints: List of endpoints exported by the service that should be made available to end users via the service HTTP router.

    containers: List of containerized instances that make up the service (for container-based services).
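
    As a rough sketch of how these attributes fit together (all names, groups and ports below are illustrative; the individual attributes are detailed in the following sections):

    myservice:
      scheduling_group: backend
      num_instances: 3
      service_credentials:
        - name: myservice
      monitoring_endpoints:
        - port: 7070
      public_endpoints:
        - name: myservice
          type: static
          port: 1234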

    service_credentials

    Each credential object supports the following attributes:

    name (mandatory): Name for this set of credentials, usually the same as the service name. Certificates will be stored in a directory with this name below /etc/credentials/x509.

    enable_client: Whether to generate a client certificate (true by default).

    client_cert_mode: Key usage bits to set on the client certificate. One of client, server, or both; the default is client.

    enable_server: Whether to generate a server certificate (true by default).

    server_cert_mode: Key usage bits to set on the server certificate. One of client, server, or both; the default is server.

    extra_san: Additional DNS domains to add as subjectAltName fields in the generated server certificate. This should be a list. The internal domain name will be appended to all entries.
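
    A sketch of a credential entry, as a fragment of a service definition in services.yml, for a server-only service with one extra DNS alias (the alias name is hypothetical; the internal domain is appended automatically):

    service_credentials:
      - name: myservice
        enable_client: false
        extra_san:
          - myservice-alias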

    monitoring_endpoints

    Each element in the monitoring endpoints list can have the following attributes:

    job_name: Job name in Prometheus, defaults to the service name.

    type: Selects the service discovery mechanism used by Prometheus to find the service endpoints. One of static or dynamic.

    port: Port where the /metrics endpoint is exported.

    scheme: HTTP scheme for the service endpoint. The default is https.
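
    For example (the port and job name are illustrative), a service exporting its /metrics endpoint over plain HTTP on port 7070 could declare:

    monitoring_endpoints:
      - job_name: myservice
        type: dynamic
        port: 7070
        scheme: http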

    public_endpoints

    Elements in the public endpoints list can have the following attributes:

    name: Public name of the service. This can be different from the service name, for instance you might want to export the internal prometheus service as monitoring under the user-facing external domain name.

    type: Selects the service discovery mechanism to be used to find the service endpoints. One of static or dynamic.

    port: Port where the service is running.

    scheme: HTTP scheme for the service endpoint. The default is https.

    containers

    Each object in this list represents a containerized image to be run. Supported attributes include:

    name: Name of the container. It is possible to have containers with the same name in different services.

    image: Which Docker image to run.

    port: Port exposed by the Docker image. It will be exposed on the host network interface.

    docker_options: Additional options to be passed to the docker run command.

    args: Arguments to be passed to the container entry point.

    volumes: Map of source:target paths to bind-mount in the container.

    Containers will be registered on the DNS-based dynamic service discovery mechanism as service-container.
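
    A sketch of a container entry using the remaining attributes (the image, flags and paths are illustrative, and volumes is written as the source-to-target map described above):

    containers:
      - name: myapp
        image: registry.example.com/myapp:stable
        port: 1234
        docker_options: --read-only
        args: --config=/etc/myapp/app.yml
        volumes:
          /var/lib/myapp: /data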

    Examples

    Let's look at some example services.yml files:

    myservice:
      num_instances: 2
      service_credentials:
        - name: myservice
          enable_client: false
      public_endpoints:
        - name: myservice
          type: static
          port: 1234

    This defines an Ansible-based service, of which we'll run two instances. The service exposes an HTTP server on port 1234, which, assuming domain_public is set to mydomain.com, will be available at https://myservice.mydomain.com/ on the nginx service gateways of the core role. Communication between the HTTPS gateway and the service goes over HTTPS internally using auto-generated credentials.

    myservice:
      containers:
        - name: myapp
          image: myapp:latest
          port: 1234
        - name: redis
          image: redis
          port: 6379
      public_endpoints:
        - name: myservice
          type: static
          port: 1234
          scheme: http

    The above describes a container-based service, consisting of two separate processes: a web application, and a local Redis server (maybe for caching). The two processes will always be scheduled together, so myapp can be configured to talk to Redis on localhost:6379. This time, the service gateway will talk to myapp over HTTP.

    This service does not have any service credentials, but if it did they would be bind-mounted under /etc/credentials inside the container.

    Ansible role structure

    The following applies if you are writing an Ansible-based service.

    Your own roles should have a dependency on the common role, to ensure that the client-side cluster services are available. They should also bootstrap the service-specific setup by depending on the service-setup role. In the role's meta/main.yml file:

    dependencies:
      - { role: common }
      - role: service-setup
        service: myservice

    Roles that run with a limited number of instances need to take care of service turndowns. To do that, we suggest adding another role, let's call it myservice-global, that includes the primary role conditionally and performs turndown steps otherwise. Your playbook would then include the myservice-global role on all hosts. Such a role's meta/main.yml file might look like this:

    dependencies:
      - role: myservice 
        when: enable_myservice

    This makes use of the enable_myservice variable defined on each host by the static service scheduler (see below). The role's tasks/main.yml then performs the service turndown when the service should not be enabled:

    - service: name=myservice state=stopped
      when: not enable_myservice

    This has the advantage that the myservice role itself looks like a standard Ansible role (that is, setup-only).

    Implementation details

    Configuration management roles

    From the configuration management point of view, machines in a cluster can have one of two roles, one of which is a subset of the other:

    • the common role, which includes all the plumbing required to use the core cluster services as a client;

    • the core role, which includes the common role and provides the core services, coordinated by way of consensus protocols.

    For small clusters there is obviously no need for separate lightweight clients, as the core role can itself act as a client.

    The purpose of the split is to allow different rates of horizontal scaling for consensus-based services and other types of services.

    Of course you can add as many other roles as you want: for instance to reflect differences in the underlying machines for the purpose of limiting service scheduling (using the scheduling_group attribute in the service metadata).

    The common role

    The common role includes everything that is required to use the cluster's infrastructure as a client (that is, run applications on top of it). That includes:

    • clients for service discovery:
      • dynamic service discovery uses DNS queries to SkyDNS (backed by etcd)
    • servicereg (a systemd -> skydns bridge)
      • provides service presence over etcd
    • monitoring
      • prometheus node exporter
    • log transport to collector
      • relays syslog logs to the global log collector
    • ssh key management
      • ssh CA for hosts
      • distribute user ssh keys
    • service credentials management
      • provides per-service X509 certificates

    The core role

    The main role (core) installs the core infrastructure services:

    • replicated database for coordination
      • etcd
      • used mostly for runtime information (presence, master-election, locking...)
    • service discovery
      • SkyDNS
      • talks to the local etcd
    • monitoring
      • Prometheus
      • one global instance (2?)
      • one alertmanager
      • static service discovery for targets
    • centralized log collection
      • one global instance
      • ELK? (would like something simpler and less resource-intensive)
    • user-facing HTTP reverse proxy for services
      • expose internal services to the public
      • automatically create SSL certificates with Letsencrypt
    • identity provider (OAuth 2 / OpenID Connect)

    TODO:

    • replicated database for application data

    Global services

    A number of services follow a scheme where one or more global instances are needed, and there is some state associated with the service. In this case it is not always practical to do automated failovers (using locks on etcd for instance), and instead it would be best to trigger rebalances only on administrative action (running Ansible).

    In this case, we install the service on a subset of hosts, selected randomly but stably (the selection does not change across multiple executions of Ansible) in the course of executing the Ansible playbook. This is done using a dynamic inventory that allocates services to nodes and tweaks the configuration accordingly.

    This approach reacts only to changes in the host inventory, rather than to each host's runtime availability, and it does not require the user to manually allocate these services.

    Other Ansible-specific considerations

    One thing we do a lot is assembling cluster-wide configurations with per-host and per-service snippets, and then ship the configuration to a subset of the hosts. To do this in Ansible we simply store all snippets locally, usually with sub-roles such as prometheus-target, then put them all together (again with a local_action) and copy them to the target machines. Role dependencies usually ensure that the process is run in the expected order.