autoradio Tuning Guide
======================

This document attempts to provide a high-level overview of the
trade-offs involved in tuning the free parameters of an autoradio
cluster. While autoradio works with the default settings out of the
box in testing environments, most real-world deployments will require
some tuning.


# Etcd

The default settings for etcd are tuned for a local (LAN) network
environment. In a geographically distributed cluster, the default
timeouts are so low that the cluster is unlikely ever to reach
consensus. You'll want to raise both the peer heartbeat interval and
the election timeout. A reasonable value for the heartbeat interval is
5 to 10 times the maximum inter-node latency in your cluster, while
the election timeout should be at least 3 times the heartbeat
interval.
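
As a worked example of this rule of thumb, the following Python sketch
derives both values from a measured worst-case latency. The function
name and the 200 ms latency are illustrative assumptions, not part of
autoradio or etcd.

```python
def etcd_timeouts_ms(max_latency_ms, heartbeat_factor=5, election_factor=3):
    """Return (heartbeat_interval, election_timeout) in milliseconds.

    heartbeat_factor: multiple of the max inter-node latency (5 to 10).
    election_factor: multiple of the heartbeat interval (at least 3).
    """
    heartbeat_ms = max_latency_ms * heartbeat_factor
    election_ms = heartbeat_ms * election_factor
    return heartbeat_ms, election_ms


# Hypothetical cluster with a worst-case inter-node latency of 200 ms.
heartbeat, election = etcd_timeouts_ms(200)
print(f"--heartbeat-interval={heartbeat} --election-timeout={election}")
# prints: --heartbeat-interval=1000 --election-timeout=3000
```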

With our etcd package, you can set these values in
`/etc/default/etcd` (values are milliseconds):

    DAEMON_OPTS="--heartbeat-interval=1000 --election-timeout=3000"

Increasing the etcd timeouts causes a corresponding increase in the
time required to reach consensus and elect a new etcd master in case
of node failure. The radiod master election TTL should be set to a
value greater than the etcd peer election timeout.
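
The two constraints in this section can be checked mechanically before
rolling out a configuration. The sketch below is a hypothetical helper,
not something shipped with autoradio; it assumes all three values have
already been converted to milliseconds, since the units of the radiod
flag are not stated here.

```python
def check_timeouts(etcd_heartbeat_ms, etcd_election_ms, radiod_master_ttl_ms):
    """Return a list of human-readable problems (empty if none found)."""
    problems = []
    # The election timeout should be at least 3 times the heartbeat.
    if etcd_election_ms < 3 * etcd_heartbeat_ms:
        problems.append(
            "etcd election timeout is less than 3x the heartbeat interval")
    # The radiod master election TTL should exceed the etcd election timeout.
    if radiod_master_ttl_ms <= etcd_election_ms:
        problems.append(
            "radiod master election TTL does not exceed the etcd election timeout")
    return problems


# Illustrative values only: 1 s heartbeat, 3 s election timeout, 5 s TTL.
for problem in check_timeouts(1000, 3000, 5000):
    print("WARNING:", problem)
```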


# Radiod timeouts

Similar considerations, with respect to latency, apply to the presence
and master-election protocols that are run by autoradio itself. These
are controlled by radiod's `--heartbeat` and `--master-election-ttl`
command-line flags. For these time values, though, there are further
considerations to be made:

## Presence

The node presence heartbeat sets the lower time bound for peers to
discover that a node is down, and stop sending client requests to it.

It also determines how often node utilization is propagated to the
peers. This is less of a concern when the load balancing policy uses
query cost estimators (as it does by default).

Setting this value too low for the number of nodes in the cluster
causes excessive churn on etcd and unnecessary intra-cluster network
traffic. As a side effect of the churn, watches on etcd data expire
more often (because the log position advances beyond the allowed
horizon). This causes more frequent reloads of the full
configuration, which adds yet more network traffic and load on etcd.
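
To get a feel for the churn, the rough Python estimate below relates
cluster size and heartbeat interval to the etcd write rate, and to how
long an idle watch can last before its position falls out of the
event history. It assumes each node performs one presence write per
heartbeat and that etcd retains a fixed number of recent events (1000
is used as the default, which is an assumption to verify against your
etcd version); both function names are hypothetical.

```python
def presence_write_rate(num_nodes, heartbeat_s):
    """Approximate etcd writes per second caused by presence heartbeats.

    Assumes one write per node per heartbeat interval.
    """
    return num_nodes / heartbeat_s


def watch_horizon_s(num_nodes, heartbeat_s, history_events=1000):
    """Approximate seconds until an idle watch falls behind the history.

    history_events is the assumed size of etcd's event history window.
    """
    return history_events / presence_write_rate(num_nodes, heartbeat_s)


# Compare a 1 s and a 5 s heartbeat on a hypothetical 10-node cluster.
for heartbeat_s in (1, 5):
    rate = presence_write_rate(10, heartbeat_s)
    horizon = watch_horizon_s(10, heartbeat_s)
    print(f"heartbeat={heartbeat_s}s: {rate:.1f} writes/s, "
          f"watch horizon ~{horizon:.0f}s")
```

Under these assumptions the write rate grows linearly with the number
of nodes, so a heartbeat that works for a small cluster may need to be
raised as the cluster grows.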


## Master Election

The node master election timeout determines how quickly a source
(assuming it retries continuously on error) will be able to reconnect
to the cluster if the node that is currently the master becomes
unavailable.


# Capacity

One of the nice properties of the autoradio traffic control logic is
the ability to reject incoming traffic when the cluster reaches its
maximum capacity, to prevent overload and ensure that existing
connections are served reliably. This is of course only possible if
the capacity limits are set to match reality. Since these values
usually can't be guessed by autoradio, they must be set using
command-line arguments.

Autoradio models capacity along two separate dimensions: outbound
bandwidth and number of connected listeners. CPU and memory are not
included because their incremental cost per request is negligible.
Limits can be set separately for each node in the cluster by passing
the `--bwlimit` and `--max-clients` command-line flags to *radiod*.

The traffic control logic is then able to use utilization metrics to
make decisions about where to send traffic. For details on how this is
done, and how to control it, check the Go source documentation for the
`fe/lbv2` package.
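
To illustrate the general idea only, here is a deliberately simplified
Python sketch of utilization-based node selection with rejection at
full capacity. It is not the `fe/lbv2` implementation: the `Node`
class, the `pick_node` function and the "least utilized node wins"
rule are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    listeners: int    # currently connected listeners
    max_clients: int  # configured listener limit, assumed to be > 0

    def utilization(self):
        """Fraction of the listener limit currently in use."""
        return self.listeners / self.max_clients


def pick_node(nodes):
    """Return the least utilized node with spare capacity.

    Returns None when every node is full, meaning the request
    should be rejected rather than overloading the cluster.
    """
    candidates = [n for n in nodes if n.listeners < n.max_clients]
    if not candidates:
        return None
    return min(candidates, key=Node.utilization)


# Hypothetical two-node cluster.
cluster = [Node("a", 50, 100), Node("b", 10, 100)]
target = pick_node(cluster)
print(target.name if target else "rejected: cluster at capacity")
```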

The default traffic control policy only checks the number of
listeners: this is because it usually makes the most sense to express
the global cluster capacity in those terms (bandwidth is hardly a good
metric in the presence of variable bitrate streams, for instance). The
disadvantage is that finding the "real" maximum capacity numbers for a
given node might take some experimentation.
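
One possible starting point for that experimentation is a
back-of-the-envelope estimate from the node's uplink and the average
stream bitrate. The Python sketch below is only that: the function
name, the 20% headroom and the example figures are assumptions, and
the result should be validated under real load before being used as a
listener limit.

```python
def estimate_max_clients(uplink_kbps, avg_stream_kbps, headroom=0.8):
    """Estimate how many listeners fit in the usable uplink bandwidth.

    headroom is the fraction of the uplink reserved for streams; the
    remainder is left free for bursts and other traffic.
    """
    usable_kbps = uplink_kbps * headroom
    return int(usable_kbps // avg_stream_kbps)


# Hypothetical node: 100 Mbit/s uplink serving 96 kbit/s streams.
print(estimate_max_clients(100_000, 96))
# prints: 833
```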