Redesign DNS and service discovery integration
There are a few issues to address around DNS and how service discovery interacts with it:
- getting rid of the static /etc/hosts
While very convenient, relying on /etc/hosts for service discovery has one extremely annoying consequence: the service containers do not see updates to the file. To maintain correctness, we are forced to restart all containers whenever it changes, which is disruptive.
Moving service discovery to use "real" DNS would solve this issue.
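As a rough illustration of the behaviour we would get from DNS-based discovery, here is a minimal Python sketch of a client that resolves the service name every time it connects, rather than relying on a hosts file captured at container start. The helper names (`resolve_service`, `connect_to_service`) are hypothetical and not part of the current codebase, and how quickly a change becomes visible would still depend on record TTLs and on any caching resolver in the path.

```python
import socket


def resolve_service(name, port):
    """Return the sorted, de-duplicated IP addresses currently known for `name`.

    Hypothetical helper: the lookup goes through the system resolver on every
    call, so the result reflects whatever the resolver returns at that moment.
    """
    infos = socket.getaddrinfo(name, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the textual IP address for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})


def connect_to_service(name, port, timeout=5.0):
    """Open a TCP connection to `name`, resolving it at connection time.

    socket.create_connection() resolves the host name on each call and tries
    the returned addresses in order until one of them accepts the connection.
    """
    return socket.create_connection((name, port), timeout=timeout)
```

With this pattern a long-running container picks up a moved backend on its next connection attempt, without being restarted.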
- authoritative vs recursive DNS request flows
Right now the DNS server installed as part of the frontend role also operates as a caching nameserver. This is a practical optimization that ends up being a bit confusing: the purpose of the service, after all, is exclusively to serve authoritative DNS zones to the public.
DNS caching, and more generally handling resolv.conf in a structured way, is probably best treated as a separate, orthogonal configuration space. It should be possible to control how it is deployed and used (anywhere from the "local caching everywhere" model to a "dedicated service"), while keeping in mind a possible future integration with service discovery.
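To make the idea of a separate configuration space a bit more concrete, the following Python sketch renders resolv.conf contents for the two ends of that range. Everything here is an assumption made for illustration: the function name, the `model` values ("local" and "dedicated") and the parameters do not correspond to any existing configuration variable, and the resolver addresses in the usage note come from the documentation range.

```python
def render_resolv_conf(model, dedicated_resolvers=(), search_domains=()):
    """Render resolv.conf contents for a given (hypothetical) caching model.

    model == "local":     every host runs its own cache on 127.0.0.1
    model == "dedicated": hosts point at a shared set of resolver addresses
    """
    if model == "local":
        nameservers = ["127.0.0.1"]
    elif model == "dedicated":
        if not dedicated_resolvers:
            raise ValueError("the 'dedicated' model needs at least one resolver")
        # The glibc resolver only honours the first 3 nameserver entries.
        nameservers = list(dedicated_resolvers)[:3]
    else:
        raise ValueError(f"unknown caching model: {model!r}")

    lines = []
    if search_domains:
        lines.append("search " + " ".join(search_domains))
    lines.extend(f"nameserver {ns}" for ns in nameservers)
    return "\n".join(lines) + "\n"
```

For example, `render_resolv_conf("dedicated", ["192.0.2.1", "192.0.2.2"])` would produce two `nameserver` lines, while `render_resolv_conf("local")` would point the host at its own cache. Keeping this choice in a single place is what would let us change the deployment model without touching the authoritative DNS service.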
- testing requirements
An argument against caching/authoritative separation is that the test environments rely on the public zones being reachable from inside the infrastructure (where the tests run), so we have to find a solution for that too.