diff --git a/README.md b/README.md index 4c8b9e28ebf123dfaefaafceb7c838db1287d0db..0f94cf90e59f63530815c9dfe43e21f6ded1be69 100644 --- a/README.md +++ b/README.md @@ -1,13 +1,23 @@ -gitlab-docker-autodep +gitlab-deps === -Automatically rebuild all the downstream dependencies of Docker-based -projects on a Gitlab instance. +*Gitlab-deps* is a simple build orchestration toolkit: it tracks +dependencies between projects on a Gitlab instance and it can +automatically rebuild dependencies when a project pipeline completes +successfully. -It scans the *master* branch of all repositories containing a -Dockerfile looking for FROM lines and navigates the resulting -dependency tree to find all projects that needs to be rebuilt when -their base image (or an upstream image thereof) changes. +It can track dependencies between projects by either of two supported +ways: + +* projects can add a `.gitlab-deps` file to the root of their + repository, containing the fully qualified project URLs of their + dependencies; +* gitlab-deps can scan Dockerfiles (in the repository root) and + automatically infer dependencies based on FROM lines. + +The implementation depends on HTTP hooks triggered by pipeline events: +gitlab-deps runs a small HTTP server to respond to these requests and +trigger new builds. By default, since it is meant to be used as a trigger as the last step in a CI script, it will not navigate the dependency tree recursively @@ -17,105 +27,130 @@ has been rebuilt. # Installation +The tools require Python 3. + Install the tool either in a virtualenv of or system-wide with any of the standard Python installation mechanisms, for instance (using *setuptools*): ``` -sudo python setup.py install +sudo python3 setup.py install ``` -This will install the *gitlab-docker-autodep* command-line tool in -/usr/local/bin. 
The tool should work with either Python 2 and Python -3, and it has few dependencies (just the [Gitlab -API](https://python-gitlab.readthedocs.io/en/stable/), and -[Flask](https://flask.pocoo.org/)). +This will install the *gitlab-deps* command-line tool in +/usr/local/bin. The tool has few dependencies (just the [Gitlab +API](https://python-gitlab.readthedocs.io/en/stable/), +[Flask](https://flask.pocoo.org/) and +[Cheroot](https://pypi.org/project/cheroot/)). # Usage -The tool is split into functional components: +The toolkit is split into functional components (all wrapped in a +single executable with different sub-commands): -* scan Gitlab and generate a dependency map (stored as a JSON file) +* scan Gitlab and generate a dependency map * manually trigger builds using the dependency map * run a server that listens for Gitlab notifications and trigger builds -In all cases, the program is configured via command-line options. +The tools talk to Gitlab using its API, so you're going to need an +admin token in order to create new pipelines. ## Common options The tool must be pointed at your Gitlab instance with the *--url* -command-line option, +command-line option, or alternatively using the `GITLAB_URL` +environment variable. You can pass an authentication token using the *--token* or *--token-file* command-line options. This is usually required in order -to trigger CI pipelines: the access token must have the *api* scope. +to trigger CI pipelines, or to access private projects: the access +token must have at least the *api* scope. Credentials can also be +provided using the `GITLAB_AUTH_TOKEN` or `GITLAB_AUTH_TOKEN_FILE` +environment variables. -The tool will only examine Docker images hosted on the Docker registry -associated with the Gitlab instance. By default the registry name is -automatically derived from the server URL (adding a *registry* -prefix), but it can be changed with the *--registry* command-line -option. 
+### Listing projects -### Scope +The *list-projects* sub-command can be used to create a list of +projects (and their branches) in the Gitlab instance. It provides some +basic functionality for filtering (using the *--search* option), but +it generates output suitable for *grep*, e.g. to filter a specific +path prefix (Gitlab group): -On larger Gitlab instances, parsing Dockerfiles for all projects can -be an expensive (long) operation. The program offers two options to -manage the scope of the dependency analysis: *--match* and *--filter*. +``` +gitlab-deps list-projects | grep ^path/to/my/group/ +``` -The former, *--match*, allows for filtering the project list on the -server side, using a Gitlab search query. The latter, *--filter*, -applies a regular expression to the project names (including -namespaces) before parsing their dependencies. Combining the two, for -example, it is possible to efficiently limit the scope of the tool to -a specific namespace: +or to only select "master" branches: ``` -gitlab-docker-autodep deps --match myns --filter ^myns/ ... +gitlab-deps list-projects | grep ':master$' ``` -Note that, when building the dependency tree: +The output from this command is just a list of project paths (with +namespaces) and branch names, separated by a ':', one per line: -* tags in FROM lines are ignored -* only the *master* branch of repositories is scanned for Dockerfiles +``` +path/to/project1:master +path/to/project1:test-branch +path/to/project2:master +... +``` -This might lead to more rebuilds than strictly necessary. +## Computing dependencies -## Command-line +The *deps* sub-command will scan the projects and their repositories, +and it will produce a list of all the edges in the dependency +graph. 
It takes a list of project_path:branch specs as input (as +produced by the *list-projects* sub-command), and it will produce a +list of edges as whitespace-separated project:branch pairs, e.g.: -The `rebuild` command will trigger a rebuild of all the dependencies -of a given Docker image, possibly waiting for the CI pipelines to -complete. Pass the qualified Docker image name (as it appears on FROM -lines in Dockerfiles) as a command-line argument. +``` +project:master dependency1:master +project:master dependency2:master +``` + +The output format is once again meant to be processed with standard +UNIX tools such as *awk* and *grep*. -The tool will print out the project names of the dependencies it -found. The *--recurse* option will traverse the dependency tree -recursively, waiting for CI pipelines to complete so that they are -built in the right order. +## Configuring pipeline_events hooks -It is possible to limit the scope of the initial dependency scan -(which is an expensive operation) to projects matching a Gitlab search -keyword using the *--match* option. +To work, gitlab-deps needs an HTTP hook for pipeline_events on all +projects that have dependencies. Since setting this up in Gitlab is a +manual and laborious process, the *set-hooks* sub-command is provided +to do this automatically using the API. The intended usage is to run +it on the right-hand side of the dependency edges (i.e. the list of +projects/branches that actually have dependencies): + +``` +gitlab-deps deps | awk '{print $2}' | gitlab-deps set-hooks +``` -## Gitlab CI integration +## One-off rebuilds -In order to automatically rebuild the dependencies of a Docker image -built using Gitlab CI, it is possible to run *gitlab-docker-autodep* -as a webhook: this way, whenever a successful CI pipeline completes, -you can trigger a rebuild of the dependencies.
+The *rebuild* sub-command will trigger a rebuild of all the +dependencies of a given project, possibly waiting for the CI pipelines +to complete. Pass a qualified project name and branch as a +command-line argument. The dependency graph (list of edges as produced +by the *deps* sub-command) must also be provided, either as a file or +on standard input. -To do this, use the *server* command of *gitlab-docker-autodep*, and -specify the address to bind to using the *--host* and *--port* -options. It is also possible to enforce authentication of the webhook -with a secret token +The *--recurse* option will traverse the dependency tree recursively, +waiting for CI pipelines to complete so that they are built in the +right order. + +## Running the server + +The gitlab-deps tool has a *server* command to start a simple HTTP +server that receives the pipeline_events webhooks from Gitlab, and +triggers builds for project dependencies. + +The *server* command requires an address to bind to, specified using +the *--host* and *--port* options. It is also possible to enforce +authentication of the webhook with a secret token ([X-Gitlab-Token](https://docs.gitlab.com/ee/user/project/integrations/webhooks.html)) using the *--webhook-auth-token* option. -When running in this mode, it is assumed that all your Docker-related -projects have webhooks set up to rebuild their dependencies, so -*gitlab-docker-autodep* will only trigger a build of the immediate -dependencies of a project. - Also note that the server does not have any TLS support: if necessary, it is best to use a dedicated reverse proxy (Apache, NGINX, etc).
@@ -124,35 +159,13 @@ machine as Gitlab itself, and that the Gitlab authentication token is stored in */etc/gitlab_docker_token*: ``` -gitlab-docker-autodep \ +gitlab-deps deps \ +| gitlab-deps server \ --url=https://my.gitlab \ --token-file=/etc/gitlab_docker_token \ - server \ - --deps=deps.json --host=127.0.0.1 --port=14001 ``` -You can then configure your project's webhooks with the URL -`http://localhost:14001/`, with the *Trigger* checkbox set only -on *Pipeline events*. - -Then you should generate the *deps.json* dependency map periodically, -for instance with a cron job: - -``` -*/30 * * * * root gitlab-docker-autodep - --url=https://my.gitlab - --token-file=/etc/gitlab_docker_token - deps > deps.json -``` - -It can be useful to run the *rebuild* command from a cron job, for -instance in order to rebuild images on a periodic schedule, and -assuming all your projects share a common base image: - -``` -50 5 * * * root gitlab-docker-autodep - --url=https://my.gitlab - --token-file=/etc/gitlab_docker_token - rebuild $MY_BASE_IMAGE -``` +If configuring webhooks manually (rather than with *set-hooks*), +create a new webhook with the URL `http://localhost:14001/`, and with +the *Trigger* checkbox set only on *Pipeline events*.
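
For readers reviewing this change, the edge format described under "Computing dependencies" (one `project:branch dependency:branch` pair per line) can be consumed with a few lines of Python. The following is only a minimal sketch of how such a list might be turned into a rebuild list, not the gitlab-deps implementation: the function names `parse_edges` and `rebuild_order` and the sample project names are hypothetical, and the breadth-first walk is an assumption that does not by itself guarantee a strict build order in every graph.

```python
from collections import defaultdict, deque


def parse_edges(lines):
    """Parse 'project:branch dependency:branch' lines into a reverse map.

    Returns a dict mapping each dependency to the list of projects that
    depend on it, in the order they were first seen.
    (Hypothetical helper, not part of gitlab-deps.)
    """
    dependents = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        project, dependency = fields
        if project not in dependents[dependency]:
            dependents[dependency].append(project)
    return dependents


def rebuild_order(dependents, start, recurse=False):
    """List the projects to rebuild after `start` has been built.

    Without `recurse`, only the direct dependents are returned. With it,
    the graph is walked breadth-first and each project is listed once.
    (Hypothetical helper, not part of gitlab-deps.)
    """
    order = []
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for project in dependents.get(current, []):
            if project in seen:
                continue
            seen.add(project)
            order.append(project)
            if recurse:
                queue.append(project)
    return order


if __name__ == "__main__":
    # Made-up sample edges in the documented output format.
    sample = [
        "app:master base:master",
        "worker:master base:master",
        "web:master app:master",
    ]
    graph = parse_edges(sample)
    print(rebuild_order(graph, "base:master"))
    print(rebuild_order(graph, "base:master", recurse=True))
```

With the sample edges, the first call lists only the two projects that depend directly on `base:master`, while the recursive call also reaches `web:master` through `app:master`. Keeping a reverse map (dependency to dependents) mirrors the way the README describes hooks being installed on the right-hand side of each edge.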