Commit 4bdd388c authored by ale

Update README

parent efd3b720
gitlab-docker-autodep
gitlab-deps
===
Automatically rebuild all the downstream dependencies of Docker-based
projects on a Gitlab instance.
*Gitlab-deps* is a simple build orchestration toolkit: it tracks
dependencies between projects on a Gitlab instance and it can
automatically rebuild dependencies when a project pipeline completes
successfully.
It scans the *master* branch of all repositories containing a
Dockerfile looking for FROM lines and navigates the resulting
dependency tree to find all projects that need to be rebuilt when
their base image (or an upstream image thereof) changes.
It can track dependencies between projects in either of two supported
ways:
* projects can add a `.gitlab-deps` file to the root of their
repository, containing the fully qualified project URLs of their
dependencies;
* gitlab-deps can scan Dockerfiles (in the repository root) and
automatically infer dependencies based on FROM lines.
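
As a rough illustration of the second approach, the sketch below extracts base images from FROM lines. It is not the gitlab-deps implementation: the function name `base_images` and the `registry` argument are hypothetical, and the handling of tags, digests and `--platform` flags is an assumption about what a reasonable scanner would do.

```python
import re

# Matches "FROM [--flag=value ...] image [AS name]", case-insensitively,
# at the start of any line of the Dockerfile.
FROM_RE = re.compile(
    r"^\s*FROM\s+(?:--\S+\s+)*(?P<image>\S+)", re.IGNORECASE | re.MULTILINE
)


def base_images(dockerfile_text, registry):
    """Return untagged base images hosted on `registry`, in order of appearance.

    Hypothetical helper for illustration only.
    """
    found = []
    for match in FROM_RE.finditer(dockerfile_text):
        image = match.group("image")
        # Drop a digest, if any.
        image = image.split("@", 1)[0]
        # Drop a tag, taking care not to confuse it with a registry port.
        name, sep, tag = image.rpartition(":")
        if sep and "/" not in tag:
            image = name
        # Only images on the given registry are considered.
        if image.startswith(registry + "/") and image not in found:
            found.append(image)
    return found
```

For example, a multi-stage Dockerfile with one stage based on `registry.my.gitlab/group/base:stable` and another on `debian:bookworm` would yield only `registry.my.gitlab/group/base`.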
The implementation depends on HTTP hooks triggered by pipeline events:
gitlab-deps runs a small HTTP server to respond to these requests and
trigger new builds.
By default, since it is meant to be used as a trigger as the last step
in a CI script, it will not navigate the dependency tree recursively
......@@ -17,105 +27,130 @@ has been rebuilt.
# Installation
The tools require Python 3.
Install the tool either in a virtualenv or system-wide with any of
the standard Python installation mechanisms, for instance (using
*setuptools*):
```
sudo python setup.py install
sudo python3 setup.py install
```
This will install the *gitlab-docker-autodep* command-line tool in
/usr/local/bin. The tool should work with either Python 2 or Python
3, and it has few dependencies (just the [Gitlab
API](https://python-gitlab.readthedocs.io/en/stable/), and
[Flask](https://flask.pocoo.org/)).
This will install the *gitlab-deps* command-line tool in
/usr/local/bin. The tool has few dependencies (just the [Gitlab
API](https://python-gitlab.readthedocs.io/en/stable/),
[Flask](https://flask.pocoo.org/) and
[Cheroot](https://pypi.org/project/cheroot/)).
# Usage
The tool is split into functional components:
The toolkit is split into functional components (all wrapped in a
single executable with different sub-commands):
* scan Gitlab and generate a dependency map (stored as a JSON file)
* scan Gitlab and generate a dependency map
* manually trigger builds using the dependency map
* run a server that listens for Gitlab notifications and triggers
builds
In all cases, the program is configured via command-line options.
The tools talk to Gitlab using its API, so you're going to need an
admin token in order to create new pipelines.
## Common options
The tool must be pointed at your Gitlab instance with the *--url*
command-line option,
command-line option, or alternatively using the `GITLAB_URL`
environment variable.
You can pass an authentication token using the *--token* or
*--token-file* command-line options. This is usually required in order
to trigger CI pipelines: the access token must have the *api* scope.
to trigger CI pipelines, or to access private projects: the access
token must have at least the *api* scope. Credentials can also be
provided using the `GITLAB_AUTH_TOKEN` or `GITLAB_AUTH_TOKEN_FILE`
environment variables.
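
The following sketch shows one way the token could be resolved from these sources. The precedence order (command-line options first, then environment variables) and the function names are assumptions made for illustration; the actual tool may resolve them differently.

```python
import os


def _read_token(path):
    """Read a token from a file, stripping surrounding whitespace."""
    with open(path) as fd:
        return fd.read().strip()


def resolve_token(cli_token=None, cli_token_file=None, environ=None):
    """Return the first token found, or None if no credentials are configured.

    Assumed lookup order: --token, --token-file, GITLAB_AUTH_TOKEN,
    GITLAB_AUTH_TOKEN_FILE.
    """
    env = os.environ if environ is None else environ
    if cli_token:
        return cli_token
    if cli_token_file:
        return _read_token(cli_token_file)
    if env.get("GITLAB_AUTH_TOKEN"):
        return env["GITLAB_AUTH_TOKEN"]
    if env.get("GITLAB_AUTH_TOKEN_FILE"):
        return _read_token(env["GITLAB_AUTH_TOKEN_FILE"])
    return None
```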
The tool will only examine Docker images hosted on the Docker registry
associated with the Gitlab instance. By default the registry name is
automatically derived from the server URL (adding a *registry*
prefix), but it can be changed with the *--registry* command-line
option.
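
A minimal sketch of the default registry derivation, assuming the prefix is simply `registry.` prepended to the server's host name (the exact form used by the tool is not stated here, and `default_registry` is a hypothetical name):

```python
from urllib.parse import urlsplit


def default_registry(gitlab_url):
    """Derive a registry host name from the Gitlab URL (assumed convention)."""
    host = urlsplit(gitlab_url).hostname
    if host is None:
        raise ValueError("not an absolute URL: %r" % (gitlab_url,))
    return "registry." + host
```

Under this assumption, `https://my.gitlab` maps to `registry.my.gitlab`; anything else needs an explicit *--registry*.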
### Listing projects
### Scope
The *list-projects* sub-command can be used to create a list of
projects (and their branches) in the Gitlab instance. It provides some
basic functionality for filtering (using the *--search* option), but
it generates output suitable for *grep*, e.g. to filter a specific
path prefix (Gitlab group):
On larger Gitlab instances, parsing Dockerfiles for all projects can
be an expensive (long) operation. The program offers two options to
manage the scope of the dependency analysis: *--match* and *--filter*.
```
gitlab-deps list-projects | grep ^path/to/my/group/
```
The former, *--match*, allows for filtering the project list on the
server side, using a Gitlab search query. The latter, *--filter*,
applies a regular expression to the project names (including
namespaces) before parsing their dependencies. Combining the two, for
example, it is possible to efficiently limit the scope of the tool to
a specific namespace:
or to only select "master" branches:
```
gitlab-docker-autodep deps --match myns --filter ^myns/ ...
gitlab-deps list-projects | grep ':master$'
```
Note that, when building the dependency tree:
The output from this command is just a list of project paths (with
namespaces) and branch names, separated by a ':', one per line:
* tags in FROM lines are ignored
* only the *master* branch of repositories is scanned for Dockerfiles
```
path/to/project1:master
path/to/project1:test-branch
path/to/project2:master
...
```
This might lead to more rebuilds than strictly necessary.
## Computing dependencies
## Command-line
The *deps* sub-command will scan the projects and their repositories,
and it will produce a list of all the edges in the dependency
graph. It takes a list of project_path:branch specs as input (as
produced by the *list-projects* sub-command), and it will produce a
list of edges as whitespace-separated project:branch pairs, e.g.:
The `rebuild` command will trigger a rebuild of all the dependencies
of a given Docker image, possibly waiting for the CI pipelines to
complete. Pass the qualified Docker image name (as it appears on FROM
lines in Dockerfiles) as a command-line argument.
```
project:master dependency1:master
project:master dependency2:master
```
The output format is once again meant to be processed with standard
UNIX tools such as *awk* and *grep*.
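
If you would rather consume the edge list from a script, the sketch below loads it into an adjacency map. It assumes the first column is the upstream project and the second is the project that must be rebuilt when it changes; `parse_edges` is a hypothetical helper, not part of gitlab-deps.

```python
from collections import defaultdict


def parse_edges(lines):
    """Parse whitespace-separated "project:branch project:branch" pairs.

    Returns a dict mapping each left-hand spec to the list of right-hand
    specs, preserving input order and skipping blank lines and duplicates.
    """
    graph = defaultdict(list)
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if len(fields) != 2:
            raise ValueError("malformed edge: %r" % (line,))
        left, right = fields
        if right not in graph[left]:
            graph[left].append(right)
    return dict(graph)
```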
The tool will print out the project names of the dependencies it
found. The *--recurse* option will traverse the dependency tree
recursively, waiting for CI pipelines to complete so that they are
built in the right order.
## Configuring pipeline_events hooks
It is possible to limit the scope of the initial dependency scan
(which is an expensive operation) to projects matching a Gitlab search
keyword using the *--match* option.
To work, gitlab-deps needs an HTTP hook for pipeline_events on all
projects that have dependencies. Since setting this up in Gitlab is a
manual and laborious process, the *set-hooks* sub-command is provided
to do this automatically using the API. The intended usage is to run
it on the right-hand side of the dependency edges (i.e. the list of
projects/branches that actually have dependencies):
```
gitlab-deps deps | awk '{print $2}' | gitlab-deps set-hooks
```
## Gitlab CI integration
## One-off rebuilds
In order to automatically rebuild the dependencies of a Docker image
built using Gitlab CI, it is possible to run *gitlab-docker-autodep*
as a webhook: this way, whenever a successful CI pipeline completes,
you can trigger a rebuild of the dependencies.
The *rebuild* sub-command will trigger a rebuild of all the
dependencies of a given project, possibly waiting for the CI pipelines
to complete. Pass a qualified project name and branch as a
command-line argument. The dependency graph (list of edges as produced
by the *deps* sub-command) must also be provided, either as a file or
on standard input.
To do this, use the *server* command of *gitlab-docker-autodep*, and
specify the address to bind to using the *--host* and *--port*
options. It is also possible to enforce authentication of the webhook
with a secret token
The *--recurse* option will traverse the dependency tree recursively,
waiting for CI pipelines to complete so that they are built in the
right order.
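
To make "the right order" concrete, here is a sketch of a recursive traversal that returns every project reachable from a starting point, ordered so that each one comes after everything it depends on. It uses a depth-first topological sort, which is one possible technique and not necessarily what the tool does; `rebuild_order` and the shape of `graph` (a dict of project spec to list of dependent specs) are assumptions.

```python
def rebuild_order(graph, start):
    """Return the projects to rebuild after `start`, in a safe build order.

    `graph` maps a project spec to the specs that depend on it.
    Raises ValueError if the dependency graph contains a cycle.
    """
    order = []
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        if state.get(node) == "done":
            return
        if state.get(node) == "visiting":
            raise ValueError("dependency cycle at %s" % node)
        state[node] = "visiting"
        for child in graph.get(node, ()):
            visit(child)
        state[node] = "done"
        order.append(node)

    visit(start)
    # Reverse post-order is a topological order; drop `start` itself.
    order.reverse()
    return order[1:]
```

Without recursion, only the direct entries in `graph.get(start, [])` would be rebuilt.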
## Running the server
The gitlab-deps tool has a *server* command to start a simple HTTP
server that receives the pipeline_events webhooks from Gitlab, and
triggers builds for project dependencies.
The *server* command requires an address to bind to, specified using
the *--host* and *--port* options. It is also possible to enforce
authentication of the webhook with a secret token
([X-Gitlab-Token](https://docs.gitlab.com/ee/user/project/integrations/webhooks.html))
using the *--webhook-auth-token* option.
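
For reference, the check a webhook receiver typically performs looks roughly like the sketch below. It is not the gitlab-deps server code: `should_trigger` is a hypothetical function, `headers` is assumed to be a plain dict keyed by the exact header name, and `payload` is the decoded JSON body of a Gitlab pipeline event.

```python
import hmac


def should_trigger(headers, payload, expected_token=None):
    """Decide whether a webhook request should trigger dependent builds."""
    if expected_token is not None:
        supplied = headers.get("X-Gitlab-Token", "")
        # Constant-time comparison avoids leaking the token via timing.
        if not hmac.compare_digest(supplied.encode(), expected_token.encode()):
            return False
    # Only react to pipeline events that completed successfully.
    if payload.get("object_kind") != "pipeline":
        return False
    return payload.get("object_attributes", {}).get("status") == "success"
```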
When running in this mode, it is assumed that all your Docker-related
projects have webhooks set up to rebuild their dependencies, so
*gitlab-docker-autodep* will only trigger a build of the immediate
dependencies of a project.
Also note that the server does not have any TLS support: if necessary,
it is best to use a dedicated reverse proxy (Apache, NGINX, etc.).
......@@ -124,35 +159,13 @@ machine as Gitlab itself, and that the Gitlab authentication token is
stored in */etc/gitlab_docker_token*:
```
gitlab-docker-autodep \
gitlab-deps deps \
| gitlab-deps server \
--url=https://my.gitlab \
--token-file=/etc/gitlab_docker_token \
server \
--deps=deps.json
--host=127.0.0.1 --port=14001
```
You can then configure your project's webhooks with the URL
`http://localhost:14001/`, with the *Trigger* checkbox set only
on *Pipeline events*.
Then you should generate the *deps.json* dependency map periodically,
for instance with a cron job:
```
*/30 * * * * root gitlab-docker-autodep
--url=https://my.gitlab
--token-file=/etc/gitlab_docker_token
deps > deps.json
```
It can be useful to run the *rebuild* command from a cron job, for
instance in order to rebuild images on a periodic schedule, and
assuming all your projects share a common base image:
```
50 5 * * * root gitlab-docker-autodep
--url=https://my.gitlab
--token-file=/etc/gitlab_docker_token
rebuild $MY_BASE_IMAGE
```
If configuring webhooks manually (rather than with *set-hooks*),
create a new webhook with the URL `http://localhost:14001/`, and with
the *Trigger* checkbox set only on *Pipeline events*.