From 4bdd388c123fb250ebccf89a4250ff31dce569a1 Mon Sep 17 00:00:00 2001
From: ale <ale@incal.net>
Date: Thu, 2 Jul 2020 12:26:00 +0100
Subject: [PATCH] Update README

---
 README.md | 195 +++++++++++++++++++++++++++++-------------------------
 1 file changed, 104 insertions(+), 91 deletions(-)

diff --git a/README.md b/README.md
index 4c8b9e2..0f94cf9 100644
--- a/README.md
+++ b/README.md
@@ -1,13 +1,23 @@
-gitlab-docker-autodep
+gitlab-deps
 ===
 
-Automatically rebuild all the downstream dependencies of Docker-based
-projects on a Gitlab instance.
+*Gitlab-deps* is a simple build orchestration toolkit: it tracks
+dependencies between projects on a Gitlab instance and it can
+automatically rebuild dependencies when a project pipeline completes
+successfully.
 
-It scans the *master* branch of all repositories containing a
-Dockerfile looking for FROM lines and navigates the resulting
-dependency tree to find all projects that needs to be rebuilt when
-their base image (or an upstream image thereof) changes.
+It can track dependencies between projects in either of two supported
+ways:
+
+* projects can add a `.gitlab-deps` file to the root of their
+  repository, containing the fully qualified project URLs of their
+  dependencies;
+* gitlab-deps can scan Dockerfiles (in the repository root) and
+  automatically infer dependencies based on FROM lines.
+
+The implementation depends on HTTP hooks triggered by pipeline events:
+gitlab-deps runs a small HTTP server to respond to these requests and
+trigger new builds.
 
 By default, since it is meant to be used as a trigger as the last step
 in a CI script, it will not navigate the dependency tree recursively
@@ -17,105 +27,130 @@ has been rebuilt.
 
 # Installation
 
+The tools require Python 3.
+
 Install the tool either in a virtualenv of or system-wide with any of
 the standard Python installation mechanisms, for instance (using
 *setuptools*):
 
 ```
-sudo python setup.py install
+sudo python3 setup.py install
 ```
 
-This will install the *gitlab-docker-autodep* command-line tool in
-/usr/local/bin. The tool should work with either Python 2 and Python
-3, and it has few dependencies (just the [Gitlab
-API](https://python-gitlab.readthedocs.io/en/stable/), and
-[Flask](https://flask.pocoo.org/)).
+This will install the *gitlab-deps* command-line tool in
+/usr/local/bin. The tool has few dependencies (just the [Gitlab
+API](https://python-gitlab.readthedocs.io/en/stable/),
+[Flask](https://flask.pocoo.org/) and
+[Cheroot](https://pypi.org/project/cheroot/)).
 
 # Usage
 
-The tool is split into functional components:
+The toolkit is split into functional components (all wrapped in a
+single executable with different sub-commands):
 
-* scan Gitlab and generate a dependency map (stored as a JSON file)
+* scan Gitlab and generate a dependency map
 * manually trigger builds using the dependency map
 * run a server that listens for Gitlab notifications and trigger
   builds
 
-In all cases, the program is configured via command-line options.
+The tools talk to Gitlab using its API, so you're going to need an
+admin token in order to create new pipelines.
 
 ## Common options
 
 The tool must be pointed at your Gitlab instance with the *--url*
-command-line option, 
+command-line option, or alternatively using the `GITLAB_URL`
+environment variable.
 
 You can pass an authentication token using the *--token* or
 *--token-file* command-line options. This is usually required in order
-to trigger CI pipelines: the access token must have the *api* scope.
+to trigger CI pipelines, or to access private projects: the access
+token must have at least the *api* scope. Credentials can also be
+provided using the `GITLAB_AUTH_TOKEN` or `GITLAB_AUTH_TOKEN_FILE`
+environment variables.
 
-The tool will only examine Docker images hosted on the Docker registry
-associated with the Gitlab instance. By default the registry name is
-automatically derived from the server URL (adding a *registry*
-prefix), but it can be changed with the *--registry* command-line
-option.
+### Listing projects
 
-### Scope
+The *list-projects* sub-command can be used to create a list of
+projects (and their branches) in the Gitlab instance. It provides some
+basic functionality for filtering (using the *--search* option), and
+its output is also suitable for *grep*, e.g. to select a specific
+path prefix (Gitlab group):
 
-On larger Gitlab instances, parsing Dockerfiles for all projects can
-be an expensive (long) operation. The program offers two options to
-manage the scope of the dependency analysis: *--match* and *--filter*.
+```
+gitlab-deps list-projects | grep ^path/to/my/group/
+```
 
-The former, *--match*, allows for filtering the project list on the
-server side, using a Gitlab search query. The latter, *--filter*,
-applies a regular expression to the project names (including
-namespaces) before parsing their dependencies. Combining the two, for
-example, it is possible to efficiently limit the scope of the tool to
-a specific namespace:
+or to only select "master" branches:
 
 ```
-gitlab-docker-autodep deps --match myns --filter ^myns/ ...
+gitlab-deps list-projects | grep ':master$'
 ```
 
-Note that, when building the dependency tree:
+The output from this command is just a list of project paths (with
+namespaces) and branch names, separated by a ':', one per line:
 
-* tags in FROM lines are ignored
-* only the *master* branch of repositories is scanned for Dockerfiles
+```
+path/to/project1:master
+path/to/project1:test-branch
+path/to/project2:master
+...
+```
 
-This might lead to more rebuilds than strictly necessary.
+## Computing dependencies
 
-## Command-line
+The *deps* sub-command will scan the projects and their repositories,
+and it will produce a list of all the edges in the dependency
+graph. It takes a list of project_path:branch specs as input (as
+produced by the *list-projects* sub-command), and it will produce a
+list of edges as whitespace-separated project:branch pairs, e.g.:
 
-The `rebuild` command will trigger a rebuild of all the dependencies
-of a given Docker image, possibly waiting for the CI pipelines to
-complete. Pass the qualified Docker image name (as it appears on FROM
-lines in Dockerfiles) as a command-line argument.
+```
+project:master dependency1:master
+project:master dependency2:master
+```
+
+The output format is once again meant to be processed with standard
+UNIX tools such as *awk* and *grep*.
 
-The tool will print out the project names of the dependencies it
-found. The *--recurse* option will traverse the dependency tree
-recursively, waiting for CI pipelines to complete so that they are
-built in the right order.
+## Configuring pipeline_events hooks
 
-It is possible to limit the scope of the initial dependency scan
-(which is an expensive operation) to projects matching a Gitlab search
-keyword using the *--match* option.
+To work, gitlab-deps needs an HTTP hook for pipeline_events on all
+projects that have dependencies. Since setting this up in Gitlab is a
+manual and laborious process, the *set-hooks* sub-command is provided
+to do this automatically using the API. The intended usage is to run
+it on the right-hand side of the dependency edges (i.e. the list of
+projects/branches that actually have dependencies):
+
+```
+gitlab-deps deps | awk '{print $2}' | gitlab-deps set-hooks
+```
 
-## Gitlab CI integration
+## One-off rebuilds
 
-In order to automatically rebuild the dependencies of a Docker image
-built using Gitlab CI, it is possible to run *gitlab-docker-autodep*
-as a webhook: this way, whenever a successful CI pipeline completes,
-you can trigger a rebuild of the dependencies.
+The *rebuild* sub-command will trigger a rebuild of all the
+dependencies of a given project, possibly waiting for the CI pipelines
+to complete. Pass a qualified project name and branch as a
+command-line argument. The dependency graph (list of edges as produced
+by the *deps* sub-command) must also be provided, either as a file or
+on standard input.
 
-To do this, use the *server* command of *gitlab-docker-autodep*, and
-specify the address to bind to using the *--host* and *--port*
-options. It is also possible to enforce authentication of the webhook
-with a secret token
+The *--recurse* option will traverse the dependency tree recursively,
+waiting for CI pipelines to complete so that they are built in the
+right order.
+
+## Running the server
+
+The gitlab-deps tool has a *server* command to start a simple HTTP
+server that receives the pipeline_events webhooks from Gitlab, and
+triggers builds for project dependencies.
+
+The *server* command requires an address to bind to, specified using
+the *--host* and *--port* options. It is also possible to enforce
+authentication of the webhook with a secret token
 ([X-Gitlab-Token](https://docs.gitlab.com/ee/user/project/integrations/webhooks.html))
 using the *--webhook-auth-token* option.
 
-When running in this mode, it is assumed that all your Docker-related
-projects have webhooks set up to rebuild their dependencies, so
-*gitlab-docker-autodep* will only trigger a build of the immediate
-dependencies of a project.
-
 Also note that the server does not have any TLS support: if necessary,
 it is best to use a dedicated reverse proxy (Apache, NGINX, etc).
 
@@ -124,35 +159,13 @@ machine as Gitlab itself, and that the Gitlab authentication token is
 stored in */etc/gitlab_docker_token*:
 
 ```
-gitlab-docker-autodep \
+gitlab-deps deps \
+| gitlab-deps server \
     --url=https://my.gitlab \
     --token-file=/etc/gitlab_docker_token \
-    server \
-    --deps=deps.json
     --host=127.0.0.1 --port=14001
 ```
 
-You can then configure your project's webhooks with the URL
-`http://localhost:14001/`, with the *Trigger* checkbox set only
-on *Pipeline events*.
-
-Then you should generate the *deps.json* dependency map periodically,
-for instance with a cron job:
-
-```
-*/30 * * * * root gitlab-docker-autodep
-                  --url=https://my.gitlab 
-                  --token-file=/etc/gitlab_docker_token
-                  deps > deps.json
-```
-
-It can be useful to run the *rebuild* command from a cron job, for
-instance in order to rebuild images on a periodic schedule, and
-assuming all your projects share a common base image:
-
-```
-50 5 * * * root gitlab-docker-autodep 
-                --url=https://my.gitlab
-                --token-file=/etc/gitlab_docker_token
-                rebuild $MY_BASE_IMAGE
-```
+If configuring webhooks manually (rather than with *set-hooks*),
+create a new webhook with the URL `http://localhost:14001/`, and with
+the *Trigger* checkbox set only on *Pipeline events*.
-- 
GitLab