From fce5a24f80d0b78f12fbc923561b063d49c6485e Mon Sep 17 00:00:00 2001
From: ale <ale@incal.net>
Date: Thu, 6 Mar 2025 17:36:44 +0000
Subject: [PATCH] Update README

---
 README.md | 49 +++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 41 insertions(+), 8 deletions(-)

diff --git a/README.md b/README.md
index a7eace3..3875bcd 100644
--- a/README.md
+++ b/README.md
@@ -4,21 +4,52 @@ Experimental Emacs LLM autocomplete assistant
 Note: the code in this repository isn't ready for use yet.
 
 This is a simple minimalistic Emacs package that provides
-autocomplete-like functionality using LLMs. It uses the Ollama API for
-LLM access.
+autocomplete-like functionality using LLMs. It uses
+[Ollama](https://ollama.com) for LLM access.
 
 It tries to improve the autocompletion quality by providing
 *additional context* for the LLM, obtained by indexing your project's
 source code, and retrieving snippets that are relevant to the code
 being completed (a form of *RAG*).
 
-The indexing code is taken from
-[Aider](https://github.com/Aider-AI/aider) and can run as a local
-service with its own HTTP API in order to return results with low
-latency.
+# How it works
 
+## Code indexing
 
-## Usage
+The indexing code is taken almost verbatim from
+[Aider](https://github.com/Aider-AI/aider), specifically its RepoMap
+implementation, which uses
+[grep-ast](https://github.com/Aider-AI/grep-ast) under the hood. It's
+a very interesting implementation that uses tree-sitter to extract
+semantically useful elements from the code and PageRank to score
+them. It's also solid, well isolated, and quite sophisticated (it
+caches parsed ASTs in a project-wide SQLite database for performance,
+among other things), so I saw no need to re-implement it.
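+
+As a rough illustration of the approach (not Aider's actual code), the
+scoring step boils down to building a graph of files connected by the
+identifiers they define and reference, and letting PageRank pick the
+most central ones. The file names and identifiers below are made up
+for the example:
+
+```python
+# Toy sketch: files are graph nodes; an edge means "this file references
+# an identifier defined in that file". PageRank then scores the files.
+import networkx as nx
+
+defines = {"parser.py": {"parse"}, "utils.py": {"slugify"}}
+references = {"main.py": {"parse", "slugify"}, "cli.py": {"parse"}}
+
+graph = nx.DiGraph()
+for referencer, idents in references.items():
+    for ident in idents:
+        for definer, defined in defines.items():
+            if ident in defined:
+                graph.add_edge(referencer, definer)
+
+scores = nx.pagerank(graph)  # higher score == more relevant file
+print(sorted(scores.items(), key=lambda kv: -kv[1]))
+```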
+
+The indexer operates at the project level, where a project is
+identified by a Git repository. Note: in the current implementation,
+it's the *indexer* that implements the project abstraction, not Emacs
+(so we can't use Emacs's own project-aware tooling).
+
+For latency reasons, the indexer is implemented as a service that runs
+in the background and exposes an HTTP API. This API takes local file
+names as input, so the indexer daemon needs full access to the local
+filesystem. A good way to achieve this is to use systemd's user
+session management capabilities, as outlined below.
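+
+To give a feel for the shape of the interaction, here is a
+hypothetical client request; the endpoint path, port, and payload
+fields are placeholders for illustration, not the service's actual
+interface:
+
+```python
+# Hypothetical request to the indexer daemon; endpoint, port and
+# response shape are placeholders, not the real API.
+import requests
+
+resp = requests.post(
+    "http://localhost:8000/context",             # placeholder address
+    json={
+        "file": "/home/me/project/src/main.py",  # local path, hence local FS access
+        "position": 1234,                        # point being completed
+    },
+    timeout=2,
+)
+resp.raise_for_status()
+for snippet in resp.json().get("snippets", []):  # placeholder response shape
+    print(snippet["file"], snippet["text"])
+```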
+
+## LLM access
+
+In the current implementation, we simply shell out to the *ollama*
+binary to access the LLM. Switching to a direct API call eventually
+should be relatively straightforward.
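+
+In practice this amounts to something like the following sketch (the
+model name is just an example, and the real prompt includes the
+retrieved context):
+
+```python
+# Minimal sketch of shelling out to the ollama CLI; the model name is
+# only an example and the actual prompt construction differs.
+import subprocess
+
+prompt = "Complete the following code:\n<prefix>...</prefix>"
+result = subprocess.run(
+    ["ollama", "run", "qwen2.5-coder"],  # example model
+    input=prompt,
+    capture_output=True,
+    text=True,
+    check=True,
+)
+completion = result.stdout.strip()
+```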
+
+## Emacs integration
+
+The Emacs package is very simple (and surely incorrect in a bunch of
+ways): it defines the *copilot-complete* function, which will attempt
+to autocomplete the active buffer at the current position.
+
+# Usage
 
 To run the indexing service you can use Systemd. But first you need to
 install the *ecopilot_srcindex* Python package somewhere: for the
@@ -44,4 +75,6 @@ systemctl --user enable ecopilot-srcindex.socket
 systemctl --user start ecopilot-srcindex.socket
 ```
 
-The service will be automatically started when necessary.
+The service will be automatically started when necessary. It needs to
+run as your user because it reads source code from the local
+filesystem.
-- 
GitLab