From fce5a24f80d0b78f12fbc923561b063d49c6485e Mon Sep 17 00:00:00 2001
From: ale <ale@incal.net>
Date: Thu, 6 Mar 2025 17:36:44 +0000
Subject: [PATCH] Update README

---
 README.md | 49 +++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 41 insertions(+), 8 deletions(-)

diff --git a/README.md b/README.md
index a7eace3..3875bcd 100644
--- a/README.md
+++ b/README.md
@@ -4,21 +4,52 @@ Experimental Emacs LLM autocomplete assistant
 
 Note: the code in this repository isn't ready for use yet.
 
 This is a simple minimalistic Emacs package that provides
-autocomplete-like functionality using LLMs. It uses the Ollama API for
-LLM access.
+autocomplete-like functionality using LLMs. It uses
+[Ollama](https://ollama.com) for LLM access.
 
 It tries to improve the autocompletion quality by providing
 *additional context* for the LLM, obtained by indexing your project's
 source code, and retrieving snippets that are relevant to the code
 being completed (a form of *RAG*).
 
-The indexing code is taken from
-[Aider](https://github.com/Aider-AI/aider) and can run as a local
-service with its own HTTP API in order to return results with low
-latency.
+# How it works
+
+## Code indexing
 
-## Usage
+The indexing code is taken almost verbatim from
+[Aider](https://github.com/Aider-AI/aider), specifically its RepoMap
+implementation, which uses
+[grep-ast](https://github.com/Aider-AI/grep-ast) under the hood. It's
+a very interesting implementation that uses tree-sitter to extract
+semantically useful elements from the code, and PageRank to score
+them. It's also solid, well isolated, and quite sophisticated (it
+caches parsed ASTs in a project-wide SQLite database for performance,
+among other things), so I saw no need to re-implement it.
+
+The indexer operates at the project level, where a project is
+identified by a Git repository. Note: in the current implementation,
+it's the *indexer* that implements the project abstraction, not Emacs
+(so we can't use Emacs's own project-aware tooling).
+
+For latency reasons, the indexer is implemented as a service that runs
+in the background and exposes an HTTP API. This API takes local file
+names as input, so the indexer daemon needs full access to the
+local filesystem. A good way to achieve this is to use systemd's user
+session management capabilities, as outlined below.
+
+## LLM access
+
+In the current implementation, we're just using the *ollama* binary to
+access the LLM. It seems relatively straightforward to switch to a
+direct API call eventually.
+
+## Emacs integration
+
+The Emacs package is very simple (and surely incorrect in a bunch of
+ways); it defines the *copilot-complete* function, which attempts to
+autocomplete the active buffer at the current position.
+
+# Usage
 
 To run the indexing service you can use Systemd. But first you need to
 install the *ecopilot_srcindex* Python package somewhere: for the
@@ -44,4 +75,6 @@ systemctl --user enable ecopilot-srcindex.socket
 systemctl --user start ecopilot-srcindex.socket
 ```
 
-The service will be automatically started when necessary.
+The service will be automatically started when necessary. It needs to
+run as your user because it needs to read source code from the local
+filesystem.
-- 
GitLab
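
Editor's note: the README above names *copilot-complete* as the package's single entry point but does not show how to invoke it. A minimal sketch of a user configuration, assuming the user wants a key binding (the `C-c <tab>` key choice is purely illustrative; only the function name comes from the patch):

```elisp
;; Illustrative only: bind the completion command described in the
;; README to a key. `copilot-complete' is the function the patch
;; names; the key sequence is an assumption, not part of the package.
(global-set-key (kbd "C-c <tab>") #'copilot-complete)
```

With a binding like this, pressing the key in a source buffer would call the function, which (per the README) attempts to autocomplete the buffer at point.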