Commit 06c001b3 authored by ale

Style changes

parent 9d5edc8d
Pipeline #62733 passed
@@ -6,6 +6,8 @@ its ability to *scale down* for small installations, using very few
resources while maintaining a certain level of usefulness, offering an
alternative to heavyweight stacks like ELK in this scenario.
[[_TOC_]]
## Overview
The system's functionality is split into two parts:
@@ -111,9 +113,9 @@ The flattened records are then written to
periodically (and when they reach a certain size). These files can be
stored remotely, on S3-like backends.
The ingestion API endpoint is at */ingest*, and it expects a POST
request with an ND-JSON request body: newline-delimited JSON-encoded
records, no additional headers or footers.
The ingestion API endpoint is at `/ingest`, and it expects a POST
request with an ND-JSON request body (newline-delimited JSON-encoded
records, no additional headers or footers).
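
As an illustration of the request shape described above, here is a minimal sketch that builds (but does not send) an ingestion request using only the Python standard library. The server address, the record fields and the `Content-Type` value are assumptions made for the example; the text only specifies the `/ingest` path, the POST method and the ND-JSON body.

```python
import json
import urllib.request


def build_ingest_request(base_url, records):
    """Build (without sending) a POST request carrying records as ND-JSON."""
    # One compact JSON document per line, with no header or footer.
    body = "".join(
        json.dumps(record, separators=(",", ":")) + "\n" for record in records
    )
    return urllib.request.Request(
        base_url.rstrip("/") + "/ingest",
        data=body.encode("utf-8"),
        # Assumption: the content type the server expects is not documented here.
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )


# Hypothetical server address and record fields, for illustration only.
req = build_ingest_request(
    "http://localhost:3000",
    [{"message": "hello", "host": "web1"}, {"message": "bye", "host": "web2"}],
)
```

Passing `req` to `urllib.request.urlopen` would actually send it to a running server.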
### Schema unification
@@ -151,7 +153,7 @@ you won't see logs until the ingestion server decides it's time to
finalize the current Parquet file. For this reason, it might be
sensible to set the *--rotation-interval* option to a few minutes.
The query API is at */query* and it takes a full SQL query as the *q*
The query API is at `/query` and it takes a full SQL query as the *q*
parameter. The response will be JSON-encoded. Since the table to query
is created on-the-fly with every request, its name is not known in
advance to the caller: the SQL query should contain the placeholder
@@ -214,10 +216,8 @@ the URI scheme:
* *minio* - Generic S3-like API support. Use standard environment
variables (MINIO_ACCESS_KEY etc.) for credentials. URIs should have
this form: `minio://hostname/bucket/path`
* *s3* - AWS S3 (not ready yet). Supports URIs like
`s3://bucket/path`
* *gcs* - Google Cloud Storage (not ready yet). Supports URIs of the
form `gcs://project_id/bucket/path`
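
To make the three URI shapes listed above concrete, the following sketch splits each one into its components with `urllib.parse`. The function name and the returned dictionary keys are hypothetical and chosen for illustration; this is not how *pqlogd* itself parses storage URIs.

```python
from urllib.parse import urlparse


def split_storage_uri(uri):
    """Split a storage URI into named components, based on its scheme."""
    parsed = urlparse(uri)
    # Path segments after the authority part, ignoring empty ones.
    parts = [p for p in parsed.path.split("/") if p]
    if parsed.scheme == "s3":
        # s3://bucket/path : the authority part is the bucket.
        return {"scheme": "s3", "bucket": parsed.netloc, "path": "/".join(parts)}
    if parsed.scheme in ("minio", "gcs"):
        # minio://hostname/bucket/path and gcs://project_id/bucket/path :
        # the bucket is the first path segment.
        if not parts:
            raise ValueError("missing bucket in URI: %s" % uri)
        first_key = "hostname" if parsed.scheme == "minio" else "project_id"
        return {
            "scheme": parsed.scheme,
            first_key: parsed.netloc,
            "bucket": parts[0],
            "path": "/".join(parts[1:]),
        }
    raise ValueError("unsupported URI scheme: %r" % parsed.scheme)
```

For example, `split_storage_uri("minio://hostname/bucket/path")` yields the hostname, the bucket and the remaining path as separate fields.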
@@ -227,9 +227,9 @@ the URI scheme:
The server offers some debugging endpoints which might be useful to
understand what it is doing:
* */schema* will return the current schema in JSON format
* */debug/schema* will return a human-readable dump of the internal
state of the schema guesser
* `/schema` will return the current schema in JSON format
* `/debug/schema` will return a human-readable dump of the internal
state of the schema guesser, where you can find a report on the errors encountered
### Performance and Scaling
@@ -261,3 +261,10 @@ is certainly possible to run multiple instances of *pqlogd* in
parallel, pointing them at the same storage: generated filenames are
unique, so the query layer will maintain the aggregate view of all
logs.
Note that multiple instances of the indexer will each run their own,
independent schema analysis, which can potentially result in different
schemas depending on the input. This is not an issue, because what
matters is that the schema is consistent within each individual
Parquet file: the database engine can easily merge those together at
query time.
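
The idea can be pictured with a small pure-Python sketch in which two "files" with different column sets are combined by column name, filling missing columns with `None`. This union-by-name behaviour is an assumption made for illustration: the text does not describe how the database engine actually merges schemas, and the column names and rows below are invented.

```python
def merge_by_column_name(*files):
    """Combine (schema, rows) pairs into one table, matching columns by name."""
    # Build the unified column list, preserving first-seen order.
    columns = []
    for schema, _rows in files:
        for name in schema:
            if name not in columns:
                columns.append(name)
    # Re-project every row onto the unified columns; absent values become None.
    merged = []
    for schema, rows in files:
        for row in rows:
            record = dict(zip(schema, row))
            merged.append(tuple(record.get(name) for name in columns))
    return columns, merged


# Hypothetical files written by two independent instances.
file_a = (["timestamp", "message"], [("t1", "hello")])
file_b = (["timestamp", "message", "status"], [("t2", "bye", 200)])
columns, rows = merge_by_column_name(file_a, file_b)
```

Each file is internally consistent, so the merge only has to line up columns across files.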