Architecture

System Overview

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Frontend  │────▶│   Nginx     │────▶│   Backend   │
│   (React)   │     │  (Reverse   │     │  (FastAPI)  │
└─────────────┘     │   Proxy)    │     └──────┬──────┘
                    └─────────────┘            │
                                               ▼
                    ┌─────────────┬─────────────┬─────────────┐
                    │  PostgreSQL │   ChromaDB  │    Redis    │
                    │  (Metadata) │  (Vectors)  │  (Queue)    │
                    └─────────────┴─────────────┴──────┬──────┘
                                                       │
                                                       ▼
                                               ┌─────────────┐
                                               │ ARQ Worker  │
                                               │  (Tasks)    │
                                               └─────────────┘

Components

Frontend

React SPA built with Vite and TypeScript. Handles:

User authentication
Document upload interface
Chat interface
Document management

Backend

FastAPI application providing REST API endpoints. Core responsibilities:

Authentication and authorization
Document processing orchestration
Chat/RAG endpoints
Admin functionality

Background Task Processing (ARQ + Redis)

Document processing runs asynchronously using ARQ, an asyncio-based job queue for Python backed by Redis. The API only enqueues work and returns, so it stays responsive while heavy tasks such as text extraction and embedding run in the background.

How it works:

Job Enqueueing — When a document is uploaded, FastAPI enqueues a processing job to Redis via ARQ
Job Queue — Redis stores the job queue, maintaining task order and state
Worker Processing — A separate ARQ worker process pulls jobs from Redis and executes them
Status Updates — The worker updates job status in Redis/PostgreSQL; the frontend polls for progress
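The four steps above can be sketched with an in-memory stand-in. This is not ARQ itself: an asyncio.Queue plays the role of the Redis-backed queue and a plain dict plays the role of the status store, and the names enqueue_document, worker, and run_demo are hypothetical. With ARQ, the enqueue step would typically go through enqueue_job on a pool obtained from create_pool.

```python
import asyncio


async def enqueue_document(queue, status, document_id):
    """Step 1: the API handler records the job and returns without processing it."""
    status[document_id] = "queued"
    await queue.put(document_id)


async def worker(queue, status):
    """Step 3: a separate worker pulls jobs from the queue and runs them."""
    while True:
        document_id = await queue.get()
        status[document_id] = "processing"
        await asyncio.sleep(0.01)  # placeholder for extraction, embedding, indexing
        status[document_id] = "complete"  # step 4: status a polling client would read
        queue.task_done()


async def run_demo(document_ids):
    queue = asyncio.Queue()  # step 2: stands in for the Redis-backed job queue
    status = {}              # stands in for job state kept in Redis/PostgreSQL
    worker_task = asyncio.create_task(worker(queue, status))
    for document_id in document_ids:
        await enqueue_document(queue, status, document_id)
    await queue.join()       # wait until every queued job has been processed
    worker_task.cancel()     # the demo worker would otherwise loop forever
    return status


if __name__ == "__main__":
    print(asyncio.run(run_demo(["doc-1", "doc-2"])))
```

The point of the split is that enqueue_document returns as soon as the job is queued, while the slow work happens in the worker loop.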

Why ARQ?

Native async/await support (matches FastAPI's async model)
Lightweight compared to Celery
Built specifically for Redis
Simple API with robust retry and timeout handling

Task Examples:

Document text extraction
Chunk generation and embedding
Vector store indexing
Batch operations
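As an illustration of the chunk generation task, here is a minimal sketch of fixed-size chunking with overlap. The function name chunk_text and the default sizes are assumptions; the actual chunking strategy used by the worker is not described here and may split on sentences or tokens instead.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into character windows that overlap by `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("chunk_size must be positive and overlap smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks


if __name__ == "__main__":
    print(chunk_text("abcdefghij", chunk_size=4, overlap=1))
```

Overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, at the cost of storing some text twice.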

The worker runs as a separate container or process, started with arq app.workers.tasks.WorkerSettings, so workers can be scaled horizontally without scaling the API.
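A minimal sketch of what a module such as app/workers/tasks.py might contain. The task name process_document, its return value, and the retry and timeout numbers are assumptions for illustration; only the WorkerSettings entry point comes from the command above.

```python
import asyncio


async def process_document(ctx: dict, document_id: str) -> dict:
    """Hypothetical task. ARQ passes a context dict as the first argument."""
    await asyncio.sleep(0)  # placeholder for extraction, chunking, and embedding
    return {"document_id": document_id, "status": "complete"}


class WorkerSettings:
    """Settings class that the arq command loads by its dotted path."""

    functions = [process_document]  # tasks this worker is allowed to run
    max_tries = 3      # assumed value: attempts before a job is given up on
    job_timeout = 300  # assumed value: seconds a single job may run
    # No redis_settings attribute is set here, so ARQ is expected to fall back
    # to its default connection; a real deployment would configure it explicitly.
```

Because the worker only needs this module and a Redis connection, more worker containers can be started against the same queue to increase throughput.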

Data Stores

Store	Purpose
PostgreSQL	User accounts, document metadata, chat history
ChromaDB	Vector embeddings for similarity search
Redis	ARQ job queue, task state, caching

RAG Pipeline

Query — User submits a question
Embed — Question is converted to vector embedding
Retrieve — Similar chunks are found via ChromaDB
Augment — Retrieved chunks are added to prompt context
Generate — LLM generates response using context
Cite — Response includes source chunk references
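The pipeline steps above can be sketched end to end with toy components. Everything here is a stand-in: embed uses word counts instead of a real embedding model, retrieve does a linear scan instead of querying ChromaDB, and generate is a caller-supplied function instead of an LLM client. All function names are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, chunks, k=2):
    """Embed the question and return the k most similar chunks."""
    query = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query, embed(c["text"])), reverse=True)
    return ranked[:k]


def build_prompt(question, retrieved):
    """Augment: place the retrieved chunks, tagged with their ids, before the question."""
    context = "\n".join(f"[{c['id']}] {c['text']}" for c in retrieved)
    return f"Answer using only the context below.\n\n{context}\n\nQuestion: {question}"


def answer(question, chunks, generate, k=2):
    """Run retrieve, augment, and generate, then attach source chunk ids as citations."""
    retrieved = retrieve(question, chunks, k)
    prompt = build_prompt(question, retrieved)
    return {"answer": generate(prompt), "sources": [c["id"] for c in retrieved]}
```

Returning the chunk ids alongside the generated text is what lets the chat interface show which passages a response was based on.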

Authentication

JWT-based authentication with:

Access tokens (short-lived)
Refresh tokens (long-lived)
Secure HTTP-only cookies
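To make the access/refresh split concrete, here is a standard-library sketch of issuing and verifying HS256-signed JWTs. It is illustrative only: a production backend would normally use a maintained JWT library, and the secret, the lifetimes, the custom "type" claim, and the function names are all assumptions rather than details of this project.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"change-me"        # assumption: loaded from configuration in practice
ACCESS_TTL = 15 * 60         # assumed lifetime of a short-lived access token
REFRESH_TTL = 7 * 24 * 3600  # assumed lifetime of a long-lived refresh token


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str) -> bytes:
    return hmac.new(SECRET, signing_input.encode(), hashlib.sha256).digest()


def issue_token(user_id, token_type, ttl, now=None):
    """Build header.payload.signature with an expiry `ttl` seconds from now."""
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": user_id, "type": token_type, "iat": now, "exp": now + ttl}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode()) for part in (header, payload)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input))}"


def verify_token(token, expected_type, now=None):
    """Check signature, token type, and expiry; return the payload if all pass."""
    now = int(time.time()) if now is None else now
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    expected = _sign(f"{header_b64}.{payload_b64}")
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(payload_b64))
    if payload["type"] != expected_type:
        raise ValueError("wrong token type")
    if payload["exp"] <= now:
        raise ValueError("token expired")
    return payload
```

In a cookie-based setup, each token would then be sent in a cookie carrying the HttpOnly and Secure attributes, so page scripts cannot read it and it only travels over HTTPS.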