Field notes
Writing about agents, retrieval, and the things in between.
Design notes, engineering decisions, and the occasional postmortem from building RunaxAI.
How a Mac mini under my desk serves runaxai.com
The deployment stack behind RunaxAI: K3s on a Mac mini, a Helm chart that covers everything, self-hosted GitHub Actions runners via ARC, and Cloudflare Tunnel doing what port forwarding used to.
11 min read · #deployment #kubernetes #helm #cloudflare #infra #engineering
The Agentic RAG Pipeline: Adaptive Retrieval at Scale
How RunaxAI dynamically adjusts retrieval strategies based on corpus size, combining Hybrid Search, HyDE, and Cross-Encoder Reranking to deliver high-fidelity context.
3 min read · #rag #architecture #pinecone #embeddings
Memory, the way we wish chat apps did it
Why we store user memory as atomic facts with supersession, how the extraction pipeline avoids reprocessing the entire conversation every turn, and the three-store split between Redis, Postgres, and pgvector.
8 min read · #memory #architecture #rag #engineering
Agent Orchestration: Taming the LLM Tool Loop
Building deterministic constraints around non-deterministic LLMs: Tool policies, duplicate suppression, budget limits, and dynamic summarization in RunaxAI.
3 min read · #llm #orchestration #agents #engineering
Redis as the Backbone: Caching, Sessions, and State
How RunaxAI leverages a single Redis instance to manage active SSE chat sessions, semantic tool caching, and background worker queues.
3 min read · #redis #caching #architecture #state-management
Behind the chat interface: orchestration, memory, caching, eval — the full picture
A deep dive into Runax — a production document intelligence platform built without LangChain or LangGraph.
10 min read · #rag #architecture #production #engineering
Introducing RunaxAI
An agentic RAG system with two chat modes, six tools, four specialised agents, hybrid retrieval, and an observability stack — what it does today, and how it actually works under the hood.
10 min read · #product #launch #rag #architecture