RAG

pg-raggraph-rs

1 July 2026·2 mins

Rust Postgresql Pgrx Graphrag Performance Rag

The Rust performance line for pg-raggraph: a real pgrx extension with async background-worker ingest and hybrid retrieval, plus a sidecar mode for managed Postgres where you can’t load an extension at all.

llm-judge

1 July 2026·2 mins

Python Llm Rag Benchmarking Evaluation Cli

Portable CLI for judging RAG and LLM benchmark runs across local, OpenAI-compatible, and cloud providers — a deterministic quick mode, a paraphrase-tolerant LLM-as-judge mode, and a full per-case audit trail for every verdict.

graphrag-demo

1 July 2026·2 mins

Python Postgresql Apache-Age Pgvector Graphrag Rag

Apache AGE + pgvector in one PostgreSQL instance, side by side. Ask a question, watch vector-only, graph-only, and graph+vector retrieval run in parallel with timing for each.

We Gave Agent Memory Semantic Search. It Still Lost to Boring Old RAG.

4 June 2026·7 mins

Agent-Memory Vector-Search Pgvector Rag Postgres

We added semantic search to agent memory, then benchmarked it against plain document RAG on the same questions. The boring baseline won by 6x. Here is why that is the point.

Your Agent Doesn't Need Memory. It Needs Six of Them.

3 June 2026·7 mins

Agent-Memory Ai-Infrastructure Vector-Search Postgres Rag

“Add memory to the agent” sounds like one feature. It is six different jobs that need three different mechanisms. Here is the map, with a concrete example for each.

Reading the Big-Ass Grid: A Field Guide to Our RAG Bake-Off

2 June 2026·4 mins

Rag Benchmarks Stele Agent-Memory Retrieval Ai

A 150-row benchmark grid looks like the output of a robot having a stroke — until you know the three things each row tells you. A field guide to reading our RAG bake-off: read the parametric floor first, decode the system and lane columns, and ask the only two questions that matter — is it right, and what did it cost?

What Actually Moves RAG Accuracy (And What I Spent A Week Measuring Wrong)

1 June 2026·7 mins

Rag Agent-Memory Benchmarks Stele Retrieval Ai

One failing LoCoMo question turned into a cross-corpus, multi-system benchmark — and a pile of retracted conclusions. Small-N runs lie, cross-vendor numbers are rarely apples-to-apples, and a correctness bug will impersonate an architecture win every time. Run the no-context baseline, 6x your sample, and diff the bytes that reach the model before you trust any RAG number.

Can You Speed Up Embeddings by Removing Filler Words and Still Keep Accuracy?

31 May 2026·7 mins

Chunkshop Embeddings Rag Benchmarks Ai

Strip the filler words out of your documents before you embed them and embedding gets ~25% cheaper for one to two points of retrieval accuracy — flat, across every model I tried. The real lesson isn’t the caveman trick: it’s that twelve test questions will lie to you with a perfectly straight face, and a clean model-by-model story can be complete garbage until you run a few hundred.

Your Agent Has Goldfish Memory (And Your Vector Store Won't Fix It)

24 May 2026·8 mins

Chunkshop Agent-Memory Rag Postgres Pgvector Ai Agents

Agent memory has two completely different jobs — fast context for the next reply, and curated truth three weeks later — and most people try to do both with one tool. Here’s the two-tier pattern I built chunkshop’s memory layer around, the late-event bug that silently eats conversations, and why ‘just use pgvector’ isn’t the whole answer.

stele

17 May 2026·2 mins

Postgresql Agent-Memory Ai Agents Rag

Agentic memory implemented natively in PostgreSQL — the episodic, relational, time-anchored memory layer agents actually forget, kept in the database you already run.
