We added semantic search to agent memory, then benchmarked it against plain document RAG on the same questions. The boring baseline won by 6x. Here is why that is the point.
The practical follow-up to the goldfish-memory post. Bring a Postgres database with pgvector and an agent that talks to users; an hour later you’ve got two-tier memory bolted on. Staging, realtime and consolidate cells, three scheduling options, three reader patterns, and an LLM fact extractor — Python and Rust both.
Agent memory has two completely different jobs — fast context for the next reply, and curated truth three weeks later — and most people try to do both with one tool. Here’s the two-tier pattern I built chunkshop’s memory layer around, the late-event bug that silently eats conversations, and why ‘just use pgvector’ isn’t the whole answer.
The features I’d argue are genuinely novel — framers, hierarchical summaries, BYO embedders via four lines of YAML, schema-flex append mode, cross-language vector compatibility, and the modular-backends roadmap toward MariaDB and ClickHouse. Plus the four bets chunkshop is making about where RAG infrastructure goes next.
Real OLTP corpus, twelve-combo bakeoff with three baked-in models plus Snowflake Arctic via BYO YAML, hybrid search via promoted metadata, then wired into a LangGraph agent through inline mode. Every command actually run.
GraphRAG that runs entirely in PostgreSQL: pgvector for vectors, recursive CTEs for graph traversal, tsvector full-text search for keyword matching. No graph database, no second backup strategy, no data sync.
Most teams reach for Neo4j or Apache AGE the moment they read the Microsoft GraphRAG paper. The honest answer is that most GraphRAG workloads don’t need a graph database: pgvector, recursive CTEs, and tsvector handle 1-3 hop traversal in one ACID database.
Seven walkthroughs with opinions — what each chunker is good at, where it falls over, and the corpus shape that flips the leaderboard between them. A field guide, not a recommendation. Bakeoff first.
An illegal chop shop for your data — the YAML-driven RAG ingest tool that ships a bakeoff primitive so you measure chunker × embedder × your corpus instead of vibe-picking from somebody else’s blog post.
Most production RAG pipelines run on one signal: chunks. Add doc-level summaries plus structured metadata in Postgres and you get three signals — with working SQL at the bottom of the post.