A 17× perf gap between pg-raggraph and Apache AGE turned out to be 5 lines of glue code in the bakeoff adapter, not an architectural problem. The fix, the four library-side wins still on the floor, and the three architectural directions ahead — pg_net sidecar, pgrx Rust extension, hybrid embedding tiers.
Three corpora, three different winners, none of them the chunker the README recommended. Why nobody can tell you in advance which chunker to use, and the 30-minute primitive that does the work for you.
AGE is a property-graph engine; pg-raggraph is read-mostly retrieval that combines vector + BM25 + shallow graph traversal in one query plan. Where each wins, where neither fits, and the deployment story that closes off most of the postgres install base.
Real OLTP corpus, twelve-combo bakeoff with three baked-in models plus Snowflake Arctic via BYO YAML, hybrid search via promoted metadata, then wired into a LangGraph agent through inline mode. Every command actually run.
GraphRAG that runs entirely in PostgreSQL — pgvector for vectors, recursive CTEs for graph traversal, tsvector BM25 for keyword search. No graph database, no second backup strategy, no data sync.
Most teams reach for Neo4j or Apache AGE the moment they read the Microsoft GraphRAG paper. The honest answer is most GraphRAG workloads don’t need a graph database — pgvector + recursive CTEs + tsvector handle 1-3 hop traversal in one ACID database.
Seven walkthroughs with opinions — what each chunker is good at, where it falls over, and the corpus shape that flips the leaderboard between them. A field guide, not a recommendation. Bakeoff first.
An illegal chop shop for your data — the YAML-driven RAG ingest tool that ships a bakeoff primitive so you measure chunker × embedder × your corpus instead of vibe-picking from somebody else’s blog post.
Most production RAG pipelines run on one signal: chunks. Add doc-level summaries plus structured metadata in Postgres and you get three signals — with working SQL at the bottom of the post.
Sub-millisecond extractive summarization with byte-identical Python and Rust implementations. The preprocessor that sits in front of the LLM call and cuts tokens 40-94 percent.