Yonk-Labs

Building tools in the open

Welcome to Yonk-Labs — we build open source tools and share what we learn along the way.

Explore our blog for technical write-ups, watch our videos, listen to our podcasts, browse sample code, or check out our projects.

Recent

What Actually Moves RAG Accuracy (And What I Spent A Week Measuring Wrong)

One failing LoCoMo question turned into a cross-corpus, multi-system benchmark — and a pile of retracted conclusions. Small-N runs lie, cross-vendor numbers are rarely apples-to-apples, and a correctness bug will impersonate an architecture win every time. Run the no-context baseline, 6x your sample, and diff the bytes that reach the model before you trust any RAG number.
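
To make "small-N runs lie" concrete, here is a minimal sketch that compares the uncertainty on an accuracy score at a small sample and at six times that sample. It uses a standard Wilson score interval; the function name, the sample sizes, and the 75% accuracy figure are illustrative assumptions, not numbers from the post.

```python
import math


def wilson_interval(correct: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z=1.96 is roughly 95% coverage)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half


# Illustrative numbers only: the same 75% accuracy at N=12 and at 6x that sample.
small_lo, small_hi = wilson_interval(9, 12)
large_lo, large_hi = wilson_interval(54, 72)

print(f"N=12: {small_lo:.2f} to {small_hi:.2f}")
print(f"N=72: {large_lo:.2f} to {large_hi:.2f}")
```

The interval at N=12 is wide enough that two systems several points apart cannot be told apart, which is the practical argument for growing the sample before comparing RAG setups.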

Can You Speed Up Embeddings by Removing Filler Words and Still Keep Accuracy?

Strip the filler words out of your documents before you embed them and embedding gets ~25% cheaper at a cost of one to two points of retrieval accuracy, a trade that held flat across every model I tried. The real lesson isn’t the caveman trick: it’s that twelve test questions will lie to you with a perfectly straight face, and a clean model-by-model story can be complete garbage until you run a few hundred.
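
As a rough sketch of the filler-word idea, the snippet below drops a small set of filler words from a document before it would be handed to an embedding model. The word list and the function names are hypothetical, chosen for illustration; the post's actual list and pipeline are not shown here, and word count is only a crude stand-in for model tokens.

```python
# Hypothetical filler list for illustration; not the list used in the post.
FILLER_WORDS = frozenset(
    {"a", "an", "the", "of", "to", "is", "are", "was", "were",
     "that", "very", "really", "just"}
)


def strip_filler(text: str, fillers: frozenset = FILLER_WORDS) -> str:
    """Drop whitespace-separated words whose lowercased, punctuation-trimmed form is a filler."""
    kept = []
    for token in text.split():
        core = token.strip(".,;:!?\"'()").lower()
        if core not in fillers:
            kept.append(token)
    return " ".join(kept)


def word_savings(original: str, stripped: str) -> float:
    """Fraction of whitespace-separated words removed (a proxy for embedding cost)."""
    before = len(original.split())
    after = len(stripped.split())
    return 0.0 if before == 0 else 1 - after / before


doc = "The cat is on the mat."
short = strip_filler(doc)
print(short)
print(word_savings(doc, short))
```

Measuring the savings on your own corpus, and the retrieval accuracy on a few hundred questions rather than a dozen, is what tells you whether the trade is worth it.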