Can You Speed Up Embeddings by Removing Filler Words and Still Keep Accuracy?

Strip the filler words out of your documents before you embed them and embedding gets roughly 25% cheaper, at a cost of one to two points of retrieval accuracy. That trade-off was flat across every model I tried. The real lesson isn’t the caveman trick, though: it’s that twelve test questions will lie to you with a perfectly straight face, and a clean model-by-model story can turn out to be complete garbage once you run a few hundred.
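To see why a dozen questions can mislead, here is a minimal sketch of the arithmetic. It treats each test question as an independent pass/fail trial and reports two things: how far one flipped answer moves the score, and the binomial standard error of the measured accuracy. The 80% baseline accuracy and the choice of 300 as "a few hundred" are assumptions for illustration, not figures from my runs, and the function names are hypothetical.

```python
import math


def points_per_question(n_questions: int) -> float:
    """Accuracy change, in percentage points, from flipping a single answer."""
    return 100.0 / n_questions


def standard_error_points(accuracy: float, n_questions: int) -> float:
    """Binomial standard error of a measured accuracy, in percentage points."""
    return 100.0 * math.sqrt(accuracy * (1.0 - accuracy) / n_questions)


# Hypothetical baseline accuracy, chosen only to make the numbers concrete.
ASSUMED_ACCURACY = 0.80

for n in (12, 300):
    print(
        f"n={n:>3}: one question = {points_per_question(n):.1f} pts, "
        f"standard error = {standard_error_points(ASSUMED_ACCURACY, n):.1f} pts"
    )
```

With twelve questions, one flipped answer moves the score by about 8.3 points and the standard error is about 11.5 points, so a one to two point gap is far below what the test can resolve. At 300 questions a single answer is worth about 0.3 points and the standard error drops to about 2.3 points. That is still an unpaired bound; scoring both variants on the same questions typically cancels much of the remaining noise.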