The LLM Personalization vs Latency Tradeoff in SaaS


Aditya Lahiri
CTO & Co-Founder @ OpenFunnel
We've been debating this constantly at OpenFunnel. It's the new fundamental tradeoff in LLM-first SaaS.
Here's the tension:
With a few fast LLM calls, we can make everything hyper-personalized.
Contextual insights. Specific reasoning. Information tailored exactly to what this user cares about right now. It hits the user instantly and they "just get it". It's like a friend who knows them and their business, feeding them information through our product.
But nobody wants to wait 3 seconds for their results.
The old solution was simple: Pre-compute everything. Store it. Serve it instantly. Fast, but generic.
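A minimal sketch of that pre-compute pattern (the function names and stored summaries are illustrative, not our actual pipeline):

```python
# Hypothetical sketch of "pre-compute everything": generate generic
# results offline, store them, serve instantly at request time.
PRECOMPUTED = {}  # topic -> generic, non-personalized summary

def precompute(topics):
    for t in topics:
        # In a real system this is the slow, expensive LLM call,
        # run ahead of time instead of per request.
        PRECOMPUTED[t] = f"Generic summary for {t}"

def serve(topic):
    # O(1) lookup at request time: fast, but identical for every user.
    return PRECOMPUTED.get(topic, "miss")

precompute(["pricing", "onboarding"])
print(serve("pricing"))  # instant, but generic
```

Instant for everyone, and exactly the same for everyone. That's the limitation.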
The new problem: Users now expect both. Personalized AND instant.
So the engineering challenge becomes: What do we pre-compute and store? What do we personalize at runtime? And how do we make that runtime layer fast enough that nobody notices?
Our current approach:
- Fast open-source models for the personalization layer
- Vector embeddings for instant semantic search
- A ton of optimization work on the unsexy stuff (caching strategies, parallel calls, smart pre-fetching)
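Two of those unsexy pieces, caching and parallel calls, can be sketched together. This is a toy illustration under assumed latencies, not OpenFunnel's actual runtime layer; `fetch_insight` stands in for a small, fast model call:

```python
# Hypothetical sketch: fan out several small personalization calls
# concurrently and cache their results, so the runtime layer's total
# latency is roughly one call, not the sum of all of them.
import asyncio
import time

CACHE = {}

async def fetch_insight(user_id, topic):
    key = (user_id, topic)
    if key in CACHE:          # cache hit: skip the model call entirely
        return CACHE[key]
    await asyncio.sleep(0.1)  # stand-in for a fast LLM call (~100ms)
    result = f"{topic} insight for {user_id}"
    CACHE[key] = result
    return result

async def personalize(user_id, topics):
    # gather() runs all calls concurrently, so wall-clock time is
    # about the slowest single call instead of the sequential sum.
    return await asyncio.gather(*(fetch_insight(user_id, t) for t in topics))

start = time.perf_counter()
out = asyncio.run(personalize("u1", ["pricing", "churn", "expansion"]))
elapsed = time.perf_counter() - start
print(len(out), round(elapsed, 2))  # three results in ~one call's latency
```

Sequentially those three calls would cost ~300ms; fanned out they cost ~100ms, and a warm cache drops repeat requests to near zero.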
The bar has moved. "Fast or personalized" isn't a choice. Users NEED both.
That's what makes this fun to build!
