The LLM Personalization vs Latency Tradeoff in SaaS

Aditya Lahiri

CTO & Co-Founder @ OpenFunnel

We've been debating this constantly at OpenFunnel. It's the new fundamental tradeoff in LLM-first SaaS.

Here's the tension:

With a few fast LLM calls, we can make everything hyper-personalized.

Contextual insights. Specific reasoning. Information tailored exactly to what this user cares about right now. It hits the user instantly and they "just get it". It's like a friend who knows them and their business is feeding them information through our product.

But nobody wants to wait 3 seconds for their results.

The old solution was simple: Pre-compute everything. Store it. Serve it instantly. Fast, but generic.
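That pre-compute pattern is easy to sketch. A minimal illustration, assuming a batch job has already written insights into a key-value store (the account IDs and insight text here are made up, not OpenFunnel's actual schema):

```python
# Hypothetical pre-computed store: insights generated offline by a batch job,
# keyed by account ID. Serving is a dictionary lookup, so it's instant.
PRECOMPUTED_INSIGHTS = {
    "acct_123": "Your pipeline grew 12% this week.",
    "acct_456": "Three target accounts visited your pricing page.",
}

def serve_insight(account_id: str) -> str:
    # O(1) lookup: fast, but the same text for every session and context.
    return PRECOMPUTED_INSIGHTS.get(account_id, "No insights yet.")
```

Fast, cheap, and reliable, but every user with the same key sees the same words. That's the "generic" half of the tradeoff.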

The new problem: Users now expect both. Personalized AND instant.

So the engineering challenge becomes: What do we pre-compute and store? What do we personalize at runtime? And how do we make that runtime layer fast enough that nobody notices?

Our current approach:

- Fast open-source models for the personalization layer
- Vector embeddings for instant semantic search
- A ton of optimization work on the unsexy stuff (caching strategies, parallel calls, smart pre-fetching)

The bar has moved. "Fast or personalized" isn't a choice. Users NEED both.

That's what makes this fun to build!

Made with ♥ in SF

© 2026 OPENFUNNEL. ALL RIGHTS RESERVED.

