Notes · By TXYOU

When NOT to use AI in a feature

Five tests I run before reaching for an LLM. A practical checklist for product decisions where the obvious answer is "add AI" — and where doing so makes the product worse.

Published 2026-05-16 · 5 min read

It's never been easier to add AI to a feature. That's the problem. The default move for product people in 2026 is to wire up an LLM call and consider it done. Often the result is worse than the boring procedural alternative: slower, costlier, less predictable, and harder to debug.

Here are the questions I ask before reaching for a model.

1. Is the task closed-form?

If the answer can be computed by an algorithm (a date range filter, a regex match, a sort, a calculation), use the algorithm. LLMs are unreliable at deterministic tasks: they occasionally hallucinate, occasionally miss obvious patterns, and are slow and expensive compared to a function call.

A red flag I see often: “we use an LLM to extract phone numbers from text.” That's a regex problem.
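Here is a minimal sketch of what the regex version might look like. It assumes North American style numbers (an optional +1, then 3-3-4 digits separated by spaces, dots, or hyphens). The pattern and the name `extract_phone_numbers` are illustrative; international formats would need a broader pattern or a dedicated parsing library.

```python
import re

# Assumed format: optional "+1", then 3-3-4 digits with optional
# space, dot, or hyphen separators. Area code may be in parentheses.
PHONE_RE = re.compile(
    r"(?<!\d)(?:\+?1[\s.-]?)?(?:\(\d{3}\)|\d{3})[\s.-]?\d{3}[\s.-]?\d{4}(?!\d)"
)

def extract_phone_numbers(text: str) -> list[str]:
    """Return every substring of `text` that matches the assumed pattern."""
    return PHONE_RE.findall(text)

print(extract_phone_numbers("Call 555-123-4567 or (555) 987-6543, not 12345."))
# ['555-123-4567', '(555) 987-6543']
```

The same input always yields the same list, and a missed format is fixed by editing one pattern rather than re-tuning a prompt.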

2. Does the user need determinism?

If the same input must always produce the same output — a calculation, a routing decision, a security check — don't use an LLM. Even with temperature: 0, LLM outputs drift across model versions, prompts, and conditions you can't fully control.

Determinism isn't about absolute correctness; it's about consistency. Users get angry when “the same question got a different answer yesterday.”

3. Is the latency budget tight?

LLM calls typically take 500 ms to 5 s, depending on model and prompt length. If your feature must respond in under 200 ms (autocomplete, search-as-you-type, gesture responses, ambient UI), an LLM is the wrong tool. Use a smaller, faster mechanism: embeddings + nearest-neighbor, a finite-state machine, or just a lookup table.
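As one illustration of the lookup-table option, here is a rough sketch of prefix autocomplete over a sorted list using binary search. It assumes the vocabulary fits in memory; `build_index` and `complete` are hypothetical names, not taken from any library.

```python
from bisect import bisect_left

def build_index(terms: list[str]) -> list[str]:
    """Lowercase, de-duplicate, and sort the vocabulary once, up front."""
    return sorted(set(t.lower() for t in terms))

def complete(index: list[str], prefix: str, limit: int = 5) -> list[str]:
    """Return up to `limit` terms starting with `prefix`.

    In a sorted list, all terms sharing a prefix are contiguous, so one
    binary search finds the first candidate and a short scan does the rest.
    """
    prefix = prefix.lower()
    start = bisect_left(index, prefix)
    results = []
    for term in index[start:start + limit]:
        if not term.startswith(prefix):
            break
        results.append(term)
    return results

index = build_index(["Search", "settings", "send", "share", "Seattle"])
print(complete(index, "se"))  # ['search', 'seattle', 'send', 'settings']
```

A lookup like this is a binary search plus a short scan per keystroke, with no network round trip.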

If you must use a model in the critical path, cache aggressively or stream so the user sees progress.
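A small sketch of the caching idea, using the standard library's `functools.lru_cache`. Here `call_model` is a hypothetical stand-in for a real model client, and the sketch assumes that identical prompts can safely share an answer.

```python
from functools import lru_cache

def call_model(prompt: str) -> str:
    """Hypothetical stand-in for a slow, paid model call."""
    call_model.calls += 1  # count real calls so the cache effect is visible
    return f"summary of: {prompt}"

call_model.calls = 0

@lru_cache(maxsize=1024)
def cached_answer(prompt: str) -> str:
    """Reach the model only the first time a given prompt is seen."""
    return call_model(prompt)

cached_answer("release notes for v2")
cached_answer("release notes for v2")  # served from the cache
print(call_model.calls)  # 1
```

This only helps when prompts repeat exactly; anything personalized or time-sensitive needs a cache key and expiry policy designed for it.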

4. Are the unit economics bad?

For high-frequency features — every page view, every keystroke, every background sync — even cheap models add up fast. A $0.001 call per action sounds tiny. At 10,000 actions per user per month, that's $10 per user before you've paid for anything else.
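The arithmetic is worth writing down so it can be rerun with your own numbers. This is a sketch with a hypothetical function name; it uses `Decimal` so the money math does not pick up float rounding.

```python
from decimal import Decimal

def monthly_cost_per_user(cost_per_call: str, calls_per_month: int) -> Decimal:
    """Model spend per user per month, in dollars."""
    return Decimal(cost_per_call) * calls_per_month

# The example from the text: $0.001 per call, 10,000 calls per user per month.
print(monthly_cost_per_user("0.001", 10_000))  # 10.000
```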

If the feature has to scale, model the unit economics first. Often the answer is “use a tiny model” or “use a model only on the critical 1% of calls.”
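One way to reserve the model for the small share of calls that need it is to try a procedural rule first and fall back only when the rule has no answer. The sketch below is illustrative: `route`, `canned_reply`, and `fake_model` are hypothetical names, and the canned reply is made-up sample data.

```python
from typing import Callable, Optional

def route(
    request: str,
    cheap_rule: Callable[[str], Optional[str]],
    model_call: Callable[[str], str],
) -> tuple[str, str]:
    """Try the procedural path first; use the model only if it returns None.

    Returns (answer, source) so the share of model calls can be measured.
    """
    answer = cheap_rule(request)
    if answer is not None:
        return answer, "rule"
    return model_call(request), "model"

# Hypothetical example: canned replies for known intents.
CANNED = {"reset password": "Use the reset link on the sign-in page."}

def canned_reply(request: str) -> Optional[str]:
    return CANNED.get(request.strip().lower())

def fake_model(request: str) -> str:
    """Stand-in for a real model client."""
    return f"[model answer for: {request}]"

print(route("Reset password", canned_reply, fake_model)[1])           # rule
print(route("Why is my export empty?", canned_reply, fake_model)[1])  # model
```

Logging the `source` field shows what fraction of traffic actually reaches the model, which is the number the unit economics depend on.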

5. Do users want control, not magic?

Some users want the tool to do the work. Others want the tool to help them do the work, with their hands on the wheel. Power users in particular hate it when AI takes over a step they want to control.

For a feature like “rename my files,” ask: do users want the AI to rename automatically, or to suggest names they can edit? Suggesting almost always wins — same value, less surprise, no undo dance.

The heuristic

When I'm tempted to add AI to a feature, I do a thought experiment: if I had to ship this without any AI, what would the procedural version look like? If the answer is “fine” or even “decent,” I usually ship the procedural version first and add AI only where it measurably improves on it.

The bar should be: AI here removes a step the user actually dreads, or unlocks something the procedural version literally cannot do. Not: AI here makes the marketing copy sound modern.

What this saves

Cost: the procedural version is usually 5–50× cheaper
Latency: the procedural version is usually 10–100× faster
Debug time: procedural code is inspectable; LLM behaviour is not
User trust: predictable tools are easier to trust than smart ones

LLMs are best used as a thin layer on top of a solid procedural product, not as a substitute for one. The features I'm proudest of in TXYOU's apps use AI sparingly (a sentence here, a generation there) on top of a deterministic shell.