MEP

Why WYRM MEP never lets a language model size a cable

James Reed | June 10, 2026 | 9 min read

Key Takeaways

  • A general-purpose model that is usually right becomes a liability the moment 'usually' turns into a mis-sized cable or a duct that does not meet the standard.
  • WYRM MEP draws a hard line: language, planning and coordination are handled by AI agents, but every engineering number is produced by codified calculation engines running BS 7671, CIBSE and SBEM methods exactly and identically every time.
  • The point is the shape of the failure, not a percentage. Generative error is unbounded and silent; algorithmic error is bounded and visible: out-of-range inputs are rejected, not guessed.
  • Every figure traces back to a named standard, so what reaches the engineer is fast, reproducible, and built to be checked and signed.

There is a temptation, building AI into engineering software, to let the model do everything. It reads the brief, so why not let it size the cable too? The answer is that engineering cannot run on probability, and a general-purpose model that is usually right is exactly the wrong tool for the moment that matters. A mis-sized cable, a duct that does not meet the standard, a breaker that no longer coordinates — these are not typos. They are compliance risks that read on the page exactly like correct answers. WYRM MEP is built around that distinction.

The line we draw is explicit. Language, planning, and coordination are handled by AI agents — reading documents, extracting constraints, dividing work, drafting prose, chasing the knock-on effects of a change. But every engineering number is produced by deterministic calculation engines: codified implementations of the same methods an engineer applies by hand — BS 7671 for cable sizing and protection coordination, CIBSE for ventilation and loads, SBEM for the energy and EPC build-up — executed exactly and identically every time. The agents never size the cable. The engine does, and the engine runs the standard.

It helps to be precise about how the two kinds of system fail, because that is the real argument. A generative model asked a complex multi-step engineering question produces an answer that is often plausible and sometimes wrong, and, dangerously, it cannot tell you which. Ask the same question twice and the answer can drift. The error is unbounded: there is no ceiling on how wrong it can be, and it arrives confidently and silently. An algorithmic engine fails the opposite way. Identical inputs give the identical answer on every run, and that answer is the one the codified method prescribes. Out-of-range inputs are rejected rather than guessed. The error is bounded and visible: if the codified method itself contains a mistake, it is the same mistake on every run, where a reviewer can find it and fix it once.
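The "reject rather than guess" behaviour can be sketched in a few lines of Python. This is a minimal illustration, not WYRM MEP's engine: the names `size_cable`, `CableResult` and `OutOfRangeError` are hypothetical, and the rating table is invented placeholder data that is not taken from BS 7671.

```python
from dataclasses import dataclass

# PLACEHOLDER data: (max design current in A, conductor size in mm^2).
# These pairs are invented for illustration and are NOT taken from BS 7671.
PLACEHOLDER_RATINGS = [(10.0, 1.0), (16.0, 1.5), (25.0, 2.5), (32.0, 4.0)]


class OutOfRangeError(ValueError):
    """Raised when an input falls outside what the table covers."""


@dataclass(frozen=True)
class CableResult:
    size_mm2: float
    basis: str  # the method or table the figure came from


def size_cable(design_current_a: float) -> CableResult:
    """Return the smallest placeholder size rated for the current, or refuse."""
    max_rated = PLACEHOLDER_RATINGS[-1][0]
    # Reject rather than guess: no extrapolation, no nearest-match fallback.
    # The comparison is also False for NaN, so NaN is rejected too.
    if not (0.0 < design_current_a <= max_rated):
        raise OutOfRangeError(
            f"design current {design_current_a} A is outside (0, {max_rated}] A"
        )
    for rating_a, size_mm2 in PLACEHOLDER_RATINGS:
        if design_current_a <= rating_a:
            return CableResult(size_mm2, "placeholder table (illustrative)")
    raise AssertionError("unreachable: range was checked above")
```

Calling `size_cable(20.0)` twice returns equal results, while `size_cable(40.0)` raises `OutOfRangeError` instead of returning a plausible-looking number. The failure is loud and happens at the boundary of what the table covers.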

Put those side by side and the preference is obvious for anything that gets signed. A system that is 90% accurate and flags loudly when it is unsure beats one that is 95% accurate and invents the rest without telling you. The first you can build a checking process around; the second you cannot, because you never know which 5% to check. In a regulated deliverable, a visible gap is a manageable risk and a silent one is a latent defect.

This is not a hypothetical concern dressed up as a design principle. Even as models improve, hallucination persists on things as checkable as a citation — published studies found fabricated references a substantial fraction of the time on earlier model generations, and while the rate keeps falling it is not zero. If a frontier model will confidently invent a reference that anyone can look up, the risk of it confidently inventing a cable size that nobody re-derives is not one a practice should carry. The fix is not a better prompt; it is to not ask the model to do that job at all.

So the labour divides by what each part is good at. The agents read the Employer's Requirements, plan the work, draft the specification, and coordinate the disciplines. The calculation engines do the maths, deterministically. And two independent Principal-QA gates check the join between them — that the numbers the engines produced match the spec the agents extracted, and that nothing crossed from the generative side into the engineering side. What reaches your engineer is both fast and right, and it shows its working: every figure carries the named standard it came from, so it can be reviewed and signed with confidence rather than re-derived from scratch.
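One way to picture the check at the join is a gate that inspects each figure's provenance before it reaches the engineer. The sketch below is an assumption about how such a gate could look, not a description of the Principal-QA gates themselves. `Figure`, `qa_gate`, the `produced_by` field and the range-style `spec_limits` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Figure:
    name: str
    value: float
    unit: str
    standard: str     # named standard the figure traces to, e.g. "BS 7671"
    produced_by: str  # "engine" or "agent" (hypothetical provenance tag)


def qa_gate(
    figures: list[Figure],
    spec_limits: dict[str, tuple[float, float]],
) -> list[str]:
    """Return a list of findings; an empty list means the gate passes."""
    findings = []
    for fig in figures:
        # Nothing numeric may cross over from the generative side.
        if fig.produced_by != "engine":
            findings.append(f"{fig.name}: number did not come from a calculation engine")
        # Every figure must carry the standard it came from.
        if not fig.standard:
            findings.append(f"{fig.name}: no named standard attached")
        # The figure must correspond to a requirement extracted from the spec.
        if fig.name not in spec_limits:
            findings.append(f"{fig.name}: no matching requirement in the extracted spec")
            continue
        low, high = spec_limits[fig.name]
        if not (low <= fig.value <= high):
            findings.append(
                f"{fig.name}: {fig.value} {fig.unit} outside spec range {low} to {high}"
            )
    return findings
```

The gate returns findings rather than a bare pass or fail, so a reviewer sees which figure failed and why.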

There is a quieter benefit to reproducibility that matters on long jobs. Because the engines are deterministic, a calculation re-run six months later against the same inputs returns the same answer, with the same audit trail. When a design is challenged in review, or a parameter changes and the question is what else moves, the engine gives a stable, inspectable basis rather than a fresh roll of the dice. That is what makes the output defensible, which in MEP is the property that actually matters.
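The reproducibility claim can be made checkable by fingerprinting each calculation. The following is a minimal sketch using only Python's standard library; `audit_record` and its field names are hypothetical, and the real audit trail may be structured quite differently.

```python
import hashlib
import json


def audit_record(inputs: dict, result: dict, standard: str) -> dict:
    """Bundle inputs, result and a digest that changes if any of them changes."""
    # sort_keys gives a canonical serialisation, so key order cannot alter the digest.
    payload = json.dumps(
        {"inputs": inputs, "result": result, "standard": standard},
        sort_keys=True,
        separators=(",", ":"),
    )
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return {"inputs": inputs, "result": result, "standard": standard, "sha256": digest}
```

If a re-run six months later produces a record with the same `sha256`, the inputs, result and cited standard are byte-for-byte the same as before. A different digest shows that something changed, and comparing the stored inputs and result shows what.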

WYRM MEP keeps the generative and the algorithmic in their lanes on purpose: generative where it is safe, codified where it counts. It is one of WYRM's two flagship engineering products, sold alongside WYRM Data and bundled as WYRM Engineering at £250 per seat. The design philosophy is the product — an engineer can sign what comes out because they can see exactly how it was produced and which parts a machine was never allowed to guess.