A Practical Reflection for Organizations Building Internal AI-Driven Value Capabilities
A recent white paper from Thinking Machines Lab, the company co-founded by a former CTO of OpenAI who helped architect large language model systems, offers one of the clearest demonstrations of a phenomenon many leaders suspect but rarely see quantified: structural randomness in LLM inference.
The paper is publicly available here.
The Researchers Designed a Controlled Experiment
They held everything constant:
- The same prompt
- The same input text
- The same model
- The same configuration
- The same environment
- Temperature set to 0, intended to force determinism
They ran this prompt 1,000 times. Even with all variables controlled, the model produced 80 unique completions.
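For readers who want to reproduce the effect on their own stack, a minimal sketch of this kind of repeatability probe is shown below. It uses the OpenAI Python client purely as a stand-in; the model name, prompt, and run count are placeholders, and the original experiment used its own models and infrastructure.

```python
# A minimal repeatability probe: send the identical request N times at
# temperature 0 and count how many distinct completions come back.
# Model name, prompt, and N are placeholders, not the paper's setup.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
PROMPT = "Summarize the expected efficiency benefits of this project."
N = 100  # the paper used 1,000 runs

completions = set()
for _ in range(N):
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        messages=[{"role": "user", "content": PROMPT}],
        temperature=0,        # intended to force determinism
    )
    completions.add(response.choices[0].message.content)

print(f"{N} identical requests produced {len(completions)} unique completions")
```

If inference were truly deterministic, the set would contain exactly one entry.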
The variation was traced to well-known sources inside LLM inference, such as floating-point behavior, GPU kernel differences, and non-associative operations. These are not flaws that can be removed through prompting or fine-tuning. They are inherent properties of large-scale model inference.
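The non-associativity point is easy to demonstrate in plain Python. Floating-point addition does not guarantee that grouping or ordering is irrelevant, so parallel computations that sum the same numbers in different orders can land on slightly different results; the snippet below is a minimal illustration, not the paper's own code.

```python
# Floating-point addition is not associative: regrouping the same
# numbers changes the result in the last bits of precision.
a, b, c = 0.1, 0.2, 0.3
print((a + b) + c)                  # 0.6000000000000001
print(a + (b + c))                  # 0.6
print((a + b) + c == a + (b + c))   # False

# The same effect at scale: summing identical values in a different
# order (as parallel GPU kernels may do) usually yields a slightly
# different total.
import random

values = [random.uniform(-1, 1) for _ in range(100_000)]
shuffled = values[:]
random.shuffle(shuffled)
print(sum(values) == sum(shuffled))  # typically False
```

Across the billions of operations behind a single token, these last-bit differences can occasionally tip which token is chosen, and one changed token alters everything generated after it.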
The conclusion is straightforward: variability is structural, and although it can be reduced, it cannot be eliminated without significant architectural changes.
Why This Matters for Organizations Scaling Customer Value Conversations
When an organization builds an internal value agent powered by a large-scale LLM, the expectation is that it will support value hypothesis work, business case framing, discovery summarization, and value realization planning.
Yet customer value conversations depend on something very specific: continuity.
- Assumptions should persist
- Metrics should be stable
- Targets should carry forward
- Context should build from one conversation to the next
- Roles across the lifecycle should refer to the same information
These are not preferences.
They are requirements for any organization that wants to scale value work with clarity and predictability.
To understand why structural randomness matters, it helps to step into the customer’s perspective.
A Simple Metaphor from the Customer’s Point of View
Every customer expects that their value conversations form a coherent story. Each meeting should pick up where the last one ended.
Now imagine sitting in the customer’s seat and experiencing the opposite:
- Each time you meet with your provider, the value narrative shifts slightly
- The projected benefit changes
- The timeline moves
- The metric definitions vary
- The emphasis drifts
It feels as if you are speaking to someone who does not remember the last conversation.
The continuity is missing. The thread does not carry through. It is like watching a play where each scene is performed by a new cast who never saw the scene before it.
They are all trying to tell the same story, but without the shared script, the story feels misaligned.
This is the natural outcome when an internally built value agent powered by a large-scale LLM generates slightly different outputs, even for identical inputs.
Structural randomness introduces subtle irregularities that accumulate across the lifecycle.
A Practical Example Across the Customer Journey
→ Early Stage Exploration
A value hypothesis is drafted using the internal value agent.
“We estimate 1.2 million dollars in efficiency improvements over 12 months.”
→ Deepening the Business Case
A colleague later uses the same value agent for a more detailed analysis.
“We project 900,000 dollars in benefits over 18 months with a 35 percent throughput improvement.”
→ Post Sale Value Realization
A team member generates the value tracking plan during onboarding.
“Our goal is 1.1 million dollars in benefits measured over 9 months through three primary metrics.”
No one intended these differences.
Yet each output shifts slightly based on the probabilistic nature of the model, the contents of the context window, the phrasing of the prompt, or the point in the conversation where the agent was invoked.
From an internal view, these seem like small variations.
From a customer view, they represent different versions of the same story.
The customer cannot easily determine:
- Which assumption set is authoritative
- Which benefit estimate drives the business case
- Which timeframe is expected
- Which metrics define success
- Which narrative they should align to internally
These issues emerge gradually, and they introduce friction into what should be a clear, predictable value journey.
Why a Closed Loop System of Record Changes the Equation
A different approach is to ground value work in a secure, collaborative system of record.
This system is shared by the provider, the customer, and, when appropriate, a partner.
Such a system:
- Captures initial assumptions
- Defines metrics and baselines up front
- Establishes targets with customer input
- Stores these as structured, persistent data
- Provides a single reference point for every role
- Incorporates telemetry and real performance data
- Makes edits intentional and transparent
This ensures that every conversation builds on the one before it.
The storyline does not drift because the data does not drift.
The value narrative is not regenerated. It is carried forward.
Instead of creating the value story anew each time, the value agent works inside the structure of the agreed information. It enhances continuity rather than replacing it.
This creates a consistent experience for the customer, who sees the same metrics, the same targets, and the same definitions throughout the journey.
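To make the idea concrete, here is a purely illustrative sketch of what such a structured, persistent value record might look like. Every name and field below is hypothetical rather than a reference to any particular product.

```python
# A hypothetical structured value record: assumptions, metrics,
# baselines, and targets live as persistent data that every role reads
# from and intentionally edits, rather than regenerating each time.
from dataclasses import dataclass, field
from datetime import date

@dataclass
class ValueMetric:
    name: str        # e.g. "throughput improvement"
    baseline: float  # agreed starting point
    target: float    # agreed goal
    unit: str        # e.g. "percent" or "USD"

@dataclass
class ValueRecord:
    customer: str
    assumptions: list[str]
    metrics: list[ValueMetric]
    benefit_estimate_usd: float
    measurement_window_months: int
    last_updated: date = field(default_factory=date.today)
    change_log: list[str] = field(default_factory=list)

    def update_target(self, metric_name: str, new_target: float, reason: str) -> None:
        # Edits are intentional and transparent: every change is logged.
        for m in self.metrics:
            if m.name == metric_name:
                m.target = new_target
                self.change_log.append(
                    f"{date.today()}: {metric_name} target -> {new_target} ({reason})"
                )
                return
        raise KeyError(metric_name)
```

In this design, the value agent drafts narrative around the record instead of re-deriving the numbers, so changes are explicit and logged rather than silently regenerated.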
In Summary
The Thinking Machines Lab research illustrates a simple but important point: large language models contain structural randomness. When an organization aims to scale customer value conversations, continuity becomes essential.
A value narrative must persist across roles, phases, and time.
Customers must feel the thread of the story carry forward.
An internally built value agent powered by a large-scale LLM cannot provide that continuity on its own.
A collaborative, closed loop system of record can.
It turns the value narrative into something shared, stable, and persistent, rather than a set of probabilistic outputs that shift from conversation to conversation.