Designing for a system that can't explain itself
Jun 13, 2025

In aviation, there is a concept called the glass cockpit problem. When analogue instruments were replaced by digital displays in the 1980s and 1990s, the screens gave pilots more information than they had ever had access to. More data, faster updates, cleaner presentation. The information environment improved in every measurable way.
But something unexpected happened. Certain categories of error increased. Pilots who had developed deep intuitive understanding of analogue instruments, who could read the subtle behavioural signals of a gauge before the number changed, were now reading summaries produced by a system. The system was accurate. But it had introduced a layer between the pilot and the raw state of the aircraft. And in edge cases, when the system itself was confused or failing, the pilots were slower to recognise it than they had been before the displays arrived.
The instruments had become too good at presenting confidence. Even when confidence was not warranted.
I think about that problem often now. Because it is exactly what product teams are being asked to solve with AI-native features, and almost nobody in the conversation has the language for it yet.
The design challenge that has emerged quietly over the past year is not how to make AI features usable in the conventional sense. Usability, in the legacy meaning of the word, is mostly solved. Users understand buttons and flows and feedback states. The patterns exist. Teams know how to apply them.
The new problem is something different: how do you design for a system whose internal state is genuinely unknowable, even to the people who built it? How do you communicate confidence levels to a user when the system's confidence does not map cleanly to the kind of certainty a human would express? How do you design for error in a system where the errors are not predictable, reproducible, or always recognisable as errors?
These are not usability questions. They are epistemic design questions. The field has useful instincts for the first kind. But it does not yet have a settled vocabulary for the second.
The failure mode I have seen most consistently in AI feature implementations is what I have started calling the false floor. The system presents its output in the visual language of certainty: clean typography, structured layout, confident prose. Nothing in the presentation signals that the system might be wrong, incomplete, or operating at the edge of its actual capability.
The user trusts the output at face value. The output is wrong, or partially wrong, in a way that is not self-evident. The user acts on it. Something downstream breaks.
But the design team did not build a dishonest interface. They built the same interface they would have built for any feature. The problem is that the same visual language that communicates confidence accurately for a deterministic system communicates it inaccurately for a probabilistic one.
The cockpit was not lying. But it was presenting uncertainty in the language of certainty, and the difference killed people.
What does it look like to design honestly for a system that cannot fully explain itself?
Some of the answers are tactical. Surfacing confidence signals. Designing explicit correction pathways. Making the provenance of AI-generated output visible rather than presenting it as a final answer. These are real improvements and they matter.
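One way to make those tactics concrete is at the data-model level: instead of handing the UI a bare string, hand it an envelope that carries the signals an honest presentation needs. The sketch below is illustrative, not a real API; the interface names, the confidence field, and the thresholds are all assumptions for the sake of the example.

```typescript
// Hypothetical envelope for AI-generated output. The point is that confidence
// and provenance travel with the text, so the UI can vary its presentation
// rather than rendering everything in the visual language of certainty.
interface AIOutput {
  text: string;
  confidence: number;   // model-reported score in [0, 1]; assumed to be available
  sources: string[];    // provenance: where the answer was drawn from
}

type Treatment = "present" | "present-with-caveat" | "ask-user-to-verify";

// Map an output to a presentation treatment. The thresholds here are
// placeholders, not calibrated values; real ones would come from evaluation.
function chooseTreatment(output: AIOutput): Treatment {
  if (output.sources.length === 0) return "ask-user-to-verify";
  if (output.confidence >= 0.9) return "present";
  if (output.confidence >= 0.6) return "present-with-caveat";
  return "ask-user-to-verify";
}
```

The design choice worth noticing is that "ask-user-to-verify" is a first-class outcome, not an error state: the interface plans for the case where the honest answer is "check this yourself".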
But the deeper answer is a shift in how designers think about their own role in the relationship between user and system.
For most of the history of product design, the designer's job was to reduce friction. To make the path from user intent to user outcome as short and smooth as possible. That instinct produced enormous value. It produced the clean, fast, intuitive products that raised the baseline and got us to the glass cockpit problem.
But AI systems sometimes require friction. Not because friction is good. But because the absence of friction can produce unwarranted trust in a system that has not earned it. The design of appropriate resistance, the moment that asks the user to confirm, to verify, to apply their own judgment before acting on a system output, is a design decision as important as any flow optimisation.
The Boeing engineers who saw the glass cockpit problem eventually introduced features that were deliberately harder to dismiss. Alerts that required active acknowledgment. Displays that showed instrument disagreement rather than resolving it into a single confident reading. Friction as a trust mechanism.
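The "show disagreement rather than resolve it" pattern has a direct analogue for AI features: if you sample a model several times and the samples disagree, surface the disagreement instead of silently picking one reading. A minimal sketch, with an assumed majority-vote heuristic (the function name and threshold are mine, not from any particular system):

```typescript
// Given several independent samples of the same answer, only present a single
// confident reading when a clear majority agrees. Otherwise report that the
// "instruments" disagree and let the UI show that state to the user.
function summarize(samples: string[]): { answer: string | null; agreed: boolean } {
  const counts = new Map<string, number>();
  for (const s of samples) {
    counts.set(s, (counts.get(s) ?? 0) + 1);
  }
  let best: string | null = null;
  let bestCount = 0;
  for (const [answer, count] of counts) {
    if (count > bestCount) {
      best = answer;
      bestCount = count;
    }
  }
  // Simple-majority threshold; a real system would tune this.
  const agreed = bestCount > samples.length / 2;
  return { answer: agreed ? best : null, agreed };
}
```

Rendering `agreed: false` as a visible "sources disagree" state is the software equivalent of the cockpit display that shows two instruments in conflict instead of averaging them into one confident number.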
Product designers building AI features are being asked to solve a problem that the field's existing language was not built for. The old vocabulary (usability, affordance, feedback, findability) assumes a system that behaves deterministically in response to user input. AI features do not.
The new vocabulary is still forming. But the question at the centre of it is not "how do we make this easier to use?" It is "how do we help the user know when to trust this, and when to check?"
That is a harder design brief. But it is the right one.


