Human-in-the-loop is not a Magic Spell

We’re living in the midst of a product design revolution fueled by widespread excitement about new possibilities unlocked by LLMs.

Here in SF, it’s hard to find a sidewalk outside of earshot of a conversation about agents.

If you live somewhere else, I suspect that when people talk about the problem with LLMs (hallucinations), they’re talking about ChatGPT “lying” to them about some fact, like the number of r’s in the word ‘strawberry’.

Living here among the folks making the future, who believe lying can be solved by stuffing true information into the LLM’s context, we’ve moved on to the next problem with LLMs (hallucinations), but now we’re talking about an agent “believing” it should take an action that would be undesirable from the human user’s perspective.

Agent builders anticipate their programs doing things you don’t want them to do, on your behalf, and at high speed.

They don’t know, like, a ton about what makes people tick (there’s only so much you can really gather while socializing in digital environments), but they seem to selfishly understand that building a business around a product that is ~a computer doing unwanted things fast is, well… hmm, maybe we should pivot back to crypto…

We’ve been dealing with the problem of computers doing unwanted things fast since the days of designing automated systems to detect and respond to intercontinental ballistic missiles, where the design prompt was basically “hey if they hurl nuclear weapons at us, hurl back more nuclear weapons …and make that uninterruptible”. We wanted to turn our enemies’ “bomb them” buttons into “bomb them and also us” buttons hoping the second part would make them never push the button.

Big picture-wise, that design worked, like we aren’t blowed up.¹ But military leaders were worried the system might fail in a specific shape: what if it detects a missile that doesn’t exist?.. and then fires a bunch of missiles??.. everyone goes big boom, would be bad???..

Weapons system designers compromised on the automated-ness of the system to reduce the likelihood of a computer error leading to apocalypse by introducing human-in-the-loop. Idea is: automatically detect incoming missiles (as before) then instead of automatically launching missiles back, alert a human sitting in front of a computer console, then the human can trigger the launch of missiles based on… ~~“training”~~ “extensive training” …and, hmm… since humans aren’t, like, “base the hopes of a future for everyone on earth on”-reliable, let’s add another one who must agree.²

Getting back to software: while missile launch officers are 100 feet underground in an information-sparse environment interpreting signals from computer systems,³ us civilian computer users are blessed with a cornucopia of pixels laid before our eyes by product designers who can give us anything in the whole world. And best of all, thanks to decades of blundering baby boomers (e.g. showing up to Zoom court appearances with kitty cat avatars they just cannot disable), the stakes of making mistakes on the internet are lower than making mistakes in a missile silo.

(Genuinely sorry to point out this pattern in the interface of Raycast, a totally fine piece of software generally, and let me reiterate: this essay is interesting to write and hopefully read because this pattern is rampant.)

Which is all to say, product designers: take care to include the information humans need to evaluate decisions, especially if you’re going to block progress by looping them in.

After speaking to hundreds of folks demanding more from their scheduling software, I can confidently tell you: the information in the not-at-all-atypical create event confirmation design above (title, start time, duration, attendees’ emails, description) is not sufficient in a vacuum to be actionable. This is the variety of information that can be validated using code; you don’t need a human to review this stuff. And worse still (the point of this essay!), shoving this information in front of a human without giving that human contextual information to make the decision is honestly a crime against systems design.

So what’s missing? Events exist in the context of a calendar so… you need to show the calendar! Say RIP to your minimal design,⁴ stick a calendar view in there, stick a badge on the attendees that tells me if we’ve met before and/or if they’re a member of my workplace, provide the information you’d appreciate having at decision time.
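To make the split concrete, here’s a rough sketch in Python of the two jobs: the stuff code can check by itself, and the context a person would want on screen before saying yes. Everything named here (`Event`, `validate`, `context_for_human`, `known_contacts`, `work_domain`) is made up for illustration and isn’t from Raycast or any real calendar API; a real product would pull this from wherever its calendar and contacts actually live.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Event:
    # Hypothetical event shape, mirroring the fields in a typical confirmation dialog.
    title: str
    start: datetime
    duration: timedelta
    attendees: list[str] = field(default_factory=list)  # email addresses

    @property
    def end(self) -> datetime:
        return self.start + self.duration


def validate(event: Event) -> list[str]:
    """Checks code can do on its own; no human needed for these."""
    problems = []
    if not event.title.strip():
        problems.append("missing title")
    if event.duration <= timedelta(0):
        problems.append("duration must be positive")
    for email in event.attendees:
        if "@" not in email:  # deliberately crude; an assumption, not real email validation
            problems.append(f"bad email: {email}")
    return problems


def context_for_human(proposed: Event, calendar: list[Event],
                      known_contacts: set[str], work_domain: str) -> dict:
    """Gather what a person would want to see at decision time."""
    # Two events overlap when each one starts before the other ends.
    conflicts = [e.title for e in calendar
                 if e.start < proposed.end and proposed.start < e.end]
    badges = {}
    for email in proposed.attendees:
        tags = []
        if email in known_contacts:
            tags.append("met before")
        if email.endswith("@" + work_domain):
            tags.append("coworker")
        badges[email] = tags or ["new to you"]
    return {"conflicts": conflicts, "badges": badges}


# Example: a proposed call that lands in the middle of an existing appointment.
calendar = [Event("Dentist", datetime(2025, 3, 4, 14, 0), timedelta(hours=1))]
proposed = Event("Intro call", datetime(2025, 3, 4, 14, 30), timedelta(minutes=30),
                 ["sam@example.com", "pat@acme.test"])

print(validate(proposed))
print(context_for_human(proposed, calendar, {"sam@example.com"}, "acme.test"))
```

Note that `validate` comes back clean for this event, which is the point: every field is well-formed, and the thing a human would actually care about (it collides with the dentist) only shows up once you look at the calendar.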

If you don’t contextualize the information you’re asking the human-in-the-loop to confirm, you don’t actually have a human-in-the-loop.

Which like, hey good news!, maybe you don’t need a human to confirm everything all the time. This is certainly the case for the dozens of confirmation popups Apple serves me each day in the shape of “is it OK if the tool you’re using to do stuff can do stuff?” …I’m like gee I don’t f—ing know man can you make my computer safe so every tiny thing doesn’t rise to the level of me making a decision???

If you hope humans will use your product, save everyone involved a ton of headache by designing for human users from the beginning.

Footnotes

¹ but if we are blowed up and this is hell, i mean… definitely the people in hell would vote like a (slight) majority of Americans did in 2024

² I don’t think military leaders are the greatest systems designers in history and I think in fact we are not all blowed up not because their systems have succeeded but because since the advent of the nuclear age there’s been cheaper profit available to societies via commerce versus conflict …and many people seem to have lost the plot on that idea so… tbd big boom wen :(

³ which, tbh, sorta makes the whole exercise a giant sham like OK I guess humans-in-the-loop is totally different from an automated system because… we want it to be real real bad!

⁴ oops was minimal design your big selling point, ouch, well I believe you can overcome this and make something better