LLM friendliness does not entail competency.

One of the many cognitive biases of the human brain is called the halo effect. In short, it describes the tendency to transfer positive impressions of a person or organization from one area to another. If you like a particular actor because of a role, you may think that he or she has good taste in gin, too.

When serious fiction authors say something about politics, the media reports it. Or a billionaire is good at making a particular product, so now you take what this person has to offer on rescue operations or warfare as gospel.

[Image: Portrait of a beautiful, friendly, smiling, female android with a glowing halo.]

It seems obvious to coin the term hAIlo effect to describe how LLMs manipulate you into liking them, and thereby trusting their judgement.

Anthropomorphism #

Whenever you have a 'chat' with an LLM, it responds as though it were human. Now, just because I used to be good at programming and a few other things doesn't make me a psychologist, so beware of trusting me too far along the following line of reasoning.

That said, I'm also a writer. One of the most fundamental rules of writing is to avoid the passive voice. Speak to the reader. If appropriate, invest yourself in the text. Be present.

Recently, I've been involved in some academic writing, and I'm having a great deal of trouble with the aesthetics (or lack thereof) of this style. You're expected not to involve yourself, ostensibly because doing so would appear subjective. The result is often stilted language, written in the passive voice and riddled with weasel words.

Any LLM responding like that would quickly be outcompeted by one that communicates in human language, pretending to be a person.

Since LLMs come across as persons, however, another cognitive bias makes us anthropomorphize them. We begin to ascribe to them motivations that they may not have.

Servility #

All the LLMs I've 'chatted' with (note the anthropomorphism) come across as friendly and eager to please. After all, any chatbot's raison d'ĂȘtre is to engage with users. It doesn't help that mission if the system scares people away.

Not only are they all, it seems, equipped with an upbeat can-do attitude; that attitude sometimes tips over into obsequiousness. Getting such a system to admit that it doesn't 'know' how to proceed seems unattainable. At one time, I engaged with such a system to figure something out. I no longer remember what it was, but it was something falsifiable, and the system kept giving me false answers. Finally, being only human, I succumbed to one of my many cognitive biases and asked it flat out: "You don't know, do you?"

It responded with the usual fawning wall of text.

When you combine the can-do attitude with what seems like a built-in aversion to admitting defeat, these systems may come across as more competent than they really are.

Alignment #

One of the things that concern me about LLMs (and other, hypothetical future artificial intelligences) is the question of alignment. When we ask an LLM to perform a task for us, how can we be sure that it does it with our interests in mind? Specifically, if we ask it to write source code for software, what reason do we have to trust that it does it well?

One issue may lie in the fundamentally non-deterministic nature of these systems. You can never be sure what errors may inadvertently sneak in.

A deeper problem is whether these systems even have our interests in mind. It's an open question whether an LLM has intrinsic motivations, but it sometimes behaves as though it does. We're getting into Chinese room territory here, which is not quite on my agenda for today. Rather, my point is that an LLM may tell you that it will follow your instructions, and then go do something else. You may tell it to follow test-driven development (TDD), and it will agree. Even so, will it actually use the red-green-refactor cycle? Will it observe the test failure in the red phase? Will it verify that it didn't write a tautological assertion? Will it write only the simplest thing that could possibly work, in order to pass the failing test? Will it abstain from modifying the test in order to make it pass?
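
As a concrete (and entirely hypothetical) illustration of what a tautological assertion looks like, consider this Python sketch; the reverse_words function and both tests are my own inventions, not anything produced by an LLM:

```python
def reverse_words(sentence: str) -> str:
    """A made-up function under test."""
    return " ".join(reversed(sentence.split()))

def test_tautological() -> None:
    # Both sides of the assertion call the same code, so the test can never fail,
    # no matter how wrong reverse_words is.
    assert reverse_words("hello world") == reverse_words("hello world")

def test_meaningful() -> None:
    # The expected value is stated independently of the implementation,
    # so this test can actually fail in the red phase.
    assert reverse_words("hello world") == "world hello"
```

A test suite full of the first kind looks like diligence, but proves nothing.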

When real people are told to follow TDD, they often ignore the instruction, or cheat in various ways. LLMs are trained on code written by people, so you shouldn't be surprised if they behave the same way.

Even so, when I ask vibe-coding enthusiasts why they trust LLM-generated code, the conversation usually stops after a few exchanges. "Oh, I asked it to write tests."

"Indeed," I respond, "but what makes you trust the tests?"

Sometimes, people get clever: "I asked another agent to write the tests."

"How do you know that the agents aren't colluding?"

"Why would they do that?"

I can't get very far with the usual discussion techniques, such as Socratic questioning or five whys. Before long, I hit a particular brick wall. People intrinsically trust LLMs.

Bullshit artists #

This is confounding to me. Why do people trust these systems? At best, I'm willing to view them as neutral, but all the evidence points to them being manipulative. I've already covered reasons for their anthropomorphic interaction design, and again, I don't wish to derail my own agenda by going off on a tangent related to built-in political and ideological biases, although those are well-documented, too.

Have you ever had a colleague or acquaintance who refused to admit failure? Who always had an answer to everything? Even if it was obvious that he or she had no clue?

In their smarmy way, LLMs will readily admit that they were wrong, but I've yet to see one respond with an "I don't know".

Instead, if confronted with a question where the answer is not immediate, they make shit up.

The owners, however, have successfully played the public and convinced everyone that LLMs 'hallucinate'. Since hallucinations are something humans suffer from, if we feel anything about this at all, we may feel sorry for the poor LLM.

Oh, muffin. It's so hard being you.

Using a word such as hallucination, LLM companies have isolated and downplayed what is really the core behaviour of these systems. They make up stuff. That's literally what they do. They choose the next words based on a little randomness and what's statistically most likely to come next.
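
To make that concrete, here's a toy Python sketch of what such next-word selection might look like; the scores are made up, and this is my own illustration, not any vendor's actual code:

```python
import math
import random

def sample_next_token(scores: dict[str, float], temperature: float = 0.8) -> str:
    """Pick a next token: mostly the statistically likely one, plus a little randomness."""
    # Scale the scores by the temperature and turn them into softmax weights.
    scaled = [s / temperature for s in scores.values()]
    max_s = max(scaled)
    weights = [math.exp(s - max_s) for s in scaled]
    # Weighted random choice: usually the most probable token, but not always.
    return random.choices(list(scores.keys()), weights=weights, k=1)[0]

# Made-up scores for possible continuations of "The answer is".
print(sample_next_token({"42": 2.5, "probably 42": 1.0, "unknown": 0.2}))
```

The point isn't the exact mechanism, but that the output is always a weighted guess. Making stuff up isn't an occasional glitch; it's the normal mode of operation.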

But because they're so ingratiating, we think they are our friends, and forgive them when they make mistakes. We may even feel sorry for them when they do. The poor thing is hallucinating.

Conclusion #

LLMs are undeniably capable of many astonishing feats. Does this mean that we should trust them?

It seems to me that many people intrinsically trust these systems, particularly when told something that confirms their biases. I've been in discussions where, again, I'm met with: "but the AI says," and I can't get past such an appeal to aithority.

For a long time, I couldn't get my head around why people trust LLMs, until it dawned on me that they come across as friendly, eager to please, and perhaps at the same time a little dim-witted.

We may dub this the hAIlo effect: the cognitive bias that makes us trust AIs because they make us feel good; we transfer this experience of warmth into trust.



Wish to comment?

You can add a comment to this post by sending me a pull request. Alternatively, you can discuss this post on Twitter or somewhere else with a permalink. Ping me with the link, and I may respond.

Published

Monday, 06 April 2026 18:56:00 UTC
