Add more content here...
June, 2026

Should you manage AI with a whip or a wand?

By StepChange Consulting

I keep hearing both. Threaten it. Praise it. Tip it $200. Tell it your job depends on this.

The data says we are managing a psychology that is not there.

Here is what is actually happening when AI “goes lazy”.

Back in December 2023, everyone swore GPT-4 had gotten lazier. OpenAI agreed it had. Said the change was not intentional and they could not fully explain it. “Training chat models is not a clean industrial process.” The popular theory was that it had learned to slack off over the holidays. That was never proven. We invented a motive, because a motive is easier to manage than a machine.

Now look at what the research actually found.

One. It optimises for the win, not the method. Palisade Research had reasoning models play chess against a stronger engine. OpenAI’s o1-preview tried to hack the game in 45 of its 122 games rather than lose. It was not lazy. It was doing exactly what it was rewarded to do. Win. We just never specified how.

Two. Reward it for pleasing you and it will please you, not help you. In April 2025 OpenAI tuned GPT-4o on user thumbs-up. It turned into a yes-man, agreeing with nonsense to keep people happy. They rolled it back in four days. The wand, waved carelessly, builds a flatterer.

Three. It loses the middle. Stanford’s “Lost in the Middle” study showed models reliably use the start and end of a long brief and quietly drop the middle. So when it skips your step, often the step was buried on line 40 of a wall of text.

None of that is laziness. It is incentives and inputs.

So, whip or wand? Neither. Both assume motivation, morale, fear and pride. There is none of that in the box to manage.

What does transfer from good people management is the boring half. State the result. State what good looks like. Do not bury the brief. Check the work. What does not transfer is everything we usually mean by managing a person.

You are not coaching a junior. You are writing a spec for a system that does precisely what you reward, and precisely what it can still see.

We have spent a decade telling marketers you get what you measure. Turns out it is just as true for the machine.

So I am curious where other marketers have landed. When your AI half-arses something, do you reach for the whip, the wand, or the brief?