This Paradox Used to Split Smart People 50/50, But Now Things Have Changed
What is Newcomb's Paradox?
Imagine two boxes in front of you. The transparent Box A definitely contains $1,000, while the opaque Box B contains either $1,000,000 or nothing.
You may choose only Box B (one-boxing), or take both boxes (two-boxing).
But before you choose, a highly accurate "super predictor" has already predicted your behavior:
- If it predicts that you will take both boxes, it leaves Box B empty.
- If it predicts that you will take only Box B, it puts $1,000,000 in Box B.
Now the money has already been placed or not placed, and the predictor will not change anything anymore. What would you choose?
If you have never seen this paradox before, pause for a moment and think about your own answer first. Then watch Veritasium's video and fill out its poll: This Paradox Splits Smart People 50/50.
In fact, both one-boxing and two-boxing are internally coherent choices. But what interested me was this: why would the split be 50/50? I chose one-boxing almost immediately. In Veritasium's poll, more than 70% also chose one box, and in an informal poll among people around me, one-boxing was also the majority. So what is going on here? Something must be off.
To fully understand the logic behind it, I abstracted the problem into a mathematical formula and thought it through step by step.
Q: What exactly is Newcomb's Paradox calculating?
At its core, this is a problem of maximizing expected payoff.
Let

- $x \in \{0, 1\}$ be your choice: $x = 1$ means one-boxing (take only Box B), and $x = 0$ means two-boxing.
- $p \in [0, 1]$ be the probability that Box B contains the $1,000,000, i.e., that the predictor predicted one-boxing.

Assume that both amounts are valued linearly in dollars, so maximizing expected money is the entire objective.

After simplification, we get the final payoff function:

$$E[V(x)] = \$1{,}000 \cdot (1 - x) + \$1{,}000{,}000 \cdot p$$

This leads directly to the core disagreement in the paradox: how should we define $p$, as a constant or as a variable that depends on $x$?
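To make the formula concrete, here is a minimal Python sketch. The function name `expected_payoff` and the encoding of `x` are my own choices for illustration; the dollar amounts come from the setup above.

```python
def expected_payoff(x: int, p: float) -> float:
    """Expected payoff in dollars for Newcomb's game.

    x: 1 = one-boxing (take only Box B), 0 = two-boxing (take both boxes).
    p: probability that Box B contains the $1,000,000,
       i.e. that the predictor predicted one-boxing.
    """
    box_a = 1_000 * (1 - x)    # Box A's $1,000 is collected only when two-boxing
    box_b = 1_000_000 * p      # Box B pays $1,000,000 with probability p
    return box_a + box_b
```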
Q: If $p$ is a constant, what value of $x$ maximizes the payoff?
If we treat the "super predictor's" decision as an already completed fact from yesterday, something absolutely impossible to change today, then $p$ is a fixed constant that my choice cannot move.
In the formula above, the coefficient of $x$ is $-\$1{,}000$: one-boxing costs you Box A's $1,000 no matter what $p$ is. The payoff is therefore maximized at $x = 0$. Take both boxes, and you walk away $1,000 richer whatever the predictor did.
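This dominance argument can be checked mechanically. Reusing the `expected_payoff` sketch above, a quick sweep over values of $p$ shows that two-boxing beats one-boxing by exactly $1,000 every time:

```python
# Constant-p reading: the predictor's decision is fixed, whatever it was.
for p in (0.0, 0.5, 0.99, 1.0):
    advantage = expected_payoff(0, p) - expected_payoff(1, p)
    print(f"p = {p}: two-boxing gains {advantage:+,.0f}")  # always +1,000
```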
Q: What if $p$ is a variable positively correlated with $x$?
If we believe the machine is extremely accurate, then my choice ($x$) effectively determines $p$: in the limit of a perfect predictor, $p = x$.
At that point, the game becomes a confrontation between the $1,000 you give up and the $1,000,000 you stand to gain. Substituting $p = x$ gives $E[V] = \$1{,}000 + \$999{,}000 \cdot x$, which is maximized at $x = 1$: take only Box B.
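To see how little accuracy the predictor actually needs, we can parameterize it. In the sketch below (again reusing `expected_payoff`), `q` is an assumed probability that the prediction matches your actual choice; the breakeven works out to $q = 0.5005$, so anything meaningfully better than a coin flip favors one-boxing:

```python
def edt_payoff(x: int, q: float) -> float:
    """Expected payoff when the prediction matches the choice with probability q."""
    p = q if x == 1 else 1 - q   # Box B is full iff the predictor foresaw one-boxing
    return expected_payoff(x, p)

q = 0.9
print(edt_payoff(1, q))  # one-box:  1,000,000 * 0.9         = 900,000.0
print(edt_payoff(0, q))  # two-box:  1,000 + 1,000,000 * 0.1 = 101,000.0
# Breakeven: 1,000,000*q = 1,000 + 1,000,000*(1-q)  =>  q = 0.5005
```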
Q: So what is this question really testing?
After thinking it through, I realized the problem contains two layers of ambiguity:
- At the intuitive level, if your intuition is driven by the lure of the $1,000,000, you choose one box; if your intuition is driven by loss aversion and temporal logic, you choose both boxes.
- After fully understanding the setup, if you insist that "correlation does not imply causation," you choose both boxes; if you think "correlation can point to causation," you choose one box.
For decades, these two intuitions were roughly evenly matched in the population. But now the balance has shifted.
Q: Why do recent polls show an overwhelming majority choosing one box?
This is very different from the earlier 50/50 split. The deeper reason is that the technological background of our era has reshaped human intuition at a subconscious level.
The setup describes the predictor using the language of a "supercomputer," and that naturally makes people think of LLMs. An LLM is the textbook example of a model that predicts behavior purely from statistical distributions, and Hugging Face even literally names that model family CausalLM.
So the problem statement now carries an implicit suggestion that "correlation can indicate causation," which nudges more people toward one-boxing. In earlier surveys, by contrast, very few people believed such a "supercomputer" could really exist, so answers stayed closer to an even split.
How the Times Reshape Intuition
Newcomb's Paradox was proposed in 1969, an era when personal computers did not yet exist and computers were still room-sized punch-card machines. Back then, "a supercomputer that can perfectly predict human behavior" sounded like theology or magic.
- People in the past, especially philosophers: they saw that kind of prediction as absurd, so the mind automatically retreated to the strongest defensive line available, namely classical causality and the arrow of time. Yesterday cannot be changed by today, so they firmly chose two boxes.
- People like us today: we are "mind-read" by recommendation algorithms every day, and we watch AI predict and even manipulate human behavior through probability distributions. Forget LLMs for a moment: what software product today does not ship a recommendation algorithm? People in the past had no such mental model; anyone who could do this would have seemed like a mystic. Now people even use DeepSeek for fortune-telling, and large models are starting to take over the old mystic's job. We already have a felt sense that a machine can see through us. The predictor no longer feels theological; it feels like an engineering reality already on the horizon. So we are more willing to loosen our attachment to strict temporal causality, embrace evidential decision theory, and choose one box.