Backward Induction

Published by Mario Oettler

Backward induction is a game-theoretic concept that can be applied to sequential or repeated games. The aim is to find strategies that are subgame perfect. This concept and the related terms are best explained with an example.

Suppose we have two players. They play a game where player A first puts a coin on the table, and then B puts a coin on the table. B can see what side of A’s coin is up. The pay-off matrix shows the pay-offs for all possible combinations.

If both coins show Head, player A wins 2 and B receives nothing. If A’s coin shows Head and B’s coin shows Tail, A wins nothing and B receives 1.

If A’s coin shows Tail and B’s coin shows Head, A wins 1 and B wins 1. If A’s coin shows Tail and B’s coin shows Tail, A wins 1, and B receives 0.

In a single-shot simultaneous game, there is no Nash equilibrium in pure strategies. But player A would favor [Head, Head] since it promises him the highest reward. To see whether this is plausible, we can display the game in a different form than the pay-off matrix: the extensive form.
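As a quick check of the first claim, the four pure-strategy profiles of the simultaneous game can be enumerated and tested for profitable unilateral deviations. A minimal Python sketch (pay-offs as described above; function and variable names are only illustrative):

```python
# Simultaneous one-shot version of the coin game. Pay-offs follow the text:
# (A's coin, B's coin) -> (pay-off A, pay-off B).
payoffs = {
    ("Head", "Head"): (2, 0),
    ("Head", "Tail"): (0, 1),
    ("Tail", "Head"): (1, 1),
    ("Tail", "Tail"): (1, 0),
}
sides = ("Head", "Tail")

def is_nash(a, b):
    """A profile is a Nash equilibrium if no player gains by deviating alone."""
    pay_a, pay_b = payoffs[(a, b)]
    a_gains = any(payoffs[(other, b)][0] > pay_a for other in sides)
    b_gains = any(payoffs[(a, other)][1] > pay_b for other in sides)
    return not a_gains and not b_gains

print([(a, b) for a in sides for b in sides if is_nash(a, b)])  # [] -> no pure-strategy equilibrium
```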

This game tree shows the pay-offs for players A and B according to their strategies. The game tree contains two kinds of subgames: the whole game, which starts at player A’s decision node, and the subgames that start at each of player B’s decision nodes. A subgame starts at a decision node and contains everything that follows it, down to the terminal pay-offs.

With this game tree, we can work backward, starting from player B’s decisions.

In the upper branch, the best strategy for player B is to choose “Tail” since it yields a pay-off of 1 compared to 0 when playing “Head”.

In the lower branch, the best strategy for player B is to choose “Head” since it yields a pay-off of 1 compared to 0 when playing “Tail”.

The following figure displays the expected pay-offs for A.

You can see that the branch “Head” promises A a pay-off of 0 and the branch “Tail” a pay-off of 1. Hence, a rational player A chooses Tail.
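The same reasoning can be written down as a small backward-induction sketch in Python (pay-offs as above; variable names are only illustrative). B’s best response is computed for each of A’s moves first, then A chooses while anticipating that response:

```python
# Sequential coin game solved by backward induction. Pay-offs follow the text:
# (A's coin, B's coin) -> (pay-off A, pay-off B).
payoffs = {
    ("Head", "Head"): (2, 0),
    ("Head", "Tail"): (0, 1),
    ("Tail", "Head"): (1, 1),
    ("Tail", "Tail"): (1, 0),
}
sides = ("Head", "Tail")

# Step 1 (go backward): B's best response to each move A could make.
best_b = {a: max(sides, key=lambda b: payoffs[(a, b)][1]) for a in sides}

# Step 2: A chooses the move that is best given B's anticipated response.
best_a = max(sides, key=lambda a: payoffs[(a, best_b[a])][0])

print(best_b)                              # {'Head': 'Tail', 'Tail': 'Head'}
print(best_a, best_b[best_a])              # Tail Head
print(payoffs[(best_a, best_b[best_a])])   # (1, 1)
```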

Backward induction can be used to evaluate threats and promises. To demonstrate this, we use another example.

Imagine we have two companies, A and B. Company A is already present in a city; company B wants to enter this market. The existing company A threatens the intruder B with lowering its prices and waging a price war if B really enters the market. In this case, both suffer a loss of -10. If the existing company keeps the price stable and the intruder enters the market, both make a profit of 40 each. If the intruder retreats, the existing company makes a profit of 100.

Let’s consider the pay-off matrix again.

The question is: is the existing company’s threat of lowering prices credible? Let’s have a look at the decision tree.

In case of a market entry, the existing company A would choose to keep prices constant because this promises a profit of 40 instead of a loss of -10. This yields a pay-off of 40 to the intruder. Compared to the pay-off of 0 in case of a retreat, this is favorable. As a consequence, the intruder enters the market, and the existing company keeps its prices constant. Hence, the threat is not credible.
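A minimal sketch of this argument, assuming the pay-off pairs from the text (written as incumbent A first, intruder B second); the dictionary keys are only illustrative labels:

```python
# Market-entry game: the intruder B moves first, the incumbent A reacts only if B enters.
# Terminal pay-offs follow the text, written as (incumbent A, intruder B).
after_entry = {"price war": (-10, -10), "keep prices": (40, 40)}
retreat = (100, 0)

# Step 1 (go backward): A's best reaction once B has entered.
a_reaction = max(after_entry, key=lambda action: after_entry[action][0])

# Step 2: B enters only if that anticipated reaction beats retreating.
b_choice = "enter" if after_entry[a_reaction][1] > retreat[1] else "retreat"

print(a_reaction, "/", b_choice)   # keep prices / enter -> the price-war threat is empty
```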

Of course, this example is not perfect, and it is likely that real companies follow a different rationale. This game is only a one-shot game; the players cannot change their decisions in the future. In reality, the existing company might have more money and could wage the price war until the intruder goes bankrupt and leaves the market forever, yielding high profits in the future that offset the losses during the price war.

Subgame Perfectness

Strategies in a sequential (or repeated) game are subgame perfect if they prescribe a rational decision in every subgame. This doesn’t mean that the overall outcome is optimal.

Suppose we have the following decision tree:

We again work backward through two stages: first through the subgames of player B, then through the decision of player A.

If A plays “up”, B’s rational decision would be to play “up” too, because it promises B a pay-off of 6 instead of 5. This would leave A with a pay-off of 1.

If A plays “down”, B’s rational decision would be to play “up”, because it promises B a pay-off of 9 instead of 8 when playing “down”. This would leave A with a pay-off of 0.

So, rationally, A can choose between playing “up” with a pay-off of 1 and “down” with a pay-off of 0. If A chooses “up” because it brings him the higher pay-off, the resulting strategies are subgame perfect.
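The same logic can be expressed as a small, generic backward-induction solver. B’s pay-offs and A’s pay-offs on the branches that are actually reached come from the example above; A’s pay-offs on the two branches B never chooses are not given in the text, so the zeros below are placeholders only (they do not influence the result):

```python
def solve(node):
    """Backward induction on a small game tree.
    A node is either a terminal pay-off tuple (pay-off A, pay-off B)
    or a pair (player, {move: subtree})."""
    if isinstance(node, tuple):
        return node[0], node[1], []        # terminal node: nothing left to decide
    player, moves = node
    idx = 0 if player == "A" else 1        # which pay-off the moving player maximises
    best_move, best = None, None
    for move, subtree in moves.items():
        value = solve(subtree)
        if best is None or value[idx] > best[idx]:
            best_move, best = move, value
    return best[0], best[1], [(player, best_move)] + best[2]

tree = ("A", {
    "up":   ("B", {"up": (1, 6), "down": (0, 5)}),   # A's pay-off after up/down is a placeholder
    "down": ("B", {"up": (0, 9), "down": (0, 8)}),   # A's pay-off after down/down is a placeholder
})

print(solve(tree))   # (1, 6, [('A', 'up'), ('B', 'up')]) -> A plays "up", B answers "up"
```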

A Pirate Example

Suppose a group of pirates wants to divide their loot of 100 coins. The wildest pirate suggests who gets what share. If 50% or more of the pirates vote for the suggestion, it is accepted. If less than 50% of the pirates vote for the suggestion, the suggesting pirate is thrown overboard, and the next-wildest remaining pirate has to make a suggestion.

We have a few assumptions:

  • Pirates act rationally.
  • Nobody wants to go overboard.
  • There is a strict order of how wild the pirates are; no two pirates are equally wild.
  • The suggesting pirate can vote too.
  • Pirates that went overboard cannot vote and receive no share of the loot.
  • Pirate N is wildest, pirate 1 is the least wild.
  • Coins cannot be divided.

Let’s consider a few cases:

Two pirates

P2 suggests: P2 receives 100 coins, P1 receives 0.

P2 votes for the suggestion, P1 votes against it. 50% of the votes are in favor, hence the suggestion is accepted.

Three pirates

If the suggestion is rejected, we will continue with two pirates. Then, P2 would receive 100 coins, P1 0 coins.

Hence, P1 accepts every suggestion that promises him more than 0 coins.

P3 suggests: P3 receives 99 coins, P2 receives 0 coins, P1 receives 1 coin.

66.67% vote for the suggestion.

Four pirates

If the suggestion is rejected, we continue with three pirates, and P2 receives 0 coins. Hence, P4 has to bribe P2.

P4 suggests that P4 receives 99 coins, P3 receives 0 coins, P2 receives 1 coin, P1 receives 0 coins.

50% vote for the suggestion.

Five pirates

If the suggestion is rejected, we continue with four pirates, and pirates P3 and P1 receive 0 coins. Therefore, P5 must bribe them. P5 suggests: P5 receives 98 coins, P4 receives 0 coins, P3 receives 1 coin, P2 receives 0 coins, P1 receives 1 coin.

Three pirates out of five (60%) vote for the suggestion.

Six to 200 pirates

This pattern can continue up to 200 pirates: each proposer bribes, with 1 coin each, the pirates who would otherwise receive nothing, and keeps the rest. Beyond 200 pirates, it becomes tricky.
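A short Python sketch of this recursion; the greedy rule “offer 1 coin to whoever would otherwise get nothing” and the 50%-or-more threshold follow the text, while the function name and structure are only illustrative. It reproduces the allocations above and works as long as 100 coins are enough to buy the required votes; it ignores the self-preservation votes that matter in the very large cases:

```python
import math

def allocation(n, coins=100):
    """Backward-induction allocation for n pirates; pirate n is the proposer.
    Valid while the coins suffice to buy the needed votes (roughly n <= 200)."""
    alloc = {1: coins}                          # a lone pirate keeps everything
    for k in range(2, n + 1):
        bribes = math.ceil(k / 2) - 1           # "50% or more" needed, own vote included
        # Cheapest supporters: pirates who would get nothing if this proposal failed.
        zero_getters = sorted(p for p, c in alloc.items() if c == 0)
        if bribes > len(zero_getters) or bribes > coins:
            raise ValueError(f"{k} pirates: 100 coins cannot buy a majority")
        alloc = {p: 0 for p in range(1, k + 1)}
        for p in zero_getters[:bribes]:
            alloc[p] = 1                        # one coin beats the zero they expect otherwise
        alloc[k] = coins - bribes               # proposer keeps the rest
    return alloc

for n in (2, 3, 4, 5, 200):
    a = allocation(n)
    print(n, "pirates -> proposer keeps", a[n], "coins")
# 2 -> 100, 3 -> 99, 4 -> 99, 5 -> 98, 200 -> 1
```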

201 pirates

P201 does not have enough coins to bribe enough pirates and still keep something for himself. Hence, he needs to give away everything.

He gives 100 pirates 1 coin each and keeps 0.

101 out of 201 votes support this suggestion, just over 50%. P201 remains on board, but without profit.

202 pirates

P202 has to bribe 100 of the pirates who would get nothing under P201. He gives away 100 coins and keeps 0 coins.

101 out of 202 votes support the suggestion, exactly 50%. P202 remains on board.

203 pirates

P203 would have to bribe the pirates who would receive nothing under P202. But 100 coins are not enough to gain a majority: he would need 102 votes, and his own vote plus 100 bribes only add up to 101.

Hence, he goes overboard for sure.

204 pirates

P204 does not have enough coins to bribe a majority on his own. But does he go overboard? No. P203 does not want to be the next pirate who has to make a suggestion, because he would then go overboard for sure. Hence, he will support P204’s suggestion. P204 only has to bribe 100 of the first 200 pirates.

He receives 102 votes and stays on board.

205 pirates

P205 cannot count on the votes of P204 and P203, and his 100 coins are not enough to bribe a majority. He is unlucky and goes overboard. But here it becomes tricky: since neither P204 nor P203 would gain anything by letting P205 go overboard (P204 would then have to give away all the coins himself, and P203 would still receive nothing), they might just as well vote for P205’s suggestion.
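The vote arithmetic behind these large cases can be checked with a few lines, assuming, as in the text, that a bribed pirate votes yes and the proposer votes for himself; any further support (as P204 gets from P203) has to come from pirates voting purely out of self-preservation:

```python
import math

# A proposer controls at most his own vote plus 100 one-coin bribes.
for n in range(201, 206):
    needed = math.ceil(n / 2)      # "50% or more" of n voters
    controlled = 1 + 100           # own vote + one bribed pirate per coin
    print(n, "pirates: needs", needed, "controls", controlled,
          "->", "enough" if controlled >= needed else "short")
# 201 and 202: enough; 203, 204, 205: short (P204 is saved only by P203's free vote)
```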

A Cinematic Example

The following movie snippet shows the essence of backward induction and the credibility of threats.

Note: This clip is from “Dr. Strangelove or: How I Learned to Stop Worrying and Love the Bomb,” directed by Stanley Kubrick in 1964. (Length 4:22 min)

Limitations of Backward Induction

In reality, observed behavior often deviates from the predicted Nash equilibrium. The reasons are manifold:

  • Fairness
  • Altruistic punishment: People punish unfair behavior even if this is costly.
  • The game has no predetermined end
  • Different pay-offs than assumed
  • Previous experiences