Kweku's blog: Don't let the meanest people win

With a general election coming in the UK I've been thinking about voting systems. There are lots of interesting proofs showing what's possible (or rather, what's impossible) to achieve in voting systems. One example is that every reasonable voting system (with three or more candidates) is susceptible to tactical voting (the Gibbard--Satterthwaite theorem), where in some circumstances voting for the wrong candidate can improve your chances of a preferred outcome. Similarly, no rank-based voting system for choosing between three or more candidates meets four basic criteria (Arrow's impossibility theorem): roughly that

1. we don't just let one person decide the outcome
2. every set of votes translates into a single winner, in a predictable (not random) manner
3. candidate C entering the race can't change the result from a win for A to a win for B
4. if every voter prefers A to B then B can't win the election.

(In the first version of this post condition 2 said "every election result is in principle possible" which is not the correct condition: it's in fact implied by the fourth condition.)

The last two conditions are the most nuanced; let's explore the last condition in the context of a chilli cook-off. The takeaway point will be that careless scoring systems punish nice people. [My sister tells me that 'Come Dine with Me' is a good example of this: the winner is often not the best chef but the stingiest voter.] Here's the setup.

You're hosting a chilli cook-off with some friends. Each participant scores the other chillis out of 10, but of course can't vote on their own.

As a first exercise, suppose everyone submits literally the same chilli. Who will win? Think about it a bit (I've already hinted at the answer!).

Thought about it?

We can start by imagining there are just two participants. If one really likes the chilli, they score it highly; the other person, liking it less, scores it lower and hence wins. You can perhaps already see what happens with more than two participants: with identical chillis, whoever likes the chilli the least (or rather, is harshest in their scoring) will win. We can prove this carefully by considering which of a given pair scores higher: in the head-to-head between A and B in the table below, they each receive 7 points from C and 9 points from D, so the decisive factor in who ranks higher is that A gives 5 points to B, while B only gives 2 points to A, hence the 3 point gap in their final scores.

Voter \ Cook	A	B	C	D	*Total points given out*
A	--	5	5	5	15
B	2	--	2	2	6
C	7	7	--	7	21
D	9	9	9	--	27
*Total points received*	18	21	16	14

Our argument proves, as can be seen in the table, that rankings based on the total points received will be exactly the opposite of 'generosity rankings' based on the total points given out (in this identical chilli example).
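If you'd like to check the table rather than take my word for it, here is a minimal Python sketch. The layout `scores[voter][cook]` and the function names are just my choices for illustration; the numbers are the individual scores from the table.

```python
# Individual scores from the identical-chilli table, as scores[voter][cook].
scores = {
    "A": {"B": 5, "C": 5, "D": 5},
    "B": {"A": 2, "C": 2, "D": 2},
    "C": {"A": 7, "B": 7, "D": 7},
    "D": {"A": 9, "B": 9, "C": 9},
}


def totals_given(scores):
    """Total points each voter hands out (their 'generosity')."""
    return {voter: sum(row.values()) for voter, row in scores.items()}


def totals_received(scores):
    """Total points each cook receives from the other participants."""
    received = {name: 0 for name in scores}
    for row in scores.values():
        for cook, points in row.items():
            received[cook] += points
    return received


given = totals_given(scores)
received = totals_received(scores)

# Rank from highest to lowest on each measure.
by_points = sorted(received, key=received.get, reverse=True)
by_generosity = sorted(given, key=given.get, reverse=True)

print(by_points)      # ['B', 'A', 'C', 'D']
print(by_generosity)  # ['D', 'C', 'A', 'B']
```

The two lists are exact reverses of each other: the stingiest voter tops the points ranking and the most generous voter comes last.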
__________

Of course, most cook-offs won't involve everyone submitting the exact same chilli, but the same principle applies to show that harsher judges are at an advantage compared to more generous ones. Here's an example of this principle in practice, drawn from a BBC Sport article from a couple of years ago. The headline declared that Virat Kohli was the best batsman, Mitchell Starc the best bowler, and MS Dhoni the best wicketkeeper in world cricket. These rankings result from Kohli scoring 15 points, to 9 for his nearest rival; Starc scoring 14, to 6 for his nearest rival; and Dhoni scoring 13, to 12 for his nearest rival. Where did these scores come from? Here's the explanation:

"Completing our survey on behalf of their team-mates, captains were not allowed to vote for players from their own country and their top selection would be worth three points, their second selection two points and their third selection in a category one point.

Australia, Bangladesh, England, New Zealand, Pakistan and South Africa all took part but, despite repeated attempts to get their responses, India and Sri Lanka did not meet the deadline."

See the problem? If you don't know the nationalities of the winners it may help you to note that two of the three were Indian.

Why is that unsurprising? The 6 points the Indian team should have given out in each category, which would have been spread among the other teams, are missing from the totals. If this were a chilli cook-off, it would be the Indian team (and also the Sri Lankan team) criticising every chilli, while the other teams voted more generously. Starc (Australian) was able to win despite this handicap, and Kohli (Indian) was a worthy victor who would have come first in any case: for chillis, this tells us that when the difference in quality is pronounced enough, a poorly designed scoring rule doesn't on its own swing the result. (Notice also that no Sri Lankan player won, despite sharing the same advantage as the Indian players.) But Dhoni's narrow victory was likely the result of the failings of the scoring system: if the Indian captain had given any points to the runner-up, de Kock of South Africa, his own player Dhoni could at best have tied.
__________

In this case, Kohli's and Starc's margins of victory of 6 or more points mean they would definitely have won even with every captain voting, while Dhoni's margin of just one point means we can be fairly confident de Kock should have won. What if Dhoni had been two points clear? And, more generally, how should you score your chilli cook-offs to stop the meanest entrants winning? (Unless, of course, they also happen to make the best chilli.)

One simple fix in the cricket case would be to rank players not based on the points they got, but based on how many points they got out of the total available to them if everyone had agreed they were the best. Equivalently, we can rank players based on the average points they received per eligible voter. So Dhoni's scores would be divided by the 6 captains eligible to vote for him, while de Kock's scores would only be divided by 5 since the South African captain couldn't vote for him anyway. As we expected, this puts de Kock ahead (2.4 points per voter compared to 2.1666...).
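The arithmetic here is small enough to do by hand, but as a sketch in Python (the variable names are mine; the points and voter counts are those quoted above):

```python
from fractions import Fraction

# Points quoted in the article, and how many of the 6 responding captains
# were eligible to vote for each player (a captain can't vote for a compatriot,
# and the South African captain was among the 6 who responded).
points = {"Dhoni": 13, "de Kock": 12}
eligible_voters = {"Dhoni": 6, "de Kock": 5}

# Average points per eligible voter, kept exact with Fraction.
average = {p: Fraction(points[p], eligible_voters[p]) for p in points}

for player, avg in average.items():
    print(f"{player}: {float(avg):.3f} points per eligible voter")

winner = max(average, key=average.get)
print(winner)  # de Kock
```

Raw totals put Dhoni ahead by one point; dividing by the number of captains who could actually have voted for each player reverses the order.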

For a chilli cook-off we can employ a similar idea: give everyone a fixed budget of points that they have to distribute among the other competitors. Equivalently, we can normalise everyone's scores, dividing by the total number of points they gave out, so that someone who gives 1 point to each other chilli is treated the same as someone who gives 10 points to each chilli. If we went back to the table from before, we would find that now everyone scores 1/3 of a point from each other person (for better scaling, we could multiply up by 5(n-1), where n is the number of competitors; this would ensure everyone gives scores centred around 5 points). So the competition is a tie, as we would hope when the chillis are identical!
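Here is one way this normalisation might look in Python, applied to the identical-chilli scores from earlier. It is only a sketch: the function name and the `scale` parameter are my own, and it assumes every voter gives out at least one point (otherwise we would be dividing by zero).

```python
from fractions import Fraction

# Individual scores from the identical-chilli table, as scores[voter][cook].
scores = {
    "A": {"B": 5, "C": 5, "D": 5},
    "B": {"A": 2, "C": 2, "D": 2},
    "C": {"A": 7, "B": 7, "D": 7},
    "D": {"A": 9, "B": 9, "C": 9},
}


def normalised_totals(scores, scale=1):
    """Divide each voter's scores by that voter's total, then sum per cook."""
    received = {name: Fraction(0) for name in scores}
    for voter, row in scores.items():
        budget = sum(row.values())  # assumed non-zero for every voter
        for cook, points in row.items():
            received[cook] += Fraction(points, budget) * scale
    return received


n = len(scores)
plain = normalised_totals(scores)                        # 1/3 from each other voter
rescaled = normalised_totals(scores, scale=5 * (n - 1))  # 5 from each other voter

print({cook: str(total) for cook, total in plain.items()})
print({cook: str(total) for cook, total in rescaled.items()})
```

Every cook ends up with the same total under either scaling, so the harsh and generous voters are no longer separated.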

__________

As a final remark, this new system doesn't fix every issue. Indeed, as the Gibbard--Satterthwaite theorem suggests, we can't hope to rule out scoring tactically (for example, if you know yours is the second best chilli, you could assign most of your points to the worst chilli and none to the best, to give yourself a chance of winning). Even if everyone accurately represents their preferences the system still isn't perfect. Notably, if one of the participants has no sense of taste, they will distribute their budget of points completely evenly among the other chillis. Such a person is more likely to come last -- they're boosting the worst other chilli with more points than it could hope to fairly get -- and, similarly, also more likely to come first. A 'second order' fix for this would normalise everyone's spread of scores. And we don't have to stop with matching the spread of scores: what we're doing is 'moment matching' (related: the method of moments). If we just match averages and spreads, we leave unchanged two voters whose score patterns are something like (1,6,6,6,6) and (4,4,4,4,9) (the difference being whether you deviate more from your average score to praise good chillis or to punish bad ones: the 'skewness' of your scores). I leave you with three exercises: which of these two voting styles makes you more likely to come last? Is it the same one that makes you more likely to come first? And can you design a scoring system which matches skewness, and even one which forces all voters to have every moment of their scoring patterns match?
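To make the moment-matching remark concrete, here is a small Python sketch comparing the two example score patterns. I'm assuming the population standard deviation as the measure of 'spread' and the standardised third moment as the 'skewness'; other conventions exist, and the function names are mine.

```python
from statistics import mean, pstdev


def skewness(xs):
    """Standardised third moment (population version)."""
    m, s = mean(xs), pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)


def standardise(xs):
    """Shift and rescale scores to mean 0 and spread 1.

    Fails for a voter who gives every chilli the same score (spread 0),
    so that case would need separate handling.
    """
    m, s = mean(xs), pstdev(xs)
    return [(x - m) / s for x in xs]


punishes_bad = [1, 6, 6, 6, 6]
praises_good = [4, 4, 4, 4, 9]

for pattern in (punishes_bad, praises_good):
    print(mean(pattern), pstdev(pattern), skewness(pattern))

print(standardise(punishes_bad))
print(standardise(praises_good))
```

Both patterns have the same average and the same spread, so standardising leaves their shapes untouched; only the sign of the skewness tells them apart.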

1 comment:

Kweku Abraham, 25 January 2020 at 01:39
On reflection, the simple fix in the cricket scores still biases in favour of Dhoni, for reasons related to the "second order fix" discussed in the closing comment. In particular, it seems that captains were nearly unanimous that Dhoni and de Kock were the best two wicketkeepers. That means India "should" have given de Kock three points, which is more than the 2.4 points he effectively gets from them under the described rescaling, because his rival for the three points is eliminated. A simple (ish) way to account for this would be to say that in deciding who wins each head-to-head, you ignore the own team's vote, deleting (in Dhoni vs de Kock) the presumably 3 points given to Dhoni by de Kock's team. This also reduces incentives to vote tactically: your vote never impacts your own player's head-to-heads. Unfortunately, it needn't lead to a single winner: we can end up in "rock-paper-scissors" scenarios, as discussed in the blog post following this one.

Monday, 9 December 2019
