[EM] Summing runoff/IRV: can this hold?

classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

[EM] Summing runoff/IRV: can this hold?

John
[Not subscribed, I think?  CC me on replies]

I think I can sum a countable total for up to five candidates, but it breaks after that.

Consider these properties:
  1. There are N ballots.
  2. For each alternative {A}, there are A[first] ballots where A occurs first.
  3. For each alternative {A}, there are A[last] ballots where A occurs last.
  4. For each pair of alternatives {A,B}, there are A[B] ballots where A directly and immediately precedes B.
  5. No candidates are ranked equally on any ballot, except that all unranked candidates are equally worse than all ranked candidates and are considered to not appear on any ballot on which they are unranked.
In theory, it should be possible to generate two sets of ranked ballots which share these properties.  If this cannot be done, then for C candidates, we require (C*(C-1+2)+1) or (C^2+C+1) pieces of information each equal or less in size than the information that there are N ballots.  In theory, we must store N! distinct pieces of information to store ranked ballots.

Should it be impossible to generate two sets of ranked ballots sharing the above properties, then the above properties can be summed.

Should it be possible to generate two sets of ranked ballots sharing the above properties—it must be—then there is another question.

Is it possible to generate two sets of ranked ballots sharing the above properties such that a runoff between the ballots does not compute exactly the same values for the above properties at each round?

If no, then we may sum the ballots and compute e.g. Instant Runoff Voting, but we may not know what the actual ballots are in this case.

If yes, then a third question:  is it possible to generate two sets of ranked ballots sharing the above properties such that a runoff between the ballots does not compute exactly the same values for the number of ballots on which each alternative occurs first in each round?

If no, then we may still sum the ballots and compute the runoff without knowing the ballots themselves.

If the above were possible, then it should be possible to eliminate each candidate in turn and test the number of additional ballots on which each other candidate appears first, and thus reason that there are N ballots where A appears first, and by elimination you have M ballots each where B, C, D, and so forth are ranked second directly after A, plus ballots where A is the only candidate ranked, and so have counts for A>B A>C A>D and so forth.  Repeat with each alternative and you will have the entirety of first ranks, and then can proceed through every round of run-off singly, and in this way discover all rankings in polynomial time O(n^2) for (n) alternatives.

Consider the logic:

 - "A is first on 10 ballots" is not the proposition; rather, it is that BEGIN precedes A on 10 ballots.
 - Likewise, "A is last on 10 ballots" is just "A precedes END"
 - END cannot precede any alternative on any ballot
 - No candidate can precede BEGIN on any ballot
 - Ballots cannot be added or removed
 - The candidates must appear on exactly the same number of ballots

The initial rules are essentially similar to saying "the [first|last] pair of candidates is {A,B} on N ballots", with A or B being BEGIN or END.

When rearranging ballots, the same number of ballots must start and end with the given candidates.  Consider any race with three candidates:  

 - First and last candidates must stay in place
 - It is impossible to insert candidates between the two candidates ranked on any ballot of two candidates, as you would change the number of ballots or the immediate-precedence properties
 - It is impossible to rearrange second candidates, as this would change the immediate-precedence properties

Thus in a race of three candidates, the above properties will always describe all ballots.

In a race of four candidates, I cannot break these rules:

 - A>B>C
 - D>A>B>C

The unit of three can move around, but we cannot have a mobile candidate without ranking in multiple positions on a single ballot.

In a race of five candidates, I can break these rules:

 - A>B>C>E
 - E>A>D>C

If we were to swap B and D, then all is well; however, because only one candidate remains unranked, these ballots are mathematically-equivalent to a completed ballot, and can be recorded as such:

 - A>B>C>E>D
 - E>A>D>C>B

Now swapping any candidates between the rankings is again impossible.  If instead our original ballots are as follows:

 - A>B>C
 - E>A>D>C

Then we can produce the following:

 - A>D>C
 - E>A>B>C

Yet this continues to illustrate the problem that we can simply account as such:

 - A>B>C
 - E>A>D>C>B

Now with five candidates it is impossible to generate a set of ballots with mobile candidates which can swap positions.

To my senses, this creates an absurdity.

There must be 5! or 120 combinations, and so we need 120 pieces of information—those being integers of log(N) bits for N ballots—to store the count of any individual combination, each of which may be 100% of the ballots or 0% of the ballots, or some number in between.

Yet to store this information uniquely-identifying a set of ranked ballots in a race between five candidates, we only need:

 - 5 pieces for the number of each candidate ranked first.
 - 5 pieces for the number of each candidate ranked last.
 - 20 pieces, such that for each of the 5 candidates, 5 pieces to indicate on how many ballots the candidate precedes each other candidate.
 - 1 piece, indicating the number of ballots.

That is 31 pieces of information or 26%.

It must be that this information can be stored in variable sizes of space by indicating default values and truncation, and 30% is the absolute average compression.

The properties fail to extend to six candidates:

- F>A>B>C
- E>A>D>C>B>F

If we truncate ballot two:

 - F>A>B>C
 - E>A>D>C

We can now swap:

 - F>A>D>C
 - E>A>B>C

Eliminate F. then A, and you have B first on one set, but D first on the other.

However, there's one more problem:  if you have a list of pairwise vote margins, such swaps change those totals.  Thus for six or more, you also need C^2-C additional pieces of information.  For six:

 - 6 pieces for the number of each candidate ranked first.
 - 6 pieces for the number of each candidate ranked last.
 - 30 pieces, such that for each of the 5 candidates, 5 pieces to indicate on how many ballots the candidate precedes each other candidate.
 - 30 pieces, being the pairwise vote margins
 - 1 piece, indicating the number of ballots.

Rather than 6! or 720, you need a mere 73 or 10% of the storage space.

I am uncertain of the pairwise vote margins conjecture and have been unable to prove it mathematically within these bounds.  Pairwise margins and even pairwise vote tallies is not sufficient to uniquely identify a set of ranked ballots.

As well, it may very well be NP-Hard to compute ballots from this information even if it is unique per ballot.

Still, it is summable and unique for up to five candidates, or so it seems; and may be summable and unique for six with this extra information.

----
Election-Methods mailing list - see https://electorama.com/em for list info