[EM] VoteFair Ranking software version 6.0 in C++ with MIT license

classic Classic list List threaded Threaded
13 messages Options
Reply | Threaded
Open this post in threaded view
|

[EM] VoteFair Ranking software version 6.0 in C++ with MIT license

VoteFair-2
Recently I "ported" the Perl-language VoteFair Ranking software to the
C++ language.  The Perl version was released under the Perl Artistic
License.  The C++ version is available under the MIT license.  Here is
the link to the new C++ version:

   https://github.com/cpsolver/VoteFair-ranking-cpp

For those who don't already know about VoteFair ranking, here is a brief
description (copied from the announcement on Reddit):

VoteFair Ranking uses ranked ballots and provides two kinds of PR
(Proportional Representation). In addition to allocating some
statewide/nationwide seats based on political-party preferences, each
double-sized-plus district has two seats, and the second seat is won by
the candidate who best represents the voters who are not
well-represented by the first-seat winner.

An additional fairness feature of VoteFair Ranking is that voters rank
the political parties, and the results determine which two or three
parties in that district can offer a second candidate in the next
election. It also identifies which parties are so unpopular (in that
district) that they should not be allowed to offer any candidate in that
district. (Remember that vote splitting is what currently limits each
political party to a single candidate.)

The ballots are very simple. One question asks voters to rank the
candidates in that district, and the other question asks voters to rank
political parties. The recommended ballot format is paper ballots with
ovals that are marked to indicate first choice, second choice, third
choice, and so on down to least-favorite choice. The software allows
more than one candidate (or party) to be marked at the same preference
level.

Recently the Perl version was used in the 2019 Canadian Federal Election
Anti-Vote-Splitting Poll here:

   https://www.newsherenow.com/news_here_now_poll_results_canada.html

Alas, there were not enough participants to provide meaningful results.

VoteFair calculations use pairwise counting to ensure especially fair
results. Specifically, VoteFair popularity ranking is mathematically
equivalent to the Condorcet-Kemeny method. Although some people dismiss
this method as requiring too much computer time when there are lots of
candidates, this software is very fast, even when there are 50 choices.

The VoteFair.org website has been using earlier versions of VoteFair
Ranking software for the last two decades, so the software has been used
for many real-life elections and polls in non-governmental organizations
throughout the world.


Here is what I wrote in response to a question about how the software
works:  If you're already familiar with "Condorcet" methods, the
foundation of the calculations can be thought of this way: When the
pairwise (one-on-one) counts are arranged in a table (matrix) the
sequence of candidates (from possibly most popular to possibly least
popular) is rearranged until the biggest counts are in the upper-right
triangular area (of the table/matrix), and the smallest counts are in
the lower-left triangular area.


Richard Fobes
----
Election-Methods mailing list - see https://electorama.com/em for list info
Reply | Threaded
Open this post in threaded view
|

Re: [EM] VoteFair Ranking software version 6.0 in C++ with MIT license

Kristofer Munsterhjelm-3
On 16/12/2019 07.42, VoteFair wrote:

> VoteFair calculations use pairwise counting to ensure especially fair
> results. Specifically, VoteFair popularity ranking is mathematically
> equivalent to the Condorcet-Kemeny method. Although some people dismiss
> this method as requiring too much computer time when there are lots of
> candidates, this software is very fast, even when there are 50 choices.

I can't help myself, but must point out again that I think it's more
accurate to say (if this is true), that VoteFair popularity ranking is
an approximation to Kemeny that becomes Kemeny in the limit as a certain
parameter approaches infinity.

I don't remember the name of the parameter; you probably do. But I
remember I found examples where Kemeny and your method disagreed for
elections with very large Smith sets. You said that you could always
increase the value of this parameter to get the correct result.

Unfortunately, letting that parameter approach infinity destroys
VoteFair's polynomial runtime. So VoteFair does not prove P=NP :-)
----
Election-Methods mailing list - see https://electorama.com/em for list info
Reply | Threaded
Open this post in threaded view
|

Re: [EM] VoteFair Ranking software version 6.0 in C++ with MIT license

VoteFair-2
On 12/18/2019 2:04 AM, Kristofer Munsterhjelm wrote:
 > I can't help myself, but must point out again that I think it's more
 > accurate to say (if this is true), that VoteFair popularity ranking is
 > an approximation to Kemeny that becomes Kemeny in the limit as a certain
 > parameter approaches infinity.

Yes, when there are a large number of candidates, such as 50 candidates
as an example, estimations are done to yield a full ranking, and then
the top 6 or 7 or 8 (this is adjustable) most-popular candidates are
ranked using the exact Condorcet-Kemeny method to determine the ranking
for those top candidates.

Yes, you came up with an example of a carefully constructed case that
involved something like 50 candidates -- and a small number of carefully
balanced groups of voters.  But that example illustrates my point that
such a case would not occur among voters who are sincerely marking their
ballots independently, without coordination with each other.

In elections, where just a single winner is needed, the only way
VoteFair ranking and the full Condorcet-Kemeny calculations can identify
different winners is if the case involves numerous almost-equally
popular candidates (at the top, not the middle or bottom).
(Specifically there needs to be a carefully constructed Condorcet cycle
that involves a large number of candidates and a small number of
carefully balanced groups of voters.)

 > Unfortunately, letting that parameter approach infinity destroys
 > VoteFair's polynomial runtime. So VoteFair does not prove P=NP :-)

Yes you are correct that increasing the full calculation parameter
further (such as even to 20) would result in a very long computation
time, which is of course not a "polynomial runtime."

Yet for election purposes, as opposed to mathematical-proof purposes, I
cannot imagine needing to increase the full calculation parameter to
check more than a few top candidates.  (If that affects the outcome,
then just a few spoiled ballots also would be just as likely to change
the outcome.)

Richard Fobes


On 12/18/2019 2:04 AM, Kristofer Munsterhjelm wrote:

> On 16/12/2019 07.42, VoteFair wrote:
>
>> VoteFair calculations use pairwise counting to ensure especially fair
>> results. Specifically, VoteFair popularity ranking is mathematically
>> equivalent to the Condorcet-Kemeny method. Although some people dismiss
>> this method as requiring too much computer time when there are lots of
>> candidates, this software is very fast, even when there are 50 choices.
>
> I can't help myself, but must point out again that I think it's more
> accurate to say (if this is true), that VoteFair popularity ranking is
> an approximation to Kemeny that becomes Kemeny in the limit as a certain
> parameter approaches infinity.
>
> I don't remember the name of the parameter; you probably do. But I
> remember I found examples where Kemeny and your method disagreed for
> elections with very large Smith sets. You said that you could always
> increase the value of this parameter to get the correct result.
>
> Unfortunately, letting that parameter approach infinity destroys
> VoteFair's polynomial runtime. So VoteFair does not prove P=NP :-)
>
----
Election-Methods mailing list - see https://electorama.com/em for list info
Reply | Threaded
Open this post in threaded view
|

Re: [EM] VoteFair Ranking software version 6.0 in C++ with MIT license

Kristofer Munsterhjelm-3
On 19/12/2019 00.26, VoteFair wrote:

> On 12/18/2019 2:04 AM, Kristofer Munsterhjelm wrote:
>> I can't help myself, but must point out again that I think it's more
>> accurate to say (if this is true), that VoteFair popularity ranking is
>> an approximation to Kemeny that becomes Kemeny in the limit as a certain
>> parameter approaches infinity.
>
> Yes, when there are a large number of candidates, such as 50 candidates
> as an example, estimations are done to yield a full ranking, and then
> the top 6 or 7 or 8 (this is adjustable) most-popular candidates are
> ranked using the exact Condorcet-Kemeny method to determine the ranking
> for those top candidates.
>
> Yes, you came up with an example of a carefully constructed case that
> involved something like 50 candidates -- and a small number of carefully
> balanced groups of voters.  But that example illustrates my point that
> such a case would not occur among voters who are sincerely marking their
> ballots independently, without coordination with each other.

I couldn't seem to find the example, so I tried to create a new one. My
tools are a bit old and hacky, so I may have got it wrong, but I think
this should work:

1: A>D>F>E>C>B>G
1: C>A>F>B>E>G>D
1: C>B>A>F>D>G>E
1: C>G>E>A>F>B>D
1: D>F>C>E>B>A>G
1: E>C>A>D>G>F>B
1: F>E>G>C>D>B>A
1: F>G>B>E>A>D>C
1: G>A>F>C>D>B>E

Pairwise matrix:

- 5 3 7 4 6 5
4 - 1 4 4 1 4
6 8 - 6 5 4 6
2 5 3 - 4 3 4
5 5 4 5 - 2 5
3 8 5 6 7 - 6
4 5 3 5 4 3 -

VoteFair ballot format:

request-no-rep  request-no-party  case 1 q 1 choices 7  x 1 q 1 3 2 1 6
4 7 5  x 1 q 1 7 1 6 3 4 2 5  x 1 q 1 5 3 1 4 7 6 2  x 1 q 1 6 5 7 3 4 2
1  x 1 q 1 4 6 3 5 2 1 7  x 1 q 1 1 4 6 5 3 2 7  x 1 q 1 3 7 5 1 6 2 4
x 1 q 1 3 1 6 2 5 7 4  x 1 q 1 6 7 2 5 1 4 3

There are four candidates in the Smith set: A, C, E, and F.

The VoteFair ranking is F>C>E>A>G>D>B with Kemeny score 118, but
C>A>F>E>G>D>B has Kemeny score 119 and is the unique maximum ranking.

If that's right, there's no need for 50 candidates to make a
counterexample. The example above does have a carefully balanced
Condorcet cycle, however (moreso because it is optimized to have as few
voters as possible).

> In elections, where just a single winner is needed, the only way
> VoteFair ranking and the full Condorcet-Kemeny calculations can identify
> different winners is if the case involves numerous almost-equally
> popular candidates (at the top, not the middle or bottom). (Specifically
> there needs to be a carefully constructed Condorcet cycle that involves
> a large number of candidates and a small number of carefully balanced
> groups of voters.)
>
>> Unfortunately, letting that parameter approach infinity destroys
>> VoteFair's polynomial runtime. So VoteFair does not prove P=NP :-)
>
> Yes you are correct that increasing the full calculation parameter
> further (such as even to 20) would result in a very long computation
> time, which is of course not a "polynomial runtime."
>
> Yet for election purposes, as opposed to mathematical-proof purposes, I
> cannot imagine needing to increase the full calculation parameter to
> check more than a few top candidates.  (If that affects the outcome,
> then just a few spoiled ballots also would be just as likely to change
> the outcome.)

My point is that a phrase like "mathematically equivalent" is about
mathematical-proof purposes, not practical election purposes. So I
wouldn't say "mathematically equivalent" would be accurate for an
approximation, any more than say

f(x) = sum k = 0...100: x^k/k! if x is a prime integer > 10^9
       sum k = 0...infinity x^k/k! otherwise

is mathematically equivalent to e^x (where you always have to take the
sum as k goes to infinity, not stop at 100). It may be pretty close for
a typical number (as integers have measure zero), but it's not the same
function.

If functions are mathematically equivalent, then proofs about the former
should also apply to the latter. But properties about e^x won't hold of
f above, and properties of Kemeny won't hold of VoteFair over the whole
domain. E.g. Kemeny passes LIIA but VoteFair doesn't (as eliminating the
loser of the example election above would probably result in VoteFair
returning the optimal Kemeny result).

Mathematical equivalence might be too strict, but it is what it is. Or
that's what I'd think the term would mean.

You could save mathematical equivalence by restricting the domain to
e.g. elections with a Smith set no larger than three candidates, prove
that the sort-based VoteFair mechanism agrees with Kemeny within this
domain with the usual parameters, and then say that VoteFair is
mathematically equivalent on that domain. That would be a qualified
mathematical equivalence, corresponding to your "real election purposes".
----
Election-Methods mailing list - see https://electorama.com/em for list info
Reply | Threaded
Open this post in threaded view
|

Re: [EM] VoteFair Ranking software version 6.0 in C++ with MIT license

VoteFair-2
On 1/2/2020 3:00 PM, Kristofer Munsterhjelm wrote:
 > ...
 > The VoteFair ranking is F>C>E>A>G>D>B with Kemeny score 118, but
 > C>A>F>E>G>D>B has Kemeny score 119 and is the unique maximum ranking.

Yes, the default value for variable
"global_check_all_scores_choice_limit" is 6 and it gives a result that
differs from the (full) Condorcet-Kemeny method.

Increasing this variable to 7 causes the full Kemeny calculation process
to be done across all 7 choices, and it gets the same result as doing
the full Condorcet-Kemeny calculation method.

This is consistent with what I said earlier.

Of course the calculation time for the case in which there is a
Condorcet (rock-paper-scissors) cycle across 50 choices is going to be
way too time-consuming for this software.

As I've also said before, the odds of such a cycle (involving either 7
or 50 choices) happening in a real election is lower than the odds of a
conventional tie.  Either kind of tie can be resolved using a judicial
decision -- like the Bush versus Gore decision from the US Supreme
Court, which was not even an exact tie.

The point remains that the VoteFair ranking software matches the
Condorcet-Kemeny calculations for as many choices as you choose.  The
choice is determined by the value of the variable
"global_check_all_scores_choice_limit".

Thank you for checking.  And especially thank you for providing the case
in the format that I could directly plug into the software.

I have added this case to the GitHub repository.  Again, thanks!

Richard Fobes


On 1/2/2020 3:00 PM, Kristofer Munsterhjelm wrote:

> On 19/12/2019 00.26, VoteFair wrote:
>> On 12/18/2019 2:04 AM, Kristofer Munsterhjelm wrote:
>>> I can't help myself, but must point out again that I think it's more
>>> accurate to say (if this is true), that VoteFair popularity ranking is
>>> an approximation to Kemeny that becomes Kemeny in the limit as a certain
>>> parameter approaches infinity.
>>
>> Yes, when there are a large number of candidates, such as 50 candidates
>> as an example, estimations are done to yield a full ranking, and then
>> the top 6 or 7 or 8 (this is adjustable) most-popular candidates are
>> ranked using the exact Condorcet-Kemeny method to determine the ranking
>> for those top candidates.
>>
>> Yes, you came up with an example of a carefully constructed case that
>> involved something like 50 candidates -- and a small number of carefully
>> balanced groups of voters.  But that example illustrates my point that
>> such a case would not occur among voters who are sincerely marking their
>> ballots independently, without coordination with each other.
>
> I couldn't seem to find the example, so I tried to create a new one. My
> tools are a bit old and hacky, so I may have got it wrong, but I think
> this should work:
>
> 1: A>D>F>E>C>B>G
> 1: C>A>F>B>E>G>D
> 1: C>B>A>F>D>G>E
> 1: C>G>E>A>F>B>D
> 1: D>F>C>E>B>A>G
> 1: E>C>A>D>G>F>B
> 1: F>E>G>C>D>B>A
> 1: F>G>B>E>A>D>C
> 1: G>A>F>C>D>B>E
>
> Pairwise matrix:
>
> - 5 3 7 4 6 5
> 4 - 1 4 4 1 4
> 6 8 - 6 5 4 6
> 2 5 3 - 4 3 4
> 5 5 4 5 - 2 5
> 3 8 5 6 7 - 6
> 4 5 3 5 4 3 -
>
> VoteFair ballot format:
>
> request-no-rep  request-no-party  case 1 q 1 choices 7  x 1 q 1 3 2 1 6
> 4 7 5  x 1 q 1 7 1 6 3 4 2 5  x 1 q 1 5 3 1 4 7 6 2  x 1 q 1 6 5 7 3 4 2
> 1  x 1 q 1 4 6 3 5 2 1 7  x 1 q 1 1 4 6 5 3 2 7  x 1 q 1 3 7 5 1 6 2 4
> x 1 q 1 3 1 6 2 5 7 4  x 1 q 1 6 7 2 5 1 4 3
>
> There are four candidates in the Smith set: A, C, E, and F.
>
> The VoteFair ranking is F>C>E>A>G>D>B with Kemeny score 118, but
> C>A>F>E>G>D>B has Kemeny score 119 and is the unique maximum ranking.
>
> If that's right, there's no need for 50 candidates to make a
> counterexample. The example above does have a carefully balanced
> Condorcet cycle, however (moreso because it is optimized to have as few
> voters as possible).
>
>> In elections, where just a single winner is needed, the only way
>> VoteFair ranking and the full Condorcet-Kemeny calculations can identify
>> different winners is if the case involves numerous almost-equally
>> popular candidates (at the top, not the middle or bottom). (Specifically
>> there needs to be a carefully constructed Condorcet cycle that involves
>> a large number of candidates and a small number of carefully balanced
>> groups of voters.)
>>
>>> Unfortunately, letting that parameter approach infinity destroys
>>> VoteFair's polynomial runtime. So VoteFair does not prove P=NP :-)
>>
>> Yes you are correct that increasing the full calculation parameter
>> further (such as even to 20) would result in a very long computation
>> time, which is of course not a "polynomial runtime."
>>
>> Yet for election purposes, as opposed to mathematical-proof purposes, I
>> cannot imagine needing to increase the full calculation parameter to
>> check more than a few top candidates.  (If that affects the outcome,
>> then just a few spoiled ballots also would be just as likely to change
>> the outcome.)
>
> My point is that a phrase like "mathematically equivalent" is about
> mathematical-proof purposes, not practical election purposes. So I
> wouldn't say "mathematically equivalent" would be accurate for an
> approximation, any more than say
>
> f(x) = sum k = 0...100: x^k/k! if x is a prime integer > 10^9
>        sum k = 0...infinity x^k/k! otherwise
>
> is mathematically equivalent to e^x (where you always have to take the
> sum as k goes to infinity, not stop at 100). It may be pretty close for
> a typical number (as integers have measure zero), but it's not the same
> function.
>
> If functions are mathematically equivalent, then proofs about the former
> should also apply to the latter. But properties about e^x won't hold of
> f above, and properties of Kemeny won't hold of VoteFair over the whole
> domain. E.g. Kemeny passes LIIA but VoteFair doesn't (as eliminating the
> loser of the example election above would probably result in VoteFair
> returning the optimal Kemeny result).
>
> Mathematical equivalence might be too strict, but it is what it is. Or
> that's what I'd think the term would mean.
>
> You could save mathematical equivalence by restricting the domain to
> e.g. elections with a Smith set no larger than three candidates, prove
> that the sort-based VoteFair mechanism agrees with Kemeny within this
> domain with the usual parameters, and then say that VoteFair is
> mathematically equivalent on that domain. That would be a qualified
> mathematical equivalence, corresponding to your "real election purposes".
>
----
Election-Methods mailing list - see https://electorama.com/em for list info
Reply | Threaded
Open this post in threaded view
|

Re: [EM] VoteFair Ranking software version 6.0 in C++ with MIT license

Kristofer Munsterhjelm-3
On 04/01/2020 01.58, VoteFair wrote:

> On 1/2/2020 3:00 PM, Kristofer Munsterhjelm wrote:
>> ...
>> The VoteFair ranking is F>C>E>A>G>D>B with Kemeny score 118, but
>> C>A>F>E>G>D>B has Kemeny score 119 and is the unique maximum ranking.
>
> Yes, the default value for variable
> "global_check_all_scores_choice_limit" is 6 and it gives a result that
> differs from the (full) Condorcet-Kemeny method.
>
> Increasing this variable to 7 causes the full Kemeny calculation process
> to be done across all 7 choices, and it gets the same result as doing
> the full Condorcet-Kemeny calculation method.
>
> This is consistent with what I said earlier.
>
> Of course the calculation time for the case in which there is a
> Condorcet (rock-paper-scissors) cycle across 50 choices is going to be
> way too time-consuming for this software.
>
> As I've also said before, the odds of such a cycle (involving either 7
> or 50 choices) happening in a real election is lower than the odds of a
> conventional tie.  Either kind of tie can be resolved using a judicial
> decision -- like the Bush versus Gore decision from the US Supreme
> Court, which was not even an exact tie.

Do you agree with my point about the use of the term "mathematically
equivalent"? If the method appears to be equivalent, that might give the
impression that you don't need to care about cycles at all.
----
Election-Methods mailing list - see https://electorama.com/em for list info
Reply | Threaded
Open this post in threaded view
|

Re: [EM] VoteFair Ranking software version 6.0 in C++ with MIT license

VoteFair-2
On 1/5/2020 5:48 AM, Kristofer Munsterhjelm wrote:
 > Do you agree with my point about the use of the term "mathematically
 > equivalent"? If the method appears to be equivalent, that might give the
 > impression that you don't need to care about cycles at all.

VoteFair popularity ranking IS mathematically equivalent to the
Condorcet-Kemeny method. The difference is that John Kemeny described
the version that counts disapprovals (and finds the smallest sequence
score), whereas VoteFair popularity ranking counts approvals (and finds
the largest sequence score).

The VoteFair Ranking software does calculate the full version (of
VoteFair popularity ranking) to the extent that the
"global_check_all_scores_choice_limit" value is as large as desired.

I choose to leave that value at 6, even though it could be set to 20 or
higher with no execution errors.  Of course in that case it would be
helpful to insert code that checks for time delays and shows progress in
case a Condorcet cycle is slowing it down.

I'm not sure what you mean by:
 > ... that might give the
 > impression that you don't need to care about cycles at all.

Neither the method nor the software implies that cycles are of no
importance.

Richard Fobes


On 1/5/2020 5:48 AM, Kristofer Munsterhjelm wrote:

> On 04/01/2020 01.58, VoteFair wrote:
>> On 1/2/2020 3:00 PM, Kristofer Munsterhjelm wrote:
>>> ...
>>> The VoteFair ranking is F>C>E>A>G>D>B with Kemeny score 118, but
>>> C>A>F>E>G>D>B has Kemeny score 119 and is the unique maximum ranking.
>>
>> Yes, the default value for variable
>> "global_check_all_scores_choice_limit" is 6 and it gives a result that
>> differs from the (full) Condorcet-Kemeny method.
>>
>> Increasing this variable to 7 causes the full Kemeny calculation process
>> to be done across all 7 choices, and it gets the same result as doing
>> the full Condorcet-Kemeny calculation method.
>>
>> This is consistent with what I said earlier.
>>
>> Of course the calculation time for the case in which there is a
>> Condorcet (rock-paper-scissors) cycle across 50 choices is going to be
>> way too time-consuming for this software.
>>
>> As I've also said before, the odds of such a cycle (involving either 7
>> or 50 choices) happening in a real election is lower than the odds of a
>> conventional tie.  Either kind of tie can be resolved using a judicial
>> decision -- like the Bush versus Gore decision from the US Supreme
>> Court, which was not even an exact tie.
>
> Do you agree with my point about the use of the term "mathematically
> equivalent"? If the method appears to be equivalent, that might give the
> impression that you don't need to care about cycles at all.
>
----
Election-Methods mailing list - see https://electorama.com/em for list info
Reply | Threaded
Open this post in threaded view
|

Re: [EM] VoteFair Ranking software version 6.0 in C++ with MIT license

Kristofer Munsterhjelm-3
On 06/01/2020 06.19, VoteFair wrote:

> On 1/5/2020 5:48 AM, Kristofer Munsterhjelm wrote:
>> Do you agree with my point about the use of the term "mathematically
>> equivalent"? If the method appears to be equivalent, that might give the
>> impression that you don't need to care about cycles at all.
>
> VoteFair popularity ranking IS mathematically equivalent to the
> Condorcet-Kemeny method. The difference is that John Kemeny described
> the version that counts disapprovals (and finds the smallest sequence
> score), whereas VoteFair popularity ranking counts approvals (and finds
> the largest sequence score).
>
> The VoteFair Ranking software does calculate the full version (of
> VoteFair popularity ranking) to the extent that the
> "global_check_all_scores_choice_limit" value is as large as desired.
>
> I choose to leave that value at 6, even though it could be set to 20 or
> higher with no execution errors.  Of course in that case it would be
> helpful to insert code that checks for time delays and shows progress in
> case a Condorcet cycle is slowing it down.

What you seem to be saying is that the objective function (what you're
trying to optimize), when taking into account the direction of the
optimization, is the same, so the problem is mathematically equivalent.

That's right. But that doesn't suffice to make the *solution function*
mathematically equivalent. If it did, then likewise any approximation to
e.g. Traveling Salesman would be mathematically equivalent with solving
Traveling Salesman itself, just because it counted the total distance
the salesman has to travel the same way.

VoteFair may *approximate* Kemeny, and it may use an equivalent scoring
function, but for any value of the constant, there exist an infinite
number of inputs for which the outputs will disagree.

Note that this isn't about "likely to disagree". Mathematically, if
there's a single difference between two functions' mapping, then the
functions are different. This holds regardless of the implementation
(because a mathematical mapping is agnostic to how it's implemented).
E.g. both Insertion and Merge sort implement the mapping from unsorted
lists to stably sorted ones, but the algorithms themselves are very
different.

> I'm not sure what you mean by:
>> ... that might give the
>> impression that you don't need to care about cycles at all.
>
> Neither the method nor the software implies that cycles are of no
> importance.

Someone who reads "mathematically equivalent" as "proven to give the
same results" will think that there are no inputs for which the outputs
differ. Thus he may think that there's no need to be on the lookout for
strange behavior under certain conditions.

I seem to recall that around 2010, Warren thought that you were claiming
that VoteFair did exactly reproduce Kemeny results/optima (for every
election, i.e. not an approximation). There was then a long thread about
NP-hardness and whether or not determining the Kemeny winner is NP-hard
(which it is). I *think* you reached the conclusion that VoteFair was an
approximation, but I don't clearly remember.

The point is, Warren's a mathematician and got confused about what
"mathematically equivalent" meant as you used it. I'd imagine a
non-mathematician would have even less chance of getting it right.

And if he understands "mathematically equivalent" that way, and think
VoteFair always returns the Kemeny optimum, then he's not going to
think that "oops, I have to be careful about *these* elections", because
there's nothing to be careful about. After all, it's equivalent, so
shouldn't it return the correct outcome for every election? VoteFair
itself doesn't give any indication of when the results are suspect and
should be thrown out for a tie instead, either.

(I'd suggest replacing the insertion sort code with an integer
programming solver (like Math::LP::Solve or Math::GLPK) and setting a
timeout. Such a solver either returns the optimum or says "couldn't do
that in the time given". It's equivalent over the domain where it
returns a result, and outside of that domain, it clearly times out.

Or more generally, alter the code so that instead of returning a
possibly-right result always in finite time, always returning a right
result but sometimes taking a very long time about it. A timeout is
easier to diagnose than a result that looks right but isn't. But if
you're going to do that, why not use an integer programming solver?)
----
Election-Methods mailing list - see https://electorama.com/em for list info
Reply | Threaded
Open this post in threaded view
|

Re: [EM] VoteFair Ranking software version 6.0 in C++ with MIT license

VoteFair-2
Kristofer ~

It occurs to me that you might be thinking that the VoteFair Ranking
SOFTWARE provides the definition of VoteFair popularity ranking.  That
is not the case.

VoteFair popularity ranking is defined in two of my books -- "The
Creative Problem Solver's Toolbox" and "Ending The Hidden Unfairness In
U.S. Elections" -- and in the Condorcet-Kemeny ("Kemeny-Young")
Wikipedia article.

Those definitions should make it clear to anyone who understands
mathematics that the difference between VoteFair popularity ranking and
the method John Kemeny published is that Kemeny counts opposition and
VoteFair counts support, and that otherwise the two methods are
mathematically equivalent.

If you think that the VoteFair Ranking SOFTWARE does not correctly
calculate VoteFair popularity ranking results, please recognize -- as I
have said before -- that you (or anyone) can increase the constant that
specifies how many choices are calculated according to the full VoteFair
popularity ranking method.

As I have said before, I choose a smaller value for that constant
because the software is intended for use in real-life voting situations
where other complications arise as more important than issues of
mathematical "purity."

So, to repeat, VoteFair popularity ranking IS mathematically equivalent
to the Condorcet-Kemeny method, and the VoteFair Ranking software can
calculate either the pure VoteFair popularity ranking results or results
that only rarely differ from the pure VoteFair popularity ranking
results. The choice is determined by the constant.

As another option, if someone prefers that the VoteFair Ranking SOFTWARE
is ALWAYS mathematically pure -- according to the VoteFair popularity
ranking definition -- and never consumes too much computation time, then
they can reduce the limit on the maximum number of choices to something
like 12 choices, and also change the constant (referred to above) to 12.
  Then the software will NEVER depart from VoteFair popularity ranking
as defined in my books and in Wikipedia.

Is this distinction between method and software clear?  If not, please
clarify your concern, and specify whether you are referring to the
method or the software.

Thanks,

Richard Fobes


On 1/10/2020 3:30 AM, Kristofer Munsterhjelm wrote:

> On 06/01/2020 06.19, VoteFair wrote:
>> On 1/5/2020 5:48 AM, Kristofer Munsterhjelm wrote:
>>> Do you agree with my point about the use of the term "mathematically
>>> equivalent"? If the method appears to be equivalent, that might give the
>>> impression that you don't need to care about cycles at all.
>>
>> VoteFair popularity ranking IS mathematically equivalent to the
>> Condorcet-Kemeny method. The difference is that John Kemeny described
>> the version that counts disapprovals (and finds the smallest sequence
>> score), whereas VoteFair popularity ranking counts approvals (and finds
>> the largest sequence score).
>>
>> The VoteFair Ranking software does calculate the full version (of
>> VoteFair popularity ranking) to the extent that the
>> "global_check_all_scores_choice_limit" value is as large as desired.
>>
>> I choose to leave that value at 6, even though it could be set to 20 or
>> higher with no execution errors.  Of course in that case it would be
>> helpful to insert code that checks for time delays and shows progress in
>> case a Condorcet cycle is slowing it down.
>
> What you seem to be saying is that the objective function (what you're
> trying to optimize), when taking into account the direction of the
> optimization, is the same, so the problem is mathematically equivalent.
>
> That's right. But that doesn't suffice to make the *solution function*
> mathematically equivalent. If it did, then likewise any approximation to
> e.g. Traveling Salesman would be mathematically equivalent with solving
> Traveling Salesman itself, just because it counted the total distance
> the salesman has to travel the same way.
>
> VoteFair may *approximate* Kemeny, and it may use an equivalent scoring
> function, but for any value of the constant, there exist an infinite
> number of inputs for which the outputs will disagree.
>
> Note that this isn't about "likely to disagree". Mathematically, if
> there's a single difference between two functions' mapping, then the
> functions are different. This holds regardless of the implementation
> (because a mathematical mapping is agnostic to how it's implemented).
> E.g. both Insertion and Merge sort implement the mapping from unsorted
> lists to stably sorted ones, but the algorithms themselves are very
> different.
>
>> I'm not sure what you mean by:
>>> ... that might give the
>>> impression that you don't need to care about cycles at all.
>>
>> Neither the method nor the software implies that cycles are of no
>> importance.
>
> Someone who reads "mathematically equivalent" as "proven to give the
> same results" will think that there are no inputs for which the outputs
> differ. Thus he may think that there's no need to be on the lookout for
> strange behavior under certain conditions.
>
> I seem to recall that around 2010, Warren thought that you were claiming
> that VoteFair did exactly reproduce Kemeny results/optima (for every
> election, i.e. not an approximation). There was then a long thread about
> NP-hardness and whether or not determining the Kemeny winner is NP-hard
> (which it is). I *think* you reached the conclusion that VoteFair was an
> approximation, but I don't clearly remember.
>
> The point is, Warren's a mathematician and got confused about what
> "mathematically equivalent" meant as you used it. I'd imagine a
> non-mathematician would have even less chance of getting it right.
>
> And if he understands "mathematically equivalent" that way, and think
> VoteFair always returns the Kemeny optimum, then he's not going to
> think that "oops, I have to be careful about *these* elections", because
> there's nothing to be careful about. After all, it's equivalent, so
> shouldn't it return the correct outcome for every election? VoteFair
> itself doesn't give any indication of when the results are suspect and
> should be thrown out for a tie instead, either.
>
> (I'd suggest replacing the insertion sort code with an integer
> programming solver (like Math::LP::Solve or Math::GLPK) and setting a
> timeout. Such a solver either returns the optimum or says "couldn't do
> that in the time given". It's equivalent over the domain where it
> returns a result, and outside of that domain, it clearly times out.
>
> Or more generally, alter the code so that instead of returning a
> possibly-right result always in finite time, always returning a right
> result but sometimes taking a very long time about it. A timeout is
> easier to diagnose than a result that looks right but isn't. But if
> you're going to do that, why not use an integer programming solver?)
>
----
Election-Methods mailing list - see https://electorama.com/em for list info
Reply | Threaded
Open this post in threaded view
|

Re: [EM] VoteFair Ranking software version 6.0 in C++ with MIT license

Kristofer Munsterhjelm-3
On 12/01/2020 05.34, VoteFair wrote:

> Kristofer ~
>
> It occurs to me that you might be thinking that the VoteFair Ranking
> SOFTWARE provides the definition of VoteFair popularity ranking.  That
> is not the case.
>
> VoteFair popularity ranking is defined in two of my books -- "The
> Creative Problem Solver's Toolbox" and "Ending The Hidden Unfairness In
> U.S. Elections" -- and in the Condorcet-Kemeny ("Kemeny-Young")
> Wikipedia article.
>
> Those definitions should make it clear to anyone who understands
> mathematics that the difference between VoteFair popularity ranking and
> the method John Kemeny published is that Kemeny counts opposition and
> VoteFair counts support, and that otherwise the two methods are
> mathematically equivalent.

Let me try to restate that to see if I got it right, because it seems
like an odd distinction.

A ranked voting method is a function that takes, as an input, a set of
ballots, and produces either a social ranking or a winner set (depending
on what definition you agree with).

What happens inside the function is irrelevant to the definition. You
could, if you had infinite memory, implement it as a huge table that you
would just look up to find out what the output would be for a given input.

Given this definition, there are two possibilities. Either the VoteFair
software implements the VoteFair method, or it does not. By "implements
the method" I mean that running the VoteFair software on some input
election always provides the output that the method would.

Now, I was assuming that the software was an accurate implementation of
the method, so that instead of a method, you actually had a family of
them (parameterized by that constant), none of which would agree with
Kemeny all the time. It seems that you instead are saying that, as you
define the VoteFair ranking, it does agree with Kemeny all the time, but
the software doesn't.

Could you give me a definition of the VoteFair method itself? Without
knowing what bit is the software and what bit is the method, it's
difficult to get further. For instance, it would be hard to know whether
the results that the VoteFair software gives when Kemeny orderings are
tied is a property of the method or of the software, because the
definition of the Kemeny method itself doesn't say what should happen in
such a case.
----
Election-Methods mailing list - see https://electorama.com/em for list info
Reply | Threaded
Open this post in threaded view
|

Re: [EM] VoteFair Ranking software version 6.0 in C++ with MIT license

VoteFair-2
On 1/22/2020 3:49 PM, Kristofer Munsterhjelm wrote:
 > Given this definition, there are two possibilities. Either the VoteFair
 > software implements the VoteFair method, or it does not. ...

If the default settings are used, and if the number of candidates is no
more than the check-all-sequence-scores setting (which I think is 6),
then the software implements the VoteFair popularity ranking method,
which yields the same results as the Condorcet-Kemeny method.

In other words you are not allowing for a third possibility, which -- of
course -- depends on the number of candidates.

 > ... It seems that you instead are saying that, as you
 > define the VoteFair ranking, it does agree with Kemeny all the time, but
 > the software doesn't.

Yes, this is the third possibility. The software CAN always yield the
same results -- simply by limiting the number of candidates it can
handle (or, up to a limit based on computing power, increasing the
number of candidates checked by the full/slow method).

 > Could you give me a definition of the VoteFair method itself? Without
 > knowing what bit is the software and what bit is the method, it's
 > difficult to get further. For instance, it would be hard to know whether
 > the results that the VoteFair software gives when Kemeny orderings are
 > tied is a property of the method or of the software, because the
 > definition of the Kemeny method itself doesn't say what should happen in
 > such a case.

Originally I wrote the definition of VoteFair popularity ranking (which
at that time I called VoteFair ranking) on Wikipedia, and Markus Schulze
re-named the article to the "Kemeny-Young method" claiming that the two
methods are the same. So the definition there matches VoteFair
popularity ranking because I wrote it.

Yes, the issue of how to handle Condorcet-cycle ties is not stated by
John Kemeny (as far as I know), and I have not explicitly stated how
such ties are handled in VoteFair popularity ranking. If I did make such
a statement then you or others might argue that the two methods are not
the same.

The software does attempt to handle ties, either to use a tie-breaking
criteria or to declare an official tie. In the latter case, the VoteFair
approach presumably matches the Kemeny approach. In the former case the
candidates in the cycle can be re-sequenced and the sequence sums yield
the same highest sum, yet SOME of those "tied" sequences can yield
different sums. If this seems complex it's because it IS complex. It
took lots of work to get the software to handle these and other "edge"
cases. This approach is unlike what many software developers do, which
is to dismiss edge cases with some kind of error message. In other
words, my software looks deeper into the "tie."

So, keeping in mind that neither John Kemeny nor I have officially
stated how ties (in the sequence sums) are to be handled, both methods
are mathematically equivalent.

I avoid using the word "same" because John Kemeny defines the sequence
score in terms of counting votes that OPPOSE a candidate (and finding
the minimum score), whereas VoteFair popularity ranking (which I came up
with without knowing about the Kemeny method) defines the sequence score
in terms of counting votes that SUPPORT a candidate (and finding the
maximum score).

If this still isn't clear, please keep asking for clarifications. Thank
you for seeking to understand this method.

Richard Fobes


On 1/22/2020 3:49 PM, Kristofer Munsterhjelm wrote:

> On 12/01/2020 05.34, VoteFair wrote:
>> Kristofer ~
>>
>> It occurs to me that you might be thinking that the VoteFair Ranking
>> SOFTWARE provides the definition of VoteFair popularity ranking.  That
>> is not the case.
>>
>> VoteFair popularity ranking is defined in two of my books -- "The
>> Creative Problem Solver's Toolbox" and "Ending The Hidden Unfairness In
>> U.S. Elections" -- and in the Condorcet-Kemeny ("Kemeny-Young")
>> Wikipedia article.
>>
>> Those definitions should make it clear to anyone who understands
>> mathematics that the difference between VoteFair popularity ranking and
>> the method John Kemeny published is that Kemeny counts opposition and
>> VoteFair counts support, and that otherwise the two methods are
>> mathematically equivalent.
>
> Let me try to restate that to see if I got it right, because it seems
> like an odd distinction.
>
> A ranked voting method is a function that takes, as an input, a set of
> ballots, and produces either a social ranking or a winner set (depending
> on what definition you agree with).
>
> What happens inside the function is irrelevant to the definition. You
> could, if you had infinite memory, implement it as a huge table that you
> would just look up to find out what the output would be for a given input.
>
> Given this definition, there are two possibilities. Either the VoteFair
> software implements the VoteFair method, or it does not. By "implements
> the method" I mean that running the VoteFair software on some input
> election always provides the output that the method would.
>
> Now, I was assuming that the software was an accurate implementation of
> the method, so that instead of a method, you actually had a family of
> them (parameterized by that constant), none of which would agree with
> Kemeny all the time. It seems that you instead are saying that, as you
> define the VoteFair ranking, it does agree with Kemeny all the time, but
> the software doesn't.
>
> Could you give me a definition of the VoteFair method itself? Without
> knowing what bit is the software and what bit is the method, it's
> difficult to get further. For instance, it would be hard to know whether
> the results that the VoteFair software gives when Kemeny orderings are
> tied is a property of the method or of the software, because the
> definition of the Kemeny method itself doesn't say what should happen in
> such a case.
>
----
Election-Methods mailing list - see https://electorama.com/em for list info
Reply | Threaded
Open this post in threaded view
|

Re: [EM] VoteFair Ranking software version 6.0 in C++ with MIT license

Kristofer Munsterhjelm-3
I hope this reply isn't too late, so here we go:

On 23/01/2020 18.53, VoteFair wrote:

> On 1/22/2020 3:49 PM, Kristofer Munsterhjelm wrote:
>> Given this definition, there are two possibilities. Either the VoteFair
>> software implements the VoteFair method, or it does not. ...
>
> If the default settings are used, and if the number of candidates is no
> more than the check-all-sequence-scores setting (which I think is 6),
> then the software implements the VoteFair popularity ranking method,
> which yields the same results as the Condorcet-Kemeny method.
>
> In other words you are not allowing for a third possibility, which -- of
> course -- depends on the number of candidates.
>
>> ... It seems that you instead are saying that, as you
>> define the VoteFair ranking, it does agree with Kemeny all the time, but
>> the software doesn't.
>
> Yes, this is the third possibility. The software CAN always yield the
> same results -- simply by limiting the number of candidates it can
> handle (or, up to a limit based on computing power, increasing the
> number of candidates checked by the full/slow method).
>
>> Could you give me a definition of the VoteFair method itself? Without
>> knowing what bit is the software and what bit is the method, it's
>> difficult to get further. For instance, it would be hard to know whether
>> the results that the VoteFair software gives when Kemeny orderings are
>> tied is a property of the method or of the software, because the
>> definition of the Kemeny method itself doesn't say what should happen in
>> such a case.
>
> Originally I wrote the definition of VoteFair popularity ranking (which
> at that time I called VoteFair ranking) on Wikipedia, and Markus Schulze
> re-named the article to the "Kemeny-Young method" claiming that the two
> methods are the same. So the definition there matches VoteFair
> popularity ranking because I wrote it.

Alright, that enabled me to track down the original definition by
looking at the Wikipedia history archive. I've quoted it below.

> VoteFair ranking shall be done using software (such as accessible at
> www.VoteFair.org) that performs the following calculations. The
> preferences indicated in the ballots are counted to produce a tally
> table in which all the possible pairs of candidates are listed, one
> number for each pair indicates the number of voters who prefer one
> candidate in the pair over the other candidate in the pair, another
> number for each pair indicates the number of voters who have the
> opposite preference for these two candidates, and a third number for
> each pair indicates the number of voters who express no preference
> between the two candidates. Using a computer, each possible sequence of
> candidates is considered, where a sequence consists of one of the
> candidates being regarded as the most popular candidate, another
> candidate being regarded as the second-most popular candidate, and so
> on. For each such sequence the numbers in the tally table that apply to
> that sequence are added together to produce a sequence score for this
> sequence. The sequence that has the highest sequence score indicates the
> overall order of preference for the candidates. If there is more than
> one sequence that has the same highest score, the sequences with this
> score shall be analyzed to identify one or more ties at one or more
> preference levels.

I agree that the definition above describes the Kemeny-Young method. I'd
even say it's the same (not just mathematically equivalent) even though
the mechanics are different:

There exist implementations of Kemeny-Young that don't rely on
enumerating all combinations and then choosing the best one. For
instance, integer programming implementations scale much better while
producing the optimal result. If the implementation mattered as to
whether one method is the same as another, then these implementations
would not be of the Kemeny-Young method because they don't use
exhaustive enumerations. But they are generally considered to be.

> Yes, the issue of how to handle Condorcet-cycle ties is not stated by
> John Kemeny (as far as I know), and I have not explicitly stated how
> such ties are handled in VoteFair popularity ranking. If I did make such
> a statement then you or others might argue that the two methods are not
> the same.
>
> The software does attempt to handle ties, either to use a tie-breaking
> criteria or to declare an official tie. In the latter case, the VoteFair
> approach presumably matches the Kemeny approach. In the former case the
> candidates in the cycle can be re-sequenced and the sequence sums yield
> the same highest sum, yet SOME of those "tied" sequences can yield
> different sums. If this seems complex it's because it IS complex. It
> took lots of work to get the software to handle these and other "edge"
> cases. This approach is unlike what many software developers do, which
> is to dismiss edge cases with some kind of error message. In other
> words, my software looks deeper into the "tie."
>
> So, keeping in mind that neither John Kemeny nor I have officially
> stated how ties (in the sequence sums) are to be handled, both methods
> are mathematically equivalent.

So it seems the situation is different than I thought. Instead of the
method not being the same as Kemeny, the case is more that the software
doesn't implement the method, but rather something that approximates
(and differs from) it.

As I've said before, I consider a method to be algorithm-agnostic.
Schulze implemented by the beatpath mechanism is the same method as
Schulze implemented by the cloneproof Schwartz dropping mechanism.
Furthermore, I think that a method is a function that takes an election
as its input and then provides either a winner set or a ranking (order
of finish).

If a method uses some kind of auxiliary input parameter, then the method
with one value of that auxiliary input parameter is not the same as the
method with another value of that parameter, unless it happens to be
inconsequential for the outcome. E.g. STV with a Droop quota is not the
same method as STV with the Hare method.

A software implementation is "of the method" or "implements the method"
if it agrees with the method for every input that it returns an output for.

Since the VoteFair method definition is the same (or equivalent) to the
Kemeny-Young method definition, the methods are the same. But the
software doesn't implement that method[1]. Instead, it implements a
family of approximations determined by the parameter. Each approximation
agrees with the full method if the number of candidates is below a
certain number, and returns undefined results above this number.

But the software doesn't give any impression that this is what it does.
Instead, from the documentation, one would get the impression that it
implements *the* Kemeny method (for any number of candidates). E.g. from
the VoteFair::VoteFairRanking Perl module documentation:

> This module calculates VoteFair Ranking results. The portions of
> VoteFair Ranking implemented here are:
>
> - VoteFair popularity ranking. This voting method calculates the full
> popularity ranking of all candidates (or choices in the case of a
> survey) from most popular and second-most popular down to least
> popular. It uses the preference information collected on 1-2-3
> ballots (or any equivalent way of expressing "ranked" preferences).
> When a single position is being filled, the most popular candidate is
> declared the winner. This calculation method is mathematically
> equivalent to the Condorcet-Kemeny election method.

and further,

> VoteFair popularity ranking is described in Wikipedia, and is
> mathematically equivalent to the Condorcet-Kemeny method. The following
> comments explain the algorithm used here, which quickly calculates the
> ranking results. This explanation is important because some academic
> sources claim that the computations cannot be done quickly if there is a
> large number of choices being ranked.

> Calculation time:
>
> The full algorithm used to calculate VoteFair popularity ranking
> results has a calculation time that is less than or equal to the
> following polynomial function:
>
> T = A + ( B * N ) + ( C * ( N * N ) )
>
> where T is the calculation time, N is the number of choices, and A
> and B and C are constants. (In mathematical notation, N * N would be
> written as N squared.) This function includes the calculation time
> required for the Choice-Specific Pairwise-Score (CSPS) pre-sort
> calculations.
>
> This time contrasts with the slow execution times of the "NP-hard"
> approach, in which every sequence score is calculated in order to find
> the sequence with the highest score. If every sequence score were
> calculated (from scratch), the calculation time would be proportional to:
>
> N! * N * N
>
> where N is the number of choices, N! is N factorial (2 * 3 * 4 * ...
> * N), and N * N equals N squared. Note that N factorial equals the number
> of possible sequences, and N squared times one-half approximately equals
> the number of pairwise counts that are added to calculate each sequence
> score.
>
> This clarification about calculation time is included because there
> is an academically common -- yet mistaken -- belief that calculating the
> "Condorcet-Kemeny method" is "NP-hard" and cannot be calculated in a
> time that is proportional to a polynomial function of N (the number of
> choices).

This description gives the impression that the software calculates the
VoteFair ranking (period, i.e. no approximations), that it does so in
polynomial time, and that it is possible to determine the Kemeny-Young
winner in polynomial time. But what it does is calculate an
approximation, which is how it can manage to do so in polytime without
proving P=NP.

Similarly, in the VoteFair C++ README.txt file:

> //  * VoteFair popularity ranking.  This voting method calculates the
> //  full popularity ranking of all candidates (or choices in the case
> //  of a survey) from most popular and second-most popular down to
> //  least popular.  It uses the preference information collected on
> //  1-2-3 ballots (or any equivalent way of expressing "ranked"
> //  preferences).  When a single position is being filled, the most
> //  popular candidate is declared the winner.  This calculation method
> //  is mathematically equivalent to the Condorcet-Kemeny election
> //  method.

which gives the impression that again, the software calculates the
method (although at least it doesn't mention being able to circumvent
NP-hardness :-).

Now, if I'm generous, I could say that what I understand you to be
saying is that when you say "this voting method calculates...", you're
talking about the software, but when you're saying "this calculation
method is mathematically equivalent...", you're talking about the
method. But it's easy to draw the wrong conclusion, as I did earlier.
The documentation could thus use being more clear about what the
software implements -- or the software could be rewritten so that it
implements the actual VoteFair (or Kemeny) method.

And again, I would suggest the latter as a good strategy. Instead of
producing correct results up to a certain number of candidates and then
producing undefined results - that it produces the correct results in
potentially exponential time, and that it just quits with a timeout if
it takes too long a time.

That way, it's immediately obvious to the user whether the method
produces a correct result (namely, that it'd do so when it does produce
a result at all). Instead of producing a result in managable time with
some probability (which you argue is low) of producing something
incorrect, it produces a result in managable time with similar (high if
you're right) probability of not getting a timeout.

In addition, if you use a reasonably good integer programming solver to
do so, you can quickly solve cases that the current software can't, e.g.
1000 candidates but no cycles at all.

[1] Strictly speaking, it implements a generalization of that method,
but I don't want to make things more confusing than they need to be; I'd
rather deal with one point at a time.
----
Election-Methods mailing list - see https://electorama.com/em for list info
Reply | Threaded
Open this post in threaded view
|

Re: [EM] VoteFair Ranking software version 6.0 in C++ with MIT license

VoteFair-2
On 2/25/2020 3:17 PM, Kristofer Munsterhjelm wrote:
 > So it seems the situation is different than I thought. Instead of the
 > method not being the same as Kemeny, the case is more that the software
 > doesn't implement the method, but rather something that approximates
 > (and differs from) it.
 > ...
 > ... the software doesn't give any impression that this is what it does.
 > Instead, from the documentation, one would get the impression that it
 > implements *the* Kemeny method (for any number of candidates). ...
 > ...
 > The documentation could thus use being more clear about what the
 > software implements -- or the software could be rewritten so that it
 > implements the actual VoteFair (or Kemeny) method.

I agree that the software deserves better documentation, so I have added
the following clarification to the VoteFair ranking software (both in
README.txt and votefair_ranking.cpp).

//  Clarification: For reasons of computation time, there are some
//  very rare cases for which this software can produce results that
//  differ from full VoteFair popularity ranking results.
//  These exceptions involve more choices (candidates) than the
//  value of the constant named "global_check_all_scores_choice_limit",
//  which currently has a default value of 6.
//  In those rare cases, the software estimates which top (6) choices
//  are the most popular, and then does the full VoteFair popularity
//  ranking to determine which of those top choices is most popular.
//  The rare cases in which the estimation method produces different
//  results involve one or more rock-paper-scissors-like cycles
//  that include the top choices, which is a rare combination.
//  As an analogy, those cases are like finding the highest
//  sand dune in a desert, whereas most cases are like finding
//  the highest mountain peak in a mountain range.
//  As the number of ballots increases (such as beyond 50 ballots),
//  the likelihood of such cycles greatly decreases.
//  If this difference is important, the value of the constant
//  "global_check_all_scores_choice_limit" can be increased, but
//  that change will dramatically increase calculation time,
//  to the point of requiring years of computation time for values
//  even as small as 20.  Therefore, values larger than 12 are not
//  recommended.

Kristofer, thank you for taking the time to understand VoteFair Ranking
and the VoteFair ranking software, and for calling attention to this
need for better software documentation.

Richard Fobes


On 2/25/2020 3:17 PM, Kristofer Munsterhjelm wrote:

> I hope this reply isn't too late, so here we go:
>
> On 23/01/2020 18.53, VoteFair wrote:
>> On 1/22/2020 3:49 PM, Kristofer Munsterhjelm wrote:
>>> Given this definition, there are two possibilities. Either the VoteFair
>>> software implements the VoteFair method, or it does not. ...
>>
>> If the default settings are used, and if the number of candidates is no
>> more than the check-all-sequence-scores setting (which I think is 6),
>> then the software implements the VoteFair popularity ranking method,
>> which yields the same results as the Condorcet-Kemeny method.
>>
>> In other words you are not allowing for a third possibility, which -- of
>> course -- depends on the number of candidates.
>>
>>> ... It seems that you instead are saying that, as you
>>> define the VoteFair ranking, it does agree with Kemeny all the time, but
>>> the software doesn't.
>>
>> Yes, this is the third possibility. The software CAN always yield the
>> same results -- simply by limiting the number of candidates it can
>> handle (or, up to a limit based on computing power, increasing the
>> number of candidates checked by the full/slow method).
>>
>>> Could you give me a definition of the VoteFair method itself? Without
>>> knowing what bit is the software and what bit is the method, it's
>>> difficult to get further. For instance, it would be hard to know whether
>>> the results that the VoteFair software gives when Kemeny orderings are
>>> tied is a property of the method or of the software, because the
>>> definition of the Kemeny method itself doesn't say what should happen in
>>> such a case.
>>
>> Originally I wrote the definition of VoteFair popularity ranking (which
>> at that time I called VoteFair ranking) on Wikipedia, and Markus Schulze
>> re-named the article to the "Kemeny-Young method" claiming that the two
>> methods are the same. So the definition there matches VoteFair
>> popularity ranking because I wrote it.
>
> Alright, that enabled me to track down the original definition by
> looking at the Wikipedia history archive. I've quoted it below.
>
>> VoteFair ranking shall be done using software (such as accessible at
>> www.VoteFair.org) that performs the following calculations. The
>> preferences indicated in the ballots are counted to produce a tally
>> table in which all the possible pairs of candidates are listed, one
>> number for each pair indicates the number of voters who prefer one
>> candidate in the pair over the other candidate in the pair, another
>> number for each pair indicates the number of voters who have the
>> opposite preference for these two candidates, and a third number for
>> each pair indicates the number of voters who express no preference
>> between the two candidates. Using a computer, each possible sequence of
>> candidates is considered, where a sequence consists of one of the
>> candidates being regarded as the most popular candidate, another
>> candidate being regarded as the second-most popular candidate, and so
>> on. For each such sequence the numbers in the tally table that apply to
>> that sequence are added together to produce a sequence score for this
>> sequence. The sequence that has the highest sequence score indicates the
>> overall order of preference for the candidates. If there is more than
>> one sequence that has the same highest score, the sequences with this
>> score shall be analyzed to identify one or more ties at one or more
>> preference levels.
>
> I agree that the definition above describes the Kemeny-Young method. I'd
> even say it's the same (not just mathematically equivalent) even though
> the mechanics are different:
>
> There exist implementations of Kemeny-Young that don't rely on
> enumerating all combinations and then choosing the best one. For
> instance, integer programming implementations scale much better while
> producing the optimal result. If the implementation mattered as to
> whether one method is the same as another, then these implementations
> would not be of the Kemeny-Young method because they don't use
> exhaustive enumerations. But they are generally considered to be.
>
>> Yes, the issue of how to handle Condorcet-cycle ties is not stated by
>> John Kemeny (as far as I know), and I have not explicitly stated how
>> such ties are handled in VoteFair popularity ranking. If I did make such
>> a statement then you or others might argue that the two methods are not
>> the same.
>>
>> The software does attempt to handle ties, either to use a tie-breaking
>> criteria or to declare an official tie. In the latter case, the VoteFair
>> approach presumably matches the Kemeny approach. In the former case the
>> candidates in the cycle can be re-sequenced and the sequence sums yield
>> the same highest sum, yet SOME of those "tied" sequences can yield
>> different sums. If this seems complex it's because it IS complex. It
>> took lots of work to get the software to handle these and other "edge"
>> cases. This approach is unlike what many software developers do, which
>> is to dismiss edge cases with some kind of error message. In other
>> words, my software looks deeper into the "tie."
>>
>> So, keeping in mind that neither John Kemeny nor I have officially
>> stated how ties (in the sequence sums) are to be handled, both methods
>> are mathematically equivalent.
>
> So it seems the situation is different than I thought. Instead of the
> method not being the same as Kemeny, the case is more that the software
> doesn't implement the method, but rather something that approximates
> (and differs from) it.
>
> As I've said before, I consider a method to be algorithm-agnostic.
> Schulze implemented by the beatpath mechanism is the same method as
> Schulze implemented by the cloneproof Schwartz dropping mechanism.
> Furthermore, I think that a method is a function that takes an election
> as its input and then provides either a winner set or a ranking (order
> of finish).
>
> If a method uses some kind of auxiliary input parameter, then the method
> with one value of that auxiliary input parameter is not the same as the
> method with another value of that parameter, unless it happens to be
> inconsequential for the outcome. E.g. STV with a Droop quota is not the
> same method as STV with the Hare method.
>
> A software implementation is "of the method" or "implements the method"
> if it agrees with the method for every input that it returns an output for.
>
> Since the VoteFair method definition is the same (or equivalent) to the
> Kemeny-Young method definition, the methods are the same. But the
> software doesn't implement that method[1]. Instead, it implements a
> family of approximations determined by the parameter. Each approximation
> agrees with the full method if the number of candidates is below a
> certain number, and returns undefined results above this number.
>
> But the software doesn't give any impression that this is what it does.
> Instead, from the documentation, one would get the impression that it
> implements *the* Kemeny method (for any number of candidates). E.g. from
> the VoteFair::VoteFairRanking Perl module documentation:
>
>> This module calculates VoteFair Ranking results. The portions of
>> VoteFair Ranking implemented here are:
>>
>> - VoteFair popularity ranking. This voting method calculates the full
>> popularity ranking of all candidates (or choices in the case of a
>> survey) from most popular and second-most popular down to least
>> popular. It uses the preference information collected on 1-2-3
>> ballots (or any equivalent way of expressing "ranked" preferences).
>> When a single position is being filled, the most popular candidate is
>> declared the winner. This calculation method is mathematically
>> equivalent to the Condorcet-Kemeny election method.
>
> and further,
>
>> VoteFair popularity ranking is described in Wikipedia, and is
>> mathematically equivalent to the Condorcet-Kemeny method. The following
>> comments explain the algorithm used here, which quickly calculates the
>> ranking results. This explanation is important because some academic
>> sources claim that the computations cannot be done quickly if there is a
>> large number of choices being ranked.
>
>> Calculation time:
>>
>> The full algorithm used to calculate VoteFair popularity ranking
>> results has a calculation time that is less than or equal to the
>> following polynomial function:
>>
>> T = A + ( B * N ) + ( C * ( N * N ) )
>>
>> where T is the calculation time, N is the number of choices, and A
>> and B and C are constants. (In mathematical notation, N * N would be
>> written as N squared.) This function includes the calculation time
>> required for the Choice-Specific Pairwise-Score (CSPS) pre-sort
>> calculations.
>>
>> This time contrasts with the slow execution times of the "NP-hard"
>> approach, in which every sequence score is calculated in order to find
>> the sequence with the highest score. If every sequence score were
>> calculated (from scratch), the calculation time would be proportional to:
>>
>> N! * N * N
>>
>> where N is the number of choices, N! is N factorial (2 * 3 * 4 * ...
>> * N), and N * N equals N squared. Note that N factorial equals the number
>> of possible sequences, and N squared times one-half approximately equals
>> the number of pairwise counts that are added to calculate each sequence
>> score.
>>
>> This clarification about calculation time is included because there
>> is an academically common -- yet mistaken -- belief that calculating the
>> "Condorcet-Kemeny method" is "NP-hard" and cannot be calculated in a
>> time that is proportional to a polynomial function of N (the number of
>> choices).
>
> This description gives the impression that the software calculates the
> VoteFair ranking (period, i.e. no approximations), that it does so in
> polynomial time, and that it is possible to determine the Kemeny-Young
> winner in polynomial time. But what it does is calculate an
> approximation, which is how it can manage to do so in polytime without
> proving P=NP.
>
> Similarly, in the VoteFair C++ README.txt file:
>
>> //  * VoteFair popularity ranking.  This voting method calculates the
>> //  full popularity ranking of all candidates (or choices in the case
>> //  of a survey) from most popular and second-most popular down to
>> //  least popular.  It uses the preference information collected on
>> //  1-2-3 ballots (or any equivalent way of expressing "ranked"
>> //  preferences).  When a single position is being filled, the most
>> //  popular candidate is declared the winner.  This calculation method
>> //  is mathematically equivalent to the Condorcet-Kemeny election
>> //  method.
>
> which gives the impression that again, the software calculates the
> method (although at least it doesn't mention being able to circumvent
> NP-hardness :-).
>
> Now, if I'm generous, I could say that what I understand you to be
> saying is that when you say "this voting method calculates...", you're
> talking about the software, but when you're saying "this calculation
> method is mathematically equivalent...", you're talking about the
> method. But it's easy to draw the wrong conclusion, as I did earlier.
> The documentation could thus use being more clear about what the
> software implements -- or the software could be rewritten so that it
> implements the actual VoteFair (or Kemeny) method.
>
> And again, I would suggest the latter as a good strategy. Instead of
> producing correct results up to a certain number of candidates and then
> producing undefined results - that it produces the correct results in
> potentially exponential time, and that it just quits with a timeout if
> it takes too long a time.
>
> That way, it's immediately obvious to the user whether the method
> produces a correct result (namely, that it'd do so when it does produce
> a result at all). Instead of producing a result in managable time with
> some probability (which you argue is low) of producing something
> incorrect, it produces a result in managable time with similar (high if
> you're right) probability of not getting a timeout.
>
> In addition, if you use a reasonably good integer programming solver to
> do so, you can quickly solve cases that the current software can't, e.g.
> 1000 candidates but no cycles at all.
>
> [1] Strictly speaking, it implements a generalization of that method,
> but I don't want to make things more confusing than they need to be; I'd
> rather deal with one point at a time.
>
----
Election-Methods mailing list - see https://electorama.com/em for list info