Recently I "ported" the Perl-language VoteFair Ranking software to the
C++ language. The Perl version was released under the Perl Artistic License. The C++ version is available under the MIT license. Here is the link to the new C++ version: https://github.com/cpsolver/VoteFair-ranking-cpp For those who don't already know about VoteFair ranking, here is a brief description (copied from the announcement on Reddit): VoteFair Ranking uses ranked ballots and provides two kinds of PR (Proportional Representation). In addition to allocating some statewide/nationwide seats based on political-party preferences, each double-sized-plus district has two seats, and the second seat is won by the candidate who best represents the voters who are not well-represented by the first-seat winner. An additional fairness feature of VoteFair Ranking is that voters rank the political parties, and the results determine which two or three parties in that district can offer a second candidate in the next election. It also identifies which parties are so unpopular (in that district) that they should not be allowed to offer any candidate in that district. (Remember that vote splitting is what currently limits each political party to a single candidate.) The ballots are very simple. One question asks voters to rank the candidates in that district, and the other question asks voters to rank political parties. The recommended ballot format is paper ballots with ovals that are marked to indicate first choice, second choice, third choice, and so on down to least-favorite choice. The software allows more than one candidate (or party) to be marked at the same preference level. Recently the Perl version was used in the 2019 Canadian Federal Election Anti-Vote-Splitting Poll here: https://www.newsherenow.com/news_here_now_poll_results_canada.html Alas, there were not enough participants to provide meaningful results. VoteFair calculations use pairwise counting to ensure especially fair results. Specifically, VoteFair popularity ranking is mathematically equivalent to the Condorcet-Kemeny method. Although some people dismiss this method as requiring too much computer time when there are lots of candidates, this software is very fast, even when there are 50 choices. The VoteFair.org website has been using earlier versions of VoteFair Ranking software for the last two decades, so the software has been used for many real-life elections and polls in non-governmental organizations throughout the world. Here is what I wrote in response to a question about how the software works: If you're already familiar with "Condorcet" methods, the foundation of the calculations can be thought of this way: When the pairwise (one-on-one) counts are arranged in a table (matrix) the sequence of candidates (from possibly most popular to possibly least popular) is rearranged until the biggest counts are in the upper-right triangular area (of the table/matrix), and the smallest counts are in the lower-left triangular area. Richard Fobes ---- Election-Methods mailing list - see https://electorama.com/em for list info |
On 16/12/2019 07.42, VoteFair wrote:
> VoteFair calculations use pairwise counting to ensure especially fair > results. Specifically, VoteFair popularity ranking is mathematically > equivalent to the Condorcet-Kemeny method. Although some people dismiss > this method as requiring too much computer time when there are lots of > candidates, this software is very fast, even when there are 50 choices. I can't help myself, but must point out again that I think it's more accurate to say (if this is true), that VoteFair popularity ranking is an approximation to Kemeny that becomes Kemeny in the limit as a certain parameter approaches infinity. I don't remember the name of the parameter; you probably do. But I remember I found examples where Kemeny and your method disagreed for elections with very large Smith sets. You said that you could always increase the value of this parameter to get the correct result. Unfortunately, letting that parameter approach infinity destroys VoteFair's polynomial runtime. So VoteFair does not prove P=NP :-) ---- Election-Methods mailing list - see https://electorama.com/em for list info |
On 12/18/2019 2:04 AM, Kristofer Munsterhjelm wrote:
> I can't help myself, but must point out again that I think it's more > accurate to say (if this is true), that VoteFair popularity ranking is > an approximation to Kemeny that becomes Kemeny in the limit as a certain > parameter approaches infinity. Yes, when there are a large number of candidates, such as 50 candidates as an example, estimations are done to yield a full ranking, and then the top 6 or 7 or 8 (this is adjustable) most-popular candidates are ranked using the exact Condorcet-Kemeny method to determine the ranking for those top candidates. Yes, you came up with an example of a carefully constructed case that involved something like 50 candidates -- and a small number of carefully balanced groups of voters. But that example illustrates my point that such a case would not occur among voters who are sincerely marking their ballots independently, without coordination with each other. In elections, where just a single winner is needed, the only way VoteFair ranking and the full Condorcet-Kemeny calculations can identify different winners is if the case involves numerous almost-equally popular candidates (at the top, not the middle or bottom). (Specifically there needs to be a carefully constructed Condorcet cycle that involves a large number of candidates and a small number of carefully balanced groups of voters.) > Unfortunately, letting that parameter approach infinity destroys > VoteFair's polynomial runtime. So VoteFair does not prove P=NP :-) Yes you are correct that increasing the full calculation parameter further (such as even to 20) would result in a very long computation time, which is of course not a "polynomial runtime." Yet for election purposes, as opposed to mathematical-proof purposes, I cannot imagine needing to increase the full calculation parameter to check more than a few top candidates. (If that affects the outcome, then just a few spoiled ballots also would be just as likely to change the outcome.) Richard Fobes On 12/18/2019 2:04 AM, Kristofer Munsterhjelm wrote: > On 16/12/2019 07.42, VoteFair wrote: > >> VoteFair calculations use pairwise counting to ensure especially fair >> results. Specifically, VoteFair popularity ranking is mathematically >> equivalent to the Condorcet-Kemeny method. Although some people dismiss >> this method as requiring too much computer time when there are lots of >> candidates, this software is very fast, even when there are 50 choices. > > I can't help myself, but must point out again that I think it's more > accurate to say (if this is true), that VoteFair popularity ranking is > an approximation to Kemeny that becomes Kemeny in the limit as a certain > parameter approaches infinity. > > I don't remember the name of the parameter; you probably do. But I > remember I found examples where Kemeny and your method disagreed for > elections with very large Smith sets. You said that you could always > increase the value of this parameter to get the correct result. > > Unfortunately, letting that parameter approach infinity destroys > VoteFair's polynomial runtime. So VoteFair does not prove P=NP :-) > Election-Methods mailing list - see https://electorama.com/em for list info |
On 19/12/2019 00.26, VoteFair wrote:
> On 12/18/2019 2:04 AM, Kristofer Munsterhjelm wrote: >> I can't help myself, but must point out again that I think it's more >> accurate to say (if this is true), that VoteFair popularity ranking is >> an approximation to Kemeny that becomes Kemeny in the limit as a certain >> parameter approaches infinity. > > Yes, when there are a large number of candidates, such as 50 candidates > as an example, estimations are done to yield a full ranking, and then > the top 6 or 7 or 8 (this is adjustable) most-popular candidates are > ranked using the exact Condorcet-Kemeny method to determine the ranking > for those top candidates. > > Yes, you came up with an example of a carefully constructed case that > involved something like 50 candidates -- and a small number of carefully > balanced groups of voters. But that example illustrates my point that > such a case would not occur among voters who are sincerely marking their > ballots independently, without coordination with each other. I couldn't seem to find the example, so I tried to create a new one. My tools are a bit old and hacky, so I may have got it wrong, but I think this should work: 1: A>D>F>E>C>B>G 1: C>A>F>B>E>G>D 1: C>B>A>F>D>G>E 1: C>G>E>A>F>B>D 1: D>F>C>E>B>A>G 1: E>C>A>D>G>F>B 1: F>E>G>C>D>B>A 1: F>G>B>E>A>D>C 1: G>A>F>C>D>B>E Pairwise matrix: - 5 3 7 4 6 5 4 - 1 4 4 1 4 6 8 - 6 5 4 6 2 5 3 - 4 3 4 5 5 4 5 - 2 5 3 8 5 6 7 - 6 4 5 3 5 4 3 - VoteFair ballot format: request-no-rep request-no-party case 1 q 1 choices 7 x 1 q 1 3 2 1 6 4 7 5 x 1 q 1 7 1 6 3 4 2 5 x 1 q 1 5 3 1 4 7 6 2 x 1 q 1 6 5 7 3 4 2 1 x 1 q 1 4 6 3 5 2 1 7 x 1 q 1 1 4 6 5 3 2 7 x 1 q 1 3 7 5 1 6 2 4 x 1 q 1 3 1 6 2 5 7 4 x 1 q 1 6 7 2 5 1 4 3 There are four candidates in the Smith set: A, C, E, and F. The VoteFair ranking is F>C>E>A>G>D>B with Kemeny score 118, but C>A>F>E>G>D>B has Kemeny score 119 and is the unique maximum ranking. If that's right, there's no need for 50 candidates to make a counterexample. The example above does have a carefully balanced Condorcet cycle, however (moreso because it is optimized to have as few voters as possible). > In elections, where just a single winner is needed, the only way > VoteFair ranking and the full Condorcet-Kemeny calculations can identify > different winners is if the case involves numerous almost-equally > popular candidates (at the top, not the middle or bottom). (Specifically > there needs to be a carefully constructed Condorcet cycle that involves > a large number of candidates and a small number of carefully balanced > groups of voters.) > >> Unfortunately, letting that parameter approach infinity destroys >> VoteFair's polynomial runtime. So VoteFair does not prove P=NP :-) > > Yes you are correct that increasing the full calculation parameter > further (such as even to 20) would result in a very long computation > time, which is of course not a "polynomial runtime." > > Yet for election purposes, as opposed to mathematical-proof purposes, I > cannot imagine needing to increase the full calculation parameter to > check more than a few top candidates. (If that affects the outcome, > then just a few spoiled ballots also would be just as likely to change > the outcome.) My point is that a phrase like "mathematically equivalent" is about mathematical-proof purposes, not practical election purposes. So I wouldn't say "mathematically equivalent" would be accurate for an approximation, any more than say f(x) = sum k = 0...100: x^k/k! if x is a prime integer > 10^9 sum k = 0...infinity x^k/k! otherwise is mathematically equivalent to e^x (where you always have to take the sum as k goes to infinity, not stop at 100). It may be pretty close for a typical number (as integers have measure zero), but it's not the same function. If functions are mathematically equivalent, then proofs about the former should also apply to the latter. But properties about e^x won't hold of f above, and properties of Kemeny won't hold of VoteFair over the whole domain. E.g. Kemeny passes LIIA but VoteFair doesn't (as eliminating the loser of the example election above would probably result in VoteFair returning the optimal Kemeny result). Mathematical equivalence might be too strict, but it is what it is. Or that's what I'd think the term would mean. You could save mathematical equivalence by restricting the domain to e.g. elections with a Smith set no larger than three candidates, prove that the sort-based VoteFair mechanism agrees with Kemeny within this domain with the usual parameters, and then say that VoteFair is mathematically equivalent on that domain. That would be a qualified mathematical equivalence, corresponding to your "real election purposes". ---- Election-Methods mailing list - see https://electorama.com/em for list info |
On 1/2/2020 3:00 PM, Kristofer Munsterhjelm wrote:
> ... > The VoteFair ranking is F>C>E>A>G>D>B with Kemeny score 118, but > C>A>F>E>G>D>B has Kemeny score 119 and is the unique maximum ranking. Yes, the default value for variable "global_check_all_scores_choice_limit" is 6 and it gives a result that differs from the (full) Condorcet-Kemeny method. Increasing this variable to 7 causes the full Kemeny calculation process to be done across all 7 choices, and it gets the same result as doing the full Condorcet-Kemeny calculation method. This is consistent with what I said earlier. Of course the calculation time for the case in which there is a Condorcet (rock-paper-scissors) cycle across 50 choices is going to be way too time-consuming for this software. As I've also said before, the odds of such a cycle (involving either 7 or 50 choices) happening in a real election is lower than the odds of a conventional tie. Either kind of tie can be resolved using a judicial decision -- like the Bush versus Gore decision from the US Supreme Court, which was not even an exact tie. The point remains that the VoteFair ranking software matches the Condorcet-Kemeny calculations for as many choices as you choose. The choice is determined by the value of the variable "global_check_all_scores_choice_limit". Thank you for checking. And especially thank you for providing the case in the format that I could directly plug into the software. I have added this case to the GitHub repository. Again, thanks! Richard Fobes On 1/2/2020 3:00 PM, Kristofer Munsterhjelm wrote: > On 19/12/2019 00.26, VoteFair wrote: >> On 12/18/2019 2:04 AM, Kristofer Munsterhjelm wrote: >>> I can't help myself, but must point out again that I think it's more >>> accurate to say (if this is true), that VoteFair popularity ranking is >>> an approximation to Kemeny that becomes Kemeny in the limit as a certain >>> parameter approaches infinity. >> >> Yes, when there are a large number of candidates, such as 50 candidates >> as an example, estimations are done to yield a full ranking, and then >> the top 6 or 7 or 8 (this is adjustable) most-popular candidates are >> ranked using the exact Condorcet-Kemeny method to determine the ranking >> for those top candidates. >> >> Yes, you came up with an example of a carefully constructed case that >> involved something like 50 candidates -- and a small number of carefully >> balanced groups of voters. But that example illustrates my point that >> such a case would not occur among voters who are sincerely marking their >> ballots independently, without coordination with each other. > > I couldn't seem to find the example, so I tried to create a new one. My > tools are a bit old and hacky, so I may have got it wrong, but I think > this should work: > > 1: A>D>F>E>C>B>G > 1: C>A>F>B>E>G>D > 1: C>B>A>F>D>G>E > 1: C>G>E>A>F>B>D > 1: D>F>C>E>B>A>G > 1: E>C>A>D>G>F>B > 1: F>E>G>C>D>B>A > 1: F>G>B>E>A>D>C > 1: G>A>F>C>D>B>E > > Pairwise matrix: > > - 5 3 7 4 6 5 > 4 - 1 4 4 1 4 > 6 8 - 6 5 4 6 > 2 5 3 - 4 3 4 > 5 5 4 5 - 2 5 > 3 8 5 6 7 - 6 > 4 5 3 5 4 3 - > > VoteFair ballot format: > > request-no-rep request-no-party case 1 q 1 choices 7 x 1 q 1 3 2 1 6 > 4 7 5 x 1 q 1 7 1 6 3 4 2 5 x 1 q 1 5 3 1 4 7 6 2 x 1 q 1 6 5 7 3 4 2 > 1 x 1 q 1 4 6 3 5 2 1 7 x 1 q 1 1 4 6 5 3 2 7 x 1 q 1 3 7 5 1 6 2 4 > x 1 q 1 3 1 6 2 5 7 4 x 1 q 1 6 7 2 5 1 4 3 > > There are four candidates in the Smith set: A, C, E, and F. > > The VoteFair ranking is F>C>E>A>G>D>B with Kemeny score 118, but > C>A>F>E>G>D>B has Kemeny score 119 and is the unique maximum ranking. > > If that's right, there's no need for 50 candidates to make a > counterexample. The example above does have a carefully balanced > Condorcet cycle, however (moreso because it is optimized to have as few > voters as possible). > >> In elections, where just a single winner is needed, the only way >> VoteFair ranking and the full Condorcet-Kemeny calculations can identify >> different winners is if the case involves numerous almost-equally >> popular candidates (at the top, not the middle or bottom). (Specifically >> there needs to be a carefully constructed Condorcet cycle that involves >> a large number of candidates and a small number of carefully balanced >> groups of voters.) >> >>> Unfortunately, letting that parameter approach infinity destroys >>> VoteFair's polynomial runtime. So VoteFair does not prove P=NP :-) >> >> Yes you are correct that increasing the full calculation parameter >> further (such as even to 20) would result in a very long computation >> time, which is of course not a "polynomial runtime." >> >> Yet for election purposes, as opposed to mathematical-proof purposes, I >> cannot imagine needing to increase the full calculation parameter to >> check more than a few top candidates. (If that affects the outcome, >> then just a few spoiled ballots also would be just as likely to change >> the outcome.) > > My point is that a phrase like "mathematically equivalent" is about > mathematical-proof purposes, not practical election purposes. So I > wouldn't say "mathematically equivalent" would be accurate for an > approximation, any more than say > > f(x) = sum k = 0...100: x^k/k! if x is a prime integer > 10^9 > sum k = 0...infinity x^k/k! otherwise > > is mathematically equivalent to e^x (where you always have to take the > sum as k goes to infinity, not stop at 100). It may be pretty close for > a typical number (as integers have measure zero), but it's not the same > function. > > If functions are mathematically equivalent, then proofs about the former > should also apply to the latter. But properties about e^x won't hold of > f above, and properties of Kemeny won't hold of VoteFair over the whole > domain. E.g. Kemeny passes LIIA but VoteFair doesn't (as eliminating the > loser of the example election above would probably result in VoteFair > returning the optimal Kemeny result). > > Mathematical equivalence might be too strict, but it is what it is. Or > that's what I'd think the term would mean. > > You could save mathematical equivalence by restricting the domain to > e.g. elections with a Smith set no larger than three candidates, prove > that the sort-based VoteFair mechanism agrees with Kemeny within this > domain with the usual parameters, and then say that VoteFair is > mathematically equivalent on that domain. That would be a qualified > mathematical equivalence, corresponding to your "real election purposes". > Election-Methods mailing list - see https://electorama.com/em for list info |
On 04/01/2020 01.58, VoteFair wrote:
> On 1/2/2020 3:00 PM, Kristofer Munsterhjelm wrote: >> ... >> The VoteFair ranking is F>C>E>A>G>D>B with Kemeny score 118, but >> C>A>F>E>G>D>B has Kemeny score 119 and is the unique maximum ranking. > > Yes, the default value for variable > "global_check_all_scores_choice_limit" is 6 and it gives a result that > differs from the (full) Condorcet-Kemeny method. > > Increasing this variable to 7 causes the full Kemeny calculation process > to be done across all 7 choices, and it gets the same result as doing > the full Condorcet-Kemeny calculation method. > > This is consistent with what I said earlier. > > Of course the calculation time for the case in which there is a > Condorcet (rock-paper-scissors) cycle across 50 choices is going to be > way too time-consuming for this software. > > As I've also said before, the odds of such a cycle (involving either 7 > or 50 choices) happening in a real election is lower than the odds of a > conventional tie. Either kind of tie can be resolved using a judicial > decision -- like the Bush versus Gore decision from the US Supreme > Court, which was not even an exact tie. Do you agree with my point about the use of the term "mathematically equivalent"? If the method appears to be equivalent, that might give the impression that you don't need to care about cycles at all. ---- Election-Methods mailing list - see https://electorama.com/em for list info |
On 1/5/2020 5:48 AM, Kristofer Munsterhjelm wrote:
> Do you agree with my point about the use of the term "mathematically > equivalent"? If the method appears to be equivalent, that might give the > impression that you don't need to care about cycles at all. VoteFair popularity ranking IS mathematically equivalent to the Condorcet-Kemeny method. The difference is that John Kemeny described the version that counts disapprovals (and finds the smallest sequence score), whereas VoteFair popularity ranking counts approvals (and finds the largest sequence score). The VoteFair Ranking software does calculate the full version (of VoteFair popularity ranking) to the extent that the "global_check_all_scores_choice_limit" value is as large as desired. I choose to leave that value at 6, even though it could be set to 20 or higher with no execution errors. Of course in that case it would be helpful to insert code that checks for time delays and shows progress in case a Condorcet cycle is slowing it down. I'm not sure what you mean by: > ... that might give the > impression that you don't need to care about cycles at all. Neither the method nor the software implies that cycles are of no importance. Richard Fobes On 1/5/2020 5:48 AM, Kristofer Munsterhjelm wrote: > On 04/01/2020 01.58, VoteFair wrote: >> On 1/2/2020 3:00 PM, Kristofer Munsterhjelm wrote: >>> ... >>> The VoteFair ranking is F>C>E>A>G>D>B with Kemeny score 118, but >>> C>A>F>E>G>D>B has Kemeny score 119 and is the unique maximum ranking. >> >> Yes, the default value for variable >> "global_check_all_scores_choice_limit" is 6 and it gives a result that >> differs from the (full) Condorcet-Kemeny method. >> >> Increasing this variable to 7 causes the full Kemeny calculation process >> to be done across all 7 choices, and it gets the same result as doing >> the full Condorcet-Kemeny calculation method. >> >> This is consistent with what I said earlier. >> >> Of course the calculation time for the case in which there is a >> Condorcet (rock-paper-scissors) cycle across 50 choices is going to be >> way too time-consuming for this software. >> >> As I've also said before, the odds of such a cycle (involving either 7 >> or 50 choices) happening in a real election is lower than the odds of a >> conventional tie. Either kind of tie can be resolved using a judicial >> decision -- like the Bush versus Gore decision from the US Supreme >> Court, which was not even an exact tie. > > Do you agree with my point about the use of the term "mathematically > equivalent"? If the method appears to be equivalent, that might give the > impression that you don't need to care about cycles at all. > Election-Methods mailing list - see https://electorama.com/em for list info |
On 06/01/2020 06.19, VoteFair wrote:
> On 1/5/2020 5:48 AM, Kristofer Munsterhjelm wrote: >> Do you agree with my point about the use of the term "mathematically >> equivalent"? If the method appears to be equivalent, that might give the >> impression that you don't need to care about cycles at all. > > VoteFair popularity ranking IS mathematically equivalent to the > Condorcet-Kemeny method. The difference is that John Kemeny described > the version that counts disapprovals (and finds the smallest sequence > score), whereas VoteFair popularity ranking counts approvals (and finds > the largest sequence score). > > The VoteFair Ranking software does calculate the full version (of > VoteFair popularity ranking) to the extent that the > "global_check_all_scores_choice_limit" value is as large as desired. > > I choose to leave that value at 6, even though it could be set to 20 or > higher with no execution errors. Of course in that case it would be > helpful to insert code that checks for time delays and shows progress in > case a Condorcet cycle is slowing it down. What you seem to be saying is that the objective function (what you're trying to optimize), when taking into account the direction of the optimization, is the same, so the problem is mathematically equivalent. That's right. But that doesn't suffice to make the *solution function* mathematically equivalent. If it did, then likewise any approximation to e.g. Traveling Salesman would be mathematically equivalent with solving Traveling Salesman itself, just because it counted the total distance the salesman has to travel the same way. VoteFair may *approximate* Kemeny, and it may use an equivalent scoring function, but for any value of the constant, there exist an infinite number of inputs for which the outputs will disagree. Note that this isn't about "likely to disagree". Mathematically, if there's a single difference between two functions' mapping, then the functions are different. This holds regardless of the implementation (because a mathematical mapping is agnostic to how it's implemented). E.g. both Insertion and Merge sort implement the mapping from unsorted lists to stably sorted ones, but the algorithms themselves are very different. > I'm not sure what you mean by: >> ... that might give the >> impression that you don't need to care about cycles at all. > > Neither the method nor the software implies that cycles are of no > importance. Someone who reads "mathematically equivalent" as "proven to give the same results" will think that there are no inputs for which the outputs differ. Thus he may think that there's no need to be on the lookout for strange behavior under certain conditions. I seem to recall that around 2010, Warren thought that you were claiming that VoteFair did exactly reproduce Kemeny results/optima (for every election, i.e. not an approximation). There was then a long thread about NP-hardness and whether or not determining the Kemeny winner is NP-hard (which it is). I *think* you reached the conclusion that VoteFair was an approximation, but I don't clearly remember. The point is, Warren's a mathematician and got confused about what "mathematically equivalent" meant as you used it. I'd imagine a non-mathematician would have even less chance of getting it right. And if he understands "mathematically equivalent" that way, and think VoteFair always returns the Kemeny optimum, then he's not going to think that "oops, I have to be careful about *these* elections", because there's nothing to be careful about. After all, it's equivalent, so shouldn't it return the correct outcome for every election? VoteFair itself doesn't give any indication of when the results are suspect and should be thrown out for a tie instead, either. (I'd suggest replacing the insertion sort code with an integer programming solver (like Math::LP::Solve or Math::GLPK) and setting a timeout. Such a solver either returns the optimum or says "couldn't do that in the time given". It's equivalent over the domain where it returns a result, and outside of that domain, it clearly times out. Or more generally, alter the code so that instead of returning a possibly-right result always in finite time, always returning a right result but sometimes taking a very long time about it. A timeout is easier to diagnose than a result that looks right but isn't. But if you're going to do that, why not use an integer programming solver?) ---- Election-Methods mailing list - see https://electorama.com/em for list info |
Kristofer ~
It occurs to me that you might be thinking that the VoteFair Ranking SOFTWARE provides the definition of VoteFair popularity ranking. That is not the case. VoteFair popularity ranking is defined in two of my books -- "The Creative Problem Solver's Toolbox" and "Ending The Hidden Unfairness In U.S. Elections" -- and in the Condorcet-Kemeny ("Kemeny-Young") Wikipedia article. Those definitions should make it clear to anyone who understands mathematics that the difference between VoteFair popularity ranking and the method John Kemeny published is that Kemeny counts opposition and VoteFair counts support, and that otherwise the two methods are mathematically equivalent. If you think that the VoteFair Ranking SOFTWARE does not correctly calculate VoteFair popularity ranking results, please recognize -- as I have said before -- that you (or anyone) can increase the constant that specifies how many choices are calculated according to the full VoteFair popularity ranking method. As I have said before, I choose a smaller value for that constant because the software is intended for use in real-life voting situations where other complications arise as more important than issues of mathematical "purity." So, to repeat, VoteFair popularity ranking IS mathematically equivalent to the Condorcet-Kemeny method, and the VoteFair Ranking software can calculate either the pure VoteFair popularity ranking results or results that only rarely differ from the pure VoteFair popularity ranking results. The choice is determined by the constant. As another option, if someone prefers that the VoteFair Ranking SOFTWARE is ALWAYS mathematically pure -- according to the VoteFair popularity ranking definition -- and never consumes too much computation time, then they can reduce the limit on the maximum number of choices to something like 12 choices, and also change the constant (referred to above) to 12. Then the software will NEVER depart from VoteFair popularity ranking as defined in my books and in Wikipedia. Is this distinction between method and software clear? If not, please clarify your concern, and specify whether you are referring to the method or the software. Thanks, Richard Fobes On 1/10/2020 3:30 AM, Kristofer Munsterhjelm wrote: > On 06/01/2020 06.19, VoteFair wrote: >> On 1/5/2020 5:48 AM, Kristofer Munsterhjelm wrote: >>> Do you agree with my point about the use of the term "mathematically >>> equivalent"? If the method appears to be equivalent, that might give the >>> impression that you don't need to care about cycles at all. >> >> VoteFair popularity ranking IS mathematically equivalent to the >> Condorcet-Kemeny method. The difference is that John Kemeny described >> the version that counts disapprovals (and finds the smallest sequence >> score), whereas VoteFair popularity ranking counts approvals (and finds >> the largest sequence score). >> >> The VoteFair Ranking software does calculate the full version (of >> VoteFair popularity ranking) to the extent that the >> "global_check_all_scores_choice_limit" value is as large as desired. >> >> I choose to leave that value at 6, even though it could be set to 20 or >> higher with no execution errors. Of course in that case it would be >> helpful to insert code that checks for time delays and shows progress in >> case a Condorcet cycle is slowing it down. > > What you seem to be saying is that the objective function (what you're > trying to optimize), when taking into account the direction of the > optimization, is the same, so the problem is mathematically equivalent. > > That's right. But that doesn't suffice to make the *solution function* > mathematically equivalent. If it did, then likewise any approximation to > e.g. Traveling Salesman would be mathematically equivalent with solving > Traveling Salesman itself, just because it counted the total distance > the salesman has to travel the same way. > > VoteFair may *approximate* Kemeny, and it may use an equivalent scoring > function, but for any value of the constant, there exist an infinite > number of inputs for which the outputs will disagree. > > Note that this isn't about "likely to disagree". Mathematically, if > there's a single difference between two functions' mapping, then the > functions are different. This holds regardless of the implementation > (because a mathematical mapping is agnostic to how it's implemented). > E.g. both Insertion and Merge sort implement the mapping from unsorted > lists to stably sorted ones, but the algorithms themselves are very > different. > >> I'm not sure what you mean by: >>> ... that might give the >>> impression that you don't need to care about cycles at all. >> >> Neither the method nor the software implies that cycles are of no >> importance. > > Someone who reads "mathematically equivalent" as "proven to give the > same results" will think that there are no inputs for which the outputs > differ. Thus he may think that there's no need to be on the lookout for > strange behavior under certain conditions. > > I seem to recall that around 2010, Warren thought that you were claiming > that VoteFair did exactly reproduce Kemeny results/optima (for every > election, i.e. not an approximation). There was then a long thread about > NP-hardness and whether or not determining the Kemeny winner is NP-hard > (which it is). I *think* you reached the conclusion that VoteFair was an > approximation, but I don't clearly remember. > > The point is, Warren's a mathematician and got confused about what > "mathematically equivalent" meant as you used it. I'd imagine a > non-mathematician would have even less chance of getting it right. > > And if he understands "mathematically equivalent" that way, and think > VoteFair always returns the Kemeny optimum, then he's not going to > think that "oops, I have to be careful about *these* elections", because > there's nothing to be careful about. After all, it's equivalent, so > shouldn't it return the correct outcome for every election? VoteFair > itself doesn't give any indication of when the results are suspect and > should be thrown out for a tie instead, either. > > (I'd suggest replacing the insertion sort code with an integer > programming solver (like Math::LP::Solve or Math::GLPK) and setting a > timeout. Such a solver either returns the optimum or says "couldn't do > that in the time given". It's equivalent over the domain where it > returns a result, and outside of that domain, it clearly times out. > > Or more generally, alter the code so that instead of returning a > possibly-right result always in finite time, always returning a right > result but sometimes taking a very long time about it. A timeout is > easier to diagnose than a result that looks right but isn't. But if > you're going to do that, why not use an integer programming solver?) > Election-Methods mailing list - see https://electorama.com/em for list info |
On 12/01/2020 05.34, VoteFair wrote:
> Kristofer ~ > > It occurs to me that you might be thinking that the VoteFair Ranking > SOFTWARE provides the definition of VoteFair popularity ranking. That > is not the case. > > VoteFair popularity ranking is defined in two of my books -- "The > Creative Problem Solver's Toolbox" and "Ending The Hidden Unfairness In > U.S. Elections" -- and in the Condorcet-Kemeny ("Kemeny-Young") > Wikipedia article. > > Those definitions should make it clear to anyone who understands > mathematics that the difference between VoteFair popularity ranking and > the method John Kemeny published is that Kemeny counts opposition and > VoteFair counts support, and that otherwise the two methods are > mathematically equivalent. Let me try to restate that to see if I got it right, because it seems like an odd distinction. A ranked voting method is a function that takes, as an input, a set of ballots, and produces either a social ranking or a winner set (depending on what definition you agree with). What happens inside the function is irrelevant to the definition. You could, if you had infinite memory, implement it as a huge table that you would just look up to find out what the output would be for a given input. Given this definition, there are two possibilities. Either the VoteFair software implements the VoteFair method, or it does not. By "implements the method" I mean that running the VoteFair software on some input election always provides the output that the method would. Now, I was assuming that the software was an accurate implementation of the method, so that instead of a method, you actually had a family of them (parameterized by that constant), none of which would agree with Kemeny all the time. It seems that you instead are saying that, as you define the VoteFair ranking, it does agree with Kemeny all the time, but the software doesn't. Could you give me a definition of the VoteFair method itself? Without knowing what bit is the software and what bit is the method, it's difficult to get further. For instance, it would be hard to know whether the results that the VoteFair software gives when Kemeny orderings are tied is a property of the method or of the software, because the definition of the Kemeny method itself doesn't say what should happen in such a case. ---- Election-Methods mailing list - see https://electorama.com/em for list info |
On 1/22/2020 3:49 PM, Kristofer Munsterhjelm wrote:
> Given this definition, there are two possibilities. Either the VoteFair > software implements the VoteFair method, or it does not. ... If the default settings are used, and if the number of candidates is no more than the check-all-sequence-scores setting (which I think is 6), then the software implements the VoteFair popularity ranking method, which yields the same results as the Condorcet-Kemeny method. In other words you are not allowing for a third possibility, which -- of course -- depends on the number of candidates. > ... It seems that you instead are saying that, as you > define the VoteFair ranking, it does agree with Kemeny all the time, but > the software doesn't. Yes, this is the third possibility. The software CAN always yield the same results -- simply by limiting the number of candidates it can handle (or, up to a limit based on computing power, increasing the number of candidates checked by the full/slow method). > Could you give me a definition of the VoteFair method itself? Without > knowing what bit is the software and what bit is the method, it's > difficult to get further. For instance, it would be hard to know whether > the results that the VoteFair software gives when Kemeny orderings are > tied is a property of the method or of the software, because the > definition of the Kemeny method itself doesn't say what should happen in > such a case. Originally I wrote the definition of VoteFair popularity ranking (which at that time I called VoteFair ranking) on Wikipedia, and Markus Schulze re-named the article to the "Kemeny-Young method" claiming that the two methods are the same. So the definition there matches VoteFair popularity ranking because I wrote it. Yes, the issue of how to handle Condorcet-cycle ties is not stated by John Kemeny (as far as I know), and I have not explicitly stated how such ties are handled in VoteFair popularity ranking. If I did make such a statement then you or others might argue that the two methods are not the same. The software does attempt to handle ties, either to use a tie-breaking criteria or to declare an official tie. In the latter case, the VoteFair approach presumably matches the Kemeny approach. In the former case the candidates in the cycle can be re-sequenced and the sequence sums yield the same highest sum, yet SOME of those "tied" sequences can yield different sums. If this seems complex it's because it IS complex. It took lots of work to get the software to handle these and other "edge" cases. This approach is unlike what many software developers do, which is to dismiss edge cases with some kind of error message. In other words, my software looks deeper into the "tie." So, keeping in mind that neither John Kemeny nor I have officially stated how ties (in the sequence sums) are to be handled, both methods are mathematically equivalent. I avoid using the word "same" because John Kemeny defines the sequence score in terms of counting votes that OPPOSE a candidate (and finding the minimum score), whereas VoteFair popularity ranking (which I came up with without knowing about the Kemeny method) defines the sequence score in terms of counting votes that SUPPORT a candidate (and finding the maximum score). If this still isn't clear, please keep asking for clarifications. Thank you for seeking to understand this method. Richard Fobes On 1/22/2020 3:49 PM, Kristofer Munsterhjelm wrote: > On 12/01/2020 05.34, VoteFair wrote: >> Kristofer ~ >> >> It occurs to me that you might be thinking that the VoteFair Ranking >> SOFTWARE provides the definition of VoteFair popularity ranking. That >> is not the case. >> >> VoteFair popularity ranking is defined in two of my books -- "The >> Creative Problem Solver's Toolbox" and "Ending The Hidden Unfairness In >> U.S. Elections" -- and in the Condorcet-Kemeny ("Kemeny-Young") >> Wikipedia article. >> >> Those definitions should make it clear to anyone who understands >> mathematics that the difference between VoteFair popularity ranking and >> the method John Kemeny published is that Kemeny counts opposition and >> VoteFair counts support, and that otherwise the two methods are >> mathematically equivalent. > > Let me try to restate that to see if I got it right, because it seems > like an odd distinction. > > A ranked voting method is a function that takes, as an input, a set of > ballots, and produces either a social ranking or a winner set (depending > on what definition you agree with). > > What happens inside the function is irrelevant to the definition. You > could, if you had infinite memory, implement it as a huge table that you > would just look up to find out what the output would be for a given input. > > Given this definition, there are two possibilities. Either the VoteFair > software implements the VoteFair method, or it does not. By "implements > the method" I mean that running the VoteFair software on some input > election always provides the output that the method would. > > Now, I was assuming that the software was an accurate implementation of > the method, so that instead of a method, you actually had a family of > them (parameterized by that constant), none of which would agree with > Kemeny all the time. It seems that you instead are saying that, as you > define the VoteFair ranking, it does agree with Kemeny all the time, but > the software doesn't. > > Could you give me a definition of the VoteFair method itself? Without > knowing what bit is the software and what bit is the method, it's > difficult to get further. For instance, it would be hard to know whether > the results that the VoteFair software gives when Kemeny orderings are > tied is a property of the method or of the software, because the > definition of the Kemeny method itself doesn't say what should happen in > such a case. > Election-Methods mailing list - see https://electorama.com/em for list info |
I hope this reply isn't too late, so here we go:
On 23/01/2020 18.53, VoteFair wrote: > On 1/22/2020 3:49 PM, Kristofer Munsterhjelm wrote: >> Given this definition, there are two possibilities. Either the VoteFair >> software implements the VoteFair method, or it does not. ... > > If the default settings are used, and if the number of candidates is no > more than the check-all-sequence-scores setting (which I think is 6), > then the software implements the VoteFair popularity ranking method, > which yields the same results as the Condorcet-Kemeny method. > > In other words you are not allowing for a third possibility, which -- of > course -- depends on the number of candidates. > >> ... It seems that you instead are saying that, as you >> define the VoteFair ranking, it does agree with Kemeny all the time, but >> the software doesn't. > > Yes, this is the third possibility. The software CAN always yield the > same results -- simply by limiting the number of candidates it can > handle (or, up to a limit based on computing power, increasing the > number of candidates checked by the full/slow method). > >> Could you give me a definition of the VoteFair method itself? Without >> knowing what bit is the software and what bit is the method, it's >> difficult to get further. For instance, it would be hard to know whether >> the results that the VoteFair software gives when Kemeny orderings are >> tied is a property of the method or of the software, because the >> definition of the Kemeny method itself doesn't say what should happen in >> such a case. > > Originally I wrote the definition of VoteFair popularity ranking (which > at that time I called VoteFair ranking) on Wikipedia, and Markus Schulze > re-named the article to the "Kemeny-Young method" claiming that the two > methods are the same. So the definition there matches VoteFair > popularity ranking because I wrote it. Alright, that enabled me to track down the original definition by looking at the Wikipedia history archive. I've quoted it below. > VoteFair ranking shall be done using software (such as accessible at > www.VoteFair.org) that performs the following calculations. The > preferences indicated in the ballots are counted to produce a tally > table in which all the possible pairs of candidates are listed, one > number for each pair indicates the number of voters who prefer one > candidate in the pair over the other candidate in the pair, another > number for each pair indicates the number of voters who have the > opposite preference for these two candidates, and a third number for > each pair indicates the number of voters who express no preference > between the two candidates. Using a computer, each possible sequence of > candidates is considered, where a sequence consists of one of the > candidates being regarded as the most popular candidate, another > candidate being regarded as the second-most popular candidate, and so > on. For each such sequence the numbers in the tally table that apply to > that sequence are added together to produce a sequence score for this > sequence. The sequence that has the highest sequence score indicates the > overall order of preference for the candidates. If there is more than > one sequence that has the same highest score, the sequences with this > score shall be analyzed to identify one or more ties at one or more > preference levels. I agree that the definition above describes the Kemeny-Young method. I'd even say it's the same (not just mathematically equivalent) even though the mechanics are different: There exist implementations of Kemeny-Young that don't rely on enumerating all combinations and then choosing the best one. For instance, integer programming implementations scale much better while producing the optimal result. If the implementation mattered as to whether one method is the same as another, then these implementations would not be of the Kemeny-Young method because they don't use exhaustive enumerations. But they are generally considered to be. > Yes, the issue of how to handle Condorcet-cycle ties is not stated by > John Kemeny (as far as I know), and I have not explicitly stated how > such ties are handled in VoteFair popularity ranking. If I did make such > a statement then you or others might argue that the two methods are not > the same. > > The software does attempt to handle ties, either to use a tie-breaking > criteria or to declare an official tie. In the latter case, the VoteFair > approach presumably matches the Kemeny approach. In the former case the > candidates in the cycle can be re-sequenced and the sequence sums yield > the same highest sum, yet SOME of those "tied" sequences can yield > different sums. If this seems complex it's because it IS complex. It > took lots of work to get the software to handle these and other "edge" > cases. This approach is unlike what many software developers do, which > is to dismiss edge cases with some kind of error message. In other > words, my software looks deeper into the "tie." > > So, keeping in mind that neither John Kemeny nor I have officially > stated how ties (in the sequence sums) are to be handled, both methods > are mathematically equivalent. So it seems the situation is different than I thought. Instead of the method not being the same as Kemeny, the case is more that the software doesn't implement the method, but rather something that approximates (and differs from) it. As I've said before, I consider a method to be algorithm-agnostic. Schulze implemented by the beatpath mechanism is the same method as Schulze implemented by the cloneproof Schwartz dropping mechanism. Furthermore, I think that a method is a function that takes an election as its input and then provides either a winner set or a ranking (order of finish). If a method uses some kind of auxiliary input parameter, then the method with one value of that auxiliary input parameter is not the same as the method with another value of that parameter, unless it happens to be inconsequential for the outcome. E.g. STV with a Droop quota is not the same method as STV with the Hare method. A software implementation is "of the method" or "implements the method" if it agrees with the method for every input that it returns an output for. Since the VoteFair method definition is the same (or equivalent) to the Kemeny-Young method definition, the methods are the same. But the software doesn't implement that method[1]. Instead, it implements a family of approximations determined by the parameter. Each approximation agrees with the full method if the number of candidates is below a certain number, and returns undefined results above this number. But the software doesn't give any impression that this is what it does. Instead, from the documentation, one would get the impression that it implements *the* Kemeny method (for any number of candidates). E.g. from the VoteFair::VoteFairRanking Perl module documentation: > This module calculates VoteFair Ranking results. The portions of > VoteFair Ranking implemented here are: > > - VoteFair popularity ranking. This voting method calculates the full > popularity ranking of all candidates (or choices in the case of a > survey) from most popular and second-most popular down to least > popular. It uses the preference information collected on 1-2-3 > ballots (or any equivalent way of expressing "ranked" preferences). > When a single position is being filled, the most popular candidate is > declared the winner. This calculation method is mathematically > equivalent to the Condorcet-Kemeny election method. and further, > VoteFair popularity ranking is described in Wikipedia, and is > mathematically equivalent to the Condorcet-Kemeny method. The following > comments explain the algorithm used here, which quickly calculates the > ranking results. This explanation is important because some academic > sources claim that the computations cannot be done quickly if there is a > large number of choices being ranked. > Calculation time: > > The full algorithm used to calculate VoteFair popularity ranking > results has a calculation time that is less than or equal to the > following polynomial function: > > T = A + ( B * N ) + ( C * ( N * N ) ) > > where T is the calculation time, N is the number of choices, and A > and B and C are constants. (In mathematical notation, N * N would be > written as N squared.) This function includes the calculation time > required for the Choice-Specific Pairwise-Score (CSPS) pre-sort > calculations. > > This time contrasts with the slow execution times of the "NP-hard" > approach, in which every sequence score is calculated in order to find > the sequence with the highest score. If every sequence score were > calculated (from scratch), the calculation time would be proportional to: > > N! * N * N > > where N is the number of choices, N! is N factorial (2 * 3 * 4 * ... > * N), and N * N equals N squared. Note that N factorial equals the number > of possible sequences, and N squared times one-half approximately equals > the number of pairwise counts that are added to calculate each sequence > score. > > This clarification about calculation time is included because there > is an academically common -- yet mistaken -- belief that calculating the > "Condorcet-Kemeny method" is "NP-hard" and cannot be calculated in a > time that is proportional to a polynomial function of N (the number of > choices). This description gives the impression that the software calculates the VoteFair ranking (period, i.e. no approximations), that it does so in polynomial time, and that it is possible to determine the Kemeny-Young winner in polynomial time. But what it does is calculate an approximation, which is how it can manage to do so in polytime without proving P=NP. Similarly, in the VoteFair C++ README.txt file: > // * VoteFair popularity ranking. This voting method calculates the > // full popularity ranking of all candidates (or choices in the case > // of a survey) from most popular and second-most popular down to > // least popular. It uses the preference information collected on > // 1-2-3 ballots (or any equivalent way of expressing "ranked" > // preferences). When a single position is being filled, the most > // popular candidate is declared the winner. This calculation method > // is mathematically equivalent to the Condorcet-Kemeny election > // method. which gives the impression that again, the software calculates the method (although at least it doesn't mention being able to circumvent NP-hardness :-). Now, if I'm generous, I could say that what I understand you to be saying is that when you say "this voting method calculates...", you're talking about the software, but when you're saying "this calculation method is mathematically equivalent...", you're talking about the method. But it's easy to draw the wrong conclusion, as I did earlier. The documentation could thus use being more clear about what the software implements -- or the software could be rewritten so that it implements the actual VoteFair (or Kemeny) method. And again, I would suggest the latter as a good strategy. Instead of producing correct results up to a certain number of candidates and then producing undefined results - that it produces the correct results in potentially exponential time, and that it just quits with a timeout if it takes too long a time. That way, it's immediately obvious to the user whether the method produces a correct result (namely, that it'd do so when it does produce a result at all). Instead of producing a result in managable time with some probability (which you argue is low) of producing something incorrect, it produces a result in managable time with similar (high if you're right) probability of not getting a timeout. In addition, if you use a reasonably good integer programming solver to do so, you can quickly solve cases that the current software can't, e.g. 1000 candidates but no cycles at all. [1] Strictly speaking, it implements a generalization of that method, but I don't want to make things more confusing than they need to be; I'd rather deal with one point at a time. ---- Election-Methods mailing list - see https://electorama.com/em for list info |
On 2/25/2020 3:17 PM, Kristofer Munsterhjelm wrote:
> So it seems the situation is different than I thought. Instead of the > method not being the same as Kemeny, the case is more that the software > doesn't implement the method, but rather something that approximates > (and differs from) it. > ... > ... the software doesn't give any impression that this is what it does. > Instead, from the documentation, one would get the impression that it > implements *the* Kemeny method (for any number of candidates). ... > ... > The documentation could thus use being more clear about what the > software implements -- or the software could be rewritten so that it > implements the actual VoteFair (or Kemeny) method. I agree that the software deserves better documentation, so I have added the following clarification to the VoteFair ranking software (both in README.txt and votefair_ranking.cpp). // Clarification: For reasons of computation time, there are some // very rare cases for which this software can produce results that // differ from full VoteFair popularity ranking results. // These exceptions involve more choices (candidates) than the // value of the constant named "global_check_all_scores_choice_limit", // which currently has a default value of 6. // In those rare cases, the software estimates which top (6) choices // are the most popular, and then does the full VoteFair popularity // ranking to determine which of those top choices is most popular. // The rare cases in which the estimation method produces different // results involve one or more rock-paper-scissors-like cycles // that include the top choices, which is a rare combination. // As an analogy, those cases are like finding the highest // sand dune in a desert, whereas most cases are like finding // the highest mountain peak in a mountain range. // As the number of ballots increases (such as beyond 50 ballots), // the likelihood of such cycles greatly decreases. // If this difference is important, the value of the constant // "global_check_all_scores_choice_limit" can be increased, but // that change will dramatically increase calculation time, // to the point of requiring years of computation time for values // even as small as 20. Therefore, values larger than 12 are not // recommended. Kristofer, thank you for taking the time to understand VoteFair Ranking and the VoteFair ranking software, and for calling attention to this need for better software documentation. Richard Fobes On 2/25/2020 3:17 PM, Kristofer Munsterhjelm wrote: > I hope this reply isn't too late, so here we go: > > On 23/01/2020 18.53, VoteFair wrote: >> On 1/22/2020 3:49 PM, Kristofer Munsterhjelm wrote: >>> Given this definition, there are two possibilities. Either the VoteFair >>> software implements the VoteFair method, or it does not. ... >> >> If the default settings are used, and if the number of candidates is no >> more than the check-all-sequence-scores setting (which I think is 6), >> then the software implements the VoteFair popularity ranking method, >> which yields the same results as the Condorcet-Kemeny method. >> >> In other words you are not allowing for a third possibility, which -- of >> course -- depends on the number of candidates. >> >>> ... It seems that you instead are saying that, as you >>> define the VoteFair ranking, it does agree with Kemeny all the time, but >>> the software doesn't. >> >> Yes, this is the third possibility. The software CAN always yield the >> same results -- simply by limiting the number of candidates it can >> handle (or, up to a limit based on computing power, increasing the >> number of candidates checked by the full/slow method). >> >>> Could you give me a definition of the VoteFair method itself? Without >>> knowing what bit is the software and what bit is the method, it's >>> difficult to get further. For instance, it would be hard to know whether >>> the results that the VoteFair software gives when Kemeny orderings are >>> tied is a property of the method or of the software, because the >>> definition of the Kemeny method itself doesn't say what should happen in >>> such a case. >> >> Originally I wrote the definition of VoteFair popularity ranking (which >> at that time I called VoteFair ranking) on Wikipedia, and Markus Schulze >> re-named the article to the "Kemeny-Young method" claiming that the two >> methods are the same. So the definition there matches VoteFair >> popularity ranking because I wrote it. > > Alright, that enabled me to track down the original definition by > looking at the Wikipedia history archive. I've quoted it below. > >> VoteFair ranking shall be done using software (such as accessible at >> www.VoteFair.org) that performs the following calculations. The >> preferences indicated in the ballots are counted to produce a tally >> table in which all the possible pairs of candidates are listed, one >> number for each pair indicates the number of voters who prefer one >> candidate in the pair over the other candidate in the pair, another >> number for each pair indicates the number of voters who have the >> opposite preference for these two candidates, and a third number for >> each pair indicates the number of voters who express no preference >> between the two candidates. Using a computer, each possible sequence of >> candidates is considered, where a sequence consists of one of the >> candidates being regarded as the most popular candidate, another >> candidate being regarded as the second-most popular candidate, and so >> on. For each such sequence the numbers in the tally table that apply to >> that sequence are added together to produce a sequence score for this >> sequence. The sequence that has the highest sequence score indicates the >> overall order of preference for the candidates. If there is more than >> one sequence that has the same highest score, the sequences with this >> score shall be analyzed to identify one or more ties at one or more >> preference levels. > > I agree that the definition above describes the Kemeny-Young method. I'd > even say it's the same (not just mathematically equivalent) even though > the mechanics are different: > > There exist implementations of Kemeny-Young that don't rely on > enumerating all combinations and then choosing the best one. For > instance, integer programming implementations scale much better while > producing the optimal result. If the implementation mattered as to > whether one method is the same as another, then these implementations > would not be of the Kemeny-Young method because they don't use > exhaustive enumerations. But they are generally considered to be. > >> Yes, the issue of how to handle Condorcet-cycle ties is not stated by >> John Kemeny (as far as I know), and I have not explicitly stated how >> such ties are handled in VoteFair popularity ranking. If I did make such >> a statement then you or others might argue that the two methods are not >> the same. >> >> The software does attempt to handle ties, either to use a tie-breaking >> criteria or to declare an official tie. In the latter case, the VoteFair >> approach presumably matches the Kemeny approach. In the former case the >> candidates in the cycle can be re-sequenced and the sequence sums yield >> the same highest sum, yet SOME of those "tied" sequences can yield >> different sums. If this seems complex it's because it IS complex. It >> took lots of work to get the software to handle these and other "edge" >> cases. This approach is unlike what many software developers do, which >> is to dismiss edge cases with some kind of error message. In other >> words, my software looks deeper into the "tie." >> >> So, keeping in mind that neither John Kemeny nor I have officially >> stated how ties (in the sequence sums) are to be handled, both methods >> are mathematically equivalent. > > So it seems the situation is different than I thought. Instead of the > method not being the same as Kemeny, the case is more that the software > doesn't implement the method, but rather something that approximates > (and differs from) it. > > As I've said before, I consider a method to be algorithm-agnostic. > Schulze implemented by the beatpath mechanism is the same method as > Schulze implemented by the cloneproof Schwartz dropping mechanism. > Furthermore, I think that a method is a function that takes an election > as its input and then provides either a winner set or a ranking (order > of finish). > > If a method uses some kind of auxiliary input parameter, then the method > with one value of that auxiliary input parameter is not the same as the > method with another value of that parameter, unless it happens to be > inconsequential for the outcome. E.g. STV with a Droop quota is not the > same method as STV with the Hare method. > > A software implementation is "of the method" or "implements the method" > if it agrees with the method for every input that it returns an output for. > > Since the VoteFair method definition is the same (or equivalent) to the > Kemeny-Young method definition, the methods are the same. But the > software doesn't implement that method[1]. Instead, it implements a > family of approximations determined by the parameter. Each approximation > agrees with the full method if the number of candidates is below a > certain number, and returns undefined results above this number. > > But the software doesn't give any impression that this is what it does. > Instead, from the documentation, one would get the impression that it > implements *the* Kemeny method (for any number of candidates). E.g. from > the VoteFair::VoteFairRanking Perl module documentation: > >> This module calculates VoteFair Ranking results. The portions of >> VoteFair Ranking implemented here are: >> >> - VoteFair popularity ranking. This voting method calculates the full >> popularity ranking of all candidates (or choices in the case of a >> survey) from most popular and second-most popular down to least >> popular. It uses the preference information collected on 1-2-3 >> ballots (or any equivalent way of expressing "ranked" preferences). >> When a single position is being filled, the most popular candidate is >> declared the winner. This calculation method is mathematically >> equivalent to the Condorcet-Kemeny election method. > > and further, > >> VoteFair popularity ranking is described in Wikipedia, and is >> mathematically equivalent to the Condorcet-Kemeny method. The following >> comments explain the algorithm used here, which quickly calculates the >> ranking results. This explanation is important because some academic >> sources claim that the computations cannot be done quickly if there is a >> large number of choices being ranked. > >> Calculation time: >> >> The full algorithm used to calculate VoteFair popularity ranking >> results has a calculation time that is less than or equal to the >> following polynomial function: >> >> T = A + ( B * N ) + ( C * ( N * N ) ) >> >> where T is the calculation time, N is the number of choices, and A >> and B and C are constants. (In mathematical notation, N * N would be >> written as N squared.) This function includes the calculation time >> required for the Choice-Specific Pairwise-Score (CSPS) pre-sort >> calculations. >> >> This time contrasts with the slow execution times of the "NP-hard" >> approach, in which every sequence score is calculated in order to find >> the sequence with the highest score. If every sequence score were >> calculated (from scratch), the calculation time would be proportional to: >> >> N! * N * N >> >> where N is the number of choices, N! is N factorial (2 * 3 * 4 * ... >> * N), and N * N equals N squared. Note that N factorial equals the number >> of possible sequences, and N squared times one-half approximately equals >> the number of pairwise counts that are added to calculate each sequence >> score. >> >> This clarification about calculation time is included because there >> is an academically common -- yet mistaken -- belief that calculating the >> "Condorcet-Kemeny method" is "NP-hard" and cannot be calculated in a >> time that is proportional to a polynomial function of N (the number of >> choices). > > This description gives the impression that the software calculates the > VoteFair ranking (period, i.e. no approximations), that it does so in > polynomial time, and that it is possible to determine the Kemeny-Young > winner in polynomial time. But what it does is calculate an > approximation, which is how it can manage to do so in polytime without > proving P=NP. > > Similarly, in the VoteFair C++ README.txt file: > >> // * VoteFair popularity ranking. This voting method calculates the >> // full popularity ranking of all candidates (or choices in the case >> // of a survey) from most popular and second-most popular down to >> // least popular. It uses the preference information collected on >> // 1-2-3 ballots (or any equivalent way of expressing "ranked" >> // preferences). When a single position is being filled, the most >> // popular candidate is declared the winner. This calculation method >> // is mathematically equivalent to the Condorcet-Kemeny election >> // method. > > which gives the impression that again, the software calculates the > method (although at least it doesn't mention being able to circumvent > NP-hardness :-). > > Now, if I'm generous, I could say that what I understand you to be > saying is that when you say "this voting method calculates...", you're > talking about the software, but when you're saying "this calculation > method is mathematically equivalent...", you're talking about the > method. But it's easy to draw the wrong conclusion, as I did earlier. > The documentation could thus use being more clear about what the > software implements -- or the software could be rewritten so that it > implements the actual VoteFair (or Kemeny) method. > > And again, I would suggest the latter as a good strategy. Instead of > producing correct results up to a certain number of candidates and then > producing undefined results - that it produces the correct results in > potentially exponential time, and that it just quits with a timeout if > it takes too long a time. > > That way, it's immediately obvious to the user whether the method > produces a correct result (namely, that it'd do so when it does produce > a result at all). Instead of producing a result in managable time with > some probability (which you argue is low) of producing something > incorrect, it produces a result in managable time with similar (high if > you're right) probability of not getting a timeout. > > In addition, if you use a reasonably good integer programming solver to > do so, you can quickly solve cases that the current software can't, e.g. > 1000 candidates but no cycles at all. > > [1] Strictly speaking, it implements a generalization of that method, > but I don't want to make things more confusing than they need to be; I'd > rather deal with one point at a time. > Election-Methods mailing list - see https://electorama.com/em for list info |
Free forum by Nabble | Edit this page |