Sunday, 27 March 2011

Polling and modelling Alternative Vote

Alan Bell at Vote Geek has taken the algorithm and data I used for the Alternative Vote swingometer to come up with a graphical version. It allows you to select a set of transfer values for second preferences, and then graphically generates the results for various first preference values.

It's definitely worth a look. As it stands, it also exposes some of the severe problems with existing AV polling and predictions. (This is not a criticism of Alan Bell, who has done about the best possible with the available data.)

Should the AV referendum pass, the polling and modelling of the general election result might become very tricky indeed - it might be the first election for some time where the results weren't generally obvious in advance.

Problem 1: England is not the UK

As I mentioned more briefly in the introduction to the swingometer, it's for England only.

There is virtually no polling - not even First Past The Post polling - for Northern Ireland. There is no useful second preference polling for Scotland or Wales, either - the few second preference polls that have been done are for Great Britain as a whole, and so the Scottish National Party and Plaid Cymru are on 3-4% of the first preference vote and a similar proportion of transfer votes.

While this might be correct for Britain as a whole, it's certainly not true within either Scotland or Wales. Under AV, any prediction for those countries would require focused polling because the transfers of SNP and PC voters would be crucial.

(In practice, the media and the nation being what they are, I'd expect to see lots of England-only polls and a very occasional Welsh or Scottish poll.)

Problem 2: Second preference scores are imprecise

Under AV (as opposed to Borda, for instance), the impact of a second preference depends on what the first preference was, and the impact of a third preference depends on both the first and the second. Second preferences therefore need to be broken down by first preference.
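To illustrate why the breakdown matters, here is a minimal sketch of an AV count in Python. It is not the swingometer's algorithm; the function name `av_winner`, the party labels and the ballot numbers are all made up for illustration, and ties for last place are broken arbitrarily (alphabetically).

```python
from collections import Counter

def av_winner(ballots):
    """Count ranked ballots under AV.

    ballots: list of tuples of party names, most preferred first.
    Returns (winner, rounds), where rounds is a list of per-round tallies.
    Assumes at least one non-empty ballot.
    """
    remaining = {party for ballot in ballots for party in ballot}
    rounds = []
    while True:
        # Start every remaining party on zero, in a fixed (alphabetical) order.
        tally = Counter({party: 0 for party in sorted(remaining)})
        for ballot in ballots:
            # Each ballot counts for its highest-ranked party still in the race.
            for choice in ballot:
                if choice in remaining:
                    tally[choice] += 1
                    break
        rounds.append(dict(tally))
        total = sum(tally.values())
        leader = max(tally, key=tally.get)
        if tally[leader] * 2 > total or len(remaining) == 1:
            return leader, rounds
        # Eliminate the party with the fewest votes and count again.
        remaining.discard(min(tally, key=tally.get))

# Hypothetical seat: only the eliminated party's second preferences get used.
ballots = (
    [("Con", "LD")] * 40
    + [("Lab", "LD")] * 35
    + [("LD", "Lab")] * 25
)
winner, rounds = av_winner(ballots)
print(winner, rounds)
```

In this made-up seat the second preferences of the Conservative and Labour voters are never looked at, while those of the Lib Dem voters decide the result. Swapping the Lib Dem second preferences to the Conservatives changes the winner; changing anyone else's second preference changes nothing.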

This means that the second preference data comes from a sub-sample rather than from the full, properly weighted sample.

Sub-samples are generally not weighted internally, and they are also much smaller than the full sample (when sub-sampling by first preference, at best a quarter of the size).

The margin for random error on a typical 1,000-response poll of first preferences (or FPTP votes) is around 3 percentage points either way, at the conventional 95% confidence level.

The margin for random error on a 250-response sub-sample (the second preferences of one major party's voters, for instance) would be around 6 percentage points, assuming the sub-sample is also properly weighted. Since sub-samples generally aren't, there is also an unknown systematic error.
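These figures can be checked with the standard margin-of-error formula for a simple random sample. The sketch below assumes a 95% confidence level (z = 1.96) and the worst case of a 50% share; real polls are weighted, so this is only an approximation, and the function name is my own.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion p in a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

full_sample = margin_of_error(1000)  # whole poll
sub_sample = margin_of_error(250)    # one party's voters, at best

print(f"n=1000: +/-{full_sample:.1%}")
print(f"n=250:  +/-{sub_sample:.1%}")
```

Because the margin scales with one over the square root of the sample size, a sub-sample a quarter of the size has twice the margin of error.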

If AV polling becomes necessary, the sample sizes of polls will have to increase, and more weighting of the sub-samples will be needed.

Problem 3: Second preferences are not independent of first preferences

The Vote Geek visualisation lets you put in a set of transfer rates, and then gives the English result for those transfer rates for various first preference rates.
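As a rough illustration of what applying a fixed set of transfer rates to first preference shares involves, here is a simplified single-seat sketch. It is not the code behind the swingometer or the Vote Geek visualisation. The function name, the shares and the rates are invented, and it makes two simplifying assumptions: votes that would transfer to an already-eliminated party are treated as non-transferable, and votes a party has received in transfers are passed on at that party's own rates.

```python
def av_seat_result(first_prefs, transfers):
    """Estimate an AV winner in one seat from vote shares and transfer rates.

    first_prefs: {party: first preference share}
    transfers:   {from_party: {to_party: fraction of from_party's votes}}
                 Any fraction not listed is treated as non-transferable.
    """
    votes = dict(first_prefs)
    while len(votes) > 1:
        total = sum(votes.values())
        leader = max(votes, key=votes.get)
        if votes[leader] * 2 > total:
            return leader  # majority of the votes still in play
        # Eliminate the lowest party and share out its votes.
        loser = min(votes, key=votes.get)
        freed = votes.pop(loser)
        for party, rate in transfers.get(loser, {}).items():
            if party in votes:  # simplification: no onward transfer otherwise
                votes[party] += freed * rate
    return next(iter(votes))

# Hypothetical seat with two different sets of Lib Dem transfer rates.
shares = {"Con": 38, "Lab": 34, "LD": 28}
lab_leaning = av_seat_result(shares, {"LD": {"Con": 0.30, "Lab": 0.50}})
con_leaning = av_seat_result(shares, {"LD": {"Con": 0.45, "Lab": 0.30}})
print(lab_leaning, con_leaning)
```

With identical first preferences, the seat changes hands purely on the transfer rates chosen, which is why the accuracy of those rates matters so much.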

Of course, in reality, changes in first preference polling will have an effect on second preference polling. It's very hard to work out how this will actually affect second preferences, though.

For instance, between the general election and the latest[1] second preference polling (YouGov/Spectator, 6 July), the Conservatives and Labour both gained some first preference voters at the expense of the Lib Dems.

The proportion of Labour votes with Lib Dem second preferences decreased, but the proportion of Conservative votes with Lib Dem second preferences increased. Similarly, Lib Dem second preferences changed from benefiting Labour over the Conservatives to being roughly equal between the two parties.

The reasons for this happening are - politically - rather obvious. But there is no mathematical model for second preference transfers that would predict this without knowledge of the political context (and any model that has the political context built in would rapidly become inaccurate were that context to change).

This isn't a problem if you're just running scenarios from polling results, but it is a problem if you want to run "what if" scenarios.


[1] For all the polling about how people might vote in the referendum, there's been a surprising lack of polling about what might happen if people vote 'Yes'. I'm surprised that none of the newspapers funding regular political polls have wanted to ask about it.