Polling Data Tables 101: Cross-breaks!

In which I also use a very terribly worded question about marriage to demonstrate a point!

The Smokeless Room is a newsletter by Rushaa Louise Hamid designed to help you clear the smoke from the air and better understand the tools of decision makers, with a special focus on all you need to know to sift the bunk from the gold of opinion surveys!

Cross-breaks are the columns you find in polling tables that break the data down into different demographic groups.

The neat thing about cross-breaks is they can be used as very rough indicators of how opinion on a topic might be affected by different aspects of a person and their life. They can also tell us how much weighting was used for each demographic group, and highlight just how representative a poll actually is.

How do polling companies pick what cross-breaks to add to their tables?

Firstly, the British Polling Council requires that anything used to weight the sample be published as a cross-break. So if you are weighting by age, sex, and region, those all have to be included. Most companies will also include some sort of education level and/or socio-economic indicator. And there'll often be other vote-based weighting used - for instance previous general election vote - if the poll is a political one.

So most of the time the cross-breaks are determined by the weights, which also makes it easy to spot when things that might improve the accuracy of the poll have been left out. If I were running a poll on voting intention but didn't weight by previous vote (or even by whether people voted at all), how could I be sure that my sample wasn't skewed horribly in favour of one party?
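To make the effect of weighting concrete, here's a minimal sketch with entirely invented numbers (the party names, shares, and support figures are all hypothetical, not from any real poll). It shows how a sample that over-represents one party's previous voters gets pulled back towards population targets:

```python
# Hypothetical sketch: weighting a skewed sample back to population targets.
# Every number below is invented for illustration.

# Share of the sample in each previous-vote group vs. the known population share
sample_share = {"Party A": 0.50, "Party B": 0.30, "Did not vote": 0.20}
target_share = {"Party A": 0.40, "Party B": 0.35, "Did not vote": 0.25}

# Each respondent in a group gets weight = target share / sample share
weights = {g: target_share[g] / sample_share[g] for g in sample_share}

# Suppose support for some policy differs sharply by previous vote
support = {"Party A": 0.70, "Party B": 0.40, "Did not vote": 0.50}

unweighted = sum(sample_share[g] * support[g] for g in sample_share)
weighted = sum(sample_share[g] * weights[g] * support[g] for g in sample_share)

print(f"Unweighted estimate: {unweighted:.1%}")  # skewed by too many Party A voters
print(f"Weighted estimate:   {weighted:.1%}")
```

Because Party A voters are over-sampled and also more supportive, the unweighted figure overstates support; weighting by previous vote corrects for that, which is exactly why leaving it out of a voting-intention poll is risky.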

Sometimes, though, other cross-breaks will be added that aren't weighted - these tend to be things that are hard to weight, such as relationship status, but that might give useful extra context on the answers.

Let's say a poll has asked the question "Do you hate marriage?", 56% of people said "Yes", and it produced the table below:

As you can see, the cross-breaks show a slightly different story from that overall 56% - here the majority of married people have answered "No", and single people are quite closely split between "Yes" and "Don't Know". Even though it is unweighted, it tells us that the figures are more complicated than they appear at first glance.

So, how excited can I get about a cross-break?

Here's the thing - anything unweighted should have a giant asterisk next to it that reads "THIS IS JUST THE OPINIONS OF A CERTAIN NUMBER OF PEOPLE AND MIGHT BE INTERESTING, BUT CANNOT BE MADE DEFINITIVE AND REPRESENTATIVE". Anything weighted though should also have an asterisk, this time saying clearly:


You see, interlocking weights - where you pay close attention to how many young people are also living in a particular place and have a particular education level - don't tend to be applied regularly. Instead the focus is just on making the poll sample as a whole match up to those general figures.
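A tiny sketch of why that matters, with invented numbers: a sample can match every overall (marginal) target perfectly and still have a badly skewed combination of characteristics underneath.

```python
# Invented example: 100 respondents, each with an (age group, region) pair.
sample = [("young", "north")] * 30 + [("old", "north")] * 20 + [("old", "south")] * 50

n = len(sample)
young_share = sum(a == "young" for a, _ in sample) / n  # 30% young
north_share = sum(r == "north" for _, r in sample) / n  # 50% north

# Suppose the population targets are 30% young and 50% north: both marginals
# already match, so weighting by age and region separately changes nothing...
assert young_share == 0.30 and north_share == 0.50

# ...but every single young respondent lives in the north. If young
# southerners exist in reality and think differently, no amount of
# non-interlocked weighting can recover their views - they aren't in the data.
young_south = sum(1 for a, r in sample if a == "young" and r == "south")
print("young southerners in sample:", young_south)  # 0
```

This is the gap between weighting the sample "as a whole" and weighting every combination of groups: the cross-break for young people here is really a cross-break for young northerners.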

Thinking of last week's example, maybe all the young people in our sample happen to be blue, and there are no red young people. We are then missing a key subdemographic of our chosen demographic, so saying that all young people hold a particular opinion based off a cross-break is stretching it a bit too far. And the smaller the cross-break sample, the bigger your margin of error.
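You can see how quickly that margin of error grows with a rough back-of-the-envelope calculation. This uses the standard 95% margin-of-error formula for a proportion; the sample sizes below are hypothetical, chosen just to compare a full poll with one small cross-break:

```python
import math

# Rough 95% margin of error for a proportion: 1.96 * sqrt(p * (1 - p) / n).
# The sample sizes here are invented for illustration.
def margin_of_error(p: float, n: int) -> float:
    return 1.96 * math.sqrt(p * (1 - p) / n)

full_poll = margin_of_error(0.56, 1000)   # the whole sample
cross_break = margin_of_error(0.56, 80)   # one small cross-break

print(f"n=1000: +/- {full_poll:.1%}")   # roughly +/- 3 points
print(f"n=80:   +/- {cross_break:.1%}") # roughly +/- 11 points
```

A shift that would be headline news in the full poll is comfortably within the noise of an 80-person cross-break.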

What can I do then?

In polling generally you're looking for sustained trends, and that's the case with cross-breaks too. They tend to produce more volatile results since they're not internally weighted, which makes them juicy for media that feed on dramatic and unexpected figures.

If you read an article, try to pay attention to whether the shock figure for a demographic is said to come from a nationally representative poll - that's a big flag that a cross-break is being improperly used. Ethnicity (as mentioned in a previous newsletter) is often a cross-break that is treated as gospel in articles, even when the sample it comes from is fewer than 100 people.

Cross-breaks are great at adding context to the story that the data is trying to tell - for instance where a voting intention poll shows that people seem to be switching away from the party they previously voted for. However, they are not polls in and of themselves, and treating their findings as such is a recipe for disaster.

Next week we’ll be continuing to look at other parts of polling data tables, focusing on the questions themselves!

As always, if you'd like to drop me a note, you can contact me by replying to this email or over on Twitter at @thesecondrussia.

Newsletter icon made by Freepik from Flaticon.
