Posted on 1 Comment

Evaluating the Usefulness of Official Statistics

Official statistics are numerical data collected by governments and their agencies. This post examines a range of official statistics collected by the United Kingdom government and evaluates their usefulness.

Click the image to search 13,848 official statistics produced by the U.K. government

The aim of this post is to demonstrate one of the main strengths of official statistics – they give us a ‘snapshot’ of life in the U.K. and they enable us to easily identify trends over time.

Of course, the validity, and thus the usefulness, of official statistics varies enormously between different types of statistic, and this post also looks at their relative strengths and limitations. Some are ‘hard statistics’: they are objective, and there is little disagreement over how to measure what is being measured (the number of schools in the U.K., for example). Others are ‘softer statistics’, because there is more disagreement over the definitions of the concepts being measured (the number of pupils with Special Educational Needs, for example).

If you’re a student working through this, there are two tasks to accompany this post:

  1. Before reading the material below, play this ‘U.K. official statistics matching game’. You can also do it afterwards to check your knowledge.
  2. After you’ve read through this material, do the ‘U.K. official statistics validity ranking exercise’.

Please click on the images below to explore the data further using the relevant ONS data sets and analysis pages.

Ethnic Identity in the United Kingdom According to the U.K. 2011 Census

U.K. Census data showed that 86% of people in the United Kingdom identified themselves as ‘white’ in 2011.

How valid are these statistics?

To an extent, ethnic identity is an objective matter – for example, I was kind of ‘born white’ in that both my parents are/were white, all of my grandparents were white, and all of my great-grandparents were white, so I can’t really claim I belong to any other ethnic group. However, although I ticked the ‘white’ box when I did the U.K. Census, this personally means very little to me, whereas to others (probably the kind of people I wouldn’t get along with very well) their ‘whiteness’ is a very important part of their identity. There is a whole range of different subjective meanings that go along with whatever ethnic identity box people ticked, and Census data tells us nothing about this.

Religion according to the U.K. 2011 Census

In the 2011 Census, 59% of people identified as ‘Christian’. The second largest ‘religious group’ was ‘no religion’, which 25% of the U.K. population identified with.

Statistics on religious affiliation may also lack validity – are 59% of people really Christian? And if they really are, then what does this actually mean? Church attendance is significantly lower than 59% of the population, so the ‘Christian’ box covers everything from devout fundamentalists to people who are just covering their bases (‘I’d better tick yes, just in case there is a God, or gods?’).

The British Humanist Association presents a nice summary of why statistics on religious belief may lack validity, based mainly on ‘harder’ statistics such as church attendance, which show a much lower rate of committed religious practice.

The United Kingdom Employment Rate

The employment rate is the proportion of people aged from 16 to 64 in work.

The lowest employment rate for all people was 65.6% in 1983, during the economic downturn of the early 1980s. The employment rates for all people, and for men and women separately, have been generally increasing since early 2012. As of December 2016, the employment rate for all people was 74.6%, the highest since records began in 1971.

Critics of the above data point to the existence of an informal or shadow economy in the United Kingdom, worth an estimated £150 billion a year, made up of people who are working and earning an income but not declaring it. If these people are missing from the figures, the actual paid-employment rate is higher than the official one.

Household Income Distribution in the United Kingdom

Household income statistics are broken down into the following three broad categories:

  • original income is income before government intervention (benefits)
  • gross income is income after benefits but before tax
  • disposable income is income after benefits and tax (income tax, National Insurance and council tax).

In the year ending 2016, after cash benefits were taken into account, the richest fifth had an average income that was roughly 6 times that of the poorest fifth (gross incomes of £87,600 per year compared with £14,800, respectively).
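The ‘roughly 6 times’ figure can be checked directly from the two gross incomes quoted above. This is just a minimal sketch in Python: the variable names are my own, and the only inputs are the two figures from the text.

```python
# Average gross household incomes for the year ending 2016, as quoted above.
richest_fifth = 87_600  # pounds per year
poorest_fifth = 14_800  # pounds per year

# How many times larger the richest fifth's average gross income is.
income_ratio = richest_fifth / poorest_fifth

print(f"The richest fifth's income is {income_ratio:.1f} times the poorest fifth's")
```

The ratio comes out a little under 6, which is why the claim is worded as ‘roughly’ six times.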

Reasons why household income data may lack validity

While measuring income does appear to be purely objective (you just add and subtract the pounds), the income data above may lack validity because some people might not declare some of the income they are earning. Cash-in-hand work, for example, would not be included in the above statistics, and some money earned via the ‘gig economy’ might not be declared either – how many people actually pay tax on their YouTube revenue, for example, or on the goods they sell on eBay?

The United Kingdom Crime Rate

Below I discuss data from the Crime Survey for England and Wales (CSEW), which is a victim survey conducted by structured interview with 35 000 households. It seems pointless discussing the crime rate according to police recorded crime because it’s such an obviously invalid measurement of crime (and the police know it), simply because so many crimes go unreported and hence unrecorded by the police.

Latest figures from the Crime Survey for England and Wales (CSEW) show there were an estimated 6.1 million incidents of crime experienced by adults aged 16 and over based on interviews in the survey year ending December 2016.

The green dot shows the figure if we include computer-based crimes and online fraud, a type of crime only recently introduced to the survey (so it wouldn’t be fair to make comparisons over time!) – if we include these, the number of incidents of crime experienced jumps up to 11.5 million.
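To see how much difference the newly added questions make, we can subtract one estimate from the other. A minimal sketch in Python, using only the two CSEW estimates quoted above; the variable names are my own, and the result is only as good as those two rounded figures.

```python
# CSEW estimates for the survey year ending December 2016, in millions of incidents.
crimes_excluding_fraud = 6.1   # without computer-based crime and online fraud
crimes_including_fraud = 11.5  # with computer-based crime and online fraud

# Incidents that only appear once the new questions are included.
fraud_and_computer = crimes_including_fraud - crimes_excluding_fraud

# Share of the larger total that the newly counted crimes make up.
share_of_total = fraud_and_computer / crimes_including_fraud

print(f"Newly counted incidents: {fraud_and_computer:.1f} million")
print(f"Share of the larger total: {share_of_total:.0%}")
```

On these figures, a little under half of the larger total consists of crimes the survey did not ask about at all until recently, which is a good illustration of how much a statistic depends on which questions are asked.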

Reasons why even the CSEW might lack validity

Even though it’s almost certainly more valid than police recorded crime, there are still reasons why the CSEW may not capture all crimes. Domestic crimes may go under-reported because the perpetrator might be in close proximity to the victim during the survey (it’s a household survey), people might misremember crimes, and there are certain crimes that the CSEW does not ask about, such as whether you’ve been a victim of corporate crime.

The U.K. Prison Population


The average prison population has increased from just over 17,400 in 1900 to just over 85,300 in 2016 (a five-fold increase). Since 2010, the average prison population has again remained relatively stable.

Prison Population Statistics – Probably have Good Validity?

I’ve included this as it’s hard to argue with the validity of prison population stats. Someone is either held in custody or they are not at the time of the population count (which is done weekly!) – a good example of a truly ‘hard’ statistic! This does of course assume we have open and due process where the law and courts are concerned.

Of course you could argue for the sake of it that they lack validity – what about hidden prisoners, or people under false imprisonment? I’m sure that in some other countries (North Korea?) the prison stats are totally invalid, if they keep any!

United Kingdom Population and Migration Data


Net migration to the U.K. stood at 248 000 in 2016, lower than the previous year, but still historically high compared to the 1980s-1990s.

There are a number of reasons why UK immigration statistics may lack validity

According to this migration statistics methodology document, only about 1 in 30 people are screened (asked detailed questions about whether they are long-term migrants or not) on entering the United Kingdom, and only a very small sample of people (around 4000) are subjected to the more detailed International Passenger Survey.

Then of course there is the issue of people who enter Britain legally but lie about their intentions to remain permanently, as well as people who are smuggled in. In short, the above statistics are just based on the people the authorities know about, so while I’m not one to go all ‘moral panic’ on the issue of immigration, there is sufficient reason to be sceptical about the validity of the official figures!

Ranking Exercise:

You might like to rank the following ‘official statistics’ in terms of validity – which of these statistics is closest to actual reality?

  • Immigration statistics – Net migration in 2016 was 248 000
  • Prison statistics – There are just over 85 000 people in prison
  • Crime statistics – There were around 6 million incidents of crime in 2016
  • The richest 20% of households had an average income of around £85 000 in 2016
  • The U.K. employment rate was 75% in 2016
  • 59% of the population were Christian in 2011
  • 86% of the population was white in 2011

Related Posts

Official Statistics in Sociology

Education Statistics – 12 things Department for Education data tell us about the state of education in England and Wales today (forthcoming)

Family and Household Statistics – seven interesting statistics about family life in the U.K.

Sources

Please click the pictures above to follow links to sources…

The United Kingdom Census is a survey of every person in the United Kingdom, carried out every 10 years, the last one being in March 2011. It asks a series of ‘basic’ questions about sex, ethnicity, religion and occupation. It is the only survey which is based on a ‘total sample’ of all U.K. households. You might also like this summary – What is a Census?

U.K. Prison Population Statistics – House of Commons Research Briefing

Posted on 7 Comments

The strengths and limitations of secondary data

What is secondary data?

Secondary data is information which has been collected previously by someone other than the researcher. It can be either qualitative, such as diaries, newspapers or government reports, or quantitative, as with official statistics such as league tables.

Strengths of using secondary data in social research

  • There is a lot of it! It is the richest vein of information available to researchers in many topic areas. Also, some large data sets might not exist if it wasn’t for the government collecting data.
  • Sometimes documents and official statistics might be the only means of researching the past.
  • Official statistics may be especially useful for making comparisons over time. The U.K. Census, for example, goes back to 1801.
  • At a practical level, many public documents and official statistics are freely available to the researcher.

Limitations of using secondary data

  • Official statistics may reflect the biases of those in power – limiting what you can find out.
  • The way official statistics are measured may change over time, making historical comparisons difficult. With crime statistics, for example, the definition of crime keeps changing.
  • Documents may lack authenticity – parts of the document might be missing because of age, and we might not even be able to verify who actually wrote the document, meaning we cannot check whether it’s biased or not.
  • Representativeness – documents may not be representative of the wider population, which is especially a problem with older documents. Many documents do not survive because they are not stored, and others deteriorate with age and become unusable. Other documents are deliberately withheld from researchers and the public gaze, and therefore do not become available.

This was a brief post, designed for last-minute revision for the AS and A Level sociology exams.

Posted on Leave a comment

How Old are Twitter Users?

‘Who Tweets’ is an interesting piece of recent research which attempts to determine some basic demographic characteristics of Twitter users, relying on nothing but the data provided by the users themselves in their Twitter profiles.

Based on a sample of 1470 Twitter profiles* in which users clearly stated** their age, the authors of ‘Who Tweets’ found that 93.9% of Twitter users were under the age of 35. The full age profile of Twitter users (according to the ‘Who Tweets’/COSMOS data) compared to the actual age profile taken from the UK Census is below:

The age profiles of Twitter users - really?

 

Compare this to the Ipsos MORI Tech Tracker report for the third quarter of 2014 (which the above research draws on), which used face-to-face interviews based on a quota sample of 1000 people.

Ages of Twitter users according to a face-to-face MORI poll

Clearly this shows that only 67% of Twitter users are under the age of 35, quite a discrepancy with the user-defined data!

The researchers note that:

‘We might… hypothesise that young people are more likely to profess their age in their profile data and that this would lead to an overestimation of the ‘youthfulness’ of the UK Twitter population. As this is a new and developing field we have no evidence to support this claim, but the following discussion and estimations should be treated cautiously.

Looking again at the results from the Technology Tracker study conducted by Ipsos MORI, nearly two thirds of Twitter users were under 35 years of age in Q3 of 2014 whereas our study clearly identifies 93.9% as being 35 or younger. There are two possible reasons for this. The first is that the older population is less likely to state their age on Twitter. The second is that the age distribution in the survey data is a function of sample bias (i.e. participants over the age of 35 in the survey were particularly tech-savvy). This discrepancy between elicited (traditional) and naturally occurring (new) forms of social data warrants further investigation…’

Comment 

This comparison clearly shows how we get some very different data on a very basic question (‘what is the age distribution of Twitter users?’) depending on the methods we use, but which is more valid? The Ipsos face-to-face poll is done every quarter, and it persistently yields results which are nothing like COSMOS. It’s unlikely that you’re going to get a persistent ‘tech-savvy’ selection bias in every sample of over-35-year-olds, so does that mean it’s a more accurate reflection of the age profile of Twitter users?
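One way to see that the gap is about method rather than chance is to estimate how much random sampling error there could be in the 93.9% figure. The sketch below is in Python and rests on a big assumption that is mine, not the researchers’: it treats the 1470 profiles as a simple random sample of Twitter users, which they are not. The function name and variable names are also my own.

```python
import math


def margin_of_error(proportion, sample_size, z=1.96):
    """Approximate 95% margin of error for a proportion.

    Assumes a simple random sample, which is an idealisation here.
    """
    standard_error = math.sqrt(proportion * (1 - proportion) / sample_size)
    return z * standard_error


cosmos_under_35 = 0.939  # share under 35 in the 'Who Tweets' profile data
cosmos_sample = 1470     # number of profiles with a clearly stated age
ipsos_under_35 = 0.67    # share under 35 in the Ipsos MORI face-to-face poll

moe = margin_of_error(cosmos_under_35, cosmos_sample)
gap = cosmos_under_35 - ipsos_under_35

print(f"COSMOS margin of error: +/- {moe * 100:.1f} percentage points")
print(f"Gap between the two estimates: {gap * 100:.1f} percentage points")
```

Even on this generous assumption, the sampling error is a small fraction of the gap between the two figures, so the difference has to come from who ends up in each sample (for example, who chooses to state their age), not from random noise. I haven’t calculated a margin of error for the Ipsos figure because the number of Twitter users within its quota sample of 1000 isn’t given above.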

Interestingly, the Ipsos data shows a definite drift to older users over time, and it’d be interesting to know if more recent COSMOS data reflects this. More interestingly, the whole point of COSMOS is to provide us with more up-to-date, ‘live’ information – so where is it?!? Sort of ironic that the latest public reporting is already 12 months behind good old Ipsos:

Age profiles of Twitter users in final quarter of 2015 according to MORI


At the end of the day, I’m not going to be too harsh about the above ‘Who Tweets’ study. It is experimental, and many of the above projects are looking at the methodological limitations of this data. It would just be nice if they, err, got on with it a bit… come on Sociology, catch up!

One thing I am reasonably certain about is that the above comparison shows the continued importance of ‘terrestrial’ methods, such as face-to-face surveys, if we want demographic data.

Of course, one simple way of checking the accuracy of the COSMOS data is to do a face-to-face survey and ask people what their age is and whether they state this in their Twitter profiles. Then again, I’m sure they’ve thought of that… maybe in 2018 we’ll get a report?

*drawn from the Collaborative Online Social Media Observatory (COSMOS)

**there’s an interesting discussion of the rules applied to determine this in the ‘Who Tweets’ article.