Variables in quantitative research

What is the difference between interval/ ratio, ordinal, nominal and categorical variables? This post answers this question!

Interval/ ratio variables

These are variables where the distances between the categories are identical across the range of categories.

For example, in question 2, the age intervals go up in years, and the distance between the years is the same between every interval.

Interval/ ratio variables are regarded as the highest level of measurement because they permit a wider variety of statistical analyses to be conducted.

There is also a difference between interval and ratio variables… the latter have a fixed zero point.

Ordinal variables

These are variables that can be rank ordered but the distances between the categories are not equal across the range. For example, in question 6, the periods can be ranked, but the distances between the categories are not equal.

NB if you choose to group an interval variable like age in question 2 into groups (e.g. 20 and under, 21-30, 31-40 and so on) you are converting it into an ordinal variable.
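The grouping described above can be sketched in a few lines of Python. This is only an illustration: the function name age_group and the upper bands (41-50, 51 and over) are assumptions, since only the first three groups are named above.

```python
from bisect import bisect_left

# Upper bound of each age band. The first three bands come from the post;
# the remaining bands are assumed for illustration.
UPPER_BOUNDS = [20, 30, 40, 50]
LABELS = ["20 and under", "21-30", "31-40", "41-50", "51 and over"]


def age_group(age):
    """Return the ordinal age band for an exact age in years."""
    return LABELS[bisect_left(UPPER_BOUNDS, age)]


ages = [17, 20, 21, 35, 62]            # interval/ratio data: exact ages
groups = [age_group(a) for a in ages]  # ordinal data: ranked bands
print(groups)
```

The bands can still be ranked, but the exact distances are lost: a 21 year old and a 30 year old now fall into the same category.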

Nominal or categorical variables

These consist of categories that cannot be rank ordered. For example, in questions 7-9, it is not possible to rank subjective responses of respondents here into an order.

Dichotomous variables

These variables contain data that have only two categories – e.g. ‘male’ and ‘female’. Their relationship to the other types of variable is slightly ambiguous. In the case of question one, this dichotomous variable is also a categorical variable. However, some dichotomous variables may be ordinal variables as they could have one distinct interval between responses – e.g. a question might ask ‘have you ever heard of Karl Marx’ – a yes response could be regarded as higher in rank order to a no response.

Multiple-indicator measures such as Likert Scales provide, strictly speaking, ordinal variables. However, many writers argue they can be treated as though they produce interval/ ratio variables if they generate a large number of categories.

In fact Bryman and Cramer (2011) make a distinction between ‘true’ interval/ ratio variables and those generated by Likert Scales.

A flow chart to help define variables

*A nominal variable – aka categorical variable! 

Questionnaire Example 

This section deals with how different types of question in a questionnaire can be designed to yield different types of variable in the responses from respondents.

If you look at the example of a questionnaire below, you will notice that the information you receive varies by question.

Some of the questions ask for answers in terms of real numbers, such as question 2, which asks ‘how old are you’, or questions 4 and 5 and 6, which ask students how many hours a day they spend doing sociology class work and homework. These will yield interval variables.

Some of the questions ask for either/ or answers or yes/ no answers and are thus in the form of dichotomies. For example, question 1 asks ‘are you male or female’ and question 10 asks students to respond ‘yes’ or ‘no’ to whether they intend to study sociology at university. These will yield dichotomous variables.

The rest of the questions ask the respondent to select from lists of categories:

The responses to some of these list questions can be rank ordered – for example in question 6, once a day is clearly more than once a month! Responses to these questions will yield ordinal variables. 

Some other ‘categorical list’ questions yield responses which cannot be ranked in order – for example it is impossible to say that studying sociology because you find it generally interesting is ranked higher than studying it because it fits in with your career goals.  These will yield categorical variables.

These different types of response correspond to the four main types of variable above.
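As a rough sketch of why the type of variable matters for analysis, the Python below pairs each type with a summary statistic commonly regarded as appropriate for it (mean for interval/ ratio, median for ordinal, mode for nominal). The responses are invented for illustration and are not taken from the questionnaire, and the variable names are assumptions.

```python
from statistics import mean, median_low, mode

# Invented responses for illustration only (not real questionnaire data).
ages = [16, 17, 17, 18, 19]  # interval/ratio: exact numbers
frequency = ["once a month", "once a week", "once a day",
             "once a week", "once a day"]  # ordinal: rankable categories
reasons = ["interesting", "career", "interesting", "friends"]  # nominal

# Ordinal categories need an explicit rank order, lowest first.
ORDER = ["once a month", "once a week", "once a day"]

# A mean assumes equal distances between values, so it suits interval data.
mean_age = mean(ages)

# A median only needs a rank order: find the middle rank, then map it back.
ranks = [ORDER.index(f) for f in frequency]
median_frequency = ORDER[median_low(ranks)]

# Nominal categories can only be counted, so the mode is the usual summary.
most_common_reason = mode(reasons)

print(mean_age, median_frequency, most_common_reason)
```

Working down the levels, each statistic asks less of the data: the mean needs equal distances, the median only a rank order, and the mode only distinct categories.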

 

 

 


How equal are men and women in the UK?

The gap between men and women in terms of pay, and representation in big companies is decreasing rapidly, but significant inequalities remain in both of these areas, domestic life, and chances of being a victim of sexual assault. All of this is despite the fact that girls have been outperforming boys at GCSE (and above) for decades. The only area of life where there seems to be equality is reported happiness levels, yet women still report slightly higher anxiety levels.

This post summarizes statistics from six key areas of social life:

  • income – the gender pay gap.
  • domestic life – amount of time spent on leisure and unpaid work
  • economic power – the proportion of women represented on the boards of large companies
  • education – GCSE results
  • crime – the number of men and women who have been victims of sexual assault.
  • well being – reported levels of  happiness and anxiety.

There are a lot of statistics available on gender inequality (both in the UK and worldwide) and here I’ve tried to select just six key statistics that summarize the state of gender inequality today.

I’ve kept the data to a minimum so as to avoid information overload, as this post is written as part of an introduction to A-level sociology for students in their first week of study. I’ve also deliberately selected data that is relevant to the topics students are likely to be studying deeper into the A-level, such as families and households and education, so they can get a first look at it now.

If you want to find out more about trends in gender equality in the U.K. I recommend the U.K. Government’s Gender Equality Monitor, which tracks progress towards gender equality.  This recent report was very much the basis for this post!

NB – you’ll find it easier to read the charts if you click here to get to my Tableau Public page, where I’ve stored all of the data visualizations below.

Women’s Income compared to men’s 

The gender pay gap has fallen by about 10 percentage points since 1997, but the pay gap remains at just below 9%. 

Source: ONS: Gender Pay Gap in the UK, 2018.

Number of women running big companies

Source: Hampton-Alexander Review FTSE Women Leaders Improving gender balance in FTSE Leadership, November 2018.

GCSE results 

The 9-4 and 9-5 GCSE pass rates for girls are both approximately 7% higher than the corresponding pass rate for boys.

Source: GCSE and equivalent results: 2017 to 2018 (provisional).

Leisure and unpaid work 


Women report having an hour less leisure time per day, and do an hour more unpaid work per day, than men.

Source: ONS analysis of UK Harmonised European Time Use Survey (HETUS), 2015.

Chances of being a victim of sexual assault

While the rates of BCS reported sexual assaults against females have fallen significantly, females are still more than three times more likely to be victims than males.

Source: ONS.

Happiness and anxiety 


Despite all of the above, the reported happiness levels are almost identical for both males and females, and female anxiety levels are only slightly higher than male anxiety levels!

Source: ONS, Personal well-being estimates in the UK: October 2016 to September 2017.

Conclusions/ about this post

Hopefully you found this post useful. Writing it has been a bit of a learning curve, as I’m currently teaching myself how to use Tableau to do data visualizations.

What’s the most valid representation of trends in Life Expectancy?

Here’s a dual line chart showing trends in life expectancy for males and females in the UK from 1948 to 2016….

The above chart is only one way of visualizing this data, starting at zero. It gives the impression of a steadily increasing life expectancy for both sexes, with little difference between them.

 

Visualizing starting at age 64

However, if you cut off the bottom 60 odd years, you get the impression of a much faster increase in life expectancy and you also get the impression of a more rapidly closing gap between male and female life expectancy:

 

Same data, two different impressions…. the first ‘calm and steady’, the second ‘rapid and intense’ – it just goes to show how easy it is to ‘distort’ even ‘hard’ data in the visualisation/ representation phase!
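One way to put a number on this effect is to ask what share of the chart’s height a given rise takes up. The sketch below is a rough illustration only: the start and end values and the axis maximum are made-up round numbers, not the actual life expectancy figures behind the charts, and share_of_axis is a hypothetical helper.

```python
def share_of_axis(start, end, axis_min, axis_max):
    """Fraction of the vertical axis covered by a rise from start to end."""
    return (end - start) / (axis_max - axis_min)


# Made-up illustrative values, not the real data.
start, end = 66, 80
axis_max = 84

full_axis = share_of_axis(start, end, 0, axis_max)   # axis starts at zero
cut_axis = share_of_axis(start, end, 64, axis_max)   # axis starts at age 64

print(f"{full_axis:.0%} of the chart height with a zero baseline")
print(f"{cut_axis:.0%} of the chart height with a baseline of 64")
```

With these invented numbers the same 14 year rise fills about a sixth of the chart when the axis starts at zero, but 70% of it when the axis starts at 64.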

 

School Types in England and Wales – Statistical Overview…

As of 2017, there were over 250 000 children in ‘Converter Academies’, 86 000 students in sponsored academies, and 170 000 students in LEA maintained schools. This means that in 2017 there were roughly twice as many students in converter and sponsored academies combined as there were in LEA funded mainstream schools.
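The ‘twice as many’ comparison can be checked with some quick arithmetic. This short Python check uses only the rounded figures quoted above (the converter academy figure is given as ‘over 250 000’, so the true ratio will be at least this large), and the variable names are just illustrative.

```python
# Rounded pupil numbers for 2017 as quoted in the post.
converter_academies = 250_000   # quoted as "over 250 000"
sponsored_academies = 86_000
lea_maintained = 170_000

academies_total = converter_academies + sponsored_academies
ratio = academies_total / lea_maintained

print(academies_total)   # combined pupils in both types of academy
print(round(ratio, 2))   # academy pupils per LEA maintained pupil
```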

Number Pupils Schools Academies

Free Schools, meanwhile, cater to only just over 3000 students, with studio schools the least popular type of school, with only 1200 students.

Click on the link above, for the (slightly lame) interactive version… NB this is me still trying to get my head around Tableau!

 

What is Big Data?

Big data refers to things one can do at a large scale that cannot be done at a smaller scale. Big data analysis typically uses all available information and billions of data points to identify correlations which reveal new insights about human behaviour which are simply not available when using smaller data sets.


Big data has emerged with the widespread digitisation of information which has made it easier to store and process the increasing volume of information available to us.

Big data is also dependent on the emergence of new data processing tools such as Hadoop which are not based on the rigid hierarchies of the ‘analogue’ age, in which data was typically collected with specific purposes in mind. The rise of big data is likely to continue given that society is increasingly engaged in a process of ‘datification’ – there is an ongoing process of companies collecting data about all things under the sun.

Big data is also fundamentally related to the rise of large information technology companies, most obviously Google, Facebook and Amazon, who collect huge volumes of data and see that data as having an economic value.

A good example of ‘big data analysis’ is Google’s use of its search data to predict the spread of the H1N1 flu virus in 2009, based on the billions of search queries which it receives every day. They took 50 million of the most common search terms and compared them with CDC (Centers for Disease Control and Prevention) data, and found 45 search terms which were correlated with the official figures on the spread of flu.

As a result, Google was able to tell how the H1N1 virus was spreading in real time in 2009 without relying on the reporting-lag which came with CDC data, which is based on people visiting doctors to report flu, a method which can only tell us about the spread of flu some days after it has already spread.
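The basic idea of screening search terms by how closely they track official figures can be sketched in Python. This is a toy illustration, not Google’s actual method: the search terms, the weekly counts and the 0.9 cut-off are all invented, and pearson is a hypothetical helper written for this example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented weekly figures for illustration only.
official_flu_cases = [10, 20, 40, 80, 60]
search_counts = {
    "flu symptoms": [100, 210, 390, 820, 590],    # rises and falls with cases
    "cough medicine": [50, 90, 210, 400, 310],    # also tracks cases closely
    "holiday deals": [300, 280, 310, 290, 305],   # unrelated to flu
}

THRESHOLD = 0.9  # assumed cut-off for "strongly correlated"

# Keep only the terms whose weekly counts move with the official figures.
correlated_terms = [
    term for term, counts in search_counts.items()
    if pearson(counts, official_flu_cases) > THRESHOLD
]
print(correlated_terms)
```

Note that the screening step only shows which terms move together with the official figures; it says nothing about why people are searching for them.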

A second useful example is Oren Etzioni’s ‘Farecast company’ – which evolved to use 200 billion flight-price records to predict when the best time for consumers would be to buy plane tickets. The technology he evolved to crunch the data today forms the basis of sites such as Expedia.

There are three shifts in information analysis that occur with Big Data:

  1. Big data analysts seek to use all available data rather than relying on sampling. This is especially useful for gaining insights into niche subcategories.
  2. Big data analysts give up on exactitude at the micro level to gain insight at the macro level – they look for the general direction rather than measuring exactly down to the single penny or inch.
  3. Big data analysis looks for correlations, not causation – it can tell us that something is happening rather than why it is happening.

Cukier uses two analogies to emphasise the differences of working with big data compared to the ‘sampled data’ approach of the analogue age.

Firstly, he likens it to the shift from painting as a form of representation to movies – the latter is fundamentally different to a still painting.

Secondly, he likens it to the fact that at the subatomic level materials act differently to how they do at the atomic level – a whole new system of laws seems to work at the micro level.

Big Data – don’t forget to be sceptical! 

This post is only intended to provide a simple, starting point definition of big data, and the above summary  is taken from a best selling book on big data (source below) – this book is very pro-big data – extremely biased, overwhelmingly in favour of it – if you buy it and read it, keep this in mind! Big data also has its critics, but more of that later.

Sources

Based on chapter 1 of Mayer-Schönberger and Cukier (2017) ‘Big Data: The Essential Guide to Work, Life and Learning in the Age of Insight’.