Univariate Analysis in Quantitative Social Research

Univariate analysis reviews one variable at a time and typically uses frequency tables and diagrams like bar charts and pie charts. Measures of central tendency and dispersion are tools for analyzing data, with central tendency often involving the mean, median, or mode while dispersion relies on range and standard deviation. Understanding these statistical methods aids in the comprehension of data distribution in areas of interest such as wealth statistics.

Univariate analysis refers to the analysis of one variable at a time.

The most common approaches are:

• Frequency tables.
• Diagrams: bar charts, histograms, pie charts.
• Measures of central tendency: mean, median, mode.
• Measures of dispersion: range and standard deviation.

Frequency Tables

A frequency table provides the number of cases and the percentages belonging to each of the categories for a variable. Frequency tables can be used for all the different types of variable.

Below is a simple example of a frequency table showing the number of schools in three different categories of the ‘type of school’ variable for the 2022-2023 academic year. I rounded the percentages below.

Analysts usually clean raw data and summarise it in frequency tables so that people can understand and visualise it more easily.

Frequency tables are the starting point for generating diagrams, which put the data into visual form and make trends stand out.
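As a rough sketch of how a frequency table is built from raw data, the Python below counts the cases in each category and works out rounded percentages. The school counts are invented for illustration (they are not the published figures), and the function name frequency_table is my own.

```python
from collections import Counter

# Hypothetical raw data: one entry per school (NOT the real published figures).
school_types = (
    ["LA maintained"] * 5
    + ["Academy"] * 4
    + ["Independent"] * 1
)

def frequency_table(values):
    """Return (category, count, rounded percentage) rows, largest category first."""
    counts = Counter(values)
    total = sum(counts.values())
    return [
        (category, n, round(100 * n / total))
        for category, n in counts.most_common()
    ]

table = frequency_table(school_types)
for category, n, pct in table:
    print(f"{category:<15}{n:>5}{pct:>5}%")
```

The same counts and percentages are what a bar chart or pie chart would then be drawn from.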

Diagrams

Diagrams represent quantitative data in visual form to make the data easier to understand and interpret. Bar charts and pie charts are two of the most commonly used visual representations of quantitative data.

Bar charts

The chart below shows the same data as in the frequency table above. Each bar represents one of the three school types.

The bar chart below shows that the largest category is Local Authority (LA) maintained schools, with academies the second largest category. You can also see that there are relatively few independent schools.

Pie Charts

The main advantage of a pie chart is that you can see the proportion of each category in relation to the total. A pie chart shows this sense of relation to the whole more clearly than a bar chart.

For example, you can clearly see below that LA maintained schools make up nearly 50% of the total. This doesn’t stand out as much in the bar chart.

You can also see from the pie chart that independent schools represent around 10% of schools.

Frequency tables and diagrams: final thoughts

Diagrams are useful for making frequency tables easier to understand.

Bar charts are more useful when you want to compare categories with each other. Pie charts are more useful when you want to look at each category's share of the whole.

Keep in mind however that charts are only as useful as the data. For example, one limitation with the above data is that it tells you nothing about pupil numbers, only school numbers!

Sources

Gov.UK (accessed July 2023) Schools, Pupils and their Characteristics 2022-23.

Measures of Central Tendency

Measures of central tendency encapsulate in one figure a value which is typical for a distribution of values. In effect, we are seeking out an average for a distribution.

Quantitative social research analysts recognise three different forms of average:

• mean
• median
• mode

Arithmetic mean

The mean is the sum of all values in a distribution divided by the number of values.

In diagram one above, we add all the ages together and divide by 20, which is the total number of ages in the sample. This gives us a mean of 51.6.

The mean should be applied to interval/ratio variables. It is also sometimes applied to ordinal variables.
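The calculation can be sketched in a couple of lines of Python. The ages here are made up for illustration and are not the sample from the diagram.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    return sum(values) / len(values)

# Hypothetical ages (not the 20-person sample from the diagram).
ages = [21, 28, 28, 35, 43, 48, 56, 61]

print(sum(ages), len(ages), mean(ages))  # 320 8 40.0
```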

Median

The median is the mid-point in a distribution of values. We arrive at the median by lining up all the values smallest to largest and then finding the middle value.

The mean is vulnerable to outliers, which are extreme values at either end of the distribution. Outliers can greatly increase or decrease the mean, but they have much less of an effect on the median.

We see this in diagram one above, where the median point is 45.5, considerably lower than the mean of 51.6. In this case the mean is higher because the oldest four people skew it upwards: they are a lot older than the people in the middle of the distribution and than the rest of the sample on average.

The median can be used in relation to both interval/ratio and ordinal variables.
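Here is a minimal sketch of finding the median, including the case where there is an even number of values and the two middle values are averaged. The ages are hypothetical, and the extra value of 95 is added only to show how differently the mean and median react to one outlier.

```python
def median(values):
    """Middle value of the sorted data (mean of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

ages = [24, 26, 28, 30, 32, 34, 36]  # hypothetical sample
with_outlier = ages + [95]           # the same sample plus one extreme value

print(median(ages), sum(ages) / len(ages))                          # 30 30.0
print(median(with_outlier), sum(with_outlier) / len(with_outlier))  # 31.0 38.125
```

One extreme value moves the median by one year but drags the mean up by more than eight.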

Mode

The mode is simply the value that occurs most frequently in a distribution. The mode can be applied to all types of variable.

In the diagram above, the mode is 28, because that is the only age which occurs twice.
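A short sketch of finding the mode in Python follows. It returns a list because a distribution can have more than one mode, and it works on categories as well as numbers. Both data sets are invented for illustration.

```python
from collections import Counter

def modes(values):
    """Return every value tied for the highest frequency."""
    counts = Counter(values)
    top = max(counts.values())
    return [value for value, n in counts.items() if n == top]

print(modes([21, 28, 28, 35, 43]))          # [28]
print(modes(["car", "bus", "bus", "car"]))  # ['car', 'bus'] (two modes)
```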

Median more useful than the Mean?

With social data it is often more useful to know the median rather than the mean. This is especially true with wealth statistics in the UK.

Wealth and income distribution are of special interest to sociologists, because there is a lot of variation in distribution. Neither wealth nor income is equally distributed. Understanding how they are distributed has significant implications for life chances and social policy.

Visualising the total wealth in a bar chart looks like this:

Here you can clearly see a skew towards the top two deciles, especially the first decile. The richest 10% of households have an average of almost £2 million in wealth, which is 8 times more than even the 4th decile.

In cases where there is a lot of variation in the data, with a large skew showing up at one end as above, the mean and median end up being very different.

In the chart above the mean is £489 000, pulled up by the huge relative wealth of the top 20%.

The median wealth is only £280 000, and 50% of households have less than this.

Mean wealth in the UK gives you a misleading picture of the amount of wealth most people in the UK have!
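To see why the mean can mislead with skewed data, here is a small sketch using ten invented household wealth values (in £ thousands). These are not the figures behind the chart. They are only shaped to have the same kind of skew at the top.

```python
import statistics

# Hypothetical household wealth in £ thousands (invented, with a skew at the top).
wealth = [10, 40, 90, 150, 220, 300, 400, 550, 800, 2000]

mean_wealth = statistics.mean(wealth)
median_wealth = statistics.median(wealth)

# Proportion of households whose wealth is below the mean.
share_below_mean = sum(w < mean_wealth for w in wealth) / len(wealth)

print(mean_wealth, median_wealth, share_below_mean)  # 456 260.0 0.7
```

In this made-up sample, seven out of ten households sit below the mean, so the mean describes very few of them well.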

Measures of Dispersion

Measures of dispersion show the variation in a distribution.

Two measures of dispersion are:

• the range (the simplest)
• the standard deviation.

Range

The range of data is the distance between the minimum and maximum values in a distribution. Like the mean, the range can be greatly affected by outliers.

The range of household wealth (grouped by decile) in the UK is £1.9 million (see chart below).

This is a very simple measure which doesn’t tell us very much about how much wealth ordinary people have.

For example it doesn’t tell us that the top decile of households are almost twice as wealthy as the next decile down.
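The sketch below shows why the range says so little: it uses only the two most extreme values, so two very different distributions can share the same range. Both data sets are invented for illustration.

```python
def value_range(values):
    """Distance between the largest and smallest value."""
    return max(values) - min(values)

# Two hypothetical distributions with the same minimum and maximum.
mostly_low = [10, 20, 30, 40, 1000]
mostly_middling = [10, 500, 505, 510, 1000]

print(value_range(mostly_low), value_range(mostly_middling))  # 990 990
```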

Standard Deviation

We calculate the standard deviation by taking the difference between each value in a distribution and the mean, squaring each difference, dividing the total of the squared differences by the number of values, and then taking the square root of the result.

The standard deviation is, roughly speaking, the average amount of deviation around the mean.

For example, the standard deviation of wealth in the UK (grouped by decile) is £575 211.

Outliers affect the standard deviation too, but not as much as they affect the range, because their impact is partly offset by dividing by the number of values.
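As a sketch, the usual textbook calculation squares each difference from the mean, averages the squares, and takes the square root. The version below divides by the number of values (the "population" form). Statistical software often divides by one less than that when working with a sample. The data are a made-up set chosen so the arithmetic comes out neatly.

```python
import math
import statistics

def population_sd(values):
    """Square root of the average squared difference from the mean."""
    m = sum(values) / len(values)
    squared_diffs = [(v - m) ** 2 for v in values]
    return math.sqrt(sum(squared_diffs) / len(values))

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values, mean = 5

print(population_sd(scores))      # 2.0
print(statistics.pstdev(scores))  # the standard library's population version
print(statistics.stdev(scores))   # sample version (divides by n - 1), slightly larger
```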

Box Plots

Box plots are popular for showing dispersion for interval/ratio variables.

The box plot provides an indication of both central tendency (the median) and dispersion (the spread of the box and whiskers, plus any outliers).

The box plot of wealth below treats the richest decile as an outlier. It clearly shows you that the skew is at the top.

The box shows you where the middle 50% of households sit: between £50 000 and £800 000.

The line in the box shows you the median value of household wealth: £280 000.

The shape of a box plot will vary depending on whether cases tend to be high or low in relation to the median. They show us whether there is more or less variation above or below the median.
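The numbers behind a box plot can be sketched as follows. This assumes the common convention that a value counts as an outlier if it lies more than 1.5 times the interquartile range beyond the box, and it uses Python's "inclusive" quartile method. Other software may use slightly different quartile rules. The wealth values (in £ thousands) are invented.

```python
import statistics

def box_plot_summary(values):
    """Quartiles, interquartile range and 1.5 x IQR outliers for a box plot."""
    ordered = sorted(values)
    q1, q2, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in ordered if v < low_fence or v > high_fence]
    return {"q1": q1, "median": q2, "q3": q3, "iqr": iqr, "outliers": outliers}

# Hypothetical household wealth in £ thousands.
wealth = [10, 40, 90, 150, 220, 300, 400, 550, 800, 2000]
summary = box_plot_summary(wealth)
print(summary)
```

Here the box would run from 105 to 512.5 with the median line at 260, and the value of 2000 would be drawn as a separate outlier point above the top whisker.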

Signposting and related posts

This material is most relevant to the Research Methods module. It might be a little advanced for A-level sociology. You are more likely to need this during a first year university statistical methods course.

Cross national comparisons are a useful way for students to learn more about the strengths and limitations of quantitative data and positivist methods.

Below is a task students of A-level sociology can usefully do to give them a feel for doing Cross National Research.

The main aim of this research task is to illustrate some of the strengths and limitations of doing cross national comparisons.

Cross national comparisons are one of the main methods used by positivists, so doing this will help to get students thinking like positivists!

Select any one of the questions below and use the resources under the relevant heading to explore it.

1. Why are some countries richer than others?
2. Why do some countries have higher levels of gender equality than others?
3. Why do some countries perform better in the PISA tests than others?
4. Why are some countries happier than others?
5. Why are some countries more peaceful than others?

Why are some countries richer than others?

This is a list of countries by Gross National Income per Capita, provided by the World Bank. The countries should appear listed in order.

Look at the top 10 countries, the bottom 10 countries, and look at ten in the middle.

NB you may need to screen out certain unusual countries (such as islands with very small populations, for example!)

Using your own knowledge, and further research on these countries if necessary, try to find out if any of the above three groups (top 10, middle 10, bottom 10) have anything in common.

Can you come up with a theory for why rich countries are rich and poor countries are poor?
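If you download one of these league tables as data, picking out the top, middle and bottom groups can be sketched like this. The country names and scores are placeholders, not real figures, and the function name and its exclude option (for screening out unusual cases) are my own invention.

```python
def top_middle_bottom(scores, k, exclude=()):
    """Rank a {country: value} mapping and return the top, middle and bottom k names."""
    kept = {name: value for name, value in scores.items() if name not in exclude}
    ranked = sorted(kept, key=kept.get, reverse=True)
    start = (len(ranked) - k) // 2  # first position of the middle group
    return ranked[:k], ranked[start:start + k], ranked[-k:]

# Placeholder data: invented names and values, deliberately out of order.
scores = {
    "E": 27, "A": 52, "H": 8, "C": 41, "Tiny Island": 60,
    "F": 19, "B": 48, "I": 3, "D": 33, "G": 12,
}

top, middle, bottom = top_middle_bottom(scores, k=3, exclude={"Tiny Island"})
print(top, middle, bottom)
```

With the real lists you would use k=10 and then research what the countries in each group have in common.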

Why do some countries have higher levels of gender equality than others?

Go to the World Economic Forum’s Global Gender Gap Report, 2023.

Look at the top 10 countries, the bottom 10 countries, and look at ten in the middle.

NB you may need to screen out certain unusual countries (such as islands with very small populations, for example!)

Using your own knowledge, and further research on these countries if necessary, try to find out if any of the above three groups (top 10, middle 10, bottom 10) have anything in common.

Can you come up with a theory for why some countries are more gender equal than others?

Why do some countries perform better in the PISA tests than others?

The Programme for International Student Assessment (PISA) assesses students from dozens of countries on their ability in maths, reading and science. All students do the same test, so we get national league tables as a result.

This is the hub page for the 2018 PISA results (results are normally only released every three years). Have a look at the countries at the top of the league tables compared to those at the bottom – can you think of a theory for why students in some countries do better than students in others?

Why are some countries happier than others?

This is a link to the World Happiness Report 2023.

Can you think of a theory for why people in some countries report higher levels of happiness than people in others?

Why are some countries more peaceful than others?

The Global Peace Index uses 23 indicators to measure how peaceful countries are and reports every year. This is a link to the 2023 Peacefulness results.

Can you think of a theory which explains why countries such as Iceland are at the top, while countries such as Afghanistan are at the bottom?

Try to think of why some countries might be more prone to war and conflict than others.

The Spirit Level – A Summary

The Spirit Level – Why more equal societies almost always do better – Richard Wilkinson and Kate Pickett

This book is relevant to both the Crime and Deviance module and the Theory and Methods module.

Based on thirty years of research, its findings are that almost every modern social and environmental problem is more likely to occur in a less equal society (where the difference between rich and poor is greater). This is one of the most important areas of social and political research – the issue of inequality goes to the heart of the political divide between left and right.

Wilkinson and Pickett use a wealth of statistical data to compare inequality in several European countries (the research mainly focuses on Europe, with a few other countries thrown in too), and the researchers use different measurements of inequality to increase validity. The main section of the book outlines the ‘costs of inequality’, in which the authors show that greater levels of inequality are positively correlated with higher rates of ill-health, lack of community life, violence, drug problems, obesity, mental health problems, long working hours and big prison populations. The final section, which I haven’t read yet, goes on to suggest some policy solutions.

Check out this video for a humorous overview of the book –

The book has its supporters – see http://www.equalitytrust.org.uk/resources/slides – where you can download slides of the book as an education tool to help spread the word about the ills that inequality ‘causes’.

Also see http://unrepentantcommunist.blogspot.com/2009/06/spirit-level-by-pickett-and-wilkinson.html for an interesting supportive blog and interesting discussion thread about the virtues and otherwise of equal/ unequal societies.

Have a look at this video where Wilkinson discusses some of the details of the book –

However, the book has come under some heavy criticism – see http://spiritleveldelusion.blogspot.com/ which refers to a recent book called ‘The Spirit Level Delusion’ – with ‘20 questions for Wilkinson and Pickett’ – to which they respond.

This link takes you to a criticism of The Spirit Level by Peter Saunders (a right-wing sociologist, a rare breed!) – http://www.thersa.org/events/audio-and-past-events/2010/the-spirit-level

To give you the gist of the criticisms – one argument is that the relationship between inequality and some factors, such as homicide, is skewed dramatically by a few exceptional countries – such as the USA in the case of homicide. You can listen to a debate between the authors of these two studies at the link above. A second, similar argument is that some countries have been left out of the cross national comparisons.
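The first criticism is easy to illustrate with a sketch. The figures below are entirely made up (they are not Wilkinson and Pickett's data, and the last pair simply stands in for one exceptional country). The code measures how strongly two variables are correlated, once with and once without the exceptional case.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

# Invented scores for six hypothetical countries; the last one is the exception.
inequality = [1, 2, 3, 4, 5, 9]
homicide = [3, 1, 4, 2, 3, 15]

r_all = pearson(inequality, homicide)
r_without = pearson(inequality[:-1], homicide[:-1])

print(round(r_all, 2), round(r_without, 2))
```

In this invented example the correlation looks strong with all six countries but almost vanishes once the exceptional one is dropped, which is why the choice of countries matters so much in this debate.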

This debate shows you an interesting example of how even ‘scientific’ quantitative sociology, in the form of cross national comparisons, struggles to be objective. When you are dealing with cross national comparisons there are so many variables to choose from that one has to be selective, and these selections are open to bias (in this case, which countries to include and exclude).

One interesting thing worth thinking about is that although the debate is all about whether the relationship between inequality and social problems can be scientifically proven, one can also make a moral argument against inequality. Perhaps it is fair to say that wealth inequalities like those we have in modern Britain are wrong just because no one human being is so talented or so productive that they can legitimately end up being thousands of times wealthier than the average person.

At the end of my ‘brief review’ I’ve realised that I don’t really know whether to believe The Spirit Level’s data or not – it seems to me that those on the left, committed to fighting inequality, are likely to believe it, while those on the right are more likely to criticise it.