## Univariate Analysis in Quantitative Social Research

Univariate analysis reviews one variable at a time and typically uses frequency tables and diagrams like bar charts and pie charts. Measures of central tendency and dispersion are tools for analyzing data, with central tendency often involving the mean, median, or mode while dispersion relies on range and standard deviation. Understanding these statistical methods aids in the comprehension of data distribution in areas of interest such as wealth statistics.

Univariate analysis refers to the analysis of one variable at a time.

The most common approaches are:

• Frequency tables.
• Diagrams: bar charts, histograms, pie charts.
• Measures of central tendency: mean, median, mode.
• Measures of dispersion: range and standard deviation.

## Frequency Tables

A frequency table provides the number of cases and the percentages belonging to each of the categories for a variable. Frequency tables can be used for all the different types of variable.

Below is a simple example of a frequency table showing the number of schools in three different categories of the ‘type of school’ variable for the 2022-2023 academic year. I rounded the percentages below.

Analysts usually clean raw data before building frequency tables, so that people can understand and visualise the data more easily.

Frequency tables are the starting point for generating diagrams which put the data into visual form making trends stand out.
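The counting and percentage steps behind a frequency table can be sketched in a few lines of Python. This is only a minimal illustration: the `school_types` list is invented for the example (it is not the Gov.UK data), and `frequency_table` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical raw data: one entry per school (invented for illustration,
# not the real Gov.UK figures).
school_types = ["LA maintained"] * 5 + ["Academy"] * 4 + ["Independent"] * 1


def frequency_table(values):
    """Return (category, count, rounded percentage) rows, largest category first."""
    counts = Counter(values)
    total = sum(counts.values())
    return [
        (category, count, round(100 * count / total))
        for category, count in counts.most_common()
    ]


# Print the table as three aligned columns: category, count, percentage.
for category, count, percent in frequency_table(school_types):
    print(f"{category:<15}{count:>5}{percent:>5}%")
```

The same rows could then be passed to a charting tool to draw a bar chart or pie chart.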

## Diagrams

Diagrams represent quantitative data in visual form, which makes the data easier to understand and interpret. Bar charts and pie charts are two of the most commonly used visual representations of quantitative data.

#### Bar charts

The chart below shows the same data as in the frequency table above. Each bar represents one of the three school types.

The bar chart below shows the largest category is Local Authority (LA) maintained schools with academies the second largest category. You can also see there are relatively few independent schools.

#### Pie Charts

The main advantage of a pie chart is that you can see the proportion of each category in relation to the total. A pie chart shows this sense of relation to the whole more clearly than a bar chart.

For example you can clearly see below that LA Maintained schools make up nearly 50% of the total. This doesn’t stand out as much in the bar chart.

You can also see that Independent schools represent around 10% of schools from the pie chart.

### Frequency tables and diagrams: final thoughts

Diagrams are useful for making frequency tables easier to understand.

Bar charts are more useful when you want to look at proportions in relation to each other. Pie charts are more useful when you want to look at proportions in relation to the whole.

Keep in mind however that charts are only as useful as the data. For example, one limitation with the above data is that it tells you nothing about pupil numbers, only school numbers!

#### Sources

Gov.UK (accessed July 2023) Schools, Pupils and their Characteristics 2022-23.

## Measures of Central Tendency

Measures of central tendency encapsulate in one figure a value which is typical for a distribution of values. In effect, we are seeking out an average for a distribution.

Quantitative social research analysts recognise three different forms of average:

• mean
• median
• mode

### Arithmetic mean

The mean is the sum of all values in a distribution divided by the number of values.

In diagram one above, we add ALL the ages together and divide by 20 which is the total number of ages in the sample. This gives us a mean of 51.6.

The mean should be applied to interval/ratio variables. It is sometimes applied to ordinal variables too.

### Median

The median is the mid-point in a distribution of values. We arrive at the median by lining up all the values smallest to largest and then finding the middle value.

The mean is vulnerable to outliers, which are extreme values at either end of the distribution. Outliers can greatly increase or decrease the mean, but they have much less of an effect on the median.

We see this in diagram one above, where the median point is 45.5, considerably lower than the mean of 51.6. In the case above the mean is higher because the oldest four people skew the mean average upwards. The four oldest are much older than both the people in the middle and the average age of the rest of the sample.

The median can be used in relation to both interval/ratio and ordinal variables.
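To see how outliers pull the mean but barely move the median, here is a small sketch using Python's built-in `statistics` module. The ages are made up for the example and are not the sample from the diagram.

```python
from statistics import mean, median

# Hypothetical ages, invented for illustration (not the sample in the post).
ages = [22, 25, 28, 31, 34, 37, 40, 88, 91]

print(mean(ages))    # 44: pulled upwards by the two oldest people
print(median(ages))  # 34: the middle value once the ages are sorted

# Dropping the two outliers moves the mean far more than the median.
without_outliers = ages[:-2]
print(mean(without_outliers))    # 31
print(median(without_outliers))  # 31
```

Removing the two oldest people drops the mean by 13 years but the median by only 3.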

### Mode

The mode is simply the value that occurs most frequently in a distribution. The mode can be applied to all types of variable.

In the diagram above, the mode is 28, because that is the only age which occurs twice.
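A quick sketch of finding the mode in Python, assuming Python 3.8 or later for `statistics.multimode`. Both lists are hypothetical; the second shows that the mode also works for a nominal variable where a mean or median would make no sense.

```python
from statistics import multimode

# Hypothetical data, invented for illustration.
ages = [19, 28, 28, 33, 41, 57]
school_types = ["Academy", "LA maintained", "Academy", "Independent"]

print(multimode(ages))          # [28]
print(multimode(school_types))  # ['Academy']

# A distribution can have more than one mode.
print(multimode([1, 1, 2, 2]))  # [1, 2]
```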

#### Median more useful than the Mean?

With social data it is often more useful to know the median rather than the mean. This is especially true with wealth statistics in the UK.

Wealth and income distribution are of special interest to sociologists, because there is a lot of variation in distribution. Neither wealth nor income are equally distributed. Understanding how they are distributed has significant implications for life chances and social policy.

Visualising the total wealth in a bar chart looks like this:

Here you can clearly see a skew towards the top two deciles, especially the top decile. The richest 10% of households have an average of almost £2 million in wealth, which is 8 times more than even the 4th decile.

In cases where there is a lot of variation in the data, with a large skew showing up at one end, as above, the mean and median tend to be very different.

In the chart above the mean is £489 000, pulled up by the huge relative wealth of the top 20%.

The median wealth is only £280 000 and 50% of people have less than this.

Mean wealth in the UK gives you a misleading picture of the amount of wealth most people in the UK have!

## Measures of Dispersion

Measures of dispersion show the variation in a distribution.

Two measures of dispersion include:

• the range (the simplest)
• the standard deviation.

### Range

The range of data is the distance between the minimum and maximum values in a distribution. As with the mean, outliers can greatly affect the range.

The range of household wealth (grouped by decile) in the UK is £1.9 Million (see chart below).

This is a very simple measure which doesn’t tell us very much about how much wealth ordinary people have.

For example it doesn’t tell us that the top decile of households are almost twice as wealthy as the next decile down.
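The range is simple enough to compute by hand, but a short sketch shows how much it depends on a single extreme value. The wealth figures below are invented for illustration, and `value_range` is a hypothetical helper name.

```python
# Hypothetical household wealth figures in pounds (invented for illustration).
wealth = [15_000, 60_000, 140_000, 280_000, 450_000, 900_000, 1_900_000]


def value_range(values):
    """Distance between the smallest and largest value."""
    return max(values) - min(values)


print(value_range(wealth))       # 1885000
print(value_range(wealth[:-1]))  # 885000 once the richest household is dropped
```

Dropping one household cuts the range by £1 million, and neither figure says anything about the households in between.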

### Standard Deviation

We calculate the standard deviation by taking the difference between each value in a distribution and the mean, squaring each difference, dividing the total of the squared differences by the number of values, and then taking the square root of the result.

The standard deviation is, roughly speaking, the average amount of deviation around the mean.

For example, the standard deviation of wealth in the UK (grouped by decile) is £575 211.

Outliers don’t affect the standard deviation as much as the range. The impact of outliers on the standard deviation is offset by dividing by the number of values.
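As a worked sketch, the standard deviation can be computed step by step in Python. This version squares each difference from the mean before averaging, because the raw differences always cancel out to zero. It divides by the number of values (the 'population' form), whereas many textbooks and software packages divide by one less than the number of values for sample data. The values are invented, and `population_sd` is a hypothetical helper name.

```python
from math import sqrt
from statistics import pstdev

# Hypothetical values, invented for illustration.
values = [2, 4, 4, 4, 5, 5, 7, 9]


def population_sd(values):
    """Square root of the average squared difference from the mean."""
    m = sum(values) / len(values)
    squared_diffs = [(v - m) ** 2 for v in values]
    return sqrt(sum(squared_diffs) / len(values))


m = sum(values) / len(values)
print(sum(v - m for v in values))  # 0.0: raw differences cancel out
print(population_sd(values))       # 2.0
print(pstdev(values))              # the standard library gives the same answer
```

For sample data, `statistics.stdev` applies the 'divide by one less' version instead.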

### Box Plots

Box plots are popular for showing dispersion for interval/ratio variables.

The box plot provides an indication of both the central tendency (median) and dispersion (outliers).

The box plot of wealth below treats the richest decile as an outlier. It clearly shows you the skew is at the top.

The box shows you where the middle 50% of households sit: between £50 000 and £800 000.

The line in the box shows you the median value of household wealth: £280 000.

The shape of a box plot will vary depending on whether cases tend to be high or low in relation to the median. They show us whether there is more or less variation above or below the median.
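The numbers behind a box plot can be sketched as follows. Two assumptions to flag: the wealth figures are invented for illustration, and the sketch uses the common '1.5 times the interquartile range' convention to flag outliers, which may not be the rule behind the chart in this post. `box_plot_summary` is a hypothetical helper name, and `statistics.quantiles` needs Python 3.8 or later.

```python
from statistics import quantiles

# Hypothetical household wealth in thousands of pounds (invented for illustration).
wealth = [10, 40, 90, 150, 220, 310, 450, 700, 1900]


def box_plot_summary(values):
    """Return the quartiles plus any outliers under the 1.5 x IQR convention."""
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1  # height of the box: the middle 50% of cases
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {"q1": q1, "median": med, "q3": q3, "iqr": iqr, "outliers": outliers}


summary = box_plot_summary(wealth)
print(summary)
```

Here the box runs from 90 to 450 with the median line at 220, and the richest case (1900) lands well above the upper fence, so it would be drawn as a separate outlier point.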

This material is most relevant to the Research Methods module. It might be a little advanced for A-level sociology. You are more likely to need this during a first year university statistical methods course.


## Gender and Subject Choice 2017

Looking at the A level exam entries by gender in 2017, over 90% of people who sat computing were male, compared to only 23% of people who sat sociology A level.

Either click on the above graphic or check out the interactive version here (irritatingly I can’t embed dynamic visuals in a wordpress.com blog, yet!)

These are only selected A-levels, to make the amount of information more manageable…. I didn’t deliberately select it so sociology was the most feminine, but it’s certainly ‘up there’ as one of the most female dominated subjects… must be all that empathizing us sociologists do?!

Talking about empathy, or lack of it, I have to say absolutely no thanks whatsoever to the Joint Council for Qualifications (JCQ), who did not even respond to my email request for a spreadsheet of this A-level data.

The JCQ only make this public data available in the form of a PDF, which makes it less accessible for data manipulation – I had to enter this info by hand, which was a massively inefficient use of my time.

This post is primarily me testing out the capabilities of Tableau (Free data viz software)…. More on subject choice at different levels of education later, as well as analysis of WHY males and females choose different subjects…..

Source:

Joint Council for Qualifications and Enemies of Open Data.

## Outline and explain two reasons why Positivists generally prefer to use quantitative methods (10)

The theory and methods 10 mark question appears as a special treat at the end of paper 1 (Education, Methods in Context and Theory and Methods). You’ll also get a big 30 mark essay question at the end of paper 3 (Crime and Deviance with Theory and Methods), but more about the 30 markers in another blog post.

The reason for splitting the theory and methods questions across two papers is probably to make sure that more students fail the exam, and possibly because the man has a burning hatred of teenagers.  Apparently every A-Level exam has one aspect split across two papers, so at least the hate is evenly distributed, otherwise this might be an example of a ‘hate crime’ against sociology students.

For 10 mark questions it’s good practice to select two very different reasons, which are as far apart from each other as possible. In this question, it’s also good practice to contrast Positivism to Interpretivism (to get analysis marks) and to use as many theory and methods concepts and examples as possible.

The first reason is that Positivists are interested in looking at society as a whole, in order to find out the general laws which shape human action, and numerical data is really the only way we can easily study and compare large groups within society, or do cross national comparisons – qualitative data by contrast is too in-depth and too difficult to compare.

Numerical data allow us to make comparisons easily as once we have social data reduced down to numbers, it is easy to put into graphs and charts and to make comparisons and find correlations, enabling us to see how one thing affects another.

For example, Durkheim famously claimed that the higher the divorce rate, the higher the suicide rate, thus allowing him to theorise that lower levels of social integration lead to higher rates of suicide (because of increased anomie).

The second reason for preferring quantitative methods is that Positivists think it is important to remain detached from the research process, in order to remain objective, or value free.

Quantitative methods allow for a greater level of detachment as the researcher does not have to be directly involved with respondents, meaning that their own personal values are less likely to distort the research process, as might be the case with more qualitative research.

This should be especially true for official statistics, which merely need to be interpreted by researchers, but less true of structured questionnaires, which have to be written by researchers, and may suffer from the imposition problem.

You may need to add in a further layer of development to each of these points!

## YouGov Surveys – What the World Thinks?

The YouGov website is a great source for finding examples of social surveys and results from survey data.

YouGov is a company which collects mainly survey data on a wide range of topics from people all over the world, and publishes its findings on a daily basis.

On their intro page they say ‘YouGov is a community of 4 million people around the world who share their views…. we’re pretty sure it’s the largest daily updated database of people’s habits and opinions in the world’. In addition to the structured survey data, some people also comment on the findings, so you get a more qualitative feel added into the mix.

The data is very easy to access – for example below are YouGov’s latest findings on attitudes towards the children of illegal immigrants:

You can see from the above that, as a nation, we are pretty intolerant of illegal immigrants. Being able to see national attitudes at a glance like this is one of the advantages of survey data.

You can also ‘drill down’ into the data and find correlations between attitudes and politics/ gender/ age and social class. Below we see that older people are less tolerant than younger people:

## The advantages and disadvantages of social surveys

The big strength of this site is that it’s very accessible – you can very easily get some quick ‘facts’ about what people think about a lot of different topics, and you can easily see the correlations between attitudes and other variables such as class and gender.

The information contained in the site is also good for illustrating the limitations of survey data – you don’t really get any depth or explanation of why people hold these views (not even with the comments, because relatively few people comment).

Finally, I really like the fact that you get to see the specific question asked, so you can always bung a particular question, or set of questions on Socrative to check out the reliability with your students!

## Research Methods in Sociology – An Introduction

An introduction to research methods in Sociology covering quantitative, qualitative, primary and secondary data and defining the basic types of research method including social surveys, experiments, interviews, participant observation, ethnography and longitudinal studies.

## Why do social research?

The simple answer is that without it, our knowledge of the social world is limited to our immediate and limited life-experiences. Without some kind of systematic research, we cannot know the answer to even basic questions such as how many people live in the United Kingdom, let alone the answers to more complex questions about why working class children get worse results at school or why the crime rate has been falling every year since 1995.

So the most basic reason for doing social research is to describe the social world around us: To find out what people think and feel about social issues and how these thoughts and feelings vary across social groups and regions. Without research, you simply do not know with any degree of certainty, what is going on in the world.

However, most research has the aim of going beyond mere description. Sociologists typically limit themselves to a specific research topic and conduct research in order to achieve a research aim or sometimes to answer a specific question.

## Subjective and Objective Knowledge in Social Research

Research in Sociology is usually carefully planned, and conducted using well established procedures to ensure that knowledge is objective – where the information gathered reflects what is really ‘out there’ in the social world, rather than ‘subjective’ – where it only reflects the narrow opinions of the researchers. The careful, systematic and rigorous use of research methods is what makes sociological knowledge ‘objective’ rather than ‘subjective’.

Subjective knowledge – is knowledge based purely on the opinions of the individual, reflecting their values and biases, their point of view.

Objective knowledge – is knowledge which is free of the biases, opinions and values of the researcher, it reflects what is really ‘out there’ in the social world.

While most Sociologists believe that we should strive to make our data collection as objective as possible, there are some Sociologists (known as Phenomenologists) who argue that it is not actually possible to collect data which is purely objective – The researcher’s opinions always get in the way of what data is collected and filtered for publication.

## Sources and types of data

In social research, it is usual to distinguish between primary and secondary data, and between qualitative and quantitative data.

Quantitative data refers to information that appears in numerical form, or in the form of statistics.

Qualitative data refers to information that appears in written, visual or audio form, such as transcripts of interviews, newspapers and web sites. (It is possible to analyse qualitative data and display features of it numerically!)

Secondary data is data that has been collected by previous researchers or organisations such as the government. Quantitative sources of secondary data include official government statistics and qualitative sources are very numerous including government reports, newspapers, personal documents such as diaries as well as the staggering amount of audio-visual content available online.

Primary data is data collected first hand by the researcher herself. If a sociologist is conducting her own unique sociological research, she will normally have specific research questions she wants answered and thus tailor her research methods to get the data she wants. The main methods sociologists use to generate primary data include social surveys (normally using questionnaires), interviews, experiments and observations.

## Four main primary research methods

For the purposes of A-level sociology there are four major primary research methods:

• social surveys (typically questionnaires)
• experiments
• interviews
• participant observation

I have also included in this section longitudinal studies and ethnographies/ case studies.

### Social Surveys

Social Surveys – are typically structured questionnaires designed to collect information from large numbers of people in standardised form.

Social Surveys are written in advance by the researcher. They tend to be pre-coded, have a limited number of closed questions, and focus on relatively simple topics. A good example is the UK National Census. Social Surveys can be administered (carried out) in a number of different ways – they might be self-completion (completed by the respondents themselves) or they might take the form of a structured interview on the high street, as is the case with some market research.

### Experiments

Experiments aim to measure as precisely as possible the effect which one variable has on another, aiming to establish cause and effect relationships between variables.

Experiments typically start off with a hypothesis – a theory or explanation made on the basis of limited evidence as a starting point for further investigation. It will typically take the form of a testable statement about the effect which one or more independent variables will have on the dependent variable. A good experiment will be designed in such a way that objective cause and effect relationships can be established, so that the original hypothesis can be verified, or rejected and modified.

There are two types of experiment – laboratory and field experiments – A laboratory experiment takes place in a controlled environment, such as a laboratory, whereas a field experiment takes place in a real-life setting such as a classroom, the work place or even the high street.

### Interviews

Interviews – A method of gathering information by asking questions orally, either face to face or by telephone.

Structured Interviews are basically social surveys which are read out by the researcher – they use pre-set, standardised, typically closed questions. The aim of structured interviews is to produce quantitative data.

Unstructured Interviews, also known as informal interviews, are more like a guided conversation, and typically involve the researcher asking open questions which generate qualitative data. The researcher will start with a general research topic in mind and ask questions in response to the various and differentiated responses the respondents give. Unstructured Interviews are thus a flexible, respondent-led research method.

Semi-Structured Interviews consist of an interview schedule which typically consists of a number of open-ended questions which allow the respondent to give in-depth answers. For example, the researcher might have 10 questions (hence structured) they will ask all respondents, but ask further differentiated (unstructured) questions based on the responses given.

### Participant Observation

Participant Observation involves the researcher joining a group of people, taking an active part in their day to day lives as a member of that group and making in-depth recordings of what she sees.

Participant Observation may be overt, in which case the respondents know that the researcher is conducting sociological research, or covert (undercover), where the respondents are deceived into thinking the researcher is ‘one of them’ and do not know the researcher is conducting research.

### Ethnographies and Case Studies

Ethnographies are in-depth studies of the way of life of a group of people in their natural setting. They are typically long-term and aim for a full (or ‘thick’), multi-layered account of the culture of a group of people. Participant Observation is typically the main method used, but researchers will use all other methods available to get even richer data – such as interviews and analysis of any documents associated with that culture.

Case Studies involve researching a single case or example of something using multiple methods – for example researching one school or factory. An ethnography is simply a very in-depth case study.

### Longitudinal Studies

Longitudinal Studies are studies of a sample of people in which information is collected from the same people at intervals over a long period of time. For example, a researcher might start off in 2015 by getting a sample of 1000 people to fill in a questionnaire, and then go back to the same people in 2020, and again in 2025 to collect further information.

## Secondary Research Methods

The main type of secondary quantitative data which students of A-level sociology need to know about are official statistics, which are data collected by government agencies, usually on a regular basis and include crime statistics, the Census and quantitative schools data such as exam results.

Secondary qualitative data is data which already exists in written or audiovisual form and includes news media, the entire qualitative content of the internet (so blogs and social media data), and more old-school data sources such as diaries, autobiographies and letters.

Sociologists sometimes distinguish between private and public documents, which is a starting point to understanding the enormous variety of data out there!

Secondary data can be a challenge to get your head around because there are so many different types, all with subtly different advantages and disadvantages, and so this particular sub-topic is more likely to demand that you apply your knowledge (rather than just rote learn it) compared to other research methods!

