What is the difference between interval/ratio, ordinal, nominal and categorical variables? This post answers that question!
Interval/ ratio variables
These are variables where the distances between the categories are identical across the range of categories.
For example, in question 2, the age intervals go up in years, and the distance between the years is the same between every interval.
Interval/ ratio variables are regarded as the highest level of measurement because they permit a wider variety of statistical analyses to be conducted.
There is also a difference between interval and ratio variables: the latter have a fixed zero point. Age, for example, is a ratio variable because zero years means no age at all, whereas temperature in Celsius is only an interval variable, since zero degrees does not mean 'no temperature'.
Ordinal variables
These are variables that can be rank ordered, but the distances between the categories are not equal across the range. For example, in question 6, the periods can be ranked, but the distances between the categories are not equal.
NB if you choose to group an interval variable like age in question 2 into groups (e.g. 20 and under, 21-30, 31-40 and so on) you are converting it into an ordinal variable.
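The point above can be sketched in a few lines of code. This is a minimal illustration, and the band labels and cut-off points are invented for the example rather than taken from the questionnaire itself:

```python
# Grouping an interval variable (age in years) into bands converts it
# into an ordinal variable: the bands can still be rank ordered, but the
# 'distance' between bands is no longer a fixed number of years.
import bisect

# Hypothetical upper bounds for each band; ages above the last bound
# fall into the final, open-ended band.
bounds = [20, 30, 40, 50]
labels = ["20 and under", "21-30", "31-40", "41-50", "51 and over"]

def age_band(age: int) -> str:
    """Return the ordinal band a given age falls into."""
    return labels[bisect.bisect_left(bounds, age)]

ages = [18, 25, 34, 47, 62]
print([age_band(a) for a in ages])
# → ['20 and under', '21-30', '31-40', '41-50', '51 and over']
```

Once the data has been grouped like this, you can no longer recover the exact ages, which is why the conversion only goes one way: interval to ordinal, never back again.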
Nominal or categorical variables
These consist of categories that cannot be rank ordered. For example, in questions 7-9, it is not possible to rank the subjective responses of respondents into an order.
Dichotomous variables
These variables contain data that have only two categories – e.g. 'male' and 'female'. Their relationship to the other types of variable is slightly ambiguous. In the case of question 1, this dichotomous variable is also a categorical variable. However, some dichotomous variables may be ordinal variables, as they have one distinct interval between responses – e.g. a question might ask 'have you ever heard of Karl Marx?' – a yes response could be regarded as higher in rank order than a no response.
Multiple-indicator measures such as Likert scales, strictly speaking, produce ordinal variables; however, many writers argue they can be treated as though they produce interval/ratio variables if they generate a large number of categories.
In fact Bryman and Cramer (2011) make a distinction between ‘true’ interval/ ratio variables and those generated by Likert Scales.
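A short sketch can show why summed Likert items end up with 'a large number of categories'. The items and responses below are invented purely for illustration:

```python
# A single 5-point Likert item has only 5 ordered categories, but summing
# several items yields a composite score with many more possible values,
# which is why some writers treat such scores as quasi-interval/ratio.

# One respondent's (hypothetical) answers to six 5-point Likert items
# (1 = strongly disagree ... 5 = strongly agree).
responses = [4, 5, 3, 4, 2, 5]

score = sum(responses)  # possible range 6..30, i.e. 25 ordered categories
print(score)            # → 23

# Strictly the underlying items remain ordinal, but with 25 possible
# values many researchers treat the scale score as if it were
# interval/ratio, e.g. by averaging it across a sample of respondents.
sample_scores = [23, 18, 27, 21]
mean_score = sum(sample_scores) / len(sample_scores)
print(mean_score)       # → 22.25
```

Bryman and Cramer's distinction between 'true' interval/ratio variables and Likert-generated ones is precisely about this: the arithmetic above is only meaningful if you accept the assumption that the gaps between scale points are equal.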
A flow chart to help define variables
*A nominal variable – aka categorical variable!
This section deals with how different types of question in a questionnaire can be designed to yield different types of variable in the responses from respondents.
If you look at the example of a questionnaire below, you will notice that the information you receive varies by question.
Some of the questions ask for answers in terms of real numbers, such as question 2, which asks 'how old are you?', or questions 4, 5 and 6, which ask students how many hours a day they spend doing sociology class work and homework. These will yield interval variables.
Some of the questions ask for either/ or answers or yes/ no answers and are thus in the form of dichotomies. For example, question 1 asks ‘are you male or female’ and question 10 asks students to respond ‘yes’ or ‘no’ to whether they intend to study sociology at university. These will yield dichotomous variables.
The rest of the questions ask the respondent to select from lists of categories:
The responses to some of these list questions can be rank ordered – for example in question 6, once a day is clearly more than once a month! Responses to these questions will yield ordinal variables.
Some other ‘categorical list’ questions yield responses which cannot be ranked in order – for example it is impossible to say that studying sociology because you find it generally interesting is ranked higher than studying it because it fits in with your career goals. These will yield categorical variables.
These different types of response correspond to the four main types of variable above.
‘This book is about contemporary self-tracking cultures, analysed from a critical sociological perspective. It explores how the practices, meanings, discourses, and technologies associated with self-tracking are the product of broader social, cultural and political processes.’
This summary is really just some extended notes I took on the book as self-tracking and the quantified self are concepts which interest me.
It’s an academic book, written for an academic audience, and probably way beyond most A-level sociology students, but it’s still fascinating, and relevant as the practice of self-tracking is a growing trend.
Definition of self-tracking: ‘monitoring, measuring and recording elements of one’s body and life as a form of self-improvement and self-reflection’, commonly using digital technologies.
Chapter 1 – Know Thyself: Self-Tracking Technologies and Practices
The emergence of self-tracking
Covers the pre-digital origins of the practice, a few examples of some self-tracking obsessives, outlines the self-tracking movement and charts the recent growth and ‘mainstreaming’ of the practice.
Contemporary self-tracking technologies
Provides an overview of the most common areas of social life to which self-tracking is applied – everything from education to emotions and from individual health to the home.
Research on self-tracking
A brief overview of research on self-tracking (going up to 2013-15): most of the studies were conducted by market research companies; there are few academic studies, and most focus on health.
From this research we find that in 2014 fitness bands were the most popular devices, and white middle-class men with high levels of education and technological know-how seem to be the most involved.
Academic research has revealed strong positive views about self-tracking among most self-trackers, with a measure of scepticism about how their personal data might be used. There is also evidence of a strong ethos of self-responsibility (the neoliberal subject).
Chapter 2 – New Hybrid Beings: Theoretical Perspectives
Because self-tracking is a complex process, we should seek to understand it from multiple perspectives. This chapter outlines several theoretical perspectives on self-tracking.
Datafication via digital devices is a fundamental aspect of selfhood today.
People invest digital technologies with meaning, and we need to understand these meanings to understand people’s identities.
Individual human actors should be understood as part of an assemblage that consists of (besides humans), digital devices, software and networks.
Code/ space is another concept that’s been developed to capture the hybridity of human-technological networks
E.g. our objects may govern our access to space (e-tickets).
Draws on actor-network theory.
A concept developed by Nigel Thrift to denote the way that capitalism has shifted from commodifying workers’ physical labour to profiting from the data they generate and upload.
This is in the context of a big data economy, in which there is a lot of money to be made from data-driven insights.
In the age of prosumption, people upload this information for free; this is why social media sites are generally free to use – it is the data that has value.
The four big tech companies need to be taken into consideration: the sheer amount of data they have access to gives them power.
Fluidity is key to metaphors used to describe the digital data economy.
HOWEVER, data can become frozen, or ‘stuck’, if people do not know how to use it.
Data can have a determining influence on people’s life chances
When data is rendered 2D it is frozen.
When data is represented, it is the result of social processes; we need to ask who has made the decision to represent data in particular ways.
Self-tracking and the neo-liberal subject
Foucault’s concepts of selfhood, governmentality via biopolitics and surveillance are especially relevant to understanding the social significance of self-tracking.
In contemporary western societies, the dominant idea is that ‘care of the self’ is an ethical project that the individual is responsible for – the ‘good citizen’ sees the self as a project to be worked on; they don’t expect much from the state or other people in society.
Giddens, Beck and Bauman have focused on how the self has become individualised – society is full of uncertainties, and lots of choices, and it is down to the individual to do the work to make those choices (and take responsibility for making the right choices).
The ‘self’ in today’s society is one which must be constantly re-invented – improved in order to be a success.
There is a dominant discourse of morality surrounding self-improvement – people are expected to do it!
The psy disciplines have become increasingly popular today because they fit this era of self-responsibility.
Despite the focus on the individual, power is still at work through these practices and discourses of the self. They fit in well with neoliberalism, which depends on soft modes of governing rather than hard – the former basically being everyone controlling themselves because they have taken responsibility for themselves and themselves only.
Discourses of self-improvement and the focus on the individual ignore the role of structural factors (class, gender, ethnicity) in shaping people’s lives and the problems they may face during their lives.
Self-tracking fits in with this neoliberal discourse of self-responsibilization.
Cultures of Embodiment
The way we understand our bodies is culturally, socially and historically contingent.
Digital devices offer numerous ways for people to ‘digitise’ their bodies, and thus they are changing the way we think of our bodies.
Digital technologies mean people are starting to think of their bodies visually (the screen body) rather than haptically (to do with touch). Rather than rely on their ‘fleshy’ feelings they rely on the more ‘real’ visually represented data.
Self-tracking practices may be viewed simply as another set of technologies through which individuals seek to control their bodies.
Foucault’s concept of biopower is a useful analytical tool to explore digitised bodies: it emphasises how the body is a site of struggle.
Biopower is subtler than traditional forms of power and control – it focuses on the disciplines of self-management and control.
In the discourse of self-tracking, those who can control their bodies are ‘moral’, those who cannot are deficient.
Theories of boundary maintenance and purity (a la Mary Douglas) are also relevant: and we need to keep in mind that the boundary between the body and the social in digital space is less clear than ever.
Data-tracking technologies render what was previously hidden about our bodies much more visible, and subject to greater control (but by whom?).
NB – much of the way the body is visually represented is quantitative – biometrics are largely quantitative, and this data can be used as a basis for inclusion and exclusion.
‘Critical data studies’ have emerged to challenge the claims that big data is ‘all positive’.
The process of datafication = rendering complex human feelings and relationships into digital data. This typically involves metricization – converting them into numbers.
This makes complex and diverse humans ‘easily comparable’. Such comparability formed the basis of control through normalization in the 19th century, and it seems to be even more central to contemporary strategies of biopower.
Data collected is often quite narrow (e.g. think about education data) and is often used by powerful agencies to control and manipulate people. This is not a neutral process: value judgements lie behind what data is collected and how it is used.
We are entering a world in which biopower and the knowledges which underpin it are increasingly digitised. Such data are frequently presented as neutral and more reliable than individual subjective accounts, and thus as forming a more robust basis for ‘truth claims’.
Datafication offers a late modern promise of rendering messy populations understandable and controllable.
Algorithmic authority is increasingly important in identity construction and governing inclusion to areas of social life.
It is also sometimes difficult to challenge, given that the algorithms are often black-boxed.
Dataveillance and Privacy
Dataveillance = veillance which uses digital technology.
The generation of more data increases the opportunities for monitoring.
Veillance is Lupton’s preferred term – because there are multiple types of watching in society.
Some obvious forms of surveillance include CCTV and passports, but Foucault’s idea of the panopticon is probably the most relevant to understanding veillance today – people take on responsibility for controlling their own actions because they ‘might’ be being watched.
Veillance is extremely pervasive and works across multiple sites simultaneously and can be purposed and repurposed in multiple ways.
It is increasingly used as a means of categorising – often based on risk.
Sousveillance – ‘watching from below’, for example citizens recording the authorities – is increasingly important.
There is no longer a clear spatial boundary between public and private…. Some commentators have even suggested that the internet = the end of privacy.
We need to ask lots of questions about data ownership and usage rights.
Chapter 3 – ‘An Optimal Human Being’: The Body and Self in Self-Tracking Cultures
The reflexive monitoring of the self
Analysis of interviews with two self-trackers reveals a discourse of self-awareness and self-improvement facilitated by self-tracking technology.
The data used is mainly quantitative and individuals seek greater understanding by finding patterns in their lives.
There is always a focus on ‘becoming’ – present data is interpreted in light of a desired future (very goal-oriented).
There is a focus on individual self-knowledge within the movement, which some have viewed as narcissistic.
There is a strong ethic of self-responsibility, and an implication that those who don’t seek to improve their lives through self-tracking are morally incomplete.
Self-tracking selves thus seem to be neoliberal subjects.
The concept of the self fits well with digital entrepreneurialism, especially where the tracking of productivity is concerned.
Representations of embodiment
Metaphors of the body as a machine and specifically as an information processing machine are often employed in self-tracking cultures.
Inputs/ outputs/ performance are all parts of the discourse.
‘I can therefore I am’ is also part of the discourse of selfhood (Lury 1997).
Digital wearable devices are viewed as ‘prosthetics’ (data prosthetics) – enhancing the capacity to act in a similar way to prosthetic limbs; e.g. videos made by life-loggers expand the human capacity to remember.
The prosthetics also extend the body into a network of other bodies…. E.g. through the representation of data in social networks.
It becomes increasingly unclear where the body ends and environmental space (‘out there’) begins (code/space is a new concept to describe this).
The affective dimensions of self-tracking
Self-tracking devices and software and the data they generate are invested with a high degree of personal meaning.
Obviously, the devices themselves, especially phones, matter to us, and the data collected through these devices is part of our lives, part of our biography: it is ‘my data’.
We use these data (images, stats etc.) to ‘present ourselves’ and engage in ‘algorithmic self-promotion’.
NB Even the way we organise our apps has personal meaning.
A more overt affective dimension is where apps actually track our emotions.
The data generated by self-tracking and the responses this gets when presented also generates emotions – from satisfaction to frustration.
Those who do not self-track may be perceived as immoral because of not taking the responsibility to control their lives. (There is a barely hidden discourse of morality in the movement)
Emotions also come into play because devices sometimes measure what they are supposed to measure effectively, and sometimes don’t work at all – this ties people’s emotional states to the robustness of the material devices.
Wearable devices also affect people’s emotional states differently – if they make them feel more self-conscious, this may not be in a good way: some may feel ‘fitter’, others may feel fatter.
Design and fashion also need to be considered – many people won’t wear devices if they don’t look good.
Taking and losing control
Part of the discourse of self-tracking is one of using data to gain greater control over one’s life.
This fits in well with the uncertainty of late modern society – data collection and using it is a means of reducing risk: in terms of poor health or broken relationships for example.
This is most advanced in the sphere of medicine and health, where the concept of the ‘participatory patient’ is well established – many patients are expected to engage in a routine of data collection and monitoring, along with their doctors.
However, this effectively brings the body under surveillance as never before: the technologies used may be talked about as ‘unobtrusive’, but the effect is to foreground the body through the data collected.
Some ex self-trackers report they gave up because data ‘took over’ their lives, drowning out their intuition.
Others reported they gave it up as they found they were only happy when their numbers were trending upwards.
And if you don’t have your device, you might regret it…
Some people also change their habits because of their devices, not necessarily in good ways – eating foods because it fits your diet regime and not actually enjoying the food!
Self-tracking may be a terrible idea for those with OCD or anorexia.
Self-Tracking and Surveillance
Self-tracking and the data generated by it blur the boundary between the public and the private.
Especially when we publish our data on networking sites, our private data becomes public.
The practice of self-tracking is typically done as part of an assemblage – tracking of ‘intimate’ information, displayed in public.
There is a positive side to all of this – gamifying one’s data can be motivational, as can messages of support from others.
We need to consider that some forms of tracking may be imposed from above, and users have little choice over engaging in the practice
Finally, there are the political implications of how our data is stored and used!
Chapter 4 – You Are Your Data: Personal Data, Meanings, Practices and Materialisations
Covers the ways in which self-trackers seek to make sense of, materialise and use their personal information.
The meaning and value of personal digital data
Self-tracking is not only about controlling one’s body and one’s self, but controlling the data generated by self-tracking.
Data assemblages are constantly shifting, and the data drawn upon is context dependent. They are also reflexive and recursive – people may act on the data, and those changes in action change the data.
Even though certain data assemblages may provide a frozen snapshot, the data are liquid entities, constantly shifting, and this requires self-trackers to engage in constant negotiation to make sense of the data and the selves those data represent.
The Quantified Self movement says this is one of its primary purposes – to help people make better sense of the data. As they see it, collecting data is easy, but making sense of it is a life skill which needs practice/training.
There is a sense in which the data is more reliable than gut feeling or memory.
Personal Analytics (according to QS) will help us develop optimal selves often defined as us becoming more efficient/ productive.
There is a ‘big data mindset’ – the idea that we can get new insights from this data that were not previously available – e.g. I can look at my phone and see how stressed I am.
Self-trackers often present themselves as scientists collecting their own data, treating the digitized body as an information-processing system.
The data is often presented as trustworthy, and the body’s perceptions as untrustworthy.
This fits in with a long-held medicalized view of the body; the only differences now are that the emphasis is visual rather than haptic, and the data is available to the layperson, not just the expert.
The data is seen as emblematic of their ‘true selves’.
Metricization and the Lure of Numbers…
Quantification is central to the quantified self discourse.
More and more areas of social life have become quantified in recent years (obviously?)
Although data is presented as neutral, there is a ‘politics’ to quantification.
The rationales of both commerce and government are supported by datafication – publics are rendered manageable by data: BIG DATA allows for people to be managed algorithmically.
‘Commensuration’ is a result of metricization: the process whereby a broad range of previously distinct social phenomena are brought together under one metric – thus the process favours homogeneity over heterogeneity – e.g. the Klout score.
Such metrics create ‘climates of futurity’.
These metrics invariably favour some qualities over others.
Viewing the self through such data/ metrics encourages one to take a scientific/ comparable, and reductionist view of life…
This cuts out the experience of (real?) life as messy/ complex/ contradictory.
Data Spectacles: Materializations of Personal Data
Visualising data is an integral part of the Quantified Self movement. A lot of these data visualizations are very ‘neat’.
Most self-trackers derive pleasure and motivation from seeing their data visualised
They also see the data as ‘more real’ than their own subjective feelings.
Artistic and Design Interventions
Artists/ designers have tried to enhance/ challenge the way self-trackers visualize their data.
FRICKBITS – invited self-trackers to turn their data into art
The ‘Dear Data’ project invited women to physically draw an aspect of their ‘data lives’ once a week.
Lucy Kimbell’s LIX index took data from various aspects of her life, and turned them into one index to criticise self-tracking
Critical making and design fiction aim to combine critical theory and art/ fiction. Their purpose is to envisage alternative futures (that are not necessarily either utopian or dystopian) – to challenge dominant power/ knowledge regimes/ discourses.
These may be messier/ more ambiguous than many of the representations of current data and imagined futures made by self-tracking communities.
Outlines a few projects which have sought to get us thinking about the boundaries between self/machine, and how these are shifting in assemblages.
3D printers are also being used to visualise data, and data is increasingly used as the basis for producing physical objects.
The Importance of Context
There is growing cynicism about the use of numbers in self-tracking, because it is often not clear what numbers mean (e.g. a high heart rate can mean different things) – we thus need to know the context in which the data is collected.
‘Morris’ (blog) is a good example of how context and qualitative detail may be more useful – he took thousands of photos of his daily routine, and on reviewing them said he started to recognise more people on his daily commute, feeling more connected to them.
Presenting one’s self-data to others is an important aspect of the practice; this presentation is contextual and emotional.
Data collected and then presented back might conjure up uncomfortable emotions – e.g. Eric Meyer’s Facebook ‘Year in Review’ experience.
Self-trackers are also self-qualifiers… they use the data to tell stories about themselves.
Chapter 5 – Data’s Capacity for Betrayal: Personal Data Politics
Covers the political dimension of self-tracking data (who stores the data, what they do with that data and how they benefit).
Self-tracking practices generate digital biocapital (value derived from a combination of bodies and data)
The generation and storage of this data now goes beyond the consensual and the personal, and this raises all sorts of questions about who should have access to this data and how it may be used – much of which has been highlighted by the recent Facebook scandal.
Digital biocapital also raises the spectre of governments and corporations being able to algorithmically manipulate people.
Prosumption is a form of work… the value people derive from generating the data is not monetary, but the data is commodified and then has a monetary value… this is exploitation.
Employers data-trawl prospective employees.
Insurance companies are already using predictive algorithms to set premiums
Data is being used in some legal cases.
Pushed and imposed self-tracking
Although self-tracking is usually presented as something voluntary, there are some fields where the practice is used ‘coercively’ – where institutions use self-tracking to ‘nudge’ (often unwilling) participants’ behaviour in a ‘desirable’ direction.
It is mostly in the sphere of health that we find this.
This fits in well with soft power in neoliberal regimes.
One example is insurance companies getting people to upload their health data (also driving).
Another is Corporations offering reduced health insurance packages for employees who enrol in their wellness programmes.
There is a fine line between consensual, pushed and imposed self-tracking.
Personal data security and privacy
Written before GDPR – ‘many companies fail to tell customers how their data will be used’.
Personal information is very sought after by criminal gangs, who can gain access to it at two main points – during data transfer, and when data is stored on online databases.
Survey data show that people are generally OK with their data being used for beneficial purposes but are suspicious of and worried about the use of data by governments and corporations to manipulate people, and of the fact that their data may be used to exclude them.
Communal self-tracking and taking control of personal data
Some in the quantified self movement talk of ‘pooling’ their small data so as to gain big data insights.
(Small data is personal and identifiable; big data is impersonal and anonymous.)
Nafus and Sherman (2014) have theorised that this can be a form of resistance against control of big data by large companies.
A very small pool of experts can create their own means of dealing with their data; most people are dependent on commercial products.
Some self-tracking initiatives encourage collective positive projects – e.g. environmental, collective steps, hours meditated. This could be a new form of digital citizenship moving forwards.
Responses and resistances to dataveillance
Outlines three counter responses…
Selectively recording information (the power of forgetting)
Obfuscation – deliberately generating false data or digital noise.
Making people aware of the sheer amount of data being collected.
More detailed summary: chapter 1 (NB – find points of interest and think of the questions I can ask, to then find further research on – reorganising this!)
Self-tracking cultures have emerged in a sociocultural and political context in which various rationales, discourses, practices and technologies are converging… these include the following:
A self-concept that values self-knowledge and entrepreneurialism
The privileging of quantitative scientific knowledges seen as neutral
A moral imperative to take responsibility for the regulation and tight control of one’s body
Digital technologies which allow the recording of more aspects of life in ever greater detail
A digital data economy which commodifies personal data
Governments and commercial agencies seeking to use data to manipulate behaviours.
The notion of autonomous individualism is central to many self-tracking cultures – the individual is seen as being morally responsible for rationally improving their own well-being. Little account is taken of the role of structural factors (poverty, discrimination) in affecting life chances.
Technologies tend to have been designed by white middle-class men in the global North, and the decisions about what to measure through tech reflect their biases – for example, the Apple Watch does not track menstrual cycles.
At the same time as being reductive, the process of generating self-knowledge is also productive – it is an active process which gives rise to new knowledges, and people use them to ‘improve the self’.
How self-tracking knowledge changes power relations is not clear – prosumption means lay people can track and present data, which challenges the role of the big tech companies. However, producers of data have little control over it once it has been generated and uploaded to social media sites.
Self-tracking practices are now mainstream, and way beyond just in the realms of health and fitness.
Lupton has identified five ‘modes’ of self-tracking: private, communal, pushed, imposed and exploited.
The differences are to do with the extent of consent and the purposes for which data is used.
Data devices are learning more about humans. Some of them already tell us what to do. This makes future assemblages more complex – once the world of the Internet of Things really kicks into gear!
Data literacy is commonly discussed today, but we need to focus more on getting people to think about the power relations between the users of tech, the designers who make them, and the commercial and governmental agencies involved.
There are many new positive uses to which self-tracking might be put, and the closing paragraphs outline some of these – such as ‘empathy’ projects and creative projects.
Useful links to quantitative and qualitative research studies, statistics, researchers, and newspaper articles relevant to gender and education. These links should be of interest to students studying A-level and degree level sociology, as well as anyone with a general interest in the relationship between gender, gender identity, differential educational achievement and differences in subject choice.
Just a few links to kick-start things for now, to be updated gradually over time…
A link to Professor Becky Francis’ research, which focuses mainly on gender differences in educational achievement – at the time of writing (November 2017) her main focus seems to be on girls’ lack of access to science, and on banding and streaming (the latter not necessarily gender-focused).
Specific resources for exploring gender and differential educational achievement
Education as a strategy for international development – despite the fact that girls are outperforming boys in the United Kingdom and most other developed countries, globally girls are underachieving compared to boys in most countries. This link takes you to a general post on education and social development, many of the links explore gender inequality in education.
Specific resources for exploring gender and subject choice
Dolls are for Girls, Lego is for Boys – a Guardian article which summarizes a study by Becky Francis on gender, toys and learning. Francis asked the parents of more than 60 three- to five-year-olds what they perceived to be their child’s favourite toy, and found that while parental choices for boys were characterised by toys that involved action, construction and machinery, there was a tendency to steer girls towards dolls and perceived “feminine” interests, such as hairdressing.
Girls are Logging Off – A BBC article which briefly alerts our attention to the small number of girls opting to do computer science.
Bryman (2016) identifies four criticisms of quantitative research:
Quantitative researchers fail to distinguish people and social institutions from the world of nature
Schutz (1962) is the main critic here.
Schutz and other phenomenologists accuse quantitative social researchers of treating the social world as if it were no different from the natural world. In so doing, quantitative researchers tend to ignore the fact that people interpret the world around them, whereas this capacity for self-reflection cannot be found among the objects of the natural sciences.
The measurement process possesses an artificial and spurious sense of precision and accuracy
Cicourel (1964) is the main critic here.
He argues that the connection between the measures developed by social scientists and the concepts they are supposed to be revealing is assumed rather than real – basically measures and concepts are both effectively ‘made up’ by the researchers, rather than being ‘out there’ in reality.
A further problem is that quantitative researchers assume that everyone who answers a survey interprets the questions in the same way – in reality, this simply may not be the case.
The reliance on instruments and procedures hinders the connection between research and everyday life
This issue relates to the question of ecological validity.
Many methods of quantitative research rely heavily on administering research instruments to participants (such as structured interviews or self-completion questionnaires), or controlling situations to determine effects.
However, these instruments simply do not ‘tap into’ people’s real life experiences – for example, many of the well known lab experiments on the A-level sociology syllabus clearly do not reflect real life, while surveys which ask people about their attitudes towards immigration, or the environment, do not necessarily tell us about how people act towards migrants or the environment on a day to day basis.
The analysis of relationships between variables creates a static view of social life that is independent of people’s lives.
The main critic here is Blumer (1956).
Blumer (1956) argued that studies that seek to bring out the relationships between variables omit ‘the process of interpretation or definition that goes on in human groups’.
This is a combination of criticisms 1 and 3 above, but adds on an additional problem – that in isolating out variables, quantitative research creates an artificial, fixed and frozen social (un)reality – whereas social reality is (really) alive and constantly being created through processes of interaction by its various members.
In other words, the criticism here is that quantitative research is seen as carrying an objective ontology that reifies the social world.
The above criticisms have led interpretivists to prefer more qualitative research methods. However, these too have their limitations!
Quantitative researchers generally have four main preoccupations: they want their research to be measurable, to focus on causation, to be generalisable, and to be replicable.
These preoccupations reflect epistemologically grounded beliefs about what constitutes acceptable knowledge, and can be contrasted with the preoccupations of researchers who prefer a qualitative approach.
It may sound like it’s stating the obvious – but quantitative researchers are primarily interested in collecting numerical data, which means they are essentially concerned with counting social phenomena, which will often require concepts to be operationalised.
In most quantitative research there is a strong concern with explanation: quantitative researchers are more concerned with explaining why things are as they are, rather than merely describing them (description tends to be the focus of more qualitative research).
It follows that it is crucial for quantitative researchers to effectively isolate variables in order to establish causal relationships.
Quantitative researchers tend to want their findings to be representative of wider populations, rather than just the sample involved in the study, thus there is a concern with making sure appropriate sampling techniques are used.
If a study is repeatable then it is possible to check that the original researchers’ own personal biases or characteristics have not influenced the findings: in other words, replication is necessary to test the objectivity of an original piece of research.
Quantitative researchers tend to be keen on making sure studies are repeatable, although most studies are never repeated because there is a lack of status attached to doing so.
Sociomaterial perspectives hold that datafication via digital devices (both personal and public) is fundamentally intertwined with the way we construct our identities and ‘practice selfhood’, so much so that it is more accurate to say that today we ‘live in media’ rather than ‘we live with media’.
The most obvious manifestation of the intertwining of digital technologies, datafication and selfhood is our extensive use of mobile phones, tablets and laptops: not only do we rely on these devices for information, we also use them (sometimes consciously, sometimes not) to continually upload information about ourselves to the net.
And even if we choose to reduce our use of such technologies, or live without them altogether, our sense of self will still be partially governed by digital technology because so much of public life and public space is informed by its use.
Sociomaterial perspectives on human action are strongly influenced by actor-network theory and take our extensive use of digital technologies into account by focussing on the way that humans interact with non-human material objects such as computers in heterogeneous and diverse networks.
This approach sees objects as agents within a network, able to exert influence on humans, and it is interested in how things and meanings interrelate. It also takes account of how factors such as class, gender and ethnicity influence the context of a relational network.
Sociomaterial perspectives also recognize that there is a complex ‘web’ of interaction which lies beyond (or behind) technologically mediated networks: programmers, marketers etc, and (importantly I think) that the technologies and software which govern action within a network are themselves the product of human interactions (and thus values).
This perspective offers a useful response to post-structuralism which focuses purely on discourses and meanings, which are largely seen as floating free from the material context of action.
More specifically the sociomaterial perspective on understanding selfhood in a digital age focuses on:
How people experience technologies
How technologies are incorporated into people’s senses of self, and how they extend their sense of self
How social relations are configured through such networks.
The concept of assemblage is often used in the sociomaterialism literature. An assemblage is configured when humans, nonhumans, practices, ideas and discourses come together in a complex system. With digital systems, an assemblage will consist of the following:
Computer software and hardware
Manufacturers and retailers
Computer servers and archives
The computing cloud
Platforms and social media
According to the sociomaterial perspective, individuals are ‘entangled’ in such assemblages – and understanding these entanglements is a complex business, precisely because the assemblages themselves are complex: there are a lot of human and non-human actors involved.
Within these assemblages, humans can imbue objects (such as their phones) with biographical meaning, and understanding these meanings is key to understanding human action; but humans are also changed by all of the above ‘objects’ (along with the other actual humans) which make up the assemblage in which an individual acts.
Turkle (2007) for example calls mobile devices ‘evocative objects’ because they are basically repositories of ourselves – we have so much information stored on them!
Kitchin and Dodge (2011) use the term code/space to denote the ways in which software and devices such as mobile phones and sensors are configuring concepts of space and identity – our devices may even govern our access to certain spaces (think e-tickets), and because our behaviour can be tracked through them, we can also be nudged, or disciplined, into certain ways of acting via our technologies.
Sources and Notes
This is my summary of part one of chapter two of my current January 2018 read:
Lupton, Deborah (2017) The Quantified Self, Polity
This kind of theory should hit A-level sociology about 2035, about 2 years before the cyborgs take over once and for all.
A Likert* scale is a multiple-indicator or multiple-item measure of a set of attitudes relating to a particular area. The goal of a Likert scale is to measure intensity of feelings about the area in question.
A Likert scale about Likert scales!
In its most common format, the Likert scale consists of a statement (e.g. ‘I love Likert scales’) and then a range of ‘strength of feeling’ options which respondents choose from – in the above example, there are five such options ranging from strongly agree to strongly disagree.
Each respondent’s reply on each item is scored, typically with a high score (5 in the above example) being given for positive feelings and a low score (1 in the above example) for negative feelings.
Once all respondents have completed the questionnaire, the scores from all responses are aggregated to give an overall score, or ‘strength of feeling’ about the issue being measured.
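The scoring and aggregation described above can be sketched in a few lines of Python – the responses here are invented, and the 1-5 scoring follows the example in the text:

```python
# A minimal sketch of Likert scoring, using made-up responses.
# Each respondent rates the statement on a 1-5 scale
# (5 = strongly agree, 1 = strongly disagree).
responses = [5, 4, 4, 2, 5, 3, 1, 4]

total = sum(responses)
average = total / len(responses)

print(f"Aggregate score: {total}")                       # 28
print(f"Mean strength of feeling: {average} out of 5")   # 3.5
```

In real research each respondent would answer several related items, and the item scores would be summed (or averaged) per respondent before aggregating across the sample.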
Some examples of sociological research using Likert scales:
The World Values Survey is my favourite example – they use a simple four point scale to measure happiness. The poll below gives you the exact wording used in the survey…
The results on the website (and below) show you the percentages who answer in each category, but I believe the researchers also give scores to each response (4 to 1), do the same for similar questions, then combine the scores to eventually come up with a happiness rating for a country out of 10. I think the USA scores around 7.2 or something like that, it might be more! Look it up if you’re interested….
Important points to remember about Likert scales
The items must be statements, not questions.
The items must all relate to the same object being measured (e.g. happiness, strength of religious belief)
The items that make up the scale should be interrelated so as to ensure internal reliability is strong.
*The Likert Scale is named after Rensis Likert, who developed the method.
Within sociology, one might even say that there’s a more ‘fundamental’ layer of concepts that lie behind the above – such as ‘society’, ‘culture’ and ‘socialization‘, even ‘sociology’ itself is a concept, as are ‘research’ and ‘knowledge’.
Concepts also include some really ‘obvious’ aspects of social life such as ‘family’, ‘childhood’, ‘religious belief’, ‘educational achievement’ and ‘crime’. Basically, anything that can be said to be ‘socially constructed’ is a concept.
Each concept basically represents a label that researchers give to elements of the social world that strike them as significant. Bulmer (1984) suggests that concepts are ‘categories for the organisation of ideas and observations’.
Concepts and their measurement in quantitative research
If a concept is to be employed in quantitative research, a measure will have to be developed for it so it can be quantified.
Once they have been converted into measures, concepts can then take the form of independent or dependent variables. In other words, concepts may provide an explanation of a certain aspect of the social world, or they may stand for things we want to explain. A concept such as educational achievement may be used in either capacity – we may explore it as a dependent variable (why do some achieve fewer GCSE passes than others?) or as an independent variable (how do GCSE results affect future earnings?).
Measures also make it easier to compare educational achievement over time and across countries.
As we start to investigate such issues we are likely to formulate theories to help us understand why, for example, educational achievement varies between countries or over time.
This will in turn generate new concepts, as we try to refine our understanding of variations in educational achievement.
Why Measure Concepts?
It allows us to find small differences between individuals – it is usually easy to spot large differences, for example between the richest 0.1% and the poorest 10%, but smaller ones can often only be seen by measuring more precisely – so if we want to see the differences within the poorest 10%, we need precise measurements of income (for example).
Measurement gives us a consistent device, or yardstick for making such distinctions – a measurement device allows us to achieve consistency over time, and thus make historical comparisons, and with other researchers, who can replicate our research using the same measures. This relates to reliability.
Measurement allows for more precise estimates to be made about the correlation between independent and dependent variables.
Indicators in Quantitative Social Research
Because most concepts are not directly observable in quantitative form (i.e. they do not already appear in society in numerical form), sociologists need to devise ‘indicators’ to measure most sociological concepts. An indicator is something that stands for a concept and enables (in quantitative research at least) a sociologist to measure that concept.
We might use ‘Average GCSE score’ as an indicator to measure ‘educational achievement’.
We might use the number of social connections an individual has to society to measure ‘social integration’, much like Hirschi did in his ‘bonds of attachment theory‘.
We might use the number of barriers women face compared to men in politics and education to measure ‘Patriarchy’ in society.
NB – there is often disagreement within sociology as to the correct indicators to use to measure concepts – before doing research you should be clear about which indicators you are using to measure your concepts, why you are choosing these particular indicators, and be prepared for others to criticize your choice of indicators.
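To illustrate how an indicator turns a concept into numbers, here is a hypothetical Python sketch of an ‘average GCSE score’ indicator – the grade-to-points mapping is invented for the example, not an official scheme:

```python
# Hypothetical sketch: operationalising 'educational achievement'
# via an 'average GCSE score' indicator. The points mapping is
# illustrative only, not the official grading scheme.
points = {"A*": 8, "A": 7, "B": 6, "C": 5, "D": 4, "E": 3, "F": 2, "G": 1}

def average_gcse_score(grades):
    """Convert a pupil's letter grades into a single numeric indicator."""
    return sum(points[g] for g in grades) / len(grades)

pupil = ["A", "B", "B", "C", "A*"]
print(average_gcse_score(pupil))  # (7+6+6+5+8)/5 = 6.4
```

Once every pupil has a score like this, the concept ‘educational achievement’ can be compared across individuals, schools or years.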
Direct and Indirect indicators
Direct indicators are ones which are closely related to the concept being measured. In the examples above, it’s probably fair to say that average GCSE score is more directly related to ‘educational achievement’ than ‘bonds of attachment’ are to ‘social integration’, mainly because the latter is more abstract.
How sociologists devise indicators:
There are a number of ways indicators can be devised:
through a questionnaire
through recording behaviour
through official statistics
through content analysis of documents.
Using multiple-indicator measures
It is often useful to use multiple indicators to measure concepts. The advantages of doing so are three fold:
there are often many dimensions to a concept – for example, to accurately tap ‘religious belief’, questionnaires often include questions on attitudes and beliefs about ‘God’, ‘the afterlife’ and ‘the spirit’, as well as practices such as church attendance. Generally speaking, the more complex the concept, the more indicators are required to measure it accurately.
Some people may not understand some of the questions in a questionnaire, so using multiple questions makes misunderstanding less likely.
It enables us to make more nuanced distinctions between respondents.
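The internal reliability of a multiple-indicator measure is commonly checked with Cronbach’s alpha – a statistic not named in the text above, added here as one standard option. A minimal sketch with invented Likert data:

```python
# Sketch: checking internal reliability of a multiple-indicator
# measure with Cronbach's alpha. Data are invented for illustration:
# rows = respondents, columns = three Likert items on the same concept.
from statistics import pvariance

scores = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [3, 3, 2],
    [4, 4, 4],
]

k = len(scores[0])                                    # number of items
item_vars = [pvariance(col) for col in zip(*scores)]  # variance of each item
total_var = pvariance([sum(row) for row in scores])   # variance of summed scores

alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)
print(f"Cronbach's alpha = {alpha:.2f}")  # ~0.90 here
```

Values above roughly 0.7 are conventionally taken to indicate that the items hang together well enough to be treated as one scale.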
Measuring the effectiveness of measures in quantitative social research
It is crucial that indicators provide both a valid and reliable measurement of the concepts under investigation.
Big data will change the nature of social research – more data will do away with the need for sampling (and eradicate the biases that emerge with sampling); big data analysis will be messier, but this will lead to more insights and allow for greater depth of analysis; and finally it will move us away from a limiting hypothesis-led search for causality, to non-causal analysis based on correlation.
At least according to Mayer-Schönberger and Cukier (2017) Big Data: The Essential Guide to Work, Life and Learning in the Age of Insight.
Below I summarise how they think big data will change social research:
The ability to collect and analyse large amounts of data in real time has many advantages:
It does away with the need for sampling, and all the problems that can emerge with biased sampling.
More data enables us to make accurate predictions down to smaller levels – as with the case of Google’s flu predictions being able to predict the spread of flu on a city by city basis across the USA.
It enables us to use outliers to spot interesting trends – for example credit card companies can use it to detect fraud if too many transactions for a particular type of card originate in one particular area.
When we use all the data, we are more likely to find things which we never expected to find…
Cukier uses Steven Levitt’s analysis of all the data from 11 years’ worth of Sumo bouts as a good example of the interesting insights to be gained through big data analysis.
A suitable analogy for big data may be the Lytro camera, which captures not just a single plane of light, as with conventional cameras, but rays from the entire light field… the photographer decides later on which element of light to focus on in the digital file…. And he can reuse the same information in different ways.
One of the areas that is most dramatically being shaken up by big data is the social sciences, which have traditionally made use of sampling techniques. This monopoly is likely to be broken by big data firms and the old biases associated with sampling should disappear.
Albert-László Barabási examined social networks using logs of mobile phones from about one fifth of an unidentified European country’s population – the first analysis done on networks at the societal level using a dataset in the spirit of n = all. They found something unusual – if one removes people with lots of close links in the local area, the societal network remains intact, but if one removes people with links outside their community, the social network degrades.
All other things being equal, big data is ‘messier’ than small data – because the more data you collect, the higher the chance that some of it will be inaccurate. However, the aggregate of all the data should provide more breadth and frequency of data than smaller data sets.
Cukier uses the analogy of measuring temperature in a vineyard to illustrate this – if we have just one temperature gauge, we have to make sure it is working perfectly, but if we have a thousand, we will have more errors yet a much wider breadth of data; and if we take measurements with greater frequency, we will have a more sensitive measurement of changes over time.
When using big data, analysts are generally happy sacrificing some accuracy for knowing the general trend – in the big data world, it is OK if 2+2 = 3.9.
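The vineyard analogy can be simulated: readings from many individually messy gauges still recover the true temperature in aggregate. This sketch uses simulated values and a fixed random seed so the run is repeatable:

```python
# Sketch of the vineyard analogy: many cheap, noisy gauges vs one
# accurate one. With enough noisy readings, the aggregate still
# recovers the true temperature. All values are simulated.
import random

random.seed(42)
true_temp = 18.0

# 1,000 cheap gauges, each off by random noise (std dev ~2 degrees)
noisy_readings = [true_temp + random.gauss(0, 2) for _ in range(1000)]
estimate = sum(noisy_readings) / len(noisy_readings)

print(f"True temperature: {true_temp}")
print(f"Estimate from 1,000 messy readings: {estimate:.2f}")
```

The individual errors largely cancel out, which is the intuition behind accepting messiness in exchange for scale.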
More data is sometimes all we need for 100% accuracy, for example chess games with fewer than 6 pieces on the board have all been mapped out in their entirety, thus a human will never be able to beat a computer again once this point has been reached.
The fact that messiness doesn’t matter that much is evidenced in Google’s success with its translation software – Google employed a relatively simple algorithm but fed it trillions of words from across the internet – all of the messy data it could find – this proves that simple models and a lot of data trump smart models and less data.
We see messiness in action all over the internet – it lies in ‘tagging’ and likes being rounded up – none of this is precise, but it works, it provides us with usable information.
Ultimately big data means we are going to have to become happier with uncertainty.
It might be hard to fathom today, but when Amazon started up it actually employed book critics and editors to write reviews of books and make recommendations to customers.
Then the CEO Jeff Bezos had the idea of making specific recommendations to customers based on their individual shopping preferences and employed someone called Greg Linden to develop a recommendation system – in 1998 he and his colleagues applied for a patent on ‘item-to-item’ collaborative filtering – which allowed Amazon to look for relationships between products.
As a result, Amazon’s sales shot up, they sacked the human advisors, and today about a third of all its sales are based on its recommendation systems. Amazon was an early adopter of big data analytics to drive up sales, and today many other companies such as Netflix also use it as one of the primary methods to keep profits rolling in.
These companies don’t need to know why consumers like the products that they do, knowing that there’s a relationship between the products people like is enough to drive up sales.
Predictions and Predilections
In the big data world, correlations really shine – we can use them to gain more insights extremely rapidly.
At its core, a correlation quantifies the statistical relationship between two data values. A strong correlation means that when one of the data values changes, the other is highly likely to change as well.
Correlations let us analyse a phenomenon not by shedding light on its inner workings, but by identifying a useful proxy for it.
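The statistical relationship described here is usually quantified as Pearson’s correlation coefficient (r). A sketch computed from first principles, using invented data on revision hours and test scores:

```python
# Sketch: computing Pearson's r by hand.
# Hypothetical data: hours of revision vs test score for six students.
import math

revision = [2, 4, 5, 7, 8, 10]
score = [50, 55, 60, 65, 72, 80]

n = len(revision)
mean_x = sum(revision) / n
mean_y = sum(score) / n

cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(revision, score))
sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in revision))
sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in score))

r = cov / (sd_x * sd_y)
print(f"r = {r:.2f}")  # close to +1: the two values rise together
```

An r near +1 or -1 signals a strong linear relationship; an r near 0 signals none. Note that even a strong r says nothing about *why* the two values move together.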
In the small data age, researchers needed to use hypotheses to select one or a handful of proxies to analyse, and hence hard statistical evidence on the relationship between variables was collected quite slowly; with the increase in computational power we don’t need hypothesis-driven analysis, we can simply analyse billions of data points and ‘stumble upon’ correlations.
In the big-data age we can use a data-driven approach to collecting data, and our results should be less biased and more accurate, and we should also be able to get them faster.
One example of where this data-driven approach has yielded strong big data correlations is the case of Google’s flu predictions. We didn’t need to know which flu search terms were the best proxy for ‘people with flu symptoms’; the data simply showed us which search terms were the best proxies.
With correlations there is no certainty, only probability, but this can still provide us with actionable data, as with the case of Amazon above, and there are many other examples of where data driven big data analytics are changing our lives. (p56)
We can use correlations to predict the future – for example, Wal-Mart noticed a correlation between hurricanes and flashlight sales, but also Pop-Tarts, so when a hurricane is predicted, it moves the Pop-Tarts to the front of the store and further boosts their sales.
Probably the most notorious use of big data correlations to make predictions is the American discount retailer, Target, who use their data on the products women buy as a proxy for pregnancy – women tend to buy non scented body lotions around the third month of pregnancy and then various vitamin supplements around the 6 month mark – big data even allows predictions about the approximate birth date to be made!
Finding proxies in social contexts is only one way that big-data techniques are being employed – another use is through ‘predictive analytics’, which aims to foresee events before they happen.
One example of predictive analytics is the shipping company UPS using them to monitor its fleet of tens of thousands of vehicles – replacing parts just before they wear out, saving millions of dollars.
Another use is in health care – one piece of research by Dr Carolyn McGregor, with IBM, used 16 different data streams to track the stats of premature babies – and found that there was a correlation between certain stats and an infection occurring 24 hours later. Interestingly, this research found that an infant’s stability was a predictor of a forthcoming infection, which flew in the face of convention – again we don’t know why this is, but the correlation was there.
Illusions and Illuminations
Big data also makes it easier to find more complex, non-linear relationships than when working within a hypothesis-limiting small data paradigm.
One example of a non-linear relationship uncovered by big data analysis is that between income and happiness – happiness increases with income up until about $30K per year, but then it levels out: once we have ‘enough’, adding on more money doesn’t make us any happier…
Big data also opens up more possibilities for exploring networks – by analyzing how ideas spread through the nodes of networks such as Facebook, for example.
In network analysis, it is very difficult to attribute causality, because everything is connected to everything else, and big data analysis is typically non-causal, just looking for correlations not ‘causation’.
Does big data mean the end of theory?
In 2008 Wired magazine’s chief editor argued that in the ‘Petabyte age’ we would be able to do away with theory – that correlation would be enough for us to understand reality – citing as examples Google’s search engine and gene sequencing – where simply huge amounts of data and applied mathematics replace every other tool that might be brought to bear.
However, this view is problematic because big data is itself founded on theory – it employs mathematical and statistical theories for example, and humans still select data, or at least the tools which select data, which in turn are often driven by convenience and economic concerns.
Having said that, Big Data does potentially move us away from theory and closer to empiricism than in the small data age.
Quantitative research is a strategy which involves the collection of numerical data, a deductive view of the relationship between theory and research, a preference for a natural science approach (and for positivism in particular), and an objectivist conception of social reality.
It is important to note that quantitative research thus means more than the quantification of aspects of social life, it also has a distinctive epistemological and ontological position which distinguishes it from more qualitative research.
An ideal-typical outline of the stages of quantitative research:
The fact that quantitative research starts off with theory signifies the broadly deductive approach to the relationship between theory and research in this tradition. The sociological theory most closely associated with this approach is Functionalism, which is a development of the positivist origins of sociology.
It is common for outlines of the main steps of quantitative research to suggest that a hypothesis is deduced from the theory and then tested.
However, a great deal of quantitative research does not entail the specification of a hypothesis, and instead theory acts loosely as a set of concerns in relation to which the social researcher collects data. The specification of hypotheses to be tested is particularly likely to be found in experimental research, but is often found as well in survey research, which is usually based on cross-sectional design.
3. Research design
The next step entails the selection of a research design which has implications for a variety of issues, such as the external validity of findings and researchers’ ability to impute causality to their findings.
4. Operationalising concepts
Operationalising concepts is the process whereby the researcher devises measures of the concepts she wishes to investigate. This typically involves breaking down abstract sociological concepts into more specific measures which can be easily understood by respondents. For example, ‘social class’ can be operationalised into ‘occupation’, and ‘strength of religious belief’ can be measured by using a range of questions about ‘ideas about God’ and ‘attendance at religious services’.
5. Selection of a research site or sites
With laboratory experiments, the site will already be established, in field experiments, this will involve the selection of a field-site or sites, such as a school or factory, while with survey research, site-selection may be more varied. Practical and ethical factors will be a limiting factor in choice of research sites.
6. Selection of respondents
Step six involves ‘choosing a sample of participants’ to take part in the study – which can involve any number of sampling techniques, depending on the hypothesis, and practical and ethical factors. If the hypothesis requires comparison between two different groups (men and women for example), then the sample should reflect this.
Step six may well precede step five – if you just wish to research ‘the extent of teacher labelling in schools in London’, then you’re pretty much limited to finding schools in London as your research site(s).
7. Data collection
Step seven, is what most people probably think of as ‘doing research’. In experimental research this is likely to involve pre-testing respondents, manipulating the independent variable for the experimental group and then post-testing respondents. In cross-sectional research using surveys, this will involve interviewing the sample members by structured-interview or using a pre-coded questionnaire. For observational research this will involve watching the setting and behaviour of people and then assigning categories to each element of behaviour.
8. Processing data
This means transforming information which has been collected into ‘data’. With some information this is a straightforward process – for example, variables such as ‘age’, or ‘income’ are already numeric.
Other information might need to be ‘coded’ – or transformed into numbers so that it can be analysed. Codes act as tags that are placed on data about people which allow the information to be processed by a computer.
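A minimal sketch of coding in practice – the coding scheme is invented for illustration (9 for ‘don’t know’ is a common survey convention, but the numbers here are arbitrary):

```python
# Sketch: 'coding' raw questionnaire answers into numbers so they
# can be analysed by computer. The coding scheme is illustrative.
code_book = {"yes": 1, "no": 0, "don't know": 9}

raw_answers = ["yes", "no", "yes", "don't know", "yes"]
coded = [code_book[a] for a in raw_answers]

print(coded)  # [1, 0, 1, 9, 1]
```

The code book (which number stands for which answer) must be recorded and published alongside the data, so that other researchers can interpret and replicate the analysis.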
9. Data analysis
In step nine, analysing data, the researcher uses a number of statistical techniques to look for significant correlations between variables, to see if one variable has a significant effect on another variable.
The simplest type of technique is to organise the relationship between variables into graphs, pie charts and bar charts, which give an immediate ‘intuitive’ visual impression of whether there is a significant relationship, and such tools are also vital for presenting the results of one’s quantitative data analysis to others.
In order for quantitative research to be taken seriously, analysis needs to use a number of accepted statistical techniques, such as the chi-squared test, to test whether there is a relationship between variables. This is precisely the bit that many sociology students will hate, but it has become much more commonplace in the age of big data!
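As a sketch of what such a test involves, here is a chi-squared test of independence on an invented 2x2 table (gender by whether a pupil chose a science A-level), computed from first principles so no statistics library is needed:

```python
# Sketch: chi-squared test of independence on an invented 2x2 table.
# Rows: boys / girls; columns: chose science / did not choose science.
observed = [
    [30, 20],  # boys
    [15, 35],  # girls
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # expected count if gender and subject choice were independent
        expected = row_totals[i] * col_totals[j] / grand
        chi2 += (obs - expected) ** 2 / expected

print(f"chi-squared = {chi2:.2f}")
# Compare against the critical value for 1 degree of freedom
# (3.84 at p = 0.05): a larger statistic suggests the two
# variables are related.
```

In practice researchers use a statistics package (e.g. SPSS or scipy) rather than computing this by hand, but the logic is the same: compare observed counts with what independence would predict.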
10. Findings and conclusions
On the basis of the analysis of the data, the researcher must interpret the results of the analysis. It is at this stage that the findings will emerge: if there is a hypothesis, is it supported? What are the implications of the findings for the theoretical ideas that formed the background of the research?
11. Writing up Findings
Finally, in stage 11, the research must be written up. The researcher will be writing for either an academic audience or a client, but either way, the write-up must convince the audience that the research process has been robust, that the data are as valid, reliable and representative as they need to be for the research purposes, and that the findings are important in the context of already existing research.
Once the findings have been published, they become part of the stock of knowledge (or ‘theory’ in the loose sense of the word) in their domain. Thus, there is a feedback loop from step eleven back up to step one.
The presence of an element of both deductivism (step two) and inductivism is indicative of the positivist foundations of quantitative research.