It is possible to analyse qualitative social media data to reveal social trends in attitudes.
Twitter recently released an analysis of the content of 4 billion tweets made over the past three years, from users based in the United States. (Source)
The fastest growing theme which Twitter users are talking about is ‘creator culture’, with people tweeting about products they create in order to sell to make a living…
They claim that the content of tweets reveals that the U.S. population has become increasingly interested in six major cultural themes over the last 4 years (from 2016 to 2020):
Tweets about ‘Creator Culture’ are up 462% – which includes tweets about creative currency, ‘hustle life’ and connecting through video.
‘One Planet’ tweets are up 285% – this includes tweets on the themes of the ethical self, sustainability, and clean corporations.
Tweets about Well Being are up 225% – tweets about digital monitoring, holistic health and being well together
‘Tech Life’ tweets are up 166% – on the topics of blended realities, future tech and ‘tech angst’.
‘My Identity’ tweets are up 167% – fandom, gender redefined and ‘representing me’ are the main themes here.
Tweets about ‘Everyday Wonders’ are up 161% – a theme which includes DIY spirituality, awe of nature and cosmic fascination.
The 2020 report by Twitter (here) was produced for marketing purposes, but nonetheless reveals what Twitter users are becoming increasingly interested in, and there are no real surprises here.
The report is broken down into several sections which include the nice infographics I’ve put up in this post; there are many more available in the report.
Intuitively I’m not surprised to see any of the above trends emerging from this analysis – I’m sure that as a population as a whole, we are generally more interested in all of the above in 2020, compared to 2016.
The limitations of using Twitter data to reveal cultural trends
There may be a lot of data, but there are possible problems with representativeness – Twitter users tend to be younger and more educated than the wider population. (Source).
There’s also a problem with the motivations behind the data being collected – this was done for marketing purposes, to be useful to companies wishing to advertise on Twitter – so this analysis wouldn’t show any more negative trends which may have been tweeted about.
A limitation of the way this data is published is that we’re not told the raw numbers – so we know how much more a particular trend is being tweeted about in percentages, but we don’t know about the actual numbers. Some of these may have started from a very low base in 2016, in which case a 250% increase in 4 years still wouldn’t be that significant!
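The low-base point is easy to see with a bit of arithmetic. The sketch below uses invented 2016 starting counts (Twitter did not publish raw numbers, so these baselines are purely hypothetical); only the 250% figure comes from the example above.

```python
def apply_percentage_increase(base_count, percent_increase):
    """Return the new count after a given percentage increase."""
    return base_count * (1 + percent_increase / 100)

# Hypothetical 2016 baselines: these are NOT taken from the Twitter report.
hypothetical_baselines = {"niche theme": 1_000, "mainstream theme": 1_000_000}

for label, base in hypothetical_baselines.items():
    new_count = apply_percentage_increase(base, 250)
    extra = new_count - base
    print(f"{label}: {base:,} -> {new_count:,.0f} tweets (+{extra:,.0f})")
```

The same percentage rise adds a few thousand tweets in one case and a few million in the other, which is why a percentage with no base figure tells us so little.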
This analysis paints Twitter as a wholly positive place where people are full of wonder and fascination, and are creative and positive. In reality we all know there’s a darker side to Twitter!
This study contradicts many of the ‘moral panic’ type headlines which suggest a link between heavy social media use and depression. Such headlines tend to be based on studies which look at correlations between indicators of depression and indicators of social media use at the same point in time, which cannot tell us which comes first: the depression or the heavy social media use.
This Canadian study followed a sample of teenagers from 2015 (and university students for 6 years) and surveyed them at intervals using a set of questions designed to measure depression levels and another set designed to measure social media usage and other aspects of screen time.
What they found was that teenage girls who showed signs of depression early on in the study were more likely to have higher rates of social media usage later on, leading to the theory that teenage girls who are depressed may well turn to social media to make themselves feel better.
The study found no relationship between depression and social media use among boys, or among adults of either sex.
This is an interesting research study which really goes to show the advantages of the longitudinal method (researching the same sample at intervals over time) in possibly busting a few myths about the harmful effects of social media!
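To show why measuring the same people at two points in time helps with the ‘which comes first?’ problem, here is a minimal sketch of a lagged comparison. The scores are entirely made up and deliberately contrived (six imaginary participants), the variable names are my own, and this is not the analysis the Canadian researchers ran; it only illustrates the logic of comparing ‘early depression and later use’ with ‘early use and later depression’.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Invented scores for six imaginary participants at two time points.
depression_t1 = [1, 2, 3, 4, 5, 6]
social_media_t1 = [3, 1, 4, 2, 6, 5]
depression_t2 = [3, 4, 3, 2, 4, 2]
social_media_t2 = [2, 3, 4, 5, 6, 7]

# Does early depression go with later social media use?
dep_then_use = pearson(depression_t1, social_media_t2)
# Does early social media use go with later depression?
use_then_dep = pearson(social_media_t1, depression_t2)

print(f"depression (t1) vs social media (t2): {dep_then_use:.2f}")
print(f"social media (t1) vs depression (t2): {use_then_dep:.2f}")
```

A one-off survey could only give a single correlation at one moment; with two time points we can at least see which lagged relationship is stronger, although even this does not prove causation.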
The UK’s Chief Medical Officers are now officially advising parents to ban their children from using phones and other electronic devices in their bedrooms and during meal times. These are two out of nine specific recommendations made in a recent official report entitled:
It’s also a great example of an amazing ‘literature review’… they go through a stack of evidence on social media/ screen time/ internet use effects and ask lots of methods questions about each piece of research to determine whether or not those studies are high/ middle or low quality.
Interestingly the report said that there wasn’t enough available evidence to issue any guidelines on the total amount of time children should spend online or using screens in any one day or week, but that there was sufficient evidence to suggest limiting uses in specific contexts when using them can upset other beneficial activities.
Hence why the report recommends that parents limit their children’s use of phones at the following times:
While crossing the road.
The report also highlighted the fact that parents shouldn’t just assume that their children would be happy with them posting lots of pictures of them online and criticised some parents for ‘oversharing’.
Interestingly the report also highlighted the lack of high quality research into the impact of screen time, and stressed that more research was needed and they called on tech companies to share data to aid research.
Finally, the report also recommended that social media platforms and technology companies sign up to a voluntary code of conduct to protect children online, and hinted at possibly introducing new laws to protect children online.
Relevance to A level sociology
Firstly, the report seems to suggest there is some evidence that increased screen time has made childhood more ‘toxic’, because using screens has been found to disrupt beneficial activities such as sleep and conversation during meal times.
The report seems to be saying the government is powerless to do anything to prevent Corporations from carrying on with their deliberate attempts to get children to spend more time on screens, merely suggesting that they might sign up to a voluntary code of conduct. So this demonstrates the might of the tech TNCs and the weakness of the Nation State.
Instead, the report focuses on ‘lifeworld’ or ‘privatised solutions to public problems’ – in other words, it’s down to the individual parents to regulate their children’s use of screens.
The report also makes it clear that we cannot say ‘a certain amount of screen time is bad’ – there isn’t evidence to back up a particular figure. This isn’t surprising given that there are different ways we can use our screens, so the idea that ‘screen time’ in general is going to be good or bad is maybe a bit ridiculous!
Finally, this is a good example of a late modern response rather than a postmodern response to a social problem – the report doesn’t just say ‘we’re uncertain, do what you like’, it says ‘there is some evidence that specific uses of screens at particular times prevent beneficial activities taking place, thus you should do x/y/z’… i.e. we still have valid knowledge and a clear path of action even in the midst of uncertainty!
‘The Circle’ is a new ‘reality’ show currently airing on Channel 4 in the UK…. It is quite literally a ‘popularity contest’ in which 8 contestants compete over a 3-week period to be the most popular person in ‘The Circle’. The most popular contestant at the end wins £50K.
The rub is that there is no actual face to face interaction: competitors set up a social media profile (this can be anything from a more genuine portrayal of themselves to an outright catfish profile) and interact with all other competitors via a specially designed social media interface, called ‘the circle’.
The Circle is basically like WhatsApp – in addition to the profile, the contestants can have private 1-1 conversations, various ‘wittily named’ group chats, and whole ‘circle chats’. The circle also provides news feeds from the outside world, which competitors are expected to discuss.
Every few days, the competitors rate each other (a five star, Trip Advisor style rating) – the top two or three become ‘influencers’ and get to decide who to ‘block’ from the bottom three….. whoever is blocked gets kicked out and replaced by a new circle member.
Competitors are confined to an apartment room for the duration of the competition and have no contact with the outside world, except for the snippets of news mentioned above.
The programme says of itself that it is…. ‘Timely and provocative [and] will ask questions about modern identity – how we portray ourselves and communicate on social media’…. but does it?
A few sociological observations…
An easy ‘critical starter’ is to focus on just how unrepresentative of the wider UK population the circle contestants are. They are all young (typically in their 20s, with the odd ‘young’ 30 year old), but they do not represent young people in Britain today: nearly without exception the contestants are confident, outgoing, party-types, clearly selected for their ability to ‘entertain’ on camera. Then (OF COURSE?) there’s the fact that most of them are very attractive.
I guess it’s no surprise that all of the contestants are very comfortable interacting via ‘The Circle’, that is, interacting blind (not face to face) and communicating in short, sharp bursts. Sentences of more than 20 words are rare, emojis and hashtags are very much the norm, and so is the practice of ‘leaving someone hanging’ by signing off when they’ve had enough of a private chat.
Interestingly, most of the contestants have chosen to be (more or less) themselves. Only two contestants (out of about a dozen I’ve seen) have gone for a virtual sex-swap, and one more a sexuality swap, everyone else is ‘more or less’ themselves. They know how exhausting it is ‘putting on an act’ for any length of time. In short, there simply aren’t that many catfish!
Alarmingly, the contestants are very comfortable with rating people quantitatively…. they do so, and give their reasons, with relish. And they seem to love it when they come out on top.
The contestants also know this is a game and are comfortable with that fact…. which is why I think parallels with Charlie Brooker’s Black Mirror aren’t justified. It’s not a harbinger of a dystopian future, they know it’s just a bit of fun, even if the experience is stressful.
Ultimately ‘The Presentation of Self In Every Day Life‘ is the most relevant theory to draw on to analyse what’s going on here… clearly these contestants are putting on masks, not only via their Circle social media profiles, but also when they’re acting on camera for the C4 audience – let’s not forget, most of these contestants are media-personality wannabes!
One person (probably among many others) that’s not happy about this is Russell Brand, who pointed out that yet again it’s the marginalised and powerless who are being made to suffer so that the elite can have a ‘jolly nice time’.
He outlines his views in this brief, 5 minute video clip:
One of his suggestions is that Slough Council should hand over one of its buildings to SHOC ‘Slough Homeless Our Concern’, so at least there is some real, tangible, extra support being made available for the homeless in the area.
You can sign an online petition in support of the idea here.
Relevance to A level sociology
I thought this was a cheeky little example to highlight how the marginalised get treated in this country. It also illustrates elements of the social construction of crime, in that ‘homelessness’ becomes more of a problem as the impending wedding approaches.
Also – here we have celebrity Russell Brand, a ‘moral entrepreneur’ spearheading a very specific, niche, social policy campaign (/suggested intervention) via his YouTube channel – there’s something very postmodern about all of this…
OK OK I know this is a shameless ‘plug my online constructed self page’, but I’ve just spent two hours consolidating my social media profiles, so that’s it for today folks!
It’s probably worth noting that this blog is the main ‘hub site’ I use to post stuff, and most of the other sites are just what I use to publicize what I post on this blog – so cycling through all the above sites will give you the most wonderful feeling of an inward-looking cycle of self-referentiality.
Also this is something of an experiment with the ‘contact details’ part of my C.V. which I’m currently trying to reinvent for the 21st century, and that is honestly about as much fun as it sounds!
Websites, social media posts and similar virtual documents are all forms of secondary data, and thus amenable to both quantitative and qualitative content analysis.
There are, however, many difficulties in using web sites as sources for content analysis. Following Scott’s (1990) four criteria for assessing the quality of documents, we need to consider why a web site is constructed in the first place, whether it is there for commercial purposes, and whether it has a political motive.
In addition, we also need to consider the following potential problems of researching web sites:
Finding websites will probably require a search engine, and search engines only ever provide a selection of available web sites on a topic, and the sample they provide will be biased according to the algorithm the engine uses to find its websites. It follows that use of more than one search engine is advisable.
Related to the above point, a search is only as good as the key words the researcher inputs into the search engines, and it could be time consuming to try out all possible words and combinations.
New web sites are continually appearing while old ones disappear. This means that by the time research is published, it may be based on web sites which no longer exist and may not be applicable to the new ones which have emerged.
Similar to the above point, existing web sites are continually being updated.
The analysis of web sites is a new field which is very much in flux. New approaches are being developed at a rapid rate. Some draw on traditional ways of interpreting documents such as discourse analysis and qualitative content analysis, others have been developed specifically in relation to the Web, such as the examination of hyperlinks between websites and their significance.
Most researchers who use documents accept the fact that it can be difficult to determine the population from which they are sampling, and when researching documents online, the speed of development and change of the Web accentuate this problem. The experience of researching documents online can be like trying to hit a moving target that not only moves, but is in a constant state of metamorphosis.
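The key word problem mentioned above gets out of hand quickly. As a rough sketch (the search terms are hypothetical ones I’ve made up for a study of healthy living blogs), this counts every non-empty combination of a small set of key words:

```python
from itertools import combinations

def keyword_queries(keywords):
    """Return every non-empty combination of key words as a query string."""
    queries = []
    for size in range(1, len(keywords) + 1):
        for combo in combinations(keywords, size):
            queries.append(" ".join(combo))
    return queries

# Hypothetical search terms, purely for illustration.
keywords = ["healthy", "living", "blog", "diet", "fitness"]
queries = keyword_queries(keywords)

print(len(queries))   # 2**5 - 1 = 31 combinations
print(queries[:3])
```

Five key words already give 31 possible queries, before allowing for synonyms, word order or running each query through more than one search engine.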
Three examples of content analysis of documents online
Boepple and Thompson (2014) conducted quantitative analysis of 21 ‘healthy living blogs’. Their sampling frame was only blogs which had received an award, and from those, they selected the blogs with the largest number of page views.
They found that content emphasised appearance and disordered messages about food/ nutrition, with five bloggers using very negative language about being fat or overweight and four invoking admiration for being thin. They concluded that these blogs spread messages that are ‘potentially problematic’ for anyone changing their behaviour on the basis of advice contained in them.
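The mechanics of quantitative content analysis can be sketched in a few lines. This is not Boepple and Thompson’s coding scheme: the categories, indicator words and example posts below are all invented, and simple word matching is only a crude stand-in for trained human coders working from an agreed coding schedule.

```python
import re
from collections import Counter

# Hypothetical coding frame: category -> indicator words (not from the study).
CODING_FRAME = {
    "appearance": {"thin", "skinny", "toned"},
    "food_guilt": {"guilty", "cheat", "binge"},
}

def code_post(text):
    """Return the set of categories whose indicator words appear in the text."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return {category for category, indicators in CODING_FRAME.items()
            if words & indicators}

def tally(posts):
    """Count how many posts fall into each category (a post may count in several)."""
    counts = Counter()
    for post in posts:
        counts.update(code_post(post))
    return counts

# Invented example posts, purely to show the mechanics.
posts = [
    "Felt so guilty after my cheat day.",
    "Finally looking toned and thin again!",
    "Went for a long walk in the park.",
]
print(tally(posts))
```

The output is a count of posts per category, which is the kind of figure that lies behind statements like ‘five bloggers used very negative language’.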
Davis et al (2015) conducted an analysis of postings that followed a blog post concerning a cyberbullying suicide by a 15 year old named Amanda Todd. There were 1094 comments, of which 482 contained stories about being bullied: 12% about cyberbullying, 75% about traditional bullying, and the rest a mixture of both.
The research found that the main reason victims of bullying are targeted is because they do not conform in one way or another to society’s mainstream norms and values, with the most common specific reason for bullying being a victim’s physical appearance.
Humphries et al (2014) conducted content analysis on the kinds of personal information disclosed on Twitter. The authors collected an initial sample of users and then searched the friends of this initial sample. In total they collected 101,069 tweets and took a random sample of 2100 tweets from this.
One of their findings was that Twitter users not only share information about themselves, they frequently share information about others too.
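For anyone wondering what ‘took a random sample’ looks like in practice, here is a minimal sketch of simple random sampling without replacement. The tweet IDs are stand-ins, the population figure assumes the number of collected tweets reads as 101,069, and the function name is my own; the authors’ actual procedure isn’t described in enough detail here to reproduce it.

```python
import random

POPULATION_SIZE = 101_069  # tweets collected (assumed reading of the figure)
SAMPLE_SIZE = 2_100        # tweets randomly selected for coding

def draw_sample(tweet_ids, sample_size, seed=None):
    """Draw a simple random sample without replacement."""
    rng = random.Random(seed)
    return rng.sample(tweet_ids, sample_size)

# Stand-in IDs; a real study would sample the collected tweets themselves.
tweet_ids = list(range(POPULATION_SIZE))
sample = draw_sample(tweet_ids, SAMPLE_SIZE, seed=42)

print(len(sample), len(set(sample)))  # same number: no tweet is picked twice
print(f"Sampling fraction: {SAMPLE_SIZE / POPULATION_SIZE:.1%}")
```

Fixing the seed makes the draw repeatable, which helps if other researchers want to check the coding of the same tweets.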
Researching documents online may be challenging, but it is difficult to see how sociologists can avoid it as more and more of our lives are lived out online, so researching documents such as web sites, and especially blogs and social media postings is, I think, very much set to become a growth area in social research.
I just typed in ‘how many likes does it take to be satisfied’ into Google and got the responses below (second picture) – although just as interesting are the auto-complete options which cropped up.
I guess we live in a virtual world where many more people are asking themselves how to get more likes, without asking themselves whether this will make them satisfied or not?
I KNOW there are plenty more ways you can phrase the question, and of course the above responses may be different because of my own search history (although why the Ask Men link came up is beyond me), but intuitively this seems to be an obvious limit to reflexivity in an online age – asking how to get more likes is more common than reflecting on whether this is a worthy goal in the first place.
For some reason I’m reminded of Habermas’ theory of communicative action – and those three basic types of question we can ask of each other (and ourselves) – (1) Is something effective, (2) is something true (i.e. what does it actually mean) and (3) is something good. When it comes to the economy of likes, I guess most people are stuck in that pragmatic domain. When it comes to likes – how many people stop to reflect on what a ‘like’ actually means, and whether seeking more of them is a worthwhile act in itself, and how (more?) many people have just unconsciously based part of their self-esteem on gathering likes and limit themselves to the pragmatic question of how to get more of them?
Now there’s a research agenda to stick in your pipe and smoke!
And of course I do appreciate the irony of the media here.
The middle classes and especially those in creative industries are more likely to be on Twitter, but finding this out is more difficult than you might think, at least according to some recent research:
Who Tweets?: Deriving the Demographic Characteristics of Age, Occupation and Social Class from Twitter User Meta-Data
This post is a brief summary of the methods and findings of the above.
Introduction/ Context/ Big Data
90% of the world’s data has been generated in the past 2 years and the trend is apparently exponential, but the key challenges of harnessing this data (known as the 5Vs: volume, veracity, velocity, variety and value) are not so easily overcome.
The primary criticism of such data is that it is there to be collected and analysed before the question is asked and, because of this, the data required to answer the research question may not be available with important information such as demographic characteristics being absent.
The sheer volume of data and its constant, flowing, locomotive nature provides an opportunity to take the ‘pulse of the world’ every second of the day rather than relying on punctiform and time-consuming terrestrial methods such as surveys. Only 1% of Twitter users in the UK amounts to around 150,000 users. Even a tiny kernel of ‘useful’ data can still amount to a sample bigger than some of the UK’s largest sample surveys
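The 1% figure implies a total, which is quick to check. This is just back-of-envelope arithmetic on the numbers quoted above, and the function name is my own:

```python
def implied_total(subset_size, subset_percent):
    """Total population implied by a subset of known size and percentage share."""
    return subset_size * 100 / subset_percent

uk_twitter_users = implied_total(150_000, 1)
print(f"Implied number of UK Twitter users: {uk_twitter_users:,.0f}")
```

In other words, the claim assumes something like 15 million Twitter users in the UK.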
However, social media data sources are often considered to be ‘data-light’ as there is a paucity of demographic information on individual content producers.
Yet, as Savage and Burrows argue, sociology needs to respond to the emergence of these new data sources and investigate the ways in which they inform us of the social world. One response to this has been the development of using ‘signatures’ in social media as proxies for real world events and individual characteristics
This paper builds on this work conducted at the Collaborative Online Social Media Observatory (COSMOS), through proposing methods and processes for estimating two demographic variables: age and occupation (with associated class).
How Do Twitter Users Vary by Occupation and Social Class – Methods
The researchers used a sample of 32,032 Twitter profiles collected by COSMOS, relying on the entry in the ‘profile’ box to uncover occupation and class background.
They took the occupation with the greatest number of words as the primary occupation, and, if multiple occupations were listed, they took the first occupation as the primary occupation.
They then randomly selected 1,000 of the 32,032 cases to which an occupation had been assigned, and three expert coders visually inspected these 1,000 Twitter profiles to check for inaccuracies and errors.
They found that 241 (so 24%) had been misclassified, with a high level of inter-rater reliability.
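Because the 241 errors come from a sample of 1,000 rather than from all 32,032 profiles, the 24% is itself an estimate. As a rough sketch (this calculation is mine, not the researchers’, and it assumes the 1,000 cases were a simple random sample), a normal-approximation confidence interval looks like this:

```python
from math import sqrt

def proportion_with_ci(errors, n, z=1.96):
    """Observed proportion with an approximate 95% CI (normal approximation)."""
    p = errors / n
    margin = z * sqrt(p * (1 - p) / n)
    return p, p - margin, p + margin

rate, low, high = proportion_with_ci(241, 1000)
print(f"Misclassified: {rate:.1%} (approx. 95% CI {low:.1%} to {high:.1%})")
```

On those assumptions the true misclassification rate for the whole data set is probably somewhere between about 21% and 27%.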
The main problems of identification stemmed from the multiple meanings of many occupation-related words, from confusion with hobbies, and from obscure occupations. For example, people might refer to themselves as a ‘Doctor Who fan’ or a ‘Dancer trapped in a software engineer’s body’.
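It is easy to see how automatic matching goes wrong with a toy version of the rule described above. This is a sketch of my own reading of that rule (the match with the most words wins, and ties go to the occupation mentioned first), using a tiny hypothetical list of occupations; it is not the COSMOS tool itself.

```python
import re

# Hypothetical lookup list; a real tool would use a full occupational classification.
OCCUPATIONS = ["doctor", "dancer", "software engineer", "teacher"]

def primary_occupation(profile_text):
    """Pick the matched occupation with the most words; ties go to the first mentioned."""
    text = profile_text.lower()
    matches = []
    for occupation in OCCUPATIONS:
        found = re.search(r"\b" + re.escape(occupation) + r"\b", text)
        if found:
            matches.append((found.start(), occupation))
    if not matches:
        return None
    matches.sort()  # order of appearance in the profile
    # max() returns the first maximal item, so earlier mentions win ties
    return max(matches, key=lambda m: len(m[1].split()))[1]

print(primary_occupation("Doctor Who fan"))      # doctor (a false positive)
print(primary_occupation("Teacher and dancer"))  # teacher (tie, first mentioned)
print(primary_occupation("Dancer trapped in a software engineer's body"))
```

The matcher has no sense of context, so ‘Doctor Who fan’ is happily coded as a doctor, which is exactly the kind of error the human coders were there to catch.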
So what is the class background of Twitter users?
The table below shows you three different data sets – the class backgrounds as automatically derived from the entire COSMOS sample of profiles, the class backgrounds of the 32,032 sample the researchers used and the class backgrounds of the 1,000 that were visually verified by the three expert coders (for comments on the differences see ‘validity problems’ below).
There is a clear over representation of NS-SEC 2 occupations in the data compared with the general UK population which may be explained by the confusion between occupations and hobbies and/or the use of Twitter to promote oneself or one’s work. NS-SEC 2 is where occupations such as ‘artist’, ‘singer’, ‘coach’, ‘dancer’ and ‘actor’ are located and the utility of the tool for identifying occupation for this group is further exacerbated by the fact that this is by far the most populous group for Twitter users and the largest group in the general UK population by 10% points. Alternatively, if the occupation of these individuals has been correctly classified then we can observe that they are over represented on Twitter by a factor of two when using Census data as a baseline measure.
Occupations such as ‘teacher’, ‘manager’ and ‘councillor’ are not likely to be hobbies but there is an unusually high representation of creative occupations which could also be pursued as leisure interests with 4% of people in the dataset claiming to be an ‘actor’, 3.5% an ‘artist’ and 3.5% a ‘writer’. An alternative explanation is that Twitter is used by people who work in the creative industries as a promotional tool.
Validity problems with the social-class demographics of twitter data
Interestingly, the researchers rejected the idea that people would just outright lie about their occupations, noting that ‘previous research [has] indicated that identity-play and the adoption of alternative personas was often short-lived, with ‘real’ users’ identities becoming dominant in prolonged interactions. The exponential uptake of the Internet, beyond this particular group of early adopters, was accompanied with a shift in the presentation of self online resulting in a reduction in online identity-play’.
The COSMOS engine does automatically identify occupation, but it identifies occupation inaccurately – and the degree of inaccuracy varies with social class background. The researchers note:
‘unmodified occupation identification tool appears to be effective and accurate for NS-SEC groups in which occupational titles are unambiguous such as professions and skilled trades (NS-SEC 1, 3, 4 and 5). Where job titles are less clear or are synonymous with alternative activities (NS-SEC 2, 6 and 7) the requirement for human validation becomes apparent as the context of the occupational term must be taken into account such as the difference between “I’m a dancer in a ballet company” and “I’m a dancer trapped in the body of a software engineer”’.
The researchers note that the next step is to further validate their methodology through establishing the ground-truth via ascertaining the occupation of tweeters through alternative means, such as social surveys (an on-going programme of work for the authors).
In some ways the findings are not surprising – that the middle class professionals and self-employed are over-represented on Twitter, but if we are honest, we don’t know by how much, because of the factors mentioned above. It seems fairly likely that many of the people self-identifying on Twitter as ‘actors’ and so on don’t do this as their main job, but we just can’t assess this using Twitter data alone.
Thus this research is a reminder that hyper reality is not more real than actual reality. In hyper-reality these people are actors, in actual reality, they are frustrated actors. This is an important distinction, and this alone could go some way to explaining why virtual worlds can be so much meaner than real-worlds.
This research also serves as a refreshing reminder of how traditional ‘terrestrial’ methods such as surveys are still required to ascertain the truth of the occupations and social class backgrounds of Twitter users. As it stands, if we left it to algorithms we’d end up with around 24% of people being incorrectly identified, which is a huge margin of error. If we leave these questions up to Twitter, then we are left with a very misleading picture of ‘who tweets’ by social class background.
Having said this, it is quite possible for further rules to be developed and applied to algorithms which could increase the accuracy of automatic demographic data-mining.