In 2006, ‘You’ were Time magazine’s person of the year.1 ‘You’, who wrote Wikipedia, populated Facebook and kept the stream of tweets flowing. Billions upon billions of times over, through those apps, blogs, forums and platforms of a new ‘social’ media, You wrested control of the Information Age. Welcome to your world.
But as we emerge, blinking, into this new world we’re realising that none of us understand it very well. Are our brains being re-wired? Are we turned off by the drab reality of the offline world? Why are we sharing more and more things about our lives? What is privacy anymore; do we even have private lives? The debates continue.
Social media-savvy groups like Beppe Grillo’s Five Star Movement and the Pirate Parties across Europe have begun to take on the political establishment. Online currencies like Bitcoin are charging down banks, bankers and the world of ‘fiat’ commerce. Whether piracy, intellectual property, misogyny or libel, online and offline norms have begun to clash. We’re facing up to a new social reality that is in many respects very different to what came before.
Yet, underneath this tangle of noisy and visible social change, another, quieter revolution has taken place. Social media has also come bearing a promise: that it can help us understand ourselves better than ever before. It was on this promise that CASM was formed.
Social media meets big data
Social media simply captures more information about social life than ever before. As we move onto digital-social worlds, we take our cultural, social, political and intellectual activity with us, now in the form of Facebook likes, Instagram posts, Twitter tweets and so on. All this activity leaves a digital trace. Behaviour that was naturally and normally lost to the past is now captured and stored.
This all makes social media an unmatched resource for research. It is the ‘datafication’ of social life. For the first time, we can see society-in-motion, a digital visage of millions of people doing all the things they have always done: joking, arguing, insulting, gossiping, falling in love. Taken together, this represents a new digital commons of enormous size and wealth.
This explosion of information is not just happening on social media. There is a wider story here: ‘big data’. Around 1,200 exabytes of data now exist, and 90 per cent of it didn’t exist two years ago. Data increasingly surrounds all of our lives. Your supermarket, your insurer, your doctor all collect more information than they ever have before.
They are doing so because data has become more useful. Bigger computers, more powerful algorithms and better ways of visualising it have all in the last few years come together to allow us to turn huge clouds of complex data into insight.
Thanks to big data, we are getting better at predicting the weather, spotting infectious pandemics and even playing the stock markets. If you want to design a chair, you measure thousands upon thousands of different data-points of stress and pressure. If you want to know why an engine keeps breaking down, you take as many measures of temperature as you can, many times a second, at every single point of every moving mechanism of the engine.
The door is now open for social research to take a great leap forwards. The emergence of social media is a transformative moment in our long history of trying to study ourselves.
The challenge of social media research
We’re not there yet. The ways that we have of making sense out of society – polls, focus groups, and surveys – simply can’t cope with social media. They are swamped by the sheer volume of data that is now created; the 500 million tweets a day, the 1.13 trillion Facebook likes since it began. Facebook alone is much bigger than the entire internet was a decade ago. And it’s not just the size of the data, but also the speed. Social media information often appears in real-time, while most social scientific methods work with a delay of days, weeks, or even longer.
To cope with these new kinds of data, we need new big data techniques: computer systems to marshal the deluges of data, algorithms to shape and mould the data as we want, and ways of visualising the data to turn complexity into sense.
Yet people are a frustratingly difficult thing to research; often self-contradictory and multi-layered with deep, hidden motivations and urges. The fundamental challenge is this: while normal social research can’t handle complex data, the methods that can handle it can’t yet handle the complexity of people.
So the current big data tools aren’t up to the job. While a brash, young, ‘social media analytics’ field has grown, it hasn’t moved us much closer to any insight. The numbers are getting bigger and bigger, and the analysis shallower and shallower. They can measure the easy things – likes, retweets, followers – but we don’t really know what they mean for people. Attempts to measure sentiment – ‘positive’, ‘negative’ and ‘neutral’ – are better at producing pretty graphs than serious, rigorous research.
This kind of research may be filling up academic journals, but it is leaving think-tankers, civil servants and politicians shrugging their shoulders. They can’t trust it; they can’t use it.
The Centre for the Analysis of Social Media (CASM)
The ambition of CASM (and, it should be said, of others too) is to unlock all the insight that social media now holds. We see the key to be a fully-fledged new discipline: social media science. New methods need to be built by technologists and social scientists working intensively together to produce ethical, robust insight that can practically change minds and influence decisions.
Two examples of how we are doing this: attitudes and prediction.
Attitudes. Since its birth twenty years ago, Demos has tried to include people’s voices in policy-making, and sought new ways to find them. Demos researchers have gone to hairdressers, built an urban beach, carried around a Domesday book, all in the search for what people really think.
Brits now spend 62 million hours a day on social media: that’s an hour a day for every adult and child in the UK. They do so on a number of platforms of course, but one of the fastest growing and most exciting is Twitter. Its 200 million active users worldwide post over 500 million ‘micro-blogs’, or tweets, a day. Tweets are short, snappy messages or updates of no more than 140 characters in length.
People turn to it to tweet about things that they have otherwise heard about in their daily lives. Major events – whether controversies, disasters or court cases – are now accompanied by a surging cloud of reaction on Twitter, a kaleidoscopic deluge of digital commentary, arguments, discussions, questions and answers. These ‘twitcidents’ are becoming a routine aftermath to events, a way that society reacts to and annotates the events it experiences. Indeed, they are becoming an important dimension of the events themselves.
Untangling these clouds of reaction is potentially very valuable. Doing so gives us a real-time picture of a society as it argues with itself about those experiences and points of controversy people find most important. Alongside surveys and polls, we can now have a real-time window into what people care about and are talking about.
For over a year, we have been building technology to make sense of this deluge and develop an understanding of people’s views on the important events that they talk about. The fruits of this work – Vox Digitas: Understanding digital voices – will be published early this year.
The technology works using ‘natural language processing’ – a subfield of artificial intelligence and linguistics – through which algorithms are taught to automatically recognise whether tweets are relevant to one of the issues, whether they contain an attitude and (most ambitiously) what this attitude is. Each of these taught algorithms – 63 in total – acts like a sluice gate, controlling the flow of some tweets into deeper levels of the architecture, and siphoning off the rest into a waste pool.
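The sluice-gate idea can be sketched in a few lines of Python. This is only an illustration of the general shape, not the actual CASM system: the function names are hypothetical, the example tweets are invented, and simple keyword checks stand in for the trained classifiers.

```python
from typing import Callable, List, Tuple

# A "gate" takes the text of a tweet and decides whether it flows onwards.
Gate = Callable[[str], bool]


def is_relevant(text: str) -> bool:
    # Assumption: a trained relevance classifier would sit here.
    words = text.lower().split()
    return "summit" in words or "eu" in words


def has_attitude(text: str) -> bool:
    # Assumption: a trained attitude detector would sit here.
    words = set(text.lower().split())
    return bool(words & {"hope", "fear", "angry", "glad", "worried"})


def classify_attitude(text: str) -> str:
    # Assumption: the final, most ambitious classifier would sit here.
    words = set(text.lower().split())
    return "optimistic" if words & {"hope", "glad"} else "pessimistic"


def run_cascade(tweets: List[str], gates: List[Gate]) -> Tuple[List[str], List[str]]:
    """Pass each tweet through the gates in order; any failure discards it."""
    kept, discarded = [], []
    for tweet in tweets:
        if all(gate(tweet) for gate in gates):
            kept.append(tweet)
        else:
            discarded.append(tweet)  # the "waste pool"
    return kept, discarded


# Invented example tweets, for illustration only.
tweets = [
    "I hope the EU summit tackles youth unemployment",
    "Worried the summit will ignore Cyprus",
    "The EU summit starts today",
    "Lovely weather for a walk",
]

kept, discarded = run_cascade(tweets, [is_relevant, has_attitude])
labels = {tweet: classify_attitude(tweet) for tweet in kept}
```

Only tweets that pass every gate reach the final attitude classifier, which is what keeps the most difficult judgement away from irrelevant material.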
Having this window could bring people and politicians closer together. Take a fairly routine and usual occurrence – a European Commission Summit. During one such summit, on 14 March 2013, tweets mentioning Jose Manuel Barroso (the President of the Commission) surged to more than ten times their average.
70 per cent of the tweets were about the summit and while around half shared a link to a media story or an official EU website, no single particular issue dominated. Instead, tweeters took the occasion of the summit to talk about the EU-related issues that affect them. The summit therefore acted as a sounding board for a range of different concerns, fears and hopes that people felt related to the EU.
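A surge of this kind can be expressed as a simple ratio between one day's volume and the average of the other days. The sketch below assumes daily counts of mentions are already available; the function name and the numbers are made up for illustration and are not the figures behind the Barroso example.

```python
from statistics import mean
from typing import List


def surge_ratio(daily_counts: List[int], day_index: int) -> float:
    """Ratio of one day's count to the mean of all the other days."""
    baseline = [c for i, c in enumerate(daily_counts) if i != day_index]
    return daily_counts[day_index] / mean(baseline)


# Invented daily mention counts; day 4 stands in for the day of a summit.
counts = [100, 120, 80, 100, 1100, 100]

ratio = surge_ratio(counts, 4)
is_surge = ratio > 10  # "more than ten times the average"
```

Excluding the day itself from the baseline is a design choice: it stops the spike from inflating the average it is being compared against.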
Looking under the volume of tweets, we can see clear clusters of concern and interest. There are discussions about young people, unemployment and money, about competitiveness and austerity, about Syria and Francois Hollande, and about Cyprus’s debt and growth.
Importantly, people tended to be broadly optimistic about the ability of the European Commission and Parliament to enact positive change on the things that they cared about. The figure implies a digital agenda for the Summit: something politicians should pay attention to when they land in Brussels. In the future, as these methods improve, they should change the agendas politicians find in front of them when they arrive.
It would be astonishing if political parties here in the UK did not start using this technology to craft messages, positions and manifestos around what people think. Of course, this could become a ‘dark art’ of the political elite – of speeches changing halfway through in response to a Twitter-storm, and of unprincipled leadership swayed by the latest hashtags. But it is already a cheap, direct and very quick way of learning about what people care about – and it will become part of our political landscape.
Prediction. Once you understand people’s attitudes on social media, you can begin to predict how they act. To see if we could, we tried to predict the outcome of The X Factor (series 8) every week, solely based on what we could measure from social media. It’s a perfect experiment: every week, millions of people talk about which contestants they like and dislike, who they thought performed well or badly, and then they vote on it. Tweets about each of the contestants peak as they take the stage.
Prediction is always tricky, and it is especially difficult when you are trying to map online content onto offline phenomena. Our attempts to do so over the series demonstrated how important social science is. We ended up weaving in Facebook, Twitter and YouTube information to build a model that took into account cognitive biases, demographic representativeness (people who actually vote tend to be older and more male, it turned out) and the fact that some people vote for the best performer, whilst other, die-hard fans build up long-term ‘brand’ loyalties and will vote for their contestant even if they forget the words.
We made our predictions of who would get voted off public every week, with green names indicating where we got it right. It didn’t go too badly:
Seizing the opportunity that social media has given us is one of the most important research challenges that we face. For think-tanks, academics, government researchers and private sector innovators, the challenge is to achieve the alchemy of turning messy, chaotic and confusing data into something valuable that people can trust. For many more people, the challenge is to work out how these new insights can be woven into their organisations and the decisions that they take.
For us at Demos, the task will be to ensure this often technical, technological and rarefied enterprise practically serves our founding aim: to bring people closer to the political institutions that represent them.