Success story

HateLab

Understanding toxic conversations to combat them.

Here's the tl;dr

Do research

Build for good

HateLab is on a mission to end hate speech and promote healthy conversations online. Using the Twitter API, along with the latest techniques in machine learning to classify toxic speech, they developed the HateLab Dashboard. The dashboard gives insight into how and when toxic conversations flow, helping charities, government departments, and other organizations decide how to respond and support their communities.

Endpoints

Filtered stream

Challenge

Online conversations, particularly in times of stress or crisis, can range from helpful and supportive to deeply toxic. HateLab, part of Cardiff University’s Social Data Science Lab, is a partnership between the university’s School of Social Sciences and School of Computer Science and Informatics. Their focus is to understand the dynamics of the toxic conversations on social media - the enablers, the drivers, the inhibitors - to help inform how to foster healthy conversations online.

"One positive trend we see is that there’s significantly more healthy conversations going on than toxic ones, but the toxic conversations get more press. We want to make the positive conversations more effective."

Professor Matthew Williams, HateLab’s Principal Investigator

Solution

Starting in 2011, Professor Williams was interested in how social media data could help answer social science research questions. Up to that point, there had been some nascent work, but very few social scientists were able to access data from the public conversation on Twitter, much less analyze it.

“So, for the first time, we started working with computer scientists. We’d never done it before. It was a challenge because their way of working was so different -- different expectations, different funding sources, different publication paths -- we had to figure out how working together would be a benefit for both groups. Once we did, it’s been a success ever since.”

HateLab has conducted research on a range of events that gave rise to hate speech, from terrorist attacks to Brexit to COVID-19. From this work, they developed their Online Hate Speech Dashboard - a realtime monitor of online conversations at scale. It draws on the Twitter API to help filter Tweets, and then uses the latest techniques in machine learning to classify toxic speech. Their novel approach to hate speech detection earned them 1st place in the SIGLEX- and Microsoft-sponsored SemEval competition in 2019. The dashboard displays aggregate data on the ebb and flow of online hate speech, with associated intelligence on topics, hashtags, and high-level audience segmentation.
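To make the aggregation step concrete, here is a minimal sketch of how already-classified Tweets could be rolled up into hourly counts, a toxic share, and top hashtags. It is an illustration only, not HateLab's implementation: the `ClassifiedTweet` type, the `aggregate_by_hour` function, and the sample data are all hypothetical, and the classifier that produces the `is_toxic` label is assumed to exist upstream.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass
class ClassifiedTweet:
    """Hypothetical record: a Tweet after an upstream classifier has labeled it."""
    created_at: datetime   # when the Tweet was posted (UTC)
    hashtags: List[str]    # hashtags found in the Tweet
    is_toxic: bool         # label from a classifier that is not shown here


def aggregate_by_hour(tweets):
    """Group Tweets into hourly buckets and summarize each bucket."""
    buckets = defaultdict(list)
    for tweet in tweets:
        # Truncate the timestamp to the start of its hour.
        hour = tweet.created_at.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(tweet)

    summary = {}
    for hour, items in sorted(buckets.items()):
        toxic = [t for t in items if t.is_toxic]
        # Count hashtags case-insensitively, among toxic Tweets only.
        tags = Counter(tag.lower() for t in toxic for tag in t.hashtags)
        summary[hour] = {
            "total": len(items),
            "toxic": len(toxic),
            "toxic_share": len(toxic) / len(items),
            "top_hashtags": [tag for tag, _ in tags.most_common(3)],
        }
    return summary


# Made-up sample data, for illustration only.
sample = [
    ClassifiedTweet(datetime(2020, 3, 1, 9, 5, tzinfo=timezone.utc), ["ExampleTag"], True),
    ClassifiedTweet(datetime(2020, 3, 1, 9, 40, tzinfo=timezone.utc), [], False),
    ClassifiedTweet(datetime(2020, 3, 1, 10, 15, tzinfo=timezone.utc), ["exampletag"], False),
]
hourly = aggregate_by_hour(sample)
```

Only aggregate figures leave the function, which mirrors the dashboard's emphasis on aggregate data rather than individual Tweets.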

Approved partners can monitor aggregate conversations in realtime to identify divisive topics, and then use the data to develop targeted counter speech campaigns. For example, when a politician says something Islamophobic, there is often a follow-on effect in which social media users feel empowered to post something similar. HateLab Dashboard can analyze the conversation to identify the key topics, and partner organizations can use this intelligence to deploy social media content to try to defuse the online tension and promote healthier conversations.

Results

Publishing papers is a key success metric for academics, but HateLab also wants their work to have a meaningful impact on reducing toxicity and improving positive, healthy conversation online. Enter the HateLab Dashboard. The dashboard currently identifies toxic conversation flows and brings those insights to charities, government departments, and other interested organizations who want to create counter messages that will help defuse toxic speech.

Sefa Ozalp, the lead data science researcher at HateLab, states that: “It is not easy for policy makers or community organizations to get a grasp of the big picture of the discussions and online tensions on Twitter due to the massive volume of Tweets arriving every second of the day.

“By putting our machine learning research on online hate speech detection into production, HateLab Dashboard addresses this challenge and presents an interactive and intuitive way to explore online tensions on Twitter and assists with data-driven decision making. We have received overwhelmingly positive feedback about the usefulness of the Dashboard during the field trials with approved partners who are interested in making sense of social media data to promote community cohesion.”

HateLab’s partner, Social Data Science Lab, has developed a desktop and web tool for more general social research, COSMOS. COSMOS filters Tweets using the Twitter API to give researchers without programming skills an easy way to analyze the public conversation. This tool helps researchers source data from Twitter in a way that is ethical, and turns it into something they can use as source material for research.
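For readers curious what filtering Tweets through the API can look like, the sketch below composes a rule of the kind the filtered stream endpoint in Twitter API v2 accepts: a query string plus an optional tag, submitted inside an `add` list. The helper names (`build_stream_rule`, `build_rules_payload`) and the example keywords are hypothetical, and nothing here reflects how COSMOS or the HateLab Dashboard actually configures its filters. The code only builds the JSON body and makes no network call.

```python
import json


def build_stream_rule(keywords, tag, lang="en", include_retweets=False):
    """Compose one filtered stream rule (a query string and a tag) from keywords."""
    if not keywords:
        raise ValueError("at least one keyword is required")
    # Quote multi-word phrases so they are matched as exact phrases.
    terms = [f'"{k}"' if " " in k else k for k in keywords]
    value = "(" + " OR ".join(terms) + ")"
    if lang:
        value += f" lang:{lang}"   # restrict to one language
    if not include_retweets:
        value += " -is:retweet"    # exclude Retweets
    return {"value": value, "tag": tag}


def build_rules_payload(rules):
    """Wrap rules in the JSON body shape used when adding stream rules."""
    return json.dumps({"add": rules})


# Illustrative keywords only.
rule = build_stream_rule(["brexit", "#brexit", "leave vote"], tag="brexit")
payload = build_rules_payload([rule])
```

Tagging each rule makes it easier to tell which rule matched a given Tweet when several rules are active at once.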
