The small team of two engineers looked to Twitter as their source for incident data. They chose this platform for two reasons. First, it contains rich and diverse information, such as news, people's ideas and thoughts, daily life events, and things happening in their communities, so it offers a unique, intimate sample of various populations. Second, Twitter is the only platform with a truly open API through which the TAAF developers could access data, and it has in-depth tools to collect and analyze that data.
Using the Twitter API v2 search Tweets and Tweets lookup endpoints, the team was able to search Tweet statuses and identify when people reported a hate incident (and even what hashtags they would use). This helped the team design a 1023-character search query that narrowed the billions of Tweets down to an amount the team could work with.
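To illustrate how a long search query can be kept within a character budget, here is a minimal Python sketch. It is not the team's actual query or code: the function name `build_query`, the placeholder phrases and hashtag, and the choice of the `-is:retweet` and `lang:en` operators are all assumptions made for illustration. The 1023-character budget is simply the figure quoted above; the real limit depends on the API access level.

```python
# Hypothetical sketch: assemble an OR-joined search query and keep it
# within a character budget. All terms below are placeholders, not the
# phrases or hashtags the team actually used.

MAX_QUERY_CHARS = 1023  # figure quoted in the text; actual limits vary by access level

# Assumed filters: exclude retweets and restrict to English.
SUFFIX = " -is:retweet lang:en"


def build_query(phrases, hashtags, max_chars=MAX_QUERY_CHARS):
    """Join quoted phrases and hashtags with OR, dropping trailing
    terms once the full query would exceed max_chars."""
    terms = [f'"{p}"' for p in phrases] + [f"#{h}" for h in hashtags]
    kept = []
    for term in terms:
        candidate = "(" + " OR ".join(kept + [term]) + ")" + SUFFIX
        if len(candidate) > max_chars:
            break  # adding this term would overflow the budget
        kept.append(term)
    if not kept:
        raise ValueError("no term fits within the character budget")
    return "(" + " OR ".join(kept) + ")" + SUFFIX


if __name__ == "__main__":
    query = build_query(["example phrase"], ["ExampleTag"])
    print(len(query), query)
```

The resulting string would be passed as the `query` parameter of a search request. Terms are dropped from the end of the list, so the most important phrases should come first.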
Next came the task of verifying that the Tweets were indeed about hate incidents. The team developed a tool with natural language processing to help sift through thousands of Tweets to find ones relevant to the project. After the team applied its trained machine learning model, a human verified the Tweets to further increase accuracy.
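The two-stage flow described here, automated filtering followed by human verification, can be sketched roughly as follows. The team's model is not described in detail, so this example swaps in a plain keyword-overlap score as a stand-in for a trained classifier. The names `Candidate`, `score_relevance`, `triage`, and `record_review`, the 0.5 threshold, and the sample keywords are all hypothetical.

```python
# Hypothetical sketch of a filter-then-review pipeline. The scoring
# function is a keyword stand-in, NOT the team's trained model.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    tweet_id: str
    text: str
    score: float = 0.0
    human_verified: Optional[bool] = None  # None until a reviewer decides


def score_relevance(text, keywords):
    """Return the fraction of keywords that appear in the text."""
    lowered = text.lower()
    hits = sum(1 for kw in keywords if kw.lower() in lowered)
    return hits / len(keywords) if keywords else 0.0


def triage(tweets, keywords, threshold=0.5):
    """Score (tweet_id, text) pairs and return those at or above the
    threshold, highest score first, as a queue for human review."""
    queue = []
    for tweet_id, text in tweets:
        score = score_relevance(text, keywords)
        if score >= threshold:
            queue.append(Candidate(tweet_id, text, score))
    return sorted(queue, key=lambda c: c.score, reverse=True)


def record_review(candidate, is_incident):
    """Store the human reviewer's decision on a queued candidate."""
    candidate.human_verified = is_incident
    return candidate


if __name__ == "__main__":
    sample = [("1", "Alpha beta gamma"), ("2", "nothing here")]
    for c in triage(sample, ["alpha", "beta"]):
        print(c.tweet_id, c.score)
```

Keeping the reviewer's decision as a separate field means the automated score and the human judgment can be compared later, for example to measure how often the filter was wrong.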
In 2021, TAAF also co-authored the Documenting Anti-AAPI Hate Codebook with the Stop AAPI Hate coalition. This resource includes draft standards and practices for community-based data collection, which helped the team classify the incidents that people were sharing on Twitter.
From there, the next challenge was to present this data in a way that people could easily digest. TAAF created a data visualization tool, Decoding Hate, that displays their vast amount of data in an interactive way. With the help of a data visualization studio, they were able to take the annotated Twitter data and turn thousands of Tweets into insightful stories that expose the truth behind what was happening within AAPI communities.