Thank you for your interest in curated datasets. We are just starting to explore this as a solution. We believe low-code or no-code curated datasets can make it easier for researchers to study Twitter data, and we want to hear what you think of our early ideas.
Curated datasets can make it faster and easier for researchers to analyze public Twitter data related to high-interest topics. These purpose-built datasets are intended to be comprehensive, including all public Tweets on a given topic or use case. For transparency and research validation, each dataset would also document its makeup and the methodology used to create it.
We have seen that purpose-built datasets can accelerate the pace at which academics can conduct research and produce insights with Twitter data. One example of this is the Twitter Transparency archive: curated datasets of public Tweets and media from accounts believed to be connected to state-backed manipulation and information operations. Our hope is that curated datasets can enable more of this type of work across a variety of disciplines.
A few example curated datasets that Twitter might create:
COVID-19 and coronavirus
Recent natural disasters, by impacted regions
2020 US election and other global elections
#MeToo or other social movements
Twitter infrastructure traces
A random sample for teaching & training purposes
And more… Have ideas on datasets you would like to see? Please fill out this form and let us know!