Building queries for Search Tweets

The search endpoints accept a single query with a GET request and return a set of historical Tweets that match the query.  Queries are made up of operators that are used to match on a variety of Tweet attributes. 

To learn more about how to create high-quality queries, visit the following tutorial:
Building high-quality filters for getting Twitter data
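
As a rough illustration of how a query travels in a GET request, the sketch below URL-encodes a query string into a request URL using only the Python standard library. The endpoint URL and the query parameter name are assumptions (they are not stated on this page), build_search_url is a hypothetical helper, and no request is actually sent.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed endpoint for recent search; it is not stated on this page.
SEARCH_URL = "https://api.twitter.com/2/tweets/search/recent"


def build_search_url(query: str) -> str:
    """Return a GET URL with the query URL-encoded as the `query` parameter."""
    return SEARCH_URL + "?" + urlencode({"query": query})


url = build_search_url("#twitterapiv2")
print(url)

# Decoding the URL again recovers the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded)
```

Encoding matters because characters that are common in queries, such as #, :, spaces, and parentheses, have special meaning in a URL.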

 

Building a query

Query limitations

Your queries will be limited depending on which product track you are using. 

If you are using the Standard product track at the Basic access level, your query can be 512 characters long.

If you are using the Academic Research product track, your query can be 1024 characters long. 
 

Operator availability

While most operators are available to any developer, several are reserved for developers who have been approved for the Academic Research product track. The list of operators table shows which product tracks each operator is available to, using the following labels:

  • All: Available when using any Project.
  • Academic Research only: Available when using an Academic Research Project.
     

Operator types: standalone and conjunction-required

Standalone operators can be used alone or together with any other operators (including those that require conjunction).

For example, the following query will work because it uses the #hashtag operator, which is standalone:

#twitterapiv2

Conjunction-required operators cannot be used by themselves in a query; they can only be used when at least one standalone operator is included in the query. This is because using these operators alone would be far too general, and would match on an extremely high volume of Tweets.

For example, the following queries are not supported since they contain only conjunction-required operators:

has:media
has:links OR is:retweet

If we add a standalone operator, such as the phrase "twitter data", the query will work properly:

"twitter data" has:mentions (has:media OR has:links)
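
The rule above can be approximated locally before a query is sent. The following is a heuristic sketch, not Twitter's own validation: it assumes that the conjunction-required operators are exactly those starting with is:, has:, or lang: (as listed in the operators table), treats every other non-negated token as standalone, and uses a hypothetical helper name.

```python
import re

# Prefixes of the operators that the operators table marks as "Conjunction required".
CONJUNCTION_REQUIRED_PREFIXES = ("is:", "has:", "lang:")

# A token is either a (possibly negated) quoted phrase, or a run of characters
# without spaces or parentheses, optionally followed by one [bracketed] argument.
TOKEN_RE = re.compile(r'-?"[^"]*"|[^\s()\[\]]+(?:\[[^\]]*\])?')


def has_standalone_operator(query: str) -> bool:
    """Return True if at least one non-negated standalone operator is present."""
    for token in TOKEN_RE.findall(query):
        if token == "OR":
            continue  # boolean keyword, not an operator
        if token.startswith("-"):
            continue  # negated operators cannot be used alone
        if token.lower().startswith(CONJUNCTION_REQUIRED_PREFIXES):
            continue  # needs a standalone operator alongside it
        return True
    return False


print(has_standalone_operator("has:links OR is:retweet"))
print(has_standalone_operator('"twitter data" has:mentions (has:media OR has:links)'))
```

The first call reports that the query has no standalone operator, while the second finds the quoted phrase and accepts the query.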


Boolean operators and grouping

If you would like to string together multiple operators in a single query, you have the following tools at your disposal:

AND logic: Successive operators with a space between them will result in boolean "AND" logic, meaning that Tweets will match only if both conditions are met. For example, snow day #NoSchool will match Tweets containing the terms snow and day and the hashtag #NoSchool.
OR logic: Successive operators with OR between them will result in OR logic, meaning that Tweets will match if either condition is met. For example, specifying grumpy OR cat OR #meme will match any Tweets containing at least one of the terms grumpy or cat, or the hashtag #meme.
NOT logic, negation: Prepend a dash (-) to a keyword (or any operator) to negate it (NOT). For example, cat #meme -grumpy will match Tweets containing the hashtag #meme and the term cat, but only if they do not contain the term grumpy. One common query clause is -is:retweet, which will not match on Retweets, thus matching only on original Tweets. All operators can be negated, but negated operators cannot be used alone.
Do not negate a set of operators grouped together in a set of parentheses. Instead, negate each individual operator. For example, instead of using -(grumpy OR cat OR #meme), we suggest that you use -grumpy -cat -#meme.
Grouping: You can use parentheses to group operators together. For example, (grumpy cat) OR (#meme has:images) will return either Tweets containing the terms grumpy and cat, or Tweets with images containing the hashtag #meme. Note that ANDs are applied first, then ORs are applied.

A note on negations

All operators can be negated except for sample:, and -is:nullcast must always be negated. Negated operators cannot be used alone.

Do not negate a set of operators grouped together in a set of parentheses. Instead, negate each individual operator.

For example, instead of using skiing -(snow OR day OR noschool), we suggest that you use skiing -snow -day -noschool
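
The rewrite suggested above can be sketched in code. It works because negating an OR group is logically the same as negating every term and combining them with AND. This is a minimal sketch with a hypothetical helper name: it only handles groups of single terms joined by OR, with no nested parentheses or quoted phrases, and leaves anything else untouched.

```python
import re

# A negated group with no nested parentheses, for example -(snow OR day).
NEGATED_GROUP_RE = re.compile(r"-\(([^()]*)\)")


def expand_negated_groups(query: str) -> str:
    """Rewrite -(a OR b OR c) as -a -b -c."""

    def expand(match: re.Match) -> str:
        terms = [term.strip() for term in match.group(1).split(" OR ")]
        # Only OR groups of single terms are rewritten; other groups are kept as is.
        if any(not term or " " in term for term in terms):
            return match.group(0)
        return " ".join("-" + term for term in terms)

    return NEGATED_GROUP_RE.sub(expand, query)


print(expand_negated_groups("skiing -(snow OR day OR noschool)"))
```

Groups that combine terms with AND, such as -(grumpy cat), are deliberately left alone, since negating each term would change the meaning of the query.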


Order of operations

When combining AND and OR functionality, the following order of operations will dictate how your query is evaluated.

  1. Operators connected by AND logic are combined first
  2. Then, operators connected with OR logic are applied

For example:

  • apple OR iphone ipad would be evaluated as apple OR (iphone ipad)
  • ipad iphone OR android would be evaluated as (ipad iphone) OR android

To eliminate uncertainty and ensure that your query is evaluated as intended, group terms together with parentheses where appropriate. 

For example:

  • (apple OR iphone) ipad
  • iphone (ipad OR android)
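
To see how a query without parentheses will be grouped, the order of operations can be made explicit. The sketch below uses a hypothetical helper and assumes a flat query with no existing parentheses or quoted phrases: it splits on OR and wraps every AND-connected run of terms in parentheses.

```python
def show_precedence(query: str) -> str:
    """Add parentheses that make the AND-before-OR evaluation order visible."""
    groups = [group.strip() for group in query.split(" OR ")]
    if len(groups) == 1:
        return query  # no OR present, so there is nothing to disambiguate
    wrapped = [f"({group})" if " " in group else group for group in groups]
    return " OR ".join(wrapped)


print(show_precedence("apple OR iphone ipad"))
print(show_precedence("ipad iphone OR android"))
```

If the printed grouping is not what you intended, add your own parentheses to the query instead.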
     

Punctuation, diacritics, and case sensitivity

If you specify a keyword or hashtag with accents or diacritics, it will only match Tweet text (keywords or hashtags) that contains the same diacritics. For example, queries with the keyword Diacrítica or the hashtag #cumpleaños will match Diacrítica or #cumpleaños, but not Diacritica or #cumpleanos, which lack the accented í and the ñ.

Characters with accents or diacritics are treated the same as normal characters and are not treated as word boundaries. For example, a query with the keyword cumpleaños would only match Tweets containing the word cumpleaños and would not match Tweets containing cumplea, cumplean, or os.

All operators are evaluated in a case-insensitive manner. For example, the query cat will match Tweets with all of the following: cat, CAT, Cat.
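
The matching behavior described in this section can be imitated locally to build intuition. This is only an approximation and not Twitter's tokenizer: it assumes that splitting on non-word characters is close enough to the real tokenization, and keyword_matches is a hypothetical helper.

```python
import re
import unicodedata


def normalize(text: str) -> str:
    # NFC keeps each accented letter as a single character; casefold ignores case.
    return unicodedata.normalize("NFC", text).casefold()


def keyword_matches(keyword: str, tweet_text: str) -> bool:
    """Case-insensitive, diacritic-sensitive match against whole tokens."""
    tokens = re.findall(r"\w+", normalize(tweet_text))
    return normalize(keyword) in tokens


print(keyword_matches("cat", "My CAT is asleep"))
print(keyword_matches("cumpleaños", "Feliz CUMPLEAÑOS!"))
print(keyword_matches("cumpleanos", "Feliz cumpleaños!"))
print(keyword_matches("cumplea", "Feliz cumpleaños!"))
```

The first two calls match because case is ignored. The last two do not, because the diacritic must be present and ñ is not treated as a word boundary.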
 

Specificity and efficiency

When you start to build your query, it is important to keep a few things in mind.

  • Using broad, standalone operators for your query such as a single keyword or #hashtag is generally not recommended since it will likely match on a massive volume of Tweets. Creating a more robust query will result in a more specific set of matching Tweets, and will hopefully reduce the amount of noise in the payload that you will need to sift through to find valuable insights. 
    • For example, if your query were just the keyword happy, you would likely get anywhere from 200,000 to 300,000 Tweets per day.
    • Adding more conditional operators narrows your search results, for example (happy OR happiness) place_country:GB -birthday -is:retweet
  • Writing efficient queries is also beneficial for staying within the query length limit. The character count includes the entire query string, including spaces and operators.
    • For example, the following query is 59 characters long: (happy OR happiness) place_country:GB -birthday -is:retweet
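
The character count can be checked before a query is submitted. The sketch below uses the limits from the Query limitations section; the dictionary keys and the helper name are hypothetical, and it assumes the limit is counted the same way as Python's len().

```python
# Limits stated in the "Query limitations" section.
QUERY_LIMITS = {"standard_basic": 512, "academic_research": 1024}


def remaining_characters(query: str, track: str) -> int:
    """Return how many characters are left before the query hits its limit."""
    return QUERY_LIMITS[track] - len(query)


query = "(happy OR happiness) place_country:GB -birthday -is:retweet"
print(len(query))  # 59
print(remaining_characters(query, "academic_research"))  # 965
```

A negative result means the query is too long for that product track and needs to be shortened.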


Iteratively building a query

Test your query early and often

Getting a query to return the "right" results the first time is rare. There is so much content on Twitter that may or may not be obvious at first, and the query syntax described above may be hard to match to your desired search. As you build a query, it is important to test it periodically.

For this section, we are going to start with the following query and adjust it based on the results that we receive during our test: 

happy OR happiness

Use results to narrow the query

As you test the query, you should scan the returned Tweets to see if they include the data that you are expecting and hoping to receive. Starting with a broad query and a superset of Tweet matches allows you to review the result and narrow the query to filter out undesired results.  

When we tested the example query, we noticed that we were getting Tweets in a variety of different languages. In this situation, we only want to receive Tweets that are in English, so we’re going to add the lang: operator:

(happy OR happiness) lang:en

The test delivered a number of Tweets wishing people a happy birthday, so we are going to add -birthday as a negated keyword operator. We also want to only receive original Tweets, so we’ve added the negated -is:retweet operator:

(happy OR happiness) lang:en -birthday -is:retweet

Adjust for inclusion where needed

If you notice that you are not receiving data that you expect and know that there are existing Tweets that should return, you may need to broaden your query by removing operators that may be filtering out the desired data. 

For our example, we noticed that there were other Tweets in our personal timeline that expressed the emotion that we are looking for and weren’t included in the test results. To ensure we have greater coverage, we are going to add the keywords excited and elated:

(happy OR happiness OR excited OR elated) lang:en -birthday -is:retweet

Adjust for popular trends/bursts over the time period

Trends come and go on Twitter quickly. Maintaining your query should be an active process. If you plan to use a query for a while, we suggest that you periodically check in on the data that you are receiving to see if you need to make any adjustments.

In our example, we noticed that we started to receive some Tweets that are wishing people “happy holidays”. Since we don’t want these Tweets included in our results, we are going to add a negated -holidays keyword:

(happy OR happiness OR excited OR elated) lang:en -birthday -is:retweet -holidays 
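
Because the query changes at every iteration, it can help to keep its parts in lists and assemble the string from them. This is a minimal sketch with a hypothetical build_query helper; it only covers the shape of the example query used in this section.

```python
def build_query(any_of, lang, exclude):
    """Assemble (a OR b ...) lang:xx -c -d ... from its parts."""
    include = "(" + " OR ".join(any_of) + ")"
    negations = " ".join("-" + term for term in exclude)
    return f"{include} lang:{lang} {negations}"


keywords = ["happy", "happiness", "excited", "elated"]
exclusions = ["birthday", "is:retweet", "holidays"]
print(build_query(keywords, "en", exclusions))
```

Adding or removing a keyword then becomes a one-line change to a list, which keeps each test iteration easy to review.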

 

Operators

Operator Type Availability Description
keyword Standalone All Matches a keyword within the body of a Tweet. This is a tokenized match, meaning that your keyword string will be matched against the tokenized text of the Tweet body. Tokenization splits words based on punctuation, symbols, and Unicode basic plane separator characters.

For example, a Tweet with the text “I like coca-cola” would be split into the following tokens: I, like, coca, cola. These tokens would then be compared to the keyword string used in your query. To match strings containing punctuation (for example coca-cola), symbol, or separator characters, you must wrap your keyword in double-quotes.

Example: pepsi OR cola OR "coca cola"
emoji Standalone All Matches an emoji within the body of a Tweet. Similar to a keyword, emojis are a tokenized match, meaning that your emoji will be matched against the tokenized text of the Tweet body.

Note that if an emoji has a variant, you must wrap it in double quotes to add to a rule.

Example: (😃 OR 😡) 😬
"exact phrase match" Standalone All Matches the exact phrase within the body of a Tweet.

Example: ("Twitter API" OR #v2) -"recent search"
# Standalone All Matches any Tweet containing the given hashtag, if the hashtag is a recognized entity in the Tweet.

This operator performs an exact match, NOT a tokenized match, meaning the rule #thanku will match Tweets with the exact hashtag #thanku, but not those with the hashtag #thankunext.

Example: #thankunext #fanart OR @arianagrande
@ Standalone All Matches any Tweet that mentions the given username, if the username is a recognized entity (including the @ character).

Example: (@twitterdev OR @twitterapi) -@twitter
$ Standalone Academic Research only Matches any Tweet that contains the specified ‘cashtag’ (where the leading character of the token is the ‘$’ character).

Note that the cashtag operator relies on Twitter’s ‘symbols’ entity extraction to match cashtags, rather than trying to extract the cashtag from the body itself.

Example: $twtr OR @twitterdev -$fb
from: Standalone All Matches any Tweet from a specific user.
The value can be either the username (excluding the @ character) or the user’s numeric user ID.

Example: from:twitterdev OR from:twitterapi -from:twitter
to: Standalone All Matches any Tweet that is in reply to a particular user.
The value can be either the username (excluding the @ character) or the user’s numeric user ID.

Example: to:twitterdev OR to:twitterapi -to:twitter
url: Standalone All Performs a tokenized match on any validly-formatted URL of a Tweet.

This operator can match on the contents of either the url or the expanded_url field. For example, a Tweet containing "You should check out Twitter Developer Labs: https://t.co/c0A36SWil4" (with the short URL redirecting to https://developer.twitter.com) will match both of the following rules:

from:TwitterDev url:"https://developer.twitter.com"
(because it will match the contents of entities.urls.expanded_url)

from:TwitterDev url:"https://t.co"
(because it will match the contents of entities.urls.url)

Tokens and phrases containing punctuation or special characters should be double-quoted (for example, url:"/developer"). Similarly, to match on a specific protocol, enclose in double-quotes (for example, url:"https://developer.twitter.com").
retweets_of: Standalone All Matches Tweets that are Retweets of the specified user. The value can be either the username (excluding the @ character) or the user’s numeric user ID.

Example: retweets_of:twitterdev OR retweets_of:twitterapi
context: Standalone All NEW Matches Tweets with a specific domain id and/or domain id, entity id pair, where * represents a wildcard. To learn more about this operator, please visit our page on annotations.

context:domain_id.entity_id
context:domain_id.*
context:*.entity_id

Examples:
context:10.799022225751871488
(domain_id.entity_id returns Tweets matching that specific domain-entity pair)

context:47.*
(domain_id.* returns Tweets matching that domain ID, with any domain-entity pair)

context:*.799022225751871488
(*.entity_id returns Tweets matching that entity ID, with any domain-entity pair)
entity: Standalone All NEW Matches Tweets with a specific entity string value. To learn more about this operator, please visit our page on annotations.

entity:"string declaration of entity/place"

Examples: entity:"Michael Jordan" OR entity:"Barcelona"
conversation_id: Standalone All NEW Matches Tweets that share a common conversation ID. A conversation ID is set to the Tweet ID of a Tweet that started a conversation. As Replies to a Tweet are posted, even Replies to Replies, the conversation_id is added to its JSON payload.

Example: conversation_id:1334987486343299072 (from:twitterdev OR from:twitterapi)
place: Standalone Academic Research only Matches Tweets tagged with the specified location or Twitter place ID. Multi-word place names (“New York City”, “Palo Alto”) should be enclosed in quotes.

Note: See the GET geo/search standard v1.1 endpoint for how to obtain Twitter place IDs.

Note: This operator will not match on Retweets, since Retweet's places are attached to the original Tweet. It will also not match on places attached to the original Tweet of a Quote Tweet.

Example: place:"new york city" OR place:seattle OR place:fd70c22040963ac7
place_country: Standalone Academic Research only Matches Tweets where the country code associated with a tagged place/location matches the given ISO alpha-2 character code.

You can find a list of valid ISO codes on Wikipedia.

Note: This operator will not match on Retweets, since Retweet's places are attached to the original Tweet. It will also not match on places attached to the original Tweet of a Quote Tweet.

Example: place_country:US OR place_country:MX OR place_country:CA
point_radius: Standalone Academic Research only

Matches against the place.geo.coordinates object of the Tweet when present, and in Twitter, against a place geo polygon, where the Place polygon is fully contained within the defined region.

point_radius:[longitude latitude radius]

  • Units of radius supported are miles (mi) and kilometers (km)
  • Radius must be less than 25mi
  • Longitude is in the range of ±180
  • Latitude is in the range of ±90
  • All coordinates are in decimal degrees
  • Rule arguments are contained within brackets, space delimited

Note: This operator will not match on Retweets, since Retweet's places are attached to the original Tweet. It will also not match on places attached to the original Tweet of a Quote Tweet.

Example: point_radius:[2.355128 48.861118 16km] OR point_radius:[174.761070 -41.287336 20mi]

 

bounding_box: Standalone Academic Research only Matches against the place.geo.coordinates object of the Tweet when present, and in Twitter, against a place geo polygon, where the place polygon is fully contained within the defined region.

bounding_box:[west_long south_lat east_long north_lat]

  • west_long south_lat represent the southwest corner of the bounding box where west_long is the longitude of that point, and south_lat is the latitude.
  • east_long north_lat represent the northeast corner of the bounding box, where east_long is the longitude of that point, and north_lat is the latitude.
  • Width and height of the bounding box must be less than 25mi
  • Longitude is in the range of ±180
  • Latitude is in the range of ±90
  • All coordinates are in decimal degrees.
  • Rule arguments are contained within brackets, space delimited.

Note: This operator will not match on Retweets, since Retweet's places are attached to the original Tweet. It will also not match on places attached to the original Tweet of a Quote Tweet.

Example: bounding_box:[-105.301758 39.964069 -105.178505 40.09455]
is:retweet Conjunction required All Matches on Retweets that match the rest of the specified rule. This operator looks only for true Retweets (for example, those generated using the Retweet button). Quote Tweets will not be matched by this operator.

Example: data @twitterdev -is:retweet
is:reply Conjunction required All Deliver only explicit replies that match a rule. Can also be negated to exclude replies that match a rule from delivery.

Note: This operator is also available with the filtered stream endpoint. When used with filtered stream, this operator matches on replies to an original Tweet, replies in quoted Tweets, and replies in Retweets.

Example: from:twitterdev is:reply
is:quote Conjunction required All Returns all Quote Tweets, also known as Tweets with comments.

Example: "sentiment analysis" is:quote
is:verified Conjunction required All Deliver only Tweets whose authors are verified by Twitter.

Example: #nowplaying is:verified
-is:nullcast Conjunction required Academic Research only Removes Tweets created for promotion only on ads.twitter.com that have a "source":"Twitter for Advertisers (legacy)" or "source":"Twitter for Advertisers".
This operator must be negated.

For more info on Nullcasted Tweets, see our page on Tweet availability.

Example: "mobile games" -is:nullcast
has:hashtags Conjunction required All Matches Tweets that contain at least one hashtag.

Example: from:twitterdev -has:hashtags
has:cashtags Conjunction required Academic Research only Matches Tweets that contain a cashtag symbol (with a leading ‘$’ character, for example $tag).

Example: #stonks has:cashtags
has:links Conjunction required All This operator matches Tweets which contain links and media in the Tweet body.

Example: from:twitterdev announcement has:links
has:mentions Conjunction required All Matches Tweets that mention another Twitter user.

Example: #nowplaying has:mentions
has:media Conjunction required All Matches Tweets that contain a media object, such as a photo, GIF, or video, as determined by Twitter. This will not match on media created with Periscope, or Tweets with links to other media hosting sites.

Example: (kittens OR puppies) has:media
has:images Conjunction required All Matches Tweets that contain a recognized URL to an image.

Example: #meme has:images
has:videos Conjunction required All Matches Tweets that contain native Twitter videos, uploaded directly to Twitter. This will not match on videos created with Periscope, or Tweets with links to other video hosting sites.

Example: #icebucketchallenge has:videos
has:geo Conjunction required Academic Research only Matches Tweets that have Tweet-specific geolocation data provided by the Twitter user. This can be either a location in the form of a Twitter place, with the corresponding display name, geo polygon, and other fields, or in rare cases, a geo lat-long coordinate.

Note: Operators matching on place (Tweet geo) will only include matches from original Tweets. Retweets do not contain any place data.

Example: recommend #paris has:geo -bakery
lang: Conjunction required All Matches Tweets that have been classified by Twitter as being of a particular language (if, and only if, the Tweet has been classified). It is important to note that each Tweet is currently only classified as being of one language, so AND’ing together multiple languages will yield no results.

Note: if no language classification can be made the provided result is ‘und’ (for undefined).

Example: recommend #paris lang:en

The list below represents the currently supported languages and their corresponding BCP 47 language identifier:

Amharic: am German: de Malayalam: ml Slovak: sk
Arabic: ar Greek: el Maldivian: dv Slovenian: sl
Armenian: hy Gujarati: gu Marathi: mr Sorani Kurdish: ckb
Basque: eu Haitian Creole: ht Nepali: ne Spanish: es
Bengali: bn Hebrew: iw Norwegian: no Swedish: sv
Bosnian: bs Hindi: hi Oriya: or Tagalog: tl
Bulgarian: bg Latinized Hindi: hi-Latn Panjabi: pa Tamil: ta
Burmese: my Hungarian: hu Pashto: ps Telugu: te
Croatian: hr Icelandic: is Persian: fa Thai: th
Catalan: ca Indonesian: in Polish: pl Tibetan: bo
Czech: cs Italian: it Portuguese: pt Traditional Chinese: zh-TW
Danish: da Japanese: ja Romanian: ro Turkish: tr
Dutch: nl Kannada: kn Russian: ru Ukrainian: uk
English: en Khmer: km Serbian: sr Urdu: ur
Estonian: et Korean: ko Simplified Chinese: zh-CN Uyghur: ug
Finnish: fi Lao: lo Sindhi: sd Vietnamese: vi
French: fr Latvian: lv Sinhala: si Welsh: cy
Georgian: ka Lithuanian: lt  
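
The argument rules listed for the point_radius: operator in the table above (longitude first, then latitude, then a radius under 25mi) are easy to get wrong, so a local check can be useful. The sketch below is an assumption-laden helper, not part of any Twitter library: it accepts only plain decimal numbers and converts kilometers to miles to apply the radius limit.

```python
import re

# point_radius:[longitude latitude radius], with radius in mi or km.
POINT_RADIUS_RE = re.compile(
    r"^point_radius:\[(-?\d+(?:\.\d+)?) (-?\d+(?:\.\d+)?) (\d+(?:\.\d+)?)(mi|km)\]$"
)
KM_PER_MILE = 1.609344


def is_valid_point_radius(clause: str) -> bool:
    """Check coordinate ranges and the 25mi radius limit for one clause."""
    match = POINT_RADIUS_RE.match(clause)
    if match is None:
        return False
    longitude, latitude, radius = (float(match.group(i)) for i in (1, 2, 3))
    radius_mi = radius if match.group(4) == "mi" else radius / KM_PER_MILE
    return abs(longitude) <= 180 and abs(latitude) <= 90 and radius_mi < 25


print(is_valid_point_radius("point_radius:[2.355128 48.861118 16km]"))
print(is_valid_point_radius("point_radius:[174.761070 -41.287336 20mi]"))
# Longitude and latitude swapped: 174.761070 is not a valid latitude.
print(is_valid_point_radius("point_radius:[-41.287336 174.761070 20mi]"))
```

The last call fails the check because the latitude falls outside the ±90 range, which is the typical symptom of writing latitude before longitude.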