“Our company is interested in some references about the TV show “Cutthroat Kitchen” on the Food Network, as well as references to the host (Alton Brown) that relate to the show.”
Using what we learned in the first example, we can create a rule to capture mentions like this.
"cutthroat kitchen" OR cutthroatkitchen OR (("alton brown" OR altonbrown) host OR show)
This rule would capture:
- Tweets with the phrase “cutthroat kitchen” in the text.
- Tweets with "cutthroatkitchen" in the text (same as above, but without the space).
- Tweets mentioning "alton brown" (or "altonbrown" without the space), where the Tweet also mentions "host" or "show".
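To make the intended matching concrete, here is a rough local approximation of this rule in Python. It is only a sketch: it assumes keyword terms are matched case-insensitively against tokens split on punctuation and whitespace, and the helper names (tokenize, has_phrase, matches_first_rule) are hypothetical, not part of any Twitter library.

```python
import re


def tokenize(text):
    """Lowercase the text and split it on anything that is not a letter, digit, or underscore."""
    return re.findall(r"\w+", text.lower())


def has_phrase(tokens, phrase):
    """Return True if the words of `phrase` appear as consecutive tokens."""
    words = phrase.lower().split()
    n = len(words)
    return any(tokens[i:i + n] == words for i in range(len(tokens) - n + 1))


def matches_first_rule(text):
    """Approximate the intended logic: show terms, or a host name plus "host" or "show"."""
    tokens = tokenize(text)
    show_term = has_phrase(tokens, "cutthroat kitchen") or "cutthroatkitchen" in tokens
    host_name = has_phrase(tokens, "alton brown") or "altonbrown" in tokens
    context = "host" in tokens or "show" in tokens
    return show_term or (host_name and context)
```

Because "#" and "@" are dropped during tokenization in this sketch, "#CutthroatKitchen" and "@cutthroatkitchen" both reduce to the same token as the bare keyword.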
Now, let’s imagine that the company collects Tweets with this rule for a week, but their customer is unhappy with the quality of the results, and wants to target content more specifically. This time around, they want to narrow the results, but also want to capture some specific mentions they missed before.
“We only want Tweets using our promoted hashtag (#cutthroatkitchen), Tweets mentioning the show’s host by his Twitter handle (@altonbrown) in a way that relates to the show, or Tweets that link to the Food Network’s online page about Alton Brown. Additionally, we only want Tweets that come from users who say they are based in the United States.”
Let’s begin with the first requirement.
Our previous rule would have captured Tweets using the promoted hashtag, thanks to our "cutthroatkitchen" term. This would have matched mentions like "#cutthroatkitchen", "@cutthroatkitchen", or just a bare reference to "cutthroatkitchen". All three match because keyword terms are matched against tokenized text, and punctuation such as # and @ is treated as a token boundary. However, our customer wants to restrict this to only match on uses of "#cutthroatkitchen". To do this, we’ll use the # operator as follows:
#cutthroatkitchen
While the original "cutthroatkitchen" term looked for matches in the general text of the Tweet, this rule actually changes the strategy, and looks for a match in the list of hashtags Twitter has extracted from the Tweet itself (with the native enriched format, this will match on the twitter_entities.hashtags field). Thus, it provides a much more targeted way for the customer to ensure they are only getting hashtag mentions of the phrase.
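The difference between a text match and a hashtag match can be sketched in Python. This is an illustration under assumptions: the payload is heavily simplified, the nesting follows the twitter_entities.hashtags path named above, and the "body" and "text" keys as well as the has_hashtag helper are hypothetical stand-ins rather than a definitive description of the payload.

```python
def has_hashtag(tweet, tag):
    """Return True if `tag` is among the hashtags extracted for this Tweet (case-insensitive)."""
    hashtags = tweet.get("twitter_entities", {}).get("hashtags", [])
    return any(h.get("text", "").lower() == tag.lower() for h in hashtags)


# The bare keyword appears in the text, but no hashtag was extracted.
plain = {
    "body": "cutthroatkitchen was great last night",
    "twitter_entities": {"hashtags": []},
}

# The hashtag was extracted into the entities list.
tagged = {
    "body": "Loved #CutthroatKitchen last night",
    "twitter_entities": {"hashtags": [{"text": "CutthroatKitchen"}]},
}
```

Here has_hashtag(tagged, "cutthroatkitchen") is true while has_hashtag(plain, "cutthroatkitchen") is false, even though both Tweets would have matched the plain keyword term.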
A similar concept applies to restricting the mentions of Alton Brown to those using his Twitter handle. The previous term ("altonbrown") would have matched every occurrence of that string in the text. However, we can use the @ operator to restrict it to only explicit references to his Twitter handle:
@altonbrown
This means the rule will be applied to Twitter’s extracted user mentions (with the native enriched format, this will match on the twitter_entities.mentions field) for a match, rather than the general text used in the Tweet. Since we are now specifically referencing the Twitter account of the show's host, the verified @altonbrown account, we no longer need to require the "host" and "show" keywords.
Next, the customer was previously getting some Tweets linking to their web page about Alton Brown just by looking for mentions of his name within the text, including any URLs included in the text of the Tweet. However, they were missing some references where Twitter users shortened their URLs with services like bit.ly before posting them. To catch those links as well, we use the url_contains: operator with the specific URL the customer wants to track:
url_contains:"foodnetwork.com/chefs/alton-brown"
The url_contains: operator looks for matches in the fully expanded URL that is provided in the Tweet as an enrichment by Twitter. In other words, even if a URL is wrapped in a bit.ly or other shortened link, Twitter will unwind it down to the final URL and allow you to look for matches there.
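As a rough illustration of why matching on the expanded URL matters, the following Python sketch checks a substring against both the posted link and its unwound form. The payload shape ("urls", "url", "expanded_url" on a flat dictionary), the example links, and the case-insensitive comparison are assumptions made for the example.

```python
def url_contains(tweet, fragment):
    """Return True if any expanded URL on the Tweet contains `fragment` (case-insensitive)."""
    needle = fragment.lower()
    return any(needle in u.get("expanded_url", "").lower() for u in tweet.get("urls", []))


# Hypothetical Tweet whose posted link is shortened, with the unwound URL alongside it.
shortened = {
    "urls": [
        {
            "url": "http://bit.ly/example",
            "expanded_url": "https://www.foodnetwork.com/chefs/alton-brown",
        }
    ]
}

fragment = "foodnetwork.com/chefs/alton-brown"
found_in_posted_link = fragment in shortened["urls"][0]["url"]
found_in_expanded_link = url_contains(shortened, fragment)
```

In this sketch the fragment is absent from the posted link but present in the expanded one, which is the case the customer was missing.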
Last, the customer wants to restrict the results to Tweets where the user is from the United States. To do this, we will use Twitter’s Profile Geo enrichment and the corresponding premium operator to apply the restriction to all of the previously defined terms.
Please note that geo operators are not currently available in Twitter Developer Labs.
Incorporating the changes above, we can come up with a rule that will satisfy the customer:
profile_country:us (#cutthroatkitchen OR @altonbrown OR url_contains:"foodnetwork.com/chefs/alton-brown")
This would match the following:
- Tweets using the company’s promoted hashtag, but not those using the keyword without the hashtag.
- Mentions of @altonbrown, but excluding plain-text mentions that don’t use @ mention syntax.
- Tweets that include links to the Food Network’s page about Alton Brown, even where they are shortened using bit.ly or another service.
- Additionally, no Tweets meeting the requirements above will be delivered unless they also have a profile country code for the United States, based on Twitter's Profile Geo enrichment.
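The matching behavior listed above can be approximated locally with a small Python predicate. This is a sketch only: the flat dictionary and its keys ("profile_country_code", "hashtags", "mentions", "expanded_urls") are hypothetical simplifications of the real payload, and matches_final_rule is not part of any Twitter library.

```python
def matches_final_rule(tweet):
    """Approximate: profile country is US AND (hashtag OR mention OR link)."""
    in_us = tweet.get("profile_country_code", "").upper() == "US"
    hashtag = any(h.lower() == "cutthroatkitchen" for h in tweet.get("hashtags", []))
    mention = any(m.lower() == "altonbrown" for m in tweet.get("mentions", []))
    link = any(
        "foodnetwork.com/chefs/alton-brown" in u.lower()
        for u in tweet.get("expanded_urls", [])
    )
    return in_us and (hashtag or mention or link)


us_hashtag = {"profile_country_code": "US", "hashtags": ["CutthroatKitchen"]}
ca_hashtag = {"profile_country_code": "CA", "hashtags": ["CutthroatKitchen"]}
us_keyword_only = {"profile_country_code": "US", "hashtags": []}
```

Of the three sample Tweets, only us_hashtag satisfies the predicate: ca_hashtag fails the country restriction, and us_keyword_only has none of the three targeted signals.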
The syntax used is important: the parentheses create the boolean logic we want and ensure that the profile_country:us operator is applied across the board. When in doubt, use parentheses to be sure you don’t end up with unexpected results due to the order of operations for rules.
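To see what the parentheses buy us, the two groupings can be compared with plain Python booleans. The sketch assumes that an implicit AND (a space) binds more tightly than OR when a rule is evaluated, so that dropping the parentheses would attach the country restriction to the first term only. The function names are hypothetical.

```python
def with_parentheses(in_us, hashtag, mention, link):
    """profile_country:us (hashtag OR mention OR link)"""
    return in_us and (hashtag or mention or link)


def without_parentheses(in_us, hashtag, mention, link):
    """profile_country:us hashtag OR mention OR link, with AND applied before OR."""
    return (in_us and hashtag) or mention or link


# A Tweet from outside the United States that mentions @altonbrown.
non_us_mention = dict(in_us=False, hashtag=False, mention=True, link=False)
grouped_result = with_parentheses(**non_us_mention)
ungrouped_result = without_parentheses(**non_us_mention)
```

Under that assumption, the grouped version rejects the non-US Tweet while the ungrouped version lets it through, which is exactly the kind of unexpected result the parentheses prevent.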
Beyond these examples, there are hundreds of ways that you can combine operators and keywords to return the data that is critical to your analysis. You can expand on these concepts to narrow your search based on profile information, follower count, Tweet location, language used in the text, and much more. In addition to the topics discussed here, you should be well-versed in the full documentation around operators and rules, including the limits around restricted characters and rule size. You can find these details at the following resources: