Interested in exploring Labs?
The endpoints we release in Labs are previews of tools that may be released more broadly in the future, but will likely undergo changes before then. We encourage you to take that into consideration as you explore. Before getting started, please read more about Twitter Developer Labs.

Still using v1?
This page documents the current version of this endpoint; however, you can still reference the previous version. You should also check out our version migration guide and review our changelog.

 

Labs search pagination

Introduction

Search queries typically match more Tweets than can be returned in a single API response. When that happens, the data is returned in a series of 'pages'. Pagination refers to methods for requesting all of the pages in order to retrieve the entire data set.

Here are the fundamental details of recent search pagination:

  • The Labs recent search endpoint will respond to a query with at least one page, and provide a next_token in its JSON response if additional pages are available. To receive all matching Tweets, request the next page with that token, and repeat until no token is included in the response.
  • Tweets are delivered in reverse-chronological order, in the UTC timezone. This is true within individual pages, as well as across multiple pages: 
    • The first Tweet in the first response will be the most recent one matching your query.
    • The last Tweet in the last response will be the oldest one matching your query.
  • The max_results request parameter enables you to configure the number of Tweets returned per response. This defaults to 10 Tweets and has a maximum of 100. 
  • Every pagination implementation will involve parsing next_tokens from the response payload, and including them in the 'next page' search request. 
     

The Labs recent search endpoint was designed to support two fundamental use patterns:

  • Get historical - Requesting matching Tweets from a time period of interest. These are typically one-time requests in support of historical research. Search requests can be based on start_time and end_time request parameters. The Labs recent search endpoint responds with Tweets delivered in reverse-chronological order, starting with the most recent matching Tweet. 
  • Polling - Requesting matching Tweets that have been posted since the last Tweet received. These use cases often have a near-real-time focus and are characterized by frequent requests, "listening" for new Tweets of interest. The Labs recent search endpoint provides the since_id request parameter in support of the 'polling' pattern. To help with navigating by Tweet IDs, the until_id request parameter is also available.
     

Next, we'll discuss the historical mode. This is the default mode of Labs recent search and illustrates the fundamentals of pagination. Then we'll discuss examples of polling use cases. When polling triggers pagination, there is an additional step to manage search requests.

 

Retrieving historical data

This section outlines how you can retrieve Tweets from a period of interest within the last seven days using the start_time and end_time request parameters. Historical requests are typically one-time requests in support of research and analysis. 

Making requests for a period of data is the default mode of the Labs recent search endpoint. If a search request does not specify a start_time, end_time, or since_id request parameter, the end_time will default to "now" (actually 30 seconds before the time of query) and the start_time will default to seven days ago. 
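The defaults described above can be sketched in Python. This is a minimal illustration, not part of the Labs API: the helper names default_window and build_search_path are hypothetical, and the timestamp format (YYYY-MM-DDTHH:MM:SSZ in UTC) is an assumption that should be checked against the API reference.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

# Assumed timestamp format; confirm it against the API reference.
TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def default_window(now: datetime):
    """Return the (start_time, end_time) the endpoint uses when none is given."""
    end_time = now - timedelta(seconds=30)  # "now" is really 30 seconds ago
    start_time = now - timedelta(days=7)    # seven days ago
    return start_time, end_time


def build_search_path(query: str, start_time: datetime, end_time: datetime,
                      max_results: int = 10) -> str:
    """Build a request path with explicit time bounds (hypothetical helper).

    Both datetimes must be timezone-aware.
    """
    if max_results > 100:
        raise ValueError("max_results has a maximum of 100")
    params = {
        "query": query,
        "start_time": start_time.astimezone(timezone.utc).strftime(TIME_FORMAT),
        "end_time": end_time.astimezone(timezone.utc).strftime(TIME_FORMAT),
        "max_results": max_results,
    }
    return "/tweets/search?" + urlencode(params)
```

Passing the values from default_window explicitly should be equivalent to omitting both parameters; setting them yourself is mainly useful when the period of interest is narrower than the full seven days.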

The endpoint will respond with the first 'page' of Tweets in reverse-chronological order, starting with the most recent Tweet. The response JSON payload will also include a next_token if there are additional pages of data. To collect the entire set of matching Tweets, regardless of the number of pages, requests are made until no next_token is provided. 

For example, here is an initial request for Tweets with the keyword snow from the last week:

/tweets/search?query=snow

The response includes the most recent 10 Tweets, along with these "meta" attributes in the JSON response:

  "meta": {
        "newest_id": "1204860593741553664",
        "oldest_id": "1204860580630278147",
        "next_token": "b26v89c19zqg8o3fobd8v73egzbdt3qao235oql",
        "result_count": 10
    }


To retrieve the next 10 Tweets, this next_token is added to the original request. The request would be:

/tweets/search?query=snow&next_token=b26v89c19zqg8o3fobd8v73egzbdt3qao235oql

The process of looking for a next_token and including it in a subsequent request can be repeated until all (or some number of) Tweets are collected, or until a specified number of requests have been made. If data fidelity (collecting all matches of your query) is key to your use case, a simple "repeat until the response contains no next_token" design will suffice.
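The "repeat until there is no next_token" loop might look like the following Python sketch. It makes two assumptions: fetch_page is a hypothetical callable that you supply, which sends the given request parameters to the endpoint and returns the parsed JSON response, and the Tweets are assumed to arrive in a "data" array next to "meta".

```python
from typing import Callable, Dict, Iterator, Optional

# Hypothetical: takes request parameters, returns the parsed JSON response.
FetchPage = Callable[[Dict[str, str]], dict]


def iter_all_tweets(fetch_page: FetchPage, query: str,
                    max_pages: Optional[int] = None) -> Iterator[dict]:
    """Yield matching Tweets, newest first, following next_token page by page."""
    params = {"query": query}
    pages = 0
    while True:
        response = fetch_page(params)
        # Assumption: Tweets are delivered in a "data" array.
        yield from response.get("data", [])
        pages += 1
        next_token = response.get("meta", {}).get("next_token")
        # Stop when the last page is reached or the page budget is spent.
        if not next_token or (max_pages is not None and pages >= max_pages):
            break
        # The next request is the original one plus the token.
        params = {"query": query, "next_token": next_token}
```

The optional max_pages argument covers the "until a specified number of requests have been made" case; leave it unset when you need every match.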

This "paging backward through history" is the simplest form of pagination. Next we'll discuss polling use cases, where new Tweets of interest are frequently checked for. When such requests require pagination, an additional step is needed to prepare the next "any new Tweets?" query. See the next section for more details.

 

Polling and listening use cases

This section outlines how you can retrieve recent Tweets by polling the Labs recent search endpoint with the since_id request parameter. 

With polling use cases, "any new Tweets of interest?" queries are made on an ongoing, frequent basis. Unlike historical use cases, which base requests on time, polling use cases typically base requests on Tweet IDs.

Central to the polling use pattern is that every new Tweet has a unique ID that is 'emitted' from the Twitter platform generally in ascending order. If one Tweet has an ID smaller than another, it means it was posted earlier. Similarly, if the ID is greater than another, it was posted more recently.
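One practical detail when comparing IDs in code: the IDs in the JSON responses are strings, so they should be compared as numbers rather than as text. The helper names below are hypothetical, and the ordering is only "generally" ascending, as noted above.

```python
from typing import Iterable


def is_newer(tweet_id: str, other_id: str) -> bool:
    """Return True if tweet_id was (generally) posted after other_id."""
    # Compare numerically; string comparison would rank "9999" above "10000".
    return int(tweet_id) > int(other_id)


def newest_id(tweet_ids: Iterable[str]) -> str:
    """Return the largest ID, keeping the string form used in requests."""
    return max(tweet_ids, key=int)
```

Python integers are unbounded, so converting with int() is safe for IDs of any length.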

Labs recent search supports navigating the Tweet archive by Tweet ID. Responses from the Labs recent search endpoint include oldest_id and newest_id Tweet IDs. In the polling mode, requests are made with the since_id set to the largest/newest ID received so far. 

For example, say a query for new Tweets about snow is made every 5 minutes, and the last Tweet we received had a Tweet ID of 10000. When it is time to poll, the request looks like:

/tweets/search?query=snow&since_id=10000

Next, let's say seven Tweets were posted since our last request. Since all of these fit on a single data 'page', there is no next_token. The response provides the Tweet ID of the most recent (newest) Tweet:

"meta": {
        "newest_id": "12000",
        "oldest_id": "10005",
        "result_count": 7
    }


To make the next polling query, this newest_id value is used to set the next since_id parameter:

/tweets/search?query=snow&since_id=12000
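Carrying the ID from one polling request to the next can be sketched as a small Python helper. The function name is hypothetical, and it assumes that a response with no new Tweets simply omits newest_id, in which case the previous since_id is kept.

```python
from typing import Optional


def next_since_id(meta: dict, current_since_id: Optional[str]) -> Optional[str]:
    """Pick the since_id for the next polling request from a response's meta."""
    # Assumption: when nothing new matched, "newest_id" is absent from meta.
    return meta.get("newest_id", current_since_id)
```

With the response above, next_since_id(meta, "10000") returns "12000", which becomes the since_id of the next request.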

When more data is available and next tokens are provided, only the newest_id value from the first page is needed. Each page of data will include newest_id and oldest_id values, but the value provided in the first page is the only one needed for the next regularly scheduled polling request. So, if you are implementing a polling design, or searching for Tweets by ID range, pagination logic is slightly more complicated.

Now say that there are 18 more matching Tweets. The endpoint would respond with a full data page and a next_token for requesting the next page. This initial response would also include the newest Tweet ID needed for the next polling interval in five minutes.

"meta": {
        "newest_id": "13800",
        "oldest_id": "12500",
        "next_token": "fnsih9chihsnkjbvkjbsc",
        "result_count": 10
    }


To collect all the matching data, the next_token is included in the next request, which is otherwise the same as our original polling query (with the same since_id value):

/tweets/search?query=snow&since_id=12000&next_token=fnsih9chihsnkjbvkjbsc

 

"meta": {
        "newest_id": "12300",
        "oldest_id": "12010",
        "result_count": 8
    }


This second response provides the remaining eight Tweets, and no next_token. Note that we do not update our since_id request parameter from this second response's newest_id, and instead base our next request on the first response's value:

/tweets/search?query=snow&since_id=13800
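Putting the polling steps together, a single polling cycle with pagination might be sketched in Python as follows. As assumptions, fetch_page is a hypothetical callable that sends the given request parameters and returns the parsed JSON response, and the Tweets are assumed to arrive in a "data" array next to "meta".

```python
from typing import Callable, Dict, List, Tuple


def poll_once(fetch_page: Callable[[Dict[str, str]], dict],
              query: str, since_id: str) -> Tuple[List[dict], str]:
    """Collect every Tweet newer than since_id; return them with the next since_id."""
    params = {"query": query, "since_id": since_id}
    tweets: List[dict] = []
    new_since_id = since_id
    first_page = True
    while True:
        response = fetch_page(params)
        meta = response.get("meta", {})
        if first_page:
            # Only the first page's newest_id matters for the next cycle.
            new_since_id = meta.get("newest_id", since_id)
            first_page = False
        # Assumption: Tweets are delivered in a "data" array.
        tweets.extend(response.get("data", []))
        next_token = meta.get("next_token")
        if not next_token:
            return tweets, new_since_id
        # Same since_id as the original polling request, plus the token.
        params = {"query": query, "since_id": since_id, "next_token": next_token}
```

With the two responses above, poll_once(fetch_page, "snow", "12000") would make two requests, return all 18 Tweets, and hand back "13800" as the since_id for the request five minutes later.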

 


 

Additional resources

  • Check out our API reference to learn about how to customize your requests and see the response JSON payloads.
  • Check out our Labs code examples in Python, Ruby, Node.js, and Java on our GitHub page.