Paginating an API (pokeapi)
Many APIs return paginated responses: not all results are available in a single request, but each response includes a pointer to the next set of results. This pointer could be a hash value, a page number, an offset, or an explicit link. The crul api command can handle many types of pagination using the --pagination.* set of flags.
Let's take a look at a pair of examples.
Example 1
Full Query
api get https://pokeapi.co/api/v2/pokemon
--pagination.max 5
--pagination.next next
|| normalize results
Stage 1: Making the paginated request
api get https://pokeapi.co/api/v2/pokemon
--pagination.max 5
--pagination.next next
The pokeapi returns a paginated response when listing Pokemon. Each response contains a next key whose value is a link to the next set of results. We use the --pagination.next flag to tell the stage where to find the link to the next page, and the --pagination.max flag to tell the stage how many pages to get.
Essentially, this stage first makes a request to the https://pokeapi.co/api/v2/pokemon endpoint, then reads the next value in the response and requests that URL. It repeats this until next no longer exists, a request returns an error, or the --pagination.max value is reached.
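The loop described above can be sketched in Python. This is only an illustration of the general link-following technique, not crul's implementation: the function name `follow_next_links`, the `fetch` callback, and the `example.test` URLs are all hypothetical, and a dictionary stands in for real HTTP calls so the sketch runs offline.

```python
from typing import Callable, Optional


def follow_next_links(
    first_url: str,
    fetch: Callable[[str], dict],
    next_key: str = "next",
    max_pages: int = 5,
) -> list[dict]:
    """Collect pages by following the link stored under next_key."""
    pages: list[dict] = []
    url: Optional[str] = first_url
    while url and len(pages) < max_pages:
        try:
            page = fetch(url)
        except Exception:
            break  # stop when a request fails
        pages.append(page)
        url = page.get(next_key)  # None when there is no further page
    return pages


# Hypothetical data shaped like a pokeapi list response.
fake_api = {
    "https://example.test/pokemon": {
        "results": [{"name": "bulbasaur"}],
        "next": "https://example.test/pokemon?offset=1",
    },
    "https://example.test/pokemon?offset=1": {
        "results": [{"name": "ivysaur"}],
        "next": None,
    },
}

pages = follow_next_links("https://example.test/pokemon", fake_api.__getitem__)
print(len(pages))  # 2
```

The loop has the same three stopping conditions as the stage: the next value is missing, a request fails, or the page limit is reached.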
Stage 2: Normalizing the results
...
|| normalize results
The results from this endpoint are nested in a results array. Note the results.0.*, results.1.*, etc. columns. We can expand the results array into one row per result using the normalize command.
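To make the difference between the wide results.0.* columns and one row per result concrete, here is a small Python sketch. It only mimics the effect described above and is not how crul implements normalize; the helper names `flatten` and `expand_rows` and the sample data are assumptions made for illustration.

```python
def flatten(value, prefix=""):
    """Flatten nested dicts and lists into dotted column names."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)
    else:
        return {prefix: value}
    flat = {}
    for key, child in items:
        name = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(child, name))
    return flat


def expand_rows(page, key):
    """Return one flattened row per element of the array stored under key."""
    return [flatten(item) for item in page.get(key, [])]


# Hypothetical page shaped like a pokeapi list response.
page = {
    "count": 2,
    "results": [
        {"name": "bulbasaur", "url": "u1"},
        {"name": "ivysaur", "url": "u2"},
    ],
}

wide = flatten(page)                 # a single row with results.0.name, results.1.name, ...
rows = expand_rows(page, "results")  # two rows, each with name and url columns
print(sorted(wide))
print(rows)
```

The first form grows a new set of columns for every result, while the second keeps a fixed set of columns and grows rows instead, which is usually easier to filter and aggregate.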
Example 2
Full Query
api get "https://api.twitter.com/2/tweets/search/recent?query=NASA&tweet.fields=created_at,author_id,public_metrics&max_results=100"
--bearer "$CREDENTIALS.twitter_bearer$"
--pagination.max 5
--pagination.next "meta.next_token"
--pagination.url "https://api.twitter.com/2/tweets/search/recent?query=NASA&tweet.fields=created_at,author_id,public_metrics&max_results=100&next_token=$pagination.next$"
|| normalize data
Stage 1: Making the paginated request
api get "https://api.twitter.com/2/tweets/search/recent?query=NASA&tweet.fields=created_at,author_id,public_metrics&max_results=100"
--bearer "$CREDENTIALS.twitter_bearer$"
--pagination.max 5
--pagination.next "meta.next_token"
--pagination.url "https://api.twitter.com/2/tweets/search/recent?query=NASA&tweet.fields=created_at,author_id,public_metrics&max_results=100&next_token=$pagination.next$"
The Twitter API uses a different pagination strategy from the pokeapi in the example above. The Twitter API response contains a meta.next_token value that can be used to construct a URL for the next page of results. For this API, we use the --pagination.next flag to identify the value that points to the next page, then use the --pagination.url flag to construct a URL that includes that value through the $pagination.next$ token.
It may seem a bit confusing at first, but essentially we pluck the next hash, token, or page value out of the API response, then substitute that value for a token in a custom URL. Generally this URL differs little from the URL of the initial request: it just includes a new query parameter or another small modification.
Again we have set --pagination.max to 5, which limits the number of pages we visit to 5. Set this to 0 to continue until no more pages exist.
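The token-based strategy can be sketched in Python as follows. This is a rough illustration under stated assumptions, not crul's implementation: `get_path`, `paginate_with_token`, the `fetch` callback, and the `example.test` URLs are hypothetical, and a dictionary stands in for the real API so the sketch runs without credentials. As in the stage above, a max_pages of 0 is treated as "no limit".

```python
def get_path(data, dotted):
    """Look up a dotted path such as "meta.next_token"; return None if absent."""
    current = data
    for part in dotted.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def paginate_with_token(first_url, url_template, fetch,
                        next_path="meta.next_token", max_pages=5):
    """Fetch pages, building each follow-up URL from the token in the last page."""
    pages = []
    url = first_url
    while url is not None:
        page = fetch(url)
        pages.append(page)
        if max_pages and len(pages) >= max_pages:
            break  # page limit reached (0 means unlimited)
        token = get_path(page, next_path)
        # Substitute the token into the template, or stop if there is none.
        url = url_template.replace("$pagination.next$", str(token)) if token else None
    return pages


# Hypothetical responses shaped like a token-paginated search API.
TEMPLATE = "https://example.test/search?q=NASA&next_token=$pagination.next$"
fake_api = {
    "https://example.test/search?q=NASA":
        {"data": [{"id": "1"}], "meta": {"next_token": "abc"}},
    "https://example.test/search?q=NASA&next_token=abc":
        {"data": [{"id": "2"}], "meta": {"next_token": "def"}},
    "https://example.test/search?q=NASA&next_token=def":
        {"data": [{"id": "3"}], "meta": {}},
}

all_pages = paginate_with_token(
    "https://example.test/search?q=NASA", TEMPLATE, fake_api.__getitem__, max_pages=0
)
print(len(all_pages))  # 3
```

Note that only the template changes between requests: the base URL and query stay the same, and the token from the previous response fills the placeholder.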
Stage 2: Normalizing the results
...
|| normalize data
The results from this endpoint are nested in a data array. Note the data.0.*, data.1.*, etc. columns. We can expand the data array into one row per element using the normalize command.