Querying an asynchronous API (Splunk Query)
Many services, including query engines such as GCP BigQuery, AWS Athena, and Splunk, expose asynchronous dispatch APIs for running queries. You dispatch a query against the service, get back a job id, and then poll that job for status or completion before accessing the results. This is a common API pattern, and it is supported by the crul api command syntax.
Querying an asynchronous API is usually a three-step process:
- Dispatching the job
- Polling for the job to complete
- Reading the results
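The three steps above can be sketched outside of crul as a small Python program. This is only an illustration of the pattern: FakeJobService, run_query, and their parameters are hypothetical names invented for this example, and the in-memory service stands in for a real API such as Splunk's.

```python
import itertools
import time


class FakeJobService:
    """Hypothetical in-memory stand-in for an asynchronous query API."""

    def __init__(self, polls_until_done=3):
        self._ids = itertools.count(1)
        self._jobs = {}
        self._polls_until_done = polls_until_done

    def dispatch(self, query):
        # Step 1: submit the query and get a job id back immediately.
        job_id = f"job-{next(self._ids)}"
        self._jobs[job_id] = {"query": query, "polls": 0}
        return job_id

    def status(self, job_id):
        # Each status check moves the fake job one step closer to completion.
        job = self._jobs[job_id]
        job["polls"] += 1
        return "DONE" if job["polls"] >= self._polls_until_done else "RUNNING"

    def results(self, job_id):
        # Step 3: results are only meaningful once the job is DONE.
        return [{"query": self._jobs[job_id]["query"], "row": 1}]


def run_query(service, query, interval=0.0, max_polls=100):
    job_id = service.dispatch(query)
    for _ in range(max_polls):
        # Step 2: poll until the job reports completion.
        if service.status(job_id) == "DONE":
            return service.results(job_id)
        time.sleep(interval)
    raise TimeoutError(f"{job_id} did not finish after {max_polls} polls")


rows = run_query(FakeJobService(), "index=main")
```

The max_polls cap is a design choice of this sketch: a polling loop with no upper bound can hang forever if a job never reaches its final state.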
Each of these steps can be contained in a crul query stage. Let's use the dispatch of a Splunk query as an example.
Example: Splunk Query
Note: this query runs against the Splunk free tier, so authentication is not needed.
Full Query
addcolumn search "index=main"
|| urlencode search
|| api post http://localhost:8000/services/search/jobs
--headers '{"Content-Type": "application/x-www-form-urlencoded"}'
--data "search=$search$"
--verifySSL false
--serializer xml
|| api post http://localhost:8000/services/search/jobs/$response.sid.0$
--verifySSL false
--serializer xml
--while 'entry.content.0.s:dict.0.s:key.6.textContent != "DONE"'
|| api get $entry.id.0$/results?output_mode=json
--verifySSL false
--serializer json
|| table results.*
|| normalize results
Stages 1-3: Dispatching the job
addcolumn search "index=main"
|| urlencode search
|| api post http://localhost:8000/services/search/jobs
--headers '{"Content-Type": "application/x-www-form-urlencoded"}'
--data "search=$search$"
--verifySSL false
--serializer xml
The first stage creates a search column containing our Splunk search using the addcolumn command. The second stage uses urlencode to encode the search value. The third stage posts the search to the http://localhost:8000/services/search/jobs endpoint using the api command. We use the --headers and --data flags to control what is sent with the request, turn off SSL verification, and set xml as our serialization (the Splunk api returns xml by default).
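To make the encoding step concrete, here is a minimal Python sketch of what the urlencode and --data stages amount to for this search. It uses the standard library's urllib.parse.quote; whether crul's urlencode produces byte-for-byte the same output is an assumption, and the variable names are only illustrative.

```python
from urllib.parse import quote

search = "index=main"

# Percent-encode every reserved character, including "=", so the search
# string cannot be confused with the form body's own key=value syntax.
encoded_search = quote(search, safe="")

# Rough equivalent of --data "search=$search$" after substitution.
body = f"search={encoded_search}"
```

Without the encoding step, the "=" inside the search would be indistinguishable from the "=" that separates the form field name from its value.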
Stage 4: Polling for the job to complete
...
|| api post http://localhost:8000/services/search/jobs/$response.sid.0$
--verifySSL false
--serializer xml
--while 'entry.content.0.s:dict.0.s:key.6.textContent != "DONE"'
Our polling stage uses the api command against an endpoint that returns the status of the search job. It uses the same SSL verification and serialization flags as the previous stage, but note the --while flag, which takes an expression to evaluate against the results of the stage. As long as the expression is true (here, while the state is not DONE), the stage runs again. This translates to polling the endpoint until the search's state changes to DONE.
Note the $response.sid.0$ value in the request url. This is a template value derived from the results of the previous dispatch stage. It is the id of the job.
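The idea behind a template value like $response.sid.0$ can be illustrated with a short Python sketch. This is not crul's implementation: resolve and render are hypothetical helpers, the nested shape of row is an assumption about how the dispatch response might look, and the sid shown is made up.

```python
import re


def resolve(path, data):
    """Walk a dotted path through nested dicts and lists."""
    current = data
    for part in path.split("."):
        if isinstance(current, list):
            # Numeric path segments index into lists.
            current = current[int(part)]
        else:
            current = current[part]
    return current


def render(template, row):
    # Replace each $dotted.path$ token with the value found in the row.
    return re.sub(
        r"\$([^$]+)\$",
        lambda match: str(resolve(match.group(1), row)),
        template,
    )


# Assumed shape of the dispatch stage's output; the sid value is invented.
row = {"response": {"sid": ["1700000000.42"]}}
url = render("http://localhost:8000/services/search/jobs/$response.sid.0$", row)
```

Under these assumptions, the trailing .0 selects the first element of the sid list, which is why the path ends in an index rather than a key.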
Stages 5-7: Reading the results
...
|| api get $entry.id.0$/results?output_mode=json
--verifySSL false
--serializer json
|| table results.*
|| normalize results
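The final stages retrieve the results as JSON, using the $entry.id.0$ template value from the polling stage to build the request url, and then shape them with the table and normalize commands for further processing in crul or export to another system.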