How to find filters
Using the open or api command often generates a large data set, and it can be tricky to work out which filters to apply to it. For example, if we are trying to get all headlines from a news site, how do we know which filters describe a headline?
A good pattern for finding the filters that describe a particular element is to use the find command to look for a specific piece of text in our data.
For example, if one of the headlines is "Liverpool wins the Champion's League", we could first isolate the elements on the page that contain that text, then pick out the specific row/element that holds the headline. Our query would look something like this:
... || find "Liverpool wins the Champion's League"
Then we can look through that row for values that describe the element. Maybe in this case the row has an attributes.class column where the value is headline and a nodeName column where the value is H2. Our filter would look something like:
... || filter "attributes.class == 'headline' and nodeName == 'H2'"
If we then remove the find command and keep only the filter, we will likely get back all the headlines.
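The find-then-filter pattern above can be sketched in plain Python. This is only an illustration of the workflow, not how the tool itself works: the rows are made-up sample data, the find helper is a hypothetical stand-in that assumes the find command keeps any row in which some column contains the text, and is_headline mirrors the filter expression shown earlier.

```python
# Hypothetical rows, standing in for the data set an open command might return.
rows = [
    {"nodeName": "H2", "attributes.class": "headline",
     "text": "Liverpool wins the Champion's League"},
    {"nodeName": "H2", "attributes.class": "headline",
     "text": "Transfer window opens"},
    {"nodeName": "P", "attributes.class": "summary",
     "text": "Liverpool wins the Champion's League after a tense final."},
    {"nodeName": "A", "attributes.class": "nav-link", "text": "Sport"},
]


def find(rows, needle):
    """Keep rows where any column value contains the text (assumed behaviour)."""
    return [row for row in rows if any(needle in str(v) for v in row.values())]


def is_headline(row):
    """Python equivalent of: attributes.class == 'headline' and nodeName == 'H2'."""
    return row["attributes.class"] == "headline" and row["nodeName"] == "H2"


# Step 1: narrow the data to rows containing a headline we already know.
matches = find(rows, "Liverpool wins the Champion's League")
for row in matches:
    print(row)  # inspect the columns to see what distinguishes the headline

# Step 2: drop the text search and apply the filter to every row.
headlines = [row["text"] for row in rows if is_headline(row)]
print(headlines)
```

Note that the text search also returns the summary paragraph, which is why the row values still need to be inspected by hand before choosing the filter.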
Sometimes you will need a few attempts to get the right combination of filters to capture all the elements you are interested in. For example, the main headline may match slightly different values, which need to be added to our filter expression with an or.
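One way to check whether a filter captures everything is to compare its results against a few headlines you can see on the page. The sketch below does this in plain Python under stated assumptions: the rows, the main-headline class, and the H1 node name are all invented for illustration, and the two predicate functions are hypothetical stand-ins for filter expressions.

```python
# Hypothetical rows: here the main headline uses a different class and tag.
rows = [
    {"nodeName": "H1", "attributes.class": "main-headline",
     "text": "Liverpool wins the Champion's League"},
    {"nodeName": "H2", "attributes.class": "headline",
     "text": "Transfer window opens"},
    {"nodeName": "H2", "attributes.class": "headline",
     "text": "New stadium approved"},
    {"nodeName": "P", "attributes.class": "summary", "text": "Match report"},
]

# Headlines we can see on the page and expect the filter to return.
expected = {
    "Liverpool wins the Champion's League",
    "Transfer window opens",
    "New stadium approved",
}


def narrow(row):
    """First attempt: only the ordinary headline pattern."""
    return row["attributes.class"] == "headline" and row["nodeName"] == "H2"


def widened(row):
    """Second attempt: the ordinary pattern or the main-headline pattern."""
    return narrow(row) or (
        row["attributes.class"] == "main-headline" and row["nodeName"] == "H1"
    )


def missing(predicate):
    """Return the expected headlines that the predicate fails to capture."""
    found = {row["text"] for row in rows if predicate(row)}
    return expected - found


print(missing(narrow))   # the main headline is not captured yet
print(missing(widened))  # nothing is missing once the or branch is added
```

Each time something is missing, the same find step can be repeated on the missing text to discover which values the extra or branch needs.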