Hi HN, we’re Carl and Nic, the creators of crul, and we’ve been hard at work for the last year and a half building our dream of turning the web into a dataset. In a nutshell, crul is a tool for querying and building web and API data feeds from anywhere to anywhere.
With crul you can crawl and transform web pages into CSV tables, explore and dynamically query APIs, filter and organize data, and push data sets to third-party data lakes and analytics tools. Here’s a demo video (we’ve been told Nic sounds like John Mayer, lol).
We’ve personally struggled with wrangling data from the web using puppeteer/playwright/selenium and jq, or cobbling together Python scripts, client libraries, and schedulers to consume APIs. The reality is that shit is hard, doesn’t scale (the classic blocking for-loop, or async saturation), and comes with thorny maintenance/security issues. The tools we love to hate.
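To make that pain concrete, here’s a minimal sketch of the kind of throwaway script we kept rewriting: blocking, page-by-page API consumption. The endpoint, field names, and token handling are hypothetical, purely for illustration (nothing to do with crul itself):

    # Illustrative only: hand-rolled, blocking pagination of a hypothetical API.
    import requests

    def fetch_all_issues(base_url: str, token: str) -> list[dict]:
        items, page = [], 1
        while True:
            # Each request blocks before the next page can start (the classic
            # serial for-loop); flipping to async often just saturates rate limits.
            resp = requests.get(
                f"{base_url}/issues",
                params={"page": page, "per_page": 100},
                headers={"Authorization": f"Bearer {token}"},
                timeout=30,
            )
            resp.raise_for_status()
            batch = resp.json()
            if not batch:
                return items
            items.extend(batch)
            page += 1

Multiply that by every API and page layout you care about, plus a scheduler and secrets handling, and the maintenance burden adds up fast.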
Crul’s value prop is simple: query any webpage or API for free.
At its core, crul is built on the linked nature of Web/API content. It consists of a purpose-built map/expand/reduce engine for hierarchical Web/API content (kind of like Postman, but with a membership to Gold's Gym), paired with a familiar parsing expression grammar that naturally gets the job done (and layered caching to make it quick to fix when it doesn’t on the first try). There’s a boatload of other features: domain policies, a scheduler, checkpoints, templates, a REST API, a web UI, a vault, OAuth for third parties, and 20+ stores to send your data to.
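If the map/expand/reduce idea sounds abstract, here’s a rough conceptual sketch in plain Python of what it means for hierarchical web/API content. This is not crul’s engine or query syntax, and the endpoint and field names are made up for illustration:

    # Conceptual sketch only (not crul's engine or syntax): map each URL to a
    # document, expand nested records into rows, reduce rows into a flat table.
    import requests

    def expand(urls: list[str]) -> list[dict]:
        rows = []
        for url in urls:
            doc = requests.get(url, timeout=30).json()  # map: URL -> document
            rows.extend(doc.get("items", []))           # expand: document -> rows
        return rows

    def reduce_to_table(rows: list[dict], columns: list[str]) -> list[dict]:
        # reduce: keep only the columns you care about, ready for CSV/export
        return [{col: row.get(col) for col in columns} for row in rows]

    # Because the content is linked, each row can point at more URLs to expand
    # in the next stage of the pipeline; the endpoint below is hypothetical.
    repos = expand(["https://api.example.com/repos"])
    table = reduce_to_table(repos, ["name", "stars", "url"])

In crul, that kind of fan-out sits behind the query grammar, alongside the layered caching and scheduler mentioned above.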
Our goal is to open source crul as time and resources permit. At the end of the day it’s just the two of us trying to figure things out as we go! We’re just getting started.
Crul is one bad mother#^@%*& and the web is finally yours!
Download crul for free as a macOS desktop application or as a Docker image and let us know if you love it or hate it. And come say hello to us on our Slack channel - we’re a friendly bunch!
Nic and Carl - (Crul early days)
P.S. Every download helps, so give it a spin!