Browse by Keyword: "crawl"
amazon-products A node.js module to crawl product IDs from Amazon.
amazon-reviews A node.js module to crawl product reviews from Amazon.
bas Behaviour Assertion Sheets: CSS-like declarative syntax for client-side integration testing and quality assurance.
craigslist-utils Node.js utilities for scraping, crawling, and working with Craigslist
crawl Website crawler and differencer
crawljs A basic nodejs crawler.
crawlstream Crawl websites in a streaming fashion
crawly make output crawl across your terminal all spooky like
curljs Node.js module that wraps curl functionality
gnosis A utility to traverse an object and execute a callback to transform the object, etc.
gretel Follows and collects breadcrumbs across the web
grunt-url-image-crawler Crawl your CSS/SCSS or HTML files for img URLs and store the crawled image URLs in a local JSON file.
huntsman Super configurable async web spider
img-crawler A module to download images from a given URL
jedi-crawler Lightsabing Node/PhantomJS crawler. Crawl almost everything, including AJAX content.
node-spider Generic web crawler powered by NodeJS
phantom-crawl Web crawler for ajax applications
phantomjs-test-starter Starter Template for testing PhantomJS ‘Applications’ with Jasmine, Grunt, and Istanbul
rambler Walk a website checking for bad links
repubblica-sport-match A node.js module to scrape player points from Repubblica Sport match report
robotstxt A robots.txt parser for Node.js
ruthless Crawl the web breadth-first from a seed url, statefully
scraperrr Web crawler configured by JSON configurations defining what data fields to scrape from the visited websites using regular expressions or DOM selectors and how to export them as JSON
scrowser A server-side scraping web browser
skrap Easily scrape web pages by providing JSON recipes
skrapper Custom scraping tool
spatula Spatula scrapes things! Webby things.
spiderweb Crawl multiple domains using one or more entry URLs.
upstage Utility to crawl an entry URL and produce a static copy of the site.
witchypoo Stores a unique list of domain names and their page rank at time of crawling