@themaximalist/scrape.js

0.0.9 • Public • Published

scrape.js


Scrape.js is an easy-to-use web scraping library for Node.js:

  • Extremely Fast
  • Scrape nearly any website
  • Auto-retries with increasing sophistication
  • Auto proxy rotation
  • ...it just works

const data = await scrape("https://example.com");
// { url, html, original_url, options }

You can specify additional options to scrape() for more control:

const data = await scrape("https://example.com", { headless: true, proxy: true });
// { url, html }
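
As a rough illustration of what retrying "with increasing sophistication" could look like from the caller's side, the sketch below tries progressively heavier option sets. It is not the library's internal implementation: `scrapeWithEscalation`, the `scrapeFn` parameter, and the tier list are hypothetical, and the sketch assumes a failed scrape rejects with an error. Only the `headless` and `proxy` options come from the example above.

```javascript
// Hypothetical sketch, not scrape.js internals.
// `scrapeFn` stands in for scrape(); it is assumed to reject on failure.
async function scrapeWithEscalation(scrapeFn, url, tiers = [
  {},                              // plain request
  { headless: true },              // retry with a headless browser
  { headless: true, proxy: true }, // retry with headless browser and proxy
]) {
  let lastError;
  for (const options of tiers) {
    try {
      return await scrapeFn(url, options);
    } catch (err) {
      lastError = err; // remember the failure and move to the next tier
    }
  }
  // Every tier failed (or no tiers were given)
  throw lastError ?? new Error("no option tiers provided");
}
```

Passing the function as a parameter keeps the sketch independent of the package, so it can be exercised with a stub before wiring in the real scrape().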

Installation

npm install @themaximalist/scrape.js

Usage

const scrape = require("@themaximalist/scrape.js");
(async () => {
  const data = await scrape("http://example.com"); // await must run inside an async function in CommonJS
})();
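
A small sketch of working with the result follows. It relies only on the `url` and `html` fields shown in the result comments above. `extractTitle` and `fetchTitle` are hypothetical helpers, not part of scrape.js, and the regular expression is a simplification rather than a full HTML parser.

```javascript
// Hypothetical helper: pull the <title> text out of an HTML string.
// Returns null when no <title> element is found.
function extractTitle(html) {
  const match = /<title[^>]*>([\s\S]*?)<\/title>/i.exec(html);
  return match ? match[1].trim() : null;
}

// Hypothetical wrapper: `scrapeFn` is expected to behave like scrape(),
// resolving to an object with `url` and `html` fields.
async function fetchTitle(scrapeFn, url, options = {}) {
  const data = await scrapeFn(url, options);
  return { url: data.url, title: extractTitle(data.html) };
}
```

With the package installed, the real function could be passed in as `fetchTitle(require("@themaximalist/scrape.js"), "http://example.com")`.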

Configuration

scrape.js uses ZenRows for proxy rotation. To enable it, obtain a ZenRows API key and set it in the ZENROWS_API_KEY environment variable. scrape.js also works without proxies, but it is less effective.

ZENROWS_API_KEY=abcxyz123
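
One way to avoid requesting a proxy when no key is configured is to derive the options from the environment. This is a minimal sketch: `buildScrapeOptions` is a hypothetical helper, and how scrape.js itself reads ZENROWS_API_KEY is not covered here. The `headless` and `proxy` option names are taken from the earlier example.

```javascript
// Hypothetical helper: turn on the `proxy` option only when a
// ZenRows API key is present in the environment.
function buildScrapeOptions(env = process.env) {
  const key = env.ZENROWS_API_KEY;
  const hasKey = typeof key === "string" && key.trim() !== "";
  return { headless: true, proxy: hasKey };
}
```

The result can then be passed as the second argument to scrape().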

Examples

See the tests for examples of how to use scrape.js.

Projects

scrape.js is currently used in the following projects:

  • News Score: score the news, rewrite the headlines

Author

License

MIT
