read-tree
0.1.1 • Public • Published

Read Tree

A parse5 tree adapter that works with Readability. Pass the exported treeAdapter to parse5's parse function, then pass the resulting document to Readability to get a summarized document. This approach is much more lightweight than JSDOM and does not require browser DOM parsing either.

import { Readability } from "@mozilla/readability";
import { parse } from "parse5";
import { treeAdapter } from "read-tree";
// `html` is the raw HTML string to summarize.
const doc = parse(html, { treeAdapter });
// NOTE Readability says it takes a compliant document, but it actually just
// takes the limited Document defined here.
const article = new Readability(doc as unknown as Document).parse();
// parse() returns null when no article could be extracted.
const content = article?.content;

This was designed to work with Readability, but not necessarily to be performant. Several operations scale poorly, and the way Readability calls them results in quadratic running time. These could be brought down to linear time with significantly more effort spent on caching, but at this point, compliance was more important than performance.
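
The README does not say which operations are quadratic, so the following is only an illustrative sketch of the general trade-off, not read-tree's actual code. It assumes a hypothetical node type (SketchNode) whose next-sibling lookup scans the parent's child list; walking all siblings that way is quadratic, while a cached index map makes each lookup constant time at the cost of invalidation bookkeeping. All names here are made up for the example.

```typescript
// Hypothetical minimal node type; NOT read-tree's actual data structure.
interface SketchNode {
  name: string;
  parent: SketchNode | null;
  children: SketchNode[];
}

// Build a parent with `childCount` children for demonstration purposes.
function makeParent(childCount: number): SketchNode {
  const parent: SketchNode = { name: "parent", parent: null, children: [] };
  for (let i = 0; i < childCount; i++) {
    parent.children.push({ name: `child-${i}`, parent, children: [] });
  }
  return parent;
}

// Uncached: each call scans the parent's children with indexOf, so one call
// is O(n) and walking all n siblings this way is O(n^2) in total.
function nextSiblingScan(node: SketchNode): SketchNode | null {
  if (node.parent === null) return null;
  const siblings = node.parent.children;
  const index = siblings.indexOf(node);
  return siblings[index + 1] ?? null;
}

// Cached: the node -> index map is built once per parent (O(n)), after which
// each lookup is O(1). A real implementation would have to invalidate the
// cache whenever the children change; that is omitted here.
const indexCache = new WeakMap<SketchNode, Map<SketchNode, number>>();

function nextSiblingCached(node: SketchNode): SketchNode | null {
  if (node.parent === null) return null;
  let indices = indexCache.get(node.parent);
  if (indices === undefined) {
    indices = new Map<SketchNode, number>();
    node.parent.children.forEach((child, i) => indices!.set(child, i));
    indexCache.set(node.parent, indices);
  }
  const index = indices.get(node);
  if (index === undefined) return null;
  return node.parent.children[index + 1] ?? null;
}

// Walk a sibling chain with the given lookup and count the nodes visited.
function countSiblings(
  first: SketchNode,
  next: (n: SketchNode) => SketchNode | null,
): number {
  let count = 0;
  for (let cur: SketchNode | null = first; cur !== null; cur = next(cur)) {
    count++;
  }
  return count;
}
```

Both lookups return the same siblings; only the cost differs. The omitted invalidation step is where most of the "significantly more effort" would likely go, since every mutation of the tree would need to keep such caches consistent.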

Conceptually this lies somewhere between parse5 and cheerio. It imitates more browser functionality than parse5 does, but offers less functionality than cheerio.

Install

npm i read-tree

License

MIT

Collaborators

  • erik.brinkman