Query and manipulate raw markup via CSS selectors, without losing original formatting

soup

A little library for querying and manipulating tag soup via CSS selectors.

It manipulates the string itself, rather than operating on a parsed DOM and then re-exporting it, so it retains all the syntactic/formatting nuances of the original, such as:

  • attribute quotes or lack thereof,
  • whitespace,
  • invalid-but-parseable stuff,
  • omitted closing tags, etc.
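
To illustrate why editing the string directly preserves formatting, here is a minimal sketch of splicing a replacement into markup by character index. The helper `spliceString` is hypothetical and is not part of soup's API; it only shows the general idea that everything outside the edited range is left byte-for-byte intact.

```javascript
// Hypothetical helper (not soup's API): replace the characters in
// [start, end) of `source` with `replacement`, leaving the rest untouched.
function spliceString(source, start, end, replacement) {
  return source.slice(0, start) + replacement + source.slice(end);
}

var markup = '<div class=thing><img src=cat.jpg></div>';

// Locate the attribute value to change. A real tool would find this range
// with a selector engine; indexOf is used here only to keep the sketch short.
var start = markup.indexOf('cat.jpg');
var end = start + 'cat.jpg'.length;

var result = spliceString(markup, start, end, 'dog.jpg');
console.log(result); // <div class=thing><img src=dog.jpg></div>
```

Because nothing is parsed and re-serialized, the unquoted attributes in the surrounding markup come through unchanged.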

Use cases:

  • build tasks/plugins that need to manipulate markup without parsing away all the original formatting;
  • GUI webpage design tools that need to combine hand-coded HTML with WYSIWYG-driven edits;
  • anywhere else you need to make automated, light-touch changes to other people's markup.


npm install soup

var Soup = require('soup');

var soup = new Soup('<div class=thing><img src=cat.jpg></div>');

// Change the img src
soup.setAttribute('img', 'src', 'dog.jpg');
soup.toString(); // <div class=thing><img src=dog.jpg></div>

// Add a class to the div
soup.setAttribute('.thing', 'class', function (oldValue) {
  return oldValue + ' another';
});
soup.toString(); // <div class="thing another"><img src=dog.jpg></div>


Soup uses Cheerio under the hood for finding elements to update, so you can use any CSS3 selector in the methods below.


setAttribute(selector, attributeName, newValue)

  • newValue can be:
    • any string – to set the attribute's value
    • true – to set it as a boolean attribute (eg required)
    • false – to delete the attribute
    • null – for "no change"
    • a function – which will be passed the current value, and should return one of the above values
      • it will also be passed an index as the second argument (if the attribute was found), which is the character index of the attribute you're changing.
  • Soup will respect the original quote style of each attribute it updates whenever possible (but quotes will be added to non-quoted values if necessitated by characters in the new value).
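
The rules above can be sketched roughly as follows. Both `resolveNewValue` and `formatValue` are hypothetical helpers written for illustration; they are not soup's actual internals, and the exact set of characters that forces quoting is an assumption based on what HTML allows in unquoted attribute values.

```javascript
// Hypothetical helper: interpret the `newValue` argument as described above.
function resolveNewValue(newValue, oldValue, index) {
  // A function receives the current value and the character index, and its
  // return value is then treated like any other newValue.
  if (typeof newValue === 'function') {
    newValue = newValue(oldValue, index);
  }
  if (newValue === null) return { action: 'none' };    // no change
  if (newValue === false) return { action: 'delete' }; // remove the attribute
  if (newValue === true) return { action: 'boolean' }; // e.g. required
  return { action: 'set', value: String(newValue) };
}

// Hypothetical helper: keep the original quote style, adding double quotes
// only when an unquoted value would no longer be valid. This sketch does not
// handle values that contain the chosen quote character.
function formatValue(value, originalQuote) {
  var needsQuotes = value === '' || /[\s"'=<>`]/.test(value);
  var quote = originalQuote || (needsQuotes ? '"' : '');
  return quote + value + quote;
}

var resolved = resolveNewValue(function (oldValue) {
  return oldValue + ' another';
}, 'thing', 5);

console.log(formatValue(resolved.value, '')); // "thing another"
console.log(formatValue('dog.jpg', ''));      // dog.jpg
```

This mirrors the earlier example, where class=thing gains quotes once the new value contains a space, while src=dog.jpg stays unquoted.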

Example – adding a query string to all image URLs:

soup.setAttribute('img', 'src', function (oldValue) {
  return oldValue + '?12345';
});

getAttribute(selector, attributeName, callback)

  • Same as .setAttribute(), except your callback's return value won't have any effect.

setInnerHTML(selector, newHTML)

  • newHTML can be:
    • a string of HTML
    • a function that returns a string of HTML
      • this will be passed the oldHTML
    • null for "no change"

Example – appending new content inside an element:

soup.setInnerHTML('#foo', function (oldHTML) {
  return oldHTML + '<p>appended content</p>';
});


Copyright (c) 2014 Callum Locke. Licensed under the MIT license.
