
Extracting Text Tags In Order - How Can This Be Done?

I am trying to find all the text along with the parent tag in the HTML. In the example below, the variable named html has the sample HTML where I try to extract the tags and the te

Solution 1:

You might want to just walk the tree in depth-first order, which visits the nodes in document order. The walk function is courtesy of this gist.

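const cheerio = require("cheerio");

// Visit el, then each of its children, passing along the chain of ancestors.
function walk(el, fn, parents = []) {
  fn(el, parents);
  (el.children || []).forEach((child) => walk(child, fn, parents.concat(el)));
}

// `html` is the sample HTML string from the question.
walk(cheerio.load(html).root()[0], (node, parents) => {
  // Only report non-blank text nodes, with the name of the immediate parent tag.
  if (node.type === "text" && node.data.trim()) {
    console.log(parents[parents.length - 1].name, node.data);
  }
});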

This prints each parent tag name next to its text, but you could just as well push the pairs into that array of yours.
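Collecting into an array instead of printing could look roughly like the sketch below. The `collectText` helper and the `{ tag, text }` result shape are hypothetical names chosen for illustration, and `sampleTree` is a hand-built stand-in for a parsed tree (assuming nodes expose `type`, `name`, `data` and `children` as in the snippet above) so the example runs without cheerio. Unlike the printing version, it trims the text before storing it.

```javascript
// Depth-first walk, same shape as the helper in the answer.
function walk(el, fn, parents = []) {
  fn(el, parents);
  (el.children || []).forEach((child) => walk(child, fn, parents.concat(el)));
}

// Hypothetical helper: returns [{ tag, text }, ...] in document order.
function collectText(root) {
  const results = [];
  walk(root, (node, parents) => {
    if (node.type === "text" && node.data.trim()) {
      const parent = parents[parents.length - 1];
      results.push({
        tag: parent ? parent.name : undefined,
        text: node.data.trim(),
      });
    }
  });
  return results;
}

// Hand-built stand-in (an assumption) for <div>Hello <b>world</b>!</div>.
const sampleTree = {
  type: "root",
  children: [
    {
      type: "tag",
      name: "div",
      children: [
        { type: "text", data: "Hello " },
        { type: "tag", name: "b", children: [{ type: "text", data: "world" }] },
        { type: "text", data: "!" },
      ],
    },
  ],
};

const found = collectText(sampleTree);
console.log(found);
```

With a real document you would pass cheerio.load(html).root()[0] to collectText in place of sampleTree.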

