Extracting Text Tags In Order - How Can This Be Done?
I am trying to find all the text along with the parent tag in the HTML. In the example below, the variable named html has the sample HTML where I try to extract the tags and the te
Solution 1:
You might want to just walk the tree depth-first, which visits the nodes in document order. The walk function is courtesy of this gist.
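const cheerio = require("cheerio");

// Depth-first walk: call fn on every node, passing the list of its ancestors.
function walk(el, fn, parents = []) {
  fn(el, parents);
  (el.children || []).forEach((child) => walk(child, fn, parents.concat(el)));
}

// html is the sample HTML string from the question.
walk(cheerio.load(html).root()[0], (node, parents) => {
  // Skip whitespace-only text nodes.
  if (node.type === "text" && node.data.trim()) {
    console.log(parents[parents.length - 1].name, node.data);
  }
});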
This prints each parent tag name followed by its text, in document order. You could just as well push each pair into that array of yours instead of logging it.
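Here is a rough sketch of the array version, reusing the same walk function. To keep it runnable without cheerio installed, it walks a small hand-built tree whose nodes only assume the type, name, data and children properties that the answer's code already relies on. The names collectText and sampleTree are made up for illustration, and trimming the text is my own choice, not something the original code does.

```javascript
// Depth-first walk, same as in the answer above.
function walk(el, fn, parents = []) {
  fn(el, parents);
  (el.children || []).forEach((child) => walk(child, fn, parents.concat(el)));
}

// Hypothetical helper: collect [parentTagName, text] pairs in document order.
function collectText(root) {
  const pairs = [];
  walk(root, (node, parents) => {
    if (node.type === "text" && node.data.trim()) {
      const parent = parents[parents.length - 1];
      // Trimming is an assumption here; drop .trim() to keep the raw text.
      pairs.push([parent ? parent.name : null, node.data.trim()]);
    }
  });
  return pairs;
}

// Hand-built stand-in for <div><h1>Title</h1><p>Hello <b>world</b></p></div>.
const sampleTree = {
  type: "tag",
  name: "div",
  children: [
    { type: "tag", name: "h1", children: [{ type: "text", data: "Title" }] },
    {
      type: "tag",
      name: "p",
      children: [
        { type: "text", data: "Hello " },
        { type: "tag", name: "b", children: [{ type: "text", data: "world" }] },
      ],
    },
  ],
};

const pairs = collectText(sampleTree);
console.log(pairs);
```

With cheerio you would pass cheerio.load(html).root()[0] to collectText instead of sampleTree, exactly as in the walk call above.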