[Nodejs] Web Scraping note (cheerio)

Scraping and Art

With the Cheerio module, you can use jQuery syntax to work with downloaded web data. Cheerio lets developers focus on the downloaded data itself rather than on parsing it.

Load HTML

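var request = require('request');
var cheerio = require('cheerio');

// Download the page, then parse the returned HTML with cheerio.
request('http://www.google.com/', function(err, resp, html) {
  if (!err) {
    // $ now queries the downloaded document with jQuery syntax.
    const $ = cheerio.load(html);
    console.log($('title').text());
  }
});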

Selectors

Example HTML content:

<ul id="fruits">
  <li class="apple">Apple</li>
  <li class="orange">Orange</li>
  <li class="pear">Pear</li>
</ul>

$('.apple', '#fruits').text()
//=> Apple

$('ul .pear').attr('class')
//=> pear

$('li[class=orange]').html()
//=> Orange

Traversing

.find(selector)

Get a set of descendants filtered by selector of each element in the current set of matched elements.

$('#fruits').find('li').length
//=> 3

.parent()

Gets the parent of each selected element.

$('.pear').parent().attr('id')
//=> fruits

.next()

Gets the next sibling of each selected element.

$('.apple').next().hasClass('orange')
//=> true

.prev()

Gets the previous sibling of each selected element.

$('.orange').prev().hasClass('apple')
//=> true

.siblings()

Gets the siblings of each selected element, excluding the element itself.

$('.pear').siblings().length
//=> 2

.children(selector)

Gets the children of each selected element, optionally filtered by selector.

$('#fruits').children().length
//=> 3
$('#fruits').children('.pear').text()
//=> Pear


Algorithm Artist and tech blogger from Macau, Co-Founder of golding.cc. ex @yahoo coder. https://www.linkedin.com/in/waheng