There are divs with class="xj7". Below each of them there is an <a href> link.
How can I access the value of the link? What makes it even trickier is that there are many elements with the same class name, so ideally I want to loop through them.
Another hindrance is that the link is in relative form, meaning it doesn't specify the domain name. It looks like this:
    <div class="xj7">
        <a href="/tst/gfhe7sje">
asked Jun 10, 2018 at 23:45
user1584421
2 Answers
Try this and let me know if it works.
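const puppeteer = require('puppeteer');

async function run() {
    // The original snippet assumed `browser` and `page` already existed; create them here.
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto('<url_here>');
    // Kwh5n is an extra class used in this answer; use "div.xj7" if the divs carry only xj7.
    const div_selector = "div.xj7.Kwh5n";
    // Count the matching divs inside the page.
    const list_length = await page.evaluate((sel) => {
        const elements = Array.from(document.querySelectorAll(sel));
        return elements.length;
    }, div_selector);
    for (let i = 0; i < list_length; i++) {
        // Read the first anchor of the i-th div; anchor.href is the resolved (absolute) URL.
        const href = await page.evaluate((l, sel) => {
            const elements = Array.from(document.querySelectorAll(sel));
            const anchor = elements[l].getElementsByTagName('a')[0];
            if (anchor) {
                return anchor.href;
            } else {
                return '';
            }
        }, i, div_selector);
        console.log('--------> ', href);
    }
    await browser.close();
}
run();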
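The loop above calls page.evaluate once to count the divs and then once more per div. As a possible simplification, the same lookup can be done in a single pass inside the page. This is only a sketch: `collectFirstAnchorHrefs` is a hypothetical helper name, and the 'div.xj7' selector in the usage comment is assumed from the question rather than tested against the real page.

```javascript
// Intended to be passed to page.evaluate, so `document` refers to the
// page's DOM. Returns one entry per matching div: the first anchor's
// resolved URL, or '' when the div contains no anchor.
function collectFirstAnchorHrefs(sel) {
  return Array.from(document.querySelectorAll(sel)).map((div) => {
    const anchor = div.querySelector('a');
    return anchor ? anchor.href : '';
  });
}

// Hypothetical usage with an already-open Puppeteer `page`:
//   const hrefs = await page.evaluate(collectFirstAnchorHrefs, 'div.xj7');
//   hrefs.forEach((href) => console.log('--------> ', href));
```

Because the function only uses its argument and the page's `document`, Puppeteer can serialize it and run it in the browser context without any extra setup.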
You can do this:
const puppeteer = require('puppeteer')

const crawl = async (url) => {
    try {
        console.log(`Crawling ${url}`)
        const browser = await puppeteer.launch()
        const page = await browser.newPage()
        await page.goto(url)
        // Anchors that are direct children of an element with class xj7
        const selector = '.xj7 > a'
        await page.waitForSelector(selector)
        // e.href is the absolute URL as resolved by the browser
        const links = await page.$$eval(selector, am => am.filter(e => e.href).map(e => e.href))
        console.log(links)
        await browser.close()
    } catch (err) {
        console.log(err)
    }
}
crawl('https://example.')
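On the relative-link concern from the question: both answers read the `href` property of the anchor, which the browser already resolves to an absolute URL, whereas `getAttribute('href')` would return the raw value such as /tst/gfhe7sje. If you end up with raw attribute values anyway, a minimal sketch of resolving them in Node with the built-in `URL` class could look like this. `toAbsoluteUrl` is a hypothetical helper, and the page URL below is a placeholder; in a Puppeteer script you would likely pass `page.url()` instead.

```javascript
// Resolve a raw href attribute (possibly relative) against the URL of the
// page it was found on. Returns null when the input cannot be parsed.
function toAbsoluteUrl(rawHref, pageUrl) {
  try {
    return new URL(rawHref, pageUrl).href;
  } catch (err) {
    return null;
  }
}

// Placeholder page URL; the path is the one from the question.
console.log(toAbsoluteUrl('/tst/gfhe7sje', 'https://example.com/some/page'));
// prints https://example.com/tst/gfhe7sje
```

Values that are already absolute pass through unchanged, so the helper can be applied to a mixed list of links.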