I'm using Playwright to scrape some data. How do I click on all links on the page matching a selector?

const { firefox } = require('playwright');

(async () => {
  const browser = await firefox.launch({headless: false, slowMo: 50});
  const page = await browser.newPage();
  
  await page.goto('https://www.google.com');
  
  page.pause(); // allow user to manually search for something
  
  const wut = await page.$$eval('a', links => {
    links.forEach(async (link) => {
      link.click();           // maybe works?
      console.log('whoopee'); // doesn't print anything
      page.goBack();          // crashes
    });
    return links;
  });

  console.log(`wut? ${wut}`); // prints 'wut? undefined'

  await browser.close();
})();

Some issues:

  1. console.log inside the $$eval doesn't do anything.
  2. page.goBack() and page.pause() inside the eval cause a crash.
  3. The return value of $$eval is undefined (if I comment out page.goBack() so I get a return value at all). If I return links.length instead of links, it's correct (i.e. it's a positive integer). Huh?

I get similar results with:

const links = await page.locator('a');
await links.evaluateAll(...)

Clearly I don't know what I'm doing. What's the correct code to achieve something like this?

(X-Y problem alert: I don't actually care if I do this with $$eval, Playwright, or frankly even Javascript; all I really want to do is make this work in any language or tool).

asked Jan 13, 2022 at 20:37 by Sasgorilla
  • Remember that the function you pass to $$eval runs inside the browser. So the console.log will be printed in the browser and page.goBack won't work inside the browser. – hardkoded Commented Jan 13, 2022 at 21:43
  • @hardkoded - Thank you, that explains why I'm missing the logging. I can perhaps return an array of strings for that. But if page.goBack() doesn't work, how can I get back to the original page? (Or does each link open in a new page/browser?) – Sasgorilla Commented Jan 13, 2022 at 23:37
  • Exactly. You can't keep the elements across navigations. You need to keep your elements in some way you can query them again. That could be as simple as keeping the index, if you trust that the links won't change across navigations. – hardkoded Commented Jan 14, 2022 at 13:31
  • Let me be sure I understand here: when I do link.click(), the browser navigates to a new page. There's no way to get a reference to the new page? No way to find an element, click a link on it, nothing? What good is link.click() then? – Sasgorilla Commented Jan 15, 2022 at 15:45
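
To make the point in the comments concrete, here is a minimal sketch (not from the thread) of the two things that go wrong inside $$eval. The callback runs in the browser, so its console.log output only reaches the terminal if you listen for the page's 'console' event, and its return value has to be serializable, so it returns href strings rather than DOM nodes. The helper names forwardBrowserConsole and collectHrefs are hypothetical, and `page` is assumed to be a Playwright Page.

```javascript
// Hypothetical helpers for illustration; `page` is assumed to be a Playwright Page.

// Print browser-side console messages in the Node terminal.
function forwardBrowserConsole(page) {
  page.on('console', (msg) => console.log('[browser]', msg.text()));
}

// Return link targets as plain strings. DOM elements cannot be sent back
// to Node from $$eval, but strings and numbers can.
async function collectHrefs(page, selector) {
  return page.$$eval(selector, (links) => {
    console.log(`found ${links.length} links`); // runs in the browser
    return links.map((link) => link.href);
  });
}
```

Something like `forwardBrowserConsole(page); const hrefs = await collectHrefs(page, 'a');` would then give you an array of URL strings in Node. Calls such as page.goBack() still have to happen outside the callback, because `page` does not exist in the browser.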

2 Answers

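// Assumption: the original snippet used an unspecified `launch` helper;
// a stock Chromium launch stands in for it here.
const { chromium } = require('playwright');
const browser = await chromium.launch({ slowMo: 250 });
const context = await browser.newContext();
const page = await context.newPage();
await page.goto('https://stackoverflow.com/questions/70702820/how-can-i-click-on-all-links-matching-a-selector-with-playwright');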

const links = page.locator('a:visible');
const linksCount = await links.count();

for (let i = 0; i < linksCount; i++) {
  await page.bringToFront();

  try {
    const [newPage] = await Promise.all([
      context.waitForEvent('page', { timeout: 5000 }),
      links.nth(i).click({ modifiers: ['Control', 'Shift'] })
    ]);
    await newPage.waitForLoadState();
    console.log('Title:', await newPage.title());
    console.log('URL: ', newPage.url());

    await newPage.close();
  }
  catch {
    continue;
  }
}

There are a number of ways you could do this, but I like this approach the most. Clicking a link, waiting for the page to load, and then going back to the previous page has a lot of problems. Most importantly, on many pages the links might change every time the page loads. Ctrl+Shift+clicking opens the link in a new tab, which you can access by using the Promise.all pattern to catch the 'page' event.

I only tried this on this page, so I'm sure there are plenty of other problems that may arise. For this page in particular, using 'a:visible' was necessary to avoid getting stuck on hidden links. The whole clicking operation is wrapped in a try/catch because some of the links aren't real links and don't open a new page.
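
For comparison, the click-and-go-back alternative mentioned above could look roughly like the sketch below. It follows the suggestion from the comments of keeping only the index and querying the link again on every pass. The function name visitEachByIndex and the onVisit callback are hypothetical, `page` is assumed to be a Playwright Page, and the sketch assumes the set and order of links stay the same after each goBack(), which is exactly the assumption that breaks on many real pages.

```javascript
// Hypothetical helper for illustration; `page` is assumed to be a Playwright Page.
async function visitEachByIndex(page, selector, onVisit) {
  const startUrl = page.url();
  const count = await page.locator(selector).count();

  for (let i = 0; i < count; i++) {
    // Query by index on every pass instead of holding on to an element.
    await page.locator(selector).nth(i).click();
    await page.waitForLoadState();

    // Links that open a new tab or do nothing leave the URL unchanged;
    // skip them so goBack() does not leave the starting page.
    if (page.url() === startUrl) continue;

    await onVisit(page, i);
    await page.goBack();
  }
}
```

A call such as `await visitEachByIndex(page, 'a:visible', async (p) => console.log(await p.title()));` would log the title of each page reached by a click.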

Depending on your use case, it may be easiest just to grab all the hrefs from each link:

const links = page.locator('a:visible');
const linksCount = await links.count();

const hrefs = [];
for (let i = 0; i < linksCount; i++) {
  hrefs.push(await links.nth(i).getAttribute('href'));
}

console.log(hrefs);
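
Once you have the hrefs, one possible follow-up (a sketch, not part of the original answer) is to skip clicking entirely and open each URL with page.goto. getAttribute('href') can return null or relative values, so the sketch first resolves everything against a base URL and keeps only http(s) targets. The names normalizeHrefs and visitAll are hypothetical, and `page` is assumed to be a Playwright Page.

```javascript
// Hypothetical helpers for illustration.

// Resolve raw href attribute values to absolute, de-duplicated http(s) URLs.
function normalizeHrefs(hrefs, baseUrl) {
  const seen = new Set();
  for (const href of hrefs) {
    if (!href) continue; // <a> without an href attribute
    let url;
    try {
      url = new URL(href, baseUrl);
    } catch {
      continue; // not parseable as a URL
    }
    if (url.protocol !== 'http:' && url.protocol !== 'https:') continue;
    url.hash = ''; // treat "#section" variants as the same page
    seen.add(url.href);
  }
  return [...seen];
}

// Open each URL in turn and record its title; `page` is assumed to be a Playwright Page.
async function visitAll(page, urls) {
  const results = [];
  for (const url of urls) {
    await page.goto(url);
    results.push({ url, title: await page.title() });
  }
  return results;
}
```

Usage could be `const pages = await visitAll(page, normalizeHrefs(hrefs, page.url()));`. Note that a bare "#top" link resolves to the current page itself, so you may want to filter that one out as well.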

Try this approach. I will use TypeScript.

await page.waitForSelector(selector, { timeout: 10000 });
const links = await page.$$(selector);

for (const link of links) {
  // Note: handles returned by page.$$ go stale if a click navigates away from this page
  await link.click({ timeout: 8000 });
  // your additional code
}

See more on https://youtu.be/54OwsiRa_eE?t=488

Tags: javascript · How can I click on all links matching a selector with Playwright · Stack Overflow