I have some JavaScript code running in Node.js which controls Puppeteer to automate tasks in a web browser.
This code gets a list of links on the page and outputs them to the console:
const links = await page.evaluate(() => { return [...document.querySelectorAll('a')].map(({href, innerText}) => ({href, innerText})); });
links.forEach(a => console.log(`<a href="${a.href}">${a.innerText.trim()}</a>`));
If I remove the map(), like this:
const links = await page.evaluate(() => { return [...document.querySelectorAll('a')]; });
links.forEach(a => console.log(`<a href="${a.href}">${a.innerText.trim()}</a>`));
Then I get this error:
TypeError: Cannot read properties of undefined (reading 'trim')
Is there any way to work directly on the original array, without having to make a copy of the array using map()?
There are a couple of hundred properties on each <a> link, which I'd have to type out one at a time in the map() if I wanted to use many of them.
As an aside, is there any way to combine the 2 lines of code into 1?
If I change it to this:
await page.evaluate(() => { return [...document.querySelectorAll('a')].map(({href, innerText}) => ({href, innerText})); })
.forEach(a => console.log(`<a href="${a.href}">${a.innerText.trim()}</a>`));
Then I get this error:
TypeError: page.evaluate(...).forEach is not a function
I also found that it doesn't seem to be possible to do a console.log() whilst inside a page.evaluate() (I get no output). This is why I moved the forEach on to a 2nd line.
3 Answers
IT goldman has correctly identified why trying to return an array of Nodes won't work: HTML elements aren't serializable.
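To make the failure concrete, here is a minimal sketch that runs in plain Node without Puppeteer. FakeAnchor is a hypothetical stand-in for a DOM element, and a JSON round trip stands in for the browser-to-Node transfer. Puppeteer's real serialization differs in detail, but the outcome appears consistent with the error in the question: objects whose data lives in prototype getters arrive empty, so a.innerText is undefined and .trim() throws.

```javascript
// Hypothetical stand-in for a DOM element: like a real <a>, its data is
// exposed through getters on the prototype, not as own properties.
class FakeAnchor {
  #href;
  #text;
  constructor(href, text) {
    this.#href = href;
    this.#text = text;
  }
  get href() { return this.#href; }
  get innerText() { return this.#text; }
}

// A JSON round trip stands in for the browser-to-Node transfer (an assumption
// for illustration; Puppeteer's actual wire format is different).
function transfer(value) {
  return JSON.parse(JSON.stringify(value));
}

const anchors = [new FakeAnchor("https://example.com/", " Example ")];

// Sent as-is: only own enumerable data survives, so each item arrives empty.
const raw = transfer(anchors);
console.log(raw[0].innerText); // undefined

// Copied into plain objects first (what the map() does): the data survives.
const copied = transfer(anchors.map(({ href, innerText }) => ({ href, innerText })));
console.log(copied[0].innerText.trim()); // Example
```

The map() is therefore less a copy for its own sake than the step that turns page-bound objects into plain data that can cross the process boundary.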
It's possible to remove the map and operate on the original objects, but it will result in worse code. Mutating the original array of nodes to make them serializable isn't a good idea, since it's risky to modify objects you don't own.
Avoid premature optimization. It's OK to copy by default and only switch to in-place modification once you encounter a bottleneck and have profiled and determined that copying really does account for the performance issue, which is highly unlikely.
As far as the Puppeteer API goes, you can immediately simplify
await page.evaluate(() =>
[...document.querySelectorAll("a")].map(...)
);
to
await page.$$eval("a", els => els.map(...));
The parameter els passed to the callback is a regular array, so .map is available without a spread.
"I also found that it doesn't seem to be possible to do a console.log() whilst inside a page.evaluate() (I get no output). This is why I moved the forEach on to a 2nd line."
By default, the browser console output goes to your browser, not Node, because that's the environment the evaluate callback runs in.
You can forward the browser console to Node, but whether that's appropriate or not is unclear. You haven't provided much context for what you're doing here, or why you're mapping links back to formatted links with stripped attributes (you might want to use .outerHTML instead, depending on what you're actually trying to achieve).
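For reference, forwarding can be sketched as a listener on the page's "console" event. forwardBrowserConsole is a hypothetical helper name and the "[browser ...]" prefix is an arbitrary choice; the sketch assumes page is a Puppeteer Page, whose "console" event delivers a message object with type() and text() methods.

```javascript
// Hypothetical helper: relay messages from the page's console to a Node-side
// logger. `page` is assumed to be a Puppeteer Page, which emits a "console"
// event carrying a message object with type() and text() methods.
function forwardBrowserConsole(page, log = console.log) {
  page.on("console", (msg) => log(`[browser ${msg.type()}] ${msg.text()}`));
}

// Usage with a real page (not executed here):
//   forwardBrowserConsole(page);
//   await page.evaluate(() => console.log("hello from the page"));
```

The listener has to be registered before the evaluate call whose output you want to see, since only messages emitted after registration are relayed.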
I'd avoid smushing multiple lines onto one. Let two lines be two lines (or more): just write clear code and use an autoformatter. await is not amenable to chaining or one-liners (by design!), so I'd avoid the (await foo()).property antipattern in favor of two lines.
Consider
const links = await page.$$eval("a", els =>
els.map(a => `<a href="${a.href}">${a.textContent.trim()}</a>`)
);
links.forEach(link => console.log(link));
or
const links = await page.$$eval("a", els => els.map(el => el.outerHTML));
links.forEach(link => console.log(link));
Generally, prefer .textContent to .innerText.
Note also that it's possible for <a> elements to not have an href, so you might want to adjust your selector to a[href].
The map is necessary to convert each HTMLElement into a serializable object { href, innerText } that page.evaluate can pass from the context of the browser (page) to the context of your Node app.
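If many properties are needed, they do not have to be written out as a destructuring pattern. A minimal sketch, assuming a hypothetical pick helper and a property list of your choosing: the names are passed as data and copied in a loop. The helper below runs in plain Node against a stand-in object; under Puppeteer the same logic would have to live inside the callback, since functions defined in Node are not visible in the page.

```javascript
// Hypothetical helper: copy the named properties of `source` into a new
// plain object, which is serializable as long as the values are.
function pick(source, names) {
  return Object.fromEntries(names.map((name) => [name, source[name]]));
}

// Stand-in object for illustration; a real element would come from the page.
const fakeLink = {
  href: "https://example.com/",
  textContent: " Example ",
  target: "_blank",
  onclick: () => {},
};

const wanted = ["href", "textContent", "target"]; // assumed list; adjust as needed
console.log(pick(fakeLink, wanted));

// Sketch of the same idea under Puppeteer (not executed here). Extra
// arguments to $$eval are passed through to the callback in the page:
//   const links = await page.$$eval(
//     "a",
//     (els, names) => els.map((el) => Object.fromEntries(names.map((n) => [n, el[n]]))),
//     wanted
//   );
```

Only properties with serializable values (strings, numbers, booleans, plain objects) are worth listing; values such as functions or other elements would not survive the transfer anyway.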
If you want to work on the original array of elements, you can execute JavaScript in the context of the page inside the page.evaluate handler.
Access the properties inside the evaluate function, in the code that's running in the page, and send only the results of your function back to the driver process. In your particular example, you'll want to simplify to
(await page.evaluate(() =>
Array.from(document.querySelectorAll('a'), a => `<a href="${a.href}">${a.innerText.trim()}</a>`)
)).forEach(link => console.log(link));
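The parentheses around await in that snippet are what make it work. Member access binds more tightly than await, so without them .forEach is looked up on the promise itself, which is the TypeError from the question. A minimal sketch, with a hypothetical fakeEvaluate standing in for page.evaluate, showing the failing lookup, the parenthesized form, and the .then() form:

```javascript
// Hypothetical stand-in for page.evaluate(): resolves to an array.
async function fakeEvaluate() {
  return ["<a>one</a>", "<a>two</a>"];
}

async function main() {
  // A promise has no forEach, so calling it before awaiting cannot work.
  console.log(typeof fakeEvaluate().forEach); // undefined

  // Parenthesized: await first, then call forEach on the resolved array.
  (await fakeEvaluate()).forEach((link) => console.log(link));

  // Equivalent with .then(): the callback receives the resolved array.
  await fakeEvaluate().then((links) => links.forEach((link) => console.log(link)));
}

main();
```

Whether either one-line form is preferable to two plain lines is a matter of taste; both behave the same as assigning the awaited result to a variable first.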
Comments
… map(). – Barmar, Nov 19, 2024 at 20:05
… [...], since the collection returned by querySelectorAll() has a forEach() method. – Barmar, Nov 19, 2024 at 20:06
… .then(): page.evaluate(...).then(links => links.forEach(...)) – Barmar, Nov 19, 2024 at 20:07
… [...] spread operator then I get TypeError: document.querySelectorAll(...).map is not a function – Danny Beckett, Nov 19, 2024 at 20:08
… forEach(), not map(). I was talking about the version without map(). – Barmar, Nov 19, 2024 at 20:08