We have a couple of legacy sites undergoing an upgrade. It would be useful to be able to screenshot every page and then md5 sum the results for both domains, and then test if everything which renders matches 100%.
I am unsure how to do this. We have looked at cheerio, which could crawl the site but cannot take screenshots, and nightwatch, which can take screenshots but cannot crawl the site. Does anyone have experience doing this?
- @Patrick Roberts - have you actually experienced this while screenshotting wikipedia? – pguardiario Commented Jun 8, 2018 at 9:25
2 Answers
An easy solution is to use Chrome in headless mode, which can also be controlled from Node modules such as Puppeteer.
Taken from the Google Developers page:
chrome --headless --disable-gpu --screenshot https://www.chromestatus.com/
As for crawling, you can combine Cheerio and Puppeteer to collect links and take screenshots. Alternatively, you could find a tool that exports a sitemap with all of the website's URLs; from there it should be easy to loop through them and take a screenshot of each.
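The sitemap approach could be sketched roughly as follows. This is only a minimal sketch, assuming the sitemap XML has already been downloaded into a string and that the `puppeteer` package is installed; the helper names (`extractSitemapUrls`, `screenshotName`, `screenshotAll`), the viewport size, and the `example.com` hosts are made up for illustration. File names are derived from the URL path rather than the host, so that the same page on both domains ends up with the same file name.

```javascript
const fs = require('fs');
const path = require('path');

// Pull every <loc> entry out of a sitemap XML string.
// A regex is enough for a flat sitemap; a real XML parser would be more robust.
function extractSitemapUrls(xml) {
  return [...xml.matchAll(/<loc>\s*([^<\s]+)\s*<\/loc>/g)]
    .map((match) => match[1].replace(/&amp;/g, '&'));
}

// Build a file name from the path and query string only (hypothetical scheme).
// Note: distinct paths such as /a/b and /a_b would collide under this scheme.
function screenshotName(url) {
  const { pathname, search } = new URL(url);
  const slug = (pathname + search)
    .replace(/[^a-zA-Z0-9]+/g, '_')
    .replace(/^_+|_+$/g, '');
  return (slug || 'index') + '.png';
}

// Visit each URL in headless Chrome and save a full-page screenshot.
async function screenshotAll(urls, outDir) {
  const puppeteer = require('puppeteer'); // loaded lazily; assumed to be installed
  fs.mkdirSync(outDir, { recursive: true });
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    // Fixed viewport so both sites are rendered at the same size (assumed values).
    await page.setViewport({ width: 1280, height: 800 });
    for (const url of urls) {
      await page.goto(url, { waitUntil: 'networkidle2' });
      await page.screenshot({
        path: path.join(outDir, screenshotName(url)),
        fullPage: true,
      });
    }
  } finally {
    await browser.close();
  }
}
```

Calling something like `screenshotAll(extractSitemapUrls(xml), 'shots/old')` once per domain would leave two directories of identically named PNG files to compare.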
You could use StormCrawler with Selenium and write a custom NavigationFilter to take the screenshot and store the md5sum of it in the document metadata. See tutorial for an introduction to SC+Selenium.
The next step could be to write a custom indexer and dump the URLs with the md5s into a database or file. Finally, you'd do the same for the newer version of the site and compare the content of the files or rows in the table.
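The compare step does not depend on StormCrawler specifically. As a rough illustration in Node (a swapped-in technique, not the StormCrawler indexer described above), the sketch below hashes every PNG in two directories and reports the differences. It assumes both directories use the same file name for the same page; the function names are hypothetical.

```javascript
const crypto = require('crypto');
const fs = require('fs');
const path = require('path');

// MD5 hex digest of a string or Buffer.
function md5Hex(data) {
  return crypto.createHash('md5').update(data).digest('hex');
}

// Map each PNG file name in a directory to the MD5 of its contents.
function hashDirectory(dir) {
  const hashes = new Map();
  for (const name of fs.readdirSync(dir)) {
    if (!name.endsWith('.png')) continue;
    hashes.set(name, md5Hex(fs.readFileSync(path.join(dir, name))));
  }
  return hashes;
}

// Compare two name -> hash maps and list what changed, disappeared, or appeared.
function compareHashes(oldHashes, newHashes) {
  const report = { changed: [], missing: [], added: [] };
  for (const [name, hash] of oldHashes) {
    if (!newHashes.has(name)) report.missing.push(name);
    else if (newHashes.get(name) !== hash) report.changed.push(name);
  }
  for (const name of newHashes.keys()) {
    if (!oldHashes.has(name)) report.added.push(name);
  }
  return report;
}
```

An MD5 match requires byte-identical images, so any dynamic content on a page (dates, ads, animations) would show up as a change even when the upgrade itself altered nothing.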