python - Why validate the `href` attribute twice? - Stack Overflow

Updated: 2025-03-16

I found the following web scraping code in Web Scraping with Python by Ryan Mitchell:

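from urllib.request import urlopen
from bs4 import BeautifulSoup
import re

# Assumption: the base URL was lost when this page was extracted;
# the "/wiki/" pattern below suggests Wikipedia.
BASE_URL = "https://en.wikipedia.org"
pages = set()

def getLinks(pageUrl):
    global pages
    html = urlopen(BASE_URL + pageUrl)
    bsObj = BeautifulSoup(html, "html.parser")
    # Select only <a> tags whose href starts with /wiki/
    for link in bsObj.findAll("a", href=re.compile("^(/wiki/)")):
        if 'href' in link.attrs:
            if link.attrs['href'] not in pages:
                # Found a new page
                newPage = link.attrs['href']
                print(newPage)
                pages.add(newPage)
                getLinks(newPage)

getLinks("")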

As I understand it, the findAll() call in the for loop already returns only the tags whose href attribute matches the pattern. Why do we still need to check afterward that each tag has an href attribute?

I think this line should be deleted: if 'href' in link.attrs: Am I right?
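One way to test this assumption without touching the network is to run the same filter over a small inline document. The snippet below is only a sketch: the sample HTML and the variable names are made up for illustration, and it uses find_all, of which findAll is the older alias.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample markup, not taken from any real page
SAMPLE_HTML = """
<a href="/wiki/Python">matches the pattern</a>
<a href="/about">has an href, but the wrong prefix</a>
<a name="anchor">has no href at all</a>
<a href="/wiki/Web_scraping">matches the pattern</a>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Same filter as in the question: <a> tags whose href starts with /wiki/
links = soup.find_all("a", href=re.compile("^(/wiki/)"))

# Collect the hrefs and check whether any returned tag lacks the attribute
hrefs = [link.attrs["href"] for link in links]
all_have_href = all("href" in link.attrs for link in links)

print(hrefs)
print(all_have_href)
```

If the filter behaves as assumed, only the two /wiki/ links come back and the tag without an href is never returned, so the inner 'href' in link.attrs test would always be true for this loop.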


