Example HTML:
<p class="labels">
<span>Item1</span>
<span>Item2</span>
<time class="time">
<span>I dont want to get this span</span>
</time>
</p>
I am currently getting all the spans inside the tag with the labels
class, but I only want the 2 spans directly under the labels
class. I don't want any span
tags from child elements.
Currently I am doing it like this:
First I get the labels HTML from a much bigger HTML document:
labels = html.findAll(class_="labels")
Then I extract the span tags from it:
spans = labels[0].findAll('span', {"class": None})
In my case the "class": None
filter doesn't change anything, because no span tag has a class.
So my question again: how can I get just the first 2 span tags, without the ones inside child elements?
Asked Nov 26, 2015 at 16:17 by Bioaim; edited Nov 26, 2015 at 22:31.
3 Comments
Couldn't you make a list comprehension that iterates over the direct children of labels[0] and grabs any spans from there? – SuperBiasedMan, Nov 26, 2015 at 16:23
Do you need all span tags before the time tag inside the p tag? – Learner, Nov 26, 2015 at 18:30
Yes, exactly, and there could be more or fewer than 2. – Bioaim, Nov 26, 2015 at 18:41
3 Answers
The BeautifulSoup docs briefly mention the recursive=False argument, which limits a search to the direct children of a tag.
So the answer to this problem was:
spans = labels[0].findAll('span', {"class": None}, recursive=False)
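As a rough, self-contained sketch of the difference this makes, the snippet below parses the sample markup from the question and compares the default recursive search with recursive=False. The variable names (markup, soup, all_texts, direct_texts) and the choice of the built-in "html.parser" backend are assumptions made for illustration; they are not from the original post.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question.
markup = """
<p class="labels">
<span>Item1</span>
<span>Item2</span>
<time class="time">
<span>I dont want to get this span</span>
</time>
</p>
"""

# "html.parser" is the standard-library backend, chosen here to avoid extra dependencies.
soup = BeautifulSoup(markup, "html.parser")
labels = soup.find_all(class_="labels")

# The default search is recursive, so it also returns the span nested inside <time>.
all_spans = labels[0].find_all("span")

# recursive=False restricts the search to direct children of the <p> tag.
direct_spans = labels[0].find_all("span", recursive=False)

all_texts = [span.get_text() for span in all_spans]
direct_texts = [span.get_text() for span in direct_spans]

print(all_texts)
print(direct_texts)
```

Because the search is limited by position in the tree rather than by count, this should keep working when the paragraph holds more or fewer than two direct spans.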
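Another option is to search recursively and then keep only the spans whose parent is the container itself:
for container in html.findAll(class_="labels"):
    spans = container.findAll('span', {"class": None})
    spans = [span for span in spans if span.parent is container]
Alternatively, iterate over .children:
for container in html.findAll(class_="labels"):
    # Text nodes are not span tags, so only real <span> elements reach the class check.
    is_plain_span = lambda c: getattr(c, 'name', None) == 'span' and c.get('class') is None
    spans = [child for child in container.children if is_plain_span(child)]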
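The two approaches in this answer can be sketched as small helper functions. The names direct_spans_by_parent and direct_spans_by_children are hypothetical, introduced only for this example, and the isinstance check against Tag is one assumed way of skipping the whitespace text nodes that sit between the tags.

```python
from bs4 import BeautifulSoup, Tag

# Sample markup copied from the question.
markup = """
<p class="labels">
<span>Item1</span>
<span>Item2</span>
<time class="time">
<span>I dont want to get this span</span>
</time>
</p>
"""

soup = BeautifulSoup(markup, "html.parser")


def direct_spans_by_parent(container):
    """Search recursively, then keep only spans whose parent is the container."""
    return [span for span in container.find_all("span") if span.parent is container]


def direct_spans_by_children(container):
    """Walk the direct children and keep class-less <span> tags only."""
    return [
        child
        for child in container.children
        # .children also yields text nodes (newlines here), so filter to Tag first.
        if isinstance(child, Tag)
        and child.name == "span"
        and child.get("class") is None
    ]


container = soup.find(class_="labels")
by_parent = [span.get_text() for span in direct_spans_by_parent(container)]
by_children = [span.get_text() for span in direct_spans_by_children(container)]

print(by_parent)
print(by_children)
```

Both helpers should select the same two spans for this markup. The parent check visits every nested span first, while the children walk only ever looks one level down.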
To extract the first two span elements, try the following:
>>> [i.text for i in html.find('p', {"class": "labels"}).findAll('span', {"class": None})[0:2]]
[u'Item1', u'Item2']
If you want to grab all span elements inside the labels class, remove the slice:
>>> [i.text for i in html.find('p', {"class": "labels"}).findAll('span', {"class": None})]
[u'Item1', u'Item2', u'I dont want to get this span']