I am working on a web scraping project using Python with the Requests and bs4 (BeautifulSoup) libraries.

I am trying to scrape the IPL website to get the details of every match in every season.

Here is a snippet for reference.

Expected: find_all should return 60 tags, because 60 matches were played. Actual: 0.

from flask import Flask, render_template, request, jsonify
from flask_cors import CORS, cross_origin
import requests
from bs4 import BeautifulSoup as bs
from urllib.request import urlopen as uReq

# Main web page URL
ipl_url = "https://www.iplt20.com/matches/results/2008"
response = requests.get(ipl_url)
if response.status_code == 200:
  html_content = response.text
  soup = bs(html_content, 'html.parser')
else:
  print(f'Failed to retrieve the web page. Status code: {response.status_code}')

# Here the problem starts
match_center = soup.find_all('div', {'class': 'vn-shedule-desk col-100 floatLft'})
len(match_center)  # ==> Expected: 60, Actual: 0

# The page is parsed with 'bs' without errors, but searching for
# 'div', {'class': 'vn-shedule-desk col-100 floatLft'} returns an empty list.

Asked yesterday by Nikhil Sable

  • Most likely the content is dynamically created using JavaScript. It's not present in the source HTML. – Robby Cornelissen Commented yesterday
  • Yes. I also copied the page's HTML into a text editor and searched for the div's class value, but could not find it. Is there any other option? – Nikhil Sable Commented yesterday
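
The manual check described in these comments (searching the downloaded HTML for the class value) can also be done in code. The following is only a rough sketch: `class_in_source` is a hypothetical helper name, and the regular expression is a quick heuristic for `class="..."` attributes, not a real HTML parser.

```python
import re

# Matches class="..." or class='...' and captures the attribute value.
CLASS_ATTR = re.compile(r'class\s*=\s*["\']([^"\']*)["\']')


def class_in_source(html_text: str, class_name: str) -> bool:
    """Return True if class_name appears as a whole class token in the raw HTML."""
    for match in CLASS_ATTR.finditer(html_text):
        # Split the attribute value on whitespace so "col" does not match "col-100".
        if class_name in match.group(1).split():
            return True
    return False
```

Calling `class_in_source(response.text, "vn-shedule-desk")` with the `response` from the question would be expected to return `False` if the element really is added by JavaScript after the page loads, which is what the comments suggest.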

1 Answer

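As noted in the comments, the content is dynamically created using JavaScript, so it is not present in the source HTML that Requests downloads. That is why find_all returns an empty list.

You can try Scrapy and Selenium. I think Selenium is the best option for this scenario, because it drives a real browser that runs the page's JavaScript before you read the HTML.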
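
A minimal sketch of the Selenium route, under several assumptions: Selenium 4 is installed, a local Chrome is available, and the class names from the question are still what the rendered page uses (this has not been verified here). The names `css_selector` and `count_rendered_elements` are hypothetical helpers, not part of any library.

```python
def css_selector(tag: str, class_attr: str) -> str:
    """Turn a tag name and a space-separated class string into a CSS selector.

    A CSS selector matches the classes in any order, unlike an exact
    class-string comparison.
    """
    return tag + "".join(f".{name}" for name in class_attr.split())


def count_rendered_elements(url: str, selector: str, timeout: float = 20.0) -> int:
    """Load url in headless Chrome, wait for selector to appear, count the matches."""
    # Imported here so css_selector stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")  # assumes a recent Chrome build
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Wait until JavaScript has inserted at least one matching element;
        # raises TimeoutException if none appears within `timeout` seconds.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, selector))
        )
        return len(driver.find_elements(By.CSS_SELECTOR, selector))
    finally:
        driver.quit()
```

For the page in the question this might be called as `count_rendered_elements("https://www.iplt20.com/matches/results/2008", css_selector("div", "vn-shedule-desk col-100 floatLft"))`. If the markup is needed in BeautifulSoup instead, `driver.page_source` can be passed to `bs(...)` after the wait.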

Tags: web scraping, Unable to find a div tag class value for a web page, Stack Overflow