I'm trying to retrieve some data from here, namely games and odds. I know the data is in the response of this GET request as shown in the network tab below:
However, the site appears to use a WebSocket protocol, and I'm not sure how to handle it.
I should mention I'm new to Python (I usually code in R) and to WebSockets, but I've managed to find the socketio path in the page's code elements, so here is what I've tried:
import socketio

sio = socketio.Client(logger=True, engineio_logger=True)

@sio.event
def connect():
    print('connected!')
    sio.emit('add user', 'Testing')

@sio.event
def print_message(sid):
    print("Socket ID: ", sid)

@sio.event
def disconnect():
    print('disconnected!')

sio.connect('https://sports-eu-west-3.winamax.fr', transports=['websocket'], socketio_path='/uof-sports-server/socket.io')
sio.wait()
I'm able to connect, but I'm not sure where to go next or how to get the actual response from the GET request above.
Any hints appreciated
asked Mar 20 at 13:52 by M.O
- These requests usually come with some form of authentication. Check the headers! – Klaus D. Commented Mar 20 at 13:58
- I’ve already tried using GET requests with all the required headers, but it returns status 400. – M.O Commented Mar 20 at 15:56
- Well, many sites try to prevent exactly what you are doing and have measures against it in place. – Klaus D. Commented Mar 20 at 17:54
1 Answer
I believe you were quite close; you just need to emit events that the other side also knows. Most of the data exchange there goes through "m" events.
I didn't test with the current python-socketio release, but according to the version compatibility table we should use v4.x here. The target Socket.IO version is probably v2.5.0, guessed from the header of the bundled uof-sports-server/socket.io/socket.io.js.
# /// script
# requires-python = ">=3.10"
# dependencies = [
#     "python-socketio[client]<5.0",
# ]
# ///
import socketio
import pprint
import uuid

sio = socketio.Client(
    logger=True,
    # engineio_logger=True
)
requestId = str(uuid.uuid4())

# connect & emit "m" event
@sio.event
def connect():
    print("connected!")
    data = dict(route="tournament:4", requestId=requestId)
    print("sending", data)
    sio.emit("m", data)

# wait for "m" event with matching requestId
@sio.on("m")
def m_response(data):
    if data.get("requestId") == requestId:
        pprint.pp(data.keys())
        pprint.pp([match["title"] for match in data["matches"].values()])
        sio.disconnect()

@sio.event
def disconnect():
    print("disconnected!")

sio.connect(
    url="https://sports-eu-west-3.winamax.fr",
    transports=["websocket"],
    socketio_path="/uof-sports-server/socket.io/",
)
sio.wait()
(You can use uv to resolve dependencies from the script's inline metadata.)
$ uv run winamax_socketio.py
Engine.IO connection established
Namespace / is connected
connected!
sending {'route': 'tournament:4', 'requestId': '6e9ee3d4-0bcb-45f4-ab6c-652379f234cb'}
Emitting event "m" [/]
Received event "m" [/]
dict_keys(['tournaments', 'matches', 'bets', 'outcomes', 'odds', 'requestId'])
['Angers - Rennes',
'Auxerre - Montpellier',
'Le Havre - Nantes',
'Strasbourg - Lyon',
'Toulouse - Brest',
'Saint-Étienne - Paris SG',
'Reims - Marseille',
'Lille - Lens',
'Monaco - Nice',
'Marseille - Toulouse',
'Nice - Nantes',
'Brest - Monaco',
'Montpellier - Le Havre',
'Lyon - Lille',
'Lens - Saint-Étienne',
'Paris SG - Angers',
'Reims - Strasbourg',
'Rennes - Auxerre',
"Ligue 1 McDonald's® 2024/25"]
Engine.IO connection dropped
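To get from that response to games and odds, it may help to look at the size and shape of each top-level section before digging into individual fields. The following is only a sketch: `summarize_payload` is a hypothetical helper, and the sample dict is made up to mimic a few of the keys printed above. The actual layout inside `bets`, `outcomes` and `odds` is not shown in this answer and would need to be inspected on a live response.

```python
from collections.abc import Mapping, Sequence


def summarize_payload(payload):
    """Return {section: short description} for each top-level key of a response."""
    summary = {}
    for key, value in payload.items():
        if isinstance(value, Mapping):
            summary[key] = f"dict with {len(value)} entries"
        elif isinstance(value, Sequence) and not isinstance(value, (str, bytes)):
            summary[key] = f"list with {len(value)} items"
        else:
            # Scalars such as requestId: just report the type name.
            summary[key] = type(value).__name__
    return summary


# Made-up sample: the ids "1" and "2" and the empty "odds" section are
# placeholders, not data taken from the real server.
sample = {
    "matches": {"1": {"title": "Angers - Rennes"}, "2": {"title": "Lille - Lens"}},
    "odds": {},
    "requestId": "6e9ee3d4-0bcb-45f4-ab6c-652379f234cb",
}

for section, description in summarize_payload(sample).items():
    print(f"{section}: {description}")
```

Inside `m_response`, something like `pprint.pp(summarize_payload(data))` could stand in for the `data.keys()` line while exploring.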
To help with such tasks, and to check communication flows against known working examples, you might want to look into debugging proxies (mitmproxy, Telerik Fiddler, HTTP Toolkit, ...).
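When reading WebSocket messages in such a proxy or in the browser's network tab, Socket.IO events show up as text frames like `42["m",{...}]`: `4` is the Engine.IO "message" packet type, `2` is the Socket.IO EVENT packet type, and the rest is a JSON array of the event name followed by its arguments. Below is a minimal sketch of a decoder for copied frames, assuming text (non-binary) frames. `decode_event_frame` is a hypothetical helper name, not part of python-socketio.

```python
import json


def decode_event_frame(frame):
    """Decode a text frame such as '42["m",{...}]' into (event, args).

    Returns None for frames that are not Socket.IO EVENT packets
    (pings, handshakes, binary events, ...).
    """
    # "4" = Engine.IO "message" packet, "2" = Socket.IO EVENT packet.
    if not frame.startswith("42"):
        return None
    rest = frame[2:]
    # Optional namespace such as "/chat," (absent for the default namespace).
    if rest.startswith("/"):
        _namespace, sep, rest = rest.partition(",")
        if not sep:
            return None
    # Optional numeric acknowledgement id before the JSON array.
    start = rest.find("[")
    if start == -1 or not (rest[:start] == "" or rest[:start].isdigit()):
        return None
    event, *args = json.loads(rest[start:])
    return event, args


# Frame equivalent to the "m" request emitted by the script above.
frame = '42["m",{"route":"tournament:4","requestId":"6e9ee3d4-0bcb-45f4-ab6c-652379f234cb"}]'
event, args = decode_event_frame(frame)
print(event, args[0]["route"])
```

Comparing the decoded event names and arguments from the browser session with what the Python client emits is a quick way to spot a missing field or a misspelled route.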