python - Reading network switch html table into pandas dataframe results with "Empty DataFrame" - Stack Overfl


When I use pd.read_html() to load a table into a pandas dataframe and then run print(dataframe) the output starts with "Empty DataFrame".

Also, I'm having trouble accessing the elements of the dataframe.

Here is the html:

<HTML>
<HEAD>

<TITLE>System Information 10.8.129.1</TITLE>

</HEAD>

<BODY>


<br><H4>Port Statistics</H4>



<table BORDER COLS=5 WIDTH="80%">

<tr><th>Slot/Port</th><th>Intf.</th><th>TX Frm.</th><th>TX Oct.</th><th>RX Frm.</th><th>RX Oct.</th><th>RX BC</th><th>RX MC</th><th>CRC Align.</th><th>Unders.</th><th>Overs.</th><th>Frag.</th><th>Jabbers</th><th>Total Coll.</th><th>Late Coll.</th><tr><th>1/1</th><th>1</th><th>66967821</th><th>3429650783</th><th>96815811</th><th>2328791105</th><th>39571627</th><th>3960333</th><th>0</th><th>0</th><th>0</th><th>0</th><th>0</th><th>0</th><th>0</th></tr>

<tr><th>1/2</th><th>2</th><th>0</th><th>0</th><th>0</th><th>0</th><th>0</th><th>0</th><th>0</th><th>0</th><th>0</th><th>0</th><th>0</th><th>0</th><th>0</th></tr>

<tr><th>2/1</th><th>9</th><th>17533</th><th>1899146</th><th>13646</th><th>1821221</th><th>416</th><th>34</th><th>0</th><th>0</th><th>0</th><th>0</th><th>0</th><th>0</th><th>0</th></tr>

<tr><th>2/2</th><th>10</th><th>12941919</th><th>1909511968</th><th>3896084</th><th>1687222693</th><th>415948</th><th>78</th><th>0</th><th>0</th><th>0</th><th>0</th><th>0</th><th>0</th><th>0</th></tr>



</table>


</BODY>

</HTML>

I try to extract the dataframe by running:

import pandas as pd

# raw string, so the backslashes in the Windows path are taken literally
url = r'C:\WebFiles\switch_stats.html'

dataframe = pd.read_html(url)

print(dataframe)

The output is:

[Empty DataFrame
Columns: [(Slot/Port, 1/1, 1/2, 2/1, 2/2), (Intf., 1, 2, 9, 10), (TX Frm., 66967821, 0, 17533, 12941919), (TX Oct., 3429650783, 0, 1899146, 1909511968), (RX Frm., 96815811, 0, 13646, 3896084), (RX Oct., 2328791105, 0, 1821221, 1687222693), (RX BC, 39571627, 0, 416, 415948), (RX MC, 3960333, 0, 34, 78), (CRC Align., 0, 0, 0, 0), (Unders., 0, 0, 0, 0), (Overs., 0, 0, 0, 0), (Frag., 0, 0, 0, 0), (Jabbers, 0, 0, 0, 0), (Total Coll., 0, 0, 0, 0), (Late Coll., 0, 0, 0, 0)]
Index: []]

asked Mar 24 at 17:12 by DBacker; edited Mar 24 at 17:52 by ouroboros1

Can you share the URL of the page whose table you are trying to scrape? – Ifeanyi Idiaye Commented Mar 24 at 17:56

1 Answer

The issue is that each Table Row (<tr>) contains only Table Header elements (th), even though only the first row should logically have such elements. That is to say: your table should have Table Data Cell elements (td) after the first row.

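By default, when a table has no thead section, pd.read_html treats the leading rows that consist only of th elements as header rows. Here every row qualifies, so all of them are stacked into MultiIndex column labels and no data rows remain, which is why you get an empty dataframe. To avoid this, set header=0: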

Minimal Reproducible Example

import pandas as pd
from io import StringIO

# proper `td` elements from row 1 onwards
correct_table = """
<table>
    <tr><th>col1</th><th>col2</th></tr>
    <tr><td>1</td><td>A</td></tr>
    <tr><td>2</td><td>B</td></tr>
</table>
"""

df1 = pd.read_html(StringIO(correct_table))[0]

# only `th` elements
incorrect_table = """
<table>
    <tr><th>col1</th><th>col2</th></tr>
    <tr><th>1</th><th>A</th></tr>
    <tr><th>2</th><th>B</th></tr>
</table>
"""

df2 = pd.read_html(StringIO(incorrect_table), header=0)[0]

df1.equals(df2)
# True

Output:

   col1 col2
0     1    A
1     2    B

Above, StringIO(...) stands in for reading your HTML file. Without header=0, you get:

pd.read_html(StringIO(incorrect_table))[0].columns

MultiIndex([('col1', '1', '2'),
            ('col2', 'A', 'B')],
           )
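The question also mentions trouble accessing elements of the dataframe. The following is a minimal sketch of how that might look once header=0 is applied. It assumes a shortened, well-formed copy of the switch table (four of the fifteen columns, all cells still th), and the names switch_html, stats and by_port are illustrative only. Note that pd.read_html returns a list of dataframes, one per table found, so the first step is to pick an item out of that list.

```python
from io import StringIO

import pandas as pd

# Shortened copy of the switch table from the question (assumption: only
# four of the original columns are kept). Every cell is still a <th>.
switch_html = """
<table>
<tr><th>Slot/Port</th><th>Intf.</th><th>TX Frm.</th><th>RX Frm.</th></tr>
<tr><th>1/1</th><th>1</th><th>66967821</th><th>96815811</th></tr>
<tr><th>1/2</th><th>2</th><th>0</th><th>0</th></tr>
<tr><th>2/1</th><th>9</th><th>17533</th><th>13646</th></tr>
<tr><th>2/2</th><th>10</th><th>12941919</th><th>3896084</th></tr>
</table>
"""

# read_html returns a list of DataFrames; header=0 makes only the first
# row the header, so the remaining rows become data.
tables = pd.read_html(StringIO(switch_html), header=0)
stats = tables[0]

# Index the rows by port label so single cells can be looked up by name.
by_port = stats.set_index("Slot/Port")

tx_frames_2_2 = by_port.loc["2/2", "TX Frm."]  # one cell, by row and column label
first_row = stats.iloc[0]                      # one row, by position
rx_frames = by_port["RX Frm."]                 # one column, as a Series

print(stats)
print(tx_frames_2_2)
```

With the real file, the StringIO(...) argument would be replaced by the path to switch_stats.html, and the same indexing should apply to the full set of columns.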
