
I have 2 environments, Staging & Production. Both use exactly the same code, but one environment can read the contents of the file while the other cannot.

I can see the FileInfo in both environments using two snippets:

Python:
mssparkutils.fs.ls(path)

Output:
[FileInfo(path=abfss://container_name@storage_account.dfs.core.windows.net/Staging_path/test.csv, name=test.csv, size=1000)]

mssparkutils.fs.ls(f'file:{mssparkutils.fs.getMountPath("/mount1")}{staging_path}')

Output:
[FileInfo(path=file:/synfs/notebook/22/mount1/Staging_path/test.csv, name=test.csv, size=1000)]

Staging works, but when I try this in Production:

df = pd.read_csv(f'file:{mssparkutils.fs.getMountPath("/mount1")}{staging_path}test.csv')
display(df)

<urlopen error [Errno 5] Input/output error: '/synfs/notebook/22/mount1/Staging_path/test.csv'>


  • It could be a permission issue. Try reading the file with Spark once: spark.read.csv("path_without_file_prefix") – JayashankarGS

1 Answer


Make sure the managed identity assigned in the Production environment has the necessary permissions to access both the storage account and the specific file (for example, the Storage Blob Data Reader role on the storage account). Without the right permissions, the runtime will not be able to read the file.
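As a quick probe, you can try reading the first bytes of the file directly through the abfss path with mssparkutils.fs.head. This is a sketch using the asker's example path; a Forbidden/403-style failure here points to missing permissions rather than a mount problem:

# Probe: read the first bytes via the abfss path (path is the asker's example)
abfss_path = "abfss://container_name@storage_account.dfs.core.windows.net/Staging_path/test.csv"
try:
    print(mssparkutils.fs.head(abfss_path, 256))  # first 256 bytes
except Exception as e:
    print(f"Direct abfss read failed: {e}")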

Then, confirm that the mount point (/mount1) is correctly set up in Production. You can check the list of mounts with:

mssparkutils.fs.mounts()
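For example, a quick check that /mount1 exists in Production (a sketch; the exact shape of the returned entries can vary by runtime, so this reads the mountPoint field defensively):

mounts = mssparkutils.fs.mounts()
print(mounts)
# Check for the expected mount point
if not any(getattr(m, "mountPoint", None) == "/mount1" for m in mounts):
    print("/mount1 is not mounted in this environment")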

If /mount1 is missing or incorrectly mounted, you can remount it with:

mssparkutils.fs.unmount("/mount1")
mssparkutils.fs.mount(
    "abfss://<container_name>@<storage_account_name>.dfs.core.windows.net",
    "/mount1",
    {"linkedService": "workspacestoragetest"}
)

After remounting, check that the file path exists and is accessible by listing the directory contents:

mssparkutils.fs.ls(f'file:{mssparkutils.fs.getMountPath("/mount1")}{staging_path}')
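As the comment above suggests, you can also cross-check with Spark, which reads through the abfss path and bypasses the local file: mount entirely (a sketch using the asker's example path):

df_spark = spark.read.csv(
    "abfss://container_name@storage_account.dfs.core.windows.net/Staging_path/test.csv",
    header=True
)
df_spark.show(5)

If Spark can read the file but pandas cannot, the problem is with the mount or the local file: interface rather than with storage permissions.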

If you are still seeing the Input/Output error (<urlopen error [Errno 5] Input/output error: '/synfs/notebook/22/mount1/Staging_path/test.csv'>), it could be due to network issues. Check for any firewall rules or network restrictions that might be blocking access to the storage account from the Production environment.
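A minimal connectivity probe from the Production notebook can help rule this out (a sketch; substitute your storage account name):

import socket

# Can we reach the dfs endpoint on port 443 from this environment?
host = "storage_account.dfs.core.windows.net"  # substitute your account name
try:
    with socket.create_connection((host, 443), timeout=5):
        print(f"TCP connection to {host}:443 succeeded")
except OSError as e:
    print(f"Cannot reach {host}:443 - {e}")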

If the Linked Service to Azure Data Lake Storage Gen2 is using a managed private endpoint with a dfs URI, you'll also need to set up a secondary managed private endpoint using the Azure Blob Storage option with a blob URI. This ensures that the internal fsspec/adlfs library can properly connect via the BlobServiceClient interface.
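For reference, this is roughly the path pandas takes for an abfs:// URL: fsspec resolves it to adlfs, which connects through the blob endpoint. A sketch, assuming the adlfs and azure-identity packages are available; the account and container names below are placeholders:

import pandas as pd
from azure.identity import DefaultAzureCredential

# Read via the blob endpoint through fsspec/adlfs (names are placeholders)
df = pd.read_csv(
    "abfs://container_name/Staging_path/test.csv",
    storage_options={
        "account_name": "storage_account",       # placeholder
        "credential": DefaultAzureCredential(),  # picks up the managed identity
    },
)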


It is also a good idea to implement retry logic to avoid failures due to transient issues. Here's how you can do it:

import time
import pandas as pd
from urllib.error import URLError

retries = 3
for attempt in range(retries):
    try:
        # 'abc' (column names) and 'xyz' (dtypes) are defined elsewhere in the notebook
        df0 = pd.read_csv(
            f'file:{mssparkutils.fs.getMountPath("/mount1")}{staging_path}ABC.zip',
            compression='zip', sep='|', names=abc, dtype=xyz
        )
        break  # Exit the loop if successful
    except URLError:
        if attempt < retries - 1:
            time.sleep(5)  # Wait 5 seconds before retrying
            continue
        raise  # Re-raise the error if all retries fail

To make debugging easier, add logging so you can capture details about any errors:

import logging
from urllib.error import URLError

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

try:
    df0 = pd.read_csv(
        f'file:{mssparkutils.fs.getMountPath("/mount1")}{staging_path}ABC.zip',
        compression='zip', sep='|', names=abc, dtype=xyz
    )
except URLError as e:
    logger.error(f"Error reading file: {e}")
    raise

By following these steps, you can identify whether the root cause is permissions, mount points, network restrictions, or transient errors, and apply the necessary fix.
