I have a PySpark-based pipeline that uses spark.read.format("binaryFile") to read tgz files, decompress them, and handle the pcap file inside (exploding it into packets, etc.). The code that handles the tar, pcap, and individual packets is written in pure Python and integrated as a User Defined Function (UDF).
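
Roughly, the UDF side is wired like this (a simplified sketch, not my exact code: the schema, function name, and the classic-pcap-only parsing are placeholders):

import io
import struct
import tarfile

from pyspark.sql import functions as F
from pyspark.sql import types as T

# One element per packet after the explode (simplified; the real code carries more fields).
packet_schema = T.ArrayType(T.StructType([
    T.StructField("ts_sec", T.LongType()),
    T.StructField("ts_usec", T.LongType()),
    T.StructField("data", T.BinaryType()),
]))

@F.udf(returnType=packet_schema)
def tgz_to_packets(content):
    """Unpack a .pcap.tgz blob and return a list of (ts_sec, ts_usec, raw packet bytes)."""
    packets = []
    with tarfile.open(fileobj=io.BytesIO(content), mode="r:gz") as tar:
        for member in tar.getmembers():
            if not member.name.endswith(".pcap"):
                continue
            fobj = tar.extractfile(member)
            if fobj is None:
                continue
            raw = fobj.read()
            # Classic pcap: 24-byte global header; the magic number gives the byte order.
            endian = "<" if raw[:4] == b"\xd4\xc3\xb2\xa1" else ">"
            offset = 24
            while offset + 16 <= len(raw):
                ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(
                    endian + "IIII", raw[offset:offset + 16])
                offset += 16
                packets.append((ts_sec, ts_usec, bytearray(raw[offset:offset + incl_len])))
                offset += incl_len
    return packets

# Applied to the binaryFile DataFrame shown below:
# exploded = (self.unzipped
#             .withColumn("packet", F.explode(tgz_to_packets("content")))
#             .select("path", "packet.*"))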

This pipeline works fine, but files that contain a pcap file larger than 2 GB raise ValueError: can not serialize object larger than 2G.

Is there any way to overcome this?

I would like to keep:

self.unzipped: DataFrame = spark.read.format("binaryFile")\
    .option("pathGlobFilter", "*.pcap.tgz")\
    .option("compression", "gzip")\
    .load(folder)

Because of the abstraction layer over the file source, this works with file://, hadoop://, and others like Azure (abfss://), provided you add the dependencies.
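
(For the abfss:// case, the extra dependency and auth I mean is roughly the following kind of setup; the artifact version, account name, and the account-key mechanism are only placeholders:)

from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("pcap-pipeline")
         # hadoop-azure supplies the abfss:// filesystem implementation
         .config("spark.jars.packages", "org.apache.hadoop:hadoop-azure:3.3.6")
         # one possible auth mechanism: a storage account key, passed through to the Hadoop conf
         .config("spark.hadoop.fs.azure.account.key.<account>.dfs.core.windows.net", "<key>")
         .getOrCreate())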

If that is not possible, what are the alternatives?

  • Since this is an error in the Python serializer, would a Scala or R implementation work?
  • If I decompress on the driver (with pure Python code, creating the first DataFrame from chunks of packets from the pcap), how can I read the file in a way that similarly accepts different protocols in the path (I would need file:// and abfss://)? A rough sketch of what I mean follows this list.
  • Any other ideas?
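
To make the second bullet concrete, this is the kind of driver-side read I have in mind, going through Spark's Hadoop FileSystem classes via the JVM gateway so the same URI schemes keep working (paths here are placeholders, and this is only a sketch, not something I have in production):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
jvm = spark.sparkContext._jvm
hadoop_conf = spark.sparkContext._jsc.hadoopConfiguration()

# The same URI schemes spark.read understands (file://, abfss://, ...) resolve here too,
# as long as the matching Hadoop filesystem dependencies are on the classpath.
src = jvm.org.apache.hadoop.fs.Path("abfss://container@account.dfs.core.windows.net/captures/big.pcap.tgz")
dst = jvm.org.apache.hadoop.fs.Path("file:///tmp/big.pcap.tgz")

src_fs = src.getFileSystem(hadoop_conf)
dst_fs = dst.getFileSystem(hadoop_conf)

# Stream-copy to local driver disk instead of holding the whole object in one buffer,
# then continue with pure Python (tarfile/struct) and build the first DataFrame in chunks.
jvm.org.apache.hadoop.fs.FileUtil.copy(src_fs, src, dst_fs, dst, False, hadoop_conf)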

Update:

I am using PySpark 3.5.1.

Source that raises the error: .py#L160
