I have a Python script that basically looks like this:
import mypackage
# this function always generates the same pandas.DataFrame
df = mypackage.create_the_dataframe()
# write the DataFrame to xlsx and csv
df.to_excel("the_dataframe_as.xlsx", index=False, engine="openpyxl")
df.to_csv("the_dataframe_as.csv", index=False)
I was trying to write a test for the create_the_dataframe function, so I checked the hash of the resulting xlsx and csv files. For two different runs of the script, the hash and file size of the resulting xlsx file change, while the hash of the csv file remains the same.
Although I can live with this, I am curious to understand why this is the case.
asked Mar 27 at 8:56 by d4tm4x, edited Mar 27 at 9:58
1 Answer
XLSX files contain metadata such as the creation timestamp, which changes with every newly written file. Plain-text CSV files contain no such variable metadata, so their contents are entirely predictable.
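To illustrate the mechanism, here is a minimal sketch that uses only the standard library, not openpyxl itself. It relies on the fact that an XLSX file is a ZIP archive of XML parts, with document properties such as the creation time typically stored in docProps/core.xml. The XML snippets, the timestamps, and the helper names (build_archive, file_digest, content_digest) are made up for this example. A real workbook contains more parts, and other parts may vary between runs as well.

```python
import hashlib
import io
import zipfile

# Hypothetical stand-ins for workbook parts; a real XLSX file has more of them.
SHEET_XML = b"<worksheet><sheetData/></worksheet>"
CORE_XML_TEMPLATE = "<coreProperties><created>{created}</created></coreProperties>"


def build_archive(created: str, entry_time: tuple) -> bytes:
    """Build a small ZIP archive that mimics the layout of an XLSX file."""
    parts = (
        ("xl/worksheets/sheet1.xml", SHEET_XML),
        ("docProps/core.xml", CORE_XML_TEMPLATE.format(created=created).encode()),
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, payload in parts:
            # Every ZIP entry carries its own modification timestamp.
            info = zipfile.ZipInfo(name, date_time=entry_time)
            archive.writestr(info, payload)
    return buffer.getvalue()


def file_digest(data: bytes) -> str:
    """Hash the raw bytes, as hashing the file on disk would."""
    return hashlib.sha256(data).hexdigest()


def content_digest(data: bytes, skip=("docProps/core.xml",)) -> str:
    """Hash part names and part contents only, ignoring ZIP entry timestamps
    and any parts listed in `skip`."""
    digest = hashlib.sha256()
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for name in sorted(archive.namelist()):
            if name in skip:
                continue
            digest.update(name.encode())
            digest.update(archive.read(name))
    return digest.hexdigest()


# Two "runs" with identical sheet data but different (arbitrary) timestamps.
first = build_archive("2000-01-01T00:00:00Z", (2000, 1, 1, 0, 0, 0))
second = build_archive("2000-01-01T00:01:30Z", (2000, 1, 1, 0, 1, 30))

print(file_digest(first) == file_digest(second))        # False
print(content_digest(first) == content_digest(second))  # True
```

For the test in the question, it is probably simpler to avoid file hashes altogether and compare the DataFrame returned by create_the_dataframe against an expected one, for example with pandas.testing.assert_frame_equal.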
engine="openpyxl" with the result being the same. So this seems more like an openpyxl topic. I'll update the question. – d4tm4x, commented Mar 27 at 9:57