I am using the dataframe_image library to generate and export images of dataframes.

When I try to run these exports in parallel using ThreadPoolExecutor, it takes the same time as running them sequentially. I'm not sure why that is, or whether it can be fixed.

Parallel Attempt

from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor, as_completed
import dataframe_image as dfi
import numpy as np
import pandas as pd
import time

def export_df(df, title):
    dfi.export(
        df.style.background_gradient(), f'{title}.png', table_conversion='matplotlib'
    )

df1 = pd.DataFrame(np.random.rand(6, 4))
df2 = pd.DataFrame(np.random.rand(6, 4))
df3 = pd.DataFrame(np.random.rand(6, 4))
df4 = pd.DataFrame(np.random.rand(6, 4))

tasks = [
    (df1, 'one'),
    (df2, 'two'),
    (df3, 'three'),
    (df4, 'four'),
]

def export_df_wrapper(data, title):
    export_df(data, title)

s = time.time()
with ThreadPoolExecutor() as executor:
    futures = [executor.submit(export_df_wrapper, *task) for task in tasks]
    for future in as_completed(futures):
        try:
            result = future.result()
        except Exception as e:
            print(f"Error occurred: {e}")

print(time.time() - s)

Sequential

s = time.time()
dfi.export(
    df1.style.background_gradient(), 'one.png', table_conversion='matplotlib'
)
dfi.export(
    df2.style.background_gradient(), 'two.png', table_conversion='matplotlib'
)
dfi.export(
    df3.style.background_gradient(), 'three.png', table_conversion='matplotlib'
)
dfi.export(
    df4.style.background_gradient(), 'four.png', table_conversion='matplotlib'
)
print(time.time() - s)

For reference, the image is generated by the matplotlib converter in dataframe_image (the linked .py file).
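One possible explanation, which I have not confirmed for dataframe_image, is that the rendering is CPU-bound Python work, so in standard CPython the threads take turns holding the GIL instead of running at the same time. The sketch below tries to reproduce that pattern without the library. `cpu_bound` is a hypothetical stand-in for the export call, not anything taken from dataframe_image, and `timed`, `run_sequential` and `run_threaded` are helper names made up for this illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def cpu_bound(n):
    # Hypothetical stand-in for the export: a pure-Python loop,
    # which holds the GIL for the whole time it runs.
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed(fn):
    # Call fn() and return (result, elapsed seconds).
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def run_sequential(sizes):
    # Same work, one call after another.
    return [cpu_bound(n) for n in sizes]


def run_threaded(sizes):
    # Same work, submitted to a thread pool; map preserves input order.
    with ThreadPoolExecutor() as executor:
        return list(executor.map(cpu_bound, sizes))


sizes = [200_000] * 4
seq_result, seq_time = timed(lambda: run_sequential(sizes))
thr_result, thr_time = timed(lambda: run_threaded(sizes))
print(f"sequential: {seq_time:.3f}s, threaded: {thr_time:.3f}s")
```

If the two timings come out about the same here, just as they do for the real export, that would be consistent with the GIL being the bottleneck. In that case, replacing ThreadPoolExecutor with ProcessPoolExecutor (with the submitting code placed under an `if __name__ == "__main__":` guard) may be worth trying, although starting the worker processes and pickling the dataframes adds overhead of its own.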