list - Batch processing multiple excel files in folder via python script and outputting into folder, while ignoring other file t


Each week our inventory reporting software outputs 15+ different inventory reports (one for each vendor respectively) in .xlsx format into one folder.

All the Excel sheets are in an identical format but require a fair bit of manipulation before they are usable by the end user. After trying to learn Python over the last week, and piecing together various bits of code, I've managed to come up with a VERY rough script to perform the processing required.

The script works as desired, but as it stands I have to manually enter the individual Excel file path and change the output file name within the script, then run the script once for each Excel file. Doing this for 15+ files each week renders the script redundant, as it would take less time to just reformat the files manually in Excel.

After more extensive forum reading, and trying to plug my script into others' code solutions (https://python-forum.io/thread-10841.html), I'm at an impasse. I'm struggling to come up with a function to batch process all Excel files in one folder and ignore all other files. Ideally the manipulated files would replace the original Excel files in the same folder, but if that's not possible then it's not a deal breaker.

To throw another curveball into the mix, I have been writing and testing my script on my Mac (using a copy of the .xlsx file which I sent myself from work, and a macOS file path), but I hope to eventually deploy this script on the Windows system at work.

I've included only my original, currently functioning script below.

Thanks all

    #import pandas library
    import pandas as pd

    #import numpy
    import numpy as np

    #importing our excel to dataframe
    df = pd.read_excel('/Users/christiane/Downloads/AGLC CJ Test.xls.xlsx')

    #renaming our headers for each column
    df = df.set_axis(['vendor code', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'current inventory', 'incoming inventory', 'm', 'Sales: Previous 4 weeks', 'o'], axis=1)

    #convert 'vendor code' column data to string data
    df['vendor code'] = df['vendor code'].astype(str)

    #delete rows whose vendor code contains letters (the non-numeric rows at the bottom of the table)
    df = df[~df['vendor code'].str.contains("[a-zA-Z]").fillna(False)]

    #convert 'vendor code' column data to integer data
    df['vendor code'] = df['vendor code'].astype(int)

    #sort values in column #1 from smallest to largest
    df.sort_values(by='vendor code', ascending=True, inplace=True)

    #add new column at the end input formula and populate downwards (=K1+L1)
    df['expectedinventory'] = df.loc[:,['current inventory', 'incoming inventory']].sum(axis=1)

    #move 'expected inventory' column from the end position to correct column position
    name_col = df.pop('expectedinventory')
    df.insert(12, 'expectedinventory', name_col)

    #delete columns 'm' and 'o'
    df = df.drop(['m', 'o'], axis=1)

    #sales advice = expected inventory minus the previous 4 weeks of sales plus 10%; populate column 'Sales Advice' with the results
    df['Sales Advice'] = df['expectedinventory'] - (( df['Sales: Previous 4 weeks']*1) + (0.1*( df['Sales: Previous 4 weeks'])*1))


    #opt in to pandas' future behaviour where fillna no longer silently downcasts object columns (this silences the FutureWarning)
    pd.set_option('future.no_silent_downcasting', True)

    #change NaN values to Zero
    df['Sales Advice'] = df['Sales Advice'].fillna(0)


    inventory_data = pd.DataFrame(df)
    cols = ['Sales Advice']
    inventory_data.loc[:, cols] = inventory_data[cols].astype('float64').round()

    #delete unnecessary columns
    df = df.drop(['b', 'c', 'e', 'f','g','h', 'i', 'j'], axis=1)

    #adding a right-most column with NaN values
    df = df.reindex(columns=df.columns.tolist() + ['Ordering Now'])

    #output excel file
    df.to_excel('filteredaglctest.xlsx', index = False)


asked Nov 22, 2024 at 3:59 by Christian E; edited Nov 22, 2024 at 4:51 by Marcin Orlowski

I assume the file AGLC CJ Test.xls.xlsx is one of the files you need to process, and you want to loop your script over 14 other files just like it, possibly with some variation in the format of the input or output? What varies between runs (other than the input file name and the output file name)? – Grismar, Nov 22, 2024 at 4:59

If all of your spreadsheets are in the same directory and there's a well-known pattern to the filenames, you could glob the directory in a loop to acquire and process each file appropriately. See this document – SIGHUP, Nov 22, 2024 at 9:17

I plugged my script into the batch loop solution from this link: stackoverflow.com/questions/72189803/…. The only problem is that I'm now running into a directory problem where my current working directory isn't the right one later on in the script (hard to tell exactly where). – Christian E, Nov 22, 2024 at 16:38
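The two points raised in these comments can be sketched together: list the folder, keep only the `.xlsx` files, and resolve every path to an absolute one so the current working directory no longer matters. This is a minimal sketch, not the commenters' code. `find_reports` is a hypothetical helper name, and the `~$` check is an added assumption that Excel's temporary lock files may be present in the folder while a workbook is open.

```python
from pathlib import Path


def find_reports(folder, ext=".xlsx"):
    """Return absolute paths of the Excel reports in `folder`, sorted by name.

    `find_reports` is a hypothetical helper name, not part of any library.
    """
    # resolve() makes the folder absolute, so every path built from it is
    # independent of the current working directory.
    folder_path = Path(folder).expanduser().resolve()
    return sorted(
        p for p in folder_path.iterdir()
        if p.is_file()
        and p.suffix.lower() == ext      # ignore every other file type
        and not p.name.startswith("~$")  # skip Excel's temporary lock files
    )
```

Calling `find_reports(Path.home() / "Downloads")` would list the reports in the user's Downloads folder. Because `pathlib` builds paths with the `/` operator, the same call works with macOS and Windows paths alike.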


1 Answer

    from pathlib import Path
    # APIs get imported here

    # Main folder where the Excel files are found
    FOLDER = "/home/user/Documents/excel_files"  # Change this to your folder path!
    EXT = ".xlsx"

    def updateExcel(file):
        print(f"---> Processing {file} <----")
        # Insert your code here:

    def main():
        # Specify the path in which the files are found
        folder_path = Path(FOLDER)
        # Get a list of files with a specific extension
        files_ext = [file for file in folder_path.iterdir() if file.is_file() and file.suffix == EXT]
        # Process the files; iterdir() already yields the full path of each file
        for file in files_ext:
            updateExcel(file)

    if __name__ == '__main__':
        main()
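To connect this loop to the script in the question, the transformation steps can move into a function that takes a DataFrame, with reading and writing kept in the loop. The following is a sketch under assumptions, not the answerer's code: `transform` and `process_folder` are hypothetical names, the column labels come from the question's `set_axis` call, and the extra `notna()` check on the vendor code is an addition for robustness. Reading and writing `.xlsx` files through pandas needs the `openpyxl` package installed.

```python
from pathlib import Path

import pandas as pd

# Column labels taken from the question's set_axis call.
HEADERS = ['vendor code', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j',
           'current inventory', 'incoming inventory', 'm',
           'Sales: Previous 4 weeks', 'o']


def transform(df):
    """Apply the question's clean-up steps to one report (hypothetical helper)."""
    df = df.set_axis(HEADERS, axis=1)
    # Drop rows whose vendor code is missing or contains letters.
    codes = df['vendor code'].astype(str)
    keep = df['vendor code'].notna() & ~codes.str.contains('[a-zA-Z]', na=False)
    df = df[keep].copy()
    df['vendor code'] = df['vendor code'].astype(int)
    df = df.sort_values('vendor code')
    # Expected inventory = current + incoming (missing values count as 0).
    df['expectedinventory'] = df[['current inventory', 'incoming inventory']].sum(axis=1)
    # Sales advice = expected inventory minus the last 4 weeks of sales plus 10%.
    sales = df['Sales: Previous 4 weeks']
    advice = df['expectedinventory'] - (sales + 0.1 * sales)
    df['Sales Advice'] = advice.fillna(0).astype('float64').round()
    # Keep the columns the original script ends up with, in the same order.
    df = df[['vendor code', 'd', 'current inventory', 'incoming inventory',
             'expectedinventory', 'Sales: Previous 4 weeks', 'Sales Advice']]
    # Add an empty right-most column for manual entry.
    return df.reindex(columns=df.columns.tolist() + ['Ordering Now'])


def process_folder(folder, ext='.xlsx'):
    """Overwrite every report in `folder` with its transformed version."""
    for path in sorted(Path(folder).iterdir()):
        # Ignore other file types and Excel's temporary "~$" lock files.
        if path.is_file() and path.suffix.lower() == ext and not path.name.startswith('~$'):
            print(f'---> Processing {path} <----')
            transform(pd.read_excel(path)).to_excel(path, index=False)
```

Calling `process_folder(FOLDER)` replaces each original file, so it is worth testing on a copy of the folder first. Because the output overwrites the input, a second run over the same folder would fail at `set_axis`: the processed files have 8 columns instead of the 15 the script expects.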
