I am quite new to Pandas. I need to select/locate the records between 2 dates.

I have tried a range of methods, but can't seem to get it to work. I have included a cut-down sample of the CSV data I am working with.

Each column is a date, so none of the documentation I have found matches this data structure.

Thanks for any help.

sample csv file

Asked Feb 20 at 12:02 by JDP
  • Images of data are not reproducible; please provide a minimal reproducible example as code/text. Also clearly explain what the index and columns are and what the exact expected output is. – mozway, commented Feb 20 at 12:04

2 Answers

Here is a script that selects the columns between two dates. It should be a little faster:

import pandas as pd

file_path = "file.xlsx"  # Update with the correct file path
df = pd.read_excel(file_path)

# Change the dates as needed (here 04/01/2025 to 06/01/2025).
# Label slicing with .loc includes both end labels and follows the column order.
date_columns = list(df.loc[:, "04/01/2025":"06/01/2025"].columns)
selected_columns = df[["Fname", "Lname"] + date_columns]

print(selected_columns)

If you don't need Fname and Lname, leave them out of the selection and keep only the date columns:

selected_columns = df[date_columns]

If you want the script to keep running instead of raising an error when one of the date columns is missing, use:

try:
    date_columns = list(df.loc[:, "04/01/2025":"06/01/2025"].columns)
except KeyError:
    print("Error: The specified date range columns do not exist in the dataset.")
    date_columns = []  # Prevents errors in the next step

selected_columns = df[["Fname", "Lname"] + date_columns]
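
To try the label-slicing approach without the Excel file, here is a minimal sketch that builds a small frame in memory. The three rows are taken from the sample output shown further down; the 03/01/2025 and 07/01/2025 columns and their values are made up for illustration. Note that .loc slices by where the labels sit in the column order, not by date value, so this relies on the date columns being stored in chronological order.

```python
import pandas as pd

# Small in-memory stand-in for the spreadsheet. The 03/01/2025 and
# 07/01/2025 columns (and their values) are hypothetical.
df = pd.DataFrame({
    "Fname": ["Owen", "Edward", "Steven"],
    "Lname": ["Richardson", "Jones", "Cameron"],
    "03/01/2025": [101, 102, 103],
    "04/01/2025": [128, 148, 228],
    "05/01/2025": [114, 144, 272],
    "06/01/2025": [239, 182, 140],
    "07/01/2025": [201, 202, 203],
})

# Both end labels are included in the slice.
date_columns = list(df.loc[:, "04/01/2025":"06/01/2025"].columns)
selected_columns = df[["Fname", "Lname"] + date_columns]

print(selected_columns)
```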


Assuming you have both non-date columns and date-like columns, you could convert the column labels to datetimes with pd.to_datetime and errors='coerce', so that non-date labels become NaT. Identify the non-date columns with isna and the wanted dates with between, then combine the two masks and use boolean indexing on the columns:

dates = pd.to_datetime(df.columns, errors='coerce', format='%d/%m/%Y')
m = dates.to_series().between(pd.Timestamp('2025-01-04'),
                              pd.Timestamp('2025-01-06'),
                              inclusive='both')

out = df.loc[:, dates.isna() | m.values]

Output:

     Fname       Lname  04/01/2025  05/01/2025  06/01/2025
0     Owen  Richardson         128         114         239
1   Edward       Jones         148         144         182
2   Steven     Cameron         228         272         140
3    Aldus      Turner         281         139         171
4  Dainton      Wright         269         176         142
5    Sofia    Harrison         100         103         154
6  Heather       Evans         155         163         201
7   Stella      Harris         126         183         157
8    Joyce       Smith         251         143         229
9    Tyler        Hill         299         293         218

If you just want the date-like columns:

df[df.columns[m]]

   04/01/2025  05/01/2025  06/01/2025
0         128         114         239
1         148         144         182
2         228         272         140
3         281         139         171
4         269         176         142
5         100         103         154
6         155         163         201
7         126         183         157
8         251         143         229
9         299         293         218
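
The same idea can be wrapped in a small helper. This is only a sketch: select_date_columns and its parameters are hypothetical names, not part of pandas, and the 03/01/2025 and 07/01/2025 columns are made up. The columns are shuffled on purpose to show that comparing parsed dates does not depend on column order.

```python
import pandas as pd

def select_date_columns(df, start, end, keep_non_dates=True, fmt="%d/%m/%Y"):
    """Return the columns whose labels parse to dates within [start, end]."""
    # Labels that do not match fmt (e.g. "Fname") become NaT.
    dates = pd.to_datetime(df.columns, errors="coerce", format=fmt)
    # Comparisons against NaT are False, so non-date labels drop out here.
    in_range = (dates >= pd.Timestamp(start)) & (dates <= pd.Timestamp(end))
    mask = (in_range | dates.isna()) if keep_non_dates else in_range
    return df.loc[:, mask]

# Hypothetical frame with the date columns deliberately out of order.
df = pd.DataFrame({
    "Fname": ["Owen", "Edward"],
    "Lname": ["Richardson", "Jones"],
    "07/01/2025": [201, 202],
    "04/01/2025": [128, 148],
    "06/01/2025": [239, 182],
    "05/01/2025": [114, 144],
    "03/01/2025": [101, 102],
})

out = select_date_columns(df, "2025-01-04", "2025-01-06")
dates_only = select_date_columns(df, "2025-01-04", "2025-01-06", keep_non_dates=False)
print(out)
```

The selected columns keep their original order in the frame; sort them afterwards if chronological order matters.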
