When running the snippet of example code below with pandas 2.2.3, I get an error saying KeyError: 'D'

import pandas as pd

index = pd.MultiIndex.from_tuples(
    [('A', 1), ('A', 2), ('A', 3), ('B', 1), ('B', 2), ('B', 2)],
    names=['letter', 'number']
)
df = pd.DataFrame({'value': [10, 20, 30, 40, 50, 60]}, index=index)
idx = pd.IndexSlice
# 'D' is not present in the 'letter' level
result = df.loc[idx[['A', 'D'], [1, 2]], :]

Does pandas offer any alternatives for searching a multi-index with values that don't exist?

If I run the same code using pandas 1.5.3, I get the result I expect:

               value
letter number       
A      1          10
       2          20

Asked Nov 21, 2024 at 15:16 by X-L; edited Nov 21, 2024 at 15:17 by ouroboros1.

1 Answer

When you run this code with pandas 1.5.3 you should in fact receive a FutureWarning:

FutureWarning: The behavior of indexing on a MultiIndex with a nested sequence of labels is deprecated and will change in a future version. series.loc[label, sequence] will raise if any members of 'sequence' or not present in the index's second level. To retain the old behavior, use series.index.isin(sequence, level=1)

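(Note that the message should read: "are not present".)

That deprecation has since been enforced, which is why pandas 2.2.3 raises a KeyError instead of silently skipping 'D'. So, as the warning suggests, let's use Index.isin to build a boolean mask: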

m = (df.index.isin(['A', 'D'], level='letter') 
     & df.index.isin([1, 2], level='number'))

out = df.loc[m, :]

Output:

               value
letter number       
A      1          10
       2          20
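
To see why this works with labels that don't exist, it may help to look at the per-level masks on their own. The following is a minimal, self-contained sketch that rebuilds the frame from the question; the names `letter_mask`, `number_mask` and `alt_mask` are only illustrative. The mask built from `get_level_values` is shown as an equivalent alternative, not as something the warning message itself recommends.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [('A', 1), ('A', 2), ('A', 3), ('B', 1), ('B', 2), ('B', 2)],
    names=['letter', 'number']
)
df = pd.DataFrame({'value': [10, 20, 30, 40, 50, 60]}, index=index)

# 'D' is absent from the 'letter' level, so it simply matches no rows
# instead of raising KeyError.
letter_mask = df.index.isin(['A', 'D'], level='letter')
number_mask = df.index.isin([1, 2], level='number')

# Equivalent mask built from the level values directly.
alt_mask = (df.index.get_level_values('letter').isin(['A', 'D'])
            & df.index.get_level_values('number').isin([1, 2]))

out = df.loc[letter_mask & number_mask, :]
print(out)
```

Each mask is a boolean array with one entry per row, so any number of them can be combined with `&` before being passed to `.loc`.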

If you have many different conditions, you could consider creating a dictionary that maps level names to labels and combining the masks with np.logical_and.reduce:

import numpy as np

dict_isin = {
    'letter': ['A', 'D'],
    'number': [1, 2]
}

m = np.logical_and.reduce(
    [df.index.isin(v, level=k) for k, v in dict_isin.items()]
)

out2 = df.loc[m, :]

out2.equals(out)
# True
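
If this pattern is needed in several places, the dictionary approach could be wrapped in a small function. The sketch below is one possible way to do that; `select_existing` is a hypothetical helper name, not part of pandas, and it assumes every key in `conditions` is a valid level name of the frame's MultiIndex.

```python
import numpy as np
import pandas as pd


def select_existing(df, conditions):
    """Hypothetical helper: keep rows whose index levels match `conditions`.

    `conditions` maps a level name to a list of labels. Labels that are
    missing from the index are ignored rather than raising KeyError.
    """
    masks = [df.index.isin(labels, level=level)
             for level, labels in conditions.items()]
    if not masks:
        # No conditions given: nothing to filter on.
        return df.copy()
    return df.loc[np.logical_and.reduce(masks), :]


index = pd.MultiIndex.from_tuples(
    [('A', 1), ('A', 2), ('A', 3), ('B', 1), ('B', 2), ('B', 2)],
    names=['letter', 'number']
)
df = pd.DataFrame({'value': [10, 20, 30, 40, 50, 60]}, index=index)

found = select_existing(df, {'letter': ['A', 'D'], 'number': [1, 2]})
nothing = select_existing(df, {'letter': ['D']})
print(found)
print(nothing.empty)
```

When none of the requested labels exist, the result is an empty frame rather than an error, which is close to what the pandas 1.5.3 behaviour in the question gave.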
