Selection by labels¶
Core¶
|
Select labels from a list, sorted by the order they appear in the list. |
|
Select all labels. |
|
Select labels where the condition is True. |
List selectors¶
String selectors¶
|
Select labels that start with a prefix. |
|
Select labels that end with a suffix. |
|
Select labels that contain a pattern or regular expression. |
|
Select labels that match a regular expression. |
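As a rough illustration of what these four string selectors describe, the sketch below reproduces the same kinds of selection with plain pandas string methods on `df.columns`. It does not use the pandas_select classes; the column names are made up for the example, and the library's own matching rules may differ in detail.

```python
import pandas as pd

# Hypothetical column names, chosen only for illustration.
df = pd.DataFrame(
    [[1, 2, 3, 4]],
    columns=["price_usd", "price_eur", "qty_total", "item_id"],
)
cols = df.columns

# Labels that start with a prefix.
starts = cols[cols.str.startswith("price")]
# Labels that end with a suffix.
ends = cols[cols.str.endswith("_id")]
# Labels that contain a pattern (interpreted as a regular expression by default).
contains = cols[cols.str.contains(r"usd|eur")]
# Labels that match a regular expression, anchored at the start of the label.
matches = cols[cols.str.match(r"\w+_total")]

print(list(starts), list(ends), list(contains), list(matches))
```

Each result is a `pandas.Index`, so it can be passed directly to `df[...]` or `df.loc[:, ...]`.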
Data type selectors¶
|
Select columns based on the column dtypes. |
|
Select boolean columns. |
|
Select numeric columns. |
|
Select categorical columns. |
|
Select columns with dtype |
|
Select nominal columns. |
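For comparison, dtype-based selection can be sketched with pandas' built-in `DataFrame.select_dtypes`. This is only an approximation of what the selectors above describe: the frame and column names are invented, and whether the library treats edge cases (for example, booleans as numeric) the same way is an assumption not confirmed here.

```python
import pandas as pd

# Hypothetical frame with one column per dtype family.
df = pd.DataFrame(
    {
        "flag": [True, False],
        "count": [1, 2],
        "ratio": [0.5, 1.5],
        "grade": pd.Categorical(["a", "b"]),
    }
)

# Boolean columns only.
bool_cols = df.select_dtypes(include="bool").columns
# Numeric columns; in pandas, "number" covers ints and floats but not bool.
numeric_cols = df.select_dtypes(include="number").columns
# Categorical columns.
cat_cols = df.select_dtypes(include="category").columns

print(list(bool_cols), list(numeric_cols), list(cat_cols))
```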
Logical operators¶
All label selectors implement the following operators:
Operator |
Description |
---|---|
|
Invert the selection. |
|
Select elements in both selectors. |
|
Select elements in the left side but not in the right side. |
|
Select elements in the left side but not in the right side. |
|
Select elements in the left side but not in the right side. |
For all operators, if one operand is not itself a selector (for example, a string or a list), it is wrapped with Exact first. In that case, the axis and level arguments are inferred from the other operand.
In [1]: from pandas_select import AnyOf
In [2]: AnyOf("A", axis="index", level=2) & "B"
Out[2]: AnyOf(values={'A'}, axis='index', level=2) & Exact(values=['B'], axis='index', level=2)
In [3]: ["A", "B"] | AnyOf("B")
Out[3]: Exact(values=['A', 'B'], axis='columns', level=None) | AnyOf(values={'B'}, axis='columns', level=None)
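Because selectors resolve to label sets, the effect of combining them can be pictured with the set methods of `pandas.Index`. The sketch below uses only plain pandas and invented labels; how each method corresponds to a particular selector operator is an assumption and is not taken from the table above.

```python
import pandas as pd

# Two hypothetical label sets standing in for the results of two selectors.
left = pd.Index(["A", "B", "C"])
right = pd.Index(["B", "C", "D"])
all_labels = left.union(right)

# Labels present in both sets.
both = left.intersection(right)
# Labels present in at least one set.
either = all_labels
# Labels in the left set but not in the right set.
only_left = left.difference(right)
# Labels in exactly one of the two sets.
exactly_one = left.symmetric_difference(right)
# Inverting a selection: every known label that the selection did not pick.
inverted = all_labels.difference(left)

print(list(both), list(either), list(only_left), list(exactly_one), list(inverted))
```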
Duplicates¶
Label selectors return a pandas.Index, which is interpreted by DataFrame [] and loc as a sequence of strings.
Warning
pandas_select raises a RuntimeError when the selection contains duplicates, because selecting duplicated labels is rarely what you want. With plain pandas, such a selection returns a DataFrame that contains every column with that name, repeated once for each time the name appears in the selection.
In [4]: import pandas as pd
In [5]: df = pd.DataFrame([[2, 1], [1, 2]], columns=["A", "A"], index=["a", "a"])
In [6]: df
Out[6]:
   A  A
a  2  1
a  1  2
In [7]: df[["A", "A"]]
Out[7]:
   A  A  A  A
a  2  1  2  1
a  1  2  1  2
In [8]: df.loc[["a", "a"]]
Out[8]:
   A  A
a  2  1
a  1  2
a  2  1
a  1  2
In [9]: from pandas_select import AnyOf
In [10]: try:
   ....:     df[AnyOf("A")]
   ....: except RuntimeError as e:
   ....:     print(e)
   ....:
Found duplicated values in selection
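If a frame may carry duplicated labels, it can help to detect or remove them before selecting. The sketch below uses plain pandas only; `check_unique` is a hypothetical helper written for this example (it is not part of pandas_select), and its error message merely imitates the one shown above.

```python
import pandas as pd

df = pd.DataFrame([[2, 1], [1, 2]], columns=["A", "A"], index=["a", "a"])


def check_unique(labels: pd.Index) -> pd.Index:
    """Hypothetical helper: raise if the labels contain duplicates."""
    if labels.has_duplicates:
        dupes = labels[labels.duplicated()].unique().tolist()
        raise RuntimeError(f"Found duplicated values in selection: {dupes}")
    return labels


# Detect the problem up front.
try:
    check_unique(df.columns)
    message = ""
except RuntimeError as e:
    message = str(e)
print(message)

# One possible remedy: keep only the first column for each duplicated name.
deduped = df.loc[:, ~df.columns.duplicated()]
print(deduped.shape)
```

Keeping the first occurrence is only one policy; renaming the columns so that every label is unique is often the safer choice.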