pandas_select.pandera.SchemaSelector

class SchemaSelector(level=None, **attrs)

Select columns based on the column attributes of the DataFrameSchema associated with the DataFrame.

Parameters
  • attrs – Dictionary of column attributes to filter on.

  • level (optional) – Either the integer position of the level or its name. It should only be set if axis targets a MultiIndex, otherwise an IndexError will be raised.

Raises

ValueError – If the DataFrame has no DataFrameSchema associated with it.

Notes

A DataFrameSchema is automatically added to a DataFrame after calling pandera.schemas.DataFrameSchema.validate().

Examples

>>> import pandas as pd
>>> from pandas_select.pandera import SchemaSelector
>>> df = pd.DataFrame(data=[[1, 2, 3]], columns=["a", "abc", "b"])
>>> df
   a  abc  b
0  1    2  3
>>> import pandera as pa
>>> schema = pa.DataFrameSchema({"a": pa.Column(int, regex=True, required=False)})
>>> df = df.pandera.add_schema(schema)
>>> df[SchemaSelector(required=False)]
   a  abc
0  1    2
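
The result above follows from two steps: schema columns are filtered by the requested attributes (here required=False), and a column declared with regex=True then covers every label its pattern matches, which is why both "a" and "abc" are returned. The sketch below illustrates that logic in plain Python. It is not the library's implementation: ColumnSpec and select_columns are hypothetical names, and matching the pattern from the start of the label is an assumption.

```python
import re
from dataclasses import dataclass


@dataclass
class ColumnSpec:
    """Hypothetical stand-in for a schema column; not a pandera class."""
    name: str
    regex: bool = False
    required: bool = True


def select_columns(labels, specs, **attrs):
    """Return the labels covered by specs whose attributes all equal attrs."""
    selected = set()
    for spec in specs:
        # Skip specs that differ from any requested attribute value.
        if any(getattr(spec, key) != value for key, value in attrs.items()):
            continue
        for label in labels:
            if spec.regex:
                # Assumption: the pattern is matched from the start of the label.
                hit = re.match(spec.name, label) is not None
            else:
                hit = label == spec.name
            if hit:
                selected.add(label)
    # Preserve the original column order.
    return [label for label in labels if label in selected]


labels = ["a", "abc", "b"]
specs = [ColumnSpec("a", regex=True, required=False)]
print(select_columns(labels, specs, required=False))  # ['a', 'abc']
```

Asking for required=True with the same specs returns an empty list, since the only schema column is declared with required=False.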