tidypandas#
The tidypandas python package provides a minimal, pythonic API for common data manipulation tasks:
The tidyframe class (a wrapper over a pandas dataframe) provides a dataframe with a simplified index structure (no more resetting indexes and multi indexes)
Consistent 'verbs' (select, arrange, distinct, …) as methods of the tidyframe class, which mostly return a tidyframe
Unified interface for summarizing (aggregation) and mutate (assign) operations across groups
Utilities for pandas dataframes and series
Uses simple python data structures: no esoteric classes, no pipes, no non-standard evaluation
No-copy data conversion between tidyframe and pandas dataframes
An accessor to apply tidyframe verbs to simple pandas dataframes
…
Example#
tidypandas code:
df.filter(lambda x: x['col_1'] > x['col_1'].mean(), by = 'col_2')
equivalent pandas code:
(df.groupby('col_2')
   .apply(lambda x: x.loc[x['col_1'] > x['col_1'].mean(), :])
   .reset_index(drop = True)
)
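To make the comparison concrete, here is a minimal pandas-only sketch of the same grouped filter on made-up data. It is not taken from the tidypandas documentation: the sample values are assumptions, and it swaps `groupby(...).apply(...)` for `groupby(...).transform("mean")`, which avoids depending on how `apply` treats the grouping column in a given pandas version. Unlike the `apply` form, it keeps rows in their original order rather than ordering them by group.

```python
import pandas as pd

# Made-up sample data; the column names follow the example above.
df = pd.DataFrame({
    "col_1": [1, 5, 3, 10, 20, 60],
    "col_2": ["a", "a", "a", "b", "b", "b"],
})

# Per-group mean of col_1, broadcast back onto the original rows.
group_mean = df.groupby("col_2")["col_1"].transform("mean")

# Keep only rows whose col_1 exceeds the mean of their own group,
# then discard the old row labels.
result = df.loc[df["col_1"] > group_mean].reset_index(drop=True)
print(result)
```

Even in this short form, the grouping, the comparison, and the index cleanup are three separate steps, which is the bookkeeping the single `filter(..., by = 'col_2')` call is meant to hide.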
Why use tidypandas#
tidypandas is for you if:
you frequently write data manipulation code using pandas
you prefer to stay in the pandas ecosystem (see accessor)
you prefer to remember a limited set of methods
you do not want to write (or be surprised by) reset_index, rename_axis often
you prefer writing free flowing, expressive code in dplyr style
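As an illustration of the index bookkeeping mentioned in the list above, the following pandas-only sketch shows a grouped aggregation that produces a MultiIndex on both rows and columns, and the manual flattening it then needs. The data and the flattened column names are assumptions made up for this example; it does not use tidypandas.

```python
import pandas as pd

# Made-up sample data for illustration.
df = pd.DataFrame({
    "col_1": [1, 5, 3, 10],
    "col_2": ["a", "a", "b", "b"],
    "col_3": ["x", "y", "x", "y"],
})

# Grouping by two keys and requesting two aggregations gives a
# MultiIndex on the rows and a MultiIndex on the columns.
summary = df.groupby(["col_2", "col_3"]).agg({"col_1": ["mean", "max"]})
print(type(summary.index).__name__, type(summary.columns).__name__)

# Manual cleanup: join the column levels, then move the group keys
# back into ordinary columns.
flat = summary.copy()
flat.columns = ["_".join(col) for col in flat.columns]
flat = flat.reset_index()
print(list(flat.columns))
```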
tidypandas relies on the amazing pandas library and offers a consistent API with a different philosophy.
Presentation#
Learn more about tidypandas (presentation)
Installation#
Install the release version from PyPI using pip:
pip install tidypandas
For offline installation, use the whl/tar file from the releases page on GitHub.
Contribution/bug fixes/Issues:#
Open an issue, suggestion, or bugfix on the GitHub issues page.
Use the master branch of the GitHub repo to submit your PR.