
tidypandas
The tidypandas Python package provides a minimal, pythonic API for common
data manipulation tasks:
tidyframe class (wrapper over pandas dataframe) provides a dataframe with a simplified index structure (no more resetting indexes and multi indexes)
Consistent ‘verbs’ (select, arrange, distinct, …) as methods of the tidyframe class, which mostly return a tidyframe
Unified interface for summarizing (aggregation) and mutate (assign) operations across groups
Utilities for pandas dataframes and series
Uses simple Python data structures: no esoteric classes, no pipes, no non-standard evaluation
No-copy data conversion between tidyframe and pandas dataframes
An accessor to apply tidyframe verbs to simple pandas dataframes
…
Example
tidypandas code:
df.filter(lambda x: x['col_1'] > x['col_1'].mean(), by = 'col_2')
Equivalent pandas code:
(df.groupby('col_2')
   .apply(lambda x: x.loc[x['col_1'] > x['col_1'].mean(), :])
   .reset_index(drop = True)
   )
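To make the comparison concrete, here is a small, self-contained sketch of the pandas side on toy data. The dataframe contents are made up for illustration, and the sketch uses groupby with transform instead of groupby with apply; that is a substitution on our part, chosen because it behaves the same across recent pandas versions. It keeps the rows in their original order rather than ordering them by group.

```python
import pandas as pd

# Toy data (hypothetical): two groups in col_2, numeric values in col_1.
df = pd.DataFrame(
    {
        "col_1": [1, 5, 2, 8, 3, 4],
        "col_2": ["a", "a", "a", "b", "b", "b"],
    }
)

# Mean of col_1 within each col_2 group, broadcast back to every row.
group_mean = df.groupby("col_2")["col_1"].transform("mean")

# Keep only the rows whose col_1 exceeds the mean of their own group.
result = df.loc[df["col_1"] > group_mean].reset_index(drop=True)

print(result)
```

In group a the mean is about 2.67, so only the row with value 5 is kept; in group b the mean is 5, so only the row with value 8 is kept.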
Why use tidypandas
tidypandas is for you if:
you frequently write data manipulation code using pandas
you prefer to stay in the pandas ecosystem (see accessor)
you prefer to remember a limited set of methods
you do not want to write (or be surprised by) reset_index, rename_axis often
you prefer writing free flowing, expressive code in dplyr style
tidypandas relies on the amazing pandas library and offers a consistent API with a different philosophy.
Presentation
Learn more about tidypandas (presentation)
Installation
Install the release version from PyPI using pip:
pip install tidypandas
For offline installation, use the whl/tar file from the releases page on GitHub.
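After either installation route, one way to confirm that the package is visible to the current interpreter is to query its installed version through the standard library. This is only a sketch: the helper name installed_version is ours, not part of tidypandas, and it assumes Python 3.8 or later.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent.

    Hypothetical helper for illustration; not part of tidypandas.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if tidypandas is installed, otherwise None.
print(installed_version("tidypandas"))
```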
Contribution/bug fixes/Issues:
Open an issue, suggestion, or bugfix on the GitHub issues page.
Use the master branch of the GitHub repo to submit your PR.