After using pandas for quite some time now, I started to question whether I was really using it effectively. After taking two MOOCs in R about two or three years ago, I realised that, because my GIS work wasn’t in analysis, I would not be able to use R properly.
Similarly, because pandas is essentially the R of Python, I thought I wouldn’t be able to use all the features it has to offer. As it stands, I’m still hovering around the data munging side of pandas.
    import pandas as pd

    in_xls = r"C:\Some\Arb\Folder\test.xlsx"
    columns = [0, 2, 3, 4, 6, 8]

    # Use a function to define the mask
    # to create a subset of the data frame
    def mask(df, key, value):
        return df[df[key] == value]

    # Note: this replaces pandas' built-in DataFrame.mask method
    pd.DataFrame.mask = mask

    # Out of the 101 rows, only 50 are stored in the data frame
    # (usecols replaces the older parse_cols argument)
    df = pd.read_excel(in_xls, sheet_name=0, usecols=columns).mask('Create', 'Y')
I used a pandas mask to filter a spreadsheet (or CSV) on some value. I originally used this to pick out which feature classes needed to be created from a list of dozens of templates, but I’ve also used it to filter transactions in the money tracking app I made for my household.
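The same kind of filter can be sketched without touching pd.DataFrame at all. This is only a minimal illustration: the small in-memory data frame, its Template column and the filter_equal helper are made-up stand-ins for the real spreadsheet, not part of the original workflow. It shows plain boolean indexing next to a pipe() call, which keeps the chained style while leaving the built-in DataFrame.mask alone.

```python
import pandas as pd

# Hypothetical stand-in for the spreadsheet: one 'Create' flag per template
df = pd.DataFrame({
    "Template": ["Roads", "Rivers", "Parcels", "Buildings"],
    "Create": ["Y", "N", "Y", "N"],
})

def filter_equal(frame, key, value):
    """Return only the rows of frame where column key equals value."""
    return frame[frame[key] == value]

# Option 1: plain boolean indexing
subset_a = df[df["Create"] == "Y"]

# Option 2: the same filter chained with pipe(), so the
# built-in DataFrame.mask method is not overwritten
subset_b = df.pipe(filter_equal, "Create", "Y")

print(subset_b["Template"].tolist())  # ['Roads', 'Parcels']
```

Both options return the same rows, so the choice is mostly about whether the filter needs to sit in the middle of a method chain.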