After using pandas for quite some time now, I started to question whether I was really using it effectively. After two MOOCs in R two or three years ago, I realised that because my GIS work wasn’t in analysis, I would not be able to use R properly.
Similarly, because pandas is essentially the R of Python, I thought I wouldn’t be able to use all the features it has to offer. As it stands, I’m still hovering around the data-munging side of pandas.
    import pandas as pd

    in_xls = r"C:\Some\Arb\Folder\test.xlsx"
    columns = [0, 2, 3, 4, 6, 8]

    # Use a function to define the mask
    # to create a subset of the data frame
    def mask(df, key, value):
        return df[df[key] == value]

    # Note: this replaces the built-in DataFrame.mask method
    pd.DataFrame.mask = mask

    # Out of the 101 rows, only 50 are stored in the data frame
    # (usecols replaces parse_cols, which newer pandas versions no longer accept)
    df = pd.read_excel(in_xls, sheet_name=0, usecols=columns).mask('Create', 'Y')
I used a pandas mask to filter a spreadsheet (or CSV) based on some value. I originally used this to filter out which feature classes needed to be created from a list of dozens of templates, but I’ve also used it to filter transactions in the money-tracking app I made for my household.
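The same kind of filter can also be written without attaching anything to `pd.DataFrame`. Here is a minimal sketch, assuming a small made-up table in place of the spreadsheet: the `Name` column, its values and the `filter_equal` helper are hypothetical, and only the `Create` column and the `'Y'` value come from the snippet above. It relies on `DataFrame.pipe`, which passes the data frame as the first argument to a plain function, so the call still chains.

```python
import pandas as pd


def filter_equal(df, key, value):
    """Return only the rows of df where column `key` equals `value`."""
    return df[df[key] == value]


# Hypothetical stand-in for the spreadsheet of templates
templates = pd.DataFrame(
    {
        "Name": ["roads", "rivers", "parcels", "contours"],
        "Create": ["Y", "N", "Y", "N"],
    }
)

# pipe() hands `templates` to filter_equal as its first argument,
# so nothing on pd.DataFrame itself is overwritten
to_create = templates.pipe(filter_equal, "Create", "Y")

names = to_create["Name"].tolist()
print(names)  # ['roads', 'parcels']
```

With a real file, the frame returned by `pd.read_excel` or `pd.read_csv` could probably be piped through the helper in the same way.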