Accessing metadata using ArcPy, Python, and now hermes!

Two weeks ago, a colleague asked me to write a script to extract some metadata values from dozens of feature classes in a gdb, and write them out to a spreadsheet along with some other descriptive properties. I fiddled around with the metadata using Python’s xml package, and managed to come up with a script for her.

So it was to my immense delight that this post popped up in my feedly on Friday. I immediately starred hermes on GitHub, and maybe this will be the first project I can actually contribute to, and not only because of the Futurama reference.
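
For readers who have not tried the xml route, here is a minimal sketch of that kind of extraction. It is not the script mentioned above. The sample XML, the element paths (dataIdInfo/idCitation/resTitle and dataIdInfo/idAbs) and the names extract_values and write_rows are all assumptions made for illustration, and it works on a metadata string rather than reading from a gdb.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical metadata snippet; the element names are an assumption
# and may differ from what your feature classes actually contain.
SAMPLE_XML = """<metadata>
  <dataIdInfo>
    <idCitation><resTitle>Roads</resTitle></idCitation>
    <idAbs>Road centrelines</idAbs>
  </dataIdInfo>
</metadata>"""

# Spreadsheet column name -> element path (both assumed for illustration)
PATHS = {
    "Title": "dataIdInfo/idCitation/resTitle",
    "Abstract": "dataIdInfo/idAbs",
}


def extract_values(xml_text, paths=PATHS):
    """Return {column: text} for each path; missing elements give ''."""
    root = ET.fromstring(xml_text)
    return {col: (root.findtext(path) or "").strip()
            for col, path in paths.items()}


def write_rows(rows, fieldnames):
    """Write a list of dicts as CSV text that a spreadsheet can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


row = extract_values(SAMPLE_XML)
csv_text = write_rows([row], ["Title", "Abstract"])
print(csv_text)
```

In practice the XML string would come from each feature class's metadata, and the rows would be written to a file instead of an in-memory buffer.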

Create centroids from closed lines via temporary polys

Like it says on the tin. I wrote this script because I had thousands of features in a CAD drawing which were supposed to represent points, but were closed boxes made of lines instead.


#
# @date 26/06/2015
# @author Cindy Williams
#
# Creates centroids from closed lines by creating temporary
# polygons.
#
# For use as a standalone script.
#
import arcpy
arcpy.env.workspace = r"C:\Some\Arb\Folder\work.gdb"
lyr_line = "ftr_line"
lyr_point = "ftr_point"
fields = ["SHAPE@", "Name"]
cursor_ins = arcpy.da.InsertCursor(lyr_point, fields)
with arcpy.da.SearchCursor(lyr_line, fields) as cursor:
    for row in cursor:
        poly = arcpy.Polygon(row[0].getPart())
        centroid = arcpy.PointGeometry(poly.centroid)
        cursor_ins.insertRow([centroid, row[1]])
del cursor_ins
print("Script complete.")

The script builds a polygon from the parts of the line, creates a centroid and inserts it into a point feature class.
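
arcpy works out the centroid internally, so the script never has to. Purely to illustrate what a polygon centroid is, here is a sketch of the standard area-weighted (shoelace) centroid for a single closed ring in plain Python. The function name ring_centroid and the sample box are made up for this example, and this is not a claim about how arcpy computes its centroid property.

```python
def ring_centroid(points):
    """Area-weighted centroid of a closed ring given as (x, y) tuples."""
    points = list(points)
    # Close the ring if the last vertex does not repeat the first
    if points[0] != points[-1]:
        points.append(points[0])
    area2 = 0.0  # twice the signed area of the ring
    cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("Ring has zero area")
    # Centroid = sum / (6 * area), and 6 * area == 3 * area2
    return cx / (3.0 * area2), cy / (3.0 * area2)


# A 4 x 2 box drawn as a closed line, like the CAD features
box = [(0, 0), (4, 0), (4, 2), (0, 2), (0, 0)]
centroid = ring_centroid(box)
print(centroid)
```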

ArcMap Woes Part 6: Random failure of gp tools on grouped layers

I had a series of errors recently while I was trying to run a few geoprocessing tools manually on layers within a group layer. The error message kept saying it couldn’t open the layer, even though the layer was clearly in the map. Even restarting ArcMap did not help.

I eventually discovered that geoprocessing tools may randomly fail on layers which are part of a group layer. I say randomly because sometimes the tools would run as expected, other times not. I’ve only experienced this anomaly when using the ArcMap interface – running the same sequence of tools on all the layers in a group layer in a for loop in Python does not give this error.

Create feature classes from a pandas data frame

I had a large CAD drawing which I had brought into ArcGIS, converted to a feature class and classified groups of features using a 3 letter prefix. I also had a spreadsheet containing a long list of those prefixes, along with additional columns of information for each prefix, including feature class name and shape type.

I wanted to create the feature classes for dozens of these prefixes, based on the values in my converted feature class, and a template feature class for the field structure. The Select geoprocessing tool could have easily split out all the different features per prefix for me, but that would have kept the CAD feature class structure, and not the structure that I wanted.

I figured this would be a good time to get into pandas (and eventually geopandas, maybe).


#
# @date 24/06/2015
# @author Cindy Williams
#
# Creates feature classes by looking up the
# name in a pandas data frame
#
# For use as a standalone script
#
import arcpy
import pandas as pd
# Set workspace
arcpy.env.workspace = r"C:\Some\Arb\Folder\work.gdb"
# Template feature class
fc_template = "ftr_template"
# Spreadsheet containing the values
xl_workbook = r"C:\Some\Arb\Folder\assets.xlsx"
lyr_source = arcpy.management.MakeFeatureLayer("ftr_source")
field_category = "Category"
# Get projection from template feature class
sr = arcpy.Describe(fc_template).spatialReference
# Create data frame from the first 7 columns of the first sheet
# (parse_cols was renamed usecols in later pandas releases)
df = pd.read_excel(xl_workbook, 0, parse_cols=6, index_col="Prefix")
# Get the list of categories
categories = list(set(row[0] for row in arcpy.da.SearchCursor(lyr_source, field_category)))
for cat in categories:
    print("Processing " + cat)
    qry = """ "{0}" = '{1}' """.format(field_category, cat)
    # Look up the category in the data frame and return the matching feature class name
    fc_name = df.loc[cat, "Feature Class Name"]
    try:
        arcpy.management.CreateFeatureclass(arcpy.env.workspace,
                                            fc_name,
                                            "POINT",
                                            fc_template,
                                            "#",
                                            "#",
                                            sr)
        print("Feature class created: " + fc_name)
        lyr_cat = arcpy.management.MakeFeatureLayer(in_features=lyr_source,
                                                    where_clause=qry)
        arcpy.management.Append(lyr_cat, fc_name, "NO_TEST")
    except Exception as e:
        print(e)
    print("Finished " + cat)
print("Script complete.")

In the pd.read_excel call, I load the first 7 columns on the first sheet in the workbook into a pandas data frame. I set the index column to the column called “Prefix”, so those values will be used for the lookup instead of the default int index pandas assigns.

In the df.loc lookup, the prefix value from the feature class is used to find the corresponding feature class name in the pandas data frame. Once the feature class has been successfully created, a selection layer of the matching features is appended into the new feature class. I could use a SearchCursor and store the matching features in a Python list to be copied into the new feature class, but that’s something I will test at another time.
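
To show the index lookup on its own, here is a small sketch that builds a data frame in memory instead of reading the workbook. The prefixes, feature class names and the helper lookup_fc_name are invented for the example. Only the “Prefix” and “Feature Class Name” column names come from the script.

```python
import pandas as pd

# Hypothetical rows standing in for the spreadsheet contents
df = pd.DataFrame(
    {
        "Prefix": ["VLV", "HYD"],
        "Feature Class Name": ["ftr_valve", "ftr_hydrant"],
        "Shape Type": ["POINT", "POINT"],
    }
).set_index("Prefix")  # same effect as index_col="Prefix" in read_excel


def lookup_fc_name(frame, prefix):
    """Return the feature class name for a prefix, or None if absent."""
    if prefix not in frame.index:
        return None
    return frame.loc[prefix, "Feature Class Name"]


print(lookup_fc_name(df, "VLV"))
print(lookup_fc_name(df, "XXX"))
```

The membership check matters because df.loc raises a KeyError for a label that is not in the index, and in the script the lookup happens before the try block.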

Get the ArcCatalog connections folder location for current user

I wrote this gist because I was tired of manually building up the path for the location where ArcCatalog stores SDE, database and ArcGIS Server connection files. I work on several different machines, so it was very painful to have to change my script each time.


import arcpy
import os
# Build the path to the ArcCatalog connections folder for the current user
os_appdata = os.environ['APPDATA']  # Current user's APPDATA folder
folder_esri = "ESRI"  # ESRI folder name
arc_prod = arcpy.GetInstallInfo()['ProductName']  # Get the installed product's name e.g. Desktop
arc_ver = arcpy.GetInstallInfo()['Version']  # Get the installed product's version number
arc_cat = "ArcCatalog"  # ArcCatalog folder name
print(os.path.join(os_appdata,
                   folder_esri,
                   arc_prod + arc_ver,
                   arc_cat))

I’m considering creating a global functions file which I include with every toolbox/project, where these sorts of functions are readily available.
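
As a rough idea of what one entry in such a file could look like, here is the same path logic wrapped in a function. The name get_connections_folder and its parameters are hypothetical. The product name and version are passed in, so the function itself does not need arcpy and can be tried anywhere.

```python
import os


def get_connections_folder(appdata, product, version,
                           esri_folder="ESRI", catalog_folder="ArcCatalog"):
    """Join the pieces into <appdata>/ESRI/<product><version>/ArcCatalog."""
    return os.path.join(appdata, esri_folder, product + version, catalog_folder)


# Example with made-up values; in a real script these would come from
# os.environ['APPDATA'] and arcpy.GetInstallInfo()
print(get_connections_folder("base", "Desktop", "10.3"))
```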

List the duplicate values in a field in an attribute table

I received some asset management data, and the state of it led me to write this script to determine if there were any values duplicated in the GIS ID field. There are a number of ways that one can do this, but I do like to know as many methods as possible before settling on the optimised one.


#
# @date 29/05/2015
# @author Cindy Williams
#
# Checks an attribute field for duplicate values,
# and displays them along with the number of duplicates.
#
# For use in the Python window in ArcMap.
#
import arcpy
mxd = arcpy.mapping.MapDocument("CURRENT")
lyrs = arcpy.mapping.ListLayers(mxd)
def checkDuplicate(ftr, field):
    # Collect every value in the field
    vals = [row[0] for row in arcpy.da.SearchCursor(ftr, field)]
    # Map each value to its number of occurrences
    dct = {x: vals.count(x) for x in vals}
    print(ftr.name)
    for k, v in dct.items():
        # Only report values that occur more than once
        if v > 1:
            print("{0} {1}".format(k, v))

for lyr in lyrs:
    field_name = arcpy.ListFields(lyr, "*_ID")[0].name
    checkDuplicate(lyr, field_name)

The dictionary comprehension stores each attribute value as a key, along with the number of occurrences of that value. The if v > 1 check then displays only the keys that occur more than once. I wrote it as a function so that I could generate a report for all the layers in the map document.

Writing this report out to a table may be more useful, especially if there are lots of duplicates, but then one may as well run Summary Statistics.
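
As one of the other ways to do this, here is a sketch using collections.Counter. It counts every value in a single pass, instead of calling list.count once per value. The function name find_duplicates and the sample IDs are made up, and the input is a plain list, so the values would still need to be read out of the attribute table with a cursor first.

```python
from collections import Counter


def find_duplicates(values):
    """Return {value: count} for every value that occurs more than once."""
    counts = Counter(values)
    return {value: n for value, n in counts.items() if n > 1}


# Hypothetical GIS ID values
sample_ids = ["A1", "B2", "A1", "C3", "B2", "A1"]
duplicates = find_duplicates(sample_ids)
print(duplicates)
```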