Create feature classes from a pandas data frame

I had a large CAD drawing which I had brought into ArcGIS, converted to a feature class and classified groups of features using a 3 letter prefix. I also had an spreadsheet containing a long list of those prefixes, along with additional columns of information for that prefix, including feature class name and shape type.

I wanted to create the feature classes for dozens of these prefixes, based on the values in my converted feature class, and a template feature class for the field structure. The Select geoprocessing tool could have easily split out all the different features per prefix for me, but that would have kept the CAD feature class structure, and not the structure that I wanted.

I figured this would be a good time to get into pandas (and eventually geopandas, maybe).

In Line 28, I load the first 7 columns on the first sheet in the workbook into a pandas data frame. I set the index column to the column called “Prefix”, so those values will be used for the lookup instead of the default int index pandas assigns.

In Line 37, the prefix value from the feature class is used to look up the corresponding feature class name in the pandas data frame. Once the feature class has been successfully created, a selection layer of the matching features is appended into the new feature class. I could use a SearchCursor and store the matching features in a python list to be copied into the new feature class, but that’s something I will test at another time.

Get the ArcCatalog connections folder location for current user

I wrote this gist because I was tired of manually building up the path for the location where ArcCatalog stores SDE, database and ArcGIS Server connection files. I work on several different machines, so it was very painful to have to change my script each time.

I’m considering creating a global functions file which I include with every toolbox/project, where these sorts of functions are readily available.

View list of ArcGIS Server Data Store Items

Recently I had a task where, for >10 ‘clients’, I had to:

  1. Create a unique database user
  2. Grant privileges to certain datasets in a SDE database
  3. Save an mxd showing only what was relevant to that client
  4. Publish a feature service for consumption in ArcGIS Online

While I will leave the specifics of that delightful script for another post, during debugging I came up against an error where the script would fail if the database connection had already been registered with ArcGIS Server.

In lines 15 and 16, I get the location of the ArcCatalog connections folder for the current user. This is dependent on knowing which version of Arc is installed. I have found a slightly better way (written quite verbosely for clarity):

Toggle labels in a map document

I had a request from the Asset Management team to display all the asset management data I had for a particular area, grouped by service category and with labels displaying their unique IDs. My challenge was to turn on the labels for all the layers using a specific expression. However, the ID fields were not named the same in all the layers (again, the joys of working with data received from outside).

On line 19, I set the label expression for the default label class using the correct field name. The square brackets are what ArcGIS is expecting to delimit field names in the label expression dialog. Line 20 turns the labels on, while line 23 turns the layer on.

Generate chainage along a line feature

This is a request which comes up quite often. Normally, the request is to create points along a pipeline at every 1000m (or some other integer). Sometimes it will be for chainage along a road, but it’s always a line feature which defines a route of some kind. The first time I did this, I used ET GeoWizards, which was an excellent starting point. However, with access to an ArcInfo Advanced licence, I decided to try it using only geoprocessing tools found in ArcToolbox. This did not go well. I then turned to something which I should have turned to at the beginning – Python.

Create points at set distances and split line

This week I wrote a script to create points along an existing line based on distances found in a spreadsheet, and create a new line from the original line which is the length of the last distance recorded. That was a bit of a mouthful, so I will post it here in stages. Some helpful tips were found in the ArcGIS Help (as always), as well as the always helpful ArcPy cafĂ© (here and here). In reference to that last link, creating the new line would have been much easier using 10.3’s segementAlongLine, but since that’s not in general release yet, it was the long way around for me.

Created on 4 Nov 2014

@author: Cindy Williams

import arcpy

# Environment settings
arcpy.env.overwriteOutput = True
arcpy.env.workspace = r"C:\Some\Arb\Folder\work.gdb"

''' Inputs

ftr_baseline = "ftr_baseline" # Dissolved baseline feature class

ftr_distance = "ftr_distance" # New line feature class

ftr_points = r"ftr_points" # Created points feature class

# Table with distance data
print "Converting table..."
tbl_data = arcpy.conversion.ExcelToTable(r"C:\Some\Arb\Folder\distances.xls",

sr = arcpy.Describe(ftr_baseline).spatialReference.exporttostring() # Spatial reference

lst_points = [] # List for created points

The workflow is as follows:

  1. Working with only the geometry of the line is more efficient, so retrieve the shape of the first record in the existing line feature class. As there is only one record anyway, I’ve included two methods of doing this in the code. The one that’s not commented out is the preferred one.
  2. # Load line geometry
    # geom_line = [row[0] for row in arcpy.da.SearchCursor(ftr_baseline, "SHAPE@", "#", sr)][0]
    geom_line = arcpy.da.SearchCursor(ftr_baseline, "SHAPE@").next()[0]
  3. Create the point geometries by reading in the km values from the table, converting to metres, accumulating the distance and adding the the new point feature into the list. This is accomplished using an update cursor. Next, use an insert cursor to write the points to disk i.e. to the blank point feature class which has a matching schema.
    acc_dist = 0 # Accumulated distance
    # Create points along line according to distance
    with arcpy.da.SearchCursor(tbl_data, ["Day", "KM"], "#", sr) as cursor:
        for row in cursor:
            acc_dist += row[1]*1000 # Accumulate the distance
            pnt = geom_baseline.positionAlongLine(acc_dist) # Create point geometry
            lst_points.append((row[0], row[1], pnt)) # Add value to point list
            print "Appended {}".format(row[0]) 
    # Write points to disk
    ins_cursor = arcpy.da.InsertCursor(ftr_points, ["Day", "Dist_km", "SHAPE@XY"])
    for pnt in lst_points:
    del ins_cursor
  4. Finally, get the last point that was added, split the line at this point and write it to disk. I’m not sure if there is a better way to get the last entry – I did not want to retrieve it from the new point feature class as it already exists in the list.
    # Get point geometry from the tuple in the last list entry
    end_pnt = lst_points[len(lst_points) - 1][2] 
    first_line =, end_pnt, arcpy.Geometry())[0], ftr_distance)
    print "Script Complete."

    Full script is here.

Optimised version of projecting coordinates on the fly

Last week I posted about creating a csv file containing coordinates projected on the fly. Seeing as I had to update the files as the projection information changed, I decided I might as well optimise my code. I managed to halve the amount of code and not write anything temporary to disk as the final output I want is non-spatial anyway.

Created on 22 Sep 2014

@author: Cindy Williams

Project coordinates on the fly and write to csv

import arcpy
import csv
import os

arcpy.env.overwriteOutput = True

fld = r"C:\Some\Arb\Folder"
sr_cape = arcpy.SpatialReference(r"C:\Some\Arb\Folder\cape17.prj")
sr_mines = arcpy.SpatialReference(r"C:\Some\Arb\Folder\Schwarzeck17_mines.prj")
fields = ["Field3", "SHAPE@XY"]
x_constant = 89
y_constant = 2000018

# Specify topdown=True so that it does not process the created csvs
for root, dirnames, filenames in os.walk(fld, topdown=True):
    for f in filenames:
        if f.endswith(".csv"):
            print "Processing " + f
            cur_csv = os.path.join(root, f)
            lyr = "XY_Layer"
            # Create XY Layer in the original CS
  , "Field1", "Field2", lyr, sr_cape)
            print "\tCreated XY Event Layer"
            # Output csv filename created from splitting current csv filename
            final_csv = os.path.join(fld, "Schwarzeck17_" + f.rpartition(" ")[2])
            with open(final_csv, 'wb') as csvfile:
                csvwriter = csv.writer(csvfile)
                # Use a cursor to project the coordinates to the new CS
                with arcpy.da.SearchCursor(lyr, fields, "#", sr_mines) as cursor:
                    for row in cursor:
                        # Apply the constant and write to csv
                        csvwriter.writerow([row[1][0] - x_constant, row[1][1] - y_constant, row[0]])
            print "Processed " + final_csv
            # Clean up
print "Script complete."

Project coordinates on the fly using Update Cursor

A colleague in our Namibian office recently asked me to project some points for him. Easy enough, except it was in a xyz file and it was a million points and the file was huge and their internet is not great (which is saying something, as our internet is abysmal). After splitting the file into 10 separate csv files and emailing them (I know, I know), I gave one file to another colleague who is a projections ninja (silently beats them into submission? I don’t know).

Once she had figured out the multiple issues (there are always multiple issues – “Can you quickly project this arb file for me” invariably turns into a full day escapade), it was time to automate the process for all the csvs. To Python!

Created on 17 Sep 2014

@author: Cindy Williams

Process csv and project coordinates on the fly
Output new csv file
import arcpy
import csv
import os

arcpy.env.overwriteOutput = True

in_fld = r"C:\Some\Arb\Folder"
out_fld = os.path.join(in_fld, "constant")
end_fld = os.path.join(in_fld, "schwarzeck")
gdb = r"C:\Some\Arb\Folder\working.gdb"
sr_nam = arcpy.SpatialReference(r"C:\Some\Arb\Folder\Schwarzeck17.prj")
sr_cape = arcpy.SpatialReference(r"C:\Some\Arb\Folder\cape17.prj")
lyr = "XY_Layer"
excl = ["constant", "schwarzeck"]

def checkPath(p):
    if not os.path.exists(p):
        return "Created path " + p
        return "Path exists: " + p
print checkPath(out_fld)
print checkPath(end_fld)

for root, dirs, filenames in os.walk(in_fld, topdown=True):
    dirs[:] = [d for d in dirs if d not in excl] # Prune directories in-place
    for f in filenames:
        if f.endswith(".csv"):
            cur_csv = os.path.join(root, f)
            new_csv = os.path.join(out_fld, "CapeConstant_" + str(f.split(") ")[1]))
            # Read in the values and add the constant 
            with open(cur_csv, 'rb') as infile:
                with open(new_csv, 'wb') as outfile:
                    csvreader = csv.reader(infile)
                    csvwriter = csv.writer(outfile)
                    csvwriter.writerow(["y", "x", "z"])
                    for row in csvreader:
                        csvwriter.writerow([row[0], float(row[1]) - 2000000], row[2])
            print "Finished processing " + f

for root, dirs, filenames in os.walk(out_fld):
    for f in filenames:
        cur_xy = os.path.join(root, f)
        end_csv = os.path.join(end_fld, "Schwarzeck17_" + f.split("_")[1])
        ftr = os.path.join(gdb, f[:-4])
        # Create temporary points layer from coordinates in the csv, "y", "x", lyr, sr_cape)
        print "Created XY Layer for " + f
        # Copy temp layer to disk, ftr)
        print "Created feature class"
        # Project on the fly and write new coordinates to csv
        with open(end_csv, 'wb') as csvfile:
            csvwriter = csv.writer(csvfile)
            with arcpy.da.UpdateCursor(ftr, ["SHAPE@XY", "z"], "#", sr_nam) as cursor:
                for row in cursor:
                    csvwriter.writerow([row[0][0], row[0][1], row[1]])
        print "Processed " + end_csv