List the duplicate values in a field in an attribute table

I received some asset management data, and the state of it led me to write this script to check whether any values were duplicated in the GIS ID field. There are a number of ways to do this, but I do like to know as many methods as possible before settling on the optimised one.

#
# @date 29/05/2015
# @author Cindy Williams
#
# Checks an attribute field for duplicate values,
# and displays them along with the amount of duplicates.
#
# For use in the Python window in ArcMap.
#

import arcpy

mxd = arcpy.mapping.MapDocument("CURRENT")
lyrs = arcpy.mapping.ListLayers(mxd)

def checkDuplicate(ftr, field):
    vals = [row[0] for row in arcpy.da.SearchCursor(ftr, field)]
    dct = {x:vals.count(x) for x in vals}
    print ftr.name
    for k, v in dct.iteritems():
        if v > 1:
            print k, v

for lyr in lyrs:
    field_name = arcpy.ListFields(lyr, "*_ID")[0].name
    checkDuplicate(lyr, field_name)

In Line 18 (the dictionary comprehension), each attribute value is stored as a key in the dictionary, along with the number of occurrences of that value. In Line 21 (the v > 1 check), only the keys with more than one occurrence are displayed. I wrote it as a function so that I could generate a report for all the layers in the map document.
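The dictionary comprehension calls vals.count() once for every row, so the whole list is rescanned each time, which can get slow on large tables. As a minimal sketch of one alternative (not what the script above does), collections.Counter tallies the values in a single pass. The function name find_duplicates and the sample IDs are made up for illustration; in ArcMap the list would come from the SearchCursor instead.

```python
from collections import Counter


def find_duplicates(values):
    """Return {value: count} for every value that occurs more than once."""
    counts = Counter(values)  # single pass over the values
    return {value: n for value, n in counts.items() if n > 1}


# Hypothetical sample IDs; in ArcMap the list could instead come from
# [row[0] for row in arcpy.da.SearchCursor(layer, field)].
sample_ids = ["A1", "B2", "A1", "C3", "B2", "A1"]
duplicates = find_duplicates(sample_ids)
for value, n in sorted(duplicates.items()):
    print("{0} {1}".format(value, n))
```

The print call is written so that it behaves the same in ArcMap's Python 2.7 and in Python 3.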

Writing this report out to a table may be more useful, especially if there are lots of duplicates, but then one may as well run Summary Statistics.
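If a quick file is enough and a geodatabase table is not needed, the report could be written to CSV with the standard library. This is only a sketch under assumptions: the function name write_duplicate_report, the column names and the sample layer data are all hypothetical, and it is written for Python 3. Under ArcMap's Python 2.7 the csv module expects a file opened in binary mode, so the in-memory buffer used here would need to be swapped out.

```python
import csv
import io


def write_duplicate_report(report, out_file):
    """Write one row per duplicated value: layer name, value, count."""
    writer = csv.writer(out_file, lineterminator="\n")
    writer.writerow(["layer", "value", "count"])
    for layer_name in sorted(report):
        for value, n in sorted(report[layer_name].items()):
            writer.writerow([layer_name, value, n])


# Hypothetical results: {layer name: {duplicated value: count}}.
report = {"Pipes": {"P-001": 2}, "Valves": {"V-014": 3, "V-002": 2}}

# In-memory buffer standing in for open("duplicates.csv", "w", newline="").
buffer = io.StringIO()
write_duplicate_report(report, buffer)
print(buffer.getvalue())
```

Passing a file object rather than a path keeps the function easy to try out without touching the disk.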
