Categorising spatially enabled dashboards

Lately I’ve been working with dashboards quite a bit. Clients who don’t really understand spatial data find it easier to digest the information on a map when it’s presented alongside the graphs and indicators they already know from Excel or Power BI.

Over the last few weeks, Julian has spent quite a bit of time setting up a number of dashboards using Operations Dashboard, each with a different purpose. On one project, we have a dashboard showing the client the real-time progress of fieldworkers on a map, along with graphs showing the breakdown of assignments in progress and completed per district. It enables the client to answer questions such as which worker is causing a bottleneck. This dashboard consumes the workers and assignments layers from the Workforce for ArcGIS project, along with the various Survey123 feature services.

On another project, we have a dashboard showing the results of an asset life cycle cost analysis model. This dashboard includes graphs depicting when the client can expect to incur the greatest cost to replace key assets, as well as helping to answer questions such as: Is it cheaper to replace an asset in 5 years, or to spend an additional amount on maintenance in 3 years in order to extend the remaining useful life of the asset by 7 years?
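Purely to illustrate the shape of that question (this is not our actual model), here is a toy present-value comparison in Python, with made-up amounts and an assumed discount rate:

def pv(amount, years, rate=0.08):
    # Discount a future amount back to today at the given rate
    return amount / (1 + rate) ** years

# Option A: replace the asset outright in 5 years
option_a = pv(1000000, 5)

# Option B: extra maintenance in 3 years, pushing replacement out to year 12 (5 + 7)
option_b = pv(150000, 3) + pv(1000000, 12)

print("Replace in 5 years: {:,.0f}".format(option_a))
print("Maintain now, replace in year 12: {:,.0f}".format(option_b))

The dashboard presents the results of this kind of calculation across the whole asset register, rather than for a single asset.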

We also have a number of ideas in development at the moment, including around the actual software we use to display the dashboards (that will have to be a post by itself). I’ve been mulling over how to package these different dashboard types as solutions to offer to a client, and decided to adapt the traditional categories to our purpose.

  • Operational: This is the basic dashboard, as detailed in my first example. This type will normally display two maps – one showing the real-time progress of fieldworkers and their assignments, and another showing the surveys they submit along with the actual data. Graphs may include the number of assignments completed per worker, per area, or along whichever dimension is most logical (or whatever the client prefers). Filters are included to drill down through the live data.
  • Analytical: This dashboard shows the results of analysing the data displayed on an operational dashboard (my second example). A single map can be used to display the analysis results per survey or per area. Graphs will vary according to client needs, but will be based on the survey points in the map. The user can interact with the dashboard by drawing up the various reports they need, creating pivot tables, filtering, etc.

My current dev efforts are focussed on a third type of dashboard. For a large project last year, I designed and implemented a mobile data capture solution which incorporated a QA process (to be carried out by professional engineers) as well as an invoicing process (to reduce turnaround time between carrying out the work and getting paid by the client). I’ll have to use another post to brainstorm that idea.

Convert a list of field names and aliases from Excel to table using ArcPy

I went digging through my old workspace and started looking at some of my old scripts. My coding style back then is almost embarrassing now 🙂 but that’s just the process of learning. I decided to post this script, which I wrote just before Esri released the Excel toolset in ArcGIS 10.2.


'''
@date 09/07/2013
@author Cindy Williams

Converts a spreadsheet containing field names and aliases
into a file geodatabase table.
'''
import arcpy

xls = r"C:\Some\Arb\Folder\test.xlsx\Fields$"
tbl = r"C:\Some\Arb\Folder\test.gdb\tbl_Samples"

with arcpy.da.SearchCursor(xls, ("FIELD_NAME", "ALIAS")) as cursor:
    for row in cursor:
        arcpy.management.AddField(tbl, row[0], "DOUBLE", field_alias=row[1])
        print("Adding field {}…".format(row[0]))
print("Fields added successfully.")

From what I can recall, I needed to create a file geodatabase table to store records of microbial sample data. Many of the field names were the chemical compounds themselves, such as phosphorus or nitrogen, or bacterial names. For brevity’s sake, I had to use the shortest field names possible while still retaining their full meaning.

I set up a spreadsheet containing the full list of field names in a column called FIELD_NAME and their aliases in ALIAS. I created an empty table in a file gdb, and used a SearchCursor on the spreadsheet to create the fields and fill in their aliases.

This solution worked for me at the time, but of course there are now better ways to do this.
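For instance, a minimal sketch of how I might do it today with pandas – assuming the same spreadsheet layout, with the paths made up:

import arcpy
import pandas as pd

xls = r"C:\Some\Arb\Folder\test.xlsx"
tbl = r"C:\Some\Arb\Folder\test.gdb\tbl_Samples"

# Read the Fields sheet straight into a data frame
df = pd.read_excel(xls, "Fields")

# Add a DOUBLE field for every row in the sheet
for _, row in df.iterrows():
    arcpy.management.AddField(tbl, row["FIELD_NAME"], "DOUBLE", field_alias=row["ALIAS"])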

Write contents of Word tables to a spreadsheet

I wrote this script for a former colleague last year. She had received a Word document containing about 600 tables, which must have been dumped out of a database somewhere. The tables all had the same header, and each table represented an “incident”, with dates, details, etc.

She had been asked to “put it into Excel”. After manually copying the first table into matching columns in a spreadsheet, she came to me. This type of thing is normally a task we would give to a student, as it has nothing to do with GIS. Nevertheless, when I saw the repeating structure, I was sure I could come up with something to do it automatically.


import docx
import xlwt

doc = r"C:\Some\Arb\Folder\input.docx"
xls = r"C:\Some\Arb\Folder\output.xls"

document = docx.Document(doc)
book = xlwt.Workbook()
cur_sheet = book.add_sheet("Tables")
row_num = 0

tables = document.tables  # Get all the tables in the docx

# Get the header row from the 1st table's 1st row
for index, cell in enumerate(tables[0].rows[0].cells):
    cur_sheet.write(row_num, index, cell.text)

for table in tables:
    for row in table.rows[1:]:  # Skip the repeating header row of each table
        row_num += 1
        for index, cell in enumerate(row.cells):
            if cell.text.strip() != '':  # Compare the cell's text, not the cell object
                cur_sheet.write(row_num, index, cell.text.strip())

book.save(xls)

The script finds all the tables in the document, and grabs the header from the first table to serve as the headings in the spreadsheet. It then iterates over all the tables, skips the header row and populates the spreadsheet with all the rows from the various tables.

It took about 15 minutes to write (I had to play around with accessing the table elements correctly) and less than a minute to extract the data. Fifteen minutes is roughly what it would have taken to copy just 5 of the tables manually; at that pace, doing all 600 by hand would have taken about 4 working days.
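One design note: xlwt only writes the old .xls format. If I redid this today for .xlsx output, openpyxl could stand in – a quick, untested sketch along the same lines:

import docx
from openpyxl import Workbook

document = docx.Document(r"C:\Some\Arb\Folder\input.docx")
book = Workbook()
sheet = book.active
sheet.title = "Tables"

tables = document.tables
# Header row from the first table
sheet.append([cell.text for cell in tables[0].rows[0].cells])
# Data rows from every table, skipping each table's header
for table in tables:
    for row in table.rows[1:]:
        sheet.append([cell.text.strip() for cell in row.cells])

book.save(r"C:\Some\Arb\Folder\output.xlsx")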

Convert spreadsheets to GDB tables and display XY

I’m sick today, so I decided to choose one of the simpler scripts for today’s post. This one automates a common task of the GIS professional: converting a spreadsheet of coordinates to GIS. The script assumes you’ve formatted each sheet appropriately.


#
# @date 28/07/2015
# @author Cindy Williams
#
# Converts all the sheets in a spreadsheet
# to file gdb tables and displays the XY
# coordinates in ArcMap.
#
# Assumes the spreadsheet has been formatted
# according to GIS standards.
#
# For use in the Python window in ArcMap.
#
import arcpy
import xlrd

arcpy.env.workspace = r"C:\Some\Arb\Folder\work.gdb"
sr = arcpy.SpatialReference(4326)
xls = r"C:\Some\Arb\Folder\Coords.xlsx"

wb = xlrd.open_workbook(xls)
sheets = [sheet.name for sheet in wb.sheets()]

field_x = "Long"
field_y = "Lat"

for sheet in sheets:
    print("Processing " + sheet)
    tbl = "tbl_" + sheet
    try:
        arcpy.conversion.ExcelToTable(xls, tbl, sheet)
        print("Converted " + sheet)
        arcpy.management.MakeXYEventLayer(tbl, field_x, field_y, sheet, sr)
        print("Displaying XY")
    except Exception as e:
        print(e)
print("Script complete.")

For each sheet in the workbook, the script creates a file geodatabase table, and displays the coordinates as a layer in ArcMap using the WGS 1984 spatial reference. The script is very basic, and therefore easy to change to suit your own data; one example of such a change follows.
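Since an XY event layer is virtual, you could persist each one as a point feature class with a single extra call. A sketch for one sheet, with hypothetical names ("tbl_Sites", "Sites"):

import arcpy

arcpy.env.workspace = r"C:\Some\Arb\Folder\work.gdb"
sr = arcpy.SpatialReference(4326)

# Create the virtual layer, then copy it to a permanent feature class
arcpy.management.MakeXYEventLayer("tbl_Sites", "Long", "Lat", "Sites", sr)
arcpy.management.CopyFeatures("Sites", "ftr_Sites")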

Create feature classes from a pandas data frame

I had a large CAD drawing which I had brought into ArcGIS, converted to a feature class and classified into groups of features using a three-letter prefix. I also had a spreadsheet containing a long list of those prefixes, along with additional columns of information for each prefix, including feature class name and shape type.

I wanted to create feature classes for dozens of these prefixes, based on the values in my converted feature class, using a template feature class for the field structure. The Select geoprocessing tool could easily have split out the features per prefix for me, but that would have kept the CAD feature class structure, not the structure I wanted.

I figured this would be a good time to get into pandas (and eventually geopandas, maybe).


#
# @date 24/06/2015
# @author Cindy Williams
#
# Creates feature classes by looking up the
# name in a pandas data frame
#
# For use as a standalone script
#
import arcpy
import pandas as pd

# Set workspace
arcpy.env.workspace = r"C:\Some\Arb\Folder\work.gdb"
# Template feature class
fc_template = "ftr_template"
# Spreadsheet containing the values
xl_workbook = r"C:\Some\Arb\Folder\assets.xlsx"

lyr_source = arcpy.management.MakeFeatureLayer("ftr_source", "lyr_source")
field_category = "Category"

# Get projection from template feature class
sr = arcpy.Describe(fc_template).spatialReference

# Create data frame from the first 7 columns (0-6) of the
# first sheet, using the Prefix column as the index
df = pd.read_excel(xl_workbook, 0, parse_cols=6, index_col="Prefix")

# Get the list of categories
categories = list(set(row[0] for row in arcpy.da.SearchCursor(lyr_source, field_category)))

for cat in categories:
    print("Processing " + cat)
    qry = """ "{0}" = '{1}' """.format(field_category, cat)
    # Look up the category in the data frame and return the matching feature class name
    fc_name = df.loc[cat, "Feature Class Name"]
    try:
        arcpy.management.CreateFeatureclass(arcpy.env.workspace,
                                            fc_name,
                                            "POINT",
                                            fc_template,
                                            "#",
                                            "#",
                                            sr)
        print("Feature class created: " + fc_name)
        lyr_cat = arcpy.management.MakeFeatureLayer(lyr_source,
                                                    "lyr_" + cat,
                                                    qry)
        arcpy.management.Append(lyr_cat, fc_name, "NO_TEST")
    except Exception as e:
        print(e)
    print("Finished " + cat)
print("Script complete.")

When reading the workbook, I load the first 7 columns of the first sheet into a pandas data frame. I set the index column to the column called “Prefix”, so those values will be used for the lookup instead of the default integer index pandas assigns.

Inside the loop, the prefix value from the feature class is used to look up the corresponding feature class name in the pandas data frame. Once the feature class has been successfully created, a selection layer of the matching features is appended into the new feature class. I could instead use a SearchCursor and store the matching features in a Python list to be copied into the new feature class; that’s something I will test at another time, but a rough sketch of the idea follows.
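A minimal, untested sketch of that cursor-based alternative – it reuses the names from the script above, and assumes the new feature class also contains the Category field:

# Pull the geometry of the matching features into a list,
# then insert them into the new feature class
fields = ["SHAPE@", field_category]
matching = [row for row in arcpy.da.SearchCursor(lyr_source, fields)
            if row[1] == cat]

with arcpy.da.InsertCursor(fc_name, fields) as icursor:
    for row in matching:
        icursor.insertRow(row)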

EX101x: Final recap

When I signed up for this MOOC, I expected more Python. I’ll quickly summarise the last 6 weeks of the course.

Highlights:

  1. Learning how to correctly use the Data Model introduced in Excel 2013. The way Felienne explained how to use it was quite simple, and realising that one uses a join between two ranges to create the model was an eye-opener.
  2. Double-click a value in a pivot table to get a new sheet containing a filtered copy of the underlying data: I can’t remember why I wrote this down, but I know it’s useful.
  3. More uses for named ranges
  4. Data checks: definitions and different types
  5. Using wild cards in COUNTIF: Seriously, I don’t know how I didn’t know this.
  6. DataNitro exists: This would have helped me about 2 years ago.

Lowlights:

  1. I was quite disappointed that Python only appeared in Week 7. More Python please!
  2. I was also disappointed that DataNitro was used, instead of accessing everything through the command line.
  3. Language barrier: I know the course coordinator is not a native English speaker. This created many ambiguous questions in the quizzes, as well as some weird phrasing in the videos.
  4. Fellow MOOC-ers not being constructive: I’ve mentioned on this blog several times that I don’t understand the need for many MOOCs to have peer assessments. One in particular took it a step further with mandatory forum posts. While I’m glad that EX101x did not do this, just opening up the “Discussion” below each quiz reminded me of why I don’t like the forums on these things.
    Many of the comments were about the way Felienne was dressed, or her “strange look”, or the way she spoke. Yes, the videos could have been better – maybe she didn’t have to take up almost half the frame, leaving the spreadsheet off to the side and hard to look at. What do her piercings have to do with that? Good grief.
  5. Not enough Python: I cannot stress this enough. I appreciate that I learnt some new things about Excel, but as a power user, I think I was expecting more analysis out of the course.
  6. Bringing in neo4j: It’s an interesting program, but bringing it in at such a late stage (basically at the end of the course) did not make sense. I tried to run it, but after getting errors I just abandoned it; I already had 81%, so I chose not to complete the neo4j exercises. After Python had been introduced crash-course style, the different set of commands needed for neo4j was needlessly confusing.

Suggestions:

  1. More Python. More Python. More Python.
  2. The videos need to be redone in a clearer way.
  3. The course needs to be structured differently. Python should be introduced much sooner.
  4. Neo4j seems like a nice tool on its own. It should also have been introduced earlier in the course, to allow for more time for the people still getting to grips with Python.

Overall, I did enjoy this MOOC. I enjoyed it enough to pony up for a verified certificate.


If DelftX offers a follow-up course, I would definitely take it.

Using INDEX & MATCH instead of VLOOKUP

I’ve been using Excel since Office 97. I was about 8 years old, and had no idea what I was doing. By the time I had progressed to VLOOKUPs and PivotTables, I was a bit older (and wiser, I’d like to think).

It was to my chagrin that I discovered, about 2 years ago, that instead of creating a clusterbomb of VLOOKUP/CHOOSE (sometimes with some nested IFs sprinkled in for good measure), I could have been using INDEX/MATCH all along.

=INDEX(x, MATCH(y, z, 0))

x = Range to return a value from
y = Value to look up
z = Range to look up the value in

I’ve made a small spreadsheet to demonstrate. Please ignore the fact that copying the formulas from Excel Online has changed them to a different regional format, hence the semi-colons as delimiters instead of commas.

Cell E2 does the standard VLOOKUP

=VLOOKUP(D2;$A$2:$B$9;2;FALSE)

while F2 uses INDEX/MATCH to achieve the same thing

=INDEX($B$2:$B$9; MATCH(D2;$A$2:$A$9;0))

That looks a bit more complex than the VLOOKUP. The real power is demonstrated in the next example. To achieve E3, VLOOKUP does the following

=VLOOKUP(D3; CHOOSE({2\1};$A$2:$A$9;$B$2:$B$9);2;FALSE)

while INDEX/MATCH follows the same structure as before in F3

=INDEX($A$2:$A$9; MATCH(D3;$B$2:$B$9;0))

This is why I am now more inclined to use INDEX/MATCH. The ability to do a lookup in any column, and return a value from any column, is great. It really helps on those occasions when I’m not simply loading the data from the spreadsheets into Python dictionaries and pulling it apart there.
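Speaking of which: when I do take the Python route, a dictionary gives the same any-column lookup in both directions. A small sketch, assuming openpyxl and the same two-column layout as the demo sheet (path made up):

from openpyxl import load_workbook

wb = load_workbook(r"C:\Some\Arb\Folder\lookup.xlsx")
ws = wb.active

# Build the lookup from columns A (key) and B (value), skipping the header
lookup = {row[0].value: row[1].value
          for row in ws.iter_rows(min_row=2, max_col=2)}

# Reversing it gives the "other direction", like the second INDEX/MATCH example
reverse = {value: key for key, value in lookup.items()}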

EX101x: Week 1 and 2 recap

I haven’t been as consistent with these recaps as I would like to be, but as I get closer to my first exam in 5 years, I’ve been cooling it on the MOOC front. As a result, I’m mostly ending up covering two weeks in one recap now.

Week 1 focussed on using conditional functions like IF, COUNTIF, SUMIF etc. Week 2 was about using lookup and search functions like VLOOKUP, INDEX and MATCH. A few things which I did for the first time:

  1. Import data into a sheet from a web feed: I’ve never had a need to use this function, and I’m not sure if I have the need now, but it’s good to know.
  2. Using a time criterion for COUNTIF: So useful, and I can’t believe that in my 15 years of Excel usage, I have never needed to do something like this.
  3. A use case for setting the range_lookup for VLOOKUP as TRUE: I use VLOOKUP a lot (or used to, before I switched to doing most of my analysis in Python). I also use a lot of nested IF statements for ranges of items, and some combination of INDEX/MATCH.

    At school, we were always taught “FALSE with VLOOKUP”, because every scenario always required exact matches. I never thought to question that, nor did I have the need to, because with all the GIS data that needs pre-processing, we always require exact matches.

    Now that I know that, by creating a lookup table and using TRUE, VLOOKUP can match value ranges the same way a nested IF statement would (e.g. a two-column table of band thresholds and labels, queried with =VLOOKUP(value; table; 2; TRUE)), I can see how that would really have helped me out on a few projects a while ago.

A big part of why I focussed on learning more Python wherever I could is that I was tired of exiting ArcGIS, manipulating data in Excel, then bringing it back into ArcGIS. With Python, I could do all the data manipulation I wanted right there, and then take the results into Excel for non-GIS people.

I’m also really pleased that I can still learn things in Excel to increase my productivity. One becomes so accustomed to doing things a certain way, that if that method is fast enough, one tends not to look for a replacement.

DelftX EX101x: Excel and Python? Yes please

This MOOC has me excited. Excel makes me excited, Python makes me excited, GIS makes me excited, and I get to work with some combination of them every day.

I’m hoping this MOOC will show me how to work some Python magic in Excel beyond my usual reading values out/writing values in. Also, I haven’t detected any trace of peer assessments, though the lecturer is quite funky, so I’ll have to wait and see.

I’m a bit behind in it at the moment because I’ve been distracted by one of my side projects, which I should be posting about very soon!

Build a definition query from an existing spreadsheet

I wrote this script about 2 years ago, to set a long definition query on a layer from data contained in a spreadsheet. I can’t remember why I couldn’t simply join the spreadsheet to the layer, but I’m in the process of adding all my scripts to GitHub, so even if I can no longer recall the reason for the script, its logic should still be able to help someone.


#
# @author Cindy Williams
# @date 25/06/2013
#
# Loops over an existing spreadsheet with attribute data
# and builds up a definition query to apply to the dataset.
#
# For use in the Python window in ArcMap.
#
import arcpy
import arcpy.mapping as MAP

# Input variables
xls = r"C:\Some\Arb\Folder\Data.xlsx\Sheet1$"  # Spreadsheet
mxd = MAP.MapDocument("CURRENT")  # Current map document
lyr = MAP.ListLayers(mxd)[0]  # First layer in the map
lst_qry = []  # List to hold the query clauses
field_name = "Layer"

# Loop over the spreadsheet using a search cursor,
# appending a clause for each value to the list
with arcpy.da.SearchCursor(xls, field_name) as cursor:
    for row in cursor:
        defn_qry = """ "{0}" = '{1}' """.format(field_name, row[0])
        lst_qry.append(defn_qry)

# Build a string representation of the definition query and apply it to the layer
lyr.definitionQuery = ' OR '.join(lst_qry)

# Refresh ArcMap to see the query applied
arcpy.RefreshActiveView()

A few things to note. First, the import:

import arcpy.mapping as MAP

I went through a phase where most of my scripts involved data driven pages, and I was getting tired of typing out the full reference. However, I have since returned to the more verbose way of typing it; my laziness should not violate Python’s philosophy of readability, even if the import style hardly matters in scripts this tiny. Second, applying the query:

lyr.definitionQuery = ' OR '.join(lst_qry)

I used join to create one bulky SQL query. This is terrible, but it gets the job done, and it helped me evolve my understanding of the peculiarities of using SQL in ArcGIS.
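If I were writing it now, I’d collapse the chained ORs into a single IN clause, which stays better behaved as the list grows. A sketch, reusing the variables from the script above:

# Build one IN clause instead of chaining OR conditions
values = "', '".join(str(row[0]) for row in arcpy.da.SearchCursor(xls, field_name))
lyr.definitionQuery = "\"{0}\" IN ('{1}')".format(field_name, values)
arcpy.RefreshActiveView()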