Overcoming the Make Query Table bug in ArcGIS

According to my notes, I first used the Make Query Table tool in my first week at Aurecon, back in March 2012. It was the first of many, many times, because often when receiving spatial data in a non-spatial format from a non-GIS user, the first thing that gets thrown out is any trace of the original spatial component.

At some point, I realised the tool’s expression parameter was a bit wonky. As I have come up against this problem every few months since (forgetting it happens each time because I only thought to write down a note about it now), I have decided to immortalise it in a gist below.


I passed my registration exams (and other news)

I know it’s already a month into the new year, and I haven’t been active on the blog for a while, but that’s mostly because I had nothing to post. However, that will all be changing because in January I returned to Aurecon. This time around, I’m fulfilling a technology expert role within the Asset Management team, so I’ll be designing and driving our technology strategy going forward. I’ve got a few interesting things lined up, particularly around spatial data warehouses and ontologies, so I’ll be posting about it alot here while I work through my ideas.

I also got my SAGC exam results today – 100% for Paper A and 96% for Paper F, so once the fees are paid, I will be registered as a Geomatics Technologist. When I’m finished with my BscHons next year, I’ll be able to upgrade to Practitioner as I’m just short of the academic requirement.

Speaking of Hons, I just registered for the 2nd (final) year, and am currently studying for 3 exams next week. To be honest, it has been painful studying, simply because many of the concepts we have discussed are outdated and having been in industry now for almost 6 years, it is irritating to have to study these things. I’m just pushing through to get that piece of paper. Hopefully I get a good research topic for the project (or can convince one of the supervisors to take me with my own topic). I’ll worry about that after the exams.

I’ll probably be able to return to my normal posting schedule in March. I just thought it was about time I post something.

Aaaaand now I'm thinking about how much I miss Community. #SixSeasonsAndAMovie

Aaaaand now I’m thinking about how much I miss Community. #SixSeasonsAndAMovie

Thoughts on spatial data warehousing

I’ve been thinking quite a bit lately about how to store spatial data. It’s something I’ve covered here and my attitude towards this topic has evolved over the years.

The organisations I’ve worked in have mountains of spatial data accumulated over the years. The data is stored in shapefiles, geodatabases, normal databases, spreadsheets, documents, reports, photos…Why is it this way? It doesn’t have to be this way. It shouldn’t be this way!

In the course of my research for a topic for my project for next year, I’ve honed in on the methods for implementing an enterprise geoportal within an existing spatial data infrastructure. However, I feel like my focus is shifting to the data that the geoportal is trying to expose to a larger audience.

The concepts of a spatial data warehouse and a spatially enabled operational data store have been intriguing me. A regular GIS task involves comparing spatial data across a time period, analysing trends and presenting the results in a map or report. Why aren’t we storing this historical data in a SDW that’s optimised for reporting?

Non-spatial data can come from a variety of sources as well – spreadsheets, other databases etc. Another common GIS task is to spatially enable these datasets. Why are we not storing the outputs in a spatial enabled operational data store in an open format like GML?

I think it’s because to plan and implement a SDW/S-ODS takes time (and money). With a normal EDW, the organisation will not need much convincing to see the benefit of implementing one. “Spatial” is still seen as an “add-on”, or a “nice-to-have”.

How to run VBA code from a Python script

I recently modified a script I wrote to extract data from a Word document to a csv file. The modified script had to iterate over multiple docs and extract data from certain tables based on certain keywords and fields.

I used the python-docx module to do this, but hit an obstacle when I realised that it could not (as yet) parse Word’s content controls. Since I only had 9 documents, I opened each, pasted some VBA code pilfered off StackOverflow to remove all content controls from the document.

While that worked temporarily, my next step is of course to schedule the script to automatically pull the data out once the folder is updated with the new batch of docs for the month. A solution suggested entails the code being saved inside the doc so it can be called via com.

I’m not happy with that solution because I would still need to open each document and insert the code. What I need to do now is fiddle around some more so that the code can be saved inside the script and then run on each document as needed.

The issue with names

I recently underwent a name change, and though I have yet to make it official (who wants to waste a Saturday at Home Affairs?), I have been thinking about the implications of my name change.

Now that my surname contains a hyphen and is 18 characters long (with my full name now 26 characters), I’ve been wondering how I should abbreviate it. Some hasty Googling shows that there is no standard for this. My entire life I just assumed that the first part of the surname takes precedence so the initials remain the same. In other words, Cindy Lee Williams (CLW) becomes Cindy Lee Williams-Jayakumar (CLW).

I toyed around with the idea of dropping my middle name (itself having been an issue with people assuming I’m Cindy-Lee and not Cindy Lee) to become Cindy Williams-Jayakumar (CW), but the thought of having only two initials terrified me.

I came across this blog post which calls out the assumptions programmers make when building systems which need to accept names (I’m guessing that’s about 95% of all systems). Now that my name has become slightly more complicated, I’m going to be more aware of my own assumptions when writing code, and not just when it comes to validating names.

I’ve also decided to be a bit more difficult and use CWJ as my initials. I had CLW for 27 years, it was time for a change.

Convert a list of field names and aliases from Excel to table using ArcPy

I went digging through my old workspace and started looking at some of my old scripts. My style of coding back then is almost embarrassing now 🙂 but that’s just the process of learning. I decided to post this script I wrote just before ArcGIS released their Excel toolset in 10.2.

From what I can recall, I needed to create a file geodatabase table to store records of microbial sample data. Many of the field names were the chemical compound themselves, such as phosporus or nitrogen, or bacterial names. For brevity’s sake, I had to use the shortest field names possible while still retaining the full meaning.

I set up a spreadsheet containing the full list of field names in column FIELD_NAMES and their aliases in ALIAS. I created an empty table in a file gdb, and used a SearchCursor on the spreadsheet to create the fields and fill in their aliases.

This solution worked for me at the time, but of course there are now better ways to do this.