GIS data. ESRI has a tab in their help documentation dedicated to the topic. As a GIS person, I may have become accustomed to receiving (and accepting) data in almost every conceivable format, because I know that by some process, I can get it into GIS.
Here are a few examples of data I have received over the last few weeks, and the conversion process followed for each:
- As-built drawings (dwg/dxf), coordinated, but missing a constant: Set the coordinate system in ArcCatalog. Add the study area boundary to ArcMap, followed by the Polyline layer from the CAD file. Right-click the Polyline layer > Properties > Transformations > Enable Transformations > Coordinates. Click OK, then go back into the Properties and enable transformations again, because ArcMap only remembers the setting the second time. Add the constant to the coordinates, and press OK. Hope that the drawing falls in the right place. When it does, convert the Polyline layer to a feature class. Project it, then inspect the attribute table. Hope that the CAD technician has placed matching elements in the same layer. Extract the features by unique attribute, ignoring any extraneous data of type Circle/Arc, or any layout items.
- Dozens of KML files, each containing a single point: Batch convert the KML files to feature classes. While this does create multiple GDBs, it ensures that each file is checked before being extracted for processing. If I’m fairly confident in the data, I will run a script to convert the KMLs and merge them into a single feature class at the same time.
- PDF files of maps which have been drawn on and scanned: Convert the relevant pages from the PDF to JPG. Georeference the JPG if it contains identifying features such as cadastre (with erf numbers). If there is no line data, it may be OK to eyeball it when digitising.
- A0 hard copies of maps/as-builts with no digital copy: Eyeball it.
- Spreadsheets with road names and intersections (no GIS IDs): Format the spreadsheet so that it can be converted to a file gdb table (no spaces in field names, unnecessary columns removed, etc.). Hope that the unique combination of road names and intersections will match perfectly with the road feature class.
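The last item, joining spreadsheet rows to road features when there are no GIS IDs, can be sketched in plain Python. This is only an illustration of the idea, not the process I actually run: the three-part key (road, from-intersection, to-intersection), the function names and the ID `RD-00001` are all assumptions made for the example. The key is normalised for case and whitespace, and the two intersections are sorted so that the direction of capture does not matter.

```python
import re


def road_key(road, from_road, to_road):
    """Build a normalised join key from a road name and its two intersections."""
    def clean(name):
        # Upper-case, trim, and collapse repeated whitespace.
        return re.sub(r"\s+", " ", name.strip().upper())

    # Sort the intersections so A-to-B and B-to-A give the same key.
    return (clean(road), *sorted((clean(from_road), clean(to_road))))


def match_segments(sheet_rows, features):
    """Split spreadsheet rows into matched and unmatched.

    sheet_rows: list of (road, from_road, to_road) tuples (hypothetical layout).
    features:   dict mapping (road, from_road, to_road) -> GIS ID.
    """
    lookup = {road_key(*segment): gis_id for segment, gis_id in features.items()}
    matched, unmatched = {}, []
    for row in sheet_rows:
        gis_id = lookup.get(road_key(*row))
        if gis_id is None:
            unmatched.append(row)  # needs a human to look at it
        else:
            matched[row] = gis_id
    return matched, unmatched
```

Anything that lands in the unmatched list (for example "Main Rd" versus "Main Road") still has to be resolved by hand, which is where the hoping comes in.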
Sometimes the CAD files are not coordinated, in which case I send them back. Sometimes we get old shapefiles, which have long lost their unique GIS IDs. One time, I received a personal geodatabase (!!!) containing feature classes with a single ID attribute each. Their “matching” attribute tables were stored in separate dbf (!!!) files per folder per service. These dbfs contained many attributes, everything except the IDs needed to join the data back to the shapes. This is where I had to “work the magic” to get anything usable out.
I haven’t covered all the scenarios, but that’s just about getting the data into GIS. Once it’s in, the data needs to be digitised (if new assets have been added), or the previous datasets must be inspected and the relevant features extracted (if an asset has been upgraded).
Some discretion needs to be used throughout this process. Time constraints and the current state of the data for a municipality will determine the level of detail which is captured. By putting this methodology in place, I am hoping to change that approach so that in the future, a standard amount of data is captured in a standardised way.
Due to the growing number of features we were being asked to record as assets, I decided to create a spreadsheet (which will eventually become a table in a database) to separate the services and to specify the prefixes needed for the GIS IDs. A GIS ID is composed of a prefix, a dash and a string of numbers. For example, the prefix for Water Reticulation Pipeline is WRP, so a feature in this feature class may have the GIS ID WRP-00101. Whenever a new asset is added, I run a script to autopopulate the next GIS IDs.
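To make the ID format concrete, here is a minimal sketch of how the next GIS IDs could be generated. It is not the script I run against the geodatabase: the function name is made up, and the five-digit zero padding is an assumption taken from the WRP-00101 example.

```python
import re

# Prefix of capital letters, a dash, then digits (e.g. WRP-00101).
GIS_ID_PATTERN = re.compile(r"^([A-Z]+)-(\d+)$")


def next_gis_ids(existing_ids, prefix, count, width=5):
    """Return `count` new IDs that continue after the highest number used for `prefix`."""
    highest = 0
    for gis_id in existing_ids:
        # Newly added features have an empty or null ID, so skip those.
        match = GIS_ID_PATTERN.match(gis_id or "")
        if match and match.group(1) == prefix:
            highest = max(highest, int(match.group(2)))
    start = highest + 1
    return [f"{prefix}-{n:0{width}d}" for n in range(start, start + count)]
```

Continuing from the highest existing number, rather than filling gaps, means an ID that belonged to a deleted asset is never reused.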
Currently, the list contains over 70 feature types we need to maintain. Each service has its own feature dataset. For example, fds_WaterSupply contains ftr_wrp_Pipe, ftr_why_Hydrant, ftr_wva_Valve etc. The naming convention is not only for consistency (all Water Supply prefixes start with W, all Stormwater prefixes start with SW), but also for the eventual transition to a SQL Server database. This way, the feature classes will be grouped according to the service they belong to, because SQL Server Management Studio displays the feature class tables in alphabetical order and ignores feature datasets. (One of our clients actually pointed out this helpful tip.)
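The effect of the naming convention on an alphabetical listing can be shown with a small sketch. The helper names are hypothetical, the stormwater prefix SWP is invented for the example, and the case-insensitive sort is only an approximation of how a table list might be ordered.

```python
def feature_class_name(prefix, label):
    """Build a feature class name such as ftr_wrp_Pipe from a GIS ID prefix and a label."""
    return f"ftr_{prefix.lower()}_{label}"


def listing_order(names):
    """Sort names alphabetically, ignoring case, to mimic a flat table listing."""
    return sorted(names, key=str.lower)


names = [
    feature_class_name("WVA", "Valve"),
    feature_class_name("SWP", "Pipe"),  # hypothetical stormwater prefix
    feature_class_name("WRP", "Pipe"),
    feature_class_name("WHY", "Hydrant"),
]
for name in listing_order(names):
    print(name)
```

Because the prefix comes straight after ftr_, the Water Supply feature classes end up next to each other in the listing even though the feature dataset is no longer there to group them.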
I am enforcing certain capture rules based on the requests of the asset guys: roads are captured intersection to intersection, sewer pipelines are captured manhole to manhole, water pipelines are captured road intersection to road intersection, and parking areas are captured as polygons and converted to centroids with the polygon area attached to the point. I don’t have the actual topology set up, because at this stage it would add unnecessary complexity. Rather, this capturing convention will become a habit, and as we clean up the older datasets, we will automatically be cleaning topology errors as well.
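For the parking areas, the polygon-to-point step boils down to computing an area and a centroid. In practice a GIS tool does this, but the arithmetic can be sketched with the shoelace formula. The function name is made up, and the sketch assumes a simple (non-self-intersecting) polygon without holes, in a projected coordinate system so that the area comes out in square map units.

```python
def polygon_area_and_centroid(ring):
    """Return (area, (cx, cy)) for a simple polygon given as a list of (x, y) vertices."""
    ring = list(ring)
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # close the ring if the last vertex is not repeated

    twice_area = 0.0
    cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:]):
        cross = x0 * y1 - x1 * y0  # shoelace term for this edge
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross

    if twice_area == 0:
        raise ValueError("polygon has no area")

    signed_area = twice_area / 2.0
    centroid = (cx / (6.0 * signed_area), cy / (6.0 * signed_area))
    # The sign only reflects vertex order, so report the absolute area.
    return abs(signed_area), centroid
```

One thing to keep in mind is that the centroid of an L-shaped or otherwise concave parking area can fall outside the polygon itself.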
Despite speaking at length about the data, I have only scratched the surface of what we do with the asset data when it comes to us. In Part 4, I will talk about what happens to the data once it’s been processed by GIS.