Jennifer Kennedy, a GIS Analyst at Timmons Group, discusses the importance of managing GIS data quality and some GIS solutions for managing GIS data and ensuring its integrity.
There is no doubt about it – the GIS industry is trending towards more open data, interconnectedness, and public engagement on a grand scale. The key to tackling current and future mapping and spatial problems is to invest in high quality data NOW. Data is available from so many sources, allowing much-needed insight into the environment, infrastructure and the fabric of the world around us – or, at least, that’s the intent.
We all know the reality is a little more complicated. There is a vast amount of data available from a multitude of sources, and sorting through it all to find the most impactful, complete, and relevant information for the issues at hand can be very challenging. What does this mean for organizations both authoring and consuming data? Data quality and the tools to assess it are incredibly important, and organizations need to take advantage of tools that simplify the tasks of data management while ensuring that data meets quality standards.
Barriers to sharing data are falling, attitudes towards data sharing are changing, and new platforms and technologies are reducing the effort it takes to create and host web content. Organizations realize how important spatial data is for making decisions and how beneficial it can be to allow other groups, including the public, access to that data. It’s also easier than ever to share data among colleagues, clients, and the public. Esri’s ArcGIS Online platform is a great tool for organizations looking to author web maps and apps to serve GIS content to public and private groups. As a data user, it is also a powerful platform that provides quick access to content and the tools to rapidly build maps with data from disparate sources.
Managing GIS Data Quality
With such ease of access and visibility comes the responsibility and critical need to ensure the quality of the data being shared. How do organizations maintain data, judge and ensure its quality, and communicate the level of quality to non-GIS folks? What does quality mean? Many organizations face problems when migrating data between formats, transitioning from legacy systems, and evolving work processes that once involved custom applications. Such roadblocks can waste a lot of staff time and effort.
On another front, organizations are also realizing more ways to interact across departments and utilize data in ways perhaps not originally envisioned when the data was first created. Local governments and other organizations are looking toward their GIS departments for new ways to apply their data to create efficient solutions to business problems. Adapting data for new purposes can be challenging, and sometimes just assessing how useful the data is, or where it needs to be improved, can be difficult. For example, a locality might need to start billing residents and businesses for stormwater runoff, and may notice that it already has a GIS layer representing impervious surfaces. How accurate is the impervious surface data? Does it align correctly with imagery and with the parcel data? What other relevant data does the organization already have, and how can that be leveraged to determine where the data is faulty, and where it is accurate? These are the types of questions that need to be answered before notifying the public, getting comments, and sending out bills.
Using GIS Tools to Share Data
Esri, the leader in providing geospatial software and solutions, enables organizations to create responsible and sustainable solutions to problems at local and global scales. There are some powerful Esri extensions that make management of data quality and integrity a less complicated task. ArcGIS Online, Workflow Manager, Task Assistant Manager, and Data Reviewer are all tools that make the duty of ensuring data quality simpler and more efficient.
During any data production process, an organization must define the standards of data quality. After these standards are defined, workflows to assess, create, and maintain quality data can then be implemented using tools such as Esri’s Workflow Manager and Data Reviewer. The hypothetical Stormwater organization would need to determine, for example, if impervious surface polygons are allowed to cross parcel lines, and what sort of exceptions can exist. Then the organization can set up Data Reviewer tasks that check which impervious surface features violate those rules, allowing editors to fix the errors or determine solutions. Data Reviewer is a robust tool for validating data quality rules in an ad hoc way, but its true strength is that it documents those rules and allows them to be easily shared among users, or to simply be run periodically. That is one of the keys to success with data projects – not only to define what quality means, but to verify early and often that the data meets those quality markers.
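To make the idea of documented, repeatable quality rules concrete, here is a minimal sketch in plain Python. It is not Data Reviewer and does not use any Esri API. Every name in it (`Box`, `Surface`, `Rule`, `run_checks`, the `allow_crossing` exception flag) is hypothetical, and simple rectangles stand in for real polygon geometry. It only illustrates the pattern described above: write each rule down once with a name and a description, then run the whole set against the data as often as needed.

```python
from dataclasses import dataclass
from typing import Callable, NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle standing in for a real polygon (an assumption)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def contains(self, other: "Box") -> bool:
        """True when `other` lies entirely inside this box, edges included."""
        return (self.xmin <= other.xmin and self.ymin <= other.ymin
                and self.xmax >= other.xmax and self.ymax >= other.ymax)


@dataclass
class Surface:
    """Hypothetical impervious surface record."""
    surface_id: str
    shape: Box
    allow_crossing: bool = False  # a documented exception, e.g. a shared driveway


@dataclass
class Rule:
    """A named, described check, so the rule itself serves as documentation."""
    name: str
    description: str
    check: Callable[[Surface, list[Box]], bool]  # returns True when the feature passes


def within_one_parcel(surface: Surface, parcels: list[Box]) -> bool:
    """Pass if the surface is a documented exception or fits inside one parcel."""
    if surface.allow_crossing:
        return True
    return any(parcel.contains(surface.shape) for parcel in parcels)


def has_area(surface: Surface, parcels: list[Box]) -> bool:
    """Pass if the surface has non-zero width and height."""
    s = surface.shape
    return s.xmax > s.xmin and s.ymax > s.ymin


RULES = [
    Rule("within_one_parcel",
         "Impervious surfaces may not cross parcel lines unless flagged as an exception.",
         within_one_parcel),
    Rule("has_area",
         "Impervious surfaces must have a non-zero area.",
         has_area),
]


def run_checks(surfaces, parcels, rules=RULES):
    """Return one (surface_id, rule name) pair for every failed check."""
    return [(s.surface_id, r.name)
            for s in surfaces
            for r in rules
            if not r.check(s, parcels)]


# Made-up sample data: two adjacent parcels and four surfaces.
parcels = [Box(0, 0, 10, 10), Box(10, 0, 20, 10)]
surfaces = [
    Surface("A", Box(1, 1, 4, 4)),                        # inside the first parcel
    Surface("B", Box(8, 2, 12, 5)),                       # crosses the shared parcel line
    Surface("C", Box(8, 6, 12, 9), allow_crossing=True),  # crosses, but is an exception
    Surface("D", Box(3, 3, 3, 5)),                        # zero width
]
violations = run_checks(surfaces, parcels)
print(violations)
```

Because the rules live in one list, the same checks can be shared among editors or rerun on a schedule, and the list of violations gives editors a starting point for fixes.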
Once the group has determined the best way to edit polygons, how to address common problems, and who needs to perform different tasks, those workflows can be easily documented in Workflow Manager and Task Assistant Manager. With Workflow Manager, the organization can establish roles for different individuals. Some may edit the data, and others may run quality checks on the data and approve it. Workflow Manager allows the organization to capture that overall workflow, setting up an overarching process that guides the work from start to finish, as different people edit the data. It works within multiuser databases, automating tasks like creating new versions, updating map documents to point to those new versions, kicking off data processing tools or scripts, and assigning different tasks, like quality control, to the correct people at the appropriate time.
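The role-based handoff described above can be pictured as a small state machine. The sketch below is plain Python and is not Workflow Manager's API. The step names, the role names, and the `Job` class are all hypothetical, chosen only to show how an ordered workflow can hand each step to the right role and keep a record of who did what.

```python
from dataclasses import dataclass, field

# Hypothetical ordered steps: (step name, role allowed to complete it).
STEPS = [
    ("create_version", "editor"),
    ("edit_polygons", "editor"),
    ("quality_check", "reviewer"),
    ("approve", "reviewer"),
]


@dataclass
class Job:
    """One unit of work moving through the steps in order."""
    name: str
    position: int = 0
    history: list = field(default_factory=list)

    @property
    def current_step(self):
        """Name of the step waiting to be done, or None when the job is finished."""
        return STEPS[self.position][0] if self.position < len(STEPS) else None

    def complete_step(self, user: str, role: str) -> None:
        """Record the current step as done, but only for a user with the right role."""
        if self.position >= len(STEPS):
            raise ValueError("job is already finished")
        step, required_role = STEPS[self.position]
        if role != required_role:
            raise PermissionError(f"{step} must be completed by a {required_role}")
        self.history.append((step, user))  # the record of who did what
        self.position += 1
```

In this sketch an editor can carry a job through the two editing steps, after which only a reviewer can complete the quality check and approval. The `history` list plays the part of the built-in record that makes such a workflow self-documenting.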
With Task Assistant Manager, the organization can set up micro-level editing workflows within ArcMap. Task Assistant Manager is a window that sits directly within ArcMap, providing helpful information and putting all the needed tools directly at the editors’ fingertips. It’s much more useful than a paper tutorial, because it interfaces directly with ArcMap, allowing the user to simply click the step in the workflow to open the needed tool.
These tools, Task Assistant Manager and Workflow Manager, are fully configurable and customizable. They require a little overhead in the beginning of a project, when setting them up, but they save time in the long run because they automate tasks and save users from needing to hunt down tools.
Workflow Manager, Task Assistant Manager and Data Reviewer are especially useful because they are self-documenting. By the very nature of the tools, there is a record of how the data is created and validated. This kind of documentation is essential, and not always available in an organization, especially when data or processes are very old. Sometimes not even paper tutorials exist, or different users have their own notes on processes, and not everyone may be working in the same way. Workflow Manager and Task Assistant Manager provide a mechanism for centralizing those workflows and processes, helping to ensure that everyone in the organization has access to the same tools and is using the same methods to create data. This necessarily impacts data quality, because different processes can easily lead to inconsistency in the data.
Where does ArcGIS Online fit into a quality control strategy? It actually can be a very useful tool for quality assessment, simply because it makes sharing data between groups, and especially with non-GIS staff, very easy. Revisiting our Stormwater organization, imagine that the GIS-centric Stormwater group is separate from the financial billing group. The GIS group might need input from the billing group, who are not GIS users. One solution could be to set up a simple, yet secure and internal, map on ArcGIS Online, providing a user-friendly web-based interface for the billing group to view the data as it is being created and edited, and allowing them to provide feedback. ArcGIS Online can be a powerful tool, allowing different groups to view and interact with data in a quick and easy way, and enabling groups to work together to refine and improve data quality.
To sum it all up, organizations need to invest time early in the data production cycle in defining and ensuring data quality, to avoid headaches and wasted effort down the road. Tools such as Data Reviewer, Workflow Manager, Task Assistant Manager, and ArcGIS Online empower organizations to make the best use of their time and resources by automating repetitive tasks and ensuring everyone has the same tools to work efficiently.
About the Author:
Jennifer Kennedy is a GIS Analyst at Timmons Group. Her experience includes data analysis and workflow design, with a focus on validating and maintaining data health. The projects that she has worked on have focused on data conversion, mapping, and implementation of quality control and workflow tools from Esri’s Production Mapping suite. Jennifer has conducted several training seminars introducing tools such as Esri’s Data Reviewer and Task Assistant Manager, and she regularly provides training for clients on these topics.