The Open Geospatial Consortium Data Quality Working Group has compiled a survey to assess global spatial data quality. Those who work with and create geospatial data are invited to participate. A background article describing the work of the Data Quality Working Group and the goals of the survey is below. This article was originally written by 1Spatial for the GITA newsletter, Network.
A global opportunity for spatial data quality measurement and reporting
Data quality is a massive concern for those involved in information technology and the software business globally. The Data Warehousing Institute estimated that data quality problems cost U.S. businesses more than $600 billion per year [1]. Closer to home for those working with spatial data, PIRA (Commercial Exploitation of Europe’s Public Sector Information, 2000 [2]) estimated that in 1999 it would cost the European Community countries €36bn to replace their geographical information assets. This amount was estimated to be growing at €4.5bn per annum. Similar costs for the US were estimated at $375bn with a $10bn growth per annum! These figures will almost certainly have accelerated in the aftermath of 9/11 with the focus on homeland security and geospatial intelligence. The increased emphasis and activity around Spatial Data Infrastructures is also driving a number of regional and national activities using geographical information assets or geospatial data, and its interaction with non-spatial or ‘regular’ data.
Many organisations invest significant sums in collecting these geospatial data. The demand in the 2000s is for joined-up decision-making. In a geospatial context, geometric data quality deserves as much attention as alphanumeric data quality. Most organisations spend many years collecting spatial data and integrating or conflating (merging together) their own data with third-party reference data. Examples of such third-party data are states or regions in the US using TIGER data with their own asset data, or Local Authorities in Great Britain using Land-Line or OS MasterMap® in conjunction with gazetteer information. After such a large initial investment, ensuring high-quality spatial data is essential to achieve value from the investment.
There are numerous reasons why spatial data may not meet user expectations, for example:
• The use of GPS has introduced a more rigorous accuracy for reference data sets
• GIS installations are only just beginning to understand the importance of metadata and configuration management
• Transformations alter information
• Lack of understanding or recognition that data may be inaccurate
• Data duplication – the same dataset is held by different departments but is rarely synchronised or even shared
• Conflation is only just becoming widespread in response to industry initiatives.
Taking these considerations into account, a draft Charter was submitted to the Open Geospatial Consortium with a view to establishing a Data Quality Working Group [3]. The Mission of this group was set: “To establish a forum for describing an interoperable framework or model for OGC Quality Assurance Web Services to enable access and sharing of high quality geospatial information, improve data analysis and ultimately influence policy decisions.” The Working Group will define a framework and grammar for the certification and communication of spatial data quality.
In order to build the framework or model for OGC Quality Assurance Web Services, it is necessary to ascertain what organizations involved in the marketplace currently understand and mean when using the term spatial data quality. The method used to describe and communicate spatial data quality measures will reference the categories below:
• Positional (in respect of re-using spatial data collected before GPS)
• Thematic
• Integrity
• Definition (for semantic interoperability)
• Validity & Classification
Reference shall also be made to the standards defined in ISO 19113, 19114 and 19113(8) when published [4]. ISO 19115 metadata standards may be relevant in the storage of such measures and quality descriptors. It is hoped that these methods will become a quality assurance guide for users of spatial data.
To facilitate these objectives the Data Quality Working Group has compiled a survey that is open to the entire geospatial community and wider IT industry using spatial data. We encourage you to participate in this unique opportunity to provide feedback on spatial data quality, in terms of your views on the issues you see in the market place today. The survey can be found here:
For the purposes of the survey spatial data quality is defined as a measure of spatial data’s relative fitness for the user’s intended purpose. There are several reasons why it is important to determine spatial data quality requirements in terms of fitness for purpose or use, for example:
– Legislative and regulatory requirements
– Improved and more valid decision making
– Cost reduction through operational efficiency gains
– Maximise return on your investment in spatial data
– Avoiding poor public image and customer perception
– Increased profits through better business intelligence
– Re-use Public Sector Information.
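The definition above, quality as relative fitness for the user's intended purpose, can be illustrated with a minimal sketch. It is not part of the Working Group's framework: the `Requirement` structure, the measure names, the thresholds and the sample values are all hypothetical, chosen only to show that the same dataset can be fit for one purpose and unfit for another.

```python
from dataclasses import dataclass


@dataclass
class Requirement:
    """A user's requirement for one quality measure (hypothetical structure)."""
    measure: str                    # e.g. "positional_error_m"
    threshold: float                # acceptable limit for this purpose
    higher_is_better: bool = False  # False: value must not exceed the threshold


def fitness_for_purpose(measured: dict[str, float],
                        requirements: list[Requirement]) -> float:
    """Return the fraction of requirements the dataset satisfies (0.0 to 1.0)."""
    if not requirements:
        raise ValueError("at least one requirement is needed")
    met = 0
    for req in requirements:
        value = measured.get(req.measure)
        if value is None:
            continue  # a quality that was never measured counts as not met
        if req.higher_is_better:
            met += value >= req.threshold
        else:
            met += value <= req.threshold
    return met / len(requirements)


# Illustrative values only, not taken from any real dataset.
measured = {"positional_error_m": 2.5, "attribute_completeness": 0.97}

# Two hypothetical purposes with different tolerances.
asset_management = [
    Requirement("positional_error_m", 1.0),
    Requirement("attribute_completeness", 0.95, higher_is_better=True),
]
regional_planning = [
    Requirement("positional_error_m", 10.0),
    Requirement("attribute_completeness", 0.90, higher_is_better=True),
]

print(fitness_for_purpose(measured, asset_management))   # 0.5
print(fitness_for_purpose(measured, regional_planning))  # 1.0
```

In this sketch the dataset fully meets the looser regional planning requirements but only half of the stricter asset management ones, which is the sense in which quality is relative to the intended use.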
Value is attributable if the data allow decisions to be taken with a given confidence. Confidence can be measured through assessing fitness-for-purpose. The next step is to develop a framework that allows confidence to be stated, so please take this unique opportunity to join an industry-wide collaboration and help to shape how spatial data is perceived and measured, and ultimately how it helps to improve business processes for all.
Steven Ramage is Business Development Director at 1Spatial, where he has responsibility for their international partner network, the 1Spatial Community. He was involved in drafting the Charter for the Open Geospatial Consortium Data Quality Working Group and contributed to the current survey. The DQ WG is currently chaired by Graham Stickler from 1Spatial and co-chaired by Patrick Cunningham from Blue Marble Geographics.
1. “Data Quality and the Bottom Line,” Wayne Eckerson, The Data Warehousing Institute, 2002. http://www.tdwi.org/display.aspx?id=6045
2. Pira International Ltd., University of East Anglia, and KnowledgeView Ltd. 2000. Commercial exploitation of Europe’s public sector information. Final report for the European Commission Directorate General for the Information Society. ftp://ftp.cordis.lu/pub/econtent/docs/commercial_final_report.pdf
3. Open Geospatial Consortium Data Quality Working Group
4. Data Quality Challenges in 2007, Dr Michael Sanderson, 1Spatial, January 18, 2007