Data quality is the degree to which data satisfies a given objective. In other words, data quality can be described as the completeness of the attributes needed to accomplish a given task. Both the private sector and various mapping agencies produce data and assess it against data quality standards in order to obtain better results. Data created through different channels with different techniques can show discrepancies in resolution, orientation and displacement. Data quality is a pillar of any GIS implementation and application, because reliable data are indispensable for obtaining meaningful results.
Spatial data quality can be categorized into data completeness, data precision, data accuracy and data consistency.
- Data Completeness: This is a measure of the totality of features. A data set with a minimal number of missing features can be termed complete.
- Data Precision: Precision is the degree of detail displayed in a uniform space. More about precision: GIS Data: A Look at Accuracy, Precision, and Types of Errors
- Data Accuracy: This is the discrepancy between the actual attribute value and the coded attribute value.
- Data Consistency: This is the absence of conflicts within a particular database.
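As a rough illustration, three of these categories (completeness, accuracy and consistency) can be expressed as simple measures. The sketch below is a minimal example rather than a standard method: the function names, the feature ids and the sample attribute values are all hypothetical, and it assumes a reference list of expected features is available for comparison.

```python
from collections import Counter

def completeness(captured_ids, expected_ids):
    """Share of expected features that are present in the data set (0.0 to 1.0)."""
    expected = set(expected_ids)
    if not expected:
        return 1.0
    return len(expected & set(captured_ids)) / len(expected)

def attribute_accuracy(coded, actual):
    """Share of features whose coded attribute value matches the actual value."""
    shared = coded.keys() & actual.keys()
    if not shared:
        return 0.0
    matches = sum(1 for fid in shared if coded[fid] == actual[fid])
    return matches / len(shared)

def conflicting_ids(feature_ids):
    """Feature ids that occur more than once, a simple consistency conflict."""
    counts = Counter(feature_ids)
    return sorted(fid for fid, n in counts.items() if n > 1)

# Hypothetical sample data: "b3" is missing and "b2" was captured twice.
expected_ids = ["b1", "b2", "b3", "b4"]
captured_ids = ["b1", "b2", "b2", "b4"]
actual_use = {"b1": "residential", "b2": "school", "b4": "retail"}
coded_use = {"b1": "residential", "b2": "school", "b4": "office"}

print(completeness(captured_ids, expected_ids))    # 0.75
print(attribute_accuracy(coded_use, actual_use))   # two of three values match
print(conflicting_ids(captured_ids))               # ['b2']
```

Precision is left out here because it depends on the level of detail of the capture itself, which a record-level comparison like this cannot measure.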
Assessment of Data Quality:
Data quality is assessed using different evaluation techniques by different users.
- The first level of assessment is performed by the data producer. It consists of a data quality check against the given data specifications.
- The second level of assessment is performed on the consumer side, where feedback is collected from the consumer and processed. The data is then analyzed and rectified on the basis of the processed feedback.
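A producer-side check against a data specification might look like the following sketch. The specification itself (`SPEC`, `ALLOWED_LAND_USE`), the attribute names and the sample records are hypothetical and only illustrate the idea of testing each feature against agreed rules.

```python
# Hypothetical data specification: required attributes and their expected types.
SPEC = {
    "name": str,
    "floors": int,
    "land_use": str,
}
ALLOWED_LAND_USE = {"residential", "commercial", "industrial"}

def check_against_spec(feature):
    """Return a list of human-readable problems for one feature record."""
    problems = []
    for field, expected_type in SPEC.items():
        if field not in feature or feature[field] is None:
            problems.append(f"missing attribute: {field}")
        elif not isinstance(feature[field], expected_type):
            problems.append(f"wrong type for {field}")
    land_use = feature.get("land_use")
    if isinstance(land_use, str) and land_use not in ALLOWED_LAND_USE:
        problems.append(f"unexpected land_use: {land_use}")
    return problems

# Hypothetical feature records as they might come out of a capture session.
features = [
    {"name": "Block A", "floors": 3, "land_use": "residential"},
    {"name": "Block B", "floors": "two", "land_use": "farm"},
    {"name": "Block C", "land_use": "commercial"},
]

report = {f["name"]: check_against_spec(f) for f in features}
for name, problems in report.items():
    print(name, problems or "OK")
```

The same report structure could also hold issues raised through consumer feedback, so that both levels of assessment feed one list of items to rectify.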
Sources of Spatial Data Discrepancy:
- Data Information Exchange:
Data information exchange refers to the information about the data that the client provides to the organization. The degree of information provided by the client determines the accuracy and completeness of the data.
- Type and Source:
Data type and source must be evaluated in order to obtain appropriate data values. There are many spatial data formats, and each has its benefits as well as its drawbacks. For example, in order to use CAD data on a GIS platform, the data must be evaluated and any problems rectified; otherwise the resulting values will show a high degree of discrepancy.
Conventional data formats are quite specific to their data storage technique and functional compatibilities. For example, topology cannot be created on shapefiles. It can be created only on the latest geospatial storage format-
So, data type and source must be identified and evaluated before proceeding with any analysis.
- Data Capture:
Many tools rely on manual skill to capture data using software such as ArcGIS. Such software allows the user to capture information from the base data. During capture, the user may misinterpret features in the base data and capture them with errors. For example, a user may misinterpret two buildings as a single building and capture them as a single feature, although in the real world there are two. Features in the base data must therefore be interpreted correctly. Many tools enable the user to find and fix such errors, but they are not used frequently because of a lack of awareness. Data capture must be performed at a scale at which the features can be viewed distinctly.
- Cartographic Effects:
After the data is captured, cartographic effects such as symbology, pattern, color, orientation and size are assigned to the features. This is required for a better representation of reality. These effects must be assigned according to the domain of the features. For a forestry application, for example, forestry-specific cartographic elements must be used. Using elements from another domain degrades the output.
- Data Transfer:
Some discrepancies may occur while transferring data from one place to another, for example when data is transferred from a web source to a standalone machine that is disconnected from the web. Sometimes, in order to make accurate data more accurate, a user applies advanced rectification techniques, and as a result the data becomes highly degraded instead. “There is no bad or good data. There are only data which are suitable for a specific purpose.” So, data must be evaluated according to the domain in which it is to be used.
Sometimes metadata is not updated to reflect changes to the original features. For example, a few features are edited on some software platform, but the information about the edit, such as the name of the editor, the reason for editing and other relevant details, is not recorded. So, metadata must be updated along with the original data.
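One way to keep metadata in step with edits is to record the edit information in the same operation that changes the feature. The sketch below assumes a feature is a plain dictionary. The `apply_edit` helper and the `edit_history` field are hypothetical names chosen for illustration, not part of any GIS product.

```python
from datetime import datetime, timezone

def apply_edit(feature, changes, editor, reason):
    """Apply attribute changes and record who edited the feature and why."""
    updated = dict(feature)          # leave the original record untouched
    updated.update(changes)
    history = list(feature.get("edit_history", []))
    history.append({
        "editor": editor,
        "reason": reason,
        "fields": sorted(changes),   # names of the attributes that changed
        "edited_at": datetime.now(timezone.utc).isoformat(),
    })
    updated["edit_history"] = history
    return updated

# Hypothetical road feature edited after a field survey.
road = {"id": 7, "name": "Main St", "lanes": 2}
edited = apply_edit(road, {"lanes": 4}, editor="editor_01", reason="road widened")
print(edited["lanes"], edited["edit_history"][-1]["reason"])
```

Because the history entry is written together with the attribute change, an edit cannot be saved without its editor and reason.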
Data Quality Improvement Techniques:
- Choice of relevant data from a relevant source.
- Establishing precision at the source itself.
- Data quality testing in each phase of data capture.
- Using automated software tools for spatial and non-spatial data validation.
- Assessment of how the data will be used and by whom.
- Determining the map elements like scale, visualization and feature orientation.
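To give a sense of what automated spatial validation can involve, the following sketch checks a single polygon ring for a few common capture errors. It is a simplified example under stated assumptions: the `validate_ring` function is hypothetical, coordinates are assumed to be longitude and latitude in degrees, and real validation tools cover many more rules.

```python
def validate_ring(ring, bounds=(-180.0, -90.0, 180.0, 90.0)):
    """Check one polygon ring given as a list of (x, y) tuples.

    Assumes longitude/latitude coordinates; pass other bounds if needed.
    """
    problems = []
    if len(ring) < 4:
        problems.append("fewer than 4 vertices")
    if ring and ring[0] != ring[-1]:
        problems.append("ring is not closed")
    min_x, min_y, max_x, max_y = bounds
    for x, y in ring:
        if not (min_x <= x <= max_x and min_y <= y <= max_y):
            problems.append(f"vertex out of bounds: ({x}, {y})")
    # Consecutive identical vertices usually indicate a digitizing slip.
    for a, b in zip(ring, ring[1:]):
        if a == b:
            problems.append(f"repeated vertex: {a}")
    return problems

# Hypothetical rings: one valid triangle and one with several errors.
good_ring = [(0, 0), (1, 0), (1, 1), (0, 0)]
bad_ring = [(0, 0), (200, 0), (200, 0), (0, 5)]

print(validate_ring(good_ring))   # []
for problem in validate_ring(bad_ring):
    print(problem)
```

Running checks like these in each phase of data capture flags errors while the base data is still at hand, which is cheaper than rectifying them after delivery.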
About the Author
Ravi Nishesh Srivastava is a member of Esri India.