Where is the Phrase “80% of Data is Geographic” From?

Filed in GIS Data on October 28, 2012


The phrase “80% of data is geographic” is one of those commonly cited facts that those who work with GIS data know well.  The phrase is almost always presented without any reference to its origin, and it is repeated over and over again.  It is frequently cited to underscore the vast untapped potential of GIS data.  Embracing it as the perfect angle for convincing agencies and companies to adopt geospatial technologies, GIS powerhouses like Esri have cited it in various forms on their pages: 80% of the world’s data includes some kind of spatial aspect, 80% of data has a location component, 80% of data possesses a geographic reference, 80% of transactional data has a location component.

The Urban Legend of GIS Data?

So where did this phrase originate, and what evidence is it based on?  A recent post on Spatially Adjusted about the phrase made me want to dig deeper.  Answering that question isn’t simple given the folkloric status the phrase has reached.  Several possible origins have been offered.  Some point to the article “An introduction to GIS: linking maps to databases,” authored by Carl Franklin and Paula Hane (1), as the originating source (examples: gis.stackexchange.com and Spatial Sustain).  That article took a look at the emerging field of what the authors called Geographic Information Management (GIM) and the “impact of computerization of maps on access to business and government information that may be geographically referenced.”

That article in turn referenced a 1990 report from the Ohio Geographically Referenced Information Program (OGRIP) that I have not been able to locate.  However, the phrase “80% of data collected, stored, and maintained by local governments includes some reference to geography” has been used repeatedly in reports issued by the agency since.  Draft white papers from 2004 and 2011 on the “Ohio Location Based Response System” both repeat that phrase.  Since neither paper cites any of the “studies” referenced, the trail dies there.

However, the earliest source I could trace is an article written in 1987.  In “Analytic Mapping and Geographic Databases”, Issue 87, published in 1992 and edited by Robert S. Biggs and G. David Garson, the authors make the statement, “Computer mapping is particularly important in government, and hence is salient to social scientists who study government policies.  It is estimated that 80% of the informational needs of local government policymakers are related to geographic location.”

Biggs and Garson cite an article written by Robert E. Williams in 1987 entitled “Selling a geographical information system to government policy makers.”  At the time of the publication, Williams was the Director of the Alachua County Regional Information Center.  The article was published in “Papers from the 1987 Annual Conference of the Urban and Regional Information Systems Association” by URISA.

While a copy of the article is not available online, the abstract of the article is available from Esri’s online bibliography:

One of the most important elements of a GIS installation is obtaining approval and support of the policy makers who will fund the project. Because of the encompassing nature of a GIS, there is no one or major user of the system; therefore, it is imperative that a champion for the project is identified. (A champion is used here to refer to the individual or department that will take charge of the project and be responsible for all aspects of this implementation.) That champion must then gain the support of all users and sell the concept of the system to the policy makers. A case study of the selling of a GIS called GEOMAX at Alachua County will be used to show how a comprehensive GIS was effectively sold to three separate policy making bodies through an effective realtime demonstration. The demonstration was tailored to meeting the concerns of the policy makers and not to the technical features of the system.

I was able to procure a scanned copy of the article from Wendy Nelson, the Executive Director of URISA. The exact statement appears on page 151 of the publication and states in the second paragraph:

Automated mapping is probably an easier sell because, again, the policymakers are cognizant of the need for improved mapping capabilities. It has been estimated that approximately 80% of the informational needs of a local government policymaker is related to a geographical location. This information is usually supplied by a map rendering, e.g., maps showing the location of a parcel of land being considered for a rezoning petition.

The article by Williams lists no sources and gives no indication of where the number comes from.  However, a little digging into GEOMAX reveals that the program was developed in 1985 by two academics at the University of Florida.  John Alexander, a professor of urban and regional planning, and Paul Zwick, a research scientist, were behind the effort to digitize maps at Alachua County, so perhaps the knowledge of where the phrase originates lies with them?

Testing if 80% of Data is Geographic

A team of German researchers has produced a couple of articles that attempt to test this statement.  In a paper presented at Agile 2011, Stefan Hahmann and Dirk Burghardt, a professor of Cartographic Communication at the Dresden University of Technology (Technische Universität Dresden) in Germany, along with Beatrix Weber, presented a research framework for testing the validity of the phrase.  The paper is entitled “‘80% of All Information is Geospatially Referenced’??? Towards a Research Framework: Using the Semantic Web for (In)Validating this Famous Geo Assertion.”  It noted that while the phrase is referenced repeatedly, even in academic articles, no paper has provided any methodology to demonstrate this statement.

In a 2012 article in press with the International Journal of Geographical Information Science, “How much information is geospatially referenced? Networks and cognition,” the researchers attempted to test the claim that 80% of information has a spatial component by looking at German Wikipedia articles.  The article took two approaches to testing the statement scientifically.  The first approach was to look at German Wikipedia articles as a network, with articles as nodes and links within the articles as “edges of a directed graph.”  The second approach was cognitive: articles were categorized as having a “direct geospatial reference”, “indirect geospatial reference” or “no geospatial reference.”  The network approach found that 78% of articles were either tagged with geographic coordinates or linked to an article with tagged coordinates.  The cognitive approach found that percentage to be closer to 60%, at 57%.  The article is in German but there is a summary in English near the beginning.
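
To make the network approach concrete, here is a minimal sketch in Python. It assumes that only direct (one-hop) links count, which is how the summary above reads; the researchers' actual method and data are not reproduced here. The function name `share_geo_referenced` and the toy link graph are hypothetical, invented purely for illustration.

```python
from typing import Dict, Set


def share_geo_referenced(links: Dict[str, Set[str]], geotagged: Set[str]) -> float:
    """Return the fraction of articles that either carry coordinates
    themselves or link directly to at least one article that does.

    links     -- directed graph: article title -> titles it links to
    geotagged -- titles of articles tagged with geographic coordinates
    """
    if not links:
        return 0.0
    referenced = 0
    for article, targets in links.items():
        # Count the article if it is tagged, or if any outgoing edge
        # points at a tagged article (set intersection is non-empty).
        if article in geotagged or targets & geotagged:
            referenced += 1
    return referenced / len(links)


# Hypothetical toy graph, not real Wikipedia data.
toy_links = {
    "Dresden": {"Elbe"},
    "Elbe": {"Dresden"},
    "Cartography": {"Map"},
    "Map": {"Dresden"},
    "Prime number": {"Cartography"},
}
toy_geotagged = {"Dresden", "Elbe"}

print(share_geo_referenced(toy_links, toy_geotagged))
```

In this toy graph, "Dresden" and "Elbe" count because they are tagged, and "Map" counts because it links to a tagged article, so three of the five articles are geospatially referenced under this definition. Whether indirect links of this kind should count at all is exactly the kind of choice that moves the final percentage.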

The end of Hahmann, Burghardt, and Weber’s article hints at the real value of the phrase, citing a quote on Twitter by John Fagan, Head of Software Engineering, Axon Active AG. Agile & Lean and formerly of Bing Maps and Multimap: “that geo quote keeps us all in our jobs. Best not go poking around to see if it’s true.”

GIS data layers. Source: FCDC



Comments (7)


  1. Rebecca Somers says:

    “70-80% of the information used in municipal management was tied to a geographic location” in Handling Geographic Information, Report to the Secretary of State for the Environment of the Committee of Enquiry into the Handling of Geographic Information. Chairman: Lord Chorley. 1987.

  2. Seyoum Melese says:

    Everything on and near the surface of our planet could be expressed according to its location and its logical derivatives. If we accept this premise, how dare one say that only 80% of data is locational? My view is that everything, at least at its geographical scale, could be expressed in terms of its location (time-space coordinates).

  3. Thanks for that, I had found the OGRIP citation once (and sent it onto @cageyjames then) but can no longer find it either (the beauty of the web, now you see it, now you don’t). I wonder if you’re trying to measure the immeasurable? And have you seen the Journal of Irreproducible Results http://www.jir.com

    Think of electric cars as being greener, until that claim had to be restated to include the fabrication of batteries and generation of electricity… IOW how long is a string (or the UK coastline)? So one may not be able to test the percentage in any tangible way, but if you pick 60% then what says it cannot be 80% because of some underlying data? Thanks for digging into this however and for the replies it generated!

  4. Arnie says:

    thanks for your “digging”, I have just written this fact into my thesis and was wondering whether it is true..

  5. Aparecido says:

    I think the phrase can also have a bit of Pareto Principle: for many events, roughly 80% of the effects come from 20% of causes.

  6. Jeff Essic says:

    I was once trying to submit a paper to a non-GIS journal, and the editor kept insisting that I find a citation for this statement. I never could convince her, or adequately explain to her, that it is just a common axiom, and I think I eventually just took it out.

  7. See: https://twitter.com/cedricmoullet/status/245368937930444800
    One possible sentence is “60-80% of decisions taken by citizen are related to geoinformation”. Coopers/Lybrand, 1996 http://catalogue.nla.gov.au/Record/97620
