Google’s PlaNet: Geolocating Photos Using Artificial Intelligence

A recent project at Google, in collaboration with the technologically focused Rheinisch-Westfälische Technische Hochschule (RWTH) Aachen University in Germany, has developed an artificial intelligence system capable of identifying photo locations more accurately and consistently than a human. The system, PlaNet, draws upon a large database of geotagged images and looks for recognizable clues in a photo's surroundings to estimate the probability that the photo was taken in any given area. These clues range from recognizable landmarks to landscapes, flora, fauna, and even architectural styles. In this way, the system operates much like a person trying to identify the location of a photo. In fact, PlaNet used just these kinds of cues when it competed against ten people in games of GeoGuessr. The game showed both the human players and PlaNet the same ten panoramic Street View shots, and each marked a map with a guess. The human subjects reported relying on street signs, vegetation, and other cues that PlaNet has also been built to consider. While PlaNet may miss particular clues, it has "seen" countless more places than any person could ever visit. PlaNet performed far better than the human subjects, but it still struggles with some rural landscapes, confusing Alaska with Scotland or Iceland, for example. Another incorrect guess included in the paper is a beach in the Virgin Islands that PlaNet placed in the Seychelles.

The examples on the left are the query photos; in response, PlaNet outputs a probability distribution over the map. In these three examples, the Eiffel Tower (a) is confidently assigned to Paris, while the model believes the fjord photo (b) could have been taken in either New Zealand or Norway. For the beach photo (c), PlaNet assigns the highest probability to southern California (correct), but some probability mass is also assigned to places with similar beaches, such as Mexico and the Mediterranean. For visualization purposes, the authors use a model with a much lower spatial resolution than the full model. Source: Weyand, Kostrikov, & Philbin, 2016.
In some ways, PlaNet operates like previous attempts at this sort of technology. For example, it uses a memory architecture that assumes photos taken around the same time, in the same album, were taken close together; if one photo contains geographic clues, the other photos in the album are assumed to have been taken in the place those clues suggest. In other ways, the system is unique. A major reason it performs better than its predecessors is that it divides the Earth into cells and assigns a probability that a given image was taken in each of those many cells. This use of varying certainty, rather than a single best guess, reveals more about how the system is working and what it has learned from the image. PlaNet is a breakthrough in AI that may prove useful in ways not yet imagined.
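The cell-based approach above can be sketched in a few lines of Python. This is a simplified illustration, not PlaNet's actual implementation: it uses a fixed-size latitude/longitude grid (the paper describes an adaptive partitioning with finer cells where photos are dense), and the per-cell scores are hypothetical stand-ins for the outputs of PlaNet's convolutional network. The point is the core idea: geolocation becomes a classification problem, and a softmax turns raw cell scores into a probability distribution over the map.

```python
import math

def softmax(logits):
    """Convert raw per-cell scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cell_index(lat, lon, cell_deg=10.0):
    """Map a coordinate to a fixed 10-degree lat/lon grid cell.
    (PlaNet itself uses an adaptive, non-uniform partition.)"""
    row = int((lat + 90) // cell_deg)
    col = int((lon + 180) // cell_deg)
    cols = int(360 // cell_deg)
    return row * cols + col

# Hypothetical scores for three cells, standing in for the
# network's output when shown a photo of the Eiffel Tower.
scores = {
    cell_index(48.86, 2.35): 6.0,    # cell containing Paris
    cell_index(40.71, -74.0): 2.0,   # cell containing New York
    cell_index(35.68, 139.7): 1.0,   # cell containing Tokyo
}

cells = list(scores)
probs = softmax([scores[c] for c in cells])
best = cells[probs.index(max(probs))]

print(dict(zip(cells, [round(p, 3) for p in probs])))
print("most likely cell:", best)
```

Because the output is a distribution rather than a single point, ambiguous photos (like the fjord example above) simply spread their probability mass across several plausible cells instead of forcing one arbitrary answer.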

More:

Weyand, T., Kostrikov, I., & Philbin, J. (2016). PlaNet – Photo geolocation with convolutional neural networks. arXiv preprint arXiv:1602.05314.
