Common geospatial data formats

Ways to access data

Before looking for spatial data, it helps to have some basic knowledge of the main spatial data formats. If you are completely new to spatial data, first read the introductory article on how geospatial information is represented as geospatial data.

In addition to these five types of data, you also need to distinguish between local data and remote (service-based) data. Local data is stored on your own computer or on a drive connected to it, such as a network drive or OneDrive, while remote data is typically managed by someone else and accessed over an internet connection when you need it.

Pictures of maps

Pictures of maps are something you typically find in a report (PDF) or on the internet. There are three reasons to avoid this type of data if possible:

  1. They are typically not meant to be used outside their original context, so you can easily end up breaching someone’s copyright. You can find a general description of copyright and licensing practices here.
  2. Pictures of maps often lack coordinate information, especially if they come from a PDF or a web page. You therefore need to add coordinate information through a process called “georeferencing”.
  3. Pictures of maps can generally only be used as background maps for other data. A picture is a visualisation of the data, not the data itself, so you can’t change the way it looks or use it in numerical analyses without creating your own data based on it. See the article on digitalisation.
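
To make the idea of georeferencing a little more concrete, the sketch below shows the kind of transformation a georeferenced picture ends up with: six numbers that turn a pixel position into a map coordinate, in the order used by a world file. The function name `pixel_to_map` and all parameter values are assumptions made for illustration. In practice a GIS program estimates these numbers from control points that you place on the picture.

```python
def pixel_to_map(col, row, a, d, b, e, c, f):
    """Convert a pixel position (col, row) to map coordinates (x, y).

    The six parameters follow the order of the lines in a world file:
    a = pixel width, d and b = rotation terms, e = pixel height
    (negative for north-up images), c and f = map coordinates of the
    centre of the upper-left pixel.
    """
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y


# Hypothetical north-up scan: 10 m pixels, upper-left pixel centre at
# x = 500000, y = 6200000 in some projected coordinate system.
params = dict(a=10.0, d=0.0, b=0.0, e=-10.0, c=500000.0, f=6200000.0)

print(pixel_to_map(0, 0, **params))  # upper-left pixel
print(pixel_to_map(2, 3, **params))  # two columns right, three rows down
```

Because the pixel height is negative, moving down the picture (increasing row) moves south on the map, which is the usual arrangement for north-up images.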

Indirect spatial data

Indirect spatial data is data that does not contain spatial coordinates but instead an identifier (key) to a well-defined spatial location. There are two common types of indirect spatial data: data that uses addresses as the key, and data that uses the name of an administrative or statistical unit, e.g. a country, region, municipality or census area.
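
As a rough illustration of how such a key is used, the sketch below joins a small table of statistics to a lookup of locations through a shared municipality code. The codes, coordinates, population figures and the function name `join_by_key` are all made up for the example. A real workflow would normally join against polygons or address points in a GIS or a spatial library instead of plain dictionaries.

```python
def join_by_key(records, locations, key):
    """Attach a (lon, lat) location to each record that has a matching key.

    Returns two lists: records that were matched (with lon/lat added)
    and records whose key was not found in the lookup.
    """
    matched, unmatched = [], []
    for record in records:
        location = locations.get(record[key])
        if location is None:
            unmatched.append(record)
        else:
            lon, lat = location
            matched.append({**record, "lon": lon, "lat": lat})
    return matched, unmatched


# Hypothetical lookup: municipality code -> representative point (lon, lat).
municipality_points = {"M01": (10.20, 56.15), "M02": (12.57, 55.68)}

# Hypothetical indirect spatial data: no coordinates, only a code.
population = [
    {"code": "M01", "population": 1200},
    {"code": "M02", "population": 3400},
    {"code": "M99", "population": 50},  # code missing from the lookup
]

matched, unmatched = join_by_key(population, municipality_points, "code")
print(matched)
print(unmatched)
```

Keeping the unmatched records in a separate list makes it easy to spot misspelled or outdated keys, which is a common problem with this type of data.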

Exactly which choices you are presented with depends on the software running the metadata server and on which ways of access are available for the data you have located. You typically have two main options: download the data or access it as a service. Both options often support several standard data formats.

If you wish to access the data as a service, i.e. let the data stay at its original location and access it online, the most common standards are WMS and WFS. WMS is typically used for ready-symbolized data such as digital versions of paper maps or images, while WFS is more suitable if you want to do your own symbolization or run analyses. You can read more about the different standards for data services in the post “Data and service standards for geospatial data”.
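
The difference between the two standards shows up in the requests themselves: a WMS request asks for a rendered image of a given size, while a WFS request asks for the features. The sketch below only builds the request URLs using the standard key-value parameters of WMS 1.3.0 (GetMap) and WFS 2.0.0 (GetFeature). It does not contact any server. The endpoint `https://example.com/ows` and the layer name `demo:municipalities` are placeholders, and the helper function names are assumptions for this example.

```python
from urllib.parse import urlencode


def wms_getmap_url(base_url, layer, bbox, crs, width, height,
                   image_format="image/png"):
    """Build a WMS 1.3.0 GetMap URL that asks for a rendered map image."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value = the server's default style
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    # Keep commas, colons and slashes readable in the query string.
    return base_url + "?" + urlencode(params, safe=",:/")


def wfs_getfeature_url(base_url, type_name, bbox=None, count=None):
    """Build a WFS 2.0.0 GetFeature URL that asks for the features."""
    params = {
        "SERVICE": "WFS",
        "VERSION": "2.0.0",
        "REQUEST": "GetFeature",
        "TYPENAMES": type_name,
    }
    if bbox is not None:
        params["BBOX"] = ",".join(str(v) for v in bbox)
    if count is not None:
        params["COUNT"] = count  # limit the number of returned features
    return base_url + "?" + urlencode(params, safe=",:/")


# Placeholder endpoint and layer name, for illustration only.
wms_url = wms_getmap_url("https://example.com/ows", "demo:municipalities",
                         (0, 0, 1000, 1000), "EPSG:3857", 256, 256)
wfs_url = wfs_getfeature_url("https://example.com/ows", "demo:municipalities",
                             bbox=(0, 0, 1000, 1000), count=10)
print(wms_url)
print(wfs_url)
```

Note that the axis order expected in `BBOX` depends on the coordinate reference system and the service version, so check the documentation of the service you are using. Most GIS software builds these requests for you once you supply the service address.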

If you wish to download the data to your local computer, there is a host of formats to choose from. The most common download formats are GML and GeoJSON, although others such as shapefiles (“shp” files) and GeoPackages are also often seen. Again, you can read more about the different standards in the post “Data and service standards for geospatial data”.
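
To give an impression of what a downloaded file can look like, the sketch below reads a tiny GeoJSON document with Python's standard `json` module and lists each feature's name, geometry type and coordinates. The two features and the function name `summarise_features` are invented for the example. For real work a GIS program or a dedicated library is usually more convenient.

```python
import json

# A minimal, made-up GeoJSON FeatureCollection with two point features.
# GeoJSON stores point coordinates as [longitude, latitude].
geojson_text = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "properties": {"name": "Site A"},
     "geometry": {"type": "Point", "coordinates": [10.2, 56.15]}},
    {"type": "Feature",
     "properties": {"name": "Site B"},
     "geometry": {"type": "Point", "coordinates": [12.57, 55.68]}}
  ]
}
"""


def summarise_features(text):
    """Return (name, geometry type, coordinates) for each point feature."""
    collection = json.loads(text)
    summary = []
    for feature in collection["features"]:
        geometry = feature["geometry"]
        summary.append((
            feature["properties"].get("name"),
            geometry["type"],
            tuple(geometry["coordinates"]),
        ))
    return summary


for name, geometry_type, coordinates in summarise_features(geojson_text):
    print(name, geometry_type, coordinates)
```

Unlike a picture of a map, each feature here carries both its coordinates and its attributes, so the data can be restyled or used directly in an analysis.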

The choice of downloading the data vs. accessing it as a service is not trivial. The obvious advantages of downloading data are:

  1. You don’t have to be online.
  2. The data won’t disappear.
  3. Analysis is typically faster.

The advantages of accessing data as a service are:

  1. The dataset might be huge and not fit onto your computer or you might only need a limited part of the data.
  2. You will always be using the current version, which matters especially for administrative data, as it is typically updated frequently.
  3. You don’t have to deal with storing the data on your computer.

You can find a description of how to load data into a selection of software systems in the post “loading geospatial data”.