Geocoding is a task we all must face at some point while dealing with spreadsheets. It refers to converting addresses to latitude and longitudes and vise versa.
It often causes frustrations because there are so many ways to go about it. Here we present one way to geocode a list of addresses in a spreadsheet by using the OpenCage geocoder.
If you do not know how to run Python scripts or how to read and write files using Python and Pandas, then go through this tutorial first
We are going to be using some real addresses for this tutorial. You can download the file: Addresses.xlsx
The file contains 10 addresses from all around the world. Our goal is to add two new columns to this file: latitude and longitude.
Fun Exercise: Try to identify what famous landmarks are at these addresses.
See the full code below to follow along
First, let’s walk through what we need from OpenCage
Set Up Your OpenCage Account and Get your API Key
Go to opencagedata.com and sign up for an account. You will be led to your dashboard.
Grab your API key from your dashboard. We will be using it in our script later. It’s a 32 character alphanumeric key that will look something like this: abcdefghijklmnopqrstuvwxyz123456
Install the OpenCage Python Library
Open up your command line and install the opencage python library using the following command
$ pip install opencage
Now we can walk through our code
Create the file for code
Open a text editor and create a file geocode.py. Save this in the same folder as the Addresses.xlsx file.
Import the opencage library to geocode and the pandas library to read and write spreadsheets
import pandas as pd
from opencage.geocoder import OpenCageGeocode
Set up your Geocoding Variables
Create a variable that holds your api key and use that to create a geocoding object.
key = "REPLACE_WITH_YOUR_API_KEY"
geocoder = OpenCageGeocode(key)
Read the File
Read the file and store all its values in a dataframe
addresses_df = pd.read_excel("Addresses.xlsx")
Read your addresses
Read your addresses into a list (array) by reading the Addresses column from the file and calling the values and tolist() methods on it. If your column was named something else, you would replace the “Addresses” in the  with that name instead.
addresses = addresses_df["Addresses"].values.tolist()
Convert Addresses to Latitudes and Longitudes
Now we convert the addresses into latitudes and longitudes
latitudes = 
longitudes = for address in addresses:
result = geocoder.geocode(address, no_annotations="1")
if result and len(result):
longitude = result["geometry"]["lng"]
latitude = result["geometry"]["lat"]
longitude = "N/A"
latitude = "N/A"
Let’s go through this code in more detail:
1. First we create empty lists to hold the latitude and longitude of the geocoded addresses
2. Then we loop through the addresses that we have read from our file
3. We obtain the geocoding result by geocoding the address. We specify no_annotations = “1” to only obtain 1 result
4. If the result exists then we set variables longitude and latitude to the values obtained from the result
5. If the result doesn’t exist, then we set variables longitude and latitude to “N/A”
6. We add these values to our latitudes and longitudes lists
Create columns for the Latitude and Longitudes in our dataframe
Now that we have all the addresses and their latitudes and longitudes, we are going to create new columns in our original dataframe and set their values to our latitude and longitude lists.
addresses_df["latitudes"] = latitudes
addresses_df["longitudes"] = longitudes
Write the dataframe to a file
Run the script
$ python geocode.py
Voila! You should now have a file named Address_Geocoded.xlsx with the addresses, their latitudes and longitudes in the same folder as your geocode.py file