Merging Spreadsheets with Python–Append
Merging multiple spreadsheets together is a common task to perform. Merging can be done two ways:
- Append — where spreadsheets are concatenated on top of each other

2. Join––where spreadsheet columns are joined horizontally based on the same values of your selected columns

In this tutorial we will be focusing on Appending. I will be writing another tutorial on Joining.
Disclaimer
I built a tool mergespreadsheets.com that appends files easily. Feel free to try it out if you need a quick append
Pre-requisite
If you do not know how to use the Terminal and Python, or how to read and write files using Python and Pandas, then read and go through this tutorial first: https://medium.com/@lovespreadsheets/intro-to-handling-spreadsheets-with-python-c6a35e0d8de8
Files
We are going to be using three files for this tutorial. You can download them by clicking on the links: MarketingAnalystNames.xlsx, SalesRepNames.xlsx, SeniorLeadershipNames.xlsx
Full Code
This is the entire code to append the three files
I. Read all three spreadsheets
Create a python file in the same folder as your three spreadsheets and name it append.py
First we are going to import our pandas library and give it an abbreviation of pd. The pd abbreviation is convention and technically you can use import pandas as is and replace everything pd in the code with pandas
import pandas as pd
We are going to read the first sheet without any extra parameters as we only have text data and the first line is the column name so we can read all three files with the pandas read_excel command without any parameters
marketing_analyst_names = pd.read_excel("MarketingAnalystNames.xlsx")sales_rep_names = pd.read_excel("SalesRepNames.xlsx")senior_leadership_names = pd.read_excel("SeniorLeadershipNames.xlsx")
Now we have three variables that contain pandas data frames containing all the values in the excel files in a row and column format
II. Merge all three data frames
Now we are going to merge all three data frames
- We are going to create a list that contains all three data frames. The order of the data frames in this list will determine how the files get appended
all_df_list = [marketing_analyst_names, sales_rep_names, senior_leadership_names]
2. The command to append all three data frames is a 1 line function. We will call the pd.concat function with our list, which will append all our data frames in that order, and assign the appended data frame to a variable appended_df
appended_df = pd.concat(all_df_list)
The append is done by matching up similar columns names. If the column names aren’t the same then the append will not stack the values on top of each other.
I encourage you guys to create dummy spreadsheets with different column names to see how the append is affected
III. Write to an Excel file
Now that we have our appended data frame, we can write it to an Excel file. We do this using the to_excel function on the appended data frame. We are also going to add the parameter index=False which will NOT output any row numbers. (play around with the parameters to see which one makes sense for you)
appended_df.to_excel("AllCompanyNames.xlsx", index=False)
Now that we are done with the code we need to open up our Terminal and go to the folder where we have all the three files and our append.py file saved
If you are lost with the terminal, please go through this tutorial first: https://medium.com/@lovespreadsheets/intro-to-handling-spreadsheets-with-python-c6a35e0d8de8
We are going to run this script from our terminal
$ python append.py
Voila! We should now have a file named AllCompanyNames.xlsx in the same folder as our three spreadsheets and our Python script containing all the three files appended!