TANO Project

VISUALIZING ITALIAN IMMIGRATION TO ARGENTINA IN THE EARLY 20th CENTURY.
The Project

This project aims to visualize the Italian migration flows to Argentina between 1882 and 1920, to describe the demographic characteristics of this immigration, and to support a new approach to understanding migratory processes in Argentina. Visualizing the records at this scale may reveal trends that are hard to see in the raw data. The project also proposes to bring the particular case of Italians who moved to Argentina into academic knowledge on migration, to better understand population movements.

International migration is a phenomenon of vital importance in the history of Argentina. After the May Revolution of 1810 and the period of civil wars that followed, the ruling class consolidated, toward the end of the nineteenth century, a nation-state that promoted immigration, preferably European, as a key factor in modernizing the country. The government implemented policies that encouraged the arrival of immigrants, and as a result, by 1914, a third of the Argentine population had foreign origins, with Italian and Spanish immigrants accounting for 39.4% and 35.1%, respectively, of the total immigrant population. This immigration decreased considerably after the 1930s, which made visible a non-European migration flow that was smaller in magnitude but constant over time.

To understand the impact that Italian immigration had on Argentine society, it is first necessary to examine in detail the demographic characteristics of this immigrant population. 

The question that guides this project is: What were the demographic characteristics of the Italian population that arrived in Argentina in the late nineteenth and early twentieth centuries regarding gender, age, occupation, religion, literacy and marital status?

To create the visualizations, I relied on more than a million records of Italian immigrants who arrived at the port of Buenos Aires between 1882 and 1920. The dataset also includes data that has never been visualized before.

Methodology
Data Source

This project relies on a dataset of about one million records of Italian passengers who arrived at the port of Buenos Aires between 1882 and 1920. The data was hosted at https://sites.google.com/site/barcosdeagnelli/Home.
The Center for Latin American Migration Studies (CEMLA) and Barcos Agnelli, a website founded by the Agnelli Foundation (an Italian nonprofit social science research institute), agreed on a project to index the lists of Italian passengers arriving at the port of Buenos Aires between 1882 and 1920, extracting data from the CEMLA database. The data included the year of arrival, the name of the immigrant, religion, marital status, gender, literacy, occupation, the port of departure, and the name of the ship.

Methods

The dataset was spread across 39 Microsoft Excel files, one file per year from 1882 to 1920. In total, it contained 1,021,843 records of Italians who arrived at the port of Buenos Aires during those years.

The first step was to translate the column names from Spanish to English and normalize them across the 39 files. The columns held the immigrants’ personal information (first and last name, age, gender, marital status, occupation, religion, port of departure, ship’s name, date of arrival, and literacy).
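The header normalization described above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used in the project: the Spanish header names in COLUMN_MAP and the English target names are assumptions, since the original column names are not listed here.

```python
import unicodedata

# Hypothetical Spanish headers mapped to English names.
# The real headers in the 39 Excel files are not given in this text.
COLUMN_MAP = {
    "apellido": "last_name",
    "nombre": "first_name",
    "edad": "age",
    "sexo": "gender",
    "estado civil": "marital_status",
    "profesion": "occupation",
    "religion": "religion",
    "puerto": "port_of_departure",
    "barco": "ship_name",
    "fecha de arribo": "arrival_date",
    "instruccion": "literacy",
}


def normalize_header(header):
    """Lowercase, strip accents, collapse whitespace, then map to English."""
    decomposed = unicodedata.normalize("NFKD", header)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    key = " ".join(no_accents.lower().split())
    # Unknown headers are kept, with spaces turned into underscores.
    return COLUMN_MAP.get(key, key.replace(" ", "_"))


def normalize_row(row):
    """Return a copy of one record (a dict) with normalized column names."""
    return {normalize_header(name): value for name, value in row.items()}
```

Applying normalize_row to every record of every yearly file would give all 39 files the same column names, whatever accents, capitalization, or spacing the original headers used.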

Once all the files shared a common structure, they were ready for the second step.
In this phase, I used Tableau Prep for data preparation. Tableau Prep is a tool that facilitates cleaning, aggregating, and merging data for later analysis in Tableau. It can modify values in the data: instead of applying a change to a single record, it automatically creates a rule for that change, which can then be applied to any other file.
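The idea of recording a correction once as a rule and reusing it on other files can be illustrated outside Tableau Prep. The sketch below is plain Python, not Tableau Prep's own mechanism, and the rule contents (column names, spelling variants, canonical values) are illustrative assumptions.

```python
# Each rule is (column, raw variants, canonical value).
# The variants listed here are illustrative, not taken from the dataset.
RULES = [
    ("marital_status", {"soltero", "soltera", "celibe", "nubile"}, "single"),
    ("marital_status", {"casado", "casada", "coniugato", "coniugata"}, "married"),
    ("gender", {"m", "masculino", "maschio"}, "male"),
    ("gender", {"f", "femenino", "femmina"}, "female"),
]


def build_lookup(rules):
    """Flatten the rules into a {(column, variant): canonical} dictionary."""
    lookup = {}
    for column, variants, canonical in rules:
        for variant in variants:
            lookup[(column, variant)] = canonical
    return lookup


def apply_rules(rows, lookup):
    """Return cleaned copies of the rows; values with no rule are left unchanged."""
    cleaned = []
    for row in rows:
        new_row = dict(row)
        for column, value in row.items():
            key = (column, str(value).strip().lower())
            if key in lookup:
                new_row[column] = lookup[key]
        cleaned.append(new_row)
    return cleaned
```

Because the rules live in one place, the same lookup can be applied to the records of any year, which is the property that makes rule-based cleaning practical across 39 files.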

While working with this tool, I found several problems with the data. It was full of typographical and spelling errors. The information was recorded in several languages (Spanish, Italian, and in some cases French), which made the translation work more complex. There were also values that made no sense, such as people recorded as 800 years old, and records with missing gender or marital status.
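Problems of this kind can be flagged automatically before deciding how to correct them. The following is a minimal sketch under stated assumptions: the field names follow the English column names used for illustration, and the age cutoff of 110 is an arbitrary choice, not a value taken from the project.

```python
MAX_PLAUSIBLE_AGE = 110  # assumed cutoff, not taken from the project

REQUIRED_FIELDS = ("gender", "marital_status")  # assumed column names


def find_problems(row):
    """Return a list of problem labels for one record (a dict of strings)."""
    problems = []

    raw_age = str(row.get("age", "")).strip()
    try:
        age = int(raw_age)
    except ValueError:
        problems.append("age_not_numeric")
    else:
        if age < 0 or age > MAX_PLAUSIBLE_AGE:
            problems.append("age_out_of_range")

    for field in REQUIRED_FIELDS:
        value = row.get(field)
        if value is None or not str(value).strip():
            problems.append("missing_" + field)

    return problems
```

Flagging rather than deleting keeps the decision about each questionable record (correct it, blank the field, or exclude it) separate from the detection step.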

In the third step, I used Google Sheets with the Google Translate API to speed up the translation and clean-up of the immigrants’ occupations.
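One way to speed up translating a column with many repeated values is to translate each distinct value only once and reuse the result. The sketch below shows that idea in Python. It does not call any Google service: translate is a hypothetical callable supplied by the caller, and the small glossary only stands in for a real translator.

```python
def translate_unique(values, translate):
    """Translate each distinct value once; return one result per input value."""
    cache = {}
    results = []
    for value in values:
        key = value.strip().lower()
        if key not in cache:
            # Empty cells are passed through without calling the translator.
            cache[key] = translate(key) if key else ""
        results.append(cache[key])
    return results


# Stand-in translator for demonstration only (hypothetical glossary).
GLOSSARY = {"jornalero": "day laborer", "contadino": "farmer", "agricultor": "farmer"}
calls = []


def fake_translate(text):
    calls.append(text)  # record how many times the translator is invoked
    return GLOSSARY.get(text, text)


occupations = ["Jornalero", "jornalero ", "contadino", ""]
translated = translate_unique(occupations, fake_translate)
```

With around a million records and a far smaller set of distinct occupation strings, caching in this way would keep the number of translation requests close to the number of distinct values rather than the number of rows.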

With these tools, I was able to clean the data, and finally I began creating the visualizations in Tableau Public.

Visualizations

Images

Blog