Ignatian Pilgrims

\(^{1}\)Image credit: https://caminoignaciano.org/

Nov, 2022

Time-series forecasting

Background#

The Ignatian Way is a pilgrimage path that follows the route Ignatius of Loyola traveled in 1522 from Loyola to the city of Manresa. Starting in the Basque Country, it passes through La Rioja, Navarre, Aragon and Catalonia, ending in Manresa, close to Barcelona, where Ignatius was due to embark for the Holy Land.

The journey starts in Loyola, the hometown of Saint Ignatius, and covers some 660 km divided into 27 stages. On their first day, pilgrims arrive in the town I live in, and in the summer especially I occasionally spot some of them walking through with their bulky backpacks. They often seem to be foreigners, and I usually wonder which country they come from.

I decided to ask at the Tourist Information Office in Loyola, where they kindly provided me with the data after I offered to carry out this small study for them.

The data#

I was given the records with the number of pilgrims and their origins from 2016 to 2022.

Note

These records refer only to pilgrims who came into the tourist office. The sanctuary is the official starting point of this religious journey, and not all pilgrims stop by the tourist office.

            pax         from   region  type        notes
date                                                    
2016-12-31   63    Australia      NaN   NaN      Taldean
2016-12-31   54          USA      NaN   NaN      Taldean
2016-12-31   23        Italy      NaN   NaN      Taldean
2016-12-31   15  Philippines      NaN   NaN      Taldean
2016-12-31  144        Spain  Euskadi   NaN      Bakarka
...         ...          ...      ...   ...          ...
2022-09-01    2          USA      NaN  foot  25 ean hasi
2022-09-01    1       France      NaN  foot   27 an hasi
2022-10-31    0          NaN      NaN   NaN          NaN
2022-11-30    0          NaN      NaN   NaN          NaN
2022-12-31    0          NaN      NaN   NaN          NaN

[311 rows x 5 columns]

Data Validation#

In the type column, some records contain the value info, meaning the record does not refer to actual pilgrims but to people asking for information about the Ignatian Way.

I therefore leave out these info entries before proceeding with the exploratory analysis.
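
As a minimal sketch of that filtering step, assuming the records sit in a pandas DataFrame with the columns shown above. The toy rows and the names `records` and `pilgrims` are invented for illustration and are not taken from the original notebook:

```python
import pandas as pd

# Toy records mimicking the columns shown above (all values are invented).
records = pd.DataFrame(
    {
        "pax": [2, 1, 4, 3],
        "from": ["USA", "France", "Spain", "Italy"],
        "type": ["foot", "info", "foot", None],
    },
    index=pd.to_datetime(["2022-09-01", "2022-09-01", "2022-09-02", "2022-09-03"]),
)

# Keep every row whose type is not "info"; rows with a missing type are kept.
pilgrims = records[records["type"].ne("info")]

print(pilgrims)
```

Using `ne("info")` rather than selecting a list of accepted values keeps the rows where type was simply not recorded, which matters here because many records have no type at all.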

Exploratory Data Analysis#

Absolute numbers per year#

../_images/97af045129d7ecc444339106693ecf14cfba2eeec4613ab49cfe3d867b035823.png

Average pilgrims per month#

To calculate average numbers per month, I skipped 2020, the pandemic year, to avoid distorting the results. I also left out 2016, 2017 and 2018, because the data for those years came aggregated and was not broken down by month.
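
One way such a monthly average could be computed is sketched below. The series values are invented, and the two-step grouping (monthly totals per year, then the mean of those totals per calendar month) is an assumption about the method, not necessarily what produced the chart:

```python
import pandas as pd

# Invented records; only the pax counts and their dates matter here.
pax = pd.Series(
    [2, 4, 10, 6, 8, 3],
    index=pd.to_datetime(
        ["2019-07-03", "2019-07-20", "2020-07-05",
         "2021-07-10", "2022-07-01", "2022-08-15"]
    ),
)

# Years with usable month-level data (2020 and the aggregated 2016-2018 are left out).
years_kept = [2019, 2021, 2022]
kept = pax[pax.index.year.isin(years_kept)]

# Total per (year, month), then the mean of those totals for each calendar month.
monthly_totals = kept.groupby([kept.index.year, kept.index.month]).sum()
monthly_average = monthly_totals.groupby(level=1).mean()

print(monthly_average)
```

Note that in this sketch a month with no rows in a given year is simply absent from the mean rather than counted as zero; the real records avoid that by including explicit zero rows for empty months.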

../_images/8aed34dd92558cd2f111d2fe3dd5331277ebda9e0833727970ed67abe81434f0.png

International proportion#

Note

For the rest of the study only records from 2019 onwards will be considered.

../_images/1ae3f103b4c0249eb2a2bdf70becd958ccbe8d6632a7b83c99a4027cd4e04f59.png

Group types#

This is one piece of information that can be extracted from the data: How do pilgrims travel? Alone? With a partner? In small groups? Large ones?

As an arbitrary choice, I have considered small groups to be those of 3 to 6 people, 6 included.
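
The bucketing can be sketched as a small function. The 3 to 6 range for small groups is the one stated above; the label names, the function name `group_type`, and the assumption that each record's pax value is the size of one travelling party are illustrative:

```python
def group_type(pax: int) -> str:
    """Classify a party of pilgrims by its size (labels are illustrative)."""
    if pax < 1:
        raise ValueError("a party needs at least one pilgrim")
    if pax == 1:
        return "alone"
    if pax == 2:
        return "couple"
    if pax <= 6:  # 3 up to 6, with 6 included
        return "small group"
    return "large group"


# Party sizes taken from a few of the sample rows shown earlier.
for size in (1, 2, 6, 7, 63):
    print(size, group_type(size))
```

With the data in a DataFrame, the same function could be applied to the pax column, for instance with `Series.map`.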

../_images/14577a556a6e7b27e641023ac582378ba170391d43e8acea728f003cf7eab69c.png

It appears that the majority of pilgrims on the Ignatian Way come in large groups.

International numbers by group type#

../_images/c81d6dca88dddba35c80fa8c4b8d5071fe8416316d49c152a83d22c65a51ea22.png

Not surprisingly, pilgrims from neighbouring France come first in the list. More surprising was the relatively high number of people from the Czech Republic, the Philippines and Singapore. We learn from this stacked bar chart that these nationals came mainly in large groups, maybe organized by Jesuit communities.

I select some significant countries and put them on a point plot for a clearer comparison.

../_images/554d84f97522f25a7c807fc9cce76a1b598b4af9f26b3c07fb358030b4921084.png

Spanish numbers by group type#

For Spanish pilgrims, there is some more specific information about their origins in the field called region.

../_images/08ce9f5f4e58083b8cdd8f0edcbc71d355b050864246a8baba24ef954fd83584.png

Pilgrims from the Basque Country come first, followed by Catalans and people from Madrid. However, we can see some inconsistency in the way the data was recorded, since region, province and town names are all mixed together in this region field.

Forecast#

Build monthly trend line#

As 2020 (the Covid-19 pandemic year) was not a normal one, I am not taking it into account for prediction purposes. However, the 2019 data is still useful information, so to build the trend line and later proceed with the forecast, I have shifted it forward one year to stand in for 2020.

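import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# `ignatian` (the records DataFrame, indexed by date) and `palette_ign`
# (the color palette) are assumed to be defined earlier in the notebook.

# 2019-----------------------------------
# Get pax numbers from 2019 (copied, so the index can be changed safely)
ignatian_pax_19_as_20 = ignatian.loc["2019"]["pax"].copy()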

# Add an offset of 1 year to 2019 index
ignatian_pax_19_as_20.index = ignatian_pax_19_as_20.index + pd.DateOffset(years=1)

# Group by months and add pax numbers
ignatian_pax_19_as_20_monthly = ignatian_pax_19_as_20.groupby(ignatian_pax_19_as_20.index.month).sum()

# Create an index with the twelve months and assign it
twelve_months_20 = pd.date_range(start="2020-01-01", freq="M", periods=12)
ignatian_pax_19_as_20_monthly.index = twelve_months_20

# 2021-----------------------------------
# Get pax numbers from 2021
ignatian_pax_21 = ignatian.loc["2021"]["pax"]
ignatian_pax_21_monthly = ignatian_pax_21.groupby(ignatian_pax_21.index.month).sum()
twelve_months_21 = pd.date_range(start="2021-01-01", freq="M", periods=12)
ignatian_pax_21_monthly.index = twelve_months_21

# 2022-----------------------------------
ignatian_pax_22 = ignatian.loc["2022"]["pax"]
ignatian_pax_22_monthly = ignatian_pax_22.groupby(ignatian_pax_22.index.month).sum()
twelve_months_22 = pd.date_range(start="2022-01-01", freq="M", periods=12)
ignatian_pax_22_monthly.index = twelve_months_22


# Build the time series
ignatian_trend = pd.concat([ignatian_pax_19_as_20_monthly,
                            ignatian_pax_21_monthly,
                            ignatian_pax_22_monthly])

# Plot
fig, ax = plt.subplots(figsize=(9, 5))

ignatian_trend.plot(ax=ax, marker="o", color=palette_ign[3])

ax.grid(axis="y")
ax.set_axisbelow(True)
ax.tick_params(axis='x', labelsize=14, rotation=0)
ax.tick_params(axis='y', labelsize=14)
ax.set_title("Ignatian Way from Loyola: Trend line", size=15)
ax.set_xlabel("")
ax.set_ylabel("PAX", size=14)
ax.axvspan("2020-01-01", "2020-12-31", facecolor='grey', alpha=0.15)
ax.annotate("2019 numbers\nshifted to 2020", xy=(pd.Timestamp("2020-03-15"), 85),fontsize=12)
ax.set_ylim(0, 120)
sns.despine()

plt.show()

../_images/a116d958e19cecadc0989b6a59586d89d6a5674e11f8110f5b38f7219486b154.png

Forecasting#

It looks like the time series is stationary (has no statistically significant trend): that means the number of pilgrims over the last few years has not changed substantially and there is no apparent trend either upwards or downwards.

As expected, we have a 12-month seasonality in the data.

The diagnostics of the selected model show that it had very little data to train on, but the errors it made suggest that the model is acceptable.
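
One simple way to check for yearly seasonality is the sample autocorrelation at lag 12. The sketch below uses an invented monthly profile repeated over three years, not the real records, and a hand-written `autocorrelation` helper (a hypothetical name) rather than any particular library routine:

```python
def autocorrelation(values, lag):
    """Sample autocorrelation of a sequence at the given lag."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    variance_sum = sum(d * d for d in deviations)
    covariance_sum = sum(
        deviations[t] * deviations[t - lag] for t in range(lag, n)
    )
    return covariance_sum / variance_sum


# Invented monthly profile (January to December), repeated for three years.
one_year = [0, 0, 5, 30, 45, 45, 90, 60, 10, 0, 0, 0]
series = one_year * 3

lag_6 = autocorrelation(series, 6)    # half a year apart: peaks face troughs
lag_12 = autocorrelation(series, 12)  # one year apart: peaks face peaks

print(f"lag 6: {lag_6:.2f}, lag 12: {lag_12:.2f}")
```

For a series with a summer peak like this one, the lag-12 value comes out clearly positive while the lag-6 value is negative, which is the signature of a 12-month cycle.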

../_images/1d7afde7efc2ed94b7147731f9781989cbe27dde80b73b7772e6a1224c1bc94c.png

These are the predicted figures for each month in 2023:

           predicted_mean
2023-01-31              5
2023-02-28              0
2023-03-31              8
2023-04-30             35
2023-05-31             46
2023-06-30             45
2023-07-31             91
2023-08-31             60
2023-09-30             11
2023-10-31             -0
2023-11-30             -0
2023-12-31             -0
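
The -0 entries are most likely small negative forecasts rounded to the nearest integer. Since a pilgrim count cannot be negative, one option is to clip the forecast at zero before rounding. A minimal sketch, with invented raw values and hypothetical names (`raw_forecast`, `pilgrim_forecast`):

```python
import pandas as pd

# Invented raw forecast values, before rounding.
raw_forecast = pd.Series(
    [4.6, 0.2, -0.3, -0.1],
    index=pd.to_datetime(["2023-01-31", "2023-02-28", "2023-10-31", "2023-11-30"]),
    name="predicted_mean",
)

# Floor at zero, round to whole pilgrims, and store as integers.
pilgrim_forecast = raw_forecast.clip(lower=0).round().astype(int)

print(pilgrim_forecast)
```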

Conclusions#

In this small data analysis and prediction project, I shared some insights into the actual figures for the Ignatian Way provided by the Tourist Information Office in Loyola. As mentioned, this data could probably be supplemented with records from other sources, since not all pilgrims seem to visit the Tourist Office when setting off. More data would be welcome, because what is available looks too scarce for a full-fledged analysis.

Additionally, if more features about the pilgrims were provided (such as age, gender or motives), it would be possible to carry out a pilgrim-type segmentation analysis.