Traffic Fines#

img/traffic_fines.jpg

Jan, 2023

Open Data Choropleth maps

Background#

To make a data project, you first need data. One option is to collect it yourself, which turns the work into a quantification project from the outset. Another is to ask the people who hold the data for it directly. And sometimes the data can simply be downloaded from the internet as open data, which is what I did for this project.

I was curious about what the local government was publishing as open data. The datasets I found were not particularly exciting: most of them covered municipal finances and some demographics. But I came across data on traffic fines from the last few years, and I decided to give it a try.

With open data, institutions aim to become more transparent to the public. The data is shared and anyone can look at it, but making sense of it is usually not that simple: it takes a data project. Only once the numbers are converted into graphics and insights does the data come alive and show what is going on.

The data#

I downloaded the CSV files from the site below. They contain the traffic fines issued in my town from 2018 to 2022.

https://www.gipuzkoairekia.eus/

    year                street   category  fines    paid  unpaid
 2018                   NaN       LEVE      2    15.0    0.00
 2018  GELTOKIEN ENPARANTZA      GRAVE      3     0.0  234.95
 2018  GELTOKIEN ENPARANTZA       LEVE     52   640.8  279.15
 2018  GELTOKIEN ENPARANTZA  MUY GRAVE      2   500.0    0.00
 2018    EUSKADI ENPARANTZA      GRAVE      4   200.0  244.95
..   ...                   ...        ...    ...     ...     ...
2022              IBAIONDO      GRAVE      2   200.0    0.00
2022              IBAIONDO       LEVE     45   385.0  259.65
2022  ESKUALDEKO OSPITALEA      GRAVE     34  2100.0  949.80
2022  ESKUALDEKO OSPITALEA       LEVE     62   675.0  533.55
2022                 ANTIO       LEVE      1     0.0    0.00

[340 rows x 6 columns]

As the dataframe above shows, the fines are already aggregated by year and street, and split into three categories:

LEVE, for mild traffic offences.
GRAVE, for serious ones.
MUY GRAVE, for very serious ones.
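Since the fines come as several CSV files, they have to be stacked into one dataframe before any analysis. The sketch below shows one way this could be done; it is not the original notebook code. It assumes one file per year with no year column inside. Apart from the two “IMPORTE …” headers quoted later on this page, the Spanish header names (`CALLE`, `CALIFICACION`, `DENUNCIAS`) are hypothetical. Two small in-memory files stand in for the real downloads.

```python
import io

import pandas as pd

# Hypothetical header line: only the two "IMPORTE ..." names come from the source data
header = ("CALLE,CALIFICACION,DENUNCIAS,IMPORTE PAGADO,"
          "IMPORTE PENDIENTE DE PAGO EN EJECUTIVA\n")

# In-memory stand-ins for the downloaded files (rows copied from the listing above)
files = {
    2018: io.StringIO(header
                      + "GELTOKIEN ENPARANTZA,GRAVE,3,0.0,234.95\n"
                      + "GELTOKIEN ENPARANTZA,LEVE,52,640.8,279.15\n"),
    2022: io.StringIO(header
                      + "IBAIONDO,GRAVE,2,200.0,0.0\n"
                      + "IBAIONDO,LEVE,45,385.0,259.65\n"),
}

# Map the original headers to the short names used in this project
columns = {
    "CALLE": "street",
    "CALIFICACION": "category",
    "DENUNCIAS": "fines",
    "IMPORTE PAGADO": "paid",
    "IMPORTE PENDIENTE DE PAGO EN EJECUTIVA": "unpaid",
}

frames = []
for year, source in files.items():
    frame = pd.read_csv(source).rename(columns=columns)
    frame.insert(0, "year", year)  # tag every row with the year of its file
    frames.append(frame)

# Stack the yearly frames; each keeps its own 0-based index
fines = pd.concat(frames)
print(fines)
```

With real files, the values of `files` would be paths on disk instead of `io.StringIO` objects; `pd.read_csv` accepts both.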

Data validation#

Data inspection shows some missing values.

<class 'pandas.core.frame.DataFrame'>
Int64Index: 340 entries, 0 to 79
Data columns (total 6 columns):
 #   Column    Non-Null Count  Dtype  
---  ------    --------------  -----  
 0   year      340 non-null    int64  
 1   street    337 non-null    object 
 2   category  340 non-null    object 
 3   fines     340 non-null    int64  
 4   paid      340 non-null    float64
 5   unpaid    340 non-null    float64
dtypes: float64(2), int64(2), object(2)
memory usage: 18.6+ KB

Only 3 of the 340 rows have a missing street name, and they account for very few fines, so I simply drop them.

   year street   category  fines   paid  unpaid
2018    NaN       LEVE      2   15.0     0.0
2020    NaN  MUY GRAVE      2  120.0     0.0
2021    NaN  MUY GRAVE      1    0.0     0.0

<class 'pandas.core.frame.DataFrame'>
Int64Index: 337 entries, 1 to 79
Data columns (total 6 columns):
 #   Column    Non-Null Count  Dtype  
---  ------    --------------  -----  
 0   year      337 non-null    int64  
 1   street    337 non-null    object 
 2   category  337 non-null    object 
 3   fines     337 non-null    int64  
 4   paid      337 non-null    float64
 5   unpaid    337 non-null    float64
dtypes: float64(2), int64(2), object(2)
memory usage: 18.4+ KB

The data types are already correct, so no further cleaning is needed.
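The validation steps above can be sketched in pandas as follows. This is a minimal illustration on three rows copied from the first listing, not the notebook's actual cell. The share calculation is an extra check added here to show how little is lost by dropping the incomplete rows.

```python
import pandas as pd

# Three rows from the top of the dataset; the first one has no street name
fines = pd.DataFrame({
    "year": [2018, 2018, 2018],
    "street": [None, "GELTOKIEN ENPARANTZA", "GELTOKIEN ENPARANTZA"],
    "category": ["LEVE", "GRAVE", "LEVE"],
    "fines": [2, 3, 52],
    "paid": [15.0, 0.0, 640.8],
    "unpaid": [0.0, 234.95, 279.15],
})

# Rows without a street cannot be assigned to any place on the map
missing = fines[fines["street"].isna()]
print(missing)

# Fraction of all fines that sits in those rows
share_lost = missing["fines"].sum() / fines["fines"].sum()
print(f"Share of fines without a street: {share_lost:.1%}")

# Drop only the rows where the street is missing, then re-inspect
clean = fines.dropna(subset=["street"])
clean.info()
```

Passing `subset=["street"]` keeps the drop targeted: a missing value in any other column would not silently remove a row.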

Exploratory Data Analysis#

Fines by street#

QUESTION: Which streets have the greatest number of fines? Locate the hotspots.

category                        LEVE  GRAVE  MUY GRAVE  total
street                                                       
ZELAI-ARIZTI PARKEA            916.0    2.0        NaN  918.0
SEKUNDINO ESNAOLA KALEA        619.0   29.0        2.0  650.0
ESKUALDEKO OSPITALEA           422.0  165.0        NaN  587.0
GELTOKIEN ENPARANTZA           504.0   21.0        6.0  531.0
PIEDAD KALEA                   429.0   32.0        1.0  462.0
BIDEZAR KALEA                  352.0   25.0        NaN  377.0
KALEBARREN                     269.0   25.0        NaN  294.0
ELIZKALE                       202.0   49.0        3.0  254.0
SAN GREGORIO KALEA             169.0   21.0        3.0  193.0
LEGAZPI KALEA                  167.0    6.0        NaN  173.0
ELKANO KALEA                   143.0   22.0        1.0  166.0
EUSKADI ENPARANTZA             137.0   12.0        NaN  149.0
URDANETA HIRIBIDEA             122.0   21.0        1.0  144.0
JAI-ALAI KALEA                 123.0    6.0        NaN  129.0
ESTEBAN ORBEGOZO IBILALDIA     101.0   15.0        NaN  116.0
LETURIATARREN ENPARANTZA        80.0   29.0        1.0  110.0
IPARRAGIRRE HIRIBIDEA           35.0   59.0        3.0   97.0
SORALUCE KALEA                  85.0    NaN        NaN   85.0
OKENDO KALEA                    67.0   12.0        NaN   79.0
TXURRUKA KALEA                  70.0    4.0        1.0   75.0
IBAIONDO                        58.0    8.0        NaN   66.0
ARTIZ AUZOA                     57.0    6.0        1.0   64.0
EUSKALERRIA HIRIBIDEA           29.0   33.0        1.0   63.0
ANTONINO ORAA KALEA             42.0   14.0        NaN   56.0
UROLA KALEA                     28.0   16.0        NaN   44.0
BELOKI HIRIBIDEA                28.0   13.0        1.0   42.0
IÑIGO DE LOIOLA KALEA           25.0   11.0        NaN   36.0
BUSKA SAGASTIZABAL PARKEA       24.0    6.0        NaN   30.0
ETXEBERRI AUZOA                 25.0    3.0        1.0   29.0
ISLAS FILIPINAS KALEA            7.0   11.0        NaN   18.0
ANTIO                           12.0    3.0        NaN   15.0
IPAR HAIZEA AUZOA               12.0    2.0        NaN   14.0
IZAZPI AUZOA                     4.0    7.0        1.0   12.0
ANTZIÑE HIRIBIDEA                5.0    6.0        NaN   11.0
LEGAZPI AUZUNEA                  9.0    NaN        NaN    9.0
SAN ISIDRO KALEA                 5.0    4.0        NaN    9.0
NAFARROA ENPARANTZA              6.0    NaN        NaN    6.0
SAKABANATUAK                     1.0    5.0        NaN    6.0
JOXE MIEL BARANDIARAN AUZUNEA    NaN    4.0        1.0    5.0
URTUBI AUZOA                     4.0    1.0        NaN    5.0
HIRI LORATEGIA                   1.0    4.0        NaN    5.0
ELGARRESTAMENDI KALEA            NaN    4.0        NaN    4.0
ARGIXAO AUZOA                    3.0    1.0        NaN    4.0
ZUMARRAGAKO INDUSTRIALDEA        1.0    1.0        NaN    2.0
EITZA BERRI                      2.0    NaN        NaN    2.0
ANGEL CRUZ JAKA AUZUNEA          NaN    NaN        1.0    1.0

../_images/7105e68478ff1661c2a5e0ce463086de39eb1b56b7bed4caac10e97eb0a7f729.png
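A table like the one above can be produced with a pivot table. The following is a sketch under the assumption that the dataframe has the `street`, `category` and `fines` columns shown earlier; the rows are a small sample copied from the first listing, so the totals differ from the full table.

```python
import pandas as pd

# Sample rows copied from the dataset listing (2018 and 2022)
fines = pd.DataFrame({
    "street": ["GELTOKIEN ENPARANTZA", "GELTOKIEN ENPARANTZA",
               "GELTOKIEN ENPARANTZA", "EUSKADI ENPARANTZA",
               "IBAIONDO", "IBAIONDO",
               "ESKUALDEKO OSPITALEA", "ESKUALDEKO OSPITALEA", "ANTIO"],
    "category": ["GRAVE", "LEVE", "MUY GRAVE", "GRAVE",
                 "GRAVE", "LEVE", "GRAVE", "LEVE", "LEVE"],
    "fines": [3, 52, 2, 4, 2, 45, 34, 62, 1],
})

# One row per street, one column per category, summing over the years
by_street = fines.pivot_table(index="street", columns="category",
                              values="fines", aggfunc="sum")

# Order the categories by severity instead of alphabetically
by_street = by_street.reindex(columns=["LEVE", "GRAVE", "MUY GRAVE"])

# Row totals; NaN (no fines in that category) is skipped by sum()
by_street["total"] = by_street.sum(axis=1)

# Streets with the most fines first
by_street = by_street.sort_values("total", ascending=False)
print(by_street)
```

Combinations with no fines show up as NaN, which is also why the counts in the table above are displayed as floats.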

I decided to put this information into a choropleth. To do that, I first needed a map of the streets and neighbourhoods of the town. I drew each polygon by hand using https://geojson.io/.

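# Libraries used in this cell
import contextily
import geopandas as gpd
import matplotlib.pyplot as plt
import pandas as pd

# Read the hand-drawn polygon of each street into a geopandas dataframe
# and label it with the street name used in the fines data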
zelai = gpd.read_file("geojson/zelai_arizti_parkea.geojson")
zelai["street"] = "ZELAI-ARIZTI PARKEA"
sekundino = gpd.read_file("geojson/sekundino_esnaola_kalea.geojson")
sekundino["street"] = "SEKUNDINO ESNAOLA KALEA"
ospitalea = gpd.read_file("geojson/eskualdeko_ospitalea.geojson")
ospitalea["street"] = "ESKUALDEKO OSPITALEA"
geltokien = gpd.read_file("geojson/geltokien_enparantza.geojson")
geltokien["street"] = "GELTOKIEN ENPARANTZA"
piedad = gpd.read_file("geojson/piedad_kalea.geojson")
piedad["street"] = "PIEDAD KALEA"
bidezar = gpd.read_file("geojson/bidezar_kalea.geojson")
bidezar["street"] = "BIDEZAR KALEA"
kalebarren = gpd.read_file("geojson/kalebarren.geojson")
kalebarren["street"] = "KALEBARREN"
elizkale = gpd.read_file("geojson/elizkale.geojson")
elizkale["street"] = "ELIZKALE"
gregorio = gpd.read_file("geojson/san_gregorio_kalea.geojson")
gregorio["street"] = "SAN GREGORIO KALEA"
legazpi = gpd.read_file("geojson/legazpi_kalea.geojson")
legazpi["street"] = "LEGAZPI KALEA"
elkano = gpd.read_file("geojson/elkano_kalea.geojson")
elkano["street"] = "ELKANO KALEA"
euskadi = gpd.read_file("geojson/euskadi_enparantza.geojson")
euskadi["street"] = "EUSKADI ENPARANTZA"
urdaneta = gpd.read_file("geojson/urdaneta_hiribidea.geojson")
urdaneta["street"] = "URDANETA HIRIBIDEA"
jai = gpd.read_file("geojson/jai_alai_kalea.geojson")
jai["street"] = "JAI-ALAI KALEA"
orbegozo = gpd.read_file("geojson/esteban_orbegozo_ibilaldia.geojson")
orbegozo["street"] = "ESTEBAN ORBEGOZO IBILALDIA"
leturia = gpd.read_file("geojson/leturiatarren_enparantza.geojson")
leturia["street"] = "LETURIATARREN ENPARANTZA"
iparragirre = gpd.read_file("geojson/iparragirre_hiribidea.geojson")
iparragirre["street"] = "IPARRAGIRRE HIRIBIDEA"
soraluze = gpd.read_file("geojson/soraluze_kalea.geojson")
soraluze["street"] = "SORALUZE KALEA"
okendo = gpd.read_file("geojson/okendo_kalea.geojson")
okendo["street"] = "OKENDO KALEA"
txurruka = gpd.read_file("geojson/txurruka_kalea.geojson")
txurruka["street"] = "TXURRUKA KALEA"
ibaiondo = gpd.read_file("geojson/ibaiondo.geojson")
ibaiondo["street"] = "IBAIONDO"
artiz = gpd.read_file("geojson/artiz_auzoa.geojson")
artiz["street"] = "ARTIZ AUZOA"
euskalerria = gpd.read_file("geojson/euskalerria_hiribidea.geojson")
euskalerria["street"] = "EUSKALERRIA HIRIBIDEA"
oraa = gpd.read_file("geojson/antonino_oraa_kalea.geojson")
oraa["street"] = "ANTONINO ORAA KALEA"
urola = gpd.read_file("geojson/urola_kalea.geojson")
urola["street"] = "UROLA KALEA"
beloki = gpd.read_file("geojson/beloki_hiribidea.geojson")
beloki["street"] = "BELOKI HIRIBIDEA"
loiola = gpd.read_file("geojson/inigo_de_loiola_kalea.geojson")
loiola["street"] = "IÑIGO DE LOIOLA KALEA"
busca = gpd.read_file("geojson/busca_sagastizabal_parkea.geojson")
busca["street"] = "BUSCA SAGASTIZABAL PARKEA"
etxeberri = gpd.read_file("geojson/etxeberri_auzoa.geojson")
etxeberri["street"] = "ETXEBERRI AUZOA"
filipinas = gpd.read_file("geojson/islas_filipinas_kalea.geojson")
filipinas["street"] = "ISLAS FILIPINAS KALEA"
antio = gpd.read_file("geojson/antio.geojson")
antio["street"] = "ANTIO"
ipar = gpd.read_file("geojson/ipar_haizea_auzoa.geojson")
ipar["street"] = "IPAR HAIZEA AUZOA"
izazpi = gpd.read_file("geojson/izazpi_auzoa.geojson")
izazpi["street"] = "IZAZPI AUZOA"
antzine = gpd.read_file("geojson/antzine_hiribidea.geojson")
antzine["street"] = "ANTZIÑE HIRIBIDEA"
legazpi_auzo = gpd.read_file("geojson/legazpi_auzunea.geojson")
legazpi_auzo["street"] = "LEGAZPI AUZUNEA"
isidro = gpd.read_file("geojson/san_isidro_kalea.geojson")
isidro["street"] = "SAN ISIDRO KALEA"
nafarroa = gpd.read_file("geojson/nafarroa_enparantza.geojson")
nafarroa["street"] = "NAFARROA ENPARANTZA"
barandi = gpd.read_file("geojson/joxe_miel_barandiaran_auzunea.geojson")
barandi["street"] = "JOXE MIEL BARANDIARAN AUZUNEA"
urtubi = gpd.read_file("geojson/urtubi_auzoa.geojson")
urtubi["street"] = "URTUBI AUZOA"
lorategia = gpd.read_file("geojson/hiri_lorategia.geojson")
lorategia["street"] = "HIRI LORATEGIA"
elgarrestamendi = gpd.read_file("geojson/elgarrestamendi_kalea.geojson")
elgarrestamendi["street"] = "ELGARRESTAMENDI KALEA"
argixao = gpd.read_file("geojson/argixao_auzoa.geojson")
argixao["street"] = "ARGIXAO AUZOA"
industrialdea = gpd.read_file("geojson/zumarragako_industrialdea.geojson")
industrialdea["street"] = "ZUMARRAGAKO INDUSTRIALDEA"
eitza = gpd.read_file("geojson/eitza_berri.geojson")
eitza["street"] = "EITZA BERRI"
jaka = gpd.read_file("geojson/anjel_cruz_jaka_auzunea.geojson")
jaka["street"] = "ANGEL CRUZ JAKA AUZUNEA"

# Concatenate all dataframes into a single one
zumarraga = pd.concat([zelai, sekundino, ospitalea, geltokien, piedad, bidezar,
                       kalebarren, elizkale, gregorio, legazpi, elkano, euskadi,
                       urdaneta, jai, orbegozo, leturia, iparragirre, soraluze,
                       okendo, txurruka, ibaiondo, artiz, euskalerria, oraa,
                       urola, beloki, loiola, busca, filipinas, antio, ipar,
                       izazpi, antzine, legazpi_auzo, isidro, nafarroa, barandi,
                       urtubi, lorategia, elgarrestamendi, argixao,
                       industrialdea, eitza, jaka,
                       # etxeberri,
                      ])

# Reproject to Web Mercator (EPSG:3857) so the polygons line up with the basemap tiles
zumarraga_3857 = zumarraga.to_crs(epsg=3857)

# Plot with a basemap
fig, ax = plt.subplots()
legend_kwds = {'title': 'Streets & neighbourhoods', 'fontsize': 8,
               'loc': 'upper left', 'bbox_to_anchor': (1, 1.03), 'ncol': 2}
zumarraga_3857.plot(ax=ax, column="street", legend=True, legend_kwds=legend_kwds)
contextily.add_basemap(ax,
                       source=contextily.providers.OpenStreetMap.Mapnik,
                       # source=contextily.providers.CartoDB.PositronNoLabels,
                      )
ax.set_axis_off()

plt.show()

../_images/d6f958a21cbe87136f4ac85f918a7cd37a8652883e333432311d6bf9137054df.png

With the map ready, all that remained was to merge it with the fines dataframe to produce the two choropleths shown below.

../_images/129f3d7e052eec5edff2ad4daf971a39a0fa513803310f6756d35b28beaadb83.png

../_images/885ab83fde24165162c4e250025642a43689986678ee3afee94874f06b22cc98.png
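The merge joins the two tables on the street name, so the labels given to the polygons have to match the spellings in the fines data exactly. The listings on this page show at least one pair that differs (SORALUZE KALEA in the map code, SORALUCE KALEA in the fines table), so a quick check may be worthwhile. The sketch below uses plain pandas frames as stand-ins for the map and the per-street totals; the names `streets_map` and `totals` are hypothetical, and the numbers are copied from the fines-by-street table.

```python
import pandas as pd

# Stand-in for the map: in the project this would be the GeoDataFrame
# built from the GeoJSON files, which also carries a geometry column
streets_map = pd.DataFrame({
    "street": ["ZELAI-ARIZTI PARKEA", "SORALUZE KALEA", "IBAIONDO"],
})

# Totals per street, copied from the fines-by-street table
totals = pd.DataFrame({
    "street": ["ZELAI-ARIZTI PARKEA", "SORALUCE KALEA", "IBAIONDO"],
    "total": [918, 85, 66],
})

# A left merge keeps every polygon; indicator=True adds a "_merge" column
# telling whether a matching row was found in the fines data
merged = streets_map.merge(totals, on="street", how="left", indicator=True)

# Polygons that found no fines: either a spelling mismatch or a street with no fines
unmatched = merged.loc[merged["_merge"] == "left_only", "street"].tolist()
print(unmatched)
```

With a GeoDataFrame on the left-hand side, the merged result keeps its geometry, so a call such as `merged.plot(column="total", legend=True)` would draw the choropleth.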

Fines by year#

QUESTION: How has the number of fines evolved over the last few years?

category  LEVE  GRAVE  MUY GRAVE  total
year                                   
    1112    194          4   1310
     583    104          2    689
    1183     95          1   1279
     918    167         11   1096
    1604    158         11   1773

../_images/2c43dabb75e3ee0e9536e1fdb8729e93c36d98ae7eab711412676d23468a6f7e.png

Fines by year and street#

QUESTION: How has the number of fines evolved over the last few years in the ten streets with the most fines?

category                   LEVE  GRAVE  MUY GRAVE  total
year street                                             
2018 ANTIO                  9.0    3.0        NaN   12.0
     ANTONINO ORAA KALEA    1.0    1.0        NaN    2.0
     ARGIXAO AUZOA          1.0    NaN        NaN    1.0
     ARTIZ AUZOA           24.0    1.0        NaN   25.0
     BELOKI HIRIBIDEA      12.0    3.0        NaN   15.0
...                         ...    ...        ...    ...
2022 SORALUCE KALEA         6.0    NaN        NaN    6.0
     TXURRUKA KALEA        43.0    1.0        1.0   45.0
     URDANETA HIRIBIDEA    62.0    5.0        NaN   67.0
     UROLA KALEA            9.0    1.0        NaN   10.0
     ZELAI-ARIZTI PARKEA  209.0    1.0        NaN  210.0

[189 rows x 4 columns]

../_images/977172bc2af53fd5ee60f1284664a97456e0e888514ab8859c4d00dda80632e1.png
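Restricting the yearly breakdown to the streets with the most fines takes two steps: rank the streets by their overall total, then keep only those streets before pivoting by year. Here is a minimal sketch with made-up street names and numbers (none of them from the real data) and a top 2 instead of a top 10:

```python
import pandas as pd

# Illustrative values only: hypothetical streets and counts
fines = pd.DataFrame({
    "year": [2021, 2021, 2021, 2022, 2022, 2022],
    "street": ["STREET A", "STREET B", "STREET C"] * 2,
    "fines": [10, 40, 5, 20, 30, 1],
})

top_n = 2  # the analysis above uses 10

# Step 1: rank streets by their total over all years
top_streets = fines.groupby("street")["fines"].sum().nlargest(top_n).index

# Step 2: keep only those streets, then lay out years as rows and streets as columns
trend = (
    fines[fines["street"].isin(top_streets)]
    .pivot_table(index="year", columns="street", values="fines", aggfunc="sum")
)
print(trend)
```

With years as the index and one column per street, `trend.plot()` would draw one line per street, which is one common way to show such an evolution.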

Money collected from fines#

QUESTION: How much money does the municipality collect from fines?

The data comes with two columns related to the amount of money:

“IMPORTE PAGADO” (amount paid), which I have renamed “paid”.
“IMPORTE PENDIENTE DE PAGO EN EJECUTIVA” (amount still pending in enforced collection), which I have renamed “unpaid”.

For simplicity, I have added both columns together, assuming that pending payments will eventually be collected.

category      LEVE     GRAVE  MUY GRAVE     total
year                                             
    19548.05  16623.60    1704.95  37876.60
     8359.39   8695.95     500.00  17555.34
    11778.40   6099.65     500.00  18378.05
    14145.21  14438.45    1134.95  29718.61
    27313.80  13633.91    4499.75  45447.46

../_images/af01a3a67f6524bb1c60cf4c65f79e5ae81932beb02e5be0d122a10568d1072a.png
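Adding the two money columns and summing them by year and category could look roughly like this. It is a sketch on a handful of rows copied from the first listing, so the amounts are far smaller than in the full table above; the `amount` column name is an assumption introduced here.

```python
import pandas as pd

# Sample rows copied from the dataset listing
fines = pd.DataFrame({
    "year": [2018, 2018, 2018, 2018, 2022, 2022, 2022, 2022, 2022],
    "category": ["GRAVE", "LEVE", "MUY GRAVE", "GRAVE",
                 "GRAVE", "LEVE", "GRAVE", "LEVE", "LEVE"],
    "paid": [0.0, 640.8, 500.0, 200.0, 200.0, 385.0, 2100.0, 675.0, 0.0],
    "unpaid": [234.95, 279.15, 0.0, 244.95, 0.0, 259.65, 949.80, 533.55, 0.0],
})

# Treat pending payments as money that will eventually be collected
fines["amount"] = fines["paid"] + fines["unpaid"]

# Sum per year and category, then spread the categories into columns
money = fines.groupby(["year", "category"])["amount"].sum().unstack("category")

# Yearly total across categories; missing categories (NaN) are skipped
money["total"] = money.sum(axis=1)
print(money.round(2))
```

Keeping `paid` and `unpaid` as separate columns until this point makes it easy to redo the table under a stricter assumption, for example counting only the money already paid.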

Finally, it is interesting to boil this chart down to a single figure: roughly how much money is collected per year?

The amount of money collected per year is roughly 30000 €
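The rounded figure can be checked directly against the yearly totals in the table above. Assuming the summary is a plain arithmetic mean of the five totals (the page does not say how it was computed), the calculation is:

```python
# Yearly totals in euros, copied from the "total" column in table order
yearly_totals = [37876.60, 17555.34, 18378.05, 29718.61, 45447.46]

# Plain arithmetic mean over the five years
average = sum(yearly_totals) / len(yearly_totals)
print(f"{average:,.2f} €")  # 29,795.21 €
```

The mean comes out just under 30000 €, consistent with the rounded figure. The yearly totals range from about 17500 € to about 45400 €, so the average hides considerable variation.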

Conclusions#

This project was about shedding light on a freely available open dataset, in this case the traffic fines issued in my town. The analysis consisted of exploring ways to represent the data so as to get a clear picture of its contents. This kind of work should be useful both for the people involved, who can watch for trends and follow up on new policies, and for the public, for whom it makes the information genuinely transparent.