Road traffic safety is a major societal issue costing up to 4% of a country's GNP, and
traffic crashes are predicted to become the fifth leading cause of death by 2030. The 2017
European Commission fact sheet shows that the United Kingdom is a leader in road safety with 27
fatalities per million inhabitants, while France stands slightly above average with 57. What’s
more, the United Kingdom provides its traffic reports from 2005 to 2015. While it is commonly believed
that the majority of road accidents happen while commuting, can the figures prove it? Our goal is
therefore to extract information from the UK dataset to put common road safety beliefs into
perspective, whether they concern driving conditions or the likelihood of being a casualty.
If you wish to know more about the technical details of our work, check our GitHub repo here.
Once you dive into this dataset, you soon realize that there are a lot of interesting leads: specific places and accidents, hypotheses on drivers or vehicles to check, and so on. Yet time is limited. Therefore, after getting familiar with the datasets and the road safety situation in the United Kingdom, we decided to focus on the correlation between our features and the number of casualties. In doing so, we encountered several interesting outliers that we investigated to get to the bottom of the story. Finally, we provide some indications that may justify well-known car insurance bonuses.
To visualize the distribution of road accidents in the UK, we plotted a sample of the accidents on a map. We were looking for specific trends or locations that appear more accident-prone than others.
Most accidents seem to be located within cities or their urban areas, especially around London. There are also many accidents on the roads between major cities, for instance in the area between Manchester, Liverpool and Preston in the North of England. Consequently, a deeper analysis of the distribution of accidents between rural and urban areas was necessary. According to our data, 40% of accidents happen in rural areas. Moreover, this share has not changed significantly over the last 10 years.
As expected, the number of casualties has decreased linearly over the years, except for a bump in 2014. What is even more interesting is to compare this trend with the evolution of traffic itself, which increased by more than 5% over the same period. This shows that the significant decrease over time is not due to lower traffic; it points instead to road safety policies and technical improvements made to vehicles. Road safety policies are working!
One last thing to look at from a global viewpoint is the breakdown of casualties from 2004 to 2016:
The first observation is that passengers account for around 25% of accident casualties,
with a majority of them being car passengers rather than coach passengers. Moreover, pedestrians
represent more than 10% of the casualties, which is a very significant share given that they can only
be hit in specific circumstances in urban areas: they are almost always the victims rather than
the people causing the accident. Finally, casualties still mainly concern drivers and riders: the people
who are supposedly responsible for taking every possible step to avoid accidents. This could be
explained in two ways: the high proportion of people driving alone, to commute for instance,
and the growing share of bike riders involved in road accidents.
Despite this context, the government has still taken no action to make wearing helmets compulsory
for bike users.
Who has never been stuck in traffic? One could imagine that a slower pace means fewer accidents. Or, on the contrary, do more cars on the road entail more accidents? Our data support the second proposition. While the pattern is similar across weekdays, weekends look very different.
Rush hours are clearly visible at 8am and around 5pm, and it is interesting to notice that
the peak lasts longer in the afternoon: are people tired and thus less alert? On weekends,
people seem to be on the move around lunch and tea time (do not forget we are talking about the UK!).
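The hourly profile described above can be sketched as a simple count of accidents per hour of day, split between weekdays and weekends. This is a minimal illustration using only the Python standard library; the `timestamps` list and the `hourly_counts` helper are hypothetical and are not taken from our dataset or repository.

```python
from collections import Counter
from datetime import datetime

# Hypothetical accident timestamps, invented for illustration only.
timestamps = [
    "2015-03-02 08:10",  # Monday
    "2015-03-02 08:45",  # Monday
    "2015-03-03 17:20",  # Tuesday
    "2015-03-04 17:55",  # Wednesday
    "2015-03-07 12:30",  # Saturday
    "2015-03-08 16:05",  # Sunday
]


def hourly_counts(stamps):
    """Count accidents per hour of day, separately for weekdays and weekends."""
    counts = {"weekday": Counter(), "weekend": Counter()}
    for stamp in stamps:
        moment = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        # datetime.weekday() returns 0 for Monday through 6 for Sunday.
        kind = "weekend" if moment.weekday() >= 5 else "weekday"
        counts[kind][moment.hour] += 1
    return counts


counts = hourly_counts(timestamps)
for kind in ("weekday", "weekend"):
    # Print the hours in chronological order with their accident counts.
    print(kind, sorted(counts[kind].items()))
```

Plotting each counter as a curve over the 24 hours of the day is what makes the morning and afternoon peaks stand out on weekdays.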
In order to understand whether accidents related to commuting are due to traffic jams or just to the
increase in the number of drivers on the road, we compared rush hours in urban and rural areas on weekdays and
weekends.
As one can see, rush hours are still visible in rural areas, even though they are less
pronounced than in urban areas. What is interesting is that the difference between rural and urban
areas decreases on Sunday: are people going to the countryside on Sunday?
On the whole, reducing the number of vehicles on the road during rush hours could lead to a significant
decrease in the number of accidents: commuting at off-peak hours (at 10am and 8pm, for instance) is safer than
commuting “like everyone else”.
"Which features have the highest correlation coefficient with the number of accidents?" That’s the kind of question road safety experts must ask themselves every day in order to design effective road safety actions.
Once we set aside the number of vehicles involved, the two most significant features are the type of area (i.e. whether it is urban or rural) and the type of road. The correlation between the number of casualties and these features is far higher than its correlation with light conditions or weather, which are almost negligible, contrary to popular belief! That’s quite surprising, so we decided to dig deeper to understand and maybe explain this cliché.
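To make the notion of correlation used above concrete, here is a minimal sketch that computes a Pearson correlation coefficient between a binary feature and the number of casualties. It assumes Pearson's coefficient and a 0/1 encoding of the area type, which may differ from the exact method we used; the `pearson` helper and the sample values are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centered products and centered squares.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical accidents: 1 = rural, 0 = urban (invented values).
is_rural = [0, 0, 0, 1, 1, 1]
casualties = [1, 1, 2, 2, 3, 3]

r = pearson(is_rural, casualties)
print(f"correlation between area type and casualties: {r:.3f}")
```

Repeating this computation for each encoded feature and ranking the absolute values gives the kind of comparison discussed above.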
Common wisdom says that driving in poor weather is more dangerous. We therefore tried to spot a difference in the number of accidents depending on whether the weather is poor or not. The problem is that we do not know how often the weather is bad, so we have no proper normalization. For that reason, we plotted the average number of casualties per accident for many different weather conditions:
That plot showed little difference between weather conditions. The second step was to find the most dangerous locations in the UK and to compute the frequency of bad-weather accidents at these locations:
The plot shows that there is mostly no difference across these locations, which tells us that they are not more dangerous when the weather is bad. However, there is a clear outlier that merits further investigation. It corresponds to the location (51.498, -0.497), which we checked on Google Maps:
The funny thing is that this location turned out to be particularly dangerous indeed: in bad weather, it could be particularly difficult for a car in the right lane to enter the road… So, time to act, UK!
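The two steps of the weather analysis above can be sketched as follows: first the average number of casualties per accident for each weather condition, then the share of bad-weather accidents at each location. This is only an illustration under stated assumptions: the records are invented, the `BAD_WEATHER` set is our guess at a reasonable definition, and grouping locations by rounding coordinates to three decimals is a simplification.

```python
from collections import defaultdict

# Hypothetical records: (latitude, longitude, weather, casualties).
# These values are invented and do not come from the real dataset.
accidents = [
    (51.4981, -0.4972, "rain", 2),
    (51.4983, -0.4969, "rain", 1),
    (51.4979, -0.4971, "fine", 1),
    (53.4808, -2.2426, "fine", 3),
    (53.4811, -2.2431, "fine", 1),
    (53.4806, -2.2429, "rain", 1),
    (53.4809, -2.2427, "fine", 1),
]

# Assumed definition of "bad weather" for this sketch.
BAD_WEATHER = {"rain", "snow", "fog"}


def mean_casualties_by_weather(records):
    """Average number of casualties per accident for each weather condition."""
    totals = defaultdict(lambda: [0, 0])  # weather -> [casualties, accidents]
    for _, _, weather, casualties in records:
        totals[weather][0] += casualties
        totals[weather][1] += 1
    return {w: c / n for w, (c, n) in totals.items()}


def bad_weather_share_by_location(records, decimals=3):
    """Share of accidents in bad weather, grouped by rounded coordinates."""
    counts = defaultdict(lambda: [0, 0])  # location -> [bad-weather, total]
    for lat, lon, weather, _ in records:
        key = (round(lat, decimals), round(lon, decimals))
        counts[key][1] += 1
        if weather in BAD_WEATHER:
            counts[key][0] += 1
    return {loc: bad / total for loc, (bad, total) in counts.items()}


means = mean_casualties_by_weather(accidents)
shares = bad_weather_share_by_location(accidents)
for location, share in shares.items():
    print(location, f"{share:.0%} of accidents in bad weather")
```

A location whose bad-weather share is far above the others is the kind of outlier worth checking on a map.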
Every driver has to be covered by car insurance, whose premiums are a very sensitive topic,
depending on age, experience and even gender. Bonuses are supposed to depend on established facts
that make it possible to compute the likelihood of having a car accident. For instance, some insurers
offer bonuses to female drivers: to what extent can our data support this advantage?
From 2005 to 2015, the data show that female drivers had almost half as many car accidents as male
drivers, which seems to justify the bonuses offered by car insurance companies. Yet we cannot conclude that driving
is safer for women, because the difference in the number of accidents could be explained by a
difference in the number of drivers: are men driving twice as much as women?
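The normalization issue raised above can be illustrated with a short sketch: a raw accident count only becomes a risk once it is divided by some measure of exposure, such as the number of drivers or the distance driven. The dataset does not contain such a measure, so the exposure figures below are purely hypothetical, as is the `accidents_per_unit_exposure` helper.

```python
def accidents_per_unit_exposure(accidents, exposure):
    """Accident rate per unit of exposure (e.g. per driver or per mile)."""
    if exposure <= 0:
        raise ValueError("exposure must be positive")
    return accidents / exposure


# Hypothetical figures: one group has half as many accidents as the other,
# but also half the exposure. None of these numbers come from the dataset.
groups = {
    "male": {"accidents": 200, "exposure": 100.0},
    "female": {"accidents": 100, "exposure": 50.0},
}

rates = {
    name: accidents_per_unit_exposure(g["accidents"], g["exposure"])
    for name, g in groups.items()
}
for name, rate in rates.items():
    print(name, rate)
```

In this invented scenario the two rates are identical even though the raw counts differ by a factor of two, which is why the raw counts alone cannot settle the question.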
Are there any other features that could be taken into account to offer bonuses to clients? We investigated the other features provided in the dataset and came to the following conclusions:
In a nutshell, this data is highly valuable for insurance companies because it helps compute the likelihood of a car accident for a driver depending on several characteristics. Combined with the frequency of insured drivers, it can build a relevant overview of at-risk drivers. One should also note that combining such data can be quite intrusive for insured persons and goes against the spirit of solidarity behind insurance.
The goal of this study was to aggregate and understand data on road accidents in the leading country
for road safety in Europe: the United Kingdom. By feeding our datasets into the Big Data black box,
we gained insights and highlighted trends, with some surprises.
Choosing among the endless leads provided by the data has been a challenge. Yet the biggest
challenge has been the lack of information on traffic itself, which forced us into
thinking differently and coming up with innovative baselines.
Finally, studying road accidents has been a very interesting and challenging topic illustrating how
data science has a highly positive and concrete impact on policies and government decisions.