Portfolio Details

Random Forest Prediction on Severity of Traffic Accident

Project information

Category: Machine Learning
Type: Random Forest Classifier
Client/Purpose: This project aimed to apply data preprocessing and feature selection, implement the Random Forest algorithm from scratch, and compare the result with library implementations.
Project date: July 2020
Project URL: github

Traffic accident severity prediction

The goal was to build an analytic model that predicts the severity of a car accident by identifying similar conditions among accidents of different severities. A Random Forest classifier was implemented from scratch in Python, and its accuracy was compared with that of a single CART model and with the corresponding Scikit-learn implementations.
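
As a rough illustration of how a forest differs from a single CART model at prediction time, the sketch below aggregates the votes of several trees and scores any predictor with plain accuracy. The names `forest_predict` and `accuracy`, and the idea of representing each tree as a callable, are assumptions made for this example and are not taken from the project code.

```python
from collections import Counter


def forest_predict(trees, row):
    """Predict a class for `row` by majority vote over all trees.

    Each tree is assumed to be a callable that maps a row to a class label.
    Ties go to the label that was voted for first.
    """
    votes = [tree(row) for tree in trees]
    return Counter(votes).most_common(1)[0][0]


def accuracy(predict, rows, labels):
    """Fraction of rows for which `predict(row)` equals the true label."""
    hits = sum(predict(row) == label for row, label in zip(rows, labels))
    return hits / len(labels)


# Toy "trees": one-rule stumps over a hypothetical (speed, wet_road) row.
trees = [
    lambda row: 2 if row[0] > 80 else 1,
    lambda row: 2 if row[1] else 1,
    lambda row: 2 if row[0] > 100 else 1,
]
rows = [(90, True), (60, False), (120, False)]
labels = [2, 1, 2]

single_tree_acc = accuracy(trees[1], rows, labels)
forest_acc = accuracy(lambda row: forest_predict(trees, row), rows, labels)
```

The same `accuracy` helper can score a single tree, the from-scratch forest, or a library model wrapped in a function, which keeps the comparison on equal terms.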

A Random Forest model was chosen to predict severity because it handles multi-class classification problems and produces easily interpreted results. The implementation uses entropy-based information gain as the split criterion and the out-of-bag (OOB) estimate to measure the accuracy of the model.
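
The two ingredients named above can be sketched in a few lines of standard-library Python. This is a minimal illustration and not the project's actual code: the function names, the binary (left/right) split, and the `oob_votes` structure are all assumptions made for the example.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a non-empty list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(parent, left, right):
    """Entropy reduction obtained by splitting `parent` into `left` and `right`."""
    total = len(parent)
    weighted = (len(left) / total) * entropy(left) \
        + (len(right) / total) * entropy(right)
    return entropy(parent) - weighted


def bootstrap_indices(n_rows, rng):
    """Draw n_rows indices with replacement; return (in_bag, out_of_bag)."""
    in_bag = [rng.randrange(n_rows) for _ in range(n_rows)]
    out_of_bag = sorted(set(range(n_rows)) - set(in_bag))
    return in_bag, out_of_bag


def oob_accuracy(y_true, oob_votes):
    """Accuracy over the rows that received at least one out-of-bag vote.

    oob_votes[i] is assumed to hold the predictions made for row i by the
    trees whose bootstrap sample did not contain row i.
    """
    scored = [(y, votes) for y, votes in zip(y_true, oob_votes) if votes]
    if not scored:
        return float("nan")
    hits = sum(Counter(votes).most_common(1)[0][0] == y for y, votes in scored)
    return hits / len(scored)
```

For example, `information_gain([1, 1, 2, 2], [1, 1], [2, 2])` evaluates to 1.0 bit because the split separates the two classes perfectly, while splitting the same labels into `[1, 2]` and `[1, 2]` gains nothing. Because every tree is trained on a bootstrap sample, the rows it never saw act as a built-in validation set, so the OOB estimate needs no separate hold-out split.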

Data cleaning and wrangling covered balanced sampling, correlation analysis, outlier handling, feature selection, and an assessment of possible interaction features. The experiments varied the number of trees in the Random Forest implementation as well as the maximum depth of the trees. The best model achieved 99.8% accuracy.
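
The text does not say how the balanced sample was drawn. One common approach, shown here purely as an assumption, is to undersample every severity class down to the size of the rarest one. The function name `undersample` and the `(features, severity)` row layout are hypothetical.

```python
import random
from collections import defaultdict


def undersample(rows, label_of, seed=0):
    """Return a class-balanced sample of `rows`.

    Every class is randomly downsampled, without replacement, to the size
    of the rarest class. `label_of` extracts the class label from a row.
    """
    if not rows:
        return []
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)
    smallest = min(len(group) for group in by_class.values())
    rng = random.Random(seed)
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, smallest))
    rng.shuffle(balanced)  # avoid leaving the rows grouped by class
    return balanced


# Toy data: (features, severity) pairs with a heavy class imbalance.
rows = [("a", 1)] * 5 + [("b", 2)] * 2 + [("c", 3)] * 9
balanced = undersample(rows, label_of=lambda row: row[1])
```

Undersampling discards data from the common classes, so on a small dataset an oversampling or class-weighting scheme may be preferable. The choice here only serves the illustration.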