Drowsy Driving is a deadly combination of driving and sleepiness. The number of road accidents caused by Drowsy Driving is increasing at an alarming rate worldwide. Lack of proper sleep is the main cause of drowsiness while driving.
However, other factors such as sleep disorders, medication, alcohol consumption, or driving during night shifts can also cause drowsiness at the wheel.
According to a report of AIIMS Neurology India —
Sleep disorders are the reason behind more than 20% of all road accidents, and around 23% of truck drivers are sleep deprived.
A separate report by National Highway Traffic Safety Administration, USA, states that —
Drowsy Driving was responsible for around 72,000 crashes, 44,000 injuries and 800 deaths in 2013.
Whatever the reason for drowsiness, the fatalities due to drowsy driving are increasing every year.
This article presents a solution for driver drowsiness detection using a Convolutional Neural Network (CNN). The implementation uses a custom CNN architecture with fewer than 250K trainable parameters, so that it can be deployed easily on edge devices or other devices with limited compute.
The system can then alert the driver in time, before anything dangerous happens, if it detects that the driver has fallen asleep.
The Cainvas Platform is used for the implementation. It provides seamless execution of Python notebooks for building AI systems that can eventually be deployed on the edge (i.e. on an embedded system such as a compact MCU).
The notebook can be found here.
The flow of the article is as follows:
- Description of the Problem Statement
- CNN Model Architecture
- Drowsiness Detection Dataset
- Training the Model
- Performance of the Model
- Testing the Model
- Building Pipeline for Predictions on Full Face Images
- Conclusion
Description of the Problem Statement
The project aims to detect drowsiness while driving and alert the driver in time to prevent a mishap. It uses a CNN model to predict whether a person is drowsy based on whether the eyes are closed or open.
The project’s main objective was to keep the number of trainable parameters of the CNN model under 250K so that the system can be deployed on edge devices or other devices with limited compute. The project has a direct application in the automobile industry: it can make driving safer and help reduce the death toll caused by drowsy driving.
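The article classifies eyes as open or closed, but it does not describe how those predictions are turned into an alert. The sketch below is one possible approach, not the author's method: it raises an alert only after the eyes have been classified as closed for several consecutive frames, so that an ordinary blink does not trigger it. The class name and the threshold of 15 frames are hypothetical.

```python
class DrowsinessMonitor:
    """Hypothetical helper: alert after N consecutive closed-eye frames."""

    def __init__(self, closed_frames_to_alert=15):
        # Threshold is an assumed value; tune it to the camera frame rate.
        self.closed_frames_to_alert = closed_frames_to_alert
        self.closed_streak = 0

    def update(self, eyes_closed):
        """Record one frame's prediction and return True if an alert is due."""
        if eyes_closed:
            self.closed_streak += 1
        else:
            # Any open-eye frame resets the streak.
            self.closed_streak = 0
        return self.closed_streak >= self.closed_frames_to_alert


monitor = DrowsinessMonitor(closed_frames_to_alert=3)
frames = [True, True, False, True, True, True]
alerts = [monitor.update(closed) for closed in frames]
print(alerts)
```

With a threshold of 3, only the last frame in this sequence raises an alert, because the open-eye frame in the middle resets the count.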
Building the CNN Model
The implementation uses a custom-designed Convolutional Neural Network that has the following characteristics:
- Three convolution blocks having 2, 2, and 3 convolutional layers, respectively (conv1 to conv7 in the model summary).
- A BatchNormalization layer follows each convolutional layer.
- Each convolution block ends with a Dropout layer, to reduce overfitting, followed by a MaxPool layer.
- Three fully connected layers follow the convolutional layers for classification.
Defining the model using Keras produces the following summary:
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv1 (Conv2D)               (None, 32, 32, 32)        896
batch_normalization (BatchNo (None, 32, 32, 32)        128
conv2 (Conv2D)               (None, 32, 32, 32)        9248
batch_normalization_1 (Batch (None, 32, 32, 32)        128
dropout (Dropout)            (None, 32, 32, 32)        0
max_pooling2d (MaxPooling2D) (None, 16, 16, 32)        0
conv3 (Conv2D)               (None, 16, 16, 64)        18496
batch_normalization_2 (Batch (None, 16, 16, 64)        256
max_pooling2d_1 (MaxPooling2 (None, 8, 8, 64)          0
conv4 (Conv2D)               (None, 8, 8, 64)          36928
batch_normalization_3 (Batch (None, 8, 8, 64)          256
dropout_1 (Dropout)          (None, 8, 8, 64)          0
max_pooling2d_2 (MaxPooling2 (None, 4, 4, 64)          0
conv5 (Conv2D)               (None, 4, 4, 64)          36928
batch_normalization_4 (Batch (None, 4, 4, 64)          256
conv6 (Conv2D)               (None, 4, 4, 64)          36928
batch_normalization_5 (Batch (None, 4, 4, 64)          256
conv7 (Conv2D)               (None, 4, 4, 64)          36928
batch_normalization_6 (Batch (None, 4, 4, 64)          256
dropout_2 (Dropout)          (None, 4, 4, 64)          0
max_pooling2d_3 (MaxPooling2 (None, 2, 2, 64)          0
flatten (Flatten)            (None, 256)               0
fc1 (Dense)                  (None, 128)               32896
dropout_3 (Dropout)          (None, 128)               0
fc2 (Dense)                  (None, 128)               16512
dropout_4 (Dropout)          (None, 128)               0
fc3 (Dense)                  (None, 2)                 258
=================================================================
Total params: 227,554
Trainable params: 226,786
Non-trainable params: 768
_________________________________________________________________
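The parameter counts in the summary can be checked by hand, which is useful when working against a budget such as 250K. The short sketch below recomputes them from the layer shapes. The 3x3 kernel size is not stated in the article; it is inferred from conv1, where 896 = (3 * 3 * 3 + 1) * 32.

```python
def conv2d_params(kernel, in_channels, out_channels):
    # One kernel x kernel filter per (input, output) channel pair, plus biases.
    return kernel * kernel * in_channels * out_channels + out_channels


def dense_params(n_inputs, n_outputs):
    # Weight matrix plus one bias per output unit.
    return n_inputs * n_outputs + n_outputs


def batchnorm_params(channels):
    # gamma and beta are trainable; moving mean and variance are not.
    return 4 * channels


# (input channels, output channels) for conv1 to conv7, read from the summary.
conv_layers = [(3, 32), (32, 32), (32, 64), (64, 64), (64, 64), (64, 64), (64, 64)]
# (inputs, outputs) for fc1 to fc3; the flattened feature map has 2 * 2 * 64 = 256 values.
dense_layers = [(256, 128), (128, 128), (128, 2)]

conv_total = sum(conv2d_params(3, i, o) for i, o in conv_layers)
bn_total = sum(batchnorm_params(o) for _, o in conv_layers)
dense_total = sum(dense_params(i, o) for i, o in dense_layers)

total = conv_total + bn_total + dense_total
non_trainable = bn_total // 2  # half of each BatchNormalization layer
trainable = total - non_trainable

print(total, trainable, non_trainable)  # 227554 226786 768
```

The result matches the totals reported by Keras and confirms that the model stays under the 250K limit.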
The model was compiled with the Adam optimizer and a learning rate of 0.0001.
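The model-definition code is not reproduced in this article, so the following is a reconstruction from the summary rather than the author's exact code. Layer order, layer names, filter counts, the Adam optimizer and the learning rate come from the text above. The 3x3 kernels and "same" padding are inferred from the parameter counts and output shapes. The ReLU and softmax activations, the dropout rate of 0.25 and the sparse categorical cross-entropy loss are assumptions.

```python
from tensorflow import keras
from tensorflow.keras import layers


def conv_bn(filters, name):
    """A 3x3 'same' convolution followed by batch normalization."""
    return [
        layers.Conv2D(filters, 3, padding="same", activation="relu", name=name),
        layers.BatchNormalization(),
    ]


def build_model(dropout_rate=0.25):
    """Rebuild the layer stack listed in the summary (dropout rate is assumed)."""
    stack = [keras.Input(shape=(32, 32, 3))]
    # Block 1: conv1, conv2
    stack += conv_bn(32, "conv1") + conv_bn(32, "conv2")
    stack += [layers.Dropout(dropout_rate), layers.MaxPooling2D()]
    # Block 2: conv3, conv4 (the summary shows an extra pooling step after conv3)
    stack += conv_bn(64, "conv3") + [layers.MaxPooling2D()]
    stack += conv_bn(64, "conv4")
    stack += [layers.Dropout(dropout_rate), layers.MaxPooling2D()]
    # Block 3: conv5, conv6, conv7
    stack += conv_bn(64, "conv5") + conv_bn(64, "conv6") + conv_bn(64, "conv7")
    stack += [layers.Dropout(dropout_rate), layers.MaxPooling2D()]
    # Classifier head: fc1, fc2, fc3
    stack += [
        layers.Flatten(),
        layers.Dense(128, activation="relu", name="fc1"),
        layers.Dropout(dropout_rate),
        layers.Dense(128, activation="relu", name="fc2"),
        layers.Dropout(dropout_rate),
        layers.Dense(2, activation="softmax", name="fc3"),
    ]
    return keras.Sequential(stack)


model = build_model()
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=0.0001),
    loss="sparse_categorical_crossentropy",  # assumed; suits integer labels 0 and 1
    metrics=["accuracy"],
)
print(model.count_params())
```

If the reconstruction is faithful, `model.count_params()` should equal the 227,554 total shown in the summary.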
Drowsiness Detection Dataset
The project uses the Drowsiness_dataset available on the Kaggle platform, at this link. The original dataset contains four classes: Open Eyes, Closed Eyes, Yawning, and No-Yawning.
However, the scope of this project is to classify drowsiness based on whether the eyes are closed or open, so I will use only two classes of the dataset. Characteristics of the dataset are as follows:
- The dataset contains a total of 1452 images in two categories.
- Each category has 726 images.
- The dataset is already balanced, so no rebalancing is needed.
- Class Labels: ‘Open Eye’ and ‘Closed Eye’.
- Class Labels were encoded such that 0 represents Open Eye and 1 represents Closed Eye.
Loading the dataset —
The images are preprocessed so that each one has the shape (32, 32, 3). The dataset is then split into a training set and a test set in an 80%-20% proportion.
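As a rough illustration of the split, the sketch below shuffles the data and holds out 20% of it. It is a minimal NumPy version, not the notebook's code: zero-filled arrays stand in for the resized images, and the function name and seed are hypothetical. Rounding the test size up gives 291 test images, which agrees with the total support in the classification report later in the article.

```python
import math

import numpy as np


def split_train_test(images, labels, test_fraction=0.2, seed=42):
    """Shuffle once, then hold out the first ceil(n * test_fraction) samples."""
    n = len(images)
    n_test = math.ceil(n * test_fraction)
    order = np.random.default_rng(seed).permutation(n)
    test_idx, train_idx = order[:n_test], order[n_test:]
    return images[train_idx], images[test_idx], labels[train_idx], labels[test_idx]


# Placeholder data with the shapes described above: 726 images per class,
# already resized to 32 x 32 pixels with 3 channels.
images = np.zeros((1452, 32, 32, 3), dtype=np.float32)
labels = np.repeat([0, 1], 726)  # 0 = Open Eye, 1 = Closed Eye

x_train, x_test, y_train, y_test = split_train_test(images, labels)
print(x_train.shape, x_test.shape)  # (1161, 32, 32, 3) (291, 32, 32, 3)
```

The split here is purely random rather than stratified, so the two classes will not necessarily appear in equal numbers in the test set.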
Training the Model
The model is trained for a total of 200 epochs with a batch size of 128. ImageDataGenerator is used to randomly augment the training images, which improves the performance of the model.
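The article does not say which augmentations were configured, so the sketch below is only a simplified NumPy stand-in for what ImageDataGenerator does: it yields shuffled batches and applies a random horizontal flip to about half of the images. The flip, the function name and the placeholder arrays are assumptions. It also shows how the batch size translates into steps per epoch for the 1161 training images.

```python
import math

import numpy as np


def augmented_batches(images, labels, batch_size=128, seed=0):
    """Endlessly yield shuffled batches with random horizontal flips."""
    rng = np.random.default_rng(seed)
    n = len(images)
    while True:
        order = rng.permutation(n)  # reshuffle at the start of every epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            batch = images[idx].copy()
            flip = rng.random(len(idx)) < 0.5
            # Reverse the width axis of the selected images.
            batch[flip] = batch[flip][:, :, ::-1, :]
            yield batch, labels[idx]


# Placeholder training data with the shapes used in this article.
x_train = np.zeros((1161, 32, 32, 3), dtype=np.float32)
y_train = np.zeros(1161, dtype=np.int64)

batch_size = 128
steps_per_epoch = math.ceil(len(x_train) / batch_size)
batch_x, batch_y = next(augmented_batches(x_train, y_train, batch_size))
print(steps_per_epoch, batch_x.shape)  # 10 (128, 32, 32, 3)
```

With these numbers, 200 epochs correspond to 2000 weight updates, the last batch of each epoch being smaller than 128.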
Performance of the Model
The following Performance Metrics are used —
- Loss vs Number of Epochs Plot
- Accuracy vs Number of Epochs Plot
- Classification Report
- Confusion Matrix
The results of the metrics mentioned above are as follows —
Classification Report —
              precision    recall  f1-score   support

           0       0.99      0.98      0.99       169
           1       0.98      0.99      0.98       122

    accuracy                           0.99       291
   macro avg       0.98      0.99      0.99       291
weighted avg       0.99      0.99      0.99       291
Confusion Matrix —
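The confusion matrix figure is not reproduced here. To show how the classification report relates to a confusion matrix, the sketch below computes precision, recall and F1 from a matrix of counts. The counts used are an inference, not values taken from the notebook: they were chosen because they match the supports (169 and 122) and reproduce the rounded figures in the report.

```python
def per_class_metrics(cm):
    """cm[i][j] is the number of samples of true class i predicted as class j."""
    n_classes = len(cm)
    metrics = []
    for k in range(n_classes):
        true_positives = cm[k][k]
        predicted_as_k = sum(cm[i][k] for i in range(n_classes))
        actually_k = sum(cm[k])
        precision = true_positives / predicted_as_k
        recall = true_positives / actually_k
        f1 = 2 * precision * recall / (precision + recall)
        metrics.append((precision, recall, f1, actually_k))
    return metrics


# Inferred counts, consistent with the rounded report (rows: true 0, true 1).
cm = [[166, 3],
      [1, 121]]

report = per_class_metrics(cm)
accuracy = sum(cm[k][k] for k in range(len(cm))) / sum(map(sum, cm))

for label, (precision, recall, f1, support) in enumerate(report):
    print(label, round(precision, 2), round(recall, 2), round(f1, 2), support)
print("accuracy", round(accuracy, 2))
```

Under this reading, the model misclassifies 4 of the 291 test images: three open eyes predicted as closed and one closed eye predicted as open.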
Testing the Model
The predictions of the model on images of eyes can be seen in the following pictures —
Building a Pipeline for Predictions on Full Face Images
The final step of this project is to build a pipeline for making predictions on full-face images. The pipeline consists of face detection, face alignment, eye detection, preprocessing the ROI of the image, passing it to the model for prediction, and displaying the results on the image.
The implementation is as follows —
The implementation uses the ‘dlib’ library for face detection in the image. Face alignment is done using the FaceAligner class of the imutils.face_utils module for better eye detection. Eye detection is performed using Haar Cascade classifiers.
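The pipeline code itself is not reproduced in this article. The sketch below shows only the control flow of the stages, with each stage passed in as a function so that the flow can be read and tested without dlib, imutils or OpenCV installed. All function names are hypothetical, and the nearest-neighbour resize is a simple stand-in for a library resize. In the notebook the stages would be backed by the dlib face detector, FaceAligner and a Haar Cascade eye detector.

```python
import numpy as np

LABELS = {0: "Open Eye", 1: "Closed Eye"}


def resize_nearest(image, size=32):
    """Nearest-neighbour resize to (size, size); stand-in for a library resize."""
    height, width = image.shape[:2]
    rows = np.arange(size) * height // size
    cols = np.arange(size) * width // size
    return image[rows][:, cols]


def predict_eye_states(image, detect_faces, align_face, detect_eyes, predict_proba):
    """Run face detection, alignment, eye detection and classification in order."""
    results = []
    for face_box in detect_faces(image):
        aligned = align_face(image, face_box)
        for (x, y, w, h) in detect_eyes(aligned):
            roi = resize_nearest(aligned[y:y + h, x:x + w])
            # Add a batch axis; any scaling must match what was used in training.
            probs = predict_proba(roi[np.newaxis].astype(np.float32))[0]
            results.append(((x, y, w, h), LABELS[int(np.argmax(probs))]))
    return results


# Stub stages, for illustration only.
frame = np.zeros((100, 100, 3), dtype=np.uint8)
results = predict_eye_states(
    frame,
    detect_faces=lambda img: [(0, 0, 100, 100)],
    align_face=lambda img, box: img,
    detect_eyes=lambda face: [(10, 10, 20, 20), (60, 10, 20, 20)],
    predict_proba=lambda batch: np.array([[0.1, 0.9]]),
)
print(results)
```

Each entry in the result pairs an eye bounding box, in the coordinates of the aligned face image, with its predicted label, which is what the display step would draw on the image.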
Finally, the results can be viewed as follows —
Conclusion
Driver drowsiness is a significant cause of thousands of road accidents all over the world. Driver drowsiness detection is a car safety technology that helps prevent accidents caused by the driver getting drowsy.
The project provides a solution for driver drowsiness detection using a CNN and image processing. It also keeps the number of model parameters under 250K for easy deployment on edge devices.
This deployment is possible through the Cainvas Platform and its compiler, deepC, which effectively brings AI out to the edge, in actual, physical real-world use cases.
Notebook link is here.