Abstract: A PROCESS FOR DETECTING AND MONITORING MOBILE PHONE USAGE BY A DRIVER IN A MOVING VEHICLE The present invention relates to a process for detecting and monitoring mobile phone usage by a driver in a moving vehicle. The present invention is a process of detecting and identifying phone usage while driving, in real time, through a video stream or a live stream from a camera device (10). It includes a hardware platform configured with two camera devices (10); one camera device captures the road-side view and the other camera device captures the driver-side cabin view of the vehicle, configured with a stored database and a machine learning model. The process of the present invention includes a Mouth Aspect Ratio (MAR) model, a metric that measures the degree of mouth openness, and a strategic padding model. By combining MAR analysis with the face and phone detection models, the process accurately identifies distracted driving behaviour caused by phone usage. Further, a YOLOv8n classification model ensures real-time, high-precision detection across diverse scenarios. Real-time alerts/notifications are generated to drivers and fleet operators to mitigate the risks associated with distracted driving. FIG. 2
Description:FORM 2
THE PATENTS ACT 1970
(39 of 1970)
&
The Patents Rules, 2003
COMPLETE SPECIFICATION
(See section 10 and rule 13)
1. TITLE OF THE INVENTION: A PROCESS FOR DETECTING AND MONITORING MOBILE PHONE USAGE BY DRIVER IN A MOVING VEHICLE
2. APPLICANT:
(a) NAME : Nervanik AI Labs Pvt. Ltd.
(b) NATIONALITY : Indian
(c) ADDRESS : A – 1111, World Trade Tower,
Off. S G Road, B/H Skoda Showroom,
Makarba, Ahmedabad – 380051
Gujarat, INDIA.
3. PREAMBLE TO THE DESCRIPTION
PROVISIONAL
The following specification describes the invention.
☑ COMPLETE
The following specification particularly describes the invention and the manner in which it is to be performed.
Field of the invention
The present invention relates to a process for detecting and monitoring mobile phone usage by a driver in a moving vehicle. More particularly, the present invention relates to the identification of mobile phone usage in real time, through a video stream or a live stream from an installed camera, while driving. The main aim of the present invention is to continuously monitor the driver's behaviour to identify phone usage and other risky activities, such as holding or talking on the phone.
Background of the invention
In recent years, with the vigorous development of the transportation industry, the numbers of motor vehicles and motor vehicle drivers have increased rapidly, and traffic accidents have become increasingly frequent. Most of these accidents are caused by driver inattention. With the rapid development of information technology, the use of mobile phones has become increasingly common, and people's dependence on mobile phones has grown increasingly serious. In actual scenarios, accidents caused by drivers using mobile phones are not uncommon.
In today's world, the use of mobile phones while driving has become a significant road safety concern worldwide. The number of road traffic accidents caused by drivers using mobile phones on the road is increasing year by year. Numerous studies have shown that engaging in phone-related activities, such as making calls, texting, or using apps, can significantly impair a driver's attention, reaction time, and overall situational awareness. This distracted driving behaviour increases the risk of accidents, leading to property damage, injuries, and fatalities. However, despite widespread awareness of the dangers of distracted driving, the problem persists. According to the Road Traffic Safety Law, a person driving a motor vehicle must not make or receive calls on a handheld phone, watch television, or engage in other activities that hinder safe driving.
Nevertheless, many drivers continue to use their phones while operating a vehicle, often underestimating the risks or believing they can multitask effectively. Existing solutions, such as legislation prohibiting handheld phone use and educational campaigns, have had limited success in curbing this behaviour. The existing method of supervising whether the driver is using a mobile phone mainly captures images through a camera and analyses them. Such a camera needs to be activated by the driver before driving, and the driver often forgets to activate it out of habit, causing the monitoring device to fail and preventing it from playing a supervisory role.
Overall, distracted driving, particularly driving while talking on the phone, has become alarmingly common, leading to a rise in road accidents. To address this pressing issue, there is an urgent demand for creative solutions that not only detect but also discourage this risky behaviour. In view of the above technical problems, the purpose of the present invention is to address this critical issue, as there is a growing need for technological solutions that can actively monitor and detect when a driver is using a mobile phone. Such systems could enable early intervention, such as providing real-time alerts to the driver or fleet operators, and facilitate the development of more comprehensive distracted-driving mitigation strategies. However, designing an effective and reliable phone-usage detection system for drivers presents several challenges. The system needs to operate across a wide range of environmental conditions, vehicle types, and driver behaviours, while maintaining high accuracy and minimal false positives.
Therefore, the present invention develops a robust system capable of real-time detection of phone usage during driving. To address the above-mentioned issues, the present invention provides a process for detecting and monitoring mobile phone usage by a driver in a moving vehicle. The present invention analyses real-time video footage from in-vehicle camera devices and identifies instances where the driver is holding or talking on a phone while driving.
The present invention provides a multi-faceted approach: combining deep learning, a machine learning model, and mouth aspect ratio analysis allows the process/method to determine more reliably and accurately when the driver is engaged with a mobile phone during driving. This enhanced detection capability of the present invention further strengthens the process's ability to promote safe driving practices and prevent distracted-driving incidents.
The main aim of the present invention is to address these challenges by developing advanced technology that can reliably detect and monitor drivers' phone usage in real time through machine learning and deep learning models. It provides a comprehensive solution for improving road safety and reducing the risks associated with distracted driving. The present invention establishes a robust and innovative solution for mitigating the risks associated with distracted driving, surpassing existing technologies in accuracy, efficiency, and versatility.
Object of the invention
The main object of the present invention is to disclose a process for detecting and monitoring mobile phone usage by driver in a moving vehicle.
Another object of the present invention is to detect and monitor real-time mobile phone usage through a video stream or a live stream from a camera device.
The other object of the present invention is to provide a dual-camera device installed in the vehicle, dedicated to capturing video, and a more specialized and effective approach to accurately detect and track the driver's face in real time.
The further object of the present invention is to provide a device setup equipped with a stored database and an advanced machine learning module which monitors both the external driving environment and the internal environment of the vehicle, analysing the driver's facial expressions to ensure attentiveness and a stable emotional state during driving.
Another object of the present invention is to provide a metric that measures the degree of mouth openness during a phone conversation, to accurately identify situations wherein the driver is conversing on the mobile phone.
A further object of the present invention is to provide MAR (Mouth Aspect Ratio) analysis combined with a deep-learning-based face and phone detection model, which enhances the overall accuracy of identifying distracted driving behaviour caused by mobile phone usage.
Another object of the present invention is to provide strategic padding to account for hand positioning near the face, such as holding a phone while driving.
The other object of the present invention is to provide multi-model detection, i.e. face detection, phone detection, real-time MAR calculation, and hand detection through strategic padding.
A further object of the present invention is to provide real-time alerts or notifications to drivers, fleet operators, or other relevant parties.
Another object of the present invention is to provide an impressive overall accuracy of 96% across a wide range of devices, totalling more than 200 devices.
Still another object of the present invention is to provide a multi-faceted approach, combining an advanced deep learning model, a computer-vision-based model, and mouth aspect ratio analysis, which allows a more reliable and accurate determination of when the driver is engaged with a mobile phone during driving.
Summary of the Invention
The present invention relates to a process for detecting and monitoring mobile phone usage by a driver in a moving vehicle. The present invention relates to a process of detecting and identifying phone usage while driving, in real time, through a video stream or a live stream from a camera device. It includes a hardware platform configured with two camera devices; one camera device captures the road-side view and the other camera device captures the driver-side cabin view of the vehicle, configured with a stored database and a machine learning model. The driver-side camera device tracks the driver's eye movement, lip movement, and mobile phone usage in real time through a face detection model and a facial landmark detector model. The process of the present invention includes a Mouth Aspect Ratio (MAR) model, a metric that measures the degree of mouth openness, and a strategic padding model. By combining MAR analysis with the face and phone detection models, the process accurately identifies distracted driving behaviour caused by phone usage. Further, a YOLOv8n classification model ensures real-time, high-precision detection across diverse scenarios. Real-time alerts/notifications are generated to drivers and fleet operators to mitigate the risks associated with distracted driving.
Brief Description of the Drawings
FIG. 1 illustrates a schematic structural diagram of a process for detecting and monitoring mobile phone usage by a driver in a moving vehicle according to the present invention.
FIG. 2 illustrates a flowchart of one embodiment of a process for detecting and monitoring mobile phone usage by a driver in a moving vehicle according to the present invention.
Detailed description of the Invention
Before explaining the present invention in detail, it is to be understood that the invention is not limited in its application. The nature of invention and the manner in which it is performed is clearly described in the specification. The invention has various components and they are clearly described in the following pages of the complete specification. It is to be understood that the phraseology and terminology employed herein is for the purpose of description and not of limitation.
As used herein, the term “MAR” refers to the Mouth Aspect Ratio, a metric that measures the openness of the mouth.
As used herein, the terms "module" and “model” refer to unique, addressable components of software implemented in hardware which can be developed and modified independently without disturbing (or affecting only to a very small extent) other modules of the software implemented in hardware.
As used herein, the term “device” refers to a unit of hardware, outside or inside the case or housing, that is capable of providing input, receiving output, or both.
As used herein, the term "database" refers to either a body of data, a relational database management system (RDBMS), or both. The database includes any collection of data, including hierarchical databases, relational databases, flat file databases, object-relational databases, object-oriented databases, and any other structured collection of records or data stored in a computer system. The above examples are examples only, and thus are not intended to limit in any way the definition and/or meaning of the term database.
The present disclosure relates to face detection. Face detection is used to find and identify human faces in digital images and video.
Another disclosure of the present invention is a facial landmark model that represents an individual's facial features as multi-dimensional vectors and stores the data of face images.
Another disclosure of the present invention is phone detection, which identifies the presence and location of a mobile device within the video feed.
The present invention, as shown in FIG. 1, is a process for detecting and monitoring mobile phone usage by a driver in a moving vehicle. The present invention includes a hardware device configured with dual cameras (10): one camera device displays the road-side view and the other camera device displays the driver-side cabin view of the vehicle. Said dual cameras (10) are configured with a stored database and a machine learning model. Said machine learning model monitors and analyses the driver's behaviour and facial expressions to ensure attentiveness and a stable emotional state while driving. The driver-side camera device (10) accurately detects, identifies, and tracks the driver's face in the real-time video feed/footage through a face detection model. Said face detection model provides precise coordinates of the driver's face location to detect drivers engaged in a phone conversation or holding a mobile phone while driving.
As per the process described in the present invention, it initiates with a facial landmark detector model which identifies and extracts 98 facial landmarks in real time, including the locations of the eyes, nose, mouth, and ears, from the driver's face detected by the face detection model. The facial landmark detector model is trained on a diverse dataset of drivers in various conditions. Further, the facial landmark detection model is essential for initializing processing techniques such as face tracking and facial expression detection, specifically the detection of mouth landmarks.
A flowchart of a process for detecting and monitoring mobile phone usage by a driver in a moving vehicle according to an embodiment of the present invention is shown in FIG. 2. The process comprises:
S11: receiving real-time video footage of the driver's face and upper body captured by the camera device (10) while driving the vehicle.
S12: performing face detection on each frame of the video to accurately identify the face area through the face detection model.
S13: detecting the driver's eyes, nose, mouth, and other features through the landmark detection model.
S14: calculating the MAR (mouth aspect ratio), a metric that measures the openness of the mouth.
S15: combining the MAR analysis results with the face and phone detection models to identify distracted driving behaviour caused by mobile phone usage.
In the above-mentioned steps S14 and S15, the MAR is calculated by measuring the Euclidean distance between specific mouth landmarks, i.e. measuring the distance between the upper lip and the lower lip, providing a quantitative measure of mouth openness. Simultaneously, the results of said MAR model are combined with the face detection model and the phone detection model to enhance accuracy in identifying distracted driving behaviour caused by mobile phone usage. During a phone conversation, the openness of the driver's mouth increases as he speaks. By incorporating this MAR analysis with the face and phone detection models, the process can more accurately identify situations where the driver is conversing/talking on a mobile device.
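The MAR computation of steps S14 and S15 can be sketched as below. This is an illustrative sketch only: the specific landmark points and the normalization of the lip gap by mouth width are assumptions, since the specification states only that the Euclidean distance between the upper and lower lip is measured.

```python
import numpy as np

def mouth_aspect_ratio(upper_lip, lower_lip, left_corner, right_corner):
    """Return a mouth aspect ratio: vertical lip gap divided by mouth width.

    All four arguments are (x, y) landmark coordinates. The choice of
    landmarks and the width normalization are assumed for illustration.
    """
    upper_lip = np.asarray(upper_lip, dtype=float)
    lower_lip = np.asarray(lower_lip, dtype=float)
    left_corner = np.asarray(left_corner, dtype=float)
    right_corner = np.asarray(right_corner, dtype=float)

    vertical = np.linalg.norm(upper_lip - lower_lip)        # lip opening
    horizontal = np.linalg.norm(left_corner - right_corner)  # mouth width
    return vertical / horizontal

# Example with a partially open mouth: gap of 12 px over a 24 px-wide mouth.
mar = mouth_aspect_ratio((50, 40), (50, 52), (38, 46), (62, 46))  # 0.5
```

A higher MAR indicates a more open mouth; tracked over successive frames while a phone is detected near the face, an elevated MAR suggests the driver is talking.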
S16: performing strategic padding to amplify the detection region around the face detected by the face detection model.
In the above-mentioned step S16, the strategic padding extends the cropped area beyond the face itself, effectively capturing the space around the driver's ears and hands. According to the present invention, the padded face region ensures that any instance where the driver is holding a phone near their ear is included within the captured image. By expanding the detection area beyond the face, the process improves phone detection accuracy and can more effectively identify when the driver's hands are positioned close to their head, potentially indicating the use of a mobile device while driving.
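A minimal sketch of the strategic padding in step S16, assuming an axis-aligned face bounding box and a padding ratio of 0.5 per side; the specification does not fix the amount of padding, so that value is an assumption.

```python
def pad_face_region(box, frame_w, frame_h, pad_ratio=0.5):
    """Expand a face box (x1, y1, x2, y2) by pad_ratio of its width/height
    on each side, clamped to the frame, so that the driver's ears and any
    hand holding a phone near the head fall inside the cropped region.
    pad_ratio=0.5 is an illustrative value, not taken from the specification.
    """
    x1, y1, x2, y2 = box
    pw = int((x2 - x1) * pad_ratio)   # horizontal padding in pixels
    ph = int((y2 - y1) * pad_ratio)   # vertical padding in pixels
    return (max(0, x1 - pw), max(0, y1 - ph),
            min(frame_w, x2 + pw), min(frame_h, y2 + ph))

# A 100x100 face box grows to 200x200 within a 640x480 frame:
padded = pad_face_region((100, 100, 200, 200), 640, 480)  # (50, 50, 250, 250)
```

Clamping to the frame boundaries keeps the crop valid when the face is near an edge of the image.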
S17: integrating the strategic padding output into the phone detection model;
Furthermore, in step S17, said padded face region result/output is used as the input to the next step according to the present invention. In the next step, the phone-holding areas detected by strategic padding are passed into a custom-built, state-of-the-art phone detection model. Said phone detection model is based on deep learning frameworks, such as Convolutional Neural Networks (CNNs) and Region-based CNNs (R-CNNs), to precisely localize and classify mobile phones within the video frame in real time while driving. Said model accurately detects a wide range of phone models, sizes, and colours, even in challenging conditions such as partial occlusion, varying lighting, and complex backgrounds.
S18: processing the padded face region through a classification model based on the YOLOv8n framework to detect and classify the driver holding the mobile phone;
According to step S18 of the present invention, the padded face region is also processed by a specialized classification model based on the YOLOv8n framework. Said model is specifically trained to detect and classify instances where the driver is holding the phone near their ear, a strong indicator of distracted driving behaviour. YOLOv8n is a state-of-the-art classification and object detection model for detecting driver distraction in real-time video footage from the driver-side camera device (10).
Said YOLOv8n model is trained on large and diverse datasets, enabling it to recognize a wide range of object classes with high accuracy. The YOLOv8n framework is trained to distinguish between distracted driving behaviour (the driver holding a phone near their ear) and normal driving behaviour, and allows for fast inference, which is crucial for real-time driver monitoring applications. Furthermore, the YOLOv8n model can be deployed on a variety of hardware platforms, from edge devices to powerful servers, making it a versatile choice for in-vehicle driver monitoring. By leveraging the YOLOv8n framework as the core of the phone detection model, the process of the present invention benefits from its high-performance, efficient, and customizable nature, ensuring accurate and real-time identification of distracted driving behaviour due to mobile phone usage.
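A hedged sketch of how the classifier's per-class confidence scores could be mapped to a distraction verdict in step S18. The class labels, the confidence threshold, and the commented ultralytics call are assumptions for illustration; the specification's custom-trained YOLOv8n weights and label set are not described in detail.

```python
# In a real pipeline the padded crop would be run through a YOLOv8n
# classification model via the ultralytics package, e.g.:
#   from ultralytics import YOLO
#   model = YOLO("yolov8n-cls.pt")   # custom-trained weights in practice
#   result = model(padded_crop)[0]   # result.probs holds per-class scores
# Here we only show how per-class scores map to a behaviour verdict.

# Assumed, illustrative class labels (not from the specification):
DISTRACTED_CLASSES = {"phone_near_ear", "phone_in_hand"}

def classify_behaviour(class_scores, threshold=0.6):
    """class_scores: dict mapping class name -> classifier confidence.

    Returns 'distracted' when the top-scoring class is a distraction class
    and clears the (assumed) confidence threshold, else 'normal'.
    """
    top = max(class_scores, key=class_scores.get)
    if top in DISTRACTED_CLASSES and class_scores[top] >= threshold:
        return "distracted"
    return "normal"
```

Thresholding the winning class, rather than trusting the arg-max alone, is one way to keep false positives low, which the background section identifies as a key requirement.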
S19: integrating the classification model into the phone detection model to detect the driver holding a mobile phone near their ear or in their hands in the driving video footage.
According to step S19 of the present invention, the phone detection model and the classification model work in tandem to provide a comprehensive assessment of the driver's behaviour. The classification model result determines whether the driver is exhibiting distracted driving behaviour by holding the phone near their ear or is engaged in normal driving behaviour. The next step of the present invention is to provide alerts in real time based on the output/result of the classification model.
S20: generating alerts to the driver, fleet operator, or other parties to mitigate the risks associated with distracted driving in real time.
In step S20 of the present invention, if a phone is detected in the driver's hand or near the ear, or the driver is found talking on a mobile phone while driving, a distraction phone alert is provided in real time. Further, based on the results of the model and the calculated mouth aspect ratio (MAR) exceeding a certain threshold over a number of frames, if phone usage is detected for a certain duration, the process generates real-time alerts or notifications to the drivers, fleet operators, or other relevant parties to mitigate the risks associated with distracted driving.
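The frame-count gating in step S20 can be sketched as a sliding window over per-frame verdicts; the window length and hit count below are assumed values, since the specification refers only to "a certain threshold of frames" and "a certain duration".

```python
from collections import deque

class PhoneUsageAlerter:
    """Sliding-window gate for the real-time alert of step S20.

    An alert fires only when phone usage is flagged in at least `min_hits`
    of the last `window` frames, which suppresses one-frame false positives.
    window=30 and min_hits=24 are illustrative assumptions.
    """

    def __init__(self, window=30, min_hits=24):
        self.history = deque(maxlen=window)
        self.min_hits = min_hits

    def update(self, phone_detected, mar_open):
        # A frame counts as a hit when a phone is detected and the MAR is
        # over its threshold, i.e. the driver appears to be talking.
        self.history.append(bool(phone_detected and mar_open))
        return sum(self.history) >= self.min_hits  # True -> raise alert
```

Calling `update()` once per frame keeps the check O(window) and lets the alert clear itself automatically once enough clean frames have passed.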
The performance of the models of the present invention has been extensively tested and optimized to ensure real-time, high-accuracy detection of a mobile phone held by the driver. The present invention provides the exact bounding box coordinates of the detected phone, allowing the process to integrate this information with the face detection and mouth aspect ratio analysis for a comprehensive assessment of the driver's interactions with their mobile device.
The combination of accurate face detection, strategic padding to include the relevant hand-to-ear area, and the classification model, along with the process's ability to continuously learn and adapt, allows the overall solution to reliably identify instances of distracted driving due to mobile phone usage, even in diverse environmental conditions and driving scenarios. The present invention's comprehensive approach, integrating the machine learning model with the driver-side camera (10), significantly enhances road safety by combating distracted driving and mitigating its potential risks.
While various elements of the present invention have been described in detail, it is apparent that modification and adaptation of those elements will occur to those skilled in the art. It is expressly understood, however, that such modifications and adaptations are within the spirit and scope of the present invention as set forth in the following claims.
Claims:
We Claim:
1. A process for detecting and monitoring mobile phone usage by a driver in a moving vehicle, comprising:
calculating the MAR to measure the openness of the mouth, involving:
receiving video footage of the driver's face and upper body captured by a camera device (10) on the vehicle;
performing face detection on each frame of the video to accurately identify the face area through a face detection model;
detecting the driver's eyes, nose, mouth and other facial features through a landmark detection model;
wherein,
combining the MAR analysis results with the face and phone detection models;
performing strategic padding to amplify the detection region around the face detected by the face detection model;
integrating the strategic padding output into the phone detection model;
processing the padded face region through a classification model based on the YOLOv8n framework to detect and classify the driver holding the mobile phone;
integrating the classification model into the phone detection model to detect the driver holding a mobile phone near their ear or in their hands in real-time video footage;
generating real-time alerts to the driver, fleet operators or other parties.
2. The process for detecting and monitoring mobile phone usage by a driver in a moving vehicle as claimed in claim 1, wherein the MAR model measures the distance between the upper lip and the lower lip, providing a quantitative measure of mouth openness through the Euclidean distance between specific mouth landmarks.
3. The process for detecting and monitoring mobile phone usage by a driver in a moving vehicle as claimed in claim 1, wherein the strategic padding extends the cropped area beyond the face itself, effectively capturing the space around the driver's ears and hands to detect the mobile phone.
4. The process for detecting and monitoring mobile phone usage by a driver in a moving vehicle as claimed in claim 1, wherein the YOLOv8n framework detects driver distraction while driving in real-time video footage through the driver-side camera device (10).
5. The process for detecting and monitoring mobile phone usage by a driver in a moving vehicle as claimed in claim 1, wherein the phone detection model comprises deep learning models, i.e. Convolutional Neural Network (CNN) and Region-based CNN (R-CNN) frameworks, to precisely localize and classify mobile phones.
6. A system for detecting and monitoring mobile phone usage by a driver in a moving vehicle, comprising:
a camera device (10) that captures image/video footage of the driver's face and upper body in the vehicle;
a face detection model that identifies the face in each frame of the video;
a facial landmark detection model that detects the driver's eyes, nose, mouth and other facial features;
a phone detection model to detect a mobile phone in the padded face region;
a classification model to detect and classify the driver holding the mobile phone;
a MAR (mouth aspect ratio) model to calculate specific mouth landmarks and measure the openness of the mouth from the facial features detected by the facial landmark detection model;
strategic padding to amplify the detection region around the face detected by the face detection model;
the padded face region processed by the classification model based on the YOLOv8n framework;
the classification model output generating real-time alerts or notifications to the driver, fleet operators or relevant parties.
Dated this 21st day of February 2025