
An AI-Powered Mobile System for Common Disease Identification Using Camera, Sensor Data, and Natural Language Processing

Abstract: The invention relates to a mobile-based system and method for common disease identification using multimodal data inputs. The system integrates a camera module for capturing images of visible symptoms, a natural language processing engine for interpreting user-reported symptoms in voice or text, and a sensor interface for acquiring physiological data such as temperature and heart rate. An edge computing module processes these inputs locally using optimized artificial intelligence models for image classification, symptom extraction, and signal analysis. A diagnostic engine fuses the multimodal data to generate disease likelihood scores, rank possible conditions, and provide actionable recommendations. All data remain encrypted and stored on the device, ensuring privacy and enabling offline operation without reliance on cloud servers. The system supports multilingual interaction, making it accessible to diverse populations. The invention provides real-time, private, and user-friendly disease identification suitable for deployment in both urban and resource-limited settings.


Patent Information

Application #
202541090193
Filing Date
22 September 2025
Publication Number
43/2025
Publication Type
INA
Invention Field
BIO-MEDICAL ENGINEERING
Status
Parent Application

Applicants

SR UNIVERSITY
ANANTHSAGAR, HASANPARTHY (M), WARANGAL URBAN, TELANGANA - 506371, INDIA

Inventors

1. MUDUTHANAPELLI KIRAN
RESEARCH SCHOLAR, DEPARTMENT OF COMPUTER SCIENCE AND ARTIFICIAL INTELLIGENCE, SCHOOL OF COMPUTER SCIENCE AND ARTIFICIAL INTELLIGENCE, SR UNIVERSITY, ANANTHSAGAR, HASANPARTHY (M), WARANGAL URBAN, TELANGANA - 506371, INDIA
2. DR. VIJAYA CHANDRA JADALA
ASSOCIATE PROFESSOR, DEPARTMENT OF COMPUTER SCIENCE AND ARTIFICIAL INTELLIGENCE, SCHOOL OF COMPUTER SCIENCE AND ARTIFICIAL INTELLIGENCE, SR UNIVERSITY, ANANTHSAGAR, HASANPARTHY (M), WARANGAL URBAN, TELANGANA - 506371, INDIA

Specification

Description:
FIELD OF THE INVENTION
The present invention relates to the field of mobile health technologies and artificial intelligence, particularly to a system and method for identifying common diseases using multimodal data inputs. The invention integrates a mobile camera, natural language processing of user-reported symptoms, and sensor-derived physiological data, all processed locally through edge AI models to ensure privacy, real-time performance, and offline functionality.
BACKGROUND OF THE INVENTION
This invention relates to a system and method for augmenting mobile devices with artificial intelligence (AI), deep learning, and natural language processing (NLP) to detect and identify common diseases. The system uses a mobile camera to capture images of visible symptoms and accepts user-reported symptoms through voice or text. These inputs are processed locally using edge AI models to provide immediate, privacy-preserving, and reliable health assessment.
US2022369925A1: A method for identifying a disease affected area. The method includes activating a geolocation device of an electronic communication device, determining a current geolocation from the geolocation device, querying a disease database from the electronic communication device, to identify one or more diseases associated with the current geolocation, and generating a graphical display on a display of the electronic communication device displaying a risk rating associated with each of the one or more identified diseases associated with the current geolocation.
US20230326016: Artificial intelligence for detecting a medical condition using facial images. In an embodiment, a convolutional neural network is applied to a facial image to identify facial landmarks, which are then used to align the facial image to a standard template. Next, the aligned facial image is projected into multi-views, and a second convolutional neural network is applied to the multi-views to extract global features. A facial-omics model is also applied to the aligned facial image to extract local features. A classification model is applied to the global features and the local features to predict one or more clinical parameters and/or medical conditions.
Conventional mobile health applications often rely on a single type of input, such as text-based symptom entry or image-based analysis, and are heavily dependent on cloud servers for processing. These approaches restrict diagnostic accuracy, compromise data privacy, and limit usability in low-connectivity or resource-constrained environments. Existing systems are also narrowly focused on specific disease categories and do not provide multilingual or inclusive interfaces for diverse populations.
The present invention solves these problems by enabling multimodal data integration, including visual, textual, vocal, and sensor-based inputs, which are processed directly on the device through edge AI models. This architecture eliminates dependency on internet connectivity, preserves user privacy by keeping sensitive health data local, and delivers real-time diagnostic insights. The system further supports multilingual natural language interfaces, making it accessible to users across varying literacy levels and language backgrounds.
SUMMARY OF THE INVENTION
This summary is provided to introduce a selection of concepts, in a simplified format, that are further described in the detailed description of the invention.
This summary is neither intended to identify key or essential inventive concepts of the invention, nor is it intended to determine the scope of the invention.
The invention discloses an AI-powered mobile system for disease identification that leverages multimodal data inputs to improve diagnostic accuracy. The system integrates a mobile camera for capturing images of visible symptoms, a natural language processing engine for interpreting symptom descriptions provided by users in voice or text, and a sensor interface to acquire physiological data such as heart rate, body temperature, or motion signals.
A lightweight edge computing module processes the captured inputs using optimized AI models, including convolutional neural networks for image classification, compact NLP models for symptom extraction, and signal processing techniques for sensor analysis. The multimodal data are fused in a diagnostic engine that generates a disease likelihood score, ranks possible conditions, and recommends next steps such as hydration, rest, or medical consultation.
Unlike prior systems that rely on cloud computing, the present invention ensures that all analysis occurs locally on the mobile device, preserving privacy and enabling offline functionality. The invention supports multilingual interaction, allowing users to input symptoms in regional languages through either voice or text. This makes the system highly inclusive and effective in diverse, rural, or resource-limited environments.
The invention further provides a scalable framework that can be continuously updated with new AI models to expand disease coverage, ensuring long-term adaptability and relevance in mobile health care delivery.
To further clarify advantages and features of the present invention, a more particular description of the invention will be rendered by reference to specific embodiments thereof, which are illustrated in the appended drawings. It is appreciated that these drawings depict only typical embodiments of the invention and are therefore not to be considered limiting of its scope. The invention will be described and explained with additional specificity and detail with the accompanying drawings.
Mobile health technologies have grown rapidly, but many existing solutions depend on cloud-based services, single-mode input, and limited diagnostic capabilities. The invention aims to provide an AI-driven mobile system able to identify common diseases by integrating multimodal data, visual, verbal, and sensor-based, on a single mobile device. It proposes a novel edge-AI architecture that enables real-time health assessment using camera images (e.g., skin, eye, and tongue symptoms), natural-language symptom descriptions (text or voice), and physiological data from built-in or wearable sensors. The system uses lightweight deep learning models for image classification, an NLP engine for symptom extraction, and signal processing algorithms for sensor analysis. All data are processed locally to ensure privacy and offline operation. In studies simulating common conditions such as influenza, skin infection, and dehydration, the system demonstrated high diagnostic accuracy and real-time response even in low-connectivity environments. The invention thus provides a user-friendly, multilingual, and privacy-preserving solution that brings AI-driven disease identification to underserved populations, marking significant progress in mobile health service delivery.
BRIEF DESCRIPTION OF THE DRAWINGS
The illustrated embodiments of the subject matter will be understood by reference to the drawings, wherein like parts are designated by like numerals throughout. The following description is intended only by way of example, and simply illustrates certain selected embodiments of devices, systems, and methods that are consistent with the subject matter as claimed herein, wherein:
FIGURE 1: SYSTEM ARCHITECTURE
The figures depict embodiments of the present subject matter for the purposes of illustration only. A person skilled in the art will easily recognize from the following description that alternative embodiments of the structures and methods illustrated herein may be employed without departing from the principles of the disclosure described herein.
DETAILED DESCRIPTION OF THE INVENTION
The detailed description of various exemplary embodiments of the disclosure is described herein with reference to the accompanying drawings. It should be noted that the embodiments are described herein in such details as to clearly communicate the disclosure. However, the amount of details provided herein is not intended to limit the anticipated variations of embodiments; on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the scope of the present disclosure as defined by the appended claims.
It is also to be understood that various arrangements may be devised that, although not explicitly described or shown herein, embody the principles of the present disclosure. Moreover, all statements herein reciting principles, aspects, and embodiments of the present disclosure, as well as specific examples, are intended to encompass equivalents thereof.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments. As used herein, the singular forms "a," "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises," "comprising," "includes" and/or "including," when used herein, specify the presence of stated features, integers, steps, operations, elements and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof.
It should also be noted that in some alternative implementations, the functions/acts noted may occur out of the order noted in the figures. For example, two figures shown in succession may, in fact, be executed concurrently or may sometimes be executed in the reverse order, depending upon the functionality/acts involved.
In addition, the descriptions of "first", "second", "third", and the like in the present invention are used for the purpose of description only, and are not to be construed as indicating or implying their relative importance or implicitly indicating the number of technical features indicated. Thus, features defining "first" and "second" may include at least one of the features, either explicitly or implicitly.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which example embodiments belong. It will be further understood that terms, e.g., those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
1. Multimodal Input Collection:
The user interacts with the system through a mobile app.
Inputs come from three main channels:
Mobile Camera Module: The camera captures images of visible symptoms (e.g., skin rash, eye redness, tongue color).
Mobile Voice/Text Module: The user describes symptoms in natural language (voice or typed), which is processed by an embedded NLP engine.
Mobile/Wearable Sensor Interface: Collects health-related data from mobile sensors or wearable devices (e.g., heart rate, temperature, accelerometer for movement). These three input streams are illustrated in the sketch below.
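The specification does not prescribe a particular data structure for these inputs; the following is a minimal illustrative sketch in Python (the class and field names are hypothetical) of how one session's multimodal inputs might be bundled before pre-processing.

```python
from dataclasses import dataclass, field
from typing import Optional
import numpy as np

@dataclass
class MultimodalInput:
    """One diagnostic session's raw inputs, gathered by the app."""
    symptom_image: Optional[np.ndarray] = None   # HxWx3 RGB array from the camera
    symptom_text: str = ""                       # typed text, or speech-to-text output
    language: str = "en"                         # user's interface language
    sensor_readings: dict = field(default_factory=dict)  # e.g. {"heart_rate": [...]}

# Example session: a user photographs a rash, types a description,
# and the app reads heart rate from a paired wearable.
session = MultimodalInput(
    symptom_image=np.zeros((224, 224, 3), dtype=np.uint8),
    symptom_text="I have a red rash on my arm and mild fever",
    language="en",
    sensor_readings={"heart_rate": [88, 90, 87], "temp_c": [38.1]},
)
```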
2. Pre-processing of Inputs:
Image Data: Images are captured, compressed, and normalized, then passed to a lightweight convolutional neural network (CNN) such as EfficientNet-Lite or MobileNet.
Voice/Text Data: Voice input is first converted to text via embedded speech-to-text; the text is then analyzed with a compact NLP model such as DistilBERT or a rule-based symptom extractor.
Sensor Data: Readings are time-stamped and filtered using signal processing techniques to remove noise (e.g., moving-average smoothing, normalization); a sketch of these steps follows this list.
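A minimal sketch of the normalization and smoothing steps named above, assuming a 224x224 input resolution (typical for MobileNet-class models) and a 3-sample smoothing window; the trained CNN and NLP models themselves are not reproduced here.

```python
import numpy as np

def preprocess_image(img: np.ndarray, size: int = 224) -> np.ndarray:
    """Nearest-neighbour resize and [0, 1] normalization, mimicking the
    input scaling a mobile CNN such as MobileNet typically expects."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    resized = img[rows][:, cols]
    return resized.astype(np.float32) / 255.0

def smooth_sensor(signal: list[float], window: int = 3) -> np.ndarray:
    """Moving-average smoothing followed by z-score normalization,
    as named in the specification for sensor noise reduction."""
    x = np.asarray(signal, dtype=np.float32)
    kernel = np.ones(window) / window
    smoothed = np.convolve(x, kernel, mode="valid")
    return (smoothed - smoothed.mean()) / (smoothed.std() + 1e-8)

raw = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
print(preprocess_image(raw).shape)            # (224, 224, 3)
print(smooth_sensor([88, 90, 87, 95, 91]))    # normalized, smoothed trace
```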
3. Multimodal Fusion and Edge AI Analysis:
An on-device inference engine receives the pre-processed data.
A fusion layer aligns and merges the inputs using:
- Feature embedding and concatenation
- Attention mechanisms (optional) to weigh inputs based on relevance
A final disease prediction model (e.g., a shallow feed-forward network or ensemble classifier) outputs a disease likelihood score or direct classification, as sketched below.
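A sketch of the fusion and classification step just described, with random weights standing in for trained parameters; embedding sizes, the attention formulation, and the five-class output are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z: np.ndarray) -> np.ndarray:
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical per-modality embeddings, as produced by the CNN,
# NLP model, and sensor pipeline respectively.
img_emb    = rng.normal(size=64)
text_emb   = rng.normal(size=64)
sensor_emb = rng.normal(size=64)
modalities = np.stack([img_emb, text_emb, sensor_emb])    # (3, 64)

# Optional attention: score each modality, then take a weighted sum.
attn_vec = rng.normal(size=64)                  # learned in a real system
weights  = softmax(modalities @ attn_vec)       # relevance per modality
fused    = np.concatenate([weights @ modalities,           # attended summary
                           modalities.reshape(-1)])        # plain concatenation

# Shallow feed-forward classifier over the fused representation.
n_diseases = 5
W1, b1 = rng.normal(size=(32, fused.size)), np.zeros(32)
W2, b2 = rng.normal(size=(n_diseases, 32)), np.zeros(n_diseases)
hidden = np.maximum(W1 @ fused + b1, 0.0)                  # ReLU
likelihoods = softmax(W2 @ hidden + b2)                    # disease likelihood scores
print(likelihoods)
```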
4. Diagnosis Generation and Feedback:
The AI app provides:
A ranked list of potential diseases
A confidence score for each
Suggested next actions (e.g., hydration, rest, see a doctor)
Output is shown visually with icons and graphs, and can optionally be read aloud via text-to-speech (TTS) for accessibility.
Data stays entirely on the device, ensuring privacy and offline capability.
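The ranking-and-feedback step might look like the following sketch; the disease labels and advice strings are illustrative placeholders, not outputs of the specification's trained models.

```python
import numpy as np

DISEASES = ["influenza", "skin infection", "dehydration", "common cold", "allergy"]
ADVICE = {
    "influenza": "rest and monitor temperature; see a doctor if fever persists",
    "skin infection": "keep the area clean; consult a physician",
    "dehydration": "increase fluid intake",
    "common cold": "rest and hydrate",
    "allergy": "avoid the suspected trigger; consider medical advice",
}

def rank_diagnoses(likelihoods: np.ndarray, top_k: int = 3):
    """Return the top-k (disease, confidence, advice) triples for display."""
    order = np.argsort(likelihoods)[::-1][:top_k]
    return [(DISEASES[i], float(likelihoods[i]), ADVICE[DISEASES[i]]) for i in order]

scores = np.array([0.42, 0.08, 0.31, 0.12, 0.07])
for name, conf, tip in rank_diagnoses(scores):
    print(f"{name}: {conf:.0%} - {tip}")
```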
The invention comprises a mobile-based system designed to identify common diseases through multimodal data integration. The system incorporates a mobile camera, natural language processing engine, and sensor interface, all connected to an on-device edge AI module and diagnostic engine.
The camera module is configured to capture images of physical symptoms such as skin rashes, eye discoloration, or tongue coating. These images are pre-processed for normalization, noise reduction, and feature extraction before being analyzed by a lightweight convolutional neural network optimized for mobile devices.
The natural language processing module interprets user-reported symptoms received via voice or text. For voice inputs, the system employs embedded speech-to-text conversion, followed by compact NLP models such as rule-based extractors or lightweight transformer models to derive meaningful clinical information from user descriptions. The engine supports multilingual processing, enabling users to describe symptoms in local languages, which significantly enhances accessibility.
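A minimal sketch of the rule-based extractor mentioned above as an alternative to a transformer model; the keyword tables, including the romanized Hindi and Telugu entries, are illustrative assumptions.

```python
# Keyword tables per symptom, per language (illustrative, not exhaustive).
SYMPTOM_KEYWORDS = {
    "fever": {"en": ["fever", "temperature"], "hi": ["bukhar"], "te": ["jwaram"]},
    "rash":  {"en": ["rash", "red spots"],    "hi": ["chakatte"], "te": ["daddurlu"]},
    "cough": {"en": ["cough"],                "hi": ["khansi"],   "te": ["daggu"]},
}

def extract_symptoms(text: str, language: str = "en") -> list[str]:
    """Match known symptom keywords for the user's language against the
    (speech-to-text or typed) symptom description."""
    text = text.lower()
    found = []
    for symptom, by_lang in SYMPTOM_KEYWORDS.items():
        if any(kw in text for kw in by_lang.get(language, [])):
            found.append(symptom)
    return found

print(extract_symptoms("I have a mild fever and a dry cough"))  # ['fever', 'cough']
print(extract_symptoms("mujhe bukhar hai", language="hi"))      # ['fever']
```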
The sensor interface collects physiological signals from embedded or wearable devices connected to the mobile system. Examples include body temperature, heart rate, blood oxygen levels, and accelerometer readings. The signals are filtered through signal processing techniques, including smoothing and normalization, to reduce noise before analysis.
The edge AI module integrates the outputs of these three inputs. A fusion layer combines image embeddings, extracted symptom vectors, and sensor features into a unified representation. Attention mechanisms may be employed to assign higher weights to more relevant data. The fused representation is analyzed by a disease classification model to generate diagnostic predictions.
The diagnostic engine provides a ranked list of probable diseases with corresponding confidence scores. The system also offers user-friendly feedback in both visual and auditory formats. For example, recommendations such as hydration, rest, or consulting a physician may be displayed as icons, graphs, or spoken text through a text-to-speech module.
Importantly, the invention operates entirely on-device, with no need for cloud connectivity. This ensures data privacy, as all sensitive information remains locally stored and encrypted. Offline functionality makes the system particularly suited for deployment in rural or low-resource areas.
The invention is designed to be energy-efficient, with neural network models quantized for low computational requirements. This allows the system to function effectively on standard smartphones without causing excessive battery drain.
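The specification names quantization but not a specific scheme; the following sketch shows symmetric int8 post-training quantization, the general technique behind such low-power models, applied to a single weight tensor.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric post-training quantization of a weight tensor to int8."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.default_rng(1).normal(size=(32, 64)).astype(np.float32)
q, s = quantize_int8(w)
err = np.abs(w - dequantize(q, s)).max()
print(f"int8: {q.nbytes} bytes vs float32: {w.nbytes} bytes, max error {err:.4f}")
```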
The system is scalable, enabling the inclusion of new disease categories through periodic model updates. For example, while the initial implementation may focus on conditions like flu, dehydration, or skin infections, future updates can incorporate cardiovascular, respiratory, or gastrointestinal diseases.
The architecture supports personalization, as the system can adapt based on historical user interactions and symptom patterns, enhancing diagnostic accuracy over time.
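One simple way such personalization could work is to blend the model's likelihoods with a smoothed prior built from the user's past confirmed conditions; the blending rule and the alpha value below are assumptions, since the specification only states that the system adapts to historical symptom patterns.

```python
import numpy as np

def personalize(likelihoods: np.ndarray, history_counts: np.ndarray,
                alpha: float = 0.2) -> np.ndarray:
    """Blend model output with a Laplace-smoothed prior from user history."""
    prior = (history_counts + 1) / (history_counts + 1).sum()
    adjusted = (1 - alpha) * likelihoods + alpha * prior
    return adjusted / adjusted.sum()

scores = np.array([0.42, 0.08, 0.31, 0.12, 0.07])
history = np.array([3, 0, 0, 1, 0])    # e.g. three past influenza episodes
print(personalize(scores, history))
```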
The invention further emphasizes inclusivity through its multilingual natural language interface, making it usable by populations with limited literacy or non-English language skills. The voice-based interaction allows even non-technical users to engage easily.
In addition, the invention incorporates strong privacy-preserving measures. Data encryption ensures that health records remain secure, and all analysis is restricted to the device, eliminating risks associated with transmitting personal information to external servers.
By combining visual analysis, linguistic interpretation, and physiological monitoring, the system delivers a comprehensive approach to disease identification. This multimodal integration sets it apart from prior art solutions that typically rely on a single input modality.
Best Method of Working
The best method of working involves deploying the invention as a mobile application integrated with device sensors. The application runs on smartphones with embedded camera and audio capabilities and connects to external wearable devices via Bluetooth for additional sensor inputs.
Users initiate the process by capturing images of visible symptoms through the mobile camera and describing other symptoms using either voice or text. The natural language processing engine extracts clinical details, while the sensor interface collects physiological data. All inputs are fused within the edge AI module for multimodal analysis.
The diagnostic engine then generates disease predictions with confidence scores and displays results in a user-friendly format, accompanied by actionable suggestions. Data remains encrypted and localized to the device, ensuring privacy and offline functionality.
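The end-to-end flow of this best method can be summarized in the following sketch, where each stub function stands in for the corresponding sketch earlier in the description (random vectors replace real model outputs).

```python
import numpy as np

def run_cnn(image): return np.random.default_rng(0).normal(size=64)
def run_nlp(text): return np.random.default_rng(1).normal(size=64)
def run_sensor_pipeline(readings): return np.random.default_rng(2).normal(size=64)

def fuse_and_classify(embeddings):
    z = np.concatenate(embeddings) @ np.random.default_rng(3).normal(size=(192, 5))
    e = np.exp(z - z.max())
    return e / e.sum()

def diagnose(image, text, readings):
    """Capture -> per-modality models -> fusion -> ranked output,
    all executed locally on the handset."""
    embeddings = [run_cnn(image), run_nlp(text), run_sensor_pipeline(readings)]
    scores = fuse_and_classify(embeddings)
    return sorted(enumerate(scores), key=lambda p: -p[1])

ranked = diagnose(image=np.zeros((224, 224, 3)),
                  text="fever and rash",
                  readings={"heart_rate": [92]})
print(ranked[:3])   # top-3 (class index, confidence) pairs
```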
The innovation of this invention lies in the integrated use of multimodal input data (visual, verbal, and sensor-based), processed entirely on-device using edge AI, to identify common diseases in real time. Unlike existing health applications, which depend on single-mode input (e.g., only text or images) and require constant cloud connectivity, the system combines camera-captured symptoms, natural-language symptom descriptions (through voice or text), and physiological sensor data into a unified diagnostic framework. In addition, the inclusion of a lightweight, multilingual natural language processing engine and low-power neural networks enables offline operation, making the system especially suitable for rural, low-resource, or privacy-sensitive environments. The ability to generate accurate, fast disease assessments without uploading user data to external servers distinguishes it significantly from current commercial or research-based health solutions.
ADVANTAGES OF THE INVENTION
Multimodal health input integration
Accepts image, voice/text, and sensor data for richer, more accurate diagnosis.
Unlike traditional single-input approaches, it gives the user a comprehensive view of their health.
On-device AI (edge AI)
All data are processed locally on the user's device using lightweight AI models.
Enables offline use, making the system ideal for rural, remote, or low-connectivity areas.
Enhanced user privacy
No cloud upload is required; all personal health data remain on the device.
Data handling complies with security regulations and builds user trust.
Multilingual natural language interface
Users can speak or type symptoms in their local language.
Makes health services more accessible to low-literacy or non-English-speaking populations.
Real-time diagnosis and recommendations
Provides rapid responses without delays from internet or cloud round-trips.
Useful in time-sensitive conditions, such as infection or fever detection.
Low resource consumption
Designed to run on a regular smartphone without high-end specifications or excessive battery drain.
Uses mobile-optimized CNN and NLP models suited to on-device CPUs.
Wide diagnostic range
Able to identify many common diseases, not limited to specific domains such as dermatology or mental health.
Scalable to additional conditions through model updates.
Actionable, confidence-based output
Presents disease likelihoods in an easy-to-understand format, with visuals and suggestions (e.g., see a doctor, hydrate, rest).
Improves health decision-making without replacing doctors.

Claims:
1. A mobile-based disease identification system comprising:
a camera module configured to capture visual data of physical symptoms;
a natural language processing engine configured to interpret user input provided via voice or text describing symptoms;
a sensor interface configured to collect physiological data from mobile sensors or connected wearable devices;
an edge computing module comprising one or more AI models for analyzing visual data, natural language data, and sensor data in combination; and
a diagnostic engine configured to generate a disease likelihood score and provide recommendations based on the multimodal analysis,
wherein the analysis and diagnosis occur locally on the mobile device without requiring cloud-based processing.
2. The system as claimed in claim 1, wherein the camera module captures dermatological features, ocular conditions, or tongue coloration for visual diagnosis.
3. The system as claimed in claim 1, wherein the natural language processing engine supports multilingual input and converts spoken voice input to text using an embedded model.
4. The system as claimed in claim 1, wherein the sensor interface collects temperature, heart rate, or motion data via accelerometer or wearable sensors.

5. The system as claimed in claim 1, wherein the edge computing module utilizes quantized neural networks optimized for low-power mobile devices.
6. The system as claimed in claim 1, wherein the diagnostic engine displays disease confidence scores and suggests follow-up actions such as hydration, rest, or medical consultation.
7. The system as claimed in claim 1, wherein all data remains encrypted and stored locally on the device to preserve privacy.
8. The system as claimed in claim 1, wherein the diagnostic engine provides visual, graphical, and auditory outputs to enhance user accessibility; and the system is operable offline without requiring continuous internet connectivity.
9. The system as claimed in claim 1, wherein the architecture is scalable to include new disease categories through AI model updates.
10. A method for disease identification using a mobile device, the method comprising:
capturing images of physical symptoms using a camera module;
receiving symptom descriptions from a user via voice or text and processing them using a natural language processing engine;
collecting physiological data through a sensor interface connected to mobile or wearable sensors;
pre-processing the captured inputs for normalization and noise reduction;
fusing the pre-processed data through an edge computing module comprising artificial intelligence models;
generating disease predictions and confidence scores using a diagnostic engine; and
providing recommendations to the user in visual, textual, or auditory format,
wherein all processing and diagnosis occur locally on the mobile device.

Documents

Application Documents

# Name Date
1 202541090193-STATEMENT OF UNDERTAKING (FORM 3) [22-09-2025(online)].pdf 2025-09-22
2 202541090193-REQUEST FOR EARLY PUBLICATION(FORM-9) [22-09-2025(online)].pdf 2025-09-22
3 202541090193-POWER OF AUTHORITY [22-09-2025(online)].pdf 2025-09-22
4 202541090193-FORM-9 [22-09-2025(online)].pdf 2025-09-22
5 202541090193-FORM FOR SMALL ENTITY(FORM-28) [22-09-2025(online)].pdf 2025-09-22
6 202541090193-FORM 1 [22-09-2025(online)].pdf 2025-09-22
7 202541090193-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [22-09-2025(online)].pdf 2025-09-22
8 202541090193-EVIDENCE FOR REGISTRATION UNDER SSI [22-09-2025(online)].pdf 2025-09-22
9 202541090193-EDUCATIONAL INSTITUTION(S) [22-09-2025(online)].pdf 2025-09-22
10 202541090193-DRAWINGS [22-09-2025(online)].pdf 2025-09-22
11 202541090193-DECLARATION OF INVENTORSHIP (FORM 5) [22-09-2025(online)].pdf 2025-09-22
12 202541090193-COMPLETE SPECIFICATION [22-09-2025(online)].pdf 2025-09-22