Abstract: A real-time falsified footage detection system comprises a capturing unit that acquires real-time footage for falsification detection, a communication unit that transmits the footage to a remotely located station, and a computational unit equipped with processors and accelerators such as GPUs or TPUs. The computational unit executes instructions to preprocess the footage by normalizing lighting, noise, and resolution, followed by analysis using a detection protocol. The protocol includes convolutional neural networks (CNNs) for spatial analysis to detect visual artifacts, and temporal analysis using recurrent neural networks (RNNs) or temporal convolution networks (TCNs) to detect frame inconsistencies or unnatural movements. Frames with detected falsification are flagged, confidence scores are generated for each flagged frame, and an encrypted alert is transmitted to the station with the flagged footage and confidence scores. The alert includes a visual overlay of the flagged segments for clear identification of tampered content.
Description: FIELD OF THE INVENTION
[0001] The present invention relates to a real-time falsified footage detection system that identifies manipulated content in video footage by analyzing both visual artifacts and temporal inconsistencies in real time, ensuring high accuracy and reliability in the detection of falsified footage.
BACKGROUND OF THE INVENTION
[0002] With the rise of digital media and the increasing accessibility of video editing tools, the authenticity of footage has become a critical concern across multiple sectors, including security, journalism, law enforcement, and social media. The ability to detect falsified or manipulated video content is essential in ensuring the integrity of visual evidence and preventing the spread of misleading or harmful information.
[0003] Traditional methods for identifying video falsification, such as manual inspection or basic tools, often prove inadequate in addressing the complexities of modern digital manipulation techniques. As manipulation methods evolve, it becomes increasingly difficult to differentiate between authentic and altered footage using conventional approaches. This has led to an urgent need for more reliable and automated means that efficiently detect falsifications in real time, particularly in dynamic or fast-paced environments where time is of the essence. Furthermore, ensuring the secure and reliable transmission of detected falsifications is of utmost importance, particularly in situations where tampered footage influences critical decisions or public perception. This requires a solution that not only identifies falsifications but also transmits the results in a secure and encrypted manner to prevent interference or unauthorized access.
[0004] US11551474B2 discloses an invention in which detection of whether a video is a fake video derived from an original video and altered is undertaken using both image analysis and frequency domain analysis of one or more frames of the video. The analysis may be implemented using neural networks.
[0005] US20200065526A1 discloses techniques for digital video authentication (and prevention of fake videos). First pixels within a first image frame of the video clip representing an area of interest within the first image frame may be identified. The area of interest may correspond to a person's face or another object. A first frame signature may be calculated based on the first pixels. Second pixels within a second image frame of the video clip representing an area of interest within the second image frame may be identified. A second hash value may be calculated based on the second pixels. The authenticity of the video clip may be determined by comparing the first and second hash values against data extracted from third pixels within the first image frame that do not correspond to the area of interest in the first image frame.
[0006] In order to overcome the aforementioned drawbacks, there exists a need in the art to develop a solution that accurately detects falsified video footage, provides real-time alerts, and securely communicates the results to relevant personnel, thereby ensuring the credibility of visual content in an increasingly digital world.
OBJECTS OF THE INVENTION
[0007] An object of the present invention is to develop a system that is capable of detecting falsification in real-time in video footage to ensure the authenticity of the content being analyzed.
[0008] Another object of the present invention is to develop a system that is capable of performing analysis of visual artifacts including irregular lighting, unnatural textures, and edge inconsistencies to identify potential falsifications in individual frames of the footage.
[0009] Another object of the present invention is to develop a system that is capable of detecting temporal inconsistencies by analyzing the movement patterns and frame transitions across the footage to identify unnatural changes or manipulation.
[0010] Yet another object of the present invention is to develop a system that is capable of enabling secure transmission of alerts and flagged footage for ensuring that the detection results, along with flagged frames and confidence scores, are transmitted safely and protected from tampering.
[0011] The foregoing and other objects, features, and advantages of the present invention will become readily apparent upon further review of the following detailed description of the preferred embodiment as illustrated in the accompanying drawings.
SUMMARY OF THE INVENTION
[0012] The present invention relates to a real-time falsified footage detection system to continuously detect falsifications by incorporating new data and evolving detection techniques for ensuring up-to-date performance in detecting falsification, thus eliminating the chances of tampering with any footage.
[0013] According to an embodiment of the present invention, a real-time falsified footage detection system comprises a capturing unit to acquire footage, a communication unit to transmit the data to a remote station, and a computational unit with processors and accelerators (such as GPUs or TPUs). The system processes the footage by pre-processing it to normalize variations in lighting, noise, and resolution, preparing it for accurate analysis, and then uses detection protocols incorporating convolutional neural networks (CNNs) to identify visual artifacts such as irregular lighting, unnatural textures, and inconsistencies in edges, along with temporal analysis via recurrent neural networks (RNNs) or temporal convolution networks (TCNs) to detect unnatural movements or frame inconsistencies. The system flags frames with detected falsifications and generates confidence scores to quantify the likelihood of manipulation. Upon detection, the system sends an alert via an encrypted communication channel to ensure secure transmission of the flagged frames and their corresponding confidence scores.
[0014] According to another embodiment of the present invention, the proposed invention further comprises a method for detecting falsification in footage to ensure accurate and reliable analysis. Real-time footage is acquired for detection. The acquired footage is then pre-processed to normalize variations in lighting, noise, and resolution, which allows for more effective analysis. The system analyzes each frame of the footage using a detection protocol that incorporates convolutional neural networks (CNNs) to identify visual artifacts such as irregular lighting, unnatural textures, and edge inconsistencies, while also examining temporal patterns with methods like recurrent neural networks (RNNs) or temporal convolution networks (TCNs) to detect unnatural movements or frame inconsistencies. Frames that exhibit falsification are flagged, and a confidence score is generated to quantify the likelihood of manipulation. Finally, an alert, along with the flagged frames and confidence scores, is transmitted securely via an encrypted communication channel to ensure the integrity and safety of the results.
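By way of illustration only, the acquire-preprocess-analyze-flag workflow summarized above may be sketched in simplified form. The function names, the stand-in detector, and the 0.8 threshold below are assumptions for exposition, not part of the disclosed system, which employs trained CNN/RNN/TCN models rather than the crude anomaly proxy shown here.

```python
# Illustrative sketch of the described workflow; frames are modeled
# as flat lists of pixel intensities for simplicity.

def preprocess(frame):
    """Normalize pixel values to the [0, 1] range (a stand-in for the
    lighting/noise/resolution normalization described above)."""
    lo, hi = min(frame), max(frame)
    span = (hi - lo) or 1
    return [(p - lo) / span for p in frame]

def detect(frame):
    """Stand-in detector returning a confidence score in [0, 1].
    A real system would run CNN/RNN/TCN models here."""
    mean = sum(frame) / len(frame)
    return max(p - mean for p in frame)  # crude anomaly proxy

def analyze_footage(frames, threshold=0.8):
    """Flag frame indices whose score exceeds the threshold,
    pairing each with its confidence score."""
    flagged = []
    for i, frame in enumerate(frames):
        score = detect(preprocess(frame))
        if score > threshold:
            flagged.append((i, round(score, 3)))
    return flagged
```

The flagged list (frame index plus confidence score) is what the alert step would then transmit over the encrypted channel.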
[0015] While the invention has been described and shown with particular reference to the preferred embodiment, it will be apparent that variations might be possible that would fall within the scope of the present invention.
BRIEF DESCRIPTION OF THE DRAWINGS
[0016] These and other features, aspects, and advantages of the present invention will become better understood with regard to the following description, appended claims, and accompanying drawings where:
Figure 1 illustrates a schematic diagram of a real-time falsified footage detection system; and
Figure 2 illustrates a flow chart depicting workflow of the proposed invention.
DETAILED DESCRIPTION OF THE INVENTION
[0017] The following description includes the preferred best mode of one embodiment of the present invention. It will be clear from this description of the invention that the invention is not limited to these illustrated embodiments but that the invention also includes a variety of modifications and embodiments thereto. Therefore, the present description should be seen as illustrative and not limiting. While the invention is susceptible to various modifications and alternative constructions, it should be understood that there is no intention to limit the invention to the specific form disclosed, but, on the contrary, the invention is to cover all modifications, alternative constructions, and equivalents falling within the spirit and scope of the invention as defined in the claims.
[0018] In any embodiment described herein, the open-ended terms "comprising," "comprises," and the like (which are synonymous with "including," "having," and "characterized by") may be replaced by the respective partially closed phrases "consisting essentially of," "consists essentially of," and the like, or the respective closed phrases "consisting of," "consists of," and the like.
[0019] As used herein, the singular forms “a,” “an,” and “the” designate both the singular and the plural, unless expressly stated to designate the singular only.
[0020] The present invention relates to a real-time falsified footage detection system that accurately identifies manipulated content in video footage by analyzing both visual artifacts and temporal inconsistencies in real time, while ensuring the secure and efficient transmission of flagged footage, confidence scores, and alerts via encrypted channels, thereby protecting the integrity of the detection process.
[0021] Referring to Figure 1 and 2, a schematic diagram of a real-time falsified footage detection system and a flow chart depicting workflow of the proposed invention are illustrated, respectively.
[0022] The system disclosed herein includes a capturing unit to capture footage, or video data, for analysis to detect potential alterations or tampering. The capturing unit obtains high-quality footage that is to be assessed for visual and temporal inconsistencies indicative of falsification. To accomplish this, the capturing unit incorporates various image capturing means that vary in terms of the technology and capabilities involved. The image capturing means refers to the various means used to capture visual data in the form of images or video frames. These include, but are not limited to, cameras, video recording devices, or even specialized equipment customized to specific types of footage, whether for high-definition video, infrared imagery, or other specialized recording formats. The capturing unit typically includes at least one of these components.
[0023] The most common form of image capturing means is likely to be cameras, such as digital cameras, HD cameras, or 4K cameras, capable of capturing high-resolution images and video. These cameras may be static (fixed in one position) or dynamic (mounted on a mobile unit), providing the necessary footage to detect falsification through visual analysis. The higher the resolution and frame rate of the cameras, the more detailed and reliable the captured data is, providing more granular information for the detection protocols.
[0024] In some cases, the capturing unit includes thermal cameras or infrared (IR) sensors. These types of image capturing means are useful in scenarios where falsification involves manipulating thermal patterns or creating inconsistencies in temperature distribution that are visible only through infrared imaging. For example, the use of heat signatures in detecting tampered footage requires specialized equipment capable of capturing in these non-visible spectrums. Thermal cameras are especially valuable when detecting manipulation of video in contexts where lighting conditions or standard visual cues are altered in a manner that is imperceptible to traditional cameras but visible through heat patterns. As another example, falsification may involve modifying objects in the scene in a way that disrupts the normal depth relationships, and a stereo or 3D camera is able to flag such inconsistencies by comparing the captured depth data against expected patterns. The capturing unit is not limited to just one type of image capturing means, but rather may integrate several different means that complement each other for improved accuracy.
[0025] In terms of functionality, the capturing unit is closely linked with a communication unit that establishes a reliable and secure transmission link between the capturing unit, which captures footage for falsification detection, and a remotely located station where the footage is further analyzed. This requires the capturing unit to have the capability to encode and transmit high-quality video data, potentially over long distances, using suitable wireless or wired communication protocols. For real-time applications, the transmission is continuous and capable of handling high bandwidth, especially when dealing with high-resolution video or large datasets generated by 3D or infrared cameras. The communication unit enables the real-time transfer of data from the capturing unit to the station, ensuring that the captured video or image data is processed efficiently and securely.
[0026] This remote station includes, but is not limited to, a central server, a cloud-based processing means, or a dedicated computational hub where protocols, such as machine learning models, are applied to detect falsifications. The communication unit bridges the gap between the capturing unit, which gathers the footage, and a computational unit that performs the analysis required for falsification detection. The communication unit supports a wide range of capabilities to ensure that the footage is transmitted effectively. These capabilities include:
• Transmission of High-Quality Data: Video and image data, especially high-resolution footage captured by cameras or imaging units, are large in size. The communication unit has the capacity to transmit such high-bandwidth data efficiently. To accommodate this, the unit supports high-speed communication protocols and is capable of transmitting large volumes of data in real time with minimal delay, thereby ensuring minimal latency.
• Reliable Communication Protocols: The communication unit supports various communication protocols, including wired (Ethernet), RF (Radio Frequency), and wireless (Wi-Fi (Wireless Fidelity), 5G, LTE (Long-Term Evolution), etc.) methods. The choice of protocol depends on the environment and the requirements for bandwidth, latency, and mobility.
• Real-Time Data Streaming: Since the system is developed for real-time falsification detection, the communication unit is capable of live data streaming. The unit continuously transmits video data or images to the remote station without interruption, allowing for ongoing analysis. Real-time streaming requires buffering and compression techniques to minimize delays and manage data size, particularly when handling high-definition or high-frame-rate footage. The unit may employ video compression protocols, such as H.264 or HEVC (H.265), to reduce the size of the data being transmitted without significant loss of visual quality, enabling faster transmission without overburdening the network.
• Security of Data Transmission: A system that involves the transmission of potentially sensitive footage must treat the security of the communication channel as a major concern. The communication unit ensures that the transmission is encrypted to prevent interception, unauthorized access, or tampering of the data. This is particularly crucial when dealing with sensitive information, such as footage related to security surveillance, legal proceedings, or private investigations.
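As a purely illustrative sketch of tamper-evident alert transmission, the following standard-library Python code packages flagged frame indices and confidence scores and authenticates them with an HMAC-SHA256 tag. The function names and key handling are assumptions for exposition; a deployed system would additionally encrypt the channel itself (e.g., via TLS or AES), which is not shown here.

```python
import hashlib
import hmac
import json

def build_alert(flagged, scores, key: bytes):
    """Package flagged frame indices and confidence scores, with an
    HMAC-SHA256 tag so the receiving station can detect tampering
    with the alert payload in transit."""
    payload = json.dumps({"flagged": flagged, "scores": scores},
                         sort_keys=True).encode()
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return payload, tag

def verify_alert(payload: bytes, tag: str, key: bytes) -> bool:
    """Recompute the tag at the station; constant-time comparison
    guards against timing attacks on the verification step."""
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)
```

A modified payload (or a wrong key) fails verification, so the station can reject alerts whose flagged frames or confidence scores were altered in transit.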
[0027] The computational unit is tasked with performing the necessary processing and analysis of the footage captured by the capturing unit. It is responsible for executing the core tasks that allow the system to detect falsifications in real-time video, and comprises a processor, an accelerator, and a memory storing executable instructions that direct its operations.
[0028] The processor is responsible for carrying out general-purpose computations: it executes the executable instructions stored in memory and coordinates the overall processing flow. The processor handles tasks such as coordinating data retrieval from the capturing unit, orchestrating the flow of information between different subsystems, and managing the interaction between the processor and accelerators.
[0029] The accelerator is a specialized computing unit developed to handle specific types of computations much faster than a general-purpose processor. The accelerator is selected from a GPU (Graphics Processing Unit), a TPU (Tensor Processing Unit), or a combination of both. These accelerators are essential for the deep learning and image processing tasks required to detect falsification in real-time footage. GPUs (Graphics Processing Units) are highly parallelized processors primarily developed for rendering images and video, but they are also exceptionally well-suited for machine learning and artificial intelligence tasks.
[0030] TPU (Tensor Processing Unit) is another type of accelerator, developed by Google, specifically to accelerate machine learning tasks, particularly those involving deep learning models. TPUs are highly specialized for tensor computations (multi-dimensional arrays) which are the fundamental data structures in deep learning. Unlike GPUs, which are general-purpose accelerators, TPUs are optimized for the mathematical operations that underpin many AI tasks, such as matrix multiplications and convolutions in CNNs.
[0031] Using a TPU in this system helps accelerate the analysis of footage by providing faster processing times for deep learning models, especially those that involve large-scale neural networks. A combination of GPU and TPU may be used, depending on the workload. For example, the GPU handles the bulk of parallel image processing tasks (like CNN-based analysis), while the TPU handles tensor-heavy operations within deep learning models, thus optimizing both speed and efficiency for detecting falsification.
[0032] The memory of the computational unit is where the executable instructions are stored. These instructions, when executed by the processor and accelerators, define the specific tasks and protocols the system performs to detect falsifications in footage. The memory provides the data storage required for the models needed for the detection process. The executable instructions in the computational unit drive the system's core functionalities. Before analyzing footage, the system preprocesses it to normalize variations in lighting, noise, and resolution. This step is essential for ensuring that the input data is in a consistent format, making it easier to detect visual anomalies. Pre-processing methods, such as noise reduction, contrast enhancement, or resizing, are implemented in the executable instructions and run on the processor or accelerator.
[0033] The primary role of the computational unit is to execute the protocol for detecting falsifications, using machine learning models and image analysis techniques. The system incorporates new data and updates its detection models over time, making the detection protocol more accurate. For example, if new falsification techniques emerge, the system is updated with new data to adapt and improve its detection capabilities. This process involves retraining or fine-tuning the machine learning models stored in memory using new examples of falsified footage, ensuring that the system remains effective as the nature of falsification evolves.
[0034] When the computational unit executes these instructions, the unit performs several critical steps to detect falsifications. The process of acquiring real-time footage for detection of falsification begins with the capturing unit, which is equipped with various image capturing means such as cameras and other specialized imaging means. This capturing unit is responsible for recording live video or image data, providing the foundational input for the entire falsification detection process. The capturing unit continuously monitors its environment to collect footage, which may range from standard video to high-definition or even specialized footage such as infrared or thermal imaging, depending on the application and environment.
[0035] The real-time nature of footage acquisition underpins the system's ability to detect falsifications as they happen, ensuring that no delays interfere with the detection process. For example, in a live broadcast scenario, if video footage is manipulated or altered, the system is able to detect these changes immediately and raise an alert without waiting for post-production analysis. This requires the capturing unit to work continually with high-speed transmission protocols that send the captured data to the computational unit for processing as quickly as possible.
[0036] The data gathered by the capturing unit is fed to the computational unit via various communication channels, whether wired or wireless, to initiate the pre-processing and analysis steps for falsification detection. The footage is often transmitted in parallel with metadata such as timestamps, camera settings, and geographical information, which helps provide context to the video data, enhancing the accuracy of falsification detection protocols that rely on both visual and temporal cues.
[0037] Pre-processing the acquired footage ensures that the system can analyze video effectively and accurately for signs of falsification. The footage captured by the capturing unit may be subject to various variations or distortions due to environmental conditions, camera limitations, or transmission errors. These variations, such as inconsistent lighting, background noise, and fluctuating resolution, can significantly impair the ability of detection protocols, especially machine learning models, to identify falsifications. Pre-processing helps normalize these variations, ensuring that the footage is standardized in a way that maximizes the effectiveness of the subsequent analysis and detection stages.
[0038] One of the most common challenges in real-world footage acquisition is lighting variation. Changes in ambient light, shadows, or glare distort the appearance of objects and alter key visual cues, making it difficult for the detection protocol to discern authentic footage from falsified content. To address this, pre-processing protocols often apply light normalization techniques that adjust the brightness and contrast of each frame to make lighting conditions consistent across the entire footage. This process involves techniques like histogram equalization, which adjusts the brightness levels of an image by redistributing pixel values to cover the full range of brightness, or dynamic range compression, which ensures that details in both dark and light areas of the image are visible. Gamma correction is also used to adjust non-linear lighting distortions that affect how light and color are represented in digital footage. These adjustments help to eliminate lighting artifacts that could mislead the falsification detection models.
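The histogram equalization technique named above can be illustrated concretely. The following sketch implements the classic cumulative-distribution remapping for 8-bit grayscale values; the function name and the flat-list pixel representation are illustrative simplifications, not a prescription of how the disclosed system stores frames.

```python
def equalize_histogram(pixels, levels=256):
    """Histogram equalization for 8-bit grayscale values: remap each
    intensity through the cumulative distribution function (CDF) so
    the output intensities spread across the full brightness range."""
    # Count how many pixels have each intensity.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Build the cumulative distribution.
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(pixels)
    if n == cdf_min:  # constant image: nothing to equalize
        return list(pixels)
    # Standard remap: scale the CDF to the [0, levels-1] range.
    return [round((cdf[p] - cdf_min) / (n - cdf_min) * (levels - 1))
            for p in pixels]
```

For example, a low-contrast patch whose values cluster around 100-102 is stretched to span the full 0-255 range, which is precisely the effect the pre-processing stage relies on to make lighting consistent across frames.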
[0039] In addition to lighting issues, real-world footage often contains noise introduced by factors such as low-light conditions, sensor imperfections, or compression artifacts during video encoding. Noise manifests as random graininess, pixelation, or visual distortions, which obscure key details and make it harder for the system to identify subtle visual anomalies indicative of falsification. To resolve this, pre-processing includes denoising techniques, which aim to remove unwanted noise while preserving important structural details in the footage. Protocols such as Gaussian blurring, median filtering, and wavelet transform denoising are commonly applied. These methods smooth out the image or video by averaging pixel values and removing high-frequency noise, while retaining the sharpness of edges and textures.
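Of the denoising protocols named above, median filtering is the simplest to sketch. The one-dimensional version below (a 2D filter applies the same idea over a pixel neighborhood) shows why the median suppresses impulse noise while preserving structure better than plain averaging; the function name and window size are illustrative.

```python
import statistics

def median_filter(signal, window=3):
    """Sliding-window median filter: replace each sample with the
    median of its neighborhood. An isolated noise spike never becomes
    the median, so it is removed, while step edges are preserved."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(statistics.median(signal[lo:hi]))
    return out
```

A single salt-and-pepper spike of 200 in a row of 10s is eliminated entirely, whereas a mean filter would smear it into its neighbors.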
[0040] Furthermore, pre-processing also involves handling temporal inconsistencies across frames, particularly in video footage. Inconsistent frame rates or irregular timing between frames cause issues during the detection process, as falsified footage may exhibit unusual motion or frame transitions that are difficult to detect without proper synchronization. To address this, pre-processing includes techniques like frame interpolation or frame rate conversion to ensure that the video runs at a consistent and uniform frame rate. By aligning the frames and ensuring smooth temporal transitions, the system can more effectively analyze temporal patterns and motion dynamics, which are often key indicators of falsification, such as unnatural object movements or inconsistent scene transitions.
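The frame interpolation mentioned above can be sketched in its simplest form: linear blending between neighboring frames to double the frame rate. This is an illustrative simplification; production frame-rate conversion typically uses motion-compensated interpolation rather than the per-pixel average shown here, and frames are modeled as flat lists of pixel values.

```python
def interpolate_frames(frames):
    """Double the frame rate by inserting a linearly blended frame
    between each pair of neighbors: each inserted pixel is the
    average of the corresponding pixels in the two real frames."""
    out = []
    for a, b in zip(frames, frames[1:]):
        out.append(a)
        out.append([(x + y) / 2 for x, y in zip(a, b)])
    out.append(frames[-1])
    return out
```

After conversion, all frames are evenly spaced in time, which is the precondition the temporal analysis stage needs before comparing motion across frames.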
[0041] The pre-processing stage also involves preparing the footage for the deep learning models used in falsification detection. This typically requires converting the video frames into a suitable format for analysis, such as transforming images into tensor formats that deep learning models, such as Convolutional Neural Networks (CNNs), can process efficiently. In many cases, additional normalization steps are performed on pixel values to ensure that all frames have similar brightness, contrast, and color scaling, which improves the generalization capabilities of the detection model. By applying these pre-processing techniques, the system ensures that the footage is prepared in the most optimal format for analysis, maximizing the accuracy and speed of falsification detection. Whether the footage is noisy, poorly lit, or of inconsistent quality, pre-processing standardizes the input data, removing distortions that could lead to false negatives or positives during detection.
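The pixel-value normalization described above is conventionally zero-mean, unit-variance scaling, sketched below on a flat list of intensities. The function name and the epsilon guard are illustrative choices, not part of the disclosure; real pipelines apply the same operation per channel on tensor-shaped frames.

```python
def normalize_frame(frame, eps=1e-8):
    """Zero-mean, unit-variance normalization of pixel values, the
    usual final step before feeding frames to a CNN. The small eps
    avoids division by zero on constant frames."""
    n = len(frame)
    mean = sum(frame) / n
    var = sum((p - mean) ** 2 for p in frame) / n
    std = var ** 0.5
    return [(p - mean) / (std + eps) for p in frame]
```

Because every frame ends up with comparable scale regardless of its original brightness or contrast, the detection model sees consistent inputs, which is exactly the generalization benefit the paragraph describes.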
[0042] The analysis of footage for falsification detection is an important stage in the system, where protocols are employed to identify discrepancies in video that indicate tampering or manipulation. This analysis is carried out through a specialized detection protocol, which includes a multi-step process that involves both the examination of individual frames and the evaluation of temporal patterns across frames. The detection protocol is developed to be comprehensive and dynamic by employing a combination of Convolutional Neural Networks (CNNs) for spatial analysis of individual frames and Recurrent Neural Networks (RNNs) or Temporal Convolution Networks (TCNs) for temporal analysis of frame sequences.
[0043] CNNs are highly effective at detecting spatial patterns within images, making them ideal for identifying visual inconsistencies in a frame, such as irregular lighting, unnatural textures, and inconsistencies in edges. These types of artifacts are often indicative of video manipulation, as tampered footage may exhibit unusual lighting effects, textures that do not match the surrounding environment, or blurred or distorted edges where elements are inserted or altered. CNNs work by applying convolutional filters to the image to detect patterns at various scales and levels of abstraction. At the lower layers of the CNN, filters detect simple patterns such as edges, gradients, or color variations. As the data progresses through the layers, the CNN learns to recognize more complex features, such as textures, shapes, and objects.
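The convolutional filtering operation at the heart of a CNN layer can be sketched directly. The code below performs a "valid" 2D convolution (strictly, cross-correlation, as in most deep-learning libraries) of an image with a kernel; the example kernel is a hand-written horizontal-edge detector of the kind a trained CNN's first layer often learns on its own. All names here are illustrative.

```python
def convolve2d(image, kernel):
    """'Valid' 2D cross-correlation: slide the kernel over the image
    and take the weighted sum of the overlapping pixels at each
    position. Output shrinks by (kernel size - 1) per dimension."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out

# A horizontal-edge kernel: strong response where brightness
# changes sharply between rows, zero on uniform regions.
EDGE = [[ 1,  1,  1],
        [ 0,  0,  0],
        [-1, -1, -1]]
```

Applied to an image with an abrupt dark-to-bright boundary, the filter responds strongly at the boundary, which is how low-level CNN filters localize the edge and gradient artifacts described above.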
[0044] Herein, the network is trained to recognize visual anomalies that are typically seen in manipulated footage, such as mismatched lighting, irregular shadows, unnatural reflections, and texture inconsistencies that are hard to detect with the naked eye. By analyzing the spatial content of each frame, the CNN identifies frames that exhibit visual discrepancies and flags them for further investigation. While CNNs are highly effective for analyzing individual frames, falsification detection is not solely about looking for visual anomalies in isolation; it also involves understanding temporal patterns across frames to identify inconsistencies in motion, timing, or scene transitions. To detect such temporal inconsistencies, the system uses either Recurrent Neural Networks (RNNs), Temporal Convolution Networks (TCNs), or a combination of both. RNNs and TCNs are developed to handle sequential data, making them well-suited for video analysis, where frames are presented as sequences over time.
[0045] RNNs are neural networks that are specifically developed to process sequential data by maintaining an internal state (or memory) of previous inputs, allowing them to capture dependencies between frames over time. The RNN analyzes the progression of motion and scene changes across multiple frames to detect anomalies. For example, if the movement of an object in the video seems unnatural, such as sudden jumps in position, inconsistencies in speed, or smooth transitions that do not match the physical world, this can indicate tampering. RNNs identify such inconsistencies by comparing the observed sequence of frames with expected patterns based on the surrounding frames, flagging any deviations from natural movement as suspicious.
[0046] On the other hand, Temporal Convolution Networks (TCNs) are a more recent development in the field of sequential analysis, developed to address some of the limitations of RNNs. TCNs use 1D convolutions across temporal sequences, allowing them to efficiently capture long-range dependencies between frames. TCNs capture complex temporal patterns, such as unnatural frame rates, the lack of smooth transitions between scenes, or inconsistencies in the speed or direction of motion, which are often hallmarks of falsified video. Both RNNs and TCNs play an essential role in identifying frame inconsistencies that are not immediately visible in individual frames but become apparent when the footage is analyzed over time.
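The 1D causal convolution that TCNs are built from can be sketched as follows. Each output depends only on the current and past samples (never future ones), and the dilation parameter widens the temporal receptive field without adding weights. The function name, weight ordering (oldest sample first), and zero padding are illustrative choices, not the disclosed implementation.

```python
def causal_conv1d(seq, weights, dilation=1):
    """Causal dilated 1D convolution, the building block of a TCN.
    Zero-pads the past so output length equals input length; weights
    are applied oldest-to-newest sample."""
    k = len(weights)
    pad = (k - 1) * dilation
    padded = [0] * pad + list(seq)
    return [sum(w * padded[t + i * dilation]
                for i, w in enumerate(weights))
            for t in range(len(seq))]
```

With the hand-picked difference kernel `[-1, 1]`, the output is the frame-to-frame change in a motion signal: steady motion yields a flat response, while a sudden positional jump of the kind associated with tampering appears as a spike.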
[0047] The system’s adaptive learning module allows the detection protocol to continually evolve and improve its accuracy over time. As the system processes more footage and encounters new falsification techniques, the adaptive learning module enables the system to retrain or fine-tune the deep learning models used for detection. This means that the system becomes increasingly capable of detecting newer methods of falsification that were not included in the original training dataset.
[0048] The adaptive learning module works by continuously gathering feedback from the detection process. For example, when the system flags a particular frame as potentially falsified, it then assesses the confidence score assigned to that frame and uses this information to adjust its models. If the system repeatedly detects a certain type of falsification, such as a specific manipulation technique involving lighting or motion, it automatically updates its learning protocols to better detect this type of tampering in future footage. This process involves the collection of new data (such as additional examples of falsified footage), which is then used to retrain the neural networks. By continuously adapting to new data, the system not only improves its ability to detect existing falsification techniques but also becomes more robust against evolving methods of video manipulation.
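A toy version of the feedback loop described above can be sketched as threshold adaptation. This is a deliberate simplification for illustration only: the disclosed adaptive learning module retrains the neural networks themselves, whereas the sketch below merely nudges the flagging threshold based on confirmed outcomes, and all names and the learning rate are assumptions.

```python
def update_threshold(threshold, outcomes, lr=0.05):
    """Toy feedback loop: after each human-confirmed outcome, nudge
    the flagging threshold. A confirmed false positive raises it
    (flag less eagerly); a confirmed missed falsification lowers it
    (flag more eagerly). Clamped to [0, 1]."""
    for outcome in outcomes:
        if outcome == "false_positive":
            threshold = min(1.0, threshold + lr)
        elif outcome == "missed":
            threshold = max(0.0, threshold - lr)
    return round(threshold, 2)
```

The same feedback signal, confirmed labels on previously scored frames, is what would drive retraining or fine-tuning of the detection models in the full system.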
[0049] The adaptive learning module ensures that the detection protocol continuously improves, enabling the system to keep pace with emerging falsification techniques and deliver more accurate results over time. By integrating both spatial and temporal analysis, coupled with continuous learning, the system offers a highly effective solution for real-time falsification detection in video footage, with applications in security, media, and other domains requiring robust content verification.
[0050] Flagging frames with detected falsification directly highlights the segments of footage that are suspected of manipulation. After the footage is analyzed by the system using deep learning models such as Convolutional Neural Networks (CNNs) for spatial analysis and Recurrent Neural Networks (RNNs) or Temporal Convolution Networks (TCNs) for temporal analysis, the system identifies and isolates the specific frames that exhibit signs of falsification. These frames are then marked or "flagged" to indicate that they contain visual or temporal inconsistencies that are likely the result of tampering or other forms of manipulation. The flagging process is not a simple labeling task; it involves a thorough evaluation of the data produced by the detection models, including factors such as confidence scores, artifact types, and contextual cues from both individual frames and their relationship to adjacent frames in the video sequence.
[0051] The flagging process begins once the detection models identify potential falsification in specific frames, based on visual artifacts like irregular lighting, unnatural textures, mismatched shadows, or inconsistencies in motion. The system assigns a confidence score to each flagged frame, which represents the likelihood that the frame is falsified. This score is generated by the model, typically based on the probability output from the neural network indicating whether the frame is likely to have been manipulated. A higher confidence score suggests a stronger likelihood of tampering, while a lower score may indicate a false positive or a less certain result.
[0052] Flagging frames with detected falsification allows the system to isolate these problem areas, making it easier for investigators, analysts, or downstream systems to review and assess the credibility of the footage. The flagged frames are often stored in a separate log or database, along with their associated confidence scores, enabling easy access and further examination. This is essential for alerting the user to potential falsifications and guiding them to specific areas in the video that require more in-depth investigation or verification. By flagging these suspicious frames, the system provides a precise and actionable output that assists in identifying manipulated footage with high accuracy. The flagged frames, alongside their confidence scores, are used in a variety of contexts, including media forensics, security monitoring, and legal investigations, where verifying the authenticity of video footage is paramount.
[0053] Generating the confidence score for each positively detected falsification in the analyzed footage is a fundamental part of the falsification detection process. Once a frame or segment of the video is flagged for potential falsification based on the analysis performed by models such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) or Temporal Convolution Networks (TCNs), the system calculates the confidence score to quantify the likelihood that the flagged frame has indeed been tampered with. The confidence score serves as a numerical value that represents the certainty with which the system has detected the falsification. This score is typically derived from the output probabilities of the detection model, for example via a softmax function in the case of classification-based neural networks. The higher the confidence score, the more likely the frame has been manipulated. The confidence score reflects not only the visual and temporal evidence detected in the footage but also the model’s ability to generalize based on the patterns it has learned from training data, which includes both manipulated and authentic video examples.
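The softmax-derived confidence score mentioned above can be sketched in a few lines: for a two-class ("authentic" vs. "falsified") classifier, the score is simply the probability mass assigned to the falsified class. The function names and the class ordering (index 1 = falsified) are illustrative assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw model outputs (logits)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]


def confidence_score(logits):
    """Probability mass the classifier assigns to the 'falsified' class.

    Assumes a two-class output where index 0 = authentic, index 1 = falsified.
    """
    return softmax(logits)[1]
```

For equal logits the score is 0.5 (maximally uncertain); as the falsified-class logit grows relative to the authentic-class logit, the score approaches 1, matching the paragraph's statement that a higher score indicates a more likely manipulation.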
[0054] The confidence score is calculated based on a combination of factors, including the severity and type of artifact detected, the consistency across frames, and the confidence of individual model predictions. For example, if a frame contains clear visual anomalies, such as lighting inconsistencies or edge distortions that are typical of image manipulation, the confidence score is high. Similarly, if the detected falsification aligns with patterns commonly associated with known tampering methods (such as spatio-temporal inconsistencies in object movements), this also increases the confidence score.
[0055] The transmission of the alert to the remotely located station ensures that the detection of falsified footage is communicated swiftly and securely to relevant stakeholders for further action. Once the system has completed the detection and flagging processes, the next step is to notify the concerned parties that a potential falsification has been identified. The communication unit aids in this phase by transmitting to the station an alert that contains detailed information about the falsification, including the flagged frames, the corresponding confidence scores that indicate the likelihood of falsification, and any other relevant metadata, such as timestamps or camera identifiers, to contextualize the data. The alert transmission is designed to be fast, efficient, and reliable, ensuring that the information is delivered in real-time or near real-time, which is particularly important for applications like surveillance, media forensics, or legal investigations, where quick responses to detected falsifications are often essential.
[0056] To ensure that the alert transmission is both secure and tamper-proof, the system employs encrypted communication channels. Encryption aids in preventing the interception, alteration, or tampering of the transmitted signal. Since the transmitted alert contains potentially sensitive data, such as flagged frames, confidence scores, and contextual information about the footage, it is crucial to protect this data from unauthorized access or manipulation. The encrypted channel ensures that even if the communication is intercepted during transmission, the contents of the alert remain protected and unreadable to unauthorized parties. This encryption is implemented using secure protocols such as TLS (Transport Layer Security), SSL (Secure Sockets Layer), or other industry-standard encryption methods. Encryption ensures the integrity of the transmitted data, meaning that the alert is not modified or tampered with during transit. This is essential in high-stakes environments like law enforcement or courtroom settings, where the authenticity of the transmitted alert and its contents must be guaranteed.
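A minimal sketch of setting up the TLS channel described above, using Python's standard `ssl` module: the context enforces certificate validation and a modern minimum protocol version. The function name is illustrative; the specification names TLS/SSL but does not prescribe an implementation.

```python
import ssl


def make_alert_context():
    """Client-side TLS context for transmitting alerts to the remote station.

    Enforces server certificate validation and hostname checking, and
    disallows protocol versions older than TLS 1.2, so an intercepted
    alert remains confidential and tamper-evident in transit.
    """
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.check_hostname = True
    ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx
```

In use, the communication unit would wrap its TCP socket with `ctx.wrap_socket(sock, server_hostname=...)` before sending the serialized alert, so the handshake authenticates the station and encrypts the payload.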
[0057] In some implementations, the alert is transmitted as a visual overlay on top of the flagged segments of the footage itself. This visual overlay serves as an immediate visual cue to the operator or automated system reviewing the footage, indicating exactly where the falsified content has been identified. The flagged frames, along with the overlay, may include a color-coded indication (such as red or yellow) to visually highlight the sections of the footage that are suspect. Along with the visual flag, the confidence score is often displayed, allowing viewers to quickly assess the severity of the potential falsification. This overlay provides an intuitive and user-friendly way to direct attention to the most critical areas of the footage, streamlining the review process and helping the operator make an informed decision about the authenticity of the video.
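The color-coded overlay described above can be sketched as drawing a colored border around a flagged frame. This toy version represents a frame as nested lists of RGB triples; a production system would operate on image buffers via an imaging library. The function name, default red color, and border thickness are illustrative assumptions.

```python
def overlay_flag(frame, color=(255, 0, 0), thickness=2):
    """Return a copy of a flagged frame with a colored border drawn on it.

    `frame` is a height x width grid of RGB triples. Pixels within
    `thickness` of any edge are replaced by `color` (red by default),
    giving the reviewer an immediate visual cue that the frame is suspect.
    """
    h, w = len(frame), len(frame[0])
    out = [[list(px) for px in row] for row in frame]  # deep copy
    for y in range(h):
        for x in range(w):
            on_border = (y < thickness or y >= h - thickness or
                         x < thickness or x >= w - thickness)
            if on_border:
                out[y][x] = list(color)
    return out
```

A fuller implementation would also render the confidence score as text near the border (e.g., "0.91"), as the paragraph suggests, so severity is visible at a glance.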
[0058] In some cases, the alert also contains metadata or instructions for further actions, such as requesting a more in-depth analysis or triggering additional automated checks. This feature enhances the usability and responsiveness of the system, ensuring that flagged content is acted upon efficiently and accurately. By combining encrypted transmission, visual overlays, and confidence scores, the alert system ensures that detected falsifications are effectively communicated and acted upon in a secure, timely, and transparent manner. Whether the alert is viewed by an automated system, a human operator, or a legal authority, the structure and security of the alert transmission help maintain the integrity and trustworthiness of the entire falsification detection process.
[0059] In an exemplary embodiment of the present invention, a method for detection of falsification in footage is a process developed to identify and flag manipulated or tampered video content. This method comprises various steps addressing different aspects of video analysis, from initial footage acquisition to the final alert transmission, ensuring that any falsified footage, whether altered through editing, deepfakes, or other forms of manipulation, is effectively detected and flagged for further scrutiny.
Step A: Acquiring Real-time Footage for Detection of Falsification
[0060] The process begins with acquiring real-time footage that is to be analyzed for potential falsification. This footage is captured using the capturing unit. The quality and type of camera in the capturing unit may vary, but the key requirement is that the footage is live or recorded with minimal delay to facilitate real-time analysis. For example, in a security context, footage from surveillance cameras installed at a public venue or a private building may be captured continuously for monitoring purposes. Similarly, for media verification, footage taken during a live event or news broadcast may be monitored to check for signs of tampering or manipulation.
Step B: Pre-processing Acquired Footage
[0061] Once the footage is acquired, the next step is pre-processing to address common issues that hinder the accuracy of falsification detection. These issues include lighting variations, noise, and resolution inconsistencies that occur due to various environmental factors or technical limitations of the capturing devices. For example, a video shot outdoors during sunset may experience fluctuating lighting conditions, resulting in parts of the video being either too bright or too dark. In such cases, pre-processing techniques like brightness normalization and contrast adjustment are used to bring uniformity across the footage, ensuring that lighting variations do not obscure potential signs of tampering.
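The brightness normalization mentioned above can be sketched as a simple mean shift: each frame's pixel intensities are shifted so the frame mean matches a common target, clipped to the valid 8-bit range. The function name and the target mean of 128 are illustrative assumptions; real pipelines typically also normalize contrast and denoise.

```python
def normalize_brightness(frame, target_mean=128.0):
    """Shift a grayscale frame's pixel intensities so its mean brightness
    matches target_mean, clipping to the valid 8-bit range [0, 255].

    Applying this to every frame brings uniformity across footage shot
    under fluctuating lighting (e.g., outdoors at sunset), so lighting
    drift does not masquerade as or mask tampering artifacts.
    """
    flat = [p for row in frame for p in row]
    mean = sum(flat) / len(flat)
    shift = target_mean - mean
    return [[min(255, max(0, int(round(p + shift)))) for p in row]
            for row in frame]
```

A too-dark frame (mean 10) is lifted to the target mean, while an already-balanced frame is left nearly unchanged, so consecutive frames become directly comparable for the detection stage.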
Step C: Analyzing by Means of a Detection Protocol
[0062] After pre-processing, the detection protocol is applied to the footage to analyze it for signs of falsification. This step uses a combination of Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) or Temporal Convolution Networks (TCNs) to examine both the spatial and temporal characteristics of the video. CNNs are primarily used to detect visual anomalies in individual frames of the footage. For example, the CNN may detect irregular lighting patterns, unnatural textures, or edge inconsistencies that indicate the insertion of artificial elements, such as objects or faces, or manipulation of the original video content. For instance, in a video that has been altered by adding an object to the scene, the CNN may detect inconsistent lighting or mismatched shadows that do not match the natural environment of the footage.
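As a highly simplified stand-in for the CNN's edge-inconsistency cue, the sketch below computes absolute horizontal intensity differences across a frame and scores what fraction of them exceed a threshold; an abrupt seam around a spliced-in object produces unusually strong gradients. The function names and threshold are illustrative assumptions, and a learned CNN filter, not a hand-picked difference, does this in the actual protocol.

```python
def horizontal_gradient(frame):
    """Absolute horizontal intensity differences for a grayscale frame.

    A sharp, unexplained intensity seam (e.g., the boundary of an object
    pasted into the scene) shows up as a strong gradient response.
    """
    return [[abs(row[x + 1] - row[x]) for x in range(len(row) - 1)]
            for row in frame]


def edge_artifact_score(frame, threshold=100):
    """Fraction of gradient responses exceeding `threshold`.

    A crude per-frame anomaly cue: higher values suggest more abrupt
    edges than natural imagery typically contains.
    """
    grads = [g for row in horizontal_gradient(frame) for g in row]
    return sum(1 for g in grads if g > threshold) / max(1, len(grads))
```

A natural gradient image varies smoothly and scores near zero, while a frame with a hard splice boundary scores noticeably higher, which is the kind of spatial signal the CNN branch learns to pick up automatically.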
[0063] RNNs or TCNs, on the other hand, are used to analyze the temporal relationships between frames, which is crucial for detecting unnatural movements or frame inconsistencies. Manipulated footage may exhibit unusual motion, such as jerky movements, inconsistent frame rates, or unnatural transitions between frames. For example, in a manipulated video, if an object suddenly changes position without a natural transition, the RNN detects this inconsistency. TCNs, which use temporal convolutions, are particularly effective in capturing longer-term dependencies and detecting irregularities in movement over a longer sequence of frames. These techniques allow the system to detect manipulation that affects how the scene evolves over time, such as sudden changes in movement speed or unnatural pauses in motion.
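The "object suddenly changes position" example can be reduced to a rule-based sketch: given a tracked object's per-frame positions, flag any frame where the displacement from the previous frame exceeds a plausible maximum. The function name and `max_step` value are illustrative assumptions; in the protocol itself this judgment is made by a learned RNN/TCN rather than a fixed threshold.

```python
def detect_position_jumps(positions, max_step=10.0):
    """Flag frame indices where a tracked object's position jumps farther
    than `max_step` pixels between consecutive frames.

    `positions` is a list of (x, y) coordinates, one per frame. A jump
    well beyond normal inter-frame motion suggests a spliced or deleted
    segment rather than natural movement.
    """
    flagged = []
    for t in range(1, len(positions)):
        (x0, y0), (x1, y1) = positions[t - 1], positions[t]
        dist = ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5
        if dist > max_step:
            flagged.append(t)
    return flagged
```

In a track that moves one pixel per frame and then teleports 49 pixels, only the teleporting frame is flagged, which is precisely the temporal discontinuity the RNN/TCN branch is trained to surface.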
Step D: Flagging Frames with Detected Falsification
[0064] Once the analysis is completed, the system moves on to flagging frames that exhibit signs of detected falsification. In this step, the frames that contain visual or temporal inconsistencies are marked as suspicious or manipulated. For example, if the CNN detects an inconsistency in the lighting or texture of an object, or if the RNN identifies sudden, unnatural movement, the respective frames are flagged. Flagging frames is a way of isolating the segments of the footage that require further examination. This also helps to prioritize which parts of the video need immediate attention, allowing a human operator or automated systems to focus on the flagged areas for a more in-depth review.
Step E: Generating Confidence Score for Detected Falsification
[0065] After the frames are flagged, the system generates the confidence score for each positively detected falsification. The confidence score quantifies the likelihood that the flagged frame has indeed been tampered with. The score is based on a variety of factors, including the severity of the detected anomalies, the consistency of the detected features with known tampering patterns, and the model’s overall certainty. For example, if the CNN detects an anomaly in lighting that is consistent with patterns of common video manipulation, the confidence score is high. Similarly, if the RNN detects unnatural motion that is clearly outside the expected temporal behavior of the scene, the score reflects a high probability of falsification. Confidence scores are important as they allow the operator to assess the seriousness of the detected tampering.
Step F: Transmitting an Alert via Encrypted Communication Channel
[0066] Finally, the system transmits the alert to the remotely located station once the falsification is detected and flagged. This alert contains key information, such as the flagged frames, confidence scores, timestamps, and any contextual details necessary for the investigation. The alert is transmitted via an encrypted communication channel to ensure the security and integrity of the information. Encryption protects the data from interception or tampering during transmission, which is especially important in scenarios involving sensitive footage, such as surveillance or legal evidence. The encryption ensures that even if the communication is intercepted, the transmitted data remains confidential and unaltered. The alert may be transmitted alongside visual overlays that highlight the flagged segments of the footage. These visual cues, such as color-coded highlights or bounding boxes, make it immediately obvious which frames are flagged, enabling quicker identification of manipulated content.
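The alert payload described in this step, flagged frames, confidence scores, a timestamp, and a camera identifier, can be sketched as a serialized JSON message ready to be sent over the encrypted channel established earlier. The function name, field names, and rounding are illustrative assumptions; the specification does not fix a wire format.

```python
import json
import time


def build_alert(flagged, camera_id):
    """Serialize an alert for transmission over the encrypted channel.

    `flagged` maps frame index -> confidence score; `camera_id` is the
    contextual metadata mentioned in the specification. Returns UTF-8
    bytes suitable for writing to a TLS-wrapped socket.
    """
    payload = {
        "camera_id": camera_id,
        "timestamp": time.time(),  # when the alert was generated
        "flagged_frames": [
            {"frame": f, "confidence": round(score, 4)}
            for f, score in sorted(flagged.items())
        ],
    }
    return json.dumps(payload).encode("utf-8")
```

Keeping the payload as plain structured data means the station can parse it mechanically, render the overlay, or trigger the follow-up checks described in paragraph [0058], while the TLS layer (not the payload itself) provides the confidentiality and integrity guarantees.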
[0067] This method provides a systematic approach to detecting falsifications in video footage, combining state-of-the-art machine learning techniques with secure and efficient communication to ensure that the integrity of the footage is maintained and that any tampering is promptly identified and addressed.
[0068] Although the field of the invention has been described herein with limited reference to specific embodiments, this description is not meant to be construed in a limiting sense. Various modifications of the disclosed embodiments, as well as alternate embodiments of the invention, will become apparent to persons skilled in the art upon reference to the description of the invention.
Claims:
1) A real-time falsified footage detection system, comprising:
i) a capturing unit for capturing footage for detection of falsification;
ii) a communication unit configured with said capturing unit to establish a transmission between said capturing unit and a remotely located station;
iii) a computational unit, and a memory storing executable instructions that, when executed by said computational unit, cause said computational unit to perform the steps of
a. acquiring real-time footage for detection of falsification by means of said capturing unit;
b. pre-processing acquired footage to normalize variations in lighting, noise and resolution to enable effective analysis;
c. analyzing said footage by means of a detection protocol to determine falsification;
d. flagging frames of said footage having detected falsification;
e. generating a confidence score for each of said positively detected falsifications in said analyzed footage; and
f. transmitting an alert to said station upon said positive detection of falsification in said analyzed footage, along with flagged frames and said confidence scores, by means of said communication unit.
2) The system as claimed in claim 1, wherein said capturing unit is configured with at least one image capturing means.
3) The system as claimed in claim 1, wherein said detection protocol comprises analysis of individual frames of said footage by means of CNN (convolutional neural network) to detect visual artefacts including irregular lighting, unnatural textures and inconsistencies in edges, and determination of temporal patterns across said frames to detect unnatural movements or frame inconsistencies by a means selected from RNN (Recurrent Neural Networks), TCN (Temporal Convolution Networks), and a combination thereof.
4) The system as claimed in claim 1, wherein said computation unit comprises at least one processor and at least one accelerator.
5) The system as claimed in claim 4, wherein said accelerator is selected from a GPU (Graphics Processing Unit), a TPU (Tensor Processing Unit) and a combination thereof.
6) The system as claimed in claim 1, wherein said communication unit transmits said alert via an encrypted communication channel to prevent interception and tampering of transmitted signal.
7) The system as claimed in claim 1, wherein said computation unit is configured with an adaptive learning module, to improve said detection protocol based on new data and detection techniques.
8) A method for detection of falsification in footage, comprising the steps of:
a. acquiring real-time footage for detection of falsification;
b. pre-processing acquired footage to normalize variations in lighting, noise and resolution to enable effective analysis;
c. analyzing by means of a detection protocol to determine falsification, wherein individual frames of said footage are analyzed by means of CNN (convolutional neural network) to detect visual artifacts including irregular lighting, unnatural textures and inconsistencies in edges, and determination of temporal patterns across said frames to detect unnatural movements or frame inconsistencies by a means selected from RNN (Recurrent Neural Networks), TCN (Temporal Convolution Networks), and a combination thereof;
d. flagging frames of said footage having detected falsification;
e. generating a confidence score for each of said positively detected falsifications in said analyzed footage; and
f. transmitting an alert upon said positive detection of falsification in said analyzed footage, along with flagged frames and said confidence scores, via an encrypted communication channel.
9) The method as claimed in claim 8, wherein said alert is transmitted as visual overlay over flagged segments of said footage.
| # | Name | Date |
|---|---|---|
| 1 | 202521008774-STATEMENT OF UNDERTAKING (FORM 3) [03-02-2025(online)].pdf | 2025-02-03 |
| 2 | 202521008774-REQUEST FOR EXAMINATION (FORM-18) [03-02-2025(online)].pdf | 2025-02-03 |
| 3 | 202521008774-REQUEST FOR EARLY PUBLICATION(FORM-9) [03-02-2025(online)].pdf | 2025-02-03 |
| 4 | 202521008774-PROOF OF RIGHT [03-02-2025(online)].pdf | 2025-02-03 |
| 5 | 202521008774-POWER OF AUTHORITY [03-02-2025(online)].pdf | 2025-02-03 |
| 6 | 202521008774-FORM-9 [03-02-2025(online)].pdf | 2025-02-03 |
| 7 | 202521008774-FORM FOR SMALL ENTITY(FORM-28) [03-02-2025(online)].pdf | 2025-02-03 |
| 8 | 202521008774-FORM 18 [03-02-2025(online)].pdf | 2025-02-03 |
| 9 | 202521008774-FORM 1 [03-02-2025(online)].pdf | 2025-02-03 |
| 10 | 202521008774-FIGURE OF ABSTRACT [03-02-2025(online)].pdf | 2025-02-03 |
| 11 | 202521008774-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [03-02-2025(online)].pdf | 2025-02-03 |
| 12 | 202521008774-EVIDENCE FOR REGISTRATION UNDER SSI [03-02-2025(online)].pdf | 2025-02-03 |
| 13 | 202521008774-EDUCATIONAL INSTITUTION(S) [03-02-2025(online)].pdf | 2025-02-03 |
| 14 | 202521008774-DRAWINGS [03-02-2025(online)].pdf | 2025-02-03 |
| 15 | 202521008774-DECLARATION OF INVENTORSHIP (FORM 5) [03-02-2025(online)].pdf | 2025-02-03 |
| 16 | 202521008774-COMPLETE SPECIFICATION [03-02-2025(online)].pdf | 2025-02-03 |
| 17 | Abstract.jpg | 2025-02-18 |
| 18 | 202521008774-FORM-26 [03-06-2025(online)].pdf | 2025-06-03 |