System/Method For Passive Blind Image Splicing Forensics Detection


System/Method For Passive Blind Image Splicing Forensics Detection Using Advanced Learning Techniques

Abstract: Image forgery is a growing concern in today's era of extensive social media use, which has changed the way people store, access and share data. Visual imagery has changed how evidence is documented and information is shared. When false information in the form of doctored images is circulated, it misleads people into drawing false conclusions. Such incidents affect courtroom trials, medical and scientific investigations, political campaigns, the fashion industry, media and social networking platforms. It is therefore important to differentiate between authentic and forged (doctored or tampered) images. Image forensic techniques find traces of image manipulation by analyzing the image pixels, investigating camera-induced and compression artefacts, and studying geometrical and physics-based properties of objects captured in the image. The techniques focus on active and passive detection by classifying authentic vs. doctored images and extend this to localizing the region of forgery. This work focuses on the application of machine learning, deep learning and quantum machine learning techniques to image splicing detection. In the first approach, features from spliced and authentic images are engineered by applying the Kekre, discrete cosine, and hybrid Kekre-discrete cosine transforms, and are then passed on to an assortment of machine learning classifiers to classify spliced images. In the second approach, a novel socio-inspired twin convolutional neural network with a feature-transfer learning approach, named "MissMarple", is proposed to detect traces of image splicing. This technique is further extended to implement a naive bounding-box technique for passive, blind image splicing localization. Finally, in the last stage, a quantum inductive transfer learning technique is applied to image splicing detection.


Patent Information

Application #

Filing Date

18 July 2025

Publication Number

30/2025

Publication Type

INA

Invention Field

COMPUTER SCIENCE

Status

Parent Application

Applicants

MLR Institute of Technology

Hyderabad

Inventors

1. Mrs.Chandana

Department of Information Technology, MLR Institute of Technology, Hyderabad

2. Mr.Anwar Ali

Department of Information Technology, MLR Institute of Technology, Hyderabad

3. Dr.B.Varija

Department of Information Technology, MLR Institute of Technology, Hyderabad

4. Mr. G.Vidyasagar

Department of Information Technology, MLR Institute of Technology, Hyderabad

Specification

Description:
Field of Invention
The present invention relates to the field of digital image forensics, more specifically to image splicing detection using advanced computational methods. It encompasses techniques from machine learning, deep learning, and quantum machine learning for detecting image tampering, with applications in areas such as digital security, media authentication, legal evidence verification, and content integrity validation.
Background of the Invention
Key-point based techniques have proved beneficial in providing higher detection accuracy for copy-move forgery. Some of these techniques, for example the camera-based forensics techniques, require heavy processing time and are computationally expensive. The techniques are still evolving to solve complex image forgery problems such as image splicing detection. Appropriate feature engineering (and/or feature selection) is a crucial step prior to selecting an appropriate classifier for the problem at hand. If this primary step fails, the classifier will not be able to make accurate predictions. Existing machine learning models have provided higher detection accuracy on realistic image splicing datasets than on visually perceptible (or coarsely) spliced datasets. The selection of a small set of good features for quickly and accurately classifying image splices is still evolving. The feature engineering process can be avoided by allowing a deep learning model with several layers to learn meaningful representations from the input data with the help of a feedback mechanism. Deep learning models therefore help in identifying features that human beings would not normally process. Even a shallow deep learning network makes good predictions. An end-to-end deep learning model, however, requires a large training set, which may not always be available. Transfer learning approaches can help but may not be able to solve specific (peculiar or unique) cases at hand. With the support of GPUs, these models can speed up the entire learning and testing process, unlike the computationally expensive feature engineering processes. Existing deep learning frameworks for image splicing detection are computationally complex and rely on pre-trained models. Most have used deep learning only for feature engineering and then classified the features using machine learning models.
Quantum computing systems have been shown to solve certain complex problems in polynomial time, where classical systems take exponential time to solve the same problem. Machine learning problems such as optimization could be solved effectively using supervised quantum machine learning (QML) techniques, and reported results show an increase in performance accuracy. QML has been explored in different areas of pattern classification but remains unexplored in the area of passive blind image splicing classification (US20230110591A1).
Recent studies have widely applied the discrete wavelet transform and discrete cosine transform, and their variants, to extract relevant features for image splicing detection. Some of these techniques also applied the transforms to extract noise inconsistencies for classifying authentic vs. spliced images. The fast Fourier transform has also been applied, while other approaches have used higher-order statistics or local binary and ternary patterns for the feature extraction process. Recent techniques use deep learning models, either in combination with feature transform techniques or independently, to engineer features that are then passed on to machine learning classifiers. The most popular classifier for image splicing detection has been the support vector machine, followed by ensemble models like Bagging, AdaBoost, and Random Forest, distance-based classifiers like K-nearest neighbors, probabilistic models like logistic regression and Bayesian models, and perceptron-based models like neural networks and their variants. Most of these techniques pre-process the images before extracting features, for example by converting the RGB image to gray-level or YCbCr format.
A number of difficult computer vision classification problems have recently been solved by means of deep learning techniques. Convolutional neural networks (CNN) have been used extensively to learn traces of image content modifications. The challenge of accurately detecting signs of image alteration remains, even though digital signal processing (DSP) combined with machine learning (ML) approaches has been successful in the past when applied to image forensics. A failure to capture the right signs of manipulation could be one explanation. One solution is to use deep learning techniques, which can learn to distinguish between authentic and altered input images and then apply this knowledge to other sorts of forgeries. Mayer, Bayar, and Stamm proposed a unified deep learning model based on transfer learning for multiple forensic tasks. Their model first learned features for camera source identification and transferred this learning to detect traces of image manipulations. More specific to image splicing detection, Y. Rao and J. Ni designed a constrained CNN to automatically learn hierarchical representations from RGB color images. The first layer of their model was initialized with the weights from the spatially rich model (SRM) residual maps, which suppress the image content. The dense features thus extracted from the CNN were classified by a support vector machine (SVM) classifier. Mayer and Stamm proposed a novel deep learning model called the similarity network, which compares two image patches to indicate traces of forensic similarity with respect to camera model artefacts, editing operations, and manipulation parameters. Their model consists of a CNN-based feature extractor and a three-layer neural network and is efficient in detecting and localizing a number of forgery operations.
Hussein, Mahmoud and Zayed took inspiration from the deep belief network-deep neural network (DBN-DNN) model for image classification and tested its application to image splicing detection. They extracted the color filter array (CFA) features from the authentic and spliced input images, reduced them using PCA, and passed them through the DBN-DNN classifier consisting of six hidden layers. Additionally, recent work in deep learning also focuses on detecting realistic vs. fake facial forgeries. It is observed that a lot of recent work emphasizes tamper localization. Localization is a crucial step in detecting image splicing forgery, but with an objective such as real-time stripping of unwarranted content from social networking websites, classification will supersede localization (US20240364738A1).
An approach to multi-task transfer learning was suggested by Salloum, Ren, and Kuo. In this method, a VGG-16-based fully convolutional network (FCN) learns the surface label information and then uses it to identify the edges of the spliced region. Schuld and Petruccione provide a coin toss example to show how quantum physics helps us get from a state of high uncertainty to one of low uncertainty. Quantum computing systems can solve in polynomial time certain computationally expensive problems, with very large input sizes, that take classical computing systems exponential time. Machine learning optimisation problems resemble quantum computing problems in that they demand iterative calculations to find the optimal solution, and can thus be addressed with quantum computing. Image pattern classification is one of the many machine learning applications that require massive inputs and iterative calculations.

Summary of the Invention
The invention presents a multi-faceted approach to image splicing detection by integrating classical machine learning, deep learning, and emerging quantum machine learning techniques. Initially, traditional transforms such as Kekre, Cosine, and a hybrid Kekre-Cosine are applied to extract high-energy coefficients from images, which are then used as features for classification using machine learning algorithms, among which Random Forest yields the highest accuracy. Building on this, a novel socio-inspired twin convolutional neural network model, MissMarple, is introduced. It employs a feature-transfer learning strategy by training on both visually perceptible and realistic image splices, demonstrating superior performance and efficiency over existing deep learning models. Finally, the invention explores the untapped potential of quantum machine learning in image forensics by adapting a hybrid classical-quantum transfer learning framework, showing promising results on both quantum simulators and real quantum hardware. This comprehensive invention advances the state of image splicing detection through a unique blend of classical, deep, and quantum learning paradigms.
Brief Description of Drawings
Figure 1: Proposed MissMarple architecture
Figure 2: Block diagram of the quantum transfer learning implementation

Detailed Description of the Invention
The authentic and spliced images available from the datasets are pre-processed by applying the Kekre, Cosine and hybrid Kekre-Cosine transforms to obtain the high-energy coefficients, which store the maximum information about the given image. The Kekre transform is an orthogonal, asymmetric and non-involutional transform which can be applied to any square image of size S×S, where S need not be a power of 2. It is the generic version of Kekre's LUV color space matrix and works on both grayscale and RGB color images. For RGB images, the transform must be applied to each channel individually. Kekre and Thepade demonstrated that the Kekre-RGB approach provided higher image retrieval precision and recall than the Kekre-Gray technique.
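As an illustration of the properties listed above, the following is a minimal sketch of a Kekre transform matrix in plain Python. The element rule used here (ones on and above the main diagonal, a single negative entry on the first sub-diagonal, zeros below) is the commonly cited definition and is an assumption, since the specification does not spell it out. The function names `kekre_matrix`, `normalize_rows` and `gram` are hypothetical helpers introduced only for this sketch.

```python
import math


def kekre_matrix(n):
    """Build an n x n Kekre transform matrix (n need not be a power of 2).

    Assumed definition: 1 on and above the main diagonal, -n + x on the
    first sub-diagonal of 0-based row x, and 0 everywhere below that.
    """
    matrix = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            if x <= y:
                matrix[x][y] = 1.0
            elif x == y + 1:
                matrix[x][y] = float(-n + x)
    return matrix


def normalize_rows(matrix):
    """Divide each row by the square root of the sum of squares of its elements."""
    normalized = []
    for row in matrix:
        norm = math.sqrt(sum(v * v for v in row))
        normalized.append([v / norm for v in row])
    return normalized


def gram(matrix):
    """Return the matrix multiplied by its own transpose."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in matrix] for r1 in matrix]


# S = 5 is deliberately not a power of 2.
K = kekre_matrix(5)
Kn = normalize_rows(K)
G = gram(Kn)
is_orthonormal = all(
    abs(G[i][j] - (1.0 if i == j else 0.0)) < 1e-9
    for i in range(5)
    for j in range(5)
)
print(K[1])
print(is_orthonormal)
```

Under this assumed definition the rows are mutually orthogonal for any size, so row normalization alone is enough to make the matrix orthonormal; the matrix is visibly asymmetric because everything below the first sub-diagonal is zero.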
The discrete cosine transform (DCT) is a linearly separable orthogonal transform (based on the discrete Fourier transform) that compacts the energy of a digital signal into a few coefficients. These energy coefficients are a good representation of the digital signal. The basic idea behind DCT is to map, by a non-invertible transformation, from the pattern space to a feature space of reduced dimensionality. Such an implementation helps to run a classification scheme successfully with substantially fewer features. As with the Kekre transform, we apply DCT to each of the red, green, and blue channels individually. For an S×S image, the 2-D DCT is computed with an S×S DCT matrix, C. In image forensics, the use of DCT for copy-move forgery detection and image splicing detection has proved beneficial in increasing the overall accuracy of pixel-based techniques for forgery detection. Computationally, applying the Kekre transform is far less complex than applying the DCT: the Kekre transform requires 2S(S − 1) multiplications, while the DCT requires S²(2S). So, in our work, for a 64×64 patch, DCT requires 524,288 multiplications while Kekre requires only 8,064 multiplications. We first transform the input images by applying the Kekre matrix (S = 64) to image patches of size 64×64. This transformation is applied separately to each color channel, resulting in 4096×3 features. Since energy compaction occurs primarily in the upper diagonal of the transformed matrix, we select only the first 4, 8, and 16 coefficient values from each channel. This reduces the feature set to 12 (0.098% of the total image patch), 24 (0.195%), and 48 (0.391%) features, respectively. An important aspect of this process is the normalization of the Kekre matrix, which is applied along both rows and columns to perform a 2-D transformation on the image. We normalize both the Kekre and DCT matrices by dividing each row by the square root of the sum of squares of the elements in that row.
The same normalization and transformation process is followed when applying DCT to the input image patches.
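The normalization, the 2-D transformation and the operation counts described above can be checked with a short sketch in plain Python. It assumes the unnormalized DCT-II cosine basis for the DCT matrix and assumes that the 2-D transformation takes the separable form T·X·Tᵀ; neither formula is stated explicitly in the specification. All function names are hypothetical and chosen for illustration only.

```python
import math


def dct_matrix(n):
    """Unnormalized n x n DCT-II basis: entry (i, j) = cos((2j + 1) * i * pi / (2n))."""
    return [[math.cos((2 * j + 1) * i * math.pi / (2 * n)) for j in range(n)]
            for i in range(n)]


def normalize_rows(matrix):
    """Divide each row by the square root of the sum of squares of its elements."""
    result = []
    for row in matrix:
        norm = math.sqrt(sum(v * v for v in row))
        result.append([v / norm for v in row])
    return result


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transform_2d(t, patch):
    """Apply the transform along columns and rows: T * X * T^T (assumed form)."""
    t_transposed = [list(col) for col in zip(*t)]
    return matmul(matmul(t, patch), t_transposed)


def kekre_multiplications(s):
    """Multiplication count quoted in the text for the Kekre transform."""
    return 2 * s * (s - 1)


def dct_multiplications(s):
    """Multiplication count quoted in the text for the DCT."""
    return s ** 2 * (2 * s)


def feature_percentage(num_features, s=64, channels=3):
    """Selected features as a percentage of all s * s * channels coefficients."""
    return 100.0 * num_features / (s * s * channels)


# A constant 8 x 8 patch: all of its energy should land in the first coefficient.
C = normalize_rows(dct_matrix(8))
flat_patch = [[1.0] * 8 for _ in range(8)]
Y = transform_2d(C, flat_patch)
print(round(Y[0][0], 6))
print(kekre_multiplications(64), dct_multiplications(64))
print([round(feature_percentage(k), 3) for k in (12, 24, 48)])
```

The small 8×8 patch keeps the pure-Python matrix products fast; the same functions work for S = 64, and the counts and percentages reproduce the figures quoted in the text for a 64×64 patch with three color channels.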
Combining two orthogonal transforms to generate a hybrid wavelet transform is a new approach. In prior work, combinations of the Cosine, Hartley, Walsh, and Kekre transforms were developed and used to solve the image compression problem. The DCT-DKT hybrid wavelet transform, which combines the Cosine and Kekre transforms, has the best detection accuracy of all the combinations. Wavelet transforms are invaluable tools for investigating both the global and the local characteristics of a signal. To maximise their combined effect, the two transforms should play to each other's strengths. Prior work using hybrid wavelet transforms to enhance digital watermarking, image compression, and content-based image retrieval (CBIR) methods has yielded encouraging results.
To classify spliced vs. authentic images passively and blindly, a socio-inspired twin convolutional neural network (CNN) model called MissMarple is proposed. The first part of this model, the village model (MM V), consists of four convolutional layers alternating with max pooling layers. The foundational architecture of this model is based on Chollet's basic CNN model for animal image classification, as shown in Figure 1. By reducing the feature set, the max pooling layers improve the model's generalisability. To regularise the model and prevent overfitting, we incorporate Batch Normalisation layers with two Dropout layers. Lastly, to differentiate between authentic and spliced images, we employ two fully connected dense layers with a sigmoid activation. For all the convolutional layers, we use the ReLU (Rectified Linear Unit) activation, and the loss function is set to binary cross entropy. A patch-based approach is used to overcome the limitation of fewer training images and help the model learn better. To extract the patches, we overlay the ground truth masks on the fake images and select patches of size 64×64 wherever the spliced region in the mask and the fake image overlap by 40%. To create a balanced dataset, we compute the number of authentic patches required per image and randomly sample authentic patches of size 64×64 from the authentic images. The second part of the model, called the actual case model (MM A), follows the same design layout as the MM V model (twin network), with an additional layer for implementing feature-transfer learning. We first train the MM V model to learn features from the coarsely spliced Columbia dataset.
Since this dataset has visually perceptible spliced forgeries obtained by a random cut-paste operation, the boundaries surrounding the spliced regions are evident, and we presumed that the third convolutional layer would generalize better and learn the boundary features of the spliced regions. For this reason, we output the activations from each convolutional layer by generating heatmaps and observe that the third convolutional layer of the MM V model (V Conv2d 3) learns and generalizes the boundaries of the spliced regions. The initial layers of any convolutional neural network (CNN) learn features that are local to the input feature set and generalize this learning in the subsequent layers, i.e. they learn spatial hierarchies. Additionally, the features learned by the convolutional layers are translation invariant. The features learned in the V Conv2d 3 layer of the MM V model are then transferred to the MM A model by means of layer weight sharing. This is achieved by loading the trained weights from the V Conv2d 3 layer of the MM V model and applying them to the output (or feature map) of the second convolutional layer (A Conv2d 2) of the MM A model. These weights are not trained during the training of the MM A model, and this learning from the V Conv2d 3 layer serves as an additional layer. To allow the MM A model to also learn features from the input image itself, the usual training progresses from the A Conv2d 2 layer to the A Conv2d 3 layer with the ReLU activation. We then concatenate the features obtained from the feature transfer step (learning transferred from the V Conv2d 3 layer) with those learned in the A Conv2d 3 layer, and provide the result as input to the last convolutional layer of the MM A model. In the feature-transfer learning followed in MM A, we apply the same patch-based extraction technique to sample patches of size 64×64 from fake and authentic images of the realistic splicing datasets to train the MM A model.
For the fake patch extraction, the percentage of overlap (30% for DSO-1 and AbhAS, 12.5% for WildWeb) depends on the size of the spliced region.
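The patch-based extraction described above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the ground truth mask is taken to be a binary grid (1 = spliced pixel), "overlap" is interpreted as the fraction of patch pixels that fall inside the spliced region, and patches are taken on a non-overlapping grid. The function names and the `stride` and `seed` parameters are hypothetical and do not come from the specification.

```python
import random


def spliced_fraction(mask, top, left, size):
    """Fraction of pixels in the size x size window that are marked as spliced."""
    total = sum(mask[r][c]
                for r in range(top, top + size)
                for c in range(left, left + size))
    return total / (size * size)


def extract_fake_patches(mask, size=64, stride=64, min_overlap=0.40):
    """Return (top, left) corners of patches whose spliced fraction meets the threshold."""
    height, width = len(mask), len(mask[0])
    corners = []
    for top in range(0, height - size + 1, stride):
        for left in range(0, width - size + 1, stride):
            if spliced_fraction(mask, top, left, size) >= min_overlap:
                corners.append((top, left))
    return corners


def sample_authentic_patches(height, width, count, size=64, seed=0):
    """Randomly sample `count` patch corners from an authentic image of the given size."""
    rng = random.Random(seed)
    return [(rng.randrange(height - size + 1), rng.randrange(width - size + 1))
            for _ in range(count)]


# Toy 128 x 128 mask: the spliced region covers the left half of the top-left patch.
mask = [[1 if (r < 64 and c < 32) else 0 for c in range(128)] for r in range(128)]
fake_corners = extract_fake_patches(mask, min_overlap=0.40)
authentic_corners = sample_authentic_patches(128, 128, count=len(fake_corners))
print(fake_corners, authentic_corners)
```

Passing `min_overlap=0.30` or `min_overlap=0.125` would correspond to the thresholds quoted for DSO-1, AbhAS and WildWeb; sampling as many authentic corners as fake ones is one simple way to keep the two classes balanced.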
A hybrid classical-quantum neural network is proposed to improve the performance of image splicing detection. The architecture is based on a variational quantum circuit (VQC) that augments and modifies a pre-trained classical network, as shown in Figure 2. A VQC of depth q is defined as a concatenation of several quantum layers (L). Each quantum layer can be regarded as a unitary operation, parameterized by weights, that takes an input state |x⟩ and outputs a state |y⟩. One possible implementation of this unitary operation is a predetermined sequence of entangling operations followed by a series of single-qubit rotations. A quantum layer preserves the Hilbert space dimension of its input states. To further investigate the usefulness of quantum transfer learning (QTL) for image splicing detection, a larger number of training and testing examples is provided using a patch-based approach. Here, 64×64 spliced patches are taken from the fake images with the ground-truth masks superimposed, and 64×64 patches are taken at random from the authentic images.
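To make the notion of a quantum layer concrete, the following is a minimal state-vector sketch in plain Python. It assumes a chain of CNOT gates as the entangling step and RY gates as the single-qubit rotations; the specification does not name the gates, so both choices, as well as the function names and the example weights, are illustrative assumptions. It simulates the circuit classically and does not cover the pre-trained classical network or the measurement step.

```python
import math


def apply_ry(state, qubit, theta):
    """Apply an RY(theta) rotation to one qubit of a real-valued state vector."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    new_state = list(state)
    bit = 1 << qubit
    for i in range(len(state)):
        if not i & bit:
            a0, a1 = state[i], state[i | bit]
            new_state[i] = c * a0 - s * a1
            new_state[i | bit] = s * a0 + c * a1
    return new_state


def apply_cnot(state, control, target):
    """Flip the target qubit on every basis state where the control qubit is 1."""
    new_state = list(state)
    cbit, tbit = 1 << control, 1 << target
    for i in range(len(state)):
        if i & cbit and not i & tbit:
            new_state[i], new_state[i | tbit] = state[i | tbit], state[i]
    return new_state


def quantum_layer(state, weights):
    """One layer: a fixed chain of CNOTs, then one weighted RY rotation per qubit."""
    n_qubits = len(weights)
    for q in range(n_qubits - 1):
        state = apply_cnot(state, q, q + 1)
    for q, theta in enumerate(weights):
        state = apply_ry(state, q, theta)
    return state


def run_vqc(state, layer_weights):
    """Apply the layers in sequence; the depth is the number of weight lists."""
    for weights in layer_weights:
        state = quantum_layer(state, weights)
    return state


initial = [1.0, 0.0, 0.0, 0.0]            # two qubits in the state |00>
weights = [[0.3, 1.1], [0.7, -0.4]]       # depth 2, hypothetical weights
final = run_vqc(initial, weights)
print(len(final), round(sum(a * a for a in final), 6))
```

Because every gate is unitary, the output vector has the same length as the input and its squared amplitudes still sum to one, which is the dimension-preserving property mentioned above. In a trainable setting the rotation angles would be the weights updated during learning.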

Claims: The scope of the invention is defined by the following claims:
1. A System/Method for Passive Blind Image Splicing Forensics Detection using Advanced Learning Techniques comprising the steps of:
a) The dataset is pre-processed to extract features from the spliced and authentic images to obtain the high-energy coefficients which store the maximum information about the given image.
b) The spliced versus authentic images are classified blindly and passively, and the regions of image splicing are localized for the detected spliced (or fake) image.
c) The performance of image splicing detection is enhanced both on the quantum simulator and the quantum processor.
2. As per claim 1, the dataset is pre-processed by applying the Kekre, Cosine and the hybrid Kekre-Cosine transforms to obtain the high-energy coefficients which store the maximum information about the given image.
3. As per claim 1, the socio-inspired twin convolutional neural network model MissMarple, which follows a feature-transfer learning approach, is proposed to detect traces of image splicing.
4. As per claim 1, the hybrid classical-quantum transfer learning approach is proposed to enhance the performance of image splicing detection.

Documents

Application Documents

#	Name	Date
1	202541068706-REQUEST FOR EARLY PUBLICATION(FORM-9) [18-07-2025(online)].pdf	2025-07-18
2	202541068706-FORM-9 [18-07-2025(online)].pdf	2025-07-18
3	202541068706-FORM FOR STARTUP [18-07-2025(online)].pdf	2025-07-18
4	202541068706-FORM FOR SMALL ENTITY(FORM-28) [18-07-2025(online)].pdf	2025-07-18
5	202541068706-FORM 1 [18-07-2025(online)].pdf	2025-07-18
6	202541068706-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [18-07-2025(online)].pdf	2025-07-18
7	202541068706-EVIDENCE FOR REGISTRATION UNDER SSI [18-07-2025(online)].pdf	2025-07-18
8	202541068706-EDUCATIONAL INSTITUTION(S) [18-07-2025(online)].pdf	2025-07-18
9	202541068706-DRAWINGS [18-07-2025(online)].pdf	2025-07-18
10	202541068706-COMPLETE SPECIFICATION [18-07-2025(online)].pdf	2025-07-18