Abstract: A system and method pertain to the processing of data, regardless of the platform (aerial, satellite, drone) used for data ingestion, the type of sensor(s) employed, or the processing device utilized. The system incorporates a unique combination of platform-agnostic data acquisition, a sensor-agnostic processing model, and device-agnostic processing techniques, thereby enabling seamless integration, interpretation, and utilization of aerial/spaceborne data across diverse applications.
FIELD OF THE INVENTION
[1] The following invention relates to multimodal image registration using on-board image processing of geospatial images captured by an aerial/spaceborne platform.
BACKGROUND OF THE INVENTION
[2] Remote sensing technologies offer different sensors capable of capturing different regions of the electromagnetic spectrum, which in turn offer different spatial, temporal and spectral resolutions for a plethora of applications. However, at present the registration of different sensor imagery is challenging due to various factors: different sensors have different processing requirements, imagery formats, capture geometries and capture times, which makes it difficult to analyze the image data from the various sensors as a unified set; a large difference in radiometric and geometric distortions between modalities leads to poor registration performance; and differences in orbit alignment and incidence/view angles result in perspective mismatch, radiometric differences and spatial displacement.
[3] Further, the conventional methods of co-registration of sensor data have issues in establishing a spatial correlation between two multimodal images. Multimodal registration is most complex when the acquisition geometries differ and the co-registration problem is non-bijective, primarily due to variations in the 3D characteristics of the scenes being imaged and projected differently.
[4] An application CN114119685A discloses a multi-modal image registration method based on deep learning, wherein the image registration method comprises the following steps: firstly, putting original images in an image set A into a semantic segmentation network to obtain a segmented image set B; multiplying the corresponding images in the image set B and the image set A pixel by pixel to obtain an image set C; then selecting one image from the image set C as a fixed image, calculating a geometric transformation by using gray value information of the image to obtain a deformed image, calculating the similarity between the deformed image and the reference image, iterating the optimal transformation T with the maximum similarity of the two images through an optimization algorithm, and applying the transformation T to the image set A to complete registration of the images. However, this method has a challenge in the estimation of the geometric transformation using gray value information, which is based on pixel consistency. The cited method assumes the gray values to be similar, which might not be the case in heterogeneous sensor fusion.
[5] Another application CN106327532A discloses a three-dimensional registering method for a single image, and the method comprises three steps: camera calibration, interactive modeling and camera registration. The step of camera calibration comprises the sub-steps of: extracting rough-resolution blanking points by employing a grid method, purifying the rough-resolution blanking points by combining PC lines spatial conversion and an alignment linear point detection algorithm, obtaining candidate blanking points, carrying out optimization and updating of the candidate blanking points to obtain typical blanking points, taking the typical blanking points as the camera calibration features to build a model, carrying out analysis of the typical blanking points, and obtaining the internal and external parameters of a camera, wherein the internal and external parameters comprise a focal length and a rotation matrix. The step of camera registration mainly uses a method based on linear feature alignment to carry out the registering of a three-dimensional model and register the three-dimensional model in a unified three-dimensional scene. This method of camera image registration might not work for other sensors like radar, SAR, etc.
[6] Another application US10460458B1 discloses a method for registration of partially-overlapped images, which comprises (a) performing noise reduction and feature extraction in a reference image and an unregistered image; (b) determining a template size using a phase transition methodology for a sufficiently-sampled finite data set; (c) identifying a template region in the reference image; (d) performing a wide angle estimation of the reference image and the unregistered image; (e) performing orientation and translation of the reference image and the unregistered image; (f) performing a search space reduction of the reference image and the unregistered image; (g) performing a coarse angle estimation of the reference image and the unregistered image; (h) performing orientation of the reference image and the unregistered image of the coarse angle estimation; and (i) performing a fine angle estimation and registration of the reference image and the unregistered image. This is a template-based method, which calculates the fine angle estimation and registration of the reference image and the unregistered image based on a template region, and extension of this method is not promising in the case of more sensors where spatially distributed, varying local distortions exist.
[7] Yet another application CN116664892A discloses a multi-temporal remote sensing image registration method based on cross attention and deformable convolution, and the method solves the problem of poor feature point matching quality by using a brute-force matching (BFMatcher) algorithm for coarse registration, combined with a self-adaptive constraint threshold to screen out high-quality matching points. Still, this method depends on data from a single sensor and may or may not be extended to multi-sensor image data optimization.
[8] Yet another application CN116416289A discloses a multimode image registration method and system based on deep curve learning, and a medium. The method comprises the following steps: inputting a to-be-registered image and a reference image into a trained multimode image registration network; carrying out feature extraction on the stacked to-be-registered image and reference image by using a feature extraction unit to obtain a feature F; carrying out regression based on the feature F by utilizing the global constraint sub-network to obtain a global affine transformation parameter A, and mapping the initial coordinates of the image to be registered into global constraint grid coordinates G0; acquiring accurate image pixel coordinates G1 by using an image coordinate estimation sub-network of forward depth curve learning combined with G0; and resampling the to-be-registered image based on the image pixel coordinates G1 to obtain a registered image. In this method, the initial coordinates of the image are mapped and registered into global constraint grid coordinates by utilizing the global affine transformation parameters, and this might not work well for multimodal imagery having local distortions.
[9] Yet another application CN115082359A discloses a synchronous orbit optical satellite geometric fine correction method based on coastline data and relates to the technical field of stationary orbit optical remote sensing image geometric processing. According to the method, a combination of an optimization algorithm of curve vertex feature description and a spatial relationship multi-similarity measurement mode is used for coastline curve features, so that matching precision can be ensured and robust handling of abnormal and noise points can be greatly improved. According to the method provided by that invention, geometric correction of the geostationary satellite image under cloud interference conditions can be completed, and a public coastline dataset can be used without depending on a high-precision reference image. Yet this method is limited to optical satellite geometric fine correction and does not disclose multi-sensor coregistration.
[10] There is a need for a system that can register multimodal imagery from aerial or space platforms, enabling fusion of complementary information from different sensors for better decision intelligence, using a combination of adaptive non-learning and learning algorithms in conjunction with sensor models, orbit parameters and other metadata that aid registration accuracy.
SUMMARY OF THE INVENTION
[11] The following presents a simplified summary of the disclosure in order to provide a basic understanding to the reader. This summary is not an extensive overview of the disclosure, and it does not identify key/critical elements of the invention or delineate the scope of the invention. Its sole purpose is to present some concepts disclosed herein in a simplified form as a prelude to the more detailed description that is presented later.
[12] It is an object of the present system to register multimodal imagery from aerial or space platforms, enabling fusion of complementary information from different sensors.
[13] It is yet another object of the system to enable decision intelligence and analysis of different sensor imagery irrespective of differences in processing requirements and imagery format, with uniform/consistent capture geometry and capture times.
[14] It is yet another object of the system to employ a combination of adaptive non-learning and learning algorithms in conjunction with sensor models, orbit parameters and other metadata to enhance registration accuracy.
[15] It is yet another object that the system exploits a combination of techniques to establish spatial correlation and account for radiometric differences and relative skewness in features.
[16] In accordance with the present specification, the system comprises a sensor module having distinct sensors, wherein heterogeneous sensory data pre-processing is performed to ensure maximum spatial and temporal matching of the captured multisensory imagery. The present system further aids the registration process, reducing the challenges of image projection planes, perspective differences, different capture times, etc. The sensing payload consists of SAR and optical sensors along with other supporting components such as an on-board data processing unit and control units.
[17] In an aspect the system employs a Platform-Agnostic Data Acquisition model wherein the system enables data ingestion from a wide range of aerial platforms, including satellites, drones, and manned aerial vehicles, without relying on platform-specific protocols or interfaces.
[18] In another aspect the system employs a Sensor-Agnostic processing model that utilizes advanced machine learning and statistical modelling-based techniques to convert the image space into feature space representations, which helps in establishing better correlations between different modalities of imagery including optical, infrared, radar, and multispectral sensors; thereby the system achieves a significant performance edge over conventional solutions.
[19] In another aspect the system comprises a Device-Agnostic processing unit that is designed to operate on two computing platforms, including cloud servers and onboard processors, thereby facilitating seamless deployment and scalability across different computing environments.
BRIEF DESCRIPTION OF THE DRAWINGS
[20] These features and advantages of the present disclosure may be appreciated by
reviewing the following description of the present disclosure, along with the
accompanying figures wherein like reference numerals refer to like parts. Various
embodiments will hereinafter be described in accordance with the appended drawings,
which are provided to illustrate, not limit, the scope, wherein similar designations
denote similar elements, and in which:
[21] Fig 1 depicts the schematic diagram of the system in accordance with the
present specification.
[22] Fig 2 depicts the relation between fixed and moving image.
[23] Fig 3 depicts preprocessing steps for Optical sensors (Multispectral) and SAR
sensors.
[24] Fig 4a-d depicts the resultant image data at each pre-processing step for Optical
sensors (Multispectral) and SAR sensors.
[25] Fig 5a-c shows the coregistration module of the system.
[26] Fig 6a-6b depicts the resultant image data after each processing step in
coregistration module of the present system.
[27] Fig 7a-d depicts one exemplary embodiment wherein the aerial platform is a satellite.
[28] Fig 8a-d depicts another exemplary embodiment where the platform is a drone.
[29] Fig 9 depicts the process of image coregistration by the present system.
[30] Fig 10 depicts the comparative analysis of the present system with the
conventional system.
[31] The accompanying drawings illustrate the embodiments of systems, methods,
and other aspects of the disclosure. Any person with ordinary skill in the art will
appreciate that the illustrated element boundaries (e.g., boxes, groups of boxes, or other
shapes) in the figures represent an example of the boundaries. In some examples, one
element may be designed as multiple elements, or multiple elements may be designed
as one element. In some examples, an element shown as an internal component of one
element may be implemented as an external component in another and vice versa.
Furthermore, the elements may not be drawn to scale.
DETAILED DESCRIPTION
[32] The present disclosure is best understood with reference to the detailed figures
and description set forth herein. Various embodiments have been discussed with
reference to the figures. However, those skilled in the art will readily appreciate that
the detailed descriptions provided herein with respect to the figures are merely for
explanatory purposes, as the methods and systems may extend beyond the described
embodiments. As used in the description herein and throughout the claims that follow,
the meaning of “a,” “an,” and “the” includes plural reference unless the context dictates
otherwise. Also, as used in the description herein, the meaning of “in” includes “in”
and “on” unless the context dictates otherwise.
[33] References to “one embodiment,” “at least one embodiment,” “an
embodiment,” “one example,” “an example,” “for example,” and so on indicate that the
embodiment(s) or example(s) may include a particular feature, structure, characteristic,
property, element, or limitation but that not every embodiment or example necessarily
includes that particular feature, structure, characteristic, property, element, or
limitation. Further, repeated use of the phrase “in an embodiment” does not necessarily
refer to the same embodiment.
[34] Methods of the present specification may be implemented by performing or
completing manually, automatically, or a combination thereof, selected steps or tasks.
The term “method” refers to manners, means, techniques and procedures for
accomplishing a given task including, but not limited to, those manners, means,
techniques, and procedures either known to, or readily developed from known manners,
means, techniques and procedures by practitioners of the art to which the invention
belongs. The descriptions, examples, methods, and materials presented in the claims
and the specification are not to be construed as limiting but rather as illustrative only.
Those skilled in the art will envision many other possible variations within the scope
of the technology described herein.
[35] Referring to Fig 1, which is a schematic representation of a system (100) for multimodal registration of geospatial imagery captured from an aerial platform, the system comprising:
a. Sensor module (102) including heterogeneous sensors: in one of the embodiments, the system comprises Multispectral (n channels with wavelengths not limited to the visible region), Hyperspectral (n' channels where n' >> n), thermal, and SAR (synthetic aperture radar operating in lower wavelengths - X band, C band, S band) sensors. In alternative embodiments, the system comprises sensors like LIDAR, thermal sensors, MSI (microsatellite instability) detection sensors, and the like. Further, each sensor can be associated with an orientation module that allows for proper orientation of the image sensor to enable capturing of images at an appropriate angle.
b. Preprocessing layer (104): This layer employs radiometric and geometric corrections, transformations, orthorectification, noise reduction, normalization, and other image processing techniques to qualify data. The system considers additional sensor parameters to perform geometric modeling of the sensor projection geometry using orbit alignment and incidence angles, which helps in the reduction of geometric and radiometric distortion.
c. Homogeneous latent space (106): It refers to a shared, unified representation
space where data from different modalities (different sensors) are projected in
such a way that their relationships and similarities are preserved.
d. Coregistration module (108): This module facilitates alignment of two or more images of the same area captured by temporally and spatially aligned similar or different sensors, or from different viewing angles, so that corresponding pixels represent the same geographic location. It has two variants - an edge processor for processing data on board in real time, and a ground processor for a higher level of processing with higher pixel registration accuracy.
[36] The present system aims to solve the problem of multi-modal registration. Conventional systems establish spatial correlation between two multimodal images through estimation of a diffeomorphic transformation, which can be seen as a bijective mapping (a smooth and continuous one-to-one mapping with invertible derivatives, i.e. non-zero Jacobian determinant) that optimizes an energy function. But in the case of multi-modal registration the process is most complex because the acquisition geometries differ, and the co-registration problem is non-bijective, primarily due to variations in the 3D characteristics of the scenes being imaged and projected differently. The relation between the fixed and moving image is represented as (Fig 2):
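The expression itself appears only in Fig 2; a standard deformable-registration energy consistent with the symbol definitions that follow is reconstructed here as an assumption, not as the exact form used by the system:

$$\varphi^{*} \;=\; \arg\min_{\varphi}\; \mathcal{D}\!\left(I_{f},\, I_{m}\circ\varphi\right) \;+\; \lambda\, R(\varphi)$$

in which $\mathcal{D}(\cdot,\cdot)$ is a generic image dissimilarity (or negative similarity) measure, an assumed placeholder since the text does not name it.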
In this relation, Im and If denote, respectively, the moving and fixed image;
φ denotes the deformation field that warps the moving image;
R(φ) imposes smoothness on the deformation field; and
λ is the regularization hyper-parameter for the trade-off between image similarity and deformation smoothness.
[37] The system further comprises a memory coupled to the system's coregistration module, wherein the memory stores executable instructions that, when executed by the on-board processing unit, cause the on-board processing unit to:
a. receive information associated with orientation of the aerial platform and
information associated with orientation of image capturing sensors through
the orientation sensors;
b. facilitate reorientation of the image capturing sensors in a synchronized and
aligned manner for obtaining image data;
c. acquire image data from more than one sensor (Sensor A, Sensor B), where these sensors can be Multispectral, SAR, hyperspectral or Thermal sensors that capture image data;
d. pre-process the image data from multiple sensors with the pre-processing
layer;
e. convert the pre-processed image data into the latent space representation;
f. transfer the pre-processed image data from latent space and spatial
databases;
g. perform coarse registration and fine registration; and
h. feed the stack of coregistered image pairs (bands aligned for each sensor)
into the analytics module.
[38] The orientation sensor includes an ADCS (Attitude determination and control)
sensing system for sensing the orientation of the aerial or spaceborne vehicle, and one
or more orientation sensors for sensing orientation of the image capturing sensors.
Further the system comprises a georeferencing unit that includes an Inertial
Measurement Unit (IMU) for sensing acceleration and velocity of the aerial platform at
every point of time, and a space-based radio navigation system (GPS) for sensing a
location of the aerial vehicle with respect to the center of the earth. The system is further
configured to receive co-ordinates of an area of interest and provide geo-referenced, co-registered, spatially and temporally matched datasets for the area of interest.
[39] In accordance with one of the embodiments, the preprocessing layer employs the following functions: denoising, calibration, speckle filtering, contrast enhancement and feature scaling. Fig 3 depicts the preprocessing steps for SAR and Optical (Multispectral) sensors; slight differences in the steps are required when the sensor type changes.
[40] In the case of multi-spectral image sensors, the following steps are performed by the pre-processing layer:
a. Denoising: Noise reduction is a prerequisite step prior to information extraction from remote sensing images. Reducing noise in a remote sensing image is an image restoration problem of recovering the original image from a corrupted image. This problem is intractable unless one can make assumptions about the actual structure of the perfect image. Fig 4a shows the reduction of noise in homogeneous regions, enabling better information extraction.
b. Calibration: Image calibration provides a pixel-to-real-distance conversion factor (i.e. the calibration factor, pixels/cm) that allows image scaling to metric units. This information can then be used throughout the analysis to convert pixel measurements performed on the image to their corresponding values in the real world.
c. Contrast enhancement: “Contrast enhancement” means pixel intensity modification and redistribution to increase visibility. Contrast enhancement is one of the most important pre-processing steps in real-world machine vision systems. It has a wide range of applications in industries ranging from medicine to astronomy to manufacturing, in any case where image processing may occur under sub-optimal lighting conditions. Some known techniques are histogram equalization, Contrast-Limited Adaptive Histogram Equalization (CLAHE), and morphological enhancement.
d. Feature scaling: This step involves enhancing the prominent features by improving the contrast of the captured sensor image. For sensor input that involves more than one channel, an underlying assumption is made, i.e., that all channels are registered perfectly. Hence, a single channel is selected for registration with another sensor band in one of the embodiments. Fig 4b shows Optical (RGB), which is a multispectral sensor with 3 bands (R, G, B). In this embodiment the B band is selected; if a multi-spectral sensor has a panchromatic channel/band present, that band is taken for registration due to better distinguishability of features. This is followed by a sequence of operations (a minimal sketch of these scaling operations is given after this list) –
i. Min-max scaling: The pixel intensity values (digital number, backscatter, reflectance) vary from feature to feature, and this varies differently for every sensor. Hence, this step scales the intensity values from the minimum to the maximum intensity present in the imagery.
ii. Re-scale: After scaling the intensity values from minimum to maximum, there might still be differences due to differences in the minimum and maximum values; hence, re-scaling brings the scale for both sensors to the same range.
iii. Histogram_Norm: Normalizing a histogram is a technique consisting of transforming the discrete distribution of intensities into a discrete distribution of probabilities. This ensures better contrast and helps with improved feature detection. Better variants are Histogram Equalization, Minmax_Gaussian and Adaptive Histogram.
iv. Digitize: This output is then further digitized before being input to the model. Fig 4c shows the feature scaling process for a multispectral image sensor.
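The sketch referenced above is illustrative only; the function names, the use of histogram equalization as the normalization variant, and the quantization levels are assumptions, not the system's exact implementation.

```python
import numpy as np
import cv2

def min_max_scale(band):
    """Scale pixel intensities (DN / backscatter / reflectance) to [0, 1]."""
    band = band.astype(np.float64)
    lo, hi = band.min(), band.max()
    return (band - lo) / (hi - lo + 1e-12)

def rescale_to_common_range(band_a, band_b, out_max=255.0):
    """Bring two already min-max-scaled bands to the same output range."""
    return band_a * out_max, band_b * out_max

def histogram_norm(band_255):
    """Histogram equalization as one possible normalization variant."""
    return cv2.equalizeHist(band_255.astype(np.uint8))

def digitize(band_255, levels=256):
    """Quantize the normalized band to a fixed number of levels before model input."""
    step = 256.0 / levels
    return np.clip(band_255 / step, 0, levels - 1).astype(np.uint8)

# Example use on the selected optical band and the SAR band:
# a, b = rescale_to_common_range(min_max_scale(optical_b), min_max_scale(sar_band))
# a, b = digitize(histogram_norm(a)), digitize(histogram_norm(b))
```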
[41] In the case of SAR sensors, an additional step of speckle filtering is performed before contrast enhancement by the pre-processing layer. Speckle is a general phenomenon in SAR imagery caused by the interaction of the out-of-phase waves reflected from a target. The speckle function uses mathematical models to filter the bright and dark spots that are generated as a result of interference, to allow better image interpretation. Fig 4d shows the feature scaling process for a SAR image sensor.
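For illustration, a basic Lee filter is one common speckle-filtering choice; the specification does not name the exact filter used by the pre-processing layer, so the implementation and window size below are assumptions.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def lee_filter(img, size=7, looks=1.0):
    """Reduce multiplicative speckle using local statistics (Lee-style filter)."""
    img = img.astype(np.float64)
    mean = uniform_filter(img, size)                  # local mean
    sq_mean = uniform_filter(img * img, size)         # local mean of squares
    var = np.maximum(sq_mean - mean ** 2, 0.0)        # local variance
    noise_var = (mean ** 2) / looks                   # multiplicative speckle variance model
    gain = var / (var + noise_var + 1e-12)            # adaptive weighting
    return mean + gain * (img - mean)
```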
[42] Fig 5a shows the coregistration module of the system. The pre-processed image data from the homogeneous latent space is fed as separate inputs based on the type of sensor. In accordance with one of the embodiments, when multispectral and SAR sensors are used for image acquisition, the pre-processed multispectral (optical) latent space (501) and SAR latent space (502) are fed as inputs to the coregistration module. The image data is further processed using techniques such as: scale space generation (504) to enhance images to a required scale; feature detection (506), which involves identifying specific points, regions, or structures in an image that are significant and can be used as references for further analysis, wherein the features are usually characterized by their uniqueness, repeatability, and robustness to variations such as lighting changes, rotations, and scale transformations, and common types of features detected include corners, edges, blobs, and keypoints; feature description (508), wherein the detected features are described in a way that allows for efficient matching and recognition and that captures their distinctive characteristics while being resistant to variations that might occur in real-world images; feature mapping (510), wherein, after detecting and describing features in multiple images, feature matching involves finding correspondences between the features in different images; outlier removal (512), which removes matched features that differ significantly from the local environment; and generating warping coefficients (516) using a polynomial transformation model (514) to account for more general motions such as scale changes, shearing and rotation motions.
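By way of illustration only, the detect / describe / match / reject-outliers / warp chain can be sketched with off-the-shelf primitives; SIFT, brute-force matching and RANSAC below are generic stand-ins (assumptions), not the module's own detectors or its exact polynomial transformation model, and the parameter values are likewise assumed.

```python
import cv2
import numpy as np
from skimage.measure import ransac
from skimage.transform import PolynomialTransform, warp

def coregister(optical_band_u8, sar_band_u8):
    sift = cv2.SIFT_create()
    k1, d1 = sift.detectAndCompute(optical_band_u8, None)   # detection + description (fixed)
    k2, d2 = sift.detectAndCompute(sar_band_u8, None)       # detection + description (moving)

    matches = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True).match(d1, d2)  # feature mapping
    fixed_pts = np.float32([k1[m.queryIdx].pt for m in matches])
    moving_pts = np.float32([k2[m.trainIdx].pt for m in matches])

    # Outlier removal and warping-coefficient estimation: fit a 2nd-order polynomial
    # transform mapping fixed-image coordinates to moving-image coordinates, with RANSAC.
    model, inliers = ransac((fixed_pts, moving_pts), PolynomialTransform,
                            min_samples=10, residual_threshold=2.0, max_trials=1000)

    # Resample the moving (SAR) band onto the fixed (optical) grid.
    registered = warp(sar_band_u8, model, output_shape=optical_band_u8.shape)
    return registered, inliers
```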
[43] In accordance with the present specification, the coregistration module adopts two levels of processing to correct for coarse and fine spatial shifts. The coarse shifts (Fig 5b) are corrected based on the approach of identifying common prominent key points to be matched in the multimodal slave and master imagery. The slave imagery is warped to the master using the matched key points. As shown in Fig 5b, the images are coarse coregistered if:
a. translation o1-o2 tends to zero;
b. rotation, i.e., theta -> 0 (tends to zero);
c. temporal baseline -> 0 (tends to zero),
wherein I1, I2 are images of the Area of Interest (AOI),
x-y and x'-y' represent the coordinate frames for the corresponding images, and
o1 and o2 represent the origins of the coordinate frames.
The fine shifts (Fig 5c) are corrected using a learning-based approach to correct for feature-level distortion, achieving close to pixel-level registration accuracy in the case of dense urban regions. Thus, it effectively finds the spatial and radiometric mapping between two modalities, resulting in better registration accuracies than conventional approaches. As shown in Fig 5c, the images are fine registered if:
a. I1, I2 are images of the AOI;
b. x-y, x'-y' represent coordinate frames for corresponding features f1, f2 as shown;
c. o1 and o2 represent the origins of the coordinate frames corresponding to features f1 and f2,
wherein for every feature f present in the image:
a. translation o1-o2 tends to zero;
b. rotation, i.e., theta -> 0 (tends to zero);
c. temporal baseline -> 0 (tends to zero);
d. shear -> 0 (tends to zero).
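A hedged sketch of how the criteria listed above can be checked numerically from an estimated 2x3 affine warp between matched features; the decomposition convention used here is a standard one assumed for illustration, not taken from the specification.

```python
import numpy as np

def residual_misalignment(affine_2x3):
    """Extract translation, rotation and shear residuals from a 2x3 affine warp."""
    A = np.asarray(affine_2x3, dtype=float)
    translation = A[:, 2]                        # residual offset between origins o1 and o2
    M = A[:, :2]
    rotation = np.arctan2(M[1, 0], M[0, 0])      # residual theta
    sx = np.hypot(M[0, 0], M[1, 0])              # scale along x, used for normalization
    shear = (M[0, 0] * M[0, 1] + M[1, 0] * M[1, 1]) / (sx ** 2)  # zero when columns are orthogonal
    return {"translation_px": translation, "rotation_rad": rotation, "shear": shear}

# Coarse registration: translation and rotation (and temporal baseline) tend to zero.
# Fine registration: additionally, the shear term per feature tends to zero.
```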
[44] In accordance with one of the embodiments, the feature detection algorithms used for multispectral and SAR image data are (Fig 6a):
a. Harris Corner Detection: detects corners in an image by analyzing changes in intensity in different directions;
b. Laplacian of Gaussian (LoG): useful for detecting edges that appear at various image scales or degrees of image focus. The exact sizes of the two kernels used to approximate the Laplacian of Gaussian determine the scale of the difference image, which may appear blurry as a result;
c. Sobel operator: performs a 2-D spatial gradient measurement on an image and so emphasizes regions of high spatial frequency that correspond to edges;
d. Edge detection is a fundamental issue in automatic target detection using synthetic aperture radar (SAR) images. Edges are associated with intensity changes in the image and are efficient descriptors of the image structure. Due to the presence of speckle, edge detection in SAR images is extremely difficult. A ROEWA-based algorithm automatically discriminates the object boundaries from false edges.
The results of feature detection in both multispectral image data and SAR image data are shown in Fig 6b.
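For illustration, the Harris, Laplacian-of-Gaussian and Sobel responses can be computed with standard OpenCV/SciPy calls as below; the parameter values are assumptions, and the ROEWA edge detector is omitted because it has no common off-the-shelf implementation.

```python
import cv2
import numpy as np
from scipy.ndimage import gaussian_laplace

def detect_features(band_u8):
    gray = band_u8.astype(np.float32)
    harris = cv2.cornerHarris(gray, blockSize=2, ksize=3, k=0.04)   # corner response
    log = gaussian_laplace(gray, sigma=2.0)                         # Laplacian of Gaussian response
    sobel_x = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)            # horizontal gradient
    sobel_y = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)            # vertical gradient
    edges = np.hypot(sobel_x, sobel_y)                              # gradient magnitude
    corners = harris > 0.01 * harris.max()                          # thresholded corner mask
    return corners, log, edges
```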
[45] In one of the exemplary embodiments, the aerial platform is a satellite and two different image sensors, multispectral and SAR, are used as the image capturing devices; therefore two types of images are acquired. This image data is pre-processed to remove any noise, and the data from each individual sensor is further processed using the coregistration module. Fig 7a shows the localized features to be corrected and matched for each sensor; here multispectral is used as the reference and SAR is to be aligned using the multispectral image data. Fig 7b shows the mapping functions for the top k features selected for the alignment, shown by arrows of different colours depicting different features. Fig 7c shows the shift histograms; this is the global shift, where the individual feature shifts are averaged out and a single resultant vector is output for alignment. Fig 7d shows the final processed and corrected image data.
[46] In another exemplary embodiment, the aerial platform is a drone and two different image sensors, multispectral and SAR, are used as the image capturing devices; therefore two images are acquired (Fig 8a). This image data is pre-processed to remove any noise, and the image data from each individual sensor is further processed using the coregistration module. Fig 8b shows the localized features to be corrected and matched for each sensor; here multispectral is used as the reference and SAR is to be aligned using the multispectral image data. Fig 8c shows the mapping functions for the top k features selected for the alignment, shown by arrows of different colours depicting different features. Fig 8d shows the final processed and corrected SAR image data.
[47] Fig 9 represents the process flow of the coregistration of images using the present system. As shown in the figure, the present system performs the coregistration, which comprises the steps of:
a. acquiring image data from more than one sensor (Sensor A, Sensor B), where these sensors can be Multispectral, SAR, hyperspectral or Thermal sensors that capture image data, and each sensor's data has some distinctiveness in terms of image capture;
b. pre-processing the image data from multiple sensors with the pre-processing layer, wherein the pre-processing is sensor-independent processing that includes steps of dynamic range normalization and histogram corrections, which involve adjusting the pixel values of images so that they fall within a common range, resulting in a better correlation that can be established; further, attitude information from the sensor is incorporated to account for an initial coarser correction/alignment;
c. converting the pre-processed image data into the latent space representation;
d. transferring data from the latent space and spatial databases, wherein the spatial database refers to a vector database that is exploited to understand the spatial context of the captured AOI imagery; it contains the information of the captured area of interest (AOI), which can be used to understand the nature of misalignment and extract the area of common overlap for two imagery AOIs, and attitude information from the sensor can be used to perform initial corrections/alignments;
e. performing coarse registration, which includes steps of:
i. estimating warping coefficients using higher-order polynomial transformation models, to account for more general motions such as scale changes, shearing and rotation motions;
ii. performing geometric and radiometric corrections by utilizing the underlying digital elevation model (DEM) of the captured AOI to account for terrain-induced distortions, whereas the radiometric corrections are achieved using averaged normalized radiometric values of the captured scene;
iii. conducting a frame-level transformation from the warping coefficients, where frames refer to the image planes of the two sensors that are transformed using the estimated warping coefficients to align the two images of the two sensors effectively;
f. performing fine registration (a minimal sketch of the resampling step is given after this list), which includes steps of:
i. estimating a diffeomorphic transformation function that addresses local distortions due to spatial variance in errors, wherein the estimation is done using a deep learning technique that estimates flow vectors for each pixel and attempts to find higher-order polynomial transformation models that could be estimated; and
ii. correcting feature-wise misalignments;
g. feeding the stack of coregistered image pairs (bands aligned for each sensor) into the analytics module, or performing some feature engineering before feeding, based on the underlying use case.
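As referenced in step f above, the sketch below covers only the resampling performed once per-pixel flow vectors are available; the flow-estimation network itself is not detailed in the specification, so the flow field here is simply taken as a given input (an assumption).

```python
import numpy as np
from scipy.ndimage import map_coordinates

def apply_flow(moving, flow):
    """moving: HxW image; flow: HxWx2 per-pixel (dy, dx) displacement vectors."""
    h, w = moving.shape
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    coords = np.stack([yy + flow[..., 0], xx + flow[..., 1]])   # sampling positions
    return map_coordinates(moving, coords, order=1, mode="nearest")
```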
[48] Fig 10 shows a comparative study of the present system, "Galaxeye", with one of the existing algorithms, "SOTA". Fig 10a shows relative X, Y shifts in images of the present system and the output image of the SOTA algorithm.
System | Accuracy | Spatial corrections
Arosics (SOTA) | RMSE of 0.89 | Global, Windowed Local
Galaxeye (present system) | RMSE of 0.78 | Global, Feature Local
A comparison of the registration performance of the Galaxeye coregistration and Arosics coregistration algorithms is illustrated above. From the checkerboard mosaiced images (band 02 optical Sentinel-2 and VV Sentinel-1) in Fig 10b, it is clear that the translation error in the imagery has been efficiently reduced. The red circle highlights the region alignments corresponding to each algorithm.
[49] Further, for a numerical comparison, 4 pairs of checkpoints with a good distribution are manually located and the registration accuracies (RMSE) are evaluated. Fig 10c shows the comparison of the registration performance and directional shifts of each algorithm w.r.t. the selected checkpoints. GCPn represents the nth checkpoint taken into consideration. "1" represents the final alignment of the point after Arosics coregistration, whereas "2" represents the final alignment after Galaxeye coregistration. The shift closer to GCPn indicates the registration accuracy of the algorithm under consideration.
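Purely as a worked illustration of how a checkpoint-based RMSE figure is computed, with hypothetical residual shifts (not the measured values reported in Fig 10):

```python
import numpy as np

residual_shifts = np.array([   # (dx, dy) residual in pixels per checkpoint - assumed values
    [0.6, -0.4],
    [-0.8, 0.5],
    [0.9, -0.7],
    [-0.5, 0.6],
])
rmse = np.sqrt(np.mean(np.sum(residual_shifts ** 2, axis=1)))  # RMS of Euclidean residuals
print(f"Registration RMSE: {rmse:.2f} px")
```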
[50] The present method of coregistration is a fundamental step for fused derived data products that include cloud removal in optical imagery, regeneration of optical imagery from SAR, 3D height estimation of man-made features, etc. It can be useful in applications such as vegetation monitoring, where optical sensors are used to extract height information and SAR sensors are used for vegetation growth monitoring; defense applications, for camouflage detection; and in agriculture, disaster management, insurance, defense and maritime domains.
[51] No language in the specification should be construed as indicating any non-claimed element as essential to the practice of the invention.
[52] It will be apparent to those skilled in the art that various modifications and variations can be made to the present specification without departing from the scope of the invention. There is no intention to limit the invention to the specific form or forms disclosed. On the contrary, the intention is to cover all modifications, alternative constructions, and equivalents falling within the scope of the invention, as defined in the appended claims. Thus, it is intended that the present specification cover the modifications and variations of this invention, provided they are within the scope of the appended claims and their equivalents.
Dated: 22nd August 2024 Signature
Disha Shah -IN/PA-4826
CLAIMS:
I/We claim:
1. A system (100) for multimodal registration of geospatial imagery captured from
aerial vehicle, such system comprising:
a. a sensor module (102) including multiple distinct image capturing sensors;
b. a georeferencing unit that includes an Inertial Measurement Unit (IMU) for
sensing acceleration and velocity of the aerial or spaceborne vehicle at every
point of time;
c. a preprocessing layer (104) that employs radiometric and geometric
corrections, transformations, orthorectification, noise reduction,
normalization, and other image processing techniques on the images captured
by the sensor module (102);
d. a Homogeneous latent space (106) that is a shared, unified representation
space wherein relationships and similarities of data from different image
capturing sensors are preserved; and
e. a Coregistration module (108) that facilitates alignment of two or more images
of the same area captured by temporally and spatially aligned similar or
different sensors, or from different viewing angles so that corresponding pixels
represent the same geographic location.
2. The system as claimed in claim 1 wherein the sensor module includes
Multispectral (n channels with wavelength not limited to visible region) sensor,
Hyperspectral (n’ channels where n’ >> n) sensor, thermal sensors, SAR
(synthetic aperture radar operating in lower wavelengths - X band, C band, S
band), LIDAR, MSI (microsatellite instability) detection sensors, orientation
sensors wherein each sensor is associated with one orientation sensor that
allows for proper orientation of the image sensor to enable capturing of images
at appropriate angle.
3. The system as claimed in claim 1 wherein the pre-processing layer (104) acquires
additional sensor parameters to perform Geometric modeling of sensor
projection geometry using orbit alignment and incidence angles that help in
reduction of geometric and radiometric distortion.
4. The system as claimed in claim 1 wherein the coregistration module has two
variants: edge processor for processing data on board in real time, and ground
processor for higher computing level of processing having higher pixel
registration accuracy.
5. The system as claimed in claim 1 includes a space-based radio navigation
system (GPS) for sensing a location of the aerial vehicle with respect to the
center of the earth.
6. A method of coregistration of images captured by an aerial platform, such
method comprises steps of:
a. acquiring image data from more than one image capturing sensor;
b. pre-processing the image data from multiple sensors with the pre-processing layer;
c. converting the pre-processed image data into the latent space representation;
d. transferring data from latent space and spatial databases;
e. performing coarse registration which includes steps of
i. estimating warping coefficients using higher-order polynomial transformation models, to account for more general motions such
as scale changes and rotation motions;
ii. performing Geometric and Radiometric corrections by utilizing the
underlying digital elevation model (DEM) of the captured AOI to account
for terrain induced distortions whereas radiometric corrections are
achieved using averaged normalized radiometric values for the captured
scene;
iii. conducting a Frame level transformation from warping coefficients, here
frames refer to the image planes for the two sensors that is transformed
using estimated warping coefficients to align the two images of two
sensors effectively;
f. performing fine registration which includes steps of:
i. estimating a diffeomorphic transformation that addresses local distortions due to
spatial variance in errors, wherein the estimation is done using deep
learning technique that estimates flow vectors for each pixel which tries
to find higher order polynomial transformation models that could be
estimated; and
ii. correcting feature wise misalignments;
g. feeding the stack of coregistered image pairs (bands aligned for each sensor)
into the analytics module or some feature engineering is done before feeding
based on the underlying use case.
7. The method as claimed in claim 6 wherein the image capturing sensors can be Multispectral, SAR, hyperspectral or Thermal sensors that capture image data.
8. The method as claimed in claim 6 wherein the pre-processing is sensor-independent processing that includes steps of Dynamic range normalization and histogram corrections which involve adjusting the pixel values of images so that they fall within a common range, resulting in a better correlation that could be established, and further attitude information from the sensor is incorporated to account for an initial coarser correction/alignment.