
A System For Acoustic Source Signature Detection, Classification, And Localization And Its Method Thereof

Abstract: This invention discloses an advanced acoustic source localization system featuring a unique three-dimensional microphone array arrangement for enhanced accuracy. The system employs 3D Angle of Arrival (AoA) estimation and AI-based confidence range estimation to localize acoustic sources. The 7-sensor array acquires wavefront data simultaneously, enabling accurate bearing calculations and AoA computation in both azimuthal and elevation planes. Incorporating range estimation with AoA provides comprehensive localization. Operating standalone, the system estimates geo-coordinates locally without remote servers, reducing infrastructure dependence. Optionally, it leverages networks for increased accuracy from remote servers. Offering improved precision, situational awareness, and resilience in complex environments, this system represents a significant advancement over existing single-plane AoA or remote server-based solutions.


Patent Information

Application #
Filing Date
12 February 2025
Publication Number
09/2025
Publication Type
INA
Invention Field
PHYSICS
Status
Parent Application

Applicants

Linearized Amplifier Technology & Services Private Limited
Incubation Building, iHub Divya Sampark, I.I.T Roorkee, Haridwar, Roorkee, Uttarakhand 247667
Indian Institute of Technology Roorkee
Indian Institute of Technology Roorkee, Roorkee, Haridwar, Uttarakhand, 247667

Inventors

1. Karun Rawat
Department of Electronics & Communication Engineering, I.I.T Roorkee, Roorkee, Uttarakhand, 247667
2. Meenakshi Rawat
Department of Electronics & Communication Engineering, I.I.T Roorkee, Roorkee, Uttarakhand, 247667
3. Parth Sarthi Mishra
Department of Electronics & Communication Engineering, I.I.T Roorkee, Roorkee, Uttarakhand, 247667
4. Akash Jha
Department of Electronics & Communication Engineering, I.I.T Roorkee, Roorkee, Uttarakhand, 247667
5. Harsh Kumar
Department of Electronics & Communication Engineering, I.I.T Roorkee, Roorkee, Uttarakhand, 247667
6. Amit Kumar
Department of Electronics & Communication Engineering, I.I.T Roorkee, Roorkee, Uttarakhand, 247667
7. Aniket Vishwakarma
Department of Electronics & Communication Engineering, I.I.T Roorkee, Roorkee, Uttarakhand, 247667
8. Vivek Sharma
Incubation Building, iHub Divya Sampark, I.I.T Roorkee, Haridwar, Roorkee, Uttarakhand 247667

Specification

Description:

FIELD OF THE INVENTION
The present disclosure relates to a system for acoustic source signature detection, classification, and localization, and its method thereof. More particularly, the present invention relates to acoustic signature detection, signal processing, acoustic source localization, triangulation, acoustic source classification, and the development of a three-dimensional sensor array for acoustic applications.

BACKGROUND OF THE INVENTION
Accurate localization of acoustic sources, such as gunshots or explosions, is crucial for various applications, including law enforcement, military operations, and security monitoring. However, existing acoustic source localization systems suffer from several limitations that hinder their performance and adaptability.
The first topology, which relies on remote localization estimation and a central server for data aggregation, restricts the system's flexibility and requires prior deployment of fixed acoustic sensing nodes. This approach limits the system's ability to adapt to dynamic environments and introduces a dependency on a robust network infrastructure. The use of Time-Difference of Arrival (TDoA) based localization algorithms further exacerbates this dependency, as it requires reliable data transmission from multiple sensing nodes to the central server.
The second topology, which employs microphone arrays arranged in planar or tetrahedral geometries, suffers from limited accuracy in estimating the Angle of Arrival (AoA) of acoustic sources. This limitation arises from the restricted number of sensors available for capturing the acoustic wavefront at any given instance. Additionally, this topology is incapable of measuring the range of the acoustic source, thereby limiting its ability to provide comprehensive localization information beyond AoA.

To address these limitations, there is a need for an acoustic source localization system that can operate independently without relying on a central server or fixed sensing node positions. Moreover, the system should be capable of accurately estimating both the AoA and range of acoustic sources, enabling precise localization in dynamic environments. By overcoming the drawbacks of existing topologies, such a system would greatly enhance the capabilities of acoustic source localization for various applications, including law enforcement, military operations, and security monitoring.
In view of the foregoing discussion, there is clearly a need for a system for acoustic source signature detection, classification, and localization, and its method thereof.

SUMMARY OF THE INVENTION
The present disclosure relates to a system for acoustic source signature detection, classification, and localization, and its method thereof. The present invention introduces an advanced acoustic source signature detection, classification, and localization system featuring a unique three-dimensional microphone array arrangement. This arrangement enables enhanced accuracy in localizing acoustic sources in three-dimensional space by employing 3-Dimensional Angle of Arrival (AoA) estimation and confidence-based range estimation techniques. The system comprises a three-dimensional microphone array with seven sensors strategically positioned to acquire acoustic wavefront data simultaneously. This geometric configuration allows for an increased number of microphone pairs within each plane, facilitating accurate bearing calculations and enhancing the system's overall localization accuracy, even in complex environments with multiple sound sources or reflections.
Unlike existing methods that compute AoA in a single plane, the present invention calculates AoA comprehensively in both azimuthal and elevation planes. This approach provides a detailed understanding of the spatial orientation of the acoustic source, improving localization accuracy and situational awareness.
Furthermore, the system incorporates AI-based algorithms for confidence-based range estimation, enabling it to localize acoustic sources by combining AoA and range information. This capability sets the invention apart from traditional methods limited to AoA estimation alone.
Notably, the proposed technology can estimate geo-coordinates locally without relying on a remote network, reducing dependency on external infrastructure and enhancing system resilience in challenging operating conditions. However, the system can also be configured to leverage wired or wireless networks for enhanced accuracy by obtaining the location of the acoustic source from a remote server.
The present invention seeks to provide a system for acoustic source signature detection, classification, and localization. The system comprises: one or more acoustic sensing and localization nodes, each comprising: a 3-dimensional microphone array consisting of seven acoustic sensors configured to capture acoustic signals in a 3D space; a unique acoustic signature detection module operatively connected to the 3-dimensional microphone array, configured to detect specific acoustic signatures from the captured signals; a 3-dimensional angle of arrival (AoA) calculator configured to compute azimuthal and elevation angles of arrival of the detected acoustic source; a confidence-based range estimation module configured to estimate the range of the detected acoustic source from the acoustic sensing and localization node based on the calculated AoA and the captured signals; and a local display configured to present the computed AoA, estimated range, and classified source type for user situational awareness. The system further comprises: a local wireless network operatively connected to the one or more acoustic sensing and localization nodes via a local wireless network interface and configured to transmit data; a central server operatively connected to the local wireless network and configured to execute the AoA-based localization technique, process the transmitted data, and retransmit the localization results to the acoustic sensing and localization nodes; and one or more user UX (user experience) devices operatively connected to the local wireless network, configured to receive and display the localization results for user interaction and control.
In an embodiment, the 3-dimensional microphone array is arranged to capture acoustic signals in real-time and in a spatially distributed manner to enhance the accuracy of AoA calculations and range estimations, wherein the 3-dimensional microphone array comprises: a spherical structure subdivided into quadrilateral sections with equal surface areas, resulting in six faces and eight vertices; a plurality of rods, each extruded from one of the vertices, forming a spherically symmetric configuration with an angle of 65-75 degrees, preferably 70.529 degrees, between any two adjacent rods; a plurality of microphones strategically housed within the structure, wherein the geometric configuration ensures that each microphone captures data of the acoustic wavefront simultaneously, wherein preferably seven microphones are strategically housed within the structure, wherein the strategic positioning of the microphones within the structure enables the detection and elimination of false data that conventional planar or tetrahedral microphone arrangements cannot achieve; an arrangement of microphone pairs within each plane, enhancing the system's ability to detect and localize acoustic sources with high accuracy, wherein the arrangement of microphones is adaptable to variations in size while maintaining the same geometric structure, provided the number of microphones remains constant, and wherein the arrangement of microphones is configurable to adopt a different geometric structure, subject to a change in the number of microphones, as determined by the specific application requirements and the available chassis shape and size; a data filtering mechanism that utilizes data received from specific opposite microphone pairs to eliminate false or anomalous data within the received waveform; an angle of arrival (AoA) estimation module that determines the direction of the acoustic source in both azimuthal and elevation planes with precision; and a central processing unit configured to process the data captured by the microphones and provide a comprehensive snapshot of the acoustic environment in real-time.
In an embodiment, the unique acoustic signature detection module is configured to operate in parallel with the 3-dimensional angle of arrival (AoA) calculator and confidence-based range estimation module to simultaneously detect, classify, and localize the acoustic source, wherein the unique acoustic signature detection module comprises: a convolutional neural network (CNN) architecture designed to process acoustic spectrograms with an input layer configured to accept data of shape (13, 1000, 1), where 13 represents the number of features or channels, 1000 represents the temporal length, and 1 denotes the number of channels in the input data; a first Conv2D layer consisting of 64 filters with a kernel size of 3x3, configured to perform 2D convolutional operations on the input data, extracting spatial and temporal features across different frequency bands; a MaxPooling2D layer with a pooling size of 2x2, configured to downsample the feature maps by selecting the maximum value within each 2x2 region, effectively reducing the spatial dimensions while retaining critical information; a second Conv2D layer with 64 filters and a kernel size of 3x3, designed to further refine the extracted features, enhancing the model's ability to capture intricate patterns in unique acoustic signatures; a flatten layer configured to transform the multi-dimensional output of the convolutional and pooling layers into a one-dimensional vector for subsequent classification tasks; a first dense layer with 512 neurons, fully connected to the flatten layer, employing ReLU activation to facilitate the learning of complex non-linear decision boundaries between acoustic signatures; a second dense layer with 128 neurons, configured to further process the feature vector, refining the model's decision-making capabilities; an output layer with a single neuron and sigmoid activation function, designed to classify the input signal as either a gunshot or non-gunshot acoustic event; and a customizable and scalable architecture, wherein the number of layers, the size of each layer, and the number of neurons within each layer are optimized and fine-tuned to enhance the model's accuracy and predictive capabilities.
In an embodiment, the local display is configured to provide a visual interface displaying the real-time 3-dimensional AoA, the estimated range of the acoustic source, and the classified source type for situational awareness and decision-making, wherein the local wireless network is configured to ensure continuous data transmission between the acoustic sensing and localization nodes, the central server, and the user UX devices, thereby enabling real-time acoustic source detection, classification, and localization across multiple nodes, and wherein the central server is configured to aggregate data from multiple acoustic sensing and localization nodes, thereby performing central computations to increase the confidence and accuracy of acoustic source localization, and to retransmit the aggregated localization results to the networked nodes and user UX devices for real-time display and monitoring.
In an embodiment, the AoA-based localization technique is configured to utilize the data received from the 3-dimensional microphone array to compute the precise location of the acoustic source, dynamically update the computed AoA and range based on continuous signal analysis, and transmit the updated results to the local display and user UX devices, wherein the AoA-based localization technique employs: geometric extrapolation techniques to estimate the position of the acoustic source by determining the intersection points of lines corresponding to the AoA data from different nodes; graphical intersection calculation techniques to identify a consistent intersection point, even in the presence of AoA calculation errors, thereby ensuring accurate triangulation of the acoustic source; and approximation techniques that compute the approximate intersection point of AoA lines rather than seeking exact mathematical solutions, thus enhancing the reliability of the triangulation process in scenarios where AoA data may be slightly erroneous.
In an embodiment, the system further comprises a network-independent mode wherein each acoustic sensing and localization node can independently detect, classify, and localize the acoustic source without relying on the local wireless network, and display the results locally on the local display.
The present invention also seeks to provide a method for acoustic source signature detection, classification, and localization. The method comprises: capturing acoustic signals using a 3-dimensional microphone array at one or more acoustic sensing and localization nodes (300); detecting specific acoustic signatures from the captured signals using a unique acoustic signature detection module; calculating the azimuthal and elevation angles of arrival (AoA) of detected acoustic source using a 3-dimensional angle of arrival (AoA) calculator; estimating the range of the detected acoustic source from the acoustic sensing and localization node using a confidence-based range estimation module; displaying the computed AoA, estimated range, and classified source type on a local display for user situational awareness; transmitting the data from the acoustic sensing and localization node to a central server via a local wireless network; and executing an AoA-based localization technique at the central server to process the transmitted data and retransmitting the localization results to the acoustic sensing and localization nodes and user UX devices for real-time display.
In an embodiment, the method further comprises the step of aggregating data from multiple acoustic sensing and localization nodes at the central server to increase the accuracy and confidence of the acoustic source localization results.
In an embodiment, the method further comprises the step of operating the system in a network-independent mode, where each acoustic sensing and localization node can independently detect, classify, and localize the acoustic source, displaying the results locally without relying on the local wireless network.
An objective of the present disclosure is to provide a system for acoustic source signature detection, classification, and localization, and its method thereof.
Another object of the present disclosure is to provide an advanced acoustic source signature detection, classification, and localization system that offers enhanced accuracy through a unique three-dimensional microphone array arrangement.
Another objective of the present disclosure is to enable comprehensive Angle of Arrival (AoA) calculation in both azimuthal and elevation planes, providing a detailed understanding of the spatial orientation of the acoustic source.
Another objective of the present disclosure is to incorporate confidence-based range estimation capabilities, allowing the system to localize acoustic sources by combining AoA and range information.
Another objective of the present disclosure is to design a standalone system capable of estimating geo-coordinates locally without the need for a remote network, reducing dependency on external infrastructure and enhancing system resilience.
Another objective of the present disclosure is to provide a configurable system that can leverage wired or wireless networks for enhanced accuracy by obtaining the location of the acoustic source from a remote server.
Another objective of the present disclosure is to improve situational awareness for users by providing comprehensive localization information, including AoA, range, and geo-coordinates of acoustic sources.
Yet, another objective of the present disclosure is to enhance the system's adaptability and reliability in various environments, including complex scenarios with multiple sound sources or reflections.
To further clarify advantages and features of the present disclosure, a more particular description of the invention will be rendered by reference to specific embodiments thereof, which are illustrated in the appended drawings. It is appreciated that these drawings depict only typical embodiments of the invention and are therefore not to be considered limiting of its scope. The invention will be described and explained with additional specificity and detail with the accompanying drawings.

BRIEF DESCRIPTION OF FIGURES
These and other features, aspects, and advantages of the present disclosure will become better understood when the following detailed description is read with reference to the accompanying drawings in which like characters represent like parts throughout the drawings, wherein:
Figure 1 illustrates conventional architecture topology-1 of acoustic signature detection and source localization system in accordance with an embodiment of the present disclosure;
Figure 2 illustrates conventional architecture topology-2 of acoustic signature detection and source localization system in accordance with an embodiment of the present disclosure;
Figure 3 illustrates a detailed architecture of the proposed Acoustic signature detection and source localization system in accordance with an embodiment of the present disclosure;
Figure 4 illustrates a detailed geometry of 3-Dimensional Microphone sensor array, in accordance with an embodiment of the present disclosure;
Figure 5 illustrates detailed internal architecture of unique acoustic signature detector, in accordance with an embodiment of the present disclosure;
Figure 6 illustrates detailed internal architecture of 3-Dimensional Angle of Arrival (AoA) Calculator, in accordance with an embodiment of the present disclosure;
Figure 7 illustrates detailed internal architecture of Confidence based range estimation for Source Classification, in accordance with an embodiment of the present disclosure;
Figure 8 illustrates detailed internal architecture of Confidence based range estimation for Range Estimation, in accordance with an embodiment of the present disclosure;
Figure 9 illustrates detailed internal architecture of AoA (Angle of Arrival) based localization algorithm, in accordance with an embodiment of the present disclosure;
Figure 10 illustrates confusion matrix showing accuracy of Acoustic Signature Detector in accordance with an embodiment of the present disclosure;
Figure 11 illustrates measured results of 3D AoA estimation showcasing azimuthal and elevation angle of arrival from the acoustic source at different locations in accordance with an embodiment of the present disclosure;
Figure 12 illustrates accuracy of Confidence based range estimation – Source classification for Firearm Classification in accordance with an embodiment of the present disclosure;
Figure 13 illustrates range estimation for samples taken at different ranges from the node in accordance with an embodiment of the present disclosure;
Figure 14 illustrates a block diagram of a system for acoustic source signature detection, classification, and localization in accordance with an embodiment of the present disclosure; and
Figure 15 illustrates a flow chart of a method for acoustic source signature detection, classification, and localization in accordance with an embodiment of the present disclosure.
Further, skilled artisans will appreciate that elements in the drawings are illustrated for simplicity and may not necessarily have been drawn to scale. For example, the flow charts illustrate the method in terms of the most prominent steps involved to help improve understanding of aspects of the present disclosure. Furthermore, in terms of the construction of the device, one or more components of the device may have been represented in the drawings by conventional symbols, and the drawings may show only those specific details that are pertinent to understanding the embodiments of the present disclosure so as not to obscure the drawings with details that will be readily apparent to those of ordinary skill in the art having the benefit of the description herein.

DETAILED DESCRIPTION:
For the purpose of promoting an understanding of the principles of the invention, reference will now be made to the embodiment illustrated in the drawings and specific language will be used to describe the same. It will nevertheless be understood that no limitation of the scope of the invention is thereby intended, such alterations and further modifications in the illustrated system, and such further applications of the principles of the invention as illustrated therein being contemplated as would normally occur to one skilled in the art to which the invention relates.
It will be understood by those skilled in the art that the foregoing general description and the following detailed description are exemplary and explanatory of the invention and are not intended to be restrictive thereof.
Reference throughout this specification to “an aspect”, “another aspect” or similar language means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. Thus, appearances of the phrase “in an embodiment”, “in another embodiment” and similar language throughout this specification may, but do not necessarily, all refer to the same embodiment.
The terms "comprises", "comprising", or any other variations thereof, are intended to cover a non-exclusive inclusion, such that a process or method that comprises a list of steps does not include only those steps but may include other steps not expressly listed or inherent to such process or method. Similarly, one or more devices or sub-systems or elements or structures or components preceded by "comprises...a" does not, without more constraints, preclude the existence of other devices or other sub-systems or other elements or other structures or other components or additional devices or additional sub-systems or additional elements or additional structures or additional components.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The system, methods, and examples provided herein are illustrative only and not intended to be limiting.
Embodiments of the present disclosure will be described below in detail with reference to the accompanying drawings.
The functional units described in this specification have been labeled as devices. A device may be implemented in programmable hardware devices such as processors, digital signal processors, central processing units, field programmable gate arrays, programmable array logic, programmable logic devices, cloud processing systems, or the like. The devices may also be implemented in software for execution by various types of processors. An identified device may include executable code and may, for instance, comprise one or more physical or logical blocks of computer instructions, which may, for instance, be organized as an object, procedure, function, or other construct. Nevertheless, the executable of an identified device need not be physically located together but may comprise disparate instructions stored in different locations which, when joined logically together, comprise the device and achieve the stated purpose of the device.
Indeed, an executable code of a device or module could be a single instruction, or many instructions, and may even be distributed over several different code segments, among different applications, and across several memory devices. Similarly, operational data may be identified and illustrated herein within the device and may be embodied in any suitable form and organized within any suitable type of data structure. The operational data may be collected as a single data set or may be distributed over different locations including over different storage devices, and may exist, at least partially, as electronic signals on a system or network.
Reference throughout this specification to “a select embodiment,” “one embodiment,” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the disclosed subject matter. Thus, appearances of the phrases “a select embodiment,” “in one embodiment,” or “in an embodiment” in various places throughout this specification are not necessarily referring to the same embodiment.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided, to provide a thorough understanding of embodiments of the disclosed subject matter. One skilled in the relevant art will recognize, however, that the disclosed subject matter can be practiced without one or more of the specific details, or with other methods, components, materials, etc. In other instances, well-known structures, materials, or operations are not shown or described in detail to avoid obscuring aspects of the disclosed subject matter.
In accordance with the exemplary embodiments, the disclosed computer programs or modules can be executed in many exemplary ways, such as an application that is resident in the memory of a device or as a hosted application that is being executed on a server and communicating with the device application or browser via a number of standard protocols, such as TCP/IP, HTTP, XML, SOAP, REST, JSON and other sufficient protocols. The disclosed computer programs can be written in exemplary programming languages that execute from memory on the device or from a hosted server, such as BASIC, COBOL, C, C++, Java, Pascal, or scripting languages such as JavaScript, Python, Ruby, PHP, Perl or other sufficient programming languages.
Some of the disclosed embodiments include or otherwise involve data transfer over a network, such as communicating various inputs or files over the network. The network may include, for example, one or more of the Internet, Wide Area Networks (WANs), Local Area Networks (LANs), Long range radio (LoRa) network, analog or digital wired and wireless telephone networks (e.g., a PSTN, Integrated Services Digital Network (ISDN), a cellular network, and Digital Subscriber Line (xDSL)), radio, television, cable, satellite, and/or any other delivery or tunneling mechanism for carrying data. The network may include multiple networks or sub-networks, each of which may include, for example, a wired or wireless data pathway. The network may include a circuit-switched voice network, a packet-switched data network, or any other network able to carry electronic communications. For example, the network may include networks based on the Internet protocol (IP) or asynchronous transfer mode (ATM), and may support voice using, for example, VoIP, Voice-over-ATM, or other comparable protocols used for voice data communications. In one implementation, the network includes a cellular telephone network configured to enable exchange of text or SMS messages.
Examples of the network include, but are not limited to, a personal area network (PAN), a storage area network (SAN), a home area network (HAN), a campus area network (CAN), a local area network (LAN), a wide area network (WAN), a metropolitan area network (MAN), a virtual private network (VPN), an enterprise private network (EPN), Internet, a global area network (GAN), and so forth.
Figure 1 illustrates conventional architecture topology-1 of acoustic signature detection and source localization system in accordance with an embodiment of the present disclosure.
Figure 2 illustrates conventional architecture topology-2 of acoustic signature detection and source localization system in accordance with an embodiment of the present disclosure.
Referring to Figure 1, the first topology represents an acoustic source localization system where the localization estimation is performed remotely on a server. Each acoustic node comprises a single microphone and an acoustic signature detection module for a particular acoustic source, such as a gunshot. The localization process is not conducted locally within the sensing nodes; instead, it relies on a remote server to gather data from the microphones of several acoustic sensing nodes. This approach requires the fixed positioning of each acoustic sensing node and is only viable in scenarios where such sensing nodes have been previously deployed.
The remote server employs a Time-Difference of Arrival (TDoA) based localization algorithm to estimate the source location. The localization results are then sent back to the local display of the acoustic sensing nodes using the same local wireless network. However, this architecture necessitates a robust network infrastructure to fetch data from each sensing node, rendering it unsuitable for environments with unreliable or unavailable networks.
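For illustration only, the TDoA multilateration step performed at such a server can be sketched as a nonlinear least-squares fit. The following is a minimal Python sketch under assumed values (speed of sound, 2D geometry, solver choice); it is not the actual server implementation:

    import numpy as np
    from scipy.optimize import least_squares

    C = 343.0  # assumed speed of sound in air (m/s)

    def tdoa_localize(sensor_pos, tdoa, x0=None):
        # Classic TDoA multilateration: find the source position whose range
        # differences to the fixed sensing nodes best match the measured
        # arrival-time differences (all taken relative to sensor 0).
        # sensor_pos: (N, 2) fixed node coordinates in metres
        # tdoa: (N-1,) arrival-time differences vs. sensor 0, in seconds
        def residuals(x):
            d = np.linalg.norm(sensor_pos - x, axis=1)
            return (d[1:] - d[0]) - C * np.asarray(tdoa)
        if x0 is None:
            x0 = sensor_pos.mean(axis=0)  # start from the array centroid
        return least_squares(residuals, x0).x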
Referring to Figure 2, the second topology utilizes a microphone array arranged in either a planar or tetrahedral geometry. This configuration only enables the accurate detection of the Angle of Arrival (AoA) of the acoustic source in the azimuth plane. The tetrahedral arrangement of sensors provides only four microphones within each plane for a given structure size. This limited number of sensors restricts the acquisition of the acoustic wavefront to only four sensors at any given instance, consequently limiting the accuracy of AoA estimation in both the azimuth and elevation planes. Furthermore, this architecture is incapable of measuring the range of the acoustic source, thereby limiting its functionality to display only the estimated AoA.
Figure 3 illustrates a detailed architecture of the proposed Acoustic signature detection and source localization system in accordance with an embodiment of the present disclosure.
The proposed architecture for Acoustic Source Signature Detection, Classification, and Localization, as illustrated in Figure 3, comprises one or multiple Acoustic Sensing and Localization Nodes (300) connected to a Local Wireless Network (106), which in turn links to a Central Server (107). Here, computations, particularly of the Angle of Arrival (AoA) based Localization Algorithm (305), are executed and subsequently transmitted to user UX (user experience) devices (109) distributed among users, control centers, or mobile vans. The AoA-based Localization Algorithm (305) results are also retransmitted to the network to display outcomes at each connected Acoustic Sensing and Localization Node (300).
Broadly, the system aims to detect specific acoustic signatures, classify the source, and localize it, revealing both its 3-D AoA and range from the Acoustic Sensing and Localization Node (300). These details are displayed on the Local Display for user convenience and situational awareness. To enhance accuracy and confidence, multiple nodes are networked to perform central computations, alert remote server stations, and display results locally. The ultimate goal is to achieve acoustic source localization in the absence of a network, as well as to improve confidence, accuracy, and reporting whenever the network is available.
Each Acoustic Sensing and Localization Node (300) is equipped with a 3-Dimensional Microphone Array (301) containing seven acoustic sensors to capture signals. The captured signals undergo Unique Acoustic Signature Detection (302) to identify specific acoustic signatures. If a signature is detected, the signal is processed in parallel by the 3-Dimensional Angle of Arrival (AoA) Calculator (303) and Confidence-based Range Estimation (304) modules. These calculate azimuthal and elevation angles of arrival and classify the source type, estimating the range of the acoustic source from the node, respectively. The estimated AoA, range, and source type are then displayed on Local Displays (204 and 105) and transmitted to the Local Wireless Network (106). Additionally, the AoA-based localization algorithm (305) results are displayed at Local Display of Co-ordinates of Acoustic Source (105).
Advantages of the proposed architecture:
• Can estimate geo-coordinates of acoustic sources locally using AoA and range estimation, without any remote network.
• Network-assisted localization is also supported: the AoA detected from multiple acoustic sensing and localization nodes is combined at the remote server to improve the accuracy of geo-localization.
• The architecture is suitable for mobile platforms, with each acoustic sensing and localization node being movable provided its GPS coordinates are known.
• The microphone array arranges sensors in a unique 3-dimensional arrangement allowing seven sensors to acquire data of the acoustic wavefront at a given instance. This geometric configuration ensures an increased number of microphones within each plane for a given structure size, thereby facilitating a greater number of microphone pairs for bearing calculation, resulting in enhanced system accuracy.
• This geometry aids in the detection and removal of false or anomalous data within the received waveform by eliminating data received at specific opposite microphone pairs. This is not possible in planar and tetrahedral microphone arrangements.
Figure 4 illustrates a detailed geometry of 3-Dimensional Microphone sensor array, in accordance with an embodiment of the present disclosure.
Figure 4 depicts various views of the proposed 3-Dimensional microphone sensor array (301), showcasing its right-hand side view, front view, isometric view, and top view. The proposed structure involves a sphere whose entire surface is subdivided into quadrilateral sections of equal area, yielding a total of six faces and eight vertices. Each rod of the structure is extruded from one of the vertices. This gives rise to a spherically symmetric structure with the angle between any two adjacent rods being 70.529 degrees. This array is ingeniously designed to house seven microphones, strategically arranged in a unique 3-dimensional configuration. Such an arrangement allows each of the seven sensors to capture data of the acoustic wavefront simultaneously, providing a comprehensive snapshot of the acoustic environment at any given moment.
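The stated inter-rod angle follows from this geometry: the eight vertices coincide with those of a cube inscribed in the sphere, and the angle between edge-adjacent vertex directions is arccos(1/3) = 70.529 degrees. A minimal Python check, assuming rod directions along the cube vertices (a plausible reading of the described structure, not taken from the drawings):

    import numpy as np

    # Rod directions assumed along cube vertices (+-1, +-1, +-1)/sqrt(3).
    v1 = np.array([1.0, 1.0, 1.0]) / np.sqrt(3.0)
    v2 = np.array([1.0, 1.0, -1.0]) / np.sqrt(3.0)  # edge-adjacent vertex
    angle = np.degrees(np.arccos(np.clip(v1 @ v2, -1.0, 1.0)))
    print(f"angle between adjacent rods: {angle:.3f} degrees")  # 70.529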
The geometric layout of this microphone array ensures an increased number of microphones within each plane, despite maintaining a compact structure size. Consequently, this design facilitates a greater number of microphone pairs for bearing calculation, leading to significantly enhanced system accuracy in detecting and localizing acoustic sources.
An additional advantage of this geometric configuration lies in its ability to detect and eliminate false or anomalous data within the received waveform. By strategically positioning the microphones, data received at specific opposite microphone pairs can be effectively filtered out, a capability that is not feasible with conventional planar or tetrahedral microphone arrangements.
Moreover, the proposed geometry plays a pivotal role in accurately estimating the Angle of Arrival (AoA) in both azimuthal and elevation planes. This capability ensures that the system can determine the source direction with precision, enabling reliable localization of acoustic events in diverse environments.
Figure 5 illustrates detailed internal architecture of unique acoustic signature detector, in accordance with an embodiment of the present disclosure.
Figure 5 showcases the proposed Unique Acoustic Signature Detection (302), demonstrating its method for identifying distinctive signatures within received acoustic signals. The model employs ReLU-activated convolutional layers for hierarchical feature extraction from input spectrograms, effectively capturing intricate patterns in unique acoustic signatures. Utilizing 2D convolutions and max-pooling layers integrates spatial and temporal information, enabling comprehensive analysis across different frequency bands and time frames. The scalable and computationally efficient model architecture ensures efficient processing of large volumes of acoustic data, making it versatile for diverse datasets and real-world applications. Multiple convolutional and dense layers with ReLU activation facilitate learning complex non-linear decision boundaries between acoustic signatures, ensuring accurate detection across diverse acoustic landscapes.
A CNN (Convolutional Neural Network) model is a type of deep learning algorithm specifically designed for processing structured grid-like data, such as images or audio spectrograms, using convolutional layers to automatically learn hierarchical representations of features directly from the data. The input layer (13, 1000, 1) denotes the shape of the input data, with 13 representing the number of features or channels, 1000 indicating the length of the data along the temporal axis, and 1 denoting the number of channels or dimensions in the input data. The Conv2D layer with parameters (64, 3) indicates that it consists of 64 filters, each with a kernel size of 3x3. This layer performs 2D convolutional operations on the input data, extracting features using these filters across the spatial dimensions of the input. The MaxPooling2D layer with parameters (2x2) performs a downsampling operation on the input data, reducing its spatial dimensions by a factor of 2 along both the height and width dimensions. This is achieved by selecting the maximum value within each 2x2 region of the input feature map, effectively reducing its size while retaining important information. The Conv2D layer with parameters (64, 3) denotes that it consists of 64 filters with a kernel size of 3x3. This means that the layer performs convolutional operations on the input data using 64 filters, each having a receptive field of 3x3 pixels. The Flatten layer in a neural network is used to transform the multi-dimensional output from the preceding convolutional or pooling layers into a one-dimensional array. It collapses all dimensions except the batch dimension, essentially converting the feature maps into a single vector, which is then fed into the subsequent fully connected layers for classification or regression tasks. The Dense layer in a neural network is a fully connected layer where each neuron in the layer is connected to every neuron in the preceding layer. In the context of "Dense(512)", it indicates that there are 512 neurons in the layer. This layer performs a linear operation on the input data followed by an activation function, enabling the network to learn complex patterns in the data through its numerous parameters. Similarly, Dense(128) indicates 128 neurons in the next layer, which is followed by the output layer having 1 neuron predicting gunshot/non-gunshot of the input signal, using a sigmoid activation function.
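The layer stack described above translates directly into code. The following is a minimal Keras sketch; the optimizer, loss, and padding choices are assumptions, as the specification does not fix them:

    from tensorflow import keras
    from tensorflow.keras import layers

    def build_signature_detector():
        # Gunshot / non-gunshot classifier over (13, 1000, 1) spectrogram-like
        # inputs, following the layer stack described in the specification.
        model = keras.Sequential([
            keras.Input(shape=(13, 1000, 1)),   # 13 features x 1000 time steps
            layers.Conv2D(64, (3, 3), activation="relu"),
            layers.MaxPooling2D(pool_size=(2, 2)),
            layers.Conv2D(64, (3, 3), activation="relu"),
            layers.Flatten(),
            layers.Dense(512, activation="relu"),
            layers.Dense(128, activation="relu"),
            layers.Dense(1, activation="sigmoid"),  # gunshot vs. non-gunshot
        ])
        model.compile(optimizer="adam", loss="binary_crossentropy",
                      metrics=["accuracy"])
        return model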
It is crucial to emphasize that while CNN is an established model with predefined functions such as conv2D, MaxPool, Flatten, Dense, among others, our innovation lies in the customization or pruning of these elements. We focus on optimizing the size of neural networks, determining the ideal number of layers, and fine-tuning the number of neurons within each layer. Moreover, our novelty extends to strategically selecting the placement of specific layers within the network architecture. These tailored adjustments aim to maximize the accuracy and predictive capabilities of our model, setting it apart in the realm of machine learning advancements.
Advantages:
• The model's ReLU-activated convolutional layers facilitate hierarchical feature extraction from input spectrograms, capturing intricate patterns in unique acoustic signatures effectively.
• Utilizing 2D convolutions and max-pooling layers integrates spatial and temporal information within input spectrograms, enabling comprehensive analysis across different frequency bands and time frames for discerning unique acoustic features.
• The model architecture is scalable and computationally efficient, making it suitable for processing large volumes of acoustic data efficiently. This scalability ensures that the model can handle diverse datasets encompassing various acoustic environments and sources, making it versatile for real-world applications.
• Incorporating multiple convolutional and dense layers with ReLU activation allows the model to learn complex non-linear decision boundaries between acoustic signatures, facilitating accurate detection across diverse acoustic landscapes.
Figure 6 illustrates the detailed internal architecture of 3-Dimensional Angle of Arrival (AoA) Calculator, in accordance with an embodiment of the present disclosure.
Figure 6 shows the 3-Dimensional Angle of Arrival (AoA) Calculator (303), showcasing its algorithm capable of estimating AoA in both azimuthal and elevation planes, along with the confidence percentage of the estimated result. The algorithm is engineered to be robust to noise and interference by employing correlation-based calculations, ensuring accurate estimation of the angle of arrival even in noisy environments. A key feature of the algorithm is its utilization of a pre-computed matrix for the sample difference of the modeled 3D space. This innovative approach significantly reduces computational complexity, making the algorithm well-suited for real-time applications where efficiency is paramount. Furthermore, the algorithm enables fast on-board computing of 3D angle of arrival due to its low complexity, effectively reducing the latency of the overall architecture. This capability enhances the system's responsiveness, ensuring timely and accurate localization of acoustic sources. A minimal sketch of the pre-computed delay-matrix idea follows the advantages list below.
Advantages:
• Angle of arrival calculation is based on correlation, making it robust to noise and interference, thereby enabling accurate estimation of the angle of arrival even in noisy environments.
• Using a pre-computed matrix for sample difference of modelled 3D space reduces the computational complexity drastically, making the algorithm suitable for real time application.
• The algorithm allows fast on-board computing of angle of arrival due to low complexity, reducing the latency of overall architecture.
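As referenced above, a minimal Python sketch of the pre-computed delay-matrix idea follows; the microphone coordinates, grid resolution, and sampling rate are assumptions, and the production correlation and confidence steps are not reproduced:

    import numpy as np

    C = 343.0    # assumed speed of sound (m/s)
    FS = 48_000  # assumed sampling rate (Hz)

    def precompute_delay_matrix(mic_positions, az_grid, el_grid):
        # For every candidate (azimuth, elevation) cell of the modelled 3D
        # space, store the expected pairwise sample delays for a far-field
        # plane wave. Computed once, offline, to keep the runtime cost low.
        M = len(mic_positions)
        delays = np.zeros((len(az_grid), len(el_grid), M, M))
        for i, az in enumerate(np.radians(az_grid)):
            for j, el in enumerate(np.radians(el_grid)):
                u = np.array([np.cos(el) * np.cos(az),   # unit vector toward
                              np.cos(el) * np.sin(az),   # the source
                              np.sin(el)])
                t = mic_positions @ u / C                # per-mic time offsets
                delays[i, j] = np.round((t[:, None] - t[None, :]) * FS)
        return delays

    def estimate_aoa(measured_pair_delays, delay_matrix, az_grid, el_grid):
        # Pick the (az, el) cell whose pre-computed delays best match the
        # delays measured from cross-correlation peaks (least-squares match).
        err = ((delay_matrix - measured_pair_delays) ** 2).sum(axis=(2, 3))
        i, j = np.unravel_index(np.argmin(err), err.shape)
        return az_grid[i], el_grid[j]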
Figure 7 illustrates detailed internal architecture of Confidence based range estimation for Source Classification, in accordance with an embodiment of the present disclosure.
Advantages:
• The model utilizes diverse convolutional layers with Leaky ReLU, ELU, and ReLU activations for intricate feature extraction from firearm images.
• The model employs max-pooling layers for efficient spatial downsampling, retaining crucial information while enhancing computational efficiency.
• The model implements kernel regularization (L2) in convolutional layers to prevent overfitting and promote smoother decision boundaries.
• The model incorporates multiple layers with non-linear activations and depth, enabling effective discrimination between firearm classes.
• The model utilizes Adam optimizer with adaptive learning rate for faster convergence and improved optimization performance.
Figure 8 illustrates detailed internal architecture of Confidence based range estimation for Range Estimation, in accordance with an embodiment of the present disclosure.
Referring to Figure 7 and Figure 8, the detailed algorithm flow of confidence-based range estimation (304) is depicted for source classification and range estimation, respectively, showcasing its unique approach to estimating the range of acoustic sources. This innovative algorithm begins by classifying the acoustic source, followed by modeling the environment to accurately determine the source's range. To achieve this, the model employs diverse convolutional layers with Leaky ReLU, ELU, and ReLU activations, facilitating intricate feature extraction from firearm images. Furthermore, the algorithm utilizes max-pooling layers for efficient spatial downsampling, preserving essential information while enhancing computational efficiency. To prevent overfitting and promote smoother decision boundaries, the model implements kernel regularization (L2) in convolutional layers. This regularization technique enhances the model's generalization capabilities, ensuring robust performance across diverse datasets. Additionally, the model incorporates multiple layers with non-linear activations and depth, enabling effective discrimination between acoustic source classes. By leveraging the Adam optimizer with adaptive learning rate, the algorithm achieves faster convergence and improved optimization performance, enhancing its effectiveness in estimating the range of acoustic sources with high confidence and accuracy.
The notation Input (5,5,2) signifies that the input data has a spatial shape of 5x5 with a depth of 2 channels. This means that the input consists of a grid with 5 rows and 5 columns, and each grid cell contains information from 2 channels. The `Conv2D` layer with parameters `(64, 5x5)` signifies that it is a convolutional layer with 64 filters, each having a spatial size of 5x5. This means that the layer will convolve the input data with 64 different filters, extracting various features from the input. The activation function used in this layer is Leaky ReLU, which introduces a small positive slope to the negative part of the output, helping alleviate the vanishing gradient problem. In a CNN model, the batch normalization layer normalizes the input data of each layer within a mini-batch. This normalization process helps stabilize and accelerate the training process by reducing internal covariate shift, which refers to the change in the distribution of the input values to a layer during training. MaxPooling 2D (64, 4x4) refers to a max-pooling operation applied to the output of a convolutional layer with 64 filters. In this operation, a 2D window of size 4x4 is moved across each feature map independently, and the maximum value within each window is retained. This process effectively reduces the spatial dimensions (width and height) of the feature maps by a factor of 4, while retaining the most significant information. MaxPooling helps to reduce computational complexity, control overfitting, and focus on the most relevant features by capturing the most salient features in each local region of the input feature maps. Conv2D (64, 4x4) ELU denotes a convolutional layer with 64 filters of size 4x4 and activation function Exponential Linear Unit (ELU). This layer performs convolution operations on the input data, extracting features using the specified filter size and ELU activation function, which helps to capture non-linear relationships and enhance model expressiveness. The batch normalization layer again normalizes the input data of each layer within a mini-batch. Max Pooling 3D (3x3) refers to a three-dimensional max pooling operation with a kernel size of 3x3. This layer reduces the spatial dimensions of the input volume along the depth dimension by retaining the maximum value within each 3x3x3 region. It helps in downsampling the feature maps, reducing computational complexity, and capturing the most relevant features in the input data. The dropout rate (0.4) refers to the proportion of neurons in a layer that are randomly "dropped out" during training, specifically, 40% in this case. Dropout is a regularization technique used to prevent overfitting by randomly deactivating neurons, forcing the model to learn more robust features and reducing reliance on specific neurons. The conv2D (128, 3x3) ReLU refers to a convolutional layer with 128 filters of size 3x3, where ReLU (Rectified Linear Unit) is used as the activation function. This layer performs feature extraction on input data using the specified number of filters, each applying a 3x3 convolution operation, followed by the ReLU activation function to introduce non-linearity. The batch normalization layer again normalizes the input data of each layer within a mini-batch. MaxPooling 2D (2x2) is a layer in a convolutional neural network (CNN) that performs downsampling by retaining only the maximum value within each 2x2 window of the input feature map. 
This operation reduces the spatial dimensions of the feature map, preserving the most prominent features while reducing computational complexity and preventing overfitting. The dropout rate (0.4) again refers to the proportion of neurons in a layer that are randomly "dropped out" during training, specifically, 40% in this case. The Flatten layer in a neural network is used to convert the multidimensional output of the preceding convolutional or pooling layers into a one-dimensional array. This allows the data to be fed into the subsequent fully connected layers for further processing. Essentially, it collapses all the dimensions of the input tensor except for the batch size, enabling compatibility with dense layers. The Dense layer in a neural network is a fully connected layer where each neuron in the layer is connected to every neuron in the previous layer. The parameter "64 units" specifies the number of neurons in the Dense layer. The activation function "Leaky ReLU" (Rectified Linear Unit) introduces non-linearity to the output of each neuron, helping the model learn complex patterns in the data. In Leaky ReLU, for negative input values, a small positive slope is introduced instead of setting the output to zero, which can help prevent the dying ReLU problem. The final Dense layer serves as the output layer of the neural network. In this context, "softmax" refers to the activation function used in the output layer. The softmax activation function is commonly used for multi-class classification tasks. It computes the probabilities of each class being the correct one, ensuring that the sum of all probabilities across classes equals one. This enables the model to output a probability distribution over the classes, making it suitable for classification problems where each sample belongs to exactly one class.
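The layer sequence walked through above can be sketched in Keras as follows; 'same' padding, pool sizes that fit the small 5x5 input, the L2 weight, and the number of firearm classes are assumptions the description leaves unstated:

    from tensorflow import keras
    from tensorflow.keras import layers, regularizers

    def build_firearm_classifier(num_classes=4):     # class count is assumed
        inputs = keras.Input(shape=(5, 5, 2))        # 5x5 grid, 2 channels
        x = layers.Conv2D(64, 5, padding="same",
                          kernel_regularizer=regularizers.l2(1e-4))(inputs)
        x = layers.LeakyReLU()(x)
        x = layers.BatchNormalization()(x)
        x = layers.MaxPooling2D(pool_size=2)(x)      # 5x5 -> 2x2
        x = layers.Conv2D(64, 4, padding="same", activation="elu",
                          kernel_regularizer=regularizers.l2(1e-4))(x)
        x = layers.BatchNormalization()(x)
        x = layers.Dropout(0.4)(x)
        x = layers.Conv2D(128, 3, padding="same", activation="relu",
                          kernel_regularizer=regularizers.l2(1e-4))(x)
        x = layers.BatchNormalization()(x)
        x = layers.MaxPooling2D(pool_size=2)(x)      # 2x2 -> 1x1
        x = layers.Dropout(0.4)(x)
        x = layers.Flatten()(x)
        x = layers.Dense(64)(x)
        x = layers.LeakyReLU()(x)
        outputs = layers.Dense(num_classes, activation="softmax")(x)
        model = keras.Model(inputs, outputs)
        model.compile(optimizer="adam", loss="categorical_crossentropy",
                      metrics=["accuracy"])
        return model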
It is again crucial to emphasize that while the CNN model itself is established, our innovation lies in the meticulous selection and arrangement of its components. The decision on the CNN's size, layer count, and the strategic placement of functions like dropout, batch normalization, and max pooling are distinctive features of our approach. These choices are tailored to optimize the accuracy of the CNN model specifically for firearm classification, thereby adding a novel dimension to its application.
Advantages:
• Traditional approaches necessitate the deployment of multiple separate devices for precise gunshot location determination. In contrast, the proposed model achieves range estimation using a single device.
• This model comprehensively integrates absorption factors, including spherical divergence attenuation, air absorption, ground effects, and wall effects.
• Reduces dependency on sensor placement.
• Provides complete solution using a singular device.
Figure 9 illustrates detailed internal architecture of the AoA (Angle of Arrival) based localization algorithm, in accordance with an embodiment of the present disclosure.
Referring to Figure 9, the Angle of Arrival (AoA) based localization algorithm (305) is showcased, situated at the central server (107) and tasked with triangulating the acoustic source using received AoA data from multiple Acoustic Sensing and Localization Nodes (300). Centralized computation on the server significantly reduces processing time compared to onboard computers, enhancing efficiency and scalability. The algorithm utilizes geometric extrapolation and graphical intersection calculation techniques to ensure consistent estimation of a common intersection point, even in the presence of angle of arrival calculation errors. This approach allows for the triangulation of the acoustic source with high confidence, surpassing the accuracy of traditional mathematical triangulation methods. Unlike exact solutions sought by mathematical functions, our model employs approximate solutions, guaranteeing accurate triangulation despite small angle of arrival errors. This feature proves effective in scenarios where localization in any of the Acoustic Sensing nodes (300) is unreliable due to external factors, positioning, or extreme noise conditions. A minimal sketch of the approximate-intersection idea follows the advantages list below.
Advantages:
• Computation performed on the central server significantly reduces processing time compared to onboard computers.
• Geometric extrapolation, coupled with graphical intersection calculation, ensures consistent estimation of a common intersection even in the presence of angle of arrival calculation errors.
• The triangulated region, representing the area of highest confidence for the acoustic source location, surpasses the accuracy of traditional mathematical triangulation methods. Unlike exact solutions sought by mathematical functions, our model employs approximate solutions, guaranteeing accurate triangulation despite small angle of arrival errors.
• This block proves effective and helpful when localization in any of the Acoustic Sensing and Localization Nodes (300) is unreliable due to external factors, positioning, or extreme noise conditions.
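As referenced above, a minimal 2D Python sketch of the approximate-intersection idea: each node contributes a bearing ray, and the point minimizing the total squared perpendicular distance to all AoA lines is taken as the triangulated source. Node positions and bearings below are illustrative; the server's actual graphical-intersection procedure is not reproduced:

    import numpy as np

    def approx_intersection(node_pos, azimuths_deg):
        # Least-squares "intersection" of bearing lines: minimise the sum of
        # squared perpendicular distances from the estimate to each AoA line,
        # so a consistent point is returned even when the lines do not meet
        # exactly (i.e. when the per-node AoA estimates are slightly off).
        A = np.zeros((2, 2))
        b = np.zeros(2)
        for p, az in zip(np.asarray(node_pos), np.radians(azimuths_deg)):
            d = np.array([np.cos(az), np.sin(az)])   # unit bearing vector
            P = np.eye(2) - np.outer(d, d)           # projector onto line normal
            A += P
            b += P @ p
        return np.linalg.solve(A, b)

    # Illustrative usage: two nodes 100 m apart with slightly erroneous bearings.
    nodes = [[0.0, 0.0], [100.0, 0.0]]
    print(approx_intersection(nodes, [44.0, 136.0]))  # ~[50.0, 48.3]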
Figure 10 illustrates confusion matrix showing accuracy of Acoustic Signature Detector in accordance with an embodiment of the present disclosure.
Referring to Figure 10, which presents some results of the detailed internal architecture shown in Figure 5 with the gunshot considered as the acoustic source, the confusion matrix for the detection of a received acoustic signature as gunshot or not reveals a signature detection accuracy of 99.72%.
Figure 11 illustrates measured results of 3D AoA estimation showcasing azimuthal and elevation angle of arrival from the acoustic source at different locations in accordance with an embodiment of the present disclosure.
Referring to Figure 11 and Figure 6, experimental results are presented in which the acoustic source is placed at various locations; the azimuthal and elevation angles are estimated and compared with the actual angles for 12 samples. The acoustic source here is a cork gun fired at a distance of around 200-300 m.
Figure 12 illustrates accuracy of Confidence based range estimation – Source classification for Firearm Classification in accordance with an embodiment of the present disclosure.
Figure 13 illustrates range estimation for samples taken at different ranges from the node in accordance with an embodiment of the present disclosure.
In reference to Figure 7, Figure 12 shows the confidence-based range estimation with the gunshot as the acoustic source, classifying the acoustic signature into various firearm classes. The data used is drawn from publicly available databases. Figure 12 presents the results of classifying firearms according to their acoustic patterns, revealing an accuracy of 94.69% with a precision of 94.92%.
In reference to Figure 8, Figure 13 shows the measured results of the estimated range based on classification using confidence-based range estimation, measured for 12 samples with sources at different locations and compared with the actual range.
Figure 14 illustrates a block diagram of a system (100) for acoustic source signature detection, classification, and localization in accordance with an embodiment of the present disclosure.
Referring to Figure 14, the system (100) includes one or more acoustic sensing and localization nodes (300), each comprising: a 3-dimensional microphone array (301) consisting of seven acoustic sensors configured to capture acoustic signals in a 3D space; a unique acoustic signature detection module (302) operatively connected to the 3-dimensional microphone array (301), configured to detect specific acoustic signatures from the captured signals; a 3-dimensional angle of arrival (AoA) calculator (303) configured to compute azimuthal and elevation angles of arrival of the detected acoustic source; a confidence-based range estimation module (304) configured to estimate the range of the detected acoustic source from the acoustic sensing and localization node (300) based on the calculated AoA and the captured signals; and a local display (105) configured to present the computed AoA, estimated range, and classified source type for user situational awareness.
In an embodiment, a local wireless network (106) is operatively connected to the one or more acoustic sensing and localization nodes (300) via a local wireless network interface (104) and configured to transmit data.
In an embodiment, a central server (107) is operatively connected to the local wireless network (106) and configured to execute the AoA-based localization technique (305), process the transmitted data, and retransmit the localization results to the acoustic sensing and localization nodes (300).
In an embodiment, one or more user UX (user experience) devices (109) are operatively connected to the local wireless network (106), configured to receive and display the localization results for user interaction and control.
In an embodiment, the 3-dimensional microphone array (301) is arranged to capture acoustic signals in real-time and in a spatially distributed manner to enhance the accuracy of AoA calculations and range estimations, wherein the 3-dimensional microphone array (301) comprises: a spherical structure subdivided into quadrilateral sections with equal surface areas, resulting in six faces and eight vertices; a plurality of rods, each extruded from one of the vertices, forming a spherically symmetric configuration with an angle of 65-75 degrees, preferably 70.529 degrees, between any two adjacent rods; a plurality of microphones strategically housed within the structure, wherein the geometric configuration ensures that each microphone captures data of the acoustic wavefront simultaneously, wherein preferably seven microphones are strategically housed within the structure, and wherein the strategic positioning of the microphones within the structure enables the detection and elimination of false data that conventional planar or tetrahedral microphone arrangements cannot achieve; an arrangement of microphone pairs within each plane, enhancing the system's ability to detect and localize acoustic sources with high accuracy, wherein the arrangement of microphones is adaptable to variations in size while maintaining the same geometric structure, provided the number of microphones remains constant, and wherein the arrangement of microphones is configurable to adopt a different geometric structure, subject to a change in the number of microphones, as determined by the specific application requirements and the available chassis shape and size; a data filtering mechanism that utilizes data received from specific opposite microphone pairs to eliminate false or anomalous data within the received waveform; an angle of arrival (AoA) estimation module that determines the direction of the acoustic source in both azimuthal and elevation planes with precision; and a central processing unit configured to process the data captured by the microphones and provide a comprehensive snapshot of the acoustic environment in real-time.
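Two properties of this embodiment lend themselves to a short numerical check: the 70.529-degree inter-rod angle equals arccos(1/3), the angle subtended at the centre of a sphere by adjacent cube vertices, and a plane-wave (far-field) model lets the azimuthal and elevation AoA be recovered from inter-microphone delays by least squares. The Python sketch below demonstrates both under assumed microphone positions; it is illustrative only and does not reproduce the patented AoA module.

    import numpy as np

    C = 343.0  # speed of sound, m/s (assumed)

    # Cube-vertex rod directions: adjacent vertices of a cube subtend
    # arccos(1/3) ~ 70.529 degrees at the centre, matching the stated angle.
    v1 = np.array([1, 1, 1]) / np.sqrt(3)
    v2 = np.array([1, 1, -1]) / np.sqrt(3)
    print(np.degrees(np.arccos(v1 @ v2)))  # 70.5288...

    def aoa_from_tdoa(mic_pos, tdoa):
        """Far-field AoA from inter-microphone delays (illustrative solver).
        mic_pos: (N, 3) microphone positions; tdoa: delays of mics 1..N-1
        relative to mic 0, in seconds."""
        baselines = mic_pos[1:] - mic_pos[0]          # (N-1, 3)
        # Plane-wave model: tdoa_i = -(baseline_i . u) / C
        u, *_ = np.linalg.lstsq(baselines, -C * np.asarray(tdoa), rcond=None)
        u /= np.linalg.norm(u)                        # unit direction to source
        azimuth = np.degrees(np.arctan2(u[1], u[0]))
        elevation = np.degrees(np.arcsin(u[2]))
        return azimuth, elevation

    # Seven mics: centre plus six axis offsets (assumed geometry for the demo).
    mics = np.array([[0, 0, 0], [.1, 0, 0], [-.1, 0, 0], [0, .1, 0],
                     [0, -.1, 0], [0, 0, .1], [0, 0, -.1]])
    az, el = np.radians(40.0), np.radians(15.0)
    true_u = np.array([np.cos(el) * np.cos(az), np.cos(el) * np.sin(az),
                       np.sin(el)])
    tdoa = -(mics[1:] - mics[0]) @ true_u / C
    print(aoa_from_tdoa(mics, tdoa))  # ~ (40.0, 15.0)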
In an embodiment, the unique acoustic signature detection module (302) is configured to operate in parallel with the 3-dimensional angle of arrival (AoA) calculator (303) and the confidence-based range estimation module (304) to simultaneously detect, classify, and localize the acoustic source, wherein the unique acoustic signature detection module (302) comprises: a convolutional neural network (CNN) architecture designed to process acoustic spectrograms with an input layer configured to accept data of shape (13, 1000, 1), where 13 represents the number of features or channels, 1000 represents the temporal length, and 1 denotes the number of channels in the input data; a first Conv2D layer consisting of 64 filters with a kernel size of 3x3, configured to perform 2D convolutional operations on the input data, extracting spatial and temporal features across different frequency bands; a MaxPooling2D layer with a pooling size of 2x2, configured to downsample the feature maps by selecting the maximum value within each 2x2 region, effectively reducing the spatial dimensions while retaining critical information; a second Conv2D layer with 64 filters and a kernel size of 3x3, designed to further refine the extracted features, enhancing the model’s ability to capture intricate patterns in unique acoustic signatures; a flatten layer configured to transform the multi-dimensional output of the convolutional and pooling layers into a one-dimensional vector for subsequent classification tasks; a first dense layer with 512 neurons, fully connected to the flatten layer, employing ReLU activation to facilitate the learning of complex non-linear decision boundaries between acoustic signatures; a second dense layer with 128 neurons, configured to further process the feature vector, refining the model’s decision-making capabilities; an output layer with a single neuron and sigmoid activation function, designed to classify the input signal as either a gunshot or non-gunshot acoustic event; and a customizable and scalable architecture, wherein the number of layers, the size of each layer, and the number of neurons within each layer are optimized and fine-tuned to enhance the model’s accuracy and predictive capabilities.
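A minimal Keras sketch of this detector is given below, following the layer specification above; the ReLU activations on the convolutional layers, the activation of the second dense layer, and the optimizer and loss settings are assumptions, as is the optional placement of dropout and batch normalization noted earlier in the description.

    import tensorflow as tf
    from tensorflow.keras import layers, models

    model = models.Sequential([
        layers.Input(shape=(13, 1000, 1)),            # 13 features x 1000 frames
        layers.Conv2D(64, (3, 3), activation="relu"), # first feature extractor
        layers.MaxPooling2D((2, 2)),                  # downsample, keep maxima
        layers.Conv2D(64, (3, 3), activation="relu"), # refine extracted features
        # Dropout / BatchNormalization may be inserted here; the description
        # treats their placement as a tunable design choice.
        layers.Flatten(),
        layers.Dense(512, activation="relu"),         # first dense layer
        layers.Dense(128, activation="relu"),         # second dense layer
        layers.Dense(1, activation="sigmoid"),        # gunshot vs non-gunshot
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    model.summary()

With valid 3x3 convolutions and one 2x2 pool, the 13x1000 input shrinks to 3x497 feature maps before flattening, which the first dense layer then consumes.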
In an embodiment, the local display (105) is configured to provide a visual interface displaying the real-time 3-dimensional AoA, the estimated range of the acoustic source, and the classified source type for situational awareness and decision-making, wherein the local wireless network (106) is configured to ensure continuous data transmission between the acoustic sensing and localization nodes (300), the central server (107), and the user UX devices (109), thereby enabling real-time acoustic source detection, classification, and localization across multiple nodes, and wherein the central server (107) is configured to aggregate data from multiple acoustic sensing and localization nodes (300), perform central computations to increase the confidence and accuracy of acoustic source localization, and retransmit the aggregated localization results to the networked nodes and user UX devices (109) for real-time display and monitoring.
In an embodiment, the AoA-based localization technique (305) is configured to utilize the data received from the 3-dimensional microphone array (301) to compute the precise location of the acoustic source, dynamically update the computed AoA and range based on continuous signal analysis, and transmit the updated results to the local display (105) and user UX devices (109), wherein the AoA-based localization technique (305) employs: geometric extrapolation techniques to estimate the position of the acoustic source by determining the intersection points of lines corresponding to the AoA data from different nodes; graphical intersection calculation techniques to identify a consistent intersection point, even in the presence of AoA calculation errors, thereby ensuring accurate triangulation of the acoustic source; and computation of the approximate intersection point of AoA lines rather than exact mathematical solutions, thus enhancing the reliability of the triangulation process in scenarios where AoA data may be slightly erroneous.
In an embodiment, the system (100) further comprises a network-independent mode wherein each acoustic sensing and localization node (300) can independently detect, classify, and localize the acoustic source without relying on the local wireless network (106), and display the results locally on the local display (105).
Figure 15 illustrates a flow chart of a method for acoustic source signature detection, classification, and localization in accordance with an embodiment of the present disclosure.
Referring to Figure 15, the method (400) includes a plurality of steps as described under:
At step (402), the method (400) includes capturing acoustic signals using a 3-dimensional microphone array (301) at one or more acoustic sensing and localization nodes (300).
At step (404), the method (400) includes detecting specific acoustic signatures from the captured signals using a unique acoustic signature detection module (302).
At step (406), the method (400) includes calculating the azimuthal and elevation angles of arrival (AoA) of the detected acoustic source using a 3-dimensional angle of arrival (AoA) calculator (303).
At step (408), the method (400) includes estimating the range of the detected acoustic source from the acoustic sensing and localization node (300) using a confidence-based range estimation module (304).
At step (410), the method (400) includes displaying the computed AoA, estimated range, and classified source type on a local display (105) for user situational awareness.
At step (412), the method (400) includes transmitting the data from the acoustic sensing and localization node (300) to a central server (107) via a local wireless network (106).
At step (414), the method (400) includes executing an AoA-based localization technique (305) at the central server (107) to process the transmitted data and retransmitting the localization results to the acoustic sensing and localization nodes (300) and user UX devices (109) for real-time display.
In an embodiment, the method (400) further comprises the step of aggregating data from multiple acoustic sensing and localization nodes (300) at the central server (107) to increase the accuracy and confidence of the acoustic source localization results.
In an embodiment, the method (400) further comprises the step of operating the system in a network-independent mode, where each acoustic sensing and localization node (300) can independently detect, classify, and localize the acoustic source, displaying the results locally without relying on the local wireless network (106).
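For orientation, the node-side portion of method (400) can be pictured as the following Python sketch, in which every object and function name is a hypothetical placeholder standing in for the modules described above, not the actual implementation.

    def node_pipeline(array, detector, aoa_calc, ranger, display, network):
        # All interfaces below are hypothetical placeholders.
        frame = array.capture()                        # step 402: 7-channel capture
        if not detector.is_signature(frame):           # step 404: signature detection
            return
        azimuth, elevation = aoa_calc.estimate(frame)  # step 406: 3D AoA
        rng = ranger.estimate(frame, azimuth, elevation)   # step 408: range
        source_type = detector.classify(frame)        # firearm classification
        display.show(azimuth, elevation, rng, source_type)  # step 410: local display
        if network.available():                       # step 412: optional transmission
            network.send({"aoa": (azimuth, elevation),
                          "range": rng, "class": source_type})

The early return after step 404 reflects the detection-first design: AoA and range are computed only for frames that carry a recognized acoustic signature, and the network branch is skipped entirely in the network-independent mode.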
The present invention represents a significant advancement in acoustic source signature detection, classification, and localization technology, offering several key advantages over existing solutions. This innovative system addresses the limitations of prior art topologies, making it a highly valuable and impactful invention. Here's how this invention is particularly helpful:
1. Enhanced accuracy: The unique three-dimensional arrangement of the microphone array, with seven sensors acquiring acoustic wavefront data simultaneously, ensures enhanced accuracy in localizing acoustic sources. This increased accuracy is especially beneficial in complex environments with multiple sound sources or reflections, where precise localization is crucial.
2. Comprehensive situational awareness: By calculating the Angle of Arrival (AoA) in both azimuthal and elevation planes, the invention provides a more detailed understanding of the spatial orientation of the acoustic source. This comprehensive approach improves localization accuracy and enhances situational awareness for users, enabling them to make informed decisions based on accurate and complete information.
3. Improved range estimation: The incorporation of AI-based algorithms for confidence-based range estimation enables the system to localize acoustic sources not only by AoA but also by combining range information. This capability significantly enhances the system's localization capabilities, providing more comprehensive and reliable results.
4. Operational resilience: Unlike existing methods that rely on remote servers, this invention can estimate geo-coordinates locally without the need for a remote network. This standalone capability reduces dependency on external infrastructure, making the system more resilient and reliable in challenging operating conditions where network connectivity may be limited or unavailable.
5. Adaptability and flexibility: While capable of operating independently, the proposed technology can also be configured to leverage wired or wireless networks for enhanced accuracy. This flexibility allows the system to obtain the location of the acoustic source from a remote server when network connectivity is available, further improving localization precision.
6. Versatility: The system's ability to perform acoustic signature detection, classification, AoA estimation, range estimation, triangulation, and geolocation makes it a comprehensive solution applicable to a wide range of scenarios, including law enforcement, military operations, and security monitoring.
Overall, this invention represents a significant technological advancement, offering enhanced accuracy, improved situational awareness, operational resilience, adaptability, and versatility. By addressing the limitations of existing solutions, it has the potential to revolutionize acoustic source localization applications, making it a highly valuable and impactful invention.
The drawings and the foregoing description give examples of embodiments. Those skilled in the art will appreciate that one or more of the described elements may well be combined into a single functional element. Alternatively, certain elements may be split into multiple functional elements. Elements from one embodiment may be added to another embodiment. For example, orders of processes described herein may be changed and are not limited to the manner described herein. Moreover, the actions of any flow diagram need not be implemented in the order shown; nor do all of the acts necessarily need to be performed. Also, those acts that are not dependent on other acts may be performed in parallel with the other acts. The scope of embodiments is by no means limited by these specific examples. Numerous variations, whether explicitly given in the specification or not, such as differences in structure, dimension, and use of material, are possible. The scope of embodiments is at least as broad as given by the following claims.
Benefits, other advantages, and solutions to problems have been described above with regard to specific embodiments. However, the benefits, advantages, solutions to problems, and any component(s) that may cause any benefit, advantage, or solution to occur or become more pronounced are not to be construed as a critical, required, or essential feature or component of any or all the claims.
Claims:
1. A system for acoustic source signature detection, classification, and localization, comprising:
one or more acoustic sensing and localization nodes (300), each comprising:
a 3-dimensional microphone array (301) consisting of seven acoustic sensors configured to capture acoustic signals in a 3D space;
a unique acoustic signature detection module (302) operatively connected to the 3-dimensional microphone array (301), configured to detect specific acoustic signatures from the captured signals;
a 3-dimensional angle of arrival (AoA) calculator (303) configured to compute azimuthal and elevation angles of arrival of the detected acoustic source;
a confidence-based range estimation module (304) configured to estimate the range of the detected acoustic source from the acoustic sensing and localization node (300) based on the calculated AoA and the captured signals;
a local display (105) configured to present the computed AoA, estimated range, and classified source type for user situational awareness;
a local wireless network (106) operatively connected to the one or more acoustic sensing and localization nodes (300) via a local wireless network interface (104) and configured to transmit data;
a central server (107) operatively connected to the local wireless network (106) and configured to execute the AoA-based localization technique (305), process the transmitted data, and retransmit the localization results to the acoustic sensing and localization nodes (300); and
one or more user UX (user experience) devices (109) operatively connected to the local wireless network (106), configured to receive and display the localization results for user interaction and control.

2. The system as claimed in claim 1, wherein the 3-dimensional microphone array (301) is arranged to capture acoustic signals in real-time and in a spatially distributed manner to enhance the accuracy of AoA calculations and range estimations, wherein the 3-dimensional microphone array (301) comprises:
a spherical structure subdivided into quadrilateral sections with equal surface areas, resulting in six faces and eight vertices;
a plurality of rods, each extruded from one of the vertices, forming a spherically symmetric configuration with an angle of 65-75 degrees, preferably 70.529 degrees, between any two adjacent rods;
a plurality of microphones strategically housed within the structure, wherein the geometric configuration ensures that each microphone captures data of the acoustic wavefront simultaneously, wherein preferably seven microphones are strategically housed within the structure, wherein the strategic positioning of the microphones within the structure enables the detection and elimination of false data that conventional planar or tetrahedral microphone arrangements cannot achieve;
an arrangement of microphone pairs within each plane, enhancing the system's ability to detect and localize acoustic sources with high accuracy;
wherein the arrangement of microphones is adaptable to variations in size while maintaining the same geometric structure, provided the number of microphones remains constant;
wherein the arrangement of microphones is configurable to adopt a different geometric structure, subject to a change in the number of microphones, as determined by the specific application requirements and the available chassis shape and size;
a data filtering mechanism that utilizes data received from specific opposite microphone pairs to eliminate false or anomalous data within the received waveform; and
an angle of arrival (AoA) estimation module that determines the direction of the acoustic source in both azimuthal and elevation planes with precision.

3. The system as claimed in claim 1, wherein the unique acoustic signature detection module (302) is configured to operate in parallel with the 3-dimensional angle of arrival (AoA) calculator (303) and the confidence-based range estimation module (304) to simultaneously detect, classify, and localize the acoustic source, wherein the unique acoustic signature detection module (302) comprises:
a convolutional neural network (CNN) architecture designed to process acoustic spectrograms with an input layer configured to accept data of shape (13, 1000, 1), where 13 represents the number of features or channels, 1000 represents the temporal length, and 1 denotes the number of channels in the input data;
a first Conv2D layer consisting of 64 filters with a kernel size of 3x3, configured to perform 2D convolutional operations on the input data, extracting spatial and temporal features across different frequency bands;
a MaxPooling2D layer with a pooling size of 2x2, configured to downsample the feature maps by selecting the maximum value within each 2x2 region, effectively reducing the spatial dimensions while retaining critical information;
a second Conv2D layer with 64 filters and a kernel size of 3x3, designed to further refine the extracted features, enhancing the model’s ability to capture intricate patterns in unique acoustic signatures;
a flatten layer configured to transform the multi-dimensional output of the convolutional and pooling layers into a one-dimensional vector for subsequent classification tasks;
a first dense layer with 512 neurons, fully connected to the Flatten layer, employing ReLU activation to facilitate the learning of complex non-linear decision boundaries between acoustic signatures;
a second dense layer with 128 neurons, configured to further process the feature vector, refining the model’s decision-making capabilities;
an output layer with a single neuron and sigmoid activation function, designed to classify the input signal as either a gunshot or non-gunshot acoustic event; and
a customizable and scalable architecture, wherein the number of layers, the size of each layer, and the number of neurons within each layer are optimized and fine-tuned to enhance the model’s accuracy and predictive capabilities.

4. The system as claimed in claim 1, wherein the local display (105) is configured to provide a visual interface displaying the real-time 3-dimensional AoA, the estimated range of the acoustic source, and the classified source type for situational awareness and decision-making; wherein the local wireless network (106) is configured to ensure continuous data transmission between the acoustic sensing and localization nodes (300), the central server (107), and the user UX devices (109), thereby enabling real-time acoustic source detection, classification, and localization across multiple nodes; and wherein the central server (107) is configured to aggregate data from multiple acoustic sensing and localization nodes (300), perform central computations to increase the confidence and accuracy of acoustic source localization, and retransmit the aggregated localization results to the networked nodes and user UX devices (109) for real-time display and monitoring.

5. The system as claimed in claim 1, wherein the AoA-based localization technique (305) is configured to utilize the data received from the 3-dimensional microphone array (301) to compute the precise location of the acoustic source and dynamically update the computed AoA and range based on continuous signal analysis and transmit the updated results to the local display (105) and user UX devices (109), wherein the AoA-based localization technique (305) employs:
geometric extrapolation techniques to estimate the position of the acoustic source by determining the intersection points of lines corresponding to the AoA data from different nodes;
graphical intersection calculation techniques to identify a consistent intersection point, even in the presence of AoA calculation errors, thereby ensuring accurate triangulation of the acoustic source; and
computation of the approximate intersection point of AoA lines rather than exact mathematical solutions, thus enhancing the reliability of the triangulation process in scenarios where AoA data may be slightly erroneous.

6. The system as claimed in claim 1, further comprising a network-independent mode wherein each acoustic sensing and localization node (300) can independently detect, classify, and localize the acoustic source without relying on the local wireless network (106), and display the results locally on the local display (105).

7. A method for acoustic source signature detection, classification, and localization, comprising:
capturing acoustic signals using a 3-dimensional microphone array (301) at one or more acoustic sensing and localization nodes (300);
detecting specific acoustic signatures from the captured signals using a unique acoustic signature detection module (302);
calculating the azimuthal and elevation angles of arrival (AoA) of the detected acoustic source using a 3-dimensional angle of arrival (AoA) calculator (303);
estimating the range of the detected acoustic source from the acoustic sensing and localization node (300) using a confidence-based range estimation module (304);
displaying the computed AoA, estimated range, and classified source type on a local display (105) for user situational awareness;
transmitting the data from the acoustic sensing and localization node (300) to a central server (107) via a local wireless network (106); and
executing an AoA-based localization technique (305) at the central server (107) to process the transmitted data and retransmitting the localization results to the acoustic sensing and localization nodes (300) and user UX devices (109) for real-time display.

8. The method as claimed in claim 7, further comprising the step of aggregating data from multiple acoustic sensing and localization nodes (300) at the central server (107) to increase the accuracy and confidence of the acoustic source localization results.

9. The method as claimed in claim 7, further comprising the step of operating the system in a network-independent mode, where each acoustic sensing and localization node (300) can independently detect, classify, and localize the acoustic source, displaying the results locally without relying on the local wireless network (106).

Documents

Application Documents

# Name Date
1 202511012044-STATEMENT OF UNDERTAKING (FORM 3) [12-02-2025(online)].pdf 2025-02-12
2 202511012044-REQUEST FOR EARLY PUBLICATION(FORM-9) [12-02-2025(online)].pdf 2025-02-12
3 202511012044-PROOF OF RIGHT [12-02-2025(online)].pdf 2025-02-12
4 202511012044-POWER OF AUTHORITY [12-02-2025(online)].pdf 2025-02-12
5 202511012044-FORM-9 [12-02-2025(online)].pdf 2025-02-12
6 202511012044-FORM FOR SMALL ENTITY(FORM-28) [12-02-2025(online)].pdf 2025-02-12
7 202511012044-FORM FOR SMALL ENTITY [12-02-2025(online)].pdf 2025-02-12
8 202511012044-FORM 1 [12-02-2025(online)].pdf 2025-02-12
9 202511012044-FIGURE OF ABSTRACT [12-02-2025(online)].pdf 2025-02-12
10 202511012044-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [12-02-2025(online)].pdf 2025-02-12
11 202511012044-EVIDENCE FOR REGISTRATION UNDER SSI [12-02-2025(online)].pdf 2025-02-12
12 202511012044-EDUCATIONAL INSTITUTION(S) [12-02-2025(online)].pdf 2025-02-12
13 202511012044-DRAWINGS [12-02-2025(online)].pdf 2025-02-12
14 202511012044-DECLARATION OF INVENTORSHIP (FORM 5) [12-02-2025(online)].pdf 2025-02-12
15 202511012044-COMPLETE SPECIFICATION [12-02-2025(online)].pdf 2025-02-12
16 202511012044-FORM-8 [22-03-2025(online)].pdf 2025-03-22
17 202511012044-FORM 18 [04-08-2025(online)].pdf 2025-08-04