
System And Method For Handling Alarms

Abstract: The present disclosure relates to handling of alarms in a Network Management System (125). Stream data collected from network elements (110) is parsed and transformed into raise alarms or clear alarms. The alarms are stored in a distributed cache (225) after comparison with existing alarms already present in the distributed cache (225). A unique alarm identifier is generated for each alarm, and the clear alarms corresponding to the unique alarm identifiers are retrieved. A database (220) is checked for the presence of associated raise alarms; the raise alarms are deleted from an active section when the associated raise alarms are identified to be present, and the clear alarms are streamed for retrying when the associated raise alarms are identified to be absent. The database (220) is also checked for the presence of raise alarms corresponding to retry alarm data, and those raise alarms are deleted from the active section. Ref. Fig. 1


Patent Information

Application #
Filing Date
09 July 2023
Publication Number
2/2025
Publication Type
INA
Invention Field
COMPUTER SCIENCE
Status
Parent Application

Applicants

JIO PLATFORMS LIMITED
OFFICE-101, SAFFRON, NR. CENTRE POINT, PANCHWATI 5 RASTA, AMBAWADI, AHMEDABAD - 380006, GUJARAT, INDIA

Inventors

1. Aayush Bhatnagar
Office-101, Saffron, Nr. Centre Point, Panchwati 5 Rasta, Ambawadi, Ahmedabad, Gujarat - 380006, India
2. Sandeep Bisht
Office-101, Saffron, Nr. Centre Point, Panchwati 5 Rasta, Ambawadi, Ahmedabad, Gujarat - 380006, India
3. Rahul Mishra
Office-101, Saffron, Nr. Centre Point, Panchwati 5 Rasta, Ambawadi, Ahmedabad, Gujarat - 380006, India
4. Dipankar Divy
Office-101, Saffron, Nr. Centre Point, Panchwati 5 Rasta, Ambawadi, Ahmedabad, Gujarat - 380006, India
5. Smridhi Sharma
Office-101, Saffron, Nr. Centre Point, Panchwati 5 Rasta, Ambawadi, Ahmedabad, Gujarat - 380006, India
6. Elanchezhiyan E
Office-101, Saffron, Nr. Centre Point, Panchwati 5 Rasta, Ambawadi, Ahmedabad, Gujarat - 380006, India
7. Sumit Tiwari
Office-101, Saffron, Nr. Centre Point, Panchwati 5 Rasta, Ambawadi, Ahmedabad, Gujarat - 380006, India

Specification

FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENTS RULES, 2003

COMPLETE SPECIFICATION
(See section 10 and rule 13)
1. TITLE OF THE INVENTION
SYSTEM AND METHOD FOR HANDLING ALARMS
2. APPLICANT(S)
NAME: JIO PLATFORMS LIMITED
NATIONALITY: INDIAN
ADDRESS: OFFICE-101, SAFFRON, NR. CENTRE POINT, PANCHWATI 5 RASTA, AMBAWADI, AHMEDABAD - 380006, GUJARAT, INDIA
3. PREAMBLE TO THE DESCRIPTION

THE FOLLOWING SPECIFICATION PARTICULARLY DESCRIBES THE NATURE OF THIS INVENTION AND THE MANNER IN WHICH IT IS TO BE PERFORMED.

FIELD OF THE INVENTION
[0001] The present invention relates to the field of network management systems and, more specifically, to the efficient handling of repetitive alarms and auditing.

BACKGROUND OF THE INVENTION
[0002] Network Management Systems (NMS) are essential for monitoring and maintaining the health and performance of computer networks. These systems generate alarms or alerts when specific events occur, indicating potential issues or anomalies within the network infrastructure. Alarms play a crucial role in monitoring and maintaining the health and performance of network elements, services, and interfaces. Alarms are raised to notify operators and administrators about critical events or abnormal conditions that require attention or intervention.
[0003] Alarm management is an integral part of network management systems and involves the detection, processing, and communication of alarms. Alarms are typically raised when certain predefined thresholds are exceeded, errors occur, or specific conditions are met within the network infrastructure. These alarms act as signals to indicate potential issues, faults, or performance degradation that may impact the overall network functionality.
[0004] Alarms are raised for various reasons, including network faults, equipment failures, performance bottlenecks, security breaches, or service disruptions. Repetitive alarms often occur when the underlying cause of an alarm persists, leading to recurring notifications. This commonly happens when services are down or when common interfaces experience fluctuations, causing multiple nodes, network elements, or network functions to repeatedly raise and clear alarms until the issue is resolved.
[0005] A typical alarm management system consists of several components that work together to handle alarms efficiently. The components include a Fault Processor (FP) or Fault Manager (FM), Enrichment and Correlation modules, and a Trouble Ticketing (TT) System.
[0006] In the field of network management, various approaches have been proposed to handle alarms and improve system performance. Existing systems utilize distributed processing and auditing of alarms to address some of the challenges associated with repetitive alarms. However, these solutions often lack comprehensive optimization and fail to provide efficient handling of repetitive alarms.
[0007] Despite the existing approaches, network management systems still encounter delays and inefficiencies when processing repetitive alarms. The recurring insertion, updating, and deletion of alarms from persistent storage and other processing stages lead to redundant operations and resource consumption, causing performance bottlenecks. Moreover, processing multiple stages for each raise and clear alarm, such as enrichment, correlation, and trouble ticketing, further exacerbates the problem.
[0008] There is therefore a need for an innovative solution that efficiently handles repetitive alarms, optimizes alarm processing workflow, reduces computational resources, and provides effective auditing, without compromising accuracy or losing any alarm data.

BRIEF SUMMARY OF THE INVENTION
[0009] One or more embodiments of the present disclosure provide a system and method of handling alarms in a Network Management System (NMS).
[0010] In one aspect of the present invention, a system for handling alarms in the NMS (henceforth referred to as the system) is disclosed. The system includes a collector component configured to collect stream data from network elements, and parse and transform the stream data into alarms in a standardized format. The stream data includes Fault, Configuration, Accounting, Performance, and Security (FCAPs) information. The alarms comprise one of two event types, raise alarm and clear alarm. The system further includes a distributed cache configured to store the alarms after comparing the alarms with existing alarms already present in the distributed cache. The system further includes a Fault Processor Master (FM) module configured to generate a unique alarm identifier for each alarm. The system further includes a raise FM module configured to retrieve the alarms corresponding to the unique alarm identifiers from the distributed cache. The system further includes a clear FM module configured to retrieve clear alarms corresponding to the unique alarm identifiers from the distributed cache, check a database for the presence of associated raise alarms, delete the raise alarms from an active section when the associated raise alarms are identified to be present, and stream the clear alarms for retrying when the associated raise alarms are identified to be absent. The system further includes a retry FM module configured to check the database for the presence of the raise alarms corresponding to retry alarm data and delete the raise alarms from the active section when identified to be present.
[0011] In one aspect, one or more parameters including time-based and count-based thresholds, delay for the raise alarm and the clear alarm consumption, interval for storing in the distributed cache, and batch interval for an auditor are configurable. Further, an occurrence count and a timestamp array of the alarms are updated based on the comparison of the alarms with the existing alarms. Metadata associated with the alarms is updated, the alarms are enriched with additional information, and details of the alarms are stored into the distributed cache. A retry count is incremented and the retry alarm data is reproduced into the stream data when the raise alarms are identified to be absent, thereby retrying an alarm until it is cleared or a retry threshold count is exhausted. The system is further configured to perform one or more operations on the alarms including planned event processing, AI-based correlation to identify patterns or related events, and trouble ticketing to initiate incident management processes. The system is further configured to perform one or more of monitoring running fault processor topics, performing a lookup on a table based on the topics, extracting alarms present in the distributed cache for more than a configurable duration, and processing the alarms. When the raise alarms are deleted from the active section, the system adds clearance metadata to the raise alarms and stores the raise alarms in an archived section of the database. The distributed cache is provided with a configurable interval for periodically storing the alarms, and the system scans the distributed cache for identifying a stranded alarm.
[0012] In another aspect of the present invention, a method of handling alarms in the NMS is disclosed. The method includes the step of collecting stream data from one or more network elements. The stream data includes Fault, Configuration, Accounting, Performance, and Security (FCAPs) information. The method further includes the step of parsing and transforming the stream data into alarms in a standardized format. The alarms comprise one of two event types, raise alarm and clear alarm. The method further includes the step of storing the alarms in a distributed cache after comparing the alarms with existing alarms already present in the distributed cache. The method further includes the step of generating a unique alarm identifier for each alarm. The method further includes the step of retrieving the alarms corresponding to the unique alarm identifiers from the distributed cache. The method further includes the step of retrieving the clear alarms corresponding to the unique alarm identifiers from the distributed cache. The method further includes the step of checking a database for the presence of associated raise alarms, deleting the raise alarms from an active section when the associated raise alarms are identified to be present, and streaming the clear alarms for retrying when the associated raise alarms are identified to be absent. The method further includes the step of checking the database for the presence of the raise alarms corresponding to retry alarm data and deleting the raise alarms from the active section when identified to be present.
[0013] In one aspect, one or more parameters including time-based and count-based thresholds, delay for the raise alarm and the clear alarm consumption, interval for storing in the distributed cache, and batch interval for an auditor are configurable. The method further includes the step of updating an occurrence count and a timestamp array of the alarms based on the comparison of the alarms with the existing alarms. The method further includes the step of updating metadata associated with the alarms, enriching the alarms with additional information, and storing details of the alarms into the distributed cache. The method further includes the step of incrementing a retry count and reproducing the retry alarm data into the stream data when the raise alarms are identified to be absent, thereby retrying an alarm until it is cleared or a retry threshold count is exhausted. The method further includes the step of performing one or more operations on the alarms including planned event processing, Artificial Intelligence (AI)-based correlation to identify patterns or related events, and trouble ticketing to initiate incident management processes. The method further includes the step of performing one or more of monitoring running fault processor topics, performing a lookup on a table based on the topics, extracting alarms present in the distributed cache for more than a configurable duration, and processing the alarms. The method further includes the step of deleting the raise alarms from the active section, adding clearance metadata to the raise alarms, and storing the raise alarms in an archived section of the database. The method further includes the step of providing the distributed cache with a configurable interval for periodically storing the alarms. The method further includes the step of scanning the distributed cache for identifying a stranded alarm.
[0014] Other features and aspects of this invention will be apparent from the following description and the accompanying drawings. The features and advantages described in this summary and in the following detailed description are not all-inclusive, and particularly, many additional features and advantages will be apparent to one of ordinary skill in the relevant art, in view of the drawings, specification, and claims hereof. Moreover, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes and may not have been selected to delineate or circumscribe the inventive subject matter, resort to the claims being necessary to determine such inventive subject matter.

BRIEF DESCRIPTION OF THE DRAWINGS
[0015] The accompanying drawings, which are incorporated herein, and constitute a part of this disclosure, illustrate exemplary embodiments of the disclosed methods and systems in which like reference numerals refer to the same parts throughout the different drawings. Components in the drawings are not necessarily to scale, emphasis instead being placed upon clearly illustrating the principles of the present disclosure. Some drawings may indicate the components using block diagrams and may not represent the internal circuitry of each component. It will be appreciated by those skilled in the art that disclosure of such drawings includes disclosure of electrical components, electronic components or circuitry commonly used to implement such components.
[0016] FIG. 1 illustrates a network architecture of a system for handling repetitive alarms and auditing, according to one or more embodiments of the present disclosure;
[0017] FIG. 2 illustrates a block diagram of a system for handling repetitive alarms and auditing, according to various embodiments of the present system;
[0018] FIG. 3 illustrates a block diagram of a system and a first network device communicating with each other for handling alarms, according to various embodiments of the present system;
[0019] FIG. 4 illustrates a system architecture for handling repetitive alarms and auditing, according to one or more embodiments of the present disclosure;
[0020] FIG. 5 illustrates a flow chart of a method of handling raise alarms, according to one or more embodiments of the present disclosure;
[0021] FIG. 6 illustrates a flow diagram of a method of handling clear alarms, according to one or more embodiments of the present disclosure; and
[0022] FIG. 7 illustrates a flow diagram of a method of managing raise alarms and clear alarms in a network management system, according to one or more embodiments of the present disclosure.
[0023] The foregoing shall be more apparent from the following detailed description of the invention.
DETAILED DESCRIPTION OF THE INVENTION
[0024] Some embodiments of the present disclosure, illustrating all its features, will now be discussed in detail. It must also be noted that as used herein and in the appended claims, the singular forms “a”, “an” and “the” include plural references unless the context clearly dictates otherwise.
[0025] Various modifications to the embodiments will be readily apparent to those skilled in the art, and the generic principles herein may be applied to other embodiments. However, one of ordinary skill in the art will readily recognize that the present disclosure, including the definitions listed herein below, is not intended to be limited to the embodiments illustrated but is to be accorded the widest scope consistent with the principles and features described herein.
[0026] A person of ordinary skill in the art will readily ascertain that the illustrated steps detailed in the figures and here below are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope and spirit of the disclosed embodiments.
[0027] The present invention introduces a system and method for efficient handling of repetitive alarms and auditing in network management systems. By leveraging distributed caching and configurable parameters, the invention optimizes the processing of raise and clear alarms, reduces redundant operations, and ensures accurate auditing. Multiple components, such as the Fault Processor (FP), the distributed cache, enrichment and correlation stages, the Trouble Ticketing (TT) system, and auditors, work in coordination to handle alarms effectively. The present invention provides a comprehensive solution that enables network administrators and operators to monitor, control, and maintain the health and performance of network infrastructure. Also, the present invention provides a centralized platform for managing network devices, applications, and services, allowing administrators to proactively identify and resolve network issues.
[0028] A collector component consumes alarms and stores them in the I/O cache (also referred to as the distributed cache), and updates them for each recurrence within a configured delay. Raise and clear operations of alarms are handled separately by a raise FM module and a clear FM module, respectively. The alarms undergo stages like enrichment, correlation, and trouble ticketing for comprehensive processing. An auditor component periodically consumes any missed alarms, ensuring their proper processing.
[0029] The invention can be implemented in a server-based network management system, where various modules collaborate to process alarms and ensure efficient handling. The inventive step lies in the configurability of parameters and the utilization of distributed caching to handle repetitive alarms effectively, optimize processing resources, and guarantee accurate auditing.
[0030] FIG. 1 illustrates a network architecture 100 of a system for handling repetitive alarms and auditing, in accordance with one implementation of the present embodiment. The network architecture 100 comprises several interconnected components that work together to optimize alarm processing and ensure accurate auditing.
[0031] The network architecture 100 shows a plurality of network devices 110. For the purpose of description and explanation, the description will be explained with respect to one or more network devices 110, or to be more specific will be explained with respect to a first network device 110a, a second network device 110b, and a third network device 110c, and should nowhere be construed as limiting the scope of the present disclosure.
[0032] In an embodiment, each of the first network device 110a, the second network device 110b, and the third network device 110c is one of, but is not limited to, a hub, a switch, a router, a bridge, a gateway, a modem, a repeater, and an access point.
[0033] Each of the first network device 110a, the second network device 110b, and the third network device 110c is configured to transmit stream data via a communication network 105 to a Network Management System (NMS) 125.
[0034] The communication network 105 includes, by way of example but not limitation, one or more of a wireless network, a wired network, an internet, an intranet, a public network, a private network, a packet-switched network, a circuit-switched network, an ad hoc network, an infrastructure network, a Public-Switched Telephone Network (PSTN), a cable network, a cellular network, a satellite network, a fiber optic network, or some combination thereof. The communication network 105 may include, but is not limited to, a Third Generation (3G), a Fourth Generation (4G), a Fifth Generation (5G), a Sixth Generation (6G), a New Radio (NR), a Narrow Band Internet of Things (NB-IoT), an Open Radio Access Network (O-RAN), and the like.
[0035] Also, a server 115 is accessible via the communication network 105. The server 115 may include, by way of example but not limitation, one or more of a standalone server, a server blade, a server rack, a bank of servers, a server farm, hardware supporting a part of a cloud service or system, a home server, hardware running a virtualized server, one or more processors executing code to function as a server, one or more machines performing server-side functionality as described herein, at least a portion of any of the above, or some combination thereof. In an embodiment, the entity may include, but is not limited to, a vendor, a network operator, a company, an organization, a university, a lab facility, a business enterprise, a defence facility, or any other facility that provides content.
[0036] The NMS 125 (henceforth referred as the system 125) is communicably coupled to the server 115 and each of the first network device 110a, the second network device 110b, and the third network device 110c via the communication network 105. The system 125 is configured to handle repetitive alarms and auditing. The system 125 is adapted to be embedded within the server 115 or is embedded as an individual entity. However, for the purpose of description, the system 125 is described as an integral part of the server 115, without deviating from the scope of the present disclosure.
[0037] In various embodiments, the system 125 may be generic in nature and may be integrated with any application including a Session Management Function (SMF), an Access and Mobility Management Function (AMF), a Business Telephony Application Server (BTAS), a Converged Telephony Application Server (CTAS), any Session Initiation Protocol (SIP) application server which interacts with the core Internet Protocol Multimedia Subsystem (IMS) over the IMS Service Control (ISC) interface as defined by the Third Generation Partnership Project (3GPP) to host a wide array of cloud telephony enterprise services, a System Information Block (SIB), and a Mobility Management Entity (MME).
[0038] Operational and construction features of the system 125 will be explained in detail successively with respect to different figures. FIG. 2 illustrates a block diagram of the system 125 for handling repetitive alarms and auditing, according to one or more embodiments of the present disclosure.
[0039] A fault management component (not illustrated in FIG. 2) of the system 125 is responsible for detecting, capturing, and notifying network administrators of any faults or abnormalities in the communication network 105. The fault management component receives alarms and alerts from the network devices 110, analyzes the alarms, and triggers appropriate actions for fault resolution. The fault management component also facilitates the storage and retrieval of alarm information for future analysis and reporting.
[0040] A performance monitoring component (not illustrated in FIG. 2) of the system 125 continuously measures and evaluates the performance metrics of each of the first network device 110a, the second network device 110b, and the third network device 110c and services. The performance monitoring component collects data such as network traffic, latency, packet loss, and device utilization. The performance monitoring component provides real-time and historical performance statistics, enabling administrators to identify performance bottlenecks, optimize network resources, and ensure Service Level Agreements (SLAs) are met.
[0041] A configuration management component (not illustrated in FIG. 2) of the system 125 facilitates centralized management and control of network device configurations. The configuration management component allows administrators to define, deploy, and update configuration settings for network devices, ensuring consistency and compliance across the network. The configuration management component also supports configuration backup, version control, and rollback capabilities to simplify troubleshooting and minimize downtime.
[0042] A security management component (not illustrated in FIG. 2) of the system 125 focuses on protecting the network infrastructure from unauthorized access, threats, and vulnerabilities. The security management component encompasses functions such as user authentication, access control, intrusion detection and prevention, and security event logging. The security management component monitors network traffic, identifies potential security breaches, and enforces security policies to safeguard the network from malicious activities.
[0043] A network planning and optimization component (not illustrated in FIG. 2) of the system 125 assists in designing and optimizing the network infrastructure. The network planning and optimization component employs algorithms and modeling techniques to analyze network performance, predict capacity requirements, and plan for network expansions or upgrades. The network planning and optimization component helps the administrators in making informed decisions regarding network topology, routing protocols, and resource allocation to enhance network efficiency and scalability.
[0044] As per the illustrated embodiment, the system 125 includes one or more processors 205, a memory 210, and an input/output interface unit 215. The one or more processors 205, hereinafter referred to as the processor 205, may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, single board computers, and/or any devices that manipulate signals based on operational instructions. As per the illustrated embodiment, the system 125 includes the processor 205. However, it is to be noted that the system 125 may include multiple processors as per the requirement and without deviating from the scope of the present disclosure. Among other capabilities, the processor 205 is configured to fetch and execute computer-readable instructions stored in the memory 210. The memory 210 may be configured to store one or more computer-readable instructions or routines in a non-transitory computer-readable storage medium, which may be fetched and executed to create or share data packets over a network service. The memory 210 may include any non-transitory storage device including, for example, volatile memory such as RAM, or non-volatile memory such as EPROM, flash memory, and the like.
[0045] In an embodiment, the input/output (I/O) interface unit 215 includes a variety of interfaces, for example, interfaces for data input and output devices, referred to as Input/Output (I/O) devices, storage devices, and the like. The I/O interface unit 215 facilitates communication of the system 125. In one embodiment, the I/O interface unit 215 provides a communication pathway for one or more components of the system 125. Examples of such components include, but are not limited to, the network devices 110, a backend database 220, and a distributed cache 225.
[0046] The backend database 220 is one of, but is not limited to, a centralized database, a cloud-based database, a commercial database, an open-source database, a distributed database, an end-user database, a graphical database, a non-Structured Query Language (NoSQL) database, an object-oriented database, a personal database, an in-memory database, a document-based database, a time series database, a wide column database, a key value database, a search database, a cache database, and so forth. The foregoing examples of backend database 220 types are non-limiting and may not be mutually exclusive; e.g., a database can be both commercial and cloud-based, or both relational and open-source.
[0047] The distributed cache 225 is a pool of Random-Access Memory (RAM) of multiple networked computers into a single in-memory data store for use as a data cache to provide fast access to data. The distributed cache 225 is essential for applications that need to scale across multiple servers or are distributed geographically. The distributed cache 225 ensures that data is available close to where it’s needed, even if the original data source is remote or under heavy load.
[0048] Further, the processor 205, in an embodiment, may be implemented as a combination of hardware and programming (for example, programmable instructions) to implement one or more functionalities of the processor 205. In the examples described herein, such combinations of hardware and programming may be implemented in several different ways. For example, the programming for the processor 205 may be processor-executable instructions stored on a non-transitory machine-readable storage medium and the hardware for the processor 205 may comprise a processing resource (for example, one or more processors), to execute such instructions. In the present examples, the memory 210 may store instructions that, when executed by the processing resource, implement the processor 205. In such examples, the system 125 may comprise the memory 210 storing the instructions and the processing resource to execute the instructions, or the memory 210 may be separate but accessible to the system 125 and the processing resource. In other examples, the processor 205 may be implemented by electronic circuitry.
[0049] For the system 125 to handle repetitive alarms and auditing, the processor 205 includes a collector component 228, a Fault Processor Master (FM) module 230, a raise FM module 235, a clear FM module 240, and a retry FM module 245 communicably coupled to each other.
[0050] The collector component 228 of the processor 205 is communicably connected to each of the first network device 110a, the second network device 110b, and the third network device 110c via the communication network 105. Accordingly, the collector component 228 is configured to collect stream data from network elements. The stream data includes Fault, Configuration, Accounting, Performance, and Security (FCAPs) information. Further, the collector component 228 parses and transforms the stream data into alarms of standardized format. The alarms may belong to one of two event types, raise alarm and clear alarm. The alarms may be enriched with additional information.
[0051] The collector component 228 receives FCAPs data over different protocols, such as, but not limited to, SNMP (Simple Network Management Protocol), syslog, Representational State Transfer (REST), Simple Object Access Protocol (SOAP), or Kafka, from network devices 110. The collector component 228 converts the FCAPs data into a generic alarm format, making it compatible with the system's processing requirements. After conversion, the collector component 228 forwards the alarms to the FM module 230 for further processing.
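By way of illustration only, the following Python sketch shows one possible shape of such a transformation of a raw FCAPs event into the generic alarm format. The payload keys, the field names, and the use of Python itself are assumptions made for readability; the disclosure does not prescribe a concrete schema or implementation language, and the unique alarm identifier is left blank here because it is assigned by the FM master module described below.

    from dataclasses import dataclass, field
    import time

    @dataclass
    class Alarm:
        event_type: str                 # "RAISE" or "CLEAR"
        source: str                     # network element that emitted the event
        specific_problem: str
        severity: str
        alarm_id: str = ""              # assigned later by the FM master module sketch
        occurrence_count: int = 1
        timestamps: list = field(default_factory=list)

    def to_standard_alarm(raw: dict) -> Alarm:
        """Parse one raw FCAPs event (hypothetical keys) into the generic alarm format."""
        return Alarm(
            event_type="CLEAR" if raw.get("severity", "").upper() == "CLEARED" else "RAISE",
            source=raw["ne_id"],
            specific_problem=raw["problem"],
            severity=raw.get("severity", "MAJOR"),
            timestamps=[raw.get("event_time", time.time())],
        )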
[0052] The distributed cache 225 compares the alarms generated by the collector component 228 with existing alarms already present in the distributed cache 225. Upon comparison, when the alarms are not identified to be present, the alarms are stored in the distributed cache 225. Also, an occurrence count and a timestamp array of the alarms are updated based on the comparison. The distributed cache 225 also stores metadata associated with the alarms and details of the alarms. The distributed cache 225 is provided with a configurable interval for periodically storing the alarms.
[0053] The processor 205 further includes the FM master module 230 configured to receive the alarms from the collector component 228 and play a central role in the alarm synchronization and processing. The FM master module 230 stores the alarms in the distributed cache 225 for efficient processing. The FM master module 230 continuously consumes the alarms from the alarm stream and checks if the alarms already exist in the distributed cache 225. By comparing the received alarms with existing alarms, the FM master module 230 updates the occurrence count and timestamp array of the alarms accordingly. Such a process helps in tracking the number of occurrences of each alarm and maintaining a history of their timestamps. Additionally, the FM master module 230 produces unique alarm identifiers (IDs) for each alarm, which are then streamed towards the raise FM module 235 and the clear FM module 240 for further handling and processing.
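Under the same illustrative assumptions, the behaviour of the FM master module 230 may be sketched as follows, reusing the Alarm record from the previous example and a plain in-process dictionary as a stand-in for the distributed cache 225; a real deployment would use a networked in-memory store, which the disclosure does not name, and the hash-based identifier is only one possible way of producing a unique alarm ID.

    import hashlib

    # In-process dictionary standing in for the distributed cache 225.
    cache: dict = {}

    def master_process(alarm: Alarm, raise_stream: list, clear_stream: list) -> None:
        """Generate the unique alarm ID, deduplicate against the cache, and stream the ID onward."""
        alarm.alarm_id = hashlib.md5(
            f"{alarm.source}|{alarm.specific_problem}".encode()
        ).hexdigest()
        existing = cache.get(alarm.alarm_id)
        if existing is None:
            cache[alarm.alarm_id] = alarm            # first occurrence: store the alarm itself
        else:
            existing.occurrence_count += 1           # recurrence: update count and timestamp history
            existing.timestamps.extend(alarm.timestamps)
        target = raise_stream if alarm.event_type == "RAISE" else clear_stream
        target.append(alarm.alarm_id)                # only the unique alarm ID is streamed onward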
[0054] The processor 205 further includes the raise FM module 235 configured to retrieve the alarms corresponding to the unique alarm identifiers from the distributed cache 225. The raise FM module 235 retrieves the corresponding alarms from the distributed cache 225 using the identifiers. Once retrieved, the raise FM module 235 processes the alarms. The raise FM module 235 updates metadata associated with the alarms, enriches the alarms with additional information or context, and inserts them into the database 220. The raise FM module 235 also performs various operations on the alarms, such as planned event processing, AI-based correlation to identify patterns or related events, and trouble ticketing to initiate incident management processes.
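A minimal, non-limiting sketch of this raise-path processing, continuing the earlier examples, is given below; the enrichment lookup and the dictionary standing in for the active section of the database 220 are hypothetical placeholders.

    def lookup_site(ne_id: str) -> str:
        """Placeholder for inventory enrichment; real enrichment sources are not specified."""
        return "UNKNOWN"

    def raise_process(alarm_id: str, db_active: dict) -> None:
        """Fetch a raised alarm by its unique ID, enrich it, and persist it to the active section."""
        alarm = cache.get(alarm_id)
        if alarm is None:
            return                                   # not in the cache; the auditor path handles stragglers
        db_active[alarm_id] = {
            "alarm_id": alarm_id,
            "source": alarm.source,
            "problem": alarm.specific_problem,
            "severity": alarm.severity,
            "occurrence_count": alarm.occurrence_count,
            "first_seen": min(alarm.timestamps),
            "last_seen": max(alarm.timestamps),
            "site": lookup_site(alarm.source),       # enrichment with additional context
        }
        # Planned-event handling, AI-based correlation and trouble ticketing would be invoked here.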
[0055] In one embodiment, the processor 205 further includes an FM auditor configured to run at longer intervals for scanning the distributed cache 225 and diligently search for any stranded alarms that may have been overlooked or not adequately processed. This fail-safe mechanism ensures that no alarms are lost or left unattended. When the FM auditor identifies stranded alarms, it initiates the necessary processing steps to handle them appropriately, preventing any potential gaps in fault and alarm management. By regularly monitoring the distributed cache 225 and addressing any outstanding alarms, the FM auditor contributes to the overall efficiency and reliability of the system 125, ensuring that all alarms are accounted for and processed in a timely manner.
[0056] The FM auditor may be implemented as a raise auditor and a clear auditor. The raise auditor continuously monitors the running fault processor topics. The raise auditor performs a lookup on the table based on the topics and extracts alarms that have been present in the distributed cache 225 for more than a configurable duration, for example a minute. The raise auditor ensures proper processing of alarms, especially during alarm surges, aiding in system recovery. The clear auditor component operates similarly to the raise auditor but focuses on processing clear alarms. The clear auditor continuously looks up the running fault processor topics, extracts alarms from the distributed cache 225 that have exceeded the configurable time threshold, and processes them accordingly.
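Continuing the same sketch, the auditor behaviour may be illustrated as a periodic scan of the cache stand-in for entries older than the configurable duration; the one-minute default mirrors the example above and is not a prescribed value.

    def audit(now: float, max_age_seconds: float = 60.0) -> list:
        """Return IDs of alarms that have sat in the cache longer than the configurable duration."""
        stranded = []
        for alarm_id, alarm in cache.items():
            if alarm.timestamps and now - min(alarm.timestamps) > max_age_seconds:
                stranded.append(alarm_id)            # re-drive these through the raise/clear path
        return stranded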
[0057] The processor 205 further includes the clear FM module 240 configured to retrieve clear alarms corresponding to the unique alarm identifiers from the distributed cache 225. Further, the clear FM module 240 checks the database 220 for the presence of associated raise alarms. If the raise alarms are found in the database 220, indicating that they were previously raised and not yet cleared, the clear FM module 240 performs clearance operations, i.e., deletes the raise alarms from the active section, adds clearance metadata to the alarms, and stores the raise alarms in an archived section of the database 220. On the other hand, if the associated raise alarms are not found in the database 220, implying that the corresponding raise alarms have not yet been persisted, the clear FM module 240 streams the clear alarms to the retry FM module 245 for retrying.
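A simplified sketch of this clear-path decision follows, again using dictionaries in place of the active and archived sections of the database 220 and a list in place of the retry stream; the clearance metadata shown is illustrative only.

    import time

    def clear_process(alarm_id: str, db_active: dict, db_archive: dict, retry_stream: list) -> None:
        """Clear a persisted raise alarm if present; otherwise hand the clear over to the retry path."""
        raised = db_active.pop(alarm_id, None)       # check the active section for the matching raise
        if raised is not None:
            raised["cleared_at"] = time.time()       # clearance metadata
            db_archive[alarm_id] = raised            # move the alarm to the archived section
        else:
            retry_stream.append(alarm_id)            # matching raise not persisted yet; retry later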
[0058] The processor 205 further includes the retry FM module 245 configured to check the database 220 for presence of the raise alarms corresponding to retry alarm data and delete the raise alarms from the active section when identified to be present. The retry FM module 245 increments a retry count and reproduces the retry alarm data into the stream data when the raise alarms are identified to be absent, thereby retrying an alarm until cleared or a count of retry threshold is exhausted.
[0059] Such a non-blocking retry mechanism ensures optimal performance of the system 125 by avoiding unnecessary blocking of application threads and enabling handling of exceptional cases or delayed processing. The non-blocking retry mechanism uses a stream as a holding point and ensures that application threads remain free, enabling higher performance. During the retry process, the raise alarm is inserted into the database 220, allowing the clear alarm to find it. For example, the clear alarm may be held for a defined holding period, such as an hour. The holding period is configurable. Further, the retry interval can also be configured in different implementations, for example 5 seconds, 15 seconds, and so on.
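The retry path may be sketched as follows under the same assumptions; the retry-count bookkeeping and the default retry threshold are illustrative only, and the escalation policy once retries are exhausted is not specified by the disclosure.

    def retry_process(alarm_id: str, retry_counts: dict, db_active: dict, db_archive: dict,
                      retry_stream: list, max_retries: int = 5) -> None:
        """Retry a pending clear without blocking; give up once the retry threshold is exhausted."""
        raised = db_active.pop(alarm_id, None)
        if raised is not None:
            raised["cleared_at"] = time.time()
            db_archive[alarm_id] = raised            # the raise has arrived meanwhile: clear it now
            return
        count = retry_counts.get(alarm_id, 0) + 1
        retry_counts[alarm_id] = count
        if count < max_retries:
            retry_stream.append(alarm_id)            # reproduce into the stream for a later attempt
        # else: retries exhausted; escalation or discard policy is outside this sketch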
[0060] One or more parameters associated with operation of one or more of the above-described modules are configurable. The parameters may include, but are not limited to, time-based and count-based thresholds, delay for the raise alarm and the clear alarm consumption, interval for storing in the distributed cache, and batch interval for an auditor.
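By way of example only, such parameters could be grouped into a single configuration object; every name and default value below is an assumption, since the disclosure identifies only the categories of parameters and not their values.

    from dataclasses import dataclass

    @dataclass
    class FaultProcessorConfig:
        """Illustrative grouping of the configurable parameters; all names and defaults are assumptions."""
        raise_count_threshold: int = 3          # occurrence count that makes a recurrence significant
        time_threshold_seconds: int = 300       # timestamp gap that also makes a recurrence significant
        raise_consume_delay_seconds: int = 5    # delay for raise alarm consumption
        clear_consume_delay_seconds: int = 5    # delay for clear alarm consumption
        cache_store_interval_seconds: int = 10  # interval for periodically storing into the distributed cache
        auditor_batch_interval_seconds: int = 60
        retry_threshold: int = 5
        retry_interval_seconds: int = 15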
[0061] The distributed cache 225 of the system 125 has a high-throughput, distributed architecture designed to provide a reliable and available in-memory database with disk persistence. The distributed cache 225 serves as a storage mechanism for the alarms received from the FM master module 230. The distributed cache 225 stores the alarms based on their unique IDs generated by the FM master module 230. In this manner, the distributed cache 225 allows for efficient retrieval and processing of alarm data. Additionally, the distributed cache 225 efficiently updates the occurrence counts and timestamp arrays of existing alarms, eliminating the need for multiple database hits and optimizing performance.
[0062] The database 220 of the system 125 serves as a non-structured (NoSQL) database that stores the FCAPs data, including alarms. The database 220 maintains both active and archived alarms, allowing for efficient retrieval and reporting. The raise FM module 235 and the clear FM module 240 interact with the database 220 to perform operations such as inserting new alarms, updating metadata, retrieving alarm information for processing, and storing cleared alarms in the archived section. The database 220 plays a crucial role in persistently storing and managing alarm data, ensuring its availability for analysis, reporting, and historical tracking.
[0063] Referring to FIG. 3, which illustrates a block diagram of the system 125 and the first network device 110a communicating with each other for handling alarms, a preferred embodiment of the system 125 is described. It is to be noted that the embodiment with respect to FIG. 3 will be explained with respect to the first network device 110a for the purpose of description and illustration and should nowhere be construed as limiting the scope of the present disclosure.
[0064] The first network device 110a includes one or more primary processors 305 communicably coupled to the processor 205 of the system 125. The one or more primary processors 305 are coupled with a memory unit 310 storing instructions which are executed by the one or more primary processors 305. Execution of the stored instructions by the one or more primary processors 305 enables the first network device 110a to provide stream data including Fault, Configuration, Accounting, Performance, and Security (FCAPs) information. The first network device 110a further includes a kernel 315, which is a core component serving as the primary interface between hardware components of the first network device 110a and the plurality of services at the backend database 220. The kernel 315 is configured to provide the plurality of services on the first network device 110a to resources available in the communication network 105. The resources include one or more of a Central Processing Unit (CPU) and memory components such as Random Access Memory (RAM) and Read Only Memory (ROM).
[0065] In the preferred embodiment, the collector component 228 of the processor 205 is communicably connected to the kernel 315 of the first network device 110a. The collector component 228 is configured to collect the stream data, and parse and transform the stream data into alarms in a standardized format. The alarms comprise one of two event types, raise alarm and clear alarm. The collector component 228 is connected with a distributed cache 225 configured to store the alarms after comparing the alarms with existing alarms already present in the distributed cache 225. The processor 205 further includes the FM master module 230 communicably connected to the collector component 228 to generate a unique alarm identifier for each alarm. The processor 205 further includes the raise FM module 235 in communication with the FM master module 230 to retrieve the alarms corresponding to the unique alarm identifiers from the distributed cache 225. The processor 205 further includes the clear FM module 240 in communication with the raise FM module 235 to retrieve clear alarms corresponding to the unique alarm identifiers from the distributed cache 225, check a database 220 for the presence of associated raise alarms, delete the raise alarms from an active section when the associated raise alarms are identified to be present, and stream the clear alarms for retrying when the associated raise alarms are identified to be absent. The processor 205 further includes the retry FM module 245 in communication with the clear FM module 240 to check the database 220 for the presence of the raise alarms corresponding to retry alarm data and delete the raise alarms from the active section when identified to be present.
[0066] FIG. 4 illustrates a system architecture for handling repetitive alarms and auditing, according to one or more embodiments of the present disclosure. The collector component 228 collects stream data from network elements, parses and transforms the stream data into alarms of standardized format, and pushes the alarms into an alarm stream. The alarms can be of two types, raise alarm and clear alarm. The FM master module 230 consumes the alarms from the alarm stream and stores them into the distributed cache 225. Upon identifying that an alarm is already stored, the FM master module 230 updates occurrence count and timestamp array of the alarm in the distributed cache 225.
[0067] The raise FM module 235 fetches the alarms from the distributed cache 225 based on their unique identifiers and performs various operations on the alarms, such as planned event processing, AI-based correlation to identify patterns or related events, and trouble ticketing to initiate incident management processes. The raise FM module 235 also updates metadata associated with the alarms, enriches the alarms with additional information or context, and inserts them into the database 220. The clear FM module 240 retrieves clear alarms corresponding to the unique alarm identifiers from the distributed cache 225 and checks the database 220 for the presence of associated raise alarms. The clear FM module 240 deletes the raise alarms from an active section when the associated raise alarms are identified to be present and streams the clear alarms for retrying when the associated raise alarms are identified to be absent. After deleting the raise alarms from the active section, the clear FM module 240 adds clearance metadata to the alarms and stores them in an archived section of the database 220. The retry FM module 245 checks the database 220 for the presence of the raise alarms corresponding to retry alarm data and deletes the raise alarms from the active section when identified to be present. If the raise alarms are not found, the retry FM module 245 increments the retry count and reproduces the data into the retry stream for subsequent retries. This non-blocking retry mechanism avoids unnecessary blocking of application threads and enables handling of exceptional cases or delayed processing.
[0068] FIG. 5 illustrates a flow chart of a method 500 of handling raise alarms, according to one or more embodiments of the present disclosure. For the purpose of description, the method 500 is described with the embodiments as illustrated in FIGS. 1 and 2 and should nowhere be construed as limiting the scope of the present disclosure.
[0069] At step 505, the method 500 includes the step of raising an alarm with a unique alarm ID, by the processor 205. Initially, the alarm may not be present in the distributed cache 225, and therefore will need to be stored into the distributed cache 225. Additionally, an alarm is produced in the raise message stream to notify relevant components.
[0070] At step 510, the method 500 includes the step of tracking recurrence and update of alarm in the distributed cache 225 by the processor 205. If the alarm with same alarm ID is raised again, the processor 205 retrieves the alarm from the distributed cache 225. Since the alarm is identified to be already present in the distributed cache 225, its occurrence count and timestamp are updated to track recurrences accurately.
[0071] At step 515, the method 500 includes the step of monitoring an occurrence count and timestamp evaluation by the processor 205. The processor 205 evaluates the occurrence count and the timestamp of the alarm. If the occurrence count reaches the configured raise threshold or the timestamp difference exceeds a configured time threshold, indicating a significant recurrence, the alarm is updated in the distributed cache 225 and produced in a raise message stream. Alternatively, if the timestamp does not exceed the configured time threshold, the alarm is only updated in the distributed cache 225 for future reference.
[0072] At step 520, the method 500 includes the step of consuming data from the raise stream based on a configured frequency, by the processor 205. Further, the processor 205 sends the alarm for persistent storage and further processing, such as in the database 220. Further processing includes metadata updating and enrichment, planned event handling, Artificial Intelligence (AI)-based correlation, and trouble ticketing.
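The threshold evaluation of steps 510 to 520 may be illustrated as follows, reusing the Alarm and configuration sketches introduced earlier; the exact comparison semantics are assumptions consistent with the description above.

    def on_repeat_raise(alarm: Alarm, now: float, cfg: FaultProcessorConfig,
                        raise_stream: list) -> None:
        """Steps 510-515 in miniature: update the cached alarm and re-stream only significant recurrences."""
        last_seen = alarm.timestamps[-1] if alarm.timestamps else now
        alarm.occurrence_count += 1
        alarm.timestamps.append(now)
        significant = (
            alarm.occurrence_count >= cfg.raise_count_threshold
            or now - last_seen > cfg.time_threshold_seconds
        )
        if significant:
            raise_stream.append(alarm.alarm_id)   # produced in the raise message stream for step 520
        # otherwise only the cached copy is updated, avoiding redundant downstream processing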
[0073] FIG. 6 illustrates a flow chart of a method 600 of handling clear alarms, according to one or more embodiments of the present disclosure. For the purpose of description, the method 600 is described with the embodiments as illustrated in FIGS. 1 and 2 and should nowhere be construed as limiting the scope of the present disclosure.
[0074] At step 605, the method 600 includes the step of receiving an alarm with same alarm ID for clearance, by the processor 205. Initially, the alarm may not be present in the distributed cache 225, and therefore will need to be stored into the distributed cache 225. Additionally, an alarm is produced in the clear message stream to notify relevant components.
[0075] At step 610, the method 600 includes the step of tracking recurrence and update of alarm in the distributed cache 225 by the processor 205. If the alarm with same alarm ID is received again for clearance, the processor 205 retrieves the alarm from the distributed cache 225. Since the alarm is identified to be already present in the distributed cache 225, its occurrence count and timestamp are updated to track recurrences accurately.
[0076] At step 615, the method 600 includes the step of monitoring an occurrence count and timestamp evaluation by the processor 205. The processor 205 evaluates the occurrence count and clear timestamp of the alarm. If the occurrence count reaches the configured clear threshold or the clear timestamp exceeds the configured time threshold, indicating a significant recurrence, the alarm is updated in the distributed cache 225 and produced in the clear message stream. Alternatively, if neither threshold is reached, the alarm is only updated in the distributed cache 225 for future reference.
[0077] At step 620, the method 600 includes the step of consuming data from the clear stream based on a configured frequency, by the processor 205. Further, the processor 205 sends the alarm for clearance in the persisted storage, such as the database 220. Sending the alarm for clearance includes checking the alarm in the database 220 to find a corresponding raise alarm. If a match is found, indicating an active raise alarm, the raise alarm is removed from the active section and archived. Alternatively, if no match is found, indicating a missed clear alarm, the alarm is produced back to the stream for the retry FM module 245 to retry the clearance.
[0078] FIG. 7 illustrates a flow chart of a method 700 of managing raise alarms and clear alarms in a network management system, according to one or more embodiments of the present disclosure. For the purpose of description, the method 700 is described with the embodiments as illustrated in FIGS. 1 and 2 and should nowhere be construed as limiting the scope of the present disclosure.
[0079] At step 705, the method 700 includes the step of receiving an alarm from a message stream, by the processor 205.
[0080] At step 710, the method 700 includes the step of determining whether the alarm is present in the distributed cache 225 or not, by the processor 205.
[0081] At step 715, the method 700 includes updating the alarm occurrence count and the timestamp when the alarm is identified to be present in the distributed cache 225, by the processor 205. The most recent update is recorded in the distributed cache 225.
[0082] At step 720, the method 700 includes inserting the alarm in the distributed cache 225 when the alarm is identified to be absent in the distributed cache 225, by the processor 205.
[0083] At step 725, the method 700 includes processing the raise or clear alarms after the alarm is inserted in the distributed cache 225, by the processor 205.
[0084] At step 730, the method 700 includes comparing an occurrence count of the alarm against a threshold. Ideally, the alarm occurrence count does not exceed the threshold. Similarly, the timestamp of the alarm is also compared with a time threshold, at step 730.
[0085] At step 735, the method 700 includes producing the message to the message stream when the timestamp is identified to be greater than the time threshold, and successively clearing the alarm from the distributed cache 225, at step 725.
[0086] The present invention further discloses a non-transitory computer-readable medium having stored thereon computer-readable instructions. The computer-readable instructions are executed by the processor 205. The processor 205 is configured to collect stream data from network elements. The processor 205 is further configured to parse and transform the stream data into alarms in a standardized format. The alarms comprise one of two event types, raise alarm and clear alarm. The processor 205 is further configured to store the alarms in a distributed cache 225 after comparing the alarms with existing alarms already present in the distributed cache 225. The processor 205 is further configured to generate a unique alarm identifier for each alarm. The processor 205 is further configured to retrieve the alarms corresponding to the unique alarm identifiers from the distributed cache 225. The processor 205 is further configured to retrieve the clear alarms corresponding to the unique alarm identifiers from the distributed cache 225. The processor 205 is further configured to check the database 220 for presence of associated raise alarms, and delete the raise alarms from an active section when the associated raise alarms are identified to be present and stream the clear alarms for retrying when the associated raise alarms are identified to be absent. The processor 205 is further configured to check the database 220 for presence of the raise alarms corresponding to retry alarm data and delete the raise alarms from the active section when identified to be present.
[0087] A person of ordinary skill in the art will readily ascertain that the illustrated embodiments and steps in description and drawings (FIG.1-7) are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope and spirit of the disclosed embodiments.
[0088] The above-described techniques of the present disclosure provide multiple advantages, including efficient handling of repetitive alarms and auditing using a distributed cache. Usage of the distributed cache improves system performance by supporting higher transaction processing rates (TPS) and efficiently consolidating multiple occurrences of an alarm into a single processed record. Configurable parameters, including time-based and count-based thresholds, delay for raise and clear consumption, interval for storing in the I/O cache, and batch interval for the auditor, offer flexibility and optimization options.
[0089] The present invention offers multiple advantages over the prior art, and the above listed are a few examples to emphasize some of the advantageous features. The listed advantages are to be read in a non-limiting manner.
[0090] Server: A server may include or comprise, by way of example but not limitation, one or more of a standalone server, a server blade, a server rack, a bank of servers, a server farm, hardware supporting a part of a cloud service or system, a home server, hardware running a virtualized server, one or more processors executing code to function as a server, one or more machines performing server-side functionality as described herein, at least a portion of any of the above, or some combination thereof. In an embodiment, the entity may include, but is not limited to, a vendor, a network operator, a company, an organization, a university, a lab facility, a business enterprise, a defence facility, or any other facility that provides content.
[0091] Network: A network may include, by way of example but not limitation, at least a portion of one or more networks having one or more nodes that transmit, receive, forward, generate, buffer, store, route, switch, process, or a combination thereof, etc. one or more messages, packets, signals, waves, voltage or current levels, some combination thereof, or so forth. The network may also include, by way of example but not limitation, one or more of a wireless network, a wired network, an internet, an intranet, a public network, a private network, a packet-switched network, a circuit-switched network, an ad hoc network, an infrastructure network, a Public-Switched Telephone Network (PSTN), a cable network, a cellular network, a satellite network, a fiber optic network, or some combination thereof.
[0092] UE/ Wireless Device: A wireless device or a user equipment (UE) may include, but are not limited to, a handheld wireless communication device (e.g., a mobile phone, a smart phone, a phablet device, and so on), a wearable computer device (e.g., a head-mounted display computer device, a head-mounted camera device, a wristwatch computer device, and so on), a Global Positioning System (GPS) device, a laptop computer, a tablet computer, or another type of portable computer, a media playing device, a portable gaming system, and/or any other type of computer device with wireless communication capabilities, and the like. In an embodiment, the UEs may communicate with the system via set of executable instructions residing on any operating system. In an embodiment, the UEs may include, but are not limited to, any electrical, electronic, electro-mechanical or an equipment or a combination of one or more of the above devices such as virtual reality (VR) devices, augmented reality (AR) devices, laptop, a general-purpose computer, desktop, personal digital assistant, tablet computer, mainframe computer, or any other computing device, wherein the computing device may include one or more in-built or externally coupled accessories including, but not limited to, a visual aid device such as camera, audio aid, a microphone, a keyboard, input devices for receiving input from a user such as touch pad, touch enabled screen, electronic pen and the like. It may be appreciated that the UEs may not be restricted to the mentioned devices and various other devices may be used.
[0093] System (for example, computing system): A system may include one or more processors coupled with a memory, wherein the memory may store instructions which when executed by the one or more processors may cause the system to perform handling of alarms in a network. An exemplary representation of the system for such purpose is described in accordance with embodiments of the present disclosure. In an embodiment, the system may include one or more processor(s). The one or more processor(s) may be implemented as one or more microprocessors, microcomputers, microcontrollers, edge or fog microcontrollers, digital signal processors, central processing units, logic circuitries, and/or any devices that process data based on operational instructions. Among other capabilities, the one or more processor(s) may be configured to fetch and execute computer-readable instructions stored in a memory of the system. The memory may be configured to store one or more computer-readable instructions or routines in a non-transitory computer readable storage medium, which may be fetched and executed to create or share data packets over a network service. The memory may comprise any non-transitory storage device including, for example, volatile memory such as Random-Access Memory (RAM), or non-volatile memory such as Electrically Erasable Programmable Read-Only Memory (EEPROM), flash memory, and the like. In an embodiment, the system may include an interface(s). The interface(s) may comprise a variety of interfaces, for example, interfaces for data input and output devices, referred to as input/output (I/O) devices, storage devices, and the like. The interface(s) may facilitate communication for the system. The interface(s) may also provide a communication pathway for one or more components of the system. Examples of such components include, but are not limited to, processing unit/engine(s) and a database. The processing unit/engine(s) may be implemented as a combination of hardware and programming (for example, programmable instructions) to implement one or more functionalities of the processing engine(s). In examples described herein, such combinations of hardware and programming may be implemented in several different ways. For example, the programming for the processing engine(s) may be processor executable instructions stored on a non-transitory machine-readable storage medium, and the hardware for the processing engine(s) may comprise a processing resource (for example, one or more processors) to execute such instructions. In the present examples, the machine-readable storage medium may store instructions that, when executed by the processing resource, implement the processing engine(s). In such examples, the system may include the machine-readable storage medium storing the instructions and the processing resource to execute the instructions, or the machine-readable storage medium may be separate but accessible to the system and the processing resource. In other examples, the processing engine(s) may be implemented by electronic circuitry. In an aspect, the database may comprise data that may be either stored or generated as a result of functionalities implemented by any of the components of the processor or the processing engines.
[0094] Computer System: A computer system may include an external storage device, a bus, a main memory, a read-only memory, a mass storage device, communication port(s), and a processor. A person skilled in the art will appreciate that the computer system may include more than one processor and communication ports. The communication port(s) may be any of an RS-232 port for use with a modem-based dialup connection, a 10/100 Ethernet port, a Gigabit or 10 Gigabit port using copper or fiber, a serial port, a parallel port, or other existing or future ports. The communication port(s) may be chosen depending on a network, such as a Local Area Network (LAN), a Wide Area Network (WAN), or any network to which the computer system connects. The main memory may be random access memory (RAM), or any other dynamic storage device commonly known in the art. The read-only memory may be any static storage device(s) including, but not limited to, Programmable Read Only Memory (PROM) chips for storing static information, e.g., start-up or basic input/output system (BIOS) instructions for the processor. The mass storage device may be any current or future mass storage solution, which may be used to store information and/or instructions. The bus communicatively couples the processor with the other memory, storage, and communication blocks. The bus can be, e.g., a Peripheral Component Interconnect (PCI) / PCI Extended (PCI-X) bus, Small Computer System Interface (SCSI), universal serial bus (USB), or the like, for connecting expansion cards, drives, and other subsystems, as well as other buses, such as a front side bus (FSB), which connects the processor to the computer system. Optionally, operator and administrative interfaces, e.g., a display, keyboard, and a cursor control device, may also be coupled to the bus to support direct operator interaction with the computer system. Other operator and administrative interfaces may be provided through network connections connected through the communication port(s). In no way should the aforementioned exemplary computer system limit the scope of the present disclosure.

REFERENCE NUMERALS
[0095] Environment - 100;
[0096] Communication network - 105;
[0097] Network devices - 110;
[0098] Server - 115;
[0099] System - 125;
[00100] One or more processors – 205;
[00101] Memory – 210;
[00102] Input/output interface unit – 215;
[00103] Database – 220;
[00104] Distributed cache – 225;
[00105] Collector component – 228;
[00106] FM master module – 230;
[00107] Raise FM module – 235;
[00108] Clear FM module – 240;
[00109] Retry FM module – 245;
[00110] First network device – 110a;
[00111] Primary processor of first network device – 305;
[00112] Memory unit of first network device – 310; and
[00113] Kernel of the network device – 315.

CLAIMS

1. A method of handling alarms in a Network Management System (NMS) (125), the method comprising:
collecting, by one or more processors (205), stream data from one or more network elements (110);
parsing and transforming, by the one or more processors (205), the stream data into alarms in a standardized format, the alarms comprising one of two event types, raise alarm and clear alarm;
storing, by the one or more processors (205), the alarms in a distributed cache (225) after comparing the alarms with existing alarms already present in the distributed cache (225);
generating, by the one or more processors (205), a unique alarm identifier for each alarm;
retrieving, by the one or more processors (205), the clear alarms corresponding to the unique alarm identifiers from the distributed cache (225);
checking, by the one or more processors (205), a database (220) for presence of associated raise alarms, and deleting the raise alarms from an active section when the associated raise alarms are identified to be present and streaming the clear alarms for retrying when the associated raise alarms are identified to be absent; and
checking, by the one or more processors (205), the database (220) for presence of the raise alarms corresponding to retry alarm data and deleting the raise alarms from the active section when identified to be present.
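
By way of non-limiting illustration of the method of claim 1, the following Python sketch models the raise, clear, and retry flow with simple in-memory stand-ins for the distributed cache (225), the active section of the database (220), and the retry stream; all identifiers and data structures are hypothetical, and no particular cache or database product is implied.

    import time

    distributed_cache = {}   # stand-in for the distributed cache (225): alarm identifier -> alarm record
    active_db = {}           # stand-in for the active section of the database (220)
    retry_stream = []        # stand-in for the stream on which unmatched clear alarms are retried

    def alarm_identifier(event):
        # Hypothetical unique alarm identifier derived from fields that identify the alarm source.
        return f"{event['element']}|{event['code']}"

    def handle_event(raw_event):
        # Parse and transform the stream data into a standardized alarm (raise or clear).
        event = {"element": raw_event["element"],
                 "code": raw_event["code"],
                 "type": raw_event["type"],          # "raise" or "clear"
                 "timestamp": raw_event.get("timestamp", time.time())}
        aid = alarm_identifier(event)

        if event["type"] == "raise":
            existing = distributed_cache.get(aid)
            if existing:
                # Repeated raise alarm: update the existing record instead of creating a new one.
                existing["occurrence_count"] += 1
                existing["timestamps"].append(event["timestamp"])
            else:
                record = dict(event, occurrence_count=1, timestamps=[event["timestamp"]])
                distributed_cache[aid] = record
                active_db[aid] = record              # persist into the active section
        else:
            if aid in active_db:
                # Associated raise alarm present: delete it from the active section.
                del active_db[aid]
                distributed_cache.pop(aid, None)
            else:
                # No associated raise alarm yet: stream the clear alarm for retrying.
                retry_stream.append(event)

    def handle_retry():
        # Re-check the database for raise alarms corresponding to the retry alarm data.
        for event in list(retry_stream):
            aid = alarm_identifier(event)
            if aid in active_db:
                del active_db[aid]
                retry_stream.remove(event)

    # Example usage: a raise alarm followed by its clear alarm leaves the active section empty.
    handle_event({"element": "cell-17", "code": "LINK_DOWN", "type": "raise"})
    handle_event({"element": "cell-17", "code": "LINK_DOWN", "type": "clear"})
    print(active_db)   # {}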

2. The method as claimed in claim 1, wherein the stream data includes Fault, Configuration, Accounting, Performance, and Security (FCAPS) information.

3. The method as claimed in claim 1, comprising updating, by the one or more processors (205), an occurrence count and a timestamp array of the alarms based on the comparison of the alarms with the existing alarms.

4. The method as claimed in claim 1, comprising updating, by the one or more processors (205), metadata associated with the alarms, enriching the alarms with additional information, and storing details of the alarms into the distributed cache (225).

5. The method as claimed in claim 1, comprising incrementing, by the one or more processors (205), a retry count and reproducing the retry alarm data into the stream data when the raise alarms are identified to be absent, thereby retrying an alarm until the alarm is cleared or a retry threshold count is exhausted.
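
A minimal sketch, assuming an in-memory retry stream and a hypothetical retry threshold, of how the retry behaviour recited in claim 5 might be realized; the function and field names are illustrative only and do not limit the claim.

    RETRY_THRESHOLD = 3   # hypothetical maximum number of retries for an unmatched clear alarm

    def retry_clear_alarm(event, active_db, stream):
        # Clear the associated raise alarm if it is now present; otherwise re-produce the clear alarm.
        aid = f"{event['element']}|{event['code']}"
        if aid in active_db:
            del active_db[aid]                  # associated raise alarm found: delete it from the active section
            return "cleared"
        retries = event.get("retry_count", 0)
        if retries >= RETRY_THRESHOLD:
            return "retry-threshold-exhausted"  # stop retrying once the threshold is exhausted
        event["retry_count"] = retries + 1      # increment the retry count
        stream.append(event)                    # reproduce the retry alarm data into the stream
        return "requeued"

    # Example: a clear alarm with no matching raise alarm is requeued for retrying.
    stream = []
    print(retry_clear_alarm({"element": "cell-17", "code": "LINK_DOWN"}, {}, stream))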

6. The method as claimed in claim 1, comprising performing, by the one or more processors (205), one or more operations on the alarms including planned event processing, Artificial Intelligence (AI)-based correlation to identify patterns or related events, and trouble ticketing to initiate incident management processes.

7. The method as claimed in claim 1, comprising performing, by the one or more processors (205), one or more of monitoring and running fault processor topics, performing a lookup on a table based on the topics, extracting alarms present in the distributed cache (225) for more than a configurable duration, and processing the alarms.
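
The following illustrative Python sketch, with hypothetical names, shows one way the audit step of claim 7 could extract alarms that have remained in the distributed cache (225) for more than a configurable duration.

    import time

    def audit_stranded_alarms(distributed_cache, max_age_seconds, now=None):
        # Return alarms that have remained in the cache longer than the configurable duration.
        now = now if now is not None else time.time()
        stranded = []
        for aid, record in distributed_cache.items():
            first_seen = record["timestamps"][0]      # first occurrence timestamp of the alarm
            if now - first_seen > max_age_seconds:
                stranded.append((aid, record))
        return stranded

    # Example: an alarm first seen an hour ago is flagged against a 10-minute threshold.
    cache = {"cell-17|LINK_DOWN": {"timestamps": [time.time() - 3600], "occurrence_count": 1}}
    print(audit_stranded_alarms(cache, max_age_seconds=600))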

8. The method as claimed in claim 1, comprising deleting, by the one or more processors (205), the raise alarms from the active section, adding clearance metadata to the raise alarms, and storing the raise alarms in an archived section of the database (220).
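
By way of non-limiting illustration of claim 8, the archival step might be modelled as below, with the active and archived sections represented by plain dictionaries and the clearance metadata fields chosen purely for illustration.

    import time

    def archive_cleared_alarm(aid, active_section, archived_section, cleared_by="system"):
        # Delete the raise alarm from the active section, add clearance metadata, and archive it.
        record = active_section.pop(aid, None)
        if record is None:
            return None
        record["clearance_metadata"] = {"cleared_at": time.time(), "cleared_by": cleared_by}
        archived_section[aid] = record
        return record

    # Example usage.
    active = {"cell-17|LINK_DOWN": {"occurrence_count": 2}}
    archived = {}
    archive_cleared_alarm("cell-17|LINK_DOWN", active, archived)
    print(list(archived))   # ['cell-17|LINK_DOWN']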

9. The method as claimed in claim 1, comprising providing, by the one or more processors (205), the distributed cache (225) with a configurable interval for periodically storing the alarms.

10. The method as claimed in claim 1, comprising scanning, by the one or more processors (205), the distributed cache (225) for identifying a stranded alarm.

11. A Network Management System (NMS) (125) for handling alarms, wherein the NMS (125) comprises:
a collector component (228) configured to:
collect stream data from network elements (110); and
parse and transform the stream data into alarms in a standardized format, the alarms comprising one of two event types, raise alarm and clear alarm;
a distributed cache (225) configured to store the alarms after comparing the alarms with existing alarms already present in the distributed cache (225);
a Fault processor Master (FM) module (230) configured to generate a unique alarm identifier for each alarm;
a clear FM module (240) configured to:
retrieve clear alarms corresponding to the unique alarm identifiers from the distributed cache (225); and
check a database (220) for presence of associated raise alarms, and delete the raise alarms from an active section when the associated raise alarms are identified to be present and stream the clear alarms for retrying when the associated raise alarms are identified to be absent; and
a retry FM module (245) configured to check the database (220) for presence of the raise alarms corresponding to retry alarm data and delete the raise alarms from the active section when identified to be present.
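
For clarity only, the modular arrangement of claim 11 may be pictured as the following Python skeleton; the class names mirror the claimed components, while the method names and bodies are hypothetical and non-limiting.

    class CollectorComponent:
        # Collects stream data from network elements and normalizes it into raise or clear alarms.
        def collect_and_parse(self, raw_event):
            return {"type": raw_event["type"], "element": raw_event["element"], "code": raw_event["code"]}

    class FMMaster:
        # Generates a unique alarm identifier for each alarm.
        def alarm_identifier(self, alarm):
            return f"{alarm['element']}|{alarm['code']}"

    class ClearFM:
        # Removes the associated raise alarm when present, otherwise streams the clear alarm for retrying.
        def __init__(self, database, retry_stream):
            self.database, self.retry_stream = database, retry_stream
        def process(self, aid, alarm):
            if aid in self.database:
                del self.database[aid]
            else:
                self.retry_stream.append(alarm)

    class RetryFM:
        # Re-checks the database for raise alarms corresponding to retry alarm data and removes them.
        def __init__(self, database):
            self.database = database
        def process(self, aid):
            self.database.pop(aid, None)

    # Example wiring of the modules.
    database, retry_stream = {"cell-17|LINK_DOWN": {}}, []
    ClearFM(database, retry_stream).process("cell-17|LINK_DOWN", {"type": "clear"})
    print(database)   # {}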

12. The NMS (125) as claimed in claim 11, wherein the stream data includes Fault, Configuration, Accounting, Performance, and Security (FCAPS) information.

13. The NMS (125) as claimed in claim 11, wherein an occurrence count and a timestamp array of the alarms is updated based on the comparison of the alarms with the existing alarms.

14. The NMS (125) as claimed in claim 11, wherein metadata associated with the alarms is updated, the alarms are enriched with additional information, and details of the alarms are stored into the distributed cache (225).

15. The NMS (125) as claimed in claim 11, wherein a retry count is incremented and the retry alarm data is reproduced into the stream data when the raise alarms are identified to be absent, thereby retrying an alarm until the alarm is cleared or a retry threshold count is exhausted.

16. The NMS (125) as claimed in claim 11, wherein the NMS (125) performs one or more operations on the alarms including planned event processing, AI-based correlation to identify patterns or related events, and trouble ticketing to initiate incident management processes.

17. The NMS (125) as claimed in claim 11, wherein the NMS (125) is configured to perform one or more of monitoring and running fault processor topics, performing a lookup on a table based on the topics, extracting alarms present in the distributed cache (225) for more than a configurable duration, and processing the alarms.

18. The NMS (125) as claimed in claim 11, wherein when the raise alarms are deleted from the active section, the NMS (125) adds clearance metadata to the raise alarms, and stores the raise alarms in an archived section of the database (220).

19. The NMS (125) as claimed in claim 11, wherein the distributed cache (225) is provided with a configurable interval for periodically storing the alarms.

20. The NMS (125) as claimed in claim 11, wherein the NMS (125) scans the distributed cache (225) for identifying a stranded alarm.

Documents

Application Documents

# Name Date
1 202321046099-STATEMENT OF UNDERTAKING (FORM 3) [09-07-2023(online)].pdf 2023-07-09
2 202321046099-PROVISIONAL SPECIFICATION [09-07-2023(online)].pdf 2023-07-09
3 202321046099-FORM 1 [09-07-2023(online)].pdf 2023-07-09
4 202321046099-FIGURE OF ABSTRACT [09-07-2023(online)].pdf 2023-07-09
5 202321046099-DRAWINGS [09-07-2023(online)].pdf 2023-07-09
6 202321046099-DECLARATION OF INVENTORSHIP (FORM 5) [09-07-2023(online)].pdf 2023-07-09
7 202321046099-FORM-26 [20-09-2023(online)].pdf 2023-09-20
8 202321046099-Proof of Right [22-12-2023(online)].pdf 2023-12-22
9 202321046099-DRAWING [01-07-2024(online)].pdf 2024-07-01
10 202321046099-COMPLETE SPECIFICATION [01-07-2024(online)].pdf 2024-07-01
11 Abstract-1.jpg 2024-08-05
12 202321046099-Power of Attorney [11-11-2024(online)].pdf 2024-11-11
13 202321046099-Form 1 (Submitted on date of filing) [11-11-2024(online)].pdf 2024-11-11
14 202321046099-Covering Letter [11-11-2024(online)].pdf 2024-11-11
15 202321046099-CERTIFIED COPIES TRANSMISSION TO IB [11-11-2024(online)].pdf 2024-11-11
16 202321046099-FORM 3 [27-11-2024(online)].pdf 2024-11-27
17 202321046099-FORM 18 [20-03-2025(online)].pdf 2025-03-20