System And Method For Selecting Anomaly Detection Technique

Abstract: The present invention relates to a system (108) and a method (500) for selecting an anomaly detection technique. The method (500) includes the step of retrieving data from one or more data sources based on establishing one or more connections with the one or more data sources. The method (500) further includes the step of formatting the data to align with an acceptable format. The method (500) further includes the step of storing the formatted data in a storage unit (206). The method (500) further includes the step of determining one or more attributes of the data stored in the storage unit (206). The method (500) further includes the step of selecting one or more anomaly detection techniques based on determining the one or more attributes. The method (500) further includes the step of applying the one or more anomaly detection techniques to the stored data to identify one or more anomalies. Ref. Fig. 2

Patent Information

Application #

Filing Date

06 October 2023

Publication Number

15/2025

Publication Type

INA

Invention Field

COMPUTER SCIENCE

Status

Email

Parent Application

Applicants

JIO PLATFORMS LIMITED

OFFICE-101, SAFFRON, NR. CENTRE POINT, PANCHWATI 5 RASTA, AMBAWADI, AHMEDABAD 380006, GUJARAT, INDIA

Inventors

1. Aayush Bhatnagar

Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India

2. Ankit Murarka

Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India

3. Jugal Kishore

Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India

4. Chandra Ganveer

Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India

5. Sanjana Chaudhary

Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India

6. Gourav Gurbani

Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India

7. Yogesh Kumar

Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India

8. Avinash Kushwaha

Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India

9. Dharmendra Kumar Vishwakarma

Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India

10. Sajal Soni

Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India

11. Niharika Patnam

Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India

12. Shubham Ingle

Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India

13. Harsh Poddar

Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India

14. Sanket Kumthekar

Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India

15. Mohit Bhanwria

Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India

16. Shashank Bhushan

Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India

17. Vinay Gayki

Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India

18. Aniket Khade

Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India

19. Durgesh Kumar

Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India

20. Zenith Kumar

Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India

21. Gaurav Kumar

Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India

22. Manasvi Rajani

Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India

23. Kishan Sahu

Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India

24. Sunil Meena

Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India

25. Supriya Kaushik De

Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India

26. Kumar Debashish

Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India

27. Mehul Tilala

Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India

28. Satish Narayan

Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India

29. Rahul Kumar

Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India

30. Harshita Garg

Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India

31. Kunal Telgote

Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India

32. Ralph Lobo

Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India

33. Girish Dange

Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India

Specification

DESC:
FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENTS RULES, 2003

COMPLETE SPECIFICATION
(See section 10 and rule 13)
1. TITLE OF THE INVENTION
SYSTEM AND METHOD FOR SELECTING ANOMALY DETECTION TECHNIQUE
2. APPLICANT(S)
NAME: JIO PLATFORMS LIMITED
NATIONALITY: INDIAN
ADDRESS: OFFICE-101, SAFFRON, NR. CENTRE POINT, PANCHWATI 5 RASTA, AMBAWADI, AHMEDABAD 380006, GUJARAT, INDIA
3. PREAMBLE TO THE DESCRIPTION

THE FOLLOWING SPECIFICATION PARTICULARLY DESCRIBES THE NATURE OF THIS INVENTION AND THE MANNER IN WHICH IT IS TO BE PERFORMED.

FIELD OF THE INVENTION
[0001] The present invention relates to the field of network management. More particularly, the invention relates to a system and a method for analyzing data and selecting a suitable anomaly detection technique based on the characteristics of the data source.
BACKGROUND OF THE INVENTION
[0002] With the increase in the number of users, network service providers have been implementing upgrades to enhance service quality and keep pace with high demand. To enhance user experience and implement advanced monitoring mechanisms, prediction methodologies are being incorporated into network management. An advanced prediction system integrated with Artificial Intelligence/Machine Learning (AI/ML) can execute a wide array of algorithms or methodologies to perform analysis or predictive tasks on data inputs obtained from various sources with different formats, types, volumes, structures, etc. The data input may consist of historical data or live data.
[0003] A telecommunication network generates various types of data from different sources, each with unique characteristics. Network performance data contains information about network latency, packet loss, and error rates, and is structured and historical. Traditional statistical methods, like the Z-Score, are currently being used to analyse this data to detect anomalies. Cell tower telemetry data, which consists of real-time data streams from cell towers, provides information on signal strength, handover events, and network congestion. Streaming anomaly detection algorithms, such as Holt-Winters or Exponential Moving Averages (EMA), can be applied to real-time data to capture sudden changes in signal strength or congestion patterns. External data sources include customer feedback and complaints from reviews, social media, emails, and surveys. Natural Language Processing (NLP) techniques like sentiment analysis can be used to detect customer sentiment or identify emerging issues.
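By way of illustration only, the two statistical approaches named above may be sketched as follows. The function names, the thresholds, the smoothing factor, and the sample values are assumptions introduced for this sketch and do not appear in the specification.

```python
from statistics import mean, pstdev


def zscore_anomalies(values, threshold=3.0):
    """Return indices of values whose absolute z-score exceeds the threshold.

    Suited to a finite batch of historical values (hypothetical helper).
    """
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation of the batch
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


def ema_anomalies(stream, alpha=0.3, tolerance=0.5):
    """Return indices of points that deviate from the running EMA.

    A point is flagged when its relative deviation from the EMA of the
    preceding points exceeds `tolerance` (an assumed, illustrative rule).
    """
    anomalies = []
    ema = None
    for i, value in enumerate(stream):
        if ema is not None and ema != 0 and abs(value - ema) / abs(ema) > tolerance:
            anomalies.append(i)
        # Update the running average after the comparison.
        ema = value if ema is None else alpha * value + (1 - alpha) * ema
    return anomalies


if __name__ == "__main__":
    latency_ms = [20, 21, 19, 20, 22, 21, 20, 19, 21, 20, 95]
    print(zscore_anomalies(latency_ms, threshold=2.5))
    congestion_pct = [50, 52, 51, 53, 120, 52]
    print(ema_anomalies(congestion_pct))
```

The Z-Score variant needs the whole batch before it can score any point, whereas the EMA variant scores each point as it arrives, which is why the former fits historical data and the latter fits streams.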
[0004] The contemporary integration of AI/ML in networks employs various machine learning methodologies to perform necessary tasks for anomaly detection or prediction from the obtained data. When selecting a specific data set for anomaly detection, a specific model or predictive methodology has to be implemented. In traditional systems, network operators often face the challenge of manually selecting the most suitable machine learning methodology for anomaly detection based on the characteristics of the data source. This process can be time-consuming, error-prone, and requires specialized expertise in machine learning and data analysis.
[0005] Hence, there is a need for an efficient system and method that automatically selects a suitable analysis and prediction methodology to effectively detect anomalies.
SUMMARY OF THE INVENTION
[0006] One or more embodiments of the present disclosure provide a system and a method for selecting anomaly detection technique.
[0007] In one aspect of the present invention, the method for selecting an anomaly detection technique is disclosed. The method includes the step of retrieving, by one or more processors, data from one or more data sources based on establishing one or more connections with the one or more data sources. The method further includes the step of formatting, by the one or more processors, the data to align with an acceptable format. The method further includes the step of storing, by the one or more processors, the formatted data in a storage unit. The method further includes the step of determining, by the one or more processors, one or more attributes of the data stored in the storage unit. The method further includes the step of selecting, by the one or more processors, one or more anomaly detection techniques based on determining the one or more attributes. The method further includes the step of applying, by the one or more processors, the one or more anomaly detection techniques to the stored data to identify one or more anomalies.
[0008] In an embodiment, the one or more processors establish the one or more connections with the one or more data sources using one or more Application Programming Interfaces (APIs).
[0009] In an embodiment, the one or more data sources include at least one of, the data sources within a telecommunication network and the data sources outside the telecommunication network.
[0010] In an embodiment, the one or more data sources within the telecommunication network include at least one of, network performance data, subscriber data and device data, wherein the one or more data sources outside the telecommunication network include at least one of, competitor data, social media data, customer feedback and surveys.
[0011] In an embodiment, the acceptable format is the format which is suitable for anomaly detection.
[0012] In an embodiment, the storage unit is at least one of, a data lake.
[0013] In an embodiment, the one or more attributes include at least one of, timestamps, data frequency, data size and scale, data complexity, nature of text of the data and specific type of data sources.
[0014] In an embodiment, to select the one or more anomaly detection techniques based on determining the one or more attributes, the method comprises the step of analysing, by the one or more processors, the one or more determined attributes. Thereafter, the method comprises the step of selecting, by the one or more processors, the one or more anomaly detection techniques based on the analysis of the one or more determined attributes.
[0015] In an embodiment, the analysing includes determining type of the one or more attributes, wherein the type of the one or more attributes includes at least one of, large data, historic data, complex data and text based data.
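By way of illustration only, the analysing and selecting steps may be sketched as a rule-based mapping from attribute types to techniques. The thresholds, the attribute keys, the additional "real_time" type, and the technique assigned to each type are assumptions made for this sketch; the specification does not fix any of them.

```python
# Hypothetical thresholds; the specification does not define them.
LARGE_RECORD_COUNT = 1_000_000
COMPLEX_FIELD_COUNT = 10

# Hypothetical mapping from attribute type to technique name.
TECHNIQUES_BY_TYPE = {
    "historic": "z_score",
    "real_time": "exponential_moving_average",
    "text": "sentiment_analysis",
    "large": "isolation_forest",
    "complex": "isolation_forest",
}


def classify_attribute_types(attributes):
    """Derive the set of attribute types from a dict of determined attributes."""
    types = set()
    if attributes.get("record_count", 0) >= LARGE_RECORD_COUNT:
        types.add("large")
    if attributes.get("field_count", 0) > COMPLEX_FIELD_COUNT:
        types.add("complex")
    if attributes.get("is_text", False):
        types.add("text")
    # Data is treated as historic unless it is marked as real-time.
    types.add("real_time" if attributes.get("is_real_time", False) else "historic")
    return types


def select_techniques(attributes):
    """Return the sorted, de-duplicated technique names for the given attributes."""
    types = classify_attribute_types(attributes)
    return sorted({TECHNIQUES_BY_TYPE[t] for t in types})


if __name__ == "__main__":
    print(select_techniques({"record_count": 5000, "field_count": 4}))
    print(select_techniques({"is_text": True, "is_real_time": True}))
```

A static table of this kind is the simplest possible selector; it is shown only to make the two-step structure (analyse, then select) concrete.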
[0016] In an embodiment, the one or more processors select the one or more anomaly detection techniques based on historic performance of the one or more anomaly detection techniques with respect to similar data.
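By way of illustration only, selection based on historic performance may be sketched as choosing the candidate technique with the best average past score on similar data. The shape of the history record, the scoring scale, and all names below are assumptions made for this sketch.

```python
from statistics import mean


def select_by_historic_performance(history, data_type, candidates):
    """Pick the candidate with the highest mean past score for `data_type`.

    `history` is a hypothetical dict mapping (technique, data_type) to a
    list of past scores, where a higher score means better performance.
    Returns None when no candidate has any recorded score for that type.
    """
    best_technique = None
    best_score = float("-inf")
    for technique in candidates:
        scores = history.get((technique, data_type))
        if not scores:
            continue  # no track record on similar data
        average = mean(scores)
        if average > best_score:
            best_technique, best_score = technique, average
    return best_technique


if __name__ == "__main__":
    history = {
        ("z_score", "historic"): [0.8, 0.9],
        ("ema", "historic"): [0.6],
        ("ema", "real_time"): [0.95],
    }
    print(select_by_historic_performance(history, "historic", ["z_score", "ema"]))
```

Returning None when no history exists leaves the caller free to fall back on another selection rule.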
[0017] In an embodiment, the method further comprises the step of generating, by the one or more processors, notifications pertaining to the one or more identified anomalies, wherein the notifications are at least one of transmitted to a user and integrated into a Network Management System (NMS).
[0018] In another embodiment, the method further comprises the step of enabling, by the one or more processors, a learning module to learn the historic performance of the one or more anomaly detection techniques with respect to the similar data, wherein a result of the learned historic performance is stored in the storage unit.
[0019] In another aspect of the present invention, the system for selecting anomaly detection technique is disclosed. The system includes a retrieving unit configured to retrieve data from one or more data sources based on establishing one or more connections with the one or more data sources. The system includes a formatting unit configured to format the data to align with an acceptable format. The formatting unit is further configured to store the formatted data in a storage unit. The system further includes a determining unit configured to determine one or more attributes of the data stored in the storage unit. The system further includes a selecting unit configured to select one or more anomaly detection techniques based on determining the one or more attributes. The system further includes an applying unit configured to apply the one or more anomaly detection techniques to the stored data to identify one or more anomalies.
[0020] In another aspect of the present invention, a non-transitory computer-readable medium having stored thereon computer-readable instructions is disclosed. The instructions, when executed by a processor, cause the processor to retrieve data from one or more data sources based on establishing one or more connections with the one or more data sources. The processor is further configured to format the data to align with an acceptable format. The processor is further configured to store the formatted data in a storage unit. The processor is further configured to determine one or more attributes of the data stored in the storage unit. The processor is further configured to select one or more anomaly detection techniques based on determining the one or more attributes. The processor is further configured to apply the one or more anomaly detection techniques to the stored data to identify one or more anomalies.
[0021] Other features and aspects of this invention will be apparent from the following description and the accompanying drawings. The features and advantages described in this summary and in the following detailed description are not all-inclusive, and particularly, many additional features and advantages will be apparent to one of ordinary skill in the relevant art, in view of the drawings, specification, and claims hereof. Moreover, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes and may not have been selected to delineate or circumscribe the inventive subject matter, resort to the claims being necessary to determine such inventive subject matter.
BRIEF DESCRIPTION OF THE DRAWINGS
[0022] The accompanying drawings, which are incorporated herein, and constitute a part of this disclosure, illustrate exemplary embodiments of the disclosed methods and systems in which like reference numerals refer to the same parts throughout the different drawings. Components in the drawings are not necessarily to scale, emphasis instead being placed upon clearly illustrating the principles of the present disclosure. Some drawings may indicate the components using block diagrams and may not represent the internal circuitry of each component. It will be appreciated by those skilled in the art that disclosure of such drawings includes disclosure of electrical components, electronic components or circuitry commonly used to implement such components.
[0023] FIG. 1 is an exemplary block diagram of an environment for selecting anomaly detection technique, according to one or more embodiments of the present invention;
[0024] FIG. 2 is a block diagram of the system for selecting anomaly detection technique, according to one or more embodiments of the present invention;
[0025] FIG. 3 is an exemplary architecture of the system of FIG. 2, according to one or more embodiments of the present invention;
[0026] FIG. 4 is an exemplary architecture illustrating the flow of operations performed for selecting anomaly detection technique, according to one or more embodiments of the present disclosure; and
[0027] FIG. 5 is a flow diagram of a method for selecting one or more anomaly detection techniques based on one or more attributes of data from the one or more data sources, according to one or more embodiments of the present invention.
[0028] The foregoing shall be more apparent from the following detailed description of the invention.
DETAILED DESCRIPTION OF THE INVENTION
[0029] Some embodiments of the present disclosure, illustrating all its features, will now be discussed in detail. It must also be noted that as used herein and in the appended claims, the singular forms "a", "an" and "the" include plural references unless the context clearly dictates otherwise.
[0030] Various modifications to the embodiment will be readily apparent to those skilled in the art and the generic principles herein may be applied to other embodiments. However, one of ordinary skill in the art will readily recognize that the present disclosure including the definitions listed here below is not intended to be limited to the embodiments illustrated but is to be accorded the widest scope consistent with the principles and features described herein.
[0031] A person of ordinary skill in the art will readily ascertain that the illustrated steps detailed in the figures and here below are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope and spirit of the disclosed embodiments.
[0032] The present invention discloses a system and a method for selecting anomaly detection technique. More particularly, the system described herein offers a comprehensive approach for performing suitable analysis of data retrieved from one or more data sources, selecting one or more efficient and accurate anomaly detection techniques, and applying the anomaly detection technique to identify at least one or more anomalies. The selection is based on one or more attributes of the data retrieved from one or more data sources. The system uses an Artificial Intelligence/Machine Learning (AI/ML) model to determine and select the most suitable anomaly detection algorithm from a repository unit.
[0033] Referring to FIG. 1, FIG. 1 illustrates an exemplary block diagram of an environment 100 for selecting anomaly detection technique, according to one or more embodiments of the present invention. The environment 100 includes a User Equipment (UE) 102, a server 104, a network 106, and a system 108. A user interacts with the system 108 utilizing the UE 102.
[0034] Each of the at least one UE 102, namely the first UE 102a, the second UE 102b, and the third UE 102c, is configured to connect to the server 104 via the network 106.
[0035] In an embodiment, each of the first UE 102a, the second UE 102b, and the third UE 102c is one of, but not limited to, any electrical, electronic, or electro-mechanical equipment, or a combination of one or more of the above devices, such as smartphones, Virtual Reality (VR) devices, Augmented Reality (AR) devices, laptop, a general-purpose computer, desktop, personal digital assistant, tablet computer, mainframe computer, or any other computing device.
[0036] The network 106 includes, by way of example but not limitation, one or more of a telecommunication network, wireless network, a wired network, an internet, an intranet, a public network, a private network, a packet-switched network, a circuit-switched network, an ad hoc network, an infrastructure network, a Public-Switched Telephone Network (PSTN), a cable network, a cellular network, a satellite network, a fiber optic network, or some combination thereof. The network 106 may include, but is not limited to, a Third Generation (3G), a Fourth Generation (4G), a Fifth Generation (5G), a Sixth Generation (6G), a New Radio (NR), a Narrow Band Internet of Things (NB-IoT), an Open Radio Access Network (O-RAN), and the like.
[0037] The network 106 may also include, by way of example but not limitation, at least a portion of one or more networks having one or more nodes that transmit, receive, forward, generate, buffer, store, route, switch, process, or a combination thereof, etc. one or more messages, packets, signals, waves, voltage or current levels, some combination thereof, or so forth.
[0038] The environment 100 includes the server 104 accessible via the network 106. The server 104 may include, by way of example but not limitation, one or more of a standalone server, a server blade, a server rack, a bank of servers, a server farm, hardware supporting a part of a cloud service or system, a home server, hardware running a virtualized server, a processor executing code to function as a server, one or more machines performing server-side functionality as described herein, at least a portion of any of the above, or some combination thereof. In an embodiment, the server 104 may be associated with an entity that includes, but is not limited to, a vendor, a network operator, a company, an organization, a university, a lab facility, a business enterprise side, a defense facility side, or any other facility that provides service.
[0039] The environment 100 further includes the system 108 communicably coupled to the server 104 and the UE 102 via the network 106. The system 108 is adapted to be embedded within the server 104 or to be implemented as an individual entity.
[0040] Operational and construction features of the system 108 will be explained in detail with respect to the following figures.
[0041] FIG. 2 is a block diagram of the system 108 for selecting anomaly detection technique, according to one or more embodiments of the present invention.
[0042] As per the illustrated and preferred embodiment, the system 108 for selecting anomaly detection technique, includes one or more processors 202, memory 204, and a storage unit 206. The one or more processors 202, hereinafter referred to as the processor 202, may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, single board computers, and/or any devices that manipulate signals based on operational instructions. However, it is to be noted that the system 108 may include multiple processors as per the requirement and without deviating from the scope of the present disclosure. Among other capabilities, the processor 202 is configured to fetch and execute computer-readable instructions stored in the memory 204.
[0043] As per the illustrated embodiment, the processor 202 is configured to fetch and execute computer-readable instructions stored in the memory 204 as the memory is communicably connected to the processor 202. The memory 204 is configured to store one or more computer-readable instructions or routines in a non-transitory computer-readable storage medium, which may be fetched and executed to select anomaly detection technique. The memory 204 may include any non-transitory storage device including, for example, volatile memory such as RAM, or non-volatile memory such as disk memory, EPROMs, FLASH memory, unalterable memory, and the like.
[0044] As per the illustrated embodiment, the storage unit 206 is specifically configured to store data associated with the operation performed in the system 108. The storage unit 206 is one of, but not limited to, the Unified Inventory Management (UIM) unit, a centralized database, a cloud-based database, a commercial database, an open-source database, a distributed database, an end-user database, a graphical database, a No-Structured Query Language (NoSQL) database, an object-oriented database, a personal database, an in-memory database, a document-based database, a time series database, a wide column database, a key value database, a search database, a cache database, and so forth. The foregoing examples of storage unit 206 types are non-limiting and may not be mutually exclusive, e.g., the database can be both commercial and cloud-based, or both relational and open-source, etc.
[0045] In an exemplary embodiment, the storage unit 206 serves as a central hub for all the data associated with the system 108, providing unified and accessible data for analysis. The storage unit 206 is configured to accommodate various data types, including structured data such as databases, semi-structured data such as JSON or XML files, and unstructured data such as text documents, images, and videos. This capability is crucial for telecommunications systems, which generate a wide array of data from different sources.
[0046] As per the illustrated embodiment, the system 108 includes the processor 202 to manage operations to select anomaly detection techniques. The processor 202 includes a retrieving unit 208, a formatting unit 210, a determining unit 212, a selecting unit 214, an applying unit 216, and a generating unit 218. The processor 202 is communicably coupled to the one or more components of the system 108 such as the memory 204 and the storage unit 206. In an embodiment, operations and functionalities of the retrieving unit 208, the formatting unit 210, the determining unit 212, the selecting unit 214, the applying unit 216, the generating unit 218, and the one or more components of the system 108 can be used in combination or interchangeably.
[0047] The storage unit 206 may be, for example, a data lake. It is important to note that the data lake is a centralized repository designed to hold vast volumes of data in its native, raw format. Data lakes may be designed to handle increasing volumes of data without significant changes to the underlying architecture. This scalability is essential for telecommunications networks, which continuously generate large amounts of data. This centralization simplifies data management and ensures that all relevant data is readily available for analysis.
[0048] The retrieving unit 208 of the processor 202 is configured to retrieve the data pertaining to the operation of the system 108 from one or more data sources. In one embodiment, the data is at least one of, but not limited to, internal data and external data. The internal data includes all data available within a telecommunications network, such as, but not limited to, network performance data, subscriber data, cell tower telemetry data, and device data.
[0049] The network performance data provides historical data that includes metrics such as latency, bandwidth usage, packet loss, and error rates to provide the operational status of the telecommunications network. The subscriber data includes data about users of the telecommunications services, such as account details, usage patterns, billing information, and service preferences. The cell tower telemetry data are real-time data streams from cell towers that provide information on signal strength, handover events, and network congestion. The device data refers to data collected from various devices connected to the network, such as mobile phones, routers, and internet of things (IoT) devices. The device data includes information on device performance, connectivity status, and usage statistics, which are essential for managing network resources effectively. These internal data are vital for the telecommunications provider to maintain service quality, optimize network performance, and enhance user experience.
[0050] The external data include all the data available outside the telecommunications network such as competitor data, social media data, customer feedback, and surveys. The competitor data includes information regarding competitors' performance, pricing strategies, and market positioning. This data is crucial for benchmarking and developing competitive strategies. The social media data includes data gathered from social media platforms that can reveal customer sentiment, trends, and public perception of the telecommunications services. The customer feedback and surveys include direct feedback from customers, collected through surveys, reviews, and other channels.
[0051] In one embodiment, the retrieving unit 208 may utilize one or more techniques such as, but not limited to, Database Extraction, ETL (Extract, Transform, Load) Tools, Application Programming Interface (API) Integration, Web Scraping, Real-Time Data Streaming, and Query Languages to retrieve the data from the one or more data sources.
[0052] In the database extraction technique, the retrieving unit 208 connects to a database using a database client or programming language to execute queries and retrieve data. For instance, the retrieving unit 208 pulls data on customer call records, including call duration, time of day, and destination numbers. This data is crucial for analyzing customer behavior and identifying usage patterns. The ETL tools extract data from multiple data sources and handle various data formats, which makes them well suited to consolidating data for analysis. For example, the ETL tool connects to network management systems via APIs to pull real-time performance metrics, extracts billing data from SQL databases, and scrapes social media platforms for customer feedback.
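By way of illustration only, the database extraction technique may be sketched with an in-memory SQLite database. The table name, the column names, and the sample rows are assumptions made for this sketch; the specification does not name a database engine or a schema.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database holding a hypothetical call_records table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE call_records ("
        "destination TEXT, start_time TEXT, duration_s INTEGER)"
    )
    conn.executemany(
        "INSERT INTO call_records VALUES (?, ?, ?)",
        [
            ("+911234567890", "2024-09-24 05:51:00", 125),
            ("+919876543210", "2024-09-24 06:10:30", 12),
            ("+911234567890", "2024-09-24 07:02:15", 640),
        ],
    )
    conn.commit()
    return conn


def fetch_call_records(conn, min_duration_s=0):
    """Return (destination, start_time, duration_s) rows ordered by start time."""
    cursor = conn.execute(
        "SELECT destination, start_time, duration_s FROM call_records "
        "WHERE duration_s >= ? ORDER BY start_time",
        (min_duration_s,),  # parameterized to avoid SQL injection
    )
    return cursor.fetchall()


if __name__ == "__main__":
    connection = make_demo_db()
    for row in fetch_call_records(connection, min_duration_s=60):
        print(row)
    connection.close()
```

The same query-and-fetch pattern applies to other database clients; only the connection step differs.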
[0053] The API integration allows the retrieving unit 208 to access data from data services by making HTTP requests to API endpoints, enabling real-time data retrieval. For instance, the retrieving unit 208 pulls logs from cloud services or third-party monitoring tools via their APIs. Web scraping involves writing scripts to extract data from web pages, for example, the pricing of data plans, data limits, contract terms, and promotional offers. The real-time data streaming techniques facilitate continuous data retrieval from sources that provide live feeds, such as IoT devices or social media platforms. Technologies like Apache Kafka or AWS Kinesis may be employed to manage these streams, allowing applications to process and analyze data as it arrives.
[0054] The formatting unit 210 is configured to format the data to align with an acceptable format and is further configured to store the formatted data in the storage unit 206. The acceptable format refers to a data structure that allows machine learning models to accurately identify patterns and anomalies, such as service outages or unusual traffic spikes. The formatting unit 210 transforms raw internal data such as network performance data and raw external data such as customer feedback and social media received from the retrieving unit 208 into a consistent and standardized format that aligns with predefined criteria suitable for detecting anomalies.
[0055] For example, the formatting unit 210 pulls the data from the retrieving unit 208 and performs several key processes: cleaning the data by removing duplicates and correcting errors, structuring the data into a predefined schema (such as organizing network logs into tables with fields like timestamp, event type, and severity), and storing the formatted data in the designated storage unit 206 for easy retrieval. This structured approach ensures that the data is not only consistent and standardized but also ready for effective use in machine learning model training and analysis, enabling the telecommunications company to proactively identify and address potential network issues, thereby enhancing service reliability and customer satisfaction.
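By way of illustration only, the cleaning and structuring processes may be sketched as follows for network log records. The input keys, the default values, the rule of dropping records with unparseable timestamps, and the assumption that timestamps without an offset are in UTC are all introduced for this sketch.

```python
from datetime import datetime, timezone

# Field names taken from the example schema above.
SCHEMA = ("timestamp", "event_type", "severity")


def format_records(raw_records):
    """Clean raw log dicts and return de-duplicated rows matching SCHEMA."""
    seen = set()
    formatted = []
    for raw in raw_records:
        try:
            parsed = datetime.fromisoformat(raw["timestamp"].strip())
        except (KeyError, ValueError, AttributeError):
            continue  # assumed policy: drop records without a usable timestamp
        if parsed.tzinfo is None:
            parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
        row = (
            parsed.isoformat(),
            str(raw.get("event_type", "unknown")).strip().lower(),
            str(raw.get("severity", "info")).strip().lower(),
        )
        if row in seen:
            continue  # duplicate after normalization
        seen.add(row)
        formatted.append(dict(zip(SCHEMA, row)))
    return formatted


if __name__ == "__main__":
    raw = [
        {"timestamp": "2024-09-24 05:51:00", "event_type": " Handover ", "severity": "WARN"},
        {"timestamp": "2024-09-24T05:51:00", "event_type": "handover", "severity": "warn"},
        {"timestamp": "not-a-time", "event_type": "handover", "severity": "warn"},
    ]
    for record in format_records(raw):
        print(record)
```

Normalizing case, whitespace, and timestamp notation before comparing rows is what allows the two differently written handover records to be recognized as duplicates.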
[0056] The determining unit 212 is configured to determine one or more attributes of the data stored in the storage unit 206. In an embodiment, the one or more attributes may include at least one of timestamps, data frequency, data size and scale, data complexity, nature of the text within the data, and specific type of data sources.
[0057] The timestamps are markers indicating the exact time when a specific event occurred or when data was recorded, essential for tracking changes over time. For example, in a network management system, each call detail record (CDR) may include a timestamp showing when the call started and ended, such as "2024-09-24 05:51:00 UTC". The system 108 analyzes the timestamps associated with the data to determine whether the data is historical or real-time. Historical data typically features past timestamps, while real-time data is characterized by current or continuously updating timestamps.
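As a minimal, non-limiting sketch of the timestamp analysis, the data may be labelled by comparing its newest timestamp against the current time. The function name classify_by_timestamp and the five-minute freshness window are hypothetical choices for illustration; the present disclosure does not fix a particular threshold.

```python
from datetime import datetime, timedelta, timezone

def classify_by_timestamp(timestamps, now, freshness=timedelta(minutes=5)):
    """Label data 'real-time' if its newest timestamp is within
    `freshness` of `now`, otherwise 'historical'."""
    newest = max(timestamps)
    return "real-time" if now - newest <= freshness else "historical"

# Reference time passed in explicitly so the example is reproducible.
now = datetime(2024, 9, 24, 5, 55, tzinfo=timezone.utc)
cdr_times = [datetime(2024, 9, 24, 5, 51, tzinfo=timezone.utc)]
archive_times = [datetime(2023, 1, 1, tzinfo=timezone.utc)]

cdr_label = classify_by_timestamp(cdr_times, now)
archive_label = classify_by_timestamp(archive_times, now)
```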
[0058] Data frequency refers to how often data points are collected or transmitted over a specific time period, impacting the responsiveness and performance of telecommunication services. The frequency at which new data is generated or logged also provides insights; historical data is often generated at regular intervals or event-driven, whereas real-time data updates continuously or at high-frequency intervals. For instance, a cellular network that measures signal strength every second has a high data frequency, while a network that aggregates data and sends reports every hour has a low frequency.
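For illustration only, data frequency may be estimated from the typical gap between consecutive data points. The helper names and the 60-second boundary between "high" and "low" frequency are assumptions of this sketch rather than values stated in the present disclosure.

```python
from statistics import median

def sampling_interval_seconds(epoch_seconds):
    """Median gap, in seconds, between consecutive data points."""
    ordered = sorted(epoch_seconds)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    if not gaps:
        raise ValueError("at least two data points are required")
    return median(gaps)

def frequency_label(epoch_seconds, high_freq_threshold=60.0):
    """Assumed rule: a median gap of at most 60 s counts as high frequency."""
    interval = sampling_interval_seconds(epoch_seconds)
    return "high" if interval <= high_freq_threshold else "low"

signal_strength_times = [0, 1, 2, 3, 4]        # one reading per second
hourly_report_times = [0, 3600, 7200, 10800]   # one report per hour

per_second = frequency_label(signal_strength_times)
per_hour = frequency_label(hourly_report_times)
```

The median is used instead of the mean so that an occasional missing sample does not distort the estimate.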
[0059] Data size refers to the volume of data stored or transmitted, often measured in bytes, kilobytes, or megabytes, while scale indicates the extent of the dataset, influencing processing capabilities and storage requirements. A telecommunication provider may store billions of records from customer call logs, resulting in terabytes of data, whereas a small local provider might only have a few gigabytes from fewer customers.
[0060] Data complexity refers to the intricacy of the data structure, including the number of variables and their interrelationships. Complex data often requires sophisticated processing techniques for analysis. For example, analyzing network traffic data, which includes various metrics such as bandwidth usage, latency, and error rates across different devices and locations, represents a complex dataset that requires advanced analytics.
[0061] Data size and scale may necessitate specific algorithms optimized for big data processing, while data complexity requires assessing whether the data is structured or unstructured and what types of data it contains.
[0062] The nature of the text within the data pertains to the characteristics of textual data, including its format, content type, and context. Understanding these characteristics is crucial for effective data processing and analysis. For example, customer service transcripts from interactions with support agents may contain varied language, technical jargon, and informal expressions, all of which can significantly impact sentiment analysis and the interpretation of customer feedback. Additionally, performing sentiment analysis on social media data, such as text in posts and comments, can effectively enhance the understanding of customer sentiments.
[0063] The specific type of data sources refers to the origin of the data, which can vary widely in telecommunications, affecting how data is collected, stored, and analyzed. Data sources may include billing systems, network management tools, customer relationship management (CRM) software, and social media platforms, each providing different types of information, such as user profiles, call logs, or customer feedback.
[0064] The selecting unit 214 is configured to select one or more anomaly detection techniques based on the determined attributes. In an embodiment, the selecting unit 214 is configured to analyze the one or more determined attributes, which includes determining the type of the one or more attributes and selecting the one or more anomaly detection techniques. The type of the one or more attributes may include at least one of, but is not limited to, large data, historic data, complex data, and text-based data. For instance, the selecting unit 214 examines timestamps associated with data to distinguish between historical and real-time data based on whether the timestamps represent past dates or are continuously updating. Further, the selecting unit 214 evaluates data size and scale, and assesses data complexity to determine if the data is structured or unstructured and whether it contains text, images, numerical values, or a mix of data types, as different algorithms are suited for various complexities. Furthermore, the selecting unit 214 performs sentiment analysis on social media data, such as text in posts and comments, which aids in understanding customer sentiments.
[0065] Based on the analysis of the one or more attributes, the selecting unit 214 utilizes predefined criteria and machine learning models to select the most suitable anomaly detection algorithm from a repository unit. For example, the selecting unit 214 initially relies on the analysis of the attributes of the data sources, which may include: the data type, referring to the nature of the data, such as numerical, categorical, or time-series; the data volume, representing the amount of data being processed, which can influence the choice of algorithm; the data complexity, indicating the intricacy of the data structure and the relationships among variables; and the data frequency, meaning how often data points are collected, which can affect real-time detection capabilities.
[0066] Subsequently, the selecting unit 214 employs predefined criteria to evaluate which anomaly detection techniques are most appropriate for the specific data attributes. These criteria may encompass performance metrics such as accuracy, precision, and recall of the algorithms based on historical data; computational efficiency, which considers the resource requirements of the algorithms and is crucial for real-time applications; and scalability, the ability of the algorithm to handle increasing amounts of data without significant performance degradation.
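One possible, non-limiting way to apply such predefined criteria is a weighted score over accuracy, computational efficiency, and scalability. The Candidate class, the weight values, and the per-technique figures below are hypothetical placeholders chosen for illustration; they are not measured results.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    name: str
    accuracy: float      # 0..1, from evaluation on historical data
    efficiency: float    # 0..1, higher means fewer resources needed
    scalability: float   # 0..1, higher means better behaviour at volume

def score(candidate, weights):
    """Weighted sum of the three criteria named in the text."""
    return (weights["accuracy"] * candidate.accuracy
            + weights["efficiency"] * candidate.efficiency
            + weights["scalability"] * candidate.scalability)

def rank(candidates, weights):
    """Candidates ordered from most to least suitable."""
    return sorted(candidates, key=lambda c: score(c, weights), reverse=True)

# Illustrative numbers only.
candidates = [
    Candidate("z_score", accuracy=0.70, efficiency=0.95, scalability=0.90),
    Candidate("dbscan", accuracy=0.85, efficiency=0.50, scalability=0.40),
]
realtime_weights = {"accuracy": 0.3, "efficiency": 0.5, "scalability": 0.2}
batch_weights = {"accuracy": 0.8, "efficiency": 0.1, "scalability": 0.1}

best_realtime = rank(candidates, realtime_weights)[0].name
best_batch = rank(candidates, batch_weights)[0].name
```

With these placeholder figures, weighting efficiency heavily (as a real-time application might) favours the lighter statistical method, while weighting accuracy heavily favours the clustering method.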
[0067] The selecting unit 214 then leverages machine learning models trained on historical data to enhance the selection process. These models can predict which algorithms are likely to perform best based on the characteristics of the current dataset. For instance, if the data exhibits a high degree of variability, the model might suggest using a robust statistical method or a machine learning approach that is effective in high-variance scenarios. The selected anomaly detection algorithms are drawn from the repository unit, a collection of various algorithms that have been pre-evaluated and stored for use. This repository unit may include statistical methods, such as Z-score or Tukey’s method for detecting outliers; machine learning algorithms, including supervised methods like decision trees or unsupervised methods like clustering algorithms (e.g., DBSCAN); and hybrid approaches that combine multiple techniques to improve detection accuracy.
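The two statistical methods named above may be sketched as follows, purely by way of example. The thresholds (three standard deviations for the Z-score and a multiplier of 1.5 on the interquartile range for Tukey's method) are conventional defaults, and the latency sample is invented for illustration.

```python
from statistics import mean, pstdev, quantiles

def z_score_outliers(values, threshold=3.0):
    """Values lying more than `threshold` population standard
    deviations from the mean."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing can be an outlier
    return [v for v in values if abs(v - mu) / sigma > threshold]

def tukey_outliers(values, k=1.5):
    """Values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's fences)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Invented latency sample (milliseconds) with one obvious spike.
latency_ms = [20, 21, 19, 22, 20, 21, 20, 19, 22, 21, 20, 250]

z_flagged = z_score_outliers(latency_ms)
tukey_flagged = tukey_outliers(latency_ms)
```

Because a single extreme value inflates the standard deviation, the Z-score can miss outliers in small samples; Tukey's method is based on quartiles and is less sensitive to that effect.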
[0068] The selecting unit 214 performs tailored selection. By leveraging predefined criteria and machine learning models, the selecting unit 214 ensures that the selected anomaly detection techniques are specifically aligned with the characteristics of the data, thereby improving the likelihood of successful anomaly detection. Furthermore, the selecting unit 214 enhances efficiency in algorithm selection; its ability to quickly identify and select suitable algorithms from a well-organized repository unit streamlines the anomaly detection process, allowing for faster implementation and response times in monitoring systems. Additionally, the selecting unit 214 promotes user empowerment by displaying the selected techniques on a user interface. This feature enables users to actively engage with the anomaly detection process, enhancing the overall user experience and allowing for customization based on individual preferences and expertise. Here, users refer to telecom operators.
[0069] The applying unit 216 is configured to apply one or more anomaly detection techniques on stored data to identify patterns or trends within the data, as well as to detect one or more anomalies. Here, patterns/trends refer to consistent trends or behaviors observed in the data. For example, a telecom company may notice a steady increase in data usage among its subscribers during specific hours of the day, such as evenings or weekends. The anomalies are specific instances that stand out as unusual or unexpected. For example, if a particular cell tower experiences an unexpected spike in call drop rates during a time when usage is typically stable, this could indicate a potential issue with the tower's hardware or software. Similarly, if a subscriber suddenly shows an unusual increase in data consumption that is significantly higher than their historical usage patterns, it may suggest fraudulent activity or a malfunctioning device. The ability to detect these anomalies is essential for timely intervention and decision-making, particularly in dynamic environments like telecommunications or financial transactions. Herein the one or more anomaly detection techniques may include various algorithms designed to analyze data points and identify those that deviate from established norms. Anomaly detection is fundamentally about recognizing instances that do not conform to expected patterns, which can be critical. This process is crucial for understanding the underlying behaviour of the data and for detecting any irregularities that may indicate issues such as fraud, system failures, or operational inefficiencies.
[0070] The applying unit 216 may involve the steps which include processing of the data, identifying any patterns or anomalies. By analyzing the temporal or sequential aspects of the data, the applying unit 216 uncovers insights such as seasonal trends, cyclical behaviors, or sudden shifts in data patterns. This capability is particularly valuable in fields like finance or network security, where understanding normal behavior is critical for timely anomaly detection.
[0071] The generating unit 218 is configured to generate notifications that are transmitted to at least one of, the users or integrated to a Network Management System (NMS), pertaining to one or more identified anomalies. In an embodiment, the user may include data consumers specifically, telecom operators and the NMS serves as a framework for managing both the hardware and software components of a company's data network, ensuring that all elements work together efficiently and securely. The NMS is essential for the effective administration and operation of data networks, enabling organizations to maintain reliable and secure communication infrastructures.
[0072] To effectively generate these notifications, the generating unit 218 may employ several methods such as utilizing notification policies, which determine how alerts are routed to specific contact points. These policies may be structured hierarchically, allowing for the creation of child policies that can handle different types of alerts based on severity or context. This ensures that the user receives notifications tailored to the nature of the anomaly detected. Additionally, the generating unit 218 may implement alert rules that define the conditions under which notifications are triggered. For instance, when significant patterns or issues are detected, the alert rules may activate the notification process, ensuring that network operators are promptly informed. Moreover, integrating with existing systems may enhance notification delivery. For example, notifications can be sent through various channels such as email, Slack, or PagerDuty, allowing for immediate attention from the user. This multi-channel approach ensures that notifications reach the user in a manner that is most convenient and effective for them.
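A hierarchical notification policy of the kind described above may, for example, be modelled as a tree in which each child policy matches on alert labels and the deepest matching policy supplies the contact point. The Policy class, the label names, and the contact strings below are hypothetical and serve only to illustrate the routing idea.

```python
from dataclasses import dataclass, field

@dataclass
class Policy:
    contact: str                                   # where matching alerts are sent
    matches: dict = field(default_factory=dict)    # label -> required value
    children: list = field(default_factory=list)   # more specific child policies

    def route(self, alert):
        """Return the contact point of the deepest policy matching the alert."""
        for child in self.children:
            if all(alert.get(k) == v for k, v in child.matches.items()):
                return child.route(alert)
        return self.contact  # no child matched, so this policy handles it

# Hypothetical policy tree: default -> critical -> critical cell-tower alerts.
root = Policy(contact="email:noc@example.com", children=[
    Policy(contact="pagerduty:on-call", matches={"severity": "critical"}, children=[
        Policy(contact="slack:#radio-access", matches={"domain": "cell_tower"}),
    ]),
])

tower_alert = root.route({"severity": "critical", "domain": "cell_tower"})
critical_alert = root.route({"severity": "critical", "domain": "billing"})
default_alert = root.route({"severity": "warning"})
```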
[0073] FIG. 3 illustrates an exemplary architecture for the system 108, according to one or more embodiments of the present invention. More specifically, FIG. 3 illustrates the system 108 for selecting anomaly detection technique. It is to be noted that the embodiment with respect to FIG. 3 will be explained with respect to the one or more data sources 302 for the purpose of description and illustration and should nowhere be construed as limited to the scope of the present disclosure.
[0074] FIG. 3 shows that the processor 202 establishes the one or more connections with the one or more data sources 302. The processor 202 is a critical component in the system 108 designed to interact with various data sources, referred to as one or more data sources 302, analyze data, select an anomaly detection technique based on the historic performance of the one or more anomaly detection techniques with respect to similar data, apply the selected anomaly detection technique on the data to identify at least one of patterns/trends within the data or one or more anomalies, generate notifications pertaining to the one or more identified anomalies, and transmit the notifications to the user. Further, the processor 202 enables a learning module to learn the historic performance of the one or more anomaly detection techniques with respect to the similar data. The results of this learnt historic performance are stored in the storage unit 206.
[0075] In an embodiment, the connection of the processor 202 with the one or more data sources 302 is facilitated through the use of Application Programming Interfaces (APIs). The APIs used can vary in type, including RESTful APIs, SOAP APIs, or other custom APIs designed for specific data sources. Each type has its own set of rules and protocols for communication, which the processor 202 must adhere to when establishing connections. By using APIs, the processor 202 is configured to connect with a wide range of data sources, regardless of their underlying technology or architecture. This interoperability is essential for modern systems that rely on diverse data inputs. The ability to establish multiple connections through the APIs allows the system to scale efficiently. As the demand for data increases, the processor 202 can connect to additional data sources without significant changes to the underlying architecture. The APIs enable real-time access to data, allowing the processor 202 to retrieve the most current information from the data sources 302. This is particularly important for applications that require up-to-date data for decision-making or analysis. The APIs often include authentication and authorization mechanisms, ensuring that only authorized users or systems can access the data sources 302. This adds a layer of security to the data exchange process.
[0076] The one or more data sources 302 includes, by way of example but not limited to, internal data sources 302a and external data sources 302b. The internal data sources 302a encompass all data sources available within a telecommunications network that provide internal data. In contrast, the external data sources 302b consist of all data sources available outside the telecommunications network that provide external data.
[0077] As mentioned earlier in FIG.2, the system 108 includes the processors 202, the memory 204, and the storage unit 206, for selecting anomaly detection technique, which are already explained in FIG. 2. For the sake of brevity, a similar description related to the working and operation of the system 108 as illustrated in FIG. 2 has been omitted to avoid repetition.
[0078] Further, as mentioned earlier, the processor 202 includes the retrieving unit 208, the formatting unit 210, the determining unit 212, the selecting unit 214, the applying unit 216, and the generating unit 218, which are already explained in FIG. 2. Hence, for the sake of brevity, a similar description related to the working and operation of the system 108 as illustrated in FIG. 2 has been omitted to avoid repetition. The limited description provided for the system 108 in FIG. 3 should be read with the description provided for the system 108 in FIG. 2 above, and should not be construed as limiting the scope of the present disclosure.
[0079] FIG. 4 is an exemplary architecture illustrating the flow of operations performed for selecting anomaly detection technique, according to one or more embodiments of the present disclosure.
[0080] In one embodiment, the architecture 400 includes data source integration 402, data ingestion layer 404, data lake 406, data source analysis unit 408, algorithm repository 410, algorithm selection 412, algorithm application unit 414, alerting and response unit 416, and user interface 418.
[0081] Initially, the data source integration 402 establishes connections with various data sources. These diverse data sources are essential for comprehensive analysis and decision-making, as they provide critical insights into network efficiency, user behavior, and operational performance. In an embodiment, the data source integration 402 establishes connections with internal data sources that include data such as network performance data, subscriber data, device data, and more. In another embodiment, the data source integration 402 establishes connections with external data sources located outside the network and includes data such as competitor data, social media data, customer feedback and surveys, etc.
[0082] Once the connections are established, the data ingestion layer 404 collects data from the data sources, converts/structures raw data from various sources into a format that is suitable for machine learning model training and analysis. This process ensures that data is consistent, standardized, and ready to be used effectively. In an embodiment, the data ingestion layer 404 may include at least, but is not limited to, connectors, data ingestion pipelines, and real-time data stream handling mechanisms to effectively collect data from various sources.
[0083] The connectors are essential components that facilitate communication with the data sources. The connectors may be designed for specific types of databases (e.g., SQL, NoSQL) or data services (e.g., web-socket based connections, REST APIs, etc.). The connectors are configured to handle authentication, data querying, and data retrieval processes, ensuring that the system 108 can access the necessary data efficiently. For instance, a SQL connector allows the system 108 to execute queries directly against a relational database, while a REST API connector enables the retrieval of data from web services.
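As a non-limiting sketch of a SQL connector, the following uses an in-memory SQLite database from the Python standard library. The SqlConnector class, the metrics table, and its columns are hypothetical; a production connector would additionally handle authentication and connection pooling for the actual database in use.

```python
import sqlite3

class SqlConnector:
    """Thin wrapper that executes queries against a relational database."""

    def __init__(self, database=":memory:"):
        # ":memory:" keeps the example self-contained; a real deployment
        # would point at the operator's database instead.
        self._conn = sqlite3.connect(database)

    def execute(self, sql, params=()):
        """Run a statement that modifies the database and commit it."""
        self._conn.execute(sql, params)
        self._conn.commit()

    def fetch(self, sql, params=()):
        """Run a query and return all rows as a list of tuples."""
        return self._conn.execute(sql, params).fetchall()

    def close(self):
        self._conn.close()

connector = SqlConnector()
connector.execute("CREATE TABLE metrics (cell_id TEXT, latency_ms REAL)")
connector.execute("INSERT INTO metrics VALUES (?, ?)", ("cell-1", 20.5))
connector.execute("INSERT INTO metrics VALUES (?, ?)", ("cell-2", 180.0))

# Parameterised query: cells whose latency exceeds 100 ms.
slow_cells = connector.fetch(
    "SELECT cell_id FROM metrics WHERE latency_ms > ?", (100,)
)
```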
[0084] The data ingestion pipelines are structured processes that automate the collection of data from various data sources. This method is particularly useful for integrating data from disparate sources into a unified format.
[0085] For applications that require immediate access to data, mechanisms for handling real-time data streams are crucial. This involves setting up connections to data sources that provide continuous data feeds, such as IoT devices, social media platforms, or financial market data.
[0086] The formatted data is then aggregated and stored in a data lake 406. For example, a retail company wants to analyze customer behavior to improve sales strategies, where the company first collects data from various sources such as transaction data from point-of-sale systems, customer interactions from the website and mobile app, social media engagement metrics, and the like. Further, the transaction data is formatted to standardize currency and date formats and to remove duplicates and irrelevant entries, and social media data is processed to extract relevant metrics like likes, shares, and comments. The next step involves the consolidation of formatted data from all sources. For instance, daily transaction summaries are created, customer profiles are enriched with interaction history, and inventory data is combined with sales data to identify trends. Thereafter, the aggregated data is stored in a data lake in a structured format, such as Parquet files or as JSON objects. The data scientists can run complex queries to find correlations between social media engagement and sales performance or to segment customers based on purchasing behavior. The data lake 406 serves as the central hub for all incoming data, providing a unified and accessible source for analysis. In an exemplary embodiment, the data ingestion layer 404 may perform normalization to adjust the data values to a common scale without distorting differences in the ranges of values, encoding to convert categorical data into numerical formats or any of the AI/ML acceptable formats that can be easily processed by machine learning models, and structuring to organize the data into a predefined schema or structure, such as tables or arrays, which is essential for efficient querying and analysis. Said AI/ML data acceptable formats may include CSV, JSON, Parquet, Avro, TFRecord, HDF5, Text Formats and the like. The data source analysis unit 408 performs an analysis on the stored data. This analysis includes assessing data format, structure, volume, velocity, and other relevant attributes to determine one or more attributes of the stored data. In an exemplary embodiment, the data source analysis unit 408 obtains the network performance data, performs analysis, and determines that the data is structured, historical, and contains information on various network parameters, and also assesses the volume of data generated each day.
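The normalization and encoding operations mentioned above may be illustrated, without limitation, by min-max scaling and one-hot encoding. These two specific techniques and the sample values are assumptions chosen for the sketch; the present disclosure does not mandate a particular normalization or encoding scheme.

```python
def min_max_normalize(values):
    """Rescale values linearly onto the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]  # constant column: map everything to 0
    return [(v - low) / (high - low) for v in values]

def one_hot_encode(labels):
    """Convert categorical labels to 0/1 vectors.

    Returns the sorted category list (the column order) and the vectors."""
    categories = sorted(set(labels))
    vectors = [[1 if label == c else 0 for c in categories] for label in labels]
    return categories, vectors

latency_ms = [10, 20, 30]
severity = ["low", "high", "low"]

scaled = min_max_normalize(latency_ms)
categories, encoded = one_hot_encode(severity)
```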
[0087] Based on the analysis of the data source's characteristics, the algorithm selection 412 uses predefined criteria and machine learning models to recommend the most suitable anomaly detection algorithm from an algorithm repository 410. The selection is based on the algorithm's historical performance with similar data sources. The algorithm repository 410 maintains a comprehensive repository of various anomaly detection algorithms, each categorized and optimized for specific data type. This categorization allows the algorithm selection 412 to efficiently match the data attributes with the most relevant algorithms, ensuring optimal performance in anomaly detection.
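The matching of data attributes against a categorized repository may be sketched, by way of example only, as a lookup of each algorithm's past scores on similar data. The repository contents, the category labels, and the use of a mean past score as the ranking key are hypothetical; the figures are placeholders and not measured results.

```python
from statistics import mean

# Hypothetical repository: algorithm -> data category -> past scores
# (for example, F1 scores) obtained on similar data sources.
REPOSITORY = {
    "z_score": {"structured_historical": [0.72, 0.75], "text": [0.30]},
    "holt_winters": {"structured_historical": [0.81, 0.84]},
    "dbscan": {"structured_historical": [0.78], "text": [0.55, 0.60]},
}

def recommend(category, repository=REPOSITORY, top_n=1):
    """Names of the `top_n` algorithms with the best mean past score
    for the given data category; empty if none has a track record."""
    scored = [
        (mean(history[category]), name)
        for name, history in repository.items()
        if category in history
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_n]]

for_network_data = recommend("structured_historical")
for_text_data = recommend("text")
top_two = recommend("structured_historical", top_n=2)
```

Returning more than one name allows the recommendations to be presented for a final choice by the user rather than applied automatically.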
[0088] Once the anomaly detection algorithm is selected, the algorithm application unit 414 then applies the selected one or more anomaly detection techniques to the stored data to identify at least one of the patterns/trends of data or one or more anomalies.
[0089] The alerting and response unit 416 then generates notifications when significant patterns or issues are identified. These notifications, which pertain to the one or more identified anomalies, are transmitted to the user, integrated into the NMS, or both, for immediate attention.
[0090] In one embodiment, the algorithm selection 412 recommends the selected anomaly detection techniques to the user through a user interface 418. The user interface 418 may be GUI-based, CLI-based, or internal API-based, operating without active intervention from a human user. The user interface 418 enables the user to check the selected one or more recommended anomaly detection techniques and make informed decisions by selecting one of the recommended anomaly detection techniques. In one embodiment, the user is not a human but rather a service, microservice, software component, or application that sends requests to the system 108 and automatically consumes the recommendations provided by it, using predefined logic.
[0091] It is important to note that the user interface 418 provides an interactive feature to enable interaction between the user and the user interface 418. It encompasses all the elements that allow users to engage with the system 108, including buttons, menus, icons, and other visual components. The primary goal of the user interface 418 is to facilitate effective and efficient operation, ensuring that users can navigate and utilize the system 108 with ease. This user-centric approach enhances the flexibility and usability of the anomaly detection system.
[0092] In an embodiment, the data source analysis unit 408, algorithm repository 410, algorithm selection 412, algorithm application unit 414, and alerting and response unit 416 may collectively function as an artificial intelligence (AI) tool designed to perform multiple operations, including data analysis, anomaly detection, and the generation of notifications. The AI tool may take network data and operation data as input to perform analysis using machine learning models.
[0093] FIG. 5 is a flow diagram of a method 500 for selecting anomaly detection technique, according to one or more embodiments of the present invention. For the purpose of description, the method 500 is described with the embodiments as illustrated in FIG. 3 and should nowhere be construed as limiting the scope of the present disclosure.
[0094] At step 502, the method 500 includes the step of retrieving the data from one or more data sources based on establishing one or more connections with the one or more data sources. In an embodiment, the retrieving unit 208 retrieves the network performance data, subscriber data, cell tower telemetry data, and device data from the internal data sources 302a. For example, the retrieving unit 208 establishes connections with a network performance database to pull metrics like latency, which is the time it takes for data to travel from one point to another; packet loss, which is the percentage of packets that do not reach their destination; and error rate, which is the frequency of errors in data transmission. In another example, the retrieving unit 208 establishes connections and collects data in real-time from the cell towers, including vital information about the operational status of the network. The vital information includes signal strength, which is the quality of the signal received by devices; handover events, which occur when a mobile device switches from one cell tower to another; and network congestion, which indicates the level of traffic on the network.
[0095] At step 504, the method 500 includes the step of formatting the data to align with an acceptable format. In particular, this step involves transforming the raw data into a structured and standardized format that meets specific criteria or guidelines. For instance, formatting may include adjusting data types such as converting strings to dates, ensuring consistent units of measurement, or organizing data into predefined categories. By aligning the data with an acceptable format, the method enhances data integrity and usability, making it easier for subsequent processes.
[0096] At step 506, the method 500 includes the step of storing the formatted data in the storage unit 206, such as a central repository. This step is crucial for ensuring that the processed and organized data is securely saved and readily accessible for future use. By utilizing the central repository, the method 500 allows for efficient data management, enabling the users or systems to retrieve the information as needed for analysis, reporting, or decision-making. The central repository acts as a centralized hub where all relevant data can be consolidated, facilitating better data governance and ensuring that stakeholders have access to consistent and up-to-date information. This structured approach to data storage not only enhances data integrity but also supports scalability, as additional data can be easily integrated into the repository over time.
[0097] At step 508, the method 500 includes the step of determining one or more attributes of the data stored in the storage unit 206. For instance, the one or more attributes include at least one of, timestamps, data frequency, data size and scale, data complexity and nature of text of the data.
[0098] At step 510, the method 500 includes the step of selecting one or more anomaly detection techniques based on determining the one or more attributes. The selection of the anomaly detection techniques includes method steps such as analyzing the one or more determined attributes received from the determining unit 212 and selecting the one or more anomaly detection techniques based on analyzing the one or more attributes. This selection process is crucial because different anomaly detection methods are suited to different types of data and anomalies. For example, if the analysis reveals that the data exhibits seasonal patterns, techniques such as Holt-Winters or Exponential Moving Averages might be chosen to detect anomalies in time series data. Conversely, if the data is high-dimensional, clustering methods or supervised learning approaches may be more appropriate. The choice of technique is guided by the nature of the anomalies identified during the analysis, ensuring that the selected methods are tailored to effectively capture and address the specific anomalies present in the data.
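For illustration, an exponential moving average (EMA) based detector may be sketched as below. This simple form does not model seasonality, which Holt-Winters would; the smoothing factor, the 50 % relative tolerance, and the sample call drop rates are assumptions made only for this sketch.

```python
def ema_anomalies(series, alpha=0.3, tolerance=0.5):
    """Indices whose value deviates from the running EMA by more than
    `tolerance`, expressed as a fraction of the EMA."""
    if not series:
        return []
    ema = series[0]
    flagged = []
    for i, x in enumerate(series[1:], start=1):
        if ema != 0 and abs(x - ema) / abs(ema) > tolerance:
            flagged.append(i)  # anomalous points do not update the EMA
        else:
            # Normal point (or EMA of zero, where no ratio can be formed):
            # fold it into the running average.
            ema = alpha * x + (1 - alpha) * ema
    return flagged

# Invented call drop rates (percent) with a spike at index 4.
call_drop_rate = [1.0, 1.1, 0.9, 1.0, 5.0, 1.0, 1.1]
spikes = ema_anomalies(call_drop_rate)
```

Excluding flagged points from the average keeps a single spike from raising the baseline and masking the readings that follow it.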
[0099] Thereafter, at step 512, the method 500 includes the step of applying the one or more anomaly detection techniques to the stored data to identify one or more anomalies.
[00100] A person of ordinary skill in the art will readily ascertain that the illustrated embodiments and steps in description and drawings (FIG.1-5) are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope and spirit of the disclosed embodiments.
[00101] The present disclosure provides a technical advancement: by recommending the most suitable algorithm for each specific data source, the invention enhances the accuracy of anomaly detection processes. This tailored approach ensures that anomalies are identified more effectively, thereby reducing both false positives and false negatives. Additionally, the automated algorithm selection process optimizes the allocation of computational resources, ensuring efficient processing without unnecessary expenditure. The system 108 possesses the capability to adapt to changes in data sources or evolving data characteristics. As new data sources are integrated or existing sources undergo modifications, the system 108 can autonomously recommend appropriate algorithms, eliminating the need for manual intervention. This adaptability not only streamlines the anomaly detection process but also enhances its overall robustness and reliability in dynamic environments.
[00102] The present invention offers multiple advantages over the prior art and the above listed are a few examples to emphasize on some of the advantageous features. The listed advantages are to be read in a non-limiting manner.

REFERENCE NUMERALS

[00103] Environment - 100;
[00104] User Equipment (UE) - 102;
[00105] Server - 104;
[00106] Network- 106;
[00107] System -108;
[00108] Processor - 202;
[00109] Memory - 204;
[00110] Storage unit - 206;
[00111] Retrieving unit – 208;
[00112] Formatting unit – 210;
[00113] Determining unit – 212;
[00114] Selecting unit – 214;
[00115] Applying unit – 216;
[00116] Generating unit– 218;
[00117] One or more data source - 302;
[00118] Internal data sources – 302a;
[00119] External data sources – 302b;
[00120] Data source integration - 402;
[00121] Data ingestion layer - 404;
[00122] Data lake – 406;
[00123] Data source analysis unit - 408;
[00124] Algorithm repository – 410;
[00125] Algorithm selection - 412;
[00126] Algorithm application unit – 414;
[00127] Alerting and response unit – 416;
[00128] User interface – 418;

CLAIMS:
We Claim:
1. A method (500) for selecting an anomaly detection technique, the method (500) comprising the steps of:
retrieving, by one or more processors (202), data from one or more data sources (302) based on establishing one or more connections with the one or more data sources (302);
formatting, by the one or more processors (202), the data to align with an acceptable format;
storing, by the one or more processors (202), the formatted data in a storage unit (206);
determining, by the one or more processors (202), one or more attributes of the data stored in the storage unit (206);
selecting, by the one or more processors (202), one or more anomaly detection techniques based on determining the one or more attributes; and
applying, by the one or more processors (202), the one or more anomaly detection techniques to the stored data to identify one or more anomalies.

2. The method (500) as claimed in claim 1, wherein the one or more processors (202), establishes the one or more connections with the one or more data sources (302) using one or more Application Programming Interfaces (APIs).

3. The method (500) as claimed in claim 1, wherein the one or more data sources (302) include at least one of, the data sources (302) within a telecommunication network and the data sources (302) outside the telecommunication network.

4. The method (500) as claimed in claim 3, wherein the one or more data sources (302) within the telecommunication network include at least one of, network performance data, subscriber data and device data, wherein the one or more data sources (302) outside the telecommunication network include at least one of, competitor data, social media data, customer feedback and surveys.

5. The method (500) as claimed in claim 1, wherein the acceptable format is the format which is suitable for anomaly detection.

6. The method (500) as claimed in claim 1, wherein the storage unit (206) is at least one of, a data lake.

7. The method (500) as claimed in claim 1, wherein the one or more attributes include at least one of, timestamps, data frequency, data size and scale, data complexity, nature of text of the data and specific type of data sources (302).

8. The method (500) as claimed in claim 1, wherein the step of selecting the one or more anomaly detection techniques based on determining the one or more attributes includes the steps of:
analysing, by the one or more processors (202), the one or more determined attributes; and
selecting, by the one or more processors (202), the one or more anomaly detection techniques based on analysing the determined one or more attributes.

9. The method (500) as claimed in claim 8, wherein the analysing includes determining type of the one or more attributes, wherein the type of the one or more attributes includes at least one of, large data, historic data, complex data and text based data.

10. The method (500) as claimed in claim 1, wherein the one or more processors (202), selects the one or more anomaly detection techniques based on historic performance of the one or more anomaly detection techniques with respect to similar data.

11. The method (500) as claimed in claim 1, wherein the method further comprises the step of:
generating, by the one or more processors (202), notifications which are transmitted to at least one of, a user or integrated to a Network Management System (NMS), the notifications pertain to one or more anomalies which are identified.

12. The method (500) as claimed in claim 10, wherein the one or more processors (202), enables a learning module to learn the historic performance of the one or more anomaly detection techniques with respect to the similar data, a result of learnt historic performance data is stored in the storage unit (206).

13. A system (108) for selecting anomaly detection technique, the system comprising:
a retrieving unit (208), configured to, retrieve, data from one or more data sources (302) based on establishing one or more connections with the one or more data sources (302);
a formatting unit (210), configured to, format, the data to align with an acceptable format;
the formatting unit (210), configured to, store, the formatted data in a storage unit (206);
a determining unit (212), configured to, determine, one or more attributes of the data stored in the storage unit (206);
a selecting unit (214), configured to, select, one or more anomaly detection techniques based on determining the one or more attributes; and
an applying unit (216), configured to, apply, the one or more anomaly detection techniques to the stored data to identify one or more anomalies.

14. The system (108) as claimed in claim 13, wherein the one or more processors (202), establishes the one or more connections with the one or more data sources (302) using one or more Application Programming Interfaces (APIs).

15. The system (108) as claimed in claim 13, wherein the one or more data sources (302) include at least one of, the data sources (302) within a telecommunication network and the data sources (302) outside the telecommunication network.

16. The system (108) as claimed in claim 15, wherein the one or more data sources (302) within the telecommunication network include at least one of, network performance data, subscriber data and device data, wherein the one or more data sources (302) outside the telecommunication network include at least one of, competitor data, social media data, customer feedback and surveys.

17. The system (108) as claimed in claim 13, wherein the acceptable format is the format which is suitable for anomaly detection.

18. The system (108) as claimed in claim 13, wherein the storage unit (206) is at least one of, a data lake.

19. The system (108) as claimed in claim 13, wherein the one or more attributes include at least one of, timestamps, data frequency, data size and scale, data complexity, nature of text of the data and specific type of data sources (302).

20. The system (108) as claimed in claim 13, wherein the selecting unit (214), selects, the one or more anomaly detection techniques by:
analysing, the one or more determined attributes; and
selecting, the one or more anomaly detection techniques based on analysing the determined one or more attributes.

21. The system (108) as claimed in claim 20, wherein the analysing includes determining type of the one or more attributes, wherein the type of the one or more attributes includes at least one of, large data, historic data, complex data and text based data.

22. The system (108) as claimed in claim 13, wherein the selecting unit (214) selects the one or more anomaly detection techniques based on historic performance of the one or more anomaly detection techniques with respect to similar data.

23. The system (108) as claimed in claim 13, wherein a generating unit (218), is configured to, generate, notifications which are transmitted to at least one of, a user or integrated to a Network Management System (NMS), the notifications pertain to one or more anomalies which are identified.

24. The system (108) as claimed in claim 22, wherein the selecting unit (214) enables a learning module to learn the historic performance of the one or more anomaly detection techniques with respect to the similar data, a result of learnt historic performance data is stored in the storage unit (206).
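For illustration only, the selection based on historic performance recited in claims 10, 12, 22 and 24 can be sketched as follows. This is a minimal sketch under stated assumptions: the PerformanceHistory class, its method names, the data-kind labels and the use of an averaged numeric score (for example, an F1 score) are hypothetical, and the claims do not specify how the learning module measures or stores performance.

```python
from collections import defaultdict


class PerformanceHistory:
    """Keeps past scores per (data kind, technique) pair (hypothetical)."""

    def __init__(self):
        self._scores = defaultdict(list)

    def record(self, data_kind, technique, score):
        """Store one observed score for a technique on a kind of data."""
        self._scores[(data_kind, technique)].append(score)

    def best_technique(self, data_kind, candidates, default=None):
        """Pick the candidate with the highest mean score on similar data."""
        best, best_avg = default, float("-inf")
        for technique in candidates:
            scores = self._scores.get((data_kind, technique))
            if not scores:
                continue  # no history for this technique on this kind of data
            avg = sum(scores) / len(scores)
            if avg > best_avg:
                best, best_avg = technique, avg
        return best


# Usage: scores here are made-up values for demonstration only.
history = PerformanceHistory()
history.record("network_performance", "z_score", 0.6)
history.record("network_performance", "isolation_forest", 0.8)
history.record("network_performance", "isolation_forest", 0.9)
choice = history.best_technique(
    "network_performance", ["z_score", "isolation_forest"]
)
fallback = history.best_technique("social_media", ["z_score"], default="z_score")
```

When no history exists for a kind of data, the sketch falls back to a caller-supplied default, which could be the result of the attribute-based selection.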

Documents

Application Documents

#	Name	Date
1	202321067272-STATEMENT OF UNDERTAKING (FORM 3) [06-10-2023(online)].pdf	2023-10-06
2	202321067272-PROVISIONAL SPECIFICATION [06-10-2023(online)].pdf	2023-10-06
3	202321067272-FORM 1 [06-10-2023(online)].pdf	2023-10-06
4	202321067272-FIGURE OF ABSTRACT [06-10-2023(online)].pdf	2023-10-06
5	202321067272-DRAWINGS [06-10-2023(online)].pdf	2023-10-06
6	202321067272-DECLARATION OF INVENTORSHIP (FORM 5) [06-10-2023(online)].pdf	2023-10-06
7	202321067272-FORM-26 [27-11-2023(online)].pdf	2023-11-27
8	202321067272-Proof of Right [12-02-2024(online)].pdf	2024-02-12
9	202321067272-DRAWING [07-10-2024(online)].pdf	2024-10-07
10	202321067272-COMPLETE SPECIFICATION [07-10-2024(online)].pdf	2024-10-07
11	Abstract.jpg	2024-12-07
12	202321067272-Power of Attorney [24-01-2025(online)].pdf	2025-01-24
13	202321067272-Form 1 (Submitted on date of filing) [24-01-2025(online)].pdf	2025-01-24
14	202321067272-Covering Letter [24-01-2025(online)].pdf	2025-01-24
15	202321067272-CERTIFIED COPIES TRANSMISSION TO IB [24-01-2025(online)].pdf	2025-01-24
16	202321067272-FORM 3 [31-01-2025(online)].pdf	2025-01-31