
Methods And Systems For Detecting Anomalies In Application Programming Interface Calls Within A Network

Abstract: Embodiments provide methods and systems for behavioral profiling and anomaly detection in API ecosystems. The method includes accessing an API call dataset including long-term API call data and short-term API call data related to API calls. The method includes generating feature(s) including a set of long-term velocity features and a set of short-term API call features. The method includes determining, via a first machine learning (ML) model, a set of anomalous nodes from the node(s) based on the features. The method includes extracting a subset of API calls associated with each anomalous node from the short-term API call data. The method includes generating a reconstruction loss corresponding to each API call. The method includes generating, via the first ML model, a risk score for each API call. The method includes declining API call(s) from the subset of API calls based on the risk score being greater than an anomaly detection threshold.


Patent Information

Application #
Filing Date
11 October 2023
Publication Number
16/2025
Publication Type
INA
Invention Field
COMPUTER SCIENCE
Status
Email
Parent Application

Applicants

MASTERCARD INTERNATIONAL INCORPORATED
2000 Purchase Street, Purchase, NY 10577, United States of America

Inventors

1. Hardik Wadhwa
27 Barker Ave, Apt 417, White Plains, New York 10601, United States of America
2. Gaurav Dhama
27 Barker Ave, PH 1526, White Plains, New York 10601, United States of America
3. Brian M McGuigan
1512 Urban Street, Mamaroneck, New York 10543, United States of America
4. Rupesh kumar Sankhala
Near Nani Bai Marda School, Ward No. 60 Bhootiyabas, Gayatri Nagar, Churu 331001, Rajasthan, India
5. Siddharth Vimal
House number 149, Mayur Vihar Sector 48A, Chandigarh 160047, India

Specification

Description: [0001] The present disclosure relates to artificial intelligence-based processing systems and, more particularly, to electronic methods and complex processing systems for detecting anomalies in Application Programming Interface (API) calls within a network using behavior profiling.

BACKGROUND
[0002] In today’s world of growing technology, the adoption of Application Programming Interfaces (APIs) by various organizations to stay connected with computer networks and exchange information within and between those networks continues to grow. As API adoption grows, APIs themselves are evolving in terms of their security and governance. As systems become increasingly distributed, malicious entities such as hackers have an increased incentive to attack the API network/infrastructure for malicious gains, thereby making API call security imperative.
[0003] Various threats to an API network may include object-level incursion, user authentication exploits, careless data exposure, Distributed Denial of Service (DDoS), authorization hacks, mass assignment weaknesses, security misconfiguration flaws, code injection vulnerabilities, poor asset management, inadequate logging and monitoring, and the like. In order to counter such threats, various approaches and techniques have been developed to protect an API network from malicious entities. One such approach includes a web application firewall (WAF), a security tool for monitoring, filtering, and blocking incoming and outgoing data packets from a web application or a website. Although this tool provides security, it can introduce some latency and overhead to the Hypertext Transfer Protocol (HTTP) traffic, as it has to process and analyze each request and response. It can also affect the functionality and usability of the web application, as it may block some legitimate traffic or trigger some false positives.
[0004] Another approach that is widely used for securing API networks is known as Tipping Point® Next-Generation Intrusion Prevention System (NGIPS). Tipping Point® NGIPS is a network intrusion prevention system that deals with Information Technology (IT) threat protection. It combines application-level security with user awareness and inbound/outbound messaging inspection capabilities, to protect the user’s applications, network, and data from threats. However, this approach has several drawbacks: it lacks sufficient bandwidth to handle high traffic volumes, possesses a complex interface, exhibits poor response times, introduces latency issues, and the like.
[0005] Other known approaches and techniques include Akamai®, F5 BIG-IP® Application Security Manager® (ASM), API Gateway (APIGW), Pivotal Cloud Foundry® (PCF), Neustar®, and the like. However, all these approaches possess one or more drawbacks, such as delayed response times, high cost, complex user interfaces, and the like. Moreover, such approaches are not able to detect and block every type of attack, such as DoS attacks, phishing attacks, Out of Band Exploitation (OOB) attacks, and the like.
[0006] Thus, there exists a technological need for technical solutions to predict such attacks by detecting any anomalies in the API calls within a network.
SUMMARY
[0007] Various embodiments of the present disclosure provide methods and systems for detecting anomalies in Application Programming Interface (API) calls within a network.
[0008] In an embodiment, a computer-implemented method for detecting anomalies in Application Programming Interface (API) calls within a network is disclosed. The computer-implemented method performed by a server system includes accessing an API call dataset related to a set of API calls performed between a plurality of nodes connected in the network from a database associated with the server system. The API call dataset includes long-term API call data and short-term API call data. The method further includes generating a plurality of features based, at least in part, on the API call dataset. The plurality of features includes a set of long-term velocity features and a set of short-term API call features. Further, the method includes determining, via a first machine learning (ML) model, a set of anomalous nodes from the plurality of nodes based, at least in part, on the set of long-term velocity features and the set of short-term API call features. The method includes extracting a subset of API calls associated with each anomalous node of the set of anomalous nodes from the short-term API call data. Furthermore, the method includes generating a reconstruction loss corresponding to each API call of the subset of API calls based, at least in part, on analyzing the subset of API calls. The method includes generating, via the first ML model, a risk score for each API call of the subset of API calls based, at least in part, on the corresponding reconstruction loss. Moreover, the method includes declining one or more API calls from the subset of API calls based, at least in part, on the risk score associated with the one or more API calls being at least equal to an anomaly detection threshold.
[0009] In another embodiment, a server system is disclosed. The server system includes a communication interface and a memory including executable instructions. The server system also includes a processor communicably coupled to the memory. The processor is configured to execute the instructions to cause the server system, at least in part, to access an Application Programming Interface (API) call dataset related to a set of API calls performed between a plurality of nodes connected in a network from a database associated with the server system. The API call dataset includes long-term API call data and short-term API call data. The server system is further caused to generate a plurality of features based, at least in part, on the API call dataset. The plurality of features includes a set of long-term velocity features and a set of short-term API call features. Further, the server system is caused to determine, via a first machine learning (ML) model, a set of anomalous nodes from the plurality of nodes based, at least in part, on the set of long-term velocity features and the set of short-term API call features. Furthermore, the server system is caused to extract a subset of API calls associated with each anomalous node of the set of anomalous nodes from the short-term API call data. Moreover, the server system is caused to generate a reconstruction loss corresponding to each API call of the subset of API calls based, at least in part, on analyzing the subset of API calls. In addition, the server system is caused to generate, via the first ML model, a risk score for each API call of the subset of API calls based, at least in part, on the corresponding reconstruction loss. The server system is further caused to decline one or more API calls from the subset of API calls based, at least in part, on the risk score associated with the one or more API calls being at least equal to an anomaly detection threshold.
[0010] In yet another embodiment, a non-transitory computer-readable storage medium is disclosed. The non-transitory computer-readable storage medium includes computer-executable instructions that, when executed by at least a processor of a server system, cause the server system to perform a method. The method includes accessing an API call dataset related to a set of API calls performed between a plurality of nodes connected in a network from a database associated with the server system. The API call dataset includes long-term API call data and short-term API call data. The method further includes generating a plurality of features based, at least in part, on the API call dataset. The plurality of features includes a set of long-term velocity features and a set of short-term API call features. Further, the method includes determining, via a first machine learning (ML) model, a set of anomalous nodes from the plurality of nodes based, at least in part, on the set of long-term velocity features and the set of short-term API call features. The method includes extracting a subset of API calls associated with each anomalous node of the set of anomalous nodes from the short-term API call data. Furthermore, the method includes generating a reconstruction loss corresponding to each API call of the subset of API calls based, at least in part, on analyzing the subset of API calls. The method includes generating, via the first ML model, a risk score for each API call of the subset of API calls based, at least in part, on the corresponding reconstruction loss. Moreover, the method includes declining one or more API calls from the subset of API calls based, at least in part, on the risk score associated with the one or more API calls being at least equal to an anomaly detection threshold.
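Purely for illustration, the sequence of operations recited above may be sketched as the following end-to-end pipeline. The model interface, function names, and threshold value below are hypothetical assumptions, not taken from the disclosure:

```python
def detect_anomalous_calls(long_term_data, short_term_data, model, threshold):
    """Illustrative sketch of the claimed method: generate features from the
    API call dataset, detect anomalous nodes, score each of those nodes'
    calls by reconstruction loss, and decline calls at or above threshold.

    `model` is a hypothetical object exposing the four operations the
    disclosure attributes to the first ML model."""
    # Feature generation from long-term and short-term API call data.
    features = model.generate_features(long_term_data, short_term_data)
    # Node-level screening: which nodes behave anomalously overall.
    anomalous = model.detect_anomalous_nodes(features)
    declined = []
    for node in anomalous:
        # Call-level scoring restricted to the anomalous nodes' recent calls.
        for call in short_term_data.get(node, []):
            loss = model.reconstruction_loss(call)
            score = model.risk_score(loss)
            if score >= threshold:
                declined.append(call)
    return declined
```

The key design point recited throughout the summary is this two-stage structure: a cheap node-level pass narrows the search space before the per-call reconstruction scoring runs.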

BRIEF DESCRIPTION OF THE FIGURES
[0011] For a more complete understanding of example embodiments of the present technology, reference is now made to the following descriptions taken in connection with the accompanying drawings in which:
[0012] FIG. 1 illustrates an exemplary representation of an environment related to at least some example embodiments of the present disclosure;
[0013] FIG. 2 illustrates a simplified block diagram of a server system, in accordance with an embodiment of the present disclosure;
[0014] FIG. 3 illustrates a simplified block diagram representation of a node configured to run a protocol corresponding to an Application Programming Interface (API) associated with a software application running on the node, in accordance with an embodiment of the present disclosure;
[0015] FIG. 4A illustrates a block diagram representation of an environment related to at least some example embodiments of the present disclosure;
[0016] FIG. 4B illustrates a block diagram representation of a data communication flow between the server system and one or more user devices of the environment of FIG. 4A, in accordance with an embodiment of the present disclosure;
[0017] FIGS. 4C and 4D collectively illustrate a sequential flow diagram for facilitating API call security for an API call session between a plurality of user devices, the entity server, and the server system, in accordance with an embodiment of the present disclosure;
[0018] FIG. 5 illustrates a block diagram representation of an environment related to at least some example embodiments of the present disclosure;
[0019] FIG. 6A illustrates a detailed schematic representation of the server system and a data communication flow between the server system and a plurality of nodes, in accordance with an embodiment of the present disclosure;
[0020] FIG. 6B illustrates a detailed schematic representation of the server system and a data communication flow between the server system and the plurality of nodes, in accordance with another embodiment of the present disclosure;
[0021] FIG. 6C illustrates a detailed schematic representation of the server system and a data communication flow between the server system and the plurality of nodes, in accordance with yet another embodiment of the present disclosure;
[0022] FIG. 6D illustrates a detailed schematic representation of the server system and a data communication flow between the server system and the plurality of nodes, in accordance with yet another embodiment of the present disclosure;
[0023] FIG. 7 illustrates a flow diagram representation of a process flow of a generation of a first machine learning (ML) model, in accordance with an embodiment of the present disclosure;
[0024] FIG. 8 illustrates a flow diagram representation of a process flow of a determination of a set of anomalous nodes, in accordance with an embodiment of the present disclosure;
[0025] FIG. 9 illustrates a flow diagram representation of a process flow of the generation of a risk score, in accordance with an embodiment of the present disclosure;
[0026] FIG. 10A illustrates a process flow diagram depicting a method for generating the first ML model, in accordance with the present disclosure;
[0027] FIG. 10B illustrates a process flow diagram depicting a method for behavioral profiling and anomaly detection in an Application Programming Interface (API) ecosystem, in accordance with the present disclosure;
[0028] FIG. 10C illustrates a process flow diagram depicting a method for determining a set of anomalous nodes, in accordance with the present disclosure;
[0029] FIG. 10D illustrates a process flow diagram depicting a method for generating a risk score, in accordance with the present disclosure; and
[0030] FIG. 11 is a simplified block diagram of an electronic device capable of implementing various embodiments of the present disclosure.
[0031] The drawings referred to in this description are not to be understood as being drawn to scale except if specifically noted, and such drawings are only exemplary in nature.

DETAILED DESCRIPTION
[0032] In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure. It will be apparent, however, to one skilled in the art that the present disclosure can be practiced without these specific details. Descriptions of well-known components and processing techniques are omitted so as to not unnecessarily obscure the embodiments herein. The examples used herein are intended merely to facilitate an understanding of ways in which the embodiments herein may be practiced and to further enable those of skill in the art to practice the embodiments herein. Accordingly, the examples should not be construed as limiting the scope of the embodiments herein.
[0033] Reference in this specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. The appearances of the phrase “in an embodiment” in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Moreover, various features are described which may be exhibited by some embodiments and not by others. Similarly, various requirements are described which may be requirements for some embodiments but not for other embodiments.
[0034] Moreover, although the following description contains many specifics for the purposes of illustration, anyone skilled in the art will appreciate that many variations and/or alterations to said details are within the scope of the present disclosure. Similarly, although many of the features of the present disclosure are described in terms of each other, or in conjunction with each other, one skilled in the art will appreciate that many of these features can be provided independently of other features. Accordingly, this description of the present disclosure is set forth without any loss of generality to, and without imposing limitations upon, the present disclosure.
[0035] Conditional language such as, among others, “can”, “could”, “might”, or “may”, unless specifically stated otherwise, is otherwise understood within the context as used in general to convey that certain embodiments include, while other embodiments do not include, certain features, elements and/or steps. Thus, such conditional language is not generally intended to imply that features, elements, and/or steps are in any way required for one or more embodiments or that one or more embodiments necessarily include logic for deciding, with or without user input or prompting, whether these features, elements and/or steps are included or are to be performed in any particular embodiment.
[0036] Disjunctive language such as the phrase “at least one of X, Y, or Z” unless specifically stated otherwise, is otherwise understood with the context as used in general to present that an item, term, etc., may be either X, Y, or Z, or any combination thereof (e.g., X, Y, and/or Z). Thus, such disjunctive language is not generally intended to, and should not, imply that certain embodiments require at least one of X, at least one of Y, or at least one of Z to each be present.
[0037] Unless otherwise explicitly stated, articles such as “a” or “an” should generally be interpreted to include one or more described items. Accordingly, phrases such as “a device configured to” are intended to include one or more recited devices. Such one or more recited devices can also be collectively configured to carry out the stated recitations. For example, “a processor configured to carry out recitations A, B, and C” can include a first processor configured to carry out recitation A working in conjunction with a second processor configured to carry out recitations B and C. The same holds true for the use of definite articles used to introduce embodiment recitations. In addition, even if a specific number of an introduced embodiment recitation is explicitly recited, those skilled in the art will recognize that such recitation should typically be interpreted to mean at least the recited number (e.g., the bare recitation of “two recitations”, without other modifiers, typically means at least two recitations or two or more recitations).
[0038] It will be understood by those within the art that, in general, terms used herein, are generally intended as “open” terms (e.g., the term “including” should be interpreted as “including but not limited to,” the term “having” should be interpreted as “having at least,” the term “includes” should be interpreted as “includes but is not limited to,” etc.).
[0039] The term “Application Programming Interface (API)” refers to a set of rules and protocols that allows different software applications to communicate and interact with each other. API defines how different software components should interact, what data can be accessed, and what operations can be performed. Different types of software that APIs can be exposed to may include web applications, mobile applications, operating systems, libraries, frameworks, and the like.
[0040] The term “API Gateway” refers to a server or middleware component that acts as an intermediary between clients (such as applications or devices) and a collection of backend services or APIs. API Gateway provides a centralized entry point for clients to access multiple APIs or services. Various tasks that the API Gateway can handle may include request routing, protocol translation, authentication, rate limiting, caching, logging, monitoring, and the like.
[0041] The terms “API calls”, “requests”, and “API requests” are used interchangeably throughout the description and refer to a process through which a client application submits a request to a server’s API asking for a service or information. An API call also includes everything that happens after the request is submitted, including when the API retrieves information from the server and delivers it back to the client. There exist different request methodologies using which a client can send a request to a server. For instance, when clients want the server to perform basic functions, requests may be written as Uniform Resource Locators (URLs) so that the communication between the clients and the server is dictated by the rules of the Hypertext Transfer Protocol (HTTP). The four most basic request methods include GET (to retrieve a resource), POST (to create a new resource), PUT (to edit or update an existing resource), and DELETE (to delete a resource). Further, an API key may be used as a unique identifier for authenticating the API calls to an API. Alternative means of authentication may also be used, such as authentication tokens. As used herein, the term “resource” refers to any piece of information that the API can provide to the client, and the resources are contained on the server used by the software application.
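As an illustration of such an API call, the request below is constructed with one of the four basic methods and authenticated with an API key carried in a request header. The endpoint URL, header name, and key value are hypothetical placeholders, not part of the disclosure:

```python
from urllib.request import Request

# Hypothetical endpoint and API key, used only for illustration.
BASE_URL = "https://api.example.com/v1/resources/42"
API_KEY = "demo-key-123"

def build_request(method: str, url: str, api_key: str) -> Request:
    """Construct an HTTP request using one of the four basic request
    methods, with an API key as the unique identifier for authentication."""
    if method not in {"GET", "POST", "PUT", "DELETE"}:
        raise ValueError(f"unsupported method: {method}")
    req = Request(url, method=method)
    req.add_header("X-API-Key", api_key)  # hypothetical header name
    return req

req = build_request("GET", BASE_URL, API_KEY)
```

A real client would then pass `req` to `urllib.request.urlopen`; the construction step alone shows where the method and the authenticating key live in the call.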
[0042] The term “Data Lake” as used herein refers to a central storage repository that holds big data from many sources in a raw and granular format. It can store data in a flexible format for future use as it can store structured, semi-structured, and unstructured data. When storing data, a data lake associates it with identifiers and metadata tags for faster retrieval.
[0043] The term “feature engineering” as used herein refers to a technique of generating features by transforming or extracting meaningful information data from raw data that can be used as input to train artificial intelligence (AI) and machine learning (ML) algorithms.
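For example, raw API call timestamps might be transformed into the kind of long-term and short-term velocity features the disclosure refers to. The window lengths and feature names below are illustrative assumptions:

```python
from datetime import datetime, timedelta

def velocity_features(call_times, now,
                      long_window_days=90, short_window_minutes=60):
    """Derive simple velocity features from raw per-node call timestamps:
    a long-term average call rate and a short-term burst count."""
    long_start = now - timedelta(days=long_window_days)
    short_start = now - timedelta(minutes=short_window_minutes)
    long_calls = [t for t in call_times if t >= long_start]
    short_calls = [t for t in call_times if t >= short_start]
    return {
        # Long-term velocity: average calls per day over the long window.
        "long_term_calls_per_day": len(long_calls) / long_window_days,
        # Short-term feature: raw call count in the recent window.
        "short_term_call_count": len(short_calls),
    }

now = datetime(2023, 10, 11, 12, 0)
calls = [now - timedelta(minutes=m) for m in (5, 10, 20, 2000)]
feats = velocity_features(calls, now)
```

A sudden divergence between the two features (a quiet long-term rate with a high short-term count) is exactly the burst pattern the node-level screening is meant to surface.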
OVERVIEW
[0044] Various embodiments of the present disclosure provide methods, systems, electronic devices, and computer program products for detecting anomalies in Application Programming Interface (API) calls within a network using behavioral profiling. In an embodiment, the present disclosure describes a server system for detecting the anomalies in the API calls within the network. The server system includes a processor and a memory. In an embodiment, the server system is configured to access an API call dataset related to a set of API calls performed between a plurality of nodes connected in the network from a database associated with the server system. Herein, the API call dataset includes long-term API call data and short-term API call data.
[0045] In another embodiment, the server system is further configured to generate a plurality of features based, at least in part, on the API call dataset. Herein, the plurality of features includes a set of long-term velocity features and a set of short-term API call features. Further, the server system is configured to determine, via a first machine learning (ML) model, a set of anomalous nodes from the plurality of nodes based, at least in part, on the set of long-term velocity features and the set of short-term API call features. Furthermore, the server system is configured to extract a subset of API calls associated with each anomalous node of the set of anomalous nodes from the short-term API call data.
[0046] Moreover, the server system is configured to generate a reconstruction loss corresponding to each API call of the subset of API calls based, at least in part, on analyzing the subset of API calls. In addition, the server system is configured to generate, via the first ML model, a risk score for each API call of the subset of API calls based, at least in part, on the corresponding reconstruction loss. The server system is further configured to decline and/or flag one or more API calls from the subset of API calls based, at least in part, on the risk score associated with the one or more API calls being at least equal to an anomaly detection threshold.
[0047] It is noted that in an embodiment, the first ML model includes a Variational Autoencoder (VAE) model trained using deep-learning neural networks. Further, the server system is configured to generate the first ML model. Herein, generating the first ML model includes iteratively performing a set of operations by the server system to train the first ML model to learn a general distribution of a genuine API call. In an embodiment, the set of operations may include extracting a first set of features from the set of long-term velocity features. Herein, the first set of features corresponds to a non-malicious API call dataset from the long-term API call data. The set of operations may further include computing a reconstruction loss based, at least in part, on analyzing the first set of features, and minimizing the reconstruction loss by optimizing one or more ML model parameters. Herein, the one or more ML model parameters may include at least an encoder and decoder architecture design, a predefined latent space dimension, regularization parameters, a learning rate, an optimization method, reconstruction loss weighting, a batch size, regularization techniques, and the like.
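The objective such a VAE minimizes can be sketched as the standard evidence lower bound: a reconstruction term plus a KL divergence pulling the approximate posterior toward a standard normal prior. The mean-squared reconstruction term and the unit weighting below are conventional choices, not details fixed by the disclosure:

```python
import math

def vae_loss(x, x_recon, mu, logvar, kl_weight=1.0):
    """Per-sample VAE training loss: mean-squared reconstruction error plus
    the KL divergence of the approximate posterior N(mu, exp(logvar)) from
    the standard normal prior N(0, 1), in closed form."""
    # Reconstruction term: how well the decoder reproduced the input features.
    recon = sum((a - b) ** 2 for a, b in zip(x, x_recon)) / len(x)
    # KL term: -1/2 * sum(1 + log(var) - mu^2 - var) over latent dimensions.
    kl = -0.5 * sum(1 + lv - m ** 2 - math.exp(lv)
                    for m, lv in zip(mu, logvar))
    return recon + kl_weight * kl
```

Training on genuine calls drives this loss down; at inference, a call drawn from a different distribution reconstructs poorly, and that elevated loss is what the later risk-scoring step consumes.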
[0048] In an embodiment, for the server system to determine the set of anomalous nodes, the server system may further be configured to generate, via the first ML model, a standard behavioral profile that a genuine node possesses in response to having performed a set of genuine API calls based, at least in part, on the first set of features. The server system may be configured to generate, via the first ML model, a behavioral profile for each of the plurality of nodes by analyzing underlying patterns and behavior of the set of API calls based, at least in part, on the set of short-term API call features.
[0049] Further, the server system may be configured to compute, via the first ML model, a behavioral discrepancy probability for each of the plurality of nodes based, at least in part, on comparing the behavioral profile of each of the plurality of nodes with the standard behavioral profile. The server system may then be configured to assign, via the first ML model, an anomalous identity label to one or more nodes from the plurality of nodes based, at least in part, on determining if the behavioral discrepancy probability corresponding to the one or more nodes from the plurality of nodes is at least equal to a pre-determined threshold probability. The server system may be configured to determine the set of anomalous nodes from the plurality of nodes based, at least in part, on the corresponding anomalous identity label of the one or more nodes from the plurality of nodes.
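The comparison and labeling steps above can be sketched as follows. The specific discrepancy measure (a Euclidean distance between profile vectors squashed into a probability) is an illustrative stand-in for whatever the first ML model actually computes:

```python
import math

def discrepancy_probability(profile, standard_profile, scale=1.0):
    """Map the distance between a node's behavioral profile and the standard
    behavioral profile into a (0, 1) discrepancy probability."""
    dist = math.sqrt(sum((p - s) ** 2
                         for p, s in zip(profile, standard_profile)))
    return 1.0 - math.exp(-dist / scale)

def anomalous_nodes(profiles, standard_profile, threshold=0.5):
    """Assign the anomalous identity label to every node whose behavioral
    discrepancy probability is at least equal to the threshold, and return
    the resulting set of anomalous nodes."""
    return {node for node, prof in profiles.items()
            if discrepancy_probability(prof, standard_profile) >= threshold}
```

The threshold of 0.5 is likewise an arbitrary choice for the sketch; the disclosure only requires that some predetermined threshold probability exist.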
[0050] In another embodiment, to generate the risk score, the server system may be further configured to determine, via the first ML model, the anomaly detection threshold based, at least in part, on a predefined threshold value and the reconstruction loss. The server system may further be configured to determine, via the first ML model, a data sensitivity and history of attacks on each API call of the subset of API calls associated with each anomalous node of the set of anomalous nodes based, at least in part, on the reconstruction loss.
[0051] Further, the server system may be configured to generate, via the first ML model, the risk score including one of a first risk score and a second risk score. Herein, the first risk score is generated for each API call of the subset of API calls that is less than the anomaly detection threshold, when the data sensitivity and the history of attacks on each API call of the subset of API calls deviate at least by a first predefined extent from the general distribution of the genuine API call. The second risk score is generated for each API call of the subset of API calls that is at least equal to the anomaly detection threshold when the data sensitivity and the history of attacks on each API call of the subset of API calls deviate by a second predefined extent from the general distribution of the genuine API call. Herein, the second predefined extent is greater than the first predefined extent.
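One way to read this two-tier scoring is sketched below. The numeric score values, extents, and thresholds are invented for illustration; the disclosure fixes only their ordering (the second extent exceeds the first, and the second score sits at or above the anomaly detection threshold):

```python
def risk_score(recon_loss, anomaly_threshold, deviation,
               first_extent=1.0, second_extent=2.0):
    """Return a tiered risk score for an API call. The higher 'second' score
    is produced when the call deviates from the general distribution of
    genuine calls by at least the (larger) second extent and its
    reconstruction loss is at least the anomaly detection threshold; the
    lower 'first' score covers deviation beyond only the first extent."""
    if deviation >= second_extent and recon_loss >= anomaly_threshold:
        return 0.9   # second risk score: decline and/or flag the call
    if deviation >= first_extent:
        return 0.4   # first risk score: suspicious but below decline level
    return 0.0       # consistent with genuine behavior

def should_decline(score, anomaly_detection_threshold=0.8):
    """Decline the call when its risk score is at least the threshold."""
    return score >= anomaly_detection_threshold
```

The point of the two tiers is operational: first-score calls can feed alerts and reporting while only second-score calls are declined outright.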
[0052] In some embodiments, it may be noted that the server system may be configured to generate one or more alerts to API management authorities when the risk score associated with one or more API calls of the subset of API calls is at least equal to the anomaly detection threshold. In addition, the server system may be configured to generate a report for the set of API calls. The report may include a data summary corresponding to overall malicious traffic on an API ecosystem. In an example embodiment, the report may be generated using a Natural Language Generation (NLG) model.
[0053] In an example embodiment, the server system may be further configured to determine, via a second ML model, a subset of anomalous nodes from the set of anomalous nodes to be labeled as highly risky anomalous nodes based, at least in part, on analyzing the risk score generated for each API call of the subset of API calls associated with each anomalous node of the set of anomalous nodes. Herein, the second ML model may include a Natural Language Processing (NLP) model trained using deep learning neural networks.
[0054] Various embodiments of the present disclosure offer multiple advantages and technical effects. For instance, the present disclosure is intended to provide nodes in a network with call-level security for securing the API calls between the nodes in an API ecosystem through machine learning. The present disclosure uses machine learning and artificial intelligence to build a behavioral profile of the users and to automate the learning and detection of malicious API call patterns.
[0055] Further, because the AI or ML models are trained on historical API call datasets, context information such as system logs, and client information including previous behavior and demographic information, the present disclosure can detect malicious nodes that could not previously be detected using conventional methods. This is because conventional methods operate based on rules derived from known attack patterns and lack any information related to new attack methods and new APIs, whereas the present disclosure accounts for both known and new types of attacks.
[0056] Although rule-based methods can detect a good number of call-level attacks, they cannot detect coordinated attacks (attacks with a small number of calls from different addresses) or business logic attacks, which try to exploit API usage logic to extract information. Therefore, the system proposed in the present disclosure uses an AI model that has learned general usage patterns and also has user history, thereby enabling the system to detect both coordinated attacks and business logic attacks, as the model is aware of what normal API usage is and also has system information at the API level to make decisions.
[0057] Furthermore, it may be understood that the Variational Autoencoder (VAE) model is a state-of-the-art method in the field of anomaly detection using deep learning. The VAE model brings to bear two neural networks (an encoder and a decoder) that are trained on large amounts of data. This makes the system proposed in the present disclosure a very powerful tool for learning the general distribution of API usage by a user.
[0058] Moreover, it is known that large neural networks trained on huge datasets can learn correlations between a large number of features and uncover insights, which is another advantage of the system proposed in the present disclosure. This is because conventional models or rule-based methods cannot learn such correlations. The system performs significantly better in comparison to basic rule-based methods such as the interquartile range (IQR) method and even tree-based methods such as the isolation forest method.
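For context, the rule-based IQR baseline referenced above can be sketched in a few lines. The following is a minimal illustration (not part of the disclosed system), flagging values outside the conventional 1.5 × IQR fences; the sample data and function names are assumptions for illustration only.

```python
# Illustrative IQR baseline: values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]
# are flagged as anomalous.

def iqr_bounds(values):
    """Return (lower, upper) fences using the 1.5 * IQR rule."""
    xs = sorted(values)
    n = len(xs)

    def quantile(q):
        # Linear-interpolation quantile on the sorted sample.
        pos = q * (n - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        return xs[lo] * (1 - frac) + xs[hi] * frac

    q1, q3 = quantile(0.25), quantile(0.75)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def iqr_anomalies(values):
    low, high = iqr_bounds(values)
    return [v for v in values if v < low or v > high]

calls_per_minute = [4, 5, 5, 6, 5, 4, 6, 5, 250]  # one sudden burst
print(iqr_anomalies(calls_per_minute))  # the burst of 250 calls is flagged
```

As the disclosure notes, such a per-feature rule catches bursts but cannot capture cross-feature correlations or coordinated low-volume attacks, which motivates the learned model.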
[0059] Various example embodiments of the present disclosure are described hereinafter with reference to FIGS. 1 to 11.
[0060] FIG. 1 illustrates a block diagram representation of an environment 100 related to at least some example embodiments of the present disclosure. Although the environment 100 is presented in one arrangement, other embodiments may include the parts of the environment 100 (or other parts) arranged otherwise depending on, for example, generating a behavioral profile for each node of a plurality of nodes in a network, determining a set of anomalous nodes from the plurality of nodes based on the behavioral profile of each node, generating a risk score for each of the plurality of Application Programming Interface (API) calls associated with each anomalous node, and declining or flagging each of the API calls of each node based, at least in part, on the risk score associated with the corresponding API calls being at least equal to an anomaly detection threshold.
[0061] The environment 100 generally includes a plurality of entities such as a server system 102, a plurality of nodes 104(1), 104(2), … 104(N) (collectively, referred to as nodes 104, where ‘N’ is a non-zero natural number), a plurality of data sources 106(1), 106(2), … 106(N) (collectively, referred to as data sources 106, where ‘N’ is again a non-zero natural number), and a database 108, each coupled to, and in communication with (and/or with access to) a network 110. The network 110 may include, without limitation, a light fidelity (Li-Fi) network, a local area network (LAN), a wide area network (WAN), a metropolitan area network (MAN), a satellite network, the Internet, a fiber optic network, a coaxial cable network, an infrared (IR) network, a radio frequency (RF) network, a virtual network, and/or another suitable public and/or private network capable of supporting communication among two or more of the parts or users illustrated in FIG. 1, or any combination thereof.
[0062] Various entities in the environment 100 may connect to the network 110 in accordance with various wired and wireless communication protocols, such as Transmission Control Protocol and Internet Protocol (TCP/IP), User Datagram Protocol (UDP), 2nd Generation (2G), 3rd Generation (3G), 4th Generation (4G), 5th Generation (5G) communication protocols, Long Term Evolution (LTE) communication protocols, New Radio (NR) communication protocol, any future communication protocol, or any combination thereof. In some instances, the network 110 may utilize a secure protocol (e.g., Hypertext Transfer Protocol Secure (HTTPS), Secure Socket Layer (SSL), and/or any other protocol or set of protocols) for communicating with the various entities depicted in FIG. 1.
[0063] In an embodiment, the nodes 104 may refer to computing devices or entities having a software application installed or an instance of a software application running on each of them. Herein, the software application running on one or more of the nodes 104 is capable of communicating with other software applications running on other nodes via an API provided by the corresponding software applications installed on each node. In another embodiment, each node of the nodes 104 may be operated by a node operator (not shown in FIG. 1) associated with the API. For instance, the node operator may be a user, a person, an individual, an organization, an institution, a company, a merchant, a banking institution, a service provider, an entity, and the like, without limiting the scope of the present disclosure.
[0064] In various non-limiting examples, the nodes 104 may be any portable communication devices such as a smartphone, a tablet, a personal digital assistant (PDA), a phablet, a wearable device, a smartwatch, a laptop computer, a server, and the like. In some other examples, the nodes 104 may include any fixed communication device such as a desktop, a mainframe computer, a workstation, and the like.
[0065] In a non-limiting scenario, a network of the nodes 104 may be a peer-to-peer network such that the nodes 104 can be both client nodes and/or server nodes. For instance, a node 104(1) may act as a server node while providing a plurality of services to other nodes (e.g., the nodes 104(2)-104(N)) that may act as client nodes. In another instance, the server node (e.g., the node 104(1)) now may be in need of a service that may be available at one or more nodes of the other nodes (e.g., the nodes 104(2)-104(N)). Then, the server node may now act as a client node and one or more nodes of the other nodes (e.g., the nodes 104(2)-104(N)) may now act as server nodes since they are providing the desired service to the node 104(1). Therefore, in the peer-to-peer network, the nodes 104 may be able to communicate with each other through the network 110 and provide services to each other. Moreover, in case of the services being provided with software applications running on the nodes 104, the software applications can communicate with each other through the network 110 by sending API calls to respective APIs of the nodes 104 and receiving the desired services via API call responses from the respective APIs of the nodes 104.
[0066] In another non-limiting scenario, the network of nodes 104 may be a client-server network having a first group of nodes (e.g., the nodes 104(1)-104(x), ‘x’ being a natural number less than ‘N’) being clients and a second group of nodes (e.g., the nodes 104(x+1)-104(N)) being servers providing one or more services to the clients. Any of the nodes 104(1)-104(N) may act as a service provider server or a client at any given time; alternatively, the roles of the nodes 104 may be fixed in the client-server network. For instance, a node 104(8) may be fixed as a service provider server and nodes 104(1)-104(7) may be fixed as clients, since the services provided by the service provider server, i.e., the node 104(8), are required by the clients, i.e., the nodes 104(1)-104(7).
[0067] In this regard, there exist at least two scenarios. In a first scenario, different applications run on the server and the client, and those different applications communicate with each other via an API provided by the application running on the server. In a second scenario, a single application runs on the service provider server, and for the client to be able to use one or more services provided by that application, an instance of the same application has to be present at the client end, communicating with the application on the service provider server via an API provided by the application on the service provider server.
[0068] For example, in the first scenario, a first application running on the nodes 104(1)-104(7) while performing its operations needs to access or communicate with a second application running on the node 104(8). The first application sends an API call via the network 110 to an API of the second application. Upon receiving the API call, a sender’s identity is authenticated and authorized and the second application provides information as per the API call to the first application in the form of an API call response.
[0069] In the second scenario, an instance of an application running on the node 104(8) is running on each of the nodes 104(1)-104(7). The instance of the application sends an API call via the network 110 to the application on the node 104(8) to access one or more services of the application. Upon receiving the API call, a sender’s identity is authenticated and authorized and the application on the node 104(8) provides information as per the API call to the instance of the application on the corresponding nodes 104(1)-104(7) in the form of an API call response.
[0070] In either of the scenarios, multiple services provided by the node 104(8) may be accessed either via a single API or via multiple APIs. In the latter case, each service may be exposed through its own API and accessed using API calls, and hence such services are referred to as microservices. As used herein, the term “microservice” refers to a style of software architecture that divides an application’s different functions into smaller components called “services”. When an application is built this way, it is said to follow a microservice architecture. Developers may often refer to such services as microservices. Further, these microservices of a particular application can interact with each other via private APIs. Furthermore, in case multiple APIs are used by an application, an API gateway may be needed at the service provider server’s end for handling various tasks such as, but not limited to, routing the API calls to appropriate backend services, enforcing security measures, transforming request/response data, handling authentication and authorization, providing logging, and the like. It may be noted that these scenarios and their working may also be applicable to the nodes of the peer-to-peer type of network as well.
[0071] To that end, as the various organizations stay connected through the peer-to-peer network or the client-server network of their nodes and exchange information to grow, the vulnerability of such networks increases as the network expands, and hence the API calls and/or the API may be attacked by malicious entities such as fraudsters, hackers and the like. As described earlier, although various approaches and techniques have been developed to prevent such attacks, each of them is riddled with various limitations. Therefore, a need exists to address the limitations of such conventional approaches and techniques. To that end, the present disclosure provides one or more technical solutions to address these limitations. More specifically, the present disclosure provides the server system 102 that is configured to perform a plurality of operations for addressing the technical problem described earlier.
[0072] In one embodiment, the server system 102 is configured to facilitate the nodes 104 with call level security for securing an API network/ecosystem from anomalous or suspicious API calls between the nodes 104 using various Artificial Intelligence (AI) or Machine Learning (ML) models. In some embodiments, the server system 102 may be deployed as a standalone server or may be implemented in the cloud as software as a service (SaaS). The server system 102 may be configured to provide or host an API security application 112 (hereinafter, interchangeably referred to as an ‘application 112’) that facilitates the nodes 104 with the call level security for securing the API network from anomalous or suspicious API calls. In some embodiments, an instance of the application 112 is also accessible to the nodes 104 as shown in the environment 100 in FIG. 1. It is noted that for leveraging the functionality of the application 112, an instance of the application 112 may be installed on each of the nodes 104. This enables the node operators to be able to access the server system 102 on the nodes 104.
[0073] Further, for the server system 102 to be able to facilitate such a feature, the server system 102 needs to collect a plurality of data samples from the plurality of data sources 106. In some embodiments, the data sources 106 may include firewalls, load balancers, gateways including Akamai®, F5 BIG-IP® Application Security Manager® (ASM), API Gateway (APIGW), Pivotal Cloud Foundry® (PCF), Neustar®, and the like. It is understood that the data sources 106 may be existing solutions that facilitate API security or API monitoring by allowing the API calls to pass through them. Thus, it may be understood that the server system 102 is configured to collect a plurality of data samples from these data sources. In an alternative embodiment, the server system 102 may monitor the various API calls and responses occurring on the network 110 to generate the plurality of data samples instead of relying on the data sources 106. In another embodiment, the server system 102 may store the plurality of data samples in the database 108 associated with it as an API call dataset 114.
[0074] In various non-limiting examples, the database 108 may include one or more hard disk drives (HDD), solid-state drives (SSD), an Advanced Technology Attachment (ATA) adapter, a Serial ATA (SATA) adapter, a Small Computer System Interface (SCSI) adapter, a redundant array of independent disks (RAID) controller, a storage area network (SAN) adapter, a network adapter, and/or any component providing the server system 102 with access to the database 108. In one implementation, the database 108 may be viewed, accessed, amended, updated, and/or deleted by an administrator (not shown) associated with the server system 102 through a database management system (DBMS) or relational database management system (RDBMS) present within the database 108.
[0075] In some other embodiments, the API call dataset 114 may be collected from other types of the data sources 106 as well. In a non-limiting example, the API call dataset 114 may include both a long-term API call data 114A and a short-term API call data 114B. In an example embodiment, the long-term API call data 114A may further include client-specific behavior details, call details, server-side details, machine details, and the like. For instance, the client-specific behavior details may include a count of API calls associated with at least one node of the nodes 104, successful requests, demographic information, etc. Similarly, the call details may include request size, response size, accepted language, authentication method, etc. Further, the server-side details for the API calls may include response time, authentication time, response code, etc. The machine details may include Central Processing Unit (CPU) Idle time, process counts, memory used, etc. In other examples, long-term API call data 114A may include IP Addresses, call numbers, time, geography, gateway captured data, firewall captured data, and the like. Further, the gateway-captured data may include response time, response code, authentication time, and the like. Similarly, the firewall-captured data may include ASM calls, the severity of flagged calls, and the like.
[0076] Therefore, the long-term API call data 114A may be data related to the API calls performed between the nodes 104 in the network 110 collected from the past for a long-term time span of about a first period, and the short-term API call data 114B refers to the same data or any other streaming data related to the API calls for a short-term time span of about a second period. It may be clear that the first period is greater than the second period. For example, if the first period is about the past 9-10 months, then the second period may be about the past 1-2 weeks. Furthermore, it is understood that as time passes the short-term API call data is combined with the long-term API call data, and new short-term API call data is generated.
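The long-term/short-term partitioning described above can be sketched as follows. This is a minimal illustration under assumed record and field names (e.g., a `timestamp` field per call record); the actual window lengths are configurable, as the disclosure notes.

```python
# Illustrative sketch: split a stream of API call records into long-term and
# short-term windows around a cutoff, rolling forward as time passes.
from datetime import datetime, timedelta

def split_api_calls(records, now, short_window=timedelta(weeks=2)):
    """Partition records (each with a 'timestamp') into long- and short-term data."""
    cutoff = now - short_window
    long_term = [r for r in records if r["timestamp"] < cutoff]
    short_term = [r for r in records if r["timestamp"] >= cutoff]
    return long_term, short_term

now = datetime(2023, 10, 11)
records = [
    {"node": "104(1)", "timestamp": datetime(2023, 1, 5)},   # months old
    {"node": "104(2)", "timestamp": datetime(2023, 10, 3)},  # within 2 weeks
]
long_term, short_term = split_api_calls(records, now)
print(len(long_term), len(short_term))  # 1 1
```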
[0077] In some embodiments, the API call dataset 114 may further include device information from BMC discovery or hash tables, Linux distribution (NIX) index for device performance, web databases on malicious IPs, secure access and authentication data, such as the unique identity of clients, and the like. Herein, BMC in the term ‘BMC discovery’ is derived from the surname initials of its founders, i.e., Scott Boulette, John Moores, and Dan Cloer. In addition, the long-term API call data 114A may be collected by the server system 102 at any level such as an IP address level at different time intervals, and may be updated on a regular basis.
[0078] The server system 102 may then be configured to fetch the API call dataset 114 from a data lake and store it in the database 108 associated with the server system 102. Further, the server system 102 may be configured to access the API call dataset 114 from the database 108. The server system 102 may be further configured to generate a plurality of features based, at least in part, on the API call dataset 114. In a non-limiting example, the plurality of features may include a set of long-term velocity features generated based on the corresponding long-term API call data 114A and a set of short-term API call features generated based on the corresponding short-term API call data 114B.
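One common form of velocity feature is a per-node call count over several trailing time windows. The sketch below illustrates this under assumed record fields and window lengths; the disclosed feature set is broader (request sizes, response codes, machine details, and so on).

```python
# Hypothetical sketch of long-term "velocity" feature generation:
# per-node API call counts over trailing 1-, 7-, and 30-day windows.
from collections import defaultdict
from datetime import datetime, timedelta

def velocity_features(records, now, windows=(1, 7, 30)):
    """Return {node: {'calls_1d': n, 'calls_7d': n, 'calls_30d': n}}."""
    feats = defaultdict(lambda: {f"calls_{d}d": 0 for d in windows})
    for r in records:
        age = now - r["timestamp"]
        for d in windows:
            if age <= timedelta(days=d):
                feats[r["node"]][f"calls_{d}d"] += 1
    return dict(feats)

now = datetime(2023, 10, 11)
records = [
    {"node": "104(1)", "timestamp": datetime(2023, 10, 10)},  # 1 day old
    {"node": "104(1)", "timestamp": datetime(2023, 9, 20)},   # 21 days old
]
print(velocity_features(records, now))
# node 104(1): 1 call in the last day and week, 2 in the last 30 days
```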
[0079] In some embodiments, the server system 102 may be configured to train AI or ML models using the plurality of features generated using the API call dataset 114. More specifically, the server system 102 may be configured to generate a first machine learning (ML) model 116 by training the first ML model 116 to learn a general distribution of a genuine API call.
[0080] The server system 102 may be further configured to determine, via the first ML model 116, a set of anomalous nodes from the plurality of nodes 104 based, at least in part, on the set of long-term velocity features and the set of short-term API call features. Further, the server system 102 may be configured to extract a subset of API calls associated with each anomalous node of the set of anomalous nodes from the short-term API call data.
[0081] Moreover, the server system 102 may be configured to generate, via the first ML model 116, a reconstruction loss corresponding to each API call of the subset of API calls based, at least in part, on analyzing the subset of API calls. The server system 102 may further be configured to generate, via the first ML model, a risk score for each API call of the subset of API calls based, at least in part, on the corresponding reconstruction loss. Upon generating the risk score, the server system 102 may be configured to decline or flag one or more API calls from the subset of API calls based, at least in part, on the risk score associated with the one or more API calls being at least equal to an anomaly detection threshold. In scenarios where the one or more API calls from the subset of API calls are flagged, the server system 102 may send a notification to an administrator of the server system (not shown) or the node operator indicating the one or more flagged API calls. Upon receiving this notification, the administrator or node operator may determine what action to take against the anomalous node based on their own set of policies.
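The scoring step above, from reconstruction loss to risk score to a decline/flag decision, can be sketched as follows. This is a minimal illustration assuming the model exposes a per-call reconstruction loss; the mapping from loss to a [0, 1] risk score and the threshold value are illustrative assumptions, not the disclosed model.

```python
# Illustrative sketch: map per-call reconstruction loss to a bounded risk
# score and decline calls at or above the anomaly detection threshold.
import math

def risk_score(reconstruction_loss, scale=1.0):
    """Squash a non-negative loss into a [0, 1) risk score."""
    return 1.0 - math.exp(-reconstruction_loss / scale)

def triage_calls(losses, threshold=0.9):
    """Decline calls whose risk score meets or exceeds the threshold."""
    declined, allowed = [], []
    for call_id, loss in losses.items():
        (declined if risk_score(loss) >= threshold else allowed).append(call_id)
    return declined, allowed

losses = {"call-1": 0.05, "call-2": 4.2}  # call-2 reconstructs poorly
declined, allowed = triage_calls(losses)
print(declined, allowed)  # ['call-2'] ['call-1']
```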
[0082] In addition, in some embodiments, the server system 102 may be configured to generate a second ML model 118 by training the second ML model 118 to learn to further optimize the set of anomalous nodes for predicting highly risky anomalous nodes of the set of anomalous nodes. Thus, the server system 102 may be configured to determine, via the second ML model, a subset of anomalous nodes from the set of anomalous nodes to be labeled as highly risky anomalous nodes based, at least in part, on analyzing the risk score generated for each API call of the subset of API calls associated with each anomalous node of the set of anomalous nodes.
[0083] It may be noted that the first ML model 116 and the second ML model 118 may be stored in the database 108 as shown in FIG. 1. It may also be noted that details of how the first ML model 116 and the second ML model 118 may be generated, how the set of anomalous nodes may be determined, how the risk score may be generated, and additional embodiments associated with the configurations of the server system 102 are explained further with reference to FIGS. 2 to 10A-10D later in the present disclosure.
[0084] It should be understood that the server system 102 is a separate part of the environment 100, and may operate apart from (but still in communication with, for example, via the network 110) any third-party external servers (to access data to perform the various operations described herein). However, in other embodiments, the server system 102 may be incorporated, in whole or in part, into one or more parts of the environment 100.
[0085] The number and arrangement of systems, devices, and/or networks shown in FIG. 1 are provided as an example. There may be additional systems, devices, and/or networks; fewer systems, devices, and/or networks; different systems, devices, and/or networks; and/or differently arranged systems, devices, and/or networks than those shown in FIG. 1. Furthermore, two or more systems or devices shown in FIG. 1 may be implemented within a single system or device, or a single system or device shown in FIG. 1 may be implemented as multiple, distributed systems or devices. In addition, the server system 102 should be understood to be embodied in at least one computing device in communication with the network 110, which may be specifically configured, via executable instructions, to perform steps as described herein, and/or embodied in at least one non-transitory computer-readable media.
[0086] FIG. 2 illustrates a simplified block diagram of a server system 200, in accordance with an embodiment of the present disclosure. The server system 200 is identical to the server system 102 of FIG. 1. In some embodiments, the server system 200 is embodied as a cloud-based and/or SaaS-based (software as a service) architecture.
[0087] The server system 200 includes a computer system 202 and a database 204. The database 204 is identical to the database 108 of FIG. 1. The computer system 202 includes at least one processor 206 for executing instructions, a memory 208, a communication interface 210, a user interface 212, and a storage interface 214. The one or more components of the computer system 202 communicate with each other via a bus 216. The components of the server system 200 provided herein may not be exhaustive and the server system 200 may include more or fewer components than those depicted in FIG. 2. Further, two or more components depicted in FIG. 2 may be embodied in one single component, and/or one component may be configured using multiple sub-components to achieve the desired functionalities.
[0088] In some embodiments, the database 204 is integrated within the computer system 202. In one non-limiting example, the database 204 is configured to store an API call dataset 218 including a long-term API call data 218A and a short-term API call data 218B, an instance of the API security application 112, one or more components of the API security application 112, and the like. The one or more components of the API security application 112 may include a first machine learning (ML) model 220 and a second ML model 222. The API call dataset 218, the long-term API call data 218A, the short-term API call data 218B, the first ML model 220, and the second ML model 222 are identical to the API call dataset 114, the long-term API call data 114A, the short-term API call data 114B, the first ML model 116 and the second ML model 118 of FIG. 1.
[0089] Further, the computer system 202 may include one or more hard disk drives as the database 204. The storage interface 214 is any component capable of providing the processor 206 access to the database 204. The storage interface 214 may include, for example, an Advanced Technology Attachment (ATA) adapter, a Serial ATA (SATA) adapter, a Small Computer System Interface (SCSI) adapter, a RAID controller, a SAN adapter, a network adapter, and/or any component providing the processor 206 with access to the database 204.
[0090] The processor 206 includes suitable logic, circuitry, and/or interfaces to execute operations for facilitating a plurality of nodes (e.g., the nodes 104) in a network (e.g., the network 110) with call level security for securing an API network from risky API calls between the nodes 104. Examples of the processor 206 include, but are not limited to, an application-specific integrated circuit (ASIC) processor, a reduced instruction set computing (RISC) processor, a graphical processing unit (GPU), a complex instruction set computing (CISC) processor, a field-programmable gate array (FPGA), and the like.
[0091] The memory 208 includes suitable logic, circuitry, and/or interfaces to store a set of computer-readable instructions for performing operations. Examples of the memory 208 include a random-access memory (RAM), a read-only memory (ROM), a removable storage drive, a hard disk drive (HDD), and the like. It will be apparent to a person skilled in the art that the scope of the disclosure is not limited to realizing the memory 208 in the server system 200, as described herein. In another embodiment, the memory 208 may be realized in the form of a database server or a cloud storage working in conjunction with the server system 200, without departing from the scope of the present disclosure.
[0092] The processor 206 is operatively coupled to the communication interface 210, such that the processor 206 is capable of communicating with a remote device 224 such as the nodes 104 and a plurality of data sources (e.g., the data sources 106), or communicating with any entity connected to the network 110 (as shown in FIG. 1).
[0093] It is noted that the server system 200 as illustrated and hereinafter described is merely illustrative of an apparatus that could benefit from embodiments of the present disclosure and, therefore, should not be taken to limit the scope of the present disclosure. It is noted that the server system 200 may include fewer or more components than those depicted in FIG. 2. It should be noted that the server system 200 is identical to the server system 102 described in reference to FIG. 1.
[0094] In one implementation, the processor 206 includes a data pre-processing module 226, a feature engineering module 228, an anomaly detection module 230, a scoring module 232, and an optimization module 234. It should be noted that components, described herein, such as the data pre-processing module 226, the feature engineering module 228, the anomaly detection module 230, the scoring module 232, and the optimization module 234 can be configured in a variety of ways, including electronic circuitries, digital arithmetic, and logic blocks, and memory systems in combination with software, firmware, and embedded technologies.
[0095] In an embodiment, the data pre-processing module 226 includes suitable logic and/or interfaces for accessing the API call dataset 218 related to the set of API calls performed between the nodes connected in the network 110 from the database 204 associated with the server system 200. As may be understood, the API call dataset 218 includes the long-term API call data 218A and the short-term API call data 218B; the long-term API call data 218A may include data related to the API calls performed between the nodes 104 in the network 110 collected from the past for a long-term time span of about a first period, and the short-term API call data 218B refers to the same data or any other streaming data related to the API calls for a short-term time span of about a second period. It may be clear that the first period is greater than the second period. For example, if the first period is about the past 9-10 months, then the second period may be about the past 1-2 weeks. It is noted that the timespan which differentiates the long-term API calls from the short-term API calls may be determined or set by the node operator or the administrator of the server system.
[0096] In some embodiments, the API call dataset 218 may be collected from different data sources (e.g., the data sources 106) in a data lake and then transferred to the database 204 where the collected API call dataset 218 may be stored in a structured format. The API call dataset 218 may then be fed to the feature engineering module 228.
[0097] In an embodiment, the feature engineering module 228 includes suitable logic and/or interfaces for generating a plurality of features based, at least in part, on the API call dataset 218. It may be noted that as the API call dataset 218 includes the long-term API call data 218A and the short-term API call data 218B, the plurality of features may include a set of long-term velocity features and a set of short-term API call features. The set of long-term velocity features may correspond to the long-term API call data 218A, and the set of short-term API call features may correspond to the short-term API call data 218B.
[0098] The set of long-term velocity features may be generated to represent historical behavior or clients’ requests (e.g., API calls from one or more nodes of the nodes 104). Further, from the definition of feature engineering, the feature engineering module 228 may be configured to select, generate, create, and transform the plurality of features to enhance the predictive power of an artificial intelligence (AI) or machine learning (ML) model which is trained to achieve the objective of the present disclosure, i.e., the first ML model and the second ML model.
[0099] In some embodiments, the feature engineering module 228 may generate the plurality of features using one or more feature engineering techniques. For instance, the one or more feature engineering techniques may include feature extraction, one-hot encoding, binning, scaling and normalization, feature interactions, polynomial features, time-based features, domain-specific feature engineering, and the like.
[00100] It may be understood that the process implemented by the server system 200 via the feature engineering module 228 may be iterative and may require a good understanding of the data, domain knowledge, and experimentation. The plurality of features generated may capture the underlying patterns and relationships in the data (e.g., the API call dataset 218). This nature of the plurality of features improves the model’s performance and interpretability. The plurality of features may be then fed to the anomaly detection module 230.
[00101] In an embodiment, the anomaly detection module 230 includes suitable logic and/or interfaces for determining, via the first ML model 220, a set of anomalous nodes from the plurality of nodes (e.g., the nodes 104) based, at least in part, on the set of long-term velocity features and the set of short-term API call features. For determining the set of anomalous nodes using the first ML model 220, the first ML model 220 may have to be generated. Thus, in an embodiment, the anomaly detection module 230 may be configured to generate the first ML model 220 by performing a set of operations for training the first ML model 220 to learn a general distribution of a genuine API call. The set of operations may include extracting a first set of features from the set of long-term velocity features, the first set of features corresponding to a non-malicious API call dataset from the long-term API call data 218A, computing a reconstruction loss based, at least in part, on analyzing the first set of features, and minimizing the reconstruction loss by optimizing one or more ML model parameters. The details associated with the generation of the first ML model 220 are explained with reference to FIG. 7.
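The training loop described above, compute a reconstruction loss over genuine-call features and minimize it by optimizing model parameters, can be illustrated on a toy stand-in. The VAE itself requires a deep-learning framework; the sketch below uses a model whose only parameter is a profile vector, trained by gradient descent on mean squared reconstruction loss, purely to show the loop shape. All names and values are assumptions.

```python
# Toy stand-in for the training step: gradient descent minimizing mean
# squared reconstruction loss over genuine-call feature vectors.

def train_profile(features, lr=0.1, epochs=200):
    dim = len(features[0])
    profile = [0.0] * dim  # the model's parameters
    for _ in range(epochs):
        # Gradient of mean squared reconstruction loss w.r.t. the profile.
        grad = [0.0] * dim
        for x in features:
            for i in range(dim):
                grad[i] += 2 * (profile[i] - x[i]) / len(features)
        profile = [p - lr * g for p, g in zip(profile, grad)]
    return profile

def reconstruction_loss(profile, x):
    return sum((p - v) ** 2 for p, v in zip(profile, x))

genuine = [[1.0, 0.0], [0.9, 0.1], [1.1, -0.1]]  # assumed feature vectors
profile = train_profile(genuine)
print(reconstruction_loss(profile, [1.0, 0.0]) < 0.01)   # genuine: low loss
print(reconstruction_loss(profile, [9.0, 5.0]) > 10)     # anomalous: high loss
```

In the disclosed system the "parameters" are the VAE's encoder and decoder weights rather than a single vector, but the objective, minimizing reconstruction loss on genuine calls so that anomalous calls reconstruct poorly, is the same.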
[00102] Upon generating the first ML model 220, the first ML model 220 is then used by the anomaly detection module 230 for detecting the set of anomalous nodes. Further, for determining the set of anomalous nodes, the anomaly detection module 230 may be configured to generate, via the first ML model 220, a standard behavioral profile for non-malicious nodes, generate a behavioral profile for each of the plurality of nodes (e.g., the nodes 104), compute, via the first ML model, a behavioral discrepancy probability, and assign, via the first ML model 220, an anomalous identity label to one or more nodes of the plurality of nodes (e.g., the nodes 104) when the behavioral discrepancy probability is at least equal to a pre-determined threshold probability. The details of the steps involved in the determination of the set of anomalous nodes are explained with reference to FIG. 8. The set of anomalous nodes may be fed to the scoring module 232.
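The node-level check above, compare each node's behavioral profile against the standard non-malicious profile, compute a behavioral discrepancy probability, and label nodes at or above the threshold as anomalous, can be sketched as follows. The distance metric, the logistic mapping to a probability, and the threshold are illustrative assumptions, not the disclosed computation.

```python
# Illustrative sketch: behavioral discrepancy probability between a node's
# profile and the standard (non-malicious) profile, with threshold labeling.
import math

def discrepancy_probability(profile, standard, scale=1.0):
    """Map the Euclidean distance between profiles to a (0, 1) probability."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(profile, standard)))
    return 1.0 / (1.0 + math.exp(-(dist - scale)))  # logistic over distance

def anomalous_nodes(profiles, standard, threshold=0.8):
    """Assign the anomalous label to nodes meeting the threshold probability."""
    return [node for node, p in profiles.items()
            if discrepancy_probability(p, standard) >= threshold]

standard = [1.0, 0.0]                                  # assumed standard profile
profiles = {"104(1)": [1.05, 0.02], "104(7)": [6.0, 3.0]}
print(anomalous_nodes(profiles, standard))  # ['104(7)']
```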
[00103] In an embodiment, the scoring module 232 includes suitable logic and/or interfaces for extracting a subset of API calls associated with each anomalous node of the set of anomalous nodes from the short-term API call data 218B. The scoring module 232 may further be configured to generate, via the first ML model 220, a reconstruction loss corresponding to each API call of the subset of API calls based, at least in part, on analyzing the subset of API calls.
[00104] Further, the scoring module 232 may be configured to generate a risk score for each API call of the subset of API calls based, at least in part, on the corresponding reconstruction loss. It may be noted that, upon feeding the short-term API call features to the first ML model 220, a new reconstruction loss may be generated by the first ML model 220. Thereafter, the risk score is generated based on this new reconstruction loss.
[00105] In some embodiments, when the risk score associated with one or more API calls from the subset of API calls is greater than or equal to an anomaly detection threshold, the scoring module 232 may be configured to decline or flag the corresponding one or more API calls. The steps involved in the generation of the risk score are explained with reference to FIG. 9.
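As a non-limiting sketch of the decline/flag decision described above, a reconstruction loss may be normalized into a risk score and compared against the anomaly detection threshold; the specific normalization against a genuine-baseline loss, and the names `risk_score` and `triage`, are illustrative assumptions:

```python
def risk_score(reconstruction_loss, baseline_loss):
    """Normalize a reconstruction loss into a risk score in [0, 1);
    calls that reconstruct far worse than the genuine baseline
    score closer to 1."""
    return reconstruction_loss / (reconstruction_loss + baseline_loss)

def triage(calls_with_losses, baseline_loss, anomaly_threshold=0.9):
    """Decline API calls whose risk score is greater than or equal to
    the anomaly detection threshold; allow the rest."""
    declined, allowed = [], []
    for call_id, loss in calls_with_losses:
        if risk_score(loss, baseline_loss) >= anomaly_threshold:
            declined.append(call_id)
        else:
            allowed.append(call_id)
    return declined, allowed
```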
[00106] In some embodiments, a subset of anomalous nodes that may be highly risky may be determined from the set of anomalous nodes. Thus, the optimization module 234 includes suitable logic and/or interfaces for determining, via the second ML model 222, the subset of anomalous nodes from the set of anomalous nodes to be labeled as highly risky anomalous nodes based, at least in part, on analyzing the risk score generated for each API call of the subset of API calls associated with each anomalous node of the set of anomalous nodes. In a non-limiting example, the second ML model 222 may be a Natural Language Processing (NLP) model trained using deep learning neural networks. The steps involved in analyzing the risk score for the generation of the subset of anomalous nodes are explained in detail with reference to FIG. 6D.
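A minimal, non-limiting sketch of such node-level triage is shown below; the mean-over-calls aggregation and the node-level threshold are illustrative assumptions, as the disclosure leaves the exact labeling rule to the second ML model 222:

```python
def highly_risky_nodes(node_call_scores, node_threshold=0.8):
    """Label an anomalous node highly risky when the mean risk score
    over its subset of API calls reaches a node-level threshold
    (an assumed aggregation rule, for illustration only)."""
    return {node for node, scores in node_call_scores.items()
            if sum(scores) / len(scores) >= node_threshold}
```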
[00107] In some other embodiments, the processor 206 may further include an alert generation module (not shown in FIG. 2). The alert generation module includes suitable logic and/or interfaces for generating one or more alerts to API management authorities when the risk score associated with one or more API calls of the subset of API calls is at least equal to the anomaly detection threshold.
[00108] In addition, the processor 206 may further include a report generation module (not shown in FIG. 2). The report generation module includes suitable logic and/or interfaces for generating a report for the set of API calls. The report may include a data summary corresponding to overall malicious traffic on an API ecosystem. In a non-limiting example, the report may be generated using a Natural Language Generation (NLG) model. The steps involved in the generation of the report are explained in detail with reference to FIG. 6C.
[00109] FIG. 3 illustrates a simplified block diagram representation of a node 300 configured to run a protocol corresponding to an API associated with a software application running on the node 300, in accordance with an embodiment of the present disclosure. It should be noted that the node 300 may be the same as at least one of the nodes 104 of FIG. 1. In an embodiment, the node 300 includes at least one processor, such as a processor 302. The processor 302 is depicted to include a data processing engine 304 and a generation engine 306. It should be noted that the processor 302 may include other modules for its operation as well.
[00110] The node 300 may further include a memory module 308, an input/output (I/O) module 310, a storage module 312, and a communication module 314. It is noted that although the node 300 is depicted to include only one processor, the node 300 may include a greater number of processors therein. In an embodiment, the memory module or the memory 308 is capable of storing machine-executable instructions. Further, the processor 302 is capable of executing the machine-executable instructions. In an embodiment, the processor 302 may be embodied as a multi-core processor, a single-core processor, or a combination of one or more multi-core processors and one or more single-core processors. For example, the processor 302 may be embodied as one or more of various processing devices, such as a coprocessor, a microprocessor, a controller, a digital signal processor (DSP), a processing circuitry with or without an accompanying DSP, or various other processing devices including integrated circuits such as, for example, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a microcontroller unit (MCU), a hardware accelerator, a special-purpose computer chip, and/or the like. In an embodiment, the processor 302 may be configured to execute hard-coded functionality. In an embodiment, the processor 302 is embodied as an executor of software instructions, wherein the instructions may specifically configure the processor 302 to perform the algorithms and/or operations described herein when the instructions are executed. The modules of the processor 302 may be implemented as software modules, hardware modules, firmware modules, or as a combination thereof.
[00111] The memory 308 may be embodied as one or more volatile memory devices, one or more non-volatile memory devices, and/or a combination of one or more volatile memory devices and non-volatile memory devices. For example, the memory 308 may be embodied as semiconductor memories (such as mask ROM, PROM (programmable ROM), EPROM (erasable PROM), flash memory, RAM (random access memory), etc.), magnetic storage devices (such as hard disk drives, floppy disks, magnetic tapes, etc.), optical magnetic storage devices (e.g., magneto-optical disks), CD-ROM (compact disc read-only memory), CD-R (compact disc recordable), CD-R/W (compact disc rewritable), DVD (Digital Versatile Disc) and BD (BLU-RAY® Disc), and/or the like.
[00112] In at least some embodiments, the memory 308 stores logic and/or instructions, which may be used by modules of the processor 302, such as the data processing engine 304, the generation engine 306, for implementing the protocol corresponding to the API. For example, the memory 308 includes logic for receiving and/or transmitting one or more API call requests, processing the API call requests, generating one or more API call responses, and transmitting and/or receiving the one or more API call responses.
[00113] In an embodiment, the I/O module 310 may include mechanisms configured to receive inputs from and provide outputs to an operator(s) of the node 300. To that effect, the I/O module 310 may include at least one input interface and/or at least one output interface. Examples of the input interface may include, but are not limited to, a keyboard, a mouse, a joystick, a keypad, a touch screen, soft keys, a microphone, and the like. Examples of the output interface may include, but are not limited to, a display such as a light-emitting diode display, a thin-film transistor (TFT) display, a liquid crystal display, an active-matrix organic light-emitting diode (AMOLED) display, a microphone, a speaker, a ringer, a vibrator, and/or the like. In an example embodiment, the processor 302 may include I/O circuitry configured to control at least some functions of one or more elements of the I/O module 310, such as, for example, a speaker, a microphone, a display, and/or the like. The processor 302 and/or the I/O circuitry may be configured to control one or more functions of the one or more elements of the I/O module 310 through computer program instructions, for example, software and/or firmware, stored on a memory, for example, the memory 308, and/or the like, accessible to the processor 302.
[00114] The communication module 314 may include communication circuitry such as, for example, a transceiver circuitry including an antenna and other communication media interfaces to connect to a communication network, such as the network 110 shown in FIG. 1. The communication circuitry may, in at least some example embodiments, enable reception and/or transmission of the one or more API call requests or responses between the node 300 and the network 110. The communication circuitry may further be configured to enable processing of data associated with the one or more API call requests, generating the one or more API call responses, and then transmission and/or reception of the one or more API call responses.
[00115] The storage module 312 is any computer-operated hardware suitable for storing and/or retrieving data. In one embodiment, the storage module 312 includes a repository for storing the data related to the one or more API call requests, the one or more call responses, node operator details, and the like. The storage module 312 may include multiple storage units such as hard drives and/or solid-state drives in a redundant array of inexpensive disks (RAID) configuration. In some embodiments, the storage module 312 may include a storage area network (SAN) and/or a network-attached storage (NAS) system.
[00116] In some embodiments, the processor 302 and/or other components of the processor 302 may access the storage module 312 using a storage interface (e.g., the storage interface 214 shown in FIG. 2). The storage interface may include, for example, an Advanced Technology Attachment (ATA) adapter, a Serial ATA (SATA) adapter, a Small Computer System Interface (SCSI) adapter, a RAID controller, a SAN adapter, a network adapter, and/or any component providing the processor 302 and/or the modules of the processor 302 with access to the storage module 312.
[00117] The various components of the node 300, such as the processor 302, the memory 308, the I/O module 310, the storage module 312, and the communication module 314 are configured to communicate with each other via or through a centralized circuit system 316. The centralized circuit system 316 may be various devices configured to, among other things, provide or enable communication between the components of the node 300. In certain embodiments, the centralized circuit system 316 may be a central printed circuit board (PCB) such as a motherboard, a main board, a system board, or a logic board. The centralized circuit system 316 may also, or alternatively, include other printed circuit assemblies (PCAs) or communication channel media.
[00118] In an embodiment, the node 300 may be a client node or a server node depending upon whether the node 300 is involved in generating the one or more API requests or the one or more API responses. For instance, the node 300 may be a client node having an application ‘x’ running on the node 300. In such a scenario, to complete a task on the application ‘x’ for a node operator, the application ‘x’ may have to receive certain information from another application running on another node in the network 110. For the application ‘x’ to receive such information, the application ‘x’ may have to make an API call request to an API of another application running on another node.
[00119] Initially, the data processing engine 304 may receive the data required for generating an API call request. The data may include node operator details, recipient node details, recipient node operator details, recipient node API details, and the like. Further, the data processing engine 304 may analyze the data and prepare the data for the generation of the API call request. The output of the data processing engine 304 is fed to the generation engine 306. The generation engine 306 may be configured to generate the API call request by embedding the data together in the form of a request. Further, the API call request may then be transmitted to the API of the other node via the communication module 314, for the other node to generate an API call response. Upon the generation of the API call response, it is transmitted to the node 300. The API call response may be received by the node 300 via the communication module 314. Upon receiving the API call response, the data processing engine 304 may process and analyze the API call response to validate whether the API call response received from the other node is in accordance with the API call request sent by the node 300. Upon validating the API call response, the application ‘x’ running on the node 300 may continue its normal functioning and complete its task.
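A non-limiting sketch of the request generation and response validation performed by the generation engine 306 and the data processing engine 304 may look as follows; the request shape and the correlation of a response to its request by a request identifier are illustrative assumptions:

```python
import uuid

def build_api_call_request(operator, recipient_api, payload):
    """Embed the prepared data together in the form of an API call
    request (the field names are assumed for illustration)."""
    return {
        "request_id": str(uuid.uuid4()),  # correlation identifier
        "operator": operator,
        "target_api": recipient_api,
        "payload": payload,
    }

def validate_response(request, response):
    """Validate that a received API call response is in accordance
    with the API call request that was sent."""
    return response.get("request_id") == request["request_id"]
```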
[00120] In another embodiment, the node 300 may be a server node that provides an API for other applications to enable them to fetch information from the application running on the server node. In such an embodiment, similar steps as explained above may be performed by the node 300 and its components, such as the data processing engine 304 and the generation engine 306, as performed in the case when the node 300 is the client node. The exception is that the data processing engine 304 is now configured to process and analyze data for the generation of the API call response upon receiving an API call request from other nodes. The generation engine 306 is configured to generate the API call response based on the analysis of the API call request.
[00121] FIG. 4A illustrates a block diagram representation of an environment 400 related to at least some example embodiments of the present disclosure. Although the environment 400 is presented in one arrangement, other embodiments may include the parts of the environment 400 (or other parts) arranged otherwise depending on, for example, how the following operations are implemented: determining a set of anomalous users by generating a behavioral profile for each user using an ML model, generating a risk score for each API call associated with each anomalous user, and declining API calls based, at least in part, on a risk score associated with the corresponding API calls being at least equal to an anomaly detection threshold.
[00122] The environment 400 generally represents a scenario in which the network of nodes (e.g., the nodes 104) corresponds to a client-server network as explained above. Herein, the nodes may be considered to have at least one entity server playing a role of a server and at least one user device associated with a user playing a role of a client. Thus, the environment 400 includes a plurality of components such as a server system 402, an entity server 404, a plurality of user devices 406(1), 406(2), … 406(N) (collectively, referred to as user devices 406), where ‘N’ is a non-zero Natural number, associated with a plurality of users (hereinafter, interchangeably also referred to as users), a plurality of data sources 408(1), 408(2), … 408(N) (collectively, referred to as data sources 408), where ‘N’ is a non-zero Natural number, and a database 410, each coupled to, and in communication with (and/or with access to), a network 412. It should be noted that the server system 402, the data sources 408, the database 410, and the network 412 are substantially similar to the server system 102, the data sources 106, the database 108, and the network 110, respectively, of FIG. 1. The entity server 404 and the user devices 406 are examples of the nodes 104 of FIG. 1.
[00123] In an embodiment, the entity server 404 may refer to a separate legal entity with a primary function to perform business activities (e.g., selling goods, products, and/or services in exchange for cash/currency). In an example, the entity server 404 may refer to an established bank that provides various financial services (e.g., banking services, loans, financial products such as debit/credit payment cards, investment services, etc.) to the users who are accessing services provided by the entity server 404 via the user devices 406. In various examples, the entity server 404 may be any establishment serving people such as social media, entertainment websites, government enterprises, non-government organization (NGO) websites, internal portals associated with the user’s workplace, etc. The entity server 404 may have a physical establishment and/or even exist virtually in a web or metaverse space. The entity server 404 may provide products (such as goods, services, etc.) to the users.
[00124] In an embodiment, a user may refer to a person/organization that operates a user device (e.g., the user device 406(1)). The user devices 406 may be any portable communication device or any fixed communication device. As explained above, in one scenario, the user may access the services/products provided by the entity server 404 by logging in through a login portal associated with the website or API operated by an application 414 running on the entity server 404, as an instance of the application 414 is also provided to the user on the user device (e.g., the user device 406(1)) as shown in FIG. 4A. Therefore, in some embodiments, each of the user devices 406 may be provided with an instance of the application 414 running on the entity server 404 for the users of the respective user devices 406 to be able to access the services and/or products provided by the application 414. In some scenarios, the login process may require the user to produce a user identity (user ID) along with a secure password/pin/passcode to authenticate their identity. If the user enters the correct information in the login portal, the identity of the user is successfully validated, and the user can then access the services/products provided by the entity server 404. However, if the user is unable to provide the correct information at the login portal for any reason (e.g., the user may have forgotten the credentials for the login portal), the entity server 404 may authenticate the identity of the user by any predefined alternative means.
[00125] Further, upon successfully logging in on the application 414, for accessing the services/products provided by the entity server 404 via the application 414, the user may select a command on an application user interface (UI) provided on the user device (e.g., the user device 406(1)) by the instance of the application 414 running on the user device 406(1). Based on the command selected, an API call may be generated for transmission to an API of the application 414 on the entity server 404.
[00126] Assume that the API of the application 414 receives multiple API calls from different users via different user devices 406, and that an identity of each user or user device may be identified using an Internet Protocol (IP) address. Out of all the multiple API calls that the application 414 on the entity server 404 has received, some may be fraudulent or spoofed. Therefore, to ensure the security of the API both at the call level and at an identity level of clients, the server system 402 proposed in the present disclosure may be beneficial.
[00127] The server system 402 may provide an API security application 416. The API security application 416 may be substantially similar to the API security application 112 of FIG. 1. In an embodiment, the entity server 404 may have an instance of the API security application 416 running on it. In another embodiment, the server system 402 may be the entity server 404. Therefore, the API security application 416 may perform operations as explained with reference to the API security application 112 of FIG. 1 for facilitating the entity server 404 with call level security for securing the API calls to the API of the application 414 of the entity server 404 from other nodes such as the user devices 406 in the API network/ecosystem.
[00128] For example, the server system 402 may be configured to access the API call dataset 114 including the long-term API call data 114A and the short-term API call data 114B, from the database 410 of the server system 402. The server system 402 may further be configured to generate a plurality of features including a set of long-term velocity features corresponding to the long-term API call data 114A and a set of short-term API call features corresponding to the short-term API call data 114B.
[00129] Further, the server system 402 may be configured to determine a set of anomalous users from a plurality of users associated with a plurality of user devices 406 respectively that are a part of the network 412. The server system 402 may determine the set of anomalous users based on the set of long-term velocity features and the set of short-term API call features using the first ML model 116. Later, the server system 402 may then extract a subset of API calls associated with each anomalous user of the set of anomalous users from the short-term API call data. A reconstruction loss may then be generated by analyzing the subset of API calls using the first ML model 116.
[00130] The server system 402 may generate a risk score for each API call of the subset of API calls based on the reconstruction loss and using the first ML model 116. Based on the risk score generated for one or more API calls, the one or more API calls with the risk score greater than or equal to the anomaly detection threshold may be declined, thereby securing the API of the application 414 running on the entity server 404.
[00131] FIG. 4B illustrates a block diagram representation 420 of a data communication flow between the server system 402 and one or more user devices 406 of the environment 400 of FIG. 4A, in accordance with an embodiment of the present disclosure. More specifically, FIG. 4B illustrates an alternative scenario in which different applications running on different devices communicate with each other as a part of the client-server network having the one or more user devices 406 and the entity server 404 communicate with each other via the network 412.
[00132] For instance, the different applications may include a user-end application 422 running on the user devices 406 and an entity-end application 424 running on the entity server 404. In an embodiment, the entity-end application 424 may provide multiple services such as microservices 426A, 426B, and 426C via multiple APIs such as APIs 428A, 428B, and 428C, respectively. In an example, the user devices 406 may include a user database 430. In various non-limiting examples, the user devices 406 may include a laptop 406(1), a mobile phone 406(2), a desktop computer 406(3), a tablet 406(4), and other similar communication devices. In one implementation, the user database 430 may include information associated with the user device 406 of the user. In various non-limiting examples, the user database 430 may include device configuration data, metadata, user profile data, and the like.
[00133] In one non-limiting example, the user database 430 may be implemented using memory. Some non-limiting examples of the memory may include a random-access memory (RAM), a read-only memory (ROM), a removable storage drive, a hard disk drive (HDD), a solid-state drive (SSD), and the like. In some other examples, the memory may be realized in the form of a cloud storage working in conjunction with the user device 406, without deviating from the scope of the present disclosure.
[00134] As explained above, in such a scenario, the user-end application 422 may access the multiple services (e.g., the microservice 426A) provided by the entity-end application 424, by accessing the multiple APIs (e.g., the API 428A) via an API gateway 432. Therefore, the API gateway 432 may be associated with the server system 402. Further, in an embodiment, the API security application 416 may be running on the server system 402. In another embodiment, the API security application 416 may be running on the API gateway 432.
[00135] In some embodiments, the API gateway 432 may communicate with an entity data repository 434 associated with the entity server 404. The entity data repository 434 may be adapted to store details related to the multiple services or microservices 426A-426C that are provided with the APIs 428A-428C, respectively. Therefore, the entity data repository 434 may also store details related to the APIs.
[00136] In a non-limiting example, the API gateway 432 is facilitated via the API security application 416 to communicate the API calls between the user devices 406 and the entity data repository 434. In one embodiment, the API gateway 432 receives an API call from a user via the mobile phone 406(2) when the user selects an option on a UI on the mobile phone 406(2) provided by the user-end application 422 running on the mobile phone 406(2).
[00137] It may be noted that the API gateway 432, having the server system 402 operating on it, may be continuously monitoring various API calls in real-time based on what it has learned using the first ML model 220, the second ML model 222, and the API call dataset 218 (as shown in FIG. 2). As may be understood, the API gateway 432, via the server system 402, has learned the behavior of a genuine API call. Therefore, as it receives new API calls, the API gateway 432 may analyze the API calls by comparing a behavioral profile corresponding to the API calls, and to the user from whose user device the API calls are received, with that of genuine API calls. During the process, a risk score may be generated for each API call that is newly received via the API gateway 432.
[00138] In one scenario, the risk score may be less than a threshold (e.g., the anomaly detection threshold) (see, 436). Therefore, the corresponding API call may be concluded to be a genuine API call as the risk score is less than the threshold. Suppose the API call is for the microservice 426A; the genuine API call may then be transmitted to the entity-end application 424 via the API 428A. Upon receiving the API call, the entity-end application 424 may generate an API call response corresponding to the microservice 426A, which may be allowed (see, 438) to be transmitted to the user-end application 422 in the mobile phone 406(2) of the corresponding user via the API gateway 432.
[00139] In a second scenario, the risk score may be greater than or equal to the threshold (e.g., the anomaly detection threshold) (see, 440). Therefore, the corresponding API call may be concluded to be a malicious API call as the risk score is at least equal to the threshold. Suppose the API call is for the microservice 426C, and the malicious API call is transmitted to the entity-end application 424 via the API 428C. Upon receiving the API call, the entity-end application 424 may generate an API call response corresponding to the microservice 426C. However, since the API call is concluded to be malicious, the API call is declined (see, 442), and hence the API call response is not allowed to be transmitted to the user-end application 422 in the mobile phone 406(2) of the corresponding user via the API gateway 432.
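The two scenarios above may be summarized, in a non-limiting sketch, as a single gateway routing decision; the function shape, the scoring callback, and the dispatch callback below are illustrative assumptions:

```python
def route_api_call(call, score_fn, threshold, dispatch):
    """API-gateway routing: score the incoming call against the learned
    genuine-behavior profile via score_fn; forward a genuine call to the
    target microservice via dispatch, decline a flagged one."""
    risk = score_fn(call)
    if risk >= threshold:
        # risk score at least equal to the anomaly detection threshold
        return {"status": "declined", "risk": risk}
    response = dispatch(call)  # forward to the entity-end application
    return {"status": "allowed", "risk": risk, "response": response}
```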
[00140] FIGS. 4C and 4D, collectively, illustrate a sequential flow diagram 450 for facilitating API call security for an API call session between a plurality of user devices (the user device 406(1) and the user device 406(2)), the data source 408(1), the entity server 404, and the server system 402, in accordance with an example embodiment of the present disclosure. The API call session is facilitated by the applications (e.g., the application 414, or the user-end application 422 and the entity-end application 424) in the user devices and/or the entity server 404. However, the server system 402 facilitates call level security for securing the API calls between the user devices 406 and the entity server 404 in the API ecosystem. The user devices 406(1) and 406(2), the data source 408(1), the server system 402, and the entity server 404 are described with reference to FIG. 4A. Moreover, considering that the user-end application 422 runs on the user devices 406(1) and 406(2), that the entity-end application 424 runs on the entity server 404, and that the user-end application 422 communicates with the entity-end application 424, the sequence of operations is as follows and starts from step 452.
[00141] At step 452, the user device 406(1) generates an API call. It may be noted that the user-end application 422 facilitates the user device 406(1) to generate the API call for an API of the entity-end application 424 on the entity server 404 upon selecting an option on a graphical user interface (GUI) on the user device 406(1).
[00142] At step 454, the user device 406(2) generates an API call. It may be noted that the user-end application 422 facilitates the user device 406(2) to generate the API call for an API of the entity-end application 424 on the entity server 404 upon selecting an option on another GUI on the user device 406(2).
[00143] At step 456 and step 458, the server system 402 receives the API calls from the user devices 406(1) and 406(2). As may be understood, the API calls from the user devices 406(1) and 406(2) are intended to be transmitted to the API of the entity-end application 424. However, the entity server 404 has registered with the server system 402 for receiving a service of API call security from the server system 402. Therefore, in an embodiment, the API calls may first pass through the server system 402.
[00144] At step 460, the data source 408(1) also receives the API calls from the user devices 406(1) and 406(2). It may be understood that the data source 408(1), as explained above, may correspond to an existing solution system that facilitates API security to APIs, and the server system 402 receives the API call dataset from such data sources. Thus, the API calls from the user devices 406(1) and 406(2) may also be transmitted to such data sources (e.g., the data source 408(1)). For instance, the data source 408(1) may include one of firewalls, load balancers, gateways, and the like.
[00145] At step 462, the server system 402 facilitates the data source 408(1) to provide the API call dataset (e.g., the API call dataset 114). The server system 402 may request the data source 408(1) for the API call dataset 114. In response to the request, the data source 408(1) may provide access to an API associated with an application in the data source 408(1), so that the API call dataset 114 that the data source 408(1) collects from various API calls may be accessible to the server system 402.
[00146] At step 464, the server system 402 receives the API call dataset 114 from the data source 408(1). In an embodiment, the API call dataset 114 received from the data source 408(1) may be stored in the database 204 associated with the server system 402. As may be understood, the API call dataset 114 may include the long-term API call data 114A and the short-term API call data 114B.
[00147] At step 466, the server system 402 generates a set of features (hereinafter, interchangeably also termed as features or a plurality of features) corresponding to the API call dataset 114. It is understood that the features may include a set of long-term velocity features corresponding to the long-term API call data 114A and a set of short-term API call features corresponding to the short-term API call data 114B.
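By way of a non-limiting illustration, the set of long-term velocity features may be realized as per-node API call counts over trailing time windows; the window sizes (one hour, one day, one week) and the function name below are illustrative assumptions:

```python
from collections import defaultdict

def velocity_features(call_log, windows=(3600, 86400, 604800)):
    """Per-node API call velocities: for each node, count the calls
    falling inside each trailing window (seconds), measured back from
    the most recent timestamp in the log. call_log is an iterable of
    (node_id, unix_timestamp) pairs."""
    per_node = defaultdict(list)
    for node, ts in call_log:
        per_node[node].append(ts)
    now = max(ts for _, ts in call_log)  # reference point: latest call
    return {node: [sum(1 for t in stamps if now - t <= w) for w in windows]
            for node, stamps in per_node.items()}
```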
[00148] At step 468, the server system 402 generates a first machine learning (ML) model (e.g., the first ML model 220). It may be noted that, based on the long-term API call data 114A, the server system 402 may have to learn a genuine behavior of an API call, so that the server system 402 can classify an API call as genuine or malicious when it receives different API calls or details related to different API calls in the future. Therefore, the concept of machine learning may be used by the server system 402 for generating the first ML model 220 that is trained with the API call dataset 114. This enables the first ML model 220 to learn the behavior of a genuine API call. The first ML model 220 may then be used by the server system 402 to determine anomalous users. The steps involved in the generation of the first ML model 220 are explained with reference to FIG. 7.
[00149] At step 470, the server system 402 determines a set of anomalous users from the plurality of users that are a part of the network 412. In an embodiment of the network 412 having the user devices 406(1) and 406(2) as a part of the network 412, the server system 402 may have to determine anomalous users from among the users of these two user devices 406(1) and 406(2). The server system 402 determines an anomalous user based on the set of long-term velocity features using the first ML model 220. The steps involved in the determination of the set of anomalous users and/or nodes 104 are explained with reference to FIG. 8.
[00150] Suppose the server system 402 determines that the user device 406(1) is anomalous. Then the process moves to step 472; otherwise, the server system 402 continues to monitor the subsequent API calls.
[00151] At step 472, the server system 402 extracts a subset of API calls associated with the anomalous user, i.e., the user of the user device 406(1). As may be understood, the user device 406(1) may be involved in sending API calls to other devices through the user-end application 422 based on requirements of the user-end application 422. Therefore, it may become necessary for the server system 402 to extract data related to such API calls. The server system 402 may extract the subset of API calls from the short-term API call data 114B.
[00152] At step 474, the server system 402 generates a reconstruction loss corresponding to each API call of the subset of API calls based, at least in part, on analyzing the subset of API calls.
[00153] At step 476, the server system 402 generates a risk score for each API call of the subset of API calls based on the reconstruction loss and using the first ML model 220. The steps involved in the generation of the risk score for each API call of the subset of API calls associated with the set of anomalous users and/or nodes 104 are explained with reference to FIG. 9.
[00154] At step 478, the server system 402 identifies the risk score generated for one or more API calls associated with the user device 406(1) to be at least equal to the anomaly detection threshold.
[00155] If the risk score associated with one or more API calls is greater than or equal to an anomaly detection threshold, then the server system 402 declines the corresponding one or more API calls and the process moves to step 480.
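The accept/decline decision described in these steps may be sketched as follows (a minimal illustration; the call identifiers and the threshold value are assumptions for illustration only):

```python
def triage_api_calls(scored_calls, threshold):
    """Split (call, risk_score) pairs into declined calls (score at least
    equal to the anomaly detection threshold) and accepted calls."""
    declined = [call for call, score in scored_calls if score >= threshold]
    accepted = [call for call, score in scored_calls if score < threshold]
    return declined, accepted

declined, accepted = triage_api_calls(
    [("call-1", 0.91), ("call-2", 0.40), ("call-3", 0.75)], threshold=0.75
)
# call-1 and call-3 meet the threshold and are declined; call-2 is accepted.
```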
[00156] At step 480, upon identifying the risk score to be at least equal to the anomaly detection threshold, the server system 402 declines the further transmission of the corresponding API calls to the entity server 404 from the user device associated with each of the anomalous API calls.
[00157] At step 482, the server system 402 transmits an alert to the entity server 404 about the decline of the transmission of the one or more API calls to the entity server 404, upon detection of the risk score generated for the corresponding one or more API calls being greater than or equal to the anomaly detection threshold.
[00158] At step 484, the server system 402 transmits an alert to the user device 406(1). The alert corresponds to an indication about the decline of the transmission of the one or more API calls to the entity server 404, upon detection of the risk score generated for the corresponding one or more API calls being greater than or equal to the anomaly detection threshold.
[00159] At step 486, the server system 402 identifies the risk score generated for one or more API calls associated with the user device 406(2) to be less than the anomaly detection threshold.
[00160] If the risk score associated with one or more API calls is less than the anomaly detection threshold, then the server system 402 accepts the corresponding one or more API calls and the process moves to step 488.
[00161] At step 488, upon identifying the risk score to be less than the anomaly detection threshold, the server system 402 accepts the transmission of the corresponding API calls further to the entity server 404.
[00162] At step 490, the server system 402 transmits the one or more API calls from the user device 406(2) to the entity server 404.
[00163] At step 492, the entity server 404 generates an API call response to be transmitted to the user device 406(2) via the server system 402.
[00164] At step 494, the server system 402 receives the API call response from the entity server 404.
[00165] At step 496, the server system 402 finally transmits the API call response to the user device 406(2) and the operation in the user-end application 422 continues upon receiving the API call response. It may be noted that the API call response may include details that the user-end application 422 may need to proceed with the operation.
[00166] FIG. 5 illustrates a block diagram representation of an environment 500 related to at least some example embodiments of the present disclosure. Although the environment 500 is presented in one arrangement, other embodiments may include the parts of the environment 500 (or other parts) arranged otherwise depending on, for example, determining a set of anomalous nodes by generating a behavioral profile for each node using an ML model, generating a risk score for each API call associated with each anomalous node, and declining API calls based, at least in part, on a risk score associated with the corresponding API calls being at least equal to an anomaly detection threshold.
[00167] The environment 500 generally represents a scenario in which the network of nodes (e.g., the nodes 104) corresponds to a client-server network as explained above. Herein, the nodes may be considered to include a payment server on a payment network playing the role of a server and a plurality of merchants and a plurality of cardholders playing the role of clients. Thus, the environment 500 includes a plurality of components such as a server system 502, a plurality of user devices 504(1), 504(2), … 504(N) (collectively, referred to as a plurality of user devices 504, where ‘N’ is a non-zero natural number) associated with a plurality of cardholders 506(1), 506(2), … 506(N) (collectively, referred to as a plurality of cardholders 506), a plurality of merchants 508(1), 508(2), … 508(N) (collectively, referred to as a plurality of merchants 508), and a database 510, each coupled to, and in communication with (and/or with access to) a network 512. The server system 502 may provide an API security application 514. The API security application 514 may be substantially similar to the API security application 416 of FIG. 4A.
[00168] In an embodiment, the plurality of components may also include an acquirer server 516, an issuer server 518, and a payment network 520 including a payment server 522 coupled to the other components of the environment 500 via the network 512. However, it may be noted that the server system 502, the user devices 504, the merchants 508, the database 510, the network 512, the acquirer server 516, the issuer server 518, and the payment network 520 including the payment server 522 are substantially similar to the corresponding components of the environment 400 of FIG. 4A. For example, the server system 502 is similar to the server system 402, the database 510 is similar to the database 410, the user devices 504 are similar to the user devices 406, and the network 512 is similar to the network 412. In one embodiment, the merchants 508, the acquirer server 516, the issuer server 518, and the payment network 520 including the payment server 522 may be examples of the entity server 404. In another embodiment, one or more of the merchants 508, the acquirer server 516, the issuer server 518, and the payment network 520 may act as a client or a server interchangeably.
[00169] As used herein, the term “cardholder” refers to a person who has a payment account or a payment card (e.g., credit card, debit card, etc.) associated with the payment account, that will be used by a merchant to perform a payment transaction. The payment account may be opened via an issuing bank or an issuer server. Similarly, as used herein, the term “merchant” refers to a seller, a retailer, a purchase location, an organization, or any other entity that is in the business of selling goods or providing services, and it can refer to either a single business location or a chain of business locations of the same entity. Further, as used herein, the term “payment network” refers to a network or collection of systems used for the transfer of funds through the use of cash substitutes. Payment networks are companies that connect an issuing bank with an acquiring bank to facilitate an online payment. Examples of networks or systems configured to perform as payment networks include those operated by Mastercard®.
[00170] In a non-limiting example, a cardholder 506(1) uses an online shopping application (not shown in FIG. 5) provided by a merchant 508(1) for online shopping of clothes on a user device 504(1). To that end, the cardholder 506(1) has selected a few clothes via a user interface (UI) provided by the online shopping application on the user device 504(1) and is ready to check out. While checking out, from the different payment options, the cardholder 506(1) selects a card payment option. The cardholder 506(1) may have already registered a payment card such as a debit card or credit card on the online shopping application, and hence, upon selection of the card payment option, card details may appear for the cardholder 506(1) on the UI so that the cardholder 506(1) can select the card and complete the payment of the purchase to the merchant.
[00171] Upon selection of the card, the cardholder 506(1) provides credentials such as a card pin for authentication of the cardholder 506(1) owning the card. Upon entry of the card pin, at the backend a plurality of operations is performed between the acquirer server 516, the issuer server 518, and the payment network 520 including the payment server 522. The operations may include the exchange of multiple API calls between applications running on the acquirer server 516, the issuer server 518, the payment network 520, and a payment gateway. The multiple API calls may include a merchant API call that is initiated from the online shopping application provided by the merchant 508(1) to the payment gateway. Other API calls may be payment gateway API calls, cardholder authentication API calls, authorization API calls, payment confirmation API calls, settlement and reporting API calls, and the like.
[00172] In an embodiment, such API calls and the APIs associated with such applications may not be secure, and hence the API security application 514 may be used for the security of such API calls. An instance of the API security application 514 may be available at the payment gateway, at an issuer end, at an acquirer end, at a merchant end, and/or on the user device 504(1) as shown in FIG. 5. Therefore, while checking out, before completing the payment process, the API calls happening at the backend are verified using the API security application 514. Moreover, the server system 502 facilitates such an implementation by performing the various operations as explained above with reference to the previously referred figures, not repeated here for the sake of brevity.
[00173] FIG. 6A illustrates a detailed schematic representation 600 of the server system 200 and a data communication flow between the server system 200 and the plurality of nodes (e.g., the user devices 406(1)-406(3)), in accordance with an embodiment of the present disclosure. In some embodiments, applications running on the user devices 406(1)-406(3) may be involved in API call-based communication either among themselves or with other applications running on other devices, and hence multiple API calls may have been exchanged between them. The user devices 406(1)-406(3) may be vulnerable and can be attacked by malicious entities through malicious or anomalous API calls. Therefore, the user devices 406(1)-406(3) may use the API security application 112 running on the server system 200. Moreover, an instance of the API security application 112 may also be available on the user devices 406(1)-406(3).
[00174] In an embodiment, the server system 200 includes multiple modules such as the data pre-processing module 226, the feature engineering module 228, the anomaly detection module 230, and the scoring module 232 communicably coupled with each other as shown in FIG. 6A. As may be understood, the server system 200 accesses the API call dataset 114 including the long-term API call data 114A and the short-term API call data 114B from the database 204. However, the API call dataset 114 may be stored in the database 204 upon collecting the plurality of data samples from a plurality of data sources as explained in the description with reference to FIG. 1.
[00175] It may be noted that the plurality of data sources may include a firewall 602, a gateway API 604, another firewall 606, and an API 608 associated with the application 414. Further, the data pre-processing module 226 facilitates the server system 200 to access the API call dataset 114 from the plurality of data sources as shown in FIG. 6A. The API call dataset 114 may be related to the set of API calls performed between the user devices 406(1)-406(3).
[00176] In a non-limiting example, the API call dataset 114 may include different types of datasets targeting at least the call level, user information, machine and network information, and the like for each API call. In a call-level type of dataset, the API call dataset 114 may include call logs from the API calls, which include information such as response codes (approved, redirected, client error, or server error), HTTP methods (GET, PUT, DELETE), response time, authentication time, etc. In a user information type of dataset, the API call dataset 114 may include a history of the user, including historical call-level data and other demographic information and behavior (calls on other APIs) of the user. In a machine and network information type of dataset, the API call dataset 114 may include system logs of APIs to obtain overall context information for the API ecosystem and its usage during the calls.
[00177] Upon receiving the API call dataset 114, the feature engineering module 228 generates the plurality of features (e.g., the features 610) based on the API call dataset 114. As explained above, the feature engineering module 228 may generate the features 610 using the one or more feature engineering techniques. The features 610 may include the set of long-term velocity features and the set of short-term API call features. In a non-limiting example, the features 610 for each client IP address may include response time, requested data length, authorization latency, bytes sent, minimum payload, average payload, maximum payload, number of API calls, payload length, and severity such as warnings, errors, critical, informational, etc.
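In a non-limiting sketch, aggregating such per-client-IP features (here only payload statistics and call counts; the record schema is an assumption for illustration, not the application's actual schema) may look as follows:

```python
from collections import defaultdict

def profile_by_client_ip(records):
    """Aggregate per-client-IP features (call count and min/avg/max
    payload length) from raw call records."""
    by_ip = defaultdict(list)
    for r in records:
        by_ip[r["client_ip"]].append(r["payload_len"])
    return {
        ip: {
            "num_calls": len(payloads),
            "min_payload": min(payloads),
            "avg_payload": sum(payloads) / len(payloads),
            "max_payload": max(payloads),
        }
        for ip, payloads in by_ip.items()
    }

features = profile_by_client_ip([
    {"client_ip": "10.0.0.1", "payload_len": 100},
    {"client_ip": "10.0.0.1", "payload_len": 300},
    {"client_ip": "10.0.0.2", "payload_len": 50},
])
```

Each resulting per-IP dictionary corresponds to one row of the feature matrix consumed by the anomaly detection module.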
[00178] Upon generation of the features 610, the anomaly detection module 230 determines, via the first ML model 220, a set of anomalous clients 612 from one or more clients using the user devices 406(1)-406(3) based, at least in part, on the set of long-term velocity features and the set of short-term API call features.
[00179] In some embodiments, details related to the set of anomalous clients 612 may then be transmitted to the scoring module 232. The scoring module 232 extracts a subset of API calls associated with each anomalous client of the set of anomalous clients from the short-term API call data 114B. The scoring module 232 generates, via the first ML model 220, a reconstruction loss corresponding to each API call of the subset of API calls based, at least in part, on analyzing the subset of API calls. The scoring module 232 then generates the risk score 614 for each API call of the subset of API calls based, at least in part, on the corresponding reconstruction loss.
[00180] In one embodiment, the scoring module 232 declines (see, 616) one or more API calls from the subset of API calls based, at least in part, on the risk score 614 associated with the one or more API calls being at least equal to the anomaly detection threshold. In another embodiment, the scoring module 232 accepts (see, 616) the one or more API calls from the subset of API calls based, at least in part, on the risk score 614 associated with the one or more API calls being less than the anomaly detection threshold.
[00181] FIG. 6B illustrates a detailed schematic representation 640 of the server system 200 and a data communication flow between the server system 200 and the plurality of nodes (e.g., the user devices 406(1)-406(3)), in accordance with another embodiment of the present disclosure. The data communication flow between the user devices 406(1)-406(3) and the modules of the server system 200 is similar to that explained above with reference to FIG. 6A. In some embodiments, the server system 200 may further include an alert generation module 642 communicably coupled to the scoring module 232. The alert generation module 642 may further generate one or more alerts (e.g., the alerts 644) to API management authorities (e.g., the API management authorities 646) when the risk score 614 associated with one or more API calls of the subset of API calls is at least equal to the anomaly detection threshold. For instance, the API management authorities may include corporate security, application BizOps, an API owner team, and the like.
[00182] FIG. 6C illustrates a detailed schematic representation 660 of the server system 200 and a data communication flow between the server system 200 and a plurality of nodes (e.g., the user devices 406(1)-406(3)), in accordance with yet another embodiment of the present disclosure. The data communication flow between the user devices 406(1)-406(3) and the modules of the server system 200 is similar to that explained above with reference to FIG. 6A.
[00183] In some other embodiments, the server system 200 may further include a report generation module 662 communicably coupled to the scoring module 232. The report generation module 662 may be configured to generate a report 664 for the set of API calls. The report 664 may include a data summary corresponding to overall malicious traffic on an API network. It may be noted that, in a non-limiting example, the report 664 may be generated using a Natural Language Generation (NLG) model. As used herein, the term “Natural Language Generation” refers to a process of generating human-like text or narratives from structured data or other non-linguistic sources. Systems that use an NLG model analyze data, make inferences, and transform the information into coherent and meaningful written or spoken language.
[00184] Therefore, in an embodiment, the report generation module 662 may receive a plurality of details from each module of the processor 206 such as the data pre-processing module 226, the feature engineering module 228, the anomaly detection module 230, the scoring module 232, and the optimization module 234. The report generation module 662 may further perform a plurality of operations on the plurality of details using NLG. In an embodiment, the NLG method can be implemented with or without using machine learning. The plurality of operations in NLG may include performing data analysis for identifying relevant patterns, relationships, and insights, content determination, text planning, sentence generation, refinement and aggregation, and linguistic realization.
[00185] Alternatively, when ML may be used, different approaches may include rule-based models, supervised learning, sequence-to-sequence models, transformer model, and the like. Upon performing the plurality of operations corresponding to NLG, the report generation module 662 may generate the report 664. Further, upon generation of the report 664, the report 664 may be stored in the database 204.
[00186] FIG. 6D illustrates a detailed schematic representation 680 of the server system 200 and a data communication flow between the server system 200 and the plurality of nodes (e.g., the user devices 406(1)-406(3)), in accordance with yet another embodiment of the present disclosure. The data communication flow between the user devices 406(1)-406(3) and the modules of the server system 200 is similar to that explained above with reference to FIG. 6A.
[00187] In some other embodiments, the server system 200 may further include an optimization module 682 communicably coupled to the scoring module 232. The optimization module 682 is similar to the optimization module 234 of FIG. 2. The optimization module 682 may be configured to generate, via the second ML model 222, a subset of anomalous clients 684 from the set of anomalous clients to be labeled as highly risky anomalous clients based, at least in part, on analyzing the risk score 614 generated for each API call of the subset of API calls associated with each anomalous client of the set of anomalous clients. In a non-limiting example, the second ML model 222 may be an NLP model trained using deep learning neural networks.
[00188] Further, analyzing the risk score 614 generated for each API call of the subset of API calls associated with each anomalous client of the set of anomalous clients may include evaluating an overall risk associated with each anomalous client based on a total sum or an average of the risk scores for the subset of API calls for the corresponding anomalous client. Later, the overall risk scores evaluated for the set of anomalous clients are compared with each other, and one or more anomalous clients of the set of anomalous clients having overall risk scores greater than those of the other clients may be selected and aggregated as a subset of anomalous clients.
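A minimal sketch of this aggregation and selection (here using the average of the per-call risk scores and a hypothetical top-k cutoff; the client names and scores are assumptions for illustration) is:

```python
def rank_anomalous_clients(risk_scores_by_client, top_k=1):
    """Average each anomalous client's per-call risk scores and select
    the top_k clients with the greatest overall risk."""
    overall = {
        client: sum(scores) / len(scores)
        for client, scores in risk_scores_by_client.items()
    }
    ranked = sorted(overall, key=overall.get, reverse=True)
    return ranked[:top_k], overall

highly_risky, overall = rank_anomalous_clients(
    {"client-A": [0.9, 0.8], "client-B": [0.3, 0.4], "client-C": [0.95, 0.7]},
    top_k=2,
)
```

The selected clients correspond to the subset of anomalous clients 684 labeled as highly risky by the optimization module 682.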
[00189] FIG. 7 illustrates a flow diagram representation 700 of a process flow of the generation of the first ML model 220, in accordance with an embodiment of the present disclosure. In an embodiment, the server system 200 may generate the first ML model 220 via the anomaly detection module 230. The anomaly detection module 230 may perform a set of operations to generate the first ML model 220. Upon performing the set of operations, the first ML model 220 may get trained using the API call dataset 114 and learn the general distribution of the genuine API call.
[00190] In a non-limiting example, the first ML model 220 may be a Variational Autoencoder (VAE) model trained using deep learning neural networks. As used herein, the term “variational autoencoder” refers to a generative model that combines elements of both autoencoders and variational inference including an encoder network, a decoder network, and a latent space representation.
[00191] In a specific embodiment, the set of operations may include extraction 702, computation 704, and optimization 706. The extraction step 702 may be fed with the set of long-term velocity features as shown in FIG. 7 and includes enabling the server system 200 to extract a first set of features 708 from the set of long-term velocity features. The first set of features 708 may correspond to a non-malicious API call dataset from the long-term API call data 218A.
[00192] In some embodiments, the computation step 704 may be fed with the first set of features 708 as shown in FIG. 7. Further, the computation step 704 includes enabling the server system 200 to compute a reconstruction loss 710 based, at least in part, on analyzing the first set of features 708, and to minimize the reconstruction loss 710 by optimizing one or more ML model parameters 712. Herein, the one or more ML model parameters 712 may include at least an encoder and decoder architecture design, a predefined latent space dimension, regularization parameters, a learning rate, an optimizer, reconstruction loss weighting, a batch size, regularization techniques, and the like.
[00193] More specifically, in an embodiment, the first ML model 220 may include an encoder 714 and a decoder 716. Upon extracting the first set of features 708, for analyzing the first set of features 708 to compute the reconstruction loss 710, the server system 200 may further be configured to map, via the encoder 714, the first set of features 708 to a lower-dimensional latent space. As used herein, the term “latent space” represents a compressed and continuous representation of the input data. For instance, a multivariate Gaussian distribution may be parameterized by the mean and variance values. Herein, the mean and variance values may be obtained from the encoder 714. In a non-limiting example, the encoder 714 may include several layers that progressively reduce input dimensions of the first set of features 708 and compute the mean and variance parameters of the latent space distribution.
[00194] The server system 200 may be configured to generate, via the decoder 716, an output set of features by performing reconstruction of features from the lower-dimensional latent space. Herein, the decoder 716 may include several layers that gradually up-sample the input, reconstructing the data based on the latent representation. In a non-limiting example, the decoder 716 may take a sample from the latent space and map it back to the original data space.
[00195] The server system 200 may further be configured to compute the reconstruction loss 710 by comparing (see, 718) the output set of features and the first set of features. As may be understood, the reconstruction loss 710 may refer to the variation in the output set of features in comparison (see, 718) to the first set of features, which are input to the encoder 714. To that end, for the first ML model 220 to provide optimized results, the reconstruction loss 710 may have to be reduced. Therefore, the server system 200 may be configured to iteratively perform the set of operations upon optimizing the one or more ML model parameters 712, until the value of the reconstruction loss 710 reduces to a minimum value or saturates to a predefined or permissible value.
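The training loop of FIG. 7 may be illustrated with a deliberately simplified sketch: a linear autoencoder trained by gradient descent rather than the full Variational Autoencoder (no probabilistic latent space or reparameterization), with toy data standing in for the first set of features 708. All dimensions, the learning rate, and the iteration count are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "genuine" feature matrix: 200 samples of 4 features lying near a
# two-dimensional subspace (a stand-in for the first set of features).
latent_true = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 4))
X = latent_true @ mixing + 0.01 * rng.normal(size=(200, 4))

# Linear encoder/decoder weights (the "ML model parameters" being optimized).
W_enc = rng.normal(scale=0.1, size=(4, 2))
W_dec = rng.normal(scale=0.1, size=(2, 4))

loss0 = float(np.mean((X @ W_enc @ W_dec - X) ** 2))  # initial reconstruction loss

lr = 0.01
for _ in range(2000):
    Z = X @ W_enc                    # encode into the lower-dimensional latent space
    X_hat = Z @ W_dec                # decode / reconstruct the features
    err = X_hat - X
    loss = float(np.mean(err ** 2))  # reconstruction loss
    # Gradient descent on both parameter sets to minimize the loss.
    grad_dec = Z.T @ err * (2.0 / X.size)
    grad_enc = X.T @ (err @ W_dec.T) * (2.0 / X.size)
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

# A sample far from the learned genuine distribution reconstructs poorly,
# yielding a large reconstruction loss.
outlier = np.full((1, 4), 10.0)
outlier_loss = float(np.mean((outlier @ W_enc @ W_dec - outlier) ** 2))
```

After training, genuine-looking samples reconstruct with a small loss while off-distribution samples do not, which is the property the risk scoring in FIG. 9 relies on.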
[00196] FIG. 8 illustrates a flow diagram representation 800 of a process flow of a determination of a set of anomalous nodes, in accordance with an embodiment of the present disclosure. In an embodiment, the server system 200 may determine the set of anomalous nodes via the anomaly detection module 230. As may be understood, the set of anomalous nodes is determined via the first ML model 220. Further, for determining the set of anomalous nodes, the server system 200 may follow the process flow as disclosed in FIG. 8.
[00197] Initially, the first set of features may be taken as input to the server system 200 (see, 802). Further, the server system 200 may generate, via the first ML model 220, a standard behavioral profile that a genuine node possesses in response to having performed a set of genuine API calls based, at least in part, on the first set of features (see, 804).
[00198] Further, the set of short-term API call features may be taken as another input to the server system 200 (see, 806). The server system 200 may further generate, via the first ML model 220, a behavioral profile for each of the plurality of nodes (e.g., the nodes 104) by analyzing underlying patterns and behavior of the set of API calls based, at least in part, on the set of short-term API call features (see, 808).
[00199] Basically, upon training the first ML model 220 using historical data such as the long-term API call data 218A, the server system 200 determines the behavior of each node by generating the behavioral profile for each node using the first ML model 220 based on the short-term API call features. Further, since the generation of the behavioral profile is based on the short-term API call features, the behavioral profile generated will be a near real-time behavioral profile.
[00200] To that end, the server system 200 may then compare the behavioral profile of each of the plurality of nodes (e.g., the nodes 104) and the standard behavioral profile (see, 810). The server system 200 may be configured to compute, via the first ML model 220, a behavioral discrepancy probability based, at least in part, on the comparison of the behavioral profile of each of the plurality of nodes (e.g., the nodes 104) and the standard behavioral profile (see, 812). The server system 200 may determine if the behavioral discrepancy probability is at least equal to a threshold such as a pre-determined threshold probability (see, 814).
[00201] The server system 200 may be configured to assign, via the first ML model 220, an anomalous identity label to one or more nodes of the plurality of nodes (e.g., the nodes 104) when the behavioral discrepancy probability is at least equal to a pre-determined threshold probability, thereby determining the set of anomalous nodes (see, 816). In a non-limiting example, the pre-determined threshold probability may be determined and fixed in advance based on API security standards.
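The comparison and labeling of steps 810-816 may be sketched as follows; the discrepancy measure used here (a normalized absolute feature difference, capped at 1.0) is an assumption for illustration, as the specification does not fix how the behavioral discrepancy probability is computed:

```python
def label_anomalous_nodes(profiles, standard_profile, threshold=0.5):
    """Flag nodes whose behavioral profile deviates from the standard
    behavioral profile by at least `threshold` (a stand-in for the
    behavioral discrepancy probability)."""
    anomalous = set()
    for node, profile in profiles.items():
        diffs = [
            abs(profile[k] - standard_profile[k]) / (abs(standard_profile[k]) + 1e-9)
            for k in standard_profile
        ]
        discrepancy = min(1.0, sum(diffs) / len(diffs))  # capped at 1.0
        if discrepancy >= threshold:
            anomalous.add(node)
    return anomalous

standard = {"calls_per_min": 10.0, "avg_payload": 200.0}
profiles = {
    "node-1": {"calls_per_min": 11.0, "avg_payload": 210.0},    # close to standard
    "node-2": {"calls_per_min": 400.0, "avg_payload": 9000.0},  # far from standard
}
anomalous = label_anomalous_nodes(profiles, standard, threshold=0.5)
```

Nodes whose near real-time behavioral profile stays close to the standard behavioral profile remain unlabeled, while strongly deviating nodes receive the anomalous identity label.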
[00202] FIG. 9 illustrates a flow diagram representation 900 of a process flow of the generation of a risk score, in accordance with an embodiment of the present disclosure. In an embodiment, the server system 200 may generate the risk score via the scoring module 232. As may be understood, for generating the risk score, the server system 200 may extract the subset of API calls associated with each anomalous node of the set of anomalous nodes from the short-term API call data 114B. The server system 200 may generate, via the first ML model 220, a reconstruction loss corresponding to each API call of the subset of API calls based, at least in part, on analyzing the subset of API calls. Further, the server system 200 may generate the risk score for each API call of the subset of API calls based, at least in part, on the corresponding reconstruction loss.
[00203] In a non-limiting example for the generation of the risk score, the server system 200 may receive a predefined threshold value and the reconstruction loss as input (see, 902). Later, the server system 200 may determine the anomaly detection threshold based, at least in part, on the predefined threshold value and the reconstruction loss (see, 904). Herein, the predefined threshold value may be defined in advance based on the API security standards. Further, it may be understood that, based on the new reconstruction loss determined during the process for the short-term API call features and the predefined threshold value, the server system 200 may fix the anomaly detection threshold.
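One non-limiting way to map a per-call reconstruction loss to a risk score (an empirical-percentile mapping against the losses of known-genuine calls; the mapping itself and all values are assumptions for illustration, not taken from the specification) is:

```python
def risk_score_from_loss(recon_loss, baseline_losses):
    """Rank a call's reconstruction loss against reconstruction losses
    observed for known-genuine calls; the resulting fraction serves as
    a risk score between 0 and 1."""
    below = sum(1 for b in baseline_losses if b <= recon_loss)
    return below / len(baseline_losses)

# Reconstruction losses of genuine calls (illustrative values).
baseline = [0.10, 0.12, 0.09, 0.11, 0.10, 0.13, 0.08, 0.10, 0.12, 0.11]

score_genuine = risk_score_from_loss(0.10, baseline)  # typical loss, mid/low score
score_suspect = risk_score_from_loss(0.90, baseline)  # exceeds every baseline loss
```

Comparing such a score against the anomaly detection threshold then reproduces the decline/accept decision described earlier in the sequence diagram.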
[00204] In addition, the server system 200 may be configured to determine, via the first ML model 220, a data sensitivity and history of attacks on each API call of the subset of API calls associated with each anomalous node of the set of anomalous nodes based, at least in part, on the reconstruction loss (see, 906).
[00205] Upon determining information corresponding to the data sensitivity and the history of attacks on each API call of the subset of API calls, the server system 200 may determine if the data sensitivity and the history of attacks on each API call of the subset of API calls deviate at least by a first predefined extent from the general distribution of the genuine API call (see, 908).
[00206] In one embodiment, the server system 200 may generate the risk score including the first risk score that is less than the anomaly detection threshold, when the data sensitivity and the history of attacks on each API call of the subset of API calls deviate at least by a first predefined extent from the general distribution of the genuine API call (see, 910).
[00207] In another embodiment, the server system 200 may determine if the data sensitivity and the history of attacks on each API call of the subset of API calls deviate at least by a second predefined extent from the general distribution of the genuine API call (see, 912).
[00208] The server system 200 may generate the risk score including the second risk score that is at least equal to the anomaly detection threshold, when the data sensitivity and the history of attacks on each API call of the subset of API calls deviate by a second predefined extent from the general distribution of the genuine API call (see, 914). Further, if the data sensitivity and the history of attacks on each API call of the subset of API calls do not deviate from the general distribution of the genuine API call by the second predefined extent, then the server system 200 continues to determine the data sensitivity and the history of attacks on each API call of the subset of API calls. Herein, the second predefined extent is greater than the first predefined extent. Also, the first predefined extent and the second predefined extent may be set based on API security standards.
[00209] FIG. 10A illustrates a process flow diagram depicting a method 1000 for generating the first ML model 220, in accordance with the present disclosure. The method 1000 depicted in the flow diagram may be executed by, for example, the server system 200. The sequence of operations of the method 1000 may not be necessarily executed in the same order as they are presented. Further, one or more operations may be grouped and performed in the form of a single step, or one operation may have several sub-steps that may be performed in parallel or in a sequential manner. Operations of the method 1000, and combinations of operations in the method 1000 may be implemented by, for example, hardware, firmware, a processor, circuitry, and/or a different device associated with the execution of software that includes one or more computer program instructions. The plurality of operations is depicted in the process flow of the method 1000. The process flow starts at operation 1002.
[00210] At 1002, the method 1000 includes extracting, by a server system (e.g., the server system 200), a first set of features from the set of long-term velocity features. The first set of features may correspond to a non-malicious API call dataset from the long-term API call data 218A. It may be noted that the set of long-term velocity features may correspond to the long-term API call data 218A of the API call dataset 218. As may be understood, the long-term API call data 218A corresponds to historical data related to the set of API calls performed between the nodes 104 in the past during a first predefined period. For instance, the first predefined period may span approximately the past 9 to 12 months. Later, the set of long-term velocity features may be generated for the long-term API call data 218A.
[00211] Further, for the generation of the first ML model 220, the first set of features of the set of long-term velocity features may be extracted that correspond to the non-malicious API call dataset of the long-term API call data 218A. This step may be performed to train the first ML model 220 to learn a general distribution of a genuine API call. Moreover, since the features corresponding to the non-malicious API call dataset may be segregated, the API call dataset 218 may be assumed to include labeled data and unlabeled data. The labeled data may include the non-malicious API call dataset that is labeled to be non-malicious in the history of the set of API calls performed between the nodes 104. In a non-limiting example, the first ML model 220 may include a Variational Autoencoder (VAE) model. Thus, in an embodiment, the first ML model 220 may include an encoder and a decoder.
[00212] At 1004, the method 1000 includes computing, by the server system 200, a reconstruction loss based, at least in part, on analyzing the first set of features. Herein, analyzing the first set of features may further include mapping, by the server system 200 via the encoder, the first set of features to a lower-dimensional latent space. More specifically, lower-dimensional embeddings may be generated from the first set of features for the training of the first ML model 220.
[00213] In some embodiments, analyzing the first set of features may further include generating, by the server system 200 via the decoder, an output set of features by performing reconstruction of features from the lower-dimensional latent space. Further, analyzing the first set of features may then include computing the reconstruction loss by comparing the output set of features and the first set of features by the server system 200.
[00214] At 1006, the method 1000 includes minimizing, by the server system 200, the reconstruction loss by optimizing one or more ML model parameters. For instance, the one or more ML model parameters may include at least an encoder and decoder architecture design, a predefined latent space dimension, regularization parameters, learning rate, optimization, reconstruction loss weighting, batch size, regularization techniques, and the like.
[00215] Moreover, the steps of the method 1000 may be iteratively performed by the server system 200 until the reconstruction loss is reduced to near zero. Then, the first ML model 220 may be considered to have learned the general distribution of the genuine API call, and hence can be used for predicting real-time behavior of the set of API calls performed between the nodes 104 in the network 110. Further, for predicting the real-time behavior of the set of API calls, real-time data may have to be fed to the first ML model 220. In some embodiments, the real-time data may include the short-term API call data 218B of the API call dataset 218. Furthermore, based on this prediction of the real-time behavior of the set of API calls, a set of anomalous nodes may also be detected and, further, preventive measures may be taken against highly risky API calls in the network 110.
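The iterative encode-decode-minimize loop of the method 1000 may be sketched as follows. This is an illustrative simplification only: the disclosure describes a Variational Autoencoder for the first ML model 220, whereas, for brevity, this sketch trains a plain linear autoencoder with the same train-until-the-loss-is-near-zero loop; all variable names, dimensions, and the learning rate are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the first set of features (non-malicious long-term
# velocity features): n samples, d feature dimensions with shared structure.
n, d, k = 256, 8, 3                      # k = predefined latent space dimension
latent = rng.normal(size=(n, k))
mix = rng.normal(size=(k, d))
X = latent @ mix                         # features lying near a k-dim subspace

W_enc = rng.normal(scale=0.1, size=(d, k))  # encoder: features -> latent space
W_dec = rng.normal(scale=0.1, size=(k, d))  # decoder: latent -> reconstruction
lr = 0.01                                   # learning rate (an ML model parameter)

def reconstruction_loss(X, W_enc, W_dec):
    Z = X @ W_enc            # map features to the lower-dimensional latent space
    X_hat = Z @ W_dec        # reconstruct the output set of features
    return np.mean((X_hat - X) ** 2)

initial_loss = reconstruction_loss(X, W_enc, W_dec)
for _ in range(500):         # iterate, driving the reconstruction loss down
    Z = X @ W_enc
    E = Z @ W_dec - X        # reconstruction error
    grad_dec = Z.T @ E * (2.0 / (n * d))
    grad_enc = X.T @ (E @ W_dec.T) * (2.0 / (n * d))
    W_dec -= lr * grad_dec   # optimize the ML model parameters
    W_enc -= lr * grad_enc
final_loss = reconstruction_loss(X, W_enc, W_dec)
```

Once the loss stops decreasing, the trained model can score previously unseen (short-term) feature vectors: genuine-looking calls reconstruct with low loss, while anomalous ones do not.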
[00216] FIG. 10B illustrates a process flow diagram depicting a method 1020 for behavioral profiling and anomaly detection in an Application Programming Interface (API) ecosystem, in accordance with the present disclosure. The method 1020 depicted in the flow diagram may be executed by, for example, the server system 200. The sequence of operations of the method 1020 may not be necessarily executed in the same order as they are presented. Further, one or more operations may be grouped and performed in the form of a single step, or one operation may have several sub-steps that may be performed in parallel or in a sequential manner. Operations of the method 1020, and combinations of operations in the method 1020 may be implemented by, for example, hardware, firmware, a processor, circuitry, and/or a different device associated with the execution of software that includes one or more computer program instructions. The plurality of operations is depicted in the process flow of the method 1020. The process flow starts at operation 1022.
[00217] At 1022, the method 1020 includes accessing, by a server system (e.g., the server system 200), an API call dataset related to a set of API calls performed between a plurality of nodes (e.g., the nodes 104) connected in a network (e.g., the network 110), from a database (e.g., the database 204) associated with the server system 200. The API call dataset 218 may include long-term API call data 218A and short-term API call data 218B.
[00218] At 1024, the method 1020 includes generating, by the server system 200, a plurality of features based, at least in part, on the API call dataset 218. In some embodiments, the plurality of features may include a set of long-term velocity features and a set of short-term API call features.
[00219] At 1026, the method 1020 includes determining, by the server system 200 via a first machine learning (ML) model 220, a set of anomalous nodes from the nodes 104 based, at least in part, on the set of long-term velocity features and the set of short-term API call features.
[00220] At 1028, the method 1020 includes extracting, by the server system 200, a subset of API calls associated with each anomalous node of the set of anomalous nodes from the short-term API call data 218B.
[00221] At 1030, the method 1020 includes generating, by the server system 200 via the first ML model 220, a reconstruction loss corresponding to each API call of the subset of API calls based, at least in part, on analyzing the subset of API calls.
[00222] At 1032, the method 1020 includes generating, by the server system 200 via the first ML model 220, a risk score for each API call of the subset of API calls based, at least in part, on the corresponding reconstruction loss.
[00223] At 1034, the method 1020 includes declining, by the server system 200, one or more API calls from the subset of API calls based, at least in part, on the risk score associated with the one or more API calls being at least equal to an anomaly detection threshold.
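The scoring and declining steps of the method 1020 (operations 1030 through 1034) may be sketched as below. All identifiers, loss values, and the threshold are illustrative assumptions; for simplicity the sketch takes each call's risk score to equal its reconstruction loss, which is only one possible mapping.

```python
anomaly_detection_threshold = 0.7        # assumed value for illustration

# Subset of API calls extracted for one anomalous node, paired with the
# reconstruction loss the first ML model produced for each call.
subset_of_api_calls = {
    "call_001": 0.12,   # low loss: close to the genuine distribution
    "call_002": 0.85,   # high loss: poorly reconstructed
    "call_003": 0.70,   # exactly at the threshold
}

def risk_scores(calls):
    """Generate a risk score per API call from its reconstruction loss."""
    return {call_id: loss for call_id, loss in calls.items()}

def declined_calls(scores, threshold):
    """Decline calls whose risk score is at least equal to the threshold."""
    return sorted(c for c, s in scores.items() if s >= threshold)

scores = risk_scores(subset_of_api_calls)
print(declined_calls(scores, anomaly_detection_threshold))
# -> ['call_002', 'call_003']
```

Note that the comparison uses "at least equal to" (`>=`), matching operation 1034: a call sitting exactly at the anomaly detection threshold is declined.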
[00224] FIG. 10C illustrates a process flow diagram depicting a method 1040 for determining a set of anomalous nodes, in accordance with the present disclosure. The method 1040 depicted in the flow diagram may be executed by, for example, the server system 200. The sequence of operations of the method 1040 may not be necessarily executed in the same order as they are presented. Further, one or more operations may be grouped and performed in the form of a single step, or one operation may have several sub-steps that may be performed in parallel or in a sequential manner. Operations of the method 1040, and combinations of operations in the method 1040 may be implemented by, for example, hardware, firmware, a processor, circuitry, and/or a different device associated with the execution of software that includes one or more computer program instructions. The plurality of operations is depicted in the process flow of the method 1040. The process flow starts at operation 1042.
[00225] At 1042, the method 1040 includes generating, by a server system (e.g., the server system 200) via a first ML model (e.g., the first ML model 220), a standard behavioral profile that a genuine node possesses in response to having performed a set of genuine API calls based, at least in part, on the first set of features.
[00226] At 1044, the method 1040 includes generating, by the server system 200 via the first ML model 220, a behavioral profile for each of the plurality of nodes (e.g., the nodes 104) by analyzing underlying patterns and behavior of the set of API calls based, at least in part, on the set of short-term API call features.
[00227] At 1046, the method 1040 includes computing, by the server system 200 via the first ML model 220, a behavioral discrepancy probability based, at least in part, on the comparison of the behavioral profile of each of the plurality of nodes 104 and the standard behavioral profile.
[00228] At 1048, the method 1040 includes assigning, by the server system 200 via the first ML model 220, an anomalous identity label to one or more nodes from the plurality of nodes based, at least in part, on determining if the behavioral discrepancy probability corresponding to the one or more nodes from the plurality of nodes is at least equal to a pre-determined threshold probability.
[00229] At 1050, the method 1040 includes determining, by the server system 200, the set of anomalous nodes from the plurality of nodes based, at least in part, on the corresponding anomalous identity label of the one or more nodes from the plurality of nodes.
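The profile-comparison steps of the method 1040 may be sketched as follows. The profile vectors, the distance-to-probability mapping, and the threshold probability are all illustrative assumptions, not the disclosed model; they only show how a behavioral discrepancy probability can gate the anomalous identity label.

```python
import math

standard_profile = [0.2, 0.5, 0.1]        # learned from genuine API calls
node_profiles = {
    "node_A": [0.21, 0.48, 0.12],         # behaves like the standard profile
    "node_B": [0.90, 0.05, 0.80],         # behaves very differently
}
threshold_probability = 0.5               # pre-determined threshold probability

def discrepancy_probability(profile, standard):
    """Map the Euclidean distance between profiles to a value in [0, 1)."""
    dist = math.dist(profile, standard)
    return 1.0 - math.exp(-dist)          # larger distance -> higher probability

# Assign the anomalous identity label where the discrepancy probability is
# at least equal to the threshold, then collect the set of anomalous nodes.
anomalous_nodes = sorted(
    node for node, profile in node_profiles.items()
    if discrepancy_probability(profile, standard_profile) >= threshold_probability
)
print(anomalous_nodes)
# -> ['node_B']
```

A node whose behavioral profile matches the standard profile exactly yields a discrepancy probability of zero and is never labeled anomalous under this mapping.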
[00230] FIG. 10D illustrates a process flow diagram depicting a method 1060 for generating a risk score, in accordance with the present disclosure. The method 1060 depicted in the flow diagram may be executed by, for example, the server system 200. The sequence of operations of the method 1060 may not be necessarily executed in the same order as they are presented. Further, one or more operations may be grouped and performed in the form of a single step, or one operation may have several sub-steps that may be performed in parallel or in a sequential manner. Operations of the method 1060, and combinations of operations in the method 1060 may be implemented by, for example, hardware, firmware, a processor, circuitry, and/or a different device associated with the execution of software that includes one or more computer program instructions. The plurality of operations is depicted in the process flow of the method 1060. The process flow starts at operation 1062.
[00231] At 1062, the method 1060 includes determining, by a server system (e.g., the server system 200) via a first ML model (e.g., the first ML model 220) an anomaly detection threshold based, at least in part, on a predefined threshold value and a reconstruction loss.
[00232] At 1064, the method 1060 includes determining, by the server system 200 via the first ML model 220, a data sensitivity and history of attacks on each API call of the subset of API calls associated with each anomalous node of the set of anomalous nodes based, at least in part, on the reconstruction loss.
[00233] At 1066, the method 1060 includes generating, by the server system 200 via the first ML model 220, the risk score including one of a first risk score and a second risk score. Thus, step 1066 of the method 1060 includes steps 1066A and 1066B.
[00234] At 1066A, the method 1060 includes generating the risk score including the first risk score for each API call of the subset of API calls that is less than the anomaly detection threshold, when the data sensitivity and the history of attacks on each API call of the subset of API calls deviates at least by a first predefined extent from a general distribution of a genuine API call.
[00235] At 1066B, the method 1060 includes generating the risk score including the second risk score for each API call of the subset of API calls that is at least equal to the anomaly detection threshold, when the data sensitivity and the history of attacks on each API call of the subset of API calls deviate by a second predefined extent from the general distribution of the genuine API call. The second predefined extent is greater than the first predefined extent.
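The two-branch scoring rule of steps 1066A and 1066B may be sketched as below. The deviation measure and all numeric values are illustrative assumptions (the disclosure leaves the predefined extents to API security standards); the sketch only preserves the two properties the method requires: the first risk score falls below the anomaly detection threshold, and the second is at least equal to it, with the second predefined extent greater than the first.

```python
anomaly_detection_threshold = 0.7
first_predefined_extent = 0.3             # smaller deviation from genuine calls
second_predefined_extent = 0.6            # larger deviation (> first extent)

def generate_risk_score(deviation):
    """Return a risk score based on how far a call's data sensitivity and
    history of attacks deviate from the general distribution of genuine calls."""
    if deviation >= second_predefined_extent:
        # Second risk score: at least equal to the threshold -> call declined.
        return anomaly_detection_threshold + (deviation - second_predefined_extent)
    if deviation >= first_predefined_extent:
        # First risk score: strictly below the threshold -> call not declined.
        return anomaly_detection_threshold * (deviation / second_predefined_extent)
    return 0.0                            # within the genuine distribution

print(generate_risk_score(0.8))   # deviates by at least the second extent
print(generate_risk_score(0.4))   # deviates only by the first extent
```

Because the first branch divides the deviation by the larger second extent, any first risk score is strictly less than the anomaly detection threshold, so only calls in the second branch reach the decline condition of operation 1034.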
[00236] FIG. 11 is a simplified block diagram of an electronic device 1100 capable of implementing various embodiments of the present disclosure. In an example, the electronic device 1100 may correspond to the node 104(1) of FIG. 1. In another example, the electronic device 1100 may correspond to the administrator device (not shown in figures) of the administrator. The electronic device 1100 is depicted to include one or more applications such as an API security application 1106 facilitated by the server system 200. The API security application 1106 can be an instance of an application downloaded from the server system 200 or a third-party server. The API security application 1106 is capable of communicating with the server system 200 to provide the nodes 104 with call-level security for securing the API calls between the nodes 104 in the API ecosystem through machine learning (ML), as shown in FIG. 1.
[00237] It should be understood that the electronic device 1100 as illustrated and hereinafter described is merely illustrative of one type of device and should not be taken to limit the scope of the embodiments. As such, it should be appreciated that at least some of the components described below in connection with the electronic device 1100 may be optional and thus in an embodiment may include more, fewer, or different components than those described in connection with the embodiment of FIG. 11. As such, among other examples, the electronic device 1100 could be any mobile electronic device, for example, a cellular phone, tablet computer, laptop, mobile computer, personal digital assistant (PDA), mobile television, mobile digital assistant, or any combination of the aforementioned, and other types of communication or multimedia devices.
[00238] The illustrated electronic device 1100 includes a controller or a processor 1102 (e.g., a signal processor, microprocessor, ASIC, or other control and processing logic circuitry) for performing such tasks as signal coding, data processing, image processing, input/output processing, power control, and/or other functions. An operating system 1104 controls the allocation and usage of the components of the electronic device 1100 and supports one or more operations of the application (see, the API security application 1106) that implements one or more of the innovative features described herein. In addition, the one or more applications may further include common mobile computing applications (e.g., email applications, calendars, contact managers, web browsers, messaging applications) or any other computing application.
[00239] The illustrated electronic device 1100 includes one or more memory components, for example, a non-removable memory 1108 and/or removable memory 1110. The non-removable memory 1108 and/or the removable memory 1110 may be collectively known as a database in an embodiment. The non-removable memory 1108 can include RAM, ROM, flash memory, a hard disk, or other well-known memory storage technologies. The removable memory 1110 can include flash memory, smart cards, or a Subscriber Identity Module (SIM). The one or more memory components can be used for storing data and/or code for running the operating system 1104 and the one or more applications. The electronic device 1100 may further include a user identity module (UIM) 1112. The UIM 1112 may be a memory device having a processor built in. The UIM 1112 may include, for example, a subscriber identity module (SIM), a universal integrated circuit card (UICC), a universal subscriber identity module (USIM), a removable user identity module (R-UIM), or any other smart card. The UIM 1112 typically stores information elements related to a mobile subscriber. The UIM 1112 in the form of the SIM card is well known in Global Systems for Mobile (GSM) communication systems, Code Division Multiple Access (CDMA) systems, or with third generation (3G) wireless communication protocols such as Universal Mobile Telecommunications System (UMTS), CDMA2000, wideband CDMA (WCDMA) and time division-synchronous CDMA (TD-SCDMA), or with fourth-generation (4G) wireless communication protocols such as LTE (Long-Term Evolution).
[00240] The electronic device 1100 can support one or more input devices 1120 and one or more output devices 1130. Examples of the input devices 1120 may include, but are not limited to, a touch screen/a display screen 1122 (e.g., capable of capturing finger tap inputs, finger gesture inputs, multi-finger tap inputs, multi-finger gesture inputs, or keystroke inputs from a virtual keyboard or keypad), a microphone 1124 (e.g., capable of capturing voice input), a camera module 1126 (e.g., capable of capturing still picture images and/or video images) and a physical keyboard 1128. Examples of the output devices 1130 may include, but are not limited to, a speaker 1132 and a display 1134. Other possible output devices can include piezoelectric or other haptic output devices. Some devices can serve more than one input/output function. For example, the touch screen 1122 and the display 1134 can be combined into a single input/output device.
[00241] A wireless modem 1140 can be coupled to one or more antennas (not shown in FIG. 11) and can support two-way communications between the processor 1102 and external devices, as is well understood in the art. The wireless modem 1140 is shown generically and can include, for example, a cellular modem 1142 for communicating at long range with the mobile communication network, a Wi-Fi compatible modem 1144 for communicating at short range with a local wireless data network or router, and/or a Bluetooth-compatible modem 1146 for communicating with an external Bluetooth-equipped device. The wireless modem 1140 is typically configured for communication with one or more cellular networks, such as a GSM network for data and voice communications within a single cellular network, between cellular networks, or between the electronic device 1100 and a public switched telephone network (PSTN).
[00242] The electronic device 1100 can further include one or more input/output ports 1150, a power supply 1152, one or more sensors 1154, for example, an accelerometer, a gyroscope, a compass, or an infrared proximity sensor for detecting the orientation or motion of the electronic device 1100 and biometric sensors for scanning the biometric identity of an authorized user, a transceiver 1156 (for wirelessly transmitting analog or digital signals) and/or a physical connector 1160, which can be a USB port, IEEE 1394 (FireWire) port, and/or RS-232 port. The illustrated components are not required or all-inclusive, as any of the components shown can be deleted and other components can be added.
[00243] The disclosed method with reference to FIGS. 10A-10D, or one or more operations of the server system 200 may be implemented using software including computer-executable instructions stored on one or more computer-readable media (e.g., non-transitory computer-readable media, such as one or more optical media discs, volatile memory components (e.g., DRAM or SRAM), or nonvolatile memory or storage components (e.g., hard drives or solid-state nonvolatile memory components, such as Flash memory components)) and executed on a computer (e.g., any suitable computer, such as a laptop computer, netbook, Web book, tablet computing device, smartphone, or other mobile computing devices). Such software may be executed, for example, on a single local computer or in a network environment (e.g., via the Internet, a wide-area network, a local-area network, a remote web-based server, a client-server network (such as a cloud computing network), or other such networks) using one or more network computers. Additionally, any of the intermediate or final data created and used during the implementation of the disclosed methods or systems may also be stored on one or more computer-readable media (e.g., non-transitory computer-readable media) and are considered to be within the scope of the disclosed technology. Furthermore, any of the software-based embodiments may be uploaded, downloaded, or remotely accessed through a suitable communication means. Such suitable communication means include, for example, the Internet, the World Wide Web, an intranet, software applications, cable (including fiber optic cable), magnetic communications, electromagnetic communications (including RF, microwave, and infrared communications), electronic communications, or other such communication means.
[00244] Although the invention has been described with reference to specific exemplary embodiments, it is noted that various modifications and changes may be made to these embodiments without departing from the broad scope of the invention. For example, the various operations, blocks, etc., described herein may be enabled and operated using hardware circuitry (for example, complementary metal oxide semiconductor (CMOS) based logic circuitry), firmware, software, and/or any combination of hardware, firmware, and/or software (for example, embodied in a machine-readable medium). For example, the apparatuses and methods may be embodied using transistors, logic gates, and electrical circuits (for example, application-specific integrated circuit (ASIC) circuitry and/or in Digital Signal Processor (DSP) circuitry).
[00245] Particularly, the server system 200 and its various components may be enabled using software and/or using transistors, logic gates, and electrical circuits (for example, integrated circuit circuitry such as ASIC circuitry). Various embodiments of the invention may include one or more computer programs stored or otherwise embodied on a computer-readable medium, wherein the computer programs are configured to cause a processor or the computer to perform one or more operations. A computer-readable medium storing, embodying, or encoded with a computer program, or similar language, may be embodied as a tangible data storage device storing one or more software programs that are configured to cause a processor or computer to perform one or more operations. Such operations may be, for example, any of the steps or operations described herein. In some embodiments, the computer programs may be stored and provided to a computer using any type of non-transitory computer-readable media. Non-transitory computer-readable media includes any type of tangible storage media. Examples of non-transitory computer-readable media include magnetic storage media (such as floppy disks, magnetic tapes, hard disk drives, etc.), optical magnetic storage media (e.g. magneto-optical disks), CD-ROM (compact disc read-only memory), CD-R (compact disc recordable), CD-R/W (compact disc rewritable), DVD (Digital Versatile Disc), BD (BLU-RAY® Disc), and semiconductor memories (such as mask ROM, PROM (programmable ROM), EPROM (erasable PROM), flash memory, RAM (random access memory), etc.). Additionally, a tangible data storage device may be embodied as one or more volatile memory devices, one or more non-volatile memory devices, and/or a combination of one or more volatile memory devices and non-volatile memory devices. In some embodiments, the computer programs may be provided to a computer using any type of transitory computer-readable media. 
Examples of transitory computer-readable media include electric signals, optical signals, and electromagnetic waves. Transitory computer-readable media can provide the program to a computer via a wired communication line (e.g., electric wires, and optical fibers) or a wireless communication line.
[00246] Various embodiments of the invention, as discussed above, may be practiced with steps and/or operations in a different order, and/or with hardware elements in configurations, which are different than those which, are disclosed. Therefore, although the invention has been described based on these exemplary embodiments, it is noted that certain modifications, variations, and alternative constructions may be apparent and well within the scope of the invention.
[00247] Although various exemplary embodiments of the invention are described herein in a language specific to structural features and/or methodological acts, the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as exemplary forms of implementing the claims.
Claims:
1. A computer-implemented method, comprising:
accessing, by a server system, an Application Programming Interface (API) call dataset related to a set of API calls performed between a plurality of nodes connected in a network from a database associated with the server system, the API call dataset comprising long-term API call data and short-term API call data;
generating, by the server system, a plurality of features based, at least in part, on the API call dataset, the plurality of features comprising a set of long-term velocity features and a set of short-term API call features;
determining, by the server system via a first machine learning (ML) model, a set of anomalous nodes from the plurality of nodes based, at least in part, on the set of long-term velocity features and the set of short-term API call features;
extracting, by the server system, a subset of API calls associated with each anomalous node of the set of anomalous nodes from the short-term API call data;
generating, by the server system via the first ML model, a reconstruction loss corresponding to each API call of the subset of API calls based, at least in part, on analyzing the subset of API calls;
generating, by the server system via the first ML model, a risk score for each API call of the subset of API calls based, at least in part, on the corresponding reconstruction loss; and
declining, by the server system, one or more API calls from the subset of API calls based, at least in part, on the risk score associated with the one or more API calls being at least equal to an anomaly detection threshold.
2. The computer-implemented method as claimed in claim 1, wherein the first ML model comprises a Variational Autoencoder (VAE) model trained using deep learning neural networks.
3. The computer-implemented method as claimed in claim 1, further comprising:
generating, by the server system, the first ML model, wherein generating the first ML model comprises performing a set of operations iteratively for training the first ML model to learn a general distribution of a genuine API call, the set of operations comprising:
extracting, by the server system, a first set of features from the set of long-term velocity features, the first set of features corresponding to a non-malicious API call dataset from the long-term API call data;
computing, by the server system, a reconstruction loss based, at least in part, on analyzing the first set of features; and
minimizing, by the server system, the reconstruction loss by optimizing one or more ML model parameters.
4. The computer-implemented method as claimed in claim 3, wherein the one or more ML model parameters comprise at least an encoder and decoder architecture design, a predefined latent space dimension, regularization parameters, learning rate, optimization, reconstruction loss weighting, batch size, and regularization techniques.
5. The computer-implemented method as claimed in claim 3, wherein determining the set of anomalous nodes, further comprises:
generating, by the server system via the first ML model, a standard behavioral profile that a genuine node possesses in response to having performed a set of genuine API calls based, at least in part, on the first set of features;
generating, by the server system via the first ML model, a behavioral profile for each of the plurality of nodes by analyzing underlying patterns and behavior of the set of API calls based, at least in part, on the set of short-term API call features;
computing, by the server system via the first ML model, a behavioral discrepancy probability for each of the plurality of nodes based, at least in part, on comparing the behavioral profile of each of the plurality of nodes with the standard behavioral profile;
assigning, by the server system via the first ML model, an anomalous identity label to one or more nodes from the plurality of nodes based, at least in part, on determining if the behavioral discrepancy probability corresponding to the one or more nodes from the plurality of nodes is at least equal to a pre-determined threshold probability; and
determining, by the server system, the set of anomalous nodes from the plurality of nodes based, at least in part, on the corresponding anomalous identity label of the one or more nodes from the plurality of nodes.
6. The computer-implemented method as claimed in claim 1, wherein generating the risk score further comprises:
determining, by the server system via the first ML model, the anomaly detection threshold based, at least in part, on a predefined threshold value and the reconstruction loss;
determining, by the server system via the first ML model, a data sensitivity and history of attacks on each API call of the subset of API calls associated with each anomalous node of the set of anomalous nodes based, at least in part, on the reconstruction loss; and
generating, by the server system via the first ML model, the risk score comprising one of:
a first risk score for each API call of the subset of API calls that is less than the anomaly detection threshold, when the data sensitivity and the history of attacks on each API call of the subset of API calls deviates at least by a first predefined extent from a general distribution of a genuine API call; and
a second risk score for each API call of the subset of API calls that is at least equal to the anomaly detection threshold, when the data sensitivity and the history of attacks on each API call of the subset of API calls deviate by a second predefined extent from the general distribution of the genuine API call, the second predefined extent being greater than the first predefined extent.
7. The computer-implemented method as claimed in claim 1, further comprising:
generating, by the server system, one or more alerts to API management authorities when the risk score associated with one or more API calls of the subset of API calls is at least equal to the anomaly detection threshold.
8. The computer-implemented method as claimed in claim 1, further comprising:
generating, by the server system, a report for the set of API calls, the report comprising a data summary corresponding to overall malicious traffic on an API ecosystem.
9. The computer-implemented method as claimed in claim 8, wherein the report is generated using a Natural Language Generation (NLG) model.
10. The computer-implemented method as claimed in claim 1, further comprising:
determining, by the server system via a second ML model, a subset of anomalous nodes from the set of anomalous nodes to be labeled as highly risky anomalous nodes based, at least in part, on analyzing the risk score generated for each API call of the subset of API calls associated with each anomalous node of the set of anomalous nodes.
11. The computer-implemented method as claimed in claim 10, wherein the second ML model comprises a Natural Language Processing (NLP) model trained using deep learning neural networks.
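Claims 10–11 label a subset of anomalous nodes as highly risky by analyzing the per-call risk scores with a second ML model (an NLP model in claim 11). As a stand-in to show the input and output shape only, the sketch below uses a simple fraction-of-risky-calls rule; the rule, its `min_fraction` parameter, and the function name are assumptions and not the claimed NLP model.

```python
def highly_risky_nodes(call_risks, anomaly_threshold, min_fraction=0.5):
    # call_risks maps each anomalous node to the risk scores of its
    # short-term API calls (the per-call scores of claim 10).
    flagged = set()
    for node, risks in call_risks.items():
        hits = sum(1 for r in risks if r >= anomaly_threshold)
        if risks and hits / len(risks) >= min_fraction:
            flagged.add(node)  # label the node highly risky
    return flagged
```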
12. A server system, comprising:
a communication interface;
a memory comprising executable instructions; and
a processor communicably coupled to the communication interface and the memory, the processor configured to cause the server system to at least:
access an Application Programming Interface (API) call dataset related to a set of API calls performed between a plurality of nodes connected in a network from a database associated with the server system, the API call dataset comprising long-term API call data and short-term API call data;
generate a plurality of features based, at least in part, on the API call dataset, the plurality of features comprising a set of long-term velocity features and a set of short-term API call features;
determine, via a first machine learning (ML) model, a set of anomalous nodes from the plurality of nodes based, at least in part, on the set of long-term velocity features and the set of short-term API call features;
extract a subset of API calls associated with each anomalous node of the set of anomalous nodes from the short-term API call data;
generate a reconstruction loss corresponding to each API call of the subset of API calls based, at least in part, on analyzing the subset of API calls;
generate, via the first ML model, a risk score for each API call of the subset of API calls based, at least in part, on the corresponding reconstruction loss; and
decline one or more API calls from the subset of API calls based, at least in part, on the risk score associated with the one or more API calls being at least equal to an anomaly detection threshold.
13. The server system as claimed in claim 12, wherein the first ML model comprises a Variational Autoencoder (VAE) model trained using deep learning neural networks.
14. The server system as claimed in claim 12, wherein the server system is further caused to generate the first ML model, wherein the generation of the first ML model causes the server system to perform a set of operations iteratively, the set of operations comprising:
extracting a first set of features from the set of long-term velocity features, the first set of features corresponding to non-malicious API call dataset from the long-term API call data;
computing a reconstruction loss based, at least in part, on analyzing the first set of features; and
minimizing the reconstruction loss by optimizing one or more ML model parameters, wherein the one or more ML model parameters comprise at least an encoder and decoder architecture design, a predefined latent space dimension, regularization parameters, a learning rate, an optimization technique, a reconstruction loss weighting, a batch size, and regularization techniques.
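The iterative loop of claim 14 — compute a reconstruction loss over non-malicious features, then adjust model parameters to minimize it — can be illustrated with the smallest possible autoencoder: one encoder weight and one decoder weight trained by gradient descent. This is a toy sketch under stated assumptions (scalar features, squared-error loss, fixed learning rate), not the claimed VAE of claim 13.

```python
def train_autoencoder(xs, lr=0.05, epochs=300):
    # One-dimensional encoder (w_enc) and decoder (w_dec); training
    # minimizes the mean squared reconstruction loss over xs.
    w_enc, w_dec = 0.5, 0.5
    n = len(xs)
    for _ in range(epochs):
        g_enc = g_dec = 0.0
        for x in xs:
            z = w_enc * x        # encode into the latent space
            r = w_dec * z        # decode / reconstruct the input
            err = r - x
            g_enc += 2 * err * w_dec * x / n   # d(loss)/d(w_enc)
            g_dec += 2 * err * z / n           # d(loss)/d(w_dec)
        w_enc -= lr * g_enc      # optimize parameters with the
        w_dec -= lr * g_dec      # chosen learning rate
    loss = sum((w_dec * w_enc * x - x) ** 2 for x in xs) / n
    return w_enc, w_dec, loss
```

After training, `w_dec * w_enc` approaches 1, i.e. genuine inputs reconstruct with near-zero loss; at inference time, inputs that reconstruct poorly are the anomaly candidates.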
15. The server system as claimed in claim 14, wherein for determining the set of anomalous nodes, the server system is further caused to:
generate, via the first ML model, a standard behavioral profile that a genuine node possesses in response to having performed a set of genuine API calls based, at least in part, on the first set of features;
generate, via the first ML model, a behavioral profile for each of the plurality of nodes by analyzing underlying patterns and behavior of the set of API calls based, at least in part, on the set of short-term API call features;
compute, via the first ML model, a behavioral discrepancy probability for each of the plurality of nodes based, at least in part, on comparing the behavioral profile of each of the plurality of nodes with the standard behavioral profile;
assign, via the first ML model, an anomalous identity label to one or more nodes from the plurality of nodes based, at least in part, on determining if the behavioral discrepancy probability corresponding to the one or more nodes from the plurality of nodes is at least equal to a pre-determined threshold probability; and
determine the set of anomalous nodes from the plurality of nodes based, at least in part, on the corresponding anomalous identity label of the one or more nodes from the plurality of nodes.
16. The server system as claimed in claim 12, wherein for generating the risk score, the server system is further caused to:
determine, via the first ML model, the anomaly detection threshold based, at least in part, on a predefined threshold value and the reconstruction loss;
determine, via the first ML model, a data sensitivity and history of attacks on each API call of the subset of API calls associated with each anomalous node of the set of anomalous nodes based, at least in part, on the reconstruction loss;
generate, via the first ML model, the risk score comprising a first risk score for each API call of the subset of API calls that is less than the anomaly detection threshold, when the data sensitivity and the history of attacks on each API call of the subset of API calls deviate at least by a first predefined extent from a general distribution of a genuine API call; and
generate, via the first ML model, the risk score comprising a second risk score for each API call of the subset of API calls that is at least equal to the anomaly detection threshold, when the data sensitivity and the history of attacks on each API call of the subset of API calls deviate by a second predefined extent from the general distribution of the genuine API call, the second predefined extent being greater than the first predefined extent.
17. The server system as claimed in claim 12, wherein the server system is further caused to generate one or more alerts to API management authorities when the risk score associated with one or more API calls of the subset of API calls is at least equal to the anomaly detection threshold.
18. The server system as claimed in claim 12, wherein the server system is further caused to generate a report for the set of API calls, the report comprising a data summary corresponding to overall malicious traffic on an API ecosystem, wherein the report is generated using a Natural Language Generation (NLG) model.
19. The server system as claimed in claim 12, wherein the server system is further caused to:
determine, via a second ML model, a subset of anomalous nodes from the set of anomalous nodes to be labeled as highly risky anomalous nodes based, at least in part, on a subset of short-term API call features corresponding to the set of anomalous nodes from the set of short-term API call features, wherein the second ML model comprises a Natural Language Processing (NLP) model trained using deep learning neural networks.
20. A non-transitory computer-readable storage medium comprising computer-executable instructions that, when executed by at least a processor of a server system, cause the server system to perform a method comprising:
accessing an Application Programming Interface (API) call dataset related to a set of API calls performed between a plurality of nodes connected in a network from a database associated with the server system, the API call dataset comprising long-term API call data and short-term API call data;
generating a plurality of features based, at least in part, on the API call dataset, the plurality of features comprising a set of long-term velocity features and a set of short-term API call features;
determining, via a first machine learning (ML) model, a set of anomalous nodes from the plurality of nodes based, at least in part, on the set of long-term velocity features and the set of short-term API call features;
extracting a subset of API calls associated with each anomalous node of the set of anomalous nodes from the short-term API call data;
generating, via the first ML model, a reconstruction loss corresponding to each API call of the subset of API calls based, at least in part, on analyzing the subset of API calls;
generating, via the first ML model, a risk score for each API call of the subset of API calls based, at least in part, on the corresponding reconstruction loss; and
declining one or more API calls from the subset of API calls based, at least in part, on the risk score associated with the one or more API calls being at least equal to an anomaly detection threshold.
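The end-to-end decision of the independent claims — restrict scoring to calls from anomalous nodes, score each call via its reconstruction loss, and decline calls at or above the anomaly detection threshold — can be sketched as one screening pass. All names here (`screen_api_calls`, the dictionary inputs, passing the loss as a callable) are illustrative assumptions about the data shapes, not the claimed server-system implementation.

```python
def screen_api_calls(calls_by_node, node_discrepancy, node_threshold,
                     reconstruction_loss, anomaly_threshold):
    # calls_by_node: short-term API calls grouped by node.
    # node_discrepancy: behavioral discrepancy probability per node.
    # reconstruction_loss: callable giving a call's risk score.
    declined = []
    for node, calls in calls_by_node.items():
        if node_discrepancy[node] < node_threshold:
            continue  # node matches the standard behavioral profile
        for call in calls:
            if reconstruction_loss(call) >= anomaly_threshold:
                declined.append((node, call))  # decline the API call
    return declined
```

Note that calls from non-anomalous nodes are never scored, so even a high-loss call passes if its node's behavior matches the standard profile; only the two-stage combination declines a call.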

Documents

Application Documents

# Name Date
1 202341068307-STATEMENT OF UNDERTAKING (FORM 3) [11-10-2023(online)].pdf 2023-10-11
2 202341068307-POWER OF AUTHORITY [11-10-2023(online)].pdf 2023-10-11
3 202341068307-FORM 1 [11-10-2023(online)].pdf 2023-10-11
4 202341068307-FIGURE OF ABSTRACT [11-10-2023(online)].pdf 2023-10-11
5 202341068307-DRAWINGS [11-10-2023(online)].pdf 2023-10-11
6 202341068307-DECLARATION OF INVENTORSHIP (FORM 5) [11-10-2023(online)].pdf 2023-10-11
7 202341068307-COMPLETE SPECIFICATION [11-10-2023(online)].pdf 2023-10-11
8 202341068307-Proof of Right [24-11-2023(online)].pdf 2023-11-24