
System And Method For Assessing Insider Influence On Enterprise Assets

Abstract: This disclosure relates generally to a system and method for assessing insider influence on enterprise assets. Existing work focuses on the detection of insider threat and does not consider the influence of an insider on their peers and subordinates. The present disclosure aggregates and preprocesses the enterprise data specific to the individuals received from various sources, and further creates an enterprise graph between entities. The weight of every edge connecting any two entities in the enterprise graph is then calculated. Communities of the individuals are detected, relevant insider(s) are identified, and the susceptibility of the individuals to probable influence by the relevant insider(s) is calculated based on the analysis scenario(s). Paths taken by the relevant insider(s) are calculated for estimating the probability of data loss. The present disclosure identifies the assets which are under possible threat from the relevant insider(s), obtains the cumulative risk associated with the enterprise and generates an analysis report accordingly.


Patent Information

Application #
Filing Date
26 October 2020
Publication Number
17/2022
Publication Type
INA
Invention Field
COMPUTER SCIENCE
Status
Email
kcopatents@khaitanco.com
Parent Application

Applicants

Tata Consultancy Services Limited
Nirmal Building, 9th Floor, Nariman Point Mumbai Maharashtra India 400021

Inventors

1. SHUKLA, Manish
Tata Consultancy Services Limited Tata Research Development & Design Centre, 54-B, Hadapsar Industrial Estate, Hadapsar, Pune Maharashtra India 411013
2. LODHA, Sachin Premsukh
Tata Consultancy Services Limited Tata Research Development & Design Centre, 54-B, Hadapsar Industrial Estate, Hadapsar, Pune Maharashtra India 411013

Specification

FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENT RULES, 2003
COMPLETE SPECIFICATION
(See Section 10 and Rule 13)
Title of invention:
SYSTEM AND METHOD FOR ASSESSING INSIDER INFLUENCE ON
ENTERPRISE ASSETS
Applicant
Tata Consultancy Services Limited
A company Incorporated in India under the Companies Act, 1956
Having address:
Nirmal Building, 9th floor,
Nariman point, Mumbai 400021,
Maharashtra, India
Preamble to the description:
The following specification particularly describes the invention and the
manner in which it is to be performed.
CROSS-REFERENCE TO RELATED APPLICATIONS AND PRIORITY
[001] The present application claims priority from Indian provisional
application no. 202021046660, filed on October 26, 2020. The entire contents of
the aforementioned application are incorporated herein by reference.
TECHNICAL FIELD
[002] The disclosure herein generally relates to cybersecurity analytics,
and, more particularly, to system and method for assessing insider influence on
enterprise assets.
BACKGROUND
[003] Insider threat is defined as the negative effect on an enterprise due
to an individual who has or had access to the enterprise assets and internal workings.
From the enterprise perspective, the insider threat is a risk to the confidentiality,
integrity and availability of critical information assets, and a risk of loss of reputation. Based on
an individual’s intention, the existing literature on insider threat identifies two main
classes of threat actors. The major class consists of individuals with no malicious
intention who perform anomalous actions or divulge information unknowingly.
The other class consists of individuals with malicious intent to harm the
enterprise by stealing or sabotaging its assets.
[004] Despite the amount of research in this field, insider threat is still
an open problem in cybersecurity, mainly due to the presence of multiple
dynamic and interdependent contexts related to data sharing, for example, casual
data exchange between employees due to their reporting hierarchy or interpersonal
relationship. In an enterprise setup, an influential individual may either force or
persuade peers and subordinates to share privileged information. Further, this
information sharing tends to appear normal to the existing solutions as it happens
within a team (community), wherein regular exchange of information is common.
Even if the insider is identified and corrective actions are taken, there is still a large
and unknown attack surface open in the form of individuals who might have already
been influenced. Thus, it is important to assess the reachability or influence of an
insider on other benign users, and thereby, the insider’s indirect access to different
assets within the enterprise.
[005] Some of the existing works compute multiple attributes for everyone
within the enterprise and then isolate the most anomalous behavior. Further, these
works have used deviation from the peers’ as well as from the individual’s normal behavior
for validating the individual’s current behavior. More recently, one of the existing
works uses a heuristic which converts log entries into a heterogeneous graph by using
the sequential and the logical relationships among the events. Each log entry is
represented as a low-dimensional vector by applying a graph embedding on the
graph. Further, the algorithm proposed in the above-mentioned existing work separates
malicious and benign log entries into different clusters for detection. However, all
the existing work focuses on the detection of insider threat.
SUMMARY
[006] Embodiments of the present disclosure present technological
improvements as solutions to one or more of the above-mentioned technical
problems recognized by the inventors in conventional systems. For example, in one
embodiment, a method for assessing insider influence on enterprise assets is
provided. The method includes receiving, via one or more hardware processors, an
enterprise data specific to one or more individuals associated with an enterprise
from a plurality of sources, wherein the one or more individuals comprises of at
least one of one or more vendors, one or more employees and one or more
contractors associated with the enterprise; pre-processing, via the one or more
hardware processors, the received enterprise data to obtain an intermediate common
input representation; creating, via the one or more hardware processors, an
enterprise graph between one or more entities from the obtained intermediate
common input representation, wherein the one or more entities includes the one or
more individuals and one or more assets associated with the enterprise, and wherein
the enterprise graph includes a plurality of vertices consisting of the one or more
entities associated with the enterprise in a present time period and a past time
period, and a plurality of edges between the one or more entities and a plurality of
attributes associated with the plurality of vertices and the plurality of edges;
calculating, via the one or more hardware processors, a weight for each of the
plurality of edges between any two connected entities based on a plurality of
enterprise graph features and the plurality of attributes; detecting, via the one or
more hardware processors, one or more communities of the one or more individuals
by using a plurality of graph-based techniques based on the calculated weights of
the plurality of edges; calculating, via the one or more hardware processors, a
threshold behavior for the one or more individuals and the one or more detected
communities within an observation window by applying a plurality of statistical
methods based on the plurality of enterprise graph features and the plurality of
attributes; performing, via the one or more hardware processors, a comparison of
the threshold behavior of the one or more individuals calculated within the
observation window with a current behavior of the one or more individuals to
identify one or more potential insiders; performing, via the one or more hardware
processors, a comparison of the current behavior of the one or more potential
insiders and the current behavior of a plurality of individuals of the one or more
detected communities to identify the one or more potential insiders as one or more
relevant insiders; calculating, via the one or more hardware processors, a
susceptibility of the plurality of individuals for probable influence by the one or
more relevant insiders based on an analysis of a plurality of scenarios, wherein the
plurality of scenarios includes hierarchy exploitation, relationship exploitation and
mixed mode; calculating, via the one or more hardware processors, a plurality of
paths taken by the one or more relevant insiders based on the calculated
susceptibility of the plurality of individuals; and performing, via the one or more
hardware processors, an analysis of the calculated paths to obtain a probability score
indicative of a probable data loss.
[007] In another aspect, a system for assessing insider
influence on enterprise assets is provided. The system comprises: a memory storing
instructions; one or more communication interfaces; and one or more hardware
processors coupled to the memory via the one or more communication interfaces,
wherein the one or more hardware processors are configured by the instructions to:
receive, via one or more hardware processors, an enterprise data specific to one or
more individuals associated with an enterprise from a plurality of sources, wherein
the one or more individuals comprises of at least one of one or more vendors, one
or more employees and one or more contractors associated with the enterprise;
pre-process, via the one or more hardware processors,
the received enterprise data to obtain an intermediate common input representation;
create, via the one or more hardware processors, an enterprise graph between one
or more entities from the obtained intermediate common input representation,
wherein the one or more entities includes the one or more individuals and one or
more assets associated with the enterprise, and wherein the enterprise graph
includes a plurality of vertices consisting of the one or more entities associated with
the enterprise in a present time period and a past time period, and a plurality of
edges between the one or more entities and a plurality of attributes associated with
the plurality of vertices and the plurality of edges; calculate, via the one or more
hardware processors, a weight for each of the plurality of edges between any two
connected entities based on a plurality of enterprise graph features and the plurality
of attributes; detect, via the one or more hardware processors, one or more
communities of the one or more individuals by using a plurality of graph-based
techniques based on the calculated weights of the plurality of edges; calculate, via
the one or more hardware processors, a threshold behavior for the one or more
individuals and the one or more detected communities within an observation
window by applying a plurality of statistical methods based on the plurality of
enterprise graph features and the plurality of attributes; perform, via the one or more
hardware processors, a comparison of the threshold behavior of the one or more
individuals calculated within the observation window with a current behavior of the
one or more individuals to identify one or more potential insiders; perform, via the
one or more hardware processors, a comparison of the current behavior of the one
or more potential insiders and the current behavior of a plurality of individuals of
the one or more detected communities to identify the one or more potential insiders
as one or more relevant insiders; calculate, via the one or more hardware processors,
a susceptibility of the plurality of individuals for probable influence by the one or
more relevant insiders based on an analysis of a plurality of scenarios, wherein the
plurality of scenarios includes hierarchy exploitation, relationship exploitation and
mixed mode; calculate, via the one or more hardware processors, a plurality of paths
taken by the one or more relevant insiders based on the calculated susceptibility of
the plurality of individuals; and perform, via the one or more hardware processors,
an analysis of the calculated paths to obtain a probability score indicative of a
probable data loss.
[008] In yet another aspect, there are provided one or more non-transitory
machine-readable information storage mediums comprising one or more
instructions which when executed by one or more hardware processors cause
receiving, via one or more hardware processors, an enterprise data specific to one
or more individuals associated with an enterprise from a plurality of sources,
wherein the one or more individuals comprises of at least one of one or more
vendors, one or more employees and one or more contractors associated with the
enterprise; pre-processing, via the one or more hardware processors, the received
enterprise data to obtain an intermediate common input representation; creating, via
the one or more hardware processors, an enterprise graph between one or more
entities from the obtained intermediate common input representation, wherein the
one or more entities includes the one or more individuals and one or more assets
associated with the enterprise, and wherein the enterprise graph includes a plurality
of vertices consisting of the one or more entities associated with the enterprise in a
present time period and a past time period, and a plurality of edges between the one
or more entities and a plurality of attributes associated with the plurality of vertices
and the plurality of edges; calculating, via the one or more hardware processors, a
weight for each of the plurality of edges between any two connected entities based
on a plurality of enterprise graph features and the plurality of attributes; detecting,
via the one or more hardware processors, one or more communities of the one or
more individuals by using a plurality of graph-based techniques based on the
calculated weights of the plurality of edges; calculating, via the one or more
hardware processors, a threshold behavior for the one or more individuals and the
one or more detected communities within an observation window by applying a
plurality of statistical methods based on the plurality of enterprise graph features
and the plurality of attributes; performing, via the one or more hardware processors,
a comparison of the threshold behavior of the one or more individuals calculated
within the observation window with a current behavior of the one or more
individuals to identify one or more potential insiders; performing, via the one or
more hardware processors, a comparison of the current behavior of the one or more
potential insiders and the current behavior of a plurality of individuals of the one or
more detected communities to identify the one or more potential insiders as one or
more relevant insiders; calculating, via the one or more hardware processors, a
susceptibility of the plurality of individuals for probable influence by the one or
more relevant insiders based on an analysis of a plurality of scenarios, wherein the
plurality of scenarios includes hierarchy exploitation, relationship exploitation and
mixed mode; calculating, via the one or more hardware processors, a plurality of
paths taken by the one or more relevant insiders based on the calculated
susceptibility of the plurality of individuals; and performing, via the one or more
hardware processors, an analysis of the calculated paths to obtain a probability score
indicative of a probable data loss.
[009] It is to be understood that both the foregoing general description and
the following detailed description are exemplary and explanatory only and are not
restrictive of the invention, as claimed.
BRIEF DESCRIPTION OF THE DRAWINGS
[010] The accompanying drawings, which are incorporated in and
constitute a part of this disclosure, illustrate exemplary embodiments and, together
with the description, serve to explain the disclosed principles:
[011] FIG. 1 illustrates an exemplary system for assessing insider
influence on enterprise assets, according to some embodiments of the present
disclosure.
[012] FIGS. 2A through 2C illustrate a flow diagram for the steps involved
in the method for assessing insider influence on enterprise assets, according to some
embodiments of the present disclosure.
[013] FIGS. 3A and 3B show an example of an insider detection technique
illustrating a low-pass filter-based anomaly detection and community and peer
voting for removing false positives, according to some embodiments of the present
disclosure.
[014] FIGS. 4A through 4C are the use cases illustrating the spider charts
for the behavior of a normal individual (FIG.4A), an anomalous behavior with
respect to self (FIG.4B) and suspect's behavior validation with community
(FIG.4C) on multiple dimensions, according to some embodiments of the present
disclosure.
[015] FIGS. 5A and 5B are the use cases illustrating a communication
graph of a plurality of communities, according to some embodiments of the present
disclosure.
[016] FIG. 6 illustrates an average probability of loss with respect to a
number of hops for four suspects in the CMU CERT (Carnegie Mellon University,
Computer Emergency Response Team) dataset, according to some embodiments of
the present disclosure.
DETAILED DESCRIPTION OF EMBODIMENTS
[017] Exemplary embodiments are described with reference to the
accompanying drawings. In the figures, the left-most digit(s) of a reference number
identifies the figure in which the reference number first appears. Wherever
convenient, the same reference numbers are used throughout the drawings to refer
to the same or like parts. While examples and features of disclosed principles are
described herein, modifications, adaptations, and other implementations are
possible without departing from the scope of the disclosed embodiments.
[018] The embodiments herein provide a system and method for assessing
insider influence on enterprise assets. The present disclosure primarily focuses on
assessing the influence of an insider on other benign users, and thereby, their
indirect reachability to different assets within the enterprise. The present system
first detects the community of the one or more individuals associated with the
enterprise and then identifies the one or more individuals with suspicious behavior.
For a given community (usually a project team), the present system calculates the
susceptibility of the one or more individuals for probable influence by an identified
insider as a function of their position in reporting hierarchy and the health of
communication (indicating the strength of interpersonal relationship). Further, the
results from the spread and influence detection are used to identify the assets which
are under possible threat from the one or more individuals with suspicious behavior.
Further, the present disclosure provides a method to calculate the probability of
information loss which in turn helps in calculating the risk profiles. More
specifically, the present disclosure enables the detection of the insider's influence
over other individuals in his/her team.
[019] Referring now to the drawings, and more particularly to FIG. 1
through FIG. 6, where similar reference characters denote corresponding features
consistently throughout the figures, there are shown preferred embodiments and
these embodiments are described in the context of the following exemplary system
and/or method.
[020] FIG. 1 illustrates an exemplary system (100) for assessing insider
influence on enterprise assets, according to some embodiments of the present
disclosure. In an embodiment, the system 100 includes one or more processors 104,
communication interface device(s) or input/output (I/O) interface(s) 106, one or
more data storage devices or memory 102 operatively coupled to the one or more
processors 104. In an embodiment, the system 100 further includes a data
preprocessing module 108, a weight calculation module 110, a community
detection module 112, a threat detection module 114, an analysis module 116 and
a reporting and preventive measure module 118. The one or more processors 104
are hardware processors and can be implemented as one or more microprocessors,
microcomputers, microcontrollers, digital signal processors, central processing
units, state machines, graphics controllers, logic circuitries, and/or any devices that
manipulate signals based on operational instructions. Among other capabilities, the
processor(s) are configured to fetch and execute computer-readable instructions
stored in the memory. In an embodiment, the system 100 can be implemented in a
variety of computing systems, such as laptop computers, notebooks, hand-held
devices, workstations, mainframe computers, servers, a network cloud and the like.
[021] The I/O interface device(s) 106 can include a variety of software and
hardware interfaces, for example, a web interface, a graphical user interface, and
the like and can facilitate multiple communications within a wide variety of
networks N/W and protocol types, including wired networks, for example, LAN,
cable, etc., and wireless networks, such as WLAN, cellular, or satellite. In an
embodiment, the I/O interface device(s) can include one or more ports for
connecting a number of devices to one another or to another server.
[022] The memory 102 may include any computer-readable medium
known in the art including, for example, volatile memory, such as static random
access memory (SRAM) and dynamic random access memory (DRAM), and/or
non-volatile memory, such as read only memory (ROM), erasable programmable
ROM, flash memories, hard disks, optical disks, and magnetic tapes. In an
embodiment, one or more modules (not shown) of the system 100 can be stored in
the memory 102.
[023] In an embodiment, the system 100 includes one or more data storage
devices or memory 102 operatively coupled to the one or more processors 104 and
is configured to store instructions configured for execution of steps of the method
200 by the one or more processors 104.
[024] FIGS. 2A through 2C illustrate a flow diagram for the steps involved
in the method for assessing insider influence on enterprise assets, according to some
embodiments of the present disclosure. Steps of the method of FIG. 2A through 2C
shall be described in conjunction with the components of FIG. 1. At step 202 of the
method 200, the one or more hardware processors 104, receive an enterprise data
specific to one or more individuals associated with an enterprise from a plurality of
sources, wherein the one or more individuals comprises of at least one of one or
more vendors, one or more employees and one or more contractors associated with
the enterprise. The enterprise data specific to the one or more individuals associated
with the enterprise may include login-logout time, device usage events, I/O
(Input/Output) events, punch-in/punch-out time (entry inside office), network
events, email(s) and the like. The plurality of sources can include
network logs, SIEM/SYSLOG events (which capture device, I/O, registry and process events),
DBLP (Digital Bibliography and Library Project) co-author data, social network data,
office swipe in/out data, email(s) and the like. At step 204 of the method 200, the
one or more hardware processors 104 pre-process the received enterprise data to
obtain an intermediate common input representation. Some examples of rule-based
algorithms with respect to the present disclosure include ignoring a plurality
of edges between two or more individual pairs if the communication is less than
the global average, and flagging the one or more individuals’ activity as abnormal if the
pluggable device usage is more than the community average and if the login time
of the one or more individuals is beyond the usual time. Heuristics are simple strategies
for making decisions for complex problems. Examples with
respect to the present disclosure include marking an activity as suspicious if the
one or more individuals open a file received from a client email id,
copy it to a different file type and encrypt it. Further, an activity is marked as
suspicious if the one or more individuals copy a large amount of data in short bursts
of time to a removable drive. Referring to FIG. 1, the data preprocessing module
108 of the system 100 aggregates the data from the plurality of sources and performs the
necessary pre-processing and feature transformation to obtain an intermediate
common input representation for a machine learning/rule-based/heuristic algorithm.
The events/activities from each source differ: for example, an I/O event
may carry a file id (also referred to as file identifier), name, parent name, size, date, etc.
in one format, while HTTP (hypertext transfer protocol) logs
include fields in a different format. The enterprise data in
different formats is then converted to CSV (comma-separated values) format with
additional attributes, including the event id and the user responsible
for that event. The final common input representation fields include universal id,
time stamp, event name, entity name, user, and data (additional data specific to the data
source). The data preprocessing module 108 further includes a pre-processing
engine which performs the feature engineering and feature transformation of the
input samples/data specific to the one or more individuals in the enterprise based
on the pre-configuration required by the machine learning algorithms. The feature
engineering and feature transformation include constructing or deducing new
features from the existing enterprise data. For example, if daily login and logoff data
is available for the one or more individuals on a given system, then the usual shift
duration can be deduced; further, the one or more individuals’ times of arriving at and
leaving the office, holidays, weekends and abnormal logins can also be
deduced. Further, the enterprise data can arrive as a continuous stream or as a
batch-wise input, which helps in deciding an optimal algorithm for detecting the
insider threat, detecting the existing community, identifying the different types of
relationship between the one or more individuals, determining/identifying the
optimal route (from the malicious user to the targeted individual with the most influenced
individuals) for data exfiltration, and deciding one or more preventive steps.
Further, the pre-processing engine considers the data rate and efficiency while
processing the received enterprise data. The pre-processing involves cleaning of
data (removal of outliers), de-duplication of events (for performance), merging of
multiple accounts related to the one or more individuals (a user with multiple roles,
accounts or IDs), and removal of spurious edges between the one or more individuals
based on a plurality of statistical methods (for example, the number of messages from one
vertex to another should be greater than a threshold value). Here, a threshold value
associated with the enterprise graph helps in identifying the mutual relevance between
the one or more individuals. For users $u$ and $v$, two threshold values
are possible (depending on the direction for which relevance is calculated):
$2/d_u$ and $2/d_v$, where $d_u$ is the degree of node $u$ and $d_v$ is the degree of
node $v$. The choice of numerator is based on Markov's inequality. In an embodiment,
“vertex” represents an individual associated with the enterprise.
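The feature-deduction step described in this paragraph can be illustrated with a minimal Python sketch. It assumes events have already been brought into the common input representation (universal id, time stamp, event name, entity name, user, data). The class name `CommonEvent`, the event names `"logon"` and `"logoff"`, and the choice of the median as the "usual" shift duration are illustrative assumptions, not details taken from the disclosure.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from statistics import median


@dataclass(frozen=True)
class CommonEvent:
    """One record in the common input representation (field names assumed)."""
    universal_id: str
    timestamp: datetime
    event_name: str   # assumed values: "logon" / "logoff"
    entity_name: str  # e.g. the machine on which the event occurred
    user: str
    data: str = ""    # additional source-specific data


def usual_shift_hours(events):
    """Return the median daily shift length per user, in hours.

    A shift is taken as first logon to last logoff on the same calendar day.
    """
    first_on, last_off = {}, {}
    for ev in events:
        key = (ev.user, ev.timestamp.date())
        if ev.event_name == "logon":
            if key not in first_on or ev.timestamp < first_on[key]:
                first_on[key] = ev.timestamp
        elif ev.event_name == "logoff":
            if key not in last_off or ev.timestamp > last_off[key]:
                last_off[key] = ev.timestamp

    hours_by_user = defaultdict(list)
    for (user, day), start in first_on.items():
        end = last_off.get((user, day))
        if end is not None and end > start:
            hours_by_user[user].append((end - start).total_seconds() / 3600.0)
    return {user: median(hours) for user, hours in hours_by_user.items()}


def weekend_logons(events):
    """Return logon events that fall on a Saturday or Sunday."""
    return [ev for ev in events
            if ev.event_name == "logon" and ev.timestamp.weekday() >= 5]
```

Days with a logon but no matching logoff are skipped here rather than guessed; a real deployment would need its own policy for such incomplete records.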
[025] At step 206 of the method 200, the one or more hardware processors
104 create an enterprise graph between one or more entities from the obtained
30 intermediate common input representation, wherein the one or more entities
includes the one or more individuals and one or more assets associated with the
enterprise, and wherein the enterprise graph includes a plurality of vertices
consisting of the one or more entities associated with the enterprise in a present time
period and a past time period, and a plurality of edges between the one or more
entities and a plurality of attributes associated with the plurality of vertices and the
plurality of edges. The terms enterprise graph and graph are used
interchangeably herein. In the present disclosure, the one or more entities
includes the one or more individuals and one or more assets associated with the
enterprise wherein the one or more assets can include virtual devices (for example:
cloud) and/or physical devices (for example: hardware devices which include
servers, data storage devices, networking equipment and the like) associated with
the enterprise. The present system 100 creates an enterprise graph 𝐺 = (𝑉, 𝐸, 𝐴),
where 𝑉 is a plurality of vertices consisting of the one or more entities associated
with the enterprise in a present time period and a past time period, 𝐸 is a plurality
of edges between the one or more entities and 𝐴 is a plurality of attributes associated
with the plurality of vertices and the plurality of edges. The plurality of attributes
can include attributes specific to the one or more assets (for example, asset id, server
attributes, geolocation, IP address, CPU (Central Processing Unit) count, RAM
(Random-Access Memory), HDD (Hard Disk Drive), CPU usage
and the like) and attributes specific to the one or more individuals (for example, {role,
experience, peers, team size, project}) associated with the enterprise. Further, the
attributes specific to the plurality of edges can include weight, timestamp, direction,
color and the like. At step 208 of the method 200, the one or more hardware
processors 104 calculate a weight for each of the plurality of edges between any
two connected entities based on a plurality of enterprise graph features and the
plurality of attributes. Referring to FIG. 1, the weight calculation module 110 of the
system 100 calculates the weight for each of the plurality of edges between any two
connected entities based on multiple criteria. For example, the following can be
derived from the plurality of enterprise graph features: the weight as a
mutual-to-global messages ratio, a mutual-to-local messages ratio, a relative
importance-based ratio, and a quantile-based scoring, whereas a temporal score is
derived based on the recency of communication, which is further based on the
plurality of attributes specific to the plurality of edges. For example, if $M_{u,v}$ is the total
number of messages shared between any two given individuals $(u, v)$ in set $V$, then
the edge weight $W_{u,v}$ is given as:
\[
W_{u,v} =
\begin{cases}
1/M_{u,v}, & \text{if } M_{u,v} > 0 \\
\infty, & \text{otherwise}
\end{cases}
\]
The above equation implies that frequently communicating candidates have lower
edge weight.
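As a minimal Python sketch of this reciprocal weighting, the functions below count messages per pair of individuals and return the inverse count, or infinity when the pair never communicated. The input format (a list of `(sender, recipient)` tuples) and the function names are assumptions made for illustration.

```python
import math
from collections import Counter


def message_counts(messages):
    """Count messages shared by each unordered pair of individuals.

    `messages` is an iterable of (sender, recipient) tuples (assumed format).
    """
    counts = Counter()
    for sender, recipient in messages:
        if sender != recipient:  # ignore self-addressed messages
            counts[frozenset((sender, recipient))] += 1
    return counts


def reciprocal_weight(counts, u, v):
    """Edge weight 1/M when the pair shared M > 0 messages, else infinity."""
    m = counts.get(frozenset((u, v)), 0)
    return 1.0 / m if m > 0 else math.inf
```

Because the weight shrinks as communication grows, a shortest-path search over these weights would tend to prefer routes through frequently communicating pairs.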
Another method, based on quantiles, for calculating the
weight $W_{u,v}$ can be described as follows:
The present disclosure takes the number of sent messages on each directed edge,
puts these in a list $\Omega$ and sorts the list in ascending order. Further, the present
disclosure partitions $\Omega$ into $q$-quantiles. For vertices $u$ and $v$, let
$P = \{v_0 = u, v_1, \ldots, v_n = v\}$ be the optimal path consisting of the plurality of edges, where
$P \subset E$. Let $e_j$ be a directed edge between $(v_{j-1}, v_j) \in P$. If, based on the number
of sent messages, the edge $e_j$ belongs to the $k^{th}$ quantile of $\Omega$, then the weight
$w_j$ of edge $e_j$ is calculated as $w_j = k/q$, where $k = 1, 2, \ldots, q$.
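The quantile-based scoring can be sketched in Python as follows. The disclosure does not say how the quantile boundaries are drawn or how ties are handled, so the rank-based rule used here (the quantile index is the ceiling of rank times $q$ over the list length, with tied counts sharing the highest rank) is an assumption, as are the function and parameter names.

```python
from bisect import bisect_right


def quantile_weights(sent_counts, q):
    """Assign each directed edge the weight k/q, k being its quantile index.

    `sent_counts` maps a directed edge (sender, recipient) to the number of
    messages sent along it; `q` is the number of quantiles.
    """
    if q < 1:
        raise ValueError("q must be a positive integer")
    omega = sorted(sent_counts.values())  # the list Omega, ascending
    n = len(omega)
    weights = {}
    for edge, count in sent_counts.items():
        rank = bisect_right(omega, count)  # 1..n, ties share the top rank
        k = (rank * q + n - 1) // n        # integer ceiling of rank*q/n
        weights[edge] = k / q
    return weights
```

For example, with four edges carrying 1, 4, 9 and 20 messages and `q = 4`, the edges fall into quantiles 1 to 4 and receive weights 0.25, 0.5, 0.75 and 1.0.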
Another method, based on relative mutual importance, calculates the weight $W_{u,v}$
as follows:
For a given path $S(\nu, \tau_1) = \{v_0 = u, v_1, \ldots, v_{h-1}, v_h = v\}$, let $C_{v_{j-1}}$ be the total
communication on the edges incident on node $v_{j-1}$ and $C_{v_j}$ be the total
communication on the edges incident on node $v_j$. For edge $(v_{j-1}, v_j) \in E$, let
$C_{v_{j-1},v_j}$ be the mutual communication. The weight $w_j$ can be calculated as:
\[
w_j =
\begin{cases}
1, & \text{if } v_j \text{ reports to } v_{j-1} \\
C_{v_{j-1},v_j} / C_{v_{j-1}}, & \text{otherwise}
\end{cases}
\]
The above three methods show the configurability of the system of the present
disclosure, which allows it to have standard or custom edge weight calculation
algorithms.
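The three edge weight calculations described above can be sketched as follows. This is a minimal illustration, not the implementation of the present disclosure: the function names are hypothetical, the quantile index 𝑘 is assumed to be derived from the rank of the edge in the sorted list Ω, and a zero total communication is assumed to yield a weight of 0.

```python
import math
from bisect import bisect_right


def inverse_count_weight(m_uv):
    """W_{u,v} = 1 / M_{u,v} if M_{u,v} > 0, infinity otherwise."""
    return 1.0 / m_uv if m_uv > 0 else math.inf


def quantile_weight(sent, omega, q):
    """w_j = k / q, where the edge falls in the k-th of q quantiles of omega.

    Assumption: k is taken from the rank of `sent` in the sorted list,
    k = ceil(q * rank / len(omega)), clamped to at least 1.
    """
    ordered = sorted(omega)
    rank = bisect_right(ordered, sent)  # number of entries <= sent
    k = max(1, math.ceil(q * rank / len(ordered)))
    return k / q


def relative_importance_weight(c_mutual, c_prev_total, reports_to_prev):
    """w_j = 1 if v_j reports to v_{j-1}, else C_{v_{j-1},v_j} / C_{v_{j-1}}."""
    if reports_to_prev:
        return 1.0
    # Assumption: a node with no communication at all contributes weight 0.
    return c_mutual / c_prev_total if c_prev_total > 0 else 0.0
```

Keeping each method as a separate function with a small signature makes it straightforward to swap in a custom calculation, which is the configurability the paragraph above refers to.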
[026] In an embodiment of the present disclosure, the thresholds are
calculated by applying various statistical methods on the plurality of enterprise
graph features and a plurality of graph properties, wherein the plurality of graph
properties can include average communication, degree of vertex, average degree of
vertex, percentage distribution of messages in a community (subgraph) and the like.
Further, the plurality of enterprise graph features are considered for both directed
and undirected graphs. In the present disclosure, as far as the complete system is
considered, multiple thresholds are used, for example, the degree of a
vertex (𝑑𝑢 and 𝑑𝑣) and $\mu_f^w \pm \theta \cdot \sigma_f^w$ for detection of the one or more
relevant insiders; further details on the same are discussed in later sections. In an
embodiment of the present disclosure, the plurality of graph properties can be calculated
using standard graph concepts/algorithms, for example, degree of vertex and average
degree of vertex, average communication (average edges for a pair of vertices) and
distribution of messages (that is, distribution of edges in a subgraph).
[027] At step 210 of the method 200, the one or more hardware processors
104 detect one or more communities of the one or more individuals by using a
plurality of graph-based techniques based on the calculated weights of the plurality
of edges. The community detection module 112 of the system 100 is configured to
detect the one or more communities of the one or more individuals by using various
graph-based techniques based on, for example, their co-author network, social network
relationship, content sharing behavior, reporting hierarchy, code-repository/version
control system access pattern and the like.
In an example embodiment, the co-author network is explained with an example
below.
For paper ‘test.pdf’ authors are A, B, D, E
For paper ‘west.pdf’ authors are C, D, E
For paper ‘rest.pdf’ authors are D, B, F
For paper ‘pest.pdf’ authors are B, E, F
Therefore, the co-author graph is (here – means an edge between the authors):
A – B
A – D
A – E
B – D
B – E
B – F
C – D
C – E
D – E
D – F
E – F
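The construction of the co-author graph above can be reproduced with a short sketch. The function name `coauthor_edges` and the dictionary layout are illustrative assumptions; the only rule applied is that every pair of authors of the same paper is joined by one undirected edge.

```python
from itertools import combinations

# Paper to author list, taken from the example above.
papers = {
    "test.pdf": ["A", "B", "D", "E"],
    "west.pdf": ["C", "D", "E"],
    "rest.pdf": ["D", "B", "F"],
    "pest.pdf": ["B", "E", "F"],
}


def coauthor_edges(papers):
    """Return the sorted list of undirected co-author edges."""
    edges = set()
    for authors in papers.values():
        # Sorting the authors makes (a, b) canonical, so A-B and B-A coincide.
        for a, b in combinations(sorted(set(authors)), 2):
            edges.add((a, b))
    return sorted(edges)


edges = coauthor_edges(papers)
for a, b in edges:
    print(f"{a} – {b}")
```

Because the edges are stored in a set, authors who share more than one paper (for example D and E) still contribute a single edge, which matches the eleven edges listed above.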
As considered in the previous sections, let 𝐺 = (𝑉, 𝐸, 𝐴) be the undirected
communication graph of the enterprise, where 𝑉 is the plurality of vertices consisting
of one or more entities associated with the enterprise, 𝐸 is the plurality of edges
between the one or more entities in a present time period and a past time period, and
𝐴 is the plurality of attributes associated with the plurality of vertices and the plurality
of edges. A node 𝑖 is part of the community 𝑋 ⊂ 𝐺 if and only if:
$$\frac{k_i^{in}(X)}{k_i^{out}(X)} > K, \quad \forall i \in X$$
where $k_i^{in}(X)$ is the number of edges connecting node 𝑖 to the other vertices in 𝑋 and
$k_i^{out}(X)$ is the number of edges connecting 𝑖 to the vertices which are not in 𝑋.
In the above equation, 𝐾 is the cohesiveness of the community and can take any value
greater than 0. When 𝐾 = 1, the equation requires every node in 𝑋 to have more edges
inside 𝑋 than outside it. Values greater than 1 suggest a more
frequent and dense communication between the one or more individuals within 𝑋.
Further, the present system could also utilize enterprise-specific information to
further refine the community discovery, for example, reporting hierarchy,
content sharing behavior and access to code/version management repository, to name a
few.
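The membership condition above can be checked directly on an adjacency structure. The following is a minimal sketch, assuming an unweighted undirected graph stored as a dictionary of neighbour sets; the function name is hypothetical, and a node with no edges leaving 𝑋 is assumed to satisfy the condition as long as it has at least one edge inside 𝑋.

```python
def is_cohesive_community(adjacency, community, K=1.0):
    """Check k_in(X) / k_out(X) > K for every node i in the candidate set X."""
    members = set(community)
    for i in members:
        neighbours = adjacency.get(i, set())
        k_in = len(neighbours & members)   # edges from i to vertices in X
        k_out = len(neighbours - members)  # edges from i to vertices not in X
        if k_out == 0:
            # Assumption: the ratio is treated as unbounded when i has no
            # outside edges, provided i is connected to X at all.
            if k_in == 0:
                return False
            continue
        if k_in / k_out <= K:
            return False
    return True


# Triangle A-B-C with one extra edge C-D leaving the triangle.
adjacency = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
print(is_cohesive_community(adjacency, {"A", "B", "C"}, K=1.0))
print(is_cohesive_community(adjacency, {"A", "B", "C"}, K=2.0))
```

Raising 𝐾 makes the test stricter: node C has two inside edges and one outside edge, so the triangle passes for 𝐾 = 1 but not for 𝐾 = 2.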
[028] At step 212 of the method 200, the one or more hardware processors
104 calculate a threshold behavior for the one or more individuals and the one or
more detected communities within an observation window by applying a plurality
of statistical methods based on the plurality of enterprise graph features and the
plurality of attributes. At step 214 of the method 200, the one or more hardware
processors 104 perform a comparison of the threshold behavior of the one or more
individuals calculated within the observation window with a current behavior of the
one or more individuals to identify one or more potential insiders. At step 216 of
the method 200, the one or more hardware processors 104 perform a comparison of
the current behavior of the one or more potential insiders and the current behavior
of a plurality of individuals of the one or more detected communities to identify the
one or more potential insiders as one or more relevant insiders. Referring to FIG. 1,
the threat detection module 114 of the system 100 uses various techniques, for example,
a low-pass filter-based method which uses an analysis/observation window of size
‘𝑊’ to compare the current behavior of the one or more individuals with the past
behavior (threshold behavior) calculated for that window. In case of potentially
malicious behavior, the one or more individuals are identified as the one or more
potential insiders, and the current behavior of the one or more potential insiders is
compared with the current behavior of the peers of the one or more detected
communities with the same role within the same window ‘𝑊’ to identify the one or
more potential insiders as the one or more relevant insiders. Another method
uses an isolation forest-based technique for identifying the one or
more relevant insiders. A more recent technique uses graph encoding for
training a deep learning model for detecting malicious behavior. In addition,
other heuristic and machine learning based methods can be
utilized for detecting malicious insider activity in the enterprise. For example,
a low-pass filter-based detection method is depicted in FIG.3 and described
below:
Behavior Flagging
A low-pass filter can be used as an unsupervised statistical anomaly detection
method for filtering out high-frequency normal benign behavior. The low-pass filter
can be implemented as a rolling mean, which is a type of convolution:
$$(f * g)(t) \triangleq \int_{-\infty}^{\infty} f(\tau)\, g(t - \tau)\, d\tau$$
where 𝑓(𝜏) is the input function and 𝑔(𝜏) is the weighting function shifted by time
𝑡. For discrete input, for example events occurring at distinct points in
time, the convolution equation becomes:
$$\sum_{\tau=-\infty}^{\infty} f(\tau)\, g(t - \tau)$$
Also, the input data is assumed to have a normal distribution, as anomalous behaviors
are rare in a large population, and therefore most of the behaviors are clustered around the
mean behavior. Let $\mu_f^w$ and $\sigma_f^w$ be the mean and the standard deviation for the
analysis window 𝑤 and selected feature 𝑓 for a data source 𝑑. The behavior of the one or
more individuals is defined as anomalous for a feature 𝑓 during analysis
window 𝑤 if it is not within the range $\mu_f^w \pm \theta \cdot \sigma_f^w$, where 𝜃 is a configurable
parameter.
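The behavior flagging step can be illustrated with a small sketch. The function names and sample values are hypothetical, and the population standard deviation is assumed for $\sigma_f^w$; the disclosure does not fix either choice.

```python
from statistics import mean, pstdev


def rolling_mean(values, size):
    """Rolling mean over `size` samples.

    This is the discrete convolution of the input with a uniform weighting
    function g that equals 1/size over the window and 0 elsewhere.
    """
    return [
        mean(values[i - size + 1 : i + 1])
        for i in range(size - 1, len(values))
    ]


def is_anomalous(window_values, current, theta=2.0):
    """Flag `current` if it lies outside mu +/- theta * sigma of the window."""
    mu = mean(window_values)
    sigma = pstdev(window_values)  # assumption: population standard deviation
    return abs(current - mu) > theta * sigma


# Hypothetical daily counts of one feature f inside analysis window w.
history = [10, 12, 11, 9, 10, 11, 10, 9]
print(rolling_mean([1, 2, 3, 4, 5], 3))
print(is_anomalous(history, 30))
print(is_anomalous(history, 11))
```

A larger 𝜃 widens the accepted band and therefore flags fewer observations, which is why it is exposed as a parameter.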
Behavior Validation
A suspected behavior of the one or more individuals is first validated against the
current community. This step helps in filtering out scenarios
in which the activity is triggered by some event or the one or more individuals
move from one community to another, for example on a change of project. Due to the rarity
of anomalous behavior, the behavior of the one or more detected communities
related to the one or more individuals is also assumed to be normally distributed;
therefore, the above analysis is also applicable to the community behavior analysis.
An additional validation could be done by comparing the behavior of the one or more
potential insiders (𝑣) with that of the associated peers. Let 𝑃𝑣 = {𝑝1, … , 𝑝𝑘} be the set of
peers based on the role of the one or more potential insiders. All the peers in 𝑃𝑣 validate
the behavior of 𝑣 by comparing it against their own behavior during
the time window ‘𝑊’ for feature 𝑓 in dataset 𝑑. Once the validation result is
obtained from all the peers, voting is done by considering the absolute majority, i.e.,
the number of votes, either against or in favor, should be greater than |𝑃𝑣|/2.
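The peer voting step can be sketched as follows. This is an illustration under stated assumptions: each peer is assumed to vote against the suspect when the suspect's value lies outside the peer's own 𝜇 ± 𝜃𝜎 band, the function names are hypothetical, and a tie is reported as undecided because neither side exceeds |𝑃𝑣|/2.

```python
from statistics import mean, pstdev


def peer_flags_suspect(peer_history, suspect_value, theta=1.0):
    """One peer's vote: True if the suspect deviates from this peer's behavior."""
    mu, sigma = mean(peer_history), pstdev(peer_history)
    return abs(suspect_value - mu) > theta * sigma


def peer_vote(peer_histories, suspect_value, theta=1.0):
    """Absolute-majority vote over all peers in P_v."""
    votes = [peer_flags_suspect(h, suspect_value, theta) for h in peer_histories]
    against = sum(votes)
    in_favor = len(votes) - against
    half = len(votes) / 2
    if against > half:
        return "anomalous"
    if in_favor > half:
        return "normal"
    return "undecided"


# Hypothetical feature values of three peers with the same role.
peers = [[5, 6, 5, 6], [4, 5, 6, 5], [20, 22, 21, 19]]
print(peer_vote(peers, 21))
print(peer_vote(peers, 5.5))
```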
Table 1 represents peer voting for suspected behavior for a suspect in the CMU
CERT (Carnegie Mellon University, Computer Emergency Response Team)
dataset.
The first data column (µ + σ) shows 1 standard deviation from the mean, whereas the
second data column (µ + 2 ∗ σ) shows 2 standard deviations from the mean behavior.

Dimensions                       µ + σ    µ + 2 ∗ σ
Device Session Frequency (D1)    94.8%    55.7%
External Emails (D2)             23.7%    4.1%
File Copy Session (D3)           33.3%    11.1%
All Emails (D4)                  94.8%    3.1%
Logon Session (D5)               55.5%    22.2%

It is to be noted that the above method is an example of a possible insider detection
method and there could potentially be a plurality of other methods based on
machine learning, deep learning, graph analysis, heuristics and rules. The
present disclosure allows a state-of-the-art or custom method to be used for
detection purposes.
[029] At step 218 of the method 200, the one or more hardware processors
104 calculate a susceptibility of the plurality of individuals for probable influence
by the one or more relevant insiders based on the analysis of a plurality of scenarios,
wherein the plurality of scenarios includes hierarchy exploitation, relationship
exploitation and mixed mode. More specifically, the susceptibility is calculated to
determine how much an individual is being influenced by the one or more
relevant insiders. Referring to FIG. 1, the analysis module 116 of the system 100 is
the core of the present disclosure and uses 3 predominant scenarios for
calculating the various results. More specifically, the present disclosure explores
the following three scenarios which can be exploited by the insider to achieve
his/her goal of data exfiltration:
Scenario 1. Hierarchy Exploitation. It is common in the enterprise that the one or
more individuals share the information requested by their supervisor. The one or
more insiders in the supervisor position could exploit this directly if there exists a
sequence 𝑆(𝜈, 𝜏1) = {𝑣0 = 𝜈, 𝑣1, … , 𝑣ℎ−1, 𝑣ℎ = 𝜏1} such that individual
𝑣𝑖 reports to 𝑣𝑖−1, ∀𝑣𝑖 ∈ 𝑋.
Here, 𝑆(𝜈, 𝜏1) represents a path between 𝜈 and 𝜏1 with intermediate vertices
𝑣1, … , 𝑣ℎ−1. Since this is hierarchy exploitation,
every preceding vertex is the supervisor of the next vertex in the sequence.
Scenario 2. Relationship Exploitation. The optimal path in this scenario consists
of edges with a good message density, which is explained in detail in further
sections. An insider would try to exploit this interpersonal relationship between the
vertices to get the desired information.
Scenario 3. Mixed Mode. This scenario consists of a combination of the previous two
scenarios. Here, the one or more relevant insiders try to maximize their influence
over the target by first exploiting the reporting hierarchy, and then selecting paths
with healthy communication edges.
The 3 scenarios mentioned above help in understanding the susceptibility of a node
for leaking data under the influence of the one or more relevant insiders. It is to be
understood by a person having ordinary skill in the art or a person skilled in the art
that the above scenarios shall not be construed as limiting the scope of the present
disclosure, and the system and method described herein may utilize any other
scenario as applicable based on the requirement.
[030] At step 220 of the method 200, the one or more hardware processors
104 calculate a plurality of paths taken by the one or more relevant insiders based
on the calculated susceptibility of the plurality of individuals. For suspect (for
example, one or more potential insiders) 𝜈 and target (for example, victim) 𝜏1 in
subgraph 𝑋, let 𝑆(𝜈, 𝜏1) = {𝑠1, . . . , 𝑠𝑡} be the set of all possible paths. The plurality
of edges in the community is further qualified by using patterns (or any other depiction
that characterizes various edges amongst the community) according to the vertices
to which they are connected and the quality of the communication between
them. Based on communication, let 𝑡ℎ𝑢,𝑣 and 𝑡ℎ𝑣,𝑢 be the thresholds for assessing
the interpersonal relationship between 𝑢 and 𝑣. For a data request from 𝑢 to 𝑣, the
present disclosure uses one of the following four patterns for edge
indication/representation: a) pattern code 1 or PC1, indicated by solid black
arrows, represents a reporting relationship (𝑃𝑢,𝑣 = 1), b) pattern code 2 or PC2,
indicated by dotted arrows, represents healthy communication among the one or
more individuals, that is, (𝑃𝑢,𝑣 > 𝑡ℎ𝑢,𝑣) and (𝑃𝑣,𝑢 > 𝑡ℎ𝑣,𝑢), c) pattern code 3 or PC3,
indicated by dash-dotted arrows, represents weak communication, that is, (𝑃𝑢,𝑣 >
𝑡ℎ𝑢,𝑣) or (𝑃𝑣,𝑢 > 𝑡ℎ𝑣,𝑢), but not both, and d) pattern code 4 or PC4, indicated by thin
grey lines, represents an acquaintance (a weaker edge between two entities
within a given community), as depicted in FIGS.5A and 5B. The patterns are
assigned to give preference to reporting hierarchy and then communication density.
The susceptibility is further quantified by assigning a value from the set {HIGH,
MEDIUM, LOW, NONE}. The one or more individuals who report to the one or more relevant
insiders may have HIGH susceptibility for data sharing. Similarly, the one or more
individuals with a good interpersonal relationship with the insider may have
MEDIUM-LOW chances, as there is no obligation to share data except for their
past relationship. Finally, the one or more individuals with rare or occasional
communication with the insider may have LOW-NONE chances of data sharing.
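The four pattern codes can be derived mechanically from the reporting relationship and the two thresholds. The sketch below is illustrative only: the function name is hypothetical, and the mapping of PC2, PC3 and PC4 to single susceptibility labels is an assumption, since the description above gives ranges (MEDIUM-LOW, LOW-NONE) rather than a fixed assignment.

```python
def pattern_code(reports_to, p_uv, p_vu, th_uv, th_vu):
    """Classify the edge for a data request from u to v into PC1..PC4."""
    if reports_to:
        return "PC1"  # reporting relationship, P_{u,v} = 1
    forward = p_uv > th_uv
    backward = p_vu > th_vu
    if forward and backward:
        return "PC2"  # healthy communication in both directions
    if forward or backward:
        return "PC3"  # weak, one-sided communication
    return "PC4"      # acquaintance


# Assumption: one possible reading of the HIGH / MEDIUM-LOW / LOW-NONE ranges.
SUSCEPTIBILITY = {"PC1": "HIGH", "PC2": "MEDIUM", "PC3": "LOW", "PC4": "NONE"}

code = pattern_code(False, p_uv=0.4, p_vu=0.1, th_uv=0.3, th_vu=0.3)
print(code, SUSCEPTIBILITY[code])
```

Checking the reporting relationship first mirrors the stated preference for reporting hierarchy over communication density.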
[031] At step 222 of the method 200, the one or more hardware processors
104 perform an analysis of the calculated paths to obtain a probability score
indicative of a probable data loss. Referring to Algorithm 1, which is explained
in a later section, for a given path 𝑆(𝜈, 𝜏1) = {𝑣0 = 𝜈, 𝑣1, … , 𝑣ℎ−1, 𝑣ℎ = 𝜏1},
let $C_{v_{j-1}}$ be the total communication on the edges incident on node 𝑣𝑗−1 and $C_{v_j}$ be
the total communication on the edges incident on node 𝑣𝑗. For edge (𝑣𝑗−1, 𝑣𝑗) ∈ 𝐸,
let $C_{v_{j-1},v_j}$ be the mutual communication. The probability of data leakage 𝑃𝑗 can be
given as:
$$P_j = \begin{cases} 1, & v_j \text{ reports to } v_{j-1} \\ C_{v_{j-1},v_j} / C_{v_{j-1}}, & \text{otherwise} \end{cases}$$
The above equation is similar to the previous equation for weight calculation, as 𝑃𝑗 ∝
𝑤𝑗 or 𝑃𝑗 = 𝛼 ∗ 𝑤𝑗, where 𝛼 is a constant whose value is 1 for a static network/graph.
However, 𝛼 can take other values depending on the setup; for example, 𝛼 could
be a decaying function representing a decrease in loss probability with time 𝑡 if
two individuals are not in contact for a long time. As data sharing on each of the
plurality of edges is an independent event, the cumulative probability of data
exfiltration associated with a path 𝑆(𝜈, 𝜏1) is:
$$P = \prod_{j=1}^{h} P_j$$
From the insider’s perspective, he/she applies this equation to all given paths to a
given target to arrive at the optimal path with the maximum probability of data
exfiltration.
[032] The present disclosure formally defines the influence of a node on
others as a reachability problem for an undirected graph 𝐺 = (𝑉, 𝐸, 𝐴). Reachability is
the set of all ordered pairs (𝑥, 𝑦) of vertices in 𝑉 for which there exists a sequence of
vertices 𝑠(𝑥, 𝑦) = {𝑣0 = 𝑥, 𝑣1, … , 𝑣ℎ−1, 𝑣ℎ = 𝑦} such that the edge
(𝑣𝑖−1, 𝑣𝑖) ∈ 𝐸 for all 1 ≤ 𝑖 ≤ ℎ. Here ‘ℎ’ is the number of hops from vertex 𝑥
to 𝑦. In a graph there could be multiple paths (sequences of vertices) between a given
source (𝜈) (for example, one or more potential insiders) and target (𝜏1) (for
example, victim). The present disclosure uses Depth First Search (DFS) for
finding all the paths between the source and the target. DFS is chosen
because |𝑋| ≪ |𝐺|, with an additional restriction
imposed by the number of hops which 𝜈 can take to reach 𝜏1, thus resulting in very
fast path enumeration. From the perspective of the one or more relevant insiders, an
optimal path is the one which has the highest probability of getting the desired
information from the target, as implemented in Algorithm 1 depicted below:
Algorithm 1: Path analysis and Probability of data leakage
procedure PATHANALYSIS(𝑆)    ▷ 𝑆 is the set of all paths between node 𝑢 and 𝑣
    opt = nil    ▷ Optimal path for data exfiltration
    for 𝑠 ∈ 𝑆 do
        limit = 𝑠.length() − 1, 𝑝 = 1
        for 𝑖𝑑𝑥 = 0; 𝑖𝑑𝑥 < limit; 𝑖𝑑𝑥 = 𝑖𝑑𝑥 + 1 do
            𝑢 = 𝑠[𝑖𝑑𝑥], 𝑣 = 𝑠[𝑖𝑑𝑥 + 1], 𝑒 = edge(𝑢, 𝑣)
            probability = 𝐶(𝑢,𝑣) / 𝐶𝑢
            if (𝑒.color == ‘red’) and (𝑣 reports to 𝑢) then
                probability = 1    ▷ Direct request from supervisor
            end if
            𝑝 = 𝑝 ∗ probability
        end for
        𝑠.probability = 𝑝
        if (opt == nil) or (opt.probability < 𝑠.probability) then
            opt = 𝑠
        end if
    end for
    return opt    ▷ Path with maximum probability of data leakage
end procedure
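A runnable sketch of the hop-limited DFS path enumeration and of Algorithm 1 is given below. It is an illustration, not the implementation of the present disclosure: the data layout (neighbour sets, a `mutual` dictionary keyed by unordered node pairs, a `total` dictionary and a set of `(subordinate, supervisor)` pairs) and all sample values are assumptions, and the edge color test of Algorithm 1 is simplified to the reporting relationship alone.

```python
def all_paths(adjacency, source, target, max_hops):
    """Enumerate simple paths from source to target with at most max_hops edges."""
    paths = []

    def dfs(node, path):
        if node == target:
            paths.append(list(path))
            return
        if len(path) - 1 >= max_hops:  # hop limit reached
            return
        for nxt in sorted(adjacency.get(node, ())):
            if nxt not in path:        # keep the path simple
                path.append(nxt)
                dfs(nxt, path)
                path.pop()

    dfs(source, [source])
    return paths


def path_probability(path, mutual, total, reports_to):
    """P = product of P_j over the edges (v_{j-1}, v_j) of the path."""
    p = 1.0
    for u, v in zip(path, path[1:]):
        if (v, u) in reports_to:       # v reports to u: direct request
            step = 1.0
        else:
            step = mutual[frozenset((u, v))] / total[u]
        p *= step
    return p


def path_analysis(paths, mutual, total, reports_to):
    """Return the path with the maximum probability of data leakage."""
    best, best_p = None, -1.0
    for s in paths:
        p = path_probability(s, mutual, total, reports_to)
        if best is None or p > best_p:
            best, best_p = s, p
    return best, best_p


# Hypothetical community: insider "nu", target "t", intermediates "a" and "b".
adjacency = {"nu": {"a", "b"}, "a": {"nu", "t"}, "b": {"nu", "t"}, "t": {"a", "b"}}
mutual = {
    frozenset(("nu", "a")): 5, frozenset(("a", "t")): 4,
    frozenset(("nu", "b")): 6, frozenset(("b", "t")): 2,
}
total = {"nu": 11, "a": 9, "b": 8, "t": 6}
reports_to = {("a", "nu")}             # "a" reports to "nu"

paths = all_paths(adjacency, "nu", "t", max_hops=2)
best, best_p = path_analysis(paths, mutual, total, reports_to)
print(best, round(best_p, 3))
```

In this example the path through "a" wins because its first edge is a reporting edge with probability 1, so the product is 4/9, whereas the path through "b" gives (6/11) ∗ (2/8).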
Let 𝐼 = {𝐼1, … , 𝐼𝑛} be the set of individuals who are influenceable via direct or
indirect communication with the insider. Further, let 𝐷𝑖 = {𝐷1, … , 𝐷𝑘} be the assets
accessible by the 𝑖-th individual 𝐼𝑖 ∈ 𝐼; then the spread of the insider is defined as:
$$\mathit{Spread} = \bigcup_{i=1}^{n} D_i$$
[033] In an embodiment of the present disclosure, a subset of the assets are
the assets which are indirectly reachable by the one or more relevant insiders
through their influence over their peers in the one or more detected communities.
The subset of the assets is also referred to as impacting assets, and the terms are
used interchangeably herein. Further, the impacting assets are the virtual or physical
devices from which potential data loss is possible due to the influence of the one or
more relevant insiders over the owners of such assets. The present disclosure performs risk
profiling of the one or more individuals with the suspected behavior, which helps
the enterprise in identifying any potential threat. Further, the potential threat may
negatively impact the enterprise assets, its reputation and the other individuals
associated with the enterprise. Further, the risk profiling helps the enterprise in
making an informed decision about the risk and taking precautionary steps to avert
any potential threat by the one or more relevant insiders. In the present disclosure, first
the one or more individuals with suspected behavior are identified, followed by
identification of his/her/their influence over other individuals in his/her/their
community, and finally, his/her/their overall impact on the enterprise due to direct and
indirect access to the assets of the enterprise is identified. The present disclosure
calculates the total risk as the sum over the individual risks, as
suggested by the Open Web Application Security Project (OWASP) standard. Let 𝑅𝑖 be
the risk associated with the 𝑖-th device, 𝐿𝑖 be the associated loss and 𝑃𝑖 be the
probability of the loss, then:
$$R_i = L_i \cdot P_i$$
And the cumulative risk 𝑅 is given as:
$$R = \sum_{i} R_i$$
In the above equation, 𝐿𝑖 can be evaluated with respect to the legal penalties to be
paid by the enterprise, a monetary value associated with the loss or a less granular
symbolic scale such as extreme, high, moderate and low. 𝑃𝑖 is calculated as
discussed in the above section.
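The spread and the cumulative risk can be computed together once the influenceable individuals and the per-device loss probabilities are known. The sketch below uses hypothetical names and values, and assumes 𝐿𝑖 is expressed as a monetary amount.

```python
def spread(assets_by_individual, influenced):
    """Spread = union of the assets D_i accessible by each influenced individual."""
    result = set()
    for person in influenced:
        result |= assets_by_individual.get(person, set())
    return result


def cumulative_risk(loss, probability):
    """R = sum of R_i, with R_i = L_i * P_i for each device i."""
    return sum(loss[device] * probability[device] for device in loss)


# Hypothetical inputs.
assets_by_individual = {"I1": {"D1", "D2"}, "I2": {"D2", "D3"}}
loss = {"D1": 100.0, "D2": 50.0, "D3": 10.0}       # L_i, monetary value
probability = {"D1": 0.5, "D2": 0.2, "D3": 0.0}    # P_i from the path analysis

devices = spread(assets_by_individual, ["I1", "I2"])
risk = cumulative_risk(loss, probability)
print(sorted(devices), risk)
```

Using a set for the spread ensures that a device shared by several influenced individuals, such as D2 here, is counted once.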
[034] In an embodiment of the present disclosure, the reporting and
preventive measure module 118 of the system 100 is used for sharing the analysis
report with various stakeholders. The reporting and preventive measure module 118
uses graph-based visualization of reporting hierarchy, interpersonal relationship,
and various other properties (for example, message density, degree distribution,
in/out degree and the like). The analysis and calculations discussed in the previous
sections include the following:
1. Individuals
   a. Susceptibility score, that is, {HIGH, MEDIUM, LOW, NONE} as
      in one of the earlier sections.
   b. Reachability score from a given insider as discussed in one of the
      earlier sections.
2. Device
   a. Risk score from a given insider
   b. Overall risk associated with all suspected individuals (summation of
      the risk calculated in the above point)
Based on the above-mentioned scores, the reporting and preventive measure
module 118 suggests multiple preventive and corrective measures, which include
re-educating susceptible individuals about good/recommended data sharing
practices, installation of licensed data leakage prevention solutions and active
monitoring, to name a few. Such examples of preventive and corrective measures shall not be
construed as limiting the scope of the present disclosure.
[035] FIGS. 3A and 3B show an example of an insider detection technique
illustrating a low-pass filter-based anomaly detection and community and peer
voting for removing false positives, according to some embodiments of the present
disclosure. The implementation details of the low-pass filter-based anomaly
detection in conjunction with the present disclosure are explained in the earlier
sections.
[036] FIGS. 4A through 4C are the use cases illustrating the spider charts
for the behavior of a normal individual (FIG.4A), an anomalous behavior with
respect to self (FIG.4B) and the suspect's behavior validation with the community
(FIG.4C) on multiple dimensions, according to some embodiments of the present
disclosure. The dimensions include D1) Device session frequency, D2) External
emails, D3) File copy session, D4) All emails, and D5) Logon session. The three
polygons in FIGS. 4A through 4C represent the actual behavior, mean ± 1 standard
deviation (µ + 1σ) and mean ± 2 standard deviations (µ + 2σ).
[037] FIGS. 5A through 5B are the use cases illustrating a communication
graph with a plurality of communities, according to some embodiments of the
present disclosure. Referring to FIG.5A, the node with a star shape represents the
insider, and the nodes with a rectangle shape represent the potential targets. The
arrows on the lines represent the direction of influence and the optimal path for each
star and rectangle node pair. Further, the solid black directed edge between vertices
represents reporting hierarchy. The dotted directed edge between vertices
represents a strong interpersonal relationship, and the dash-dotted directed edge between
vertices represents a weak or one-sided interpersonal relationship. Finally, other edges
represent acquaintance. Referring to FIG.5B, the insider is represented by the star
node, and the influence radius of 𝜈 for 4 hops is represented. The reachability from 𝜈
to other nodes in the community is represented by different shapes of the node,
which include a hexagon shape representing reachability in 1 hop, a pentagon shape
representing reachability in 2 hops, a square shape representing reachability in 3 hops
and a triangle shape representing reachability in 4 hops.
[038] FIG.6 illustrates an average probability of loss with respect to a
number of hops for four suspects in the CMU CERT (Carnegie Mellon University,
Computer Emergency Response Team) dataset, according to some embodiments of
the present disclosure. The items CCL0068, KPC0073, MAS0025 and JTM0023 in
the legend of FIG.6 are the four known insiders in the CMU CERT (Carnegie
Mellon University, Computer Emergency Response Team) dataset. Further, FIG.6
depicts the change in the average probability of loss 𝑃𝑠 with the number of hops to reach a
target device or the one or more individuals, wherein it is observed that with an increase
in the number of hops the probability keeps on decreasing. For example, on the CMU
CERT (Carnegie Mellon University, Computer Emergency Response Team)
dataset the average probability of loss 𝑃𝑠 is nearly 0 in 2 hops.
Table 2 represents the indirect impact of an insider on the enterprise assets on
the CMU CERT (Carnegie Mellon University, Computer Emergency
Response Team) dataset

Hops    Affected Colleagues    Assigned    Shared    Devices
1       39                     40          77        117
2       55 = 39 + 16 (New)     56          78        134

[039] The present disclosure provides a system and method for assessing
insider influence on enterprise assets that includes monitoring susceptible
individuals within the team/project (subgraph of enterprise network graph). In
another use-case, when the one or more individuals in the enterprise resign, the
enterprise applies multiple restrictions to avoid data exfiltration during the notice
period. However, the one or more individuals can still exploit their influence on other
individuals to get the desired impact of exfiltrating sensitive enterprise data.
Further, the deployment of licensed security controls for restricting data leakage
can be prioritized according to the influence and spread of a potential threat, thus
resulting in better inventory management of security controls. The present
disclosure further enables introducing corrective behavioral nudges for hardening
the human component in cybersecurity against influence-based social engineering
attacks. Further, the corrective behavioral nudges can be enabled by running a
minimal agent on the device of the one or more individuals, which alerts her/him/them of
a potential breach due to data sharing with a suspected individual or asks the one or
more individuals to get more information from the requester regarding the
need-to-know.
[040] Hence, the present disclosure provides the system and method for
assessing insider influence on enterprise assets. The present disclosure enables the
detection of an insider's influence over other individuals in a team in an enterprise.
For a given community (usually a project team), the present system calculates the
susceptibility of an individual for probable influence by the one or more identified
relevant insiders as a function of their position in the reporting hierarchy and the health
of communication (indicating the strength of interpersonal relationship). Further,
the present disclosure uses various types of relationships existing between the one
or more individuals in the enterprise so as to assess the potential influence of the one or
more relevant insiders over his/her/their peers, subordinates and supervisors.
For example, reporting hierarchy relationships existing between employees,
interpersonal relationships based on the health of communication between them,
relationships based on co-authorship of a paper and casual/weak relationships due to mere
acquaintance are some of the examples of interpersonal relationships. Further, the
results from the spread and influence detection enable identification of the assets
which are under possible threat from a suspicious individual. Further, the present
disclosure provides a method to calculate the probability of loss of information,
which in turn helps in calculating the risk profiles. Thus, the present disclosure
helps in sharing the analysis report and suggesting a plurality of preventive and
corrective measures with various stakeholders.
[041] The written description describes the subject matter herein to enable
any person skilled in the art to make and use the embodiments. The scope of the
subject matter embodiments is defined by the claims and may include other
modifications that occur to those skilled in the art. Such other modifications are
intended to be within the scope of the claims if they have similar elements that do
not differ from the literal language of the claims or if they include equivalent
elements with insubstantial differences from the literal language of the claims.
[042] It is to be understood that the scope of the protection is extended to
such a program and in addition to a computer-readable means having a message
therein; such computer-readable storage means contain program-code means for
implementation of one or more steps of the method, when the program runs on a
server or mobile device or any suitable programmable device. The hardware device
can be any kind of device which can be programmed including e.g., any kind of
computer like a server or a personal computer, or the like, or any combination
thereof. The device may also include means which could be e.g., hardware means
like e.g., an application-specific integrated circuit (ASIC), a field-programmable
gate array (FPGA), or a combination of hardware and software means, e.g., an ASIC
and an FPGA, or at least one microprocessor and at least one memory with software
processing components located therein. Thus, the means can include both hardware
means and software means. The method embodiments described herein could be
implemented in hardware and software. The device may also include software
means. Alternatively, the embodiments may be implemented on different hardware
devices, e.g., using a plurality of CPUs.
[043] The embodiments herein can comprise hardware and software
elements. The embodiments that are implemented in software include but are not
limited to, firmware, resident software, microcode, etc. The functions performed by
various components described herein may be implemented in other components or
combinations of other components. For the purposes of this description, a
computer-usable or computer readable medium can be any apparatus that can
comprise, store, communicate, propagate, or transport the program for use by or in
connection with the instruction execution system, apparatus, or device.
[044] The illustrated steps are set out to explain the exemplary
embodiments shown, and it should be anticipated that ongoing technological
development will change the manner in which particular functions are performed.
These examples are presented herein for purposes of illustration, and not limitation.
Further, the boundaries of the functional building blocks have been arbitrarily
defined herein for the convenience of the description. Alternative boundaries can
be defined so long as the specified functions and relationships thereof are
appropriately performed. Alternatives (including equivalents, extensions,
variations, deviations, etc., of those described herein) will be apparent to persons
skilled in the relevant art(s) based on the teachings contained herein. Such
alternatives fall within the scope of the disclosed embodiments. Also, the words
“comprising,” “having,” “containing,” and “including,” and other similar forms are
intended to be equivalent in meaning and be open ended in that an item or items
following any one of these words is not meant to be an exhaustive listing of such
item or items, or meant to be limited to only the listed item or items. It must also be
noted that as used herein and in the appended claims, the singular forms “a,” “an,”
and “the” include plural references unless the context clearly dictates otherwise.
[045] Furthermore, one or more computer-readable storage media may be
utilized in implementing embodiments consistent with the present disclosure. A
computer-readable storage medium refers to any type of physical memory on which
information or data readable by a processor may be stored. Thus, a
computer-readable storage medium may store instructions for execution by one or more
processors, including instructions for causing the processor(s) to perform steps or
stages consistent with the embodiments described herein. The term
“computer-readable medium” should be understood to include tangible items and exclude
carrier waves and transient signals, i.e., be non-transitory. Examples include
random access memory (RAM), read-only memory (ROM), volatile memory,
nonvolatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any
other known physical storage media.
[046] It is intended that the disclosure and examples be considered as
exemplary only, with a true scope of disclosed embodiments being indicated by the
following claims.
We Claim:
1. A processor implemented method (200), comprising:
receiving, via one or more hardware processors, an enterprise data specific
to one or more individuals associated with an enterprise from a plurality of sources,
5 wherein the one or more individuals comprises of at least one of one or more
vendors, one or more employees and one or more contractors associated with the
enterprise (202);
pre-processing, via the one or more hardware processors, the received
enterprise data to obtain an intermediate common input representation (204);
creating, via the one or more hardware processors, an enterprise graph
between one or more entities from the obtained intermediate common input
representation, wherein the one or more entities includes the one or more
individuals and one or more assets associated with the enterprise, and wherein the
enterprise graph includes a plurality of vertices consisting of the one or more
entities associated with the enterprise in a present time period and a past time
period, and a plurality of edges between the one or more entities and a plurality of
attributes associated with the plurality of vertices and the plurality of edges (206);
calculating, via the one or more hardware processors, a weight for each of the
plurality of edges between any two connected entities based on a plurality of
enterprise graph features and the plurality of attributes (208);
detecting, via the one or more hardware processors, one or more communities
of the one or more individuals by using a plurality of graph-based techniques based
on the calculated weights of the plurality of edges (210);
calculating, via the one or more hardware processors, a threshold behavior for
the one or more individuals and the one or more detected communities within an
observation window by applying a plurality of statistical methods based on the
plurality of enterprise graph features and the plurality of attributes (212);
performing, via the one or more hardware processors, a comparison of the
threshold behavior of the one or more individuals calculated within the observation
window with a current behavior of the one or more individuals to identify one or
more potential insiders (214);
performing, via the one or more hardware processors, a comparison of the
current behavior of the one or more potential insiders and the current behavior of a
plurality of individuals of the one or more detected communities to identify the one
or more potential insiders as one or more relevant insiders (216);
calculating, via the one or more hardware processors, a susceptibility of the
plurality of individuals for probable influence by the one or more relevant insiders
based on an analysis of a plurality of scenarios, wherein the plurality of scenarios
includes hierarchy exploitation, relationship exploitation and mixed mode (218);
calculating, via the one or more hardware processors, a plurality of paths taken
by the one or more relevant insiders based on the calculated susceptibility of the
plurality of individuals (220); and
performing, via the one or more hardware processors, an analysis of the
calculated paths to obtain a probability score indicative of a probable data loss
(222).
2. The processor implemented method of claim 1, wherein the plurality of
attributes comprises attributes specific to (i) the one or more assets and (ii) the one
or more individuals associated with the enterprise.
3. The processor implemented method of claim 1, further comprising
identifying at least a subset of one or more impacting assets from the one or more
assets based on the probable data loss.
4. The processor implemented method of claim 3, further comprising
estimating risk associated with the one or more impacting assets and obtaining
cumulative risk associated with the enterprise based on the probable data loss.
5. The processor implemented method of claim 1, further comprising generating an
analysis report comprising at least one of:
the information of the one or more relevant insiders, the plurality of individuals
affected by the one or more relevant insiders, the paths taken by the one or more
relevant insiders to influence the plurality of individuals, the susceptibility of the
plurality of individuals for probable influence by the one or more relevant insiders,
the probability of data loss, the one or more impacting assets and the estimated risk.
6. A system (100), comprising:
a memory (102) storing instructions;
one or more communication interfaces (106); and
one or more hardware processors (104) coupled to the memory (102) via
the one or more communication interfaces (106), wherein the one or more
hardware processors (104) are configured by the instructions to:
receive, via the one or more hardware processors, enterprise data specific to
one or more individuals associated with an enterprise from a plurality of sources,
wherein the one or more individuals comprise at least one of one or more
vendors, one or more employees and one or more contractors associated with the
enterprise;
pre-process, via the one or more hardware processors, the received
enterprise data to obtain an intermediate common input representation;
create, via the one or more hardware processors, an enterprise graph
between one or more entities from the obtained intermediate common input
representation, wherein the one or more entities includes the one or more
individuals and one or more assets associated with the enterprise, and wherein the
enterprise graph includes a plurality of vertices consisting of the one or more
entities associated with the enterprise in a present time period and a past time
period, and a plurality of edges between the one or more entities and a plurality of
attributes associated with the plurality of vertices and the plurality of edges;
calculate, via the one or more hardware processors, a weight for each of the
plurality of edges between any two connected entities based on a plurality of
enterprise graph features and the plurality of attributes;
detect, via the one or more hardware processors, one or more communities of
the one or more individuals by using a plurality of graph-based techniques based
on the calculated weights of the plurality of edges;
calculate, via the one or more hardware processors, a threshold behavior for the
one or more individuals and the one or more detected communities within an
observation window by applying a plurality of statistical methods based on the
plurality of enterprise graph features and the plurality of attributes;
perform, via the one or more hardware processors, a comparison of the
threshold behavior of the one or more individuals calculated within the observation
window with a current behavior of the one or more individuals to identify one or
more potential insiders;
perform, via the one or more hardware processors, a comparison of the current
behavior of the one or more potential insiders and the current behavior of a plurality
of individuals of the one or more detected communities to identify the one or more
potential insiders as one or more relevant insiders;
calculate, via the one or more hardware processors, a susceptibility of the
plurality of individuals for probable influence by the one or more relevant insiders
based on an analysis of a plurality of scenarios, wherein the plurality of scenarios
includes hierarchy exploitation, relationship exploitation and mixed mode;
calculate, via the one or more hardware processors, a plurality of paths taken by
the one or more relevant insiders based on the calculated susceptibility of the
plurality of individuals; and
perform, via the one or more hardware processors, an analysis of the calculated
paths to obtain a probability score indicative of a probable data loss.
7. The system of claim 6, wherein the plurality of attributes comprises
attributes specific to (i) the one or more assets and (ii) the one or more individuals
associated with the enterprise.
8. The system of claim 6, wherein the one or more hardware processors are further
configured to identify at least a subset of one or more impacting assets from the
one or more assets based on the probable data loss.
9. The system of claim 8, wherein the one or more hardware processors are further
configured to estimate risk associated with the one or more impacting assets and
obtain cumulative risk associated with the enterprise based on the probable data loss.
10. The system of claim 6, wherein the one or more hardware processors are further
configured to generate an analysis report comprising at least one of:
the information of the one or more relevant insiders, the plurality of individuals
affected by the one or more relevant insiders, the paths taken by the one or more
relevant insiders to influence the plurality of individuals, the susceptibility of the
plurality of individuals for probable influence by the one or more relevant insiders,
the probability of data loss, the one or more impacting assets and the estimated risk.
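
Illustrative sketch (not part of the claims): the claims recite a "threshold behavior" obtained by "a plurality of statistical methods" and a "probability score" obtained from the calculated paths, but they do not fix the statistics or the scoring rule. The Python sketch below is one possible reading under loudly stated assumptions: the threshold is taken as mean plus k sample standard deviations of a per-individual activity count within the observation window, and the path score is taken as the product of the susceptibilities of the individuals along a path. The function names, the parameter k and the sample data are all hypothetical.

```python
from statistics import mean, stdev


def threshold_behavior(history, k=2.0):
    """Assumed threshold: mean + k * sample standard deviation of past activity."""
    return mean(history) + k * stdev(history)


def potential_insiders(history_by_person, current_by_person, k=2.0):
    """Flag individuals whose current behavior exceeds their own threshold."""
    flagged = []
    for person, history in history_by_person.items():
        if current_by_person[person] > threshold_behavior(history, k):
            flagged.append(person)
    return sorted(flagged)


def path_probability(path, susceptibility):
    """Assumed score: product of susceptibilities of everyone after the insider."""
    score = 1.0
    for person in path[1:]:  # path[0] is the relevant insider
        score *= susceptibility[person]
    return score


# Hypothetical observation-window data: daily counts of sensitive-file accesses.
history = {"alice": [10, 12, 11, 9, 13], "bob": [5, 6, 5, 7, 6]}
current = {"alice": 30, "bob": 6}
flagged = potential_insiders(history, current)

# Hypothetical susceptibilities of individuals the insider might influence.
susceptibility = {"bob": 0.5, "carol": 0.4}
score = path_probability(["alice", "bob", "carol"], susceptibility)
```

With this sample data only the first individual exceeds the assumed threshold. An actual embodiment could use any other statistical method or path-scoring rule consistent with the disclosure.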

Documents

Application Documents

# Name Date
1 202021046660-STATEMENT OF UNDERTAKING (FORM 3) [26-10-2020(online)].pdf 2020-10-26
2 202021046660-PROVISIONAL SPECIFICATION [26-10-2020(online)].pdf 2020-10-26
3 202021046660-FORM 1 [26-10-2020(online)].pdf 2020-10-26
4 202021046660-DRAWINGS [26-10-2020(online)].pdf 2020-10-26
5 202021046660-Proof of Right [23-02-2021(online)].pdf 2021-02-23
6 202021046660-FORM 18 [19-10-2021(online)].pdf 2021-10-19
7 202021046660-ENDORSEMENT BY INVENTORS [19-10-2021(online)].pdf 2021-10-19
8 202021046660-DRAWING [19-10-2021(online)].pdf 2021-10-19
9 202021046660-CORRESPONDENCE-OTHERS [19-10-2021(online)].pdf 2021-10-19
10 202021046660-COMPLETE SPECIFICATION [19-10-2021(online)].pdf 2021-10-19
11 202021046660-FORM-26 [21-10-2021(online)].pdf 2021-10-21
12 202021046660-Power of Attorney [17-01-2022(online)].pdf 2022-01-17
13 202021046660-Form 1 (Submitted on date of filing) [17-01-2022(online)].pdf 2022-01-17
14 202021046660-Covering Letter [17-01-2022(online)].pdf 2022-01-17
15 202021046660 CORRESPONDANCE WIPO DAS 20-01-2022.pdf 2022-01-20
16 Abstract 1.jpg 2022-03-11
17 202021046660-FORM 3 [05-05-2022(online)].pdf 2022-05-05
18 202021046660-FER.pdf 2022-06-07
19 202021046660-RELEVANT DOCUMENTS [25-07-2022(online)].pdf 2022-07-25
20 202021046660-PETITION UNDER RULE 137 [25-07-2022(online)].pdf 2022-07-25
21 202021046660-OTHERS [25-07-2022(online)].pdf 2022-07-25
22 202021046660-FORM 3 [25-07-2022(online)].pdf 2022-07-25
23 202021046660-FER_SER_REPLY [25-07-2022(online)].pdf 2022-07-25
24 202021046660-CORRESPONDENCE [25-07-2022(online)].pdf 2022-07-25
25 202021046660-CLAIMS [25-07-2022(online)].pdf 2022-07-25

Search Strategy

1 202021046660searchE_23-05-2022.pdf