
System And Method For Identification And Temporal Progression Measurement Of Theme Associated With Contextual Unit

Abstract: This disclosure relates generally to theme identification and temporal progression of themes associated with a contextual unit. Conventional systems are static, single-use models developed to measure constructs relevant only to a contextual unit or an industry. The disclosed system and method provide a self-learning methodology to assess and track digital footprints relative to abstract constructs of contextual entities such as organizations and industries. In an embodiment, the disclosed system enables iteratively crawling multiple data sources to exhaustively extract dimensions and parameters thereof related to the abstract construct using an ANN model. The system prioritizes a set of parameters from amongst the extracted parameters and utilizes the set of parameters to optimize temporal matrices indicative of temporal progression of the theme. [To be published with FIGS. 2A-2B]


Patent Information

Application #
Filing Date
24 June 2020
Publication Number
53/2021
Publication Type
INA
Invention Field
ELECTRONICS
Status
Email
kcopatents@khaitanco.com
Parent Application

Applicants

Tata Consultancy Services Limited
Nirmal Building, 9th Floor, Nariman Point Mumbai Maharashtra India 400021

Inventors

1. DUTTA, Swayambhu
Tata Consultancy Services Limited Eden Building,Plot - B1, Block EP & GP, Sector-V, Saltlake, Kolkata West Bengal India 700091
2. KIRTANIA, Manish
Tata Consultancy Services Limited Eden Building,Plot - B1, Block EP & GP, Sector-V, Saltlake, Kolkata West Bengal India 700091
3. PRAMANIK, Himadri Sikhar
Tata Consultancy Services Limited Eden Building,Plot - B1, Block EP & GP, Sector-V, Saltlake, Kolkata West Bengal India 700091
4. MAJUMDAR, Supratim
Tata Consultancy Services Limited Eden Building,Plot - B1, Block EP & GP, Sector-V, Saltlake, Kolkata West Bengal India 700091
5. BANERJEE, Soumya
Tata Consultancy Services Limited Eden Building,Plot - B1, Block EP & GP, Sector-V, Saltlake, Kolkata West Bengal India 700091
6. DUTTA, Sujoy Kanti
Tata Consultancy Services Limited Eden Building,Plot - B1, Block EP & GP, Sector-V, Saltlake, Kolkata West Bengal India 700091
7. SHAH, Romil
Tata Consultancy Services Limited Olympus - A, Opp Rodas Enclave, Hiranandani Estate, Ghodbunder Road, Patlipada, Thane West Maharashtra India 400607

Specification

FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENT RULES, 2003
COMPLETE SPECIFICATION (See Section 10 and Rule 13)
Title of invention:
SYSTEM AND METHOD FOR IDENTIFICATION AND TEMPORAL PROGRESSION MEASUREMENT OF THEME ASSOCIATED WITH CONTEXTUAL UNIT
Applicant
Tata Consultancy Services Limited A company Incorporated in India under the Companies Act, 1956
Having address:
Nirmal Building, 9th floor,
Nariman point, Mumbai 400021,
Maharashtra, India
Preamble to the description
The following specification particularly describes the invention and the manner in which it is to be performed.

TECHNICAL FIELD
[001] The disclosure herein generally relates to theme identification and temporal progression measurement, and, more particularly, to system and method for identification and temporal progression measurement of themes associated with a contextual unit.
BACKGROUND
[002] With the development of the internet and social media, a huge amount of data is being created every day. The data is being utilized to assess and track digital footprints associated with an abstract construct of a contextual unit such as an industry or an organization. However, there is no standardized approach for observing and measuring the constructs for contextual units.
[003] Due to the fast pace of change in the industrial ecosystem, it is imperative to measure and compare the abstract constructs across industries. However, the conventional systems are static, single-use models developed to measure constructs relevant only to a contextual unit or an industry.
SUMMARY
[004] Embodiments of the present disclosure present technological improvements as solutions to one or more of the above-mentioned technical problems recognized by the inventors in conventional systems. For example, in one embodiment, a processor-implemented method for identification and temporal progression measurement of themes associated with a contextual unit is provided. The method includes enabling a first level crawling through a plurality of data sources to obtain data associated with an abstract construct of a contextual entity, via one or more hardware processors. Further, the method includes determining, via the one or more hardware processors, a first set of narratives associated with the abstract construct based on the data, wherein the first set of narratives comprises a list of keywords and parameters associated with the abstract construct. Further, the method includes identifying, based on the first set of narratives, one or more first themes indicative of the abstract construct associated with the contextual entity, via the one or more hardware processors, each of the one or more first themes comprising a first plurality of parameters. Herein, identifying the one or more first themes comprises generating a document term matrix (DTM) comprising frequency of keywords in the list of keywords and combination of keywords across the plurality of data sources, performing an exploratory factor analysis (EFA) on the DTM for dimensionality reduction of the DTM, and obtaining a clear factor solution (CFS) from the EFA, wherein the CFS determines a correlation of keywords to the one or more first themes. Furthermore, the method includes performing a second level crawling, by an Artificial Neural Network, on the plurality of data sources based on the first set of narratives and the one or more first themes to obtain one or more second themes associated with the contextual entity, via the one or more hardware processors, each of the one or more second themes comprising a second plurality of parameters. Also, the method includes assigning weightages to the first plurality of parameters and the second plurality of parameters, via the one or more hardware processors. Still further, the method includes generating, via the one or more hardware processors, a plurality of temporal matrices mapping data sources with the plurality of parameters using one or more NLP models, wherein each cell of the plurality of temporal matrices is associated with a score based on frequency and association. Also, the method includes optimizing, via the one or more hardware processors, the plurality of temporal matrices by iteratively performing second level crawling, assigning weightages, and generating the plurality of temporal matrices till a threshold level of data at the plurality of data sources is obtained, to obtain temporal information indicative of progression of the theme over a period of time.
[005] In another aspect, a system for identification and temporal progression measurement of themes associated with a contextual unit is provided. The system includes one or more memories; and one or more hardware processors, the one or more memories coupled to the one or more hardware processors, wherein the one or more hardware processors are configured to execute programmed instructions stored in the one or more memories to enable a first level crawling through a plurality of data sources to obtain data associated with an abstract construct of a contextual entity. Further, the one or more hardware processors are configured by the instructions to determine a first set of narratives associated with the abstract construct based on the data, wherein the first set of narratives comprises a list of keywords and parameters associated with the abstract construct. The one or more hardware processors are further configured to execute programmed instructions to identify, based on the first set of narratives, one or more first themes indicative of the abstract construct associated with the contextual entity, each of the one or more first themes comprising a first plurality of parameters. To identify the one or more first themes, the one or more hardware processors are configured by the instructions to generate a document term matrix (DTM) comprising frequency of keywords in the list of keywords and combination of keywords across the plurality of data sources, perform an exploratory factor analysis (EFA) on the DTM for dimensionality reduction of the DTM, and obtain a clear factor solution (CFS) from the EFA, wherein the CFS determines a correlation of keywords to the one or more first themes. Further, the one or more hardware processors are configured by the programmed instructions to perform a second level crawling, via an Artificial Neural Network, on the plurality of data sources based on the first set of narratives and the one or more first themes to obtain one or more second themes associated with the contextual entity, each of the one or more second themes comprising a second plurality of parameters. Furthermore, the one or more hardware processors are configured by the programmed instructions to assign weightages to the first plurality of parameters and the second plurality of parameters.
Also, the one or more hardware processors are configured by the programmed instructions to generate a plurality of temporal matrices mapping data sources with the plurality of parameters using one or more NLP models, wherein each cell of the plurality of temporal matrices is associated with a score based on frequency and association, and to optimize the plurality of temporal matrices by iteratively performing second level crawling, assigning weightages, and generating the plurality of temporal matrices till a threshold level of data at the plurality of data sources is obtained, to obtain temporal information indicative of progression of the theme over a period of time.
[006] In yet another aspect, a non-transitory computer readable medium for identification and temporal progression measurement of themes associated with a contextual unit is provided. The method includes enabling a first level crawling through a plurality of data sources to obtain data associated with an abstract construct of a contextual entity, via one or more hardware processors. Further, the method includes determining, via the one or more hardware processors, a first set of narratives associated with the abstract construct based on the data, wherein the first set of narratives comprises a list of keywords and parameters associated with the abstract construct. Further, the method includes identifying, based on the first set of narratives, one or more first themes indicative of the abstract construct associated with the contextual entity, via the one or more hardware processors, each of the one or more first themes comprising a first plurality of parameters. Herein, identifying the one or more first themes comprises generating a document term matrix (DTM) comprising frequency of keywords in the list of keywords and combination of keywords across the plurality of data sources, performing an exploratory factor analysis (EFA) on the DTM for dimensionality reduction of the DTM, and obtaining a clear factor solution (CFS) from the EFA, wherein the CFS determines a correlation of keywords to the one or more first themes. Furthermore, the method includes performing a second level crawling, by an Artificial Neural Network, on the plurality of data sources based on the first set of narratives and the one or more first themes to obtain one or more second themes associated with the contextual entity, via the one or more hardware processors, each of the one or more second themes comprising a second plurality of parameters. Also, the method includes assigning weightages to the first plurality of parameters and the second plurality of parameters, via the one or more hardware processors. Still further, the method includes generating, via the one or more hardware processors, a plurality of temporal matrices mapping data sources with the plurality of parameters using one or more NLP models, wherein each cell of the plurality of temporal matrices is associated with a score based on frequency and association. Also, the method includes optimizing, via the one or more hardware processors, the plurality of temporal matrices by iteratively performing second level crawling, assigning weightages, and generating the plurality of temporal matrices till a threshold level of data at the plurality of data sources is obtained, to obtain temporal information indicative of progression of the theme over a period of time.
[007] It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
BRIEF DESCRIPTION OF THE DRAWINGS
[008] The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate exemplary embodiments and, together with the description, serve to explain the disclosed principles:
[009] FIG. 1 illustrates an exemplary system for identification and temporal progression measurement of themes associated with a contextual unit according to some embodiments of the present disclosure.
[010] FIGS. 2A-2B are a flow diagram illustrating a method for identification and temporal progression measurement of themes associated with a contextual unit according to some embodiments of the present disclosure.
[011] FIG. 3 illustrates an example representation of dimensions and parameters associated with temporal progression of themes in accordance with some embodiments of the present disclosure.
[012] FIG. 4 illustrates an example 2-dimensional (2D) plot between dimension/ themes and their associated narrative in 2X2 matrix, in accordance with some embodiments of the present disclosure.
[013] FIG. 5A illustrates an architecture of an artificial neural network (ANN) model for the identification and temporal progression measurement of themes associated with a contextual unit in accordance with some embodiments of the present disclosure.
[014] FIG. 5B illustrates an ANN unit of the ANN model of FIG. 5A for the identification and temporal progression measurement of themes associated with a contextual unit in accordance with some embodiments of the present disclosure.

DETAILED DESCRIPTION OF EMBODIMENTS
[015] Exemplary embodiments are described with reference to the accompanying drawings. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. Wherever convenient, the same reference numbers are used throughout the drawings to refer to the same or like parts. While examples and features of disclosed principles are described herein, modifications, adaptations, and other implementations are possible without departing from the scope of the disclosed embodiments. It is intended that the following detailed description be considered as exemplary only, with the true scope being indicated by the following claims.
[016] Referring now to the drawings, and more particularly to FIGS. 1 through 5B, where similar reference characters denote corresponding features consistently throughout the figures, there are shown preferred embodiments and these embodiments are described in the context of the following exemplary system and/or method.
[017] FIG. 1 illustrates a block diagram of a system 100 for identification and temporal progression measurement of themes associated with a contextual unit, in accordance with an example embodiment. In an embodiment, the system 100 facilitates an end-to-end, data-driven, self-learning methodology to assess and track digital footprints relative to different social constructs of organizations and industries. For example, the disclosed system 100 facilitates measuring fear, emotions, agility, and innovation for the organizations or the industries under consideration. Such social or abstract constructs are constructive in nature and are measured with the help of multiple indicators that govern the constitution of the abstract construct. Given a context, the system 100 can start measuring abstract constructs and can also compare them with each other. The disclosed system enables measuring abstract constructs such as innovation and culture, and also assessing temporal progressions of the measured abstract constructs.
[018] The system 100 includes or is otherwise in communication with one or more hardware processors such as a processor 102, at least one memory such as a memory 104, and an I/O interface 106. The processor 102, memory 104, and the I/O interface 106 may be coupled by a system bus such as a system bus 108 or a similar mechanism. The I/O interface 106 may include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface, and the like. The interfaces 106 may include a variety of software and hardware interfaces, for example, interfaces for peripheral device(s), such as a keyboard, a mouse, an external memory, a camera device, and a printer. Further, the interfaces 106 may enable the system 100 to communicate with other devices, such as web servers and external databases. The interfaces 106 can facilitate multiple communications within a wide variety of networks and protocol types, including wired networks, for example, local area network (LAN), cable, etc., and wireless networks, such as Wireless LAN (WLAN), cellular, or satellite. For the purpose, the interfaces 106 may include one or more ports for connecting a number of computing systems with one another or to another server computer. The I/O interface 106 may include one or more ports for connecting a number of devices to one another or to another server.
[019] The hardware processor 102 may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the hardware processor 102 is configured to fetch and execute computer-readable instructions stored in the memory 104.
[020] The memory 104 may include any computer-readable medium known in the art including, for example, volatile memory, such as static random access memory (SRAM) and dynamic random access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes. In an embodiment, the memory 104 includes a plurality of modules 120 and a repository 140 for storing data processed, received, and generated by one or more of the modules 120. The modules 120 may include routines, programs, objects, components, data structures, and so on, which perform particular tasks or implement particular abstract data types.

[021] The repository 140, amongst other things, includes a system database 142 and other data 144. The other data 144 may include data generated as a result of the execution of one or more modules in the other modules 130. The repository 140 is further configured to maintain a plurality of dimensions (themes), parameters, links, assigned weightages and scores associated with the disclosed method of identification and temporal progression measurement of themes associated with a contextual unit, as will be described further in detail in the description below.
[022] According to the present subject matter, the system 100 performs identification and temporal progression measurement of themes associated with a contextual unit. For example, the disclosed system is a self-learning system that maintains pace with changing business or ecosystem dynamics and upgrades itself with each subsequent iteration, thereby keeping itself relevant in multiple scenarios. The system iterates the processes until near-fit dimensions are generated along with the narratives to be used to measure the maturity and the score of any social construct. The system 100 overcomes the issue of standardization of approach to measure constructs for contextual units across industries and lines of business, with capabilities to aggregate across contextual units and predict state changes. The process is described further with reference to FIGS. 2A-3.
[023] FIGS. 2A-2B are a flow diagram for a method 200 of identification and temporal progression measurement of themes associated with a contextual unit in accordance with some embodiments of the present disclosure. The method 200 depicted in the flow chart may be executed by a system, for example, the system 100 of FIG. 1. In an example embodiment, the system 100 may be embodied in a computing device.
[024] Operations of the flowchart, and combinations of operations in the flowchart, may be implemented by various means, such as hardware, firmware, processor, circuitry and/or other device associated with execution of software including one or more computer program instructions. For example, one or more of the procedures described in various embodiments may be embodied by computer program instructions. In an example embodiment, the computer program instructions, which embody the procedures described in various embodiments, may be stored by at least one memory device of a system and executed by at least one processor in the system. Any such computer program instructions may be loaded onto a computer or other programmable system (for example, hardware) to produce a machine, such that the resulting computer or other programmable system embodies means for implementing the operations specified in the flowchart. It will be noted herein that the operations of the method 200 are described with the help of the system 100. However, the operations of the method 200 can be described and/or practiced by using any other system.
[025] The method for generating and measuring abstract constructs for a contextual unit initiates when a user mentions a problem statement, and an abstract construct corresponding to the problem is to be measured. Herein, the term ‘abstract construct’ refers to a soft construct such as innovation maturity, emotional state determination and so on. For instance, the user may wish to determine the innovation maturity of a company X over a period of time, where the ‘innovation maturity of the company X’ may become the abstract construct.
[026] The disclosed system receives input pertaining to a problem statement associated with the abstract construct from a user. The user may provide one or more keywords and/or key phrases specific to the abstract construct. For example, the user may provide keywords such as “Innovation Maturity” OR “Innovation Index” OR “Innovation” OR “Technology” OR “Innovative” AND “Company ‘X’” on the UI associated with the system 100. In an embodiment, the user may provide keywords and/or key phrases at a UI, for example, the I/O interface 106 (FIG. 1).
[027] The system 100 may generate insights on the abstract construct of ‘Innovation Maturity’ by conducting a search in which the terms of the probe area are combined with OR logic and the context is included with AND logic. In response, the system may generate keywords or phrases related to the problem. The system 100 may determine a set of keywords that can be leveraged to search around the abstract construct (specified problem). The set of keywords may include, but is not limited to, distinct keywords and combinations of keywords. The set of keywords may be obtained by using one or more NLP techniques, such as topic modeling, tokenization, named entity resolution, sentiment analysis, and so on. The set of keywords so determined is further sliced into smaller relevant keywords/key phrases. For example, the system 100 may remove all the adverbs or verbs or adjectives to some extent from amongst the set of keywords to determine the most relevant words that are aligned with the problem statement. The system 100 may utilize an Artificial Neural Network (ANN) model for slicing or breaking the keywords and phrases to obtain the most relevant words. An example ANN model utilized for slicing the keywords and phrases is described further with reference to FIGS. 5A and 5B. The most relevant words may be stored in a storage or a repository. In the present example, the system 100 may process the keywords to generate searchable strings and/or words by using an ANN model. For example, the problem statement may be divided into certain keywords or searchable strings such as, but not limited to, ‘Innovation Maturity’, ‘Innovation Index’, ‘Company X and Innovation’, ‘Company X and Innovation maturity index’, and so on. The keywords and searchable strings may be stored in a database of a repository. The disclosed system can relate the specific company names and thereby generate related keywords and information. While some predefined information or relationship tables may be present in the databases, new words or phrases keep being added.
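The OR/AND combination of probe terms and context described above can be sketched as follows. This is a minimal illustration only: the helper name `build_query` and the exact query syntax are assumptions and are not taken from the disclosure, which does not specify how the search string is assembled.

```python
def build_query(probe_terms, context_terms):
    """Hypothetical helper: OR together the probe terms, AND in the context terms."""
    def quote(term):
        # Wrap each term in double quotes so multi-word phrases stay together.
        return '"' + term.replace('"', "") + '"'

    probe = " OR ".join(quote(t) for t in probe_terms)
    context = " AND ".join(quote(t) for t in context_terms)
    if probe and context:
        return f"({probe}) AND {context}"
    # If one side is empty, fall back to whichever side has terms.
    return probe or context


query = build_query(
    ["Innovation Maturity", "Innovation Index", "Innovation"],
    ["Company X"],
)
print(query)
```

Grouping the probe terms in parentheses keeps the context term applied to every probe term, which matches the intent of searching the probe area within a given context.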
[028] The set of keywords so obtained are utilized by a crawler to perform a first level crawling to search for relevant links from a plurality of data sources. At 202, the method 200 includes enabling the first level crawling through the plurality of data sources to obtain a data (for example, links) associated with the abstract construct of the contextual entity. The crawler receives inputs in form of the set of keywords and utilizes said input to perform the first level crawling at the plurality of sources.
[029] The keywords and searchable strings are used by the system to crawl through the available secondary data sources and generate a plurality of links. A set of links from amongst the plurality of links is stored in the database or repository depending on the availability of the links. For example, for the phrase ‘Innovation’, the links may be:

• Innovation maturity Matrix – hyperlink1
• Innovation Maturity Model – hyperlink3,
• How to measure your organization – hyperlink3,
• Innovation Maturity Model – hyperlink4,
• Open Innovation Maturity Framework – hyperlink5, and so on.
[030] At 204, the method 200 includes determining a first set of narratives associated with the abstract construct of the contextual entity based on data obtained from the first level crawling through a plurality of data sources. The first set of narratives includes a list of keywords and parameters associated with the abstract construct. Based on the first level crawling, the system retrieves data that includes relevant links from the web. The links may be scanned for extraction of text or words or paragraphs. The paragraphs are subjected to different Natural Language Processing (NLP) techniques to generate narratives. The narratives may include unstructured text or paragraphs.
[031] For example, corresponding to the keywords ‘Innovation maturity Matrix’, the system may have retrieved the link hyperlink1, and for the aforementioned link, the following narrative may be obtained:
[032] Unstructured Abstract: Difficult to implement organizational changes; Lessons from customers leading to development of maturity understanding; Expect solid result from innovation; Traditional R&D style; Innovation theatre; Innovation activities such as ideas, challenges, Drivers of Innovation; Innovation Methodology; hackathons; Capability Maturity Model Integration; scale; balance, clear focus and the right structures; 70-20-10 Rule as an example; Classifying the Innovators; importance of innovation; Innovation Promotions; Amplification of Innovation; very innovative culture; Innovation Impact; Continuity Factor of Innovation; Benefits of Innovation; Output of Innovation.
[033] The web crawler is configured to create a file with all the relevant links pertaining to the problem statement, and the file is stored (version controlled) in a Data Storage. For each link generated against a keyword or a combination of keywords, the system 100 gathers specific portions of the narrative. The links may be scanned and the paragraphs may be subjected to different NLP (Natural Language Processing) techniques to generate narratives.
[034] The system 100 constructs a matrix of links versus keywords or combinations of keywords. Each cell at the intersection of a link and a keyword, or a combination of keywords, is populated with the associated narrative as obtained from that web / data source. This matrix is termed the Document Term Matrix (DTM).
[035] At 206, the method 200 includes identifying, based on the first set of narratives, a theme (or dimension) indicative of the abstract construct associated with the contextual entity. For identifying the theme, the links are obtained and the first set of narratives are developed across the different searchable keywords/phrases. For example, a document term matrix (DTM) is generated (at 208) having frequency of keywords in the list of keywords and combination of keywords across the plurality of data sources. Exploratory factor analysis (EFA) with Principal Component Analysis (PCA) is performed at 210 on the DTM for dimensionality reduction of the DTM and obtaining a clear factor solution (CFS) from the EFA. The CFS determines association of keywords/combinations of keywords to evolve the representative theme. Associations of keywords/combinations of keywords relating to themes form the parameters to explain the evolved themes within the contextual unit. The evolvement of themes is indicative of temporal progression thereof.
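As a rough illustration of the frequency form of the DTM, the sketch below counts how often each keyword or key phrase occurs in the narrative gathered for each link. The function names, the sample narratives, and the placeholder link identifiers are assumptions made for illustration; the disclosure does not prescribe a counting routine.

```python
import re


def count_phrase(text, phrase):
    """Count case-insensitive whole-phrase occurrences of `phrase` in `text`."""
    pattern = r"\b" + re.escape(phrase) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def build_dtm(narratives, keywords):
    """Return {link: [count for each keyword]} for narratives given as {link: text}."""
    return {
        link: [count_phrase(text, kw) for kw in keywords]
        for link, text in narratives.items()
    }


# Illustrative narratives; the link names are placeholders, not real URLs.
narratives = {
    "hyperlink1": "Drivers of Innovation; Innovation Methodology; Benefits of Innovation",
    "hyperlink2": "Innovation Impact; Output of Innovation",
}
keywords = ["innovation", "drivers of innovation", "innovation impact"]

dtm = build_dtm(narratives, keywords)
for link, row in dtm.items():
    print(link, row)
```

Each row of the result corresponds to one link and each column to one keyword or keyword combination, which is the shape assumed by the factor analysis step that follows.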
[036] In an embodiment, once the CFS is obtained from the EFA, a table is created with Rows (Themes) and Columns (Links), as shown below. The table is dimensionally reduced and undergoes Factor Analysis, where factor loading is utilized as weightages. Factor Analysis by PCA (Principal Component method) is a widely used statistical technique for dimensionality reduction. Exploratory Factor Analysis (EFA) is used for factorization of the unique constructs using the factor loading and cross loading. The PCA (Principal Component Analysis) method is performed in order to obtain a list of keywords or combinations of keywords. New Themes or Dimensions are obtained, each of which is a collection of related columns based on Factor Loading and the parameters in the columns. Herein, EFA is used for unique constructs creation, followed by PCA to reduce dimensions, thereby ultimately leading to the creation of new themes. The probe solution is constructed based on occurrence of a word or phrase across multiple data sources, also referred to as dimensions/themes that answer the problem statement. Association of words relating to the above dimensions or themes forms the parameters to explain the same. Frequency, word association and a vector algorithm using ANN facilitate the creation of the CFS. The creation of the CFS is explained further in detail with reference to an example.
[037] The factor is a dimension, which is a collection of related columns based on factor loading and columns are the parameters. The CFS is indicative of frequency matrix (or temporal matrices) of certain key words and/or combination of different keywords.
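To make the notion of factor loading concrete, the following sketch extracts loadings on a single principal component of a keyword correlation matrix using power iteration. It is a simplified stand-in, not the disclosed procedure: a full EFA would extract several factors and typically apply a rotation, and the column data, function names, and iteration count here are assumptions. The sketch also assumes no keyword column is constant, since a constant column has no defined correlation.

```python
import math


def correlation_matrix(columns):
    """Pearson correlations between equally long numeric columns."""
    centered = []
    for col in columns:
        mean = sum(col) / len(col)
        centered.append([x - mean for x in col])
    norms = [math.sqrt(sum(d * d for d in dev)) for dev in centered]
    size = len(columns)
    return [
        [
            sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
            for j in range(size)
        ]
        for i in range(size)
    ]


def first_factor_loadings(corr, iterations=200):
    """Loadings on the dominant principal component, found by power iteration."""
    n = len(corr)
    vec = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        nxt = [sum(corr[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    # Rayleigh quotient of the unit-length vector approximates the eigenvalue.
    eigenvalue = sum(vec[i] * corr[i][j] * vec[j] for i in range(n) for j in range(n))
    # A loading is the eigenvector entry scaled by the square root of the eigenvalue.
    return [v * math.sqrt(eigenvalue) for v in vec]


# Hypothetical keyword-frequency columns observed over four links.
columns = [
    [1, 2, 3, 4],  # e.g. "drivers of innovation"
    [2, 4, 6, 8],  # e.g. "enablers of innovation", moves with the first column
    [1, 0, 0, 1],  # e.g. "innovation outcome", uncorrelated with the first two
]
loadings = first_factor_loadings(correlation_matrix(columns))
print([round(x, 3) for x in loadings])
```

In this toy data the first two keywords load strongly on the extracted factor while the third does not, which is the kind of grouping that lets related columns be collected into one dimension.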
[038] Continuing the same example, the clear factor solution for the term ‘innovation maturity’ yields ‘new themes’ as:

Dimension 1               | Dimension 2         | Dimension 3
Drivers of Innovation     | Innovation Benefits | Amplification
Enablers of Innovation    | Innovation Outcome  | Promotions of Innovation
Influencers of Innovation |                     | Advertisements of Innovation
[039] Herein, each of the dimensions (or themes) has parameters. For example, ‘dimension 1’ (or theme) has parameters including ‘Drivers of Innovation’, ‘Enablers of Innovation’, and ‘Influencers of Innovation’.
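One simple way to turn a table of factor loadings into dimensions with their parameters is to assign each parameter to the factor on which it loads most strongly. The sketch below assumes this rule; the loading values are invented for illustration, and the 0.5 cut-off and the name `group_by_factor` are assumptions rather than values or names from the disclosure.

```python
def group_by_factor(loadings, threshold=0.5):
    """Assign each parameter to the factor with its largest absolute loading."""
    dimensions = {}
    for parameter, row in loadings.items():
        best = max(range(len(row)), key=lambda k: abs(row[k]))
        # Parameters that load weakly on every factor are left unassigned.
        if abs(row[best]) >= threshold:
            dimensions.setdefault(f"Dimension {best + 1}", []).append(parameter)
    return dimensions


# Loadings below are invented for illustration only.
loadings = {
    "Drivers of Innovation": [0.82, 0.10, 0.05],
    "Enablers of Innovation": [0.77, 0.21, 0.12],
    "Innovation Benefits": [0.15, 0.80, 0.08],
    "Innovation Outcome": [0.09, 0.74, 0.20],
    "Amplification": [0.11, 0.18, 0.79],
}

dimensions = group_by_factor(loadings)
for name, parameters in dimensions.items():
    print(name, parameters)
```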
[040] Occurrence of a word or phrase across multiple data sources facilitates in constructing the solution for the theme identification. The solution to theme identification may be referred to as ‘dimensions’/ ‘themes’ that answer the problem statement, and is obtained using an ANN model. The ANN model takes frequency, word association and vector algorithm as inputs to determine association of words relating to the above dimension or themes, and outputs the parameters to explain said association. The frequency matrix generated using different links for the present example may be represented as:

Themes | Link 1 | Link 2 | Link ... | Link M
New Theme 1 | 0 | 0 | 1 | 1
New Theme 2 | 1 | 1 | 0 | 0
New Theme ... | 1 | 1 | 1 | 0
New Theme N | 0 | 0 | 0 | 1
[041] In the above frequency matrix, the rows represent themes and the columns represent links. The table is dimensionally reduced and undergoes factor analysis, where the factor loadings are utilized as weightages for primary terms within a dimension and to ascertain how the primary terms relate to that dimension, as will be explained in subsequent steps. The frequency matrix is later utilized for determining temporal progression of the abstract construct; hence the ‘frequency matrix’ may also be referred to as the ‘temporal matrix’ in this description. The rows of a temporal matrix comprise the themes derived previously, and the columns comprise the links.
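A themes vs links frequency matrix of the kind described above can be sketched as follows. The link texts, the helper name `build_frequency_matrix` and the simple case-insensitive substring test are assumptions made for illustration only; the disclosure obtains the matrix with an ANN model rather than plain string matching.

```python
def build_frequency_matrix(themes, link_texts):
    """Return (links, matrix) where matrix[theme] holds one 0/1 entry per link.

    An entry is 1 when the theme phrase occurs in the text of that link
    (case-insensitive substring match) and 0 otherwise.
    """
    links = list(link_texts)
    matrix = {}
    for theme in themes:
        needle = theme.lower()
        matrix[theme] = [1 if needle in link_texts[link].lower() else 0
                         for link in links]
    return links, matrix

# Hypothetical crawled text, keyed by link name.
link_texts = {
    "Link 1": "Company X named new enablers of innovation this year.",
    "Link 2": "Drivers of innovation at company X include investments.",
    "Link 3": "Analysts discuss drivers of innovation and enablers of innovation.",
}
themes = ["Drivers of Innovation", "Enablers of Innovation"]
links, matrix = build_frequency_matrix(themes, link_texts)

for theme in themes:
    print(theme, matrix[theme])
```

Each row of the result corresponds to a theme and each position in the row to a link, mirroring the layout of the example table.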
[042] At 212, the method 200 includes performing a second level crawling, via an Artificial Neural Network (ANN), on the plurality of data sources based on the first set of narratives and the theme to obtain one or more themes associated with the contextual entity. The second level crawling enables capturing temporal progression of the abstract construct associated with the contextual entity by determining abstracts and/or information of data points collected for a particular contextual entity with respect to particular themes. In an embodiment, various old themes (prestored in the database/repository) may be combined with newer themes in order to ensure exhaustiveness of the data points, information and abstracts. The second level of data source crawling, performed based on old and new themes by using an ANN model (described further with reference to FIGS. 5A and 5B), ensures exhaustiveness of the data collection for each keyword and also captures temporal progression. In this step, abstracts or information of data points are collected for the particular company (or contextual unit) with respect to particular themes. Taking the present example again, the new themes identified for ‘Innovation maturity’ may be Innovation Drivers of company X, Innovation Methodologies of company X, Innovation Amplification of company X, Innovation Output of company X, Innovation Benefits of company X, Older Themes, Innovation Enablers of company X, and so on. The innovation driver parameters may be identified as factors of innovation; factors which influence innovation; culture; investments; talent pool; and specific technology embracement. For example, for company X: $Y Million investments; for company X: Z technology adopted; and so on.
[043] Each of the plurality of themes (obtained as a result of the first level crawling and the second level crawling) comprises a plurality of parameters or dimensions. Out of the identified first and second pluralities of parameters, a set of parameters for the identified themes may be selected, and a hygiene check is performed on the columns to rows ratio (using an ANN model). For instance, a hygiene check of a minimum 1:5 columns to rows ratio is maintained to ensure a minimum of 5 parameters for each of the dimensions or identified themes. Weightages are assigned to the selected parameters based on factor loading at 214. An example of assigned weightages in the present example is shown below:

Innovation Drivers | Weightages
Investments by company X | 0.4
Specific Technology | 0.2
Culture of Company X | 0.12
Talent Pool | 0.09
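The hygiene check described above, which requires at least five parameters per dimension, can be sketched in a few lines. The dimension names, the placeholder parameter names and the function `hygiene_check` are hypothetical; the disclosure performs this check using an ANN model, whereas the sketch only counts parameters.

```python
# Minimum number of parameters per dimension, taken from the 1:5 ratio in the text.
MIN_PARAMETERS_PER_DIMENSION = 5

def hygiene_check(dimensions, minimum=MIN_PARAMETERS_PER_DIMENSION):
    """Return the names of dimensions that have fewer than `minimum` parameters."""
    return [name for name, params in dimensions.items() if len(params) < minimum]

# Hypothetical dimensions with placeholder parameter names.
dimensions = {
    "Dimension A": ["a1", "a2", "a3", "a4", "a5"],
    "Dimension B": ["b1", "b2", "b3"],
}
failing = hygiene_check(dimensions)
print("Dimensions needing more parameters:", failing)
```

A dimension reported here would, under this reading, be a candidate for a further level of crawling to find additional parameters.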
[044] Herein, multiple levels of crawling may be performed for identifying more parameters, and such identified parameters may be utilized for selecting the set of parameters therefrom. The links (obtained from the subsequent levels of crawling), along with the parameters identified from the links, are stored in a links vs parameters matrix (or temporal matrix) at 216. The links vs parameters matrix may be utilized for determining temporal progress of the identified themes associated with the abstract construct. An example of the links vs parameters matrix for the current example is shown below:

Drivers of Innovation | Links | Timeline
Investments by company X | Link 1, Link 2, Link 3, Link 4, Link 5 | 2018, 2019
Specific Technology | Link 1, Link 2, Link 3 | 2017
Culture of Company X | Link 1, Link 2 | 2019
Talent Pool | Link 1, Link 2 | 2018
[045] Herein, all captured links may have time stamps attached to them, so that the progression of a dimension or a parameter can be identified. The links vs parameters matrix (also referred to as the temporal matrix), along with the weightages, is utilized for prioritization of narratives. The narratives may be prioritized based on the frequency of occurrence of the narratives in the links vs parameters matrix. The prioritized narratives are pulled out of the links. An example of prioritization of narratives is given below:

Drivers of Innovation | Priority
Investments by company X | 1
Specific Technology | 2
Culture of Company X | 3
Talent Pool | 4
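The prioritization above can be reproduced with a short sketch that ranks parameters by the number of links in which they occur. Breaking ties by weightage is an assumption introduced here (the text only states that the matrix is used along with the weightages), and the function name `prioritize` is hypothetical. The link counts and weightages are taken from the example tables.

```python
def prioritize(link_matrix, weightages):
    """Rank parameters by link count (descending); ties are broken by weightage.

    The tie-break rule is an illustrative assumption, not stated in the text.
    """
    ordered = sorted(
        link_matrix,
        key=lambda p: (-len(link_matrix[p]), -weightages.get(p, 0.0)),
    )
    return {param: rank for rank, param in enumerate(ordered, start=1)}

link_matrix = {
    "Investments by company X": ["Link 1", "Link 2", "Link 3", "Link 4", "Link 5"],
    "Specific Technology": ["Link 1", "Link 2", "Link 3"],
    "Culture of Company X": ["Link 1", "Link 2"],
    "Talent Pool": ["Link 1", "Link 2"],
}
weightages = {
    "Investments by company X": 0.4,
    "Specific Technology": 0.2,
    "Culture of Company X": 0.12,
    "Talent Pool": 0.09,
}
priorities = prioritize(link_matrix, weightages)
```

With these inputs the two parameters that share a link count of two are separated by their weightages, which yields the same ordering as the example table.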
[046] The listing of the set of narratives within the dimensions (factors) may be stored for parameters based on the threshold number of frequencies. The related narratives may be pulled out from the links.
[047] In an implementation of the disclosed system, the list of dimensions including factors and elements may be updated on a continuous basis for a particular abstract construct (i.e., problem statement and context). Each time the system iterates the process (including multiple iterations of crawling, deriving links and relevant parameters, and generating temporal matrices), the unique selected dimensions get updated. At 218, the method 200 includes optimizing the plurality of temporal matrices by iteratively performing a subsequent level of crawling, determining the plurality of parameters and generating the plurality of temporal matrices until a threshold level of data at the plurality of data sources is obtained, to obtain temporal information indicative of progression of the theme over a period of time. Herein, the iterative process of generating temporal matrices based on new parameters identified in each iteration renders the disclosed method and system ‘self-learning’. Moreover, comparison of the temporal matrices generated in each iteration facilitates determining the temporal progression associated with the abstract construct over a period of time.
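The iterative optimization at 218 can be pictured with the following minimal sketch. It assumes a hypothetical `crawl` callable that returns, for a set of known parameters, the links mentioning any of them together with the parameters found in each link; `FAKE_WEB`, `fake_crawl` and the stopping rule (stop when an iteration yields fewer than `min_new_links` unseen links) are illustrative stand-ins for the crawler and the threshold level of data.

```python
def optimize(crawl, seed_parameters, min_new_links=1, max_iterations=10):
    """Crawl repeatedly, growing the parameter set, until few new links appear.

    Returns the final parameter set and one temporal matrix per iteration
    (parameter -> sorted list of links mentioning it).
    """
    parameters = set(seed_parameters)
    link_params = {}  # link -> parameters found in it, accumulated over iterations
    snapshots = []
    for _ in range(max_iterations):
        found = crawl(parameters)
        new_links = set(found) - set(link_params)
        if len(new_links) < min_new_links:  # assumed threshold: stop here
            break
        for link in new_links:
            link_params[link] = set(found[link])
            parameters |= found[link]
        snapshots.append({
            p: sorted(l for l, ps in link_params.items() if p in ps)
            for p in sorted(parameters)
        })
    return parameters, snapshots

# Hypothetical stand-in for the data sources.
FAKE_WEB = {
    "Link 1": {"Investments"},
    "Link 2": {"Investments", "Talent Pool"},
    "Link 3": {"Talent Pool", "Culture"},
}

def fake_crawl(parameters):
    """Return every fake link that mentions at least one known parameter."""
    return {link: ps for link, ps in FAKE_WEB.items() if ps & parameters}

final_parameters, snapshots = optimize(fake_crawl, {"Investments"})
```

Comparing consecutive entries of `snapshots` shows how a parameter discovered in one iteration leads to new links in the next, which is the sense in which the loop is iterative.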
[048] The disclosed system enables temporal evolution of constructive variables, which is achieved through iterative crawling. In various embodiments, the cells (or links, or rather narratives) are the basic building blocks which, when collated, form the parameters or the rows. These parameters, when combined, form a dimension. An ultimate solution consists of several dimensions, which are combinations of several parameters, which are in turn formed by specific narratives from the links or the cells. Each of these building blocks is dynamic in nature, as the content of a cell changes with time because of the dynamicity of the exploration process. The exploration process, being iterative in nature, takes cues from similar data points stored in the past and evolves as per the latest updates in the secondary domain of search which is being used for web-scraping. Referring to FIG. 3, if the content of Cell 4 is X at time T, it is bound to become X’ at time T+N1 and X’’ at time T+N2. Hence, with each iteration over time, the contents of the cell (the smallest building block) evolve, which is reflected in the formation of the evolved parameters, which in turn form the evolved dimensions. The role of the ANN (Artificial Neural Network) is to incorporate the concept of iteration and comparison with the old stored data points.

[049] The temporal progression of the abstract construct for the present example may be as follows:

Drivers of Innovation | Priority
Investments by company X | $Y in MN 2018; $Y2 in MN 2019; $Y3 in MN 2020 for hubs
Specific Technology | Big Data Analytics in 2017; Blockchain in 2019
[050] In the present example, updates to only two parameters (or elements) have been shown for brevity of description. However, in different applications and scenarios, the number of dimensions may be more than those disclosed herein.
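A temporal progression of this kind can be derived from time-stamped narratives by grouping them per parameter and ordering them by year. The record layout `(parameter, year, narrative)` and the function name `temporal_progression` are assumptions for illustration; the narratives reuse placeholder values from the example.

```python
from collections import defaultdict

def temporal_progression(records):
    """Group (parameter, year, narrative) records by parameter, ordered by year."""
    grouped = defaultdict(list)
    for parameter, year, narrative in records:
        grouped[parameter].append((year, narrative))
    return {parameter: sorted(items) for parameter, items in grouped.items()}

# Records as they might arrive from the links, in no particular order.
records = [
    ("Specific Technology", 2019, "Blockchain"),
    ("Investments by company X", 2019, "$Y2 in MN"),
    ("Specific Technology", 2017, "Big Data Analytics"),
    ("Investments by company X", 2018, "$Y in MN"),
]
progression = temporal_progression(records)

for parameter, timeline in progression.items():
    print(parameter, timeline)
```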
[051] The disclosed system can generate an abstract amplification index by taking the weighted average across dimension levels for assessing relevance of the determined parameters/dimensions. The weighted averages may be taken across parameters or the frequency of occurrences. An example of generation of abstract amplification indices is shown below:

Dimensions | Weighted Averages
Innovation Drivers of company X | 4.5
Innovation Methodologies of company X | 4.1
Innovation Amplification of company X | 3.9
Innovation Output of company X | 3.1
Innovation Benefits of company X | 5.9
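The weighted average behind such an index can be written as $\sum_i w_i s_i / \sum_i w_i$, where $w_i$ is the weightage of parameter $i$ and $s_i$ is its score. The sketch below applies this to one dimension; the weightages come from the earlier example table, while the per-parameter scores and the function name `amplification_index` are hypothetical, so the result is not expected to match the values in the table above.

```python
def amplification_index(scores, weightages):
    """Weighted average of parameter scores, using the weightages as weights."""
    total_weight = sum(weightages[p] for p in scores)
    if total_weight == 0:
        raise ValueError("weightages must not sum to zero")
    return sum(scores[p] * weightages[p] for p in scores) / total_weight

weightages = {
    "Investments by company X": 0.4,
    "Specific Technology": 0.2,
    "Culture of Company X": 0.12,
    "Talent Pool": 0.09,
}
# Hypothetical per-parameter scores for the 'Innovation Drivers' dimension.
scores = {
    "Investments by company X": 5,
    "Specific Technology": 4,
    "Culture of Company X": 4,
    "Talent Pool": 3,
}
index = amplification_index(scores, weightages)
print(round(index, 2))
```

Dividing by the sum of the weightages keeps the index on the same scale as the scores even when the weightages do not add up to one.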
[052] The next stage includes assessing relevance, which follows publication of the abstract amplification index. Based on the identified parameters and the abstract amplification index, the system ascertains the positivity, negativity and neutrality of the identified items (columns), that is, it assigns positive and negative signs. In particular, the system classifies and scores each instance of a narrative as negative, neutral or positive and generates temporal matrices (source vs dimension/key terms) along with the scores. In the 'Items vs links' matrix, sentiment/narrative analysis is performed on the narrative in each cell and a score is assigned to each cell. The scores help in incorporating the progression into dimensional indices, and factoring in the time stamps of the links allows comparison across entities to be performed. Further, the system uses ANN and NLP together to generate the sentiments of the narratives (the cells or the links) with respect to the parameters. An example of assigning positive, negative and neutral values to the parameters in the present example is shown below:

Dimensions Parameter Values +2 +2 -1 +2 +2
Innovation Drivers of company X -2 +2 -2 +1 +1 +1 +1 +1 +2 +1 -1 -1 +2 +1 -2 +1
Innovation Methodologies of company X -1
Innovation Amplification of company X +1
Innovation Output of company X -1
Innovation Benefits of company X +1
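The scoring of narratives on a scale from -2 to +2 can be sketched with a toy word-list approach. This is only a stand-in: the disclosure uses ANN and NLP models for sentiment, whereas the sketch counts words from two small hypothetical lexicons (`POSITIVE`, `NEGATIVE`) and clamps the difference to the range used in the example.

```python
# Hypothetical word lists; a real system would use a trained sentiment model.
POSITIVE = {"growth", "award", "breakthrough", "invests"}
NEGATIVE = {"layoffs", "delay", "lawsuit", "cuts"}

def score_narrative(text):
    """Score a narrative from -2 to +2 by counting positive and negative words."""
    words = [w.strip(".,;:!?").lower() for w in text.split()]
    raw = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return max(-2, min(2, raw))  # clamp to the -2..+2 scale

def label(score):
    """Map a numeric score to 'positive', 'negative' or 'neutral'."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

narratives = [
    "Company X invests in breakthrough labs, wins award",
    "Project delay after budget cuts.",
    "Company X opened an office.",
]
cell_scores = [score_narrative(n) for n in narratives]
cell_labels = [label(s) for s in cell_scores]
```

Each entry of `cell_scores` plays the role of one cell in the 'Items vs links' matrix.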

[053] The data associated with the disclosed embodiments is stored in a data storage or a repository. Said data may include final construct, 2D Plots, dimensions and the corresponding variables and information. An example of a 2D plot is shown in FIG. 3.
[054] Referring to FIG. 3, a 2D plot between dimensions/themes and their associated narratives in a 2x2 matrix is illustrated. The 2D plots form a visual basis for comparison of temporal progression of the theme over a period of time. The axes of this representative matrix are time, based on observed data, and sentiment scores ranging from +2 to -2. In the present example, the 2D plot is a 4P plot, where 4P refers to Primary, Prolonged, Potential and Peripheral. All of these are created along the two dimensions of Continuity and Impact for the Innovation Maturity Index. The Continuity and Impact may vary, and so the 4P plots may also vary. 4P is a representative example, where the 4 Ps are the names of the 4 boxes of a matrix in which each of the dimensions of continuity and impact is either high or low.
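Placement of an item into one of the four boxes can be sketched as a simple high/low test on each axis. The text does not state which box corresponds to which combination of continuity and impact, so the mapping below (and the 0.5 threshold on values normalized to the range 0 to 1) is purely an assumption for illustration.

```python
def quadrant(continuity, impact, threshold=0.5):
    """Return the 4P box for normalized continuity and impact values.

    Assumed mapping (not specified in the text):
      high continuity, high impact -> Primary
      high continuity, low impact  -> Prolonged
      low continuity,  high impact -> Potential
      low continuity,  low impact  -> Peripheral
    """
    high_continuity = continuity >= threshold
    high_impact = impact >= threshold
    if high_continuity and high_impact:
        return "Primary"
    if high_continuity:
        return "Prolonged"
    if high_impact:
        return "Potential"
    return "Peripheral"

# Hypothetical (continuity, impact) values for three parameters.
placements = {
    "Investments by company X": quadrant(0.9, 0.8),
    "Talent Pool": quadrant(0.7, 0.2),
    "Specific Technology": quadrant(0.3, 0.9),
}
```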
[055] Referring to FIG. 5A, an architecture of an ANN model 500 employed by the system 100 of FIG. 1 for identification and temporal progression measurement of themes associated with a contextual unit is illustrated in accordance with various embodiments of the present disclosure. Using the ANN model 500, the system 100 learns to perform the tasks by considering new variables with each iteration without being programmed with any task-specific rules. The disclosed system may have multiple (or N number of) ANN units, as illustrated in FIG. 5A. An ANN unit 550 from amongst the multiple ANN units is further disclosed with reference to FIG. 5B. The multiple ANN units, over multiple iterations, determine the temporal progression (or evolvement) of themes.
[056] Each ANN unit of the ANN model 500 facilitates identifying relevant variables around various tasks (or steps indicated in the flowchart of method 200) with each iteration. For instance, the tasks/steps include scanning across various secondary sources, specific data source crawling (second level crawling) with themes, obtaining one or more second themes associated with the contextual entity, identifying top parameters for the theme identified for the contextual entity, performing the hygiene check on the columns to rows ratio, and using the ANN for assessing relevance of the obtained themes. For example, for the process of system search criteria, the ANN model/unit slices the keywords (captured previously) into smaller relevant words and/or phrases. Herein, the ANN model/unit enables an intelligent way of word searching as it is self-learning. For example, it may try to remove all the verbs, adverbs or adjectives to determine the most relevant words aligned to the abstract construct.
[057] The ANN model/unit for theme identification is used to assign frequencies to the words based on the number of occurrences after the factor analysis is performed on the theme-link structured data.
[058] The ANN model/unit for second level crawling identifies redundancies and/or duplications by comparing new themes in combination with older stored themes from previous stages/levels of crawling. The ANN model/unit further identifies the themes that are relevant to the abstract construct.
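The comparison of new themes against older stored themes can be illustrated with a simple word-overlap test. This is a stand-in for the ANN-based comparison: themes are reduced to sets of words, and a new theme is treated as a duplicate when its Jaccard similarity to an already kept theme reaches a threshold. The filler-word list, the 0.8 threshold and the function names are assumptions.

```python
FILLER_WORDS = {"of", "the", "for"}  # hypothetical list of words to ignore

def normalize(theme):
    """Reduce a theme phrase to its set of lower-cased, non-filler words."""
    return frozenset(w for w in theme.lower().split() if w not in FILLER_WORDS)

def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def merge_themes(old, new, threshold=0.8):
    """Append each new theme unless it is too similar to a theme already kept."""
    kept = list(old)
    for candidate in new:
        words = normalize(candidate)
        if all(jaccard(words, normalize(k)) < threshold for k in kept):
            kept.append(candidate)
    return kept

old_themes = ["Innovation Drivers", "Innovation Benefits"]
new_themes = ["Drivers of Innovation", "Innovation Output"]
merged = merge_themes(old_themes, new_themes)
```

Here 'Drivers of Innovation' is recognized as a rewording of the stored 'Innovation Drivers', while 'Innovation Output' is kept as a new theme.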
[059] The system learns to perform the tasks by considering new variables with every iteration without being programmed with any task-specific rules. Herein, the ANN model/unit updates the list of dimensions and/or factors and elements. The ANN model/unit can also update the database with the most accurate and current results so that temporal progression is achieved across iterations.
[060] The ANN model/unit may be utilized for assessing the relevance of abstracts or insights. For example, the ANN model/unit performs sentiment analysis so that the extent of positivity or negativity (for example, +2, +1, -1, -2) of different abstracts or insights may be determined. Herein, the ANN model/unit facilitates assigning the extent of sentiment so that the parameters may be arranged accordingly inside the dimension as represented in the 2D plot.
[061] The written description describes the subject matter herein to enable any person skilled in the art to make and use the embodiments. The scope of the subject matter embodiments is defined by the claims and may include other modifications that occur to those skilled in the art. Such other modifications are intended to be within the scope of the claims if they have similar elements that do not differ from the literal language of the claims or if they include equivalent elements with insubstantial differences from the literal language of the claims.
[062] Various embodiments disclosed herein provide a method and system for identification and progression measurement of a theme associated with a contextual unit. Unlike current theme generation systems that are biased by prior subject matter knowledge, the disclosed system discovers entities (including, but not limited to, themes, variables, parameters and dimensions) in a data context without any prior hypothesis. Scales to measure the discovered entities evolve over time and contextual data. In other words, entities and their measurement guidelines are discovered by the system aligned to time and context. These are stored for future reference and for learning incrementally in newer contexts, while establishing associations to newer patterns. Therefore, the unbiased nature of the discovered entities and scales is reinforced by dependency on contextual data and temporal patterns over previously established knowledge.
[063] For example, the disclosed system is capable of measuring an abstract construct (such as innovation, culture, agility, and so on) with respect to an entity within a defined context. The system is further capable of comparing the abstract construct across similar entities within a similar context. The system may assess temporal progressions of the measured abstract construct for the entities within the context (maturity).
[064] As discussed previously, the system is self-learning. For example, the system includes a pre-programmed set of basic variables and dimensions. When the system runs for the first time, it leverages these variables and dimensions. However, the system automatically finds newer and relevant variables and dimensions with every iteration and decides on the relevance of the dimensions, thereby enabling self-learning. For instance, if the innovation of an organization is to be measured by the system, the system may take variables like new technologies, innovation centers, hubs and so on into consideration. However, if the system is to measure ‘happiness’, it may automatically change the set of variables under consideration. So, ‘happiness’ may consider human interests, humanity, liking for life, and so on, and not technology or innovation centers. As the system decides everything and no human role is involved, there would be no bias. The system scans the words or phrases derived from the first phase of the iterative model and tries to match them against the pre-planted variables. In case of no match or a partial match, the system performs web-crawling and fetches similar themes, dimensions or variables from the secondary sources. It can measure any social construct and store N number of files/links/data over the cloud.
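The matching of derived phrases against pre-planted variables, with a fallback to crawling on no match or partial match, can be sketched as follows. The definitions used here are assumptions: a phrase is a 'match' when it has exactly the same words as a pre-planted variable, 'partial' when it shares at least one word, and 'none' otherwise; the function names are hypothetical.

```python
def match_status(phrase, preplanted):
    """Classify a phrase as 'match', 'partial' or 'none' against pre-planted variables."""
    words = set(phrase.lower().split())
    best = "none"
    for variable in preplanted:
        variable_words = set(variable.lower().split())
        if words == variable_words:
            return "match"
        if words & variable_words:
            best = "partial"
    return best

def crawl_queue(phrases, preplanted):
    """Phrases with no match or only a partial match, to be sent for web-crawling."""
    return [p for p in phrases if match_status(p, preplanted) != "match"]

preplanted = ["new technologies", "innovation centers", "hubs"]
phrases = ["Innovation Centers", "technology hubs", "human interests"]
statuses = {p: match_status(p, preplanted) for p in phrases}
queue = crawl_queue(phrases, preplanted)
```

Only the phrase that fully matches a pre-planted variable is kept out of the crawl queue in this example.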
[065] Another important contribution of the disclosed embodiments is that the disclosed method and system enable temporal evolution of constructive variables, which is achieved through iterative crawling. In various embodiments, the cells (or links, or rather narratives) are the basic building blocks which, when collated, form the parameters or the rows. These parameters, when combined, form a dimension. An ultimate solution consists of several dimensions, which are combinations of several parameters, which are in turn formed by specific narratives from the links or the cells. Each of these building blocks is dynamic in nature, as the content of a cell changes with time because of the dynamicity of the exploration process. The exploration process, being iterative in nature, takes cues from similar data points stored in the past and evolves as per the latest updates in the secondary domain of search which is being used for web-scraping. Hence, with each iteration over time, the contents of the cell (the smallest building block) evolve, which is reflected in the formation of the evolved parameters, which in turn form the evolved dimensions. The role of the ANN (Artificial Neural Network) is to incorporate the concept of iteration and comparison with the old stored data points.
[066] It is to be understood that the scope of the protection is extended to such a program and in addition to a computer-readable means having a message therein; such computer-readable storage means contain program-code means for implementation of one or more steps of the method, when the program runs on a server or mobile device or any suitable programmable device. The hardware device can be any kind of device which can be programmed including e.g. any kind of computer like a server or a personal computer, or the like, or any combination thereof. The device may also include means which could be e.g. hardware means like e.g. an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a combination of hardware and software means, e.g. an ASIC and an FPGA, or at least one microprocessor and at least one memory with software processing components located therein. Thus, the means can include both hardware means and software means. The method embodiments described herein could be implemented in hardware and software. The device may also include software means. Alternatively, the embodiments may be implemented on different hardware devices, e.g. using a plurality of CPUs.
[067] The embodiments herein can comprise hardware and software elements. The embodiments that are implemented in software include but are not limited to, firmware, resident software, microcode, etc. The functions performed by various components described herein may be implemented in other components or combinations of other components. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can comprise, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
[068] The illustrated steps are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope of the disclosed embodiments. Also, the words “comprising,” “having,” “containing,” and “including,” and other similar forms are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items, or meant to be limited to only the listed item or items. It must also be noted that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise.
[069] Furthermore, one or more computer-readable storage media may be utilized in implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term “computer-readable medium” should be understood to include tangible items and exclude carrier waves and transient signals, i.e., be non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, nonvolatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any other known physical storage media.
[070] It is intended that the disclosure and examples be considered as exemplary only, with a true scope of disclosed embodiments being indicated by the following claims.

We Claim:
1. A processor implemented method, comprising:
enabling a first level crawling through a plurality of data sources to obtain a data associated with an abstract construct of a contextual entity, via one or more hardware processors;
determining, via the one or more hardware processors, a first set of narratives associated with the abstract construct based on the data, wherein the first set of narratives comprises a list of keywords and parameters associated with the abstract construct;
identifying, based on the first set of narratives, one or more first themes indicative of the abstract construct associated with the contextual entity, via the one or more hardware processors, each of the one or more first themes comprising a first plurality of parameters, wherein identifying the one or more first themes comprises:
generating a document term matrix (DTM) comprising the frequency of keywords in the list of keywords and a combination of keywords across the plurality of data sources;
performing an exploratory factor analysis (EFA) on the DTM for dimensionality reduction of the DTM and obtaining a clear factor solution (CFS) from the EFA, the CFS determines a correlation of the list of keywords to the one or more first themes;
performing a second level crawling, by an Artificial Neural Network, on the plurality of data sources based on the first set of narratives and the one or more first themes to obtain one or more second themes associated with the contextual entity, via the one or more hardware processors, each of the one or more second themes comprising a second plurality of parameters;
assigning weightages to the first plurality of parameters and the second plurality of parameters, via the one or more hardware processors;
generating, via the one or more hardware processors, a plurality of temporal matrices mapping data sources with the plurality of parameters using one or more natural language processing (NLP) models, wherein each cell of the plurality of temporal matrices is associated with a score based on frequency and association; and
optimizing, via the one or more hardware processors, the plurality of temporal matrices by iteratively performing the second level crawling, assigning weightages, and generating the plurality of temporal matrices till a threshold level of data at the plurality of data sources is obtained, to obtain a temporal information indicative of progression of the theme over a period of time.
2. The method as claimed in claim 1, wherein each of the plurality of temporal matrices comprises a link versus parameters matrix.
3. The method as claimed in claim 2, further comprising prioritizing the narratives based on a frequency of occurrence of the narratives in the links vs parameter matrix.
4. A system (100), comprising:
a memory (102) storing instructions; one or more communication interfaces (106); and
one or more hardware processors (104) coupled to the memory (102) via the one or more communication interfaces (106), wherein the one or more hardware processors (104) are configured by the instructions to:
enable a first level crawling through a plurality of data sources to obtain a data associated with an abstract construct of a contextual entity;
determine a first set of narratives associated with the abstract construct based on the data, wherein the first set of narratives comprises a list of keywords and parameters associated with the abstract construct;
identify, based on the first set of narratives, one or more first themes indicative of the abstract construct associated with the contextual entity, each of the one or more first themes comprising a first plurality of parameters wherein to identify the one or more first themes, the one or more hardware processors are configured by the instructions to:
generate a document term matrix (DTM) comprising frequency of keywords in the list of keywords and combination of keywords across the plurality of data sources;
perform an exploratory factor analysis (EFA) on the DTM for dimensionality reduction of the DTM and obtaining a clear factor solution (CFS) from the EFA, the CFS determines a correlation of keywords to the one or more first themes;
perform a second level crawling, via an Artificial Neural Network, on the plurality of data sources based on the first set of narratives and the one or more first themes to obtain one or more second themes associated with the contextual entity, each of the one or more second themes comprising a second plurality of parameters;
assign weightages to the first plurality of parameters and the second plurality of parameters;
generate a plurality of temporal matrices mapping data sources with the plurality of parameters using one or more natural language processing (NLP) models, wherein each cell of the plurality of temporal matrices is associated with a score based on frequency and association; and
optimize the plurality of temporal matrices by iteratively performing second level crawling, assigning weightages, and generating the plurality of temporal matrices till a threshold level of data at the plurality of data sources is obtained, to obtain a temporal information indicative of progression of the theme over a period of time.
5. The system as claimed in claim 4, wherein each of the plurality of temporal matrices comprises a link versus parameters matrix.
6. The system as claimed in claim 5, wherein the one or more hardware processors are further configured by the instructions to prioritize the narratives based on a frequency of occurrence of the narratives in the links vs parameter matrix.

Documents

Application Documents

# Name Date
1 202021026737-CLAIMS [05-07-2022(online)].pdf 2022-07-05
1 202021026737-STATEMENT OF UNDERTAKING (FORM 3) [24-06-2020(online)].pdf 2020-06-24
2 202021026737-DRAWING [05-07-2022(online)].pdf 2022-07-05
2 202021026737-REQUEST FOR EXAMINATION (FORM-18) [24-06-2020(online)].pdf 2020-06-24
3 202021026737-FORM 18 [24-06-2020(online)].pdf 2020-06-24
3 202021026737-FER_SER_REPLY [05-07-2022(online)].pdf 2022-07-05
4 202021026737-OTHERS [05-07-2022(online)].pdf 2022-07-05
4 202021026737-FORM 18 [24-06-2020(online)]-1.pdf 2020-06-24
5 202021026737-FORM 1 [24-06-2020(online)].pdf 2020-06-24
5 202021026737-FER.pdf 2022-03-17
6 Abstract1.jpg 2021-10-19
6 202021026737-FIGURE OF ABSTRACT [24-06-2020(online)].jpg 2020-06-24
7 202021026737-Proof of Right [02-12-2020(online)].pdf 2020-12-02
7 202021026737-DRAWINGS [24-06-2020(online)].pdf 2020-06-24
8 202021026737-DECLARATION OF INVENTORSHIP (FORM 5) [24-06-2020(online)].pdf 2020-06-24
8 202021026737-FORM-26 [23-10-2020(online)].pdf 2020-10-23
9 202021026737-COMPLETE SPECIFICATION [24-06-2020(online)].pdf 2020-06-24

Search Strategy

1 SearchHistoryE_17-03-2022.pdf