
Artificial Intelligence Based Skill Assessment System And Method

Abstract: A system for candidate screening, comprising: a skillset requirement vector generator configured to generate a skillset requirement vector from a skillset defined in accordance with one or more dimensions of a project; a candidate data module configured to retrieve candidate data and determine one or more skills of a candidate corresponding to the generated skillset requirement vector; a skill vector generator configured to generate a skill vector from the determined one or more skills of the candidate; and a candidate screening module configured to calculate a vector distance between the skillset requirement vector and the skill vector of the candidate, and shortlist the candidate in an event the vector distance is less than a predefined threshold.


Patent Information

Application #
Filing Date
31 October 2018
Publication Number
44/2018
Publication Type
INA
Invention Field
COMPUTER SCIENCE
Status
Email
dev.robinson@amsshardul.com
Parent Application

Applicants

Cognizant Technology Solutions India Pvt. Ltd.
Techno Complex, No. 5/535, Old Mahabalipuram Road, Okkiyam Thoraipakkam, Chennai 600 097, Tamil Nadu, India

Inventors

1. Tapodhan Sen
17/17 Baishnabghata Road, Flat 6, Kolkata-700047, West Bengal, India
2. Saroj Pradhan
Abasar Building 3rd Floor, 251A/27 Netaji Subhas Chandra, Bose Road, Naktala, Kolkata-700047, West Bengal, India

Specification

ARTIFICIAL INTELLIGENCE BASED SKILL ASSESSMENT SYSTEM AND METHOD
FIELD OF INVENTION
[0001] The subject matter described herein, in general, relates to candidate performance and skill assessment system and method, and, in particular, relates to application of artificial intelligence for candidate screening and selection system and method for effective team building.
BACKGROUND OF INVENTION
[0002] Presently, matching a person to a job needs significant human involvement in the profile shortlisting and interviewing process. The process of recruiting an ideal developer for a team has become ever more challenging. While recruiting a candidate, the skills that are characteristically assessed are technical skills along with soft skills. As will be agreed, ever changing diverse technology landscape has added to the complexity of entire recruitment process. This necessitates the process of identification of the right profile before valuable efforts and time of the subject matter experts in the selection process are invested.
[0003] Admittedly, resumes no longer act as a reliable source of the actual skills of a developer or of the depth of understanding of those skills, resulting in many false positives during the shortlisting process. Self-assessments may be misleading. E-assessments can also produce false positives, as the subject may have sound theoretical know-how of a topic without any practical application.

[0004] The profiles are usually shortlisted based on matching keywords in the resumes. This approach creates a lot of false positives that results in significant wastage of resource involvement and subject matter expert's time in the downstream process. Selecting the right developer depending on the needs of a software development project is critical to the success of the project. Therefore, though the involvement of subject matter experts in the selection process cannot be denied, yet the need to automate the up-stream process of assessing the fitment of an individual to an open position before human involvement is absolutely undeniable.
[0005] Putting together the right set of individuals with relevant expertise will not only assist in accomplishing the team goals but also improve team effectiveness. Efforts have been made in recent past to automate the process of technical skill assessment whereby predefined coding problems presented to the candidate are tested for their solutions by a manual intervention as pre-defined test cases are run to determine relative performance of the candidates. Other solutions have proposed manual tagging of multiple programming features within a code to create rule base for evaluating the quality and complexity of code. Many such static code analysis techniques based on a set of predetermined parameters have been deployed for evaluation of technical capability levels of candidates.
[0006] However, none of the existing evaluation tools dives deeper to assess fine-grained information such as know-how of sub-technology areas or the candidate's familiarity with certain programming concepts, languages and toolkits based on actual work done. Work has been done on the use of natural language processing (NLP) for processing files based on analyzing English vocabulary or grammatical constructs, but no such efforts have been directed at constructing a symbol dictionary from the represented skills of the candidate. Also, code parsing techniques pose a formidable challenge, since a parser has to be created for almost every skillset that is to be assessed. In addition, rules specific to each technology, requiring specific knowledge of the skill, have to be created in order to query the abstract syntax tree.
[0007] In the background of aforementioned limitations, there exists a need for a system and method capable of assessing technical skills of candidate along with his maturity level in understanding of fine-grained source code requiring minimal intervention of subject matter experts at preliminary level of assessment. Also, the system shall be capable of assessing code constructs and symbols representing skills of individuals.
OBJECTS OF THE INVENTION
[0008] The primary object of the present disclosure is to provide an artificial intelligence based skill assessment system and method for candidate sourcing.
[0009] Another object of this disclosure is to provide an intelligent system and method for assessing maturity level of technical skill possessed by the candidate.
[0010] Yet another object of the disclosure is to infer technical skills required and maturity levels expected from a candidate who is to perform relevant technical job.

[0011] Yet other object of the present disclosure is to provide an artificial intelligence based skill assessment system and method that allows classification of multiple source codes containing similar code snippets to infer the relevant learnings therefrom.
[0012] In yet another embodiment, the disclosure provides a dynamic system and method of assessing the candidate for his understanding and know-how of sub technology areas under a programming language.
[0013] Still another embodiment of present disclosure provides an automated system and an up stream process of assessing the fitment of an individual to a given job position before involvement of subject matter experts, thereby reducing wastage of precious time and substantial efforts invested by SMEs.
[0014] Still another object of the present disclosure is to provide an artificial intelligence based system capable of assessing code constructs and symbolic representations of a candidate's skill set.
[0015] These and other objects will become apparent from the ensuing description of the present invention.
SUMMARY OF THE INVENTION
[0016] Briefly described, in a preferred embodiment, the present disclosure overcomes the above-mentioned disadvantages, and meets the recognized need by providing an artificial intelligence based system and methodology for candidate screening. The system, for instance, comprises of a skillset requirement vector generator that is

configured to generate a skillset requirement vector from a skillset defined in accordance with one or more dimensions of a project. The system further comprises of a candidate data module configured to retrieve candidate data and determine one or more skills of a candidate corresponding to the generated skillset requirement vector. Furthermore, a skill vector generator is configured to generate a skill vector from the determined one or more skills of the candidate. Finally, a candidate screening module is configured to calculate a vector distance between the skillset requirement vector and the skill vector of the candidate, and shortlist the candidate in an event the vector distance is less than a predefined threshold.
[0017] In one aspect of the disclosure, the skillset requirement vector generator is an artificial intelligence based utility model for the skill detection from job description and vectorization in accordance with the one or more dimensions.
[0018] In another aspect of the disclosure, a method for candidate screening is proposed. The method comprises the following steps: generating a skillset requirement vector from a skillset defined in a job description in accordance with one or more dimensions of a project; retrieving candidate data and determining one or more skills of a candidate corresponding to the generated skillset requirement vector; and generating a skill vector from the determined one or more skills of the candidate. This is followed by calculating a vector distance between the skillset requirement vector and the skill vector of the candidate, and shortlisting the candidate in an event the vector distance is less than a predefined threshold.

[0019] These and other aspects, features and advantages of the present invention will be described or become apparent from the following detailed description of preferred embodiments.
BRIEF DESCRIPTION OF THE DRAWINGS
[0020] Fig. 1 is a block diagram of a candidate assessment system shown in accordance with a preferred embodiment of the present disclosure.
[0021] Fig. 2 is a block diagram illustrating schematically the interaction between various components and sub-components of the AI engine, in accordance with one preferred embodiment of the present disclosure.
[0022] Fig. 3 is an illustration of the skill training corpus constructed to train the AI engine, in accordance with one preferred embodiment of the present disclosure.
[0023] Fig. 4 is a schematic diagram showing code classification and evaluation, in accordance with one preferred embodiment of present disclosure.
DETAILED DESCRIPTION OF THE EMBODIMENTS
[0024] It has to be understood and acknowledged for this specification and claims, that the term "candidate screening" refers either, though not limiting, to candidate assessment or candidate performance evaluation and the like. For purposes of illustration, the invention is described in the context of finding Information Technology (IT) professionals,

but it will be understood that the system and method of the present invention can be applied in a variety of contexts and domains.
[0025] In describing the preferred and alternate embodiments of the present disclosure, as illustrated in Figs 1 and 2, specific terminology is employed for the sake of clarity. The disclosure, however, is not intended to be limited to the specific terminology so selected, and it is to be understood that each specific element includes all technical equivalents that operate in a similar manner to accomplish similar functions. The words "comprising," "having," "containing," and "including," and other forms thereof, are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items, or meant to be limited to only the listed item or items. The disclosed embodiments are merely exemplary methods of the invention, which may be embodied in various forms.
[0026] According to its major aspects and broadly stated, the present disclosure in its preferred form provides a server-based artificial intelligence (AI) system and method capable of evaluating responses tendered by candidates for their sourcing. The system, in general, employs a central server system having a processing unit connected to a database storage device that is configured to source and present candidate responses to the system for evaluating the desired candidate profile. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects.
[0027] The present invention is described below with reference to methods and systems in accordance with general embodiments of the present disclosure. The instructions may be loaded into the system and, when executed upon such a computer implemented system (a general purpose computer or a special purpose hardware based computer system), create means for training the system and implementing the functions of the various modules hosted by the system.
[0028] Referring now to Figure 1, the system 100, in its preferred embodiment, is a computer based system for candidate skill assessment and selection. As illustrated in Fig. 1, a functional block diagram of exemplary system 100 provides a suitable environment for implementing embodiments of the present disclosure. Following the illustration, the architecture of system 100 is shown to comprise various subsystems such as a central processing unit (CPU) 102 coupled to random access memory (RAM) 104 to execute instructions stored therein, read-only memory (ROM) 106, various subroutines to transmit information between the subsystems, and an operating system 108 including an artificial intelligence (AI) engine 110 for performing a variety of actions associated with searching and evaluating candidate skills dynamically.
[0029] The AI engine 110 for candidate sourcing makes use of technology that searches for individual profiles and demographics available online (e.g., resumes, professional portfolios, or social media profiles) and screens them to find ideal candidates matching given job requirements. The AI engine 110 for candidate matching interacts with its various sub-components and identifies the strongest matches for any open requirement, thereby obviating the need to undergo an extensive weeding-out process of candidate screening and selection. Matching a job description to candidate profiles necessitates analysis of multiple sources of data, such as candidates' personality traits and skills, to automatically assess candidates against the job requirements.
[0030] Generally speaking, the success of any project depends heavily on ascertaining the technical and communication skills of a developer. The major soft skills are verbal communication, teamwork, commercial awareness, analyzing and investigating, initiative / self-motivation, drive, written communication, planning and organizing, flexibility or time management, or a combination thereof. The skills of a developer are broadly classified into (1) technical skills and (2) soft skills. In accordance with one significant aspect of the present disclosure, an individual's skills can be represented as a vector comprising multiple dimensions. Each of these dimensions can be evaluated either by a rule based system or by an AI based system.
[0031] One can adopt different AI mechanisms to evaluate different dimensions, e.g. one dimension may be evaluated using a decision tree, another using a deep neural network, a reinforcement learning mechanism, etc. The choice of technology and underlying model will depend on the nature of the dimension and the nature of the data used to assess it. It may also involve interactive sessions with intelligent robotic agents. In addition, an artificial intelligence based system and methodology to pre-screen a potential candidate is proposed so as to reduce false positives and thus optimize the overall recruitment process by improving efficiency and reducing wastage of subject matter expert involvement.
[0032] Accordingly, the AI engine 110 processes and analyzes the profiles and responses shared by the candidates. Here, the overall skill of a potential candidate is represented by way of a vector with Z dimensions, where $Z \in \{\text{Technical Skills} \cup \text{Soft Skills}\}$. In one exemplary embodiment, the primary technical skills required for a developer can be broadly classified as user interface skills, service development skills, persistence skills, and integration skills. The assessment can further be augmented by incorporating dimensions such as public profile skill endorsements, open source contributions, magnitude of rework, conformity to design / coding standards, etc. Now, considering the following sets:
[0033] $A = \{A : A \in \text{user interface related skills}\} = \{a_1, \ldots, a_i\}$
[0034] $B = \{B : B \in \text{service creation related skills}\} = \{b_1, \ldots, b_j\}$
[0035] $C = \{C : C \in \text{persistence related skills}\} = \{c_1, \ldots, c_k\}$
[0036] $D = \{D : D \in \text{integration related skills}\} = \{d_1, \ldots, d_l\}$
[0037] Now, for example, T is a set of all technical skills needed for a developer. Then, T can be defined as follows:
[0038] $T = \{T : T \in A \cup B \cup C \cup D\}$
$= \{a_1, \ldots, a_i\} \cup \{b_1, \ldots, b_j\} \cup \{c_1, \ldots, c_k\} \cup \{d_1, \ldots, d_l\}$
$= \{a_1, \ldots, a_i, b_1, \ldots, b_j, c_1, \ldots, c_k, d_1, \ldots, d_l\}$
[0039] So, T can be represented as a vector having N dimensions (N < Z), where N = i + j + k + l.
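As a minimal sketch of how the union T could be laid out as vector dimensions, the Python below assigns one index per skill and places per-skill coefficients into an N-dimensional list. The skill names and the helper names `build_dimensions` and `to_vector` are hypothetical placeholders and do not appear in the specification.

```python
# Hypothetical example skills for the four categories A, B, C and D.
USER_INTERFACE = ["html", "css", "react"]   # A: a_1 .. a_i
SERVICE = ["rest", "grpc"]                  # B: b_1 .. b_j
PERSISTENCE = ["sql", "orm"]                # C: c_1 .. c_k
INTEGRATION = ["messaging"]                 # D: d_1 .. d_l


def build_dimensions(*categories):
    """Form the union T, keeping first-seen order and dropping duplicates.

    Returns a dict mapping each skill name to its dimension index.
    """
    index = {}
    for category in categories:
        for skill in category:
            index.setdefault(skill, len(index))
    return index


DIMENSIONS = build_dimensions(USER_INTERFACE, SERVICE, PERSISTENCE, INTEGRATION)
N = len(DIMENSIONS)  # N = i + j + k + l when the categories do not overlap


def to_vector(coefficients):
    """Place per-skill coefficients into an N-dimensional list (missing skills are 0.0)."""
    vector = [0.0] * N
    for skill, value in coefficients.items():
        vector[DIMENSIONS[skill]] = value
    return vector


example = to_vector({"sql": 0.5, "rest": 0.8})
```

Keeping the skill-to-index mapping in one place means the requirement vector and every candidate vector share the same dimension order, which the later distance computation relies on.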

[0040] According to a noteworthy development of the invention, in order to assess the level of experience of the candidate in using a specific skill along with know-how of a particular skill, a hierarchy of technical skill tj is defined as follows:
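$t_j^{\text{Basic}} \subset t_j^{\text{Intermediate}} \subset t_j^{\text{Advanced}} \subseteq t_j$
[0041] So, the sets representing unique elements of Advanced, Intermediate and Basic skills are represented as follows:
[0042] $t_j^{\text{BasicOnly}} = t_j^{\text{Basic}}$
[0043] $t_j^{\text{IntermediateOnly}} = t_j^{\text{Intermediate}} \setminus t_j^{\text{Basic}}$
[0044] $t_j^{\text{AdvancedOnly}} = t_j^{\text{Advanced}} \setminus t_j^{\text{Intermediate}}$
[0045] $t_j = t_j^{\text{AdvancedOnly}} \cup t_j^{\text{IntermediateOnly}} \cup t_j^{\text{BasicOnly}}$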
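The hierarchy above can be illustrated with ordinary set operations. The following is only a sketch: the symbol names in each level and the function name `exclusive_levels` are invented for illustration, and the specification does not state how the level sets are stored.

```python
# Hypothetical symbol sets for one skill t_j at each cumulative level.
basic = {"print", "len"}
intermediate = basic | {"map", "filter"}
advanced = intermediate | {"yield", "async"}


def exclusive_levels(basic, intermediate, advanced):
    """Split nested level sets into disjoint 'only' sets using set difference."""
    if not (basic <= intermediate <= advanced):
        raise ValueError("levels must be nested: Basic, Intermediate, Advanced")
    return {
        "BasicOnly": set(basic),
        "IntermediateOnly": intermediate - basic,
        "AdvancedOnly": advanced - intermediate,
    }


levels = exclusive_levels(basic, intermediate, advanced)
# The three disjoint sets together rebuild the full skill set.
rebuilt = levels["BasicOnly"] | levels["IntermediateOnly"] | levels["AdvancedOnly"]
```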
[0046] Now, Fig. 2 depicts a conceptual illustration of sub-components of the AI engine 110 including, but not limited to, skill set requirement vector generator 112, candidate data module 114, skill vector generator 116, and candidate screening module 118, each designed and configured to perform one or more specified operations or instructions directed thereto. The AI engine 110 preferably is a multi-layer perceptron neural network; however, other neural networks, known or later developed, are contemplated herein.

[0047] Each of the above listed sub-components may perform actions individually or in a shared manner. Some or all of the sub-components may be combined or split into smaller sub-components. In one embodiment, the skill set requirement vector generator 112 is configured to generate a skillset requirement vector $\vec{R}$ from a skill set defined in accordance with the afore-mentioned one or more dimensions. Next, a candidate data module 114 of the system is configured to retrieve candidate data from one or more sources, such as job search websites and other sources hosting candidate professional experience profiles. Now, the skill vector generator 116 generates a skill vector from the one or more skills of the candidate that have been retrieved by the candidate data module 114.
[0048] In one working embodiment of present disclosure, the skill vector generator 116 constructs the skill vector for a given individual (m of M) as follows:
[0049] Let there be Y files containing code provided by an individual for assessment. Now, for each of the Y files (k = 1 to k = Y), the probability of a match with skill $t_j$ (j = 1 to j = N) is determined:
$$p(Y_k \mid t_j) = \prod_{f_i \in Y_k} p(f_i \mid t_j), \quad \text{where } f_i \in Y_k$$
[0050] On completion of the iteration over the Y files, the skill vector $\vec{S}_m$ is formed by using the median coefficient for each of the N dimensions.
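The per-file match probability and the median step can be sketched in Python as follows. Several details here are assumptions not stated in the specification: the per-feature probabilities are given as plain dictionaries, unseen features receive a small floor probability, the per-skill likelihoods of a file are normalised under a uniform prior so that files of different lengths are comparable, and all function names are hypothetical.

```python
import math
from statistics import median


def file_log_likelihood(features, feature_probs, floor=1e-6):
    """log p(Y_k | t_j): sum of per-feature log-probabilities (product in log space)."""
    return sum(math.log(feature_probs.get(f, floor)) for f in features)


def file_coefficients(features, models, floor=1e-6):
    """Normalise the per-skill likelihoods of one file so they sum to 1.

    Assumes a uniform prior over skills; this normalisation is an
    illustrative choice, not something the specification prescribes.
    """
    logs = {s: file_log_likelihood(features, p, floor) for s, p in models.items()}
    peak = max(logs.values())  # subtract the maximum for numerical stability
    weights = {s: math.exp(v - peak) for s, v in logs.items()}
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}


def skill_vector(files, models, floor=1e-6):
    """Median coefficient per skill across all Y files."""
    per_file = [file_coefficients(f, models, floor) for f in files]
    return {s: median(c[s] for c in per_file) for s in models}


# Hypothetical per-skill feature probabilities and three tokenised files.
models = {
    "sql": {"select": 0.5, "join": 0.5},
    "ui": {"render": 0.5, "click": 0.5},
}
files = [["select", "join"], ["select"], ["render"]]
vector = skill_vector(files, models)
```

In this toy data two of the three files look like SQL code, so the median coefficient for "sql" is close to 1 and the one for "ui" is close to 0. Using the median rather than the mean keeps a single unusual file from dominating a dimension.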
[0051] Now, the project requirements are codified into a vector consisting of N dimensions, say $\vec{R}$, by the skillset requirement vector generator 112. For each candidate, the skill vector generator 116 generates an N-dimensional skill vector $\vec{S}_m$. Finally, the candidate screening module 118 of the system 100 computes the cosine distance between the skillset requirement vector $\vec{R}$ and the candidate's skill vector $\vec{S}_m$. Based on the computed distance, the candidate is shortlisted in an event the cosine distance is less than a predetermined threshold.
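A minimal sketch of the screening step, assuming the usual definition of cosine distance as one minus cosine similarity. The threshold value, the candidate names and the function names are hypothetical, and treating a zero vector as maximally distant is an illustrative choice.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cos(u, v) for two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # assumption: a vector with no skills matches nothing
    return 1.0 - dot / (norm_u * norm_v)


def shortlist(requirement, candidates, threshold):
    """Keep candidates whose distance to the requirement is below the threshold."""
    return [
        name
        for name, skills in candidates.items()
        if cosine_distance(requirement, skills) < threshold
    ]


requirement = [1.0, 1.0, 0.0]            # hypothetical requirement vector
candidates = {
    "candidate_a": [0.9, 0.8, 0.1],      # close to the requirement
    "candidate_b": [0.0, 0.1, 1.0],      # mostly an unrelated skill
}
selected = shortlist(requirement, candidates, threshold=0.2)
```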
[0052] It is to be noted that source codes submitted by candidates are assessed for their constructs and symbols representing such skills, and not merely for their grammatical or vocabulary construct. Since any application code written today usually ends up referencing other APIs/frameworks/DLLs, such references are perceived by the system 100 as a set of symbols from which a dictionary is built to filter out the noise generated by individual coding style.
[0053] Referring to Fig. 2, the skill vector generator 116 interacts with dictionary utility module 120 that crawls API documentation from internet 200 and builds the symbol dictionary for the skill to be assessed. Next, the data utility module 122, other than performing tasks of reading, cleansing, tokenizing the source code files, extracts features from such code samples 202 applying different strategies for broad level skill detection and then delve deeper within a skill for hierarchical assessment.
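The idea of filtering source code through a symbol dictionary can be sketched as below. The dictionaries are hard-coded here purely for illustration, whereas the specification describes building them by crawling API documentation. The regular expression, the skill names and the function name `extract_symbols` are assumptions.

```python
import re

# Hypothetical symbol dictionaries, one per skill to be assessed.
SYMBOLS = {
    "sql": {"SELECT", "JOIN", "cursor", "execute"},
    "http": {"requests", "get", "post", "status_code"},
}

# Matches identifier-like tokens; a deliberately simple tokenizer.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def extract_symbols(source, skill):
    """Keep only the tokens that appear in the skill's symbol dictionary."""
    vocabulary = SYMBOLS[skill]
    return [token for token in IDENTIFIER.findall(source) if token in vocabulary]


sample = "resp = requests.get(url)\nif resp.status_code == 200: my_helper(resp)"
http_features = extract_symbols(sample, "http")
sql_features = extract_symbols(sample, "sql")
```

Names chosen by the author of the code, such as `resp` and `my_helper`, are dropped, which is how the dictionary removes noise from individual coding style.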
[0054] Based on the extracted features, the source code classifier 124, in conjunction with the AI model utility 126, is responsible for creating and managing the classifiers for broad level skill detection and for diving deeper within a skill for hierarchical assessment. The AI model utility 126 is further responsible for interacting with the Natural Language Toolkit (NLTK) and for abstracting out the training, testing, prediction, load and store activities related to the AI engine 110. Finally, the skill vector generator 116 scans through the code provided by the candidate by iterating over the Y files to form the skill vector $\vec{S}_m$, using the median coefficient for each of the N dimensions.
[0055] Finally, the candidate screening module 118 takes the skill vector $\vec{S}_m$ and compares it with the project requirements vector $\vec{R}$, consisting of N dimensions, to compute the cosine distance and identify candidates matching the project requirement.
[0056] In one working embodiment of present disclosure, the system 100 is trained by providing afore-discussed code samples maintaining the hierarchy:
$t_j^{\text{Basic}} \subset t_j^{\text{Intermediate}} \subset t_j^{\text{Advanced}} \subseteq t_j$
[0057] These code samples, spanning multiple skills and levels of skill, are randomly selected to simulate an individual to be assessed for fitment with pre-defined project requirements. Publicly available API documentation may be used for automatically building the symbol dictionary for each technical skill $t_j$. In one exemplary embodiment, Python along with the NLTK Naive Bayes Classifier can be used for skill vector formation.
[0058] The above discussed AI engine 110 thus deeply analyzes the technical skills possessed by a potential candidate, along with the level of maturity of understanding of such skills, from fine-grained source code using multi-layered Bayesian network based models. For instance, for a given technical skill, let j subject matter experts (SMEs) provide source codes representing different maturity levels of application, i.e. each SME provides x, y and z files for the technical skill belonging to maturity levels H, M and L respectively.
[0059] In one significant embodiment of the present disclosure, it is to be noted that during this process, the SMEs are not required to mark individual code content or snippets. On the contrary, categorization of code is done based on overall understanding. Following this categorization, source code files SC1, SC2, SC3 for maturity level M, source code files for maturity level H and source code files for maturity level L are obtained for one technical skill, as shown in Fig. 3. This process is repeated for all the N technical skills. All these files are collected to build the skill training corpus, which is eventually used for training the AI engine 110 for skill assessment.
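One way to picture the skill training corpus is as a list of whole files, each labelled with a skill and a maturity level. The sketch below builds such a corpus and derives Laplace-smoothed per-token probabilities from it. It is a hand-written stand-in under stated assumptions, not the NLTK classifier named in the specification. The label format "skill:level", the token lists and all function names are hypothetical.

```python
from collections import Counter, defaultdict


def build_corpus(submissions):
    """Flatten SME submissions into (label, tokens) records.

    `submissions` maps skill -> maturity level ('H', 'M', 'L') -> list of
    token lists, one per source file. Each file is labelled as a whole;
    no individual snippet is marked.
    """
    corpus = []
    for skill, by_level in submissions.items():
        for level, files in by_level.items():
            for tokens in files:
                corpus.append((f"{skill}:{level}", tokens))
    return corpus


def train_counts(corpus):
    """Count how often each token occurs under each label."""
    counts = defaultdict(Counter)
    for label, tokens in corpus:
        counts[label].update(tokens)
    return counts


def token_probability(counts, label, token, vocabulary_size):
    """Laplace-smoothed estimate of p(token | label)."""
    label_counts = counts[label]
    return (label_counts[token] + 1) / (sum(label_counts.values()) + vocabulary_size)


# Hypothetical submissions for a single skill at two maturity levels.
submissions = {
    "sql": {
        "L": [["select"]],
        "H": [["select", "window", "cte"]],
    }
}
corpus = build_corpus(submissions)
counts = train_counts(corpus)
vocabulary = {token for _, tokens in corpus for token in tokens}
p_window_high = token_probability(counts, "sql:H", "window", len(vocabulary))
p_window_low = token_probability(counts, "sql:L", "window", len(vocabulary))
```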
[0060] Next, as can be seen in Fig. 4, the AI engine 110 is configured to evaluate the source code SC1 ... SCN submitted by the potential candidates C1 ... CN respectively, and determine the coefficients of all the recognized skills, categorized by level of maturity, for those candidates. The source code files submitted by the potential candidates are evaluated by the source code classifier 124 to adjudge the level of maturity with regard to the skills possessed by the candidate. The skill vector generator 116 eventually iterates over these files to generate the skill vector by using the median coefficient for each of the N dimensions.
[0061] In one exemplary embodiment of the present disclosure, the system 100 infers the technical skills required and the maturity levels needed as per the job requirements from a given job description using an attention based neural network. In one exemplary embodiment, an attention based long short-term memory (LSTM) network is trained using the corpus built by SMEs (depicted in Fig. 3) to learn the encoding strategy of a job description into a skill vector. Here, Ti represents a technical skill at a given maturity level and Ci represents the weightage of Ti in the context of the given job description.
[0062] The AI engine 110 evaluates the job descriptions submitted by a project and determines the coefficients of all the recognized skills, categorized by level of maturity, to form the skill vector of that job. Finally, the candidate screening module 118 compares the skill vectors of candidates with the skill vectors of job descriptions and produces a matrix comprising cosine distances. Candidates with the lowest distances to a job description are shortlisted and recommended for the given position.
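The matrix of cosine distances and the ranking by lowest distance can be sketched as follows. The candidate and job vectors, the `top_k` parameter and the function names are hypothetical, and a nested dictionary stands in for the matrix.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cos(u, v); zero vectors are treated as maximally distant."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 if norms == 0 else 1.0 - dot / norms


def distance_matrix(candidates, jobs):
    """Map candidate -> job -> cosine distance."""
    return {
        name: {job: cosine_distance(skills, needs) for job, needs in jobs.items()}
        for name, skills in candidates.items()
    }


def recommend(matrix, job, top_k=1):
    """Return the top_k candidates with the lowest distance to the given job."""
    ranked = sorted(matrix, key=lambda name: matrix[name][job])
    return ranked[:top_k]


# Hypothetical two-dimensional skill vectors.
candidates = {"c1": [0.9, 0.1], "c2": [0.1, 0.9]}
jobs = {"backend": [1.0, 0.0], "frontend": [0.0, 1.0]}
matrix = distance_matrix(candidates, jobs)
backend_pick = recommend(matrix, "backend")
frontend_pick = recommend(matrix, "frontend")
```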
[0063] The foregoing description is a specific embodiment of the present disclosure. It should be appreciated that this embodiment is described for purpose of illustration only, and that numerous alterations and modifications may be practiced by those skilled in the art without departing from the spirit and scope of the invention. It is intended that all such modifications and alterations be included insofar as they come within the scope of the invention as claimed or the equivalents thereof.

We claim:
1) A system for candidate screening, comprising:
a skillset requirement vector generator configured to generate a skillset requirement vector from a skillset defined in accordance with one or more dimensions of a project;
a candidate data module configured to retrieve candidate data and determine one or more skills of a candidate corresponding to the generated skillset requirement vector;
a skill vector generator configured to generate a skill vector from the determined one or more skills of the candidate; and
a candidate screening module configured to calculate a vector distance between the skillset requirement vector and the skill vector of the candidate, and shortlist the candidate in an event the vector distance is less than a predefined threshold.
2) The system as claimed in claim 1, wherein the candidate data comprising of one or more source codes representative of the one or more skills is retrieved from one or more candidate profile on prior work done by the candidate from source control public or private systems.
3) The system as claimed in claim 1, wherein the skillset corresponding to the one or more dimensions comprises of a user interface related skills, service creation related skills, persistence related skills and integration related skills.

4) The system as claimed in claim 1, wherein the skillset requirement vector generator is an artificial intelligence based utility model for the skill detection from job description and vectorization in accordance with the one or more dimensions.
5) The system as claimed in claim 2, wherein the candidate data module is configured to determine the one or more skills by:
extraction of one or more features from the one or more source code;
detection of one or more skills corresponding to the one or more dimensions based on a probable match of the extracted features with the skillset requirement vector; and classification of the detected one or more skills based on an experience level of the candidate related to the one or more dimensions.
6) The system as claimed in claim 1, wherein the candidate data module is an artificial intelligence based utility model for the skill detection and classification.
7) The system as claimed in claim 1, wherein the skill vector generator is configured to generate the skill vector using median coefficient for each of the one or more dimensions.
8) A method for candidate screening, comprising steps of:
generating a skillset requirement vector from a skillset defined in job description accordance with one or more dimensions of a project;

retrieving candidate data and determining one or more skills of a candidate corresponding to the generated skillset requirement vector;
generating a skill vector from the determined one or more skills of the candidate; and
calculating a vector distance between the skillset requirement vector and the skill vector of the candidate, and shortlisting the candidate in an event the vector distance is less than a predefined threshold.
9) The method for candidate screening as claimed in claim 8, wherein the candidate data comprising of one or more source codes representative of the one or more skills is retrieved from one or more candidate profile on plurality of source control systems public and / or private.
10) The method for candidate screening as claimed in claim 8, further comprising:
extracting one or more features from the one or more source code;
detecting one or more skills corresponding to the one or more dimensions based on a probable match of the extracted features with the skillset requirement vector; and
classifying the detected one or more skills based on an experience level of the candidate related to the one or more dimensions.
11) The method for candidate screening as claimed in claim 8, the skill detection and
classification is achieved utilizing artificial intelligence based utility model.

12) The method for skillset requirement vector determination as claimed in claim 8, the skillset requirement detection and vectorization is achieved utilizing artificial intelligence based utility model.

Documents

Application Documents

# Name Date
1 201841041146-STATEMENT OF UNDERTAKING (FORM 3) [31-10-2018(online)].pdf 2018-10-31
2 201841041146-REQUEST FOR EARLY PUBLICATION(FORM-9) [31-10-2018(online)].pdf 2018-10-31
3 201841041146-PROOF OF RIGHT [31-10-2018(online)].pdf 2018-10-31
4 201841041146-POWER OF AUTHORITY [31-10-2018(online)].pdf 2018-10-31
5 201841041146-FORM-9 [31-10-2018(online)].pdf 2018-10-31
6 201841041146-FORM 1 [31-10-2018(online)].pdf 2018-10-31
7 201841041146-DRAWINGS [31-10-2018(online)].pdf 2018-10-31
8 201841041146-COMPLETE SPECIFICATION [31-10-2018(online)].pdf 2018-10-31
9 Correspondence by Agent_Proof of Right,Form26_05-11-2018.pdf 2018-11-05