Abstract: The present disclosure provides a method and system for providing course categorization for improving digital career counselling. The method includes a first step to collect a first set of data from one or more sources at a course classification system (108). In addition, the method includes a second step to scan content of the first set of data in real-time at the course classification system (108). Further, the method includes a third step to map one or more keywords present in a keyword bank with one or more keywords present in one or more courses in real-time at the course classification system (108). Furthermore, the method includes a fourth step to re-map each of the one or more courses with each of one or more specializations in real-time at the course classification system (108).
[0001] The present invention relates to the field of computer application and,
in particular, relates to a method and system for providing course categorization
for improving digital career counselling.
BACKGROUND
[0002] Over the past few years, the digitalization of education and career
counselling has grown markedly. This growth has increased the importance of
categorisation of courses offered by universities around the globe. The
categorisation of courses into various specialisations is crucial for educational
consultant companies and students. The educational consultant companies enable
students and administrators to search the courses according to suitable
specialisation. Currently, the educational consultant companies manually perform
data scraping from multiple data sources to categorise the courses into various
specialisations based on the basic understanding of the data scraping team. In
addition, the manual categorisation of courses into various specialisations is
extremely time consuming. Further, a manual decision model is used to
categorise the courses into the various specialisations. Furthermore, the manual
decision model has various errors induced by human neglect. Moreover, the
manual decision model may be ambiguous, as some courses can be categorised
into different specialisations. Also, the manual decision model may lose
information regarding the reason or logic behind the mapping of the courses to
the various specialisations by the data scraping team. Also, the manual decision
model may have scalability issues.
OBJECT OF THE DISCLOSURE
[0003] A primary object of the present disclosure is to provide a course
classification system to categorise various courses into multiple levels for
improving digital career counselling.
[0004] Another object of the present disclosure is to provide the course
classification system to increase scalability.
[0005] Yet another object of the present disclosure is to provide the course
classification system to eliminate human negligence.
[0006] Yet another object of the present disclosure is to provide the course
classification system to remove ambiguity in categorisation of each of the various
courses with specialisation.
[0007] Yet another object of the present disclosure is to provide the course
classification system to induce reasoning and logic behind mapping of each of the
various courses.
SUMMARY
[0008] The present disclosure provides a computer system. The computer
system includes one or more processors, a signal generator circuitry embedded
inside a computing device for generating a signal, and a memory. The memory is
coupled to the one or more processors. The memory stores instructions. The
instructions are executed by the one or more processors. The execution of
instructions causes the one or more processors to perform a method to provide
course categorization for improving digital career counselling. The method
includes a first step to collect a first set of data from one or more sources at a
course classification system. In addition, the method includes a second step to
scan content of the first set of data in real-time at the course classification system.
Further, the method includes a third step to map one or more keywords present in
a keyword bank with one or more keywords present in one or more courses in
real-time at the course classification system. Furthermore, the method includes a
fourth step to re-map each of the one or more courses with each of one or more
specializations in real-time at the course classification system. Moreover, the first
set of data is data associated with the one or more courses. Also, the first set of
data is collected for each of the one or more specializations in real-time. Also, the
one or more keywords are extracted to assign weightage to each of the one or
more keywords for each of the one or more specializations. Also, the one or more
keywords are extracted for each of the one or more specializations. The
extraction of the one or more keywords is performed to create the keyword bank
of the one or more keywords. The extraction is performed using one or more
hardware-run algorithms. In addition, the mapping is performed to associate each
of the one or more courses with the one or more specializations. Further, the
re-mapping is performed based on feedback received with facilitation of one or
more methods. The received feedback enables variation in weightage of each of
the one or more keywords for each of the one or more specializations.
Furthermore, the re-mapping is done to change the weightage of one or more
erroneous keywords and to increase accuracy of the mapping of name of the one
or more courses with the one or more specializations. Moreover, the re-mapping
is performed continuously until the error in course-to-specialization mapping is
less than or equal to a threshold error percentage. Also, the re-mapping is
performed to improve digital career counselling.
STATEMENT OF THE DISCLOSURE
[0009] The present disclosure provides a computer system. The computer
system includes one or more processors, a signal generator circuitry embedded
inside a computing device for generating a signal, and a memory. The memory is
coupled to the one or more processors. The memory stores instructions. The
instructions are executed by the one or more processors. The execution of
instructions causes the one or more processors to perform a method to provide
course categorization for improving digital career counselling. The method
includes a first step to collect a first set of data from one or more sources at a
course classification system. In addition, the method includes a second step to
scan content of the first set of data in real-time at the course classification system.
Further, the method includes a third step to map one or more keywords present in
a keyword bank with one or more keywords present in one or more courses in
real-time at the course classification system. Furthermore, the method includes a
fourth step to re-map each of the one or more courses with each of one or more
specializations in real-time at the course classification system. Moreover, the first
set of data is data associated with the one or more courses. Also, the first set of
data is collected for each of the one or more specializations in real-time. Also, the
one or more keywords are extracted to assign weightage to each of the one or
more keywords for each of the one or more specializations. Also, the one or more
keywords are extracted for each of the one or more specializations. The
extraction of the one or more keywords is performed to create the keyword bank
of the one or more keywords. The extraction is performed using one or more
hardware-run algorithms. In addition, the mapping is performed to associate each
of the one or more courses with the one or more specializations. Further, the
re-mapping is performed based on feedback received with facilitation of one or
more methods. The received feedback enables variation in weightage of each of
the one or more keywords for each of the one or more specializations.
Furthermore, the re-mapping is done to change the weightage of one or more
erroneous keywords and to increase accuracy of the mapping of name of the one
or more courses with the one or more specializations. Moreover, the re-mapping
is performed continuously until the error in course-to-specialization mapping is
less than or equal to a threshold error percentage. Also, the re-mapping is
performed to improve digital career counselling.
BRIEF DESCRIPTION OF FIGURES
[0010] Having thus described the invention in general terms, reference will
now be made to the accompanying drawings, which are not necessarily drawn to
scale, and wherein:
[0011] FIG. 1 illustrates an interactive computing environment for providing
course categorization for improving digital career counselling, in accordance with
various embodiments of the present disclosure;
[0012] FIG. 2 illustrates a block diagram of a vector decision tree and a
keyword vector tree, in accordance with various embodiments of the present
disclosure; and
[0013] FIG. 3 illustrates a block diagram of a computing device, in
accordance with various embodiments of the present disclosure.
[0014] It should be noted that the accompanying figures are intended to
present illustrations of exemplary embodiments of the present disclosure. These
figures are not intended to limit the scope of the present disclosure. It should also
be noted that accompanying figures are not necessarily drawn to scale.
DETAILED DESCRIPTION
[0015] In the following description, for purposes of explanation, numerous
specific details are set forth in order to provide a thorough understanding of the
present technology. It will be apparent, however, to one skilled in the art that the
present technology can be practiced without these specific details. In other
instances, structures and devices are shown in block diagram form only in order to
avoid obscuring the present technology.
[0016] Reference in this specification to “one embodiment” or “an
embodiment” means that a particular feature, structure, or characteristic described
in connection with the embodiment is included in at least one embodiment of the
present technology. The appearances of the phrase “in one embodiment” in
various places in the specification are not necessarily all referring to the same
embodiment, nor are separate or alternative embodiments mutually exclusive of
other embodiments. Moreover, various features are described which may be
exhibited by some embodiments and not by others. Similarly, various
requirements are described which may be requirements for some embodiments but
not other embodiments.
[0017] Moreover, although the following description contains many specifics
for the purposes of illustration, anyone skilled in the art will appreciate that many
variations and/or alterations to said details are within the scope of the present
technology. Similarly, although many of the features of the present technology
are described in terms of each other, or in conjunction with each other, one skilled
in the art will appreciate that many of these features can be provided
independently of other features. Accordingly, this description of the present
technology is set forth without any loss of generality to, and without imposing
limitations upon, the present technology.
[0018] FIG. 1 illustrates an interactive computing environment 100 for
providing course categorization for improving digital career counselling, in
accordance with various embodiments of the present disclosure. The interactive
computing environment 100 includes one or more users 102, one or more
communication devices 104, a communication network 106, and a course
classification system 108. In addition, the interactive computing environment 100
includes a server 110, and a database 112. The above stated elements of the
interactive computing environment 100 operate coherently and synchronously to
provide course categorization for improving digital career counselling.
[0019] The interactive computing environment 100 includes the one or more
users 102, who are any persons present at any location and accessing information
associated with one or more courses. In addition, the one or more courses are
associated with one or more educational institutions. In general, educational
institutions are institutions engaged in educating students. In an embodiment of
the present disclosure, the one or more educational institutions correspond to
universities. In another embodiment of the present disclosure, the one or more
educational institutions correspond to colleges. In yet another embodiment of the
present disclosure, the one or more educational institutions correspond to
academies. In yet another embodiment of the present disclosure, the one or more
educational institutions correspond to graduate schools. However, the one or
more educational institutions are not limited to above mentioned educational
institutions.
[0020] In addition, the one or more users 102 are individuals or persons
who search for the one or more courses through the one or more communication
devices 104. The one or more courses may be classified into multiple levels on
the course classification system 108. Further, the multiple levels include but may
not be limited to one or more specializations and one or more areas of study. In an
embodiment of the present disclosure, the one or more users 102 search the one or
more courses classified into the one or more specializations. In another
embodiment of the present disclosure, the one or more users 102 search the one or
more courses classified into the one or more areas of study. Furthermore, the one
or more courses include at least one of online courses and offline courses.
[0021] In an embodiment of the present disclosure, the one or more users
102 correspond to employees of an educational consultant organisation. In
addition, the educational consultant organisation provides academic advice to a
student about education and career counselling. In another embodiment of the
present disclosure, the one or more users 102 correspond to a student. In yet
another embodiment of the present disclosure, the one or more users 102
correspond to an independent career consultant. In yet another embodiment of the
present disclosure, the one or more users 102 correspond to an independent
academic advisor. In yet another embodiment of the present disclosure, the one or
more users 102 correspond to a parent.
[0022] In addition, the one or more users 102 may be any person or
individual accessing the one or more communication devices 104. In an
embodiment of the present disclosure, the one or more users 102 are owners of the
one or more communication devices 104. In another embodiment of the present
disclosure, the one or more users 102 are not the owners of the one or more
communication devices 104. In an embodiment of the present disclosure, the one
or more users 102 access the one or more communication devices 104 at home. In
another embodiment of the present disclosure, the one or more users 102 access
the one or more communication devices 104 at a cafe. In yet another embodiment
of the present disclosure, the one or more users 102 access the one or more
communication devices 104 at an office. The one or more users 102 correspond to
any number of persons or individuals associated with the course classification
system 108. The course classification system 108 enables the one or more users
102 to access the one or more courses classified into the multiple levels through
the one or more communication devices 104.
[0023] The interactive computing environment 100 includes the one or more
communication devices 104. The one or more users 102 are connected with the
interactive computing environment 100 through the one or more communication
devices 104. In an embodiment of the present disclosure, the one or more
communication devices 104 facilitate access to the course classification system
108. In an embodiment of the present disclosure, the one or more communication
devices 104 are portable communication devices. The portable communication
devices include but may not be limited to a laptop, smartphone, tablet, and smart
watch. In an example, the smartphone may be an iOS-based smartphone, an
Android-based smartphone, a Windows-based smartphone and the like. In another
embodiment of the present disclosure, the one or more communication devices
104 are fixed communication devices. The fixed communication devices include
but may not be limited to a desktop, workstation, smart TV and mainframe
computer. In an embodiment of the present disclosure, the one or more
communication devices 104 are currently in the switched-on state. The one or
more communication devices 104 are any type of devices having an active
internet connection. In addition, the one or more users 102 access the one or more
communication devices 104 in real-time.
[0024] In an embodiment of the present disclosure, the one or more
communication devices 104 perform computing operations based on a suitable
operating system installed inside the one or more communication devices 104. In
general, the operating system is system software that manages computer hardware
and software resources and provides common services for computer programs. In
addition, the operating system acts as an interface for software installed inside the
one or more communication devices 104 to interact with hardware components of
the one or more communication devices 104. In an embodiment of the present
disclosure, the one or more communication devices 104 perform computing
operations based on any suitable operating system designed for the portable
communication device. In an example, the operating system installed inside the
one or more communication devices 104 is a mobile operating system. Further,
the mobile operating system includes but may not be limited to Windows
operating system, Android operating system, iOS operating system, Symbian
operating system, Bada operating system, BlackBerry operating system,
and Sailfish operating system. In an embodiment of the present disclosure, the one
or more communication devices 104 operate on any version of a particular
operating system corresponding to the above mentioned operating systems.
[0025] In another embodiment of the present disclosure, the one or more
communication devices 104 perform computing operations based on any suitable
operating system designed for the fixed communication device. In an example, the
operating system installed inside the one or more communication devices 104 is
Windows. In another example, the operating system installed inside the one or
more communication devices 104 is Mac. In yet another example, the operating
system installed inside the one or more communication devices 104 is a
Linux-based operating system. In yet another example, the operating system installed
inside the one or more communication devices 104 is Chrome OS. In yet another
example, the operating system installed inside the one or more communication
devices 104 may be one of UNIX, Kali Linux, and the like. However, the
operating system is not limited to above mentioned operating systems.
[0026] In an embodiment of the present disclosure, the one or more
communication devices 104 operate on any version of Windows operating system.
In another embodiment of the present disclosure, the one or more communication
devices 104 operate on any version of Mac operating system. In yet another
embodiment of the present disclosure, the one or more communication devices
104 operate on any version of Linux operating system. In yet another
embodiment of the present disclosure, the one or more communication devices
104 operate on any version of Chrome OS. In yet another embodiment of the
present disclosure, the one or more communication devices 104 operate on any
version of a particular operating system corresponding to the above mentioned
operating systems.
[0027] The one or more communication devices 104 include a memory. In
general, the memory includes computer-storage media in the form of volatile
and/or non-volatile memory. The memory may be removable, non-removable, or
a combination thereof. Exemplary hardware devices include solid-state memory,
hard drives, optical-disc drives, etc. The memory is coupled with one or more
processors. In general, the one or more processors read data from various entities
such as memory or I/O components. The one or more processors execute the one
or more instructions which are stored in the memory. The one or more processors
execute the one or more instructions provided by the course
classification system 108.
[0028] The interactive computing environment 100 includes the
communication network 106. The one or more communication devices 104 are
connected to the communication network 106. The communication network 106
provides a medium for the one or more users 102 to search and access the one or
more courses through the one or more communication devices 104. The
communication network 106 provides the medium for the one or more users 102
to connect with the course classification system 108. In an embodiment of the
present disclosure, the communication network 106 is an internet connection. In
another embodiment of the present disclosure, the communication network 106 is
a wireless mobile network. In yet another embodiment of the present disclosure,
the communication network 106 is a wired network with a finite bandwidth. In
yet another embodiment of the present disclosure, the communication network
106 is a combination of the wireless and the wired network for the optimum
throughput of data transmission. In yet another embodiment of the present
disclosure, the communication network 106 is an optical fiber high bandwidth
network that enables a high data rate with negligible connection drops. The
communication network 106 includes a set of channels. Each channel of the set of
channels supports a finite bandwidth. Moreover, the finite bandwidth of each
channel of the set of channels is based on capacity of the communication network
106. The communication network 106 connects the one or more communication
devices 104 to the course classification system 108 using a plurality of methods.
The plurality of methods used to provide network connectivity to the one or more
communication devices 104 includes 2G, 3G, 4G, 5G, Wi-Fi and the like.
[0029] The interactive computing environment 100 includes the course
classification system 108. The course classification system 108 is associated with
the educational consultant organisation and the one or more users 102. In
addition, the course classification system 108 provides course categorization to
improve digital career counselling. Further, the course classification system 108
dynamically analyses a first set of data associated with the one or more courses.
Furthermore, the first set of data includes seed content of each of the one or more
courses, one or more documents associated with each of the one or more courses,
and one or more keywords utilized for each of the one or more courses.
[0030] The course classification system 108 collects the first set of data from
one or more sources. In addition, the first set of data corresponds to data
associated with the one or more courses. Further, the first set of data is collected
for each of the one or more specializations in real-time. Furthermore, the one or
more sources include but may not be limited to at least one of online websites
associated with universities, third-party websites, public portals, private portals,
and offline sources. In an embodiment of the present disclosure, the course
classification system 108 collects the first set of data from online websites
associated with the one or more educational institutions. In another embodiment
of the present disclosure, the course classification system 108 collects the first set
of data from online websites associated with third-party educational consultant
organisation. In yet another embodiment of the present disclosure, the course
classification system 108 collects the first set of data from the independent career
consultant. In yet another embodiment of the present disclosure, the course
classification system 108 collects the first set of data from the independent
academic advisor. However, the one or more sources are not limited to above
mentioned sources.
[0031] The course classification system 108 scans content of the first set of
data in real-time. In addition, the course classification system 108 performs the
scanning to extract the one or more keywords. Further, the one or more keywords
are extracted to assign weightage to each of the one or more keywords for each of
the one or more specializations. Furthermore, the one or more keywords are
extracted for each of the one or more specializations. The course classification
system 108 performs the extraction of the one or more keywords to create a
keyword bank. The course classification system 108 performs the extraction
using one or more hardware-run algorithms. In addition, the one or more
hardware-run algorithms include at least one of natural language processing
algorithms, machine learning algorithms and artificial intelligence algorithms.
The course classification system 108 creates the keyword bank and a course bank
based on the scanning of the first set of data. Further, the keyword bank includes
the one or more keywords. Furthermore, the course bank includes the one or more
courses. Moreover, the keyword bank and the course bank are dynamic in nature.
The course classification system 108 dynamically updates the keyword bank and
the course bank with new keywords and new courses based on the scanning of the
first set of data in real-time.
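By way of a non-limiting illustration, the extraction of keywords into a per-specialization keyword bank may be sketched as follows. The disclosure does not specify the extraction algorithm; the regular-expression tokenizer, the stop-word list, and the names `extract_keywords` and `update_keyword_bank` are assumptions made only for this example.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list; a practical system would use a fuller one.
STOP_WORDS = {"a", "an", "and", "in", "of", "the", "to", "for", "with"}

def extract_keywords(text):
    """Lower-case the text, split it into alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [token for token in tokens if token not in STOP_WORDS]

def update_keyword_bank(keyword_bank, specialization, document_text):
    """Add the keywords found in one document to the bank entry of a specialization."""
    keyword_bank[specialization].update(extract_keywords(document_text))
    return keyword_bank

# The bank is dynamic: each newly scanned document may add new keywords.
keyword_bank = defaultdict(set)
update_keyword_bank(keyword_bank, "criminology",
                    "Introduction to the Theory of Crime and Policing")
update_keyword_bank(keyword_bank, "criminology",
                    "Forensic Psychology and Crime")
```

Keeping a set per specialization means that repeated scans of the same content leave the bank unchanged, while new content extends it.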
[0032] The course classification system 108 calculates weightage to assign to
each of the one or more keywords. In addition, the course classification system
108 calculates the weightage by calculating the occurrence count of each of the
one or more keywords in one or more documents, the total number of the one or
more documents, and the number of the one or more documents in which the one
or more keywords are present. In an embodiment of the present disclosure, the
occurrence count corresponds to frequency distribution of each of the one or more
keywords in the one or more documents. In addition, the frequency distribution
corresponds to the number of times each of the one or more keywords occurs
and reoccurs in the one or more documents. Further, the weightage of the
corresponding keyword of the one or more keywords enables the course
classification system 108 to detect the relevant specialization to map to each of
the one or more courses. Furthermore, the one or more keywords are categorised
to the one or more specialisations with different weightages. Moreover, the one or
more keywords and the weightage are prepared by the one or more hardware-run
algorithms. Also, the one or more keywords and the weightage are utilized to
enable the course classification system 108 to categorize and classify each of the
one or more courses.
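By way of a non-limiting illustration, the three quantities named above (occurrence count, total number of documents, and number of documents containing the keyword) are the inputs of a TF-IDF style weight. The disclosure does not state the exact formula, so the combination used below is an assumption, and the function name `keyword_weightage` is hypothetical.

```python
import math

def keyword_weightage(keyword, documents):
    """Return a TF-IDF style weight of `keyword` over a list of tokenised documents.

    Assumed formula: occurrence_count * log(total_documents / documents_with_keyword).
    """
    total_documents = len(documents)
    occurrence_count = sum(doc.count(keyword) for doc in documents)
    documents_with_keyword = sum(1 for doc in documents if keyword in doc)
    if documents_with_keyword == 0:
        return 0.0  # a keyword that never occurs carries no weight
    return occurrence_count * math.log(total_documents / documents_with_keyword)

# Each document is a list of already-extracted keywords (illustrative data).
documents = [
    ["crime", "law", "crime"],
    ["law", "contract"],
    ["geography", "maps"],
]
crime_weight = keyword_weightage("crime", documents)
law_weight = keyword_weightage("law", documents)
```

Under this assumed formula, a keyword concentrated in few documents ("crime") receives a higher weight than one spread across many ("law"), even when both occur the same number of times.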
[0033] The course classification system 108 recommends a preferred list of
courses to each of the one or more users 102. In addition, the preferred list of
courses is recommended to each of the one or more users 102 based on a plurality
of factors. Further, the plurality of factors includes interest of each of the one or more
users 102, time duration of courses, preferred choice of university of each of the
one or more users 102, preferred fee of courses and other factors associated with
courses. In an embodiment of the present disclosure, the course classification
system 108 recommends the preferred list of courses to employees of the
educational consultant organisation. In another embodiment of the present
disclosure, the course classification system 108 recommends the preferred list of
courses to the student. In yet another embodiment of the present disclosure, the
course classification system 108 recommends the preferred list of courses to the
independent career consultant. In yet another embodiment of the present
disclosure, the course classification system 108 recommends the preferred list of
courses to the independent academic advisor. In yet another embodiment of the
present disclosure, the course classification system 108 recommends the preferred
list of courses to the parent.
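By way of a non-limiting illustration, a recommendation based on the stated factors may be sketched as a filter followed by a sort. The `Course` fields, the `recommend` function, and the ordering rule (preferred university first, then lower fee) are assumptions for this example and are not taken from the disclosure.

```python
from dataclasses import dataclass

@dataclass
class Course:
    name: str
    specialization: str
    university: str
    duration_months: int
    fee: float

def recommend(courses, interest, max_duration_months, max_fee,
              preferred_university=None):
    """Keep courses matching the user's interest, duration and fee limits."""
    matches = [
        course for course in courses
        if course.specialization == interest
        and course.duration_months <= max_duration_months
        and course.fee <= max_fee
    ]
    # Courses at the preferred university sort first; ties break on lower fee.
    return sorted(
        matches,
        key=lambda course: (course.university != preferred_university, course.fee),
    )

# Illustrative data only.
courses = [
    Course("BSc Criminology", "criminology", "U1", 36, 9000.0),
    Course("Criminology Diploma", "criminology", "U2", 12, 4000.0),
    Course("LLB", "law", "U1", 36, 8000.0),
    Course("MSc Criminology", "criminology", "U3", 24, 20000.0),
]
preferred = recommend(courses, "criminology", 36, 10000.0,
                      preferred_university="U1")
```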
[0034] The course classification system 108 categorizes the one or more
courses into the multiple levels based on the one or more hardware-run
algorithms. In an embodiment of the present disclosure, the course classification
system 108 has a vector decision tree and a keyword vector tree based on the one
or more hardware-run algorithms. In addition, the vector decision tree and the
keyword vector tree enable the course classification system 108 to categorise the
one or more courses into the multiple levels. Further, the vector decision tree and
the keyword vector tree correspond to mapping algorithms based on the weightage
of each of the one or more keywords and direction of each of the one or more
keywords towards the one or more specializations. Furthermore, the vector
decision tree has the mapping for all possible specialisations of each of the one or
more courses. The keyword vector tree has the mapping for all possible
specialisations to keywords.
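By way of a non-limiting illustration, the keyword vector tree may be pictured as a mapping from each keyword to the specializations it points towards, with a weightage per direction. The nested-dictionary layout, the sample weights, and the summed-score rule below are assumptions; the disclosure does not define the internal structure of either tree.

```python
from collections import defaultdict

# Hypothetical keyword vector tree: keyword -> {specialization: weightage}.
KEYWORD_VECTOR_TREE = {
    "crime": {"criminology": 0.9, "law": 0.2},
    "law": {"law": 0.6, "criminology": 0.1},
    "maps": {"geography": 0.7},
}

def score_specializations(course_keywords, keyword_tree):
    """Sum, per specialization, the weightage of every keyword of the course."""
    scores = defaultdict(float)
    for keyword in course_keywords:
        for specialization, weight in keyword_tree.get(keyword, {}).items():
            scores[specialization] += weight
    return dict(scores)

def map_course(course_keywords, keyword_tree):
    """Return the highest-scoring specialization, or None if no keyword matches."""
    scores = score_specializations(course_keywords, keyword_tree)
    if not scores:
        return None
    return max(scores, key=scores.get)

best = map_course(["crime", "law"], KEYWORD_VECTOR_TREE)
```

The dictionary returned by `score_specializations` plays the role of the vector decision tree entry for one course: it lists every candidate specialization together with its accumulated score.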
[0035] The course classification system 108 classifies the one or more
courses into the one or more areas of study. Furthermore, the course classification
system 108 classifies the one or more courses into the one or more specializations.
Moreover, the one or more specializations correspond to a particular academic
subject area in which the one or more courses are categorized. Also, the one or
more areas of study correspond to a broader academic subject area in which the
one or more courses are categorized. Also, the one or more specializations may
be categorized in each of the one or more areas of study. Also, each of the one
or more specializations is a focused area of study. In an example, a specialization
S1 (let’s say commerce) is a subcategory or subset of an area of study A1 (let’s say
management). In another example, a specialization S2 (let’s say criminology) is
a subcategory or subset of an area of study A2 (let’s say Law). In yet another
example, a specialization S3 (let’s say geography) is a subcategory or subset of an
area of study A3 (let’s say Social Science).
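By way of a non-limiting illustration, the two levels may be held as a mapping from each area of study to its specializations, using the three examples given above. The dictionary layout and the function name `area_for_specialization` are assumptions made only for this example.

```python
# Area of study -> specializations, taken from the examples S1/A1, S2/A2, S3/A3.
AREA_OF_STUDY = {
    "Management": ["commerce"],
    "Law": ["criminology"],
    "Social Science": ["geography"],
}

def area_for_specialization(specialization):
    """Return the broader area of study that contains the given specialization."""
    for area, specializations in AREA_OF_STUDY.items():
        if specialization in specializations:
            return area
    return None  # specialization not yet placed under any area of study

area = area_for_specialization("criminology")
```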
[0036] The course classification system 108 maps the one or more keywords
present in the keyword bank with one or more keywords present in the one or
more courses in real-time. In addition, the course classification system 108
performs the mapping to associate each of the one or more courses with the one or
more specializations. The course classification system 108 receives real-time
feedback from the one or more users 102. The real-time feedback is associated
with incorrect mapping of each of the one or more courses under each of the one
or more specializations. The real-time feedback is received after a pre-defined
interval. The real-time feedback facilitates re-mapping of each of the one or more
courses with each of the one or more specializations in real-time. In an example,
the pre-defined interval is
[0037] The course classification system 108 re-maps each of the one or more
courses with each of the one or more specializations in real-time. In addition, the
course classification system 108 performs the re-mapping based on feedback
received with the facilitation of one or more methods. The received feedback enables
variation in the weightage of each of the one or more keywords for each of the one or
more specializations. The course classification system 108 performs the re-mapping
to change the weightage of the one or more erroneous keywords and to
increase the accuracy of the mapping of the names of the one or more courses with the
one or more specializations. The weightage may increase or decrease. The course
classification system 108 performs the re-mapping continuously until the error in
course-to-specialization mapping is less than or equal to a threshold error
percentage. The course classification system 108 performs the re-mapping to
improve digital career counselling. In an embodiment of the present disclosure,
the threshold error percentage has a value of about 5 percent. In another
embodiment of the present disclosure, the value of the threshold error percentage
may vary. In addition, the threshold error percentage is defined by an
administrator.
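A minimal sketch of such a feedback-driven re-mapping loop is given below, purely as an illustration. The disclosure does not specify how the weightage is varied; the halving and doubling rule (`step`), the substring-based keyword check, the cap on rounds (`max_rounds`) and all function names are assumptions made for this sketch.

```python
# Illustrative sketch only. The update rule (multiply or divide by `step`) and
# the helper names are assumptions; the disclosure states only that weightage
# may increase or decrease until the error is at or below a threshold.

def classify(content, weights):
    """Return the specialization with the highest summed keyword weightage."""
    totals = {}
    for keyword, per_specialization in weights.items():
        if keyword in content:
            for specialization, weight in per_specialization.items():
                totals[specialization] = totals.get(specialization, 0.0) + weight
    return max(totals, key=totals.get) if totals else None


def remap_until_accurate(courses, reported, weights,
                         threshold=0.05, step=0.5, max_rounds=50):
    """Adjust `weights` in place until the mapping error is <= threshold.

    courses:  course name -> course content
    reported: course name -> specialization that user feedback reports as correct
    """
    if not courses:
        return 0.0
    error = 1.0
    for _ in range(max_rounds):
        wrong = []
        for name, content in courses.items():
            mapped = classify(content, weights)
            if mapped != reported[name]:
                wrong.append((name, mapped))
        error = len(wrong) / len(courses)
        if error <= threshold:
            break
        for name, mapped in wrong:
            for keyword, per_specialization in weights.items():
                if keyword in courses[name]:
                    if mapped in per_specialization:
                        per_specialization[mapped] *= step            # decrease
                    if reported[name] in per_specialization:
                        per_specialization[reported[name]] /= step    # increase
    return error


weights = {"analytics": {"information technology": 200.0, "business analytics": 150.0}}
courses = {"MBA Analytics": "business analytics for managers"}
reported = {"MBA Analytics": "business analytics"}
print(remap_until_accurate(courses, reported, weights))
```

In this sketch the default `threshold` of 0.05 corresponds to the threshold error percentage of about 5 percent mentioned above.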
[0038] The one or more methods include elimination of blacklisted
keywords, elimination of synonym keywords, restriction of keywords specific
to the specialization of courses, and the like. In addition, the blacklisted keywords
include a list of keywords that are eliminated from the one or more keywords.
Further, the synonym keywords include a list of keywords with similar meaning to
each other. Furthermore, the keywords specific to the specialization of courses
include a list of keywords that are specific to the specialization of courses.
Moreover, the one or more methods include specific handling of specialization
contradictory keywords, elimination of garbage characters, keyword presence
check, and the like. Also, the keyword presence check is performed based on
smart handling of trailing of successive keywords.
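Some of the methods listed above may be sketched as a small keyword-cleaning routine, offered only as an illustration. The blacklist and synonym entries are hypothetical, and treating "elimination of synonym keywords" as collapsing each synonym into one canonical keyword is an assumption of this sketch.

```python
import re

# Hypothetical example data; the disclosure does not list actual entries.
BLACKLIST = {"course", "university"}
SYNONYMS = {"big data analytics": "data analytics"}   # variant -> canonical form


def clean_keywords(raw_keywords):
    """Apply garbage-character, synonym and blacklist handling to raw keywords."""
    cleaned = []
    for keyword in raw_keywords:
        # Elimination of garbage characters: keep lowercase letters, digits, spaces.
        keyword = re.sub(r"[^a-z0-9 ]+", " ", keyword.lower())
        keyword = re.sub(r"\s+", " ", keyword).strip()
        # Synonym handling (assumed): map a variant onto its canonical keyword.
        keyword = SYNONYMS.get(keyword, keyword)
        # Blacklist handling, plus skipping empty strings and duplicates.
        if keyword and keyword not in BLACKLIST and keyword not in cleaned:
            cleaned.append(keyword)
    return cleaned


print(clean_keywords(["Data  Analytics™", "Big-Data Analytics", "course", "##"]))
```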
[0039] The interactive computing environment 100 includes the server 110
and the database 112. The course classification system 108 is associated with the
server 110. In general, a server is a computer program or device that provides
functionality for other programs or devices. The server 110 provides various
functionalities, such as sharing data or resources among multiple users, or
performing computation for the one or more users 102. However, those skilled in
the art would appreciate that the course classification system 108 may be connected to
a greater number of servers. Furthermore, it may be noted that the server 110
includes the database 112. However, those skilled in the art would appreciate that
a greater number of servers may include a greater number of databases.
[0040] In an embodiment of the present disclosure, the course classification
system 108 is located in the server 110. In another embodiment of the present
disclosure, the course classification system 108 is connected with the server 110.
In yet another embodiment of the present disclosure, the course classification
system 108 is a part of the server 110. The server 110 handles each operation and
task performed by the course classification system 108. The server 110 stores one
or more instructions for performing the various operations of the course
classification system 108. The server 110 is located remotely from the one or
more users 102. The server 110 is associated with an administrator. In general,
the administrator manages the different components in the course classification
system 108. The administrator coordinates the activities of the components
involved in the course classification system 108. The administrator is any person
or individual who monitors the working of the course classification system 108
and the server 110 in real time through a communication device. The
communication device includes a laptop, a desktop computer, a
tablet, a personal digital assistant and the like.
[0041] The database 112 stores different sets of information associated with
various components of the course classification system 108. In general, databases
are used to hold general information and specialized data, such as characteristics
data of the one or more users 102, data of the one or more communication devices
104, data of the one or more courses and the like. For example, the database 112
includes the first set of data, the one or more documents and the pre-defined
instructions. The database 112 stores the information of the one or more courses,
the one or more educational institutions, profiles of the one or more users 102, and
the like. The database 112 organizes the data using models such as relational
models or hierarchical models. Further, the database 112 stores data provided by
the one or more sources.
[0042] FIG. 2 illustrates a block diagram 200 of the vector decision tree and
the keyword vector tree, in accordance with various embodiments of the present
disclosure. In addition, the block diagram 200 includes the keyword bank, the
course bank, the vector decision tree, the keyword vector tree, specific handlings,
and algorithm and logic. The keyword bank is utilized to extract the one or more
keywords to represent a specific specialization based on the frequency distribution
of each of the one or more keywords in the content of the one or more courses.
Further, the course classification system 108 receives the seed content of each of
the one or more courses for each of the one or more specializations. In an
example, the number of seed contents received by the course classification system
108 for each of the one or more specializations is about 10. In another example,
the number of seed contents received by the course classification system 108 for
each of the one or more specializations may vary. The course classification system
108 executes the one or more hardware-run algorithms to load the keyword bank
and the first set of data to create the keyword vector tree for the one or more
keywords. In an example, a keyword K1 (let’s say data analytics) has a list of
specializations. In addition, the list of specializations includes a specialization S1
(let’s say data science and big data), a specialization S2 (let’s say information
technology), a specialization S3 (let’s say computer science), and a specialization
S4 (let’s say business analytics). Further, the keyword K1 has a weightage W1
(let’s say 200) for the specialization S1. Furthermore, the keyword K1 has a
weightage W2 (let’s say 200) for the specialization S2. Moreover, the keyword
K1 has a weightage W3 (let’s say 150) for the specialization S3. Also, the
keyword K1 has a weightage W4 (let’s say 150) for the specialization S4. The
course classification system 108 scans the seed content of each of the one or more
courses for each of the one or more specializations.
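For illustration, the keyword K1 example above may be represented as a nested mapping from keyword to specialization to weightage. The use of a dictionary for the keyword vector tree and the helper name `specializations_for` are assumptions of this sketch and not a statement of how the tree is actually stored.

```python
# Illustrative sketch: the keyword vector tree as keyword -> {specialization: weightage}.
# The values repeat the K1 example above (W1 to W4).
keyword_vector_tree = {
    "data analytics": {                       # K1
        "data science and big data": 200,     # S1, W1
        "information technology": 200,        # S2, W2
        "computer science": 150,              # S3, W3
        "business analytics": 150,            # S4, W4
    },
}


def specializations_for(keyword, tree):
    """Return (specialization, weightage) pairs, highest weightage first."""
    pairs = tree.get(keyword, {}).items()
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


for specialization, weightage in specializations_for("data analytics", keyword_vector_tree):
    print(specialization, weightage)
```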
[0043] In addition, the course classification system 108 extracts the one or
more keywords using an N-gram model. Further, the N-gram model is used to
predict and extract the one or more keywords using natural language processing.
Furthermore, the N-gram model enables the course classification system 108 to
recognize the specific pattern of each of the one or more keywords in the seed content
of each of the one or more courses for each of the one or more specializations. In
an embodiment of the present disclosure, the N-gram model is a unigram model.
In another embodiment of the present disclosure, the N-gram model is a bigram
model. In yet another embodiment of the present disclosure, the N-gram model is
a trigram model. However, the N-gram model is not limited to the above-mentioned
models.
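A minimal sketch of unigram, bigram and trigram extraction from seed content is shown below for illustration. The tokenization rule (lowercase alphanumeric runs) and the function names are assumptions; the disclosure does not specify how the text is tokenized.

```python
import re
from collections import Counter


def extract_ngrams(text, n):
    """Return the word n-grams of `text` in order of appearance."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())   # assumed tokenization
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_frequencies(text, max_n=3):
    """Count unigrams up to `max_n`-grams, giving a frequency distribution."""
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(extract_ngrams(text, n))
    return counts


seed = "Data analytics and data science"
print(extract_ngrams(seed, 2))
print(ngram_frequencies(seed)["data"])
```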
[0044] In an example, the keyword vector tree includes a keyword 1 and a
keyword 2. In addition, the keyword 1 has a specific weightage for a
specialization 1. Further, the keyword 1 has a specific weightage for a
specialization 2. Furthermore, the keyword 1 has a specific weightage for a
specialization 3. Moreover, the keyword 2 has a specific weightage for the
specialization 1. Also, the keyword 2 has a specific weightage for the specialization
2.
[0045] In addition, the course classification system 108 assigns the
weightage to each of the one or more keywords based on the frequency
distribution of each of the one or more keywords in the one or more documents.
Further, the weightage of each of the one or more keywords in a document from
the one or more documents is represented as:
$$KW(n) = \left( KW(n-1) \;\ldots\; \frac{1}{\ldots} \right)$$
wherein:
n represents the occurrence count of each of the one or more keywords in the
document; and
KW represents the weightage of the one or more keywords.
[0046] The course classification system 108 calculates the summation of
KW(n) for each of the one or more documents using the keyword vector tree. In
addition, the summation of the weightage of each of the one or more keywords is
represented as:
$$SWK_{keyword} = \left( \sum KW(n) \,/\, \ldots \right)$$
wherein:
p represents the total number of the one or more documents;
n represents the occurrence count of each of the one or more keywords in the
document; and
m represents the number of documents in which the keyword is present.
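The following sketch illustrates one possible reading of the two weightage expressions. Since the expressions as reproduced in this text do not fully specify the increment of KW(n) or the divisor of the summation, the sketch assumes KW(n) = KW(n-1) + 1/n with KW(0) = 0, and assumes that the summation over documents is divided by m. Both forms, and the function names, are assumptions made for illustration only.

```python
from collections import Counter


def keyword_weight(n):
    """KW(n) under the ASSUMED recurrence KW(n) = KW(n-1) + 1/n, with KW(0) = 0."""
    weight = 0.0
    for k in range(1, n + 1):
        weight += 1.0 / k
    return weight


def summed_weight(keyword, documents):
    """SWK for one keyword: sum of KW(n) over all documents, ASSUMED divided by m.

    documents: list of token lists, one list per document.
    """
    counts = [Counter(document)[keyword] for document in documents]  # n per document
    m = sum(1 for n in counts if n > 0)    # documents in which the keyword is present
    if m == 0:
        return 0.0
    return sum(keyword_weight(n) for n in counts) / m


documents = [["data", "data", "ml"], ["data"], ["law"]]
print(summed_weight("data", documents))
```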
[0047] The course classification system 108 normalizes the SWK_keyword for
each of the one or more keywords into a predefined range of numerical values using
the keyword vector tree. In addition, the predefined range of numerical values is
defined by the administrator. In an example, the predefined range of numerical
values is from 0 to 100. In another example, the predefined range of numerical values
may vary. Further, the course classification system 108 allocates the highest
numerical value to the highest summation based on the normalization using the
keyword vector tree.
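As an illustration of this normalization, the sketch below scales each summation linearly so that the highest summation receives the highest value of the range. Linear scaling by the maximum is an assumption; the disclosure states only that the highest numerical value is allocated to the highest summation. The function name and parameters are hypothetical.

```python
def normalize_weights(summations, low=0.0, high=100.0):
    """Scale keyword summations into [low, high]; the largest maps to `high`.

    summations: keyword -> SWK value (assumed non-negative).
    """
    if not summations:
        return {}
    top = max(summations.values())
    if top <= 0:
        return {keyword: low for keyword in summations}
    return {
        keyword: low + (value / top) * (high - low)
        for keyword, value in summations.items()
    }


print(normalize_weights({"data analytics": 2.0, "statistics": 1.0}))
```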
[0048] The course classification system 108 collects the one or more courses
to run the vector decision tree to classify the one or more courses into multiple
levels. In an example, the vector decision tree includes a course 1 and a course 2.
In addition, the course 1 may be classified in a specialization 1 and a
specialization 2. Further, the specialization 1 for the course 1 has a keyword 1
and a keyword 2. Furthermore, the keyword 1 has a specific weightage for the
specialization 1 of the course 1. Moreover, the keyword 2 has a specific
weightage for the specialization 1 of the course 1. Also, the specialization 2 for
the course 1 has a keyword 1 and a keyword 2. Furthermore, the keyword 1 has a
specific weightage for the specialization 2 of the course 1. Moreover, the
keyword 2 has a specific weightage for the specialization 2 of the course 1. Also,
the course 2 may be classified in a specialization 1. Also, the specialization 1 for
the course 2 has a keyword 1. Also, the keyword 1 has a specific weightage for
the specialization 1 of the course 2.
[0049] The course classification system 108 executes the one or more
hardware-run algorithms to scan the keyword vector tree and detect the one or more
keywords that are present in the content available for the one or more courses, in
order to create the vector decision tree. In addition, the course classification system 108
stores the corresponding specialization of the one or more specializations for
the respective course of the one or more courses with the summation of weightage for
each of the one or more keywords.
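The creation of the vector decision tree from the keyword vector tree may be sketched as follows, for illustration only. The nested-dictionary layout (course, then specialization, then keyword and weightage), the plain substring check for keyword presence, and the function names are assumptions of this sketch.

```python
def build_vector_decision_tree(courses, keyword_vector_tree):
    """Return course -> specialization -> {keyword: weightage}.

    courses: course name -> course content (free text).
    keyword_vector_tree: keyword -> {specialization: weightage}.
    """
    tree = {}
    for course, content in courses.items():
        text = content.lower()
        per_specialization = {}
        for keyword, weights in keyword_vector_tree.items():
            if keyword in text:   # simplistic presence check (assumption)
                for specialization, weight in weights.items():
                    per_specialization.setdefault(specialization, {})[keyword] = weight
        tree[course] = per_specialization
    return tree


def best_specialization(tree, course):
    """Pick the specialization with the highest summation of keyword weightage."""
    totals = {
        specialization: sum(keywords.values())
        for specialization, keywords in tree[course].items()
    }
    return max(totals, key=totals.get) if totals else None


kvt = {
    "data analytics": {"data science and big data": 200, "business analytics": 150},
    "statistics": {"data science and big data": 50},
}
courses = {"MSc Data Analytics": "Covers data analytics and statistics."}
decision_tree = build_vector_decision_tree(courses, kvt)
print(best_specialization(decision_tree, "MSc Data Analytics"))
```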
[0050] The course classification system 108 performs specific handlings to
calculate the vector decision tree. The specific handlings correspond to the one or
more methods that are utilized for the re-mapping. The specific handlings include
elimination of blacklisted keywords, elimination of synonym keywords and
restriction of keywords specific to the specialization of courses. The blacklisted
keywords include a list of keywords that are eliminated from the one or more
keywords. The synonym keywords include a list of keywords with similar meaning
to each other. The keywords specific to the specialization of courses include a list
of keywords that are specific to the specialization of courses. In addition, the
specific handlings include specific handling of specialization contradictory
keywords, elimination of garbage characters, and keyword presence check. The
keyword presence check is performed based on smart handling of trailing of
successive keywords.
[0051] The course classification system 108 again creates the vector decision
tree, the keyword vector tree, and the summation of the weightage of each of the
one or more keywords based on the initial cycle, the specific handlings, and the
real-time feedback. The course classification system 108 thereby increases the accuracy of
keyword mapping. The course classification system 108 identifies the one or
more erroneous keywords from the one or more keywords in the one or more
specializations mapped in the keyword bank from the vector decision tree. In
addition, the summation of the weightage of each of the one or more keywords
with an error factor is represented as:
$$SWK_{keyword} = \left( \sum KW(n) \,/\, (\ldots - \ldots) \right)$$
wherein:
p represents the total number of the one or more documents;
n represents the occurrence count of each of the one or more keywords in the
document;
erc represents the count of errors reported by users; and
m represents the number of documents in which the keyword is present.
[0052] FIG. 3 illustrates the block diagram of a computing device 300, in
accordance with various embodiments of the present disclosure. The computing
device 300 includes a bus 302 that directly or indirectly couples the following
devices: memory 304, one or more processors 306, one or more presentation
components 308, one or more input/output (I/O) ports 310, one or more
input/output components 312, and an illustrative power supply 314. The bus 302
represents what may be one or more buses (such as an address bus, data bus, or
combination thereof). Although the various blocks of FIG. 3 are shown with lines
for the sake of clarity, in reality, delineating various components is not so clear,
and metaphorically, the lines would more accurately be grey and fuzzy. For
example, one may consider a presentation component such as a display device to
be an I/O component. Also, processors have memory. The inventors recognize
that such is the nature of the art, and reiterate that the diagram of FIG. 3 is merely
illustrative of an exemplary computing device 300 that can be used in connection
with one or more embodiments of the present invention. Distinction is not made
between such categories as “workstation,” “server,” “laptop,” “hand-held device,”
etc., as all are contemplated within the scope of FIG. 3 and reference to
“computing device”.
[0053] The computing device 300 typically includes a variety of
computer-readable media. The computer-readable media can be any available media that
can be accessed by the computing device 300 and includes both volatile and
nonvolatile media, removable and non-removable media. By way of example,
and not limitation, the computer-readable media may comprise computer storage
media and communication media. The computer storage media includes volatile
and nonvolatile, removable and non-removable media implemented in any method
or technology for storage of information such as computer-readable instructions,
data structures, program modules or other data.
[0054] The computer storage media includes, but is not limited to, RAM,
ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital
versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic
tape, magnetic disk storage or other magnetic storage devices, or any other
medium which can be used to store the desired information and which can be
accessed by the computing device 300. The communication media typically
embodies computer-readable instructions, data structures, program modules or
other data in a modulated data signal such as a carrier wave or other transport
mechanism and includes any information delivery media. The term “modulated
data signal” means a signal that has one or more of its characteristics set or
changed in such a manner as to encode information in the signal. By way of
example, and not limitation, communication media includes wired media such as a
wired network or direct-wired connection, and wireless media such as acoustic,
RF, infrared and other wireless media. Combinations of any of the above should
also be included within the scope of computer-readable media.
[0055] Memory 304 includes computer-storage media in the form of volatile
and/or nonvolatile memory. The memory 304 may be removable, non-removable,
or a combination thereof. Exemplary hardware devices include solid-state
memory, hard drives, optical-disc drives, etc. The computing device 300 includes
one or more processors that read data from various entities such as memory 304 or
I/O components 312. The one or more presentation components 308 present data
indications to a subscriber or other device. Exemplary presentation components
include a display device, speaker, printing component, vibrating component, etc.
The one or more I/O ports 310 allow the computing device 300 to be logically
coupled to other devices including the one or more I/O components 312, some of
which may be built in. Illustrative components include a microphone, joystick,
game pad, satellite dish, scanner, printer, wireless device, etc.
We Claim:
1. A computer system comprising:
one or more processors; and
a memory coupled to the one or more processors, the memory for storing
instructions which, when executed by the one or more processors, cause the one or
more processors to perform a method for providing course categorization for
improving digital career counselling, the method comprising:
collecting, at a course classification system (108), a first set of data
from one or more sources, wherein the first set of data is data associated
with one or more courses, wherein the first set of data is collected for each
of one or more specializations in real-time;
scanning, at the course classification system (108), content of the
first set of data in real-time, wherein the scanning is performed for
extracting one or more keywords, wherein the one or more keywords are
extracted for assigning weightage to each of the one or more keywords for
each of the one or more specializations, wherein the one or more keywords
are extracted for each of the one or more specializations, wherein the
extraction of the one or more keywords is performed for creating a
keyword bank, wherein the extraction is performed using one or more
hardware-run algorithms;
mapping, at the course classification system (108), the one or more
keywords present in the keyword bank with one or more keywords present
in the one or more courses in real-time, wherein the mapping is performed
for associating each of the one or more courses with one or more
specialization; and
re-mapping, at the course classification system (108), each of the
one or more courses with each of the one or more specialization in real-time,
wherein re-mapping is performed based on feedback received with
facilitation of one or more methods, wherein the received feedback enables
variation in weightage of each of the one or more keywords for each of the
one or more specializations, wherein the re-mapping is done for
performing change in weightage of the one or more erroneous keywords
and increasing accuracy of the mapping of name of the one or more
courses with the one or more specialization, wherein the re-mapping is
performed continuously until accuracy of course to specialization mapping
is less than or equal to a threshold error percentage, wherein the re-mapping
is performed for improving digital career counselling.
2. The computer system as recited in claim 1, wherein the one or more
sources comprise at least one of online websites associated with universities,
third-party websites, public portals, private portals and offline sources.
3. The computer system as recited in claim 1, wherein the one or more
hardware-run algorithms comprise at least one of natural language processing
algorithms, machine learning algorithms and artificial intelligence algorithms.
4. The computer system as recited in claim 1, further comprising calculating,
at the course classification system (108), weightage for assigning to each of the
one or more keywords, wherein the weightage is calculated by calculating
occurrence count of each of the one or more keywords in one or more documents,
total number of the one or more documents, and number of the one or more
documents in which the one or more keywords are present.
5. The computer system as recited in claim 1, further comprising
recommending, at the course classification system (108), preferred list of courses
to each of one or more users (102), wherein the preferred list of courses is
recommended to each of the one or more users (102) based on a plurality of
factors, wherein the plurality of factors comprise interest of each of the one or
more users (102), time duration of courses, preferred choice of university of each
of the one or more users (102), preferred fee of courses and other factors
associated with courses.
6. The computer system as recited in claim 1, further comprising
categorizing, at the course classification system (108), the one or more courses
into multiple levels, wherein the course classification system (108) classifies the
one or more courses into one or more areas of study, wherein the course
classification system (108) classifies the one or more courses into the one or more
specializations.
7. The computer system as recited in claim 1, further comprising receiving, at
the course classification system (108), real-time feedback from the one or more
users (102), wherein the real-time feedback is associated with incorrect mapping
of each of the one or more courses under each of the one or more specializations,
wherein the real-time feedback is received after a pre-defined interval, wherein
the real-time feedback facilitates re-mapping of each of the one or more courses
with each of the one or more specializations in real-time.
8. The computer system as recited in claim 1, wherein the one or more
methods comprise elimination of blacklisted keywords, elimination of synonym
keywords and restriction of keywords specific to the specialization of courses,
wherein the blacklisted keywords comprise a list of keywords that are eliminated
from the one or more keywords, wherein the synonym keywords comprise a list
of keywords with similar meaning to each other, wherein the keywords specific to
the specialization of courses comprise a list of keywords that are specific to the
specialization of courses.
9. The computer system as recited in claim 1, wherein the one or more
methods comprise specific handling of specialization contradictory keywords,
elimination of garbage characters, and keyword presence check, wherein the
keyword presence check is performed based on smart handling of trailing of
successive keywords.
| # | Name | Date |
|---|---|---|
| 1 | 202011032555-COMPLETE SPECIFICATION [29-07-2020(online)].pdf | 2020-07-29 |
| 2 | 202011032555-DECLARATION OF INVENTORSHIP (FORM 5) [29-07-2020(online)].pdf | 2020-07-29 |
| 3 | 202011032555-DRAWINGS [29-07-2020(online)].pdf | 2020-07-29 |
| 4 | 202011032555-EVIDENCE FOR REGISTRATION UNDER SSI [29-07-2020(online)].pdf | 2020-07-29 |
| 5 | 202011032555-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [29-07-2020(online)].pdf | 2020-07-29 |
| 6 | 202011032555-FIGURE OF ABSTRACT [29-07-2020(online)].jpg | 2020-07-29 |
| 7 | 202011032555-FORM 1 [29-07-2020(online)].pdf | 2020-07-29 |
| 8 | 202011032555-FORM FOR SMALL ENTITY(FORM-28) [29-07-2020(online)].pdf | 2020-07-29 |
| 9 | 202011032555-FORM FOR STARTUP [29-07-2020(online)].pdf | 2020-07-29 |
| 10 | 202011032555-STATEMENT OF UNDERTAKING (FORM 3) [29-07-2020(online)].pdf | 2020-07-29 |