Method And System For Generating Named Entities

Abstract: This disclosure relates generally to natural language processing, and more particularly to a system and method for generating named entities. In one embodiment, a method is provided for generating named entities. The method includes extracting a plurality of named entities in a primary language from a plurality of digital content in the primary language, transliterating each of the plurality of named entities in the primary language to a set of possible named entities in a secondary language, determining a correct named entity in the secondary language from among the set of possible named entities in the secondary language, and generating a named entity in a subsequent secondary language corresponding to the correct named entity in the secondary language. It should be noted that the plurality of named entities in the primary language are named entities in the subsequent secondary language, and the subsequent secondary language is related to the secondary language. Figure 3

Patent Information

Application #:
Filing Date: 18 May 2017
Publication Number: 47/2018
Publication Type: INA
Invention Field: COMPUTER SCIENCE
Status:
Email: bangalore@knspartners.com
Parent Application:
Patent Number:
Legal Status:
Grant Date: 2023-12-22
Renewal Date:

Applicants

WIPRO LIMITED
Doddakannelli, Sarjapur Road, Bangalore 560035, Karnataka, India.

Inventors

1. BALAJI JAGAN
32 Muthamil Nagar, Chinnalapatti, Dindigul District, Tamil Nadu – 624301, India
2. NAVEEN KUMAR NANJAPPA
59, Lakshmi Nilaya, Sonnappa Layout, Virupakshapura, Kodigehalli, Bengaluru-560097, Karnataka, India.

Specification

Claims: WE CLAIM
1. A method of generating named entities, the method comprising:
extracting, by a named entity generation engine, a plurality of named entities in a primary language from a plurality of digital content in the primary language;
transliterating, by the named entity generation engine, each of the plurality of named entities in the primary language to a set of possible named entities in a secondary language;
determining, by the named entity generation engine, a correct named entity in the secondary language from among the set of possible named entities in the secondary language; and
generating, by the named entity generation engine, a named entity in a subsequent secondary language corresponding to the correct named entity in the secondary language, wherein the plurality of named entities in the primary language are named entities in the subsequent secondary language, and wherein the subsequent secondary language is related to the secondary language.

2. The method of claim 1, wherein the plurality of named entities in the primary language is extracted using a named entity extraction model.

3. The method of claim 2, wherein the named entity extraction model is trained by manually tagging an initial plurality of named entities in the primary language in an initial plurality of digital content in the primary language.

4. The method of claim 3, wherein the initial plurality of digital content in the primary language are curated from across a plurality of genres, and are relevant to a population conversant with the subsequent secondary language.

5. The method of claim 1, wherein each of the plurality of named entities comprises at least one of a person, a place, and an organization.

6. The method of claim 1, wherein each of the plurality of named entities in the primary language is transliterated to the set of possible named entities in the secondary language using a plurality of predefined transliteration frameworks, and wherein each of the plurality of predefined transliteration frameworks comprises a plurality of pre-defined mapping tables in Unicode values.

7. The method of claim 6, wherein each of the plurality of predefined transliteration frameworks comprises at least one of a Harvard-Kyoto transliteration framework, an ISO 15919 transliteration framework, and a simplified customized standard transliteration framework, and wherein each of the plurality of pre-defined mapping tables comprises at least one of a vowels mapping table, a consonants mapping table, and a matras mapping table.

8. The method of claim 1, wherein transliteration further comprises:
retrieving a sequence of symbols for the primary language for each of the plurality of named entities in the primary language;
retrieving a sequence of symbols for the secondary language corresponding to the sequence of symbols for the primary language; and
combining the sequence of symbols in the secondary language to generate the set of possible named entities in the secondary language.

9. The method of claim 1, wherein the correct named entity in the secondary language is determined from among the set of possible named entities in the secondary language using a long short-term memory (LSTM) model based on a confidence score.

10. The method of claim 9, wherein the LSTM model is trained with a large dataset comprising a plurality of named entities of the secondary language and a plurality of corresponding transliterated named entities in the primary language.

11. The method of claim 1, wherein the named entity in the subsequent secondary language corresponding to the correct named entity in the secondary language is generated using a multi-lingual character level tree model.

12. The method of claim 1, wherein the primary language is English, wherein the secondary language is Hindi, and the subsequent secondary language is one of a plurality of Indian languages.

13. A system for generating named entities, the system comprising:
at least one processor; and
a computer-readable medium storing instructions that, when executed by the at least one processor, cause the at least one processor to perform operations comprising:
extracting a plurality of named entities in a primary language from a plurality of digital content in the primary language;
transliterating each of the plurality of named entities in the primary language to a set of possible named entities in a secondary language;
determining a correct named entity in the secondary language from among the set of possible named entities in the secondary language; and
generating a named entity in a subsequent secondary language corresponding to the correct named entity in the secondary language, wherein the plurality of named entities in the primary language are named entities in the subsequent secondary language, and wherein the subsequent secondary language is related to the secondary language.

14. The system of claim 13, wherein the plurality of named entities in the primary language is extracted using a named entity extraction model.

15. The system of claim 14, wherein the named entity extraction model is trained by manually tagging an initial plurality of named entities in the primary language in an initial plurality of digital content in the primary language, and wherein the initial plurality of digital content in the primary language are curated from across a plurality of genres and are relevant to a population conversant with the subsequent secondary language.

16. The system of claim 13, wherein each of the plurality of named entities in the primary language is transliterated to the set of possible named entities in the secondary language using a plurality of predefined transliteration frameworks, and wherein each of the plurality of predefined transliteration frameworks comprises a plurality of pre-defined mapping tables in Unicode values.

17. The system of claim 16, wherein each of the plurality of predefined transliteration frameworks comprises at least one of a Harvard-Kyoto transliteration framework, an ISO 15919 transliteration framework, and a simplified customized standard transliteration framework, and wherein each of the plurality of pre-defined mapping tables comprises at least one of a vowels mapping table, a consonants mapping table, and a matras mapping table.

18. The system of claim 13, wherein transliteration further comprises:
retrieving a sequence of symbols for the primary language for each of the plurality of named entities in the primary language;
retrieving a sequence of symbols for the secondary language corresponding to the sequence of symbols for the primary language; and
combining the sequence of symbols in the secondary language to generate the set of possible named entities in the secondary language.

19. The system of claim 13, wherein the correct named entity in the secondary language is determined from among the set of possible named entities in the secondary language using a long short-term memory (LSTM) model based on a confidence score, and wherein the LSTM model is trained with a large dataset comprising a plurality of named entities of the secondary language and a plurality of corresponding transliterated named entities in the primary language.

20. The system of claim 13, wherein the named entity in the subsequent secondary language corresponding to the correct named entity in the secondary language is generated using a multi-lingual character level tree model.

Dated this 18th day of March 2017

SWETHA SN
OF K&S PARTNERS
AGENT FOR THE APPLICANT
IN/PA-2123
Description: TECHNICAL FIELD
This disclosure relates generally to natural language processing, and more particularly to a method and system for generating named entities.
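
For illustration only, the transliteration and selection steps recited in the claims (symbol lookup in vowel, consonant and matra mapping tables, combination into a candidate set, selection by confidence score, and generation in a related language) might be sketched in Python as follows. Everything in this sketch is an assumption and not the applicant's implementation: the mapping tables are tiny hypothetical samples, the confidence function is a toy lexicon lookup standing in for the claimed LSTM model, and the last step uses the parallel layout of the Unicode Devanagari and Gujarati blocks in place of the claimed multi-lingual character level tree model.

```python
from itertools import product

VIRAMA = "\u094D"  # Devanagari virama: suppresses a consonant's inherent vowel

# Hypothetical, heavily truncated mapping tables (Latin symbol -> Devanagari
# candidates, written as Unicode escapes). Real tables would be far larger.
VOWELS = {"a": ["\u0905", "\u0906"], "i": ["\u0907", "\u0908"]}   # independent vowels
MATRAS = {"a": ["", "\u093E"], "i": ["\u093F", "\u0940"]}         # dependent vowel signs
CONSONANTS = {"m": ["\u092E"], "n": ["\u0928", "\u0923"], "r": ["\u0930"]}


def symbol_options(name):
    """For each Latin symbol, list the Devanagari strings it may map to."""
    name = name.lower()
    options = []
    after_consonant = False
    for i, ch in enumerate(name):
        if ch in CONSONANTS:
            nxt = name[i + 1] if i + 1 < len(name) else ""
            # A consonant directly followed by another consonant takes a virama.
            suffix = VIRAMA if nxt in CONSONANTS else ""
            options.append([c + suffix for c in CONSONANTS[ch]])
            after_consonant = True
        elif ch in VOWELS:
            # After a consonant use the matra table, otherwise the vowel table.
            options.append(MATRAS[ch] if after_consonant else VOWELS[ch])
            after_consonant = False
        else:
            raise ValueError(f"no mapping for symbol {ch!r}")
    return options


def candidates(name):
    """Combine the per-symbol options into the set of possible named entities."""
    return ["".join(parts) for parts in product(*symbol_options(name))]


# Toy stand-in for a trained model's confidence score (assumed values).
TOY_LEXICON = {"\u0930\u093E\u092E": 0.9, "\u0930\u092E\u093E": 0.6}


def toy_confidence(candidate):
    return TOY_LEXICON.get(candidate, 0.0)


def pick_best(cands, score):
    """Return the candidate with the highest confidence score."""
    return max(cands, key=score)


def to_related_script(text, offset=0x0A80 - 0x0900):
    """Shift Devanagari code points into a parallel Indic block (default: Gujarati)."""
    return "".join(
        chr(ord(c) + offset) if 0x0900 <= ord(c) <= 0x097F else c for c in text
    )


cands = candidates("rama")
best = pick_best(cands, toy_confidence)
print(cands, best, to_related_script(best))
```

The code point shift works only for characters that have a counterpart at the same position in the target block, so a practical system would need per-language exception handling.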

Documents

Application Documents

# Name Date
1 Power of Attorney [18-05-2017(online)].pdf 2017-05-18
2 Form 5 [18-05-2017(online)].pdf 2017-05-18
3 Form 3 [18-05-2017(online)].pdf 2017-05-18
4 Form 18 [18-05-2017(online)].pdf_168.pdf 2017-05-18
5 Form 18 [18-05-2017(online)].pdf 2017-05-18
6 Form 1 [18-05-2017(online)].pdf 2017-05-18
7 Drawing [18-05-2017(online)].pdf 2017-05-18
8 Description(Complete) [18-05-2017(online)].pdf_169.pdf 2017-05-18
9 Description(Complete) [18-05-2017(online)].pdf 2017-05-18
10 REQUEST FOR CERTIFIED COPY [19-05-2017(online)].pdf 2017-05-19
11 PROOF OF RIGHT [13-07-2017(online)].pdf 2017-07-13
12 Correspondence by Agent_Form 1_17-07-2017.pdf 2017-07-17
13 abstract 201741017539.jpg 2017-07-20
14 201741017539-Proof of Right (MANDATORY) [15-09-2017(online)].pdf 2017-09-15
15 Correspondence by Agent_Form30,Form1_19-09-2017.pdf 2017-09-19
16 201741017539-PETITION UNDER RULE 137 [03-05-2021(online)].pdf 2021-05-03
17 201741017539-OTHERS [03-05-2021(online)].pdf 2021-05-03
18 201741017539-FORM 3 [03-05-2021(online)].pdf 2021-05-03
19 201741017539-FER_SER_REPLY [03-05-2021(online)].pdf 2021-05-03
20 201741017539-DRAWING [03-05-2021(online)].pdf 2021-05-03
21 201741017539-COMPLETE SPECIFICATION [03-05-2021(online)].pdf 2021-05-03
22 201741017539-CLAIMS [03-05-2021(online)].pdf 2021-05-03
23 201741017539-FER.pdf 2021-10-17
24 201741017539-PatentCertificate22-12-2023.pdf 2023-12-22
25 201741017539-IntimationOfGrant22-12-2023.pdf 2023-12-22
26 201741017539-PROOF OF ALTERATION [18-03-2024(online)].pdf 2024-03-18

Search Strategy

1 search_strategyE_25-11-2020.pdf

ERegister / Renewals

3rd: 18 Mar 2024

From 18/05/2019 - To 18/05/2020

4th: 18 Mar 2024

From 18/05/2020 - To 18/05/2021

5th: 18 Mar 2024

From 18/05/2021 - To 18/05/2022

6th: 18 Mar 2024

From 18/05/2022 - To 18/05/2023

7th: 18 Mar 2024

From 18/05/2023 - To 18/05/2024

8th: 18 Mar 2024

From 18/05/2024 - To 18/05/2025

9th: 06 May 2025

From 18/05/2025 - To 18/05/2026