
A System And A Method For Categorizing Data

Abstract: The present invention discloses a system and a method for categorizing data based on one or more particular attributes. The method comprises receiving a data set comprising at least one of a portion of data to be categorized; determining that the received data set has a pattern list; in response to determination of the pattern list, updating the pattern list; processing the received data set based on the updated pattern list; and standardizing the processed data based on a set of identified attributes. FIGURE 2


Patent Information

Application #
Filing Date
01 August 2019
Publication Number
21/2022
Publication Type
INA
Invention Field
MECHANICAL ENGINEERING
Status
Email
archana@anandandanand.com
Parent Application

Applicants

Equifax Credit Information Services Private Limited
Unit No. 931, 3rd Floor, Building No.9, Solitaire Corporate Park, Andheri Ghatkopar Link Road, Andheri East, Mumbai – 400093, Maharashtra, India

Inventors

1. Shruti Joshi
601, A, Anand Amrut, 28 Tejpal road, Vile parle east - Mumbai, 400057, Maharashtra, India

Specification

FORM-2
THE PATENTS ACT, 1970
(39 OF 1970)
AND
THE PATENTS RULES, 2003
(As Amended)
COMPLETE SPECIFICATION (See section 10; rule 13)
"A SYSTEM AND A METHOD FOR CATEGORIZING DATA"
Equifax Credit Information Services Private Limited, a corporation organized and existing under the laws of India, of Unit No. 931, 3rd Floor, Building No.9, Solitaire Corporate Park, Andheri Ghatkopar Link Road, Andheri East, Mumbai – 400093, Maharashtra, India.
The following specification particularly describes the invention and the manner in which it is to be performed:

A SYSTEM AND A METHOD FOR CATEGORIZING DATA
TECHNICAL FIELD
The present invention relates to the field of data processing, and more specifically to categorizing data based on one or more particular attributes.
BACKGROUND
In an establishment such as a financial institution, an educational institution, or a hotel, various people are associated with the establishment through professional engagement. The establishment has to maintain a record of these people along with their demographic details.
However, if at a particular instance the details of people associated with the establishment have to be retrieved based on a particular category, such as locality, it would be difficult to ascertain the number of people within the establishment who fall in that category. Therefore, there is a need for a platform with which the details of a set of people can be categorized and retrieved when required.
SUMMARY OF THE INVENTION
The following presents a simplified summary of the subject matter in order to provide a basic understanding of some aspects of subject matter embodiments. This summary is not an extensive overview of the subject matter. It is not intended to identify key/critical elements of the embodiments or to delineate the scope of the subject matter.
Its sole purpose is to present some concepts of the subject matter in a simplified form as a prelude to the more detailed description that is presented later.

According to an embodiment of the present invention, a method for implementing the present invention is discussed, the method comprising: receiving a data set comprising at least one of a portion of data to be categorized; determining that the received data set has a pattern list; in response to determination of the pattern list, updating the pattern list; processing the received data set based on the updated pattern list; and standardizing the processed data based on a set of identified attributes.
According to an embodiment of the present invention, the method comprises identifying one or more attributes of the processed data set.
According to an embodiment of the present invention, the received data is processed on the basis of a pattern list comprising a user-defined PATTERN-ID.
According to an embodiment of the present invention, the pattern list is updated on the basis of the number of times the data is received by a data input device.
According to an embodiment of the present invention, a system for implementing the present invention is discussed, the system comprising: a database, and a processor, wherein the processor along with the database is configured to: receive a data set comprising at least one of a portion of data to be categorized; determine that the received data set has a pattern list; in response to determination of the pattern list, update the pattern list; process the received data set based on the updated pattern list; and standardize the processed data based on a set of identified attributes.
According to an embodiment of the present invention, the processor is configured to identify one or more attributes of the processed data set.
According to an embodiment of the present invention, the received data is processed on the basis of a pattern list comprising a user-defined PATTERN-ID.
According to an embodiment of the present invention, the pattern list is updated on the basis of the number of times the data is received by a data input device.
These and other objects, embodiments and advantages of the present invention will become readily apparent to those skilled in the art from the following detailed description of the embodiments having reference to the attached figures, the invention not being limited to any particular embodiments disclosed.
BRIEF DESCRIPTION OF FIGURES
The foregoing and further objects, features and advantages of the present subject matter will become apparent from the following description of exemplary embodiments with reference to the accompanying drawings, wherein like numerals are used to represent like elements.
It is to be noted, however, that the appended drawings along with the reference numerals illustrate only typical embodiments of the present subject matter, and are therefore, not to be considered for limiting of its scope, for the subject matter may admit to other equally effective embodiments.
FIGURE 1: illustrates a system in which the present invention is implemented according to a first embodiment.
FIGURE 2: illustrates a method by which the present invention is implemented according to an embodiment.

DETAILED DESCRIPTION
Exemplary embodiments now will be described with reference to the accompanying drawings. The disclosure may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey its scope to those skilled in the art. The terminology used in the detailed description of the particular exemplary embodiments illustrated in the accompanying drawings is not intended to be limiting. In the drawings, like numbers refer to like elements.
It is to be noted, however, that the reference numerals used herein illustrate only typical embodiments of the present subject matter, and are therefore, not to be considered for limiting of its scope, for the subject matter may admit to other equally effective embodiments.
The specification may refer to “an”, “one” or “some” embodiment(s) in several locations. This does not necessarily imply that each such reference is to the same embodiment(s), or that the feature only applies to a single embodiment. Single features of different embodiments may also be combined to provide other embodiments.
As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless expressly stated otherwise. It will be further understood that the terms “includes”, “comprises”, “including” and/or “comprising” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being “connected” or “coupled” to another element, it can be directly connected or coupled to the other element or intervening elements may be present. Furthermore, “connected” or “coupled” as used herein may include operatively connected or coupled. As used herein, the term “and/or” includes any and all combinations and arrangements of one or more of the associated listed items.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure pertains. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
The figures depict a simplified structure only showing some elements and functional entities, all being logical units whose implementation may differ from what is shown. The connections shown are logical connections; the actual physical connections may be different. It is apparent to a person skilled in the art that the structure may also comprise other functions and structures.
Also, all logical units described and depicted in the figures include the software and/or hardware components required for the unit to function. Further, each unit may comprise within itself one or more components which are implicitly understood. These components may be operatively coupled to each other and be configured to communicate with each other to perform the function of the said unit.
Figure 1 illustrates a system 100 in which the present invention is implemented. The system 100 comprises a data input device 110 and a processor 120 communicatively coupled with a database 140, a server 150, and a network 130 that is connected to the server 150. The data input device 110 comprises a plurality of input devices to receive data of a particular category. Such a category may be demographic details of an establishment and/or a person. The processor 120 is coupled to the network 130 to process/clean the received data on the basis of a pattern list comprising a user-defined PATTERN-ID. The pattern list is updated on the basis of the number of times the data is received by the data input device 110. The processor 120 continues to process/clean the received data until the average number of words per received data item is four or fewer. The processor 120 is coupled to the database 140. After processing/cleaning the received data, the processor 120 identifies one or more attributes as pre-stored in the database 140. The one or more attributes may be at least one of, but not limited to, location, name of street, name of society, and pin code. The processor 120, based on the identification of the one or more attributes, standardizes the processed/cleaned data to the equivalent of such one or more attributes.
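The stopping condition described above (cleaning continues until the average number of words per received item is four or fewer) can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the names `average_word_count` and `clean_until_short`, and the idea of supplying the cleaning steps as a list of functions called `passes`, are hypothetical.

```python
from typing import Callable, Iterable, List

MAX_AVG_WORDS = 4  # threshold stated in the specification


def average_word_count(records: Iterable[str]) -> float:
    """Return the mean number of whitespace-separated words per record."""
    records = list(records)
    if not records:
        return 0.0
    return sum(len(r.split()) for r in records) / len(records)


def clean_until_short(
    records: List[str],
    passes: List[Callable[[str], str]],
) -> List[str]:
    """Apply cleaning passes in order, stopping once the average is <= 4.

    `passes` is a hypothetical list of cleaning functions. The loop also
    stops when the passes run out, so it always terminates.
    """
    for clean in passes:
        if average_word_count(records) <= MAX_AVG_WORDS:
            break
        records = [clean(r) for r in records]
    return records
```

Checking the average before each pass means that data which is already short enough is returned unchanged.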
It should be understood that the database 140 and the processor 120 may be physically separated from each other or may be embedded together when in operation. The system 100 may be in a network environment (not shown) along with all its other components, such as the data input device 110, the database 140 and the processor 120. Alternatively, the components of the system 100, such as the data input device 110, the database 140 and the processor 120, may communicate with one another via separate network interfaces (not shown). The network interfaces may be network interface cards, switches or routers, Fibre Channel transceivers, InfiniBand-enabled devices, or other devices programmed to transmit and receive messages according to standardized data network protocols. These protocols may include Ethernet for a media access control (MAC) layer, Internet Protocol (IP) for a network layer, and/or User Datagram Protocol (UDP) or Transmission Control Protocol (TCP) for a transport layer.
The database 140 may have at least a pattern list, which comprises PATTERN_IDs, and one or more attributes as defined above.
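One way to picture such a pattern list is as a mapping from each PATTERN_ID to the set of words it covers. The sketch below is hypothetical: the class name `PatternList`, its methods, and the `min_count` parameter are not from the specification. In particular, promoting words by how often they have been seen is only one possible reading of "updated on the basis of number of times of receiving the data".

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set


@dataclass
class PatternList:
    """Maps each PATTERN_ID to the set of words it covers."""

    patterns: Dict[str, Set[str]] = field(default_factory=dict)
    seen: Counter = field(default_factory=Counter)

    def add(self, pattern_id: str, words: Iterable[str]) -> None:
        """Add words (upper-cased) under the given PATTERN_ID."""
        self.patterns.setdefault(pattern_id, set()).update(w.upper() for w in words)

    def observe(self, records: Iterable[str], pattern_id: str, min_count: int) -> None:
        """Count words in newly received records and promote frequent ones.

        Assumption: a word seen at least `min_count` times across all
        received records is added under `pattern_id`.
        """
        for record in records:
            self.seen.update(record.upper().split())
        frequent = [w for w, n in self.seen.items() if n >= min_count]
        self.add(pattern_id, frequent)
```

For example, `PatternList().add("DIRECTION", ["OLD", "NEW", "RING"])` would register three of the direction words that appear in the specification's example pattern list.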

Referring now to Figure 2, an example methodology 200 for categorizing data is disclosed. A set of data can be received at 210 by the data input device 110. A pattern list comprising a user-defined PATTERN_ID is set at 220. The pattern list is updated at 230 on the basis of the number of times the data is received at 210. For example, the pattern list may be updated on the basis of parameters such as removing C/O, S/O, D/O, digits, punctuation except ‘/’, and company names from addresses present in the received set of data. The pattern list with different PATTERN_IDs may be, but is not limited to, as defined below in Table 1.

Table 1

COMPANY_IDENTIFIERS | NAME OF COMPANIES | DIRECTION | INSTITUTES
CO                  | ACC               | OLD       | MASJID
ELECTRICALS         | ACTION            | SOTHERN   | JAIL
SOLUTIONS           | ADANI             | CENTERAL  | UNIVERSITY
ENTERPRISES         | AMBUJA            | RING      | AIRPORT
PRODUCTS            | AMRUTANJAN        | NEW       | RESTAURANT
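The removal parameters mentioned above (C/O, S/O, D/O, digits, punctuation other than '/', and company names) can be sketched as a single function. This is an illustration under stated assumptions: the function name `strip_noise` and the `company_words` parameter are hypothetical, and the sketch keeps only ASCII letters, which the specification does not state.

```python
import re
from typing import Iterable

RELATION_PREFIXES = {"C/O", "S/O", "D/O"}  # prefixes named in the text


def strip_noise(address: str, company_words: Iterable[str]) -> str:
    """Remove relation prefixes, digits, punctuation other than '/',
    and any word found in `company_words` (a hypothetical word list)."""
    blocked = {w.upper() for w in company_words} | RELATION_PREFIXES
    # Replace every character that is not an ASCII letter, '/', or
    # whitespace with a space (assumption: ASCII-only addresses).
    text = re.sub(r"[^A-Za-z/\s]", " ", address.upper())
    words = [w for w in text.split() if w not in blocked]
    return " ".join(words)
```

With the company words `["ACC", "ELECTRICALS"]`, the input `"C/O ACC ELECTRICALS, 27 NIRMAL NAGAR"` reduces to `"NIRMAL NAGAR"`.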
Now, the received data is processed at 240 by the processor 120 to clean it on the basis of the updated pattern list. The received data is processed at 240 to reduce its total word count. Processing at 240 continues until the average number of words per received data item is four or fewer. This may be performed on the basis of the following parameters:
a. Remove the first word where the first word is a SELECTED keyword,
b. Remove all words before a noise word where it appears before a society identifier,
c. Remove single-word patterns, extra white-space, and words having only one letter,
d. Remove multiword patterns,
e. Remove duplicate words from the society name.
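Rules a to e above can be sketched as one cleaning pass. The sketch is hypothetical throughout: the function name `clean_address` and all five keyword sets are illustrative parameters, since the specification does not list the SELECTED keywords, noise words, or society identifiers. Rule b is read here as "keep the text from the last noise word that precedes the first society identifier".

```python
import re
from typing import Iterable, List, Set


def clean_address(
    address: str,
    *,
    selected_keywords: Set[str],
    noise_words: Set[str],
    society_identifiers: Set[str],
    single_word_patterns: Set[str],
    multiword_patterns: Iterable[str],
) -> str:
    """Apply rules (a) to (e) once, in the order listed in the text."""
    words: List[str] = address.upper().split()  # split() also drops extra white-space

    # (a) Drop the first word when it is a selected keyword.
    if words and words[0] in selected_keywords:
        words = words[1:]

    # (b) If a noise word occurs before the first society identifier,
    #     drop every word that precedes the last such noise word.
    ident = next((i for i, w in enumerate(words) if w in society_identifiers), None)
    if ident is not None:
        noise = [i for i in range(ident) if words[i] in noise_words]
        if noise:
            words = words[noise[-1]:]

    # (c) Drop single-word patterns and one-letter words.
    words = [w for w in words if len(w) > 1 and w not in single_word_patterns]

    # (d) Drop multiword patterns, matching whole words only.
    text = " ".join(words)
    for phrase in multiword_patterns:
        pattern = r"(?<!\S)" + re.escape(phrase.upper()) + r"(?!\S)"
        text = re.sub(pattern, " ", text)

    # (e) Drop duplicate words, keeping the first occurrence.
    unique: List[str] = []
    for w in text.split():
        if w not in unique:
            unique.append(w)
    return " ".join(unique)
```

Because rule c runs before rule d, a word listed as a single-word pattern is removed before any multiword pattern containing it can match, so the two lists should not overlap.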

Based on the processed/cleaned data at 240, one or more attributes are identified at 250. The one or more attributes may be at least one of, but not limited to, location, name of street, name of society, and pin code.
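Attribute identification could look like the following sketch, which picks out a society name and a pin code. Everything here is an assumption for illustration: the `SOCIETY_IDENTIFIERS` set, the function name `identify_attributes`, the rule that the society name is the single word before an identifier, and the choice to read the pin code from the raw address (since digits may already have been removed from the cleaned text). Indian PIN codes have six digits.

```python
import re
from typing import Dict, Optional, Set

# Illustrative identifiers only; the specification does not list them.
SOCIETY_IDENTIFIERS: Set[str] = {
    "APT", "APARTMENTS", "SOC", "SOCIETY", "RESIDENCY", "TOWER",
}
PIN_CODE = re.compile(r"\b\d{6}\b")  # six-digit Indian PIN code


def identify_attributes(cleaned: str, raw: str = "") -> Dict[str, Optional[str]]:
    """Return the society name and pin code found in an address.

    `cleaned` is the processed address; `raw` is the original text,
    used only to look for a pin code.
    """
    words = cleaned.upper().split()
    society = None
    for i, w in enumerate(words):
        # Assumption: the society name is the word just before the identifier.
        if w in SOCIETY_IDENTIFIERS and i > 0:
            society = f"{words[i - 1]} {w}"
            break
    pin = PIN_CODE.search(raw)
    return {"society": society, "pin_code": pin.group() if pin else None}
```

Either value is `None` when the corresponding attribute is not found, so callers can tell a missing attribute from an empty one.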
Based on the identification of the one or more attributes at 250, the processed/cleaned data is standardized at 260 to reduce it to the equivalent of the one or more attributes. The standardization at 260 may be, but is not limited to, as defined in Table 2.

Table 2

ADDRESS | EXTRACTED SOCIETY NAME
ROOM NO 10 PLOT NO 2008 SWASADAN APT NEAR SCHOOL CHARKOP SECTO | SWASADAN CHS
W/O RamrajanJ N dahiya C-2161 SEEMANT APERTMENTS DELHI | SEEMANT APARTMENTS
S 14 157 SHANI APPARTMENT SHASHTRINAGAR SHOPING CENTER SAME AHM | SHANI APARTMENTS
27 NIRMAL NAGAR SOC NR CHIKUWADI NANA VARACHA SURAY SURAT | NIRMAL SOCIETY
Z-159 AASTHA RESIDENCY 150 FT RING ROAD RAJKOT | AASTHA RESIDENCY
A/501 PARISHRAM TOWER LINK CROSS ROAD OPP PAVAN PARTY PLOT | PARISHRAM TOWER
Further, the standardized data may be stored in the database 140 or the server 150.
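The standardization step can be pictured as mapping spelling variants of a society identifier to one canonical form, as in the SEEMANT and SHANI rows of Table 2. The sketch below is an assumption, not the disclosed method: the `CANONICAL_IDENTIFIER` mapping and the function name `standardize` are hypothetical. Other rows of Table 2 (for example, SWASADAN APT becoming SWASADAN CHS, or NIRMAL NAGAR SOC becoming NIRMAL SOCIETY) imply further rules that the text does not spell out and that this sketch does not attempt to reproduce.

```python
from typing import Dict

# Hypothetical variant-to-canonical mapping, inferred from two rows of Table 2.
CANONICAL_IDENTIFIER: Dict[str, str] = {
    "APERTMENTS": "APARTMENTS",
    "APPARTMENT": "APARTMENTS",
    "APARTMENT": "APARTMENTS",
    "SOC": "SOCIETY",
}


def standardize(society: str) -> str:
    """Replace known identifier variants with their canonical spelling."""
    words = society.upper().split()
    return " ".join(CANONICAL_IDENTIFIER.get(w, w) for w in words)
```

Words that are not in the mapping pass through unchanged, so an already standard name such as AASTHA RESIDENCY is returned as it is.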

As will be appreciated by one of skill in the art, the present invention may be embodied as a method and system. Accordingly, the present invention may take the form of an entirely hardware embodiment, a software embodiment or an embodiment combining software and hardware aspects.
In the drawings and specification, there have been disclosed exemplary embodiments of the invention. Although specific terms are employed, they are used in a generic and descriptive sense only and not for purposes of limitation of the scope of the invention.

We claim:
1. A method for categorizing data, the method comprising:
receiving a data set comprising at least one of a portion of data to be categorized;
determining that the received data set has a pattern list;
in response to determination of the pattern list, updating the pattern list;
processing the received data set based on the updated pattern list; and
standardizing the processed data based on a set of identified attributes.
2. The method as claimed in claim 1, comprising identifying one or more attributes of the processed data set.
3. The method as claimed in claim 1, wherein the received data is processed on the basis of a pattern list comprising a user-defined PATTERN-ID.
4. The method as claimed in claim 1, wherein the pattern list is updated on the basis of the number of times the data is received by a data input device.
5. A system for categorizing data, the system comprising:
a database, and
a processor, wherein the processor along with the database is configured to:
receive a data set comprising at least one of a portion of data to be categorized;
determine that the received data set has a pattern list;
in response to determination of the pattern list, update the pattern list;
process the received data set based on the updated pattern list; and
standardize the processed data based on a set of identified attributes.
6. The system as claimed in claim 5, wherein the processor is configured to identify one or more attributes of the processed data set.
7. The system as claimed in claim 5, wherein the received data is processed on the basis of a pattern list comprising a user-defined PATTERN-ID.
8. The system as claimed in claim 5, wherein the pattern list is updated on the basis of the number of times the data is received by a data input device.

Documents

Application Documents

# Name Date
1 201921031183-STATEMENT OF UNDERTAKING (FORM 3) [01-08-2019(online)].pdf 2019-08-01
2 201921031183-PROVISIONAL SPECIFICATION [01-08-2019(online)].pdf 2019-08-01
3 201921031183-FORM 1 [01-08-2019(online)].pdf 2019-08-01
4 201921031183-DRAWINGS [01-08-2019(online)].pdf 2019-08-01
5 201921031183-DRAWING [31-07-2020(online)].pdf 2020-07-31
6 201921031183-CORRESPONDENCE-OTHERS [31-07-2020(online)].pdf 2020-07-31
7 201921031183-COMPLETE SPECIFICATION [31-07-2020(online)].pdf 2020-07-31
8 Abstract1.jpg 2021-10-19
9 201921031183-FORM-26 [24-05-2022(online)].pdf 2022-05-24
10 201921031183-Proof of Right [13-07-2022(online)].pdf 2022-07-13
11 201921031183-ENDORSEMENT BY INVENTORS [13-07-2022(online)].pdf 2022-07-13
12 201921031183-ORIGINAL UR 6(1A) FORM 1-240822.pdf 2022-08-26
13 201921031183-FORM 18 [23-11-2022(online)].pdf 2022-11-23
14 201921031183-FER.pdf 2022-11-28
15 201921031183-PETITION UNDER RULE 137 [27-05-2023(online)].pdf 2023-05-27
16 201921031183-MARKED COPIES OF AMENDEMENTS [27-05-2023(online)].pdf 2023-05-27
17 201921031183-FORM 13 [27-05-2023(online)].pdf 2023-05-27
18 201921031183-FER_SER_REPLY [27-05-2023(online)].pdf 2023-05-27
19 201921031183-COMPLETE SPECIFICATION [27-05-2023(online)].pdf 2023-05-27
20 201921031183-CLAIMS [27-05-2023(online)].pdf 2023-05-27
21 201921031183-AMMENDED DOCUMENTS [27-05-2023(online)].pdf 2023-05-27
22 201921031183-Response to office action [05-05-2025(online)].pdf 2025-05-05

Search Strategy

1 201921031183E_24-11-2022.pdf