Abstract: In the railway field, advanced technologies such as RFID and sensor networks are nowadays used to monitor the health of railway systems. In case of any malfunction, trouble symptoms are generated and transmitted to the monitoring systems and are also recorded in the fault verbatim (fault terms). These records are high-dimensional and imbalanced, so analyzing a fault takes time. To address this limitation, we perform two levels of feature extraction, syntax and semantic. Using these text mining techniques, we can extract the fault terms and display the fault class in a way that is easy to understand.
BACKGROUND OF INVENTION
Text mining is a knowledge-intensive task and is gaining more and more attention in
several industrial fields, for example, aerospace, automotive, railway, power, medical,
biomedicine, manufacturing, sales and marketing sectors. In the railway field, advanced
information technologies, such as sensor networks, RFID techniques, wireless
communication, and Internet cloud, are used to monitor the health of the railway
systems. In the event of malfunctioning, the diagnostic trouble symptoms are
generated and transmitted to the monitoring center database by wired/wireless
communications. After every diagnosis episode a repair verbatim is recorded, which
consists of a textual description of the mixture of fault symptom (i.e., fault terms), e.g.,
"Speed Distance Unit (SDU) relevant faults," a fault symptom associated with a
specific part, e.g., "SDU," failure modes (i.e., fault classes), and finally corrective
actions, e.g., "replaced SDU," taken to fix its faults. In the railway industry, millions of
such repair verbatims are generated every year. The data provides knowledge which
must be discovered for efficient fault diagnosis and handling of similar cases in
the future. From repair verbatim data, text mining techniques can be used to establish the
associations between fault terms and fault classes such that these associations can be
used to improve the precision of fault diagnosis. However, the task of automatic
discovery of knowledge from the repair verbatim is a non-trivial exercise mainly due
to the following reasons:
1) High-dimension data. In maintenance documents, there are tens of thousands or even hundreds of thousands of distinct terms or tokens. After elimination of stop words and stemming, the set of features is still too large for many learning algorithms.
2) Imbalanced fault class distribution. In maintenance documents, the number of examples in one fault class (i.e., majority class) is significantly greater than that of the others (i.e., minority classes). Such imbalanced class distributions have posed a serious difficulty to most classifier learning algorithms which assume a relatively balanced distribution.
3) Unsupervised text mining models. They may not produce topics that conform to the user's existing knowledge. One key reason is that the objective functions of topic models, e.g., Latent Dirichlet Allocation (LDA), often do not correlate well with human judgments.
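The first two difficulties can be made concrete on a toy corpus. The following is a minimal sketch only: the records, the fault-class labels "SDU" and "BTM", the stop-word list, and the helper names are all assumptions made for illustration, and stemming is omitted.

```python
from collections import Counter

# Toy repair verbatims with hypothetical fault-class labels (illustrative only).
records = [
    ("sdu relevant faults replaced sdu", "SDU"),
    ("sdu speed sensor fault replaced sensor", "SDU"),
    ("sdu cable loose tightened connector", "SDU"),
    ("sdu relevant faults reset unit", "SDU"),
    ("balise antenna fault replaced antenna", "BTM"),
]

STOP_WORDS = {"the", "a", "of", "and", "relevant"}  # assumed stop-word list


def vocabulary(docs):
    """Distinct tokens left after stop-word removal (no stemming here)."""
    return {tok for text, _ in docs for tok in text.split() if tok not in STOP_WORDS}


def imbalance_ratio(docs):
    """Size of the largest class divided by the size of the smallest class."""
    counts = Counter(label for _, label in docs)
    return max(counts.values()) / min(counts.values())


vocab = vocabulary(records)
ratio = imbalance_ratio(records)
print(len(vocab), ratio)
```

Even in this tiny corpus the vocabulary is larger than the number of records, and one class dominates the other; real maintenance data shows both effects at a far larger scale.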
EXISTING SYSTEM:
A vast amount of text data is recorded in the form of repair verbatim in maintenance sectors. Efficient text mining of such maintenance data plays an important role in detecting anomalies and improving fault diagnosis. However, unstructured verbatim, high-dimensional data, and imbalanced fault class distribution pose challenges for feature selections and fault diagnosis. In the event of malfunctioning, the diagnostic trouble symptoms are generated and transmitted to the monitoring center database. After every diagnosis episode a repair verbatim is recorded, which consists of a textual description of the mixture of fault symptoms.
DISADVANTAGES:
> High-dimension data.
> Imbalanced fault class distribution.
> Unsupervised text mining models.
Existing systems contain no mechanism for handling deri[...] uncertainty, and they cannot efficiently compute the probability wi[...] accuracy.
SUMMARY OF THE INVENTION:
We propose two levels of feature extraction-based text mining that integrates features extracted at both syntax and semantic levels with the aim to improve the fault classification performance. We first perform an improved χ² (chi-square) statistics-based feature selection at the syntax level to overcome the learning difficulty caused by an imbalanced data set. Then, we perform a prior latent Dirichlet allocation-based feature selection at the semantic level to reduce the data set into a low-dimensional topic space.
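For orientation, the classical chi-square score of a fault term with respect to a fault class can be sketched as follows. This is the standard textbook statistic, not the improved variant proposed here, and the function name and the example counts are assumptions made for illustration.

```python
def chi_square(n11, n10, n01, n00):
    """Classical chi-square score for one term and one fault class.

    n11: documents in the class that contain the term
    n10: documents outside the class that contain the term
    n01: documents in the class that do not contain the term
    n00: documents outside the class that do not contain the term
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n01) * (n10 + n00) * (n11 + n10) * (n01 + n00)
    if denom == 0:
        # A row or column of the contingency table is empty: no evidence.
        return 0.0
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


# Hypothetical counts: the term occurs in 8 of 10 class documents
# and in 2 of 90 documents of the other classes.
score = chi_square(8, 2, 2, 88)
independent = chi_square(1, 9, 9, 81)  # term spread evenly over the classes
print(score, independent)
```

A high score marks a term that is strongly associated with the class, while a term distributed evenly over the classes scores zero. With imbalanced data, terms of small classes have few documents to support them, which is the difficulty the improved statistic is meant to address.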
PROPOSED SYSTEM ADVANTAGES
- To reduce the data set into a low-dimensional topic space.
- To overcome the learning difficulty caused by an imbalanced data set.
BRIEF DESCRIPTION OF THE DRAWINGS
Fig 1 presents the proposed system architecture
Fig 2 shows the registration and login webpage
Fig 3 represents verbatim fault record maintenance
Fig 4 shows the user interface screen
BRIEF DESCRIPTION
This work proposes two levels of feature extraction-based text mining for fault diagnosis to meet the aforementioned challenges by analyzing the repair verbatim. Our main idea is to extract fault features at the syntax and semantic levels respectively and then fuse them to achieve the desired results. Because the features extracted at each level give a different emphasis to a particular aspect of the feature space and have their own deficiencies, the proposed feature fusion of the two levels may enhance the precision of fault diagnosis for all fault classes, especially minority ones.
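The fusion rule itself is not spelled out in this description. One simple possibility, shown here purely as an assumption, is to concatenate the syntax-level and semantic-level feature vectors of each verbatim before classification; the function name and the numeric values are hypothetical.

```python
def fuse_features(syntax_vec, semantic_vec):
    """Concatenate the two feature vectors of one verbatim (assumed fusion rule)."""
    return list(syntax_vec) + list(semantic_vec)


# Hypothetical values: weights of 3 selected fault terms, then 2 topic proportions.
syntax_features = [0.0, 1.7, 0.4]
semantic_features = [0.8, 0.2]
fused = fuse_features(syntax_features, semantic_features)
print(fused)
```

Under this assumption a classifier sees both views of a verbatim at once, so a minority class that is weakly expressed at one level can still be separated by the other.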
User Interface Design:
To log in to the website, users must first register their details with the server. Only then can a user log in. All user details are stored and maintained in the database.
Railway Records Maintenance:
This module extracts the railway maintenance records from 2008 to 2015. Bar charts show how the railway performed during those years, so that corrections can be made for the following years.
Generate Fault Verbatim Records:
This module generates the fault verbatim records once the train starts from the specified station. A fault verbatim record is generated if, for example, the train exceeds the speed limit or does not send its signal at the expected time.
Generate Signal Code:
When a train starts from a particular location, a signal code is generated for that train, so that the operator can identify that the train has started. If generation of the signal code is delayed, a message is sent to the fault verbatim record.
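The rules that trigger a fault verbatim record in the two modules above could be sketched as follows. This is only an illustration: the speed limit, the delay tolerance, the `TrainStatus` type, and the message wording are hypothetical and are not taken from the described system.

```python
from dataclasses import dataclass
from typing import List, Optional

SPEED_LIMIT_KMH = 110.0     # assumed limit, not from the source
MAX_SIGNAL_DELAY_S = 60.0   # assumed tolerance, not from the source


@dataclass
class TrainStatus:
    train_id: str
    speed_kmh: float
    signal_delay_s: Optional[float]  # None means no signal code was received


def fault_verbatims(status: TrainStatus) -> List[str]:
    """Return the fault verbatim messages raised by one status report."""
    faults = []
    if status.speed_kmh > SPEED_LIMIT_KMH:
        faults.append(f"{status.train_id}: speed limit exceeded")
    if status.signal_delay_s is None:
        faults.append(f"{status.train_id}: signal code not generated")
    elif status.signal_delay_s > MAX_SIGNAL_DELAY_S:
        faults.append(f"{status.train_id}: signal code delayed")
    return faults


late_and_fast = fault_verbatims(TrainStatus("T1", 120.0, 90.0))
normal = fault_verbatims(TrainStatus("T2", 80.0, 10.0))
print(late_and_fast, normal)
```

Each returned message would be appended to the fault verbatim record, where the text mining stages later pick it up.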
Generate Destination Details:
This module generates the destination details when the train reaches its destination: date, time, km, station code, and arrival time. All the details are stored in the database.
Evaluating Halt Time and Distance:
This module records the distance to each station and the halt time at each station, so that faults can be identified and rectified.
SYSTEM TECHNIQUES:
> ICHI-BASED FEATURE SELECTION AT SYNTAX LEVEL
> PLDA-BASED FEATURE SELECTION AT SEMANTIC LEVEL
SYNTAX LEVEL:
The basic idea of the proposed ICHI is to move a minority class far away from the majority one by adjusting the weights of fault terms. To facilitate understanding, we first define some notation. Tm is the set of fault terms of minority fault classes, TM the set of fault terms of the majority fault class, and Tc, the intersection of Tm and TM, the common feature set.
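The three term sets can be computed directly from labelled verbatims. The sketch below assumes that Tc is the intersection of Tm and TM and uses hypothetical records and class labels; it only builds the sets and does not perform the weight adjustment itself.

```python
def term_sets(docs, majority_class):
    """Split fault terms into minority (Tm), majority (TM) and common (Tc) sets."""
    t_minority, t_majority = set(), set()
    for text, label in docs:
        target = t_majority if label == majority_class else t_minority
        target.update(text.split())
    return t_minority, t_majority, t_minority & t_majority


# Hypothetical labelled verbatims; "SDU" is taken as the majority class.
docs = [
    ("sdu fault replaced sdu", "SDU"),
    ("sdu sensor fault", "SDU"),
    ("antenna fault replaced antenna", "BTM"),
]
t_m, t_M, t_c = term_sets(docs, majority_class="SDU")
print(sorted(t_m), sorted(t_M), sorted(t_c))
```

Terms in Tc, such as "fault" and "replaced" in this toy data, occur in both kinds of class and therefore separate them poorly, which is why term weights need adjusting.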
SEMANTIC LEVEL :
We perform a prior latent Dirichlet allocation-based feature selection at the semantic level to reduce the data set into a low-dimensional topic space. Given D documents expressed over W unique words and T topics, LDA outputs the document-topic distribution and the topic-word distribution.
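To show what the low-dimensional topic space looks like, the following is a minimal sketch of plain LDA fitted by collapsed Gibbs sampling. It is not the prior-LDA variant used here; the function name, the hyperparameters alpha and beta, the iteration count, and the toy corpus are all assumptions made for illustration.

```python
import random


def lda_gibbs(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Plain LDA by collapsed Gibbs sampling.

    docs is a list of token lists. Returns (theta, phi, vocab), where theta is
    the D x T document-topic matrix and phi the T x W topic-word matrix.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    D, W, T = len(docs), len(vocab), num_topics

    n_dt = [[0] * T for _ in range(D)]  # tokens of document d assigned to topic t
    n_tw = [[0] * W for _ in range(T)]  # tokens of word w assigned to topic t
    n_t = [0] * T                       # all tokens assigned to topic t
    z = []                              # topic assignment of every token

    # Random initial assignment of a topic to each token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            t = rng.randrange(T)
            z_d.append(t)
            n_dt[d][t] += 1
            n_tw[t][word_id[w]] += 1
            n_t[t] += 1
        z.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wi, t_old = word_id[w], z[d][i]
                # Remove the token from the counts before resampling its topic.
                n_dt[d][t_old] -= 1
                n_tw[t_old][wi] -= 1
                n_t[t_old] -= 1
                weights = [
                    (n_dt[d][t] + alpha) * (n_tw[t][wi] + beta) / (n_t[t] + W * beta)
                    for t in range(T)
                ]
                t_new = rng.choices(range(T), weights=weights)[0]
                z[d][i] = t_new
                n_dt[d][t_new] += 1
                n_tw[t_new][wi] += 1
                n_t[t_new] += 1

    theta = [[(n_dt[d][t] + alpha) / (len(docs[d]) + T * alpha) for t in range(T)]
             for d in range(D)]
    phi = [[(n_tw[t][w] + beta) / (n_t[t] + W * beta) for w in range(W)]
           for t in range(T)]
    return theta, phi, vocab


# Toy corpus of tokenized verbatims (hypothetical).
corpus = [
    "sdu speed sensor fault".split(),
    "sdu sensor replaced".split(),
    "antenna balise fault".split(),
    "balise antenna replaced".split(),
]
theta, phi, vocab = lda_gibbs(corpus, num_topics=2)
```

Each row of theta describes one verbatim by T topic proportions instead of W term counts, which is the dimensionality reduction the semantic level relies on.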
5. CLAIMS:
We claim that the two-level feature extraction-based text mining for fault diagnosis is useful in the prevention of accidents in railways. In the railway industry, millions of repair verbatims are generated every year. The verbatim data can be used to establish the associations between fault terms and fault classes, such that these associations can be used to improve the precision of fault diagnosis.
| # | Name | Date |
|---|---|---|
| 1 | Form 3_As Filed_19-05-2017.pdf | 2017-05-19 |
| 2 | Form 2(Title Page)_Complete_19-05-2017.pdf | 2017-05-19 |
| 3 | Form 1_As Filed_19-05-2017.pdf | 2017-05-19 |
| 4 | Drawing_As Filed_19-05-2017.pdf | 2017-05-19 |