Abstract: AN ENHANCED STOCK PRICE PREDICTION SYSTEM USING MACHINE LEARNING AND SENTIMENT FUSION. A system and method for enhanced stock price prediction using machine learning and sentiment fusion are disclosed. Historical stock price data and daily financial news articles are collected, and a sentiment analysis module employing a pre-trained natural language processing model generates daily sentiment scores from the news. A time-series forecasting engine processes historical stock data using machine learning models. A hybrid integration layer normalises and combines sentiment scores with price features to create a composite dataset. A hybrid forecasting model, such as an RNN-LSTM network, receives the composite dataset and predicts future stock prices. Predicted prices, evaluation metrics and visualisations are displayed to the user. By integrating structured historical data with unstructured sentiment data in a single pipeline, the invention delivers improved accuracy and responsiveness compared to conventional forecasting systems, enabling investors and analysts to make more informed, timely investment decisions.
Description: FIELD OF THE INVENTION
This invention relates to financial technology and predictive analytics. More specifically, it concerns a system and method for enhanced stock price prediction using machine learning and sentiment fusion, combining time-series forecasting with real-time sentiment analysis of financial news to improve the accuracy and responsiveness of investment decision support systems.
BACKGROUND OF THE INVENTION
Forecasting stock market prices accurately remains a significant challenge because the market is dynamic, non-linear, and driven by investor sentiment and news. Traditional time-series forecasting models (e.g. Auto-Regressive Integrated Moving Average (ARIMA) and moving averages) use only historical price movements and cannot react to sudden market shifts induced by investor sentiment or news. Machine learning and neural network models (e.g. Long Short-Term Memory (LSTM) networks) typically exclude qualitative inputs such as financial news, analyst opinions, and investor behavior; consequently, their predictions can be inaccurate, leading to high-risk investment approaches. Current solutions address either time-series data or sentiment, but not the concurrent relationship between historical price movements and market sentiment. Existing mechanisms for using qualitative data are inadequate, since no accepted solution temporally integrates and normalizes real-time sentiment data to account for the volatility of financial markets. A gap and an opportunity therefore exist for a hybrid model that accepts both structured numerical data (historical prices) and unstructured qualitative data (news sentiment) to yield accurate stock price predictions in real time. This utility patent addresses this problem by proposing a novel, integrated system that fuses sentiment analysis with advanced time-series forecasting to enhance predictive accuracy and decision-making in stock trading.
US20230130409A1: A method of using natural language processing (NLP) techniques to extract information from online news feeds and then using the information so extracted to predict changes in stock prices or volatilities. These predictions can be used to make profitable trading strategies. Company names can be recognized and simple templates describing company actions can be automatically filled using parsing or pattern matching on words in or near the sentence containing the company name. These templates can be clustered into groups which are statistically correlated with changes in the stock prices. The system is composed of two parts: a message understanding component that automatically fills in simple templates and a statistical correlation component that tests the correlation of these patterns to increases or decreases in the stock price. The methods can be applied to a broad range of text, including articles in online newspapers such as the Wall Street Journal, financial newsletters, radio and TV transcripts and annual reports. In an enhanced embodiment of the system, statistical patterns in Internet usage data and Internet data such as newly released textual information on Web pages are further leveraged.
US2005131794A1: One aspect of the invention is a method for investing. An equation is created using multivariate regression techniques to calculate a plurality of coefficients each associated with one of a plurality of statistic types that is correlated with actual market prices of the plurality of stocks. At least some of the plurality of statistic types comprise financial information, other than the particular stock's past market price, specific to the entity associated with the particular stock. The equation is then used to estimate the degree to which ones of the plurality of stocks are over-priced or under-priced relative to the price of other ones of the plurality of stocks. These estimates may then be used to make investment decisions.
Accurate forecasting of stock market prices remains difficult because of their dynamic, non-linear nature and susceptibility to investor sentiment and news events. Traditional time-series models such as ARIMA rely only on historical prices and fail to react to sudden market shifts induced by sentiment. Machine learning or neural network models often exclude qualitative inputs such as financial news, analyst opinions, and investor behaviour, resulting in incomplete predictions. Existing solutions address either price history or sentiment but not both concurrently. The present invention solves this problem by providing a hybrid framework that fuses structured historical stock data with unstructured real-time sentiment data to yield more accurate and timely predictions of stock prices.
SUMMARY OF THE INVENTION
This summary is provided to introduce a selection of concepts, in a simplified format, that are further described in the detailed description of the invention.
This summary is neither intended to identify key or essential inventive concepts of the invention nor intended to determine the scope of the invention.
The invention provides a hybrid machine learning framework that integrates a sentiment analysis module with a time-series forecasting engine through a hybrid integration layer. The sentiment analysis module employs a pre-trained natural language processing (NLP) model to extract polarity scores from financial news articles, classifying them as positive, negative or neutral and aggregating them into daily sentiment scores.
The time-series forecasting engine uses one or more machine learning models such as ARIMA, Prophet, K-Nearest Neighbours and Long Short-Term Memory (LSTM) networks to learn temporal patterns in historical stock price data.
A hybrid integration layer normalises and fuses sentiment scores with historical price features to create an enhanced input for the forecasting model. This combined input allows the model to learn sequential dependencies of prices while accounting for evolving market sentiment.
By unifying structured and unstructured data, the system delivers superior prediction accuracy and responsiveness, enabling investors and analysts to anticipate market changes more effectively than with conventional approaches.
To further clarify the advantages and features of the present invention, a more particular description of the invention will be rendered by reference to specific embodiments thereof, which are illustrated in the appended drawings. It is appreciated that these drawings depict only typical embodiments of the invention and are therefore not to be considered limiting of its scope. The invention will be described and explained with additional specificity and detail with reference to the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
The illustrated embodiments of the subject matter will be understood by reference to the drawings, wherein like parts are designated by like numerals throughout. The following description is intended only by way of example, and simply illustrates certain selected embodiments of devices, systems, and methods that are consistent with the subject matter as claimed herein, wherein:
FIGURE 1: SYSTEM ARCHITECTURE
The figures depict embodiments of the present subject matter for the purposes of illustration only. A person skilled in the art will easily recognize from the following description that alternative embodiments of the structures and methods illustrated herein may be employed without departing from the principles of the disclosure described herein.
DETAILED DESCRIPTION OF THE INVENTION
The detailed description of various exemplary embodiments of the disclosure is provided herein with reference to the accompanying drawings. It should be noted that the embodiments are described herein in such detail as to clearly communicate the disclosure. However, the amount of detail provided herein is not intended to limit the anticipated variations of embodiments; on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the scope of the present disclosure as defined by the appended claims.
It is also to be understood that various arrangements may be devised that, although not explicitly described or shown herein, embody the principles of the present disclosure. Moreover, all statements herein reciting principles, aspects, and embodiments of the present disclosure, as well as specific examples, are intended to encompass equivalents thereof.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments. As used herein, the singular forms “a,” “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises,” “comprising,” “includes” and/or “including,” when used herein, specify the presence of stated features, integers, steps, operations, elements and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof.
It should also be noted that in some alternative implementations, the functions/acts noted may occur out of the order noted in the figures. For example, two figures shown in succession may, in fact, be executed concurrently or may sometimes be executed in the reverse order, depending upon the functionality/acts involved.
In addition, the descriptions of “first”, “second”, “third”, and the like in the present invention are used for the purpose of description only, and are not to be construed as indicating or implying their relative importance or implicitly indicating the number of technical features indicated. Thus, features defined as “first” and “second” may include at least one of the features, either explicitly or implicitly.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which example embodiments belong. It will be further understood that terms, e.g., those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
The proposed invention is a hybrid machine learning framework that combines sentiment analysis with time-series forecasting in a single pipeline to improve real-time prediction of stock price fluctuations. The system seeks to advance the capabilities of current dashboards by addressing the shortcoming of traditional models, which analyze historical stock data and market sentiment separately, and instead combines historical stock prices and market sentiment in one adaptive prediction model.
The framework consists of three components:
Sentiment Analysis Module: The process begins with a pre-trained Natural Language Processing (NLP) model incorporating the VADER (Valence Aware Dictionary and sEntiment Reasoner) tool, which analyzes financial news articles to extract daily sentiment scores. Each news article is assessed as positive, negative, or neutral, and the article scores are summed to produce a market sentiment score assigned to each trading day.
Time-Series Forecasting Engine: The time-series forecasting engine uses various machine learning models, including Auto-Regressive Integrated Moving Average (ARIMA), Prophet, K-Nearest Neighbors (KNN), and Long Short-Term Memory (LSTM) neural networks. Each model uses historical stock price data to learn the temporal patterns of price movements and trends.
Hybrid Integration Layer: The sentiment scores derived from the news analysis are normalized and then combined with the historical stock data as an additional input feature. The combined historical and sentiment data is fed into a hybrid forecasting model (usually an RNN-LSTM network), which is capable of learning sequential dependencies in time as well as how market sentiment varies over time.
The platform is built in Python, using TensorFlow (deep learning models), statsmodels (ARIMA), Prophet (trend detection) and NLTK (sentiment analysis). The data is pre-processed, scaled for the models, and divided into training and testing sets to keep them independent. Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) are used as performance measures, with the hybrid model consistently outperforming the single models.
This invention is a flexible, synergistic prediction platform that responds to the evolving nature of the market by using structured historical data combined with unstructured sentiment data. It provides investors and financial analysts with sharper insight in a timely manner, which can help improve investment results and reduce financial risk.
What distinguishes this invention is the proposed hybrid framework, which combines live sentiment analysis of financial news with complex time-series forecasting models to predict stock market changes. Typically, systems rely either on sentiment, a form of unstructured data, or on price, a form of structured data, so that one of the two is left out of the prediction process. The invention combines unstructured textual sentiment data with structured numerical stock price data so that the models are informed by both, improving accuracy over either source alone. The invention relies on a pre-trained natural language processing (NLP) sentiment model, which extracts polarity (positive or negative) from financial news and supplies it as a feature to models such as LSTM or ARIMA. Consequently, the model can learn both the factors affecting the market price and the trends in emotional sentiment that reflect investor psychology in investment decisions. In this dual-layer architecture, the two layers are learned in reference to each other. The hybrid framework thereby addresses a major weakness of comparable single-source models applied to stock price prediction: rapid changes in market conditions driven by unpredictable sentiment factors, which models using only structured or only unstructured data cannot register or adapt to. Overall, the hybrid framework is modular, scalable, and deployable in a real-time trading environment, resulting in improvements in accuracy and responsiveness over traditional sentiment-only or forecasting-only systems.
Pseudocode: Implementation for Sentiment-Fused Stock Price Prediction System
Initialize:
    sentiment_model ← pre-trained VADER or FinBERT
    price_model ← hybrid model (e.g., RNN-LSTM)
    prediction_window ← number of past days used for forecasting
    sentiment_scores ← empty list
    combined_dataset ← empty list
Load:
    historical_stock_data ← load time-series stock prices
    financial_news_data ← load daily financial news articles
Function extract_sentiment_score(news_articles):
    score ← 0
    For each article in news_articles:
        sentiment ← sentiment_model.predict(article)
        score ← score + sentiment.polarity_score
    Return normalized(score)
For each day t in dataset, starting at t = prediction_window:
    # Step 1: Collect data (a full window of past prices is available from t = prediction_window)
    price_features ← historical_stock_data[t - prediction_window : t]
    daily_news ← financial_news_data[t]
    # Step 2: Perform sentiment analysis
    daily_sentiment ← extract_sentiment_score(daily_news)
    sentiment_scores.append(daily_sentiment)
    # Step 3: Combine features and pair them with the price to be predicted
    combined_input ← concatenate(price_features, daily_sentiment)
    combined_dataset.append((combined_input, historical_stock_data[t]))
# Step 4: Split data (80:20, kept in time order so the test set follows the training set)
train_set, test_set ← train_test_split(combined_dataset, ratio=80:20)
# Step 5: Train model
price_model.train(train_set)
# Step 6: Predict and evaluate
predictions ← price_model.predict(test_set.inputs)
Evaluate predictions against test_set.targets using MAE and RMSE
# Step 7: Output
Display predicted vs actual stock prices
Log evaluation metrics
Plot prediction performance
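The data-preparation loop of the pseudocode can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the function name `build_samples`, the use of the day-t price as the prediction target, and the sample values are all hypothetical.

```python
from typing import List, Sequence, Tuple


def build_samples(
    prices: Sequence[float],
    sentiment: Sequence[float],
    window: int,
) -> Tuple[List[List[float]], List[float]]:
    """Pair each day's input (the previous `window` prices plus that day's
    sentiment score) with the price to be predicted for that day."""
    if len(prices) != len(sentiment):
        raise ValueError("prices and sentiment must cover the same days")
    inputs: List[List[float]] = []
    targets: List[float] = []
    # Start at `window` so that every sample has a full price history.
    for t in range(window, len(prices)):
        features = list(prices[t - window:t]) + [sentiment[t]]
        inputs.append(features)
        targets.append(prices[t])  # assumed target: the price on day t
    return inputs, targets


# Illustrative (made-up) closing prices and daily sentiment scores.
prices = [100.0, 101.0, 103.0, 102.0, 105.0]
sentiment = [0.1, -0.2, 0.4, 0.0, 0.3]
X, y = build_samples(prices, sentiment, window=3)
```

With five days of data and a three-day window, the sketch yields two samples, each holding three past prices followed by one sentiment value.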
The invention comprises a data acquisition module that collects historical stock price data and daily financial news articles.
A sentiment analysis module processes the news articles using a pre-trained NLP sentiment model to produce daily sentiment scores. Each article is classified as positive, negative or neutral, and the scores are aggregated and normalised for each trading day.
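As an illustration of this step, the sketch below classifies articles and aggregates their polarity per trading day. It rests on several assumptions: `TOY_LEXICON` and `toy_polarity` are hypothetical stand-ins for a pre-trained model, the ±0.05 cut-offs follow the thresholds commonly used with VADER compound scores, and taking the mean per day is only one possible way to aggregate and normalise. In practice a real scorer could be passed in instead, for example a function returning `SentimentIntensityAnalyzer().polarity_scores(text)["compound"]` from NLTK.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical toy lexicon, used only so that the example is self-contained.
TOY_LEXICON = {"surge": 0.6, "beat": 0.5, "gain": 0.4,
               "loss": -0.5, "fraud": -0.8, "fall": -0.4}


def toy_polarity(text: str) -> float:
    """Average lexicon weight of the known words, clamped to [-1, 1]."""
    hits = [TOY_LEXICON[w] for w in text.lower().split() if w in TOY_LEXICON]
    if not hits:
        return 0.0
    return max(-1.0, min(1.0, sum(hits) / len(hits)))


def classify(polarity: float, threshold: float = 0.05) -> str:
    """Map a polarity score to positive, negative or neutral."""
    if polarity >= threshold:
        return "positive"
    if polarity <= -threshold:
        return "negative"
    return "neutral"


def daily_sentiment(
    articles: Iterable[Tuple[str, str]],
    scorer: Callable[[str], float] = toy_polarity,
) -> Dict[str, float]:
    """Mean polarity per trading day from (day, article_text) pairs."""
    totals: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for day, text in articles:
        totals[day] += scorer(text)
        counts[day] += 1
    # Dividing by the article count keeps each daily score within [-1, 1].
    return {day: totals[day] / counts[day] for day in totals}


articles = [
    ("2025-01-02", "shares surge after earnings beat"),
    ("2025-01-02", "regulator alleges fraud"),
    ("2025-01-03", "markets flat"),
]
scores = daily_sentiment(articles)
labels = {day: classify(score) for day, score in scores.items()}
```

Averaging rather than summing means a day with many articles is not automatically scored as more extreme than a day with few.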
A time-series forecasting engine processes historical stock price data using machine learning models including ARIMA, Prophet, KNN and LSTM to learn patterns and trends over time.
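To make the idea of learning from historical windows concrete, the following sketch shows a nearest-neighbour forecast, one of the model families named above. It is a simplified illustration only: the function name `knn_forecast`, the Euclidean distance, and the values of `window` and `k` are assumptions, and a production system would more likely rely on an established library.

```python
import math
from typing import List, Sequence, Tuple


def knn_forecast(prices: Sequence[float], window: int, k: int = 3) -> float:
    """Predict the next price as the mean of the values that followed the
    k historical windows most similar to the most recent window."""
    if len(prices) <= window:
        raise ValueError("need more prices than the window length")
    query = list(prices[-window:])
    candidates: List[Tuple[float, float]] = []
    # Each historical window prices[i:i+window] is followed by prices[i+window].
    for i in range(len(prices) - window):
        past = list(prices[i:i + window])
        candidates.append((math.dist(past, query), prices[i + window]))
    candidates.sort(key=lambda c: c[0])  # closest windows first
    nearest = candidates[:k]
    return sum(next_price for _, next_price in nearest) / len(nearest)


# A repeating made-up pattern: after the window [1, 2] the next value was 3.
forecast = knn_forecast([1, 2, 3, 1, 2, 3, 1, 2], window=2, k=2)
```

Here the two closest historical windows both match the latest window exactly and were both followed by 3, so the forecast is 3.0.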
A hybrid integration layer combines the normalised sentiment scores with historical price features to create a composite dataset representing both structured and unstructured factors affecting stock prices.
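One way such an integration layer might align and scale the two data streams is sketched below. The function names, the min-max scaling of prices, the mapping of polarity from [-1, 1] to [0, 1], and the neutral default for days without news are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def min_max_scale(values: List[float]) -> List[float]:
    """Scale values linearly to [0, 1]; a constant series maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def fuse(
    prices: Dict[str, float],
    sentiment: Dict[str, float],
    default_sentiment: float = 0.0,
) -> List[Tuple[str, float, float]]:
    """Return (day, scaled_price, scaled_sentiment) rows in date order."""
    days = sorted(prices)  # ISO-formatted dates sort chronologically
    scaled_prices = min_max_scale([prices[d] for d in days])
    # Polarity is assumed to lie in [-1, 1]; days without news are treated
    # as neutral before mapping to [0, 1].
    scaled_sentiment = [
        (sentiment.get(d, default_sentiment) + 1.0) / 2.0 for d in days
    ]
    return list(zip(days, scaled_prices, scaled_sentiment))


rows = fuse(
    {"2025-01-02": 100.0, "2025-01-03": 110.0, "2025-01-06": 105.0},
    {"2025-01-02": 0.5, "2025-01-06": -1.0},
)
```

For brevity the sketch scales prices over the whole series. When a held-out test set is used, the scaling bounds would normally be computed on the training portion only so that no test information leaks into training.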
The hybrid forecasting model, typically based on an RNN-LSTM network, receives this composite dataset as input and learns both sequential price dependencies and temporal sentiment variations.
Training involves splitting the combined dataset into training and testing sets to maintain independence, with performance evaluated using metrics such as Mean Absolute Error (MAE) and Root Mean Square Error (RMSE).
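The split and the two metrics can be illustrated with a short sketch. MAE is the mean of the absolute errors and RMSE is the square root of the mean squared error. The function names and the 80:20 ratio default are assumptions for the example, as is the choice to split in time order rather than at random.

```python
import math
from typing import List, Sequence, Tuple


def chronological_split(
    samples: Sequence, train_ratio: float = 0.8
) -> Tuple[list, list]:
    """Keep the earliest samples for training and the latest for testing."""
    cut = int(len(samples) * train_ratio)
    return list(samples[:cut]), list(samples[cut:])


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root Mean Square Error."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


train, test = chronological_split(list(range(10)))

# Made-up actual and predicted prices with errors of 1, 0 and 3.
actual = [100.0, 102.0, 104.0]
predicted = [101.0, 102.0, 101.0]
mae_value = mae(actual, predicted)    # (1 + 0 + 3) / 3
rmse_value = rmse(actual, predicted)  # sqrt((1 + 0 + 9) / 3)
```

Because RMSE squares each error before averaging, the single 3-unit miss raises it above the MAE, which is why the two metrics are usually reported together.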
The system outputs predicted stock prices along with performance metrics and visualisations comparing predicted versus actual values.
By fusing sentiment and price features, the model adapts quickly to sudden market shocks driven by news or investor psychology, outperforming traditional single-source models.
The platform is modular, allowing substitution of different sentiment models or forecasting engines as needed, and scalable for integration into trading dashboards, mobile applications or cloud services.
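The kind of modularity described here could be expressed through small interfaces, as in the sketch below. Every name in it (`SentimentModel`, `Forecaster`, `KeywordSentiment`, `LastValueForecaster`, `SentimentTiltForecaster`, `predict_next`) is hypothetical, and the two forecasters are deliberately trivial placeholders rather than the models of the invention.

```python
from typing import Protocol, Sequence


class SentimentModel(Protocol):
    def score(self, text: str) -> float: ...


class Forecaster(Protocol):
    def predict(self, features: Sequence[float]) -> float: ...


class KeywordSentiment:
    """Placeholder scorer: +1 for 'up', -1 for 'down', otherwise 0."""

    def score(self, text: str) -> float:
        words = text.lower().split()
        return 1.0 if "up" in words else -1.0 if "down" in words else 0.0


class LastValueForecaster:
    """Naive baseline: repeat the most recent price and ignore sentiment."""

    def predict(self, features: Sequence[float]) -> float:
        return features[-2]  # features = past prices followed by sentiment


class SentimentTiltForecaster:
    """Placeholder: nudge the last price in the direction of sentiment."""

    def __init__(self, tilt: float = 0.01) -> None:
        self.tilt = tilt

    def predict(self, features: Sequence[float]) -> float:
        return features[-2] * (1.0 + self.tilt * features[-1])


def predict_next(
    sentiment_model: SentimentModel,
    forecaster: Forecaster,
    price_window: Sequence[float],
    headlines: Sequence[str],
) -> float:
    """Run the pipeline with whichever components are plugged in."""
    scores = [sentiment_model.score(h) for h in headlines]
    mean_sentiment = sum(scores) / len(scores) if scores else 0.0
    return forecaster.predict(list(price_window) + [mean_sentiment])


baseline = predict_next(KeywordSentiment(), LastValueForecaster(),
                        [100.0, 102.0], ["profits up"])
tilted = predict_next(KeywordSentiment(), SentimentTiltForecaster(),
                      [100.0, 102.0], ["profits up"])
```

Because `predict_next` depends only on the two interfaces, a different sentiment model or forecasting engine can be substituted without changing the pipeline code.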
Security measures ensure that all data are handled confidentially, and privacy standards are met when processing third-party content.
This invention thus provides a flexible, synergistic prediction platform that combines structured and unstructured data streams, enabling more timely and accurate investment decisions and reducing financial risk.
BEST METHOD OF WORKING
The preferred embodiment implements the system as a cloud-based analytics platform. Historical stock prices and real-time financial news are ingested via APIs. A pre-trained NLP sentiment model generates daily sentiment scores. These scores are normalised and combined with historical price features and fed into a hybrid forecasting model based on an RNN-LSTM network. The model is trained and evaluated on a rolling window of data, and predictions with evaluation metrics are displayed on a dashboard for investors and analysts. This configuration achieves improved prediction accuracy and responsiveness compared to traditional forecasting systems.
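Training and evaluating on a rolling window can be read in several ways; one common interpretation is walk-forward evaluation, sketched below. The function name `rolling_windows` and the fixed train and test sizes are assumptions, not details given in the specification.

```python
from typing import Iterator, Tuple


def rolling_windows(
    n_samples: int, train_size: int, test_size: int
) -> Iterator[Tuple[range, range]]:
    """Yield (train_indices, test_indices) pairs that slide forward in time.

    Each test block immediately follows its training block, and the window
    advances by one test block per step.
    """
    start = 0
    while start + train_size + test_size <= n_samples:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size


# Ten samples, six for training and two for testing in each fold.
folds = list(rolling_windows(n_samples=10, train_size=6, test_size=2))
```

In this example the first fold trains on samples 0 to 5 and tests on 6 to 7, and the second trains on 2 to 7 and tests on 8 to 9, so the model is always evaluated on data later than anything it was trained on.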
Claims: 1. A system for enhanced stock price prediction comprising:
a data acquisition module configured to collect historical stock price data and daily financial news articles;
a sentiment analysis module configured to process the financial news articles using a pre-trained natural language processing model to generate daily sentiment scores;
a time-series forecasting engine configured to analyse historical stock price data using machine learning models;
a hybrid integration layer configured to normalise and combine the sentiment scores with historical price features to create a composite dataset;
a hybrid forecasting model configured to receive the composite dataset and predict future stock prices; and
an output module configured to display predicted prices, performance metrics and visualisations to a user.
2. The system as claimed in claim 1, wherein the sentiment analysis module classifies news articles as positive, negative or neutral and aggregates them into daily sentiment scores.
3. The system as claimed in claim 1, wherein the time-series forecasting engine comprises one or more models selected from ARIMA, Prophet, K-Nearest Neighbours and Long Short-Term Memory networks.
4. The system as claimed in claim 1, wherein the hybrid integration layer normalises sentiment scores and concatenates them with historical price features for model input.
5. The system as claimed in claim 1, wherein the output module provides predicted versus actual price visualisations and evaluation metrics including Mean Absolute Error and Root Mean Square Error.
6. A method for enhanced stock price prediction comprising:
collecting historical stock price data and daily financial news articles;
processing the financial news articles using a pre-trained natural language processing model to generate daily sentiment scores;
analysing historical stock price data using machine learning models to learn temporal patterns;
normalising and combining sentiment scores with historical price features to create a composite dataset;
predicting future stock prices using a hybrid forecasting model trained on the composite dataset; and
displaying predicted prices, performance metrics and visualisations to a user.
7. The method as claimed in claim 6, wherein financial news articles are classified as positive, negative or neutral and aggregated into daily sentiment scores.
8. The method as claimed in claim 6, wherein the machine learning models comprise ARIMA, Prophet, K-Nearest Neighbours and Long Short-Term Memory networks.
9. The method as claimed in claim 6, wherein sentiment scores and historical price features are combined to enable the model to learn both sequential dependencies and temporal sentiment variations.
10. The method as claimed in claim 6, wherein the system outputs predicted stock prices along with evaluation metrics comparing predicted and actual values.
| # | Name | Date |
|---|---|---|
| 1 | 202541090701-STATEMENT OF UNDERTAKING (FORM 3) [23-09-2025(online)].pdf | 2025-09-23 |
| 2 | 202541090701-REQUEST FOR EARLY PUBLICATION(FORM-9) [23-09-2025(online)].pdf | 2025-09-23 |
| 3 | 202541090701-POWER OF AUTHORITY [23-09-2025(online)].pdf | 2025-09-23 |
| 4 | 202541090701-FORM-9 [23-09-2025(online)].pdf | 2025-09-23 |
| 5 | 202541090701-FORM FOR SMALL ENTITY(FORM-28) [23-09-2025(online)].pdf | 2025-09-23 |
| 6 | 202541090701-FORM 1 [23-09-2025(online)].pdf | 2025-09-23 |
| 7 | 202541090701-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [23-09-2025(online)].pdf | 2025-09-23 |
| 8 | 202541090701-EVIDENCE FOR REGISTRATION UNDER SSI [23-09-2025(online)].pdf | 2025-09-23 |
| 9 | 202541090701-EDUCATIONAL INSTITUTION(S) [23-09-2025(online)].pdf | 2025-09-23 |
| 10 | 202541090701-DRAWINGS [23-09-2025(online)].pdf | 2025-09-23 |
| 11 | 202541090701-DECLARATION OF INVENTORSHIP (FORM 5) [23-09-2025(online)].pdf | 2025-09-23 |
| 12 | 202541090701-COMPLETE SPECIFICATION [23-09-2025(online)].pdf | 2025-09-23 |