STOCK PRICE PREDICTION USING MACHINE LEARNING AND SENTIMENT ANALYSIS
1.1 AIM OF THE PROJECT
The aim of the project is to examine a number of forecasting techniques that predict future stock returns from past returns and numerical news indicators, and to use those predictions to construct a portfolio of multiple stocks that diversifies risk. We do this by applying supervised learning methods to seemingly chaotic market data and integrating the results with sentiment analysis data.
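As a minimal sketch of how past returns and a numerical news indicator can be framed as a supervised learning problem, the snippet below builds feature rows from the previous few returns plus the most recent news indicator, with the next return as the target. The function names, the number of lags and the alignment of the news series with the return series are assumptions made for illustration, not the project's actual pipeline.

```python
from typing import List, Sequence, Tuple


def simple_returns(prices: Sequence[float]) -> List[float]:
    """Day-over-day simple returns: r_t = p_t / p_(t-1) - 1."""
    return [prices[t] / prices[t - 1] - 1.0 for t in range(1, len(prices))]


def make_supervised(
    returns: Sequence[float],
    news: Sequence[float],
    lags: int = 3,
) -> Tuple[List[List[float]], List[float]]:
    """Build (features, targets) for supervised learning.

    Assumption: news[i] is a numerical news indicator for the same
    period as returns[i]. The target returns[t] is predicted from the
    previous `lags` returns and the previous period's news indicator,
    so no information from period t leaks into the features.
    """
    if len(returns) != len(news):
        raise ValueError("returns and news must have the same length")
    features, targets = [], []
    for t in range(lags, len(returns)):
        row = list(returns[t - lags:t]) + [news[t - 1]]
        features.append(row)
        targets.append(returns[t])
    return features, targets


# Hypothetical closing prices and news indicators, for illustration only.
prices = [100.0, 110.0, 99.0, 108.9, 108.9]
rets = simple_returns(prices)
news = [0.2, -0.5, 0.3, 0.0]
X, y = make_supervised(rets, news, lags=2)
```

Any supervised regressor can then be fitted on `X` and `y`; the choice of model is left open here.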
The stock market fluctuates violently and there are many complicated financial indicators. However, advances in technology provide an opportunity to gain steady returns from the stock market and can also help experts find the most informative indicators for making better predictions. Predicting market value is of paramount importance for maximizing the profit of stock option purchases while keeping the risk low.
Social media plays an important role in predicting stock market returns. We therefore appended one more feature to our data: Twitter's Daily Sentiment Score for each company, based on users' tweets about that company and on the tweets posted on that company's page.
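The text does not say how the daily score is computed from individual tweets. One simple possibility, shown here as a sketch, is to average per-tweet sentiment scores for each company and day and then attach the result to the matching feature row. The record layout, the field names and the use of a plain mean are assumptions.

```python
from collections import defaultdict
from statistics import mean


def daily_sentiment(tweets):
    """Average per-tweet scores into one score per (company, date).

    `tweets` is an iterable of (company, date, score) tuples, where
    score is a per-tweet sentiment value (assumed already computed).
    """
    buckets = defaultdict(list)
    for company, day, score in tweets:
        buckets[(company, day)].append(score)
    return {key: mean(scores) for key, scores in buckets.items()}


def append_sentiment(rows, scores, default=0.0):
    """Return copies of the feature rows with a 'sentiment' field added.

    Rows without any tweets for that company and date get `default`.
    """
    return [
        {**row, "sentiment": scores.get((row["company"], row["date"]), default)}
        for row in rows
    ]


# Hypothetical data, for illustration only.
tweets = [
    ("AAPL", "2016-01-04", 1.0),
    ("AAPL", "2016-01-04", 0.0),
    ("AAPL", "2016-01-05", -1.0),
]
rows = [
    {"company": "AAPL", "date": "2016-01-04", "close": 105.35},
    {"company": "MSFT", "date": "2016-01-04", "close": 54.80},
]
scores = daily_sentiment(tweets)
enriched = append_sentiment(rows, scores)
```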
Once we had the complete set of features, we normalized our data for better results.
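The normalization method is not specified. A common choice is min-max scaling of each feature column to the range [0, 1], sketched below under that assumption. The function name and sample values are illustrative only.

```python
def min_max_normalize(rows):
    """Scale each column of a list of numeric rows to [0, 1].

    A column whose values are all equal is mapped to 0.0 to avoid
    division by zero.
    """
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, lo, hi in zip(row, lows, highs)
        ]
        for row in rows
    ]


# Hypothetical rows of (closing price, daily sentiment score).
raw = [[10.0, 0.2], [20.0, -0.4], [15.0, 0.8]]
scaled = min_max_normalize(raw)
```

In practice the minimum and maximum would normally be computed on the training data only and then reused for later data, so that information from the test period does not leak into the features.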
2.1 EXISTING SYSTEM
Stock market prediction is the act of trying to determine the future value of a company stock or other financial instrument traded on an exchange. The successful prediction of a stock's future price could yield significant profit. The efficient market hypothesis suggests that stock prices reflect all currently available information, so any price changes that are not based on newly revealed information are inherently unpredictable. Others disagree, and those with this viewpoint possess myriad methods and technologies which purportedly allow them to gain future price information.
2.1.1 PREDICTION METHODS:
FUNDAMENTAL ANALYSIS:
Fundamental analysts are concerned with the company that underlies the stock itself. They evaluate a company's past performance as well as the credibility of its accounts. Many performance ratios, such as the P/E ratio, have been created to help the fundamental analyst assess the validity of a stock. Warren Buffett is perhaps the most famous of all fundamental analysts.
Fundamental analysis is built on the belief that human society needs capital to make progress, and that a company which operates well should be rewarded with additional capital, resulting in a surge in its stock price. Fundamental analysis is widely used by fund managers because it is regarded as the most reasonable and objective approach and is based on publicly available information, such as financial statements.
Fundamental analysis can also go beyond bottom-up company analysis. In this second sense it refers to top-down analysis, which starts with the global economy, moves on to country analysis and then sector analysis, and ends with company-level analysis.
2.2.2 INTERNET-BASED DATA SOURCES FOR STOCK MARKET PREDICTION
Tobias Preis et al. introduced a method to identify online precursors for stock market moves, using trading strategies based on search volume data provided by Google Trends. Their analysis of Google search volume for 98 terms of varying financial relevance, published in Scientific Reports, suggests that increases in search volume for financially relevant search terms tend to precede large losses in financial markets. Out of these terms, three were significant at the 5% level (|z| > 1.96). The best term in the negative direction was "debt", followed by "color".
In a study published in Scientific Reports in 2013, Helen Susannah Moat, Tobias Preis and colleagues demonstrated a link between
changes in the number of views of English Wikipedia articles
relating to financial topics and subsequent large stock market moves.
The use of text mining together with machine learning algorithms has received more attention in recent years, with textual content from the Internet used as input to predict price changes in stocks and other financial markets.
One study linked the collective mood of Twitter messages to stock market performance. That study, however, has been criticized for its methodology.
The activity in stock message boards has been mined in order to predict asset returns. Enterprise headlines from Yahoo! Finance and Google Finance have been used as news feeds in a text mining process to forecast the price movements of stocks in the Dow Jones Industrial Average.
2.2 PROPOSED SYSTEM
Stock price trend prediction is an active research area, as more accurate predictions are directly related to higher returns on stocks. In recent years, therefore, significant effort has been put into developing models that can predict the future trend of a specific stock or of the overall market. Most of the existing techniques make use of technical indicators. Some researchers have shown that there is a strong relationship between news articles about a company and fluctuations in its stock price. The following is a discussion of previous research on sentiment analysis of text data and on different classification techniques.
Nagar and Hahsler presented an automated, text mining based approach to aggregate news stories from various sources and create a News Corpus. The Corpus is filtered down to relevant sentences and analyzed using Natural Language Processing (NLP) techniques. A sentiment metric called News Sentiment, which uses the counts of positive and negative polarity words, is proposed as a measure of the sentiment of the overall news corpus. They used various open source packages and tools to develop the news collection and aggregation engine as well as the sentiment evaluation engine. They also state that the time variation of News Sentiment shows a very strong correlation with the actual stock price movement.
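A count-based sentiment metric of this kind can be sketched as follows. The exact formula used by Nagar and Hahsler is not given in the text, so the normalized difference (positive - negative) / (positive + negative) and the tiny word lists below are assumptions chosen only to illustrate the idea.

```python
import re

# Hypothetical polarity lexicons; a real system would use a full lexicon.
POSITIVE = {"gain", "growth", "profit", "beat", "strong"}
NEGATIVE = {"loss", "decline", "debt", "miss", "weak"}


def news_sentiment(sentences):
    """Score a corpus by counting positive and negative polarity words.

    Returns a value in [-1, 1]: (pos - neg) / (pos + neg),
    or 0.0 when no polarity word occurs in the corpus.
    """
    pos = neg = 0
    for sentence in sentences:
        for word in re.findall(r"[a-z']+", sentence.lower()):
            if word in POSITIVE:
                pos += 1
            elif word in NEGATIVE:
                neg += 1
    total = pos + neg
    return (pos - neg) / total if total else 0.0


# Hypothetical filtered corpus for one company on one day.
corpus = [
    "Strong profit growth this quarter",
    "Debt rose after a weak quarter",
]
score = news_sentiment(corpus)
```

Computing this score per day yields a time series that can be compared against the stock price movement.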
Yu et al. present a text mining based framework to determine the sentiment of news articles and illustrate its impact on energy demand. News sentiment is quantified, presented as a time series and compared with fluctuations in energy demand and prices.
J. Bean uses keyword tagging on Twitter feeds about airline satisfaction to score them for polarity and sentiment. This can provide a quick idea of the prevailing sentiment about airlines and their customer satisfaction ratings. We have used a sentiment detection algorithm based on this research.
Another research paper studies how the results of financial forecasting can be improved when news articles with different levels of relevance to the target stock are used simultaneously. The authors used a multiple kernel learning (MKL) technique to partition the information extracted from five different categories of news articles. News articles are divided into five categories according to their relevance to the targeted stock, its sub-industry, industry, group industry and sector, and separate kernels are employed to analyze each category. The experimental results show that using all five news categories simultaneously improves prediction performance in comparison with methods based on fewer news categories. The highest prediction accuracy and return per trade were achieved for MKL when all five categories of news were used, with two separate kernels, one polynomial and one Gaussian, for each news category.
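The idea of giving each news category its own polynomial and Gaussian kernel can be sketched as a weighted sum of kernels. The snippet below is an illustration under stated assumptions: the category names, kernel parameters and weights are hypothetical, and the weights are fixed by hand, whereas an MKL method would learn them from the training data.

```python
import math


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel: (x . y + coef0) ** degree."""
    return (sum(a * b for a, b in zip(x, y)) + coef0) ** degree


def gaussian_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def combined_kernel(xs, ys, weights):
    """Weighted sum of one polynomial and one Gaussian kernel per category.

    xs, ys:  dict mapping news category -> feature vector of one sample
    weights: dict mapping news category -> (w_poly, w_gauss), both >= 0.
             Fixed here by assumption; MKL would estimate these weights.
    """
    total = 0.0
    for category, (w_poly, w_gauss) in weights.items():
        total += w_poly * polynomial_kernel(xs[category], ys[category])
        total += w_gauss * gaussian_kernel(xs[category], ys[category])
    return total


# Two hypothetical samples with features from two of the five categories.
a = {"stock": [1.0, 0.0], "sector": [0.0, 1.0]}
b = {"stock": [0.0, 1.0], "sector": [0.0, 1.0]}
w = {"stock": (0.25, 0.25), "sector": (0.25, 0.25)}
k_aa = combined_kernel(a, a, w)
k_ab = combined_kernel(a, b, w)
```

Because a non-negative weighted sum of valid kernels is itself a valid kernel, the combined value can be used wherever a single kernel would be, for example in a support vector machine with a precomputed kernel matrix.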