Real-Time Stock News Sentiment Analysis Using Python Tools
Chapter 1: Introduction to Sentiment Analysis in Stock Trading
Understanding public sentiment towards specific stocks is essential for forecasting their future market behavior. Monitoring stock-related news is one of the most effective means of gauging this sentiment. However, determining whether a piece of news is beneficial or detrimental to a stock's value can be quite complex. Fortunately, we have a method to tackle this challenge!
In this article, we will build a model that predicts the sentiment of stock news in real time. We train it on data from the Financial Modeling Prep (FMP) Stock News Sentiment API, which supplies categorized news articles along with sentiment labels and scores.
This dataset has various applications; here we focus on using it to forecast the sentiment of news articles.
Let's embark on this exciting journey!
Section 1.1: Tools Required for Sentiment Analysis
To implement our model, we need several libraries to facilitate data processing and analysis. Think of this as gathering the necessary tools before we begin our expedition.
import pandas as pd
import requests
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier
from sklearn.naive_bayes import BernoulliNB
from sklearn.preprocessing import LabelEncoder
from sklearn import metrics
- requests: Simplifies HTTP operations such as GET and POST.
- pandas: An open-source tool for efficient data analysis and manipulation.
- CountVectorizer: Converts text data into a numerical format suitable for machine learning.
- train_test_split: Splits the dataset into training and testing subsets for model validation.
- XGBClassifier: A high-performance classifier based on the XGBoost algorithm.
- BernoulliNB: Implements the Bernoulli Naive Bayes algorithm for binary feature data.
- LabelEncoder: Converts categorical labels into numerical values for model processing.
- metrics: Provides various functions to evaluate model performance.
If you haven't installed these libraries yet, you can do so with pip in your terminal, for example: pip install pandas requests scikit-learn xgboost. To extract the data, you will need a developer account with FMP, which can be created through their website.
Section 1.2: Accessing News Data with FMP
Before we embark on our analysis, we must access the extensive news data available through the FMP Stock News Sentiment API.
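api_key = 'YOUR API KEY'
# Assumed endpoint for the FMP Stock News Sentiment feed; confirm the exact path in FMP's documentation
url = f'https://financialmodelingprep.com/api/v4/stock-news-sentiments-rss-feed?page=0&apikey={api_key}'
response = requests.get(url).json()
data = pd.DataFrame(response)
data.head()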
This code is straightforward. We first store our API key (ensure you replace 'YOUR API KEY' with your actual FMP API key) and the API URL in variables. We then make an API call to fetch the news sentiment data and convert the JSON response into a DataFrame.
The API endpoint retrieves the latest classified stock news, labeling them as Positive, Neutral, or Negative, along with sentiment scores and source verification.
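The request above can be made a little more defensive. The following is a minimal sketch, not FMP's official client: the helper name `fetch_news_page`, the `BASE_URL` path, and the `page`/`apikey` query parameters are assumptions to check against FMP's documentation. It adds a timeout, raises on HTTP errors, and refuses to build a DataFrame from anything that is not a list of records.

```python
import pandas as pd
import requests

# Assumption: endpoint path for the news sentiment feed; verify it in FMP's docs.
BASE_URL = "https://financialmodelingprep.com/api/v4/stock-news-sentiments-rss-feed"


def fetch_news_page(page, api_key, timeout=10):
    """Fetch one page of classified news and return it as a DataFrame."""
    response = requests.get(
        BASE_URL,
        params={"page": page, "apikey": api_key},  # assumed parameter names
        timeout=timeout,
    )
    # Raise on 4xx/5xx responses instead of trying to parse an error body.
    response.raise_for_status()
    payload = response.json()
    # A successful call is expected to return a list of article records.
    if not isinstance(payload, list):
        raise ValueError(f"Unexpected response: {payload!r}")
    return pd.DataFrame(payload)
```

Calling `fetch_news_page(0, api_key)` would then stand in for the plain `requests.get(url).json()` call, with failures surfacing as exceptions rather than as a malformed DataFrame.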
Section 1.3: Constructing the Dataset
Now, let’s gather our raw data. We will compile news articles along with their sentiment scores, akin to collecting pieces of a puzzle before assembling them.
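api_key = 'YOUR API KEY'
for i in range(0, 100):
    # Assumed endpoint (confirm in FMP's documentation); the page number selects the batch of articles
    url = f'https://financialmodelingprep.com/api/v4/stock-news-sentiments-rss-feed?page={i}&apikey={api_key}'
    response = requests.get(url).json()
    df = pd.DataFrame(response)
    data = data.merge(df, how='outer')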
By iterating through the pages, we can create a dataset with 10,000 rows and 9 columns, merging the data collected in each iteration.
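As an alternative to merging inside the loop, the pages can be collected in a list and stacked once at the end. The sketch below is illustrative only: `collect_pages` and `fake_fetch` are hypothetical names, and `fake_fetch` is an offline stand-in for the real API call so the example runs without a key. It also swaps the outer merge for `pd.concat` followed by `drop_duplicates`, which is a different technique from the one used in the article.

```python
import pandas as pd


def collect_pages(fetch_page, n_pages):
    """Call fetch_page(i) for each page index and stack the results."""
    frames = []
    for page in range(n_pages):
        frame = fetch_page(page)
        if frame.empty:  # stop early once a page comes back empty
            break
        frames.append(frame)
    if not frames:
        return pd.DataFrame()
    combined = pd.concat(frames, ignore_index=True)
    # Drop rows repeated across pages and renumber the index.
    return combined.drop_duplicates().reset_index(drop=True)


def fake_fetch(page):
    """Hypothetical stand-in for the API: three pages of two rows each."""
    if page >= 3:
        return pd.DataFrame()
    return pd.DataFrame({
        "text": [f"headline {page}-{k}" for k in range(2)],
        "sentiment": ["Positive", "Negative"],
    })


dataset = collect_pages(fake_fetch, n_pages=100)
print(dataset.shape)
```

Building the list first avoids re-merging an ever-growing DataFrame on every iteration, and the early exit keeps the loop from requesting pages that no longer exist.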
Section 1.4: Data Preprocessing
Before analyzing the data, we must ensure it is clean and organized. This is similar to sorting through treasures to arrange them properly.
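# Drop rows with missing values
data = data.dropna()
# Encode the sentiment labels as integers
le = LabelEncoder()
data['sentiment'] = le.fit_transform(data['sentiment'])
# Build a bag-of-words representation limited to the 1,000 most frequent terms
count_vectorizer = CountVectorizer(max_features=1000)
feature_vector = count_vectorizer.fit(data.text)
train_ds_features = count_vectorizer.transform(data.text)
# Hold out 30% of the rows for testing
train_x, test_x, train_y, test_y = train_test_split(train_ds_features, data.sentiment,
                                                    test_size=0.3, random_state=42)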
We remove null values, encode the sentiment labels as integers with LabelEncoder, and convert the text data into numerical format using CountVectorizer. The dataset is then divided into training and testing sets in a 70:30 ratio.
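To make the two transformations concrete, here is a small sketch on three made-up headlines (the texts and labels are hypothetical, not taken from the FMP data). It shows how LabelEncoder maps label strings to integers in alphabetical order and how CountVectorizer turns each text into a row of word counts.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy data standing in for the `text` and `sentiment` columns.
texts = [
    "Shares surge after strong earnings",
    "Shares fall after weak earnings",
    "Company schedules annual meeting",
]
labels = ["Positive", "Negative", "Neutral"]

# LabelEncoder sorts the distinct labels and assigns 0, 1, 2, ...
le = LabelEncoder()
encoded = le.fit_transform(labels)
print(le.classes_.tolist())  # ['Negative', 'Neutral', 'Positive']
print(encoded.tolist())      # [2, 0, 1]

# CountVectorizer lowercases, tokenizes, and counts words per document.
vec = CountVectorizer(max_features=1000)
X = vec.fit_transform(texts)
print(X.shape)                 # (number of documents, vocabulary size)
print(sorted(vec.vocabulary_)) # the learned vocabulary
print(X.toarray())             # one row of counts per headline
```

Because the encoding is alphabetical, a prediction of 0 corresponds to Negative and 2 to Positive; `le.inverse_transform` converts predicted integers back to label strings.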
Chapter 2: Model Training and Evaluation
The first video titled "[Python Project] Sentiment Analysis and Visualization of Stock News" provides a comprehensive overview of how to conduct sentiment analysis using Python, including visualizations to enhance understanding.
Section 2.1: Training the Model
Now comes the exciting part! We will train our model with ample data to differentiate between positive and negative news sentiments.
train_y = train_y.astype('int')
nb_clf = BernoulliNB()
nb_clf.fit(train_x.toarray(), train_y)
xg_clf = XGBClassifier()
xg_clf.fit(train_x.toarray(), train_y)
Two models are trained here, Bernoulli Naive Bayes and XGBoost; feel free to use either one based on your preference.
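One possible variation, sketched below under stated assumptions, is to wrap the vectorizer and the classifier in a scikit-learn Pipeline so that the vocabulary is learned from the training texts only and new headlines can be passed in as raw strings. This is not the approach used above, which vectorizes the full dataset before splitting; the toy headlines and labels are hypothetical, and only the Naive Bayes model is shown to keep the example light.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy headlines; real training would use the collected news text.
texts = [
    "Profits surge as sales beat forecasts",
    "Stock rallies on strong quarterly earnings",
    "Analysts upgrade shares after record revenue",
    "Dividend raised following strong growth",
    "Shares jump on upbeat guidance",
    "Profits plunge as sales miss forecasts",
    "Stock sinks on weak quarterly earnings",
    "Analysts downgrade shares after revenue drop",
    "Dividend cut following weak growth",
    "Shares slump on gloomy guidance",
]
labels = ["Positive"] * 5 + ["Negative"] * 5

# Split the raw texts first, keeping the class balance in both subsets.
train_texts, test_texts, train_labels, test_labels = train_test_split(
    texts, labels, test_size=0.3, random_state=42, stratify=labels
)

# The pipeline fits the vocabulary on the training texts only.
model = make_pipeline(CountVectorizer(max_features=1000), BernoulliNB())
model.fit(train_texts, train_labels)

predictions = model.predict(test_texts)
print(list(zip(test_texts, predictions)))

# Scoring an unseen headline only needs the raw string.
print(model.predict(["Earnings beat forecasts and shares surge"]))
```

With so few examples the predictions themselves are not meaningful; the point is the structure, in which a single `model.predict` call handles both vectorization and classification.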
Section 2.2: Testing the Model
With the models trained, it is time to assess their predictive capabilities on the held-out test set.
from sklearn.metrics import classification_report
test_xg_predicted = xg_clf.predict(test_x.toarray())
print(classification_report(test_y, test_xg_predicted))
test_nb_predicted = nb_clf.predict(test_x.toarray())
print(classification_report(test_y, test_nb_predicted))
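The classification reports can look intricate, but they provide critical insights into each model's performance: for every sentiment class they list precision, recall, F1-score, and support (the number of test samples in that class).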
In our analysis, we observe that XGBoost outperformed Naive Bayes across nearly all metrics, suggesting it is the more effective choice for predicting news sentiments.
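To see how the numbers in a classification report arise, the sketch below scores a small set of made-up predictions. The label arrays are hypothetical and are not results from the models above; the class names assume the alphabetical integer encoding that LabelEncoder produces for Negative, Neutral, and Positive.

```python
from sklearn.metrics import classification_report, precision_recall_fscore_support

# Assumed mapping from LabelEncoder: 0 = Negative, 1 = Neutral, 2 = Positive.
class_names = ["Negative", "Neutral", "Positive"]

# Hypothetical true and predicted labels for eight test articles.
y_true = [0, 0, 1, 1, 2, 2, 2, 1]
y_pred = [0, 1, 1, 1, 2, 2, 0, 1]

# Full text report, one row per class.
print(classification_report(y_true, y_pred, target_names=class_names))

# The same per-class numbers as arrays, useful for comparing models in code.
precision, recall, f1, support = precision_recall_fscore_support(y_true, y_pred)
for name, p, r, n in zip(class_names, precision, recall, support):
    print(f"{name}: precision={p:.2f} recall={r:.2f} support={n}")
```

Here the Neutral class is predicted four times and three of those are correct, giving a precision of 0.75, while all three true Neutral articles are found, giving a recall of 1.00. Comparing these per-class arrays for two models is one way to check a claim that one classifier beats the other on most metrics.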
The second video titled "Live Project-Stock Sentiment Analysis using News Headlines Machine Learning" demonstrates a hands-on approach to implementing sentiment analysis in stock trading, providing practical insights and applications.
Chapter 3: Conclusion and Future Directions
To conclude, this article presented a method for predicting public sentiment based on stock news using the Financial Modeling Prep API. This data holds significant potential for various applications, particularly in enhancing our market understanding and facilitating informed decision-making. The model discussed can gauge public sentiment towards stock news in real time, which can be valuable for investors and analysts alike.
Thank you for joining us on this exploration. If you have any suggestions for improving the machine learning model we developed here, please share your thoughts in the comments. We appreciate your time and hope you found this information insightful!