# Stock News Sentiment Analysis

## Project Overview

This project centers on analyzing the sentiment of financial news articles about specific stock tickers, retrieved from Finviz. It leverages Python libraries such as Beautiful Soup for web scraping, NLTK for sentiment analysis, Pandas for data manipulation, and Matplotlib for visualization to extract news headlines, score their sentiment, and plot sentiment trends for the chosen tickers.

## Data Collection

The code scrapes news headlines from Finviz's quote pages for the selected stock tickers (here: AMD, AMZN, and NVDA). For each ticker it retrieves the headlines together with their publication dates and times and stores them in a structured format using a Pandas DataFrame.
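
To make the scraping step concrete, here is a minimal sketch of how a single ticker's headlines can be pulled. It assumes Finviz still serves the headline list in an HTML table with id "news-table"; the full script at the end of this post applies the same idea to all three tickers.

```python
from urllib.request import urlopen, Request
from bs4 import BeautifulSoup

# Fetch the Finviz quote page for one ticker; set a simple user-agent header,
# as the full script does, since Finviz tends to reject anonymous requests.
url = "https://finviz.com/quote.ashx?t=AMD"
req = Request(url=url, headers={"user-agent": "my program"})
html = BeautifulSoup(urlopen(req), "html.parser")

# Each <tr> in the news table holds a timestamp cell and a headline link.
news_table = html.find(id="news-table")
for row in news_table.findAll("tr")[:5]:  # look at the first few rows only
    if row.a:
        print(row.td.text.strip(), "|", row.a.text.strip())
```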

## Sentiment Analysis

NLTK's VADER sentiment intensity analyzer is used to assess the sentiment of each news headline. The sentiment scores (negative, neutral, positive, and compound) are computed and appended to the DataFrame. For instance, a compound score of -0.5423 indicates a moderately negative headline.
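
As a quick illustration of what VADER returns, the snippet below scores the example sentence that also appears as a sanity check in the full script; the compound value is a normalized summary of the whole sentence in the range -1 to 1.

```python
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nltk.download("vader_lexicon")  # one-time download of the lexicon VADER relies on

vader = SentimentIntensityAnalyzer()
scores = vader.polarity_scores(
    "I think Apple is a bad company, they will do poorly this quarter"
)
print(scores)
# {'neg': 0.259, 'neu': 0.741, 'pos': 0.0, 'compound': -0.5423}
```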

## Data Visualization

The sentiment analysis results are visualized using Matplotlib. A bar chart is generated to display the mean compound sentiment scores for each ticker over time. This visualization offers an intuitive view of sentiment trends associated with the selected stocks based on the news headlines.
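
For reference, the aggregation behind the chart can be written more compactly as below. This is only a sketch: it assumes a DataFrame named df with the ticker, date, and compound columns produced by the full script, and it is equivalent to the groupby/unstack/transpose sequence used there.

```python
import matplotlib.pyplot as plt

# df is the headline DataFrame built by the full script
# (columns: ticker, date, time, title, compound).
mean_df = (
    df.groupby(["ticker", "date"])["compound"]
    .mean()       # average compound score per ticker and day
    .unstack(0)   # one column per ticker, one row per date
)
mean_df.plot(kind="bar", figsize=(10, 8))
plt.show()
```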

## Conclusion

This project demonstrates the utilization of web scraping techniques to gather real-time financial news data and sentiment analysis tools to gauge market sentiment around specific stocks. The ability to analyze sentiment from textual data can provide insights into market perception, potentially aiding investors in making informed decisions.

---

## Source Code

```python
from urllib.request import urlopen, Request
from bs4 import BeautifulSoup
import nltk
nltk.download("vader_lexicon")  # download only the VADER lexicon; nltk.download() with no arguments opens the interactive downloader
from nltk.sentiment.vader import SentimentIntensityAnalyzer
import pandas as pd
import matplotlib.pyplot as plt
import time
from urllib.error import HTTPError

finviz_url = "https://finviz.com/quote.ashx?t="

tickers = ["AMD", "AMZN", "NVDA"]

news_tables = {}

# Fetch each ticker's quote page and keep its news table for parsing below
for ticker in tickers:
    url = finviz_url + ticker

    req = Request(url=url, headers={"user-agent":"my program"})

    try:
        response = urlopen(req)
    except HTTPError as e:
        print(f"HTTP Error {e.code}: {e.reason}")
        continue  # Skip to the next ticker in case of error

    html = BeautifulSoup(response, "html.parser")

    news_table = html.find(id="news-table")

    news_tables[ticker] = news_table

    time.sleep(1)  # Pause 1 second between requests so Finviz does not block repeated requests


#print(news_tables)

parsed_data = []

# Flatten each stored news table into rows of [ticker, date, time, title]
for ticker, news_table in news_tables.items():
    for row in news_table.findAll("tr"):
        anchor = row.a  # Get the <a> tag

        if anchor:  # Check if <a> tag exists
            title = anchor.text.strip()  # Remove leading/trailing whitespace
            date_time = row.td.text.strip().split(" ")  # Split date and time

            if len(date_time) == 1:
                news_time = date_time[0]
                news_date = None  # Rows with only a time share the previous headline's date
            else:
                news_date = date_time[0]
                news_time = date_time[1]

            if news_date:  # Only keep rows that carry an explicit date
                parsed_data.append([ticker, news_date, news_time, title])

#print(parsed_data)

df = pd.DataFrame(parsed_data, columns=["ticker", "date", "time", "title"])
#print(df.head())

vader = SentimentIntensityAnalyzer()

# Example: vader.polarity_scores("I think Apple is a bad company, they will do poorly this quarter")
# returns {'neg': 0.259, 'neu': 0.741, 'pos': 0.0, 'compound': -0.5423} (moderately negative)

# Score each headline and keep the compound value, which summarizes overall sentiment in [-1, 1]
f = lambda title: vader.polarity_scores(title)["compound"]
df["compound"] = df["title"].apply(f)

try:
    df["date"] = pd.to_datetime(df["date"]).dt.date
except Exception as e:
    print(f"Error occurred during date conversion: {e}")


mean_df = df.groupby(["ticker", "date"]).mean(numeric_only=True)  # average compound score per ticker and day
mean_df = mean_df.unstack()
mean_df = mean_df.xs("compound", axis="columns").transpose()
mean_df.plot(kind="bar", figsize=(10, 8))

#print(mean_df)

plt.show()
```

Finviz: https://finviz.com/

I recommend running this in Google Colab or a Jupyter notebook to see the charts.

Thanks for reading!

Gerard Puche