Sentiment Analysis | Data Science Portfolio

Best Model Accuracy: 94.2%

Model Approaches: 4

Ensemble Accuracy: 88.7%

Processing Time: <1s

Project Overview

Business Problem

Marketing teams were reacting to PR crises hours too late due to slow batch reporting. They needed immediate visibility into brand sentiment trends.

Solution Approach

Built a flexible architecture to ingest social data. Fine-tuned a BERT transformer model for domain-specific sentiment analysis. Visualised live trends in a Streamlit dashboard for immediate insight.
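
To illustrate how a live trend could be derived from individually scored posts, here is a minimal sketch of a rolling-window average. The class names, the 10-minute window and the [-1, 1] score scale are assumptions for illustration and are not taken from the project's code. It also assumes posts arrive in timestamp order.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class ScoredPost:
    timestamp: float  # seconds since epoch
    score: float      # sentiment score in [-1, 1] (assumed scale)


class RollingSentiment:
    """Keeps the mean sentiment of posts seen in the last `window_seconds`."""

    def __init__(self, window_seconds: float = 600.0):
        self.window_seconds = window_seconds
        self._posts = deque()
        self._total = 0.0

    def add(self, post: ScoredPost) -> None:
        # Assumes posts are added in non-decreasing timestamp order.
        self._posts.append(post)
        self._total += post.score
        self._evict(post.timestamp)

    def _evict(self, now: float) -> None:
        # Drop posts that have fallen out of the time window.
        while self._posts and now - self._posts[0].timestamp > self.window_seconds:
            old = self._posts.popleft()
            self._total -= old.score

    def mean(self) -> float:
        return self._total / len(self._posts) if self._posts else 0.0


trend = RollingSentiment(window_seconds=600)
trend.add(ScoredPost(timestamp=0, score=0.8))
trend.add(ScoredPost(timestamp=300, score=-0.4))
trend.add(ScoredPost(timestamp=900, score=-0.6))  # the post at t=0 is evicted
print(round(trend.mean(), 2))
```

A dashboard could poll `mean()` on each refresh and flag a possible crisis when the value drops below a chosen threshold.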

Business Impact

Reduced crisis response time from 4 hours to under 10 minutes, while processing 50k+ tweets per day with sub-second latency.

Key Features

• Real-time sentiment classification
• Interactive visualization dashboards
• Confidence scoring & uncertainty quantification
• Batch processing capabilities
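
One common way to pair a confidence score with an uncertainty estimate is to report the top class probability alongside the normalised entropy of the class distribution. The sketch below assumes the classifier exposes per-class probabilities. The function name and the three-class label set are hypothetical, and the project may quantify uncertainty differently.

```python
import math


def confidence_and_uncertainty(probs):
    """Return (label, confidence, uncertainty) for a dict of class probabilities.

    confidence  = probability of the most likely class
    uncertainty = Shannon entropy scaled to [0, 1] (1 means a uniform distribution)
    """
    total = sum(probs.values())
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    normalised = {label: p / total for label, p in probs.items()}

    label = max(normalised, key=normalised.get)
    confidence = normalised[label]

    entropy = -sum(p * math.log(p) for p in normalised.values() if p > 0)
    max_entropy = math.log(len(normalised)) if len(normalised) > 1 else 1.0
    return label, confidence, entropy / max_entropy


label, confidence, uncertainty = confidence_and_uncertainty(
    {"positive": 0.89, "neutral": 0.08, "negative": 0.03}
)
print(f"{label}: confidence={confidence:.2f}, uncertainty={uncertainty:.2f}")
```

Predictions with high uncertainty can be routed to a slower, more accurate model or held back from alerts.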

Technical Architecture

System Components

Data Collection

• NLTK Movie Reviews
• Custom datasets
• Real-time API feeds

Processing

• Text cleaning
• Tokenization
• Feature extraction
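
The three processing steps can be sketched with the standard library alone. The regular expressions and the bag-of-words features below are illustrative assumptions. The project itself lists NLTK, which provides more robust tokenizers.

```python
import re
from collections import Counter

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
TOKEN_RE = re.compile(r"[a-z0-9']+")


def clean_text(text):
    """Lowercase the text and strip URLs, @mentions and extra whitespace."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text):
    """Split cleaned text into word tokens, dropping punctuation."""
    return TOKEN_RE.findall(text)


def extract_features(tokens):
    """Bag-of-words features: token -> count."""
    return Counter(tokens)


raw = "Loving the new update from @Brand!! https://example.com So good, so fast."
tokens = tokenize(clean_text(raw))
features = extract_features(tokens)
print(tokens)
print(features["so"])
```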

Models

• VADER
• TextBlob
• ML Classifier (Logistic Regression)
• Transformer (RoBERTa)

# Example: Quick Sentiment Analysis
from src.sentiment_analyzer import SentimentAnalyzer

analyzer = SentimentAnalyzer()
result = analyzer.get_ensemble_prediction(
    "This product exceeded my expectations!"
)

print(f"Sentiment: {result['sentiment']}")
print(f"Confidence: {result['confidence']:.2f}")
# Example output:
# Sentiment: positive
# Confidence: 0.89
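
The internals of `get_ensemble_prediction` are not shown here. As one plausible sketch, an ensemble can take a weighted average of per-model polarity scores. The function below, the example scores, the use of each model's reported accuracy as its weight, and the 0.05 neutral threshold are all assumptions for illustration.

```python
def ensemble_prediction(model_scores, weights, neutral_threshold=0.05):
    """Combine per-model polarity scores in [-1, 1] into a single prediction.

    model_scores: dict of model name -> polarity score
    weights:      dict of model name -> non-negative weight
    """
    total_weight = sum(weights[name] for name in model_scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    combined = sum(
        score * weights[name] for name, score in model_scores.items()
    ) / total_weight

    if combined > neutral_threshold:
        sentiment = "positive"
    elif combined < -neutral_threshold:
        sentiment = "negative"
    else:
        sentiment = "neutral"

    # The magnitude of the combined score is used as a crude confidence proxy.
    return {"sentiment": sentiment, "confidence": abs(combined), "score": combined}


# Hypothetical per-model scores for one sentence; weights reuse the
# accuracies reported in the model comparison.
scores = {"vader": 0.7, "textblob": 0.5, "logreg": 0.8, "roberta": 0.9}
weights = {"vader": 0.851, "textblob": 0.823, "logreg": 0.887, "roberta": 0.942}
result = ensemble_prediction(scores, weights)
print(result["sentiment"], round(result["confidence"], 2))
```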

Model Performance Comparison

VADER (Rule-Based)

Accuracy: 85.1%

Speed: ~0.001s

Best for: Social media, informal text

TextBlob (Statistical)

Accuracy: 82.3%

Speed: ~0.002s

Best for: General text, subjectivity

Logistic Regression (ML)

Accuracy: 88.7%

Speed: ~0.050s

Best for: Structured reviews, domain-specific

RoBERTa (Transformer)

Accuracy: 94.2%

Speed: ~0.300s

Best for: Complex text, nuanced sentiment
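
The accuracy and speed figures above suggest a simple selection rule: pick the most accurate model that fits a given latency budget. The sketch below encodes the reported numbers. The `ModelProfile` class and helper functions are hypothetical, and the speeds are assumed to be per-text processing times.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProfile:
    name: str
    accuracy: float          # reported accuracy, as a fraction
    seconds_per_text: float  # reported speed, assumed to be per text


PROFILES = [
    ModelProfile("VADER", 0.851, 0.001),
    ModelProfile("TextBlob", 0.823, 0.002),
    ModelProfile("Logistic Regression", 0.887, 0.050),
    ModelProfile("RoBERTa", 0.942, 0.300),
]


def pick_model(latency_budget_s):
    """Return the most accurate model whose per-text time fits the budget."""
    candidates = [m for m in PROFILES if m.seconds_per_text <= latency_budget_s]
    if not candidates:
        raise ValueError("no model fits the latency budget")
    return max(candidates, key=lambda m: m.accuracy)


def daily_compute_seconds(model, texts_per_day=50_000):
    """Total sequential processing time for one day's volume."""
    return model.seconds_per_text * texts_per_day


print(pick_model(0.1).name)
print(pick_model(1.0).name)
print(daily_compute_seconds(pick_model(1.0)))
```

At 50,000 texts per day, the transformer's ~0.300s per text adds up to about 15,000 seconds of sequential compute (a little over 4 hours), which a single worker can absorb if the volume is spread across the day.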

Key Insights

• Transformer models achieve the highest accuracy (94.2%) but are the slowest to run (~0.300s)
• VADER excels at social media text with emojis and slang
• Ensemble approach balances accuracy and speed effectively
• ML models require domain-specific training data

Technical Stack

Core Libraries

Python, NLTK, Pandas, NumPy

ML Framework

PyTorch, Hugging Face (BERT), Streamlit

Visualization

Matplotlib, Seaborn, Plotly, WordCloud

Pulse: Live Social Sentiment