Sentiment and Emotion Analysis Pipeline

This project builds a full emotion-classification workflow, from data preprocessing through model training and explainability outputs to a FastAPI-powered web interface.

What this project includes

End-to-end ML pipeline across 4 scripts
Feature engineering with TF-IDF + structural text features
Multi-model training and weighted ensemble evaluation
Explainability artifacts with SHAP and LIME
FastAPI backend + web frontend for live text analysis
CLI validator for backend prediction quality checks
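
To illustrate what "weighted ensemble" means above, here is a minimal sketch of weighted soft voting: each model's class probabilities are averaged using per-model weights. This is an assumption about the general technique, not a copy of the logic in 3_model_training_evaluation.py; the function names, weights, and labels are hypothetical.

```python
from typing import Sequence


def weighted_soft_vote(
    probas: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> list[float]:
    """Average per-model class probabilities, weighting each model.

    probas[m][c] is model m's probability for class c (hypothetical layout).
    """
    if len(probas) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    combined = [0.0] * len(probas[0])
    for model_probs, weight in zip(probas, weights):
        for i, p in enumerate(model_probs):
            # Normalise by the weight total so the result is still a distribution.
            combined[i] += weight * p / total
    return combined


def predict_label(combined: Sequence[float], labels: Sequence[str]) -> str:
    """Return the label with the highest combined probability."""
    best = max(range(len(combined)), key=combined.__getitem__)
    return labels[best]
```

A model with a larger weight pulls the combined distribution toward its own prediction, which is why ensemble weights are usually derived from validation performance.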

Project structure

1_data_preprocessing.py: dataset loading, cleaning, structural feature generation
2_feature_engineering.py: TF-IDF, scaling, label encoding, feature saving
3_model_training_evaluation.py: model training, metrics, plots, SHAP/LIME artifacts
4_finalize_assets.py: final predictions, confidence/intensity summaries
app.py: FastAPI server and inference endpoint
backend_cli_validator.py: CLI test runner for backend prediction validation
results/: generated features, plots, trained models, and final outputs
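
As a rough idea of what "structural feature generation" can look like, the sketch below computes a few surface-level statistics from raw text. The exact features produced by 1_data_preprocessing.py are not listed in this README, so the feature names here are assumptions chosen for illustration.

```python
import re


def structural_features(text: str) -> dict[str, float]:
    """Compute simple surface statistics for one text sample.

    The feature set is hypothetical; it is not taken from the project scripts.
    """
    words = re.findall(r"\w+", text)
    letters = [c for c in text if c.isalpha()]
    return {
        "char_count": len(text),
        "word_count": len(words),
        "exclamation_count": text.count("!"),
        "question_count": text.count("?"),
        # Share of alphabetic characters that are uppercase.
        "uppercase_ratio": sum(c.isupper() for c in letters) / len(letters) if letters else 0.0,
        "avg_word_length": sum(len(w) for w in words) / len(words) if words else 0.0,
    }
```

Features like these are typically scaled and concatenated with the TF-IDF matrix so the models see both vocabulary and writing style.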

Prerequisites

Python 3.10+ (recommended: 3.11)
pip
Git (optional; only needed if you clone the repository)

Setup (Windows PowerShell)

Clone and move into the project folder.

git clone https://github.com/devanshkadu2005/Sentiment.git
cd Sentiment

Create and activate a virtual environment.

python -m venv .venv
.\.venv\Scripts\Activate.ps1

Install Python dependencies.

pip install -r requirements.txt

Setup (macOS/Linux)

git clone https://github.com/devanshkadu2005/Sentiment.git
cd Sentiment
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Run the full pipeline (step by step)

Run these scripts in order:

python 1_data_preprocessing.py
python 2_feature_engineering.py
python 3_model_training_evaluation.py
python 4_finalize_assets.py

After this, all generated assets will be available under results/.
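
If you prefer a single command, the four steps can be chained from Python. This is a minimal sketch, not a script shipped with the repository: run_pipeline and its runner parameter are hypothetical helpers, and the sketch assumes it is executed from the project root with the virtual environment active.

```python
import subprocess
import sys
from typing import Callable, Optional, Sequence

# Script names as listed in this README, in execution order.
PIPELINE_SCRIPTS = (
    "1_data_preprocessing.py",
    "2_feature_engineering.py",
    "3_model_training_evaluation.py",
    "4_finalize_assets.py",
)


def run_pipeline(
    scripts: Sequence[str] = PIPELINE_SCRIPTS,
    runner: Optional[Callable[[list], int]] = None,
) -> list:
    """Run each script in order and stop at the first non-zero exit code.

    `runner` receives the command list and returns an exit code; it defaults
    to subprocess.run and exists mainly so the function can be tested.
    """
    if runner is None:
        runner = lambda cmd: subprocess.run(cmd).returncode

    completed = []
    for script in scripts:
        # sys.executable keeps the run inside the active virtual environment.
        code = runner([sys.executable, script])
        if code != 0:
            raise RuntimeError(f"{script} exited with status {code}")
        completed.append(script)
    return completed
```

Calling run_pipeline() with no arguments runs all four scripts; stopping on the first failure avoids training on stale or partial features.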

Run the API + frontend

You can start the server in either of these ways:

python app.py

or

uvicorn app:app --host 127.0.0.1 --port 8501

Then open:

http://127.0.0.1:8501

API usage

Analyze endpoint

Method: POST
URL: http://127.0.0.1:8501/analyze
JSON body:

{
  "text": "I feel very excited and happy today!"
}

Example (PowerShell)

Invoke-RestMethod -Method Post `
  -Uri "http://127.0.0.1:8501/analyze" `
  -ContentType "application/json" `
  -Body '{"text":"I feel very excited and happy today!"}'
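
Example (Python)

The same request can be sent from Python with only the standard library. This is a sketch: the helper names are hypothetical, and because the response schema is not documented in this README, the parsed JSON is returned as-is rather than mapped to specific fields.

```python
import json
import urllib.request

ANALYZE_URL = "http://127.0.0.1:8501/analyze"


def build_analyze_request(text: str, url: str = ANALYZE_URL) -> urllib.request.Request:
    """Build the POST request with the JSON body described above."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(text: str, url: str = ANALYZE_URL, timeout: float = 30.0):
    """Send the request and return the decoded JSON response unchanged."""
    request = build_analyze_request(text, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, print(analyze("I feel very excited and happy today!")) shows whatever the endpoint returns.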

Run backend CLI validator

Default test suite:

python backend_cli_validator.py

With custom minimum confidence:

python backend_cli_validator.py --min-confidence 0.6

With an additional JSON file of custom test cases:

python backend_cli_validator.py --cases-file custom_cases.json
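
To make the role of a minimum-confidence threshold concrete, the sketch below treats a case as passing only when the predicted label matches the expected one and the confidence meets the threshold. This is an assumption about how such a check generally works, not the actual logic of backend_cli_validator.py; CaseResult, its fields, and the helper names are hypothetical.

```python
from typing import NamedTuple, Sequence


class CaseResult(NamedTuple):
    """One validated prediction (hypothetical structure)."""
    expected: str
    predicted: str
    confidence: float


def case_passes(result: CaseResult, min_confidence: float) -> bool:
    """A case passes if the label is right and the model is confident enough."""
    return result.predicted == result.expected and result.confidence >= min_confidence


def pass_rate(results: Sequence[CaseResult], min_confidence: float) -> float:
    """Fraction of cases that pass; 0.0 for an empty suite."""
    if not results:
        return 0.0
    return sum(case_passes(r, min_confidence) for r in results) / len(results)
```

Raising the threshold makes the check stricter: a correct but low-confidence prediction then counts as a failure.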

Main outputs

results/preprocessing/: processed train/val/test CSV and preprocessing plots
results/features/: feature matrices, encoders, vectorizer, feature metadata
results/training/: trained models, confusion matrices, ROC/F1 plots, SHAP/LIME outputs
results/final/: final prediction CSVs and summary visualizations
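
A quick way to confirm that every stage produced output is to check that each of the four folders exists and is non-empty. This is a minimal sketch with a hypothetical helper name; it does not verify individual file names, because those are not listed in this README.

```python
from pathlib import Path

# Sub-folders of results/ as listed above.
EXPECTED_DIRS = ("preprocessing", "features", "training", "final")


def missing_output_dirs(results_root: str = "results") -> list:
    """Return the expected sub-folders that are absent or empty."""
    root = Path(results_root)
    missing = []
    for name in EXPECTED_DIRS:
        folder = root / name
        # An existing but empty folder means the stage did not write anything.
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(name)
    return missing
```

An empty list means all four stages left something behind; any names returned point to the pipeline step that should be re-run.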

Notes

NLTK resources are downloaded automatically on first run.
If app.py reports missing model files, run steps 1-3 of the pipeline first.
Some steps (especially training + SHAP) may take significant time depending on hardware.
This repository already includes generated artifacts in results/, so you can run app.py directly if files are present.

Troubleshooting

ModuleNotFoundError: run pip install -r requirements.txt again in the active virtual environment.
FastAPI app starts but /analyze fails with missing files: re-run pipeline scripts in order.
Model training is too slow: run only preprocessing and feature engineering first to verify the setup, then run training separately.
