
Madhu

@maddybez

NUS Data Science and Analytics

Joined July 2020

2 Followers

0 Following


Coding for 6 years.

Profile README

This was a group project for CS3244 Machine Learning at NUS. Only part of the project is included here.

Scope of Project:

The viewing experience is often ruined or diminished by spoilers. Spoilers detract from the thrill and the audience's genuine emotional investment, so viewers may choose not to watch a movie that has been spoiled for them, which in turn means lost revenue for the film industry. A model that detects movie spoilers in text across the internet could help preserve this emotional investment and the relationship between viewers, the movie and the film-makers.

Dataset:

IMDb Movie Reviews Dataset

Data Understanding (Exploratory Data Analysis):

- Do certain phrases contribute to a spoiler tag?

- Do certain users (reviewers) post spoilers more frequently?

- Is the length of a review correlated with its spoiler classification?
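The third question above can be checked with a plain correlation between review length and the binary spoiler label. The following is a minimal sketch only: the toy reviews are made up for illustration, and word count is an assumed proxy for length, not necessarily the measure used in the project.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: (review text, is_spoiler). Not taken from the IMDb dataset.
reviews = [
    ("Great film, loved it.", 0),
    ("Solid acting and a tight script.", 0),
    ("The twist is that the detective was the killer all along, and he dies in the final scene.", 1),
    ("Fun watch.", 0),
    ("In the end the ship sinks and the hero does not survive, which I did not expect at all.", 1),
]

# Assumption: length is measured in whitespace-separated tokens.
lengths = [len(text.split()) for text, _ in reviews]
labels = [label for _, label in reviews]

# With a binary label, Pearson's r is the point-biserial correlation.
r = pearson(lengths, labels)
print(f"length/spoiler correlation: {r:.3f}")
```

A value near zero would suggest that length alone carries little signal, while a clearly positive value would suggest that longer reviews are more likely to be tagged as spoilers.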

Word Embeddings:

GloVe
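Pre-trained GloVe vectors are distributed as plain text, one token per line followed by its vector components. As a rough sketch of how such vectors could feed a classifier, the code below parses that format and averages the word vectors of a review. The averaging step, the function names and the 3-dimensional toy vectors are all assumptions for illustration; the README does not say how the embeddings were combined.

```python
import io


def load_glove(handle):
    """Parse GloVe's text format: a token followed by space-separated floats."""
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def average_embedding(tokens, vectors, dim):
    """Mean vector of the known tokens; a zero vector if none are known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return [0.0] * dim
    return [sum(column) / len(known) for column in zip(*known)]


# Made-up 3-dimensional vectors standing in for a real GloVe file.
toy_file = io.StringIO(
    "the 0.1 0.2 0.3\n"
    "killer 0.9 0.0 0.3\n"
    "dies 0.5 0.4 0.0\n"
)
glove = load_glove(toy_file)

# One fixed-length feature vector per review, usable by a linear model.
doc_vector = average_embedding("the killer dies".split(), glove, dim=3)
unknown_vector = average_embedding(["zzz"], glove, dim=3)
```

Averaging discards word order, so it suits the linear models more than the CNN or LSTM, which would normally consume the per-token vectors as a sequence.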

Models:

1. Linear models: SVM, Naive Bayes, Logistic Regression

2. Neural networks: Convolutional Neural Network (CNN), Long Short-Term Memory Network (LSTM)

Model Evaluation:

Metric Choice:

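In the context of this project, a false negative is more harmful than a false positive. Letting spoilers slip through the net means viewers are more likely to read them, which diminishes their viewing experience. This favours metrics that penalise false negatives, such as recall on the spoiler class.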
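To make the trade-off concrete, the sketch below computes precision, recall and an F-beta score by hand. Using beta = 2 (which weights recall more heavily than precision) is an assumption chosen to match the reasoning above, not a metric the README names; the labels and predictions are invented.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (true positives, false positives, false negatives)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp, fp, fn


def precision(y_true, y_pred):
    tp, fp, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if tp + fp else 0.0


def recall(y_true, y_pred):
    tp, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if tp + fn else 0.0


def f_beta(y_true, y_pred, beta=2.0):
    """F-beta score; beta > 1 weights recall more heavily than precision."""
    p = precision(y_true, y_pred)
    r = recall(y_true, y_pred)
    if p == 0.0 and r == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * p * r / (b2 * p + r)


# Invented example: 1 = spoiler, 0 = not a spoiler.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 1, 0, 1, 1, 0, 1]  # catches every spoiler, with two false alarms

print("precision:", round(precision(y_true, y_pred), 3))
print("recall:   ", round(recall(y_true, y_pred), 3))
print("F1:       ", round(f_beta(y_true, y_pred, beta=1.0), 3))
print("F2:       ", round(f_beta(y_true, y_pred, beta=2.0), 3))
```

In this example no spoiler is missed, so F2 scores the classifier higher than F1 does, reflecting the view that false alarms cost less than missed spoilers.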

Learning Reflection

1. Dealing with Imbalanced Datasets:

This was the first project where I had to handle an imbalanced dataset. It was interesting to learn how to apply resampling methods such as oversampling and undersampling.

2. Metric Choice

Learning to choose evaluation metrics that are appropriate to the context of the project.
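The resampling methods mentioned under the first reflection can be sketched in a few lines. This is an illustrative version of random oversampling and undersampling only; the function names, the toy rows and the 8-to-2 class split are assumptions, and the project may well have used a library or a different variant.

```python
import random
from collections import Counter


def _group_by_class(rows, labels):
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return groups


def random_oversample(rows, labels, seed=0):
    """Duplicate random minority rows until every class matches the largest."""
    rng = random.Random(seed)
    groups = _group_by_class(rows, labels)
    target = max(len(group) for group in groups.values())
    out_rows, out_labels = [], []
    for label, group in groups.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_rows.extend(group + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """Drop random majority rows until every class matches the smallest."""
    rng = random.Random(seed)
    groups = _group_by_class(rows, labels)
    target = min(len(group) for group in groups.values())
    out_rows, out_labels = [], []
    for label, group in groups.items():
        out_rows.extend(rng.sample(group, target))
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Hypothetical training split: 8 non-spoilers (0) and 2 spoilers (1).
rows = [f"review_{i}" for i in range(10)]
labels = [0] * 8 + [1] * 2

over_rows, over_labels = random_oversample(rows, labels)
under_rows, under_labels = random_undersample(rows, labels)
print("oversampled: ", Counter(over_labels))
print("undersampled:", Counter(under_labels))
```

Resampling is usually applied to the training split only, so that validation and test scores still reflect the original class balance.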

Public Repos: 3

Total Stars: 0

Total Forks: 0

Public Gists: 0


High productivity periods

Most active day: January 3rd, 2026 • 0 contributions

Most active week: Nov 10–Nov 16 • 0 contributions

Most active month: Apr 2025 • 2 contributions
