Author Attribution

8X8 Project
This project uses traditional machine-learning analysis to identify the author of an email from the large Enron corpus. I built a text classification pipeline that categorizes emails based on their content. It converts the raw text into features with TF-IDF vectorization and trains a linear support vector classifier (LinearSVC) to learn patterns within the data. The workflow includes data loading, feature extraction, model training, performance evaluation, and parameter tuning. For further details on the motivation, dataset, and analysis, see the accompanying paper.
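
The workflow described above can be sketched roughly as follows with scikit-learn. This is a minimal illustration, not the project's actual code: the toy emails, the author labels "alice" and "bob", the helper names build_pipeline and train_and_evaluate, and the grid of C values are all assumptions made for the example, and the real project loads its data from the Enron corpus instead.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Hypothetical stand-in data: invented snippets, NOT taken from the Enron corpus.
emails = [
    "gas trading desk update for the morning session",
    "please review the trading positions before the close",
    "the gas desk needs updated price curves today",
    "trading volumes were high on the west desk",
    "send me the price curves for the gas positions",
    "morning update on desk positions and volumes",
    "the west desk closed its gas positions early",
    "price curves and trading volumes attached",
    "the contract draft needs legal review this week",
    "attached is the revised agreement with counsel comments",
    "legal has concerns about the indemnity clause",
    "please sign the agreement once counsel approves",
    "the indemnity clause in the contract was revised",
    "counsel will review the draft agreement tomorrow",
    "legal comments on the contract draft are attached",
    "the revised clause still needs counsel approval",
]
authors = ["alice"] * 8 + ["bob"] * 8


def build_pipeline(seed=0):
    """TF-IDF feature extraction followed by a linear SVM classifier."""
    return Pipeline([
        ("tfidf", TfidfVectorizer(sublinear_tf=True, stop_words="english")),
        ("svc", LinearSVC(random_state=seed)),
    ])


def train_and_evaluate(texts, labels, seed=0):
    """Split the data, tune C by cross-validation, and score held-out emails."""
    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, test_size=0.25, stratify=labels, random_state=seed
    )
    # Parameter tuning: the candidate C values are illustrative assumptions.
    search = GridSearchCV(build_pipeline(seed), {"svc__C": [0.1, 1.0, 10.0]}, cv=2)
    search.fit(X_train, y_train)
    accuracy = accuracy_score(y_test, search.predict(X_test))
    return search, accuracy


search, accuracy = train_and_evaluate(emails, authors)
print("best parameters:", search.best_params_)
print("held-out accuracy:", accuracy)
```

Wrapping the vectorizer and the classifier in a single Pipeline means the TF-IDF vocabulary is refit on each cross-validation training fold, so the tuning step does not see the held-out text.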
Project Link