ML/DL

Breast Cancer Classification: Comparative ML Analysis

Compared 7 ML models for breast cancer detection using clinical biomarkers, achieving 87% accuracy with Random Forest (AUC: 0.91).

2025 ML/DL

About This Project

Built and compared seven supervised ML models that classify breast cancer from routine blood biomarkers and anthropometric data (age, BMI) in the Breast Cancer Coimbra Dataset (116 women: 64 with cancer, 52 controls). Models evaluated: Naive Bayes, LDA, KNN, Random Forest, Gradient Boosting, SVM, and a Deep Neural Network. Preprocessing and evaluation used z-normalization, factor encoding, stratified train-test splits, and repeated 10-fold cross-validation. Hyperparameter tuning included a sweep over k for KNN and mtry optimization for Random Forest. Key insight: Glucose, Resistin, and Adiponectin were the top predictors, suggesting that routine blood tests can assist early cancer detection.
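The evaluation loop described above (z-normalization, stratified repeated 10-fold cross-validation, and a KNN k sweep) can be sketched roughly as follows. The project itself used R and caret; this is only an illustrative, standard-library Python sketch, not the author's pipeline. All function names, the number of repeats, and the candidate k values are assumptions made for the example.

```python
import math
import random
from collections import Counter
from statistics import mean, stdev


def zscore_fit(rows):
    """Per-column (mean, sample std) estimated from the training rows only."""
    return [(mean(col), stdev(col) or 1.0) for col in zip(*rows)]


def zscore_apply(rows, params):
    """Scale rows with previously fitted parameters."""
    return [[(x - m) / s for x, (m, s) in zip(row, params)] for row in rows]


def stratified_folds(labels, n_folds, rng):
    """Deal the shuffled indices of each class round-robin into n_folds folds."""
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    folds = [[] for _ in range(n_folds)]
    offset = 0
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for j, i in enumerate(idxs):
            folds[(j + offset) % n_folds].append(i)
        offset += len(idxs)  # continue dealing where the previous class stopped
    return folds


def repeated_stratified_cv(labels, n_folds=10, repeats=3, seed=0):
    """Yield (train_idx, test_idx) pairs; repeats=3 is an assumed value."""
    rng = random.Random(seed)
    for _ in range(repeats):
        folds = stratified_folds(labels, n_folds, rng)
        for f, test in enumerate(folds):
            train = [i for g, fold in enumerate(folds) if g != f for i in fold]
            yield sorted(train), sorted(test)


def knn_predict(train_x, train_y, x, k):
    """Majority vote among the k nearest training rows (Euclidean distance)."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def knn_k_sweep(X, y, ks=(1, 3, 5, 7, 9), **cv_args):
    """Mean cross-validated accuracy for each candidate k (assumed grid)."""
    scores = {k: [] for k in ks}
    for train, test in repeated_stratified_cv(y, **cv_args):
        params = zscore_fit([X[i] for i in train])  # scaler fitted on train only
        train_x = zscore_apply([X[i] for i in train], params)
        test_x = zscore_apply([X[i] for i in test], params)
        train_y = [y[i] for i in train]
        for k in ks:
            hits = sum(
                knn_predict(train_x, train_y, x, k) == y[i]
                for x, i in zip(test_x, test)
            )
            scores[k].append(hits / len(test))
    return {k: mean(v) for k, v in scores.items()}
```

Calling `knn_k_sweep(X, y)` with a list of feature rows `X` and class labels `y` returns a dictionary mapping each k to its mean accuracy. Fitting the scaler inside each fold, rather than once on the full dataset, keeps test-fold statistics out of the training step.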

Key Features

Compared 7 models: Naive Bayes, LDA, KNN, RF, GBM, SVM, DNN
Best result: Random Forest — Accuracy 87%, F1 0.88, AUC 0.91
Sensitivity 85%, specificity 90%: balanced performance on cancer and control cases
Repeated 10-fold cross-validation with stratified splits
Hyperparameter tuning: KNN-k sweep, RF mtry optimization
Top predictors: Glucose, Resistin, Adiponectin
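
The reported metrics (accuracy, sensitivity, specificity, F1, AUC) all derive from true labels, predicted labels, and predicted scores. A minimal Python sketch of those definitions follows. It is illustrative only: the project used R, the function names are hypothetical, and the toy labels and scores are made up for the example and are not the project's data.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, specificity and F1 from hard predictions.

    Assumes both classes occur in y_true; otherwise a division by zero occurs.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),   # share of cancer cases detected
        "specificity": tn / (tn + fp),   # share of controls correctly cleared
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


def roc_auc(y_true, scores, positive=1):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example with made-up values (1 = cancer, 0 = control).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.1, 0.2, 0.3, 0.6]
y_pred = [int(s >= 0.5) for s in scores]  # 0.5 threshold is an assumption

metrics = binary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Sensitivity and specificity depend on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is why both kinds of metric are reported side by side.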

Technologies

R, caret, Random Forest, SVM, KNN, Gradient Boosting, DNN

More Projects

DocuChat: Voice + Text RAG Chatbot

EvidenceCV: RAG-Powered Resume Engine

BitPredict: Bitcoin Price Forecasting with Neural Networks