This project develops a machine learning model that classifies breast tumors as malignant or benign. Using a well-known breast cancer dataset, it applies several data preprocessing techniques, reduces dimensionality with Principal Component Analysis (PCA), and builds a predictive model with Logistic Regression.
Data Preprocessing: The dataset undergoes initial preprocessing, including null value removal, feature scaling, and label encoding.
PCA for Dimensionality Reduction: PCA projects the high-dimensional feature set onto a smaller number of components while retaining most of the variance, which can make training more efficient and help reduce overfitting.
Model Development: A Logistic Regression model is trained on the PCA-transformed data.
Visualization: The PCA-transformed data is visualized to understand the data's distribution in the reduced space.
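The preprocessing, PCA, and modelling steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code. It assumes scikit-learn's bundled `load_breast_cancer` dataset as a stand-in for the project's data, and the 95% variance threshold, the 80/20 split, and the `diagnosis` column name are all illustrative choices.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Assumption: scikit-learn's bundled dataset stands in for the project's data.
bunch = load_breast_cancer(as_frame=True)
df = bunch.data.copy()
# Rebuild a text label column (hypothetical name) so label encoding has work to do.
names = {i: str(n) for i, n in enumerate(bunch.target_names)}
df["diagnosis"] = bunch.target.map(names)

# Preprocessing: drop rows containing nulls, then encode text labels as integers.
df = df.dropna()
X = df.drop(columns="diagnosis")
y = LabelEncoder().fit_transform(df["diagnosis"])  # benign -> 0, malignant -> 1

# Hold out a test set before fitting anything, so no test data leaks into
# the scaler or the PCA projection.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Feature scaling: PCA is sensitive to feature scale, so standardize first.
scaler = StandardScaler().fit(X_train)

# PCA: keep as many components as needed to explain 95% of the variance
# (the threshold is an assumption, not taken from the project).
pca = PCA(n_components=0.95).fit(scaler.transform(X_train))
X_train_pca = pca.transform(scaler.transform(X_train))
X_test_pca = pca.transform(scaler.transform(X_test))

# Logistic Regression trained on the PCA-transformed features.
model = LogisticRegression(max_iter=1000).fit(X_train_pca, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test_pca))
print(f"components kept: {pca.n_components_}, test accuracy: {accuracy:.3f}")
```

Fitting the scaler and PCA on the training split only is a deliberate choice here: it keeps the reported test accuracy from being inflated by information from the held-out rows.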
Technologies Used: Python
Libraries: Pandas, NumPy, Scikit-Learn, Matplotlib
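For the visualization step, one plausible approach with Matplotlib is a scatter plot of the first two principal components, colored by class. The sketch below again assumes scikit-learn's bundled `load_breast_cancer` dataset, and the output file name `pca_scatter.png` is hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to use plt.show() instead
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: the bundled dataset stands in for the project's data.
data = load_breast_cancer()

# Standardize the features, then project them onto the first two components.
X_scaled = StandardScaler().fit_transform(data.data)
X_2d = PCA(n_components=2).fit_transform(X_scaled)

# One scatter series per class so each gets its own color and legend entry.
fig, ax = plt.subplots(figsize=(6, 5))
for label, name in enumerate(data.target_names):
    mask = data.target == label
    ax.scatter(X_2d[mask, 0], X_2d[mask, 1], s=12, alpha=0.7, label=str(name))

ax.set_xlabel("Principal component 1")
ax.set_ylabel("Principal component 2")
ax.set_title("Breast cancer data in PCA space")
ax.legend()
fig.savefig("pca_scatter.png", dpi=150)  # hypothetical output path
```

Two components are used only because they are easy to plot; the model itself may be trained on more components than the plot shows.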