This repository contains my solution to the classic Kaggle competition: Titanic - Machine Learning from Disaster. The goal is to predict which passengers survived the Titanic shipwreck using a classification model.
- Competition: Titanic - Machine Learning from Disaster
- Model: Random Forest Classifier
- Public Score: 0.76076
- Best Score: 0.76076 (Version 1)
The dataset includes passenger details such as age, gender, ticket class, number of siblings/spouses aboard, and fare. These features were used to build the model.
The following preprocessing steps were applied:
- Dropped unnecessary columns: PassengerId, Name, Ticket, Cabin
- Filled missing values:
  - Age: filled with the median
  - Embarked: filled with the mode ('S')
  - Fare: filled with the median (test set only)
- Converted categorical variables:
  - Sex: binary mapping
  - Embarked: one-hot encoding
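The preprocessing steps above can be sketched as a single function. This is a minimal illustration, assuming the column names from the original Kaggle CSVs; the exact fill order and mapping values in the notebook may differ.

```python
import pandas as pd


def preprocess(df: pd.DataFrame, is_test: bool = False) -> pd.DataFrame:
    """Sketch of the preprocessing pipeline described above (an assumption, not the exact notebook code)."""
    # Drop columns not used as features
    df = df.drop(columns=["PassengerId", "Name", "Ticket", "Cabin"])

    # Fill missing values
    df["Age"] = df["Age"].fillna(df["Age"].median())
    df["Embarked"] = df["Embarked"].fillna("S")  # mode of the training set
    if is_test:
        df["Fare"] = df["Fare"].fillna(df["Fare"].median())  # Fare is only missing in the test set

    # Encode categorical variables
    df["Sex"] = df["Sex"].map({"male": 0, "female": 1})  # binary mapping (0/1 choice is an assumption)
    df = pd.get_dummies(df, columns=["Embarked"], prefix="Embarked")  # one-hot encoding
    return df
```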
- Algorithm: RandomForestClassifier from sklearn.ensemble
- Training-Validation Split: 80% training / 20% validation
- Selected Features: Pclass, Sex, Age, SibSp, Parch, Fare, and one-hot encoded Embarked
The model was trained and evaluated using basic performance metrics.
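The training and evaluation step can be sketched as follows. This is an illustrative version, assuming default RandomForestClassifier hyperparameters and accuracy as the "basic" metric; the notebook's actual settings may differ.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def train_and_evaluate(train_df, random_state=42):
    """Fit a Random Forest and report held-out accuracy (hyperparameters are an assumption)."""
    X = train_df.drop(columns=["Survived"])
    y = train_df["Survived"]
    # 80% training / 20% validation, as described above
    X_tr, X_val, y_tr, y_val = train_test_split(
        X, y, test_size=0.2, random_state=random_state
    )
    model = RandomForestClassifier(random_state=random_state)
    model.fit(X_tr, y_tr)
    val_acc = accuracy_score(y_val, model.predict(X_val))
    return model, val_acc
```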
- Achieved a public Kaggle score of 0.76076
- This was the first version of the model, and it performed well on the leaderboard.
Planned improvements and experiments:
- Try other models (e.g., Logistic Regression, XGBoost)
- Perform hyperparameter tuning using GridSearchCV
- Use feature importance to select or engineer better features
- Consider using cross-validation for more reliable evaluation
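The tuning and cross-validation ideas above could be combined in a small grid search. This is a sketch only; the parameter grid is a hypothetical starting point, not one taken from the notebook.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV


def tune_random_forest(X, y, cv=5):
    """Grid search over a few Random Forest hyperparameters (illustrative grid, an assumption)."""
    param_grid = {
        "n_estimators": [50, 100],
        "max_depth": [None, 5, 10],
        "min_samples_split": [2, 5],
    }
    search = GridSearchCV(
        RandomForestClassifier(random_state=42),
        param_grid,
        cv=cv,                 # k-fold cross-validation for more reliable evaluation
        scoring="accuracy",
        n_jobs=-1,
    )
    search.fit(X, y)
    return search.best_estimator_, search.best_params_, search.best_score_
```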
- Kaggle Notebook: Titanic - Random Forest v1
- Competition Page: Kaggle Titanic
Kaggle: kaggle.com/busradeveci