Soumya-Kushwaha/AquaSense

AquaSense 🌊

A data analysis and machine learning project that predicts water potability from water quality measurements.

📊 Project Overview

AquaSense analyzes water quality data to determine potability using multiple machine learning models. The project includes extensive data visualization, preprocessing, and comparative analysis of different classification algorithms.

🔬 Features

  • Comprehensive exploratory data analysis (EDA)
  • Interactive visualizations using Plotly and Seaborn
  • Missing data handling and preprocessing
  • Implementation of 7 different machine learning models:
    • Logistic Regression
    • Decision Tree Classifier
    • Random Forest Classifier
    • XGBoost Classifier
    • K-Nearest Neighbors
    • Support Vector Machine
    • AdaBoost Classifier
  • Model performance comparison and evaluation
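The missing-data handling and preprocessing steps listed above could look roughly like the sketch below. It assumes median imputation and standard scaling, neither of which this README specifies, and it uses a small synthetic frame (`make_demo_frame`, a hypothetical helper) in place of the real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def make_demo_frame(n_rows=200, seed=0):
    """Build a small synthetic frame with gaps (a stand-in, not the real data)."""
    rng = np.random.default_rng(seed)
    frame = pd.DataFrame({
        "ph": rng.normal(7.0, 1.5, n_rows),
        "Sulfate": rng.normal(330.0, 40.0, n_rows),
        "Turbidity": rng.normal(4.0, 0.8, n_rows),
    })
    # Blank out roughly 10% of two columns to mimic missing measurements.
    frame.loc[rng.random(n_rows) < 0.1, "ph"] = np.nan
    frame.loc[rng.random(n_rows) < 0.1, "Sulfate"] = np.nan
    frame["Potability"] = rng.integers(0, 2, n_rows)
    return frame


def preprocess(frame, target="Potability", test_size=0.2, seed=0):
    """Split first, then fit the imputer and scaler on the training rows only."""
    X = frame.drop(columns=[target])
    y = frame[target]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )
    # Assumed choices: median imputation, then zero-mean/unit-variance scaling.
    prep = make_pipeline(SimpleImputer(strategy="median"), StandardScaler())
    X_train_p = prep.fit_transform(X_train)
    X_test_p = prep.transform(X_test)
    return X_train_p, X_test_p, y_train, y_test


X_train, X_test, y_train, y_test = preprocess(make_demo_frame())
print(X_train.shape, X_test.shape)
```

Fitting the imputer and scaler after the split keeps test-set statistics out of the training step, which matters when the models are later compared on held-out accuracy.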

📈 Dataset

The project uses a water potability dataset with the following features:

  • pH value
  • Hardness
  • Solids
  • Chloramines
  • Sulfate
  • Conductivity
  • Organic carbon
  • Trihalomethanes
  • Turbidity
  • Potability (target variable)
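A loader that checks for these columns up front can catch a wrong or truncated file early. The sketch below is illustrative only: the exact CSV header names in `EXPECTED_COLUMNS` are an assumption (this README lists the features but not their spelling in the file), and the two data rows are made up so the example runs without the real dataset.

```python
import io

import pandas as pd

# Assumed header names; adjust them to match the actual CSV.
EXPECTED_COLUMNS = [
    "ph", "Hardness", "Solids", "Chloramines", "Sulfate", "Conductivity",
    "Organic_carbon", "Trihalomethanes", "Turbidity", "Potability",
]


def load_water_data(source):
    """Read the CSV and fail early if any expected column is missing."""
    frame = pd.read_csv(source)
    missing = [c for c in EXPECTED_COLUMNS if c not in frame.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return frame[EXPECTED_COLUMNS]


# Two made-up rows (not real measurements); empty fields become NaN.
demo_csv = io.StringIO(
    "ph,Hardness,Solids,Chloramines,Sulfate,Conductivity,"
    "Organic_carbon,Trihalomethanes,Turbidity,Potability\n"
    "7.1,204.9,20791.3,7.3,368.5,564.3,10.4,87.0,3.0,0\n"
    ",129.4,18630.1,6.6,,592.9,15.2,56.3,4.5,1\n"
)
df = load_water_data(demo_csv)
print(df.dtypes)
print(df["Potability"].value_counts())
```

`load_water_data` also accepts a file path, since `pandas.read_csv` takes either a path or a file-like object.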

🛠️ Technical Stack

  • Python: Core programming language
  • Data Processing: Pandas, NumPy
  • Visualization:
    • Matplotlib
    • Seaborn
    • Plotly Express
  • Machine Learning:
    • Scikit-learn
    • XGBoost
  • Development Environment: Jupyter Notebook

📊 Visualizations Include

  • Correlation heatmaps
  • Distribution plots
  • Box plots
  • Violin plots
  • Pair plots
  • Interactive Plotly visualizations
  • Missing data analysis
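As a rough illustration of two items from this list, the sketch below tabulates missing values per column and draws a correlation heatmap with Seaborn. The helper names (`missing_summary`, `plot_correlation_heatmap`) and the four-row demo frame are hypothetical; the notebook's own plotting code may differ.

```python
import numpy as np
import pandas as pd


def missing_summary(frame):
    """Count and percentage of missing values per column, worst first."""
    counts = frame.isna().sum()
    summary = pd.DataFrame({
        "missing": counts,
        "percent": (counts / len(frame) * 100).round(2),
    })
    return summary.sort_values("missing", ascending=False)


def plot_correlation_heatmap(frame):
    """Draw an annotated correlation heatmap (needs matplotlib and seaborn)."""
    import matplotlib.pyplot as plt
    import seaborn as sns

    fig, ax = plt.subplots(figsize=(8, 6))
    sns.heatmap(frame.corr(), annot=True, fmt=".2f", cmap="coolwarm", ax=ax)
    ax.set_title("Feature correlations")
    return fig


# Tiny made-up frame so the summary can be checked by eye.
demo = pd.DataFrame({
    "ph": [7.0, np.nan, 6.5, 8.1],
    "Sulfate": [330.0, np.nan, np.nan, 310.0],
    "Turbidity": [3.9, 4.2, 3.1, 4.8],
})
print(missing_summary(demo))
```

In a notebook, calling `plot_correlation_heatmap(demo)` in a cell renders the figure inline.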

🤖 Machine Learning Models Performance

Model                  | Accuracy Score
Logistic Regression    |
Decision Tree          |
Random Forest          |
XGBoost                |
K-Nearest Neighbors    |
SVM                    |
AdaBoost               |
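A table like the one above can be produced by fitting each model on the same split and collecting test accuracy. The sketch below is a minimal version that runs on synthetic data from `make_classification`, so the numbers it prints say nothing about the real dataset. The hyperparameters are library defaults chosen for illustration, and XGBoost is included only if it is installed.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in with 9 features, matching the dataset's feature count.
X, y = make_classification(
    n_samples=400, n_features=9, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Scale-sensitive models get a StandardScaler in front; tree models do not need it.
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "K-Nearest Neighbors": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "AdaBoost": AdaBoostClassifier(random_state=0),
}
try:
    from xgboost import XGBClassifier

    models["XGBoost"] = XGBClassifier(n_estimators=100, random_state=0)
except ImportError:
    pass  # xgboost is optional for this sketch

rows = []
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))
    rows.append({"Model": name, "Accuracy Score": accuracy})

results = (
    pd.DataFrame(rows)
    .sort_values("Accuracy Score", ascending=False)
    .reset_index(drop=True)
)
print(results.to_string(index=False))
```

Using one fixed split for every model keeps the comparison fair; cross-validation would give more stable estimates at the cost of longer runtime.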

🚀 Getting Started

  1. Clone the repository:
     git clone https://github.com/Soumya-Kushwaha/AquaSense.git
     cd AquaSense
  2. Install required packages:
     pip install -r requirements.txt
  3. Run Jupyter Notebook:
     jupyter notebook
  4. Open AquaSense.ipynb to view the analysis.

📋 Prerequisites

  • Python 3.x
  • Jupyter Notebook
  • Required Python packages:
    • pandas
    • numpy
    • matplotlib
    • seaborn
    • plotly
    • scikit-learn
    • xgboost
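Before opening the notebook, a quick check that the packages listed above are importable can save a failed first cell. This is a small sketch using only the standard library; the `REQUIRED` mapping and `missing_packages` helper are illustrative names, not part of the project.

```python
from importlib.util import find_spec

# Maps pip package name -> import name (scikit-learn imports as "sklearn").
REQUIRED = {
    "pandas": "pandas",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "plotly": "plotly",
    "scikit-learn": "sklearn",
    "xgboost": "xgboost",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose modules cannot be found."""
    return [pkg for pkg, module in required.items() if find_spec(module) is None]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```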

🔍 Key Findings

  • Comprehensive analysis of water quality parameters
  • Identification of key factors affecting water potability
  • Comparative analysis of different machine learning approaches
  • Model performance evaluation using various metrics
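Evaluating with more than accuracy might look like the sketch below, which computes precision, recall, F1, and a confusion matrix with scikit-learn. Which metrics the notebook actually reports is not stated here, so this selection is an assumption, and the labels are hand-written for illustration.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)


def evaluate(y_true, y_pred):
    """Collect common binary-classification metrics in one dict."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred),
        "recall": recall_score(y_true, y_pred),
        "f1": f1_score(y_true, y_pred),
        # Rows are true classes (0, 1); columns are predicted classes (0, 1).
        "confusion_matrix": confusion_matrix(y_true, y_pred).tolist(),
    }


# Hand-written labels: 1 = potable, 0 = not potable.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

report = evaluate(y_true, y_pred)
for name, value in report.items():
    print(f"{name}: {value}")
```

For a potability classifier, precision on the potable class is worth watching closely, since a false "potable" prediction is the costlier mistake.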

👥 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📜 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

📧 Contact

For any queries or suggestions, please reach out through GitHub issues.


Developed with ❤️ by Soumya Kushwaha
