• 🧠 Building a Logistic Regression Model with Polynomial Features — Real-World Medical Application

     

    🔍 Introduction

    In real-life scenarios like medical diagnosis, classifying patients as healthy or diseased often involves complex relationships between multiple biological indicators. These relationships are non-linear, which means a straight line (linear model) simply isn’t good enough. To tackle this, I designed a Logistic Regression model with Polynomial Features that can handle such non-linear decision boundaries with high accuracy.

    In this blog post, I’ll walk you through:

    • What I built and why

    • How I created synthetic but realistic data

    • The algorithms and techniques I used

    • The results, visualization, and GitHub repository

    🔗 GitHub Repository:
    👉 Logistic Regression with Polynomial Features


    🏥 Real-World Analogy: A Medical Diagnosis System

    Imagine a simple system that predicts whether a patient is sick or healthy based on two lab test results (e.g., X1 = blood sugar, X2 = blood pressure). If one class clusters inside the other, forming a circular or ring-shaped pattern, we need something more powerful than a straight-line model.

    That’s where Polynomial Logistic Regression comes in.


    📘 Techniques and Algorithms Used

    Here are all the techniques and components that make up this project:

    🔹 1. Data Generation (Synthetic Medical Data)

    • Two features per sample (X1 and X2)

    • 100 total samples:

      • Class 0 (Healthy) → Clustered near the center (inner circle)

      • Class 1 (Diseased) → Spread in an outer ring
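    The repository's exact generation code isn't shown here, but a minimal sketch of this two-cluster layout (inner circle vs. outer ring, 50 samples each) could look like the following. The radii bounds and seed are illustrative choices, not the repo's actual values:

```python
import numpy as np

rng = np.random.default_rng(42)

# Class 0 (Healthy): 50 points clustered near the center (inner circle)
r0 = rng.uniform(0.0, 1.0, 50)
t0 = rng.uniform(0.0, 2 * np.pi, 50)
X0 = np.column_stack([r0 * np.cos(t0), r0 * np.sin(t0)])

# Class 1 (Diseased): 50 points spread in an outer ring
r1 = rng.uniform(2.0, 3.0, 50)
t1 = rng.uniform(0.0, 2 * np.pi, 50)
X1 = np.column_stack([r1 * np.cos(t1), r1 * np.sin(t1)])

X = np.vstack([X0, X1])                            # shape (100, 2)
y = np.concatenate([np.zeros(50), np.ones(50)])    # 0 = healthy, 1 = diseased
```

    Sampling a radius and an angle, then converting to Cartesian coordinates, guarantees the ring shape regardless of the seed.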

    🔹 2. Logistic Regression Model

    • Binary Classification using the Sigmoid function:

      \sigma(z) = \frac{1}{1 + e^{-z}}
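    In code, the sigmoid is a one-liner; the clipping here is an extra numerical-stability guard I'm adding for illustration (large negative z would otherwise overflow `np.exp`):

```python
import numpy as np

def sigmoid(z):
    """Logistic function sigma(z) = 1 / (1 + e^(-z)), clipped for stability."""
    return 1.0 / (1.0 + np.exp(-np.clip(z, -500, 500)))
```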

    🔹 3. Polynomial Feature Mapping

    • Instead of using raw features [x1, x2], we map them to:

      [1, x_1, x_2, x_1^2, x_1 x_2, x_2^2]
    • This allows the model to learn curved (non-linear) decision boundaries.
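    A sketch of this degree-2 mapping (the function name `map_polynomial` is mine, not necessarily the repo's):

```python
import numpy as np

def map_polynomial(X):
    """Map raw features [x1, x2] to [1, x1, x2, x1^2, x1*x2, x2^2]."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones_like(x1), x1, x2, x1**2, x1 * x2, x2**2])
```

    The bias column of ones means the intercept can be folded into the weight vector, though the gradient-descent step below keeps a separate b for clarity.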

    🔹 4. Cost Function

    • Standard logistic loss:

      J(w, b) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y_i \log\left(f_{w,b}(x_i)\right) + (1 - y_i) \log\left(1 - f_{w,b}(x_i)\right) \right]
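    As a sketch (names and the epsilon guard are my additions), the loss translates directly to vectorized NumPy:

```python
import numpy as np

def compute_cost(X_poly, y, w, b):
    """Mean binary cross-entropy loss for logistic regression."""
    f = 1.0 / (1.0 + np.exp(-(X_poly @ w + b)))  # f_wb(x_i) for every sample
    eps = 1e-15                                  # guard against log(0)
    return -np.mean(y * np.log(f + eps) + (1 - y) * np.log(1 - f + eps))
```

    A useful sanity check: with all-zero weights every prediction is 0.5, so the cost should equal log(2) regardless of the labels.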

    🔹 5. Gradient Descent Optimization

    • I implemented gradient descent manually to minimize the cost:

      • Compute gradients: ∂J/∂w and ∂J/∂b

      • Update weights iteratively using learning rate α = 0.01

      • Total iterations: 10,000
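    The update loop above can be sketched as follows; the defaults mirror the stated hyperparameters (alpha = 0.01, 10,000 iterations), while the function name and structure are illustrative:

```python
import numpy as np

def gradient_descent(X_poly, y, alpha=0.01, iters=10_000):
    """Batch gradient descent on the logistic loss; returns learned (w, b)."""
    m, n = X_poly.shape
    w, b = np.zeros(n), 0.0
    for _ in range(iters):
        f = 1.0 / (1.0 + np.exp(-(X_poly @ w + b)))  # current predictions
        err = f - y
        w -= alpha * (X_poly.T @ err / m)            # w := w - alpha * dJ/dw
        b -= alpha * err.mean()                      # b := b - alpha * dJ/db
    return w, b
```

    For the logistic loss, the gradient conveniently reduces to the prediction error times the inputs, so no explicit derivative of the sigmoid is needed.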

    🔹 6. Accuracy Evaluation

    • The model achieved ~95% to 100% training accuracy, depending on the random seed.
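    Training accuracy here means thresholding the sigmoid output at 0.5 and comparing against the labels, roughly:

```python
import numpy as np

def accuracy(X_poly, y, w, b):
    """Fraction of samples whose 0.5-thresholded prediction matches the label."""
    f = 1.0 / (1.0 + np.exp(-(X_poly @ w + b)))
    return np.mean((f >= 0.5).astype(int) == y)
```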

    🔹 7. Visualization

    • Scatter plot of the original dataset

    • Decision boundary plotted using a contour plot

    • Clearly shows the model has learned a curved boundary that separates both classes effectively


    🧪 Results

    • Training Accuracy: ~95%–100%

    • 🟦 Class 1 (Diseased): Blue circles

    • 🟥 Class 0 (Healthy): Red crosses

    • 📈 Decision Boundary: Curved yellow line

    • 🔍 Boundary learned only because of polynomial features; without them, the model would fail

    Decision Boundary Output


    📁 Project Structure

    plaintext
    logistic-regression-polynomial/
    ┣ logistic_diagnosis.py   # Main model code
    ┗ README.md               # GitHub readme

    💻 How to Run the Code

    1. Clone the repository:

    bash
    git clone https://github.com/your-username/logistic-regression-polynomial
    cd logistic-regression-polynomial
    2. Install required libraries:

    bash
    pip install numpy matplotlib
    3. Run the model:

    bash
    python logistic_diagnosis.py

    You’ll see both the original dataset and the decision boundary plotted.


    💬 Conclusion

    This project demonstrates how basic algorithms, when combined with feature engineering (like polynomial expansion), can solve non-linear classification problems — just like a real-world medical diagnosis system might require. Everything was implemented from scratch, without relying on any machine learning libraries like Scikit-learn or TensorFlow.


    💡 What’s Next?

    • Add L2 regularization to prevent overfitting

    • Try degree 3 or 4 polynomials

    • Build a web interface (Flask/Streamlit) to test predictions

    • Test on real-world healthcare datasets
