• 🧠 Building a Logistic Regression Model with Polynomial Features — Real-World Medical Application

     

    🔍 Introduction

    In real-life scenarios like medical diagnosis, classifying patients as healthy or diseased often involves complex relationships between multiple biological indicators. These relationships are non-linear, which means a straight line (linear model) simply isn’t good enough. To tackle this, I designed a Logistic Regression model with Polynomial Features that can handle such non-linear decision boundaries with high accuracy.

    In this blog post, I’ll walk you through:

    • What I built and why

    • How I created synthetic but realistic data

    • The algorithms and techniques I used

    • The results, visualization, and GitHub repository

    🔗 GitHub Repository:
    👉 Logistic Regression with Polynomial Features


    🏥 Real-World Analogy: A Medical Diagnosis System

    Imagine a simple system that predicts whether a patient is sick or healthy based on two lab test results (e.g., X1 = blood sugar, X2 = blood pressure). If one class clusters inside the other, forming a circular or ring-shaped pattern, we need something more powerful than a straight-line model.

    That’s where Polynomial Logistic Regression comes in.


    📘 Techniques and Algorithms Used

    Here are all the techniques and components that make up this project:

    🔹 1. Data Generation (Synthetic Medical Data)

    • Two features per sample (X1 and X2)

    • 100 total samples:

      • Class 0 (Healthy) → Clustered near the center (inner circle)

      • Class 1 (Diseased) → Spread in an outer ring
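    The repository's exact generation code isn't shown here, but a minimal sketch of this two-cluster layout (inner circle vs. outer ring, 50 samples each) could look like the following. The radii bounds and seed are illustrative choices, not the repo's actual values:

```python
import numpy as np

rng = np.random.default_rng(42)

# Class 0 (Healthy): 50 points clustered near the center (inner circle)
r0 = rng.uniform(0.0, 1.0, 50)
t0 = rng.uniform(0.0, 2 * np.pi, 50)
X0 = np.column_stack([r0 * np.cos(t0), r0 * np.sin(t0)])

# Class 1 (Diseased): 50 points spread in an outer ring
r1 = rng.uniform(2.0, 3.0, 50)
t1 = rng.uniform(0.0, 2 * np.pi, 50)
X1 = np.column_stack([r1 * np.cos(t1), r1 * np.sin(t1)])

X = np.vstack([X0, X1])                            # shape (100, 2)
y = np.concatenate([np.zeros(50), np.ones(50)])    # 0 = healthy, 1 = diseased
```

    Sampling a radius and an angle, then converting to Cartesian coordinates, guarantees the ring shape regardless of the seed.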

    🔹 2. Logistic Regression Model

    • Binary Classification using the Sigmoid function:

      \sigma(z) = \frac{1}{1 + e^{-z}}
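    In code, the sigmoid is a one-liner; the clipping here is an extra numerical-stability guard I'm adding for illustration (large negative z would otherwise overflow `np.exp`):

```python
import numpy as np

def sigmoid(z):
    """Logistic function sigma(z) = 1 / (1 + e^(-z)), clipped for stability."""
    return 1.0 / (1.0 + np.exp(-np.clip(z, -500, 500)))
```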

    🔹 3. Polynomial Feature Mapping

    • Instead of using raw features [x1, x2], we map them to:

      [1, x_1, x_2, x_1^2, x_1 x_2, x_2^2]
    • This allows the model to learn curved (non-linear) decision boundaries.
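    A sketch of this degree-2 mapping (the function name `map_polynomial` is mine, not necessarily the repo's):

```python
import numpy as np

def map_polynomial(X):
    """Map raw features [x1, x2] to [1, x1, x2, x1^2, x1*x2, x2^2]."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones_like(x1), x1, x2, x1**2, x1 * x2, x2**2])
```

    The bias column of ones means the intercept can be folded into the weight vector, though the gradient-descent step below keeps a separate b for clarity.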

    🔹 4. Cost Function

    • Standard logistic loss:

      J(w, b) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y_i \log\left(f_{w,b}(x_i)\right) + (1 - y_i) \log\left(1 - f_{w,b}(x_i)\right) \right]
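    As a sketch (names and the epsilon guard are my additions), the loss translates directly to vectorized NumPy:

```python
import numpy as np

def compute_cost(X_poly, y, w, b):
    """Mean binary cross-entropy loss for logistic regression."""
    f = 1.0 / (1.0 + np.exp(-(X_poly @ w + b)))  # f_wb(x_i) for every sample
    eps = 1e-15                                  # guard against log(0)
    return -np.mean(y * np.log(f + eps) + (1 - y) * np.log(1 - f + eps))
```

    A useful sanity check: with all-zero weights every prediction is 0.5, so the cost should equal log(2) regardless of the labels.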

    🔹 5. Gradient Descent Optimization

    • I implemented gradient descent manually to minimize the cost:

      • Compute gradients: ∂J/∂w and ∂J/∂b

      • Update weights iteratively using learning rate α = 0.01

      • Total iterations: 10,000
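    The update loop above can be sketched as follows; the defaults mirror the stated hyperparameters (alpha = 0.01, 10,000 iterations), while the function name and structure are illustrative:

```python
import numpy as np

def gradient_descent(X_poly, y, alpha=0.01, iters=10_000):
    """Batch gradient descent on the logistic loss; returns learned (w, b)."""
    m, n = X_poly.shape
    w, b = np.zeros(n), 0.0
    for _ in range(iters):
        f = 1.0 / (1.0 + np.exp(-(X_poly @ w + b)))  # current predictions
        err = f - y
        w -= alpha * (X_poly.T @ err / m)            # w := w - alpha * dJ/dw
        b -= alpha * err.mean()                      # b := b - alpha * dJ/db
    return w, b
```

    For the logistic loss, the gradient conveniently reduces to the prediction error times the inputs, so no explicit derivative of the sigmoid is needed.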

    🔹 6. Accuracy Evaluation

    • The model achieved ~95% to 100% training accuracy, depending on the random seed.
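    Training accuracy here means thresholding the sigmoid output at 0.5 and comparing against the labels, roughly:

```python
import numpy as np

def accuracy(X_poly, y, w, b):
    """Fraction of samples whose 0.5-thresholded prediction matches the label."""
    f = 1.0 / (1.0 + np.exp(-(X_poly @ w + b)))
    return np.mean((f >= 0.5).astype(int) == y)
```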

    🔹 7. Visualization

    • Scatter plot of the original dataset

    • Decision boundary plotted using a contour plot

    • Clearly shows the model has learned a curved boundary that separates both classes effectively


    🧪 Results

    • Training Accuracy: ~95%–100%

    • 🟦 Class 1 (Diseased): Blue circles

    • 🟥 Class 0 (Healthy): Red crosses

    • 📈 Decision Boundary: Curved yellow line

    • 🔍 Boundary learned only because of polynomial features; without them, the model would fail

    Decision Boundary Output


    📁 Project Structure

    plaintext
    logistic-regression-polynomial/
    ┣ logistic_diagnosis.py   # Main model code
    ┗ README.md               # GitHub readme

    💻 How to Run the Code

    1. Clone the repository:

    bash
    git clone https://github.com/your-username/logistic-regression-polynomial
    cd logistic-regression-polynomial
    2. Install required libraries:

    bash
    pip install numpy matplotlib
    3. Run the model:

    bash
    python logistic_diagnosis.py

    You’ll see both the original dataset and the decision boundary plotted.


    💬 Conclusion

    This project demonstrates how basic algorithms, when combined with feature engineering (like polynomial expansion), can solve non-linear classification problems — just like a real-world medical diagnosis system might require. Everything was implemented from scratch, without relying on any machine learning libraries like Scikit-learn or TensorFlow.


    💡 What’s Next?

    • Add L2 regularization to prevent overfitting

    • Try degree 3 or 4 polynomials

    • Build a web interface (Flask/Streamlit) to test predictions

    • Test on real-world healthcare datasets
