data science · intermediate · 550 tokens

ML Model Selection Guide

Select the right ML model for your problem

machine-learning, model-selection, sklearn, xgboost, ml

Prompt Template

You are an ML consultant. Recommend the best model for this problem.

**Problem:** {problem_description}
**Data:** {data_description}
**Target:** {target_variable}
**Constraints:** {constraints}

Recommend models systematically:

**1. Problem Type:**
- Supervised/Unsupervised/Reinforcement
- Classification/Regression/Clustering
- Tabular/Image/Text/Time-series

**2. Model Candidates:**

**For Classification:**
| Model | Pros | Cons | When to Use |
|-------|------|------|-------------|
| Logistic Regression | Fast, interpretable | Linear only | Baseline, need interpretability |
| Random Forest | Handles non-linear, robust | Slow on large data | Tabular data, need feature importance |
| XGBoost | High accuracy | Black box, needs tuning | Competition, production (tabular) |
| Neural Network | Handles complex patterns | Needs lots of data, slow | Images, text, large datasets |

**3. Quick Decision Tree:**
```
Need interpretability? → Logistic Regression, Decision Tree
Tabular data, < 100k rows → Random Forest, XGBoost
Image data → CNN (ResNet, EfficientNet)
Text data → Transformers (BERT, RoBERTa)
Time series → LSTM, Prophet
Need real-time predictions → Simpler models
```
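
The rules above can also be written as a small lookup function. This is only a sketch: the function name `suggest_models`, its parameters, the order in which rules are checked, and the reading of "simpler models" as Logistic Regression and Decision Tree are assumptions made for illustration, not part of the guide.

```python
def suggest_models(data_type, n_rows=None, need_interpretability=False,
                   need_realtime=False):
    """Return candidate models by applying the quick decision rules in order.

    Hypothetical helper: the precedence of the rules is an assumption.
    """
    if need_interpretability:
        return ["Logistic Regression", "Decision Tree"]
    if need_realtime:
        # Assumption: "simpler models" is read as the two interpretable models.
        return ["Logistic Regression", "Decision Tree"]
    if data_type == "tabular" and n_rows is not None and n_rows < 100_000:
        return ["Random Forest", "XGBoost"]
    if data_type == "image":
        return ["CNN (ResNet, EfficientNet)"]
    if data_type == "text":
        return ["Transformers (BERT, RoBERTa)"]
    if data_type == "time_series":
        return ["LSTM, Prophet"]
    # The rules above do not cover this case (e.g. tabular data with 100k+ rows).
    return []


print(suggest_models("tabular", n_rows=50_000))
print(suggest_models("text", need_interpretability=True))
```

An empty list signals that none of the quick rules applies, in which case the full candidate table is the better starting point.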

**4. Baseline Models:**
Start with:
1. Simple baseline (predict the mean or median for regression, the most frequent class for classification)
2. Linear/Logistic Regression
3. Random Forest
4. XGBoost

**5. Evaluation Strategy:**
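- Metric: {metric} (choose based on the problem, e.g. F1 for imbalanced classes, RMSE for regression)
- Validation: K-fold cross-validation on the training data
- Test set: Hold out 20%, untouched until the final evaluation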
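
One way to combine the hold-out set with K-fold validation is sketched below. It assumes a binary classification task, uses synthetic data from `make_classification` as a stand-in for real features, and picks accuracy as the metric purely for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split

# Synthetic stand-in data (assumption): replace with your own X and y.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Hold out 20% as the final test set before comparing any models.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# 5-fold cross-validation on the remaining 80% only.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
model = LogisticRegression(max_iter=1000)
cv_scores = cross_val_score(model, X_train, y_train, cv=cv, scoring="accuracy")
print(f"CV accuracy: {cv_scores.mean():.3f} ± {cv_scores.std():.3f}")

# Fit once on the full training split, then score the untouched test set.
model.fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {test_accuracy:.3f}")
```

Splitting first keeps the test rows out of every cross-validation fold, so the final score is not influenced by model selection. For time-series data a shuffled K-fold is usually not appropriate, and an ordered split would be needed instead.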

**6. Implementation:**
```python
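from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier

# X (features) and y (target) are assumed to be loaded already.
models = {
    'Logistic': LogisticRegression(),
    'RF': RandomForestClassifier(),
    'XGB': XGBClassifier()
}

# Compare all candidates with 5-fold cross-validation on the same metric.
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring='{metric}')
    print(f"{name}: {scores.mean():.3f} ± {scores.std():.3f}")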
```

**7. Recommendation:**
Based on your requirements:
- **Best Model:** {recommended_model}
- **Rationale:** {reason}
- **Expected Performance:** {estimate}
- **Training Time:** {time}
- **Next Steps:** {next_steps}

Provide: Model recommendation + implementation + comparison.

Variables to Replace

{problem_description}
{data_description}
{target_variable}
{constraints}
{metric}
{recommended_model}
{reason}
{estimate}
{time}
{next_steps}
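
The variables can be filled in programmatically. The sketch below uses plain string replacement instead of `str.format`, because the template's Python snippet contains f-string braces such as `{name}` that `str.format` would try to resolve. The helper name `fill_template`, the shortened template excerpt, and the example values are hypothetical.

```python
def fill_template(template, values):
    """Replace each {name} placeholder for which a value is supplied.

    Hypothetical helper; placeholders without a value are left untouched.
    """
    filled = template
    for name, value in values.items():
        filled = filled.replace("{" + name + "}", str(value))
    return filled


# Shortened excerpt of the template, for illustration only.
template = (
    "**Problem:** {problem_description}\n"
    "**Target:** {target_variable}\n"
    'print(f"{name}: {scores.mean():.3f}")\n'
)

# Example values (assumptions, not from the guide).
values = {
    "problem_description": "Predict customer churn",
    "target_variable": "churned (0/1)",
}

prompt = fill_template(template, values)
print(prompt)
```

Only the keys passed in `values` are replaced, so the braces inside the embedded code survive unchanged.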

Pro Tips

Always start with simple baselines. The best model depends on your constraints (interpretability, speed, accuracy).
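
A trivial baseline gives every later model a score to beat. The following is a minimal sketch using scikit-learn's `DummyClassifier`; the synthetic imbalanced dataset and the choice of accuracy as the metric are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic, roughly 80/20 imbalanced data (assumption): replace with your own.
X, y = make_classification(
    n_samples=500, n_features=10, weights=[0.8, 0.2], random_state=0
)

# Baseline that always predicts the most frequent class.
baseline = DummyClassifier(strategy="most_frequent")
baseline_acc = cross_val_score(baseline, X, y, cv=5, scoring="accuracy").mean()

# First real model, evaluated with the same folds and metric.
logistic = LogisticRegression(max_iter=1000)
logistic_acc = cross_val_score(logistic, X, y, cv=5, scoring="accuracy").mean()

print(f"Most-frequent baseline: {baseline_acc:.3f}")
print(f"Logistic Regression:    {logistic_acc:.3f}")
```

On imbalanced data the baseline's accuracy is already close to the majority-class share, which is why a model's score is only meaningful relative to it.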
