data science, intermediate, 550 tokens
ML Model Selection Guide
Select the right ML model for your problem
machine-learning, model-selection, sklearn, xgboost, ml
Prompt Template
You are an ML consultant. Recommend the best model for this problem.
**Problem:** {problem_description}
**Data:** {data_description}
**Target:** {target_variable}
**Constraints:** {constraints}
Recommend models systematically:
**1. Problem Type:**
- Supervised/Unsupervised/Reinforcement
- Classification/Regression/Clustering
- Tabular/Image/Text/Time-series
**2. Model Candidates:**
**For Classification:**
| Model | Pros | Cons | When to Use |
|-------|------|------|-------------|
| Logistic Regression | Fast, interpretable | Linear only | Baseline, need interpretability |
| Random Forest | Handles non-linear, robust | Slow on large data | Tabular data, need feature importance |
| XGBoost | High accuracy | Black box, needs tuning | Competition, production (tabular) |
| Neural Network | Handles complex patterns | Needs lots of data, slow | Images, text, large datasets |
**3. Quick Decision Tree:**
```
Need interpretability? → Logistic Regression, Decision Tree
Tabular data, < 100k rows → Random Forest, XGBoost
Image data → CNN (ResNet, EfficientNet)
Text data → Transformers (BERT, RoBERTa)
Time series → LSTM, Prophet
Need real-time predictions → Simpler models
```
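The rules above can also be written as a small lookup function. This is only a sketch: the function name `suggest_models` and its parameters are hypothetical, and it assumes every matching rule should be reported rather than only the first one, which the decision tree itself does not specify.

```python
def suggest_models(data_type, n_rows=None, need_interpretability=False,
                   need_realtime=False):
    """Return the suggestion of every rule that matches (hypothetical helper)."""
    suggestions = []
    if need_interpretability:
        suggestions.append("Logistic Regression, Decision Tree")
    # The tabular rule only covers datasets under 100k rows
    if data_type == "tabular" and n_rows is not None and n_rows < 100_000:
        suggestions.append("Random Forest, XGBoost")
    if data_type == "image":
        suggestions.append("CNN (ResNet, EfficientNet)")
    if data_type == "text":
        suggestions.append("Transformers (BERT, RoBERTa)")
    if data_type == "time-series":
        suggestions.append("LSTM, Prophet")
    if need_realtime:
        suggestions.append("Simpler models")
    return suggestions


print(suggest_models("tabular", n_rows=50_000))
print(suggest_models("image", need_realtime=True))
```

An empty list means none of the rules fired (for example, tabular data with more rows than the threshold), in which case the full candidate comparison still applies.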
**4. Baseline Models:**
Start with:
1. Simple baseline (mean, median, most frequent)
2. Linear/Logistic Regression
3. Random Forest
4. XGBoost
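As a rough illustration of step 1, scikit-learn's dummy estimators can stand in for the mean, median, and most-frequent baselines. The toy arrays below are made up for the example; dummy estimators ignore the features entirely.

```python
import numpy as np
from sklearn.dummy import DummyClassifier, DummyRegressor

# Toy data, invented for illustration only
X = np.zeros((4, 1))
y_class = np.array([0, 0, 0, 1])
y_reg = np.array([1.0, 2.0, 3.0, 10.0])

# Classification baseline: always predict the most frequent class
majority = DummyClassifier(strategy="most_frequent").fit(X, y_class)

# Regression baselines: always predict the training mean or median
mean_model = DummyRegressor(strategy="mean").fit(X, y_reg)
median_model = DummyRegressor(strategy="median").fit(X, y_reg)

majority_pred = majority.predict(X)
mean_pred = mean_model.predict(X)
median_pred = median_model.predict(X)
print(majority_pred, mean_pred, median_pred)
```

Any candidate model that cannot beat these scores on the chosen metric is probably not learning anything useful from the features.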
**5. Evaluation Strategy:**
- Metric: {metric} (choose based on problem)
- Validation: K-fold cross-validation
- Test set: Hold out 20%
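One way to combine the hold-out set with K-fold cross-validation is sketched below. The synthetic dataset, the `accuracy` metric, the stratified splits, and the fixed random seeds are all assumptions made for the example, not part of the template.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import (StratifiedKFold, cross_val_score,
                                     train_test_split)

# Synthetic data standing in for the real problem (assumption)
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Hold out 20% as a test set that is never touched during model selection
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# 5-fold cross-validation on the remaining 80%
model = LogisticRegression(max_iter=1000)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
cv_scores = cross_val_score(model, X_train, y_train, cv=cv, scoring="accuracy")
print(f"CV accuracy: {cv_scores.mean():.3f} ± {cv_scores.std():.3f}")

# Final check: refit on all training data, score once on the hold-out set
model.fit(X_train, y_train)
test_score = model.score(X_test, y_test)
print(f"Test accuracy: {test_score:.3f}")
```

Keeping the test set out of cross-validation means the final score is not influenced by the choices made while comparing models.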
**6. Implementation:**
```python
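from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier

# X, y: feature matrix and target, assumed to be loaded already
models = {
    'Logistic': LogisticRegression(),
    'RF': RandomForestClassifier(),
    'XGB': XGBClassifier()
}

# Compare every candidate with 5-fold cross-validation on the same metric
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring='{metric}')
    print(f"{name}: {scores.mean():.3f} ± {scores.std():.3f}")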
```
**7. Recommendation:**
Based on your requirements:
- **Best Model:** {recommended_model}
- **Rationale:** {reason}
- **Expected Performance:** {estimate}
- **Training Time:** {time}
- **Next Steps:** {next_steps}
Provide: Model recommendation + implementation + comparison.
Variables to Replace
{problem_description}, {data_description}, {target_variable}, {constraints}, {metric}, {recommended_model}, {reason}, {estimate}, {time}, {next_steps}
Pro Tips
Always start with simple baselines. The best model depends on your constraints (interpretability, speed, accuracy).