
AutoML Hyperparameter Optimization

AutoML and hyperparameter optimization rules for Python ML projects using Ray Tune, Optuna, PyCaret, and time-series AutoML libraries.


# AutoML and Hyperparameter Optimization Rules

## Scope

- Use AutoML to accelerate model exploration, not to bypass problem framing, validation design, or explainability.
- Start with a simple baseline model and fixed metric before launching a search.
- Keep training, evaluation, feature generation, and search configuration separate.
- Record datasets, splits, metric definitions, random seeds, library versions, and search spaces for every run.
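One way to capture the run record described above is a single JSON-safe dictionary written next to each search. This is a minimal sketch using only the standard library; `build_run_record`, `library_versions`, the field names, and every example value are hypothetical, not part of any tracker's API.

```python
import json
import platform
from importlib import metadata


def library_versions(packages):
    """Map each package name to its installed version, or None if it is absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def build_run_record(dataset_sha256, split, metric, seed, search_space, packages):
    """Collect what is needed to rerun a search into one JSON-safe dict."""
    return {
        "dataset_sha256": dataset_sha256,
        "split": split,
        "metric": metric,
        "seed": seed,
        "search_space": search_space,
        "python": platform.python_version(),
        "libraries": library_versions(packages),
    }


# All values below are placeholders for illustration.
record = build_run_record(
    dataset_sha256="<sha256 of the training file>",
    split={"scheme": "expanding_window", "n_splits": 3, "horizon": 2},
    metric="mean_absolute_error",
    seed=42,
    search_space={"learning_rate": ["log_float", 1e-4, 1e-1]},
    packages=["optuna", "ray", "pycaret"],
)
print(json.dumps(record, indent=2, sort_keys=True))
```

The same dictionary can be logged as parameters or as an artifact in whichever tracker the project already uses.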

## Experiment Design

- Define the target metric before selecting tooling.
- Use nested cross-validation or a final untouched test split before making model selection claims.
- Use time-aware splits for time-series problems; never shuffle across time boundaries.
- Prevent leakage by fitting preprocessing only on training folds.
- Include simple baselines such as linear models, random forests, or naive time-series forecasts.
- Use early stopping and resource limits for expensive searches.
- Prefer structured search spaces with domain-informed ranges over arbitrary broad grids.
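The time-aware split and the train-only preprocessing rules above can be sketched together. This is a small standard-library illustration, assuming a univariate series already sorted by time; `expanding_window_splits`, `fit_scaler`, and `transform` are hypothetical helper names, and a real project would more likely use its library's own splitter.

```python
from statistics import mean, pstdev


def expanding_window_splits(n_samples, n_splits, horizon):
    """Yield (train_idx, test_idx) pairs where every test index follows all train indices."""
    first_test_start = n_samples - n_splits * horizon
    if first_test_start <= 0:
        raise ValueError("not enough samples for the requested splits and horizon")
    for k in range(n_splits):
        test_start = first_test_start + k * horizon
        yield list(range(test_start)), list(range(test_start, test_start + horizon))


def fit_scaler(values):
    """Learn mean and standard deviation from training values only."""
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # fall back to 1.0 for a constant series
    return mu, sigma


def transform(values, mu, sigma):
    """Standardize values with statistics learned elsewhere."""
    return [(v - mu) / sigma for v in values]


series = [float(t) for t in range(12)]  # toy series, ordered by time
for train_idx, test_idx in expanding_window_splits(len(series), n_splits=3, horizon=2):
    # Statistics come from the training window; the test window is only transformed.
    mu, sigma = fit_scaler([series[i] for i in train_idx])
    scaled_test = transform([series[i] for i in test_idx], mu, sigma)
    print(len(train_idx), test_idx)
```

Because the scaler is refit inside each fold, no statistic computed from future observations reaches the training window.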

## Tooling

- Use Ray Tune or Optuna for custom training loops, distributed trials, pruning, and scheduler control.
- Use PyCaret for quick low-code comparisons when the dataset and metric are straightforward.
- Use AutoTS, Merlion, PyAF, or project-approved time-series tooling when forecast-specific validation, seasonality, and horizon handling matter.
- Store run metadata in MLflow, Weights & Biases, TensorBoard, or a project-approved tracker.
- Use `uv` or the existing project package manager for reproducible environments.

## Search Spaces

- Keep search spaces explicit and reviewed.
- Use log-scale sampling for learning rates, regularization, tree counts, and other scale-sensitive values.
- Constrain model complexity to avoid unrealistic training time or memory use.
- Include preprocessing choices only when they can be applied without leakage.
- Do not tune on the test set.
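An explicit search space with log-scale sampling can be as small as a reviewed dictionary plus one sampling function. The sketch below uses only the standard library; the parameter names, bounds, and the `(kind, low, high)` layout are hypothetical, and a tuning library's own distribution types would replace them in practice.

```python
import math
import random

# Hypothetical search space: each entry is (kind, low, high).
SEARCH_SPACE = {
    "learning_rate": ("log_float", 1e-4, 1e-1),
    "l2_penalty": ("log_float", 1e-6, 1e-2),
    "max_depth": ("int", 2, 10),  # bounded to keep training time realistic
}


def sample_config(space, rng):
    """Draw one configuration; log_float values are uniform in log space."""
    config = {}
    for name, (kind, low, high) in space.items():
        if kind == "log_float":
            value = math.exp(rng.uniform(math.log(low), math.log(high)))
            config[name] = min(max(value, low), high)  # clamp away rounding error
        elif kind == "int":
            config[name] = rng.randint(low, high)  # inclusive on both ends
        else:
            raise ValueError(f"unknown parameter kind: {kind}")
    return config


rng = random.Random(0)  # fixed seed so the draws are reproducible
configs = [sample_config(SEARCH_SPACE, rng) for _ in range(5)]
for config in configs:
    print(config)
```

Sampling uniformly in log space gives each order of magnitude between the bounds the same chance of being tried, which a plain uniform draw over `[1e-4, 1e-1]` would not.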

## Reporting

- Report the selected model, metric, confidence interval or variance, validation scheme, and final test result.
- Include the best parameters and the search budget.
- Compare the chosen model against the baseline and at least one non-AutoML alternative.
- Document operational constraints such as inference latency, memory use, retraining cost, and explainability.
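The reporting items above can be assembled from per-fold scores. This is a rough sketch using a percentile bootstrap over folds; the fold scores are made-up placeholder numbers, `bootstrap_ci` is a hypothetical helper, and an interval built from only a handful of folds should be read as a coarse indication rather than a precise bound.

```python
import random
from statistics import mean, stdev


def bootstrap_ci(scores, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of per-fold scores."""
    rng = random.Random(seed)
    means = sorted(
        mean(rng.choices(scores, k=len(scores))) for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Placeholder validation scores, one per fold, on the same folds for both models.
model_scores = [0.81, 0.79, 0.83, 0.80, 0.82]
baseline_scores = [0.74, 0.76, 0.75, 0.73, 0.77]

low, high = bootstrap_ci(model_scores)
gains = [m - b for m, b in zip(model_scores, baseline_scores)]
report = {
    "validation_scheme": "5-fold, same folds for model and baseline",
    "model_mean": mean(model_scores),
    "model_std": stdev(model_scores),
    "model_ci_95": (low, high),
    "baseline_mean": mean(baseline_scores),
    "mean_gain_over_baseline": mean(gains),
    "search_budget_trials": 20,  # placeholder: record the real budget
}
for key, value in report.items():
    print(f"{key}: {value}")
```

Comparing on identical folds keeps the gain over the baseline a paired difference, so fold-to-fold difficulty does not mask or inflate it.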

## Common Mistakes

- Do not treat leaderboard rank as proof of production readiness.
- Do not mix train/test data during feature engineering.
- Do not run massive searches before validating labels and data quality.
- Do not ignore class imbalance, calibration, or business cost asymmetry.
- Do not deploy an AutoML model without reproducible training code and pinned dependencies.
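To illustrate the cost-asymmetry point above, the decision threshold can be chosen by total cost on validation data instead of defaulting to 0.5. The labels, scores, and costs below are toy values, and `expected_cost` and `pick_threshold` are hypothetical helpers; the sketch also assumes the scores are reasonably calibrated.

```python
def expected_cost(y_true, scores, threshold, cost_fp, cost_fn):
    """Total cost when predicting positive for every score >= threshold."""
    cost = 0.0
    for label, score in zip(y_true, scores):
        predicted_positive = score >= threshold
        if predicted_positive and label == 0:
            cost += cost_fp  # false positive
        elif not predicted_positive and label == 1:
            cost += cost_fn  # false negative
    return cost


def pick_threshold(y_true, scores, cost_fp, cost_fn):
    """Return the candidate threshold with the lowest total cost (first wins on ties)."""
    candidates = sorted(set(scores)) + [float("inf")]  # inf means "never predict positive"
    return min(
        candidates,
        key=lambda t: expected_cost(y_true, scores, t, cost_fp, cost_fn),
    )


# Toy validation labels and model scores.
y_true = [0, 0, 1, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.7, 0.9]

print(pick_threshold(y_true, scores, cost_fp=1, cost_fn=1))   # symmetric costs
print(pick_threshold(y_true, scores, cost_fp=1, cost_fn=10))  # missed positives cost 10x
```

With symmetric costs the lowest-cost threshold on this toy data is 0.7; once a missed positive costs ten times a false alarm it drops to 0.3, which is the kind of shift a leaderboard metric alone does not reveal.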

How to use: save the file at your project root (e.g. .cursorrules or AGENTS.md) and your AI editor picks it up automatically.
