Metrics, Evaluation & Reproducibility

  • Prefer cross-validation over a single train/test split; report the mean score with its variance or a confidence interval across folds.
  • Record the dataset version and a checksum of each data file for every run.
  • Log hyperparameters, random seeds, and environment versions (interpreter and libraries).
  • Use semantic versioning for application code and images.
  • Consider experiment tracking, e.g., store a metrics table and a params JSON file for each run.
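
The checklist above could be sketched as a small per-run record written with the Python standard library. This is a minimal illustration, not a prescribed tool: the function names (`sha256_of_file`, `summarize_folds`, `build_run_record`, `write_run_record`), the JSON layout, and the `code_version` field are all hypothetical. The interval is a rough normal approximation (z = 1.96); with only a few folds a t-based interval would be wider, and fold scores are not fully independent, so treat it as indicative only.

```python
import hashlib
import json
import platform
import statistics
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def summarize_folds(scores, z=1.96):
    """Summarize per-fold scores: mean, sample std, approximate 95% interval."""
    mean = statistics.mean(scores)
    # stdev needs at least two values; fall back to 0.0 for a single fold.
    std = statistics.stdev(scores) if len(scores) > 1 else 0.0
    half_width = z * std / len(scores) ** 0.5
    return {
        "n_folds": len(scores),
        "mean": mean,
        "std": std,
        "ci_low": mean - half_width,
        "ci_high": mean + half_width,
    }


def build_run_record(dataset_path, dataset_version, params, seed,
                     fold_scores, code_version):
    """Collect everything needed to reproduce and compare one run."""
    return {
        "dataset": {
            "path": str(dataset_path),
            "version": dataset_version,  # assumed to be assigned by the team
            "sha256": sha256_of_file(dataset_path),
        },
        "params": params,
        "seed": seed,
        "env": {
            "python": platform.python_version(),
            "platform": platform.platform(),
        },
        "code_version": code_version,  # e.g. a semantic version string
        "metrics": summarize_folds(fold_scores),
    }


def write_run_record(record, out_dir, run_id):
    """Write the record as <out_dir>/<run_id>.json and return the path."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / f"{run_id}.json"
    out_path.write_text(json.dumps(record, indent=2, sort_keys=True))
    return out_path
```

A run would call `build_run_record(...)` once after cross-validation finishes and pass the result to `write_run_record(...)`. One JSON file per run keeps runs easy to diff, and the files can later be loaded into a single metrics table. Library versions could be added to `env` in the same way.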