Model lineage (also called model provenance) is the complete record of an AI model’s origins and transformations throughout its lifecycle. It tracks which data was used for training, what code and hyperparameters produced the model, which base model it was fine-tuned from, what evaluation results it achieved, and who approved it for deployment. Model lineage answers the question: “How exactly was this model created, and can we reproduce it?”

What Lineage Tracks

A complete lineage record includes the training data version and any preprocessing steps applied, the base or foundation model used (if fine-tuning), the training code version and framework, hyperparameters and configuration, compute environment details, evaluation metrics on validation and test sets, the identity of the person or pipeline that triggered training, and any post-training modifications (quantization, distillation, pruning).

Why It Matters

Model lineage serves multiple purposes. Reproducibility - Given the lineage record, another engineer should be able to reproduce the model from scratch. Debugging - When a model behaves unexpectedly, lineage helps trace the issue to a specific data version, code change, or configuration. Compliance - Regulations like the EU AI Act require documentation of training data and development processes for high-risk AI systems. GDPR requires knowing what data was used to train models that process personal data. Audit - Internal and external auditors need to verify that models were developed according to organizational policies.

Implementation

Model lineage is typically implemented through experiment tracking tools (MLflow, Weights & Biases, Neptune) that automatically log parameters, metrics, and artifacts. Model registries store lineage metadata alongside model artifacts. Data versioning tools (DVC, LakeFS) track training data versions. CI/CD pipelines for ML capture the full training workflow as code.

Challenges

Lineage for fine-tuned foundation models is inherently incomplete because the base model’s training data is usually unknown. Organizations must document what they can control (their fine-tuning data and process) while acknowledging the opacity of the base model. For compound AI systems, lineage must track multiple models and their interactions, adding complexity.

Sources

  • Buneman, P., Khanna, S., & Tan, W. C. (2001). Why and where: A characterization of data provenance. ICDT 2001. (Foundational data provenance paper; the “why-provenance” and “where-provenance” concepts directly apply to model lineage.)
  • Bose, R., & Frew, J. (2005). Lineage retrieval for scientific data processing: A survey. ACM Computing Surveys, 37(1), 1–28. (Survey of lineage tracking in scientific workflows; directly applicable to ML experiment tracking.)
  • Sculley, D., et al. (2015). Hidden technical debt in machine learning systems. NeurIPS 2015. (Identified undocumented model dependencies as a primary ML technical debt; motivates systematic lineage tracking.)