Experiment report

Purpose

The experiment report records what was learned before full deployment: whether the expected value is confirmed, whether there is enough data, whether the chosen AI product fits, which risks were identified, and whether delivery should continue.

The report may summarize a quick check, a prototype, a pilot with a limited group, a test on historical data, or a manual review of the process.

Report structure

Goal of the experiment

  • Which hypothesis was tested.
  • What counts as success: target metrics and threshold values.
  • Connection to the impact hypothesis from the use case document.
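
Success criteria are easier to check later if the target metrics and thresholds are written down in a structured form before the test starts. The following is a minimal sketch of one way to do that; the metric names, threshold values, and the `SuccessCriterion` and `evaluate` helpers are illustrative assumptions, not part of the report template.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuccessCriterion:
    """One target metric with its threshold (hypothetical structure)."""
    metric: str
    threshold: float
    higher_is_better: bool = True

    def is_met(self, observed: float) -> bool:
        # For "lower is better" metrics (e.g. handling time) the comparison flips.
        if self.higher_is_better:
            return observed >= self.threshold
        return observed <= self.threshold


def evaluate(criteria, observed):
    """Return {metric: met?}; a metric that was not measured counts as not met."""
    return {
        c.metric: c.metric in observed and c.is_met(observed[c.metric])
        for c in criteria
    }


# Illustrative values only, not real experiment data.
criteria = [
    SuccessCriterion("precision", 0.90),
    SuccessCriterion("avg_handling_minutes", 5.0, higher_is_better=False),
]
observed = {"precision": 0.93, "avg_handling_minutes": 6.2}
results = evaluate(criteria, observed)
```

In this made-up example the precision target is met and the handling-time target is not, which would typically map to a "partially confirmed" hypothesis.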

Approach and methods

  • Type of test: historical data, limited pilot, manual comparison, control-group test, shadow mode.
  • The AI product or solution prototype used.
  • Test environment: isolated sandbox, test bench, manual data export, or the production process.
  • Duration and scale of the test.

Data

  • Which data was used: sources, volume, period.
  • Data quality: completeness, freshness, availability of labels.
  • Data limitations identified during the experiment.
  • Whether the data can be used in delivery and operation.

Quantitative results

  • Values of the target quality metrics.
  • Comparison with the current process and target thresholds.
  • Results across different data segments (if available).
  • Statistical significance of the results.
  • Processing time, error rate, share of manual corrections, cost of execution.
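
When the comparison with the current process is a difference in rates (for example, error rate before and during the pilot), statistical significance can be estimated with a two-proportion z-test. The sketch below uses only the Python standard library; the function name and the counts are illustrative assumptions, and the test itself assumes independent samples that are large enough for the normal approximation.

```python
import math


def two_proportion_z_test(errors_a, n_a, errors_b, n_b):
    """Two-sided z-test for the difference between two error rates.

    Returns (z, p_value). Assumes independent samples and sample sizes
    large enough for the normal approximation to hold.
    """
    p_a = errors_a / n_a
    p_b = errors_b / n_b
    # Pooled rate under the null hypothesis that both rates are equal.
    pooled = (errors_a + errors_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero; the rates cannot be compared")
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Illustrative counts only: current process vs. pilot.
z, p_value = two_proportion_z_test(errors_a=120, n_a=1000, errors_b=80, n_b=1000)
significant = p_value < 0.05  # the 0.05 level is an assumed convention
```

For small samples, or for metrics that are not rates, a different test is likely more appropriate, so the chosen method should be stated in the report next to the result.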

Qualitative results

  • What users or experts said.
  • Where the solution helps and where it does not fit.
  • Which limitations must be disclosed to the user.
  • Which scenarios require human involvement.

Conclusions and recommendations

  • Whether the hypothesis is confirmed: yes / partially / no.
  • Key findings and insights.
  • Unexpected results or limitations.
  • Factors that influenced the result.
  • Changes to make before delivery.

Identified risks

  • Technical risks: data quality, model performance, dependency on external systems.
  • Business risks: process change, user resistance, regulatory restrictions.
  • Scaling risks: growth in data volumes, infrastructure requirements.
  • AI risks: incorrect answers, hallucinations, bias, lack of explainability.
  • For each risk: probability, impact, proposed action.
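
The per-risk fields above can be kept as a small risk register and ranked for review. This is a sketch under stated assumptions: the 1 to 5 scales, the probability-times-impact score, the `Risk` class, and the sample entries are all hypothetical and not prescribed by the report template.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    """One entry in the risk register (hypothetical structure)."""
    category: str      # Technical, Business, Scaling, or AI
    description: str
    probability: int   # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int        # assumed scale: 1 (minor) to 5 (critical)
    action: str        # proposed action

    @property
    def score(self) -> int:
        # Probability times impact is one common convention, assumed here.
        return self.probability * self.impact


# Illustrative entries only.
risks = [
    Risk("Technical", "Incomplete source data", 3, 4, "Add completeness checks"),
    Risk("AI", "Hallucinated answers", 4, 4, "Require human review"),
    Risk("Scaling", "Data volume growth", 2, 3, "Plan capacity before rollout"),
]

# Highest-scoring risks first, so they are discussed first.
ranked = sorted(risks, key=lambda r: r.score, reverse=True)
for r in ranked:
    print(f"{r.score:>2}  {r.category:<10} {r.description} -> {r.action}")
```

A multiplicative score is only a prioritization aid; a low-probability risk with critical impact may still need explicit treatment in the report.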

Recommendation for the next step

One of the following decisions:

  • Continue delivery — the hypothesis is confirmed, the risks are manageable.
  • Refine the test — the data or quality is insufficient for a decision.
  • Change the product or approach — the task is valuable, but the current route does not fit.
  • Stop — the hypothesis is not confirmed or the risk exceeds the value.
  • Return to assessment — the problem, owner, impact, or data need to be redefined.

Use in the process

The report is prepared by the delivery team with the involvement of the business owner, the product owner, the AI office, and, if needed, security, architecture, or the data owner.

The verification methodology is described in "Experimentation". The use case validation process is described in "Validate AI use case".