After a production incident, the team is tired and wants to move on. Writing a thorough postmortem requires reconstructing a timeline from scattered logs, Slack messages, and monitoring dashboards. This often gets delayed or done poorly.

The AI Approach

An LLM ingests the incident’s log data, alert history, Slack channel transcript, and status page updates to draft a structured postmortem document. It reconstructs the timeline, identifies contributing factors, and drafts initial action items.

Three-Step Build

  1. Gather sources - Collect logs from the incident time window, the Slack channel conversation, PagerDuty/Opsgenie alert history, and any deployment events from CI/CD.
  2. Timeline reconstruction - Feed the raw data to an LLM with a prompt that asks it to build a chronological timeline of events: when the issue started, when it was detected, what actions were taken, and when it was resolved.
  3. Postmortem draft - Using the timeline, prompt the LLM to fill in a standard postmortem template: summary, impact, timeline, root cause analysis, contributing factors, and suggested action items.

Where It Breaks

The AI can reconstruct what happened but struggles to identify the true root cause without deep system knowledge. It may also miss context from verbal conversations or war room discussions that were not captured in Slack. The draft should always be reviewed and enriched by the humans who were involved.

The Production Path

Integrate with your incident management tool to auto-trigger postmortem generation when an incident is resolved. Pre-populate a document in your wiki or incident tracker. The human role shifts from writing from scratch to reviewing and refining an AI-generated draft, which is faster and more likely to actually happen.