AI-Powered Incident Reporter

Introduction
As a DevOps engineer, I used to spend hours documenting incidents. Every time something went wrong, I had to manually extract information from Slack conversations, format it into a report, and make sure all stakeholders got the details they needed. This typically took 30-60 minutes per incident, and with multiple incidents per week it was eating up a significant portion of my time.
When I discovered Amazon Q Developer, I saw an opportunity to automate this tedious process. In this article, I’ll share how I built a CLI automation tool that transforms Slack conversations into professional incident reports in under 30 seconds.
The Problem: Manual Incident Documentation
Before Automation
- Time-consuming: 30-60 minutes per incident report
- Error-prone: Missing critical details due to human oversight
- Inconsistent: Different team members used different formats
- Delayed: Reports weren’t available when stakeholders needed them
- Repetitive: Same manual process for every incident
The Impact
This manual process was affecting our team’s productivity and incident response effectiveness. We were spending more time documenting incidents than actually learning from them and implementing improvements.
The Solution: AI-Powered Incident Reporter
I built a Python CLI tool that:
- Fetches Slack conversations using the Slack API
- Analyzes content using the Llama 3 70B model served through Groq
- Generates structured reports in both HTML and Markdown formats
- Extracts exact timestamps from messages for accurate timelines
- Provides consistent formatting every time
Key Features
- 🔍 Smart Analysis: AI extracts incident details, root cause, and impact
- 📊 Dual Output: Both HTML and Markdown reports for flexibility
- ⏱️ Time Tracking: Automatic timestamp extraction and formatting
- 🎯 Simple Interface: Just provide a Slack message ID
- 📁 Local Storage: Organized file management with timestamped names
How Q Developer Assisted the Development Process
1. Problem Analysis and Architecture Design
Q Developer helped me identify the core requirements:
- The tool needed to be simple enough for non-technical users
- It should integrate with existing Slack workflows
- Output should be professional and consistent
- Processing should be fast and reliable
Architecture decisions Q Developer suggested:
- Use Groq instead of slower AI alternatives for better performance
- Implement both HTML and Markdown outputs for maximum flexibility
- Create a simple CLI interface that only requires a message ID
- Use environment variables for secure configuration management
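The last point, configuration through environment variables, can be sketched as follows. This is a minimal illustration rather than the tool's actual code: `SLACK_BOT_TOKEN` and `SLACK_CHANNEL_ID` come from the prompt shown later in this article, while `GROQ_API_KEY`, `REQUIRED_VARS`, and `load_config` are names assumed here for the example.

```python
import os

# SLACK_BOT_TOKEN and SLACK_CHANNEL_ID appear in the article's prompt;
# GROQ_API_KEY is an assumed name for the Groq credential.
REQUIRED_VARS = ("SLACK_BOT_TOKEN", "SLACK_CHANNEL_ID", "GROQ_API_KEY")


def load_config(env=None):
    """Return the required settings, or raise if any are missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Failing fast at startup with the names of the missing variables tends to be easier to debug than an authentication error halfway through a run.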
2. Code Implementation and Integration
Slack API Integration: Q Developer helped me understand the Slack API structure and implement:
- Thread message retrieval with proper error handling
- User information extraction for attribution
- Message formatting cleanup (removing Slack-specific formatting)
- Timestamp conversion from Unix timestamps to readable format
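The last two items can be illustrated with a small sketch. Slack message timestamps are strings of Unix seconds with a microsecond suffix (for example `1753168536.411769`), and message text wraps mentions and links in angle brackets. The function names, the `user_names` lookup, and the choice of UTC below are assumptions made for this example, not the tool's actual implementation.

```python
import re
from datetime import datetime, timezone


def format_slack_ts(ts, tz=timezone.utc):
    """Convert a Slack ts such as '1753168536.411769' to 'YYYY-MM-DD HH:MM:SS'."""
    seconds = int(ts.split(".")[0])
    return datetime.fromtimestamp(seconds, tz=tz).strftime("%Y-%m-%d %H:%M:%S")


def clean_slack_text(text, user_names=None):
    """Replace Slack-specific markup with plain text."""
    user_names = user_names or {}
    # <@U123> -> @name (falls back to the raw ID if the user is unknown)
    text = re.sub(
        r"<@([A-Z0-9]+)>",
        lambda m: "@" + user_names.get(m.group(1), m.group(1)),
        text,
    )
    # <#C123|general> -> #general
    text = re.sub(r"<#[A-Z0-9]+\|([^>]+)>", r"#\1", text)
    # <https://url|label> -> label (https://url)
    text = re.sub(r"<(https?://[^|>]+)\|([^>]+)>", r"\2 (\1)", text)
    # <https://url> -> https://url
    text = re.sub(r"<(https?://[^>]+)>", r"\1", text)
    # Slack escapes &, < and > as HTML entities; unescape &amp; last
    return text.replace("&lt;", "<").replace("&gt;", ">").replace("&amp;", "&")
```

Passing a timezone explicitly keeps the timeline reproducible regardless of where the script runs; a team would likely substitute its own local zone.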
AI Analysis Setup: Q Developer guided me through:
- Groq API integration and authentication
- Prompt engineering for structured JSON output
- Error handling for API rate limits and failures
- Response parsing and validation
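As an illustration of the parsing and validation step, here is a minimal sketch that checks the model's reply against the keys and allowed values listed in the prompt later in this article. The function name `parse_analysis` and the exact error messages are assumptions for the example; the tool's real validation may differ.

```python
import json

# Keys and allowed values taken from the prompt shown in this article.
REQUIRED_KEYS = {
    "summary", "author", "priority", "status", "description", "category",
    "environment", "affected_resources", "root_cause_text",
    "remediation_steps_text", "impact_text", "timeline_markdown",
    "what_went_well_markdown", "what_went_wrong_markdown",
}
ALLOWED = {
    "priority": {"High", "Medium", "Low"},
    "status": {"Resolved", "Investigating", "Monitoring"},
}


def parse_analysis(raw):
    """Parse the model's reply and check it against the expected schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Model did not return valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object at the top level")
    missing = sorted(REQUIRED_KEYS - data.keys())
    if missing:
        raise ValueError("Missing keys: " + ", ".join(missing))
    for key, allowed in ALLOWED.items():
        if data[key] not in allowed:
            raise ValueError(f"Unexpected {key!r} value: {data[key]!r}")
    return data
```

Raising a single exception type with a descriptive message makes it straightforward for the CLI to print a clear error or retry the request.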
Report Generation: Q Developer assisted with:
- HTML template creation with professional styling
- Markdown formatting for documentation systems
- File naming conventions with timestamps
- Browser auto-opening for immediate review
3. Testing and Refinement
Debugging Challenges: Q Developer helped me solve several technical issues:
- Timestamp extraction: Initially the AI wasn’t using the exact timestamps from Slack messages
- JSON parsing: Ensuring the AI returned properly formatted JSON
- Error handling: Making the tool robust for various failure scenarios
- User experience: Improving error messages and feedback
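One common way to handle the rate-limit failures mentioned above is to retry with exponential backoff. Slack's Web API, for instance, answers rate-limited requests with HTTP 429 and a `Retry-After` header. The sketch below is a generic helper under stated assumptions: `RateLimited` and `call_with_retry` are hypothetical names, and the caller is expected to raise `RateLimited` when it sees a rate-limit response.

```python
import time


class RateLimited(Exception):
    """Hypothetical exception carrying the server's suggested wait in seconds."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        self.retry_after = retry_after


def call_with_retry(func, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying on RateLimited with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except RateLimited as exc:
            if attempt == max_attempts:
                raise  # give up after the final attempt
            # Prefer the server's hint; otherwise wait 1s, 2s, 4s, ...
            if exc.retry_after is not None:
                delay = exc.retry_after
            else:
                delay = base_delay * 2 ** (attempt - 1)
            sleep(delay)
```

The `sleep` parameter is injectable so the helper can be exercised in tests without actually waiting.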
Performance Optimization:
- Optimized the AI prompt for better analysis quality
- Improved error messages for better debugging
- Added markdown output alongside HTML for flexibility
Technical Implementation
Technology Stack
- Python 3.8+: Core automation logic
- Groq API: Fast AI inference (Llama3-70B model)
- Slack API: Message retrieval and user management
- Markdown/HTML: Report formatting and styling
- Environment Variables: Secure configuration management
Key Components
1. Slack Integration
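import requests

def fetch_slack_thread(self, channel_id, thread_ts):
    """Fetches all messages in a Slack thread."""
    # SLACK_BOT_TOKEN is loaded from the environment elsewhere in the module
    response = requests.get(
        "https://slack.com/api/conversations.replies",
        headers={"Authorization": f"Bearer {SLACK_BOT_TOKEN}"},
        params={"channel": channel_id, "ts": thread_ts},
        timeout=10,
    )
    response.raise_for_status()
    data = response.json()
    # Slack reports API-level errors in the JSON body as "ok": false
    if not data.get("ok"):
        raise RuntimeError(f"Slack API error: {data.get('error')}")
    return data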
2. AI Analysis
import json

def analyze_with_groq(self, conversation):
    """Sends conversation to Groq for analysis."""
    # system_prompt (defined elsewhere in the module) holds the analysis instructions
    chat_completion = self.groq_client.chat.completions.create(
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": f"Here is the Slack thread transcript:\n\n{conversation}"},
        ],
        model="llama3-70b-8192",
        temperature=0.2,  # low temperature keeps the output consistent between runs
        response_format={"type": "json_object"},  # request a JSON object, not free text
    )
    return json.loads(chat_completion.choices[0].message.content)
3. Report Generation
def create_local_report(self, title, content, incident_date):
    """Creates both HTML and Markdown reports."""
    # Excerpt: safe_title, md_filepath, html_filepath, md_header and
    # html_document are built in parts of the method not shown here.
    # Generate filenames with timestamps
    html_filename = f"{incident_date}_{safe_title}.html"
    md_filename = f"{incident_date}_{safe_title}.md"
    # Save markdown version
    with open(md_filepath, 'w', encoding='utf-8') as f:
        f.write(md_header + content)
    # Save HTML version with styling
    with open(html_filepath, 'w', encoding='utf-8') as f:
        f.write(html_document)
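The excerpt above relies on a filesystem-safe title. How the tool derives it is not shown, so the following is only a sketch of one plausible approach; `safe_filename` and its sanitizing rule are assumptions for illustration.

```python
import re


def safe_filename(incident_date, title, ext):
    """Build a name like '2025-01-22_Incident_Report_Storage_Issue.md'."""
    # Keep letters, digits and hyphens; collapse every other run to one underscore.
    safe_title = re.sub(r"[^A-Za-z0-9-]+", "_", title).strip("_")
    return f"{incident_date}_{safe_title or 'untitled'}.{ext}"
```

Restricting names to a small character set avoids surprises with colons, slashes, and spaces across operating systems.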
Setup and Usage
Prompt used
I need to create a Python script that automatically generates incident reports from Slack conversations using Groq. Here are the requirements:
**Core Functionality:**
1. Fetch Slack conversation threads using Slack API
2. Use Groq to analyze the conversation
3. Generate structured incident reports as local HTML files
4. Use the exact same AI prompt structure as my existing incident handler
**Technical Requirements:**
- Python 3.7+
- Use boto3 for AWS Bedrock integration
- Use requests for Slack API calls
- Use python-dotenv for environment variables
- Use markdown library for HTML generation
- Generate beautiful, styled HTML reports that open in browser
**Environment Variables Needed:**
- SLACK_BOT_TOKEN (xoxb-...)
- SLACK_USER_TOKEN (xoxp-...)
- SLACK_CHANNEL_ID
**AI Prompt Structure (MUST USE THIS EXACT PROMPT):**
Analyze the conversation to identify key events, user actions, and outcomes.
The JSON object must contain these exact keys:
* "summary"
* "author" (Identify the primary person who resolved the issue)
* "priority" (Value must be one of: "High", "Medium", "Low")
* "status" (Value must be one of: "Resolved", "Investigating", "Monitoring")
* "description"
* "category" (e.g., "Infrastructure", "Application", "Database")
* "environment" (e.g., "Production", "Staging")
* "affected_resources" (Comma-separated list, e.g., "Jenkins, API Server")
* "root_cause_text"
* "remediation_steps_text"
* "impact_text"
* "timeline_markdown": "A detailed, multi-line timeline as a single JSON string. Each event must be a Markdown bullet point on a new line, attributing the action to the user who performed it. All newlines MUST be escaped as \\n. Example format: '- [10:09 PM] - Ajay noticed that storage is full.\\n- [10:11 PM] - Yuga cleaned up storage to free space.'"
* "what_went_well_markdown": "A Markdown numbered list as a single JSON string, describing what went well during the incident response. All newlines MUST be escaped as \\n."
* "what_went_wrong_markdown": "A Markdown numbered list as a single JSON string, describing what could be improved. All newlines MUST be escaped as \\n."
Prerequisites
- Python 3.8 or higher
- Slack workspace with admin access
- Groq API key (free tier available)
Quick Setup
- Install dependencies:
pip install -r requirements.txt
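- Configure a Slack app with the scopes the tool needs (for example, channels:history to read threads in public channels and users:read to resolve user names)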
- Get Groq API key from console.groq.com
- Set environment variables in the .env file
- Run the tool:
python slack_incident_reporter.py <message-id>
Example Usage
# Generate report from a Slack thread
python slack_incident_reporter.py p1753168536411769
# Output files created:
# - incident_reports/2025-01-22_Incident_Report_Storage_Issue.html
# - incident_reports/2025-01-22_Incident_Report_Storage_Issue.md
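The message ID in this example has the shape Slack uses in message permalinks: the letter `p` followed by the message timestamp with its decimal point removed. The tool's own parsing is not shown in this article, so the helper below is a sketch of how such an ID could be turned into the `ts` value that `conversations.replies` expects; the name `message_id_to_ts` is hypothetical.

```python
import re


def message_id_to_ts(message_id):
    """Convert 'p1753168536411769' to the Slack ts '1753168536.411769'."""
    # 10 digits of Unix seconds followed by 6 digits of microseconds
    match = re.fullmatch(r"p?(\d{10})(\d{6})", message_id.strip())
    if not match:
        raise ValueError(f"Not a Slack message ID: {message_id!r}")
    return f"{match.group(1)}.{match.group(2)}"
```

Validating the shape up front gives the user an immediate, readable error instead of a failed API call.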
Results and Impact
Time Savings
- Before: 30-60 minutes per incident report
- After: 30 seconds per incident report
- Improvement: roughly 98-99% reduction in documentation time (30 seconds versus 1,800-3,600 seconds)
Quality Improvements
- Consistent formatting across all incidents
- Complete timeline extraction with exact timestamps
- Structured analysis including root cause and impact
- Professional presentation for stakeholders
Team Adoption
- Reduced documentation burden on incident responders
- Faster availability of reports for post-mortem analysis
- Improved compliance with incident documentation requirements
- Better knowledge sharing across the organization
Sample Output
Generated Report Structure
# Incident Report: Storage Issue on Shared Runner
**Generated on:** 2025-01-22 15:45:30
**Incident Date:** 2025-01-22
## Summary
Resolved storage issue on shared runner causing GitLab pipeline failures
## Timeline
- [2025-01-22 15:30:45] - Ajay noticed that storage is full
- [2025-01-22 15:32:12] - Ajay checked pipeline logs and found storage issue
- [2025-01-22 15:35:20] - Ajay SSH into shared runner and ran df -h command
- [2025-01-22 15:38:15] - Ajay found docker images and cache taking more space
- [2025-01-22 15:40:30] - Ajay ran docker system prune -af
- [2025-01-22 15:42:45] - Ajay created automation to remove unused images
## Impact
- GitLab pipelines were failing due to insufficient storage
- Development workflow was blocked for 2 hours
- Affected all projects using the shared runner
## Root Cause
Docker images and cache accumulated over time, consuming 95% of available storage space
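A report with this structure could be assembled from the parsed analysis along the following lines. This is a sketch, not the tool's actual renderer: `render_markdown` is a hypothetical name, and it only uses the `summary`, `timeline_markdown`, `impact_text`, and `root_cause_text` keys from the prompt shown earlier.

```python
def render_markdown(title, analysis, incident_date, generated_on):
    """Assemble a Markdown report from the parsed analysis dict."""
    blocks = [
        f"# Incident Report: {title}",
        f"**Generated on:** {generated_on}\n**Incident Date:** {incident_date}",
        f"## Summary\n{analysis['summary']}",
        f"## Timeline\n{analysis['timeline_markdown']}",
        f"## Impact\n{analysis['impact_text']}",
        f"## Root Cause\n{analysis['root_cause_text']}",
    ]
    # Blank lines between blocks keep headings and lists well-formed in Markdown
    return "\n\n".join(blocks) + "\n"
```

Keeping the renderer a pure function of its inputs means the same analysis always yields the same report, which is what makes the formatting consistent across incidents.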
Lessons Learned
What Went Well
- Q Developer’s guidance was invaluable for architecture decisions
- Groq’s speed made the tool feel instant and responsive
- Simple CLI interface made adoption easy for the team
- Dual output format (HTML + Markdown) provided flexibility
Challenges Overcome
- Timestamp extraction: Required careful prompt engineering
- API integration: Needed robust error handling
- User experience: Iterative improvements based on feedback
- Documentation: Comprehensive setup instructions were crucial
Future Improvements
- Web interface for non-technical users
- Integration with Jira/ServiceNow for ticket creation
- Email notifications for completed reports
- Team dashboard for incident analytics
Conclusion
This project demonstrates how Amazon Q Developer can help turn daily productivity pain points into automated solutions. What started as a manual 30-60 minute process is now a 30-second automated workflow.
The key to success was:
- Identifying a real pain point that affects daily work
- Leveraging Q Developer’s expertise for architecture and implementation
- Focusing on user experience and simplicity
- Iterating based on feedback and testing
The result is a tool that not only saves time but also improves the quality and consistency of our incident documentation, making our team more effective and our processes more reliable.
Resources
- GitHub Repository: https://github.com/Aj7Ay/Amazon-Q.git
- Groq API Documentation: https://console.groq.com/docs
- Slack API Documentation: https://api.slack.com/
- Amazon Q Developer: https://aws.amazon.com/q/
This project was built as part of the #q-developer-challenge-1. The automation addresses a real productivity pain point in incident response workflows, demonstrating how AI can transform manual processes into efficient, automated solutions.
Tags: #q-developer-challenge-1 #automation #incident-response #slack #groq #python #cli #ai