Voice-First Safety Reporting
Conversational Incident Reporting with Automated Risk Assessment
TrustFlight
2023–2024
AI Product Lead, Conversational Design
A maintenance technician stands in the hangar, phone in hand, staring at a safety report form. She just noticed hydraulic fluid residue near the auxiliary power unit during a routine inspection. Not a major leak, nothing grounded the aircraft, but it's worth documenting. She knows these small observations add up—patterns emerge from accumulated reports.
"Incident Type" dropdown. Maintenance Issue? Equipment Malfunction? Fluid Leak? She picks Maintenance Issue.
"Severity Rating" dropdown. Minor, Major, Serious, or Catastrophic? It's definitely minor. Nothing failed. But minor leaks can indicate bigger problems. She picks Minor.
"Barrier Effectiveness" dropdown. Effective, Limited, Minimal, Not Effective. She doesn't know what that means. She picks something.
The narrative text box. She needs to type the whole story on her phone. Standing in the hangar with engine noise in the background. She starts typing with her thumbs. Gets three sentences in, realizes this is going to take fifteen minutes, and she has two more aircraft to inspect before her shift ends.
She closes the app. She'll finish it later.
She doesn't. A week later, she vaguely remembers seeing something but can't recall which aircraft or exactly where the residue was. The report never gets filed.
This is safety reporting in aviation. Small observations that matter for trend analysis—the kind that reveal patterns before they become incidents—get abandoned because reporting them is tedious. By the time the third hydraulic fluid residue report should have triggered a fleet-wide inspection, only one actually made it into the system.
Aviation safety management requires reporting incidents, hazards, and safety observations for trend analysis, regulatory compliance, and safety improvement. But safety reporting has two conflicting requirements: immediate capture (reports filed quickly while details are fresh) and structured risk assessment (safety management systems need risk data for prioritization).
Traditional forms try to satisfy both simultaneously with dropdown menus, severity ratings, narrative boxes, and risk assessment matrices. This fails when it matters most.
Forms demand structured thinking and keyboard access when reporters are in reactive mode, trying to capture observations quickly in crew rooms, hangars, on ramps. Risk assessment requires expertise reporters don't have. The difference between "Major Accident" and "Catastrophic Accident" potential requires understanding safety management concepts. Forms present these as dropdown options with no guidance.
Worst of all, forms force premature classification. Dropdown menus ask "What was the severity?" before the reporter has articulated what happened. Reporters think "we had hydraulic fluid near the APU, found it during inspection, maintenance cleaned it up," not "this was a Level 2 hazard with Limited barrier effectiveness."
The cognitive load is backwards. Forms prioritize database structure over reporter workflow.
We rebuilt safety reporting as a voice conversation that captures incident narratives naturally, then automatically performs structured risk assessment using aviation safety methodology.
TrustFlight's Quality SMS product suite already had a mature iPad-based form system with an established data model for safety reporting. Rather than replace it entirely, we asked: what if reporters could speak their reports instead of typing them?
Complete voice reporting workflow from initial incident description through automated risk assessment and confirmation
Reporters speak their account naturally, as if describing what happened to a colleague. No keyboard, no dropdown menus, no navigating between form fields.
A maintenance technician might say: "During preflight inspection on aircraft G-ABCD, I noticed hydraulic fluid residue near the auxiliary power unit. Small amount, maybe a few drops. Equipment is still operational. Maintenance has been notified."
As they speak, the system is extracting structured data in real-time—incident type (maintenance observation), aircraft registration (G-ABCD), equipment involved (APU), operational status (still functional), actions taken (maintenance notified). All the fields that would normally require separate form inputs are being populated automatically from the natural description.
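In production this extraction is done by a language model emitting structured output against the reporting schema. As a minimal sketch of the idea, the toy rule-based extractor below stands in for the model; the field names (`incident_type`, `aircraft_registration`, and so on) are hypothetical, not the actual Quality SMS data model.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical field schema -- illustrative, not the real product's data model.
@dataclass
class IncidentDraft:
    incident_type: Optional[str] = None
    aircraft_registration: Optional[str] = None
    equipment: Optional[str] = None
    operational_status: Optional[str] = None
    actions_taken: list = field(default_factory=list)

def extract_fields(narrative: str) -> IncidentDraft:
    """Toy keyword extractor standing in for LLM structured output."""
    draft = IncidentDraft()
    # UK-style registration, e.g. G-ABCD
    m = re.search(r"\b([A-Z]-[A-Z]{4})\b", narrative)
    if m:
        draft.aircraft_registration = m.group(1)
    text = narrative.lower()
    if "residue" in text or "hydraulic fluid" in text:
        draft.incident_type = "maintenance observation"
    if "auxiliary power unit" in text or "apu" in text:
        draft.equipment = "APU"
    if "still operational" in text:
        draft.operational_status = "operational"
    if "maintenance has been notified" in text:
        draft.actions_taken.append("maintenance notified")
    return draft

draft = extract_fields(
    "During preflight inspection on aircraft G-ABCD, I noticed hydraulic "
    "fluid residue near the auxiliary power unit. Small amount, maybe a few "
    "drops. Equipment is still operational. Maintenance has been notified."
)
print(draft.aircraft_registration)  # G-ABCD
```

The point is the shape of the pipeline, not the matching rules: speech comes in as free narrative and leaves as populated form fields, with no intermediate form navigation.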
Then the system asks contextual follow-ups via voice, only for details that weren't in the initial account: "What time was this?" "Did you take any photos?" The questions adapt to what's already been captured, so reporters aren't repeating themselves or answering irrelevant questions.
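The follow-up selection logic can be sketched as a filter over the fields the narrative left empty. This is a simplified stand-in: the field names and question wordings are hypothetical, and the real system phrases questions conversationally rather than from a fixed table.

```python
# Hypothetical follow-up prompts keyed by field name.
FOLLOW_UPS = {
    "occurrence_time": "What time was this?",
    "photos": "Did you take any photos?",
    "aircraft_registration": "Which aircraft was involved?",
    "actions_taken": "Has anything been done about it so far?",
}

def next_questions(captured: dict) -> list[str]:
    """Ask only about fields the narrative didn't already populate."""
    return [q for f, q in FOLLOW_UPS.items() if not captured.get(f)]

captured = {
    "aircraft_registration": "G-ABCD",
    "actions_taken": ["maintenance notified"],
}
print(next_questions(captured))
# ['What time was this?', 'Did you take any photos?']
```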
Traditional systems force reporters to classify risk manually: picking severity levels and barrier effectiveness ratings from dropdown menus before they've even finished describing what happened. We inverted this. The system listens to the narrative first, extracts risk factors as the reporter speaks, then applies formal safety management methodology automatically.
Real-time risk profile generation and automated safety assessment
A pilot reports: "At cruise altitude, we got a collision warning directing us to descend immediately. We followed it, descended 500 feet, and the other aircraft passed 300 feet above us. Air traffic control hadn't warned us about the traffic."
As the reporter speaks, the system builds a risk profile. It recognizes "collision warning" as a near-miss event, "300 feet above" as dangerously close proximity, "followed immediately" as effective crew response, and "ATC hadn't warned us" as a procedural failure. The system infers what could have happened: mid-air collision (Major Accident), and what prevented it: automated warning system and crew training (Effective barriers). It calculates Risk Score 80: High Risk requiring management review.
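The assessment step boils down to a lookup: inferred worst credible outcome crossed with barrier effectiveness yields a score, and the score maps to an action band. The sketch below is illustrative only; the matrix values are invented to match the worked example (Major Accident with Effective barriers scoring 80) and are not the actual safety management methodology.

```python
# Illustrative risk matrix -- values invented to match the example in the
# text, not the operator's real scoring scheme.
RISK_MATRIX = {
    ("Catastrophic Accident", "Not Effective"): 2500,
    ("Catastrophic Accident", "Effective"): 500,
    ("Major Accident", "Not Effective"): 400,
    ("Major Accident", "Effective"): 80,
    ("Minor Damage", "Effective"): 5,
}

def classify(score: int) -> str:
    """Map a raw risk score to an action band (thresholds are illustrative)."""
    if score >= 500:
        return "Unacceptable - immediate action"
    if score >= 50:
        return "High Risk - management review"
    if score >= 10:
        return "Medium Risk - monitor trend"
    return "Low Risk - record only"

score = RISK_MATRIX[("Major Accident", "Effective")]
print(score, classify(score))  # 80 High Risk - management review
```

What the model contributes is the two inputs: inferring "what could have happened" and "what stopped it" from the narrative. The lookup itself stays deterministic, which is what makes the resulting scores reproducible.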
Then it confirms via voice: "I've classified this as high risk—a collision avoidance event with effective barriers that prevented escalation. Risk score 80. Does that sound right?" If the assessment feels wrong, reporters can adjust it and the system learns from the correction.
The system tries to understand what happened before asking explicit questions. If the narrative contains enough information—incident severity, barriers present, outcome—it performs the assessment automatically. Only when details are ambiguous does it probe deeper with targeted follow-ups: "Did you maintain full control?" "Was there any point where things could have gotten worse?"
This approach reduces reporter burden significantly. They describe what happened naturally, and risk assessment emerges from that description rather than being forced through a classification matrix. The cognitive load shifts from "what dropdown option matches this situation?" to simply "what happened?"
For pilots filing reports about recent flights, the system already knows the operational context. If a pilot recently flew from London to Manchester on BA1234, the system doesn't make them manually enter flight number, route, aircraft registration, and timing. Instead, it prompts via voice: "Were you reporting about flight BA1234 from London to Manchester today?"
System automatically pulling recent flight data and confirming with the reporter
The reporter confirms or corrects, and the system pre-populates everything it already knows from operational data. This turns what would be five minutes of manual data entry into a five-second confirmation.
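The pre-population step is essentially a lookup against recent operational records followed by a confirmation prompt. A minimal sketch, assuming a hypothetical in-memory flight list and field names; in production these records come from the operator's flight data systems.

```python
from datetime import datetime, timedelta

# Hypothetical operational records standing in for the flight data system.
RECENT_FLIGHTS = [
    {"crew": "pilot-42", "flight": "BA1234", "route": "London-Manchester",
     "registration": "G-ABCD", "landed": datetime(2024, 3, 1, 10, 5)},
]

def suggest_flight(crew_id: str, now: datetime, window_hours: int = 12):
    """Offer the reporter's most recent flight for a one-tap confirmation
    instead of manual entry of flight number, route, and registration."""
    candidates = [f for f in RECENT_FLIGHTS
                  if f["crew"] == crew_id
                  and now - f["landed"] <= timedelta(hours=window_hours)]
    if not candidates:
        return None
    f = max(candidates, key=lambda f: f["landed"])
    prompt = (f"Were you reporting about flight {f['flight']} "
              f"from {f['route'].replace('-', ' to ')} today?")
    return f, prompt

result = suggest_flight("pilot-42", datetime(2024, 3, 1, 14, 0))
print(result[1])
```

If the reporter says yes, every field in the matched record is copied into the report; if not, the system falls back to asking for the context directly.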
We designed for multi-modal input from the start: voice, photos, and operational data working together to accelerate reporting. The goal was to meet reporters wherever they were in the workflow, whether that meant speaking first or leading with a photo.
Visual context analysis and multi-modal reporting workflow demonstration
When a photo is uploaded first, the system analyzes it while pulling in flight context if the reporter recently operated an aircraft. A pilot who just landed uploads a windshield chip photo. The conversation begins with aircraft registration, flight number, and timing already populated, plus an initial observation: "I can see damage to the windshield. Was this discovered during preflight or did it occur in flight?"
By combining voice, visual analysis, and operational context, the system reduces what reporters need to manually input. The faster they can file a complete report, the more likely they are to file it at all.
Voice-first safety reporting is currently in pilot deployment with select operators. Early results are promising:
- Average report time: under 3 minutes, vs 15+ minutes for forms
- Submission rates: up for reports filed from operational environments
- Risk consistency: automated vs manual classification
- User feedback: "Finally something I can do while walking"
Average time to file a report dropped from 15+ minutes with forms to under 3 minutes with voice. The reduction comes from eliminating typing and navigating dropdown menus. Reporters can now file complete reports in the time it used to take just to fill out the header fields.
Reports filed from operational environments—hangars, ramps, crew rooms—increased noticeably during the pilot. Voice made it viable to report immediately rather than deferring until reaching a computer. We're still validating whether this translates to more total reports across all contexts, but the shift toward immediate on-the-go reporting is clear.
Automated risk classification produces more consistent risk scores than manual matrix-based assessment. The same incident description yields the same risk score regardless of who reports it. This consistency matters for trend analysis and prioritization, though we're still validating accuracy against safety management expert review.
Early pilot users describe the experience as effortless: "Finally something I can do while walking to my car" and "I don't have to remember dropdown options, I just tell it what happened." The reduction in friction transformed reporting from a desk-bound task to a mobile workflow.
We're expanding voice reporting to additional operators within TrustFlight's customer base, gathering feedback on edge cases, refining speech recognition, and validating that automated risk assessment aligns with safety management practices across different operational contexts. The technology shows promise, but we're still learning where it works best and where human review remains essential.
Building voice-first safety reporting taught us that the interaction model changes fundamentally when you shift from text to voice. The speed gains come with new trade-offs: text lets reporters think, edit, and control pacing, while voice is immediate and sequential. Confirmation loops, chunked questions, and conversational language aren't nice-to-haves; they're essential. What works for text doesn't translate directly to voice.
Just a few years ago, this project wouldn't have been feasible. Speech recognition was unreliable enough that even commercial options required extensive custom work: training models, building pronunciation dictionaries, fine-tuning for domain vocabulary. The gap between transcription and structured data extraction required significant engineering. But the landscape shifted dramatically with modern AI. What once would have required a team of speech engineers and months of training now works through well-designed prompts and structured outputs.
The biggest workflow improvement wasn't conversation versus forms; it was enabling on-the-go reporting. Voice made mobile viable, and mobile enabled immediate reporting while details were fresh. Maintenance technicians report from hangars. Pilots report walking back from the aircraft. The friction that caused deferred and forgotten reports disappeared.
We succeeded not by building a better form, but by rebuilding the interaction model around how people naturally communicate, while automating the expertise-heavy parts and keeping human judgment where it matters. AI's role in safety management isn't making the final call; it's making sure critical observations reach the people who do. When reports don't get filed because the interface demands too much cognitive overhead, the entire system fails. Working on this felt like standing at the threshold of a new interaction paradigm: interfaces where natural language is the input, and decades of UI conventions (dropdowns, validation rules, form fields) collapse into a conversation. The complexity doesn't disappear; it just moves behind the interface where it belongs. Not sure how this shapes up over time, but for a use case like this one, it's a no-brainer.