Even after the most rigorous safety risk management process, safety concerns will continue to emerge as a system is implemented and operated, for many reasons. Some concerns may only become apparent in real operations. Other concerns may be very subtle and require more experience and data to identify and characterize. Furthermore, new concerns may arise over years or decades as a technology or operation that had been “safe” becomes stressed by changes in the operating environment and other external factors. Last but not least is the obvious reason that no system will ever be perfect.
Thus, the second component of safety management, safety assurance, is critical. Discussions of safety assurance often focus primarily on the data that are collected and the analysis that is conducted; this chapter discusses the challenges and opportunities in data analytics amid transformative changes in technology and operation. However, safety assurance is only truly effective if it not only monitors safety performance but also feeds its insights back to improve safety risk management and to manage change within an organization (detailed in the next chapter).
Safety assurance at some level is expected in all types of aviation operations, and at all stages. For each aircraft, monitoring and quality checks are important throughout production and manufacture, with ongoing safety assurance and management expected as part of the production certificate. Once an aircraft is manufactured, operators conduct routine inspections and monitor for safety conditions arising within it, keeping detailed maintenance logbooks for each aircraft. Often, operators, original equipment manufacturers (OEMs), and maintenance, repair, and overhaul (MRO) operations collaborate on mechanisms for detailed health management that allow them to identify when something mechanical may be degrading and to anticipate an efficient schedule for repair or replacement before it becomes a safety concern. The rare instances of a major malfunction in flight are typically viewed as failures of this vehicle health management cycle, triggering reviews of the design of the failed component and of its manufacturing and maintenance, not only to identify the cause of the failure but also to adjust the schedule of inspections and repair and to confirm that the appropriate assessments are being made to capture failure modes.
Likewise, within flight operations all parties are expected to monitor for, investigate, and address safety concerns as they are detected. Under the more stringent requirements of the operating certificates associated with commercial operations, especially Part 121 operations, operators are required by 14 CFR §119 to have a safety management system (SMS) as detailed in 14 CFR Part 5. This SMS must encompass all the aspects of safety management detailed throughout this report. Some of these aspects establish important features of the operation but then do not require day-to-day activity: the required safety policy and safety promotion components establish organizational structures, policies, and plans; and, for an aircraft operator whose aircraft and operations have already been evaluated through a rigorous safety risk management process during FAA certification and operational approval, the safety risk management component of the SMS largely serves as an extension of the earlier hazard analyses, providing a reference point and directing where safety assurance should be gathering and analyzing data.
Thus, ongoing safety assurance is the most active component of the SMS. It typically involves, and is visible to, line personnel and includes substantial data collection and analysis. The required elements of this safety assurance include the stipulation in 14 CFR §5.71 of what processes and systems "safety performance monitoring and measurement" must monitor, including establishing a confidential and nonpunitive employee reporting system. 14 CFR §5.73 then requires assessments of safety performance, including ensuring compliance with the safety risk controls defined in the operator's own safety risk management process, identifying new hazards and changes in the operational environment that may introduce new hazards, and evaluating the performance of the SMS. 14 CFR §5.75 concludes the regulatory definition of safety assurance with a requirement for continuous improvement: "The certificate holder must establish and implement processes to correct safety performance deficiencies identified in the assessments conducted under §5.73."
Several types of data are commonly applied to the extensive safety assurance programs typical of current air transport:
These types of data can be analyzed in several ways. Of course, any reports or flags of major safety issues can combine multiple types of data into an intensive, focused investigation.
For ongoing analysis and monitoring, particularly for the digital data, current methods in Flight Data Monitoring (FDM) or Flight Operations Quality Assurance (FOQA) tend to focus on establishing some upper and lower bounds on recorded parameters, and then identifying any “exceedances” of these criteria. For example, examining some of the phenomena that are currently viewed as important to monitor, an international consortium has examined the flight parameters defining stabilized approaches, flagging criteria such as deviation of airspeed relative to desired approach speed, and deviation from localizer and glideslope, that should be flagged.1
___________________
1 See https://www.iata.org/contentassets/b6eb2adc248c484192101edd1ed36015/unstable-approaches-2016-2nd-edition.pdf.
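The exceedance screening described above can be illustrated with a minimal sketch. The parameter names and thresholds below are hypothetical placeholders, not the actual FOQA or IATA stabilized-approach criteria; a real program would draw its limits from the operator's approved event definitions.

```python
# Illustrative FDM/FOQA-style exceedance check for a stabilized approach.
# All parameter names and bounds are hypothetical, for illustration only.

APPROACH_CRITERIA = {
    # parameter: (lower bound, upper bound)
    "airspeed_dev_kt": (-5.0, 10.0),     # deviation from target approach speed
    "localizer_dev_dots": (-1.0, 1.0),   # lateral deviation
    "glideslope_dev_dots": (-1.0, 1.0),  # vertical deviation
}

def flag_exceedances(sample):
    """Return (parameter, value) pairs that fall outside their bounds."""
    flags = []
    for param, (lo, hi) in APPROACH_CRITERIA.items():
        value = sample.get(param)
        if value is not None and not (lo <= value <= hi):
            flags.append((param, value))
    return flags

sample = {"airspeed_dev_kt": 14.0, "localizer_dev_dots": 0.3,
          "glideslope_dev_dots": -1.4}
print(flag_exceedances(sample))
# flags the airspeed and glideslope deviations, not the localizer
```

Each flagged flight would then feed the counting and trend analyses discussed later in this chapter.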
Safety management requires each organization to analyze its data internally and frequently, with a particular obligation to be alert and responsive to concerns as they are identified. The air transport industry has found further benefits in collaborating, as highlighted by the activities of the Commercial Aviation Safety Team (CAST) and the Aviation Safety Information Analysis and Sharing (ASIAS) initiative, whose members and contributors span most of the U.S. aviation industry, in partnership with FAA AVS and ATO and including labor representation through pilot and air traffic controller associations. By pooling their data, their analyses, and their decisions on the appropriate changes and improvements to implement, such collaborations not only share insight but also highlight where concerns are system wide and affect many.
Another system-wide assessment comes from the Aviation Safety Reporting System (ASRS). Run by the National Aeronautics and Space Administration (NASA) with funding from FAA, the ASRS is the model and progenitor of many other voluntary safety reporting programs (VSRPs), including those created by individual organizations as part of their SMS. ASRS has been in place for over 45 years and in 2023 received more than 106,000 reports from pilots, air traffic controllers, cabin crew, dispatchers, maintenance technicians, UAS operators, and others.2 Uniquely, de-identified ASRS reports are made public in a searchable database that is widely used for training and education, safety analyses, and research. However, because the reports are de-identified before public release, it can be difficult to then examine specific incidents in detail.
While a great volume of data is collected across the system, it can be hard to integrate and share for two reasons. First, different data sets are often collected using constructs that do not easily mesh. For example, both digital flight data recorder data and air traffic control voice radio logs exist, but relating what was said by and to the flight crew to the other events occurring through the flight, particularly as the aircraft transitions between air traffic sectors using different radio frequencies, requires such effort and time that it is limited to special situations. More pragmatically, different flight data recorders may use different formats or units, different operators' VSRPs may ask different questions, and different observation programs, such as LOSA, may be targeted to, and document, different safety concerns. Thus, there can be significant cost and time required to integrate data, and the choice of what to integrate must be purposeful.
Second, significant policy concerns can limit the open publication of raw data. Information collected by air carriers can include elements considered proprietary to themselves and/or to the manufacturers of equipment
___________________
2 See https://asrs.arc.nasa.gov/docs/ASRS_ProgramBriefing.pdf, retrieved May 30, 2024.
that is being sensed and recorded. Requests to share measures of human performance, behavior, or physiology, including audio and video recordings, raise significant concerns about individual privacy and affect labor relations. Taken out of context, raw data are easily misrepresented or misinterpreted. Thus, an important consideration in the analysis of emerging trends in aviation is understanding who can have access to data and the processes by which data can be shared with others or, ultimately, made publicly available.
One major contribution of ASIAS is its function of gathering major data sources and controlling their dissemination to provide public access where possible, and to provide strong data protections otherwise. This data sharing collaborative has been more than 15 years in the making, reflecting the long time frame required to develop the trust, policies, and formats for the most sensitive data. Its strong data protections, including a clear understanding of who will have access to the data, the analyses it will be used for, and the types of results and outputs of the analyses (and their dissemination), are generally considered important factors in decisions by air carriers, labor associations, and other parties to voluntarily contribute data that they otherwise might not share—indeed, that they otherwise might choose not even to collect.
Stepping back to make a holistic assessment of the types of data that can be collected—and that are required for safety assurance of novel technologies and operations—the committee noted both challenges and opportunities. First, many new technologies afford opportunities to record more data in several ways. For example, they may be highly instrumented to dynamically control specific components or to improve pilot situation awareness; a highly automated UAS, for example, may carry many more sensors than a traditional aircraft; and modern avionics can record many data streams and calculate a more comprehensive picture of the flight conditions. However, once certified, flight-critical software is expensive to modify to record its internal data streams and logic. Thus, decisions about which data should be recorded by increasingly data-intensive flight systems need to be made early in design.
Second, a challenge and opportunity with many transformative technologies is to accommodate the different roles—and even locations—of human operators, and to capture measures of critical interactions between human operators and machine components. The current standard captures digital flight data focused on simple measures of human–machine interaction.
These digital data are supplemented by VSRP reports from onboard pilots who can explain their concerns. However, the current framework of VSRPs may not capture many new forms of human–machine interaction. For example, within the traditional data set, while the autoflight system modes are captured, the pilot's button presses commanding the modes may not be, leaving it ambiguous whether important mode transitions were commanded implicitly by the autoflight system or explicitly by the pilot, and failing to capture interactions such as a pilot's confusion when a button press does not trigger a desired mode change. At the same time, the pilot may be distanced from some direct control functions and not positioned to identify safety concerns in the same way. Thus, this point in time is an opportunity to examine what data need to be recorded to analyze these new types of interaction.
Third, as Leveson notes, "any attempt to determine whether software is safe or not without including the context in which it is to be used will not work" (Leveson, 2023, p. 132). Unlike physical systems, which may mechanically break so that they fail in intended operations (and may continue to function in unusual conditions), software does not mechanically break. Absent some malfunction or abnormality, software continues doing exactly what it is programmed to do. This creates two types of failure modes that are difficult to identify with current safety assurance data sets: (a) when placed outside its intended context, the software's behavior may have undesired or unsafe outcomes; and (b) the specification of the software's behavior can be wrong or incomplete.
Fourth, an opportunity in safety assurance can be created by a shift to performance-based standards, as recommended in the previous chapter. These standards would establish metrics suitable not only for safety risk management but also for safety assurance: metrics of system performance can be directly measured, calculated, and recorded, even in real time during a flight. This particularly helps with assuring functions performed by multiple interacting components (where each could appear to be acting according to its own specifications even as the overall function degrades), and with providing high-level specifications of what the software behind the function should achieve.
Finally, a challenge and an opportunity in safety assurance exist when examining transformative changes to technology and operations. The challenge comes from this report's definition of "transformative," that is, innovations for which current methods do not extrapolate or predict safety well. Safety assurance using traditional data and criteria may miss critical concerns with transformative changes; for example, the flight data recorder data set contains measures of fuel and engine settings that do not apply to electric propulsion, and the criteria for stabilized approaches assume different profiles for airplanes and helicopters, a distinction that can be blurred with novel "powered-lift" category aircraft. As noted in the previous chapter, safety risk analysis is grappling with the challenge of identifying appropriate metrics of safety during certification and operational approval.
The opportunity exists for safety assurance to learn from, and apply, the metrics and criteria employed in safety risk analysis of transformative technologies and operations. Continuing with the examples given in Chapter 2, FAA has recently published special conditions for certifying Safran's proposed electric motor and power system under 14 CFR Part 33,3 and has proposed airworthiness criteria for one powered-lift design, the AgustaWestland AW609 tiltrotor, designated a "special class" aircraft under 14 CFR §21.17(b).4 Such special conditions provide detailed criteria and require the applicant to develop sensors, measures, and methods for certification. These developments can then be left in the design to support subsequent safety assurance. For example, the AW609 proposed airworthiness criteria include several criteria governing its transformative aspect, the transition from vertical flight with the rotors "up" to forward flight using its wing with the rotors "forward," such as §TR.191(d): "Control margin. To allow for disturbances and for maneuvering, the margin of control power remaining at any stage in the transition shall be demonstrated to be adequate." The measures defined for special conditions during certification can also be captured during implementation to support ongoing safety assurance.
Finding 3-1: Proposed transformative changes in technology and operations provide both challenges and opportunities to rethink the appropriate data to collect to support safety assurance. Given that many new technologies can record a wide range of new measures, the opportunity exists to systematically determine what potential data streams best monitor potential safety concerns emerging with innovations. This should particularly consider the safety concerns unique to changing roles of human and machine, and the particular difficulties of monitoring software-intensive functions. This should also capitalize on criteria used in certification and approval of transformative changes, including monitoring of criteria applied earlier in certification and other safety risk management processes based on performance-based standards and on special criteria and conditions.
___________________
3 See https://drs.faa.gov/browse/excelExternalWindow/FR-SCPROPOSED-2024-05101-0000000.0001.
4 See https://www.federalregister.gov/documents/2023/06/09/2023-12310/airworthiness-criteria-special-class-airworthiness-criteria-for-the-agustawestland-philadelphia.
Finding 3-2: VSRPs are a vital data source in safety assurance, given the ability of personnel throughout the NAS to detect and describe anomalies. Changing roles between human and machines, particularly with increasingly automated functions and remotely piloted aircraft, may impact human observability of safety concerns. In such cases, the personnel expected to provide VSRP reports, and the questions they are asked within the report, may need to change. Likewise, further digital data may be required to make up for gaps in, and to make sense of, VSRP reports.
Recommendation 3-1: The Federal Aviation Administration Office of Aviation Safety should determine the process and criteria by which an applicant can demonstrate that their proposed data set is appropriate for safety assurance when implementing transformative changes in technology and operation. This determination should be sufficiently proactive to identify where new sensors and recording mechanisms need to be built into systems during their design and certification to then enable safety assurance during their operation. These data sets can leverage the criteria used in safety risk management (including certification and operational approval) to demonstrate safety of their new attributes through performance-based standards and through the special conditions and criteria.
As noted earlier in this chapter, and in the committee's previous report, current methods for data analysis focus largely on flagging "exceedances" (e.g., airspeed higher or lower than some criterion) and on lagging indicators such as incident reports. This report notes that the definition of "emerging" includes both known constructs that are increasing in magnitude or frequency and unknown constructs that may appear later, even years into an operation. Given the many forms of data analysis enabled by fields such as data mining and machine learning, more and better is possible both in characterizing and predicting known (or hypothesized) safety concerns and in monitoring for the unknown.
A major distinction is between data analysis for phenomena that are completely unknown and for those that are known (or hypothesized). Current methods for measuring exceedances typify analysis for phenomena known so well that a high or low value of a specific variable is mapped to a safety concern. In this case, the exceedances are counted to assess whether they are growing in magnitude overall or are concentrated in specific situations. For example, exceedances such as high or low airspeed may be considered a marker of an unstabilized approach for current operations, and they can be monitored to see if they are occurring at a particular airport or with a particular aircraft type.5 The committee notes that this type of detailed monitoring for conditions that are well characterized and appropriate to the technology and operation is a vital first step in safety assurance.
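The counting step above can be sketched simply. The exceedance records below are fabricated for illustration; a real analysis would also normalize counts by the number of operations at each airport or in each fleet before drawing conclusions.

```python
from collections import Counter

# Hypothetical exceedance records: (airport, aircraft type, exceedance kind).
records = [
    ("KAAA", "A320", "high_airspeed"),
    ("KAAA", "A320", "high_airspeed"),
    ("KBBB", "B737", "low_airspeed"),
    ("KAAA", "B737", "high_airspeed"),
]

# Count raw exceedance concentrations by airport and by aircraft type.
by_airport = Counter(airport for airport, _, _ in records)
by_type = Counter(actype for _, actype, _ in records)

# A real program would divide by operations counts to get rates; here we
# just surface where the raw counts concentrate.
print(by_airport.most_common(1))  # [('KAAA', 3)]
print(by_type.most_common(1))     # [('B737', 2)] or [('A320', 2)] on ties
```

Even this trivial aggregation shows how a recurring exceedance at one airport or in one fleet can be surfaced for targeted follow-up.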
The concept of monitoring for exceedances can be extended to also monitor for violation of critical metrics and assumptions relied on during design and safety risk management. Flight-critical software can assist by reporting whenever its logic detects that a key metric or assumption built into its algorithms has been breached. Likewise, onboard systems or offline analysis can track metrics relative to required performance as developed, for the first time, as part of special conditions, special criteria, or exemptions for certification and operational approval.
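One way software could report on its own assumptions is a runtime guard that logs, rather than silently ignores, any violated design assumption. The sketch below is illustrative only: the assumption names, values, and the idea of a simple in-memory violation list are all hypothetical, not drawn from any certified system.

```python
import logging

# Sketch: record and log whenever a design-time assumption is violated at
# runtime. All assumption names and numbers below are hypothetical.

log = logging.getLogger("assumption_monitor")
violations = []  # in a real system this would feed the recorded data stream

def check_assumption(name, condition, context=None):
    """Record and log a violated design assumption; never raises."""
    if not condition:
        violations.append((name, context))
        log.warning("assumption violated: %s (context=%r)", name, context)
    return condition

# Example checks against assumptions made during safety risk management.
check_assumption("mass_below_limit", 5800 <= 6000, {"mass_kg": 5800})
check_assumption("airspeed_in_envelope", 310 <= 280, {"ias_kt": 310})
print(violations)  # only the airspeed assumption was violated
```

Logged violations then become exactly the kind of exceedance-like event that safety assurance can count and trend over a fleet.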
Further data analysis methods can bring more insight to safety assurance. As a next step, methods such as association rule learning and dependency modeling may look for relationships between variables to enable more multidimensional evaluation of a safety concern. Analysis of the time history of many variables may find time sequences and patterns that precede exceedances. Once characterized, these patterns and sequences often provide a "signature" that a future undesired event may be developing, which can then be monitored for. For example, these methods may more fully characterize unstable approaches as patterns across speed, descent angle, power setting, and control activity rather than just flagging flights with an exceedance in only one variable, such as a low speed. This will provide more insight into exceedances in post hoc analysis; such insights can conceivably also be used in real time to flag to the pilot or others that a potential situation may be developing, where it may support an immediate response.
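A multivariate "signature" check of the kind just described might look like the following sketch. The joint condition, thresholds, and data are hypothetical; a learned signature would come from mining many flights, not from hand-written rules.

```python
# Sketch: flag a multivariate signature jointly, rather than one variable
# at a time. The condition and thresholds below are hypothetical.

def unstable_signature(frame):
    """True when speed, descent angle, and power jointly look unstable."""
    return (frame["speed_dev_kt"] < -5
            and frame["descent_angle_deg"] > 3.5
            and frame["power_pct"] < 30)

def first_signature_time(history):
    """Earliest timestamp (seconds) whose frame matches the signature."""
    for t, frame in sorted(history.items()):
        if unstable_signature(frame):
            return t
    return None

# Fabricated time history of one approach, keyed by time in seconds.
history = {
    10: {"speed_dev_kt": -2, "descent_angle_deg": 3.0, "power_pct": 45},
    20: {"speed_dev_kt": -6, "descent_angle_deg": 3.8, "power_pct": 25},
    30: {"speed_dev_kt": -9, "descent_angle_deg": 4.1, "power_pct": 20},
}
print(first_signature_time(history))  # 20: the pattern forms before t=30
```

Because the signature fires earlier than any single-variable exceedance might, the same logic could in principle support a real-time alert as well as post hoc analysis.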
Progressively more sophisticated analyses may seek to characterize phenomena that are only understood, or hypothesized, in general terms, such as "pilot mode confusion," for which no screening criteria currently exist. For example, in 2008, NASA Aeronautics research demonstrated how algorithms could quickly analyze the time history of pilot button presses across thousands of flights (Budalakoti et al., 2008). Their "sequenceMiner" algorithm learned what typifies "normal" sequences of button presses and then, with that knowledge, identified "abnormal" cases; for example, the algorithm flagged a hitherto unidentified case of a pilot pressing the autopilot switch numerous times at three points: 16 minutes, 4 minutes, and then 1 minute
___________________
5 Committee discussions included experiences in which exceedances could be traced back to specific flight instructors.
before landing. Once flagged, a subject-matter expert analyzed the case and confirmed it as an instance of pilot mode confusion.
Finally, current data analysis capabilities created in other disciplines and applied in other industries can also support the ultimate step: monitoring for effects that are sufficiently unknown that an operator does not know to monitor for them. Methods such as "clustering" large multivariate data sets can identify the patterns of data typifying "normal" operation and then flag "abnormal" situations. These results supplement rather than replace the role of the human analyst in safety assurance, enabling them to focus on the statistically unusual cases.
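A minimal version of this idea models "normal" operation as a single cluster and flags points far from it. Real systems would use richer clustering over many features; the two features and the z-score cutoff below are illustrative assumptions.

```python
from statistics import mean, stdev

# Minimal outlier-flagging sketch: model "normal" flights as one cluster
# (per-feature mean and standard deviation) and flag distant points.
# Features (approach speed, descent rate) and the cutoff are illustrative.

def fit(points):
    """Per-feature (mean, stdev) over the normal data."""
    cols = list(zip(*points))
    return [(mean(c), stdev(c)) for c in cols]

def is_abnormal(point, model, z=3.0):
    """Flag if any feature lies more than z standard deviations out."""
    return any(abs(x - m) / s > z
               for x, (m, s) in zip(point, model) if s > 0)

# Fabricated "normal" flights: (approach speed kt, descent rate fpm).
normal_flights = [(140 + i % 3, 700 + 10 * (i % 4)) for i in range(40)]
model = fit(normal_flights)

print(is_abnormal((141, 705), model))  # False: inside the normal cluster
print(is_abnormal((190, 700), model))  # True: speed far outside normal
```

Flights flagged this way are not safety findings in themselves; they are the statistically unusual cases an analyst should examine first.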
Unfortunately, little research has been conducted and documented in the public domain on how such methods of data analysis can be applied to the full range of analyses that aviation safety assurance requires. The committee knows of no recent research by federal agencies. Likewise, the sensitive nature of aviation data, and the absence of publicly available data sets, have limited open research into extending general data analysis methods to aviation.
Finding 3-3: Extensive data analysis methods suitable for expanding the capabilities of aviation safety assurance have been established in other domains and industries. These methods are particularly relevant to the needs of transformative changes to technologies and operations, where simple definitions of exceedances cannot cover the open questions in the possible safety concerns that need to be monitored for; instead, transformative changes require a range of methods, from those suitable for monitoring for, characterizing, and predicting potential concerns to those for detecting the unknown.
Recommendation 3-2: The Federal Aviation Administration Office of Aviation Safety should determine the process and criteria by which an applicant can demonstrate that their proposed data analysis methods are appropriate for safety assurance when implementing transformative changes in technology and operation, and that they are appropriate for the data set being collected. This determination should specifically support both (1) characterizing phenomena that are only hypothesized or poorly understood as a result of transformative changes; and (2) monitoring for situations and conditions that are unknown and statistically abnormal, and thus should be flagged for further evaluation.
As defined in Chapter 1, safety assurance only starts with collecting and analyzing data; the insights from the analysis must be applied to manage change, to continuously improve the technologies and their operation, and to improve the SMS itself. This is noted in FAA's guidance on SMS:
SMS requires the organization itself to examine its operations and the decisions around those operations. SMS allows an organization to adapt to change, increasing complexity, and limited resources. SMS will also promote the continuous improvement of safety through specific methods to predict hazards from employee reports and data collection. Organizations will then use this information to analyze, assess, and control risk. Part of the process will also include the monitoring of controls and of the system itself for effectiveness. (FAA, n.d.)
This full vision of SMS is also integral to the recognition that AVS cannot directly oversee every aspect of a design and operation: instead, it depends on the organization having the internal systems and processes in place to manage safety. Unfortunately, members of the committee noted instances, both in their own experience and in other recent public discussions of safety breakdowns,6 where organizations implemented SMS in strict accordance with FAA criteria, including the required data collection and analysis aspects, without visible evidence that the SMS was then used to drive change.
Several markers of an organization's use of SMS to effect change can be captured. For example, best practices in flight crew procedures and checklists include welcoming feedback from all for "the assurance that the cold light of the real world is the final test of the goodness of any individual procedure or policy."7 While there is no single target for the number of pilot reports, or for the number of changes made in response, tracking these numbers can flag situations that are clearly off: too few reports suggest a problem in the reporting process; too few changes in response suggest a problem in the organizational response; too many reports suggest an abnormal breakdown that line personnel find concerning; and too many changes may reflect a churn within the organization's policies and procedures that can itself be destabilizing.
___________________
6 Final Report: Expert Panel Review of Section 103 Organization Designation Authorizations (ODA) for Transport Airplanes, 2024.
7 See https://ntrs.nasa.gov/api/citations/19940029437/downloads/19940029437.pdf.
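The band-checking logic behind such markers is simple to sketch. The numeric bands below are entirely hypothetical; in practice an organization would set them from its own reporting history and review them periodically.

```python
# Sketch of the "markers" described above: compare report and change counts
# against expected bands. All band values are hypothetical placeholders.

def assess(count, low, high, label):
    """Classify a count against an expected band for this marker."""
    if count < low:
        return f"too few {label}: possible breakdown in the process"
    if count > high:
        return f"too many {label}: possible abnormal condition or churn"
    return f"{label} within expected band"

print(assess(2, 20, 200, "pilot reports"))       # flags too few
print(assess(50, 20, 200, "pilot reports"))      # within band
print(assess(500, 5, 60, "procedure changes"))   # flags churn
```

Either extreme triggers human review; the point of the marker is to prompt a question, not to render a verdict on the SMS.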
This committee is not alone in its concerns with ensuring that implementation of SMS actually leads to effective safety management. Similar concerns were noted in the recent Expert Panel Review Report for Organization Designation Authorizations (ODA) for Transport Airplanes. One of its recommendations to FAA was to "partner with industry to define clear measures of success for SMS implementation for … organizations and jointly review those measures of success on a regular basis."8 While this expert panel was focused on design and manufacturing, the recommendation is appropriate for all aviation organizations contributing to safe operations: each safety assurance process needs to include assessment of itself and of its organization's broader safety management.
The following chapter continues this discussion of managing change within the organization through the lens of Safety Culture, and provides findings on the role of SMS in helping measure and guide the maturity of safety culture.
Budalakoti, S., A. N. Srivastava, and M. E. Otey. 2008. Anomaly detection and diagnosis algorithms for discrete symbol sequences with applications to airline safety. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews) 39(1):101–113.
FAA (Federal Aviation Administration). n.d. Safety Management System. www.faa.gov/about/initiatives/sms/explained (accessed May 15, 2023).
Leveson, N. G. 2023. An Introduction to System Safety Engineering. MIT Press.
___________________
8 See https://www.faa.gov/newsroom/Sec103_ExpertPanelReview_Report_Final.pdf.