Within the last two decades, machine learning (ML), the main subfield of artificial intelligence (AI), has gained significant momentum across all sectors, driven by a confluence of factors: exponential growth in data generation, advancements in data storage and computing, and innovations in algorithmic techniques. Most notably and recently, the proliferation of deep learning (DL) methods and generative AI tools, such as ChatGPT, is revolutionizing the business landscape. Significant improvements in ML are making it increasingly difficult to discern where it starts and ends in day-to-day operations. Increasingly, ML is being commoditized and offered as a subscription service. New vendors are emerging in transportation, offering AI-powered devices and ML-based analytics services. In some cases, departments of transportation (DOTs) may already be using ML in deployed applications from vendors.
In an era where data are pouring in from new sources, the pace of data growth is exceeding the pace at which state and local DOTs are able to use them. The unprecedented volume of data generated and accessed daily presents both a challenge and an opportunity. Processing and deciphering large amounts of data pose a significant challenge, as DOTs may not have the necessary resources to do so effectively. On the other hand, harnessing the potential of these data can transform transportation systems. ML can be a promising approach to support both real-time and offline applications. ML’s ability to analyze vast datasets, discover patterns, make predictions, and continuously improve through experience represents a paradigm shift from the traditional rule-based methods that agencies have been using for decades, and it offers the opportunity to improve transportation system performance and agency operations. From intelligent traffic management to automated vehicles, and from predictive maintenance to snowplow route optimization, the applications of ML in transportation are diverse and expanding, although still nascent at DOTs today.
While ML offers a promising future, it is not suitable for all problems and sometimes brings new challenges and risks. Agencies need to be aware of ML’s potential benefits as well as its challenges, limitations, and risks to make informed decisions about how it is deployed within the agency and across the transportation network. This guide aims to help state DOTs and other transportation agencies kickstart their journey into the rapidly evolving world of ML and build their agency’s ML readiness and capabilities. Insights in this guide are based heavily on best practices and lessons learned synthesized over the past two years. This guide was developed as part of NCHRP Project 23-16, “Implementing and Leveraging Machine Learning at State Departments of Transportation,” and in creating its content, the team benefited from the accumulated knowledge in other components of the project, including the following:
NCHRP Web-Only Document 404: Implementing and Leveraging Machine Learning at State Departments of Transportation (Cetin et al. 2024) is a conduct of research report that documents the development of the guide and the entire research effort and is available on the National Academies Press website (nap.nationalacademies.org) by searching for NCHRP Research Report 1122: Implementing Machine Learning at State Departments of Transportation: A Guide.
The primary target audience for this guide is state DOT program managers considering ML projects and their technical support staff. This guide is also intended to help agency leadership focus on building and delivering data analytics capabilities across the organization. While state DOTs are the target audience and most of the examples included in this guide are from state DOTs, this guide could support other transportation agencies seeking to advance their understanding and use of ML tools and techniques.
The purpose of this guide is to serve as both an education and decision-making tool to assist state DOTs and other transportation agencies in
This guide seeks to help readers in answering questions at the project, program, and portfolio level, including but not limited to the following:
The guide is broken down into a 10-step roadmap to building agency ML capabilities:
ML is a vast topic that continues to evolve at a rapid pace. Given the breadth and depth of ML, this guide does not attempt to synthesize the entire spectrum of knowledge related to ML methods and applications. Instead, it seeks to highlight valuable insights, best practices, and opportunities for ML specifically for state DOTs and other transportation agencies. The top takeaways that readers are expected to gain from reading this guide are concisely summarized in the callout box. The remainder of the guide will discuss these takeaways in detail and help the reader gain practical guidelines for leveraging ML effectively within the transportation sector.
ML Guide Key Takeaways
EXPECTATIONS: ML is not a panacea and will not work for every transportation use case; however, it is becoming increasingly powerful and widespread. Agencies should understand which transportation problems are currently conducive to ML solutions.
DATA: ML is a bottom-up, data-driven approach capable of discovering highly complex patterns in data, whereas traditional approaches tend to be rule-based.
BENEFITS: ML can bring many benefits, such as improving operational efficiency (e.g., by replacing manual processing of large datasets) and generating new strategies by discovering hidden opportunities.
GAPS: ML may have different needs than traditional approaches, particularly with respect to digital infrastructure (e.g., computing, big data, and storage).
APPLICATION AREAS: Many agencies have found success deploying ML in various application areas, such as operations and asset management.
RISKS: ML project implementations and ML solutions in operation introduce new challenges and risks to agencies, such as the black-box nature of many ML models; these risks should be well understood and managed.
APPROACHES: There are a variety of approaches an agency can take in deploying ML (e.g., custom in-house model development, purchasing ML as a service), with each approach having different benefits and risks.
EVALUATION: Agencies looking to deploy ML solutions should understand typical evaluation metrics for ML applications (e.g., false negative rate), what metrics are desirable to measure for their project, and how these metrics tie into the performance of the transportation system.
SCALING: As with other emerging technology deployments, it is considered a best practice with ML to start small, show value, and then scale up.
COSTS: Data processing and transmission costs could play a significant role in overall ML costs.
WORKFORCE: While agency staff do not have to be ML experts, it is important for them to understand whether and how ML is used by vendors/consultants and be cognizant of potential pitfalls in deployment (e.g., model drift).
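To make the EVALUATION takeaway more concrete, the short sketch below shows how a false negative rate could be computed for a binary ML application. It is an illustration only: the function name `false_negative_rate`, the incident-detection framing, and the label values are hypothetical and are not taken from any project described in this guide.

```python
def false_negative_rate(actual, predicted):
    """Return the share of actual positive cases that the model missed."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    # A false negative is a real event (actual = 1) the model did not flag.
    false_negatives = sum(1 for a, p in zip(actual, predicted) if a and not p)
    # A true positive is a real event the model correctly flagged.
    true_positives = sum(1 for a, p in zip(actual, predicted) if a and p)
    positives = false_negatives + true_positives
    if positives == 0:
        raise ValueError("no positive cases in actual labels")
    return false_negatives / positives


# Hypothetical labels: 1 = incident occurred (or was flagged), 0 = no incident.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

fnr = false_negative_rate(actual, predicted)
print(f"False negative rate: {fnr:.2f}")
```

In this made-up sample, one of the four actual incidents goes undetected. Whether such a miss rate is acceptable depends on how missed detections affect the performance of the transportation system, which is the link the takeaway asks agencies to consider.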