Advancing Bridge Infrastructure Management through Artificial Intelligence: A Comprehensive Review

Deepak Kumar; Anil Agrawal

doi:10.70465/ber.v2i3.45

Authors

Deepak Kumar, Ph.D. ATANE Consulting, 40 Wall St, New York, NY 10005
Anil Agrawal, Dist. M. (ASCE), Ph.D., P.E. Professor of Structural Engineering, The City College of New York, New York, NY 10031

Keywords:

AI-driven monitoring systems, Digital twins, Drones, Internet of Things (IoT), resilient bridges, Shape-memory alloys (SMAs), Smart sensors, damage identification, bridge monitoring, Structural Health Monitoring, Bridge Management

Abstract

Bridge infrastructure serves as a vital component of global transportation systems, yet its aging condition and exposure to increasing environmental and operational stressors necessitate smarter, faster, and more objective approaches to inspection, deterioration modeling, and maintenance management. Traditional methods often suffer from subjectivity, inefficiency, and data limitations. This comprehensive review explores how recent advancements in Artificial Intelligence (AI), including computer vision, natural language processing, deep learning, predictive modeling, robotics, and large language models (LLMs), are revolutionizing the entire bridge management lifecycle. AI-based systems are examined for automated condition detection and rating, data-driven deterioration forecasting, and maintenance prioritization using multi-modal data inputs. Special emphasis is placed on LLMs for extracting actionable insights from unstructured inspection records and facilitating automated decision support. In addition, the review covers AI-driven training and quality assurance tools for inspectors and demonstrates the potential of LLM-powered bots for real-time bridge condition communication. By benchmarking these innovations against traditional practices, this paper identifies current capabilities, integration challenges, and future research directions essential for realizing intelligent, sustainable, and scalable bridge infrastructure management.

Introduction

According to the 2025 Infrastructure Report Card by the American Society of Civil Engineers,¹ the United States maintains over 623,000 bridges, which carry approximately 4.9 billion daily trips as of 2024. Alarmingly, over 42,000 bridges (6.8%) are rated in poor condition. Nearly 45% of US bridges have exceeded their intended design life of 50 years. Despite recent funding boosts (over $40 billion through the Infrastructure Investment and Jobs Act), the estimated rehabilitation need is $191 billion, leaving a significant gap in resources to ensure long-term structural health. This underscores the importance of proactive preservation strategies to extend service life² and highlights the urgent need for innovative, scalable, and cost-effective approaches to bridge inspection and maintenance. The limitations of traditional practices have highlighted the potential of emerging technologies, particularly artificial intelligence (AI), in addressing the growing complexity of bridge asset management.

Conventional bridge management practices rely on manual visual inspections, simplistic deterioration models, and rule-based maintenance planning. Under standards such as the US National Bridge Inspection Standards, bridges are typically inspected on a 2-year cycle and rated using the National Bridge Inventory (NBI) condition scale, which ranges from 9 (new) to 0 (failed). However, these methods are increasingly strained by the growing number of aging structures, rising traffic loads, and limited inspection resources. Visual inspections are inherently subjective, labor-intensive, and prone to variability among inspectors and agencies. For instance, a New York State study found significant inconsistencies in condition ratings for similar bridge components across different inspectors.³ Furthermore, traditional deterioration models often rely on fixed assumptions (e.g., constant transition probabilities or predefined degradation curves) that do not adequately capture the nonlinear and multifactorial nature of real-world bridge deterioration. These models struggle to reflect the complex interactions of traffic loads, material aging, climate effects, and maintenance history, potentially leading to inaccurate life-cycle predictions.

In recent years, AI has emerged as a transformative tool to address these limitations. Technologies such as computer vision (CV), natural language processing (NLP), machine learning (ML), deep learning (DL), and large language models (LLMs) are being integrated into various bridge management tasks, including inspection, deterioration modeling, maintenance optimization, and decision-making. For instance, CV systems powered by convolutional neural networks (CNNs) and vision transformers can detect surface damage (such as cracks, spalling, and corrosion) from drone images and video. ML models, including neural networks and ensemble algorithms, can analyze high-dimensional inspection and environmental data to predict deterioration rates and future condition states. LLMs can extract structured information from unstructured inspection reports, standardize condition descriptions, generate maintenance summaries, and assist decision-makers in identifying urgent repairs. Collectively, these advancements signal a paradigm shift from reactive, subjective evaluations toward proactive, data-driven infrastructure management.

While several studies have explored specific AI applications in bridge management (e.g., Harle⁴), a comprehensive synthesis of how these technologies interact, their current limitations, and emerging trends is still needed. This review addresses that gap by critically analyzing recent AI advancements across four interconnected domains: (1) damage detection, (2) deterioration modeling, (3) predictive maintenance planning and optimization, and (4) decision support via LLMs for processing inspection narratives. Notably, Prakash et al.⁵ reviewed AI-based advancements in structural health monitoring (SHM) of bridges, complementing this review’s broader life-cycle perspective. Additionally, two demonstrations are presented here: (a) an AI-based deterioration modeling example using NBI data, and (b) an application of LLMs to bridge inspection interpretation and decision support, illustrating how AI can mitigate subjective judgment. This paper also discusses the integration challenges, such as data heterogeneity, model interpretability, and deployment barriers in real-world workflows. By benchmarking these innovations against conventional practices, this review aims to provide infrastructure professionals, researchers, and policymakers with a consolidated understanding of how AI can revolutionize bridge lifecycle management.

AI in Bridge Damage Detection

Damage detection is a critical component of bridge inspection and maintenance, focused on identifying structural deficiencies (such as cracks, corrosion, spalling, delamination, etc.) before they escalate into serious safety hazards. Early detection of damage ensures that maintenance can be performed proactively to prevent further deterioration or catastrophic failures. Traditionally, bridge damage has been assessed through manual, visual inspections by trained inspectors, who document observed defects and rate their severity. While widely practiced, manual inspections are time-consuming, labor-intensive, and inherently subjective. Results can vary significantly between inspectors, depending on their experience and the environmental conditions during inspection (such as lighting, access, and weather). Moreover, some areas of bridges (such as tall piers, undersides of decks, and inside box girders) are difficult or dangerous to access, potentially leading to missed detections. To address these limitations and enhance inspection quality, recent efforts have focused on leveraging AI to automate and standardize damage detection using visual and sensor-based data.

Automated visual damage detection

High-resolution images and videos of bridges, collected via cameras, unmanned aerial vehicles (UAVs), or robotic systems, can be analyzed by AI algorithms to detect defects much faster than a human inspector.⁶ These AI-powered systems aim to provide continuous “eyes on bridge” monitoring, reducing inspector workload and improving detection sensitivity for subtle damage.

One of the earliest and most studied applications of AI in bridge inspection is the use of CNNs for crack detection on concrete surfaces. For instance, Huyan et al.⁷ proposed an improved U-Net model called CrackU-Net, which achieved superior pixel-level crack segmentation compared to traditional thresholding or edge-detection methods. By learning from annotated images, their model could identify fine cracks under various lighting and background conditions. Chu et al.⁸ extended this concept with the Tiny-Crack-Net model, a lightweight CNN incorporating multiscale feature fusion and attention mechanisms to detect thin (hairline) cracks. These innovations improved detection accuracy, even on cluttered surfaces with stains, markings, or shadows. Another advancement was the integration of CV models with UAVs for remote inspection. Jeong et al.⁹ showed that UAV imagery processed with ML algorithms could successfully detect cracks and other defects on hard-to-reach parts of bridges (e.g., high columns or arch undersides). UAVs provide consistent viewing angles and distances, enhancing the quality and standardization of data, and paving the way for safer and more efficient inspections.
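Pixel-level crack segmentation of the kind described above is commonly scored by comparing a predicted mask against an annotated one. The following is a minimal sketch of two standard overlap metrics, intersection over union (IoU) and the Dice coefficient. The function names and the tiny 4x4 masks are illustrative assumptions, not taken from the cited studies.

```python
from typing import List, Tuple

Mask = List[List[int]]  # 1 = crack pixel, 0 = background (illustrative format)


def confusion_counts(pred: Mask, truth: Mask) -> Tuple[int, int, int]:
    """Count true-positive, false-positive, and false-negative pixels."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    return tp, fp, fn


def iou(pred: Mask, truth: Mask) -> float:
    """Intersection over union of the crack class."""
    tp, fp, fn = confusion_counts(pred, truth)
    union = tp + fp + fn
    return tp / union if union else 1.0  # two empty masks agree perfectly


def dice(pred: Mask, truth: Mask) -> float:
    """Dice coefficient (equivalent to the pixel-level F1 score)."""
    tp, fp, fn = confusion_counts(pred, truth)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


# Hypothetical diagonal hairline crack and a prediction that misplaces one pixel.
truth_mask = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
pred_mask = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 1], [0, 0, 0, 0]]

print(iou(pred_mask, truth_mask))   # 0.6
print(dice(pred_mask, truth_mask))  # 0.75
```

Because thin cracks occupy very few pixels, a single misplaced pixel moves these scores noticeably, which is one reason hairline-crack detection is hard to evaluate and to optimize.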

As research progressed, DL techniques were extended beyond cracks to detect other surface defects, such as corrosion, spalling, exposed rebar, and delamination, through the development of multi-class damage detection frameworks that can identify multiple defect types in a single pass. For example, Huang et al.¹⁰ introduced BridgeNet, a multi-class segmentation network for concrete surfaces that can label cracks, spalls, and exposed reinforcement bars in an image. Their architecture used a Swin Transformer backbone, a vision transformer model that excels at capturing global context, along with a CARAFE upsampling module to produce finer segmentation boundaries. The transformer backbone helped the model consider the broader image context (important for distinguishing, e.g., a dark line that is a crack from a shadow or surface texture), while the CARAFE upsampling improved the precision of defect mask edges. BridgeNet showed improved recognition of complex damage patterns, though challenges remained, particularly in avoiding false negatives (missed defects) and false positives (wrongly flagged defects) under conditions such as noisy backgrounds or variable lighting. Many of these issues are echoed in a recent review of vision-based crack detection,¹¹ which highlights the need for robust models that handle noise and variability in field conditions.

For steel bridges, detecting corrosion, often appearing as rust or metal thinning, presents unique challenges. Liu et al.¹² introduced CrackFormer, a transformer-based model capable of fine-grained segmentation for cracks and corrosion. Shamsabadi et al.¹³ proposed using a vision transformer combined with image stitching to localize rust on large steel surfaces. By taking overlapping images and stitching them into a panorama, then running a transformer-based segmentation, they achieved a more comprehensive corrosion map of, for example, an entire girder web. These studies reflect a growing emphasis on improving the robustness and generalizability of damage detection, bridging the gap between lab settings (clean images of obvious cracks) and field settings (where dirt, shadows, graffiti, or unusual damage can confuse models).

A notable trend involves adapting foundation models, large pre-trained models from the CV field, for infrastructure tasks. For instance, Li et al.¹⁴ combined the general-purpose Segment Anything Model (SAM), developed by Meta AI, with a specialized network for concrete damage segmentation. SAM was used to generate initial mask proposals for potential damage regions (essentially creating pseudo-labels), and then a dual-backbone CNN (MambaNet + ResNet) was trained on those masks to finalize the segmentation. This approach dramatically reduced the need for manual annotation, one of the biggest bottlenecks in training AI for new tasks. This hints at a future where large pre-trained models, whether for image segmentation or even for multimodal analysis, can be quickly adapted to infrastructure tasks, saving development time.

In parallel to segmentation models, object detection frameworks like the YOLO (You Only Look Once) family and Faster R-CNN have been extensively adapted for bridge defect identification. These models draw bounding boxes around defects and are often faster, making them suitable for real-time applications (e.g., a drone that detects a crack and immediately highlights it on a live video feed). More recently, Zhu et al.¹⁵ introduced BD-YOLOv8s (Bridge Defect YOLOv8-small), an enhanced version of the latest YOLOv8 tailored specifically for bridge defects. The model incorporated attention mechanisms like ODConv (omni-dimensional convolution) and CBAM (convolutional block attention module), as well as improved upsampling, to better detect small or faintly visible defects amid complex backgrounds. On a test dataset of various bridge surface defects, BD-YOLOv8s achieved a mean average precision of 86.2%, outperforming the baseline YOLOv8s by about 5%. Importantly, the enhanced YOLO model not only increased accuracy but also reduced false positives and missed detections, which is crucial for deployment. These high-performing models are opening the door to more practical, deployable AI inspection solutions. Lightweight versions of YOLO and other detectors have been embedded on edge devices for real-time analysis. For example, a drone with a small processor can run a small YOLO model to highlight cracks in real time as the camera scans a bridge. This means inspectors on site could get instantaneous feedback: as they walk or fly a drone around a bridge, an app could be circling areas of concern on the live video feed.
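Detection metrics such as mean average precision are built on a simpler step: matching predicted boxes to annotated boxes by their overlap. The sketch below illustrates that building block only, computing box IoU and then precision and recall at a single IoU threshold. It is not the evaluation code of the cited study; the box format, the 0.5 threshold, and the example coordinates are assumptions.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed format


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    detections: List[Tuple[Box, float]],
    truths: List[Box],
    iou_threshold: float = 0.5,
) -> Tuple[float, float]:
    """Greedily match detections (highest score first) to unmatched truths."""
    matched = set()
    tp = 0
    for box, _score in sorted(detections, key=lambda d: d[1], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, truth in enumerate(truths):
            if idx in matched:
                continue
            overlap = box_iou(box, truth)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            tp += 1
    precision = tp / len(detections) if detections else 0.0
    recall = tp / len(truths) if truths else 0.0
    return precision, recall


# Hypothetical frame: two annotated defects, one correct detection, one false alarm.
truth_boxes = [(0, 0, 10, 10), (20, 20, 30, 30)]
detected = [((1, 1, 10, 10), 0.9), ((50, 50, 60, 60), 0.6)]
print(precision_recall(detected, truth_boxes))  # (0.5, 0.5)
```

Average precision repeats this matching while sweeping the confidence threshold, and mean average precision averages the result over defect classes.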

Some projects have developed end-to-end digital inspection pipelines in which images captured by UAVs or climbing robots are automatically processed by AI, and the results are mapped onto a three-dimensional (3D) model of the bridge (e.g., a digital twin or Bridge Information Model [BIM]). Luo et al.¹⁶ noted that combining advanced data collection (such as high-resolution drone imaging or wall-climbing robots for close-ups) with real-time AI analysis can greatly enhance inspection frequency and safety, especially for components that are traditionally hard to access. Instead of sending humans into dangerous positions, robots gather the data and AI processes it, with humans reviewing the summarized results.

The integration of AI-based damage detection into practice is also fostering multimodal inspection approaches. Recently, researchers have begun combining visual data with other sources, such as textual notes or sensor readings, to obtain a fuller picture of bridge health. These vision-language models can detect and localize defects and also generate human-readable descriptions or even interpret inspector comments. For instance, Zhang et al.¹⁷ explored using transformers that take both image features and associated text (such as previous inspection remarks or design details) to perform a cross-modal analysis of bridge condition. Such a model might correlate a described issue (“leaking joint causing rust streaks”) with what it sees in the image, leading to a more robust identification. Similarly, Chun et al.¹⁸ developed a DL model with an attention mechanism to automatically generate captions describing detected bridge damage in images. Their system could look at an image of, for example, a cracked bearing with corrosion and output a sentence like, “Extensive rust and a large crack observed at the base of the bearing.” This represents a step toward intelligent inspection assistants that produce structured reports from raw data, a task that typically requires an experienced inspector. By translating visual findings into text, these models make it easier to integrate AI findings into existing documentation workflows and assist with training new inspectors, as the AI’s description can serve as a first draft or a second opinion.

Despite rapid progress, several limitations still constrain the full-scale deployment of AI-based damage detection in operational bridge inspections. A foremost challenge is the lack of large, diverse, well-annotated datasets that cover the myriad defect types, materials, and environmental conditions encountered in the field. Many research models are trained on relatively small datasets or images captured under controlled conditions, which limits their generalizability “in the wild.” When faced with unusual scenarios (e.g., an uncommon bridge design, graffiti obscuring a crack, or heavy shadows), models can falter. To address this, the community recognizes the need for standardized, open benchmark datasets for bridge damage—spanning different climates, structure types (concrete, steel, and timber), and containing examples of cracks, rust, spalls, etc., under various conditions. Initiatives to crowdsource and share bridge images could accelerate this effort.

Another issue is that multi-class defect detection is inherently complex: defects often co-occur (e.g., a crack with corrosion along it) or one defect causes another (such as spalling leading to exposed rebar). Overlapping or adjacent defects can confuse classification. The risk of misclassification (e.g., tagging a stain as a crack or vice versa) and false negatives (missing a small defect) remains, especially when defects are very small relative to the image or when the background is complex. Improving model robustness is an active area of research—techniques such as data augmentation (simulating difficult conditions), domain adaptation (tuning models to new bridge types using a small amount of new data), and ensemble models (combining outputs of multiple models) are being explored.

Model explainability is also essential for trust. Engineers are more likely to trust an AI that can highlight why it labeled a region as a crack (e.g., by visualizing which pixels or patterns triggered the decision) than a completely opaque system. Some recent models include attention heatmaps to show regions of interest, which can serve as a pseudo-explanation to the user by highlighting the actual crack pixels.

Automated robotic and sensor-enabled inspection

In addition to visual inspections, impact sounding techniques such as chain drag or hammer tapping have been widely used to identify subsurface defects like delamination and voids, especially in concrete bridge decks. These techniques generate acoustic signals that change depending on the underlying material condition. However, traditional sounding is limited by environmental noise and human subjectivity in interpreting the acoustic response. AI technologies, combined with robotics and advanced sensing, are redefining how bridge inspections are performed, shifting the paradigm toward faster, safer, and more objective assessments. Key developments include the use of UAVs, ground robots, and automated sensing systems to collect rich data, which AI algorithms then analyze in real time or near real time.

Recent advancements integrate AI and ML with smart sounding systems to automatically analyze these audio signals and detect hidden damage. For instance, advanced signal processing techniques combined with ML algorithms have improved the accuracy of subsurface defect detection from impact-sounding data, whether collected by automated sounding with robotic crawlers or recorded with microphones and laser Doppler vibrometers.¹⁹,²⁰ These approaches significantly reduce human subjectivity and noise interference, improve reproducibility, and enable the extraction of diagnostic features from complex audio data, thereby enhancing the reliability of subsurface damage detection.

The data from these robotic inspections are often voluminous: 3D point clouds, thousands of images, and continuous sensor readings, far beyond what a human could analyze manually in a short time. This is where AI, especially CV and pattern recognition, plays a crucial role in processing and interpreting the data. For instance, the imagery from a drone can be fed into the crack/corrosion detection algorithms discussed in the previous section, resulting in annotated defect maps without an inspector reviewing each image. Zhang et al.²¹ integrated damage segmentation outputs into 3D point cloud models for bridge deck mapping. In essence, AI can flag areas where the point cloud deviates from the expected geometry (potential displacement or sag) or where densely clustered cracks are detected.

Beyond vision and audio signals, SHM systems complement inspections by continuously recording structural responses (such as strain, acceleration, tilt, and temperature). For instance, ML models can be trained on the normal vibration patterns of a bridge and can alert operators when these patterns change, possibly due to a crack, cable loss, or foundation issue. Data-driven decision-making in bridge operation and management (O&M) is enhanced when SHM sensor inputs are combined with inspection results.²² An example of this approach is using acceleration data to identify when a bridge’s natural frequencies drop (often a sign of stiffness loss due to damage); AI could continuously analyze the signal and issue a warning if a threshold change is detected, prompting an immediate inspection. A significant shift enabled by these technologies is the move from static, periodic inspections to continuous or predictive inspection frameworks.

This convergence of AI, data fusion, and simulation lays the foundation for digital twins of bridges: dynamic, virtual replicas that update as the bridge’s condition evolves. AI plays a central role in constructing, maintaining, and leveraging digital twins for comprehensive bridge lifecycle planning. A bridge’s digital twin aggregates information from design (BIM models), inspections, sensors, and environmental data to provide a holistic, up-to-date representation of its condition and performance. For example, imaging data processed by AI can continuously update the twin’s defect inventory and surface condition. AI-based deterioration models can simulate how the bridge’s components will degrade under various scenarios within the twin, and optimization algorithms can run on the twin to recommend optimal interventions (maintenance or strengthening actions) by virtually “testing” them on the model. This allows engineers to query the digital twin with questions like, “What if we delay painting by 2 years?” to estimate predicted outcomes on condition and cost. For example, Ghavidel et al.²³ developed an AI-based decision support system that considers multi-threat risks to suggest long-term management strategies. In a digital twin, such an AI could highlight that “if you repair joint seals now, you’ll slow deck deterioration enough to extend repainting by 5 years, saving cost.” Essentially, the twin becomes an AI-based advisor, performing millions of virtual what-if experiments to guide optimal lifecycle management. Gao et al.²⁴ created a bridge digital twin by integrating geographic information system (GIS) and BIM data with operational data, enabling practical O&M planning through a centralized platform. Along similar lines, Mohamed et al.²⁵ developed a Bridge Information Modeling framework to merge inspection data with 3D bridge models, facilitating maintenance intervention planning. Araya-Santelices et al.²⁶ review how AI, UAVs, and BIM can work together for bridge management, underscoring the importance of technology integration.
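The natural-frequency check described in the SHM discussion above can be illustrated with a small sketch: estimate the dominant vibration frequency from an acceleration record and raise a flag when it falls by more than a set fraction relative to a baseline. This is a simplified illustration under stated assumptions, not a method from the cited works. The 5% threshold, the 50 Hz sampling rate, and the synthetic single-mode signals are all hypothetical, and a field system would use an FFT library and account for temperature and traffic effects.

```python
import cmath
import math
from typing import List


def dominant_frequency(samples: List[float], sample_rate_hz: float) -> float:
    """Return the frequency (Hz) of the largest discrete Fourier transform peak."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the static (DC) offset
    best_bin, best_mag = 0, 0.0
    for k in range(1, n // 2):  # naive DFT; adequate for a short illustration
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(centered)
        )
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)
    return best_bin * sample_rate_hz / n


def frequency_drop_alert(
    baseline_hz: float, current_hz: float, max_relative_drop: float = 0.05
) -> bool:
    """Flag when the frequency has dropped by more than the allowed fraction."""
    return (baseline_hz - current_hz) / baseline_hz > max_relative_drop


def synthetic_signal(freq_hz: float, sample_rate_hz: float, seconds: float) -> List[float]:
    """Hypothetical single-mode acceleration record for demonstration only."""
    count = int(sample_rate_hz * seconds)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate_hz) for i in range(count)]


rate = 50.0
baseline = dominant_frequency(synthetic_signal(2.0, rate, 10.0), rate)
current = dominant_frequency(synthetic_signal(1.8, rate, 10.0), rate)
print(baseline, current, frequency_drop_alert(baseline, current))  # 2.0 1.8 True
```

In practice the threshold would need calibration, because natural frequencies also shift with temperature and live load; separating those effects from damage is where learned models of normal behavior become useful.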

Despite the promise, building true digital twins and fully autonomous inspection systems faces several challenges. One is standardization: data from different sources (such as drones, sensors, and human inspections) must be formatted consistently and be interoperable. Another challenge is computational; processing continuous streams from dozens of sensors, plus running complex ML models, can be intensive. This is driving research into real-time edge computing (processing data on drones or on-site devices before sending summaries to the cloud). Cost is another significant challenge for both simple and complex bridges, given the large number of bridges requiring monitoring and maintenance.

Challenges and scope in AI-based damage detection

Integration of vision-based AI with other sensing modalities is an exciting direction. Vision alone cannot detect internal issues or stress before surface damage appears, but sensors can. More broadly, AI-based damage detection has moved from proof-of-concept to a stage where field implementations are feasible. Continued focus on creating robust, generalizable models and integrating them into the inspector’s toolkit, rather than trying to replace the inspector, will be key. Ultimately, the vision is an AI-augmented inspection process: inspectors use drones/robots to collect data, AI processes the data and highlights concerns, and human experts make the final judgments and plan interventions. Such a process could dramatically increase the frequency and consistency of bridge inspections, catching issues earlier and providing more data-driven evidence. To achieve this, future work should invest in large-scale data collection efforts, domain adaptation techniques to handle new environments, and explainable AI (XAI) coupled with human-in-the-loop systems to ensure that automated findings translate into trusted actions on the ground.

Human oversight remains critical. All these AI and robotic tools are ultimately decision support for engineers, not outright replacements. Regulations still require certified inspectors to be responsible for the findings, so the near-term future is likely “AI-assisted inspection” rather than “AI-only inspection.” Recognizing this, considerable effort is going into making AI results explainable and user-friendly; for example, augmented reality (AR) interfaces where an inspector wearing AR glasses sees AI annotations (such as “Crack here, ~0.3 mm wide”) superimposed on the actual bridge as they inspect it. This kind of human-in-the-loop system combines the strengths of AI (speed, consistency, and handling big data) with human judgment for confirmation and context-based decision-making.

In conclusion, AI, robotics, and SHM are collectively transforming bridge inspection from a labor-intensive, interval-based activity into a continuous, data-rich process. UAVs and robots serve as extensions of the inspector’s reach, collecting data that AI methods analyze for actionable insights. Over time, as these technologies mature and gain trust, we can expect more frequent inspections with less disruption, improved detection of hidden or early-stage problems, and ultimately longer-lasting bridges due to timely maintenance. Future research and investment should focus on solving integration issues (both technical and organizational), developing training and certification for these new tools, and ensuring that the vast data generated is translated into knowledge that engineers can effectively use to make bridges safer and more reliable.

In the next section, we examine how AI is being applied to model bridge deterioration over time, enabling more accurate forecasting and informed maintenance planning.

AI in Bridge Deterioration Modeling

Bridge deterioration modeling is a foundational element of infrastructure asset management, guiding decisions on inspection frequency, rehabilitation planning, and budget allocation. Traditionally, this task has been addressed using probabilistic and statistical methods such as Markov chains, Weibull distributions, and hazard-based regression. While these approaches offer valuable insights, they often rely on rigid assumptions, such as constant transition probabilities or predefined deterioration rates, that may not capture the dynamic, nonlinear nature of degradation.²⁷,²⁸ Their applicability is limited in complex environments where deterioration is influenced by multiple interacting factors (such as traffic loads, materials, climate, and maintenance history). For example, simple Markov models assume memoryless transitions between condition states, an assumption that is often violated in real bridge data, where past condition and repair actions influence future performance.
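To make the Markov assumptions concrete, the following minimal sketch propagates a condition-state distribution through a fixed transition matrix. The three states and all transition probabilities are hypothetical values chosen for illustration and are not calibrated to NBI data.

```python
from typing import List


def step(distribution: List[float], transition: List[List[float]]) -> List[float]:
    """Advance the state distribution by one inspection cycle."""
    n = len(distribution)
    return [
        sum(distribution[i] * transition[i][j] for i in range(n)) for j in range(n)
    ]


def forecast(initial: List[float], transition: List[List[float]], cycles: int) -> List[float]:
    """Apply the same transition matrix repeatedly (constant probabilities)."""
    dist = list(initial)
    for _ in range(cycles):
        dist = step(dist, transition)
    return dist


def expected_rating(distribution: List[float], ratings: List[int]) -> float:
    """Probability-weighted mean condition rating."""
    return sum(p * r for p, r in zip(distribution, ratings))


# Hypothetical three-state model: ratings 9, 8, and 7 (lowest state is absorbing).
ratings = [9, 8, 7]
transition = [
    [0.9, 0.1, 0.0],  # from rating 9: stay with 0.9, drop to 8 with 0.1
    [0.0, 0.8, 0.2],  # from rating 8: stay with 0.8, drop to 7 with 0.2
    [0.0, 0.0, 1.0],  # rating 7 cannot deteriorate further in this toy model
]
after_two = forecast([1.0, 0.0, 0.0], transition, cycles=2)
print(round(expected_rating(after_two, ratings), 2))  # 8.79
```

Each step depends only on the current distribution, never on how long the bridge has held a rating or whether it was recently repaired. That is the memoryless property, and the constant matrix is the fixed-transition assumption criticized above.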

To overcome these limitations, recent research has turned to AI-driven techniques, especially ML and DL, which can learn deterioration patterns directly from empirical data without strict assumptions about variable relationships. These models offer greater adaptability and predictive accuracy by learning from large volumes of historical inspection records, sensor measurements, and environmental data.²⁹ For instance, Nguyen and Dinh³⁰ developed an Artificial Neural Network (ANN) model using historical inspection records to classify bridge deck condition states with high accuracy, outperforming traditional regression-based methods in predicting when decks fall below a given rating. Similarly, Althaqafi and Chou³¹ demonstrated that a simple ANN can reliably learn deterioration trends, reinforcing the value of neural networks for prediction. Comparative studies have also demonstrated ML’s benefits; for instance, Assaad and El-adaway³² further showed that data-driven models consistently outperformed traditional methods in deck deterioration prediction. Hybrid techniques are also emerging, combining optimization algorithms with ML algorithms; for example, Jiang et al.³³ used a whale optimization algorithm coupled with an Extreme Learning Machine to improve condition prediction accuracy.

Among supervised ML methods, ensemble learning techniques such as Random Forest (RF), Gradient Boosting Machines (GBMs), and eXtreme Gradient Boosting (XGBoost) have shown strong predictive performance for bridge condition modeling. These models are particularly valued for their robustness to noisy or missing data, resistance to overfitting, and ability to capture complex interactions among variables. Almarahlleh et al.³⁴ and Fard et al.³⁵ implemented a suite of ensemble classifiers to estimate bridge deck condition or damage severity, demonstrating that models like RF and GBM achieved superior results compared to baseline statistical classifiers. Likewise, Rashidi Nasab and Elzarka³⁶ reported that ensemble approaches consistently outperformed single learners for predicting Ohio bridge deck deterioration, especially in distinguishing deteriorated vs. good condition classes. Fang et al.³⁷ further demonstrated that ML-based condition predictions, when used to supplement inspectors’ visual assessments, improved the rating consistency across inspectors. Collectively, these findings underscore the growing role of supervised ML in condition modeling and lifecycle forecasting.

Building on the successes of ML, DL architectures offer even greater potential, especially for modeling the temporal and spatial characteristics of deterioration. Recurrent Neural Networks (RNNs), and particularly Long Short-Term Memory (LSTM) networks, have proven effective in capturing long-term dependencies within sequential inspection data. For instance, Miao et al.³⁸ compared an RNN model against a traditional Markov chain and found that the AI model significantly outperformed the Markov approach in forecasting bridge condition ratings. Abu Dabous et al.³⁹ introduced a hybrid semi-Markov model with an embedded LSTM to estimate time-to-failure for bridge decks, achieving higher forecasting accuracy than a time-homogeneous Markov model. The LSTM’s ability to retain information over long sequences makes it well-suited for modeling gradual degradation that unfolds over years or decades, including periods of slow decline punctuated by sudden drops after extreme events or delayed maintenance. CNNs, typically applied to two-dimensional image arrays, are used to extract spatial deterioration patterns from images. Researchers have combined CNNs with LSTMs to form hybrid spatiotemporal models that leverage both visual/spatial features and temporal trends. For instance, Bui-Tien et al.⁴⁰ proposed a combined one-dimensional CNN and LSTM framework for time-series analysis of bridge damage indicators, achieving higher accuracy (98.4%) than either the standalone CNN or LSTM models in predicting structural condition over time.
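Sequence models such as LSTMs are typically trained on fixed-length windows of past observations paired with the next value to predict. The sketch below shows only that data-preparation step for a single rating history. The window length and the example history are hypothetical, and the cited studies may prepare their inputs differently.

```python
from typing import List, Tuple


def make_windows(ratings: List[int], window: int) -> List[Tuple[List[int], int]]:
    """Pair each run of `window` consecutive ratings with the rating that follows."""
    pairs = []
    for start in range(len(ratings) - window):
        history = ratings[start:start + window]
        target = ratings[start + window]
        pairs.append((history, target))
    return pairs


# Hypothetical biennial deck ratings for one bridge.
deck_history = [9, 9, 8, 8, 7, 7, 6]
for history, target in make_windows(deck_history, window=3):
    print(history, "->", target)
```

Unlike the Markov formulation, each training example carries several past ratings, so a model can in principle learn that a deck which has held a rating for a long time behaves differently from one that just dropped.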

These developments align with a broader trend toward XAI in infrastructure, which seeks to make ML models more transparent and trustworthy for engineering decisions. Trach et al.,⁴¹ for example, emphasized the need for AI models that provide not just predictions but also explanations grounded in engineering concepts. Efforts are underway to embed domain knowledge into ML pipelines and to use interpretable modeling techniques to reveal how factors such as age, traffic, or climate influence deterioration predictions. By integrating structural engineering principles and constraints into AI models (“physics-informed AI”), researchers aim to bridge the gap between predictive performance and actionable insight. For instance, Khatami et al.⁴² augmented data-driven models with physics-based deterioration factors, demonstrating improved realism in predictions. This approach makes AI tools more acceptable to transportation agencies and engineers who require clear, auditable decision support.

Despite the notable progress, several challenges limit the widespread adoption of AI-based deterioration models. A major issue is the scarcity of high-quality, labeled datasets that integrate condition ratings with contextual variables (such as environment, materials, and interventions). In many regions, inspection data are inconsistent or incomplete across agencies, hindering the training and validation of robust models. Additionally, most AI models, particularly DL networks, function as “black boxes,” offering limited insight into how specific inputs affect predictions. This lack of interpretability raises concerns for infrastructure managers who must justify decisions in safety-critical contexts. Moreover, few AI models explicitly handle issues like data censoring (e.g., when components are replaced or when there are no failure observations) or multistate degradation, which are common in bridge lifecycle data.

To address these issues, recent studies advocate incorporating XAI techniques, transfer learning for data-scarce scenarios, and hybrid modeling frameworks that embed engineering knowledge into ML. For example, using physics-based simulation data to pre-train models or applying domain constraints (such as enforcing monotonic deterioration with age) can improve generalization. Integrating real-time sensor data with predictive models is another promising direction, enabling more responsive and condition-based maintenance planning. Similarly, developing generalized AI frameworks that can model multiple component types (such as decks, joints, bearings, and girders) within a single system is seen as important for holistic bridge management. Such multicomponent models could better understand interactions (e.g., how a leaking joint accelerates beam deterioration) than siloed models. Moving forward, collaboration between academia, government agencies, and industry will be essential to create trustworthy, scalable, data-driven deterioration models that enhance the resilience and sustainability of bridge infrastructure. To demonstrate the practical application of AI in deterioration modeling, the following section presents a case study using an ANN trained on historical bridge inspection data to predict deck condition ratings.
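One simple way to illustrate the monotonic-deterioration constraint mentioned above is as a post-processing step on model output. The sketch below assumes that ratings may only rise in years where a repair is recorded; the function name `enforce_monotonic` and the `repair_years` argument are hypothetical, and constraints built into the model or loss function would be implemented differently.

```python
def enforce_monotonic(ratings, repair_years=()):
    """Clip predicted ratings so they never rise unless a repair is recorded.

    ratings: predicted condition ratings, one per year (index = year).
    repair_years: year indices at which an intervention resets the ceiling.
    """
    constrained = []
    ceiling = None
    for year, rating in enumerate(ratings):
        if ceiling is None or year in repair_years:
            ceiling = rating  # first year, or a repair allows improvement
        else:
            ceiling = min(ceiling, rating)  # otherwise ratings cannot rise
        constrained.append(ceiling)
    return constrained


raw = [8, 7, 7.4, 6, 6.5]
print(enforce_monotonic(raw))
print(enforce_monotonic(raw, repair_years={2}))
```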

Demonstration of ANN for bridge deck rating prediction

A simple ANN classifier was trained on historical NBI inspection data to predict deck condition ratings. While the NBI uses a 0–9 scale for condition ratings, this demonstration focused on the 9–4 range, as ratings below 4 typically indicate structures that are no longer considered serviceable.

The ANN was trained on a subset of bridges in New York State, specifically, concrete decks with uncoated rebars and owned by a single state agency. Fig. 1 shows the ANN-predicted average deck condition trajectory vs. the actual data from this subset. The model broadly follows the general deterioration trend, indicating that the ANN captured the overall pattern of decline in condition over time.

To evaluate model interpretability, a permutation-based feature importance analysis was conducted (Fig. 2). As expected, age emerged as the most influential predictor—older bridges generally show lower condition ratings. Other meaningful features included the average wearing surface permeability (an indicator of chloride ingress potential), deck design category, and ownership type (state). Less influential features included annual average daily traffic, bridge length category, and leaky joints—likely due to their encoding as categorical variables. Had continuous values been available, their influence might have been more pronounced. Nonetheless, this interpretability analysis supports that the ANN relied on reasonable engineering variables (e.g., age, material, exposure) rather than spurious correlations.
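The idea behind permutation-based feature importance can be sketched as follows: each feature column is shuffled in turn, and the resulting increase in prediction error is taken as that feature's importance. The code is a generic illustration, not the implementation used for Fig. 2; `toy_model`, its coefficients, and the two-feature dataset are hypothetical.

```python
import random


def mean_abs_error(model, rows, targets):
    return sum(abs(model(r) - t) for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, n_repeats=20, seed=0):
    """Average error increase after shuffling each feature column."""
    rng = random.Random(seed)
    baseline = mean_abs_error(model, rows, targets)
    importances = []
    for col in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            shuffled = [r[col] for r in rows]
            rng.shuffle(shuffled)
            # Rebuild the rows with only this column permuted.
            permuted = [
                r[:col] + [v] + r[col + 1:] for r, v in zip(rows, shuffled)
            ]
            error = mean_abs_error(model, permuted, targets)
            increases.append(error - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical stand-in for a trained model: rating depends on age only."""
    age, _traffic_class = row
    return 9.0 - 0.1 * age


rows = [[float(age), float((age * 7) % 5)] for age in range(0, 50, 5)]
targets = [toy_model(r) for r in rows]
scores = permutation_importance(toy_model, rows, targets)
print({"age": scores[0], "traffic_class": scores[1]})
```

In this toy setting the score for the traffic feature is zero because the model ignores it, which is the same signal used above to judge which variables the ANN actually relied on.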

To explore temporal modeling, a sequence-to-sequence (Seq2Seq) model was also developed based on the approach of Saremi et al.⁴³. This RNN encoder–decoder architecture was trained on longitudinal bridge inspection data to predict a future sequence of condition ratings based on a sequence of past observations. The model achieved approximately 90% overall accuracy, as shown in the confusion matrix in Fig. 3, where most predictions fall along the diagonal, suggesting many correct classifications. However, a deeper analysis revealed concerning behavior. As illustrated in Fig. 4, the Seq2Seq model’s predictions were dominated by the previous rating feature, with all other features, including age, traffic volume, and environmental exposure, contributing only marginally. This heavy reliance on the prior rating suggests the model may be memorizing rather than learning true deterioration dynamics. In infrastructure datasets, ratings often remain constant over several inspection cycles, so even naïve models can achieve high accuracy by simply copying the last known value.

This form of overfitting, often masked by high accuracy metrics, is a common pitfall in deterioration modeling. Similar issues were observed by Sowemimo et al.,⁴⁴ who applied recurrent networks to bridge rating time series yet struggled to ensure that the models learned true deterioration characteristics. This highlights the importance of evaluating whether AI models are truly learning from meaningful predictors or simply exploiting temporal inertia. In this case, the Seq2Seq model failed to capture how factors such as increased truck traffic, harsh environments, or delayed maintenance influence condition trajectories. It was effectively a high-performing shortcut, not a robust forecasting tool.
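A quick diagnostic for this pitfall is to compare any model against a persistence baseline that simply copies the previous rating, and to report accuracy separately on the inspections where the rating actually changed. The sketch below uses three hypothetical rating histories; the function names are illustrative and are not part of the Seq2Seq demonstration.

```python
def persistence_accuracy(histories):
    """Accuracy of predicting each rating as a copy of the previous one."""
    hits = total = 0
    for ratings in histories:
        for previous, current in zip(ratings, ratings[1:]):
            hits += previous == current
            total += 1
    return hits / total


def accuracy_on_changes(histories, predict):
    """Accuracy restricted to inspections where the rating changed."""
    hits = total = 0
    for ratings in histories:
        for i in range(1, len(ratings)):
            if ratings[i] != ratings[i - 1]:
                hits += predict(ratings[:i]) == ratings[i]
                total += 1
    return hits / total if total else float("nan")


def copy_last(past_ratings):
    return past_ratings[-1]


# Hypothetical inspection histories; ratings stay flat for long stretches.
histories = [
    [8, 8, 8, 8, 7, 7, 7, 7, 7, 6],
    [7, 7, 7, 7, 7, 7, 6, 6, 6, 6],
    [9, 9, 8, 8, 8, 8, 8, 8, 8, 8],
]
print(round(persistence_accuracy(histories), 3))
print(accuracy_on_changes(histories, copy_last))
```

On these toy histories the copy-last rule is right on 23 of 27 transitions, yet it is never right when the rating changes, which is the situation a deterioration model is supposed to predict.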

These findings underscore the need for rigorous model validation that goes beyond surface-level metrics such as accuracy or F1-score. In safety-critical domains like bridge asset management, AI models must be transparent, generalizable, and capable of simulating complex deterioration scenarios. To achieve this, future research should prioritize the integration of physically meaningful variables (e.g., climate indices, structural design attributes), improved handling of censored or incomplete data, and the incorporation of domain-specific constraints to ensure physically plausible outputs, such as enforcing monotonic degradation over time. The next section introduces AI-augmented survival analysis, a promising direction that leverages ML within a probabilistic framework to improve modeling of nonlinear deterioration patterns, handle censored data, and estimate time-to-failure more accurately and interpretably.

AI-Augmented Survival Analysis for Enhanced Deterioration Modeling

Survival analysis has long been a cornerstone of bridge deterioration modeling, particularly for estimating the time until a component reaches a failure state or transitions between condition states. Classical survival models (e.g., Weibull regression, exponential life models, or the Cox proportional hazards model) allow engineers to quantify how variables like environment or design affect the hazard rate of deterioration. They also naturally handle censored data (such as bridges that have not yet failed by the end of the study or that were rehabilitated). For example, a Weibull model can estimate the probability that a bridge deck survives 30 years before dropping below condition 5,²⁷ while a Cox model can identify which factors significantly increase the hazard of entering poor condition.⁴⁵ These methods are valued for their interpretability and statistical rigor. However, they typically assume linear or log-linear relationships between covariates and deterioration rate, and often impose specific distributional forms (e.g., assuming deterioration times follow a Weibull distribution). In reality, deterioration is often driven by multiple concurrent mechanisms and affected by maintenance interventions, making such assumptions restrictive. Handling complex censoring, such as multistate conditions or elements replaced before failure, can also be challenging with traditional survival models.
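The classical quantities described above can be written compactly. The sketch below gives the Weibull survival and hazard functions, a right-censored log-likelihood (failures contribute the density, censored records contribute only the survival probability), and the Cox proportional hazards form. All parameter values, including the scale of 40 years and the shape of 2, are hypothetical.

```python
import math


def weibull_survival(t, scale, shape):
    """Probability of surviving beyond time t."""
    return math.exp(-((t / scale) ** shape))


def weibull_hazard(t, scale, shape):
    """Instantaneous failure rate at time t."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def weibull_log_likelihood(observations, scale, shape):
    """Log-likelihood for (time, failed) pairs with right censoring."""
    total = 0.0
    for t, failed in observations:
        total += math.log(weibull_survival(t, scale, shape))
        if failed:  # observed failures also contribute the hazard term
            total += math.log(weibull_hazard(t, scale, shape))
    return total


def cox_hazard(baseline_hazard, coefficients, covariates):
    """Cox model: covariates scale the baseline hazard multiplicatively."""
    linear = sum(b * x for b, x in zip(coefficients, covariates))
    return baseline_hazard * math.exp(linear)


# Hypothetical deck: chance of staying above the threshold for 30 years.
print(round(weibull_survival(30, scale=40, shape=2), 3))
# A coefficient of ln(2) on a binary exposure flag doubles the hazard.
print(cox_hazard(0.01, [math.log(2)], [1]))
```

The log-linear link in `cox_hazard` is the restrictive assumption noted above: every covariate can only multiply the hazard by a fixed factor.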

In response to these limitations, recent research has advanced hybrid survival modeling approaches that integrate DL into the survival analysis framework. A systematic review by Wiegrebe et al.⁴⁶ categorizes various deep survival models, noting that many approaches build upon classical survival concepts (like the Cox model or discrete-time hazard models) but enhance them with the flexibility of neural networks. For example, DeepSurv⁴⁷ replaces the Cox model’s linear risk function with a neural network, enabling it to learn complex nonlinear effects of features on the hazard. The network outputs a risk score, while the loss function remains rooted in the Cox partial likelihood, effectively combining ML with survival theory. Cox-Time⁴⁸ goes a step further by modeling time-varying effects, allowing the influence of a variable (e.g., traffic volume) to evolve with bridge age, reflecting realistic dynamics where, for example, fatigue damage accelerates after a threshold age.
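The loss that Cox-based deep survival models minimize can be sketched without any DL framework. In the illustration below, `risk_scores` stands in for the output of a neural network; the function computes the negative Cox partial log-likelihood without tie correction, and the three-bridge dataset is hypothetical.

```python
import math


def neg_log_partial_likelihood(risk_scores, times, events):
    """Negative Cox partial log-likelihood, averaged over observed events.

    risk_scores: model outputs (higher means higher predicted hazard).
    times: observed time for each record.
    events: 1 if failure was observed, 0 if the record is right-censored.
    """
    loss = 0.0
    n_events = 0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if not e_i:
            continue  # censored records appear only in risk sets
        at_risk = [
            math.exp(risk_scores[j])
            for j, t_j in enumerate(times)
            if t_j >= t_i
        ]
        loss -= risk_scores[i] - math.log(sum(at_risk))
        n_events += 1
    if n_events == 0:
        raise ValueError("at least one observed event is required")
    return loss / n_events


times = [5, 10, 15]
events = [1, 1, 0]  # the third bridge has not failed yet (censored)
well_ordered = neg_log_partial_likelihood([2.0, 1.0, 0.0], times, events)
reversed_order = neg_log_partial_likelihood([0.0, 1.0, 2.0], times, events)
print(well_ordered < reversed_order)
```

The loss is lower when records that fail earlier receive higher risk scores, which is the ranking behavior the network is trained to reproduce.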

Beyond the Cox-family models, there are DL methods for survival analysis that do not assume proportional hazards or specific distributions. Notably, DeepHit⁴⁹ and its variant Dynamic-DeepHit⁵⁰ directly predict the probability distribution of failure time using neural networks, sidestepping any predetermined hazard shape. These models can naturally handle competing risks (multiple failure modes) and provide the full probability of failure by year, which is useful for bridge managers planning interventions. Similarly, Bennis et al.⁵¹ proposed a DL approach that fits a mixture of Weibull distributions to the data (the DPWTE model), effectively learning a flexible hazard function that can approximate complex time-to-failure patterns. In the civil engineering domain, researchers have started applying these deep survival models to bridge datasets; for example, Kalakoti et al.⁵² developed a CNN-based survival model for infrastructure deterioration, and Dhada et al.⁵³ introduced a Weibull RNN for failure prognosis that can learn from binned life data. These approaches show promise in improving predictive accuracy and providing richer information (such as the probability of failure by time X, the most at-risk period, etc.) compared to traditional models.
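To illustrate what "predicting the probability distribution of failure time" means in practice, the sketch below converts one raw score per year into a failure-time probability mass function, a cumulative failure probability, and a survival curve. It is a simplified single-risk illustration in the spirit of these models, not the DeepHit architecture or loss; the logits are hypothetical, and treating the final bin as "no failure within the horizon" is an assumption made here.

```python
import math


def softmax(logits):
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def failure_curves(logits):
    """Turn per-bin scores into PMF, cumulative failure, and survival."""
    pmf = softmax(logits)
    cdf, running = [], 0.0
    for p in pmf:
        running += p
        cdf.append(running)
    survival = [1.0 - c for c in cdf]
    return pmf, cdf, survival


# Hypothetical network output: five yearly bins plus a final
# "no failure within the horizon" bin.
logits = [-2.0, -1.0, 0.0, 0.5, 0.0, 1.0]
pmf, cdf, survival = failure_curves(logits)
most_at_risk_year = max(range(len(pmf) - 1), key=lambda k: pmf[k]) + 1
print(most_at_risk_year, [round(c, 2) for c in cdf[:-1]])
```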

However, even deep survival models face challenges when applied to real bridge data. Many current studies still assume a single failure mode or “event” (e.g., reaching poor condition) and right-censored data, whereas real infrastructure can undergo multiple events (deterioration, repair, re-deterioration) and can be subject to left-censoring (where a bridge is already in a deteriorated state when observations begin) or interval censoring (when inspections occur only at intervals). Additionally, some deep models sacrifice interpretability; engineers may prefer a simpler parametric model that clearly shows “chloride exposure multiplies hazard by 2” over a black-box network that is only marginally more accurate. Therefore, an important research direction is combining DL with physical knowledge and interpretability. Efforts such as those by Hu and Liu,⁵⁴ who developed a “Structural Deterioration Knowledge Ontology” to inform ML models, exemplify how embedding domain ontologies and constraints can make AI predictions more realistic and transparent.

In conclusion, integrating AI with traditional survival analysis offers a promising path for infrastructure deterioration modeling, capturing nonlinear, time-varying, and multifactor effects that were previously difficult to model. Future research should prioritize model transparency, reproducibility, and benchmarking on common datasets to ensure that improvements are consistent and not artifacts of a particular dataset or training scheme. The fusion of survival theory with neural networks is a powerful approach, but it must be guided by engineering judgment—ensuring, for instance, that predictions of service life make physical sense and that models properly account for maintenance interventions and changes in usage. With careful development, AI-augmented survival models could significantly improve our ability to predict bridge longevity and optimize the timing of repairs or replacements.

Building on enhanced deterioration forecasting, the next section explores how AI is being integrated into bridge maintenance planning to support more proactive, data-driven, and cost-effective decision-making.

AI in Bridge Maintenance

Maintenance planning for bridges is a complex, multi-objective process that involves assessing current conditions, forecasting future deterioration, and selecting optimal interventions under budget and safety constraints. Traditionally, agencies have used deterministic or probabilistic models to decide when and where to perform maintenance; for example, scheduling deck rehabilitations every 20 years or using Markov decision processes to compute optimal policies assuming known deterioration rates. Reliability-based methods have also been introduced, in which interventions are planned to keep failure probabilities below certain thresholds. Akiyama and Frangopol,⁵⁵ for instance, demonstrated probabilistic life-cycle optimization models using physics-informed deterioration and Monte Carlo simulations to predict remaining service life. At the network level, Yang and Frangopol⁵⁶ incorporated system reliability and risk considerations to prioritize maintenance across a portfolio of bridges. Earlier efforts also employed genetic algorithms to optimize maintenance under reliability constraints,⁵⁷,⁵⁸ laying a foundation now expanded by AI methods capable of ingesting richer, more diverse datasets. More recently, Bukhsh et al.⁵⁹ used entity-embedding neural networks to predict optimal maintenance actions, showcasing the potential of DL architectures for infrastructure optimization.

As highlighted in a recent review by Shahrivar et al.,⁶⁰ the adoption of AI in bridge maintenance has accelerated significantly, particularly as predictive maintenance has become a major focus. Instead of relying on fixed schedules, AI-driven approaches seek to predict when a particular bridge or component will need intervention, and what type of intervention would be the most cost-effective. By learning from historical inspection and maintenance records, AI models can uncover patterns linking certain indicator conditions (such as increasing crack density or rust severity) to subsequent repairs or failures. These models can process large volumes of heterogeneous data (structural attributes, traffic loads, climate trends, past repair actions) to forecast optimal maintenance timings or to flag bridges that will likely require attention soon. For example, Gagliardi et al.⁶¹ applied deep neural networks to optimize maintenance schedules for pavement-like infrastructure, capturing complex interactions among variables and reducing decision errors. In the context of bridges, similar AI models can suggest when a bridge deck should be rehabilitated by balancing deterioration forecasts against budget constraints. Beyond timing, AI can assist in the dynamic prioritization of maintenance actions, which is crucial when resources are limited. Brighenti et al.⁶² proposed an AI-based decision support system that forecasts not only when maintenance should occur, but also which bridges should be prioritized, considering factors like structural criticality and funding limits. This approach moves toward multi-criteria optimization, not just minimizing overall deterioration but doing so while respecting budget caps and maximizing risk reduction across the network.
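The budget-constrained prioritization described above can be illustrated with a simple greedy rule that ranks candidate interventions by risk reduction per unit cost. This is a generic sketch, not the method of any cited study; the bridge IDs, costs, and risk-reduction scores are hypothetical, and a greedy rule is not guaranteed to find the optimal portfolio.

```python
def prioritize(candidates, budget):
    """Select interventions greedily by risk reduction per unit cost."""
    ranked = sorted(
        candidates,
        key=lambda c: c["risk_reduction"] / c["cost"],
        reverse=True,
    )
    selected, remaining = [], budget
    for candidate in ranked:
        if candidate["cost"] <= remaining:
            selected.append(candidate["bridge_id"])
            remaining -= candidate["cost"]
    return selected, remaining


# Hypothetical candidates; in practice the risk reduction would come from
# a deterioration forecast and the cost from agency estimates.
candidates = [
    {"bridge_id": "A", "cost": 4.0, "risk_reduction": 8.0},
    {"bridge_id": "B", "cost": 3.0, "risk_reduction": 3.0},
    {"bridge_id": "C", "cost": 5.0, "risk_reduction": 7.5},
    {"bridge_id": "D", "cost": 2.0, "risk_reduction": 1.0},
]
selected, remaining = prioritize(candidates, budget=10.0)
print(selected, remaining)
```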

AI techniques have also been explored for long-term maintenance optimization using reinforcement learning (RL). In these formulations, maintenance planning is viewed as a sequential decision-making problem: each year or inspection cycle, the agency decides whether to repair a given element, and this decision affects future deterioration and costs. Researchers have applied Deep RL (DRL) algorithms to learn policies that minimize life-cycle costs or maximize bridge network reliability. For example, Wei et al.⁶³ developed a DRL framework that learned optimal policies (e.g., when to perform minor rehabilitation vs. full replacement) for structures under uncertain deterioration, outperforming simple time-based schedules. Karaaslan et al.⁶⁴ and Wei et al.⁶⁵ similarly treated maintenance planning as a game-like scenario in which the agent (planner) learns the best strategy through trial-and-error simulations. These RL approaches are promising because they can handle nonlinear deterioration and complex reward structures (e.g., penalties for risk, user costs for closures, etc.). However, they require careful formulation to ensure that the learned policy respects real-world constraints and is interpretable to engineers.
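The sequential decision formulation can be made concrete with a toy problem. The sketch below uses tabular value iteration rather than a DRL agent, because the state space is tiny and the deterioration probabilities are assumed to be known; DRL targets cases where neither holds. The three condition states, the costs, and the transition probabilities are all hypothetical.

```python
ACTIONS = ["wait", "repair"]
# Hypothetical yearly deterioration if no action is taken
# (0 = good, 1 = fair, 2 = poor).
DETERIORATE = [
    [0.8, 0.2, 0.0],
    [0.0, 0.7, 0.3],
    [0.0, 0.0, 1.0],
]
STATE_COST = [0.0, 1.0, 10.0]  # yearly risk and user cost in each state
REPAIR_COST = 4.0              # a repair returns the element to state 0
GAMMA = 0.9                    # discount factor


def q_value(state, action, values):
    """Expected discounted cost of taking an action in a state."""
    if action == "repair":
        return REPAIR_COST + GAMMA * values[0]
    expected_next = sum(p * v for p, v in zip(DETERIORATE[state], values))
    return STATE_COST[state] + GAMMA * expected_next


def value_iteration(tolerance=1e-8, max_iter=10_000):
    values = [0.0] * len(STATE_COST)
    for _ in range(max_iter):
        updated = [
            min(q_value(s, a, values) for a in ACTIONS)
            for s in range(len(values))
        ]
        converged = max(abs(u - v) for u, v in zip(updated, values)) < tolerance
        values = updated
        if converged:
            break
    policy = [
        min(ACTIONS, key=lambda a: q_value(s, a, values))
        for s in range(len(values))
    ]
    return values, policy


values, policy = value_iteration()
print(policy)
```

The resulting policy is a lookup table from condition state to action, which is easy for an engineer to audit; the interpretability concern raised above arises when that table is replaced by a neural network.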

Despite these advancements, critical challenges remain before AI-driven maintenance planning can be widely adopted in practice. One major limitation is the lack of integration with real-time, multimodal data. Shahrivar et al.⁶⁰ emphasized that many existing models are built on static datasets and thus struggle to incorporate dynamically changing field conditions. For AI to truly enable predictive maintenance, it needs to fuse static inspection records with continuous monitoring data and evolving environmental inputs. Another challenge is compatibility with existing bridge management systems (BMSs). For instance, Wang et al.⁶⁶ showed that including past maintenance effects in deterioration models significantly improves prediction accuracy, but integrating such features into legacy BMS platforms remains difficult. Data format inconsistencies and the computational intensity of some AI models make integration nontrivial. AI outputs also often fail to map directly onto the predefined categories or intervention rules used by DOTs; for example, a DL model might output a recommendation that has no counterpart in the agency’s maintenance manual, causing confusion. To address this, AI tools need to be packaged in user-friendly software that can plug into BMS dashboards, and their outputs should be in a form engineers can readily interpret (e.g., suggesting specific maintenance actions defined in the agency’s policy).

In summary, AI is steering bridge maintenance planning toward an era of intelligent, data-driven decision-making. By harnessing vast datasets and learning from experience, AI can help prioritize interventions more effectively than traditional methods, potentially extending the life of structures and optimizing the use of limited funds. To fully realize this potential, future efforts should focus on: (1) integrating real-time and multimodal data for truly adaptive maintenance scheduling, (2) ensuring AI tools are interpretable and can be validated against engineering judgment, and (3) developing standards and interfaces so that AI-driven insights can seamlessly inform existing maintenance programs. With these developments, the next generation of bridge management could shift from static, schedule-based maintenance to a continuous, risk-informed maintenance strategy that intervenes at just the right time and place to ensure safety and sustainability.

The following section focuses on how LLMs are being leveraged to extract actionable insights from unstructured inspection reports, bridging the gap between field observations and intelligent maintenance decision-making.

From Inspection Reports to Action Plans: Leveraging LLMs for Decision Support

Bridge inspection reports are rich in technical insights, often containing narrative descriptions of observed defects, measurements, environmental conditions, and maintenance recommendations. However, these records are typically unstructured and vary significantly in language, terminology, and level of detail between inspectors and agencies. This lack of standardization makes it difficult to systematically analyze large volumes of reports or feed their content directly into asset management systems. As a result, critical observations, such as “crack in Span 3 girder approximately 1 inch wide,” often remain buried in narrative form or require manual extraction for use in condition databases.

Even prior to LLMs, researchers applied NLP techniques to bridge data. For instance, Xia et al.⁶⁷ used textual analysis of inspection reports to assess regional bridge conditions, but such approaches required task-specific tuning. The rise of LLMs, such as transformer-based models in the Generative Pretrained Transformer (GPT) family, offers a transformative solution to this challenge.¹⁷,⁶⁸ LLMs are capable of reading and interpreting free-form text and extracting structured insights. By training on massive corpora of language data (including technical texts), they develop a contextual understanding that goes beyond simple keyword matching. In the context of bridge management, an LLM can be taught to parse inspection narratives to find key information: the type of defect, its severity, location, any recommendations made by the inspector, etc. This LLM-based approach can effectively translate human-written reports into machine-readable data that can then feed into maintenance planning systems or condition databases.

One promising application is automated report analysis. Instead of engineers sifting through hundreds of PDF reports to prioritize work, an LLM-based system could instantly aggregate findings. Zhang et al.¹⁷ integrated LLMs to automatically construct a structured database from historical text reports. In their approach, the LLM learned to recognize patterns and key phrases in old reports (such as “CS-3” indicating condition state 3, or “heavy section loss at bearings”) and convert them into a standardized format. Such a database allows cross-agency benchmarking; for instance, users can query all instances of “section loss >25%” across thousands of bridges to identify the most urgent cases.
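The target of such a pipeline, a structured record per inspection note that can then be queried, can be sketched with a rule-based stand-in. The code below uses regular expressions instead of an LLM, so it only handles the exact phrasings it was written for; the record fields, the bridge IDs, and the sample notes are hypothetical.

```python
import re

CONDITION_STATE = re.compile(r"\bCS-([1-4])\b")
SECTION_LOSS = re.compile(
    r"section loss\s*(?:of\s*)?(?:>|about|approximately)?\s*"
    r"(\d+(?:\.\d+)?)\s*%",
    re.IGNORECASE,
)


def extract_record(bridge_id, note):
    """Convert one free-text note into a structured record."""
    state = CONDITION_STATE.search(note)
    loss = SECTION_LOSS.search(note)
    return {
        "bridge_id": bridge_id,
        "condition_state": int(state.group(1)) if state else None,
        "section_loss_pct": float(loss.group(1)) if loss else None,
    }


notes = {
    "B-001": "Bearing at Pier 2 rated CS-3; heavy section loss of "
             "approximately 30% at bearings.",
    "B-002": "Minor staining on fascia, no defects noted.",
    "B-003": "Girder G4 CS-2 with section loss 10% near the joint.",
}
records = [extract_record(bid, text) for bid, text in notes.items()]

# Query analogous to "section loss >25%" across the database.
urgent = [
    r["bridge_id"]
    for r in records
    if r["section_loss_pct"] is not None and r["section_loss_pct"] > 25
]
print(urgent)
```

An LLM-based extractor would replace `extract_record` while keeping the same output schema, so the downstream query would not need to change.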

Another challenge in bridge management is inconsistency in condition ratings and descriptions due to subjectivity. One inspector’s “moderate cracking” might be another’s “heavy cracking.” Different regions may use slightly different terminologies or rating scales (some use the NBI 0–9, while others use element-level states 1–4, etc.). LLMs offer a data-driven path to standardization and quality assurance in this arena. By training on a large corpus of past reports, an LLM can learn the typical language associated with certain ratings or defects. For example, it might infer that when someone writes “extensive rust with section loss,” it often correlates with a poor condition state. Feng et al.⁶⁹ and Han et al.⁷⁰ both highlight the potential for NLP to support consistency in bridge condition assessment. An LLM could be used to flag discrepancies; for instance, if an inspector rates a component as “Good” but writes “many cracks present,” the LLM could notice that the narrative does not match the high rating and alert a reviewer to a possible error. Conversely, the model can suggest standardized terminology: if one report says “incipient spalling with rust bleed-out,” the system might annotate or translate that to a standard defect code or description like “surface spalling (light) with corrosion staining.” Imagine a real-time tool: as an inspector types or dictates their findings on a tablet, an LLM-based assistant could prompt, “You used the phrase ‘significant corrosion’. Would that correspond to a CS-3 (per guidelines)?” This kind of language model-assisted inspection ensures that the wording and ratings align with agency standards, reducing subjectivity. Over time, as inspectors use such tools, their reports become more uniform, which improves data reliability for network-level analysis.
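The discrepancy check described above can be illustrated with a deliberately crude keyword lexicon standing in for the language model. The mapping in `SEVERITY_TERMS` from wording to condition state is hypothetical and is not drawn from any agency guideline; an LLM would replace `implied_state` with a context-aware judgment.

```python
import re

# Hypothetical mapping from severity wording to an implied condition state.
SEVERITY_TERMS = {
    "hairline": 1, "minor": 1, "light": 1,
    "moderate": 2,
    "heavy": 3, "extensive": 3, "significant": 3,
    "severe": 4,
}


def implied_state(narrative):
    """Return the worst state implied by the wording, or None."""
    words = re.findall(r"[a-z]+", narrative.lower())
    levels = [SEVERITY_TERMS[w] for w in words if w in SEVERITY_TERMS]
    return max(levels) if levels else None


def flag_discrepancy(rated_state, narrative, tolerance=1):
    """Flag a report whose rating and wording differ by more than tolerance."""
    implied = implied_state(narrative)
    if implied is None:
        return False  # nothing in the text to compare against
    return abs(implied - rated_state) > tolerance


print(flag_discrepancy(1, "Extensive rust with section loss."))
print(flag_discrepancy(3, "Heavy cracking near joint"))
```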

Beyond standardization, LLMs are being designed to function as intelligent maintenance assistants, moving from data extraction to generating actionable insights. This involves synthesizing information from multiple parts of a report or even multiple reports. For instance, an LLM could review all the inspection notes for a bridge from the last 10 years and produce a concise summary: “The deck has had widening longitudinal cracks with efflorescence since 2015, now in CS-3, with no major repairs done yet.” To support cross-verification, the LLM assistant can be trained to cite the document ID, the contract number and page, or the exact damage notes in quotes. Such summaries can help new engineers or decision-makers quickly grasp the history and critical issues of a structure. Furthermore, LLMs can be coupled with knowledge of maintenance codes and standards to recommend actions. If an inspection note says “wide crack with active leakage in the abutment,” an advanced system might infer: crack + leakage -> possible severe issue, and recall maintenance guidelines suggesting epoxy injection or similar interventions. It could then output a suggestion like: “Recommend sealing cracks and investigating source of leakage (maintenance code XYZ),” basically turning raw observations into actionable plans. One can envision a chatbot-like interface where an engineer asks, “LLM, based on the latest inspection, what should we do for Bridge A123?” and the LLM responds with a structured answer: “Bridge A123 has advanced deterioration in the deck (CS-3 with wide cracks and efflorescence). Suggested maintenance: Code X (seal cracks, replace wearing surface), schedule within 1 year. Also note bearing corrosion (CS-2), recommend cleaning and painting (Code Y) within 2 years.” Our demonstration later in this review shows a prototype of such an AI bot using an LLM.

As LLMs become more embedded in these workflows, ensuring their reliability and accuracy is paramount. A hallucinating LLM that suggests a nonexistent maintenance action or misinterprets a report could be problematic. The models will likely need fine-tuning on domain-specific data and possibly constrained generation (for instance, only suggesting maintenance actions from a known list). Human-in-the-loop frameworks are essential: an engineer should review the AI’s suggestions and extractions, at least until there is high confidence or only low-risk tasks are involved, and correct the information or provide their own expert knowledge as reliable data. Over time, trust can be built if the LLM consistently proves accurate and saves time and resources. Issues like data privacy and security also come into play. Inspection reports may contain sensitive infrastructure information, so any AI deployment must ensure secure data handling (probably running on secure servers or local machines, rather than sending data to open application programming interfaces [APIs]).
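A lightweight form of the "known list" safeguard is to validate every suggested maintenance code after generation and route anything unrecognized to a human reviewer. The sketch below seeds the allowed list with the four NYSDOT codes that appear in the demonstration in this review; the function name and the sample model output are hypothetical, and a real deployment would load the agency's full code table.

```python
# Codes and descriptions as they appear in the demonstration in this review;
# a real system would load the agency's complete maintenance code table.
ALLOWED_CODES = {
    "059": "Replace wearing surface",
    "076": "Repair and/or replace wingwalls",
    "077": "Repair and/or replace backwalls",
    "G81": "Maintain bank protection",
}


def validate_suggestions(suggested_codes, allowed=ALLOWED_CODES):
    """Split model suggestions into accepted codes and codes needing review."""
    accepted, needs_review = [], []
    for code in suggested_codes:
        key = code.strip().upper()
        if key in allowed:
            accepted.append(key)
        else:
            needs_review.append(key)
    return accepted, needs_review


# Hypothetical model output containing one code that is not on the list.
accepted, needs_review = validate_suggestions(["059", " g81", "999"])
print(accepted, needs_review)
```

This check only confirms that a code exists; whether the code is appropriate for the observed component still requires the engineering review described above.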

Another innovative use of LLMs is in training and education. New inspectors can be trained with LLM-powered simulators in which an LLM simulates an experienced inspector’s reasoning. For instance, a trainee describes what they see in a photo, and the LLM (imbued with knowledge from countless past cases) can respond with questions or guidance (“Did you check for any deformation near the crack? If there’s efflorescence, what might that indicate?”). LLMs can also serve as intelligent checklists: an inspector can query something like “What are the criteria for a Condition 3 corrosion on steel girders?” and the LLM will provide the answer from the standards and guidelines it has been trained on.

In the broader AI ecosystem for bridges, LLMs complement the sensor and vision-based systems. Think of it this way: CV algorithms tell us where and what the defects are in images, SHM sensors tell us how the bridge is behaving, and LLMs tie everything together with the context and reasoning that comes from textual data and expert knowledge. Yuan et al.¹¹ outline such integrated frameworks, combining visual defect detection, real-time Internet of Things sensor data, textual report analysis, and predictive modeling. This multimodal synergy can enable truly proactive, life-cycle-optimized management; for example, an LLM could correlate a pattern in sensor data with a type of damage previously described in reports, strengthening the case for a particular fix.

Through continued research and field trials, LLMs have the potential to redefine how bridge inspections are documented and how decisions are made from that information. Instead of piles of text that only yield insights after laborious human reading, we would have immediate, structured intelligence, essentially turning past and present inspection data into a knowledge base for the bridge. These models are not just about automation; they act as partners to human experts, offering suggestions, checking consistency, and providing a second set of “eyes” on the vast amount of textual data generated in bridge management. When combined with visual and sensor AI, they round out the toolkit needed for fully smart BMSs.

To illustrate these capabilities in practice, the next section presents a demonstration of a customized LLM-powered AI bot designed to automate the interpretation of bridge inspection data and generate corresponding maintenance recommendations.

Demonstration of the potential of LLMs in bridge inspection

The AI bot was fine-tuned using domain-specific knowledge, including bridge inspection manuals (e.g., NYSDOT Bridge Inspection Manual⁷¹ and Bridge Inventory Manual⁷²), standardized condition rating guidelines (NBI and element-level condition states), and NYSDOT maintenance codes. The system was evaluated using both visual input (image descriptions) and textual input (inspection note excerpts) to assess its ability to generate appropriate ratings, recommended maintenance actions, and corresponding codes.

In one scenario, the AI bot was presented with an image of a concrete deck from the manual, as shown in Fig. 5. As seen in Table 1, the AI bot correctly classified the condition as CS-3, indicating advanced deterioration that requires attention but is not yet structurally critical. Based on standard practices, the AI bot recommended immediate sealing of the cracks to prevent further water ingress, potential replacement of the wearing surface, and follow-up inspections on a yearly basis. It also provided the relevant NYSDOT maintenance code, Code 059, for replacing the wearing surface, along with a rationale highlighting the risk of internal damage from ongoing water infiltration.

Table 1. Rating and maintenance suggestion for Fig. 5
Figure	Image showing deck with wide cracks and heavy efflorescence
Description	Deck with cracks and heavy efflorescence indicating water infiltration and potential internal damage
Condition state	CS-3 (advanced deterioration, nonurgent structural review)
Suggested maintenance	Code 059: Replace wearing surface
Frequency/interval	Immediate repair, followed by yearly monitoring
Comments	Action needed to seal cracks and prevent further internal damage

This demonstration illustrates the LLM’s capacity to translate an observed defect into a structured, actionable maintenance plan. The model effectively recognized the implications of the symptoms (“wide cracks” + “efflorescence” → potential water damage), proposed appropriate remediation steps, aligned the findings with a formal condition rating, and referenced applicable agency codes. This level of interpretability and contextual awareness reflects a strong alignment with established engineering logic and highlights the potential for LLMs to assist in inspection-based decision support.
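As a rough illustration of what a structured, actionable output could look like in software, the Python sketch below stores the Table 1 result as a typed record. The field names, the JSON reply format, and the `parse_bot_reply` helper are assumptions made for illustration only; the review does not describe the bot's actual output schema. The set of accepted condition states assumes the four-state element-level convention (CS-1 to CS-4).

```python
import json
from dataclasses import dataclass

# Assumed four-state element-level scale; not taken from the bot itself.
VALID_CONDITION_STATES = {"CS-1", "CS-2", "CS-3", "CS-4"}


@dataclass
class MaintenanceSuggestion:
    """One row of bot output (hypothetical schema mirroring Table 1)."""
    description: str
    condition_state: str   # e.g. "CS-3"
    maintenance_code: str  # agency code kept as text, e.g. "059" or "G81"
    action: str
    interval: str
    comments: str


def parse_bot_reply(reply_json: str) -> MaintenanceSuggestion:
    """Turn a JSON reply into a typed record and reject unknown condition states."""
    suggestion = MaintenanceSuggestion(**json.loads(reply_json))
    if suggestion.condition_state not in VALID_CONDITION_STATES:
        raise ValueError(f"unexpected condition state: {suggestion.condition_state}")
    return suggestion


# Values copied from Table 1; the JSON wrapper itself is an assumption.
table1_reply = json.dumps({
    "description": "Deck with cracks and heavy efflorescence",
    "condition_state": "CS-3",
    "maintenance_code": "059",
    "action": "Replace wearing surface",
    "interval": "Immediate repair, followed by yearly monitoring",
    "comments": "Seal cracks to prevent further internal damage",
})

record = parse_bot_reply(table1_reply)
print(record.condition_state, record.maintenance_code)
```

Keeping the maintenance code as text preserves leading zeros and alphanumeric codes, and rejecting unknown condition states gives a downstream system one simple check before a suggestion is stored.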

In another test, the AI bot was given an image of a steel beam with severe corrosion (Fig. 6). The AI bot classified the condition as CS-4, indicating severe deterioration with potential implications for structural capacity. It recommended an immediate structural assessment and provided maintenance suggestions, including Code 076 (repair/replace wingwalls), Code 077 (repair/replace backwalls), and Code G81 (bank protection for scour-related risk), as summarized in Table 2.

Table 2. Rating and maintenance suggestion for Fig. 6
Figure	Image showing severe corrosion and material loss on a steel beam
Description	Severe corrosion with material loss on the lower flange of a steel beam
Condition state	CS-4 (severe deterioration, safety concern)
Suggested maintenance	Code 076: Repair and/or replace wingwalls (if wingwalls are affected by similar corrosion); Code 077: Repair and/or replace backwalls (if backwalls are present and affected); Code G81: Maintain bank protection (scour-related risks)
Frequency/interval	Immediate assessment and biannual follow-ups
Comments	Urgent structural evaluation required; similar components should be checked for moisture exposure damage

This output illustrates the LLM’s capacity to go beyond simply recalling guideline text; it demonstrates reasoning by logically associating severe flange corrosion with possible systemic issues. Although wingwalls and backwalls are not shown in the input image, the AI bot inferred that these components might also be compromised due to their proximity to areas of water exposure or splash corrosion, a common degradation mechanism. Similarly, the suggestion of Code G81 indicates that it linked corrosion to potential scour or poor drainage conditions, highlighting its ability to generalize from training data.

However, this example also highlights a challenge in LLM deployment: the potential for “hallucination”—where the AI introduces elements not explicitly visible in the input. While such inferences may be based on learned patterns and associations, they can lead to over-specification. This underscores the importance of maintaining a human-in-the-loop framework, where engineers validate AI-generated suggestions. In this case, if no wingwalls or backwalls are present, the user can disregard the irrelevant codes. Nevertheless, the AI's ability to generate related recommendations suggests it could serve as a useful assistant in inspection workflows.
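One simple way to support the human-in-the-loop check described above is to compare each suggested code against the components recorded for the bridge and flag the ones that do not apply. The following Python sketch is a hypothetical illustration, not part of the demonstrated bot: the code-to-component mapping is taken from the codes named in this section, while the `inventory` set and the `triage_codes` function are assumptions.

```python
# Mapping built from the codes mentioned in this section; a real deployment
# would load the full agency code list instead.
CODE_COMPONENT = {
    "059": "wearing surface",
    "076": "wingwall",
    "077": "backwall",
    "G81": "bank protection",
}


def triage_codes(suggested_codes, inventory):
    """Split codes into those matching a recorded component and those needing review."""
    accepted, flagged = [], []
    for code in suggested_codes:
        component = CODE_COMPONENT.get(code)
        if component is not None and component in inventory:
            accepted.append(code)
        else:
            # Unknown code, or component not on record: leave it to the engineer.
            flagged.append(code)
    return accepted, flagged


# Hypothetical inventory for the bridge in Fig. 6 (no wingwalls or backwalls).
inventory = {"steel beam", "wearing surface", "bank protection"}

accepted, flagged = triage_codes(["076", "077", "G81"], inventory)
print("accepted:", accepted)  # accepted: ['G81']
print("flagged:", flagged)    # flagged: ['076', '077']
```

Flagged codes are not discarded automatically; they are surfaced so that the engineer makes the final decision, in line with the validation role described above.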

In a further demonstration, the LLM’s ability to interpret both the unstructured text and the images of an inspection report was tested, as shown in Fig. 7. The input comprised detailed narrative notes from a real bridge inspection report, describing issues in the deck and substructure, together with the accompanying inspection photos.

The LLM successfully parsed the notes and extracted meaningful condition observations. It also examined the associated photos in the inspection report and generated descriptions of them (Fig. 8).

This AI bot was further fine-tuned on bridge deck details, maintenance codes, and ratings so that it could provide suggestions. It reads the inspection report, extracts the relevant information based on its fine-tuned learning, and gives reliable suggestions, as shown in Fig. 9. Its output includes suggested solutions, prioritizes best practices, and provides the corresponding maintenance codes aligned with agency standards. These recommendations are not only actionable but also prioritized by urgency and resource availability, ensuring that critical issues are addressed promptly while maintenance schedules are optimized.

Another version of the LLM-based AI bot was also explored. It was fine-tuned to analyze report descriptions and images, to suggest improvements in note-taking, and to provide guidance on identifying critical image locations. Fig. 10 demonstrates the effectiveness of this bot in performing quality control tasks. After inspecting the photos and descriptions, it recommended which areas to highlight, what the inspectors should be concerned about, and whether close-up or nearby pictures were required for better detail. This automation enhances the efficiency and reliability of QA/QC practices, ultimately improving the overall quality of bridge inspection programs.

This demonstration, while rudimentary and largely conceptual, underscores the potential of LLMs to serve as assistive intelligence that bridges the gap between raw inspection findings and maintenance decision-making. In effect, the bot encodes the kind of knowledge a junior engineer or seasoned inspector would bring, offering a second opinion or a checklist so that nothing is overlooked. However, the occasional inclusion of tangential suggestions (e.g., referencing wingwalls where none exist) emphasizes the ongoing need for human oversight and iterative model refinement. In real-world applications, deploying such AI systems would require extensive validation and may benefit from a phased rollout that initially targets narrow tasks such as condition state classification or maintenance code suggestion. Nevertheless, even in this preliminary form, the LLM demonstrates clear value: reducing manual workload, improving consistency, and accelerating data-driven maintenance decision-making in bridge asset management.

Conclusion

This review synthesizes the latest advancements in applying AI across the bridge infrastructure management lifecycle, highlighting how AI-driven methods are transforming automated defect detection, deterioration forecasting, survival analysis, maintenance prioritization, and language-based decision support. AI-assisted inspection tools reduce subjectivity and risk while providing consistent identification of defects, even in inaccessible areas. ML models allow engineers to capture nonlinear deterioration patterns involving factors such as age, traffic, and environmental exposure—surpassing the limitations of traditional statistical models. LLMs address a longstanding gap by converting unstructured inspection narratives into structured, standardized data that can be used for condition assessment and automated action planning.

Despite these promising developments, several challenges still hinder widespread adoption. Many AI models function as “black boxes,” offering little interpretability, which undermines the trust of engineers who must make safety-critical decisions. Integration with existing BMSs remains complex, with scalability and compatibility issues posing practical barriers. Moreover, much of the research remains at the pilot or case-study level, with limited implementation at the network scale. Our experiments with ANN and Seq2Seq models demonstrated that, even with high prediction accuracy, these systems were overly reliant on previous condition ratings and failed to incorporate essential deterioration drivers such as environment or load. This reflects a broader concern that many AI applications in bridge deterioration focus too narrowly on performance metrics rather than learning meaningful engineering behaviors.
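One common way to expose this kind of reliance on previous ratings, offered here as an illustrative check rather than the procedure used in the experiments above, is to compare a model's accuracy with a persistence baseline that simply repeats last year's rating. The Python sketch below uses small synthetic rating histories invented for the example; the function names are likewise assumptions.

```python
def persistence_accuracy(histories):
    """Fraction of year-to-year transitions where the rating did not change.

    This equals the accuracy of a 'model' that always predicts the previous rating.
    """
    hits = 0
    total = 0
    for ratings in histories:
        for previous, current in zip(ratings, ratings[1:]):
            hits += int(previous == current)
            total += 1
    return hits / total


def skill_over_baseline(model_accuracy, baseline_accuracy):
    """Share of the baseline's remaining error that the model removes (0 = no gain)."""
    if baseline_accuracy >= 1.0:
        return 0.0
    return (model_accuracy - baseline_accuracy) / (1.0 - baseline_accuracy)


# Synthetic condition-rating histories (one list per bridge), for illustration only.
histories = [
    [7, 7, 7, 6, 6, 6, 6, 5],
    [8, 8, 7, 7, 7, 7, 6, 6],
    [6, 6, 6, 6, 5, 5, 5, 5],
]

baseline = persistence_accuracy(histories)
print(f"persistence baseline accuracy: {baseline:.3f}")  # 0.762
```

Because ratings change slowly, the baseline is already high on such data, so a model whose accuracy sits close to it (skill near zero) has learned little beyond copying the last rating, whatever its headline accuracy.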

Addressing these challenges will require more than technical advancement. Improvements in data infrastructure, model transparency, and seamless integration into engineering workflows are critical for realizing the full potential of AI in bridge management. To this end, we recommend the following priorities for future research and deployment:

Multimodal data fusion: Future models should combine visual data, sensor data, and textual data into unified predictive frameworks. Achieving this will require developing large-scale, diverse datasets and using techniques like transfer learning and domain adaptation to handle differences between regions and bridge types.

XAI and physics-informed AI: It is urgent to integrate domain knowledge into AI models to ensure their predictions are not only accurate but also physically realistic and justifiable. This could mean hybrid models that include factors from structural mechanics (e.g., load capacity formulas or corrosion kinetics) alongside data-driven patterns, or applying constraints so that, for example, predicted deterioration rates do not violate known material behaviors.

Interoperable and user-friendly tools: Integrating AI into daily practice will require developing software that can plug into existing BMSs and workflows. This includes creating APIs or modules for popular bridge management software that can seamlessly ingest inspection data and output AI analysis. Moreover, regulatory bodies may need to establish guidelines on how AI-generated insights can be used, for example, allowing an AI-predicted condition rating to trigger an action while still requiring human verification for certain decisions.

Validation and human–AI collaboration: Successful implementation depends on testing AI tools through pilot deployments, where their outputs are compared against expert judgments. Feedback mechanisms should allow engineers and inspectors to flag incorrect AI outputs, enabling continual model refinement.

Rather than replacing human expertise, AI should serve as an assistive technology, augmenting inspectors’ capabilities, improving consistency, and reducing oversight risk. Close collaboration between bridge engineers and data scientists is essential to ensure the relevance, usability, and accountability of AI models.
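The idea of constraining predictions so that they respect known behavior can be illustrated with a very small post-processing step. The Python sketch below is a simplified assumption, not a method proposed in this review: it treats higher ratings as better condition, assumes no maintenance took place in the forecast window, and clamps a predicted rating sequence so that it never improves on its own. The example values are invented.

```python
from itertools import accumulate


def enforce_no_spontaneous_recovery(predicted_ratings):
    """Clamp each prediction to the running minimum so the sequence never increases.

    Assumes higher rating = better condition and no maintenance in the window.
    """
    return list(accumulate(predicted_ratings, min))


# Hypothetical raw model output with two physically implausible upticks.
raw = [7.0, 6.6, 6.8, 6.1, 6.3, 5.5]
constrained = enforce_no_spontaneous_recovery(raw)
print(constrained)  # [7.0, 6.6, 6.6, 6.1, 6.1, 5.5]
```

A fuller physics-informed model would build such constraints into training, for example through the loss function or the model structure, rather than applying them afterwards. The same rule must also be relaxed whenever a repair is recorded, since maintenance legitimately raises the rating.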

In conclusion, the convergence of AI with bridge engineering has the potential to transform infrastructure safety, reliability, and decision-making. Achieving this vision requires interdisciplinary effort, pairing engineering domain expertise with advanced data-driven modeling. With ongoing collaboration, robust validation, and careful integration into operational workflows, AI-based tools can evolve from experimental models into trusted components of infrastructure asset management. Ultimately, a thoughtful, human-centered approach will ensure that AI supports—not replaces—the expert judgment that underpins safe and resilient bridge networks.

References

Bridges: 2025 Infrastructure Report Card. Published online 2025.
Bridge preservation guide: Maintaining a state of good repair using cost effective investment strategies (No. FHWA-HIF-11-042). United States. Federal Highway Administration. Published online 2018.
Evaluation of the consistency of bridge inspection quality in New York State. J Civil Struct Heal Monit. 2021;11(5):1393-1413.
Advancements and challenges in the application of artificial intelligence in civil engineering: a comprehensive review. Asian J Civil Eng. 2023;25(1):1-18. doi:10.1007/s42107-023-00760-9
Structural health monitoring of concrete bridges through artificial intelligence: a narrative review. Appl Sci. 2025;15(9).
Collection of Data With Unmanned Aerial Systems (UAS) for Bridge Inspection and Construction Inspection (No. FHWA-HRT-21-086). Federal Highway Administration. Office of Infrastructure Research and Development; 2021.
CrackU-net: a novel deep convolutional neural network for pixelwise pavement crack detection. Struct Cont Health Monit. 2020;27(8). doi:10.1002/stc.2551
Tiny-Crack-Net: a multiscale feature fusion network with attention mechanisms for segmentation of tiny cracks. Comput-Aided Civil Infrastruct Eng. 2022;37(14):1914-1931. doi:10.1111/mice.12881
UAV-aided bridge inspection protocol through machine learning with improved visibility images. Expert Syst Appl. 2022;197.
Deep learning for automated multiclass surface damage detection in bridge inspections. Autom Constr. 2024;166.
A review of computer vision-based crack detection methods in civil infrastructure: Progress and challenges. Rem Sens. 2024;16(16).
Crackformer: transformer network for fine-grained crack detection. Published online 2021:3783-3792.
Vision transformer-based autonomous crack detection on asphalt and concrete surfaces. Autom Construct. 2022;140.
SAM-guided concrete bridge damage segmentation with mamba-ResNet hierarchical fusion network. Electronics. 2025;14(8).
DGYOLOv8: an enhanced model for steel surface defect detection based on YOLOv8. Mathematics. 2025;13(5).
Computer vision-based bridge inspection and monitoring: a review. Sensors. 2023;23(18).
Automatic bridge inspection database construction through hybrid information extraction and large language models. Dev Built Environ. 2024;20.
A deep learning-based image captioning method to automatically generate comprehensive explanations of bridge damage. Comput-Aided Civil Infrastruct Eng. 2022;37(11):1387-1401. doi:10.1111/mice.12793
Damage detection in concrete slab using smart sounding. Published online 2022:97-105.
Robotic inspection and characterization of subsurface defects on concrete structures using impact sounding. Published online 2022.
Machine-supported bridge inspection image documentation using artificial intelligence. Transport Res Rec. 2023;2677(5):720-736.
Critical review of data-driven decision-making in bridge operation and maintenance. Struct Infrastruct Eng. 2021;18(1):47-70.
Risk-based multi-threat decision-support methodology for long-term bridge asset management—volume 1: AI-Based bridge-level decision support. Published online 2024.
Bridge digital twin for practical bridge operation and maintenance by integrating GIS and BIM. Buildings. 2024;14(12). doi:10.3390/buildings14123731
A bridge information modeling (BrIM) framework for inspection and maintenance intervention in reinforced concrete bridges. Buildings. 2023;13(11). doi:10.3390/buildings13112798
Bridge management with AI, UAVs, and BIM. Automat Construct. 2025;175.
Deterioration rates of typical bridge elements in New York. J Bridge Eng. 2010;15(4):419-429.
Deterioration models for prediction of remaining useful life of timber and concrete bridges: a review. J Traffic Transport Eng (English Edition). 2020;7(2):152-173.
Machine learning approach for predicting bridge components’ condition ratings. Front Built Environ. 2023;9.
Prediction of bridge deck condition rating based on artificial neural networks. J Sci Technol Civil Eng (STCE)-NUCE. 2019;13(3):15-25.
Developing bridge deterioration models using an artificial neural network. Infrastructures. 2022;7(8).
Bridge infrastructure asset management system: Comparative computational machine learning approach for evaluating and predicting deck deterioration conditions. J Infrastruct Syst. 2020;26(3). doi:10.1061/(asce)is.1943-555x.0000572
Bridge condition deterioration prediction using the whale optimization algorithm and extreme learning machine. Buildings. 2023;13(11).
Predicting concrete bridge deck deterioration: a hyperparameter optimization approach. J Perform Const Facilit. 2024;38(3).
Development and utilization of bridge data of the United States for predicting deck condition rating using random forest, XGBoost, and artificial neural network. Rem Sens. 2024;16(2). doi:10.3390/rs16020367
Optimizing machine learning algorithms for improving prediction of bridge deck deterioration: a case study of Ohio bridges. Buildings. 2023;13(6). doi:10.3390/buildings13061517
An improved inspection process and machine-learning-assisted bridge condition prediction model. Buildings. 2023;13(10).
Comparison of Markov chain and recurrent neural network in predicting bridge deterioration considering various factors. Struct Infrastruct Eng. 2024;20(2):250-262. doi:10.1080/15732479.2022.2087691
Modelling bridge deterioration using long short-term memory neural networks: a deep learning-based approach. Smart Sustain Built Environ. 2024;2018(11). doi:10.1108/sasbe-10-2023-0295
Enhancing bridge damage assessment: adaptive cell and deep learning approaches in time-series analysis. Construct Build Mat. 2024;439.
Using AI-based tools to quantify the technical condition of bridge structural components. Appl Sci. 2025;15(3).
Data-assisted prediction of deterioration of reinforced concrete bridges using physics-based models. J Infrastruct Syst. 2023;29(2).
Alternative sequence classification of neural networks for bridge deck condition rating. J Perform Construct Facil. 2023;37(4).
Recurrent neural network for quantitative time series ratings. Infrastructures. 2024;9(12).
Multivariable proportional hazards based probabilistic model for bridge deterioration forecasting. J Infrastruct Syst. 2020;26(2).
Deep learning for survival analysis: a review. Artif Intell Rev. 2024;57(3).
DeepSurv: personalized treatment recommender system using a Cox proportional hazards deep neural network. BMC Med Res Methodol. 2018;18(1):1-12. doi:10.1186/s12874-018-0482-1
Time-to-event prediction with neural networks and Cox regression. J Mach Learn Res. 2019;20(129):1-30.
Deephit: a deep learning approach to survival analysis with competing risks. Published online 2018.
Dynamic-deephit: a deep learning approach for dynamic survival analysis with competing risks based on longitudinal data. IEEE Transact Biomed Eng. 2019;67(1):122-133. doi:10.1109/tbme.2019.2909027
DPWTE: a deep learning approach to survival analysis using a parsimonious mixture of Weibull distributions. Published online 2021:185-196.
SurvCNN: a discrete time-to-event cancer survival estimation framework using image representations of omics data. Cancers. 2021;13(13). doi:10.3390/cancers13133106
Weibull recurrent neural networks for failure prognosis using histogram data. Neural Computing and Applications. 2023;35(4):3011-3024.
Structural deterioration knowledge ontology towards physics-informed machine learning for enhanced bridge deterioration prediction. J Comput Civil Eng. 2023;37(1). doi:10.1061/(asce)cp.1943-5487.0001066
Life-Cycle of Structures and Infrastructure Systems. CRC Press; 2023.
Life-cycle management of deteriorating bridge networks with network-level risk bounds and system reliability analysis. Struct Saf. 2020;83(11). doi:10.1016/j.strusafe.2019.101911
Maintenance cost optimization for bridge structures using system reliability analysis and genetic algorithms. J Construct Eng Manag. 2018;144(2).
Multi-objective maintenance optimization model to minimize maintenance costs while maximizing performance of bridges. Built Environ. 2023;4:523-531.
Maintenance intervention predictions using entity-embedding neural networks. Automat Construct. 2020;116.
AI-based bridge maintenance management: a comprehensive review. Artif Intell Rev. 2025;58(5).
Deep neural networks for asphalt pavement distress detection and condition assessment. 2023;12734:251-262.
Forecasting bridge damage within a predictive Structural Reliability-based DSS. Automat Construct. 2024;168.
Optimal policy for structure maintenance: a deep reinforcement learning framework. Struct Saf. 2020;83.
A novel decision support system for long-term management of bridge networks. Appl Sci. 2021;11(13).
Resource-constrained bridge maintenance optimization by harmonizing structural safety and maintenance duration. Eng Struct. 2024;308(13). doi:10.1016/j.engstruct.2024.118024
Network-level bridge deterioration prediction models that consider the effect of maintenance and rehabilitation. J Infrastruct Syst. 2022;28(1).
A data-driven approach for regional bridge condition assessment using inspection reports. Struct Cont Health Monit. 2022;29(4).
Revolutionizing bridge operation and maintenance with LLM-based agents: an overview of applications and insights. Published online 2024.
Condition assessment of highway bridges using textual data and natural language processing-(NLP-) based machine learning models. Struct Cont Health Monit. Published online 2023.
Research progress on intelligent operation and maintenance of bridges. J Traffic Transport Eng. Published online 2024.
Bridge inspection manual (M. Struzinsky, Ed.). Published online 2017.
Bridge and Large Culvert Inventory Manual. Published online 2020.
Automatic bridge crack detection using unmanned aerial vehicle and faster R-CNN. Constr Build Mater. 2023;362.
A Virtual-Reality-Based Training and Assessment System for Bridge Inspectors With an Assistant Drone. 2022.