A Physics-Informed Foundation Model for Real-Time High-Fidelity Structural Dynamics

Shiqiao Meng¹, Ying Zhou¹*, Qinghua Zheng², Bingxu Liao¹, Mushi Chang¹, Tianshu Zhang¹, Abouzar Jafari¹, Abderrahim Djerrad¹

¹State Key Laboratory of Disaster Reduction in Civil Engineering, Tongji University, Shanghai, China

²School of Computer Science and Technology, Tongji University, Shanghai, China

Abstract

Accurate and rapid structural-dynamics modeling is critical for structural design, disaster mitigation, and resilience assessment, yet existing computational frameworks rely almost exclusively on nonlinear finite-element analysis. Conventional finite-element analysis approaches require substantial computational resources, with individual simulations typically taking hours to days to complete, making real-time or city-wide structural assessments impractical. To overcome this fundamental limitation, we introduce SeisGPT, a physics-informed foundation model designed specifically to enable high-fidelity, real-time structural response prediction across extensive building portfolios encompassing diverse structural types and topologies. SeisGPT integrates structural mechanics principles with advanced deep-learning methodologies, including a physics-informed graph neural network encoder, a simplified dynamic-response embedding module, and a generative Transformer-based decoder. The model is pretrained on a large-scale dataset comprising over 2 million nonlinear elastoplastic FEA simulations—covering 270,000 AI-generated, code-compliant structural designs created via an automated generative workflow, as well as 694 real-world buildings—totaling more than 10 billion discrete response time-steps. For previously unseen buildings subjected to external loads, SeisGPT achieves displacement and acceleration predictions with less than 5% normalized error while providing an approximately 40,000-fold computational speedup over conventional FEA methods. Furthermore, by assimilating sparse sensor measurements, SeisGPT’s physics-guided latent representations refine prediction accuracy beyond that achievable with conventional FEA simulations, enabling real-time structural-health monitoring and damage localization. 
By integrating physics-informed modeling with scalable inference, SeisGPT establishes a widely applicable computational paradigm, paving the way for transformative advancements in structural dynamics.

Nonlinear structural dynamics provides the foundational framework for predicting how structures behave under diverse external dynamic excitations, including seismic ground motions, wind loads, traffic-induced vibrations, and mechanical impacts^1–4. Rapid, high-fidelity prediction of structural responses under such dynamic loading conditions yields critical insights for structural design optimization⁵, disaster mitigation⁶, operational maintenance⁷, and digital-twin applications⁸. Accurately characterizing these responses demands solving large-scale systems of nonlinear differential equations, typically through elastoplastic finite-element analysis (FEA)^9–11. Although rigorously formulated, conventional nonlinear FEA methods remain computationally intensive, often requiring hours to days per analysis^12–14, severely constraining their practical application in iterative structural design, rapid response assessment, and real-time decision-making^15,16. Moreover, conventional finite-element models inherently involve simplifications or idealizations of real structural geometries and material properties, introducing discrepancies that reduce the accuracy of computed responses under realistic dynamic excitations, particularly for existing structures subjected to complex loading scenarios¹⁷.

Over recent decades, the structural engineering community has increasingly adopted surrogate modeling techniques to overcome the computational bottleneck of classical FEA¹⁸. Early surrogate models typically employed simplified analytical methods or regression-based approximations. While these offered notable computational speedups, they often suffered from insufficient accuracy because they could not capture critical nonlinear behaviors, geometric complexities, or nuanced boundary-condition effects inherent in real structural systems. They also generalized poorly across varying structural configurations, such as differing heights, lateral-force-resisting systems, or plan irregularities commonly encountered in modern urban architecture. More recent AI-driven methods can be broadly categorized into two major streams: purely data-driven approaches, such as convolutional neural networks (CNN)^19–21, recurrent neural networks (RNN)^22–24, long short-term memory (LSTM)^25–28 networks, and Transformers^29,30, which primarily utilize observational data to capture complex underlying relationships without explicitly representing physical processes; and physics-informed approaches^31–37, which incorporate structural mechanics principles directly into machine learning (ML) frameworks.

Despite notable advances, several critical challenges persist in contemporary ML-based dynamic response modeling. One principal challenge is (i) the limited structural generalizability—existing models are predominantly data-driven or narrowly physics-informed, typically constrained to singular structural typologies such as standard frames or shear walls, thereby neglecting variations in building height, plan irregularities, material gradation, and hybrid-system connectivity. Consequently, these models frequently struggle to accurately represent continuum-level mechanics and the nonlinear elasto-plastic response of entire building structures under external excitation demand, resulting in considerable performance deterioration when applied beyond their initial training domain^20,34,35. Compounding this limitation is (ii) the lack of universal structural knowledge and the scarcity of experimentally validated, multi-type datasets, both of which hinder reliable predictions across different structural systems³⁵. Third, (iii) achieving high-resolution, spatially detailed predictions, especially in high-rise or geometrically irregular structures, remains challenging, as existing ML models typically trade fine-scale fidelity for computational speed^25,38,39. Furthermore, (iv) existing response-prediction models lack the capability to effectively integrate sparse sensor data, thus limiting their predictive accuracy from surpassing that of finite-element-based simulations.

Collectively, these limitations underscore the urgent need for a generalized, physics-consistent, data-rich foundation model capable of accurately and efficiently predicting responses of unseen structures across diverse typologies. Such a model must seamlessly generalize across varied building geometries, integrate essential structural mechanics to preserve physical realism, assimilate sparse sensor measurements to enhance prediction fidelity, and deliver results swiftly enough to support real-time decision-making.

To bridge this critical gap, we introduce SeisGPT, a physics-informed foundation model explicitly developed for structural response prediction. SeisGPT integrates a mass–stiffness-aware graph neural network encoder, a simplified dynamic-response module, and a causal Transformer-based decoder. The model is pretrained on two extensive, complementary datasets: (i) empirical response records from 694 real-world instrumented buildings, providing validated response histories; and (ii) a synthetic large-scale dataset consisting of over 270,000 AI-generated, code-compliant structural designs, encompassing frame, frame–wall hybrid, and shear-wall systems, totaling more than 10 billion discrete nonlinear response time-steps. A sliding-window training method efficiently streams these large-scale datasets through the optimization pipeline, enabling SeisGPT to internalize complex continuum mechanics and capture long-range temporal dependencies across structural systems. Furthermore, to facilitate practical reconstruction applications, SeisGPT incorporates a lightweight module capable of reconstructing full spatiotemporal responses from sparse, real-time sensor inputs. This functionality is particularly valuable for structural health monitoring, rapid post-earthquake damage assessment, and fine-grained localization of structural deterioration, especially in environments lacking dense sensors. By effectively reconciling the long-standing trade-off between computational tractability and analytical fidelity, SeisGPT establishes a scalable and reusable computational paradigm for structural dynamics.

Structural response prediction with SeisGPT

SeisGPT leverages a pretrained deep-learning architecture capable of effectively generalizing across diverse structural configurations, material properties, and excitation scenarios. Trained on an extensive dataset comprising over 270,000 finite element models and 2 million finite element simulations, SeisGPT provides robust predictive capabilities suitable for real-time response prediction applications.

Central to SeisGPT is a deep-learning framework capable of processing extensive multidimensional inputs, including excitation data and essential structural parameters such as mass and stiffness matrices. The architecture incorporates physics-informed components, notably a simplified dynamic response (SDR) module that significantly enhances computational efficiency without compromising critical dynamic response fidelity. Additionally, a physics-informed graph neural network (GNN) is employed to encode complex topological and dynamic interactions within building structures, enabling detailed representation of structural variations under diverse external dynamic loading conditions.

SeisGPT is designed as an encoder-decoder architecture, as depicted in Fig. 1a. Initially, external dynamic excitation records and structural matrices are processed through a preprocessing module incorporating the SDR module, which efficiently generates preliminary floor-level response histories using simplified numerical approximations. These preliminary responses are then input into the encoder, where temporal and structural information are integrated through a time embedding module—specifically designed to encode sequential data—and subsequently refined by the physics-informed graph neural network. The resulting outputs from the GNN are further processed by a dedicated floor embedding module, which encodes structural responses at the floor level, before being passed to the decoder. The decoder, comprising six multi-head self-attention layers followed by a linear projection, generates forecasts of structural responses across multiple future time steps. Crucially, SeisGPT’s capability to produce accurate predictions from sparse sensor data allows it to effectively infer structural responses at locations without direct sensor measurements, thus reducing reliance on dense sensor networks, lowering deployment costs, and maintaining high prediction fidelity.
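The text does not spell out the numerical scheme inside the SDR module, so the following is only a minimal sketch of one plausible form: a linear-elastic, lumped-mass shear building integrated with the Newmark average-acceleration method to obtain preliminary floor-level histories from the mass and stiffness matrices and a ground-acceleration record. The function names `shear_building` and `newmark_linear`, the shear-building idealization, and all parameter values are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def shear_building(masses, stiffs):
    """Assemble lumped-mass M and tridiagonal K for a shear building (assumed idealization)."""
    n = len(masses)
    M = np.diag(np.asarray(masses, dtype=float))
    K = np.zeros((n, n))
    for i in range(n):
        K[i, i] += stiffs[i]              # story below floor i
        if i + 1 < n:
            K[i, i] += stiffs[i + 1]      # story above floor i
            K[i, i + 1] -= stiffs[i + 1]
            K[i + 1, i] -= stiffs[i + 1]
    return M, K

def newmark_linear(M, C, K, ag, dt, beta=0.25, gamma=0.5):
    """Integrate M u'' + C u' + K u = -M 1 ag(t) for a linear system.

    Returns relative displacements and absolute accelerations,
    each with shape (n_steps, n_floors).
    """
    ag = np.asarray(ag, dtype=float)
    n, steps = M.shape[0], len(ag)
    ones = np.ones(n)
    u = np.zeros((steps, n))
    v = np.zeros((steps, n))
    a = np.zeros((steps, n))
    # Initial relative acceleration from equilibrium with the structure at rest.
    a[0] = np.linalg.solve(M, -M @ ones * ag[0])
    # The effective stiffness is constant because the system is linear.
    K_eff = K + gamma / (beta * dt) * C + M / (beta * dt**2)
    for i in range(steps - 1):
        p = -M @ ones * ag[i + 1]
        rhs = (p
               + M @ (u[i] / (beta * dt**2) + v[i] / (beta * dt)
                      + (0.5 / beta - 1.0) * a[i])
               + C @ (gamma / (beta * dt) * u[i] + (gamma / beta - 1.0) * v[i]
                      + dt * (0.5 * gamma / beta - 1.0) * a[i]))
        u[i + 1] = np.linalg.solve(K_eff, rhs)
        a[i + 1] = ((u[i + 1] - u[i]) / (beta * dt**2) - v[i] / (beta * dt)
                    - (0.5 / beta - 1.0) * a[i])
        v[i + 1] = v[i] + dt * ((1.0 - gamma) * a[i] + gamma * a[i + 1])
    abs_acc = a + ag[:, None]  # absolute = relative + ground acceleration
    return u, abs_acc
```

In a pipeline of the kind described above, the returned floor histories would play the role of the preliminary responses passed to the encoder; a nonlinear or reduced-order scheme could be substituted without changing that interface.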


Figure 1 | SeisGPT Model Architecture and Training Strategy. a, Overview of the SeisGPT architecture, which adopts a large-scale encoder–decoder framework incorporating physics-informed modules. The model predicts structural responses over successive intervals using a sliding-window formulation, integrating both external excitations and structural parameters. b, The SeisGPT training strategy. Two-stage training pipeline comprising large-scale pretraining on synthetic building finite element (FE) response data and subsequent fine-tuning on real-world building datasets. The reconstruction-capable variant, SeisGPT-R (SeisGPT-Reconstruction), follows a parallel process, with pretraining on sparsified synthetic building response data and fine-tuning on sparse real-world inputs, enabling high-accuracy reconstruction of unsensed structural responses from limited observations.

SeisGPT employs a sliding time-window strategy, previously developed by the authors³⁵, for model training and inference. In this strategy, predictions are conditioned on external excitation data within a defined time interval $[E_t, E_{t+N}]$, with the model outputting corresponding structural responses for the subsequent interval $[R_{t+M}, R_{t+N}]$. In this study, the input window length $T_m$ is set to 1,000 time steps and the model predicts responses for the last 20 time steps ($T_p$). This approach enables SeisGPT to effectively capture evolving structural behaviors under dynamic excitation while mitigating cumulative prediction errors commonly encountered in autoregressive approaches. Predictions are generated independently across consecutive time windows, thus eliminating temporal drift and facilitating parallel inference of multiple future states. Consequently, this method robustly accommodates dynamic excitation inputs of varying durations, enabling rapid and reliable predictions of structural response.
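The windowing described here can be sketched as follows, reading "the last 20 time steps" as the tail of each 1,000-step input window. The helper names `make_windows` and `stitch`, the zero left-padding for the start of the record, and the stride equal to $T_p$ are assumptions chosen for illustration; the source does not state how record boundaries are handled.

```python
import numpy as np

def make_windows(excitation, t_m=1000, t_p=20):
    """Cut an excitation record into overlapping input windows.

    Window k covers t_m steps and is responsible for the t_p steps at its
    tail, so consecutive windows advance by t_p and their tails tile the
    record. Zero padding (an assumption) supplies history before t = 0
    and fills the final, possibly partial, window.
    """
    exc = np.asarray(excitation, dtype=float)
    n_win = -(-len(exc) // t_p)  # ceil(len / t_p)
    padded = np.concatenate([
        np.zeros(t_m - t_p),               # structure at rest before the record
        exc,
        np.zeros(n_win * t_p - len(exc)),  # right padding to a whole window
    ])
    return np.stack([padded[k * t_p : k * t_p + t_m] for k in range(n_win)])

def stitch(pred_windows, n_steps):
    """Join per-window predictions (n_win, t_p, n_floors) into (n_steps, n_floors)."""
    n_win, t_p, n_floors = pred_windows.shape
    return pred_windows.reshape(n_win * t_p, n_floors)[:n_steps]
```

Because each row of the returned array depends only on the excitation, all windows can be batched through the model in one forward pass, which is what makes the non-autoregressive formulation parallel and free of accumulated drift.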

The SeisGPT framework consists of two specialized models—SeisGPT-Enhanced and SeisGPT-R-Enhanced—each designed for distinct scenarios of data availability. The development process for these models involves a structured two-phase approach: large-scale pretraining followed by targeted fine-tuning. Initially, a foundational model (SeisGPT-Base) is pretrained on an expansive synthetic building dataset comprising over two million finite element simulations. This comprehensive dataset spans diverse structural typologies and a wide range of input excitations, providing the model with robust generalization capabilities across various structural configurations. Overlapping time-window sampling further enriches this dataset, increasing training efficiency and enhancing temporal context learning.

Subsequently, to better reflect real-world structural dynamics, SeisGPT-Base is fine-tuned using a dataset of 694 FE models derived from existing building structures, resulting in SeisGPT-Enhanced. This fine-tuning substantially improves predictive fidelity, aligning the model more closely with practical engineering scenarios. Additionally, SeisGPT-Enhanced can be rapidly adapted to specific buildings when FE analyses are available, employing low-rank adaptation (LoRA)—an efficient parameter-tuning approach requiring minimal computational overhead.

To address the challenge of limited observational data, the framework introduces SeisGPT-R (SeisGPT-Reconstruction), a specialized variant designed to infer complete structural response profiles from sparse sensor measurements. The model is first pretrained as SeisGPT-R-Base on a sparsified version of the synthetic building response dataset, enabling it to learn structural response patterns under incomplete input conditions. It is subsequently fine-tuned on sparse response data from 694 real-world FE building models, yielding the final model SeisGPT-R-Enhanced. This reconstruction-capable variant achieves high-fidelity estimation of responses across all floors, even in the absence of dense instrumentation. Its ability to recover detailed structural dynamics from partial observations extends its applicability across a wide range of critical engineering scenarios, including structural health monitoring, post-earthquake damage evaluation, emergency response, retrofitting strategy development, and resilience-based design—offering a generalizable solution for data-scarce conditions in structural engineering.

Results

To rigorously evaluate the predictive performance of our model across diverse structural scenarios, we adopted three metrics specifically tailored to address the inherent variability in response magnitudes among floors: the floor-wise normalized mean absolute error (FNMAE), the floor-wise normalized root mean squared error (FNRMSE), and the Pearson correlation coefficient (R). Observed structural responses were regarded as the true reference values against which the accuracy of model-generated predictions was assessed. Given that absolute response magnitudes typically differ significantly across floors—primarily due to structural dynamics and amplitude attenuation with height—traditional error metrics would disproportionately weight floors exhibiting larger absolute responses. Consequently, to equitably evaluate prediction accuracy across all floors, responses at each floor were independently normalized by their maximum observed absolute value:

$M_f = \max_{1 \le t \le T} \left| O_{t,f} \right|$	(1)

where $O_{t,f}$ denotes the observed response at time step t and floor f. Normalized predictions and observations were thus defined as:

$\tilde{P}_{t,f} = \frac{P_{t,f}}{M_f}, \quad \tilde{O}_{t,f} = \frac{O_{t,f}}{M_f}$	(2)

Using these normalized responses, our evaluation metrics were computed as follows:

$\mathrm{FNMAE} = \frac{1}{T \cdot F} \sum_{t=1}^{T} \sum_{f=1}^{F} \left| \tilde{P}_{t,f} - \tilde{O}_{t,f} \right|$	(3)
$\mathrm{FNRMSE} = \sqrt{\frac{1}{T \cdot F} \sum_{t=1}^{T} \sum_{f=1}^{F} \left( \tilde{P}_{t,f} - \tilde{O}_{t,f} \right)^{2}}$	(4)
$R = \frac{1}{F} \sum_{f=1}^{F} \frac{\sum_{t=1}^{T} \left( P_{t,f} - \bar{P}_{f} \right) \left( O_{t,f} - \bar{O}_{f} \right)}{\sqrt{\sum_{t=1}^{T} \left( P_{t,f} - \bar{P}_{f} \right)^{2} \sum_{t=1}^{T} \left( O_{t,f} - \bar{O}_{f} \right)^{2}}}$	(5)

where T is the number of time steps, F represents the number of floors, and the overbars indicate temporal averages. These floor-wise normalized metrics ensure unbiased evaluation across floors by equally weighting prediction accuracy regardless of response magnitude. For each building, metrics were initially computed for every floor and subsequently averaged to yield representative building-level indicators. Finally, to comprehensively evaluate model performance across multiple structures, these building-level indicators were averaged over all buildings investigated in subsequent experimental analyses, facilitating robust comparative insights across various structural typologies.
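For a single building, the three metrics can be computed directly from the definitions above. The sketch below is a plain NumPy transcription; the function name `floorwise_metrics` and the array layout (time steps along rows, floors along columns) are choices made here for illustration.

```python
import numpy as np

def floorwise_metrics(pred, obs):
    """Return (FNMAE, FNRMSE, R) for arrays of shape (T, F).

    Each floor is normalized by its own peak observed absolute response,
    so floors with small responses weigh as much as floors with large ones.
    Assumes every floor has a nonzero peak and a non-constant history.
    """
    pred = np.asarray(pred, dtype=float)
    obs = np.asarray(obs, dtype=float)
    m_f = np.max(np.abs(obs), axis=0)       # per-floor peak M_f, shape (F,)
    p_n, o_n = pred / m_f, obs / m_f        # normalized predictions and observations
    fnmae = np.mean(np.abs(p_n - o_n))
    fnrmse = np.sqrt(np.mean((p_n - o_n) ** 2))
    # Pearson correlation per floor on the raw responses, then averaged over floors.
    pc = pred - pred.mean(axis=0)
    oc = obs - obs.mean(axis=0)
    r_f = (pc * oc).sum(axis=0) / np.sqrt((pc ** 2).sum(axis=0) * (oc ** 2).sum(axis=0))
    return fnmae, fnrmse, r_f.mean()
```

As a quick sanity check, adding a constant offset of 10% of each floor's peak to a perfect prediction gives FNMAE = FNRMSE = 0.1 while leaving R at 1, since correlation is insensitive to constant shifts.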

Pretraining SeisGPT-Base on the large-scale dataset for structural generalization

The SeisGPT-Base model was trained and evaluated on an exceptionally large-scale synthetic building dataset, consisting of over 2,000,000 finite element simulations that encompass three distinct structural systems: frame, frame-shear wall, and shear wall buildings. These 270,000 FE models were generated through a sophisticated intelligent design algorithm developed by the authors, as detailed in the Methods section. This novel algorithm autonomously constructs a diverse range of building configurations, including high-rise, mid-rise, and low-rise structures, across multiple structural typologies. The diversity inherent in this dataset ensures a comprehensive representation of real-world building designs, a critical aspect for training a model capable of addressing a wide variety of structural variations. Each configuration underwent a series of external excitation scenarios, utilizing ground motion records with varied magnitudes and frequency content to simulate diverse structural response conditions. The resulting dataset, comprising detailed acceleration and displacement time histories, was utilized as supervisory data for model training. Notably, the pretraining dataset used in this study includes over 10 billion time steps of floor response data, making it the largest response prediction dataset to date. This vast and diverse dataset not only facilitates the robust training of SeisGPT-Base but also significantly improves its ability to generalize across a broad spectrum of building typologies and external excitations.

Two separate models were developed: one to predict absolute accelerations and another to predict relative displacements. During testing, 1,000 previously unseen buildings from each structural category were selected, totaling 3,000 buildings. Figure 2 illustrates the prediction performance across the three structural types using various ML models, while detailed metric values are reported in Extended Data Table 1. For acceleration prediction, SeisGPT-Base achieved FNMAEs of 0.0226, 0.0216, and 0.0250 for frame, frame–shear wall, and shear wall structures, respectively. For displacement prediction, the corresponding FNMAE values were 0.0437, 0.0416, and 0.0633. These results demonstrate that SeisGPT-Base, pretrained on a diverse and extensive simulation dataset, can deliver high-fidelity predictions for both acceleration and displacement time histories, even for previously unseen structural configurations and inputs, and outperforms existing ML models. This highlights the model’s strong generalizability and its potential as a rapid, accurate surrogate model for structural response prediction across diverse building typologies and scenarios. A visualization of the prediction accuracy for a representative shear wall structure is provided in Extended Data Fig. 1.

To further assess the effectiveness of the SeisGPT architecture, we compared the performance of SeisGPT-Base against several baseline models, including GRU, LSTM, TimesNet, N-Beats, and Informer. Specifically, for the GRU and LSTM models, we replaced the SeisGPT-Base decoder (see Fig. 1a) with the respective model’s decoder while retaining its original encoder, resulting in two hybrid variants: SeisGRU and SeisLSTM. As shown in Fig. 2, SeisGPT consistently outperforms all other models in predicting both acceleration and displacement across all structural types, achieving the best overall performance. Beyond predictive accuracy, SeisGPT also offers a significant advantage in computational efficiency over conventional numerical approaches such as FEM. In our experiments, we compared the runtime of SeisGPT and FEM for predicting structural responses. The results show that SeisGPT delivers predictions several orders of magnitude faster than FEM. For inference on 1,000 buildings, SeisGPT achieved speedups of approximately 4,400×, 17,800×, and 40,500× over traditional FEM for frame, frame–shear wall, and shear wall structures, respectively. FEM simulations were conducted on an Intel® Xeon® CPU Max 9468, while SeisGPT inference was performed on an NVIDIA H800 GPU. The model’s ability to accurately capture full-building structural responses across distinct structural typologies is further visualized in Extended Data Fig. 2, which shows full-building response predictions by SeisGPT-Base for representative frame, frame–shear wall, and shear wall structures in close agreement with FEM results under seismic excitation.

Figure 2 | Comparison of response predictions across building types using SeisGPT-Base and other machine learning models. a–c, Model prediction performance for absolute accelerations of (a) frame, (b) frame–shear wall, and (c) shear wall structures. d–f, Model prediction performance for relative displacements of (d) frame, (e) frame–shear wall, and (f) shear wall structures. The models compared include SeisGPT, SeisGRU, SeisLSTM, TimesNet, N-Beats, and Informer (M1–M6). g, Time-history comparison of acceleration and displacement responses for a shear wall structure subjected to ground motion RSN13399. SeisGPT-Base predictions exhibit the closest alignment with the reference FEM results, accurately capturing both the amplitude and timing of response fluctuations, and achieving the lowest peak-value errors among all evaluated models.

Fine-tuning SeisGPT-Enhanced with real-world building response dataset

Because real-world building data are limited, SeisGPT-Base was pretrained on a synthetic dataset of 270,000 generated building models adhering to design standards, with corresponding structural responses obtained through FEM to ensure reliability and physical accuracy. Following this pretraining, SeisGPT-Base was fine-tuned using FEM results from 694 real buildings (629 for training and 65 for testing), resulting in the SeisGPT-Enhanced model (see Fig. 1b). This fine-tuning aligned the model’s predictions more closely with the responses of real-world buildings.

To evaluate the effectiveness of the pretraining and fine-tuning approach, we compared the performance of three configurations: SeisGPT-Base (trained without fine-tuning), SeisGPT-Enhanced (fine-tuned using real building FEM data), and a version of SeisGPT trained solely on real building FEM data without prior pretraining. As shown in Figs. 3a and 3b, pretraining on the large-scale synthetic building dataset significantly improved model performance. The synthetic building dataset, which closely approximates real-world building responses, enabled the model to learn structural patterns applicable across a wide range of real buildings. Fine-tuning on real-world FEM data further enhanced predictive accuracy. As a result, SeisGPT-Enhanced showed a slight advantage over SeisGPT-Base in predicting both acceleration and displacement responses for real buildings, confirming that fine-tuning improved the model’s alignment with actual structural behavior (see Figs. 3a and 3b). Additionally, as shown in Figs. 3c and 3d, when compared to other models, SeisGPT-Enhanced consistently outperformed them in predicting both acceleration and displacement, further validating its superior accuracy in structural response predictions. Quantitative results are reported in Extended Data Table 2 and Extended Data Table 3. The model also demonstrated high-fidelity trajectory prediction performance in representative buildings from three structural categories, as shown in Extended Data Fig. 3.

Figure 3 | Response prediction performance of SeisGPT-Enhanced and other models on the real-building dataset. a, b, Comparison of acceleration (a) and displacement (b) prediction performance between SeisGPT-Base (A1), SeisGPT-Enhanced (A2), and a model trained directly on the real-building dataset without pretraining (A3), with the latter model trained for additional epochs to ensure a fair comparison. c, d, Comparison of acceleration (c) and displacement (d) prediction performance between SeisGPT-Enhanced and other models: SeisGPT, SeisGRU, SeisLSTM, TimesNet, N-Beats, and Informer (M1–M6).

Few-Shot Fine-Tuning of SeisGPT-Enhanced for Structure-Specific Adaptation

In post-earthquake damage assessment and incremental dynamic analysis (IDA), the availability of recorded seismic response histories for individual buildings is often limited. Efficient utilization of these sparse datasets could significantly enhance predictive accuracy for future seismic events. To examine this possibility, we explored the efficacy of few-shot fine-tuning to personalize the predictive performance of SeisGPT-Enhanced for individual structures. Specifically, starting from a globally fine-tuned baseline model, we employed low-rank adaptation (LoRA) to fine-tune SeisGPT-Enhanced using between one and nine recorded seismic response histories per building from a benchmark dataset comprising 65 real-world structures. An additional, unseen seismic response sequence per building was reserved for independent evaluation. The model was explicitly trained to forecast the relative displacement responses of the buildings under study.

As shown in Fig. 4a, even minimal fine-tuning yielded substantial performance gains: using only a single response sequence increased the Pearson correlation coefficient from 0.932 to 0.952 and simultaneously reduced the floor-wise normalized mean absolute error (FNMAE) by approximately 10%. Increasing the number of training sequences further improved accuracy, achieving FNMAE reductions of approximately 30% with three sequences and as high as 53% with nine sequences.

Additionally, we systematically examined how the placement of LoRA within the model architecture influenced prediction accuracy, evaluating four distinct adaptation configurations: (i) all linear layers, (ii) encoder-only, (iii) decoder-only, and (iv) output-layer-only. As illustrated in Fig. 4b, all configurations led to improved predictive performance, with adaptation applied to the encoder yielding the most substantial gains. The fine-tuning procedure proved highly efficient, requiring on average only about 20 seconds per building on an NVIDIA H800 GPU (bf16 precision). This computational efficiency underscores the practicality of SeisGPT-Enhanced for real-time seismic response forecasting, making it particularly valuable for rapid structural assessment and large-scale deployment in post-earthquake scenarios.
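To make the adaptation mechanism concrete, the sketch below shows the standard LoRA reparameterization of a single linear layer in NumPy: the pretrained weight stays frozen and only a low-rank update, scaled by alpha/rank, is trained. The class name `LoRALinear`, the rank, the scaling factor, and the layer size are illustrative assumptions; the text does not report the hyperparameters used for SeisGPT-Enhanced.

```python
import numpy as np

class LoRALinear:
    """Linear layer y = x W^T with a trainable low-rank update (bias omitted)."""

    def __init__(self, weight, rank=4, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        d_out, d_in = weight.shape
        self.weight = weight                              # frozen pretrained weight
        self.A = rng.normal(0.0, 0.01, size=(rank, d_in)) # trainable, small random init
        self.B = np.zeros((d_out, rank))                  # trainable, zero init
        self.scale = alpha / rank

    def __call__(self, x):
        # Frozen path plus low-rank path; with B = 0 the layer is unchanged.
        return x @ self.weight.T + self.scale * (x @ self.A.T) @ self.B.T

    def merged_weight(self):
        # After training, the update can be folded into W for inference.
        return self.weight + self.scale * self.B @ self.A

    def num_trainable(self):
        return self.A.size + self.B.size
```

For a 64 × 64 layer with rank 4, only 512 of 4,096 weights are trainable, which is consistent with per-building adaptation taking seconds rather than hours. The four placements compared in the text would correspond to choosing which linear layers are wrapped in this way.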

Figure 4 | Data-efficient specialization of SeisGPT-Enhanced with low-rank adaptation. a, Effect of training-set size on predictive accuracy. Fine-tuning SeisGPT-Enhanced for a single building with LoRA lowers prediction error by 10–50% with only one to nine recorded response sequences; accuracy improves monotonically as additional sequences are provided. b, Effect of architectural placement of LoRA modules on predictive accuracy. Applying LoRA to the encoder yields the most significant improvement, while restricting fine-tuning to the output layer provides the least. Error bars represent one standard deviation across test samples. For the Pearson correlation coefficient (R), which theoretically ranges from −1 to 1, error bars are reported as mean ± s.d.; in cases where the mean approaches 1, the symmetric confidence interval may extend slightly above 1.0 due to sampling variability. No artificial truncation of error bars was applied.

Sparse-sensor–driven building-scale structural response reconstruction via SeisGPT-R

SeisGPT-R enables accurate prediction and full-building response reconstruction (i.e., all floors) using only a limited number of sensor recordings from a building or its scaled-down model, leveraging sparse real-world data. This capability was achieved by continued pretraining and fine-tuning of SeisGPT-Base on large-scale datasets, including real structural data, specifically curated for data completion tasks. To validate the effectiveness of SeisGPT-R-Enhanced and its ability to utilize real sensor data for response reconstruction, the structural responses of 65 real buildings were reconstructed using SeisGPT-R-Enhanced. The study examined the impact of sensor placement at different floor levels relative to the total number of floors, evaluating prediction performance for both absolute acceleration and relative displacement. As shown in Fig. 5, reconstruction accuracy is sensitive to sensor location: placing the sensor near the mid-height of the structure yields the highest correlation and the lowest reconstruction errors. These results indicate that mid-level sensor placement provides the most informative input for accurate full-building response prediction under sparse sensing conditions.

To evaluate the effectiveness of SeisGPT-R-Enhanced under realistic conditions, we conducted a shake-table test on a 1/10-scaled model of a 43-story reinforced concrete moment-resisting frame building, subjected to 7 ground motion records⁴⁰. Displacement data from a sensor located on the 11th floor (the middle level of the model) were provided to SeisGPT-R-Enhanced, which was then used to predict the structural responses at other sensor locations throughout the building. The performance of SeisGPT-R-Enhanced was assessed by comparing its predictions against both finite element method (FEM) results and sensor measurements recorded at all locations for all 7 applied ground motions, as shown in Fig. 5.

As illustrated in Fig. 5, SeisGPT-R-Enhanced not only produced reconstructions with superior accuracy relative to FE analysis but also offered dramatic improvements in computational efficiency, reducing both inference time and resource demands by several orders of magnitude. These results underscore SeisGPT-R-Enhanced’s broad potential for real-time deployment in data-limited scenarios, supporting applications in post-earthquake assessment, rapid structural diagnostics, resilience planning, and beyond.

Fig. 5

SeisGPT-R performance. Effect of incorporating recorded real data from various floors on a. Predicted acceleration responses, b. Predicted displacement responses, with the x-axis indicating the relative floor position, c. The tested specimen on the shake table, d. Locations of displacement sensors on the tested sample, with the 11th-floor sensor used as input for SeisGPT-R-Enhanced and the responses of the other 10 sensors reconstructed, e. Comparison of SeisGPT-R-Enhanced and FEM in predicting displacement response time histories for the tested specimen, with SeisGPT-R-Enhanced showing improved performance as indicated by the R, FNMAE, and FNRMSE metrics, and f. Comparison of the predicted roof response using SeisGPT-R-Enhanced and FEM with recorded sensor data from the shake table test, focusing on a specific segment near the response peak.

Discussion

This study advances large-scale structural dynamics modeling by introducing SeisGPT, a physics-informed foundation model explicitly developed to resolve the longstanding tension between computational fidelity and practical efficiency. SeisGPT integrates a mass–stiffness-aware graph neural network encoder, a simplified dynamic-response embedding module, and a generative Transformer-based decoder, and is trained on more than 10 billion nonlinear elastoplastic response time-steps. This hybrid architecture combines deep generative learning methods with structural mechanics principles, enabling accurate modeling of complex nonlinear building responses. The approach enables real-time prediction of structural responses for previously unseen buildings, consistently outperforming existing state-of-the-art machine-learning surrogates and delivering speedups of up to four orders of magnitude over conventional nonlinear finite-element analysis. Notably, when combined with sparse sensor data, SeisGPT’s physics-informed latent representations refine predictions to achieve accuracies surpassing those obtained through full-scale FEA, even with partial observational data. These attributes position SeisGPT as a powerful tool for real-time structural health monitoring, rapid damage assessment, and detailed dynamic analyses.

Several core algorithmic innovations underpin SeisGPT’s robust performance. Firstly, the dual-corpus pre-training strategy—integrating an extensive synthetic dataset of 270,000 AI-generated, code-compliant structural designs with empirical data from 694 instrumented real-world buildings—algorithmically enhances the model’s capacity for structural generalization. This strategy systematically exposes SeisGPT to diverse geometric and material configurations, overcoming the inherent limitations of conventional typology-specific surrogates. Secondly, SeisGPT introduces a novel approach that leverages a physics-informed encoder to explicitly embed structural mechanics information, coupled with a specialized mapping mechanism designed to accurately represent and generalize building structural dynamics. This strategy also enables rapid fine-tuning with minimal building-specific observational data, significantly enhancing prediction accuracy without extensive computational overhead or retraining efforts. The base model, SeisGPT-Base, was trained on over 270,000 finite element models and 2 million finite element simulations of various building types and demonstrated strong performance in predicting both acceleration and displacement responses, achieving normalized FNMAE ranging from 0.02 to 0.06 across different building configurations. This robust transferability underscores the model’s versatility and practical applicability. Built on a bottom-up, data-driven architecture, SeisGPT offers a major advantage in computational efficiency, particularly when compared to traditional FEM. It achieves orders-of-magnitude speedups—up to 40,000 times faster than FEM for certain building types—enabling real-time structural response predictions. 
Lastly, the reconstruction variant, SeisGPT‑R, fuses sparse sensor measurements with its physics‑informed latent space to deliver reconstruction errors that are not only robust under severe data sparsity but also consistently lower than those produced by full nonlinear FEA carried out on the same partial information. This superior fidelity—achieved at millisecond‑level runtimes—greatly strengthens the model’s utility for continuous city‑scale health monitoring, rapid post‑disaster diagnostics, and resilience dashboards in instrument‑limited settings.

Despite these promising advances, several areas require further investigation to strengthen SeisGPT’s applicability and reliability. Firstly, the current datasets do not extensively cover extreme structural responses such as progressive collapse or complex hybrid structural systems. Expanding the training corpus to include detailed, high-resolution simulations of extreme scenarios and additional structural typologies would substantially enhance model robustness and versatility. Secondly, while the present validation focuses on external dynamic loading, the inherent generality of SeisGPT’s architecture provides a foundation for future adaptation to other types of loading conditions, such as blast or fire events, thereby extending its applicability to comprehensive multi-hazard resilience assessment. Lastly, although SeisGPT already demonstrates real-time performance on conventional GPU hardware, deploying it at city-wide scales may necessitate further computational optimizations, including model pruning, quantization, and edge-oriented deployment strategies.

In conclusion, SeisGPT presents a scalable, algorithmically innovative framework combining robust physics-informed structural mechanics with advanced generative deep learning methodologies. Its generalized modeling capabilities significantly advance the field of structural dynamics, facilitating rapid and accurate predictive analytics across diverse building types and loading scenarios. Continued expansion and validation across broader structural and hazard contexts promise to elevate SeisGPT into an essential computational tool for resilience assessment, proactive disaster mitigation, and evidence-based structural management.

Acknowledgments

The authors gratefully acknowledge support from the Distinguished Young Scientist Fund of the National Natural Science Foundation of China (Grant No. 52025083), the National Key Research and Development Program of China (Grant No. 2023YFC3805000), the XPLORE PRIZE (Grant No. XP202342), and the Shanghai Urban Digital Transformation Special Fund (Grant No. 202201033).

Code availability

The source code is archived on Zenodo and can be accessed during peer review via the private link: https://zenodo.org/records/15534060?preview=1&token=eyJhbGciOiJIUzUxMiIsImlhdCI6MTc0ODQ0MDIxMywiZXhwIjoxNzk4NTg4Nzk5fQ.eyJpZCI6ImMwOGRhNjVkLWMyMzEtNDBjNy04NDk3LWIxYzM5NzI0Y2RkOCIsImRhdGEiOnt9LCJyYW5kb20iOiJjMzZkNzIzZmQ2NTQ4NmQwODc1ZjUzMjk5MWRkZGY3NCJ9.RyNNZ7fkER36GMEoCvzAUFV58wfRHhE8oY5VLqU9F1g_EefJUHKHlmeoKjbZf2-noiaZ1T_5ndJ_7nDIcRwAtw. The code will be made public upon publication.

Data availability

The dataset used in this study is deposited at the same Zenodo record and is available to editors and reviewers through the link above. The dataset will be made public upon publication.

Author contributions

Conceptualization: Y.Z., S.M., Q.Z.; Methodology and software: S.M., Y.Z.; Data curation: S.M., B.L., M.C., T.Z.; Resources: Y.Z., Q.Z.; Validation and formal analysis: S.M., A.J., A.D.; Writing – original draft: S.M.; Writing – review and editing: Y.Z., S.M., A.J., A.D.; Supervision and funding acquisition: Y.Z., Q.Z.

References

1. Mestav Sarica, G. & Pan, T.-C. Seismic loss dynamics in three Asian megacities using a macro-level approach based on socioeconomic exposure indicators. Communications Earth & Environment 3, 101 (2022).

2. Birkmann, J., Welle, T., Solecki, W., Lwasa, S. & Garschagen, M. Boost resilience of small and mid-sized cities. Nature 537, 605–608 (2016).

3. Persson, P., Andersen, L. V., Persson, K. & Bucinskas, P. Effect of structural design on traffic-induced building vibrations. Procedia Engineering 199, 2711–2716 (2017).

4. Zhao, S., Zhang, C., Dai, X. & Yan, Z. Review of wind-induced effects estimation through nonlinear analysis of tall buildings, high-rise structures, flexible bridges and transmission lines. Buildings 13, 2033 (2023).

5. Wang, Y., Zhao, D. & Li, H. Finite Element Model Updating Technique for Super High-Rise Building Based on Response Surface Method. Buildings 15, 126 (2025).

6. Abdelnaby, A. E. & Elnashai, A. S. Numerical modeling and analysis of RC frames subjected to multiple earthquakes. Earthquakes and Structures 9, 957–981 (2015).

7. Li, X. et al. Mechanics-informed autoencoder enables automated detection and localization of unforeseen structural damage. Nature Communications 15, 9229 (2024).

8. Cao, Z. et al. Dynamic sensitivity-based finite element model updating for nonlinear structures using time-domain responses. International Journal of Mechanical Sciences 184, 105788 (2020).

9. Yao, M. Nonlinear structural dynamic finite element analysis using Ritz vector reduced basis method. Shock and Vibration 3, 259–268 (1996).

10. Ohsaki, M. et al. High-precision finite element analysis of elastoplastic dynamic responses of super‐high‐rise steel frames. Earthquake engineering & structural dynamics 38, 635–654 (2009).

11. McKenna, F. OpenSees: a framework for earthquake engineering simulation. Computing in Science & Engineering 13, 58–66 (2011).

12. Moaveni, B., Conte, J. P. & Hemez, F. M. Uncertainty and Sensitivity Analysis of Damage Identification Results Obtained Using Finite Element Model Updating. Computer-Aided Civil and Infrastructure Engineering 24, 320–334 (2009).

13. Butt, F. & Omenzetter, P. Seismic response trends evaluation and finite element model calibration of an instrumented RC building considering soil-structure interaction and non-structural components. Engineering Structures 65, 111–123 (2014).

14. Karapiperis, K. & Kochmann, D. M. Prediction and control of fracture paths in disordered architected materials using graph neural networks. Communications Engineering 2, 32 (2023).

15. Cremen, G., Galasso, C. & McCloskey, J. A simulation-based framework for earthquake risk‐informed and people‐centered decision making on future urban planning. Earth's Future 10, e2021EF002388 (2022).

16. Li, Y. et al. The physics-based deterministic scenarios for earthquake hazards and losses of the Zhujiangkou fault in southern China. npj Natural Hazards 2, 6 (2025).

17. Ereiz, S., Duvnjak, I. & Jiménez-Alonso, J. F. Review of finite element model updating methods for structural applications. In Proceedings of the Structures, 684–723 (2022).

18. Wang, X. et al. Kriging-based surrogate data-enriching artificial neural network prediction of strength and permeability of permeable cement-stabilized base. Nature Communications 15, 4891 (2024).

19. LeCun, Y. & Bengio, Y. Convolutional networks for images, speech, and time series. The handbook of brain theory and neural networks 3361, 1995 (1995).

20. Wang, T. Y. et al. Probabilistic Seismic Response Prediction of Three-Dimensional Structures Based on Bayesian Convolutional Neural Network. Sensors 22 (2022).

21. Wu, R.-T. & Jahanshahi, M. R. Deep convolutional neural network for structural dynamic response estimation and system identification. Journal of Engineering Mechanics 145, 04018125 (2019).

22. Chung, J., Gulcehre, C., Cho, K. & Bengio, Y. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555 (2014).

23. Perez-Ramirez, C. A. et al. Recurrent neural network model with Bayesian training and mutual information for response prediction of large buildings. Engineering Structures 178, 603–615 (2019).

24. Hu, Y., Tsang, H. H., Lam, N. & Lumantarna, E. Physics-informed neural networks for enhancing structural seismic response prediction with pseudo-labelling. Archives of Civil and Mechanical Engineering 24 (2023).

25. Zhang, R. et al. Deep long short-term memory networks for nonlinear structural seismic response prediction. Computers & Structures 220, 55–68 (2019).

26. Peng, H., Yan, J. W., Yu, Y. & Luo, Y. Z. Time series estimation based on deep Learning for structural dynamic nonlinear prediction. Structures 29, 1016–1031 (2021).

27. Liu, F., Li, J. & Wang, L. PI-LSTM: Physics-informed long short-term memory network for structural response modeling. Engineering Structures 292, 116500 (2023).

28. Xu, Z., Chen, J., Shen, J. & Xiang, M. Recursive long short-term memory network for predicting nonlinear structural seismic response. Engineering Structures 250, 113406 (2022).

29. Vaswani, A. et al. Attention is all you need. Advances in neural information processing systems 30 (2017).

30. Zhang, Q. et al. Transformer-based structural seismic response prediction. Structures 61, 105929 (2024).

31. Eshkevari, S. S., Takác, M., Pakzad, S. N. & Jahani, M. DynNet: Physics-based neural architecture design for nonlinear structural response modeling and prediction. Engineering Structures 229 (2021).

32. Yu, Y., Yao, H. & Liu, Y. Structural dynamics simulation using a novel physics-guided machine learning method. Engineering Applications of Artificial Intelligence 96, 103947 (2020).

33. Zhang, R., Liu, Y. & Sun, H. Physics-guided convolutional neural network (PhyCNN) for data-driven seismic response modeling. Engineering Structures 215, 110704 (2020).

34. Zhang, R., Liu, Y. & Sun, H. Physics-informed multi-LSTM networks for metamodeling of nonlinear structures. Computer Methods in Applied Mechanics and Engineering 369, 113226 (2020).

35. Zhou, Y., Meng, S., Lou, Y. & Kong, Q. Physics-informed deep learning-based real-time structural response prediction method. Engineering 35, 140–157 (2024).

36. Chen, Z., Liu, Y. & Sun, H. Physics-informed learning of governing equations from scarce data. Nature communications 12, 6136 (2021).

37. Okazaki, T., Ito, T., Hirahara, K. & Ueda, N. Physics-informed deep learning approach for modeling crustal deformation. Nature Communications 13, 7092 (2022).

38. Mangalathu, S. & Jeon, J.-S. Classification of failure mode and prediction of shear strength for reinforced concrete beam-column joints using machine learning techniques. Engineering Structures 160, 85–94 (2018).

39. Sun, H., Burton, H. V. & Huang, H. Machine learning applications for building structural design and performance assessment: State-of-the-art review. Journal of Building Engineering 33, 101816 (2021).

40. Zhou, Y. et al. Seismic performance of a frame-supported shear wall over‐track building through shaking table test. The Structural Design of Tall and Special Buildings 33, e2098 (2024).

41. Ancheta, T. D., Darragh, R. B., Stewart, J. P., Seyhan, E., Silva, W. J., Chiou, B. S. J. & Donahue, J. L. NGA-West2 Database. PEER Report 2013/03. Pacific Earthquake Engineering Research Center, University of California, Berkeley, 2014.

42. Zhang, L., Rao, A. & Agrawala, M. Adding conditional control to text-to-image diffusion models. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 3836–3847 (2023).

43. Ministry of Housing and Urban-Rural Development of the People’s Republic of China. Code for Seismic Design of Buildings. GB 50011–2010. Beijing: 2016.

44. Ministry of Housing and Urban-Rural Development of the PRC. Technical Specification for Concrete Structures of Tall Buildings. JGJ 3-2010. Beijing: China Architecture & Building Press, 2010.

45. Hendrycks, D. & Gimpel, K. Gaussian error linear units (gelus). arXiv preprint arXiv:1606.08415 (2016).

46. Ainslie, J. et al. Gqa: Training generalized multi-query transformer models from multi-head checkpoints. arXiv preprint arXiv:2305.13245 (2023).

47. Su, J. et al. Roformer: Enhanced transformer with rotary position embedding. Neurocomputing 568, 127063 (2024).

48. Shazeer, N. Glu variants improve transformer. arXiv preprint arXiv:2002.05202 (2020).

Methods

Numerical module: Dataset acquisition and processing

The development of a robust numerical module is essential for laying the foundation of the deep learning model in this study. This module focuses on acquiring and processing high-quality data that accurately represents real-world dynamic scenarios. Acknowledging the critical role of high-fidelity data in enhancing deep learning performance, we created a comprehensive dataset using non-linear time history analysis (NLTHA) simulations.

1. External Excitation Selection: To robustly generalize structural response predictions across a diverse spectrum of dynamic loading scenarios, the external excitation database includes seismic ground motions, stochastic white noise, subway-induced vibrations, and impulsive loading. A total of 26,000 dynamic records were assembled, with the majority sourced from realistic seismic ground-motion records⁴¹ spanning predominantly mid-frequency ranges (approximately 1–10 Hz). These seismic records were supplemented with other dynamic excitations strategically selected or synthesized to comprehensively cover lower (< 1 Hz) and higher (> 10 Hz) frequency domains. Amplitude scaling was consistently applied to maintain realistic intensity distributions aligned with established seismic design codes and structural engineering guidelines. All excitation records were uniformly sampled at a time interval of 0.02 seconds, ensuring high temporal resolution for subsequent nonlinear time-history analyses (NLTHA). By incorporating dynamic phenomena across multiple frequency scales (extending over three orders of magnitude), the diversified excitation dataset enhances SeisGPT’s predictive accuracy and generalizability, thereby advancing its potential as a foundational model for accurately predicting structural responses under a wide variety of external dynamic conditions.

2. Building Models: A total of 270,694 detailed macro-scale building models were developed for this study, encompassing four principal structural categories: 150,000 frame structures, 60,000 frame–shear wall structures, 60,000 shear wall structures, and 694 models derived from real-world buildings. These models span a wide spectrum of typologies (including high-rise, mid-rise, and low-rise configurations) and incorporate diverse functional programs, such as residential, office, and commercial types. A summary of all finite element models and corresponding nonlinear time-history simulation data used in this study is provided in Extended Data Table 4. To enable large-scale, diverse, and structurally valid building model generation, we developed a dedicated generative design pipeline built upon a family of customized diffusion models: ArchiFlux, StructFlux, and BeamFlux. These models were designed and trained by the authors, extending the capabilities of the base Flux architecture through targeted fine-tuning using ControlNet⁴², which conditions generation on architectural constraints. This framework allows for the synthesis of high-fidelity structural configurations with semantic consistency and compliance with seismic design codes. The training data consisted of semantically labeled architectural drawings from real-world buildings, in which key components (including partitions, windows, doors, beams, columns, and shear walls) were annotated using distinct color channels to facilitate visual and algorithmic parsing. These annotated plans formed a structured dataset for training our domain-specific generative models. Each model in the suite serves a specialized function: ArchiFlux synthesizes floor plans with spatial segmentation aligned to functional zoning; StructFlux generates the spatial arrangement of primary load-resisting elements such as columns and shear walls; and BeamFlux defines the distribution of secondary framing components.
The outputs from these models are then post-processed and evaluated to ensure structural coherence, such as alignment between vertical and horizontal load paths and adherence to minimum clearances and layout feasibility. To ensure compliance with seismic performance criteria, a final optimization phase adjusts the geometric dimensions and reinforcement ratios of structural components. This phase incorporates drift ratio constraints and flexural strength checks to satisfy inter-story performance limits and material capacity bounds. The result is a collection of 270,000 high-fidelity macro-models, encompassing 150,000 frame structures, 60,000 frame–shear wall structures, and 60,000 shear wall structures, all of which conform to relevant structural design specifications. The integration of advanced diffusion-based generation with physics-informed post-optimization marks a significant methodological advance, enabling automated, scalable production of realistic and code-compliant structural models. Further details of the model architecture, training protocol, and optimization workflow are provided in Extended Data Fig. 4. The input variables for these building structures and their structural elements were defined according to the specifications in GB/T50011-2010⁴³ and JGJ 3-2010⁴⁴. The generation of each building geometry and configuration model adhered to a structured process that maintained realistic relationships between parameters while allowing for controlled variability. To ensure a robust and versatile dataset, the range of values and combinations was intentionally extended beyond typical correlations, allowing for a more comprehensive exploration of the input space. This approach enhances the model’s generalizability by training it on a diverse set of building configurations. Key geometric parameters include floor height (ranging from 2.8 to 4.5 m), slab thickness (80 to 120 mm), and shear wall thickness (200 to 400 mm). 
The dimensions of columns and beams were automatically designed per design codes, with calculations for axial compression and flexural strength to ensure compliance. If any generated dimensions were found to be unreasonable, they were regenerated. The reinforcement ratio for all structural elements was automatically calculated according to the applicable design standards. The material properties, such as concrete strength (ranging from C25 to C50) and reinforcement steel bar strength (ranging from 355 to 400 MPa), were considered. This detailed generation process ensures that each building configuration is unique, realistic, and internally consistent, effectively representing real-world structural designs. The design parameters and their ranges for reinforced concrete (RC) buildings are summarized in Extended Data Table 5.

3. NLTHA Simulations: Each of the 270,694 building models was subjected to three randomly selected ground motions in both the x and y directions, resulting in a total of 2,053,880 dynamic simulations. All simulations were conducted using OpenSees¹¹, a widely adopted open-source platform for nonlinear structural analysis. Implicit time integration was employed for these simulations, as it enforces equilibrium at the end of each time step, making it particularly suitable for handling complex material behavior and large deformations, both of which are essential in dynamic simulations. The final output from the simulations consists of detailed NLTHA responses, providing crucial response data for each RC structure under the specified external excitation. This response data is critical for training, validating, and testing the deep learning models.

4. Dataset Splitting and Characteristics: To ensure robust model evaluation and generalizability, the dataset was divided into training and testing sets using a random selection process. The final split allocated 267,629 buildings for training and 3,065 buildings for testing. With a time interval of 0.02 seconds for each input excitation and the duration of the simulations, the entire dataset represents approximately 10 billion data points. This vast and diverse dataset provides a solid foundation for model training and evaluation across different structural types and scenarios.

The comprehensive nature of the dataset—encompassing various building types, external dynamic excitations, and structural responses—ensures that the resulting deep learning model will be well-equipped to handle a wide range of real-world scenarios. This robust numerical module forms the cornerstone of our study, enabling the development of a highly accurate and versatile response prediction model.
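The building-level random split described in item 4 can be sketched as follows. This is only an illustration under stated assumptions: the function name `split_buildings`, the integer building identifiers, and the fixed seed are hypothetical and are not taken from the released code. The point it shows is that the split is made over buildings rather than over individual simulations, so every simulation of a test building stays out of the training set.

```python
import numpy as np


def split_buildings(n_buildings=270_694, n_test=3_065, seed=0):
    """Randomly assign building indices to disjoint train and test sets.

    The counts follow the text (267,629 train / 3,065 test); the seed and
    the use of integer indices as building identifiers are assumptions.
    """
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_buildings)
    test_ids = np.sort(shuffled[:n_test])
    train_ids = np.sort(shuffled[n_test:])
    return train_ids, test_ids


train_ids, test_ids = split_buildings()
```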

Simplified dynamic response (SDR) module

The simplified dynamic-response (SDR) module is devised for rapid, coarse-resolution estimation of global structural response, converting a high-fidelity FE model into a compact mechanical surrogate. The complete workflow is illustrated in Extended Data Fig. 5. Floor masses were obtained by integrating the self-weight of the slabs, tributary masses from vertical members, and prescribed live-load allowances, resulting in the lumped mass matrix $\mathbf{M}$. This approach preserves the true vertical distribution of inertia, which governs modal participation in tall or irregular buildings. To capture lateral and torsional stiffness without using spring analogies, unit horizontal forces $F_{i}$ were applied sequentially at each floor of the FE model, and the resulting displacement field $U_{ij}$ was assembled. By enforcing equilibrium $\mathbf{K}\mathbf{U}=\mathbf{F}$ and imposing the symmetry constraint $K_{ij}=K_{ji}$, a dense, full-bandwidth stiffness matrix $\mathbf{K}$ was obtained by constrained least-squares minimisation, faithfully capturing shear–flexure interaction, diaphragm eccentricity, and other higher-order coupling effects that conventional tri-diagonal stick models overlook. The pair $\{\mathbf{K},\mathbf{M}\}$ thus defines a reduced multi-degree-of-freedom system that reproduces the fundamental natural frequencies and participation factors of the parent FE model while enabling time-history analyses several orders of magnitude faster. Agnostic to structural typology (moment frames, braced cores, or coupled shear walls) and tolerant of non-uniform mass–stiffness distributions, the SDR surrogate furnishes a mechanics-grounded backbone for downstream ML inference across heterogeneous building inventories, combining computational efficiency with accuracy.
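The symmetric least-squares identification of the stiffness matrix described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `fit_symmetric_stiffness`, the parametrisation of the unknowns by their upper-triangular entries, and the toy three-floor matrix are all assumptions introduced here. The sketch minimises the Frobenius norm of KU − F over symmetric K, where each column of U is the displacement field produced by the corresponding load column of F.

```python
import numpy as np


def fit_symmetric_stiffness(U, F):
    """Least-squares fit of a symmetric K such that K @ U ~= F.

    U : (n, m) displacements, one column per load case.
    F : (n, m) applied forces, one column per load case.
    Symmetry is enforced by solving only for the upper-triangular entries.
    """
    n, m = U.shape
    pairs = [(a, b) for a in range(n) for b in range(a, n)]
    A = np.zeros((n * m, len(pairs)))
    for col, (a, b) in enumerate(pairs):
        E = np.zeros((n, n))
        E[a, b] = 1.0
        E[b, a] = 1.0              # basis matrix for the entry pair (a, b)
        A[:, col] = (E @ U).ravel()
    theta, *_ = np.linalg.lstsq(A, F.ravel(), rcond=None)
    K = np.zeros((n, n))
    for value, (a, b) in zip(theta, pairs):
        K[a, b] = value
        K[b, a] = value
    return K


# Toy check (hypothetical 3-floor shear building, arbitrary units):
K_true = 1.0e3 * np.array([[2.0, -1.0, 0.0],
                           [-1.0, 2.0, -1.0],
                           [0.0, -1.0, 1.0]])
F = np.eye(3)                      # unit force applied at each floor in turn
U = np.linalg.solve(K_true, F)     # displacements an FE model would return
K_fit = fit_symmetric_stiffness(U, F)
```

With noise-free displacements the fit recovers the original matrix; with noisy FE output the symmetry constraint keeps the identified matrix physically admissible as a reciprocal system.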

Simplified dynamic response calculation

The SDR module produces a coarse-resolution approximation of the floor-wise dynamic response by transforming a high-fidelity finite element model into a simplified mechanical surrogate. By integrating story mass, stiffness, and structural dynamic equations, it embeds key structural parameters into the calculations, streamlining the task of fitting deep learning models to structural responses under external dynamic excitation.

The calculation method leverages a simplified numerical approach, specifically the Newmark-β method, which assumes linear acceleration changes within a time interval. This method incorporates floor displacement and acceleration, as described by the following equations:

$u_{n+1}=u_{n}+\Delta t\,\dot{u}_{n}+\Delta t^{2}\left(\frac{\left(1-2\beta\right)\ddot{u}_{n}}{2}+\beta\,\ddot{u}_{n+1}\right)$

(6)

$\dot{u}_{n+1}=\dot{u}_{n}+\Delta t\left(\left(1-\gamma\right)\ddot{u}_{n}+\gamma\,\ddot{u}_{n+1}\right)$

(7)

Here, $\gamma$ and $\beta$ are constants. The structural dynamics equation that must be satisfied is:

$M\ddot{u}_{n+1}+C\dot{u}_{n+1}+Ku_{n+1}=G$

(8)

By substituting Eqs. (6) and (7) into Eq. (8) and collecting the terms in $\ddot{u}_{n+1}$, the equation for $\ddot{u}_{n+1}$ is obtained:

$\ddot{u}_{n+1}=\left(M+\gamma\Delta t\,C+\beta\Delta t^{2}K\right)^{-1}\left(G-C\left(\dot{u}_{n}+\Delta t\left(1-\gamma\right)\ddot{u}_{n}\right)-K\left(u_{n}+\Delta t\,\dot{u}_{n}+\frac{\Delta t^{2}\left(1-2\beta\right)\ddot{u}_{n}}{2}\right)\right)$

(9)

Once $\ddot{u}_{n+1}$ is found, it is substituted back into Eqs. (6) and (7) to calculate $u_{n+1}$ and $\dot{u}_{n+1}$. The damping matrix $C$ uses Rayleigh damping, expressed as:

$C=\alpha M+\beta K$

(10)

Parameters α and β (here the Rayleigh coefficients, not the Newmark constant $\beta$ of Eqs. (6)–(9)) are determined by:

$\alpha=\frac{2\zeta\omega_{1}\omega_{2}}{\omega_{1}+\omega_{2}}$

(11)

$\beta=\frac{2\zeta}{\omega_{1}+\omega_{2}}$

(12)

where $\zeta$ is the damping ratio, and $\omega_{1}$ and $\omega_{2}$ are the first and second natural frequencies.

The structural dynamics equation is solved to obtain the nonlinear response, considering the damping matrix expressed via Rayleigh damping. This iterative process, based on the structural dynamic information extraction module, allows for accurate estimation of the nonlinear structural response.
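The time-stepping scheme and Rayleigh damping described in this section can be sketched as below. This is a hedged illustration for a linear reduced system, not the authors' SDR code: the function names, the default constants γ = 1/2 and β = 1/4 (the average-acceleration variant; β = 1/6 gives the linear-acceleration variant), the two-floor matrices, and the sinusoidal base excitation are all assumptions. The acceleration update solves the linear system that results from substituting the two Newmark relations into the equation of motion, which leads to the effective matrix M + γΔt C + βΔt² K.

```python
import numpy as np


def rayleigh_coefficients(M, K, zeta):
    """Mass- and stiffness-proportional Rayleigh coefficients.

    Calibrated so that the first two modes have damping ratio zeta.
    """
    omega_sq = np.sort(np.real(np.linalg.eigvals(np.linalg.solve(M, K))))
    w1, w2 = np.sqrt(omega_sq[:2])
    a_m = 2.0 * zeta * w1 * w2 / (w1 + w2)   # multiplies M
    b_k = 2.0 * zeta / (w1 + w2)             # multiplies K
    return a_m, b_k, (w1, w2)


def newmark_beta(M, C, K, G, dt, u0, v0, gamma=0.5, beta=0.25):
    """Integrate M a + C v + K u = G(t) with the Newmark-beta scheme.

    G has shape (n_steps, n_dof); returns displacement, velocity, acceleration.
    """
    n_steps, n_dof = G.shape
    u = np.zeros((n_steps, n_dof))
    v = np.zeros((n_steps, n_dof))
    a = np.zeros((n_steps, n_dof))
    u[0], v[0] = u0, v0
    a[0] = np.linalg.solve(M, G[0] - C @ v[0] - K @ u[0])
    M_eff = M + gamma * dt * C + beta * dt**2 * K
    for i in range(n_steps - 1):
        # Predictors: the parts of u and v that depend only on step i.
        u_pred = u[i] + dt * v[i] + 0.5 * dt**2 * (1.0 - 2.0 * beta) * a[i]
        v_pred = v[i] + dt * (1.0 - gamma) * a[i]
        a[i + 1] = np.linalg.solve(M_eff, G[i + 1] - C @ v_pred - K @ u_pred)
        u[i + 1] = u_pred + beta * dt**2 * a[i + 1]
        v[i + 1] = v_pred + gamma * dt * a[i + 1]
    return u, v, a


# Hypothetical two-floor example, sampled at 0.02 s as in the dataset.
M = np.eye(2)
K = 400.0 * np.array([[2.0, -1.0], [-1.0, 1.0]])
a_m, b_k, (w1, w2) = rayleigh_coefficients(M, K, zeta=0.05)
C = a_m * M + b_k * K
dt = 0.02
t = np.arange(500) * dt
ground_acc = 0.5 * np.sin(2.0 * np.pi * 1.0 * t)   # assumed base excitation
G = -ground_acc[:, None] * (M @ np.ones(2))[None, :]
u, v, a = newmark_beta(M, C, K, G, dt, np.zeros(2), np.zeros(2))
```

Forming the effective matrix once outside the loop is possible here because the surrogate is linear; a nonlinear model would need to update the stiffness and iterate within each step.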

SeisGPT core deep learning model

The SeisGPT model incorporates a feature embedding module, a physics-informed structural encoder (PhySE), and a response feature decoder (RFD). Its architecture integrates state-of-the-art techniques from machine learning and structural engineering to model complex interactions within structural systems and capture long-range dependencies in response predictions. The core components of the model include the PhySE, enhanced gated fusion modules, and Transformer blocks, each playing a critical role in learning physics-informed and data-driven representations. A detailed schematic of the SeisGPT architecture is shown in Fig. 6.

Feature embedding module: This module processes multiple data sources, including excitation data, simplified structural response data, and structural matrices (stiffness and mass). It transforms these inputs into a unified latent space utilized by the model’s subsequent layers, particularly the Transformer blocks. The excitation data $X_{e}\in\mathbb{R}^{B\times F\times T}$ and simplified structural response data $X_{s}\in\mathbb{R}^{B\times F\times T}$ were initially passed through linear transformations to embed the temporal sequences, followed by a dropout operation to mitigate overfitting during training.

Time embeddings were then computed for each input sequence, encoding the excitation and simplified structural response data into time-specific representations that capture the temporal dynamics of the inputs. In parallel, floor embeddings were generated by applying linear transformations to the structural data, ensuring consistent encoding across varying numbers of floors and capturing the physical characteristics of the structures.

The encoded features were subsequently processed by the PhySE, which integrates temporal and structural features along with the stiffness matrix $\mathbf{K}$ and mass matrix $\mathbf{M}$. The excitation, structural, and simplified response features were fused using the enhanced gated fusion mechanism, allowing the model to dynamically learn meaningful interactions among external excitation, simplified structural response, and building properties. To further capture temporal dependencies, sinusoidal positional encodings were precomputed and applied to the features, enabling the model to effectively represent the temporal structure inherent in the data.
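A precomputed sinusoidal positional encoding of the kind mentioned above can be sketched as follows. The text does not give the exact formula used in SeisGPT, so this sketch assumes the standard formulation of Vaswani et al.²⁹ with base 10000; the function name, the sequence length, and the embedding width are hypothetical.

```python
import numpy as np


def sinusoidal_positional_encoding(n_steps, d_model, base=10000.0):
    """Return an (n_steps, d_model) table of sine/cosine position codes.

    Even columns hold sines and odd columns hold cosines, with wavelengths
    growing geometrically across the embedding dimension.
    """
    if d_model % 2 != 0:
        raise ValueError("d_model must be even")
    positions = np.arange(n_steps)[:, None]
    inv_freq = np.exp(-np.log(base) * np.arange(0, d_model, 2) / d_model)
    table = np.zeros((n_steps, d_model))
    table[:, 0::2] = np.sin(positions * inv_freq)
    table[:, 1::2] = np.cos(positions * inv_freq)
    return table


# Assumed sizes: 1,500 time steps (30 s at 0.02 s) and a 64-wide embedding.
pe = sinusoidal_positional_encoding(n_steps=1500, d_model=64)
```

Because the table depends only on the sequence length and width, it can be computed once and added to the embedded features of every batch.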

Physics-informed structural encoder: The PhySE is central to incorporating structural knowledge into the learning process, leveraging both graph-based learning and physics-informed feature extraction. The encoder is designed to process the stiffness matrix $\mathbf{K}$ and the mass matrix $\mathbf{M}$, which represent key structural properties, and produce feature embeddings that capture the structural response.

The physics-informed graph neural network (PIGNN) incorporates structural domain knowledge into the GNN architecture to enhance the modeling of responses in structural systems. The model constructs a physics-informed graph, where nodes represent structural elements (floors in a building), and edges encode the physical connections between adjacent nodes, such as beams or walls. Node features were derived from the stiffness matrix $\boldsymbol{K} \in \mathbb{R}^{B \times N \times N}$ and the mass matrix $\boldsymbol{M} \in \mathbb{R}^{B \times N \times 1}$, where $B$ is the batch size and $N$ is the number of floors. The graph structure is formed by extracting the diagonal elements of the stiffness matrix to represent the stiffness associated with each node.
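One way to picture the graph construction described above is the following sketch. It assumes a chain graph in which each floor is connected only to the floors directly above and below it, and node features made of the diagonal stiffness entry and the floor mass. The function name `build_floor_graph` and the two-feature layout are hypothetical; the source does not specify them.

```python
import numpy as np

def build_floor_graph(K: np.ndarray, M: np.ndarray):
    """K: (B, N, N) stiffness matrices, M: (B, N, 1) floor masses.

    Returns node features of shape (B, N, 2) and a boolean (N, N)
    adjacency mask linking each floor to its adjacent floors.
    """
    n_floors = K.shape[1]
    k_diag = np.diagonal(K, axis1=1, axis2=2)             # (B, N) per-node stiffness
    node_feats = np.stack([k_diag, M[..., 0]], axis=-1)   # (B, N, 2)
    idx = np.arange(n_floors)
    adjacency = np.abs(idx[:, None] - idx[None, :]) == 1  # assumed chain topology
    return node_feats, adjacency

# Illustrative 3-story shear building with story stiffnesses 3, 2, 1.
K = np.array([[[5.0, -2.0, 0.0],
               [-2.0, 3.0, -1.0],
               [0.0, -1.0, 1.0]]])
M = np.array([[[2.0], [2.0], [1.0]]])
node_feats, adjacency = build_floor_graph(K, M)
```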

In the physics-informed graph attention layer, input features first undergo a feature projection step, where a fully connected layer is applied to reduce dimensionality. The input node features $X \in \mathbb{R}^{B \times N \times D}$ are then processed through multiple single-head attention mechanisms. For each attention head $h$, the feature vector $x_i$ undergoes a linear transformation:

$v_{i}^{h} = W_{h} x_{i}$

(13)

where $W_h$ is the weight matrix of the attention head. Edge features are computed as follows:

$e_{ij}^{h} = W_{e}^{h} \cdot \left[\sqrt{\frac{M_{ij}}{K_{ij}}},\; M_{ij},\; K_{ij}\right]$

(14)

The attention coefficients are calculated using the following equation:

$\alpha_{ij}^{h} = \frac{\exp\left(\mathrm{LeakyReLU}\left(W_{a}^{h}\left[Q_{i}^{h} \,\|\, K_{j}^{h} \,\|\, e_{ij}^{h}\right]\right)\right)}{\sum_{k \in N(i)} \exp\left(\mathrm{LeakyReLU}\left(W_{a}^{h}\left[Q_{i}^{h} \,\|\, K_{k}^{h} \,\|\, e_{ik}^{h}\right]\right)\right)}$

(15)

where $W_{a}^{h} \in \mathbb{R}^{2d}$ is a learnable weight vector associated with each attention head, $Q_{i}^{h}$ and $K_{j}^{h}$ are the query and key features of nodes $i$ and $j$ for head $h$ (not to be confused with the stiffness matrix $\boldsymbol{K}$), and $\|$ denotes vector concatenation. In the subsequent feature aggregation step, each node aggregates information from its neighboring nodes, weighted by the learned attention coefficients:

$\widehat{v}_{i}^{h} = \sigma\left(\sum_{j \in N(i)} \alpha_{ij}^{h} v_{j}^{h}\right)$

(16)

where $\sigma$ denotes an activation function, and $N(i)$ represents the set of neighboring nodes of node $i$. The outputs from all attention heads were then concatenated to form the updated feature representation for each node. This updated representation was subsequently processed through a layer normalization step and passed through a GELU⁴⁵ activation function. Finally, a residual connection adds the processed features back to the original input to complete the computation.
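The attention computation in Eqs. (13) to (16) can be sketched for a single head and a single building as follows. Several details are assumptions rather than the authors' implementation: the query and key features are taken to be linear projections of the node features, the pairwise mass is taken as the mean of the two floor masses, the magnitude of the stiffness entry is used so that the square root is defined, the score vector spans the full concatenation (length 2d plus the edge feature size), and tanh stands in for the unspecified activation. All parameter names are hypothetical.

```python
import numpy as np

def leaky_relu(z, slope=0.2):
    return np.where(z > 0, z, slope * z)

def physics_graph_attention_head(X, K_mat, m, adjacency, params):
    """One attention head over N floor nodes (each node needs >= 1 neighbor).

    X: (N, D) node features, K_mat: (N, N) stiffness matrix,
    m: (N,) floor masses, adjacency: (N, N) boolean neighbor mask.
    """
    n = X.shape[0]
    V = X @ params["W_h"].T            # Eq. (13): v_i = W_h x_i
    Q = X @ params["W_q"].T            # assumed query projection
    Kf = X @ params["W_k"].T           # assumed key projection
    # Eq. (14): physical descriptors [sqrt(M/K), M, K] mapped to edge features.
    m_pair = 0.5 * (m[:, None] + m[None, :])      # assumed pairwise mass
    k_pair = np.abs(K_mat) + 1e-8                 # assumed stiffness magnitude
    phys = np.stack([np.sqrt(m_pair / k_pair), m_pair, k_pair], axis=-1)
    E = phys @ params["W_e"].T                    # (N, N, d_e)
    # Eq. (15): score on [Q_i || K_j || e_ij], softmax over neighbors only.
    Qi = np.broadcast_to(Q[:, None, :], (n, n, Q.shape[1]))
    Kj = np.broadcast_to(Kf[None, :, :], (n, n, Kf.shape[1]))
    scores = leaky_relu(np.concatenate([Qi, Kj, E], axis=-1) @ params["w_a"])
    scores = np.where(adjacency, scores, -np.inf)
    scores = scores - scores.max(axis=1, keepdims=True)
    alpha = np.exp(scores)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    # Eq. (16): attention-weighted aggregation followed by an activation.
    return np.tanh(alpha @ V), alpha

rng = np.random.default_rng(0)
N, D, d, d_e = 3, 4, 5, 2
K_mat = np.array([[5.0, -2.0, 0.0], [-2.0, 3.0, -1.0], [0.0, -1.0, 1.0]])
m = np.array([2.0, 2.0, 1.0])
idx = np.arange(N)
adjacency = np.abs(idx[:, None] - idx[None, :]) == 1
params = {
    "W_h": rng.normal(size=(d, D)),
    "W_q": rng.normal(size=(d, D)),
    "W_k": rng.normal(size=(d, D)),
    "W_e": rng.normal(size=(d_e, 3)),
    "w_a": rng.normal(size=2 * d + d_e),
}
X = rng.normal(size=(N, D))
out, alpha = physics_graph_attention_head(X, K_mat, m, adjacency, params)
```

In a multi-head layer, the outputs of several such heads would be concatenated before the layer normalization, GELU activation, and residual connection described above.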

Response feature decoder: The RFD is a transformer-based neural network. It begins with an embedding layer that maps input tokens into higher-dimensional vector representations, expanding the number of channels to $C = 256$ and timesteps to $T = 4096$. This transformation enables the model to address data heterogeneity and handle diverse feature types by representing them in a format suitable for subsequent processing. Root mean square (RMS) normalization was then applied to stabilize and normalize the embedded vectors, which was particularly beneficial for time-series data.

$\bar{a}_{i} = \frac{a_{i}}{\mathrm{RMS}(a)}, \quad \text{where} \quad \mathrm{RMS}(a) = \sqrt{\frac{1}{d}\sum_{j=1}^{d} a_{j}^{2}}$

(17)

where $\bar{a}_{i}$ represents the normalized activation value, $a_{i}$ denotes the original activation value, and $d$ indicates the dimension of the token vector.
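Eq. (17) translates directly into a few lines of array code. The sketch below normalizes each token vector along its last axis; the small `eps` term is an assumption added for numerical safety, and the learnable gain that RMS normalization layers commonly include is omitted because the source does not mention one.

```python
import numpy as np

def rms_norm(a: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Divide each token vector (last axis) by its root mean square."""
    rms = np.sqrt(np.mean(a ** 2, axis=-1, keepdims=True))
    return a / (rms + eps)   # eps is an assumed safeguard against division by zero

tokens = np.array([[3.0, 4.0], [1.0, -1.0]])
normalized = rms_norm(tokens)
```

After normalization every token vector has a mean squared value of approximately one, regardless of its original amplitude.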

A key feature of the model is the grouped query attention⁴⁶ (GQA) mechanism, a variant of standard multi-head attention. GQA divides the query heads into groups, with each group sharing a single key head and value head, which reduces the memory needed for keys and values relative to full multi-head attention. As in standard self-attention, the mechanism weights the most informative features for each query. Rotary position embedding⁴⁷ (RoPE) was applied to the queries and keys within the GQA mechanism to encode positional information. The output from this attention layer then passed through a feed-forward network (FFN) that employs the SwiGLU⁴⁸ (Swish-gated linear unit) activation function, which has been shown to outperform the traditional ReLU activation function in certain cases. SwiGLU gates one linear projection of its input with the Swish (sigmoid-weighted linear unit) function of another; the Swish function is given by

$\mathrm{Swish}(x) = x \cdot \sigma(x), \quad \sigma(x) = \frac{1}{1 + e^{-x}}$

(18)
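Eq. (18) gives the sigmoid-weighted gate $x \cdot \sigma(x)$. In the SwiGLU feed-forward layer as commonly published, this gate is applied to one linear projection of the input and multiplied elementwise with a second projection before a final output projection. The sketch below follows that common form; the weight names and sizes are hypothetical, and bias terms are omitted as an assumption.

```python
import numpy as np

def swish(x):
    """Sigmoid-weighted gate of Eq. (18): x * sigmoid(x)."""
    return x / (1.0 + np.exp(-x))

def swiglu_ffn(x, W_gate, W_up, W_down):
    """Gated feed-forward layer: (swish(x W_gate) * (x W_up)) W_down."""
    return (swish(x @ W_gate) * (x @ W_up)) @ W_down

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 8))          # two tokens with 8 channels
W_gate = rng.normal(size=(8, 16))
W_up = rng.normal(size=(8, 16))
W_down = rng.normal(size=(16, 8))
y = swiglu_ffn(x, W_gate, W_up, W_down)
```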

Residual connections and additional RMS normalization were applied after both the attention and feed-forward layers to facilitate gradient flow and stabilize training. This block was repeated N times (here N denotes the number of stacked transformer blocks, not the number of floors), which deepens the model and enables it to learn complex patterns. After these repetitions, the output underwent another RMS normalization, followed by a linear transformation to map it to the desired output dimensions. Together, grouped query attention and SwiGLU improve the efficiency of processing long response sequences, which supports the RFD’s use for real-time structural response prediction.
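The grouped query attention used in these blocks can be sketched as follows. The example shows only the head-sharing idea: several query heads attend using the same key and value head. Rotary position embedding and any attention masking are left out for brevity, and the head counts and dimensions are illustrative assumptions rather than the model's actual configuration.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def grouped_query_attention(q, k, v):
    """q: (Hq, T, d); k and v: (Hkv, T, d), with Hq a multiple of Hkv."""
    n_q_heads, _, d = q.shape
    n_kv_heads = k.shape[0]
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("query heads must be a multiple of key/value heads")
    group = n_q_heads // n_kv_heads
    # Each key/value head is shared by `group` consecutive query heads.
    k_shared = np.repeat(k, group, axis=0)                  # (Hq, T, d)
    v_shared = np.repeat(v, group, axis=0)                  # (Hq, T, d)
    scores = q @ k_shared.transpose(0, 2, 1) / np.sqrt(d)   # (Hq, T, T)
    return softmax(scores, axis=-1) @ v_shared              # (Hq, T, d)

rng = np.random.default_rng(0)
q = rng.normal(size=(4, 6, 8))   # 4 query heads
k = rng.normal(size=(2, 6, 8))   # 2 shared key heads
v = np.ones((2, 6, 8))           # 2 shared value heads
out = grouped_query_attention(q, k, v)
```

With four query heads and two key/value heads, only half as many key and value tensors need to be stored compared with standard multi-head attention.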

Fine-Tuning with SeisGPT: While SeisGPT was designed for general structural response prediction, it could be fine-tuned to improve accuracy for a specific building, provided prior structural response data were available. This adaptation was achieved by adding low-rank adaptation (LoRA) layers to all linear layers in the response feature decoder. During fine-tuning, all model parameters except those in the LoRA layers were frozen, so only the small number of LoRA parameters were updated. This approach allowed SeisGPT to adapt efficiently to the structural characteristics of the target building and to improve prediction accuracy in real-world applications where building-specific data were available.
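The parameter saving from LoRA can be seen in a minimal sketch of a single adapted linear layer. The frozen weight is left untouched and a low-rank update is added to it. The rank, scaling factor, and initialization below follow common LoRA practice and are assumptions; the source does not report the values used for SeisGPT, and the class name `LoRALinear` is hypothetical.

```python
import numpy as np

class LoRALinear:
    """Frozen linear layer plus a trainable low-rank update (no bias)."""

    def __init__(self, W_frozen: np.ndarray, rank: int = 4, alpha: float = 8.0, seed: int = 0):
        rng = np.random.default_rng(seed)
        out_dim, in_dim = W_frozen.shape
        self.W = W_frozen                                      # frozen during fine-tuning
        self.A = rng.normal(scale=0.01, size=(rank, in_dim))   # trainable
        self.B = np.zeros((out_dim, rank))                     # trainable, zero-initialized
        self.scale = alpha / rank

    def __call__(self, x: np.ndarray) -> np.ndarray:
        # Effective weight is W + scale * B @ A, applied without forming it explicitly.
        return x @ self.W.T + self.scale * (x @ self.A.T) @ self.B.T

    def num_trainable(self) -> int:
        return self.A.size + self.B.size

rng = np.random.default_rng(1)
W = rng.normal(size=(32, 64))
layer = LoRALinear(W, rank=4)
x = rng.normal(size=(5, 64))
y = layer(x)
```

Because `B` starts at zero, the adapted layer initially reproduces the pretrained output exactly, and in this example only 384 parameters are trainable against 2,048 frozen ones.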

SeisGPT-R: SeisGPT-R augments the SeisGPT framework to enable accurate reconstruction of structural responses from sparse sensor measurements. The model ingests three primary input types: external excitation sequences ($X_e$), simplified response predictions ($X_s$) generated via the SDR module, and partial observational data ($X_r$) derived from sparse sensor instrumentation. To accommodate the sparse and irregular nature of $X_r$, dedicated time embedding and floor embedding layers preprocess these sensor inputs, ensuring appropriate temporal and spatial representation.

Critically, SeisGPT-R employs a learned gated fusion mechanism within its physics-informed structural encoder. Specifically, structural representations generated by the physics-informed graph neural network (PIGNN), denoted by $H_i$, are adaptively fused with sensor-derived embeddings ($S_i$). This fusion occurs at the feature level, after structural topology and dynamics have been encoded by the PIGNN, rather than at the raw input level. The floor-wise fusion is governed by an adaptive gating weight, $\alpha_i$, calculated via a linear transformation followed by a sigmoid activation function, as follows:

$\alpha_{i} = \sigma\left(W \cdot \left[H_{i} \,\|\, S_{i}\right] + b\right)$

(19)

The resulting fused feature, $Z_i$, is computed as a weighted combination:

$Z_{i} = \alpha_{i} \cdot S_{i} + (1 - \alpha_{i}) \cdot H_{i}$

(20)

where $W$ and $b$ are learnable parameters. This gating mechanism dynamically balances physics-informed structural priors against real observational data based on the reliability and availability of sensor measurements at each floor: a value of $\alpha_i$ near 1 favors the sensor-derived embedding, and a value near 0 favors the PIGNN representation. Subsequently, the fused representation $Z_i$ is passed through the decoder network to reconstruct detailed structural response profiles. This architecture promotes spatial continuity and physical coherence of the predicted responses and provides adaptability in data-limited scenarios, improving reconstruction accuracy and generalization across diverse structural typologies and external loading conditions.
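The gated fusion of Eqs. (19) and (20) can be written compactly as below. The sketch assumes an elementwise gate with one value per feature channel and floor; the source does not state whether the gate is a scalar or a vector per floor, and the shapes of `W` and `b` are chosen accordingly for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_fusion(H, S, W, b):
    """H, S: (N, d) structural and sensor features; W: (2d, d); b: (d,)."""
    alpha = sigmoid(np.concatenate([H, S], axis=-1) @ W + b)   # Eq. (19)
    Z = alpha * S + (1.0 - alpha) * H                          # Eq. (20)
    return Z, alpha

rng = np.random.default_rng(0)
n_floors, d = 4, 3
H = rng.normal(size=(n_floors, d))   # PIGNN structural representations
S = rng.normal(size=(n_floors, d))   # sensor-derived embeddings

# With zero weights and bias the gate is 0.5, so Z is the plain average.
Z_mid, alpha_mid = gated_fusion(H, S, np.zeros((2 * d, d)), np.zeros(d))

# A large positive bias drives the gate toward 1, so Z follows the sensors.
Z_sensor, alpha_sensor = gated_fusion(H, S, np.zeros((2 * d, d)), np.full(d, 50.0))
```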

Fig. 6

Architecture of the SeisGPT foundation model for physics-informed structural response prediction. The SeisGPT model adopts a hybrid encoder–decoder architecture that integrates structural physics with data-driven learning. The model processes three primary input types: external excitation sequences, simplified dynamic responses from the SDR module, and structural information (mass and stiffness matrices). These inputs are embedded with a time-embedding module and a floor-embedding module. Multimodal representations are integrated via an enhanced gated fusion mechanism and subsequently processed by the physics-informed structural encoder (PhySE), a graph-based module that encodes structural topology and dynamic characteristics using attention mechanisms modulated by stiffness and mass properties. In the SeisGPT-R variant, sparse observational data (e.g., sensor measurements) are processed through a separate encoder path and fused adaptively with structural features. The fused representation is decoded by the response feature decoder (RFD), which employs grouped-query attention, rotary positional encoding, and stacked SwiGLU-activated transformer blocks to produce accurate, floor-wise structural response predictions, including reconstructions from incomplete measurements.

Extended Data Table 1 | Quantitative comparison of structural response prediction performance across models and structural types.

Structure Type	Model	R (Acceleration)	R (Displacement)	FNMAE (Acceleration)	FNMAE (Displacement)	FNRMSE (Acceleration)	FNRMSE (Displacement)
Frame Structure	SeisGRU	0.8313	0.8015	0.0624	0.0685	0.0954	0.1023
	SeisLSTM	0.8317	0.8713	0.0611	0.0771	0.0932	0.1134
	TimesNet	0.8507	0.8880	0.0575	0.0685	0.0869	0.1013
	N-Beats	0.7533	0.8359	0.0737	0.0856	0.1130	0.1266
	Informer	0.7080	0.7902	0.1465	0.4401	0.1902	0.4850
	SeisGPT-Base	0.9754	0.9468	0.0226	0.0437	0.0363	0.0649
Frame-Shear Wall Structure	SeisGRU	0.8684	0.9454	0.0559	0.0738	0.0864	0.1127
	SeisLSTM	0.8668	0.8915	0.0558	0.0930	0.0864	0.1473
	TimesNet	0.8906	0.9018	0.0523	0.0729	0.0795	0.1121
	N-Beats	0.7690	0.8129	0.0729	0.1125	0.1135	0.2025
	Informer	0.6854	0.6678	0.2076	0.4142	0.3074	0.5287
	SeisGPT-Base	0.9832	0.9628	0.0216	0.0416	0.0339	0.0616
Shear Wall Structure	SeisGRU	0.8765	0.8713	0.0543	0.0909	0.0842	0.1282
	SeisLSTM	0.8512	0.8573	0.0588	0.0946	0.0915	0.1371
	TimesNet	0.9187	0.8554	0.0459	0.0985	0.0712	0.1437
	N-Beats	0.7195	0.7667	0.0766	0.1822	0.1198	0.4602
	Informer	0.6739	0.8072	0.1388	0.1972	0.2330	0.2890
	SeisGPT-Base	0.9747	0.9453	0.0250	0.0617	0.0394	0.0852

Average prediction performance of SeisGPT-Base and baseline models (SeisGRU, SeisLSTM, TimesNet, N-Beats, and Informer) evaluated on 3,000 previously unseen buildings, equally distributed across frame, frame–shear wall, and shear wall structures. Metrics include Floor-wise Normalized Mean Absolute Error (FNMAE), Floor-wise Normalized Root Mean Squared Error (FNRMSE), and Pearson correlation coefficient (R), computed separately for acceleration and displacement predictions. SeisGPT-Base consistently achieves the lowest error and highest correlation across all tasks and typologies, demonstrating superior generalization and fidelity in full-building response modeling.

Extended Data Table 2 | Quantitative evaluation of SeisGPT fine-tuning performance on real-world building response dataset.

Model	R (Acceleration)	R (Displacement)	FNMAE (Acceleration)	FNMAE (Displacement)	FNRMSE (Acceleration)	FNRMSE (Displacement)
SeisGRU	0.8551	0.8754	0.0656	0.0901	0.0858	0.1267
SeisLSTM	0.8152	0.8544	0.0617	0.0947	0.0957	0.1371
TimesNet	0.9145	0.8606	0.0470	0.1036	0.0719	0.1453
N-Beats	0.6721	0.7572	0.0775	0.1800	0.1217	0.4184
Informer	0.5805	0.8085	0.3300	0.3008	0.4227	0.4049
SeisGPT-Enhanced	0.9734	0.9435	0.0243	0.0633	0.0404	0.0881

Evaluation of structural response prediction accuracy for six models—SeisGPT-Enhanced, SeisGRU, SeisLSTM, TimesNet, N-Beats, and Informer—on a test set of 65 real buildings. Metrics reported include Floor-wise Normalized Mean Absolute Error (FNMAE), Floor-wise Normalized Root Mean Squared Error (FNRMSE), and Pearson correlation coefficient (R), computed separately for acceleration and displacement predictions. SeisGPT-Enhanced consistently achieves the lowest prediction errors and highest correlation across both tasks, demonstrating superior alignment with finite element reference responses and improved generalization to real structural systems.

Extended Data Table 3 | Impact of pretraining and fine-tuning on SeisGPT performance for real-world building structural response prediction.

Model	R (Acceleration)	R (Displacement)	FNMAE (Acceleration)	FNMAE (Displacement)	FNRMSE (Acceleration)	FNRMSE (Displacement)
A1	0.9424	0.8416	0.0379	0.1166	0.0586	0.2181
A2	0.9716	0.9395	0.0255	0.0656	0.0423	0.0920
A3	0.9734	0.9435	0.0243	0.0633	0.0404	0.0881

Comparison of three model configurations evaluated on 65 real buildings: a non-pretrained model (A1), trained from scratch on the real-world building dataset; SeisGPT-Base (A2), pretrained on the large-scale synthetic building dataset; and SeisGPT-Enhanced (A3), obtained by fine-tuning SeisGPT-Base on the same real-world building dataset. The A3 row matches the SeisGPT-Enhanced results in Extended Data Table 2. Fine-tuning consistently improves predictive accuracy over pretraining alone, while the model trained without pretraining exhibits notably higher errors, underscoring the value of a large-scale synthetic building dataset for learning transferable structural representations.

Extended Data Table 4 | Statistical Catalogue of Structural Typologies and Associated Finite-Element Analysis Case Counts.

Structure Type	Quantity	Number of Analysis Cases	Number of Floors
Frame Structure	150,000	1,300,000	1–10
Frame-Shear Wall Structure	60,000	520,000	11–20
Shear Wall Structure	60,000	220,000	10–30
Real World Structure	694	13,880	2–30
Total	270,694	2,053,880	-

This table presents the distribution of structural types and the corresponding number of FEM cases in the dataset used in this study. The statistics include the number of different building structure types as well as the number of cases analyzed for each type using finite element analysis.
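The totals in the table can be checked with a short calculation, which also gives the average number of analysis cases per building for each category. The dictionary keys are shorthand labels introduced here for illustration; the counts are copied from the table.

```python
# (number of buildings, number of FEA cases) per category, from the table above.
catalogue = {
    "frame": (150_000, 1_300_000),
    "frame_shear_wall": (60_000, 520_000),
    "shear_wall": (60_000, 220_000),
    "real_world": (694, 13_880),
}

total_buildings = sum(n_buildings for n_buildings, _ in catalogue.values())
total_cases = sum(n_cases for _, n_cases in catalogue.values())
cases_per_building = {
    name: n_cases / n_buildings
    for name, (n_buildings, n_cases) in catalogue.items()
}
```

The sums reproduce the reported totals of 270,694 buildings and 2,053,880 analysis cases, and each real-world building contributes 20 cases on average.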

Extended Data Table 5 | Parameterization and Sampling Ranges Employed in AI-Driven Finite-Element Model Generation.

	Parameter	Minimum	Maximum
Geometric and structural layout parameters	Number of stories	1	30
	Floor height (m)	2.8	4.5
	Slab thickness (mm)	80	150
	Column size (mm)	Designed per specifications, flexural strength calculated.
	Beam size (mm)	Designed per specifications, flexural strength calculated.
	Wall thickness (mm)	200	400
	Reinforcement	Automatically computed per specifications.
Material parameters	Concrete strength grade	C20	C50
	Rebar strength (MPa)	355	400

This table presents the ranges of design parameters for the generated buildings, produced by the AI-based algorithm used in this study. The parameters include key structural design values and their corresponding ranges.

Extended Data Fig. 1 | Visualization of full-building structural response predictions by SeisGPT-Base compared with FEM.

Visualization of predicted acceleration and displacement time histories for a 12-story shear wall structure subjected to ground motion RSN13399. SeisGPT-Base outputs are compared against reference responses generated by nonlinear finite element analysis. The results show that SeisGPT-Base closely reproduces the temporal and amplitude characteristics of the structural response.

Extended Data Fig. 2 | Full-building structural response predictions of SeisGPT-Base compared with FEM across three representative building typologies.

a, Prediction comparison for an 8-story frame structure. b, Results for a 20-story frame–shear wall structure. c, Predictions for a 23-story shear wall structure. All responses have been normalized by dividing by the peak absolute value of the ground motion, thus providing a scale-independent basis to evaluate and visualize the prediction accuracy of the model relative to FEM.

Extended Data Fig. 3 | Top-floor displacement trajectories predicted by SeisGPT-Enhanced compared with FEM under seismic excitations.

The figure shows top-floor displacement trajectories in the x and y directions for three representative building structures, with SeisGPT-Enhanced predictions compared against FEM results. Although the trajectories are plotted in two dimensions, the predictions in each direction were computed independently and subsequently combined for visualization. a, 9-story frame structure. b, 15-story frame–shear wall structure. c, 23-story shear wall structure.

Extended Data Fig. 4 | AI-driven workflow for structural design, automated FE modeling, and synthetic building generation.

a, Schematic of the end-to-end pipeline for AI-based structural design and automated finite element model generation. The process starts with the definition of design conditions, functional requirements, and structural constraints. Based on these inputs, three specialized deep learning models (ArchiFlux, StructFlux, and BeamFlux) were sequentially employed to generate the architectural layout (including doors, windows, and partitions), the spatial distribution of shear walls and columns, and the beam configuration, respectively. These models were fine-tuned from the Flux architecture using ControlNet, trained on a semantically annotated dataset derived from over 200 real-world architectural drawings, where structural elements were labeled by type using color-coded masks. ArchiFlux takes as input a conceptual sketch or spatial boundary and outputs the floorplan with openings and wall placements. StructFlux builds upon this to determine the arrangement of primary vertical load-resisting elements. BeamFlux generates beam layouts, with outputs post-processed using OpenCV to extract precise geometric placements. Detailed structural design follows, including material specification, reinforcement detailing, and optimization based on inter-story drift constraints. Iterative evaluations ensure compliance with performance criteria. The final structural system was automatically converted into a three-dimensional FE model using the Structural Model Auto-Generator, a tool developed by the authors, which integrates the generated layout with OpenSees for numerical analysis. The synthetic building dataset comprising tens of thousands of FE models used in this study was generated through this fully automated pipeline. b, Representative visualizations of diverse building structures within the synthetic building dataset. These samples illustrate the architectural and structural variability captured by the generative pipeline, encompassing a wide range of typologies.

Extended Data Fig. 5 | Derivation of a simplified model from an FE model.

The figure illustrates the process of deriving a simplified model from an FE model of a building structure. A unit force was applied sequentially to each floor, and the resulting displacements were computed. The stiffness matrix $\boldsymbol{K}$, constrained to be symmetric, was then obtained from these force and displacement pairs by the least squares method. The mass matrix $\boldsymbol{M}$ was directly extracted from the FE model, with floor masses concentrated at discrete nodes and the mass of structural components distributed to the adjacent upper and lower floors. Together, $\boldsymbol{K}$ and $\boldsymbol{M}$ define the simplified model used for structural analysis.
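A symmetric least squares fit of the stiffness matrix can be sketched as follows. The sketch assumes that the displacement vectors from the unit-force load cases are collected as the columns of a matrix `U`, so that the stiffness matrix should satisfy `K @ U = I`, and it enforces symmetry by solving only for the upper-triangular entries. The function name and this particular formulation are illustrative assumptions; the source does not detail the solver.

```python
import numpy as np

def fit_symmetric_stiffness(U: np.ndarray) -> np.ndarray:
    """Least squares fit of a symmetric K such that K @ U is close to I.

    U[:, j] is the floor displacement vector under a unit force at floor j.
    """
    n = U.shape[0]
    rows, cols = np.triu_indices(n)
    A = np.zeros((n * n, rows.size))
    for c, (p, q) in enumerate(zip(rows, cols)):
        # Unknown c is K[p, q] = K[q, p]; it enters rows p and q of K @ U.
        A[p * n:(p + 1) * n, c] += U[q, :]
        if p != q:
            A[q * n:(q + 1) * n, c] += U[p, :]
    rhs = np.eye(n).ravel()              # unit forces, one load case per column
    theta, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    K = np.zeros((n, n))
    K[rows, cols] = theta
    return K + np.triu(K, 1).T           # mirror the upper triangle

# Illustrative 3-story shear building; U plays the role of the FE displacements.
K_true = np.array([[5.0, -2.0, 0.0], [-2.0, 3.0, -1.0], [0.0, -1.0, 1.0]])
U = np.linalg.inv(K_true)
K_fit = fit_symmetric_stiffness(U)
```

With noise-free displacements the fit recovers the original stiffness matrix; with displacements from a nonlinear or three-dimensional FE model, the least squares solution gives the closest symmetric approximation under this formulation.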
