1. Introduction
With the advancement of Industry 4.0 and the rapid development of intelligent manufacturing technologies, the manufacturing industry is undergoing a transformation in production modes. As an advanced technology, Metal Additive Manufacturing (AM) is becoming a vital component of advanced manufacturing systems. Due to its ability to form complex geometric structures without molds and its high material utilization efficiency, it has emerged as a key technology in the manufacturing field[1]. Among various metal AM technologies, Wire-Arc Directed Energy Deposition (WA-DED), which is based on the principle of DED, utilizes the arc as a high-energy heat source to melt metal wire, depositing molten metal layer by layer under computer numerical control. This capability to construct complex geometries without tooling, combined with low manufacturing costs and high deposition efficiency[2–4], has demonstrated broad industrial application prospects in high-end fields such as aerospace, marine engineering, and heavy mold manufacturing [5, 6].
Despite these advantages, WA-DED is inherently a complex thermal-mechanical process involving non-equilibrium rapid solidification and cyclic thermal loading[7–9]. Extreme temperature gradients induce incompatible thermal expansion and contraction within the material, leading to significant residual stress and distortion within the deposited layers[10]. From a mechanical perspective, this macroscopic distortion behavior is essentially the dynamic superposition of elastic, plastic, and thermal strains across spatiotemporal scales, where the evolution of internal stress is determined by constitutive relations[11]. Specifically, thermal strain undergoes transient reversible changes with temperature fluctuations, whereas irreversible plastic strain accumulates layer by layer along the deposition path. This complex strain interaction mechanism ultimately results in residual stress, permanent distortion, or even cracks in components[12–14]. Therefore, the ability to analyze and predict strain and stress fields in a decoupled manner forms the physical foundation for high-fidelity digital twins, while also being a prerequisite for accurately characterizing the component’s complex mechanical state. Precisely predicting and decoupling each strain component is required to overcome single-physics analysis limitations. This allows for an accurate thermodynamic characterization of the deposition process, which is critical for controlling final component dimensions and mechanical properties.
Currently, methods for addressing the complex thermal-mechanical evolution during AM generally fall into two categories: numerical methods and data-driven methods. Numerical methods are based on continuum mechanics theory, solving the Partial Differential Equations (PDEs) governing heat transfer, fluid flow, and solid mechanics through discretization to determine the distribution of multi-physics fields within the spatiotemporal domain under specific Boundary Conditions (BCs) [15, 16]. In the field of AM, the most commonly used numerical methods are the Finite Element Method (FEM) [17] and Computational Fluid Dynamics (CFD) [18]. FEM discretizes the domain into a mesh and simulates the thermodynamic evolution of the AM process to solve for the stress, strain, and distortion fields [19]. Unlike FEM, which focuses on macroscopic mechanical behavior, CFD emphasizes the fine-scale simulation of fluid behavior, and is mainly used for analyzing metal flow, heat transfer, and solidification within the molten pool [20]. However, a complete, high-precision thermal-mechanical simulation often requires days to complete [21], making it difficult to meet the needs of real-time monitoring and online process optimization. Consequently, researchers are seeking more efficient surrogates, such as data-driven methods [22].
Unlike numerical simulation methods, data-driven methods bypass the tedious solution of physical equations, directly mining the non-linear mapping relationship between inputs and outputs from massive experimental or simulation data, effectively resolving the issues of high computational cost and low time efficiency associated with traditional FEM[23, 24]. In early studies, Farias et al. [25] utilized Artificial Neural Networks (ANN) to establish a rapid mapping between process parameters and interlayer temperatures. However, ANNs often ignore the spatial topological structure of data, making it difficult to characterize full-field distribution features. To address this, Xia et al. [26] utilized Convolutional Neural Networks (CNN), which possess strong spatial feature extraction capabilities, to achieve real-time state diagnosis based on molten pool images. Although the CNN performs well in the spatial dimension, WA-DED is essentially a dynamic process with strong historical dependence. Addressing this characteristic, Nalajam et al. [27] employed Recurrent Neural Networks (RNN) to capture the temporal correlation of thermal cycles, achieving dynamic tracking of temperature history at key points.
In addition, Scientific Machine Learning (SciML) methods, represented by Physics-Informed Neural Networks (PINN), Neural Ordinary Differential Equations (Neural ODE), and Deep Operator Networks (DeepONet), have in recent years provided new methodologies for problems involving data scarcity, high dimensionality, complex geometries, inverse solutions, and multi-physics field decoupling. The three paradigms differ fundamentally in their underlying mechanisms: PINN and Neural ODE focus on instance solutions of equations [28, 29], whereas DeepONet performs operator learning [30]. When facing multi-scale, strongly coupled problems such as WA-DED, PINN and Neural ODE reveal limitations: the governing equations of WA-DED exhibit stiffness, which makes the loss-function optimization of PINN difficult [31] and renders the time-step integration of Neural ODE extremely slow or even divergent [32]. In contrast, by learning infinite-dimensional operators, DeepONet can establish an end-to-end mapping from the complete thermal history to the mechanical response, avoiding the iterative solution of differential equations required by traditional methods [30] and effectively capturing the state evolution on which yield criteria depend. However, existing DeepONet architectures mostly employ a single parameter-sharing network for multi-output tasks [33], ignoring the differences in evolution mechanisms between physical fields. This oversight easily induces feature interference, limiting the model's precision in characterizing complex multi-physics fields.
To this end, this paper introduces a physics-informed operator learning (PIOL) surrogate model oriented towards real-time digital twins. The model adopts a differentiated operator learning architecture based on the evolution characteristics of the WA-DED multi-physics fields: a shared trunk network learns universal basis functions of the geometric domain, while the branch networks are specialized, with a lightweight CNN-MLP encoder for the transient, reversible elastic field and a CNN-LSTM branch for the plastic and stress fields with strong historical dependencies. Through this hybrid architecture, the model ensures full-field prediction accuracy while achieving optimal computational adaptation to the different physical constitutive behaviors. The PIOL framework therefore has the following key features:
1. Field-specific model architecture: heterogeneous branch networks are designed to adapt to the constitutive behavior of each target, distinguishing transient reversible responses from history-dependent cumulative effects.
2. Physics constraints: a constitutive regularization term based on thermal expansion theory is explicitly embedded in the learning objective, ensuring the predictions strictly comply with thermodynamic laws.
3. Decoupling of the multi-output task into single-output problems: the complex multi-physics task is transformed into independent learning pathways in parameter space, effectively mitigating feature interference between different physical quantities.
4. Synergistic enhancement: a shared trunk network extracts universal geometric basis functions, enabling plastic strain to serve as auxiliary supervision that enhances residual stress prediction.
The remainder of this paper is organized as follows: Section 2 details the methodology and the established benchmark. Section 3 describes the experimental calibration, dataset construction, and the setup for model comparison. The analysis in Section 4 focuses on prediction accuracy, multi-output synergy, and computational efficiency. Finally, Section 5 presents the conclusions.
2. Methodology
2.1 System overview
This paper establishes a systematic benchmark organized into three main modules: "data generation", "decoupling the physics field", and "PIOL scheme summary", as shown in Fig. 1. First, a WA-DED experimental platform is established, with a thermal camera collecting the transient temperature history during the deposition process. Second, the experimental data are used to rigorously calibrate the heat source parameters and BCs of the offline FEM. Given the extreme difficulty of in-situ stress field measurement, and considering that classical thermal-mechanical FEM models rigorously calibrated against thermal trends have been widely verified as reliable, this study establishes the calibrated FEM model as the digital ground truth. Finally, unlike traditional end-to-end learning, the established benchmark provides a feasible PIOL scheme that explicitly decouples the learning pathways for transient and history-dependent physics.
2.2 Finite element model
To construct a high-fidelity dataset for verifying the multi-physics decoupling prediction model, this study used ANSYS software to perform thermal-mechanical coupled simulation of the WA-DED process, followed by extraction and preprocessing of the relevant data. The heat source employs the Goldak double-ellipsoid model [34]:

q_f(x, y, z) = \frac{6\sqrt{3}\, f_f\, Q}{a_f\, b\, c\, \pi\sqrt{\pi}} \exp\left(-\frac{3x^2}{a_f^2} - \frac{3y^2}{b^2} - \frac{3z^2}{c^2}\right)

with an analogous expression for the rear ellipsoid using the energy fraction f_r and semi-axis a_r, where Q is the effective heat input and a_f, a_r, b, c are the shape parameters. These parameters are calibrated by comparing simulated molten pool dimensions with experimental measurements. The substrate and filler wire are Q235B low-carbon steel and ER70S-6 welding wire, respectively, with material properties set as temperature-dependent non-linear parameters. Figure 2(a) shows the established FEM model of the thin-wall structure and the meshing strategy. The model includes a substrate and a deposited wall consisting of 20 layers with a deposition length of 100 mm. To ensure physical fidelity, the specific geometric dimensions (i.e., width and layer height) of the simulated wall were configured to be consistent with the experimental measurements corresponding to the process parameters of each case (as detailed in Section 3.2). The simulation employs element birth-and-death techniques to model the layer-by-layer material deposition, achieving dynamic material addition by sequentially activating elements and applying the corresponding thermal loads. Crucially, the FEM model outputs not only the temperature field but also high-fidelity full-field data, including elastic strain, plastic strain, thermal strain, and Von Mises stress derived from continuum mechanics theory, as shown in Fig. 2(c). These physical fields, solved from the governing equations, constitute the high-quality labels for surrogate model training.
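As an illustration of the heat-source description above, the following NumPy sketch evaluates the double-ellipsoid flux. The function name and all default parameter values (power, energy fractions, semi-axes) are placeholders for illustration, not the calibrated values used in this work.

```python
import numpy as np

def goldak_flux(x, y, z, Q=2000.0, ff=0.6, fr=1.4,
                af=2.0, ar=4.0, b=3.0, c=3.0):
    """Volumetric heat flux of the Goldak double-ellipsoid model.

    Q: effective arc power; ff/fr: front/rear energy fractions (ff + fr = 2);
    af, ar, b, c: shape parameters (mm). All defaults are placeholders.
    """
    front = np.asarray(x) >= 0.0
    a = np.where(front, af, ar)   # front ellipsoid ahead of the arc, rear behind
    f = np.where(front, ff, fr)
    coeff = 6.0 * np.sqrt(3.0) * f * Q / (a * b * c * np.pi * np.sqrt(np.pi))
    return coeff * np.exp(-3.0 * (np.asarray(x)**2 / a**2
                                  + np.asarray(y)**2 / b**2
                                  + np.asarray(z)**2 / c**2))
```

The flux peaks at the arc center and decays Gaussianly along each semi-axis, which is what makes the shape parameters calibratable against measured molten pool dimensions.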
2.3 Theoretical analysis of multi-physics field decomposition
Within the framework of continuum mechanics for metal AM, assuming the material undergoes small distortions, the total strain \varepsilon can be decoupled into three independent physical components according to the additive decomposition theorem [35]:

\varepsilon = \varepsilon^{e} + \varepsilon^{p} + \varepsilon^{th}

where \varepsilon^{e}, \varepsilon^{p}, and \varepsilon^{th} represent the elastic, plastic, and thermal strain, respectively. Although these three components are controlled by the same set of spatiotemporal BCs and thermal history constraints, they follow distinctly different physical evolution laws. This intrinsic difference in physical mechanisms constitutes the physical basis for the differentiated hybrid branch architecture adopted in this study: decoupled learning of the different physical constitutive behaviors is achieved in parameter space through parallel network channels.
Elastic strain and thermal strain manifest primarily as transient, reversible responses to the current thermodynamic input. Specifically, thermal strain follows [35]:

\varepsilon^{th} = \alpha(T)\,(T - T_{ref})\,\mathbf{I}

where \alpha(T) is the temperature-dependent thermal expansion coefficient, T_{ref} is the reference temperature, and \mathbf{I} is the second-order identity tensor. Elastic strain is primarily determined by the current stress state and temperature-dependent material properties, exhibiting rapid reversible changes with temperature fluctuations. This implies that these two physical quantities are mainly determined by the current temperature field and geometric constraints, and possess weak historical memory.
In sharp contrast, plastic strain and the stress field exhibit strong historical dependence and irreversible cumulative evolution. The constitutive relation between the Cauchy stress tensor \sigma and the elastic strain tensor \varepsilon^{e} can be expressed as [36]:

\sigma = \mathbf{C}(T) : \varepsilon^{e} = \frac{E\nu}{(1+\nu)(1-2\nu)}\,\mathrm{tr}(\varepsilon^{e})\,\mathbf{I} + \frac{E}{1+\nu}\,\varepsilon^{e}

where \mathbf{C} is the fourth-order elasticity tensor, E and \nu are the temperature-dependent Young's modulus and Poisson's ratio, respectively, and \mathbf{I} is the second-order identity tensor. Although this relation links stress to elastic strain, in the complex reciprocating thermal cycles of WA-DED the macroscopic residual stress field is essentially the result of the accumulation of incompatible plastic distortion over time and space. Plastic strain is an irreversible distortion produced after material yielding; its value is not defined solely by the current temperature or stress state, but is the time-domain integral of the plastic strain rate over the entire distortion history:

\varepsilon^{p}(t) = \int_{0}^{t} \dot{\varepsilon}^{p}(\tau)\, d\tau
2.4 Physics field prediction via single-output models
Before constructing the WA-DED thermal-mechanical coupling surrogate model, it is essential to clarify the machine learning theories supporting spatiotemporal physical field prediction: the operator network for solving infinite-dimensional function space mapping problems, and neural ordinary differential equations for modeling continuous time evolution processes.
2.4.1 Neural operator learning
To achieve efficient prediction, this study adopts the neural operator framework [30]. Its goal is to learn an operator G that maps an input function u (representing the temperature history) to an output function G(u)(y), where y denotes the spatiotemporal coordinates of the output. To realize this mapping, the branch network encodes the input function u through a limited set of sensor points, generating a latent vector summarizing the global behavior of the input field, while the trunk network processes the query coordinates y, outputting a feature vector representing the local response basis at that location. The final prediction is obtained as the inner product of these two feature vectors:

G(u)(y) \approx \sum_{k=1}^{p} b_k(u)\, t_k(y)

Even when the geometric structure is relatively fixed, the thermal history u experienced by points at different spatial locations in a WA-DED component varies significantly. The core advantage of DeepONet lies in learning the universal constitutive mapping operator from thermal history to mechanical response, allowing the model to capture complex non-linear physical laws and perform inference at arbitrary continuous spatiotemporal coordinates.
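A minimal sketch of the branch-trunk inner product described above, assuming plain MLP sub-networks and illustrative layer sizes (the actual branches in this work are CNN-based):

```python
import torch
import torch.nn as nn

class TinyDeepONet(nn.Module):
    """Minimal DeepONet: the branch encodes the sampled input function u,
    the trunk encodes the query coordinate y, and the prediction is their
    inner product. Sizes are illustrative, not the paper's."""

    def __init__(self, n_sensors=100, latent=64):
        super().__init__()
        self.branch = nn.Sequential(nn.Linear(n_sensors, 128), nn.Tanh(),
                                    nn.Linear(128, latent))
        self.trunk = nn.Sequential(nn.Linear(4, 128), nn.Tanh(),
                                   nn.Linear(128, latent))

    def forward(self, u, y):
        # u: (batch, n_sensors) thermal history sampled at sensor points
        # y: (batch, 4) spatiotemporal query (x, y, z, t)
        b = self.branch(u)   # basis coefficients b_k(u)
        t = self.trunk(y)    # basis functions t_k(y)
        return (b * t).sum(dim=-1, keepdim=True)  # G(u)(y) ~ sum_k b_k t_k

model = TinyDeepONet()
out = model(torch.randn(8, 100), torch.randn(8, 4))
```

Because the trunk takes continuous coordinates, the trained operator can be queried at arbitrary spatiotemporal locations, not just at the mesh nodes seen during training.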
2.4.2 Neural ODE
Neural ODEs provide a continuous-system perspective for analyzing the sequential evolution dynamics of WA-DED [29]. Unlike traditional RNNs, which discretize the time axis into fixed sequence steps, a Neural ODE models the trajectory of the hidden state h(t) as a parameterized initial value problem:

\frac{dh(t)}{dt} = f_{\theta}(h(t), t), \quad h(t_0) = h_0

The state h(t) at any time t can then be obtained by integrating from the initial state h_0 with a numerical solver. However, thermal history data in the WA-DED process exhibit numerical stiffness, characterized by severe thermal shocks within extremely short periods and slow cooling over long time scales. This multi-scale temporal characteristic forces the ODE solver into difficult trade-offs between extremely small step sizes and numerical stability, resulting in high computational cost. This dilemma also constitutes the core motivation for the subsequent chapters of this study, which examine the applicability boundaries of discrete gating mechanisms and continuous integration mechanisms for modeling stiff systems.
2.4.3 Baseline single-output model architectures
To systematically evaluate the effectiveness of temporal feature encoding mechanisms, five baseline models are constructed covering traditional ML and operator learning architectures, as shown in Fig. 3. These models are strictly divided into non-operator baselines and DeepONet operator variants based on their different handling of physical field mapping mechanisms, aiming to investigate optimal feature interaction methods.
The first category of architectures employs traditional feature concatenation and direct decoding strategies (Fig. 3a-b), characterized by the lack of an explicit branch-trunk decoupling structure. The CNN-LSTM model (Fig. 3a) follows the classic sequence-regression paradigm: a multi-scale CNN encoder first extracts high-dimensional features from the thermal history, and LSTM layers then capture discrete time dependencies. The hidden state vector output by the LSTM is directly concatenated with the spatiotemporal coordinates and fed into an MLP decoder to regress physical field values end to end. As the continuous-time counterpart, the CNN-Neural ODE architecture (Fig. 3b) retains the front-end CNN feature extractor but replaces the LSTM module with a neural differential equation solver. It models the continuous evolution of latent states by numerically integrating dh(t)/dt = f_{\theta}(h(t), t), exploring the performance of continuous dynamical systems on non-steady thermal history data.
The second category of architectures strictly follows the DeepONet branch-trunk inner-product structure (Fig. 3c-e). These models share an MLP trunk network for spatiotemporal feature extraction, but their branch network designs differ in order to test different encoding hypotheses. As the basic control group, the CNN-MLP DeepONet (Fig. 3e) uses a branch network composed solely of fully connected layers, treating the thermal history as a static vector and ignoring the local correlation of time-series data. The CNN-Neural ODE DeepONet (Fig. 3d) introduces continuous dynamics into operator learning, using an ODE module in the branch network to generate dynamically changing basis-function coefficients. The core architecture of this study, the CNN-LSTM DeepONet (Fig. 3c), integrates the feature-compression capability of the CNN and the long-term memory of the LSTM within the branch network. The dot product of the branch latent basis vector and the trunk latent coordinates then yields an efficient operator mapping for the complex thermal-mechanical process of WA-DED.
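A sketch of the CNN-LSTM branch encoder described above; channel counts, kernel sizes, and the latent dimension are illustrative choices, not the paper's configuration:

```python
import torch
import torch.nn as nn

class HistoryBranch(nn.Module):
    """CNN-LSTM branch sketch: a 1-D CNN compresses the thermal history,
    then an LSTM accumulates it into a latent basis-coefficient vector
    suitable for the DeepONet dot product."""
    def __init__(self, latent=64):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=5, stride=2, padding=2), nn.ReLU())
        self.lstm = nn.LSTM(input_size=32, hidden_size=latent, batch_first=True)

    def forward(self, u):
        # u: (batch, T) thermal history at one node
        z = self.cnn(u.unsqueeze(1))   # (batch, 32, T/4) compressed sequence
        z = z.transpose(1, 2)          # (batch, T/4, 32) for the LSTM
        _, (h_n, _) = self.lstm(z)
        return h_n[-1]                 # (batch, latent) final hidden state

branch = HistoryBranch()
coeffs = branch(torch.randn(8, 1500))  # 1500 time steps, as in the dataset
```

The final LSTM hidden state plays the role of the basis coefficients b_k(u), so the gating mechanism can act as a learned surrogate for the time-domain accumulation in the plastic constitutive law.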
2.5 Multi-output prediction via physics-informed DeepONet
As the core of the established PIOL framework in this study, a physics-informed hybrid multi-output DeepONet architecture has been developed to predict the spatiotemporal evolution of multi-physics fields, as shown in Fig. 4. The model employs heterogeneous temporal feature encoders specifically tailored to the constitutive characteristics of the physical quantities. Specifically, for the transient, reversible elastic strain, a lightweight CNN-MLP branch is adopted to efficiently encode the instantaneous thermal state lacking historical memory. Conversely, for plastic strain and stress, which exhibit strong historical cumulative effects and irreversible evolution characteristics, the model retains the CNN-LSTM architecture as the history-dependent branch. The gating mechanism of the LSTM is used to simulate the time-varying integration process, effectively capturing plastic hardening and residual stress accumulation behaviors throughout the thermal cycle history. This architecture is verified with the results of section 2.4 (as shown in section 4.1–4.2).
Considering that all mechanical field quantities are subject to the same geometric entity and thermal history constraints, the model introduces a “shared trunk” network strategy to extract universal geometric spatiotemporal modes. Through multi-output collaborative dot-product operations, the unique evolution mechanism of each physical field is explicitly decoupled while common spatial distribution features are shared. This architecture is fundamentally designed to enhance prediction accuracy and explainability regarding material thermal-mechanical behavior under variable temperature fields.
To systematically analyze the synergistic mechanisms, four typical task configurations were designed for comparative experiments, as illustrated in Fig. 5. These configurations, which include the “all-branch” synergistic model and three dual-task subsets, maintain the fixed network topology while varying the output task combinations. The primary objective is to explore how the shared spatial basis functions act as bridges between different physical fields, and to evaluate how the joint training of specific physical quantities affects the final prediction performance through inductive bias.
The rationale behind this experimental design lies in the common basis hypothesis implemented by the shared trunk network. Although the differential equations governing elastic strain, plastic strain, and stress differ substantially—where the former is controlled by the current thermal state and the latter involves complex historical accumulation—they are all constrained by the same geometric entity and thermal BCs. Therefore, the shared trunk utilizes geometric homology to introduce a physical constraint, based on the hypothesis that distinct physical fields, despite having different weight coefficients, are all linearly expanded within the same basis function space reflecting universal geometric characteristics.
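The shared-trunk, multi-branch idea can be sketched as follows. For brevity all branches here are MLPs; the actual model uses a CNN-MLP branch for the elastic field and CNN-LSTM branches for the history-dependent fields, and all sizes are illustrative:

```python
import torch
import torch.nn as nn

class MultiOutputDeepONet(nn.Module):
    """Shared-trunk sketch: one trunk supplies common spatial basis
    functions; each physical field owns its own branch, so the fields
    share the basis but keep independent coefficient pathways."""
    def __init__(self, n_sensors=100, latent=64,
                 fields=("elastic", "plastic", "stress")):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(4, 128), nn.Tanh(),
                                   nn.Linear(128, latent))
        self.branches = nn.ModuleDict({
            f: nn.Sequential(nn.Linear(n_sensors, 128), nn.Tanh(),
                             nn.Linear(128, latent))
            for f in fields})

    def forward(self, u, y):
        t = self.trunk(y)  # shared basis functions at each query point
        return {f: (br(u) * t).sum(-1) for f, br in self.branches.items()}

model = MultiOutputDeepONet()
preds = model(torch.randn(8, 100), torch.randn(8, 4))
```

Dropping entries from `fields` reproduces the dual-task subsets used in the combination experiments, since only the branch dictionary changes while the trunk topology stays fixed.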
2.6 Model training
The models are trained using a supervised learning paradigm, with the objective of achieving high-precision regression from temperature history to multi-physics fields. To balance the learning of the multi-physics components and effectively apply constitutive constraints, the total objective function is defined as:

\mathcal{L} = \mathcal{L}_{data} + \lambda\, \mathcal{L}_{phys}

where the data-driven loss \mathcal{L}_{data} covers all collaboratively predicted target variables:

\mathcal{L}_{data} = \sum_{i} w_i \,\| \hat{y}_i - y_i \|^2, \quad i \in \{\varepsilon^{e}, \varepsilon^{p}, \varepsilon^{th}, \sigma_{vM}\}

The weight coefficients w_i are dynamically adjusted according to the magnitude of each physical quantity. The constitutive constraint term \mathcal{L}_{phys} is constructed from Eq. 5 (the thermal expansion constitutive relationship), forcing the network-predicted thermal strain to adhere strictly to the theoretical value determined by the thermal expansion coefficient and the temperature variation. This constraint mechanism embeds the material's thermophysical properties as prior knowledge in the neural network, enhancing the model's generalization robustness under sparse training data.
Model training was performed under the PyTorch deep learning framework, using the Adam optimizer to adapt to the non-stationary objective function, with the initial learning rate set to
. All computational tasks were completed on a personal workstation equipped with an Intel Core i9 processor, 32GB RAM, and a single NVIDIA GeForce RTX 4060 (8GB) graphics card. Thanks to memory mapping techniques and an efficient parallel computing architecture, model training on the large-scale dataset containing millions of samples maintained an efficient convergence speed.
To strictly evaluate the surrogate model's generalization capability under unseen operating conditions, this study adopted a leave-one-out dataset partition strategy, rather than traditional random shuffling. The experimental dataset contains 5 typical cases with distinct thermal histories. Each case contains 1,071 spatial nodes, with each node recording an evolution history of 1,500 time steps. In the experiment, 4 cases (approximately 3.2 million samples) were used as the training set to cover diverse thermodynamic characteristics; the remaining 1 case (approximately 0.8 million samples) was strictly isolated as an independent test set. This partition method is highly challenging as it requires the model not merely to memorize data distributions but to truly learn the physical operator laws governing strain evolution, thereby achieving high-precision extrapolation prediction under entirely new operating conditions.
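The leave-one-case-out partition can be sketched as boolean masks over per-sample case identifiers; the array sizes here are tiny stand-ins for the roughly 4 million samples described above:

```python
import numpy as np

def leave_one_case_out(case_ids, test_case):
    """Boolean masks for a case-wise split: every sample whose case id
    equals test_case is held out, with no random shuffling across cases,
    so the test case's thermal history is never seen during training."""
    case_ids = np.asarray(case_ids)
    test_mask = case_ids == test_case
    return ~test_mask, test_mask

# 5 cases in the paper (~1071 nodes x 1500 steps each); 10 samples per case here
ids = np.repeat([1, 2, 3, 4, 5], 10)
train_mask, test_mask = leave_one_case_out(ids, test_case=5)
```

Masking by case id rather than by random index is what forces the model to extrapolate to an entirely unseen operating condition instead of interpolating within memorized distributions.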
3. Experimental setup
To ensure the FEM model used as digital ground truth possesses sufficient physical credibility, this study employs a hybrid strategy of experimental calibration and numerical generation. First, a physical experimental platform for WA-DED is established, utilizing collected in-situ thermal-mechanical data to rigorously calibrate the parameters of the FEM model described in Section 2.2. Subsequently, multi-physics training data covering 5 operating conditions is generated based on the validated numerical model.
3.1 Experimental settings and model calibration
To obtain ground-truth data for calibrating the numerical model, an integrated robotic WA-DED system was constructed. The system uses an ABB IRB 2600 six-axis industrial robot as the motion execution unit, coordinated with a Cold Metal Transfer (CMT) welding power source for material deposition. A Q235B low-carbon steel plate served as the substrate, with ER70S-6 solid wire of 1.2 mm diameter as the filler material. To prevent high-temperature oxidation and ensure arc stability, a shielding gas mixture of 80% Ar + 20% CO2 with a flow rate of 17 L/min was used. For data acquisition, the platform was equipped with a high-resolution thermal camera (FLIR A655sc) recording the surface temperature field of the molten pool and heat-affected zone in real time at a sampling rate of 6.25 Hz. The experimental system is shown in Fig. 6.
Based on this experimental platform, single-pass multi-layer thin-walled samples were prepared, and the experimental measurements were used to calibrate the key heat source parameters of the offline FEM model. Since the shape parameters (a_f, a_r, b, c) of the Goldak double-ellipsoid heat source model directly determine the spatial distribution of heat input, geometric morphology comparison was performed. By fine-tuning these parameters, good agreement was achieved between the simulated and experimental molten pool dimensions. Furthermore, thermocouple measurements at key characteristic points on the substrate and infrared thermal imaging data were extracted and aligned with the time-temperature curves of the corresponding nodes in the simulation model.
3.2 Dataset construction strategy
To endow the deep learning model with generalization robustness under different energy inputs, a single set of process parameters is insufficient to cover the possible thermal history space. Based on the strictly experimentally validated FEM model mentioned above, this study designed and calculated a library of 5 simulation cases with distinct thermal histories. These cases were generated by systematically varying the Wire Feed Speed (WFS) and Travel Speed (TS), resulting in distinct layer geometries (i.e., width and height) for each case. The coupling of these two parameters directly determines the deposition amount per unit length and the degree of heat accumulation. Specifically, a higher WFS is typically accompanied by greater welding current and heat input. This results in a wider molten pool and deeper heat-affected zone, thereby inducing more severe thermal softening effects and high-temperature plastic flow. Meanwhile, changes in TS directly alter the cooling rate, consequently affecting the accumulation mode of residual stress. The parameter settings for these 5 cases cover a typical process window from low to high heat input, with specific parameters detailed in Table 1.
Table 1
Process parameters and corresponding bead geometries for the thermal-mechanical FEM of WA-DED
| Case | WFS (m/min) | TS (m/min) | Width (mm) | Height (mm) | Heat input (J/cm) |
|--------|------|------|-----|-----|---------|
| Case 1 | 6 | 0.4 | 8 | 3 | 2266.87 |
| Case 2 | 6 | 0.48 | 6 | 2 | 1889.06 |
| Case 3 | 5 | 0.4 | 8 | 2.5 | 1881.00 |
| Case 4 | 5 | 0.45 | 6 | 2.5 | 1672.00 |
| Case 5 | 5 | 0.48 | 5 | 2 | 1567.50 |
The final constructed training dataset contains 4 training cases and 1 completely independent test case, totaling approximately 4 million spatiotemporal sample points. For each sample point, the dataset extracts not only the input temperature history vector u and the spatiotemporal coordinates y, but also the four target components explicitly decoupled and computed by the finite element constitutive equations: elastic strain \varepsilon^{e}, plastic strain \varepsilon^{p}, thermal strain \varepsilon^{th}, and Von Mises stress \sigma_{vM}.
This refined data structure, containing full-field stress and decoupled strains, is the foundation of the collaborative prediction method introduced in this study. It allows DeepONet to break through the black-box limitations of traditional end-to-end models, utilizing the shared basis function mechanism to capture implicit constitutive correlations between different physical quantities through supervised learning. In particular, using the unseen 5th case as an independent test set forces the model to learn the underlying physical operator laws rather than simply memorizing data distributions, thereby validating its generalized prediction capability under unknown process parameters.
4. Results and discussion
The feasibility of the benchmark is evaluated by comparing its resulting PIOL architecture in the synergistic prediction of WA-DED multi-physics fields. To systematically verify the effectiveness of the framework and reveal its underlying physical mechanisms, this chapter first establishes the superiority of the differentiated operator learning strategies through benchmarking against single time-series models and operator variants: specifically, verifying the necessity and effectiveness of a CNN-MLP branch for the transient reversible elastic field and a CNN-LSTM branch for the history-dependent plastic and stress fields. Furthermore, multi-output combination experiments analyze how the shared trunk mechanism leverages simple physical quantities to assist the prediction of complex non-conservative fields, revealing the synergistic enhancement mechanism among physical fields. Finally, transient evolution curves at key characteristic points visually demonstrate the model's ability to capture peak responses and residual accumulation during complex thermal cycles.
4.1 Quantitative analysis of single-output model performances
To establish the optimal operator learning architecture, this study conducted comparative evaluations of five baseline models in independent single-task settings (listed in Table 2). Prediction accuracy was quantified using Mean Absolute Error (MAE) and Root Mean Square Error (RMSE). Experimental results reveal the decisive influence of feature extraction mechanisms and time-series modeling strategies on prediction accuracy of complex thermal-mechanical fields.
Table 2
Comparative evaluation of prediction performance in terms of MAE and RMSE across five baseline models under independent single-task settings.
| Model | Elastic Strain MAE (‰) | Elastic Strain RMSE (‰) | Plastic Strain MAE (‰) | Plastic Strain RMSE (‰) | Thermal Strain MAE (‰) | Thermal Strain RMSE (‰) | Von Mises Stress MAE (MPa) | Von Mises Stress RMSE (MPa) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| CNN-LSTM | 0.192 | 0.306 | 0.248 | 0.389 | 0.316 | 0.388 | 13.646 | 18.364 |
| CNN-Neural ODE | 0.186 | 0.313 | 0.283 | 0.447 | 0.433 | 0.512 | 14.944 | 21.505 |
| CNN-LSTM DeepONet | 0.201 | 0.331 | 0.245 | 0.359 | 0.421 | 0.500 | 12.766 | 17.995 |
| CNN-Neural ODE DeepONet | 0.218 | 0.340 | 0.389 | 0.582 | 0.278 | 0.371 | 18.080 | 24.906 |
| CNN-MLP DeepONet | 0.171 | 0.294 | 0.264 | 0.359 | 0.363 | 0.433 | 16.807 | 23.167 |
For elastic strain, which is primarily governed by the current thermodynamic state and exhibits transient reversible characteristics, the structurally simpler CNN-MLP DeepONet achieved the lowest error among all models. With an MAE of 0.171‰ and an RMSE of 0.294‰, it outperformed the more complex LSTM variants. This indicates that for physical quantities lacking significant historical dependence, fully connected networks possess sufficient non-linear mapping capabilities and effectively mitigate the risk of overfitting potentially introduced by complex temporal modules.
In the plastic strain prediction task, the performance of the CNN-MLP DeepONet declined, whereas the CNN-LSTM DeepONet, incorporating Long Short-Term Memory mechanisms, demonstrated superior performance. By achieving the lowest RMSE of 0.359‰ and MAE of 0.245‰ across all categories, it outperformed the traditional end-to-end CNN-LSTM model. This demonstrates the necessity of gating mechanisms in capturing material non-linear hardening and historical cumulative effects.
Regarding the prediction of Von Mises stress, which comprehensively reflects thermo-mechanical coupling effects, the CNN-LSTM DeepONet exhibited the highest predictive accuracy. Its MAE of 12.766 MPa and RMSE of 17.995 MPa were both superior to those of the traditional end-to-end CNN-LSTM model, which recorded an MAE of 13.646 MPa and an RMSE of 18.364 MPa, as well as the Neural ODE and MLP-based baselines. This result highlights the core advantage of the LSTM-integrated operator architecture in capturing complex plastic evolution and residual stress characteristics.
It is worth noting that the inclusion of thermal strain prediction in this study aims to verify the feasibility of decoupling spatiotemporal coordinates from temperature features within the model. Given that the physical nature of thermal strain is equivalent to fitting the thermal expansion equations of materials under varying temperature fields, which represents a strongly deterministic physical law, it will not be the primary focus of the subsequent in-depth discussion on decoupling mechanisms.
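Since thermal strain reduces to the thermal-expansion law, its pointwise value follows directly from the temperature field. A minimal sketch, assuming a constant (temperature-independent) expansion coefficient; the coefficient and reference temperature below are illustrative values, not calibrated material properties:

```python
def thermal_strain(T, T_ref=20.0, alpha=1.2e-5):
    """Thermal strain eps_th = alpha * (T - T_ref).

    alpha: linear thermal expansion coefficient (1/K); illustrative value.
    T_ref: stress-free reference temperature (degC); illustrative value.
    """
    return alpha * (T - T_ref)

# At 520 degC relative to a 20 degC reference:
eps = thermal_strain(520.0)  # 1.2e-5 * 500 = 6e-3, i.e. 6 permille
```

In a real WA-DED material model alpha itself varies with temperature, which is precisely the deterministic relation the network is asked to fit.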
4.2 Qualitative analysis of single-output model performances
To further dissect the specific performance of each model in capturing the dynamic evolution of complex physical fields in WA-DED, this section conducts visualization analysis from both time-domain evolution and spatial distribution dimensions.
4.2.1 Time-domain evolution characteristics analysis
Figures 7–10 display the prediction curves of different models at four representative node locations (Node 200, 400, 600, 800). Based on the mesh topology of 51 nodes per layer, these nodes correspond spatially to the 4th, 8th, 12th, and 16th layers along the build direction, respectively. This selection evaluates model robustness across varying thermal history lengths: Node 200 (4th layer) in the bottom region undergoes the most extensive cyclic thermal shocks and plastic accumulation, whereas Node 800 (16th layer) in the top region represents newly deposited material with a shorter history.
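The node-to-layer correspondence above follows from the 51-nodes-per-layer mesh topology; a sketch of the mapping, assuming 1-based node numbering:

```python
import math

NODES_PER_LAYER = 51  # mesh topology stated in the text

def layer_of(node_id):
    """Layer index along the build direction for a 1-based node id."""
    return math.ceil(node_id / NODES_PER_LAYER)

print([layer_of(n) for n in (200, 400, 600, 800)])  # [4, 8, 12, 16]
```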
Among the physical quantities, plastic strain (Fig. 8), as a typical history-dependent variable, exhibits irreversible cumulative evolution that increases layer by layer, manifesting as stepped growth. Observing the curves for Node 200, which requires the longest temporal integration, it is evident that the CNN-Neural ODE DeepONet exhibits non-physical oscillations and physically implausible decreases during the cooling phase. This failure at the bottom layer reveals the limitation of continuous ODE solvers: modeling the evolution of Node 200 requires integrating over a long, severely stiff time horizon, leading to significant cumulative numerical drift.
In contrast, the curves generated by CNN-LSTM DeepONet and CNN-LSTM are smooth and closely adhere to the true values, confirming that the gating mechanism effectively captures the plastic increment even after prolonged thermal cycles. In Von Mises stress prediction (Fig. 10), the stress presents pronounced sawtooth fluctuations as deposition layers accumulate. Neural ODE-based models show clearly insufficient fitting capability at peaks and valleys, and in particular exhibit significant cumulative drift in the later stages (t > 1000 s) at Node 200. This further corroborates that continuous differential equations struggle to adapt to such non-smooth dynamic processes over long sequences. Conversely, CNN-LSTM DeepONet reconstructs the rapid response of stress to temperature variations, confirming its capability to capture high-frequency thermal-mechanical responses. Additionally, the predictions for elastic strain (Fig. 7) and thermal strain (Fig. 9) further verify the capability of the DeepONet architecture in handling reversible thermodynamic behaviors.
4.2.2 Spatial physical field distribution analysis
This section selected the Von Mises stress field at the end of deposition (t = 1460s) as the evaluation object (Fig. 11), as this indicator can comprehensively reflect the model's overall modeling capability for thermal-mechanical coupling mechanisms. At this point, the ground truth stress field displays a clear layered stripe structure, reflecting the periodic gradient distribution brought by layer-by-layer deposition, and there is a significant stress concentration area at the junction of the substrate and the deposited layer. Comparing the prediction contours of various models, it can be found that the prediction of CNN-Neural ODE is the most blurred, with the largest error value, and the high gradient information at the interlayer interface is almost completely lost, presenting an over-smoothed state that fails to reflect the mesoscopic features of the WA-DED process. Although CNN-MLP DeepONet recovers some layered structures, its predicted values in the stress concentration area at the bottom (Z < 10mm) are low, failing to capture the high-stress core region. In contrast, CNN-LSTM DeepONet achieves the best reconstruction effect, not only accurately restoring the stress concentration amplitude at the bottom but also clearly preserving the high-frequency interlayer stress gradient stripes along the Z-axis. This result indicates that the model possesses not only memory capabilities in time sequence but also effectively approximates the high-frequency geometric features of the physical field in spatial operator mapping, further corroborating its reliability as a high-performance surrogate model.
Overall, the temporal feature extraction mechanism has a decisive impact on the reconstruction quality of the spatial field. Both CNN-Neural ODE and its DeepONet variant yield the most blurred predictions, with high-gradient information at interlayer interfaces almost completely lost and the largest overall error, underscoring the limitations of continuous integration solvers for non-continuous fields with drastic spatial variation, while CNN-LSTM DeepONet again shows the most uniform error distribution. This proves that for deposition processes with strong historical dependence such as WA-DED, long short-term memory mechanisms should be introduced into the operator learning framework to achieve precise approximation of the high-frequency geometric features and historical cumulative distributions of complex physical fields.
4.3 Physics-informed DeepONet performance in multi-output prediction
This section delves into the core role of the shared trunk network in multi-physics prediction and its underlying mathematical mechanism. The independent-branch, shared-trunk multi-output structure constructed in this study can be expressed as

G^(k)(u)(y) = Σ_{i=1}^{p} b_i^(k)(u) · t_i(y),  k ∈ {elastic, plastic, stress},

where {t_i(y)}_{i=1}^{p} constitutes a set of universal geometric spatiotemporal basis functions reused across tasks. This formula clarifies the mechanism of the model: the specificity of each physical evolution is represented by the independent branch coefficients b_i^(k)(u), while the commonality of the geometric space is represented by the unified basis functions t_i(y).
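The inner-product mechanism of the independent-branch, shared-trunk structure can be sketched numerically: each output field is the dot product of its own branch coefficients with one shared set of trunk basis values. The sketch below uses random numbers in place of trained network outputs; only the wiring is the point, and all sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
p = 64          # number of shared basis functions
n_points = 5    # spatiotemporal query points (x, y, z, t)

# Trunk output: basis functions t_i(y) evaluated at each query point (shared).
t_basis = rng.normal(size=(n_points, p))

# Independent branch coefficients b_i^(k)(u), one vector per physical field.
b = {"elastic": rng.normal(size=p),
     "plastic": rng.normal(size=p),
     "stress":  rng.normal(size=p)}

# G^(k)(u)(y) = sum_i b_i^(k)(u) * t_i(y): one dot product per field.
predictions = {k: t_basis @ coeffs for k, coeffs in b.items()}
for k, v in predictions.items():
    print(k, v.shape)  # each field is predicted at all 5 query points
```

The key property visible here is that all fields consume the same `t_basis`, so supervision on any one field reshapes the basis used by all of them.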
4.3.1 Quantitative evaluation and comparison of multi-output configurations
The experimental data in Table 3 illustrate the specific impact of multi-output synergy on prediction accuracy from a quantitative dimension.
Table 3
Quantitative comparison of prediction performance in terms of MAE and RMSE across different task configuration strategies.
| Model | Elastic Strain MAE (‰) | Elastic Strain RMSE (‰) | Plastic Strain MAE (‰) | Plastic Strain RMSE (‰) | Von Mises Stress MAE (MPa) | Von Mises Stress RMSE (MPa) |
| --- | --- | --- | --- | --- | --- | --- |
| All-Branch | 0.305 | 0.412 | 0.364 | 0.506 | 20.030 | 26.722 |
| Elastic-Plastic | 0.254 | 0.339 | 0.303 | 0.462 | — | — |
| Elastic-Stress | 0.193 | 0.325 | — | — | 16.986 | 23.946 |
| Plastic-Stress | — | — | 0.266 | 0.397 | 12.935 | 18.424 |
Comparison reveals that the Plastic-Stress combination achieved the lowest error among all experimental groups in Von Mises stress prediction, with an MAE of 12.935 MPa and an RMSE of 18.424 MPa. In contrast, the Elastic-Stress combination resulted in an MAE of 16.986 MPa. The accuracy gap of approximately 24% between the two demonstrates a synergistic effect at the physical level: since residual stress fundamentally stems from the accumulation of incompatible plastic deformation within the material, introducing the plastic branch provides key auxiliary information for stress prediction, thereby improving prediction accuracy.
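The approximately 24% figure corresponds to the relative MAE reduction between the two configurations in Table 3:

```python
mae_elastic_stress = 16.986  # MPa, Elastic-Stress combination (Table 3)
mae_plastic_stress = 12.935  # MPa, Plastic-Stress combination (Table 3)

# Relative reduction in MAE from adding the plastic branch instead of elastic.
gap = (mae_elastic_stress - mae_plastic_stress) / mae_elastic_stress
print(f"{gap:.1%}")  # 23.8%
```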
4.3.2 Synergistic enhancement and feature competition
The above quantitative results can be explained by the mathematical essence of the DeepONet operator. In the independent branch, shared trunk architecture, the prediction result is essentially the inner product of independent branch coefficients and universal trunk basis functions.
Comparing the error fields in Fig. 13, it is evident that in the Elastic-Stress model, which lacks the assistance of the plastic branch, the stress prediction error surface exhibits fluctuations, with high error peaks at the edges. Combined with the temporal error curve analysis in Fig. 14, the error of the Elastic-Stress model tends to spike during the transient process of interlayer switching. When the Plastic-Stress combination is adopted, the error trend becomes significantly smoother. This reflects the physical essence of multi-output synergy: the plastic branch acts as an explicit constraint on historical cumulative features, forcing the shared trunk network to prioritize encoding those spatial modes highly relevant to historical accumulation and non-linear hardening. This mechanism allows the learning process of the plastic branch to provide key supervision for stress prediction, helping the model more accurately reconstruct stress fields with historical dependencies.
On the other hand, a feature competition phenomenon also exists in the shared trunk network. The All-Branch model, which includes all physical quantities, did not show the expected advantage; its stress prediction error (MAE 20.030 MPa) was actually higher than that of the Plastic-Stress combination. This is because plastic strain and stress are both irreversible quantities shaped by the loading history, and they can facilitate each other when sharing the trunk network; in contrast, elastic strain changes rapidly with the current temperature and is a reversible transient quantity. If the same trunk network is forced to simultaneously adapt to both elastic and plastic fields, the optimization directions of the network parameters conflict, leaving the model unable to balance features at different time scales and ultimately dragging down overall performance.
Therefore, the sharing strategy of DeepONet should follow the similarity of physical evolution laws. Joint training of plasticity and stress, which share similar laws, yields the best results, while forcibly introducing the elastic field with different properties will cause interference.
4.4 Computational efficiency comparison
Beyond prediction accuracy, computational efficiency is a key indicator of whether a surrogate model can support industrial digital twin systems. This section compares the essential difference in time consumption between traditional FEM and the PIOL developed in this study. The computational bottleneck of traditional FEM simulation lies in its high online cost; every time step requires constructing a massive stiffness matrix and performing complex non-linear iterative solutions. The computational load increases sharply with the number of mesh elements, causing the simulation of a single deposition layer to often take hours, making it difficult to meet online monitoring needs.
The proposed model adopts an offline-training, online-inference mode, transforming heavy physical equation solving into efficient neural network weight-matrix multiplications. In particular, the hybrid architecture designed in this study adopts a lightweight CNN-MLP branch for the transient elastic field, avoiding the unnecessary recurrent computation overhead of full LSTM architectures. Under the same hardware environment, the inference time for full-field physical quantities is compressed to the millisecond level. This acceleration of several orders of magnitude relative to traditional numerical methods breaks through the time bottleneck of physical solving. The extremely low response latency allows the model to be embedded in digital twin systems, compensating for FEM's limitation to offline pre-planning. It can instantly infer the invisible stress and plastic states inside the component from real-time temperature measurements, thereby providing a real-time basis for online process adjustment of industrial robots and truly realizing closed-loop manufacturing control.
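The online-inference cost can be illustrated concretely: once the trunk basis values have been evaluated at all mesh nodes, a full-field prediction collapses to a single matrix-vector product between that precomputed basis matrix and the branch coefficients inferred from live temperature data. A sketch with illustrative sizes and random weights (actual latency depends on hardware):

```python
import time
import numpy as np

rng = np.random.default_rng(1)
n_nodes, p = 50_000, 128           # illustrative mesh size and basis count

T = rng.normal(size=(n_nodes, p))  # trunk basis, evaluated once offline
b = rng.normal(size=p)             # branch coefficients from live sensor input

t0 = time.perf_counter()
field = T @ b                      # full-field prediction in one pass
elapsed_ms = (time.perf_counter() - t0) * 1e3
print(f"{elapsed_ms:.2f} ms for {n_nodes} nodes")
```

No stiffness matrix is assembled and no non-linear iteration is performed, which is the source of the orders-of-magnitude speedup over FEM claimed above.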
4.5 Discussion and PIOL framework summary
This study establishes a PIOL framework for strain-stress decoupling prediction in WA-DED. When predicting transient reversible fields (i.e., elastic strain), structural decoupling from history-dependent fields (i.e., plastic strain and stress) is required to avoid feature interference. Conversely, when predicting residual stress, plastic strain should be introduced as a synergistic auxiliary task to enhance physical consistency. This framework is supported by the differentiated operator learning strategy validated in our experiments. For memoryless physical processes governed by instantaneous thermodynamic states (such as elastic strain), the CNN-MLP architecture demonstrates superiority, as introducing recurrent units would unnecessarily increase model complexity and overfitting risks. In sharp contrast, for plastic strain and stress, the CNN-LSTM architecture is essential. Its gating mechanism effectively captures the irreversible cumulative effects during thermal cycles, successfully overcoming the numerical oscillation and divergence issues encountered by continuous Neural ODE solvers when processing stiff thermal history data.
The benchmark constructed in this paper reveals the profound mechanism of feature interaction within the shared operator space. Although different physical quantities follow distinct evolution laws, they share universal basis functions constrained by the same geometric domain. The superior performance of the "Plastic-Stress" configuration indicates that the plastic branch functions as a constraint, forcing the shared trunk network to prioritize encoding spatial modes relevant to historical accumulation, thereby reducing stress prediction error. However, the failure of the "All-Branch" configuration highlights a critical limitation: feature competition. Forcing the shared trunk network to simultaneously encode high-frequency transient elastic modes and low-frequency cumulative plastic trends leads to conflicting optimization directions. This results in an over-smoothed basis space that degrades overall accuracy. Therefore, the sharing strategy in DeepONet should follow the similarity of physical evolution laws.
Based on these findings, a physically decoupled dual-trunk DeepONet is established as the optimal PIOL architecture, as summarized in Fig. 15. This topology implements a mechanism of selective structural coupling. Distinct from the single-trunk sharing strategy, this architecture employs two independent trunk networks: one dedicated to the elastic branch to capture transient reversible modes, and another shared exclusively between the plastic and stress branches to encode history-dependent cumulative modes. This design avoids the interference of transient elastic features with cumulative history learning, while forcing the plastic-stress shared operator to prioritize the encoding of spatial modes governing irreversible material hardening and residual distortion. Consequently, the shared trunk for the history-dependent task acts as a physical regularization term, utilizing the high-fidelity plastic strain gradients to guide the accurate reconstruction of the stress field, thereby effectively filtering out high-frequency elastic noise and achieving the highest physical consistency.
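The dual-trunk topology summarized in Fig. 15 can be sketched as two disjoint basis sets with selective coupling; the numbers and weights below are random placeholders, and only the wiring reflects the architecture:

```python
import numpy as np

rng = np.random.default_rng(2)
p_e, p_h, n = 32, 64, 10   # elastic basis, history basis, query points

# Trunk 1: transient reversible modes, used by the elastic branch only.
t_elastic = rng.normal(size=(n, p_e))
# Trunk 2: history-dependent cumulative modes, shared by plastic and stress.
t_history = rng.normal(size=(n, p_h))

b_elastic = rng.normal(size=p_e)
b_plastic = rng.normal(size=p_h)
b_stress = rng.normal(size=p_h)

# Selective structural coupling: plastic and stress reuse t_history,
# while the elastic field is isolated on its own trunk.
out = {
    "elastic": t_elastic @ b_elastic,
    "plastic": t_history @ b_plastic,
    "stress":  t_history @ b_stress,
}
```

Because `t_history` never receives gradients from the elastic task, the transient elastic modes cannot compete with the cumulative plastic-stress modes, which is the regularization effect described above.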