Predicting Magnetron Sputtering Deposition Rate through Process Parameters Using Supervised Machine Learning

Sri Vishnu Jami¹, Sakti Prasanna Muduli², & Paresh Kale^{2, *}

¹School of Electrical Engineering, VIT Vellore, Vellore, Tamil Nadu, India, 632001

²Department of Electrical Engineering, NIT Rourkela, Rourkela, Odisha, India, 769008

srivishnujami@gmail.com (S. V. Jami), pinkusakti08@gmail.com (S. P. Muduli)

^*Corresponding author: pareshkale@nitrkl.ac.in (P. Kale)

Abstract

Magnetron sputtering, a widely used physical vapor deposition method, is a plasma-matter interaction process. Its applications include semiconductor fabrication, optics, and surface coating. In the semiconductor industry, sputtering is used to form interconnects, barrier layers, and electrode contacts in solar cells, integrated circuits, and other microelectronic devices. Despite these wide applications, few mathematical analyses derived from first principles exist for predicting the deposition rate. This work uses a supervised machine learning approach that takes process parameters such as power, target-substrate distance, and target material as inputs to predict the deposition rate. The work explains the process parameters and their impact on the deposition rate. Seven regression models are briefly discussed in terms of their relevance to modeling the sputtering deposition rate and are evaluated using Mean Squared Error, Mean Absolute Error, and the coefficient of determination (R²). Tree-based regression models perform well on average, with Decision Trees, Random Forest, and XGBoost reaching R² above 0.9 with low error. Random Forest and XGBoost are the top-performing models, with R² of 0.96 and 0.97, respectively. Predicting the sputter deposition rate using optimized machine learning models is a novel approach to reducing experimental time and expenditure.

Keywords: Substrate-target distance; Hyperparameters; Sputtering pressure; Coefficient of determination; Regression models

1. Introduction

The sputtering technique deposits thin films of a material (called the ‘target’) on a substrate by accelerating Ar⁺ ions (or ions of another ionized gas) towards the target and inducing controlled collision cascades at the target surface. An electric field ionizes the Ar gas, as per Eq. (1). Magnetron sputtering extends traditional sputtering by placing a magnetron beneath the target [1], which enhances the bombardment of the target surface by Ar⁺, causing more sputtering cascades and consequently an improved sputter yield [2]. The magnetic field of the magnetron is at right angles to the electric field; it traps the electrons near the target surface and forces them along a spiral path, increasing the probability of ionizing collisions (Eq. (2)). The combined production of Ar⁺ from both processes increases the number of sputtered atoms. The accelerated gas ions transfer momentum to the target through collisions and eject atoms from its surface. The ejected atoms are electrically neutral and settle on the substrate surface, creating a thin film interface with the substrate. Sputtering is quantified by the sputter yield, the average number of target atoms ejected per incident ion, as shown in Eq. (3).

Under an electric field:	$\mathrm{Ar} \rightarrow \mathrm{Ar}^{+} + e^{-}$	(1)
Cascading collision:	$\mathrm{Ar} + e^{-} \rightarrow \mathrm{Ar}^{+} + 2e^{-}$	(2)
$\text{Sputter yield}=\frac{\text{Number of ejected atoms}}{\text{Number of incident ions}}$	(3)

As a prime physical deposition method, sputtering stands out among techniques like thermal evaporation, electron beam evaporation, and molecular beam epitaxy [3]. Sputtering overcomes the limitations of other methods by effectively depositing materials with high melting points and offering excellent adhesion [4]. Compared to chemical deposition methods such as Chemical Vapor Deposition (CVD), Atomic Layer Deposition (ALD), Electroplating, and Spin coating, sputtering is recognized as a superior and highly versatile technique due to cost-effectiveness and comparatively simple operation [5]. The key advantages lie in the high scalability, precise controllability over film thickness and composition, remarkable versatility in material compatibility (including metals, oxides, and nitrides), and uniform deposition over large areas [6]. The attributes firmly establish sputtering as an industry-preferred choice for thin film deposition applications.

In conventional (diode or non-magnetron) sputtering, a simple electrode setup is used where the substrate is the anode and the target material is the cathode. Ions from a plasma collide with the target, knocking off atoms that are then deposited on the substrate. This method is inefficient due to low plasma density, leading to lower deposition rates and higher substrate heating [6, 7]. In contrast, magnetron sputtering uses magnets behind the target, which creates a magnetic field to trap electrons near the target surface, significantly increasing plasma density and ionization efficiency. The enhancement leads to higher deposition rates, better film quality, and lower substrate temperatures, making the magnetron sputtering a preferred method for precise industrial and research applications [6]. The technique also allows operation at lower voltages and substrate temperatures, which prevents damage to sensitive materials. Additionally, magnetron sputtering provides better deposition uniformity over large areas and enables coating of a wide range of materials, including non-conductive targets when using RF power.

Sputtering systems are at the forefront of thin film deposition, a critical process for fabricating materials with enhanced functionality in various industries. The sputtering system is extensively employed to impart properties such as increased wear-resistance, lubricity, corrosion-resistance, and chemical-resistance to surfaces [8]. Sputtering is a cornerstone technique in the semiconductor industry, enabling the precise fabrication of material structures, including conductive layers, barrier layers, and contacts, which are fundamental to modern-day computers and microelectronics [8]. Sputtering offers the necessary control and purity for depositing films essential for integrated circuit performance and stability. Sputtering systems are vital for photovoltaics in making solar cells more efficient [9]. By enabling the deposition of tailored thin films, sputtering enhances and optimizes the interface properties of solar cells, improving light absorption and charge transport [9].

1.1. Magnetron sputtering

The typical setup for the magnetron sputtering system is depicted in Fig. 1 and comprises a vacuum chamber, a voltage source, a target, and a substrate holder. At the bottom of the system is the sputtering target, typically a planar disc of the material to be deposited, which is connected to a negative potential and acts as the cathode. The substrate, which serves as the anode and on which the sputtered material is deposited, is mounted directly above the target. The chamber has an inlet for the sputter gas, which is introduced at controlled flow rates, and an outlet to maintain the desired vacuum conditions. During operation, a high voltage applied between the target and the substrate ionizes the sputter gas, which emits a characteristic glow as it converts to a plasma within the chamber. The plasma contains Ar⁺ ions, which are accelerated towards the negatively biased sputtering target.

Figure 1. Schematic of ionization of the Ar and the bombardment, to sputter the target atom, and deposition on the substrate surface

1.2. Overview of magnetron sputtering modes

The four modes of magnetron sputtering are (i) DC, (ii) RF, (iii) pulsed DC, and (iv) reactive sputtering. The first three modes are defined by the power source used, which depends on the type of target material, whereas reactive sputtering can use any power source. DC sputtering is straightforward and cost-effective, offering easy controllability and relatively low operation costs. In DC non-magnetron sputtering mode, the potential applied to the cathode is constant, usually from −2 kV to −5 kV; in DC magnetron sputtering, however, the potential ranges from 200 V to 1 kV [10]. The major limitation of DC sputtering arises during the sputtering of non-metals: arcing occurs due to the charge built up over time, eventually leading to uneven voltage drops and resulting in system breakdown or uneven film deposition [11].

Radio frequency (RF) Sputtering uses AC power with a fixed frequency of 13.56 MHz [7]. The RF sputtering solves arcing due to charge buildup in DC Sputtering and allows non-metal targets. Due to the continuous switching of voltages, passive buildup of charges is prevented on the target surface. RF sputtering is relatively expensive and requires technical expertise. RF sputtering operates at a lower chamber pressure than DC sputtering, resulting in fewer collisions, higher deposition efficiency, and improved film quality. RF sputtered films generally exhibit smoother morphology with better packing density than DC sputtered films.

Pulsed DC sputtering combines key benefits of DC and RF techniques—minimizing arcing, enabling deposition of challenging materials (dielectrics), and maintaining high deposition rates. Pulsed DC sputtering is more straightforward to implement and operate than RF sputtering. The source is a pulsed DC with a frequency between 10 kHz and 350 kHz [11]. Pulsed DC sputtering periodically reverses the target polarity by applying a brief positive voltage pulse after the main negative sputtering pulse. The polarity reversal neutralizes positive charge buildup on the target surface, effectively suppressing arcing and enabling continuous deposition of dielectric materials [12]. Unlike conventional DC sputtering, which is unable to handle charge accumulation on non-conductive targets, pulsed DC alternating pulses maintain process stability and improve film quality, which is advantageous for reactive and dielectric film deposition.

For reactive sputtering, the process gas (Ar) is released along with reactive gases to form the respective compounds, which are deposited on the substrate. The compound formation occurs at both the substrate and the target [13]. Reactive sputtering produces compound films such as nitrides and oxides of the target materials. The ratio of Ar to reactive gas is held constant. These processes are difficult to control and often show hysteresis effects, which complicates system control.

1.3. Machine learning models for deposition techniques

Sputtering involves complex, multi-variable interactions, such as power, pressure, and target material, which are challenging to control fully with traditional experimental methods [14]. By leveraging real-time data (like voltage and deposition rate measurements), machine learning rapidly identifies suitable process conditions, optimizes parameters, and predicts outcomes such as film composition or quality, while minimizing material waste and experimental time [15]. Machine learning enables automated, efficient exploration of the vast process parameter space, allowing researchers to achieve optimal results with fewer experiments than manual trial-and-error approaches, and is particularly valuable as sputtering systems and material requirements become increasingly intricate [13].

Lang et al. [14] used a Gaussian machine learning framework to model the thickness profile, using process parameters such as pressure, RF power, and the geometry of the sputtering setup. In a statistical approach, by Akiyama et al. [16], analysis of variance found the RF power as the most critical parameter in determining the crystal orientation of the films. Kino et al. [15] used machine learning to predict sputter yield based on physical descriptors, and the heat of formation strongly affected the value of sputtering threshold energy. Kamataki et al. [17] developed a hybrid machine learning model combining classification and regression to identify boundary conditions for producing high mobility amorphous indium tin oxide film. The model identified the conditions for making films with 27% higher mobility. Bhaskar et al. [18] used a parametric approach to determine the hydrogen storage capacity of different material classes, and parameters such as temperature and pressure were considered. Although the study has no direct relevance to sputtering, it highlights the predictability of supervised models on physical effects. Models such as decision trees and random forests offered good coefficient of determination (R²) values.

Machine learning predicts the optical properties of NiO films produced using RF magnetron sputtering, utilizing Artificial Neural Networks (ANN), Support Vector Machine (SVM), Gaussian Process Regression (GPR), and Adaptive Network Fuzzy Inference System (ANFIS) [13]. ANN and GPR are preferable in predicting the optical properties compared to the other models deployed. Paturi et al. [19] used machine learning models such as ANN and SVM to estimate coating thickness for electrostatic spray deposition, using process parameters such as electric potential, powder feed pressure, and distance between the nozzle tip and the substrate. The experiment achieved R² of 0.92 and 0.99 on the ANN and SVM models, respectively. Tang et al. [20] used machine-learning-assisted Multiphysics simulations to optimize the chemical vapor deposition process of 4H-SiC epitaxial layer growth. The study investigated the effect of process parameters such as deposition temperature, inlet-flow volume, rotational speed, and pressure on the quality of films. Ant colony optimization-back propagation neural network (ACOBPNN) was used as the machine learning algorithm, and the error rate was 4.03%. Zhai et al. [21] used k-nearest neighbors (KNN), SVR, decision tree, random forest, CatBoost, and backpropagation neural network (BPNN) for the prediction of SiN_x deposition rate during PECVD. CatBoost had the maximum R² value of 0.93, and SHAP analysis proved the gas flow rate as the most influential parameter.

1.4. The need for a predictive model for sputtering

Predictive models, especially machine learning, are crucial for optimizing sputtering processes. The sputtering involves multiple interdependent parameters like pressure, power, and temperature, which affect the deposition properties in a complex way. Instead of a time-consuming trial-and-error approach, predictive models analyze experimental data to identify the most influential parameters and predict the best operating conditions to achieve desired film characteristics [21]. The approach allows researchers to accelerate experiments, reduce waste, and consistently produce high-quality films. Predictive models improve process efficiency and scalability for advanced material development [18].

Predictive models optimize experimental parameters by analyzing past data to understand the relationships between process variables (like pressure and power) and desired outcomes (such as film quality) [22]. The models use machine learning to predict results for different parameter combinations and identify the most successful ones. Optimization algorithms use the predictions to efficiently find the ideal settings, allowing researchers to quickly achieve the preferred results while avoiding a time-consuming trial-and-error approach.

2. Overview of sputtering parameters

Sputtering power directly influences the deposition rate; higher power generates more energetic ions, which sputter more atoms from the target. The distance between the target and the substrate is also critical, with shorter distances increasing the rate by reducing atom scattering, though this can compromise film uniformity. The working pressure in the chamber affects the mean free path of particles; lower pressures can improve uniformity and directionality. Additionally, the type and mass of the process gas and the target material properties (e.g., atomic mass and binding energy) significantly impact the sputtering yield and, consequently, the deposition rate. This section briefly explains the sputtering parameters associated with the various parts of the sputtering system: the vacuum chamber, target and cathode arrangement, substrate, type of power supply, and gases other than the process gas.

2.1. Chamber ambient parameters

The air from the sputtering chamber is evacuated, and the pressure then develops from the inflow of process gas. The mean free path (λ) is the average distance an atom or particle travels between successive collisions and is inversely proportional to the chamber pressure [23]. Sputtered particles suffer more collisions at high pressure, and the number of target atoms reaching the substrate decreases [24]. Thus, the chamber ambient parameters are the base and process gas pressures. The base pressure represents the vacuum level before releasing the process gas (typically in the range of 10⁻⁵ to 10⁻⁶ mbar). A lower base pressure reduces the effect of residual gases (other than the process gas) on the sputtering, leading to increased controllability. The flow rate of the process gas sets the chamber pressure (also known as the sputtering pressure, typically in the range of 10⁻² to 10⁻³ mbar) at which plasma forms under given conditions of power and cathode-anode distance [23].
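The inverse relation between mean free path and pressure can be illustrated with the hard-sphere kinetic-theory expression λ = k_B T / (√2 π d² p). The following is a minimal sketch only: the gas temperature of 300 K and the effective Ar diameter of 3.4 × 10⁻¹⁰ m are assumed illustrative values, not parameters taken from this work.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def mean_free_path(pressure_mbar, temperature_k=300.0, diameter_m=3.4e-10):
    """Hard-sphere kinetic-theory mean free path in metres.

    temperature_k and diameter_m are assumed, illustrative values for Ar.
    """
    pressure_pa = pressure_mbar * 100.0  # 1 mbar = 100 Pa
    cross_section = math.sqrt(2.0) * math.pi * diameter_m ** 2
    return K_B * temperature_k / (cross_section * pressure_pa)


# Typical sputtering pressures quoted in the text: 1e-3 to 1e-2 mbar
for p in (1e-3, 1e-2):
    print(f"p = {p:.0e} mbar -> mean free path = {mean_free_path(p) * 100:.2f} cm")
```

Under these assumptions the mean free path at the lower end of the sputtering-pressure range is of the order of centimetres, comparable to typical target-substrate distances, and it shrinks tenfold for a tenfold rise in pressure.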

Target-substrate distance is a chamber parameter limited by the chamber design, and it indirectly affects the film grain size and deposition rate [25]. Lowering the target-substrate distance increases the deposition rate and film grain (crystallite) size, and decreases the resistivity [26]. Chamber temperature affects mobility on the substrate surface; higher mobilities result in larger grain sizes and smoother films. When the angle of incidence, measured from the normal to the target surface, approaches 90°, the Ar ions tend to graze or bounce off the target surface, reducing the sputtering probability. The sputter yield initially increases with the angle of incidence, reaches a maximum between 60° and 80°, and decreases at larger angles [27]. An empirical formula by Yamamura et al. [28] is expressed in Eq. (4), where Y(θ) is the sputter yield at angle of incidence θ, and Y(0) is the yield at normal incidence. θ_opt is the angle of maximum sputter yield, and f is the fitting parameter that best fits the curve to experimental data.

$Y\left(\theta\right)=Y\left(0\right)\left(\frac{1}{\cos^{f}\theta}\right)\exp\left(-f\cos\theta_{opt}\left(\frac{1}{\cos\theta}-1\right)\right)$	(4)
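The angular dependence can be checked numerically. The sketch below assumes the commonly quoted form of Yamamura's formula, in which the exponential term decays with 1/cos θ, so that the yield peaks at θ_opt. The values f = 2 and θ_opt = 70° are hypothetical fitting values chosen only for illustration.

```python
import math


def yield_ratio(theta_deg, f, theta_opt_deg):
    """Y(theta)/Y(0), with theta measured from the target surface normal."""
    x = 1.0 / math.cos(math.radians(theta_deg))
    sigma = f * math.cos(math.radians(theta_opt_deg))
    # Decaying exponential: the ratio rises as x**f, then falls at grazing angles
    return x ** f * math.exp(-sigma * (x - 1.0))


F_FIT, THETA_OPT = 2.0, 70.0  # hypothetical fitting parameters

# Scan a 1-degree grid and locate the angle of maximum yield
angles = range(0, 90)
best_angle = max(angles, key=lambda t: yield_ratio(t, F_FIT, THETA_OPT))
print("angle of maximum yield:", best_angle, "degrees")
for t in (0, 30, 60, 70, 85):
    print(t, round(yield_ratio(t, F_FIT, THETA_OPT), 3))
```

With these inputs the ratio equals 1 at normal incidence, grows towards θ_opt, and drops again towards grazing incidence, matching the qualitative behaviour described in the text.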

2.2. Target Parameters

The sputter yield, to which the deposition rate is directly proportional, depends on the material properties of the target, in particular the surface binding energy. The sputter yield at normal incidence is expressed in Eq. (5), based on the empirical equation from Yamamura. Λ is a material-dependent factor inversely proportional to U_o, the surface binding energy of the material; hence, lower surface binding energy leads to higher sputter yields. S_n(E) is the nuclear stopping power of the incident ion in the target material at energy E, and E_th is the threshold energy for sputtering. The target area determines the power density (power per unit area); lower power density corresponds to lower deposition rates.

$Y\left(0\right)=\varLambda\cdot S_{n}\left(E\right)\cdot\left(1-\frac{E_{th}}{E}\right)$	(5)

2.3. Substrate Parameters

Stronger adhesion of the target material to the substrate increases the thickness of the deposited film [25]. Substrate temperature determines the atom mobility; higher mobilities result in greater grain sizes, which increase the film thickness. The atoms spread out at higher temperatures, creating larger grain sizes and clusters. The enhanced distribution is evident after the film-deposited substrate is brought to room temperature [25]. Substrate rotation is a popular technique for increasing deposition uniformity: rotating the substrate evens out the nonuniform flux from the target, because each point on the substrate is exposed to different regions of the flux at different instants. A substrate voltage bias is applied to attract ions towards the substrate surface, which helps to control the film properties [29].

2.4. Process gas parameters

Sputtering is an electrically neutral momentum transfer process; hence, for effective sputtering, the heavier targets should be sputtered using heavier sputter gases and vice versa [30]. The properties of sputter gas affecting deposition rates and film properties are its atomic weight, reactivity with the target, and gas flow rate in the chamber. Typically, Ar is used as the sputter gas; however, Ne is preferred for lighter targets, and either Xe or Kr is preferred for heavier targets [31]. The sputter gas can be mixed with either N₂ or O₂, leading to the formation of either nitrides or oxides of the target material. Compared to elemental targets, sputter yields are lower in compound materials [32]. The reactivity of the sputter gas is quantified using the ratio of noble gas to the reactive gas. Adjusting the sputter gas flow rate is a critical balancing act. A higher flow rate increases the chamber pressure, which reduces the mean free path of sputtered particles, leading to increased diffusion and a reduced deposition rate. Conversely, reducing the gas flow rate lowers the pressure and decreases the number of argon ions, which are essential for ejecting material from the target.
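The preference for matching gas and target masses follows from the standard result for a head-on elastic binary collision, in which the maximum fraction of kinetic energy transferred is 4m₁m₂/(m₁ + m₂)². The sketch below applies this idealized factor, which ignores the collision cascade inside the target, to two example targets (Al and W) chosen here only for illustration.

```python
def energy_transfer_factor(m_ion, m_target):
    """Maximum kinetic-energy fraction transferred in a head-on elastic collision."""
    return 4.0 * m_ion * m_target / (m_ion + m_target) ** 2


# Standard atomic masses in atomic mass units
GAS_MASS = {"Ne": 20.18, "Ar": 39.95, "Kr": 83.80, "Xe": 131.29}
TARGET_MASS = {"Al": 26.98, "W": 183.84}  # illustrative light and heavy targets

for target, m_t in TARGET_MASS.items():
    factors = {gas: energy_transfer_factor(m_g, m_t) for gas, m_g in GAS_MASS.items()}
    best_gas = max(factors, key=factors.get)
    summary = ", ".join(f"{gas}: {value:.2f}" for gas, value in factors.items())
    print(f"{target}: {summary} -> best match: {best_gas}")
```

In this simplified picture the light Al target couples best to Ne, while the heavy W target couples best to Xe, in line with the gas choices described above.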

2.5. Sputtering mode-dependent parameters

DC sputtering offers higher deposition rates than RF or reactive sputtering. In a comparative study of ITO layers deposited by DC and RF sputtering [33], non-reactive DC sputtering yielded a deposition rate of 60 nm/min at lower power density, whereas reactive RF sputtering yielded 20 nm/min. Deposition rates are higher in setups using non-reactive sputter gases than in those using reactive gases. The deposition rate also depends on the applied power: an increase in power increases the energy transmitted to the plasma, which results in a higher sputter yield and thus a higher deposition rate, irrespective of the power source (DC or RF). Higher electrical powers enhance crystallinity and resistivity [34]. A study on the deposition of ZnO:Ga on glass substrates using RF power [35] likewise showed an increase in deposition rate.

3. Experimentation and Methodology

The basic parameters of sputtering include sputtering power, working pressure, sputtering time, process gas flow rate, and substrate temperature. These parameters are fundamental for controlling the deposition rate, film uniformity, microstructure, and properties of the thin film produced.

3.1. Dataset and variables

The dataset for the model has been sourced from Gencoa’s online universal sputtering calculator, the Gencoa Sputter Rate Calculator [36]. The dataset contains the material name, power (W), sputtering mode (DC or RF), and target-substrate distance (in cm) as inputs, and the deposition rate (in nm/minute) as the dependent variable. The dataset contains 3870 entries covering 43 different elements and compounds, with power ranging from 10 W to 150 W and target-substrate distance ranging from 50 mm to 150 mm (5 cm to 15 cm). In both DC and RF settings, the target diameter is fixed at 50.8 mm, and standard gas flow and vacuum pressures have been considered. The input variables (material, power, sputtering mode, distance) are used to train the model and are referred to as internal variables; the output variable is the deposition rate. The model is tuned using hyperparameters, also referred to as external variables. Table 1 lists all the models used in the evaluation and the respective hyperparameters.

Table 1
Regression machine learning models and their hyperparameters
S. N.	Model Name	Hyperparameters
1	Linear Regression	LinearRegression estimator in scikit-learn, which employs Ordinary Least Squares, does not have explicit hyperparameters that typically require tuning. The optimized coefficients are found analytically.
2	Polynomial Regression	Degree of polynomial
3	Decision Trees	max_depth, min_samples_split, min_samples_leaf, max_features, splitter, criterion
4	Random Forest Regression	Number of trees(n_estimators), max_features, max_depth, min_samples_split, min_samples_leaf
5	Boosted Decision Trees	n_estimators, learning rate, max_depth, min_samples_split, min_samples_leaf, subsample, loss function
6	XGBoost	n_estimators, learning_rate, max_depth, subsample, booster, objective function, lambda, alpha, tree method
7	Support Vector Regression	C-Regularization, Kernel, epsilon, gamma

3.2. ML Regression methods

Linear regression, one of the most widely used machine learning models, is valued for its simplicity [18]. It establishes a linear relation between the input and output matrices, as illustrated in Eq. (6), where [Y] is the matrix of output (dependent) variables, [A] is the matrix of coefficients of [X], [X] is the matrix of independent variables, B is the intercept of the linear relation, and ε is the error term. With a single input variable the model is simple linear regression; with several, it is multiple linear regression. The objective of the linear regression model is to find the set of coefficients and intercept that best fits the given dataset and minimizes the error term (ε), which is done by comparing the cost function for candidate sets of coefficients and intercept. The cost function is illustrated in Eq. (7), where Y(i) and Yⁱ are the actual and predicted output values for the i-th of m instances, given [A] and B. The optimum can be found iteratively using the gradient descent algorithm, or analytically by ordinary least squares as in the scikit-learn estimator listed in Table 1.

Linear Regression Equation	$\left[Y\right]=\left[A\right]\left[X\right]+B+\varepsilon$	(6)
Cost Function	$J\left(\left[A\right],B\right)=\frac{1}{2}\sum_{i=1}^{m}{\left(Y\left(i\right)-{Y}^{i}\right)}^{2}$	(7)
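For the single-input case, the minimization of the cost function in Eq. (7) can be sketched in a few lines. The data below are synthetic (generated from Y = 2X + 1) and the learning rate and step count are arbitrary illustrative choices. The sketch compares gradient descent with the closed-form least-squares solution rather than reproducing the implementation used in this work.

```python
def cost(a, b, xs, ys):
    """Eq. (7): half the sum of squared residuals for slope a and intercept b."""
    return 0.5 * sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))


def fit_gradient_descent(xs, ys, lr=0.02, steps=2000):
    """Minimize the cost by repeatedly stepping against its gradient."""
    a = b = 0.0
    for _ in range(steps):
        residuals = [y - (a * x + b) for x, y in zip(xs, ys)]
        grad_a = -sum(r * x for r, x in zip(residuals, xs))
        grad_b = -sum(residuals)
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b


def fit_closed_form(xs, ys):
    """Ordinary least squares for one input variable."""
    m = len(xs)
    mean_x, mean_y = sum(xs) / m, sum(ys) / m
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    return a, mean_y - a * mean_x


# Synthetic, noise-free data from Y = 2X + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
a_gd, b_gd = fit_gradient_descent(xs, ys)
a_cf, b_cf = fit_closed_form(xs, ys)
print("gradient descent:", round(a_gd, 4), round(b_gd, 4), "cost:", cost(a_gd, b_gd, xs, ys))
print("closed form:     ", a_cf, b_cf)
```

Both routes recover the same slope and intercept on this toy data; the closed-form route needs no learning rate, which is why the ordinary least squares estimator has no hyperparameters to tune.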

Polynomial regression works like linear regression but establishes a nonlinear relationship between the output and input as an n-th degree polynomial, as illustrated in Eq. (8), where Y is the target variable, C₀, C₁, C₂, …, C_n are coefficients, X is the independent variable, and ε is the error term. As in linear regression, the polynomial coefficients are found by minimizing the squared-error cost function of Eq. (7).

Polynomial Regression Equation	$Y={C}_{0}+{C}_{1}X+{C}_{2}{X}^{2}+\dots+{C}_{n}{X}^{n}+\varepsilon$	(8)
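One common way to fit Eq. (8) in scikit-learn is to expand the input into polynomial terms and pass them to a linear regressor. The sketch below uses synthetic, noise-free data from Y = 1 + 2X + 3X² and a degree of 2, both chosen for illustration. It is not claimed to be the configuration used in this work.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data: Y = 1 + 2X + 3X^2 (no noise)
X = np.linspace(0.0, 4.0, 9).reshape(-1, 1)
y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 0] ** 2

# The degree of the polynomial is the hyperparameter listed in Table 1
model = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    LinearRegression(),
)
model.fit(X, y)

prediction = model.predict(np.array([[5.0]]))[0]
print("prediction at X = 5:", round(prediction, 3))
```

Because the expansion is followed by an ordinary linear fit, the model remains linear in its coefficients even though it is nonlinear in X.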

Decision Trees regression [37] models the relationship between target and feature variables by building a tree structure. The dataset is broken down into subsets, and an associated decision tree is developed step by step. The decision tree consists of nodes and leaves, where nodes represent the decision being made and leaves are the predicted continuous value based on the feature of the decision node. The node is divided into leaves through splitting; the best split is found at every node, which minimizes the loss function. Decision Trees can handle both regression and classification tasks and are known for their versatility. Their results are independent of data scaling and preprocessing, and they handle nonlinear relationships effectively.

Random forest regression is an ensemble machine learning model for predicting continuous outcomes [38]; it combines multiple decision trees to improve accuracy and reduce overfitting. Each decision tree is trained on a different subset of the data. Boosted Decision Trees [39] is a technique that combines many weak learners into one strong learner. The weak learners are individual decision trees built in sequence, with each tree trying to reduce the error of its predecessor. The method can handle both numerical and categorical features, reducing the need for preprocessing. XGBoost (eXtreme Gradient Boosting) [40] is a tree ensemble model that likewise combines many weak prediction models into a stronger one, and it incorporates L1 (Lasso) and L2 (Ridge) regularization directly into the objective function to limit overfitting. XGBoost can also handle large datasets efficiently.
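The idea that each tree corrects the error of its predecessor can be made concrete with a hand-rolled boosting loop for squared error. This is a simplified sketch on synthetic data: the tree depth, learning rate, and number of rounds are arbitrary assumptions, and library implementations add refinements such as subsampling and regularization.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression problem (not the sputtering dataset)
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 2))
y = np.sin(4.0 * X[:, 0]) + X[:, 1] ** 2

learning_rate, n_rounds = 0.1, 50  # illustrative settings

# Start from a constant prediction, then add shallow trees fitted to the residuals
prediction = np.full_like(y, y.mean())
mse_history = [float(np.mean((y - prediction) ** 2))]
for _ in range(n_rounds):
    residual = y - prediction  # error left by the trees built so far
    tree = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, residual)
    prediction += learning_rate * tree.predict(X)
    mse_history.append(float(np.mean((y - prediction) ** 2)))

print("training MSE before boosting:", round(mse_history[0], 4))
print("training MSE after boosting: ", round(mse_history[-1], 4))
```

The training error falls round by round because every new weak learner is fitted to what the ensemble still gets wrong.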

Support Vector Regression (SVR) is an extension of the Support Vector Machine for performing regression tasks [41]. SVR aims to find a function that best predicts the continuous output value for the input value by looking for a hyperplane that best fits all the data points. SVR handles linear and nonlinear relationships using kernel functions such as linear, polynomial, and radial basis functions. Kernels are mathematical functions that transform data into higher dimensions.

3.3. Methodology

The steps illustrated in Fig. 2 have been followed to build accurate models. Power, distance, and deposition rate are numerical values in the dataset, whereas the material name is categorical. Sputtering mode has been encoded as 0 and 1 to denote DC and RF sputtering, respectively. A label encoder has been employed to convert the material names to numeric values, which enables the model to interpret the material class. An essential step in model training is data scaling, in which each data column is divided by its maximum value so that all variables lie on a comparable scale. The step especially enhances the accuracy of neural network regression and gradient-search-based models. The dataset is split into two parts: 80% for training the selected model and 20% for testing it. The next step is the selection of models, which differ in their hyperparameters.
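The preprocessing steps described above can be sketched as follows. All rows are hypothetical placeholders rather than entries from the dataset, and the array names are introduced here for illustration only.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Hypothetical placeholder rows, NOT values from the Gencoa-derived dataset
materials = np.array(["Al", "Cu", "Ti", "Al", "Cu", "Ti", "Al", "Cu", "Ti", "Al"])
modes = np.array(["DC", "RF", "DC", "RF", "DC", "RF", "DC", "RF", "DC", "RF"])
power_w = np.array([10, 30, 50, 70, 90, 110, 130, 150, 20, 40], dtype=float)
distance_mm = np.array([50, 60, 70, 80, 90, 100, 110, 120, 130, 150], dtype=float)
rate = np.arange(10, dtype=float)  # placeholder target values

# Label-encode the categorical material name; encode DC as 0 and RF as 1
material_code = LabelEncoder().fit_transform(materials)
mode_code = (modes == "RF").astype(float)

# Scale each column by its maximum value
X = np.column_stack([material_code, mode_code, power_w, distance_mm]).astype(float)
X_scaled = X / X.max(axis=0)

# 80/20 split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, rate, test_size=0.2, random_state=0
)
print(X_train.shape, X_test.shape)
```

Label encoding assigns an arbitrary integer order to the materials, which tree-based models tolerate well because they split on thresholds rather than relying on distances between codes.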

Fig. 2

Flow chart of the investigation of the magnetron sputtering deposition rate prediction

The candidate values of the hyperparameters are chosen manually for each selected model. The hyperparameters are optimized using GridSearchCV (Grid Search Cross Validation), which evaluates different combinations of hyperparameters using cross-validation to make the best selection. Cross-validation works by repeatedly splitting the data into training and validation sets. The best set of hyperparameters is then chosen using a scoring metric evaluated on the validation sets. Each model is trained to fit the dataset, and its performance is evaluated using metrics such as MSE, MAE, and R².
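A grid search of this kind can be sketched with scikit-learn's GridSearchCV. The example below uses a Random Forest on synthetic data; the grid values, the five folds, and the scoring metric are illustrative assumptions, not the settings used in this work.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic data standing in for the scaled training set
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(120, 3))
y = 10.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(0.0, 0.1, size=120)

# Manually chosen candidate values (illustrative)
param_grid = {"n_estimators": [20, 50], "max_depth": [3, None]}

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=5,                              # 5-fold cross-validation
    scoring="neg_mean_squared_error",  # higher (closer to zero) is better
)
search.fit(X, y)

print("best hyperparameters:", search.best_params_)
print("best cross-validated MSE:", -search.best_score_)
```

Every combination in the grid is fitted once per fold, so the cost grows multiplicatively with the number of candidate values per hyperparameter.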

4. Results and discussion

The dataset features (material, power, sputtering mode, and target-substrate distance) were subjected to data scaling. The material feature is categorical and was converted to numeric form using the label encoder. Scaling the features helps speed up convergence and improves the effectiveness of the applied analysis. The data was scaled using a standard scaler available in scikit-learn. After scaling, all feature values lie between 0 and 1.

The actual and predicted deposition rates have been plotted on the x-axis and y-axis, respectively, for each regression model in Fig. 3. The training and testing data are marked on the plots, which allows a visual comparison of performance on training and testing data and helps check for overfitting. A larger number of points aligned with the X = Y line indicates higher prediction accuracy. The points deviate heavily in the case of SVR (Fig. 3(g)) and linear regression (Fig. 3(a)), indicating higher MSE and MAE. In the case of decision trees, the model achieved 100% accuracy on the training set, as shown in Fig. 3(d): the training datapoints align perfectly with the X = Y line, while the testing data deviate slightly from it. The individual performances of the applied regression models have been assessed quantitatively using the MSE, MAE, and R² [32], and tabulated in Table 2.

Fig. 3

The predicted outcome vs. actual outcome graph was plotted along with X = Y for error referencing for all models

The R² is a statistical measure of how well a model replicates the observed outcomes, also called goodness of fit. R² values typically lie between 0 and 1, although a model that fits worse than the mean of the data gives a negative value. Eq. (9) formulates the R², where SS_res and SS_tot are the residual and total sum of squares, respectively. A higher R² value indicates that the model explains more of the variance in the dependent variable using the independent variables.

$R^{2} = 1 - \frac{SS_{res}}{SS_{tot}}$	(9)

Table 2
Evaluation of the performance of the regression models used to predict deposition rate
Model	MSE	MAE	R²
Linear Regression	33715.18	117.5	0.67
Polynomial Regression	6292.45	50.68	0.93
Decision Trees	5312.11	44.07	0.94
Random Forest	3501.01	34.14	0.96
Boosted Decision Trees	20451.0	81.71	0.80
XGBoost	2790.48	25.08	0.97
SVR	65191.32	124.49	0.36

MSE is an error evaluation metric popularly used in statistics, which measures the average squared difference between actual and estimated values [42]. It is computed using Eq. (10), where Y_i is the actual value, $\hat{Y}_{i}$ is the estimated value, and m is the total number of observations. The metric penalizes large errors heavily, as squaring makes them contribute disproportionately to the MSE. The MAE measures the average absolute difference between the actual and estimated values, as expressed in Eq. (11), with the same notation. MAE is less sensitive to prediction outliers than MSE and helps identify models that work best on most of the data.

$MSE = \frac{1}{m}\sum_{i=1}^{m}\left(Y_{i}-\hat{Y}_{i}\right)^{2}$	(10)

$MAE = \frac{1}{m}\sum_{i=1}^{m}\left|Y_{i}-\hat{Y}_{i}\right|$	(11)
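The three metrics can be computed directly from their definitions. The sketch below uses a small set of hypothetical actual and predicted values, chosen only so that the arithmetic is easy to follow by hand.

```python
from typing import Sequence


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean of the squared differences between actual and predicted values."""
    m = len(actual)
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / m


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean of the absolute differences between actual and predicted values."""
    m = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / m


def r2(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical values for illustration.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 320.0, 380.0]

print(mse(actual, predicted))  # 250.0
print(mae(actual, predicted))  # 15.0
print(r2(actual, predicted))
```

Here the errors are 10, 10, 20, and 20 in magnitude, so the MAE is 15 while the MSE is 250: the two larger errors account for 800 of the 1000 in the summed squares, which shows how MSE weights large deviations more strongly.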

Fig. 4

Significance of process parameters on deposition rate in tree-based models

Additionally, the importance of power, sputtering mode, and target-substrate distance is illustrated in Fig. 4; the feature importances were generated by keeping the material constant at GaAs. They were obtained from the feature importance attribute (feature_importances_) in the scikit-learn library. In the case of tree-based models (Random Forest, Decision Trees, Boosted Decision Trees), the value is calculated using Mean Decrease in Impurity (MDI), also called Gini importance [43], which is based on how much each feature reduces impurity at the nodes it splits. For Decision Trees, Boosted Decision Trees, and Random Forest alike, power has the highest feature importance.
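A minimal sketch of reading impurity-based importances from a fitted tree ensemble is given below. The data are synthetic, and the toy relation is deliberately constructed so that power dominates; the feature names, the 0/1 encoding of the sputtering mode, and the coefficients are assumptions, so the output does not reproduce Fig. 4.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic inputs (assumed ranges) for a single material.
rng = np.random.default_rng(2)
n = 500
power = rng.uniform(50, 500, n)     # W
distance = rng.uniform(50, 200, n)  # mm
mode = rng.integers(0, 2, n)        # hypothetical 0/1 encoding of the mode
X = np.column_stack([power, mode, distance])

# Toy relation in which power contributes most of the variance.
y = 1.0 * power + 20.0 * mode - 0.2 * distance + rng.normal(0, 5, n)

forest = RandomForestRegressor(n_estimators=100, random_state=2).fit(X, y)

# feature_importances_ holds the normalized mean decrease in impurity.
names = ["power", "sputtering_mode", "distance"]
importances = dict(zip(names, forest.feature_importances_))
for name, value in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {value:.3f}")
```

The importances are normalized to sum to 1, so each value can be read as the share of the total impurity reduction attributed to that feature.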

The distinguishing features of the work are (i) a large dataset, (ii) the four most influential sputtering parameters as inputs, and (iii) a comparison of seven predictive models. A larger dataset improves generalization and reduces overfitting. With a wide range of data, the model is exposed to noise and a broad set of scenarios, which moves it closer to real-time prediction by capturing the complex relations among multiple sputtering parameters, including edge cases. Table 3 lists similar previous reports to explain the novelty of the work. Though Lee et al. [44] and Lang et al. [14] reported better R² values, the works considered small datasets and conventional predictive models. The work considers the four most influential parameters, which are typically tuned by trial and error, as the input features. Training and testing seven predictive models raises the probability of choosing an appropriate model for sputtering, so that a deposition can be planned with less time and cost. Lang et al. [14] also tested seven models, with only 100 data points; however, that work did not include advanced models such as boosted decision trees and XGBoost.

Table 3
Comparison of previous works on machine learning for sputtering
Study	Input features	Target features	Dataset size	Models deployed	Best model (R²)	Ref
Lee et al.	Power, duration, nitrogen flow rate, argon flow rate, pressure, voltage, current (C), and refractive index (RI)	Deposition rate, N/Ti ratio	40	SVR, GPR, Bayesian ridge regression, Reptile	Reptile (0.99)	[44]
Lang et al.	chamber pressure, RF power, bias, and target-to-wafer spacing	Deposition rate	68	GPR	0.98	[14]
Lang et al.	Chamber configurations, chamber pressure, RF power, bias, and target-to-wafer spacing	Deposition rate	100	GPR, Polynomial, GBRT, MARS, CNN, DNN, RBFNN	GPR (0.83)	[14]
Salimian et al.	Spectral emission	Ni%, Ti%, RF Power, Gas Flow rate	2352	CNN	0.98	[45]
Chen et al.	Ion mass, ion evaporation enthalpy, target mass, target evaporation	Sputter yield	266	ANN (hidden layers of 5, 10, and 15)	ANN (0.91) (hidden layers 5)	[46]
This work	Material, power, target-substrate distance, sputtering mode	Deposition rate	3870	Linear regression, polynomial regression, decision trees, random forest, boosted decision trees, XGBoost, SVR	XGBoost (0.97)	This work

5. Conclusion

Seven regression machine learning models were applied to a sputtering dataset built using an openly available process calculator to find the best-suited regression model for sputtering processes. The models were assessed using R², MSE, and MAE. The XGBoost (eXtreme Gradient Boosting) model was the most effective, with the highest R² of 0.97 and the lowest MSE and MAE of 2790.48 nm² and 25.08 nm, respectively. Random forest followed with an R² of 0.96, while its MSE and MAE stood at 3501.01 nm² and 34.14 nm, respectively (Table 2). The least effective regression models were SVR and linear regression: SVR offered the lowest R² of 0.36, followed by linear regression with an R² of 0.67. The polynomial and decision tree regression models scored R² greater than 0.9 with correspondingly low error metrics. Secondary insights from the feature importances in random forest, decision trees, and boosted decision trees reveal that power influences the deposition rate most. The partial dependence plots show a positive linear relationship between power and deposition rate, while target-substrate distance shows an inverse relationship with deposition rate.

Conflict of interest

The authors declare no conflict of interest.

Funding

The authors declare that no funds, grants, or other support were received during the preparation of the manuscript.

Data Availability

Data will be made available by the corresponding author upon reasonable request.

Author Contribution

Sri Vishnu Jami: Conceptualization, Data curation, Investigation, Methodology, Formal analysis, Writing – original draft, review & editing, Validation, Visualization; Sakti Prasanna Muduli: Formal analysis, Writing – original draft, review & editing, Validation, Visualization; Paresh Kale: Conceptualization, Supervision, Funding acquisition, Resources, Project administration, Writing – review & editing, Validation.

References

1. R.K. Waits, Planar magnetron sputtering, J. Vac. Sci. Technol. 15 (1978) 179–187. https://doi.org/10.1116/1.569451.

2. M. Al-Mansoori, S. Al-Shaibani, A. Al-Jaeedi, J. Lee, D. Choi, F.S. Hasoon, Effects of gas flow rate on the structure and elemental composition of tin oxide thin films deposited by RF sputtering, AIP Adv. 7 (2017). https://doi.org/10.1063/1.5001883.

3. O. Oluwatosin Abegunde, E. Titilayo Akinlabi, O. Philip Oladijo, S. Akinlabi, A. Uchenna Ude, Overview of thin film deposition techniques, AIMS Mater. Sci. 6 (2019) 174–199. https://doi.org/10.3934/matersci.2019.2.174.

4. P.M. Martin (Ed.), Chapter 1 - Deposition Technologies: An Overview, in: Handb. Depos. Technol. Film. Coatings, Third Ed., William Andrew Publishing, Boston, 2010: pp. 1–31. https://doi.org/10.1016/B978-0-8155-2031-3.00001-6.

5. J.E. Crowell, Chemical methods of thin film deposition: Chemical vapor deposition, atomic layer deposition, and related technologies, J. Vac. Sci. Technol. A Vacuum, Surfaces, Film. 21 (2003) S88–S95. https://doi.org/10.1116/1.1600451.

6. P.J. Kelly, R.D. Arnell, Magnetron sputtering: A review of recent developments and applications, Vacuum. 56 (2000) 159–172. https://doi.org/10.1016/S0042-207X(99)00189-X.

7. F. Shinoki, A. Itoh, Mechanism of rf reactive sputtering, J. Appl. Phys. 46 (1975) 3381–3384. https://doi.org/10.1063/1.322242.

8. W. Kern, K.K. Schuegraf, 1 - Deposition Technologies and Applications: Introduction and Overview, in: K. Seshan (Ed.), Handb. Thin Film Depos. Process. Tech., Second Ed., William Andrew Publishing, Norwich, NY, 2001: pp. 11–43. https://doi.org/10.1016/B978-081551442-8.50006-7.

9. J. Fritsche, A. Klein, W. Jaegermann, Thin film solar cells: materials science at interfaces, Adv. Eng. Mater. 7 (2005) 914–920.

10. J.T. Gudmundsson, A. Hecimovic, Foundations of DC plasma sources, Plasma Sources Sci. Technol. 26 (2017) 123001. https://doi.org/10.1088/1361-6595/aa940d.

11. A.L. Gobbi, P.A.P. Nascente, DC Sputtering, in: Encycl. Tribol., Springer US, Boston, MA, 2013: pp. 699–706. https://doi.org/10.1007/978-0-387-92897-5_1029.

12. J. Vlček, A.D. Pajdarová, J. Musil, Pulsed dc magnetron discharges and their utilization in plasma surface engineering, Contrib. to Plasma Phys. 44 (2004) 426–436. https://doi.org/10.1002/ctpp.200410083.

13. S. Berg, T. Nyberg, Fundamental understanding and modeling of reactive sputtering processes, Thin Solid Films. 476 (2005) 215–230. https://doi.org/10.1016/j.tsf.2004.10.051.

14. C.I. Lang, A. Jansen, S. Didari, P. Kothnur, D.S. Boning, Modeling and Optimizing the Impact of Process and Equipment Parameters in Sputtering Deposition Systems Using a Gaussian Process Machine Learning Framework, IEEE Trans. Semicond. Manuf. 35 (2022) 229–240. https://doi.org/10.1109/TSM.2021.3132562.

15. H. Kino, K. Ikuse, H.-C. Dam, S. Hamaguchi, Characterization of descriptors in machine learning for data-based sputtering yield prediction, Phys. Plasmas. 28 (2021). https://doi.org/10.1063/5.0006816.

16. M. Akiyama, C.-N. Xu, K. Nonaka, K. Shobu, T. Watanabe, Statistical approach for optimizing sputtering conditions of highly oriented aluminum nitride thin films, Thin Solid Films. 315 (1998) 62–65. https://doi.org/10.1016/S0040-6090(97)00697-4.

17. K. Kamataki, H. Ohtomo, N. Itagaki, C.F. Lesly, D. Yamashita, T. Okumura, N. Yamashita, K. Koga, M. Shiratani, Prediction by a hybrid machine learning model for high-mobility amorphous In2O3: Sn films fabricated by RF plasma sputtering deposition using a nitrogen-mediated amorphization method, J. Appl. Phys. 134 (2023). https://doi.org/10.1063/5.0160228.

18. A. Bhaskar, R.C. Muduli, P. Kale, Prediction of hydrogen storage in metal hydrides and complex hydrides: A supervised machine learning approach, Int. J. Hydrogen Energy. 98 (2025) 1212–1225. https://doi.org/10.1016/j.ijhydene.2024.12.121.

19. U.M.R. Paturi, N.S. Reddy, S. Cheruku, S.K.R. Narala, K.K. Cho, M.M. Reddy, Estimation of coating thickness in electrostatic spray deposition by machine learning and response surface methodology, Surf. Coatings Technol. 422 (2021) 127559. https://doi.org/10.1016/j.surfcoat.2021.127559.

20. Z. Tang, S. Zhao, J. Li, Y. Zuo, J. Tian, H. Tang, J. Fan, G. Zhang, Optimizing the chemical vapor deposition process of 4H–SiC epitaxial layer growth with machine-learning-assisted multiphysics simulations, Case Stud. Therm. Eng. 59 (2024) 104507. https://doi.org/10.1016/j.csite.2024.104507.

21. Y. Zhai, W. Wang, R. Chen, T. Yu, X. Zheng, H. Shao, D. Shang, L. Li, L. Filipovic, A Machine Learning Model for Interpretable PECVD Deposition Rate Prediction, Adv. Intell. Discov. (2025). https://doi.org/10.1002/aidi.202500074.

22. G. Venkata Ramana, P. Saravanan, S. V. Kamat, Y. Aparna, Optimization of sputtering parameters for SmCo thin films using design of experiments, Appl. Surf. Sci. 261 (2012) 110–117. https://doi.org/10.1016/j.apsusc.2012.07.109.

23. S.G. Jennings, The mean free path in air, J. Aerosol Sci. 19 (1988) 159–166. https://doi.org/10.1016/0021-8502(88)90219-4.

24. P.B. Nair, V.B. Justinvictor, G.P. Daniel, K. Joy, V. Ramakrishnan, P.V. Thomas, Effect of RF power and sputtering pressure on the structural and optical properties of TiO2 thin films prepared by RF magnetron sputtering, Appl. Surf. Sci. 257 (2011) 10869–10875. https://doi.org/10.1016/j.apsusc.2011.07.125.

25. H. Ejaz, S. Hussain, M. Zahra, Q.M. Saharan, S. Ashiq, Several sputtering parameters affecting thin film deposition, J. Appl. Chem. Sci. Int. (2022) 41–49. https://doi.org/10.56557/jacsi/2022/v13i37590.

26. S.H. Jeong, J.H. Boo, Influence of target-to-substrate distance on the properties of AZO films grown by RF magnetron sputtering, Thin Solid Films. 447–448 (2004) 105–110. https://doi.org/10.1016/j.tsf.2003.09.031.

27. K. Wasa, I. Kanno, H. Kotera, Handbook of Sputter Deposition Technology, Handb. Sputter Depos. Technol. Fundam. Appl. Funct. Thin Film. Nano-Materials MEMS. 2 (2012) 644.

28. Y. Yamamura, Y. Itikawa, N. Itoh, Angular dependence of sputtering yields of monatomic solids, Res. Rep. IPPJ-AM. (1983).

29. A.L. Gobbi, P.A.P. Nascente, DC Sputtering, in: Encycl. Tribol., Springer US, Boston, MA, 2013: pp. 699–706. https://doi.org/10.1007/978-0-387-92897-5_1029.

30. I. Petrov, I. Ivanov, V. Orlinov, J.-E. Sundgren, Comparison of magnetron sputter deposition conditions in neon, argon, krypton, and xenon discharges, J. Vac. Sci. Technol. A Vacuum, Surfaces, Film. 11 (1993) 2733–2741. https://doi.org/10.1116/1.578634.

31. G.T. West, P.J. Kelly, Influence of inert gas species on the growth of silver and molybdenum films via a magnetron discharge, Surf. Coatings Technol. 206 (2011) 1648–1652. https://doi.org/10.1016/j.surfcoat.2011.08.025.

32. W.D. Sproul, D.J. Christie, D.C. Carter, Control of reactive sputtering processes, Thin Solid Films. 491 (2005) 1–17. https://doi.org/10.1016/j.tsf.2005.05.022.

33. F. Kurdesau, G. Khripunov, A.F. da Cunha, M. Kaelin, A.N. Tiwari, Comparative study of ITO layers deposited by DC and RF magnetron sputtering at room temperature, J. Non. Cryst. Solids. 352 (2006) 1466–1470. https://doi.org/10.1016/j.jnoncrysol.2005.11.088.

34. M.-T. Le, Y.-U. Sohn, J.-W. Lim, G.-S. Choi, Effect of Sputtering Power on the Nucleation and Growth of Cu Films Deposited by Magnetron Sputtering, Mater. Trans. 51 (2010) 116–120. https://doi.org/10.2320/matertrans.M2009183.

35. X. Yu, J. Ma, F. Ji, Y. Wang, X. Zhang, C. Cheng, H. Ma, Effects of sputtering power on the properties of ZnO:Ga films deposited by r.f. magnetron-sputtering at low temperature, J. Cryst. Growth. 274 (2005) 474–479. https://doi.org/10.1016/j.jcrysgro.2004.10.037.

36. Gencoa Sputter Rate Calculator, (n.d.). https://www.gencoa.com/customers/apps/sputtercalc/index.php (accessed August 14, 2025).

37. M. Xu, P. Watanachaturaporn, P.K. Varshney, M.K. Arora, Decision tree regression for soft classification of remote sensing data, Remote Sens. Environ. 97 (2005) 322–336. https://doi.org/10.1016/j.rse.2005.05.008.

38. S.J. Rigatti, Random Forest, J. Insur. Med. 47 (2017) 31–39. https://doi.org/10.17849/insm-47-01-31-39.1.

39. H. Drucker, C. Cortes, Boosting Decision Trees, in: D. Touretzky, M.C. Mozer, M. Hasselmo (Eds.), Adv. Neural Inf. Process. Syst., MIT Press, 1995.

40. T. Chen, C. Guestrin, XGBoost: A Scalable Tree Boosting System, in: Proc. 22nd ACM SIGKDD Int. Conf. Knowl. Discov. Data Min., Association for Computing Machinery, New York, NY, USA, 2016: pp. 785–794. https://doi.org/10.1145/2939672.2939785.

41. M. Awad, R. Khanna, Support Vector Regression, in: Effic. Learn. Mach., Apress, Berkeley, CA, 2015: pp. 67–80. https://doi.org/10.1007/978-1-4302-5990-9_4.

42. Z. Zhao, L. Alzubaidi, J. Zhang, Y. Duan, Y. Gu, A comparison review of transfer learning and self-supervised learning: Definitions, applications, advantages and limitations, Expert Syst. Appl. 242 (2024) 122807. https://doi.org/10.1016/j.eswa.2023.122807.

43. S. Nembrini, I.R. König, M.N. Wright, The revival of the Gini importance?, Bioinformatics. 34 (2018) 3711–3718. https://doi.org/10.1093/bioinformatics/bty373.

44. J. Lee, C. Yang, Deep neural network and meta-learning-based reactive sputtering with small data sample counts, J. Manuf. Syst. 62 (2022) 703–717. https://doi.org/10.1016/j.jmsy.2022.02.004.

45. A. Salimian, E. Haine, C. Pardo-Sanchez, A. Hasnath, H. Upadhyaya, Implementing Supervised and Unsupervised Deep-Learning Methods to Predict Sputtering Plasma Features, a Step toward Digitizing Sputter Deposition of Thin Films, Coatings. 12 (2022) 953. https://doi.org/10.3390/coatings12070953.

46. Y. Chen, J. Luo, W. Lei, Y. Shen, S. Cao, Analysis and prediction of sputtering yield using combined hierarchical clustering analysis and artificial neural network algorithms, Plasma Sci. Technol. 26 (2024) 115504. https://doi.org/10.1088/2058-6272/ad709c.
