Closed-Loop Workflow of High-Entropy Materials Discovery: Efficient and Accurate Synthesizability Prediction via Domain-Specific Local LLMs
Yeongjun Yoona, Geun Ho Gub, and Kyeounghak Kima,c*
a Department of Chemical Engineering, Hanyang University, 222 Wangsimni-ro, Seongdong-gu, Seoul, 04763, South Korea
bDepartment of Energy Engineering, Korea Institute of Energy Technology (KENTECH), Naju 58330, Republic of Korea
c Clean-Energy Research Center, Hanyang University, 222 Wangsimni-ro, Seongdong-gu, Seoul, 04763, South Korea
*Corresponding author: Kyeounghak Kim, chemekim@hanyang.ac.kr
Abstract
High-entropy materials (HEMs) offer unprecedented opportunities for superior mechanical, thermal, and catalytic properties, but their vast chemical space makes experimental discovery resource-intensive. State-of-the-art commercial large language models (LLMs) notably fail at HEM synthesizability prediction, a critical bottleneck in materials development. We demonstrate that domain-specific fine-tuning transforms open-weight local LLMs into accurate predictors. Using a dataset of 321,083 inorganic compositions with 2,560 HEM examples, we fine-tuned three 4-bit-quantized models (gpt-oss-20b, Qwen3-14b, and DeepSeek-R1-Distill-Qwen-14b), achieving remarkable balanced accuracy of 0.957, 0.961, and 0.956, respectively. Critically, these models operate efficiently on accessible hardware (< 15GB VRAM), eliminating costly API dependencies while ensuring data privacy and consistent reproducibility. This work could open new pathways toward autonomous closed-loop discovery, where distributed local models enable rapid screening and iterative improvement through experimental feedback. Future collaborative efforts in open data sharing, particularly including negative results, would address current fragmentation in synthesis reporting and accelerate community-wide HEM discovery.
Graphical Abstract
Keywords:
High-entropy material (HEM)
large language model (LLM)
AI
Prediction
Introduction
High-entropy materials (HEMs), single-phase crystalline solid solutions incorporating multiple principal elements, represent a paradigm shift in materials design. Encompassing classes such as metal alloys,1–5 borides,6–8 carbides,9–11 nitrides,12–14 oxides,15–17 and sulfides,18–20 these HEMs exhibit vast compositional flexibility. This enables the tuning of exceptional mechanical, electrochemical, and catalytic properties, driven by mechanisms such as cocktail effects, localized lattice distortion, and entropy-driven phase stabilization.21,22
However, the rational design and screening of new HEMs remain a formidable challenge. The combinatorial complexity of their chemical space makes exhaustive experimental screening intractable. Consequently, discovery has historically relied on empirical or semi-empirical guidelines like the Hume-Rothery rules.23,24 Yet these rules, developed for binary or ternary alloys, frequently fall short for HEMs, failing to capture the complex multi-body interactions and thermodynamic competition in compositionally complex systems.25 Traditional machine learning (ML) approaches have shown promise, but they often depend on computationally expensive, hand-crafted descriptors (e.g., from DFT) and are severely constrained by the scarcity of reliable, labeled experimental data. This scarcity is exacerbated by systematic reporting bias: successful syntheses are published, while failed attempts (valuable negative data) are rarely reported, leading to biased models.26–30
Recently, Large Language Models (LLMs) emerged as a new, descriptor-free avenue. Kim et al. demonstrated that fine-tuned LLMs could predict the synthesizability of general inorganic materials.31 Inspired by this, we sought to address a more complex challenge: extending this LLM-based approach specifically to HEMs, a domain defined by a much vaster compositional space and more nuanced formation rules. Furthermore, we aimed to solve the critical practical bottlenecks of accessibility, cost, and data privacy associated with proprietary, API-based commercial models.
Results and Discussion
Fig. 1
Schematic illustration of the workflow for fine-tuning domain-specific LLMs to predict the synthesizability of HEMs.
Our approach leverages domain-specific fine-tuning of open-weight LLMs to create an efficient, locally deployable system for HEM synthesizability prediction (Fig. 1). Details of our methods are provided in the Supporting Information (SI). For this purpose, our initial dataset comprised 393,053 unique inorganic compositions obtained from the previous study by Kim et al.,31 which integrated data from the Materials Project32 and the Open Quantum Materials Database33 (OQMD) retrieved in February 2020. Among these, 40,817 compounds with Inorganic Crystal Structure Database (ICSD) references were labeled as positive (P, synthesized), while the remaining 352,236 were initially designated as unlabeled (U, hypothesized).
To address the label noise inherent in Positive-Unlabeled (PU) learning, we rigorously screened the unlabeled dataset using the Stoichiometric Crystal Graph Neural Fingerprint (stoi-CGNF) framework,34 rather than naively treating all unreported data as unsynthesizable (N). This process filtered out "potential hidden positives" to define a set of "Reliable Negatives" (RN), creating a robust Positive-Negative (PN) dataset essential for learning accurate high-entropy compositional rules. Detailed procedures for this data curation strategy are provided in Supporting Information.
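As a minimal illustration of the reliable-negative selection described above, the unlabeled set can be split by a synthesizability score such as one produced by the stoi-CGNF model. The scoring values and the 0.5 cutoff below are placeholders for illustration only; the study's actual screening procedure is described in its Supporting Information.

```python
def select_reliable_negatives(unlabeled, scores, threshold=0.5):
    """Split unlabeled compositions into reliable negatives (RN) and
    potential hidden positives using a synthesizability score.

    Entries scoring at or above `threshold` are considered too plausible
    to label as negatives and are excluded from the RN set.
    Note: `threshold=0.5` is illustrative, not the paper's actual cutoff.
    """
    reliable_negatives, hidden_positives = [], []
    for comp, score in zip(unlabeled, scores):
        if score >= threshold:
            hidden_positives.append(comp)    # potential hidden positive
        else:
            reliable_negatives.append(comp)  # safe to label as N
    return reliable_negatives, hidden_positives


# Hypothetical compositions and scores, for demonstration only:
rn, hp = select_reliable_negatives(["A2B3", "CDEF"], [0.10, 0.90])
# rn == ["A2B3"], hp == ["CDEF"]
```

Only the RN set, together with the ICSD-referenced positives, enters the final Positive-Negative training data; the hidden-positive candidates are simply discarded rather than mislabeled.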
Furthermore, to enhance the representation of high-entropy materials (HEMs) in our dataset, we collected additional data through three complementary approaches: (1) database literature on high-entropy alloys and complex concentrated alloys,35 (2) systematic searches using Elsevier's Scopus APIs, and (3) manual curation. These sources contributed substantially to enriching our training set with 2,560 additional synthesized HEM examples.
To overcome the significant hardware barrier of deploying large LLMs, our strategy employs 4-bit quantization, a process schematically illustrated in Fig. 2a. We selected three open-weight models (gpt-oss-20b,36 Qwen3-14b,37 and DeepSeek-R1-Distill-Qwen-14b38,39). Even these relatively modest-sized LLMs pose significant computational challenges, requiring GPU memory that exceeds the resources available to most researchers in typical laboratory settings. We therefore applied 4-bit quantization to compress each model's high-precision floating-point (FP) parameters, dramatically reducing both memory footprint and memory traffic and enabling deployment on accessible hardware such as consumer GPUs or cloud-based environments like Google Colaboratory.40 The quantized models require approximately 15GB of VRAM, making them practical for researchers without access to specialized computing infrastructure. See Supporting Information for detailed model architectures.
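For readers wishing to reproduce the deployment setup, a minimal sketch (not executed here) of loading a 4-bit-quantized model with Hugging Face transformers and bitsandbytes follows. The model identifier and quantization settings are illustrative assumptions; the paper's exact configuration is given in its Supporting Information.

```python
# Sketch only: requires a GPU, the `transformers` and `bitsandbytes`
# packages, and downloads the model weights on first use.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NormalFloat4, standard for QLoRA
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute dtype; values illustrative
    bnb_4bit_use_double_quant=True,         # quantize the quantization constants
)

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-14B",                       # one of the three open-weight models
    quantization_config=bnb_config,
    device_map="auto",                      # place layers on available GPU(s)
)
```

With this configuration, a 14B-parameter model fits in roughly the ~15GB of VRAM quoted above, rather than the ~28GB needed for 16-bit weights alone.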
To evaluate model performance, we employed the confusion matrix framework (Fig. 2b), treating the task as binary classification on the constructed Positive-Negative (PN) dataset. From this, we calculated the True Positive Rate (TPR = TP/(TP + FN)), which quantifies the model's ability to correctly identify synthesizable candidates. Critically, we also calculated the True Negative Rate (TNR = TN/(TN + FP)), evaluated against the Reliable Negative (RN) set, which measures the model's effectiveness in excluding chemically non-viable compositions.
Fig. 2 (a) Schematic illustration of 4-bit quantization of LLMs. (b) The confusion matrix and the considered performance metrics. TPR, TNR, and bAcc indicate true positive rate, true negative rate, and balanced accuracy, respectively. (c) Performance of commercial LLMs for synthesizability prediction.
For an experimental screening tool, TNR is the most important metric. A model with a high TPR but a low TNR is practically useless: it would recommend synthesizing nearly everything, incurring close to the full experimental cost. A high TNR is what provides economic value, saving experimental resources by confidently filtering out non-viable candidates. The balanced accuracy (bAcc = (TPR + TNR)/2) provides a single, unskewed metric that balances both goals, where 0.5 corresponds to random guessing.
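These three metrics follow directly from confusion-matrix counts. The sketch below implements the definitions above; the example counts are chosen to reproduce the rates reported later for gpt-5.1 (TPR 0.79, TNR 0.31) purely as a worked illustration.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute TPR, TNR, and balanced accuracy from confusion-matrix counts."""
    tpr = tp / (tp + fn)       # sensitivity: synthesizable (P) correctly found
    tnr = tn / (tn + fp)       # specificity: reliable negatives (RN) excluded
    bacc = (tpr + tnr) / 2     # balanced accuracy; 0.5 = random guessing
    return tpr, tnr, bacc


# Illustrative counts matching gpt-5.1's reported rates:
tpr, tnr, bacc = classification_metrics(tp=79, fn=21, tn=31, fp=69)
# TPR 0.79, TNR 0.31, bAcc ~ 0.55: high sensitivity, poor specificity,
# near-chance balanced accuracy.
```

This makes the screening argument concrete: a model can look strong on TPR alone while its bAcc reveals near-random discrimination.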
First, we established a performance baseline by evaluating state-of-the-art commercial LLMs accessible via API as of December 3, 2025. The assessment included the latest models from OpenAI (gpt-5.1, gpt-5-mini, gpt-4.1, gpt-4.1-mini), Google (gemini-3-pro-preview, gemini-2.5-pro, gemini-2.5-flash), and Anthropic (claude-opus-4.5, claude-sonnet-4.5, claude-haiku-4.5). Due to the significant costs and rate limits associated with commercial APIs, we conducted the evaluation on a stratified subset of 10,000 compositions randomly sampled from our Positive-Negative (PN) dataset. The system prompt defined the role and output constraints: "You are a materials science assistant. Given a chemical composition, answer only with 'P' (synthesizable/positive) or 'N' (non-synthesizable/negative)." Correspondingly, each user query was formatted as: "Is the material {composition} likely synthesizable? Answer with P (positive) or N (negative)."
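A minimal sketch of assembling these prompts in the widely used chat-message format follows. The prompt strings are quoted from the text above; the system/user message schema is an assumption about the API wrapper, since the paper does not show its request code.

```python
SYSTEM_PROMPT = (
    "You are a materials science assistant. Given a chemical composition, "
    "answer only with 'P' (synthesizable/positive) or "
    "'N' (non-synthesizable/negative)."
)


def build_messages(composition):
    """Assemble one chat-style request for a single composition.

    Uses the prompts quoted in the text; the list-of-dicts message
    format is the common OpenAI-style convention (an assumption here).
    """
    user = (
        f"Is the material {composition} likely synthesizable? "
        "Answer with P (positive) or N (negative)."
    )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user},
    ]


messages = build_messages("CoCrFeMnNi")  # e.g., the Cantor alloy composition
```

Constraining the output to a single token ('P' or 'N') keeps API costs low and makes parsing the 10,000 responses trivial.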
The results revealed that, with the notable exception of Google's gemini-3-pro-preview, most commercial LLMs exhibited severely unbalanced performance between TPR and TNR, resulting in low balanced accuracy (bAcc) (Fig. 2c). gemini-3-pro-preview, regarded as a frontier model based on its performance on the GPQA-Diamond benchmark41 (a challenging set of 448 multiple-choice questions written by domain experts in biology, physics, and chemistry), achieved a remarkable bAcc of 0.82 with balanced TPR and TNR, but other frontier models struggled to establish a reliable decision boundary. For instance, gpt-5.1 achieved a high TPR of 0.79 but a critically low TNR of 0.31, indicating a persistent bias toward classifying materials as synthesizable. Conversely, claude-sonnet-4.5 showed a high TNR but a poor TPR. Consequently, excluding gemini-3-pro-preview, most commercial models showed bAcc values hovering near or slightly above the random-guessing threshold of 0.5.
This pattern of inadequate discrimination is closely mirrored in our 4-bit quantized base open-weight models (gpt-oss-20b, Qwen3-14b, and DeepSeek-R1-Distill-Qwen-14B) (Figure S1). Specifically, gpt-oss-20b exhibited an extreme negative bias (TPR = 0.00, TNR = 1.00), effectively classifying all candidates as non-synthesizable, whereas DeepSeek-R1-Distill-Qwen-14B (r1-Qwen2.5-14b) showed the opposite failure mode with a severe positive bias (TPR = 0.95, TNR = 0.06). Consequently, all base models hovered near the random-guessing threshold, with balanced accuracies of approximately 0.50 to 0.61. This demonstrates that, despite the advancements in some frontier foundation models like gemini-3-pro-preview, the majority of base LLMs lack the inherent domain-specific knowledge to distinguish chemically non-viable (N) from synthesizable (P) materials without fine-tuning.
To address this fundamental knowledge gap, we employed Quantized Low-Rank Adaptation42 (QLoRA) to fine-tune our open-weight local LLMs on the HEM-enriched positive/negative (P/N) labeled dataset (Figs. 1a and 3a). In QLoRA, trainable low-rank adapter modules, representing less than 1% of total model parameters, are introduced into the frozen 4-bit quantized base model. This approach proved remarkably efficient, requiring only about 9 hours per epoch on a single consumer GPU (NVIDIA RTX A6000) while keeping the model's general chemical knowledge intact. Importantly, the low-rank constraint acts as a natural regularization mechanism that prevents overfitting to dataset artifacts and instead focuses learning on the discriminative features essential for HEM synthesizability prediction. Details of the fine-tuning methodology and the evolution of performance metrics are provided in the Supporting Information.
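A minimal sketch (not executed here) of attaching such low-rank adapters with the Hugging Face peft library follows. The rank, alpha, dropout, and target-module names are common illustrative defaults, not the study's reported hyperparameters, which are given in its Supporting Information.

```python
# Sketch only: assumes `model` is a 4-bit quantized causal LM already
# loaded via transformers + bitsandbytes, and that `peft` is installed.
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

lora_config = LoraConfig(
    r=16,                    # adapter rank; keeps trainable params well under 1%
    lora_alpha=32,           # scaling factor for the adapter updates
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention proj.
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = prepare_model_for_kbit_training(model)  # cast norms, enable grads
model = get_peft_model(model, lora_config)      # freeze base, add adapters
model.print_trainable_parameters()              # confirms the <1% footprint
```

Because only the small adapter matrices receive gradients, the optimizer state stays tiny, which is what makes the reported single-GPU, ~9-hours-per-epoch training practical.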
This QLoRA-based fine-tuning dramatically transformed model performance across all three architectures. For gpt-oss-20b, the TPR surged from 0.00 to 0.98 (Fig. 3b), overcoming its initial inability to identify synthesizable materials, while achieving a high TNR of 0.94 (Fig. 3c). Conversely, for r1-Qwen2.5-14b, which initially suffered from a severely deficient TNR, the TNR improved drastically from 0.06 to 0.94, while retaining a high TPR of 0.98. Qwen3-14b also showed substantial gains, evolving from a biased state to a balanced predictor with a TPR and TNR of 0.96. The transformative impact of fine-tuning becomes clear when examining balanced accuracy (bAcc) (Fig. 3d). The base models achieved bAcc values near or slightly above 0.50 (red dashed line) because their extreme biases, whether positive or negative, averaged out to chance-level performance. This systematic failure to discriminate between synthesizable (P) and non-synthesizable (N) materials made them impractical for screening. After fine-tuning, however, all models achieved remarkable bAccs of 0.96, demonstrating robust discriminative capability.
Fig. 3
(a) Schematic illustration of QLoRA process of local LLMs. (b-d) Performance metrics for synthesizability prediction comparing base models with QLoRA-adapted fine-tuned (FT) versions across the local LLMs. (b) True positive rate (TPR), (c) True negative rate (TNR), and (d) balanced accuracy (bAcc).
This result represents a fundamental transformation: from indiscriminate positive classification to genuine discriminative capability. The dramatically improved TNR enables these models to effectively filter out chemically non-viable compositions, steering experimental efforts away from materials unlikely to form stable phases. Most importantly for our primary objective, these fine-tuned open-weight models demonstrated exceptional performance specifically on HEM compositions (Fig. 4). To establish a rigorous baseline, we selected the single best-performing model from each commercial provider—OpenAI, Google, and Anthropic—based on the highest balanced accuracy (bAcc) achieved in the general screening task (Fig. 2c). Accordingly, gpt-4.1, gemini-3-pro-preview, and claude-sonnet-4.5 were chosen as the representative benchmarks. When evaluated on the HEM subset, our fine-tuned models achieved TPR values of 0.973, 0.931, and 0.973 for gpt-oss-20b, Qwen3-14B, and r1-Qwen2.5-14b, respectively. Notably, both gpt-oss-20b (TPR = 0.973) and r1-Qwen2.5-14b (TPR = 0.973) outperformed even the top-tier commercial model, Google's gemini-3-pro-preview (TPR = 0.930), while Qwen3-14b achieved comparable performance. These near-perfect scores indicate that our models can identify virtually all synthesizable high-entropy materials. This remarkable accuracy on the most challenging multi-component systems validates our approach of combining general materials data with targeted HEM examples during training.
Fig. 4
True positive rate (TPR) on HEM compositions for commercial models (pink) and fine-tuned open-weight models (green).
Beyond prediction, our work establishes a practical pathway for autonomous, closed-loop workflows (Fig. 5a). A locally deployable model, running on a lab's own hardware, can screen millions of de novo candidates, prioritizing a small set of high-probability candidates for synthesis. The experimental outcomes are then fed back to iteratively retrain and improve the model. The true power of this loop lies in the "failures": a composition predicted as 'P' but found to fail in synthesis (a false positive, FP) is not an error but rather the most valuable data point. It provides the verified negative data that is currently missing from the literature, allowing the model to continuously correct its biases. This local deployment model also ensures that laboratories maintain complete data sovereignty over their novel compositions and experimental results.
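The feedback step of this loop reduces to a simple labeling rule, sketched below with hypothetical compositions; how exactly retraining is triggered is not specified in the text and would be a design choice of each laboratory.

```python
def fold_in_lab_results(positives, negatives, lab_results):
    """Update the labeled training set with experimental outcomes.

    `positives` / `negatives` are sets of composition strings;
    `lab_results` maps composition -> True (synthesis succeeded) or
    False (synthesis failed). A composition the model flagged 'P' that
    then fails in the lab becomes a verified negative: exactly the data
    class largely missing from the published literature.
    """
    for comp, synthesized in lab_results.items():
        if synthesized:
            positives.add(comp)
        else:
            negatives.add(comp)
    return positives, negatives


# Hypothetical outcomes from one screening round:
pos, neg = fold_in_lab_results(
    positives=set(),
    negatives=set(),
    lab_results={"AlCoCrFeNi": True, "XxYyZz": False},  # placeholder names
)
```

Each retraining round then runs QLoRA fine-tuning again on the enlarged P/N set, so every failed synthesis directly sharpens the model's decision boundary.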
Fig. 5
Schematic illustration of (a) closed-loop workflow of domain-specific open-weight LLM-accelerated HEMs discovery and (b) collaborative framework with multiple parallel closed-loop workflows sharing data through a centralized HEMs open data repository, enabling accelerated HEMs discovery through distributed learning and knowledge sharing.
Transformative potential emerges through collaborative frameworks (Fig. 5b). Multiple research groups operating parallel workflows could contribute to a centralized open HEM data repository, pooling successes and failures across diverse synthesis conditions and methods. This addresses a critical weakness of the field: fragmented reporting, in which negative results are often discarded and positive results often lack reproducibility details. Shared learning at this scale would accelerate discovery beyond what any single laboratory could achieve. Our approach democratizes advanced materials prediction by enabling any research group with standard GPU hardware to deploy these models and participate in collective discovery. As HEM research expands into unexplored compositional territories, this distributed framework offers a viable path through the combinatorial explosion of multi-component chemical space. These models establish a foundation for a new paradigm that combines local computation, global collaboration, and continuous learning to accelerate materials discovery.
In conclusion, we have demonstrated that domain-specific fine-tuning, when combined with a targeted, data-centric enrichment strategy, transforms modest, 4-bit-quantized open-weight LLMs from non-functional predictors into powerful and accurate tools for HEM synthesizability prediction. Our fine-tuned models achieve balanced accuracy (bAcc) exceeding 0.96 and near-perfect TPR (> 0.97) for HEM compositions, all while running on accessible consumer-grade hardware. This work proves that the future of accelerated materials discovery lies not exclusively in massive, proprietary models, but in specialized, accessible, and collaborative tools. We provide a practical foundation for a new discovery paradigm that combines local computation, global data sharing, and continuous learning from both successes and failures.
Code availability
The fine-tuned models used in this study are available at: https://huggingface.co/collections/evenfarther/synthesizability-pn-prediction-balance-tpr-tnr
Author Contribution
K. K. and Y. Y. designed the research framework and drafted the manuscript. Y. Y. performed all simulations and developed LLM models. K. K. supervised the research. G. G. provided feedback on the manuscript.
Competing interests
The authors declare no competing interests.
Acknowledgement
This research was supported by the Nano & Material Technology Development Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science and ICT (RS-2024-00448287), and by the Hyundai Motor Chung Mong-Koo Foundation.
Funding
Open Access funding enabled and organized by the "Regional Innovation System & Education (RISE)" program through the Seoul RISE Center, funded by the Ministry of Education (MOE) and the Seoul Metropolitan Government (2025-RISE-01-027-04).
Electronic Supplementary Material
Below is the link to the electronic supplementary material
Data Availability
The fine-tuned models used in this study are available at: https://huggingface.co/collections/evenfarther/synthesizability-PN-prediction-balance-tpr-tnr.
References
1.
Zhang, Y.; Yang, X.; Liaw, P. K. Alloy Design and Properties Optimization of High-Entropy Alloys. JOM 2012, 64, 830–838.
2.
Yang, X.; Chen, S. Y.; Cotton, J. D.; Zhang, Y. Phase stability of low-density, multiprincipal component alloys containing aluminum, magnesium, and lithium. JOM 2014, 66, 2009–2020.
3.
Takeuchi, A.; Amiya, K.; Wada, T.; Yubuta, K.; Zhang, W. High-entropy alloys with a hexagonal close-packed structure designed by equi-atomic alloy strategy and binary phase diagrams. JOM 2014, 66, 1984–1992.
4.
Laws, K. J.; Crosby, C.; Sridhar, A.; Conway, P.; Koloadin, L. S.; Zhao, M.; Aron-Dine, S.; Bassman, L. C. High entropy brasses and bronzes – Microstructure, phase evolution and properties. J. Alloys Compd. 2015, 650, 949–961.
5.
Sohn, S.; Liu, Y.; Liu, J.; Gong, P.; Prades-Rodel, S.; Blatter, A.; Scanley, B. E.; Broadbridge, C. C.; Schroers, J. Noble metal high entropy alloys. Scr. Mater. 2017, 126, 29–32.
6.
Gild, J.; Zhang, Y.; Harrington, T.; Jiang, S.; Hu, T.; Quinn, M. C.; Mellor, W. M.; Zhou, N.; Vecchio, K.; Luo, J. High-entropy metal diborides: a new class of high-entropy materials and a new type of ultrahigh temperature ceramics. Sci. Rep. 2016, 6, 37946.
7.
Rosenberg, A. A.; Lintz, D. T.; Li, J.; Zhang, Y.; Doane, J. T.; Bristol, M. N.; Kolakji, A.; Wang, T.; Yeung, M. T. Tailoring high-entropy borides for hydrogenation: crystal morphology and catalytic pathways. Inorg. Chem. Front. 2025, 12, 4828–4834.
8.
Wang, X.; Zuo, Y.; Horta, S.; He, R.; Yang, L.; Moghaddam, A. O.; Ibáñez, M.; Qi, X.; Cabot, A. CoFeNiMnZnB as a high-entropy metal boride to boost the oxygen evolution reaction. ACS Appl. Mater. Interfaces 2022, 14, 48212–48219.
9.
Sarker, P.; Harrington, T.; Toher, C.; Oses, C.; Samiee, M.; Maria, J.-P.; Brenner, D. W.; Vecchio, K. S.; Curtarolo, S. High-entropy high-hardness metal carbides discovered by entropy descriptors. Nat. Commun. 2018, 9, 4980.
10.
Li, Y.; Zhao, S.; Wu, Z. Uncovering the effects of chemical disorder on the irradiation resistance of high-entropy carbide ceramics. Acta Mater. 2024, 277, 120187.
11.
Hossain, M. D.; Borman, T.; Oses, C.; Esters, M.; Toher, C.; Feng, L.; Kumar, A.; Fahrenholtz, W. G.; Curtarolo, S.; Brenner, D. W.; LeBeau, J. M.; Maria J.-P. Entropy landscaping of high-entropy carbides. Adv. Mater. 2021, 33, 2102904.
12.
Moskovskikh, D.; Vorotilo, S.; Buinevich, V.; Sedegov, A.; Kuskov, K.; Khort, A.; Shuck, C.; Zhukovskyi, M.; Mukasyan, A. Extremely hard and tough high entropy nitride ceramics. Sci. Rep. 2020, 10, 19874.
13.
Hei, J.; Wang, N.; Jing, R.; Chen, X.; Yin, X.; Li, J.; Zuo, P.; Yin, Y.; Cui, L. High-entropy nitrides as superior electrocatalysts: unveiling the role of entropy in enhanced performance. Chem. Eur. J. 2025, 31, e202500039.
14.
Li, J.; Chen, Y.; Zhao, Y.; Shi, X.; Wang, S.; Zhang, S. Super-hard (MoSiTiVZr)Nₓ high-entropy nitride coatings. J. Alloys Compd. 2022, 926, 166807.
15.
Qian, F.; Cao, D.; Chen, S.; Yuan, Y.; Chen, K.; Chimtali, P. J.; Liu, H.; Jiang, W.; Sheng, B.; Yi, L.; Huang, J.; Hu, C.; Lei, H.; Wu, X.; Wen, Z.; Chen, Q.; Song, L. High-entropy RuO₂ catalyst with dual-site oxide path for durable acidic oxygen evolution reaction. Nat. Commun. 2025, 16, 6894.
16.
Bao, W.; Shen, H.; Zhang, Y.; Qian, C.; Zeng, G.; Jing, K.; Cui, D.; Xia, J.; Liu, H.; Guo, C.; Yu, F.; Sun, K.; Li, J. High-entropy oxides for energy storage and conversion. J. Mater. Chem. A 2024, 12, 23179–23201.
17.
Iwase, K.; Honma, I. High-entropy spinel oxide nanoparticles synthesized via supercritical hydrothermal processing as oxygen evolution electrocatalysts. ACS Appl. Energy Mater. 2022, 5, 9292–9296.
18.
Lin, L.; Wang, K.; Sarkar, A.; Njel, C.; Karkera, G.; Wang, Q.; Azmi, R.; Fichtner, M.; Hahn, H.; Schweidler, S.; Breitung, B. High-entropy sulfides as electrode materials for Li-ion batteries. Adv. Energy Mater. 2022, 12, 2103090.
19.
Gong, L.; Zhang, W.; Zhuang, Y.; Zhang, K.; Zhao, Q.; Xiao, D.; Liu, S.; Liu, Z.; Zhang, Y. High-entropy sulfides as electrode materials for Li-ion batteries. ACS Appl. Mater. Interfaces 2024, 16, 66211–66218.
20.
Zhang, F.; Gao, T.; Zhang, Y.; Sun, K.; Qu, X.; Luo, Y.; Song, Y.; Fang, F.; Sun, D.; Wang, F.; Liu, Y. High-entropy metal sulfide nanocrystal libraries for highly reversible sodium storage. Adv. Mater. 2025, 37, 2418890.
21.
Schweidler, S.; Botros, M.; Strauss, F.; Wang, Q.; Ma, Y.; Velasco, L.; Marques, G. C.; Sarkar, A.; Kübel, C.; Hahn, H.; Aghassi-Hagmann, J.; Brezesinski, T.; Breitung, B. High-entropy materials for energy and electronic applications. Nat. Rev. Mater. 2024, 9, 266–281.
22.
Gu, X.; Guo, X.-B.; Li, W.-H.; Jiang, Y.-P.; Liu, Q.-X.; Tang, X.-G. High-entropy materials for application: electricity, magnetism, and optics. ACS Appl. Mater. Interfaces 2024, 16, 53372–53392.
23.
Hume-Rothery, W.; Powell, H. M. On the theory of super-lattice structures in alloys. Z. Kristallogr. Cryst. Mater. 1935, 91, 23–47.
24.
Mizutani, U. Hume-Rothery Rules for Structurally Complex Alloy Phases; CRC Press: Boca Raton, FL, 2016.
25.
Otto, F.; Yang, Y.; Bei, H.; George, E. P. Relative effects of enthalpy and entropy on the phase stability of equiatomic high-entropy alloys. Acta Mater. 2013, 61, 2628–2638.
26.
Rao, Z.; Tung, P.-Y.; Xie, R.; Wei, Y.; Zhang, H.; Ferrari, A.; Klaver, T. P. C.; Körmann, F.; Sukumar, P. T.; da Silva, A. K.; Chen, Y.; Li, Z.; Ponge, D.; Neugebauer, J.; Gutfleisch, O.; Bauer, S.; Raabe, D. Machine learning–enabled high-entropy alloy discovery. Science 2022, 378, 78–85.
27.
Wen, C.; Zhang, Y.; Wang, C.; Xue, D.; Bai, Y.; Antonov, S.; Dai, L.; Lookman, T.; Su, Y. Machine learning assisted design of high entropy alloys with desired property. Acta Mater. 2019, 170, 109–117.
28.
Zhou, Z.; Zhou, Y.; He, Q.; Ding, Z.; Li, F.; Yang, Y. Machine learning guided appraisal and exploration of phase design for high entropy alloys. npj Comput. Mater. 2019, 5, 128.
29.
Rickman, J. M.; Chan, H. M.; Harmer, M. P.; Smeltzer, J. A.; Marvel, C. J.; Roy, A.; Balasubramanian, G. Materials informatics for the screening of multi-principal elements and high-entropy alloys. Nat. Commun. 2019, 10, 2618.
30.
Liu, X.; Zhang, J.; Pei, Z. Machine learning for high-entropy alloys: progress, challenges and opportunities. Prog. Mater. Sci. 2023, 131, 101018.
31.
Kim, S.; Jung, Y.; Schrier, J. Large Language Models for Inorganic Synthesis Predictions. J. Am. Chem. Soc. 2024, 146, 19654–19659.
32.
Jain, A.; Ong, S. P.; Hautier, G.; Chen, W.; Richards, W. D.; Dacek, S.; Cholia, S.; Gunter, D.; Skinner, D.; Ceder, G.; Persson, K. A. Commentary: The Materials Project: A materials genome approach to accelerating materials innovation. APL Mater. 2013, 1, 011002.
33.
Kirklin, S.; Saal, J. E.; Meredig, B.; Thompson, A.; Doak, J. W.; Aykol, M.; Rühl, S.; Wolverton, C. The Open Quantum Materials Database (OQMD): assessing the accuracy of DFT formation energies. npj Comput. Mater. 2015, 1, 15010.
34.
Jang, J.; Noh, J.; Zhou, L.; Gu, G. H.; Gregoire, J. M.; Jung, Y. Synthesizability of materials stoichiometry using semi-supervised learning. Matter 2024, 7, 2294–2312.
35.
Gorsse, S.; Nguyen, M. H.; Senkov, O. N.; Miracle, D. B. Database on the mechanical properties of high entropy alloys and complex concentrated alloys. Data in Brief 2018, 21, 2664–2678.
36.
OpenAI; Agarwal, S.; Ahmad, L.; Ai, J.; Altman, S.; Applebaum, A.; Arbus, E.; Arora, R. K.; Bai, Y.; Baker, B.; et al. gpt-oss-120b & gpt-oss-20b Model Card. arXiv 2025, arXiv:2508.10925.
37.
Yang, A.; Li, A.; Yang, B.; Zhang, B.; Hui, B.; Zheng, B.; Yu, B.; Gao, C.; Huang, C.; Lv, C.; et al. Qwen3 Technical Report. arXiv 2025, arXiv:2505.09388.
38.
Guo, D.; Yang, D.; Zhang, H.; et al. DeepSeek-R1 incentivizes reasoning in LLMs through reinforcement learning. Nature 2025, 645, 633–638.
39.
Yang, A.; Yu, B.; Li, C.; Liu, D.; Huang, F.; Huang, H.; Jiang, J.; Tu, J.; Zhang, J.; Zhou, J.; et al. Qwen2.5-1M Technical Report. arXiv 2025, arXiv:2501.15383.
40.
Google. Google Colaboratory. https://colab.research.google.com/ (accessed 2025-12-05)
41.
Rein, D.; Hou, B. L.; Stickland, A. C.; Petty, J.; Pang, R. Y.; Dirani, J.; Michael, J.; Bowman, S. R. GPQA: A graduate-level Google-proof Q&A benchmark. arXiv 2023, arXiv:2311.12022.
42.
Dettmers, T.; Pagnoni, A.; Holtzman, A.; Zettlemoyer, L. QLoRA: Efficient Finetuning of Quantized LLMs. arXiv 2023, arXiv:2305.14314.