Improving nuclear mass predictions by correcting mass residuals using eXtreme Gradient Boosting


X. Y. Zhang, W. F. Li and J. Y. Fang. Improving nuclear mass predictions by correcting mass residuals using eXtreme Gradient Boosting[J]. Chinese Physics C. doi: 10.1088/1674-1137/ae25cd
Received: 2025-08-30

    Corresponding author: J. Y. Fang, jiyufang@aust.edu.cn
  • 1. Key Laboratory of Functional Materials and Devices for Informatics of Anhui Educational Institutions, Fuyang Normal University, Fuyang 236037, China
  • 2. School of Physics, Anhui University, Hefei 230601, China
  • 3. School of Mechanics and Photoelectric Physics, Anhui University of Science and Technology, Huainan 232001, China

Abstract: Nuclear masses are investigated for the first time using the eXtreme Gradient Boosting (XGBoost) method. Nucleon numbers, valence nucleon numbers, and physical quantities related to the magic numbers are used as input features for the decision trees, which learn the residuals of the experimental binding energies with respect to the predictions of the Bethe-Weizsäcker (BW2) formula; with this scheme, the XGBoost method achieves highly accurate predictions of the nuclear binding energy. For the masses of magic-number nuclei, which are challenging to predict, XGBoost captures the physical information associated with the magic numbers better than BW2, reducing the root-mean-square deviation of the predicted masses from 2.769 MeV to 0.732 MeV. Comparing the results of BW2* and XGBoost* with the pseudo-experimental data of the Finite-Range Droplet Model (FRDM12) suggests that the XGBoost* method may have better extrapolation abilities.


    I.   INTRODUCTION
    • The mass of a nucleus is a fundamental physical quantity that encapsulates critical information about its structure [1, 2], such as shell effects [3] and nuclear deformation [4, 5]. It plays a crucial role in determining the parameters of nuclear effective interactions [6-8], which are essential for understanding the properties of nuclei and predicting their behavior in nuclear reactions and decays. The accurate prediction of nuclear masses is vital not only for nuclear physics but also for fields such as astrophysics [9, 10] and nuclear engineering [11], wherein the nuclear mass determines the path of the rapid neutron-capture process (r-process) as well as the relevant nuclear reaction and decay energies.

      Recent advances in nuclear mass measurements, driven by the development of radioactive ion beam facilities [12-14], have provided approximately 2,500 valuable experimental data points. However, the masses of many nuclei remain experimentally undetermined and unmeasurable, at least for the foreseeable future. Consequently, theoretical predictions of nuclear masses are urgently required. Several types of nuclear mass models have been developed to address this issue, including empirical formulas, macroscopic-microscopic models, and microscopic models. The Bethe-Weizsäcker (BW) formula [15, 16], one of the earliest models, is the best-known empirical formula for the nuclear mass. Although it provides a general approximation, its accuracy is limited to around 3 MeV. The inclusion of additional terms has improved the accuracy to approximately 1.5 MeV [17]; however, large deviations remain for nuclei near the magic numbers, highlighting the importance of incorporating microscopic effects. Macroscopic-microscopic models such as the finite-range droplet model (FRDM) [18] and the Weizsäcker-Skyrme (WS) model [19] can achieve accuracies of around 500 keV. Microscopic models based on non-relativistic [20-25] and relativistic [26-29] density functional theories offer comparable or higher precision, around 500 keV or less, and they are often considered superior in extrapolation ability [30, 31].

      With the development of artificial intelligence, many hot topics in the nuclear field have been investigated using machine learning methods [32, 33], such as β-decay half-lives [34-36], low-lying excitation spectra [37-39], and fission yields [40, 41], providing a large number of highly accurate physical inputs for the study of the r-process. Various machine learning methods, including radial basis functions (RBF) [42, 43], Bayesian neural networks (BNN) [44, 45], convolutional neural networks (CNN) [46], the Light Gradient Boosting Machine (LightGBM) [47], and kernel ridge regression (KRR) [48, 49], have been used to predict nuclear masses. These machine learning methods can capture the physics describing the nuclear mass well, generally achieving higher prediction accuracies than traditional theoretical nuclear mass models.

      A decision tree is a machine learning algorithm that represents the decision-making process as a tree structure and is widely applied to regression problems. Decision trees can deal with missing values and outliers in a data set and efficiently capture patterns and relationships in the data to make effective predictions. Because a single decision tree rarely meets practical needs, multiple decision trees are combined through ensemble learning, which comprises two strategies, i.e., boosting and bagging. Given their robustness to outliers, decision trees may be well suited to describing certain special nuclei, such as those close to the magic numbers.

      In this study, we use the eXtreme Gradient Boosting (XGBoost) method [50, 51] to investigate nuclear masses, focusing on the characterization of nuclear shell effects, which are difficult to describe with the BW2 empirical formula. Input features related to nuclear shell effects are carefully selected to learn the residuals of the experimental masses with respect to the BW2 predictions, and the ability of the XGBoost method to improve the traditional empirical formula is explored. The predictive ability of the XGBoost method is analyzed by comparing its results with experimental data as well as with those of other nuclear models, which can provide a reference for the study of nuclear properties using the XGBoost method. The details of the BW2 formula and the XGBoost method are given in Sec. II. The corresponding results are presented in Sec. III. Finally, the summary and perspectives are given in Sec. IV.


      II.   THEORETICAL FRAMEWORK
      • Let us start with the well-known empirical Bethe-Weizsäcker (BW2) formula [17] for nuclear binding energies, where the binding energies are given by

        $\begin{split} B = \;&p_1A+p_2A^{2/3}+p_3Z^2A^{-1/3}+p_4(N-Z)^2A^{-1} \\ &+p_5\delta A^{-1/2}+p_6Z^{4/3}A^{-1/3}+p_7|N-Z|A^{-1}\\ &+p_8(N-Z)^2A^{-4/3}+p_9A^{1/3}+p_{10}P+p_{11}P^2, \end{split}$

        (1)

        where $ p_1 $, $ p_2 $,..., $ p_{11} $ are the parameters, and their values (in MeV) obtained by fitting the experimental data of AME2020 [12] are $ 16.4881 $, $ -25.5527 $, $ -0.7612 $, $ -32.6042 $, $ 11.0820 $, $ 1.7035 $, $ -61.3448 $, $ 61.3608 $, $ 13.3012 $, $ -2.0069 $, and $ 0.1566 $, respectively. For the physical meaning and importance of each parameter, please refer to Ref. [17] for details. B represents the nuclear binding energy, and Z, N, and A represent the proton number, neutron number, and mass number, respectively. The quantities δ and P, which account for pairing and shell effects, are expressed as

        $ \delta = \frac{(-1)^Z+(-1)^N}{2}, \;\;P = \frac{\nu_p \nu_n}{\nu_p+\nu_n}. $

        (2)

        where $ \nu_p $ and $ \nu_n $ denote the distances of the proton and neutron numbers Z and N from the nearest magic numbers (8, 20, 28, 50, 82, 126, and 184).
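        As a concrete illustration, Eq. (1) together with the δ and P definitions of Eq. (2) can be transcribed directly into code. The sketch below uses the fitted parameter values quoted above; the function and variable names are illustrative rather than taken from the authors' code, and P is taken as 0 at doubly magic nuclei, where $ \nu_p = \nu_n = 0 $ would otherwise give a division by zero.

```python
# Sketch of the BW2 binding energy of Eq. (1), with the delta and P
# terms of Eq. (2), using the fitted parameter values quoted in the
# text (in MeV). Names are illustrative, not from the authors' code.
MAGIC = (8, 20, 28, 50, 82, 126, 184)

def nu(n):
    """Distance of nucleon number n from the nearest magic number."""
    return min(abs(n - m) for m in MAGIC)

def bw2_binding_energy(Z, N):
    """Binding energy B(Z, N) in MeV from the BW2 formula, Eq. (1)."""
    A = Z + N
    p = (16.4881, -25.5527, -0.7612, -32.6042, 11.0820,
         1.7035, -61.3448, 61.3608, 13.3012, -2.0069, 0.1566)
    delta = ((-1) ** Z + (-1) ** N) / 2           # pairing term of Eq. (2)
    nu_p, nu_n = nu(Z), nu(N)
    # Guard against nu_p = nu_n = 0 at doubly magic nuclei.
    P = nu_p * nu_n / (nu_p + nu_n) if (nu_p + nu_n) else 0.0
    return (p[0] * A + p[1] * A ** (2 / 3) + p[2] * Z ** 2 * A ** (-1 / 3)
            + p[3] * (N - Z) ** 2 / A + p[4] * delta * A ** (-0.5)
            + p[5] * Z ** (4 / 3) * A ** (-1 / 3) + p[6] * abs(N - Z) / A
            + p[7] * (N - Z) ** 2 * A ** (-4 / 3) + p[8] * A ** (1 / 3)
            + p[9] * P + p[10] * P ** 2)
```

        For the doubly magic nucleus with Z = 82 and N = 126, this evaluates to a few MeV below the experimental binding energy of about 1636 MeV, consistent with BW2 overestimating the masses of heavy doubly magic nuclei.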

        XGBoost is a powerful machine learning algorithm based on decision trees that is widely used for regression tasks. It is an effective gradient boosting technique with notable advantages: the model consists of multiple decision trees, each of which fits the residual between the true value and the sum of the predictions of all previous trees, and the predictions of all trees are summed to give the final result. XGBoost can automatically learn the default split direction for missing values, making it more robust when dealing with data containing missing or extreme values. The method incorporates a regularization term in the objective function to control the complexity of the model and prevent overfitting. This regularization strategy helps improve the generalizability of the model such that it maintains good predictive performance even on new data.

        In the XGBoost method, assuming the existence of K trees, the score of the ith sample is the sum of the predicted values of the K trees, and the final prediction is

        $ \hat{y}_i = \sum\limits_{k = 1}^K f_k(x_i). $

        (3)

        For a task with n samples and K trees, the objective function has two parts and can be expressed as

        $ obj = \sum\limits_{i}^n {\cal{L}}(y_i,\hat y_i) + \sum\limits_{k = 1}^{K} \Omega(f_k), $

        (4)

        where $ {\cal{L}}(y_i,\hat y_i) $ and $ \Omega(f_k) $ represent the loss function and a regularization term characterizing the complexity of the tree, respectively.

        Let the output value of the Kth tree for sample $ x_i $ be $ f_K(x_i) $. The predicted value of sample $ x_i $ when trained on the Kth tree is

        $ \begin{split} \hat y_{i}^{(K)}\;& = f_1(x_i) + f_2(x_i) + \ldots + f_K(x_i) \\ &= \sum\limits_{j = 1}^{K-1}f_j(x_i) + f_K(x_i) \\ &= \hat y_{i}^{(K-1)}+f_{K}(x_i). \end{split} $

        (5)

        After the first $ (K-1) $ trees have been trained, the previous prediction $ \hat y_{i}^{(K-1)} $ is known, so the Kth tree attempts to further minimize the residuals, optimizing the objective function

        $ obj = \sum\limits_{i}^n{\cal{L}}(y_i,\hat y_i^{(K-1)}+f_{K}(x_i))+\Omega(f_K)+C, $

        (6)

        $ \Omega(f_{K}) = \gamma T+\frac{1}{2} \lambda||\omega||^{2}, $

        (7)

        where C is a constant; T represents the number of leaf nodes in the tree, ω represents the output weight of each leaf node, and γ and λ are hyperparameters.

        Furthermore, a second-order expansion of the $ {\cal{L}}(y_i,\hat y_i^{(K-1)}+f_{K}(x_i)) $ term in Eq. (6) according to Taylor's formula yields

        $ obj = \sum\limits_{i}^{n}[g_i \cdot f_K(x_i)+\frac{1}{2}h_i \cdot f_{K}^{2}(x_i)]+\gamma T+\frac{1}{2} \lambda||\omega||^{2}, $

        (8)

        where $ g_i = \partial_{\hat y_{i}^{(K-1)}}({\cal{L}}(y_i,{\hat y_{i}^{(K-1)}})) $ and $ h_i = \partial_{\hat y_{i}^{(K-1)}}^{2}({\cal{L}}(y_i,{\hat y_{i}^{(K-1)}})) $ represent the first and second derivatives of the loss function with respect to $ \hat y_{i}^{(K-1)} $, respectively.
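        For a squared-error loss, these reduce to $ g_i = \hat y_{i}^{(K-1)} - y_i $ and $ h_i = 1 $, so each tree indeed fits the current residuals. Minimizing Eq. (8) leaf by leaf gives the standard XGBoost closed-form optimum, stated here for completeness (not written out in the original text): with $ I_j $ the set of samples falling on leaf j, $ G_j = \sum_{i \in I_j} g_i $, and $ H_j = \sum_{i \in I_j} h_i $, the optimal leaf weight and objective value are

        $ \omega_j^{*} = -\frac{G_j}{H_j+\lambda}, \qquad obj^{*} = -\frac{1}{2}\sum\limits_{j = 1}^{T}\frac{G_j^{2}}{H_j+\lambda}+\gamma T. $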

        For each leaf node, the loss function is minimized, and the resulting fit can be summarized as Eqs. (9)−(11).

        $ c_{Kj} = \arg \min obj^{(K)}, $

        (9)

        $ z_K(x) = \sum\limits_{j = 1}^{J} c_{Kj}\,I(x \in {\rm{leaf}}\ j), $

        (10)

        $ f_K(x) = f_{K - 1}(x) + z_K(x), $

        (11)

        where $ c_{Kj} $ represents the optimal output value of leaf node j fitted in round K; $ z_K(x) $ represents the decision tree fitting function of round K; I is the indicator function selecting the leaf into which x falls; and $ f_K(x) $ is the final prediction result.
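        The additive update of Eqs. (5) and (11) can be made concrete with a toy gradient-boosting loop under squared-error loss: each round fits a depth-1 "stump" to the current residuals and adds it, shrunk by a learning rate, to the ensemble. This is a self-contained stand-in for illustration only, not the XGBoost library itself.

```python
# Toy additive boosting, Eqs. (5) and (11): each round fits a one-split
# stump to the residuals y - y_hat and the ensemble sum is the prediction.
def fit_stump(xs, residuals):
    """Best single-split constant predictor minimizing squared error."""
    best = None
    for t in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        if not left or not right:
            continue
        cl, cr = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - cl) ** 2 for r in left)
               + sum((r - cr) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, cl, cr)
    _, t, cl, cr = best
    return lambda x: cl if x <= t else cr

def boost(xs, ys, rounds=20, eta=0.5):
    """Train `rounds` stumps sequentially on residuals; return predictor."""
    stumps = []
    preds = [0.0] * len(xs)
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]  # learn the residual
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + eta * stump(x) for p, x in zip(preds, xs)]
    return lambda x: eta * sum(s(x) for s in stumps)    # Eq. (3): tree sum
```

        With enough rounds, the shrunk stumps reproduce a piecewise-constant target almost exactly, mirroring how XGBoost's K trees jointly drive the residuals toward zero.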

        In this study, the XGBoost method is used to study the nuclear binding energy residuals $ \Delta B $ between the experimental data from AME2020 [12] and the BW2 formula. We found that introducing relevant physical input features into the XGBoost method can significantly improve the predictive power of the model [51]. To improve the prediction of nuclei near magic numbers, Z, N, $ \nu_p $, $ \nu_n $, and P are used as inputs.

        The experimental masses used to train the model are taken from nuclei with $ Z \geqslant 8 $ and $ N \geqslant 8 $ in AME2020 [12]. To test the reliability of the XGBoost method, we select the nuclei that already appear in the atomic mass evaluation of 2016 (AME2016) [52] as the learning set and the remaining nuclei as the testing set, giving $ 2,386 $ and $ 71 $ experimental data points, respectively. In the XGBoost method, hyperparameters such as the regularization parameters (γ and λ) are used to control the model complexity and prevent overfitting. In this study, the number of trees K is set to 54, the maximum tree depth to 5, the learning rate to 0.08, and the subsample ratio to 0.3, while γ and λ are set to their default values of 0 and 1, respectively.
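        Assuming the standard Python interface of the xgboost library, the quoted settings correspond roughly to the following keyword arguments; this mapping is our own sketch, as the authors' training script is not given.

```python
# Hypothetical mapping of the hyperparameters quoted in the text onto
# the standard xgboost.XGBRegressor keyword names (a sketch, not the
# authors' actual configuration file).
params = {
    "n_estimators": 54,     # number of trees K
    "max_depth": 5,         # maximum tree depth
    "learning_rate": 0.08,  # shrinkage applied to each tree
    "subsample": 0.3,       # fraction of samples drawn per tree
    "gamma": 0.0,           # gamma of Eq. (7), default value
    "reg_lambda": 1.0,      # lambda of Eq. (7), default value
}
# model = xgboost.XGBRegressor(**params)
# model.fit(features, delta_B)   # features: Z, N, nu_p, nu_n, P
```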

        The root-mean-square (rms) deviation of model predictions from experimental data is commonly used in nuclear physics research to evaluate the accuracy of a model, which, in this work, can be defined as

        $ \sigma_{\rm{rms}}(M) = \sqrt{\sum\limits_{i = 1}^{N}(M_{i,{\rm{exp}}}-M_{i,{\rm{th}}})^2/N}, $

        (12)

        where $ M_{i,{\rm{exp}}} $ and $ M_{i,{\rm{th}}} $ represent the experimental and theoretical masses of nucleus i, and N represents the number of data points evaluated. The output of the XGBoost model is the predicted binding energy residual, which is added to the theoretical binding energy $ B_{\rm{BW2}} $ to yield the final XGBoost-corrected binding energy. The nuclear mass is then obtained from the binding energy.
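        Eq. (12) transcribes directly into code; the names below are illustrative.

```python
from math import sqrt

# Direct transcription of Eq. (12): rms deviation between experimental
# and theoretical masses (both sequences in the same units, e.g. MeV).
def sigma_rms(m_exp, m_th):
    n = len(m_exp)
    return sqrt(sum((e - t) ** 2 for e, t in zip(m_exp, m_th)) / n)
```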


      III.   RESULTS AND DISCUSSION
      • The ability of XGBoost to predict nuclear masses in the known region is assessed by showing the distribution of the difference between the experimental and predicted masses on the nuclear chart (Fig. 1) and the scatter distribution versus the proton and neutron numbers (Fig. 2). The BW2 results are given for comparison. Figure 1 shows that the nuclear masses predicted by BW2 deviate considerably from the experimental data. In the heavy doubly magic region, BW2 overestimates the experimental masses, while in the light doubly magic region it underestimates them. In addition, for nuclei with proton magic numbers between the doubly magic regions, the BW2 predictions are systematically underestimated. XGBoost significantly improves the description of nuclear masses across the nuclear chart, with most nuclei predicted to within $ 1 $ MeV of the experimental mass, although the XGBoost predictions near the doubly magic numbers (Z, N) = (50, 82) overestimate the experimental masses. Figure 2 shows the trend of the predictions of the two methods with the proton and neutron numbers. The nuclei with large differences between the BW2 predictions and experimental data are concentrated around the magic numbers and midway between the neutron magic numbers. For the BW2 model, the difference from the experimental mass lies within 2 MeV for $ 1,945 $ nuclei, accounting for 81.5% of the total data, whereas for the XGBoost method it lies within 2 MeV for $ 2,382 $ nuclei, accounting for 99.8%. Although the BW2 model includes a shell correction term, the final results remain suboptimal. This limitation suggests that traditional approaches, even with shell corrections, struggle to fully capture the complex effects of nuclear structure. The deviation of the XGBoost predictions from experiment is more stable overall, suggesting that the XGBoost method describes nuclear masses well, even for the outlying magic nuclei.

        Figure 1.  (color online) Mass differences between the experimental data in AME2020 [12] and theoretical predictions. Panels (a) and (b) correspond to the BW2 and XGBoost mass models, respectively.

        Figure 2.  (color online) Differences between the experimental nuclear mass and theoretical results with BW2 and XGBoost for the $ 2386 $ selected nuclei with $ Z \geqslant 8 $ and $ N \geqslant 8 $ versus nuclei number.

        Considering the Sn and Pb isotopes and the $ N=50 $, $ 82 $ isotones as examples, Figs. 3 and 4 illustrate the difference between the experimental masses and those predicted by the BW2 formula and the XGBoost model. For the Sn and Pb isotopes, BW2 significantly overestimates the experimental masses of the doubly magic nuclei, with deviations even greater than 6 MeV. As the distance from the magic numbers increases, the BW2 predictions move gradually from overestimating the experimental data to zero deviation and then to underestimating the experimental masses. XGBoost significantly improves the description of the nuclear masses compared with the BW2 formula; however, some overestimation remains in its predictions near doubly magic nuclei. Interestingly, for nuclei far from the magic numbers, an odd-even staggering appears in the deviations of both models from the experimental masses. This may indicate that our XGBoost model, trained on the BW2 residuals, does not fully capture the trend of the mass with nucleon number. Similar conclusions are reached for the $ N=50 $, $ 82 $ isotones. For both methods, there is an anomalous mass deviation in the $ N = 50 $ isotones in the $ Z = 36 - 40 $ region, which can help us understand the predictions of new magic numbers such as $ N = 32 $.

        Figure 3.  (color online) Differences between the experimental mass [12] and XGBoost results for Sn (a) and Pb (b) isotopes.

        Figure 4.  (color online) Same as Fig. 3 but for $ N = 50 $ (a) and $ N = 82 $ (b) isotones.

        The rms deviations between the theoretical predictions and experimental masses reflect the overall accuracies of the different methods, as shown in Fig. 5. The models include the macroscopic mass model BW2; the macroscopic-microscopic mass models Koura-Tachibana-Uno-Yamada (KTUY) [53] and FRDM12 [18]; and the microscopic mass models relativistic mean field (RMF) [31], Hartree-Fock-Bogoliubov (HFB-31) [54], Universal Nuclear Energy Density Functional (UNEDF1) [55], and the Brussels-Skyrme-on-a-Grid model (BSkG1) [56]. The RMF, UNEDF1, and BW2 mass models have rms deviations greater than 1 MeV, while the other mass models have rms deviations of around $ 0.5 $ MeV. The XGBoost method shows a promising performance, with an rms deviation of $ 0.498 $ MeV, lower than those of the aforementioned models. This result demonstrates the potential of machine learning to refine traditional nuclear mass models, offering a more accurate representation of nuclear masses, including isotopes where shell effects are significant.

        Figure 5.  (color online) Rms deviations $ \sigma_{\rm{rms}} $ between the mass predicted by the theoretical models and experimental data [12]. Corresponding rms deviations from left to right given by the RMF, UNEDF1, BSkG1, KTUY, FRDM12, HFB-31, BW2, and XGBoost are shown.

        Figure 6 compares the rms deviations on the learning and testing sets for the RMF, UNEDF1, and BW2 mass models and the XGBoost method, where the extrapolation ability is studied with the nuclei that first appear in AME2020 as the testing set. The results confirm that training on the residuals between the BW2 method and the experimental data reduces the rms deviation on the learning set from $ 1.621 $ MeV to $ 0.480 $ MeV, indicating the predictive ability of the XGBoost method. On the testing set, the rms deviation decreases from 1.804 MeV with the BW2 formula to 0.908 MeV with the XGBoost method. There is thus a large discrepancy between the rms deviations of the XGBoost method on the learning and testing sets. Although this discrepancy may imply overfitting, it can also be attributed to the inherent complexity and variability of the nuclear masses, as confirmed by the rms deviations of the two microscopic mass models in Fig. 6 for the different datasets. For the nuclei newly appearing in AME2020, the evaluated masses themselves carry large uncertainties, some as large as 0.3−0.4 MeV. To further assess potential overfitting, we monitored the performance on the testing set during hyperparameter tuning. In addition, the regularization parameters in XGBoost reduce overfitting by penalizing overly complex models. In the future, we plan to combine this approach with density functional theory to improve the generalization ability of machine learning models.

        Figure 6.  (color online) The rms deviations $ \sigma_{\rm{rms}} $ of different mass model predictions with respect to the experimental data for the learning and testing sets.

        To further test the predictive ability of the XGBoost method, we adopted the FRDM12 model as pseudo-experimental data. Two derived models, BW2* and XGBoost*, were constructed based on the FRDM12 results in the known AME2020 region. The deviations of FRDM12 model from the results of these two models are shown in Fig. 7. BW2* significantly underestimates masses in the medium-mass doubly-magic regions compared to that of FRDM12. When extrapolated to unknown regions, BW2* substantially overestimates the FRDM12 nuclear masses for proton magic nuclei. In contrast, XGBoost* reproduces the FRDM12 results considerably better across the nuclear chart. Although a slight overestimation remains for proton-magic nuclei, the overall deviation is remarkably reduced. Both methods exhibit considerable deviations for light nuclei far from β-stability, which awaits further experimental verification.

        Figure 7.  (color online) Difference of nuclear mass between the pseudo-experimental data of FRDM12 and theoretical predictions. Panels (a) and (b) correspond to the XGBoost* and BW2* mass models, respectively. The contours indicate the boundary of nuclei with known masses in AME2020, and the dotted lines denote the traditional magic numbers.

        Furthermore, we present the rms deviations of the two models relative to FRDM12 as a function of minimun distance to the isotopes in the known region for nuclei with $ 28 \leqslant Z \leqslant100 $, as shown in Fig. 8. The trend of the XGBoost* curve is similar to that of BW2*; however, it remains lower across all extrapolation distances. This suggests that the XGBoost model, while retaining the information from BW2*, significantly improves the description of the pseudo-experimental data in both the known and unknown regions. Within 30 extrapolation steps, the rms deviation of XGBoost* is approximately 1 MeV smaller than that of BW2*. This further demonstrates the excellent ability of the XGBoost method in describing nuclear masses when combined with other theoretical models. The XGBoost method can systematically calculate nuclear masses, providing nuclear physics inputs for r-process studies and improving our understanding of the origin of heavy elements in the Universe.

        Figure 8.  (color online) Rms deviations of the BW2* and XGBoost* models relative to FRDM12 as a function of the minimum distance to isotopes in the known region.

      III.   RESULTS AND DISCUSSION
      • The ability of XGBoost to predict nuclear masses in the known region is assessed by showing the distribution of the differences between the experimental and predicted masses on the nuclear chart (Fig. 1) and the scatter distribution with the proton and neutron numbers (Fig. 2). The BW2 results are given for comparison. Figure 1 shows that the nuclear masses predicted by BW2 deviate considerably from the experimental data. In the heavy doubly magic region, BW2 overestimates the experimental masses, while in the light doubly magic region it underestimates them. In addition, for nuclei with proton magic numbers lying between the doubly magic regions, the BW2 predictions are systematically underestimated. XGBoost significantly improves the description of nuclear masses across the nuclear chart, with most nuclei predicted to within $ 1 $ MeV of the experimental mass. However, the XGBoost predictions near the doubly magic nucleus with (Z, N) = (50, 82) overestimate the experimental masses. Figure 2 shows the trends of the predictions of the two methods with the proton and neutron numbers. The nuclei with large differences between the BW2 predictions and the experimental data are concentrated around the magic numbers and midway between neutron magic numbers. For the BW2 model, the difference from the experimental masses lies within 2 MeV for $ 1,945 $ nuclei, accounting for 81.5% of the total data, whereas for the XGBoost method it lies within 2 MeV for $ 2,382 $ nuclei, accounting for 99.8%. Although the BW2 model includes a shell correction term, its final results remain suboptimal. This limitation suggests that traditional approaches, even with shell corrections, struggle to fully capture the complex effects of nuclear structure. The deviation of the XGBoost predictions from experiment is more stable overall, suggesting that the XGBoost method describes nuclear masses well, even for the outlier magic nuclei.
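The input features described above (nucleon numbers, valence nucleon numbers, and magic-number-related quantities) can be illustrated with a minimal sketch. The exact feature definitions used in this work are given in its methods section; the shell-distance construction below, the distance to the nearest traditional magic number, is an assumption for illustration only:

```python
# Minimal sketch of magic-number-related input features for the
# residual-learning model. The shell distances dZ and dN below
# (distance to the nearest traditional magic number) are an
# illustrative assumption, not the paper's exact feature set.
MAGIC = (2, 8, 20, 28, 50, 82, 126)

def shell_features(Z, N):
    """Return a feature dict for a nucleus with Z protons and N neutrons."""
    dZ = min(abs(Z - m) for m in MAGIC)  # proton distance to nearest shell closure
    dN = min(abs(N - m) for m in MAGIC)  # neutron distance to nearest shell closure
    return {"Z": Z, "N": N, "A": Z + N, "dZ": dZ, "dN": dN}
```

For the doubly magic nucleus with (Z, N) = (50, 82), for example, both shell distances vanish, which is precisely the regime where such features carry the most information for the learner.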

        Figure 1.  (color online) Mass differences between the experimental data in AME2020 [12] and theoretical predictions. Panels (a) and (b) correspond to the BW2 and XGBoost mass models, respectively.

        Figure 2.  (color online) Differences between the experimental nuclear masses and the theoretical results of BW2 and XGBoost for the $ 2386 $ selected nuclei with $ Z \geqslant 8 $ and $ N \geqslant 8 $ versus the nucleus number.

        Taking the Sn and Pb isotopes and the $ N=50 $, $ 82 $ isotones as examples, Figs. 3 and 4 illustrate the differences between the experimental masses and those predicted by the BW2 formula and the XGBoost model. For the Sn and Pb isotopes, BW2 significantly overestimates the experimental masses of doubly magic nuclei, with deviations exceeding 6 MeV. As the distance from the magic numbers increases, the BW2 predictions shift gradually from an overestimation of the experimental data through zero to an underestimation of the experimental masses. XGBoost significantly improves the description of the nuclear masses compared with the BW2 formula; however, some overestimation remains in its predictions near doubly magic nuclei. Interestingly, for nuclei far from the magic numbers, the deviations of both models from the experimental masses exhibit odd-even staggering. This may indicate that our XGBoost model trained on the BW2 residuals does not fully capture the trend of mass with nucleon number. Similar conclusions are reached for the $ N=50 $, $ 82 $ isotones. For both methods, there is an anomalous mass deviation in the $ N = 50 $ isotones in the $ Z = 36 - 40 $ region, which can help us understand the predictions of new magic numbers such as $ N = 32 $.

        Figure 3.  (color online) Differences between the experimental mass [12] and XGBoost results for Sn (a) and Pb (b) isotopes.

        Figure 4.  (color online) Same as Fig. 3 but for $ N = 50 $ (a) and $ N = 82 $ (b) isotones.

        The rms deviations between the theoretical predictions and experimental masses can reflect the overall accuracies of the different methods, as shown in Fig. 5. The models include the macroscopic mass model BW2, macroscopic-microscopic mass models Koura-Tachibana-Uno-Yamada (KTUY) [53] and FRDM12 [18], microscopic mass models Relativistic Mean-Field (RMF) [31], Hartree-Fock-Bogoliubov (HFB)-31 [54], Universal Nuclear Energy Density Functional (UNEDF1) [55], and the Brussels-Skyrme-on-a-Grid model (BSkG1) [56]. The RMF, UNEDF1, and BW2 mass models have rms deviations greater than 1 MeV, while other mass models have rms deviations around $ 0.5 $ MeV. The XGBoost method shows a promising performance, with an rms deviation of $ 0.498 $ MeV, which is lower than the deviations of the aforementioned models. This result demonstrates the potential of machine learning to refine traditional nuclear mass models, offering a more accurate representation of nuclear masses, including for isotopes where shell effects are significant.
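The rms deviation quoted throughout is the standard root-mean-square of the mass residuals, $ \sigma_{\rm{rms}} = \sqrt{\tfrac{1}{n}\sum_i \left(M_i^{\rm{th}} - M_i^{\rm{exp}}\right)^2} $. A direct stdlib implementation is straightforward:

```python
import math

def rms_deviation(theory, experiment):
    """Root-mean-square deviation between theoretical and
    experimental masses (same units as the inputs, here MeV)."""
    residuals = [t - e for t, e in zip(theory, experiment)]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))
```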

        Figure 5.  (color online) Rms deviations $ \sigma_{\rm{rms}} $ between the mass predicted by the theoretical models and experimental data [12]. Corresponding rms deviations from left to right given by the RMF, UNEDF1, BSkG1, KTUY, FRDM12, HFB-31, BW2, and XGBoost are shown.

        Figure 6 compares the rms deviations of the RMF, UNEDF1, and BW2 mass models and the XGBoost method on the learning and testing sets, where the extrapolation ability is studied by taking the nuclei that first appear in AME2020 as the test set. The results confirm that training on the residuals between the BW2 method and the experimental data reduces the rms deviation of the learning set from $ 1.621 $ MeV to $ 0.480 $ MeV, indicating the predictive ability of the XGBoost method. On the testing set, the rms deviation of the XGBoost method decreases from 1.804 MeV with the BW2 formula to 0.908 MeV. In this study, there is a large discrepancy between the rms deviations of the XGBoost method on the learning and testing sets. Although this discrepancy may imply overfitting, it can also be attributed to the inherent complexity and variability of the nuclear masses, as confirmed by the rms deviations of the two microscopic mass models in Fig. 6 for the different datasets. For the nuclei newly appearing in AME2020, the evaluated masses themselves carry large uncertainties, some as large as 0.3−0.4 MeV. To further assess potential overfitting, we examined the performance on the testing set during hyperparameter tuning. In addition, we employed the regularization parameters of XGBoost, which reduce overfitting by penalizing overly complex models. In the future, we plan to combine this approach with density functional theory to improve the generalization ability of machine learning models.
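The residual-learning strategy, and the role of shrinkage as one guard against overfitting, can be illustrated with a toy gradient-boosting loop built from one-dimensional regression stumps. This is a simplified stand-in for XGBoost, not the implementation used in this work; the learning rate `lr` plays the role of the shrinkage regularization mentioned above:

```python
def fit_stump(x, r):
    """Fit a depth-1 regression tree (stump) to residuals r over feature x."""
    best = None
    for t in sorted(set(x))[:-1]:  # candidate thresholds (exclude the maximum)
        left = [ri for xi, ri in zip(x, r) if xi <= t]
        right = [ri for xi, ri in zip(x, r) if xi > t]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        err = (sum((ri - ml) ** 2 for ri in left)
               + sum((ri - mr) ** 2 for ri in right))
        if best is None or err < best[0]:
            best = (err, t, ml, mr)
    return best[1:]  # (threshold, left-leaf mean, right-leaf mean)

def boost_residuals(x, y, n_rounds=50, lr=0.3):
    """Gradient boosting on residuals; lr is the shrinkage (regularization)."""
    pred = [0.0] * len(y)
    for _ in range(n_rounds):
        r = [yi - pi for yi, pi in zip(y, pred)]   # current residuals
        t, ml, mr = fit_stump(x, r)
        pred = [pi + lr * (ml if xi <= t else mr)  # shrunken update
                for xi, pi in zip(x, pred)]
    return pred
```

On a step-like target the boosted prediction converges geometrically; a smaller `lr` slows the fit and thereby damps the tendency to chase noise, which is the intuition behind the shrinkage and penalty terms in XGBoost itself.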

        Figure 6.  (color online) The rms deviations $ \sigma_{\rm{rms}} $ of different mass model predictions with respect to the experimental data for the learning and testing sets.

        To further test the predictive ability of the XGBoost method, we adopted the FRDM12 model as pseudo-experimental data. Two derived models, BW2* and XGBoost*, were constructed based on the FRDM12 results in the known AME2020 region. The deviations of the FRDM12 model from the results of these two models are shown in Fig. 7. BW2* significantly underestimates the masses in the medium-mass doubly magic regions compared with FRDM12. When extrapolated to unknown regions, BW2* substantially overestimates the FRDM12 nuclear masses for proton-magic nuclei. In contrast, XGBoost* reproduces the FRDM12 results considerably better across the nuclear chart. Although a slight overestimation remains for proton-magic nuclei, the overall deviation is markedly reduced. Both methods exhibit considerable deviations for light nuclei far from β-stability, which awaits further experimental verification.

        Figure 7.  (color online) Difference of nuclear mass between the pseudo-experimental data of FRDM12 and theoretical predictions. Panels (a) and (b) correspond to the XGBoost* and BW2* mass models, respectively. The contours indicate the boundary of nuclei with known masses in AME2020, and the dotted lines denote the traditional magic numbers.

        Furthermore, we present the rms deviations of the two models relative to FRDM12 as a function of the minimum distance to the isotopes in the known region for nuclei with $ 28 \leqslant Z \leqslant 100 $, as shown in Fig. 8. The trend of the XGBoost* curve is similar to that of BW2*; however, it remains lower across all extrapolation distances. This suggests that the XGBoost model, while retaining the information from BW2*, significantly improves the description of the pseudo-experimental data in both the known and unknown regions. Within 30 extrapolation steps, the rms deviation of XGBoost* is approximately 1 MeV smaller than that of BW2*. This further demonstrates the excellent ability of the XGBoost method to describe nuclear masses when combined with other theoretical models. The XGBoost method can systematically calculate nuclear masses, providing nuclear physics inputs for r-process studies and improving our understanding of the origin of heavy elements in the Universe.
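The extrapolation distance used as the abscissa of Fig. 8 can be computed straightforwardly. The metric below, the minimum of $ |\Delta Z| + |\Delta N| $ over all nuclei with known masses, is one plausible reading of "minimum distance to the isotopes in the known region" and is labeled as an assumption:

```python
def min_extrapolation_distance(Z, N, known):
    """Minimum |dZ| + |dN| steps from nucleus (Z, N) to any nucleus in
    `known`, an iterable of (Z, N) pairs with experimentally known masses.
    The exact metric in the text is not specified; this Manhattan-distance
    form is an illustrative assumption."""
    return min(abs(Z - z) + abs(N - n) for z, n in known)
```

A nucleus inside the known region has distance 0, and each step away from the measured boundary increases the distance by at least one unit.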

        Figure 8.  (color online) Rms deviations of the BW2* and XGBoost* models relative to FRDM12 as a function of the minimum distance to isotopes in the known region.

      IV.   SUMMARY AND PERSPECTIVES
      • The XGBoost method has been used to study nuclear masses by learning the residuals between the experimental binding energies and the predictions of the BW2 formula. By taking the nucleon numbers, valence nucleon numbers, and magic-number-related quantities as inputs, the XGBoost model achieves a significantly improved description of nuclear masses compared with the BW2 formula for magic nuclei. The predictive ability of XGBoost is further validated by comparing its root-mean-square deviations with those of various microscopic and macroscopic mass models on the total, learning, and testing sets. To evaluate the extrapolation performance of the models, pseudo-experimental data from FRDM12 were employed to construct the BW2* and XGBoost* models. The results show that XGBoost* reproduces the FRDM12 data more accurately in both the known and unknown regions, with deviations systematically smaller than those of BW2*. This indicates that the XGBoost method has good extrapolation ability. In the future, machine learning methods that incorporate more physical effects or physical constraints will be developed to further improve machine learning predictions of nuclear masses, with a focus on the description of light nuclei, so as to provide more accurate nuclear physics inputs for r-process studies.
