-
Implementing an NN for inverse TOV mapping requires a large number of EOSs as training samples. Moreover, these EOSs also need to cover the ranges of the nuclear parameters, such as L, $ K_{\mathrm{sym}} $, etc., as fully as possible. The isospin-dependent parametric EOS satisfies both the need to generate a large number of EOSs and the requirement of reasonable nuclear parameters. For asymmetric nuclear matter, the energy per nucleon can be expanded in the isospin asymmetry [2, 3]
$ E\left( \rho ,\delta \right) \approx E_0\left( \rho \right) +E_{\mathrm{sym}}\left( \rho \right) \delta ^2, \tag{1} $
where $ \delta =\left(\rho _\mathrm{n}-\rho _\mathrm{p} \right) /\left(\rho _\mathrm{n}+\rho _\mathrm{p} \right) $ is the isospin asymmetry. The first term on the right side, $ E_0\left(\rho \right) $, is usually referred to as the energy of symmetric nuclear matter, and the second term, $ E_{\mathrm{sym}}\left(\rho \right) $, is referred to as the symmetry energy, whose physical meaning is the difference in energy per nucleon between pure neutron matter and symmetric nuclear matter. Usually, the two terms on the right side of Eq. (1) can be expanded around the saturation density
$ \rho_0 $ [24, 38]
$ E_0\left( \rho \right) =E_0\left( \rho _0 \right) +\frac{K_0}{2}\left( \frac{\rho -\rho _0}{3\rho _0} \right) ^2+\frac{J_0}{6}\left( \frac{\rho -\rho _0}{3\rho _0} \right) ^3, \tag{2} $
$ E_{\mathrm{sym}}\left( \rho \right) =E_{\mathrm{sym}}\left( \rho _0 \right) +L\left( \frac{\rho -\rho _0}{3\rho _0} \right) +\frac{K_{\mathrm{sym}}}{2}\left( \frac{\rho -\rho _0}{3\rho _0} \right) ^2+\frac{J_{\mathrm{sym}}}{6}\left( \frac{\rho -\rho _0}{3\rho _0} \right) ^3. \tag{3} $
Constrained by terrestrial nuclear experiments, the most probable values of the parameters in the above equations are as follows:
$ L=58.7\pm 28.1\ \mathrm{MeV} $, $ -400\leqslant K_{\mathrm{sym}}\leqslant 100\ \mathrm{MeV} $, $ -200\leqslant J_{\mathrm{sym}}\leqslant 800\ \mathrm{MeV} $, $ -300\leqslant J_0\leqslant 400\ \mathrm{MeV} $, $ K_0=240\pm 20\ \mathrm{MeV} $, $ E_{\mathrm{sym}}\left(\rho _0 \right) = 31.7\pm 3.2\ \mathrm{MeV} $ [39−43]. It is worth pointing out that, in recent years, PREX-II has given a higher L value ($ L=106\pm 37\ \mathrm{MeV} $ [8]); this result is also considered later in this work. The pressure of the system can be calculated numerically from
$ P\left( \rho ,\delta \right) =\rho ^2\frac{{\rm d}\left( E/\rho \right)}{{\rm d}\rho}. \tag{4} $
To construct the EOS over the entire density range, the NV EOS model [44] and the BPS EOS model [45] are used for the inner crust and outer crust, respectively.
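As an illustration of how Eqs. (1)-(4) define the parametric EOS, the following minimal Python sketch evaluates the energy per nucleon and the pressure. The saturation density $\rho_0 = 0.16\ \mathrm{fm^{-3}}$, the binding energy $E_0(\rho_0) = -15.9\ \mathrm{MeV}$, and the sample values of $K_{\mathrm{sym}}$, $J_{\mathrm{sym}}$, and $J_0$ are assumptions for illustration only; β-equilibrium and the crust models used in the full construction are omitted.

```python
import numpy as np

RHO0 = 0.16          # saturation density in fm^-3 (assumed value)
E0_RHO0 = -15.9      # E_0(rho_0) in MeV (assumed value)

def energy_per_nucleon(rho, delta, E0_rho0=E0_RHO0, K0=240.0, J0=110.0,
                       Esym0=31.7, L=58.7, Ksym=-100.0, Jsym=320.0):
    """E(rho, delta) from Eqs. (1)-(3); all parameters in MeV, rho in fm^-3."""
    x = (rho - RHO0) / (3.0 * RHO0)
    e0 = E0_rho0 + K0 / 2.0 * x**2 + J0 / 6.0 * x**3
    esym = Esym0 + L * x + Ksym / 2.0 * x**2 + Jsym / 6.0 * x**3
    return e0 + esym * delta**2

def pressure(rho, delta, drho=1e-4, **params):
    """Pressure in MeV fm^-3, computed as rho^2 d(E/A)/drho by central difference
    (equivalent to Eq. (4) when E there is read as the total energy density)."""
    dEdrho = (energy_per_nucleon(rho + drho, delta, **params)
              - energy_per_nucleon(rho - drho, delta, **params)) / (2.0 * drho)
    return rho**2 * dEdrho

# example: pressure of pure neutron matter (delta = 1) at twice saturation density
print(pressure(2 * RHO0, 1.0))
```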
-
In this work, neutron stars are considered to be isolated, non-rotating, and statically spherically symmetric. The Tolman-Oppenheimer-Volkoff (TOV) equations consist of the equation of hydrostatic equilibrium [46, 47]
$ \frac{{\rm d}P}{{\rm d}r}=-\frac{\left[ m\left( r \right) +4\pi r^3P\left( r \right) \right] \left[ \rho \left( r \right) +P\left( r \right) \right]}{r\left[ r-2m\left( r \right) \right]}, \tag{5} $
and
$ \frac{{\rm d}m\left( r \right)}{{\rm d}r}=4\pi r^2\rho \left( r \right). \tag{6} $
The external boundary condition of the star is $ \rho=P=0 $. Given the central density of the neutron star, the equations can be integrated layer by layer from the interior to the exterior of the star as a high-precision initial value problem.

The tidal deformability of a neutron star is an EOS-dependent property, which can be measured through gravitational wave events such as GW170817. To linear order, the tidal deformability $ \varLambda $ is defined as [48]
$ Q_{ij}=-\varLambda \varepsilon _{ij}, \tag{7} $
where $ Q_{ij} $ and $ \varepsilon _{ij} $ represent the induced quadrupole moment and the static tidal field, respectively. The second tidal Love number $ k_2 $ is defined as
$ k_2=\frac{3}{2}\varLambda R^{-5}, \tag{8} $
which can be solved together with the TOV equations [49, 50].
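As a minimal sketch of how Eqs. (5) and (6) can be integrated outward from the center with SciPy, the snippet below uses geometrized units ($G=c=M_\odot=1$, so one length unit is about 1.477 km) and a toy polytropic EOS in place of the tabulated parametric EOS. The constants K_POLY and GAMMA, the central pressure, and the surface threshold are assumptions for illustration, not the values used in this work.

```python
import numpy as np
from scipy.integrate import solve_ivp

K_POLY, GAMMA = 100.0, 2.0           # assumed polytropic constants (toy EOS)

def eos_rho(p):
    """Toy energy density as a function of pressure, standing in for the real EOS."""
    return (p / K_POLY) ** (1.0 / GAMMA)

def tov_rhs(r, y):
    """Right-hand side of Eqs. (5)-(6); y = (P, m)."""
    p, m = y
    rho = eos_rho(max(p, 0.0))
    dpdr = -(m + 4.0 * np.pi * r**3 * p) * (rho + p) / (r * (r - 2.0 * m))
    dmdr = 4.0 * np.pi * r**2 * rho
    return [dpdr, dmdr]

def surface(r, y):                   # stop the integration when P drops to ~0
    return y[0] - 1e-12
surface.terminal = True

def solve_star(p_center, r0=1e-6, r_max=100.0):
    """Integrate from the center outward; returns (radius, mass) in geometrized units."""
    sol = solve_ivp(tov_rhs, (r0, r_max), [p_center, 0.0],
                    events=surface, rtol=1e-8, atol=1e-12)
    return sol.t_events[0][0], sol.y[1, -1]

print(solve_star(1e-3))
```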
-
To clearly demonstrate the process of implementing an NN for inverse TOV mapping, we draw the flow chart shown in Fig. 1. The first step is to provide a relatively complete training set. After determining the range of symmetry energy parameters, a large set of EOSs (the output of the NN) is generated with the method of Sec. II.A, while the corresponding mass-radius points (the input of the NN) are calculated with the TOV equations. The training set is then filtered by astronomical observations. In the second step, the training set is fed into the initialized NN and the loss is monitored; the key parameters of the NN are adjusted until the loss converges. The final step is to find the minimum number of neurons in the input layer, which involves comparing the relative errors for different numbers of mass-radius points. These mass-radius points come from non-training samples, i.e., EOSs outside the training set and their corresponding mass-radius sequences (consistent with the parameter range and filtering of the first step); a toy sketch of the filtering and sampling of step one is given below.
The most important prerequisite for implementing this algorithm is the preparation of the training samples. Four variable parameters are used to generate training samples covering as large an EOS range as possible. The variable parameter space is as follows:
$30\leqslant L\leqslant 143\ \mathrm{MeV}$, $ -400\leqslant K_\mathrm{sym}\leqslant 100\ \mathrm{MeV} $, $-200\leqslant J_\mathrm{sym}\leqslant 800\ \mathrm{MeV}$, $ -300\leqslant J_0\leqslant 400\ \mathrm{MeV} $, $ K_0=240\ \mathrm{MeV} $, $E_{\mathrm{sym}}\left(\rho _0 \right) = 31.7\ \mathrm{MeV}$ [39−43]. It is especially worth noting that the range of the slope L was set by combining terrestrial [8] and astrophysical data. To obtain preliminary training samples, we generate EOSs (Sec. II.A) by taking points at equal intervals in the above parameter intervals and then solve for the corresponding M-R sequences with the TOV equations. Moreover, to make the output more reasonable, we applied astronomical observation results during the sample generation stage: $ M_{\max}\geqslant 2.14\ M_{\odot} $ [51], $\varLambda _{1.4}= 190_{-120}^{+390}$, $ 1.44\pm 0.15\ M_{\odot} $ [9] corresponding to $ \left[11.2,\ 13.3 \right] $ km, and $ 2.08\pm 0.07\ M_{\odot} $ [10] corresponding to $ \left[12.2,\ 16.3 \right]\ \mathrm{km} $. Similarly to Ref. [52], the causality condition ($ c_s<c $) and the stability condition (${\rm d}p/{\rm d}\rho > 0$) are imposed; if only the maximum-mass filter is then added, the sample size is comparable to that obtained after the observational screening described above, with limited improvement in generalization ability.

In the sample processing stage, owing to the randomness of observations, we randomly sampled the mass-radius sequences over the entire mass range. Note that, because low-mass neutron stars are more difficult to form, the sampling points are set above 1 $ M_{\odot} $, as in Ref. [52]. However, considering the theoretical lower mass limit of neutron stars, which can reach 0.1 $ M_{\odot} $ [53], as well as the recent observation of HESS J1731-347 [54], a wider mass sampling interval was adopted. The final sample size used in the NN is approximately 880000, where M-R points serve as input and P-ρ points serve as output.

The NN was initialized with reference to the frameworks of Refs. [19] and [35] and according to the properties of our training set. By adjusting the relevant parameters, such as the number of layers, the optimization function, and the epoch size, the loss of the NN is made to converge. The finalized NN architecture is shown in Table 1. The 80 neurons in the input layer mean that the NN can provide accurate predictions given 40 pairs of (M, R) points. The other key parameters of this NN are listed in Appendix A, and the predicted results for different numbers of input points are shown in Appendix B.
Layer        Number of neurons    Activation function
0 (Input)    80                   N/A
1            100                  ReLU
2            300                  ReLU
3            400                  ReLU
4            400                  ReLU
5            400                  ReLU
6            300                  ReLU
7 (Output)   182                  tanh

Table 1. Architecture of the neural network.
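For concreteness, the architecture of Table 1 can be assembled with the Keras/TensorFlow stack described in Appendix A. The layer widths, activations, dropout rate, optimizer, learning rate, and loss below follow Table 1 and Appendix A; the exact placement of the Dropout layers and the commented training call are assumptions of this sketch.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_model(n_input=80, n_output=182, dropout=0.3):
    """Fully connected NN of Table 1: 80 -> 100, 300, 400, 400, 400, 300 (ReLU) -> 182 (tanh)."""
    model = keras.Sequential([keras.Input(shape=(n_input,))])
    for width in (100, 300, 400, 400, 400, 300):
        model.add(layers.Dense(width, activation="relu"))
        model.add(layers.Dropout(dropout))       # Dropout = 0.3 between layers (Appendix A)
    model.add(layers.Dense(n_output, activation="tanh"))
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=3e-4),  # Adam, lr = 0.0003
                  loss="mse")                                           # mean squared error
    return model

model = build_model()
model.summary()

# Training as described in Appendix A, with x_train holding the normalized (M, R)
# pairs and y_train the normalized (log10 P, log10 rho) pairs:
# model.fit(x_train, y_train, batch_size=500, epochs=800, validation_split=0.1)
```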
To verify the accuracy, we substituted the mass-radius sequences of several EOSs into this NN for prediction. In particular, we use APR3 and APR4 to test the effectiveness of this NN for non-parametric EOSs.
Figure 2 shows the NN predictions for the three adopted EOSs. It shows that our NN method achieves relatively accurate predictions with approximately 40 mass-radius points as input. In panel (a), the EOS (solid black line) is randomly generated and substituted into Eq. (5) to obtain the corresponding mass-radius sequence (also drawn as a solid black line). Additionally, we also test EOSs other than the parameterized EOS and find that some of them can be predicted well, as shown in panels (b) and (c) of Fig. 2 for the APR3 and APR4 EOSs. This result means that our NN method is no longer limited to the parametric EOS, even though only parametric EOSs were used as training samples.
Figure 2. (color online) The EOS predictions of the NN from 40 sampled points for the $ L-J_0-J_{\mathrm{sym}}-K_{\mathrm{sym}} $ panel, where the solid black line represents the original EOS and the blue dots denote the NN predictions. The three sample EOSs are (a) a parametric EOS with randomly generated parameters $ L = 40.5\ \mathrm{MeV} $, $ K_{\mathrm{sym}} = -200.5\ \mathrm{MeV} $, $ J_{\mathrm{sym}} = 320\ \mathrm{MeV} $, $ J_0 = 110\ \mathrm{MeV} $, $ K_0 = 240\ \mathrm{MeV} $, $ E_{\mathrm{sym}}\left(\rho _0 \right) = 31.7\ \mathrm{MeV} $, (b) the APR3 EOS, and (c) the APR4 EOS. The inset shows the NN sampling from the mass-radius sequence. The lower panels show the relative error of the fitting.

For comparison with the results in Ref. [35], we also probe the output of the NN by setting the parameters of the EOS in the $ L-K_{\mathrm{sym}} $ panel, where $ J_0 = J_{\mathrm{sym}} = 0 $, and L and $ K_{\mathrm{sym}} $ range over $ \left[30,\ 90 \right]\ \mathrm{MeV} $ and $ \left[-400,\ 100 \right]\ \mathrm{MeV} $, respectively. Using the same sample generation process as for the $ L-J_0-J_{\mathrm{sym}}-K_{\mathrm{sym}} $ panel, the sample size in this case is approximately 8000. Figure 3 shows the NN predictions for a randomly selected test EOS with parameters limited to the previously mentioned ranges. We try to use as few input samples as possible to achieve the prediction function. Here, the prediction over the full density range is implemented in the NN with 4 mass-radius points as input (the input data are listed in Table 2). Since the low-dimensional parameter space is simpler than the high-dimensional one, e.g., the same (M, R) point corresponds to fewer EOSs, only four (M, R) points and a simplified version of the NN in Table 1 are needed to implement this function. The above results show that our NN design is more suitable for inverse TOV mapping under isospin-dependent parametric EOS conditions.

Figure 3. (color online) The EOS predictions of the NN from 4 sampled points for the $ L-K_{\mathrm{sym}} $ panel (the EOS parameters are $ L = 40.5\ \mathrm{MeV} $, $ K_{\mathrm{sym}} = 80.5\ \mathrm{MeV} $, $ J_{\mathrm{sym}} = 0\ \mathrm{MeV} $, $ J_0 = 0\ \mathrm{MeV} $, $ K_0 = 240\ \mathrm{MeV} $, $ E_{\mathrm{sym}}\left(\rho _0 \right) = 31.7\ \mathrm{MeV} $). The solid black line indicates the original EOS from which the sample points shown as red dots were generated. The blue dots represent the EOS data output by the NN. The inset shows the NN sampling from the mass-radius sequence (greater than 1 $ M_\odot $).

Mass ($ M_\odot $)    Radius/km
1.650030              11.8838
1.916370              11.6289
2.046519              11.3741
2.201511              10.3368

Table 2. Mass-radius sampling points for Fig. 3.
Compared with Fig. 2, Fig. 3 demonstrates a greater accuracy of the NN predictions after reducing the parameter space by two dimensions. The fact that other kinds of EOSs can also be predicted suggests that this NN prediction has a certain universality.
-
The NN, one of the most common algorithms in ML, is now used in almost every aspect of scientific research and engineering [57]. Based on the universal approximation theorem, an NN can implement complex nonlinear mappings [58, 59]. Its construction is mainly divided into an input layer, hidden layers, and an output layer. Each neuron can be treated as a container of numbers. Similar to the transmission of electrical signals between synapses, neurons pass numbers as signals between them. Starting from the input layer, each neuron of a layer combines the outputs of the previous layer as
$ z^{l}=w_1a_{1}^{l-1}+\cdots +w_na_{n}^{l-1}+b, \tag{A1} $
where w represents the weights, l the layer index, and b the bias [60]. The calculated z is then substituted into the activation function to obtain the value that the neuron passes to the next layer:
$ a_{j}^{l}=f\left( z^{l} \right), \tag{A2} $
where $ f(x) $ is called the activation function and is usually a Sigmoid or ReLU function, the latter being adopted in this work. Summarizing the operations of a whole layer of neurons, we obtain the following expression:
$ f \left( \left[ \begin{matrix} w_{0,0} & w_{0,1} & \cdots & w_{0,n}\\ w_{1,0} & w_{1,1} & \cdots & w_{1,n}\\ \vdots & \vdots & \ddots & \vdots\\ w_{k,0} & w_{k,1} & \cdots & w_{k,n}\\ \end{matrix} \right] \left[ \begin{array}{c} a_{0}^{0}\\ a_{1}^{0}\\ \vdots\\ a_{n}^{0}\\ \end{array} \right] +\left[ \begin{array}{c} b_0\\ b_1\\ \vdots\\ b_k\\ \end{array} \right] \right) = {\rm output}. \tag{A3} $
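A compact NumPy illustration of Eqs. (A1)-(A3) for a single layer is sketched below; the layer sizes, weights, and biases are random placeholders rather than values from the trained network.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    """ReLU activation f(z) = max(0, z), as adopted in this work."""
    return np.maximum(0.0, z)

# one layer with n = 4 inputs and k = 3 neurons (placeholder sizes)
a_prev = rng.random(4)          # activations a^{l-1} from the previous layer
W = rng.normal(size=(3, 4))     # weight matrix of Eq. (A3)
b = rng.normal(size=3)          # bias vector

z = W @ a_prev + b              # Eq. (A1) for the whole layer at once
a = relu(z)                     # Eq. (A2): values passed to the next layer
print(a)
```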
At this point, the NN has completed one forward propagation. We also need a metric to evaluate how well the output compares with the true value. This type of evaluation metric is called the loss function, for which we usually use the mean squared error ($ MSE=\dfrac{1}{n}\sum_{i=1}^{n} {\left(Y_{i}-\widehat{Y_{i}} \right) ^2} $). The MSE of each batch acts as a blueprint that guides the optimizer in continuously adjusting the huge number of weights (w) and biases (b) in the NN. Commonly used optimizers include Stochastic Gradient Descent (SGD) and Adam [61], the latter being adopted in this study.

The platform used for implementing our NN is Keras, with TensorFlow as its backend [62, 63]. With TensorFlow as the backend, it integrates the features of the Compute Unified Device Architecture (CUDA) for parallel computing on GPUs and provides a rich interface. The training of the NN was performed on an NVIDIA GeForce GTX 1650, which significantly reduces the time cost of computing while implementing a complex NN. To prevent overfitting, Dropout was set to 0.3 between layers. Furthermore, we chose an initial learning rate of 0.0003, a batch size of 500, and 800 epochs. A fraction of 0.1 of the training dataset was set aside for validation. The hidden layers are fully connected. Based on the above conditions, we designed an NN with six hidden layers to achieve inverse TOV mapping. All data fed to the NN are normalized, in this study as (M/3, R/35) for the input and (($\log _{10}p $)/40, $ (\log _{10}\rho)/20 $) for the output. The EOS is taken on a logarithmic scale to avoid very large differences in pressure magnitude between density intervals, which could degrade the prediction accuracy of the NN. For ease of calculation, P and ρ are expressed in $\mathrm {MeV\cdot fm^{-3}}$ and $\mathrm {kg\cdot m^{-3}}$, respectively, before taking the logarithm.
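The scaling just described can be collected into small helper functions, sketched below. The split of the 182 outputs into 91 pressure and 91 density values, and the ordering of the flattened vectors, are assumptions for illustration.

```python
import numpy as np

def normalize_input(masses, radii):
    """Scale 40 (M, R) pairs to the (M/3, R/35) convention and flatten to 80 inputs."""
    return np.concatenate([np.asarray(masses) / 3.0,
                           np.asarray(radii) / 35.0])

def denormalize_output(y):
    """Map the 182 network outputs back to P (MeV fm^-3) and rho (kg m^-3)."""
    y = np.asarray(y)
    log_p, log_rho = y[:91] * 40.0, y[91:] * 20.0   # undo (log10 P)/40 and (log10 rho)/20
    return 10.0 ** log_p, 10.0 ** log_rho

x = normalize_input(np.linspace(1.0, 2.1, 40), np.linspace(11.0, 13.0, 40))
print(x.shape)   # (80,)
```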
From masses and radii of neutron stars to EOS of nuclear matter through neural network
- Received Date: 2023-08-05
- Available Online: 2024-02-15
Abstract: The equation of state (EOS) of dense nuclear matter is a key factor in determining the internal structure and properties of neutron stars. However, the EOS of high-density nuclear matter has great uncertainty, mainly because terrestrial nuclear experiments cannot reproduce matter as dense as that in the inner core of a neutron star. Fortunately, continuous improvements in astronomical observations of neutron stars provide the opportunity to inversely constrain the EOS of high-density nuclear matter. Several methods have been proposed to implement this inverse constraint, including the Bayesian analysis algorithm, Lindblom's approach, and so on. The neural network algorithm is an effective method developed in recent years. By employing a set of isospin-dependent parametric EOSs as the training sample of a neural network algorithm, we set up an effective way to reconstruct the EOS with relative accuracy using a few mass-radius data. Based on the obtained neural network algorithm and according to the NICER observations of masses and radii of neutron stars with assumed precision, we obtain the inversely constrained EOS and further calculate the corresponding macroscopic properties of the neutron star. The results are basically consistent with the constraint on the EOS in Huth et al. [Nature 606, 276 (2022)] based on Bayesian analysis. Moreover, the results show that even though the neural network algorithm was obtained using a finite set of parameterized EOSs as the training set, it is valid for any rational parameter combination of the parameterized EOS model.