Fund Project:Project supported by the National Natural Science Foundation of China (Grant Nos. 11675033, 12075043)
Received Date:18 September 2020
Accepted Date:13 November 2020
Available Online:06 March 2021
Published Online:20 March 2021
Abstract:The jet tagging task in high-energy physics is to distinguish signals of interest from the background, which is of great importance for the discovery of new particles or new processes at the Large Hadron Collider. The energy deposition in the calorimeter can be viewed as an image. On this basis, tagging jets initiated by different processes becomes a classic image classification task in computer vision. We use jet images, built from high-dimensional low-level information (energy-momentum four-vectors), as the input to explore the potential of convolutional neural networks (CNNs). Four models of different depths are designed to make the best use of the underlying features of jet images. A traditional multivariate method, the boosted decision tree (BDT), is used as a baseline against which the performance of the networks is measured. We feed four observables into the BDTs: the mass and transverse momenta of fat jets, the distance between the leading and subleading jets, and N-subjettiness. Three kinds of BDTs are built with different numbers of trees, with the intention of giving them different classification abilities. After training and testing, the results show that CNN 3 is the most compact and efficient network among the designs based on stacked convolutional layers. Deepening the model improves the performance to a certain extent, but not in every case. The performances of all BDTs are almost the same, possibly because of the small number of input observables. The performance metrics show that the CNNs outperform the BDTs: the background rejection efficiency increases by up to 150% at 50% signal efficiency.
Besides, after inspecting the best and the worst samples, we summarize the characteristics of jets initiated by different processes: jets from Z boson decays tend to be concentrated in the center of the jet image or to have a clearly distinguishable substructure, whereas the substructures of jets from general quantum chromodynamics processes take more random forms and are not limited to two subjets. Finally, the confusion matrix of CNN 3 indicates that the network tends to be somewhat conservative. Exploring how to keep the balance between conservative and radical behavior is the goal of our future work.
Keywords:decays of Z bosons/ quarks/ gluons/ neural network
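The abstract quotes the background rejection at 50% signal efficiency as its headline metric. The sketch below shows one way such a number could be computed from classifier scores. It assumes the common convention that background rejection is the inverse of the background efficiency (the fraction of background jets passing the cut); the abstract does not spell out its definition. The function name `background_rejection` and the toy scores are hypothetical and are not taken from the paper.

```python
def background_rejection(signal_scores, background_scores, signal_efficiency=0.5):
    """Return 1 / (background efficiency) at the score cut that keeps
    roughly `signal_efficiency` of the signal (assumed convention)."""
    if not 0.0 < signal_efficiency <= 1.0:
        raise ValueError("signal_efficiency must be in (0, 1]")
    # Sort signal scores from most to least signal-like.
    ranked = sorted(signal_scores, reverse=True)
    # Number of signal jets that must pass the cut.
    n_keep = max(1, round(signal_efficiency * len(ranked)))
    threshold = ranked[n_keep - 1]
    # Background jets that pass the same cut are mistagged as signal.
    n_mistag = sum(1 for s in background_scores if s >= threshold)
    if n_mistag == 0:
        return float("inf")
    return len(background_scores) / n_mistag


# Toy scores, invented for illustration only.
toy_signal = [0.95, 0.90, 0.80, 0.40]
toy_background = [0.92, 0.30, 0.20, 0.10, 0.05, 0.02, 0.01, 0.15]
rejection = background_rejection(toy_signal, toy_background, 0.5)
print(rejection)
```

With real samples the cut would be determined on a large test set, and the same function could be evaluated for each CNN and BDT to compare them at a common signal efficiency.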
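Here, $ i $ denotes the input class represented by the output neuron, with 0 for background and 1 for signal, and $ o $ denotes the output of the neuron itself. We select the signal neuron and examine the output distributions obtained for inputs of the different classes, as shown in Fig. 5. In the figure, most of the signal outputs are concentrated near 1, while the background outputs are concentrated between 0 and about 0.3, so the model separates the two classes well.
Figure 5. Distribution of the output of the CNN 3 signal neuron for signal (orange) and background (blue) samples.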
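The equation that defines $ i $ and $ o $ is not reproduced in this excerpt. As an illustration only, the sketch below assumes it is the standard softmax over the two output neurons, which maps the raw outputs to scores between 0 and 1. This is an assumption, not something the text confirms, and the raw output values are made up.

```python
import math


def softmax(outputs):
    """Map raw neuron outputs o_i to normalized scores that sum to 1."""
    # Subtract the maximum for numerical stability; the result is unchanged.
    shift = max(outputs)
    exps = [math.exp(o - shift) for o in outputs]
    total = sum(exps)
    return [e / total for e in exps]


# Index 0 is background and index 1 is signal, as in the text.
# The raw outputs below are made-up numbers for illustration.
raw_outputs = [0.3, 2.1]
scores = softmax(raw_outputs)
signal_score = scores[1]
print(signal_score)
```

Under this assumption, a histogram of `signal_score` over the signal and background test samples would give a distribution of the kind described for Fig. 5.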
Figure 7. The best and the worst background jet images.
Figure 8. Confusion matrix of CNN 3 on the test set. The true label is on the vertical axis, and the predicted label is on the horizontal axis.
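A confusion matrix with the axis layout described in the Figure 8 caption can be tallied as in the minimal sketch below. The function name and the toy labels are hypothetical; the actual counts for CNN 3 are not given in this excerpt.

```python
def confusion_matrix(true_labels, predicted_labels, n_classes=2):
    """Count jets per (true class, predicted class) pair.

    Rows index the true label and columns the predicted label,
    matching the axis layout described in the Figure 8 caption.
    """
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix


# Toy labels (0 = background, 1 = signal), invented for illustration.
y_true = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred = [0, 0, 1, 1, 0, 1, 0, 0]
cm = confusion_matrix(y_true, y_pred)
print(cm)
```

The off-diagonal entries show which kind of misclassification dominates: `cm[0][1]` counts background jets tagged as signal, and `cm[1][0]` counts signal jets tagged as background.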