CNNPixelSeedsProducerTool - Example workflow using ML in track reconstruction with CMS 2018 simulated data
Di Florio, Adriano
Di Florio, Adriano; Pantaleo, Felice; Pierini, Maurizio
CNNPixelSeedsProducerTool - Example workflow using ML in track reconstruction with CMS 2018 simulated data. CERN Open Data Portal.
An example workflow for producing datasets that can be used to develop machine learning algorithms for selecting and filtering pixel doublet seeds in tracking applications with CMS 2018 simulated data. The code can be run inside the CMS Open Data environment.
One of the first steps of the track-finding workflow is the creation of track seeds, i.e. compatible pairs of hits from different detector layers, which are subsequently fed to higher-level pattern recognition steps. However, the set of compatible hit pairs is heavily affected by combinatorial background, so the next steps of the tracking algorithm have to process a significant fraction of fake doublets. For each event, $O(10^6)$ doublets are produced while only $O(10^3)$ are genuine, giving a ratio of fake to genuine doublets of $O(10^3)$. A possible way of reducing this effect is to use Machine Learning and Deep Learning techniques to check the compatibility between two hits. Indeed, fake rejection can be seen as a typical classification problem, for which neural networks and MVA methods have been widely shown to provide reliable results. The dataset provided is intended to be used to explore these techniques.
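To make the orders of magnitude above concrete, the following minimal sketch computes the fake-to-genuine ratio for the quoted doublet counts and shows how a classifier would change it. The efficiency and fake-rejection values are hypothetical placeholders chosen for illustration; they are not results from this workflow.

```python
def fake_ratio(n_doublets, n_genuine):
    """Number of fake doublets per genuine doublet."""
    return (n_doublets - n_genuine) / n_genuine


def after_filter(n_doublets, n_genuine, efficiency, fake_rejection):
    """Doublets surviving a classifier cut.

    efficiency     : fraction of genuine doublets kept (assumed value)
    fake_rejection : fraction of fake doublets discarded (assumed value)
    """
    n_fake = n_doublets - n_genuine
    kept_genuine = n_genuine * efficiency
    kept_fake = n_fake * (1.0 - fake_rejection)
    return kept_genuine, kept_fake


# Orders of magnitude quoted in the text.
n_doublets = 1e6
n_genuine = 1e3
ratio_before = fake_ratio(n_doublets, n_genuine)

# Hypothetical classifier performance, for illustration only.
kept_genuine, kept_fake = after_filter(n_doublets, n_genuine,
                                       efficiency=0.99, fake_rejection=0.90)
ratio_after = kept_fake / kept_genuine

print(f"fakes per genuine doublet before filtering: {ratio_before:.0f}")
print(f"fakes per genuine doublet after filtering:  {ratio_after:.0f}")
```

Under these assumed numbers, rejecting 90% of the fakes lowers the ratio by roughly one order of magnitude while keeping almost all genuine doublets.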
The workflow provided produces a dataset consisting of a collection of pixel doublet seeds, i.e. the hit pairs that could belong to the same particle. The compatibility between two hits is evaluated only on the basis of geometrical considerations, such as cuts in $\eta$, $\phi$ and $r$. These doublets are the building blocks for the tracks reconstructed in later steps. Each doublet is characterised by a set of features, such as the coordinates of its hits, the charge released in the Pixel detector, and the pixel cluster shape projected onto a 2D histogram.
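The geometrical pairing described above can be sketched as follows. This is not the CMS seeding code: the function names, the hit layout (one `(x, y, z)` row per hit) and the cut values `max_dphi`, `max_deta` and `min_dr` are all assumptions made for illustration.

```python
import numpy as np


def cylindrical(hits):
    """Convert an (N, 3) array of (x, y, z) positions to (r, phi, eta)."""
    x, y, z = hits[:, 0], hits[:, 1], hits[:, 2]
    r = np.hypot(x, y)        # transverse radius
    phi = np.arctan2(y, x)    # azimuthal angle in (-pi, pi]
    eta = np.arcsinh(z / r)   # pseudorapidity measured from the origin
    return r, phi, eta


def make_doublets(inner, outer, max_dphi=0.05, max_deta=0.2, min_dr=1.0):
    """Return index pairs (i_inner, j_outer) passing purely geometric cuts.

    All thresholds are hypothetical placeholders, not the CMS values.
    """
    r_in, phi_in, eta_in = cylindrical(inner)
    r_out, phi_out, eta_out = cylindrical(outer)

    # Pairwise differences; rows index inner hits, columns index outer hits.
    dphi = phi_out[None, :] - phi_in[:, None]
    dphi = (dphi + np.pi) % (2.0 * np.pi) - np.pi  # wrap into [-pi, pi)
    deta = eta_out[None, :] - eta_in[:, None]
    dr = r_out[None, :] - r_in[:, None]

    mask = (np.abs(dphi) < max_dphi) & (np.abs(deta) < max_deta) & (dr > min_dr)
    return np.argwhere(mask)


# Toy hits on two layers (arbitrary units), for illustration only.
inner_hits = np.array([[3.0, 0.0, 1.0],
                       [0.0, 3.0, -2.0]])
outer_hits = np.array([[7.0, 0.0, 2.3],
                       [0.0, 7.0, -4.7],
                       [-7.0, 0.0, 0.0]])

pairs = make_doublets(inner_hits, outer_hits)
print(pairs)
```

With these toy hits, each inner hit is paired only with the outer hit lying along roughly the same direction from the origin; the third outer hit is on the opposite side in $\phi$ and forms no doublet. A selection this loose keeps the genuine pairs but also admits many fakes in a real event, which is why a classifier on the doublet features is of interest.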