2020-12-21 by CMS Collaboration
The CMS experiment at CERN has released into the public domain its first batches of open data from heavy-ion collisions at the Large Hadron Collider. The two batches contain two datasets from 2010 and four from 2011, recorded during the LHC’s first data-taking period (“Run 1”), along with eight sets of reference data from proton–proton collisions recorded at the same collision energy. The total disk volume of the release is 214 TB.
Famous for the proton–proton collisions that resulted in the discovery of the Higgs boson, the LHC also collides nuclei of heavy elements (heavy ions) for a few weeks each year, complementing data from proton–proton interactions. Heavy-ion collisions at the LHC occur at the highest energy achieved in the laboratory. They can recreate the conditions of the very early universe, when quarks and gluons, which are normally bound together to form particles such as protons and neutrons, existed freely in a soup-like state known as a quark–gluon plasma. Scientists at the LHC can use such collisions to understand these early-universe conditions and study the interaction of other particles with the quark–gluon plasma.
In 2010 and 2011, the heavy-ion collisions were between nuclei of lead. Using these data, CMS could observe several signatures of the quark–gluon plasma, including the imbalance between the momenta of the two jets of particles produced in a pair, the suppression (“quenching”) of particle jets in jet–photon pairs and the “melting” of certain composite particles.
CMS has already released 100% of the proton–proton data the collaboration recorded in 2010 and 2011. Therefore, with the release of the lead–lead data, all of the data CMS recorded in the first two years of LHC operation are now in the public domain, in accordance with the collaboration’s open-data policy.
Analysing data from the LHC requires specialised software and analysis environments. CMS is providing these in the form of Virtual Machine images as well as Docker container images, which come with all the necessary software. Further, to help new users of the data familiarise themselves with the kinds of analysis that can be performed, CMS has also provided usage examples. Those interested in specific advice can ask for it on the official CERN Open Data forum.
“Preparing this release presented unique challenges to CMS,” notes Kati Lassila-Perini, who has led the CMS Data Preservation and Open Access team since 2012. “The heavy-ion community on CMS is quite small and a lot of the knowledge from a decade ago has been difficult to track down. Nevertheless, thanks to the help of our colleagues, we’re thrilled that we can share these interesting and valuable datasets with the world.”
Since this is the first release of heavy-ion data from CMS, the collaboration is looking forward to receiving input and feedback from the wider community of high-energy physicists.
“We’ve seen with the data we released earlier how useful the feedback from the open-data users can be, and we will do our best to address eventual shortcomings reported to us,” adds Lassila-Perini.
As before, all of the CMS open data are released into the public domain under the Creative Commons CC0 waiver via the CERN Open Data portal. The portal is openly developed on GitHub by the CERN Information Technology team in cooperation with the experimental collaborations. CMS would like to thank CERN for providing resources and expertise to build and maintain the portal. We would also like to acknowledge the continuous effort of many of our collaboration members who have helped us release this latest batch of CMS open data.