As of today, data from 100 trillion proton collisions are public on the Open Data Portal! This marks the world's first open release of 8 TeV data, gathered at the Large Hadron Collider in 2012.
Given its strong emphasis on learning, the release of ATLAS Open Data is accompanied by additional resources available via the ATLAS educational platform. The resources guide users through the data, explain how to visualise, download and use them, and even provide open-source software. As part of this, ATLAS has made seven physics analyses available to help users get started with their research. By releasing additional resources and comprehensive documentation, ATLAS aims to bridge the gap between physicists and the public.
Today, the CMS Collaboration at CERN has released more than 300 terabytes (TB) of high-quality open data. These include over 100 TB, or 2.5 inverse femtobarns (fb⁻¹), of data from proton collisions at 7 TeV, making up half the data collected at the LHC by the CMS detector in 2011. This follows a previous release from November 2014, which made available around 27 TB of research data collected in 2010.
Available on the CERN Open Data Portal, which is built in collaboration with members of CERN’s IT Department and Scientific Information Service, the collision data are released into the public domain under the CC0 waiver and come in two types. The so-called “primary datasets” are in the same format used by the CMS Collaboration to perform research. The “derived datasets”, on the other hand, require a lot less computing power and can be readily analysed by university or high-school students; CMS has provided a limited number of datasets in this format.
The dataset from the ATLAS Higgs Machine Learning Challenge has been released on the CERN Open Data Portal.
The Challenge, which ran from May to September 2014, was to develop an algorithm that improved the detection of the Higgs boson signal. The specific sample used simulated Higgs particles decaying into two tau particles inside the ATLAS detector. The downloadable sample was provided to participants on the host platform, Kaggle’s website. With 1,785 teams competing, the event was a huge success. Participants applied and developed cutting-edge machine learning techniques, which have been shown to be better than existing traditional high-energy physics tools.
The dataset was removed at the end of the Challenge but, due to high public demand, ATLAS, as organiser of the event, has decided to house it in the CERN Open Data Portal, where it will be available permanently. The 60 MB zipped ASCII file can be decoded without special software, and a few scripts are provided to help users get started. Detailed documentation for physicists and data scientists is also available. Thanks to the Digital Object Identifiers (DOIs) in the CERN Open Data Portal, the dataset and accompanying material can be cited like any other paper.
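To illustrate what "without special software" can mean in practice, here is a minimal sketch that reads a compressed text file using only the Python standard library. It assumes gzip compression and comma-separated values with a header row; the announcement says only "zipped ASCII", so both are assumptions. The function name read_events and the column names event_id and label are hypothetical placeholders, not the dataset's actual schema.

```python
import csv
import gzip
import io


def read_events(raw_bytes, limit=5):
    """Decompress gzip bytes and parse them as CSV with a header row.

    Returns at most `limit` rows, each as a dict keyed by column name.
    Assumption: the file is gzip-compressed, comma-separated ASCII text.
    """
    with gzip.open(io.BytesIO(raw_bytes), mode="rt", newline="") as text:
        reader = csv.DictReader(text)
        # zip() with range() stops after `limit` rows without reading the rest.
        return [row for _, row in zip(range(limit), reader)]


# Tiny in-memory stand-in for a downloaded file (hypothetical columns).
sample = gzip.compress(b"event_id,label\n1,s\n2,b\n")
for row in read_events(sample):
    print(row["event_id"], row["label"])
```

For a file on disk, the same approach works by passing a path to gzip.open instead of an in-memory buffer.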
The Challenge’s winner Gábor Melis and recipients of the Special High Energy Physics meets Machine Learning Award, Tianqi Chen and Tong He, will be visiting CERN to deliver talks on their winning algorithms on 19 May.
ALICE is making a public release of a number of datasets customised for demonstration and educational purposes. A number of reconstructed data files, which are not statistically representative, are included to allow plotting of transverse momentum and pseudorapidity distributions. More advanced analyses and tools, as well as larger data samples, will be available in future releases.
At this stage we are making available a set of outreach and educational analysis exercises. These are based on specifically selected ALICE data and are widely used in particle physics masterclasses. The exercises use simplified tools that nevertheless give a feel for the real tools physicists employ for data analysis, and come in the form of analysis packages and small datasets organised as ROOT files. Each analysis downloads the required software and data on demand from a common graphical interface.
The documentation accompanying these exercises contains both a physics introduction and instructions for running the programs, which highlight some of the ALICE physics. One exercise is the search for particles containing strange quarks, based on their V0 decays; the motivation is to give an insight into strangeness enhancement, one of the first signatures of the Quark-Gluon Plasma (QGP). Another exercise looks at charged-particle tracks; the aim is to calculate the nuclear modification factor RAA by comparing particle yields in lead-lead and proton-proton collisions. An RAA of less than one indicates the suppression of charged particles due to interactions of partons with the QGP.
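As a rough illustration of the RAA comparison, the sketch below uses the commonly quoted definition: the per-event yield in lead-lead collisions divided by the proton-proton yield scaled by the average number of binary nucleon-nucleon collisions. The function name, the per-bin list layout and all numbers are made up for illustration; they are not ALICE measurements, and the masterclass exercise itself may organise the calculation differently.

```python
def nuclear_modification_factor(yield_aa, yield_pp, n_coll):
    """Return R_AA per transverse-momentum bin.

    R_AA = yield_AA / (n_coll * yield_pp), where
      yield_aa : per-event particle yields in lead-lead collisions, one per bin
      yield_pp : per-event particle yields in proton-proton collisions, same bins
      n_coll   : average number of binary nucleon-nucleon collisions
    """
    if n_coll <= 0:
        raise ValueError("n_coll must be positive")
    if len(yield_aa) != len(yield_pp):
        raise ValueError("yield lists must use the same binning")
    return [aa / (n_coll * pp) for aa, pp in zip(yield_aa, yield_pp)]


# Illustrative, invented numbers only.
pb_pb_yields = [1600.0, 200.0, 25.0]
pp_yields = [2.0, 0.5, 0.125]
raa = nuclear_modification_factor(pb_pb_yields, pp_yields, n_coll=1000.0)
for value in raa:
    status = "suppressed" if value < 1.0 else "not suppressed"
    print(f"R_AA = {value:.2f} ({status})")
```

If lead-lead collisions behaved like an independent superposition of nucleon-nucleon collisions, every bin would give a value of one; values below one correspond to the suppression described above.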
The Compact Muon Solenoid (CMS) Collaboration at CERN is excited to announce the public release of the first batch of high-level, analysable and open data from the Large Hadron Collider (LHC), recorded by the CMS detector. The datasets are available on the new CERN Open Data Portal and are being released into the public domain under the Creative Commons CC0 waiver, in keeping with CMS’s commitment to data preservation and open data.