Familiarise yourself with the working environments and tools needed to analyse the data provided on this portal.

Getting started with CMS 2010 data

"I have installed the CERN Virtual Machine: now what?"

To analyse CMS data collected in 2010, you need version 4.2.8 of CMSSW, supported only on Scientific Linux 5. If you are unfamiliar with Linux, take a look at this short introduction to Linux or try this interactive command-line bootcamp. Once you have installed the CMS-specific CERN Virtual Machine, execute the following command in the terminal if you haven't done so before; it ensures that you have this version of CMSSW running:

$ cmsrel CMSSW_4_2_8

Then, make sure that you are always in the CMSSW_4_2_8/src/ directory by entering the following command in the terminal (you must do so every time you boot the VM before you can proceed):

$ cd CMSSW_4_2_8/src/

"OK! Where can I get the CMS data?"

It is best if we start off with a quick introduction to ROOT. ROOT is the framework used by several particle-physics experiments to work with the collected data. Although analysis is not itself performed within the ROOT GUI, it is instructive to understand how these files are structured and what data and collections they contain.

The primary data provided by CMS on the CERN Open Data Portal is in a format called "Analysis Object Data" or AOD for short. These AOD files are prepared by piecing together the raw data collected by the various sub-detectors of CMS and contain all the information that is needed for analysis. The files cannot be opened and understood as simple data tables but require ROOT in order to be read.

So, let's see what an AOD file looks like and take ROOT for a spin!

Making sure that you are in the CMSSW_4_2_8/src/ folder, execute the following command in your terminal to launch the CMS analysis environment:

$ cmsenv

You can now open a CMS AOD file in ROOT. Let us open one of the files from the CERN Open Data Portal by entering the following command:

$ root root://eospublic.cern.ch//eos/opendata/cms/Run2010B/Mu/AOD/Apr21ReReco-v1/0000/00459D48-EB70-E011-AF09-90E6BA19A252.root

You will see the ROOT logo appear on screen. You can now open the ROOT GUI by entering:

TBrowser t

Excellent! You have successfully opened a CMS AOD file in ROOT. If this was the first time you've done so, pat yourself on the back. Now, to see what is inside this file, let us take a closer look at some collections of physics objects.

On the left window of ROOT (see the screenshot below), double-click on the file name (root://eospublic.cern.ch//eos/opendata/…). You should see a list of entries under Events, each corresponding to a collection of reconstructed data. We are interested in the collections containing information about reconstructed physics objects.
Screenshot: After running 'TBrowser t'

Let us take a peek, for example, at the electrons, which are found in recoGsfElectrons_gsfElectrons__RECO, as shown on the list of physics objects. Look in there by double-clicking on that line and then double-clicking on recoGsfElectrons_gsfElectrons__RECO.obj. Here, you can have a look at various properties of this collection, such as the plot for the transverse momentum of the electrons: recoGsfElectrons_gsfElectrons__RECO.obj.pt_.
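
If you prefer to work outside the browser, you can plot the same quantity from a Python session instead. The following is only a minimal sketch: it assumes that python and PyROOT are available after cmsenv (as they normally are in a CMSSW environment), and it uses the same branch expression you just saw in the TBrowser:

# plot_electron_pt.py -- minimal PyROOT sketch (run with: python plot_electron_pt.py)
import ROOT

# open the same AOD file directly over the network
f = ROOT.TFile.Open(
    "root://eospublic.cern.ch//eos/opendata/cms/Run2010B/Mu/AOD/Apr21ReReco-v1/0000/00459D48-EB70-E011-AF09-90E6BA19A252.root"
)

events = f.Get("Events")  # the tree with one entry per collision event
# draw the electron transverse momentum, i.e. the leaf inspected in the TBrowser
events.Draw("recoGsfElectrons_gsfElectrons__RECO.obj.pt_")
ROOT.gPad.SaveAs("electron_pt.png")  # save the resulting plot to a file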

You can exit the ROOT browser through the GUI by clicking on Browser on the menu and then clicking on Quit Root or by entering .q in the terminal.

"Nice! But how do I analyse these data?"

In AOD files, reconstructed physics objects are included without checking their "quality", i.e. in the case of the electron collection you opened in ROOT, without ensuring that the reconstructed object really is an electron. In order to analyse only the "good quality" data, you must apply some selection criteria.

With these criteria, you are in effect reducing the dataset, either in terms of the number of collision events it contains or in terms of the information carried by each event. Following this, you run your analysis code on the reduced dataset.

Depending on the nature of your analysis, you can, if needed, run your analysis code directly on the AOD files themselves, performing the selections along the way. However, this can be resource-intensive and is done only for very specific use cases.
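
For orientation, one lightweight way to loop over the events of an AOD file from Python is the FWLite interface of CMSSW. This is only a sketch, not the approach used in the portal examples; it assumes cmsenv has been run, and it uses the gsfElectrons collection you just looked at in the TBrowser:

# fwlite_peek.py -- minimal FWLite sketch (run with: python fwlite_peek.py)
from DataFormats.FWLite import Events, Handle

events = Events("root://eospublic.cern.ch//eos/opendata/cms/Run2010B/Mu/AOD/Apr21ReReco-v1/0000/00459D48-EB70-E011-AF09-90E6BA19A252.root")
handle = Handle("std::vector<reco::GsfElectron>")

for i, event in enumerate(events):
    event.getByLabel("gsfElectrons", handle)       # the electron collection seen in the TBrowser
    for electron in handle.product():
        print("electron pt = %.2f GeV" % electron.pt())
    if i >= 9:                                     # look at the first ten events only
        break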

NOTE: To analyse the full event content, the analysis job needs access to the "condition data", such as the jet-energy corrections. Connections to the condition database are established by the CERN Virtual Machine needed to analyse CMS data from 2010. (To see how the connection to the condition database is established to analyse CMS data from 2011, you can check the "pattuples2011" example.) For simpler analyses, where we use only physics objects needing no further data for corrections, you do not need to connect to the condition database. This is the case for the example for analysing the primary datasets below.

Your final analysis is done using a software module called an "analyzer". If you have followed the validation step for the virtual machine setup, you have already produced and run a simple analyzer. You can specify your initial selection criteria within the analyzer to perform your analysis directly on the AOD files, or further elaborate the selections and other operations needed for analysing the reduced dataset. To learn more about configuring analyzers, follow these instructions in the CMSSW WorkBook. Make sure, though, that you replace the release version (CMSSW_nnn) with the release that you are using, i.e. one that is compatible with the CMS open data.

You can also pass the selection criteria through the configuration file. This file activates existing tools within CMSSW in order to perform the desired selections. If you have followed the validation step for the virtual machine setup, you have already seen a configuration file, which is used to give the parameters to the cmsRun executable. You can see how this is done in our analysis example.
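
For orientation, a configuration file for cmsRun has roughly the following shape. This is only a sketch: the analyzer label DemoAnalyzer is the one used in the validation example, and you would adapt the input file, the number of events and the modules to your own analysis:

# demoanalyzer_cfg.py -- sketch of a minimal cmsRun configuration
import FWCore.ParameterSet.Config as cms

process = cms.Process("Demo")
process.load("FWCore.MessageService.MessageLogger_cfi")

# process only a few events for a quick test (-1 means all events)
process.maxEvents = cms.untracked.PSet(input = cms.untracked.int32(100))

# input: one of the 2010 AOD files from the portal
process.source = cms.Source("PoolSource",
    fileNames = cms.untracked.vstring(
        'root://eospublic.cern.ch//eos/opendata/cms/Run2010B/Mu/AOD/Apr21ReReco-v1/0000/00459D48-EB70-E011-AF09-90E6BA19A252.root'
    )
)

# the analyzer module built in the validation step (label assumed)
process.demo = cms.EDAnalyzer('DemoAnalyzer')

process.p = cms.Path(process.demo)

You would then run it with cmsRun demoanalyzer_cfg.py.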

We will now take you through these steps with a couple of specially prepared example analyses.

Option A: Analysing the primary dataset

As mentioned above, you do not typically perform an analysis directly on the AOD files. However, there may be cases when you can do so. Therefore, we have provided an example analysis to take you through the steps that you may need on the occasions that you want to analyse the AOD files directly. You can find the files and instructions in this CMS Tools entry.

Option B: Analysing reduced datasets

We start by applying selection cuts via the configuration file and reduce the AOD files into a format known as PATtuple. You can find more information about this data format (which gets its name from the CMS Physics Analysis Toolkit, or PAT) on the CMSSW PAT WorkBook.

Important: Be aware that the instructions in the WorkBook are in use in CMS currently and have been updated for more recent CMSSW releases. With the 2010 data, you should always use releases in the CMSSW_4_2 series and not higher. Also note that more recent code does not work with older releases, so whenever you see git cms-addpkg… in the instructions, it is likely that the code package this command adds does not work with the release you need. However, the material on those pages gives you a good introduction to PAT.

Code as well as instructions for producing PATtuples from the CMS open data can be found in this GitHub repo. However, since it took a dedicated computing cluster nine days (!!!) to run this step and reduce the several TB of AOD files to a few GB of PATtuples, we have provided you with the PATtuples in that GitHub repo, saving you quite a lot of time! So you can jump to the next step, below ("Performing your analysis…"). Although you do not need to run this step, it is worth looking at the configuration file:

You can see that the line removeAllPATObjectsBut(process, ['Muons','Electrons']) removes all "PATObjects" except muons and electrons, which will be needed in the final analysis step of this example.
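
In the configuration, this helper is applied to the process object after the standard PAT sequences have been set up; schematically (a sketch, with the import path as provided by the PAT tools of that release):

# sketch: keep only the PAT muon and electron collections
from PhysicsTools.PatAlgos.tools.coreTools import removeAllPATObjectsBut
removeAllPATObjectsBut(process, ['Muons', 'Electrons'])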

Note also how only the validated runs are selected with the following lines:

import FWCore.ParameterSet.Config as cms
import PhysicsTools.PythonAnalysis.LumiList as LumiList
myLumis = LumiList.LumiList(filename='Cert_136033-149442_7TeV_Apr21ReReco_Collisions10_JSON_v2.txt').getCMSSWString().split(',')
process.source.lumisToProcess = cms.untracked.VLuminosityBlockRange()
process.source.lumisToProcess.extend(myLumis)

This selection must always be applied to any analysis on CMS open data, and to do so you must have the validation file downloaded to your local area.

You can also see how the correct set of condition data are defined by mentioning the Global Tag on lines 45–46 in the file PAT_data_repo.py.

Performing your analysis on the PATtuples

Now, as the intermediate PATtuple files have been produced for you, you can go directly to the next step, as described in this second GitHub repo and follow the instructions on that page.

    Note that even though these are derived datasets, running the analysis code over the full data can take several hours. So if you just want to give it a try, you can limit the number of events or read only part of the files. Bear in mind that running on a low number of files will not give you a meaningful plot.

    Your analysis job is defined in OutreachExercise2010/DecaysToLeptons/run/run.py. The analysis code is in the files located in the OutreachExercise2010/DecaysToLeptons/python directory.

    This example uses IPython; the job is configured and started with the following command:

ipython run.py

That's it! Follow the rest of the instructions on the README and you have performed an analysis using data from CMS. Hope you enjoyed this exercise. Feel free to play around with the rest of the data and write your own analyzers and analysis code. (To exit IPython, enter exit().)

Getting started with CMS 2011 data

"I have installed the CERN Virtual Machine: now what?"

To analyse CMS data collected in 2011, you need version 5.3.32 of CMSSW, supported only on Scientific Linux 6. If you are unfamiliar with Linux, take a look at this short introduction to Linux or try this interactive command-line bootcamp. Once you have installed the CMS-specific CERN Virtual Machine, execute the following command in the terminal if you haven't done so before; it ensures that you have this version of CMSSW running:

$ cmsrel CMSSW_5_3_32

Then, make sure that you are always in the CMSSW_5_3_32/src/ directory by entering the following command in the terminal (you must do so every time you boot the VM before you can proceed):

$ cd CMSSW_5_3_32/src/

"OK! Where can I get the CMS data?"

It is best if we start off with a quick introduction to ROOT. ROOT is the framework used by several particle-physics experiments to work with the collected data. Although analysis is not itself performed within the ROOT GUI, it is instructive to understand how these files are structured and what data and collections they contain.

The primary data provided by CMS on the CERN Open Data Portal is in a format called "Analysis Object Data" or AOD for short. These AOD files are prepared by piecing together the raw data collected by the various sub-detectors of CMS and contain all the information that is needed for analysis. The files cannot be opened and understood as simple data tables but require ROOT in order to be read.

So, let's see what an AOD file looks like and take ROOT for a spin!

Making sure that you are in the CMSSW_5_3_32/src/ folder, execute the following command in your terminal to launch the CMS analysis environment:

$ cmsenv

You can now open a CMS AOD file in ROOT. Let us open one of the files from the CERN Open Data Portal by entering the following command:

$ root root://eospublic.cern.ch//eos/opendata/cms/Run2011A/ElectronHad/AOD/12Oct2013-v1/20001/001F9231-F141-E311-8F76-003048F00942.root

You will see the ROOT logo appear on screen. You can now open the ROOT GUI by entering:

TBrowser t

Excellent! You have successfully opened a CMS AOD file in ROOT. If this was the first time you've done so, pat yourself on the back. Now, to see what is inside this file, let us take a closer look at some collections of physics objects.

On the left window of ROOT (see the screenshot below), double-click on the file name (root://eospublic.cern.ch//eos/opendata/…). You should see a list of entries under Events, each corresponding to a collection of reconstructed data. We are interested in the collections containing information about reconstructed physics objects.
Screenshot: After running 'TBrowser t'

Let us take a peek, for example, at the electrons, which are found in recoGsfElectrons_gsfElectrons__RECO, as shown on the list of physics objects. Look in there by double-clicking on that line and then double-clicking on recoGsfElectrons_gsfElectrons__RECO.obj. Here, you can have a look at various properties of this collection, such as the plot for the transverse momentum of the electrons: recoGsfElectrons_gsfElectrons__RECO.obj.pt_.

You can exit the ROOT browser through the GUI by clicking on Browser on the menu and then clicking on Quit Root or by entering .q in the terminal.

"Nice! But how do I analyse these data?"

In AOD files, reconstructed physics objects are included without checking their "quality", i.e. in the case of the electron collection you opened in ROOT, without ensuring that the reconstructed object really is an electron. In order to analyse only the "good quality" data, you must apply some selection criteria.

With these criteria, you are in effect reducing the dataset, either in terms of the number of collision events it contains or in terms of the information carried by each event. Following this, you run your analysis code on the reduced dataset.

Depending on the nature of your analysis, you can, if needed, run your analysis code directly on the AOD files themselves, performing the selections along the way. However, this can be resource-intensive and is done only for very specific use cases.

NOTE: To analyse the full event content, the analysis job needs access to the "condition data", such as the jet-energy corrections. You can see how connections to the condition database are established in the "pattuples2011" example. For simpler analyses, where we use only physics objects needing no further data for corrections, you do not need to connect to the condition database. This is the case for the example for analysing the primary datasets below.

Your final analysis is done using a software module called an "analyzer". If you have followed the validation step for the virtual machine setup, you have already produced and run a simple analyzer. You can specify your initial selection criteria within the analyzer to perform your analysis directly on the AOD files, or further elaborate the selections and other operations needed for analysing the reduced dataset. To learn more about configuring analyzers, follow these instructions in the CMSSW WorkBook. Make sure, though, that you replace the release version (CMSSW_nnn) with the release that you are using, i.e. one that is compatible with the CMS open data.

You can also pass the selection criteria through the configuration file. This file activates existing tools within CMSSW in order to perform the desired selections. If you have followed the validation step for the virtual machine setup, you have already seen a configuration file, which is used to give the parameters to the cmsRun executable. You can see how this is done in our analysis example.

We will now take you through these steps with a couple of specially prepared example analyses.

Option A: Analysing the primary dataset

As mentioned above, you do not typically perform an analysis directly on the AOD files. However, there may be cases when you can do so. Therefore, we have provided an example analysis to take you through the steps that you may need on the occasions that you want to analyse the AOD files directly. You can find the files and instructions in this CMS Tools entry.

Before you proceed: Note that this example is tailored to the CMS data from 2010. In order to run the same analysis on 2011 data, you need to make the following adjustments to the workflow:

  1. Different CMS VM image versions should be used for analysing 2010 and 2011 data. For 2011 data, use the CMS VM image for 2011 data.
  2. Change all instances of CMSSW to the correct version (CMSSW_5_3_32 for 2011 datasets).
  3. In the demoanalyzer_cfg.py file, replace PhysicsTools.PythonAnalysis.LumiList with FWCore.PythonUtilities.LumiList on line 4.
  4. In the demoanalyzer_cfg.py file, under the section to define the input data, select any primary dataset with muons from the 2011 data (such as SingleMu or DoubleMu).
  5. Download the list of validated runs from this record and, in the demoanalyzer_cfg.py file, replace goodJSON = '/home/cms-opendata/CMSSW_4_2_8/src/Demo/DemoAnalyzer/datasets/Cert_136033-149442_7TeV_Apr21ReReco_Collisions10_JSON_v2.txt' with goodJSON = '/home/cms-opendata/CMSSW_5_3_32/src/Demo/DemoAnalyzer/datasets/Cert_160404-180252_7TeV_ReRecoNov08_Collisions11_JSON.txt', as sketched after this list.
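
Putting points 3 and 5 together, the lumi-selection block of demoanalyzer_cfg.py for the 2011 data would then look roughly as follows (a sketch using the names and paths listed above):

import FWCore.PythonUtilities.LumiList as LumiList

goodJSON = '/home/cms-opendata/CMSSW_5_3_32/src/Demo/DemoAnalyzer/datasets/Cert_160404-180252_7TeV_ReRecoNov08_Collisions11_JSON.txt'
myLumis = LumiList.LumiList(filename = goodJSON).getCMSSWString().split(',')
process.source.lumisToProcess = cms.untracked.VLuminosityBlockRange()
process.source.lumisToProcess.extend(myLumis)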

Option B: Analysing reduced datasets

We start by applying selection cuts via the configuration file and reduce the AOD files into a format known as PATtuple. You can find more information about this data format (which gets its name from the CMS Physics Analysis Toolkit, or PAT) on the CMSSW PAT WorkBook.

Important: Be aware that the instructions in the WorkBook are in use in CMS currently and have been updated for more recent CMSSW releases. With the 2011 data, you should always use releases in the CMSSW_5_3 series and not higher. Also note that more recent code does not work with older releases, so whenever you see git cms-addpkg… in the instructions, it is likely that the code package this command adds does not work with the release you need. However, the material on those pages gives you a good introduction to PAT.

Code as well as instructions for producing PATtuples from the CMS open data can be found in this GitHub repo. However, since it can take a dedicated computing cluster several days to run this step and reduce the several TB of AOD files to a few GB of PATtuples, we have provided you with the PATtuples in that GitHub repo, saving you quite a lot of time! So you can jump to the next step, below ("Performing your analysis…"). Although you do not need to run this step, it is worth looking at the configuration file:

You can see that the line removeAllPATObjectsBut(process, ['Muons','Electrons']) removes all "PATObjects" except muons and electrons, which will be needed in the final analysis step of this example.

Note also how only the validated runs are selected with the following lines:

import FWCore.ParameterSet.Config as cms
import FWCore.PythonUtilities.LumiList as LumiList
myLumis = LumiList.LumiList(filename='Cert_160404-180252_7TeV_ReRecoNov08_Collisions11_JSON.txt').getCMSSWString().split(',')
process.source.lumisToProcess = cms.untracked.VLuminosityBlockRange()
process.source.lumisToProcess.extend(myLumis)

This selection must always be applied to any analysis on CMS open data, and to do so you must have the validation file downloaded to your local area.

You can also see the steps needed to use the condition data. First, as shown in the README, you have to set the symbolic links to the condition database.

ln -sf /cvmfs/cms-opendata-conddb.cern.ch/FT_53_LV5_AN1_RUNA FT_53_LV5_AN1
ln -sf /cvmfs/cms-opendata-conddb.cern.ch/FT_53_LV5_AN1_RUNA.db FT_53_LV5_AN1_RUNA.db

Then, the correct set of condition data are defined by mentioning the Global Tag on lines 46–48 in the file PAT_data_repo.py.

#globaltag
process.GlobalTag.connect = cms.string('sqlite_file:/cvmfs/cms-opendata-conddb.cern.ch/FT_53_LV5_AN1_RUNA.db')
process.GlobalTag.globaltag = 'FT_53_LV5_AN1::All'

Important: If you plan on running the code to produce the PATtuples needed for this analysis, note that the first time you run the job, the CERN Virtual Machine will read the condition data from the remote database. This process takes time (an example run over a 10 Mb/s line took 45 minutes), but it will only happen once, as the files are then cached on your VM. The job will not produce any output during this time. However, you can check the ongoing processes with the command top and you can monitor the progress of reading the condition data to the local cache with the command df.

Performing your analysis on the PATtuples

Now, as the intermediate PATtuple files have been produced for you, you can go directly to the next step, as described in this second GitHub repo and follow the instructions on that page.

    Note that even though these are derived datasets, running the analysis code over the full data can take time. So if you just want to give it a try, you can limit the number of events or read only part of the files. Bear in mind that running on a low number of files will not give you a meaningful plot.

    Your analysis job is defined in OutreachExercise2011/DecaysToLeptons/run/run.py. The analysis code is in the files located in the OutreachExercise2011/DecaysToLeptons/python directory.

    This example uses IPython; the job is configured and started with the following command:

ipython run.py

That's it! Follow the rest of the instructions on the README and you have performed an analysis using data from CMS. Hope you enjoyed this exercise. Feel free to play around with the rest of the data and write your own analyzers and analysis code. (To exit IPython, enter exit().)

Learn how to use the ALICE virtual machine to have a first look at ALICE events and use analysis tools

"How do I start the ALICE software?"

After successfully installing the ALICE VM, you should contextualise it as described here [link to installation guide]. Note that you should pick the "ALICE OpenAccess" item from the list of available ALICE contexts in the CernVM market. When started, the VM will automatically log in the user “alice” (password “alice”) and start a graphical user interface that lets you run the ALICE masterclasses as well as a basic tutorial on how to run a custom analysis on ALICE data. There is no need to set up the software; both the software tools and the environment are automatically configured to use the supplied analysis tools.

“How should I use the graphical user interface?”

The interface can be quit at any time by clicking the “Exit” button. To re-enter the interface, one can type in a terminal:

[alice@localhost analysis]$ root masterclass.C

A quick word on what the “root” program is: ROOT is an object-oriented analysis toolkit widely used in high-energy physics, among many other fields. ROOT can be used as a library linked into your own program - the ALICE software is built like that - or to run simple C++ programs in the form of macros, such as the user interface ‘masterclass.C’. You may want to take a closer look at what ROOT is and how you can use it if you want to go deeper and write your own analysis for ALICE data.

Coming back to the user interface, you will notice that every analysis module or masterclass is represented as a separate tab; clicking a tab brings up a different interface for each module. In general, these contain an “Info” button, some data- or parameter-selection controls and the “Exit” button. Clicking the picture button for an analysis module will run the selected module with the selected dataset and settings. Just follow the instructions provided with each example.

You can bring up the general documentation for the masterclasses by clicking the big picture button with the ALICE logo:

“But where are the ALICE data?”

You do not need to download the data manually; they will be downloaded automatically when you run a given module. In some cases the interface will let you select a particular dataset, which it will then download into a local folder. In general, every module will create a local folder on the VM file system under /home/alice/analysis/. For example, the “Pt” analysis will create the folder /home/alice/analysis/PtAnalysis and download the selected data into folders like data/LHC2010b_pp_ESD_117222/0000/AliESDs.root

“How can I see the content of ALICE data?”

ALICE data are written in ROOT format, which allows objects like events, tracks and vertices to be stored in a compact form and read with random access. You can use the ‘root’ program to open any of the data files with the extension “.root” and inspect their content, but you will have to write a simple program to do much more than that. You can do this once you are more familiar with ROOT and with the tutorials available in the ALICE VM.
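
As a first step, the following minimal sketch shows what such an inspection might look like from a Python session with PyROOT. It assumes that PyROOT is available in the ALICE VM, that the file below was downloaded by the “Pt” analysis module, and that the event tree inside AliESDs.root is called esdTree:

# inspect_esd.py -- minimal PyROOT sketch (run with: python inspect_esd.py)
import ROOT

# path produced by the "Pt" analysis module, as described above
f = ROOT.TFile.Open("/home/alice/analysis/PtAnalysis/data/LHC2010b_pp_ESD_117222/0000/AliESDs.root")
f.ls()                                  # list the objects stored in the file

tree = f.Get("esdTree")                 # the event tree (name assumed)
if tree:
    print("Number of events in this file: %d" % tree.GetEntries())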

Learn how to use the LHCb virtual machine to have a first look at LHCb events and use analysis tools

"How do I start LHCb software?"

LHCb software comes via a virtual machine image. The only thing you need to install yourself on your desktop is VirtualBox. Then you just need to download the LHCb virtual machine image, open it with VirtualBox and click on the LHCbMasterclass icon on the desktop (see instructions here).

"What kind of LHCb data will I work on?”

The data samples you can download from this portal consist of candidates for a type of charmed particle known as the D0, found in a sample of LHC interactions collected randomly during 2011 data taking. A D0 particle consists of a charm quark and an up anti-quark. The particles are measured decaying in the mode D0→K-π+, where the final-state particles are a kaon (K-), consisting of a strange quark and an up anti-quark, and a pion (π+), consisting of an up quark and a down anti-quark. The +, - and 0 refer to the electric charge of the particle: positively charged, negatively charged or neutral.

These particles have lifetimes which are long enough that, for the purposes of this exercise, they are stable within the LHCb detector. The particles have been preselected using loose criteria, so the samples you will begin with contain a visible signal, but background events are also present.

"I have installed VirtualBox, downloaded LHCb VM image and launched it. And now?”

The LHCbMasterclass exercise is divided into two parts: the Event Display and the D0 lifetime fitting exercise, which should be executed in this order.

Once you click on the icon LHCbMasterclass you will be asked to select a language, enter your details and select the sample you want to analyse.

After clicking on the Save button, you can start the Event Display. If you want to move directly to the second exercise, just click on Move on to D0 exercise.

"What can I learn from this exercise?"

You will be working on real collisions recorded by the LHCb experiment during 2011 data taking, which contain both signal and background particles. This set of two exercises is designed to teach you how to

  • Use an event display of the proton-proton collisions inside the LHCb detector to search for charmed particles and separate this signal from backgrounds.
  • Fit functional forms for the signal and background to the data in order to measure the number of signal events in the data sample and their purity (defined as the fraction of signal events in the total sample).
  • Obtain the distribution of signal events in a given variable by taking the combined distribution of events in the data sample (which contains both signal and background events) and subtracting the background distribution. The result of the fit in the previous step is used to find a sample of pure background events for subtraction, and to compute from the signal yield and purity the appropriate amount of background which should be subtracted.
  • The signal you will be looking at decays exponentially with time, analogously to a radioactive isotope. You can now use the sample of events passing the previous step to measure the "lifetime" of the signal particle. The lifetime is defined as the time taken for (e-1)/e of the signal events to decay, where e~2.718 is the base of the natural logarithm. It is analogous to the concept of half-life in radioactive decay.
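
To make this definition concrete: if N0 signal particles are present at time t = 0, the exponential decay law gives N(t) = N0 × exp(−t/τ), where τ is the lifetime. At t = τ only N0/e of the particles remain, i.e. a fraction (e−1)/e has decayed, and the half-life familiar from radioactive decay is related to the lifetime by t½ = τ × ln2 ≈ 0.69 τ.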

"How does the Event display exercise work?"

The aim of the event display exercise is to locate displaced vertices belonging to D0 particles in the vertex detector of the LHCb experiment. When you launch the exercise and load an event, you will see an image of the LHCb detector and particle trajectories ("tracks") inside it. These tracks are colour coded, and a legend at the bottom of the GUI tells you which colour corresponds to which kind of particle.

In order to make identifying vertices easier, you can view an event in three different two-dimensional projections: y-z, y-x and x-z, shown for one event in the following pictures:

Different events will be clearer in different projections, so feel free to experiment with all three! Displaced vertices appear as a pair of intersecting tracks, far away from the other tracks in the event. When you click on a particle, you will see its information, including mass and momentum, in the Particle Info box. A D0 particle decays into a kaon and a pion, so you will need to find a displaced vertex where a kaon track intersects with a pion track. Once you find a track which you think is part of the displaced vertex, you can save it using the Save Particle button. Once you have saved two particles, you can compute their mass by clicking on the Calculate button. If you think this combination has a mass compatible with that of the D0 particle, click on Add to save it: by saving a combination for each event, you will build up a histogram of the masses of the displaced vertices in the different events.

Remember that you are looking at real data, which contains both signal and background, and the detector has a finite resolution, so not all displaced vertices will have exactly the D0 mass (even the signal ones). They should, however, be within the range 1816-1914 MeV (this range extends about 3% either side of the true D0 mass). If you try to save a combination which is too far away from the real D0 mass, the exercise will warn you that you have not found the correct displaced vertex pair and won't let you save it. If you are not able to find the displaced vertex for an event after a few minutes, move on to the next event and come back to the one which was giving you trouble if you have time at the end of the exercise. Once you have looked at all events, you can examine your mass histogram by clicking the Draw button.

"How does the D0 lifetime fitting exercise work?"

Before describing the fitting part of the exercise, it will be useful to list the variables involved in this exercise:

D0 mass: this is the invariant mass of the D0 particle. The signal can be seen as a peaking structure rising above a flat background. The range of masses relevant for this analysis is 1816-1914 MeV. The signal shape is described by the Gaussian (also known as "normal") distribution. The centre ("mean") of this distribution is the mass of the D0 particle, while the width represents the experimental resolution of the detector.

D0 TAU: this is the distribution of decay times of the D0 candidates. The signal is described by a single exponential whose slope is the D0 lifetime (the object of the last exercise), while the background concentrates at short decay times.

D0 IP: this is the D0 distance of closest approach ("impact parameter") with respect to the proton-proton interaction in the event. The smaller the impact parameter, the more likely it is that the D0 actually came from that primary interaction. In order to simplify the drawing, we actually plot and cut on the logarithm (base 10) of this quantity in the exercise.

D0 PT: this is the momentum of the D0 transverse to the LHC beamline.


Exercise 1: fitting the mass distribution and obtaining signal variable distributions

The object of this exercise is to fit the distribution of the D0 mass variable, and extract the signal yield and purity.

  • Click on the Plot D0 mass button to plot the overall mass distribution. You will see a peak (signal) on top of a flat distribution (background). The peak should be described by a Gaussian function, whose mean corresponds to the mass of the D0 and whose width (σ) is determined by the experimental resolution of the LHCb detector.
  • Click on Fit mass distribution to fit this distribution using a Gaussian function for the signal and a linear function for the background (an illustrative code sketch of such a fit follows this list).
  • Look at the fitted mass distribution. You can split it into three regions: the signal region and two background-only "sidebands": one above the signal (the upper sideband) and one below the signal (the lower sideband). A Gaussian distribution contains 99.7% of its events within three standard deviations of the mean, so this "three σ" region around the mean is usually the definition of the signal region.
  • Use the slider labelled Sig range to define the beginning and end of the signal region. All events not falling into the signal region will be said to fall into the background region.
  • You can now use the definitions of the signal and background regions in the mass variable to determine the signal and background distributions in other variables. Click on the button labelled Apply cuts and plot variables. You will see the signal (blue) and background (red) distributions for the other three variables plotted next to the mass distribution. You should discuss the exercise with an instructor at this point.
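
For the curious, here is an illustrative, self-contained sketch of a Gaussian-plus-linear fit on toy data, written with PyROOT. It is not part of the masterclass (the GUI performs the real fit for you), and the numbers below are invented purely to mimic the shape of the mass distribution:

# toy_mass_fit.py -- illustrative sketch only, not part of the masterclass
import ROOT

rng = ROOT.TRandom3(42)
h = ROOT.TH1F("mass", "Toy D0 candidate mass;mass [MeV];candidates", 50, 1816, 1914)

# toy data: a Gaussian "signal" near 1865 MeV on top of a flat "background"
for _ in range(2000):
    h.Fill(rng.Gaus(1865.0, 8.0))
for _ in range(3000):
    h.Fill(rng.Uniform(1816.0, 1914.0))

# Gaussian (parameters 0-2) plus a straight line (parameters 3-4)
model = ROOT.TF1("model", "gaus(0) + pol1(3)", 1816, 1914)
model.SetParameters(200.0, 1865.0, 8.0, 60.0, 0.0)   # rough starting values

h.Fit(model, "Q")                                    # "Q" = quiet fit
print("fitted mean  = %.1f MeV" % model.GetParameter(1))
print("fitted sigma = %.1f MeV" % model.GetParameter(2))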

Exercise 2: measuring the D0 lifetime

The object of this exercise is to use the signal sample which you obtained in the previous step to measure the lifetime of the D0 particle. This quantity is analogous to the half-life of a radioactive isotope: the D0 decays according to an exponential distribution, and if this exponential is fitted to the distribution of D0 decay times, its slope gives the lifetime of the D0.

  • Fit the lifetime of the D0.
  • Compare the slope of this exponential to the D0 lifetime given by the Particle Data Group. Talk to an instructor about how well these agree with each other.
  • In addition to statistical uncertainties, measurements can suffer from systematic uncertainties, caused by a miscalibrated apparatus or an incorrect modelling of the backgrounds. One basic technique for estimating these is to repeat the measurement while changing the criteria used to select signal events. If the result changes significantly when changing the criteria, we know that there is something wrong!
  • Repeat your fit for the lifetime of the D0 while varying the maximum allowed D0 impact parameter. The allowed values range from -4.0 to 1.5 in the original fit. Move this upper value from 1.5 to -2.0 in steps of 0.25, and refit the D0 lifetime at each point, saving the results as you go along.
  • Plot the histogram showing the fitted value of the D0 lifetime as a function of the upper cut on the impact parameter. Discuss the shape, and what it tells you about the D0 lifetime, with an instructor.
  • What other sources of systematic uncertainty might we need to consider when making a lifetime measurement?