Cite as: Kallonen, Kimmo (2019). Sample with jet properties for jet-flavor and other jet-related ML studies JetNTuple_QCD_RunII_13TeV_MC. CERN Open Data Portal. DOI:10.7483/OPENDATA.CMS.RY2V.T797
Parent Dataset: /QCD_Pt-15to7000_TuneCUETP8M1_Flat_13TeV_pythia8/RunIISummer16MiniAODv2-PUMoriond17_magnetOn_80X_mcRun2_asymptotic_2016_TrancheIV_v6-v1/MINIAODSIM
The dataset consists of particle jets extracted from simulated proton-proton collision events at a center-of-mass energy of 13 TeV generated with Pythia 8. The particles emerging from the collisions pass through a simulation of the CMS detector. They were then reconstructed from the simulated detector signals using the particle-flow (PF) algorithm; the reconstructed particles are also called PF candidates. The jets in this dataset were clustered from the PF candidates of each collision event using the anti-$k_t$ algorithm with distance parameter $R = 0.4$. The standard L1+L2+L3+residual jet energy corrections are applied to the jets, and pileup contamination is mitigated using the charged hadron subtraction (CHS) algorithm.
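For reference, the anti-$k_t$ algorithm is usually written in terms of the distance measures below. This is the standard textbook form, given here as background and not taken from this record; $y$ denotes rapidity and $\phi$ the azimuthal angle of an input particle.

$$
d_{ij} = \min\left(p_{T,i}^{-2},\, p_{T,j}^{-2}\right)\frac{\Delta R_{ij}^{2}}{R^{2}}, \qquad
d_{iB} = p_{T,i}^{-2}, \qquad
\Delta R_{ij}^{2} = (y_i - y_j)^{2} + (\phi_i - \phi_j)^{2}
$$

At each step the smallest distance is found: if it is a $d_{ij}$, particles $i$ and $j$ are merged; if it is a $d_{iB}$, object $i$ is declared a jet and removed. With $R = 0.4$ this yields the jets commonly referred to as AK4 jets.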
From each collision event, only those jets with transverse momentum exceeding 30 GeV were saved to file. The jets were also required to have an absolute pseudorapidity of less than 2.5, that is |η| < 2.5 (pseudorapidity indicates the jet's position in the detector). For each jet, there are variables describing the jet at the high level, the particle level and the generator level. There are also some variables describing the collision event and the conditions of its simulation. All of the variables are saved on a jet-by-jet basis, which means that one row of data corresponds to one jet.
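The selection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Jet` container and the toy values are hypothetical, and the pseudorapidity requirement is read as a cut on |η|. The field names `jetPt` and `jetEta` follow the variable table.

```python
from dataclasses import dataclass


@dataclass
class Jet:
    """Minimal stand-in for one row of the dataset (hypothetical container)."""
    jetPt: float   # transverse momentum in GeV
    jetEta: float  # pseudorapidity


PT_MIN_GEV = 30.0   # threshold quoted in the text
ABS_ETA_MAX = 2.5   # read as a cut on |eta| (assumption)


def passes_selection(jet: Jet) -> bool:
    """Return True if the jet satisfies the pT and |eta| requirements."""
    return jet.jetPt > PT_MIN_GEV and abs(jet.jetEta) < ABS_ETA_MAX


# Toy jets, invented for illustration only.
toy_jets = [Jet(45.2, 1.1), Jet(28.0, 0.3), Jet(120.5, -2.7), Jet(33.3, -2.4)]
selected = [jet for jet in toy_jets if passes_selection(jet)]
```

Since the stored jets have already passed these requirements, a check like this is mainly useful for validating a file or for tightening the cuts further.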
The origin of a jet is particularly interesting. This so-called flavor of the jet is obtained from the generator-level particles by a jet flavor algorithm, which attempts to match a reconstructed jet to a single initiating particle. As a consequence, the jet flavor definition depends on the chosen algorithm. Three different flavor definitions are available here. The ‘hadron’ definition identifies b- and c-hadrons from the jet’s constituents, so it is only useful for b-tagging studies. The ‘parton’ definition extends this to include the light jet flavors (u, d, s and gluon). Finally, there is the ‘physics’ definition, which looks at the quarks and gluons of the initial collision. The ‘parton’ and ‘physics’ definitions both identify all jet flavors, but the former is more biased towards b- and c-quarks. If in doubt, it is recommended to use the ‘physics’ definition.
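The dataset also stores 0/1 flags (`isPhysUDS`, `isPhysG`, `isPhysOther` and their parton counterparts) derived from these flavor codes. A minimal sketch of such a grouping is shown here; the function name `flavour_category` is hypothetical, and taking the absolute value (so that antiquarks with negative PDG codes count as quarks) is an assumption that the record does not confirm.

```python
def flavour_category(flav: int) -> str:
    """Map a flavour code (e.g. physFlav) to 'UDS', 'G' or 'Other'.

    Assumption: antiquarks carry negative PDG codes, so abs() is applied.
    """
    code = abs(flav)
    if code in (1, 2, 3):   # up, down, strange
        return "UDS"
    if code == 21:          # gluon
        return "G"
    return "Other"          # everything else, including unmatched jets


def flavour_flags(flav: int) -> dict:
    """Return 0/1 flags in the style of isPhysUDS / isPhysG / isPhysOther."""
    category = flavour_category(flav)
    return {
        "isUDS": int(category == "UDS"),
        "isG": int(category == "G"),
        "isOther": int(category == "Other"),
    }
```

For a quark/gluon classifier, one would typically keep only the rows in the `UDS` and `G` categories and use the flag as the training label.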
Variable | Type | Description |
---|---|---|
jetPt | Float_t | Transverse momentum of the jet. |
jetEta | Float_t | Pseudorapidity (η) of the jet. |
jetPhi | Float_t | Azimuthal angle (ϕ) of the jet. |
jetMass | Float_t | Mass of the jet. |
jetGirth | Float_t | Girth of the jet (as defined in arXiv:1106.3076 [hep-ph]). |
jetArea | Float_t | Catchment area of the jet; used for jet energy corrections. |
jetRawPt | Float_t | Transverse momentum of the jet before the energy corrections. |
jetRawMass | Float_t | Mass of the jet before the energy corrections. |
jetLooseID | UInt_t | Binary variable indicating whether the jet passes 'loose' criteria for being a real jet. |
jetTightID | UInt_t | Binary variable indicating whether the jet passes 'tight' criteria for being a real jet. |
jetGenMatch | UInt_t | 1: if a matched generator level jet exists; 0: if no match was found. |
jetQGl | Float_t | Quark-Gluon jet likelihood discriminant variable built out of the three following variables (see the report CMS-PAS-JME-13-002 for more information). |
QG_ptD | Float_t | Jet energy variable (see CMS-PAS-JME-13-002). |
QG_axis2 | Float_t | Minor axis of the jet (see CMS-PAS-JME-13-002). |
QG_mult | UInt_t | Jet constituent multiplicity with additional cuts (see CMS-PAS-JME-13-002). |
partonFlav | Int_t | Flavour of the jet, as defined by the CMS parton-based definition. |
hadronFlav | Int_t | Flavour of the jet, as defined by the CMS hadron-based definition. |
physFlav | Int_t | Flavour of the jet, as defined by the CMS 'physics' definition (if in doubt, use this). |
isPartonUDS | UInt_t | Indicates light quark (Up, Down, Strange) jets: partonFlav = 1, 2, 3. |
isPartonG | UInt_t | Indicates gluon jets: partonFlav = 21. |
isPartonOther | UInt_t | Indicates any other kind of jet: partonFlav != 1, 2, 3, 21. |
isPhysUDS | UInt_t | Indicates light quark (Up, Down, Strange) jets: physFlav = 1, 2, 3. |
isPhysG | UInt_t | Indicates gluon jets: physFlav = 21. |
isPhysOther | UInt_t | Indicates any other kind of jet: physFlav != 1, 2, 3, 21. |
jetChargedHadronMult | UInt_t | Multiplicity of charged hadron jet constituents. |
jetNeutralHadronMult | UInt_t | Multiplicity of neutral hadron jet constituents. |
jetChargedMult | UInt_t | Multiplicity of charged jet constituents. |
jetNeutralMult | UInt_t | Multiplicity of neutral jet constituents. |
jetMult | UInt_t | Multiplicity of jet constituents. |
nPF | UInt_t | Number of particle flow (PF) candidates (particles reconstructed by the particle flow algorithm); contains all particles within |Δϕ| < 1 and |Δη| < 1 from the center of the jet. |
PF_pT[nPF] | Float_t | Transverse momentum of a PF candidate. |
PF_dR[nPF] | Float_t | Distance of a PF candidate to the center of the jet. |
PF_dTheta[nPF] | Float_t | Polar angle (θ) of a PF candidate. |
PF_dPhi[nPF] | Float_t | Azimuthal angle (ϕ) of a PF candidate. |
PF_dEta[nPF] | Float_t | Pseudorapidity (η) of a PF candidate. |
PF_mass[nPF] | Float_t | Mass of a PF candidate. |
PF_id[nPF] | Int_t | Generator level particle identifier for the particle flow candidates, as defined in the PDG particle numbering scheme. |
PF_fromPV[nPF] | UInt_t | A number indicating how tightly a particle is associated with the primary vertex (ranges from 3 to 0). |
PF_fromAK4Jet[nPF] | UInt_t | 1: if the particle flow candidate is a constituent of the reconstructed AK4 jet; 0: if it is not a constituent of the jet. |
genJetPt | Float_t | Transverse momentum of the matched generator level jet. |
genJetEta | Float_t | Pseudorapidity (η) of the matched generator level jet. |
genJetPhi | Float_t | Azimuthal angle (ϕ) of the matched generator level jet. |
genJetMass | Float_t | Mass of the matched generator level jet. |
nGenJetPF | UInt_t | Number of particles in the matched generator level jet. |
genPF_pT[nGenJetPF] | Float_t | Transverse momentum of a particle in the matched generator level jet. |
genPF_dR[nGenJetPF] | Float_t | Distance of a particle to the center of the matched generator level jet. |
genPF_dTheta[nGenJetPF] | Float_t | Polar angle (θ) of a particle in the matched generator level jet. |
genPF_mass[nGenJetPF] | Float_t | Mass of a particle in the matched generator level jet. |
genPF_id[nGenJetPF] | Int_t | Generator level particle identifier for the particles in the matched generator level jet, as defined in the PDG particle numbering scheme. |
eventJetMult | UInt_t | Multiplicity of jets in the event. |
jetPtOrder | UInt_t | Indicates the ranking number of the jet, as the jets are ordered by their transverse momenta within a single event. |
dPhiJetsLO | Float_t | The phi difference of the two leading jets. |
dEtaJetsLO | Float_t | The eta difference of the two leading jets. |
alpha | Float_t | If there are at least 3 jets in the event, alpha is the third jet's transverse momentum divided by the average transverse momentum of the two leading jets. |
event | ULong64_t | Event number. |
run | UInt_t | Run number. |
lumi | UInt_t | Luminosity block. |
pthat | Float_t | Transverse momentum of the generated hard process. |
eventWeight | Float_t | Weight assigned to the generated event. |
rhoAll | Float_t | The median density (in GeV/A) of pile-up contamination per event; computed from all PF candidates of the event. |
rhoCentral | Float_t | Same as above, computed from all PF candidates with |η| < 2.5. |
rhoCentralNeutral | Float_t | Same as above, computed from all neutral PF candidates with |η| < 2.5. |
rhoCentralChargedPileUp | Float_t | Same as above, computed from all PF charged hadrons associated to pileup vertices and with |η| < 2.5. |
PV_npvsGood | UInt_t | The number of good reconstructed primary vertices. |
Pileup_nPU | UInt_t | The number of pileup interactions that have been added to the event in the current bunch crossing. |
Pileup_nTrueInt | Float_t | The true mean of the Poisson distribution from which the number of interactions in each bunch crossing was sampled for this event. |
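The event-level quantities `alpha` and `dPhiJetsLO` in the table can be illustrated with a short sketch. The helper names are hypothetical, and two details are assumptions that the record does not specify: the azimuthal difference is wrapped into [-π, π), and `None` is returned when an event has fewer than three jets.

```python
import math


def delta_phi(phi1: float, phi2: float) -> float:
    """Signed azimuthal difference wrapped into [-pi, pi) (assumed convention)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def alpha_from_pts(jet_pts):
    """Third-jet pT divided by the mean pT of the two leading jets.

    Returns None for events with fewer than three jets (assumed placeholder).
    """
    ordered = sorted(jet_pts, reverse=True)  # leading jet first
    if len(ordered) < 3:
        return None
    return ordered[2] / (0.5 * (ordered[0] + ordered[1]))
```

For instance, jets with transverse momenta of 100, 60 and 40 GeV give `alpha_from_pts([100.0, 60.0, 40.0]) == 0.5`.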
This dataset was produced with the software available in:
JetNtupleProducerTool - Jet tuple producer from CMS Run2 MiniAOD
The use of these files does not require any software specific to the CMS experiment. There are two sets of equivalent files in two different formats: ROOT and H5. An example notebook is provided.
These open data are released under the Creative Commons Zero v1.0 Universal license.
Neither the experiment(s) (CMS) nor CERN endorse any works, scientific or otherwise, produced using these data.
This release has a unique DOI that you are requested to cite in any applications or publications.