
Samples with full event information including tracker hits for tracking, ML, and top quark tagging studies

Usai, Emanuele ; Andrews, Michael ; Burkle, Bjorn ; Gleyzer, Sergei ; Narain, Meenakshi

Cite as: Usai, Emanuele; Andrews, Michael; Burkle, Bjorn; Gleyzer, Sergei; Narain, Meenakshi; (2019). Samples with full event information including tracker hits for tracking, ML, and top quark tagging studies. CERN Open Data Portal. DOI:10.7483/OPENDATA.CMS.CHC3.5KPG

Dataset Derived Datascience CMS CERN-LHC
Parent Dataset: Tracker-hit-enriched 300 to 600 bin of QCD_Pt-15to3000_TuneZ2star_Flat_8TeV_pythia6
Parent Dataset: Tracker-hit-enriched 400 to 600 bin of QCD_Pt-15to3000_TuneZ2star_Flat_8TeV_pythia6
Parent Dataset: Tracker-hit-enriched 600 to 3000 bin of QCD_Pt-15to3000_TuneZ2star_Flat_8TeV_pythia6
Parent Dataset: Tracker-hit-enriched TTJets_HadronicMGDecays_8TeV-madgraph


Description

Samples in this record are in a custom ROOT ntuple format and contain the positions of the tracker hits together with information from the generator-level objects associated with those hits. The samples can be used to study top quark identification algorithms that use low-level detector information such as tracker hits. Machine learning algorithms are suitable for this classification task.

They were produced from parent datasets consisting of events extracted from simulated proton-proton collisions at a center-of-mass energy of 8 TeV, generated with Pythia 6 (QCD) or with MadGraph2.6 and Pythia 6 (top-antitop pair sample). The particles emerging from the collisions traverse a simulation of the CMS detector.

The parent datasets of these samples contain either light jets (QCD) in various energy ranges or all-hadronic decays of top quarks with high transverse momentum. They consist of hits from the tracking detector, reconstructed tracks, simulated tracks, generated particles, and jets clustered from the generated particles. These objects are matched to one another in order to reconstruct the provenance of each hit. Samples are produced from the standard CMS format "AODSIM" plus a series of low-level tracker-related collections that allow the extraction of the tracker hits.

Data set name | Description | Number of events | Number of files
QCD300to600 | QCD, flat pT hat spectrum, 300 < pT hat < 600 GeV | 1497600 | 2496
QCD400to600 | QCD, flat pT hat spectrum, 400 < pT hat < 600 GeV | 1989000 | 3315
QCD600to3000 | QCD, flat pT hat spectrum, 600 < pT hat < 3000 GeV | 2974800 | 4959
ttbar | ttbar, fully hadronic decays, pT of the top/antitop greater than 400 GeV | 2969109 | 4055

Related datasets

QCD300to600_RunI_8TeV was derived from:

Tracker-hit-enriched 300 to 600 bin of QCD_Pt-15to3000_TuneZ2star_Flat_8TeV_pythia6

QCD400to600_RunI_8TeV was derived from:

Tracker-hit-enriched 400 to 600 bin of QCD_Pt-15to3000_TuneZ2star_Flat_8TeV_pythia6

QCD600to3000_RunI_8TeV was derived from:

Tracker-hit-enriched 600 to 3000 bin of QCD_Pt-15to3000_TuneZ2star_Flat_8TeV_pythia6

ttbar_RunI_8TeV was derived from:

Tracker-hit-enriched TTJets_HadronicMGDecays_8TeV-madgraph

Dataset characteristics

9430509 events. 14825 files. 11.7 TiB in total.
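The totals above can be cross-checked against the per-sample counts listed in the description. The following is a minimal sketch of that arithmetic; the dictionary layout and variable names are illustrative only, and the average file size assumes the quoted 11.7 TiB is exact.

```python
# Per-sample (events, files) counts copied from the table in this record.
samples = {
    "QCD300to600": (1497600, 2496),
    "QCD400to600": (1989000, 3315),
    "QCD600to3000": (2974800, 4959),
    "ttbar": (2969109, 4055),
}

total_events = sum(n_events for n_events, _ in samples.values())
total_files = sum(n_files for _, n_files in samples.values())

# Average number of events stored in one file, per sample.
events_per_file = {
    name: n_events / n_files for name, (n_events, n_files) in samples.items()
}

# Average file size in GiB, taking 11.7 TiB as exact (1 TiB = 1024 GiB).
mean_file_size_gib = 11.7 * 1024 / total_files

print(total_events, total_files)
for name, value in events_per_file.items():
    print(f"{name}: {value:.1f} events per file")
print(f"mean file size: {mean_file_size_gib:.2f} GiB")
```

The per-sample sums reproduce the 9430509 events and 14825 files quoted above, which is a quick way to confirm that a local copy of the file lists is complete.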

Dataset semantics

Variable Type Description
hit_global_x std::vector<float> global x position of the RecHit
hit_global_y std::vector<float> global y position of the RecHit
hit_global_z std::vector<float> global z position of the RecHit
hit_local_x std::vector<float> x pos. of the hit in the local sensor coordinate
hit_local_y std::vector<float> y pos. of the hit in the local sensor coordinate
hit_local_x_error std::vector<float> x error in the local sensor coordinate
hit_local_y_error std::vector<float> y error in the local sensor coordinate
hit_sub_det std::vector<unsigned int> subdetector generating the hit [1 PixelBarrel, 2 PixelEndcap, 3 TIB, 4 TID, 5 TOB, 6 TEC]
hit_layer std::vector<unsigned int> layer/disk of the subdetector generating the hit
hit_type std::vector<unsigned int> type of sistrip hit [0 Pixel hit, 1 rphiRecHit, 2 stereoRecHit, 3 rphiRecHitUnmatched, 4 stereoRecHitUnmatched]
hit_simtrack_id std::vector<int> ID number of the sim track matched to the hit
hit_simtrack_index std::vector<unsigned int> index of the sim track matched to the hit
hit_simtrack_match std::vector<bool> is the hit matched to a sim track?
hit_genparticle_id std::vector<unsigned int> index of the gen particle matched to the hit
hit_pdgid std::vector<int> PDG ID of the gen particle matched to the hit
hit_recotrack_id std::vector<unsigned int> index of the reco track matched to the hit
hit_recotrack_match std::vector<bool> is the hit matched to a reco track?
hit_genparticle_match std::vector<bool> is the hit matched to a gen particle?
hit_genjet_id std::vector<unsigned int> index of the gen jet matched to the hit
hit_genjet_match std::vector<bool> is the hit matched to a gen jet?
simtrack_id std::vector<unsigned int> ID number of the sim track
simtrack_pdgid std::vector<int> PDG ID of the sim track
simtrack_charge std::vector<int> charge of the sim track
simtrack_px std::vector<float> momentum x component of the sim track
simtrack_py std::vector<float> momentum y component of the sim track
simtrack_pz std::vector<float> momentum z component of the sim track
simtrack_energy std::vector<float> energy of the sim track
simtrack_vtxid std::vector<unsigned int> ID number of the sim vertex of the sim track
simtrack_genid std::vector<unsigned int> index of the gen particle associated to the track
simtrack_evtid std::vector<uint32_t> event ID of the sim track
genpart_collid std::vector<int> collision ID of the gen particle
genpart_pdgid std::vector<int> PDG ID of the gen particle
genpart_charge std::vector<int> charge of the gen particle
genpart_px std::vector<float> momentum x component of the gen particle
genpart_py std::vector<float> momentum y component of the gen particle
genpart_pz std::vector<float> momentum z component of the gen particle
genpart_energy std::vector<float> energy of the gen particle
genpart_status std::vector<int> PDG status of the gen particle
genjet_px std::vector<float> momentum x component of the gen jet
genjet_py std::vector<float> momentum y component of the gen jet
genjet_pz std::vector<float> momentum z component of the gen jet
genjet_energy std::vector<float> energy of the gen jet
genjet_emEnergy std::vector<float> electromagnetic energy of the gen jet
genjet_hadEnergy std::vector<float> hadronic energy of the gen jet
genjet_invisibleEnergy std::vector<float> invisible energy of the gen jet
genjet_auxiliaryEnergy std::vector<float> auxiliary energy of the gen jet
genjet_const_collid std::vector<std::vector<int> > collision ID of the constituent of the gen jet
genjet_const_pdgid std::vector<std::vector<int> > PDG ID of the constituent of the gen jet
genjet_const_charge std::vector<std::vector<int> > charge of the constituent of the gen jet
genjet_const_px std::vector<std::vector<float> > momentum x component of the constituent of the gen jet
genjet_const_py std::vector<std::vector<float> > momentum y component of the constituent of the gen jet
genjet_const_pz std::vector<std::vector<float> > momentum z component of the constituent of the gen jet
genjet_const_energy std::vector<std::vector<float> > energy of the constituent of the gen jet
track_chi2 std::vector<float> chi2 of the reco track fit
track_ndof std::vector<float> ndof of the reco track fit
track_chi2ndof std::vector<float> reduced chi2 of the reco track fit
track_charge std::vector<float> charge of the reco track
track_momentum std::vector<float> momentum of the reco track
track_pt std::vector<float> transverse momentum of the reco track
track_pterr std::vector<float> error on the transverse momentum of the reco track
track_hitsvalid std::vector<unsigned int> number of valid hits in the reco track
track_hitslost std::vector<unsigned int> number of lost hits in the reco track
track_theta std::vector<float> theta angle of the reco track
track_thetaerr std::vector<float> error on theta of the reco track
track_phi std::vector<float> phi angle of the reco track
track_phierr std::vector<float> error on phi of the reco track
track_eta std::vector<float> pseudorapidity of the reco track
track_etaerr std::vector<float> error on pseudorapidity of the reco track
track_dxy std::vector<float> transverse impact parameter of the reco track
track_dxyerr std::vector<float> error on the transverse impact parameter of the reco track
track_dsz std::vector<float> longitudinal impact parameter of the reco track
track_dszerr std::vector<float> error on the longitudinal impact parameter of the reco track
track_qoverp std::vector<float> charge over momentum of the reco track
track_qoverperr std::vector<float> error on charge over momentum of the reco track
track_vx std::vector<float> x position of the vertex of the reco track
track_vy std::vector<float> y position of the vertex of the reco track
track_vz std::vector<float> z position of the vertex of the reco track
track_algo std::vector<Int_t> algorithm type of the reco track
track_hit_global_x std::vector<std::vector<float> > global x position of the RecHit associated to the reco track
track_hit_global_y std::vector<std::vector<float> > global y position of the RecHit associated to the reco track
track_hit_global_z std::vector<std::vector<float> > global z position of the RecHit associated to the reco track
track_hit_local_x std::vector<std::vector<float> > local x position of the RecHit associated to the reco track
track_hit_local_y std::vector<std::vector<float> > local y position of the RecHit associated to the reco track
track_hit_local_x_error std::vector<std::vector<float> > error on local x position of the RecHit associated to the reco track
track_hit_local_y_error std::vector<std::vector<float> > error on local y position of the RecHit associated to the reco track
track_hit_sub_det std::vector<std::vector<unsigned int> > subdetector generating the hit [1 PixelBarrel, 2 PixelEndcap, 3 TIB, 4 TID, 5 TOB, 6 TEC] associated to the reco track
track_hit_layer std::vector<std::vector<unsigned int> > layer/disk of the subdetector generating the hit associated to the reco track
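As a rough illustration of how these variables fit together, the sketch below treats the hit_* vectors of one event as parallel lists (entry i of each list describing the same hit, which is an assumption based on the table) and derives transverse momentum, pseudorapidity, and azimuth from Cartesian momentum components such as genjet_px, genjet_py, and genjet_pz. The function names and the toy values are hypothetical and are not taken from the dataset.

```python
import math
from collections import Counter

# Sub-detector codes as listed for hit_sub_det in the table above.
SUB_DETECTORS = {
    1: "PixelBarrel", 2: "PixelEndcap", 3: "TIB",
    4: "TID", 5: "TOB", 6: "TEC",
}

def count_matched_hits(hit_sub_det, hit_genjet_match):
    """Count hits matched to a gen jet, grouped by sub-detector name."""
    counts = Counter()
    for code, matched in zip(hit_sub_det, hit_genjet_match):
        if matched:
            counts[SUB_DETECTORS.get(code, "unknown")] += 1
    return dict(counts)

def pt_eta_phi(px, py, pz):
    """Convert Cartesian momentum components to (pT, eta, phi)."""
    pt = math.hypot(px, py)
    # eta = asinh(pz / pT); it diverges for momentum along the beam axis.
    eta = math.asinh(pz / pt) if pt > 0 else math.copysign(math.inf, pz)
    phi = math.atan2(py, px)
    return pt, eta, phi

# Toy stand-in for one event (invented values, for illustration only).
toy_sub_det = [1, 1, 3, 5, 6]
toy_genjet_match = [True, False, True, True, False]
print(count_matched_hits(toy_sub_det, toy_genjet_match))
print(pt_eta_phi(3.0, 4.0, 0.0))
```

The same pattern applies to the other *_match flags: each boolean vector indicates which entries of the corresponding index or ID vector are meaningful.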

How were these data selected?

This dataset was produced with the software available in:

TrackerRecHitProducerTool - Example workflow to add tracker hits to CMS Run1 AOD and to produce full event ROOT ntuple with tracker hits

How can you use these data?

The use of these files does not require any software specific to the CMS experiment. An example notebook is provided. The code reads the ntuples and produces a scatter plot of the rechits from three events.
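For readers who prefer a script to a notebook, the following is a minimal sketch of reading the hit positions of the first three events. It assumes the third-party uproot package, which is not necessarily what the example notebook uses. The tree name is not stated in this record, so it is left as a parameter; the keys of the opened file can be listed with `uproot.open(path).keys()` to find it.

```python
import math

def load_hit_positions(path, tree_name, n_events=3):
    """Read global hit positions for the first n_events entries of one file.

    `path` and `tree_name` are supplied by the caller; neither is defined
    in this record.
    """
    import uproot  # third-party; imported lazily so to_r_z works without it

    branches = ["hit_global_x", "hit_global_y", "hit_global_z"]
    with uproot.open(path) as root_file:
        # Returns a dict mapping branch name to one array per event.
        return root_file[tree_name].arrays(
            branches, entry_stop=n_events, library="np"
        )

def to_r_z(xs, ys, zs):
    """Project hits of one event onto the (r, z) plane, r = sqrt(x^2 + y^2)."""
    return [(math.hypot(x, y), z) for x, y, z in zip(xs, ys, zs)]
```

With a downloaded file, `hits = load_hit_positions("some_file.root", "some_tree")` (both names are placeholders) followed by `to_r_z(hits["hit_global_x"][0], hits["hit_global_y"][0], hits["hit_global_z"][0])` gives the points of the first event, ready to pass to any plotting library.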


      


Disclaimer

These open data are released under the Creative Commons Zero v1.0 Universal license.


Neither the experiment (CMS) nor CERN endorses any works, scientific or otherwise, produced using these data.

This release has a unique DOI that you are requested to cite in any applications or publications.
