DataPipe F2F Meeting and Hackathon

Europe/Paris
CEA Paris-Saclay, France

Orme des Merisiers, Bât 709 91191 Gif sur Yvette
Description

DataPipe is the sub-system of the DPPS (Data Processing and Preservation System) responsible for DL0-DL3 data processing for CTAO. It comprises ctapipe, pyirf, a benchmarking suite, DIRAC workflows, and other software.

Covered subjects

The meeting will be organized around these five DataPipe top-level workflows:

  • Reconstruction Model Training
  • IRF Generation
  • Benchmarking
  • Observation Data Processing
  • Quality Metrics

Attendee Survey

A moderately long survey can be found here

Structure

This event will consist of days with interleaved introductory presentations, hands-on demos of existing software, discussions about future needs, and hacking sessions relevant to the subjects of the day.

The discussions will aim to cover the following items:

  • Functionality: what is done, what needs to be done?
  • Lessons from past experience: what problems and solutions in LST, NectarCam, HESS/MAGIC/VERITAS, etc. are relevant for DataPipe?
  • Tools, Workflows & Configuration: what executables/standard scripts are needed, and how will we automate this with DIRAC or other systems? Not just for one bin in observation space, but thinking toward the final goal of hundreds... What physics studies are needed to define the standard configuration?
  • Verification: are the requirements and/or concepts well enough specified? Or do we need to explain things better? Do we all agree on what is needed and what the scope is?
  • Validation/Benchmarking: how will we check that the requirements are met? How do we check whether a release is OK?
  • Interfaces: how do internal DataPipe sub-systems or external systems outside of DataPipe connect to (and depend on) each other, and what needs to be defined (data formats, etc.)?
    • 09:30 10:00
      Social events: Welcome coffee
    • 10:00 11:10
      Introduction and welcome

      Introduction to the structure of the F2F, logistics, etc.

      • 10:00
        Welcome 10m
        Speaker: Thierry Stolarczyk (Université Paris-Saclay, Université Paris Cité, CEA, CNRS, AIM, 91191, Gif-sur-Yvette, France)
      • 10:10
        Intro to DataPipe IKC and architecture 30m
        Speaker: Karl Kosack (AIM, CEA, CNRS, Universite Paris-Saclay, Universite Paris Diderot, Sorbonne Paris Cite, F-91191 Gif-sur-Yvette, France)
      • 10:40
        Personal introductions 10m
      • 10:50
        Development Status and Missing Features 20m
        Speakers: Karl Kosack (AIM, CEA, CNRS, Universite Paris-Saclay, Universite Paris Diderot, Sorbonne Paris Cite, F-91191 Gif-sur-Yvette, France), Dr Maximilian Linhoff (Department of Physics, TU Dortmund University)
    • 11:10 11:20
      Break 10m
    • 11:20 12:00
      Reconstruction and model training

      Covering: Calibration, De-noising, Incomplete data (data reduction, missing data, e.g. pixels)

      Image cleaning:
      What is the correct method?
      What is used by the community but missing from dppspipe?
      What is an appropriate benchmark?

      Image modification:
      Is it needed?
      When do we need it?

      Missing pixel handling:
      What is the correct way to handle it?
      When is in-painting possible?

      Calibration:
      What does the calibpipe DL1 interface look like?

      Experience with raw data

      Convener: Dr Maximilian Linhoff (Department of Physics, TU Dortmund University)
      • 11:20
        Presentation on local use up to DL2 20m
        Speaker: Lukas Nickel (Department of Physics, TU Dortmund University)
      • 11:40
        DEMO: API for reconstructor and IO plugins 20m
        Speaker: Dr Maximilian Linhoff (Department of Physics, TU Dortmund University)
    • 12:00 13:30
      Lunch break 1h 30m RIE algorithms

    • 13:30 15:30
      Reconstruction and model training
      Convener: Dr Maximilian Linhoff (Department of Physics, TU Dortmund University)
      • 13:30
        Demo and tools for training, application, and performance 1h
        Speaker: Lukas Beiske (TU Dortmund University)
      • 14:30
        IRFs for RTA 15m
        Speaker: Vincent Pollet (LAPP IN2P3)
      • 14:45
        Discussion: Reconstruction model performance validation 20m

        What tests are needed?
        What tests are implemented?

      • 15:05
        Define roadmap for reconstruction 25m
    • 15:30 16:00
      Coffee break 30m
    • 16:00 18:00
      IRF and optimisation
      • 16:00
        IRF introduction 8m
        Speaker: Tomas Bylund (CEA DAP)
      • 16:08
        Demo of IRF generation 20m
        Speaker: Tomas Bylund (CEA DAP)
      • 16:43
        IRF performance benchmarks 45m
        Speakers: Karl Kosack (AIM, CEA, CNRS, Universite Paris-Saclay, Universite Paris Diderot, Sorbonne Paris Cite, F-91191 Gif-sur-Yvette, France), Tomas Bylund (CEA DAP)
      • 17:28
        Define roadmap for IRFs 32m
    • 09:30 10:30
      Workflows: putting it together on a large-scale cluster
      Convener: Luisa Arrabito (Laboratoire Univers et Particules de Montpellier, Universite de Montpellier, CNRS/IN2P3)
      • 09:30
        Overview of existing workflows with DIRAC 20m
        Speaker: Luisa Arrabito (Laboratoire Univers et Particules de Montpellier, Universite de Montpellier, CNRS/IN2P3)
      • 09:50
        Discussion: future of workflows 20m
      • 10:10
        DEMO: Local workflows with CWL 20m
        Speaker: Mykhailo Dalchenko (University of Geneva)
    • 10:30 11:00
      Coffee break
    • 11:00 11:30
      Workflows: putting it together on a large-scale cluster
      • 11:00
        DEMO: Fetching productions 10m
        Speaker: Luisa Arrabito (Laboratoire Univers et Particules de Montpellier, Universite de Montpellier, CNRS/IN2P3)
      • 11:10
        Define roadmap for Workflows 20m
    • 11:30 12:30
      Hackathon
    • 12:30 14:00
      Lunch break 1h 30m RIE algorithms

    • 14:00 16:00
      Social events: Mini Excursions
    • 16:00 16:30
      Coffee break
    • 16:30 18:00
      Benchmarking System / Performance / Release verification
      Convener: Tomas Bylund (CEA DAP)
      • 16:30
        Introduction to benchmarking 10m
        Speaker: Tomas Bylund (CEA DAP)
      • 16:40
        Demo/Discussion: what is there now 20m
      • 17:00
        Discussion: Benchmark coverage, types, and features 20m
      • 17:20
        Discussion: release verification 20m
      • 17:40
        Define roadmap for Benchmarking 20m
    • 19:00 21:30
      Social events: Social dinner
    • 09:30 10:30
      Algorithm Topics
      • 09:30
        ImPACT 20m
        Speaker: Georg Schwefer (MPIK)
      • 09:50
        Image truncation 20m
        Speaker: Clara Escanuela Nieves (Max-Planck-Institut für Kernphysik)
      • 10:10
        Discussion: Missing Algorithms 20m
    • 10:30 11:00
      Coffee break 30m
    • 11:00 12:30
      Hackathon
    • 12:30 13:30
      Lunch break 1h RIE algorithms