Astronomical gamma-ray data analysis can be very CPU- and/or I/O-intensive. The purpose of this nine-week, first-year physics student project is to time and profile typical data analysis tasks, with a focus on the speedups that can be obtained for the maximum likelihood fitting step by using multiple CPU cores.
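To give a flavor of the kind of measurement this is about, here is a minimal sketch (not part of the project; `fit_dataset` is a hypothetical stand-in for a real ctools/gammapy likelihood fit) that times the same batch of fits on a varying number of cores using Python's `multiprocessing` module:

```python
import time
from multiprocessing import Pool


def fit_dataset(seed):
    """Hypothetical stand-in for one maximum likelihood fit (CPU-bound loop)."""
    total = 0.0
    for i in range(1_000_000):
        total += (i * seed + 1) % 7
    return total


if __name__ == "__main__":
    datasets = list(range(16))  # pretend we have 16 independent datasets to fit
    for ncores in (1, 2, 4):
        start = time.perf_counter()
        with Pool(processes=ncores) as pool:
            pool.map(fit_dataset, datasets)
        elapsed = time.perf_counter() - start
        print(f"{ncores} cores: {elapsed:.2f} s")
```

Comparing the elapsed times directly shows how close the task gets to the ideal linear speedup.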
This is basically an introduction and link collection for Andrei. The project description is in the next section.
We have nine weeks, and it is hard to predict how quickly results will come, so week 6 is reserved either to continue the main project or to do one of the side projects, and the last two weeks are set aside to write up the report and finish loose ends.
The project report, together with the notes and scripts in the https://github.com/gammapy/gamma-speed/ repo, is the product of your project. It should be a starting point for further work by others on HESS, Fermi, and CTA data analysis speed in the future. Detailed descriptions of which tools you tried for timing and profiling (and possibly for measuring memory usage and disk I/O), which were useful and which weren't, and how to use them are all helpful.
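As an example of the kind of tool description that would be useful, here is a minimal sketch of Python's built-in `cProfile` and `pstats` modules (the `analysis_step` function is a placeholder, not a real analysis task):

```python
import cProfile
import pstats


def analysis_step():
    """Placeholder for an actual analysis task, e.g. event binning."""
    return sum(i ** 0.5 for i in range(1_000_000))


# Profile the call and dump the statistics to a file ...
cProfile.run("analysis_step()", "profile.out")

# ... then print the ten most expensive functions by cumulative time.
stats = pstats.Stats("profile.out")
stats.sort_stats("cumulative").print_stats(10)
```

Notes like this, on what each tool measures and how to read its output, are exactly what would make the repo useful to the next person.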
The most useful outcome would be an automatic script that measures certain aspects of ctools performance for typical analysis scenarios and can easily be re-run to try out speed improvements and to prevent performance regressions, although this level of automation is most likely not achievable in the given time; a minimal sketch of such a script follows below. To get an idea of what I have in mind, have a look at the PyPy speed center or the pandas benchmarks as measured by vbench.
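A very reduced sketch of what such a script could look like, under the assumption that each benchmark is a command we can run from the shell (the dummy CPU task below is a stand-in; a real case would invoke a ctools tool on a fixed test dataset):

```python
import csv
import subprocess
import time
from datetime import datetime, timezone

# Hypothetical benchmark cases: each maps a name to a command line.
# The Python one-liner is a placeholder for a real ctools invocation.
BENCHMARKS = {
    "dummy_cpu_task": ["python", "-c", "sum(x * x for x in range(10**7))"],
}


def run_benchmarks(outfile="benchmark_results.csv"):
    """Time each benchmark command and append the results to a CSV log,
    so repeated runs can be compared to spot speedups or regressions."""
    with open(outfile, "a", newline="") as f:
        writer = csv.writer(f)
        for name, cmd in BENCHMARKS.items():
            start = time.perf_counter()
            subprocess.run(cmd, check=True)
            elapsed = time.perf_counter() - start
            writer.writerow(
                [datetime.now(timezone.utc).isoformat(), name, f"{elapsed:.3f}"]
            )
            print(f"{name}: {elapsed:.3f} s")


if __name__ == "__main__":
    run_benchmarks()
```

Appending a timestamped row per run is the simplest way to build up a history that can later be plotted, in the spirit of the speed centers linked above.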
Here are some more useful references for tools you might use: