vdas_20120717
by
Marie Anne Bizouard
—
last modified
2012-07-17 17:52
Special VDAS call: GPU use for MBTA
1. Announcements
2. MBTA enhancement with GPUs [Gergely, Frederique, Benoit]
Attendance: John, Jeroen, Chris, Walter, Michalis, Frederique, Benoit, Gergely, Alberto, MAB [minutes]
Frederique: Core of MBTA: filtering is not dominated by FFT computing time: 30% (no latency stress) --> 60% (latency optimized). Still dominated by the combination of the 2 bands. Use of GPUs for parameter estimation. Spin would improve the localization. Requires lots of computing time.
Combination of 2 bands: lots of if tests. Not straightforwardly optimisable on GPUs. Reminder: 30 cores for MBTA. GSTLAL: >1000 cores. Cutoff: 30 Hz. Going down? Gain not obvious. SNR gain vs FAR increase. Study to be done, but it also needs the real data glitch rate.
Combination: matched filtering of each band. Offset them (translation + rotation). Virtual templates are built. For each of them one knows the offset to apply in the 2 bands.
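A minimal sketch of the combination step described above, not MBTA's actual code. It assumes each band yields a complex matched-filter time series at the same sampling, that the "translation" is an integer sample shift and the "rotation" a phase factor, and that both are applied to the low band only. The names combine_bands and peak, and all arguments, are hypothetical.

```python
import cmath

def combine_bands(z_low, z_high, shift, phase):
    """Coherently add two complex matched-filter series (illustrative only).

    z_low, z_high: lists of complex filter outputs, same sampling (assumption).
    shift: integer sample offset applied to the low band (the "translation").
    phase: phase offset in radians applied to the low band (the "rotation").
    """
    rot = cmath.exp(1j * phase)
    combined = []
    for k in range(len(z_high)):
        j = k + shift
        # Samples shifted outside the low-band series contribute nothing.
        low = z_low[j] * rot if 0 <= j < len(z_low) else 0j
        combined.append(low + z_high[k])
    return combined

def peak(series):
    """Return (index, amplitude) of the largest-modulus sample."""
    k = max(range(len(series)), key=lambda i: abs(series[i]))
    return k, abs(series[k])
```

For a virtual template, shift and phase would be the precomputed offsets for that template. The per-sample work here is uniform; the branching mentioned in the discussion would come from the surrounding threshold and selection logic, which this sketch leaves out.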
Mours: combination part optimization not done. S6/VSR3 : use of only 4 cores! Computing was not a problem.
Gergely: MBTA: 50 s of latency. With GPUs we could achieve 5 s. Is that needed? Measure the performance of filtering. Maximization would be improved with GPUs. GPUs for MBTA should not be abandoned, but the goals must be clarified (if 50 s latency is fine).
Frederique: 50 s is fine. Would prefer to see the PE with spinning waveforms available.
Gergely: PE project is progressing well. But I thought that there are parts of MBTA that could be improved. Replacing 10 lines of code would be easy. For instance the band combination?
Frederique: OK what about hardware?
Gergely: ~500 euros for a standard card. 2000 euros for a special optimised computing card.
John: 2 PE projects: template generation speed. Goal: reduce the time to 1 hour. MCMC is not well suited for GPUs. Not highly parallelized. Much bigger project. Payoff could also be larger.
John: spinning templates generation is the bottleneck.
Gergely working with Riccardo+Salvatore. John interested.
John: problem with MCMC: every time you compute the likelihood for a template, you decide
--> PE project with GPUs could very well become a flagship project.
John: Could these templates be callable from LAL?
Gergely: yes, it should be. For a first implementation, it will not be callable from LAL, but then we will work on this.
Frederique: at the moment, we think priorities should be put on PE. There might be a day where MBTA GPU enhancement could be useful.
Benoit: future improvements for MBTA: multi-threading. First steps towards GPU implementation.
Meeting adjourned