
Exercises for the CBC data analysis

by Damir Buskulic last modified 2010-02-11 17:58

Practicing exercises for the CBC data analysis


0 - Introductory remarks and setup of the environment

The purpose of these exercises is purely demonstrative. We will analyze the data coming from a single detector (Virgo), which is not the standard, complete procedure. The real pipelines combine three detectors (the two LIGO detectors and Virgo) to reduce noise and set upper limits on detection... or, hopefully, to detect something!

The ROOT environment

The following exercises use the Vega environment, which is based on ROOT. Vega extends ROOT with a few libraries for manipulating and displaying gravitational wave data. When logged in on the Cascina computers, launch Vega by typing:

vega

A few macros have been prepared; they need only very small adjustments for you to complete the exercises. Once in Vega, launch a macro with the command ".x", for example:

.x ExampleExercise.C

To quit Vega, type ".q".

Setup for the exercises

Before running the macros and analyses described below, you should set up your environment. This is done by running the script /users/buskulic/vesf_public/vesf_setup_cbc.sh from your home directory on a farmn machine:

cd  ~/
/users/buskulic/vesf_public/vesf_setup_cbc.sh

This will also copy all the files you need for the exercises. They will be in a newly created directory called vesf_CBC in your home directory.

All the provided macros are examples. They should work as is and give you preliminary information, but you will need to modify them to complete the exercises.

I - Understanding the waveforms and their characteristics

The material for the following exercise is in the directory vesf_CBC/Waveforms

  1. Draw in the time domain a few standard Taylor 2 PN expansion waveforms with various masses. Note in particular the length of the waveform.
        Macro: waveforms.C
    Try a few cases with:
    • Same masses for the two components, from 1 Ms to 20 Ms (Ms = solar mass)
    • Various eta parameter values for the same total mass
      • 5 Ms total mass
      • 20 Ms total mass
    • Various chirp masses
  2. For the previous cases, evaluate the duration of the chirp. Be aware that in the macro waveforms.C, the calculated duration and the starting frequency are only valid for the first generated waveform. For the following ones, the starting frequency is adjusted so that the length is approximately the same. Try to make a plot of the duration versus the chirp mass.
  3. For the previous cases, evaluate by eye (try zooming in on the x axis) the final frequency of the waveforms. Try to plot the final frequency versus the total mass of the system.
  4. Now in the frequency domain, plot a waveform for different cases. You can superimpose the various cases on the same plot (with different colors) to see the differences.
        Macro: waveforms.C
  5. Draw in the time domain and in the frequency domain an Effective One Body (EOB) waveform.
        Macro: waveforms.C
    • What are the differences with the 2 PN waveforms?
  6. Draw in the time domain and in the frequency domain a "spinning" waveform.
        Macro: waveforms_spin.C
    • Try to modify the alignment of the spins with respect to the orbital angular momentum. When do the oscillations on the waveform become small?
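
As a rough cross-check for items 2 and 3 of this section, the leading-order (Newtonian) formulas for the chirp duration and for the frequency of the last stable orbit can be evaluated directly. The sketch below is written in Python rather than as a Vega macro, and it is only an approximation: waveforms.C generates 2 PN waveforms, so the durations it produces will differ somewhat. The function names and the 40 Hz starting frequency are choices made for this illustration, not values taken from the macro.

```python
import math

# G * M_sun / c^3 in seconds (one solar mass expressed as a time)
T_SUN = 4.925491e-6


def chirp_mass(m1, m2):
    """Chirp mass in solar masses."""
    return (m1 * m2) ** 0.6 / (m1 + m2) ** 0.2


def symmetric_mass_ratio(m1, m2):
    """eta = m1*m2 / (m1+m2)^2; its maximum, 0.25, is reached for equal masses."""
    return m1 * m2 / (m1 + m2) ** 2


def newtonian_duration(m1, m2, f_low):
    """Leading-order time to coalescence (s), starting at GW frequency f_low (Hz)."""
    mc_seconds = chirp_mass(m1, m2) * T_SUN
    return (5.0 / 256.0) * mc_seconds ** (-5.0 / 3.0) * (math.pi * f_low) ** (-8.0 / 3.0)


def isco_frequency(m1, m2):
    """GW frequency (Hz) at the Schwarzschild last stable orbit of the total mass."""
    return 1.0 / (6.0 ** 1.5 * math.pi * (m1 + m2) * T_SUN)


F_LOW = 40.0  # Hz, hypothetical starting frequency (not the value used in waveforms.C)

print("  m1=m2   Mchirp   duration(s)   f_final(Hz)")
for m in (1.0, 1.4, 5.0, 10.0, 20.0):
    print(f"{m:7.1f} {chirp_mass(m, m):8.3f} "
          f"{newtonian_duration(m, m, F_LOW):13.2f} {isco_frequency(m, m):13.1f}")
```

Each printed row is one equal-mass system. Lighter systems chirp for longer and end at a higher frequency, which is the trend to look for in the plots.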

II - The matched filtering process

The material for the following exercise is in the directory vesf_CBC/Waveforms

This exercise needs more than a macro to work. You will have to compile a small program that generates some time series, which are then read by a macro in Vega. The following steps are needed:

  • Modify the program compute_cross_product.c to suit your needs. We strongly advise trying it first without modifications.
  • Compile it with the small shell script compile_compute_cross_product provided in the same directory:

                ./compile_compute_cross_product

  • Run the compiled program:

                ./compute_cross_product

    This will generate a set of files cross_product_template1.vect, cross_product_template2.vect and cross_product_template3.vect. Of course, you should adapt the program to answer the questions below.
  • In Vega, run the script plot_cross_products.C, which plots the contents of the files produced above.

Now, to work:

  1. Simple matched filtering of a signal with a time-shifted version of itself. Obtain a time domain plot of the result versus the time shift. What is the time of the maximum? What is its value?
  2. Do the same, but slightly change the parameters of the time-shifted template. How much does the maximum value differ from the previous case? Try a few different parameters. For example, keep the same variation of the parameters but change the total mass.
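
The idea of items 1 and 2 can be tried out on a toy signal before touching compute_cross_product.c. The following Python sketch correlates a simple linear chirp (not a post-Newtonian waveform) with data containing a time-shifted copy of itself. It assumes noise-free data and no spectrum weighting. All names and numbers in it are hypothetical, and it does not reproduce what compute_cross_product.c actually computes.

```python
import math


def chirp(n_samples, dt, f0, k):
    """Toy linear chirp with instantaneous frequency f0 + k*t (not a PN waveform)."""
    return [math.sin(2.0 * math.pi * (f0 * i * dt + 0.5 * k * (i * dt) ** 2))
            for i in range(n_samples)]


def norm(x):
    """Euclidean norm of a time series."""
    return math.sqrt(sum(v * v for v in x))


def cross_product(data, template, signal_norm):
    """Normalized correlation of the template with the data at every time shift."""
    n = len(template)
    scale = norm(template) * signal_norm
    return [sum(a * b for a, b in zip(data[shift:shift + n], template)) / scale
            for shift in range(len(data) - n + 1)]


def peak(series, dt):
    """Return (time of the maximum, value of the maximum)."""
    i = max(range(len(series)), key=series.__getitem__)
    return i * dt, series[i]


DT = 1.0e-3      # sampling step in seconds (hypothetical)
OFFSET = 300     # the signal starts 0.3 s into the data

signal = chirp(400, DT, 20.0, 200.0)
data = [0.0] * OFFSET + signal + [0.0] * 300

# Item 1: template identical to the signal
same = cross_product(data, signal, norm(signal))
# Item 2: template with a slightly different chirp rate
other = cross_product(data, chirp(400, DT, 20.0, 220.0), norm(signal))

print("identical template:", peak(same, DT))
print("modified template: ", peak(other, DT))
```

With the identical template the normalized maximum is 1, at the time where the signal starts. With the modified chirp rate the maximum is lower.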

III - Running the MBTA data analysis pipeline without injections

The MBTA data analysis pipeline has been run in advance on the data from the Virgo C7 run, so that students can go straight to the interesting part. Waiting for an analysis to finish is boring (it takes up to an hour for the C7 data we use)!
The result of the analysis is in the folder vesf_CBC/C7_data/. The files of interest are:
  • The generated template grid files:
    • for the low frequency band: m1Min_1.00_m1Max_5.00_Kmin_0.00_Kmax_100.00_nuMin_80.00_nuMax_263.00.grid
    • for the high frequency band: m1Min_1.00_m1Max_5.00_Kmin_0.00_Kmax_100.00_nuMin_263.00_nuMax_2000.00.grid
    • for the total frequency band: m1Min_1.00_m1Max_5.00_Kmin_0.00_Kmax_100.00_nuMin_80.00_nuMax_2000.00.grid
  • The spectrum frame vector file: PSD_sampling_4000_npoints_8192_channel_h_4kHzNo50.vect
  • The file containing the set of candidate events for the analysis without injections: MBTA_results.ffl
The analysis has also been done on a set of simulated data with colored Gaussian noise. The spectrum is not the same as that of the true C7 data, but this does not change the overall results much. The result of the analysis is in the folder vesf_CBC/Gaussian_noise/. The files of interest are:
  • The generated template grid files:
    • for the low frequency band: m1Min_1.30_m1Max_2.00_Kmin_0.00_Kmax_100.00_nuMin_80.00_nuMax_180.00.grid
    • for the high frequency band: m1Min_1.30_m1Max_2.00_Kmin_0.00_Kmax_100.00_nuMin_180.00_nuMax_2000.00.grid
    • for the total frequency band: m1Min_1.30_m1Max_2.00_Kmin_0.00_Kmax_100.00_nuMin_80.00_nuMax_2000.00.grid
  • The spectrum frame vector file: PSD_sampling_4000_npoints_8192_channel_h_4kHzNo50.vect
  • The file containing the set of candidate events for the analysis without injections: MBTA_results.ffl
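
The grid file names listed above follow a key_value pattern, so the parameters needed in Plot_Saved_Grid.C can be read off mechanically. A small Python sketch, assuming that every grid file follows the same naming pattern as the ones listed here (the function name parse_grid_name is made up for this example):

```python
import re


def parse_grid_name(filename):
    """Split 'key_value_key_value...' (with optional .grid suffix) into {key: float}."""
    stem = filename[:-len(".grid")] if filename.endswith(".grid") else filename
    # A key starts with a letter; a value is a decimal number such as 263.00
    pairs = re.findall(r"([A-Za-z][A-Za-z0-9]*)_(\d+\.\d+)", stem)
    return {key: float(value) for key, value in pairs}


name = "m1Min_1.00_m1Max_5.00_Kmin_0.00_Kmax_100.00_nuMin_80.00_nuMax_263.00.grid"
for key, value in parse_grid_name(name).items():
    print(f"{key:6s} = {value}")
```

The returned dictionary contains the mass limits and nuMin, which are the values to change in the macro.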

Now to work:

  1. Visualize the grid of templates (above) generated by the MBTA data analysis pipeline.
         Macro: Plot_Saved_Grid.C
    Try also to visualize the low and high frequency band grids. The name of the file should give you all the information you need to change the parameters in the macro (the masses and nuMin, the lower frequency limit).
  2. In the simulated Gaussian data (be sure to go to the right directory!), use the look_at_one_event.C macro to extract the selected event and obtain its parameters. Visualize the cross-product as a function of time around the event. Compare with the ideal case of exercise II.
    You can do this for the two times suggested in the macro.
  3. For the simulated Gaussian data (be sure to go to the right directory!), analyze the events found by MBTA (with the macro all.C) to obtain the following plots/distributions:
    • Signal-to-noise ratio (SNR) of the events versus the time of the event
      • Notice the variation of the rate of events with time. Spot the "noisy" periods and "quiet" periods.
    • SNR distribution of the events
    • 2-dim plot of the SNR in the high frequency band versus the SNR in the low frequency band for the events
      • Where would you expect the true events to be on this plot?
    • SNR versus total mass of the template that fired the event
      • Can you explain the distribution, in particular the influence of high masses?
  4. Same work for the real C7 data. Compare the plots for the two cases, Gaussian noise and C7 data.
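
To spot the "noisy" and "quiet" periods mentioned in item 3, one option is to count the events in fixed time bins. The sketch below does this in Python on a made-up list of event times; how to read the real event times out of the MBTA results is not shown here. The function name and the bin width are arbitrary choices.

```python
from collections import Counter


def rate_per_bin(times, bin_width):
    """Count events in consecutive time bins of bin_width seconds."""
    counts = Counter(int(t // bin_width) for t in times)
    first, last = min(counts), max(counts)
    # Bins without any event are reported with a count of zero
    return [(b * bin_width, counts[b]) for b in range(first, last + 1)]


# Made-up event times in seconds, for illustration only
event_times = [0.5, 1.2, 1.7, 1.9, 4.1]

for start, n in rate_per_bin(event_times, bin_width=1.0):
    print(f"{start:6.1f} s  {n:3d} events")
```

Bins with many more events than their neighbours are candidates for the noisy periods.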

IV - Running the MBTA data analysis pipeline on data containing injections

The result of the analysis on data containing software injections is in the folder vesf_CBC/C7_data_with_injections/. The files of interest are:
  • The generated template grid files:
    • for the low frequency band: m1Min_1.00_m1Max_5.00_Kmin_0.00_Kmax_100.00_nuMin_80.00_nuMax_263.00.grid
    • for the high frequency band: m1Min_1.00_m1Max_5.00_Kmin_0.00_Kmax_100.00_nuMin_263.00_nuMax_2000.00.grid
    • for the total frequency band: m1Min_1.00_m1Max_5.00_Kmin_0.00_Kmax_100.00_nuMin_80.00_nuMax_2000.00.grid
  • The spectrum frame vector file: PSD_sampling_4000_npoints_8192_channel_h_4kHzNo50plusSignal.vect
  • The file containing the set of candidate events for the analysis with injections: MBTA_results.ffl

The analysis has also been done on a set of simulated data containing injections in colored Gaussian noise. The spectrum is not the same as that of the true C7 data, but this does not change the overall results much. The result of the analysis is in the folder vesf_CBC/Gaussian_noise_with_injections/.

Now to work:

  1. In order to check that it is the same as for the previous exercise, visualize the grid of templates (above) generated by the MBTA data analysis pipeline.
         Macro: Plot_Saved_Grid.C
    Try also to visualize the low and high frequency band grids.
    The name of the file should give you all the information you need to change the parameters in the macro (the masses and nuMin, the lower frequency limit).
  2. In the simulated Gaussian data (be sure to go to the right directory!), use the look_at_one_event.C macro to extract the selected event and obtain its parameters. Visualize the cross-product as a function of time around the event. Compare with the ideal case of exercise III.
    You can do this for the two times suggested in the macro.
  3. For the simulated Gaussian data (be sure to go to the right directory!), analyze the events (macro all.C) to obtain the following plots/distributions:
    • Signal-to-noise ratio (SNR) of the events versus the time of the event
      • Notice the variation of the rate of events with time. Spot the "noisy" periods and "quiet" periods.
    • SNR distribution of the events
    • 2-dim plot of the SNR in the high frequency band versus the SNR in the low frequency band for the events
      • Where would you expect the true events to be on this plot? Can you spot them?
    • Chi^2 vs SNR distribution of the events
      • Where would you expect the true events to be on this plot? Can you spot them?
    • SNR versus total mass of the template that fired the event
      • Can you explain the distribution, in particular the influence of high masses?
  4. Examples of the previous plots are given in the macro all.C. Writing the code yourself, add a plot of the SNR of the simulated/injected event versus the SNR of the reconstructed event. All the ingredients are accessible, although not in a single macro!
  5. Observe the distributions and decide on a reasonable set of cuts to separate the noise from the "true" events (simulated injections).
  6. Same work for the real C7 data. Compare the plots for the two cases, Gaussian noise and C7 data.
  7. In all cases, how many events do you obtain with your cuts? Can you estimate their approximate parameters (masses)?
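
For items 4 and 5, two ingredients are needed: associating each injected event with a reconstructed one, and applying cuts. The Python sketch below pairs each injection with the loudest reconstructed event within a time window, then applies a simple rectangular cut on SNR and Chi^2. The event lists, the field names, the window and the thresholds are all invented for the illustration; suitable values have to come from the distributions you observe.

```python
def match_injections(injections, events, window):
    """Pair each injection with the loudest event within +/- window seconds.

    Returns (found, missed): found is a list of (injected SNR, reconstructed SNR),
    missed is the list of injections without any event in the window.
    """
    found, missed = [], []
    for inj in injections:
        near = [ev for ev in events if abs(ev["time"] - inj["time"]) <= window]
        if near:
            best = max(near, key=lambda ev: ev["snr"])
            found.append((inj["snr"], best["snr"]))
        else:
            missed.append(inj)
    return found, missed


def apply_cuts(events, snr_min, chi2_max):
    """Keep the events that are loud enough and have a small enough Chi^2."""
    return [ev for ev in events if ev["snr"] >= snr_min and ev["chi2"] <= chi2_max]


# Invented toy data: times in seconds, SNR and Chi^2 values chosen by hand
injections = [{"time": 100.0, "snr": 12.0}, {"time": 250.0, "snr": 9.0}]
events = [
    {"time": 100.002, "snr": 11.5, "chi2": 1.2},
    {"time": 100.004, "snr": 7.0, "chi2": 3.0},
    {"time": 180.0, "snr": 6.1, "chi2": 9.0},
    {"time": 400.0, "snr": 8.5, "chi2": 20.0},
]

found, missed = match_injections(injections, events, window=0.01)
selected = apply_cuts(events, snr_min=8.0, chi2_max=5.0)

print("injected vs reconstructed SNR:", found)
print("missed injections:", missed)
print("events passing the cuts:", selected)
```

The pairs in found are the points of the injected versus reconstructed SNR plot; the missed list shows which injections the chosen window or the pipeline did not recover.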

V - Blind injection challenge

A set of injections was made on the C7 data (in the directory vesf_CBC/C7_data_with_blind_injections/), without telling you the actual times or parameter values (so-called "blind injections"). Of course, we know the times and parameters...
Will you be able to find these events? What can you say about them?