
Running Rome CW code to produce SFDB: locally, on LSF at CNAF, on LIGO Condor clusters

by Alberto Colla last modified 2013-12-04 11:15

Index

Pss_Fr package

The core of the Rome CW code is the C-based Pss_Fr package. It is currently maintained on SVN:

svn co $SVNROOT/Pss_Fr

To compile it, e.g. at CNAF:

cd Pss_Fr/trunk/src/pss/pss_sfdb
./compila CNAF

Note: the "location" argument (CNAF) is needed to locate the libraries containing the ephemeris database.

The executable crea_sfdb.out will be produced.

At CNAF the code is available on the software area:

/opt/exp_software/virgo/virgoDev/RomeCW/RomePSS

and the executable has been put here:

/opt/exp_software/virgo/virgoDev/RomeCW/Pss_Fr/executable/crea_sfdb.out

Manually run the program

The program requires a set of input parameters from the standard input. They are explained in the following.

./crea_sfdb.out

 |crea_sfdb_20131111_152930.log| 
Detector ? virgo,ligoh,ligol 
Put your favourite detector here

Type  of SFDB files? 
 <=0--> Void SFDB 
 ==1--> no SFDB 
 >=2-->full SFDB 
0: SFDB files will contain spectrograms only
1: No SFDB files will be produced
2: SFDB files will contain the full spectra. Select this option to make SFDB!

Factor to write EVF in the file ?  (1 or 2 usually)
EVF are the frequency peaks overcoming a given Critical Ratio (CR) threshold above the average spectrum. This option is used to limit the number of EVF written in the log file.

input ffl file, for frame input: 
Path of the FFL file (frame file list). FFL of the latest runs are available at CNAF under: /opt/exp_software/virgo/virgoData/ffl/

science segment file name: 
Path of the file listing the science segments, in the format: gps_start gps_end duration.

If the science segment file is not available put a dummy string, e.g. "pippo" here. The program writes in the log file and in the SFDB header whether the science segment information has been used or not.

Science segments are used to mask the data with a "trapezoidal" window (data are set to zero outside the segments and scaled linearly from 0 to 1 over 60 seconds at the edges of the segments).

Science segments for LIGO H and L during S6 are here: /opt/exp_software/virgo/virgoDev/RomeCW/science_segments
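
As a sketch of the expected format (the GPS times below are purely illustrative), a segment file can be checked for self-consistency with a short awk filter:

```shell
# Hypothetical science segment file in the "gps_start gps_end duration" format
cat > segments_example.txt <<'EOF'
931035615 931122015 86400
931208415 931251615 43200
931294815 931316415 21600
EOF
# Sanity check: duration must equal gps_end - gps_start on every line
awk '{ if ($2 - $1 != $3) print "bad duration on line " NR }' segments_example.txt
```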

file frame input channel
 h_4096Hz (4096 Hz)
 Pr_B1_ACp (20 kHz)
 h_20000Hz (20 kHz)
 Em_SEBDCE01 (1 kHz)
 Em_MABDCE01 (20 kHz)
 H1:LDAS-STRAIN (16384 Hz, LIGO H1)
 L1:LDAS-STRAIN (16384 Hz, LIGO L1)
Frame channel selected: 
Put here the channel name of choice (reconstructed strain channel: h_4096Hz for Virgo, H1:LDAS-STRAIN or L1:LDAS-STRAIN for LIGO), e.g. H1:LDAS-STRAIN. The program echoes the selection after "Frame channel selected:".

Verbosity level 0 1 2 3
Standard output verbosity level

chunk len  -  DS type (0,1,2=interlaced) 
The chunk length will be rounded up to the next power of 2 if it is not already one.
Typical values for the Virgo 1-band:
 len = 4194304 (4096 Hz); len = 16777216 (20 kHz, e.g. Em_MABDCE01, Pr_B1_ACp); len = 1048576 (1 kHz, e.g. Em_SEBDCE01)
 ds type = 2 interlaced; 1 not interlaced
Enter two numbers, return-separated:

  1. the length of the FFT chunk: e.g. for 1024 s long FFTs, put 4194304 for 4096 Hz sampled (Virgo) h(t), 16777216 for 16384 Hz sampled (LIGO) h(t)
  2. interlaced (2) or non-interlaced (1) FFTs
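
These chunk lengths are simply the FFT duration multiplied by the sampling rate (rounded to a power of 2), as a quick check shows:

```shell
# Chunk length = FFT duration (here 1024 s) times sampling rate
fft_seconds=1024
echo $(( fft_seconds * 4096 ))    # Virgo h(t) at 4096 Hz  -> 4194304
echo $(( fft_seconds * 16384 ))   # LIGO h(t) at 16384 Hz  -> 16777216
```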

Maximum number of data chunk to be done ?
Put a big number (e.g. 10000000) to analyze a full run, or a small number for testing purposes.

Reduction factor, for very short FFTs (e.g. 2,4,8..Sugg.:128) ?
Frequency reduction factor for the spectrograms written in the SFDB files

windows  (0=no,1=Hann,2=Hamm,3=MAP, 4=Blackmann flatcos 5=flat top,cosine edge. Sugg. 5) 
Put 5

Cut frequency for the highpass filtering (e.g. 100. If 0 the highpass and EVT veto are skipped) ?
We put 100 for Virgo data. WARNING: This is not the only parameter used for highpass filtering: there are others, hardcoded and specific to Virgo or LIGO! These parameters should be carefully reviewed and tuned.

Subsampling factor for the veto in the subbands. Power of 2 (e.g. 128)? <=0: not applied
Typical value for the Virgo environmental channels Em_SEBDCE01 (1 kHz) and Em_MABDCE01 (20 kHz): 4.
Otherwise put -1 (do not analyze sub-bands).

Max number of FFTs in one output file (e.g. 100) ?
Number of FFTs per SFDB file
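
Since all parameters are read from standard input, a run can be made non-interactive by piping in a prepared answer file. This is a sketch, assuming the answers follow the prompt sequence documented above; the values mirror the LIGO H1 example used on this page:

```shell
# Sketch: answer file for a non-interactive crea_sfdb.out run (LIGO H1 example).
# The answer order is assumed to follow the prompt sequence documented above.
cat > sfdb_input.txt <<'EOF'
ligoh
2
2
/opt/exp_software/virgo/virgoData/ffl/S6/H1_LDR_C02_L2.ffl
/opt/exp_software/virgo/virgoDev/RomeCW/science_segments/segments_Science_LIGOH_S6.txt
H1:LDAS-STRAIN
1
16777216
2
10000000
128
5
100
-1
100
EOF
# ./crea_sfdb.out < sfdb_input.txt   # run where the executable is available
```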

The program produces two files:

  • crea_sfdb_DATE_TIME.log: the log file, with the list of EVF, EVT (transient events in time domain), etc. Here DATE and TIME correspond to when the program is run;
  • CHANNEL_FFTDATE.SFDB09 (e.g. H1:LDAS-STRAIN_20100831_002139.SFDB09): the FFT data file. FFTDATE is the date of the first FFT in the file.

The NoEMi wrapper

The NoEMi framework contains a Python wrapper to make it easier to run crea_sfdb.out, in particular in a batch system.

The NoEMi package is in SVN:

svn co $SVNROOT/Noemi

and a copy is available at CNAF: /opt/exp_software/virgo/virgoDev/Noemi/

To run the SFDB wrapper, the following files are required:

  • Noemi/trunk/scripts/noemi_event_finder.py: the Python wrapper
  • Noemi/trunk/templates/noemi_event_finder.ini: the configuration file for the wrapper
  • Noemi/trunk/templates/Noemi_env.sh: Environment source file

First of all, source the environment file and check that the paths in it are correct. If they are, noemi_event_finder.py will be in the PATH.

The configuration file sets up the input for crea_sfdb.out and other parameters (in parentheses brief explanation / corresponding input parameter of crea_sfdb.out):

[EXEC]

executable:/opt/exp_software/virgo/virgoDev/RomeCW/Pss_Fr/executable/crea_sfdb.out (-> path to the executable)


[INPUTPARS]

detector: ligoh                     #(-> Detector?)
channel: H1:LDAS-STRAIN          #(-> file frame input channel)
onlineJob: 2                        #(->Type  of SFDB files?)
genFactEvf: 4                     #(-> Factor to write EVF in the file ? )
subsamplFact: 4                 #(-> resampling factor)

sciSegFile: /opt/exp_software/virgo/virgoDev/RomeCW/science_segments/segments_Science_LIGOH_S6.txt       #(-> science segment file name)

debugLevel: 1                    #(-> Verbosity level)
fftLength: 16777216          #(-> chunk len)
interlace: 2                         #(-> DS type (0,1,2=interlaced))
nFft: 1000000                    #(-> Maximum number of data chunk to be done)
reductFactor: 128               #(-> Reduction factor, for very short FFTs)
window: 5                           #(-> windows)
cutFr: 100                           #(-> Cut frequency for the highpass filtering)
evbSub: -1                          #(-> Subsampling factor for the veto in the subbands)
nFftperFile: 100                  #(-> Max number of FFTs in one output file)


[RUNPARS]

fflFileName: /opt/exp_software/virgo/virgoData/ffl/S6/H1_LDR_C02_L2.ffl #(-> input ffl file, for frame input)

# possible formats for startDate and endDate: YYYY-MM-DD, 'yesterday', 'today' or 'ffl'
# 'ffl': the startDate and/or endDate are taken from the first/last frame of the ffl
startDate: 2009-06-01
endDate: 2011-08-10

runPath:  pwd #(-> path for the output)
tmpPath:  pwd #(-> path of a local folder where to store temporary results)


nDaysPerRun: 10 #(-> Period split)

# timeout for the executable
timeout: 36000

runLogLevel: DEBUG  #(-> log level of the wrapper)


Some of the configuration file parameters can be also given as options of noemi_event_finder.py:


noemi_event_finder.py    --config=<name of the .ini file> 
                                        --log=<name of the NoEMi log file> 
                                        --channel=<channel name>
                                        --fftLength=<length of the FFT (->chunk len)>
                                        --fflFileName=<name of the FFL file>
                                        --startDate=<YYYY-MM-DD> --endDate=<YYYY-MM-DD>

NoEMi will create a directory named run with various subdirectories. The most important are:

  • run/log: contains the crea_sfdb.out log file, named DATE_sfdb.log, where DATE is the startDate set in input
  • run/sfdb: SFDB output files, not renamed
  • run/out: program standard output/error

Run on LSF at CNAF

The worker nodes at CNAF cannot write directly on the shared folders (/gpfs_virgo4, /gpfs_virgo3). Output data from crea_sfdb.out or NoEMi must be transferred back to the user interfaces via the LSF sandbox. For this purpose, we have created a further bash script which handles the submission of NoEMi in LSF: Noemi/trunk/templates/run_sfdb_parallel.sh

This script compresses the run directory created by NoEMi into a file called run_DATE.tgz (DATE is the start date), which can be retrieved with bsub's -f option.

In short, this is an example of how to run NoEMi to produce SFDB at CNAF:

bsub    -f "run_2009-07-07.tgz < /home/VIRGO/collaalb/run_2009-07-07.tgz" 
            -f "run_2009-07-07.log < run_2009-07-07.log" 
            -o run_2009-07-07.log 
                   /opt/exp_software/virgo/virgoDev/Noemi/trunk/templates/run_sfdb_parallel.sh 
                                         --config=/opt/exp_software/virgo/virgoDev/CW/noemi_event_finder.ini 
                                         --startDate="2009-07-07" --endDate="2009-07-16"

which means: write the stdout to run_2009-07-07.log (-o option) and retrieve the stdout and the output tgz file to the user interface (-f options).

Note that the configuration file must be either put in a folder accessible to the worker nodes (e.g. in /opt/exp_software/...) or sent in the input sandbox with the -f option (-f noemi_event_finder.ini > /remote/path/noemi_event_finder.ini).

After transferring the zipped file, uncompress it with tar zxvf run_2009-07-07.tgz.

In this example the time spanned by each process is 10 days, which results in job durations of about 1 day.
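
The splitting into 10-day windows can be scripted. The sketch below (assuming GNU date; the bsub line is commented out as illustration, with the paths from the example above) prints the submission windows for a one-month span:

```shell
# Sketch: split a date range into 10-day submission windows.
start=2009-07-07
end=2009-08-06
step_days=10
cur=$start
while [ "$(date -u -d "$cur" +%s)" -lt "$(date -u -d "$end" +%s)" ]; do
    next=$(date -u -d "$cur + $step_days days" +%Y-%m-%d)
    if [ "$(date -u -d "$next" +%s)" -gt "$(date -u -d "$end" +%s)" ]; then
        next=$end
    fi
    echo "window: $cur -> $next"
    # bsub -f "run_${cur}.tgz < /home/VIRGO/collaalb/run_${cur}.tgz" \
    #      -o "run_${cur}.log" \
    #      /opt/exp_software/virgo/virgoDev/Noemi/trunk/templates/run_sfdb_parallel.sh \
    #      --startDate="$cur" --endDate="$next"
    cur=$next
done
```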

Run on LIGO Clusters

To run the program to create the SFDB on the Condor batch system of LIGO clusters (Caltech, Hannover...)
one can use a subset of the scripts implemented for the full NoEMi framework.

The scripts are stored in Noemi/trunk/ligo_scripts. The fundamental scripts are:

  • Noemi_env.sh: Source environment file
  • run_condor_daily.sh: main job management script
  • run_condor_sfdb.sh: Condor submission script. Prepares the Condor configuration file and issues the condor_submit command
  • sfdb.sh: wrapper of the actual noemi processes run in the batch system
  • channels.txt: a file containing the list of channels to be analyzed, in the format channel_name sampling_frequency

Noemi_env.sh defines the main env variables. The following three variables define the structure of the directories where the framework will run:

export noemiPath=/home/pulsar/NoEMi
export detector=H1
export run=HIFO-Y

In this case the "working directory" ($runPath) is /home/pulsar/NoEMi/H1/HIFO-Y. This directory should contain the channels.txt file. The system will create directories with the channel names where NoEMi will store the output. The other scripts mentioned above should be put in a directory software under $runPath, and the NoEMi event finder configuration file (noemi_event_finder.ini) in the directory software/config.

Furthermore, the NoEMi package should be put under $noemiPath. In other words, noemi_event_finder.py should be found in $noemiPath/Noemi/trunk/scripts.

Of course one can set up a different directory structure, but then the other env variables must be checked accordingly!

The other compulsory variables are dateStart and dateEnd.

There are three "boolean" variables, run_noemi_event_analyzer, run_noemi_line_tracker, run_noemi_make_lines_page, to choose which NoEMi processes to run. To run the event_finder only to produce SFDB, set the three variables to "no".
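
For example, to run only the event finder, the three switches in Noemi_env.sh would read (a sketch; the variable names are those listed above):

```shell
# Disable the downstream NoEMi processes: only the event finder
# (SFDB production) will run.
export run_noemi_event_analyzer=no
export run_noemi_line_tracker=no
export run_noemi_make_lines_page=no
```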

By default, the main channel (set by $dfchname) also produces SFDB.

The variables detector_ffl and dataType are used to query the frame file list (ligo_data_find) and should be set up according to the type of data to be analyzed.

Let's turn to run_condor_daily.sh. The first part deals with the query for frame files using ligo_data_find. In case the FFL is already available one can comment this part and set fflFileName variable to the FFL file.

Then the script extracts the channels listed in channels.txt into a new frame file set. This was implemented to work around data access problems with the LIGO raw data.
When using h(t) this is probably not necessary, so one can comment out this part and set fflCopyFileName=$fflFileName.

To summarize, to produce SFDBs on the LIGO Condor clusters:

  • Check out Pss_Fr and compile crea_sfdb.out;
  • Check out Noemi from the base working directory ($noemiPath);
  • Create the directory structure runPath=$noemiPath/$detector/$run;
  • Create the subdirectories $runPath/software and $runPath/software/config;
  • Copy the bash scripts from Noemi/trunk/ligo_scripts to $runPath/software;
  • Copy noemi_event_finder.ini from Noemi/trunk/templates to $runPath/software/config;
  • Edit noemi_event_finder.ini; in particular, set the correct path to crea_sfdb.out;
  • Edit the channel list file $runPath/channels.txt with the channel(s) to analyze;
  • Edit Noemi_env.sh;
  • Run $runPath/software/run_condor_daily.sh!
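
The setup steps above can be sketched as shell commands (a sketch: $HOME is used in place of /home/pulsar, the copy and run steps are commented out because they need the checked-out packages, and the channels.txt content is the H1 example from this page):

```shell
# Sketch of the LIGO-cluster setup described above.
export noemiPath=$HOME/NoEMi
export detector=H1
export run=HIFO-Y
runPath=$noemiPath/$detector/$run

# Create the working directory structure
mkdir -p "$runPath/software/config"

# cd "$noemiPath" && svn co $SVNROOT/Noemi                        # check out NoEMi
# cp Noemi/trunk/ligo_scripts/*.sh "$runPath/software/"           # management scripts
# cp Noemi/trunk/templates/noemi_event_finder.ini "$runPath/software/config/"

# Channel list: channel_name sampling_frequency
printf 'H1:LDAS-STRAIN 16384\n' > "$runPath/channels.txt"

# After editing Noemi_env.sh and noemi_event_finder.ini:
# "$runPath/software/run_condor_daily.sh"
```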