Summary pages
Contents
Introduction
The summary pages are a website showing plots of useful channels updated every ~30min. They can be found here:
Different kinds of plots can be produced: time series, spectra, spectrograms, Rayleigh gaussianity statistics, etc. This is all produced using the GWsumm software developed for the big detectors (https://ldas-jobs.ligo.caltech.edu/~detchar/summary/).
The 40m summary pages are also hosted on the LDAS server, which is protected by LIGO.ORG credentials:
Configuration
The content of the pages is controlled by configuration files found in:
/users/public_html/detcharsummary/ConfigFiles (this is a symlink to /home/export/home/detcharsummary/40m-summary-pages/configurations/c1). These files are version controlled, so commit any changes you make.
These are synced to the LDAS computer cluster, where the data are processed. The filenames must begin with the characters c1 and have the .ini extension. A special case is the defaults.ini file, which contains HTML and other general information. Although this file is always loaded, its settings can be overridden by the custom files (e.g. if the same property is defined in both defaults.ini and c1-lsc.ini, the latter takes precedence).
The remote LDAS folder mirrors the local one in nodus, where the files are version controlled. For most purposes, the user should edit the (version controlled) config files on nodus, and not the corresponding ones on ldas-pcdev6.
For information on the INI format itself see the GWsumm docs or this short guide.
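For orientation, a minimal tab definition in the GWsumm INI style might look like the sketch below. The tab and channel names are invented for illustration and do not come from the live 40m configuration; consult the actual files in ConfigFiles for real examples.

```ini
; hypothetical sketch of a c1-*.ini tab -- names invented for illustration
[tab-Example]
name = Example
; two plots per row
layout = 2
; numbered entries define the plots: <channel> <plot type>
1 = C1:EXAMPLE-CHANNEL timeseries
2 = C1:EXAMPLE-CHANNEL spectrogram
```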
Technical info
Workflow
The central part of the process is a set of cron-like gwsumm jobs executed on the cluster every 30 minutes. Schematically (note that ldas-pcdev6 is now used instead of the decommissioned ldas-pcdev1):
This is the chain of events:
1. A cron-type Condor job wakes up and executes the gw_daily_summary_40m bash script, which:
- Sets up the Python environment;
- Rsyncs the nodus and LDAS directories containing the config files;
- Lists the config files present;
- Executes a gw_summary_pipe job with the proper options and waits for it to finish.
The Condor manual is a bit impenetrable. You may find some more helpful tutorials here.
2. Files are processed in parallel on the cluster:
- gw_summary_pipe spawns multiple gw_summary jobs (one per config file);
- Each gw_summary job corresponds to a node in the Condor DAG;
- Condor jobs are processed in the local universe, at most two at a time (the default).
3. Output is synced back to nodus:
- Handled by a regular cron tab running every 15 minutes;
- Only HTML from the local-time "yesterday", "today" and "tomorrow" (if it exists) directories is rsynced.
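The DAG built in step 2 contains one gw_summary node per config file. A hypothetical fragment is sketched below; the node names, submit-file name and VARS key are invented for illustration, not copied from the real pipeline output.

```text
# hypothetical Condor DAG fragment -- one gw_summary node per config file
JOB  c1-lsc  gw_summary.sub
VARS c1-lsc  configfile="c1-lsc.ini"
JOB  c1-ioo  gw_summary.sub
VARS c1-ioo  configfile="c1-ioo.ini"
```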
Note that this whole process depends on the 40m frames being available in the cluster; the processes responsible for that are handled by Dan Kozak. The files themselves can be found at /hdfs/40m/full/ on the cluster. Also, a similar process takes place once a day to re-process the plots from the previous UTC day (gw_daily_summary_40m_rerun script).
Software
Several independent pieces come into play:
GWpy: python module to handle LIGO data developed by Duncan Macleod;
GWsumm: python module and associated executables (gw_summary and gw_summary_pipe) that produce detector summary pages, based on GWpy (docs);
Configuration files: stored locally on nodus and version controlled;
40m-specific scripts: a git repository containing bash scripts and Condor submit files that make the necessary preparations to run GWsumm jobs for the 40m; it also takes care of syncing HTML, plots and config files between nodus and LDAS;
Data transfer: cron jobs that sync frames from nodus to LDAS, managed by Dan Kozak.
Notes
- On the cluster side, everything (except the frame transfer) is executed from the 40m shared LDAS account.
- All jobs (including crontabs) currently run on the ldas-pcdev6 CIT headnode (ldas-pcdev1 has been decommissioned).
- To update the plots for a past day, log in to the 40m shared account and run:
./DetectorChar/bin/gw_daily_summary_40m --day YYYYMMDD --file-tag some_custom_tag
(the --file-tag option is not strictly necessary, but it prevents conflicts with other instances of the code running simultaneously).
Known failure modes
An .ini file was defined which requires downloading many 16k channels. For some reason this consumes a lot of memory, and if a particular job's memory consumption exceeds the RequestMemory attribute set in the Condor ClassAd for that job, the job is automatically put on hold. This in turn prevents submission of the next 30-minute cycle of Condor jobs. To fix it, log on to the cluster and either (i) manually raise the RequestMemory attribute (which currently defaults to 100 GB), or (ii) remove the job from the queue altogether.
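Inspecting and clearing a held job uses the standard HTCondor command-line tools; the job ID below is a placeholder, and these commands must be run from the 40m shared account that owns the jobs.

```shell
condor_q -hold                              # list held jobs with their hold reasons
condor_qedit 12345.0 RequestMemory 200000   # raise the request (units are MB); job ID is a placeholder
condor_release 12345.0                      # let the edited job run again
condor_rm 12345.0                           # or drop it from the queue altogether
```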
The checkstatus script decides whether the summary pages are working simply by scanning for Condor jobs in the hold state. Sometimes the above problem arises for a single tab, and the status page will display "Dead" even though most tabs are still being processed.
The pushnodus script sometimes fails to replace the hard-coded paths with relative paths, so the plots are not displayed on the summary pages. It is unclear why this happens, but the problem is usually resolved on the next rsync cycle 15 minutes later.
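The URL adjustment pushnodus performs is essentially a search-and-replace on the generated HTML. The snippet below is a standalone sketch of the idea only; the paths are invented and are not the ones the script actually rewrites.

```shell
# Hypothetical illustration of the kind of rewrite pushnodus performs:
# strip an absolute LDAS path prefix so the link becomes relative.
html='<img src="/home/40m/public_html/summary/day/plot.png">'
echo "$html" | sed 's|/home/40m/public_html/summary/||g'
# -> <img src="day/plot.png">
```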
Running your own jobs
If you would like to test a configuration file, you can run the code manually.
LDAS cluster
The following instructions will work on any CIT headnode. Note that it is not necessary to log in to the 40m shared account for this.
Default environment
You can use the default detchar installation; first, activate Conda from the Open Science Grid via
. /cvmfs/oasis.opensciencegrid.org/ligo/sw/conda/etc/profile.d/conda.sh
and then activate the ligo-summary-3.9 environment by doing:
conda activate /home/detchar/.conda/envs/ligo-summary-3.9
(see note below about using a shortcut in the .bashrc file).
You will also need to obtain Kerberos credentials before proceeding:
kinit albert.einstein
Then, to use a given configuration file to create HTML and plots for today's data, cd into the desired destination directory and run:
gw_summary day --ifo c1 --config-file path/to/configfile
You can add as many --config-file options as desired. Almost always, you will want to point to the file containing the default options:
gw_summary day --ifo c1 --config-file path/to/defaults.ini --config-file path/to/configfile
If you want to use data from a day other than today, just add the date in YYYYMMDD format after day, e.g.:
gw_summary day 20150721 --ifo c1 --config-file path/to/configfile
For testing purposes, it is usually quicker to process a shorter amount of time, say 10 minutes. You can do this by providing specific GPS start and end times:
gw_summary gps 1121558360 1121558960 --ifo c1 --config-file path/to/configfile
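GPS start and end times can be computed from a UTC timestamp with GNU date plus a fixed offset. This is a rough sketch that assumes the current 18-second GPS-UTC leap-second difference (valid since 2017; earlier dates need the offset in effect at that time).

```shell
# Rough UTC -> GPS conversion, assuming the 18 s leap-second offset
# in effect since 2017 (GNU date required for the -d option).
utc='2024-01-01 00:00:00'
unix=$(date -u -d "$utc" +%s)
gps=$(( unix - 315964800 + 18 ))   # 315964800 = Unix time of the GPS epoch (1980-01-06)
echo "$gps"   # -> 1388102418
```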
Finally, use the --verbose option to see a more detailed output.
Shortcut in .bashrc file
To make life easier, you can add the following lines to your account's .bashrc file:
# configure conda
export CONDA_PATH="/cvmfs/oasis.opensciencegrid.org/ligo/sw/conda"
conda_init() { . ${CONDA_PATH}/etc/profile.d/conda.sh; }
# summary page environment convenience
conda_summary() {
. ${CONDA_PATH}/etc/profile.d/conda.sh;
conda activate /home/detchar/.conda/envs/ligo-summary-3.9;
}
Then, activating the LIGO summary pages' Conda environment is as easy as logging in and running
conda_summary
Custom environment
If you wish to work with a local version of the code, you can install the software in your home directory:
1. [OPTIONAL] Clone the GWpy repository and install locally:
git clone https://github.com/gwpy/gwpy.git
cd gwpy
python setup.py install --user
2. Clone the GWsumm repository and install locally:
git clone https://github.com/gwpy/gwsumm.git
cd gwsumm
python setup.py install --user
After doing this, you will want to add ~/.local/bin to your PATH (so the shell finds the custom executables) and the user site-packages directory to your PYTHONPATH (so Python finds the modules). Add these lines to your ~/.bashrc file to avoid having to do this every time:
export PATH=/home/albert.einstein/.local/bin:${PATH}
export PYTHONPATH=/home/albert.einstein/.local/lib/python3.9/site-packages:${PYTHONPATH}
(adjust the Python version in the PYTHONPATH entry to match your environment)
Note: it is highly recommended that you follow standard practice and use Conda environments rather than installing directly into your home directory, since environments are much easier to maintain.
How to Define and Implement States
By incorporating states into the 40m summary pages, we can exclude periods of time where certain channels output irrelevant data, or restrict data visualization only to periods when the interferometer is locked. For example, we can set the MC2 Trans QPD (top right) plot on the IOO tab to show data only for times when C1:IOO-MC_TRANS_SUM is greater than 1e4.
Two files must be changed: the defaults.ini file and the .ini file corresponding to the specific tab in which you are adding states.
Add a section to the relevant .ini file which defines the state. Set a name, key, and requirement for a condition that must be met in order for the state to apply. Any operator found here can be used: https://github.com/gwpy/gwsumm/blob/master/gwsumm/state/core.py#L38.
[state-MC_LOCK]
key = MC_LOCK
name = Locked (MC)
definition = C1:IOO-MC_TRANS_SUM >= 1e4
At the top of the relevant .ini file, edit the header to include a list of included states:
[tab-Eve]
name = Eve
layout = 2
states = MC_LOCK
In defaults.ini, add a list of all states used and replicate the definition:
[states]
MC_LOCK = C1:IOO-MC_TRANS_SUM >= 1e4
If the all state is used, also add this to defaults.ini:
[state-all]
name = All
description = All times
In order to show all data in a plot, regardless of the selected state, add 1-all-states = True to the plot definition.
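Putting these pieces together, a state-restricted tab might look like the sketch below. The second channel and the plot entries are invented for illustration, and the 2-all-states option assumes the N-all-states pattern from the note above.

```ini
; hypothetical sketch combining the state definitions above
[tab-Eve]
name = Eve
layout = 2
states = MC_LOCK
1 = C1:IOO-MC_TRANS_SUM timeseries
2 = C1:EXAMPLE-CHANNEL spectrogram
; show all data in plot 2, regardless of the selected state
2-all-states = True
```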
Sync from LDAS to nodus
In the DetectorChar repo (see above), there is a bash script called pushnodus that syncs the HTML and associated files from LDAS to nodus, while also making the necessary URL adjustments. When run without any options, the script syncs the files corresponding to the current, previous and following UTC days; alternatively, a UTC date or list of dates in the format YYYYMMDD can be passed as arguments to rsync only those particular days, e.g.
./DetectorChar/bin/pushnodus 20160701 20160702
will synchronize the directories corresponding to July 1 and July 2, 2016.
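For reference, the default three-day window (previous, current and following UTC day) can be reproduced with GNU date; this is only a sketch of the date arithmetic, not code taken from the script itself.

```shell
# Generate the YYYYMMDD dates pushnodus syncs by default:
# the previous, current and following UTC days (GNU date).
for offset in '1 day ago' 'now' '1 day'; do
    date -u -d "$offset" +%Y%m%d
done
```

The YYYYMMDD format is convenient here because its numeric ordering matches chronological ordering.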
LDAS crontab
Push HTML and plots to nodus every 15 minutes:
0,15,30,45 * * * * /home/40m/DetectorChar/bin/pushnodus > /dev/null
Redirecting stdout to /dev/null ensures that cron only sends a notification when something goes wrong and output is written to stderr.
MEDM Screens (no longer working)
The MEDM tab is generated as an external tab, meaning that it simply displays an external HTML using the summary page format. The scripts that generate this page are located in:
/opt/rtcds/caltech/c1/scripts/MEDMtab/bin
The functions of the scripts are as follows:
medmCapture.sh brings up the MEDM screen to be captured and takes a screenshot of it using ImageMagick. The script stores the screenshot in two locations: one to be displayed on the main summary-page tab and one to be archived for later lookup.
medmGrab.sh iteratively calls medmCapture.sh with an input file, /opt/rtcds/caltech/c1/scripts/MEDMtab/etc/medmScreens.txt, that specifies which MEDM screens to screenshot.
medmHtml.pl creates the HTML that the summary page links to as an external tab.
medmThumbnail.pl resizes the images captured by medmCapture.sh to create thumbnails for use on the HTML.
summCronjob.sh creates the necessary directory structure and calls all of the above scripts to take the screenshots and make the HTML.
These scripts also write information to /opt/rtcds/caltech/c1/scripts/MEDMtab/log.txt for debugging and to monitor the status of a running script. The crontab on megatron includes the following line:
5,25,45 * * * * /opt/rtcds/caltech/c1/scripts/MEDMtab/bin/summCronjob.sh
On the HTML side, for example https://nodus.ligo.caltech.edu:30889/detcharsummary/day/20160808/medm/, the main page displays the most recent screenshots. If one of the screens on the main page is clicked, the user can navigate between different times of the same screen by using keyboard arrow keys or arrow buttons located in the top right and top left corners of the image. Additionally, there is an archive lookup function for each day where the user can select a particular screen and time from that day to view.
