= Summer 2022 CDS system upgrade =

The plan is to upgrade the 40m CDS system to the latest RTS release. As of this writing, that would be [[https://git.ligo.org/cds/software/advligorts/-/tags/4.2.8|advLigoRTS 4.2.8]].

The basic plan:

 1. Set up a new rack midway down the Y arm (1Y3b).
 1. Move all 6 current front end machines, and FB, to the new rack.
 1. Install a new front end machine with support for AVX512 instructions in the new rack.
 1. Run OneStop fiber between all the front ends and their IO chassis (which all stay in their current locations).
 1. Install Dolphin IX interconnect for all front ends in the new rack (drop all old RFM).
  * IX card in all front ends
  * IX switch in new rack
 1. Upgrade all machines (front ends (via diskless boot) and FB) to Debian 11 (or 10, whichever is currently in production at the sites).
 1. Install all needed software components via CDSSoft Debian 11 (or 10) archive.

Before the upgrade we will test the new software configuration on the 40m test stand.

We will assume that we will support up to 8 front end machines in the new system (7 will be present after the upgrade).

----

What do we need that we don't currently have:

What we need that we think we currently have (need to actually find and store these to avoid double counting):

 1. A rack
 1. Some other FE machines for all of our running models? How many? Can we re-use some existing computers? NEED to TEST that the new RTS works on these or we are in danger of hanging up the whole 40m during this upgrade.
 1. A framebuilder. I know it **SHOULD** work, but that's what we say about everything right before it doesn't work. Need to test this also and possibly buy a new FB if our old one has issues.

|| Items || Quantity || Description || Status || Action ||
|| KVM 8-port Switch || 1 || keyboard and display control switch ||<#FF0000> unknown|| See table below for details||
|| OneStop Card || 17 || PCIe extension from FE to IO chassis ||<#00FF00> received|| ||
|| OneStop Fibre (100 m) || 7 || PCIe extension fibre ||<#00FF00> received|| ||
|| Dolphin IX Card || 8 || FE RFM adapter card ||<#00FF00> received|| ||
|| Dolphin IX Cables || 8 || FE to dolphin switch ||<#00FF00> received|| ||
|| Dolphin 8-port switch || 1 || dolphin switch to connect all FE ||<#00FF00> received|| ||
|| Rack || 1 || Rack to mount all FE machines, dolphin switch, KVM switch ||<#00FF00> received|| ||

----

|| Items || Quantity || Description || Link || Price ||
|| KVM PS2 8-port Switch || 1 || Keyboard and display control switch || [[https://www.iogear.com/product/GCS1808KITTAA/|KVMPS2]] || $429.95||
|| PS/2 KVM Cable 6 Ft || 2 || Converts the VGA to VGA + PS2 for the older computers || [[https://www.iogear.com/product/GCS1808KITTAA/|PS/2 KVM Cable ]] || $19.95||
|| OneStop Fiber Optic Cable || 6 || Connects IO chassis to computers. (THESE ARE DIRECTIONAL) || [[https://au.mouser.com/ProductDetail/Samtec/PCIEO-4G3-100.0-11?qs=PB6%2FjmICvI1BELEXGo4few%3D%3D|4G3 PCIEO Cable]] || Available on request||
|| OneStop Copper Cable || 1 || Connects IO chassis to computers. (THESE ARE DIRECTIONAL) || [[https://onestopsystems.com/collections/pcie-cables/products/pcie-x4-cable| Copper OneStop 7m Cable]] || $372.00||

----

[[https://wiki-40m.ligo.caltech.edu/CDS/c1teststand|C1teststand]]

The FE machines initially spec'd for the teststand are c1sus2 and c1bhd. Currently, c1sus2 has been moved to the main system and c1bhd may be moved over in the near future, so I believe we need at least 2 FE machines to have an operational teststand long term. In the meantime, we need a new front end to test dolphin communication with c1bhd.

|| Items || Quantity || Description || Status || Action ||
|| Front-end (FE) machines || 2+ || Needed to test dolphin communication ||<#00FF00> received|| ||
|| [[https://onestopsystems.com/products/pcie-x4-gen2-host-cable-adapter?variant=39350777970723|OSS-PCIE-HIB25-X4 Gen 2]] Host Cable Adaptor. || 2+ || OneStop companion card on FE ||<#FF0000> unknown|| ||

----

== recipe ==

=== fb1 ===

 * debootstrap /diskless/root
 * FIXME: tftpd-hpa and pxe setup
 * in /diskless/root
  * apt install locales openssh-server man emacs-nox nfs-client
 * FIXME: update initramfs:
  * /etc/initramfs-tools/initramfs.conf
  * /etc/initramfs-tools/modules
  * update-initramfs -u
 * FIXME: make script to setup/mount /diskless/root on fb1
  * mount --bind /dev /diskless/root/dev
  * mount -t proc /proc /diskless/root/proc
  * chroot /diskless/root
 * apt install advligorts-rcg advligorts-fe
 * FIXME: need to install rtcds kernel for deb 10 for now, until rcg 5.0 dkms fixed for deb 10
 * fix permissions on /opt/rtcds:
  * advligorts on NFS host for target
  * recursive group advligorts
  * setgid for all directories
  * umask for group write permission

## bugs:

 * bug finding generate_KisselButton.py:

```
Unable to find the following file in CDS_MEDM_PATH: SUS_SINGLE.adl
ERROR: Could not find file: generate_KisselButton.py
 Searched path: /opt/rtcds/caltech/c1/post_build
Exiting
make: *** [Makefile:166: install-c1sus] Error 1
```

----

== chroot installation issues ==

We ran into some problems installing dolphin via chroot, which were fixed by:

 * sudo mount --bind /dev /diskless/root/dev
 * sudo mount -t proc /proc /diskless/root/proc
 * sudo chroot /diskless/root

== uname -r dolphin driver install issue ==

We also noticed that the different Linux kernel versions on the boot server and the front ends were causing issues, because a 'uname -r' call inside the chroot returns the boot server's kernel instead of the FE kernel. We fixed that by installing the same FE rtcds kernel on the boot server as well:

 * sudo apt install linux-image-4.19.0-6-rtcds-amd64-unsigned
 * sudo apt remove linux-image-4.19.0-21-amd64
 * sudo reboot
 * sudo apt install ligo-dolphin-ix-node

== boot server ==

These are the commands we ran to set up dolphin on the boot server.

 * sudo apt install ligo-dolphin-networkmanager
 * sudo cp /diskless/root/etc/apt/sources.list.d/restricted.list /etc/apt/sources.list.d/
 * sudo apt update
 * sudo apt install ligo-dolphin-networkmanager
 * sudo /opt/DIS/sbin/dis_mkconf -fabrics 1 -sclw 8 -stt 1 -nodes c1sus c1bhd -nosessions

----

 * #cd /diskless/root/etc
 * #rm -rf dis
 * #ln -s /var/log dis

We decided the symlink idea was not a good one, so we edited /diskless/root/etc/fstab to mount a writeable /etc/dis instead.

== Model edits for dolphin test ==

[[https://dcc.ligo.org/DocDB/0001/T080135/011/LIGO-T080135-v11.pdf | RCG guide]]

 * Edited IOP models and user models for c1bhd and c1sus. We now have them both in the c1bhd folder.
 * Included `dolphin_time_xmit=1` in the c1x06 CDS block parameter on the sender to send timing over dolphin, and `dolphin_time_rcvr=1` on the receiving IOP model, i.e. c1x02, etc.
 * This prevented the IOP model from starting because it needed the advligorts-dolphin-daemon package, which depends on ligo-dolphin-srcdis, so
  * apt install ligo-dolphin-srcdis
  * apt install advligorts-dolphin-daemon
  * apt install advligorts-dolphin-proxy-km-dkms
 * No I/O chassis means the ADC cards in the c1x02 model should give a build/start error, so we used virtualIOP=2 in c1x02, the c1sus IOP, to allow an error-free build.

== Additional configuration for Gen2 dolphin ==

 * Edit /opt/DIS/lib/modules/dis_ix.conf to increase the broadcast group size to 16MB
  * i.e. ntb_mcast_group_size=24;
 * [IS THIS NECESSARY?] Changing ‘pciRfm=1’ to ‘pciRfm=2’.
   Rather than doing this for all IOP models, create a USE_DOLPHIN_GEN2 file as follows:
  * chroot /diskless/root
  * cd /usr/share/advligorts/src/src/include
  * touch USE_DOLPHIN_GEN2

== (possible bug in packaging) System services needed ==

To get dolphin communication working, we had to do the following:

 * sudo modprobe dolphin-proxy-km
 * sudo systemctl start rts-dolphin_daemon

FIX

 * sudo chroot /diskless/root
 * add the text `dolphin-proxy-km` to the file `/etc/modules` [Doesn't seem to work. rts-dolphin_daemon depends on this module, so it does not show a green light until the module is loaded. Not sure why loading at boot is a problem.]
 * edit `/lib/systemd/system/rts-dolphin_daemon.service` by adding:
   `[Install]`
   `WantedBy=multi-user.target`
 * `systemctl enable rts-dolphin_daemon`

== DAQ setup ==

On FB1

 * apt-get install isc-dhcp-server
 * edit `/etc/network/interfaces` to assign a static IP address
 * `sudo ifup enp2s0` to bring up the interface
 * edit /etc/dhcp/dhcpd.conf
 * `sudo apt install advligorts-common advligorts-edc advligorts-rcg -t buster-unstable`
 * `sudo apt install advligorts-gpstime-dkms advligorts-mbuf-dkms -t buster-unstable`
 * `sudo apt install advligorts-local-dc advligorts-transport-common advligorts-transport-pubsub -t buster-unstable`
 * `sudo apt install ldas-tools-framecpp advligorts-daqd -t buster-unstable`
 * `sudo systemctl enable rts-transport@cps_recv`
 * `sudo systemctl start rts-transport@cps_recv`
 * `sudo systemctl enable rts-daqd`
 * `sudo systemctl start rts-daqd`
 * `sudo ln -s /opt/rtcds/caltech/c1/target/gds/param/testpoint.par /etc/advligorts/testpoint.par`
 * `sudo ln -s /opt/rtcds/caltech/c1/target/daqd/master /etc/advligorts/master` or copy contents for models of interest, i.e. c1bhd and c1sus, into `/etc/advligorts/testpoint.par`
 * LOOK AT `/etc/advligorts/daqdrc` for how to enable frame writing

On FE

 * add environment file `/etc/advligorts/systemd_env_${hostname}`, i.e.
   `cat /etc/advligorts/systemd_env_c1bhd`
   `local_dc_args='-w 0 -s "c1x06 c1bhd" -b local_dc -m 100 -d /opt/rtcds/caltech/c1/target/gds/param'`
   `cps_xmit_args="-b local_dc -m 100 -p 'tcp://${10.0.113.2}:9000' -D 1"`
 * `apt-get install advligorts-pubsub`
 * `systemctl enable rts-transport@cps_xmit`
 * `systemctl enable rts-local_dc`
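
A minimal sketch of how the FE environment file above could be generated for another front end. `render_fe_env` is a hypothetical helper (not part of advligorts), and the default port and parameter directory are simply copied from the c1bhd example. It writes the DAQ address literally as `tcp://10.0.113.2:9000`, which is an assumption about what the `${10.0.113.2}` form in the example is meant to resolve to. It only prints to stdout, so nothing is changed until the output is redirected.

```shell
#!/bin/sh
# Sketch only: print the two lines of /etc/advligorts/systemd_env_<hostname>.
# render_fe_env is a hypothetical helper name, not an advligorts tool.
#   $1 = space-separated model list, IOP model first (e.g. "c1x06 c1bhd")
#   $2 = address of the DAQ receiver (assumed to be fb1's DAQ interface)
#   $3 = port, defaults to 9000 as in the c1bhd example
render_fe_env() {
    models=$1
    daq_addr=$2
    daq_port=${3:-9000}
    # Parameter directory copied from the c1bhd example above.
    param_dir=/opt/rtcds/caltech/c1/target/gds/param
    printf "local_dc_args='-w 0 -s \"%s\" -b local_dc -m 100 -d %s'\n" "$models" "$param_dir"
    # Assumption: the address is written literally, without the ${...} wrapper.
    printf "cps_xmit_args=\"-b local_dc -m 100 -p 'tcp://%s:%s' -D 1\"\n" "$daq_addr" "$daq_port"
}

# Example: reproduce the c1bhd file contents on stdout.
render_fe_env "c1x06 c1bhd" 10.0.113.2
```

Once the output looks right, it can be redirected into the diskless root, e.g. `render_fe_env "c1x06 c1bhd" 10.0.113.2 > /diskless/root/etc/advligorts/systemd_env_c1bhd` (run as root on fb1; this path assumes the chroot layout from the recipe section).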