Computer Restart Procedures
Here is where we should keep information on how to restart the computers that periodically need restarting.
[#List List of all lab computers]
[#links Useful links]
[#c1sus c1sus]
[#c1ioo c1ioo]
[#c1omc c1omc]
[#c1ass c1ass]
[#nodus nodus]
[#fb fb] (Includes DAQ)
[#c1psl c1psl]
[#c1iool0 c1iool0]
[#c1pem1 c1pem1]
[#op440m op440m]
[#op340m op340m]
Out of date: Ethernet network connection diagram as of Oct 7, 2008: attachment:40m_network_10-07-08.pdf
[#Electronics Here] you can find a map of the computers around the lab.
Useful links
Which models run on which machines? Answer: see the wiki page "Electronics/Existing RCG DCUID and gds ids".
fb and DAQ issues
To restart the frame builder process, simply do the following from a control room machine:
- 1. telnet fb 8088
- 2. shutdown
The init process running on the fb machine will then automatically restart daqd.
Generally after restarting the frame builder process, the front ends will not be talking to the fb properly (0x2bad and red lights). The easiest solution is to reboot the front ends.
For dataviewer to get data you need to make sure "daqd" and "nds pipe" are running on the fb machine.
daqd and nds have been added to the /etc/inittab file on the fb machine. These will automatically restart when killed or the machine is restarted.
However, if either process fails to start several times in rapid succession, the init process will stop trying.
- 1. Fix the underlying problem. Try looking at /opt/rtcds/caltech/c1/target/fb/logs/daqd.log.XXXX for error messages.
- 2. ssh fb
- 3. Run "sudo /sbin/init q" to make the init process re-examine /etc/inittab and start the processes again, or restart the fb machine with "sudo shutdown -r now".
The code which is called by the init process lives in /opt/rtcds/caltech/c1/target/fb/.
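A minimal sketch for finding the newest daqd log to read, assuming the logs live in the directory named above and follow the daqd.log.XXXX naming. The function name latest_daqd_log is made up for this example and is not part of the fb installation.

```shell
# Print the path of the most recently modified daqd log file.
# $1 = log directory (optional; defaults to the path quoted in this section)
latest_daqd_log() {
    # ls -t sorts newest first; head keeps only the first (newest) entry
    ls -t "${1:-/opt/rtcds/caltech/c1/target/fb/logs}"/daqd.log.* 2>/dev/null | head -n 1
}
```

On the fb machine, something like tail -n 50 "$(latest_daqd_log)" should then show the end of the newest log.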
For testpoints to be available for a given front end, the following must be running on the correct front end computer:
- The IOP must be running. It is named something like c1x00, c1x01, etc.
- Run "sudo /opt/rtcds/caltech/c1/scripts/startc1SYSNAME", where SYSNAME is something like sus or ioo or x02.
- The above starts the IOCs, awgtpman, and loads the front end module.
- mx_streams must be running (use "sudo /etc/restart_streams"). This should start an mx_stream for each front end system, which is needed to talk to the fb.
To confirm the necessary codes are running on a front end:
- To check if the front end modules are loaded, run "lsmod" on the front end machine and look for c1SYSNAMEfe entries.
- To check if the IOCs are running, run "ps -ef | grep epicsC1.cmd". There should be 1 per model.
- To check if mx_streams are running, run "ps -ef | grep mx_stream". There should be 1 per model.
- To check if awgtpman is running, run "ps -ef | grep awgtpman". There should be 1 per model.
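The "1 per model" checks above can be turned into a count, which is easier to compare against the number of models on the machine. This is only a sketch: count_procs is a hypothetical helper, and it assumes the ps output is piped in on stdin. It drops any line containing "grep", so the grep command itself is not counted.

```shell
# Count processes matching a pattern in `ps -ef` output.
# $1 = pattern to look for; stdin = output of `ps -ef`
count_procs() {
    # First grep selects matching lines; second grep counts those that are
    # not a grep command. `|| true` keeps the exit status 0 when the count is 0.
    grep -- "$1" | grep -v -c 'grep' || true
}
```

For example, ps -ef | count_procs mx_stream prints the number of mx_stream processes, and ps -ef | count_procs awgtpman does the same for the testpoint managers.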
Cold start order is:
- 0. If necessary, stop everything (fb, front ends, mx_streams, etc)
- 1. Start fb codes (daqd, nds, dhcpd)
- 2. Start the front ends.
- 3. Start IOP on front ends (startc1x##)
- 4. Start FE models (startc1SYSNAME)
- 5. Start mx_streams (/etc/restart_streams)
- If the DAQ is in a bad state, try to start fresh in this order. It usually seems to work.
- For an unknown reason the frame builder is flaky, and I generally find I have to restart it 2 or 3 times before it stops dying within the first 60 seconds or so.
- If you get past that first minute or so, it tends to be stable from then on.
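Steps 3 to 5 of the cold start order can be written out per front end as a dry run. The sketch below only prints the commands in order and does not run them. print_start_sequence is a hypothetical helper, and the argument convention (IOP name first, then model names, both without the c1 prefix) is an assumption made for this example.

```shell
# Print, in cold-start order, the commands for one front end machine.
# $1 = IOP name without the c1 prefix (e.g. x02)
# remaining arguments = model names without the c1 prefix (e.g. sus mcs rms)
print_start_sequence() {
    iop="$1"
    shift
    # Step 3: the IOP goes first
    echo "sudo /opt/rtcds/caltech/c1/scripts/startc1${iop}"
    # Step 4: then each FE model
    for model in "$@"; do
        echo "sudo /opt/rtcds/caltech/c1/scripts/startc1${model}"
    done
    # Step 5: finally the mx_streams
    echo "sudo /etc/restart_streams"
}
```

For c1sus this would be called as print_start_sequence x02 sus mcs rms, which lists the IOP script, the three model scripts, and the mx_streams restart.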
Restarting the Nightly Backup of Frames
- The following instructions are valid for the old fb40m Frame Builder. They're here for reference, until the backup scripts are fixed, at which time new instructions will be posted. ---JD 18Oct2010
Restart the nightly BACKUP of /cvs/cds and our trend-frames by following the instructions in "Restarting the backup script".
Summary of how to restart the backup script. The steps are as follows (copy everything after each of the numbered steps verbatim):
- 1. ssh fb40m
- 2. cd /cvs/cds/caltech/scripts/backup
- 3. ssh-agent > .agent
- 4. awk '/setenv/' .agent > .agent.edit
- 5. mv .agent.edit .agent
- 6. source .agent
- 7. ssh-add ~/.ssh/id_rsa (This one will not ask for a passphrase.)
- 8. ssh-add ~/.ssh/backup2PB (This one requires a passphrase. Read the README: ..../scripts/backup/000README.txt)
- 9. ssh-add -l (This verifies that both the id_rsa and backup2PB are there. If it also picks up the wrong one (id_dsa), remove it by typing "ssh-add -d".)
- 10. ssh 40m@ldas-cit.ligo.caltech.edu /bin/ls /archive/frames/trend/minute-trend/40m (This should do a test ssh, and list the archived frame folders. You can open the last one, and then look at the gps time of the last .gwf file, and it should be sometime in the middle of the previous night.)
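The awk step keeps only the setenv lines of the ssh-agent output, so that .agent can be sourced without running the trailing echo. The sketch below shows that filter on its own. It assumes ssh-agent is emitting csh-style output (as it does under a csh/tcsh login shell or with ssh-agent -c), and filter_agent_env is a name made up for this example.

```shell
# Keep only the csh-style "setenv ..." lines from ssh-agent output.
# stdin = output of ssh-agent; stdout = the lines that are safe to source
filter_agent_env() {
    # Same pattern as in the awk step: print lines containing "setenv"
    awk '/setenv/'
}
```

Fed the usual three lines of agent output, this passes through the two setenv lines (SSH_AUTH_SOCK and SSH_AGENT_PID) and drops the "echo Agent pid ...;" line.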
c1ioo
This machine runs the c1x03, c1ioo, and c1gpt FE models. It controls mode cleaner wavefront sensors, mode cleaner length, and green locking.
On reboot, these models should automatically start up. See also the [#fb fb/DAQ] section.
c1ioo is a Sun X4600 machine. For a complete shutdown (not normally necessary, but sometimes needed), do the following:
Shut down the computer normally (power button or "shutdown -h now").
Go out to the rack and unplug all 4 power supply cables on the back of the machine.
Wait for a bit for the machine to completely stop (30 seconds or so).
Plug all the cables back in, and press the power button.
c1sus
This machine runs the c1x02, c1sus, c1mcs, and c1rms FE models. It controls the BS, ITMX, ITMY, PRM, SRM, MC1, MC2, and MC3 optics.
On reboot, these models should automatically start up. See also the [#fb fb/DAQ] section.
c1psl
Sometimes you can restart this machine by doing:
telnet c1psl
reboot
Then do a BURT restore for it.
But often, this just makes it upset and the screens go white but it never comes back. When that happens go out to the rack (the one next to the one with the MC servo) and turn off the crate (on the bottom) which has the c1psl processor. After ~3.14 seconds, turn it back on. c1psl ought to come back now.
If it still doesn't come back then sing [http://www.amazon.com/gp/music/clipserve/B000002W9Q001005/1/ref=mu_sam_ra001_005/002-7727484-0862420 this] link.
c1pem1
Sometimes you can restart this machine by doing:
telnet c1pem1
reboot
Then do a BURT restore for it.
If it still doesn't come back then sing [http://www.amazon.com/gp/music/clipserve/B000002W9Q001005/1/ref=mu_sam_ra001_005/002-7727484-0862420 this] link.
c1iool0
At the command prompt, type:
telnet c1iool0
Try CTRL+x. It should reboot c1iool0.
This computer automatically executes startup.cmd, so there is no need to run it manually.
If for some reason it does not load the startup script automatically, try this:
At the telnet prompt, type
< /cvs/cds/caltech/target/c1iool0/startup.cmd
Then, after the main loop is started, type CTRL-], followed by
quit
c1omc
-1) Make sure the c1omc is powered on--it doesn't power up automatically following a power outage. First find the OMC, then press its power button.
0) Make sure the previous incarnation of the code is no longer running. See Appendix A for details.
1) While logged in as controls, run the script startupC1 in the c1omcepics target directory.
2) Log in as root. Start the real-time code by running the omcfe.rtl script in the c1omc target directory.
2.5) Now the process will wait for a BURT restore. Find the appropriate autoburt snapshot file, and restore it.
3) Also, as root, run the command /opt/gds/awgtpman -2 in the background.
Note that c1omc has two ethernet ports. Use the bottom one.
If nothing works, check the mount tables and make sure that linux1:/home/cds is mounted as /cvs/cds. If it's not, sudo mount -a.
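The mount check can be scripted as below. This is a sketch that assumes a Linux-style mount table such as /proc/mounts, with the device in the first column and the mount point in the second. is_cds_mounted is a hypothetical helper name.

```shell
# Succeed (exit 0) if linux1:/home/cds is mounted as /cvs/cds.
# $1 = mount table to read (optional; defaults to /proc/mounts)
is_cds_mounted() {
    # Column 1 is the device, column 2 the mount point
    awk '$1 == "linux1:/home/cds" && $2 == "/cvs/cds" { found = 1 }
         END { exit !found }' "${1:-/proc/mounts}"
}
```

A possible use is is_cds_mounted || sudo mount -a, which only remounts when the entry is missing.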
A) To stop the front end code, first press the red FE RESET button on the C1OMC_GDS screen. Then,
- i) log in to c1omc. become root.
- ii) kill epics with a 'pkill omcepics'
- iii) kill the test-point manager with a 'pkill awgtpman'
- iv) remove the front end kernel module with '/sbin/rmmod omcfe'
- v) check that the [omcfe] kernel module is gone with a '/sbin/lsmod'
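Step v can be checked without reading the lsmod listing by eye. The sketch below assumes the lsmod output is piped in on stdin, with the module name in the first column. module_loaded is a name invented for this example.

```shell
# Succeed (exit 0) if the named kernel module appears in lsmod output.
# $1 = module name (e.g. omcfe); stdin = output of /sbin/lsmod
module_loaded() {
    # Match the module name exactly against the first column
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}
```

After the rmmod in step iv, /sbin/lsmod | module_loaded omcfe should fail (non-zero exit status), which means the module is gone.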
c1ass
- Currently the procedure for restarting C1ASS seems to be the same as for C1OMC above except that the ass test point manager doesn't need the "-2" flag.
op440m
Reboot as usual. If it's acting weird or slow, just hit the moon button and pick the shutdown option. After a few minutes it will turn off. Then hit the on button on the front of the machine. Wait for the login prompt, then log in as controls.
op340m
Reboot as usual. It's headless, so you'll need to ssh in and type 'reboot'.
Restart the following scripts:
- conlog
nodus
Nodus is a Solaris box in the rack in the office. Here are some of the things that it runs that you will want to restart:
- EPICS gateway
- ndsproxy
- Apache (Required for SVN remote access)
- elog
