Using Seastar

A manual for using HPC-systems, especially targeted at the Beowulf cluster “Seastar” of the CMT research group of the University of Antwerp
Nikolas Garofil

May 18, 2015

On a sunny afternoon at the UA,
I was threatened: “Explain Seastar !!! Okay ?!?”
So I started writing documentation,
with everything but its location,
’cause that is TOP-secret, so I’ll never say !

1 Disclaimer

This manual is distributed in the hope that it will be useful, but without any warranty, without even the implied warranty of not setting the printer on fire by printing it out, destroying your computer by reading it with a pdf-reader, ...

This is still an alpha version of the manual, so it will most likely contain a lot of grammatical and spelling errors. There might also be errors in the code fragments which could cause serious damage.

!!! Do not run anything that you do not fully understand !!!

2 Some details about Seastar

It’s probably best to skip this section if you’ve never used Seastar before; it is more of a technical reference.

Seastar is a Beowulf cluster with the following specifications:

3 Getting started

3.1 Account-creation

In order to get started you’ll need an account, so contact me and I’ll create it for you. Most of the account names on Seastar start with the first character of the first name of the user, followed by the last name (e.g. someone named Kurt Cobain would receive the username “kcobain”).
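The naming convention can be written down as a tiny shell function. This is only an illustrative sketch: `make_username` is a hypothetical helper, not something installed on Seastar, and since only "most" account names follow the rule, the real username may differ.

```shell
#!/bin/bash
# Hypothetical helper (not part of Seastar): guess the username that
# belongs to a first and last name, following the convention above.
# Assumption: plain ASCII names without spaces.
make_username() {
    local first="$1" last="$2"
    # first character of the first name + the whole last name, lowercased
    printf '%s%s\n' "${first:0:1}" "$last" | tr '[:upper:]' '[:lower:]'
}

make_username Kurt Cobain
```

For Kurt Cobain this prints kcobain, the username used in the rest of this manual.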

I’ll also ask you to choose a password. I might forget to tell you, but make sure that this is a strong password. You can do this by following the usual rules that most sites demand (9 chars or longer, mix in special chars, don’t base it on recognizable words, ...), but if you don’t mind typing in long passwords then I would recommend XKCD-style passwords. Whatever password you choose, make sure you don’t use it anywhere else.

3.2 Logging in

In this document I’m assuming that you are using a POSIX-compliant operating system. To be honest I have to admit that it’s also possible to connect to the cluster from other operating systems, so why am I doing this?

I also have to assume that you already have some basic experience with GNU/Linux. Otherwise I would have to explain every tiniest step and this document would become unreadably long.

So now your account is created. You should now be able to log in if you start an SSH client and connect to the server seastar-64.cmi.uantwerpen.be with your username and password, using the TCP port that’s registered for ssh. For our example user “Kurt Cobain” this would work like this:

  1. Start a shell: You can do this by looking in the menu for a program named “terminal”, “console” or something very similar. Starting this should open a (probably black) window with a prompt and the possibility to enter commands behind this prompt. The prompt differs a bit from system to system but it will probably look similar to this:
    kurt@nirvana ~ $ 
    Here kurt is your username on your PC and nirvana is the hostname of your PC.
  2. Log in on Seastar: This is really easy: use the command below, enter your password when requested and you will be connected. (The password won’t be visible on the screen; this is normal behaviour.)
    kurt@nirvana ~ $ ssh kcobain@seastar-64

    The screen will now show some uninteresting information about the cluster and on the last line you will have a prompt that looks like this:
    0 kcobain@seastar-64 .../ $ 
    In case you are wondering what the number at the front is: it’s the return code of the last command you executed. When you want to log out, just type the command exit and you’ll be disconnected.

  3. Submit a script: This step will be handled in detail later, but for now I’ll tell you the basic steps:
  4. Transferring files: There are multiple ways to transfer files between your PC and the cluster that use a GUI, but the 2 most popular methods use the shell:

    4 Writing and submitting scripts

    4.1 What qsub does

    In section 3 I glanced over how you should submit scripts. I’ll now look at how you should write scripts and give some more info about submitting them. The most important thing that you should know is: qsub submits the scripts and transforms them into jobs. All the rest are just details.

    I even have some good news: If you don’t want to do anything “fancy” and just want to run a simple program, you don’t even need a script. You can just pipe the commands to qsub:

     
    0 kcobain@seastar-64 .../ $ echo "sing 'Smells Like Teen Spirit'" | qsub
     
    314160.beosrv-c  

    What’s now happening is the following:

    Most likely you won’t like the default options for the jobs so let’s see how we can change them. I’ll show you how to change one option now and I’ll give you a list with the other possible options later:

     
    0 kcobain@seastar-64 .../ $ echo "sing 'Come as you are'" | qsub -N singasong
     
    314161.beosrv-c  

    This is pretty much the same job as the previous one, but by using the -N option we gave it the name singasong, and the 2 files that will be created after the job is finished will be singasong.o314161 and singasong.e314161.
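The link between the job name, the job ID and the two result files can be sketched as a small function. `job_output_files` is a hypothetical helper written for this manual; it only relies on the pattern visible in the example (name, then .o or .e, then the numeric part of the job ID), where the .o file holds the STDOUT of the job and the .e file its STDERR.

```shell
#!/bin/bash
# Hypothetical helper: predict the names of the two files a job leaves
# behind, given the job name and the job ID that qsub printed.
job_output_files() {
    local name="$1" jobid="$2"
    local number="${jobid%%.*}"   # 314161.beosrv-c -> 314161
    echo "${name}.o${number}"     # STDOUT of the job
    echo "${name}.e${number}"     # STDERR of the job
}

job_output_files singasong 314161.beosrv-c
```

This prints singasong.o314161 and singasong.e314161, one per line.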

    4.2 Writing scripts

    The previous way of creating jobs works, but usually it’s better to write a script. When you write a script you don’t have to pipe anything to qsub, instead you just run qsub (with the arguments you want) and behind the last argument you place the path to the script:

     
    0 kcobain@seastar-64 .../ $ qsub -N singasong teenspirit.sh
     
    314162.beosrv-c  

    As you can see, the script name ends with .sh; the reason for this is that the script is a shell script. You don’t have to end your scripts with .sh, but it’s a convention that most people follow, so I would recommend doing the same thing if you don’t want to complicate things. Although the submit scripts don’t have to be shell scripts (you can also write submit scripts in perl, python, ...), I would recommend using regular shell scripts written in (ba)sh. Other scripts can cause strange problems.

    Covering shell scripts completely would lead us too far. For now I’ll have to be very brief about it:

    For example, this could be the submit script of the last job (teenspirit.sh):

    #!/bin/bash 
    echo Here we are now, entertain us...

    The following is an example of a submit script that contains some interesting features of bash:

    #!/bin/bash
    #Lines starting with # are comments and are ignored by bash
    #Notice that ` instead of ' is used on the lines with seq and mktemp
    #The difference between these chars is important !
    #The following command is the first that runs,
    #bash looks for it in the current directory
    ./runsfirst
    #The following command is searched for in the directories
    #that are listed in the PATH-variable of bash
    runsnext
    #The following command starts running in parallel with the
    #rest of the script right after runsnext is finished
    #Although the node sees it as a separate process, the cluster
    #still sees it as 1 job together with the rest of the script
    ./runparallel &
    #the following runs runalot 10 times
    for i in `seq 1 10` ; do ./runalot ; done
    #the following creates a temporary file and saves its
    #path in the variable MYTEMPFILE
    MYTEMPFILE=`mktemp`
    #the output of the next command is sent to the temporary file
    ./sendtofile > "$MYTEMPFILE"
    #the next command receives its input from the file
    ./receivefromfile < "$MYTEMPFILE"
    #the following removes the tempfile
    rm "$MYTEMPFILE"
    #the following line checks if the tempfile is really deleted
    if [ ! -e "$MYTEMPFILE" ] ; then echo temporary file is removed ; fi
    #the following command is the last that runs.
    #it's not located in the current directory
    /l/home/kcobain/somewhere/runslast
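The programs in the script above (runsfirst, runalot, ...) do not exist, so it cannot be tried out as it stands. The following sketch shows the same bash features with standard commands only, so you can run it on your own PC. The function name `demo` and the chosen commands are just for illustration. It uses $(...), which does the same as the backquotes but is easier to read and to nest, and it ends with wait so that the script does not finish before its background command does.

```shell
#!/bin/bash
# Runnable variant of the feature tour: every made-up program is
# replaced by a standard command (an assumption made for illustration).
demo() {
    # runs in the background, in parallel with the rest of the function
    sleep 1 &
    # $(...) is command substitution, the same thing as `...`
    for i in $(seq 1 3) ; do echo "run $i" ; done
    # create a temporary file and save its path in MYTEMPFILE
    local MYTEMPFILE
    MYTEMPFILE=$(mktemp)
    # output redirection: the text ends up in the temporary file
    echo "hello from the job" > "$MYTEMPFILE"
    # input redirection: tr reads the temporary file as its input
    tr '[:lower:]' '[:upper:]' < "$MYTEMPFILE"
    rm "$MYTEMPFILE"
    if [ ! -e "$MYTEMPFILE" ] ; then echo "temporary file is removed" ; fi
    # block until the background command (sleep) has finished
    wait
}

demo
```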

    Arguments that you pass to qsub can also be placed in the submit script by adding lines in this format right behind #!/bin/bash:

    #PBS -argument1 optionsforargument1 
    #PBS -argument2 optionsforargument2

    So if we change teenspirit.sh like this:

    #!/bin/bash 
    #PBS -N singasong 
    echo Here we are now, entertain us...

    then we can submit it like this and it would still have the name “singasong”:

     
    0 kcobain@seastar-64 .../ $ qsub teenspirit.sh
     
    314163.beosrv-c  
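Because the #PBS lines are ordinary comments to bash, it is easy to forget which options a script carries. A possible way to check this before submitting is sketched below. `pbs_directives` is a hypothetical helper, and the copy of teenspirit.sh is recreated in a temporary file only to keep the example self-contained.

```shell
#!/bin/bash
# Hypothetical helper: show which qsub options are embedded in a
# submit script by printing its #PBS lines without the "#PBS" prefix.
pbs_directives() {
    sed -n 's/^#PBS[[:space:]]*//p' "$1"
}

# recreate teenspirit.sh in a temporary file and inspect it
script=$(mktemp)
cat > "$script" <<'EOF'
#!/bin/bash
#PBS -N singasong
echo Here we are now, entertain us...
EOF
pbs_directives "$script"
rm "$script"
```

For this script the output is the single line -N singasong.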

    4.3 PBS-options

    I’m only mentioning the options here that I think are actually useful for you. For the full list, see man qsub.




    Option and arguments
        Explanation

    -a [[[[CC]YY]MM]DD]hhmm[.SS]
        Do not launch the job before this time. (e.g. -a 2150 will make sure the job is not launched before 21:50 today (or tomorrow if it’s already after 21:50), and with -a 11250345.07 it won’t start before 03:45:07 on the 25th of November.)

    -c enabled
        Enables checkpointing of the job. The job will run a tiny bit slower (nanoseconds, not noticeable), but it has the advantage that the job can be saved first if for some reason the cluster needs to be rebooted. It’s always a good idea to use this option.

    -I
        Makes the job “interactive”: when it starts running, the 3 standard streams will be connected to the shell where you ran qsub. This makes it look as if the job is running in that shell.

    -j oe or eo
        STDOUT and STDERR of the job will be joined together: with oe they both end up in STDOUT, with eo they land in STDERR.

    -l resourcelist
        A comma-separated list of resources and their values that the job will need and that differ from the default values. (e.g. -l walltime=50,nodes=3 to request that the job can run on 3 nodes for 50 seconds.) See also the resources-table in section 4.3.1.

    -m abe
        You will receive a mail when the job is aborted (a), when it begins (b) or when it ends (e). (You’re not obliged to use all 3 of these chars.)

    -M kurt.cobain@nirvana.com
        Mail-address used by -m. If you want multiple people to receive the mail, separate the addresses with commas.

    -N name
        Gives the job a name (max. 15 chars long and the first char should be alphabetic).

    -q node, queue or node@queue
        Tells the cluster where the job should be executed. The node@queue option is handy because a node can be part of multiple queues, and in that case it has multiple versions of default resources.

    -t n
        The job will be submitted n times.

    -V
        The job will receive all environment variables available in the shell where you ran qsub (seastar-64’s bash). If you want to know what these vars are, run set.

    -v var1=value1[,var2=value2...]
        A list of environment variables and their values that the job will need.

    -w /some/where
        Set the default working directory to /some/where.

    -W attributelist
        A comma-separated list of attributes and their values that the job will need. (e.g. -W depend=afterok:314164.beosrv-c to request that this job gets scheduled when 314164.beosrv-c finishes without any errors.) Be careful if you start combining multiple attributes, you can get some strange effects! See also the attributes-table in section 4.3.2.

    -X
        Enables X-forwarding.

    -z
        The job id will not be sent to STDOUT when you submit the job.
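The time format of -a is the easiest one to get wrong by hand. One way to produce it is to let date do the formatting, as in the sketch below. `qsub_start_time` is a hypothetical helper, and it assumes GNU date (the -d option), which is what you normally get on GNU/Linux.

```shell
#!/bin/bash
# Hypothetical helper: convert a date/time that GNU date understands
# into the MMDDhhmm.SS form that qsub -a accepts.
# Assumption: GNU date (date -d) is available.
qsub_start_time() {
    date -d "$1" +%m%d%H%M.%S
}

qsub_start_time "2015-11-25 03:45:07"

# Possible use, not executed here:
#   qsub -a "$(qsub_start_time 'tomorrow 21:50')" teenspirit.sh
```

The call above prints 11250345.07, the same value as in the -a example of the table.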




    4.3.1 Resources

    There are a lot of different resources that you can request; for a full list see http://docs.adaptivecomputing.com/torque/4-1-3/help.htm#topics/2-jobs/requestingRes.htm. If you are overwhelmed by the options, then just using the defaults instead of choosing options yourself is mostly a good choice. This is a list with the most popular options:




    Option and arguments
        Explanation

    cput [[HH:]MM:]SS
        Maximum amount of CPU-time used by all processes in the job

    walltime [[HH:]MM:]SS
        Maximum amount of real time during which the job can be in the running state

    nodes options
        Reserve a list of nodes, see the examples on the website mentioned above

    mem size
        Maximum amount of memory the job can use

    pmem size
        Maximum amount of memory any single process can use

    pvmem size
        Maximum amount of virtual memory any single process can use

    vmem size
        Maximum amount of virtual memory
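Time limits for cput and walltime can be given in plain seconds, but the HH:MM:SS form is easier to read for longer jobs. A minimal sketch of the conversion follows. `to_walltime` is a hypothetical helper and expects a whole number of seconds.

```shell
#!/bin/bash
# Hypothetical helper: write a number of seconds as HH:MM:SS, the
# longest form of the [[HH:]MM:]SS format used by cput and walltime.
to_walltime() {
    local total="$1"
    printf '%02d:%02d:%02d\n' \
        $((total / 3600)) $((total % 3600 / 60)) $((total % 60))
}

to_walltime 50      # the 50 seconds used in the -l example
to_walltime 5400    # one and a half hours

# Possible use, not executed here:
#   qsub -l walltime="$(to_walltime 5400)" teenspirit.sh
```

The two calls print 00:00:50 and 01:30:00.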




    4.3.2 Attributes

    Attributes are useful if you want to arrange the order of your own jobs. To see a full list, run man qsub and scroll to the -W additional_attributes option. The most important attributes are the following:




    Option and arguments
        Explanation

    depend=afterok:jobid
        This job can only be scheduled after the job jobid has terminated without errors

    depend=beforeok:jobid
        If this job has terminated without errors, then job jobid can begin

    depend=afterany:jobid
        The same thing as afterok but you can ignore errors

    depend=beforeany:jobid
        The same thing as beforeok but you can ignore errors

    depend=afternotok:jobid
        The same thing as afterok but replace “without” by “with”

    depend=beforenotok:jobid
        The same thing as beforeok but replace “without” by “with”
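Since qsub prints the job id on STDOUT, a dependency can be built from the output of an earlier qsub. The sketch below only constructs the -W argument. `depend_arg` is a hypothetical helper, and the script names step1.sh and step2.sh in the comment are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: build the -W argument for a job dependency.
# usage: depend_arg <type> <jobid>, with type being afterok, afterany, ...
depend_arg() {
    local type="$1" jobid="$2"
    printf '%s\n' "-W depend=${type}:${jobid}"
}

depend_arg afterok 314164.beosrv-c

# Possible use, not executed here (step1.sh and step2.sh are made up):
#   first=$(qsub step1.sh)
#   qsub $(depend_arg afterok "$first") step2.sh
```

The call above prints -W depend=afterok:314164.beosrv-c, which is the example from the -W option in section 4.3.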




    5 Checking stuff

    Now that you know everything that you should know to create and submit jobs, you’ll probably also want to be able to monitor them. Seastar offers multiple ways to follow them, 3 of which regular users should know about:

    6 Running interactive programs

    If you are planning to run interactive programs, make sure you know about X-forwarding and showstart. The most interesting script here is runmatlabXXX. It works like this: You replace the XXX by a version number (most likely this will be 714 for 7.14). As the first argument you give the number of hours (as an integer) you want matlab to run; all the other arguments are passed on to matlab itself. So Kurt Cobain would run something like this if he wanted to run matlab -desktop for almost 3 hours:

     
    0 kcobain@seastar-64 .../ $ runmatlab714 3 -desktop
          

    7 Differences with the VSC-systems

    Many of you will also have an account on the VSC-systems (also known as Calcua or Turing and Hopper). These are the most important differences:

    Obviously there are also a lot of smaller differences hidden away. If you are planning to work with the VSC-system I would recommend that you start here: https://www.uantwerpen.be/en/research-and-innovation/research-at-uantwerp/core-facilities/core-facilities/calcua/support/

    8 Seamouse’s services

    Seamouse does a lot of things of which the following might be interesting for you:

    9 Questions and “fake” bugs

    ... I am waiting for input from the reader here ...

You just reached the end of the documentation.
Contact me for even more information,
Also, report what’s unclear,
I will fix it, don’t fear!
And when bored: Feel free to write a translation !