Research Blog
Welcome to my Research Blog.
This is mostly meant to document what I am working on for myself, and to communicate with my colleagues. It is likely filled with errors!
This project is maintained by ndrakos
I will be running things on Pleiades. Documentation on using this system can be found here, but in this post I will write my notes on getting everything working.
The NAS Control Room staff are available 24/7 and can be contacted by phone at 650-604-4444 or through email at support@nas.nasa.gov.
There is information on how to log on to Pleiades here.
In summary:
1) ssh into username@sfe1.nas.nasa.gov first (PAM authentication refers to the password generated by the soft token; the PIN is an 8-digit number)
2) ssh into username@pfe (this is the Pleiades Front-End Load Balancer)
There are instructions in the link above for setting up a one-step connection, which I will want to do eventually.
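As a rough sketch of what a one-step connection could look like, OpenSSH 7.3 and later has a -J (ProxyJump) option that tunnels through an intermediate host. This is not the method from the NAS instructions, and I am assuming the front ends accept this kind of jump; "username" is a placeholder, and the two-factor prompts will still appear. The snippet only builds and prints the command so it can be inspected before running.

```shell
# Placeholder username; the host names are the ones from the two steps above.
NAS_USER="username"
JUMP_HOST="sfe1.nas.nasa.gov"   # secure front end (step 1)
TARGET_HOST="pfe"               # Pleiades front-end load balancer (step 2)

# -J makes ssh connect to the jump host first and forward on to the target.
# The target name is resolved on the jump host, so the short name "pfe" is kept.
SSH_CMD="ssh -J ${NAS_USER}@${JUMP_HOST} ${NAS_USER}@${TARGET_HOST}"

# Print the command for inspection; paste it into the terminal to connect.
echo "$SSH_CMD"
```

If this does not work, the passthrough setup in the linked instructions is the supported route.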
You can track your jobs and allocations on the myNAS web portal.
In the terminal, you can also check your usage with the command acct_ytd.
Information on quotas can be found here, and you can check your quota using quota -v.
By default you get 8 GB in your home directory, but you can request more by emailing support@nas.nasa.gov.
For short-term storage you get 1 TB on the Lustre/nobackup filesystems.
For long-term storage you can use the Lou Mass Storage System, which has no disk quota limits.
The HECC supercomputers use the Portable Batch System (PBS) to schedule jobs.
The following command is used to submit jobs:
%qsub job_script
There is a sample job_script here.
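For reference, a minimal PBS job script might look like the sketch below. This is not the linked sample; the job name, resource request, and executable name (my_program) are all placeholders I made up, and the right select line depends on the node type you want. The snippet writes the script to a file called job_script so it can be submitted with the qsub command above.

```shell
# Write a minimal PBS job script to a file (all resource values are placeholders).
cat > job_script <<'EOF'
#!/bin/bash
#PBS -N test_job
#PBS -q devel
#PBS -l select=1:ncpus=4
#PBS -l walltime=0:30:00
#PBS -j oe

# PBS starts the job in the home directory; move to where qsub was run.
cd "$PBS_O_WORKDIR"

# Hypothetical executable name; replace with the real program.
mpiexec -np 4 ./my_program
EOF

# Submit it with: qsub job_script
```

The -j oe line merges standard error into the standard output file, which keeps the debugging output in one place.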
The normal, long and low queues are for production work, while the debug and devel queues are for debugging and development. In the debug queue you can have a maximum of two running jobs with a total of 128 nodes, while the devel queue allows a maximum of one job at a time, with a maximum wall-clock time of 2 hours and 512 nodes.
The command for checking the status of all your jobs is qstat -nu username. For the full status of a job 12345, you use qstat -f 12345.
You can delete a job using qdel 12345 and hold/release a job with qhold 12345 and qrls 12345.
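When scripting these commands, it can help to capture the job identifier that qsub prints. As far as I understand, PBS reports it in the form number.server, so a small helper can strip the server part to get the bare number (handy for naming log files). The function name job_number and the server name in the example are hypothetical.

```shell
# Strip the server suffix from a PBS job identifier, e.g. "12345.server" -> "12345".
job_number() {
    printf '%s\n' "${1%%.*}"
}

# On a system with PBS available, this could be used as (not run here):
#   JOBID=$(qsub job_script)
#   qstat -f "$(job_number "$JOBID")"

# Example with a made-up identifier:
job_number "12345.example-server"
```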
As usual, I will follow these notes for setting up gadget-2.
Note that you have to load an mpi module to install fftw (use module load package; you can check available modules using module avail).
Here is my job script:
THIS ISN’T WORKING: STILL DEBUGGING THIS
Update: I had to add the line “export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/nasa/pkgsrc/sles12/2016Q4/lib:/u/username/install_to_here/gsl_in/lib” to the job script, and it works now.
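One caveat with that line: if LD_LIBRARY_PATH is empty when the job starts, the result begins with a ":", and the dynamic loader treats an empty entry as the current directory. A slightly more defensive variant, using the same two directories ("username" is still a placeholder), might be:

```shell
# Directories from the update above; "username" is a placeholder.
PKGSRC_LIB="/nasa/pkgsrc/sles12/2016Q4/lib"
GSL_LIB="/u/username/install_to_here/gsl_in/lib"

# ${VAR:+...} expands only when VAR is set and non-empty, so no stray ":" appears.
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}${PKGSRC_LIB}:${GSL_LIB}"

# Show the resulting search path.
echo "$LD_LIBRARY_PATH"
```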