Research Blog
Welcome to my Research Blog.
This is mostly meant to document what I am working on for myself, and to communicate with my colleagues. It is likely filled with errors!
This project is maintained by ndrakos
Here are my notes about switching to the 1024 simulations.
I ran 1024 simulations a while ago and checked their output was okay, as documented here
Additionally, I ran Rockstar and checked that the HMF looked okay here
I also ran consistent-trees on the snapshots. This was significantly slower at this resolution. I have been running on lou, since that’s where I have all the outputs stored. However, lou only allows you to run a job for 3 days, and you can only use 1 node at a time (with 24 cores).
Therefore, I needed to look into how to restart jobs that didn’t finish, and also how to run on multiple cores.
Both the gravitational_consistency
and find_parents_and_cleanup
programs accept an extra command line argument which is the snap number at which the code should resume processing. The information on where the code died can be found in timing.log
.
Therefore, I edited the file in consistent-trees/do_merger_tree.pl
to take extra command line argument, as follows:
#!/usr/bin/perl -w
my $cfg = $ARGV[0];
my $restart = $ARGV[1];
check_system("./gravitational_consistency", $cfg, $restart);
check_system("./find_parents_and_cleanup", $cfg);
check_system("./resort_outputs", $cfg);
check_system("./assemble_halo_trees", $cfg);
sub check_system {
system(@_) == 0 or
die "Tree creation failed while executing \"@_\".\n(See errors above).\n";
}
The code automatically uses OpenMP if available. here was my job script (note I commented out the first line after it finished running… in this example I had restarted the second step at snapshot 337):
#PBS -l select=1:ncpus=4:ompthreads=4:mem=100MB
#PBS -l walltime=72:00:00
#PBS -q ldan
module load mpi-sgi/mpt
module load comp-intel/2018.3.222
module load python3/3.7.0
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/nasa/pkgsrc/sles12/2016Q4/lib:/pleiades/u/ndrakos/ins\
tall_to_here/gsl_in/lib
cd /pleiades/u/ndrakos/consistent-trees
#perl /pleiades/u/ndrakos/Rockstar/scripts/gen_merger_cfg.pl /u/ndrakos/wfirst1024/HaloFinder/ro\
ckstar_param.cfg
perl do_merger_tree.pl /u/ndrakos/wfirst1024/HaloFinder/outputs/merger_tree.cfg 337
perl halo_trees_to_catalog.pl /u/ndrakos/wfirst1024/HaloFinder/outputs/merger_tree.cfg
I am now running the light cone on the 1024 simulations. I now have this setup to run on the NASA computers (previously I was running it on my laptop).
I altered the code so that it is parallelized—every tiled box can be run separately. Therefore, I can run on up to 60 cores to get the maximum amount of speedup. I could potentially parallelize this even more, but since I am currently running on lou, I am limited to 24 cores anyway.