Setting up ulimit for ArrayServer
From Array Suite Wiki
When ArrayServer is running a large number of analyses, such as parallel jobs for 200 alignments, you may get a "Thread creation failed" error. The error is caused by resource limits on the Linux machine. Make sure the ulimit values for "max user processes" and "open files" are both set to the recommended value of 65536. You can check the current values by typing: ulimit -a
$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 515184
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 65536
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 65536
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
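To check only the two limits that matter here, something like the following sketch may help. It assumes bash (some other shells do not support `ulimit -u`), and `check_limit` is a hypothetical helper name, not an ArrayServer command.

```shell
#!/usr/bin/env bash
# Sketch: compare the current soft limits against the recommended 65536.
# check_limit is a hypothetical helper, not part of ArrayServer.
check_limit() {
    name=$1; current=$2; required=$3
    if [ "$current" = "unlimited" ]; then
        echo "OK $name = unlimited"
    elif [ "$current" -ge "$required" ] 2>/dev/null; then
        echo "OK $name = $current"
    else
        echo "LOW $name = $current (want >= $required)"
    fi
}

check_limit "open files (-n)" "$(ulimit -n)" 65536
check_limit "max user processes (-u)" "$(ulimit -u 2>/dev/null)" 65536
```

Run it as the account that starts ArrayServer, since limits are applied per login session.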
Modify the limits in two config files, following the ulimit setup wiki.
Most of the time, the error is caused by the max user processes (-u) and open files (-n) limits.
On Linux, ArrayServer runs on the Mono platform. Each Mono job may occupy up to 20 processes (to open files, load DLLs, etc.). Therefore, our recommendation is to raise the process limit and the open files limit from 1024 to 65536 so that your server can handle hundreds of jobs.
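As a rough illustration of why 1024 is too low, the arithmetic can be sketched in shell. The 200-job figure is borrowed from the alignment example at the top of this page and 20 is the upper bound quoted above; actual usage will vary, and the variable names are only illustrative.

```shell
# Rough worst-case estimate using the figures from the text:
# 200 parallel jobs x up to 20 processes per Mono job.
jobs_count=200
procs_per_job=20
default_limit=1024

needed=$((jobs_count * procs_per_job))
echo "worst-case processes: $needed"

if [ "$needed" -gt "$default_limit" ]; then
    echo "exceeds the default limit of $default_limit"
fi
```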
Note: When ArrayServer runs on a cluster submission machine with cluster mode enabled, most jobs run on the cluster nodes. ArrayServer still creates processes on the submission machine to monitor project progress: for each qsub job, one process monitors the cluster job using qstat/qacct. These processes are not computationally intensive, but they count against the quotas. Therefore, a low limit on the cluster submission machine can also cause the error.
Log in as root to change system settings.
There are two places where changes need to be recorded:
/etc/sysctl.conf sets the system-wide ceiling:
# max open files (systemic limit)
fs.file-max = 65536
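Before writing this value, it may be worth checking what the system already uses: on many recent distributions the default fs.file-max is well above 65536, in which case setting it to 65536 would lower the ceiling rather than raise it. The sketch below reads the current value from procfs; `file_max_ok` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: report whether the system-wide open-file ceiling is already high enough.
# file_max_ok is a hypothetical helper; the optional second argument defaults
# to the standard Linux procfs location.
file_max_ok() {
    required=$1
    path=${2:-/proc/sys/fs/file-max}
    current=$(cat "$path") || return 2
    if [ "$current" -ge "$required" ]; then
        echo "fs.file-max is $current (>= $required), no change needed"
    else
        echo "fs.file-max is $current (< $required)"
        return 1
    fi
}

# If the check fails, print a reminder of the next step.
file_max_ok 65536 || echo "Edit /etc/sysctl.conf as root, then reload it with: sysctl -p"
```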
/etc/security/limits.conf sets the per-user floor (soft limit) and ceiling (hard limit):
Ensure both a hard limit and a soft limit are set, otherwise the setting will not become active.
For example, to set the number of open files and processes any user of the system may have at a given time to 65536:
* soft nofile 65536
* hard nofile 65536
* soft nproc 65536
* hard nproc 65536
Note: the user * does not include the root user. You should add additional lines for root:
root soft nofile 65536
root hard nofile 65536
root soft nproc 65536
root hard nproc 65536
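Since a missing soft or hard line is easy to overlook, a check along these lines may be useful. It is only a sketch: `has_both_limits` is a hypothetical helper, and it assumes the usual four-column layout of limits.conf (domain, type, item, value), where a type of "-" sets both soft and hard at once.

```shell
#!/usr/bin/env bash
# Sketch: verify that a domain has both a soft and a hard entry for an item.
# has_both_limits is a hypothetical helper.
# limits.conf columns: <domain> <type> <item> <value>
has_both_limits() {
    file=$1; domain=$2; item=$3
    awk -v d="$domain" -v i="$item" '
        /^[[:space:]]*#/ { next }                                  # skip comment lines
        $1 == d && $3 == i && $2 == "soft" { soft = 1 }
        $1 == d && $3 == i && $2 == "hard" { hard = 1 }
        $1 == d && $3 == i && $2 == "-"    { soft = 1; hard = 1 }  # "-" sets both
        END { exit !(soft && hard) }
    ' "$file"
}

conf=/etc/security/limits.conf
for domain in '*' root; do
    for item in nofile nproc; do
        if has_both_limits "$conf" "$domain" "$item"; then
            echo "$domain $item: soft and hard both set"
        else
            echo "$domain $item: missing soft or hard entry"
        fi
    done
done
```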
You have to log out and log in again for these limits to take effect.
Log in as the ArrayServer admin Linux account and double check these limits using ulimit -a before restarting ArrayServer, so that it picks up the new system settings. For more information, search for "Increasing ulimit number of files and processes limit on Linux".
Also double check the file /etc/security/limits.d/*-nproc.conf, as it is likely overriding your settings.
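One way to spot such overrides is to list every active nproc and nofile line in the drop-in directory. The following is a sketch only; `show_overrides` is a hypothetical helper name, and the exact file names (for example a numbered *-nproc.conf) differ between distributions.

```shell
#!/usr/bin/env bash
# Sketch: show every non-comment nproc/nofile line under a limits.d directory.
# show_overrides is a hypothetical helper; the optional argument defaults to
# the standard drop-in directory.
show_overrides() {
    dir=${1:-/etc/security/limits.d}
    # -H prints the file name; the pattern rejects lines with "#" before the item.
    grep -H -E '^[^#]*[[:space:]](nproc|nofile)[[:space:]]' "$dir"/*.conf 2>/dev/null
}

show_overrides || echo "no nproc/nofile overrides found"
```

Any line it prints for the same user or for * should be raised to 65536 as well, or removed.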
One of our super-users also recommended a command that raises the open files limit temporarily, for a new shell started as the current user:
sudo sh -c "ulimit -n 65536 && exec su $LOGNAME"
Error Message Example
Error occured. Thread creation failed.@@@ StackTrace=
at System.Threading.Thread.Start () [0x00000] in <filename unknown>:0
at System.Threading.Thread.Start (System.Object parameter) [0x00000] in <filename unknown>:0
at Omicsoft.JobRunner.Run (ILogger logger, Omicsoft.Job job, Int32 cpuNumber) [0x00000] in <filename unknown>:0