
Thread: How to Torque on ubuntu 10.04 on a single multicore machine


    I updated this on Jul 21 using the comments below.

    Torque is a batch job queuing system that is used on clusters, but I find it handy on my multi-core workstation as well. It lets multiple users submit jobs to be scheduled. The scheduler makes sure that not too many jobs run simultaneously, which could otherwise cause high system load or memory problems.

    I previously posted how to install Torque on Ubuntu Hardy from the Torque source package. However, Torque is now in the Lucid repositories, and here are the steps I had to take to get it working on my workstation.

    For this setup I kept the server host name 'torqueserver', which is the default in the package. You can do the same or use a fully qualified domain name. In that case, you will have to adapt the steps somewhat.

    My workstation has 8 cores, and I only want to give 6 of them to the queue. Please adapt your numbers accordingly.


    0) open a root terminal
    Code:
    sudo -s
    1) add torqueserver as an alias to /etc/hosts.
    Code:
    gedit /etc/hosts
    change 127.0.1.1	myHostName to 127.0.1.1	myHostName torqueserver
    *) See the post by drlemon. Alternatively, put a resolvable host name (check with 'host $HOSTNAME') in the file /var/lib/torque/server_name, and use that host name wherever torqueserver is used below.
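
If you would rather not edit /etc/hosts by hand, step 1 can be sketched as a small shell function. This is only a sketch: the function name add_host_alias is made up for illustration, and it assumes the hosts file has one entry per line with the IP address in the first column. It appends the alias only if it is not already on that line, so running it twice is harmless.

```shell
# add_host_alias FILE IP NAME   (hypothetical helper, not part of Torque)
# Appends NAME to the first line of FILE whose first field is IP,
# unless NAME is already listed on that line.
add_host_alias() {
    file=$1
    ip=$2
    name=$3
    awk -v ip="$ip" -v name="$name" '
        $1 == ip && !seen {
            found = 0
            for (i = 2; i <= NF; i++) if ($i == name) found = 1
            if (!found) $0 = $0 " " name
            seen = 1
        }
        { print }
    ' "$file" > "$file.tmp" &&
    cat "$file.tmp" > "$file" &&   # overwrite in place, keeping the file permissions
    rm -f "$file.tmp"
}
```

As root you would call it as: add_host_alias /etc/hosts 127.0.1.1 torqueserver. Keep a copy of /etc/hosts before trying it.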


    2) install Torque from the repositories
    Code:
    apt-get install torque*
    3) stop torque
    Code:
    qterm
    4) check that Torque is no longer running (if it is, kill the remaining pbs processes)
    Code:
    ps aux | grep pbs
    5) create missing directory
    Code:
    mkdir /var/lib/torque/server_priv/arrays
    6) set the server host (SERVERHOST) in torque.cfg
    Code:
    echo "SERVERHOST localhost" >> /var/lib/torque/torque.cfg
    7) set up the nodes
    Code:
    echo "torqueserver np=8" >> /var/lib/torque/server_priv/nodes
    echo "pbs_server = 127.0.1.1" >> /var/lib/torque/mom_priv/config
    8) set up the database
    Code:
    pbs_server -t create
    9) create the queue and set the server settings in the database
    Code:
    qmgr torqueserver
    create queue batch
    set queue batch queue_type = Execution
    set queue batch max_running = 6
    set queue batch resources_max.ncpus = 8
    set queue batch resources_max.nodes = 1
    set queue batch resources_default.ncpus = 1
    set queue batch resources_default.neednodes = 1:ppn=1
    set queue batch resources_default.walltime = 24:00:00
    set queue batch max_user_run = 6
    set queue batch enabled = True
    set queue batch started = True
    set server default_queue = batch
    set server scheduling = True
    
    exit
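
The qmgr session in step 9 can also be generated instead of typed. The sketch below prints the same commands with the two numbers (job slots and cores) as parameters, so you can adapt them to your machine. The function name queue_commands is made up for illustration. It only prints text, so nothing changes until you pipe the output into qmgr yourself.

```shell
# queue_commands SLOTS CORES   (hypothetical helper)
# Prints the qmgr commands from step 9, with SLOTS used for
# max_running/max_user_run and CORES for resources_max.ncpus.
queue_commands() {
    slots=$1
    cores=$2
    cat <<EOF
create queue batch
set queue batch queue_type = Execution
set queue batch max_running = $slots
set queue batch resources_max.ncpus = $cores
set queue batch resources_max.nodes = 1
set queue batch resources_default.ncpus = 1
set queue batch resources_default.neednodes = 1:ppn=1
set queue batch resources_default.walltime = 24:00:00
set queue batch max_user_run = $slots
set queue batch enabled = True
set queue batch started = True
set server default_queue = batch
set server scheduling = True
EOF
}

# Preview the commands first. Once they look right, run as root:
#   queue_commands 6 8 | qmgr torqueserver
```

qmgr reads its commands from standard input when no -c option is given, which is what the pipe in the comment relies on.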
    10) restart the server, the scheduler and the node daemon (pbs_mom)
    Code:
    qterm
    pbs_server
    pbs_sched #this will give some warning about missing files
    pbs_mom
    11) check that the nodes are up
    Code:
    pbsnodes -a
    12) exit the root terminal and, as a normal user, test the queue
    Code:
    exit
    qstat -q
    echo "sleep 30" | qsub
    qstat
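
The one-liner in step 12 is enough to see that the queue works. For real jobs you will usually want a job script with #PBS directives. Here is a minimal sketch, run as a normal user. The job name testjob and the 5 minute walltime are just example values, and the resource request mirrors the queue defaults from step 9. It only calls qsub if qsub is actually on the PATH.

```shell
# Write a small job script to a temporary file.
job=$(mktemp /tmp/testjob.XXXXXX)
cat > "$job" <<'EOF'
#!/bin/sh
#PBS -N testjob
#PBS -q batch
#PBS -l nodes=1:ppn=1
#PBS -l walltime=00:05:00
# Torque sets PBS_O_WORKDIR to the directory qsub was called from.
cd "${PBS_O_WORKDIR:-.}"
echo "running on $(hostname)"
sleep 30
EOF

# Submit only when qsub is available; otherwise just report the script path.
if command -v qsub >/dev/null 2>&1; then
    qsub "$job"
else
    echo "qsub not found; job script left in $job"
fi
```

After submitting, qstat should list the job while it sleeps.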
    13) See drlemon's post: run gedit /etc/init.d/torque* and, in all three files, change the pidfile= line so that it points to /var/lib instead of /var/spool. Also remove -t create from the server options in the torque-server file.
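
Before editing the init scripts in step 13, it can help to preview the pidfile change. The sketch below does not modify anything; it only prints a diff. It assumes the line is spelled pidfile= in lowercase, as described above (the match is case sensitive), and that the path contains /var/spool/. The function name fix_pidfile_path is made up for illustration.

```shell
# fix_pidfile_path FILE   (hypothetical helper)
# Prints FILE with /var/spool/ replaced by /var/lib/ on lines that
# contain "pidfile=". The file itself is left untouched.
fix_pidfile_path() {
    sed '/pidfile=/s#/var/spool/#/var/lib/#' "$1"
}

# Show what would change in each Torque init script, if any exist.
for f in /etc/init.d/torque*; do
    [ -f "$f" ] || continue
    fix_pidfile_path "$f" | diff "$f" -
done
```

If the diff shows only the pidfile lines, make the same change in the files themselves. Removing -t create from the torque-server file still has to be done by hand.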

    This works for me, but a demanding computing environment will probably need more configuration. See the Torque website for more on queue configuration, user management, etc.
    Last edited by jouke.postma; July 21st, 2011 at 04:47 PM. Reason: incorporating comment below
