[xcat-user] xCAT 1.3b2 and Torque 2.1.8 STDOUT and STDERR problem

Chris Beggio cabeggi at sandia.gov
Thu Dec 13 16:05:02 MST 2007


Though it may be obvious, I should add that /home is an NFS-mounted
filesystem on all nodes.


On 12/13/07 3:55 PM, "Chris Beggio" <cabeggi at sandia.gov> wrote:

> 
> xCAT users,
> 
> I have recently pounded a new flat spot on my head with a PBS problem on a
> recently installed cluster. Versions are as follows:
> 
> Red Hat Enterprise Linux WS release 4 (Nahant Update 5)
> 
> xCAT 1.3.0-beta2
> Fri Jul 13 00:18:24 MDT 2007
> 
> Torque 2.1.8
> 
> The problem is that, out of the box, PBS was not copying the PBS prolog and
> epilog, along with STDERR (${PBS_JOBID}.ER) and STDOUT (${PBS_JOBID}.OU) or
> the joined output and error stream, to the directory from which the job was
> submitted (${PBS_O_WORKDIR}). Instead, it was copying them to the user's home
> directory (${PBS_O_HOME}). I compared two identical machines, and all the
> versions and configurations appear to be the same, but while one was copying
> output to ${PBS_O_WORKDIR}, the more recently installed one was not. I then
> changed /var/spool/pbs/mom_priv/config on the compute nodes to include:
> 
> $usecp *:/home  /home
> 
> This line does not exist on the other, working cluster. Everything works now,
> and the file ${PBS_JOBNAME}.o${PBS_JOBID} is deposited in
> ${PBS_O_WORKDIR} as expected.  What am I missing?
> 
> Thanks and Happy Festivus.
> 
> Chris
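
For anyone reading this in the archive: the $usecp line quoted above tells
pbs_mom that a destination of the form host:/home/... is reachable locally
under /home, so the output files can be delivered with a plain cp instead of
a remote copy. The Python below is only a rough sketch of that matching idea,
not Torque's actual code. The function names, the host name "head01" and the
example paths are all made up for illustration, and fnmatch is used here as a
stand-in for whatever host wildcard matching pbs_mom really does.

```python
from fnmatch import fnmatch


def parse_usecp(line):
    """Parse one '$usecp host_pattern:remote_prefix local_prefix' line."""
    directive, spec, local_prefix = line.split()
    if directive != "$usecp":
        raise ValueError("not a $usecp line: %r" % line)
    host_pattern, remote_prefix = spec.split(":", 1)
    return host_pattern, remote_prefix, local_prefix


def resolve_delivery(dest, rules):
    """Return ('cp', local_path) if a rule matches dest, else ('rcp', dest).

    dest is a 'host:/path' destination. A rule matches when the host fits
    the pattern and the path begins with the rule's remote prefix (a simple
    string-prefix comparison, which is an assumption of this sketch).
    """
    host, path = dest.split(":", 1)
    for host_pattern, remote_prefix, local_prefix in rules:
        if fnmatch(host, host_pattern) and path.startswith(remote_prefix):
            # Swap the remote prefix for the locally mounted one.
            return "cp", local_prefix + path[len(remote_prefix):]
    return "rcp", dest


# The line from the message above; host and paths below are hypothetical.
rules = [parse_usecp("$usecp *:/home  /home")]
print(resolve_delivery("head01:/home/chris/run1/myjob.o1234", rules))
print(resolve_delivery("head01:/scratch/chris/myjob.o1234", rules))
```

With the single rule above, the first destination falls under /home and is
treated as a local copy, while the second one is outside /home and would
still go through the remote copy path.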

-- 
__________________________________________________
Chris Beggio

1600 Computing Support Team

Commercial Data Systems
Contracted by Sandia National Laboratories

Phone: 505-284-8001
Fax: 505-284-6078
Email: cabeggi at sandia.gov
__________________________________________________



