[xcat-user] OpenMPi issue with /tmp, /dev/shm and stateless [Scanned]
Egan Ford
datajerk at gmail.com
Mon Jul 6 08:20:32 MDT 2009
The 10M /tmp is not an xCAT default. It appears in the documentation only
to show that you can restrict the size of any file system if you need to.
The xCAT documentation should be clearer on that point. Most shops do not
create a separate /tmp or /var/tmp.
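
For anyone who wants to verify what a stateless node actually ended up with, here is a minimal sketch in Python that flags file systems smaller than a chosen minimum. The 100 MiB threshold is taken from the size reported as sufficient in the quoted message below; the function name `undersized` and its `usage` parameter are illustrative assumptions, not part of xCAT or Open MPI.

```python
import shutil

# Assumption: 100 MiB, the size reported as sufficient in the quoted message.
MIN_BYTES = 100 * 1024 * 1024


def undersized(paths, min_bytes=MIN_BYTES, usage=shutil.disk_usage):
    """Return (path, total_bytes) for each path whose file system is
    smaller than min_bytes.

    `usage` defaults to shutil.disk_usage and is only a parameter so the
    check can be exercised without touching real mounts.
    """
    small = []
    for path in paths:
        total = usage(path).total  # capacity of the file system holding path
        if total < min_bytes:
            small.append((path, total))
    return small
```

On a compute node, calling `undersized(["/tmp", "/dev/shm"])` would list whichever of the two is still below the threshold.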
On Fri, Jul 3, 2009 at 6:03 AM, Arif Ali<aali at ocf.co.uk> wrote:
> Hi,
>
> I'm not sure whether this is within the scope of the xCAT team, but I will
> ask/comment anyway.
>
> When running Open MPI on a diskless cluster over InfiniBand (the openib or
> psm (InfiniPath) interface), the shared-memory processes require an area to
> work in. With openib that area is /tmp, which then gets cleared
> automatically; with the psm interface it is /dev/shm. Both need to be a lot
> bigger than just 10m. In our environment, for benchmarking this cluster, we
> changed the size of both to 100m, and since then we have had no problems.
> With regard to psm, we may need to change the prologue and epilogue scripts
> so that the psm temp files created in /dev/shm are removed either after job
> completion or before a job is started.
>
> I am not sure whether this needs to be documented anywhere.
>
> We spent a few days trying to diagnose this problem, so hopefully someone
> else having the same problem will come across this e-mail.
>
> regards,
>
> --
> Arif Ali MBCS
> HPC Software Engineer
> OCF plc
>
> Support Phone: +44 (0)845 702 3829
> Support E-mail: support at ocf.co.uk
>
> _______________________________________________
> xcat-user mailing list
> xcat-user at lists.xcat.org
> http://www.xcat.org/mailman/listinfo/xcat-user
>
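
The prologue/epilogue cleanup suggested in the quoted message could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the `psm_shm.*` file-name pattern is a guess at how the psm temp files are named and should be checked against what actually appears in /dev/shm on your nodes, and the function name `remove_stale_shm` and its parameters are hypothetical.

```python
import stat
from pathlib import Path


def remove_stale_shm(directory="/dev/shm", pattern="psm_shm.*", uid=None):
    """Delete regular files in `directory` whose names match `pattern`.

    Assumption: `pattern` matches the psm temp files; adjust it to what
    your nodes really contain. If `uid` is given, only files owned by that
    user are removed. Returns the names of the files that were removed.
    """
    removed = []
    for entry in sorted(Path(directory).glob(pattern)):
        try:
            info = entry.lstat()  # lstat so symlinks are never followed
            if not stat.S_ISREG(info.st_mode):
                continue  # skip directories, symlinks, devices
            if uid is not None and info.st_uid != uid:
                continue  # leave other users' files alone
            entry.unlink()
        except OSError:
            continue  # file vanished or we lack permission; move on
        removed.append(entry.name)
    return removed
```

A scheduler epilogue could call this with the uid of the user whose job just finished, so that files belonging to other jobs still running on the node are left in place.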