[xcat-user] Problems with the initial pxe boot of the node

Vallard Benincosa vallard at benincosa.com
Fri Aug 28 15:31:13 MDT 2009


It looks like you're running the new 2.3 snapshot code, which has some
dependency problems.  Please install:

http://xcat.sourceforge.net/yum/xcat-dep/syslinux-xcat-3.82-1.noarch.rpm
and
http://xcat.sourceforge.net/yum/xcat-dep/xnba-undi-0.9.7-12.noarch.rpm

Make sure that the pxelinux from the syslinux-xcat RPM is the one that ends up
installed.  These updates were made to xCAT to address the scalability problems
we have seen with TFTP.  When 2.3 is officially released we should be better off.
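
A quick way to confirm the boot files landed where TFTP serves them is to
check the paths directly. The sketch below is only illustrative: it assumes
the TFTP root is /tftpboot (the tftpdir value in the site table quoted below)
and checks the two x86 boot filenames that the quoted dhcpd.conf hands out.
The name missing_boot_files is hypothetical and not part of xCAT.

```python
import os

# Boot filenames taken from the quoted dhcpd.conf; adjust if your config differs.
BOOT_FILES = ("xcat/xnba.kpxe", "pxelinux.0")


def missing_boot_files(tftp_root, names=BOOT_FILES):
    """Return the names that do not exist as regular files under tftp_root."""
    return [n for n in names if not os.path.isfile(os.path.join(tftp_root, n))]


if __name__ == "__main__":
    # /tftpboot is an assumption based on the "tftpdir" site setting.
    for name in missing_boot_files("/tftpboot"):
        print("missing:", name)
```

An empty result only says the files exist; it does not prove they came from
the RPMs above.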

The dead link in /tftpboot/pxelinux.cfg is not used anymore with the new xCAT.

Hope that helps.

V

On Fri, Aug 28, 2009 at 1:57 PM, Putz, Andreas (AFCC) <
Andreas.Putz at afcc-auto.com> wrote:

> I am new to the cluster world, so I might be missing something obvious. I
> think I followed the instructions for setting up the management node, and I
> am now at the stage where I would like to start a diskful install of my
> nodes.
> When a node enters the PXE boot stage, it obtains the correct IP address
> via DHCP from my management node, but it does not get a valid PXE boot image.
>
>
> I get the following error message on the management node:
>
>
> ==============================================================================
> /var/log/messages
>
> ==============================================================================
> Aug 21 13:03:24 mnode dhcpd: DHCPDISCOVER from 00:30:48:c6:27:f4 via eth0
> Aug 21 13:03:24 mnode dhcpd: DHCPOFFER on 10.1.1.2 to 00:30:48:c6:27:f4 via eth0
> Aug 21 13:03:26 mnode dhcpd: DHCPREQUEST for 10.1.1.2 (10.1.1.1) from 00:30:48:c6:27:f4 via eth0
> Aug 21 13:03:26 mnode dhcpd: DHCPACK on 10.1.1.2 to 00:30:48:c6:27:f4 via eth0
> Aug 21 13:03:26 mnode atftpd[14024]: Serving xcat/xnba.kpxe to 10.1.1.2:2070
> Aug 21 13:03:26 mnode atftpd[14024]: File 5/xcat/xnba.kpxe not found
> Aug 21 13:03:26 mnode atftpd[14024]: Server thread exiting
> Aug 21 13:03:26 mnode atftpd[14024]: Serving xcat/xnba.kpxe to 10.1.1.2:2071
> Aug 21 13:03:26 mnode atftpd[14024]: File 5/xcat/xnba.kpxe not found
> Aug 21 13:03:26 mnode atftpd[14024]: Server thread exiting
>
>
> ==============================================================================
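
For longer logs, the failing requests can be pulled out mechanically. The
following is a minimal sketch that assumes the atftpd message format shown in
the excerpt above ("File <path> not found"); the function name
missing_tftp_files is hypothetical.

```python
import re

# Matches atftpd lines such as:
#   "... atftpd[14024]: File 5/xcat/xnba.kpxe not found"
NOT_FOUND = re.compile(r"atftpd\[\d+\]: File (\S+) not found")


def missing_tftp_files(log_lines):
    """Return the distinct paths atftpd reported as not found, in first-seen order."""
    seen = []
    for line in log_lines:
        match = NOT_FOUND.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen
```

Fed the lines above, it reports the single path 5/xcat/xnba.kpxe, which is
the file the node never receives.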
>
> My tftpboot directory looks like this:
>
> ==================================================================
> [root@mnode ~]# ls -la /tftpboot/pxelinux.cfg/
> ==================================================================
> total 24
> drwxr-xr-x 2 root root 4096 2009-08-20 09:46 .
> drwxr-xr-x 5 root root 4096 2009-05-08 16:24 ..
> -rw-r--r-- 1 root root  108 2009-08-18 16:52 0A01
> lrwxrwxrwx 1 root root    7 2009-08-20 09:46 0A010102 -> node002  <= This file does not exist
> -rw-r--r-- 1 root root  110 2009-08-18 16:52 0A0204
> -rw-r--r-- 1 root root  109 2009-08-18 16:52 7F
> -rw-r--r-- 1 root root  113 2009-08-18 16:52 C0A87A
> ==================================================================
>
> The link target, node002, does not exist in that directory.
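
The file names in that listing are IPv4 addresses written in uppercase hex:
PXELINUX looks for a config file named after the client's full address and
then retries with one trailing hex digit removed each time. The sketch below
lists those candidates for a given address. It leaves out the UUID and MAC
based names tried before them and the final "default" file, and the function
name pxelinux_ip_names is hypothetical.

```python
import ipaddress


def pxelinux_ip_names(ip):
    """List the hex-IP config names PXELINUX tries for ip, most specific first."""
    full = "%08X" % int(ipaddress.IPv4Address(ip))
    # Drop one trailing hex digit per step: 8 digits down to 1.
    return [full[:n] for n in range(8, 0, -1)]


if __name__ == "__main__":
    for name in pxelinux_ip_names("10.1.1.2"):
        print(name)
```

For 10.1.1.2 the first candidate is 0A010102, the dangling link above, and
0A01 further down the list covers every 10.1.x.x address.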
>
> I do find an entry for my node in the following directory:
>
> ==================================================================
> [root@mnode ~]# ls -la /tftpboot/xcat/xnba/nodes/
> ==================================================================
> total 12
> drwxr-xr-x 2 root root 4096 2009-08-06 10:23 .
> drwxr-xr-x 4 root root 4096 2009-08-06 10:23 ..
> -rw-r--r-- 1 root root  331 2009-08-20 09:46 node002
> ==================================================================
>
> ==================================================================
> [root@mnode autoinst]# more /tftpboot/xcat/xnba/nodes/node002
> ==================================================================
> #!gpxe
> #install fedora9-x86_64-compute
> imgfetch -n kernel http://${next-server}/tftpboot/xcat/fedora9/x86_64/vmlinuz
> imgload kernel
> imgargs kernel nofb utf8 ks=http://10.1.1.1/install/autoinst/node002 ksdevice=eth0 console=ttyS1,19200n8r noipv6
> imgfetch http://${next-server}/tftpboot/xcat/fedora9/x86_64/initrd.img
> imgexec kernel
> ==================================================================
>
> My DHCP configuration was created by xcat using makedhcp and looks like
> this:
>
> ==================================================================
> more /etc/dhcpd.conf
> ==================================================================
> #xCAT generated dhcp configuration
>
> authoritative;
> option space isan;
> option isan-encap-opts code 43 = encapsulate isan;
> option isan.iqn code 203 = string;
> option isan.root-path code 201 = string;
> option space gpxe;
> option gpxe-encap-opts code 175 = encapsulate gpxe;
> option gpxe.bus-id code 177 = string;
> option gpxe.no-pxedhcp code 176 = unsigned integer 8;
> option iscsi-initiator-iqn code 203 = string;
> ddns-update-style none;
> option client-architecture code 93 = unsigned integer 16;
> option gpxe.no-pxedhcp 1;
>
> omapi-port 7911;
> key xcat_key {
>   algorithm hmac-md5;
>   secret "QWhrQ0JqanFaRXZxbzFnRXljQXlaQTRrMzFVcXVLOU8=";
> };
> omapi-key xcat_key;
> shared-network eth0 {
>   subnet 10.1.0.0 netmask 255.255.0.0 {
>     max-lease-time 43200;
>     min-lease-time 43200;
>     default-lease-time 43200;
>     option routers  10.1.1.1;
>     next-server  10.1.1.1;
>     option log-servers 10.1.1.1;
>     option ntp-servers 10.1.1.1;
>     option domain-name "cluster.priv";
>     option domain-name-servers  10.1.1.1;
>     if exists gpxe.bus-id { #x86, gPXE
>        filename = "xcat/xnba/nets/10.1.0.0_16";
>     } else if option client-architecture = 00:00  { #x86
>       filename "xcat/xnba.kpxe";
>     } else if option vendor-class-identifier = "Etherboot-5.4"  { #x86
>       filename "pxelinux.0";
>     } else if option client-architecture = 00:02 { #ia64
>        filename "elilo.efi";
>     } else if substring(filename,0,1) = null { #otherwise, provide yaboot if the client isn't specific
>        filename "/yaboot";
>     }
>     range dynamic-bootp 10.1.1.20 10.1.1.250;
>   } # 10.1.0.0/255.255.0.0 subnet_end
> } # eth0 nic_end
> #definition for host node002 aka host node002 can be found in the dhcpd.leases file
> #definition for host node003 aka host node003 can be found in the dhcpd.leases file
> ==================================================================
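
The if/else chain in the subnet block decides which boot file a client is
offered, and the first matching branch wins. The sketch below mirrors that
chain as a plain function so the outcome for a given client is easy to see.
It is a reading of the quoted config, not xCAT or dhcpd code, and the
function and parameter names are hypothetical.

```python
def boot_filename(gpxe_bus_id=None, client_arch=None, vendor_class=None,
                  requested_filename=None):
    """Return the filename the quoted dhcpd.conf would offer, or None if no branch matches."""
    if gpxe_bus_id is not None:
        # "if exists gpxe.bus-id": the client is already running gPXE.
        return "xcat/xnba/nets/10.1.0.0_16"
    if client_arch == 0:
        # option client-architecture = 00:00 (x86 PXE ROM)
        return "xcat/xnba.kpxe"
    if vendor_class == "Etherboot-5.4":
        return "pxelinux.0"
    if client_arch == 2:
        # option client-architecture = 00:02 (ia64)
        return "elilo.efi"
    if not requested_filename:
        # substring(filename,0,1) = null: the client asked for nothing specific.
        return "/yaboot"
    return None
```

A plain x86 PXE ROM (client_arch=0, no gpxe.bus-id) gets xcat/xnba.kpxe,
which matches the file atftpd tried to serve in the log above.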
>
> ==================================================================
> more /var/lib/dhcpd/dhcpd.leases
> ==================================================================
> # The format of this file is documented in the dhcpd.leases(5) manual page.
> # This lease file was written by isc-dhcp-4.0.0
>
> host node002 {
>   dynamic;
>   hardware ethernet 00:30:48:c6:27:f4;
>   fixed-address 10.1.1.2;
>         supersede host-name = "node002";
>         supersede server.next-server = 0a:01:01:01;
> }
> host node003 {
>   dynamic;
>   hardware ethernet 00:30:48:c6:3a:50;
>   fixed-address 10.1.1.3;
>         supersede host-name = "node003";
>         supersede server.next-server = 0a:01:01:01;
> }
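
The next-server value in those host entries is the server's IPv4 address
written as colon-separated hex bytes. A small sketch for converting it back
to dotted-quad form (the function name hex_to_ipv4 is hypothetical):

```python
import ipaddress


def hex_to_ipv4(colon_hex):
    """Convert a colon-separated hex string such as '0a:01:01:01' to dotted-quad."""
    raw = bytes(int(part, 16) for part in colon_hex.split(":"))
    # IPv4Address accepts exactly four packed bytes, most significant first.
    return str(ipaddress.IPv4Address(raw))


if __name__ == "__main__":
    print(hex_to_ipv4("0a:01:01:01"))
```

Here 0a:01:01:01 decodes to 10.1.1.1, the same address as the next-server
line in dhcpd.conf.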
>
> The node definition and site table are as follows:
>
> ==================================================================
> lsdef node002
> ==================================================================
> Object name: node002
>     arch=x86_64
>     chassis=CrayCX01
>     currchain=boot
>     currstate=install fedora9-x86_64-compute
>     groups=all,RackCX01,compute,vnc
>     initrd=xcat/fedora9/x86_64/initrd.img
>     installnic=eth0
>     interface=eth0
>     kcmdline=nofb utf8 ks=http://10.1.1.1/install/autoinst/node002 ksdevice=eth0 noipv6
>     kernel=xcat/fedora9/x86_64/vmlinuz
>     mac=00:30:48:C6:27:F4
>     mgt=ipmi
>     netboot=pxe
>     nfsdir=/install
>     nfsserver=10.1.1.1
>     nodetype=osi
>     os=fedora9
>     postscripts=syslog,remoteshell,otherpkgs,syncfiles
>     power=ipmi
>     primarynic=eth0
>     profile=compute
>     rack=Main
>     room=Tekmira
>     slot=3
>     tftpserver=10.1.1.1
>     xcatmaster=10.1.1.1
>
> ==================================================================
> tabdump site
> ==================================================================
>
> #key,value,comments,disable
> "xcatdport","3001",,
> "xcatiport","3002",,
> "tftpdir","/tftpboot",,
> "master","10.1.1.1",,
> "domain","cluster.priv",,
> "installdir","/install",,
> "timezone","America/Vancouver",,
> "nameservers","10.1.1.1",,
> "dhcpinterfaces","eth0",,
> "forwarders","172.30.7.195,172.30.7.196",,
>
> Hardware:
>
>     - CrayCX1 with currently three nodes, expecting many more by the end of the year. Built-in GigE switch and InfiniBand switch.
>
>     - OS: Fedora 9
>
>     - Each node has two Ethernet cards (eth0: private cluster network, eth1: public company network) and one InfiniBand port.
>
>     - The management node has a BMC module accessible via TCP/IP.
>
> xCAT installation:
>
> Installed Packages
> xCAT.x86_64                     2.3-snap200908270907    installed
> xCAT-UI.noarch                  4:2.3-snap200908200921  installed
> xCAT-client.noarch              4:2.3-snap200908270859  installed
> xCAT-nbkernel-ppc64.noarch      1:2.6.18_92-4           installed
> xCAT-nbkernel-x86.noarch        1:2.6.18_92-8           installed
> xCAT-nbkernel-x86_64.noarch     1:2.6.18_92-8           installed
> xCAT-nbroot-core-ppc64.noarch   4:2.3-snap200906231531  installed
> xCAT-nbroot-core-x86.noarch     4:2.3-snap200906231531  installed
> xCAT-nbroot-core-x86_64.noarch  4:2.3-snap200906231531  installed
> xCAT-nbroot-oss-ppc64.noarch    2.0-snap200801291320    installed
> xCAT-nbroot-oss-x86.noarch      2.0-snap200804021050    installed
> xCAT-nbroot-oss-x86_64.noarch   2.0-snap200801291344    installed
> xCAT-server.noarch              4:2.3-snap200908270900  installed
>
> Thanks for the help,
>
> Andreas
>
> ------------------------------------------------
> Andreas Putz
> AFCC
> AFCC Automotive Fuel Cell Cooperation Corp.
> 9000 Glenlyon Parkway
> Burnaby, BC V5J 5J8
> Canada
>
> Phone: +1 604 453 3744
> Fax:   +1 604 415 7291
> Email: andreas.putz at afcc-auto.com
> www.afcc-auto.com