[xcat-user] Deployment on xSeries 3455
Antonis A. Constantinou
a.constantinou at newcytech.com
Wed Sep 3 12:47:18 MDT 2008
Dear all,
I am currently re-installing a cluster with twelve x3455 compute nodes and an x3655 management node. The cluster was running xCAT 1.3, but since it was barely used we decided to remove everything and rebuild the cluster with xCAT 2.
I have installed RHEL 5 x86_64 on the management node and configured the cluster by following the cookbook on xcat.org, together with references from my old xCAT 1.3 configuration files. Since the xCAT 1.3 cluster I set up before was my first experience with xCAT, I have some questions regarding xCAT 2.0. I also have some remarks which may help others.
As far as I understand, xCAT 2 promotes the idea of stateless nodes. My compute nodes each have a single SAS hard drive, and we would like to use this drive to run stateful nodes installed to disk.
Each node also has two Ethernet controllers. eth0 is used for cluster communication and node management; eth1 is the public interface, on a different subnet and VLAN. Since I had some issues with this setup on xCAT 1.3, I would like to ask what to pay attention to when configuring the cluster like this. I have set up DHCP with the MAC addresses for eth0, and each node takes its predefined IP from the management node at boot. The public address, as I see it, is not an issue and can be any address on the public subnet (it is only there so the nodes have internet access and can register with Red Hat).
The first problem I had to face was that I could not access the BMCs of the nodes from the management node. As I remember, during the xCAT 1.3 setup the BMCs were configured by xCAT. When I tried to access the nodes now, I always got an "Invalid username or password" error. In the end I had to manually reset all BMCs to defaults using the bmc_cfg utility. This solved the problem and I now have full access to all the nodes. Conserver works fine and I can get SOL on the nodes (this did not work on xCAT 1.3).
The second problem, which I am currently stuck on, is that I cannot deploy the nodes. Although DHCP, atftp and NFS are configured correctly as far as I can tell, the boot process does not take place. After finishing the configuration of the management node, I noticed that the copycds command is a little different in xCAT 2 and you have to provide an ISO image. What I did was use the dd command to create an ISO image of the RHEL 5 DVD-ROM. Then I used the copycds command with the correct arguments to copy the ISO image to the right place. I can see that all the contents of the DVD are under /install/rhel5/x86_64. Then I used the rinstall command to start the installation on one node. I can also see the kernel image under /tftpboot/xcat/rhel5/x86_64, and under /tftpboot/pxelinux.cfg there is a node01 file with the following contents:
DEFAULT xCAT
LABEL xCAT
KERNEL xcat/rhel5/x86_64/vmlinuz
APPEND initrd=xcat/rhel5/x86_64/initrd.img nofb utf8 ks=http://masternode/install/autoinst/node01 ksdevice=eth0 console=ttyS0,19200n8r noipv6
The HTTP server works (xCAT 1.3 did not use HTTP, as far as I remember), but when node01 boots it takes its predefined IP address, then I get a TFTP message on the screen and nothing happens. After some seconds I get a failure message and the node goes on to try booting from the other network adapter.
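One way to narrow down a failure at this stage is to confirm that the kernel and initrd named in the PXE entry really exist under the TFTP root. The following is a minimal sketch in Python, not part of xCAT: the function names parse_pxe_entry and missing_boot_files are hypothetical, and it assumes the TFTP root is /tftpboot, as the paths above suggest.

```python
import os

# Sample entry mirroring the node01 file quoted above.
SAMPLE = """DEFAULT xCAT
LABEL xCAT
KERNEL xcat/rhel5/x86_64/vmlinuz
APPEND initrd=xcat/rhel5/x86_64/initrd.img nofb utf8 ks=http://masternode/install/autoinst/node01 ksdevice=eth0 console=ttyS0,19200n8r noipv6
"""


def parse_pxe_entry(text):
    """Extract the kernel path, initrd path and ks= URL from a pxelinux entry."""
    kernel = initrd = ks = None
    for line in text.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) != 2:
            continue
        keyword, rest = parts[0].upper(), parts[1]
        if keyword == "KERNEL":
            kernel = rest.strip()
        elif keyword == "APPEND":
            # APPEND carries space-separated kernel arguments.
            for token in rest.split():
                if token.startswith("initrd="):
                    initrd = token[len("initrd="):]
                elif token.startswith("ks="):
                    ks = token[len("ks="):]
    return {"kernel": kernel, "initrd": initrd, "ks": ks}


def missing_boot_files(entry, tftp_root="/tftpboot"):
    """Return which of 'kernel' and 'initrd' are absent under tftp_root.

    tftp_root defaults to /tftpboot, which is an assumption based on the
    paths mentioned in this message.
    """
    missing = []
    for key in ("kernel", "initrd"):
        rel = entry.get(key)
        if rel is None or not os.path.isfile(os.path.join(tftp_root, rel)):
            missing.append(key)
    return missing


entry = parse_pxe_entry(SAMPLE)
print("parsed entry:", entry)
print("missing under /tftpboot:", missing_boot_files(entry))
```

On the management node, the same functions could be run against the contents of /tftpboot/pxelinux.cfg/node01 instead of the sample string. An empty result only shows that the files exist; it does not show that the TFTP server is actually serving them to the node.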
Any help will be appreciated, since I am not very experienced with xCAT and all my efforts to solve this problem have failed.
Regards
Antonis Constantinou
NewCytech Business Solutions