Cloning revisited.
We have purchased 21 additional nodes, and they all came with SATA hard disks,
and their motherboards could only handle one SATA disk...
Therefore it was not possible to clone in the old way, by preinstalling one IDE
(ATA) disk, inserting another one into the same machine and cloning as described
in Part I. There were several ways to proceed:
1. Simple and tiresome. Just install each machine from CDs fully.
2. Half-simple and half-tiresome. Install minimum version and then ftp
partitions to be cloned and install over the minimum version.
3. Manual. Store images onto an IDE (ATA) disk, insert ATA and SATA disks
into one machine, and copy.
4. Professional. Use netcat and copy over network.
The first two were too tiresome for me, and the third way required a spare HDD
that was not at hand. I decided to try to do things right, at least once in
my life. So below is a short description of what worked for me.
Note that I mostly followed what others wrote on the subject (the netcat part).
The netcat command is probably not in your standard distribution; it should be
obtained from the Web and stored in advance on the master node.
Installation with netcat has got 3 steps.
1. Preinstall one node (master) and boot from it with local network support.
2. Boot each new node (slave) from a CD and set up the local network. Both
master and slave should be connected to a network hub, preferably Gigabit.
3. Copy disks with netcat.
Here are all steps in some detail.
1. I had to use Fedora Core 3 as older versions would not work on my
hardware. Otherwise there was no specific trouble.
2. This was the hard part. Knoppix 3.7 has two kernels, 2.4.* and 2.6.*.
The former boots but does not recognise the SATA hard disk. The latter,
labelled "experimental", lives up to its name and hangs.
I tried two versions of Knoppix, the Japanese one and the original German one,
and neither worked. Next I searched the Web further and found an OSSACC Fedora Core 3 bootable
disk. The catch was that the homepage where it was located and the disk itself
were in Chinese. You had to know Chinese just to surf the page and be able to
download the disk. Fortunately, with the Great Chinese Expansion, there is
likely to be a Chinese colleague around anywhere in the world to help.
The important thing is that the bootable disk worked, but it did not have
the netcat program! Now, our nodes did not have a floppy drive, and the CD from
which I booted could not be removed (Fedora Core grabbed it with both hands
and would not let it go). One way out is to type in the source code by hand
(not so long ago people did type in boot loader binary code every time they
booted; fortunately, there was no Microsoft at that time and rebooting was not
needed as often as nowadays). Luckily, the network offers an easier way.
So the important thing was to permit outside network connexions TO the master
machine, for example by running sshd there. Another annoying thing about that
OSSACC disk is that it tried to start X Windows, which I did not need (and all
the X menus were in Chinese, too).
To my great delight, the X system failed to start 95% of the time, for a
reason unknown to me (it seems possible that my frantically pressing CTRL-C had
some effect; there was no help on boot options to avoid running X).
Here is the list of commands I had to execute on each of 20 slave nodes:
ifconfig eth0 192.168.1.2 up
scp 192.168.1.1:netcat .
Comments. The first command activates the network. You should also execute
something like "ifconfig eth0 192.168.1.1 up" on the master node.
The second command copies netcat from the master's ~root/netcat where you
should put it before executing that command. If you do not know how to do scp
to the superuser account, you can still create a simple account, say, joe and
then do instead:
scp joe@192.168.1.1:netcat .
3. Cloning itself.
There are a few ways here. I have partitioned the disk in this way:
/dev/sda1 /
/dev/sda2 /usr
/dev/sda3 swap
/dev/sda4 /work
So the important part to clone was in the beginning, the total size being
256 M + 3 G
One can run fdisk on each node manually but I decided to simply copy
the relevant part of /dev/sda, including MBR. There was one problem that I
failed to understand. By default, the following entry appears in /etc/fstab
of the master node:
LABEL=/work /work ext3 defaults 1 2
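This is supposed to mean that /dev/sda4 lies in an extended partition (despite my choosing "put in a separate partition" when installing Linux on the master).
(Strictly speaking, a LABEL= entry selects the file system by its label rather
than by its device name, so a /dev/sda4 that is created afresh on the slave may
simply lack that label.)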
However, if one copies only a fraction of /dev/sda, and tries booting from this
disk, then very weird behaviour is
observed: USB keyboard stops functioning and /work gives file system errors.
The solution is to change the above entry on the master before copying into:
/dev/sda4 /work ext3 defaults 0 0
Then everything works.
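The fstab change can also be made with sed instead of an editor. What follows
is only a sketch: fix_work_entry is a hypothetical helper name, and the sample
fstab lines are made up for the demonstration, which runs on a scratch file
rather than on the real /etc/fstab.

```shell
# Sketch: replace the LABEL=/work entry with the device-name entry.
# fix_work_entry is a hypothetical helper; it takes an fstab path and
# prints the modified contents without touching the file itself.
fix_work_entry() {
    sed 's|^LABEL=/work[[:space:]].*$|/dev/sda4 /work ext3 defaults 0 0|' "$1"
}

# Demonstration on a scratch file with two sample (assumed) entries.
tmp_fstab=$(mktemp)
printf '%s\n' 'LABEL=/ / ext3 defaults 1 1' \
              'LABEL=/work /work ext3 defaults 1 2' > "$tmp_fstab"
fixed=$(fix_work_entry "$tmp_fstab")
printf '%s\n' "$fixed"
rm -f "$tmp_fstab"
```

Once the output looks right, the same sed expression can be applied to a copy
of /etc/fstab on the master and the result moved into place.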
The command to be executed on the master:
dd if=/dev/sda bs=65536 count=60000 | ./netcat 192.168.1.2 9000
Explanation: send the initial 60000*64K, or about 3.6 gigabytes of HDD
to 192.168.1.2 (slave). 3.6 G is a slight overkill. Copying the whole disk is
also possible but is a waste of time (80 G disk took 70 min). For some strange
reason the command never terminates.
However, it reports sending the required amount of data when done. After that
CTRL-C should be pressed to terminate.
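To see why count=60000 is enough, the block count can be checked with shell
arithmetic. This sketch assumes that 256 M and 3 G are binary units and ignores
the small gap before the first partition; the variable names are only
illustrative.

```shell
# Sizes taken from the partition layout: 256 MB for / plus 3 GB for /usr.
block_size=65536
needed_bytes=$(( 256 * 1024 * 1024 + 3 * 1024 * 1024 * 1024 ))
# Round up to whole 64 KB blocks.
needed_blocks=$(( (needed_bytes + block_size - 1) / block_size ))
# What the dd command with count=60000 actually sends.
sent_bytes=$(( 60000 * block_size ))
echo "blocks needed: $needed_blocks"
echo "bytes sent with count=60000: $sent_bytes"
```

Under these assumptions 53248 blocks would cover both partitions, so 60000
leaves a margin of a few hundred megabytes.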
The commands to be executed on the slave:
./netcat -l -p 9000 | dd of=/dev/sda
NB. Run netcat on the slave before running it on the master! Unless
somebody is listening (slave), the master will not waste efforts yelling over
IP! When the master terminates, the slave does too. Do not terminate netcat on
the slave yourself, to avoid losing data.
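Since the master has to be stopped by hand, it may be worth checking that the
copy is complete. One possible check, not part of the original procedure, is
to checksum the same number of blocks on both machines and compare the results
by eye. The helper name image_sum is hypothetical, and the demonstration below
uses scratch files instead of real disks.

```shell
# image_sum is a hypothetical helper: print the cksum (CRC and byte count)
# of the first COUNT 64 KB blocks of a device or file.
image_sum() {
    dd if="$1" bs=65536 count="$2" 2>/dev/null | cksum
}

# Demonstration on scratch files: copy 3 blocks through a pipe, as the
# netcat transfer does, then compare the checksums of source and copy.
src=$(mktemp)
dst=$(mktemp)
dd if=/dev/urandom of="$src" bs=65536 count=4 2>/dev/null
dd if="$src" bs=65536 count=3 2>/dev/null | dd of="$dst" 2>/dev/null
sum_src=$(image_sum "$src" 3)
sum_dst=$(image_sum "$dst" 3)
[ "$sum_src" = "$sum_dst" ] && echo "copies match"
rm -f "$src" "$dst"
```

On the real machines this would be image_sum /dev/sda 60000, run once on the
master and once on the slave before rebooting it.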
This is really the end of cloning. Next, reboot the slave from the HDD
and finish the installation:
mkswap /dev/sda3
mkfs -t ext3 /dev/sda4
vi /etc/sysconfig/network
vi /etc/sysconfig/network-scripts/ifcfg-eth0
The latter two files contain the slave host name and the IP address, so this
is where you should christen the new family member and assign it an SSN too.
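With 20 slaves, the two files could also be generated rather than edited by
hand. The sketch below is an assumption on my part, not what was actually done:
write_node_config is a hypothetical helper, the node02 naming and the netmask
are examples, and only the usual keys are written. The cloned files may hold
other settings, so compare before overwriting anything.

```shell
# write_node_config is a hypothetical helper.
# Arguments: target directory, host name, IP address.
write_node_config() {
    dir=$1; host=$2; ip=$3
    mkdir -p "$dir/network-scripts"
    cat > "$dir/network" <<EOF
NETWORKING=yes
HOSTNAME=$host
EOF
    cat > "$dir/network-scripts/ifcfg-eth0" <<EOF
DEVICE=eth0
BOOTPROTO=static
IPADDR=$ip
NETMASK=255.255.255.0
ONBOOT=yes
EOF
}

# Write into a scratch directory for inspection; on a real node the
# target directory would be /etc/sysconfig.
out=$(mktemp -d)
write_node_config "$out" node02 192.168.1.2
cat "$out/network" "$out/network-scripts/ifcfg-eth0"
```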
The end!