We have purchased 21 additional nodes. They all came with SATA hard disks,
and their motherboards could handle only one SATA disk each.
Therefore it was not possible to clone in the old way, by preinstalling one IDE
(ATA) disk, inserting another one into the same machine and cloning as described
in Part I. There were several ways to proceed:
1. Simple and tiresome. Just install each machine from CDs fully.
2. Half-simple and half-tiresome. Install a minimal system, then ftp the partitions to be cloned and install them over the minimal system.
3. Manual. Store images onto an IDE (ATA) disk, insert ATA and SATA disks into one machine, and copy.
4. Professional. Use netcat and copy over the network.
The first two were too tiresome for me, and the third way required a spare HDD that was not at hand. I decided to try to do things right, at least once in my life. So below is a short description of what worked for me.
Note that I mostly followed what others wrote on the subject (the netcat part). The netcat command is probably not in your standard distribution; it should be obtained from the Web and stored on the master node in advance. Installation with netcat has three steps.
1. Preinstall one node (master) and boot from it with local network support.
2. Boot each new node (slave) from a CD and set up the local network. Both master and slave should be connected to a network hub, preferably Gigabit.
3. Copy disks with netcat.
Here are all steps in some detail.
1. I had to use Fedora Core 3 as older versions would not work on my hardware. Otherwise there was no specific trouble.
2. This was the hard part. Knoppix 3.7 has two kernels, 2.4.* and 2.6.*. The former boots but does not recognise the SATA hard disk. The latter, labelled "experimental", lives up to its name and hangs. I tried two versions of Knoppix, the Japanese one and the original German one, and neither worked.
Next I searched the Web some more and found an OSSACC Fedora Core 3 bootable disk. The catch was that both the homepage where it was located and the disk itself were in Chinese. You had to know Chinese just to navigate the page and download the disk. Fortunately, with the Great Chinese Expansion, there is likely to be a Chinese colleague around anywhere in the world to help.
The important thing is that the bootable disk worked, but it did not have the netcat program! Now, our nodes did not have a floppy drive. The CD from which I booted could not be removed (Fedora Core grabbed it with both hands and would not let it go). One way is to type in the source code (not so long ago people did type in boot loader binary code every time they booted. Fortunately, there was no Microsoft at that time and rebooting was not needed as often as nowadays); luckily, the network offers an easier one. So the important thing was to permit outside network connexions TO the master machine, via sshd for example.
Another annoying thing about that OSSACC disk is that it tried to start X Windows, which I did not need (and all the X menus were in Chinese, too). To my great delight the X system failed to start 95% of the time, for a reason unknown to me (it seems possible that my frantically pressing CTRL-C had some effect. There was no help on boot options to avoid running X).
Here is the list of commands I had to execute on each of the 20 slave nodes:
ifconfig eth0 192.168.1.2 up
scp 192.168.1.1:netcat .
Comments. The first command activates the network. You should also execute something like "ifconfig eth0 192.168.1.1 up" on the master node. The second command copies netcat from the master's ~root/netcat, where you should put it before executing that command. If you do not know how to scp to the superuser account, you can create an ordinary account on the master, say joe, put netcat in its home directory, and do this instead:
scp joe@192.168.1.1:netcat .
3. Cloning itself.
There are a few ways here. I have partitioned the disk in this way:
So the important part to clone was at the beginning of the disk, the total size being 256 M + 3 G.
One can run fdisk on each node manually but I decided to simply copy the relevant part of /dev/sda, including MBR. There was one problem that I failed to understand. By default, the following entry appears in /etc/fstab of the master node:
LABEL=/work /work ext3 defaults 1 2
This is supposed to mean that /dev/sda3 lies in an extended partition (despite my choosing "put in a separate partition" when installing Linux on the master). However, if one copies only a fraction of /dev/sda and tries booting from the resulting disk, very weird behaviour is observed: the USB keyboard stops functioning and /work gives file system errors. The solution is to change the above entry on the master, before copying, to:
/dev/sda4 /work ext3 defaults 0 0
Then everything works.
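The fstab change can be rehearsed on a scratch file before touching the master's real /etc/fstab. The sketch below is only an illustration: the "LABEL=/" line in the stand-in file is a made-up example entry, and only the /work lines come from the text above.

```shell
#!/bin/sh
# Practice the /etc/fstab change on a scratch copy, not the live file.
set -e
scratch=$(mktemp)
trap 'rm -f "$scratch"' EXIT

# Stand-in for the master's fstab. The first line is hypothetical;
# only the /work entry is taken from the text.
cat > "$scratch" <<'EOF'
LABEL=/ / ext3 defaults 1 1
LABEL=/work /work ext3 defaults 1 2
EOF

# Replace the label-based /work entry with the device-based one.
sed 's|^LABEL=/work[[:space:]].*$|/dev/sda4 /work ext3 defaults 0 0|' \
    "$scratch" > "$scratch.new"
mv "$scratch.new" "$scratch"

cat "$scratch"
```

Once the output looks right, the same sed expression could be applied to a backup copy of the real file and the result compared before installing it.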
The command to be executed on the master:
dd if=/dev/sda bs=65536 count=60000 | ./netcat 192.168.1.2 9000
Explanation: send the first 60000 blocks of 64K each, about 3.6 G, from the HDD to 192.168.1.2 (the slave). 3.6 G is a slight overkill. Copying the whole disk is also possible but is a waste of time (an 80 G disk took 70 min). For some strange reason the command never terminates. However, it reports sending the required amount of data when done. After that, CTRL-C should be pressed to terminate it.
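The block count can be checked with a little shell arithmetic. This is a rough sketch, assuming that "256 M" and "3 G" are binary units and that the two partitions sit back to back at the start of the disk (the MBR and any partition start offset are ignored); the variable names are mine. It needs a shell with 64-bit arithmetic, which is the usual case on a 64-bit Linux.

```shell
#!/bin/sh
# Sizes from the text, read as binary units (an assumption).
block=65536                        # dd block size in bytes (bs=65536)
count=60000                        # dd block count used on the master
part1=$((256 * 1024 * 1024))       # the 256 M part
part2=$((3 * 1024 * 1024 * 1024))  # the 3 G part

needed=$((part1 + part2))
sent=$((block * count))
min_count=$(( (needed + block - 1) / block ))  # round up to whole blocks
surplus_mib=$(( (sent - needed) / 1024 / 1024 ))

echo "needed:    $needed bytes"
echo "sent:      $sent bytes"
echo "min count: $min_count blocks"
echo "surplus:   $surplus_mib MiB"
```

Under these assumptions the two partitions fit into 53248 blocks, so count=60000 leaves a margin of 422 MiB, which is the "slight overkill" mentioned above.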
The command to be executed on the slave:
./netcat -l -p 9000 | dd of=/dev/sda
NB. Run netcat on the slave before running it on the master! Unless somebody is listening (the slave), the master will not waste effort yelling over IP! When the master terminates, the slave does too. Do not terminate netcat on the slave yourself, to avoid losing data.
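Before pointing dd at a real disk, the partial copy can be rehearsed on scratch files. The following is a minimal sketch, not the procedure itself: ordinary files stand in for /dev/sda on both sides, a plain pipe stands in for the netcat connection, and the file names and the small block counts are arbitrary choices for the demonstration.

```shell
#!/bin/sh
# Rehearsal of the partial copy on scratch files instead of real disks.
set -e
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
src="$workdir/master.img"   # stands in for the master's /dev/sda
dst="$workdir/slave.img"    # stands in for the slave's /dev/sda

# 16 blocks of 64K of random data play the role of the master disk.
dd if=/dev/urandom of="$src" bs=65536 count=16 2>/dev/null

# Same shape as the real pipeline: dd reads only the leading blocks,
# a pipe (here a plain one, in place of netcat) carries them,
# and a second dd writes them out on the receiving side.
dd if="$src" bs=65536 count=4 2>/dev/null | dd of="$dst" 2>/dev/null

copied=$(wc -c < "$dst" | tr -d ' ')
echo "copied $copied bytes"
```

The copy should be exactly bs times count bytes long and identical to the start of the source, which can be confirmed with cmp.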
This is really the end of cloning. Next, reboot the slave from the HDD and finish the installation.
mkfs -t ext3 /dev/sda4
The latter two files contain the slave host name and the IP address, so this is where you should christen the new family member and assign it an SSN too.