Some questions regarding clusters

Elledan

Banned
Jul 24, 2000
8,880
0
0
For the cluster I'll be building, I've chosen MOSIX. However, I still have a couple of questions regarding MOSIX to which I couldn't find the answer anywhere (or didn't understand it):

First of all, is a MOSIX cluster able to share all of the RAM of the nodes among those nodes, like in an SMP system? I.e., if all 10 nodes in a cluster have 128 MB of RAM, will a node in this cluster be able to use the RAM of the other nodes?
So far I've only encountered the term 'memory ushering', with which I'm not familiar.
If MOSIX doesn't provide such a feature (sharing memory among the nodes), are there any other cluster-types which do support this?

Then, to save costs and time spent on maintenance, I would prefer to use diskless nodes, i.e. the nodes will have a NIC with a bootROM on it, so that they will load a remote filesystem. Is this possible? And if so, would it be a good idea? It'll use up quite a bit of RAM on the nodes, so perhaps it's not such a bright idea?
If a HD per node is a better idea, I'll still have to install the OS on each single node, which will take quite some time. Any tips on simplifying this would be more than welcome. Not all of the nodes will have the same hardware configuration, so 'ghosting' is not possible, as far as I know.

To end this post, one final question remains:
For the nodes, I'll be mounting the mainboard and the other stuff on a triplex (plywood) board, with wooden stand-offs for the mainboard. The PCI/AGP cards are supported by a metal rail. There are two such rails, one on each side of the board. These will be used to secure the nodes when placed against each other (vertically or horizontally).
Does this configuration seem 'good' enough?

Thanks for your time :)
 

J.Zorg

Member
Feb 20, 2000
47
0
0
As far as I know, MOSIX does not share the RAM between all nodes (not sure). The problem with memory sharing is network latency. Normally you have 7-10 ns RAM latency, but network latency is maybe 10 ms. That's a factor of about a million... I think this would slow your system down too much.
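For reference, with those assumed figures (10 ns RAM access vs. 10 ms network round trip) the ratio works out to:

```shell
# Assumed latencies, expressed in nanoseconds
ram_ns=10                      # ~10 ns RAM access
net_ns=$((10 * 1000 * 1000))   # ~10 ms network latency = 10,000,000 ns

echo "slowdown factor: $((net_ns / ram_ns))"   # prints: slowdown factor: 1000000
```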

MOSIX can be run with diskless clients. You need boot floppies with a special MOSIX kernel and RARP, DHCP, BOOTP and NFS support, plus an NFS server providing the clients with a small Linux system.
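For anyone wondering what the server side of that looks like: it typically boils down to a DHCP/BOOTP entry per node plus an NFS export of its root filesystem. A minimal sketch (all paths, MAC and IP addresses here are made-up examples, not from the HOWTOs):

```shell
# /etc/exports on the NFS server: export a small root FS for each node
/tftpboot/node1  192.168.1.101(ro,no_root_squash)

# /etc/dhcpd.conf fragment: hand node1 its address and NFS root path
host node1 {
    hardware ethernet 00:50:56:aa:bb:01;   # the node's NIC MAC (example)
    fixed-address    192.168.1.101;
    option root-path "/tftpboot/node1";
}
```

The node's kernel then mounts `root-path` over NFS instead of a local disk.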

If you need more information:
Mosix
Diskless Howto
Mosixview
Diskless Mosix Howto
Beowulf Howto
 

Elledan

Banned
Jul 24, 2000
8,880
0
0
Thanks, J.Zorg :)

I just found out that 'shared memory' support is on its way for MOSIX and some other cluster types. Beowulf and the rest don't support shared memory and won't for a long time.

However, some of the simulations I'll be running on this cluster will require lots of RAM (multiple GBs), so the lack of shared memory functionality would kinda ruin the whole reason for using a cluster in the first place. An SMP machine would then make more sense.

Before I decide whether or not to use diskless nodes, I'll have to know exactly how much RAM the average FS will take. Anyone got some information on this?
 

Armitage

Banned
Feb 23, 2001
8,086
0
0
If you need lots of memory, then diskless nodes are probably not the best answer due to the VM swapping. Of course, you want to have enough RAM onboard to avoid swapping, but if it does have to swap, doing it over the network will really suck. The same argument holds for shared memory on a cluster: you might be able to do it, but at such a performance penalty that you may as well not.

If you do go diskless, I would recommend the boot-floppy option over a NIC with a boot ROM. More flexible and probably cheaper. There is a HOWTO on network booting for Linux; you should be able to find it at linuxdoc.


 

Elledan

Banned
Jul 24, 2000
8,880
0
0
I'm still divided on the diskless vs. disk issue... If I just had a quick and easy way to prepare the disks for the nodes, I'd choose HDs in a heartbeat.
 

Armitage

Banned
Feb 23, 2001
8,086
0
0


<< If I just had a quick and easy way to prepare the disks for the nodes, I would choose for HD's in a heartbeat. >>



Well, that part is easy.
Just get one system set up the way you want it. Then connect the drives for the other nodes to the second controller in the first machine, and use dd to copy the contents of the set-up drive to the new drive.
Now you can put that new drive in another node and boot it up. All you'll have to change is the hostname & IP address (assuming the hardware is identical). And now you have two systems set up, so you can copy two drives at a time, then 4, 8, 16... :D

I did this for a 20 node cluster about a year ago. Had everything up in just over a day.
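For anyone following along: dd does a raw block-for-block copy, so on real drives this would be something like `dd if=/dev/hda of=/dev/hdc bs=64k` (device names depend on which controller the blank drive hangs off — treat those as example names). A safe demo of the same idea with ordinary files:

```shell
# make a 64 KB "source disk" image filled with random data
dd if=/dev/urandom of=source.img bs=1024 count=64 2>/dev/null

# clone it byte-for-byte; a larger block size just speeds the copy up
dd if=source.img of=clone.img bs=64k 2>/dev/null

# verify the clone matches the original exactly
cmp -s source.img clone.img && echo "identical"   # prints: identical
```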
 

Elledan

Banned
Jul 24, 2000
8,880
0
0
Well, the problem is that the hardware of the nodes is not identical, but even if only half of the nodes have the same hardware configuration, it would save some time.

By the way, what's 'dd'? Is it a way to copy a medium bit for bit?
 

Armitage

Banned
Feb 23, 2001
8,086
0
0


<< Well, the problem is that the hardware of the nodes is not identical, but even if only half of the nodes have the same hardware configuration, it would save some time. >>



How different is the hardware? You might just have to install different drivers or something. In fact, if you use Red Hat, it might detect the difference on the first boot-up and offer to make the changes for you automagically. Maybe other distros can do that too.



<< By the way, what's 'dd'? Is it a way to copy a medium bit for bit? >>



Yeah, basically. See man dd for the details. The interface is a little different from most copy commands.
 

Elledan

Banned
Jul 24, 2000
8,880
0
0


<<

<< Well, the problem is that the hardware of the nodes is not identical, but even if only half of the nodes have the same hardware configuration, it would save some time. >>



How different is the hardware? You might just have to install different drivers or something. In fact, if you use Red Hat, it might detect the difference on the first boot-up and offer to make the changes for you automagically. Maybe other distros can do that too.
>>

I've yet to receive most of the nodes, but they'll probably be mainly Pentiums. The chipsets used might vary quite a bit, though (VIA, Intel... you name it ;) )

I'm still considering which distro to use. I've got the CDs of SuSE 6.3 and Slackware 8.0 lying around (and Mandrake 7.something :p ). I could also just try Slackware first and use SuSE if it fails.

As for the drivers: from what I know about Linux, it has little trouble with changed peripherals on start-up. I'm less certain about chipset drivers, though.



<<

<< By the way, what's 'dd'? Is it a way to copy a medium bit for bit? >>



Yea, basically. See man dd for the details. The interface is a little different.
>>

Okay, thanks :)
 

J.Zorg

Member
Feb 20, 2000
47
0
0
The different hardware setups shouldn't be a problem if you build your own kernel with everything you need for all machines (might be a huge kernel ;-) ). During startup the kernel will automatically determine what it needs on each system. You might have to use i386 or Pentium optimization so that all the different CPUs are supported. Simply compile everything into the kernel and don't leave anything as a module, because modules would mean additional configuration. If you use DHCP you only have to change the hostname for every node.
Simply install one machine, build your own "one-kernel-for-all-hardware-setups", and configure everything else. Then make a Norton Ghost image or PowerQuest Drive Image and install it on the other systems. You might need a FAT32 partition to dump your image on (don't know if they can dump to Linux partitions). Norton has some kind of network support (haven't really checked it out). That means you boot from a floppy and transfer the image over the network. (CD or HD is also possible, but with many hosts the network should be faster.)
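A 2.4-era build of such an everything-built-in kernel would look roughly like this (a sketch from memory, assuming the kernel source sits in /usr/src/linux and LILO is the bootloader):

```shell
cd /usr/src/linux
make mrproper       # start from a clean tree
make menuconfig     # set Processor family to 386; answer Y (built-in), not M, to every driver any node needs
make dep bzImage    # 2.4.x: resolve dependencies, then build the compressed kernel image
cp arch/i386/boot/bzImage /boot/vmlinuz-cluster
# then point lilo.conf at /boot/vmlinuz-cluster and rerun lilo on each node
```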

Don't use SuSE 6.3! It's too old; 7.3 is the latest version. Get it from one of their FTP mirrors or maybe buy a new version.
 

m0ti

Senior member
Jul 6, 2001
975
0
0
Network support (at least in Ghost) is master/slave: you boot up one machine as the master and the other as a slave. It's really rather simple. Just set up the disk through the Ghost Boot Manager (or whatever it's called) and make sure your drivers are on it. As for Linux and hardware, the huge-kernel approach should definitely work. It'll take a while for the systems to come up, but you don't care about that, since they'll be on 24/7 anyway.