
Linux iSCSI Target

spyordie007

We've got a spare HP DL385 in our lab with 6x146GB of storage and I'm thinking about running Linux on it with one of the open source iSCSI target implementations out there so that we can take advantage of some of the HA features in ESX (we have 2 more DL385s in the lab running ESX 3).

Anyone have some experience using Linux boxes as iSCSI servers and have suggestions about distros to use, iSCSI target implementations, etc.? My Linux knowledge is moderate-at-best, so anything you can offer that would make this easier would be much appreciated.

Tentatively I'm thinking either a minimal install of FC5 with iSCSI Enterprise Target, or just OpenFiler.

Thanks in advance,

Erik
 
BTW another option would be to put Windows Server on it (which we have licensing for), but I'm not aware of any inexpensive iSCSI target implementations for it.
 
This looks like a pretty good guide for Fedora Core, however I would STRONGLY suggest using CentOS instead of Fedora. The Fedora guide should work verbatim for CentOS.

I set up a Linux iSCSI target a few years ago and got it working. I remember it wasn't easy, but the target software was very new then and there was pretty much zero documentation. But it worked. It worked great, even though I believe the software was beta at the time, if not alpha. We didn't keep it, but only because we decided to go with fibre channel instead.

Personally, I think I would go with OpenFiler now though, if only for the ease of scheduling snapshots (unless you are already comfortable doing that manually with LVM).
 
Originally posted by: spyordie007
We've got a spare HP DL385 in our lab with 6x146GB of storage and I'm thinking about running Linux on it with one of the open source iSCSI target implementations out there so that we can take advantage of some of the HA features in ESX (we have 2 more DL385s in the lab running ESX 3).

Anyone have some experience using Linux boxes as iSCSI servers and have suggestions about distros to use, iSCSI target implementations, etc.? My Linux knowledge is moderate-at-best, so anything you can offer that would make this easier would be much appreciated.

Right now I am typing on a computer that boots over PXE and uses iSCSI for its hard drive. No local disk at all. It works out decently well, but I wouldn't recommend it. The Open-iSCSI initiator isn't able to effectively transfer control from one daemon to another, and you have to do that if you're setting up the connection in the initrd (as I am) and doing a root pivot. It'll get fixed as it matures.

It's stable. It's fast. I have a Linux server running software raid 5 that serves out 'scsi disks' which are really just logical volumes. In a setup like that it's almost as fast as local disk access.

I was a bit worried about stability, so I hammered it as much as I could with different file system benchmarks and such, and it worked fine. I kept track of some of the results: I took the CSV output from Bonnie++, imported it into OpenOffice.org Calc, and figured out how to make a pretty little graph, which luckily I still have:
(note that all of the following was done before I got rid of my local disk... which I added to the raid 5 array by expanding it 🙂 )
http://img179.imageshack.us/my.php?image=resultshj5.png

and I have other stuff I saved in notes:

The 'tester' file is a 1.9 GB file made up of random garbage. I wanted it big because I care about large media files, and I needed it big enough that it couldn't be cached entirely in RAM, to get a more accurate result.
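The post doesn't show how 'tester' was made; a plausible reconstruction with dd (scaled down to 16 MB here so it runs quickly; something like bs=1M count=1900 would give the real ~1.9 GB file) would be:

```shell
# Fill a file with random garbage from /dev/urandom.  count=16 here is
# only for demonstration; the original file was big enough (~1.9 GB)
# that it could not be cached entirely in RAM.
dd if=/dev/urandom of=tester bs=1M count=16 2>/dev/null
ls -l tester
```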

sshfs file transfer speed

real 3m41.131s
user 0m0.468s
sys 0m6.676s

SMB file transfer speed

real 2m45.295s
user 0m0.308s
sys 0m9.161s


NFS file transfer speed

real 1m44.998s
user 0m0.040s
sys 0m4.100s

iSCSI file transfer speed

real 0m49.336s
user 0m0.268s
sys 0m6.648s

For the same file using netcat, which should indicate raw transfer speed, the commands go like this:
On the 'server' you would run:
nc -l -p 8000 > tester
and on the 'client':
time nc remote.machine 8000 -q 0 < tester

For a result I got:
real 0m51.147s
user 0m0.208s
sys 0m4.048s

Since iSCSI came out faster, this probably indicates that the file system cache is kicking in and the data isn't fully written out to disk when the program exits.
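That cache effect is easy to demonstrate with dd: writing the same amount of data with and without forcing it to disk gives very different timings. A minimal sketch (file names are made up):

```shell
# Write 64 MB twice: the first run can finish as soon as the page
# cache absorbs the data; conv=fdatasync makes the second run wait
# until the data is actually on disk, so timing it (e.g. with `time`)
# shows the true, slower disk throughput.
dd if=/dev/zero of=cached.bin bs=1M count=64 2>/dev/null
dd if=/dev/zero of=synced.bin bs=1M count=64 conv=fdatasync 2>/dev/null
ls -l cached.bin synced.bin
```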



Running md5sum on transferred files...
sshfs file read speed
fd53f34f8b3f9a8e433ed00740e8c424 mnt/rocker2/tester

real 1m37.310s
user 0m4.428s
sys 0m1.376s

SMB file read speed
fd53f34f8b3f9a8e433ed00740e8c424 mnt/ok/tester

real 2m44.506s
user 0m4.480s
sys 0m4.908s

NFS file read speed
fd53f34f8b3f9a8e433ed00740e8c424 mnt/rocker/tester

real 0m51.867s
user 0m4.664s
sys 0m1.984s

iSCSI file read speed
fd53f34f8b3f9a8e433ed00740e8c424 /mnt/tester

real 1m3.691s
user 0m4.416s
sys 0m1.692s


Md5sum run locally on each machine for comparison:
On the server machine:
real 0m44.525s
user 0m7.318s
sys 0m7.018s

On the client machine:
real 0m36.029s
user 0m4.428s
sys 0m2.016s


For simplistic benchmarks you can run the 'hdparm -tT /dev/hda' command, of course substituting /dev/hda with the proper device file.

On the server machine the test of the raw software raid device /dev/md0:
sudo hdparm -tT /dev/md0

/dev/md0:
Timing cached reads: 1272 MB in 2.00 seconds = 634.80 MB/sec
Timing buffered disk reads: 204 MB in 3.02 seconds = 67.60 MB/sec

On the server, one of the SATA hard drives making up the raid array:
/dev/sda:
Timing cached reads: 1264 MB in 2.00 seconds = 631.09 MB/sec
Timing buffered disk reads: 174 MB in 3.01 seconds = 57.89 MB/sec

And my old PATA drive that makes up part of the array...
/dev/hda:
Timing cached reads: 1272 MB in 2.01 seconds = 634.22 MB/sec
Timing buffered disk reads: 144 MB in 3.04 seconds = 47.41 MB/sec

🙂

The local SATA disk on the client machine.
/dev/sda:
Timing cached reads: 4028 MB in 2.00 seconds = 2016.40 MB/sec
Timing buffered disk reads: 128 MB in 3.03 seconds = 42.20 MB/sec

The iSCSI device on the client machine
/dev/sdb:
Timing cached reads: 4048 MB in 2.00 seconds = 2025.90 MB/sec
Timing buffered disk reads: 144 MB in 3.11 seconds = 46.35 MB/sec


And I wanted to test the actual network, to see the highest throughput I could get out of my el-cheapo 20-buck switch and onboard Ethernet, so I ran some simple iperf benchmarks.
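The exact invocation isn't shown here; a typical pair of commands for a 60-second run with classic iperf syntax (the host name is a placeholder) would be:

```
# On the server:
iperf -s
# On the client, a 60-second test:
iperf -c server.machine -t 60
```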

From my server to the desktop I got:
0.0-60.0 sec 3.98 GBytes 569 Mbits/sec

From my desktop to the server I got:
0.0-60.0 sec 4.82 GBytes 690 Mbits/sec


So maybe that will be helpful.



But I wouldn't recommend running any diskless machines on iSCSI or any network storage. There is a known problem when the machine begins to run out of RAM and starts hitting a swap file or partition on the iSCSI drive: the system needs to allocate more RAM to perform the network operation that writes RAM out to swap, and it is not able to properly distinguish network packets required for storage from ordinary server/network traffic. So you end up with a deadlock where the machine needs to allocate more RAM in order to allocate more RAM. It's BAD.

So for anything you want to boot over the network and run heavy loads on, you need at least a local disk for swap. Also, like I said before, on Linux the Open-iSCSI initiator has limitations that make it difficult to go diskless anyway. If you're interested in my initramfs for doing this, I'll post it, but it's really hackish and I wouldn't recommend it.

So next time I set it up I'll go with a local disk for the core system and use GFS, OCFS2, or another clustering file system so that I can share out /home and /usr/local directories. Also, I want to look into DNBD, which is a network block device solution for Linux from the GFS project. It may offer better performance for Linux, as it's tuned specifically for how Linux operates. Not useful for Windows clients, obviously.



But as far as using iSCSI goes...

Microsoft has a no-cost iSCSI initiator for:
Windows XP Pro SP1 or later
Windows Server 2003 or later
Windows 2000 SP3 or later
http://www.microsoft.com/downloads/deta...-4585-b385-befd1319f825&DisplayLang=en

For Linux, Open-iSCSI support is in the kernel; I don't know what version it made it into, though.
For the userland tools:
http://www.open-iscsi.org/
Debian has packages for it. I don't know about other distros.
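For reference, a minimal Open-iSCSI session with the iscsiadm tool looks roughly like this (the target address and IQN are made-up examples, and exact syntax varies between versions):

```
# Ask the storage box what targets it exports
iscsiadm -m discovery -t sendtargets -p 192.168.100.10
# Log in to a discovered target; a new /dev/sdX shows up on success
iscsiadm -m node -T iqn.2006-01.com.example:storage.lun1 -p 192.168.100.10 --login
```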

For an iSCSI target there are a couple to choose from. For instance, Intel released an open source one, but those seem more for application and network testing and don't offer the best performance, at least from what I've read. The one I used was 'iSCSI Enterprise Target', which seems to be designed for serious work.
http://iscsitarget.sourceforge.net/
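To give a rough idea of what IET configuration looks like, a minimal /etc/ietd.conf exporting one logical volume might read as follows (the IQN and device path are made up; check the IET docs for your version):

```
# One target, served from an LVM logical volume as LUN 0.
# Type=fileio goes through the page cache; later versions also
# offer Type=blockio, which bypasses it.
Target iqn.2006-01.com.example:storage.lun1
        Lun 0 Path=/dev/vg0/iscsivol,Type=fileio
```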


I've never tried to do iSCSI for Windows.

Obviously, for production use you want to use benchmarks and such to hammer the systems as hard as you can in testing. These things can end up very RAM-hungry with large files and such, so watch for that.

Personally, for day-to-day stuff it's not a problem. I run BitTorrent, play games off of it, whatever I want. Hammer, hammer, hammer. No sweat. But if that Open-iSCSI daemon fails, since it runs out of the initrd, I am screwed (otherwise it could just be restarted). As long as the network stays up, so does my machine. I recommend ext3 for Linux stuff; it's able to handle failures better than XFS can.

edit:

Also, security is an issue. You want to use a private network for storage. iSCSI has some security features, but they are not strong. My idea for production servers involves using Xen to set up virtual disks on the iSCSI storage and only exposing the external network to them, thus removing direct access to the storage network from machines running in domU. That way, if a machine gets compromised it can only access its own stuff. But I bet there are other ways to achieve the same effect.
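To put something concrete behind 'some security features': IET supports CHAP authentication and simple address-based access control. A sketch, with placeholder names, secrets, and addresses:

```
# /etc/ietd.conf -- one-way CHAP; the initiator must present this secret
Target iqn.2006-01.com.example:storage.lun1
        IncomingUser someuser secret123456
        Lun 0 Path=/dev/vg0/iscsivol,Type=fileio

# /etc/initiators.allow -- only hosts on the storage subnet may connect
iqn.2006-01.com.example:storage.lun1 192.168.100.0/24
```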
 
Originally posted by: drag

<snip>

But I wouldn't recommend running any diskless machines on iSCSI or any network storage. There is a known problem when the machine begins to run out of RAM and starts hitting a swap file or partition on the iSCSI drive: the system needs to allocate more RAM to perform the network operation that writes RAM out to swap, and it is not able to properly distinguish network packets required for storage from ordinary server/network traffic. So you end up with a deadlock where the machine needs to allocate more RAM in order to allocate more RAM. It's BAD.

So for anything you want to boot over the network and run heavy loads on, you need at least a local disk for swap. Also, like I said before, on Linux the Open-iSCSI initiator has limitations that make it difficult to go diskless anyway. If you're interested in my initramfs for doing this, I'll post it, but it's really hackish and I wouldn't recommend it.

<snip>

Personally, for day-to-day stuff it's not a problem. I run BitTorrent, play games off of it, whatever I want. Hammer, hammer, hammer. No sweat. But if that Open-iSCSI daemon fails, since it runs out of the initrd, I am screwed (otherwise it could just be restarted). As long as the network stays up, so does my machine. I recommend ext3 for Linux stuff; it's able to handle failures better than XFS can.

drag, lots of good information, but you'll notice he wants to use it with ESX HA, so there is a specific way to set up the SAN. Basically, he'll just be saving his virtual machine files on the SAN.

spyordie007, setting up ESX to work with a SAN is a snap. I assume you have ESX 3.0, right? Pre-3.0 will only work with a fibre channel SAN and will NOT work with an iSCSI SAN. Since the SAN will be hammered pretty heavily by the combined disk access of all the virtual machines, you'll definitely want separate NICs and switches (or VLANs) just for the SAN connectivity.

ESX Server has its own iSCSI initiator built in, so you don't need to worry about initiator software. I _think_ its iSCSI initiator will even take care of path failover if you have more than one NIC (it will do path failover for fibre channel, at least, I know).

And of course, the datastore will be formatted with VMFS. I read that you also want to keep a VMFS-formatted volume on your local disk because ESX will cache stuff to it even if all your files are on the SAN volume; I'm not sure whether I believe that, though, especially since the SAN volume should be a faster cache anyway (at least with a fibre channel SAN).
 
Well, when he mentioned he wanted an 'iSCSI target' I assumed he had no SAN. The iSCSI target is the 'server' portion. I like IET, or 'iSCSI Enterprise Target', for that. It seems fast and reliable, with relatively low memory and CPU requirements.

Other choices for setting up iSCSI LUNs are:
"Intel iSCSI Reference Implementation", which has support for both an iSCSI initiator and target, but seems more for application development as it's pretty slow. I've never used it, though; that's just from what I've read.

"UNH iSCSI", which provides both an iSCSI target and initiator. (Although obviously, since iSCSI is a standard, you should be able to mix and match as you like.)

And then there is "Ardis iSCSI", which is a target; IET is a fork of it.

As for the Open-iSCSI initiator, I brought it up just to fill out what I've done with iSCSI. I've never used a hardware initiator or anybody else's iSCSI initiator.

Also, I've tried AoE with 'Vblade'. This is ATA over Ethernet, with Vblade being the software-emulated I/O for it. Technically it should be good because it avoids the overhead of TCP and runs directly on Ethernet, but in practice I've found iSCSI to be better.
 
Also, about my idea for Xen (it's something I pitched at work, just to give people ideas really):

It involved making iSCSI a way to do highly available storage. The idea is to set up 2 Linux boxes as raid storage boxes, using either software or hardware raid depending on what you want. These are dedicated storage boxes, so software raid can offer better performance, but it's mostly up to budget; it doesn't really matter. They do have to be identical.

On those you would have two Ethernet ports each. These would be bonded using Linux's built-in support for failover and load balancing. This gives you somewhat better performance, but mostly it's so that each port can go to a different gigabit switch.
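The built-in support here is the kernel bonding driver. A sketch of what that looks like in a Debian-style /etc/network/interfaces (addresses are placeholders, and the ifenslave package is needed):

```
# eth0 and eth1 enslaved to bond0; active-backup gives failover,
# while a mode like balance-alb adds load balancing as well.
auto bond0
iface bond0 inet static
        address 192.168.100.1
        netmask 255.255.255.0
        bond-slaves eth0 eth1
        bond-mode active-backup
        bond-miimon 100
```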

Each of the 'iSCSI clients' would have at least 3 Ethernet jacks: 2 going to the storage network (one jack per switch) and the 3rd going to the external LAN. On the 'iSCSI clients' I would use CLVM to set up logical volume management for clusters. The entire raid array on both storage machines would be exported as an iSCSI LUN, CLVM would manage the logical volumes, and with newer versions of CLVM I would mirror the logical volumes between the two storage boxes. (Or you could use DRBD + Linux-HA, or software mirror raid; I don't know which would be best at this point.) All the 'iSCSI clients' would have access to all the logical volumes, making it easy to use the migration features of Xen to move operating systems from box to box.

Then, on top of all that, I would use Linux-HA to monitor the status of services and servers. If one of the 'iSCSI client' boxes went down, another would take over its server duty automatically by fsck'ing the volume and restarting the OSes that died with the server.

The idea is that there would be no single point of failure. To have a real failure, one of the following would have to happen:
Have 4 disks simultaneously go down (if the storage boxes are running raid 5).
Both switches fail.
Massive power outage.
Enough of the 'Xen server' boxes go down that they run out of RAM to restart the operating systems in domU.

(So you could end up in a situation where you had a switch go down, 3 disk failures on the storage boxes, and power supplies blow out and take the motherboard with them on 2 or 3 Xen hosts, and still have everything running fine, with maybe at most a 30-second unavailability of services.)

Pretty neat I thought.

So: 2 storage boxes, 2 switches, redundant Ethernet connections, and multiple Xen host boxes that are capable of accessing any of the logical volumes at once and have enough RAM to let you migrate and restart operating systems as you need them.

I am sure that it's possible to do the same thing with ESX stuff also. 🙂
 
Thanks for all the responses, esp drag + brazen.

Yes, the boxes that will be connecting are both VMware ESX 3, which is a custom Linux build with a built-in iSCSI initiator. I'm pretty much just going to set up a third server in our lab as an iSCSI target so our team can take advantage of some of the HA features in the lab environment.

I should have some time next week to set up the Linux box; I'll try and report back on how it goes.

Thanks again,

Erik
 
Okay, I finally got an install of iSCSI Enterprise Target working as I wanted. As it turns out, the documentation isn't very clear that openssl-devel and gcc are required for the install. Naturally, as with any server install, I was going for minimal and installed only the basic modules for the OS, so the iSCSI target install was failing; I had to do some RTFM to figure out the proper requirements to get it running.
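For anyone hitting the same wall, a sketch of pulling in the missing pieces on a minimal CentOS install before building (package names are typical for CentOS of that era; kernel-devel is usually also needed since IET builds a kernel module, and the tarball name is a placeholder):

```
yum install gcc openssl-devel kernel-devel
tar xzf iscsitarget-x.y.z.tar.gz
cd iscsitarget-x.y.z
make && make install
```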

As of today we've finally got that box working as an iSCSI target for our lab ESX servers, so we should be able to test VMotion/HA in the lab moving forward.

Thanks again for the help/suggestions

Erik
 
Originally posted by: spyordie007
Naturally as with any server install I was going for minimal and installed only the basic modules for the OS
Hey, glad I'm not the only one who takes the time to do this! It may run you into odd problems, but that makes you learn the software better, and now you know that openssl-devel and gcc are required.

It would be nice if FMs didn't make assumptions and told you what all the dependencies are when starting from a minimal OS, but some do and some don't.

Edit: btw, I was just reading through the ESX 3 manual and I see that you can use VMotion with an NFS datastore. iSCSI would be much faster, but this is something to keep in mind in case you don't like iSCSI for other reasons.
 
and now you know that openssl-devel and gcc are required
Yeah, GCC was easy enough to figure out (you have to compile it, after all), but I probably burned 2-3 hours on openssl-devel. Based on the error I had a pretty good idea it was OpenSSL-related, so I spent a lot of time troubleshooting, thinking OpenSSL itself was messed up (I even ended up doing a couple of reinstalls), before I finally realized it was actually the OpenSSL development package it was after 😱
 
Ya, it's just one of those things you learn.

If your build is complaining about SSL or whatnot, the first thing I do is make sure that the *-dev (in Debian) or *-devel (in CentOS, Red Hat, or Fedora) packages are installed.

Otherwise you can't compile anything against anything. Those development packages have the headers and such that are required for compiling against those libraries.

Lots of times I don't even know what a library is for. If the error says something about a missing *.h file, something to do with 'libmgi' or the like, I'll just run:
apt-cache search mgi|grep dev
and see what pops up.

Depending on what you're using for an operating system, it's very possible to have a very slim server system. That's fine for you since this is obviously for a lab, but if I were to use it in a production environment I would want to set up a small-scale replica of the environment and test the stuff I compile against it.

Once I find the configuration I want then I would build it into a deb or rpm package that I can then deploy cleanly on my servers. Then after that it would be a relatively simple thing to push down configuration changes and software updates via a local repository mirror.

There is a checkinstall program for Debian for making quick-and-dirty packages out of 'make install', 'setup', and other similar commands. It'll track changes in a fakeroot and help you quickly generate a custom package. I bet there is something similar for RPM stuff. For more professional package results, making a custom one by learning the documentation and such isn't very hard either.
 
Originally posted by: drag
Once I find the configuration I want then I would build it into a deb or rpm package that I can then deploy cleanly on my servers. Then after that it would be a relatively simple thing to push down configuration changes and software updates via a local repository mirror.

There is a checkinstall program for Debian for making quick-and-dirty packages out of 'make install', 'setup', and other similar commands. It'll track changes in a fakeroot and help you quickly generate a custom package. I bet there is something similar for RPM stuff. For more professional package results, making a custom one by learning the documentation and such isn't very hard either.
Oh yeah, very good advice. I do this too when I can't find a prebuilt package. Checkinstall works for RPMs too; it's what I use, and it's a snap. You just have to be sure you know what the RPM dependencies are, because a checkinstall package won't check for any dependencies (I think there is a way to define dependencies, but I never mess with it). Checkinstall is not in the default yum repos, but there is an RPM for it on its homepage.

edit: I don't remember how good checkinstall's documentation is, but IIRC, you build a package like normal and then substitute "make install" with "checkinstall" (or something similar, maybe "make checkinstall?"), answer a few simple questions and that's it.
 