Reset a blade or a virtual server on a blade?

tmdpc

Junior Member
Jul 4, 2013
3
0
0
Hi folks - I need help.

I'm a second-year IT student who took a job as a support tech, mostly outfitting new machines for deployment, end-user support, printer help etc. A couple weeks ago, my (former) boss and the only other person in the dept got mad and walked away from the job on very short notice. In spite of the fact that we have 70+ employees, 12 departments, 8 VLANs and point-to-point wireless at 4 satellite locations, the big bosses have put together a committee to determine what qualifications the new IT manager should have and expect to advertise the job opening in 2-4 weeks... I was given some passwords and some vendor contacts but I'm in way over my head. We have about 10 physical and 15 logical servers; the virtual servers live on a pair of IBM blades run via ESXi/VMware/vCenter.

I don't know how to tell what other partitions are on the blade and I have at least 10 people experiencing down time right now. I guess I did get lucky in the fact that it died about 1 pm on July 3rd - People were literally thanking me as they left early on the day before the holiday...

I'm stuck. I don't know if the entire blade needs to be restarted because I'm not exactly sure what lives on that blade. Because I'm not sure what lives on that blade, I'm completely freaked out about restarting it. Well, that and the fact that I'm unsure exactly how to restart a blade.

I've got Veeam backups of everything so at worst we should only lose a day or two of data, but I have no idea where to begin.

I can't ping the main server or the vCenter. Both blades are showing green lights and no indication of trouble.

I think we have support contracts with IBM and VMware but I'm not entirely sure.

Any assistance you can provide would be very, very much appreciated, even if you can only recommend a more appropriate forum or site.

Thanks everyone.
 

Red Squirrel

No Lifer
May 24, 2003
67,526
12,192
126
www.anyf.ca
Yikes sounds like a scary situation to be in. Is anything labeled at all, IP addresses etc?

From my understanding of blades, they're essentially separate machines, they just share some common hardware like power supply and network (virtual/built in switch).

So in vCenter, whatever server is showing as down, you need to find that physical server, whether it's one of the blades or another box. The server hardware probably won't show any fault, as it could be a software-level issue like the OS froze or something.

I have not played with vCenter in a long time though, so I can't really help as far as specifics go.
 

ch33zw1z

Lifer
Nov 4, 2004
37,802
18,099
146
vCenter will answer all the questions you are asking: what VMs are running, what hardware is in the blade.

I can be more helpful with the blades than with VMware, tho.

Here's IBM's warranty lookup: http://www-947.ibm.com/support/entry/portal/wlup

The BladeCenter has its own MTM-SN. Post what you have: usually an 8852, but you could have 8677 or 8886.

Each blade has its own MTM-SN: most currently popular is 7870, but there are many different MTs for blades; 7875 is the newest revision...

At the back of the blade center is an AMM (maybe two), the Advanced Management Module. You can plug in a keyboard, mouse, and monitor and have direct access to the blades instead of using a Java console. AMMs are also managed by IP, usually customized by each customer to their IP scheme. Its default is 192.168.70.125; the default username/password is USERID/PASSW0RD (with a zero, not the letter O).
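
If you're not sure the AMM is even answering on the network, a quick check from any machine with Python can tell you before you go hunting for a KVM. This is only a rough sketch: the function name and the ports (80/443, assuming a typical web GUI) are my assumptions, and the IP is just the factory default mentioned above, so swap in whatever your site actually uses.

```python
import socket

# Factory-default AMM address quoted above; most sites change it.
AMM_DEFAULT_IP = "192.168.70.125"


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


if __name__ == "__main__":
    # Ports 80 and 443 are an assumption (typical web GUI ports); adjust as needed.
    for port in (80, 443):
        state = "open" if tcp_port_open(AMM_DEFAULT_IP, port) else "no answer"
        print(f"{AMM_DEFAULT_IP}:{port} -> {state}")
```

No answer doesn't prove the AMM is dead; it may just be on a different IP or a VLAN you can't reach from where you're sitting.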

On the blades themselves, you'll find three buttons on the front: one white power button, one button labeled "MT", and one labeled with an icon that looks like a monitor:

Power button: turns it off and on
MT: when this button is lit, that blade owns the media tray
Monitor: when this button is lit, that blade owns the KVM I mentioned.

The process for resetting a blade is different depending on your type of blades. Older blades were standard BIOS; these blades were quickly ready to power on after being inserted into the chassis.

In xseries world, a slow flashing power light means it's ready to boot...

Newer blades are IMM (Integrated Management Module) based. When you insert these into the blade chassis, you will get a fast flashing power light; it's not ready to boot until it's at a slow flashing power light. This can take up to 10 minutes in some configurations, so be patient with your IMM-based blade.

Ok, there's my quick crash course on how blades work.

I strongly recommend you get a KVM (keyboard, video, mouse) hooked up to the blade center and make sure both blades are responsive.

Questions support centers will ask you to start:
Do any of the blades, or the blade chassis, have an error light?
Do your blades boot to internal drives, or to a SAN?
Are your blades responsive via console?

If your blades appear to be fine, call VMware support first and see what they say.

If you think there's a hardware issue, call IBM support. They will ask you to provide a DSA log if possible, and the AMM service data, which you get from the web GUI of the AMM:

http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=serv-dsa

If you can't get any of that stuff, work the best you can with support to be their eyes and hands. This will help speed up the process. If you have warranty or maintenance on the hardware, they will dispatch someone to do problem determination on the hardware. I stress hardware: IBM field techs are not software support, and they will not be able to help you with issues that are not hardware related.
 

tmdpc

Junior Member
Jul 4, 2013
3
0
0
I want to thank both of you for your prompt and thorough responses!


The chassis is a #8886, the blade in question a #7870. I contacted IBM support and spoke with a gentleman named Joel who was incredibly helpful. I shut down the blade via the AMM, reseated it, then powered it back up.

The good news is that the blade appears to be working normally; the bad news is I still can't get to my servers, and the vSphere Client is giving me an 'unable to connect to remote server' error.


Although the blade seems to be up and running fine, I did notice an amber light on the rack storage array that sits immediately below the BladeCenter S. It's not a light on one of the SAS disks themselves, but an amber light on the very left of the device.


My next step is to contact VMware and see if we have a support contract with them. If you have any more ideas I'd appreciate it and thank you both again for your help.
 

ch33zw1z

Lifer
Nov 4, 2004
37,802
18,099
146
8886 is the BladeCenter S chassis: 6 blade slots and up to 12 DDM SAS drives inside the chassis.

The drives in the chassis are controlled by two RSSMs (SAS controllers) in the back of the chassis. Make sure they're both reporting in to the AMM as good. The RSSMs have 4 SAS ports each that can connect to external SAS storage.

The blades can have access to the storage through these RSSMs. It's set up as a SAN, which has zoning rules that can be checked in the AMM as well. Keep an eye on those zoning configs if SAS storage through these RSSMs is being used.

Things I would look at:

1. Where are the VM images stored?
2. What is the external storage unit? IBM? Dell? External storage is usually managed via a software application and IP address. If it's IBM then post the MTM and I'll see what I can give you. You definitely want to look into that amber light, it may be nothing but it may be something.
3. My experience with VMware is limited, I'm more hardware break/fix oriented. Calling them is a good next step.
4. Does the blade boot up to the VMware login prompt? Is it at least loading the base OS hypervisor? Or is it failing to load VMware at all?
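
For number 4, if you can't get to a console right away, you can at least see from the network side whether the hypervisor is listening. Here's a rough sketch, nothing official: the target address is a made-up placeholder, the function names are mine, and the port list is my assumption of what an ESXi host or vCenter normally answers on (443 for the vSphere Client over HTTPS, 902 for the host agent / VM console traffic, 22 only if SSH has been enabled).

```python
import socket
from concurrent.futures import ThreadPoolExecutor

# Hypothetical placeholder address; replace with your ESXi host / vCenter IPs.
TARGETS = ["192.0.2.10"]
# Assumed ports: 443 (HTTPS), 902 (host agent / console), 22 (SSH, if enabled).
PORTS = [443, 902, 22]


def probe(host, port, timeout=3.0):
    """Return True if host:port accepts a TCP connection within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def sweep(hosts, ports, timeout=3.0):
    """Probe every host/port pair in parallel; return {(host, port): bool}."""
    pairs = [(h, p) for h in hosts for p in ports]
    with ThreadPoolExecutor(max_workers=8) as pool:
        results = pool.map(lambda hp: probe(hp[0], hp[1], timeout), pairs)
        return dict(zip(pairs, results))


if __name__ == "__main__":
    for (host, port), is_open in sweep(TARGETS, PORTS).items():
        print(f"{host}:{port} -> {'open' if is_open else 'no answer'}")
```

If 443 answers but the client still can't connect, the hypervisor is probably at least up and the problem is more likely on the software side, which is a useful thing to be able to tell VMware support. If nothing answers at all, get on the console and see where the boot is stopping.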
 

tmdpc

Junior Member
Jul 4, 2013
3
0
0
I found an email confirmation of an ESXi 5.0 upgrade purchase from April of this year, and called VMware support.

After 3 ½ hours they got my three clients back online. I learned a great deal.

One of our ESXi clients is at 5.1, the other two are at 4.X… (That explains the ESXi 5.0 upgrade disks with the double-exclamation-point yellow Post-It sticky notes my former boss left on his desk).

That was yesterday.

Today my virtual machines couldn’t log in to the server and were getting a ‘display protocol can’t connect because of the firewall’ error. Another 3 hours with VMware support, but the tech finally got them up and running. He had to release them from the pool, then re-join them to get the PCoIP protocol to be recognized.

Fortunately the lower array enclosure is under 24/7 warranty. The IBM techs are concerned about it displaying an amber light. Software says the array ‘needs attention’. If they can’t get the logs a tech will be here Monday.

I really can’t say enough about IBM support. Really professional, nice people. VMware people were also good.

Thanks for your help cw. It is much appreciated.
 

ch33zw1z

Lifer
Nov 4, 2004
37,802
18,099
146
Great, it's a good day when you learn.

What type of storage is it? MT?

If the array "needs attention", that could be a few different things. Since there's no error light on any of the drives, "Impending drive failure" would be my guess: a drive on its way out, but not failed yet.