Saturday, March 31, 2012

Managing Old Dell servers

A few Dell servers I am working with were getting long in the tooth. I was worried about their performance, so I decided to monitor them more closely. I didn't have a lot of experience with on-board management on Dell servers, but at first glance they took a different approach than HP or IBM servers. With HP or IBM, the on-board management system has its own network interface and is reached through a web-based front end or remote clients. It works independently of the server OS. Dell's approach was to have software running on the server's OS access the management hardware, even though the management hardware had its own network interface.
I've always known Dell for their support of IPMI, so I decided to take that route first. This great article explains how to get it up and running quickly. It was command line-based and flexible. However, it wasn't giving me the information I needed easily. IPMI spewed a lot of information, but the output had to be parsed, cleaned up and restructured before I could make sense of it. Sorta like SNMP but a tad friendlier.
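To give an idea of the kind of parsing involved, here is a minimal sketch. It assumes the pipe-separated layout that `ipmitool sensor` prints (name, reading, unit, status, then thresholds). The function name `flag_sensors` and the sample readings are made up for illustration and do not come from my servers.

```shell
# Print sensors whose status column is not "ok" from `ipmitool sensor`-style
# output. Columns are pipe-separated: name | reading | unit | status | ...
flag_sensors() {
    awk -F'|' '
        {
            # Trim surrounding whitespace from the four fields we use.
            for (i = 1; i <= 4; i++) gsub(/^[ \t]+|[ \t]+$/, "", $i)
            # Skip sensors with no reading ("na") and those reporting ok.
            if ($2 != "na" && $4 != "ok")
                printf "%s: %s %s (%s)\n", $1, $2, $3, $4
        }'
}

# Hypothetical sample data standing in for real output.
sample='Ambient Temp     | 24.000     | degrees C  | ok    | na | 3.000 | 8.000 | 42.000 | 47.000 | na
FAN 1 RPM        | 900.000    | RPM        | cr    | na | 1200.000 | 1500.000 | na | na | na
Planar Temp      | na         | degrees C  | na    | na | na | na | na | na | na'

printf '%s\n' "$sample" | flag_sensors
```

On a real server the same filter would presumably be fed with `ipmitool sensor | flag_sensors`.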
Then I read on-line about OpenManage which was Dell's own system. OpenManage was both the monitoring hardware on the system and the software suite that used it. It gave administrators information on the servers via a web interface. The servers had the hardware but I wasn't sure about the software.
A quick check revealed that the original CDs were long gone with the boxes after a clean-up some years ago. So I hunted for the files on Dell's support site. The site was helpful in throwing everything it had at me, but finding the explanation for what the files were required a lot of back and forth between web pages that didn't refresh well.
While Googling to find out what some of those files were, I found another great article on thegeekstuff.com that was basically a how-to for setting up OpenManage. But I still needed the correct installation files. I first tried to get all the files mentioned in the article from the Dell Support site but couldn't sift through all the gunk. I found the MD5s for the files I needed but not the actual files. I felt more and more like Dell wanted me to download ISOs from their site and find the files there.
Finally, I decided to dump the 'correct' way and use the files the article linked to instead. Those files were on a Dell ftp server but were meant for specific versions of Linux and specific versions of OpenManage. They matched neither the version of CentOS the server had nor the version of OpenManage that was compatible with it. So I downloaded a more recent version of OpenManage for the equivalent RedHat version. I reckoned that if yum/rpm crapped out, I'd just hunt some more. The server ran CentOS instead of true-red RHEL, but I guessed they were close enough.

I followed the instructions in the article but the install program wouldn't run. A quick peek at the shell script revealed that it was just installing some RPMs based on the answers to the questions it asked. I found the files the shell script mentioned and deduced which ones I needed. I did a "yum localinstall" on that set of files, as mentioned in the article. Yum figured out the RPM dependencies and downloaded them, and the system installed in no time. Following the article's instructions, I went to the web interface at port 1311 and happy times.
I tried it on another server and the RPMs installed OK, but I couldn't get past the login screen. I figured maybe it was because I had installed IPMI and then OpenManage, and the two were using each other's services. I installed the ipmi-dkms that was part of the earlier install but still no go. I read the log files and they were talking about failure to load /lib/security/system-auth.
Since the problem was with loading a security library at the point of logging on, I figured that I could change how the credentials were authenticated. I edited /etc/pam.d/omauth and commented out the lines there. I added

auth       required     /lib32/security/pam_unix.so nullok
account    required     /lib32/security/pam_unix.so

This changes the way the program uses credentials on the system to authenticate me. You can read more about how this works here. If you are writing programs and are tired of managing accounts, you should look into PAM and decide whether to have the system take care of that instead of rolling your own. I restarted the back end
/etc/init.d/instsvcdrv restart 
and the web front-end
/etc/init.d/dsm_om_connsvc start  
I logged in and OpenManage worked like a charm. I could see the information I needed in a nice interface. Nice enough to share with those less technically inclined. As a plus, I found out OpenManage can also send SNMP traps for various warnings.
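For anyone repeating the PAM edit described above, a safer way to do it is to keep a backup and script the change. This is only a sketch: the helper name `disable_lines` is mine, and the "before" contents of the file are a guess for the demo, not what was actually on my server. It runs against a scratch file, so point it at the real file under /etc/pam.d only once you are happy with the result.

```shell
# disable_lines FILE: back up FILE to FILE.bak, then comment out every
# line that is not blank and not already a comment.
disable_lines() {
    cp -p "$1" "$1.bak" || return 1
    # Read from the backup and write the commented version over the original.
    awk '/^[ \t]*(#|$)/ { print; next } { print "#" $0 }' "$1.bak" > "$1"
}

# Scratch file with hypothetical original contents.
demo=$(mktemp)
printf '%s\n' \
    '#%PAM-1.0' \
    'auth       include      system-auth' \
    'account    include      system-auth' > "$demo"

disable_lines "$demo"

# Append the two replacement lines from this post.
printf '%s\n' \
    'auth       required     /lib32/security/pam_unix.so nullok' \
    'account    required     /lib32/security/pam_unix.so' >> "$demo"

cat "$demo"
rm -f "$demo" "$demo.bak"
```

The backup matters here because a broken PAM file locks you out of the web login until it is fixed.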



