I think it's best not to use the sarge packages of OpenIPMI. Ubuntu's 7.04 kernel is way too new for the kernel drivers to work properly. It's best to start from a RHEL5 rpm, convert it to a .deb with alien, then install it. Do note that I don't have any experience with Ubuntu in general. I strictly use Debian and Gentoo, so don't shoot me if I'm off somewhere.
First, make sure all the necessary components are installed: you need the kernel source and kernel headers, gcc and the like, and of course, alien. Download the latest rpm of OpenIPMI for RHEL5 (it is closest to the kernel version of Ubuntu), and convert it to a Debian package. After that, install the package. It should create a folder under /opt. If possible (in Debian this was the case), the package will automatically try to build the modules, though I'm not sure whether they are also installed (I believe there's a script in /opt/hp-OpenIPMI/ that does this for you).
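As a rough sketch of the convert-and-install step: the rpm file name below is only a placeholder (use whatever you actually downloaded), and the `run` helper is made up here so the script prints the commands instead of executing them by default.

```shell
#!/bin/sh
# Sketch only: the rpm name is a placeholder, not a real download name.
RPM="${RPM:-hp-OpenIPMI.rpm}"

# Print the command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

convert_and_install() {
    # --to-deb builds a Debian package; --scripts asks alien to carry over
    # the rpm's install scripts (assumption: the module build is triggered
    # from those scripts).
    run alien --to-deb --scripts "$1"
    # alien chooses the .deb name itself, so install whatever it produced here.
    for deb in ./*.deb; do
        [ -e "$deb" ] && run dpkg -i "$deb"
    done
    return 0
}

convert_and_install "$RPM"
```

Run it once as-is to see what it would do, then again as root with DRY_RUN=0 in a directory that contains only the downloaded rpm.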
A note of caution: I had severe problems with version 7.7 of the hp-OpenIPMI drivers. I strongly suggest installing version 7.8. If you have problems, or you're stuck somewhere, please let us know, and we'll try to help.
I checked it using the management homepage, which seems to report the disks and controller just fine. My guess is the ACU CLI would be able to read it out from the command line. In any case, whenever something goes wrong with disks, it should be logged to syslog when running 64bit.
As for the homepage. I don't know about Ubuntu, but on Debian, I had no trouble with missing PAM modules, but I did have to install a bunch of 32bit compatibility libraries:
I don't know what the Ubuntu equivalents are, but I'm pretty sure they do exist on Ubuntu. In any case, could you post whatever error you're getting with the homepage? Then I may be able to help you look for a solution.
I just installed Etch on a new DL380G5, with HP Agents 7.7.0, and checked whether the SAS agent was running, which wasn't.
When checking the SMH, though, the controller seems to be detected properly, and the status of the disks and such is being read out. Can you verify this?
Downloading the .deb for sarge doesn't work on Etch right away because of a kernel version difference. Moreover, although the source of the OpenIPMI driver is included and fits neatly into the kernel source, the compiler bombs out with this:
Code:
CC [M] drivers/char/ipmi/ipmi_msghandler.o
drivers/char/ipmi/ipmi_msghandler.c: In function 'ipmi_smi_watcher_register':
drivers/char/ipmi/ipmi_msghandler.c:332: error: too few arguments to function 'watcher->new_smi'
drivers/char/ipmi/ipmi_msghandler.c: In function 'call_smi_watchers':
drivers/char/ipmi/ipmi_msghandler.c:356: error: too few arguments to function 'w->new_smi'
drivers/char/ipmi/ipmi_msghandler.c: At top level:
drivers/char/ipmi/ipmi_msghandler.c:1724: error: conflicting types for 'ipmi_register_smi'
include/linux/ipmi_smi.h:176: error: previous declaration of 'ipmi_register_smi' was here
drivers/char/ipmi/ipmi_msghandler.c: In function 'handle_oem_get_msg_cmd':
drivers/char/ipmi/ipmi_msghandler.c:2569: error: 'IPMI_OEM_RECV_TYPE' undeclared (first use in this function)
drivers/char/ipmi/ipmi_msghandler.c:2569: error: (Each undeclared identifier is reported only once
drivers/char/ipmi/ipmi_msghandler.c:2569: error: for each function it appears in.)
drivers/char/ipmi/ipmi_msghandler.c: In function 'ipmi_init_msghandler':
drivers/char/ipmi/ipmi_msghandler.c:3364: warning: implicit declaration of function 'notifier_chain_register'
drivers/char/ipmi/ipmi_msghandler.c: In function 'cleanup_ipmi':
drivers/char/ipmi/ipmi_msghandler.c:3384: warning: implicit declaration of function 'notifier_chain_unregister'
drivers/char/ipmi/ipmi_msghandler.c: At top level:
drivers/char/ipmi/ipmi_msghandler.c:3420: error: conflicting types for 'ipmi_register_smi'
include/linux/ipmi_smi.h:176: error: previous declaration of 'ipmi_register_smi' was here
make[3]: *** [drivers/char/ipmi/ipmi_msghandler.o] Error 1
Now, this would be reasonably understandable, given the major kernel version difference between sarge (2.6.8) and etch (2.6.18), so I downloaded the RHEL 5 rpm, extracted the source (yes, it's present as well), and merged it into the kernel source. The result was better, but still not quite enough. I still get this one error message:
Code:
CC [M] drivers/char/ipmi/ipmi_msghandler.o
drivers/char/ipmi/ipmi_msghandler.c: In function 'handle_oem_get_msg_cmd':
drivers/char/ipmi/ipmi_msghandler.c:3153: error: 'IPMI_OEM_RECV_TYPE' undeclared (first use in this function)
drivers/char/ipmi/ipmi_msghandler.c:3153: error: (Each undeclared identifier is reported only once
drivers/char/ipmi/ipmi_msghandler.c:3153: error: for each function it appears in.)
make[3]: *** [drivers/char/ipmi/ipmi_msghandler.o] Error 1
When examining the source, it's a logical error: the symbol IPMI_OEM_RECV_TYPE is indeed only present on that line, and is not defined anywhere else.
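For anyone who wants to repeat that check on their own tree, here is a small sketch. The two helper names are made up for this post, and it assumes GNU grep (for --include) and that the constant would be introduced with a #define.

```shell
#!/bin/sh
# List every mention of a symbol in the .c and .h files under a tree.
# Usage: find_symbol <source-dir> <symbol>
find_symbol() {
    grep -rnw --include='*.c' --include='*.h' -e "$2" "$1"
}

# Succeed only if a "#define <symbol>" exists somewhere under the tree.
# Usage: has_define <source-dir> <symbol>
has_define() {
    grep -rqE --include='*.c' --include='*.h' \
        -e "^[[:space:]]*#[[:space:]]*define[[:space:]]+${2}([[:space:]]|\$|\()" "$1"
}

# Example (run from the top of the kernel source):
#   find_symbol . IPMI_OEM_RECV_TYPE
#   has_define . IPMI_OEM_RECV_TYPE || echo "no definition found"
```

If find_symbol only shows the one use in ipmi_msghandler.c and has_define fails, the tree is in the state described above.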
After a call to HP, there seems to be a solution present. Apparently it's a known issue on these boxes, when one is using the open source IPMI, instead of the HP OpenIPMI. The resolution was to install HP OpenIPMI:
Quote
DESCRIPTION On an HP ProLiant ML350 G5 server configured with a single processor, if the HP System Health Application and Insight Management Agents for Linux are installed without the HP OpenIPMI (hp-OpenIPMI) device driver loaded, a console message is displayed indicating that there is a problem with the system fan and that the server will shut down in 60 seconds. After 60 seconds has passed, the server reboots. When this occurs, the following message is written to the /var/log/messages file:
hpasmlited: WARNING: System Fan Removed (Fan 6, Location CPU)
The HP OpenIPMI (hp-OpenIPMI) device driver reports information on the status of the system fan to the HP System Health Application. If the HP OpenIPMI (hp-OpenIPMI) device driver is not loaded, then the OpenIPMI device driver that is included with the Linux kernel is loaded by default. The hpasmlited application uses the IPMI Sensor Device Records (SDRs) to determine what devices are present and working and what action to take. The SDR for Fan 6 is delivered to the hpasmlited application with instructions to shut down the server if the fan is missing or failed. As a result, the HP System Health Application flags the system fan as not being present and shuts down the server. The HP OpenIPMI (hp-OpenIPMI) device driver relies on OEM messages sent by the Base Management Controller (BMC) to shut the server down. The IPMI 2.0 OEM messages are not supported by the standard Linux drivers shipped with the Linux kernel.
When the ProLiant Support Pack for Linux is loaded, the HP OpenIPMI (hp-OpenIPMI) device driver is loaded by default. However, if a user chooses to not load the HP OpenIPMI (hp-OpenIPMI) device driver, then the OpenIPMI device driver that is included with the Linux kernel is loaded.
Note: This does not occur on a ProLiant ML350 G5 configured with two processors.
SCOPE Any HP ProLiant ML350 G5 server configured with a single processor and running HP System Health Application and Insight Management Agents for Linux and the OpenIPMI device driver that is included with the Linux kernel.
RESOLUTION To prevent the server from rebooting, load the HP OpenIPMI (hp-OpenIPMI) device driver instead of the OpenIPMI device driver if the HP System Health Application and Insight Management Agents for Linux are installed.
To locate the latest version of the HP OpenIPMI device driver, click on the following URL, select the desired Linux operating system, and then click on "Driver - System Management":
We're having a slight problem here with an ML350 G5 with a dual-core Xeon 5120. The system runs Etch AMD64 (stable), and everything seems to be in order. Installation of the agents went flawlessly as well (with a bit of script tweaking).
The agents start up fine and work properly, but unfortunately a bit too well: once hpasmlited is running, it triggers a system reboot. The syslog revealed the reason:
Code:
May 2 14:38:31 pbxcalpam hpasmlited[9887]: CRITICAL: System Fan Removed (Fan 6, Location CPU)
May 2 14:38:31 pbxcalpam hpasmlited[9887]: A System Reboot has been requested by the management processor in 60 seconds.
The thing is, it's perfectly normal that Fan 6 (covering the second CPU socket) is not present, because there is no second CPU installed. I updated the BIOS to the latest version (apparently there was a newer one), but to no avail. My guess is that the agents inadvertently see the second core (it's a dual-core after all) as a second CPU, and thus detect a missing fan covering it.
Anyone have any thoughts about how to get rid of this problem (aside from not using the agents)?
So what I did was enable serial console (S1) output with vt320 emulation (/etc/inittab), configure it as the standard console (/boot/grub/menu.lst), and access the server via the remote serial console in iLO2. (Grub itself doesn't have to be modified, because it runs pre-OS and therefore works with the native iLO serial support, though it might be better to do so: faster, better usability...)
In light of this thread, and my get agents running on DL380 G5 x86_64 spree, this little howto...
Basic howto
First off, configure iLO2 with your desired users, passwords and addresses, like one would normally do (or not do, if the default password suffices). One thing to set up, though, is the Virtual Serial Port speed. It's recommended to increase the value from 9600bps to something more sane (38400 or 57600 for instance).
Next up, we need to edit grub so the grub console is shown through the serial link, as well as the kernel messages: edit /boot/grub/menu.lst, and add these lines:
Code:
serial --unit=1 --speed=38400
terminal --timeout=15 console serial
These two lines do the following:
set up the serial console on COM2 (COM1 would be --unit=0; COM3 would be --unit=2, etc) with speed 38400bps
Next up we define a terminal selection thingy: what we do here is tell grub to show a selection on both screens ("press any key to continue"). If within 15 sec, a key is pressed on either the console (monitor and keyboard), or through the serial link, that terminal is chosen as the display for grub. If after 15 sec no choice has been made, the monitor is selected as default.
Next, we need to tell the kernel to direct its output to regular console, as well as the serial link. Otherwise, we won't see the pretty kernel messages rolling. The kernel accepts more than one console target, so we can add the following to the boot parameters:
Code:
console=ttyS1 console=tty0
This tells the kernel to use both COM2 as well as the default console (monitor and keyboard) as the "active consoles".
As a final step, we need to tell inittab to set up a respawning serial console on our port: add (or uncomment) these lines in /etc/inittab:
Code:
T0:2345:respawn:/sbin/getty -L ttyS1 38400 vt320
Note that we use vt320 emulation instead of vt100, as was told at this thread.
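Since the same speed has to appear in both menu.lst and inittab (and match what iLO2 is set to), a quick consistency check can save a reboot. This is only a sketch: the function names are invented, and it assumes the "getty -L tty speed term" argument order used above.

```shell
#!/bin/sh
# Speed from the "serial --unit=N --speed=S" line of a grub menu.lst.
grub_serial_speed() {
    sed -n 's/^[[:space:]]*serial[[:space:]].*--speed=\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

# Speed from the first uncommented inittab line that runs a getty on <tty>.
# Usage: inittab_getty_speed <inittab> <tty>
inittab_getty_speed() {
    awk -v tty="$2" '
        /^[ \t]*#/ { next }                 # skip commented-out entries
        {
            for (i = 1; i < NF; i++)
                if ($i == tty && $(i + 1) ~ /^[0-9]+$/) { print $(i + 1); exit }
        }' "$1"
}

# Succeeds when both files agree on the speed.
# Usage: speeds_match <menu.lst> <inittab> <tty>
speeds_match() {
    g=$(grub_serial_speed "$1")
    i=$(inittab_getty_speed "$2" "$3")
    [ -n "$g" ] && [ "$g" = "$i" ]
}

# Example:
#   speeds_match /boot/grub/menu.lst /etc/inittab ttyS1 || echo "speed mismatch"
```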
Start a reboot, and immediately fire up the serial console of the iLO2 web interface. You should be able to follow the complete boot process: the BIOS, grub, and the boot of Linux. It's possible that kernel messages are displayed on the monitor but not through the serial console. This is because inittab is initialized and the getty takes over the serial port, and is thus perfectly normal (to my knowledge).
EDIT: Addendum: configuring single user mode in grub
To get single user going through serial line, it's a bit trickier, because the kernel only takes one console as the main console (rather than all specified consoles). Only the last console specified is used, so we need to add an additional entry for single-user mode, and tell the kernel to use the serial link:
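As a rough sketch of what the kernel parameters of such an entry could look like, the helper below (a made-up name) takes an existing parameter list, drops its console= arguments, and re-adds them with the serial port last plus the single keyword. The root device in the example is hypothetical.

```shell
#!/bin/sh
# Build the kernel parameters for a single-user entry on the serial console.
# Usage: single_user_cmdline <serial-tty> <existing parameters...>
single_user_cmdline() {
    serial=$1
    shift
    out=""
    for arg in "$@"; do
        case $arg in
            console=*|single) ;;        # re-added below in the right order
            *) out="$out$arg " ;;
        esac
    done
    # The serial port goes last so it becomes the main console.
    printf '%s\n' "${out}single console=tty0 console=$serial"
}

# Example with a hypothetical root device:
#   single_user_cmdline ttyS1 root=/dev/sda1 ro console=ttyS1 console=tty0
```

The printed parameters would then go on the kernel line of a copy of your normal menu.lst entry.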
Once you have hpasm running on x86_64, installation of the hpsmh package is pretty straightforward. Change the script to fetch the x86_64 rpm instead of i386, run it (of course, make sure the deps are all in place), and start it up.
You should be able to control just about everything, even the hp log, although it's a bit silly, as logging to hp log doesn't (yet) seem to work with IPMI.
It's a quick and dirty method, but seems to work (tested on a DL380 G5 running Etch rc1). If there are any errors to it, please post those in that thread.
Following up on the thread started by other folks, we've managed to get hpasm somewhat running on x86_64 systems. Apparently (and not very surprisingly), the x86_64 rpms are basically still 32-bit versions, except for hpasmlited (the IPMI based hpasm), which is 64bit. We will therefore be using IPMI to read out the sensors, instead of the regular hpasm (not that we have much of a choice...). The following has been tested on a shiny new DL380 G5, running Debian 4.0 Etch RC1.
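If you want to see for yourself which of the shipped binaries are 32-bit, the ELF header tells you: bytes 0-3 are the magic number and byte 4 is 1 for 32-bit or 2 for 64-bit. A minimal sketch (elf_class is just a name picked here):

```shell
#!/bin/sh
# Print "32", "64" or "unknown" for a file, based on the first five bytes
# of its ELF header (magic 7f 45 4c 46, then the class byte).
elf_class() {
    hdr=$(od -An -tx1 -N5 "$1" 2>/dev/null | tr -d ' \n')
    case $hdr in
        7f454c4601) echo 32 ;;
        7f454c4602) echo 64 ;;
        *) echo unknown ;;
    esac
}

# Example: elf_class /bin/sh
```

Pass it the path of any agent binary once the package is unpacked.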
Part 1: getting to run IPMI-enabled hpasm
Anyway, I started out by downloading the HPasm-rpm2deb script, and changing the arch from i386 to x86_64. Again, make sure the necessary dependencies (alien, wget and fakeroot) are installed before firing up the script:
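A quick way to confirm the three tools are there before running the script (a sketch; missing_tools is a made-up helper):

```shell
#!/bin/sh
# Print the name of every listed command that is not found in PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing=$(missing_tools alien wget fakeroot)
[ -z "$missing" ] || echo "install these first:" $missing
```

Anything it lists can be pulled in with apt-get install.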
Now, before running hpasm configure, head towards your /etc/init.d directory, and create a symlink:
Code:
# cd /etc/init.d
# ln -s ipmi.hp ipmi
The problem is that the startup scripts of hpasm look for /etc/init.d/ipmi, which obviously doesn't exist. You may also want to remove/comment the annoying . /etc/init.d/functions line in the hpasm startup script, as it is quite unnecessary.
Next up (and still before running the configure script), load in all the ipmi modules:
If it's not running, check the syslog to see the problem. If there are messages saying /dev/ipmi0 can't be found, you haven't modprobed the necessary modules (or your controller is not detected by the kernel drivers).
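The module loading and the /dev/ipmi0 check could be scripted roughly like this. The module names are the standard in-kernel IPMI ones; that the hp-OpenIPMI build installs them under the same names is an assumption, so adjust the list if yours differ.

```shell
#!/bin/sh
# Assumed module names (standard kernel IPMI stack).
IPMI_MODULES="ipmi_msghandler ipmi_devintf ipmi_si"

# Load the modules in order; needs root. Defined only, not called here.
load_ipmi_modules() {
    for m in $IPMI_MODULES; do
        modprobe "$m" || return 1
    done
}

# Succeeds if the given path (default /dev/ipmi0) is a character device.
ipmi_device_present() {
    [ -c "${1:-/dev/ipmi0}" ]
}

# Example (as root):
#   load_ipmi_modules && ipmi_device_present || tail /var/log/syslog
```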
To read out the stats, you can use ipmitool. It's one command away:
Code:
# apt-get install ipmitool
You can now issue commands like:
Code:
# ipmitool sdr list
to read out various sensors.
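To show only the sensors that need attention, the list can be filtered. This sketch assumes the "name | reading | status" layout and treats "ok" and "ns" (no reading) as uninteresting; both are assumptions about ipmitool's output, so check them against what your box prints.

```shell
#!/bin/sh
# Read "name | reading | status" lines on stdin and print only those whose
# status column is neither "ok" nor "ns".
sdr_problems() {
    awk -F'|' '
        NF >= 3 {
            status = $3
            gsub(/^[ \t]+|[ \t]+$/, "", status)   # trim the padding
            if (status != "ok" && status != "ns") print
        }'
}

# Example:
#   ipmitool sdr list | sdr_problems
```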
Part 2: getting to run storage agents
As said before, the storage agents are still 32bit binaries, which means we can't use them as-is on our 64bit Debian. Luckily, Debian has foreseen this very problem with a set of 32bit emulation libraries for 64bit systems. Installing these will enable you to run 32bit applications on 64bit environments (provided they don't have funny dependencies). To get the cmastor bit running, we installed these:
Actually, you only need to punch in the first two, the others are automagically selected as deps. Once that's done, do a restart of hpasm, and you'll notice a whole lot more processes are running:
Code:
7769 ? S 0:00 cmathreshd -p 5 -s OK
7775 ? S 0:00 cmahostd -p 15 -s OK
7781 ? Sl 0:00 cmapeerd
7820 ? Sl 0:00 cmastdeqd -p 30
7827 ? Sl 0:00 cmahealthd -p 30 -s OK -t OK -i
7838 ? S 0:00 cmaperfd -p 30 -s OK
7909 ? S 0:00 cmaeventd -p 15
7915 ? S 0:00 cmaidad -p 15 -s OK
7921 ? S 0:00 cmafcad -p 15 -s OK
7927 ? S 0:00 cmaided -p 15 -s OK
8097 ? Ssl 0:00 hpasmlited -f /dev/ipmi0
EDIT: Part 3: automatic startup
For automatic startup, I added the ipmi modules to /etc/modules, as they are not loaded automatically. Secondly, I issued the command below to get hpasm into the startup sequence:
I ran this particular command to ensure that hpasm gets started after snmp (though I'm not sure if snmp is really used with hpasmlited), and stopped before snmp stops. Not sure if this is necessary, but it's a cleaner way.
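For the /etc/modules part, a small sketch that appends a module only when it isn't listed yet, so it is safe to run more than once. The helper name is made up, and the module names in the example are the standard kernel ones (an assumption for the HP build).

```shell
#!/bin/sh
# Append a module name to a modules file unless that exact line already exists.
# Usage: add_module <file> <module>
add_module() {
    grep -qxF -e "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Example (as root):
#   for m in ipmi_msghandler ipmi_devintf ipmi_si; do
#       add_module /etc/modules "$m"
#   done
```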
Part 4: unresolved issues (edit: was part 3)
Okay, there are still a few 'bugs'. First off, we're not using the fully fledged hpasm. This means we're missing logging to the HP Event Log (which is a pain, I admit). Instead, hpasmlited seems to be logging things to the syslog. Secondly, although hpasmcli seems to work, it's still a bit unstable under emulation and regularly segfaults. Note that ipmitool can be used as an alternative, though.
As a conclusion, this small howto is far from finished, and was only tested on one machine. It will most likely need to be generalized for the broader audience. Also, I haven't tried to run hpsmh or the ilo agents just yet (this will probably be done in the next few days). Anyway, any type of comment or problems are welcome in this thread.
Because of the troublesome and plain idiotic business of our site being hacked, we've decided to kick our current CMS out, and to run solely on our forum for the time being. I've made the downloads available here: