Marvin and IPMI (remote hardware control)

Revision as of 12:38, 2 February 2018 by Rf (talk | contribs)

Introduction

When networking goes down on marvin while the machine itself is still up, ssh can no longer be used and the administrator is stuck.

However, all modern, industrial-grade (i.e. non-consumer) servers now include a separate subsystem which is independent of the main machine. It goes by various names: the standard is called IPMI, which is also what Supermicro calls it, while Dell calls it DRAC and HP calls it iLO (Integrated Lights-Out).

It can be seen as a hardware remote control system: a device with its own network connection, so that one can log in to the IPMI module and send hardware commands (principally power-on and power-cycle) to the main machine.
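
The hardware commands mentioned above can also be sent from a command line rather than a web page. The following is only a minimal sketch, assuming the open-source ipmitool client is installed locally and that the IPMI module accepts IPMI-over-LAN ("lanplus") connections; the hostname and username are hypothetical placeholders, not marvin's real settings. It only builds the command and prints it, so nothing is sent to any machine.

```python
import shlex

# Chassis power actions understood by ipmitool's "chassis power" subcommand.
POWER_ACTIONS = ("status", "on", "off", "cycle", "reset")


def build_power_command(host, user, action):
    """Return the ipmitool argument list for one chassis power action.

    -I lanplus : talk IPMI v2.0 over the network
    -E         : read the password from the IPMI_PASSWORD environment variable
    """
    if action not in POWER_ACTIONS:
        raise ValueError(f"unsupported power action: {action!r}")
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E",
            "chassis", "power", action]


if __name__ == "__main__":
    # Hypothetical IPMI address and account; substitute the real ones.
    cmd = build_power_command("node1-ipmi.example", "ADMIN", "status")
    print(shlex.join(cmd))
    # To actually send the command: subprocess.run(cmd, check=True)
```

Checking "status" first is a cautious habit: it confirms that the IPMI module is reachable before anything as drastic as "cycle" is attempted.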

Despite IPMI's isolation from the main system, it is not immune to faults of its own, in which case the only cure is to ring Ian at IT Services and physically go to the datacentre. Unfortunately, IPMI tends not to cooperate precisely when the main machine has problems of its own, which is disappointing because that is exactly when it is needed. Nevertheless, IPMI is better than nothing and has proved useful on many occasions.

Details

Marvin's nodes can all be remotely controlled, but only from marvin itself. So the usual exercise is to run firefox on marvin and connect to the nodes' IPMI IPs from there.

When marvin's own IPMI needs to be used, this can be done from another computer within the University campus.

There is a standalone GUI application called IPMIconfig which does the same things as the IPMI web interface but, because it doesn't need a browser, can be faster.

The virtual console on IPMI's web interface uses a JNLP (javaws) program and is the best implementation, but it can be patchy. It also allows a local Live Linux ISO file to be loaded so that the machine may be booted from it, though this can be a bit tortuous. It is very clear that Supermicro's IPMI interface is considerably inferior to Dell's DRAC interface used on the biotime machine. Nevertheless, it is possible to boot the recommended Linux Live ISO, sysrescuecd, on Supermicro's IPMI.

Again, it must be repeated that the virtual console's functioning is patchy. An even less dependable alternative to the virtual console is SOL (Serial over LAN). It may seem silly to mention SOL when it is even worse than the virtual console; however, it has several crucial advantages which make it the holy grail of remote hardware control:

  • SOL is a raw terminal connection to the login screen of the main machine.
  • It does not operate via buggy GUIs and web interfaces.
  • One can connect via the command line and record all input and output with the local Linux computer's "script" program (see "man script").
  • When it works, it is much faster than the alternatives.
  • It behaves as if one really were sitting down locally at the machine, looking at the login screen.
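
The command-line connection and "script" recording described in the list above might be combined roughly as follows. This is a sketch under assumptions rather than the procedure actually used on marvin: it assumes the ipmitool client with its "sol activate" subcommand and the util-linux version of script (whose -c option runs a command and logs the session to a file). The hostname, username and log file name are hypothetical.

```python
import shlex


def build_sol_command(host, user):
    """ipmitool arguments that open a Serial over LAN session.

    -E makes ipmitool read the password from the IPMI_PASSWORD
    environment variable instead of taking it on the command line.
    """
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E",
            "sol", "activate"]


def build_recorded_sol_command(host, user, logfile):
    """Wrap the SOL command in script(1) so the whole session is logged."""
    sol = shlex.join(build_sol_command(host, user))
    # util-linux syntax: script -c "<command>" <logfile>
    return ["script", "-c", sol, logfile]


if __name__ == "__main__":
    # Hypothetical IPMI address, account and log file name.
    cmd = build_recorded_sol_command("marvin-ipmi.example", "ADMIN",
                                     "sol-session.log")
    print(shlex.join(cmd))
    # To start the session for real: subprocess.run(cmd)
```

With ipmitool, a SOL session is normally ended by typing ~. at the start of a line, after which script closes the log file.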