RedHat


The StABU cluster subscribes to Red Hat and uses Red Hat Enterprise Linux (RHEL) version 6, which is one major version behind the latest, version 7.

As of November 2017, all nodes were on 6.9 "Santiago", the latest release of version 6.

Being one version behind is typically not a problem, as it continues to be fully supported. However, some software packages like to use the bleeding edge, and there are admitted problems installing these programs.

Installation of devtoolset-2

There are apparently two ways, via RHN Classic or via RH Subscription Management; we choose the latter because there is a useful command-line tool for it.

To install on a node:

ssh node4 'subscription-manager subscribe --pool=8a85f9814ed67b98014eda13ed983c7d'

Note that we had previously been able to verify the pool ID. Documentation for this can be found by googling "Red Hat Developer Toolset".

This will also enable the appropriate repos for you (though *-source-rpms and *-debug-rpms may be left out).
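To attach the same pool on several nodes, the per-node command above can be wrapped in a loop. The following is a minimal sketch, not the procedure actually used on the cluster: the helper names `subscribe_cmd` and `subscribe_all` are made up for illustration, and the list of nodes is whatever you pass in. It only prints the commands, so they can be checked before anything is run.

```shell
#!/bin/sh
# Pool ID as used in the command above.
POOL_ID="8a85f9814ed67b98014eda13ed983c7d"

# Hypothetical helper: print the ssh command that attaches the pool on one node.
subscribe_cmd() {
    printf "ssh %s 'subscription-manager subscribe --pool=%s'\n" "$1" "$POOL_ID"
}

# Hypothetical helper: print one command per node given as an argument.
subscribe_all() {
    for node in "$@"; do
        subscribe_cmd "$node"
    done
}

# Example (not run here):
#   subscribe_all node4 node10        # review the commands
#   subscribe_all node4 node10 | sh   # actually run them
```

Printing first and piping to `sh` afterwards is a deliberate choice here: a mistyped node name or pool ID shows up before any node is touched.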

Problems

outdated repomd.xml warnings

You can get rid of them (i.e. update the repomd.xml) by:

yum clean all
yum update yum*

PYCURL error 7

Note that this could easily occur if the squid server on marvin has failed. Make sure you at least try restarting the squid server on marvin.

This link has a fairly exhaustive procedure for checking errors using tcpdump and nc.
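The number in a PYCURL error is a libcurl error code, and the curl command-line tool reports the same codes as its exit status, so a failing connection can be reproduced outside yum. As a rough sketch (the function name `curl_error_hint` and the wording of the hints are illustrative, and only a handful of codes are covered), the exit status of a test request can be turned into a first guess at what to check:

```shell
#!/bin/sh
# Map a curl exit status (same numbering as libcurl/PYCURL errors) to a hint.
curl_error_hint() {
    case "$1" in
        0)  echo "ok" ;;
        5)  echo "could not resolve proxy: check the proxy address" ;;
        6)  echo "could not resolve host: check DNS" ;;
        7)  echo "could not connect: check that squid is running on marvin" ;;
        28) echo "timed out: check the network path to the proxy" ;;
        35) echo "SSL connect error: TLS handshake failed" ;;
        60) echo "certificate verification failed: check the CA file" ;;
        *)  echo "curl exit status $1: see 'man curl' (EXIT CODES)" ;;
    esac
}

# Example (not run here): try a request through the proxy, then explain the result.
#   curl -s -o /dev/null --proxy 172.25.64.1:3128 https://cdn.redhat.com/
#   curl_error_hint "$?"
```

The proxy address in the example comment is the squid address on marvin; the hints are only starting points and do not replace the tcpdump and nc checks.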

RedHat service types

  • A service is usually under one or more of these.
  • Some of them are a little old.
  • It's probably best to use subscription-manager.
man subscription-manager

will work.

The wiki entry on problems with node1 can also be helpful: node 1 issues

In November, marvin was migrated from RHN to RHSM. This had looked difficult to do, but then it turned out that RHN was being decommissioned, so there was no choice. In the end, it was not such a difficult thing to do with the

rhn-migrate-classic-to-rhsm

command.

ssl connection errors

The nodes are connected to the internet via the squid proxy service running on marvin. The proxy was installed due to pressure from IT Services, as the NAT option has been discredited by them. In Miguel-era StABU, the NAT would be set up during updates or for short temporary periods (an hour or two) and then closed. However, this meant that the nodes generally would not have access to the internet.

The Squid proxy was installed so that nodes would have full internet access and to satisfy IT Services. The relevant IP and port numbers are

172.25.64.1:3128

This has worked quite well since it was installed in 2016, and the big September 2017 update went through without a hitch. However, in November 2017 it started failing with SSL connection errors: not to EPEL or anything else that was not Red Hat, but, critically, to

https://cdn.redhat.com/content/dist/rhel/server/6/6Server/x86_64/optional/os/repodata/repomd.xml

This is nasty because it's the most important connection. A longish but somewhat raw troubleshooting page from Red Hat is up at https://access.redhat.com/solutions/189533.

One vital test is working out whether you can get through to Red Hat's HTTPS servers. Try this with

curl -v --proxy 172.25.64.1:3128 https://subscription.rhn.redhat.com --cacert /etc/rhsm/ca/redhat-uep.pem

or, because it seems to amount to the same thing:

curl -v -k --proxy 172.25.64.1:3128 https://subscription.rhn.redhat.com/subscription/ --cacert /etc/rhsm/ca/redhat-uep.pem

Output from this, with the problem still present, is

*   Trying 172.25.64.1...
* TCP_NODELAY set
* Connected to 172.25.64.1 (172.25.64.1) port 3128 (#0)
* Establish HTTP proxy tunnel to subscription.rhn.redhat.com:443
> CONNECT subscription.rhn.redhat.com:443 HTTP/1.1
> Host: subscription.rhn.redhat.com:443
> User-Agent: curl/7.54.1
> Proxy-Connection: Keep-Alive
> 
< HTTP/1.0 200 Connection established
< 
* Proxy replied OK to CONNECT request
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/rhsm/ca/redhat-uep.pem
  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Request CERT (13):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Certificate (11):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / AES256-GCM-SHA384
* Server certificate:
*  subject: C=US; ST=North Carolina; O=Red Hat, Inc.; OU=Red Hat Network; CN=subscription.rhn.redhat.com; emailAddress=customerservice@redhat.com
*  start date: May 18 16:24:18 2017 GMT
*  expire date: May 18 16:24:18 2019 GMT
*  issuer: C=US; ST=North Carolina; O=Red Hat, Inc.; OU=Red Hat Network; CN=Red Hat Entitlement Operations Authority; emailAddress=ca-support@redhat.com
*  SSL certificate verify ok.
> GET /subscription/ HTTP/1.1
> Host: subscription.rhn.redhat.com
> User-Agent: curl/7.54.1
> Accept: */*
> 
* HTTP 1.0, assume close after body 
< HTTP/1.0 200 OK
< Content-Type: application/json
< Server: BigIP
* HTTP/1.0 connection set to keep alive!
< Connection: Keep-Alive
< Content-Length: 1224
< 
[ {"rel":"","href":"/"},{"rel":"deleted_consumers","href":"/deleted_consumers"},{"rel":"consumertypes","href":"/consumertypes"},{"rel":"crl","href":"/crl"},{"rel":"entitlements","href":"/entitlements"},{"rel":"guestids","href":"/consumers/{consumer_uuid}/guestids"},
{"rel":"pools","href":"/pools"},{"rel":"products","href":"/products"},{"rel":"activation_keys","href":"/activation_keys"},{"rel":"statistics/generate","href":"/statistics/generate"},{"rel":"distributor_versions","href":"/distributor_versions"},{"rel":"owners","href":"/owners"},
{"rel":"cdn","href":"/cdn"},{"rel":"roles","href":"/roles"},{"rel":"events","href":"/events"},{"rel":"consumers","href":"/consumers"},{"rel":"atom","href":"/atom"},{"rel":"content_overrides","href":"/consumers/{consumer_uuid}/content_overrides"},{"rel":"rules","href":"/rules"},
{"rel":"content","href":"/content"},{"rel":"hypervisors","href":"/hypervisors"},{"rel":"users","href":"/users"},{"rel":"migrations","href":"/migrations"},{"rel":"status","href":"/status"},{"rel":"admin"* Connection #0 to host 172.25.64.1 left intact
,"href":"/admin"},{"rel":"serials","href":"/serials"},{"rel":"jobs","href":"/jobs"},{"rel":"subscriptions","href":"/subscriptions"} ,{"href": "/consumers/{consumer_uuid}/packages","rel": "packages"} ]

This all looks really positive, so is the error bona fide? It refers only to the optional-rpms repository, and that is in fact the nature of the problem. So disable it:

subscription-manager repos --disable rhel-6-server-optional-rpms

However, this gets the following error:

/var/log/rhsm/rhsm.log

Workaround

The Red Hat webpage for this issue mentioned above (#189533) rambles on a bit, and it's not quite clear how to follow it. The problem is that it's often quoted as the go-to webpage for these issues! In any case, none of its checks gave any joy. Over and over again, "yum check-update" would fail with the Error 14 no SSL connection.

Because updating the nodes is rather important, the workaround consists of

  • opening up the NAT temporarily from marvin
sudo iptables -t nat -A POSTROUTING -o bond0.506 -j MASQUERADE
  • on each node, commenting out the proxy IP and port number in the file
/etc/rhsm/rhsm.conf

  • updating yum as usual, and rebooting node
  • (optional because the root cause is not resolved) uncommenting the proxy IP and port settings in /etc/rhsm/rhsm.conf
  • if you have finished with the nodes, removing the NAT rule on marvin:
sudo iptables -t nat -D POSTROUTING 1

(note this command assumes the rule to be the only rule and numbered "1")
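The commenting and uncommenting steps on each node can be scripted with sed. The sketch below assumes the proxy is configured through the usual `proxy_hostname` and `proxy_port` keys in rhsm.conf (the notes above only say "proxy IP and port", so check the file first); the function names are made up for illustration.

```shell
#!/bin/sh
# Comment out the proxy_hostname and proxy_port settings in an rhsm.conf-style file.
# The previous version of the file is kept alongside it with the suffix .bak.
disable_rhsm_proxy() {
    sed -i.bak \
        -e 's/^[[:space:]]*\(proxy_hostname[[:space:]]*=\)/#\1/' \
        -e 's/^[[:space:]]*\(proxy_port[[:space:]]*=\)/#\1/' "$1"
}

# Undo the above: strip the leading "#" from those two settings.
enable_rhsm_proxy() {
    sed -i.bak \
        -e 's/^#[[:space:]]*\(proxy_hostname[[:space:]]*=\)/\1/' \
        -e 's/^#[[:space:]]*\(proxy_port[[:space:]]*=\)/\1/' "$1"
}

# Example (not run here), as root on a node:
#   disable_rhsm_proxy /etc/rhsm/rhsm.conf
#   ... update and reboot ...
#   enable_rhsm_proxy /etc/rhsm/rhsm.conf
```

In the same spirit, the NAT rule can be removed by repeating its specification with -D instead of -A (sudo iptables -t nat -D POSTROUTING -o bond0.506 -j MASQUERADE), which does not depend on the rule being numbered "1".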

Considering upgrading to RHEL7

A link for this is solution number 637583

Thoughts

  • This could be one heck of an epic upgrade, especially as regards the manually installed software programs (150 of them, some of them very big)
  • Most likely it will be the different C library that will cause problems
  • Library and linking problems are big ... fear them!
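One way to get an early picture of library and linking problems is to ask ldd which shared libraries a manually installed program cannot resolve. The sketch below is only an illustration: the helper name `missing_libs` is made up, and the path in the usage comment is a placeholder rather than the cluster's actual install location.

```shell
#!/bin/sh
# Read ldd output on stdin and print the names of libraries reported as "not found".
missing_libs() {
    awk '/=> not found/ { print $1 }' | sort -u
}

# Example (not run here; /path/to/program is a placeholder):
#   ldd /path/to/program | missing_libs
```

Running this over the manually installed programs on an upgraded test node would give a list of libraries to look for, though it says nothing about programs that link but then misbehave.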

Test with node10

Following solution 637583, we install the upgrade assessment software and run

preupg

which, happily enough, will not install the upgrade but report on what problems are likely to be encountered.

Note that it takes a while to run, maybe 30 minutes.