RedHat

The StABU cluster subscribes to Red Hat, and uses version 6, which is one behind the latest version, version 7.

As of November 2017, all nodes were on the latest minor release of version 6, 6.9 "Santiago".

Being one version behind is typically not a problem, as it continues to be fully supported. However, some software packages like to use the bleeding edge, and there are known problems installing these programs.

Installation of devtoolset-2

There are apparently two ways, via RHN Classic or RH Subscription Management, and we choose the latter because it has a useful command-line tool.

To install on a node:

ssh node4 'subscription-manager subscribe --pool=8a85f9814ed67b98014eda13ed983c7d'

Note here that we have previously been able to verify the pool ID. For documentation, google "Red Hat Developer Toolset".

This will also enable the appropriate repos for you (though *-source-rpms and *-debug-rpms may be left out).
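If the pool ID ever needs checking again, subscription-manager can list it; a minimal sketch (node4 is just an example host):

# list the subscription pools available to this account (the "Pool ID" field
# is what the subscribe command above wants)
subscription-manager list --available

# on a node that is already subscribed, show what is attached
ssh node4 'subscription-manager list --consumed'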

Problems

outdated repomd.xml warnings

You can get rid of them (i.e. update the repomd.xml) by:

yum clean all
yum update yum*

PYCURL error 7

Note that this can easily occur if the squid server on marvin has failed. Make sure you at least try to restart the squid server on marvin.

This link has a fairly exhaustive procedure for checking errors using tcpdump and nc.
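A quick first pass along those lines, before reaching for tcpdump (RHEL 6 SysV service names assumed):

# on marvin: check squid and restart it if it has fallen over
sudo service squid status
sudo service squid restart

# from a node: confirm the proxy port is reachable at all
nc -zv 172.25.64.1 3128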

RedHat service types

  • A service is usually under one or more of these (i.e. RHN Classic or RH Subscription Management).
  • Some of them are a little old.
  • It's probably best to use subscription-manager.
man subscription-manager

will work.

The wiki entry on problems with node1 can also be helpful: node 1 issues

In November, marvin was migrated from RHN to RHSM. This looked difficult to do, but then it turned out that RHN was being decommissioned, so there was no choice. In the end, it was not such a difficult thing to do, using the

rhn-migrate-classic-to-rhsm

command.
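In outline, the migration amounts to running the tool and then checking the result; a minimal sketch (the tool prompts for Red Hat credentials interactively):

# migrate this system from RHN Classic to subscription management
sudo rhn-migrate-classic-to-rhsm

# confirm the system now has an RHSM identity and attached subscriptions
sudo subscription-manager identity
sudo subscription-manager list --consumed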

ssl connection errors

The nodes are connected to the internet via the squid proxy service running on marvin. The proxy was installed due to pressure from IT Services, as the NAT option has been discredited by them. In Miguel-era StABU, the NAT would be set up during updates or for short temporary periods (an hour or two) and then closed. However, this meant that, generally, the nodes would not have access to the internet.

The Squid proxy was installed so that nodes would have full internet access and to satisfy IT Services. The relevant IP and port numbers are

172.25.64.1:3128
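On each node, that proxy is referenced in two places, as recorded elsewhere on this page:

# /etc/rhsm/rhsm.conf
proxy_hostname = 172.25.64.1
proxy_port = 3128

# /etc/yum.conf
proxy=http://172.25.64.1:3128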

This has worked quite well since it was installed in 2016, and the big September 2017 update went through without a hitch. However, in November 2017 it started failing with SSL connection errors: not to EPEL or anything that was not Red Hat, but, critically, to

https://cdn.redhat.com/content/dist/rhel/server/6/6Server/x86_64/optional/os/repodata/repomd.xml

This is nasty because it's the most important connection. A longish, but somewhat raw, troubleshooting page from Red Hat is up at [https://access.redhat.com/solutions/189533 this link].

One vital test is working out whether you can get through to Red Hat's SSL (https) servers. Try this with

curl -v --proxy 172.25.64.1:3128 https://subscription.rhn.redhat.com --cacert /etc/rhsm/ca/redhat-uep.pem

or, because it seems to amount to the same thing:

curl -v -k --proxy 172.25.64.1:3128 https://subscription.rhn.redhat.com/subscription/ --cacert /etc/rhsm/ca/redhat-uep.pem

Output from this, with the problem still present, is:

*   Trying 172.25.64.1...
* TCP_NODELAY set
* Connected to 172.25.64.1 (172.25.64.1) port 3128 (#0)
* Establish HTTP proxy tunnel to subscription.rhn.redhat.com:443
> CONNECT subscription.rhn.redhat.com:443 HTTP/1.1
> Host: subscription.rhn.redhat.com:443
> User-Agent: curl/7.54.1
> Proxy-Connection: Keep-Alive
> 
< HTTP/1.0 200 Connection established
< 
* Proxy replied OK to CONNECT request
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/rhsm/ca/redhat-uep.pem
  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Request CERT (13):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Certificate (11):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / AES256-GCM-SHA384
* Server certificate:
*  subject: C=US; ST=North Carolina; O=Red Hat, Inc.; OU=Red Hat Network; CN=subscription.rhn.redhat.com; emailAddress=customerservice@redhat.com
*  start date: May 18 16:24:18 2017 GMT
*  expire date: May 18 16:24:18 2019 GMT
*  issuer: C=US; ST=North Carolina; O=Red Hat, Inc.; OU=Red Hat Network; CN=Red Hat Entitlement Operations Authority; emailAddress=ca-support@redhat.com
*  SSL certificate verify ok.
> GET /subscription/ HTTP/1.1
> Host: subscription.rhn.redhat.com
> User-Agent: curl/7.54.1
> Accept: */*
> 
* HTTP 1.0, assume close after body 
< HTTP/1.0 200 OK
< Content-Type: application/json
< Server: BigIP
* HTTP/1.0 connection set to keep alive!
< Connection: Keep-Alive
< Content-Length: 1224
< 
[ {"rel":"","href":"/"},{"rel":"deleted_consumers","href":"/deleted_consumers"},{"rel":"consumertypes","href":"/consumertypes"},{"rel":"crl","href":"/crl"},{"rel":"entitlements","href":"/entitlements"},{"rel":"guestids","href":"/consumers/{consumer_uuid}/guestids"},
{"rel":"pools","href":"/pools"},{"rel":"products","href":"/products"},{"rel":"activation_keys","href":"/activation_keys"},{"rel":"statistics/generate","href":"/statistics/generate"},{"rel":"distributor_versions","href":"/distributor_versions"},{"rel":"owners","href":"/owners"},
{"rel":"cdn","href":"/cdn"},{"rel":"roles","href":"/roles"},{"rel":"events","href":"/events"},{"rel":"consumers","href":"/consumers"},{"rel":"atom","href":"/atom"},{"rel":"content_overrides","href":"/consumers/{consumer_uuid}/content_overrides"},{"rel":"rules","href":"/rules"},
{"rel":"content","href":"/content"},{"rel":"hypervisors","href":"/hypervisors"},{"rel":"users","href":"/users"},{"rel":"migrations","href":"/migrations"},{"rel":"status","href":"/status"},{"rel":"admin"* Connection #0 to host 172.25.64.1 left intact
,"href":"/admin"},{"rel":"serials","href":"/serials"},{"rel":"jobs","href":"/jobs"},{"rel":"subscriptions","href":"/subscriptions"} ,{"href": "/consumers/{consumer_uuid}/packages","rel": "packages"} ]

This all looks really positive. Is the error bona fide? Well, it refers only to optional-rpms, and in fact that is the nature of the problem. OK, disable it:

subscription-manager repos --disable rhel-6-server-optional-rpms

However, this also fails with an error; the details end up in

/var/log/rhsm/rhsm.log
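The quickest way to see what actually went wrong is to look at that log directly, for example:

# show the most recent subscription-manager log entries, and just the errors
tail -n 50 /var/log/rhsm/rhsm.log
grep -i error /var/log/rhsm/rhsm.log | tail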

Work around

The Red Hat webpage for this issue mentioned above (#189533) rambles on a bit, and it's not quite clear how to follow it. The problem is, it's often quoted as the go-to webpage for these issues! In any case, none of its checks gave any joy. Over and over again, "yum check-update" would fail with Error 14 (no SSL connection).

Because updating the nodes is rather important, the workaround consists of:

  • opening up the NAT temporarily from marvin. On marvin (then sudo ssh to each node):
sudo iptables -t nat -A POSTROUTING -o bond0.506 -j MASQUERADE
  • on each node, commenting out the proxy IP and port number in /etc/rhsm/rhsm.conf (see the sketch after this list), i.e. these lines:

# an http proxy server to use
proxy_hostname = 172.25.64.1

# port for http proxy server
proxy_port = 3128

  • updating yum as usual, and rebooting the node
  • (optional, because the root cause is not resolved) uncommenting the proxy IP and port settings in /etc/rhsm/rhsm.conf
  • if you have finished with the nodes, removing the NAT rule on marvin:
sudo iptables -t nat -D POSTROUTING 1

(note this command assumes the rule to be the only rule and numbered "1")
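The per-node comment/uncomment step can be scripted; a minimal sketch run from marvin ("node4" is just an example hostname, and the sed backup suffix is an assumption):

# comment out the proxy lines in rhsm.conf, keeping a backup of the original
ssh node4 "sed -i.orig 's/^proxy_hostname/# proxy_hostname/; s/^proxy_port/# proxy_port/' /etc/rhsm/rhsm.conf"

# run the update as usual, then put the original config back afterwards
ssh node4 'yum clean all && yum update'
ssh node4 'mv /etc/rhsm/rhsm.conf.orig /etc/rhsm/rhsm.conf'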

This may be of use at some point:

sudo subscription-manager remove --all
sudo subscription-manager unregister
sudo subscription-manager clean
sudo subscription-manager register
sudo subscription-manager refresh
sudo subscription-manager attach --auto
yum clean all
subscription-manager repos --list
  • additional unknown SSL errors occurred, and the RHN Classic licence was still active. To remove it, the following worked (for PT, 17 September 2019):


sudo subscription-manager register      # this gave a WARNING that the system was still registered with RHN Classic
mv /etc/sysconfig/rhn/systemid /etc/sysconfig/rhn/systemid_old
# set enabled = 0 in /etc/yum/pluginconf.d/rhnplugin.conf to disable the RHN Classic yum plugin (see the one-liner below)
rm -rf /var/cache/yum/*                 # clear the yum cache
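A hedged one-liner for the rhnplugin.conf edit (assumes GNU sed, and that the first "enabled" line in the file is the one under [main]):

# flip only the first "enabled =" line in the plugin config to 0
sed -i '0,/^enabled/s/^enabled *=.*/enabled = 0/' /etc/yum/pluginconf.d/rhnplugin.conf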


rm -rf /var/cache/yum/*
mv /etc/yum.repos.d/* /tmp
sudo subscription-manager remove --all
sudo subscription-manager unregister
sudo subscription-manager clean
sudo subscription-manager register
sudo subscription-manager refresh
sudo subscription-manager attach --auto
yum clean all

Then

yum update yum
yum update

A diagnostic step: https://access.redhat.com/solutions/1323123

host $( grep ^hostname /etc/rhsm/rhsm.conf | cut -d '=' -f 2 )
subscription.rhsm.redhat.com has address 209.132.183.107


If all this STILL fails, then this is the newer method. Again, again, again...

Make sure you do the weird thing on the head node and alter the conf files, as specified above.

1. Remove the proxy entry from /etc/yum.conf:

     # nano /etc/yum.conf
     proxy=http://172.25.64.1:3128   <<<<< remove this line (I only comment it out and put it back in afterwards, as I don't know what it really does!)

2. Unset the proxy variables, since the system is not able to connect with the proxy address itself:

     # unset http_proxy
     # unset ftp_proxy   (I didn't need to do this one!)

Now execute the following commands and we should be able to proceed further:

 # subscription-manager refresh
 # yum clean all
 # yum repolist



The root cause, concluded from the analysis of both systems, is that the systems are not able to communicate with the proxy address 172.25.64.1:3128. But the systems already have access to the Red Hat registration servers through the corporate firewall, so registration and running yum commands were still possible.

Hence, make sure that the system does not have any proxy entries in any of the files below or in the environment variables (see the quick check after this list):

   - /etc/yum.conf
   - /etc/rhsm/rhsm.conf 
   - Proxy Env Variables can be removed using, 
     # unset http_proxy 
     # unset https_proxy 
     # unset ftp_proxy
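A quick way to check for any lingering proxy settings before retrying:

# look for proxy lines in the two config files, and in the environment
grep -n '^proxy' /etc/yum.conf /etc/rhsm/rhsm.conf
env | grep -i proxy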


If stuff still fails!!! then: (this was the fix for the marvin head node)

/etc/yum.repos.d/redhat.repo is the only repository file required for OS patching. Move all the files (except redhat.repo) from /etc/yum.repos.d/ to /tmp/ (a sketch of the move follows the commands) and then run the commands below:

# ls -l /etc/yum.repos.d/
---> Make sure only redhat.repo file is present.
# yum clean all;
# yum repolist
# yum update
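A minimal sketch of the file move itself (the backup directory under /tmp is an assumption):

# park everything except redhat.repo in /tmp so it can be restored later
mkdir -p /tmp/yum.repos.d.bak
for f in /etc/yum.repos.d/*; do
    [ "$(basename "$f")" = "redhat.repo" ] || mv "$f" /tmp/yum.repos.d.bak/
done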


Node 9 still failing. This was the fix for that:

Considering upgrading to RHEL7

A link for this is solution number 637583

Thoughts

  • This could be one heck of an epic upgrade, especially as regards the manually installed software programs (150 of them, some of them very big)
  • Most likely it will be the different C library that will cause problems
  • Library and linking problems are big ... fear them!

Test with node10

Following solution 637583, we install the upgrade assessment software and run

preupg

which, happily enough, will not install the upgrade but report on what problems are likely to be encountered.

Note that it takes a while to run, maybe 30 minutes.
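For reference, a hedged sketch of the node10 test; the package name and report path are assumptions from memory rather than from this page:

# install the assessment tool (package name assumed), then run the assessment;
# preupg only reports, it does not start the upgrade itself
yum install preupgrade-assistant
preupg

# the HTML report is expected under /root/preupgrade/result.html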