Difference between revisions of "RedHat"
PeterThorpe (talk | contribs) (→Work around) |
Latest revision as of 08:29, 26 September 2019
The StABU cluster subscribes to Red Hat, and uses version 6, which is one behind the latest version, version 7.
As of November 2017, all nodes were on the latest release of version 6, namely 6.9 "Santiago".
Being one version behind is typically not a problem, as version 6 continues to be fully supported. However, some software packages like to use the bleeding edge, and there are known problems installing these programs.
Installation of devtoolset-2
There are apparently two ways, via RHN Classic or RH Subscription Management, and we choose the latter because there is a useful command-line tool for it.
To install on a node:
ssh node4 'subscription-manager subscribe --pool=8a85f9814ed67b98014eda13ed983c7d'
Note here that we have previously been able to verify the pool ID. Documentation for this can be found by googling Red Hat Developer Toolset.
This will also enable the appropriate repos for you (though *-source-rpms and *-debug-rpms may be left out).
Problems
outdated repomd.xml warnings
You can get rid of them (i.e. update the repomd.xml) by:
yum clean all
yum update yum*
PYCURL error 7
Note that this could easily occur if the squid server on marvin has failed. Make sure you at least try to restart the squid server on marvin.
This link has a fairly exhaustive procedure for checking errors using tcpdump and nc.
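Before reaching for tcpdump, a quick first check is whether the squid port on marvin answers at all. The following is only a minimal sketch: the function name port_open is made up for illustration, the 3 second timeout is an arbitrary choice, and it assumes bash (for its /dev/tcp pseudo-device) and the coreutils timeout command are present on the node.

```shell
#!/usr/bin/env bash
# Return 0 if a TCP connection to host ($1) and port ($2) opens within 3 seconds.
# Uses bash's built-in /dev/tcp pseudo-device, so neither nc nor telnet is needed.
# port_open is a hypothetical helper name, not part of any Red Hat tool.
port_open() {
    timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# The squid proxy address and port used on this cluster.
if port_open 172.25.64.1 3128; then
    echo "squid port reachable"
else
    echo "squid port NOT reachable: check the squid service on marvin"
fi
```

A reachable port only shows that something is listening; it does not prove that squid is forwarding requests properly.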
RedHat service types
- A service is usually under one or more of these.
- Some of them are a little old.
- it's probably best to use subscription manager.
man subscription-manager
will work.
The wiki entry on problems with node1 can also be helpful: node 1 issues
In November, marvin was migrated from RHN to RHSM. This looked difficult to do at first, but then it turned out that RHN was being decommissioned, so there was no choice. In the end, it was not such a difficult thing to do with the
rhn-migrate-classic-to-rhsm
command.
ssl connection errors
The nodes are connected to the internet via the squid proxy service running on marvin. The proxy was installed due to pressure from IT Services, as the NAT option has been discredited by them. In Miguel-era StABU, the NAT would be set up during updates or for short temporary periods (an hour or two) and then closed. However, this meant that, in general, the nodes would not have access to the internet.
The Squid proxy was installed so that nodes would have full internet access and to satisfy IT Services. The relevant IP and port numbers are
172.25.64.1:3128
This has worked quite well since 2016, when it was installed, and the big September 2017 update went through without a hitch. However, in November 2017 it started failing with SSL connection errors. Not to EPEL or anything that was not Red Hat, but, critically, to
https://cdn.redhat.com/content/dist/rhel/server/6/6Server/x86_64/optional/os/repodata/repomd.xml
This is nasty because it's the most important connection. A longish but somewhat raw troubleshooting page from Red Hat is up at https://access.redhat.com/solutions/189533
One vital test is working out whether you can get through to Red Hat's SSL (https) servers. Try this with
curl -v --proxy 172.25.64.1:3128 https://subscription.rhn.redhat.com --cacert /etc/rhsm/ca/redhat-uep.pem
or, because it seems to amount to the same thing:
curl -v -k --proxy 172.25.64.1:3128 https://subscription.rhn.redhat.com/subscription/ --cacert /etc/rhsm/ca/redhat-uep.pem
Output from this, with this problem still present is
* Trying 172.25.64.1...
* TCP_NODELAY set
* Connected to 172.25.64.1 (172.25.64.1) port 3128 (#0)
* Establish HTTP proxy tunnel to subscription.rhn.redhat.com:443
> CONNECT subscription.rhn.redhat.com:443 HTTP/1.1
> Host: subscription.rhn.redhat.com:443
> User-Agent: curl/7.54.1
> Proxy-Connection: Keep-Alive
>
< HTTP/1.0 200 Connection established
<
* Proxy replied OK to CONNECT request
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
* CAfile: /etc/rhsm/ca/redhat-uep.pem
  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Request CERT (13):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Certificate (11):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / AES256-GCM-SHA384
* Server certificate:
* subject: C=US; ST=North Carolina; O=Red Hat, Inc.; OU=Red Hat Network; CN=subscription.rhn.redhat.com; emailAddress=customerservice@redhat.com
* start date: May 18 16:24:18 2017 GMT
* expire date: May 18 16:24:18 2019 GMT
* issuer: C=US; ST=North Carolina; O=Red Hat, Inc.; OU=Red Hat Network; CN=Red Hat Entitlement Operations Authority; emailAddress=ca-support@redhat.com
* SSL certificate verify ok.
> GET /subscription/ HTTP/1.1
> Host: subscription.rhn.redhat.com
> User-Agent: curl/7.54.1
> Accept: */*
>
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Content-Type: application/json
< Server: BigIP
* HTTP/1.0 connection set to keep alive!
< Connection: Keep-Alive
< Content-Length: 1224
<
[ {"rel":"","href":"/"},{"rel":"deleted_consumers","href":"/deleted_consumers"},{"rel":"consumertypes","href":"/consumertypes"},{"rel":"crl","href":"/crl"},{"rel":"entitlements","href":"/entitlements"},{"rel":"guestids","href":"/consumers/{consumer_uuid}/guestids"},
{"rel":"pools","href":"/pools"},{"rel":"products","href":"/products"},{"rel":"activation_keys","href":"/activation_keys"},{"rel":"statistics/generate","href":"/statistics/generate"},{"rel":"distributor_versions","href":"/distributor_versions"},{"rel":"owners","href":"/owners"},
{"rel":"cdn","href":"/cdn"},{"rel":"roles","href":"/roles"},{"rel":"events","href":"/events"},{"rel":"consumers","href":"/consumers"},{"rel":"atom","href":"/atom"},{"rel":"content_overrides","href":"/consumers/{consumer_uuid}/content_overrides"},{"rel":"rules","href":"/rules"},
{"rel":"content","href":"/content"},{"rel":"hypervisors","href":"/hypervisors"},{"rel":"users","href":"/users"},{"rel":"migrations","href":"/migrations"},{"rel":"status","href":"/status"},{"rel":"admin","href":"/admin"},{"rel":"serials","href":"/serials"},{"rel":"jobs","href":"/jobs"},{"rel":"subscriptions","href":"/subscriptions"}
,{"href": "/consumers/{consumer_uuid}/packages","rel": "packages"} ]
* Connection #0 to host 172.25.64.1 left intact
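Reading through that much verbose output each time is tedious, so here is a rough sketch that picks out the two milestones that matter: the proxy accepting the CONNECT and the certificate verifying. The function name summarise_curl_log is made up for illustration, and the two search strings are simply taken from the transcript above (curl 7.54.1); other curl builds may word these messages differently, in which case the patterns need adjusting.

```shell
#!/usr/bin/env bash
# Read `curl -v` output on stdin and report two milestones seen in the
# transcript above. summarise_curl_log is a hypothetical helper name.
summarise_curl_log() {
    log=$(cat)
    # Milestone 1: squid accepted the CONNECT tunnel request.
    if printf '%s\n' "$log" | grep -q 'Proxy replied .* to CONNECT request'; then
        echo "proxy tunnel: ok"
    else
        echo "proxy tunnel: FAILED"
    fi
    # Milestone 2: the server certificate verified against the CA file.
    if printf '%s\n' "$log" | grep -q 'SSL certificate verify ok'; then
        echo "certificate: ok"
    else
        echo "certificate: FAILED"
    fi
}

# Demonstration on two lines copied from the transcript.
printf '%s\n' '* Proxy replied OK to CONNECT request' '* SSL certificate verify ok.' | summarise_curl_log
```

On a node it would be fed from the real test, for example: curl -v --proxy 172.25.64.1:3128 https://subscription.rhn.redhat.com --cacert /etc/rhsm/ca/redhat-uep.pem 2>&1 | summarise_curl_log (curl writes its verbose lines to stderr, hence the 2>&1).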
This all looks really positive, so is the error bona fide? Well, it refers to optional-rpms only, and that is in fact the nature of the problem. OK, disable it:
subscription-manager repos --disable rhel-6-server-optional-rpms
However, this gets the following error:
/var/log/rhsm/rhsm.log
Work around
The Red Hat webpage on this issue mentioned above (#189533) rambles on a bit, and it's not quite clear how to follow it. The problem is that it's often quoted as the go-to webpage for these issues! In any case, none of its checks gave any joy: over and over again, "yum check-update" would fail with the Error 14 no ssl connection.
Because updating the nodes is rather important, the workaround consists of
- opening up the NAT temporarily from marvin
ON MARVIN, then sudo ssh
sudo iptables -t nat -A POSTROUTING -o bond0.506 -j MASQUERADE
- on each node, commenting out the proxy IP and port number in the /etc/rhsm/rhsm.conf file
nano /etc/rhsm/rhsm.conf
The lines to comment out are the proxy_hostname and proxy_port settings:
# an http proxy server to use
proxy_hostname = 172.25.64.1
# port for http proxy server
proxy_port = 3128
- updating yum as usual, and rebooting the node
- (optional, because the root cause is not resolved) uncommenting the proxy IP and port settings in /etc/rhsm/rhsm.conf
- once you have finished with the nodes, removing the NAT rule on marvin:
sudo iptables -t nat -D POSTROUTING 1
(note this command assumes the rule to be the only rule in the chain, and so numbered "1")
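The hand edits above can be sketched as small shell helpers. This is an illustration under stated assumptions, not a tested procedure for the cluster: the function names (disable_rhsm_proxy, enable_rhsm_proxy, nat_rule_cmd) are made up, the sed expressions assume the two settings each start their own line as shown above, and nat_rule_cmd only prints the iptables command rather than running it. Deleting the rule by its full specification is standard iptables usage and avoids relying on it being rule number 1.

```shell
#!/usr/bin/env bash
# Comment out proxy_hostname / proxy_port in an rhsm.conf-style file ($1).
# sed keeps the previous version of the file as <file>.bak.
disable_rhsm_proxy() {
    sed -i.bak \
        -e 's/^[[:space:]]*proxy_hostname[[:space:]]*=/#&/' \
        -e 's/^[[:space:]]*proxy_port[[:space:]]*=/#&/' "$1"
}

# Undo disable_rhsm_proxy: strip the leading "#" from the same two settings.
enable_rhsm_proxy() {
    sed -i.bak \
        -e 's/^#[[:space:]]*\(proxy_hostname[[:space:]]*=\)/\1/' \
        -e 's/^#[[:space:]]*\(proxy_port[[:space:]]*=\)/\1/' "$1"
}

# Print (do not run) the iptables command that appends (A) or deletes (D)
# the MASQUERADE rule for outbound interface $2.
nat_rule_cmd() {
    echo "iptables -t nat -$1 POSTROUTING -o $2 -j MASQUERADE"
}

# Demonstration on a scratch file, not on the live /etc/rhsm/rhsm.conf.
scratch=$(mktemp)
printf '%s\n' 'proxy_hostname = 172.25.64.1' 'proxy_port = 3128' > "$scratch"
disable_rhsm_proxy "$scratch"
cat "$scratch"
nat_rule_cmd D bond0.506
rm -f "$scratch" "$scratch.bak"
```

On a real node the helpers would be pointed at /etc/rhsm/rhsm.conf under sudo, and the printed iptables line would be run on marvin.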
This may be of use at some point:
sudo subscription-manager remove --all
sudo subscription-manager unregister
sudo subscription-manager clean
sudo subscription-manager register
sudo subscription-manager refresh
sudo subscription-manager attach --auto
yum clean all
subscription-manager repos --list
- additional unknown ssl errors occurred, and the RH classic license was still on. To remove it, the following worked for PT on 17 Sep 2019:
sudo subscription-manager register WARNING
mv /etc/sysconfig/rhn/systemid /etc/sysconfig/rhn/systemid_old
set enabled = 0 in file "/etc/yum/pluginconf.d/rhnplugin.conf"
clear the cache:
rm -rf /var/cache/yum/*
mv /etc/yum.repos.d/* /tmp
sudo subscription-manager remove --all
sudo subscription-manager unregister
sudo subscription-manager clean
sudo subscription-manager register
sudo subscription-manager refresh
sudo subscription-manager attach --auto
yum clean all
Then
yum update yum
yum update
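Since the re-registration sequence keeps being needed, it could be wrapped in a script. The following is a sketch only: run and reregister are made-up names, and DRY_RUN is an invented switch that defaults to 1, so the commands are merely printed unless it is set to 0. A failing step is reported and the sequence carries on, much as it would when typing the commands by hand; whether that is the right behaviour for every step is an open question.

```shell
#!/usr/bin/env bash
# DRY_RUN defaults to 1, so by default each command is only printed.
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it and report failure.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@" || { echo "FAILED: $*" >&2; return 1; }
    fi
}

# The re-registration sequence from the text, in the same order.
reregister() {
    run subscription-manager remove --all
    run subscription-manager unregister
    run subscription-manager clean
    run subscription-manager register
    run subscription-manager refresh
    run subscription-manager attach --auto
    run yum clean all
}

reregister
```

To run it for real, something like sudo DRY_RUN=0 bash on the saved script would be needed (the file name is up to you); the register step then prompts for the Red Hat credentials as usual.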
A diagnostic step: https://access.redhat.com/solutions/1323123
host $( grep ^hostname /etc/rhsm/rhsm.conf | cut -d '=' -f 2 )
subscription.rhsm.redhat.com has address 209.132.183.107
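The grep and cut pipeline above leaves a leading space on the hostname, which only works because the command substitution is unquoted. A slightly more careful variant might look like the sketch below; rhsm_hostname is a made-up function name, and the demonstration runs on a scratch file with the same layout as rhsm.conf rather than on the real one.

```shell
#!/usr/bin/env bash
# Print the value of the "hostname" key from an rhsm.conf-style file ($1),
# with spaces and tabs stripped. rhsm_hostname is a hypothetical helper name.
rhsm_hostname() {
    awk -F= '/^hostname[ \t]*=/ { gsub(/[ \t]/, "", $2); print $2; exit }' "$1"
}

# Demonstration on a scratch file.
scratch=$(mktemp)
printf '%s\n' '[server]' 'hostname = subscription.rhsm.redhat.com' 'proxy_hostname = 172.25.64.1' > "$scratch"
rhsm_hostname "$scratch"
rm -f "$scratch"
```

On a node the result would be passed to the resolver check, as in host "$(rhsm_hostname /etc/rhsm/rhsm.conf)".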
If all this STILL fails, then this is the newer method.
Make sure you first do the weird thing on the head node (the NAT rule) and alter the conf files, as specified above.
1. Remove the proxy entries from yum.conf
# nano /etc/yum.conf
proxy=http://172.25.64.1:3128 <<<<< Remove this line (I only comment it out and put it back in afterwards, as I don't know what it really does!)
2. Unset the proxy variables, since the system is not able to connect to the proxy address itself
# unset http_proxy
# unset ftp_proxy (I didn't need to do this one!)
Now try the following commands, after which it should be possible to proceed:
# subscription-manager refresh
# yum clean all
# yum repolist
The root cause, as concluded from the analysis of both systems, is that your systems are not able to communicate with your proxy address of 172.25.64.1:3128.
But your systems already have access to the Red Hat registration servers through the corporate firewall, so registering and running yum commands was possible.
Hence, you will need to make sure that the system does not have any proxy entries in any of the files below or in the environment variables:
- /etc/yum.conf
- /etc/rhsm/rhsm.conf
- Proxy environment variables can be removed using
# unset http_proxy
# unset https_proxy
# unset ftp_proxy
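To see at a glance whether any proxy entries are still active, a check along these lines may help. It is a sketch: proxy_lines and proxy_env are made-up names, and the pattern only looks for the proxy, proxy_hostname and proxy_port keys mentioned on this page, ignoring commented-out lines and keys with an empty value.

```shell
#!/usr/bin/env bash
# List uncommented proxy settings that carry a value, with file name and
# line number, in the files given as arguments.
proxy_lines() {
    grep -HnE '^[[:space:]]*proxy(_hostname|_port)?[[:space:]]*=[[:space:]]*[^[:space:]]' "$@" 2>/dev/null
}

# List proxy-related environment variables visible to child processes.
proxy_env() {
    env | grep -iE '^(http|https|ftp)_proxy='
}

proxy_lines /etc/yum.conf /etc/rhsm/rhsm.conf || echo "no active proxy entries found in the config files"
proxy_env || echo "no proxy environment variables set"
```

Note that unset only affects the current shell, so the environment check needs to be run in the same session that will run yum.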
If stuff still fails, then: (this was the fix for the marvin head node)
/etc/yum.repos.d/redhat.repo is the only repository which is required for OS patching. Move all the files (except redhat.repo) from "/etc/yum.repos.d/" to /tmp/ and then run the commands below:
# ls -l /etc/yum.repos.d/
---> Make sure only the redhat.repo file is present.
# yum clean all
# yum repolist
# yum update
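Moving everything except redhat.repo can be sketched as a small function. The name stash_extra_repos and the idea of a dedicated stash directory are assumptions for illustration (the text simply uses /tmp/), and the demonstration works on a throwaway directory with made-up repo file names rather than on /etc/yum.repos.d/.

```shell
#!/usr/bin/env bash
# Move every file in repo directory $1 except redhat.repo into stash directory $2.
# stash_extra_repos is a hypothetical helper name.
stash_extra_repos() {
    repodir=$1
    stash=$2
    mkdir -p "$stash" || return 1
    for f in "$repodir"/*; do
        [ -e "$f" ] || continue          # empty directory: nothing to do
        case "$(basename "$f")" in
            redhat.repo) ;;              # keep the Red Hat repo file in place
            *) mv -- "$f" "$stash"/ ;;
        esac
    done
}

# Demonstration on a scratch directory with illustrative file names.
demo=$(mktemp -d)
mkdir "$demo/repos"
touch "$demo/repos/redhat.repo" "$demo/repos/epel.repo" "$demo/repos/other.repo"
stash_extra_repos "$demo/repos" "$demo/stash"
ls "$demo/repos"
rm -rf "$demo"
```

On the head node the call would be along the lines of stash_extra_repos /etc/yum.repos.d /tmp/yum.repos.d.stash under sudo, which also makes it easy to move the files back afterwards.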
Node 9 still failing. This was the fix for that:
Considering upgrading to RHEL7
A link for this is solution number 637583
Thoughts
- This could be one heck of an epic upgrade, especially as regards the manually installed software programs (150 of them, some of them very big)
- Most likely it will be the different C library that will cause problems
- Library and linking problems are big ... fear them!
Test with node10
Following solution 637583, we install the upgrade assessment software and run
preupg
which, happily enough, will not install the upgrade but report on what problems are likely to be encountered.
Note that it takes a while to run, maybe 30 minutes.