...
Description of problem:

I want to update a running pacemaker cluster that includes a pacemaker remote node.

[root@ratester3 ~]# crm_mon -1
Stack: corosync
Current DC: ratester1 (version 2.0.1-4.el8_0.4-0eb7991564) - partition with quorum
Last updated: Tue Oct 8 06:02:14 2019
Last change: Tue Oct 8 05:49:02 2019 by root via cibadmin on ratester2

3 nodes configured
1 resource configured

Online: [ ratester1 ratester2 ]
RemoteOnline: [ ratester3 ]

Active resources:
 ratester3 (ocf::pacemaker:remote): Started ratester1

I'm updating only the real cluster nodes (ratester1 and ratester2) to a newer pacemaker version:

[root@ratester3 ~]# crm_mon -1
Stack: corosync
Current DC: ratester2 (version 2.0.2-3.el8-744a30d655) - partition with quorum
Last updated: Tue Oct 8 07:36:09 2019
Last change: Tue Oct 8 06:58:22 2019 by ratester3 via crm_attribute on ratester2

3 nodes configured
1 resource configured

Online: [ ratester1 ratester2 ]
RemoteOnline: [ ratester3 ]

Active resources:
 ratester3 (ocf::pacemaker:remote): Started ratester2

At this point, setting an attribute in the live CIB still works fine:

[root@ratester3 ~]# pcs node attribute ratester2 foo=foo_value

However, the same operation fails when run against an offline CIB:

[root@ratester3 ~]# pcs cluster cib > cib.xml
[root@ratester3 ~]# pcs -f cib.xml node attribute ratester2 bar=bar_value
Error: unable to set attribute bar
Error performing operation: Protocol not supported
Error setting bar=bar_value (section=nodes, set=nodes-2): Protocol not supported

With debug logging enabled, it becomes apparent that this happens because the cluster's CIB feature set has been bumped by the upgrade, while the pacemaker remote node hasn't been upgraded yet:

[root@ratester3 ~]# PCMK_debug=yes PCMK_logfile=/dev/stdout pcs -f cib.xml node attribute ratester2 bar=bar_value
Error: unable to set attribute bar
Set r/w permissions for uid=189, gid=189 on /dev/stdout
Oct 08 08:59:06 ratester3 crm_attribute [26653] (crm_log_args) notice: Invoked: /usr/sbin/crm_attribute -t nodes --node ratester2 --name bar --update bar_value
Oct 08 08:59:06 ratester3 crm_attribute [26653] (validate_with_relaxng) info: Creating RNG parser context
Oct 08 08:59:06 ratester3 crm_attribute [26653] (cib_file_signon) debug: crm_attribute: Opened connection to local file 'cib.xml'
Oct 08 08:59:06 ratester3 crm_attribute [26653] (cib_file_perform_op_delegate) info: cib_query on /cib/configuration/nodes/node[translate(@uname,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz') ='ratester2']|/cib/configuration/resources/primitive[@class='ocf'][@provider='pacemaker'][@type='remote'][translate(@id,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz') ='ratester2']|/cib/configuration/resources/primitive/meta_attributes/nvpair[@name='remote-node'][translate(@value,'ABCDEF
Oct 08 08:59:06 ratester3 crm_attribute [26653] (cib_process_xpath) debug: Processing cib_query op for /cib/configuration/nodes/node[translate(@uname,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz') ='ratester2']|/cib/configuration/resources/primitive[@class='ocf'][@provider='pacemaker'][@type='remote'][translate(@id,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz') ='ratester2']|/cib/configuration/resources/primitive/meta_attributes/nvpair[@name='remote-node'][translate(@value,'A
Oct 08 08:59:06 ratester3 crm_attribute [26653] (query_node_uuid) info: Mapped node name 'ratester2' to UUID 2
Oct 08 08:59:06 ratester3 crm_attribute [26653] (cib_file_perform_op_delegate) info: cib_query on //cib/configuration/nodes//node[@id='2']//instance_attributes//nvpair[@name='bar']
Oct 08 08:59:06 ratester3 crm_attribute [26653] (cib_process_xpath) debug: cib_query: //cib/configuration/nodes//node[@id='2']//instance_attributes//nvpair[@name='bar'] does not exist
Oct 08 08:59:06 ratester3 crm_attribute [26653] (cib_file_perform_op_delegate) info: cib_modify on nodes
Oct 08 08:59:06 ratester3 crm_attribute [26653] (cib_perform_op) error: Discarding update with feature set '3.2.0' greater than our own '3.1.0'
Oct 08 08:59:06 ratester3 crm_attribute [26653] (update_attr_delegate) info: Update <node id="2">
Oct 08 08:59:06 ratester3 crm_attribute [26653] (update_attr_delegate) info: Update <instance_attributes id="nodes-2">
Oct 08 08:59:06 ratester3 crm_attribute [26653] (update_attr_delegate) info: Update <nvpair id="nodes-2-bar" name="bar" value="bar_value"/>
Oct 08 08:59:06 ratester3 crm_attribute [26653] (update_attr_delegate) info: Update </instance_attributes>
Oct 08 08:59:06 ratester3 crm_attribute [26653] (update_attr_delegate) info: Update </node>
Error performing operation: Protocol not supported
Oct 08 08:59:06 ratester3 crm_attribute [26653] (cib_file_signoff) debug: Disconnecting from the CIB manager
Oct 08 08:59:06 ratester3 crm_attribute [26653] (crm_xml_cleanup) info: Cleaning up memory from libxml2
Error setting bar=bar_value (section=nodes, set=nodes-2): Protocol not supported
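For reference, the mismatch can be confirmed directly; a minimal sketch, assuming the CIB was dumped to cib.xml as above (pacemakerd --features reports the feature set supported by the locally installed packages, and the CIB root element records the feature set it was written with in its crm_feature_set attribute):

  # On the non-upgraded remote node: feature set supported by the local tools
  pacemakerd --features

  # In the CIB dumped from the upgraded cluster: feature set it was written with
  grep -o 'crm_feature_set="[^"]*"' cib.xml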
This is problematic for our use of pacemaker and pcs in OpenStack, for several reasons:

1. operators can upgrade their cluster nodes in an arbitrary order, so we can't guarantee that they will upgrade all their pacemaker remotes before upgrading the real cluster nodes.

2. likewise, we are using bundles, which run pacemaker remotes, and we can't guarantee that operators will restart all containers with up-to-date container images before upgrading the real cluster nodes.

3. in OpenStack we have an idiomatic way of calling pcs with an offline CIB, because we drive the creation of pcs resources from puppet and we have to implement a means of checking for resource differences between two puppet runs (see the sketch under "Additional info" below).

Version-Release number of selected component (if applicable):
pacemaker-2.0.1-4.el8_0.4.x86_64

How reproducible:
Always

Steps to Reproduce:
1. create a cluster with a pacemaker remote node (with e.g. pacemaker-2.0.1-4.el8_0.4.x86_64)
2. upgrade the real cluster nodes to a pacemaker rpm that ships a different feature set (e.g. pacemaker-2.0.2-3.el8.x86_64)
3. from the non-upgraded remote node, try to update a node attribute in an offline CIB

Actual results:
no attribute can be updated in the offline CIB, because the remote node's feature set lags behind the cluster's.

Expected results:
adding/updating attributes in the offline CIB should still work even if the real cluster nodes have been upgraded.

Additional info:
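For context on reason (3) above, a minimal sketch of the kind of offline-CIB round trip we script from puppet; the pcs and crm_diff commands are real tools, but the file names and exact sequence here are illustrative assumptions, not our actual puppet code:

  # Dump the live CIB and apply the desired changes offline
  pcs cluster cib > original.xml
  cp original.xml updated.xml
  pcs -f updated.xml node attribute ratester2 bar=bar_value

  # Compare the two versions to decide whether anything changed
  crm_diff --original original.xml --new updated.xml

  # Push the offline CIB back only when it differs from the live one
  pcs cluster cib-push updated.xml

It is the "pcs -f ... node attribute" step in this sequence that fails on the non-upgraded remote node once the cluster's feature set has been bumped.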
Done-Errata