When legacy Instance Data Replication (IDR) is installed, scheduled jobs are created and, by design, run frequently. In Washington DC and later releases, all new replication sets are created as IDR v2 sets. IDR v2 also uses a new set of scheduled jobs, which continue to work when the infrastructure for legacy IDR (a Confluent Kafka cluster) is disabled.

The legacy IDR infrastructure is being disabled in phases from April 2025 through October 2025 in order to provide better support and, ultimately, a better IDR experience on the newer, more reliable v2 infrastructure. All affected customers have been notified through COMM records that they need to upgrade their legacy replication sets.

An issue has been identified where the scheduled jobs intended for legacy IDR are not automatically disabled when all replication sets have been migrated to v2. Once the Confluent Kafka cluster is no longer reachable, these legacy jobs keep running every few seconds and write an error to the log on each run, which can flood the replication log (which extends syslog) with Kafka TimeoutExceptions.
1. Set up Instance Data Replication (IDR) with a legacy replication set (https://www.servicenow.com/docs/bundle/xanadu-servicenow-platform/page/administer/instance-data-replication/concept/instance-data-replication-set-up.html).
2. (Optional) Deactivate the replication set.
3. Change the glide.idr.bootstrap_servers property from Scripts - Background:
   gs.setProperty("glide.idr.bootstrap_servers", "localhost:9592")
4. Observe that the replication log (which extends syslog) gets flooded with Kafka TimeoutExceptions:
   org.apache.kafka.common.errors.TimeoutException
It is recommended that impacted instances that have upgraded from legacy IDR to IDR v2 upgrade to a fixed version, where the issue is permanently fixed and the legacy IDR jobs are cleaned up automatically. In the meantime, or if an upgrade to a fixed version is not possible, the scheduled jobs that run legacy IDR must be cleaned up to prevent the large volume of errors in the replication log after the legacy IDR infrastructure is disabled.

A proactive maintenance (attached IDRLegacyJobCleanupScript.txt) is being scheduled and will be run on impacted instances that have already migrated from legacy IDR to IDR v2, to clean up the remaining legacy IDR jobs that are no longer needed.

-----------------
FAQ

Q: When will this maintenance happen?
A: A COMM record will be sent notifying impacted instances of the date and time of the planned maintenance.

Q: How can I tell if my instance is affected by this issue?
A: An instance is affected if it has the IDR plugin installed and has active producer or consumer replication sets where Legacy=true.

Q: What does the maintenance do?
A: It cleans up the following scheduled jobs from the sysauto_script and sys_trigger tables: "IDRConsumerJob", "IDRMetadataConsumerJob", and the remote jobs, which have a 3-letter code appended to the name (e.g. "IDRConsumerJob-ytz", "IDRConsumerJob-bwi"). These jobs only process data for legacy IDR replication sets. IDR v2 replication sets (replication sets where Legacy=false) are unaffected.

Q: Will this maintenance have any service impact?
A: Removing these records has no service impact. A cache flush or node restart is NOT required during the maintenance.

Q: What is the impact if I do not perform this maintenance?
A: If there are legacy replication sets that are not in a deactivated state (for example, sets in an error state), the instance will log an error message on every execution of the jobs (approximately four times every 15 seconds), which will eventually fill disk space.

Q: Why is there a maintenance to remove a deprecated feature that is not enabled out of box?
A: The errors will only appear once the legacy IDR infrastructure is deactivated in 2025, and that deactivation is not tied to a family release. This is a proactive maintenance to prevent errors when the deactivation occurs.

Q: Can I run this maintenance script on my own?
A: Yes. The script is attached to this KB, and admins can run it themselves in Scripts - Background in the global scope. As always, it is recommended that the script first be applied to sub-production instance(s) before it is applied to production instance(s).

Q: What should I expect once the maintenance is completed? What is the validation?
A: The following records will be cleaned up (removed):
sys_trigger = IDRConsumerJob, IDRMetadataConsumerJob, IDRDeltaConsumerJob-*, IDRMetadataConsumerJob-*, IDRSeedingConsumerJob-*
sysauto_script = IDRConsumerJob, IDRMetadataConsumerJob, IDRDeltaConsumerJob-*, IDRMetadataConsumerJob-*, IDRSeedingConsumerJob-*
Note that the following jobs are NOT touched or deactivated by this maintenance:
IDRHermesConsumerJob*
IDRHermesMetadataConsumerJob*

Q: Is this maintenance fixing the code defect itself?
A: No, this maintenance only removes the records mentioned above. The fix for the code defect removes the code in future releases and patch upgrades.

Q: What if I apply the maintenance script on my own?
A: The scheduled maintenance will still run on the instance but will take no action, because its logic checks whether the maintenance has already been applied. The COMM record will still report that the maintenance was performed. There is no impact, as the maintenance has already been performed manually.

Q: I have an instance that wasn't added for maintenance even though I believe it should be added. What do I do?
A: Please run the maintenance script attached to this KB. There is no risk in running it, as it only removes the deprecated feature.

Q: I have an activity occurring at the same time as the maintenance, such as a deployment. Will this impact my deployment?
A: The maintenance will not interrupt other activities being performed at the same time. However, if the instance is temporarily down due to an activity such as a clone, the maintenance will not proceed, and the ServiceNow admin for the instance needs to run the maintenance script from the workaround section manually in Scripts - Background in the global scope.

Q: What versions will address this issue?
A: See the Fixed Versions section of this article.

Q: If I upgrade to a later family (for example, X to Y), do I need to run the maintenance again?
A: No. The maintenance only needs to be run once, and upgrading to one of the Fixed Versions automatically addresses this issue.

Q: When is the legacy IDR infrastructure being disabled?
A: The legacy infrastructure is being disabled in phases from April 2025 through October 2025, depending on the instance. Each affected customer has received COMMs detailing the specific dates by which they need to migrate from legacy IDR to IDR v2.

Q: Can I check if my instance is affected?
A: Yes. Insert your instance name and launch this URL query to confirm whether the instance is still using legacy IDR or has migrated to IDR v2:
https://<instance_name>.service-now.com/now/nav/ui/classic/params/target/idr_replication_set_list.do%3Fsysparm_query%3Dactive%253Dtrue%255Eis_legacy%253Dtrue%26sysparm_first_row%3D1%26sysparm_view%3D
If the URL query returns NO records, no legacy sets are active on the instance, so this maintenance should be applied to clean up the legacy jobs.
If the URL query returns record(s), legacy sets are STILL active on the instance. These legacy replication sets first need to be migrated to v2, and then this maintenance should be applied to clean up the legacy jobs.
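The check URL and the job-name lists in the FAQ can be sketched as two small plain-JavaScript helpers. This is an illustrative sketch, not the attached IDRLegacyJobCleanupScript.txt and not a ServiceNow API: the function names legacySetCheckUrl and isLegacyIdrJob are hypothetical, and the suffix pattern is an assumption that reads the "-*" wildcard in the FAQ as "a hyphen followed by one or more word characters".

```javascript
// Hypothetical helpers for illustration only; neither is part of ServiceNow.

// Builds the FAQ's list URL for active legacy replication sets.
function legacySetCheckUrl(instanceName) {
  // Encoded query taken from the FAQ URL: active=true^is_legacy=true
  const query = encodeURIComponent("active=true^is_legacy=true");
  const target =
    "idr_replication_set_list.do?sysparm_query=" + query +
    "&sysparm_first_row=1&sysparm_view=";
  // The whole target is encoded a second time, which is why the FAQ URL
  // contains sequences such as %253D and %255E.
  return "https://" + instanceName +
    ".service-now.com/now/nav/ui/classic/params/target/" +
    encodeURIComponent(target);
}

// Job names the FAQ lists as removed without a suffix.
const LEGACY_EXACT = new Set(["IDRConsumerJob", "IDRMetadataConsumerJob"]);

// Remote jobs with an appended code, e.g. "IDRConsumerJob-ytz".
// Assumption: the suffix is one or more word characters after a hyphen.
const LEGACY_REMOTE = /^IDR(?:Delta|Metadata|Seeding)?ConsumerJob-\w+$/;

// Returns true for names the FAQ lists as legacy jobs, and false for the
// IDRHermes* jobs that the maintenance leaves untouched.
function isLegacyIdrJob(name) {
  return LEGACY_EXACT.has(name) || LEGACY_REMOTE.test(name);
}

console.log(legacySetCheckUrl("example"));
console.log(isLegacyIdrJob("IDRConsumerJob-ytz"));   // true
console.log(isLegacyIdrJob("IDRHermesConsumerJob")); // false
```

A classifier like this is only useful for reviewing job names before or after the maintenance (for example, against an exported list of sys_trigger names); the attached script remains the supported way to perform the cleanup.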
PRB1885869