Fixed Issues in Apache YARN
Review the list of YARN issues that are resolved in Cloudera Runtime 7.1.9.
- COMPX-14340: YARN-11490 JMX QueueMetrics breaks after mutable config validation in CS
 - JMX metrics no longer break after two or more configuration validations.
 - COMPX-13959: Applications submitted to ambiguous queue fail during recovery if "Specified" Placement Rule is used
 - Fixed an issue where an application was killed if the "Specified" placement rule was used and the ResourceManager was restarted while the application was still running.
 - COMPX-13773: YARN-11461 NPE in determineMissingParents when the queue is invalid
 - Fixed the NullPointerException (NPE) warning that was logged when an application was submitted to an invalid queue.
 - COMPX-14120: Backport YARN-11463: Node Labels root directory creation doesn't have a retry logic
 - Retry logic is implemented and backported for root directory creation during ResourceManager node label store initialization.
 - COMPX-10909: Investigate if placement rules are working fine if username contains dot, and default queue is set to that queue
 - Usernames that contain a dot now work correctly with Capacity Scheduler placement rules.
 - COMPX-13554: Backport YARN-10178 to 7.1.9 CHFx : Crash in global async scheduler thread
 - With this fix, the Capacity Scheduler Global Scheduler AsyncThread no longer crashes when multiple async threads concurrently compare queue usage statistics while ResourceCommitterService applies statistics changes to leaf queues.
 - COMPX-12661: YARN-11075 Explicitly declare serialVersionUID in LogMutation class
 - The serialVersionUID field is explicitly set for the LogMutation class.
 - COMPX-13392: HADOOP-18602 Remove netty3 dependency - CDH-7.1.9
 - The netty3 dependency is removed.
 - COMPX-12815: Backport YARN-10178 to 7.1.8 CHFx : Crash in global async scheduler thread
 - With this fix, the Capacity Scheduler Global Scheduler AsyncThread no longer crashes when multiple async threads concurrently compare queue usage statistics while ResourceCommitterService applies statistics changes to leaf queues.
 - COMPX-12783: Backport YARN-11079 (Make an AbstractParentQueue to store common ParentQueue and ManagedParentQueue functionality)
 - Added an AbstractParentQueue class to store the functionality common to ParentQueue and ManagedParentQueue.
 - COMPX-14124: Backport YARN-10739 GenericEventHandler.printEventQueueDetails causes RM recovery to take too much time
 - GenericEventHandler.printEventQueueDetails caused ResourceManager recovery to take too much time. A thread pool was added to print event details asynchronously, so that the ResourceManager no longer loses time on this step.
 - COMPX-14122: Backport YARN-11286: Make AsyncDispatcher#printEventDetailsExecutor thread pool parameter configurable
 - The thread pool parameter of AsyncDispatcher#printEventDetailsExecutor is now configurable.
 - CDPD-41982: Yarn - Upgrade Guava: Google Core Libraries for Java to v28.2/31.1-jre due to CVEs
 - Upgraded Guava (Google Core Libraries for Java) to v28.2 due to CVEs.
 - CDPD-57948: [7.1.9 ZDU Simulation] Hive Query is failing when YARN is into rolling restart
 - A YARN-side fix is implemented and backported to cdpd-master and 7.1.9.x.
 - COMPX-6054: PlacementPolicy Rules(default rule) is not honoured in case limit 2 is breached for AQC
 - This issue is resolved.
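
The asynchronous printing described for COMPX-14124 and COMPX-14122 above can be illustrated with a minimal Java sketch. It is not the AsyncDispatcher implementation: the class name `EventDetailsPrinterSketch`, the methods `summarize` and `summarizeAsync`, and the `poolSize` argument are hypothetical names chosen for this example. The sketch only shows the general idea of handing the summary of queued events to a small, configurable thread pool so that the calling thread is not blocked.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; this is NOT the real AsyncDispatcher code.
public class EventDetailsPrinterSketch {
    private final ExecutorService printExecutor;

    // poolSize stands in for a configurable thread pool parameter (assumed name).
    public EventDetailsPrinterSketch(int poolSize) {
        this.printExecutor = Executors.newFixedThreadPool(poolSize);
    }

    // Counts queued events per type; TreeMap keeps the output order stable.
    public static String summarize(List<String> queuedEventTypes) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String type : queuedEventTypes) {
            counts.merge(type, 1, Integer::sum);
        }
        return counts.toString();
    }

    // Copies the queue contents, then builds the summary on a pool thread
    // instead of on the caller's thread.
    public Future<String> summarizeAsync(List<String> queuedEventTypes) {
        List<String> snapshot = new ArrayList<>(queuedEventTypes);
        return printExecutor.submit(() -> summarize(snapshot));
    }

    public void shutdown() {
        printExecutor.shutdown();
    }

    public static void main(String[] args) throws Exception {
        EventDetailsPrinterSketch printer = new EventDetailsPrinterSketch(2);
        Future<String> details = printer.summarizeAsync(
            Arrays.asList("NODE_UPDATE", "APP_ADDED", "NODE_UPDATE"));
        System.out.println("Event queue details: " + details.get());
        printer.shutdown();
    }
}
```

In this sketch the caller takes a snapshot of the queue and returns immediately, and the potentially slow summary runs on the pool. The pool size is passed in by the caller, which mirrors the idea of making the thread pool parameter configurable.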
 
- COMPX-5244: Root queue should not be enabled for auto-queue creation
 - This issue is resolved.
 
- COMPX-3181: Application logs does not work for AZURE and AWS cluster
 - Added support for automatically fetching the delegation token for the YARN log aggregation path (S3 or Azure) in YarnClient.
 
- OPSAPS-52066: Stacks under Logs Directory for Hadoop daemons are not accessible from Knox Gateway.
 - The issue was caused by an incorrect URL being displayed. Both the jstacks log viewer and download URLs have been fixed.
 
- OPSAPS-57067: Yarn Service in Cloudera Manager reports stale configuration yarn.cluster.scaling.recommendation.enable.
 - This issue is resolved.
 
- CDPD-2936: Application logs are not accessible in WebUI2 or Cloudera Manager
 - This issue is resolved.
 
- OPSAPS-50291: Environment variables HADOOP_HOME, PATH, LANG, and TZ are not getting whitelisted
 - "HADOOP_HOME,PATH,LANG,TZ" are now added by default to the yarn.nodemanager.env-whitelist YARN configuration option.
 
- COMPX-3303: Auto queue deletion is not supported in relative and absolute resource allocation mode
 - This issue is resolved.
 - OPSAPS-68058: [CKP-4] YARN allowed system users are hardcoded
 - Allowed system users are now generated dynamically, based on the Kerberos principals, process users and auth-to-local rules.
 - OPSAPS-67682: [CKP-3, 4(unequal)] Yarn failed to start the resource manager
 - The permissions of the node label directory were relaxed so that members of the process user's group can access it.
 - OPSAPS-67860: [BLOCKER] 718CHF9 to 719 | During rolling upgrade Delete the confstore on YARN Zookeeper nodes failed
 - The script was fixed to use Kerberos authentication instead of relying on digest authentication.
 - OPSAPS-68108: Upgrade failures from CDH6 to 7.1.9 because ACL is not the expected for znode after OPSAPS-67993
 - Fixed an issue with the ACL validator.
 - OPSAPS-67993: Upgrade failures from CDH6 to 7.1.9 because ACL is not the expected for znode after OPSAPS-63187
 - The bash script was updated to work in a secured environment.
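
For OPSAPS-50291 above, the effect of a default yarn.nodemanager.env-whitelist value can be sketched in Java. This is a simplified illustration, not NodeManager code: the class `EnvWhitelistSketch` and its methods are hypothetical, and the inheritance rule shown (a whitelisted variable is taken from the NodeManager environment only when the container has not set it) is an assumption made for this example.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; this is NOT NodeManager code.
public class EnvWhitelistSketch {
    // Default value quoted in the release note for OPSAPS-50291.
    public static final String DEFAULT_WHITELIST = "HADOOP_HOME,PATH,LANG,TZ";

    // Splits the comma-separated option value into variable names.
    public static Set<String> parseWhitelist(String value) {
        Set<String> names = new LinkedHashSet<>();
        for (String name : value.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                names.add(trimmed);
            }
        }
        return names;
    }

    // Assumed rule: container values win; whitelisted variables that the
    // container did not set are inherited from the NodeManager environment.
    public static Map<String, String> buildContainerEnv(
            Map<String, String> nodeManagerEnv,
            Map<String, String> containerEnv,
            String whitelist) {
        Map<String, String> result = new LinkedHashMap<>(containerEnv);
        for (String name : parseWhitelist(whitelist)) {
            String inherited = nodeManagerEnv.get(name);
            if (inherited != null) {
                result.putIfAbsent(name, inherited);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> nodeManagerEnv = new LinkedHashMap<>();
        nodeManagerEnv.put("PATH", "/usr/bin");
        nodeManagerEnv.put("TZ", "UTC");
        nodeManagerEnv.put("NOT_LISTED", "ignored");

        Map<String, String> containerEnv = new LinkedHashMap<>();
        containerEnv.put("TZ", "CET");

        System.out.println(
            buildContainerEnv(nodeManagerEnv, containerEnv, DEFAULT_WHITELIST));
    }
}
```

Under these assumptions, a variable that is absent from the whitelist never reaches the container from the NodeManager environment. Adding the four names by default therefore removes the need to list them manually.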
 
Apache patch information
- MAPREDUCE-7237
 - MAPREDUCE-7268
 - MAPREDUCE-7434
 - MAPREDUCE-7433
 - MAPREDUCE-7431
 - YARN-10930
 - YARN-11286
 - YARN-10739
 - YARN-10178
 - HADOOP-18602
 - YARN-11190
 - YARN-11463
 - YARN-11461
 - YARN-11513
 - YARN-10888
 - YARN-11533
 - YARN-11490
 
