Fixed issues in 7.1.9 SP1 CHF 9

Know more about the cumulative hotfix 9 for 7.1.9 SP1.

The following is the list of fixes shipped in CDP Private Cloud Base version 7.1.9-1.cdh7.1.9.p1045.67903105.

CDPD-82688: Error when building the kernel module
Navigator Encrypt failed to install on RHEL 7.9 with the following error:
Failed installation of navencryptfs 7.1.9.1005 DKMS kernel module!

This issue is now fixed.

CDPD-82671: Log files for RocksDB ops should be cleaned more frequently for heavy load
RocksDB log files can grow large when many snapshot operations are involved. This fix cleans up RocksDB operation log files more frequently under heavy load.
CDPD-81939: Volume scanner should fail volume if rocksDB is inaccessible
When RocksDB becomes unreadable on a DataNode due to disk-related issues, the DataNode will mark the affected storage volume as unhealthy. This proactive health marking enables the system to initiate data replication processes more rapidly, thereby maintaining data availability and integrity.
Apache Jira: HDDS-12723
CDPD-81528: Cleanup expired incomplete MPUs
Parts of a multipart upload that are not committed are retained for 30 days by default. With this fix, MultipartUploadCleanupService cleans up the expired parts by aborting the upload. The following configurations can be used to tune the cleanup:
  • ozone.om.open.mpu.cleanup.service.interval (default value=24h)
  • ozone.om.open.mpu.cleanup.service.timeout (default value=300s)
  • ozone.om.open.mpu.expire.threshold (default value=30d)
  • ozone.om.open.mpu.parts.cleanup.limit.per.task (default value=1000)
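As an illustration, the cleanup could be tuned to expire incomplete uploads after 7 days and to run every 12 hours by setting the properties in ozone-site.xml; the values below are examples, not recommendations:

```xml
<!-- Example values only; the defaults are listed above. -->
<property>
  <name>ozone.om.open.mpu.expire.threshold</name>
  <value>7d</value>
</property>
<property>
  <name>ozone.om.open.mpu.cleanup.service.interval</name>
  <value>12h</value>
</property>
```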
Apache Jira: HDDS-9095
CDPD-66534: Support cross realm Kerberos out of box
ozone.om.kerberos.principal.pattern was not configured previously, which prevented cross-cluster communication in Kerberized environments. This issue is fixed.
Apache Jira: HDDS-10328
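With this fix the pattern is configured out of the box. On earlier releases, a similar effect could be achieved by setting the property manually in ozone-site.xml; the wildcard value below is an illustrative assumption, not a documented default:

```xml
<!-- Illustrative value: accept principals from any realm. -->
<property>
  <name>ozone.om.kerberos.principal.pattern</name>
  <value>*</value>
</property>
```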
CDPD-81786: Auto-compact tables that tend to grow large at intervals
The following configuration properties are added:
  • ozone.compaction.service.enabled (default value=false): Enables or disables a background job that periodically compacts RocksDB tables flagged for compaction.
  • ozone.om.compaction.service.run.interval (default value=6h): Interval at which the background compaction job runs. Units can be defined with a postfix (ns, ms, s, m, h, d).
  • ozone.om.compaction.service.timeout (default value=10m): Timeout for the compaction service. If set greater than 0, the service stops waiting for compaction completion after this time. Units can be defined with a postfix (ns, ms, s, m, h, d).
  • ozone.om.compaction.service.columnfamilies (default value=keyTable,fileTable,directoryTable,deletedTable,deletedDirectoryTable,multipartInfoTable): A comma-separated list (no spaces) of the column families compacted by the compaction service. If this is empty, no column families are compacted.
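A minimal sketch of enabling the service with a shorter run interval in ozone-site.xml, keeping the default column-family list (example values only):

```xml
<!-- Example values only; see the property descriptions above. -->
<property>
  <name>ozone.compaction.service.enabled</name>
  <value>true</value>
</property>
<property>
  <name>ozone.om.compaction.service.run.interval</name>
  <value>12h</value>
</property>
```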
Apache Jira: HDDS-12819
CDPD-81764: Patch to replace the long strings set in spark_process attributes
Because atlas.process.spark.attributes.update.patch is set to TRUE by default, the spark_process entity attributes details and sparkPlanDescription no longer contain large amounts of data and therefore no longer cause out-of-memory issues.
CDPD-80588: Tomcat Upgrade to 9.0.99+
Upgraded Tomcat to 9.0.99+ to address CVE-2025-24813.
CDPD-79160: NullPointerException while deleting business metadata
After the update, the migration status is stored correctly and reused to restart migration from the point where it failed earlier.
Apache Jira: ATLAS-4863
CDPD-83356: Improved database type detection for metastore
Hive Metastore clients sometimes experienced delays when connecting to the metastore.
This issue was addressed by enhancing how the system determines the database type during metastore initialization. This improvement avoids repetitive checks, reduces connection delays, and ensures HMS clients connect more quickly and reliably.

Apache Jira: HIVE-28460

CDPD-83355/CDPD-73669: Secondary pool connection starvation caused by updatePartitionColumnStatisticsInBatch API
Hive queries intermittently failed with Connection is not available, request timed out errors. The issue occurred because the updatePartitionColumnStatisticsInBatch method in ObjectStore used connections from the secondary pool, which had a pool size of only two, leading to connection starvation.
The fix ensures that the updatePartitionColumnStatisticsInBatch API now requests connections from the primary connection pool, preventing connection starvation in the secondary pool.

Apache Jira: HIVE-28456

CDPD-81122: Enhanced concurrent access in HWC secure mode
Spark applications running multiple concurrent queries in HWC's SECURE_ACCESS mode encountered failures and correctness problems. This happened because the system faced difficulties when generating temporary table names and managing staging directories simultaneously for multiple reads.
This issue was addressed by improving the handling of concurrent operations within HWC's SECURE_ACCESS mode.
CDPD-82887: Hive web interfaces no longer expose server version
Hive web interfaces and related services previously exposed their underlying server version in the header.
This issue was addressed by stopping the web interfaces from sending information about their server version.
CDPD-83079: Included Iceberg fixes related to EXPIRE SNAPSHOTS
This release includes the following Iceberg fixes to ensure that EXPIRE SNAPSHOTS can effectively delete data files from storage:
  • Deletion of expired snapshot files in a transaction — When a snapshot is expired as part of a transaction, the manifest list files should be deleted upon transaction commit. A recent change prevented these files from being deleted if they were also committed as part of a transaction; however, this caused issues in simpler cases where no new files were committed.

    This issue is now fixed by ensuring deletion is not skipped when the list of committed files is empty.

  • Improved logic for determining committed files in Base Transaction — The logic to identify the set of committed files in 'BaseTransaction' has been corrected for scenarios where no new snapshots are available.
CDPD-83116: Query planning incorrectly estimated row counts
Impala's query planner generated a negative cardinality estimate under a very specific condition (a TOP-N sort over a very high NDV column). This resulted in an IllegalStateException: null being thrown and the query failing.
This issue was resolved by improving the calculation that estimates row counts for sorting operations.
CDPD-82273: Backport KUDU-3661 Ranger policy not honored in Kudu
Fixed an issue in the Ranger authorization provider that could cause some table privileges to be missing in certain environments. This happened when processing the SELECT privilege, which caused the system to stop checking for additional permissions. The issue was primarily seen on RHEL/CentOS 8 systems due to platform-specific behavior in the underlying system libraries.
CDPD-82275: Run a range-aware cluster rebalance with multiple tables
Previously, when rebalancing a cluster with the Kudu command-line tool, the --enable_range_rebalancing flag required the --tables flag to specify exactly one table. This fix removes that restriction. You can now pass multiple tables to the --tables flag when range rebalancing is enabled. Range-partitioned tables among those specified will be rebalanced with ranges considered, while other tables rebalance as usual. If you do not set the --tables flag, all tables in the cluster will be rebalanced.
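For example, a range-aware rebalance of two tables might be invoked as follows; the master addresses and table names are placeholders:

```
# Rebalance only the listed tables, with range-aware rebalancing enabled.
# master-1..master-3 and the table names are placeholders.
sudo -u kudu kudu cluster rebalance master-1:7051,master-2:7051,master-3:7051 \
  --enable_range_rebalancing \
  --tables=sales_by_range,events_by_range
```

Omitting the --tables flag rebalances every table in the cluster.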
CDPD-67359: Improved S3 Directory Listing in Hue for Large Datasets
Previously, Hue encountered "incorrect signature" errors when attempting to list S3 directories containing 1,000 or more objects. This issue was caused by S3's pagination mechanism, where the marker field for subsequent requests was not correctly passed to the Ranger Authorization Service (RAZ). As a result, the signature generated by RAZ did not align with the actual S3 request, leading to verification failures.
This problem is fixed by enhancing Hue's handling of the S3 marker field. Hue now accurately includes this pagination parameter when requesting signed headers from RAZ. This ensures that RAZ provides a precise signature, enabling S3 to successfully verify the request and allow Hue to correctly list and paginate through large directories.
CDPD-82364: Nested loop join rewrites disjunctive subquery incorrectly
Queries with a subquery inside an OR condition could return incorrect results. The join was rewritten incorrectly, leading to a wrong comparison.
The issue was addressed by skipping the join rewrite when the subquery is inside an OR condition.

Apache Jira: IMPALA-13991

CDPD-82303: EXEC_TIME_LIMIT_S incorrectly includes planning time
The EXEC_TIME_LIMIT_S setting was triggered during the planning and scheduling phases, which could cause queries to fail before any processing on backends started.
The issue was addressed by starting the EXEC_TIME_LIMIT_S countdown only after the query is ready to run on the backends. This ensures the timeout applies only to the actual processing phase.
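As a usage sketch (the hostname, query, and limit value are placeholders), the option can be set per statement from Impala-shell; with this fix, the limit covers only backend execution:

```
# EXEC_TIME_LIMIT_S now excludes planning and scheduling time.
impala-shell -i coordinator-host \
  -q "SET EXEC_TIME_LIMIT_S=300; SELECT COUNT(*) FROM big_table;"
```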

Apache Jira: IMPALA-14001

CDPD-82251: Inconsistent row count output in Impala-shell
When running Impala queries over the HiveServer2 protocol, some commands (like REFRESH or INVALIDATE) did not show the "Fetched X row(s) in Ys" output in Impala-shell, even though the Beeswax protocol shows it.
This issue was resolved by adding a new option in Impala-shell called --beeswax_compat_num_rows. When this option is enabled, Impala-shell now prints "Fetched 0 row(s) in" along with the elapsed time for all Impala commands.
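A usage sketch, with a placeholder hostname and table name:

```
# With --beeswax_compat_num_rows, commands such as REFRESH also print
# "Fetched 0 row(s) in ..." as they did over the Beeswax protocol.
impala-shell -i coordinator-host --beeswax_compat_num_rows -q "REFRESH my_table"
```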

Apache Jira: IMPALA-13584

CDPD-81556: Improved handling of invalid data in text files
Impala encountered crashes when processing text files that contained invalid binary data.
This issue was resolved by enhancing Impala's data handling. The system now correctly identifies and flags invalid binary data encountered in text files, preventing system instability.

Apache Jira: IMPALA-13927

CDPD-80019: Impala now starts with large catalog-update topics
Impala coordinators previously failed to start if their internal update data exceeded 2GB. This prevented the coordinator from processing necessary updates.

This issue was resolved.

Apache Jira: IMPALA-13020

CDPD-66938: [Analyze] [Atlas] test_time_range tests fail
Atlas stores all timestamps in UTC, but the UI or API would interpret TODAY or YESTERDAY based on the local server time zone. For instance, if the server is in a different time zone from the user, TODAY may refer to a different time range than expected, causing mismatched results. After the update, search results remain accurate regardless of the server's time zone, because the date and time conversions while fetching the results use the UTC time zone.
Common Vulnerabilities and Exposures (CVE) fixed in this CHF: