Fixed issues in Cloudera Data Warehouse on premises 1.5.5 SP1
Review the issues fixed in this release of the Cloudera Data Warehouse service.
- DWX-21395: Log routers are not requesting enough resources
- Previously, the Cloudera Data Warehouse log-router
component's resource quota was insufficient. Because the log router runs as a DaemonSet (with
instances on multiple nodes), the resource request calculation did not account for the number
of nodes or pod instances in the cluster. This led to resource constraints and issues with log
router pods.
This issue is now resolved, ensuring that sufficient resources are requested and allocated. Cloudera Data Warehouse on premises now calculates the resource quota of the log router by multiplying the resource request of the underlying pod by the number of nodes it runs on. For example, if each log-router pod requests 512 MiB of memory and the DaemonSet runs on 8 nodes, the quota now covers 8 x 512 MiB = 4 GiB (illustrative figures, not defaults).
- DWX-14325: Support for virtual warehouse access control in Cloudera Data Warehouse on premises for Hive with Kerberos
- Previously, enabling warehouse-level access control for Hive or Impala warehouses in Unified
Analytics mode and associating a user group with a Virtual Warehouse disabled Kerberos
authentication. This prevented the use of Kerberos security alongside warehouse-level access
control, limiting the ability to enforce both access restrictions and strong authentication
simultaneously.
This issue is now resolved. When warehouse-level access control is enabled for Hive Virtual Warehouses, Kerberos authentication remains functional. This ensures that only the designated user groups can access the Virtual Warehouse through supported channels such as Hue, JDBC, Beeline, Impala-shell, and Impyla, while maintaining the security provided by Kerberos authentication.
- DWX-21185: Resolution of Zookeeper configuration validation error during Cloudera Data Warehouse 1.5.5 activation
- When activating Cloudera Data Warehouse 1.5.5 on premises, the process failed with an incorrect Zookeeper
configuration validation error related to ssl.clientAuth. This error occurred specifically when
TLS communication was not enabled for the Zookeeper service in the Cloudera Base on premises deployment.
This issue is now resolved, and activation now works regardless of the TLS configuration.
- DWX-21037: Invalid trust store provided by Cloudera Data Warehouse when cert-manager is enabled
- Previously, when cert-manager was enabled, the trust store file provided by the Cloudera Data Warehouse UI was invalid for Beeline connections due to a mismatch with the Hive cluster certificate.
This issue is now resolved, and the UI now provides a compatible trust store file, allowing users to configure TLS and connect to Beeline successfully.
- DWX-21209: Resource template validation failure
- Previously, when attempting to copy certain predefined resource templates, validation failed with the following message: memory (should have a 33% overhead): X cannot be less than or equal to xmx*1.33: 30345.
This issue is now resolved, and the validation error no longer occurs.
- DWX-2147: Virtual warehouse creation fails intermittently after upgrade
- Previously, after upgrading Cloudera Data Warehouse on
premises to version 1.5.5, the first attempt to create or update an Impala Virtual Warehouse
might have failed due to an internal timing conflict during resource pool migration.
This issue is now resolved.
- DWX-15302: Upgrade button stays visible even after the upgrade completes
- Previously, after upgrading the Database Catalog, the
Upgrade button remained visible on the Cloudera Data Warehouse web interface instead of disappearing or being disabled.
This issue is now resolved.
- CDPD-83530: Task commits were allowed despite an exception being thrown in the Tez processor
- A communication failure between the coordinator and executor caused a running task to terminate, resulting in a java.lang.InterruptedException being thrown by ReduceRecordProcessor.init(). Despite this exception, the process still allowed the task to be committed and generated a commit manifest.
This issue has now been resolved. The fix ensures that outputs are not committed if an exception is thrown in the Tez processor.
Apache Jira: HIVE-28962
- Cookie-based authentication support for JWT tokens
- When JWT tokens are used for authentication, every HTTP request within a session requires token verification. If these tokens have a short lifespan, it can lead to authentication failures and disrupt session continuity. Cookie-based authentication is now supported for JWT tokens, so the token is verified once and the session is maintained through the cookie afterwards.
- CDPD-80798: Stable Catalogd initialization in HA mode
- Catalogd initialization could previously time out in high availability mode. This happened because metadata operations started prematurely, blocking Catalogd from becoming active.
- CDPD-83059: Optimized Impala Catalog cache warmup
- Impala's Catalogd previously started with an empty cache. This led to slow query startup for important tables and affected high availability failovers.
- CDPD-87222: Consistent TRUNCATE operations for external tables
- Impala's TRUNCATE operations on external tables previously did not consistently delete files in subdirectories, even when recursive listing was enabled; a minimal illustration follows.
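The following sketch is illustrative only; the table name is hypothetical. With the fix, truncating an external table whose data files sit in nested subdirectories removes those files as well when recursive listing is enabled.
-- Hypothetical external table with data files in nested subdirectories.
TRUNCATE TABLE ext_db.events;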
- CDPD-82415: TABLESAMPLE clause of the COMPUTE STATS statement has no effect on Iceberg tables
- This fix resolves a regression introduced by IMPALA-13737. For example, the following query scans the entire Iceberg table to calculate statistics, whereas it should ideally use only about 10% of the data.
COMPUTE STATS t TABLESAMPLE SYSTEM(10);
This fix introduces proper table sampling logic for Iceberg tables, which can be utilized for COMPUTE STATS. The sampling algorithm previously located in IcebergScanNode.getFilesSample() is now relocated to FeIcebergTable.Utils.getFilesSample().
Apache Jira: IMPALA-14014
- CDPD-82599: Query rejected due to too many fragment instances on one node
- Some queries failed with a scheduling error when too many fragment instances were placed on a single executor node.
- CDPD-80939: Missing equivalence conjunct in aggregation node with inline views
- In queries that include filters on the result of a UNION operation, the planner sometimes removed required conjuncts, which caused incorrect query results. A hedged example of the query shape is shown below.
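The following sketch is an assumed illustration of the affected query shape; the table and column names are hypothetical, not taken from the original report. A filter applied to an inline view over a UNION is the kind of conjunct the planner could drop before the fix.
-- Hypothetical inline view over a UNION; the outer filter on v.id is the
-- conjunct that could previously be lost.
SELECT v.id, COUNT(*)
FROM (SELECT id FROM t1
      UNION ALL
      SELECT id FROM t2) v
WHERE v.id = 42
GROUP BY v.id;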
- CDPD-82303: EXEC_TIME_LIMIT_S incorrectly includes planning time
- The EXEC_TIME_LIMIT_S timer was triggered during the planning and admission (queuing) phases, which could cause queries to fail before any processing started on the backend. A usage sketch follows.
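As a usage sketch (the limit value and table name are hypothetical), EXEC_TIME_LIMIT_S is set per query; before the fix, time spent in planning or in the admission queue alone could trip the limit.
-- Hypothetical 60-second execution limit; previously, planning and
-- admission-queue time counted against this limit.
SET EXEC_TIME_LIMIT_S=60;
SELECT COUNT(*) FROM big_table;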
- CDPD-82364: Nested loop join rewrites disjunctive subquery incorrectly
- Queries with a subquery inside an OR condition could return incorrect results due to an improper join predicate rewrite; a hedged example is shown below.
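The following is an assumed illustration of the affected pattern; the tables and columns are hypothetical. A subquery appearing as one branch of an OR is the shape that could be rewritten into an incorrect join predicate before the fix.
-- Hypothetical disjunction combining a plain predicate with a subquery.
SELECT c.name
FROM customers c
WHERE c.region = 'EU'
   OR c.id IN (SELECT customer_id FROM orders);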
- Invalid cardinality calculation in SortNode's computeStats()
- An error occurred during query execution due to an overflow in the calculation of row limits, causing unexpected failures.
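The note does not include the triggering query; as an assumed illustration (not taken from the original report), row-limit arithmetic that adds an OFFSET to a LIMIT near the 64-bit maximum is the kind of calculation that can overflow.
-- Hypothetical query: a LIMIT near the 64-bit maximum combined with an
-- OFFSET can overflow the planner's row-limit calculation.
SELECT * FROM t ORDER BY c LIMIT 9223372036854775807 OFFSET 10;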
- IllegalStateException with Iceberg table with DELETE
- Running a query on an Iceberg table failed with an IllegalStateException error in the following scenario:
- The Iceberg table has delete files for every data file (no data files without delete files), AND
- An anti-join operation is performed on the result of the Iceberg delete operation (IcebergDeleteNode or HashJoinNode).
A hedged example of the query shape is shown below. This fix resolves the issue by setting the TableRefIds of the node corresponding to the Iceberg delete operation (IcebergDeleteNode or HashJoinNode) to only the table reference associated with the data files, excluding the delete files.
Apache Jira: IMPALA-14154
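As an assumed illustration (the table names are hypothetical), an anti-join whose left side is an Iceberg table in which every data file has corresponding delete files matches the failing scenario; whether the plan routes the anti-join through the Iceberg delete node depends on the query.
-- Hypothetical: every data file of iceberg_tbl has delete files, and the
-- anti-join consumes the result of the Iceberg delete operation.
SELECT i.id
FROM iceberg_tbl i
LEFT ANTI JOIN excluded e ON i.id = e.id;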
- Error unnesting arrays in Iceberg tables with DELETE files
- The following error occurred when unnesting a nested array (a 2D array) from an Iceberg table. This issue was triggered specifically when the table contained delete files for some, but not all, of its data files.
Filtering an unnested collection that comes from a UNION [ALL] is not supported yet.
Reading an Iceberg table with this mixed data and delete file configuration creates a UNION ALL node in the query execution plan. The system had a check that explicitly blocked any filtering on an unnested array. This fix relaxes the validation check, allowing the operation to proceed if all UNION operands share the same tuple IDs. This ensures the query can successfully unnest the array. A hedged example is shown below.
Apache Jira: IMPALA-14185
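The following sketch is an assumed illustration; the schema is hypothetical (arr2d is an ARRAY of ARRAY of INT on an Iceberg table that has delete files for only some of its data files). Unnesting both levels of the array and filtering the result is the shape that previously hit the error.
-- Hypothetical Iceberg table ice_t(id INT, arr2d ARRAY<ARRAY<INT>>) with
-- delete files for some, but not all, data files.
SELECT inner_arr.item
FROM ice_t t, t.arr2d outer_arr, outer_arr.item inner_arr
WHERE inner_arr.item > 0;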
- DWX-21190: Hue UI can't upload files larger than 10 MB to Ozone
- Previously, the Hue UI experienced file upload failures for files larger than 10 MB when uploading to Ozone. This was traced to incompatible chunked upload handling caused by constraints in Ozone’s HttpFS API. Additionally, you needed to navigate to the Groups tab, select the default group, and manually enable the ofs_access: Access to OFS from filebrowser and filepicker permission to access the file browser.
- CDPD-82819: Improved kt_renewer logging for Kerberos ticket renewal failures
- Previously, the kt_renewer application stopped after three retry attempts and did not log the kinit error, making troubleshooting difficult. This occurred because the subprocess output streams (stdout and stderr) were read twice, causing empty error messages to be recorded when kinit failed.
