Fixed Issues
Review the list of Cloudera Manager issues that are resolved in Cloudera Manager 7.13.1 and its cumulative hotfixes.
Cloudera Manager 7.13.1 CHF4 (7.13.1.400)
- OPSAPS-73370: Enhance code to merge compressed Spark event log files
- Fixes an issue with unreported metrics in Cloudera Observability when the spark.eventLog.compress property was set to true.
- OPSAPS-60642: Host header injection issue on the /j_spring_security_check internal endpoint
- The internal endpoint /j_spring_security_check is vulnerable to Host header injection if the PREVENT_HOST_HEADER_INJECTION feature flag is disabled. Host header injection: in an incoming HTTP request, web servers often dispatch the request to the target virtual host based on the value supplied in the Host header. Without proper validation of the header value, an attacker can supply crafted input to cause the web server to:
  - Dispatch requests to the first virtual host on the list
  - Redirect to an attacker-controlled domain
  - Perform web cache poisoning
  - Manipulate password reset functionality
- OPSAPS-74019/OPSAPS-72739: Query execution stability with temporary directories
- Queries previously failed with an execution error when using a compression library. Although /tmp is a default temporary folder, its use for script execution was blocked due to security restrictions, causing queries to fail.
- OPSAPS-74141: Hive service setup on reused databases
- During 7.3.1 base cluster installations, the Hive service setup failed when attempting to validate the Hive Metastore Schema. This happened specifically when the new cluster used a database that had been previously used by an older installation, causing the schema validation to fail due to a version mismatch with the newer Hive components.
- OPSAPS-73011: Wrong parameter in the /etc/default/cloudera-scm-server file
- When Cloudera Manager is installed in High Availability mode (two or more nodes), the CMF_SERVER_ARGS parameter in the /etc/default/cloudera-scm-server file is missing the word "export" before it (the file contains only CMF_SERVER_ARGS= instead of export CMF_SERVER_ARGS=), so the parameter cannot be used correctly. This issue is fixed now.
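The effect of the missing keyword can be illustrated with a minimal sketch; the -Dexample=1 value below is a placeholder, not a real server argument.

```shell
# Broken form: the variable is set but not exported, so processes started
# by the init script do not inherit it.
CMF_SERVER_ARGS="-Dexample=1"   # "-Dexample=1" is a placeholder value

# Fixed form: "export" publishes the variable to child processes.
export CMF_SERVER_ARGS

# A child shell now inherits the value:
sh -c 'echo "child sees: $CMF_SERVER_ARGS"'
```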
- OPSAPS-72756: The runOzoneCommand API endpoint fails during the Ozone replication policy run
- The /clusters/{clusterName}/runOzoneCommand Cloudera Manager API endpoint fails when the API is called with the getOzoneBucketInfo command. In this scenario, Ozone replication policy runs also fail if the following conditions are true:
- The source Cloudera Manager version is 7.11.3 CHF11 or 7.11.3 CHF12.
- The target Cloudera Manager is version 7.11.3 through 7.11.3 CHF10 or 7.13.0.0 or later where the feature flag API_OZONE_REPLICATION_USING_PROXY_USER is disabled.
This issue is fixed now.
- OPSAPS-72710: Marking the snapshots created by incremental replication policies differently
- In the Ozone bucket browser, the snapshots created by an Ozone replication are marked. When the snapshots are deleted, a confirmation modal window appears before the deletion. The restore bucket modal window now displays information about how the restore operation is implemented in Ozone and how this operation affects Ozone replications.
- OPSAPS-72447, CDPD-76705: Ozone incremental replication fails to copy renamed directory
- Ozone incremental replication using Ozone replication policies succeeded but might not have synced nested renames for FSO buckets. When a directory and its contents were renamed between replication runs, the outer-level rename was synced but the contents under the previous name were not.
This issue is fixed now.
- OPSAPS-71046: The jstack logs collected on Cloudera Manager 7.11.3 are not in the right format
- The jstack logs for Ozone and other services on Cloudera Manager 7.11.3 and CDP Private Cloud Base 7.1.9 were not in the right format when viewed in the user cluster. This issue is fixed now.
- OPSAPS-65377: Cloudera Manager - Host Inspector not finding Psycopg2 on Ubuntu 20 or Redhat 8.x when Psycopg2 version 2.9.3 is installed.
-
Host Inspector failed with a Psycopg2 version error while upgrading to Cloudera Manager 7.13.1.x versions. When you ran the Host Inspector, it reported that Psycopg2 was not found, even though it was installed on all hosts. This issue is fixed now.
- OPSAPS-70226: Atlas uses the Solr configuration directory available in ATLAS_PROCESS/conf/solr instead of the Cloudera Manager provided directory
- The solrconfig.xml and schema.xml files are updated for the existing atlas_configs directory to ensure backward compatibility for existing customers. This includes downloading the current Atlas configuration from Solr, applying the necessary updates, and then uploading the modified configuration back to Solr. This issue is fixed now, and Atlas uses the correct configuration directory, for example /var/run/cloudera-scm-agent/process/151-atlas-ATLAS_SERVER/solrconf.xml. New clusters already use the updated configuration directory.
- OPSAPS-74147: Atlas rolling upgrade related to Zero Downtime Upgrade (ZDU) fails from 7.1.7 SP3 to 7.3.1.400
- The issue causing ZDU failures during upgrades from Cloudera Runtime 7.1.7 SP3 to 7.3.1.400 has been resolved. Previously, the Atlas rolling upgrade failed because the RoleState for Atlas was not checked and the upgradeCommand was not set correctly.
- OPSAPS-73174: Autoscaling fails when any of the RM hosts are down
- When a master node hosting the ResourceManager goes down abruptly, Cloudera Manager can now proceed with the NodeManager commission/decommission command flow.
- OPSAPS-73780: Existing Iceberg replication policies fail after Cloudera Manager is upgraded
- Existing Iceberg replication policies fail when the source and target clusters are upgraded to Cloudera Manager 7.13.1.x while running CDP Private Cloud Base 7.1.9.x, because of compatibility issues. You must ensure that the CDP versions and Cloudera Manager versions are compatible before you run replication policies. For example, use Cloudera on premises 7.3.1.x with Cloudera Manager 7.13.1.x. This issue is resolved.
Cloudera Manager 7.13.1 CHF3 (7.13.1.300)
- OPSAPS-73225: Cloudera Manager Agent reporting inactive/failed processes in Heartbeat request
-
Changes made to Cloudera Manager logging in Cloudera Manager 7.13.x caused the Cloudera Manager Agent to report inactive or stale processes in Heartbeat requests. As a result, the Cloudera Manager server logs filled rapidly with these notifications, even though they have no impact on service.
In addition, support for the Cloudera Observability feature added messages to the server logs. If the Observability feature was not purchased, or telemetry monitoring is not used, these messages (which appear as "TELEMETRY_ALTUS_ACCOUNT is not configured for Otelcol") fill the server logs and prevent proper follow-up on server activity.
This issue is fixed now.
- OPSAPS-72270: Start ECS command fails on uncordon nodes step
-
The fix ensures that the kube-apiserver has been up and running for at least 60 seconds before the uncordon step proceeds, and uses the correct target node name rather than the name of the node where the uncordon command is executed.
- OPSAPS-72978: The getUsersFromRanger API parameter truncates the user list after 200 items
- The v58/clusters/[***CLUSTER***]/services/[***SERVICE***]/commands/getUsersFromRanger Cloudera Manager API endpoint no longer truncates the list of returned users at 200 items.
Cloudera Manager 7.13.1 CHF2 (7.13.1.200)
- OPSAPS-72809: Ranger policy script for Knox fails due to double quotation marks
- The Ranger policy script for Knox (setupRanger.sh) fails because the CSD_JAVA_OPTS parameters are enclosed in double quotation marks in the script. The issue is fixed now.
- OPSAPS-72795: Do not allow multiple Ozone services in a cluster
- It was possible to configure multiple Ozone services in a single cluster, which can cause irreversible damage to a running cluster. This fix allows you to install only one Ozone service in a cluster.
- OPSAPS-72767: Install Oozie ShareLib Cloudera Manager command fails on FIPS and FedRAMP clusters
- The Install Oozie ShareLib command fails to execute through Cloudera Manager on FIPS and FedRAMP clusters. This issue is fixed now.
- OPSAPS-72323: Cloudera Manager UI is down with bootstrap failure due to ConfigGenExecutor throwing an exception
- This issue is fixed now.
- OPSAPS-71566: The polling logic of RemoteCmdWork goes down if the remote Cloudera Manager goes down
- When the remote Cloudera Manager goes down or there are network failures, RemoteCmdWork stops polling. To ensure that the daemon continues to poll even during network failures or when the remote Cloudera Manager is down, set the remote_cmd_network_failure_max_poll_count=[***ENTER REMOTE EXECUTOR MAX POLL COUNT***] parameter on the page. Note that the actual timeout is given by a piecewise constant (step) function with the following breakpoints: counts 1 through 11 map to 5 seconds, 12 through 17 to 1 minute, 18 through 35 to 2 minutes, 36 through 53 to 5 minutes, 54 through 74 to 8 minutes, 75 through 104 to 15 minutes, and so on. Therefore, when you enter 1, polling continues for 5 seconds after Cloudera Manager goes down or after a network failure; similarly, when you set it to 75, polling continues for 15 minutes.
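The documented breakpoints can be sketched as a small function. This is only an illustration of the mapping described above, not code from Cloudera Manager; counts above 104 follow the unspecified "and so on" pattern and are not modeled here.

```shell
# Illustrative only: maps the configured poll count to the polling duration
# per the documented step function (breakpoints above 104 are unspecified).
poll_duration() {
  n=$1
  if   [ "$n" -le 11 ];  then echo "5 seconds"
  elif [ "$n" -le 17 ];  then echo "1 minute"
  elif [ "$n" -le 35 ];  then echo "2 minutes"
  elif [ "$n" -le 53 ];  then echo "5 minutes"
  elif [ "$n" -le 74 ];  then echo "8 minutes"
  elif [ "$n" -le 104 ]; then echo "15 minutes"
  else                        echo "more than 15 minutes"
  fi
}

poll_duration 1    # 5 seconds
poll_duration 75   # 15 minutes
```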
- OPSAPS-67197: Ranger RMS server shows as healthy without service being accessible
- As a web service, Ranger RMS might fail to initialize due to other issues, making it inaccessible. However, the Ranger RMS service was still shown as healthy, because Cloudera Manager only monitored the Process Identification Number (PID). This issue is fixed now: a health-status canary was added for the Ranger RMS service that connects to RMS at specific intervals and shows an alert on the Cloudera Manager UI if RMS is not reachable.
- OPSAPS-71933: Telemetry Publisher is unable to publish Spark event logs to Cloudera Observability when multiple History Servers are set up in the Spark service.
- This issue is now resolved by adding the support for multiple Spark History Server deployments in Telemetry Publisher.
- OPSAPS-71623: Some Spark jobs are missing from the Workload XM interface. In the Telemetry Publisher logs for these Spark jobs, the error message java.lang.IllegalArgumentException: Wrong FS for Datahub cluster is displayed.
- The issue is resolved by addressing Telemetry Publisher failures during the processing of Yarn logs.
Cloudera Manager 7.13.1 CHF1 (7.13.1.100)
- OPSAPS-72369: Update snapshot default configuration for enabling ordered snapshot deletion
- This issue is now resolved by changing the default configuration value on Cloudera Manager.
- OPSAPS-72215: The ECS Cloudera Manager UI configuration for the Docker certificate cannot accept new lines, so a new registry certificate cannot be updated in the correct format
- There is no direct way to update the external Docker certificate in the UI for ECS because newlines are removed when the field is saved. Certificates can now be uploaded by adding the '\n' character for each newline: when you update the Docker certificate through the Cloudera Manager UI configuration, add '\n' to specify a newline character in the certificate. Example:
-----BEGIN CERTIFICATE-----\nMIIERTCCAy2gAwIBAgIUIL8o1MjD5he7nZKKa/C8rx9uPjcwDQYJKoZIhvcNAQEL\nBQAwXTELMAkGA1UEBhMCVVMxEzARBgNVBAgMCkNhbGlmb3JuaWExEzARBgNVBAcM\nClNhbnRhQ2xhcmExETAPBgNVBAoMCENsb3VkZXJhMREwDwYDVQQLDAhDbG91ZGVy\nYTAeFw0yNDAzMTExMjU5NDVaFw0zNDAzMDkxMjU5NDVaMF0xCzAJBgNVBAYTAlVT\nMRMwEQYDVQQIDApDYWxpZm9ybmlhMRMwEQYDVQQHDApTYW50YUNsYXJhMREwDwYD\nVQQKDAhDbG91ZGVyYTERMA8GA1UECwwIQ2xvdWRlcmEwggEiMA0GCSqGSIb3DQEB\nAQUAA4IBDwAwggEKAoIBAQDcuxGszWmzVnWCwDICnlxUBtO+Ps2jxQ7C7kIj\nTHTaQ2kGl/ZzQOJBpYT/jFmiQGPSKb4iLSxed+Xk5xAOkNWDIL+Hlf5txjkw/FTf\nHiyWep9DaQDF07M/Cl3nb8JmpRyA5fKYpVbJAFIEXOhTxrcnH/4o5ubLM7mHVXwY\nafoPD5AuiOD/I+xxmqb/x+fKtHzY1eEzDb2vjjDJBRqxpHvg/S4hHsgZJ7wU7wg+\nPk4uPV3O83h9NI+b4SOwXunuKRCCh4dRKm8/Qw4f7tDFdCAIubvO1AGtfyJJp9xR\npMIjhIuna1K2TnPQomdoIy/KqrFFzVaHevyinEnRLG2NAgMBAAGjgfwwgfkwHQYD\nVR0OBBYEFHWX21/BhL5J5kNpxmb8FmDchlmBMIGaBgNVHSMEgZIwgY+AFHWX21/B\nhL5J5kNpxmb8FmDchlmBoWGkXzBdMQswCQYDVQQGEwJVUzETMBEGA1UECAwKQ2Fs\naWZvcm5pYTETMBEGA1UEBwwKU2FudGFDbGFyYTERMA8GA1UECgwIQ2xvdWRlcmEx\nETAPBgNVBAsMCENsb3VkZXJhghQgvyjUyMPmF7udkopr8LyvH24+NzAMBgNVHRME\nBTADAQH/MAsGA1UdDwQEAwIC/DAPBgNVHREECDAGhwQKgW26MA8GA1UdEgQIMAaH\nBAqBbbowDQYJKoZIhvcNAQELBQADggEBAMks+sY+ETaPzFLg2PolUT4GeXEqnGl\nSmZzIkiA6l2DCYQD/7mTLd5Ea63oI78fxatRnG5MLf5aHVLs4W+WYhoP6B7HLPUo\nNGPJviRBHtUDRYVpD5Q0hhQtHB4Q1H+sgrE53VmbIQqLPOAxvpM//oJCFDT8NbOI\n+bTJ48N34ujosjNaiP6x09xbzRzOnYd6VyhZ/pgsiRZ4qlZsVyv1TImP9VpHcC7P\nukxNuBdXBS3jEXcvEV1Eq4Di+z6PIWoPIHUunQ9P0akYEvbXuL88knM5FNhS6YBP\nGd91KkGdz6srRIVRiF+XP0e6IwZC70kkWiwf8vX/CuR64ZQxc30ot70=\n-----END CERTIFICATE-----\n
- OPSAPS-72662: UID (User ID) conflicts for Kubernetes containers, because the Kubernetes containers use user ID 1001, which is a common UID in a Unix environment
- This issue is fixed now by using a large UID, such as 1000001, to reduce UID conflicts. Using large UIDs (User IDs) for Kubernetes containers is a recommended security practice because it helps minimize the risk of a container compromising the host system. Assigning a high UID reduces the chance of conflicts with existing user accounts on the host, particularly if the container is compromised and attempts to access host files or escalate privileges. In essence, a large UID ensures the container operates with restricted permissions on the host system. Therefore, when creating the CLI pod in Cloudera Manager, the runAsUser value should be set to an integer greater than 1,000,000; to avoid UID conflicts, it is advisable to use a UID such as 1000001.
- OPSAPS-72559: Incorrect error messages appear for Hive ACID replication policies
- Replication Manager now shows correct error messages for every Hive ACID replication policy run on the page as expected. This issue is fixed now.
- OPSAPS-72509: Hive metadata transfer to GCS fails with ClassNotFoundException
- Hive external table replication policies from an on-premises cluster to cloud failed during the Transfer Metadata Files step when the target is on Google Cloud and the source Cloudera Manager version is 7.11.3 CHF7, 7.11.3 CHF8, 7.11.3 CHF9, 7.11.3 CHF9.1, 7.11.3 CHF10, or 7.11.3 CHF11. This issue is fixed.
- OPSAPS-72558, OPSAPS-72505: Replication Manager chooses incorrect target cluster for Iceberg, Atlas, and Hive ACID replication policies
- When a Cloudera Manager instance managed multiple clusters, Replication Manager picked the first cluster in the list as the Destination during the Iceberg, Atlas, and Hive ACID replication policy creation process, and the Destination field was non-editable. You can now edit the replication policy to change the target cluster in these scenarios.
- OPSAPS-72468: Subsequent Ozone OBS-to-OBS replication policy runs do not skip replicated files during replication
- Replication Manager now skips the already replicated files during subsequent Ozone replication policy runs after you add the following key-value pairs in
- com.cloudera.enterprise.distcp.ozone-schedules-with-unsafe-equality-check = [***ENTER COMMA-SEPARATED LIST OF OZONE REPLICATION POLICIES’ ID or ENTER all TO APPLY TO ALL OZONE REPLICATION POLICIES***]
This advanced snippet skips the already replicated files when the relative file path, file name, and file size are equal, ignoring the modification times.
- com.cloudera.enterprise.distcp.require-source-before-target-modtime-in-unsafe-equality-check = [***ENTER true OR false***]
When you add both key-value pairs, subsequent Ozone replication policy runs skip replicating files when the matching file on the target has the same relative file path, file name, and file size, and the source file’s modification time is less than or equal to the target file’s modification time.
- OPSAPS-72214: Cannot create a Ranger replication policy if the source and target cluster names are not the same
- You could not create a Ranger replication policy if the source cluster and target cluster names were not the same. This issue is fixed.
- OPSAPS-71853: The Replication Policies page does not load the replication policies’ history
- When the sourceService is null for a Hive ACID replication policy, the Cloudera Manager UI fails to load the existing replication policies’ history details and the current state of the replication policies on the Replication Policies page. This issue is fixed now.
- OPSAPS-72181: Apply Host Template checks for an active command on the service; if the active command takes a long time (such as a long-running replication command), the Apply Host Template operation is also delayed
- This issue is fixed now for the scenario where the host template contains only gateway roles: in that case, the Apply Host Template operation no longer checks for an active command on the service and does not wait for one. If the host template contains roles other than gateway, the behavior remains the same.
- OPSAPS-72249: Oozie database dump fails on JDK17
- Oozie database dump and load commands couldn't be executed from Cloudera Manager with JDK 17. This issue is fixed now.
- OPSAPS-72276: Cannot edit Ozone replication policy if the MapReduce service is stale
- You could not edit an Ozone replication policy in Replication Manager if the MapReduce service did not load completely. This issue is fixed.
- OPSAPS-71932: Ranger HDFS plugin resource lookup issue
- For a JDK 17 Isilon cluster, users could not create a new policy under cm_hdfs. The connection failed with the following error message: cannot access class sun.net.util.IPAddressUtil. The issue is fixed now: the sun.net.util package was added to the Ranger Admin Java options for JDK 17.
- OPSAPS-71907: Solr auditing URL changed port
- The Solr auditing URL generated for Ranger plugin services in the Data Hub cluster is correct when both the local ZooKeeper and the Data Lake ZooKeeper have ssl_enabled enabled. However, if the ssl_enabled parameter was disabled on the local ZooKeeper in the Data Hub, the Solr auditing URL changed its port to 2181. The fix fetches the Solr auditing URL from the Data Lake's data context on the Data Hub, resolving this rare, corner-case issue.
- OPSAPS-71666: Replication Manager uses the required property values in the “ozone_replication_core_site_safety_valve” in the source Cloudera Manager during the Ozone replication policy run
- During an Ozone replication policy run, Replication Manager obtains the required properties and their values from the ozone_replication_core_site_safety_valve. It then adds the new properties and their values and overrides the values of existing properties in the core-site.xml file. Replication Manager uses this file during the Ozone replication policy run.
- OPSAPS-71659: Ranger replication policy failed because of incorrect source-to-destination service name mapping
- Ranger replication policy failed during the transform step because of incorrect source to destination service name mapping. This issue is fixed now.
- OPSAPS-71642: GflagConfigFileGenerator removes the = sign in the Gflag configuration file when the configuration value passed in the advanced safety valve is empty
- If the user adds the file_metadata_reload_properties configuration in the advanced safety valve with an = sign and an empty value, GflagConfigFileGenerator removes the = sign from the generated Gflag configuration file. This issue is fixed now.
- OPSAPS-71592: Replication Manager does not read the default value of “ozone_replication_core_site_safety_valve” during Ozone replication policy run
- When the ozone_replication_core_site_safety_valve advanced configuration snippet is set to its default value, Replication Manager does not read its value during the Ozone replication policy run. To mitigate this issue, the default value of ozone_replication_core_site_safety_valve has been set to an empty value. If you have set any key-value pairs for ozone_replication_core_site_safety_valve, then these values are written to core-site.xml during the Ozone replication policy run.
- OPSAPS-71424: The 'configuration sanity check' step ignores the replication advanced configuration snippet values during the Ozone replication policy job run
- The OBS-to-OBS Ozone replication policy jobs failed when the S3 property values for fs.s3a.endpoint, fs.s3a.secret.key, and fs.s3a.access.key were empty in Ozone Service Advanced Configuration Snippet (Safety Valve) for ozone-conf/ozone-site.xml even when these properties were defined in Ozone Replication Advanced Configuration Snippet (Safety Valve) for core-site.xml. This issue is fixed.
- OPSAPS-71256: The “Create Ranger replication policy” action shows 'TypeError' if no peer exists
- When you click , the TypeError: Cannot read properties of undefined error appears. This issue is fixed now.
- OPSAPS-71093: Validation on source for Ranger replication policy fails
- The Cloudera Manager page logged the user out automatically when creating a Ranger replication policy, because the source cluster did not support the getUsersFromRanger or getPoliciesFromRanger API requests. The issue is fixed now, and the required validation on the source completes successfully as expected.
- OPSAPS-70848: Hive external table replication policies succeed when the source cluster uses Dell EMC Isilon storage
- During the Hive external table replication policy run, the replication policy failed at the Hive Replication Export step. This issue is fixed now.
- OPSAPS-70822: Save the Hive external table replication policy on the ‘Edit Hive External Table Replication Policy’ window
- Replication Manager saves the changes as expected when you click Save Policy after you edit a Hive replication policy. To edit a replication policy, you click for the replication policy on the Replication Policies page.
- OPSAPS-70721: QueueManagementDynamicEditPolicy is not enabled when Auto Queue Deletion is enabled
- Whenever Auto Queue Deletion was enabled, the QueueManagementDynamicEdit policy was not enabled. This issue is fixed now: when there are no applications running in a queue, its capacity is set to zero.
- OPSAPS-70449: After creating a new Dashboard from the Cloudera Manager UI, the Chart Title field was allowing Javascript as input
- In Cloudera Manager UI, while creating a new plot object, a Chart Title field allows Javascript as input. This allows the user to execute a script, which results in an XSS attack. This issue is fixed now.
- OPSAPS-69782: Exception appears if the peer Cloudera Manager's API version is higher than the local cluster's API version
- HBase replication using HBase replication policies in CDP Public Cloud Replication Manager between two Data Hub/COD clusters now succeeds as expected when all the following conditions are true:
- The destination Data Hub/COD cluster’s Cloudera Manager version is 7.9.0-h7 through 7.9.0-h9 or 7.11.0-h2 through 7.11.0-h4, or 7.12.0.0.
- The source Data Hub/COD cluster's Cloudera Manager major version is higher than the destination cluster's Cloudera Manager major version.
- The Initial Snapshot option is chosen during the HBase replication policy creation process and/or the source cluster is already participating in another HBase replication setup as a source or destination with a third cluster.
- OPSAPS-69622: Cannot view the correct number of files copied for Ozone replication policies
- The last run of an Ozone replication policy does not show the correct number of the files copied during the policy run when you load the page after the Ozone replication policy run completes successfully. This issue is fixed now.
- OPSAPS-72143: Atlas replication policies fail if the source and target clusters support FIPS
- The Atlas replication policies fail during the Exporting atlas entities from remote atlas service step if the source and target clusters support FIPS. This issue is fixed now.
- OPSAPS-67498: The Replication Policies page takes a long time to load
- To ensure that the page loads faster, new query parameters have been added to the internal policies that fetch the REST APIs for the page which improves pagination. Replication Manager also caches internal API responses to speed up the page load.
- OPSAPS-65371: Kudu user was not part of the cm_solr RANGER_AUDITS_COLLECTION policy
- The Kudu user was not part of the default cm_solr policy, which prevented any Kudu audit logs from being written to Ranger Admin until the Kudu user was manually added to the policy. The issue is fixed now: the Kudu user was added to the default cm_solr RANGER_AUDITS_COLLECTION policy, so it no longer needs to be added manually to write audits to Ranger Admin.
Cloudera Manager 7.13.1
- OPSAPS-72254: FIPS: Failed to upload Spark example jar to HDFS in cluster mode
- Fixed an issue with deploying the Spark 3 Client Advanced Configuration Snippet (Safety Valve) for spark3-conf/spark-env.sh. For more information, see "Added a new Cloudera Manager configuration parameter spark_pyspark_executable_path to Livy for Spark 3" in Behavioral Changes in Cloudera Manager 7.13.1.
- OPSAPS-71873: The Livy KMS proxy user is not allowed to access HDFS in 7.3.1.0
- In the kms-core.xml file, the Livy proxy user is taken from Livy for Spark 3's configuration in Cloudera Runtime 7.3.1 and above.
- OPSAPS-70976: The previously hidden real-time monitoring properties are now visible in the Cloudera Manager UI:
- The following properties are now visible in the Cloudera Manager UI:
enable_observability_real_time_jobs
enable_observability_metrics_dmp
- OPSAPS-69996: HBase snapshot creation in Cloudera Manager does not work as expected
- During the HBase snapshot creation process, the snapshot create command sometimes tries to create the same snapshot twice because of an unhandled OptimisticLockException during the database write operation. This resulted in intermittent HBase snapshot creation failures. The issue is fixed now.
- OPSAPS-66459: Enable concurrent Hive external table replication policies with the same cloud root
- When the
HIVE_ALLOW_CONCURRENT_REPLICATION_WITH_SAME_CLOUD_ROOT_PATH
feature flag is enabled, Replication Manager can run two or more Hive external table replication policies with the same cloud root path concurrently.For example, if two Hive external table replication policies have s3a://bucket/hive/data as the cloud root path and the feature flag is enabled, Replication Manager can run these policies concurrently.
By default, this feature flag is disabled. To enable the feature flag, contact your Cloudera account team.
- OPSAPS-72153: Invalid signature when trying to create tags in Atlas through Knox
- Atlas, SMM UI, and Schema Registry threw a 500 error in a FIPS environment. This issue is fixed now.
- OPSAPS-70859: Ranger metrics APIs were not working on FedRAMP cluster
- On a FedRAMP HA cloud cluster, Ranger metrics APIs were not working. This issue is fixed now by introducing new Ranger configurations.
- OPSAPS-71436: Telemetry publisher test Altus connection fails
- An error occurred while running the test Altus connection action for Telemetry Publisher. This issue is fixed now.
- OPSAPS-68252: The Ranger RMS Database Full Sync command is not visible on cloud clusters
- The Ranger RMS Database Full Sync command was not visible on any cloud cluster. The minimum user privilege required to see the Ranger RMS Database Full Sync command on the UI also needed to be investigated.
- OPSAPS-69692, OPSAPS-69693: Included filters for Ozone incremental replication in API endpoint
- You can use the include filters in the POST /clusters/{clusterName}/services/{serviceName}/replications API to replicate only the filtered part of the Ozone bucket. You can use multiple path regular expressions to limit the data to be replicated for an Ozone bucket. For example, if you include the /path/to/data/.* and .*/data filters in the includeFilter field for the POST endpoint, the Ozone replication policy replicates only the keys that start with /path/to/data/ or end with /data in the Ozone bucket.
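The filter semantics can be illustrated locally; grep -E merely stands in for the matching the replication job performs, and the key names below are made up for the example.

```shell
# Hypothetical bucket keys; grep -E stands in for the includeFilter matching
# done by the replication job. Keys that start with /path/to/data/ or end
# with /data are selected for replication.
printf '%s\n' \
  '/path/to/data/file1' \
  '/other/data' \
  '/other/file2' \
| grep -E '^/path/to/data/.*$|^.*/data$'
```

Only the first two keys pass the filter; /other/file2 neither starts with /path/to/data/ nor ends with /data.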
- OPSAPS-70561: Improved page load performance of the “Bucket Browser” tab.
- The tab no longer loads all the entries of the bucket at once. Therefore, the page loads faster when you display the content of a large bucket with many keys in it.
- OPSAPS-71090: The spark.*.access.hadoopFileSystems gateway properties are not propagated to Livy
- Added new properties for configuring Spark 2 (spark.yarn.access.hadoopFileSystems) and Spark 3 (spark.kerberos.access.hadoopFileSystems) that propagate to Livy.
- OPSAPS-71271: The precopylistingcheck script for Ozone replication policies uses the Ozone replication safety valve value
- The "Run Pre-Filelisting Check" step during Ozone replication uses the content of the ozone_replication_core_site_safety_valve property value to configure the Ozone client for the source and the target Cloudera Manager.
- OPSAPS-70983: Hive replication command for Sentry to Ranger replication works as expected
- The Sentry to Ranger migration during the Hive replication policy run from CDH 6.3.x or higher to Cloudera on cloud 7.3.0.1 or higher is successful.
- OPSAPS-69806: Collection of YARN diagnostic bundle will fail
-
For any combination of Cloudera Manager 7.11.3 versions up to Cloudera Manager 7.11.3 CHF7 with CDP 7.1.7 through CDP 7.1.8, collection of the YARN diagnostic bundle failed and no data was transmitted.
Cloudera Manager has now been changed to allow the collection of the YARN diagnostic bundle and make this operation successful.
- OPSAPS-70655: The hadoop-metrics2.properties file is not getting generated into the ranger-rms-conf folder
- The hadoop-metrics2.properties file was getting created in the process directory conf folder, for example, conf/hadoop-metrics2.properties, whereas the directory structure in Ranger RMS should be {process_directory}/ranger-rms-conf/hadoop-metrics2.properties.
- OPSAPS-71014: Auto action email content generation failed for some cluster(s) while loading the template file
-
The issue has been fixed by using a more appropriate template loader class in the freemarker configuration.
- OPSAPS-70826: Ranger replication policies fail when target cluster uses Dell EMC Isilon storage and supports JDK17
-
Ranger replication policies no longer fail if the target cluster is deployed with Dell EMC Isilon storage and also supports JDK17.
- OPSAPS-70861: HDFS replication policy creation process fails for Isilon source clusters
-
When you choose a source Cloudera Base on premises cluster using the Isilon service and a target cloud storage bucket for an HDFS replication policy in Cloudera Base on premises Replication Manager UI, the replication policy creation process fails. This issue is fixed now.
- OPSAPS-70708: Cloudera Manager Agent not skipping autofs filesystems during filesystem check
-
On clusters with a large number of network mounts on each host (for example, more than 100 networked file system mounts), the Cloudera Manager Agent could take a long time to start, on the order of 10 to 20 seconds per mount point. This is because the OS kernel on the cluster host interrogates each network mount on behalf of the Cloudera Manager Agent to gather monitoring information such as file system usage.
This issue is now fixed by adding an option to the Cloudera Manager Agent's config.ini file to disable filesystem checks.
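An illustrative config.ini entry follows; the release note does not state the exact key name, so the key used here is hypothetical:

```ini
; /etc/cloudera-scm-agent/config.ini
[General]
; Hypothetical key name: skip interrogating network mounts (for example,
; autofs mounts) during Agent startup and filesystem monitoring.
filesystem_checks_enabled=false
```

Check the Cloudera Manager Agent configuration reference for the actual key before relying on this.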
- OPSAPS-68991: Change default SAML response binding to HTTP-POST
-
Previously, the default SAML response binding was HTTP-Artifact rather than HTTP-POST. HTTP-POST delivers the response through the browser using the POST method, whereas HTTP-Artifact requires a direct connection between the Service Provider (Cloudera Manager in this case) and the Identity Provider (IdP) and is rarely used, so HTTP-POST is the better default.
This issue is now fixed by changing the default SAML binding to HTTP-POST.
- OPSAPS-40169: Audits page does not list failed login attempts on applying Allowed = false filter
-
The Audits page in Cloudera Manager shows failed login attempts when no filter is applied. However, when the Allowed = false filter was applied, it returned 0 results instead of listing those failed login attempts. This issue is now fixed.
- OPSAPS-70583: File Descriptor leak from Cloudera Manager 7.11.3 CHF3 version to Cloudera Manager 7.11.3 CHF7
-
An Avro library upgrade made it impossible to create a NettyTransceiver, which caused a file descriptor leak in Cloudera Manager whenever a service tried to communicate with the Event Server over Avro. This issue is now fixed.
- OPSAPS-70962: Creating a cloud restore HDFS replication policy with a peer cluster as destination which is not supported by Replication Manager
-
During the HDFS replication policy creation process, incorrect destination clusters and MapReduce services appeared; choosing them created a dummy replication policy that replicated from a cloud account to a remote peer cluster, a scenario that Replication Manager does not support. This issue is now fixed.
- OPSAPS-71108: Use the earlier format of PCR
-
You can use the latest version of the PCR (Post Copy Reconciliation) script, or you can restore PCR to the earlier format by setting the com.cloudera.enterprise.distcp.post-copy-reconciliation.legacy-output-format.enabled=true key-value pair in the
property. - OPSAPS-70689: Enhanced performance of DistCp CRC check operation
- When a MapReduce job for an HDFS replication policy fails, or when there are target-side changes during a replication job, Replication Manager initiates the bootstrap replication process. During this process, a cyclic redundancy check (CRC) is performed by default to determine whether a file can be skipped for replication.
- OPSAPS-70685: Post Copy Reconciliation (PCR) for HDFS replication policies between on-premises clusters
- To add the Post Copy Reconciliation (PCR) script to run as a command step during the HDFS replication policy job run, you can enter the SCHEDULES_WITH_ADDITIONAL_DEBUG_STEPS = [***ENTER COMMA-SEPARATED LIST OF NUMERICAL IDS OF THE REPLICATION POLICIES***] key-value pair in the property.
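For example, to run PCR as an additional debug step for the replication policies with IDs 3 and 7 (the IDs here are hypothetical, and the release note does not name the Cloudera Manager property that accepts this pair):

```properties
SCHEDULES_WITH_ADDITIONAL_DEBUG_STEPS = [3, 7]
```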
- OPSAPS-70188: Conflicts field missing in
ParcelInfo
-
Fixed an issue in parcels where the conflicts field in manifest.json would mark a parcel as invalid.
- OPSAPS-70248: Optimize Impala Graceful Shutdown Initiation Time
- This issue is resolved by streamlining the shutdown initiation process, reducing delays on large clusters.
- OPSAPS-70157: Long-term credential-based GCS replication policies continue to work when cluster-wide IDBroker client configurations are deployed
- Replication policies that use long-term GCS credentials work as expected even when cluster-wide IDBroker client configurations are configured.
- OPSAPS-70422: Change the “Run as username(on source)” field during Hive external table replication policy creation
- You can use a user other than hdfs for a Hive external table replication policy run to replicate from an on-premises cluster to a cloud bucket if the USE_PROXY_USER_FOR_CLOUD_TRANSFER=true key-value pair is set for the property. This applies to all external accounts other than IDBroker external accounts.
- OPSAPS-70460: Allow white space characters in Ozone snapshot-diff parsing
- Ozone incremental replication no longer fails if a changed path contains one or more space characters.
- OPSAPS-70594: Ozone HttpFS gateway role is not added to Rolling Restart
- This issue is now resolved by adding the Ozone HttpFS gateway role to the Rolling Restart.
- OPSAPS-68752: Snapshot-diff delta is incorrectly renamed/deleted twice during on-premises to cloud replication
- The snapshots created during replication are deleted twice instead of once, which results in incorrect snapshot information. This issue is fixed. For more information, see Cloudera Customer Advisory 2023-715: Replication Manager may delete its snapshot information when migrating from on-prem to cloud.
- OPSAPS-68112: Atlas diagnostic bundle should contain server log, configurations, and, if possible, heap memories
- The diagnostic bundle contains server log, configurations, and heap memories in a GZ file inside the diagnostic .zip package.
- OPSAPS-69921: ATLAS_OPTS environment variable is set for FIPS with JDK 11 environments to run the import script in Atlas
- _JAVA_OPTIONS is populated with additional parameters, as seen in the following:
java_opts = 'export _JAVA_OPTIONS="-Dcom.safelogic.cryptocomply.fips.approved_only=true ' \
            '--add-modules=com.safelogic.cryptocomply.fips.core,' \
            'bctls --add-exports=java.base/sun.security.provider=com.safelogic.cryptocomply.fips.core ' \
            '--add-exports=java.base/sun.security.provider=bctls --module-path=/cdep/extra_jars ' \
            '-Dcom.safelogic.cryptocomply.fips.approved_only=true -Djdk.tls.ephemeralDHKeySize=2048 ' \
            '-Dorg.bouncycastle.jsse.client.assumeOriginalHostName=true -Djdk.tls.trustNameService=true" '
- OPSAPS-71258: Kafka, SRM, and SMM cannot process messages compressed with Zstd or Snappy if /tmp is mounted as noexec
- The issue is fixed by using JVM flags that point to a different temporary folder for extracting the native library.
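The release note does not name the exact JVM flag the fix sets. As an illustration only, a common way to point native-library extraction (used by zstd-jni and snappy-java) away from a noexec /tmp is the standard java.io.tmpdir JVM property, for example via the Kafka environment:

```shell
# Illustrative only: relocate JVM temporary-file extraction to a directory
# on a filesystem mounted with exec permissions. The path is hypothetical.
export KAFKA_OPTS="${KAFKA_OPTS} -Djava.io.tmpdir=/var/lib/kafka/tmp"
```

The chosen directory must exist and be writable by the service user.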
- OPSAPS-69481: Some Kafka Connect metrics missing from Cloudera Manager due to conflicting definitions
- Cloudera Manager now registers the metrics
kafka_connect_connector_task_metrics_batch_size_avg
andkafka_connect_connector_task_metrics_batch_size_max
correctly. - OPSAPS-68708: Schema Registry might fail to start if a load balancer address is specified in Ranger
- Schema Registry now always ensures that the address it uses to connect to Ranger ends with a trailing slash (/). As a result, Schema Registry no longer fails to start if Ranger has a load balancer address configured that does not end with a trailing slash.
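The normalization the fix describes can be sketched as follows; this is an illustrative helper with a hypothetical name, not Schema Registry's actual code:

```python
def normalize_ranger_address(url: str) -> str:
    """Return the configured Ranger address with a guaranteed trailing slash,
    so a load balancer address configured without one still works."""
    return url if url.endswith("/") else url + "/"
```

For example, normalize_ranger_address("https://ranger-lb.example.com:6182") yields "https://ranger-lb.example.com:6182/", while an address already ending in a slash is returned unchanged.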
- OPSAPS-69978: Cruise Control capacity.py script fails on Python 3
- The script querying the capacity information is now fully compatible with Python 3.
- OPSAPS-64385: Atlas's client.auth.enabled configuration is not configurable
- In customer environments where user certificates are required to authenticate to services, the Apache Atlas web UI constantly prompts for certificates. To solve this, the client.auth.enabled parameter is set to true by default. If you need to set it to false, override the setting with a safety-valve configuration snippet. Once it is set to false, no more certificate prompts are displayed. - OPSAPS-71089: Atlas's client.auth.enabled configuration is not configurable
- In customer environments where user certificates are required to authenticate to services, the Apache Atlas web UI constantly prompts for certificates. To solve this, the client.auth.enabled parameter is set to true by default. If you need to set it to false, override the setting with a safety-valve configuration snippet. Once it is set to false, no more certificate prompts are displayed. - OPSAPS-71677: When you are upgrading from CDP Private Cloud Base 7.1.9 SP1 to Cloudera Base on premises 7.3.1, upgrade-rollback execution fails during HDFS rollback due to a missing directory.
- This issue is now resolved. The HDFS meta upgrade command now creates the previous directory, so the rollback no longer fails.
- OPSAPS-71390: COD cluster creation is failing on INT and displays the Failed to create HDFS directory /tmp error.
- This issue is now resolved. Export options for JDK 17 are added.
- OPSAPS-71188: Modify default value of dfs_image_transfer_bandwidthPerSec from 0 to a feasible value to mitigate RPC latency in the namenode.
- This issue is now resolved.
- OPSAPS-58777: HDFS Directories are created with root as user.
- This issue is now resolved by fixing service.sdl.
- OPSAPS-71474: In Cloudera Manager UI, the Ozone service Snapshot tab displays label label.goToBucket and it must be changed to Go to bucket.
- This issue is now resolved.
- OPSAPS-70288: Improvements in master node decommissioning.
- This issue is now resolved by making usability and functional improvements to the Ozone master node decommissioning.
- OPSAPS-71647: Ozone replication fails for incompatible source and target Cloudera Manager versions during the payload serialization operation
- Replication Manager now recognizes and annotates the required fields during the payload serialization operation. For the list of unsupported Cloudera Manager versions that do not have this fix, see Preparing clusters to replicate Ozone data.
- OPSAPS-71156: PostCopyReconciliation ignores mismatching modification time for directories
- The Post Copy Reconciliation (PCR) script does not check the file length, last modified time, or cyclic redundancy check (CRC) checksums for directories (paths that are directories) on either the source or the target cluster.
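The comparison rule can be sketched as follows; the field and function names are hypothetical, and this is not the actual PCR implementation:

```python
from dataclasses import dataclass

@dataclass
class PathInfo:
    """Metadata PCR would compare for one path (hypothetical shape)."""
    is_directory: bool
    length: int = 0
    mtime: int = 0
    crc: int = 0

def pcr_entries_match(src: PathInfo, dst: PathInfo) -> bool:
    """Match a source and target entry the way the fixed PCR behaves:
    directories are matched on type alone, while files are matched on
    length, modification time, and CRC checksum."""
    if src.is_directory and dst.is_directory:
        return True  # length/mtime/CRC checks are skipped for directories
    if src.is_directory != dst.is_directory:
        return False
    return (src.length == dst.length
            and src.mtime == dst.mtime
            and src.crc == dst.crc)
```

Under this rule, two directories with different modification times still match, which is the behavior the fix describes.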
- OPSAPS-70732: Atlas replication policies no longer consider inactive Atlas server instances
- Replication Manager considers only the active Atlas server instances during Atlas replication policy runs.
- OPSAPS-70924: Configure Iceberg replication policy level JVM options
- You can add replication-policy level JVM options for the export, transfer, and sync CLIs for Iceberg replication policies on the Advanced tab in the Create Iceberg Replication Policy wizard.
- OPSAPS-70657: KEYTRUSTEE_SERVER & RANGER_KMS_KTS migration to RANGER_KMS from CDP 7.1.x to UCL
- KEYTRUSTEE_SERVER and RANGER_KMS_KTS services are not supported starting from the Cloudera Base on premises 7.3.1 release. Therefore, validation and confirmation messages were added to the Cloudera Manager upgrade wizard to alert users to migrate KEYTRUSTEE_SERVER keys to RANGER_KMS before upgrading to the Cloudera Base on premises 7.3.1 release.
- OPSAPS-70656: Remove KEYTRUSTEE_SERVER & RANGER_KMS_KTS from Cloudera Manager for UCL
- The Keytrustee components - KEYTRUSTEE_SERVER and RANGER_KMS_KTS services are not supported starting from the Cloudera Base on premises 7.3.1 release. These services cannot be installed or managed with Cloudera Manager 7.13.1 using Cloudera Base on premises 7.3.1.
- OPSAPS-67480: In CDP 7.1.9, default Ranger policy is added from the cdp-proxy-token topology, so that after a new installation of CDP 7.1.9, the knox-ranger policy includes cdp-proxy-token. However, upgrades do not add cdp-proxy-token to cm_knox policies automatically.
- This issue is fixed now.
- OPSAPS-70838: Flink user should be added by default to the ATLAS_HOOK topic policy in Ranger >> cm_kafka
- The "flink" service user is granted publish access on the ATLAS_HOOK topic by default in the Kafka Ranger policy configuration.
- OPSAPS-69411: Update AuthzMigrator GBN to point to latest non-expired GBN
- Users can now export Sentry data only for given Hive objects (databases, tables, and the respective URLs) by using the "authorization.migration.export.migration_objects" configuration during export.
- OPSAPS-68252: "Ranger RMS Database Full Sync" option was not visible on mow-int cluster setup for hrt_qa user (7.13.0.0)
- The fix makes the command visible on cloud clusters when the user has minimum EnvironmentAdmin privilege.
- OPSAPS-70148: Ranger audit collection creation is failing on latest SSL enabled UCL cluster due to zookeeper connection issue
- Added support for secure ZooKeeper connection for the Ranger Plugin Solr audit connection configuration xasecure.audit.destination.solr.zookeepers.
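The property named in the fix takes a ZooKeeper ensemble string; the hosts and port below are hypothetical:

```properties
xasecure.audit.destination.solr.zookeepers=zk1.example.com:2182,zk2.example.com:2182/solr
```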
- OPSAPS-52428: Add SSL to ZooKeeper in CDP
- Added SSL/TLS encryption support to CDP components. The ZooKeeper SSL (secure) port is now automatically enabled, and components communicate over the encrypted channel if the cluster has Auto-TLS enabled.
- OPSAPS-72093: FIPS - yarn jobs are failing with No key provider is configured
- The yarn.nodemanager.admin environment must contain the FIPS-related Java options, and this configuration is handled such that the comma is a special character in the string. The fix uses single-module additions in the default FIPS options (a separate --add-modules option for every module) and adds the FIPS options to the yarn.nodemanager.admin environment.
Previously, yarn.nodemanager.container-localizer.admin.java.opts contained FIPS options only for 7.1.9; this patch fixes that as well and adds the proper configurations in 7.3.1 environments.
This was tested on a real cluster, and with these changes YARN works properly and can successfully run distcp from and to encryption zones.
- OPSAPS-70113: Fix the ordering of YARN admin ACL config
- The YARN Admin ACL configuration in Cloudera Manager shuffled the ordering when it was generated. This issue is now fixed, so the input ordering is maintained and the configuration is correctly generated.
- DMX-3364: Drop table operation works incorrectly during Iceberg replication
- Previously, if you dropped a table in the source cluster, removed it from the replication policy, and added another table to the policy, the replicated table was automatically dropped in the target cluster during the subsequent policy run. This issue is resolved.