Behavioral changes in Cloudera Runtime 7.1.9 SP1 CHF 9

Atlas

Summary:

A new option to ignore the spark_process attributes details and sparkPlanDescription is introduced.

Previous behavior:

The spark_process entity attributes details and sparkPlanDescription are populated with query plan details, which can contain a large amount of text, often in megabytes. This amount of data can incur unnecessary processing costs.

New behavior:

The property atlas.notification.consumer.preprocess.spark_process.attributes is set to false by default. Set it to true in the Atlas server configuration to avoid populating the details and sparkPlanDescription attributes. Ignoring these attributes eliminates the cost of processing a large amount of data in Atlas.

To keep the data for details and sparkPlanDescription, set atlas.notification.consumer.preprocess.spark_process.attributes to false explicitly.
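
The effect of the option can be pictured with a small Python sketch. This is an illustration only, not Atlas source code: the function name, the sample entity, and the choice to model "not populating" as removing the two attributes are all assumptions made for the example.

```python
# Illustrative sketch only: this is not Atlas source code.
# The two attributes named in the release note.
IGNORED_ATTRIBUTES = ("details", "sparkPlanDescription")


def preprocess_spark_process(entity, ignore_attributes=False):
    """Return a copy of a spark_process entity, optionally without the bulky attributes.

    ignore_attributes is a hypothetical stand-in for the value of
    atlas.notification.consumer.preprocess.spark_process.attributes.
    """
    attributes = dict(entity.get("attributes", {}))
    if ignore_attributes:
        for name in IGNORED_ATTRIBUTES:
            # Assumption: ignoring is modeled here as dropping the attribute.
            attributes.pop(name, None)
    return {**entity, "attributes": attributes}


# Hypothetical entity; the attribute values are placeholders.
entity = {
    "typeName": "spark_process",
    "attributes": {
        "qualifiedName": "example_app.example_query",
        "details": "== Parsed Logical Plan == ...",
        "sparkPlanDescription": "Execute InsertIntoHadoopFsRelationCommand ...",
    },
}

kept = preprocess_spark_process(entity, ignore_attributes=False)
trimmed = preprocess_spark_process(entity, ignore_attributes=True)
print(sorted(kept["attributes"]))
print(sorted(trimmed["attributes"]))
```

With the flag set to false, all three attributes survive. With it set to true, only qualifiedName remains in this sample.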

Hive

Summary:

Metastore secondary connection pool size is now configurable.

Previous behavior:

The Metastore's secondary connection pool had a fixed size of 2. This often led to connection limitations, especially under heavy workloads.

New behavior:

You can now configure the Metastore's secondary connection pool size using the property datanucleus.connectionPool.secondary.maxPoolSize. This lets you adjust the pool beyond its default of 2, preventing connection limitations and improving performance.
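
As a rough illustration, the following Python sketch renders the property in the name/value form used by Hadoop-style XML configuration files. It assumes the setting is supplied through the Metastore's hive-site.xml. The helper name and the value 10 are hypothetical; choose a pool size that fits your workload and database connection limits.

```python
import xml.etree.ElementTree as ET


def hive_site_property(name, value):
    """Build a <property> element in the name/value layout of hive-site.xml."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = str(value)
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(prop, encoding="unicode")


# The value 10 is an arbitrary example, not a recommendation.
snippet = hive_site_property("datanucleus.connectionPool.secondary.maxPoolSize", 10)
print(snippet)
```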

Impala

Summary:

OAuth authentication is now supported.

Previous behavior:

Authentication using OAuth was not available.

New behavior:

OAuth authentication is now supported using OAuth JWT bearer tokens. For more information, see OAuth Authentication.