What's new in 1.5.5 SP1

Cloudera AI on premises 1.5.5 SP1 delivers a set of new features for Cloudera AI.

Job Retry feature is available on Cloudera AI on premises

The Job Retry feature automatically retries jobs that end in one of the following terminal execution states: failed, timed out, or skipped. Retry runs execute concurrently, so scheduled job runs remain unaffected and are not blocked by retry processes. Users can configure several options to define the retry behavior. Retries are fully automated and require no manual intervention.

For details on the Job Retry feature, see the instructions in Configuring Job Retry settings for the Administrator and references in Job Retry parameters.
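
The retry decision described above can be sketched in Python as follows. This is a minimal illustration only: the RetryPolicy class, its max_retries field, and the state strings are hypothetical names chosen for the example and are not the product's actual settings or API. For the real options, refer to Job Retry parameters.

```python
from dataclasses import dataclass

# Terminal states named in the release note; the exact string spellings
# used by the product are an assumption.
RETRYABLE_STATES = frozenset({"failed", "timed out", "skipped"})


@dataclass
class RetryPolicy:
    """Hypothetical policy object, not the product's settings schema."""
    max_retries: int = 3
    retry_on: frozenset = RETRYABLE_STATES


def should_retry(state: str, attempts_so_far: int, policy: RetryPolicy) -> bool:
    """Return True if a run that ended in `state` qualifies for another retry."""
    return state in policy.retry_on and attempts_so_far < policy.max_retries
```

With this sketch, a run that ends in the failed state is retried until the assumed max_retries limit is reached, while a run that ends in any other state is never retried.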

Apache Spark defaults and custom configuration details available

Custom Apache Spark settings can now be configured at the workbench level. When set, the custom Spark configuration provided by the Administrator is merged with the default Spark configuration used in Cloudera AI sessions. These settings apply automatically to all newly launched Spark sessions within the workbench. The configuration option is available under Site Administration > Runtimes. If Spark pushdown is enabled, Spark configuration details can be configured at the workbench level and are visible at the Base cluster level. For more details, see Setting custom Spark configurations at workbench-level and Modifying Project settings.
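
As a rough illustration of what merging a custom configuration with the defaults can look like, consider the Python sketch below. It assumes that a custom value overrides the default when both define the same property. The release note does not state the precedence rule, so treat that as an assumption. The merge_spark_conf function is a hypothetical helper, and the property values are examples only.

```python
def merge_spark_conf(defaults: dict, custom: dict) -> dict:
    """Merge custom Spark properties over the defaults.

    Assumption: on a key conflict the custom (Administrator) value wins.
    The input dictionaries are not modified.
    """
    merged = dict(defaults)   # copy so the defaults stay untouched
    merged.update(custom)     # custom values override matching keys
    return merged


# Illustrative values only; these are not the actual Cloudera AI defaults.
defaults = {
    "spark.executor.memory": "1g",
    "spark.dynamicAllocation.enabled": "false",
}
custom = {"spark.executor.memory": "4g"}
merged = merge_spark_conf(defaults, custom)
```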

Support added for NVIDIA Inference Microservice (NIM) CLI

Cloudera AI Registry now supports the NIM CLI for importing the latest offerings from NVIDIA.

Cloudera AI Inference service

The performance of the Cloudera AI Inference service has been significantly enhanced by implementing token caching, which improves UI responsiveness and reduces network load.

Cloudera AI Registry

Support for the following models is now added for Cloudera AI on premises 1.5.5 SP1:

llama-3.3-nemotron-super-49b-v1
llama-3.1-nemotron-nano-v1
starcoder-2
deepseek-r1-distill-llama
llama-3.2-instruct
nemoretriever-parse
nemoretriever-graphic-elements-v1
nemoretriever-page-elements-v2
nemoretriever-table-structure-v1
paddleocr
llama-3.2-nv-embedqa-v2
llama-3.2-nv-rerankqa-v2
riva-asr-whisper-large-v3
boltz2
gpt-oss

Bucket name validation during Cloudera AI Registry creation

When creating a Cloudera AI Registry, the system now performs a validation to ensure the S3 bucket name meets the required standards.

As part of this validation, Cloudera AI Registry creation is blocked if the bucket name contains a forward slash ('/'), as this character is not permitted in S3 bucket names.
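
A minimal Python sketch of the slash check is shown below. The validate_bucket_name function is a hypothetical helper written for illustration and is not the product's implementation. It covers only the '/' rule named in this note, not the full set of S3 bucket naming rules.

```python
def validate_bucket_name(name: str) -> None:
    """Raise ValueError if the bucket name contains a '/' character.

    Only the rule described in this release note is checked here.
    """
    if "/" in name:
        raise ValueError(
            f"Invalid bucket name {name!r}: '/' is not permitted in S3 bucket names"
        )
```

In this sketch, a name such as my-registry-bucket passes, while my-bucket/models raises an error, which corresponds to registry creation being blocked.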

Increased default resource requests for infrastructure pods

Resource requests for several core Cloudera AI services are now increased, adding 1350 millicores of CPU and 1094 MiB of memory. This enhancement is aimed at improving performance and stability, providing a smoother and more reliable experience without requiring any user intervention.

Configurable Livelog retention period in Cloudera AI

Cloudera AI now provides a configurable livelog retention period. Previously, the retention period was hardcoded in Helm charts, which often led to system crashes because the default livelog storage size of 100 GB was insufficient for many users. With this update, Administrators can define the livelog retention period in days, which offers more granular control than the previous monthly configuration. The retention period can be customized in the Site Administration Settings and defaults to 180 days. For details, see the instructions in Historical workload cleanup settings.
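
To make the day-based retention arithmetic concrete, the following Python sketch computes the cutoff timestamp for a given retention period. The function names are hypothetical and the logic is an assumed simplification, not the livelog cleaner's actual code. Only the 180-day default comes from this note.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_RETENTION_DAYS = 180  # default value stated in this release note


def retention_cutoff(now: datetime, retention_days: int = DEFAULT_RETENTION_DAYS) -> datetime:
    """Return the timestamp before which entries fall outside the retention period."""
    return now - timedelta(days=retention_days)


def is_expired(created_at: datetime, now: datetime,
               retention_days: int = DEFAULT_RETENTION_DAYS) -> bool:
    """Return True if an entry created at `created_at` is older than the retention period."""
    return created_at < retention_cutoff(now, retention_days)


# Example: evaluate the cutoff relative to the current UTC time.
cutoff = retention_cutoff(datetime.now(timezone.utc))
```

Lowering the retention period moves the cutoff closer to the present, so more entries qualify for cleanup and less storage is retained.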

Periodic workload records cleanup from the Cloudera AI Workbench PostgreSQL database

A new Cleanup DB Entries checkbox in the Site Administration Settings allows Administrators to enable periodic cleanup of database tables, including dashboards, dashboard_pods, model_deployments, and user_events, for entries older than the livelog retention period. When the enable_clean_db_workloads setting is activated, the livelog cleaner triggers a gRPC call for the DeleteDBWorkloads action, which performs the cleanup.

Additionally, the cleanup logic for the usageview table is now configurable directly through the UI, providing greater control and ease of management.

For details on the Dashboard Archive feature, see Optimized queries with Dashboards Archive table.

Improved stability of web deployments

The stability of web deployments is improved by optimizing Kubernetes liveness probes to prevent unnecessary pod restarts during high load scenarios.

New model build options introduced

model_root_dir: This option allows you to set a custom build root directory, enabling deployment even when a .git structure is nested.

build_script_path: This option provides the ability to specify a custom path for the build script.