Known Issues in Atlas

Learn about the known issues in Atlas, the impact or changes to the functionality, and the workaround in Cloudera Runtime 7.1.9 SP1 CHF 7.

Known issues identified in Cloudera Runtime 7.1.9 SP1 CHF 7

The following known issues were identified in this release:

Known issues identified before Cloudera Runtime 7.1.9 SP1 CHF 7

CDPD-77054: Exporting shell entities can cause the import to fail
For Apache Hive tables created with Apache Spark, there may be shell entities created in Apache Atlas. If these entities are in the export zip file before they are resolved, the import will fail.
Workaround: None
CDPD-70450: Impala SQL queries that include the “WITH” clause should populate lineage in Atlas
Impala SQL queries that do not use the WITH clause can show lineage in Atlas, but queries that do use the WITH clause cannot show lineage in Apache Atlas. Impala SQL queries using the WITH clause are not supported.
CDPD-77738: Atlas hook authorization issue causing HiveCreateSysDb timeout
Atlas hook authorization error causes HiveCreateSysDb command to time out due to repeated retries.
None
CDPD-6565: Issue with ddlQueries in Atlas for Hive/Impala tables created in default/non-default db using DAS/HUE respectively
ddlQueries are not created for tables that originate in Impala. The ddlQueries created for Impala and Hive CTAS tables are named differently from those created by the same operation on a Hive table. For the Impala table, the ddlQueries entity name contains the query.
None
CDPD-55301: ddlQueries and ALTERTABLE_* lineage missing for Spark tables created through spark3-shell
The ddlQueries and ALTERTABLE_* lineage are missing for Spark tables created using spark3-shell.
None
CDPD-56085: LOAD DATA INPATH to iceberg_table creates a temporary hive_table with name <iceberg_table_name>_tmp* and then marks it as DELETED in Atlas
Running a query such as LOAD DATA INPATH into an iceberg_table creates a temporary hive_table named <iceberg_table_name>_tmp* and then marks it as DELETED in Atlas. As a result, Atlas contains a deleted entity that corresponds to the temporary table <iceberg_table_name>_tmp*.
None
CDPD-58554: Discard audits of specific classification, label, and business metadata
Support to control audits for specific classification, label, and business metadata is not present in the Custom Audit Filters feature.
None
CDPD-58412: Ranger KMS APIs returning incorrect HTTP response codes for error cases
When a key is not found during an operation on that key, KMS returns a 500 Internal Server Error instead of an appropriate error code.
Such calls do not leave KMS in an inconsistent state, and subsequent calls with the correct key name are processed normally.
CDPD-40346: The ddlQueries and ALTERTABLE_ADDCOLS lineage missing for Impala tables.

When an Impala table is altered, the corresponding ALTERTABLE_ADDCOLS lineage is not created.

CDPD-67112: Import transforms do not work as expected when replacing a string which already has ":"
The character “:” is not supported in path replacements. The import succeeds but location remains unchanged. The character “:” must be avoided.
None
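Because the import succeeds even though the location stays unchanged, it can help to check path replacements for ":" before starting an import. The sketch below is a hypothetical pre-flight check: the function name and the old-to-new mapping it takes are assumptions for illustration and are not part of the Atlas import API.

```python
from typing import Dict, List


def replacements_with_colon(replacements: Dict[str, str]) -> List[str]:
    """Return the source strings of replacement pairs that contain ':'.

    `replacements` is a hypothetical mapping of old path text to new path text.
    """
    flagged = []
    for old, new in replacements.items():
        # Either side containing ":" is treated as unsupported here.
        if ":" in old or ":" in new:
            flagged.append(old)
    return flagged
```

An empty result means none of the checked pairs contains the unsupported character.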
CDPD-69150: Unable to add labels or user defined properties in Japanese
Adding Japanese labels or user defined properties results in the error message: “Invalid label: データ, label should contain alphanumeric characters, _ or -”
None
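The error message implies that labels are limited to alphanumeric characters, "_" and "-". A client could check labels before submitting them, as in the minimal sketch below. The exact server-side rule is an assumption inferred from the message (ASCII letters and digits only), and the function name is hypothetical.

```python
import re

# Assumed rule, inferred from the error message: ASCII letters, digits, "_" or "-".
_LABEL_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def is_accepted_label(label: str) -> bool:
    """Return True if the label uses only the characters named in the error message."""
    return _LABEL_PATTERN.fullmatch(label) is not None
```

Under this assumed rule, a label such as sales_data-2024 passes and データ does not.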
CDPD-69279: Quick search does not return entities when using Japanese or Chinese characters to search properties
Entities, such as hdfs_path, are not returned by the search when their indexable properties are searched using partial search terms made of Japanese or Chinese characters. Only exact matches return results when searching the indexed properties made of Japanese or Chinese characters.
None
CDPD-68191: Suggestions do not return the correct results when searching multiple Chinese characters
Free text search does not return results when searching Chinese phrases made of multiple characters. Partial searches return the correct results.
None.
CDPD-71219: Regression : Suggestions don't work for single character words on indexable attributes
When searching for entities whose name (entity names are indexable) is a single character, search results are returned but suggestions are not. When searching for entities whose description (entity descriptions are not indexable) is a single character, both search results and suggestions are returned.
None.
CDPD-67450: Table name renaming operation is not updating or creating iceberg_table entity
Renaming an Iceberg table does not update the corresponding Atlas entity.
None.
CDPD-67089: Export/Import: When a table with Ozone path is exported as "connected", only the Ozone key is exported.
When a table with an Ozone path is exported as "connected", only the Ozone key is exported. Other Ozone entities, such as the Ozone volume and Ozone bucket, are not exported.
None.
CDPD-43772: Performance issues with Atlas service
If there are a lot of update operations and the compression type of the column families of the atlas_janus table is SNAPPY, Kafka message processing might become slower.
  • Consider setting compression type of column families of atlas_janus table as GZ.
OPSAPS-67783: During rolling upgrade, one of two Atlas servers failed to start but Cloudera Manager reported success
Cloudera Manager marks the Execute command Start on service Atlas-1 as a success even when the Atlas service failed to start. In such cases, the Atlas logs give the exact reason for the start-up failure.
None
CDPD-19358: "IsIndexable" and "isOptional" values of a typedef's attribute are modified post migration.

In HDP-265, falcon_feed_creation has a "stored-in" attribute with IsIndexable set to True.

Post migration, "stored-in" is moved to relationshipAttributeDefs and has IsIndexable set to False.

Similarly, in HDP-265, isOptional for the table attribute of hive_storage_desc is set to True. Post migration, isOptional is set to False.

None.
CDPD-11941: Table creation events missed when multiple tables are created in the same Hive command
When multiple Hive tables are created in the same database in a single command, the Atlas audit log for the database may not capture all the table creation events. When there is a delay between creation commands, audits are created as expected.
None.
CDPD-11940: Database audit record misses table delete
When a hive_table entity is created, the Atlas audit list for the parent database includes an update audit. However, at this time, the database does not show an audit when the table is deleted.
None.
CDPD-11692: Navigator table creation time not converted to Atlas
In converting content from Navigator to Atlas, the create time for Hive tables is not moved to Atlas.
None
CDPD-11338: Cluster names with upper case letters may appear in lower case in some process names
Atlas records the cluster name as lower case in qualifiedNames for some process names. The result is that the cluster name may appear in lower case for some processes (insert overwrite table) while it appears in upper case for other queries (ctas) performed on the same cluster.
None.
CDPD-10574: Suggestion order doesn't match search weights
At this time, the order of search suggestions does not honor the search weight for attributes.
None.
CDPD-9095: Duplicate audits for renaming Hive tables
Renaming a Hive table results in duplicate ENTITY_UPDATE events in the corresponding Atlas entity audits, both for the table and for its columns.
None.
CDPD-7982: HBase bridge stops at HBase table with deleted column family
Bridge importing metadata from HBase fails when it encounters an HBase table for which a column family was previously dropped. The error indicates:
Metadata service API org.apache.atlas.AtlasClientV2$API_V2@58112bc4 failed with status 404 (Not Found) Response Body
({"errorCode":"ATLAS-404-00-007","errorMessage":"Invalid instance creation/updation parameters passed :
hbase_column_family.table: mandatory attribute value missing in type hbase_column_family"})
None.
CDPD-7781: TLS certificates not validated on Firefox
Atlas does not check for valid TLS certificates when the UI is opened in Firefox browsers.
None.
CDPD-6675: Irregular qualifiedName format for Azure storage
The qualifiedName for hdfs_path entities created from Azure blob locations (ABFS) does not have the clusterName appended to it, unlike hdfs_path entities in other location types.
None.
CDPD-4762: Spark metadata order may affect lineage
Atlas may record unexpected lineage relationships when metadata collection from the Spark Atlas Connector occurs out of sequence from the metadata collection from HMS. For example, if an ALTER TABLE operation in Spark changes a table name and reports to Atlas before HMS has processed the change, Atlas may not show the correct lineage relationships to the altered table.
None.
CDPD-4545: Searches for Qualified Names with "@" do not fetch the correct results
When searching Atlas qualifiedName values that include an "at" character (@), Atlas does not return the expected results or generate appropriate search suggestions.
Consider leaving out the portion of the search string that includes the @ sign, and use the wildcard character * instead.
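The workaround above can be sketched as a small helper that drops everything from the @ sign onward and appends the wildcard. The function name is hypothetical, and the sketch assumes that the part before the @ sign is specific enough to find the entity.

```python
def to_wildcard_term(qualified_name: str) -> str:
    """Replace the '@...' portion of a qualifiedName search string with '*'."""
    head, sep, _tail = qualified_name.partition("@")
    # Only rewrite the term when an "@" is actually present.
    return head + "*" if sep else qualified_name
```

For a search string such as default.sales@cm, the helper returns default.sales*.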
CDPD-3208: Table alias values are not found in search
When table names are changed, Atlas keeps the old name of the table in a list of aliases. These values are not included in the search index in this release, so after a table name is changed, searching on the old table name does not return the entity for the table.
None.
CDPD-3160: Hive lineage missing for INSERT OVERWRITE queries
Lineage is not generated for Hive INSERT OVERWRITE queries on partitioned tables. Lineage is generated as expected for CTAS queries from partitioned tables.
None.
CDPD-3125: Logging out of Atlas does not manage the external authentication
At this time, Atlas does not communicate a logout event with the external authentication management, Apache Knox. When you log out of Atlas, you can still open the instance of Atlas from the same web browser without re-authentication.
To prevent access to Atlas after logging out, close all browser windows and exit the browser.
CDPD-1892: Ranking of top results in free-text search not intuitive
The Free-text search feature ranks results based on which attributes match the search criteria. The attribute ranking is evolving and therefore the choice of top results may not be intuitive in this release.
If you don't find what you need in the top 5 results, use the full results or refine the search.
CDPD-1884: Free text search in Atlas is case sensitive
The free text search bar in the top of the screen allows you to search across entity types and through all text attributes for all entities. The search shows the top 5 results that match the search terms at any place in the text (*term* logic). It also shows suggestions that match the search terms that begin with the term (term* logic). However, in this release, the search results are case-sensitive.
If you don't see the results you expect, repeat the search changing the case of the search terms.
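Repeating a search with different casing can be scripted. The sketch below generates a few common case variants of a search term to try in turn. The function name is hypothetical, and the chosen variants (original, lower case, upper case, capitalized) are an assumption about which forms are worth trying.

```python
from typing import List


def case_variants(term: str) -> List[str]:
    """Return distinct case variants of a search term, original first."""
    candidates = [term, term.lower(), term.upper(), term.capitalize()]
    variants: List[str] = []
    for candidate in candidates:
        # Keep the first occurrence of each variant and preserve order.
        if candidate not in variants:
            variants.append(candidate)
    return variants
```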
CDPD-1823: Queries with ? wildcard return unexpected results
DSL queries in Advanced Search return incorrect results when the query text includes a question mark (?) wildcard character. This problem occurs in environments where trusted proxy for Knox is enabled, which is always the case for CDP.
None.
CDPD-1664: Guest users are redirected incorrectly
Authenticated users logging in to Atlas are redirected to the CDP Knox-based login page. However, if a guest user (without Atlas privileges) attempts to log in to Atlas, the user is redirected instead to the Atlas login page.
To avoid this problem, open the Atlas Dashboard in a private or incognito browser window.
CDPD-922: The IsUnique relationship attribute not honored
The Atlas model includes the ability to ensure that an attribute can be set to a specific value in only one relationship entity across the cluster metadata. For example, to add metadata tags to relationships that must be unique in the system, you can design the relationship attribute with the property IsUnique set to true. However, in this release, the IsUnique attribute is not enforced.
None.
DOCS-13759: Tag Propagation stops after a certain depth while the lineage is being extended
When a tag is added to an entity at timestamp T1, the entities along the lineage to which the tag must be propagated are calculated at T1. If the lineage is extended before tag propagation completes, the tag does not propagate to the entities in the extended lineage.
CDPD-41142: When a Kafka console consumer group is run, more than one update audit is seen
After running the console consumer with a consumer group, verify the consumer group entity that is created, along with the metrics and notifications for the consumer group and topic. The expected result is one ENTITY_CREATE audit and one ENTITY_UPDATE audit, but more than one ENTITY_UPDATE audit is seen.
CDPD-40165: Two audits are created for SPARK CTAS table

When the following Spark queries are run:

spark.sql("create table table1(id int)")
spark.sql("create table table2 as select * from table1")

HMS sends "ENTITY_CREATE" and "ENTITY_FULL_UPDATE_V2".

The extra ENTITY_FULL_UPDATE_V2 message received from HMS is sent as part of the ALTERTABLE_ADDCOLS event from the HMS Hook side. This behavior is observed only when the queries are run from Spark SQL, not when the same queries are run from Beeline.

CDPD-39197: Debug metrics returns empty data
When debug metrics are enabled and some operations are performed, the response is empty.
CDPD-36495: Updating legacyAttribute from False to True resets the initially created relationshipAttributes values
To start, create types and entities, and set the relationship with the is_legacy_attribute value as False.

Later, update the relationshipDef is_legacy_attribute value to True.

For the entities that were created before updating the is_legacy_attribute to True, the relationshipAttributes value is reset.

CDPD-13466: Bulk create/update entity POST API does not create / update authorised entities
The bulk API fails with a 403 error if some GUIDs belong to entities on which the user is unauthorized and other GUIDs belong to entities on which the user is authorized.
CDPD-22744: Bulk entity DELETE API does not delete authorised entities
The bulk entity DELETE API does not delete authorised entities when a list containing both authorised and unauthorised entities is passed.
CDPD-29409: Hive import: Suggestion suggests entity which is deleted.
Suggestions include tables of a database that is a deleted entity.
CDPD-25152: Tag propagation through deferred actions consumes additional time as compared to default flow
The additional time might be due to the small overhead of creating or updating the task vertex, which runs in the background. It also depends on the number of tasks queued for execution.
CDPD-42954: Zeppelin notebook fails after enabling Atlas-HDFS hook

The Zeppelin notebooks are failing with errors after enabling Atlas-HDFS hook in the CDP cluster.

When the below properties are set for atlas-client.properties in Cloudera Manager:
  • atlas.jaas.KafkaClient.option.keyTab
  • atlas.jaas.KafkaClient.option.principal

Along with adding the properties in /etc/atlas/conf/atlas-application.properties, Cloudera Manager also adds these properties to atlas-application.properties for other services (like Spark).

Adding these properties interferes with the normal flow of the other services (like Spark).

To enable the HDFS lineage feature, instead of setting these properties through Cloudera Manager, add the properties manually in /etc/atlas/conf/atlas-application.properties.
CDPD-10576: Deleted Business Metadata attributes appear in Search Suggestions
Atlas search suggestions continue to show Business Metadata attributes even if the attributes have been deleted.
None.
CDPD-39427: [HDFS Lineage]: When the input is a directory in case of put/copyfromLocal/cp/mv commands, lineage is not created even though the script succeeds.
When the source is a directory and the target is a directory that already exists in Atlas, the command succeeds and inserts the data in the desired location, but lineage is not created.
DOCS-13760: System Attributes search, __classificationNames: Search with parent tag does not return entities associated with its child tags

System attribute search with __classificationNames = parent_tag returns only the entities associated with parent_tag, not the entities associated with its child tags.

Workaround: Instead of using system attribute, employ the basic search attribute "classification" which lists entities associated with inherited classifications.
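As a rough illustration of the workaround, the sketch below builds a request body for a basic search that filters on the "classification" attribute. The field names are assumptions based on the Atlas v2 basic search request format, and the function name is hypothetical. Verify both against the REST API documentation for your Atlas version before use.

```python
def basic_search_by_classification(tag: str, limit: int = 25) -> dict:
    """Build a basic search body that filters on a classification name.

    Field names are assumed to match the Atlas v2 basic search request format.
    """
    return {
        "classification": tag,
        # Assumed flag for including entities tagged with child classifications.
        "includeSubClassifications": True,
        "excludeDeletedEntities": True,
        "limit": limit,
        "offset": 0,
    }
```

The returned dictionary can be serialized as JSON and sent with whichever HTTP client you already use.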
CDPD-35818: Basic search with tag filter provides approximateCount as -1 when there is no match and is 0 otherwise
When the following search operations are performed:
  • Faceted search with both tag filter and entity filter

    The observed approximateCount is -1.

    Here, both the entity filter and the tag filter are present; when there is no match, the approximateCount in the response is -1.

  • Faceted search with only entity filter

    Performing a basic search provides an approximate value of 0 when there is no match.

  • Faceted search with only tag filter

    Whenever the query contains a tag filter and no entity matches, approximateCount is -1; if the query has no tag filter, approximateCount is 0.
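
Clients that read approximateCount may want to treat both -1 and 0 as "no match". The sketch below is one way to do that, assuming the search response is already parsed into a dictionary with an optional "entities" list. The function name is hypothetical.

```python
def has_no_match(response: dict) -> bool:
    """Return True when a search response reports no matching entities.

    Treats approximateCount values of -1 and 0 alike, and also requires
    the (assumed) "entities" list to be missing or empty.
    """
    count = response.get("approximateCount", 0)
    return count <= 0 and not response.get("entities")
```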

CDPD-76035: Resource lookup for Atlas service is failing
After the Atlas configuration snippet atlas.authentication.method.file is enabled and a classification is created, the classification is not synchronized correctly to the Type Category resource field setting of Apache Ranger. As a result, the newly created classification cannot be selected as the Type Name.