Configuring the Cluster Sensitivity Profiler
Beyond the generic configuration, the Cluster Sensitivity Profiler has additional parameters that you can optionally edit.
You must have the DataCatalogCspRuleManager role to create and deploy new Custom Sensitivity Profiler rules, to create new regular expressions, and to run validations on newly created rules.
- Go to Profilers and select your data lake.
- Go to Profilers > Configs.
- Select Cluster Sensitivity Profiler.
  The Detail page is displayed. It contains the following sections:
- Use the toggle button to enable or disable the profiler.
- Select a schedule to run the profiler. The schedule is implemented as a Quartz cron expression.
  For more information, see Understanding the Cron Expression generator.
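As an illustration of the schedule format, the following minimal sketch splits a Quartz cron expression into its named fields. A Quartz expression has six required fields (seconds, minutes, hours, day of month, month, day of week) and an optional seventh (year). The function name `describe_quartz_cron` and the sample expression are assumptions made for this example and are not part of the profiler itself.

```python
# Field names of a Quartz cron expression, in order. The year field is optional.
QUARTZ_FIELDS = [
    "seconds", "minutes", "hours",
    "day_of_month", "month", "day_of_week", "year",
]


def describe_quartz_cron(expression: str) -> dict:
    """Map each whitespace-separated field of a Quartz cron expression to its name."""
    parts = expression.split()
    if len(parts) not in (6, 7):
        raise ValueError(f"expected 6 or 7 fields, got {len(parts)}")
    # zip stops at the shorter sequence, so "year" is omitted when absent.
    return dict(zip(QUARTZ_FIELDS, parts))


# Hypothetical schedule: fire every day at 02:00:00.
fields = describe_quartz_cron("0 0 2 * * ?")
print(fields)
```

The sketch only labels the fields; it does not validate the values inside them.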
- Select Last Run Check and set a period if needed.
- Set the sample settings for VM-based environments:
  - Select the Sample Data Size.
    - From the drop-down list, select the type of sample data size.
    - Enter the value based on the previously selected type.
- Continue with the resource settings.
  - In Advanced Options, set the following:
    - Number of Executors: Enter the number of executors to launch for running this profiler.
    - Executor Cores: Enter the number of cores to be used for each executor.
    - Executor Memory: Enter the amount of memory in GB to be used per executor process.
    - Driver Cores: Enter the number of cores to be used for the driver process.
    - Driver Memory: Enter the memory to be used for the driver process.
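To get a rough sense of what the resource settings add up to, the sketch below totals the cores and memory requested by the executors and the driver. The class `ProfilerResources` and the sample values are hypothetical. The sketch also assumes that driver memory is given in GB, like executor memory, and it ignores any overhead the runtime may add on top of the configured values.

```python
from dataclasses import dataclass


@dataclass
class ProfilerResources:
    """Hypothetical container mirroring the Advanced Options fields."""
    num_executors: int
    executor_cores: int
    executor_memory_gb: int
    driver_cores: int
    driver_memory_gb: int  # assumption: same unit (GB) as executor memory

    def total_cores(self) -> int:
        # Cores for all executors plus the single driver process.
        return self.num_executors * self.executor_cores + self.driver_cores

    def total_memory_gb(self) -> int:
        # Memory for all executors plus the single driver process.
        return self.num_executors * self.executor_memory_gb + self.driver_memory_gb


# Illustrative values only.
settings = ProfilerResources(
    num_executors=3,
    executor_cores=2,
    executor_memory_gb=4,
    driver_cores=1,
    driver_memory_gb=2,
)
print(settings.total_cores(), settings.total_memory_gb())  # 7 14
```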
- Click Save to apply the configuration changes to the selected profiler.
- Add Asset Filter Rules as needed to customize which assets the profiler includes or excludes.
- Set your Deny List and Allow List.
  The profiler skips assets that meet any criteria in the Deny List and includes assets that meet any criteria in the Allow List.
  - Select the Deny List or Allow List tab.
  - Click Add New to define new rules.
  - Select the key from the drop-down list and the relevant operator. You can select from the following:
| Key | Operator |
|---|---|
| Database name | equals, starts with, ends with |
| Name (of asset) | equals, contains, starts with, ends with |
| Owner (of asset) | |
| Creation date [1] | greater than, less than |
- Enter the value corresponding to the key, for example a string for a name-based key.
- Click Add Rule. A new rule is enabled by default. You can toggle its state to disable or re-enable it as needed.
[1] For Creation date, greater than 7 days means an asset older than seven days, and less than 7 days means an asset younger than seven days.
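
The Deny List and Allow List behavior described above can be sketched as follows. This is a minimal illustration, not the profiler's actual implementation: the `Asset` and `Rule` classes, the function names, and the sample data are all hypothetical. Two behaviors are assumptions that the text above does not state: a Deny List match takes precedence over an Allow List match, and an empty Allow List allows every asset that is not denied.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Asset:
    database: str
    name: str
    owner: str
    created: datetime


@dataclass
class Rule:
    key: str       # "database", "name", "owner" or "created"
    operator: str  # for example "equals", "starts with", "greater than"
    value: object  # a string, or a number of days when key is "created"


def matches(rule: Rule, asset: Asset, now: datetime) -> bool:
    """Return True if the asset meets the rule's criterion."""
    if rule.key == "created":
        age = now - asset.created
        limit = timedelta(days=rule.value)
        if rule.operator == "greater than":  # older than N days
            return age > limit
        if rule.operator == "less than":     # younger than N days
            return age < limit
        raise ValueError(f"unsupported operator: {rule.operator}")
    text = getattr(asset, rule.key)
    if rule.operator == "equals":
        return text == rule.value
    if rule.operator == "contains":
        return rule.value in text
    if rule.operator == "starts with":
        return text.startswith(rule.value)
    if rule.operator == "ends with":
        return text.endswith(rule.value)
    raise ValueError(f"unsupported operator: {rule.operator}")


def should_profile(asset, deny_rules, allow_rules, now) -> bool:
    # Assumption: any Deny List match wins over the Allow List.
    if any(matches(r, asset, now) for r in deny_rules):
        return False
    # Assumption: an empty Allow List allows everything not denied.
    if not allow_rules:
        return True
    return any(matches(r, asset, now) for r in allow_rules)


now = datetime(2024, 1, 15)
deny = [Rule("created", "greater than", 7)]
allow = [Rule("database", "starts with", "sales")]

old_asset = Asset("sales_eu", "orders", "alice", datetime(2024, 1, 1))    # 14 days old
new_asset = Asset("sales_eu", "invoices", "bob", datetime(2024, 1, 12))   # 3 days old

print(should_profile(old_asset, deny, allow, now))  # False: denied by age
print(should_profile(new_asset, deny, allow, now))  # True: allowed by database name
```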