Data Compliance profiler configuration
You can configure the scheduling and the available resources for your profiler.
- Go to Profilers and select your data lake.
- Go to Profilers > Data Compliance > Profiler Details > Configuration > All Configurations
- Select a schedule for running the profiler, using either a UNIX Cron Expression or the Basic scheduler.
- Select Last Run Check and, if needed, set a period in Day Range.
- Continue with the resource settings:
  - Set the Maximum number of executors.
    Indicates the number of processes that are used by the distributed computing framework. The recommended value is at least 10 executors.
  - Set the Maximum cores per executor.
    Indicates the maximum number of cores that can be allocated to an executor.
  - Set the Executor memory limit in GBs.

- Click Save to apply the configuration changes to the selected profiler.
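
To make the resource settings concrete, the following minimal Python sketch estimates the upper bound a given combination of values can request. It is an illustration only: the class `ProfilerResources` and its field names are hypothetical and not part of the product, and the arithmetic assumes that every executor is running at its limits and ignores any driver or overhead memory.

```python
from dataclasses import dataclass

# Taken from the recommendation in the resource settings above.
RECOMMENDED_MIN_EXECUTORS = 10


@dataclass(frozen=True)
class ProfilerResources:
    """Hypothetical container for the three resource settings."""

    max_executors: int
    max_cores_per_executor: int
    executor_memory_gb: int

    def peak_cores(self) -> int:
        # Upper bound: every executor running with its maximum cores.
        return self.max_executors * self.max_cores_per_executor

    def peak_memory_gb(self) -> int:
        # Upper bound on executor memory only (no driver or overhead).
        return self.max_executors * self.executor_memory_gb

    def below_recommendation(self) -> bool:
        return self.max_executors < RECOMMENDED_MIN_EXECUTORS


settings = ProfilerResources(
    max_executors=10, max_cores_per_executor=2, executor_memory_gb=4
)
print(settings.peak_cores())            # 20
print(settings.peak_memory_gb())        # 40
print(settings.below_recommendation())  # False
```

A check like this can help confirm that the values you plan to save fit within the capacity available to the profiler.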
- Add Asset Filtering Rules as needed to customize the selection of assets to be profiled.
- Set your Deny List and Allow List.
  The profiler skips assets that meet any criteria in the Deny List and includes assets that meet any criteria in the Allow List.
- Select the Deny List or Allow List tab.
- Click Add New Rule to define new rules.
- Select the key from the drop-down list and the relevant operator. You can select from the following keys and operators:
  - Database name: equals, starts with, ends with
  - Name (of asset): equals, contains, starts with, ends with
  - Owner (of asset)
  - Creation date: greater than, less than
- Enter the value corresponding to the key. For example, for a Database name rule, enter the string to match against the database name.
- Click Add Rule. A new rule is enabled by default; you can toggle its state to disable or re-enable it as needed.
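
The filtering behavior described above can be sketched in a few lines of Python. This is an illustrative model, not the profiler's actual implementation: the `Rule` class, the `should_profile` function, and the asset field names (`database_name`, `name`, `creation_date`) are hypothetical. It also makes two assumptions that the text above does not state: a Deny List match takes precedence over an Allow List match, and an Allow List with no enabled rules lets every non-denied asset through.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Dict, List, Union

Value = Union[str, date]

# The operators named in the key/operator list above.
OPERATORS: Dict[str, Callable[[Value, Value], bool]] = {
    "equals": lambda actual, expected: actual == expected,
    "contains": lambda actual, expected: expected in actual,
    "starts with": lambda actual, expected: actual.startswith(expected),
    "ends with": lambda actual, expected: actual.endswith(expected),
    "greater than": lambda actual, expected: actual > expected,
    "less than": lambda actual, expected: actual < expected,
}


@dataclass
class Rule:
    key: str              # hypothetical asset field name
    operator: str         # one of the OPERATORS keys
    value: Value
    enabled: bool = True  # rules are enabled by default

    def matches(self, asset: Dict[str, Value]) -> bool:
        # A disabled rule never matches.
        return self.enabled and OPERATORS[self.operator](asset[self.key], self.value)


def should_profile(asset: Dict[str, Value], deny: List[Rule], allow: List[Rule]) -> bool:
    # Assumption: any Deny List match wins over the Allow List.
    if any(rule.matches(asset) for rule in deny):
        return False
    active_allow = [rule for rule in allow if rule.enabled]
    # Assumption: with no enabled allow rules, every non-denied asset is profiled.
    return not active_allow or any(rule.matches(asset) for rule in active_allow)


deny_list = [Rule("database_name", "starts with", "tmp_")]
allow_list = [
    Rule("name", "contains", "customer"),
    Rule("creation_date", "greater than", date(2023, 1, 1), enabled=False),
]

staging = {"database_name": "tmp_stage", "name": "customer_raw", "creation_date": date(2024, 1, 1)}
orders = {"database_name": "sales", "name": "customer_orders", "creation_date": date(2022, 6, 1)}
stock = {"database_name": "sales", "name": "inventory", "creation_date": date(2024, 1, 1)}

print(should_profile(staging, deny_list, allow_list))  # False: denied by database name
print(should_profile(orders, deny_list, allow_list))   # True: matches an enabled allow rule
print(should_profile(stock, deny_list, allow_list))    # False: matches no enabled allow rule
```

In this sketch, rules within a list are combined with "any" semantics, which mirrors the wording "meet any criteria"; the disabled Creation date rule has no effect until it is toggled back on.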