Flume Solr UUIDInterceptor Configuration Options
Flume can modify and drop events using Interceptors, which can be attached to any Flume source. The Solr UUIDInterceptor sets a universally unique 128-bit identifier (such as f692639d-483c-1b5f-cd61-183cb1726ae0) on each event.
Cloudera recommends assigning UUIDs to events as early as possible (for example, in the first Flume source of your data flow). This allows you to de-duplicate events that are duplicated as a result of replication or re-delivery in a Flume pipeline that is designed for high availability and high performance. If available, application-level UUIDs are preferable to auto-generated UUIDs because they enable subsequent updates and deletion of the document in Solr using that key. If application-level UUIDs are not present, you can use UUIDInterceptor to automatically assign UUIDs to document events.
The UUIDInterceptor supports the following configuration options (required options in bold):
Property Name | Default | Description |
---|---|---|
**type** | (none) | Must be set to the fully qualified class name (FQCN) org.apache.flume.sink.solr.morphline.UUIDInterceptor$Builder. |
headerName | id | The name of the Flume header to use for setting the UUID. |
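preserveExisting | true | Determines whether to preserve an existing UUID header. If true, an event that already carries the header keeps its value; if false, the value is overwritten with a newly generated UUID. |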
prefix | "" | Specifies a string to prepend to each generated UUID. |
For examples, see BlobHandler and BlobDeserializer.
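
The options in the table above can be combined in a Flume agent configuration file. The fragment below is a minimal sketch rather than a tested deployment: the agent name `agent`, the source name `avroSrc`, the interceptor name `uuidinterceptor`, and the prefix value `myhostname` are placeholders chosen for illustration, and the source is assumed to be defined elsewhere in the same file.

```properties
# Attach the interceptor to the first source in the flow so that UUIDs
# are assigned as early as possible (agent and source names are hypothetical).
agent.sources.avroSrc.interceptors = uuidinterceptor

# Required: the fully qualified class name of the interceptor builder.
agent.sources.avroSrc.interceptors.uuidinterceptor.type = org.apache.flume.sink.solr.morphline.UUIDInterceptor$Builder

# Optional: header that receives the UUID (default: id).
agent.sources.avroSrc.interceptors.uuidinterceptor.headerName = id

# Optional: keep a UUID header that is already present, for example an
# application-level ID (default: true).
agent.sources.avroSrc.interceptors.uuidinterceptor.preserveExisting = true

# Optional: string prepended to each generated UUID (default: empty).
agent.sources.avroSrc.interceptors.uuidinterceptor.prefix = myhostname
```

Leaving `preserveExisting` at `true` matches the recommendation to prefer application-level UUIDs: events that already carry an ID in the `id` header keep it, and only events without one receive a generated UUID.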