Apache Flume interceptors modify or drop events in flight. An interceptor is a class that implements the "org.apache.flume.interceptor.Interceptor" interface. When several interceptors are configured on a source, they run in the order in which they are listed, and the events returned by one interceptor are passed to the next interceptor for further processing.
The following is a sample configuration file that uses Apache Flume interceptors.
agentone.sources = r1
agentone.sinks = k1
agentone.channels = c1
agentone.sources.r1.interceptors = i1 i2
agentone.sources.r1.interceptors.i1.type = org.apache.flume.interceptor.HostInterceptor$Builder
agentone.sources.r1.interceptors.i1.preserveExisting = false
agentone.sources.r1.interceptors.i1.hostHeader = hostname
agentone.sources.r1.interceptors.i2.type = org.apache.flume.interceptor.TimestampInterceptor$Builder
agentone.sinks.k1.filePrefix = FlumeData.%{hostname}.%Y-%m-%d
agentone.sinks.k1.channel = c1
Apache Flume provides the following types of interceptors.
- Timestamp Interceptor
- Host Interceptor
- Static Interceptor
- Remove Header Interceptor
- UUID Interceptor
- Morphline Interceptor
- Search and Replace Interceptor
- Regex Filtering Interceptor
Let us look at each Flume interceptor in detail.
1. Timestamp Interceptor
The Apache Flume Timestamp Interceptor inserts a header with the key timestamp into each event it processes. The value is the time, in milliseconds, at which the interceptor processed the event. If an event already carries a timestamp header, the interceptor can preserve the existing value when it is configured to do so.
Example for Apache Flume Timestamp Interceptor.
agentone.sources = r1
agentone.channels = c1
agentone.sources.r1.channels = c1
agentone.sources.r1.type = seq
agentone.sources.r1.interceptors = i1
agentone.sources.r1.interceptors.i1.type = timestamp
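The preservation behaviour described above is set per interceptor. A minimal sketch, assuming the same agentone and r1 names as the example and the preserveExisting property (which the Flume user guide lists with a default of false), might look like this:

```properties
# Sketch only: keep a timestamp header that an upstream agent already set.
agentone.sources.r1.interceptors = i1
agentone.sources.r1.interceptors.i1.type = timestamp
# Assumption: events may arrive with a timestamp header that must not be overwritten.
agentone.sources.r1.interceptors.i1.preserveExisting = true
```

With preserveExisting left at false, the interceptor overwrites any timestamp header already present on the event.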
2. Host Interceptor
The Apache Flume Host Interceptor inserts a header whose value is the hostname or IP address of the host on which the agent is running. The header key is host unless another key is configured, and the configuration also decides whether the hostname or the IP address is used.
Example for Apache Flume Host Interceptor.
agentone.sources = r1
agentone.channels = c1
agentone.sources.r1.interceptors = i1
agentone.sources.r1.interceptors.i1.type = host
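The choice between hostname and IP address, and the header key, can be sketched as follows. This assumes the useIP and hostHeader properties from the Flume user guide; the header name hostname is only an illustration.

```properties
agentone.sources.r1.interceptors = i1
agentone.sources.r1.interceptors.i1.type = host
# Write the hostname rather than the IP address (useIP defaults to true).
agentone.sources.r1.interceptors.i1.useIP = false
# Store the value under a custom header key instead of the default "host".
agentone.sources.r1.interceptors.i1.hostHeader = hostname
```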
3. Static Interceptor
The Apache Flume Static Interceptor appends a static header with a static value to all events.
Example for Apache Flume Static Interceptor.
agentone.sources = r1
agentone.channels = c1
agentone.sources.r1.channels = c1
agentone.sources.r1.type = seq
agentone.sources.r1.interceptors = i1
agentone.sources.r1.interceptors.i1.type = static
agentone.sources.r1.interceptors.i1.key = datacenter
agentone.sources.r1.interceptors.i1.value = NEW_YORK
4. Remove Header Interceptor
The Apache Flume Remove Header Interceptor manipulates Flume event headers by removing one or many headers. It can remove a single header specified by name, all headers whose names match a regular expression, or all headers named in a list.
Note that if only one header needs to be removed, specifying it by name performs better than the other two methods.
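The three removal modes can be sketched in a single configuration. This assumes the remove_header type alias and the withName, fromList, and matching properties from the Flume user guide; all header names here are hypothetical.

```properties
agentone.sources.r1.interceptors = i1 i2 i3
# 1. Remove a single header by name (the cheapest option for one header).
agentone.sources.r1.interceptors.i1.type = remove_header
agentone.sources.r1.interceptors.i1.withName = debugInfo
# 2. Remove every header named in a list (comma-separated by default).
agentone.sources.r1.interceptors.i2.type = remove_header
agentone.sources.r1.interceptors.i2.fromList = traceId, spanId
# 3. Remove every header whose name matches a regular expression.
agentone.sources.r1.interceptors.i3.type = remove_header
agentone.sources.r1.interceptors.i3.matching = ^tmp_.*
```

In practice one interceptor is usually enough. The three are chained here only to show each mode side by side.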
5. UUID Interceptor
The UUID Interceptor sets a universally unique identifier on all events that are intercepted. An example UUID is b5755073-77a9-43c1-8fad-b7a586fc1b97, which represents a 128-bit value.
Consider using the UUID Interceptor when no application-level unique key is available for the events. Assigning UUIDs as soon as events enter the Flume network, that is, in the first Flume source of the flow, is usually the best choice.
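A minimal configuration sketch for the first source in a flow follows. It assumes the builder class name and the headerName and preserveExisting properties given in the Flume user guide, and that the Morphline Solr sink module that ships this class is on the agent's classpath.

```properties
agentone.sources.r1.interceptors = i1
agentone.sources.r1.interceptors.i1.type = org.apache.flume.sink.solr.morphline.UUIDInterceptor$Builder
# Header that receives the generated UUID ("id" is the documented default).
agentone.sources.r1.interceptors.i1.headerName = id
# Keep a UUID that an upstream application or agent already assigned.
agentone.sources.r1.interceptors.i1.preserveExisting = true
```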
6. Morphline Interceptor
The Apache Flume Morphline Interceptor filters events through a morphline configuration file, which defines a chain of transformation commands that pipe records from one command to another.
The Morphline Interceptor is not intended for heavy-duty ETL processing. If heavy-duty ETL is needed, consider moving the ETL processing from the Flume source to a Flume sink such as MorphlineSolrSink.
Example for Apache Flume Morphline Interceptor.
agentone.sources.avroSrc.interceptors = morphlineinterceptor
agentone.sources.avroSrc.interceptors.morphlineinterceptor.type = org.apache.flume.sink.solr.morphline.MorphlineInterceptor$Builder
agentone.sources.avroSrc.interceptors.morphlineinterceptor.morphlineFile = /etc/flume-ng/conf/morphline.conf
agentone.sources.avroSrc.interceptors.morphlineinterceptor.morphlineId = morphline1
7. Search and Replace Interceptor
The Apache Flume Search and Replace Interceptor provides string-based search-and-replace functionality based on Java regular expressions. It follows the same rules as the Java Matcher.replaceAll() method.
Example for Apache Flume Search and Replace Interceptor.
agentone.sources.avroSrc.interceptors = search-replace
agentone.sources.avroSrc.interceptors.search-replace.type = search_replace
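# Remove leading alphanumeric characters and underscores from the event body.
# The empty replaceString is intentional: the matched text is deleted.
agentone.sources.avroSrc.interceptors.search-replace.searchPattern = ^[A-Za-z0-9_]+
agentone.sources.avroSrc.interceptors.search-replace.replaceString =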
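Because the interceptor follows the Matcher.replaceAll() rules, capture groups in the search pattern can be referenced from the replacement string as $1, $2, and so on. The following is a hypothetical sketch that reorders a leading date. The pattern and the date format are assumptions chosen for illustration.

```properties
agentone.sources.avroSrc.interceptors = search-replace
agentone.sources.avroSrc.interceptors.search-replace.type = search_replace
# Rewrite a leading year-month-day date as day/month/year.
# Character classes are used instead of \d because backslashes are
# escape characters in Java properties files.
agentone.sources.avroSrc.interceptors.search-replace.searchPattern = ^([0-9]{4})-([0-9]{2})-([0-9]{2})
agentone.sources.avroSrc.interceptors.search-replace.replaceString = $3/$2/$1
```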
8. Regex Filtering Interceptor
The Apache Flume Regex Filtering Interceptor filters events selectively by interpreting the event body as text and matching the text against a configured regular expression. The regular expression can be used either to include or to exclude matching events.
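Example for Apache Flume Regex Filtering Interceptor. This is a minimal sketch that assumes the regex_filter type alias and the regex and excludeEvents properties from the Flume user guide; the DEBUG pattern is hypothetical.

```properties
agentone.sources.r1.interceptors = i1
agentone.sources.r1.interceptors.i1.type = regex_filter
# Hypothetical pattern: event bodies that begin with the text DEBUG.
agentone.sources.r1.interceptors.i1.regex = ^DEBUG.*
# true drops matching events; false (the default) keeps only matching events.
agentone.sources.r1.interceptors.i1.excludeEvents = true
```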