Apache NiFi

Wednesday Dec 16, 2015

Getting Syslog Events to HBase

Bryan Bende -  bbende@gmail.com @bbende

In the Apache NiFi 0.4.0 release there are several new integration points including processors for interacting with Syslog and HBase. In this post we'll demonstrate how to use NiFi to receive messages from Syslog over UDP, and store those messages in HBase.

The flow described in this post was created using Apache NiFi 0.4.0, rsyslog 5.8.10, and Apache HBase 1.1.2.

Setting up Syslog

In order for NiFi to receive syslog messages, rsyslog needs to forward messages to a port that NiFi will be listening on. Forwarding of messages can be configured in rsyslog.conf, which is located in /etc on most Linux operating systems.

Edit rsyslog.conf and add the following line:

*.* @localhost:7780

This tells rsyslog to forward all messages over UDP to localhost port 7780. A double '@@' can be used to forward over TCP.

Restart rsyslog for the changes to take effect:

/etc/init.d/rsyslog restart
Shutting down system logger:                               [  OK  ]
Starting system logger:                                    [  OK  ]

Setting up HBase

In order to store the syslog messages, we'll create an HBase table called 'syslog' with one column family called 'msg'. From the command line enter the following:

hbase shell
create 'syslog', {NAME => 'msg'}

Configure an HBase Client Service

The HBase processors added in Apache NiFi 0.4.0 use a controller service to interact with HBase. This allows the processors to remain unchanged when the HBase client changes. Because NARs provide class-loader isolation, a single NiFi instance can also interact with HBase instances of different versions at the same time.

The HBase Client Service can be configured by providing paths to external configuration files, such as hbase-site.xml, or by providing several properties directly on the service. For this example we will take the latter approach. From the Controller Services configuration window in NiFi, add an HBase_1_1_2_ClientService with the following configuration (adjusting values appropriately for your system):


After configuring the service, enable it in order for it to be usable by processors:


Building the Dataflow

The dataflow we are going to build will consist of the following components:

  • ListenSyslog for receiving syslog messages over UDP
  • UpdateAttribute for renaming attributes and creating a row id for HBase
  • AttributesToJSON for creating a JSON document from the syslog attributes
  • PutHBaseJSON for inserting each JSON document as a row in HBase

The overall flow looks like the following:

Let's walk through the configuration of each processor...



In ListenSyslog, set the Port to the same port that rsyslog is forwarding messages to, in this case 7780. Leave everything else as the default values.

With a Max Batch Size of "1" and Parse Messages as "true", each syslog message will be emitted as a single FlowFile, with the content of the FlowFile being the original message, and the results of parsing the message being stored as FlowFile attributes.

The attributes we will be interested in are:

  • syslog.priority
  • syslog.severity
  • syslog.facility
  • syslog.version
  • syslog.timestamp
  • syslog.hostname
  • syslog.sender
  • syslog.body
  • syslog.protocol
  • syslog.port



The attributes produced by ListenSyslog all start with "syslog." which keeps them nicely namespaced in NiFi. However, we are going to use these attribute names as column qualifiers in HBase. We don't really need this prefix since we will already be within a syslog table.

In UpdateAttribute, add a property for each syslog attribute to remove the prefix, and use the Delete Attributes Expression to remove the original attributes. In addition, create an id attribute of the form "timestamp_uuid" where timestamp is the long representation of the timestamp on the syslog message, and uuid is the uuid of the FlowFile in NiFi. This id attribute will be used as the row id in HBase.

The expression language for the id attribute is:

${syslog.timestamp:toDate('MMM d HH:mm:ss'):toNumber()}_${uuid}
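As a rough illustration of what this expression computes, the following Python sketch builds the same kind of id. It assumes the timestamp is interpreted in UTC and that a date pattern without a year resolves to 1970; both are assumptions chosen because they are consistent with the sample row id shown later in this post. The `syslog_row_id` name is hypothetical.

```python
from datetime import datetime, timezone


def syslog_row_id(syslog_timestamp, flowfile_uuid):
    """Build a "timestamp_uuid" row id from a year-less syslog timestamp."""
    # The syslog timestamp carries no year, so 1970 is supplied explicitly
    # (an assumption made to mirror the behaviour described above).
    parsed = datetime.strptime(f"1970 {syslog_timestamp}", "%Y %b %d %H:%M:%S")
    parsed = parsed.replace(tzinfo=timezone.utc)  # UTC is an assumption
    millis = int(parsed.timestamp() * 1000)
    return f"{millis}_{flowfile_uuid}"
```

For the timestamp "Dec 10 19:20:15" this yields the prefix 29704815000. Because the year is missing, ids built this way order rows by time of year only, which is worth keeping in mind if the table will hold messages spanning more than one year.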



In AttributesToJSON, set the Destination to "flowfile-content" so that the JSON document replaces the FlowFile content, and set Include Core Attributes to "false" so that the standard NiFi attributes are not included.
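The effect of this step can be sketched in a few lines of Python. This is only an approximation of the behaviour described above, and the set of core attribute names below is a partial list assumed for the example.

```python
import json

# A partial list of NiFi's standard attributes, assumed for this sketch.
CORE_ATTRIBUTES = {"uuid", "filename", "path"}


def attributes_to_json(attributes):
    """Serialize all non-core attributes as one flat JSON document."""
    document = {
        name: value
        for name, value in attributes.items()
        if name not in CORE_ATTRIBUTES
    }
    return json.dumps(document, sort_keys=True)
```

Given the renamed attributes from the previous step, the result is a flat document with keys such as body, hostname, timestamp, and id, which is the shape the next processor expects.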



In PutHBaseJSON, select the HBase Client Service we configured earlier and set the Table Name and Column Family to "syslog" and "msg" based on the table we created earlier. In addition, set the Row Identifier Field Name to "id" to instruct the processor to use the id field from the JSON for the row id.
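Conceptually, each JSON document becomes one HBase row: the id field supplies the row id, and every other field becomes a column qualifier in the configured column family. The sketch below shows that mapping in plain Python without talking to HBase; the `json_to_cells` function is hypothetical and only illustrates the idea.

```python
import json


def json_to_cells(document, column_family="msg", row_id_field="id"):
    """Map a flat JSON document to (row, column, value) tuples."""
    fields = json.loads(document)
    # The row id field is used as the key and is not stored as a column.
    row_id = fields.pop(row_id_field)
    return [
        (row_id, f"{column_family}:{qualifier}", str(value))
        for qualifier, value in sorted(fields.items())
    ]
```

This mirrors the layout of the scan output in the next section, where each line shows the row id followed by a column in the form msg:qualifier.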

Verifying the Flow

From a terminal we can send a test message to syslog using the logger utility:

logger "this is a test syslog message"
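If the logger utility is not available, a test message can also be sent straight to the port ListenSyslog is listening on, bypassing rsyslog. The following Python sketch does this over UDP. The function names are hypothetical, and the default priority of 13 (facility user, severity notice) is chosen to match logger's default.

```python
import socket
import time


def format_syslog(body, priority=13, hostname=None, now=None):
    """Format a BSD-style syslog line: <priority>timestamp hostname body."""
    now = time.localtime() if now is None else now
    hostname = hostname or socket.gethostname()
    # The day of month is space-padded to two characters, e.g. "Dec  1".
    stamp = f"{time.strftime('%b', now)} {now.tm_mday:2d} {time.strftime('%H:%M:%S', now)}"
    return f"<{priority}>{stamp} {hostname} {body}"


def send_syslog_udp(message, host="localhost", port=7780):
    """Send one message as a single UDP datagram and return the bytes sent."""
    data = message.encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(data, (host, port))
    return data


if __name__ == "__main__":
    send_syslog_udp(format_syslog("this is a test syslog message"))
```

Since UDP gives no delivery confirmation, the script succeeds even when nothing is listening; check the NiFi flow or the HBase table to confirm the message arrived.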

Using the HBase shell we can inspect the contents of the syslog table:

hbase shell
hbase(main):002:0> scan 'syslog'
ROW                                          COLUMN+CELL
29704815000_84f91b21-d35f-4a24-8e0e-aaed4a521c13 column=msg:body, timestamp=1449775215481,
  value=root: this is a test message
29704815000_84f91b21-d35f-4a24-8e0e-aaed4a521c13 column=msg:hostname, timestamp=1449775215481,
29704815000_84f91b21-d35f-4a24-8e0e-aaed4a521c13 column=msg:port, timestamp=1449775215481,
29704815000_84f91b21-d35f-4a24-8e0e-aaed4a521c13 column=msg:protocol, timestamp=1449775215481,
29704815000_84f91b21-d35f-4a24-8e0e-aaed4a521c13 column=msg:sender, timestamp=1449775215481,
29704815000_84f91b21-d35f-4a24-8e0e-aaed4a521c13 column=msg:timestamp, timestamp=1449775215481,
  value=Dec 10 19:20:15
29704815000_84f91b21-d35f-4a24-8e0e-aaed4a521c13 column=msg:version, timestamp=1449775215481,
1 row(s) in 0.1120 seconds

Performance Considerations

In some cases the volume of syslog messages being pushed to ListenSyslog may be very high. There are several options to help scale the processing depending on the given use-case.

Concurrent Tasks

ListenSyslog has a background thread reading messages as fast as possible and placing them on a blocking queue to be de-queued and processed by the onTrigger method of the processor. By increasing the number of concurrent tasks for the processor, we can scale up the rate at which messages are processed, ensuring new messages can continue to be queued.
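The pattern described here, one reader filling a blocking queue while several tasks drain it, can be sketched with Python's standard library. This is an analogy rather than NiFi's implementation; the queue size of 100 and the `process_with_workers` name are arbitrary choices for the example.

```python
import queue
import threading


def process_with_workers(messages, worker_count, handle):
    """Feed messages through a bounded queue to worker_count consumer threads."""
    pending = queue.Queue(maxsize=100)
    results = []
    results_lock = threading.Lock()
    stop = object()  # sentinel telling a worker to exit

    def reader():
        for message in messages:
            pending.put(message)  # blocks while the queue is full
        for _ in range(worker_count):
            pending.put(stop)

    def worker():
        while True:
            message = pending.get()
            if message is stop:
                return
            outcome = handle(message)
            with results_lock:
                results.append(outcome)

    threads = [threading.Thread(target=reader)]
    threads += [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

With more workers, the queue empties faster and the reader spends less time blocked, which is the effect increasing the number of concurrent tasks is meant to achieve.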


One of the more expensive operations during the processing of a message is parsing the message in order to provide the attributes. Parsing messages is controlled on the processor through a property and can be turned off in cases where the attributes are not needed, and the original message just needs to be delivered somewhere.


In cases where parsing the messages is not necessary, an additional option is batching many messages together during one call to onTrigger. This is controlled through the Max Batch Size property which defaults to "1". This would be appropriate in cases where having individual messages is not necessary, such as storing the messages in HDFS where you need them batched into appropriately sized files.
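A minimal sketch of the batching idea, assuming messages are waiting on a queue and that a batch is simply the messages joined by newlines (the `drain_batch` helper is hypothetical):

```python
import queue


def drain_batch(pending, max_batch_size):
    """Remove up to max_batch_size messages from the queue and join them with newlines."""
    batch = []
    while len(batch) < max_batch_size:
        try:
            batch.append(pending.get_nowait())
        except queue.Empty:
            break  # fewer messages were available than the batch allows
    return "\n".join(batch)
```

Each call produces one unit of output containing many messages, so the per-message overhead downstream is paid once per batch instead of once per message.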


In addition to parsing messages directly in ListenSyslog, there is also a ParseSyslog processor. An alternative to the flow described in this post would be to have ListenSyslog produce batches of 100 messages at a time, followed by SplitText, followed by ParseSyslog. The tradeoff here is that we can scale the different components independently, and take advantage of backpressure between processors.


At this point you should be able to get your syslog messages ingested into HBase and can experiment with different configurations. The template for this flow can be found here.

We would love to hear any questions, comments, or feedback that you may have!

Learn more about Apache NiFi and feel free to leave comments here or e-mail us at dev@nifi.apache.org.


