Apache Samza

Friday June 09, 2017

Announcing the release of Apache Samza 0.13.0

We are very excited to announce the release of Apache Samza 0.13.0.

Samza has been powering real-time applications in production across several large companies (including LinkedIn, Netflix, Uber) for years now. Samza provides leading support for large-scale stateful stream processing with:

 •  First-class support for local state (with a RocksDB store). This allows a stateful application to scale up to 1.1 million events/sec on a single machine with an SSD.

 •  Support for incremental checkpointing of state instead of full snapshots. This enables Samza to scale to applications with very large state.

 •  A fully pluggable model for input sources (e.g. Kafka, Kinesis, DynamoDB streams etc.) and output systems (HDFS, Kafka, ElastiCache etc.).

 •  A fully asynchronous programming model that makes parallelizing remote calls efficient and effortless.

 •  Features like canaries, upgrades and rollbacks that support extremely large deployments with minimal downtime.
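
To make the difference between incremental checkpointing and full snapshots concrete, here is a minimal, library-free sketch in Python. It is not Samza's implementation or API: the `ChangelogStore` class and its methods are hypothetical names made up for illustration. The assumption is that every write is appended to a changelog, so a checkpoint only has to record an offset, and state can be rebuilt by replaying the log up to that offset.

```python
class ChangelogStore:
    """Toy key-value store that logs every write (hypothetical, not Samza's API)."""

    def __init__(self):
        self.state = {}
        self.changelog = []  # append-only list of (key, value) writes

    def put(self, key, value):
        self.state[key] = value
        self.changelog.append((key, value))

    def checkpoint(self):
        # Incremental: record only how far the changelog has been written,
        # instead of copying the whole state dictionary.
        return len(self.changelog)

    @classmethod
    def restore(cls, changelog, offset):
        # Rebuild state by replaying the changelog up to the checkpointed offset.
        store = cls()
        for key, value in changelog[:offset]:
            store.put(key, value)
        return store


store = ChangelogStore()
store.put("page-a", 1)
store.put("page-b", 5)
offset = store.checkpoint()
store.put("page-a", 2)  # written after the checkpoint

recovered = ChangelogStore.restore(store.changelog, offset)
```

The cost of a checkpoint in this sketch does not grow with the size of the state, which is the property that matters for applications with very large state.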

New features

The 0.13.0 release contains previews for the following highly anticipated features:

High Level API

With the new high level API you can express complex stream processing pipelines concisely in a few lines of code and accomplish what previously required multiple jobs. The new API supports common operations like re-partitioning, windowing, and joining streams. Check out the examples to see the high level API in action.
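
As a rough illustration of what a windowing operation computes, the sketch below counts events per key in fixed, non-overlapping (tumbling) windows. It is plain Python and does not use the Samza high level API; the function name and the `(timestamp_ms, key)` event layout are assumptions chosen for this example.

```python
from collections import Counter


def tumbling_window_counts(events, window_ms):
    """Count events per (window start, key) using fixed, non-overlapping windows."""
    counts = Counter()
    for timestamp_ms, key in events:
        # Align each timestamp to the start of the window that contains it.
        window_start = timestamp_ms - (timestamp_ms % window_ms)
        counts[(window_start, key)] += 1
    return dict(counts)


# Hypothetical page-view events: (timestamp in milliseconds, page name).
page_views = [(1000, "home"), (4000, "home"), (9000, "search"), (12000, "home")]
counts = tumbling_window_counts(page_views, window_ms=10_000)
```

In a real streaming job the windows would be emitted as time advances rather than computed over a finished list, but the grouping logic is the same.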

Flexible Deployment Model

Samza now provides flexibility for running your application in any hosting environment and with cluster managers other than YARN. Samza can now also be run as a lightweight stream processing library embedded inside your application. Your processes can coordinate task distribution among themselves using ZooKeeper or static partition assignments out of the box.

See more details and code examples here.
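
To give a feel for how a static assignment can work without a coordinator, here is a small Python sketch. It is an assumption-laden illustration, not Samza's grouping logic: the function name, the task and processor IDs, and the round-robin rule are all hypothetical. The point is that if every process sorts the same inputs and applies the same rule, each one arrives at the same assignment independently.

```python
def assign_tasks(task_ids, processor_ids):
    """Deterministically spread tasks over processors in round-robin order."""
    tasks = sorted(task_ids)
    processors = sorted(processor_ids)
    assignment = {processor: [] for processor in processors}
    for index, task in enumerate(tasks):
        # Sorting first makes the result independent of input order.
        assignment[processors[index % len(processors)]].append(task)
    return assignment


assignment = assign_tasks(
    ["task-3", "task-0", "task-2", "task-1", "task-4"],
    ["proc-b", "proc-a"],
)
```

A scheme like this only holds up while the set of processors is fixed; handling processes that join or leave is where a coordination service such as ZooKeeper comes in.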

Enhancements, Upgrades and Bug Fixes

This release also includes the following enhancements to existing features:

  • SAMZA-871 adds a heart-beat mechanism between JobCoordinator and all running containers to prevent orphaned containers.
  • SAMZA-1140 enables non-blocking commit in the AsyncRunloop.
  • SAMZA-1143 adds configurations for localizing general resources in YARN.
  • SAMZA-1145 provides the ability to configure the default number of changelog replicas.
  • SAMZA-1154 adds a tasks endpoint to samza-rest to get information about all tasks in a job.
  • SAMZA-1158 adds a samza-rest monitor to clean up stale local stores from completed containers.
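
For readers unfamiliar with heartbeats, the following Python sketch shows one plausible way such a mechanism can prevent orphaned containers. It is not the SAMZA-871 implementation: the `ContainerHeartbeat` class, its method names, and the timeout value are hypothetical. The assumption here is that a container tracks its last successful exchange with the coordinator and shuts itself down once that exchange is too old.

```python
class ContainerHeartbeat:
    """Hypothetical container-side tracker; not Samza's actual class."""

    def __init__(self, timeout_s, started_at_s):
        self.timeout_s = timeout_s
        self.last_ok_s = started_at_s

    def on_coordinator_response(self, now_s):
        # Called whenever the coordinator answers a heartbeat.
        self.last_ok_s = now_s

    def should_shut_down(self, now_s):
        # A container that has not heard back within the timeout treats
        # itself as orphaned.
        return now_s - self.last_ok_s > self.timeout_s


heartbeat = ContainerHeartbeat(timeout_s=30, started_at_s=0)
heartbeat.on_coordinator_response(now_s=10)
shut_down_at_35 = heartbeat.should_shut_down(now_s=35)  # 25s since last response
shut_down_at_45 = heartbeat.should_shut_down(now_s=45)  # 35s since last response
```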

This release also includes several bug-fixes and improvements for operational stability. Some notable ones are:

  • SAMZA-1083 prevents loading task stores that are older than delete tombstones during container startup.
  • SAMZA-1100 fixes an exception when using an empty stream as both bootstrap and broadcast.
  • SAMZA-1112 fixes BrokerProxy to log fatal errors.
  • SAMZA-1121 fixes StreamAppender so that it doesn't propagate exceptions to the caller.
  • SAMZA-1157 fixes logging for serialization/deserialization errors.

We've also upgraded the following dependency versions:

  • Scala 2.12 is now supported.
  • Kafka has been upgraded to 0.10.1.1.
  • Elasticsearch has been upgraded to 2.2.0.

Community Developments

We've made great community progress since the previous release. We showcased how Samza is powering stream processing at LinkedIn at Kafka Summit 2017 and O’Reilly Strata 2017. We also presented Samza use cases and case studies from several large companies at ApacheCon Big Data 2017. In addition, the Samza talk at LinkedIn's Stream Processing Meetup in Sunnyvale was well received, with over 200 attendees. Here are links to some of these events:

  1. March 15, 2017 - Processing millions of events per second without breaking the bank - Kartik Paramasivam (Video)
  2. May 8, 2017 - Data Processing at LinkedIn with Apache Kafka and Apache Samza (Kafka Summit NYC 2017) (Slides)
  3. May 16, 2017 - What it takes to process a trillion events a day? Case studies in scaling stream processing at LinkedIn - Jagadish Venkatraman (ApacheCon Big Data '17) (Slides)
  4. May 16, 2017 - The continuing story of Batching to Streaming analytics at Optimizely - Michael Borsuk (ApacheCon Big Data '17) (Slides)
  5. May 24, 2017 - Managed or stand alone, streaming or batch; Unified processing with the Samza Fluent API - Yi Pan (LinkedIn Stream Processing Meetup) (Slides)
  6. May 25, 2017 - How companies are using Apache Samza - Jagadish Venkatraman (ApacheCon podcast)

Future:

We'll continue improving the new High Level API and flexible deployment features with your feedback.

It’s a great time to get involved. You can start by reviewing the tutorials, signing up for the mailing list, and grabbing some newbie JIRAs. I'd like to close by thanking everyone who's been involved in the project. It's been a great experience to be involved in this community, and I look forward to its continued growth.

Comments:

So excited by these new highly anticipated features

Posted by json formatter on July 03, 2017 at 03:50 AM GMT #

Great enhancements, upgrades, and bug fixes

Posted by bullet force on July 03, 2017 at 03:57 AM GMT #

The focus on embedded and local instance running is great -- really looking forward to 0.13.1, and being able to launch StreamTask's directly from a local application runner!

Posted by malcolm on July 14, 2017 at 06:45 PM GMT #

I have not rethought Samza since this report so I can't speak to where it is now with respect to aggregation work. What I can address is that Samza has had about zero industry pickup. In the event that making a choice of a stream preparing engine now, I would not prescribe Samza, in light of the lack of adoption.

Posted by Coursework help on July 09, 2018 at 12:21 PM GMT #

Awesome update

Posted by Json Editor on July 15, 2018 at 02:33 AM GMT #

Great Enhancements, experience but you left some bugs fixing. Please consider them as a serious issue, Fix them in a future update!

Posted by Coursework UK on October 08, 2018 at 07:22 AM GMT #
