The Apache Software Foundation Blog

Tuesday, January 27, 2015

The Apache Software Foundation Announces Apache™ Samza™ as a Top-Level Project

Open Source Big Data distributed stream processing framework used in business intelligence, financial services, healthcare, mobile applications, security, and software development, among other industries.

Forest Hill, MD – 27 January 2015 – The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Apache™ Samza™ has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles.

"The incubation process at Apache has been great. It has helped us cultivate a strong community, and provided us with the support and infrastructure to make Samza grow," said Chris Riccomini, Vice President of Apache Samza.

Apache Samza is a distributed stream processing framework, designed to handle fault tolerance, stateful processing, message durability, and scalability. Samza helps users write lightweight processors that consume streams of data from messaging systems such as Apache Kafka. These processors empower organizations to understand and react to their data in real time. In addition, Samza uses Apache Hadoop YARN to provide fault tolerance, processor isolation, security, and resource management.

Samza represents a different approach to stream processing. It has been purpose-built first and foremost as a production-grade system with operability and scalability in mind. Samza integrates tightly with Apache Kafka, which makes it a natural fit for those already running Kafka in their data pipeline. The framework also introduces the concept of stateful processing and aggregation as a first-class feature. Stateful processing gives Samza developers a completely new paradigm for aggregating stream data. These features help organizations do high-performance stream processing at scale.

Created to process tracking data and service log data, and to power data ingestion pipelines for real-time services, Samza originated at LinkedIn and was submitted to the Apache Incubator in July 2013.

"LinkedIn is thrilled to see Apache Samza experience such strong adoption and now graduate to a Top-Level Project. Samza was developed to help solve some of LinkedIn's toughest stream processing challenges and has become a central piece of our infrastructure," said Kevin Scott, Senior Vice President of Engineering and Operations at LinkedIn.

Apache Samza is used in an array of industries, applications, and organizations, including:
  • DoubleDutch, a developer of mobile apps for events and conferences, uses Samza to power its analytics platform and stream data live into an event dashboard for real-time insights;
  • Fortscale's Big Data security analytics solutions use Samza to process security event logs as part of their data ingestion pipelines and online machine learning model creation process;
  • Happy Pancake, Northern Europe's largest internet dating service, uses Samza for all event handlers and data replication;
  • Advertising technology provider Improve Digital uses Samza as the foundation of a real-time processing capability performing data analytics and as the basis for an alerting system;
  • Jack Henry & Associates uses Samza to process user activity data across its Banno suite of products for financial institutions;
  • MobileAware uses Samza as a foundation for two mobile network products: real-time analytics and multi-channel notification (push, text message, and HTML5);
  • Technology startup Project Florida uses Samza for real-time monitoring of data streams from wearable sensors, for preventative healthcare purposes;
  • Quantiply, a provider of Cloud-based micro-applications, uses Samza to bring together user event, system performance, and business operational data for real-time visibility and decision support; and
  • Social media business intelligence solution VinTank uses Samza to power its analysis and natural language processing (NLP) pipeline.


"We've had great experiences with Samza at Improve Digital where it has enabled us to build out our streaming data platform," said Garry Turkington, CTO of Improve Digital. "It's fantastic to see it graduate to a top-level project."

Jay Kreps, CEO of Confluent, said, "Samza is a fantastic piece of infrastructure, and a great complement to Apache Kafka. We at Confluent are really excited to see it added as a top-level Apache project."

"Fortscale has been using Apache Samza successfully to build online machine learning algorithms and detect insider threats," said Dotan Patrich, Software Architect at Fortscale. "It's been a great experience building a large-scale streaming solution using Samza and enjoying its unique state management architecture. It's fantastic to see it graduate to a Top-Level Project."

"I've been involved in Apache Samza's community since its inception. It's been thrilling to watch the community grow, and I'm very proud and excited to see that the project is graduating. Samza has a bright future, and I'm looking forward to what's to come," added Riccomini.

Availability and Oversight
As with all Apache products, Apache Samza software is released under the Apache License v2.0, and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For documentation and ways to become involved with Apache Samza, visit http://samza.apache.org/ and follow @SamzaStream on Twitter.

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server, the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 500 individual Members and 4,500 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Budget Direct, Cerner, Citrix, Cloudera, Comcast, Facebook, Google, Hortonworks, HP, Huawei, IBM, InMotion Hosting, iSigma, Matt Mullenweg, Microsoft, Pivotal, Produban, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ or follow https://twitter.com/TheASF.

© The Apache Software Foundation. "Apache", "Apache Samza", "Samza", "Apache Hadoop", "Hadoop", "Hadoop YARN", "Apache Kafka", "Kafka", "ApacheCon", and the Apache Samza logo are trademarks of The Apache Software Foundation. All other brands and trademarks are the property of their respective owners.

# # #
