The Apache Software Foundation Blog
Tuesday July 22, 2014
The Apache Software Foundation Announces Apache™ Tez™ as a Top-Level Project
Highly efficient Open Source framework for Apache Hadoop® YARN-powered data processing applications in use at Microsoft, NASA, Netflix, and Yahoo, among others.
Forest Hill, MD – 22 July 2014 – The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 170 Open Source projects and initiatives, announced today that Apache™ Tez™ has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles.
"Graduation to a top-level Apache project is a significant validation of the community momentum behind Tez," said Hitesh Shah, Vice President of Apache Tez.
Apache Tez is an embeddable and extensible framework for building high-performance batch and interactive data processing engines and tools that require out-of-the-box integration with Apache Hadoop® YARN. Tez leverages Hadoop’s unparalleled ability to process petabyte-scale datasets, allowing projects in the Apache Hadoop ecosystem (such as Apache Hive and Apache Pig) and third-party software vendors to express fit-to-purpose data processing logic in a way that meets their unique demands for fast response times and extreme throughput.
Tez's customizable execution architecture enables scalable, purpose-built data-processing computations, and also allows for dynamic performance optimizations based on real information about the data and the resources required to process it.
Tez was originally developed by Hortonworks, and entered the Apache Incubator in February 2013. The project currently has code contributions from individuals representing Cloudera, Facebook, Hortonworks, LinkedIn, Microsoft, Twitter, and Yahoo.
"I'm really happy to see the graduation of Apache Tez from the Incubator. The community has worked diligently to get to this point," said Chris Mattmann, Apache Tez Incubator Mentor, and Chief Architect, Instrument and Science Data Systems Section at NASA JPL. "Tez makes queries on Hadoop databases like Hive interactive, instead of batch oriented. Tez is similar to recently graduated projects in the Apache Big Data ecosystem including Apache Spark and also Apache Tajo, projects with similar goals of speeding up queries in Hadoop. My data science team at NASA is looking at Tez, Spark, and Tajo and evaluating them on projects in climate science and in radio astronomy."
"Netflix builds its big data analytics platform in the cloud by leveraging open source technologies such as Apache Hadoop, Hive, Pig and more," said Cheolsoo Park, Senior Software Engineer at Netflix and Vice President of Apache Pig. "While MapReduce has served us well for years, Tez is a welcome improvement. Netflix has made significant contributions to the development of Pig-on-Tez alongside Hortonworks, LinkedIn, and Yahoo. Based on our initial benchmark of Pig-on-Tez, it is nearly twice as fast as MapReduce for some of our heavy production jobs. This is a huge improvement in efficiency. We look forward to deploying Pig-on-Tez in production this year. We thank the Tez community for all your help and are excited that Tez has become an Apache top-level project."
"Yahoo's business is built on Hadoop; it's essential to our ability to deliver personalized, delightful experiences for our users and create value for our advertisers," said Peter Cnudde, Vice President of Engineering, Yahoo. "We're committed to working closely with the Apache community to evolve the processing of Big Data at scale with technologies such as Apache Hive, Tez, and YARN."
"It's fantastic to see Tez promoted to a top-level Apache project. Microsoft has invested in improving Hive performance by bringing innovation used in SQL Server to Hadoop, through contributions to Tez," said Eric Hanson, Principal Engineer in the HDInsight team at Microsoft and an Apache Hive Committer. "Hive on Tez enables major performance improvements of up to 100x, and we're happy it's available now on Microsoft Azure HDInsight, our Hadoop-based solution for the cloud."
"Tez is on its way to becoming a cornerstone of core Apache projects like Apache Hive and Apache Pig and has been embraced by other important Open Source projects like Cascading. We look forward to continuing to grow our community and driving Tez adoption," added Shah.
Availability and Oversight
As with all Apache products, Apache Tez software is released under the Apache License v2.0, and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project’s day-to-day operations, including community development and product releases. For documentation and ways to become involved with Apache Tez, visit http://tez.apache.org/
About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 170 leading Open Source projects, including Apache HTTP Server, the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 400 individual Members and 3,500 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide. Thousands of software solutions are distributed under the Apache License, and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Budget Direct, Citrix, Cloudera, Comcast, Facebook, Google, Hortonworks, HP, Huawei, IBM, InMotion Hosting, Matt Mullenweg, Microsoft, Pivotal, Produban, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ or follow @TheASF on Twitter.
"Apache", "Hadoop", "Apache Hadoop", "Hive", "Apache Hive", "MapReduce", "Hadoop MapReduce", "Pig", "Apache Pig", "Tez", "Apache Tez", and "ApacheCon" are trademarks of The Apache Software Foundation. All other brands and trademarks are the property of their respective owners.
# # #