The Apache Software Foundation Blog
The Apache Software Foundation Announces Apache™ Tajo™ v0.9
Forest Hill, MD –21 October 2014– The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 200 Open Source projects and initiatives, announced today the availability of Apache™ Tajo™ v0.9, the advanced Open Source data warehousing system in Apache Hadoop™.
Dubbed an "SQL-on-Hadoop" solution, Apache Tajo is used for low-latency and scalable ad-hoc queries, online aggregation, and ETL (extract-transform-load process) on large data sets stored on HDFS (Hadoop Distributed File System) and other data sources. By supporting SQL standards and leveraging advanced database techniques, Tajo allows direct control of distributed execution and data flow across a variety of query evaluation strategies and optimization opportunities. Overall, Apache Tajo v0.9 delivers more powerful native SQL support on an even faster platform.
Features and enhancements in Apache Tajo v0.9 include:
- More comprehensive and powerful SQL capabilities, such as TIMESTAMP, DATE, TIME, and INTERVAL type support, as well as WINDOW functions, OVER clause support, and multiple distinct aggregation;
- Performance improvements, such as an off-heap sort algorithm for ORDER BY and runtime code generation for evaluating expressions, that push the boundaries of query speed on massive data sets;
- Improvements to the hash shuffle I/O, boosting bottom-line speeds by 200-300% on "heavy", complex queries;
- Enhanced Hadoop integration, including support for Hadoop 2.2.0 up to Hadoop 2.5.1, and expanded Hive Metastore access;
- An improved catalog backup and restore feature, as well as accessibility enhancements that streamline performance across disparate technology environments.
Hyoungjun Kim, CTO of Gruter, said "We run Apache Tajo in-house on 30 cluster nodes in order to power Seenal, our social network analysis service that supplies social media insight to government and corporate clients. On the one hand, this involves running complex ETL processes on hundreds of gigabytes of data per day in order to detect market and opinion signals. On the other hand, analysts and project teams often need to run very specific analyses on much smaller data sets. Tajo is able to handle the full spectrum of Seenal’s data processing and query needs at high speed and with minimal fuss."
"We're very excited about the release of Apache Tajo 0.9," said Hyunsik Choi, Vice President of Apache Tajo. "The Apache Tajo community, committers, and supporters have really done our mission proud."
Availability and Oversight
As with all Apache products, Apache Tajo software is released under the Apache License v2.0, and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Tajo, visit http://tajo.apache.org/ and https://twitter.com/ApacheTajo
About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than two hundred leading Open Source projects, including Apache HTTP Server, the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 450 individual Members and 4,000 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Budget Direct, Citrix, Cloudera, Comcast, Facebook, Google, Hortonworks, HP, Huawei, IBM, InMotion Hosting, Matt Mullenweg, Microsoft, Pivotal, Produban, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ or follow @TheASF on Twitter.
"Apache", "Apache Hadoop", "Hadoop", "Apache Tajo", "Tajo", "ApacheCon", and the Apache Tajo logo are trademarks of The Apache Software Foundation. All other brands and trademarks are the property of their respective owners.
# # #