Apache Drill Blog
Announcing the Apache Drill Beta Release, Self Service Data Exploration in Action
It is our pleasure to announce the 0.5.0 release of Apache Drill. This is Drill’s first beta release and the second in our iterative monthly release cycle. It addresses more than 100 issues since last month’s release, bringing the total to more than 1,000 since Drill’s inception. This is a great release to start exploring your data, wherever and whatever it is.
For more background on what Drill is about, check out the Drill overview or Drill in 10 minutes. The 0.5.0 release builds upon the huge 0.4.0 release, so you should refer to last month’s release for information on all the functionality available. Notable features in 0.5.0 include the following:
- Drill now uses the Hadoop 2.4.1 APIs. This includes upgrading Parquet to use direct memory and the ability to write larger Parquet files when using CREATE TABLE AS.
- Improved JOIN planning for HBase tables, based on row count approximations derived from region-level statistics.
- Improved handling of large sorts and out of memory conditions.
- JSON projection pushdown, an all-text JSON mode, and boolean short circuit evaluation. Each of these features allows more flexibility when interacting with complicated JSON files.
- Substantial improvements in SELECT * handling when interacting with schemaless data sources.
- Creation of a self contained JDBC JAR file to ease access to Drill from JDBC tools.
- Fully distributed execution of all basic aggregates including standard deviation and avg.
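To give a feel for what distributed execution of avg and standard deviation involves, here is a minimal Python sketch of the general two-phase technique: each fragment of data produces a small partial state (count, sum, sum of squares), and the partial states are merged before the final value is computed. This is an illustration only, not Drill's implementation; the names `Partial`, `partial_aggregate`, `merge`, and `finalize` are hypothetical, and the sketch assumes the sample (n - 1) form of standard deviation.

```python
import math
from dataclasses import dataclass


@dataclass
class Partial:
    """Mergeable state for avg/stddev (hypothetical name, not a Drill class)."""
    count: int = 0
    total: float = 0.0
    total_sq: float = 0.0


def partial_aggregate(values):
    """Phase 1: runs independently on each fragment of the data."""
    p = Partial()
    for v in values:
        p.count += 1
        p.total += v
        p.total_sq += v * v
    return p


def merge(partials):
    """Phase 2: combine per-fragment states; order does not matter."""
    out = Partial()
    for p in partials:
        out.count += p.count
        out.total += p.total
        out.total_sq += p.total_sq
    return out


def finalize(p):
    """Turn the merged state into (avg, sample stddev)."""
    if p.count == 0:
        return None, None
    avg = p.total / p.count
    if p.count < 2:
        return avg, None  # sample stddev is undefined for a single value
    variance = (p.total_sq - p.total * p.total / p.count) / (p.count - 1)
    # Clamp tiny negative values caused by floating-point rounding.
    return avg, math.sqrt(max(variance, 0.0))


# Three fragments standing in for data spread across nodes.
fragments = [[2, 4, 4], [4, 5], [5, 7, 9]]
avg, stddev = finalize(merge(partial_aggregate(f) for f in fragments))
print(avg, stddev)
```

The sum-of-squares form is the simplest state to merge, but it can lose precision on large or tightly clustered values; production engines often use a more numerically stable merge formula.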
Drill will continue its march towards GA, with upcoming monthly releases hardening and expanding Drill’s capabilities and performance. Check out the release notes, download it, or better yet, make your own fork and contribute back to the community. Together, we can make data available to everyone, anywhere.
-The Apache Drill Team