The Apache Software Foundation Blog

Friday June 23, 2017

The Apache News Round-up: week ending 23 June 2017

Another great week from the Apache community! Here's what happened:

Support Apache –help secure the future of free, community-driven software. Every dollar counts. http://apache.org/foundation/contributing.html

ASF Board –management and oversight of the business and affairs of the corporation in accordance with the Foundation's bylaws.
 - Next Board Meeting: 19 July 2017. Board calendar and minutes http://apache.org/foundation/board/calendar.html

ASF Infrastructure –our distributed team on four continents keeps the ASF's infrastructure running around the clock.
 - 7M+ weekly checks yield cheery performance at 97.28% uptime http://status.apache.org/

Apache Incubator –the entry path for codebases and their communities to become an official part of the ASF.
 - Welcome New Podlings: Livy, Pulsar, and Superset http://incubator.apache.org/

Apache HTTP Server™ –the world's most popular Web server.
 - Apache HTTP Server 2.4.26 released http://httpd.apache.org/

Apache HBase™ –an Open Source, distributed, versioned, non-relational database.
 - Apache HBase 1.1.11 released https://hbase.apache.org/

Apache Impala (incubating) –a high-performance C++ and Java SQL query engine for data stored in Apache Hadoop-based clusters.
 - Apache Impala 2.9.0 released https://impala.incubator.apache.org/

Apache Jackrabbit™ Oak –a scalable, high-performance hierarchical content repository designed for use as the foundation of modern world-class Web sites and other demanding content applications.
 - Apache Jackrabbit Oak 1.7.2 released https://jackrabbit.apache.org/

Apache Joshua (incubating) –a statistical machine translation decoder for phrase-based, hierarchical, and syntax-based machine translation, written in Java.
 - Apache Joshua 6.1 released https://joshua.apache.org

Apache Lucene™ Solr™ –the search server built on Apache Lucene.
 - Apache Solr Reference Guide for 6.6 released https://www.apache.org/dyn/closer.cgi/lucene/solr/ref-guide/apache-solr-ref-guide-6.6.pdf

Apache Mnemonic (incubating) –an advanced hybrid memory storage oriented library.
 - Apache Mnemonic-0.8.0-incubating released http://mnemonic.incubator.apache.org/

Apache Pig™ –provides a high-level data-flow language and execution framework for parallel computation on Apache Hadoop clusters.
 - Apache Pig 0.17.0 released http://pig.apache.org/

Apache RocketMQ (incubating) –a distributed messaging and streaming platform with low latency, high performance and reliability, trillion-level capacity and flexible scalability.
 - Apache RocketMQ 4.1.0 released https://rocketmq.apache.org/

Did You Know?

 - Did you know that Apache Mahout has both Apache Flink and Apache Spark GPU compute? http://mahout.apache.org/

 - Did you know that Apache Tinkerpop's Gremlin is not just a graph traversal language, but a graph in itself? http://tinkerpop.apache.org/

 - Did you know that Xiaomi uses Apache HBase and Thrift across 4 data centers and 40+ clusters? http://hbase.apache.org/ http://thrift.apache.org/

Apache Community Notices:

 - "Success at Apache" focuses on the processes behind why the ASF "just works". 1) Project Independence https://s.apache.org/CE0V 2) All Carrot and No Stick https://s.apache.org/ykoG 3) Asynchronous Decision Making https://s.apache.org/PMvk 4) Rule of the Makers https://s.apache.org/yFgQ 5) JFDI --the unconditional love of contributors https://s.apache.org/4pjM 6) Meritocracy and Me https://s.apache.org/tQQh 7) Learning to Build a Stronger Community https://s.apache.org/x9Be

 - Presentations from ApacheCon https://s.apache.org/Hli7 and Apache: Big Data https://s.apache.org/tefE are available, as are videos https://s.apache.org/AE3m and audio recordings https://feathercast.apache.org/

 - Check out the Apache Community Development blog https://blogs.apache.org/comdev/

 - Do friend and follow us on the Apache Community Facebook page https://www.facebook.com/ApacheSoftwareFoundation/ and Twitter account https://twitter.com/ApacheCommunity

 - Apache ActiveMQ Call For Logo https://blogs.apache.org/activemq/entry/apache-activemq-call-for-logo

 - The list of Apache project-related MeetUps can be found at http://apache.org/events/meetups.html

 - The CloudStack European User Group will be held 17 August in London https://www.eventbrite.co.uk/e/cloudstack-european-user-group-tickets-35565783215

 - Catch the Apache Ignite and Spark communities at the In-Memory Computing Summit 24-25 October in San Francisco https://imcsummit.org/

 - ASF Operations Summary - Q3 FY2017 https://s.apache.org/NKFz

 - Find out how you can participate with Apache community/projects/activities --opportunities open with Apache HTTP Server, Avro, ComDev (community development), Directory, Incubator, OODT, POI, Polygene, Syncope, Tika, Trafodion, and more! https://helpwanted.apache.org/

 - Are your software solutions Powered by Apache? Download & use our "Powered By" logos http://www.apache.org/foundation/press/kit/#poweredby

= = =

For real-time updates, sign up for Apache-related news by sending mail to announce-subscribe@apache.org and follow @TheASF on Twitter. For a broader spectrum from the Apache community, https://twitter.com/PlanetApache provides an aggregate of Project activities as well as the personal blogs and tweets of select ASF Committers.

# # #

Friday June 16, 2017

The Apache News Round-up: week ending 16 June 2017

Happy Friday from the Apache community! Here's what we've been up to over the past week:

Support Apache –billions of users depend on Apache's free, community-driven software. Every dollar counts. http://apache.org/foundation/contributing.html

ASF Board –management and oversight of the business and affairs of the corporation in accordance with the Foundation's bylaws.
 - Next Board Meeting: 21 June 2017. Board calendar and minutes http://apache.org/foundation/board/calendar.html

ApacheCon™ –the official conference of the Apache Software Foundation. Tomorrow's Technology Today.
 - Presentations from ApacheCon https://s.apache.org/Hli7 and Apache: Big Data https://s.apache.org/tefE
 - Videos of keynotes + presentations https://s.apache.org/AE3m and Audio recordings https://feathercast.apache.org/

ASF Infrastructure –our distributed team on four continents keeps the ASF's infrastructure running around the clock.
 - 7M+ weekly checks yield savvy performance at 98.54% uptime http://status.apache.org/

Apache Incubator –the entry path for codebases and their communities to become an official part of the ASF.
 - Welcome New Podlings: Livy, Pulsar, and Superset http://incubator.apache.org/

Apache Ant™ Compress –offers tasks and types for archive and compression formats.
 - Apache Compress Antlib 1.5 released http://ant.apache.org/antlibs/compress/

Apache Arrow™ –a columnar in-memory analytics layer designed to accelerate Big Data.
 - Apache Arrow 0.4.1 released http://arrow.apache.org/

Apache Commons™ FileUpload –parses HTTP requests which conform to RFC 1867, "Form-based File Upload in HTML".
 - Apache Commons FileUpload 1.3.3 released http://commons.apache.org/proper/commons-fileupload/

Apache Commons™ Lang –provides helper utilities for the java.lang API, notably String manipulation methods, basic numerical methods, object reflection, concurrency, creation and serialization and System properties.
 - Apache Commons Lang 3.6 released http://www.apache.org/dist/commons/lang/

Apache Directory™ DS –an extensible and embeddable directory server entirely written in Java, which has been certified LDAPv3 compatible by the Open Group.
 - ApacheDS 2.0.0-M24 released http://directory.apache.org/apacheds

Apache Fluo (incubating) –adds distributed transactions to Apache Accumulo, and provides an observer/notification framework to allow users to incrementally update large data sets stored in Accumulo.
 - Apache Fluo 1.1.0-incubating released https://fluo.apache.org/

Apache Groovy™ –a multi-faceted programming language for the JVM.
 - Apache Groovy-2.5.0-beta-1 released https://groovy.apache.org/

Apache Jackrabbit™ –a fully compliant implementation of the Content Repository for Java(TM) Technology API, version 2.0 (JCR 2.0) as specified in the Java Specification Request 283 (JSR 283).
 - Apache Jackrabbit 2.15.3 and Jackrabbit Oak 1.6.2 and 1.7.1 released https://jackrabbit.apache.org/

Apache Kudu™ –an Open Source storage engine for structured data that supports low-latency random access together with efficient analytical access patterns.
 - Apache Kudu 1.4.0 released http://kudu.apache.org/

Apache NiFi™ –an easy to use, powerful, and reliable system to process and distribute data.
 - Apache NiFi 0.7.4 and 1.3.0 released https://nifi.apache.org/
 - Apache NiFi CVE-2017-7667 and CVE-2017-7665 http://mail-archives.apache.org/mod_mbox/www-announce/201706.mbox/%3CCAFddr25eFkXCOQGwyN4B4VVNjdVYLcKya_JCaW%3Dd%3D11%3DQkyd4g%40mail.gmail.com%3E

Apache Portable Runtime™ –provides cross-platform APIs that relieve developers of the need to deal with platform differences.
 - Apache Portable Runtime and Utilities 1.6 released https://apr.apache.org/

Apache Sling™ –a Web framework that uses a Java Content Repository, such as Apache Jackrabbit, to store and manage content. 
 - Apache Sling 9 released https://sling.apache.org/

Apache Zeppelin™ –a collaborative data analytics and visualization tool for distributed, general-purpose data processing systems such as Apache Spark, Apache Flink, etc.
 - Apache Zeppelin 0.7.2 released http://zeppelin.apache.org/

Did You Know?

 - Did you know that the Symphony communications and messaging platform uses Apache Cassandra, HBase, Kafka, Solr, Tomcat, and the Apache License? http://cassandra.apache.org/ http://hbase.apache.org/ http://kafka.apache.org/ http://lucene.apache.org/solr http://tomcat.apache.org/ https://www.apache.org/licenses/LICENSE-2.0

 - Did you know that the Apache Drill distributed SQL engine brings flexibility and agility to data lakes across the Hadoop ecosystem? http://drill.apache.org/

 - Did you know that we have 10 project birthdays this month? Happy Apache Anniversary to SpamAssassin (13 yrs); Santuario (12 yrs); Commons and Wicket (10 yrs); Sling (8 yrs); Karaf (7 yrs); Flume and VCL (5 yrs); Mesos (4 yrs); and Twill (1 year) --many happy returns to all! https://projects.apache.org/

Apache Community Notices:

 - "Success at Apache" focuses on the processes behind why the ASF "just works". 1) Project Independence https://s.apache.org/CE0V 2) All Carrot and No Stick https://s.apache.org/ykoG 3) Asynchronous Decision Making https://s.apache.org/PMvk 4) Rule of the Makers https://s.apache.org/yFgQ 5) JFDI --the unconditional love of contributors https://s.apache.org/4pjM 6) Meritocracy and Me https://s.apache.org/tQQh 7) Learning to Build a Stronger Community https://s.apache.org/x9Be

 - Check out the Apache Community Development blog https://blogs.apache.org/comdev/

 - Do friend and follow us on the Apache Community Facebook page https://www.facebook.com/ApacheSoftwareFoundation/ and Twitter account https://twitter.com/ApacheCommunity

 - Apache ActiveMQ Call For Logo https://blogs.apache.org/activemq/entry/apache-activemq-call-for-logo

 - Catch the Apache Ignite and Spark communities at the In-Memory Computing Summit 20-21 June in Amsterdam and 24-25 October in San Francisco https://imcsummit.org/

 - ASF Operations Summary - Q3 FY2017 https://s.apache.org/NKFz

 - The list of Apache project-related MeetUps can be found at http://apache.org/events/meetups.html

 - Find out how you can participate with Apache community/projects/activities --opportunities open with Apache HTTP Server, Avro, ComDev (community development), Directory, Incubator, OODT, POI, Polygene, Syncope, Tika, Trafodion, and more! https://helpwanted.apache.org/

 - Are your software solutions Powered by Apache? Download & use our "Powered By" logos http://www.apache.org/foundation/press/kit/#poweredby


# # #

Friday June 09, 2017

The Apache News Round-up: week ending 9 June 2017

Another week has passed with the Apache community collaborating in full force:

Support Apache –help sustain your favorite Apache project for less than $14/day. Every dollar counts. http://apache.org/foundation/contributing.html
 - new ASF VP Fundraising Kevin McGrail on his goals for the coming year https://feathercast.apache.org/2017/05/18/kevin-mcgrail-fundraising-and-apachecon-north-america/

ASF Board –management and oversight of the business and affairs of the corporation in accordance with the Foundation's bylaws.
 - Next Board Meeting: 21 June 2017. Board calendar and minutes http://apache.org/foundation/board/calendar.html

Success at Apache –monthly blog series that focuses on the processes behind why the ASF "just works".
 - Learning to Build a Stronger Community by John Ament https://s.apache.org/x9Be

ApacheCon™ –the official conference of the Apache Software Foundation. Tomorrow's Technology Today.
 - Presentations from ApacheCon https://s.apache.org/Hli7 and Apache: Big Data https://s.apache.org/tefE
 - Videos of keynotes + presentations https://s.apache.org/AE3m and Audio recordings + soundbites from the conference floor https://feathercast.apache.org/

ASF Infrastructure –our distributed team on four continents keeps the ASF's infrastructure running around the clock.
 - 7M+ weekly checks yield skipping performance at 99.96% uptime http://status.apache.org/

Apache Directory™ LDAP API –an ongoing effort to provide an enhanced LDAP API, as a replacement for JNDI and the existing LDAP API (jLdap and Mozilla LDAP API).
 - Apache Directory LDAP API 1.0.0 released http://directory.apache.org/api

Apache Groovy™ –a multi-faceted programming language for the JVM.
 - Apache Groovy-2.5.0-beta-1 released https://groovy.apache.org/

Apache Hadoop™ –the cornerstone of the Big Data ecosystem, from which dozens of Apache Big Data projects and countless industry solutions originate.
 - The Apache Software Foundation Announces Momentum With Apache® Hadoop® v2.8 https://s.apache.org/h0Tl

Apache HBase™ –an Open Source, distributed, versioned, non-relational database.
 - Apache HBase 1.2.6 released https://hbase.apache.org/

Apache Jackrabbit™ Oak –a scalable, high-performance hierarchical content repository designed for use as the foundation of modern world-class Web sites and other demanding content applications.
 - Apache Jackrabbit Oak 1.2.26 and 1.4.16 released http://jackrabbit.apache.org/

Apache Lucene™ –a high-performance, full-featured text search engine library written entirely in Java.
 - Apache Lucene 6.6.0 and Apache Solr 6.6.0 released http://lucene.apache.org/

Apache Tomcat™ –an Open Source software implementation of the Java Servlet, JavaServer Pages, Java Unified Expression Language, Java WebSocket and JASPIC technologies.
 - CVE-2017-5664 Apache Tomcat Security Constraint Bypass http://mail-archives.apache.org/mod_mbox/www-announce/201706.mbox/%3C3abc830a-69e1-9ce1-27c8-1eaf9c2d6739%40apache.org%3E

Did You Know?

 - Did you know that if it's not at *.apache.org, it's not from us? https://s.apache.org/QviH

 - Did you know that Apache CouchDB was one of the first Apache projects to use git? https://blog.couchdb.org/2017/06/06/couchdb-developer-profile-joan-touzet/

 - Did you know that nearly half of businesses' security breaches are due to the Internet Of Things? Apache Spot (incubating) can help! http://spot.incubator.apache.org/

Apache Community Notices:

 - "Success at Apache" focuses on the processes behind why the ASF "just works". 1) Project Independence https://s.apache.org/CE0V 2) All Carrot and No Stick https://s.apache.org/ykoG 3) Asynchronous Decision Making https://s.apache.org/PMvk 4) Rule of the Makers https://s.apache.org/yFgQ 5) JFDI --the unconditional love of contributors https://s.apache.org/4pjM 6) Meritocracy and Me https://s.apache.org/tQQh 7) Learning to Build a Stronger Community https://s.apache.org/x9Be

 - The latest Apache Community Newsletter https://blogs.apache.org/comdev/entry/community-development-news-april-2017

 - Do friend and follow us on the Apache Community Facebook page https://www.facebook.com/ApacheSoftwareFoundation/ and Twitter account https://twitter.com/ApacheCommunity

 - Apache ActiveMQ Call For Logo https://blogs.apache.org/activemq/entry/apache-activemq-call-for-logo

 - Join members of the Apache Apex, Beam, Flink, Hadoop, Kafka, Lucene, Solr, and Spark communities at Berlin Buzzwords 11-13 June in Berlin https://berlinbuzzwords.de/17/

 - The Apache Phoenix community will be holding PhoenixCon on 13 June in San Francisco https://www.eventbrite.com/e/phoenixcon-2017-tickets-32872245772

 - Will we be seeing you at HBaseCon 12 June/Mountain View https://www.eventbrite.com/e/hbasecon-west-2017-tickets-33101238696 and PhoenixCon 13 June/San Francisco https://www.eventbrite.com/e/phoenixcon-2017-tickets-32872245772 ?

 - Meet members of Apache's Cloud community at Cloud Foundry Summit Silicon Valley 13-15 June in Santa Clara; enjoy 20% off registration rates using discount code CFSV17ASF20 https://goo.gl/Uq3g0t

 - Catch the Apache Ignite and Spark communities at the In-Memory Computing Summit 20-21 June in Amsterdam and 24-25 October in San Francisco https://imcsummit.org/

 - ASF Operations Summary - Q3 FY2017 https://s.apache.org/NKFz

 - The list of Apache project-related MeetUps can be found at http://apache.org/events/meetups.html

 - Find out how you can participate with Apache community/projects/activities --opportunities open with Apache HTTP Server, Avro, ComDev (community development), Directory, Incubator, OODT, POI, Polygene, Syncope, Tika, Trafodion, and more! https://helpwanted.apache.org/

 - Are your software solutions Powered by Apache? Download & use our "Powered By" logos http://www.apache.org/foundation/press/kit/#poweredby


# # #

Monday June 05, 2017

The Apache Software Foundation Announces Momentum With Apache® Hadoop® v2.8

Major release of the cornerstone of the Big Data ecosystem, from which dozens of Apache Big Data projects and countless industry solutions originate.

Forest Hill, MD —5 June 2017— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today momentum with Apache® Hadoop® v2.8, the latest version of the Open Source software framework for reliable, scalable, distributed computing.

Now ten years old, Apache Hadoop dominates the greater Big Data ecosystem as the flagship project and community amongst the ASF's more than three dozen projects in the category.

"Apache Hadoop 2.8 maintains the project's momentum in its stable release series," said Chris Douglas, Vice President of Apache Hadoop. "Our community of users, operators, testers, and developers continue to evolve the thriving Big Data ecosystem at the ASF. We're committed to sustaining the scalable, reliable, and secure platform our greater Hadoop community has built over the last decade."

Apache Hadoop supports processing and storage of extremely large data sets in a distributed computing environment. The project has been regularly lauded by industry analysts worldwide for driving market transformation. Forrester Research estimates that firms will spend US$800M on Hadoop software and related services in 2017. According to Zion Market Research, the global Hadoop market is expected to reach approximately US$87.14B by 2022, growing at a CAGR of around 50% between 2017 and 2022.

Apache Hadoop 2.8 is the result of 2 years of extensive collaborative development from the global Apache Hadoop community. With 2,914 commits delivering new features, improvements, and bug fixes since v2.7, highlights include:
  • Several important security-related enhancements, including Hadoop UI protection against Cross-Frame Scripting (XFS), an attack that combines malicious JavaScript with an iframe that loads a legitimate page in an effort to steal data from an unsuspecting user, and Hadoop REST API protection against Cross-Site Request Forgery (CSRF) attacks, which attempt to force an authenticated user to execute functionality without their knowledge.

  • Support for Microsoft Azure Data Lake as a source and destination of data. This benefits anyone deploying Hadoop in Microsoft's Azure Cloud. The Azure Data Lake service was actually developed for Hadoop and analytics workloads.

  • The "S3A" client for working with data stored in Amazon S3 has been radically enhanced for scalability, performance, and security. The performance enhancements were driven by Apache Hive and Apache Spark benchmarks. In Hive TPC-DS benchmarks, Apache Hadoop is currently faster working with columnar data stored in S3 than Amazon EMR's closed-source connector. This shows the benefit of collaborative Open Source development.

  • Several WebHDFS-related enhancements, including an integrated CSRF prevention filter, OAuth2 support, and the ability to allow or disallow snapshots via WebHDFS, among others.

  • Integration with other applications has been improved by shipping the HDFS client in its own JAR, separate from the hadoop-hdfs JAR that contains all the server-side code. Downstream projects that access HDFS can depend on the hadoop-hdfs-client module to reduce the number of transitive classpath dependencies.

  • YARN NodeManager resource reconfiguration through the RM Admin CLI on a live cluster, which gives YARN clusters a more flexible resource model, especially for Cloud deployments.
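
For readers unfamiliar with the CSRF protection mentioned in the highlights above, the general idea behind a custom-header check can be sketched in a few lines of Python. This is a generic illustration only, not Hadoop's filter code: the function name, the list of methods treated as safe, and the header name are all assumptions made for the sketch.

```python
# Illustrative sketch of a custom-header CSRF check.
# Not Hadoop's implementation; every name below is an assumption.

# Assumed: HTTP methods that do not change server state and are let through.
SAFE_METHODS = frozenset({"GET", "OPTIONS", "HEAD", "TRACE"})

# Assumed header name for this sketch.
CSRF_HEADER = "X-XSRF-HEADER"


def is_request_allowed(method, headers):
    """Return True if the request passes the custom-header check.

    State-changing requests must carry the custom header. A plain
    cross-site HTML form cannot add custom headers, which is what
    makes the header's presence a useful signal.
    """
    if method.upper() in SAFE_METHODS:
        return True
    # HTTP header names are case-insensitive, so normalise before comparing.
    present = {name.lower() for name in headers}
    return CSRF_HEADER.lower() in present
```

In this sketch a state-changing request such as PUT or DELETE is rejected unless the caller deliberately adds the header, while read-only requests pass unchanged.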

In addition to physical Hadoop clusters, where the majority of storage and computation lies, Apache Hadoop is very popular within Cloud infrastructures. Contributions from Apache Hadoop's diverse community include improvements provided by Cloud infrastructure vendors and large Hadoop-in-Cloud users. These improvements, Azure and S3 storage and YARN reconfiguration in particular, improve Hadoop's deployment on and integration with Cloud infrastructures. The improvements in Hadoop 2.8 enable Cloud-deployed clusters to be more dynamic in sizing, adapting to demand by scaling up and down.

"My colleagues and I are happy that tests of Apache Hive and Hadoop 2.8 show that we are able to provide a similar experience reading data in from S3 as Amazon EMR, with its closed-source fork/rewrite of S3," said Steve Loughran, member of the Apache Hadoop Project Management Committee.

Hailed as a "Swiss army knife of the 21st century" by the Media Guardian Innovation Awards and "the most important software you’ve never heard of…helped enable both Big Data and Cloud computing" by author Thomas Friedman, Apache Hadoop is used by an array of companies such as Alibaba, Amazon Web Services, AOL, Apple, eBay, Facebook, foursquare, IBM, HP, LinkedIn, Microsoft, Netflix, The New York Times, Rackspace, SAP, Tencent, Teradata, Tesla Motors, Uber, and Twitter. Yahoo, an early pioneer, hosts the world's largest known Hadoop production environment to date, spanning more than 38,000 nodes.

Catch Apache Hadoop in action at DataWorks Summit 13-15 June 2017 in San Jose, CA.

Availability and Oversight
Apache Hadoop software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Hadoop, visit http://hadoop.apache.org/ and https://twitter.com/hadoop

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 680 individual Members and 6,000 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cash Store, Cerner, Cloudera, Comcast, Confluent, Facebook, Google, Hortonworks, HP, Huawei, IBM, InMotion Hosting, iSigma, LeaseWeb, Microsoft, ODPi, PhoenixNAP, Pivotal, Private Internet Access, Produban, Red Hat, Serenata Flowers, Target, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "Hadoop", "Apache Hadoop", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Friday June 02, 2017

The Apache News Round-up: week ending 2 June 2017

Hello, June! We're at the mid-year mark with ongoing activities from the Apache community:

Support Apache –help sustain more than 300 freely-available Open Source projects and dozens of innovations in the Apache Incubator. Every dollar counts. http://apache.org/foundation/contributing.html
 - new ASF VP Fundraising Kevin McGrail on his goals for the coming year https://feathercast.apache.org/2017/05/18/kevin-mcgrail-fundraising-and-apachecon-north-america/

ASF Board –management and oversight of the business and affairs of the corporation in accordance with the Foundation's bylaws.
 - Next Board Meeting: 21 June 2017. Board calendar and minutes http://apache.org/foundation/board/calendar.html

ApacheCon™ –the official conference of the Apache Software Foundation. Tomorrow's Technology Today.
 - Presentations from ApacheCon https://s.apache.org/Hli7 and Apache: Big Data https://s.apache.org/tefE
 - Videos of keynotes + presentations https://s.apache.org/AE3m and Audio recordings + soundbites from the conference floor https://feathercast.apache.org/

ASF Infrastructure –our distributed team on four continents keeps the ASF's infrastructure running around the clock.
 - 7M+ weekly checks yield grand performance at 99.92% uptime http://status.apache.org/

Apache ActiveMQ™ –the most popular and powerful Open Source message broker.
 - Apache ActiveMQ Call For Logo https://blogs.apache.org/activemq/entry/apache-activemq-call-for-logo

Apache Calcite™ Avatica –a framework for building database drivers.
 - Apache Calcite Avatica 1.10.0 released https://calcite.apache.org/avatica/

Apache Fineract™ –an Open Source system for core banking as a platform to offer financial services to the world's 2B under-banked and unbanked.
 - Apache Fineract 1.0.0 released http://fineract.apache.org/

Apache Flink™ –an Open Source stream processing framework for distributed, high-performing, always-available, and accurate data streaming applications.
 - Apache Flink 1.3.0 released https://flink.apache.org/

Apache Jackrabbit™ –a fully compliant implementation of the Content Repository for Java(TM) Technology API, version 2.0 (JCR 2.0) as specified in the Java Specification Request 283 (JSR 283).
 - Apache Jackrabbit 2.14.1 and Jackrabbit Oak 1.7.0 released http://jackrabbit.apache.org/

Apache Knox™ –a REST API gateway for interacting with Apache Hadoop clusters.
 - CVE-2017-5646: Apache Knox Impersonation Issue for WebHDFS http://mail-archives.apache.org/mod_mbox/www-announce/201705.mbox/%3CCACRbFyjtT7QQGHUzTRdbJoySbJb7tt4BDk5-r-VRn0GB0Kgvag%40mail.gmail.com%3E

Apache Tephra (incubating) –a transaction engine for distributed data stores like Apache HBase.
 - Apache Tephra-0.12.0-incubating released http://tephra.incubator.apache.org/

Apache Tika™ –a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries.
 - Apache Tika 1.15 released http://tika.apache.org/


Did You Know?

 - Did you know that the CFP for MesosCon North America closes on June 3rd? Submit your proposals to http://bit.ly/2rbVJhY

 - Did you know that ARGO Labs uses Apache Airflow (incubating) to automate Extract, Transform, Load (ETL) workflows for California water data from The California Data Collaborative? http://incubator.apache.org/projects/airflow.html

 - Did you know Dutch multinational banking company Rabobank uses Apache Kafka for real-time financial alerts? http://kafka.apache.org/

Apache Community Notices:

 - "Success at Apache" is a blog series that focuses on the processes behind why the ASF "just works". 1) Project Independence https://s.apache.org/CE0V 2) All Carrot and No Stick https://s.apache.org/ykoG 3) Asynchronous Decision Making https://s.apache.org/PMvk 4) Rule of the Makers https://s.apache.org/yFgQ 5) JFDI --the unconditional love of contributors https://s.apache.org/4pjM 6) Meritocracy and Me https://s.apache.org/tQQh

 - The latest Apache Community Newsletter https://blogs.apache.org/comdev/entry/community-development-news-april-2017

 - Do friend and follow us on the Apache Community Facebook page https://www.facebook.com/ApacheSoftwareFoundation/ and Twitter account https://twitter.com/ApacheCommunity

 - The Apache Phoenix community will be holding PhoenixCon on 13 June in San Francisco https://www.eventbrite.com/e/phoenixcon-2017-tickets-32872245772

 - Join members of the Apache Apex, Beam, Flink, Hadoop, Kafka, Lucene, Solr, and Spark communities at Berlin Buzzwords 11-13 June in Berlin https://berlinbuzzwords.de/17/

 - Will we be seeing you at HBaseCon 12 June/Mountain View https://www.eventbrite.com/e/hbasecon-west-2017-tickets-33101238696 and PhoenixCon 13 June/San Francisco https://www.eventbrite.com/e/phoenixcon-2017-tickets-32872245772 ?

 - Meet members of Apache's Cloud community at Cloud Foundry Summit Silicon Valley 13-15 June in Santa Clara; enjoy 20% off registration rates using discount code CFSV17ASF20 https://goo.gl/Uq3g0t

 - Catch the Apache Ignite and Spark communities at the In-Memory Computing Summit 20-21 June in Amsterdam and 24-25 October in San Francisco https://imcsummit.org/

 - ASF Operations Summary - Q3 FY2017 https://s.apache.org/NKFz

 - The list of Apache project-related MeetUps can be found at http://apache.org/events/meetups.html

 - Find out how you can participate with Apache community/projects/activities --opportunities open with Apache HTTP Server, Avro, ComDev (community development), Directory, Incubator, OODT, POI, Polygene, Syncope, Tika, Trafodion, and more! https://helpwanted.apache.org/

 - Are your software solutions Powered by Apache? Download & use our "Powered By" logos http://www.apache.org/foundation/press/kit/#poweredby


# # #

Wednesday May 31, 2017

The Apache Software Foundation Announces Apache® SystemML™ as a Top-Level Project

Open Source Big Data machine learning platform in use at Cadent Technology and IBM Watson Health, among other organizations.

Forest Hill, MD –31 May 2017– The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Apache® SystemML™ has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles.

Apache SystemML is a machine learning platform optimal for Big Data that provides declarative, large-scale machine learning and deep learning. SystemML can be run on top of Apache Spark, where it automatically scales data, line by line, to determine whether code should be run on the driver or an Apache Spark cluster.

"Today, the machine learning revolution is leading to thousands of life-altering innovations such as self-driving cars and computers that detect cancer," said Deron Eriksson, Vice President of Apache SystemML. "Apache SystemML enables and simplifies this process by executing optimized high-level algorithms on Big Data using proven technologies such as Apache Spark and Apache Hadoop MapReduce."

The core of Apache SystemML has been created from the ground up with the following design principles in mind: 

  • Performance and Scalability, as SystemML scales up on single nodes, and scales out on large clusters using Apache Spark or Apache Hadoop;
  • "Designed for data scientists", enabling data scientists to develop algorithms in a system with a strong foundation in linear algebra and statistical functions; and 
  • Cost-based optimization for scalable execution plans, that significantly shortens and simplifies the development and deployment cycle of algorithms for varying data characteristics and system configurations.

Using Apache SystemML, data scientists are able to implement algorithms using high-level language concepts without knowledge of distributed programming. Depending on data characteristics such as data size/shape and data sparsity (dense/sparse), and cluster characteristics such as cluster size and memory configurations, SystemML's cost-based optimizing compiler automatically generates hybrid runtime execution plans that are composed of single-node and distributed operations on Apache Spark or Apache Hadoop clusters for best performance.
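
The idea of choosing between single-node and distributed execution from data and cluster characteristics can be sketched in a few lines of Python. This is a toy illustration only, not SystemML's optimizer or API: the names MatrixStats, estimate_bytes, and choose_backend, the per-cell byte costs, and the 2 GiB driver budget are all assumptions made for the example.

```python
# Toy illustration only: not SystemML's optimizer or API.
from dataclasses import dataclass

BYTES_PER_DENSE_CELL = 8    # one double-precision value per cell
BYTES_PER_SPARSE_CELL = 16  # assumed cost per non-zero: value plus index overhead


@dataclass
class MatrixStats:
    rows: int
    cols: int
    sparsity: float  # fraction of non-zero cells, between 0.0 and 1.0


def estimate_bytes(m: MatrixStats) -> int:
    """Estimate in-memory size using the smaller of a dense or sparse layout."""
    cells = m.rows * m.cols
    dense = cells * BYTES_PER_DENSE_CELL
    sparse = int(cells * m.sparsity) * BYTES_PER_SPARSE_CELL
    return min(dense, sparse)


def choose_backend(m: MatrixStats, driver_budget_bytes: int) -> str:
    """Pick single-node execution when the estimate fits the driver budget."""
    if estimate_bytes(m) <= driver_budget_bytes:
        return "single-node"
    return "distributed"


if __name__ == "__main__":
    budget = 2 * 1024**3  # hypothetical 2 GiB driver memory budget
    examples = {
        "small dense": MatrixStats(rows=10_000, cols=1_000, sparsity=1.0),
        "large dense": MatrixStats(rows=100_000_000, cols=1_000, sparsity=1.0),
        "large sparse": MatrixStats(rows=100_000_000, cols=1_000, sparsity=0.0001),
    }
    for name, stats in examples.items():
        print(name, "->", choose_backend(stats, budget))
```

In this sketch the same large matrix lands on a different backend depending on its sparsity, which is the kind of data-dependent decision described above. A real cost-based optimizer weighs far more than a single memory estimate.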

"SystemML allows Cadent to implement advanced numerical programming methods in Apache Spark, empowering us to leverage specialized algorithms in our predictive analysis software," said Michael Zargham, Chief Scientist at Cadent Technology.

"SystemML is like SQL for Machine Learning, it enables Data Scientists to concentrate on the problem at hand, working in a high-level script language like R, and all the optimizations and rewrites are handled by the very powerful SystemML optimizer that considers data and available resources to produce the best execution plan for the application," said Luciano Resende, Architect at the IBM Spark Technology Center and Apache SystemML Incubator Mentor.

"IBM Watson Health VBC is using Apache SystemML on Apache Spark to build risk models on a very large EHR data set to predict emergency department visits," said Steve Beier, Vice President of Value Based Care Platform and Analytics at IBM Watson Health. "The models identify high-risk patients so that they can be targeted with preemptive strategies, thus potentially reducing care costs while at the same time leading to optimal outcomes for patients."

SystemML originated at IBM Research - Almaden in 2010, and was submitted to the Apache Incubator in November 2015. SystemML initiated compressed linear algebra research, a differentiating feature in SystemML, which received the VLDB 2016 Best Paper award.

"The Apache Incubator is all about open collaboration and communication and was invaluable for everyone involved in SystemML," added Eriksson. "The Apache SystemML community sincerely encourages everyone interested in machine learning and deep learning to help build our community around this revolutionary technology."

Catch Apache SystemML in action at the Big Data Developers Silicon Valley MeetUp on 8 June 2017 in San Francisco, CA.

Availability and Oversight
Apache SystemML software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache SystemML, visit http://systemml.apache.org/ and https://twitter.com/ApacheSystemML

About the Apache Incubator
The Apache Incubator is the entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects wishing to join the ASF enter through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 680 individual Members and 6,000 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cash Store, Cerner, Cloudera, Comcast, Confluent, Facebook, Google, Hortonworks, HP, Huawei, IBM, InMotion Hosting, iSigma, LeaseWeb, Microsoft, ODPi, PhoenixNAP, Pivotal, Private Internet Access, Produban, Red Hat, Serenata Flowers, Target, WANdisco, and Yahoo. For more information, visit https://www.apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "SystemML", "Apache SystemML", "Hadoop", "Apache Hadoop", "Spark", "Apache Spark", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Friday May 26, 2017

The Apache News Round-up: week ending 26 May 2017

May is drawing to a close with many activities from the Apache community:

Support Apache –Apache projects benefit billions of users for less than $14 per project per day. We appreciate your generous consideration: Every dollar counts. http://apache.org/foundation/contributing.html

ASF Board –management and oversight of the business and affairs of the corporation in accordance with the Foundation's bylaws.
 - Next Board Meeting: 21 June 2017. Board calendar and minutes http://apache.org/foundation/board/calendar.html
 - On the State of the Feather https://s.apache.org/Lz3t

ApacheCon™ –the official conference of the Apache Software Foundation. Tomorrow's Technology Today.
 - Presentations from ApacheCon https://s.apache.org/Hli7 and Apache: Big Data https://s.apache.org/tefE
 - Videos of keynotes + presentations are now available https://s.apache.org/AE3m
 - Soundbites from the conference floor https://feathercast.apache.org/
 - My First Experience of ApacheCon https://blogs.apache.org/comdev/entry/my-first-experience-of-apachecon + "epic" slideshow https://youtu.be/pKFfirqEgQ8

ASF Infrastructure –our distributed team on four continents keeps the ASF's infrastructure running around the clock.
 - 7M+ weekly checks yield zippity performance at 99.95% uptime http://status.apache.org/

Apache Archiva™ –an extensible repository management tool that helps take care of your own personal or enterprise-wide build artifact repository.
 - CVE-2017-5657: Apache Archiva CSRF vulnerability for REST endpoints http://mail-archives.apache.org/mod_mbox/www-announce/201705.mbox/%3C1622774.CTg74Sxca6%40golgafrichnam%3E

Apache Arrow™ –a columnar in-memory analytics layer designed to accelerate Big Data.
 - Apache Arrow 0.4.0 released http://arrow.apache.org/

Apache Commons™ Text –Open Source software library that provides a host of algorithms focused on working with strings and blocks of text.
 - Apache Commons Text 1.1 released http://commons.apache.org/proper/commons-text/

Apache Jena™ –a free and Open Source Java framework for building semantic Web and Linked Data applications.
 - Apache Jena 3.3.0 released http://jena.apache.org/

Apache NiFi™ –an easy to use, powerful, and reliable system to process and distribute data.
 - Apache NiFi 0.7.3 and MiNiFi 0.2.0 released https://nifi.apache.org/

Apache OFBiz™ –an Open Source product for the automation of enterprise processes that includes framework components and business applications.
 - Apache OFBiz 16.11.02 released http://ofbiz.apache.org/

Did You Know?

 - Did you know that Apache Metron helps make it possible for enterprise security personnel to more quickly detect and investigate costly threats using Big Data in real time? http://metron.apache.org/

 - Did you know that Apache Samza currently processes billions of messages per day, accounting for 100s of TB of data flowing through the system at Uber? http://samza.apache.org/

 - Did you know that over the past week, 482 contributors changed 766,603 lines of Apache code through 2,565 commits? Still productive during ApacheCon! http://status.apache.org/

Apache Community Notices:

 - "Success at Apache" is a blog series that focuses on the processes behind why the ASF "just works". 1) Project Independence https://s.apache.org/CE0V 2) All Carrot and No Stick https://s.apache.org/ykoG 3) Asynchronous Decision Making https://s.apache.org/PMvk 4) Rule of the Makers https://s.apache.org/yFgQ 5) JFDI --the unconditional love of contributors https://s.apache.org/4pjM 6) Meritocracy and Me https://s.apache.org/tQQh

 - The latest Apache Community Newsletter https://blogs.apache.org/comdev/entry/community-development-news-april-2017

 - Do friend and follow us on the Apache Community Facebook page https://www.facebook.com/ApacheSoftwareFoundation/ and Twitter account https://twitter.com/ApacheCommunity

 - The Apache Phoenix community will be holding PhoenixCon on 13 June in San Francisco https://www.eventbrite.com/e/phoenixcon-2017-tickets-32872245772

 - Catch the Apache Ignite and Spark communities at the In-Memory Computing Summit 20-21 June in Amsterdam and 24-25 October in San Francisco https://imcsummit.org/

 - ASF Operations Summary - Q3 FY2017 https://s.apache.org/NKFz

 - The list of Apache project-related MeetUps can be found at http://apache.org/events/meetups.html

 - Find out how you can participate with Apache community/projects/activities --opportunities open with Apache HTTP Server, Avro, ComDev (community development), Directory, Incubator, OODT, POI, Polygene, Syncope, Tika, Trafodion, and more! https://helpwanted.apache.org/

 - Are your software solutions Powered by Apache? Download & use our "Powered By" logos http://www.apache.org/foundation/press/kit/#poweredby

= = =

For real-time updates, sign up for Apache-related news by sending mail to announce-subscribe@apache.org and follow @TheASF on Twitter. For a broader spectrum from the Apache community, https://twitter.com/PlanetApache provides an aggregate of Project activities as well as the personal blogs and tweets of select ASF Committers.

# # #

Monday May 22, 2017

On the State of the Feather

One of the great things about Apache is that we're all about the individual (contributor). No one has higher rank/status over another. We're not pay-to-play: no-one can "buy" their way in. Titles are for organizational purposes only: a Vice President of a project doesn't carry any more weight than any other member of a project management committee, for example.

We have diverse backgrounds, opinions, and experiences. Each person has their own preferences and personal style, and we celebrate that. Whilst we do adhere to The Apache Way, we don't impose "corporate conformity" directives on anyone, from our support staff to our executive leadership.

As technologists (and perfectionists), we're trained to look for bugs and are always looking for ways to make things better. And, in keeping with our tenets of openness, our matter-of-fact communication style can sometimes be perceived as too honest and transparent.

In light of that, it might be easy to misinterpret the intent of the State of The Feather presentation by ASF President Sam Ruby at ApacheCon last week:

This isn't another "the ASF is great" presentation where I will talk about how we do things differently/better than others.

Instead, this is a talk where I identify what works and where there is more work that needs to be done.

TL;DR

We've been around for 18 years.

We're continuing to grow by every measure.

We expect to continue to be around.

We expect to continue to grow.

...Perhaps even a bit too fast.

I'm not saying it is easy…


As with any organization managing dramatic business growth, meeting these challenges presents unique opportunities, which, at times, may not be an easy feat with an all-volunteer Board overseeing a nearly all-volunteer organization. Luckily for us, we are well-versed in the mantra "If it isn't hard, it isn't worth doing". With more than 18 years of successfully honing our process of developing, incubating, and shepherding projects under our belt, we are well prepared to overcome operational demands.

The Foundation's ongoing transformation is driven by existing Apache projects and an impressive number of new innovations undergoing incubation. The collective Apache community continues to be highly productive, as summarized every week. Our commitment to rise to the challenge is evident, as demonstrated at ApacheCon. We are proud of our achievements and look forward to sharing our successes in the upcoming Annual Report.

# # # 

Friday May 19, 2017

The Apache News Round-up: week ending 19 May 2017

We're wrapping up a great week in Miami at ApacheCon, with thanks to all our attendees, event sponsors, organizers, producers, staff, volunteers, and the greater Apache community of developers, users, and enthusiasts --we miss you already. Here's what happened this week:

Support Apache –if Apache software has helped you, please consider making a donation, no matter the size. Every dollar counts. http://apache.org/foundation/contributing.html

ASF Board –management and oversight of the business and affairs of the corporation in accordance with the Foundation's bylaws.
 - Next Board Meeting: 21 June 2017. Board calendar and minutes http://apache.org/foundation/board/calendar.html

ApacheCon™ –the official conference of the Apache Software Foundation. Tomorrow's Technology Today.
 - Presentations from ApacheCon https://s.apache.org/Hli7 and Apache: Big Data https://s.apache.org/tefE
 - Videos of keynotes + presentations are now available https://s.apache.org/AE3m
 - Soundbites from the conference floor https://feathercast.apache.org/

ASF Infrastructure –our distributed team on four continents keeps the ASF's infrastructure running around the clock.
 - 7M+ weekly checks yield outstanding performance at 99.98% uptime http://status.apache.org/

Apache Archiva™ –an application for managing one or more remote repositories, including administration, artifact handling, browsing and searching.
 - Apache Archiva 2.2.3 released http://archiva.apache.org/

Apache Beam™ –Open Source unified programming model for batch and streaming Big Data processing.
 - The Apache Software Foundation Announces Apache® Beam™ v2.0.0 https://s.apache.org/k5W7

Apache Buildr™ –a build system for Java-based applications, including support for Scala, Groovy and a growing number of JVM languages and tools.
 - Apache Buildr 1.5.3 released http://buildr.apache.org/

Apache CarbonData™ –an indexed columnar data format for fast analytics on Big Data platforms such as Apache Hadoop and Apache Spark.
 - Apache CarbonData 1.1.0 released http://carbondata.apache.org/

Apache Commons™ Compress –library defines an API for working with ar, cpio, Unix dump, tar, zip, gzip, XZ, Pack200, bzip2, 7z, arj, lzma, snappy, DEFLATE, lz4, Brotli and Z files.
 - Apache Commons Compress 1.14 released http://commons.apache.org/compress/

Apache Directory™ Kerby –a Java Kerberos binding.
 - Apache Kerby™ 1.0.0 released http://directory.apache.org/kerby

Apache NiFi™ MiNiFi –provides a complementary data collection approach that supplements the core tenets of NiFi in dataflow management, focusing on the collection of data at the source of its creation.
 - Apache NiFi MiNiFi C++ 0.2.0 released https://nifi.apache.org/minifi

Apache PDFBox™ –an Open Source Java tool for working with PDF documents.
 - Apache PDFBox 2.0.6 released http://pdfbox.apache.org/

Apache Qpid™ –implements the latest AMQP specification, the first open standard for enterprise messaging, and provides transaction management, queuing, distribution, security, management, clustering, federation and heterogeneous multi-platform support and a lot more.
 - Apache Qpid JMS 0.23.0 released http://qpid.apache.org

Apache Samza™ –Open Source Big Data distributed stream processing framework in production at Intuit, LinkedIn, Netflix, Optimizely, Redfin, and Uber, among other organizations.
 - The Apache Software Foundation Announces Apache® Samza™ v0.13 https://s.apache.org/CSbJ

Apache Tomcat™ –an Open Source software implementation of the Java Servlet, JavaServer Pages, Java Unified Expression Language, Java WebSocket and Java Authentication Service Provider Interface for Containers technologies.
 - Apache Tomcat 7.0.78, 8.0.44, and 8.5.15 released http://tomcat.apache.org/

Apache Wicket™ –an Open Source Java component oriented web application framework that powers thousands of Web applications and web sites for governments, stores, universities, cities, banks, email providers, and more.
 - Apache Wicket 7.7.0 and 8.0.0-M6 released http://wicket.apache.org/


Did You Know?

 - Did you know that Autodesk's private Cloud is powered by Apache CloudStack? http://cloudstack.apache.org/

 - Did you know that Formula 1 races have 1.5 billion data points per race, and use Apache Drill, Flink, Hadoop, HBase, Hive, Kafka, MapReduce, Solr, and Spark for their Big Data architectures? http://events.linuxfoundation.org/sites/events/files/slides/fast_car_big_data_code_motion_carol3.pdf

 - Did you know that over the past month, 1,086 Apache Committers changed 5,147,842 lines of code over 15,487 commits? http://status.apache.org/

Apache Community Notices:

 - "Success at Apache" is a blog series that focuses on the processes behind why the ASF "just works". 1) Project Independence https://s.apache.org/CE0V 2) All Carrot and No Stick https://s.apache.org/ykoG 3) Asynchronous Decision Making https://s.apache.org/PMvk 4) Rule of the Makers https://s.apache.org/yFgQ 5) JFDI --the unconditional love of contributors https://s.apache.org/4pjM 6) Meritocracy and Me https://s.apache.org/tQQh

 - The latest Apache Community Newsletter https://blogs.apache.org/comdev/entry/community-development-news-april-2017

 - Do friend and follow us on the Apache Community Facebook page https://www.facebook.com/ApacheSoftwareFoundation/ and Twitter account https://twitter.com/ApacheCommunity

 - The Apache Phoenix community will be holding PhoenixCon on 13 June in San Francisco https://www.eventbrite.com/e/phoenixcon-2017-tickets-32872245772

 - Catch the Apache Ignite and Spark communities at the In-Memory Computing Summit 20-21 June in Amsterdam and 24-25 October in San Francisco https://imcsummit.org/

 - ASF Operations Summary - Q3 FY2017 https://s.apache.org/NKFz

 - The list of Apache project-related MeetUps can be found at http://apache.org/events/meetups.html

 - Find out how you can participate with Apache community/projects/activities --opportunities open with Apache HTTP Server, Avro, ComDev (community development), Directory, Incubator, OODT, POI, Polygene, Syncope, Tika, Trafodion, and more! https://helpwanted.apache.org/

 - ApacheCon North America + Apache: BigData, CloudStack Collaboration Conference, FlexJS Summit, Apache: IoT, and TomcatCon will be held 16-18 May 2017 in Miami http://apachecon.com/

 - Are your software solutions Powered by Apache? Download & use our "Powered By" logos http://www.apache.org/foundation/press/kit/#poweredby

= = =

For real-time updates, sign up for Apache-related news by sending mail to announce-subscribe@apache.org and follow @TheASF on Twitter. For a broader spectrum from the Apache community, https://twitter.com/PlanetApache provides an aggregate of Project activities as well as the personal blogs and tweets of select ASF Committers.

# # #

Wednesday May 17, 2017

The Apache Software Foundation Announces Apache® Beam™ v2.0.0

Open Source unified programming model for batch and streaming Big Data processing in use at Google Cloud, PayPal, and Talend, among others.

Forest Hill, MD —17 May 2017— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today the availability of Apache® Beam™ v2.0.0, the first stable release of the unified programming model for both batch and streaming Big Data processing.

An Apache Top-Level Project (TLP) since December 2016, Beam includes Java and Python software development kits used to define data processing pipelines and runners to execute them on Apache Apex, Apache Flink, Apache Spark, and Google Cloud Dataflow, among other execution engines.
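
The split between defining a pipeline once and handing it to different runners can be sketched in plain Python. This is a toy illustration only: Pipeline, map_step, filter_step, run_batch, and run_streaming are hypothetical names invented for this example and are not the Apache Beam API.

```python
# Toy illustration only: hypothetical names, not the Apache Beam API.
from typing import Callable, Iterable, Iterator, List

Transform = Callable[[Iterable], Iterator]


def map_step(fn: Callable) -> Transform:
    """Build a transform that applies fn to every element."""
    def apply(items: Iterable) -> Iterator:
        return (fn(x) for x in items)
    return apply


def filter_step(pred: Callable) -> Transform:
    """Build a transform that keeps only elements for which pred is true."""
    def apply(items: Iterable) -> Iterator:
        return (x for x in items if pred(x))
    return apply


class Pipeline:
    """An ordered list of transforms, defined independently of how it is run."""

    def __init__(self, steps: List[Transform]):
        self.steps = steps

    def expand(self, source: Iterable) -> Iterator:
        """Chain every transform lazily over the source."""
        stream = iter(source)
        for step in self.steps:
            stream = step(stream)
        return stream


def run_batch(pipeline: Pipeline, dataset: list) -> list:
    """'Batch runner': consume a bounded input and return all results at once."""
    return list(pipeline.expand(dataset))


def run_streaming(pipeline: Pipeline, source: Iterable, emit: Callable) -> None:
    """'Streaming runner': hand each result to emit as soon as it is produced."""
    for result in pipeline.expand(source):
        emit(result)


if __name__ == "__main__":
    # One pipeline definition: trim, drop empty strings, upper-case.
    pipeline = Pipeline([map_step(str.strip), filter_step(bool), map_step(str.upper)])

    print(run_batch(pipeline, [" a ", "", "b"]))

    def events():
        # Stands in for an unbounded source; kept finite so the demo ends.
        yield " click "
        yield "   "
        yield "view"

    run_streaming(pipeline, events(), emit=print)
```

The pipeline object carries no knowledge of how it will be executed, so the same definition serves both runners. Real runners also deal with distribution, windowing, and fault tolerance, none of which this sketch attempts.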

Apache Beam has its roots in Google's internal work on data processing over the last decade, evolving from the initial MapReduce system, through FlumeJava and MillWheel, into Google Cloud Dataflow v1.x, which defined the unified programming model that became the heart of Apache Beam.

"The first stable release is an important milestone for the Apache Beam community," said Davor Bonaci, Vice President of Apache Beam. "This is a statement from the community that it intends to maintain API stability with all releases for the foreseeable future, making Beam suitable for enterprise deployment."

Apache Beam v2.0.0 improves user experience across the project, focusing on seamless portability across execution environments, including engines, operating systems, on-premise clusters, cloud providers, and data storage systems. Other highlights include:
  • API stability and future compatibility within this major version;
  • Stateful data processing paradigms that unlock efficient, data-dependent computations;
  • Support for user-extensible file systems, with built-in support for Hadoop Distributed File System, among others; and
  • A metrics subsystem for deeper insight into pipeline execution.

Apache Beam is in use at Google Cloud, PayPal, and Talend, among others.

"Apache Beam is a mature data processing API for the enterprise, with powerful semantics that solve real-world challenges of stream processing," said Tomer Pilossof, Big Data Manager at PayPal. "With Beam, we provide data processing solutions for a wide range of customers within the PayPal organization."

"We at Talend are thrilled to have contributed to Apache Beam reaching the 2.0.0 milestone and its first official stable release," said Laurent Bride, Chief Technology Officer at Talend. "Apache Beam is now part of the foundation of Talend products. Recently, we released Talend Data Preparation for Big Data which leverages Beam to create transformation pipelines that are portable across many execution engines. Later this year, we plan to deliver Talend Data Streams, taking the Apache Beam integration one step further by utilizing its powerful streaming semantics. Whether for batch, streaming, or real-time use cases, Apache Beam is a powerful framework that delivers the flexibility and advanced functionality our customers need."

"We congratulate the Apache Beam community for reaching the key milestone of a first stable release," said William Vambenepe, Lead Product Manager for Big Data, Google Cloud. "We look forward to our Google Cloud Dataflow customers taking full advantage of Beam's powerful programming model and newest features to run their data processing pipelines on Google Cloud."

Apache Beam v2.0.0 is making its debut at Apache: Big Data, taking place this week in Miami, FL, with four sessions featuring Apache Beam. Apache Beam will also be highlighted at numerous face-to-face meetups and conferences, including the Future of Data San Jose meetup, Strata Data Conference London, Berlin Buzzwords, and DataWorks Summit San Jose.

"I'd like to invite everyone to try out Apache Beam v2.0.0 today and consider joining our vibrant community," added Bonaci. "We welcome feedback, contribution and participation through our mailing lists, issue tracker, pull requests, and events."

Availability and Oversight
Apache Beam software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Beam, visit https://beam.apache.org/ and https://twitter.com/ApacheBeam

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server -- the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 680 individual Members and 6,000 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cash Store, Cerner, Cloudera, Comcast, Confluent, Facebook, Google, Hortonworks, HP, Huawei, IBM, InMotion Hosting, iSigma, LeaseWeb, Microsoft, ODPi, PhoenixNAP, Pivotal, Private Internet Access, Produban, Red Hat, Serenata Flowers, Target, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "Beam", "Apache Beam", "Apex", "Apache Apex", "Flink", "Apache Flink", "Spark", "Apache Spark", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Monday May 15, 2017

The Apache Software Foundation Announces Apache® Samza™ v0.13

Open Source Big Data distributed stream processing framework in production at Intuit, LinkedIn, Netflix, Optimizely, Redfin, and Uber, among other organizations.

Forest Hill, MD —15 May 2017— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today the availability of Apache® Samza™  v0.13, the latest version of the Open Source Big Data distributed stream processing framework.

An Apache Top-Level Project (TLP) since January 2015, Samza is designed to provide support for fault-tolerant, large scale stream processing. Developers use Apache Samza to write applications that consume streams of data and to help organizations understand and respond to their data in real-time. Apache Samza offers a unified API to process streaming data from pub-sub messaging systems like Apache Kafka and batch data from Apache Hadoop.

"The latest 0.13 release takes Apache Samza's data processing capabilities to the next level with multiple new features," said Yi Pan, Vice President of Apache Samza. "It also improves the simplicity and portability of real-time applications."

Apache Samza powers several real-time data processing needs including real-time analytics on user data, message routing, combating fraud, anomaly detection, performance monitoring, real-time communication, and more. Apache Samza can process up to 1.1 million messages per second on a single machine. v0.13 highlights include:
  • A higher-level API that developers can use to express complex processing pipelines on streams more concisely;
  • Support for running Samza applications as a lightweight embedded library without relying on YARN;
  • Support for flexible deployment options; 
  • Support for rolling upgrade of running Samza applications;
  • Improved monitoring and failure detection using a built-in heart beating mechanism;
  • Enabling better integrations with other cluster-manager frameworks and environments; and
  • Several bug fixes that improve reliability, stability and robustness of data processing.

Organizations such as Intuit, LinkedIn, Netflix, Optimizely, Redfin, TripAdvisor, and Uber rely on Apache Samza to power complex data architectures that process billions of events each day. A list of user organizations is available at https://cwiki.apache.org/confluence/display/SAMZA/Powered+By

"Apache Samza is a highly performant stream/data processing system that has been battle tested over the years of powering mission critical applications in a wide range of businesses," said Kartik Paramasivam, Head of Streams Infrastructure, and Director of Engineering at LinkedIn. "With this 0.13 release, the power of Samza is no longer limited to YARN based topologies. It can now be used in any hosting environment. In addition, it now has a new higher level API that makes it significantly easier to create arbitrarily complex processing pipelines."

"Apache Samza has been powering near real-time use cases at Uber for the last year and a half," said Chinmay Soman, Staff Software Engineer at Uber. "This ranges from analytical use cases such as understanding business metrics, feature extraction for machine learning as well as some critical applications such as Fraud detection, Surge pricing and Intelligent promotions. Samza has been proven to be robust in production and is currently processing billions of messages per day, accounting for 100s of TB of data flowing through the system."

"At Optimizely, we have built the world’s leading experimentation platform, which ingests billions of click-stream events a day from millions of visitors for analysis," said Vignesh Sukumar, Senior Engineering Manager at Optimizely. "Apache Samza has been a great asset to Optimizely's Event ingestion pipeline allowing us to perform large scale, real time stream computing such as aggregations (e.g. session computations) and data enrichment on a multiple billion events/day scale. The programming model, durability and the close integration with Apache Kafka fit our needs perfectly."

"It has been a phenomenal experience engaging with this vibrant international community of users and contributors, and I look forward to our continued growth. It is a great time to be involved in the project and we welcome new contributors to the Samza community," added Pan.

Catch Apache Samza in action at Apache: Big Data, 16-18 May 2017 in Miami, FL http://apachecon.com/ , where the community will be showcasing how Samza simplifies stream processing at scale.

Availability and Oversight
Apache Samza software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Samza, visit http://samza.apache.org/ , https://blogs.apache.org/samza/ , and https://twitter.com/samzastream

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 680 individual Members and 6,000 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cash Store, Cerner, Cloudera, Comcast, Confluent, Facebook, Google, Hortonworks, HP, Huawei, IBM, InMotion Hosting, iSigma, LeaseWeb, Microsoft, ODPi, PhoenixNAP, Pivotal, Private Internet Access, Produban, Red Hat, Serenata Flowers, Target, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "Hadoop", "Apache Hadoop", "Kafka", "Apache Kafka", "Samza", "Apache Samza", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Friday May 12, 2017

The Apache News Round-up: week ending 12 May 2017

As members of the Apache community are preparing to convene in Miami for ApacheCon next week, we continue to be productive. Here's what we've been up to:

Support Apache –if Apache software has helped you, please consider making a donation, no matter the size. Every dollar counts. http://apache.org/foundation/contributing.html

ASF Board –management and oversight of the business and affairs of the corporation in accordance with the Foundation's bylaws.
 - Next Board Meeting: 17 May 2017. Board calendar and minutes http://apache.org/foundation/board/calendar.html

ASF Infrastructure –our distributed team on four continents keeps the ASF's infrastructure running around the clock.
 - 7M+ weekly checks yield forward performance at 99.82% uptime http://status.apache.org/

ApacheCon™ –Tomorrow's Technology Today: Big Data, CloudStack, Flex, IoT, Tomcat, and dozens of Apache projects across 200+ sessions, 150+ speakers, 4 subconferences, BarCampApache and more. http://apachecon.com/ Register today!
 - Support your favorite Apache projects and communities at ApacheCon by becoming an Apache Community and/or BarCamp Sponsor http://events.linuxfoundation.org/events/apachecon-north-america/sponsors/community-sponsor

Apache Arrow™ –columnar in-memory analytics layer designed to accelerate Big Data.
 - Apache Arrow 0.3.0 released http://arrow.apache.org/

Apache Directory™ Fortress –a standards-based access management system, written in Java, that provides role-based access control, delegated administration and password policy services with LDAP.
 - Apache Fortress 2.0.0-RC2 released http://directory.apache.org/fortress/

Apache HttpComponents™ Client –Java library implementing an HTTP client based on HttpCore components.
 - HttpComponents Client 5.0 alpha2 released http://hc.apache.org/httpcomponents-client/

Apache Ignite™ –In-Memory Data Fabric providing in-memory data caching, partitioning, processing, and querying components.
 - Apache Ignite 2.0.0 released https://ignite.apache.org/

Apache OpenWebBeans™ Meecrowave –a light Apache web server based on Tomcat and OpenWebBeans (à la microprofile fashion).
 - Apache Meecrowave 0.3.1 released http://openwebbeans.apache.org/meecrowave/

Apache NiFi™ –an easy to use, powerful, and reliable system to process and distribute data.
 - Apache NiFi 1.2.0 released https://nifi.apache.org/

Apache Qpid™ Proton –a messaging library for the Advanced Message Queuing Protocol 1.0 (AMQP 1.0, ISO/IEC 19464, http://www.amqp.org).
 - Apache Qpid Proton-J 0.19.0 released http://qpid.apache.org/

Apache Tomcat™ –an Open Source software implementation of the Java Servlet, JavaServer Pages, Java Unified Expression Language, Java WebSocket and JASPIC technologies.
 - Apache Tomcat 9.0.0.M21 released http://tomcat.apache.org/

Apache Trafodion (incubating) –a Web-scale SQL-on-Hadoop solution enabling transactional or operational workloads on Apache Hadoop.
 - Apache Trafodion 2.1.0-incubating released http://trafodion.incubator.apache.org/index.html


Did You Know?

 - Did you know that previews for ApacheCon + Apache: Big Data keynotes and select sessions are available exclusively on Feathercast? https://feathercast.apache.org/

 - Did you know that over the past quarter, Apache source code has been downloaded more than 2M times? https://projects.apache.org/statistics.html

 - Did you know that RiskCo's financial engineering and risk intelligence group uses Apache Wicket? http://wicket.apache.org/


Apache Community Notices:

 - "Success at Apache" is a blog series that focuses on the processes behind why the ASF "just works". 1) Project Independence https://s.apache.org/CE0V 2) All Carrot and No Stick https://s.apache.org/ykoG 3) Asynchronous Decision Making https://s.apache.org/PMvk 4) Rule of the Makers https://s.apache.org/yFgQ 5) JFDI --the unconditional love of contributors https://s.apache.org/4pjM 6) Meritocracy and Me https://s.apache.org/tQQh

 - The latest Apache Community Newsletter https://blogs.apache.org/comdev/entry/community-development-news-april-2017

 - Do friend and follow us on the Apache Community Facebook page https://www.facebook.com/ApacheSoftwareFoundation/ and Twitter account https://twitter.com/ApacheCommunity

 - The Apache Phoenix community will be holding PhoenixCon on 13 June in San Francisco https://www.eventbrite.com/e/phoenixcon-2017-tickets-32872245772

 - ASF Operations Summary - Q3 FY2017 https://s.apache.org/NKFz

 - The list of Apache project-related MeetUps can be found at http://apache.org/events/meetups.html

 - Find out how you can participate with Apache community/projects/activities --opportunities open with Apache HTTP Server, Avro, ComDev (community development), Directory, Incubator, OODT, POI, Polygene, Syncope, Tika, Trafodion, and more! https://helpwanted.apache.org/

 - ApacheCon North America + Apache: BigData, CloudStack Collaboration Conference, FlexJS Summit, Apache: IoT, and TomcatCon will be held 16-18 May 2017 in Miami http://apachecon.com/

 - Are your software solutions Powered by Apache? Download & use our "Powered By" logos http://www.apache.org/foundation/press/kit/#poweredby

= = =

For real-time updates, sign up for Apache-related news by sending mail to announce-subscribe@apache.org and follow @TheASF on Twitter. For a broader spectrum from the Apache community, https://twitter.com/PlanetApache provides an aggregate of Project activities as well as the personal blogs and tweets of select ASF Committers.

# # #

Friday May 05, 2017

The Apache News Round-up: week ending 5 May 2017

Welcome, May! The Apache community is getting ready for ApacheCon, and we hope to see you in Miami soon. Here's what's happened over the past week:

Support Apache –your donations help the world's largest Open Source foundation enrich the lives of countless users and developers. Every dollar counts. http://apache.org/foundation/contributing.html

ASF Board –management and oversight of the business and affairs of the corporation in accordance with the Foundation's bylaws.
 - Next Board Meeting: 17 May 2017. Board calendar and minutes http://apache.org/foundation/board/calendar.html
 - The Apache Software Foundation Welcomes 64 New Members https://s.apache.org/2Vt

Success at Apache –our sixth installment in the monthly blog series that focuses on the processes behind why the ASF "just works".
 - Meritocracy and Me, by Tom Barber https://s.apache.org/tQQh

ASF Infrastructure –our distributed team on four continents keeps the ASF's infrastructure running around the clock.
 - 7M+ weekly checks yield whizbang performance at 99.97% uptime http://status.apache.org/

ApacheCon™ –Tomorrow's Technology Today: Big Data, CloudStack, Flex, IoT, Tomcat, and dozens of Apache projects across 200+ sessions, 150+ speakers, 4 subconferences, BarCampApache and more. http://apachecon.com/ Register today!
 - Support your favorite Apache projects and communities at ApacheCon by becoming an Apache Community and/or BarCamp Sponsor http://events.linuxfoundation.org/events/apachecon-north-america/sponsors/community-sponsor

Apache CarbonData™ –Open Source Big Data analytics accelerator.
 - The Apache Software Foundation Announces Apache® CarbonData™ as a Top-Level Project https://s.apache.org/QmTI

Apache HBase™ –an Open Source, distributed, versioned, non-relational database.
 - Apache HBase 1.1.10 released https://hbase.apache.org/

Apache HttpComponents™ Core –a set of HTTP/1.1 and HTTP/2 transport components that can be used to build custom client and server side HTTP services with a minimal footprint.
 - Apache HttpComponents Core 5.0 alpha3 released http://hc.apache.org/

Apache Juneau (incubating) –a toolkit for marshalling POJOs to a wide variety of content types using a common framework, and for creating sophisticated self-documenting REST interfaces and microservices using very little code.
 - Apache Juneau 6.2.0 (incubating) released http://juneau.incubator.apache.org/

Apache Kylin™ –Open Source petabyte-scale Big Data Distributed Analytics Engine.
 - Apache Kylin 2.0.0 released https://kylin.apache.org/

Apache Mahout™ –Open Source scalable machine learning and data mining library for Big Data artificial intelligence.
 - The Apache Software Foundation Announces Apache® Mahout™ v0.13.0 https://s.apache.org/ioAa

Apache Qpid™ Dispatch –a router for the Advanced Message Queuing Protocol 1.0 (AMQP 1.0, ISO/IEC 19464, http://www.amqp.org).
 - Apache Qpid Dispatch 0.8.0 released http://qpid.apache.org/


Did You Know?

 - Did you know that previews for ApacheCon + Apache: Big Data keynotes and select sessions are available exclusively on Feathercast? https://feathercast.apache.org/

 - Did you know that the Top 5 closers (+ number of issues) over the past 30 days are: Wes McKinney (169), Sean Owen (75), Gavin McDonald (64), Aled Sage (57), and Jesse MacFadyen (55)? Well done, all! http://status.apache.org/

 - Did you know that you can support Apache by shopping at http://smile.amazon.com? 0.5% will be donated to the ASF!

 - Did you know that Air New Zealand's "Ask Oskar" online chatbot is trained by Apache OpenNLP? http://opennlp.apache.org/


Apache Community Notices:

 - "Success at Apache" is a new monthly blog series that focuses on the processes behind why the ASF "just works". 1) Project Independence https://s.apache.org/CE0V 2) All Carrot and No Stick https://s.apache.org/ykoG 3) Asynchronous Decision Making https://s.apache.org/PMvk 4) Rule of the Makers https://s.apache.org/yFgQ 5) JFDI --the unconditional love of contributors https://s.apache.org/4pjM 6) Meritocracy and Me https://s.apache.org/tQQh

 - Introducing the new Apache Community Newsletter https://blogs.apache.org/comdev/entry/community-development-news-march-2017 , Facebook page https://www.facebook.com/ApacheSoftwareFoundation/ , and Twitter account https://twitter.com/ApacheCommunity . Do friend and follow us.

 - ASF Operations Summary - Q3 FY2017 https://s.apache.org/NKFz

 - The list of Apache project-related MeetUps can be found at http://apache.org/events/meetups.html

 - Find out how you can participate with Apache community/projects/activities --opportunities open with Apache HTTP Server, Avro, ComDev (community development), Directory, Incubator, OODT, POI, Polygene, Syncope, Tika, Trafodion, and more! https://helpwanted.apache.org/

 - ApacheCon North America + Apache: BigData, CloudStack Collaboration Conference, FlexJS Summit, Apache: IoT, and TomcatCon will be held 16-18 May 2017 in Miami http://apachecon.com/

 - Are your software solutions Powered by Apache? Download & use our "Powered By" logos http://www.apache.org/foundation/press/kit/#poweredby

= = =

For real-time updates, sign up for Apache-related news by sending mail to announce-subscribe@apache.org and follow @TheASF on Twitter. For a broader spectrum from the Apache community, https://twitter.com/PlanetApache provides an aggregate of Project activities as well as the personal blogs and tweets of select ASF Committers.

# # #

Wednesday May 03, 2017

The Apache Software Foundation Welcomes 64 New Members

The Apache Software Foundation welcomes the following new Members who were elected during the annual ASF Members' Meeting on 28-30 March 2017:

Taher A. Alkhateeb, Ryan Blue, Davor Bonaci, Michael Brohl, Cédric Champeau, Byung-Gon Chun, William Colen, Radu Cotescu, Jaroslaw Cwiklik, Dániel Dékány, Mike Drob, Eric Evans, Olaf Flebbe, Lars Francke, Roberto Galoppini, Robert Gemmell, Jorge Luis Betancourt Gonzalez, Thamme Gowda, Scott Gray, Stefan Hett, Jonathan Hsieh, Claus Ibsen, Bilgin Ismet Ibryam, Jeff Jirsa, Evgeny Kotkov, Andrew Kurth, Chris Lambertus, T Jake Luciani, Nicolas Malin, Stephen Mallette, Karl Heinz Marbaise, Sidney Markowitz, Gary Martin, Jan Materne, Nate McCall, Larry McCay, Robert Munteanu, Kay Ousterhout, Anil Patel, Christine Poerschke, Matt Post, Dominik Psenner, Chris Riccomini, Carlos Rovira, Daniel Ruggeri, Guergana K. Savova, Felix Schumacher, Anthony Shaw, Matt Sicker, Karanjeet Singh, Stian Soiland-Reyes, Lee Moon Soo, Michael Starch, Daniel Takamori, Josh Tynjala, Ashish Vijaywargiya, Jay Vyas, Andrew Wang, Claude Warren, Michael Semb Wever, Evans Ye, Piotr Zarzycki, Jeff Zhang, and Jordan Zimmerman.


When the ASF incorporated in 1999, the Foundation's core membership comprised 21 individuals who oversaw the progress of the Apache HTTP Server. This group grew with Committers --developers who contributed code, patches, or documentation, and were subsequently granted access by the Membership:
  1. to "commit" or "write" (contribute) directly to the code repository;

  2. the right to vote on community-related decisions; and

  3. the ability to propose an active user for Committership.

Those Committers who demonstrate merit in the Foundation’s growth, evolution, and progress are nominated for ASF Membership by existing members.

This election brings the total number of ASF Members to 684 today. Individuals elected as ASF Members legally serve as the "shareholders" of the Foundation https://www.apache.org/foundation/governance/members.html

For more information on how the ASF works, visit http://www.apache.org/foundation/how-it-works.html .

# # # 


Monday May 01, 2017

The Apache Software Foundation Announces Apache® CarbonData™ as a Top-Level Project

Open Source Big Data analytics accelerator in use at Bank of Communications, Hulu, Huawei, SAIC Motor, and Zhejiang Mobile, among others.

Forest Hill, MD –1 May 2017– The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Apache® CarbonData™ has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles.

Apache CarbonData is an indexed columnar store file format for fast analytics on Big Data platforms (including Apache Hadoop and Apache Spark, among others) that helps speed up queries by an order of magnitude over petabytes of data.

"We are very proud to complete the incubation process and graduate as an Apache Top-Level Project," said Liang Chen, Vice President of Apache CarbonData. "The CarbonData community grew rapidly over last ten months, both in terms of size and diversity. Since entering the Apache Incubator, we have completed 4 releases, and exceeded 90 contributors from 10 different organizations."

With the aim of using a unified file format to satisfy all kinds of data analysis cases, Apache CarbonData seamlessly integrates with Hadoop and Spark to improve Big Data analysis efficiency. In benchmarks, CarbonData's interactive queries run approximately 10x faster than on standard column-oriented SQL-on-Hadoop data stores.

Highlights include:

  • Unique data organization to allow faster filtering and better compression;
  • Multi-level Indexing to enable faster search and speed up query processing;
  • Deep Apache Spark Integration for dataframe + SQL compliance;
  • Advanced push down optimization to minimize the amount of data being read, processed, converted, transmitted, and shuffled;
  • Efficient compression and global encoding schemes to further improve aggregation query performance;
  • Dictionary encoding for reduced storage space and faster processing; and
  • Data update + delete support using standard SQL syntax.


Apache CarbonData is in use at an array of organizations, including Bank of Communications, medical/pharma social platform DXY, Hulu, Huawei, group online retailer MEITUAN, SAIC Motor, and Zhejiang Mobile, among others.

"CarbonData has very good performance as a ‘SQL on Hadoop’ solution," said Tan Sheng, Director of SAIC Motor’s Big Data team. "It is suitable for SAIC Motor to adopt as a central Big Data platform component. Not only do we use Apache CarbonData, we also actively participate in its community as contributors." 

"Apache CarbonData is great, as it helped our audit business to improve 7-10X performance based on 14 billion rows of data," said Wei Zhao, Senior Engineer at Bank of Communications.

"Apache CarbonData is very suitable for our filter query cases, and has averaged 20x improvement on performance," said William Zhu, Architecture team member at DXY. "And, as CarbonData supports data update and delete, this feature is very useful. We would consider CarbonData as our all-in-one solution to unify all analysis data."

CarbonData was first developed at Huawei in 2013. The project was submitted to the Apache Incubator in June 2016, and had its first official release two months later. The project won top honors in the BlackDuck 2016 Open Source Rookies of the Year's Big Data category.

"Apache CarbonData is a great example of the value of the incubation process," said Jean-Baptiste Onofré, Apache CarbonData Incubator Mentor and Project Management Committee member. "Helping grow the CarbonData developer and user communities has increased our visibility, which allowed us to extend our use cases and tests, and gather new ideas. The initial CarbonData committers did (and are still doing) great work to welcome new users and contributors, clearly understanding it's a step forward for the project."

"We will continue to put our efforts towards optimizing data format efficiency for Big Data ecosystem and provide an unified and high performance data storage solution," added Liang. "The Apache CarbonData community welcomes interested contributors to work with us on our journey forward."

Catch Apache CarbonData in action at ApacheCon (16-18 May/Miami), and Spark Summit (5-7 June/San Francisco).

Availability and Oversight
Apache CarbonData software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache CarbonData, visit http://carbondata.apache.org/ , https://twitter.com/ApacheCarbonDat , and https://www.facebook.com/carbondata/

About the Apache Incubator
The Apache Incubator is the entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects wishing to join the ASF enter through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 620 individual Members and 6,000 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cash Store, Cerner, Cloudera, Comcast, Confluent, Facebook, Google, Hortonworks, HP, Huawei, IBM, InMotion Hosting, iSigma, LeaseWeb, Microsoft, ODPi, PhoenixNAP, Pivotal, Private Internet Access, Produban, Red Hat, Serenata Flowers, Target, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "CarbonData", "Apache CarbonData", "Hadoop", "Apache Hadoop", "Spark", "Apache Spark", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # # 
