The Apache Software Foundation Blog

Friday May 26, 2017

The Apache News Round-up: week ending 26 May 2017

May is drawing to a close with many activities from the Apache community:

Support Apache –Apache projects benefit billions of users for less than $14 per project per day. We appreciate your generous consideration: Every dollar counts. http://apache.org/foundation/contributing.html

ASF Board –management and oversight of the business and affairs of the corporation in accordance with the Foundation's bylaws.
 - Next Board Meeting: 21 June 2017. Board calendar and minutes http://apache.org/foundation/board/calendar.html
 - On the State of the Feather https://s.apache.org/Lz3t

ApacheCon™ –the official conference of the Apache Software Foundation. Tomorrow's Technology Today.
 - Presentations from ApacheCon https://s.apache.org/Hli7 and Apache: Big Data https://s.apache.org/tefE
 - Videos of keynotes + presentations are now available https://s.apache.org/AE3m
 - Soundbites from the conference floor https://feathercast.apache.org/
 - My First Experience of ApacheCon https://blogs.apache.org/comdev/entry/my-first-experience-of-apachecon + "epic" slideshow https://youtu.be/pKFfirqEgQ8

ASF Infrastructure –our distributed team on four continents keeps the ASF's infrastructure running around the clock.
 - 7M+ weekly checks yield zippity performance at 99.95% uptime http://status.apache.org/

Apache Archiva™ –an extensible repository management tool that helps take care of your own personal or enterprise-wide build artifact repository.
 - CVE-2017-5657: Apache Archiva CSRF vulnerability for REST endpoints http://mail-archives.apache.org/mod_mbox/www-announce/201705.mbox/%3C1622774.CTg74Sxca6%40golgafrichnam%3E

Apache Arrow™ –a columnar in-memory analytics layer designed to accelerate Big Data.
 - Apache Arrow 0.4.0 released http://arrow.apache.org/

Apache Commons™ Text –Open Source software library that provides a host of algorithms focused on working with strings and blocks of text.
 - Apache Commons Text 1.1 released http://commons.apache.org/proper/commons-text/

Apache Jena™ –a free and Open Source Java framework for building semantic Web and Linked Data applications.
 - Apache Jena 3.3.0 released http://jena.apache.org/

Apache NiFi™ –an easy to use, powerful, and reliable system to process and distribute data.
 - Apache NiFi 0.7.3 and MiNiFi 0.2.0 released https://nifi.apache.org/

Apache OFBiz™ –an Open Source product for the automation of enterprise processes that includes framework components and business applications.
 - Apache OFBiz 16.11.02 released http://ofbiz.apache.org/

Did You Know?

 - Did you know that Apache Metron helps make it possible for enterprise security personnel to more quickly detect and investigate costly threats using Big Data in real time? http://metron.apache.org/

 - Did you know that Apache Samza currently processes billions of messages per day, accounting for 100s of TB of data flowing through the system at Uber? http://samza.apache.org/

 - Did you know that over the past week, 482 contributors changed 766,603 lines of Apache code through 2,565 commits? Still productive during ApacheCon! http://status.apache.org/

Apache Community Notices:

 - "Success at Apache" is a blog series that focuses on the processes behind why the ASF "just works". 1) Project Independence https://s.apache.org/CE0V 2) All Carrot and No Stick https://s.apache.org/ykoG 3) Asynchronous Decision Making https://s.apache.org/PMvk 4) Rule of the Makers https://s.apache.org/yFgQ 5) JFDI --the unconditional love of contributors https://s.apache.org/4pjM 6) Meritocracy and Me https://s.apache.org/tQQh

 - The latest Apache Community Newsletter https://blogs.apache.org/comdev/entry/community-development-news-april-2017

 - Do friend and follow us on the Apache Community Facebook page https://www.facebook.com/ApacheSoftwareFoundation/ and Twitter account https://twitter.com/ApacheCommunity

 - The Apache Phoenix community will be holding PhoenixCon on 13 June in San Francisco https://www.eventbrite.com/e/phoenixcon-2017-tickets-32872245772

 - Catch the Apache Ignite and Spark communities at the In-Memory Computing Summit 20-21 June in Amsterdam and 24-25 October in San Francisco https://imcsummit.org/

 - ASF Operations Summary - Q3 FY2017 https://s.apache.org/NKFz

 - The list of Apache project-related MeetUps can be found at http://apache.org/events/meetups.html

 - Find out how you can participate with Apache community/projects/activities --opportunities open with Apache HTTP Server, Avro, ComDev (community development), Directory, Incubator, OODT, POI, Polygene, Syncope, Tika, Trafodion, and more! https://helpwanted.apache.org/

 - Are your software solutions Powered by Apache? Download & use our "Powered By" logos http://www.apache.org/foundation/press/kit/#poweredby

= = =

For real-time updates, sign up for Apache-related news by sending mail to announce-subscribe@apache.org and follow @TheASF on Twitter. For a broader spectrum from the Apache community, https://twitter.com/PlanetApache provides an aggregate of Project activities as well as the personal blogs and tweets of select ASF Committers.

# # #

Monday May 22, 2017

On the State of the Feather

One of the great things about Apache is that we're all about the individual (contributor). No one has higher rank/status over another. We're not pay-to-play: no one can "buy" their way in. Titles are for organizational purposes only: a Vice President of a project doesn't carry any more weight than any other member of a project management committee, for example.

We have diverse backgrounds, opinions, and experiences. Each person has their own preferences and personal style, and we celebrate that. Whilst we do adhere to The Apache Way, we don't impose "corporate conformity" directives on anyone, from our support staff to our executive leadership.

As technologists (and perfectionists), we're trained to look for bugs and are always looking for ways to make things better. And, in keeping with our tenets of openness, our matter-of-fact communication style can sometimes be perceived as too honest and transparent.

In light of that, it might be easy to misinterpret the intent of the State of the Feather presentation by ASF President Sam Ruby at ApacheCon last week:

This isn't another "the ASF is great" presentation where I will talk about how we do things differently/better than others.

Instead, this is a talk where I identify what works and where there is more work that needs to be done.

TL;DR

We've been around for 18 years.

We're continuing to grow by every measure.

We expect to continue to be around.

We expect to continue to grow.

...Perhaps even a bit too fast.

I'm not saying it is easy…


As with any organization managing dramatic business growth, meeting these challenges presents unique opportunities, though doing so is not always easy with an all-volunteer Board overseeing a nearly all-volunteer organization. Luckily for us, we are well-versed in the mantra "If it isn't hard, it isn't worth doing". With more than 18 years of successfully honing our process of developing, incubating, and shepherding projects under our belt, we are well prepared to overcome operational demands.

The Foundation's ongoing transformation is driven by existing Apache projects and an impressive number of new innovations undergoing incubation. The collective Apache community continues to be highly productive, as summarized every week. Our commitment to rise to the challenge is evident, as demonstrated at ApacheCon. We are proud of our achievements and look forward to sharing our successes in the upcoming Annual Report.

# # # 

Friday May 19, 2017

The Apache News Round-up: week ending 19 May 2017

We're wrapping up a great week in Miami at ApacheCon, with thanks to all our attendees, event sponsors, organizers, producers, staff, volunteers, and the greater Apache community of developers, users, and enthusiasts --we miss you already. Here's what happened this week:

Support Apache –if Apache software has helped you, please consider making a donation, no matter the size. Every dollar counts. http://apache.org/foundation/contributing.html

ASF Board –management and oversight of the business and affairs of the corporation in accordance with the Foundation's bylaws.
 - Next Board Meeting: 21 June 2017. Board calendar and minutes http://apache.org/foundation/board/calendar.html

ApacheCon™ –the official conference of the Apache Software Foundation. Tomorrow's Technology Today.
 - Presentations from ApacheCon https://s.apache.org/Hli7 and Apache: Big Data https://s.apache.org/tefE
 - Videos of keynotes + presentations are now available https://s.apache.org/AE3m
 - Soundbites from the conference floor https://feathercast.apache.org/

ASF Infrastructure –our distributed team on four continents keeps the ASF's infrastructure running around the clock.
 - 7M+ weekly checks yield outstanding performance at 99.98% uptime http://status.apache.org/

Apache Archiva™ –an application for managing one or more remote repositories, including administration, artifact handling, browsing and searching.
 - Apache Archiva 2.2.3 released http://archiva.apache.org/

Apache Beam™ –Open Source unified programming model for batch and streaming Big Data processing.
 - The Apache Software Foundation Announces Apache® Beam™ v2.0.0 https://s.apache.org/k5W7

Apache Buildr™ –a build system for Java-based applications, including support for Scala, Groovy and a growing number of JVM languages and tools.
 - Apache Buildr 1.5.3 released http://buildr.apache.org/

Apache CarbonData™ –an indexed columnar data format for fast analytics on Big Data platforms such as Apache Hadoop and Apache Spark.
 - Apache CarbonData 1.1.0 released http://carbondata.apache.org/

Apache Commons™ Compress –library defines an API for working with ar, cpio, Unix dump, tar, zip, gzip, XZ, Pack200, bzip2, 7z, arj, lzma, snappy, DEFLATE, lz4, Brotli and Z files.
 - Apache Commons Compress 1.14 released http://commons.apache.org/compress/

Apache Directory™ Kerby –a Java Kerberos binding.
 - Apache Kerby™ 1.0.0 released http://directory.apache.org/kerby

Apache NiFi™ MiNiFi –provides a complementary data collection approach that supplements the core tenets of NiFi in dataflow management, focusing on the collection of data at the source of its creation.
 - Apache NiFi MiNiFi C++ 0.2.0 released https://nifi.apache.org/minifi

Apache PDFBox™ –an Open Source Java tool for working with PDF documents.
 - Apache PDFBox 2.0.6 released http://pdfbox.apache.org/

Apache Qpid™ –implements the latest AMQP specification, the first open standard for enterprise messaging, and provides transaction management, queuing, distribution, security, management, clustering, federation and heterogeneous multi-platform support and a lot more.
 - Apache Qpid JMS 0.23.0 released http://qpid.apache.org

Apache Samza™ –Open Source Big Data distributed stream processing framework in production at Intuit, LinkedIn, Netflix, Optimizely, Redfin, and Uber, among other organizations.
 - The Apache Software Foundation Announces Apache® Samza™ v0.13 https://s.apache.org/CSbJ

Apache Tomcat™ –an Open Source software implementation of the Java Servlet, JavaServer Pages, Java Unified Expression Language, Java WebSocket and Java Authentication Service Provider Interface for Containers technologies.
 - Apache Tomcat 7.0.78, 8.0.44, and 8.5.15 released http://tomcat.apache.org/

Apache Wicket™ –an Open Source Java component oriented web application framework that powers thousands of Web applications and web sites for governments, stores, universities, cities, banks, email providers, and more.
 - Apache Wicket 7.7.0 and 8.0.0-M6 released http://wicket.apache.org/


Did You Know?

 - Did you know that Autodesk's private Cloud is powered by Apache CloudStack? http://cloudstack.apache.org/

 - Did you know that Formula 1 races generate 1.5 billion data points per race, and use Apache Drill, Flink, Hadoop, HBase, Hive, Kafka, MapReduce, Solr, and Spark for their Big Data architectures? http://events.linuxfoundation.org/sites/events/files/slides/fast_car_big_data_code_motion_carol3.pdf

 - Did you know that over the past month, 1,086 Apache Committers changed 5,147,842 lines of code over 15,487 commits? http://status.apache.org/

Apache Community Notices:

 - "Success at Apache" is a blog series that focuses on the processes behind why the ASF "just works". 1) Project Independence https://s.apache.org/CE0V 2) All Carrot and No Stick https://s.apache.org/ykoG 3) Asynchronous Decision Making https://s.apache.org/PMvk 4) Rule of the Makers https://s.apache.org/yFgQ 5) JFDI --the unconditional love of contributors https://s.apache.org/4pjM 6) Meritocracy and Me https://s.apache.org/tQQh

 - The latest Apache Community Newsletter https://blogs.apache.org/comdev/entry/community-development-news-april-2017

 - Do friend and follow us on the Apache Community Facebook page https://www.facebook.com/ApacheSoftwareFoundation/ and Twitter account https://twitter.com/ApacheCommunity

 - The Apache Phoenix community will be holding PhoenixCon on 13 June in San Francisco https://www.eventbrite.com/e/phoenixcon-2017-tickets-32872245772

 - Catch the Apache Ignite and Spark communities at the In-Memory Computing Summit 20-21 June in Amsterdam and 24-25 October in San Francisco https://imcsummit.org/

 - ASF Operations Summary - Q3 FY2017 https://s.apache.org/NKFz

 - The list of Apache project-related MeetUps can be found at http://apache.org/events/meetups.html

 - Find out how you can participate with Apache community/projects/activities --opportunities open with Apache HTTP Server, Avro, ComDev (community development), Directory, Incubator, OODT, POI, Polygene, Syncope, Tika, Trafodion, and more! https://helpwanted.apache.org/

 - ApacheCon North America + Apache: BigData, CloudStack Collaboration Conference, FlexJS Summit, Apache: IoT, and TomcatCon will be held 16-18 May 2017 in Miami http://apachecon.com/

 - Are your software solutions Powered by Apache? Download & use our "Powered By" logos http://www.apache.org/foundation/press/kit/#poweredby

= = =

For real-time updates, sign up for Apache-related news by sending mail to announce-subscribe@apache.org and follow @TheASF on Twitter. For a broader spectrum from the Apache community, https://twitter.com/PlanetApache provides an aggregate of Project activities as well as the personal blogs and tweets of select ASF Committers.

# # #

Wednesday May 17, 2017

The Apache Software Foundation Announces Apache® Beam™ v2.0.0

Open Source unified programming model for batch and streaming Big Data processing in use at Google Cloud, PayPal, and Talend, among others.

Forest Hill, MD —17 May 2017— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today the availability of Apache® Beam™ v2.0.0, the first stable release of the unified programming model for both batch and streaming Big Data processing.

An Apache Top-Level Project (TLP) since December 2016, Beam includes Java and Python software development kits used to define data processing pipelines and runners to execute them on Apache Apex, Apache Flink, Apache Spark, and Google Cloud Dataflow, among other execution engines.

Apache Beam has its roots in Google's internal work on data processing over the last decade, evolving from the initial MapReduce system, through FlumeJava and MillWheel, into Google Cloud Dataflow v1.x, which defined the unified programming model that became the heart of Apache Beam.

"The first stable release is an important milestone for the Apache Beam community," said Davor Bonaci, Vice President of Apache Beam. "This is a statement from the community that it intends to maintain API stability with all releases for the foreseeable future, making Beam suitable for enterprise deployment."

Apache Beam v2.0.0 improves user experience across the project, focusing on seamless portability across execution environments, including engines, operating systems, on-premise clusters, cloud providers, and data storage systems. Other highlights include:
  • API stability and future compatibility within this major version;
  • Stateful data processing paradigms that unlock efficient, data-dependent computations;
  • Support for user-extensible file systems, with built-in support for Hadoop Distributed File System, among others; and
  • A metrics subsystem for deeper insight into pipeline execution.

Apache Beam is in use at Google Cloud, PayPal, and Talend, among others.

"Apache Beam is a mature data processing API for the enterprise, with powerful semantics that solve real-world challenges of stream processing," said Tomer Pilossof, Big Data Manager at PayPal. "With Beam, we provide data processing solutions for a wide range of customers within the PayPal organization."

"We at Talend are thrilled to have contributed to Apache Beam reaching the 2.0.0 milestone and its first official stable release," said Laurent Bride, Chief Technology Officer at Talend. "Apache Beam is now part of the foundation of Talend products. Recently, we released Talend Data Preparation for Big Data which leverages Beam to create transformation pipelines that are portable across many execution engines. Later this year, we plan to deliver Talend Data Streams, taking the Apache Beam integration one step further by utilizing its powerful streaming semantics. Whether for batch, streaming, or real-time use cases, Apache Beam is a powerful framework that delivers the flexibility and advanced functionality our customers need."

"We congratulate the Apache Beam community for reaching the key milestone of a first stable release," said William Vambenepe, Lead Product Manager for Big Data, Google Cloud. "We look forward to our Google Cloud Dataflow customers taking full advantage of Beam's powerful programming model and newest features to run their data processing pipelines on Google Cloud."

Apache Beam v2.0.0 is making its debut at Apache: Big Data, taking place this week in Miami, FL, with four sessions featuring Apache Beam. Apache Beam will also be highlighted at numerous face-to-face meetups and conferences, including the Future of Data San Jose meetup, Strata Data Conference London, Berlin Buzzwords, and DataWorks Summit San Jose.

"I'd like to invite everyone to try out Apache Beam v2.0.0 today and consider joining our vibrant community," added Bonaci. "We welcome feedback, contribution and participation through our mailing lists, issue tracker, pull requests, and events."

Availability and Oversight
Apache Beam software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Beam, visit https://beam.apache.org/ and https://twitter.com/ApacheBeam

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server -- the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 680 individual Members and 6,000 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cash Store, Cerner, Cloudera, Comcast, Confluent, Facebook, Google, Hortonworks, HP, Huawei, IBM, InMotion Hosting, iSigma, LeaseWeb, Microsoft, ODPi, PhoenixNAP, Pivotal, Private Internet Access, Produban, Red Hat, Serenata Flowers, Target, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "Beam", "Apache Beam", "Apex", "Apache Apex", "Flink", "Apache Flink", "Spark", "Apache Spark", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Monday May 15, 2017

The Apache Software Foundation Announces Apache® Samza™ v0.13

Open Source Big Data distributed stream processing framework in production at Intuit, LinkedIn, Netflix, Optimizely, Redfin, and Uber, among other organizations.

Forest Hill, MD —15 May 2017— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today the availability of Apache® Samza™ v0.13, the latest version of the Open Source Big Data distributed stream processing framework.

An Apache Top-Level Project (TLP) since January 2015, Samza is designed to provide support for fault-tolerant, large scale stream processing. Developers use Apache Samza to write applications that consume streams of data and to help organizations understand and respond to their data in real-time. Apache Samza offers a unified API to process streaming data from pub-sub messaging systems like Apache Kafka and batch data from Apache Hadoop.

"The latest 0.13 release takes Apache Samza's data processing capabilities to the next level with multiple new features," said Yi Pan, Vice President of Apache Samza. "It also improves the simplicity and portability of real-time applications."

Apache Samza powers several real-time data processing needs including real-time analytics on user data, message routing, combating fraud, anomaly detection, performance monitoring, real-time communication, and more. Apache Samza can process up to 1.1 million messages per second on a single machine. v0.13 highlights include:
  • A higher-level API that developers can use to express complex processing pipelines on streams more concisely;
  • Support for running Samza applications as a lightweight embedded library without relying on YARN;
  • Support for flexible deployment options; 
  • Support for rolling upgrade of running Samza applications;
  • Improved monitoring and failure detection using a built-in heart beating mechanism;
  • Enabling better integrations with other cluster-manager frameworks and environments; and
  • Several bug-fixes that improve reliability, stability and robustness of data processing.

Organizations such as Intuit, LinkedIn, Netflix, Optimizely, Redfin, TripAdvisor, and Uber rely on Apache Samza to power complex data architectures that process billions of events each day. A list of user organizations is available at https://cwiki.apache.org/confluence/display/SAMZA/Powered+By

"Apache Samza is a highly performant stream/data processing system that has been battle tested over the years of powering mission critical applications in a wide range of businesses," said Kartik Paramasivam, Head of Streams Infrastructure, and Director of Engineering at LinkedIn. "With this 0.13 release, the power of Samza is no longer limited to YARN based topologies. It can now be used in any hosting environment. In addition, it now has a new higher level API that makes it significantly easier to create arbitrarily complex processing pipelines."

"Apache Samza has been powering near real-time use cases at Uber for the last year and a half," said Chinmay Soman, Staff Software Engineer at Uber. "This ranges from analytical use cases such as understanding business metrics, feature extraction for machine learning as well as some critical applications such as Fraud detection, Surge pricing and Intelligent promotions. Samza has been proven to be robust in production and is currently processing billions of messages per day, accounting for 100s of TB of data flowing through the system."

"At Optimizely, we have built the world’s leading experimentation platform, which ingests billions of click-stream events a day from millions of visitors for analysis," said Vignesh Sukumar, Senior Engineering Manager at Optimizely. "Apache Samza has been a great asset to Optimizely's Event ingestion pipeline allowing us to perform large scale, real time stream computing such as aggregations (e.g. session computations) and data enrichment on a multiple billion events/day scale. The programming model, durability and the close integration with Apache Kafka fit our needs perfectly."

"It has been a phenomenal experience engaging with this vibrant international community of users and contributors, and I look forward to our continued growth. It is a great time to be involved in the project and we welcome new contributors to the Samza community," added Pan.

Catch Apache Samza in action at Apache: Big Data, 16-18 May 2017 in Miami, FL http://apachecon.com/ , where the community will be showcasing how Samza simplifies stream processing at scale.

Availability and Oversight
Apache Samza software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Samza, visit http://samza.apache.org/ , https://blogs.apache.org/samza/ , and https://twitter.com/samzastream

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 680 individual Members and 6,000 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cash Store, Cerner, Cloudera, Comcast, Confluent, Facebook, Google, Hortonworks, HP, Huawei, IBM, InMotion Hosting, iSigma, LeaseWeb, Microsoft, ODPi, PhoenixNAP, Pivotal, Private Internet Access, Produban, Red Hat, Serenata Flowers, Target, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "Hadoop", "Apache Hadoop", "Kafka", "Apache Kafka", "Samza", "Apache Samza", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Friday May 12, 2017

The Apache News Round-up: week ending 12 May 2017

As members of the Apache community are preparing to convene in Miami for ApacheCon next week, we continue to be productive. Here's what we've been up to:

Support Apache –if Apache software has helped you, please consider making a donation, no matter the size. Every dollar counts. http://apache.org/foundation/contributing.html

ASF Board –management and oversight of the business and affairs of the corporation in accordance with the Foundation's bylaws.
 - Next Board Meeting: 17 May 2017. Board calendar and minutes http://apache.org/foundation/board/calendar.html

ASF Infrastructure –our distributed team on four continents keeps the ASF's infrastructure running around the clock.
 - 7M+ weekly checks yield forward performance at 99.82% uptime http://status.apache.org/

ApacheCon™ –Tomorrow's Technology Today: Big Data, CloudStack, Flex, IoT, Tomcat, and dozens of Apache projects across 200+ sessions, 150+ speakers, 4 subconferences, BarCampApache and more. http://apachecon.com/ Register today!
 - Support your favorite Apache projects and communities at ApacheCon by becoming an Apache Community and/or BarCamp Sponsor http://events.linuxfoundation.org/events/apachecon-north-america/sponsors/community-sponsor

Apache Arrow™ –columnar in-memory analytics layer designed to accelerate Big Data.
 - Apache Arrow 0.3.0 released http://arrow.apache.org/

Apache Directory™ Fortress –a standards-based access management system, written in Java, that provides role-based access control, delegated administration and password policy services with LDAP.
 - Apache Fortress 2.0.0-RC2 released http://directory.apache.org/fortress/

Apache HttpComponents™ Client –Java library implementing an HTTP client based on HttpCore components.
 - HttpComponents Client 5.0 alpha2 released http://hc.apache.org/httpcomponents-client/

Apache Ignite™ –In-Memory Data Fabric providing in-memory data caching, partitioning, processing, and querying components.
 - Apache Ignite 2.0.0 released https://ignite.apache.org/

Apache OpenWebBeans™ Meecrowave –a light Apache web server based on Tomcat and OpenWebBeans (à la microprofile fashion).
 - Apache Meecrowave 0.3.1 released http://openwebbeans.apache.org/meecrowave/

Apache NiFi™ –an easy to use, powerful, and reliable system to process and distribute data.
 - Apache NiFi 1.2.0 released https://nifi.apache.org/

Apache Qpid™ Proton –a messaging library for the Advanced Message Queuing Protocol 1.0 (AMQP 1.0, ISO/IEC 19464, http://www.amqp.org).
 - Apache Qpid Proton-J 0.19.0 released http://qpid.apache.org/

Apache Tomcat™ –an Open Source software implementation of the Java Servlet, JavaServer Pages, Java Unified Expression Language, Java WebSocket and JASPIC technologies.
 - Apache Tomcat 9.0.0.M21 released http://tomcat.apache.org/

Apache Trafodion (incubating) –a Web-scale SQL-on-Hadoop solution enabling transactional or operational workloads on Apache Hadoop.
 - Apache Trafodion 2.1.0-incubating released http://trafodion.incubator.apache.org/index.html


Did You Know?

 - Did you know that previews for ApacheCon + Apache: Big Data keynotes and select sessions are available exclusively on Feathercast? https://feathercast.apache.org/

 - Did you know that over the past quarter, Apache source code has been downloaded more than 2M times? https://projects.apache.org/statistics.html

 - Did you know that the RiskCo financial engineering and risk intelligence group uses Apache Wicket? http://wicket.apache.org/


Apache Community Notices:

 - "Success at Apache" is a blog series that focuses on the processes behind why the ASF "just works". 1) Project Independence https://s.apache.org/CE0V 2) All Carrot and No Stick https://s.apache.org/ykoG 3) Asynchronous Decision Making https://s.apache.org/PMvk 4) Rule of the Makers https://s.apache.org/yFgQ 5) JFDI --the unconditional love of contributors https://s.apache.org/4pjM 6) Meritocracy and Me https://s.apache.org/tQQh

 - The latest Apache Community Newsletter https://blogs.apache.org/comdev/entry/community-development-news-april-2017

 - Do friend and follow us on the Apache Community Facebook page https://www.facebook.com/ApacheSoftwareFoundation/ and Twitter account https://twitter.com/ApacheCommunity

 - The Apache Phoenix community will be holding PhoenixCon on 13 June in San Francisco https://www.eventbrite.com/e/phoenixcon-2017-tickets-32872245772

 - ASF Operations Summary - Q3 FY2017 https://s.apache.org/NKFz

 - The list of Apache project-related MeetUps can be found at http://apache.org/events/meetups.html

 - Find out how you can participate with Apache community/projects/activities --opportunities open with Apache HTTP Server, Avro, ComDev (community development), Directory, Incubator, OODT, POI, Polygene, Syncope, Tika, Trafodion, and more! https://helpwanted.apache.org/

 - ApacheCon North America + Apache: BigData, CloudStack Collaboration Conference, FlexJS Summit, Apache: IoT, and TomcatCon will be held 16-18 May 2017 in Miami http://apachecon.com/

 - Are your software solutions Powered by Apache? Download & use our "Powered By" logos http://www.apache.org/foundation/press/kit/#poweredby

= = =

For real-time updates, sign up for Apache-related news by sending mail to announce-subscribe@apache.org and follow @TheASF on Twitter. For a broader spectrum from the Apache community, https://twitter.com/PlanetApache provides an aggregate of Project activities as well as the personal blogs and tweets of select ASF Committers.

# # #

Friday May 05, 2017

The Apache News Round-up: week ending 5 May 2017

Welcome, May! The Apache community is getting ready for ApacheCon, and we hope to see you in Miami soon. Here's what's happened over the past week:

Support Apache –your donations help the world's largest Open Source foundation enrich the lives of countless users and developers. Every dollar counts. http://apache.org/foundation/contributing.html

ASF Board –management and oversight of the business and affairs of the corporation in accordance with the Foundation's bylaws.
 - Next Board Meeting: 17 May 2017. Board calendar and minutes http://apache.org/foundation/board/calendar.html
 - The Apache Software Foundation Welcomes 64 New Members https://s.apache.org/2Vt

Success at Apache –our sixth installment in the monthly blog series that focuses on the processes behind why the ASF "just works".
 - Meritocracy and Me, by Tom Barber https://s.apache.org/tQQh

ASF Infrastructure –our distributed team on four continents keeps the ASF's infrastructure running around the clock.
 - 7M+ weekly checks yield whizbang performance at 99.97% uptime http://status.apache.org/

ApacheCon™ –Tomorrow's Technology Today: Big Data, CloudStack, Flex, IoT, Tomcat, and dozens of Apache projects across 200+ sessions, 150+ speakers, 4 subconferences, BarCampApache and more. http://apachecon.com/ Register today!
 - Support your favorite Apache projects and communities at ApacheCon by becoming an Apache Community and/or BarCamp Sponsor http://events.linuxfoundation.org/events/apachecon-north-america/sponsors/community-sponsor

Apache CarbonData™ –Open Source Big Data analytics accelerator.
 - The Apache Software Foundation Announces Apache® CarbonData™ as a Top-Level Project https://s.apache.org/QmTI

Apache HBase™ –an Open Source, distributed, versioned, non-relational database.
 - Apache HBase 1.1.10 released https://hbase.apache.org/

Apache HttpComponents™ Core –a set of HTTP/1.1 and HTTP/2 transport components that can be used to build custom client and server side HTTP services with a minimal footprint.
 - Apache HttpComponents Core 5.0 alpha3 released http://hc.apache.org/

Apache Juneau (incubating) –a toolkit for marshalling POJOs to a wide variety of content types using a common framework, and for creating sophisticated self-documenting REST interfaces and microservices using very little code.
 - Apache Juneau 6.2.0 (incubating) released http://juneau.incubator.apache.org/

Apache Kylin™ –Open Source petabyte-scale Big Data Distributed Analytics Engine.
 - Apache Kylin 2.0.0 released https://kylin.apache.org/

Apache Mahout™ –Open Source scalable machine learning and data mining library for Big Data artificial intelligence.
 - The Apache Software Foundation Announces Apache® Mahout™ v0.13.0 https://s.apache.org/ioAa

Apache Qpid™ Dispatch –a router for the Advanced Message Queuing Protocol 1.0 (AMQP 1.0, ISO/IEC 19464, http://www.amqp.org).
 - Apache Qpid Dispatch 0.8.0 released http://qpid.apache.org/


Did You Know?

 - Did you know that previews for ApacheCon + Apache: Big Data keynotes and select sessions are available exclusively on Feathercast? https://feathercast.apache.org/

 - Did you know that the Top 5 closers (+ number of issues) over the past 30 days are: Wes McKinney (169), Sean Owen (75), Gavin McDonald (64), Aled Sage (57), and Jesse MacFadyen (55)? Well done, all! http://status.apache.org/

 - Did you know that you can support Apache by shopping at http://smile.amazon.com? 0.5% will be donated to the ASF!

 - Did you know that Air New Zealand's "Ask Oskar" online chatbot is trained by Apache OpenNLP? http://opennlp.apache.org/


Apache Community Notices:

 - "Success at Apache" is a new monthly blog series that focuses on the processes behind why the ASF "just works". 1) Project Independence https://s.apache.org/CE0V 2) All Carrot and No Stick https://s.apache.org/ykoG 3) Asynchronous Decision Making https://s.apache.org/PMvk 4) Rule of the Makers https://s.apache.org/yFgQ 5) JFDI --the unconditional love of contributors https://s.apache.org/4pjM 6) Meritocracy and Me https://s.apache.org/tQQh

 - Introducing the new Apache Community Newsletter https://blogs.apache.org/comdev/entry/community-development-news-march-2017 , Facebook page https://www.facebook.com/ApacheSoftwareFoundation/ , and Twitter account https://twitter.com/ApacheCommunity . Do friend and follow us.

 - ASF Operations Summary - Q3 FY2017 https://s.apache.org/NKFz

 - The list of Apache project-related MeetUps can be found at http://apache.org/events/meetups.html

 - Find out how you can participate with Apache community/projects/activities --opportunities open with Apache HTTP Server, Avro, ComDev (community development), Directory, Incubator, OODT, POI, Polygene, Syncope, Tika, Trafodion, and more! https://helpwanted.apache.org/

 - ApacheCon North America + Apache: BigData, CloudStack Collaboration Conference, FlexJS Summit, Apache: IoT, and TomcatCon will be held 16-18 May 2017 in Miami http://apachecon.com/

 - Are your software solutions Powered by Apache? Download & use our "Powered By" logos http://www.apache.org/foundation/press/kit/#poweredby

= = =

For real-time updates, sign up for Apache-related news by sending mail to announce-subscribe@apache.org and follow @TheASF on Twitter. For a broader spectrum from the Apache community, https://twitter.com/PlanetApache provides an aggregate of Project activities as well as the personal blogs and tweets of select ASF Committers.

# # #

Wednesday May 03, 2017

The Apache Software Foundation Welcomes 64 New Members

The Apache Software Foundation welcomes the following new Members who were elected during the annual ASF Members' Meeting on 28-30 March 2017:

Taher A. Alkhateeb, Ryan Blue, Davor Bonaci, Michael Brohl, Cédric Champeau, Byung-Gon Chun, William Colen, Radu Cotescu, Jaroslaw Cwiklik, Dániel Dékány, Mike Drob, Eric Evans, Olaf Flebbe, Lars Francke, Roberto Galoppini, Robert Gemmell, Jorge Luis Betancourt Gonzalez, Thamme Gowda, Scott Gray, Stefan Hett, Jonathan Hsieh, Claus Ibsen, Bilgin Ismet Ibryam, Jeff Jirsa, Evgeny Kotkov, Andrew Kurth, Chris Lambertus, T Jake Luciani, Nicolas Malin, Stephen Mallette, Karl Heinz Marbaise, Sidney Markowitz, Gary Martin, Jan Materne, Nate McCall, Larry McCay, Robert Munteanu, Kay Ousterhout, Anil Patel, Christine Poerschke, Matt Post, Dominik Psenner, Chris Riccomini, Carlos Rovira, Daniel Ruggeri, Guergana K. Savova, Felix Schumacher, Anthony Shaw, Matt Sicker, Karanjeet Singh, Stian Soiland-Reyes, Lee Moon Soo, Michael Starch, Daniel Takamori, Josh Tynjala, Ashish Vijaywargiya, Jay Vyas, Andrew Wang, Claude Warren, Michael Semb Wever, Evans Ye, Piotr Zarzycki, Jeff Zhang, and Jordan Zimmerman.


When the ASF incorporated in 1999, the Foundation's core membership comprised 21 individuals who oversaw the progress of the Apache HTTP Server. This group grew with Committers --developers who contributed code, patches, or documentation, and were subsequently granted access by the Membership:
  1. to "commit" or "write" (contribute) directly to the code repository;

  2. the right to vote on community-related decisions; and

  3. the ability to propose an active user for Committership.

Those Committers who demonstrate merit in the Foundation’s growth, evolution, and progress are nominated for ASF Membership by existing members.

This election brings the total number of ASF Members to 684 today. Individuals elected as ASF Members legally serve as the "shareholders" of the Foundation https://www.apache.org/foundation/governance/members.html

For more information on how the ASF works, visit http://www.apache.org/foundation/how-it-works.html .

# # # 


Monday May 01, 2017

Success at Apache: Meritocracy and Me.

by Tom Barber

When Sally asked for volunteers to help with a blog post series "Success at Apache" I realised there was a very human story to tell about how the ASF helped me get to where I am today and hopefully where I'll go tomorrow. Over the years I have worked on and run a number of Open Source projects whilst working with an awful lot of Open Source software. One day I was browsing Slashdot as you do, yeah I know a lot of people disparage it, but it's an awfully hard habit to kick, and without it I wouldn't have got involved in the ASF so I owe it a lot. Anyway, one day when browsing Slashdot I saw this article (https://it.slashdot.org/story/11/01/08/1544204/apache-to-steward-nasa-built-middleware), I had been working in the Open Source business intelligence industry for a few years at that point and I spent a lot of time hacking around and managing data systems, so I wondered how I could get some help out of OODT (http://oodt.apache.org/). Also as a kid I had always loved everything about space, I was a huge Apollo fan, had a small telescope, went to the total eclipse in the UK in 1999 and so on. I thought this OODT project would be a fun way for me to chat nonsense to a few NASA employees, find out how they did stuff and do a bit of Open Source hacking on the side, which would at least let me participate in some NASA related development work, and so it began.

For those of you who haven't heard of Apache OODT, it is a middleware layer for building data systems. Originally written by NASA JPL and then Open Sourced to The Apache Software Foundation, it provides data ingest capabilities, metadata extraction, data workflows and resource management. I started by asking pretty dumb questions on the IRC channel, posting stuff on the mailing lists and trying to figure out how this reasonably expansive stack of software even operated. Chris Mattmann and Sean Kelly guided me through the opening foray into OODT development and education. Eventually, having submitted a few bug fixes, I volunteered to be a release manager for an OODT release and that got me more involved. Not too long after, Sean asked if I fancied having a go at being the project chair, which I duly accepted. Behind the scenes cogs were turning and in a matter of a few weeks, I'd gone from a committer and PMC member to Chair to ASF member. It was certainly a hectic time, trying to keep up with all the new things I had to do, mailing lists to follow and so on, but what a period in my ASF experience, lots of fun!

That was just over 2 years ago and I'm still happily stewarding the OODT folks and keeping the cogs turning, releases happening and Jira tickets triaged. Alongside that it is truly an honour to be involved with the ASF as a member and although the politics can get tedious, the foundation is an amazing place for people to learn to work on great software as part of a distributed team. 18 months ago I was getting a little jaded with the monotony of the BI work I was doing, there are only so many sales databases and budget reports one guy can take and after being in BI 8 or so years I felt like it was time for a change of scenery, I just didn't know what. So I blasted out an email to 10 or so people I knew or had had some contact with over the years who might be able to give me a job, or know someone who was looking for a Java developer, BI guy, Open Source advocate, that type of thing. I'd included some OODT folks on my email, not because I thought there was a chance of a job using it, but just in case they happened to know someone out in California needing some remote help. Everyone said no, except Chris Mattmann who said if I could hold on a few months he might have something for me at NASA! That response floored me, I'd never even considered that as an option and knew that with a young family it would be highly unlikely I'd be able to move to California, so I played along assuming it would fall through. But as the process dragged on and contracts got drawn up we got closer and closer to it becoming real and the excitement grew, there was the tangible possibility of me fulfilling at least a bit of a lifelong dream, no I wouldn't be an astronaut, but there was the chance of employment by NASA.

Eventually 6 months or so later, the paperwork was signed and I joined the ranks at NASA JPL, working as an Apache OODT and devops guy. What is great is that having 10 years of business and development experience, I feel like I can very much make a positive contribution to the team, and in part that is down to what I have learnt developing and coding at the ASF. It has been an amazing experience and a wonderful 12 months. I never dreamt an opportunity like that would arise and it is 100% down to the great work the ASF does in stewarding new projects through the incubation process and into mainstream adoption. Without the ASF I would likely still be a BI guy dealing with run-of-the-mill data warehouses, instead I work on Genomics Search Engines, help hunt criminals on the dark web and a host of other stuff. Life sometimes throws you an opportunity that you don't expect and the ASF certainly facilitated that.

Last week I was in Pasadena, visiting the JPL facility and getting the guided tour, and doing a bit of work. It was amazing talking to such a dedicated group of people who clearly have a big passion for what they do. Getting to see their mission control, the Mars rovers and various satellite mock-ups was awe-inspiring but what excited me the most was getting to sit down and pick the brains of people with whom I have worked at the ASF for years yet not met in the flesh. Finally making that human connection means a lot.

What the ASF offers here is the ability to learn to work as a distributed team without the pressures of the "real world". Everyone at the ASF, pretty much, is a volunteer and other volunteers recognise that, and so it reduces the pressure, but whilst reducing the pressure it teaches you how to make binding decisions as a disparate group, how to keep records and how to ship good quality code whilst living in different timezones. At the ASF some of us might meet once or twice a year at ApacheCon, FOSDEM or elsewhere, but largely all communication is done via mailing list. This can cause issues when people "just want to get it done" but it also provides an immutable record of what is going on in a project and who said what. This proves equally useful out in the "real world" where you want to track business decisions or look up historical records of why a certain choice was made. Dealing with people who you don't work with on a daily basis also helps you think more about your communication style, what is fine to say and what isn't and also how you structure your communications, which is also very important in a business setting. Do you know the person? Do they understand your nuances? Is English their native language? etc.

The other thing I find the ASF offers is understanding. Last year I was diagnosed with Asperger's at the age of 33, which is pretty late. What is nice is that generally, people like to listen, and if you have something that affects your personal or professional life, people who you've met at the ASF will often lend an understanding ear to allow you to offload or discuss something completely unrelated to the project you might be working on. Or like me, you can just stand up at the front of an ApacheCon lightning talk and tell everyone! Either way, you can generally find someone in the Apache family who will provide a sounding board for anything you want to discuss.

These days I spend my spare time still working on OODT stuff, but also doing a lot of public speaking and mentoring and whenever I do I make sure I talk up the Apache Software Foundation because it has given me the chance of a lifetime and one that I'll be forever grateful for. If you aren't involved in development here at the ASF, get involved, you don't have to be a coder, you just need to like helping out in a fun, Open Source community.

As I mentioned at the start, this blog series is about success at Apache, hopefully this proves that success can come in a number of ways, the ASF was selected by NASA as the home for its data middleware platform, that proves that NASA deemed the incubation process, the license and ecosystem acceptable, that is success for the Apache Foundation. Similarly the foundation has proved very successful in placing people into employment from a range of different walks of life into new lines of work, and that is exactly what happened to me and the reason I wanted to share my story about success at Apache.

= = =

"Success at Apache" is a new monthly blog series that focuses on the processes behind why the ASF "just works". 1) Project Independence https://s.apache.org/CE0V 2) All Carrot and No Stick https://s.apache.org/ykoG 3) Asynchronous Decision Making https://s.apache.org/PMvk 4) Rule of the Makers https://s.apache.org/yFgQ 5) JFDI --the unconditional love of contributors https://s.apache.org/4pjM

The Apache Software Foundation Announces Apache® CarbonData™ as a Top-Level Project

Open Source Big Data analytics accelerator in use at Bank of Communications, Hulu, Huawei, SAIC Motor, Zhejiang Mobile, among others.

Forest Hill, MD –1 May 2017– The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Apache® CarbonData™ has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles.

Apache CarbonData is an indexed columnar store file format for fast analytics on Big Data platforms (including Apache Hadoop, Apache Spark, among others) to help speed up queries by an order of magnitude over petabytes of data.

"We are very proud to complete the incubation process and graduate as an Apache Top-Level Project," said Liang Chen, Vice President of Apache CarbonData. "The CarbonData community grew rapidly over last ten months, both in terms of size and diversity. Since entering the Apache Incubator, we have completed 4 releases, and exceeded 90 contributors from 10 different organizations."

With the aim of using a unified file format to satisfy all kinds of data analysis cases, Apache CarbonData seamlessly integrates with Hadoop and Spark to improve Big Data analysis efficiency. In benchmarks, CarbonData's faster interactive query helps speed up queries approximately 10x compared with standard column-oriented SQL on Hadoop data stores.

Highlights include:

  • Unique data organization to allow faster filtering and better compression;
  • Multi-level Indexing to enable faster search and speed up query processing;
  • Deep Apache Spark Integration for dataframe + SQL compliance;
  • Advanced push down optimization to minimize the amount of data being read, processed, converted, transmitted, and shuffled;
  • Efficient compression and global encoding schemes to further improve aggregation query performance;
  • Dictionary encoding for reduced storage space and faster processing; and
  • Data update + delete support using standard SQL syntax.


Apache CarbonData is in use at an array of organizations, including Bank of Communications, medical/pharma social platform DXY, Hulu, Huawei, group online retailer MEITUAN, SAIC Motor, Zhejiang Mobile, among others.

"CarbonData has very good performance as a ‘SQL on Hadoop’ solution," said Tan Sheng, Director of SAIC Motor’s Big Data team. "It is suitable for SAIC Motor to adopt as a central Big Data platform component. Not only do we use Apache CarbonData, we also actively participate in its community as contributors." 

"Apache CarbonData is great, as helped our audit business to improve 7-10X performance based on 14 billion rows of data," said Wei Zhao, Senior Engineer at Bank of Communications.

"Apache CarbonData is very suitable for our filter query cases, and has averaged 20x improvement on performance," said William Zhu, Architecture team member at DXY. "And, as CarbonData supports data update and delete, this feature is very useful. We would consider CarbonData as our all-in-one solution to unify all analysis data."

CarbonData was first developed at Huawei in 2013. The project was submitted to the Apache Incubator in June 2016, and had its first official release two months later. The project won top honors in the BlackDuck 2016 Open Source Rookies of the Year's Big Data category.

"Apache CarbonData is a great example of the value of the incubation process," said Jean-Baptiste Onofré, Apache CarbonData Incubator Mentor and Project Management Committee member. "Helping grow the CarbonData developer and user communities has increased our visibility, which allowed us to extend our use cases and tests, and gather new ideas. The initial CarbonData committers did (and are still doing) great work to welcome new users and contributors, clearly understanding it's a step forward for the project."

"We will continue to put our efforts towards optimizing data format efficiency for Big Data ecosystem and provide an unified and high performance data storage solution," added Liang. "The Apache CarbonData community welcomes interested contributors to work with us on our journey forward."

Catch Apache CarbonData in action at ApacheCon (16-18 May/Miami), and Spark Summit (5-7 June/San Francisco).

Availability and Oversight
Apache CarbonData software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache CarbonData, visit http://carbondata.apache.org/ , https://twitter.com/ApacheCarbonDat , and https://www.facebook.com/carbondata/

About the Apache Incubator
The Apache Incubator is the entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects wishing to join the ASF enter through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 620 individual Members and 6,000 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cash Store, Cerner, Cloudera, Comcast, Confluent, Facebook, Google, Hortonworks, HP, Huawei, IBM, InMotion Hosting, iSigma, LeaseWeb, Microsoft, ODPi, PhoenixNAP, Pivotal, Private Internet Access, Produban, Red Hat, Serenata Flowers, Target, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "CarbonData", "Apache CarbonData", "Hadoop", "Apache Hadoop", "Spark", "Apache Spark", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # # 

The Apache Software Foundation Announces Apache® Mahout™ v0.13.0

Open Source scalable machine learning and data mining library for Big Data artificial intelligence now more powerful and easier to use.

Forest Hill, MD –1 May 2017– The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today the availability of Apache® Mahout™ v0.13.0, the latest version of the Open Source scalable machine learning library.

Apache Mahout provides an environment for quickly creating machine-learning applications that scale and run on the highest-performance parallel computation engines available. Mahout is the first scalable generalized tensor and linear algebra solving engine taking data scientists from interactive experiments to production use.

"Apache Mahout 0.13.0 is more powerful with its new algorithm framework that allows for easier implementation of machine learning algorithms," said Andrew Palumbo, Vice President of Apache Mahout. "The enhanced Mahout code base and development framework make machine learning even more accessible, which is a game changer in the field of artificial intelligence."

Mahout provides a wide variety of premade algorithms (Matrix Factorization, QR via ALS, SSVD, PCA, etc.) for Scala + Apache Spark, H2O, and Apache Flink, as well as on-GPU compute for performance improvements in very large tensor math. Apache Mahout provides the data science tools to automatically find meaningful patterns in Big Data sets by supporting the following main data science use cases:
  • Collaborative filtering – mines user behavior and makes product recommendations (such as eCommerce product recommenders);
  • Regression – estimates a numerical value based on values of other inputs;
  • Clustering – takes items in a particular class (such as Web pages or newspaper articles) and organizes them into naturally occurring groups, such that items belonging to the same group are similar to each other; and
  • Classifying – learns from existing categorizations and then assigns unclassified items to the best category.

New in v0.13.0
Apache Mahout now makes it easier to do matrix math on graphics cards, which is relevant for most modern machine-learning and deep-learning methods. In addition, v0.13.0 allows shared-nothing computation on GPUs, on multi-core CPUs, or in the JVM as appropriate, as well as a simplified framework for building new algorithms. As Mahout comprises an interactive environment and library that support generalized scalable linear algebra and include many modern machine-learning algorithms, the project has also collaborated with developers on other projects, including the Open Source linear algebra library ViennaCL, the Java wrapper library interface JavaCPP, and the graphics processor technology manufacturer NVIDIA to add CUDA bindings directly into Mahout for simplicity of development.

The v0.13.0 release reflects 62 separate JIRA issues from v0.12.2, including numerous enhancements to Mahout-Samsara, the vector math experimentation environment with R-like syntax that works at scale. Complete release notes are at http://mahout.apache.org/release-notes/Apache-Mahout-0.13.0-Release-Notes.pdf

Future versions of Mahout will include support for native iterative solvers, a more robust algorithm library, and smarter probing and optimization of multiplications, among other features.

A comprehensive list of users of Apache Mahout is available at https://mahout.apache.org/general/powered-by-mahout.html ; current users are mostly researchers and developers actively involved in building distributed machine-learning pipelines and tools.

"We thank our community of developers and users who helped make this milestone release possible, and welcome new contributors to help us advance machine learning," added Palumbo.

Catch Apache Mahout in action at Apache: Big Data, where attendees learn first-hand from many original project creators and companies from the greater Mahout community. Apache: Big Data will be held 16-18 May 2017 in Miami, FL. To register, and for more information, visit http://apachecon.com/

Availability and Oversight
Apache Mahout software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Mahout, visit http://mahout.apache.org/ and https://twitter.com/ApacheMahout

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 620 individual Members and 6,000 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cash Store, Cerner, Cloudera, Comcast, Confluent, Facebook, Google, Hortonworks, HP, Huawei, IBM, InMotion Hosting, iSigma, LeaseWeb, Microsoft, ODPi, PhoenixNAP, Pivotal, Private Internet Access, Produban, Red Hat, Serenata Flowers, Target, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "Flink", "Apache Flink", "Mahout", "Apache Mahout", "Spark", "Apache Spark", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #
