Entries tagged [source]

Monday February 12, 2018

The Apache Software Foundation Announces Apache® CloudStack® v4.11

Mature Open Source Enterprise Cloud platform powers billions of dollars in transactions for the world's largest Cloud providers.

Wakefield, MA —12 February 2018— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today Apache® CloudStack® v4.11, the latest version of the turnkey enterprise Cloud orchestration platform.

Apache CloudStack is the proven, highly scalable, and easy-to-deploy IaaS platform used for rapidly creating private, public, and hybrid Cloud environments. Thousands of large-scale public Cloud providers and enterprise organizations use Apache CloudStack to enable billions of dollars worth of business transactions annually across their clouds.

"This is another great release," said Wido den Hollander, Vice President of Apache CloudStack. "The community has worked very hard to develop great new features and many enhancements to CloudStack. Together we are making the project better every single day."

Apache CloudStack v4.11 delivers more than 250 new capabilities and improvements spanning integration, stability, storage, networking, and performance. Highlights include:

  • new Host HA framework with an HA provider for KVM-powered clouds;
  • new certificate authority framework to support ongoing work in container and application clusters;
  • integration with Prometheus;
  • a connector for Cloudian Hyperstore S3 storage;
  • deeper integration with Nuage SDN; and
  • Layer 2 Networking capabilities.
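
All of these capabilities are exposed through CloudStack's signed HTTP API. As a rough illustration of how an operator might script against it, for example to list hosts while monitoring the new Host HA framework, here is a minimal Java sketch of CloudStack's documented request-signing scheme (an HMAC-SHA1 signature over the alphabetically sorted, lower-cased query string); the endpoint and keys are placeholders, the encoding is simplified, and error handling is omitted.

    import java.net.URLEncoder;
    import java.util.Base64;
    import java.util.Map;
    import java.util.TreeMap;
    import javax.crypto.Mac;
    import javax.crypto.spec.SecretKeySpec;

    public class CloudStackApiClient {

        // Builds a signed CloudStack API URL. CloudStack verifies an HMAC-SHA1
        // signature computed over the sorted, lower-cased query string.
        static String signedUrl(String endpoint, String apiKey, String secretKey,
                                TreeMap<String, String> params) throws Exception {
            params.put("apikey", apiKey);
            params.put("response", "json");

            // TreeMap iterates keys in sorted order, as the signing scheme requires.
            StringBuilder query = new StringBuilder();
            for (Map.Entry<String, String> e : params.entrySet()) {
                if (query.length() > 0) query.append('&');
                // Simplified encoding; production clients also replace "+" with "%20".
                query.append(e.getKey()).append('=')
                     .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            }

            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(secretKey.getBytes("UTF-8"), "HmacSHA1"));
            byte[] digest = mac.doFinal(query.toString().toLowerCase().getBytes("UTF-8"));
            String signature = URLEncoder.encode(
                    Base64.getEncoder().encodeToString(digest), "UTF-8");

            return endpoint + "?" + query + "&signature=" + signature;
        }

        public static void main(String[] args) throws Exception {
            TreeMap<String, String> params = new TreeMap<String, String>();
            params.put("command", "listHosts"); // a standard CloudStack API command
            System.out.println(signedUrl("http://mgmt.example.com:8080/client/api",
                                         "YOUR_API_KEY", "YOUR_SECRET_KEY", params));
        }
    }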


The full list of new features can be found in the project release notes at http://docs.cloudstack.apache.org/projects/cloudstack-release-notes/en/4.11.0.0/

"This release has been driven by the people operating CloudStack clouds,” said Rohit Yadav, Apache CloudStack v4.11 Release Manager. "I would like to thank the contributors across all of these organizations for supporting this release, which reflects both the user-driven nature of our community and the Apache CloudStack project's commitment to continue to be the most stable, easily deployable, scalable Open Source platform for IaaS. Along with great new features, v4.11 brings several important structural changes such as better support for systemd and Java 8, migration to embedded Jetty, and a new and optimized Debian 9 based systemvm template."

Apache CloudStack powers numerous elastic Cloud computing services, including solutions that have ranked as Gartner Magic Quadrant leaders. Highlighted in the Forrester Q4 2017 Enterprise Open Source Cloud Adoption report, Apache CloudStack "sits beneath hundreds of service provider clouds", including Fortune 5 multinational corporations. A list of known Apache CloudStack users is available at http://cloudstack.apache.org/users.html

"Apache CloudStack release 4.11 has impressive new content," said Kris Sterckx, CloudStack development lead at Nuage Networks. "More than any previous release, v4.11 leverages the value of Nuage Networks SDN with Nuage 5.0 support, SDN managed networks, per-interface DHCP options support and automation support for migrating deployed Apache CloudStack clouds from traditional Linux bridge networking to SDN/OVS based networking."

"At Interoute, we depend on Apache Cloudstack to help us deliver innovative and reliable IaaS services to our global customers," said Alex Mattioli, Chief Cloud Architect at Interoute. "The functionality in the 4.11 release shows that Cloudstack continues to be the most operator-focused IaaS platform available for large service providers, with a development community who are able to quickly develop what the market wants. Many of the features in this release have been created directly based on the needs of Interoute and our customers."

"Apache CloudStack 4.11 continues to bring innovative features and functionality to market through collaborative Open Source development," said Simon Weller, Director of Technology at Education Networks America. "We're particularly excited by the Host-HA framework, which brings a much greater level of hypervisor automation to KVM based service providers."

"Feedback from our community helps us solve real-world problems and strengthens our development process," added den Hollander. "I look forward to Apache CloudStack users and developers continuing to work closely together to make future releases even better!"

Catch Apache CloudStack in action 28 February 2018 at the German CloudStack Meetup in Frankfurt.

Availability and Oversight
Apache CloudStack software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache CloudStack, visit http://cloudstack.apache.org/ and https://twitter.com/CloudStack

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server —the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 680 individual Members and 6,500 Committers across six continents successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Aetna, Alibaba Cloud Computing, ARM, Baidu, Bloomberg, Budget Direct, Capital One, Cash Store, Cerner, Cloudera, Comcast, Facebook, Google, Hortonworks, Huawei, IBM, Inspur, iSIGMA, ODPi, LeaseWeb, Microsoft, PhoenixNAP, Pivotal, Private Internet Access, Red Hat, Target, Union Investment, and Yahoo. For more information, visit http://apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "CloudStack", "Apache CloudStack", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Thursday December 14, 2017

The Apache Software Foundation Announces Apache® Hadoop® v3.0.0 General Availability

Ubiquitous Open Source enterprise framework maintains decade-long leading role in $100B annual Big Data market

Forest Hill, MD —14 December 2017— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, today announced Apache® Hadoop® v3.0.0, the latest version of the Open Source software framework for reliable, scalable, distributed computing.

Over the past decade, Apache Hadoop has become ubiquitous within the greater Big Data ecosystem by enabling firms to run and manage data applications on large hardware clusters in a distributed computing environment.

"This latest release unlocks several years of development from the Apache community," said Chris Douglas, Vice President of Apache Hadoop. "The platform continues to evolve with hardware trends and to accommodate new workloads beyond batch analytics, particularly real-time queries and long-running services. At the same time, our Open Source contributors have adapted Apache Hadoop to a wide range of deployment environments, including the Cloud."

"Hadoop 3 is a major milestone for the project, and our biggest release ever," said Andrew Wang, Apache Hadoop 3 release manager. "It represents the combined efforts of hundreds of contributors over the five years since Hadoop 2. I'm looking forward to how our users will benefit from new features in the release that improve the efficiency, scalability, and reliability of the platform."

Apache Hadoop 3.0.0 highlights include:
  • HDFS erasure coding —halves the storage cost of HDFS while also improving data durability;
  • YARN Timeline Service v.2 (preview) —improves the scalability, reliability, and usability of the Timeline Service;
  • YARN resource types —enables scheduling of additional resources, such as disks and GPUs, for better integration with machine learning and container workloads;
  • Federation of YARN and HDFS subclusters transparently scales Hadoop to tens of thousands of machines;
  • Opportunistic container execution improves resource utilization and increases task throughput for short-lived containers. In addition to its traditional, central scheduler, YARN also supports distributed scheduling of opportunistic containers; and 
  • Improved capabilities and performance improvements for cloud storage systems such as Amazon S3 (S3Guard), Microsoft Azure Data Lake, and Aliyun Object Storage System.
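
To make the erasure coding item above concrete, here is a short, hedged Java sketch that applies the built-in RS-6-3-1024k Reed-Solomon policy to a directory through the HDFS API; the cluster configuration and path are placeholders, and the policy must be enabled on the cluster.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hdfs.DistributedFileSystem;

    public class ErasureCodingExample {
        public static void main(String[] args) throws Exception {
            // Assumes fs.defaultFS points at a Hadoop 3 cluster.
            Configuration conf = new Configuration();
            DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);

            // Files written under this directory are striped into 6 data and
            // 3 parity blocks, roughly halving the overhead of 3x replication
            // while tolerating the loss of any 3 blocks.
            Path coldData = new Path("/warehouse/cold-data"); // placeholder path
            dfs.mkdirs(coldData);
            dfs.setErasureCodingPolicy(coldData, "RS-6-3-1024k");

            System.out.println("Policy: " + dfs.getErasureCodingPolicy(coldData));
        }
    }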

Hadoop 3.0.0 has already undergone extensive testing and integration with the broader Open Source ecosystem at The Apache Software Foundation. With this release, its community of developers and users promotes the release series from beta to general availability.

Apache Hadoop is widely deployed at numerous enterprises and institutions worldwide, such as Adobe, Alibaba, Amazon Web Services, AOL, Apple, Capital One, Cloudera, Cornell University, eBay, ESA Calvalus satellite mission, Facebook, foursquare, Google, Hortonworks, HP, Hulu, IBM, Intel, LinkedIn, Microsoft, Netflix, The New York Times, Rackspace, Rakuten, SAP, Tencent, Teradata, Tesla Motors, Twitter, Uber, and Yahoo. The project maintains a list of known users at https://wiki.apache.org/hadoop/PoweredBy

"It's tremendous to see this significant progress, from the raw tool of eleven years ago, to the mature software in today's release," said Doug Cutting, original co-creator of Apache Hadoop. "With this milestone, Hadoop better meets the requirements of its growing role in enterprise data systems.  The Open Source community continues to respond to industrial demands."

Apache Hadoop's diverse community enjoys continued growth amongst the ASF's most active projects, and remains at the forefront of more than three dozen Apache Big Data projects.

Apache Hadoop committer history

Apache Hadoop has received countless awards, including top prizes at the Media Guardian Innovation Awards and Duke's Choice Awards, and has been hailed by industry analysts:

"...the lifeblood of organizational analytics…" —Gartner

"Hadoop Is Here To Stay" —Forrester

"...today Hadoop is the only cost-sensible and scalable open source alternative to commercially available Big Data management packages. It also becomes an integral part of almost any commercially available Big Data solution and de-facto industry standard for business intelligence (BI)." —MarketAnalysis.com/Market Research Media

"...commanding half of big data’s $100 billion annual market value...Hadoop is the go-to big data framework." —BigDataWeek.com

"Hadoop, and its associated tools, is currently the 'big beast' of the big data world and the Hadoop environment is undergoing rapid development..." —Bloor Research


"The opportunity to effect meaningful, even fundamental change in the Apache Hadoop project remains open," added Douglas. "Our new contributors uprooted the project from its historical strength in Web-scale analytics by introducing powerful, proven abstractions for data management, security, containerization, and isolation. Apache Hadoop drives innovation in Big Data by growing its community. We hope this latest release continues to draw developers, operators, and users to the ASF."

Catch Apache Hadoop in action at the Strata Data Conference in San Jose, CA, 5-8 March 2018, and at dozens of Hadoop Meetups held around the world.

Availability and Oversight
Apache Hadoop software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Hadoop, visit http://hadoop.apache.org/

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server —the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 680 individual Members and 6,300 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cash Store, Cerner, Cloudera, Comcast, Facebook, Google, Hortonworks, Huawei, IBM, Inspur, iSIGMA, ODPi, LeaseWeb, Microsoft, PhoenixNAP, Pivotal, Private Internet Access, Red Hat, Serenata Flowers, Target, Union Investment, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "Hadoop", "Apache Hadoop", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Wednesday December 13, 2017

The Apache Software Foundation Announces Apache® Mnemonic™ as a Top-Level Project

Open Source storage-class-memory-oriented durable object platform for Java application developers, in use across an array of industries including eCommerce, Financial Services, and Semiconductors, among others.

Forest Hill, MD —13 December 2017— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Apache® Mnemonic™ has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles.

Apache Mnemonic is an Open Source, Java-based, storage-class-memory-oriented durable object platform for linked object processing and analytics. Durable objects can also be accessed directly from other programming languages (e.g., C/C++), and the durable object model and durable computing model implemented by the library can lead to new cache-less and SerDe-less (Serializer- and Deserializer-free) architectures for high-performance applications and frameworks.

"The Mnemonic community continues to explore new ways to significantly improve the performance of real-time Big Data processing/analytics," said Gang "Gary" Wang, Vice President of Apache Mnemonic. "We worked hard to develop both our code and community the Apache Way, and are honored to graduate as an Apache Top-Level Project."

"Apache Mnemonic fills the void of the ability to directly persist on-heap objects, making it beneficial for use in production to accelerate Big Data processing applications at several large organizations," said Henry Saputra, ASF Member and Apache Mnemonic Incubating Mentor. "I am pleased how the community has grown and quickly embraced the Apache Way of software development and making progressive releases. It has been a great experience to be part of this project."

Mnemonic addresses Big Data performance issues that include serialization, caching, computing bottlenecks, and persistence using next-generation, non-volatile memory (NVM) storage media. Apache Mnemonic abstracts system memory, storage-class memory, and even traditional storage as hybrid memory services. Mnemonic's performance-oriented architecture features include:

  • Unified platform enabling framework;
  • Unique durable object model and computing model;
  • Flexible and extensible focal point for optimization; and 
  • Easy integration with Big Data projects such as Apache Hadoop and Apache Spark.
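
To give a flavor of the durable object model, here is a hedged Java sketch written in the style of Mnemonic's published durable entity examples; the annotations, allocator service name, and generated PersonFactory shown below are assumptions drawn from those examples, not a definitive API reference.

    // A hedged sketch in the style of Apache Mnemonic's durable entity
    // examples; annotation and factory names are assumptions.
    @DurableEntity
    public abstract class Person implements Durable {
        // Values live in storage-class memory; no serializer/deserializer
        // (SerDe) step is needed to persist or reload them.
        @DurableGetter
        public abstract String getName();
        @DurableSetter
        public abstract void setName(String name);

        @DurableGetter
        public abstract Person getMother();   // a durable object link
        @DurableSetter
        public abstract void setMother(Person mother);
    }

    // Assumed usage: allocate from a persistent-memory backed pool; the
    // factory is generated from the annotated entity at build time.
    //   NonVolatileMemAllocator pool = new NonVolatileMemAllocator(
    //           Utils.getNonVolatileMemoryAllocatorService("pmalloc"),
    //           1024 * 1024 * 8, "./person_pool.dat", true);
    //   Person alice = PersonFactory.create(pool);
    //   alice.setName("Alice");   // written durably, no SerDe step
    //   alice.getMother();        // traversed in place, no cache layer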

"Apache Mnemonic provides a unified interface for memory management," said Yanhui Zhao, Apache Mnemonic Committer. "It is playing a significant role in reshaping the memory management in current computer architecture along with the developments of large capacity NVMs, making a smooth transition from present mechanical-based storage to flash-based storage with the minimum cost."

"Apache Mnemonic provides intuitive abstractions and APIs to help make non-volatile memory a more natural and integrated part of data system development," said Wes McKinney, Software Architect at Two Sigma Investments and member of the Apache Arrow Project Management Committee.

Apache Mnemonic is in use by many industries, including eCommerce, Financial Services, and Semiconductors, among others.

"Next generation compute platforms will be dominated by technologies like non-volatile memory (NVM). As NVMs proliferate, we will need to revisit the memory access and the computation models," said Debojyoti Dutta, Distinguished Engineer at Cisco, and member of the Apache Metron and Mnemonic Project Management Committees. "Apache Mnemonic fills the gap around an urgent need to unify the memory management for JVM based applications. Given the proliferation of JVM based data intensive platforms, I expect Mnemonic to have a profound impact in leveraging NVMs for data workloads."

"Apache Mnemonic project will help in building memory based storage systems with the modern big memory storages," said Uma Maheswara Rao G, ASF Member, and member of the Apache Incubator and Hadoop Project Management Committees. "One of the key and useful goal is to avoid the serde overheads while storing and accessing durable objects. The Unified interface of Mnemonic allow us to leverage different type of storage services, that allow applications to use storage services transparently."

"Today’s challenge of data processing from different persistence layers is a big rock for application to manipulate easily and quickly, especially in the world of hybrid from on-premises to in the Cloud," said Luke Han, CEO of Kylingence, ASF Member, and Vice President of Apache Kylin. "Apache Mnemonic brings a way simplified such investment for it, which saved a lot of efforts to unify underlying storage options and speed up project implementation very much."

"We invite individuals interested in Apache Mnemonic to join our mailing lists and contribute to the project," added Wang. "We welcome user feedback across deployments of all scales."

Availability and Oversight
Apache Mnemonic software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Mnemonic, visit http://mnemonic.apache.org/ and https://twitter.com/ApacheMnemonic

About the Apache Incubator
The Apache Incubator is the entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects wishing to join the ASF enter through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server —the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 680 individual Members and 6,300 Committers across six continents successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cash Store, Cerner, Cloudera, Comcast, Facebook, Google, Hortonworks, Huawei, IBM, Inspur, iSIGMA, ODPi, LeaseWeb, Microsoft, PhoenixNAP, Pivotal, Private Internet Access, Red Hat, Serenata Flowers, Target, Union Investment, WANdisco, and Yahoo. For more information, visit http://apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "Mnemonic", "Apache Mnemonic", "Arrow", "Apache Arrow", "Hadoop", "Apache Hadoop", "Metron", "Apache Metron", "Spark", "Apache Spark", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

Thursday October 19, 2017

The Apache Software Foundation Announces Five Years of Apache® OpenOffice™ as a Top-Level Project

Latest, secure version of leading Open Source office application and personal productivity suite for Windows, Linux, and Mac now available in 41 languages.

Forest Hill, MD —19 October 2017— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today the five-year anniversary of Apache® OpenOffice™, the leading Open Source office document productivity suite.

"OpenOffice has been downloaded by millions of users since becoming an Apache project five years ago," said Marcus Lange, Vice President of Apache OpenOffice. "We are extremely proud of our community of loyal users and developers who are committed to the future of OpenOffice. We are inspired by their encouragement and thank them by making the next version of the world's leading Open Source productivity suite even better."

With more than 225 million downloads, Apache OpenOffice includes the following applications:
  1. "Writer" - a word processor;
  2. "Calc" - a spreadsheet tool;
  3. "Impress" - a presentation editor;
  4. "Draw" - a vector graphics editor; 
  5. "Math" - a mathematical formula editor; and 
  6. "Base" - a database management program. 

Apache OpenOffice is available in 41 languages on Windows, macOS, and Linux.

In celebration of OpenOffice's triple anniversary this month —17 years as an Open Source project, 6 years at the ASF, and 5 years as an ASF Top-Level Project— the Apache OpenOffice Project Management Committee also announced the immediate availability of Apache OpenOffice 4.1.4, which reflects changes that include:
  • Several updates for language dictionaries
  • Some translation fixes in the UI
  • Bug fixes
  • Security improvements
  • Updated graphics/logos (new Apache feather)
  • Enhancements to the build tools (for developers)

The complete list of changes and new features is available at https://s.apache.org/AOO-414changes ; users are encouraged to download the official version from https://www.openoffice.org/download/

Apache OpenOffice is used by millions of organizations, institutions, and individuals around the world. OpenOffice also plays an integral role in many governments, in response to their mandates to use files in the ISO/IEC standard Open Document Format (ODF). OpenOffice supports localized versions in more than 120 languages (those that are 100% translated and maintained are officially released).

As with all Apache projects, Apache OpenOffice is available as a free download to all users at no cost, charge, or fees of any kind. OpenOffice is Open Source software: its C++ source code is readily available for anyone who wishes to enhance the applications.

Availability and Oversight
Apache OpenOffice software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For project data, documentation, and more information on Apache OpenOffice, visit https://openoffice.apache.org/

Download
The project strongly recommends that users download OpenOffice only from the official site https://www.openoffice.org/download/ to ensure that they receive the original software in the correct and most recent version. The project also recommends users review the Release Notes https://s.apache.org/AOO-414releasenotes for important updates and remarks concerning any known issues with this version and their workarounds.

Get Involved!
Apache OpenOffice welcomes contributions and community participation through mailing lists as well as attending face-to-face MeetUps, developer trainings, and user events. Those wishing to get involved in the project can find out more at https://openoffice.apache.org/get-involved.html

About Apache OpenOffice
Originally created as "StarOffice" by StarDivision and after further expansion as an Open Source product under the name "OpenOffice.org" at Sun Microsystems, the project continued development after Oracle Corporation acquired Sun Microsystems in 2010. OpenOffice entered the Apache Incubator in 2011 and graduated as an Apache Top-level Project in October 2012. 9 releases have been made under the auspices of the ASF, with more than 225 million downloads recorded to date. Visit https://openoffice.apache.org/ and https://twitter.com/ApacheOO for more information.

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server —the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 680 individual Members and 6,300 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cash Store, Cerner, Cloudera, Comcast, Confluent, Facebook, Google, Hortonworks, HP, Huawei, IBM, InMotion Hosting, iSigma, LeaseWeb, Microsoft, ODPi, PhoenixNAP, Pivotal, Private Internet Access, Produban, Red Hat, Serenata Flowers, Target, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ and https://twitter.com/TheASF
© The Apache Software Foundation. "Apache", "OpenOffice", "Apache OpenOffice", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Monday September 25, 2017

The Apache Software Foundation Announces Apache® RocketMQ™ as a Top-Level Project

Open Source distributed messaging and streaming Big Data platform in use at Alibaba Group, Didi Chuxing, S.F. Express, WeBank, Peking University, and Chinese Academy of Sciences, among others.

Forest Hill, MD –25 September 2017– The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Apache® RocketMQ™ has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles.

Apache RocketMQ is an Open Source distributed messaging and streaming Big Data platform with low latency, high performance and reliability, trillion-level capacity and flexible scalability.

"I am very excited to see Apache RocketMQ as a Top-Level Project and I would like to thank our mentors for all their help, the Apache Incubator Project Management Committee for its advice and guidance, everyone in the RocketMQ community, and Alibaba for publishing the research upon which RocketMQ is based," said Xiaorui Wang, Vice President of Apache RocketMQ. "During the incubation process, the RocketMQ community worked very hard to develop high-quality distributed software for messaging and streaming, in an open and inclusive manner in accordance with the Apache Way."

RocketMQ originated at Alibaba in 2012, and, after handling 1.2 trillion concurrent online message transmissions in the Alibaba Nov. 11th Global Shopping Festival, was donated to the Apache Incubator in November 2016. Apache RocketMQ v4.0.0 was released in February 2017.

As a distributed messaging engine, RocketMQ features include:
  • Low latency: more than 99.6% of responses within 1 millisecond under high load;
  • Finance-oriented: high availability with tracking and auditing features;
  • Industry-sustainable: trillion-level message capacity guaranteed;
  • Vendor-neutral: supports multiple messaging protocols, such as JMS and OpenMessaging;
  • Big Data friendly: batch transferring with versatile integration for flooding throughput; and
  • Massive accumulation: given sufficient disk space, accumulates messages without performance loss.
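
As a brief illustration of the messaging model, here is a minimal synchronous producer using RocketMQ's Java client; the producer group, name-server address, and topic below are placeholders.

    import java.nio.charset.StandardCharsets;

    import org.apache.rocketmq.client.producer.DefaultMQProducer;
    import org.apache.rocketmq.client.producer.SendResult;
    import org.apache.rocketmq.common.message.Message;

    public class SyncProducer {
        public static void main(String[] args) throws Exception {
            // A producer group names a set of producers acting in the same role.
            DefaultMQProducer producer = new DefaultMQProducer("example_producer_group");
            producer.setNamesrvAddr("localhost:9876"); // placeholder name server
            producer.start();

            // Topic, tag, and body; tags let consumers filter within a topic.
            Message msg = new Message("TopicTest", "TagA",
                    "Hello Apache RocketMQ".getBytes(StandardCharsets.UTF_8));
            SendResult result = producer.send(msg); // synchronous, waits for the broker
            System.out.printf("send status: %s%n", result.getSendStatus());

            producer.shutdown();
        }
    }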

"RocketMQ was conceived from the outset as an open-source distributed messaging and streaming platform with low latency, high performance and reliability, trillion-level capacity and flexible scalability," said Von Gosling, original co-creator of RocketMQ and Chief Architect of Aliware MQ at Alibaba Group. "It has been great to witness the growth of the RocketMQ community and codebase as an ASF incubating project, and I look forward to this continuing as a Top-Level Project. Today, more than 100 companies are using Apache RocketMQ, with more feedback coming from the community. According to our data, more than 80% of the project's contributions are from outside the donator Alibaba Group."

In addition to Alibaba Group, Apache RocketMQ is in use at hundreds of companies and research/educational institutions that include Didi Chuxing, S.F. Express, WeBank, Peking University, and Chinese Academy of Sciences, among others.

"Graduation from the Incubator marks an important milestone for the RocketMQ project," said Bruce Snyder, Apache RocketMQ Incubator Mentor and Director of Software Development at SAP Hybris. "This is recognition of the focus and hard work of the project members to learn The Apache Way and drive community around RocketMQ. I am honored to have helped guide the project to a successful graduation."

"At Didi, we have used Apache RocketMQ as storage engine to build MessageQueue service. Based on high availability and high performance of RocketMQ we provide high-quality service," said Neil Qi, Architect at Didi Chuxing. "I believe RocketMQ will become the best MessageQueue project in future."

"New participants are more than welcome to join the project, To serve the community better, we created and maintained two repositories, one as our kernel version and the other one is for community contributions. The community contributed some integrated projects with some other Apache TLPs like Apache Storm, Apache Ignite, Apache Spark and Apache Flume," said Xinyu "yukon" Zhou, member of the Apache RocketMQ Project Management Committee. "We enthusiastically look forward to working together with all contributors to Apache RocketMQ in order to advance the state-of-the-art distributed messaging engine."

Availability and Oversight
Apache RocketMQ software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache RocketMQ, visit http://rocketmq.apache.org/ and https://twitter.com/ApacheRocketMQ

About the Apache Incubator
The Apache Incubator is the entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects wishing to join the ASF enter through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server —the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 650 individual Members and 6,200 Committers across six continents successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cash Store, Cerner, Cloudera, Comcast, Facebook, Google, Hortonworks, HP, Huawei, IBM, Inspur, iSigma, LeaseWeb, Microsoft, ODPi, PhoenixNAP, Pivotal, Private Internet Access, Red Hat, Serenata Flowers, Target, WANdisco, and Yahoo. For more information, visit http://apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "RocketMQ", "Apache RocketMQ", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Thursday June 29, 2017

The Apache® Software Foundation Announces Annual Report for 2017 Fiscal Year

Apache's community-led projects bring billions in value to users, developers, and critical applications; organization poised to foster continued growth

Forest Hill, MD —29 June 2017— The Apache® Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today the availability of the annual report for its 2017 fiscal year, which ended 30 April 2017. 

Now in its 18th year, the ASF's operational highlights include:
  1. 35M page views per week across apache.org;
  2. Web requests received from every Internet-connected country on the planet;
  3. ASF's 100M functional lines of code (out of 150M+ total) have been developed over 65,000 person-years, valued at US$7B;
  4. 65+M lines of code committed over the past year;
  5. 20% of all Apache lines of code are comments (nearly 3x the entire Linux codebase);
  6. 3,300 Apache code Committers made 214,398 commits;
  7. More than half of all contributions made by individuals new to Apache;
  8. Nearly 300 new code contributors and 300-400 new people filing issues each month;
  9. 25,154 authors sent 2,105,992 emails on 834,045 topics over the past year;
  10. Approximately 9M source code downloads served from Apache mirrors on a yearly basis (excluding convenience binaries);
  11. 64 new individual ASF Members elected, bringing the total to 684;
  12. Exceeded 6,000 code Committers;
  13. Remains all-volunteer for all code/development-related activities;
  14. 182 Top-Level communities overseeing 300+ Apache projects and sub-projects;
  15. Dozens of Apache projects continue to dominate the enterprise Big Data ecosystem;
  16. Record 64 "podlings" undergoing incubation during FY2017 (59 at end of FY2017);
  17. New innovations include IoT, Microfinance, Machine Learning, and Cryptography;
  18. 22nd anniversary of the Apache HTTP Server (18 years under the ASF umbrella);
  19. Apache OpenOffice™ exceeded 200M downloads (value to users $25M+ per day);
  20. Apache Groovy™ downloaded 12M times during the first 4 months of 2017;
  21. 976 Individual Contributor License Agreements (CLAs) signed;
  22. 42 Corporate Contributor License Agreements signed (totaling 384);
  23. 30 Software Grant Agreements signed;
  24. Apache License remains one of the most popular Open Source licenses;
  25. Apache Infrastructure services running 24x7x365 at near 100% uptime on an annual budget of less than US$5,000 per project;
  26. New "Gitbox" service launched to allow communities to host their read/write Git repositories on GitHub;
  27. Continued reduction in Infrastructure costs by moving select services to the Cloud;
  28. Improved buildbot and Jenkins build farms for continuous integration, testing, and automated Website generation;
  29. Increased trademarks, brand management, and legal support for dozens of projects;
  30. Launched new Apache Community survey, newsletter, and social media resources;
  31. ASF serves as a mentoring organization in Google Summer of Code for 12th consecutive year;
  32. Participated in hundreds of events globally, including ApacheCon North America and Europe;
  33. Increased corporate backing, with 51 ASF Sponsors and 11 Infrastructure partners;
  34. Improved individual giving program and outreach tactics;
  35. Completed first-time budgetary planning with five-year projections;
  36. FY17 ended with a 15.3-month cash reserve (more than double the industry average);
  37. ASF operational areas (Brand Management, Fundraising, Marketing and Publicity, Infrastructure, Conferences, and Travel Assistance) now supported by professional staff overseen by appointed ASF Members;
  38. FY17 tapped as an investment year, with resources committed to core Infrastructure and Marketing & Publicity services.

The full report is available online at https://s.apache.org/FY2017AnnualReport

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server —the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 680 individual Members and 6,000 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cash Store, Cerner, Cloudera, Comcast, Confluent, Facebook, Google, Hortonworks, HP, Huawei, IBM, InMotion Hosting, iSigma, LeaseWeb, Microsoft, ODPi, PhoenixNAP, Pivotal, Private Internet Access, Produban, Red Hat, Serenata Flowers, Target, WANdisco, and Yahoo. For more information, visit https://www.apache.org/ and https://twitter.com/TheASF

# # #

© The Apache Software Foundation. "Apache", "Apache Groovy", "Groovy", "Apache HTTP Server", "Apache OpenOffice", "OpenOffice", and "ApacheCon" are registered trademarks or trademarks of The Apache Software Foundation. All other brands and trademarks are the property of their respective owners.

Wednesday May 31, 2017

The Apache Software Foundation Announces Apache® SystemML™ as a Top-Level Project

Open Source Big Data machine learning platform in use at Cadent Technology and IBM Watson Health, among other organizations.

Forest Hill, MD –31 May 2017– The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Apache® SystemML™ has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles.

Apache SystemML is a machine learning platform optimized for Big Data that provides declarative, large-scale machine learning and deep learning. SystemML can be run on top of Apache Spark, where it automatically scales data, line by line, determining whether code should be run on the driver or on an Apache Spark cluster.

"Today, the machine learning revolution is leading to thousands of life-altering innovations such as self-driving cars and computers that detect cancer," said Deron Eriksson, Vice President of Apache SystemML. "Apache SystemML enables and simplifies this process by executing optimized high-level algorithms on Big Data using proven technologies such as Apache Spark and Apache Hadoop MapReduce."

The core of Apache SystemML has been created from the ground up with the following design principles in mind: 

  • Performance and Scalability, as SystemML scales up on single nodes, and scales out on large clusters using Apache Spark or Apache Hadoop;
  • "Designed for data scientists", enabling data scientists to develop algorithms in a system with a strong foundation in linear algebra and statistical functions; and 
  • Cost-based optimization for scalable execution plans, which significantly shortens and simplifies the development and deployment cycle of algorithms across varying data characteristics and system configurations.

Using Apache SystemML, data scientists are able to implement algorithms using high-level language concepts without knowledge of distributed programming. Depending on data characteristics such as data size/shape and data sparsity (dense/sparse), and cluster characteristics such as cluster size and memory configurations, SystemML's cost-based optimizing compiler automatically generates hybrid runtime execution plans that are composed of single-node and distributed operations on Apache Spark or Apache Hadoop clusters for best performance.
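
For instance, a data scientist can write a few lines of SystemML's R-like DML and leave the execution planning to the optimizer. Here is a hedged Java sketch using the Spark MLContext API (the local Spark session is illustrative setup; the DML snippet simply builds a random matrix and reduces it):

    import org.apache.spark.sql.SparkSession;
    import org.apache.sysml.api.mlcontext.MLContext;
    import org.apache.sysml.api.mlcontext.Script;
    import org.apache.sysml.api.mlcontext.ScriptFactory;

    public class SystemMLExample {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("systemml-example").master("local[*]").getOrCreate();
            MLContext ml = new MLContext(spark);

            // Declarative, R-like linear algebra; the cost-based optimizer picks
            // single-node or distributed Spark execution for each operation.
            Script script = ScriptFactory.dml(
                    "X = rand(rows=1000, cols=100)\n" +
                    "v = matrix(1, rows=ncol(X), cols=1)\n" +
                    "w = t(X) %*% (X %*% v)\n" +
                    "print(sum(w))");
            ml.execute(script);

            spark.stop();
        }
    }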

"SystemML allows Cadent to implement advanced numerical programming methods in Apache Spark, empowering us to leverage specialized algorithms in our predictive analysis software," said Michael Zargham, Chief Scientist at Cadent Technology.

"SystemML is like SQL for Machine Learning, it enables Data Scientists to concentrate on the problem at hand, working in a high-level script language like R, and all the optimizations and rewrites are handled by the very powerful SystemML optimizer that considers data and available resources to produce the best execution plan for the application," said Luciano Resende, Architect at the IBM Spark Technology Center and Apache SystemML Incubator Mentor.

"IBM Watson Health VBC is using Apache SystemML on Apache Spark to build risk models on a very large EHR data set to predict emergency department visits," said Steve Beier, Vice President of Value Based Care Platform and Analytics at IBM Watson Health. "The models identify high-risk patients so that they can be targeted with preemptive strategies, thus potentially reducing care costs while at the same time leading to optimal outcomes for patients."

SystemML originated at IBM Research - Almaden in 2010 and was submitted to the Apache Incubator in November 2015. The project initiated research on compressed linear algebra, a differentiating feature of SystemML that received the VLDB 2016 Best Paper award.

"The Apache Incubator is all about open collaboration and communication and was invaluable for everyone involved in SystemML," added Eriksson. "The Apache SystemML community sincerely encourages everyone interested in machine learning and deep learning to help build our community around this revolutionary technology."

Catch Apache SystemML in action at the Big Data Developers Silicon Valley MeetUp on 8 June 2017 in San Francisco, CA.

Availability and Oversight
Apache SystemML software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache SystemML, visit http://systemml.apache.org/ and https://twitter.com/ApacheSystemML

About the Apache Incubator
The Apache Incubator is the entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects wishing to join the ASF enter through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server —the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 620 individual Members and 6,000 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cash Store, Cerner, Cloudera, Comcast, Confluent, Facebook, Google, Hortonworks, HP, Huawei, IBM, InMotion Hosting, iSigma, LeaseWeb, Microsoft, ODPi, PhoenixNAP, Pivotal, Private Internet Access, Produban, Red Hat, Serenata Flowers, Target, WANdisco, and Yahoo. For more information, visit https://www.apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "SystemML", "Apache SystemML", "Hadoop", "Apache Hadoop", "Spark", "Apache Spark", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Wednesday May 17, 2017

The Apache Software Foundation Announces Apache® Beam™ v2.0.0

Open Source unified programming model for batch and streaming Big Data processing in use at Google Cloud, PayPal, and Talend, among others.

Forest Hill, MD —17 May 2017— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today the availability of Apache® Beam™ v2.0.0, the first stable release of the unified programming model for both batch and streaming Big Data processing.

An Apache Top-Level Project (TLP) since December 2016, Beam includes Java and Python software development kits used to define data processing pipelines and runners to execute them on Apache Apex, Apache Flink, Apache Spark, and Google Cloud Dataflow, among other execution engines.

Apache Beam has its roots in Google's internal work on data processing over the last decade, evolving from the initial MapReduce system, through FlumeJava and MillWheel, into Google Cloud Dataflow v1.x, which defined the unified programming model that became the heart of Apache Beam.

"The first stable release is an important milestone for the Apache Beam community," said Davor Bonaci, Vice President of Apache Beam. "This is a statement from the community that it intends to maintain API stability with all releases for the foreseeable future, making Beam suitable for enterprise deployment."

Apache Beam v2.0.0 improves user experience across the project, focusing on seamless portability across execution environments, including engines, operating systems, on-premise clusters, cloud providers, and data storage systems. Other highlights include:
  • API stability and future compatibility within this major version;
  • Stateful data processing paradigms that unlock efficient, data-dependent computations;
  • Support for user-extensible file systems, with built-in support for Hadoop Distributed File System, among others; and
  • A metrics subsystem for deeper insight into pipeline execution.
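
As a small taste of the unified model, here is a minimal word-count pipeline written with the Beam Java SDK; the input path and output prefix are placeholders, and with no runner specified it executes locally on the bundled DirectRunner.

    import java.util.Arrays;

    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.io.TextIO;
    import org.apache.beam.sdk.options.PipelineOptions;
    import org.apache.beam.sdk.options.PipelineOptionsFactory;
    import org.apache.beam.sdk.transforms.Count;
    import org.apache.beam.sdk.transforms.FlatMapElements;
    import org.apache.beam.sdk.transforms.MapElements;
    import org.apache.beam.sdk.values.KV;
    import org.apache.beam.sdk.values.TypeDescriptors;

    public class MinimalWordCount {
        public static void main(String[] args) {
            PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
            Pipeline p = Pipeline.create(options);

            p.apply(TextIO.read().from("input.txt")) // placeholder input path
             .apply(FlatMapElements.into(TypeDescriptors.strings())
                     .via(line -> Arrays.asList(line.split("\\W+"))))
             .apply(Count.perElement())
             .apply(MapElements.into(TypeDescriptors.strings())
                     .via((KV<String, Long> kv) -> kv.getKey() + ": " + kv.getValue()))
             .apply(TextIO.write().to("counts")); // placeholder output prefix

            p.run().waitUntilFinish();
        }
    }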

Apache Beam is in use at Google Cloud, PayPal, and Talend, among others.

"Apache Beam is a mature data processing API for the enterprise, with powerful semantics that solve real-world challenges of stream processing," said Tomer Pilossof, Big Data Manager at PayPal. "With Beam, we provide data processing solutions for a wide range of customers within the PayPal organization."

"We at Talend are thrilled to have contributed to Apache Beam reaching the 2.0.0 milestone and its first official stable release," said Laurent Bride, Chief Technology Officer at Talend. "Apache Beam is now part of the foundation of Talend products. Recently, we released Talend Data Preparation for Big Data which leverages Beam to create transformation pipelines that are portable across many execution engines. Later this year, we plan to deliver Talend Data Streams, taking the Apache Beam integration one step further by utilizing its powerful streaming semantics. Whether for batch, streaming, or real-time use cases, Apache Beam is a powerful framework that delivers the flexibility and advanced functionality our customers need."

"We congratulate the Apache Beam community for reaching the key milestone of a first stable release," said William Vambenepe, Lead Product Manager for Big Data, Google Cloud. "We look forward to our Google Cloud Dataflow customers taking full advantage of Beam's powerful programming model and newest features to run their data processing pipelines on Google Cloud."

Apache Beam v2.0.0 is making its debut at Apache: Big Data, taking place this week in Miami, FL, with four sessions featuring Apache Beam. Apache Beam will also be highlighted at numerous face-to-face meetups and conferences, including the Future of Data San Jose meetup, Strata Data Conference London, Berlin Buzzwords, and DataWorks Summit San Jose.

"I'd like to invite everyone to try out Apache Beam v2.0.0 today and consider joining our vibrant community," added Bonaci. "We welcome feedback, contribution and participation through our mailing lists, issue tracker, pull requests, and events."

Availability and Oversight
Apache Beam software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Beam, visit https://beam.apache.org/ and https://twitter.com/ApacheBeam

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server -- the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 680 individual Members and 6,000 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cash Store, Cerner, Cloudera, Comcast, Confluent, Facebook, Google, Hortonworks, HP, Huawei, IBM, InMotion Hosting, iSigma, LeaseWeb, Microsoft, ODPi, PhoenixNAP, Pivotal, Private Internet Access, Produban, Red Hat, Serenata Flowers, Target, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "Beam", "Apache Beam", "Apex", "Apache Apex", "Flink", "Apache Flink", "Spark", "Apache Spark", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Monday May 15, 2017

The Apache Software Foundation Announces Apache® Samza™ v0.13

Open Source Big Data distributed stream processing framework in production at Intuit, LinkedIn, Netflix, Optimizely, Redfin, and Uber, among other organizations.

Forest Hill, MD —15 May 2017— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today the availability of Apache® Samza™ v0.13, the latest version of the Open Source Big Data distributed stream processing framework.

An Apache Top-Level Project (TLP) since January 2015, Samza is designed to provide support for fault-tolerant, large scale stream processing. Developers use Apache Samza to write applications that consume streams of data and to help organizations understand and respond to their data in real-time. Apache Samza offers a unified API to process streaming data from pub-sub messaging systems like Apache Kafka and batch data from Apache Hadoop.

"The latest 0.13 release takes Apache Samza's data processing capabilities to the next level with multiple new features," said Yi Pan, Vice President of Apache Samza. "It also improves the simplicity and portability of real-time applications."

Apache Samza powers a range of real-time data processing needs, including real-time analytics on user data, message routing, combating fraud, anomaly detection, performance monitoring, and real-time communication. Apache Samza can process up to 1.1 million messages per second on a single machine. v0.13 highlights include:
  • A higher-level API that developers can use to express complex processing pipelines on streams more concisely;
  • Support for running Samza applications as a lightweight embedded library, without relying on YARN;
  • Support for flexible deployment options;
  • Support for rolling upgrades of running Samza applications;
  • Improved monitoring and failure detection using a built-in heartbeat mechanism;
  • Better integration with other cluster-manager frameworks and environments; and
  • Several bug fixes that improve the reliability, stability, and robustness of data processing.
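
Here is a hedged sketch of the new higher-level API, following the style of the Samza 0.13 examples; the stream names and the message-building and key-extraction lambdas are placeholders, and a real application would configure serdes and systems in the job config.

    import org.apache.samza.application.StreamApplication;
    import org.apache.samza.config.Config;
    import org.apache.samza.operators.MessageStream;
    import org.apache.samza.operators.OutputStream;
    import org.apache.samza.operators.StreamGraph;

    public class PageViewFilter implements StreamApplication {
        @Override
        public void init(StreamGraph graph, Config config) {
            // Consume a stream (e.g., a Kafka topic); the (key, value) lambda
            // stands in for real message deserialization.
            MessageStream<String> pageViews =
                    graph.getInputStream("page-views", (key, value) -> (String) value);

            OutputStream<String, String, String> filtered =
                    graph.getOutputStream("filtered-page-views", msg -> msg, msg -> msg);

            // Operators chain fluently; the same application can run on YARN
            // or embedded as a lightweight library in its own process.
            pageViews.filter(msg -> !msg.contains("bot"))
                     .sendTo(filtered);
        }
    }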

Organizations such as Intuit, LinkedIn, Netflix, Optimizely, Redfin, TripAdvisor, and Uber rely on Apache Samza to power complex data architectures that process billions of events each day. A list of user organizations is available at https://cwiki.apache.org/confluence/display/SAMZA/Powered+By

"Apache Samza is a highly performant stream/data processing system that has been battle tested over the years of powering mission critical applications in a wide range of businesses," said Kartik Paramasivam, Head of Streams Infrastructure, and Director of Engineering at LinkedIn. "With this 0.13 release, the power of Samza is no longer limited to YARN based topologies. It can now be used in any hosting environment. In addition, it now has a new higher level API that makes it significantly easier to create arbitrarily complex processing pipelines."

"Apache Samza has been powering near real-time use cases at Uber for the last year and a half," said Chinmay Soman, Staff Software Engineer at Uber. "This ranges from analytical use cases such as understanding business metrics, feature extraction for machine learning as well as some critical applications such as Fraud detection, Surge pricing and Intelligent promotions. Samza has been proven to be robust in production and is currently processing about billions of messages per day, accounting for 100s of TB of data flowing through the system." 

"At Optimizely, we have built the world’s leading experimentation platform, which ingests billions of click-stream events a day from millions of visitors for analysis," said Vignesh Sukumar, Senior Engineering Manager at Optimizely. "Apache Samza has been a great asset to Optimizely's Event ingestion pipeline allowing us to perform large scale, real time stream computing such as aggregations (e.g. session computations) and data enrichment on a multiple billion events/day scale. The programming model, durability and the close integration with Apache Kafka fit our needs perfectly."

"It has been a phenomenal experience engaging with this vibrant international community of users and contributors, and I look forward to our continued growth. It is a great time to be involved in the project and we welcome new contributors to the Samza community," added Pan.

Catch Apache Samza in action at Apache: Big Data, 16-18 May 2017 in Miami, FL http://apachecon.com/ , where the community will be showcasing how Samza simplifies stream processing at scale.

Availability and Oversight
Apache Samza software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Samza, visit http://samza.apache.org/ , https://blogs.apache.org/samza/ , and https://twitter.com/samzastream

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 680 individual Members and 6,000 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cash Store, Cerner, Cloudera, Comcast, Confluent, Facebook, Google, Hortonworks, HP, Huawei, IBM, InMotion Hosting, iSigma, LeaseWeb, Microsoft, ODPi, PhoenixNAP, Pivotal, Private Internet Access, Produban, Red Hat, Serenata Flowers, Target, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "Hadoop", "Apache Hadoop", "Kafka", "Apache Kafka", "Samza", "Apache Samza", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Monday May 01, 2017

The Apache Software Foundation Announces Apache® CarbonData™ as a Top-Level Project

Open Source Big Data analytics accelerator in use at Bank of Communications, Hulu, Huawei, SAIC Motor, and Zhejiang Mobile, among others.

Forest Hill, MD –1 May 2017– The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Apache® CarbonData™ has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles.

Apache CarbonData is an indexed columnar store file format for fast analytics on Big Data platforms (including Apache Hadoop and Apache Spark, among others), speeding up queries over petabytes of data by an order of magnitude.

"We are very proud to complete the incubation process and graduate as an Apache Top-Level Project," said Liang Chen, Vice President of Apache CarbonData. "The CarbonData community grew rapidly over last ten months, both in terms of size and diversity. Since entering the Apache Incubator, we have completed 4 releases, and exceeded 90 contributors from 10 different organizations."

With the aim of using a unified file format to satisfy all kinds of data analysis cases, Apache CarbonData seamlessly integrates with Hadoop and Spark to improve Big Data analysis efficiency. In benchmarks, CarbonData's interactive queries run approximately 10x faster than standard column-oriented SQL-on-Hadoop data stores.

Highlights include:

  • Unique data organization to allow faster filtering and better compression;
  • Multi-level indexing to enable faster search and query processing;
  • Deep Apache Spark integration for DataFrame + SQL compliance;
  • Advanced push-down optimization to minimize the amount of data being read, processed, converted, transmitted, and shuffled;
  • Efficient compression and global encoding schemes to further improve aggregation query performance;
  • Dictionary encoding for reduced storage space and faster processing; and
  • Data update + delete support using standard SQL syntax (illustrated in the sketch below).
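
As a rough, hedged sketch of how CarbonData is typically driven through Spark SQL, the example below creates and queries a CarbonData-backed table. It assumes a SparkSession already configured with CarbonData support (the exact bootstrap varies by CarbonData and Spark release); the table and column names are hypothetical:

  import org.apache.spark.sql.SparkSession;

  public class CarbonDataSketch {
    public static void main(String[] args) {
      SparkSession spark = SparkSession.builder()
          .appName("carbondata-sketch")
          .getOrCreate();

      // CarbonData tables are declared through a storage handler.
      spark.sql("CREATE TABLE IF NOT EXISTS sales ("
          + "order_id STRING, city STRING, quantity INT) "
          + "STORED BY 'carbondata'");

      // Filter and aggregation queries are where the multi-level
      // indexes and columnar encoding pay off.
      spark.sql("SELECT city, SUM(quantity) FROM sales GROUP BY city").show();

      // Update + delete are exposed as standard SQL (per the highlights above).
      spark.sql("DELETE FROM sales WHERE quantity = 0");
    }
  }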


Apache CarbonData is in use at an array of organizations, including Bank of Communications, medical/pharma social platform DXY, Hulu, Huawei, group online retailer MEITUAN, SAIC Motor, and Zhejiang Mobile.

"CarbonData has very good performance as a ‘SQL on Hadoop’ solution," said Tan Sheng, Director of SAIC Motor’s Big Data team. "It is suitable for SAIC Motor to adopt as a central Big Data platform component. Not only do we use Apache CarbonData, we also actively participate in its community as contributors." 

"Apache CarbonData is great, as helped our audit business to improve 7-10X performance based on 14 billion rows of data," said Wei Zhao, Senior Engineer at Bank of Communications.

"Apache CarbonData is very suitable for our filter query cases, and has averaged 20x improvement on performance," said William Zhu, Architecture team member at DXY. "And, as CarbonData supports data update and delete, this feature is very useful. We would consider CarbonData as our all-in-one solution to unify all analysis data."

CarbonData was first developed at Huawei in 2013. The project was submitted to the Apache Incubator in June 2016, and had its first official release two months later. The project won top honors in the BlackDuck 2016 Open Source Rookies of the Year's Big Data category.

"Apache CarbonData is a great example of the value of the incubation process," said Jean-Baptiste Onofré, Apache CarbonData Incubator Mentor and Project Management Committee member. "Helping grow the CarbonData developer and user communities has increased our visibility, which allowed us to extend our use cases and tests, and gather new ideas. The initial CarbonData committers did (and are still doing) great work to welcome new users and contributors, clearly understanding it's a step forward for the project."

"We will continue to put our efforts towards optimizing data format efficiency for Big Data ecosystem and provide an unified and high performance data storage solution," added Liang. "The Apache CarbonData community welcomes interested contributors to work with us on our journey forward."

Catch Apache CarbonData in action at ApacheCon (16-18 May/Miami), and Spark Summit (5-7 June/San Francisco).

Availability and Oversight
Apache CarbonData software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache CarbonData, visit http://carbondata.apache.org/ , https://twitter.com/ApacheCarbonDat , and https://www.facebook.com/carbondata/

About the Apache Incubator
The Apache Incubator is the entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects wishing to join the ASF enter through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 620 individual Members and 6,000 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cash Store, Cerner, Cloudera, Comcast, Confluent, Facebook, Google, Hortonworks, HP, Huawei, IBM, InMotion Hosting, iSigma, LeaseWeb, Microsoft, ODPi, PhoenixNAP, Pivotal, Private Internet Access, Produban, Red Hat, Serenata Flowers, Target, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "CarbonData", "Apache CarbonData", "Hadoop", "Apache Hadoop", "Spark", "Apache Spark", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # # 

Monday May 01, 2017

The Apache Software Foundation Announces Apache® Mahout™ v0.13.0

Open Source scalable machine learning and data mining library for Big Data artificial intelligence now more powerful and easier to use.

Forest Hill, MD —1 May 2017— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today the availability of Apache® Mahout™ v0.13.0, the latest version of the Open Source scalable machine learning library.

Apache Mahout provides an environment for quickly creating machine-learning applications that scale and run on the highest-performance parallel computation engines available. Mahout is the first scalable generalized tensor and linear algebra solving engine taking data scientists from interactive experiments to production use.

"Apache Mahout 0.13.0 is more powerful with its new algorithm framework that allows for easier implementation of machine learning algorithms," said Andrew Palumbo, Vice President of Apache Mahout. "The enhanced Mahout code base and development framework make machine learning even more accessible, which is a game changer in the field of artificial intelligence."

Mahout provides a wide variety of premade algorithms (Matrix Factorization, QR via ALS, SSVD, PCA, etc.) for Scala + Apache Spark, H2O, and Apache Flink, as well as on-GPU compute for performance improvements in very large tensor math. Apache Mahout provides the data science tools to automatically find meaningful patterns in Big Data sets by supporting the following main data science use cases:
  • Collaborative filtering – mines user behavior and makes product recommendations (such as eCommerce product recommenders);
  • Regression – estimates a numerical value based on values of other inputs;
  • Clustering – takes items in a particular class (such as Web pages or newspaper articles) and organizes them into naturally occurring groups, such that items belonging to the same group are similar to each other; and
  • Classifying – learns from existing categorizations and then assigns unclassified items to the best category.

New in v0.13.0
Apache Mahout now makes it easier to do matrix math on graphics cards, which is relevant for most modern machine-learning and deep-learning methods. In addition, v0.13.0 allows shared-nothing computation on GPUs, on multi-core CPUs, or in the JVM as appropriate, and adds a simplified framework for building new algorithms. Mahout comprises an interactive environment and library that support generalized scalable linear algebra and include many modern machine-learning algorithms; the project has therefore collaborated with developers of other projects, including the Open Source linear algebra library ViennaCL, the Java wrapper library JavaCPP, and graphics processor manufacturer NVIDIA, to add CUDA bindings directly into Mahout for simplicity of development.

The v0.13.0 release reflects 62 separate JIRA issues from v0.12.2, including numerous enhancements to Mahout-Samsara, the vector math experimentation environment with R-like syntax that works at scale. Complete release notes are at http://mahout.apache.org/release-notes/Apache-Mahout-0.13.0-Release-Notes.pdf
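
As a hedged illustration of the kind of matrix work Mahout supports, the sketch below uses Mahout's in-core math module from Java; the distributed Samsara DSL itself is Scala-based (with R-like operators such as %*%), so this only shows the underlying JVM API, and the values are arbitrary:

  import org.apache.mahout.math.DenseMatrix;
  import org.apache.mahout.math.Matrix;

  public class MahoutMathSketch {
    public static void main(String[] args) {
      Matrix a = new DenseMatrix(new double[][] {{1, 2}, {3, 4}});
      Matrix b = new DenseMatrix(new double[][] {{5, 6}, {7, 8}});

      Matrix c = a.times(b);           // matrix multiply
      Matrix t = c.transpose();        // transpose

      System.out.println(t.get(0, 1)); // element access: row 0, column 1
    }
  }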

Future versions of Mahout will include support for native iterative solvers, a more robust algorithm library, and smarter probing and optimization of multiplications, among other features.

A comprehensive list of users of Apache Mahout is available at https://mahout.apache.org/general/powered-by-mahout.html ; current users are mostly researchers and developers actively involved in building distributed machine-learning pipelines and tools.

"We thank our community of developers and users who helped make this milestone release possible, and welcome new contributors to help us advance machine learning," added Palumbo.

Catch Apache Mahout in action at Apache: Big Data, where attendees learn first-hand from many original project creators and companies from the greater Mahout community. Apache: Big Data will be held 16-18 May 2017 in Miami, FL. To register, and for more information, visit http://apachecon.com/

Availability and Oversight
Apache Mahout software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Mahout, visit http://mahout.apache.org/ and https://twitter.com/ApacheMahout

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 620 individual Members and 6,000 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cash Store, Cerner, Cloudera, Comcast, Confluent, Facebook, Google, Hortonworks, HP, Huawei, IBM, InMotion Hosting, iSigma, LeaseWeb, Microsoft, ODPi, PhoenixNAP, Pivotal, Private Internet Access, Produban, Red Hat, Serenata Flowers, Target, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "Flink", "Apache Flink", "Mahout", "Apache Mahout", "Spark", "Apache Spark", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Tuesday February 14, 2017

The Apache Software Foundation Announces Apache® MyFaces™ Tobago 3

Standards-based Open Source component library allows developers to quickly and easily create business Web applications without worrying about technical details

Forest Hill, MD —14 February 2017— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today the availability of Apache® MyFaces™ Tobago 3, the user interface components for creating business applications without the need for coding HTML, CSS, or JavaScript.

A sub-project of Apache MyFaces (the Open Source implementation of the JavaServer Faces Web application framework, which follows the Model-View-Controller paradigm), Tobago is a component library for JavaServer Faces (JSF). The project was originally created at Atanion GmbH in 2002, and was donated to the Apache Incubator in 2005. Tobago graduated as an Apache MyFaces sub-project in 2006.

"With a commitment to reduce the time and effort spent on development and deployment, the unofficial Tobago tagline is 'less magic, more standards'," said Udo Schnurpfeil, member of the Apache MyFaces Project Management Committee. "We are are happy that Tobago 3 helps users get their applications up and running even more quickly and easily."

By eliminating the need to code HTML, CSS, or JavaScript, Tobago allows users to easily create business Web applications, and emulates the development process of conventional user interfaces (rather than the creation of Web pages) via:
  1. UI components abstracted from HTML, along with any layout information that does not belong to the general page structure. The final output format is determined by the client/user-agent;

  2. A theming mechanism that makes it easy to change the look-and-feel and provides special implementations for certain browsers; and

  3. A layout manager used to arrange the components automatically. This means that no manual laying out using HTML tables or other constructs is needed.

Under The Hood
Apache MyFaces Tobago 3's increased responsiveness and standardization make it easier to integrate libraries and other projects. Features include:
  • Layout-management moved to CSS and JavaScript to natively achieve layout requirements and make rendering more efficient and responsive;

  • Themes using CSS library Bootstrap 4 make it easy to obtain a modern and rich design; and

  • Use of current technologies such as SCSS, CSS3, HTML5, AJAX, and JSF, plus theming on a pure CSS base, further simplifying the development experience.

Apache Tobago dramatically reduces developer resources and programming time, providing individuals and organizations with improved productivity and ease of implementation.

"For over 10 years we have been working closely with the Tobago team. The close collaboration has been mutually beneficial. Currently we are working on more than 60 intranet applications based on Apache Tobago. We see the new features from Tobago 3 as a significant architectural leap - in particular the innovations with ajax, theming, and responsive design. We expect a fast project adoption - even with the associated migration costs," said Rainer Rohloff, Senior Software Architect at Norddeutsche Landesbank. "We look forward to working on additional projects with the Tobago team in the future."

"It's great to see many users adopt Tobago," added Schnurpfeil. "We welcome new developers and users to join us on our mailing lists, MeetUps, and community events."

Availability and Oversight
Apache MyFaces software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, release notes, documentation, and more information on Apache MyFaces, visit http://myfaces.apache.org/ and https://twitter.com/MyFacesTeam

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server -- the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 620 individual Members and 5,900 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cerner, Cloudera, Comcast, Confluent, Facebook, Google, Hortonworks, HP, Huawei, IBM, InMotion Hosting, iSigma, LeaseWeb, Microsoft, ODPi, PhoenixNAP, Pivotal, Private Internet Access, Produban, Red Hat, Serenata Flowers, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "MyFaces", "Apache MyFaces", "Tobago", "Apache MyFaces Tobago", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Wednesday January 11, 2017

The Apache Software Foundation Announces Apache® Zest™ Renamed to Apache Polygene

Rebranded Open Source Composite Oriented Programming platform reflects growing codebase and community.

Forest Hill, MD —11 January 2017— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Apache® Zest™, the Composite Oriented Programming platform, has been renamed Apache Polygene.

Apache Polygene is a platform for Java enterprise developers to build applications with large domain models and complex business logic. Apache Polygene introduces multi-inheritance, aspect orientation (both typesafe and generic weaving), and persistence to both SQL and NoSQL storage systems. Apache Polygene also integrates easily with other technologies such as the Spring Framework, REST, and OSGi, among many others.
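
To make Composite Oriented Programming a little more concrete, here is a hedged sketch in the Polygene style: the composite type is an interface whose behaviour is supplied by mixins. Package names changed with the Zest-to-Polygene rename, instantiation normally happens through an assembled Module (omitted here), and all names below are hypothetical:

  import org.apache.polygene.api.mixin.Mixins;

  // The composite type is the interface; the mixin supplies the behaviour.
  @Mixins(Greeter.GreeterMixin.class)
  public interface Greeter {
    String greet(String name);

    class GreeterMixin implements Greeter {
      @Override
      public String greet(String name) {
        return "Hello, " + name;
      }
    }
  }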

"The name change was triggered to prevent confusion with other similarly named software such as the visualization toolkit from Eclipse," said Niclas Hedhman, Vice President of Apache Polygene. "Since becoming an official ASF project, our codebase and community continue to flourish. We are confident that our new identity will reflect ongoing innovation and increased productivity."

The resolution relating to the project's name change was approved at the ASF Board meeting in December 2016.

Project History
In 2007, Hedhman convinced Rickard Öberg to create an Open Source project based on Öberg's Composite Oriented Programming (COP) concept, which launched as Qi4j. Since then, 28 people have contributed source to the project, with many others participating on mailing lists regarding direction, concepts, and design. In 2015 the project arrived at the ASF as Apache Zest, with the unique distinction of being the first project to enter the ASF as a Top-Level Project without passing through the Apache Incubator (the official entry path for projects and codebases wishing to become part of the ASF's efforts). As part of its eligibility, the project had to meet the rigorous requirements of the Apache Maturity Model http://s.apache.org/O4p , which addresses the integrity of a project's code, copyright, licenses, releases, community, consensus building, and independence, among other qualities. In March 2015 Apache Zest became an official ASF Top-Level Project, and it was renamed Apache Polygene in December 2016.

Availability and Oversight
Apache Polygene software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For project updates, downloads, documentation, and ways to become involved with Apache Polygene, visit http://polygene.apache.org/

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server -- the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 620 individual Members and 5,900 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cash Store, Cerner, Cloudera, Comcast, Confluent, Facebook, Google, Hortonworks, HP, Huawei, IBM, InMotion Hosting, iSigma, LeaseWeb, Microsoft, ODPi, PhoenixNAP, Pivotal, Private Internet Access, Produban, Red Hat, Serenata Flowers, Target, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "Polygene", "Apache Polygene", "Zest", "Apache Zest", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Friday December 16, 2016

Feedback from The Apache Software Foundation on the Free and Open Source Security Audit (FOSSA)

by Dirk-Willem van Gulik <dirkx(at)apache(punto)org>

December 2016, v1.09

Background

The important role of open source software in key infrastructure was brought to collective attention by two major security vulnerabilities at the core of the internet: Heartbleed and Shellshock, both in 2014, caused significant concern. They made a lot of people realise how important the collective efforts around these open source infrastructures are, and how much key internet infrastructure relies on open source communities such as the Apache community.

Two of those people were Julia Reda and Max Andersson, Members of the European Parliament. As a result, they proposed (and directed Europe to fund) a pilot project, the "Free and Open Source Software Audit (FOSSA)", within a larger workstream about "€1 million to demonstrate security and freedom are not opposites".

One part of the money went to developing a methodology; the other to actually auditing some widely used open source software. After soliciting votes from the public, two projects "won": KeePass and the Apache Web Server.

Audit Process

The European Commission (easiest thought of as the executive branch of Europe) commissioned the Spanish aerospace and defence company Everis to carry out the review of the Apache HTTPD server (and the associated APR). Their first draft had a considerable number of false positives and a fair bit of focus on some of the more arcane build tools (e.g. our libtool that is used on OS/2, where there is no gnu-libtool). At Apache, vulnerability scans are most valuable if we see analysis and at least a theory as to why something is vulnerable -- so we then worked with Everis to improve the report. Their final report on Apache HTTPD and APR has since gone live along with the other audit reports and results.

As none of the vulnerabilities found were particularly severe, we did not need to go through a responsible disclosure path, but could post the issues publicly to the developer mailing list.

Feedback on FOSSA

As part of this work, we were also asked for feedback - especially important now that Julia Reda and Max Andersson have managed to secure a recent vote in the European Parliament for additional budget.

So in the remainder of this post I'll try to outline some of the conflicting forces around a security issue report versus a report of a vulnerability.

Security Reports

Infrastructure software needs constant maintenance to accommodate evolving platforms, and to backport or propagate improvements and new learnings throughout the code. It is not a static piece of code with 'security holes' waiting to be found. `Fixing' a hole without `lifting the helicopter' is not net-positive by definition; in fact it can be negative, for example if a 'fix' makes the code more complex, if it reduces the number of people that understand it, or if it has an adverse effect on systems that use a different CPU architecture, build environment, or operating system.

So in general terms, the main metric is whether security overall gets better - and indirectly about optimising efficient use of the available (existing and extra), but always limited, capacity and capabilities of the resources. At any given time there is both a known 1) backlog of deficiencies and known loose ends and 2) a reservoir of unknown issues. Tackling the first will generally make things more secure. Whereas searching in the latter space only makes things more secure if one finds issues that are severe enough to warrant the time spent on the unknown versus the time not spent on the known deficiencies.

To illustrate this with examples: a report from a somewhat outdated automated vulnerability tool often reduces overall security. Time that could be spent on fixing real issues and cleanups is instead spent on dealing with false positives and minor stuff. The opposite is also true: bringing a verified security issue to us with a modest bit of analysis as to how it is exploitable is virtually always a straight win. This is obviously even more true for a very severe issue (where it is immediately clear how it is exploitable).

But it is also true for the case where someone bestows time on us on a small deficiency (e.g. initially found by a tool) - provided they spend significant time and engineering on handing us the 'fix' on a well tested silver platter. And it is even more useful if a class of issues is tackled throughout; with things like updated test cases.

Throughout this it is very important to consider the threat model and what or whom the bad actors are that you are protecting against. This includes questions like: Is it when the server runs in production? Or also during build? What is the attack surface? This is particularly important when using (modern!) automated scanning tools (even after you laboriously winnow down the thousands of false positives for the one nugget).

The reason for this is that it is common for constructs such as:
  ....
  /* typically flagged: return value of mallocOrDie() seemingly unchecked */
  results = (results_t *) mallocOrDie(sizeof(results_t));
  results->sum = 0;
  /* typically flagged: no bounds check on ptr->array[i] */
  for (int i = 0; i < ptr->array_len; i++) {
    results->sum += ptr->array[i];
  ....
to be automatically flagged by (old-fashioned) tools. This is because there is seemingly no error trapping on mallocOrDie() and because there is no bounds checking on ptr->array[i]. So in those cases you need to carefully analyse how this code is used, what assumptions there are in the API, how exposed it is, and so on (e.g. is array_len public or private to the API).

The last thing you want (when the situation is more complex) is to add a whole load of sentinels to the above code. That would make the code harder to maintain and harder to test, and introduce things like the risk of a dangling else going unnoticed; then you have just reduced security by tackling a non-existent issue. It would have been better to focus, for example, on making sure that mallocOrDie() always bombs out reliably when it fails to allocate.

People and Community versus tools

Specifically, this means that people, rather than tools, spending a lot of time analysing issues is what is most valuable to Open Source communities.

By the time open source infrastructure code sees use in the market that is significant enough for the likes of FOSSA to consider it 'infrastructure and important' by some metric, it is likely that it is reasonably robust and secure.  As it is open source, it has some standing and is probably used by sizeable organisations that care about security or are regulated. Therefore, it has probably seen a fair bit of (automated and manual) security testing. 

In fact, once an open source project has become part of the landscape every security vendor worth their salt will probably test their tools on it - and try to use it as a wonderful (because you are public) example they can talk about in their sales pitches (that is, if they find something).

It also means that the issues that remain tend to be hard, and are more likely to require structural improvements (e.g. hardening an API) and large-scale, systematic changes. These result in totally disproportionate amounts of time being spent on updating test cases, testing, and manual validation, as otherwise it would probably already have been done. To some extent this also applies to automated tooling; we see that modern/complex tools that are hard to run, require a lot of manual work to update their rule bases for false positives, or require sizeable investments (such as certain types of fuzzing, code coverage tools, automated condition testing/swaps) are used less often (but thus tend to sometimes yield promising new strains of issues).

Secondly, there is the impact and cost of dealing with the report and the resulting changes. Often the report will find a lot of 'low' issues and perhaps one or two serious ones. For the latter it is absolutely warranted to 'light up' the security response of an open source project, and have people rush into action to do triage, fix, and follow up with responsible disclosure.

Given that the code is already open source, the same cannot be said for the 'low' issues. Generally anyone (bad actors and good actors alike) can find these too. So in a lot of cases it is better to work with the community to file these as bug reports; or even better, as simple issues usually have simple, non-controversial fixes, to submit the fixes and associated test cases as contributions. (It is often less work for the finder of the bug to submit a technical patch and test case than to fully write up a nicely formatted PDF report.)

Bug Bounties - a Panacea?

One 'solution' which is getting a lot of media attention is that of bug bounties, where the romantic concept of a lone open source volunteer coding the internet is replaced by a lone bounty hunter - valiantly searching for holes and getting paid if they shoot first.

If we review that solution against the needs of large, stable, communities that deal with relatively mature and stable infrastructure code (as opposed to commercial project or new code that is still evolving) we have seen a number of counter-indications stack up:
  • Fees are not high enough for the expert volunteers one would need to be enticed by the fee alone, `in bulk'.

    Take the recent Azure-Linux update reporting or the Yahoo issue as examples. 5 to 10k is unlikely to come even close to the actual cost of the few weeks to few months of engineering time at that quality level (or of compensating the years invested in training) that was required to find, analyse, and report the issue.

  • The same applies for the higher `competition' fees - topping out at 30-100k. In those cases only the first to report gets paid, so the actual payment per issue found is lower on average; with some 4 to 8 top global teams at this level and 2 to 4 high-value target events per year, that works out at well below 8k per team member per year on average.
That in itself has a number of ramifications:
  • The very best people will only engage in this as a hobby and (hence) for personal credit and pride; OR when they work for a vulnerability company that wants the PR and marketing.

BUT that means that it is personal credit & marketing that is the real driving value, not the money itself. So what then happens if we introduce money into this (already credit and marketing driven) situation? 

  • Very large numbers of people without sufficient skill may be tempted --- but then one has to worry about the impact on the open source community: is dealing with reports at that level a better use of volunteers' time than having insiders look for things? Will time spent on these fixes distract from the important things?

    Should we ask people to pre-filter, or ask people managing bug-hunting programmes to pre-vet or otherwise carry an administrative burden? (Keep in mind that there are third-party bug-hunting programmes for Apache code that the Apache Software Foundation has no control over.)
Secondly - we know (from various dissertations and experience) that introducing money into a volunteer arrangement has an impact on group dynamics and how volunteers feel rewarded; or what work they seek to get rewarded for. 

With that - it may be so that:
  • It is likely that `grunt' and `boring' work in the security area will suffer --- `let that be done by paid folks';

  • It fundamentally shifts the non-monetary (and monetary - but not relevant as too low) reward from writing secure/good code and caring/maintaining --- to the negative: finding a flaw in (someone else's) code. So feel-good, job-well-done and other feedback cycles now bypass the primary production processes (that of writing good code), or at the very least, make that feedback loop involve a bug bounty party.
Finally - in complex/mature code - the class of vulnerabilities that we probably want to get fixed tends to be very costly to find and fix - and any avenue you go down has a high risk of turning up not a security issue but a design/quality issue.

Bug bounty finders, unlike the coding volunteers, are NOT incentivised to report or fix these.

On top of this, they are more likely to go for the higher reward/lower risk kind of niggle stuff. Stuff that, without digging deeper, is likely to cause higher layers of the code to get convoluted and messy. As these groups have no incentive to reduce complexity or fix deeper issues (in fact, if one were cynical - they have every reason to stay clear of such - as it means ripe hunting grounds during periods of drought).

So at some level bug bounties are about the trade-off between rewarding (paying) a single person versus saddling a community of motivated volunteers with the fallout - not so much of genuine reports, but of everything else.

So ultimately it is about the risks of what economists call "externalisation": making a cost affect a party who did not choose to incur that cost - or denying that party a choice in how to spend their resources most effectively.

Summary and suggestions for the next FOSSA Audits

In summary:
  1. Submitting the results of automated validation (even with some human vetting) is generally a negative contribution to security. 

  2. Submitting a specific detailed vulnerability that includes some sort of analysis as to how it could be exploitable is generally a win.

  3. Broad classes of issues which (perhaps rightly!) give you hits all over the code base are generally only worth the time spent on them if there are additional resources willing to work on the structural fixes, write the test cases and test them on the myriad of platforms and settings -- and if a lot of the analysis and planning for this work has been done prior to submitting the issue (to generally a public mailing list).

    From this it also follows that narrow and specific (and hence more "new" and "unique") is generally more likely to increase overall security; while making public the results of something broad and shallow is at best not going to decrease security.

  4. Lighting up the security apparatus of an open source project is not 'free'. People are volunteers. So consider splitting your issues into: ones that need a responsible disclosure path; and ones that can go straight to the public lists. Keep in mind that, as the code is open source, you generally can err towards the open path a bit - other (bad) actors can run the same tools and processes as you.

  5. Consider raising the bar: rather than report a potential vulnerability, analyse it; have the resources to (help) solve it and support the community with expensive things, such as the human manpower for subsequent regression testing, documentation, unit tests, or searching the code for similar issues.

  6. Security is a process, over very long periods of time. So consider if you can consistently spend resources over long periods on things which are hard for (isolated) volunteers to do. And if it is something like comprehensive fuzzing, code coverage, or condition/exchange testing, then consider the fact that it is only valuable if it is: a) done over long periods of time, and b) accompanied by a large block of human manpower to do things like analyses of the results and updates of test cases.

  7. Anything that increases complexity is a risk, and may have long-term negative consequences, as it may lead to code which is harder to read, harder to maintain, or where the pool of people that can maintain it becomes disproportionately smaller. A broad sweeping change that increases complexity may need to be backed by a significant (5-10+ year) commitment of maintenance in order to be safe to implement; especially if the security improvement it brings is modest.

  8. Carefully consider the threat model and the actors when you are classing something as a security hole - especially around APIs.

  9. Carefully consider what type of resources you want to mobilise in the wider community, and what incentivises the people and processes that are most likely to improve overall security and safety. And take the overall, long-term health and social patterns of the receiving community into account when such forces for good are "external". It is all too easy to, in effect, cause a "Denial of Service" style effect, no matter how well intentioned.

  10. World-class expertise is rare; and by extension - the experts are often isolated. Bringing them together for long periods of time in relatively neutral settings gives synergy which is hard to get otherwise. Consider using a JRC or ENISA setting as a base for long term committed efforts. An effort that is perhaps more about strengthening and improving large scale (IT) infrastructures and (consumer) safety - rather than security.

  11. Bug bounties are not the only option. Some open source communities have benefited from "grants" or "stipends", where a specific issue got tackled or addressed. In some cases, such as Google's Summer of Code, the focus is on relatively young people and helps train them up; in other cases it gives established experts room for a (few) year(s) to really bottom out some long-standing issue.
With respect to the final point: security engineering (and its associated areas, such as privacy, trust, and so on) is a "hard" thing to hire for; the market generally lacks capacity and capability, in Europe as elsewhere.

While open source's access to `lots of eyeballs' does help, it does not magically give us access to a lot of the right eyeballs.

Yet increasing both Capacity and Capability in society does help. And that is a long process that starts early.

# # #

Tuesday November 15, 2016

The Apache Software Foundation Announces Apache® jclouds™ v2.0


