Monday May 20, 2019

Project Perspectives: Apache Weex (incubating) and The Apache Way

by York Shen, member of the Apache Weex Project Management Committee 

I am a Project Management Committee (PMC) member of Apache Weex (Incubating), a cross-platform mobile development framework widely used in many mobile apps, the largest of which have nearly 0.7 billion MAU (Monthly Active Users). Weex became an Open Source project in early 2016 and entered the Apache Incubator in December 2016. As a PMC member, I have been with the project from the beginning until today; it has been an exciting journey mixed with challenge and suffering, and the journey has not ended yet.

Challenge

"This is not the end. It is not even the beginning of the end. But it is, perhaps, the end of the beginning."
–Winston Churchill

As The Apache Software Foundation (ASF) divides its projects into two types, namely Top-Level Projects (TLPs) and Incubator projects (known as "podlings"), joining the Apache Incubator is not the end. Instead, it is just a beginning.

Community

As a project under the ASF, Weex should and would do things under The Apache Way. But as one might imagine, there are a few problems Weex has to solve:

It is said that "If it didn't happen on (a mailing) list, it didn't happen". As Weex was developed by Alibaba Inc. and donated to the ASF, it is not surprising that some contributors and committers of Weex are full-time employees of Alibaba Inc. As a result, there were many internal IM and face-to-face communications, which is not an endorsed way of project management and operation under The Apache Way. Also, many Weex contributors and users are Chinese, and they preferred to use Chinese to communicate, report bugs, and write documentation, which is also not typical under The Apache Way.

Engineering and Product

There are some technical issues due to the nature of Weex:

As both the Android and iOS systems are upgraded each year, their features and APIs also get updated each year. Weex is a cross-platform framework designed to provide mobile features through front-end technology, which means that it is not an easy task to map these Java (for Android) and Objective-C (for iOS) APIs to the front-end world. Yearly updates of these systems make it even harder.

Users of Weex are mainly front-end engineers while the project’s contributors and committers are Android and iOS developers: there is a technology stack gap between users and Weex contributors.

Weex does not have enough active committers: it is difficult to maintain a project that exposes operating system APIs with only about ten active committers.

Weex Way

Open Source is more than just code.

Weex has two repositories: one from before its donation to the ASF, and the other from after it. There are nearly 30 thousand stars on GitHub across these two repositories: what an exciting number! But Open Source is about more than just code.

Community

Community Over Code.

Currently, most Open Source projects adopt one of the following governance structures:
  • BDFL (benevolent dictator for life)
  • Meritocracy
  • Liberal contribution

The Apache Way promotes earned authority; the ASF champions Meritocracy, where community comes before code.

Mailing List

"If it didn't happen on (a mailing) list, it didn't happen."

As mentioned above, many Weex contributors and committers are employees of commercial companies, and some of their companies even use Weex in their production environments. Therefore we, as employees, receive a great many feature requests from our coworkers, who often choose face-to-face conversations to discuss new features.

Code commits without discussion on the mailing lists are not what Weex's PMC wants, nor are they The Apache Way. Therefore, the Apache Weex PMC needs to enforce some rules to make things right:

The dev@ mailing list is the only official communication channel. All features must be discussed on the mailing list before coding, except for tiny bugfixes such as fixing a null pointer exception.

Move GitHub PR and Issue notifications from dev@ to a separate mailing list to avoid noise.

Decision Making

Apache Weex is overseen by the ASF and developed by a community of developers. It is important to follow the Consensus building and Voting procedures. The procedure is transparent and search engine-friendly to all users worldwide. It is normal that someone stops maintaining a project due to their interests changing or perhaps a change in their work situation. Projects that obey the previous consensus building and decision making procedures are normally more stable and robust in the long term compared with projects that don’t, as current maintainers would have a better understanding of what was happening before by searching a mailing list.

By default, the official language used in Apache mailing lists is English, which causes some problems for the many Weex users who are Chinese and not proficient or comfortable communicating in English. Therefore, enthusiastic contributors of Apache Weex often use Google Translate to translate Chinese posts into English to let others know what is happening, and politely remind the original author to use English next time. It is a time-consuming and tedious job to translate others' posts, but it is worthwhile to let the rest of the world understand what is happening here (on list).

In fact, there is a discussion about language used in the Weex mailing lists.

Engineering and Product

Infrastructure

Many users rely on Weex in their commercial products, the largest of which have nearly 0.7 billion MAU. At this scale, project stability is our priority, as even 99.999% availability still means that on the order of ten thousand end users may experience problems.
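As a rough back-of-the-envelope check of the scale involved, the sketch below multiplies the user base by the unavailable fraction. It assumes, purely for illustration, that availability maps directly onto the share of monthly users affected; the function name `affected_users` is hypothetical.

```python
from fractions import Fraction


def affected_users(monthly_active_users: int, availability: Fraction) -> int:
    """Users who may hit a problem, assuming unavailability maps 1:1 to users."""
    return int(monthly_active_users * (1 - availability))


# Figures from the text: nearly 0.7 billion MAU and 99.999% availability.
mau = 700_000_000
five_nines = Fraction(99_999, 100_000)

print(affected_users(mau, five_nines))  # 7000
```

Under this assumption the result is 7,000 users for a single app of that size, which is why even "five nines" leaves a large absolute number of people exposed to a bug.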

Therefore, we choose to reduce or remove features from Weex instead of adding them:

The priority of Weex now is stability, which means only bugfixes are allowed;
New features should be added to Weex as plugins, which allows developers to choose whether to enable or disable a given plugin.
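The plugin rule above can be illustrated with a minimal, language-neutral sketch. This is not Weex's actual plugin API: the `PluginRegistry` class and the plugin name are hypothetical, and the only point is that an optional feature stays outside the stable core until an application explicitly enables it.

```python
from typing import Callable, Dict, Set


class PluginRegistry:
    """Hypothetical registry: features live in plugins that apps opt into."""

    def __init__(self) -> None:
        self._plugins: Dict[str, Callable[[], str]] = {}
        self._enabled: Set[str] = set()

    def register(self, name: str, factory: Callable[[], str]) -> None:
        # Registering makes a plugin available but leaves it disabled.
        self._plugins[name] = factory

    def enable(self, name: str) -> None:
        if name not in self._plugins:
            raise KeyError(f"unknown plugin: {name}")
        self._enabled.add(name)

    def disable(self, name: str) -> None:
        self._enabled.discard(name)

    def is_enabled(self, name: str) -> bool:
        return name in self._enabled

    def run(self, name: str) -> str:
        # Disabled plugins never execute, so the stable core is unaffected.
        if name not in self._enabled:
            raise RuntimeError(f"plugin not enabled: {name}")
        return self._plugins[name]()


registry = PluginRegistry()
registry.register("new-feature", lambda: "new-feature loaded")
registry.enable("new-feature")
print(registry.run("new-feature"))
registry.disable("new-feature")
print(registry.is_enabled("new-feature"))
```

Keeping plugins disabled by default is the design choice that matters here: an application that never enables a plugin runs only the bugfix-only core.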

Developers' Feedback


We have also established a feedback convention between our users and contributors so that user problems can be solved efficiently.

GitHub Issues are only for bug reports; other problems should go through the mailing list.

It is important to report a bug according to the bug reporting template.

Future Growth

Many users choose Weex for their commercial products, including Taobao Mobile, with hundreds of millions of users. For a list of known companies using Apache Weex, please see https://weex.apache.org/community/who-is-using-weex.html .

For now, Weex is still a project under development in the Apache Incubator. We welcome you to join the Apache Weex community. Visit us at https://weex.apache.org/

# # # 

Part of the "Success at Apache" series, Project Perspectives chronicles how projects and their communities have benefited from The Apache Way.

Wednesday April 24, 2019

The Apache Software Foundation Announces Apache® NetBeans™ as a Top-Level Project

Popular, award-winning Open Source development environment, tooling platform, and application framework enables Java programmers to easily build desktop, mobile, and Web applications

Wakefield, MA —24 April 2019— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today Apache® NetBeans™ as a Top-Level Project (TLP).

Apache NetBeans is an Open Source development environment, tooling platform, and application framework that enables Java programmers to build desktop, mobile, and Web applications. The project was originally developed as part of a student project in 1996, was acquired and open-sourced by Sun Microsystems in 2000, and became part of Oracle when it acquired Sun Microsystems in 2010. NetBeans was submitted to the Apache Incubator in October 2016.

"Being part of the ASF means that NetBeans is now not only free and Open Source software: it is also, uniquely, and for the first time, part of a foundation specifically focused on enabling open governance," said Geertjan Wielenga, Vice President of Apache NetBeans. "Every contributor to the project now has equal say over the roadmap and direction of NetBeans. That is a new and historic step and the community has been ready for this for a very long time. Thanks to the strong stewardship of NetBeans in Sun Microsystems and Oracle, Apache NetBeans is now ready for the next phase in its development and we welcome everyone to participate as equals as we move forward."

Apache NetBeans 11.0 was released on 4 April 2019, and is the project’s third major release since entering the Apache Incubator. The project most recently won the 2018 Duke's Choice Award, a well-established industry award in the Java ecosystem.

"'Have a patch for NetBeans? Then create a pull request for Apache NetBeans!' I love how that sounds," said Jaroslav Tulach, original founder and architect of NetBeans. "I am really glad the transition has gone so well and that 'my NetBeans' has turned into a full-featured project at The Apache Software Foundation."

"From the moment that I first evaluated NetBeans for use in my courses at Dawson College and Concordia University, I recognized that it was a unique tool. In the years that followed, it has never disappointed me as the best tool for education. Now, I am even more excited about using it as it becomes a top-level project in the Apache Software Foundation," said Ken Fogel, Chairperson of Computer Science Technology at Dawson College, Montreal. "A lot of amazing developers from around the world have contributed to making NetBeans a first-class tool worthy of being under The Apache Software Foundation. Now, more than ever, its continued evolution will be faster, more responsive to the needs of the development community, and ever more open to the participation of the community. I am proud to have had a very small part in its development and I am excited to see how it will grow and evolve going forward."

By becoming an Apache project, NetBeans is now able to receive more contributions from around the world. For example, large companies are using NetBeans as an application framework to build internal or commercial applications and are much more likely to contribute to NetBeans with it being part of the ASF than as part of a commercial enterprise. At the same time, individual contributors from Oracle continue to work on Apache NetBeans in its new home, as part of the worldwide community of individual contributors, both self-employed as well as from other organizations.

"Apache is the perfect home for NetBeans, allowing its long tail of historic contributors to stay involved while also launching another stage in its evolution for newcomers," said Simon Phipps, current President of the Open Source Initiative. "As a member of the new Apache NetBeans Project Management Committee, I look forward to helping in any way I can and I encourage the whole Java family to do so too."

"I've used NetBeans since I first started learning Java over 15 years ago," said Neil C. Smith, creator of PraxisLIVE. "It remains my tool of choice. It's great to be part of the Apache community and helping it to thrive. But NetBeans is more than just a development environment, it's also a powerful platform for building other business and development tools. It forms the backbone of PraxisLIVE, which I have created and continue developing on top of Apache NetBeans, powering a hybrid visual Smalltalk-like IDE for the underlying live programmable Java actor system."

"I am an avid NetBeans user, since my first experience in about 2008. The most important aspect is, quoting Java EE guru Adam Bien: ‘It always works’," said Pieter van den Hombergh, lecturer at Fontys Venlo University of Applied Sciences. "This is particularly important in my job and to my audience: I teach Java, as well as, occasionally, PHP. Now that NetBeans has gone through the hard work of the transfer from Oracle to Apache, I am glad to see it increasingly becoming complete again. I am certain to enjoy using the up to date version with Java 11+, JUnit 5 integration, and all the other goodies, either built-in or provided by the many useful plugins."

"The flip side of freedom is responsibility," added Wielenga. "Now that the community finally has what it’s been asking for for so many years, it needs to step up and take ownership of Apache NetBeans. Each and every user of Apache NetBeans now has the ability to ask themselves where they can best fit in to drive the project forward -- from evaluating bugs, to reviewing pull requests, to tweaking the documentation, to verifying tutorials, to helping answer questions on the mailing lists, or sharing tips and insights on Twitter. Lack of Java knowledge and even lack of programming knowledge is no excuse; there’s really something to do for everyone with any skill or interest level. There is no need nor excuse to stand on the sidelines anymore -- NetBeans is now yours, exactly as much as you want it to be."

Catch Apache NetBeans in action at conferences all over the world. Users are welcome to set up and host their own Apache NetBeans events, such as the annual Apache NetBeans Day UK, which will be held 27 September 2019, in London.

Availability and Oversight
Apache NetBeans software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache NetBeans, visit http://netbeans.apache.org/ and https://twitter.com/netbeans

About the Apache Incubator
The Apache Incubator is the entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects seeking to become an Apache project or initiative enter through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects that provide $20B+ worth of Apache Open Source software to the public at 100% no cost. Through the ASF's merit-based process known as "The Apache Way," more than 730 individual Members and 7,000 Committers across six continents successfully collaborate to develop freely available enterprise-grade software, benefiting billions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Aetna, Alibaba Cloud Computing, Anonymous, ARM, Baidu, Bloomberg, Budget Direct, Capital One, Cerner, Cloudera, Comcast, Facebook, Google, Handshake, Hortonworks, Huawei, IBM, Indeed, Inspur, Leaseweb, Microsoft, ODPi, Pineapple Fund, Pivotal, Private Internet Access, Red Hat, Target, Tencent, Union Investment, Workday, and Verizon Media. For more information, visit http://apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "NetBeans", "Apache NetBeans", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Thursday March 21, 2019

The Apache Software Foundation Announces Apache® Unomi™ as a Top-Level Project

Powerful Open Source Customer Data Platform in use at Al-Monitor, Altola, Jahia, and Yupiik, among others. 

Wakefield, MA —21 March 2019— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today Apache® Unomi™ as a Top-Level Project (TLP).

Apache Unomi is a standards-based Customer Data Platform (CDP) that manages online customer, lead, and visitor information to provide personalized experiences that adhere to visitor privacy rules such as GDPR and “Do Not Track” preferences. The project was originally developed at Jahia, and was submitted to the Apache Incubator in October 2015.

"I am truly thankful to our community, especially our mentors, who have helped us achieve this milestone," said Serge Huber, Vice President of Apache Unomi. "The original vision behind Unomi was to ensure true privacy by making the technologies handling customer data completely Open Source and independent. Since it was submitted to the Apache Incubator, developing Unomi using the Apache Way will ensure the project grows its community to be more diverse and welcome new users and developers."

Apache Unomi is versatile, and features privacy management, user/event/goal tracking, reporting, visitor profile management, segmentation, personas, A/B testing, and more. It can be used as:

  • a personalization service for a Web CMS;

  • an analytics service for native mobile applications;

  • a centralized profile management system with segmentation capabilities; and

  • a consent management hub

Apache Unomi is the industry's first reference implementation of the upcoming OASIS CDP specification (established by the OASIS CXS Technical Committee, which sets standards as a core technology for enabling the delivery of personalized user experiences). As a reference implementation, Apache Unomi serves as a real-world example of how the standard will be stable, and is quickly gaining traction among those interested in truly open and transparent customer data privacy. Apache Unomi is in use at organizations such as Al-Monitor, Altola, Jahia, Yupiik, and many others to create and deliver consistent personalized experiences across channels, markets, and systems.

"When Serge and I announced the launch of the Apache Unomi project at the 2015 ApacheCon Budapest, Apache Unomi, at that time, was the first proposal among the rising Customer Data Platform industry's segment, positioned as an 'ethical data-driven marketing' product that would respect the privacy of customers while leveraging the power of unified customers data," said Elie Auvray, Head of Business Development at Jahia. "Jahia's digital experience management solutions are based on Apache Unomi, and we can't wait to see how the project will now evolve with its growing community. Seeing today Apache Unomi becoming a Top-Level Project is a great reward for us as Open Source software believers. We are proud of this milestone, grateful to the Apache Software Foundation and our mentors, and we know it's only the beginning of a new –hopefully long and successful– journey."

"Under development at OASIS, the Customer Data Platform specification –for which Apache Unomi aims to be the reference implementation– lies at the crossroads of many solution providers' needs such as WCM, CRM, Big Data Platforms, Machine Learning, IoT and Digital Marketing," said Laurent Liscia, CEO of OASIS. "At a time when client data interoperability and built-in data privacy are mandatory foundations for legal, consistent, and personalized experiences across channel markets and systems, the CDP specification, together with Apache Unomi, is a clear and welcome answer to end-user concerns."

"Apache Unomi is the perfect solution to implement a user profile platform," said Jean-Baptiste Onofré, Fellow at Talend. "It fully addresses the user trust and privacy needs, allowing to easily create user profile and Web marketing features. As Unomi is powered by Apache Karaf, it's also a great platform for several use cases, such as digital marketing in Web applications, managing user profiles on IoT devices, and more."

"Apache Unomi enables Al-Monitor readers to be driven towards additional personalized content that corresponds, via content tags profiling and related automated segmentations, to what they have already accessed," said Valerie Voci, Head of Digital Strategy and Marketing at Al-Monitor. "This data follows our customers where they go, so it's a consistent experience whether they are getting these recommendations in their inbox or on the Website or both. And if a change takes place on one, that change is immediately reflected on the other. It helps us create a very cohesive marketing message and a great overall digital experience."

"As we were developing a progressive web app (PWA) for a client, we were looking for a Customer Data Platform (CDP) to store customer insights, such as behavioral and explicit customer data," said Lars Petersen, Co-Founder at Altola. "Privacy was table stakes for us, along with the flexibility to customize the data schema and an open API. We selected Apache Unomi based on these parameters; we had it up and running on AWS in less than 30 min. and are very impressed with the maturity of the platform, its privacy by design and how easy it was to work with."

"In a digital world, customer data is very important to offer a better experience to users. However, data privacy and trust is not an option for users," said François Papon, CTO at Yupiik. "Apache Unomi is the best solution for our clients because it's an Open Source project managed by an independent foundation, there is no vendor lock-in. It's also based on other solutions like Apache Karaf that made it ready for modularity, scalability, cloud, devops, and more." 

"Apache Unomi is poised to disrupt the Customer Data Platform market," said Thomas Sigdestad, CTO at Enonic, and co-chair, with Serge Huber, of the CDP standards work at OASIS Open. "The CDP marketplace is lacking a standard way of exchanging data, and the vendor space is over-represented by closed source and proprietary cloud offerings. This effectively limits the potential and adoption of CDP in general. Apache Unomi is not merely Open Source, but also the reference implementation of the imminent CDP standard from OASIS. Companies using Unomi will benefit from faster and simpler integrations without locking their customer data into yet another proprietary silo."

"Graduating as an Apache Top-Level Project is only the beginning," added Huber. "Unomi has a lot of potential that is still to be developed, and is a perfect opportunity for those interested in Customer Data Privacy to participate through our mailing lists and Slack channel, and to learn more about the project on our Website and presentations."

Catch Apache Unomi in action at ApacheCon North America (9-12 September 2019 in Las Vegas, Nevada), and ApacheCon Europe (22-24 October 2019 in Berlin, Germany) http://apachecon.com/ .

Availability and Oversight
Apache Unomi software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Unomi, visit http://unomi.apache.org/

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 730 individual Members and 7,000 Committers across six continents successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Aetna, Alibaba Cloud Computing, Anonymous, ARM, Baidu, Bloomberg, Budget Direct, Capital One, Cerner, Cloudera, Comcast, Facebook, Google, Handshake, Hortonworks, Huawei, IBM, Indeed, Inspur, Leaseweb, Microsoft, ODPi, Pineapple Fund, Pivotal, Private Internet Access, Red Hat, Target, Tencent, Union Investment, Workday, and Verizon Media. For more information, visit http://apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "Unomi", "Apache Unomi", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

Wednesday March 20, 2019

Project Perspectives: Apache RocketMQ and The Apache Way

by Von Gosling, member of the Apache RocketMQ Project Management Committee

Last year, I wrote a blog post about how communities from a non-English-speaking country understand and use The Apache Way for open innovation. In that article, I mainly expressed the desire, as a developer, to be open, to make good use of mailing lists, to listen to community voices, and to make decisions. In addition, project communities can also organize more programming activities like Google Summer of Code ("GSoC"; applications are now open for the 2019 program) to help developers learn how to become involved in the community-led development process, as well as to provide greater help and encouragement for developers to join the community. Since then, I have been thinking about sharing some Apache RocketMQ community building and collaborative innovation stories through real-life examples. It is my great honor to share some of these stories with the community at the upcoming ASF anniversary celebration. In my opinion, The Apache Way has greatly helped Apache RocketMQ to grow its community since entering the Apache Incubator in 2016.

Apache RocketMQ was originally created and used as a distributed messaging engine for online e-commerce transaction processing. It can handle billions, even trillions, of message transmissions in many companies’ production environments. RocketMQ has proven suitable for high-throughput, low-latency messaging systems in large-scale distributed scenarios. OpenMessaging, a standards project at the Linux Foundation, provides a common benchmarking platform[1] for distributed messaging systems in the cloud, including Apache RocketMQ. As a widely used messaging engine, at the functional level RocketMQ provides both pull and push models and supports scheduled messages, ordered messages, batch messages, broadcast messages, message filtering, dead-letter queues, and so on, covering almost all classic event-driven or streaming scenarios. Even so, a few core features were still missing from an enterprise perspective. Last year, the RocketMQ community announced three attractive features: transactional messages, message tracing, and authentication and authorization. Transactional messages guarantee transactional consistency between sending a message and the sender's local business operation. This very valuable feature was initiated and contributed by several individuals from the financial industry. Building on transactional messages, we could build a full-stack distributed transaction platform suitable for long-running microservices.
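To make the consistency guarantee between the sender and its local business operation concrete, here is a toy sketch of one common two-phase approach: the message is first stored in a prepared state that consumers cannot see, and it becomes visible only if the local operation succeeds. This is an illustrative assumption, not the RocketMQ client API; `ToyBroker`, `send_in_transaction`, and the message bodies are hypothetical names, and the sketch omits the status check-back a real broker would need if the sender crashed between the two phases.

```python
from enum import Enum
from typing import Callable, Dict, List


class TxState(Enum):
    COMMIT = "commit"
    ROLLBACK = "rollback"


class ToyBroker:
    """In-memory stand-in for a broker; not the RocketMQ client API."""

    def __init__(self) -> None:
        self._half: Dict[int, str] = {}  # prepared, invisible to consumers
        self.visible: List[str] = []     # deliverable to consumers
        self._next_id = 0

    def send_half(self, body: str) -> int:
        self._next_id += 1
        self._half[self._next_id] = body
        return self._next_id

    def end_transaction(self, msg_id: int, state: TxState) -> None:
        body = self._half.pop(msg_id)
        if state is TxState.COMMIT:
            self.visible.append(body)
        # On ROLLBACK the prepared message is simply dropped.


def send_in_transaction(
    broker: ToyBroker, body: str, local_tx: Callable[[], None]
) -> TxState:
    """Phase 1: prepare the message. Phase 2: commit or roll back."""
    msg_id = broker.send_half(body)
    try:
        local_tx()  # the sender's local business operation
        state = TxState.COMMIT
    except Exception:
        state = TxState.ROLLBACK
    broker.end_transaction(msg_id, state)
    return state


def failing_payment() -> None:
    raise RuntimeError("insufficient funds")


broker = ToyBroker()
send_in_transaction(broker, "order-1 paid", lambda: None)
send_in_transaction(broker, "order-2 paid", failing_payment)
print(broker.visible)  # ['order-1 paid']
```

Only the message whose local operation succeeded reaches consumers, so downstream services never act on a payment that did not happen.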

In enterprise messaging applications, there has always been a troubling problem. Where was my message sent? How can I tell whether messages are accumulating because a consumer has failed? That is a very difficult task, especially when we are providing a cloud pub/sub service: because messaging is an asynchronous, decoupled process, upstream and downstream are naturally unaware of each other. Fortunately, some people from the China Mobile Research Institute found some of the PMC members at an Apache RocketMQ Meetup and mentioned their troubles. So, we showed them the latest RIP[2] plan from the community, which is a very challenging optimization and improvement of Apache RocketMQ's internal code. With the help of a Project Management Committee (PMC) member, the RIP was proposed, discussed, and accepted. After the necessary time planning, we started to design, code, and discuss and exchange implementation details, which included several meetups to discuss and review the code together, up to the later online verification and release. More interestingly, during the subsequent review process, another cloud vendor from the community joined. After a brief video call, the original implementation was optimized to be compatible with that vendor's implementation. In the end, we in the community were happy to see that the new version had been verified in production by two cloud vendors. The ACL feature was likewise a process in which the PMC and the community continued to collaborate until the final version of the RIP was published. By collecting requirements through meetups and combining open discussion with video communication using Zoom, the RocketMQ community completed several important releases in the last year. At the same time, in order to better promote a flourishing ecosystem, several projects residing under the Apache RocketMQ external repository were reorganized (more than 80% of these projects were contributed by the community during incubation).

In addition to setting milestones, this time we added a similar incubation and graduation mechanism, which further reduces the difficulty of community participation while better guaranteeing product quality. Today, several graduated SDK projects for different languages are used and maintained by a large number of users. The enthusiasm of the community has completely surpassed our expectations. It also confirms that the future cloud architecture is language-independent, even serverless. Following this general trend, community members have actively participated in building RocketMQ's multi-language ecosystem. RocketMQ now supports Java, C++, Python, Go, and Node.js, and other languages are planned and on the way. The current C++ client even supports up to 8 platforms, including CentOS, MacOS, Ubuntu, and Windows.

Not only that, more and more community enthusiasts are spontaneously self-organizing: they are actively planning similar activities in their own cities, which also deserve attention and encouragement from the PMC. This has also prompted us to think about whether the Apache community should have developer-oriented roles similar to release manager, such as a developer-relationship maintainer or a project manager, to help more users understand and become more involved in the product. Community development in recent years has also brought many new signs to the RocketMQ community. There are more and more active developers in the community. In roughly three months, nearly 2,000 emails were sent to the dev mailing list. Research has shown that 70% of top banks in China use Apache RocketMQ on their core business links, approximately 60% of Internet finance and insurance customers use RocketMQ in their production environments, and 75% of the top 20 Internet companies in China widely use it in classic pub/sub scenarios.

Recently, the RocketMQ community has been discussing the development of the next-generation messaging platform. We hope it will be a unified messaging engine with a lightweight data processing platform, and we welcome everyone to become involved and tell the PMC which features you are most looking forward to seeing in future versions of RocketMQ.

Many thanks to The Apache Software Foundation for its open and inclusive community culture, which helps RocketMQ build a broader community. The ASF will soon celebrate its 20th anniversary. I am pleased to represent the RocketMQ community in wishing the ASF a happy 20th anniversary, with hopes that it continues to thrive for many more years.


[1] https://github.com/openmessaging/openmessaging-benchmark
[2] https://github.com/apache/rocketmq/wiki/RocketMQ-Improvement-Proposal

# # #

Part of the "Success at Apache" series, Project Perspectives chronicles how projects and their communities have benefited from The Apache Way. 

Tuesday February 19, 2019

The Apache® Software Foundation Announces Apache Arrow™ Momentum

Open Source Big Data in-memory columnar layer adopted by dozens of Open Source and commercial technologies; exceeded 1,000,000 monthly downloads within first three years as an Apache Top-Level Project

Wakefield, MA —19 February 2019— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, today announced momentum with Apache® Arrow™, the Open Source Big Data in-memory columnar layer.

Since the founding of the project in January 2016, Apache Arrow has quickly become the de facto standard for representing and processing analytical data in memory, accelerating analytical processing and interchange by more than 100x.

"When we became a Top-Level Project, we projected that the majority of the world's data will be processed through Arrow within the next decade," said Jacques Nadeau, Vice President of Apache Arrow. "In just three years time, we are proud to see Arrow's substantial industry adoption and increased value across a wide range of analytical, machine learning, and artificial intelligence workloads."

Highlights of Apache Arrow's success include:

Industry Adoption —more than 20 major technologies adopted Arrow to accelerate in-memory analytics, including Apache Spark, NVIDIA RAPIDS, pandas, and Dremio, among others. A list of known Open Source and commercial implementations can be found at https://arrow.apache.org/powered_by/

Millions of Downloads —leveraging and integrating Apache Arrow into many other technologies has bolstered downloads to more than 1,000,000 each month.

New Language Support —as a cross-language development platform, supporting multiple programming languages is paramount. Apache Arrow has grown from supporting one language to eleven different languages today; they include C++, Java, Python, R, C#, JavaScript, and Ruby, among others.

Seamless Data Format Support —Arrow supports different data types, both simple and nested, located in arbitrary memory such as regular system RAM, memory-mapped files or on-GPU memory. In addition, it can ingest data from popular storage formats such as Apache Parquet, CSV files, Apache ORC, JSON, and more.

Major Code Donations —Apache Arrow's new features and expanded functionality are due in part to code and component donations that include:
  • C# Library
  • Gandiva LLVM-based Expression Compiler
  • Go Library
  • JavaScript Library
  • Plasma Shared Memory Object Store
  • Ruby Libraries (Apache Arrow and Apache Parquet)
  • Rust Libraries (Parquet and DataFusion Query Engine)

Community and Contributor Growth —over the past 12 months, nearly 300 individuals have submitted more than 3,000 contributions that have grown the Apache Arrow code base by 300,000 lines of code. The Arrow community is welcoming approximately 10 new contributors each month.


In January the project announced its most recent release, Apache Arrow 0.12.0, which reflects more than 600 enhancements developed during Q4 2018. The Apache Arrow community is actively working on a number of impactful new initiatives that include solving high performance analytical problems and allowing for more efficient data distribution across entire clusters.

"Apache Arrow's rapid industry adoption and developer community growth supports our original thesis of the importance of a language-independent open standard for columnar data," said Wes McKinney, member of the Apache Arrow Project Management Committee, and creator of Python's pandas project. "Additionally, we are seeing productive collaborations take place not only between programming languages but also between the database systems and data science worlds. We look forward to welcoming more data system developers into our community."

About Apache Arrow
Apache Arrow is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also provides computational libraries and zero-copy streaming messaging and interprocess communication. Languages currently supported include C, C++, C#, Go, Java, JavaScript, MATLAB, Python, R, Ruby, and Rust.
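To make the idea of a columnar memory format concrete, here is a minimal sketch in plain Python. It does not use the Arrow libraries and does not reproduce the Arrow specification; the sample data and the helper names `to_columns` and `column_sum` are hypothetical, chosen only to illustrate why storing each field in its own contiguous typed buffer suits analytic operations.

```python
from array import array

# Row-oriented records, as an application might hold them (hypothetical data).
rows = [
    {"id": 1, "price": 9.5},
    {"id": 2, "price": 12.0},
    {"id": 3, "price": 7.25},
]

def to_columns(records):
    """Pivot row records into one contiguous typed buffer per field."""
    return {
        "id": array("q", (r["id"] for r in records)),        # 64-bit signed ints
        "price": array("d", (r["price"] for r in records)),  # 64-bit floats
    }

def column_sum(columns, name):
    """Aggregate a single column without touching the other fields."""
    return sum(columns[name])

columns = to_columns(rows)
total = column_sum(columns, "price")
```

In this layout an aggregate over "price" scans one densely packed buffer instead of visiting every record, which is the general motivation behind columnar formats.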

Availability and Oversight
Apache Arrow software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Arrow, visit http://arrow.apache.org/

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 730 individual Members and 7,000 Committers across six continents successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official global conference series. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Aetna, Alibaba Cloud Computing, Anonymous, ARM, Baidu, Bloomberg, Budget Direct, Capital One, Cerner, Cloudera, Comcast, Facebook, Google, Handshake, Hortonworks, Huawei, IBM, Indeed, Inspur, LeaseWeb, Microsoft, Oath, ODPi, Pineapple Fund, Pivotal, Private Internet Access, Red Hat, Target, Tencent, Union Investment, and Workday. For more information, visit http://apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "Arrow", "Apache Arrow", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Tuesday January 08, 2019

The Apache Software Foundation Announces Apache® Airflow™ as a Top-Level Project

Open Source Big Data workflow management system in use at Adobe, Airbnb, Etsy, Google, ING, Lyft, PayPal, Reddit, Square, Twitter, and United Airlines, among others.

Wakefield, MA —8 January 2019— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today Apache® Airflow™ as a Top-Level Project (TLP).

Apache Airflow is a flexible, scalable workflow automation and scheduling system for authoring and managing Big Data processing pipelines of hundreds of petabytes. Graduation from the Apache Incubator as a Top-Level Project signifies that the Apache Airflow community and products have been well-governed under the ASF's meritocratic process and principles.

"Since its inception, Apache Airflow has quickly become the de-facto standard for workflow orchestration," said Bolke de Bruin, Vice President of Apache Airflow. "Airflow has gained adoption among developers and data scientists alike thanks to its focus on configuration-as-code. That has gained us a community during incubation at the ASF that not only uses Apache Airflow but also contributes back. This reflects Airflow’s ease of use, scalability, and power of our diverse community; that it is embraced by enterprises and start-ups alike, allows us to now graduate to a Top-Level Project."

Apache Airflow is used to easily orchestrate complex computational workflows. Through smart scheduling, database and dependency management, error handling and logging, Airflow automates resource management, from single servers to large-scale clusters. Written in Python, the project is highly extensible and able to run tasks written in other languages, allowing integration with commonly used architectures and projects such as AWS S3, Docker, Apache Hadoop HDFS, Apache Hive, Kubernetes, MySQL, Postgres, Apache Zeppelin, and more. Airflow originated at Airbnb in 2014 and was submitted to the Apache Incubator in March 2016.
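At its core, the dependency management mentioned above means running each task only after the tasks it depends on have finished. The following is a minimal sketch of that idea using only the Python standard library; it is not Airflow's API, and the task names and the `run_pipeline` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}

def run_pipeline(deps, actions):
    """Run every task after all of its dependencies, returning the order used."""
    order = []
    for task in TopologicalSorter(deps).static_order():
        actions[task]()  # a real scheduler would add retries, logging, etc.
        order.append(task)
    return order

log = []
actions = {name: (lambda n=name: log.append(n)) for name in dependencies}
order = run_pipeline(dependencies, actions)
```

Declaring the workflow as data in ordinary code, as here, is the essence of the configuration-as-code approach: the pipeline can be versioned, reviewed, and tested like any other source file.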

Apache Airflow is in use at more than 200 organizations, including Adobe, Airbnb, Astronomer, Etsy, Google, ING, Lyft, NYC City Planning, PayPal, Polidea, Qubole, Quizlet, Reddit, Reply, Solita, Square, Twitter, and United Airlines, among others. A list of known users can be found at https://github.com/apache/incubator-airflow#who-uses-apache-airflow

"Adobe Experience Platform is built on cloud infrastructure leveraging open source technologies such as Apache Spark, Kafka, Hadoop, Storm, and more," said Hitesh Shah, Principal Architect of Adobe Experience Platform. "Apache Airflow is a great new addition to the ecosystem of orchestration engines for Big Data processing pipelines. We have been leveraging Airflow for various use cases in Adobe Experience Cloud and will soon be looking to share the results of our experiments of running Airflow on Kubernetes." 

"Our clients just love Apache Airflow. Airflow has been a part of all our Data pipelines created in past 2 years acting as the ring-master and taming our Machine Learning and ETL Pipelines," said Kaxil Naik, Data Engineer at Data Reply. "It has helped us create a Single View for our client's entire data ecosystem. Airflow's Data-aware scheduling and error-handling helped automate entire report generation process reliably without any human-intervention. It easily integrates with Google Cloud (and other major cloud providers) as well and allows non-technical personnel to use it without a steep learning curve because of Airflow’s configuration-as-a-code paradigm."

"With over 250 PB of data under management, PayPal relies on workflow schedulers such as Apache Airflow to manage its data movement needs reliably," said Sid Anand, Chief Data Engineer at PayPal. "Additionally, Airflow is used for a range of system orchestration needs across many of our distributed systems: needs include self-healing, autoscaling, and reliable [re-]provisioning."

"Since our offering of Apache Airflow as a service in Sept 2016, a lot of big and small enterprises have successfully shifted all of their workflow needs to Airflow," said Sumit Maheshwari, Engineering Manager at Qubole. "At Qubole, not only are we a provider, but also a big consumer of Airflow as well. For example, our whole Insight and Recommendations platform is built around Airflow only, where we process billions of events every month from hundreds of enterprises and generate insights for them on big data solutions like Apache Hadoop, Apache Spark, and Presto. We are very impressed by the simplicity of Airflow and ease at which it can be integrated with other solutions like clouds, monitoring systems or various data sources."

"At ING, we use Apache Airflow to orchestrate our core processes, transforming billions of records from across the globe each day," said Rob Keevil, Data Analytics Platform Lead at ING WB Advanced Analytics. "Its feature set, Open Source heritage and extensibility make it well suited to coordinate the wide variety of batch processes we operate, including ETL workflows, model training, integration scripting, data integrity testing, and alerting. We have played an active role in Airflow development from the onset, having submitted hundreds of pull requests to ensure that the community benefits from the Airflow improvements created at ING.  We are delighted to see Airflow graduate from the Apache Incubator, and look forward to see where this exciting project will be taken in future!"

"We saw immediately the value of Apache Airflow as an orchestrator when we started contributing and using it," said Jarek Potiuk, Principal Software Engineer at Polidea. "Being able to develop and maintain the whole workflow by engineers is usually a challenge when you have a huge configuration to maintain. Airflow allows your DevOps to have a lot of fun and still use the standard coding tools to evolve your infrastructure. This is 'infrastructure as a code' at its best."

"Workflow orchestration is essential to the (big) data era that we live in," added de Bruin. "The field is evolving quite fast and the new data thinking is just starting to make an impact. Apache Airflow is a child of the data era and therefore very well positioned, and is also young so a lot of development can still happen. Airflow can use bright minds from scientific computing, enterprises, and start-ups to further improve it. Join the community, it is easy to hop on!"

Availability and Oversight
Apache Airflow software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Airflow, visit http://airflow.apache.org/ and https://twitter.com/ApacheAirflow

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 730 individual Members and 7,000 Committers across six continents successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Aetna, Alibaba Cloud Computing, Anonymous, ARM, Baidu, Bloomberg, Budget Direct, Capital One, Cerner, Cloudera, Comcast, Facebook, Google, Handshake, Hortonworks, Huawei, IBM, Indeed, Inspur, LeaseWeb, Microsoft, Oath, ODPi, Pineapple Fund, Pivotal, Private Internet Access, Red Hat, Target, Tencent, and Union Investment. For more information, visit http://apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "Airflow", "Apache Airflow", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Tuesday January 30, 2018

The Apache Software Foundation Announces Apache® Kibble™ as a Top-Level Project

Open Source tools used for collecting, aggregating and visualizing software project activity.

Wakefield, MA —30 January 2018— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today Apache® Kibble™ as a Top-Level Project (TLP).

Apache Kibble is an activity reporting platform created to collect, aggregate, analyze, and visualize activity in software projects and communities. With Kibble, users can track a project's code, discussions, issues, and individuals through detailed views mapped across specified time periods.

"We are passionate about solving hard problems, particularly as they relate to defining and measuring a project's success," said Rich Bowen, Vice President of Apache Kibble. "As doing so is notoriously difficult, we want to provide a set of tools that allow a project to define success, and track their progress towards that success, in terms that make the most sense for their community. Apache Kibble is a way to make this happen."

Apache Kibble is the latest project to enter the ASF directly as a Top-Level Project, bypassing the Apache Incubator (the official entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation). As part of its eligibility, Apache Kibble had to meet the many requirements of the Apache Maturity Model http://s.apache.org/O4p that include a project’s code, copyright, licenses, releases, consensus building, independence, and more.

Kibble is the Open Source edition of Snoot, the enterprise project and community reporting platform used by dozens of Apache projects as well as by the ASF for its official reports including the ASF Annual Report.

"By gaining an in-depth view into the ASF's operations through 1,433 Apache project repositories, we are able to obtain performance metrics for more than 300 Apache projects and nearly 900 million code line changes by more than 6,500 contributors," said Sally Khudairi, Vice President of Marketing and Publicity at The Apache Software Foundation. "We are excited to share the ability to provide insight with projects of all kinds, and help their communities identify trends and advance their impact."

"We're getting input and data from both a wide range of Apache projects as well as projects from outside of the foundation," added Bowen. "We're also collecting historical metrics from older projects with their rich history of successes and mistakes. They have a great deal of history and passion around measuring their communities, and hearing from disparate projects is helping to refine that vision. We would love to hear from more projects about what metrics are important to track, and invite their communities to join our mailing lists to discuss how we can help one another."

Catch Apache Kibble in action at FOSDEM, 3-4 February 2018 in Brussels.

Availability and Oversight
Apache Kibble software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Kibble, visit http://kibble.apache.org/ and https://twitter.com/ApacheKibble

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 680 individual Members and 6,500 Committers across six continents successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Aetna, Alibaba Cloud Computing, ARM, Baidu, Bloomberg, Budget Direct, Capital One, Cash Store, Cerner, Cloudera, Comcast, Facebook, Google, Hortonworks, Huawei, IBM, Inspur, iSIGMA, ODPi, LeaseWeb, Microsoft, PhoenixNAP, Pivotal, Private Internet Access, Red Hat, Target, Union Investment, and Yahoo. For more information, visit http://apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "Kibble", "Apache Kibble", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Wednesday January 10, 2018

The Apache Software Foundation Announces Apache® Trafodion™ as a Top-Level Project

Mature Big Data database management system for working in SQL at Apache Hadoop-scale levels in use at China Mobile, China Unicom, Dell EMC, Esgyn Corporation, and Millersoft Limited, among others.

Forest Hill, MD —10 January 2018— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Apache® Trafodion™ has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles.

Apache Trafodion extends Apache Hadoop to guarantee transactional integrity and operational workloads for new kinds of Big Data applications that run on Hadoop.

 "We are very excited to have been established as an Apache Top-Level Project," said Pierre Smits, Vice President of Apache Trafodion. "Graduation is a terrific milestone that culminates 2.5 years of contributions from around the globe to establishing a growing community committed to delivering a high-grade OLTP solution on top of the Apache Hadoop ecosystem."

Building on the scalability, elasticity, and flexibility of Hadoop, Trafodion (meaning "transactions" in Welsh) is the first integrated Open Source solution that delivers on the promise of integrated transactional and analytical systems (OLTP/OLAP) for Apache Hadoop. Trafodion's features include:
  • Fully functional ANSI SQL support, leveraging existing SQL skills;
  • Distributed ACID data protection, guaranteeing data consistency across multiple tables and rows;
  • Compile-Time and Run-Time Optimizers, delivering performance improvements for OLTP workloads;
  • Parallel-aware Query Optimizer, supporting large data sets;
  • Apache Spark integration, supporting streaming analysis;
  • Interoperability with existing Apache Hadoop tools and solutions, such as Hive, Ambari, Flume, Kafka, and Oozie; and 
  • Apache Hadoop and Linux distribution neutrality.

Trafodion originated at HP-IT in 2013, and was donated to the Apache Incubator in May 2015. The project has had four official releases since entering the Apache Incubator. 

Apache Trafodion is in use at China Mobile, China Unicom, Dell EMC, Esgyn Corporation, and Millersoft Limited, among others.

"As a member of the HP Core Team responsible for releasing Trafodion to The Apache Software Foundation, and responsible for the project’s name, I'm thrilled to see the Trafodion community be recognized with this major achievement. Congratulations to all who made it possible," said Ken Holt, COO at Esgyn Corporation. "Trafodion is the heart of EsgynDB, and the community is like its lifeblood — we at Esgyn are committed to continue to grow and support the community."

"Congratulations to the Trafodion community for becoming an Apache Top-Level Project," said Tianduo Gao, Senior Development Engineer of Software Technology (Suzhou) at China Mobile. "We are planning to use Trafodion to expand the business of China Mobile's Big Data platform: our data statistics of 4G real-time business in the country and provinces are more efficient than ever before."

"Becoming a core Apache Project is a major step forward for Trafodion. It will give Millersoft the confidence to introduce the technology to our Big Data clients," said Calum Miller, Director of Millersoft Limited. "Testing of our Open Source Data Vault engine running on top of Apache Trafodion is going well and we look forward to announcing a fully integrated product shortly."

"Apache Trafodion enhanced the operational efficiency of our Big Data platforms, and brought us better customer experience and broader application scenarios," said Charles Yu, Managing Director, Application Services at Dell EMC.

"Congratulations to Trafodion for officially becoming part of the Apache open source ecosystem," said Qingquan Gu, Senior Development Engineer of Internet of Things Marketing Service Center at China Unicom. "Using Trafodion provided China Unicom with the ability to build and integrate Big Data platforms, enhanced our operational efficiency, and brought us better customer experience."

"Becoming an Apache Top-Level Project is only the beginning," added Smits. "We are looking forward to growing the Trafodion community, reaching new adopters and contributors, and fostering a strong ecosystem around the project."

Availability and Oversight
Apache Trafodion software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Trafodion, visit http://trafodion.apache.org/ and https://twitter.com/Trafodion

About the Apache Incubator
The Apache Incubator is the entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects wishing to join the ASF enter through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 680 individual Members and 6,300 Committers across six continents successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cash Store, Cerner, Cloudera, Comcast, Facebook, Google, Hewlett Packard, Hortonworks, Huawei, IBM, Inspur, iSIGMA, ODPi, LeaseWeb, Microsoft, PhoenixNAP, Pivotal, Private Internet Access, Red Hat, Serenata Flowers, Target, Union Investment, WANdisco, and Yahoo. For more information, visit http://apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "Trafodion", "Apache Trafodion", "Hadoop", "Apache Hadoop", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Tuesday November 28, 2017

The Apache Software Foundation Announces Apache® Impala™ as a Top-Level Project

High performance analytic database for Apache Hadoop in-Cloud or on-premises in use at Caterpillar, Cox Automotive, Jobrapido, Marketing Associates, the New York Stock Exchange, phData, and Quest Diagnostics, among others.

Forest Hill, MD —28 November 2017— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Apache® Impala™ has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles.

Apache Impala is a modern, high-performance analytic database for Apache Hadoop. The massively parallel processing (MPP) SQL query engine allows for analytical queries on data stored on-premises (in HDFS or Apache Kudu) or in Cloud object storage via SQL or business intelligence tools without having to migrate data sets into specialized systems or proprietary formats.

"The Impala project has grown a lot since we entered incubation in December 2015," said Jim Apple, Vice President of Apache Impala. "With the help of our mentors and the Incubator, we have grown as a community and adopted the Apache Way, all while the Impala contributors have helped make Impala more stable and performant."

In addition to using the same unified storage platform as other Hadoop components, Impala also uses the same metadata, SQL syntax (Apache Hive SQL), ODBC driver, and user interface (Impala query UI in Hue) as Hive. This provides a familiar and unified platform for real-time or batch-oriented queries. Impala provides:
  • A familiar SQL interface that data scientists and analysts already know;
  • The ability to query high volumes of data (Big Data) in Apache Hadoop;
  • Distributed queries in a cluster environment, for convenient scaling and to make use of cost-effective commodity hardware;
  • The ability to share data files between different components with no copy or export/import step; for example, to write with Apache Pig, transform with Hive and query with Impala. Impala can read from and write to Hive tables, enabling simple data interchange using Impala for analytics on Hive-produced data; and
  • A single system for big data processing and analytics, so customers can avoid costly modeling and ETL just for analytics.

Impala was inspired by Google's F1 database, which also separates query processing from storage management. It was originally released in 2012 and entered the Apache Incubator in December 2015. The project has had four releases during its incubation process.

"In 2011, we started development of Impala in order to make state-of-the-art SQL analytics available to the user community as open-source technology," said Marcel Kornacker, original founder of the Impala project. "The graduation to an Apache Top-Level Project is a recognition of the exceptional developer community that stands behind this project."

Apache Impala is deployed across a number of industries such as financial services, healthcare, and telecommunications, and is in use at companies that include Caterpillar, Cox Automotive, Jobrapido, Marketing Associates, the New York Stock Exchange, phData, and Quest Diagnostics. In addition, Impala is shipped by Cloudera, MapR, and Oracle.

"Apache Impala is our interactive SQL tool of choice. Over 30 phData customers have it deployed to production," said Brock Noland, Chief Architect at phData. "Combined with Apache Kudu for real-time storage, Impala has made architecting IoT and Data Warehousing use-cases dead simple. We can deploy more production use-cases with fewer people, delivering increased value to our customers. We're excited to see Impala graduate to a top-level project and look forward to contributing to its success."

"We use Apache Impala to boost performance of our SQL queries against our data lake," said Matteo Coloberti, Head of Analytics at Jobrapido. "Impala is an incredible service that gives us impressive performance on queries."

"We used to distribute Microsoft Excel reports to clients every one or two days but now they can search on their own by customer, sales deal, or even service type," said Andy Frey, CTO of Marketing Associates. "Apache Impala is used to query millions of rows to identify specific records that match the clients' criteria. We've even given clients a 'Query Hadoop' option that allows them to create simple SQL statements and query Hadoop directly via Impala. We're able to offer a faster, richer, and more accurate selection of services without the labor or latency concerns that we used to have."

"The Apache Impala community is growing, and we welcome new contributors to join in our efforts in our code, documentation, issue tracker, and discussion forums," added Apple.

Catch Apache Impala in action at Not Another Big Data Conference, taking place 12 December 2017 in Palo Alto, CA.

Availability and Oversight
Apache Impala software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Impala, visit http://impala.apache.org/ and https://twitter.com/ApacheImpala

About the Apache Incubator
The Apache Incubator is the entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects wishing to join the ASF enter through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 680 individual Members and 6,300 Committers across six continents successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cash Store, Cerner, Cloudera, Comcast, Facebook, Google, Hewlett Packard, Hortonworks, Huawei, IBM, Inspur, iSIGMA, ODPi, LeaseWeb, Microsoft, PhoenixNAP, Pivotal, Private Internet Access, Red Hat, Serenata Flowers, Target, WANdisco, and Yahoo. For more information, visit http://apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "Impala", "Apache Impala", "Hadoop", "Apache Hadoop", "Hive", "Apache Hive", "Kudu", "Apache Kudu", "Pig", "Apache Pig", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Monday September 25, 2017

The Apache Software Foundation Announces Apache® RocketMQ™ as a Top-Level Project

Open Source distributed messaging and streaming Big Data platform in use at Alibaba Group, Didi Chuxing, S.F. Express, WeBank, Peking University, and the Chinese Academy of Sciences, among others.

Forest Hill, MD –25 September 2017– The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Apache® RocketMQ™ has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles.

Apache RocketMQ is an Open Source distributed messaging and streaming Big Data platform with low latency, high performance and reliability, trillion-level capacity and flexible scalability.

"I am very excited to see Apache RocketMQ as a Top-Level Project and I would like to thank our mentors for all their help, the Apache Incubator Project Management Committee for its advice and guidance, everyone in the RocketMQ community, and Alibaba for publishing the research upon which RocketMQ is based," said Xiaorui Wang, Vice President of Apache RocketMQ. "During the incubation process, the RocketMQ community worked very hard to develop high-quality distributed software for messaging and streaming, in an open and inclusive manner in accordance with the Apache Way."

RocketMQ originated at Alibaba in 2012, and, after handling 1.2 trillion concurrent online message transmissions in the Alibaba Nov. 11th Global Shopping Festival, was donated to the Apache Incubator in November 2016. Apache RocketMQ v4.0.0 was released in February 2017.

As a distributed messaging engine, RocketMQ's features include:
  • Low latency: more than 99.6% of responses complete within 1 millisecond under high load;
  • Finance-oriented: high availability with tracking and auditing features;
  • Industry-sustainable: trillion-level message capacity guaranteed;
  • Vendor-neutral: supports multiple messaging protocols such as JMS and OpenMessaging;
  • Big Data friendly: batch transfer with versatile integration for high-volume throughput; and
  • Massive accumulation: given sufficient disk space, accumulates messages without performance loss.
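
To make the latency figure in the list above concrete, the following sketch computes the share of responses that complete within a threshold from a list of measured latencies. It is illustrative only: the function name share_within and the sample data are hypothetical and do not come from RocketMQ or its benchmarks.

```python
from typing import Sequence


def share_within(latencies_ms: Sequence[float], threshold_ms: float = 1.0) -> float:
    """Return the fraction of samples at or below threshold_ms."""
    if not latencies_ms:
        raise ValueError("at least one latency sample is required")
    within = sum(1 for value in latencies_ms if value <= threshold_ms)
    return within / len(latencies_ms)


# Hypothetical sample: 997 fast responses and 3 slow ones.
samples = [0.2] * 997 + [3.0] * 3
share = share_within(samples)
print(f"{share:.1%} of responses completed within 1 ms")
```

A claim such as "more than 99.6% within 1 millisecond" corresponds to share_within(samples) > 0.996 for latencies measured under load.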

"RocketMQ was conceived from the outset as an open-source distributed messaging and streaming platform with low latency, high performance and reliability, trillion-level capacity and flexible scalability," said Von Gosling, original co-creator of RocketMQ and Chief Architect of Aliware MQ at Alibaba Group. "It has been great to witness the growth of the RocketMQ community and codebase as an ASF incubating project, and I look forward to this continuing as a Top-Level Project. Today, more than 100 companies are using Apache RocketMQ, with more feedback coming from the community. According to our data, more than 80% of the project's contributions are from outside the donator Alibaba Group."

In addition to Alibaba Group, Apache RocketMQ is in use at hundreds of companies and research/educational institutions that include Didi Chuxing, S.F. Express, WeBank, Peking University, and the Chinese Academy of Sciences, among others.

"Graduation from the Incubator marks an important milestone for the RocketMQ project," said Bruce Snyder, Apache RocketMQ Incubator Mentor and Director of Software Development at SAP Hybris. "This is recognition of the focus and hard work of the project members to learn The Apache Way and drive community around RocketMQ. I am honored to have helped guide the project to a successful graduation."

"At Didi, we have used Apache RocketMQ as storage engine to build MessageQueue service. Based on high availability and high performance of RocketMQ we provide high-quality service," said Neil Qi, Architect at Didi Chuxing. "I believe RocketMQ will become the best MessageQueue project in future."

"New participants are more than welcome to join the project, To serve the community better, we created and maintained two repositories, one as our kernel version and the other one is for community contributions. The community contributed some integrated projects with some other Apache TLPs like Apache Storm, Apache Ignite, Apache Spark and Apache Flume," said Xinyu "yukon" Zhou, member of the Apache RocketMQ Project Management Committee. "We enthusiastically look forward to working together with all contributors to Apache RocketMQ in order to advance the state-of-the-art distributed messaging engine."

Availability and Oversight
Apache RocketMQ software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache RocketMQ, visit http://rocketmq.apache.org/ and https://twitter.com/ApacheRocketMQ

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 650 individual Members and 6,200 Committers across six continents successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cash Store, Cerner, Cloudera, Comcast, Facebook, Google, Hortonworks, HP, Huawei, IBM, Inspur, iSigma, LeaseWeb, Microsoft, ODPi, PhoenixNAP, Pivotal, Private Internet Access, Red Hat, Serenata Flowers, Target, WANdisco, and Yahoo. For more information, visit http://apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "RocketMQ", "Apache RocketMQ", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Tuesday August 22, 2017

The Apache Software Foundation Announces Apache® MADlib™ as a Top-Level Project

Big Data machine-learning library used for scalable in-database analytics

Forest Hill, MD –22 August 2017– The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Apache® MADlib™ has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles.

Apache MADlib is a comprehensive library for scalable in-database analytics. It provides parallel implementations of machine learning, graph, mathematical and statistical methods for structured and unstructured data.

"Graduating as a Top-Level Project is a very important milestone for Apache MADlib," said Aaron Feng, Vice President of Apache MADlib. "During the incubation process, the MADlib community worked very hard to develop high quality software for in-database analytics, in an open and inclusive manner in accordance with the Apache Way."

MADlib grew out of discussions between database engine developers, data scientists, IT architects and academics interested in new approaches to scalable, sophisticated in-database analytics. These discussions were written up in a paper from VLDB 2009 [1] that coined the term "MAD Skills" for data analysis. The MADlib software project began the following year as a collaboration between researchers at UC Berkeley and engineers and computer scientists at Pivotal (formerly EMC/Greenplum). In September 2015, MADlib joined the ASF community as an incubating project.

MADlib is deployed on a wide variety of industry and academic projects across many different verticals, including automotive, consumer, finance, government, healthcare, and telecommunications.

"MADlib was conceived from the outset as an open-source meeting ground for software developers, computing researchers and data scientists to collaborate on scalable, in-database machine learning and statistics," said Joe Hellerstein, Professor of Computer Science at UC Berkeley, Co-Founder and Chief Strategy Officer at Trifacta, and one of the original authors of MADlib. "It has been great to witness the growth of the MADlib community and codebase as an ASF incubating project, and I look forward to this continuing as a Top-Level Project."

"At Pivotal, we have seen our customers successfully deploy MADlib on large scale data science projects across a wide variety of industry verticals," said Elisabeth Hendrickson, Vice President, R&D for Data at Pivotal. "As MADlib graduates to a Top-Level Project at the ASF, we anticipate increased adoption in the enterprise given the mature level of the codebase and the active developer community."

"The potential of the Apache MADlib project is unbounded," said Jim Jagielski, Vice Chairman of the ASF. "The ability to perform in-depth and detailed analytics, on both structured and unstructured data, using SQL enables MADlib to be applicable in scenarios where others simply can't compete. As not only interest in, but real-world usage of, machine learning becomes common place, MADlib joins the growing roster of Apache projects that define innovation."

"Apache MADlib is a great example of the diversity at Apache," said Ted Dunning, Apache MADlib Incubator Mentor and Member of the ASF Board of Directors. "MADlib does state-of-the-art machine learning, but does as an inherent part of a database. This is a radical approach that can provide important design flexibility. I am excited to see MADlib become a fully fledged project at Apache."

"New participants are more than welcome to join the project," added Feng. "We enthusiastically look forward to working together with all contributors to Apache MADlib in order to advance the state-of-the-art of scale-out data science tools."

[1] http://dl.acm.org/citation.cfm?id=1687576

Availability and Oversight
Apache MADlib software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache MADlib, visit http://madlib.apache.org/ and https://twitter.com/ApacheMADlib

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 650 individual Members and 6,200 Committers across six continents successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cash Store, Cerner, Cloudera, Comcast, Facebook, Google, Hortonworks, HP, Huawei, IBM, Inspur, iSigma, LeaseWeb, Microsoft, ODPi, PhoenixNAP, Pivotal, Private Internet Access, Red Hat, Serenata Flowers, Target, WANdisco, and Yahoo. For more information, visit http://apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "MADlib", "Apache MADlib", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Wednesday May 31, 2017

The Apache Software Foundation Announces Apache® SystemML™ as a Top-Level Project

Open Source Big Data machine learning platform in use at Cadent Technology and IBM Watson Health, among other organizations.

Forest Hill, MD –31 May 2017– The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Apache® SystemML™ has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles.

Apache SystemML is a machine learning platform optimal for Big Data that provides declarative, large-scale machine learning and deep learning. SystemML can be run on top of Apache Spark, where it automatically scales data, line by line, to determine whether code should be run on the driver or an Apache Spark cluster.

"Today, the machine learning revolution is leading to thousands of life-altering innovations such as self-driving cars and computers that detect cancer," said Deron Eriksson, Vice President of Apache SystemML. "Apache SystemML enables and simplifies this process by executing optimized high-level algorithms on Big Data using proven technologies such as Apache Spark and Apache Hadoop MapReduce."

The core of Apache SystemML has been created from the ground up with the following design principles in mind: 

  • Performance and Scalability, as SystemML scales up on single nodes, and scales out on large clusters using Apache Spark or Apache Hadoop;
  • "Designed for data scientists", enabling data scientists to develop algorithms in a system with a strong foundation in linear algebra and statistical functions; and
  • Cost-based optimization for scalable execution plans, which significantly shortens and simplifies the development and deployment cycle of algorithms for varying data characteristics and system configurations.

Using Apache SystemML, data scientists are able to implement algorithms using high-level language concepts without knowledge of distributed programming. Depending on data characteristics such as data size/shape and data sparsity (dense/sparse), and cluster characteristics such as cluster size and memory configurations, SystemML's cost-based optimizing compiler automatically generates hybrid runtime execution plans that are composed of single-node and distributed operations on Apache Spark or Apache Hadoop clusters for best performance.
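
The plan selection described above can be pictured with a toy cost model. This is a minimal sketch under stated assumptions, not SystemML's optimizer or API: the names MatrixStats, estimate_bytes and choose_backend, the bytes-per-cell figures, and the 70% memory budget are all hypothetical, chosen only to show how data size, sparsity and available memory can together decide between single-node and distributed execution.

```python
from dataclasses import dataclass


@dataclass
class MatrixStats:
    """Hypothetical summary of an operand's data characteristics."""
    rows: int
    cols: int
    sparsity: float  # fraction of non-zero cells, between 0.0 and 1.0


def estimate_bytes(m: MatrixStats) -> int:
    """Rough in-memory size, taking the cheaper of a dense or sparse layout.

    Assumes 8 bytes per cell when dense and about 16 bytes per non-zero
    cell (value plus index) when sparse.
    """
    cells = m.rows * m.cols
    dense_bytes = cells * 8
    sparse_bytes = int(cells * m.sparsity * 16)
    return min(dense_bytes, sparse_bytes)


def choose_backend(m: MatrixStats, driver_memory_bytes: int) -> str:
    """Pick single-node execution when the operand fits in the memory budget."""
    budget = int(driver_memory_bytes * 0.7)  # hypothetical safety margin
    if estimate_bytes(m) <= budget:
        return "single-node"
    return "distributed"


one_gib = 1024 ** 3
print(choose_backend(MatrixStats(1_000, 1_000, 1.0), one_gib))
print(choose_backend(MatrixStats(1_000_000, 1_000, 1.0), one_gib))
print(choose_backend(MatrixStats(1_000_000, 1_000, 0.001), one_gib))
```

In this toy model the same one-million-row matrix is sent to the cluster when dense but stays on a single node when it is very sparse, which is the kind of data-dependent decision the paragraph above attributes to a cost-based compiler.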

"SystemML allows Cadent to implement advanced numerical programming methods in Apache Spark, empowering us to leverage specialized algorithms in our predictive analysis software," said Michael Zargham, Chief Scientist at Cadent Technology.

"SystemML is like SQL for Machine Learning, it enables Data Scientists to concentrate on the problem at hand, working in a high-level script language like R, and all the optimizations and rewrites are handled by the very powerful SystemML optimizer that considers data and available resources to produce the best execution plan for the application," said Luciano Resende, Architect at the IBM Spark Technology Center and Apache SystemML Incubator Mentor.

"IBM Watson Health VBC is using Apache SystemML on Apache Spark to build risk models on a very large EHR data set to predict emergency department visits," said Steve Beier, Vice President of Value Based Care Platform and Analytics at IBM Watson Health. "The models identify high-risk patients so that they can be targeted with preemptive strategies, thus potentially reducing care costs while at the same time leading to optimal outcomes for patients."

SystemML originated at IBM Research - Almaden in 2010, and was submitted to the Apache Incubator in November 2015. SystemML initiated compressed linear algebra research, a differentiating feature of the project, which received the VLDB 2016 Best Paper Award.

"The Apache Incubator is all about open collaboration and communication and was invaluable for everyone involved in SystemML," added Eriksson. "The Apache SystemML community sincerely encourages everyone interested in machine learning and deep learning to help build our community around this revolutionary technology."

Catch Apache SystemML in action at the Big Data Developers Silicon Valley MeetUp on 8 June 2017 in San Francisco, CA.

Availability and Oversight
Apache SystemML software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache SystemML, visit http://systemml.apache.org/ and https://twitter.com/ApacheSystemML

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 620 individual Members and 6,000 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cash Store, Cerner, Cloudera, Comcast, Confluent, Facebook, Google, Hortonworks, HP, Huawei, IBM, InMotion Hosting, iSigma, LeaseWeb, Microsoft, ODPi, PhoenixNAP, Pivotal, Private Internet Access, Produban, Red Hat, Serenata Flowers, Target, WANdisco, and Yahoo. For more information, visit https://www.apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "SystemML", "Apache SystemML", "Hadoop", "Apache Hadoop", "Spark", "Apache Spark", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Tuesday January 10, 2017

The Apache Software Foundation Announces Apache® Eagle™ as a Top-Level Project

Intelligent Big Data monitoring and alerting solution in use at high volume, high demand Websites, platforms, and organizations such as eBay, PayPal, Dataguise, and YHD.com, among others.

Forest Hill, MD –10 January 2017– The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Apache® Eagle™ has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles.

Apache Eagle is an Open Source monitoring and alerting solution for instantly identifying security and performance issues on Big Data platforms such as Apache Hadoop, Apache Spark, and more.

"We are proud to complete the incubation process and graduate as an Apache Top-Level Project," said Edward Zhang, Vice President of Apache Eagle. "The community is actively improving product coverage for analyzing various performance and security issues in large Hadoop clusters."

Eagle was first developed at eBay to solve the monitoring problem for a large-scale Hadoop cluster. The eBay team soon realized it would be useful to the whole community, and submitted the project to the Apache Incubator in October 2015. Since then, the project has gained a lot of traction among developers and organizations for its broad usage scenarios, such as system/service monitoring, application performance monitoring, and security breach detection.

Apache Eagle features include:
  • Highly extensible - Apache Eagle builds its core framework around the application concept; the application itself includes the logic for monitoring source data collection, pre-processing and normalization. Developers can easily develop out-of-the-box monitoring applications using Eagle's application framework, and deploy them into Eagle.
  • Scalable - the project's fundamental runtime is based on proven Big Data technologies, and applies a scalable core that adapts to the throughput of the data stream as well as the number of monitored applications.
  • Real-time - provides a state-of-the-art alert engine to identify security breaches and performance issues.
  • Dynamic - users can freely enable or disable a monitoring application and dynamically change its alert policies without any impact on the underlying runtime.

"It is exciting to see increasing deployments of Apache Eagle, along with great use cases and contributions back to the project," added Zhang.

"Apache Eagle is a highly scalable and extensible technology platform to support the ever growing needs of intelligent monitoring and alerting in a massively distributed computing environment," said Debashis Saha, CTO and EVP at Jiff Inc. "As the founding executive sponsor of this project at eBay, I am proud to see the community continue to expand the capabilities by supporting complex and diverse use cases for monitoring in security, infrastructure, networking and distributed services in Apache Eagle. Congratulations to the team and the community in graduating to a Apache top level project."

"As a leader in data-centric security with a focus on cloud and Big Data technologies, Dataguise is proud to be part of the Eagle committers group. DgSecure Monitor, our sensitivity-aware monitoring product, uses Apache Eagle as the core engine," said Subra Ramesh, VP of Products and Engineering at Dataguise Inc. "Apache Eagle's flexible architecture, proven scalability, and  cutting-edge design, have enabled DgSecure Monitor to be a highly responsive and scalable solution for both on-premises and cloud deployments. We look forward to continued involvement with Eagle as it has now become a top-level Apache project."

"We have been using Apache Eagle for about a year, and are very happy to see it graduate to a Top-Level Project. Apache Eagle and its low latency real-time alert engine can help us easily identify security and performance issues instantly on Hadoop platform," said Anson Zhong, Senior Vice President of Tech Department at YHD.com. "In addition, Eagle's architecture is highly extensible. We are looking forward to using it in real time risk management system."

"Apache Eagle is a great monitoring and alerting solution designed for large-scale distributed environment," said Chad Chun, Director of Analytics Data Infrastructure at eBay. "It was originally intended for security monitoring and quickly become a generic solution for allowing domain experts to create their own monitoring applications on top of Eagle. This is a wonderful design for easily leveraging the power of community to create and share applications. Looking forward to the tremendous adoption in the industry."

"The Apache Eagle community has done a tremendous job throughout the incubation process, and I'm thrilled to see it graduate to a Top-Level Project," said P. Taylor Goetz, ASF Member and Apache Eagle Project Mangement Committee member. "Eagle fills a very important role in providing top-notch security and performance monitoring and alerting for Big Data deployments. The Eagle project has built a robust, sustainable community and demonstrated a firm understanding of the Apache Way. I look forward to further innovation as the Eagle community marks this important milestone."

"It is great to see Apache Eagle graduate to a Top Level Project within a year of time," said Seshu Adunuthula, Senior Director of Data Platforms at eBay. "It is a great product with unique position to fill the gap of monitoring and alerting large-scale distributed computing environment which is well architected to allow communities to easily implement monitoring and alerting applications on different technical domains such as networking and database clusters.  I would love to see the community to grow fast in the next coming years!"

The project welcomes contributions and community participation through mailing lists, Slack channel, face-to-face Meetups, and other events.

Availability and Oversight
Apache Eagle software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For project updates, downloads, documentation, and ways to become involved with Apache Eagle, visit http://eagle.apache.org and @TheApacheEagle.

About the Apache Incubator
The Apache Incubator is the entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects wishing to join the ASF enter through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 620 individual Members and 5,900 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cash Store, Cerner, Cloudera, Comcast, Confluent, Facebook, Google, Hortonworks, HP, Huawei, IBM, InMotion Hosting, iSigma, LeaseWeb, Microsoft, OPDi, PhoenixNAP, Pivotal, Private Internet Access, Produban, Red Hat, Serenata Flowers, Target, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "Eagle", "Apache Eagle", "Apache Hadoop", "Hadoop", "Apache Spark", "Spark", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #


Monday January 09, 2017

Success at Apache: "All Carrot and No Stick"

By Danny Angus

When the ASF launched their "Success at Apache" series I offered to share my own experiences. If you read on, remember that this is my personal experience and that others may disagree with me, but as you'll see, that's really part of the fun. 

For a bit of background, I’m currently the Project Management Committee (PMC) Chair of Apache Labs and in my day job I’m a "Divisional CTO" for a FTSE250 technology company. I first came to the ASF around 2000 when I was part of a startup - I was a CTO then too, it was the dot com boom, and it was just me and a couple of guys. We were considering a partnership with some researchers who wanted to commercialise their work, and were looking for a bit of software that we could use as the foundation for a product because a) we couldn’t afford to write it or buy it, and b) we didn’t have the knowledge anyway. What I found was Apache James http://james.apache.org , so I downloaded it, got it up and running, and did some prototyping, but we quickly realised that it needed work if we were going to be able to use it in production. I dug into it a little, subscribed to the mailing lists, asked questions and figured out what needed to be done to fix and extend what was already there, then started to modify it locally. Meantime I found myself answering other users’ questions on the user list, and one day noticed that I was actually answering more questions than I was asking. Shortly after that, I noticed that I was answering more questions than anyone else. Then I started submitting patches to the developer list (this was in the days of CVS: long before git!), which were reviewed and committed for me by the committers … but eventually they got bored with that and decided to extend commit privileges to me so I could do it all myself.


My experience illustrates an important characteristic of Apache projects: the fact that you can just turn up and get involved. Another very important characteristic is that we are a meritocracy: demonstrating your capability is all you need to do in order to gain more responsibility; demonstrating your willingness and trustworthiness should be enough to get you the job. "Karma" is a word that is used to mean "access permission" in many Open Source projects, and we used to say that if you knew how to ask for karma properly, that was itself a sign that you could be trusted with it. Of course we were a much different organisation in those days, but the principles of a community built on merit and trust are still core to our identity. It's no coincidence that organisations cannot be part of our community: only individuals. Organisations are an important part of the world in which we exist, but we don't exist for their benefit. We only exist at all because as individuals we each bother to turn up and do stuff. From the guy who one time downloads and installs the Apache HTTP Server to Sam Ruby, our current (and can I just say excellent) President, everyone is contributing in their own way to the life of Apache and achieving benefits suited to their own, personal, motivations. So it was OK for me to focus on my own and my employer's priorities, which meant that I could learn from my new friends, develop the software we needed at work and become part of this amazing community all at the same time.


My experience of Apache is that it is what I would call "all carrot and no stick". I think that is the most healthy model of Open Source, as it is predicated on the fact that every participant will benefit from their participation without the need to contribute more than they are prepared to do. For me, focusing my contribution on the things I knew about was not only the most efficient use of my time, in terms of meeting our company's product goals, but it also allowed me to learn from others who had, and continue to have, way more knowledge and experience than I, and to benefit from their work. Mixing with these amazing people, many of whom are now real friends of mine, has taught me more than I would ever have learned any other way.


At this point in my involvement Apache went through a bit of what has diplomatically been described as "navel gazing", and settled on the idea that the organisational structure should be very very flat, and there should be no limit to our growth. As long as our standards were met by projects and people, we would welcome them both into our community. Those standards are partly about merit; partly about legal protection (one of the key roles Apache plays is to provide a degree of protection to projects, and the people contributing to them, from the threat of bullies, trolls, and gorillas with expensive lawyers); and partly about ensuring that the behaviours and practices that define our identity and have contributed to the survival and the success of our organisation are continued by new generations of people in new projects using and creating technologies that we could hardly have dreamed about 16 years ago.


Before long the dust settled and I found myself voted to chair the Apache James Project http://apache.org/foundation/governance/pmcs.html , which was a whole new dimension of interesting. Chairing a project using only positive motivation teaches you a lot about people, including yourself, and I have a few observations about successful collaboration that I have found to be helpful both at work, where I strive to implement bottom-up decision making, and at the ASF where I want to make a positive contribution and see our communities flourish:


  • Free your mind. The collective sense of direction may not be what you expect. There have been times when I have been very sceptical about the reality of great sounding ideas, but I have also learned that it’s OK to go down the wrong road because most of the time it makes little difference in the end, usually you learn a lot regardless, and if people are really behind it you stand a much better chance of success than if the really good idea has all the fun of a death march. One phrase which is often used to summarise the spirit of Apache is “Community over code”: put the community first, and the code will follow.

  • Listen, and be supportive. There are a lot of different people involved in our projects with a lot of very different motivations. They are mostly all valid, and mostly all equally important if that even has an absolute scale. There are students studying our code, asking questions, using our software and maybe fixing defects so that they can learn; there are employees of corporations who are being paid to protect their investment, to implement the product roadmap and maintain some predictable velocity; there are researchers who are pushing the boundaries of their chosen topic; there are people whose livelihood and success depends on a project; and those who are involved because it is a release from the pressure of things with names like "impact", "benefits", "deadlines" and "goals". Moderate or steer the discussion to ensure that all sides are heard. A meritocracy needs to listen to everyone, not just the most vocal or assertive, and when I say listen that doesn’t mean formulating your own response while someone else is talking. Support people who you agree with, and help to realise other people’s ideas. Collaboration is only achieved by being truly committed to each other’s success, not just your own.

  • "A's hire A's, B's hire C's". Find, support, and mentor the next generation. When your success depends upon the community it makes sense for you to put some effort into creating the best community you can.

  • Use Positive Language. When I was a kid being mean to my sister, adults used to say, "If you don't have anything nice to say, don't say anything at all". That's great advice if you’re involved in any collaborative venture, but doubly so when it is something like an Open Source project where you are usually communicating using written English, with people you don't know well, who might not have the same language skills as you do, who live in a different time zone and sometimes have a very different cultural background than you. On top of all that you're often debating the details of highly abstract technical concepts. The communication barrier itself can cause a kind of baseline of frustration, so go easy on the negativity. One thing I like to do when I strongly disagree with someone is to write how I feel, then try to reword it using only positive language. It might sound like touchy-feely hippy nonsense to you, but you will be surprised how effective it is to change "I think you’re wrong and here’s why..." into "You have clearly thought a lot about this, I wonder if you have considered...". Alienating people is not the way to get your point across.

  • Learn to be a good loser. You don't own your projects, not here, and you're not the smartest person here either (OK so that’s not going to be 100% true, but there are 5,938 Committers today which makes it about 99.98%). Recognising that and learning to embrace the collective view is hard for some people, but being able to step outside your subjective point of view and make a success of something you didn't believe in is a lesson in leadership that is definitely worth learning, because if not, your growth will be limited by the ideas that come from your own head, not accelerated by other people.

  • We are making it up as we go along http://apache.org/foundation/how-it-works.html . Yes, it sometimes seems from the outside like we have it all sorted and nailed down, and that we want to lawyer up and suck the life out of every fun thing (I mean we have a major software licence with our name on it, how grown up is that for goodness sakes?) But the truth is that Apache, The Apache Software Foundation, is and probably always will be a work in progress. Hopefully it will be at-least-good-enough to survive the next unexpected storm, and there have been several of those already, but the only way we ever find that out is when it hits us. Over a relatively long period we have figured out, adopted, borrowed, adapted, had donations of, and thunk out with nothing but our own brains, a whole load of ideas about effective Open Source collaboration, governance, legal shenanigans, marketing, community building, and so on. Things that work well, some that mostly work, and some that are sometimes rubbish, but better than nothing. We write these things down and propagate this good practice amongst projects because it is the bedrock on which our foundation rests, but that doesn’t mean that it can’t change. We correct, adapt and evolve our best practices all the time; this is how we adapt, this is how we have survived and remained relevant in a field that seems to change almost beyond recognition every four or five years. And, being a meritocracy, if you don’t agree with the way things are, if you think it is out of date or ineffective or pointless, don’t complain, stay and fix it. We have another saying which is that "you can scratch your own itch" - don’t be passive, if you care about it, do it.

    The important point about Apache is not that we have rules and committees but that we have these things because they have been shown to help us do the right thing, to help us to live by our principles and to provide a home for Open Source projects that will equip them to survive amongst the commercial sharks in the ocean of the software industry.

  • Finally: Define your own achievements. Whether you are doing it because you need some software, or because, like me, you just found it and it wasn't quite ready, whether you want to make friends, or to learn something new, whether it is because you are being paid to promote your employer's best interest, because you want to explore new ideas, or because you always wanted to write a book, Success at Apache is yours to define. Create your own measure of success and let us achieve it together.


# # #


"Success at Apache" is a new monthly blog series that focuses on the processes behind why The Apache Software Foundation (ASF) "just works". First article: Project Independence https://s.apache.org/CE0V

Monday December 05, 2016

Success at Apache: Project Independence

By Mark Thomas

I've been involved in The Apache Software Foundation (ASF) since 2003. I was using Apache Tomcat at work and I hit a problem that needed a new feature to be implemented. There was already an enhancement request in Bugzilla so I submitted a patch. After some re-work by the project committers, the patch was applied and the feature was available in the next release. I enjoy problem solving, so I started to take a look at the other open Tomcat bug reports and my involvement grew from there to include Apache Commons, the Infrastructure Team, the Security Team and, most recently, the Board of Directors to which I was elected in March 2016.

Apache Tomcat has always been at the heart of my involvement and is where I spend most of my time. Tomcat started with a donation to the ASF by Sun in 1999 and, some seven major versions later, the project continues to be very successful. A significant part of that success is due to the involvement of a wide range of individuals from different companies. The reason those companies are happy co-operating on Tomcat is the importance the ASF places on project independence.

There are many aspects to project independence but, for me, the most important is that committers and Project Management Committee (PMC) members contribute to the project as individuals and do so with the intention of doing what is best for the community as a whole. Some committers contribute in their free time – I did for the first five years or so with Tomcat – and some are allowed / directed to spend time contributing to Apache projects by their employer. However, those committers contributing on their employer's time still need to act in the best interests of the community rather than the best interest of their employer.

To give a specific example, my employer has a product that is built around Apache Tomcat. The sales folks at my employer asked if I could add a feature to this product. The problem was that this feature required access to low-level Tomcat internals in order to implement it effectively. For this to be possible, I would have needed to make some ugly API changes to Tomcat to provide the integration points required. Rather than try and push those changes through, I persuaded my employer that it would be better to donate the entire feature to the Apache Tomcat project.

This feature also demonstrates other important elements of a successful ASF project: the ability to make decisions in public and always aiming to achieve community consensus with those decisions. As the development of this new feature progressed, the design evolved as the community reviewed the commits and suggested improvements. This isn't always the quickest way of working but the quality of the end result (both technically and, more importantly, in terms of community health) more than makes up for that.

The perception of project independence is as important as projects actually being independent. It is a key factor in many projects choosing the ASF as their home so projects need to ensure that the perception agrees with reality.

Things can and do go wrong. With 350 projects it is pretty much a given that there will be a handful of ongoing issues at any given time. For example, there might be an attempt to push a project in a particular direction or to suggest that some external entity controls / leads / manages the project. Typically these are self-corrected by the PMC. Sometimes the PMC needs help to resolve the issue e.g. from V.P. Brand Management or possibly the ASF Board.

Being a board member is often viewed as more significant than it is. I have no more status in Apache Tomcat, Apache Commons or any other project as a board member than I did before my election to the board. I can still have bad ideas and my fellow community members still point it out when it happens. I don't get to always have my way just because I am a board member. It is the board as a whole, rather than the individual board members, whose voice carries significant weight. It is fairly rare for any board member to speak on behalf of the board. To give that some context, I've probably done it no more than once a month since joining the board. It is sufficiently rare that board members always include an explicit "on behalf of the board" when speaking for the board rather than as an individual. Sometimes this point isn't appreciated and the views of an individual board member are incorrectly taken to be the views of the board.

The ASF board is also very different to a corporate board. The board manages the Foundation but it is the PMC that manages the project and sets the direction. The board has no role in the technical direction of a project. The board has responsibility for corporate governance, finance, legal etc., but its primary role is monitoring, mentoring and coaching our project communities to help keep them healthy. As part of this, the board reviews all projects on a regular basis. Newly graduated projects are reviewed monthly for typically 3 months before moving to quarterly reviews. The project V.P. (PMC Chair) is an important part of this. They are the eyes and ears of the board. While the board will look for warning signs as part of its regular review, the V.P. has much more in depth knowledge of the project and can flag specific issues early. Where issues are identified, the aim is to get the PMC to self-correct. The board will provide mentoring / coaching / guidance as necessary but it will be the PMC members who do the work to correct the issue.

As an example of the board working with a PMC, earlier this year the V.P. for a particular project became unavailable. The board became concerned because the regular reports were not being produced for the project. In this instance, no-one else on the PMC had experience of being a project V.P. so the board worked with the PMC to identify a new V.P. and to then mentor the new V.P. as they found their way in their new role.

For the last 17 years, the ASF has provided a home for a large and diverse set of open source projects. Key to this success has been the importance the ASF places on project independence as part of the Apache Way. By continuing to adhere to the principles of the Apache Way, I am confident that the ASF will continue to be successful for another 17 years and a long way beyond.
