Entries tagged [processes]

Tuesday March 29, 2022

Success at Apache: My experience with the Apache Way —a perfect society?

by Etienne Chauchot

I have been working in software engineering for more than 15 years. I've always contributed to Open Source software as a user or a developer. But I've been contributing to Apache Software Foundation (ASF) projects such as Apache Flink, Apache Beam or Apache Spark for nearly 6 years. It is long enough for me to say that I find the Apache Way is almost the best way to collaborate on software engineering.

I will not describe the Apache way here as there is a lot of good information about that already. I would rather link to the official Apache documentation. I humbly suggest that you read what it is if you don't know it already. 

My point here is to talk about the Apache Way in practice as I’ve experienced it. Of course, every Apache community is different, but what I wanted to emphasize is that applying the Apache Way by the book could lead to what I'd call a "perfect society" even if this word seems a bit naive and over optimistic, or even utopian.

A perfect society

Actually, working with the Apache way was a revelation to me!

ASF projects are governed by merit: what you do inside the community is noted, you get credit and it can lead to you obtaining more rights (direct access to the project repositories, election of committers etc.). Merit also drives decisions, discussing solutions and building consensus or voting for the best one helps lead to the best possible state of the project in the end. The best idea always wins in the long term.

The software is not driven by companies: no vendor concerns should take precedence over community. Consider how the ASF creates new top-level projects (TLP): a project starts in the Apache Incubator and is mentored by people who have already participated in successful Apache projects. When the mentors agree a project is ready, healthy and following The Apache Way, the ASF Board can approve its graduation from the Incubator to become a self-governing TLP. So the project is managed by the community itself and not by a single company and its private financial considerations. This helps drive the best decisions for the software itself and ensures long term maintenance of the software.

It is inclusive: the key aspect is that every voice matters, and that everyone is considered equal no matter their personal background, education, ethnicity or nationality, every contribution is good to take. Community members recognize that people skills may be different and complementary to theirs. So contributions might come from anyone, from anywhere and in any form (blog post, documentation, talk, code, website...)

ASF communities are welcoming: they are in constant search for new talents to join their forces. Being welcoming is very important to build and grow a community. The Open source community is also a great place for people to grow. The way people collaborate is generally by mentoring. Experienced contributors help newcomers or experts share their thoughts with others. It is really also a good way for mentors to share their passion and inspire mentees. Mentoring is in the DNA of the ASF starting with the Incubator when the podling community profits from the experience and advice of a mentor to grow in the Apache Way and become a top level project

Communities are self-organized: there is no manager but only technical leaders and mentors. Each community has a PMC that guides its governance, but its responsibilities don’t include assigning work and expecting it to be done. People are self-motivated and I must say that it is the best form of motivation ever! I’ve found the decision-making simple and efficient: there is no solely decision, feedback is always very important. People are willing to share their thoughts and solve problems together.

Community members have a collaborative mindset: they are positive, act constructively and their comments are in the best interest of the project and the community. They are  willing to share their thoughts, review PRs, share advice, accept change requests or bug tickets. People are willing to accept criticism without being defensive. The master word is transparency. 

Last but not least, I’ve seen most people behave gently: the fact that every communication is public guides people to communicate in a positive way. Indeed one of the ASF guiding concepts is "what did not happen publicly didn’t happen" – often said as “what didn’t happen on the mailing list, didn’t happen” but of course this concept can be generalized to any communication tool we use. Examples of good communication I’ve seen in open source communities are: asking questions or suggesting rather than affirming or asking for thoughts rather than disagreeing bluntly. An open source contributor should try to put theirself in the other person's shoes, trying to not hurt their feelings and to not demotivate them.

Considering all of this, what I can tell is that it is the way we all would like people and society in general to behave, no?

Daily life

The funny thing is that it goes even further, after some years of applying this philosophy (I was told lately that it felt almost like a religion) at work on a full time basis, I started applying it to daily life outside of work. It started to become my standard way of behaving in society: 

Meritocracy becomes second nature, for example I reward my home builders with gifts and public credit because they did a good job, I reward my kids for good school work etc... 

I also started to give time to others and share knowledge, mentoring becomes  second nature as well. 

      Another big thing which is very visible is that I now always take good care to give positive communication, leading to positive and constructive thinking. Positivism also becomes a key aspect of my daily life.

On a professional basis, an important thing is that merit never expires. So, if you gain committership on a project, or become a PMC member or even an ASF member, it is for life. So your skills are recognized by your peers for your whole career. This is an incredible credit and a tremendous trust mark!

Can be a bit challenging

In order to avoid being seen as a total idealist, I need to temper a bit:

I remember when I first joined an open source community, I felt intimidated. Community members are generally very senior level and very highly skilled developers. But, remember what is written above: every contribution is good to take. And, with time and mentoring, anyone can earn a place inside the community.

The other thing I felt a bit difficult when I joined is to find where to start: some projects are old enough to have a large community so the amount of code is pretty high. But here again mentoring comes into play: mentors can give you pointers on hot topics, starter tickets or simply areas that need maintenance. And with time, you might be recognized as an expert in a given area and the exciting subjects will come to you. 

And if you feel like you want to join a smaller community try joining a project which is still in the incubation phase!


I hope you enjoyed these insights and I hope it gave you the motivation to join an open source community.

Etienne Chauchot has been working in software engineering for more than 15 years and is now specialized in Big Data. He is an Open Source fan, and contributes to Apache projects such as Apache Beam, Apache Flink or Apache Spark. He is also the author of the "Big data Chronicles" blog. He is an Apache Beam committer and PMC member and also an Apache Foundation member.

= = = "Success at Apache" is a blog series that focuses on the processes behind why the ASF "just works".

Thursday November 11, 2021

Sponsor Success at Apache: Exploration and Practice of the Apache Way in Tencent

by Mark Shan, Chairman of Tencent Open Source Alliance and Tencent Cloud Open Source Ecosystem General Manager

The Apache Software Foundation (ASF) manages more than 227 million lines of code, has 206 project management committees, leads more than 350 Apache projects and operates through a merit system, with more than 850 members, 8,100+ committers, and tens of thousands of contributors.

Previously the Apache Group, the ASF has grown to one of the largest open source foundations in the world today. It has built the well-known "Apache Way" through its leadership, sound community, and merit thinking, resulting in a set of schemes that promote the sustainable development of open source communities and guide the practice of open source projects.projects.

Since Tencent Open Source was created 11 years ago, a large number of Tencent engineers have formed a deep connection with the Apache community by participating and contributing to Apache projects. Furthermore, by learning from the Apache Way, Tencent is going through its open source journey.

At ApacheCon@Home 2021, I shared how Tencent thinks, explores, and practices open source according to the Apache Way. Below is a synopsis of this presentation:

The Apache Way's Importance in Community Building

The Apache Way is difficult to define. Although the Apache Way has evolved somewhat over the years, the original intention of "high transparency" has remained unchanged. In Mark Shan’s view, Tencent's learning experience from the Apache Way can be summarized into five main points:

  1. Everyone has the opportunity to participate and can become a contributor. Contributors can increase their impact and personal growth through their contributions to projects.
  2. The ASF has a structure that encourages contributions from everyone, regardless of employer. This means that, for example, committer and PMC roles are open to anyone who earns the title. Tencent encourages its engineers to participate in the Apache community actively.
  3. Understanding and practicing open communication is extremely important. Because open source is the collaboration of a global community, Tencent engineers are able to participate in the Apache open source project through asynchronous collaboration using the mailing list. Code and decision-related communication are open and transparent.
  4. Reaching consensus when making decisions is strongly encouraged. Consensus can maintain project momentum and productivity. But when a complete consensus is impossible, voting or other coordination is available to arrive at binding decisions.
  5. The most important point is that the Apache community's motto, "community over code", is often emphasized. Because a healthy community is more important than simply good code. A strong and healthy community can always correct code problems, while an unhealthy community may struggle to maintain the code repository in a sustainable manner. In addition, flexibility is also an integral part of ASF's sustainable open source success.

Tencent's Open Source Way —Inspired by the Apache Way Apache projects and their communities are unique and diverse. In the community-led development process, Apache members formed the Apache Way based on their experience. Many of Tencent's open source practices and results are executed following the model of the Apache Way. After many years of incorporating open source into its culture, Tencent has formed an approach, "open collaboration, open source for good," which reflects Tencent's value and vision, and is honed based on open source best practices. Practicing the Apache Way: Contributing and Donating to open source projects Tencent engineers have been contributing to and helping to lead many ASF projects, including: 1) Big Data Over the past four years, several Tencent engineers helped lead releases for Apache Hadoop 2.8.4 and 2.8.5, Apache Ozone 1.0.0 (Ozone was a sub-project of Apache Hadoop, and became a Top-Level Project in 2020), and Apache Spark 2.3.2. Tencent engineers contributed more than 20 features and optimizations to several versions of Hadoop, and are core contributors in multiple Apache computing and AI frameworks that include Flink, HBase, Hive, MXNet, and Parquet. 2) Middleware In 2019, Tencent donated TubeMQ, its self-developed trillion-level big data component, to the ASF through the Apache Incubator. In 2021 the project officially changed its name to Apache InLong, as part of its incubation process. Tencent Applications based on Apache Projects

In addition to self-developed tools, Tencent widely uses ASF open source projects in its various business systems, with particular focus in big data, API gateways, and observability. As one of the largest daily real-time computing companies, Tencent’s overall big data platform exceeds 5 million cores, and must support single-day access message volume exceeding 55 trillion, real-time computing volume exceeding 65 trillion, and daily analysis tasks reaching 15 million.

More than two dozen Apache projects are used in applications like WeChat, QQ, and Tencent Cloud. These big data projects are used for data transmission, storage, computing, and analysis demand scenarios, supported by other technical fields of service governance that include API gateways and observability.

For big data, Apache Ozone is one of the key projects that provides support for Tencent's business scenarios that require large amounts of data and traffic. As early adopters, Tencent's big data platform deployed an Ozone cluster with more than 1,000 nodes as the back-end storage for big data applications and also uses Ozone as the main storage solution for some private data warehouse projects. 

Today, Tencent is connecting more and more businesses to Ozone, including data warehouses, machine learning platforms, Kubernetes cluster mounting volume, and more. Ozone helps Tencent operate stably, steadily, and at scale: of more than a thousand units without manual operation and maintenance intervention. In the process of verification and improvement, Tencent has done a lot of optimization work to improve performance and stability.

In addition, Apache Pulsar integrates messaging, storage and functional computing, and it adopts an architecture that separates storage and computing. Pulsar has successfully supported a large number of data and traffic business scenarios within Tencent Cloud and has some practical experience in cloud native environments, such as solving rapid and dynamic expansion and contraction, improving the utilization of cluster resources, and cluster forms.

For API Gateways, Apache APISIX offers high performance, a friendly developer experience, and rich plug-in capabilities, which is why Tencent's internal business chose it. Using APISIX, Tencent internally shares its self-developed cloud system components to solve business pain points and provide efficient API gateway services; externally, Tencent engineers contribute to the Apache APISIX open source community, expanding its influence, and leads the development of the open source community.

For Observability, Apache SkyWalking's application in Tencent's internal observable platform provides great convenience for selecting client tracing reporting of the microservices system, as well as the mechanism of computing storage separation and multi-layer query to provide very excellent performance output.

Building Tencent Open Source with 3 Major Projects

In addition to sponsoring the ASF at the Platinum level, Tencent is currently active in more than 10 foundations worldwide as a top member, including the Linux Foundation and the CNCF Foundation. Tencent is actively involved with open source communities such as Kubernetes, Spring Cloud, MariaDB, and it is also promoting the implementation of open source projects, products and solutions.

Foundations provide intellectual property management framework, code repositories, issue tracking, technical guidance, project governance, financial and public relations, and other services. By participating in these foundations, Tencent has learned many more mature open source organizational governance models and used them to guide the process to open source its internal projects to the world.

Tencent has open sourced more than 130 independent cloud native, big data, artificial intelligence, database, and other projects to the world. These projects obtained more than 370,000 GitHub Stars, and have exceeded 2,000 contributors.

By open sourcing projects across internal departments and to the world, actively collaborating with developers and communities around the world, and training open source talents, Tencent's open source ecosystem continuously improves and grows. 

Tencent Cloud’s future will focus on further strengthening the construction of Tencent's open source ecosystem through three major projects:

1) WeOpen: Tencent Open Source community

Tencent Cloud aims to build WeOpen, a platform for open source communication, promotion, and project incubation.

WeOpen is committed to connecting enthusiasts, practitioners, and leaders with global open source foundations to initiate new projects, co-create communities, and hold activities that extend open source culture and inspire the global open source ecosystem to flourish. 

2) Establish an industry open source joint laboratory

The open source laboratory is the landing place for the actual projects. Tencent Cloud plans to successively establish industry joint open source laboratories with well-known Chinese universities and open source organizations to provide a platform for students, researchers, and developers in the enterprise to contribute code and scenarios for open source projects to realize in the industry.

Starting with the "Rhino Bird Open Source Talent Cultivation Plan" held by Tencent this year, open source courses and practice training programs for college students help popularize open source culture, encourage contributions, and further the open source talent ecosystem.

3) Release "Cloud Native Open Source White Paper"

At the Cloud Native Industry Conference last May, Tencent Cloud and China's Institute of Information and Communications Technology announced the official preparation of the Cloud Native Open Source White Paper, which is expected to be released by the end of the year.

Tencent welcomes open source practitioners and companies to join their efforts in the above projects.


For more than 22 years, the ASF has proven that the Apache Way is one of the best practices for building an open organization that balances organizational structure and flexibility. Tencent continues to expand its own open source concepts, methodology, and ecosystem building, with plans to participate in more universities, companies, and organizations, while embracing the Apache Way into the future.

Mark Shan has a long career and practical experience in cloud-native, microservices, big data, edge computing, and open-source ecosystem. As the chairperson of Tencent Open Source Alliance, he works full of passion to build the ecosystem for Tencent Open Source and makes great efforts to accelerate innovation in technology and product with the open-source way. At Tencent Cloud, Mark leads the open-source team and works with organizations and communities including Apache Software Foundation,Linux Foundation, Open Atom Foundation, CAICT, COPU and others to build open-source ecosystem. He is also the observer of Linux Foundation Board, chairperson of TARS Foundation, TOC member of Open Atom Foundation and Magnolia Open Source Community, TSC member of Akraino Edge Stack, fellow of China Cloud Native Industry Alliance, advisor of Open Source Community.

= = =

"Success at Apache" is a monthly blog series that focuses on the processes behind why the ASF "just works". "Sponsor Success at Apache" features insights and experiences by select ASF Sponsors https://apache.org/foundation/thanks.html

For more Success at Apache posts, visit https://blogs.apache.org/foundation/category/SuccessAtApache

Thursday September 16, 2021

Success at Apache: from Mentee to PMC

by Ephraim Anierobi

This post is about how I became a committer and a Project Management Committee (PMC) member of Apache Airflow, and provides guidance to those new to programming, are new to contributing to open-source projects, and want to become committers and PMC members in their respective Apache projects.

About a year and a half after changing my career from electrical engineering to software development, I became a committer and a Project Management Committee member of Apache Airflow. Becoming a committer and a PMC member is a reward and a kind of validation that you are on the right part of your journey.

On February 16, 2021, I accepted an invitation to become a committer in Apache Airflow. It came as a surprise, as I was not expecting it. Six months down the line, I received another surprise invitation to become a PMC member in Apache Airflow.

These are impressive feats for me because before contributing to Apache Airflow, I didn't have experience working with other programmers. I was making websites and taught a few friends of mine how to make their own. I didn't have a mentor, and no one has ever seen my code to advise whether to continue on my journey or drop the idea of becoming a programmer.

While I desired to work with experienced programmers to improve my skills, I feared people seeing my code would talk me down. I almost gave up on my journey only to come across an Outreachy post on Twitter looking for interns for open source projects. Outreachy is a tech diversity program that provides three months of paid, remote internships to people underrepresented in tech.

I was ready to change my career and was looking for mentorship, but couldn't find an internship that could help me get started in my journey. In Nigeria where I'm living, your location affects your chances of getting an entry-level job. I was not close to the major cities. 

So I applied for an internship through Outreachy. 

There are two application processes. The initial application involves explaining your background and why you should be accepted into the program. You must pass the initial application before you could proceed to the next. The second application process (called the contribution period) is where you choose an open source project that matches your skill sets and then contribute to it. You must have some minimum contributions before you could be accepted.

That was how I found Apache Airflow.

You could imagine the joy I had when I was accepted into the program.

Here are things I did which I believe would help you in your journey to becoming an Apache committer and a PMC member.

Asking Questions

Asking questions is the fastest way to learn. Don't be afraid to ask questions if you do not understand something. I ask questions a lot and I always get answers, but I didn't start by asking questions: I made 40 commits to the repository without understanding what Airflow does. It was not until I joined my new employer Astronomer that I learned what DAG is and what a data pipeline is. Now I can easily reproduce issues following someone's descriptions. I wish I had asked questions earlier --I could have had more experience by now!

Start small

If you are like me, with little experience, start contributing from the minor issues. Find good first issues and work on them. You don't have to wait to contribute a large change before contributing.

While working on the REST API project, which I got hired by Outreachy to do, I was looking at the codebase. I started with Airflow providers because it was easy for me to understand. There were so many requests about providers at the time and I started looking into it, reading the code base, and helping with the providers. I didn't go into the core straight up; I avoided it. My first PR was on simple database migration during the Outreachy contribution period.

Refactor codes

Airflow is complex. Till now, I'm still learning it. Just last week I learned about how the execution date works. I know there are a lot of other things I have not understood very well but refactoring helped me to understand a lot.

When I was to work in the scheduler, I found the file was so large that I went back and forth without progress. I worked on separating the files and I'm glad I did because after that I could contribute. I recommend refactoring code but do not go into large refactoring. A little at a time, with the hope to understand the project. Avoid the core of the project if you are just starting.


One thing about issues is that most reporters would tell you how to reproduce them. Most times, you would find that the issue is quite easy to fix. I usually jump on those and fix them. Other times, I had to contact my superiors before I could fix it.

Looking at reported issues gives an added advantage that you could learn how the software works in the real world. Try to reproduce as many issues as possible. It adds to your knowledge.

Pull Requests

Here's where you can learn a great deal. I start my day by looking at the PRs. Most PRs link to issues. I read the issues and study PRs. I must admit that some of these PRs are just too complex for me. If I don't understand it, sometimes I ask questions, other times I go to the next PR. When I jump to the next PR, I record the topic that made me jump to the next and plan on reading about it some other time.

When you make a PR, ask for reviews in the community channel of communication. Airflow uses Slack and the mailing list for communications. You should ask for reviews in the slack channel and not the mailing list. The reviews not only give information on how to fix the problem but also teach you best practices in programming.


The ASF has a code of conduct that covers the Foundations activities as well as the projects. Read it first.

Among many other things, you would learn in Apache Airflow is communication. How to communicate with people in a civil manner. Spend time reading PR reviews, you will learn a lot and especially how to ask people to make changes to their code.


You don't have to wait for an invitation to contribute to an Apache project. You don't have to become an Outreachy intern to get involved with something you're interested in.

Don't be afraid to make a PR because nobody will penalize you if you're wrong. I know the feeling that people may think you are not good enough, forget it, they know you are new to the field and if you are thinking that they don't know your level in the language, forget it too, they know you are still a junior because it says so in your code. I can't count how many times I have had code reviews that showed me a better way to implement the code. Be open-minded, make mistakes, and excel.

Ephraim Anierobi started to work on the Apache Airflow project as an Outreachy Intern in May 2020. He became a committer in February 2021 and a member of the Apache Airflow Project Management Committee (PMC) in August 2021. He is a software engineer at Astronomer.

= = = "Success at Apache" is a monthly blog series that focuses on the processes behind why the ASF "just works" https://blogs.apache.org/foundation/category/SuccessAtApache

Wednesday June 23, 2021

Success at Apache: Security in Practice

by Jarek Potiuk

This post is about the Apache Software Foundation's Security process and security mindset of the Apache Software project’s PMC put to the best use in practice. From this post you can learn why security practices we apply at our projects are important and how they work when they are applied correctly and when the right security-driven mindset is applied by the PMCs but also how important it is for the users of the Apache Software Foundation projects to keep their software updated - including latest security fixes.

The idea of this article was triggered by a recent blog post of the security researcher Ian Caroll that has earned USD 13.000 on bug bounties by simply following up the results of Apache Security process applied by the Apache Airflow PMC. This saved quite a few businesses a lot of trouble, but it was only possible due to the foundations laid down by the ASF and the PMC of the project.

Here is what Ian Caroll has to say about it: “This issue was a great example of how ASF's transparent way of fixing and disclosing vulnerabilities worked to protect users of their software, and gave many organizations a wake-up call on ensuring they upgrade and protect their open-source software.

Apache Airflow is one of the most common orchestration software used in the industry currently, and due to its nature, it sounds like an important vector of attack - if you run it internally in your company, you are likely to interact with pretty much all your systems, and if you manage to break in through Airflow, it might cascade into as many systems you connect to. Therefore the Apache Airflow PMC takes security very seriously. So seriously that we have the whole discussion panel about Apache Airflow Security at the Airflow Summit that is coming soon - July 8-16th.

This post's main point is to show how important it is to follow the security best practices for all the software lifecycle and how important it is to think about it at every step of building and releasing the software (and beyond).

Let's start from the very beginning: making sure the code development process is secure. Like most of the ASF projects, the Apache Airflow project is developed in GitHub and together with a growing number of projects we use GitHub Actions to run continuous integration. There are a number of best practices and security hardening practices published by Github that you should follow when you run your CI with GitHub Actions, and we rigorously follow them, including monitoring of the "Security blog of GitHub" and following it’s advisories.

And we have not stopped there. We actively think and discuss the potential security threats and ways how - for example supply chain attacks can be performed on our project, and we share our findings at the discussion mailing lists of the ASF and introducing recommendations for all ASF projects to make use of the best practices. One of the results there is documenting the practices and sharing them at the builds@apache.org. But we also raised a few security issues to GitHub and as a result of that (at least that’s the feedback we got from GitHub) they implemented some improvements that we apply in practice. The recent example of that is a change implemented by GitHub to allow control of permissions of the GitHub Token used during the CI build which resulted in this PR. Few months ago, we raised concern that having the blanket "write" permission is quite dangerous, and GitHub responded and implemented the change, which allowed us to limit the scope of tokens used for our builds and increase protection against a wide range of attacks - with the supply-chain attacks being recently the most prominent ones, leading to ransomware threats and millions of dollars paid to hackers

This is where the security mindset for the Apache Airflow PMC starts with and this lays the foundation for the next steps where the Apache Software Foundation takes a crucial role in - releasing the software and monitoring for security vulnerabilities. The ASF has a rather well established process for disclosing and following up with security vulnerabilities for the ASF projects. One that is very straightforward and simple to follow for everyone involved - starting from security researchers, who raise those issues, going through the voluntary (!) security team of the ASF that has to handle (from the upcoming annual report) 387 reports of possible vulnerabilities spanned across 95 of the top level ASF projects, which led to 155 CVEs (Common Vulnerabilities and Exposures) assigned, and end up with the PMC that has to handle solving the issues and follow up with reporting. Heck, ASF even introduced an internal portal to report and keep track of all the CVEs as well as report the yearly security summary report and video.

This process is very clear about responsible disclosure and publishing the vulnerabilities, the way how security researchers, the ASF security team and PMC can collaborate when security is discovered. Quite a recent experience there was discovering and announcing CVE-2021-29621: User enumeration in database authentication in Flask-AppBuilder. This issue was reported to the ASF - following the process - by Dolev Farhi he responsibly disclosed it together with proof-of-concept reproducible scenario that allowed us to quickly verify that the issue exists and (more importantly) that allowed us to verify that the issue is fixed when we fixed it. 

At the end of the process this is the message we got from Dolev: "Truly enjoyed working with you. Thanks so much for your help in bringing this to closure and making Airflow what it is."

The CVE was an interesting one because it was not an issue with the Airflow code, but it was introduced by a dependency of Airflow - Flask-AppBuilder. Fortunately the process is built in the way that we can involve and collaborate with other projects in solving it, and we got excellent support from Daniel Gaspar. We tried and tested the fix locally, provided it to Daniel which let Daniel quickly implement it and release a new version of Flask AppBuilder fixing it. This was also important for the Apache Superset project (Daniel is a PMC there as well) which also uses Flask-AppBuilder and suffered from the same vulnerability. This shows how security is a distributed issue and how much cooperation is important and how much a good security process should embrace it. I truly enjoyed cooperation with Daniel, and Dolev as we helped to test release candidate of Flask AppBuilder. Later on, when the CVE was published, we announced it following the regular announcement process.

Here is what Daniel has to say about it: "A great example of multiple open source projects working together, elevating each other to higher quality. The whole is greater than the sum of the parts. Got a clear report with a proposed fix, reproducible steps all backed by the ASF security process, it was a breeze to fix and release." 

This leads to the most important point. We can do only as much as we can when it comes to developing and releasing our software. But then it’s up to our users to upgrade to the latest versions. If they don’t, they remain vulnerable. This was the actual reason for the blog post I mentioned initially - despite announcing a CVE-2020-17526 and releasing a fixed version a long time ago, many of our users did not follow the announcements and did not upgrade to the latest version of Airflow. I must stress here the importance of this step - as long as our users do not upgrade to fixed versions, there is not much we can do to help them. It's all in our users' hands! This time it ended up with just USD 13.000 paid to Ian in the form of bounties, because Ian is a responsible security researcher (so called "white hat"). But imagine some bad characters doing the same thing Ian did.

Of course we understand that this might sometimes be difficult to migrate to newer versions of a software, but here we also have another solution that we applied last year, and one that might seem surprising at first, but makes perfect sense when you look at the consequences. Consistent versioning and release support predictability. When we announced Airflow 2.0 last year, there was a small but important change we introduced - full support for Semantic Versioning which we follow rigorously since. We also published a predictable version lifecycle. Why is this important ? Because the users might be pretty sure that they can safely upgrade “patchlevel” version of Airflow when it gets released without even thinking about potential migration problems. Also when you release the "feature" - minor version of Airflow, we promise it is backwards-compatible and even if the migration process might be a bit longer, they can apply it without worrying about spending a lot of time for the migration of their DAGs (DAGs are the users workflow definitions that some of our customers have many thousands of as their entire data processing is orchestrated by Airflow). 

We also publish (and will continue to) the support schedule for our major releases, so that the users can be prepared and plan migration to new major releases in advance. As with all software we sometimes will implement backwards-incompatible changes which will cause our users to spend more time on migrations. Those old releases will stop receiving security fixes at some date and the best you can do as a user is to migrate to the supported version before the date!

Which leads to the last and most important point in this article. If you are a diligent reader and look at the announcement I mentioned above for CVE-2021-29621, you will see that the fix for that is only released for Airflow 2 series. Why? Because Airflow 1.10 just reached its end-of-life on June 17th 2021. When we released Airflow 2, half a year ago, we agreed in the community that we will only support Airflow 1.10 with critical/security fixes for 6 months. And we did - for example the CVE-2020-17526 has been addressed in the Airflow 1.10.14. 

But this time is over now. This is the first security vulnerability that we addressed only for Airflow 2. If you are still using Airflow 1.10 - you are on your own now. You are no longer protected by the security process of the ASF, the security team of ASF and airflow PMC. What’s even more - security researchers who raise the issues, even if they find it, might not be eager to responsibly disclose it, knowing also that the issue will not be fixed anyway. When you read about the next ransomware attack and millions of dollars paid, think if you would like one day your company to face this kind of dilemma. Even if it costs time and money to keep your software updated, preventing this kind of problem is far cheaper than dealing with the consequences of such an attack.

Upgrade NOW! to the latest release of Airflow 2 and keep on doing it for the future releases!

Be sure to join us at Airflow Summit online 8-16 July https://airflowsummit.org/ --registration is free and open to all.

# # #

Jarek Potiuk started to work on the Apache Airflow project in September 2018. He became an Apache Airflow committer in April 2019 and a member of the Apache Airflow Project Management Committee (PMC) in October 2019. He was elected an ASF Member in April 2021. He is an Apache project mentor in Outreachy and Google Summer of Code and was a mentor in Google Season of Docs. Jarek is an independent Open Source Contributor and Advisor and always keen on making it easier for people with different backgrounds to join OSS projects. = = = "Success at Apache" is a monthly blog series that focuses on the processes behind why the ASF "just works" https://blogs.apache.org/foundation/category/SuccessAtApache 

Monday April 12, 2021

Sponsor Success at Apache: The Fork

by Wei Zhou

I joined the Apache CloudStack community in 2012 and became a committer in 2013, eventually becoming a PMC (Project Management Committee) member in 2017. My journey to becoming a PMC was both physical and literal, and included several forks in the road. The forks presented themselves in all aspects of my journey – although the literal forks came later, mainly because my journey began in China.

In 2010 I was working at China Mobile, the world's largest mobile network operator, in Beijing as the manager of a cloud project based on OpenNebula (another Open Source project). A year later, my partner received her PhD in the Netherlands and began working in Belgium, so I started looking for new work opportunities in the area. 

In 2012 I visited Europe and had a few interviews, but it was difficult at the time as my English was quite poor.

I was committed to finding a good job and moving to Europe, which meant I needed to improve my English quickly. I studied the language, left China Mobile, and moved to Belgium permanently. It took me seven months to became fluent in English. I re-interviewed at the companies that had rejected me seven months prior and landed a position in Amsterdam at Leaseweb as a Cloud Innovation Engineer. At that time, we had two public cloud platforms based on Apache CloudStack. I was mainly working on the research of Apache CloudStack.

In the first two months I fixed some bugs we found in our productions. Thanks to the CloudStack community, I also received tons of help as I began contributing my changes to the mainstream. In 2013 was invited to be a committer, 3 months after my first submission. It was a huge surprise and a massive honor for me, and I began pushing my changes for new features and bug fixes much more quickly.

A year later, in 2014, Leaseweb released its first private cloud based on Apache CloudStack. It was very welcomed by many customers. As this was happening, we began finding issues with CloudStack as customers were requesting more features and functionalities. The same year, Apache CloudStack moved code repositories to GitHub.com and started using GitHub pull quests to review and merge commits. While all commits should be reviewed and approved by other commits before they are merged into the mainstream, we had already made many changes at Leaseweb and could not wait for next release. Because of this we created our own fork containing all our changes and bug fixes. 

We developed very quickly, and our process was much faster than the review/merge process of Apache CloudStack. The gap between our fork and the community was getting bigger and bigger. When we decided to upgrade from CloudStack 4.2.1 to CloudStack 4.7.1, we had to spend half of a year just to port all of our changes in our fork based on CloudStack 4.2.1 to new fork based on CloudStack 4.7.1. The same problem happened again when we tried to upgrade to CloudStack 4.14, and we had to spend around one year to port all of our changes. The lesson we learned from these two upgrades was that we needed to contribute more to the community and maintain a fork as small as possible. After realizing this, we contributed all of our features and bug fixes to the community by creating many GitHub PRs. Some PRs have now already been merged into mainstream, while others are still in review.

My colleague recently asked me, “If you could go back in time, would you still make the Leaseweb fork?” My answer is yes, I would do it again. A fork makes us more flexible, as we can offer more stable production and more functionalities to our customers. However, if I could go back in time, I would have spent much more time contributing our changes to the community. I’ve learned that the gap between the fork and the community should be less than 100 commits.

We learned so much from these two painstakingly long ports and have implemented the above advice. From now on, the Leaseweb fork only contains features we have developed in the past and bug fixes. For new features, we will always contribute to community and deploy to our production only if it is merged into the mainstream. By doing this, we will be able to upgrade to the next CloudStack release much easier, and will benefit more from the community (e.g., more bug fixes, more features by other contributors). When we contribute to the community, we also benefit from knowledge sharing and the contributions from others.

Wei Zhou has been an Apache committer since 2013 and a PMC member since 2017. He has a Masters in Computer Applied Technology from the University of Science and Technology of China, and a PHD in Computer Organization and Architecture from the Institute of Computing Technology, Chinese Academy of Sciences. Wei specializes in all things computers and has over 10 years of experience in software development. He is a Principal Cloud Engineer at Leaseweb.

= = =

"Success at Apache" is a monthly blog series that focuses on the processes behind why the ASF "just works". "Sponsor Success at Apache" features insights and experiences by select ASF Sponsors https://apache.org/foundation/thanks.html

For more Success at Apache posts, visit https://blogs.apache.org/foundation/category/SuccessAtApache

Monday August 03, 2020

Success at Apache: I Became an Apache Solr Committer in 4,662 Days. Here’s how you can do it faster!

by Eric Pugh

On April 6th, 2020 I was invited to become a committer on the Apache Solr project.  My journey to becoming a committer started in earnest 4662 days before that!  On July 2nd, 2007, I opened SOLR-284, a ticket for adding content extraction to Solr. 

A committer on an open source project under the Apache Foundation umbrella is someone who is trusted to contribute code to the project and to help manage and drive its ongoing development. It’s an honour to have been asked and I was very proud to accept the invitation!

So, you did the math, and you realized that it took me 153 months, or 13 years (rounding up), to become a committer, and you’re wondering “What if I don’t want to wait that long?” So here’s my quick cheat sheet on ways to become a committer on an open source project, illustrated by my own journey:

  1. Start by learning the culture of the project. How are decisions made? What tools do people use? What do the various acronyms mean? Join the mailing lists and read every commit.
  2. Start small and work your way in.  Some great ways to do this are to:
    • Take existing patches and test them.  Update them to the latest code base.  Document what you’ve learned
    • Take advantage of being new to a project to bring fresh eyes to the documentation.  Every time you find yourself scratching your head on how something works, contribute a fix to the docs.   It’s a powerful way to immediately contribute.  This is the fastest way to get involved and involves the least cognitive load!  See SOLR-2232 or this email thread.
    • Answer questions on the mailing list! Being able to articulate reasonable responses to questions demonstrates how much you have learned.
    • Bug fix, bug fix, bug fix! Pick bugs that have an obvious answer so that the “correct” solution is easy to figure out. If the right approach to solving it is very ambiguous, you probably won’t get much traction. Remember to remind committers to apply your fixes when they have the time! See SOLR-13965 and SOLR-11480 and SOLR-2611 and SOLR-2263.
  3. Ready to start slinging some code?   Don’t go and refactor the core foundations of the project (at least not yet).   Instead, be like a pilot fish and latch onto one of the core committers who is being very active in the project.

Embrace their vision, and start picking up tasks related to whatever major chunk of work they are doing. Write some unit tests. See about opportunities for refactoring. Do some manual testing over multiple platforms. Once they see that you’re contributing (and accelerating what they are pushing), then work to get some of your own tickets assigned to you under that vision. I’ve seen this lead directly to committership many times, and if I had followed this route, I might have joined sooner!

Here’s to the next 4,662 days of being active in the Apache Solr project!

Eric Pugh is a member of the ASF and a committer Apache Solr. He co-authored the book Apache Solr Enterprise Search Server. Eric is co-founder and CEO of OpenSource Connections, where he helps OSC clients, especially those in the ecommerce space, build their own search teams by leading projects and by acting as a trusted advisor. He also stewards Quepid, a platform for assessing and improving your search relevance.

[this post first appeared at https://opensourceconnections.com/blog/2020/07/10/i-became-a-solr-committer-in-4662-days-heres-how-you-can-do-it-faster/ ]

= = =

"Success at Apache" is a monthly blog series that focuses on the processes behind why the ASF "just works" https://blogs.apache.org/foundation/category/SuccessAtApache 

Monday May 11, 2020

Success at Apache: Remote Collaboration in the Time of Coronavirus

by Marvin Humphrey

I "arrived" at the Apache Software Foundation in 2005, unreasonably angry about a bug in Apache Lucene.  By "arrived", I mean that I sent the first few emails among several thousand I would go on to send over the next 15 years — the ASF didn't have a physical office where I could show up to buttonhole and berate some unlucky customer service representative.  An unreasonably patient Lucene contributor named Doug Cutting talked me down.

Because the ASF has always been a virtual organization, the Coronavirus pandemic has had minimal impact on its day-to-day operations.  While individual contributors may be personally affected, at the collective level there's been no mad scramble to adapt.

Others have not been so fortunate.  All around the world organizations have been struggling to revamp their processes and infrastructure to comply with "social distancing" protocols.  Sadly, many have already laid off workers, or even closed their doors for good.

And yet, there is a huge pool of work which could conceivably be performed remotely but isn't yet — or which is suddenly being performed remotely but inefficiently.  If we can accelerate and streamline the transition to remote work, many jobs and businesses could be saved.  With some creativity, our interim "new normal" could be more propsperous, and perhaps sooner than we think!

Are you an Open Source contributor?  If so, you possess expertise in remote operations which is desperately needed in today's challenging economic environment.  Let's talk about what we know and how we can help.

The Internet Turns People Into Jerks

People type things at each other over the internet that they would never say to someone's face.  In person, we calibrate our language based on feedback we receive via facial expressions, tone of voice, and body language.  But when all communication is written, the feedback loop is broken — and all too easily, vicious words fall out of our fingertips.

Suddenly-remote workers may find themselves exposed to this phenomenon as conversations that once took place in the office migrate to Slack, email, and other text-centric communication channels.  But it can be tricky learning to recognize when a conversation being conducted via a text channel has gotten overheated — it takes an intuitive leap of empathy, possibly aided by dramatic reading of intemperate material a la Celebrities Read Mean Tweets https://www.youtube.com/playlist?list=PLs4hTtftqnlAkiQNdWn6bbKUr-P1wuSm0 on Jimmy Kimmel.

Open Source communities have grappled with incivility for as long as the movement has existed.  Over time, "ad hominem" personal attacks have gradually become taboo because of their insidious corrosive effect; there exists broad cultural consensus that you should attack the idea rather than person behind it.

Defenses have become increasingly formalized and sophisticated as more and more communities have adopted a "code of conduct".  While the primary purpose of such documents is guard gainst harassment and other serious misconduct, they often contain aspirational recommendations about how community members should treat each other — because serious misconduct is more likely to occur in an environment of constant low-grade incivility.

Regardless of whether your organization adopts a code of conduct, it won't hurt to raise awareness among remote team members of the suceptibility of text-based communications to incivility — so that they may identify and confront it in themselves and others and shunt everyone towards more constructive patterns of communication.

Keeping Everyone "In The Loop"

Coordination is a troublesome problem even when everyone works in the same office, but the difficulties are magnified in remote environments where it takes more effort to initiate and conduct conversations.  Teams can become fragmented and individuals can become isolated unless a culture is established of keeping everyone "in the loop".

At the ASF, the problem is especially acute because its virtual communities are spread out across the globe.  Due to time zone differences, it is typically infeasible to get all stakeholders together for a meeting — even a virtual meeting held via conference call or videochat.  Additionally, many stakeholders in ASF communities do not have the availability to participate in real-time conversations regularly because they are not employed to to work on projects full-time.

"Synchronous" communication channels like face-to-face, videochat, phone, text chat, and so on are good for rapid-fire iteration and refinement of ideas, but they effectively exclude anyone who isn't following along in real-time.  Even if conversations are captured, such as with AV-recorded live meetings or logged text chats, it is inefficient and often confusing to review how things went down after the fact.

The solution that the ASF has adopted is to require that all meaningful project decisions be made in a single, asynchronous communication channel.

  • The channel must be canonical so that all participants can have confidence that if they at least skim everything that goes by in that one channel, they will not miss anything crucial.
  • The channel must be asynchronous to avoid excluding stakeholders with limited availability.

Synchronous conversations will still happen outside this canonical channel —and they should, because again, synchronous conversations are efficient for iterating on ideas!  However, the expectation is that a summary of any such offline conversation must be written up and posted to the canonical channel, allowing all stakeholders the opportunity to have their say.

At the ASF, the canonical channel must be an email list, but for other organizations different tools might be more appropriate: a non-technical task manager such as Asana, a wiki, even a spreadsheet (for a really small team). The precise technology doesn't matter: the point is that there are significant benefits which obtain if a channel exists which is 1) canonical, and 2) asynchronous.

Decision Making

In an office, decision makers can absorb a certain amount of information by osmosis — via overheard conversations, working lunches, impromptu collaborations, and so on.  That source of information goes disconcertingly dry on suddenly-remote teams, leaving only information siphoned through more deliberate action.

A canonical, asynchronous communication channel can compensate to some extent, providing transparency about what is being worked on and how well people are working together, and facilitating oversight even while most of the work gets done solo.  Because properly used asynchronous channels capture summaries rather than chaotic and verbose real-time exchanges, the information they provide is more easily consumed by observers watching from a distance.  The canonical channel also provides an arena for gauging consensus among stakeholders and for documenting signoff.

"Lazy consensus" is a particularly productive kind of signoff, where a proposal is posted to the canonical channel and if there are no objections within some time frame (72 hours at the ASF), the proposal is considered implicitly approved.  Provided that the channel is monitored actively enough that flawed proposals get flagged reliably, "lazy consensus" is a powerful tool for encouraging initiative — a precious quality in contributors collaborating remotely.


Organizations are adapting in myriad ways to the economic crisis brought on by the Coronavirus pandemic.  In the world of Open Source Software where countless projects have run over the internet for decades, we've accumulated a lot of hard-learned lessons about the possibilities and pitfalls of remote collaboration.  Perhaps our experiences can inform some of the suddenly-remote teams out there straining to find their way in these difficult times.  Let's help them to do their best!

Marvin Humphrey is a Member Emeritus of the ASF and a past VP Incubator, VP Legal Affairs, and member of the Board of Directors.  These days, he is focusing on family concerns and consulting part-time.

= = =

"Success at Apache" is a monthly blog series that focuses on the processes behind why the ASF "just works" https://blogs.apache.org/foundation/category/SuccessAtApache  

Monday April 06, 2020

Success at Apache: Welcoming Communities Strengthens the Apache Way

by Jarek Potiuk

During my career, I have been a software engineer, Tech Lead Manager at Google, a robotics engineer at an AI and robotics startup, and am currently the Principal Software Engineer of a software house, Polidea, which I helped grow from 6 to 60 people within 6 years as CTO. Over the past year and a half I was a user, then contributor, then committer, and now a Project Management Committee (PMC) member of Apache Airflow.

Although I took on many roles through the years, including being the main organizer of the international tech conference (MCE), deep in my heart I was always a software engineer. It took me many years to find a place where I could explore my true potential. Then I became part of the Apache community. I first learned about the Apache Software Foundation (ASF) 20 years ago when I used the Apache HTTP server at the beginning of my career. I had only made small contributions to OSS projects up to that point, and becoming involved with Apache Airflow was the first time I contributed seriously to one. As a Principal Software Engineer at Polidea, several of our customers were using Apache Airflow and wanted to contribute back to the project to help other users. Better integrations of services with Airflow would significantly improve future releases of the software. 

The needs of our customers made me and my team go from users of the project to contributors and more. We have people in our software house who understand open source, know how to follow the OSS rules, and contribute changes from customers to help other people in the community. We know how to communicate well, we can also represent a vendor-neutral point of view because we represent the view of several customers and collaborate with all the stakeholders in the project. People in our company also contribute to other OSS projects, such as Apache Beam and Flink. 

We also discovered a great model where our customers wanted us to contribute to an open-source project to make it better because they were using it and wanted to improve it for future users. This allowed us to do it full time (or even 150% of the time if you add all the out-of-hours contributions). I invite you to read about it in my blog post The evolution of Open Source - standing on the shoulders of giants.

Committing to Apache 

I found exactly what I was looking for in the Apache Software Foundation. It’s a great organization for people like me: individual contributors who are also good at working with others, the ones who don’t shy away from organizing and making things happen, who thrive when they can do meaningful work with others. 

This made me think: since the ASF is so great, how come for 20 years I was not contributing to OSS projects more? And since so many software engineers use Apache technology, why is participation not more common? I got lucky because I was in a position that allowed and supported my contributions to an open-source project. For me, it’s a dream-come-true. But what about others? There must be more people willing to contribute and get involved in the OSS community, they probably just don’t know how to go about it yet or did not like the experience.

So here I am, sharing my thoughts on what can be done to help others to get to know ASF sooner and get involved.

Apache Airflow and the initial experience 

Apache Airflow is an exciting project. It is a platform created by the community to programmatically author, schedule and monitor workflows. It started in AirBnB in 2014, was submitted to the ASF incubator in March 2016 and it graduated to a top-level project in January 2019.

When I started working with the Apache Airflow project, I quickly realized that it was hard for me to contribute to. It was not clear how to develop and debug Airflow, how to start, and how to communicate. The project had a number of channels for communication including a developer list, a Slack channel, issues and pull requests alongside the code. As a newcomer, it was not easy to understand which channel is used for what and whether it’s OK to raise certain issues using those channels. It was not clear what were the common protocols: for example how to see that one thread is a discussion and one is voting on an already discussed topic. 

Is our community welcoming enough?

At a party after a conference where I spoke about Apache Airflow, I had a long discussion with a young engineer who was new to the field, Fabian. Fabian told me that often OSS projects create some invisible barriers around communication and onboarding. I explained to him the “Apache Way” and how transparency and openness help with those barriers, fiercely protecting the fact that “we are open”. We carried on with our friendly discussion and it was really eye-opening.

That conversation stayed with me for a while. After some time, I realized that maybe our project was not as welcoming and accessible as we thought: there should be an easier way for people to contribute and join the community. I recalled my case—when I joined the community, I made a mistake by writing that something “will happen” before discussing it with the community. A long-time community member reminded me that this is not the way we should communicate at Apache Airflow. Just to note - each project is autonomous within Apache and it defines its own communication rules. It was done in a very good and friendly tone and I took it as a lesson, but some people might be put off by such a response. Not everyone has the determination, experience, thick skin and willpower to overcome all the obstacles and some people might be put off by such responses - even if they are nice and friendly. 

Could we do better to communicate the ideals of our community more straightforwardly? Without coming across as harsh? Maybe we could find a way of explaining to the future contributors how they should communicate rather than do it by trial-and-error?

Becoming a more welcoming community

A few days and emails after the discussion with Fabian I started a thread at the developer’s list of Apache Airflow “[NON-TECHNICAL] [DISCUSS] Being an even more welcoming community?” that kicked off a conversation that included people who rarely had spoken before. As a result, we managed to introduce many changes to the processes for new members. Thanks to the input of people such as Karolina Rosół (Project Manager at Polidea), we came to the conclusion that the way seasoned community members communicate at Apache Airflow is not obvious at all to newcomers. We added missing chapters to our CONTRIBUTING documentation regarding communication channels, expected response times, and more. 

What helped a lot was that we were able to improve our documentation for the development process during last year’s Google Season of Docs program. Apache Airflow was one of the first projects at the ASF to participate in the program. I was one of the mentors to Elena Fedotova, a technical writer assigned to our project. She improved and restructured the documentation, and made it more readable and easier to understand. Many people took part in reviewing and correcting the docs. Also, we took on the task of creating a new website for Apache Airflow with modern, clean design, and well thought UX addressing different personas of visitors in mind (including new contributors, users and potential partners of Apache Airflow). One of my colleagues and fellow Apache Airflow committer, Kamil Breguła, put enormous effort into both building the website and also restructuring the documentation. 

As a result of the discussion at the developer’s list, we also introduced the mentorship options and even handled (via mentorship) a few difficult cases that could have lost us valuable contributions. A great example of the improvements we’ve done as a community might be this tweet from Vanessa, a research software engineer who had no experience with the community.  Vanessa had tried to contribute support for Singularity—a popular container technology for high-performance computing - a year earlier, and came back for a second try after much of this work was done:


Are we there yet?

Looking back, it’s been a long (yet satisfying) journey trying to make the Apache Airflow community more welcoming. But how do we know it works?

At the beginning of this year, we started to participate in the Outreachy and Google Summer of Code programs, where people from around the world with different backgrounds can be paid for contributing to open source projects. Together with my friend and PMC member Kaxil Naik, we became program mentors and started to receive a flood of requests from the Outreachy members. Initially, we were overwhelmed but soon realized that we have everything we need to answer the questions of the candidates (and future committers) to let them teach their lessons and easily follow the “contributing” documentation. The contribution environment was available for them to get started, and the documentation detailed how they could learn how to prepare contributions and communicate via various channels.

Just two days later, we approved a few pull requests from those people! That’s quite a difference from 1.5 years ago when it took days, if not weeks, to understand the environment and how to work with Apache Airflow. It was truly a team effort; many community members participated in the process and made the Airflow project much more welcoming to newcomers.

Despite having challenges during our experience getting started, I was never going to quit. I loved the project and people almost from day one. Realizing how hard it was initially to start contributing (other people had told me so as well), I decided that I would put a lot of effort (both professionally and also personally) into making the project easier and more open and accessible for people with different backgrounds and experiences. My experience starting as a contributor, then becoming a committer, and now a PMC member proves that this is possible.

To me, Success At Apache means making the community and the spirit of Apache Way more accessible to people around the world. With the difficult times that we are going through now with COVID-19, it’s more important than ever to build and strengthen various communities. And to strengthen the community means to be open to others and be welcoming, We hope that our experience will encourage you to take a look at your project and see if you can make your community more welcoming.

# # #

Jarek Potiuk started to work on the Apache Airflow project in September 2018. He became an Apache Airflow committer in April 2019 and a member of the Apache Airflow Project Management Committee (PMC) in October 2019. He is an Apache project mentor in Outreachy and Google Summer of Code and was a mentor in Google Season of Docs. Jarek is a Principal Software Engineer at Polidea and always keen on making it easier for people with different backgrounds to join OSS projects.

= = =

"Success at Apache" is a monthly blog series that focuses on the processes behind why the ASF "just works" https://blogs.apache.org/foundation/category/SuccessAtApache 

Monday February 03, 2020

Success at Apache: Literally

by Chris Thistlethwaite

I became part of the Apache community as a member of the ASF Infrastructure team in 2016, and was elected an ASF Member in 2019.

Browsing through the other "Success at Apache" posts made me reflect on the word "success". Years ago, I was asked in a job interview, "How do you define success?". After a pause, I asked back, "In what?", which threw the interviewer off a bit. That's just too broad of a question for me to define one answer: success in a career, success as a human, success as a team member, success at a software release, the list goes on and on. 

Every day there's a giant list of possible successes and failures, and that’s even before you get to work ...so keep that in mind as you continue reading.

In August of 2016 I came across a blog post that would change my life forever. 

At the time, I was looking for a new job that was taking longer than I expected. Taking a long shot, I sent off a very sparse email replying to the post. Two days later David Nalley (VP Infrastructure) replied, introducing me to Daniel Gruno who'd be doing the first round of interviewing. Fast forward a few months, and, spoiler alert: I got the job.

My first day "in the office" was in Seville, Spain, on November 14th during ApacheCon EU. Let me jump back a bit: most of the "Success at Apache" posts talk about the extensive background the authors have, both in the Open Source community and the ASF. While I use httpd, LAMP, etc. all the time, I never really found out how the "sausage was made". Apache has well-made products and the philosophy of how they were built intrigued me. My career until that point has mostly been inside Microsoft shops, usually with me suggesting FOSS solutions in meetings and only getting to use them in small-ish batches. A few MySQL boxes here, a few other Linux machines there, but not "full stack" kinda stuff: I ran it where I could but I was very happy with Microsoft products. "Best tool for the job", right? 

Anyway, back to Spain. I don't travel as much as I should, my Spanish is terrible (or enough to get me into a bar fight), and I'm traveling to a country I've never been to.

Friday November 11th was the last day at my previous job. Saturday afternoon, I left my wife and kid to jump on a plane for Seville, Sunday-ish I landed, and on Monday I started work in another country, at a job that was 98% Linux-based (Windows Jenkins build nodes), with people whom I’ve never seen before because no one used video chat during the interviews --at a conference held by the foundation I now work for. 

You may ask yourself, "How did I get here?", as I sure did: queue "Once in a Lifetime" by the Talking Heads...

My time at the ASF has been very interesting to say the least. With such a huge range of users of Apache software, some days I'm helping a large global company trying to get a product out the door, other days I'm troubleshooting a broken commit for someone working in their basement between dinner and baths for the kids. That's what makes this place special: those contributions help the community and help the common good of the project. The unique perspective I have is from within Infra. We don't just support the ASF, we support all projects in one way or another. One project might just be getting started with automated builds in Jenkins while another has been using CI/CD for years. That's a true strength of the ASF: disparate parts come together as a whole in a way that wouldn't work otherwise. Some days my job has nothing to do with technology, it's just getting the right people together on an email to figure out how to solve a problem, leveraging the different parts.

As mentioned earlier, "success" is a moving target, and at Apache, it's no different. Though in my case, any success at my job means I'm helping the ASF become successful, which in turn helps the projects and communities it supports. Behind every commit is a person, just working towards their own success.

I'm glad that I took the chance to respond to the job opening. Every job, company, and environment have a fair share of unpredictably and diversity. At the ASF, those traits are celebrated, leveraged, leaned on, and held up by the great people I get to work with and the community that I'm proud to be a part of.

Chris Thistlethwaite has been fixing problems and herding cats since before he can remember. He likes digging through log files to find solutions to complex problems and then turning his findings into pretty charts and graphs. After working at Avenue A | Razorfish, Sharebuilder, and some small startups, he brought his unique perspective on DevOps/Systems Engineering to the ASF Infrastructure team, where he specializes in monitoring systems. In his spare time, he enjoys homelabbing and spending time with his family.

= = =

"Success at Apache" is a monthly blog series that focuses on the processes behind why the ASF "just works" https://blogs.apache.org/foundation/category/SuccessAtApache

Monday September 23, 2019

Success At Apache: "Mentor Your Mentor"

By Patricia Shanahan

After retiring, I wanted to continue programming but without the pressure and constraints of a job, so I started contributing to Apache. Open software development the Apache way is a great retirement hobby, offering social contacts, intellectual challenge, continuous learning, and the pleasant feeling of making a contribution.

I just got back from having a wonderful time at ApacheCon NA 2019 in Las Vegas. While there, I met relatively young people, and older people who had been involved in Apache for up to 20 years, but joining as a retiree seemed to be unusual. 

Encouraging retirees could benefit Apache in many ways. 

Often, a retiree has a range of experience and skills that take time to accumulate. I have worked, for several years each, on applications, operating systems, compilers, system performance, and architecture of servers with dozens of processors. People like me who were programming in the 1970's have experience surmounting memory limitations, a skill that may be useful again for Internet of Things projects. I can imagine several reasons for a lack of retiree recruits. The most basic is that the computing profession was relatively small when a 2019 retiree would have started their career. That is a good reason to develop ways of helping retirees join Apache, so we will benefit from increasing numbers over the next few decades.

Some retirees already have plans that will take all their time and energy, and have zero interest in another hobby. Among those who might choose Apache as a hobby, there are several possible blocks, such as just not thinking of it, lack of confidence in returning to doing after a period of managing, outdated skills, and skills that may have atrophied through disuse.

The concept behind "Mentor your Mentor" is that someone who is active in Apache should watch for opportunities to bring the idea of open source as a retirement hobby to the attention of a retiring colleague, even if the retiree has been their mentor, and no matter how senior the retiree.

If the retiree is interested, the Apache contributor can offer various forms of help and support such as:

• Introduction to how Apache operates
• Encouragement
• Help selecting a project
• Help identifying resources for technical learning and relearning

In summary, the Apache contributor would do for the retiree the things a good mentor would do for someone new to IT. 

If you are an Apache contributor reading this blog, ask yourself: who in your network has retired from the computing profession? Reach out to them! Apache projects are a great opportunity for retirees to reconnect with innovation in computing. If you are a retiree and do not have an Apache mentor, don't let that stop you. Begin at http://community.apache.org/newcomers/.

Patricia Shanahan worked from 1970 to 2002 in various programming and computer architecture roles for NCR, Celerity Computing, FPS, Cray Research, and Sun Microsystems. She then went to UCSD as a graduate student, receiving a PhD in computer science in 2009, after which she retired.

= = =

"Success at Apache" is a monthly blog series that focuses on the processes behind why the ASF "just works" https://blogs.apache.org/foundation/category/SuccessAtApache

Saturday September 21, 2019

Success at Apache: Why you'd want to become an Apache Committer

by Dmitriy Pavlov

All newbies in Open Source communities may sometimes think that they’ll never be able to become Committers. Many treat this role as a prestigious one, granted only for special feats, and after having written a ton of code. But not all things are so simple, and I hope my story will help you. 

Election as Apache Committer

My journey with The Apache Software Foundation began relatively recently, in 2015. I was working for Openway Group, and was enthusiastic about in-memory computing. I got to know about Apache Ignite at a local developers conference. I implemented the POC of a backend system based on Apache Ignite. I was so impressed with the clear API and documentation, and it was also very convenient that I could start prototyping without passing through the approval process. I suggested using the Apache product instead of a source-available solution. I met Konstantin Boudnik (cos), who helped me to understand the difference between Apache projects and source-available/closely-governed products.

Luckily for me, GridGain, the company that initially donated Ignite to the Apache Incubator, has a development center in my city (Saint Petersburg, Russia). I joined the GridGain team in 2017.

As part of my day job, I provide patches to the product. I actively joined the dev.list discussions (some fellows sometimes say “too actively”). I’ve created a number of wiki pages - ‘Apache Ignite-Under the hood’ to help developers understand product internals. Also, I developed ‘Direct IO’ plugin. I was elected as a Committer.

In 2018 I was concerned about reviews of patches from members of the community not affiliated with GridGain. Since I had a commit-bit now, I’ve started to review patches and ask others to review them, too. I don’t know for sure, but I suppose - these social achievements in the community development were a basis for me being elected as Project Management Committee (PMC) member. 

I asked several questions about The Apache Way on the Community Development (“ComDev”) dev list. I was very impressed by how friendly and welcoming they are. I very much like such a positive atmosphere, and feel it influences the success of Apache projects. Now I’ve also joined Apache Training (incubating) community as Committer and PPMC (Podling Project Management Committee) member.

Quite funny for a software developer with 17 years of experience… being elected as a Committer, that is to say, because of the social aspects and documentation. 

Who is a typical Committer and where does his or her strength lie?

When creating an Open Source product, we always let the users explore it in action -- as well as allow them to modify it and distribute modified copies. But when such modified copies are replicated in an unsupervised manner, we don’t get contributions into the main codebase and the project stalls. It’s here where we need exactly such a person – the Committer – someone who is authorized to merge user contributions into the project.

Why should you become a Committer?

First of all, being assigned to a Committer role is extremely motivating. The professional community acknowledges you and your work, and you clearly see the results of your work in action.

How different is that from some enterprise project -- where you have no idea why you must continually keep shuffling various XML fields?

The second pure advantage of being a Committer is an opportunity to connect with top professionals and also pull some cool ideas from Open Source into your own project. But, if you aren’t one of the top professionals, certainly don’t be afraid to join -- the community has various tasks for different folks.

Besides, being a Committer is a jewel in your CV --and even a greater plus for junior programmers, because at interviews you are often asked to show code samples. If you know an Open Source project well, a company supporting or using it will be happy to hire you. There are some people who will tell you that great positions are unreachable without first committing in Open Source.

There are some bonus goodies, too! For example, Apache Committers get an IntelliJ Idea Ultimate license for free (albeit with some limitations).

How do you become a Committer?

You should be committed to the project --it’s just that simple. Development, writing tests and documentation, and simply answering questions on lists are also good ways to start working towards committership.

Yeah, the contributions of QA engineers and technical writers in the community are valued no less than the developers’ contributions.

If you think there are no tasks for you on some project, you are wrong. Just join the community you are interested in and start working on its tasks. 

The Apache Software Foundation has this dedicated page that lists what contributions are needed.  

Committer — to be or not to be?

Committer activity is a good and useful endeavor, but one shouldn’t strive to become a committer per se. This status can be granted not only for code and it doesn’t justify your proficiency. 

Find a project you may be interested in: it will probably be a project whose software you already use. Dive into its code and say hi to the community; offer help, improve docs, complete a newbie ticket or answer to a user. You may just be surprised how welcoming and open folks are there.

Strive to gain the expertise (knowledge and experience) while researching the project, tweaking it and helping others to solve their problems, and, hopefully, enjoying collaborative development in an Open Source project.

Getting started at http://community.apache.org/ will help you on your way.

Dmitriy Pavlov is a Java developer enthusiastic about Open Source and in-memory computing. He is interested in system performance, information security, and cryptography. He created and donated utility for monitoring tests for Apache Ignite, and is a former Community Manager for Apache Ignite at GridGain. Dmitriy represents the Apache Ignite Project Management Committee (PMC) at local meetups in Russia. He runs workshops and training for Apache Ignite developers and users, and is a frequent speaker at meetups and conferences.

= = =

"Success at Apache" is a monthly blog series that focuses on the processes behind why the ASF "just works" https://blogs.apache.org/foundation/category/SuccessAtApache

Tuesday August 06, 2019

"Success at Apache: The Path To Berlin"

by Isabel Drost-Fromm

"A decade ago" - according to common wisdom for software engineering that's the equivalent of the stone age: The only constant is change - and rapid change at that.

This blog post is going to tell an entirely different story: One about what persistence, patience and continuous engagement can accomplish. One about what can happen when people are working together.

It was over a decade ago, in 2008, when I met with people interested in the then-hyped Apache Hadoop project to create a quarterly meetup on everything data analytics, text mining, scalable data storage. That was when the (Apache) Hadoop Get Together Berlin took place at newthinking store - a co-working space and event location before that format itself was turned into a business model http://blog.isabel-drost.de/posts/scaling-user-groups336.html.

A year later it was clear that an ApacheCon EU was unlikely to happen in Europe. When Simon Willnauer, Jan Lehnardt and myself approached The Apache Software Foundation about holding the event in Berlin - the kind people at Apache who did have experience with ApacheCon successfully stopped us: Given the baggage around the event, the trademark implication, the expectation that all sorts of different people had as well as the pure http://blog.isabel-drost.de/posts/how-hard-can-it-be-organising-a-conference.html logistics of creating an event that's bigger than 100 people, that was a safe and likely wise decision.

What they didn't achieve though was to stop us from running some event the following year: We at least wanted to give friends from the Big Data and search communities an excuse to make their employers pay for a trip to Berlin in summer: http://blog.isabel-drost.de/posts/my-highly-subjective-berlin-buzzwords-recap290.html. This was possible due to some lucky conditions we found ourselves in: Knowing conference organisers who were willing to share their know-how such as legal issues and boundaries with us Finding an event company (newthinking.de) that was supporting the idea.

Being employed by a company https://www.neofonie.de/ that saw value in sponsoring the event by allowing me to do so during my working hours.

While successful, part of the ASF community was still missing though. Fast forward several years to 2017, a new conference concept was born. Under the name of FOSS Backstage, we focus on the one topic that every project at Apache has to deal with: Governance, legal, security, economics https://blogs.apache.org/foundation/entry/success-at-apache-cookie-monster Issues that are not an Apache exclusive issue, but true for everyone - individuals as well as legal entities - involved in open source projects.

The only caveat: We had intentionally left all technical content out of scope for FOSS Backstage. For the data analytics crowd the event was conveniently co-located with Berlin Buzzwords. For the remaining content, Sharan Foga kindly volunteered for coordinating to run an Apache Roadshow alongside FOSS Backstage together with newthinking for two days after Berlin Buzzwords and in parallel to FOSS Backstage. With a name different than ApacheCon this left quite some room for experimenting beyond the traditional ApacheCon format.

Little over a year after that ApacheCon finally is on its way to the city of Berlin: With https://twitter.com/plainschwarz as event organisers, in collaboration with the Apache Software Foundation - with Myrle Krantz as event chair to coordinate between the ASF and the local event team.

In retrospect, the series of events was an interesting journey. There's a couple of lessons I've learned that carry over to open source software development - but also a few that are distinctly different.

1.Patches welcome - turn people that come with feature requests into active contributors

Instead of accepting feature requests from people, it helps to pull them in to submit their own patches: Early on there were requests for a barcamp, for a lightning talk session, for trainings. My response back then: Submit the idea through the CfP form, find someone to run it and we'll run it through the regular submission process adding it to the schedule if it fits.

For the trainings we went for a slightly different approach: Instead of directly offering them ourselves we reached out to established training providers suggesting to run with a co-location/ co-promotion approach.

For those that asked for free tickets we would turn them into helping hands - either on site, during setup or (in the first year) as local guides taking groups of up to twenty people out for dinner in a restaurant close by that they had selected.

For those that asked for more content on some specific topic we offered the option of organising a deep dive satellite event on Wednesday after the conference at one of the many companies willing to host these in Berlin.

In general we left a lot of room and freedom to those who wanted to get involved and add content to the event that they found missing.

2. Decisions are made by those doing the work

While feedback is important, there is a limit to what can feasibly be realised for any given conference budget. While we value feedback from anyone involved, the final decisions need to be taken by those actively contributing time to the event. As a result, that also means that not all feedback can always be taken into account - at least not right away, maybe at a later stage or in a different event, the consecutive year or just taken as an impulse to come up with new fresh ideas.

3. "It's done when it's done" is not an option

Conferences are slightly different from open source releases in that there are hard deadlines - in combination with a fixed budget coming in from attendees and sponsors and some hard features that cannot be postponed to the next release (unless you're organising a remote only conference, running without a venue is pretty much impossible.) That circumstance makes organisation slightly more prone to conflict than your average open source project: There's no cheap way to go down both paths and only at the very last minute decide which is better - or even offering both options.

4. Balancing public and private communication

At Apache we value public communication: Often having discussions in the open invites others to participate and shows where contributions are needed. When it comes to budget, ticket pricing, communication with sponsors, contracts including specific prices for venues this approach becomes a whole lot harder. Even though it helps to provide a dedicated mailing list for program committee members as well as interested attendees to get in touch.

It also helps to make some of the planning background public - either while discussions are ongoing, or at least after a conclusion has been reached: http://blog.isabel-drost.de/posts/berlin-buzzwords-scheduling-behind-the-scenes118.html (caveat: the algorithm has changed substantially since this was published, but the article did help answer a lot of questions.)

One downside to this mode of operation is that people who potentially could provide valuable insight or help out have no idea of what is going on. Another downside is that for the outside world it becomes invisible how large a team it takes to make the event successful. As a tiny fix for that we always tried to make the team involved publicly known.

5. Bringing people together

At Apache we have a tradition to work asynchronously - on archived, searchable, referenceable mailing lists. Without that way of working we wouldn't be able to build bridges across timezones, geographies, cultures and organisations. It wouldn't be possible to collaborate for people with wildly different time schedules. Despite all this hearing people speak when reading their texts makes it easier to understand their tone correctly. Despite all this there are topics that are best shared face to face only for deniability reasons. Despite all this, meeting someone in person who so far has been communicating only remotely with you can be a ton of fun. I hope that you will join that fun in October in Berlin: Looking forward to seeing you there!

Join us! ApacheCon Europe/Berlin 22-24 October 2019 https://aceu19.apachecon.com/

Isabel Drost-Fromm is a former board member of The Apache Software Foundation, co-founder of Apache Mahout and mentored several incubating projects. Interested in all things search and text mining with a thorough background in open source project management and open collaboration she is working Europace AG as Open Source Strategist. True to the nature of people living in Berlin she loves having friends fly in for a brief visit –- as a result she co-founded and is still one the creative heads behind both, Berlin Buzzwords, a tech conference on all things search, scale and storage as well as FOSS Backstage, a conference on all things Free and Open Source behind the scenes and how it interrelates with business and InnerSource. She is a member of the inaugural Open Source and Intellectual Property Advisory Group of the United Nations Technology Labs (UNTIL).

= = =

"Success at Apache" is a monthly blog series that focuses on the processes behind why the ASF "just works" https://blogs.apache.org/foundation/category/SuccessAtApache

Monday April 01, 2019

Success At Apache: Positively impacting the world one contribution at a time

by Dinesh Joshi

My journey with Apache began in 1999 with Apache httpd and Apache Tomcat. Apache httpd was the de facto webserver at the time on Linux and Tomcat was the most well known Java Servlet container. LAMP (Linux, Apache httpd, MySQL, PHP) stack was a fantastic combination. From that point on, I have always been a Apache user, successively exploring technologies like Apache Commons, Apache Storm, Apache Hadoop, Apache HBase, and Apache Cassandra. It has been a very dependable OSS brand. Being interested in Distributed Systems and Databases, I began exploring OSS databases and came across Cassandra.

In early 2018, almost 19 years after I was first introduced to Apache projects, I began actively contributing to Apache Cassandra source. I have always been passionate about Cassandra and used it during my Masters at Georgia Tech. Its distributed, shared-nothing model is amazing. So when I did get the opportunity to contribute to the Cassandra codebase, I decided to make the most of it. Over the past year I have contributed over 25 patches as an author and reviewed over 30 patches. Collaborating with various contributors in the community, we successfully proposed the very first CIP (Cassandra Improvement Process), a Cassandra Sidecar. We, the community, are now busy building it. I have contributed some interesting changes to Cassandra so that it is more reliable and can scale better viz. Zero Copy streaming and Zstd Compressor which have been featured on the Apache Cassandra blog and at various international conferences. This has generated new interest in Cassandra.

I fully credit the Cassandra community with enabling a new contributor like me to make meaningful contributions. It is an incredibly passionate community, with a lot of questions, answers and knowledge dominating the project JIRA board and mailing lists. As a new contributor it was incredible to see a lot of community interest in what I was contributing. The Sidecar specifically generated a lot of discussion and debate within the community and ultimately we achieved consensus, the Apache Way! Zero Copy streaming is something that big players like Netflix, Uber, etc were interested in. Contributors from Netflix took the initiative in testing and benchmarking it and posting the results on Jira. Getting your work into an Open Source project is one thing but it is humbling to see your work being actively evaluated by some of the biggest industry names. It is even more fascinating to me how people can overcome organizational boundaries to collaborate on a project, and how ideas are accepted, debated and implemented as a community ultimately making it better for everyone in the world. Given my contributions to the Cassandra community, recently the PMC voted me in as a Committer which will help me bring in more contributions from the community as well as help mentor others to join in and contribute!

My goal with contributing to Cassandra was to give back to the community the knowledge & expertise that I have gained over the years building some of the most scalable systems in the world. I have found great mentors along the way who have helped me achieve that goal. It is incredible to see the impact we have on the world through Apache projects such as Apache Cassandra. 

Cassandra is used at some of the biggest organizations in the world for mission critical applications and changes like Zero Copy streaming (CASSANDRA-14556) or Zstd Compression (CASSANDRA-14482) will have a significant impact on many large businesses and more importantly people’s lives. Specifically Zero Copy Streaming in Cassandra allows the database to recover from a failed node several times faster than existing stable version of Cassandra. In addition, it also lowers the amount of resources that are required by the streaming process. Therefore, an organization running large installations of Cassandra can see a meaningful reduction in MTTR (Mean Time to Recovery) as well as reduce the spare server pool capacity that they need to maintain. This lowers the TCO (Total Cost of Ownership) for Cassandra. Zstd Compression is a new lossless compression scheme that offers better compression ratios over existing LZ4 Compression that is used within Cassandra with comparable compression speed. It can reduce storage needs by up to 40% depending on the characteristics of your dataset. Again, this not only reduces the expenses but also requires fewer servers to store data. As a result you are not only saving money but in a way saving the planet by using fewer servers.

I also believe that being a Open Source contributor is not just about code contributions. Contributions come in various forms and one of them is documentation. Seeing how Cassandra’s documentation is not updated, I proposed Cassandra for Google Season of Documentation to improve it. I also have been invited to talk about Cassandra at various conferences across Asia, Europe and North America. So far, in the past year, I have spoken about Cassandra at 9 conferences. It is great to engage with the user community at large which is very passionate and excited about Cassandra. This is one of the most important aspects of community contributions because you get to talk to your users first hand. It also generates interest in the project and is key to getting new contributors for your project.

In summary, this is impossible without having a great, supportive community which is the whole point of the ASF – to build great communities that foster collaboration making the world better one contribution at a time.

Dinesh Joshi is a Senior Software Engineer and a Committer on the Apache Cassandra project. He has a Masters in Computer Science (Distributed Systems & Databases) from Georgia Tech, Atlanta. In the past, Dinesh was a Principal Software Engineer at Yahoo building real time distributed systems for Yahoo’s Finance Web, iOS & Android apps. He is also an international speaker and regularly talks about Apache Cassandra and Databases. In his spare time, he volunteers as a mentor for Women Who Code.

# # #

"Success at Apache" is a monthly blog series that focuses on the people and processes behind why the ASF "just works". https://blogs.apache.org/foundation/category/SuccessAtApache

Tuesday March 26, 2019

Success at Apache: What You Need to Know

EDITOR'S NOTE: I came across the author's original post, "An Introduction to Apache Software — What you need to know", dated 3 February 2017, and was interested in finding away to share with the greater Apache community. The author's enthusiasm was palpable, and earnestly intended to help educate others. With the ASF celebrating its 20th Anniversary this year, it's easy for many of us to simply rely on tribal knowledge, not realizing that navigation to definitive guides aren't intuitive to newcomers. Those of us who have been here for a while "just know", partially because we were creating it as we went along. Below is an updated version of the original post, amended through the guidance of three long-standing ASF Members. And that's the point of it all at the end of the day: at Apache, we help each other as it contributes to our collective success, and this writeup will help others find their Success at Apache. 

by Maximilian Michels

Before you started reading this post, you have already been using Apache software. The Apache web server (Apache HTTP Server) serves about every second web page on the WWW, including this website. One could say, Apache software runs the WWW. But it doesnt stop there. Apache is more than a web server. Apache software also runs on mobile devices. Apache software is part of enterprise and banking software. Apache software is literally everywhere in today's software world.

Apache has become a powerful brand and a philosophy of software development which remains unmatched in the world of open-source. Although the Apache® trademark is a known term even among the less tech-savvy people, many people struggle to define what Apache software really is about, and what role it plays for today's software development and businesses.

In the last years I've learned a lot about Apache through my work on Apache Flink and Apache Beam with dataArtisans, as a freelancer/consultant, and as a volunteer. In this post I want to give an overview of the Apache Software Foundation and its history. Moreover, I want to show how the "Apache way" of software development has shaped the open-source software development as it is today.

The History of the Foundation

The Apache Software Foundation (ASF) was founded in 1999 by a group of open-source enthusiasts who saw the need to create a legal entity to institutionalize their work. Among the first projects was the famous web server called Apache HTTP, which is also simply referred to as "Apache web server". At that time, the Apache web server was already quite mature. In fact, not only did the Apache web server give the foundation its name but it became the role model for the "Apache way" of open and collaborative software development. To see how that took place, we have to go back a bit further in time.

A Web Server goes a long way

As early as 1994, Rob McCool at the National Center for Supercomputing Applications (NCSA) in Illinois created a simple web server which served pages using one of the early versions of today's HTTP protocol. Web servers were not ubiquitous like they are today. In these days, the Web was still in its early days and there was only one web browser developed at CERN where the WWW was invented only shortly before. Rob's web server was adopted quite fruitfully throughout the web due to its extensible nature. When its source code spread, web page administrators around the world developed extensions for the web server and helped to fix errors. When Rob left the NCSA in late 1994, he also left a void because there was nobody left to maintain the web server along with its extensions. Quickly it became apparent that the group of existing users and developers needed to join forces to be able to maintain NCSA HTTP.

At the beginning of 1995, the Apache Group was formed to coordinate the development of the NCSA HTTP web server. This led to the first release of the Apache web server in April 1995. During the same time, development at NCSA started picking off again and the two teams were in vivid exchange about future ideas to improve the web server. However, the Apache Group was able to develop its version of the web server much faster because of their structure which encouraged worldwide collaboration. At the end of the year, the Apache server had its architecture redone to be modular and it executed much faster.

One year later, at the beginning of 1996, the Apache web server already succeeded the popularity of the NCSA HTTP which had been the most popular web server on the Internet until then. Apache 1.0 finally was released on Dec 1, 1995. The web server continued to thrive and is still the most widely used web browser as of this writing.

The Rise of the Foundation

The team effort that led to the development and adoption of the Apache web server was a huge success. The Apache project kept receiving feedback and code changes (also called patches) from people all over the world. Could this be the development model for future software? More and more projects started to organize their groups similarly to the Apache group. As the number of project grew, financial interests arose and potential legal issues threatened the existence of the Apache group. Out of this need, the Apache Software Foundation (ASF) was incorporated as a US 501(c)(3) non-profit organization in June 1999. In the US, the 501(c)(3) is a legal entity specifically designed for non-profit charitable organizations. This is in contrast to other for-profit open-source software organizations or even US 501(c)(6) non-profit organizations which do not require to be charitable.

After the ASF was incorporated, new projects could easily leverage the foundation's services. Over the next year, every few months a new project entered the ASF. The first projects after Apache HTTP Server were Apache mod_perl (March 2000), Apache tcl (July 2000), and Apache Portable Runtime (December 2000). After a short break in 2001 which was used to come up with a programmatic approach to onboard new projects via an incubator, the ASF has seen very consistent growth of up to 12 projects (2012) each year.

The ASF became a framework for open-source software development which, in its entirety, remains unmatched by other forms of open-source software development. The secret of ASF's success is its unique approach to scaling its operations, in which the foundation does not try to exercise control over its projects. Instead, it focuses on providing volunteers with the infrastructure and a minimal set of rules to manage their projects. The projects itself remain almost autonomous.

Apache Governance - How does the foundation work?

There are about 200 independent projects running under the Apache umbrella. The question may arise, how does the foundation govern its projects? First of all, the ASF is an organization that is run almost entirely by volunteers. In the early days, many of the volunteers were developers which did not like to spend much time with administrative things (who does?), so the organization is structured in a way that requires little central control but favors autonomy of the projects which run under its umbrella.

Per-Project Entities

For every project (e.g. Apache HTTP, Apache Hadoop, Apache Commons, Apache Flink, Apache Beam, etc.), there are a Project Management Committee (PMC), Committers, Contributors, and Users.

Project Management Committee (PMC)

The PMC manages a project's community and decides over its development direction. Its most rudimentary and traditional role is to approve releases for a project. In that sense it has a similar function as the original Apache Group which led the development of Apache HTTP Server. When a new project graduates from the Incubator (covered later), the foundation's central instance, the Board, approves the initial PMC which is selected from the PPMC (Podling PMC) formed during incubation. Each PMC elects one PMC member as the PMC Chair which represents the project and writes quarterly reports to the ASF Board. The Chair needs to be approved by the Board.

Through a project's lifetime new PMC members can be elected by the existing PMC. Note that each new PMC member needs to be approved by the Board but this approval is merely formal and there are few instances that a new PMC member is not approved. PMC members do not need the formal permission of the foundation to elect new Committers. PMC members themselves are also Committers. Let's learn about Committers next.


Committers can modify the code base of the project but they can't make decisions regarding the governance of the project. They are trusted by the PMC to work in the interest of the project. When they contribute changes, they commit (thus, the name) these changes to the project. Committers don't only change code but they can also update documentation, write blog posts on the project's website, or give talks at conferences. Committers are selected from the users of the project; more about this process in the Meritocracy section.

Users and Contributors

Users are as important as the developers because they try out the project’s software, report bugs, and request new features. The term is a slightly confusing because, in the Apache world, most users tend to be developers themselves. They are users in the sense that they are using an Apache project for their own work; usually they are not actively developing the Apache software they are using. However, they may also provide patches to the Committers. Users who contribute to a project are called Contributors. Contributors may eventually become Committers.

In the image, the per-project entities are represented as circles. They exist for every project. Note that the user group circle is not depicted in full size because big projects tend to have much more Users than Committers and PMC members.

Foundation-Wide Entities

The ASF does not work without some central services. Here are the most important entities:

Apache Members

Apache members represent the heart of the foundation. They have been referred to as the "shareholders of the ASF" because they are deeply invested in the ASF (not in the financial sense). A prerequisite to becoming a member is to be active in at least one project. To become a member, you have to show interest in the foundation and try to promote its values. The ASF holds membership meetings which are usually held annually. At membership meetings new members can be proposed and subsequently elected. Elected members receive an invitation which they can choose to accept within 30 days. Becoming a member it not merely a recognition for supporting the ASF, but it also grants the right to elect the Board.

The Board of Directors (Board)

The Board takes care of the overall government of the foundation. In particular, it is concerned with legal and financial matters like brand and licensing issues, fundraising, and financial planning. The board is elected by the Apache members annually and is also composed of Apache members. The current board can be viewed here. Note that there is only one central Board for the entire foundation but Board members can be PMC members in different projects.

Officers of the corporation

The Officers of the corporation are the executive part of the administration. They execute the decisions of the board and take care of everyday business. Most of the officers are implicitly officers by being the PMC chair of a project. Additionally, there are dedicated officers for central work of the foundation, e.g. fundraising, marketing, accounting, data privacy, etc.

Infrastructure (INFRA)

The support and administration team (INFRA) is the team that runs the Apache infrastructure and provides tools and support for developers. INFRA is the only team at Apache which consists of contractors which are paid for their work. Their work includes running the apache.org web site and the mailing lists which are Apache’s main way of communication. Over time, various other tools and services were created to assist the projects. The main tools available which are used by almost all projects are:

  • Web space for the project's websites.
  • Mailing lists, for discussing the roadmap of the project, exchanging ideas, or reporting bugs (unwanted software behavior). Typically the mailing lists are divided into a developer and a user mailing list.
  • Bug trackers, which help developers to keep track of new features or bugs.
  • Version control, which helps developers to keep track of the code changes.
  • Build servers, which help to integrate/test new code or changes to existing code.

The Incubator

Founded in 2002, the Incubator is a project at the ASF dedicated to forming (bootstrapping) new Apache projects. The process is the following: People (volunteers, enthusiasts, or company employees) make a proposal to the Incubator. The proposal contains the project name, the list of initial PPMC (Podling PMC) members, and the motivation and goals for the new project. Once the IPMC (Incubator PMC) has discussed the proposal, it holds a vote to decide if the project enters the incubation phase. In the incubation phase, projects carry "incubating" in their names, e.g. "Apache Flink (incubating)"; this is dropped only once they graduate. To graduate, a project has to show that it is mature enough. The Community Development project at the ASF has created a catalogue of criteria called the Maturity Model. It requires having an active community, quality of code, and being legally compliant. Formally, the project needs to prove it fulfils the criteria to the Incubator IPMC which is comprised of Apache members. All existing work donated in the course of entering the incubator and all future work inside the project has to be licensed to the ASF under the Apache License. This ensures that development remains in the open-source according to the Apache philosophy. More about incubation on the official website.

Meritocracy - How are decisions made?

The Apache Software Foundation uses the term "meritocracy" to describe how it governs itself. Going back to the ancient Greeks, meritocracy was a political system to put those into power which proved that they were motivated, put effort into their work, and were able to help a project. The core of this philosophy can be found throughout history from ancient China to medieval Europe and is still present in many of today’s cultures in the sense that effort, increased responsibility, and service to a part of society ought to pay off in terms of power of decision, social status, or money.

Meritocracy in the Apache Software Foundation denotes that people who either work in the interest of the foundation or a project get promoted. Users who submit patches may be offered Committer status. Comitters who are drive a project, may gain PMC status. PMC members active across projects and taking part in the foundation's work may earn the Member status.

Decision-making within the foundation and projects are typically performed using Consensus. Consensus can be "lazy" which implies that even a few people can drive a discussion and make decisions for the entire community as long as nobody objects. The discussions have to be held in public on the mailing list. For instance, if a Committer decides to introduce a new feature X, she may do so by proposing the feature on the mailing list. If nobody objects, she can go ahead and develop the feature. If lazy consensus does not work because an argument cannot be settled, a majority based vote can be started.

Meritocracy and "lazy" Consensus are the core principles for governance within the Apache Software Foundation. Meritocracy ensures that new people can join those already in power. "Lazy" Consensus creates the opportunity to split up decision-making among the group such that it doesn't always require the action of all members of the community.

The Apache License - A license for the world of open-source

With the incorporation of the foundation in 1999, a license had to be created to prevent conflicts with the intellectual property contributed by others to the ASF. Originally, the license was meant to be used exclusively by the ASF but it quickly became one of the most widely used software licenses for all kinds of open-source software development.

The Apache license is very permissive in the sense that source code modifications are not required to be open-sourced (made publicly available) even when the source code is distributed or sold to other entities. This is in contrast to “Copyleft” licenses like the GNU Public License (GPL) which, upon redistribution, requires public attribution and publication of changes made to the original source code. The Apache license was first derived from the BSD license which is similarly permissive. The reason for this was that the Apache HTTP Server was originally licensed under the BSD license.

The current version of the Apache License is 2.0, released in January 2004. The changes made since the initial release are only minor but they set the prerequisite for its prevalence. At first, the license was only available to Apache projects. Due to the success of the Apache model, people also wanted to use the license outside the foundation. This was made possible in version 2.0. Also, the new version made it possible to combine GPL code with Apache licensed code. In this case, the resulting product would have to be licensed under the GPL to be compatible with the GPL license. Another change for version 2.0 was to make inclusion of the license in non Apache licensed projects easier and require explicit patent grants for patent-relevant parts.

Apache Today

The ASF today is not the small group that it used to be back in 1999. At the time of this writing, the Apache Software Foundation hosts 51 podlings in the Incubator and 199 top-level committees (PMCs). This amounts to almost 300 projects (latest statistics). Note that, a PMC may decide to host multiple projects if necessary. For instance, the Apache Commons PMC has split up the different parts of the Apache Commons library into separate projects (e.g. CLI, Email, Daemon, etc.). 50 of the 300 projects have been retired and are now part of Apache Attic, the project which hosts all retired projects. The above graph is taken from https://projects.apache.org.

Apache Conferences

The Apache Software Foundation regularly organizes conferences around the world called ApacheCon. These conferences are dedicated to the Apache community or certain topics like Big Data or IoT. It is a place to meet community members and learn about the latest ideas and trends within the global Apache community. Apart from the official conferences, there are conferences on Apache software organized by companies or external organization, e.g. Strata, FlinkForward, Kafka Summit, Spark Summit.

Here's a list of some projects that I came across in the past. I grouped them into categories for a better overview. I realize you might not know a lot of the projects but maybe this list can be the starting point to discover more about these Apache projects :)

Big Data

  • Hadoop
  • Flink
  • Spark
  • Beam
  • Samza
  • Storm
  • NiFi
  • Kafka
  • Flume
  • Tez
  • Zeppelin


  • Mesos
  • CloudStack
  • Libcloud

Machine Learning

  • Mahout


  • OpenOffice


  • CouchDB
  • HBase
  • Zookeeper
  • Derby
  • Cassandra

Query Tools / APIs

  • Hive
  • Pig
  • Drill
  • Crunch
  • Ignite
  • Solr
  • Lucene

Programming Languages

  • Groovy


  • Bigtop
  • Ambari


  • Commons
  • Avro
  • Thrift
  • ActiveMQ
  • Parquet

Developer Tools

  • Ant
  • Maven
  • Ivy
  • Subversion

Web Servers

  • HTTP (the one!)
  • Tomcat

Web Frameworks

  • Cocoon
  • Struts
  • Sling

Apache - A Successful Open-Source Development Model

My first attempt to learn more about Apache goes back several years. I was using the Apache License while working on Scalaris at Zuse Institute Berlin. I realized that the license was somehow connected to the Apache Software Foundation but I didn't really understand the depth of this relationship until I started working on Apache Flink with dataArtisans. Besides the official homepage of the foundation, relatively little information was available on the Internet about the foundation and its projects. In hindsight, the best source of information was to read the email archives, get to know other people at the ASF, and become a volunteer myself :)

When I originally wrote this post I couldn’t find an introductory guide to the ASF. So I decided to do a bit of research myself and tried to write down what I had learned working on Apache projects. I hope that I could provide an overview of the ASF and show you how significant the foundation has been for the open-source software development.

Thank you

Thank you for reading this article. Feel free to write me an email if I got something wrong or you would like to comment on anything.

Thank you Roman Shaposhnik, Shane Curcuru, Dave Fisher, and Sally Khudairi for your comments which were very helpful to revise this post for the 20th anniversary of the ASF.


 # # #

"Success at Apache" is a monthly blog series that focuses on the people and processes behind why the ASF "just works". https://blogs.apache.org/foundation/category/SuccessAtApache

Monday March 18, 2019

Apache Software Foundation Platinum Sponsor Profile: Leaseweb

with Robert van der Meulen, Global Product Strategy Lead at Leaseweb

Robert is Global Product Strategy Lead at Leaseweb. Fascinated by technology, Robert studied computer sciences, and after his studies, he delved into the then relatively young and rapidly developing internet technology. He soon understood that the internet would be at the center of almost everything we do and wanted to be part of it. Robert is passionate about using technology to improve people's lives. He contributed to the Debian project as a developer later introduced Apache CloudStack in Leaseweb and has been active in the open source community for quite some time. During his 9 years at Leaseweb, he worked hard to make sure digital transformation, from how we communicate to how we do business, is part of the company mission. Follow @Leaseweb on Twitter.

"Many Apache projects are being built by – mostly – volunteers and motivated individuals, and the world can use, change and develop all of those. It's important to support the people that make this possible."

How did Leaseweb's work with Open Source begin?

There's a long history of open source within Leaseweb. When you do large-scale hosting, open source operating systems, tools, and applications are always pretty much part of your product – and working on open source projects brings mutual benefits. This can mean running a mirror for your customers and "the outside world", fixing and reporting bugs, or helping with actual features or changes to "products" you use. As our services portfolio grew, we started using and contributing to more open source projects, especially when Cloud was becoming a bigger part of the portfolio – bringing the need for more middleware and (platform) management.

Why Apache? How is Leaseweb involved with the ASF, and for how long?

If you count the Apache HTTP Server predating the ASF, we've probably been using ASF software in one way or another for as long as we exist – which is incidentally pretty close to the ASF (we celebrated our 20th anniversary about 2 years ago). Our contributions and use of ASF projects grew significantly when Apache CloudStack became part of the portfolio. CloudStack being open source and managed by the ASF after cloud.com and Citrix adventures gave us a nudge to start using it more. I'd say CloudStack is a significant part of our Cloud portfolio right now – we run large deployments all over the world, often supporting critical customer applications.

Why is support for foundations such as the ASF important? How does helping the ASF help Leaseweb?

Support for foundations such as the ASF is important because those foundations are important :-) . Any big open source project at some point needs the infrastructure to continue to run – and it's great if a project can rely on an organization like the ASF for that infrastructure so the focus can be on making the project great. Open source projects can grow and be more successful if they can more easily deal with governance, financials and administration, as well as tangible infrastructure and tools. Helping an organization like the ASF helps the ASF projects all over, which has an impact on the software we use as part of our products.

What sets the ASF apart from other software foundations or consortia?

The simple distinguishing factor is the size of the ASF. There's a vast number of projects which are part of the ASF, and most of those are significant in the free software "portfolio". Many of those components and projects are used in a less visible way, being part of the hard-core infrastructure of the Internet – but without them, many tools and devices we use would not be able to function. What I like about the ASF, is that it embodies the open source spirit by valuing consensus and being an enabler for the projects that are part of the Foundation. 

What does "The Apache Way" mean to Leaseweb? What makes The Apache Way special?

There's overlap between The Apache Way and the values we try to stick to in Leaseweb. 

Important ones are Merit, Open and Consensus – in part due to our backgrounds. We value facts, deeply dislike assumptions, and like to make decisions based on proper motivation and data. Growth makes people happy and gives great results, so we like people to do more of what they're good at – and to get even better at it. This also means being open to new opinions and ideas – executing on them because they make sense, not because where they come from. I think those values are recognizable in many open source communities as well as in The Apache Way.

Do you have any guidelines for promoting innovation? There is no limit with Open Source: how do you stay focused?

Open source projects are often tools for us or part of a product offering or service – which means we'd look at what tools fit best to get to where we want to be. Innovation is more a natural part of that, as we need it to continue to offer the right services to our markets. It's pretty much impossible to do that without constantly innovating and adding new features and products. Focus is a more difficult one; a large part of that focus is driven by our target markets and customers, so listening to that market and those customers to figure out what they want is super important. That information, along with keeping a close eye on market trends, gives us direction and focus from a product and services perspective. From a more technical angle, great engineers are always looking for ways to make their systems and services better :-) . Lots of input (and obviously the execution!) comes from the product teams – either by constant optimization and small changes, or by bigger business cases or ideas that can be executed. 

How does Apache fit into Leaseweb's long-term strategy/plans?

A number of our leading Cloud products are based on Apache software. We use Apache CloudStack for various private cloud and VPS offerings, and those platforms are continually growing and evolving – and we keep adding more with most of the new locations we open. Along with the CloudStack platforms, hosting environments obviously have many deployments using Apache web servers. Within our technical teams we consume lots of different Apache projects and actively contribute to a number of them (we have a dedicated CloudStack team that includes one of the Apache CloudStack PMC members). Every software solution has its limits, and obviously this goes for CloudStack too – but also we're happy we can change or help change the things that could be better. 

Money is just one way to support the ASF. How else do you contribute? What recommendations do you have for others to participate?

An important part of the ASF support is the platform we provide; it's obvious that The Apache Software Foundation would run their infrastructure on Apache CloudStack, but I'm happy the Leaseweb team and infrastructure is delivering and maintaining that particular CloudStack setup. There's a sense of pride; the people building the software are trusting our services to do part of that on. Next to that, it proves to me we provide a set of services and products we can be proud of and one that fits the requirements of a bunch of pretty hardcore Cloud developers.

How does it feel to be able to offer this level of support?

It feels great! It's important to, if you have the opportunity, give something back. Many Apache projects are being built by – mostly – volunteers and motivated individuals, and the world can use, change and develop all of those. It's important to support the people that make this possible.

Are there any other thoughts on the experience of being a large-scale donor that you would like to share? What else do we need to know?

Not much. I personally really enjoy seeing what happens with the support we provide – what projects it makes possible, what things it makes more easy or better. Tangible insight in the results is a big motivator as well as a proof point. 

# # #

Sponsors of The Apache Software Foundation such as Leaseweb enable the all-volunteer ASF to ensure its 300+ community-driven software products remain available to billions of users around the world at no cost, and to incubate the next generation of Open Source innovations. For more information sponsorship and on ways to support the ASF, visit http://apache.org/foundation/contributing.html . 



Hot Blogs (today's hits)

Tag Cloud