Entries tagged [just]
Success at Apache: Security in Practice
by Jarek Potiuk
This post is about the Apache Software Foundation's Security process and security mindset of the Apache Software project’s PMC put to the best use in practice. From this post you can learn why security practices we apply at our projects are important and how they work when they are applied correctly and when the right security-driven mindset is applied by the PMCs but also how important it is for the users of the Apache Software Foundation projects to keep their software updated - including latest security fixes.
The idea of this article was triggered by a recent blog post of the security researcher Ian Caroll that has earned USD 13.000 on bug bounties by simply following up the results of Apache Security process applied by the Apache Airflow PMC. This saved quite a few businesses a lot of trouble, but it was only possible due to the foundations laid down by the ASF and the PMC of the project.
Here is what Ian Caroll has to say about it: “This issue was a great example of how ASF's transparent way of fixing and disclosing vulnerabilities worked to protect users of their software, and gave many organizations a wake-up call on ensuring they upgrade and protect their open-source software.”
Apache Airflow is one of the most common orchestration software used in the industry currently, and due to its nature, it sounds like an important vector of attack - if you run it internally in your company, you are likely to interact with pretty much all your systems, and if you manage to break in through Airflow, it might cascade into as many systems you connect to. Therefore the Apache Airflow PMC takes security very seriously. So seriously that we have the whole discussion panel about Apache Airflow Security at the Airflow Summit that is coming soon - July 8-16th.
This post's main point is to show how important it is to follow the security best practices for all the software lifecycle and how important it is to think about it at every step of building and releasing the software (and beyond).
Let's start from the very beginning: making sure the code development process is secure. Like most of the ASF projects, the Apache Airflow project is developed in GitHub and together with a growing number of projects we use GitHub Actions to run continuous integration. There are a number of best practices and security hardening practices published by Github that you should follow when you run your CI with GitHub Actions, and we rigorously follow them, including monitoring of the "Security blog of GitHub" and following it’s advisories.
And we have not stopped there. We actively think and discuss the potential security threats and ways how - for example supply chain attacks can be performed on our project, and we share our findings at the discussion mailing lists of the ASF and introducing recommendations for all ASF projects to make use of the best practices. One of the results there is documenting the practices and sharing them at the email@example.com. But we also raised a few security issues to GitHub and as a result of that (at least that’s the feedback we got from GitHub) they implemented some improvements that we apply in practice. The recent example of that is a change implemented by GitHub to allow control of permissions of the GitHub Token used during the CI build which resulted in this PR. Few months ago, we raised concern that having the blanket "write" permission is quite dangerous, and GitHub responded and implemented the change, which allowed us to limit the scope of tokens used for our builds and increase protection against a wide range of attacks - with the supply-chain attacks being recently the most prominent ones, leading to ransomware threats and millions of dollars paid to hackers.
This is where the security mindset for the Apache Airflow PMC starts with and this lays the foundation for the next steps where the Apache Software Foundation takes a crucial role in - releasing the software and monitoring for security vulnerabilities. The ASF has a rather well established process for disclosing and following up with security vulnerabilities for the ASF projects. One that is very straightforward and simple to follow for everyone involved - starting from security researchers, who raise those issues, going through the voluntary (!) security team of the ASF that has to handle (from the upcoming annual report) 387 reports of possible vulnerabilities spanned across 95 of the top level ASF projects, which led to 155 CVEs (Common Vulnerabilities and Exposures) assigned, and end up with the PMC that has to handle solving the issues and follow up with reporting. Heck, ASF even introduced an internal portal to report and keep track of all the CVEs as well as report the yearly security summary report and video.
This process is very clear about responsible disclosure and publishing the vulnerabilities, the way how security researchers, the ASF security team and PMC can collaborate when security is discovered. Quite a recent experience there was discovering and announcing CVE-2021-29621: User enumeration in database authentication in Flask-AppBuilder. This issue was reported to the ASF - following the process - by Dolev Farhi he responsibly disclosed it together with proof-of-concept reproducible scenario that allowed us to quickly verify that the issue exists and (more importantly) that allowed us to verify that the issue is fixed when we fixed it.
At the end of the process this is the message we got from Dolev: "Truly enjoyed working with you. Thanks so much for your help in bringing this to closure and making Airflow what it is."
The CVE was an interesting one because it was not an issue with the Airflow code, but it was introduced by a dependency of Airflow - Flask-AppBuilder. Fortunately the process is built in the way that we can involve and collaborate with other projects in solving it, and we got excellent support from Daniel Gaspar. We tried and tested the fix locally, provided it to Daniel which let Daniel quickly implement it and release a new version of Flask AppBuilder fixing it. This was also important for the Apache Superset project (Daniel is a PMC there as well) which also uses Flask-AppBuilder and suffered from the same vulnerability. This shows how security is a distributed issue and how much cooperation is important and how much a good security process should embrace it. I truly enjoyed cooperation with Daniel, and Dolev as we helped to test release candidate of Flask AppBuilder. Later on, when the CVE was published, we announced it following the regular announcement process.
Here is what Daniel has to say about it: "A great example of multiple open source projects working together, elevating each other to higher quality. The whole is greater than the sum of the parts. Got a clear report with a proposed fix, reproducible steps all backed by the ASF security process, it was a breeze to fix and release."
This leads to the most important point. We can do only as much as we can when it comes to developing and releasing our software. But then it’s up to our users to upgrade to the latest versions. If they don’t, they remain vulnerable. This was the actual reason for the blog post I mentioned initially - despite announcing a CVE-2020-17526 and releasing a fixed version a long time ago, many of our users did not follow the announcements and did not upgrade to the latest version of Airflow. I must stress here the importance of this step - as long as our users do not upgrade to fixed versions, there is not much we can do to help them. It's all in our users' hands! This time it ended up with just USD 13.000 paid to Ian in the form of bounties, because Ian is a responsible security researcher (so called "white hat"). But imagine some bad characters doing the same thing Ian did.
Of course we understand that this might sometimes be difficult to migrate to newer versions of a software, but here we also have another solution that we applied last year, and one that might seem surprising at first, but makes perfect sense when you look at the consequences. Consistent versioning and release support predictability. When we announced Airflow 2.0 last year, there was a small but important change we introduced - full support for Semantic Versioning which we follow rigorously since. We also published a predictable version lifecycle. Why is this important ? Because the users might be pretty sure that they can safely upgrade “patchlevel” version of Airflow when it gets released without even thinking about potential migration problems. Also when you release the "feature" - minor version of Airflow, we promise it is backwards-compatible and even if the migration process might be a bit longer, they can apply it without worrying about spending a lot of time for the migration of their DAGs (DAGs are the users workflow definitions that some of our customers have many thousands of as their entire data processing is orchestrated by Airflow).
We also publish (and will continue to) the support schedule for our major releases, so that the users can be prepared and plan migration to new major releases in advance. As with all software we sometimes will implement backwards-incompatible changes which will cause our users to spend more time on migrations. Those old releases will stop receiving security fixes at some date and the best you can do as a user is to migrate to the supported version before the date!
Which leads to the last and most important point in this article. If you are a diligent reader and look at the announcement I mentioned above for CVE-2021-29621, you will see that the fix for that is only released for Airflow 2 series. Why? Because Airflow 1.10 just reached its end-of-life on June 17th 2021. When we released Airflow 2, half a year ago, we agreed in the community that we will only support Airflow 1.10 with critical/security fixes for 6 months. And we did - for example the CVE-2020-17526 has been addressed in the Airflow 1.10.14.
But this time is over now. This is the first security vulnerability that we addressed only for Airflow 2. If you are still using Airflow 1.10 - you are on your own now. You are no longer protected by the security process of the ASF, the security team of ASF and airflow PMC. What’s even more - security researchers who raise the issues, even if they find it, might not be eager to responsibly disclose it, knowing also that the issue will not be fixed anyway. When you read about the next ransomware attack and millions of dollars paid, think if you would like one day your company to face this kind of dilemma. Even if it costs time and money to keep your software updated, preventing this kind of problem is far cheaper than dealing with the consequences of such an attack.
Upgrade NOW! to the latest release of Airflow 2 and keep on doing it for the future releases!
Be sure to join us at Airflow Summit online 8-16 July https://airflowsummit.org/ --registration is free and open to all.
# # #
Jarek Potiuk started to work on the Apache Airflow project in September 2018. He became an Apache Airflow committer in April 2019 and a member of the Apache Airflow Project Management Committee (PMC) in October 2019. He was elected an ASF Member in April 2021. He is an Apache project mentor in Outreachy and Google Summer of Code and was a mentor in Google Season of Docs. Jarek is an independent Open Source Contributor and Advisor and always keen on making it easier for people with different backgrounds to join OSS projects. = = = "Success at Apache" is a monthly blog series that focuses on the processes behind why the ASF "just works" https://blogs.apache.org/foundation/category/SuccessAtApache
Posted at 04:52PM Jun 23, 2021 by Sally Khudairi in SuccessAtApache | |
Sponsor Success at Apache: The Fork
by Wei Zhou
I joined the Apache CloudStack community in 2012 and became a committer in 2013, eventually becoming a PMC (Project Management Committee) member in 2017. My journey to becoming a PMC was both physical and literal, and included several forks in the road. The forks presented themselves in all aspects of my journey – although the literal forks came later, mainly because my journey began in China.
In 2010 I was working at China Mobile, the world's largest mobile network operator, in Beijing as the manager of a cloud project based on OpenNebula (another Open Source project). A year later, my partner received her PhD in the Netherlands and began working in Belgium, so I started looking for new work opportunities in the area.
In 2012 I visited Europe and had a few interviews, but it was difficult at the time as my English was quite poor.
I was committed to finding a good job and moving to Europe, which meant I needed to improve my English quickly. I studied the language, left China Mobile, and moved to Belgium permanently. It took me seven months to became fluent in English. I re-interviewed at the companies that had rejected me seven months prior and landed a position in Amsterdam at Leaseweb as a Cloud Innovation Engineer. At that time, we had two public cloud platforms based on Apache CloudStack. I was mainly working on the research of Apache CloudStack.
In the first two months I fixed some bugs we found in our productions. Thanks to the CloudStack community, I also received tons of help as I began contributing my changes to the mainstream. In 2013 was invited to be a committer, 3 months after my first submission. It was a huge surprise and a massive honor for me, and I began pushing my changes for new features and bug fixes much more quickly.
A year later, in 2014, Leaseweb released its first private cloud based on Apache CloudStack. It was very welcomed by many customers. As this was happening, we began finding issues with CloudStack as customers were requesting more features and functionalities. The same year, Apache CloudStack moved code repositories to GitHub.com and started using GitHub pull quests to review and merge commits. While all commits should be reviewed and approved by other commits before they are merged into the mainstream, we had already made many changes at Leaseweb and could not wait for next release. Because of this we created our own fork containing all our changes and bug fixes.
We developed very quickly, and our process was much faster than the review/merge process of Apache CloudStack. The gap between our fork and the community was getting bigger and bigger. When we decided to upgrade from CloudStack 4.2.1 to CloudStack 4.7.1, we had to spend half of a year just to port all of our changes in our fork based on CloudStack 4.2.1 to new fork based on CloudStack 4.7.1. The same problem happened again when we tried to upgrade to CloudStack 4.14, and we had to spend around one year to port all of our changes. The lesson we learned from these two upgrades was that we needed to contribute more to the community and maintain a fork as small as possible. After realizing this, we contributed all of our features and bug fixes to the community by creating many GitHub PRs. Some PRs have now already been merged into mainstream, while others are still in review.
My colleague recently asked me, “If you could go back in time, would you still make the Leaseweb fork?” My answer is yes, I would do it again. A fork makes us more flexible, as we can offer more stable production and more functionalities to our customers. However, if I could go back in time, I would have spent much more time contributing our changes to the community. I’ve learned that the gap between the fork and the community should be less than 100 commits.
We learned so much from these two painstakingly long ports and have implemented the above advice. From now on, the Leaseweb fork only contains features we have developed in the past and bug fixes. For new features, we will always contribute to community and deploy to our production only if it is merged into the mainstream. By doing this, we will be able to upgrade to the next CloudStack release much easier, and will benefit more from the community (e.g., more bug fixes, more features by other contributors). When we contribute to the community, we also benefit from knowledge sharing and the contributions from others.
Wei Zhou has been an Apache committer since 2013 and a PMC member since 2017. He has a Masters in Computer Applied Technology from the University of Science and Technology of China, and a PHD in Computer Organization and Architecture from the Institute of Computing Technology, Chinese Academy of Sciences. Wei specializes in all things computers and has over 10 years of experience in software development. He is a Principal Cloud Engineer at Leaseweb.
= = =
"Success at Apache" is a monthly blog series that focuses on the processes behind why the ASF "just works". "Sponsor Success at Apache" features insights and experiences by select ASF Sponsors https://apache.org/foundation/thanks.html
For more Success at Apache posts, visit https://blogs.apache.org/foundation/category/SuccessAtApache
Posted at 04:12PM Apr 12, 2021 by Sally Khudairi in SuccessAtApache | |
Success at Apache: I Became an Apache Solr Committer in 4,662 Days. Here’s how you can do it faster!
by Eric Pugh
On April 6th, 2020 I was invited to become a committer on the Apache Solr project. My journey to becoming a committer started in earnest 4662 days before that! On July 2nd, 2007, I opened SOLR-284, a ticket for adding content extraction to Solr.
A committer on an open source project under the Apache Foundation umbrella is someone who is trusted to contribute code to the project and to help manage and drive its ongoing development. It’s an honour to have been asked and I was very proud to accept the invitation!
So, you did the math, and you realized that it took me 153 months, or 13 years (rounding up), to become a committer, and you’re wondering “What if I don’t want to wait that long?” So here’s my quick cheat sheet on ways to become a committer on an open source project, illustrated by my own journey:
- Start by learning the culture of the project. How are decisions made? What tools do people use? What do the various acronyms mean? Join the mailing lists and read every commit.
- Start small and work your way in. Some great ways to do this are to:
- Take existing patches and test them. Update them to the latest code base. Document what you’ve learned.
- Take advantage of being new to a project to bring fresh eyes to the documentation. Every time you find yourself scratching your head on how something works, contribute a fix to the docs. It’s a powerful way to immediately contribute. This is the fastest way to get involved and involves the least cognitive load! See SOLR-2232 or this email thread.
- Answer questions on the mailing list! Being able to articulate reasonable responses to questions demonstrates how much you have learned.
- Bug fix, bug fix, bug fix! Pick bugs that have an obvious answer so that the “correct” solution is easy to figure out. If the right approach to solving it is very ambiguous, you probably won’t get much traction. Remember to remind committers to apply your fixes when they have the time! See SOLR-13965 and SOLR-11480 and SOLR-2611 and SOLR-2263.
- Ready to start slinging some code? Don’t go and refactor the core foundations of the project (at least not yet). Instead, be like a pilot fish and latch onto one of the core committers who is being very active in the project.
Embrace their vision, and start picking up tasks related to whatever major chunk of work they are doing. Write some unit tests. See about opportunities for refactoring. Do some manual testing over multiple platforms. Once they see that you’re contributing (and accelerating what they are pushing), then work to get some of your own tickets assigned to you under that vision. I’ve seen this lead directly to committership many times, and if I had followed this route, I might have joined sooner!
Here’s to the next 4,662 days of being active in the Apache Solr project!
Eric Pugh is a member of the ASF and a committer Apache Solr. He co-authored the book Apache Solr Enterprise Search Server. Eric is co-founder and CEO of OpenSource Connections, where he helps OSC clients, especially those in the ecommerce space, build their own search teams by leading projects and by acting as a trusted advisor. He also stewards Quepid, a platform for assessing and improving your search relevance.
[this post first appeared at https://opensourceconnections.com/blog/2020/07/10/i-became-a-solr-committer-in-4662-days-heres-how-you-can-do-it-faster/ ]
= = =
"Success at Apache" is a monthly blog series that focuses on the processes behind why the ASF "just works" https://blogs.apache.org/foundation/category/SuccessAtApache
Posted at 12:21PM Aug 03, 2020 by Sally Khudairi in SuccessAtApache | |
Success at Apache: Remote Collaboration in the Time of Coronavirus
by Marvin Humphrey
I "arrived" at the Apache Software Foundation in 2005, unreasonably angry about a bug in Apache Lucene. By "arrived", I mean that I sent the first few emails among several thousand I would go on to send over the next 15 years — the ASF didn't have a physical office where I could show up to buttonhole and berate some unlucky customer service representative. An unreasonably patient Lucene contributor named Doug Cutting talked me down.
Because the ASF has always been a virtual organization, the Coronavirus pandemic has had minimal impact on its day-to-day operations. While individual contributors may be personally affected, at the collective level there's been no mad scramble to adapt.
Others have not been so fortunate. All around the world organizations have been struggling to revamp their processes and infrastructure to comply with "social distancing" protocols. Sadly, many have already laid off workers, or even closed their doors for good.
And yet, there is a huge pool of work which could conceivably be performed remotely but isn't yet — or which is suddenly being performed remotely but inefficiently. If we can accelerate and streamline the transition to remote work, many jobs and businesses could be saved. With some creativity, our interim "new normal" could be more propsperous, and perhaps sooner than we think!
Are you an Open Source contributor? If so, you possess expertise in remote operations which is desperately needed in today's challenging economic environment. Let's talk about what we know and how we can help.
The Internet Turns People Into Jerks
People type things at each other over the internet that they would never say to someone's face. In person, we calibrate our language based on feedback we receive via facial expressions, tone of voice, and body language. But when all communication is written, the feedback loop is broken — and all too easily, vicious words fall out of our fingertips.
Suddenly-remote workers may find themselves exposed to this phenomenon as conversations that once took place in the office migrate to Slack, email, and other text-centric communication channels. But it can be tricky learning to recognize when a conversation being conducted via a text channel has gotten overheated — it takes an intuitive leap of empathy, possibly aided by dramatic reading of intemperate material a la Celebrities Read Mean Tweets https://www.youtube.com/playlist?list=PLs4hTtftqnlAkiQNdWn6bbKUr-P1wuSm0 on Jimmy Kimmel.
Open Source communities have grappled with incivility for as long as the movement has existed. Over time, "ad hominem" personal attacks have gradually become taboo because of their insidious corrosive effect; there exists broad cultural consensus that you should attack the idea rather than person behind it.
Defenses have become increasingly formalized and sophisticated as more and more communities have adopted a "code of conduct". While the primary purpose of such documents is guard gainst harassment and other serious misconduct, they often contain aspirational recommendations about how community members should treat each other — because serious misconduct is more likely to occur in an environment of constant low-grade incivility.
Regardless of whether your organization adopts a code of conduct, it won't hurt to raise awareness among remote team members of the suceptibility of text-based communications to incivility — so that they may identify and confront it in themselves and others and shunt everyone towards more constructive patterns of communication.
Keeping Everyone "In The Loop"
Coordination is a troublesome problem even when everyone works in the same office, but the difficulties are magnified in remote environments where it takes more effort to initiate and conduct conversations. Teams can become fragmented and individuals can become isolated unless a culture is established of keeping everyone "in the loop".
At the ASF, the problem is especially acute because its virtual communities are spread out across the globe. Due to time zone differences, it is typically infeasible to get all stakeholders together for a meeting — even a virtual meeting held via conference call or videochat. Additionally, many stakeholders in ASF communities do not have the availability to participate in real-time conversations regularly because they are not employed to to work on projects full-time.
"Synchronous" communication channels like face-to-face, videochat, phone, text chat, and so on are good for rapid-fire iteration and refinement of ideas, but they effectively exclude anyone who isn't following along in real-time. Even if conversations are captured, such as with AV-recorded live meetings or logged text chats, it is inefficient and often confusing to review how things went down after the fact.
The solution that the ASF has adopted is to require that all meaningful project decisions be made in a single, asynchronous communication channel.
- The channel must be canonical so that all participants can have confidence that if they at least skim everything that goes by in that one channel, they will not miss anything crucial.
- The channel must be asynchronous to avoid excluding stakeholders with limited availability.
Synchronous conversations will still happen outside this canonical channel —and they should, because again, synchronous conversations are efficient for iterating on ideas! However, the expectation is that a summary of any such offline conversation must be written up and posted to the canonical channel, allowing all stakeholders the opportunity to have their say.
At the ASF, the canonical channel must be an email list, but for other organizations different tools might be more appropriate: a non-technical task manager such as Asana, a wiki, even a spreadsheet (for a really small team). The precise technology doesn't matter: the point is that there are significant benefits which obtain if a channel exists which is 1) canonical, and 2) asynchronous.
In an office, decision makers can absorb a certain amount of information by osmosis — via overheard conversations, working lunches, impromptu collaborations, and so on. That source of information goes disconcertingly dry on suddenly-remote teams, leaving only information siphoned through more deliberate action.
A canonical, asynchronous communication channel can compensate to some extent, providing transparency about what is being worked on and how well people are working together, and facilitating oversight even while most of the work gets done solo. Because properly used asynchronous channels capture summaries rather than chaotic and verbose real-time exchanges, the information they provide is more easily consumed by observers watching from a distance. The canonical channel also provides an arena for gauging consensus among stakeholders and for documenting signoff.
"Lazy consensus" is a particularly productive kind of signoff, where a proposal is posted to the canonical channel and if there are no objections within some time frame (72 hours at the ASF), the proposal is considered implicitly approved. Provided that the channel is monitored actively enough that flawed proposals get flagged reliably, "lazy consensus" is a powerful tool for encouraging initiative — a precious quality in contributors collaborating remotely.
Organizations are adapting in myriad ways to the economic crisis brought on by the Coronavirus pandemic. In the world of Open Source Software where countless projects have run over the internet for decades, we've accumulated a lot of hard-learned lessons about the possibilities and pitfalls of remote collaboration. Perhaps our experiences can inform some of the suddenly-remote teams out there straining to find their way in these difficult times. Let's help them to do their best!
Marvin Humphrey is a Member Emeritus of the ASF and a past VP Incubator, VP Legal Affairs, and member of the Board of Directors. These days, he is focusing on family concerns and consulting part-time.
= = =
"Success at Apache" is a monthly blog series that focuses on the processes behind why the ASF "just works" https://blogs.apache.org/foundation/category/SuccessAtApache
Posted at 02:05PM May 11, 2020 by Sally Khudairi in SuccessAtApache | |
Success at Apache: Welcoming Communities Strengthens the Apache Way
by Jarek Potiuk
During my career, I have been a software engineer, Tech Lead Manager at Google, a robotics engineer at an AI and robotics startup, and am currently the Principal Software Engineer of a software house, Polidea, which I helped grow from 6 to 60 people within 6 years as CTO. Over the past year and a half I was a user, then contributor, then committer, and now a Project Management Committee (PMC) member of Apache Airflow.
Although I took on many roles through the years, including being the main organizer of the international tech conference (MCE), deep in my heart I was always a software engineer. It took me many years to find a place where I could explore my true potential. Then I became part of the Apache community. I first learned about the Apache Software Foundation (ASF) 20 years ago when I used the Apache HTTP server at the beginning of my career. I had only made small contributions to OSS projects up to that point, and becoming involved with Apache Airflow was the first time I contributed seriously to one. As a Principal Software Engineer at Polidea, several of our customers were using Apache Airflow and wanted to contribute back to the project to help other users. Better integrations of services with Airflow would significantly improve future releases of the software.
The needs of our customers made me and my team go from users of the project to contributors and more. We have people in our software house who understand open source, know how to follow the OSS rules, and contribute changes from customers to help other people in the community. We know how to communicate well, we can also represent a vendor-neutral point of view because we represent the view of several customers and collaborate with all the stakeholders in the project. People in our company also contribute to other OSS projects, such as Apache Beam and Flink.
We also discovered a great model where our customers wanted us to contribute to an open-source project to make it better because they were using it and wanted to improve it for future users. This allowed us to do it full time (or even 150% of the time if you add all the out-of-hours contributions). I invite you to read about it in my blog post The evolution of Open Source - standing on the shoulders of giants.
Committing to Apache
I found exactly what I was looking for in the Apache Software Foundation. It’s a great organization for people like me: individual contributors who are also good at working with others, the ones who don’t shy away from organizing and making things happen, who thrive when they can do meaningful work with others.
This made me think: since the ASF is so great, how come for 20 years I was not contributing to OSS projects more? And since so many software engineers use Apache technology, why is participation not more common? I got lucky because I was in a position that allowed and supported my contributions to an open-source project. For me, it’s a dream-come-true. But what about others? There must be more people willing to contribute and get involved in the OSS community, they probably just don’t know how to go about it yet or did not like the experience.
So here I am, sharing my thoughts on what can be done to help others to get to know ASF sooner and get involved.
Apache Airflow and the initial experience
Apache Airflow is an exciting project. It is a platform created by the community to programmatically author, schedule and monitor workflows. It started in AirBnB in 2014, was submitted to the ASF incubator in March 2016 and it graduated to a top-level project in January 2019.
When I started working with the Apache Airflow project, I quickly realized that it was hard for me to contribute to. It was not clear how to develop and debug Airflow, how to start, and how to communicate. The project had a number of channels for communication including a developer list, a Slack channel, issues and pull requests alongside the code. As a newcomer, it was not easy to understand which channel is used for what and whether it’s OK to raise certain issues using those channels. It was not clear what were the common protocols: for example how to see that one thread is a discussion and one is voting on an already discussed topic.
Is our community welcoming enough?
At a party after a conference where I spoke about Apache Airflow, I had a long discussion with a young engineer who was new to the field, Fabian. Fabian told me that often OSS projects create some invisible barriers around communication and onboarding. I explained to him the “Apache Way” and how transparency and openness help with those barriers, fiercely protecting the fact that “we are open”. We carried on with our friendly discussion and it was really eye-opening.
That conversation stayed with me for a while. After some time, I realized that maybe our project was not as welcoming and accessible as we thought: there should be an easier way for people to contribute and join the community. I recalled my case—when I joined the community, I made a mistake by writing that something “will happen” before discussing it with the community. A long-time community member reminded me that this is not the way we should communicate at Apache Airflow. Just to note - each project is autonomous within Apache and it defines its own communication rules. It was done in a very good and friendly tone and I took it as a lesson, but some people might be put off by such a response. Not everyone has the determination, experience, thick skin and willpower to overcome all the obstacles and some people might be put off by such responses - even if they are nice and friendly.
Could we do better to communicate the ideals of our community more straightforwardly? Without coming across as harsh? Maybe we could find a way of explaining to the future contributors how they should communicate rather than do it by trial-and-error?
Becoming a more welcoming community
A few days and emails after the discussion with Fabian I started a thread at the developer’s list of Apache Airflow “[NON-TECHNICAL] [DISCUSS] Being an even more welcoming community?” that kicked off a conversation that included people who rarely had spoken before. As a result, we managed to introduce many changes to the processes for new members. Thanks to the input of people such as Karolina Rosół (Project Manager at Polidea), we came to the conclusion that the way seasoned community members communicate at Apache Airflow is not obvious at all to newcomers. We added missing chapters to our CONTRIBUTING documentation regarding communication channels, expected response times, and more.
What helped a lot was that we were able to improve our documentation for the development process during last year’s Google Season of Docs program. Apache Airflow was one of the first projects at the ASF to participate in the program. I was one of the mentors to Elena Fedotova, a technical writer assigned to our project. She improved and restructured the documentation, and made it more readable and easier to understand. Many people took part in reviewing and correcting the docs. Also, we took on the task of creating a new website for Apache Airflow with modern, clean design, and well thought UX addressing different personas of visitors in mind (including new contributors, users and potential partners of Apache Airflow). One of my colleagues and fellow Apache Airflow committer, Kamil Breguła, put enormous effort into both building the website and also restructuring the documentation.
As a result of the discussion at the developer’s list, we also introduced the mentorship options and even handled (via mentorship) a few difficult cases that could have lost us valuable contributions. A great example of the improvements we’ve done as a community might be this tweet from Vanessa, a research software engineer who had no experience with the community. Vanessa had tried to contribute support for Singularity—a popular container technology for high-performance computing - a year earlier, and came back for a second try after much of this work was done:
Are we there yet?
Looking back, it’s been a long (yet satisfying) journey trying to make the Apache Airflow community more welcoming. But how do we know it works?
At the beginning of this year, we started to participate in the Outreachy and Google Summer of Code programs, where people from around the world with different backgrounds can be paid for contributing to open source projects. Together with my friend and PMC member Kaxil Naik, we became program mentors and started to receive a flood of requests from the Outreachy members. Initially, we were overwhelmed but soon realized that we have everything we need to answer the questions of the candidates (and future committers) to let them teach their lessons and easily follow the “contributing” documentation. The contribution environment was available for them to get started, and the documentation detailed how they could learn how to prepare contributions and communicate via various channels.
Just two days later, we approved a few pull requests from those people! That’s quite a difference from 1.5 years ago when it took days, if not weeks, to understand the environment and how to work with Apache Airflow. It was truly a team effort; many community members participated in the process and made the Airflow project much more welcoming to newcomers.
Despite having challenges during our experience getting started, I was never going to quit. I loved the project and people almost from day one. Realizing how hard it was initially to start contributing (other people had told me so as well), I decided that I would put a lot of effort (both professionally and also personally) into making the project easier and more open and accessible for people with different backgrounds and experiences. My experience starting as a contributor, then becoming a committer, and now a PMC member proves that this is possible.
To me, Success At Apache means making the community and the spirit of Apache Way more accessible to people around the world. With the difficult times that we are going through now with COVID-19, it’s more important than ever to build and strengthen various communities. And to strengthen the community means to be open to others and be welcoming, We hope that our experience will encourage you to take a look at your project and see if you can make your community more welcoming.
# # #
Jarek Potiuk started to work on the Apache Airflow project in September 2018. He became an Apache Airflow committer in April 2019 and a member of the Apache Airflow Project Management Committee (PMC) in October 2019. He is an Apache project mentor in Outreachy and Google Summer of Code and was a mentor in Google Season of Docs. Jarek is a Principal Software Engineer at Polidea and always keen on making it easier for people with different backgrounds to join OSS projects.
= = =
"Success at Apache" is a monthly blog series that focuses on the processes behind why the ASF "just works" https://blogs.apache.org/foundation/category/SuccessAtApache
Posted at 09:46AM Apr 06, 2020 by Sally Khudairi in SuccessAtApache | |
Success at Apache: Literally
Browsing through the other "Success at Apache" posts made me reflect on the word "success". Years ago, I was asked in a job interview, "How do you define success?". After a pause, I asked back, "In what?", which threw the interviewer off a bit. That's just too broad of a question for me to define one answer: success in a career, success as a human, success as a team member, success at a software release, the list goes on and on.
Every day there's a giant list of possible successes and failures, and that’s even before you get to work ...so keep that in mind as you continue reading.
In August of 2016 I came across a blog post that would change my life forever.
At the time, I was looking for a new job that was taking longer than I expected. Taking a long shot, I sent off a very sparse email replying to the post. Two days later David Nalley (VP Infrastructure) replied, introducing me to Daniel Gruno who'd be doing the first round of interviewing. Fast forward a few months, and, spoiler alert: I got the job.
My first day "in the office" was in Seville, Spain, on November 14th during ApacheCon EU. Let me jump back a bit: most of the "Success at Apache" posts talk about the extensive background the authors have, both in the Open Source community and the ASF. While I use httpd, LAMP, etc. all the time, I never really found out how the "sausage was made". Apache has well-made products and the philosophy of how they were built intrigued me. My career until that point has mostly been inside Microsoft shops, usually with me suggesting FOSS solutions in meetings and only getting to use them in small-ish batches. A few MySQL boxes here, a few other Linux machines there, but not "full stack" kinda stuff: I ran it where I could but I was very happy with Microsoft products. "Best tool for the job", right?
Anyway, back to Spain. I don't travel as much as I should, my Spanish is terrible (or enough to get me into a bar fight), and I'm traveling to a country I've never been to.
Friday November 11th was the last day at my previous job. Saturday afternoon, I left my wife and kid to jump on a plane for Seville, Sunday-ish I landed, and on Monday I started work in another country, at a job that was 98% Linux-based (Windows Jenkins build nodes), with people whom I’ve never seen before because no one used video chat during the interviews --at a conference held by the foundation I now work for.
You may ask yourself, "How did I get here?", as I sure did: queue "Once in a Lifetime" by the Talking Heads...
My time at the ASF has been very interesting to say the least. With such a huge range of users of Apache software, some days I'm helping a large global company trying to get a product out the door, other days I'm troubleshooting a broken commit for someone working in their basement between dinner and baths for the kids. That's what makes this place special: those contributions help the community and help the common good of the project. The unique perspective I have is from within Infra. We don't just support the ASF, we support all projects in one way or another. One project might just be getting started with automated builds in Jenkins while another has been using CI/CD for years. That's a true strength of the ASF: disparate parts come together as a whole in a way that wouldn't work otherwise. Some days my job has nothing to do with technology, it's just getting the right people together on an email to figure out how to solve a problem, leveraging the different parts.
As mentioned earlier, "success" is a moving target, and at Apache, it's no different. Though in my case, any success at my job means I'm helping the ASF become successful, which in turn helps the projects and communities it supports. Behind every commit is a person, just working towards their own success.
I'm glad that I took the chance to respond to the job opening. Every job, company, and environment have a fair share of unpredictably and diversity. At the ASF, those traits are celebrated, leveraged, leaned on, and held up by the great people I get to work with and the community that I'm proud to be a part of.
= = =
Posted at 04:03PM Feb 03, 2020 by Sally Khudairi in SuccessAtApache | |
Success At Apache: "Mentor Your Mentor"
The concept behind "Mentor your Mentor" is that someone who is active in Apache should watch for opportunities to bring the idea of open source as a retirement hobby to the attention of a retiring colleague, even if the retiree has been their mentor, and no matter how senior the retiree.
If the retiree is interested, the Apache contributor can offer various forms of help and support such as:
Patricia Shanahan worked from 1970 to 2002 in various programming and computer architecture roles for NCR, Celerity Computing, FPS, Cray Research, and Sun Microsystems. She then went to UCSD as a graduate student, receiving a PhD in computer science in 2009, after which she retired.
= = =
Posted at 05:47PM Sep 23, 2019 by Sally Khudairi in SuccessAtApache | |
Success at Apache: Why you'd want to become an Apache Committer
by Dmitriy Pavlov
All newbies in Open Source communities may sometimes think that they’ll never be able to become Committers. Many treat this role as a prestigious one, granted only for special feats, and after having written a ton of code. But not all things are so simple, and I hope my story will help you.
Election as Apache Committer
My journey with The Apache Software Foundation began relatively recently, in 2015. I was working for Openway Group, and was enthusiastic about in-memory computing. I got to know about Apache Ignite at a local developers conference. I implemented the POC of a backend system based on Apache Ignite. I was so impressed with the clear API and documentation, and it was also very convenient that I could start prototyping without passing through the approval process. I suggested using the Apache product instead of a source-available solution. I met Konstantin Boudnik (cos), who helped me to understand the difference between Apache projects and source-available/closely-governed products.
Luckily for me, GridGain, the company that initially donated Ignite to the Apache Incubator, has a development center in my city (Saint Petersburg, Russia). I joined the GridGain team in 2017.
As part of my day job, I provide patches to the product. I actively joined the dev.list discussions (some fellows sometimes say “too actively”). I’ve created a number of wiki pages - ‘Apache Ignite-Under the hood’ to help developers understand product internals. Also, I developed ‘Direct IO’ plugin. I was elected as a Committer.
In 2018 I was concerned about reviews of patches from members of the community not affiliated with GridGain. Since I had a commit-bit now, I’ve started to review patches and ask others to review them, too. I don’t know for sure, but I suppose - these social achievements in the community development were a basis for me being elected as Project Management Committee (PMC) member.
I asked several questions about The Apache Way on the Community Development (“ComDev”) dev list. I was very impressed by how friendly and welcoming they are. I very much like such a positive atmosphere, and feel it influences the success of Apache projects. Now I’ve also joined Apache Training (incubating) community as Committer and PPMC (Podling Project Management Committee) member.
Quite funny for a software developer with 17 years of experience… being elected as a Committer, that is to say, because of the social aspects and documentation.
Who is a typical Committer and where does his or her strength lie?
When creating an Open Source product, we always let the users explore it in action -- as well as allow them to modify it and distribute modified copies. But when such modified copies are replicated in an unsupervised manner, we don’t get contributions into the main codebase and the project stalls. It’s here where we need exactly such a person – the Committer – someone who is authorized to merge user contributions into the project.
Why should you become a Committer?
First of all, being assigned to a Committer role is extremely motivating. The professional community acknowledges you and your work, and you clearly see the results of your work in action.
How different is that from some enterprise project -- where you have no idea why you must continually keep shuffling various XML fields?
The second pure advantage of being a Committer is an opportunity to connect with top professionals and also pull some cool ideas from Open Source into your own project. But, if you aren’t one of the top professionals, certainly don’t be afraid to join -- the community has various tasks for different folks.
Besides, being a Committer is a jewel in your CV --and even a greater plus for junior programmers, because at interviews you are often asked to show code samples. If you know an Open Source project well, a company supporting or using it will be happy to hire you. There are some people who will tell you that great positions are unreachable without first committing in Open Source.
There are some bonus goodies, too! For example, Apache Committers get an IntelliJ Idea Ultimate license for free (albeit with some limitations).
How do you become a Committer?
You should be committed to the project --it’s just that simple. Development, writing tests and documentation, and simply answering questions on lists are also good ways to start working towards committership.
Yeah, the contributions of QA engineers and technical writers in the community are valued no less than the developers’ contributions.
If you think there are no tasks for you on some project, you are wrong. Just join the community you are interested in and start working on its tasks.
The Apache Software Foundation has this dedicated page that lists what contributions are needed.
Committer — to be or not to be?
Committer activity is a good and useful endeavor, but one shouldn’t strive to become a committer per se. This status can be granted not only for code and it doesn’t justify your proficiency.
Find a project you may be interested in: it will probably be a project whose software you already use. Dive into its code and say hi to the community; offer help, improve docs, complete a newbie ticket or answer to a user. You may just be surprised how welcoming and open folks are there.
Strive to gain the expertise (knowledge and experience) while researching the project, tweaking it and helping others to solve their problems, and, hopefully, enjoying collaborative development in an Open Source project.
Getting started at http://community.apache.org/ will help you on your way.
Dmitriy Pavlov is a Java developer enthusiastic about Open Source and in-memory computing. He is interested in system performance, information security, and cryptography. He created and donated utility for monitoring tests for Apache Ignite, and is a former Community Manager for Apache Ignite at GridGain. Dmitriy represents the Apache Ignite Project Management Committee (PMC) at local meetups in Russia. He runs workshops and training for Apache Ignite developers and users, and is a frequent speaker at meetups and conferences.
= = =
Posted at 11:48PM Sep 21, 2019 by Sally Khudairi in SuccessAtApache | |
"Success at Apache: The Path To Berlin"
This blog post is going to tell an entirely different story: One about what persistence, patience and continuous engagement can accomplish. One about what can happen when people are working together.
It was over a decade ago, in 2008, when I met with people interested in the then-hyped Apache Hadoop project to create a quarterly meetup on everything data analytics, text mining, scalable data storage. That was when the (Apache) Hadoop Get Together Berlin took place at newthinking store - a co-working space and event location before that format itself was turned into a business model http://blog.isabel-drost.de/posts/scaling-user-groups336.html.
A year later it was clear that an ApacheCon EU was unlikely to happen in Europe. When Simon Willnauer, Jan Lehnardt and myself approached The Apache Software Foundation about holding the event in Berlin - the kind people at Apache who did have experience with ApacheCon successfully stopped us: Given the baggage around the event, the trademark implication, the expectation that all sorts of different people had as well as the pure http://blog.isabel-drost.de/posts/how-hard-can-it-be-organising-a-conference.html logistics of creating an event that's bigger than 100 people, that was a safe and likely wise decision.
What they didn't achieve though was to stop us from running some event the following year: We at least wanted to give friends from the Big Data and search communities an excuse to make their employers pay for a trip to Berlin in summer: http://blog.isabel-drost.de/posts/my-highly-subjective-berlin-buzzwords-recap290.html. This was possible due to some lucky conditions we found ourselves in: Knowing conference organisers who were willing to share their know-how such as legal issues and boundaries with us Finding an event company (newthinking.de) that was supporting the idea.
Being employed by a company https://www.neofonie.de/ that saw value in sponsoring the event by allowing me to do so during my working hours.
While successful, part of the ASF community was still missing though. Fast forward several years to 2017, a new conference concept was born. Under the name of FOSS Backstage, we focus on the one topic that every project at Apache has to deal with: Governance, legal, security, economics https://blogs.apache.org/foundation/entry/success-at-apache-cookie-monster Issues that are not an Apache exclusive issue, but true for everyone - individuals as well as legal entities - involved in open source projects.
The only caveat: We had intentionally left all technical content out of scope for FOSS Backstage. For the data analytics crowd the event was conveniently co-located with Berlin Buzzwords. For the remaining content, Sharan Foga kindly volunteered for coordinating to run an Apache Roadshow alongside FOSS Backstage together with newthinking for two days after Berlin Buzzwords and in parallel to FOSS Backstage. With a name different than ApacheCon this left quite some room for experimenting beyond the traditional ApacheCon format.
Little over a year after that ApacheCon finally is on its way to the city of Berlin: With https://twitter.com/plainschwarz as event organisers, in collaboration with the Apache Software Foundation - with Myrle Krantz as event chair to coordinate between the ASF and the local event team.
In retrospect, the series of events was an interesting journey. There's a couple of lessons I've learned that carry over to open source software development - but also a few that are distinctly different.
1.Patches welcome - turn people that come with feature requests into active contributors
Instead of accepting feature requests from people, it helps to pull them in to submit their own patches: Early on there were requests for a barcamp, for a lightning talk session, for trainings. My response back then: Submit the idea through the CfP form, find someone to run it and we'll run it through the regular submission process adding it to the schedule if it fits.
For the trainings we went for a slightly different approach: Instead of directly offering them ourselves we reached out to established training providers suggesting to run with a co-location/ co-promotion approach.
For those that asked for free tickets we would turn them into helping hands - either on site, during setup or (in the first year) as local guides taking groups of up to twenty people out for dinner in a restaurant close by that they had selected.
For those that asked for more content on some specific topic we offered the option of organising a deep dive satellite event on Wednesday after the conference at one of the many companies willing to host these in Berlin.
In general we left a lot of room and freedom to those who wanted to get involved and add content to the event that they found missing.
2. Decisions are made by those doing the work
While feedback is important, there is a limit to what can feasibly be realised for any given conference budget. While we value feedback from anyone involved, the final decisions need to be taken by those actively contributing time to the event. As a result, that also means that not all feedback can always be taken into account - at least not right away, maybe at a later stage or in a different event, the consecutive year or just taken as an impulse to come up with new fresh ideas.
3. "It's done when it's done" is not an option
Conferences are slightly different from open source releases in that there are hard deadlines - in combination with a fixed budget coming in from attendees and sponsors and some hard features that cannot be postponed to the next release (unless you're organising a remote only conference, running without a venue is pretty much impossible.) That circumstance makes organisation slightly more prone to conflict than your average open source project: There's no cheap way to go down both paths and only at the very last minute decide which is better - or even offering both options.
4. Balancing public and private communication
At Apache we value public communication: Often having discussions in the open invites others to participate and shows where contributions are needed. When it comes to budget, ticket pricing, communication with sponsors, contracts including specific prices for venues this approach becomes a whole lot harder. Even though it helps to provide a dedicated mailing list for program committee members as well as interested attendees to get in touch.
It also helps to make some of the planning background public - either while discussions are ongoing, or at least after a conclusion has been reached: http://blog.isabel-drost.de/posts/berlin-buzzwords-scheduling-behind-the-scenes118.html (caveat: the algorithm has changed substantially since this was published, but the article did help answer a lot of questions.)
One downside to this mode of operation is that people who potentially could provide valuable insight or help out have no idea of what is going on. Another downside is that for the outside world it becomes invisible how large a team it takes to make the event successful. As a tiny fix for that we always tried to make the team involved publicly known.
5. Bringing people together
At Apache we have a tradition to work asynchronously - on archived, searchable, referenceable mailing lists. Without that way of working we wouldn't be able to build bridges across timezones, geographies, cultures and organisations. It wouldn't be possible to collaborate for people with wildly different time schedules. Despite all this hearing people speak when reading their texts makes it easier to understand their tone correctly. Despite all this there are topics that are best shared face to face only for deniability reasons. Despite all this, meeting someone in person who so far has been communicating only remotely with you can be a ton of fun. I hope that you will join that fun in October in Berlin: Looking forward to seeing you there!
Join us! ApacheCon Europe/Berlin 22-24 October 2019 https://aceu19.apachecon.com/
= = =
Posted at 01:18PM Aug 06, 2019 by Sally Khudairi in SuccessAtApache | |
Success At Apache: Positively impacting the world one contribution at a time
Dinesh Joshi is a Senior Software Engineer and a Committer on the Apache Cassandra project. He has a Masters in Computer Science (Distributed Systems & Databases) from Georgia Tech, Atlanta. In the past, Dinesh was a Principal Software Engineer at Yahoo building real time distributed systems for Yahoo’s Finance Web, iOS & Android apps. He is also an international speaker and regularly talks about Apache Cassandra and Databases. In his spare time, he volunteers as a mentor for Women Who Code.
# # #
"Success at Apache" is a monthly blog series that focuses on the people and processes behind why the ASF "just works". https://blogs.apache.org/foundation/category/SuccessAtApache
Posted at 04:36AM Apr 01, 2019 by Sally Khudairi in SuccessAtApache | |
Success at Apache: What You Need to Know
EDITOR'S NOTE: I came across the author's original post, "An Introduction to Apache Software — What you need to know", dated 3 February 2017, and was interested in finding away to share with the greater Apache community. The author's enthusiasm was palpable, and earnestly intended to help educate others. With the ASF celebrating its 20th Anniversary this year, it's easy for many of us to simply rely on tribal knowledge, not realizing that navigation to definitive guides aren't intuitive to newcomers. Those of us who have been here for a while "just know", partially because we were creating it as we went along. Below is an updated version of the original post, amended through the guidance of three long-standing ASF Members. And that's the point of it all at the end of the day: at Apache, we help each other as it contributes to our collective success, and this writeup will help others find their Success at Apache.
by Maximilian Michels
Before you started reading this post, you have already been using Apache software. The Apache web server (Apache HTTP Server) serves about every second web page on the WWW, including this website. One could say, Apache software runs the WWW. But it doesnt stop there. Apache is more than a web server. Apache software also runs on mobile devices. Apache software is part of enterprise and banking software. Apache software is literally everywhere in today's software world.
Apache has become a powerful brand and a philosophy of software development which remains unmatched in the world of open-source. Although the Apache® trademark is a known term even among the less tech-savvy people, many people struggle to define what Apache software really is about, and what role it plays for today's software development and businesses.
In the last years I've learned a lot about Apache through my work on Apache Flink and Apache Beam with dataArtisans, as a freelancer/consultant, and as a volunteer. In this post I want to give an overview of the Apache Software Foundation and its history. Moreover, I want to show how the "Apache way" of software development has shaped the open-source software development as it is today.
The History of the Foundation
The Apache Software Foundation (ASF) was founded in 1999 by a group of open-source enthusiasts who saw the need to create a legal entity to institutionalize their work. Among the first projects was the famous web server called Apache HTTP, which is also simply referred to as "Apache web server". At that time, the Apache web server was already quite mature. In fact, not only did the Apache web server give the foundation its name but it became the role model for the "Apache way" of open and collaborative software development. To see how that took place, we have to go back a bit further in time.
A Web Server goes a long way
As early as 1994, Rob McCool at the National Center for Supercomputing Applications (NCSA) in Illinois created a simple web server which served pages using one of the early versions of today's HTTP protocol. Web servers were not ubiquitous like they are today. In these days, the Web was still in its early days and there was only one web browser developed at CERN where the WWW was invented only shortly before. Rob's web server was adopted quite fruitfully throughout the web due to its extensible nature. When its source code spread, web page administrators around the world developed extensions for the web server and helped to fix errors. When Rob left the NCSA in late 1994, he also left a void because there was nobody left to maintain the web server along with its extensions. Quickly it became apparent that the group of existing users and developers needed to join forces to be able to maintain NCSA HTTP.
At the beginning of 1995, the Apache Group was formed to coordinate the development of the NCSA HTTP web server. This led to the first release of the Apache web server in April 1995. During the same time, development at NCSA started picking off again and the two teams were in vivid exchange about future ideas to improve the web server. However, the Apache Group was able to develop its version of the web server much faster because of their structure which encouraged worldwide collaboration. At the end of the year, the Apache server had its architecture redone to be modular and it executed much faster.
One year later, at the beginning of 1996, the Apache web server already succeeded the popularity of the NCSA HTTP which had been the most popular web server on the Internet until then. Apache 1.0 finally was released on Dec 1, 1995. The web server continued to thrive and is still the most widely used web browser as of this writing.
The Rise of the Foundation
The team effort that led to the development and adoption of the Apache web server was a huge success. The Apache project kept receiving feedback and code changes (also called patches) from people all over the world. Could this be the development model for future software? More and more projects started to organize their groups similarly to the Apache group. As the number of project grew, financial interests arose and potential legal issues threatened the existence of the Apache group. Out of this need, the Apache Software Foundation (ASF) was incorporated as a US 501(c)(3) non-profit organization in June 1999. In the US, the 501(c)(3) is a legal entity specifically designed for non-profit charitable organizations. This is in contrast to other for-profit open-source software organizations or even US 501(c)(6) non-profit organizations which do not require to be charitable.
After the ASF was incorporated, new projects could easily leverage the foundation's services. Over the next year, every few months a new project entered the ASF. The first projects after Apache HTTP Server were Apache mod_perl (March 2000), Apache tcl (July 2000), and Apache Portable Runtime (December 2000). After a short break in 2001 which was used to come up with a programmatic approach to onboard new projects via an incubator, the ASF has seen very consistent growth of up to 12 projects (2012) each year.
The ASF became a framework for open-source software development which, in its entirety, remains unmatched by other forms of open-source software development. The secret of ASF's success is its unique approach to scaling its operations, in which the foundation does not try to exercise control over its projects. Instead, it focuses on providing volunteers with the infrastructure and a minimal set of rules to manage their projects. The projects itself remain almost autonomous.
Apache Governance - How does the foundation work?
There are about 200 independent projects running under the Apache umbrella. The question may arise, how does the foundation govern its projects? First of all, the ASF is an organization that is run almost entirely by volunteers. In the early days, many of the volunteers were developers which did not like to spend much time with administrative things (who does?), so the organization is structured in a way that requires little central control but favors autonomy of the projects which run under its umbrella.
For every project (e.g. Apache HTTP, Apache Hadoop, Apache Commons, Apache Flink, Apache Beam, etc.), there are a Project Management Committee (PMC), Committers, Contributors, and Users.
Project Management Committee (PMC)
The PMC manages a project's community and decides over its development direction. Its most rudimentary and traditional role is to approve releases for a project. In that sense it has a similar function as the original Apache Group which led the development of Apache HTTP Server. When a new project graduates from the Incubator (covered later), the foundation's central instance, the Board, approves the initial PMC which is selected from the PPMC (Podling PMC) formed during incubation. Each PMC elects one PMC member as the PMC Chair which represents the project and writes quarterly reports to the ASF Board. The Chair needs to be approved by the Board.
Through a project's lifetime new PMC members can be elected by the existing PMC. Note that each new PMC member needs to be approved by the Board but this approval is merely formal and there are few instances that a new PMC member is not approved. PMC members do not need the formal permission of the foundation to elect new Committers. PMC members themselves are also Committers. Let's learn about Committers next.
Committers can modify the code base of the project but they can't make decisions regarding the governance of the project. They are trusted by the PMC to work in the interest of the project. When they contribute changes, they commit (thus, the name) these changes to the project. Committers don't only change code but they can also update documentation, write blog posts on the project's website, or give talks at conferences. Committers are selected from the users of the project; more about this process in the Meritocracy section.
Users and Contributors
Users are as important as the developers because they try out the project’s software, report bugs, and request new features. The term is a slightly confusing because, in the Apache world, most users tend to be developers themselves. They are users in the sense that they are using an Apache project for their own work; usually they are not actively developing the Apache software they are using. However, they may also provide patches to the Committers. Users who contribute to a project are called Contributors. Contributors may eventually become Committers.
In the image, the per-project entities are represented as circles. They exist for every project. Note that the user group circle is not depicted in full size because big projects tend to have much more Users than Committers and PMC members.
The ASF does not work without some central services. Here are the most important entities:
Apache members represent the heart of the foundation. They have been referred to as the "shareholders of the ASF" because they are deeply invested in the ASF (not in the financial sense). A prerequisite to becoming a member is to be active in at least one project. To become a member, you have to show interest in the foundation and try to promote its values. The ASF holds membership meetings which are usually held annually. At membership meetings new members can be proposed and subsequently elected. Elected members receive an invitation which they can choose to accept within 30 days. Becoming a member it not merely a recognition for supporting the ASF, but it also grants the right to elect the Board.
The Board of Directors (Board)
The Board takes care of the overall government of the foundation. In particular, it is concerned with legal and financial matters like brand and licensing issues, fundraising, and financial planning. The board is elected by the Apache members annually and is also composed of Apache members. The current board can be viewed here. Note that there is only one central Board for the entire foundation but Board members can be PMC members in different projects.
Officers of the corporation
The Officers of the corporation are the executive part of the administration. They execute the decisions of the board and take care of everyday business. Most of the officers are implicitly officers by being the PMC chair of a project. Additionally, there are dedicated officers for central work of the foundation, e.g. fundraising, marketing, accounting, data privacy, etc.
The support and administration team (INFRA) is the team that runs the Apache infrastructure and provides tools and support for developers. INFRA is the only team at Apache which consists of contractors which are paid for their work. Their work includes running the apache.org web site and the mailing lists which are Apache’s main way of communication. Over time, various other tools and services were created to assist the projects. The main tools available which are used by almost all projects are:
- Web space for the project's websites.
- Mailing lists, for discussing the roadmap of the project, exchanging ideas, or reporting bugs (unwanted software behavior). Typically the mailing lists are divided into a developer and a user mailing list.
- Bug trackers, which help developers to keep track of new features or bugs.
- Version control, which helps developers to keep track of the code changes.
- Build servers, which help to integrate/test new code or changes to existing code.
Founded in 2002, the Incubator is a project at the ASF dedicated to forming (bootstrapping) new Apache projects. The process is the following: People (volunteers, enthusiasts, or company employees) make a proposal to the Incubator. The proposal contains the project name, the list of initial PPMC (Podling PMC) members, and the motivation and goals for the new project. Once the IPMC (Incubator PMC) has discussed the proposal, it holds a vote to decide if the project enters the incubation phase. In the incubation phase, projects carry "incubating" in their names, e.g. "Apache Flink (incubating)"; this is dropped only once they graduate. To graduate, a project has to show that it is mature enough. The Community Development project at the ASF has created a catalogue of criteria called the Maturity Model. It requires having an active community, quality of code, and being legally compliant. Formally, the project needs to prove it fulfils the criteria to the Incubator IPMC which is comprised of Apache members. All existing work donated in the course of entering the incubator and all future work inside the project has to be licensed to the ASF under the Apache License. This ensures that development remains in the open-source according to the Apache philosophy. More about incubation on the official website.
Meritocracy - How are decisions made?
The Apache Software Foundation uses the term "meritocracy" to describe how it governs itself. Going back to the ancient Greeks, meritocracy was a political system to put those into power which proved that they were motivated, put effort into their work, and were able to help a project. The core of this philosophy can be found throughout history from ancient China to medieval Europe and is still present in many of today’s cultures in the sense that effort, increased responsibility, and service to a part of society ought to pay off in terms of power of decision, social status, or money.
Meritocracy in the Apache Software Foundation denotes that people who either work in the interest of the foundation or a project get promoted. Users who submit patches may be offered Committer status. Comitters who are drive a project, may gain PMC status. PMC members active across projects and taking part in the foundation's work may earn the Member status.
Decision-making within the foundation and projects are typically performed using Consensus. Consensus can be "lazy" which implies that even a few people can drive a discussion and make decisions for the entire community as long as nobody objects. The discussions have to be held in public on the mailing list. For instance, if a Committer decides to introduce a new feature X, she may do so by proposing the feature on the mailing list. If nobody objects, she can go ahead and develop the feature. If lazy consensus does not work because an argument cannot be settled, a majority based vote can be started.
Meritocracy and "lazy" Consensus are the core principles for governance within the Apache Software Foundation. Meritocracy ensures that new people can join those already in power. "Lazy" Consensus creates the opportunity to split up decision-making among the group such that it doesn't always require the action of all members of the community.
The Apache License - A license for the world of open-source
The Apache license is very permissive in the sense that source code modifications are not required to be open-sourced (made publicly available) even when the source code is distributed or sold to other entities. This is in contrast to “Copyleft” licenses like the GNU Public License (GPL) which, upon redistribution, requires public attribution and publication of changes made to the original source code. The Apache license was first derived from the BSD license which is similarly permissive. The reason for this was that the Apache HTTP Server was originally licensed under the BSD license.
The current version of the Apache License is 2.0, released in January 2004. The changes made since the initial release are only minor but they set the prerequisite for its prevalence. At first, the license was only available to Apache projects. Due to the success of the Apache model, people also wanted to use the license outside the foundation. This was made possible in version 2.0. Also, the new version made it possible to combine GPL code with Apache licensed code. In this case, the resulting product would have to be licensed under the GPL to be compatible with the GPL license. Another change for version 2.0 was to make inclusion of the license in non Apache licensed projects easier and require explicit patent grants for patent-relevant parts.
The ASF today is not the small group that it used to be back in 1999. At the time of this writing, the Apache Software Foundation hosts 51 podlings in the Incubator and 199 top-level committees (PMCs). This amounts to almost 300 projects (latest statistics). Note that, a PMC may decide to host multiple projects if necessary. For instance, the Apache Commons PMC has split up the different parts of the Apache Commons library into separate projects (e.g. CLI, Email, Daemon, etc.). 50 of the 300 projects have been retired and are now part of Apache Attic, the project which hosts all retired projects. The above graph is taken from https://projects.apache.org.
The Apache Software Foundation regularly organizes conferences around the world called ApacheCon. These conferences are dedicated to the Apache community or certain topics like Big Data or IoT. It is a place to meet community members and learn about the latest ideas and trends within the global Apache community. Apart from the official conferences, there are conferences on Apache software organized by companies or external organization, e.g. Strata, FlinkForward, Kafka Summit, Spark Summit.
Here's a list of some projects that I came across in the past. I grouped them into categories for a better overview. I realize you might not know a lot of the projects but maybe this list can be the starting point to discover more about these Apache projects :)
Query Tools / APIs
- HTTP (the one!)
Apache - A Successful Open-Source Development Model
My first attempt to learn more about Apache goes back several years. I was using the Apache License while working on Scalaris at Zuse Institute Berlin. I realized that the license was somehow connected to the Apache Software Foundation but I didn't really understand the depth of this relationship until I started working on Apache Flink with dataArtisans. Besides the official homepage of the foundation, relatively little information was available on the Internet about the foundation and its projects. In hindsight, the best source of information was to read the email archives, get to know other people at the ASF, and become a volunteer myself :)
When I originally wrote this post I couldn’t find an introductory guide to the ASF. So I decided to do a bit of research myself and tried to write down what I had learned working on Apache projects. I hope that I could provide an overview of the ASF and show you how significant the foundation has been for the open-source software development.
Thank you for reading this article. Feel free to write me an email if I got something wrong or you would like to comment on anything.
Thank you Roman Shaposhnik, Shane Curcuru, Dave Fisher, and Sally Khudairi for your comments which were very helpful to revise this post for the 20th anniversary of the ASF.
# # #
"Success at Apache" is a monthly blog series that focuses on the people and processes behind why the ASF "just works". https://blogs.apache.org/foundation/category/SuccessAtApache
Posted at 02:15AM Mar 26, 2019 by Sally Khudairi in SuccessAtApache | |
Apache Software Foundation Platinum Sponsor Profile: Leaseweb
with Robert van der Meulen, Global Product Strategy Lead at Leaseweb
Robert is Global Product Strategy Lead at Leaseweb. Fascinated by technology, Robert studied computer sciences, and after his studies, he delved into the then relatively young and rapidly developing internet technology. He soon understood that the internet would be at the center of almost everything we do and wanted to be part of it. Robert is passionate about using technology to improve people's lives. He contributed to the Debian project as a developer later introduced Apache CloudStack in Leaseweb and has been active in the open source community for quite some time. During his 9 years at Leaseweb, he worked hard to make sure digital transformation, from how we communicate to how we do business, is part of the company mission. Follow @Leaseweb on Twitter.
"Many Apache projects are being built by – mostly – volunteers and motivated individuals, and the world can use, change and develop all of those. It's important to support the people that make this possible."
How did Leaseweb's work with Open Source begin?
Support for foundations such as the ASF is important because those foundations are important :-) . Any big open source project at some point needs the infrastructure to continue to run – and it's great if a project can rely on an organization like the ASF for that infrastructure so the focus can be on making the project great. Open source projects can grow and be more successful if they can more easily deal with governance, financials and administration, as well as tangible infrastructure and tools. Helping an organization like the ASF helps the ASF projects all over, which has an impact on the software we use as part of our products.
What sets the ASF apart from other software foundations or consortia?
A number of our leading Cloud products are based on Apache software. We use Apache CloudStack for various private cloud and VPS offerings, and those platforms are continually growing and evolving – and we keep adding more with most of the new locations we open. Along with the CloudStack platforms, hosting environments obviously have many deployments using Apache web servers. Within our technical teams we consume lots of different Apache projects and actively contribute to a number of them (we have a dedicated CloudStack team that includes one of the Apache CloudStack PMC members). Every software solution has its limits, and obviously this goes for CloudStack too – but also we're happy we can change or help change the things that could be better.
It feels great! It's important to, if you have the opportunity, give something back. Many Apache projects are being built by – mostly – volunteers and motivated individuals, and the world can use, change and develop all of those. It's important to support the people that make this possible.
Are there any other thoughts on the experience of being a large-scale donor that you would like to share? What else do we need to know?
Not much. I personally really enjoy seeing what happens with the support we provide – what projects it makes possible, what things it makes more easy or better. Tangible insight in the results is a big motivator as well as a proof point.
# # #
Sponsors of The Apache Software Foundation such as Leaseweb enable the all-volunteer ASF to ensure its 300+ community-driven software products remain available to billions of users around the world at no cost, and to incubate the next generation of Open Source innovations. For more information sponsorship and on ways to support the ASF, visit http://apache.org/foundation/contributing.html .
Posted at 02:04PM Mar 18, 2019 by Sally Khudairi in SuccessAtApache | |
Success at Apache: Growing with the ASF
by Phil Steitz
I got involved at the ASF in 2002, back in the wild and wooly Apache Jakarta days. In my day job, I was responsible for the team introducing Java technology at a large financial services company.
One of the first things we built was an MVC (model-view-controller) framework for Web applications. We were very proud of it and it worked great in production, but it was hard for us to keep ahead of the feature requests from the many development teams who were using it. One evening, someone said, "Hey, there is this Struts thing that is very similar to what we do and it has some of these things already." I went home and found my way to the Jakarta Web site and downloaded the latest source release.
One thing led to another and the next thing I knew I was asking questions on the struts user mailing list as we started playing with the software and seeing what it would take to convert our apps to use it. After a few months, I found myself answering questions on-list as well and I finally got up the nerve to submit my first patch, which was a documentation fix. At the time, the Apache Struts community was struggling to release version 1.0. I looked around to see what I could do to help and found my way to Apache Commons Pool and DBCP, which Struts was trying to use to replace its built-in connection pool. What I found there was some brilliant but inscrutable code hiding some nasty bugs that Struts needed fixed. At that time, I did not have the Java skills to solve the problems, but I resolved to come back when I did and I watched as others developed workarounds that enabled the Struts community to move forward. I found a welcoming community in Commons and some problems that I could help with. I did eventually make it back to Commons Pool and DBCP, serving as RM for quite a few releases.
During this same timeframe, my $dayjob career was advancing rapidly, thanks in no small part to my aggressive introduction of Open Source software and practices, which was uncommon at the time in financial services. We brought in some ASF committers and their companies to help us build a development pipeline and tooling that was ahead of its time. We applied the Contributor - Committer - PMC member concept to developing enterprise technology standards and strategy. We developed the concept of "earned authority" in technology decision-making, modeled after the idea of publicly earned merit at the ASF. My leadership approach was profoundly influenced by my experience at the ASF, and continues to be to this day. Not a day goes by at work when I do not push for more transparency, more eyeballs on code and more focus on community collaboration and genuine appreciation of diverse viewpoints. I am very grateful to the many ASF community members who have helped me develop as a leader.
Through the years I've met other Apache committers with similar experiences: welcoming projects, friendly communities and great opportunities for personal growth. I’m pleased to see how the ASF has grown and continued to evolve. Every day new contributors join us and new leaders regularly emerge to help guide our communities and the Foundation overall. We all benefit from our experience here and the Foundation becomes stronger as a result.
Phil Steitz is Chairman of the Board of The Apache Software Foundation. He has been an ASF committer since 2003 and a member since 2005. He served for 4 years as Vice President, Apache Commons. Phil also currently serves as Chief Technology officer of Nextiva, a cloud-based business communications company. He has previously held C-level technology leadership positions at multiple software and financial services companies.
# # #
"Success at Apache" is a monthly blog series that focuses on the people and processes behind why the ASF "just works". https://blogs.apache.org/foundation/category/SuccessAtApache
Posted at 09:19PM Mar 05, 2019 by Sally Khudairi in SuccessAtApache | |
Success at Apache: For Love or Money: Volunteer vs. Professional Open Source
EDITOR'S NOTE: "Success at Apache" reflects diverse, personal experiences from our community members, with particular focus on the people and processes behind why the ASF "just works". The post below is the result of a discussion with the author that originated in early September 2018 and remained unpublished as its tone deviates from the general tenor of this series.Over the past few months, this topic has increased in visibility and relevance with the greater community, and we have made an exception in publishing due its timeliness and representation of the evolution of Open Source communities, both within and without Apache.
A few weeks ago, a colleague asked me what I believed to be the biggest threat facing Open Source today. I answered that I think it's full-time Open Source developers, and the effect they have on part-time volunteer developers.
Long ago (it actually hasn't been very long, it just seems that way sometimes) Open Source was developed primarily by part-time hobbyist developers, working on their evenings and weekends on things that they were passionate about. Sure, there were full-time developers, but they were in the minority. Those of us working a few hours on the weekends envied them greatly. We wished that we, too, could get paid to do the thing that we love.
Now, 20 years on, the overwhelming majority of Open Source development is done by full-timers, working 9-5 on Open Source software. And those who are working nights and weekends are often made to feel that they are less important than those that are putting in the long hours.
Most of the time, this is unintentional. The full-timers are not intentionally marginalizing the part-timers. It just happens as a result of the time that they're able to put into it.
Imagine, if you will, that you're an "evenings-and-weekends" contributor to a project. You have an idea to add a new feature, and you propose it on the mailing list, as per your project culture. And you start working on it, a couple of hours on Friday evening, and then a few more hours on Saturday morning before you have to mow the lawn and take your kids to gymnastics practice. Then there's the cross country meet, and next thing you know, it's Monday morning, and you're back at work.
All week you think about what you're going to do next weekend.
But, Friday evening comes, and you `git pull`, and, lo and behold, one of the full-timers has taken your starting point, turned it in a new direction, completed the feature, and there's been a new release of the project. All while you were punching the clock on your unrelated job.
This is great for the product, of course. It moves faster. Users get features faster. Releases come out faster.
But, meanwhile, you have been told pretty clearly that your contribution wasn't good enough. Or, at the very least, that it wasn't fast enough.
The Cost of Professionalism
And of course there are lots of other benefits, too. Open Source code, as a whole, is probably better than it used to be, because people have more time to focus. The features are more driven by actual use cases, since there's all sorts of customer feedback that goes into the road map. But the volunteerism that made Open Source work in the first place is getting slowly squelched.
This is happening daily across the Open Source world, and MOST of it is unintentional. People are just doing their jobs, after all.
We are also starting to see places where projects are actively shunning the part timers, because they are not pulling their weight. Indeed, in recent weeks I've been told this explicitly by a prominent developer on a project that I follow. He feels that the part timers are stealing his glory, because they have their names on the list of contributors, but they aren't keeping up with the volume of his contributions.
But, whether or not it is intentional, I worry about what this will do to the culture of open source as a whole. I do not in any way begrudge the full-timers their jobs. It's what I dreamed of for *years* when I was an evenings-and-weekends open source developer, and it's what I have now. I am *thrilled* to be paid to work full time in the Open Source world. But I worry that most new Open Source projects are completely corporate driven, and have none of the passion, the scratch-your-own-itch, and the personal drive with which Open Source began.
While most of the professional Open Source developers I have worked with in my years at Red Hat have been passionate and personally invested in the projects that they work on, there's a certain percentage of them for whom this is just a job. If they were reassigned to some other project tomorrow, they'd switch over with no remorse. And I see this more and more as the years go by. Open Source projects as passion is giving way to developers that are working on whatever their manager has assigned, and have no personal investment whatsoever.
This doesn't in any way mean that they are bad people. Work is honorable.
I just worry about what effect this will have, and what Open Source will look like 20 years from now.
Rich Bowen has been doing open source-y stuff since about 1995, and has been a member of the Apache Software Foundation since 2002. He currently serves on the ASF Board of Directors. By day, he's the CentOS Community Manager, working for Red Hat. Read Rich's earlier post, "Success at Apache: Wearing Small Hats".
# # #
Posted at 03:18PM Feb 05, 2019 by Sally Khudairi in SuccessAtApache | |
Apache Software Foundation Gold Sponsor Profile: Bloomberg
with Kevin Fleming, Head of Open Source Community Engagement and Member of the CTO Office at Bloomberg
Kevin has spent more than 20 years in the technology industry. In 2004, he started a VOIP service provider company and chose Asterisk as his platform. 9 months later he was offered a position at Digium to work on Asterisk full-time. After seven years of developing and managing the Asterisk project, and helping to design and build the Asterisk SCF project, he moved on to Bloomberg LP, where he works with various teams to help produce and support Bloomberg's open software, used by its customers and partners to integrate with the Bloomberg Terminal. Follow @realkpfleming and @TechAtBloomberg on Twitter.
“ASF's very explicit statement that every participant
in a project is participating personally is really
a big differentiating factor to the other models.”
How did Bloomberg's work with Open Source begin?
We're pleased to be able to participate and encourage more companies to sponsor the Apache Software Foundation. Everyone wins when everyone collaborates.
# # #
Sponsors of The Apache Software Foundation such as Bloomberg enable the all-volunteer ASF to ensure its 300+ community-driven software products remain available to billions of users around the world at no cost, and to incubate the next generation of Open Source innovations. For more information sponsorship and on ways to support the ASF, visit http://apache.org/foundation/contributing.html .
Posted at 02:25AM Jan 15, 2019 by Sally Khudairi in SuccessAtApache | |