The Apache Software Foundation Blog
Success at Apache: bringing the Apache Beam firefly to life
by Julián Bruno
Creating the Apache Beam firefly was the first opportunity I had to contribute my skills as a designer and illustration artist to an open source project. I didn’t know anybody working in open source until I moved to San Francisco from Buenos Aires, Argentina. I knew about open source software for video games, like Unity or Unreal Engine... This allowed gamers to make modifications, like adding new levels or creating new character models, and upload them to the same engine that hosted the original game for other gamers to use. This practice enabled a sense of community, where users can share ideas, passions, and express creativity. There are so many things you can do when you work in collaboration with others. This spirit of community is one of the things that made me excited about contributing to Apache Beam.
Living in an area where technology is everywhere really piqued my interest and drove my curiosity to understand how technology evolves. When the opportunity came to contribute to Apache Beam, I was interested right away. I didn’t know about the project before I got involved, and I certainly didn’t know there was a community behind it, working together to build this amazing solution. Building a mascot for a group of people is different from working for a brand because this firefly represents a group of people and what they find valuable. There is an extra layer that makes it more human. For this type of work, designing a mascot is usually a decision reserved for a small group, and the larger community is not involved. It is refreshing and very meaningful that the community had a chance to step into the process. I saw it as an opportunity for self-expression, participation, and one more exercise in community building.
In order for this process to be inclusive, I built a group-wide communication system for the community to give input during the process. I think that having open and frequent communication was key because, ideally, I wanted everyone to feel that the mascot represents them. I created questions that would help Apache Beam contributors understand what I needed as an illustrator. The questions helped me understand what they liked. This ensured that the mascot was aligned with the community’s taste. Some questions were about the colors and visual styles they preferred, whether the eyes were too big or too small, and which line art style they preferred. There were 4 rounds of feedback, plus a final vote, where 18 people participated. Engagement increased with every new round. The Apache Way for communities to operate reminded me of a lot of animation forums I participated in during the early 2000s. I’m glad to see that some of these practices are still around, because they help make processes more inclusive and build a sense of community.
This communication with the Apache Beam community helped me to create a mascot with features that are unique to the project. When I started, I was given a few concepts that I needed to work with, such as: cute, innovative, fast, data processing, and futuristic. The first few decisions, like making the mascot look as aerodynamic as possible, were easy to make. Conveying "data processing" was a bit harder to figure out, but I eventually chose to communicate this concept by changing the mascot's color. What really gave the mascot its unique identity came from using a Pokémon-like character style. I built the rhetoric for Apache Beam's logo by combining two concepts that have nothing to do with each other, Pokémon and data streaming, and created something new.
In the end, I created the Apache Beam mascot and its model sheet, so that anyone can reproduce it, a version of the mascot learning (a key focus for the project at the moment), and a version of the firefly doing what it does best… stream data! I really enjoyed working for Apache Beam and contributing my skills as an illustration artist to open source. I think the most interesting part is the community: creating something in collaboration with others adds a lot of value to what you are making for the world.
Julián is a digital artist based in San Francisco, California. He has spent over 10 years in the animation industry and has developed his skills in art direction, 2D animation, illustration, and visual art development. His passions include art and cartoon animation, as well as connecting with people and creating new projects. He was born and raised in Buenos Aires, Argentina, where he studied Graphic Design at the University of Buenos Aires (UBA). Find Julián's work on Artstation and Instagram.
= = =
"Success at Apache" is a monthly blog series that focuses on the processes behind why the ASF "just works" https://blogs.apache.org/foundation/category/SuccessAtApache
Posted at 02:41PM May 04, 2020 by Sally Khudairi in SuccessAtApache
Inside Infra: Drew Foulks
The second in the "Inside Infra" interview series with members of the ASF Infrastructure team features Drew Foulks, who shares his experience with Sally Khudairi, ASF VP Marketing & Publicity.
What is your name --how is it pronounced?
My name is Drew Foulks. “Droo Follx”.
If folks were to find you at the ASF, like on Slack or elsewhere, what's your handle? How do they find you?
They'll find me at Warwalrux, spelled with an X, so W-A-R W-A-L-R-U-X.
So, “War Walrus”, but with an X at the end. Where did that come from?
Kind of embarrassing story actually. I got picked on a lot in middle school because I was always really good with computers, but as bad as it sounds, I never really wanted to be. I always wanted to be one of the cool kids, and the cool kids were not into computers. One day I got into a fight at school and one of my friends just absolutely made me lose it afterwards. I was sitting there on the ground crying and he said, "Man, you were a fighting walrus, like the walrus of war or something. It was awesome." I lost it. But ever since then, I’ve just been like, "You know what? I'm not even going to be ashamed about that anymore." I've been that since I started doing tech, which was actually not that long ago compared to the other guys on the team.
How long have you been in tech?
I'm 29, and have been in tech since I was 16, so 13 years.
When did you get involved with the ASF? How did you get here?
I was working at NASA for four and some change years, and I decided that I wanted to pursue some other opportunities because they really were not supportive of that work from home culture. And at the time I had a lot of stuff going on. My wife was sick, my daughter, my youngest, has special needs and stepson actually also has special needs, so being at home was something I had to do. A buddy of mine tipped me off on a Website called We Work Remotely. I ran across your ad there and thought, "There is no way that is who I think that is and I'm going to apply for the hell of it." Surprisingly, two months later, I got a call back.
You do understand how many interview candidates we had, right? A lot of people were competing against you.
It blows my mind. I heard the stories after I got hired and I was just like, "Man, that's nuts." And then when I got hired, I was actually told, jokingly, of course, the ASF was looking to launch its own brand of internet satellite. So that’s why we hired people from SpaceX and NASA.
The Infra guys have such a dry sense of humor! How long have you been a member of the team?
One year and one month, 13 months.
For some reason it feels like you’ve been part of the Apache family for years. What’s your role in ASF Infrastructure? What are you responsible for?
My latest contributions have been the Website builders, so I'm working on helping people migrate off of CMS. Some of the ways that I've chosen to do that are by working with Humbedooh (the handle for ASF Infrastructure team member Daniel Gruno) on his ASF.YAML project, which so many projects seem to be really enjoying.
YAML? Yet Another Markup Language?
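That's it. Yet Another Markup Language. (That was the original expansion; the name was later redefined as the recursive acronym "YAML Ain't Markup Language".)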
So basically, I built the system that lets you build Websites from ASF.YAML and you just specify your Website builder, whether it be Pelican or Jekyll --those are the two that we support right now. And you give it a source branch and a target branch and every time you check in, boom. It builds your website.
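As a rough illustration of the flow described above (pick a builder, give it a source branch and a target branch, rebuild on every check-in), here is a minimal Python sketch. The key names `source` and `target`, the `SiteBuild` type, and the `plan_build` function are all hypothetical and chosen only for illustration; this is not the actual ASF.YAML schema or the Infra team's build code.

```python
from dataclasses import dataclass
from typing import Optional

# The two builders named in the interview. Everything else in this
# sketch (key names, types, function names) is a hypothetical stand-in.
SUPPORTED_BUILDERS = ("jekyll", "pelican")


@dataclass(frozen=True)
class SiteBuild:
    builder: str
    source_branch: str
    target_branch: str


def plan_build(config: dict, pushed_branch: str) -> Optional[SiteBuild]:
    """Return the build to run for a push to `pushed_branch`, or None."""
    for builder in SUPPORTED_BUILDERS:
        section = config.get(builder)
        if section is None:
            continue  # this builder is not configured for the project
        build = SiteBuild(builder, section["source"], section["target"])
        if pushed_branch == build.source_branch:
            return build  # a check-in on the source branch triggers a build
    return None  # pushes to any other branch are ignored


# Assumed, already-parsed configuration for one project.
config = {"pelican": {"source": "main", "target": "asf-site"}}
print(plan_build(config, "main"))
print(plan_build(config, "feature"))
```

In this sketch only a push to the configured source branch yields a build; the rendered site would then be written to the target branch.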
Who is this aimed at?
This is for Apache Projects building their TLP Websites. Take any project: they've all got Websites that they commit to a repo, but some of them are generated via Jekyll, some are generated with Pelican, and some are generated in a custom way with a Jenkins job. It's just how each project has determined to generate its Website, but we're trying to make it easy and provide lots of options for projects to migrate off of the old CMS. Projects are still allowed to choose their own method of publishing or creating a site, but we have to enable all of that to happen.
Did you have to learn this or was this knowledge something that you came into the position with?
I learned it.
Was it difficult? How long did it take you to get this project up?
The Pelican one was a lot harder than the Jekyll one. So, Pelican took a couple of months. Really, Greg had a prototype when I came in that apparently had been kicking around for a little bit, so I tightened it up and pelicanized it. I think it works pretty well. I've not heard any complaints about it.
That took a while because, before, I wasn't doing primarily Python programming; I was doing lots of different ops things, just in a completely different way than what I do now. To be honest, I still haven't wrapped my head around exactly what it is I do here.
Do you mind sharing a little bit about that?
I came here from the government world, which is very siloed. I worked for the OCIO (Office of the Chief Information Officer) Data Center for NASA Langley, which is a very old NASA center. Older than NASA itself actually. Their infrastructure, as you can probably guess, is not the newest: It's 100 years old. They have wind tunnels from the 1920s. There are parts of the infrastructure that are 100 years old and it's insane. Everybody has a specialty, everybody's a subject matter expert in something, and there's nothing more permanent than a temporary government program, so if you take something on, expect to be doing that for the rest of your life. It's very regimented. If you’ve ever seen Hidden Figures, the computational research facility where they’ve opened the Katherine Johnson Research Center, was my data center.
And then to come to the ASF, it's like, "Okay, so we've got like 11 different Cloud providers and these are all the projects that we're supporting. Do you know this, this, this, this or this?" Jenkins, Buildbot, VMware, any of the Docker, Puppet and all that stuff. Do I know any of these myriad Open Source technologies that one doesn't really get to use a lot of in the government sphere? I mean, I've been doing Ansible there for three years.
It was very monolithic. We had VMware. I ran a data center. I had hardware. I had to track all of that. Coming here, everything is completely different. It's like, "We're juggling all these different Cloud providers, and oh, wait: we’ve got to migrate out of this one today, so let's do that. Okay. All right. Where are we going with this?" It's just like there's no end in sight. As technology progresses, so do we. It's just that we do it so much faster than anywhere else I've ever been.
Is that exciting or scary?
Oh, gosh. I've never stopped long enough to think about it. It is a bit of both. It is intimidating for sure, because before it was very siloed. Like I said, I did my thing and I had my interests, my extracurricular interests, running home network setups and private media servers and whatnot. Then I come here and those hobbies go away; now I’m doing that for the Foundation instead.
Yeah, that's cool, though.
It is. I'm a professional hobbyist.
To get paid for doing your hobby is pretty rewarding.
It is. Yeah.
This has become your hobby in a different way, of course, because I'm sure you weren't planning on dealing with ~11 different Cloud providers.
No, I was not.
In our chat with Chris Thistlethwaite last month, we learned more about who ASF Infra serves and the scope of the work that you provide. Can you tell me more about the who and how it works exactly? So, who does Infra serve and in what capacity, or what is it that you guys do? I get that every person's perspective is slightly different, but I keep getting the same "we do it all" answer. Is that true? I mean, so far it sounds like it's true. I guess no one has a reason to embellish, but tell me more.
We serve Apache project developers and development teams. It’s not just the people who sit down and write the code: the people who orchestrate these very complex processes of building, testing, checking, doing the sanity work behind the scenes, the people coordinating releases, PMCs planning out the future of these projects, we serve them, too, and we have to serve them in a capacity beyond, "Hey, here's a build platform," it's: "We support your email communications, we’re there to facilitate the goings on of the Project." Infra's domain is almost everything but the coordinating and writing of code.
Taking care of their code management systems, providing them with the means to do build testing and having it not kill us in the process. That's a big, big addendum to that requirement. Like I mentioned, email, I call them the central services, things like LDAP, authentication, your virtualization services, file sharing, all of those things that make the business of a TLP easy(ish). I am in the business of making life easy for people who do phenomenal stuff. That's honestly how I view my job and it's very, very different than my old one.
In my old job, I had one customer who I bent over backwards for; here, it's very much, "Listen, my job is to provide these services and to facilitate what you guys do, not do it for you." Drawing that line sometimes becomes difficult for me personally because I don't have as much experience in the ASF, I think. But that seems to be a skill that the other guys have: knowing when to bounce back and say, "No, this is definitely a PMC or a PMC issue that you guys should be dealing with because it sets a bad precedent if I make this decision. I'm not going to do this work for you." It wouldn't be right to pollute a project like that.
What you're saying doesn't come across as odd. One thing that I always want to know is how ASF compares with other infrastructure operations in general. Chris had said this also, here you have 300+ projects and all sorts of different groups that you're interfacing with, so it's a completely different type of interaction. Your response is totally legitimate: it takes a certain type of personality to be able to handle that because most people would likely be overwhelmed and run away. The fact that you're here and thriving and our projects are expanding is awesome.
Thank you. You can thank my wife for not letting me run away.
Based on my understanding, as a team you're autonomous yet coordinated. Is that the right way to describe how you work together?
Yes. That is a good way to describe how we work together.
Do you feel like that model works or do you think something else should be happening or how does that work for you?
That's a tough question because I'm not sure that the answer would make any sense, but I'll give it a go anyway. By constantly talking with each other, the team gets a sense for the direction that we need to be heading. Leadership is very organic and not spontaneous, but they're like a current guiding us towards the goal, really, whatever that is, so all of the decisions that we make on the daily really kind of help us towards that goal, because fighting the current is difficult.
In a lot of ways that long-term coordination is really facilitated by this, I'm going to call it “on a current of progress”. It's not forceful. That's kind of what it feels like. The team is driving towards something, it's not random, to be honest with you. It's typically a goal that we have in mind, but all of the work that we do is just like, "There's a cool idea that I had related to this, so let's just work on that." And we end up getting there. It's crazy.
Describe your typical workday. Are you on a rolling schedule? Do you guys work on a shift? How do you get it all done --and you're down one person now-- how do you get it done?
I have no idea. So really, personally, I have a nine-hour-a-day schedule that I follow every day. So basically I start work and I break it up into two- or two-and-a-half-hour chunks and I do four of those, take little breaks in between, try to keep myself sane, try to throw in a dog walk. Really, I just approach it like I approach any other job, one ticket at a time.
Do you work in shifts? How do you cover those 24/7? How do you balance the load?
So there's a one-week on-call rotation. So right now there are the... gosh, how many of us are there? Five? Anyway, so there's a one-week on-call rotation and that person is on 24/7 for the week, Monday to Monday. And then after that, it's pretty much just you cover your time zone. Yeah. So the scheduling, it's so loose that I mean really as long as you're putting in your eight hours a day, nobody really cares when you do that. I choose to have that nine-hour work day because of the kids, really. It's fantastic for having a family, but if you want to jump on at 1:00 in the morning and work for six hours, that's fine.
OK, so as long as someone's there, and it doesn't have to be you, you can work on your own timeframe. Are you guys usually slammed? Is it low-level? Is there a busy time for Infra on the whole? Is it like tax season if you're an accountant, or is it constantly just 24/7/365?
It's pretty much 24/7/365, but we do definitely have “seasons” as well. We do a one-week on-call rotation, so somebody's always on, but the scheduling is very relaxed. So, it's optional, the hours you'd like to keep. I choose to work a work day because of the family and that just kind of fits in nicely actually. Some people may decide, "I'm awake. It's 1:00. I can't sleep. I might as well get some work done," and they do that. And I've certainly done that before. So, yeah, it's pretty whatever and we're all kind of, I don't want to call us workaholics because I think that's a bad word, but we're all …
I don't know that I've called them busy seasons as much as busy cycles.
What are they? What triggers them?
Typically? Releases. The most tickets coming in is when some project is putting out a build or is putting out a release. For a large project release, we'll have a lot of tickets sent in because they're utilizing a bunch of resources and stuff gets backed up. That's typically it.
So whoever is on call during that time period, it's really their responsibility to handle: it's not like when Apache Wombat or whatever Project has an issue, it becomes “Drew's issue”. You're not assigned to a project to facilitate that, it's whomever is there will help them however possible, correct?
Yeah. And I think that you said it earlier: everybody that you've talked to says that we do it all. I'm going to tell you that we do it all. It's every project from Apache Zeppelin to Airflow, whatever the first one is. That's not our work.
I don't know if this is actually the case, but I'm curious: is it possible for an ASF Infra team member to be an introvert or do you all have to be “client-facing”? I know that we don't have an office, and you see people from time to time at ApacheCon, but do you have a wall that you can hide behind or do you have to interface with people all the time?
Did you go to the end for Lightning Talks?
I was not at Lightning Talks at ApacheCon/Vegas, but I heard it had quite an activity that happened there, Chris told me about it during his interview, let's put it that way. No one said anything to me up until that interview, so I was surprised. Fill me in with some more. What do I need to know?
[laughing] So, an introvert and two extroverts that are way too drunk, get up on a stage in front of people and proceed to just make fools of themselves for a minute. That's pretty much it.
I guess I know who the introvert was.
Yeah. So the original plan was to go up there and make thunder noises because that is the sound of lightning talking. That was a fun experience. Not one that I would do again, I think but it was fun.
Let's go back to the daily schedule for a minute. This is always a curiosity for me for anyone who's super busy, which is pretty much everyone at Apache: how do you keep your workload organized? Your structure for your day is very impressive, I have to say, this two-and-a-half hours times four. I think it's fascinating. But your actual workload, for example, you get one of these huge releases, how do you manage all that?
Okay, so the first part of my day is typically spent organizing my day, as awful as that sounds. We get so much email that I'm pretty sure it's literally impossible to read it all, so the first order of the day is to sift through that while you drink your coffee, because there's no way I can get through all of it. I catch up on the stuff that the team has been talking about, catch up on all the Slack channels, look at my tickets, prioritize my workload, and that usually takes about an hour. So right at 8:30, I'm ready to actually start doing stuff. Then it's usually tickets and then a break. I don't like to check my email too terribly often, because I think it gets me off task. I wish I could check it just three or four times a day, but that's not really something I have the luxury of being able to do all the time, so I do have to monitor my Ubuntu alerts as emails come in, scanning for anything important. But yeah, it's ticket work for the first half of the day, project work for the back half of the day. Right after lunch, I'll sit down and figure out where I am on my project, and then try to move forward from there. Typically, that involves research, but yeah, I like to spend the last couple of hours of my day trying to do something. So, typically project work, because I don't like doing ticket changes at the end of the day.
Why is that?
Well, if you're going to nail your foot to the floor, don't be surprised when you can only run in circles.
I presume when you do ticket work, more things come out of it, too, so it never ends.
Yes. Typically, ticket work involves making a change of some sort, to something that's actually being used, whereas project work is kind of this nebulous, unused, non-production thing.
I'm hearing that you need to know a little bit about everything in addition to your own areas of expertise. How do you stay ahead of the curve? How do you learn about everything that you need to know especially if you don't know what you need to know? How do you do that?
I don't think that you do stay ahead of the curve. I really don't. I think that we do our best to ride it. Getting ahead is so immensely difficult. This technology essentially fractalizes into these many different various facets of high computing.
From virtualizing, networking, programming, you have all of these facets. Nobody can really, truly stay ahead of the curve. I mean, holy cow, the guys in the Infra team, they are all 12-pound brain-type dudes. They'll go from talking about hardware specs to talking about virtualization. They'll bounce around all these different facets of technology, and obviously you have strengths and weaknesses, I don't think anybody can really stay ahead of the curve at this point, and I feel like it's been a long time since anybody has. Technology has just gotten so complicated. We've really tried to, without specializing too much ... kind of pick out some of the non-essential fluff, the stuff that we don't use. I mean, hypervisors aren't really like super in these days. It's all about the Cloud, which is really just an abstract hypervisor, but whatever.
So, we don't really have any “machines” anymore, spec-ing out a physical machine is not something many of us do very often. It's not part of our job anymore, but that's definitely one area of technology that continues to advance as they put out better processors and whatnot. Mostly we try to stay ahead on the DevOps side of things without focusing too much on this operational infrastructure portion. And that's where I came from, this operational infrastructure, the data centers, the servers, the hypervisors, making VMs for people. That's what I used to do and now it's a lot less of that and a lot more fine-tuning this nebulous system of intermeshed tools that I don't fully understand yet.
Seeing that you and others can't stay ahead of the curve, can ASF Infrastructure actually stay ahead of the demand? I mean, is there any way you aren’t constantly in a reactive mode of “this new thing we're responding to, or here's a new part.” Can you get your house in order, or is the house in order?
At the ASF, especially Infra, we do a very good job of listening to our projects because we as individuals cannot stay ahead of the curve *and* have every good new idea that there ever was to be had. Our community is large, and our community is very smart as people and as a group. We have a lot of really excellent ideas that come in from tickets and you say, "You know? I think I'm going to look into that today." And you look into it. You realize that it has all this potential and suddenly, that's the service that we're now using. Some things, like Travis, which is a third-party build validator, came to us in that way.
Since I've been here, some of them have come to us via tickets, where it's been, "Hey, I saw that GitHub has this new thing, you should check it out." So one of us will check it out and we’re like, "Dude, that's awesome. We should use that." I think that we're constantly being batted in front of the curve by a community, by a boots-on-the-ground community that knows what's up. We obviously have our own interests and our own passions, but I don't think if left to our own devices, it would look quite the same as if Apache TLPs couldn't put in tickets.
So it's been one year and one month, but how has Infra changed for you since you've come on board or has it changed?
Nope, still terrified. [chuckles]
How is the team coping with the ASF's unstoppable growth? We have 45 projects in the incubator and there's more than 300 projects out there … there's a geographic influence now on demand, an increase in users and committers and projects from China, for example. Are there any issues that the team feels like, "Oh boy, we got to deal with this?" Is computing an international language, where it doesn't matter where you're from or what's happening? Are any shifts going on from the ASF’s growth impacting you guys beyond more of what you're already doing?
So, typically, all of my jobs really have been this kind of larger, national or international affairs so basically, since I was 20. I worked for a really large mortgage company, and then I left there and I went to a massive health insurance company. Lots of international folks and so, aside from the language barriers, yeah, I would say that computing is kind of an international thing. As far as the unlimited growth, I don't really know. I'm not sure. That sounds like a question that I would definitely advise you to go ask one of the board members about.
You had mentioned that you were working on the no-longer-CMS project. Is there another project that you're doing? Are you a go-to guy for something?
I don't think I'm the go-to guy for anything really. I just try to pick up whatever is there to be picked up. One of the things that I'm working on right now in the “demise of CMS” project is this custom builder. I'm still working on it, so it’s still a work in progress, but the idea is that you'll be able to have a custom build environment that would allow you to, from the ASF.YAML file, write a script, do a “thing” to create your own custom build environment so that we can really, really make a hardcore concerted effort to get off CMS.
Why? What was the issue with CMS? Why do we have to migrate from it? What was the problem?
To be honest with you, I've never actually used CMS. Fortunately, I have never been asked to. John (former Infra team member John Andrunas) was, but I was not. I was spared by the CMS gods; they shone their countenance upon me. It was pretty awesome. From what I understand, it's very cumbersome to use and not very friendly and also very old. My understanding is that although it works, there are changes we wish we could make to it that we cannot, so it might be time to just move on to something newer that maybe works a little bit better for us because our use case has changed.
You're still rather new to the role: when you first came on board, what was the biggest challenge or surprise? What really opened your eyes?
So, what really opened my eyes was how much of a learning curve there is. Man, that was rough.
Is that still the case?
Yes, that's still the case. It's just not as bad as it was. Where I was before, I was using all of the stuff that we're not using here, all the Enterprise Edition stuff. So I came in with a completely different toolbox than what I was handed, so the learning curve was massive. I had to relearn how to use the automation software and we were all Splunk, so I had to learn the ELK stack stuff and we were Ansible or they were Ansible, the Foundation is using Puppet. Just all of it down to the monitoring. We didn't have any third party monitoring because, “government”: we had this really unfathomably convoluted Xymon setup, which was interesting but we were using RCS for everything. So instead of git or subversion or even CVS.
Yeah, they're stuck with their legacy, that's for sure.
Yeah. You got text files in there that have got 10,000 versions in RCS. It was like, "Oh, my God. What am I going to do with this?"
So, I tried to implement some of the new hotness there. The git workflow, gitflow, actually, the exact same kind of thing that we do here.
I had a good understanding of how ASF did business from an operational standpoint. I understood it, because I've helped implement it elsewhere, but this is the first time I've ever been fully immersed in the river of PRs and tickets and all that other stuff, so it's been a hell of a learning curve, like it has really, really kicked my butt.
But you're kicking it back. I mean, you're here. You're making it work.
Oh, yeah, hustle, man. That's really all you’ve got to have is hustle.
As you're describing the way the ASF is and you were talking about some of the tools and the orchestration requirements, is this a common thing that Infrastructure today in general is heading in that direction, or is it an anomaly not only from your personal experience, obviously, but that is an anomaly but from the way you see the industry? Does “infrastructure” in general seem to be headed in this direction, or is ASF really a unique animal in that way? Do people really have to be more jack-of-all-trades?
So the ASF is a unique animal. It is. Typically, people don't have 11 Cloud providers and if they do, they've usually got some sort of system underpinning all of that whereas ours is tribal knowledge and text documents and we're really trying to get this knowledge codified and our technical writer Andrew Wetmore was really doing a kick ass job with that. But, yeah, typically an infrastructure team of this sophistication would probably have a different set of tools.
It's surprising that we're not using tools like Vagrant, Packer, and Terraform, which abstract the way Cloud providers make VMs. We still make them by hand. It's work, and really the only way to be good at that is to know what you're doing and to be confident in that particular UI, which is always its own special kind of awkward, trying to get used to a new UI, finding out where all the options are, and we're doing all these things by hand … everybody just picks up this knowledge through osmosis, just by stumbling through these tickets from time to time and it's really crazy to see sometimes how much process there is and how little documentation there is. So I'm really happy to have our documentation writer on board.
That's Andrew, right? Andrew Wetmore is working on the documentation?
Oh, yeah. Yep, and he's doing a really good job, helping us sort it out.
And he hasn't left screaming and running either, so that's a good sign. It's a lot of work.
That's true. Yeah. It is. It is a lot of work and he has not left running, but he is a really chill dude.
Our infrastructure is unique in that we do all of the things that are kind of necessary. There really isn't too much of a go-to guy for any of this stuff. If there's a problem in the build system, you take care of it. If there's a problem with a Web server, you take care of it. That's where the autonomous nature of Infra comes in. If there's a problem, you just take care of it. You have these tools, you know how to do it, you just do it.
How do you know that someone's not fixing it on their own at the same time? If something's broken, you're like, "Hey, this is broken. I'm dealing with it" or something else?
Just Slack, typically. I always check.
Yeah. Okay, what's your favorite part of the job?
Oh, gosh. My favorite part of the job is not feeling icky at the end of the day. I've worked for some companies that kind of made me feel a little ick in their mission. So one of the stories that my wife likes to tell is that I quit [MEDICAL INSURANCE COMPANY] because I disagreed with them as a company and I paid $5,000 to do so. But yeah, so I worked in the mortgage industry a little while shortly after the housing collapsed and I just thought about it. It was like, "Man, I really don't feel good about this job anymore." And then I moved to [REDACTED], which was arguably a bad move.
I was there for like 11 months. I signed a contract, I got a sign-on bonus, I moved to get there, so the stipulation was I stayed a year. I stayed 11 months and three weeks and I quit. I couldn't take it anymore. I'm just like, "I'm not doing this. I'm not doing this."
I was working on an image parser for the Affordable Care Act pipeline, which was awful. They were still implementing it. This was 2012, 2013.
It was really bad. So after that, I went to NASA and I finally felt good about what I was doing and to have made a move where, again, I agree ethically and morally with what we're doing. I mean, it really is noble work, not specifically the work that I do, but the work that the people that I support do, and so, by proxy, my work is also.
At Apache, we have volunteers that dedicate hours of their life to these projects that we distribute freely because it really does make the world a better place. I mean, where would the world be without HTTPd?
What you just said right now has totally touched me. I feel like I’m ready to burst into tears, that's amazing. Really: I mean, wow. That's from the heart. I totally get you about doing things for people you don't believe in. That's so hard.
That sucks so much.
I totally get it and you're right. This is such a crazy group. It should not work and it does, and it's incredible: 21 years of this. It's amazing.
Yeah, it's like trying to watch an eight-legged horse run.
[laughing] A what?!
An eight-legged horse. Somehow twice as fast, but you have no idea how it's working. Or which direction it's going to go.
I can’t stop laughing over the visual of that.
It's actually really funny because I'm a huge classics and mythology nerd. Technology was not my first choice in careers. I wanted to be a Latin teacher.
I love this. These are the backstories that everyone wants to know. You want to be a Latin teacher?!
I wanted to be a Latin teacher, yeah. I did Latin from freshman year in high school until I decided that college wasn't for me, so sophomore year. I took six years of Latin, and it is really awesome what learning Latin does for your programming ability because it’s surprisingly similar to learning to code. But yeah, I make a lot of really, really stupid classics and mythology puns. So my daughter, her nickname is actually Livy, in reference to the famous historian, which is not something a lot of people get, but that's okay, it makes me chuckle. And Odin had an eight-legged horse that was twice as fast as the other horses, supposedly really fast because it had twice as many legs.
It's interesting with your career, you've worked at places that are big names and people would be very impressed with that, but you're stressing that just because it's a big name or big group, it's not what it's all cracked up to be. What are you most proud of with your career, your Infra career, with Infra as a whole? What makes you say “yay”?
To be honest, becoming an Apache Member was pretty freaking awesome. When I got here, when I start a new job, I always try to set a goal for that job. Sometimes I get it and sometimes I don't, and sometimes I don't realize how hard it is to actually do what I'm setting out to do when I start. My goal at NASA was to win a silver Snoopy, but that was never going to happen.
Silver Snoopy? What’s that?
That's an award given by astronauts to engineers. They don't typically give that to IT folks, but I didn't know at that time.
But here, it was to kind of become a Member and really to be accepted. I feel like I'm doing okay on that. That's pretty cool. That's going along really well.
You fast tracked. I mean, if you've been here for 13 months and you're in as a Member, that's pretty cool. That's good timing, good performance on you.
Well, thank you. I have no idea of how well or badly I am doing. I'm just doing things in the hope that they affect the universe in a positive way.
You're there, we couldn't do it without you.
That's excellent. Thank you.
You got to pat yourself on the back for the work that you're doing, because with our community, you know if you weren't doing it, you'd hear it. People would grump about it.
That's true. That's very true. But again, a mindset that's really prevalent in IT is the Tetris mindset: when you're playing Tetris, you fill up a row and it disappears. Those are your successes.
The Tetris mindset really is being bogged down by the monument to failure that you've built because really, when you're playing Tetris, that's what you're looking at is the monument of your failure, places you haven't quite gotten the row completed yet and shifted out of your bucket. And it's really easy to succumb to that mindset, especially in a place like this.
And I really, really enjoy the fact that the Apache community seems eager to call out wins for other people, and that is an awesome attitude for a community. Being called out for successes is something I've not experienced a whole lot of. I think that on the whole, the community and being embraced by the community has really kind of helped me not fall into that funk. That Tetris mindset just doesn't seem to be prevalent in this community, which is nice.
Do you think that puts people in a kind of "I'm not good enough" mindset because there's not a reward? You're young enough to be part of that community that likes or is accustomed to getting trophies for showing up. Apache doesn't allow that. It's nice for you to show up, but you're not going to be rewarded. Do you think there's an impact with that?
I was on a soccer team once and I did get a participation trophy. You know what? I couldn't even tell you what the name of that soccer team was because I didn't want to play soccer. So, really, I think that if you're coming to The Apache Software Foundation, you're not doing it for the participation trophy, you're doing it because you want to, so the reward doesn't matter. You're doing it because you want to. It's really weird to be surrounded by people who are motivated by nothing other than the fact that they want to be here doing this.
And it's refreshing and I love it. I do.
I love hearing that, that's great. Here come the somewhat personal questions: there's just a few of them. Chris was laughing hard when I was asking them; I don't know if you read the full Chris interview, but it's always interesting to hear what they have to say. So ... how would your co-workers describe you?
Less cool than my wife.
What is your greatest piece of advice... what would you tell aspiring infra people, sysadmins, people like yourself, what would you give them for work advice or career advice or life advice: what would you say?
Oof, that's tough. I guess I would have to say that if at the end of the day you don't feel like your job is worth it, it's probably not.
So, if you're going to do something, make it worth it. That's my advice.
If you had a magic wand, what would you see happen with ASF Infra?
What would I see happen? Well, obviously bonuses and pay raises, but I have no idea. If I had a magic wand, I'd probably turn it over to someone who I thought could make the wish better than I could, but yeah, I have no idea.
What else do we need to know that I haven't asked?
Oh, gosh. So many things, but none of them would make sense out of the context of this particular conversation. To be honest, I'm still under the impression that everybody knows more about this than I do still, so I don't know.
Drew is based in Tennessee on UTC -5. His favorite thing to drink during the workday is a black coffee prepared using a French press or the pour-over method.
# # #
Posted at 10:45PM Apr 27, 2020 by Sally Khudairi in SuccessAtApache
Success at Apache: Welcoming Communities Strengthens the Apache Way
by Jarek Potiuk
During my career, I have been a software engineer, Tech Lead Manager at Google, a robotics engineer at an AI and robotics startup, and am currently the Principal Software Engineer of a software house, Polidea, which I helped grow from 6 to 60 people within 6 years as CTO. Over the past year and a half I was a user, then contributor, then committer, and now a Project Management Committee (PMC) member of Apache Airflow.
Although I took on many roles through the years, including being the main organizer of the international tech conference (MCE), deep in my heart I was always a software engineer. It took me many years to find a place where I could explore my true potential. Then I became part of the Apache community. I first learned about the Apache Software Foundation (ASF) 20 years ago when I used the Apache HTTP server at the beginning of my career. I had only made small contributions to OSS projects up to that point, and becoming involved with Apache Airflow was the first time I contributed seriously to one. As a Principal Software Engineer at Polidea, several of our customers were using Apache Airflow and wanted to contribute back to the project to help other users. Better integrations of services with Airflow would significantly improve future releases of the software.
The needs of our customers made me and my team go from users of the project to contributors and more. We have people in our software house who understand open source, know how to follow the OSS rules, and contribute changes from customers to help other people in the community. We know how to communicate well, we can also represent a vendor-neutral point of view because we represent the view of several customers and collaborate with all the stakeholders in the project. People in our company also contribute to other OSS projects, such as Apache Beam and Flink.
We also discovered a great model where our customers wanted us to contribute to an open-source project to make it better because they were using it and wanted to improve it for future users. This allowed us to do it full time (or even 150% of the time if you add all the out-of-hours contributions). I invite you to read about it in my blog post The evolution of Open Source - standing on the shoulders of giants.
Committing to Apache
I found exactly what I was looking for in the Apache Software Foundation. It’s a great organization for people like me: individual contributors who are also good at working with others, the ones who don’t shy away from organizing and making things happen, who thrive when they can do meaningful work with others.
This made me think: since the ASF is so great, how come for 20 years I was not contributing to OSS projects more? And since so many software engineers use Apache technology, why is participation not more common? I got lucky because I was in a position that allowed and supported my contributions to an open-source project. For me, it’s a dream come true. But what about others? There must be more people willing to contribute and get involved in the OSS community; they probably just don’t know how to go about it yet, or did not like the experience.
So here I am, sharing my thoughts on what can be done to help others to get to know ASF sooner and get involved.
Apache Airflow and the initial experience
Apache Airflow is an exciting project. It is a platform created by the community to programmatically author, schedule and monitor workflows. It started at Airbnb in 2014, was submitted to the ASF incubator in March 2016, and graduated to a top-level project in January 2019.
When I started working with the Apache Airflow project, I quickly realized that it was hard for me to contribute to. It was not clear how to develop and debug Airflow, how to start, and how to communicate. The project had a number of channels for communication including a developer list, a Slack channel, issues and pull requests alongside the code. As a newcomer, it was not easy to understand which channel is used for what and whether it’s OK to raise certain issues using those channels. The common protocols were not clear either: for example, how to tell that one thread is a discussion and another is a vote on an already discussed topic.
Is our community welcoming enough?
At a party after a conference where I spoke about Apache Airflow, I had a long discussion with a young engineer who was new to the field, Fabian. Fabian told me that often OSS projects create some invisible barriers around communication and onboarding. I explained to him the “Apache Way” and how transparency and openness help with those barriers, fiercely protecting the fact that “we are open”. We carried on with our friendly discussion and it was really eye-opening.
That conversation stayed with me for a while. After some time, I realized that maybe our project was not as welcoming and accessible as we thought: there should be an easier way for people to contribute and join the community. I recalled my own case: when I joined the community, I made a mistake by writing that something “will happen” before discussing it with the community. A long-time community member reminded me that this is not the way we should communicate at Apache Airflow. (Just to note: each project is autonomous within Apache and defines its own communication rules.) The reminder was given in a very good and friendly tone and I took it as a lesson. But not everyone has the determination, experience, thick skin and willpower to overcome all the obstacles, and some people might be put off by such responses, even if they are nice and friendly.
Could we do better to communicate the ideals of our community more straightforwardly? Without coming across as harsh? Maybe we could find a way of explaining to the future contributors how they should communicate rather than do it by trial-and-error?
Becoming a more welcoming community
A few days and emails after the discussion with Fabian, I started a thread on the Apache Airflow developer list, “[NON-TECHNICAL] [DISCUSS] Being an even more welcoming community?”, which kicked off a conversation that included people who had rarely spoken before. As a result, we managed to introduce many changes to the processes for new members. Thanks to the input of people such as Karolina Rosół (Project Manager at Polidea), we came to the conclusion that the way seasoned community members communicate at Apache Airflow is not obvious at all to newcomers. We added missing chapters to our CONTRIBUTING documentation regarding communication channels, expected response times, and more.
What helped a lot was that we were able to improve our documentation for the development process during last year’s Google Season of Docs program. Apache Airflow was one of the first projects at the ASF to participate in the program. I was one of the mentors to Elena Fedotova, a technical writer assigned to our project. She improved and restructured the documentation, and made it more readable and easier to understand. Many people took part in reviewing and correcting the docs. We also took on the task of creating a new website for Apache Airflow with a modern, clean design and a well-thought-out UX built with different visitor personas in mind (including new contributors, users and potential partners of Apache Airflow). One of my colleagues and fellow Apache Airflow committer, Kamil Breguła, put enormous effort into both building the website and restructuring the documentation.
As a result of the discussion on the developer list, we also introduced mentorship options and even handled (via mentorship) a few difficult cases that could have lost us valuable contributions. A great example of the improvements we’ve made as a community might be this tweet from Vanessa, a research software engineer who had no experience with the community. Vanessa had tried to contribute support for Singularity (a popular container technology for high-performance computing) a year earlier, and came back for a second try after much of this work was done:
Are we there yet?
Looking back, it’s been a long (yet satisfying) journey trying to make the Apache Airflow community more welcoming. But how do we know it works?
At the beginning of this year, we started to participate in the Outreachy and Google Summer of Code programs, where people from around the world with different backgrounds can be paid for contributing to open source projects. My friend and PMC member Kaxil Naik and I became program mentors and started to receive a flood of requests from the Outreachy members. Initially, we were overwhelmed, but we soon realized that we had everything we needed to answer the questions of the candidates (and future committers), so that they could teach themselves and easily follow the “contributing” documentation. The contribution environment was available for them to get started, and the documentation detailed how to prepare contributions and communicate via the various channels.
Just two days later, we approved a few pull requests from those people! That’s quite a difference from 1.5 years ago when it took days, if not weeks, to understand the environment and how to work with Apache Airflow. It was truly a team effort; many community members participated in the process and made the Airflow project much more welcoming to newcomers.
Despite having challenges during our experience getting started, I was never going to quit. I loved the project and people almost from day one. Realizing how hard it was initially to start contributing (other people had told me so as well), I decided that I would put a lot of effort (both professionally and also personally) into making the project easier and more open and accessible for people with different backgrounds and experiences. My experience starting as a contributor, then becoming a committer, and now a PMC member proves that this is possible.
To me, Success At Apache means making the community and the spirit of the Apache Way more accessible to people around the world. With the difficult times that we are going through now with COVID-19, it’s more important than ever to build and strengthen various communities. And to strengthen the community means to be open to others and be welcoming. We hope that our experience will encourage you to take a look at your project and see if you can make your community more welcoming.
# # #
Jarek Potiuk started to work on the Apache Airflow project in September 2018. He became an Apache Airflow committer in April 2019 and a member of the Apache Airflow Project Management Committee (PMC) in October 2019. He is an Apache project mentor in Outreachy and Google Summer of Code and was a mentor in Google Season of Docs. Jarek is a Principal Software Engineer at Polidea and always keen on making it easier for people with different backgrounds to join OSS projects.
= = =
"Success at Apache" is a monthly blog series that focuses on the processes behind why the ASF "just works" https://blogs.apache.org/foundation/category/SuccessAtApache
Posted at 09:46AM Apr 06, 2020 by Sally Khudairi in SuccessAtApache
Inside Infra: Chris Thistlethwaite
"Inside Infra" is a new interview series with members of the ASF Infrastructure team. The series opens with an interview with Chris Thistlethwaite, who shares his experience with Sally Khudairi, ASF VP Marketing & Publicity.
Let’s start with you telling us your name --how is it pronounced?
It’s “Chris Thistle-wait” --I don’t correct people who say “thistle-th-wait”-- that’s also correct, but our branch of the family doesn’t pronounce the second “th”.
What’s your handle if people are trying to find you? I know you’re "christ" (pronounced "Chris T") on the internal ASF Slack channel.
Yeah --anything ASF-related is all under "christ".
Do people call you "Christ"?
They do! I first started in IT around Christmastime and was doing desktop support and office-type IT. When people started putting in tickets, and my username was "christ" there, they were asking "why was Christ logging into my computer right now?" and it became a thing. When I was hired at the ASF I told Greg (Stein; ASF Infrastructure Administrator) about that story, he said "you gotta go with that for your Apache username."
When and how did you get involved with the ASF?
A long time ago I started getting into Linux and Open Source, and naturally progressed to httpd (Apache HTTP Server). Truth be told, that’s where it started and stopped, but I’ve always been interested in Open Source and working with projects and within communities. Three years ago I was looking for a new job and stumbled across the infra blog post for a job opening. I fired up an email, sent it off to VP infra and that’s how everything started. The ramp up of the job was diving deep into everything there is with the ASF and Open Source --which I am still diving into. I don't think I've found the bottom yet with the ASF.
How long have you been a member of the Infrastructure team?
This November will be my fourth year.
What are you responsible for in ASF Infrastructure?
Infrastructure has a whole bunch of different services that are used by both Apache projects as well as the Foundation itself: the Infrastructure team builds, monitors, supports, and keeps all those things running. Anything from Jenkins to mailing lists to Git, SVN repositories and on the back end of things we keep everything working for the Foundation itself within, say, SVN or mailing lists, keeping archives of those things, keeping your standard security and permissions set up and split out. Anyone you ask on the Infra team will say: "I do everything!" It's too hard to explain --it's quite possibly a little bit of everything that has anything to do with technology --as broad as it can possibly be.
So you really have to be a jack-of-all-trades. Do you have a specialty, or does everybody literally do everything?
Everyone on the team generally does everything --for the most part any one of us can jump into the role of anyone else on the team. Everyone has a deep knowledge of a particular or a handful of services that they’ll take care of --like, Gavin (McDonald; ASF Infrastructure team member) knows more about Jenkins and the buildbot and build services than most people on the team. At any one given point we’re on call and need to be able to fix something or take a look at something, so everyone needs to be versed enough in how to troubleshoot Jenkins. That can also be said for not just services that we offer, but also parts of technology, like MySQL or Postgres or our mail system or DNS: we do have actual physical hardware in some places, and we have VMs everywhere too, so sometimes we’re troubleshooting a bad backplane on a server or why a VM is acting the way it is. There's a very broad knowledge base that all of us have but there are specifics that some people know more about than others.
How does ASF Infrastructure differ from other organizations?
There are a lot of similarities but a ton of differences. A big part of how Infra is different is, to use a "Sally-ism": if you look at it on paper, it wouldn't work --I've heard you describe the ASF that way. If you explained the way things work at the Foundation to somebody, they would literally think that you're making it up and there's no way that it would possibly be working the way that it does. There's a lot of that with the Infrastructure team too: many people that I keep in contact with that I've worked with over the years, from my first job where we would buy servers, unbox them, rack them, wire them up, set them up, and run them from the office next door to us --I'd be impressed whenever I had 25 servers running in our little "data center" at that job, and now I talk to these guys about what we do at the ASF: we have 200 servers in 10+ different data centers that are vendor-agnostic and we make it all work. They ask: "how the heck do you do that?!" We just do --it's an interesting thing as to how it all works together because we solve problems that others have as well, but their problems are often centralized to one thing, or a data center that they control and own, or one cloud provider that they control and own, where they deal with a single vendor and possibly at most have to talk with the same vendor in two different geographical areas. We're having to deal with stuff with one cloud vendor that's a VM and other stuff on the other side of the world that's actual hardware in a co-location or data center running and the only thing that makes them the same is that they're on the Internet.
It's a good summation of the team too due to the fact that we’re all based out of worldwide locations, we’re not all in one spot doing something.
Describe your typical workday. Since you're all working on different things on such a huge scale, what's it like to be you?
"It's amazing" [laughs]. Everyone on the team generally has some project or projects that they are working on --long-running things for Infra.
I'm currently working on rewriting a script for Apache ID creations. The process of putting your ICLA in, sending off to the Secretary, the Secretary says, "OK good," puts in all your data, and that gets put into a file in SVN ...currently, we have a script that we manually run that does a bunch of checks on the account and whatnot, and then creates it, sends off a welcome email, whatever. I'm rewriting that because it's an old script, it's in several different languages. It's actually six scripts that all run off of one script. Consolidating that into one, massive script, that's in a supported language for us, and then moving forward with it into something that we could potentially automate, versus me having to run a script manually a couple of times a day.
Fluxo (the ID/handle for Apache Infra team member Chris Lambertus) was working on some mail archive stuff in our mail servers. Gavin (Apache Infra team member Gavin McDonald) is working on some actual build stuff. Everyone has kind of "one-two punch" tasks that they work on during the day, and then the rest of the time is (Jira) tickets or staying on top of Slack, if people are asking questions in the Infra channel or in our team channel or something like that. The rest of it is bouncing around inside the ASF and checking things out, or finding out new projects to work on, or ways to improve such-and-such process.
How many requests does Infra usually receive a day, in general?
Over the past three years, we've resolved an average of 6 Jira tickets a day, year-round. We've had 213 commits to puppet repositories in the last 30 days. We handle thousands of messages on our #asfinfra Slack channel, and have had 659 different email topics in the last year.
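To put those figures on a common per-day scale, here is a small back-of-the-envelope calculation. It uses only the numbers quoted in the answer above; the 365-day year is a simplifying assumption, and the results are rough rates rather than official statistics.

```python
# Back-of-the-envelope rates from the figures quoted in the interview.
# Assumption: a 365-day year (leap days ignored).
DAYS_PER_YEAR = 365

tickets_per_day = 6            # "an average of 6 Jira tickets a day"
years = 3                      # "over the past three years"
total_tickets = tickets_per_day * DAYS_PER_YEAR * years

puppet_commits_per_day = 213 / 30               # 213 commits in the last 30 days
email_topics_per_day = 659 / DAYS_PER_YEAR      # 659 email topics in the last year

print(f"Jira tickets resolved over {years} years: about {total_tickets}")
print(f"Puppet commits per day: about {puppet_commits_per_day:.1f}")
print(f"New email topics per day: about {email_topics_per_day:.1f}")
```

In other words, the team closes roughly 6,570 tickets over three years, alongside about 7 puppet commits and close to 2 new email topics a day.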
Dovetailing that, how do you keep your workload organized?
Everyone on the team does it their own personal way. I have a whiteboard and a Todoist list. We also have Jira to keep our actual tickets prioritized and running. We have a weekly team meeting/call where we talk about things that are going on, which is the more social aspect of what we do week-to-week.
How do you get things done? You're juggling a lot of requests --what's the structure of the team? How do you prioritize when things are coming in? Is there a go-to person for certain things? If you're sharing everything, how do you balance it and who structures it? How does that work?
To one end, the funnel to us starts with Greg and David (ASF Infrastructure Administrator Greg Stein and VP Infrastructure David Nalley). It's different from other places that I've worked, where I'm on a team of other systems administrator/engineering people, and we have a singular, customer-facing site. Someone says, "Hey, this should be blue instead of red," there's a ticket and we make the change and then it goes to production.
There're many different ways to get a hold of the Infrastructure team. Everyone gets emails about Jira tickets and gets updated as soon as one of those comes in. If it's something that you know about --say, the Windows nodes that we handle-- those all fall into my wheelhouse because I'm the last one to work with Windows extensively. Everyone else knows how to work with them, but it makes more sense for me to pick it up in some cases.
Most of the stuff in Jira are very "break-fix" kinds of things. A lot of the requests on Slack are too, for example: "DNS is busted," and we fix DNS. It's a very quick, conversational, "Let me go change that," or, "I'm going to go fix that real quick." Of course, some of the Jira tickets are very long-running, but the end result is they're fixing something that used to work.
We were originally running git.apache.org, and Git WIP, so we hosted our own internal Git servers and we would read-only mirror those out to GitHub. Somewhere along the line, Humbedooh (the ID/handle for Apache Infrastructure team member Daniel Gruno) started writing out Gitbox or building Gitbox based on the need to have writable GitHub repositories. He built Gitbox and set up with the help of some other people on the team, got it going, and that became our replacement for git.apache.org. While we still host our own Git repositories, people are free to either write to ours or write to GitHub, and the changes are instantaneously mirrored between the two.
We had Git hosted at the ASF, and had GitHub as a read-only resource. The need arose to have read/write access on both sides: Humbedooh went and built out MATT (Merge All The Things), which does all of the sync between GitHub and our Git instance.
MATT started a while ago, and Humbedooh added on to that to do the read/write to GitHub. Basically what all that does is once your Apache ID is created, or if you have one already, you go on ID.apache.org, you add your GitHub username in there and then MATT --there's another part of that called Grouper-- MATT and/or Grouper will run periodically, pull data from our LDAP system and say, "Oh, ChrisT at apache.org has ChrisT as his GitHub ID. I'll pull those down." It says, "ChrisT is in the Infrastructure group. Hey look, there's an Infrastructure group in GitHub. I'll give ChrisT write access to the GitHub project." In a nutshell, that's what that does.
There's a ton of other house cleaning things, if you get removed from the LDAP group ... we run LDAP and keep all this stuff straight. If you get removed from the Infrastructure group at LDAP then MATT/Grouper will go and say, "Oh, this person's not in this LDAP group but they do have access in GitHub. Let me pull that so that they don't have access to that any more." It does housekeeping of everything as well as additions to groups and that kind of thing. There's a ton of technical backend to that, and that's what Humbedooh's doing.
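The add-and-remove housekeeping described above is essentially a reconciliation between two sets of memberships. The sketch below illustrates that idea only; it is not MATT or Grouper code, and every name in it (`plan_sync`, `SyncPlan`, the dictionaries standing in for LDAP and GitHub data) is a hypothetical placeholder. A real tool would read from LDAP and call the GitHub API rather than use in-memory dictionaries.

```python
from dataclasses import dataclass, field


@dataclass
class SyncPlan:
    """Changes needed to make GitHub access match LDAP (hypothetical structure)."""
    grant: dict = field(default_factory=dict)    # group -> GitHub usernames to add
    revoke: dict = field(default_factory=dict)   # group -> GitHub usernames to remove


def plan_sync(ldap_groups, github_ids, github_teams):
    """Compare LDAP group membership with current GitHub write access.

    ldap_groups:  group name -> set of Apache IDs in that LDAP group
    github_ids:   Apache ID -> GitHub username (only people who registered one)
    github_teams: group name -> set of GitHub usernames that currently have access
    """
    plan = SyncPlan()
    # Only groups that already exist on the GitHub side are considered.
    for group, current in github_teams.items():
        members = ldap_groups.get(group, set())
        # People in the LDAP group who have linked a GitHub username.
        wanted = {github_ids[apache_id] for apache_id in members if apache_id in github_ids}
        to_grant = wanted - current      # in LDAP, not yet on GitHub
        to_revoke = current - wanted     # on GitHub, no longer in LDAP
        if to_grant:
            plan.grant[group] = to_grant
        if to_revoke:
            plan.revoke[group] = to_revoke
    return plan


# Example data (made up for illustration).
ldap_groups = {"infrastructure": {"christ", "humbedooh"}}
github_ids = {"christ": "ChrisT"}                    # humbedooh has not linked an ID here
github_teams = {"infrastructure": {"former-member"}}

plan = plan_sync(ldap_groups, github_ids, github_teams)
print("grant:", plan.grant)
print("revoke:", plan.revoke)
```

Computing a plan first and applying it afterwards keeps the comparison logic easy to test without touching either system; whether the real tools are structured that way is not stated in the interview.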
At first when Git and GitHub were set up, it was fine: the ASF has to keep canonical everything about what goes into each project. You could only write to our Git repos. Then it was conveniently mirrored out to GitHub because there's a lot of tools that GitHub has that we didn't have or weren't prepared to set up. GitHub has a very familiar way of doing things for a lot of developers. Once GitHub Writable came along with Gitbox and the changes to MATT, that opened up a whole other world of tools for people on projects to use. If they wanted to use pull requests on GitHub, they could start using pull requests on GitHub to manage code. They could wire up their build systems to GitHub with Jenkins so that whenever a PR was submitted and got approved, it would kick off a build in Jenkins and go through unit tests and do all the lovely things that Jenkins does.
It was really an evolution of, "Here's the service that we have. Someone, somewhere, be it infrastructure or otherwise, once they have writable GitHub access, here we go." And here's the swath of things that now opens up to projects inside the ASF: they could come and set up a project with us, and then never, ever actually commit code directly to the ASF. It would always go to GitHub but still be safe and saved on our Git servers for ASF project reasons.
At the same point, we saw a need and said, "Let's build this out and go." Another funnel that comes into us is when we're on-call, something breaks and we ask, "Why do we do it this way? We should be doing it a different way." We then come up with a project to fix that or build it. It's a very interesting process of how work gets into the Infrastructure team.
It's been an interesting ride with that one.
There's always stuff that we're working and fixing and making better. For the most part, Gitbox as it is now is kind of in a state of "It's getting worked on". If there are bugs that need fixed, it gets fixed, but I don't know what the next feature request is on Gitbox. There's talk of other services ...like GitLab. If someone wanted to write code and put it in GitLab as opposed to GitHub, then someone would need to come in and write the connector from Gitbox to GitLab. So it's possible. I don't know if that's necessarily an Infrastructure need as much as it is a volunteer need for infra. But it's a system that can be set up to any other Git service as long as someone goes in and writes that.
You brought up an interesting point here, which is volunteers. Do volunteers contribute to Infra also?
We sometimes have volunteers, yes. We have a lot of people on the infra mailing lists that will bounce ideas back to us or they'll work on a ticket or put in a pull request.
Well, the need is not as critical because you have a paid team, versus Apache projects.
Right. That's exactly true. There's a bit of a wall that we have to have because we work with Foundation data, which not everyone has access to. Granted, we're a non-profit, Open Source company and everything's out there to begin with, but usernames and passwords of databases and things that we have encrypted that the team has access to isn't necessarily something that you would want any volunteer to have access to.
How do you stay ahead of demand? This is a really interesting thing because part of it is you're saying, "Necessity is the mother of invention." You guys are doing stuff because you've got those binary, "break-fix" types of scenarios. In an ideal situation, do you even have enough runway to be able to optimize your processes? How do you have the opportunity to fix things and improve things as you're going along if you're firefighting pretty much all day long?
That's a really good question about just how our workflow is. In other companies that I've been in, there's the operations people that are doing the "break-fix", and then there's the development people that are doing "the next big thing". The break-fix folks are spinning the plates and keeping them spun without breaking, and that's a lot of firefighting. That's literally all that job is. Even when you're not firefighting, you're sitting around thinking about firefighting in a sense of, "When is this going to fall over again? If it does fall over, what can we do to fix it so it doesn't do that anymore?" And in the past, the break-fix guys, the firefighters, would end up saying, "Hey, there's this thing that needs fixed." And it would fall over the wall to the developers. They would develop the fix for it, and then it would go back into production and then the cycle continues.
To some extent, that's kind of where DevOps came from: if you merge the two of those together, then while you're firefighting you can also write the fix for the problem, and then you don't have to wait for the lag between the two. We don't have that split here. Everyone on the team is firefighting with one hand and typing out the solution with another. And a lot of the times our project work, like getting a new mail server spun up or my task to rewrite the workflow for new Apache ID creations, I've been working on that for a very long time because it will keep falling off ... it gets put on the backburner while we're like, "Hey, we found out that our TLP servers are getting hammered with downloads from apps and people trying to use them instead of the mirror servers." So, let's set up downloads.apache.org and we can funnel stuff over to that so that that server can get hammered and do whatever it needs to do so that our www. site and all the Apache Project websites stay up and running in a more reliable way.
What's the size of the teams that you were dealing with before that had a firefighting team and a dev team versus ASF infra?
The last "big" corporate job I had was ...six ops people that kept the site going, four database people, another eight technical operations-type people… all-told it was about thirty.
There were technically thirty firefighting people and we had a NOC (network operations center) that was literally people that only watched dashboards and watched for alerts. Whenever those go off, they’d call the firefighting people. The NOC was another 20 people. And then the development teams were ... twenty to fifty people.
What kind of consumer base were they accommodating? Does it match the volume that ASF has? Was it more of a direct, enterprise type of, "We have a customer that's paying, we have to respond to them" situation? Or is it different?
This was at a financial services company that transacted on their Website: completely different from the type of stuff we're dealing with here at the ASF. Volume-wise, they were much smaller, but it was much more ...visible, as their big times were at the start of market and end of market. After end-of-market came all the processing for the day to get done before markets started the next day. The site had to be up 100% of the time. We had SLAs of five minutes. If you got paged or something broke, you had to get the page and respond to it in a way of, "Hey, this is what's going on and these are the people that I need involved with it," all within five minutes of it going off. That was the way the management structure was. It was intense.
In scale, Apache probably does way more: they do way more traffic across all of our services in any given day. If someone doesn't get mail for a little bit, then they come and tell us or we get alerted of it by our systems, and we go and we fix it and we take care of it. But with the financial services group, people were losing money: dealing with people and money is just a very stressful situation for anyone working in technology because you have to get it right and it has to be done as fast as possible before someone's kids can’t go to college anymore. It was a completely different minefield to navigate.
The type of stress that's involved or the type of demand or the pressure is different, but you also have the responsibility with ASF that systems have to be up and running. I understand it's not mission critical if something goes down for more than five minutes, which is different in the financial sector, but do you feel that same type of pressure? Is it there or is it completely different for you?
No. I think I do because we also have SLAs here: they're just not five minutes. We have structure around that and the way that we handle uptime and that kind of thing. I get very attached to the technology that I'm working with and the communities that I'm working with, so if a server goes down or a site's acting wonky, I take that very personally. That reflects on how I do my job. If a server's not working or if something's broken either because of me or something externally that's going on, I want to get that up and running as fast as possible because that's how I would expect anyone to work in a field that has ...any technology field, for that matter. And generally, that's the same attitude the rest of the team has as well.
How has ASF Infra changed over the years?
It's matured quite a bit. When I first started, it was Gavin, Fluxo, Humbedooh, Pono (former ASF Infrastructure team member Daniel Takamori), and me. There were five of us. The amount of stuff that we got done, I'm like, "Man, there's no way that five people can do this."
That's kind of what I'm pointing at. If you're a team of eight or five or twelve or whatever, compared to the other thing that you did with the other job that had maybe a core team of twenty, thirty --that in itself is insane.
We were five people, everything was very, "Here's the shiny thing we're working on," and then something else would come up and we'd have to jump on that. Then something else would come up and we'd have to jump on that. We were very ...I don't want to say we were stretched thin, but there wasn't necessarily ...time for improvement.
There was a lot of stuff we had still in physical hardware, and a couple of vendors that we no longer use. But things were moving more towards a configuration-based infrastructure with Puppet instead of one person building a machine, setting up all the configs themselves, installing everything and then letting it go off into the ether to run and do its job. We were moving everything towards Puppet to where you configure Puppet to configure the server. So then if the server breaks, or goes down or goes away or we need to move vendors or whatever, all you need to do is spin up a new server somewhere else, give it the Puppet config, it configures itself and then goes off into the ether to run and do whatever it needs to do.
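The idea of describing a server once and letting it configure itself can be illustrated with a toy model. This is a sketch in Python rather than Puppet's own language, and it is not how the ASF's Puppet setup is written: the `DESIRED_STATE` contents, the `converge` function, and the package and service names are all made up for illustration.

```python
# Hypothetical desired state: what every server of this kind should look like.
DESIRED_STATE = {
    "packages": {"httpd", "ntp"},
    "services": {"httpd": "running"},
}


def converge(server, desired):
    """Bring `server` (a dict standing in for a machine) in line with `desired`.

    Returns the list of changes made; an empty list means nothing was needed.
    """
    changes = []
    installed = server.setdefault("packages", set())
    for pkg in sorted(desired["packages"] - installed):
        installed.add(pkg)
        changes.append(f"install {pkg}")
    services = server.setdefault("services", {})
    for name, state in sorted(desired["services"].items()):
        if services.get(name) != state:
            services[name] = state
            changes.append(f"set {name} to {state}")
    return changes


new_server = {}  # a freshly spun-up machine with nothing on it
print(converge(new_server, DESIRED_STATE))
# ['install httpd', 'install ntp', 'set httpd to running']
print(converge(new_server, DESIRED_STATE))
# [] (already in the desired state, so nothing changes)
```

The second call returning no changes is the point: the same description can be applied to a brand-new replacement server or re-applied to an existing one, and the end result is the same.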
That's great. More automation.
Right. We were automating a lot more stuff right when I first started. Over the course of the next year, the team kind of ebbed and flowed a little bit until we were eight in the last year. We started to get to the point of "where can we point the gun to next? What can we target next to get it taken care of and done?" That's where we started taking on more specific infra projects, for instance, mail. Our mail server has been around since the dawn of time, and it's virtualized so it moves servers every now and then, but the same base of it is quite old in technology standards.
Fluxo started moving this on to newer stuff and he got that going. We started taking care of projects that were not broken, but needed to be worked on. Instead of waiting for it to break, we're fixing and upgrading and moving down that path versus firefighting, break-fix, that kind of thing. We were moving more towards, "Hey, I see a problem. I have time. I'm going to take care of that and make that into a more serviceable system."
Automation has helped quite a bit with that. I also think that just as the team grew, it got to a point where tickets were getting responded to quicker, and emails and chat were responded to quicker. And then we also could focus more on the tools that we use for the foundation. Like, HipChat was going away. We needed a new chat platform, so we chose Slack. And then we updated and moved everything over to Slack, and that's where we are with that. It started falling into its own with workflows of like, "Oh, okay. How do we get this done? Let's go do that."
What areas are you experiencing your biggest growth? Is it a technical area? Like, "Hey, all of a sudden mail's out of control"? Or, "Hey, we need to satiate the demand for more virtual machines," or is it a geographic influence that's coming in in terms of draw? Where are you guys pointing all your guns to?
Currently we're trying to get more out to the projects and talk to people more often. Not that we didn't do that before, because ApacheCons and any Meetups that we had, Infra would always have a table. We were always accessible, but we were always passively accessible. We weren't really going out and talking to projects proactively to say, "Hey. What do you guys need from us? What are we doing with this?" So I think that's one part of it, that I think that we're moving towards a little bit. It's not at all technical, but more of a foundation broadening, community broadening thing that we're doing.
That's one part of it. The other thing that we're doing, from a more technical or infrastructure standpoint, is we're really trying to get our arms around all of the services we provide, and then really take a look at those and say, how is this used inside the ASF? How is it used in the industry as a whole? Do we need to put more time and energy towards those things in order to make the offerings of the infrastructure team a little bit more of a solid platform? Generally, that ... and on top of any other automation and that kind of stuff, I think those are really the two spots that I see infra growing in a lot in the next year-ish, just really boiling down our services to, "Hey, we've seen a lot of people using this. And a lot more projects are using this. It's not just a flash in the pan. We need to build out more infra around blah service, so let's really do that and make that a solid platform to use."
What do you think people would be surprised to know about ASF Infra? When you tell someone something about your job and they go, "Whoa, I had no idea" or, "That's crazy." What would people be surprised to know?
That Apache has an infrastructure team. [laughs]
Why are you saying that?
Because honestly, I don't think a lot of people know about the Infrastructure team. Those that do, have used us for something, not used us for something, have talked to us about something, and worked with us on something. Those that don't are like, "Oh, I didn't know the ASF paid people to be here," --that kind of thing. That's kind of the two reactions I've got from people. It's like, "Oh, that's cool. You work for the infrastructure team." Shrug. And then the other people are like, "Oh, sweet. Yeah, that's great. I know Gav. I've worked with him on blah, blah, blah." But that's not necessarily surprising. I mean, it is in a sort of way.
When people ask, "What are you doing for work?" and you say you work for ASF, do people even know what that is? Do they know what you're doing? Do they care? Are they like, "Oh, okay. Whatever"?
There's literally three types of people that I've run into that ask, "Oh, what are you doing for work?" One person is the person that has no idea what the ASF is, not even the vaguest hint of Apache, and they're like, "Oh, okay. That's cool." There's that next person that does, and may or may not know about the ASF but knows of Apache, the Web server, or some other lineage of that. They're like, "Oh, whoa. That's super cool. It's impressive." That's wild. Then the third people ask "Why are ‘Indians’/Native Americans running software? That doesn't make any sense to me" and "Are you on a reserve?" I swear to God I've gotten that question before. I don't even know how to answer that. I'm like, "No, buddy."
Are these technologists or are these just guys off the street? Are they in the industry?
Guys off the street. I say Apache Software Foundation, and they're like "Apache" and "software" doesn’t make sense. Actually I've gotten mean tweets too whenever I've been tweeting about being at ApacheCon. Things like I'm "taking away" from Native Americans and whatever...
We also get that on Twitter, on the Foundation side: we get included in tweets about some kind of violation along the lines of, "Stand up for the ..." I get it. From time to time we also get sent these "How dare you?" letters, that sort of thing. It's an interesting challenge, the whole issue of the "why do Native Americans run this thing?" misinterpretation.
Let’s move on. What's your favorite part of your job?
The whole job is my favorite part of the job.
That's funny because everyone at Infra ... You know how people have bad days or may be grumpy or whatever, in general you guys seem to all like each other. You all have a great camaraderie. You all get along. You work really closely together. It's a very interesting thing to see from the outside. Is that true? Or are you just playing it up? Does it really work that way?
That's absolutely true. I've found that generally speaking, when you get a bunch of nerds together, they either really like each other and everything works or they really don't like each other and nothing gets done. The team is great, and it's like no other team I've ever worked with before. But it's very odd because you go through the interview process, and the interviews are interviews. I mean, you get to know people in interviews, but not really. Then you start working with people, and at some point you start getting below the surface. And at some point you get deep enough to where you find out whether or not ...how you gel with all these people.
It's very odd that all of us have the same general sense of humor. We'll talk about food non-stop in the channel, and recipes and cooking, and different beers or different whatevers. It's nice to get to that point with a team that you're comfortable enough with everybody to ... like I said, I've been here three years and there is still so much that I don't know, both technical and non-technical, about the ASF. I ask very dumb questions in channel and say, "I have no idea why this is doing this this way," or, "Can someone else take a look?" or, "I don't know what I'm doing here." And never in the entire time I've been here, from the day one until now, has anyone ever chastised me for not knowing something or said anything about the way that I work or something like that. Well, at least not in channel. At least not publicly.
Everyone's very supportive. It doesn't matter if you know everything there possibly is to know about one singular product or thing you're working on, or don't know anything about it. You can ask questions and really learn about why it was done the way it was done, or figure out how to fix a problem. No problem on the team. It's just like, "Okay, yeah. This is what you have to do." Or, "Here's a document. Read up on it." Or, "I don't know either." And then out of that comes an hour of conversation and then a document pops out, and then the next person that asks, we can say, "Here, go read the doc." Yeah. I mean, we're all very happy. Very happy.
Which is really good. Looking back when you first started, what was your biggest challenge when you came onto the team?
Oh man. I look back at that and I feel like the learning curve was ... It wasn't a curve. It was a wall. I've used Linux, I've used Ubuntu for a while and various other flavors of Debian and whatnot, so getting spun up on all of ...expanding my Linux knowledge was a big deal, expanding everything about the ASF and how it works. Which I'm still trying to figure out. If you know, send me something to read to figure out how that all works. I mean, I don't want to sound like I was completely out of my depth and I have no idea what I'm doing, but I feel like I was completely out of my depth and I had no idea what I was doing.
There's a lot about the ASF that is just tribal knowledge, and there's a lot about Infra that's tribal knowledge. It's just no one has anything written down --"the server's been running under Jim's desk for the last 15 years in a basement that has battery backups and redundant Internet, so it's never gone down. But don't ever touch that server, because if it goes down, then all of our mail goes down" or whatever. There was a lot of figuring all that out for myself and digging around. Which is, frankly, one of the parts that I really enjoy, is just, "Hey, this thing broke. I've no idea what that thing is. I've no idea where it lives," and just diving in and trying to figure out what's going on with it and how it's built, and then the hair trigger that sets it off to crash and never work again. Yeah. That's an interesting question too.
What are you most proud of in your Infra career to date? You're talking about overcoming these challenges, I'm always curious just to see what people are like, "Yeah, I'm patting myself on the back for that one" or, "Ta-da. That's my ta-da moment."
I did lightning talks at ApacheCon Las Vegas and didn't get a phone call from you when I was done. [laughs]
I wasn't at lightning talks --what did you say? What would make me call you?
I didn't say it. We were on stage, and it's John (former ASF Infrastructure team member John Andrunas), Drew (ASF Infrastructure team member Drew Foulks), and I, and we figured we'd do lightning talks: "Hey, we're the new guys: ask us infrastructure questions." A week or two before ApacheCon, there was a massive outage at a particular vendor. It wasn't: "Oh, our server's down for a while," the server went down and then it was *gone*. It got erased from the vendor side. I can't remember what service it was. There was something that disappeared two weeks before Vegas and never came back.
It wasn't just us, though: tons of companies had this issue. So we're on stage answering questions, and someone asks where this service went: "What happened to XYZ?" And John has the mic and he goes, "You should probably go ask [vendor name]." At that point it was very widely published that the vendor's response was like, "Whoops, someone tripped over the cord that powered the data center. And when it came back up, it deleted all of your VMs." They totally acknowledged it and they didn't give refunds for it, so it was a little bit of a PR kerfuffle for them. The vendor is in the other room handing out buttons and stickers, and John was like, "Oh yeah, go ask the [vendor] guys what happened to your server. That's their fault." He said it jokingly, but my jaw dropped.
[laughs] No one told me this story. No one said anything. Someone's trying to protect you. I had no idea this happened ...oh my gosh.
Well, David Nalley was in the back of the room, and he's screaming with his hands cupped around his mouth, "Don't badmouth the vendor and the sponsors." I deflected and quickly moved onto something else. [laughs]
But yes, that's another good question that I haven't actually reflected on. Looking back and seeing where Infra was when I first started and where it is now, it was a very runnable and very good team then, and it's a very runnable and it's a very good team now. I feel like a lot of the work that I've done and a lot of the work that the team has done over the last three years has been getting from a spot of "everything's on fire, who's holding up what this weekend?" to things being stable and us nitpicking on whether or not something needs to be updated or not. That's huge. That's a big step from like starting a company and treading water to being profitable and having resources to do other things versus just keeping your employees paid. I mean, it's a big step for a company and it's a big step for Infrastructure.
I love your talking about how you guys are tightly-knit and all that. How would your co-workers describe you?
The other odd part about that too is being completely remote and not having day-to-day, face-to-face interactions with people. You get a very odd sense of people through text for a 24-hour period that you're online reading stuff. It's a different perspective than if I was in the office every day, working on something and interacting with people. Even though every day, except for the weekends, I'm online talking to these guys and doing stuff. How would they describe me? Dashingly good looking and ... I don't know. [laughs]
I know that Infra's "just Infra," right --you guys are all under the Infra umbrella. Do you have a title? When you got hired, what do they call you?
We're all systems administrators. The only person that actually has a title is Greg, and he's Infrastructure Administrator.
What are the biggest threats you face? For infra folks or systems administrators or infrastructure administrators even, what do you need to watch out for these days? What's big in the industry? Is everyone saying, "Oh, XYZ's coming"? In terms of your role in the job: is there something that you need to keep your eye on? Is there something that you would advise other people, "If you're in this job keep an eye out for blah, this is a new threat" or anything along those lines?
General scope stuff. 16 years ago, everything was hardware: you bought hardware and you had to physically put it somewhere. And virtual machines came along about the same time. People were starting to do virtual stuff to where you could have a physical machine and then multiple machines running on that, sharing resources. Then cloud and infrastructure as a service, and everything's been moving more and more towards that over the years.
Of course, there's still people that work in office IT, doing desk support stuff or office infrastructure type things. Those are still a majority of how things run at companies. As everything is moved more towards the cloud or hosted services, more systems administrators are becoming more like software engineers. And software engineers are becoming more like systems administrators. They're kind of melding into one, big group of people. Now of course, there are still people that only write software. But gone are the days where it used to be someone would write some code and say, "I need to deploy it and get it out to all these computers." They would write the code, they'd hand it off to a systems person. Systems would go and configure on whatever server to get it out to however many machines and hit the button and go. The software developer never really needed to know hardware specifics of the systems that it was going to run on. And the systems people never really needed to know what software packages this was getting put together. There's exceptions to that, but for the most part ...
Over the years, it's fallen into a thing now where the software developer knows exactly what systems this is going to run on and how it's going to run there, so it's more efficient and things work better and they're releasing less buggy code based on the fact that they know they're closer to the hardware. And the systems people, they want to troubleshoot it more and work with it and fix problems because they're closer to the software and know more about its internal workings and how it's going to run on systems. Everything is getting more and more chunked down into, first it was VMs, then it's cloud, then it's containers with Docker and things like that, and it's going to get more virtualized down into that. Knowing about Docker orchestration and things like Kubernetes and Apache Mesos. The reality is other people run Kubernetes, people run Docker, people run everything. That's the interesting thing in terms of how they do it at ASF. We don't require folks to do just one thing.
In terms of where the industry's going ... everything's getting pushed down to "a developer can work in a container on a set of systems, write software for that and then deploy that to a machine themselves, never involving a systems engineer at all, and build a product using that." It's getting stuff out the door faster, and it's also keeping the unicorn of the industry a while to go ... even today, I developed this thing, it works on my machine. If I move it over to another computer, it stops working. Why? What's the problem with that? Containering or containers fix that problem. The container you run on my system runs the same way as it does on every system everywhere. It takes the "runs on my machine" thing out of the equation.
What's your greatest piece of advice? What would you tell aspiring sysadmins?
Part of the ASF is the community behind it, and a giant part of that is what makes it work. I mean, you could say all of it. That's what makes everything work with this. Right when I first started the sysadmin kind of thing, I didn't get into Meetups and Linux Users Groups and any of that stuff. I didn't get into the network. I didn't go into the community that I had around me. And honestly, I don't know if that's because it didn't exist or because I didn't know about it or what, but now that I'm older and wiser, the community part of it is really ...there's a massive benefit to that. Aside from socialization, or networking and how to get a better job through networking, getting together with like-minded people and talking through your problems is an amazing tool to use. And I didn't do that enough when I was a sysadmin starting out, and looking back it's something that I sort of regret not doing, was really sharing knowledge with other people in the community and building a group of people that I could ping ideas off of, or help with other ideas, or share in the knowledge of, "Hey, this is what's going on in the industry" or, "Hey, I saw this at work the other day. How do we work around that?" or that kind of thing. It's much easier these days with social media: the never-ending amounts of social media. But it's a big, important part of my day-to-day now, that I wish I had 16 years ago.
That's powerful. OK, If you had a magic wand, what would you see happen with ASF infra?
If I had a magic wand, I'd update our mail server instantly or maybe magic wand a few other projects.
Wait. I know you're joking, but what is the problem with the mail server?
It's running on an older version of FreeBSD that doesn't play well with our current tools. Some form of that server has been upgraded, patched, moved, migrated, etc. for the last 20 years. We want to bring it up to more modern standards. Mail runs fine for the most part, but it's probably the most critical service we have at the ASF and we want to make sure everything continues to hum along. Because of that, it's a huge project that touches a ton of different parts of our infrastructure.
How big is it?
It's all of our email. Every email that goes through an apache.org address.
This is a huge project and Chris (Lambertus) has been working on it for a while --it's not a simple thing to fix. It's very, very complicated. We couldn’t do it without him.
Back to the magic wand thing: I'd wish for more wands.
Chris is based in Pennsylvania on UTC -4. His favorite thing to eat during the workday is chicken ramen.
# # #
Posted at 11:09PM Mar 31, 2020 by Sally Khudairi in SuccessAtApache
Success at Apache: Google Summer of Code Mentorship --inside the GSoC 2019 Mentor Summit
by Sanyam Goel & Kevin A. McGrail
Sanyam first came to the ASF as a Google Summer of Code (GSoC) student in 2017; since then he has become a committer and contributor to Apache Fineract and active participant with Apache community initiatives. Sanyam, along with Kevin (a.k.a. “KAM”), a long-time ASF Member involved with the Apache Incubator and SpamAssassin projects, were selected to represent the Apache Software Foundation at GSoC’s 2019 Mentor Summit.
Google Summer of Code is a global program focused on introducing students to open source software development. Students work on a 3-month programming project with an open source organization during their break from university.
Since its inception in 2005, the program has brought together 15,000+ student participants and 25,000+ mentors from over 118 countries worldwide. Google Summer of Code has produced 36,000,000+ lines of code for 686 open source organizations.
As a part of Google Summer of Code, student participants are paired with a mentor from the participating organizations, gaining exposure to real-world software development and techniques. Students have the opportunity to spend the break between their school semesters earning a stipend while working in areas related to their interests.
About the ASF and GSoC: “The Apache Software Foundation has been a GSoC mentoring organization every year since the program’s inception. As a mentoring organization, the ASF is able to draw attention and new talent to many of its projects; Apache projects benefit from contributions and galvanize new community members by mentoring students; and students have an invaluable opportunity to gain experience by working directly with the individuals behind Apache projects. This, in turn, enriches the Apache community as a whole, and furthers the ASF’s mission of providing software for the public good.”
At the ASF, GSoC is overseen by Apache Community Development (“ComDev”), the committee that welcomes new participants to the Apache community and mentors them in “The Apache Way”. Former ComDev VP and Google Summer of Code administrator Ulrich Stärk, along with Apache OpenMeetings VP and GSoC mentor, Maxim Solodovnik, helped lead the ASF’s participation in GSoC this year, with the support of numerous Apache community members.
The ASF provides an established framework for intellectual property and financial contributions that simultaneously limits contributors’ potential legal exposure. Through a collaborative and meritocratic development process known as “The Apache Way”, Apache projects deliver enterprise-grade, freely available software products that attract large communities of users. The pragmatic Apache License makes it easy for all users, commercial and individual, to deploy Apache products.
As we gear up for Google Summer of Code 2020, we wanted to take a moment and share some of the experiences from last year’s GSoC!
In Google Summer of Code 2019, 23 students were selected through careful analysis and ranking. 17 students successfully completed their Google Summer of Code projects with the support of 45 mentors spread across dozens of Apache projects that include Allura, AsterixDB, Beam, Camel, Fineract, Gora, Kudu, Mnemonic, Nemo (Incubating), OODT, SpamAssassin, and more.
Quick Report on the GSoC 2019 Numbers for Apache.org:
Accepted projects: 23
1st evaluation: 22 passed, 1 failed
2nd evaluation: 17 passed, 5 failed
3rd evaluation: all passed
Total Apache Mentors: 45
Sanyam and KAM were lucky enough to be selected as the delegates of the Apache Software Foundation for the GSoC Mentor Summit & the 15th GSoC anniversary.
On 10th March 2019 we got our invitations from Google: “You have been invited to be a Mentor for The Apache Software Foundation in Google Summer of Code 2019”.
With this invitation comes a great deal of responsibility for mentoring students. For Sanyam, it was his first time mentoring at this level and driving a complete project with a college student.
Sanyam: “Although I had mentored juniors in college at the university level, this was my first time providing complete guidance throughout the GSoC period. I also learned how to manage the project and how to play the role of project lead so that the student and I could complete the project within the timelines.
I was really excited to meet the Google Open Source team in person and Kevin A. McGrail (KAM), along with 332 mentors from 162 organizations and 42 countries, to share ideas about open source and to discuss our experiences of GSoC 2019. I would like to thank Ulrich Stärk and Maxim Solodovnik for serving as organization admins for the ASF community.”
Day 1: Thursday | Munich, Germany - Marriott München
Day 1 of the summit started with checking in to the Marriott Hotel, where we met the Google OSPO team near the hotel entrance and reception.
The Google OSPO team was very welcoming and greeted every mentor with a goodie bag and a mouth-watering sweet.
At the reception, we met Mario Behling from the FOSSASIA community along with mentors from various organisations like Mifos Initiative, SCoRE Labs and DBpedia, and we talked about the pocket science project.
Then we all headed to lunch, where we dove into discussions about OSS and how umbrella organisations manage student applications to select students for Google Summer of Code.
The GSoC Mentor Summit started with an opening reception dinner and opening notes from the Google OSPO team, which led into a small game called the person scavenger hunt. Its sole purpose was to help mentors from different organisations meet, interact, and talk more about open source over drinks and food.
Day 2: Friday | Munich, Germany - Fun Day (City Scavenger hunt / Castle Tour)
To celebrate the 15th anniversary of GSoC, Google allocated an extra day at this year's mentor summit for fun activities like a castle tour and a city scavenger hunt.
Sanyam participated in the scavenger hunt, where groups of mentors explored the city on their own to find clues, and the top 2 teams won prizes. Sanyam was lucky enough to be on the winning team. Other mentors, like KAM, went on a really nice castle tour, thanks to our host, Google.
The day ended with informal conversations among the mentors over dinner and games in the ballroom of the Marriott.
Day 3: Saturday | Munich, Germany - Unconferences (Yay!!)
Day 3 was one of the most exciting days at the event. We had a lot of sessions organized by different organisations in the form of an unconference, which is “a loosely structured conference emphasizing the informal exchange of information and ideas between participants, rather than following a conventionally structured programme of events.”
Mentors organized the unconference sessions on Saturday and Sunday. The unconference slots were planned with two rounds of lightning talks but ended with three rounds of lightning talks :-). A lightning talk is a 3-minute slot in which an organisation presents the work from its GSoC 2019 and GCI 2018 projects. KAM also presented a lightning talk for the ASF and Apache SpamAssassin on Saturday morning.
After lunch, all the mentors and the Google OSPO team gathered on a lawn just outside the Marriott for a group photograph.
[“GSoC 2019 Mentors Photo”]
We were involved in various unconference sessions, such as:
How to get more Women interested in FOSS
The Fundraising Session (Presented by Kevin A. McGrail)
Source code preservation
Google Season of Docs (GSoD)
Intro to licenses and why we need them
After attending all the talks, we also discussed how to retain students after the completion of the GSoC period.
After the last lightning talk, we all spent some more time together enjoying dinner, playing foosball, striking funny poses in the photo booth, and visiting the famous chocolate room, where mentors from across the globe shared chocolates from their home countries with each other. (Oh, did we forget to mention the famous chocolate table? This year, Google managed to have a complete room of chocolates!)
Day 4: Sunday | Munich, Germany - Final day :(
Unfortunately, it was the last day of the mentor summit. The day started with a continuation of the lightning talks; Sanyam and KAM managed to attend almost all of them and got to know more about the other GSoC organisations and their amazing projects from GSoC 2019.
We attended some more unconference sessions on the following topics:
GCI Info & Feedback with Google
GSoC Feedback session
Breaking the barrier for the newcomers
Interviews at Silicon Valley
Then we all headed for the final lunch of the summit. By this point, most of us knew each other; some were planning to extend the trip by visiting other cities, while others were planning to return to their home countries. We all gathered for the closing session, and by then every mentor had built a great network of cool people in the open source community!
We also met a lot of mentors who were previously GSoC students. We had a lot of discussions about the experiences of being a student as well as a mentor, what motivated them to become a mentor and how they're contributing to their community.
Left to Right: Joey Schlichting, Sanyam Goel & Kevin A. McGrail
Overall, it was a once-in-a-lifetime experience for every representative. The trip was full of memories and we learned so much; we also made new and special friends throughout the summit.
The GSoC Mentor Summit 2019 was a wonderful experience and we would like to thank Google, The Apache Software Foundation, and once again the ASF GSoC Organisation Admins, Ulrich Stärk and Maxim Solodovnik, as well as the event hosts from the Google Open Source Team.
GSoC 2020 is underway now and we are just gathering project ideas and mentors. Students looking to get involved, please see http://community.apache.org/gsoc.html
Sanyam Goel started his journey with the ASF by participating in GSoC 2017 as a student and has continued contributing actively to OSS; he currently serves as a committer on Apache Fineract. He has also participated as a mentor in the Google Code-in and Outreachy programs for the Mifos Initiative and the DIAL community, and is always keen to spread the word about OSS to create an impact around the globe and to focus on reducing the barriers for newcomers to OSS.
Kevin A. McGrail, better known as KAM, is a VP emeritus of the Apache SpamAssassin project where he has battled spammers for years. In addition to helping the SpamAssassin project, he has served in the offices of treasurer and fundraising for the Apache Software Foundation. He is also a member of the Apache Incubator project where he mentors new projects at the ASF including echarts, IoTDB & brpc. In his $dayjob, he works at InfraShield.com doing cybersecurity for critical infrastructure.
= = =
"Success at Apache" is a monthly blog series that focuses on the processes behind why the ASF "just works" https://blogs.apache.org/foundation/category/SuccessAtApache
Posted at 12:42AM Mar 04, 2020 by Sally Khudairi in SuccessAtApache
Success at Apache: Literally
Browsing through the other "Success at Apache" posts made me reflect on the word "success". Years ago, I was asked in a job interview, "How do you define success?". After a pause, I asked back, "In what?", which threw the interviewer off a bit. That's just too broad of a question for me to define one answer: success in a career, success as a human, success as a team member, success at a software release, the list goes on and on.
Every day there's a giant list of possible successes and failures, and that’s even before you get to work ...so keep that in mind as you continue reading.
In August of 2016 I came across a blog post that would change my life forever.
At the time, I was looking for a new job that was taking longer than I expected. Taking a long shot, I sent off a very sparse email replying to the post. Two days later David Nalley (VP Infrastructure) replied, introducing me to Daniel Gruno who'd be doing the first round of interviewing. Fast forward a few months, and, spoiler alert: I got the job.
My first day "in the office" was in Seville, Spain, on November 14th during ApacheCon EU. Let me jump back a bit: most of the "Success at Apache" posts talk about the extensive background the authors have, both in the Open Source community and the ASF. While I use httpd, LAMP, etc. all the time, I never really found out how the "sausage was made". Apache has well-made products and the philosophy of how they were built intrigued me. My career until that point had mostly been inside Microsoft shops, usually with me suggesting FOSS solutions in meetings and only getting to use them in small-ish batches. A few MySQL boxes here, a few other Linux machines there, but not "full stack" kinda stuff: I ran it where I could but I was very happy with Microsoft products. "Best tool for the job", right?
Anyway, back to Spain. I don't travel as much as I should, my Spanish is terrible (or enough to get me into a bar fight), and I'm traveling to a country I've never been to.
Friday, November 11th, was the last day at my previous job. Saturday afternoon, I left my wife and kid to jump on a plane for Seville, Sunday-ish I landed, and on Monday I started work in another country, at a job that was 98% Linux-based (Windows Jenkins build nodes), with people whom I’d never seen before because no one used video chat during the interviews --at a conference held by the foundation I now work for.
You may ask yourself, "How did I get here?", as I sure did: cue "Once in a Lifetime" by the Talking Heads...
My time at the ASF has been very interesting to say the least. With such a huge range of users of Apache software, some days I'm helping a large global company trying to get a product out the door, other days I'm troubleshooting a broken commit for someone working in their basement between dinner and baths for the kids. That's what makes this place special: those contributions help the community and help the common good of the project. The unique perspective I have is from within Infra. We don't just support the ASF, we support all projects in one way or another. One project might just be getting started with automated builds in Jenkins while another has been using CI/CD for years. That's a true strength of the ASF: disparate parts come together as a whole in a way that wouldn't work otherwise. Some days my job has nothing to do with technology, it's just getting the right people together on an email to figure out how to solve a problem, leveraging the different parts.
As mentioned earlier, "success" is a moving target, and at Apache, it's no different. Though in my case, any success at my job means I'm helping the ASF become successful, which in turn helps the projects and communities it supports. Behind every commit is a person, just working towards their own success.
I'm glad that I took the chance to respond to the job opening. Every job, company, and environment has its fair share of unpredictability and diversity. At the ASF, those traits are celebrated, leveraged, leaned on, and held up by the great people I get to work with and the community that I'm proud to be a part of.
= = =
Posted at 04:03PM Feb 03, 2020 by Sally Khudairi in SuccessAtApache
Success At Apache: "Mentor Your Mentor"
The concept behind "Mentor your Mentor" is that someone who is active in Apache should watch for opportunities to bring the idea of open source as a retirement hobby to the attention of a retiring colleague, even if the retiree has been their mentor, and no matter how senior the retiree.
If the retiree is interested, the Apache contributor can offer various forms of help and support such as:
Patricia Shanahan worked from 1970 to 2002 in various programming and computer architecture roles for NCR, Celerity Computing, FPS, Cray Research, and Sun Microsystems. She then went to UCSD as a graduate student, receiving a PhD in computer science in 2009, after which she retired.
= = =
Posted at 05:47PM Sep 23, 2019 by Sally Khudairi in SuccessAtApache
Success at Apache: Why you'd want to become an Apache Committer
by Dmitriy Pavlov
All newbies in Open Source communities may sometimes think that they’ll never be able to become Committers. Many treat this role as a prestigious one, granted only for special feats, and after having written a ton of code. But not all things are so simple, and I hope my story will help you.
Election as Apache Committer
My journey with The Apache Software Foundation began relatively recently, in 2015. I was working for Openway Group, and was enthusiastic about in-memory computing. I got to know about Apache Ignite at a local developers conference. I implemented the POC of a backend system based on Apache Ignite. I was so impressed with the clear API and documentation, and it was also very convenient that I could start prototyping without passing through the approval process. I suggested using the Apache product instead of a source-available solution. I met Konstantin Boudnik (cos), who helped me to understand the difference between Apache projects and source-available/closely-governed products.
Luckily for me, GridGain, the company that initially donated Ignite to the Apache Incubator, has a development center in my city (Saint Petersburg, Russia). I joined the GridGain team in 2017.
As part of my day job, I provided patches to the product. I actively joined the dev list discussions (some fellows sometimes say “too actively”). I created a number of wiki pages, ‘Apache Ignite - Under the hood’, to help developers understand product internals. I also developed the ‘Direct IO’ plugin. I was elected as a Committer.
In 2018 I was concerned about reviews of patches from members of the community not affiliated with GridGain. Since I now had a commit bit, I started to review patches and ask others to review them, too. I don’t know for sure, but I suppose these social achievements in community development were the basis for my being elected as a Project Management Committee (PMC) member.
I asked several questions about The Apache Way on the Community Development (“ComDev”) dev list. I was very impressed by how friendly and welcoming they are. I very much like such a positive atmosphere, and feel it influences the success of Apache projects. Now I’ve also joined the Apache Training (incubating) community as a Committer and PPMC (Podling Project Management Committee) member.
Quite funny for a software developer with 17 years of experience… being elected as a Committer, that is to say, because of the social aspects and documentation.
Who is a typical Committer and where does his or her strength lie?
When creating an Open Source product, we always let the users explore it in action -- as well as allow them to modify it and distribute modified copies. But when such modified copies are replicated in an unsupervised manner, we don’t get contributions into the main codebase and the project stalls. It’s here where we need exactly such a person – the Committer – someone who is authorized to merge user contributions into the project.
Why should you become a Committer?
First of all, being assigned to a Committer role is extremely motivating. The professional community acknowledges you and your work, and you clearly see the results of your work in action.
How different is that from some enterprise project -- where you have no idea why you must continually keep shuffling various XML fields?
The second pure advantage of being a Committer is an opportunity to connect with top professionals and also pull some cool ideas from Open Source into your own project. But, if you aren’t one of the top professionals, certainly don’t be afraid to join -- the community has various tasks for different folks.
Besides, being a Committer is a jewel in your CV --and even a greater plus for junior programmers, because at interviews you are often asked to show code samples. If you know an Open Source project well, a company supporting or using it will be happy to hire you. There are some people who will tell you that great positions are unreachable without first committing in Open Source.
There are some bonus goodies, too! For example, Apache Committers get an IntelliJ IDEA Ultimate license for free (albeit with some limitations).
How do you become a Committer?
You should be committed to the project --it’s just that simple. Development, writing tests and documentation, and simply answering questions on lists are also good ways to start working towards committership.
Yeah, the contributions of QA engineers and technical writers in the community are valued no less than the developers’ contributions.
If you think there are no tasks for you on some project, you are wrong. Just join the community you are interested in and start working on its tasks.
The Apache Software Foundation has a dedicated page that lists what contributions are needed.
Committer — to be or not to be?
Committer activity is a good and useful endeavor, but one shouldn’t strive to become a committer per se. This status can be granted for more than code, and it is not in itself proof of your proficiency.
Find a project you may be interested in: it will probably be a project whose software you already use. Dive into its code and say hi to the community; offer help, improve docs, complete a newbie ticket or answer a user's question. You may just be surprised how welcoming and open folks are there.
Strive to gain the expertise (knowledge and experience) while researching the project, tweaking it and helping others to solve their problems, and, hopefully, enjoying collaborative development in an Open Source project.
Getting started at http://community.apache.org/ will help you on your way.
Dmitriy Pavlov is a Java developer enthusiastic about Open Source and in-memory computing. He is interested in system performance, information security, and cryptography. He created and donated a utility for monitoring tests for Apache Ignite, and is a former Community Manager for Apache Ignite at GridGain. Dmitriy represents the Apache Ignite Project Management Committee (PMC) at local meetups in Russia. He runs workshops and training for Apache Ignite developers and users, and is a frequent speaker at meetups and conferences.
= = =
Posted at 11:48PM Sep 21, 2019 by Sally Khudairi in SuccessAtApache
"Success at Apache: The Path To Berlin"
This blog post is going to tell an entirely different story: One about what persistence, patience and continuous engagement can accomplish. One about what can happen when people are working together.
It was over a decade ago, in 2008, when I met with people interested in the then-hyped Apache Hadoop project to create a quarterly meetup on everything data analytics, text mining, scalable data storage. That was when the (Apache) Hadoop Get Together Berlin took place at newthinking store - a co-working space and event location before that format itself was turned into a business model http://blog.isabel-drost.de/posts/scaling-user-groups336.html.
A year later it was clear that an ApacheCon EU was unlikely to happen in Europe. When Simon Willnauer, Jan Lehnardt and I approached The Apache Software Foundation about holding the event in Berlin, the kind people at Apache who did have experience with ApacheCon successfully stopped us. Given the baggage around the event, the trademark implications, the expectations that all sorts of different people had, as well as the pure logistics (http://blog.isabel-drost.de/posts/how-hard-can-it-be-organising-a-conference.html) of creating an event that's bigger than 100 people, that was a safe and likely wise decision.
What they didn't achieve, though, was to stop us from running some event the following year: we at least wanted to give friends from the Big Data and search communities an excuse to make their employers pay for a trip to Berlin in summer: http://blog.isabel-drost.de/posts/my-highly-subjective-berlin-buzzwords-recap290.html. This was possible due to some lucky conditions we found ourselves in:
Knowing conference organisers who were willing to share their know-how, such as legal issues and boundaries, with us.
Finding an event company (newthinking.de) that was supporting the idea.
Being employed by a company (https://www.neofonie.de/) that saw value in sponsoring the event by allowing me to work on it during my working hours.
While the event was successful, part of the ASF community was still missing. Fast forward several years to 2017: a new conference concept was born. Under the name of FOSS Backstage, we focus on the one topic that every project at Apache has to deal with: governance, legal, security, economics (https://blogs.apache.org/foundation/entry/success-at-apache-cookie-monster). These issues are not exclusive to Apache but are relevant to everyone involved in open source projects, individuals as well as legal entities.
The only caveat: We had intentionally left all technical content out of scope for FOSS Backstage. For the data analytics crowd the event was conveniently co-located with Berlin Buzzwords. For the remaining content, Sharan Foga kindly volunteered to coordinate an Apache Roadshow together with newthinking, running for two days after Berlin Buzzwords and in parallel to FOSS Backstage. With a name different than ApacheCon this left quite some room for experimenting beyond the traditional ApacheCon format.
A little over a year after that, ApacheCon is finally on its way to the city of Berlin: With https://twitter.com/plainschwarz as event organisers, in collaboration with the Apache Software Foundation, and with Myrle Krantz as event chair to coordinate between the ASF and the local event team.
In retrospect, the series of events was an interesting journey. There are a couple of lessons I've learned that carry over to open source software development, but also a few that are distinctly different.
1. Patches welcome - turn people that come with feature requests into active contributors
Instead of accepting feature requests from people, it helps to pull them in to submit their own patches: Early on there were requests for a barcamp, for a lightning talk session, for trainings. My response back then: Submit the idea through the CfP form, find someone to run it and we'll run it through the regular submission process adding it to the schedule if it fits.
For the trainings we went for a slightly different approach: Instead of directly offering them ourselves, we reached out to established training providers and suggested a co-location/co-promotion approach.
For those that asked for free tickets we would turn them into helping hands - either on site, during setup or (in the first year) as local guides taking groups of up to twenty people out for dinner in a restaurant close by that they had selected.
For those that asked for more content on some specific topic we offered the option of organising a deep dive satellite event on Wednesday after the conference at one of the many companies willing to host these in Berlin.
In general we left a lot of room and freedom to those who wanted to get involved and add content to the event that they found missing.
2. Decisions are made by those doing the work
While feedback is important, there is a limit to what can feasibly be realised for any given conference budget. While we value feedback from anyone involved, the final decisions need to be taken by those actively contributing time to the event. As a result, not all feedback can always be taken into account, at least not right away. It may be picked up at a later stage, in a different event, or the following year, or simply be taken as an impulse to come up with fresh ideas.
3. "It's done when it's done" is not an option
Conferences are slightly different from open source releases in that there are hard deadlines, in combination with a fixed budget coming in from attendees and sponsors and some hard features that cannot be postponed to the next release (unless you're organising a remote-only conference, running without a venue is pretty much impossible). That circumstance makes organisation slightly more prone to conflict than your average open source project: There's no cheap way to go down both paths and only at the very last minute decide which is better, or even offer both options.
4. Balancing public and private communication
At Apache we value public communication: Often having discussions in the open invites others to participate and shows where contributions are needed. When it comes to budget, ticket pricing, communication with sponsors, and contracts including specific prices for venues, this approach becomes a whole lot harder. Even so, it helps to provide a dedicated mailing list for program committee members as well as interested attendees to get in touch.
It also helps to make some of the planning background public - either while discussions are ongoing, or at least after a conclusion has been reached: http://blog.isabel-drost.de/posts/berlin-buzzwords-scheduling-behind-the-scenes118.html (caveat: the algorithm has changed substantially since this was published, but the article did help answer a lot of questions.)
One downside to this mode of operation is that people who potentially could provide valuable insight or help out have no idea of what is going on. Another downside is that for the outside world it becomes invisible how large a team it takes to make the event successful. As a tiny fix for that we always tried to make the team involved publicly known.
5. Bringing people together
At Apache we have a tradition of working asynchronously on archived, searchable, referenceable mailing lists. Without that way of working we wouldn't be able to build bridges across timezones, geographies, cultures and organisations. People with wildly different schedules wouldn't be able to collaborate. Despite all this, hearing people speak makes it easier to understand their tone correctly when reading their texts. Despite all this, there are topics that are best shared face to face only, for deniability reasons. Despite all this, meeting someone in person who so far has been communicating only remotely with you can be a ton of fun. I hope that you will join that fun in October in Berlin: Looking forward to seeing you there!
Join us! ApacheCon Europe/Berlin 22-24 October 2019 https://aceu19.apachecon.com/
= = =
Posted at 01:18PM Aug 06, 2019 by Sally Khudairi in SuccessAtApache
Project Perspectives: Apache Weex (incubating) and The Apache Way
I am a Project Management Committee (PMC) member of Apache Weex (Incubating), a cross-platform mobile development framework widely used in many mobile apps, the largest of which have nearly 0.7 billion MAU (Monthly Active Users). Weex became an Open Source project in early 2016 and entered the Apache Incubator in December 2016. As a PMC member, I have been with the project from the beginning until today; it has been an exciting journey mixed with challenge and suffering, and the journey has not ended yet.
"This is not the end. It is not even the beginning of the end. But it is, perhaps, the end of the beginning."
As a project under the ASF, Weex should and would do things under The Apache Way. But as one might imagine, there are a few problems Weex has to solve:
Engineering and Product
There are some technical issues due to the feature of Weex:
Community Over Code.
- BDFL (benevolent dictator for life)
- Liberal contribution
"If it didn't happen on (a mailing) list, it didn't happen."
Many end users have chosen Weex for their commercial products, including Taobao Mobile, with hundreds of millions of users. For a list of known companies using Apache Weex, please see https://weex.apache.org/community/who-is-using-weex.html
For now, Weex is still a project under development in the Apache Incubator. We welcome you to join the Apache Weex community. Visit us at https://weex.apache.org/
# # #
Part of the "Success at Apache" series, Project Perspectives chronicles how projects and their communities have benefited from The Apache Way.
Posted at 11:44PM May 20, 2019 by Sally Khudairi in SuccessAtApache
Success At Apache: Positively impacting the world one contribution at a time
Dinesh Joshi is a Senior Software Engineer and a Committer on the Apache Cassandra project. He has a Master's in Computer Science (Distributed Systems & Databases) from Georgia Tech, Atlanta. In the past, Dinesh was a Principal Software Engineer at Yahoo building real-time distributed systems for Yahoo’s Finance Web, iOS & Android apps. He is also an international speaker and regularly talks about Apache Cassandra and databases. In his spare time, he volunteers as a mentor for Women Who Code.
# # #
"Success at Apache" is a monthly blog series that focuses on the people and processes behind why the ASF "just works". https://blogs.apache.org/foundation/category/SuccessAtApache
Posted at 04:36AM Apr 01, 2019 by Sally Khudairi in SuccessAtApache
Success at Apache: What You Need to Know
EDITOR'S NOTE: I came across the author's original post, "An Introduction to Apache Software — What you need to know", dated 3 February 2017, and was interested in finding a way to share it with the greater Apache community. The author's enthusiasm was palpable, and the post was earnestly intended to help educate others. With the ASF celebrating its 20th Anniversary this year, it's easy for many of us to simply rely on tribal knowledge, not realizing that navigation to definitive guides isn't intuitive to newcomers. Those of us who have been here for a while "just know", partially because we were creating it as we went along. Below is an updated version of the original post, amended through the guidance of three long-standing ASF Members. And that's the point of it all at the end of the day: at Apache, we help each other as it contributes to our collective success, and this writeup will help others find their Success at Apache.
by Maximilian Michels
Before you even started reading this post, you had already been using Apache software. The Apache web server (Apache HTTP Server) serves about every second web page on the WWW, including this website. One could say that Apache software runs the WWW. But it doesn't stop there. Apache is more than a web server. Apache software also runs on mobile devices. Apache software is part of enterprise and banking software. Apache software is literally everywhere in today's software world.
Apache has become a powerful brand and a philosophy of software development which remains unmatched in the world of open-source. Although the Apache® trademark is a known term even among the less tech-savvy people, many people struggle to define what Apache software really is about, and what role it plays for today's software development and businesses.
Over the last few years I've learned a lot about Apache through my work on Apache Flink and Apache Beam with dataArtisans, as a freelancer/consultant, and as a volunteer. In this post I want to give an overview of the Apache Software Foundation and its history. Moreover, I want to show how the "Apache way" of software development has shaped open-source software development as it is today.
The History of the Foundation
The Apache Software Foundation (ASF) was founded in 1999 by a group of open-source enthusiasts who saw the need to create a legal entity to institutionalize their work. Among the first projects was the famous web server called Apache HTTP, which is also simply referred to as "Apache web server". At that time, the Apache web server was already quite mature. In fact, not only did the Apache web server give the foundation its name but it became the role model for the "Apache way" of open and collaborative software development. To see how that took place, we have to go back a bit further in time.
A Web Server goes a long way
As early as 1994, Rob McCool at the National Center for Supercomputing Applications (NCSA) in Illinois created a simple web server which served pages using one of the early versions of today's HTTP protocol. Web servers were not ubiquitous like they are today. At that time, the Web was still in its early days and there was only one web browser, developed at CERN, where the WWW had been invented only shortly before. Rob's web server was widely adopted throughout the web due to its extensible nature. When its source code spread, web page administrators around the world developed extensions for the web server and helped to fix errors. When Rob left the NCSA in late 1994, he also left a void because there was nobody left to maintain the web server along with its extensions. It quickly became apparent that the group of existing users and developers needed to join forces to be able to maintain NCSA HTTP.
At the beginning of 1995, the Apache Group was formed to coordinate the development of the NCSA HTTP web server. This led to the first release of the Apache web server in April 1995. During the same time, development at NCSA started picking up again and the two teams were in lively exchange about future ideas to improve the web server. However, the Apache Group was able to develop its version of the web server much faster because of its structure, which encouraged worldwide collaboration. At the end of the year, the Apache server had its architecture redone to be modular and it executed much faster.
Apache 1.0 was finally released on Dec 1, 1995. Shortly afterwards, at the beginning of 1996 and one year after the Apache Group was formed, the Apache web server had already surpassed the popularity of NCSA HTTP, which had been the most popular web server on the Internet until then. The web server continued to thrive and is still the most widely used web server as of this writing.
The Rise of the Foundation
The team effort that led to the development and adoption of the Apache web server was a huge success. The Apache project kept receiving feedback and code changes (also called patches) from people all over the world. Could this be the development model for future software? More and more projects started to organize their groups similarly to the Apache group. As the number of projects grew, financial interests arose and potential legal issues threatened the existence of the Apache group. Out of this need, the Apache Software Foundation (ASF) was incorporated as a US 501(c)(3) non-profit organization in June 1999. In the US, the 501(c)(3) is a legal entity specifically designed for non-profit charitable organizations. This is in contrast to for-profit open-source software organizations or even US 501(c)(6) non-profit organizations, which are not required to be charitable.
After the ASF was incorporated, new projects could easily leverage the foundation's services. Over the next year, every few months a new project entered the ASF. The first projects after Apache HTTP Server were Apache mod_perl (March 2000), Apache tcl (July 2000), and Apache Portable Runtime (December 2000). After a short break in 2001 which was used to come up with a programmatic approach to onboard new projects via an incubator, the ASF has seen very consistent growth of up to 12 projects (2012) each year.
The ASF became a framework for open-source software development which, in its entirety, remains unmatched by other forms of open-source software development. The secret of the ASF's success is its unique approach to scaling its operations, in which the foundation does not try to exercise control over its projects. Instead, it focuses on providing volunteers with the infrastructure and a minimal set of rules to manage their projects. The projects themselves remain almost autonomous.
Apache Governance - How does the foundation work?
There are about 200 independent projects running under the Apache umbrella. The question may arise: how does the foundation govern its projects? First of all, the ASF is an organization that is run almost entirely by volunteers. In the early days, many of the volunteers were developers who did not like to spend much time on administrative things (who does?), so the organization is structured in a way that requires little central control and favors the autonomy of the projects which run under its umbrella.
For every project (e.g. Apache HTTP, Apache Hadoop, Apache Commons, Apache Flink, Apache Beam, etc.), there are a Project Management Committee (PMC), Committers, Contributors, and Users.
Project Management Committee (PMC)
The PMC manages a project's community and decides on its development direction. Its most rudimentary and traditional role is to approve releases for a project. In that sense it has a similar function to the original Apache Group which led the development of Apache HTTP Server. When a new project graduates from the Incubator (covered later), the foundation's central instance, the Board, approves the initial PMC which is selected from the PPMC (Podling PMC) formed during incubation. Each PMC elects one PMC member as the PMC Chair, who represents the project and writes quarterly reports to the ASF Board. The Chair needs to be approved by the Board.
Throughout a project's lifetime new PMC members can be elected by the existing PMC. Note that each new PMC member needs to be approved by the Board, but this approval is merely formal and there are few instances in which a new PMC member is not approved. PMC members do not need the formal permission of the foundation to elect new Committers. PMC members themselves are also Committers. Let's learn about Committers next.
Committers can modify the code base of the project but they can't make decisions regarding the governance of the project. They are trusted by the PMC to work in the interest of the project. When they contribute changes, they commit (thus, the name) these changes to the project. Committers don't only change code but they can also update documentation, write blog posts on the project's website, or give talks at conferences. Committers are selected from the users of the project; more about this process in the Meritocracy section.
Users and Contributors
Users are as important as the developers because they try out the project’s software, report bugs, and request new features. The term is slightly confusing because, in the Apache world, most users tend to be developers themselves. They are users in the sense that they are using an Apache project for their own work; usually they are not actively developing the Apache software they are using. However, they may also provide patches to the Committers. Users who contribute to a project are called Contributors. Contributors may eventually become Committers.
In the image, the per-project entities are represented as circles. They exist for every project. Note that the user group circle is not depicted in full size because big projects tend to have many more Users than Committers and PMC members.
The ASF does not work without some central services. Here are the most important entities:
Apache members represent the heart of the foundation. They have been referred to as the "shareholders of the ASF" because they are deeply invested in the ASF (not in the financial sense). A prerequisite to becoming a member is to be active in at least one project. To become a member, you have to show interest in the foundation and try to promote its values. The ASF holds membership meetings, usually annually. At membership meetings new members can be proposed and subsequently elected. Elected members receive an invitation which they can choose to accept within 30 days. Becoming a member is not merely a recognition for supporting the ASF; it also grants the right to elect the Board.
The Board of Directors (Board)
The Board takes care of the overall governance of the foundation. In particular, it is concerned with legal and financial matters like brand and licensing issues, fundraising, and financial planning. The board is elected by the Apache members annually and is also composed of Apache members. The current board can be viewed here. Note that there is only one central Board for the entire foundation but Board members can be PMC members in different projects.
Officers of the corporation
The Officers of the corporation are the executive part of the administration. They execute the decisions of the board and take care of everyday business. Most of the officers are implicitly officers by being the PMC chair of a project. Additionally, there are dedicated officers for central work of the foundation, e.g. fundraising, marketing, accounting, data privacy, etc.
The support and administration team (INFRA) is the team that runs the Apache infrastructure and provides tools and support for developers. INFRA is the only team at Apache which consists of contractors who are paid for their work. Their work includes running the apache.org web site and the mailing lists, which are Apache’s main way of communication. Over time, various other tools and services were created to assist the projects. The main tools available which are used by almost all projects are:
- Web space for the project's websites.
- Mailing lists, for discussing the roadmap of the project, exchanging ideas, or reporting bugs (unwanted software behavior). Typically the mailing lists are divided into a developer and a user mailing list.
- Bug trackers, which help developers to keep track of new features or bugs.
- Version control, which helps developers to keep track of the code changes.
- Build servers, which help to integrate/test new code or changes to existing code.
Founded in 2002, the Incubator is a project at the ASF dedicated to forming (bootstrapping) new Apache projects. The process is the following: People (volunteers, enthusiasts, or company employees) make a proposal to the Incubator. The proposal contains the project name, the list of initial PPMC (Podling PMC) members, and the motivation and goals for the new project. Once the IPMC (Incubator PMC) has discussed the proposal, it holds a vote to decide if the project enters the incubation phase. In the incubation phase, projects carry "incubating" in their names, e.g. "Apache Flink (incubating)"; this is dropped only once they graduate. To graduate, a project has to show that it is mature enough. The Community Development project at the ASF has created a catalogue of criteria called the Maturity Model. It requires an active community, quality code, and legal compliance. Formally, the project needs to prove that it fulfils the criteria to the IPMC, which is composed of Apache members. All existing work donated in the course of entering the incubator and all future work inside the project has to be licensed to the ASF under the Apache License. This ensures that development remains open source according to the Apache philosophy. More about incubation on the official website.
Meritocracy - How are decisions made?
The Apache Software Foundation uses the term "meritocracy" to describe how it governs itself. Going back to the ancient Greeks, meritocracy was a political system that put into power those who proved that they were motivated, put effort into their work, and were able to help a project. The core of this philosophy can be found throughout history from ancient China to medieval Europe and is still present in many of today’s cultures in the sense that effort, increased responsibility, and service to a part of society ought to pay off in terms of power of decision, social status, or money.
Meritocracy in the Apache Software Foundation denotes that people who work in the interest of either the foundation or a project get promoted. Users who submit patches may be offered Committer status. Committers who drive a project may gain PMC status. PMC members who are active across projects and take part in the foundation's work may earn Member status.
Decision-making within the foundation and its projects is typically performed using Consensus. Consensus can be "lazy", which implies that even a few people can drive a discussion and make decisions for the entire community as long as nobody objects. The discussions have to be held in public on the mailing list. For instance, if a Committer decides to introduce a new feature X, she may do so by proposing the feature on the mailing list. If nobody objects, she can go ahead and develop the feature. If lazy consensus does not work because an argument cannot be settled, a majority-based vote can be started.
Meritocracy and "lazy" Consensus are the core principles for governance within the Apache Software Foundation. Meritocracy ensures that new people can join those already in power. "Lazy" Consensus creates the opportunity to split up decision-making among the group such that it doesn't always require the action of all members of the community.
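To make the decision flow described above more concrete, here is a minimal sketch in Python. It is only an illustration of the simplified rule as stated in this post (no objection means the proposal goes ahead, otherwise a majority vote decides); the `Proposal` class, the `decide` function, and the result strings are hypothetical names invented for this example, and real ASF projects follow more detailed voting rules that are not modeled here.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Proposal:
    """A hypothetical record of one proposal made on a project's mailing list."""
    title: str
    objections: List[str] = field(default_factory=list)  # who objected on the list
    votes: Dict[str, int] = field(default_factory=dict)  # voter -> +1 or -1


def decide(proposal: Proposal) -> str:
    """Apply the simplified rule from the text: silence means consent, else vote."""
    if not proposal.objections:
        # Nobody objected, so the proposer may simply go ahead.
        return "accepted by lazy consensus"
    if not proposal.votes:
        # An objection was raised but no vote has been held yet.
        return "vote needed"
    in_favor = sum(1 for v in proposal.votes.values() if v > 0)
    against = sum(1 for v in proposal.votes.values() if v < 0)
    if in_favor > against:
        return "accepted by majority vote"
    return "rejected by majority vote"
```

For example, `decide(Proposal("feature X"))` takes the lazy-consensus path because its list of objections is empty, while a proposal with one objection and recorded votes falls through to the majority count.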
The Apache License - A license for the world of open-source
The Apache license is very permissive in the sense that source code modifications are not required to be open-sourced (made publicly available) even when the source code is distributed or sold to other entities. This is in contrast to “Copyleft” licenses like the GNU General Public License (GPL) which, upon redistribution, require public attribution and publication of changes made to the original source code. The Apache license was first derived from the BSD license, which is similarly permissive. The reason for this was that the Apache HTTP Server was originally licensed under the BSD license.
The current version of the Apache License is 2.0, released in January 2004. The changes made since the initial release were only minor, but they laid the groundwork for its widespread adoption. At first, the license was only available to Apache projects. Due to the success of the Apache model, people also wanted to use the license outside the foundation. This was made possible in version 2.0. Also, the new version made it possible to combine GPL code with Apache-licensed code. In this case, the resulting product would have to be licensed under the GPL to be compatible with the GPL. Another change for version 2.0 was to make inclusion of the license in non-Apache-licensed projects easier and to require explicit patent grants for patent-relevant parts.
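As an illustration of what using the license outside the foundation looks like in practice, the sketch below renders the boilerplate notice from the appendix of the Apache License 2.0 as a comment header for a source file. The notice text is the one published with the license; the helper `license_header`, its parameters, and the example owner name are hypothetical and only meant to show the idea, not any official ASF tooling.

```python
# Boilerplate notice from the appendix of the Apache License, Version 2.0.
# {year} and {owner} stand in for the bracketed fields of the original text.
APACHE_2_NOTICE = """\
Copyright {year} {owner}

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
"""


def license_header(year: int, owner: str, comment_prefix: str = "# ") -> str:
    """Return the notice as a block of line comments (hypothetical helper)."""
    lines = APACHE_2_NOTICE.format(year=year, owner=owner).splitlines()
    # rstrip() avoids trailing spaces on the blank lines of the notice.
    return "\n".join((comment_prefix + line).rstrip() for line in lines) + "\n"


if __name__ == "__main__":
    print(license_header(2019, "Example Author"))
```

The comment prefix is a parameter so that the same notice can be reused for languages with a different line-comment syntax, for instance `"// "`.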
The ASF today is not the small group that it used to be back in 1999. At the time of this writing, the Apache Software Foundation hosts 51 podlings in the Incubator and 199 top-level committees (PMCs). This amounts to almost 300 projects (latest statistics). Note that a PMC may decide to host multiple projects if necessary. For instance, the Apache Commons PMC has split up the different parts of the Apache Commons library into separate projects (e.g. CLI, Email, Daemon, etc.). 50 of the 300 projects have been retired and are now part of Apache Attic, the project which hosts all retired projects. The above graph is taken from https://projects.apache.org.
The Apache Software Foundation regularly organizes conferences around the world called ApacheCon. These conferences are dedicated to the Apache community or certain topics like Big Data or IoT. It is a place to meet community members and learn about the latest ideas and trends within the global Apache community. Apart from the official conferences, there are conferences on Apache software organized by companies or external organizations, e.g. Strata, FlinkForward, Kafka Summit, Spark Summit.
Here's a list of some projects that I came across in the past. I grouped them into categories for a better overview. I realize you might not know a lot of the projects but maybe this list can be the starting point to discover more about these Apache projects :)
Query Tools / APIs
- HTTP (the one!)
Apache - A Successful Open-Source Development Model
My first attempt to learn more about Apache goes back several years. I was using the Apache License while working on Scalaris at Zuse Institute Berlin. I realized that the license was somehow connected to the Apache Software Foundation but I didn't really understand the depth of this relationship until I started working on Apache Flink with dataArtisans. Besides the official homepage of the foundation, relatively little information was available on the Internet about the foundation and its projects. In hindsight, the best source of information was to read the email archives, get to know other people at the ASF, and become a volunteer myself :)
When I originally wrote this post I couldn’t find an introductory guide to the ASF. So I decided to do a bit of research myself and tried to write down what I had learned working on Apache projects. I hope I was able to provide an overview of the ASF and show you how significant the foundation has been for open-source software development.
Thank you for reading this article. Feel free to write me an email if I got something wrong or you would like to comment on anything.
Thank you Roman Shaposhnik, Shane Curcuru, Dave Fisher, and Sally Khudairi for your comments which were very helpful to revise this post for the 20th anniversary of the ASF.
# # #
Posted at 02:15AM Mar 26, 2019 by Sally Khudairi in SuccessAtApache
Project Perspectives: Apache RocketMQ and The Apache Way
# # #
Part of the "Success at Apache" series, Project Perspectives chronicles how projects and their communities have benefited from The Apache Way.
Posted at 02:17AM Mar 20, 2019 by Sally Khudairi in SuccessAtApache
Apache Software Foundation Platinum Sponsor Profile: Leaseweb
with Robert van der Meulen, Global Product Strategy Lead at Leaseweb
Robert is Global Product Strategy Lead at Leaseweb. Fascinated by technology, Robert studied computer science, and after his studies, he delved into the then relatively young and rapidly developing internet technology. He soon understood that the internet would be at the center of almost everything we do and wanted to be part of it. Robert is passionate about using technology to improve people's lives. He contributed to the Debian project as a developer, later introduced Apache CloudStack at Leaseweb, and has been active in the open source community for quite some time. During his 9 years at Leaseweb, he has worked hard to make sure digital transformation, from how we communicate to how we do business, is part of the company mission. Follow @Leaseweb on Twitter.
"Many Apache projects are being built by – mostly – volunteers and motivated individuals, and the world can use, change and develop all of those. It's important to support the people that make this possible."
How did Leaseweb's work with Open Source begin?
Support for foundations such as the ASF is important because those foundations are important :-) . Any big open source project at some point needs the infrastructure to continue to run – and it's great if a project can rely on an organization like the ASF for that infrastructure so the focus can be on making the project great. Open source projects can grow and be more successful if they can more easily deal with governance, financials and administration, as well as tangible infrastructure and tools. Helping an organization like the ASF helps the ASF projects all over, which has an impact on the software we use as part of our products.
What sets the ASF apart from other software foundations or consortia?
A number of our leading Cloud products are based on Apache software. We use Apache CloudStack for various private cloud and VPS offerings, and those platforms are continually growing and evolving – and we keep adding more with most of the new locations we open. Along with the CloudStack platforms, hosting environments obviously have many deployments using Apache web servers. Within our technical teams we consume lots of different Apache projects and actively contribute to a number of them (we have a dedicated CloudStack team that includes one of the Apache CloudStack PMC members). Every software solution has its limits, and obviously this goes for CloudStack too – but also we're happy we can change or help change the things that could be better.
It feels great! It's important to, if you have the opportunity, give something back. Many Apache projects are being built by – mostly – volunteers and motivated individuals, and the world can use, change and develop all of those. It's important to support the people that make this possible.
Are there any other thoughts on the experience of being a large-scale donor that you would like to share? What else do we need to know?
Not much. I personally really enjoy seeing what happens with the support we provide: what projects it makes possible, what things it makes easier or better. Tangible insight into the results is a big motivator as well as a proof point.
# # #
Sponsors of The Apache Software Foundation such as Leaseweb enable the all-volunteer ASF to ensure its 300+ community-driven software products remain available to billions of users around the world at no cost, and to incubate the next generation of Open Source innovations. For more information on sponsorship and on ways to support the ASF, visit http://apache.org/foundation/contributing.html .
Posted at 02:04PM Mar 18, 2019 by Sally Khudairi in SuccessAtApache
Success at Apache: Growing with the ASF
by Phil Steitz
I got involved at the ASF in 2002, back in the wild and wooly Apache Jakarta days. In my day job, I was responsible for the team introducing Java technology at a large financial services company.
One of the first things we built was an MVC (model-view-controller) framework for Web applications. We were very proud of it and it worked great in production, but it was hard for us to keep ahead of the feature requests from the many development teams who were using it. One evening, someone said, "Hey, there is this Struts thing that is very similar to what we do and it has some of these things already." I went home and found my way to the Jakarta Web site and downloaded the latest source release.
One thing led to another and the next thing I knew I was asking questions on the struts user mailing list as we started playing with the software and seeing what it would take to convert our apps to use it. After a few months, I found myself answering questions on-list as well and I finally got up the nerve to submit my first patch, which was a documentation fix. At the time, the Apache Struts community was struggling to release version 1.0. I looked around to see what I could do to help and found my way to Apache Commons Pool and DBCP, which Struts was trying to use to replace its built-in connection pool. What I found there was some brilliant but inscrutable code hiding some nasty bugs that Struts needed fixed. At that time, I did not have the Java skills to solve the problems, but I resolved to come back when I did and I watched as others developed workarounds that enabled the Struts community to move forward. I found a welcoming community in Commons and some problems that I could help with. I did eventually make it back to Commons Pool and DBCP, serving as RM for quite a few releases.
During this same timeframe, my $dayjob career was advancing rapidly, thanks in no small part to my aggressive introduction of Open Source software and practices, which was uncommon at the time in financial services. We brought in some ASF committers and their companies to help us build a development pipeline and tooling that was ahead of its time. We applied the Contributor - Committer - PMC member concept to developing enterprise technology standards and strategy. We developed the concept of "earned authority" in technology decision-making, modeled after the idea of publicly earned merit at the ASF. My leadership approach was profoundly influenced by my experience at the ASF, and continues to be to this day. Not a day goes by at work when I do not push for more transparency, more eyeballs on code and more focus on community collaboration and genuine appreciation of diverse viewpoints. I am very grateful to the many ASF community members who have helped me develop as a leader.
Through the years I've met other Apache committers with similar experiences: welcoming projects, friendly communities and great opportunities for personal growth. I’m pleased to see how the ASF has grown and continued to evolve. Every day new contributors join us and new leaders regularly emerge to help guide our communities and the Foundation overall. We all benefit from our experience here and the Foundation becomes stronger as a result.
Phil Steitz is Chairman of the Board of The Apache Software Foundation. He has been an ASF committer since 2003 and a member since 2005. He served for 4 years as Vice President, Apache Commons. Phil also currently serves as Chief Technology Officer of Nextiva, a cloud-based business communications company. He has previously held C-level technology leadership positions at multiple software and financial services companies.
# # #
Posted at 09:19PM Mar 05, 2019 by Sally Khudairi in SuccessAtApache