The Apache Software Foundation Blog

Sunday January 24, 2021

Inside Infra: Chris Lambertus --Part II

Part II of the "Inside Infra" interview with Chris Lambertus, the last of the series of interviews with members of the ASF Infrastructure team, who share their experiences with Sally Khudairi, ASF VP Marketing & Publicity.


"...you want to limit your exposure... You have to keep that in mind as you move through the day to make sure that you are minimizing your risk and minimizing your security threat vectors."


So, in the scope of the team, I understand that you're a more "senior" developer. Not that you know better; it's not an issue of better or worse, but you're more seasoned. How does ASF compare to other groups that you've worked with? Are there special technical requirements or special security issues you have to be concerned with? Especially as we mentioned before, it seems like there's an unlimited number of project development environments. Are there certain things that you have to consider or accommodate or do that's so different with ASF that you've never experienced before? Can you give a little bit of a frame of reference for folks unfamiliar with how it is within the ASF?


First of all, I'm not a developer. I am terrible at programming. Absolutely, I'm awful at it. I don't consider myself a developer in any way, shape, or form. I am a system administrator, 100%.


...Administrator. Okay, so, you're a more "senior" sysadmin then.


I hesitate to use the word senior, because it has some implications in the industry that I don't necessarily feel are appropriate for the ASF. I believe that I have been doing it longer than most other people on the team just as a career. I'm guessing that's probably what you mean by that.


Right. That's why I used the word "seasoned" also. It's hard because some people go, "Are you saying I'm old, or are you saying hierarchical, that I'm above others?" It's a hard way of describing it, because some folks have been programming or dealing with computers since they were kids, others later in life, but you guys are all moving in the same direction. So, how does one describe it?


Yeah, I think seasoned is a good word. Just like I said, I've been working in the industry as a system administrator since 1992, pretty much continuously with some brief changes in the 2000s. It's neither here nor there. So, it's not hierarchical. Everybody is equivalent in terms of the Infra team. Nobody's above anybody else or below anybody else, right?


...I was wondering how is the ASF different from other groups you've worked with.


All right. It's actually not all that different. There are a couple of things that make it unique. Well, a number of things that make it unique. One is that it's completely remote and completely geographically dispersed. Two is that the participants on the team are all from very different backgrounds and cultures and countries, which is fairly unusual for a system admin team, a small system admin team, I would say. But beyond that, it actually shares quite a lot of things that I typically see in system administration teams. There's a central job board, if you will, like the Jira stuff. There's a communications channel. We have Slack.


There's a nominal leader in Greg, who directs the general movement of the barge. Yeah, by and large, it's pretty similar to most environments that I've worked in. I mean, some are much different. Some are very corporate, some are very open. Yeah, now I remember one of your previous questions --one of the biggest challenges that I found is the openness.


The ASF for quite some time has been incredibly public with its configurations, with its systems, with its documentation. These types of things are very unusual in the corporate world or in commercial IT. Typically, you would never make that stuff public. The fact that it is and has been at the ASF, that's been a challenge for me. It's an unusual way to maintain systems. It's got some downsides. Having that stuff available can be concerning at times.


...How so? Help me understand this, because I've been with the ASF forever. What you're mentioning right now reminds me of about 10 years ago, something failed in Infrastructure. I can't remember what it was, but it was a big thing. People were talking about it. It was even in the press at the time. It wasn't catastrophic, but it was big. We actually wrote a blog post about it and we presented about it at ApacheCon. From a marketing perspective and a media perspective, I was uncomfortable, because from a corporate perspective, you don't do that. The fact that we not only encouraged it but published it and educated everyone about it, admitted it, ate it all, we took responsibility, 100%: "Here's what failed. Here's what happened. Here's what we did." People found this to be extremely refreshing, extremely helpful, and it was totally eye opening for me. I had no concept of anything like that before, and I'd been with the ASF for like 10 years already. I've never seen us opening the kimono at that capacity. So, I'm coming at it from a slightly different perspective than you. I understand you don't want to have your config files public. Obviously, that can put you at a different level of exposure and risk.


Exactly.


...Is that required, or is that just part of our culture saying, "This is what we do"?


It's definitely part of the culture. My background is heavily in computer security. Coming on board to the ASF and seeing all this stuff out in the open to me was... I couldn't believe my eyes. "You're doing what?" So, I've actually worked quite a lot to reel that in to some extent, because even 10 years ago was nothing like what's happening today in the world of computer security, in terms of the threats, in terms of what people are looking for, what people are doing, and what people are capable of doing, right? Even to benevolent organizations like ASF, it's distressing.


So, one of the things that I've really tried to encourage is it's okay to be open to some extent, but you have to have some common sense about your security exposure. That's what I've been trying to do just for the entire time that I've been here is just to try and reel some of that in without losing the culture, because I think the culture is valuable. Like you said, the incident that happened whenever that was, I think it was a right decision for the time. Would you do that today? Probably not.


It's not because you wanted to cover something up, but it's because you want to limit your exposure. Yeah, so it's a different culture now, not the ASF, but the world in general. You have to keep that in mind as you move through the day to make sure that you are minimizing your risk and minimizing your security threat vectors.


All right. Have you had instances where a project has basically treated you as their dedicated resource? Has anyone made unusual demands of the team? I’m not asking you to name names, but I can imagine it can get out of hand with all these different projects, especially the corporate ones.


Absolutely. Yeah, the corporate ones are typically the biggest problems, because they come in with a much different mindset than somebody who's come in from developing an Open Source package and has brought it to the ASF. The corporate projects that we've seen really are the ones who are the purveyors of that mentality. They feel Infra is their personal resource, because they don't really have an understanding of the scope of the Foundation. They don't have an understanding of the number of projects that Infra supports. So, I don't really fault them for that, because it's just a matter of education. They just need to understand where they are placed in terms of the Foundation, in terms of Infra's availability and scalability.


Once we've explained that to people, they get it. We typically don't have any problems after that. But there are a few projects that have come in and just persisted in wanting weird stuff. Some of the things you can provide. Some of the things, you just got to kick back and say, "Hey, this is not something..." Like I mentioned earlier, if it doesn't have a broad benefit to the Foundation, if it's something really specific to your project, Infra is probably not going to support that for you, because we can't support all these one-offs.


So, we'll say, "We'll give you a VM. You can do it yourself." That's worked out pretty well, but there’ve been a few cases even where people like Greg and David have had to go and talk to these projects and say, "Look, how you're approaching this is not appropriate. You need to pull it back. You need to rein it in." But that's really pretty uncommon. I would say just a basic education as people come through the Incubator is sufficient to dispel most of that.


Those kinds of projects... Do they stand down or do they wind up hiring their own committers to do their Infra work? Do you have any idea as to how that works? I'm seeing more projects coming in with more diversity in their committership to take care of marketing stuff, for example. That's expected especially as they scale, but from the site administration side of things, Website stuff, it's a very interesting thing to observe. Some project sites’ information is stagnant ... they're focused on specifically developing code. Others are super productive in terms of getting stuff done. I'm always wondering how are they able to handle all this? Curious to see if you had ideas as to what's going on there ...


I will say this, documentation is hard, right? Writing code is comparatively easy, and it's a lot more fun. So, when you're developing a product, your natural instinct is to develop the product, not develop the documentation. So, you get a project that's only got a couple of active members. They're probably not going to spend most of their time writing documentation. They're going to spend most of their time trying to advance the code base. Even within Infra, that's been a huge challenge for us.


Now that we've hired Andrew (ASF Infra team member and technical writer Andrew Wetmore) to help us work on some of this documentation, it's becoming extremely clear as we work through it how much of that documentation has been untouched. It's been stale, for all the same reasons as these projects. Yeah. Some projects will say, "Hey, we need a documentation guy. That's what Infra said, we need a documentation guy." They'll find one. Maybe somebody will volunteer or maybe it's a corporate thing, whatever. So, yeah, I think it really depends on the project. Some people have the resources. Some projects have the resources, and some don't.


Yeah, it's interesting. Again, since day one, since the '90s, documentation has always been an issue for all projects, even when we started with just HTTPd. It's a constant issue. 


If I was going to have money to do anything in a project, I would use it on documentation.


Documentation is often the thing we need the most. I mean, how is it going to work otherwise?


Yeah, I agree. Even from just a cognitive aspect, writing code and writing documentation are about polar opposites. The type of mind that goes and writes code isn't usually the type of mind that can write documentation or can write meaningful documentation. I'm guilty of it myself. I can't write documentation, I find it quite difficult. Whereas building packages and tying things together, and Puppet configuration management, is not difficult for me. So, it's a huge mind split between those two types of things. I absolutely agree that hiring somebody to do documentation is a great use of resources.


We've grown a lot during the time you've been with us, now six plus years. Other than scale, how has Infra changed over the years? What's unusual is that the team is getting smaller. I would presume as the Foundation is scaling upwards, you would have more team members. It's some crazy number: five people, six people, it's so small. It’s hard to understand how you guys handle everything.


Yes, six people, including Andrew and then Greg, right?


Including Andrew, that’s six, but Andrew doesn't handle the day-to-day Jira stuff anyway. He doesn't handle tickets. So, you really are a tiny group. From your perspective and your experience, would you say that that's a small group, considering the workload and the demand?


Yeah, I would say so. Probably based on my experience in other organizations, about half the size that it would be in a commercial environment. Well, to go to your original question there, in terms of what's changed, I think prior to David Nalley, I would say that Infra was extremely reactive. I think that's changed quite a lot. I think David has really brought an element of customer service and customer focus to the team that really had been somewhat lacking in the past.


So that was a proactive decision to go in and say, "We have to better serve our projects," right?


Yeah. I really do credit David with that. I think he brought a huge amount of that to the team and that mindset. It's really improved our relationships, Infra's relationships, with the projects. It's helped us develop tooling like Self-Service. The more that we can move off into those projects, do-it-yourself tooling, the better off we are, because it's fewer tickets that we have to handle. It's a constant juggle for us between dealing with legacy code, dealing with technical debt from years and years and years and years ago to doing modern things to bring out new tools, and all the while supporting projects.


In what areas are you guys experiencing bursts of growth or demand? Everyone has a slightly different perspective. I know CI comes up a lot in this arena. Greg's always saying (since I deal with ASF’s Sponsors), "We need more." Where do you feel Infra's growing at the highest rate or the most interesting rate? Where do you feel like that's happening?


Yeah, continuous integration was the first thing that came to mind when you said that. The more projects we have, the more need there is for CI. That's fairly linear. Other growth places are things like Infra VMs, machines that we run to support Infra services internally. Prior to the resources that we have now, we used to have a lot of monolithic systems, systems that would run a lot of things. Think of a machine like Minotaur, which used to run two dozen services on one machine. That's not a best practice at all.


Moving to aggressive use of configuration management with Puppet, and making sure that systems are easily replicable with the configuration management, has allowed us to really build -- not quite micro services, but single purpose systems, which are a lot easier to maintain, a lot easier to scale than some of those monolithic systems. So, that's been a big growth area for us. Just the number of VMs, number of systems that we're maintaining, it's got to be in the hundreds at this point. I haven't counted. Yeah.


...These microservices that you're mentioning also reduce the single point of failure, which is critical. That keeps you guys scalable and keeps you up and running. That's important.


Yeah, that's right.


I'm curious when was the last time you guys had a fire drill type of thing, where everyone's hands on. You had something recently, right? A couple months ago, all hands on deck, there was something broken. You guys were able to resolve it pretty quickly, but that's uncommon, where something breaks in its entirety.


I don't want to say anything about this, because it's going to cause a problem.


...We can go off the record.


What I mean is I'm going to say it's fine, right? And then something's going to break.


...[laughing] You don't want to jinx it. Okay.


We have failures from time to time. We've had some situations where there's been a problem at a colo. One of our VM providers had an issue and we lost machines. We had to rebuild them with Puppet, our configuration management, and restore stuff from backup. It sucked, but it wasn't a disaster, right? Because we have the backups. We have the capacity. We have the configuration management. So, nobody had to rack their brains: “How did this work? How did this go together?” We’ve made very, very big strides in avoiding that old mindset of ‘one guy set this up 10 years ago and nobody else knows how it works.’ We're very much trying to avoid that these days.


...Right, bus factor.


Yeah, yeah, yeah. The configuration management systems have been absolutely critical with that. So, that continues to grow. We continue to add to configuration management wherever possible and just make sure that those systems are able to be reconstituted wherever, whenever it's needed.


Cool, cool. Okay. What do you think people would be surprised to know about ASF Infra?


The other guys probably said the same thing, but probably the amount of stuff that we support from the number of people we have. I think that would probably surprise most people in the industry.


That's one answer. I think it was (Infra team member) Chris Thistlethwaite who said "that we exist", that you guys exist. People don't know how it happens. It's like magic. I've always talked about how Infra is this crazy-magic-impossible story. It's like The Little Engine That Could, because you guys are such a tiny group. You have such a good working relationship, and everyone is connected. From the outside, it seems like a completely seamless operation. There's this magic thing behind the scenes, and then you find it's only five, six people running it. That's mind blowing. It's incredible.


I hope that people have that perception. We do try to provide a unified front. In reality, there's not really any infighting in the team. We all generally know what needs to be done. We all generally agree on how to do it. So, the disagreements are fairly minor and not all that common.


Well, that in itself is unusual, right? Think about it. I mean, there's a lot of factions and politics and weirdness, but that tends to happen with larger groups. So, you guys make it work in a way that's awesome.


I think one of the things that makes that the way it is, is because we're all supporting the ASF, right? We're all here, because we support the Foundation, and we want the Foundation to succeed. So, that drives, I think, a lot of the direction and the way that we approach how we support the Foundation.


You guys have a very different common goal, right? You're there for the benefit of the Foundation with a capital F; Projects are there to work on their own thing. Of course, if they can help everybody else, that's good, too. But the focus is different. 

...What is your favorite part of the job?


I have to say, the flexibility and the remote aspect of it, along with the constantly changing technology. There are a lot of opportunities to learn new things, and work on new technologies. 


...You are all on call for certain periods throughout the week, right? So, because of your 7:00 to 11:00, are you ever on call overnight, or does that just not work out with schedules, or it doesn't matter?


Well, we rotate on call. So, you're on call for a week at a time, starting at, I think, 10:30 or 11:00 AM Pacific and then going through the following week. So, typically, what happens is you'll get the pages when you're on call, regardless of the time of day or night. But the way that it works out, typically, because we have folks in Europe, we have folks in the US, we have folks on the West Coast and the East Coast, almost always there'll be somebody awake and available to answer.


Sometimes in the middle of the night, if my pager goes off at 2:00 in the morning, I'll look at my phone and I'll see that Humbedooh or Gavin is already working on it. Thanks, guys. Obviously, the same is reciprocated, right? If the phone goes off in the middle of their night and I see that they're on call but it's 3:00 in the morning, I'll grab a ticket if I can, I'll grab the call if I can. We just try to help each other out that way.
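
The weekly handoff Chris describes can be sketched in a few lines of Python. This is only an illustration, not Infra's actual paging setup: the roster names, the anchor date, and the fixed UTC-8 offset standing in for "Pacific" (which ignores daylight saving time) are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "Pacific" is modeled as a fixed UTC-8 offset (no daylight saving).
PACIFIC = timezone(timedelta(hours=-8))

# Hypothetical roster; the names are placeholders, not real team members.
ROSTER = ["admin_a", "admin_b", "admin_c"]

# Assumed anchor: the first shift starts on a Monday at 11:00 AM Pacific.
ROTATION_START = datetime(2021, 1, 4, 11, 0, tzinfo=PACIFIC)


def on_call(now: datetime) -> str:
    """Return the roster member whose week-long shift covers `now`.

    `now` must be timezone-aware; it may be expressed in any time zone.
    """
    # Whole weeks elapsed since the first handoff (floor division).
    weeks_elapsed = (now - ROTATION_START) // timedelta(weeks=1)
    # Cycle through the roster, one person per week.
    return ROSTER[weeks_elapsed % len(ROSTER)]
```

Calling `on_call(datetime.now(timezone.utc))` returns whoever holds the pager for the current week under these assumptions. The informal part of the arrangement, where whoever is awake picks up the page, is deliberately left out of the sketch.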


You guys are a true team: you have each other's backs in a way that again, is unusual to see. It's almost like family but even better, because even family has infighting and issues. You are there for each other, which is really, really cool to see.


Yeah, let's say we've had our disagreements, but it is a very familial atmosphere.


When you first came into the role, what was your biggest challenge? Was it what you thought it was? How was your experience?


It was an incredibly steep learning curve. When I first started here, we were in the middle of the transition from the "one guy who set up everything, a volunteer five years ago, nobody knows how it works" environment to configuration management. We were just starting to get into that, and shore up some of our documentation at the time. For me, just coming in and learning all the different systems and all the different processes and all the different edge cases and one-offs and locations for things and who's who and all these, that was incredibly difficult. It took me probably at least a couple of years before I felt comfortable with most of the systems.


Even today, there's stuff out there where I'll be like: "I'm not sure what this means. Do you have any idea what's going on?" Because there's so many little pockets and holes and places and things and historical legacy stuff. It's very complicated. It's been organically grown over a long, long time.


...With a lot of different personalities and a lot of different processes, that is what's unusual. The "quilt" that makes Apache is so diverse.


It is.


What are you most proud of with your career with Infra so far?


I'm not really sure, to be honest. I don't tend to think of things like that. I can't really single out one thing and say, "Hey, I'm really particularly proud of that," or whatever. I try and take pride with all my work. Building better backup systems, I think, is definitely a big one. Just getting through some of this mail project has been good as well. When I finally got everything working, that was a pretty proud moment there. I felt pretty good about that. That was a complicated system. It's still a complicated system. I'm still not sure it all works right. That's why we have to test it. By and large, I'm feeling pretty good about it.


That's great. How would your coworkers describe you?


[laughs] Grumpy.


...[laughs] The response is the same with everyone. Everyone laughs, but grumpy is the first one I've ever heard.


I don't really talk too much. I'm not a super verbal person. So, I always seem to come across as grumpy on the chat systems there. It's a schtick, I guess, but it is fun. I'm not really grumpy. Well, most of the time.


What are the biggest threats or concerns that sysadmins need to watch out for? I don’t mean doom-and-gloom unless there’s actually doom-and-gloom ...A lot of non-Apache folks are curious what the Apache guys think. So, is there anything that you could share in terms of advice or trends that are coming up or something that people should be aware of moving forward?


Security, backups, disaster recovery, those are the keystones of any organization that you absolutely must have in place to sleep at night. If you don't have any one of those three, you're in grave danger of doom-and-gloom.


That makes sense. What is your greatest piece of advice for someone looking to have a job like yours?


Oh, boy. Run for the hills [laughing]. Work with as many different things as you can, learn as many different things as you can, and try not to get stuck doing one specific thing. I think in my career I've been such a jack of all trades that it's really helped me to be able to see and build systems that work with a lot of different technologies. You get some people coming in, they're IBM guys, like a specific subset of IBM AIX expertise or something, right? That's all they do. And then when the situation comes around, well, nobody's really using that anymore, you run into a problem, because you're not really marketable anymore. So, the advice that I would give anybody who's trying to get into the system administration field, be broad and learn as much as you can about as many different things as you can.


If you had a magic wand, what would you see happen with ASF Infra?


I think I'd probably just give us more resources. I mean, I don't really have any complaints, to be honest. I think if we had more, then we would do more.


...More ...machines or more cash or more team or more what?


All of those ...I think more cash. Being able to buy more physical compute resources would go a long way for us. We do rely so much on donations and donated resources that it can be a little bit daunting when that donation goes away and you have to scramble to fill the void. Staffing is a complicated one, because it is familial.


Having somebody new come on board, it's challenging. It's nice to have an additional person be able to work on stuff, but going through the process of integrating them into the team and teaching everything else, it's daunting, it's challenging. So, I think having more resources would be more important at least to me than having more staff, because I think we're doing all right with the staff that we have now. So, that's just my perspective.


= = =


Chris is based in California on UTC -8. His favorite thing to drink during the workday is ice water and the occasional Diet Pepsi.

Monday January 11, 2021

Inside Infra: Chris Lambertus --Part I

Part I of the last of the "Inside Infra" interview series with members of the ASF Infrastructure team features Chris Lambertus, who shares his experience with Sally Khudairi, ASF VP Marketing & Publicity.


"...The thing that we're fighting against is the safety and longevity of the old technology. For quite some time, our primary concern was that the hardware that was running this, which was 15 years old, was going to fail."


What's your name and how is it pronounced?


My name is Chris Lambertus (“Kris Lamb bert uhss”): it's pronounced exactly how it's spelled.



When and how did you get involved with the ASF?


I've been aware of the ASF probably at least since the inception of the ASF. I've been working in IT for quite a long while, and I've been very familiar with the ASF projects, because I use them daily in my career. I didn't actually get involved with the ASF until a buddy of mine who was working on the CloudStack project mentioned to me that (ASF VP Infrastructure) David Nalley was looking for somebody to do some contract Infra work. Long story short, I talked to David and I tossed in an application. I was eventually hired as a part time contractor.


...So CloudStack, we're talking about 2012, when CloudStack first came into the Apache Incubator, or was it after that?


I've been aware of the ASF probably since HTTPd, since the original Web server came out. I joined the team late 2014.


Explain your role within the Infra team —how did you get here? Were they looking for someone who specializes in something particular?


My understanding was David was really looking for somebody that had a background in production systems engineering and had been doing it for a long time in a production environment. That's something that I had been doing since 1992: I've been essentially a professional production systems administrator. I knew that skill set was definitely in line with what David was looking for. I think I brought that to the table pretty well. That's basically what I've been doing ever since as a contractor.


What are you responsible for specifically?


That's a complex question, because the ASF Infra sysadmins are essentially responsible for everything. All of us are responsible for all of the things. We do tend to specialize a little bit. My current project is probably reengineering the mail system: it's the largest one I'm working on right now. I do tend to focus a lot of my efforts on backups. Beyond that, I do a lot of Jira and Confluence work with Gavin (ASF Infra team member Gavin McDonald). But Puppet, configuration management, again, all these things are things that all the Infra guys support.


In past interviews everyone has basically said, “we do everything”. How does it work? There's no hierarchy. Everyone does everything. Do queries come in and everyone jumps on them? Do you have a round-robin way of getting stuff done? How do you manage with so much going on with Infra? How do you cope with that?


“Cope with it” is an accurate term. Each of us has... I don't really want to call it a specialty, but definitely a focus. If some question comes in about the new mail routing or something that I've been specifically working on, that would go into my bucket as a priority. Certain people have history with certain types of projects. Gavin (Infra team member Gavin McDonald), for example, has been heavily involved with the Continuous Integration infrastructure for many, many, many, many years. So, he tends to be the font of knowledge for all things CI-related.


We tend to break things up that way. Some of the team members definitely have skillsets above and beyond general system administration work. Humbedooh (Infra team member Daniel Gruno’s username) is a very skilled programmer. He then ends up owning a lot of the software that Infra has developed and he has developed. So, questions regarding that, and specialty configurations related to software that he's written tend to go into his bucket. Because of the nature of the team, and because of the nature of the time zones that we're all in, the responsibility of dealing with issues falls on whoever is on call, first of all, and then whoever is awake and available, to handle any situation that comes up regardless of who "owns" the technology.


Describe a typical workday for you.


Apache work for me is basically: I wake up in the morning, hopefully not at 3:00 in the morning, but get out of bed and plop down in front of the computer. Essentially, my lifestyle is I've always been a computer guy. I've always been really focused on computer system administration, not only as my work but also as a hobby. So, I spend the vast majority of my day behind the computer, whether I'm working on Apache stuff or working on other projects, things like that. That'll go on until 11:00 at night. So, my "workday" is essentially me living my life and doing tasks as they arrive and doing projects as necessary and getting things done that need to get done.


...All the things.


Regardless of the time of day, yeah.


How do you keep your workload organized? Folks have all sorts of different systems. Are you an Evernote type of person, or do you keep your own journal? Do you have a certain system to help manage your workload?


Jira is the primary basis for managing my workload with the ASF. We've done a lot of work in terms of building technologies around Jira. Our service level agreement reporting tools I find extremely useful for seeing what's in the queue, what needs to be done, what hasn't been touched in a while, things like that. That really drives a lot of my day-to-day efforts in terms of replying to tickets and servicing customers.
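
As a rough illustration of the "what hasn't been touched in a while" view, here is a minimal Python sketch. It does not reflect the ASF's actual reporting tools or the Jira API: the ticket dictionaries, the `INFRA-…` keys, and the seven-day threshold are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def stale_tickets(tickets, now, max_idle=timedelta(days=7)):
    """Return tickets not updated within `max_idle`, longest-idle first.

    Each ticket is a dict with a "key" and an "updated" datetime
    (a hypothetical shape chosen for this sketch, not a Jira schema).
    """
    idle = [t for t in tickets if now - t["updated"] > max_idle]
    # The oldest update time sorts first, so the longest-idle ticket leads.
    return sorted(idle, key=lambda t: t["updated"])


# Hypothetical sample queue.
queue = [
    {"key": "INFRA-1", "updated": datetime(2021, 1, 20)},
    {"key": "INFRA-2", "updated": datetime(2021, 1, 2)},
    {"key": "INFRA-3", "updated": datetime(2021, 1, 10)},
]
overdue = stale_tickets(queue, now=datetime(2021, 1, 24))
print([t["key"] for t in overdue])  # prints ['INFRA-2', 'INFRA-3']
```

Sorting longest-idle first mirrors the idea of working the queue from the most neglected ticket down.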


In addition to that, I also use Jira to track my projects. So, if I have a project going on, that's usually a Jira ticket. And then I can go back and refer to those and see where things are, what needs to be done. I've never been a big one for lists of notes. I do have notes that I keep, but by and large, the things that are on the top of my stack remain at the top of my brain at the same time. So, I don't feel like I forget a lot of things, but I don't take a lot of notes, which is what it is.


…And then there's people who have everything except the monitors covered in Post-its.


I don't do a Post-it thing, but I have little text files everywhere with notes and things in them.


So, you all have day-to-day tasks that you manage, as well as things that require your immediate attention, as well as long term projects. In my earlier interviews with other Infra team members, everyone's been saying that I have to talk to you, because you're handling “The Email Project”. For those who aren't aware, standard operating procedure at the ASF is “if it didn't happen on-list, it didn't happen”. So, you have, if I'm understanding this correctly, 21 years’ worth of email archives that you're working on. What's going on with this project? What are you handling? Why is it so important?


Well, as you know, email is the lifeblood of the Foundation. Everything that happens here happens on a list. Because of that, the Foundation has amassed a very large quantity of email archives. Those archives are fundamental to the provenance of the Foundation. So, maintaining those and keeping those safe and available is really a top goal of the Infra team.


The mail project, such as it is, is essentially to upgrade and migrate our existing legacy email system to a modern, more supported system. The current email system as it stands was engineered by folks, volunteers, some staffers, I would guess, over 10 years ago, maybe 15 years ago, running on FreeBSD, which we don't really use too much anymore. Actually, we don't really use it at all. They used technologies that were interesting at the time, but are perhaps not so well supported today. So, a lot of it is modernization.


A lot of it is taking a lot of that old tribal knowledge that really doesn't exist anymore and bringing it into the modern era, documenting all the weird little settings that we have and all the edge cases that we manage in email, management of the list systems, mailing lists and their configuration, and making sure that gets upgraded, migrated, modernized. Doing that all in such a way that we don't a) lose anything, or b) suffer any downtime. So, it's a large project. That's really what I've been working on probably for the better part of the last two years, bringing that up to the present era.


You’re like the Titan Atlas: carrying the heavens on your shoulders. That's a massive, massive undertaking. Is there like a deadline for this —where's the end for this project? Is it never ending?


I feel more like Sisyphus than Atlas, but the deadline is as soon as possible. The thing that we're fighting against is the safety and longevity of the old technology. For quite some time, our primary concern was that the hardware that was running the old email system, which was 15 years old, was going to fail. In fact, it did. But fortunately, I basically copied the whole thing off to a separate colocation facility. So, we had an archive of it when it went down, and I was able to bring it all back up.


So, that wasn't a problem. I mean, it was a problem, but it wasn't a disaster as it could have been. So, the deadline is as soon as possible. But in reality, it's going to work until it stops working. I'm not sure how to better state that, because the technology is so old and we really need to get off of it and onto new technology. But there's no hard and fast timeline. Nobody's really cattle prodding me to get it done, but it's the absolute top priority that I have.


...That was actually my follow-up question. Is the “as soon as possible” official, or is this something you're setting for yourself because you just want to get it done?


Oh, that's definitely an official timeline. Yeah.


...I remember our first email servers were a machine under Brian Behlendorf’s desk at the Wired offices. So, we've come a long way since then.


We have, yes.


...You're handling this behemoth. Are you also dealing with the day-to-day putting out the fires, as well as everybody else?


Absolutely, yes.


The volume and scale of this project seems so huge. Again, the word 'cope' keeps coming to mind, because knowing what I know —and I don't even know— it's just scratching the tip of the iceberg: it seems astronomical in terms of scale and scope. Are you building everything from scratch for this project? Are you using any kind of commercial packages? This is a huge overhaul. Tell us more about it.


Multitasking has been in my blood for my entire life. I don't typically have a problem of splitting my time and my attention and my energies between multiple projects. You are absolutely right: this is a titanic project. It's one of the reasons why it's taken so long. Like I said, we've been working on this for several years at this point. The reason it's taken so long is twofold: One is I can't spend 100% of my attention on it or else I would go absolutely crazy. So I partition that. I partition my mind and my time, if you will. Just a little bit of time here working on this, working on this particular aspect of it, then I'll go work on some tickets. So, I'll go work on something else. If I was only working on the mail, then other things wouldn't get done, right?


I have to partition it that way. I think the main way I've tackled this type of project... Again, my experience in system administration going back so far, I've worked on a lot of very large scale projects. So, this is in the middle in terms of the scale. But the biggest thing is to break it down into multiple components, each as small as you really can. The first thing to do is to analyze the existing system. "What is it? How is it running? How is it tied together? How are these things all related? Where are the pieces? Where are the tendrils? How far do they go?" “Write that down.”


I started developing documentation that explained a lot of stuff. There was some documentation that existed. I take that and I carry it forward then into the new system. Okay, "what things do I want to keep? What things do I HAVE to keep? What things are legacy? What things don't we use anymore?" That process of discovery, of understanding how it was built, why it was built and what we're still using, and what we don't need to use anymore, is probably the vast majority of the work--just to understand it. Once that's done, we say, "Can we use the old technology, or do we need to use a different technology?"


In the case of the Foundation, we're extremely tied to the way that ezmlm, our mailing list system, works. ezmlm is extremely tied to Qmail. So, converting those into other tools, basically, I'll say, it's too complicated. With the amount of data that we have and the amount of dependence that we have on those configurations, migrating it to a different system would be incredibly difficult. So, there are modern versions of (and updates for) these pieces of software, ezmlm and Qmail.


What we've done is I've taken those packages and I built them for modern operating systems. I've patched them with current technology, TLS and various modern email stuff, and put that into configuration management and built a system that deploys all those packages in a reproducible fashion. So, at any time, I can just turn on a new machine. I could type in, "This machine is the new mail router," and run Puppet, our configuration management software, on it. It'll deploy all that software automatically.


That's probably the second part of this huge phase of developing this. The phase that we're in right now is testing it to make sure that it works the same way as the old one works. Once that's verified, then we can actually look at migrating the old data onto the new system and deploying it into production. I think that answered your question.


I think so, but it made me think of another question: How did this wind up being "your" project? Was this assigned to you? Did you jump on it going, "Yeah, I'm taking it"? How did you wind up with this?


That is a very good question. I don't really know. I think probably just because I had been working with... Back in 2015, maybe, we were actually having this exact same discussion: "what do we need to do to migrate this ezmlm, all these mail archives, all this stuff to a new modern system?"


One of the things that we looked at was, "Can we transfer this? Can we translate this to something like Mailman or some newer type of mailing list management system?" We looked at a couple of options. The biggest problem we had was that the archivers were terrible. So, Humbedooh basically ended up writing this thing that became Pony Mail as the answer to that system. Ultimately, that turned out to be a great effort. I think it's going to take us a long way. But in the end, I was the one to continue to work on the email system. For whatever reason, I guess it just became my thing. Maybe because I was the only one willing to do it. I don't know.


...Is the legacy system going to be powered by Apache Pony Mail (incubating) at some point, or is it already in the process?


So yeah, lists.apache.org is our primary advertised archive system. That is what we're telling people to use. In terms of what happens to the old system, that remains a little bit under discussion. I don't know the ultimate disposition of that, but the current plan is lists.apache.org will be the primary access to the mail archives.


I noticed that Pony Mail goes back quite a bit, but it didn't originally go back as far as it does now in terms of the archives. I’m curious to see if everything eventually is going to be migrated to it.


Yes, yes, we actually have a plan to load the previous archives in there. We loaded a subset when we first started it up. I believe they go back to 2012 right now. So yes, we do have a plan to load the previous archives.


Great. I understand some Apache projects and their communities are always asking for new services. How does Infra decide which products you support? Who gets assigned to take the lead on introducing new services or new products? I understand that you develop your own custom solutions as well. How do these get divvied up? Is everything in queue? How does it get done?


When you're talking about a project requesting a service, I think the first thing we look at is, "Is this service extremely specific to this one project, or is it something that has broad appeal to the Foundation?" If it has broad appeal to the Foundation, we've got multiple requests for it, it's a service that we feel we can provide, given the amount of time that we have available, then it's something that we would consider doing.


Obviously, there's a lot of other thought that goes into that in terms of what it is, what it does, what it needs to do, who needs access to it, that we have to evaluate. But generally, if it has broad appeal to the Foundation, it would be something we would look into. If it doesn't, if it's something that's very specific to a certain project, what we typically recommend is that a project request their own VM. They can run the service themselves. That's typically how we’ve approached that in the past.


Has the team been in a situation where you're like, "Hey, this is a really cool thing, let's bring it in," and then throw it on projects or see if anybody wants to do it? Does the converse happen also, where you guys have insight as to something that's hot and new and you think that would be a great fit for Infra, but you have to find a "problem" to connect it to; or is that not something that you deal with? Is your work all reactive, or do you ever come into a situation where you say, "Long-term planning: we want to introduce something brand new"?


I think probably up until maybe five years ago, the work was almost entirely reactive. But the team and the processes that David (Nalley) has now put together have really pushed us more in a direction of future planning, of taking the time and taking the mindset of, "What can we do long term to better support projects?" I think selfserve.apache.org is a great example of that. That's something that grew out of a small subset of tools. We got very positive feedback from Committers and Projects about using selfserve.apache.org.


That tool has grown extensively since it was developed. I think one of the best things that we provided recently is the .asf.yaml system, which allows projects to essentially set up 90% of their project metadata in GitHub. It lets them set labels. It lets them set notifications. It lets them set all kinds of things, all self-service. So, it's taken a huge load off of Infra in terms of responding to tickets, and also put a lot of that control in the hands of the projects. That's been incredibly well received. It's definitely, I think, one of the best things that we've done for projects in a while. I think it's a fantastic tool.


That's great. Now, it's also a new way of institutionalizing, so to speak, of "scratch your own itch", but in a way that's a common deployment. You can do your own thing, but there's a common mechanism or method of doing it, because before --it was like the Wild West, back in the '90s-- everyone's just doing their own thing. It didn't really matter, but it wouldn't scale properly: you guys can’t really support them because everyone's doing something and it was a one-off. It's interesting to see that selfserve.apache.org has standardized or unified that process.


Yeah, and one of the things that I really like about it too is because we have so many different projects --they're so varied-- the people that work on them are so varied in their skill sets and their desires and their interest level and their skill level and all this. What we want to be able to do is empower projects to use the tooling and take advantage of the skill sets that they have available. So, we don't want to arbitrarily enforce, "Oh, you must use this particular technology," but we also don't want random technologies to proliferate, like you alluded to, the Wild West. So, it's a very refined balance between, "How do you allow projects to do their own thing in a way that's scalable and supportable?" That's a complex task. It's difficult to manage. I think Self-Serve (selfserve.apache.org) goes a long way to support that.


Speaking of Self-Serve and other solutions the team is providing, the strategic process of figuring out where to go —direction— I know you have David (Nalley), I know you have Greg (ASF Infrastructure Administrator Greg Stein). Does the entire team participate in this? How does this work: is it top-down, or is it bottom-up? Are you guys saying, "Hey, there's a new thing that we should do"? I presume you don't have an annual strategy, but rather an ongoing rolling process; how do strategic decisions get made?


It's a collaborative effort, for sure. I think we do have an annual —when we get together at ApacheCon, we do tend to have a lot of discussions about strategy and about future direction. That's one of the things that we try to do as Infra with our team meetups, and with ApacheCon as well, to get together in person in a room and talk about where we're going to go, what we want to do. I say the process is collaborative, because sometimes it comes from the direction of Greg, or the Board, or David, or whoever. Sometimes it comes from a staffer saying, "Hey, it'd be cool if we could do this."


Sometimes it comes from Projects, or Committers, and they say, "Hey, can we go in this direction? I think it would be useful for X reasons." It just depends. By and large, the decisions for a future strategy are brought up by whoever thinks of it and are discussed within the team at a peer-to-peer level, right? We have very few situations where Greg or David or somebody will come down and say, "Thou shalt do it this way." Yeah, very uncommon to have that happen. It's a very collaborative environment, which I appreciate and works well for me.


So, in light of the pandemic, you guys didn't have your face-to-face. Did you do a virtual annual meeting? Or did it just not happen?


Well, we have a weekly team meeting. Yeah, we didn't do any virtual thing beyond that.


With so many projects at the ASF now, with 350 projects and initiatives and growing and so few of you in Infra, you must be constantly learning new things. How do you keep abreast of what's new? How do you close your skills gap? How do you stay ahead of everything?


I follow a few mailing lists, discussion boards, Reddit, and other similar sources. I typically learn new things when I need to implement new technology to solve a problem. "How do we provide 'X'?" I’ll go research it and learn that way. I also find out about new things from my hobby projects or other work.  



...It's not like "I want to take Blah University to become certified in X" or anything like that, right? I mean, you’d do that from your own interest, but it's not something that's required of the job unless it comes up, right?


No, that's never been required of the job. Personally, I'm very much a self-directed learner. If I'm interested in something, I will absolutely seek out the resources to do so. I will say that there's not a lot of time for that stuff, at least not for me. I got a lot going on, right? So, having the time to sit down and take a class or go through that process, I find very difficult. I don't really learn that way very well either. So, class-based learning has never been for me.


...Not linear. Yeah.


Yeah. So, typically, if I want to learn something new —I've been trying to learn Python, because it's definitely a gap for me— I find it incredibly difficult, because it's very hard for me to sit down and watch a video on programming, right? I got to have a reason. I got to have a thing to do. I need to have a project that requires it. And then I go and I figure it out.


...Got it. So, it's purpose-driven education. You need an end result.


Yeah, exactly. That's how I've always operated.

[END OF PART I]

Monday December 14, 2020

Inside Infra: Andrew Wetmore --Part II

The "Inside Infra" series continues with members of the ASF Infrastructure team. Andrew Wetmore shares his experience in Part II of his interview with Sally Khudairi, ASF VP Marketing & Publicity.



"The nice thing is that the Infrastructure team does so much so well and almost making it look easy that any project in Apache that's really got itself organized to do its work is going to find success, because there's going to be no roadblock or brick wall or power failure that will keep them from it. That makes me feel like I'm engaged in a very small way in a very large good thing."


Let's talk about your background and your road to the ASF. How did you become a technical writer and editor? What sorts of projects were you working on?


Well, let's see. I spent 20 years as an ordained minister and I was working for the Episcopal Church in the US and the Anglican Church in Canada. I got to the point where I preached my many thousands of sermons and it was time to stop. It was about then I moved over into QA and documentation with a company building healthcare software in DOS. That tells you how far back we are. One of my first great excitements was helping that team through the Y2K tensions. I got myself a bit smarter and took courses and had a lot of hands-on experience. I became proficient as a tester as well as a documenter. I worked over the next 15 years for a number of companies, large corporations, startups, and nonprofits, and leading teams or participating in teams, both for documentation and testing, but also at one point, I was the director of user experience. It was designing the front-end for a big complex project.


I've built applications from end to end, usually using Flex, which compiled to Flash in the days when we could trust it, when we had Flash to play with; ColdFusion for munging things around and communicating with the database; and a MySQL database.


… That's a blast from the past with ColdFusion.


It's still around. There's a new ColdFusion. It's even brighter and shinier, I'm sure. I was just looking at it and thinking, "Gosh, I really should take a look at the tutorial and see if I still recognize anything."


I'm curious, when you tell people what you do, how would you describe the ASF to the uninitiated?


I would say it is a benevolent community home for a whole bunch of highly focused teams who are trying to do good stuff. The benevolent community home provides the support features that let those teams do their things without crashing into each other. 


How do you explain what you do?


Do you know the movie Fifth Element?


… Yes. That's a cult film in the tech community. I've only seen parts of it superficially, and I don't remember it well years later, so I might have to watch it again.


You were very young. Your parents probably had to give you permission to go to it.


No. I'm older than you think.


In that movie, there's a sequence when a bad guy is explaining economics by knocking a drinking glass off his table and it breaks and there's a mess. Out from the baseboards of the wall come all these little robots, one robot with a broom, one robot with a dustpan and one robot with a vacuum cleaner and a duster and they go. They run around and they clean up the mess and they disappear back in the baseboard. That's me on the Infra team.


I'm the little guy with the dustpan.


… I love it. I understand that you are also very active with ApacheCon --were you involved in this past ApacheCon that we had in September?


I was. I thought Royale had some things to say that could be said and I looked around and nobody seemed to have the time or have paid attention to the fact that there should be a Royale track. I said, "Oh, there's going to be a Royale track and I guess I'll coordinate it." 


… You volunteered to do that, you decided to do it, you just rolled up your sleeves and dove in?


Well, I said to the team, "If nobody else will do it, I'm going to do it."


I was sure, I was so hoping, that someone else would say, "Oh, no, I'll do that," and then I'd be in a support role, but that didn't happen. I also engaged with the team that Rich had to put ApacheCon together, but in a very minor way. I didn't help as much as I felt I should have helped just from a lack of time.


… You were on the Planners list; you were involved with that as well?


Yeah, in the regular meetings and so on and testing out things like the --


… Hopin platform.


I have my own Hopin account now because I found it quite useful.


Was that your first ApacheCon or have you gone to a face-to-face event before?


I've never been to an ApacheCon until this one. Obviously I've never been in a face-to-face one because there hasn't been one since. In fact, the Infra team was going to have one of its annual face-to-face meetings. We were all geared up to do that just when the lockdown happened this year. I haven't even met my colleagues.


… That was actually one of my questions, so we'll get to that later; that's interesting too. Just before we leave ApacheCon, Apache Royale, you mentioned, for them, "it's code once and run anywhere".


That's the theory.


The Project is popular with folks who are programming for mobile devices and other applications. Is this something you're considering with your work that you're doing on apache.org? Is there any cross-pollination or is it completely on a content basis only?


I've not got to the point of suggesting that Apache as a group or that the Infrastructure team use Royale. One reason is that Royale is at the 0.98 stage release. It's really darn good, but it has some weaknesses still. I'm thinking that the suggestion, "Hey, why don't we use the thing which is after all an Apache project to power our front-ends?" should better happen once we're at the 1.0 version.


… I'm all about eating your own dog food, but when you're ready, right? Early on I'd always wondered why don't we require our projects to use Apache projects for everything and was constantly told, "That's not how we work. We don’t dictate what projects use --they’re free to use what they want." Very interesting position, compared to other groups. I was just curious, are you coming across things on the site and saying, "Oh, Royale will be a good fit for this."


One of the ways that Royale could be useful would be as the front-end for the Apache project pages. However, the Apache project landing pages are static. That's their primary thing. They're static HTML pages. Royale really shines when you're doing data-driven pages, when--


… You’re developing across platforms and devices


That's right. Sally logs in and Sally sees this, that and the other thing because of her role on the site. The Royale-built site dynamically knows what to show her. Andrew logs in and doesn't necessarily see what Sally sees.


… That's amazing.


It's cool. I've built apps like that using Flex that were serving people in many different roles, doing many different things on a common project without this huge massive duplicative pile of code to do it with. I could do my elevator pitch for Royale anytime, but that's not what this talk is about.


… No, but it's interesting because it's about site development. I was curious if it impacted what you're working on.


The main difference is that Royale has more power than a project site needs. Pushing them to use Royale because it's an Apache tool might be requiring people to use a shotgun to kill a fly.


With the expansion of the Web and how everyone's becoming a "publishing expert", many people want to break into technical writing and editing. They want to get their hands on site content and content development and don't know where to start. What do you think are some good entry points? Are there special considerations for Open Source, specifically Apache? Is there anything that people need to consider doing? Are you doing something that's very unique to you because you're doing that for the Foundation or is this a more common type of job that you could take anywhere?


I'm not a super expert on anything. I think that would be probably the theme of my life, but what I bring is curiosity and a willingness to try to see things, not just from my stance, but how this would look to someone who doesn't have the same preexisting knowledge that I have. Any person who would like to become a documentation person for something really just needs to find something and say, "Hey, can I write about it?" Almost any project in the Apache galaxy would jump on a person who is willing to help with the documentation.


There's not a project here … well, there are a few projects that are very, very thorough with their documentation, but there are a lot of them where this is the thing that you're asking a technical developer to turn over and use another part of their brain and become a writer about the result of the technical development. That's not the easiest thing for most developers. If somebody with writing skills shows up and says, "I really like this cool thing you're doing. Could I write about it, so I could understand it better and maybe others could understand it better?" I think any team in Apache would embrace that person.


Since Day One for us, it's always been "documentation, documentation, documentation", but that’s often lacking. It's a challenge because you want people to do what they're best at and most comfortable with and happiest doing … which … the majority of folks want to develop code and yet we have an uptick with contributors that are non-code contributors. It's an interesting thing to see, "Where can they find the talents?"


In the "real world", the world of a corporation trying to make a buck off their code, they'd have X number of testers and a build and release person and X number of documenters and a middle manager that would help to make all this stuff happen. We don't have that structure. Indeed, the poor developers are called upon to try to put into words what they're doing. It's just tremendous if we have more --I want to say code sympathetic writers, not people who don't have any clue what's going on, but people who have some idea. At least they can ask the right question and say, "How come I don't understand what you're saying here? I think I tried to do it and I can't do it." Then the developer can say, "Oh, that's because you need to be logged in here. We forgot to write that down."


… Right. The obvious missing element to it.


Well, it's so clear in front of you that you can't see it. I flew in a plane once, years ago. The guy sitting beside me had been on the quality assurance team for one of the Gemini missions, the space missions, and this was the early days of manned spaceflight. He told me about how they were testing the escape procedure if something went wrong on the launch pad and they had this 14-step procedure that was attached in front of the astronaut. They thought, "This looks pretty good. We'll go through it." Sat someone down inside the spacesuit in the chair and said, "There's your procedure. Do it. This is step one. Do this. Two, three, four, five." Step six is blow the explosive bolts to release the door. Boom, bang, bang, bang, bang, the bolts go. The door flies away. With it goes the list.


… Right. Forgot about that. Vacuum. There goes the astronaut too with it.


That's right.


… Wow: in the midst of it, you don't think about it?


Well, exactly. What I feel sad about is people who become excited in a software project and try to figure out how they can use it to solve their own problems or answer their own needs. They can't make it run or they can't make it do what they want and they can't figure out from the help docs how to even ask the question they want to ask and they give up and go away. That's a silent vote that we don't really hear.


… That is unfortunate. Unlike other Infra team members who have said to me; and I'm sure you read it --"everyone does everything".


I do nothing. I do nothing.


You're uniquely responsible for optimizing site content. Do you collaborate with any specific Infra team members? You said you talk to Greg. Is there someone you have to go to every time? Do you work with anyone else out of Infra?


I don't go to any one person because I really don't want to make that person roll their eyes when I contact them. It's a small team. First off, I usually lob out a question when I have an issue. I say, "I'm over here on page so-and-so. I haven't a clue what this thing relates to because it looks like it hasn't been touched in a few years. Who knows?" Sometimes, someone right away will say, "Oh, you do this and do that," or they'll say, "Oh, no. Drew knows about that or Gavin knows about that. Go ask him." Then I do that. And sometimes there's silence. Then I won't ask again on the list. I'll wait until the team meeting.


When we've got everyone on the call and it's my turn and I say, "I'm stuck on this thing. Who can help me?" someone always steps up and says, "Yeah, put me down for that. I'll help you in a couple of days." All of these people can do everything. The codicil to that is that all day long they are doing everything. I don't want to be hauling on the same person's elbow all day long saying, "Help me with my little thing." I want to spread out my requests, so I don't pull away any one person too often from the essential tasks, the core tasks of keeping the Infrastructure running.


… Do you work with anyone outside of Infra or you only work within the Infra team to get your work done?


There are a number of people I consult, especially specifically with this migration off the CMS. I'm dealing with people on various projects. I probably have 25 conversations going on with people specific to their projects about what are the various pluses and minuses of the alternative technologies that are available or how do we even do step whatever in the list that's on the wiki page. Actually, I'm so proud of myself when I can actually answer one of those questions.


… How do you collaborate with everyone? Do you use certain tools or is it just email or Slack? How do you work with those folks?


Primarily Slack. Well, there are two things. Slack conversation is going on all the time. I also keep my eye out for whenever a page is updated on the Confluence wiki Infra area. As soon as it is, like that little robot with the dustpan and the broom, I go and look at the page. I just scroll down it and maybe fix a little punctuation, change this word and that word to make it a little bit more clear. Usually, I save it in such a way that they know I've been there. Sometimes if I'm just being really, really finicky, I turn off the thing that says, "Tell the team that you've done it." I'll just stealth in and change that run-on sentence into something more legible, but I don't want to draw attention to it.


… Is that more common than not, or …?


I'm not going to say.


… The "non-intrusive" thing. Describe your typical workday.


I get up around 5:00 or 5:30 and get on the Slack channel and see what's happening. The nice thing about my part of this work is that almost never do I check in and they say, "Andrew, you got to fix this." There's usually not a hanging message about an urgent issue. Then if there is no such hanging message, if I know that I have an ongoing project, like the top-level apache.org pages, I go and tackle the next one. That will keep me busy for many days to come. I keep the Slack channel going partly because, as I said before, I like to monitor what's on people's minds because I may say before they think of it, "Oh, do we need a page about that?" 


I'm only working part time, so I only have basically a half of every day available for Apache. I tend to do a couple hours in the morning and another in the afternoon and another at the end of the day. I have another job with a small publishing house. As I'm working on that, I have one eye on the Apache Slack channel to make sure there isn't anything that requires me to jump over there really quick.


How do you keep your workload organized? It sounds like you're super immersed in everything.


Well, I've got, as I mentioned, a job jar page. Basically, it's a long, long list, a checklist of things that need to be done. It's divided into sections. There's a section of things that need to be done for which I need input from other team members. Then there's basically another section where it's just stuff I need to get around to doing. Then if the team comes up with something that they think I should be doing, they're not above adding an item to my list.


We are familiar with that. It's like, "Oh, I left many, many, many more items." How do you stay motivated? What are your challenges?


This is fun. It's partly fun because the ... Let me turn this around: I worked for a corporation at one point. This is fairly early in my career in software and I realized I didn't like what they were selling. I didn't like the way they were selling it. I didn't like what they hoped would happen with the stuff they were selling. It made it harder and harder to go into work. I was very happy when that relationship ended. Here, I am playing around at the edges of a very exciting machine shop or toy workshop with complicated gears and sliders and rheostats and bubbling things that I barely understand, but that I can be helpful with.


Really, I can't see any trouble with motivation. I don't have to say, "Oh, really, you have to go put in your hours." It's more like, "I know, I also have to do this thing outside, my editing job, but I really wanted to do this thing for Apache."


… That's awesome.


The nice thing is that the Infrastructure team does so much so well and almost making it look easy that any project in Apache that's really got itself organized to do its work is going to find success, because there's going to be no roadblock or brick wall or power failure that will keep them from it. That makes me feel like I'm engaged in a very small way in a very large good thing.


With regard to the Infra team, you're looking at it as an "inside outsider", right? You're someone who's working with Infrastructure, but you're not a sysadmin or not a DevOps person. Is this the first time ...


My camouflage is that we all have beards. I fit in that way.


[laughing] … is this the first time you've been part of an Infra team, because folks with your profile are usually part of a content team or marketing or PR or sales engineering or some other division that's more again outwardly facing from an acting standpoint. Is this the first time you're dealing with the underbelly?


Not purely. The very first software company I worked for, within a few weeks, I was in charge of build and release. That was as far down into the bowels of the code as I wanted to go at that point, but it was certainly not outward facing or documentation related. At another company, I was in charge of localization across 17 languages. Although of course, there's words involved, it was very much words in terms of, "Will the German for this thing fit on the button we have for it?" I've been the inside-outside guy in other situations before.


Cool. Website administration, as we've been talking about, running Websites in general has changed so much over the years. What has been the biggest change or hurdle that you've experienced?


Purely in Website administration or in dev?


... Anything for documentation, what you're doing now, and what it relates to administrating site content.


Gosh, I think the disappearance of paper for me as a writer is one of the big changes. Almost everything I do now I can do digitally. One of the companies I worked with, I helped supervise the transition from printed user guides. There was a great big room full of boxes of spiral-bound user manuals that we stopped doing. We moved over to a product which we could dynamically create, so we could create a manual for Sally and her role and a manual for Andrew and his role that would come off the same text source but would have different chapters with the stuff relevant to what they were doing. Getting away from the physical artifacts to me has been the biggest change in the writing world.


Remember, I work in a publishing company when I'm not at Apache, and what we turn out as books are very much in the physical world. I was just working with an author this past week where I'd sent her a PDF of the very final, final, final, "This is the final version of the book before we go to press." She said, "I need a physical copy. Send me one."


We printed off this endless book and drove it to her house. Then she found two tiny little things that way and we fixed those and everybody was happy.


… That's good. When I teach media training, I require everyone to put their laptops away and write, even if it's on the tiny little hotel notepad, but write it down because your brain processes things differently when writing.


Absolutely.


… Some people say, "I feel like I don't know how to write anymore." That's such a sad situation, but it’s our reality.


When I'm doing my own writing, first drafts are always physical. Pretty much always. Then the nice thing then is when you move it over to digital form, even if you try not to, you see ways you can improve it. You're already into version two with a better document because you wrote the thing first by hand and then you write it again on the keyboard.


You've seen a lot of transition in this space. How do you close your skills gaps in order to stay ahead of everything? How do you do that?


Boy, my skills gaps are larger than my skills. I'm wonderfully good at Apache Flex but it's not a skill much in demand now ... If I had known three years ago that I would be on the Infra team now, I would have learned Python, become very comfortable with it, because it's a Python shop. I'm learning Python on the side, not in order to become a coder, but just so I know what the others are talking about.


Got it. But that's not required for you to actually do your editing work?


No. Nowadays, the writing tools are good enough and versatile enough that they're almost transparent. If you learn how to write in Word or on Apple Pages and it turns out you have to write using Markdown language in Git pages, it'll take you 90 seconds to figure it out and then you can do it.


Great, so that's not a problem. Earlier we were talking about you not having actually met the team: the offsite was cancelled, and you haven’t been to a previous ApacheCon. A huge advantage for the ASF, as you know, especially during the pandemic, is that we've been virtual since Day One. We literally didn't have to change anything in order to maintain our day-to-day operations: it's just business as usual. Just keep going. For you, have there been any advantages or disadvantages of working remotely from the team? Do you think your work could be improved in any way or is it no big deal?


Because that's how the team works and because the expectation is asynchronous, that is you ask a question and you may not get your answer until the person four time zones away wakes up, it's not been a big impediment to me. If I need to have a private conversation with someone, I know how to ping them on Slack and we can open a private conversation. I will say I was working remotely from 2013, I guess. I moved from Boston up here to Nova Scotia holding on to the same job I had with the company I was working for. I was into working remotely from then and have continued pretty much without too much disruption.


When that job came to an end, I went primarily freelance. I was working with clients in Japan and Laos, Germany, all over the United States, South America and so on. Of course, I never met any of them face-to-face. This Apache experience, actually, I feel closer to this team because we're on Slack all the time than I had felt with many of the other teams I've worked with over the past decade.


… I love that. That's great.


We share cooking recipes. That's really important.


… I hear a lot about that: Infra’s cooking, drinking, and eating. I ask everyone this question and it tends to make folks pause, but what do you think people would be surprised to learn about ASF Infra?


I think ASF Infra is like the person who's not the main act in Cirque du Soleil. There is a big thing happening at Cirque du Soleil. Over in the corner is a person who's throwing two chainsaws and an apple, an egg and a baby, juggling those all at the same time and just doing that. It just happens and somehow it's essential to make it possible for everything else to happen, but that juggler is good enough to just make it happen and make it look easy. I watch what my team colleagues are doing. I'm just in awe of all the things they manage to do. The Apache teams are doing their things and something goes crazy on a server somewhere. We get an alert about it. Whoever is awake at that hour from the team jumps on it and maybe they call in another person. Then they realize it's because of this third thing over on this other server that's gone wrong and they fix that. All the individual project team might notice is that their email is delayed for a few minutes.


… Right. Everything else is being juggled in the background. They're not aware.


Juggling without dropping an egg or a baby.


… Incredible. If you had a magic wand, what would you see happen with what you're handling within Infra, with your role specifically because you're the only one doing that? What would you like to see change?


I don't know that I have a wish at this point. It's pretty straightforward. I guess I wish I could time travel back and learn Python and come back here and be at this point knowing Python, rather than trying to pick it up on the side.


What's your favorite part of the job?


My favorite part is when I hear back from people, "Oh, now I get it. I read that page again now that you've edited it and now I get the thing."


When something we've changed, something we've provided makes it possible for people to do what they need to do.


What was your biggest challenge when you came into the role, when you started? Was it a wall of, "Oh, my God"? What was it like?


I guess it was grandma's attic, trying to figure out what box to pick up first and feeling in the first weeks that I was working, I was afraid. I was very, very preoccupied with not offending anybody, with not implying that they were ... Not correcting their writing in such a way that I insulted them. It took me a little while to realize of course, they are perfectly happy to have their typo corrected, but it was a matter of ... In the first few weeks, I was still feeling out my colleagues and understanding how much they were going to appreciate some documentation help, how much they would find it as an intrusion or a waste of time.


What is your greatest piece of advice for aspiring technical writers and editors coming into this type of role? What advice would you give them?


I would say ask more questions. When you're stuck, don't presume it's your fault. Ask a question. Someone may say, "Oh, that's because X and we never said X."


What are you most proud of in your Infra career to date?


I haven't broken a single damn thing, except one thing and I'm not telling you what it is.


All right. How would your coworkers describe you?


I think the robot with the broom and dustpan. I think they bought that one. I suggested it.


What else do we need to know that I haven't asked?


I'm a playwright. I play the banjo. For $10, I won't play the banjo. At this point in my career, if you hand me a banjo and a cup of coffee, I'll be happy.


= = =


Andrew is based in Nova Scotia on UTC -4. His favorite thing to drink during the workday is countless cups of black tea, accompanied by homemade pumpkin-seed-flour bread served hot with butter.

Sunday November 29, 2020

Inside Infra: Andrew Wetmore --Part I

The "Inside Infra" series with members of the ASF Infrastructure team continues with Part I of the interview with Andrew Wetmore, who shares his experience with Sally Khudairi, ASF VP Marketing & Publicity.



"...I really had a distant but benevolent appreciation of Apache until I started to get more and more involved with Royale and began to understand from that angle all the things that the Foundation does to support these little projects that could not survive without it. Of course, now that I've become part of the Infrastructure team, I'm awestruck by the amount of work that the team does to support all these little projects, so they can do their thing."


What is your name and how is it pronounced?


I'm An-drew Wet-more. The "Wetmore" is like a rainy day, very easy to pronounce.


When and how did you get involved with the ASF?


I was a Flex and ColdFusion developer. When Flex came to end-of-support with Adobe and they passed it over to Apache, I followed along. I wasn't an active committer: I was a participant in the Apache Flex project and contributing in my little ways here and there. Then when the Apache Royale project split off Apache Flex, I went there, but I was not an active, not a heavily significant contributor. That is, I was helping with documentation, a bit of testing, a bit of organizing and helping. I was truly surprised when I was invited to become a Committer. Then at some point, somebody noted on the Apache Royale list that the Apache Software Foundation Infrastructure Team was looking for a documentation person. I thought, "Well, that's interesting. Maybe I would be able to contribute to that."


I followed that up. I wrote to (ASF Infrastructure Administrator) Greg Stein and introduced myself and said, "Oh, I'd be interested if this is something that's happening." Then for one reason and another, nothing happened for quite a long time. That was fine. He told me nothing was going to happen for a while. He was migrating some monstrous mountain of something. Then when that long time was up, I pinged him again and said, "I'm still around if that's an interesting possibility," and we got talking. He did that wonderful interviewer thing of saying, "Well, if you were going to hire someone for this sort of a job, that has this heading, what sort of job description would you write?"


He made me write the job description. I thought, "this is cute: I'm happy to help. I don't know what person not me is going to get this job, but I'm happy to write what I think is a good job description for this thing." Truly, I really expected this to go out and a whole bunch of people to apply for it and that I would get a participation trophy. I was very pleased when I was invited to join the team.


… You got the real trophy.


Yes, I did.


You got involved when Flex came to Apache, so that goes back to 2011, you've been with the Foundation for nine years or so?


I was aware and downloading builds as soon as there were builds to download and participating. I was still building my own Flex stuff, but I don't think I was really contributing significantly until around maybe 2015. Then I didn't become a Committer until 2018.


The other things you were doing prior to Infra were limited to Apache Flex and then onto Royale?


Yeah, I had a glancing awareness of Apache. Without even thinking about it, of course, I was using Apache tools like Apache Tomcat packages, but I really had a distant but benevolent appreciation of Apache until I started to get more and more involved with Royale and began to understand from that angle all the things that the Foundation does to support these little projects that could not survive without it. Of course, now that I've become part of the Infrastructure team, I'm awestruck by the amount of work that the team does to support all these little projects, so they can do their thing.


It's interesting with Apache projects because they're mostly ingredient brands versus a customer-facing final product. Of course we do have those too, but the majority of them power something else. A lot of times people aren't aware until they're in it: then they’re like, "Oh, wow, Apache is everywhere."


Well, I keep trying to improve myself and I go and choose a product project at random and read its homepage and its "about our product" thing and see how far I can get before I've hit five things that I don't understand at all. I don't even understand what I would do with the thing that I do understand which is not a knock on those projects. It's what you just said, they're not end-user-facing. I, as a Flex developer, was a Flex developer. I was using Flex and now Royale to build other things, not getting in with the toolkit and adjusting and tweaking Flex or Royale.


Right, like a commercial product. Explain your role within the Infra team. What is it exactly? What's your title? When did you get there?


Let's see. It was at the very end of last year. I'm coming up on 11 months on the team. I started two weeks before the end of 2019. My title is either editor/writer or writer/editor. My job is ... well, the situation of the Infrastructure team from a point of view of a written material, of documentation, is that for the past 20 years people have been doing their very best they can to document what they do and how people should do things. They've been adding to that material and adding to that material. We have a built-up pile of stuff, some of which is no longer relevant, some of which contradicts other bits, some of which is written very provisionally for something that we now take for granted that we're always going to use.


My job is to go around with my little broom and dustpan and clean up things. I go and find a page in the Infrastructure documentation and do what I would do when I visit project pages. I start to read it. When I get lost, I start to edit it or I start to ask questions to the team and say, "Is this even a thing anymore?" The team sometimes says, "Oh, we were doing that in 2011, was it? I forget." We know that that page can go into history. There are other times when I run into things, there are many pages, far too many pages where there's a sentence like, "At this point, such and such is true." It doesn't say anywhere what "this point" is.


Then when I dig in a little bit, I find out it was again back several years, where either the conversation came to a stop or the conversation continued and whatever that uncertainty was on the page has been resolved. Now I'm able to update that page and say something that's more useful to the visitor like me who's coming now and doesn't want to read a historical document, so much as get help about doing the thing they want to do.


Right: "what does that mean for me?" This is the legacy dilemma.


Sure. Well, it's also a factor of people trying very hard to do too many things all at the same time. Let's just write enough explanation so people can get going, because surely they'll understand what we mean. That's often true for probably the vast majority of people who are active in an Apache project. The pages that stump me don't stump them.


We have this issue of tribal knowledge not scaling. Part of it is that people are moving on. Part of it is that processes are changing, technology is changing, people forgetting about pages. You're in a very interesting position. If I'm understanding it correctly it’s trying to rectify whatever the intention of what's historically been in place versus what people are doing now, even establishing relevance and trying to find out who's the de-facto source to go for this, considering the amount of contributors, is incredible.


There's another piece in this. I'm trying to come at my review of pages from the point of view of a person of good intention for whom English is not the first language. I'm trying to think, "What on Earth would one of my colleagues from Laos understand from this sentence that has a jokey little acronym or aphorism in it?" Then I try to figure out how I can say that more clearly and not in a boring way, but in a more accessible way. There are three flaws that all our documentation tends to have. One is to write in an academic way. One is to write with a heavy use of acronyms. One is to use a heavy serving of passive voice in the material.


Let me explain what I mean by those three. The academic style: so I have a page here that's going to tell you about X. If I'm an academic, I might write, "The intention of this page is to explain the modalities by which a user might accomplish X." As the reader, especially the reader for whom English is not the first language, I'd like to see, "Here's how to do X."


The second one is the acronyms. Even acronyms that should be, you would think would be obvious to everyone. "The PMC of a TLP." It's like reading Hebrew where there are no vowels, but a simple rule is to always spell out what the thing is and put the acronym in parentheses the first time you use it in a document. It's easy to forget if we know the acronym so very well ourselves. 


The third thing is passive voice. Passive voice correlates strongly to the academic approach. It has a way of hiding who does what. "After the file is uploaded, it is passed to the server and verified." I don't know who does any of those things. If I'm the owner of that file, I don't know if I have to do any of those things or if I just sit back and watch them happen.


Sometimes when I ask—I actually enjoy doing this because it is my naïve or newbie character—I go to the team and I say, "Here's this sentence. Who is doing this?" In the conversation about that sentence, sometimes the team discovers that there's a difference of opinion about what part of the Infrastructure does it or whether someone has to do it. We might end up not only fixing the text but improving the process.


What I'm hearing, it's beyond just writer/editor. It's writer/interpreter.


I came into software through the QA door. Over 15 years, I ran teams that were generally documentation and quality assurance, documentation and testing. I think the two are very tightly connected. When I go to edit a document, to the extent that I'm competent to do so, I test it. If it says, "You go here and do that," I go over there and look, is the thing that I can do available there?


It does what it says on the tin. It doesn't correct that.


Well, we have innocent blind alleys that are built into our documentation out of good intentions. Pages written about our software repositories, back just a few years, presumed that everything happens in a Subversion repository for projects.


For projects, almost everything happens in Git repositories now and the instructions of how to do something in the two repositories may be different. We need to go back and find those pages and make the path comfortable for someone who couldn't care less about Subversion, but they have to do something in Git.


Let's talk about the scale of what you're working on. How many pages are you handling?


Oh, good Lord, I don't know. I started out within Infrastructure itself, looking at several packages of pages. There was a set of pages on the Apache Website under the subhead dev, a set of pages on the Apache Website under infra.apache.org. There had been a set of pages under something called reference. There's a set of documents in a Subversion directory. Then there's a very large Infrastructure section of the Confluence wiki. One of the first things I had to work through is how did these relate to each other? How does anyone find their way? I proposed and the team seems to have accepted that the pages at the apache.org/dev area are the introduction. Here's what someone trying to figure out what goes on with dev and possibly the Infra team would need to look at.


The next step in is the infra.apache.org area. If you can't find what you need there, you follow the links through into the wiki. God help you if you have to go to the Subversion repository. Really, what should be in the Subversion repository is only, it seems to me, the instructions for restarting or rebuilding the Confluence wiki. I am gradually moving other stuff out of the Subversion repository. 


This feels like a byproduct of individuals or projects or committees or communities actually having this "scratch your own itch" issue, right? "Hey, we need to document that somewhere on our thing and here's our particular experience." Apache is open enough. You can do that. The whole directive of integration or coordination or making sure that one hand is speaking to the other has never really been a mantra of ours. It's interesting to see as we're scaling, it reflects directly on what you’re experiencing.


There's the thing. If it were a team of six people in a room or at a project level, it wouldn't be a problem figuring out where to put what or how to fix a documentation collision. When you're talking about hundreds and hundreds and hundreds of people, some with decades of experience, literally two decades of experience in the Foundation, someone bopping in brand new can bring things to a standstill. People can get lost or disheartened.


The disorientation is very common. I hear people going, "How do I find …?" "Where do you start? Where do you go?" It's great that there's assistance.


In early January, I started with some thoughts. "I'll be a new person." "I'm going to read the apache.org site and go where it tells me." I got so depressed. I could not understand how the information on one page related to the information on this other page that I was looking at. There seemed to be two sets of instructions for doing many things. Part of my fun is trying to make it easier for people who come in: you have to help them find their way without giving up the whole thing and throwing the computer out the window.


Are you still at the audit and discovery stage or are you actually at the rewriting stage also? Are you in course-correction or new content?


I've moved almost everything out of the apache.org/dev area that shouldn't be there. What is left is introductory material. I've edited every page on the infra.apache.org area. I've edited almost every page, I think, on the wiki. I'm still digging around in the Subversion thing. I am doing a long march through the top level apache.org pages. I've been in most places, but I'm not done. Beyond this, I guess there are some areas I haven't touched or I need to ask permission. For instance, the Incubator has a set of interesting pages in which the material is clear, in large part. There are some suggestions I'd like to make for those pages, but I don't have any mandate to go in there and start changing things.


What I would do in that situation is write up a little sample report, "Here's how I would suggest changing this page," share it with the Incubator team and say, "This is yours to do as you'd like. Would you like some more?"


… Chances are they'd say yes.


Well, indeed, but I don't want to assume that. I know how I would feel if I turned around and found out someone was changing all my sentences.


I might very well feel that someone not having passed through the Incubator wouldn't have a good grip on what needs to be said.


It's really important for us to have these fresh eyes with respect to what the outside world is seeing: people who are new to it, how is their interpretation or misinterpretation? Again historically, there's been this issue of, "well, it's obvious". Not just has Apache evolved and the communities have evolved, but Open Source has evolved. The expectation is very different. Similar to what you were saying before with Subversion and Git, it's a completely different space now. We have to grow with that. I think again because we're not a corporation, we don't have these marching orders of "go, bring it up to speed or bring it in alignment." It's great that you were there to audit, align, and course-correct.


There are pluses for this that I had suspected but hadn't been sure I would find ... I'll give you an example. One of my projects is about Apache's content management system that's nearing end-of-life. We have to migrate all the projects that use that content management system to generate their Project websites to some other technology. We've been working on this for over a year, I think, but there were 40 or 50 projects that hadn't gotten started on that migration. I started conversations with them, saying, "Hey, are you going to move? What do you need help moving? What do you need to know?"


As I was getting feedback, I was able to improve the documentation we provide on the wiki on how to migrate your project off the CMS. Along the way, I've met some really interesting people and am having fascinating conversations with people deeply engaged in Projects about the output of which I know nothing. It's a lot of fun because these are very, very smart people. They're doing really significant stuff. I want to make it as easy as possible for them to turn from that highly significant stuff to this rather mundane thing of moving the way they built the Website from the current way, which is creaky but they know it, to a new way.


We long have had this "do your own thing" culture --no one's telling anyone that there's one, official, way to develop your Website. Hundreds of Apache Projects are developing their own Websites their own way. No doubt some that are using the current CMS, there is an opportunity to offer them a different development direction versus a giant arm sweep stating, "We're going to pull down 300 project sites and rebuild them all at once," as would be done in other organizations when they choose to rebrand or upgrade their CMS or backend. It's like little mushrooms popping up where everyone is producing their own site at their own pace, using their preferred tools. It's very, very interesting.


It's educational to me also because the Infra team has a series of recommendations, "We really recommend you go to this technology to build your Website." The subtext is because that gives you the most options, the most flexibility, and means Infra has to do less to hold things together, but then we have other Projects that say, "Oh, no, we don't like that. We really would like to use this." The Infrastructure response is, "Show us how you can possibly use that in a way that matches these requirements we have." For instance, that the landing page for the project Website has to be a thing that can be branded as [projectname].apache.org and hosted on our servers.


If people, as a project, demonstrate that they can use a technology that we had not thought of, then we have the documentation for that, and that might encourage some other project that doesn't like the vanilla package we're providing to migrate using this new thing. We're down to about 20 projects, I think, that haven't really gotten very far on their migration.


… Is there a deadline for that?


At some point, the content management system is just going to fall over. We'd like to get everyone out before that happens. We set the end of the year as, "Let's do this before the end of this year, but there's not a switch." There's not the end of a license or something like that that's going to happen. We have a little bit of wiggle room.


… We're not pulling the plug, so to speak.


No. We're not pulling the plug, especially since treasurer.a.o. hasn't moved and we wouldn't want to annoy them.


Right. Are there additional responsibilities that you take care of?


Well, what I just described about helping or encouraging teams to migrate was not part of my job description. I just saw something that I could do that involved being engaged with projects to get them on the path, leaving the other team members available to do things that I can't do. I'm the least technologically savvy person on the team. I might as well do the stuff that involves words and interaction.


What's the process of sorting through 21 years of ASF history on apache.org? How far along are you? Is this a never-ending project or is there a specific milestone that you want to hit to say, "Hey, okay, we're done"? Is there an end in sight to this or …?


I think we'll get to a point where we'd say, "We're pretty well caught up. Now what?" That could happen within the next couple of months, but then remember, at that point, the process of doing doesn't end. Where new material is being created, technology changes. We're migrating server things from one kind of server to another kind of server. We have to document what that new server does. Git for instance, or GitHub, I guess, has provided a couple of new options for things projects can do. The Infra team has to learn how to support those, then we have to document them and help teams understand how they can use them for their benefit.


As long as the Foundation keeps doing stuff, the same problem of uncurated information silting up will recur. Hmm. That's my lifetime employment plan.


[laughs] Going to apache.org/dev, how did you decide where to start? Were there any active fires that you were told you had to put out or it was more a bunch of low simmer, "We'll get to that someday," types of sections of the site? Again, it just seems so like a Medusa situation. How did you decide to divide, conquer, and get started?


For apache.org/dev, I just started at the top file or the top link that said anything about dev and went into it. "Why is this out here? This looks very much like the same thing we say over here in infra.a.o. Why are we saying it twice in two different places in two different ways?" I started pretty much by grabbing anything. It's like the way you might go into your grandparents' attic when they're downsizing to a smaller house and you're going to help them move. All you can do is pick up the first box and see what's in it and give the best guess about where that should go.


… Then there's those people who just grab it and just donate everything.


That's true.


… Not even looking through it, they're just purging and starting afresh.


Fairly early, I tried to elaborate that tiered idea of Infra information, so that if you land on dev, you're getting high-level stuff. If you go to infra.a.o, you're getting more thorough stuff. If you need to, you can go over to the wiki to get code snippets or very detailed instructions. If you're an Infra team member, you go over there and get stuff. Only in the direst need do you go down into the Subversion repository.


Before, everything was mixed around. Where the most essential and the least essential stuff was, was not consistent or logical.


Have you had to learn about the Apache Way of community-led development or other processes in order to get the job done? Even if they're talking about a technical thing, you're testing it out. Are you kicking the tires along the way saying, "Okay, this doesn't make sense," or are you not at that stage yet in terms of content?


I'm doing a fair bit of tire kicking. Of course, as a participant in the Flex and in the Royale projects, I've engaged myself to understand the Apache Way from them. The PMCs I work with modeled the management and development style of Apache. I learned it organically. I'm not seeing a conflict between what I learned on the Flex and Royale teams and the larger Apache Way of doing things. I think that's really good. You stumble into a small project with a very minor, very focused goal to do this thing, this bit of technology. You take in through your skin how to make decisions and how to share information and how to support each other.


… Continuity for the win: that's good to hear. What kind of influence do you have on content development? You said you're adjusting a page if it's not saying what it's supposed to do, but beyond that, are you saying, "Look, this really needs to take a different approach"? Are you deciding on your own? Is there a review committee that has to oversee every edit or is the process completely autonomous? How do you know what you're writing is factually correct? Who signs off on that?


The Infra team is in constant contact, 24-hour chatter every day on the Slack channel. There's an asynchronous conversation going on. When I run into something I don't understand ... Well, there are two things that happen. I can suggest things there that might be useful, but also when I notice people discussing something that's new or something went wrong and what they have to do to make it right, I often say, "Is that something we should write down, do you think? Where should we write it down?" That begins the conversation about documenting whatever the thing is.


One of the first things I created on the wiki page, the Infra wiki site, is a page for me called the job jar. Each time I come up with something that has to be written, I start a new item on a checklist and write in what that thing is to the best of my knowledge. Then, if I can't see any way to write because I don't have a clue what that thing is about, I go to the Slack channel or I go to the team meeting, which we have every Thursday, and say, "Who can help me write this? I just want you to blurt out the facts and then I'll turn it into pretty language." I can't direct that we have to write anything, but we work interactively.


If I write a new thing, I post it on the Slack channel. Someone will come back and say, "Well, you totally missed this thing. Here, let me fix it for you." We go back and forth like that until it's ready to make available to the larger public. Greg, of course, keeps a close eye on me, so I don't accidentally delete everything.


We review regularly what needs to be added or what can be sliced away because often, if you say less, you can communicate more.


Going back to "delete everything", when I first joined W3C 25 years ago, I remember making copies of everything because I was terrified that I was going to delete the Web's original history, there were thousands of legacy pages. Do you do the same thing? Do you make copies of things and edit that then just do merges? How do you actually do that?


I have a strong reliance on the team's guarantee that everything is version controlled. Actually, I'm more shy about changing things than they are to encourage me to do it. They just said, "Go ahead and do that. We'll fix it later." In that sense, I'm truly not afraid of deleting everything. I am afraid of inadvertently causing annoyance. I have an example: when I first started to move pages from the a.o/dev area to the infra.a.o area, some of the pages I wanted to move had titles that didn't really match what was in the document. God, I've got to improve this. I changed the name of the file at the new location. Then that was a pain because how do you redirect from the old location to the new location?


I learned very quickly that I was causing trouble for my colleagues, but beyond that, I was causing trouble for people on projects who might have a link on their page to an Infra page. I really don't want to cause an information barrier because, in my mind, I'm making things more efficient. On the a.o/dev area, there are all sorts of pages sitting there now that are just stubs or just shells of their former selves. If you click on the link to go to that page, there's a little gear cranking and all of a sudden you're over at the same page at infra.a.o … most of the time. Sometimes it just does not work and then I have some sad people.


It's interesting you were saying about not wanting to upset people, but I think this is actually a parallel with good documentation and good data management. It becomes un-intrusive and a natural byproduct of your experience online. The whole point is you don't want to say, "Hey, there's some underhanded entity there that's controlling it." It's natural in terms of what you're seeing, what you're reading. In terms of comprehension, it's great UX. It's a very interesting comment that you made about you not wanting to ... this, "do no harm" approach --the outcome is very positive. 


If you want to make a really highfalutin image, we're surgeons working on something together. There's a thing going on the table there that's going to ... Things are going to go bad if we don't do our job well. If I go moving around where the implements are that we're going to reach for on the tray from where they normally are just because I think they should be alphabetical or something, things are not going to go well for the patient.


… Someone might even die, right?


Fortunately, there's a limited amount of trouble I can cause because I'm not turning the nuts and bolts on the servers. Not yet.


But I am ... You asked earlier what sort of, I don't know, influence I have to bear, I'm in there asking questions whenever I can understand a question to ask about, "Shouldn't we update this list here of the servers? This doesn't look like it's been updated since 2014. Shouldn't we make this list more accessible to the people who have to look at it?" That makes it sound like my colleagues are bumbling along and inattentive. They're very attentive, they need to document what they're doing and they're very patient with me when I get fixated about a semicolon while they've done everything else right on that page, except that damn semicolon.


It's important. Both parties: that's a good dovetail of talent, right? You're talking about a page that hasn't been touched since 2014. We have pages that have been untouched since 2001. I'm sure you're coming across them.


Here's a situation that probably is of low impact, except when it has high impact. I've been reading the memorial pages for past committers. I got to a page that said many kind things about the person, "who died in a car accident this last week". This "last week".


… *When* was that, right?


Yeah. I'm making little reports on those pages and the people who have ownership of the pages have to decide what to do with those reports. I'm not going in and changing those pages, but I suggested, "Let's figure out the year at least, maybe the month and make that more accurate, so someone like me now visiting this memorial page about a person who died before I joined Apache can understand what happened." In some ways, that's important for remembering and honoring the people who have been with us and are gone. There is more painful stuff when we haven't updated something or we've left a sentence that says, "As of this writing, so and so is the case, but I don't know if it's going to be that way for long." Again, there's no date. I think it's scary.


… There's no frame of reference at all.


Exactly. It makes the whole thing provisional. We have, under this COVID-19 crisis right now in the province of Ontario in Canada, a very complex Website that purports to tell you if you're in the city of Toronto what you can do in different parts of Toronto, what the lockdown level is. At the very top of the pages, it says, "Latest information."


If you go in there, it's three months old. The latest information is elsewhere in the page. To me, it throws the whole thing. If I'm someone who's trying to find something out from that site, I tend not to believe any of it.


[END OF PART ONE]

Tuesday November 17, 2020

Inside Infra: Gavin McDonald --Part II

The "Inside Infra" series with members of the ASF Infrastructure team continues with Part II of the interview with Gavin McDonald, who shares his experience with Sally Khudairi, ASF VP Marketing & Publicity.



"...you don't know you need somebody until somebody like that arrives."

Earlier you mentioned growth: preparing for growth and being able to accommodate that growth. In what areas are you guys experiencing the biggest growth? Is there a specific type of request that's coming in more than others? I'm hearing all the time when I'm dealing with our Targeted Sponsors, for example, I'm hearing "We need more CI. We want more services in this area. We want more credits." What is it that you guys are feeling, or what are you dealing with in terms of big picture? What's the biggest demand, where are things coming from?

Yeah, as far as I'm concerned you're spot on with CI.

… That's you.

I mean it's not me totally. I have been concentrating on it more as others have been concentrating on other things, yes. Jenkins for example, we had this one --I call it a mega monolith of a service that had all the project services that was on one server, one Jenkins instance. And it was the same instance being upgraded for the last 10 years. So it was time to migrate it, so it's been migrated to five smaller Jenkins services.

BuildBot is also being upgraded and moved to a bigger server. And actually, Travis is being used to its full capacity.

What does that mean, "to its full capacity"?

When Travis came out, projects started using it, and we're now at the stage where it's at the full capacity the ASF is provided.

Oh okay. So if they're giving us 20 whatever, we're at 20.

Right. It's not unlimited. This is why a lot of projects have decided to start moving to GitHub Actions, which is also not unlimited. So the more that's provided, the more is needed. I don't think we'll ever keep up with the pace.

Greg says that often. When I talk to him about donations and things like that coming from different companies, he just says "more" and "more" and "more", so he's not exaggerating, right?

No, no, he's not exaggerating.

... If we offer, they'll take it?

Yeah. There's ways in which projects can use CI in the same way they can use other things. And they will use everything that's given to them.

… Insatiable need.

Yeah. Not understanding that there's 300 other Projects that could be using those same services. But there's a few beginning to realize, and there's talks on certain mailing lists about how we can make this more efficient, how we projects can help each other in managing the best usage of these services that we've got. Because they don't want to have it all. They've just been creating whatever they feel is needed for their project. Then sometime later they realize, "Oh, I'm using 80% of what everyone's been given."

… So it's not malicious, just a lack of awareness. When you guys get a new service do you go, "Hi PMCs, we have this thing, there's 20 units available, use with your discretion," or does someone say, "Hey, Infra has this now. We're going to run and take it all," without realizing they're taking it all. How do you introduce new services to the projects? I'm curious because I never see that side of the activity.

Sometimes it's via the mailing list, users@infra. Projects can come to us and ask questions on that mailing list if it's not appropriate for a Jira ticket. People join that mailing list because they're interested in what Infra is up to. So we use that mailing list as a heads up for whatever it could be: "Jira is getting upgraded this weekend, or there's going to be some downtime on this", or it could be things like "okay we've now enabled GitHub Actions across the board" or whatever. There were some new features added to one of our self-serve things, asf.yaml, which I know you've spoken to Daniel Gruno about.

… You guys actually published a blog post on that. I saw that on Twitter and just added it to today's weekly news roundup.

Right.

… I was really surprised to see that. It was exciting. Is that new for you to be announcing publicly like that, sharing outside of the ASF's mailing lists?

It is and it isn't. We used to do it all the time years ago. Then as we've become bigger, it's paid staff not volunteers, we're busy all the time. Blog posts got put out of the picture I guess for a while. So this really new cool feature that was provided, the code was provided by a volunteer via a GitHub pull request. And we looked at it, Daniel made some comments, the changes came backwards and forwards until it was ready to be committed the other day. And it's fantastic new features that projects keep asking us for all the time via Jira tickets. "Can you enable this? Can you check this checkbox?" It's work that we shouldn't have to be doing. So now we don't have to do it. They can just edit their own yaml file in their own GitHub repository and those GitHub features are enabled. It was worthy of a blog post.

… I was pleasantly surprised to see that. That is good. It's interesting because we often don't have enough time to do the work, put out the fires, deal with normal life. You have a family by the way, oh yeah let’s not forget that, but you also have to find time to share this sort of information with the outside world. For me, working on the Inside Infra project is amazing because not only do I get to learn about what's been going on all these years with the ASF --for me it's been kind of a black box in many instances with Infra-- but I'm also hearing from outside folks who are saying, "This is great to learn," because how would we know otherwise? It's cool that you are able to share more and more however you can. I was really excited to see that blog post.

Hopefully there'll be more coming. Infra does have a blog and it's been not used as much as it should be I think.

When we publish, we'll be sure to get that out (https://blogs.apache.org/infra/). Let's talk about project requirements. Of course Apache projects have been setting the standard across countless "usual suspect" technologies: servers and Big Data and build tools and libraries and so much more. We now have incoming projects in Edge computing and IoT and blockchain and even hardware acceleration, all of which are coming in through the Incubator. Does Infra need to know anything technical about these topics in order to get the job done? Or do you just handle the back-end support and it doesn't matter what's coming, who's the project or the category they’re in? Are there instances that cause you to say, "We have to learn/do something completely different for this project"? Is there anything that's coming in that changes the way you're getting your work done?

No.

… Great; so it doesn't matter.

No, it doesn't matter. Obviously the Foundation welcomes all types of technologies coming in, and Infra provides what it needs to provide. We don't need to know the ins and outs of 300 projects' code to get our own work done. That would be impossible anyway.

But when you came in, you came in through Apache Forrest: you came in with a project. Do you look at stuff that's coming in through the Incubator, out of curiosity, are you keeping an eye on that?

I do, yes, just on a personal level I do just because it's something I like.

… Okay, so it's a curiosity thing, rather than your job depends on it.

Right.

With so much evolving in this space, if you need to learn something new, is it top-down -- someone saying, "Hey Gavin, go to Jenkins University"? How do you figure out what needs to get done, and how do you learn how to do the job? How do you close your skills gaps?

I think each team member has their own way of learning more. Obviously more needs to be learned all the time, and that could be going to the particular pieces of software's Website, having a look through their documentation. I mean when we're implementing things, we're at a vendor's Website every day of the week, you know? We use Puppet a lot: it's core to what we do. We farm out to 30 Jenkins nodes or whatever, it would all be done through Puppet. So we need to keep up with what's happening with that project. So I'm on the Puppet Website looking through the documentation all the time. New versions are coming out, new features are being enabled, and that could be something that makes things easier for us, so we can implement that from our side. I do a lot of that reading. You talk about Jenkins University. I actually have--

… That exists? I was just making that up.

Well, there is a Cloudbees university. So in my own time I've actually done half a dozen courses myself on Jenkins through Cloudbees University. I'm keeping myself up-to-date and ahead of the curve on that. It's a deep interest of mine, so in my own time I take those courses.

Do you bring it back to the team and share it with them? Or is every man for himself? Do you guys have a lunch-and-learn --"Hey on Thursdays during our weekly call we're going to discuss topic X"? Is there anything like that that happens on a bigger scale, or is it on an individual basis?

Mostly individual basis, but obviously anything that people learn is going to be shared amongst the team at some point, it's going to be "oh I learned this today". Someone could be learning from someone through a PR (pull request), you know? A new piece of code comes into our infrastructure that someone committed, then the others ... everybody looks at everybody's code. So they would look at it and go, "Oh that's neat. Didn't know that." And that's because someone has gone out and learned it from some Website or some course they've done.

Right. Plus your team is super, super close-knit. I'm sure you are like, "Hey, I found this cool thing, you got to check it out" --you're sharing things together, right?

Yes, yes, we have our own Slack channel where we talk all sorts of things. That could be new software coming out, yeah, or new ways of doing things. Or new cooking methods.

That's the thing I've been hearing: the most common thing that everyone tells me is they talk about food and drinking --not code. That's good too, because that's the fabric that connects everyone together.

You can't talk about work all the time. It doesn't matter what industry you're in. If you're in an office somewhere or you're working in a restaurant. It doesn't matter what you're doing, you're not standing there talking work all day long. You're talking about your kids or what place is a good place to go to or have you seen this, have you seen this movie. You know? We do the same talking, we just do it online.

To that end, ASF since day one has been a virtual organization: anyone can work from anywhere, there's always these great stories about people meeting each other for the first time at ApacheCon after collaborating online for 10 years. It's a really cool thing to see. And for years you've been our man in Australia. Has anything changed with that as we're growing? Obviously you're no longer in Australia, but do we need to have an Infra presence on every continent? Where are things going with this? Has anything changed, or we're still business as usual since day one, because it doesn't really matter?

There's no specific place you need to live to do this job. Obviously global coverage is a good thing to have. When I was first employed, I'm pretty sure one of the bonus points was that I was in Australia. There was one that was US-based, then there's Australia-based, so you've got a fair bit of coverage there. So somebody can sleep while the other ones can then fight those fires. With there being five or six people now, there is still a bit of a gap, but not too much. People do all sorts of hours. And I plan on going back there sometime.

… I was going to say that we're going to dispatch someone to Siberia or someplace in order to cover the timezone. But you are going to go back you think?

Yeah, at some point.

… You miss it?

Yeah. The weather is nice, and we have family there.

This is a question I've been asking everyone. What are the biggest threats that infrastructure managers need to watch out for? Not necessarily a doom-and-gloom threat kind of lightning bolt coming out of the sky, unless there is something like that to be aware of. With the pandemic there's been all these security crises and all sorts of weird stuff happening. Is there anything in general you think people in this role need to know about or watch out for? Is there any advice that you think folks should be aware of?

Oh I have no idea. The pandemic hasn't really affected me in terms of the role. I mean being an all virtual organization, I'm not sure it's affected the Infrastructure team at all on a work level.

… Remarkably with that, we remained business as usual while everybody else was scrambling trying to figure out how to cope. So I agree with you on that. Another question I ask is what would you think people would be surprised to know about ASF Infra?

I'm sure the same answer has come out of everybody's mouth. That there's only five, six of us looking after this many projects. I don't really know, yeah. There's just so few of us I guess doing this.

As there are so few of you doing it, and you think that would be surprising for people to know, how many people do you think normally should be doing this work? I'm curious about that, because I don't understand how you guys are able to do this. For me there's always this awe of how you guys are making this work as I can't figure out how.

From my perspective I don't know any different. I've not worked at Google or Microsoft or any other tech company down the road. I've not worked in offices, so I've got no comparison. I wouldn't know whether it takes 20, 30, 40 people. All I know is that we could do with a few more people to take the strain off, because we're all working hard. There's not a day we can say, "Oh there's not much today."

… Are you on a schedule? Do you have days off, or are you on some level seven days a week?

It's different for everybody. Working from home, which I've done for these last 10, 11 years, is I guess something you get used to. You get up in the morning and you have your coffee, then you're straight into the work. That could be 7:00, 8:00 in the morning. Then you're doing bits and pieces. You're getting some work done. Then "oh hang on a minute, I need to go to the shops," so you can take an hour or two off and go to the shops. Then you come back, then you find yourself closing your laptop at 10:00, 11:00 at night.

… From bed.

Yeah. I like to work the weekends. If I wanted to take a Tuesday, Wednesday off or something like that, I'll just take Tuesday and Wednesday off. During these pandemic times, I haven't had to do much of that.

But the flexibility is crucial and it's super helpful, because you need to keep your mindset. I mean it is very easy to go mad. You could very easily get overwhelmed with this type of workload.

You need to try your best to balance it out. I know other people that work remotely say you must have your own office at home, your own space. And the rest of the family needs to know when you're in that office you're at work.

… Daddy is not accessible. That's not realistic either, right?

No, I mean 10 minutes ago I don't know if you heard my three year old coming up to me.

… I did, and I was going to ask you if you needed to stop because I heard the little one.

My office at the moment for this interview is the dining room table. The kids are upstairs.

You partially answered the next question that I ask everyone. Which is, if you had a magic wand, what would you see happen with ASF Infra?

More staff.

… That's what you mentioned earlier without me asking, so more staff, more people. What kind of roles? More of a generalist thing, or do you need a database guy or a specific type of person?

Either of those would be nice. I mean we've got a dozen services that use databases, so we all know databases. I don't know how many of us are database admins. None of us are, I don't think. But we know enough that we need to know for our services that we run. So do we need a DB admin? I don't know. It could be that if a DB admin came along he would say or she would say, "Look at this, I can improve all this. You've been doing it wrong all these years." And maybe we have, I don't know. So you don't know you need somebody until somebody like that arrives.

Do I think we need somebody specific? Maybe another Python individual. Because any new code that we write internally for our own benefit would be in Python. But Daniel is pretty good at that, and so is Greg.

Do we desperately need somebody like that, or do we need a generalist? Probably a generalist at this point.

… To handle the volume.

Yeah.

What's your favorite part of the job?

Interesting ... Let me think about that for a second. I think when you do something and somebody from one of the projects comes back and says, "Thanks very much. You've been a great help." I enjoy helping the projects.

I mean obviously I enjoy working on CI stuff. I enjoy maybe providing a new self-serve tool to help projects and help the team out. Various things. I enjoy the job. I've been here that long, obviously I enjoy the job. But yeah. It's nice when somebody from a community says good job. 

… Is that infrequent? I know that we tend to have very high standards and a very "that's expected" mentality. Do folks scrimp with the appreciation?

Yeah. Yeah pretty much. But it does happen. I've had a few "thanks very much, you've been great". Sometimes you need to work with a project not for an hour or a day, but sometimes you might need to work with them for a few weeks on something. It might be a migration or something like that. By the end of it you say, "Okay that's done, I'm going to leave your mailing list now or leave your Slack channel," then you get a message saying, "Thanks very much, we really appreciate your work."

That's great. When you first came into the role, when you first came to Infra, what was your biggest challenge?

Hard question. I don't know to be honest. I mean the people that I was working with, whether it was paid staff or volunteers, I knew them anyway because I'd been volunteering previously. I don't know of any challenges to be honest. It was a move from volunteer role to paid role. Gradually over time I got to know the people, got to know the Infrastructure layout, where things were at a technical level.

... Do you think there was any sort of attitude shift towards you when you moved towards paid? It was new then: you were one of the earlier team members. Being paid anything kind of raised an eyebrow at the time. I remember that really clearly. Did you face anything like that? Were you challenged by that?

It wasn't an issue for me. If there were any attitude changes, I didn't notice. I don't think the volunteers leaving over the following couple of years was related.

… Well people burn out. And it's a lot of work. That's the other issue: "Hey I'm here to help out, I know Apache needs help, I'll lift or I'll help raise the building or whatever," --that's one thing, but the fact that the demand is constant, you can't expect people to be spending their free time to do all this work. We do need people to be dedicated and paid for. I'm all for that. 

As you've been with us for such a long time, I'm sure there've been highlights for you. What's your proudest contribution or role or moment in this position?

That's a particularly hard question for me. I think I don't get proud over anything much to be honest. I enjoy doing what I do, but I don't have anything specific to call out. Getting my 10 year t-shirt was a proud moment, I guess.

… There you go. Are you saying that because you're shy or you're super humble?

I don't know.

… Or you're super hard on yourself? I mean there's that too: "That's it, that's all I've done." 

No, I'm not one for calling out the self. That's probably the English in me, I guess.

I appreciate you as one of the gold star participants in my Media & Analyst Training, because you've done it so many times. But I noticed that in you, and that was part of the thing I was always trying to tweeze out of you. "Oh we need to talk about ourselves," and you're like, "No." A lot of people have that tendency to downplay or just not go around "rah rahing". I have to mention that's a very American thing to do, but it happens across the industry. Whether you're interviewing for a job or talking to someone at a cocktail party or whatever. Now even more so with social media, so it's the challenge of balancing that. I want you guys to get the recognition. You are doing massive work. You guys are heroes. I know it might sound weird just to hear that, but it's true.

Yeah. Yeah. I guess. Yeah, I don't have anything to call out. I mean if somebody said to me, if I was in ... not that I've had a job interview in over a decade, but I guess that's one of those questions, isn't it? "What are you most proud of in your last role? What did you accomplish? What was your best thing that you did?" And that's why I've not been good at interviews, because I wouldn't know what to say.

… That's where that t-shirt comes into play. I survived and I thrived over 10 years.

Yeah.

That's great. What about your co-workers, how would they describe you?

You see, we just talked about how quiet I am. But on the Thursday team meetings, I probably talk the most.

… I do not believe that. How do you mean?

It's a standing joke, "Oh this week's team meeting was only 20 minutes because Gavin didn't attend," you know? But I don't waffle on, I do talk about work. It's just I've got a lot to say.

That's good. You stick to the point, you're not waffling on. All right. What else do I need to know that I haven't asked? Is there anything specific that you want to highlight, a project you're working on, an achievement, something? Anything that I'm not touching upon that's like "I need to make sure that Sally knows X"? Is there anything you haven't touched upon that we haven't discussed?

No, I don't think so. I was thinking before the call, "What are we going to talk about?" 

… Media training. There you go, you're preparing your speaking points.

Yeah. 

… Ta-da, so that's my proud achievement. I got Gavin to think about speaking points. 

Yeah. I mean yeah, as far as how the Infrastructure team are doing at the moment, are there any big things coming up? Not from me, I don't think. No.

I started thinking about this series when we were in Vegas (ApacheCon) last year. We had so many people show up at media training, I was thinking, "This is such an interesting group that no one ..." --you're not faceless, you're there, but it's so hard to discern the individual behind the group, right? And the group, like you said, is so small, so I think there is such an amazing story with you guys. And you're so integral to the Foundation's actual operation. Because you're so ubiquitous, people don't think about it. And I think that's ... Maybe because I'm non-technical, for me it's this massive lift. It's like shooting a rocket up. It's incredible. So I certainly appreciate you guys. I think it's just amazing what you do, and you deserve a pat on the back and an attaboy and all sorts. I'm here to cheerlead.

Thank you.

You bet. Thank you so much. I really appreciate your time today. And thanks for doing this, and thanks for passing on the word with the team that it's not a painful experience. I'm also trying to encourage other members of the community to consider doing something like this. We do the "Success at Apache" posts, with individuals writing about themselves and their experiences. Apache wouldn't work if it weren't for the people. Right? It's all about the people. This is not from a PR perspective, but after 21 years I am always ... I keep wanting to know the same things about everyone. And I know it's not limited to me. If I'm curious, I'm sure other people are curious.

= = =

Gavin is based in the UK on UTC +0 (currently on CET). His favorite thing to drink during the workday is coffee with milk, no sugar, and he consumes about 10 cups per day.

Monday November 02, 2020

Inside Infra: Gavin McDonald --Part I

The "Inside Infra" series with members of the ASF Infrastructure team continues with Gavin McDonald, who shares his experience with Sally Khudairi, ASF VP Marketing & Publicity.



"...The Foundation itself has a responsibility to the Projects to ensure that there is solid infrastructure there. So there's got to be a requirement that there's people there all the time to maintain this infrastructure. The Infrastructure team has become more professional over the years. The Projects have become customers, I guess. Volunteers are always welcome; at Infra we still have plenty of areas in which volunteers can help out."

All right, let's get started. What is your name and how is it pronounced?

Nice and easy one. Gavin McDonald. Just McDonald as in Big Mac and fries McDonald's. It's M and C, no Mac.

When and how did you get involved with the ASF?

That was back around about 2005. I was looking for something different to do than what I was doing. And I came across the Apache Forrest Project. I knew a little bit about XML and websites and stuff like that. So I started contributing to the Apache Forrest Project. And some months later they made me a Committer.

So you first got involved with the Forrest project, then at some point you became part of infra. How did that evolution happen?

That's me looking around for more things to do. I've always been involved in and interested in system administration work. My first real communications with the Infra team was whilst working on a Forrest Solaris Zone and needed some help with it. Shortly after that I started volunteering there. 

First of all, I saw a huge number of tickets regarding mirrors, you know for our software downloads. I'd say it was probably around 150 tickets outstanding for mirrors wanting to join.

... What?!

Yeah.

... One Hundred and Fifty...

Something like that; some of them had been outstanding for quite a while. At the time there was only one person being paid. There were volunteers obviously looking after the machines and stuff like that. Mirrors were sort of lagging behind as they were less important. So that was my in. I started off with getting karma to add all the mirrors.

There was a certain standard that mirrors have to have, certain configurations. So I was going backwards and forwards with the mirror providers and making sure they were up to scratch, then adding them into our configuration.

From then, I introduced BuildBot to Infrastructure. And I think maybe a year after that, this is now talking 2009, a position opened. I think more or less the rest of the Infrastructure volunteers said, "Gavin is doing the job anyway. Let's give it to him."

That was my interview.

Around October, November 2009 I became paid staff.

Are you the longest serving member of the current Infra team?

Yes. Last year at ApacheCon I got presented with a 10 year t-shirt. Next time there's a physical conference I'll be bringing it along.

10 years thumbs up: that's good! Explain the structure of the Infra team and your role in it.

There are six of us, plus Greg (Stein), our Infra Admin, and David (Nalley), VP Infra. One of them is a documentation guy, that's Andrew (Wetmore). The rest of us all do various system administration and devops work. We look through tickets, what's needed to be done, and obviously we're looking to improve our infrastructure uptime and software and updates. So we all do what's needed, basically. Everyone has various roles.

What's your role?

Well it's a bit of everything, I think. I have been concentrating quite a lot on the CI/CD side of things. That was written into my original contract, which is now not part of the contract. Basically that means the whole entire time I've been here, I've been involved in BuildBot and Jenkins and other CI/CD stuff, and I've been doing a lot of that lately as well. Migrating Jenkins over to new Cloudbees software, and on a whole load of VMs, mainly in AWS.

You mention that CI/CD is a key part of your role. Is that what you're specifically responsible for within Infra? Are you "the CI guy"? Are there other things you do? Everyone says to me, "Hey we do everything." That sounds amazing, but how is that possible? Do you do everything else in addition to the CI work?

Yeah pretty much. Yeah. Everyone can do pretty much everything that we touch on. Some just choose to do certain things that they're more capable of or more used to working with or they like it better. Nobody is told, "You're working on this."

That's interesting. Fill that part in: if there's six things that need to get done, but five of you are actually hands-on sysadmins, so you guys do what you like to do or what you prefer to do? No one says, "Okay you go handle that mail server"? How does it work?

Obviously there's 24 hours in a day and there's people all around the world. If there's an emergency going on or a mail server breaks down or something needs doing, then whoever's around at the time would step up and say, "Okay I'll take a look at that."

So everyone's pitching in --it's not, "Hey I'm not going to do it. Wait for ‘the mail server guy’."

No, no. Obviously I'm sort of known for the Jenkins and BuildBot stuff, but if I'm not around, everyone else can just jump in and get on with it.

So how did you become the Jenkins guy? How did you get to be the BuildBot guy? You were saying earlier that you kind of evolved into it because it was needed, but is this something that you've personally had interest in to start? Or is it just, "Hey there's a fire here, I need to put it out," and it became this cellular memory, a habit, it's now your thing.

I think a little bit of both. I started off introducing BuildBot not long after I started. Jenkins had already been going a little bit at that time, but I've been involved in that also since the start. And it's over the years become more important to projects. Back when it started, it was a nice-to-have sort of thing. There was none of this pipelines, and CI wasn't an integral part of releases, whereas these days it's more and more a requirement. Jenkins and BuildBot have gone from second-class citizens, if you like, to one of our core services that needs to be kept on top of all the time. It's one of the most important aspects of our infrastructure for projects. There's a great demand for it. And it's increasing all the time.

That's interesting to see it go from a supporting role to the lead demand. That's been what, over 11 years now?

Yeah, 11 years.

In earlier interviews, I spoke to Chris and Drew and Greg and Daniel (Infra team members Chris Thistlethwaite, Drew Foulks, Infra Administrator Greg Stein, and Daniel Gruno) and they've all given me their perspectives on the many areas that infra is involved with. Tell me about the scope of the work that you guys do, and how is it different from other Open Source foundations?

Not sure how I can answer that. I'm not involved in any other Open Source foundations.

Okay, well tell me how does Infra operate at the ASF? You support the Foundation, you support projects. How do you help?

We have, as you know, over 300 projects. Each of those requires a Website, each of those requires an area for their code, whether that be Subversion or Git. We obviously over the last couple of years have been more involved in supporting GitHub for Projects. And we have the Confluence wiki and Jira for the issue trackers. So all of the services that they need to operate as an Apache Project is what we offer them.

So every project needs a Website, as you said. Each Apache Project is responsible for their code, their communications, and their community. So they run their own Website, but Infra handles the backend? What is it that you do for them?

Yes, we handle the backend. We've got Web servers that all the Websites get published on, but they write their Website content, and that could be written in many different languages. So we support them being able to provide their Website content in whatever manner they want. This could be just plain HTML, it could be compiled in Maven or in Pelican, there's a million different things. GitHub pages. So we provide the publishing methods for them to be able to go from there ... most projects these days just want to be able to commit a change and leave it at that. Then that change automatically gets published to the Web via automated mechanisms at the backend, you know? We watch for a commit. That could be via a gitpubsub, could be via Jenkins, via BuildBot, GitHub Actions, all of these methods. We'll see a commit, and it'll publish it and build it if necessary before publishing. So they just commit a change and leave it at that.

So the magic that's associated with that automation, is that something you guys are building to support them? Or is it something that pre-exists? How are you integrating all these different languages or platforms? How is this happening?

Well, software like Jenkins and BuildBot ... with those mechanisms we can provide pre-built code to watch their repositories for commits to their Website repository. It'll automatically build it, and then it'll push it to the websites. More recently there's also GitHub Actions: instead of being on Jenkins or BuildBot or Travis or any of those, GitHub Actions will take a commit straight out of the GitHub repository. It'll do the building of the Website, then it'll push it to usually what's called an asf-site branch. And then we pick it up from there and publish it. The actual GitHub Actions code itself is written by the projects. So that's self-serve.
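
For readers who think in code, the flow Gavin describes (see a commit, build it if necessary, then publish) can be sketched roughly as below. This is only an illustration under stated assumptions, not ASF Infra's actual tooling: the `Commit` type, the `plan_publish` function, and the `needs_build` flag are hypothetical names invented for this sketch. Only the `asf-site` branch name comes from the interview.

```python
from dataclasses import dataclass
from typing import List

# Branch name taken from the interview; everything else here is hypothetical.
PUBLISH_BRANCH = "asf-site"


@dataclass
class Commit:
    repo: str          # e.g. a project's website repository
    branch: str        # branch the commit landed on
    needs_build: bool  # True if the commit holds site sources rather than finished HTML


def plan_publish(commit: Commit, watched_branch: str = PUBLISH_BRANCH) -> List[str]:
    """Return the ordered steps a publisher might take for one commit."""
    if commit.branch != watched_branch:
        return []  # commit is not on the watched branch: nothing to do
    steps: List[str] = []
    if commit.needs_build:
        steps.append("build")  # build only when the content is not already rendered
    steps.append("publish")
    return steps


if __name__ == "__main__":
    # Pre-built content pushed to asf-site (the GitHub Actions case) only needs publishing.
    print(plan_publish(Commit("example-site", "asf-site", needs_build=False)))
```

In this sketch the project owns whatever produces the commit, and the publisher only reacts to it, which mirrors the self-serve split described in the answer above.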

If there is a fail for that commit, who fixes it? Is it the Project’s responsibility or is that your responsibility? Who's under the hood dealing with that?

It depends. If it's a coding error, then it's theirs, the Project's. If there's some kind of hardware failure, or if there's a piece of software gone down, communications error, yeah, it's up to us to track that down and find out what happened.

I'm understanding a trend here. If you go to other foundation sites, they seem more “corporate” in the sense that everyone's site looks, feels, and performs the same way, they operate the same way and they tend to be under the same infrastructure altogether, right? They're not using 50 different CMS's.

Right.

... That in itself is highly unusual.

Oh okay, yeah. We don't mandate how projects make their Website look, or we don't mandate how they must build it.

That in itself, the autonomy to do what works best for the Project, I think is highly unusual.

Okay. That's good to know.

In terms of ASF Infra and other foundations, you guys don't sit together and compare notes or talk to each other or anything. A lot of groups copy us, so I presume there's little interaction other than socially, right? I didn't know if there was, "Hey, Linux Foundation does that. We should do the same thing," kind of thing. The ASF does its thing and so be it. 

As far as I know, we have no interest in what other people are doing in terms of how they do things. We do things how we think it's best to do them for us and our Projects, how it works best for us. Whether other team members go off and have a look at how other foundations are doing things, I don't know. But I don't.

... Uniquely Apache.

Yeah.

In terms of services, what's the difference between what you offer for individual Apache Projects and their communities versus Foundation-level initiatives? I presume there's a difference --is the majority of your work serving the Projects? What's the percentage of work that you do that's for the Foundation versus Projects? Is it all for Projects? Or is it all considered one thing.

I don't really see a difference. All the work that needs doing is for the Projects or the Foundation as a whole. It's all the same to me.

What about incubating projects? Do they have special needs or requirements? How do you support them?

Not really unless they're coming into the Incubator with something they've always used that we don't do. Then we would look at that and decide whether it's something we can do for them or not. There's been a few projects that come in like maybe OpenOffice in 2011.

That was exactly in my head in terms of pre-existing groups that have pre-existing infrastructure. OpenOffice was a whole community altogether in a completely different way. How did Infra support them? What did you do? I knew that there were some issues with the codebase and licensing. What else did you do to support that project?

Oh that was a while ago.

... That's fine, I was just curious as to what you guys did. I just remember it was a huge lift from everybody, from all sides. Licensing and code and every aspect of that project coming in seemed to me to be very, very, very challenging, but we got through it. So that's great.

I know there was a lot of work bringing the code in, and not just from the licensing perspective, but also it was an enormous amount of code that needed to come in. I don't know whether they were in Subversion beforehand, but we provided them their space in Subversion and their Website space. I think a lot of the work was done by the project themselves.

Wow, wow. That was a lot of work. How do you handle Projects or communities that make unreasonable demands from the team? How do you guys deal with that?

Some projects ask more of Infra than others. Some we never hear from at all. There's kind of a fine balance. Projects that are fairly new, we probably spend a bit more time with them helping them out, making sure they get all set up. They may ask new things, there may be some initial push backs, then all of a sudden there's another two or three projects interested in the same thing. So then we have to take a serious look and decide whether that's something we need to support ongoing.

I'm sure each of the team members gets private PMs on Slack and emails and stuff like, "Hey, can you help me out with this?" Or whatever. Sometimes you just do it. But we're sort of encouraged to ask them to go through the proper channels via a Jira ticket or email to the appropriate list.

Not to name names, but have any Project's expectations been so unusual or so out of scope that it shocked you guys? Have you had situations where it's just been absolute, where you guys have been floored by it?

There's been maybe one or two projects that have just been incessant in their demands on Infra, as if we were their personal team. But we deal with it as in, "okay, slow down, what do you need? File a ticket." If they keep going on and on and on, then obviously we've got escalation levels. We can say, "Hang on," and we can pass that onto our boss and say these are being a bit unreasonable.

For those "colicky baby" types of projects, I've been hearing more and more about additional services being offered through Self-Serve. Are these guys able to take on Self-Serve and go, "Yeah that works for us and we'll do it." Have they been able to kind of self-satiate their needs, or has it always been "Infra do it for us"? How successful has Self-Serve been in terms of wicking away demand?

It's been hugely successful. You're referring to selfserve.apache.org: we introduced that three-four years ago maybe. It was a way to ... help the projects help themselves so they don't have to wait for Infra, because they know Infra is busy. Sometimes waiting two or three days for something is ... from their side of things they're like, "It's been two or three days. Still hasn't been done." But from our side of things, "it's only been open two or three days, what are they worried about?"

... "You're in the queue, wait."

Yeah: self-serve was introduced as a way for them to help themselves, and it also helps us: there's an awful lot of tickets now that don't get filed because of that. They can create their own Jira Project. They can create their own Confluence wiki. They can create their own Git repositories.

... On their own completely? Without intervention, without "mother may I?", anything? They just go do it?

Yep. There's an awful lot that they can do on their own. And we introduce more self-serve things all the time that otherwise we'd have hundreds more tickets if they weren't able to do that on their own. They can create their own mailing list now: they don't need us.

Do you have to be a PMC member to do that? Can any Committer do that? Who gets to administer these types of services for projects?

I believe some of the self serve options are PMC chair, and others are PMC members.

… So not just some person who's like, "Hey I'm committing code, I'm going to go and futz around with the site and break something."

Yeah, no.

That's good. Controls obviously are necessary. This is terrific: what a huge difference.

Yeah definitely.

We've got hundreds of projects that have successfully incubated and graduated under the Apache banner. How do you guys develop new products and services to help support that innovation? We get all sorts of projects coming into the Foundation. Going back to OpenOffice as an example, we've never had a project like this of that scale, and consumer-facing. There were so many different things about that that were so unique, and yet we said, "Yeah you're part of the Foundation, you're coming in, you're part of the family."

Yeah.

We’ve had to adapt as we grow. Is there a way for you guys in anticipation ... feel like you need to have a different type of runway in order to accommodate new projects coming to the ASF? Or do you deal with it as it comes along?

Infra is not in control of what projects come to the Foundation. We don't have a say in that. When a project comes to the foundation and they have different requirements, then that's when we get to know about it. And we would deal with it appropriately then.

Obviously there's growth and we know that there's going to be more and more projects coming to the ASF all the time. So we anticipate growth as such.

… So you are setting yourselves up to accommodate more growth, not specifically a matter of "we need more Jenkins" or whatever.

Right. I mean whatever it is that we are looking after, we need to know that that particular service is going to be able to connect with growth.

Got it. How many requests do you receive a day? In general in terms of what constitutes "hey we're slammed" versus a regular day of "we've got 40 things in the hopper", that's normal? What's the volume that you are dealing with?

I want to give you a figure as far as Jira is concerned, which is only one aspect of the things that we handle. Not everything is done by Jira tickets. But I'd say in an average month, we probably get between 150 and 200 tickets.

I've been on the Infra channel on Slack, and it's constant. It's nonstop.

Yeah.

Explain to me a typical workday. How do you manage between "hey I'm focused on a long-term project, this new request is coming in, Sally's hair is on fire because she needs help with a mailing list" and whatever else is going on? There's just constant demand on you guys. How do you not go crazy? How do you manage this?

We just get used to it, I guess. Obviously each individual handles their own time in their own way. At any one time there could be one, two, or all of us could be on Slack. So as requests come in on Slack, if it's a two minute, five minute job, we might just say, "Okay, all right, I'll sort that out for you now." Or if we feel it's going to be a little bit more in depth then we say, "Okay file a Jira ticket." Then one of us can pick that ticket up and take a look at it.

We do get people pinging individuals on Slack saying, "Can you help me with this?" Or whatever. Which is often negative to them in a way that they're narrowing their scope of help they can receive by targeting a specific individual. That person might be extremely busy for the next four or five hours, day and a half, whatever it is. And there's another four or five people that could help them with that question.

Typical day, obviously you get up, you check your emails, you see what's urgent, are there any fires to fight straightaway. You go on Slack, that stays open all day. As requests come in, you check Slack all day long. That's just one of those things. You check your tickets, your Jira tickets, what needs doing today, what can wait, or if you've got plenty of time then even the ones that can wait get done.

Whatever order you feel is most important. Then yes, everyone's got longer-term projects on. So myself personally, if I can spend a day or two on a long-term project, then get back to doing tickets, it's the way it is. If there's a lot going on in ticket land, then your project gets put on hold. If something breaks down ... The other week we had to move our Jira server because the hardware broke, so on a Sunday things broke down. Quickly fire up a new server and move everything across. Not sure anybody noticed, which is a good thing.

That's always a good thing. Business as usual, no one knows. With all this stuff coming at you and servers breaking down on the weekends, et cetera, how do you keep everything organized?

It depends on the day, I guess. Some days are good, some days are ... some days you can't see your hand in front of your face for things going on. Each day as it comes. There's no plan. I don't plan what I'm doing tomorrow. If there's a long-term project and I think things have slowed down, projects aren't asking for things, tickets are coming in slowly, I think I'll get on with my project tomorrow. Then you wake up tomorrow and something different happens. There's no real plan.

You don't use any special tools to keep your work checklist in order or anything like that other than the Jira? 

I tried to use various products over the years. You've got Trello and these other kanban board type things. You actually got to open it up and fill it out, haven't you?

It's so interesting you say that because I think some people find that structured way of working extremely efficient, then it's exactly that solution for them. Spending the time to actually do it is taking away from doing other things ... so I don't know if that works for everyone.

It doesn't work for me. I did start one of these boards, but it doesn't fit in with the job. You've got ... "okay, this has got to be finished in three weeks, this has got to be finished in two days." And it sends you reminders and emails and this and that. I mean there is no time limits on things. We're not a software project. We don't have to release something next week.

… True, you don't have hard delivery dates.

Like you say, time is taken away by filling out these things that are supposed to help you organize. So I just don't do it anymore.

Do you have other challenges with that? Balancing everything and getting everything done?

No, feeling okay. I mean I'm still here.

That t-shirt is evidence, that's true. Since Day One, the ASF has been known for creating their own rules for success. They're like, "We're going to do it our way," right? And Infra --even before there was an official infra-- played an important role. You can't exist without that kind of support. How has --and you've been with the Foundation long enough to see patterns and changes --how has infra changed over the years?

Good question. When I officially came onboard as a contractor, I was the second contractor at the time. And everybody else was a volunteer. There were quite a few volunteers. And they were there a lot. At least a dozen people that were active as infrastructure volunteers, even though they knew that there were two people getting paid to do the same thing, they were still there. Still volunteering.

Over the years, things have gotten a bit more professional, I guess. The service requirements have become more of a professional level. Down time is ... years ago if something was down for a couple of hours, it was like "there were just volunteers that are handling it. They'll get to it when they can". But as more and more paid staff had come onboard, to a grand total of six, a reverse happened with volunteers. They've mostly gone. You've got now maybe two or three volunteers that have stuck around and been around for a while. Because there's paid staff doing it. It's changed as in "who wants to volunteer for something when there's people being paid to do it?"

Was this shift proactive or reactive? Was it a matter of the demand coming from a Project and for us to go, "Well we better change this," or was it a matter of we're feeling like we're having volunteer burnout or whatever and we need to make this a more professionally oriented organization? Do you recall how this shift happened?

It happened gradually over the years. As the Foundation grew, more projects came in, more hardware was required, more services are required, more hands-on time is required. So you increase the staff one by one to handle this. Then I think over time volunteers started dwindling away, due to the fact that there's people getting paid to do it.

That's one aspect. The Foundation itself has a responsibility to the Projects to ensure that there is solid infrastructure there. So there's got to be a requirement that there's people there all the time to maintain this infrastructure. The Infrastructure team has become more professional over the years. The Projects have become customers, I guess. Volunteers are always welcome; at Infra we still have plenty of areas in which volunteers can help out. And, we don't bite!

Obviously the SLA is related to that shift too. They're becoming customers versus "we're all in it together and everybody figure out how to make it work". I'm sure the expectations also were higher, right? Because now you have a team, what's your excuse for not getting it done?

Right.

[END OF PART ONE]

Monday October 12, 2020

Inside Infra: Daniel Gruno --Part II

The "Inside Infra" series with members of the ASF Infrastructure team continues with Part II of the interview with Daniel Gruno, who shares his experience with Sally Khudairi, ASF VP Marketing & Publicity.




"...it speaks of how tenaciously the Foundation guards its core values, one of which really is provenance, because it's the Apache seal of approval, means this has been thoroughly vetted. We know where every single piece of code comes from. And we know that it works."


What about "user demand" --what does it take for you collectively to decide, "OK, we'll support Kubernetes," as you mentioned it earlier, or whatever? Are there strategic technologies that you want to work with or plan to support, or is it all coming from the projects themselves? How does that process work? You're creating projects out of some kind of pain point or some kind of vision. So for you, is it a longer-term thing? Do you have an influence on this? What drives the growth of services delivered? It's a mix. It's a mix of, first of all, the Infrastructure team is paid by The Apache Software Foundation and it's paid by The Apache Software Foundation to help the projects. So what we do must first and foremost be something that helps the projects and not something that just helps Infra. I mean, of course, we can make tools and have services that will assist us in our work, but the ultimate goal must be supporting the projects. First and foremost, we listen for projects that come and tell us, "We would really like this or we would really like that." Having said that, we do not always say yes. We have costs to consider. We have maintainability to consider. So as a general rule of thumb we will say, "Okay, project A wants to use service foo. Does anyone else want to use service foo right now?" On occasion, you get, "Nope. No one else wants to use service foo." And then we go back to project A and say, "It doesn't seem like this is feasible for us economically to maintain if it's just you." But you can also have a situation where 10 projects suddenly say, "Yep, we really, really want to use this." Once you have a trend for something, we are usually not proactive, but reactive to these trends. So a project will come and tell us, "We really want you to use this." We will go out and see if anyone else wants to use this, and they will say, "Yes, please." That's when we'll add that feature or service. 
We also have ideas of our own that are, by and large, a result of either existing services not doing what they're supposed to, or they're being... Let's say you have... For example, there is Google and there are mail archives that we had in the olden days. At some point we wondered, "Why don't we combine it so you can search for emails in the archive?" That's how lists.apache.org came to be. So we have both things that projects come and say, "We really want this," and we also have this crystal ball where we look at problems we're having with existing services, where we look at possible combinations between existing services and other existing services or new services that are emerging in the Web. Or we just have someone say, "Hey, wouldn't it be wonderful if something like this existed?" So it's really a mix of projects asking us and trends emerging and just blue skying, "Wouldn't it be cool if...?"

Have you guys been in the situation where you found yourselves caught where there was this magical trend that everyone wanted, and it just didn't serve the Foundation, it failed? Were you guys in that situation where you had to back pedal? Or is that not part of your experience?

I would say the most prominent or obvious feature or service would probably be GitHub where we started in 2010 with mirrors of our local Subversion and Git repositories. They would be mirrored to GitHub. That was actually a bit later, but around that time, they started mirroring stuff to GitHub, but you couldn't write to GitHub. We were adamantly against it. Because provenance, provenance, provenance: that is that thing that if you know Apache, you know that provenance is one of our key features. We like to be able to say, "Oh this came from that. This came from this. This came from that." We had concerns at Infra that we were not able to have the exact --emphasis on exact-- same provenance as we had on our own servers, and we got a lot of pushback for that.
In the end, we figured that maybe we don't need this kind of provenance that we had. Because we had very verbose logging going on for our own service that we couldn't get from GitHub because GitHub is a third party provider. They're not going to fork over sensitive data about their customers to us. So a) we were willing, at some point, to compromise, because it turned out that the data that we had been collecting was maybe not so important after all, and b) we came up with this linking utility that would actually allow us to go in and see who that person committing was on the ASF side. That is, if someone commits with a GitHub account, we can go in and see, "Oh, this is actually this specific ASF Committer," because we have this internal mapping going on with GitBox. And so with that, and then the realization that we didn't need all of this verbose logging, we finally decided that we're going to allow write access, but that was probably... It could have happened a year earlier. A year sooner. But I wouldn't say that it's a failing of us, as Infrastructure. I think it's more it speaks of how tenaciously the Foundation guards its core values, one of which really is provenance, because it's the Apache seal of approval, means this has been thoroughly vetted. We know where every single piece of code comes from. And we know that it works. If you're suddenly letting go, even if it's not really the case, but if you're seemingly letting go of some of those core values, you are going to get pushback because we all, I want to say, love and cherish the Foundation. We all believe so powerfully in its mission that for a moment, we forget reason sometimes and we just push these core values without interpreting them, which is sometimes the right thing to do. If we have a core value that says, "We need to be able to see where the code comes from," that doesn't necessarily mean we need these five specific points of data from every single user.
It just means we need to know where the code comes from. And if that means these four we know, plus this one new one, then that's just as good. That was a bit grandiose, sorry.

No, no, no. There's a lot to it. And I love the angle that you're providing with your answers. That's very different from the other guys' perspectives and that's super helpful. It's important, because that's demonstrative of the diversity of the Foundation. We're people, we're not just machines. And so it's very cool to hear this. Moving on specifically with our growth, like how do you close your skills gap? Do you do that? Do you rely on the team? How do you cope with stuff that way?

Oh that's a good question. I rely on mentors that I have. I'm not a bookworm, for example. I can't sit to read a book. I can barely watch a movie because I have a very low attention span. So what I'll do is I'll make some mistakes and I'll have some mentors that I have come in and tell me, "You're doing it wrong." And then I'll fix it or try to fix it and they'll say, "You're doing it wrong again." Eventually, well in a loving way, eventually they'll say that this is right. I love to learn by example. I have a lot of mirror neurons in my brain. I like to copy styles. I like to mishmash styles together. And I love to fall in love with new ways of --this is going to sound very nerdy-- I love to fall in love with new styles of programming. I recently discovered something called MyPy, which is a typing checker for Python. At first I was like, "Oh this is boring," and then I realized, "Oh, I can actually use this for checking whether what I write is going to always work." Then it changes into, "This is awesome. I love it." Which then changes into, "Everything I've ever written must now be written using this typing hint." And then suddenly I have Greg Stein yelling at me saying, "Yes, this is technically valid, but I really don't need this." Another mentor I have introduced me to this typing hint.
And so I progress by observing other people doing their things and where they and I differ, there are basically two scenarios. Either they're worse than you are or they're better than you are. If they're better, or perceived better, I will usually try to study, "What are they doing that's different from what I am doing?" and if I like it, it tends to stick like a rash. Then suddenly, it's in everything I do, because I'm not a trained programmer. I never studied programming. I never studied computer science. I studied social economics and then human resource management which is very far from it. It was always just a hobby thing so I never really learned about unit testing. I never really did learn about proper documentation outlines. And I never really learned this is the correct way to program in this language. This is how you style it. It was always just looking at some examples and then picking the parts that I thought were interesting. So when I initially started off writing a program, what I wrote would work, but it would be very ugly and it would be very error prone. So people would say, this is a cool piece of software, but it's very not pretty. This is what you should do to change it. So I've relied on people not telling me that I am good or bad, but telling me, this is the difference between what you do and what I do, and then having it be up to me to figure out is this something I want to adopt. Greg, for instance, has been a tremendous help in that Python department, not necessarily by saying, “you need to do this, you need to do that,” but by writing some examples. Commenting on code, saying, “this could be” --emphasis on “could be”-- “could be this. Or you could use this instead.” Because he's got decades of experience in Python programming, for him to say there's a different, smarter way of doing this, it's not by using words, but by just showing the examples.
Because he knows that I pick up on the why pretty easily. I just need to know that the difference exists, then I'll know the “why” eventually, because I'll be very interested in why that difference is. So he just kind of feeds me these little nuggets of this smarter way of doing it. I learned from that and I'm very grateful for that.

Tell me how has ASF Infra changed over the years: is it proactive? Reactive? How and why did this come about? Obviously it's changed, but was it an organic thing? What's your take on that?

It's changed in a lot of different ways and also it hasn't changed. And also: I don't know. It's changed in that it has become more of an obvious hierarchy now, which is not a bad thing. We have a place where the buck stops now: we have a place where decisions are being made. We have, most importantly for someone like me as a staffer, I have someone that I can defer to that I know will take care of it or will be the one with the final responsibility. That can shield us lowly peons when someone is being a bit too grumpy. That has changed which also means that we, the staffers, are not as abrasive as we used to be. I remember when I joined, the tone was a lot different. This is of course my perception as being this little timid newbie back then, but it was more, every single person had to kind of fend for themselves. Now we've got more of a cohesion. We have yearly gatherings, face-to-face gatherings. We talk about a lot of non-work related items. We have weekly calls that didn't happen before. I guess you can say it's become more of a family now than before because we interact with each other on so many different levels that are not specifically work-related now. It's also made us more friendly. The change was largely planned. Or it wasn't “planned”, but was planned as a reaction to events that happened --sometimes you come across some things when you're in any given company. We were like, “we need a change”.
And this was one of the changes that happened a few years back. Well quite a few years back. Actually, I think this was in Cambridge, not Cambridge, Massachusetts, but Cambridge in England. We had a meetup with our new, at that point, our new Vice President of Infrastructure, David Nalley, and the existing infrastructure team. This was the first event in my lifetime, if you will, of the team. The first face-to-face meeting we had, that was all about “what are we going to do in the future as a team”, where we worked out a lot of policies and work methods that we still use to this day. I'll not go into too much detail about why, but it was planned as a reaction to us being perceived as not the most welcoming group of people. If you go back 10 years, it was in my personal experience, a lot more daunting asking Infra for something.

Do you think that's because people were just rude? Or was it a matter of them being overwhelmed? Or there was no process? What do you think was behind that?

I think there was not a sense of structure in the team that we have today. People were self-led. We are, let me emphasize that, we are still very much self managing in the team, but we also have a boss and a boss's boss that let us know what they would like us to focus on for the long term processes. We didn't really have that before. It was more fend-for-yourself, figure-out-something-to-do. And if you can't, then that's just “why not”? We have a lot more structure after the Cambridge meeting. And after David started as VP Infra. Because we had gone from being --I don't know if you know this saying in the US, but there is some difference between a United States and American NATO Secretary General, and a Dutch NATO Secretary General. And that is that one is a secretary and one is a leader. One is a boss and one is a leader. We had a change in the style of management at that point. It's not that (former ASF President and VP Infrastructure) Sam (Ruby) wasn't doing his job.
It's that David added something to the job that wasn't there, hadn't been there before. Sam was doing what every single VP before him had been doing. Which was fine. David came in and saw that there were things that he wanted to improve upon and he improved upon it. One of the outcomes was that, in my view, the team also became more friendly towards people coming in with issues. But it's also a different environment that we are in now as a team. Apache in the old days, it was strictly volunteers spending their hobby time, doing what they love. It has slowly pivoted into being people that are paid. They still contribute as individuals, but they are being paid to make those contributions. They are also part of larger teams, often at big companies that have a lot of resources. The expectations and demands of the Apache infrastructure have also increased exponentially as we have become a large organization. So what we are tasked with today is also more demanding. I don't think that the Infrastructure staff 10 years ago would have had the same interactions and been on the same good terms with the contributors as we are today. So in that sense, I think David was gearing us up for what was to come.

David has also a unique perspective because he had come on board in 2012 as part of the Apache CloudStack project. So he came in as an incoming project that also needed support from infrastructure. So he has experience on both sides of the fence, so to speak. You know, Sam has a much “older” experience in terms of him being with the Foundation from a much earlier time period. So it's very interesting to see how the evolution has come about. A lot of us who've been here from the beginning see things a certain way, and don't realize that from an outsider's perspective, that experience might be completely different.
It's very interesting to be able to have that balance and have someone come in and kind of make the team more cohesive based on what their perceived needs were and being able to project what projects will be needing in future.

It really is. Yeah. Also, he has a very special way of --let's say he's very “godfather”-like. I don't mean that he kills people! He has a very persuasive non-intrusive way of asking you for a favor that a) I find very endearing, and b) I know why he's using it: because it's very effective. That I don't think a lot of people would get away with. So what that means is we do a lot of things that David asks us because it's David, because he's built up goodwill. It's easier for him to shape the team into what he wants it to be, as opposed to someone that was just there as a secretary and didn't really do anything. If you're not engaging with the team as a boss, and then you suddenly come in and say, do this, there will be pushback. But if you're engaging, if you're there, if you have a presence in the daily routines and the daily water cooler chat, and then you say, "Hey, by the way, what do you think about this idea?" Then you're much more likely to get a positive response back. I think that's one of the things that David brought is a more relatable and more ... let's say he's brought in a closer bond between boss and workers. Leader and workers. And now we have Greg as well. So now we have two of them. That’s progress in the right direction.

What areas are you, meaning Infra, experiencing your biggest growth?

At the moment, that would be continuous integration, which is building software basically. Testing that something builds. Testing that something compiles properly, that it passes these tests. We have six or seven different platforms for that at the moment, and it is using hundreds of machines. And it's never enough!
We know we have a demand and we know what the trends are, and we're also kind of blue-skying a bit on how do we solve what's ahead of us. A lot of this is throwing more money at it because that always helps. A lot of it is, again, going back to developing smarter tools that enable us to utilize the resources that we have, because we are not like a big whale. We don't have a cash whale: we don't have that much money. So we’ve got to make sure that the resources that we buy or lease or rent or whatever, are being utilized to their maximum potential. So that, again, comes back to figuring out how do we go in and monitor. Is it being utilized? What can we do if it's not? What do we do if it's over-utilized? Can we figure out where it is bottlenecking? And a lot of other things. Builds, continuous integration, continuous delivery, I think it's called. That's the place where it's the most growth at the moment.

With regard to CI, what is the most popular platform that you guys are using or what service has the biggest demand?

The most used one is still Jenkins. I think we have 30-40 Travis machines building there, and that's practically nonstop. With Jenkins, we have, I think it's 150-200 machines or something that are building practically nonstop. That's by far the largest platform we have. We are using a lot of Travis and Buildbot. We can always use more of that. You’ll be talking to Gavin (ASF Infrastructure team member Gavin McDonald), who has been working a lot on splitting Jenkins into smaller components. So that major software categories, for example, get their own platform and bigger projects can get their own platform. This is because we don't want a monolith. We're splitting that up to actually save us some costs and not have so much downtime. He can tell you much more about that. One of the things we did was graph out how much are we actually using and how much have we been using. Which projects are using the most of these resources?
And if there's a specific project that sticks out like a sore thumb with, I don't know, 50% of all the computers or the machines are going towards that project, then we'll reach out to them and say, are you maybe doing something that's a bit too intensive? Can we scale this back a bit? Or do we need to look for a specific targeted sponsor for you or what? Not constantly, but more often than not, we are looking at these resource usages and seeing where we can optimize things so as to not use too much money and also not use too little money. Just the right amount.

So many companies, as you know, are really struggling with their teams working from home in response to COVID lockdowns and stay at home orders. From day one, ASF has been virtual. I understand that you were stuck in another country when the pandemic lockdowns happened. How did you cope with that? Did anything change with your operations, your work? How were you impacted by that?

Work-wise I was not impacted at all, which is wonderful. We are able to work from pretty much wherever we are. And this was not my first trip abroad, believe it or not. This was in Canada, by the way, I was stuck for 105 days. In the few places that I go to more than just once, I have it set up so that I can work from there in a reliable and comfortable way. By that I mean, I don't like laptops: they're a wonderful invention, but I don't like them. I don't like sitting hunched over a tiny, tiny keyboard without a mouse and looking down instead of straight ahead at the screen. Luckily I have a laptop and I travel with it all the time, but I plug it into a KVM switch which is a keyboard, video, and mouse adaptor. I have a monitor and a good old sturdy keyboard set up in the places that I frequent. So I was able to work from there as I would with my stationary machine back home, just using my travel laptop. The only difference was the time zone difference. But we do most of our work asynchronously.
And whatever firefighting there is that always just happens at random hours. So it doesn't really matter what time zone you're in. You're going to be screwed one way or the other.

What do you think people would be surprised to learn about ASF Infra?

Surprised? I mean, probably that it's only six people. I'm sure, I remember Drew saying this and Chris and so on, but people are often surprised that it's only those five-slash-six people that are doing all the work.

I know you all, and I'm astonished by it. I'm perpetually amazed by that. It is a seriously huge feat.

You want to know what surprises me from the inside? That we actually manage it. It surprises us as well. It's not that “oh yeah we're just that great”. There is something about the coalition and the project that we can't really explain, and it's not explained by the individual parts. It really is the sum of the whole, that somehow…

Huge, huge, huge undertaking. It's massive. And the fact that you guys do it is incredible. And yeah, you know it would take five, six times the number of people to do it elsewhere. So it's very special.

I think we also have a lot more flexibility from up above. From both our boss and his boss and his bosses. They trust that we know what we're doing in a sense that you might not find at a typical company. And I think that is part of the reason why we're able to do the things that we do so efficiently. Because we've been given this trust and we've been given the benefit of the doubt if you will, when we choose to... They trust that we know how to manage our hours and get things done. Like it's not a strict requirement that you be here, nine to five, Monday through Friday. You can be here, I don't know, three hours one day, and then 12 hours next day. Or maybe you want to work on Sunday instead. As long as the job gets done and nothing falls through the cracks, they basically let us get our job done.
And I, again, I think this is a win-win situation because it allows for us to kind of cool down when we've burned out a bit, but it also gives them the added benefit that, when I feel like it, I will put in the extra hours, kind of as a “thank you for letting me decide my hours”. So I'm going to put in some more time here and then I'll relax when it's quiet. Because we do get quiet days.

So you all have to carry the load, which is good. There's no favoritism. Everyone has the same shared responsibility --you all have to be on call, for example.

Yeah. It's still quite a flat structure. I don't consider myself senior in a managerial way to any of my coworkers. And so if I were to, or if someone else, if Gavin or Chris or Chris or Drew, if anyone were to say, "I'm not going to be on call," that would create a rift between us. I mean, there are staff members who wish they didn't have to participate in it, but we all are on call on a rotating basis. And so I think we're fortunate that we're all in a position where we're okay with it. We were able to manage it because there are legit situations where someone is not able to be on call. I think we all have them from time to time or someone else has had to cover our shift so to speak. All five of us are fortunate that we don't have things going on in our lives where you can't be on call, because those things, they can happen.

Sure, that makes sense. So what's your favorite part of the job?

This is going to sound weird for a lot of people, but my favorite part is the weekly meetings.

Why does it sound weird? People aren't social?

I don't know. It might sound weird to normal people who don't like meetings, that I like meetings. There's something about meetings... Even though they are very informal meetings, I like the little shred of formality that there is about them. And so that's, I think, my favorite part. And also I get to prepare for them.

All right. So you must like preparing for the Board meetings too.

Yes. You should read my Apache Web Server reports.

What was your biggest challenge when you came into the role at Infra?

There were two major challenges. The first one was learning the ropes. This is, as both I have said and a lot of people before me, it's such a complex system at ASF. There are so many things to know and it doesn't just take a year: it takes years to learn enough to get by without someone else's help. So that was, by far, the biggest ... Well, no, that was the second biggest challenge. The biggest challenge was believing in myself and not being scared of doing things unsupervised. Because again, what I can do and what my other colleagues can do with their keyboards is very, very ... We wield a lot of power over a Foundation that is responsible for so much in the world. Not being afraid of typing a command takes a long time when you know what a typo can do on a machine. You know, “did you just delete this file or did you delete the entire hard drive”? And especially at the very beginning of getting into the job, I would double, triple, quadruple check everything I typed. I would wait for sometimes minutes before I hit enter just to be sure, I would look up whether I was doing the right thing. Just to be sure that I'm not messing things up now. And as you do it once, twice, three times, 10 times, 100 times, you become more confident and you also relax more.
Your first thought isn't “what if this goes wrong?” Your first thought is “let's see what happens next”. Or you're thinking ahead to the next debugging step or the next problem solving step instead of being stuck on “what if this goes wrong?” Which also means unless something goes wrong, you work a lot more efficiently. Because you're not fearing the Enter button.

What are you most proud of in your Infra career to date?

Let's see. I'm well, probably most proud of ... That's a very difficult question.

That's why I ask it. If you don't want to answer, that’s okay.

Oh, no, no, no. It's just that I've made so gosh, darn many things. What I'm most proud of is probably ... I would say that lists.apache.org is a thing that sprang out of an Infra job I was doing. Yeah. I'd say that's probably the thing I'm most proud of.

lists.apache.org is very powerful. We all use it every day. So that's yours?

With the help of a few friends, yes. It was a brainchild of mine during some tests we had at Infra. And again, it's one of those situations where you have something that's not working and you're like, "Maybe I'll just rewrite it completely and it'll work." And then you start writing and then you find a good idea, a better way of doing some things and some things don't work. And then sometimes eventually you end up with a product that sticks. It's one of the most long lived projects I've had and that's still used today.

Well, it's super useful. There's no doubting its efficacy and necessity. I mean, how many mailing lists do we have now? 1,700? It's some crazy number.

I think we're nearing 2,500 if you count the private lists. And that's like 25 million emails, so ...

That's insane. That's very cool. Very cool. All right, next question. This is the one that everyone kind of laughs at. How would your coworkers describe you?

I'll have to think about that. They would probably describe me as the one that suddenly says something completely out of context.

(Laughing) Okay.
I thought I was supposed to be laughing, not you. That is funny. What happens is when I ask the question, Chris, Drew, and Greg, all laughed. Then they give me their answer and I always laugh. So it's classic. Tell me what you think are the biggest threats that infrastructure teams need to watch out for?

I think the biggest threats are relying on tried and tested methods, but forgetting the change in expectations of the user in terms of user experience. I've seen a big change in what a user expects from Infrastructure in terms of user experience. I don't mean they just want it more easy, but I mean people want it more feature complete, they want it to look more intuitive, they want it to tie in together with what they are already using. They want to tie it together with whatever is the next new hot thing. If you stick with what might be good and tried and tested and stable, you might end up losing everyone because while it might be that, it might also not be what people are using tomorrow. If it's not what people are using today and tomorrow, then no matter how good it is, people are going to move away from it. Not necessarily outdated in the sense of technology, but more in the sense of trends. What is trendy. Yeah. I mean, it used to be Vine. Now it's TikTok and tomorrow it's going to be something else. If you don't keep up with the fashion of IT, then you're going to find yourself not wanted.

That timing out happens more quickly these days, it seems. Okay, what would be your advice to aspiring sysadmins or Infrastructure team members?

My greatest piece of advice is basically don't be afraid because this ties back into the daunting task of having to push the Enter button after you type something in the command line. Don't be afraid because you'll lose so much time just being afraid that you could have spent fixing things or learning new things or making yourself more at ease. Just jump in with both feet and you'll be fine: you're awesome.
Yeah, that's good advice. If you had a magic wand, what would you see happen with ASF Infra?

Oh, interesting. I would like to see us having some magic, unified CI system that could be used across the different repository types we have and didn't require any machines, that would just build instantaneously. And yeah, be free of us needing to manage it and pay for it. And also, if GitBox version two was suddenly a thing tomorrow and I didn't have to actually write it, which I still have to do.

Okay. What else do we need to know that I have not asked yet?

Gosh, I don't know. I don't know. One thing I'm really good at or one thing I'm really bad at is when you ask me an open question like that, because I don't know where to go with that. I am very good at analyzing a question and coming up with a specific response, which is why when people say, "How are you doing?" I get confused or I say, "I'm okay." And get nervous and forget to ask them how they are doing, because I don't get the dynamics that are happening there.

The reason why I ask this question is sometimes people come in thinking, "Okay, this is my area of focus." They might want to talk about the “blue switch” or something specific like that, then we talk about all sorts of other things. We may derail. I may be driving the interview in a certain direction, and they have this pain in their gut because they never got to talk about the blue switch that they wanted to.

The only thing I could think of would be something called pip-service, which is a new thing we're making, which is kind of like a package manager for all of our infrastructure services. Again, it's us working smarter instead of harder. And we were defining a way to easily install or run a service on any given machines and set them up without actually having to install and run and then set them up. It would require a lot more time to explain in detail, but it's really nifty.

Is it coming soon or is it available now?

It's in production.
And it's really helped us a lot.

I love the efficiency of Infra, how you guys are having these new directions ... Like when you and I were dealing with the selfserve.apache.org the other day for the CMS (content management system), when I was getting the Annual Report up. For 21 years, I haven't been able to deal with the ASF CMS and then you walked me through it in literally three minutes on Slack and boom: it was done. I was amazed and shocked --because I'm not a technologist. To me that was phenomenal. You guys are really helping so many different kinds of people with different profiles and different skill sets. It's very cool.

I think some of that ties into, again, the CMS was cool 10 years or 15 years ago, but it's not really been able to keep up with what's going on at the moment. No one knows how to use it because it's not very intuitive… Or it's not what we do today.

Right. As we’re halfway through the Infra team, who do you think I should be interviewing next?

I think you should be interviewing Gavin because he knows a lot about the CI platforms that I have been on-and-off raving about here.

Gavin's not planning to talk to me until October...

Oh, well then you should talk to Chris Lambertus, because he doesn't want to talk to anyone. (laughing) Chris can talk a lot about the upgrade of our email infrastructure. We have a lot of very tough work ahead of us in that we're upgrading an infrastructure that again, it works, but it's kind of like upgrading from an IBM mainframe to a modern computer: not that much of an upgrade, but we are having to modernize heavily on our Infra email infrastructure.

I understand that's a huge, huge project.

It's a very big project, yeah. That's a little advice for sysadmins there.

= = =

Daniel is based in Copenhagen on UTC +2 (currently on CEST). His favorite thing to drink during the workday is lukewarm, weak coffee.

Monday September 07, 2020

Inside Infra: Daniel Gruno --Part I

The fourth interview in the "Inside Infra" series with members of the ASF Infrastructure team. Meet Daniel Gruno, who shares his experience with Sally Khudairi, ASF VP Marketing & Publicity.




"...companies are not the same as ASF. They don't have 300 different departments that all have their own little tools that they want working in their specific way. And they want this to connect to that, and that's connected to some other thing. We are not afraid to create custom solutions, we're not afraid to get our hands dirty and we're not afraid to make mistakes."



What is your name and how is it pronounced?

I have my official name and I have my user name and people usually ask about both of them. My name is "Dan-yell Gkhroo-no" or I will accept "Dan-yell Groo-no" which is as you read it in English. It's actually a Dutch name. So you would pronounce it "Hrooy-no" in Dutch, which I'm not even going to try to phoneticize because, that's, well, Dutch. And my username is "Humbedooh" which is an onomatopoeia that I randomly made up in 2004 for a game called World of Warcraft, where you need a username for this character that you create. And I think I had just listened to "New York, New York", where Frank Sinatra sings "scooby doo bee doo", and I was like, "hum-be-doo-de-doo" and the name just came to me and it stuck ever since. And so for the past 15 years or 16 years, I've been primarily "Hum-beh-doo" online.

By the way, Frank Sinatra sings "zoo-bee-doo-bee-doo", not "scooby-doo-bee-doo" in "Strangers in the Night", but I like your version better.

Okay. Well today I learned that.

When and how did you get involved with the ASF?

That goes back to 2010, 2011? Again, this beautifully ties us into World of Warcraft because in that game you can make modules, add-ons for the game that will do nifty things, like add-ons for a Web browser. And these are written in a programming language called Lua, L-U-A, which is Portuguese for "moon". And so I started writing some programs for this game and I had great fun with it, and programming is not my official trade. I was educated in, or studied, human resource management at university actually. But it was my hobby and I had great fun doing it. And this Lua thing just got stuck in me. And then five years later or so I started writing a program for the Apache Web server called mod_pLua; the best way to describe it is as if PHP and Lua had a baby. So it would be the same for people that know PHP.
It would be the same structure with the less-than sign and a question mark, and then the same thing to end it on the other end, but with the Lua language instead of the PHP language. So I wrote this program or interpreter for the Apache Web server. And I didn't really think much of it. Obviously it was mostly for my own edification if you will, and for my own use. But I had put this on a site called SourceForge, which at that time had a community manager named Rich Bowen (also Apache HTTP Server PMC Member) who took a liking to this program or this module for the Web server because the Apache Web server community, which he was a part of at that point, had been doing something similar called mod_lua or at that time mod_wombat. And that had stalled. People have interests and then the interests wane and people would move on to new jobs and the person in charge of this mod_lua had found other interests in life. And so this module was just sitting there and not really being worked on. And Rich said, "Why don't you come take a look at this program and maybe this is a place where we can collaborate." And he also got (ASF co-founder and Apache HTTP Server PMC Member) Jim Jagielski very interested in the work I was doing. And so I slowly started on my path to becoming an ASF Committer initially by fixing what's called 404s, which is basically a reference in a Webpage to a link or another page that doesn't exist. Either it never existed or it doesn't exist anymore. So I started fixing a bunch of those just to get on their good side and hopefully they would take me seriously. And I didn't have high hopes, but I think I was probably the fastest person to get committership at the Apache Web Server Project...perhaps the fastest in the 10 years preceding. I got it probably within a week. They had a vote going and I was voted in and…

Within a week?

Within a week.

Unheard of.

I was pretty much on the path to becoming a Committer. I couldn't believe it.
Part of me wanted to believe it, because it was a very big validation for me. Because I had been using the Apache Web Server since 1998 and it had always been a project that I looked up to and it had been this mythical "Father of the Web" program. And so to actually be a part of it and get your name on the page that says these are the Committers that actually have a say in the project and can commit code to it, that was quite a feat for me, especially at that time where I had stopped my studies at university and I didn't know, what am I going to do now? Because as happens with a lot of people that study something, they eventually find out that, okay, this was interesting, but it's probably not what I want to be doing, if I'm honest. Because what I had fun with was programming. So while it was nice knowing a lot of stuff about statistics and economic models and psychology and so forth, it had started to get a little boring for me. I knew these things, what now? And so to get this validation, to get an avenue of sorts where I could use my creativity in a new way that I hadn't studied for, but it naturally just came to me this programming inspiration, that was really nice, to use a very vague word. It was a tremendous opportunity for me. And then that's how I got started with Apache.

Fantastic. You’re not only a Member of the ASF and an Infra team member. What other "hats" do you wear at the ASF?

I have a couple of hats. I’m also the Vice President of the Apache Web Server Project, which is a great honor. And it's still to this day, three years in, fun to do. People think of it as a dictator role where you get to decide, but it's more of a glorified secretary really, where you keep tabs on everything or most of the things that are going on in the project. And you relay that information in a concise way to the Board of Directors, whose job it is to look at these reports and say, "Are the projects doing okay? Do they need any help from us?
Are they in trouble?" So basically VP is the watchdog --in these COVID days, I guess you can say it's a pulse oximeter of the project. And if you want to know if a project is still healthy and stable and progressing, the VP is the one to ask, because that's basically their job to know. As VP I don't get to decide who gets in or who gets kicked out or what direction we take in the project: I am just the person that ensures that the Board knows that the project is in good health.

Do you wear any other hats or is it just the VP of Apache HTTP Server?

I'm also VP of Apache STeVe. As I said, I have two VP-ships and STeVe is a whole other beast. Let's say it's very stable in that we have a code base that works and we don't really do much about it, we maintain it. In the Apache Web server, we have around 20 to 30 people actively contributing code every single quarter. And in Apache STeVe, we are basically twiddling our thumbs, waiting for something bad to happen. And it never happens. We have a program that works the way we like it. And we don't see the need for any large changes. And as long as there is sufficient oversight in a project, then the Board doesn't come in and say, "Hey, can you make this cool feature?" Because that's not the Board's job. The Board’s job is just to help us, as projects. And so if the project doesn't have anything that it feels it wants to add, but it's still there and the people are alive and well, then the Board will say, "You got it. We'll see you next quarter." And so the two projects are very different and it also makes for very different reports.

OK. Let’s drill into Infra, as that’s the focus of our interview series. How long have you been a member of the Infrastructure team? How did you get there?

I am not sure. I think I've been a member of the Infrastructure Team since 2012. You can probably figure out when exactly I got my membership in the email archives. It started because the Apache Web server project needed a commenting system.
Because we had been eyeballing the PHP project and they had a system where, on the documentation pages, you could enter a comment about a bit of documentation, or you could add some code snippets or ask a question and get an answer. And the only thing we had was send an email and get a reply and then the next person comes along and doesn't know that that email existed and sends the exact same question and gets the exact same reply and that can get tiresome in the long run. So we wanted someplace inside the documentation itself where you could go in and see, okay, I have an issue with this documentation, have other people encountered the same problem or are there some smart solutions that I can find here. And this type of software doesn't write itself, unfortunately. So I set about writing that using the mod_lua that I had now invested a great deal of time in because A, we needed a comments system and B, this was a good excuse to show off mod_lua in a production system. This could really do something, it's not only fast, but it's got a lot of features and it's got a lot of flexibility to it. And so I asked the Infra Team, which at that point was very daunting for me because they were, let's say our image has improved over time at Apache Infra, it was much more a, well basically an operator-from-hell vibe you got back in the early 2000s or early 2010s from the Infra Team, especially when you're someone of a more timid nature like I am. So anyway, I asked if I could get a place to set up a machine or borrow a Web server basically and put this commenting system on that I had been writing as a hobby. And they pretty much said "Sure." Which was surprising to me because normally when you go and ask for something at a company, it's very difficult: you can ask for meetings and meetings and meetings, but if you ask for actual resources, you will usually have to file a form J/99-B in triplicate and whatnot.
And here they were just: “well it looks like he wants to help the project, just give him what he wants”. And so I got started on this commenting system. And other projects became aware of it and they wanted to use it as well. And then I became the comments guy, basically. And I started maintaining this system for, I think, it was seven different Apache projects at the time using it. And since you can't really maintain anything at the ASF without somehow being an infrastructure person, I was made an infrastructure person...and generally if you're given something, you get a taste for it and you want more. And so I started volunteering for more and more infrastructure tasks. And then I became what is called infrastructure root. This was about two years later down the road. Which is a point where the Infrastructure Team says we have complete trust in what you do. Here are basically the keys to the kingdom. Do whatever you like, except don't do that. But you could do whatever you wanted to. And that was almost as awesome as becoming a member, which I had become just about a month prior. It needs to be said that at that point you could not become infrastructure root unless you were also an ASF member, because needless to say, when you have root access to an organization as wide and important as the ASF, you get to be privy to a lot of information that you should keep to yourself. And so the logic at the time was, if you are an ASF member, you will already have access to most of this information because of your membership and so we can allow you to become an infrastructure root person. This has changed since then: we have cast a wider net when looking for new infrastructure people, and this also includes a more thorough vetting process that we have now. So we feel more secure about not just requiring that you be an Apache member before you come and help us.
So we are able to look for a broader set of requirements that might not have been found within the, at that time, 400 and something members that were in the Foundation.

What are you responsible for in ASF Infrastructure?

Oh God. As with most infrastructure members, it's almost easier to say what you are not in charge of, to which I usually say “Jenkins” with a big smile because, I know this is going to sound silly to a lot of people reading the article, but things that are Java, I tend to shun like a vampire shuns sunlight.

Any particular reason?

Yes. I'm not accustomed to the way the output and stack traces and core dumps look. And the thing about Java is it's very verbose: you can write 50 lines of code and you'll have a print Hello. And it doesn't appeal to me. So yeah, when things don't appeal to me immediately, this is one of my weaknesses, I try to not really understand it because it's easier not to. Fortunately we have some very talented people at the Infrastructure Team that know pretty much everything there is about Jenkins and JIRA and Confluence and all the other big Java powered monoliths we have at the Foundation so I can spend my time elsewhere. What I mostly do at the Foundation, day to day work aside -- because we all have basic maintenance tasks and disasters that can crop up from time to time -- is product development of the glue that binds The Apache Software Foundation together and its software infrastructure. And I'll tell you about a new thing that we've been doing, which is something called PyPubSub. I can spell it, it's P-Y-P-U-B-S-U-B, so it's a Python publisher subscriber service for the ASF. You can basically think of it as a newspaper where you have a publisher, you have an audience, you have the readers, and then you have topics of interest. Some might want the sports section or the funnies, or someone might want the financial news.
And then you have, of course, the writers or journalists that make up the contents in these sections. And at ASF, these sections, they would be a Subversion commit or a Git commit or a new email being written, or someone got added as a Committer or someone filed a pull request, someone filed a new bug or issue, or some are discussing an issue. And the writers and journalists would be all these systems where you send an email to, or you open up a new ticket or you commit some code to it. The readers will then be either users, or there will be a lot of different software components that rely on these messages in order to operate themselves and do what they're supposed to do. So in essence, PyPubSub is, again, some glue that binds the majority of our services together. And it does so by dispatching events to basically whomever wants to read about them. We actually have something called a Pub/Sub Explorer, which in real time shows every single event that happens at the ASF technology wise. So if someone sends an email to us, if someone commits something, if someone opens a pull request, if someone comments on a discussion, it all shows up in this Explorer that will update in real time. And it's very cool.

(ASF Infrastructure Administrator) Greg (Stein) was saying that you do things that are uniquely different from other team members. In addition to the PyPubSub, what other things are you working on?

Currently, one of the main things we manage is called technical debt, which is basically, the longer you don't maintain and upgrade a system, the more expensive it's going to become once you finally have to do it. And so I'm dealing with some technical debt that is moving the service that we have called GitBox from an old, pretty ancient setup to a brand spanking new 2020 machine and software, which also means moving from Python 2.7 to Python 3.8 for every single component that is in the service called GitBox. And that is a lot of components.
GitBox is the ASF side of where a committer would commit code to if a project uses the Git version control system. The other side would then be GitHub, if a project chooses to use GitHub. And GitBox and GitHub, they kind of talk together and figure out, okay, someone pushed to me, I'm going to synchronize this with you. And I'm also going to make sure that everyone gets an email on the mailing list saying "something just happened." It's rather unique in that you can choose to either use a GitHub account, or you can choose to say, "I'm not going to use GitHub. I'll just use my Apache credentials on the Apache server instead." Not a lot of organizations, very, very few in fact, have this kind of interconnectivity between GitHub and a locally hosted git server. And what we have done very neatly is, we have managed to link our LDAP directory of all our committers to GitHub. Meaning that, if you go in and say, "This is me on GitHub." We automatically figure out, okay, that means you get write access to this and this and this repository. And that is updated in real time.

How did these out-of-the-box projects come about? I remember when you first approached me about five years ago with these fantastic stats just before I was going to publish the Annual Report. I’d never seen anything like that at the ASF.

It's difficult to explain. It's like asking a painter, "Where do you get your inspiration from?" It just happens. A lot of time --I will tell a little secret-- a lot of the time that I spend in my day-to-day work is not spent actually typing code or reading up on new fun things. A lot of it is spent what you would call idling. And by that I mean not particularly engaged in any specific task, but kind of just all over the place casually ... Like how, and I hope not to cause any offense here, but how a standard office worker would spend a lot of time on Facebook catching up on friends and family.
I'll just spend mine to see whatever I'm interested in at the moment that has to do with programming or mathematics or psychology. And in the back of my mind, there's always, how can I take this information that I'm reading about and apply it in a software world? My mind has a tendency to see structures that may or may not be there. And I think almost exclusively in structures. Whenever I see something I want to understand not just how does it work, but how is it basically designed? And can I replicate that? And so, a lot of my day-to-day work is, I see something cool, it might not be anything that has to do with software, or the internet, or anything. It might just be a cool gadget, or a painting, or a chart in a newspaper. And I'll be like, "What can I use that for that would benefit the Foundation? Or whatever hobby project that I'm working on?" And then you get these aha moments where you're like, "This I can actually use this way to fix a problem that we are having, or a problem that we could have." Sometimes you just make up problems that will potentially happen in the future, just so you can have an excuse to get started on something. And for some strange reason, these fictitious problems very often tend to be not so fictitious at all. And once you show three or four people, hey, I thought of this thing that's not actually a problem, and I thought of a solution, they'll be like, "That is actually a problem for us." And suddenly you have a solution to a problem that you didn't think existed in real life, but it actually does. So, a lot of the things I do are “for the fun of it”. But there's always a work-related starting point in that, is this something that can be used within the software world? Or within the managerial world of software? Which is where I primarily tend to focus my energy.
In terms of your day-to-day work with the Infra team, you said that you’re hands-on, not necessarily coding specific tactical solutions, but solving other problems --do you participate with the firefighting as do the other team members? You often respond to my queries about mailing lists --is that your specialty? Chris and Drew shared that everyone specializes in at least one thing. What do you specialize in?

My focus is primarily, and this is kind of a self-made problem, all the programs and services that I, unfortunately, created.

You create it, you own it.

Yes. There are a lot of services at Apache Infrastructure that either I made from scratch, or they have a very big thumbprint of mine on them. And so, when I started at Infrastructure, the Infrastructure team, it was expected that we do our fair bit of firefighting. We do a fair bit of the tasks that every single member of the infrastructure team knows how to complete. And I will go through tickets and I find tickets that I find manageable and complete those. I will participate in firefighting. I will do whatever I feel needs to be done right away. If there's something important, or if there's something where I feel like this should have been dealt with by now, I will do that. But it was also the expectation that I come in and help develop and maintain a lot of new features we were looking at creating for the committers and for the end users of Apache software. Simply to make for a better user experience and an easier workflow for our committers and contributors. So, a lot of what I do is maintaining and assisting with services that I have either initially authored or helped expand upon.

Tell us about the structure of the Infra team --how did your work come about in a formal way? You were saying that you're creating these tools and then they just kind of got integrated. But were they looking for your sort of skill set? Or was it more of, “hey, we need another Java guy”? What clicked there?
Your background is really different. Your expertise is different. Your insight is different. It's an unusual scenario to have a traditional department embrace someone like you and say, "Hey, we're going to have a whole new type of services offered based on this one guy's vision." That's very unusual. Can you elaborate on that a bit?

I don't think they were looking for someone like me. But I think they got someone like me and it was completely happenstance. At that point, that was early 2013, the Infrastructure team was looking to expand with one more staffing spot. This was a part time job. And this was probably about a year and a half after I started doing things for the Infrastructure team. And they had a very narrow list of candidates at the time, because it was a very closed circle. And kind of still is because when you're a staffer, you get the keys to a very mighty kingdom. And so, they had a few people that they could consider, but I was probably by far the one putting in the most hours. And I will gladly admit that, at that time, I did not have a job. So, I was able to put in a lot of hours. This was when Sam Ruby was VP Infrastructure. When Sam initially took me aside and said, "Hey, we are looking for this part time opening, are you interested?" I was like, "No, this, surely you're not, you can't be serious. There's got to be someone that's actually qualified for this job." I didn't consider myself qualified at all. And...

But you were doing the work?

I was doing the work, I just didn't have any confidence in the work I was doing. You can be creative, you can do a lot of interesting things and still have this incredible imposter syndrome going on at the back of your head saying, "Someone else is doing this work.
It's not you." So, I politely turned him down and said, "Thank you, but I'm not interested because you'll just find out I'm a fraud." It actually took two other Infrastructure staff members, current staffers at that point, to yank me aside and say, "What are you doing? We want you for this job." And they had actually pretty much all internally, independently been rooting for me and trying to position me to become this new member of the team, to my great surprise. After, I think it was after a very long talk with (former Infra team member) Joe Schaeffer, I was finally convinced, maybe I should give it a shot. And I'm very glad that he convinced me. I'm very glad that the other people at that time also convinced me because it's now been, to this month, seven years since I started. And it's not been fun every day because there can be such a thing as too much firefighting going on. But it's been interesting every single day. You're never bored and you never think, "I need to find a new job." Because you are respected for what you do. You are rewarded in more ways than money, honestly, and you can probably agree with this, at The Apache Software Foundation you get a very unique sense of loyalty. Not to the Board of Directors, or to the specific projects, or anything else, but to the community as a whole. To the mission that we're doing. So, I am honestly very content being where I am. I'm very happy that these people ganged up on me and, basically, forced me to get a job that was... It was kind of silly in hindsight because it's a well paying job, it's part time. So, you don't have to spend nine hours a day on it. You can work whenever you want to and... There were no setbacks except for this nagging doubt that people are going to find out the real me. Which, as I discovered myself, it turned out the real me was actually kind of awesome at this job.

It's interesting because the Apache community tends to not want someone if they're not good.
So, it's testament to your skill set, and who you are as a person, you're liked. You're very well liked.

Thank you very much. And, you're right. The Apache community seems to be very good at finding talent, and also very good at rewarding it in ways that make that talent stick, and make them interested and continue working within the ASF community. I think that's a thing that you don't see in all software communities.

We learned from (Infra team members) Chris (Thistlethwaite), Drew (Foulks), and Greg (Stein) about the scope of the work that Infra does. How is the ASF different from other Open Source foundations from an Infra perspective --are there other people doing what you do, or how our group performs, or the services that our group provides? Is this common in other Open Source foundations?

It is not common in other foundations. We are different in the breadth of services that we provide for each project, and especially in the budget that we provide them at. I think we did a count back in 2015 and it was something around 52 different distinct unique services that we had, that we were running for all projects to use. And in between these, there are possibly more than 300 machines each running, some of them running the same thing on 10 machines. And then you have another 10 machines that are running 10 different things.

This is all handled by what? A team of what, seven people now?

Six people actually, five of us and Greg (Stein). Greg is a bit of an übermensch, so yeah.

That's amazing, in terms of the workload.

It can get hectic, and I will not deny that, but we have a very, very strong cohesion. I don't want to say we finish each other's sentences, but when someone has a problem, the others know when to step in and help, when to back off, and what to do while someone else is doing their thing. We complement each other really well.
And we have a nice set of tools to help us with managing things, making sure that everything is up and running, diagnosing when something goes wrong. We have a lot of, again, by the hand of me, a lot of custom tiny services that you never even hear of or see if you're not within the infrastructure team. But that goes on automatically. Let's say you're abusing someone in a ticket multiple times, or you're spamming, whatever. We have a lot of microprocesses that go in and detect abusive behavior, both in terms of spam, but also what you would call technical hardware abuse, where someone is repeatedly using all of our bandwidth, for example, or causing the CPU to spike. We can go and detect that automatically and put a systemwide ban on you, which is very custom, but it saves us a lot of money. I will say that we've saved a lot of money at the Foundation by being smart about what we do and not being afraid of making a few mistakes while we make new things... Because a lot of what we do is custom-based, custom-made. Because unless you're talking about something big like Kubernetes or something at that scale, it's often very difficult to find the tools that do what we want them to do with the problem that we have. Because other organizations, especially companies, are not the same as ASF. They don't have 300 different departments that all have their own little tools that they want working in their specific way. And they want this to connect to that, and that's connected to some other thing. We are not afraid to create custom solutions, we're not afraid to get our hands dirty and we're not afraid to make mistakes. That doesn't mean we make mistakes all the time, or that we're okay with all sorts of risks.

How do you interact with the team? How do you stay motivated?

I stay motivated by interacting with my team, I would say.
Interaction is mostly on Slack, which, for those that either don't know it or pretend they don't know it, is an instant messaging platform. We have an account for the Foundation; we have our staff channel where everything gets discussed, whether that be, the mail servers have a big backlog, or this prime rib I just sous vide-ed at 105 Fahrenheit four or five hours is awesome. I think one of the tricks or keys to success for teams like us is to really mix up the subjects and not be all business and not be all fun because you don't want it to be too boring, you don't want it to be too relaxed. I think we've somehow managed to hit a pretty good ratio of fun and serious items that we discuss on a day to day basis. So, it's fun talking to your colleagues about real-life stuff that isn't work, but it's also rewarding talking about work and learning from them and their experiences, and you being able to give them some work experiences and wisdom from your many years of being a sysadmin or infrastructure architect. I think we've hit a really good ratio there.

It's an interesting perspective with that because everyone I’ve interviewed thus far has given the same answer. Can you describe your typical workday: now, I know some people don't have an exact schedule, some people do. What's a day like in Daniel Gruno's life?

My typical workday is very atypical for a worker. I don't have a set schedule. I don't have a set time. I don't have a minimum amount of hours I work. I don't have, unfortunately, a maximum amount of hours I work. It all depends on the day and what happens during that day. As said earlier, a lot of what I do is developing new services for the Foundation. As such, I spend a lot of time getting inspiration, and that's done through various means of... From idling, I can be working at noon and then I'll be like, "I should watch a movie." And then I'll go watch a movie. My significant other will tell you that's a lie, I don't watch movies.
But that was just an example. I can't sit through two hours, I get too fidgety. And that's actually the real truth about me. I can't sit still and do something for a specific amount of hours, unless I'm in a really inspired mood. So, my typical workday is finding things to do that don't take more than half an hour to do, in between suddenly getting the greatest inspiration from up high. I'll be looking at tickets that are easy for me, not absolutely speaking, easy to fix, but tickets that I know how to fix and I'll go in and fix those. I'll catch up on every single email that I receive, which is thousands of emails every single day. I have a mania about inbox zero. If there's an email, I have to read it and sort it. Otherwise, I can't get past the inbox. I can't even close down the mail client unless I know that there is nothing in my inbox. Yeah, it's the same with Slack and IRC and all that. If there's a message pending for me, I have to check it. But that's beside the point. It gives me something to multitask between. Because there will always be a new email, there will always be someone saying something on Slack. So, a lot of my time is spent just multitasking between that, between reading up on news. And then, at some point, the inspiration that I need for that day will hit me and then come the manic few hours where I just code like crazy because I have the inspiration. I tend to form fully thought out ideas which is terrible because if you have a fully formed idea in your head, you know it's going to take eight hours to complete it. But you also know that if you stop, you might forget that fully formed thought. Sometimes a work day can be four or five hours and sometimes it can be 10, 12, 13 hours because my muse has sung to me and the inspiration just has to be translated through the keyboard and into some sort of code or web page or documentation or just a specification for a new idea.
Having said that though, don't pity me because I work 12 hours a day and don't be jealous because I work five hours a day. Because it adds up to a lot of hours on average per month. But I'm also happy to do it because it brings me joy.


With this constant flow of concepts and code and inspiration, how do you keep your workload organized? You might be hammering away on a solution and imagining and envisioning something to develop --there's a lot of things happening simultaneously. A lot of people have a hard time multitasking, or focusing on one thing and managing the thousands of emails coming into the mailbox, et cetera. How do you manage that?


I would say I don't manage it, but luckily I have family that helps manage it. I have a boss that helps manage it. I'm a very ... I'm on the autism spectrum and some would say that I probably have ADD as well. So I get very easily distracted and can lose focus, but I am surrounded by people that are very good at a) knowing that I lose focus very easily and b) guiding me back to the right path for that day. I think in terms of my boss, Greg's point of view, I think it's a win-win because I get guided back on my path and I get to actually do something useful and not just 20 unfinished projects. And he gets some services that are working and are improving the user experience of the people that we are there to support: the committers.


So serving 350 Apache projects, initiatives and their communities, like how busy are you? How many requests do you receive a day? How do you prioritize these requests? How do you do this? Greg, Drew, and Chris talked to me about JIRA systems, et cetera. Your work, as I understand it, is not necessarily responding to user requests. How do you fit the creativity in with this process? How do you mitigate that? How do you fit everything in?


I do respond to users to keep me busy because if I am, I don't want to say stalling, but if I am really idling then I lose interest so I have to always keep busy with something. So I will grab a lot of tickets just to keep busy with that. That's the thing that I had to teach myself how to do. And I don't have the recipe for it, and yet I have somehow taught myself. The thing where you have to not click on every single new ticket that opens up. And not read every single ... Well, you can read email, you just don't have to write a reply to every single email. It took a few years, I think, for me to stop doing every single ticket that came in within five minutes of it coming in. Because at that point, if you do that, plus you have 10 different projects on the side, you get burnout very quickly. And I've had a few burnouts, where I've been unproductive and doing nothing for the next week because I'd lost all hope in humanity because of the amount of tickets and angry users.


So a lot of it is just letting go and knowing that there are team members who know just as well as you do what this is about and how to solve it. And if they don't then they will ask you and you can help them then. So a lot of managing the workload is learning to let go of the workload. And if someone creates a ticket saying, "My forwarding address doesn't work," it's probably okay to wait more than five minutes before you fix that if you're in the middle of something. I used to be of the, not opinion, but yeah, I used to be of the opinion that this must be fixed right away. The minute I saw someone had a problem, I wanted to help them. But there comes a point where the more you try to help someone, the less you're actually helping them in the end because of the overhead of dealing with too many tasks and being burned out. I think some of it is ... Stefano… Mazzocchi? Yeah. Right. The Mazzocchi equilibrium.
There is a certain point where the effort you put in and the effort that comes out of it start to not align anymore. And so if you're not good at holding back and letting things slide just a bit, then you cross that threshold and you end up putting in maybe, I don't know, 12 hours of work. And really what you are doing is five hours of work or four hours or one hour of work because you're so not interested in what you are doing. I know that some of my colleagues use Trello or If This Then That, or other tools to organize their day, but I want to say that's not for me. I don't think it factors in the creativity that is needed in the role I have. I think, without any scientific evidence whatsoever, that if your job is to think up new ideas and think of new ways to do something, these tools don't necessarily account for where creativity comes in because you can't put in your calendar: step one, “be creative”; or at 9:00, “be creative”. Creativity is something that just happens. I've found that it happens for me when I am idling, when I am doing a lot of non-work related things, switching between and then switching back to work. And then switching back to non-work items and switching back to work. And then suddenly, a link appears between these two things and you're like, yeah, this idea could actually be used for work. But the things I am doing are not something that you can put into a plan because you don't know ... I mean if I knew how to be creative, if I knew to just go to this Website, then I would be a millionaire by now. So I don't know *how* to be creative: I know that I can be creative and I know it happens when I let it happen.


You have to make space for that to happen, right? You have to allow for that to happen. It's great that you have flexibility to be able to do that in your job --that part of your work is to be able to conceptualize and visualize and come up with things.
It takes a while because sometimes you're not going to know the problem unless you're in the middle of it: "so, oh this is an issue … here's an opportunity for us to come up with something that'll help."


It's great. And it's especially great because I think honestly if I was stuck and let's say I was doing human resource management or whatever that I studied for, if I was stuck doing Excel spreadsheets, for example, all day long ... Not that that's a bad thing, but when creativity suddenly hits me, I have to get it down on paper or it's going to haunt me to an extent where I just can't stand myself. So I'm very fortunate to have a job where I can, fire fighting aside, I can say, "Boom, I have this inspiration suddenly. I need to focus on that." And then I can go and focus on that. And I have a boss and I have a boss's boss and I have my colleagues that are understanding so that suddenly, "Oh Daniel got inspired. He's probably going to be manic for the next eight hours just working on this idea he’s got." It's really wonderful being given that space to be creative because I think no matter what job I have, there would be an urge to be creative and to think up ideas. And again, when I think of an idea, it forms itself completely in my head. Some people will start with half an idea or a fingertip of an idea. For me, it's mostly been the entire idea presents itself to me right away, and I have to get as much of that as possible down on paper before it's lost. To have that opportunity is really wonderful.


[END OF PART ONE]

Monday August 03, 2020

Success at Apache: I Became an Apache Solr Committer in 4,662 Days. Here’s how you can do it faster!

by Eric Pugh

On April 6th, 2020 I was invited to become a committer on the Apache Solr project.  My journey to becoming a committer started in earnest 4662 days before that!  On July 2nd, 2007, I opened SOLR-284, a ticket for adding content extraction to Solr. 
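
For anyone who wants to check the count, here is a minimal sketch of the date arithmetic using only Python's standard library. The two dates come from the post; the variable names are made up for illustration.

```python
from datetime import date

# Dates taken from the post; variable names are illustrative only.
first_ticket = date(2007, 7, 2)   # SOLR-284 opened
invited = date(2020, 4, 6)        # invitation to become a committer

# Subtracting two dates yields a timedelta; .days is the whole-day count.
elapsed_days = (invited - first_ticket).days

# Whole calendar months between the two dates.
whole_months = (invited.year - first_ticket.year) * 12 + (invited.month - first_ticket.month)
if invited.day < first_ticket.day:
    whole_months -= 1  # the final month has not fully elapsed yet

print(elapsed_days, "days,", whole_months, "months,", round(whole_months / 12, 2), "years")
```

Run as written, this should report 4,662 days and 153 months, which is a little under 13 years.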

A committer on an open source project under the Apache Software Foundation umbrella is someone who is trusted to contribute code to the project and to help manage and drive its ongoing development. It’s an honour to have been asked and I was very proud to accept the invitation!

So, you did the math, and you realized that it took me 153 months, or 13 years (rounding up), to become a committer, and you’re wondering “What if I don’t want to wait that long?” So here’s my quick cheat sheet on ways to become a committer on an open source project, illustrated by my own journey:

  1. Start by learning the culture of the project. How are decisions made? What tools do people use? What do the various acronyms mean? Join the mailing lists and read every commit.
  2. Start small and work your way in.  Some great ways to do this are to:
    • Take existing patches and test them.  Update them to the latest code base.  Document what you’ve learned.
    • Take advantage of being new to a project to bring fresh eyes to the documentation.  Every time you find yourself scratching your head on how something works, contribute a fix to the docs.   It’s a powerful way to immediately contribute.  This is the fastest way to get involved and involves the least cognitive load!  See SOLR-2232 or this email thread.
    • Answer questions on the mailing list! Being able to articulate reasonable responses to questions demonstrates how much you have learned.
    • Bug fix, bug fix, bug fix! Pick bugs that have an obvious answer so that the “correct” solution is easy to figure out. If the right approach to solving it is very ambiguous, you probably won’t get much traction. Remember to remind committers to apply your fixes when they have the time! See SOLR-13965 and SOLR-11480 and SOLR-2611 and SOLR-2263.
  3. Ready to start slinging some code?   Don’t go and refactor the core foundations of the project (at least not yet).   Instead, be like a pilot fish and latch onto one of the core committers who is being very active in the project.

Embrace their vision, and start picking up tasks related to whatever major chunk of work they are doing. Write some unit tests. See about opportunities for refactoring. Do some manual testing over multiple platforms. Once they see that you’re contributing (and accelerating what they are pushing), then work to get some of your own tickets assigned to you under that vision. I’ve seen this lead directly to committership many times, and if I had followed this route, I might have joined sooner!

Here’s to the next 4,662 days of being active in the Apache Solr project!

Eric Pugh is a member of the ASF and a committer on Apache Solr. He co-authored the book Apache Solr Enterprise Search Server. Eric is co-founder and CEO of OpenSource Connections, where he helps OSC clients, especially those in the ecommerce space, build their own search teams by leading projects and by acting as a trusted advisor. He also stewards Quepid, a platform for assessing and improving your search relevance.

[this post first appeared at https://opensourceconnections.com/blog/2020/07/10/i-became-a-solr-committer-in-4662-days-heres-how-you-can-do-it-faster/ ]

= = =

"Success at Apache" is a monthly blog series that focuses on the processes behind why the ASF "just works" https://blogs.apache.org/foundation/category/SuccessAtApache 

Friday July 17, 2020

Inside Infra: Greg Stein --Part III

The close of the "Inside Infra" interview with ASF Infrastructure Administrator Greg Stein, who shares his experience with Sally Khudairi, ASF VP Marketing & Publicity. 




"Apache is growing: we're just seeing the demand explode and it's a hard problem for us to solve."



PART THREE.


We were talking about ensuring that the team is up to speed with everything required of them...


So there certainly are skill gaps; this is one of the things I want to help motivate the team with, where if somebody says, "Hey, I want to go and investigate Ansible as a potential Puppet replacement," I say, "Go forward." 


This would be similar to Google having their 20% projects. I'm sure you've heard of that.


Oh, yeah.


It's almost the same where it's not 20%, maybe 5%, but it's the same as Google, no matter what they want to tell you, because everybody's got their job and you have to be really rigorous to carve out 20% of your time. And strictly speaking, it does actually make your Google manager a little upset if you carve out the entire 20%. But anyways, the concept is similar.


So for us it’s like, "Well, go in and investigate Ansible, see if it'll work for us and put your notes into the Wiki." That's how we make forward progress, up our game, and learn new skills. If someone says, "I want to go and figure this out," the response is almost always, "Okay. You go do it." There's certainly an allowance for people to learn new skills. But most of the time we simply rely on, say, Gavin (ASF Infrastructure team member Gavin McDonald), knowing more about JIRA configuration than the other guys.


That added component of sharing what you know, and adding it to the JIRA or to the Wiki actually is great because then everyone's learning. This is like the rising tide: everybody's learning about this, whether they're doing it perfectly or not. I think this is a very interesting process.


Yes, and that's also where Andrew (technical writer Andrew Wetmore) is helping us out. He’s organizing that information that we have learned, that we have documented, that we memorialized into the Wiki.


Because our (ASF’s) legacy is quite Medusa-like over all these years, it's interesting to see how everyone can get caught up and also contribute...you have to go back and deal with the legacy, but you also have to be able to move forward. To be able to bring others with you is brilliant. That's really cool.


The infrastructure has grown organically over 25 years from when Brian Behlendorf first said, "Hey, I have this server called hyperreal.org: you can run a CVS repository on it for the Web server."


That computer was under his desk at the Wired offices way back when, wasn’t it...


Yes it was. And it's just grown organically over those 25 years. Then we had Minotaur and it did six different things ... now it only does half of one and we've moved the stuff out onto newer machines and newer processes and this and that. But the organic growth means that we've got some really hairy stuff. Our move to Puppet --first Puppet 3, and now to Puppet 6-- at each step we're improving it and making it less hairy and more manageable and something that somebody can come along, look at, pick up and run with it from there. That makes it a lot easier, so that we don't have to spend 100% of our time cross training.


What are your thoughts on products, the hype cycle, where everyone's demanding Kubernetes, to use that as an example. Do you decide which products to provide support for, or is that up to Apache projects in the communities? You mentioned Ansible, just not too long ago, that was your internal decision to move. But I remember not long ago, GitHub entered into the landscape. How did that happen? How did you decide to make a move like that? That's a significant thing. Can you tell me a little bit about that?


It's a lot based on community input. So if we see a lot of people asking for a particular tool, we'll be like, "Oh, hey, David, can you go and take a look at that and see if that's something…" Not David (ASF VP Infrastructure David Nalley), but Chris (Infrastructure team member Chris Lambertus) or somebody else. "Can you go take a look? Is that something that we can support? Because we're getting some queries about it."


And there's a little chicken and egg problem there that if the communities don't know to ask for the egg, we don't know whether to prep the chicken. It's like, “okay, wait, they don't even know to ask for a tool because we haven't said we will make this tool available, because we're not going to make the tool available until somebody asks”. But sometimes people file tickets like, "Can I get this set up?" and we'll go, "No."


Then six months later, somebody else will file a ticket: "Can I get this set up?" and we'll go, "No." But after enough of those, we're like, "Maybe that's something that we really want to do." For GitHub, specifically, that’s what happened there. Well, even before that, Git, where we ran our own Git server, that was a volunteer that made that happen. That was six years ago or so.


Well...the volunteer came along and said, "Well, I'll do this. I'm not going to take any time from Infra." There's been a couple things for the past few years where I've told people, "No, Infra will not work on that. But if you want to volunteer or find a volunteer, then we'll stand it up for testing." You know what I mean? Why not? So there's a couple things where people have stood up for test examples and there hasn't really been a lot of usage.


So, we're not going to support that. But something like Ansible is our own internal workflow, a tool we’ll experiment with to see if it'll improve our stuff. But from the community, they pretty much have to ask and it has to be a sustained ask. That's how we ended up with Travis CI: we actually pay for capacity in Travis CI, and that's based on community input.


So many people wanted to do their continuous integration through Travis that eventually we decided to pay for it. But it's tricky because some of these systems like Travis CI and others require certain permissions that we don't want to provide to the community. So we will want to hold those only within Infra. And so it gets hard to integrate certain tools. We've had to say no, but then again, we've found other ways to improve that so that we can lock down the permissions or use a proxy or other ways that we can route around some of these issues and then integrate the requested tool.


So further to that, have you been in a situation where a project or a community has made unreasonable demands of Infra or have expectations, where it's like, so over the top or so out of scope, it totally surprised you? Have you had something like this?


Nothing surprises me.


Nothing surprises you? Okay. Have you been in this situation? Like “was never going to happen”...


Yes, yes. There's been several times where one of the guys on the team is like, "Oh man, I got this ticket. I don't know what we want to do with this. Greg, go take a look." And I go and look at it and that's where I make that call: "Okay, is the Infra team going to take this on, or do I just say ‘no’ right now?"


So, yeah, there's been a number of times where I've said no and probably two or three times where I've gotten a little bit of pushback on that no. I say, "My answer is no, but here's how you escalate." I've had escalation a few times and I'm actually mid-process --I'm dealing with one right now. So, I've said, "No, if you don't like my no, you can go to VP Infra, and VP Infra is probably going to tell you the same thing. And then you can go to the President. Right now those are actually the same person."


The same person is a double "no".


That really is the true escalation path. I have to describe that to people and say, "I don't think you're going to get what you want." If I'm the one that says no, you probably are not going to get it, because VP Infra and the President, and after that the Board, are probably not going to say, "Greg is wrong. Yes, we'll give that to you." But it's there. There's been a couple of times where I said "No, you have to ask the Board for the budget for those additional virtual machines." They went to the Board and said, "Can we have budget for three machines?" and the Board said, "Yes."


So Infra went ahead and gave them the three VMs that they had initially requested. Strictly speaking, we would track those machines against their budget, but that detail is more than what the actual budget was. So we don't spend that time doing that, but I have had to say, no. I have had to... There was Apache Maven: they were keeping a copy of Maven Central, and Maven Central is run by Sonatype...


Which is a commercial product...


Yes. They're using the trademark “Maven” under essentially a licensing agreement from us, an MOU. So with Maven Central, you could imagine if someone decides to just turn it off one day ...we wanted a copy. Apache Maven was making a copy of it, and it just started consuming so much disk space. We were like, "We can't support that growth rate. We can't support that even for the next six months. If you want to keep doing it, go ask the Board for money to keep doing it." They never did. We turned it off.


I wouldn't call that a ridiculous request --it was something where we didn't have to just say, "No, not going to do it. Bye." A lot of the requests are mostly just, "We aren't going to run that extra software. If you want: ask for a VM and you can run it, but we're not going to take responsibility for it."


Over the years, obviously ASF Infra has changed. Was this all reactive or was it also proactive? Do you plan for those changes as you go or has it all been in response to Project X or in response to X emergency?


The growth of Infrastructure and its movement from volunteer-only to paid staff was part of just the growth of Apache. The volunteers could no longer keep up, and things like account creation used to take sometimes four weeks. You’d put in a request for an account, and four weeks later it would finally get created.


My gosh, that queue was crazy, huh?


Well, it wasn't even a long queue, it was simply that we didn't have volunteers making sure the queue stayed empty. Today it's down to one, two, maybe three days, and the account is created, because every day a staff member goes and creates the accounts first thing in the morning.


It was how I said that my day starts with looking at messages on Slack and then reading emails to see if there's stuff to handle. Well, one of the guys on staff, first thing he does in the morning is go and look at account creation. So he's been off and on pondering on a tool to make that easier for himself; he hasn't finished the tool, so he still has to do it manually. That's his incentive.


“Work quickly”...


This is Chris Thistlethwaite. I say, "Chris, we can do something about that." And he says, "No, no, this is still my project. And every day when I run the script, it just makes me remember, I need to finish this."


So when the volunteers could not keep up with the amount of work, that's when we hired Joe Schaefer, then we hired another person, and hired another person. And so it was just trying to keep up with the rate of requests. 


That's how we ended up with hiring six people. And then I'm half a person, like I said, I'm part-time. So, it's just the growth of Apache. I think we're in much better shape than when I started. We're ahead of the curve. We can stay ahead of the curve because one of the things that I can do because I don't fight the fires every day ... that's for all the guys who know their stuff. They fight the fires and I can look at if I need to go and ask for another head count. And that's how we ended up with Andrew (technical writer Andrew Wetmore): “Well, you know, what we really need is somebody to manage all this documentation.” This was part of Sam's (former ASF President Sam Ruby), “If you had some money, what would you do with it?” That's how the technical writer/editor came around, because we've got 20 years of organic growth. We had...let's just call it “organic documentation”. That revamping project is going really well, I think.


So, in what areas are you guys experiencing your biggest growth? As I was asking Chris and Drew, is there like a geographic influence on the demand? We’ve had a huge influx of users in China. Does any of that change the way or what you guys are doing? Or is it just more of everything?


Our biggest pain point, I would say, is continuous integration/continuous delivery: CI/CD. Jenkins, Travis, CircleCI, and things like this, where people make a change and they want that change built and tested. The more projects we get and the larger the communities get, the more changes and the more testing and the more building and the more this, more, more, more. It's kind of one of those things where it's “expand-to-fit”. So if we gave people 100 machines, they'd use 100 machines. If we doubled it to 200, they'd use all 200. It's just this rapacious need for CI machines. It's very hard to figure out how to plan around that other than just telling the communities, “No: we just don't have that much capacity: if you want to build it, do it on your own machine. You just can't use Apache hardware to do it.”


That's an unsatisfactory answer. That's been one of our hard problems and it's also kind of a newer problem: the development workflow that uses CI probably is just maybe five years old. Before that, certainly, automated building and testing was a thing, but it's really kind of grown into community workflow much, much more over the past five years, and more and more people are wanting to do it. The communities are growing. Apache is growing: we're just seeing the demand explode and it's a hard problem for us to solve.


China is the one case where we see regional issues, and that's because of the Great Firewall of China. Not because we're getting more Chinese developers, but because they have problems accessing our servers, which are located outside of China, and so we're looking at a CDN, a content distribution network, to essentially make our content available closer to China. We've found that even with one of those CDN drop points in Hong Kong, they still have problems just reaching it there in Hong Kong, and so ... and we don't want to buy or lease or rent a server in China because doing business in China is too high of a hurdle for the Foundation.


Oh? 


You know, Microsoft and Google have to do business in China and they've got a pack of lawyers and a giant vault of money to deal with all the barriers. The Foundation does not, so it's also a hard problem to solve. We think we might be able to do it through Microsoft Azure, since they have a CDN that resides in China and Microsoft has done all that paperwork, so we're looking at that. But as far as regional things, it's not so much that we run into issues. We see Open Source communities in Europe and Brazil and Australia and Sri Lanka: none of them really have any problems because they don't have that firewall. It's not really about the Chinese people, but about the China firewall.


That's bigger than us. And that’s not something we can fire hose.

 

We do see little engagement from Japan and Brazil, and that is partly for language reasons and partly because the Brazil community is more about Free Software than Open Source software. 


Yeah. They're very pro-FOSS.


Not OSS. But pro-free. And so, they're going to deal with the Free Software Foundation rather than the Apache Software Foundation.


I see. That’s an important distinction. 


And then you also have the Portuguese language barrier. People contributing from Europe and India, Sri Lanka, etc., they pretty much know English and that's fine. A lot of the Brazilian developers do not know English...this is the same with the Japanese Open Source developers. Japanese and Brazilian, they tend to not know English, and so that kind of isolates them from the larger Open Source world, or Free Software world, in the case of Brazil.


Would we consider localizing anything that we do, or are we going to continue as-is, as the ASF is all English?


The Infrastructure team will not translate our documents to serve those other languages. That's just too high of a bar.


There are a couple groups that have user mailing lists that are not English and that's totally fine, and Infrastructure will... well, you don't have to file a ticket anymore. It's, again, back to selfserve.apache.org: “self-serve” on Apache will create a mailing list for users communicating in Brazilian Portuguese, for example, or communicating in Japanese. But Infra doesn't do anything about that, that's just the self-serve tools. We certainly can't support non-English, and I don't think that the Foundation itself is going to make any moves towards that.


Fair enough. So a lot of companies are really struggling to accommodate their teams working from home in response to the Coronavirus and all that. These stay-at-home orders are kind of shaking companies, but from day one, the ASF has always been a virtual organization. Has anything changed with your operation on that front? Has anything impacted the ASF's day-to-day, from this pandemic?


(chuckling) Not at all. I shouldn't laugh, but no. It really hasn't changed. We've been on our team channel for all three years, three and a half years that I've been here, and the world is burning down around us, but we still sit on the team channel.


Now, that said, (Infra team member) Daniel Gruno got stranded in Canada.


Right! He’s still there?


He's still doing work from Canada. This is why when he travels to Canada for two months at a time, I don't care, you know? Because if his butt is in a chair in Denmark or in a chair in Canada, it's the same butt, so, you know...


As long as you have connectivity and a computer, you can do it. 


Right. But if he has to be offline for two months, I'd say no. Or if you want unpaid time off, well, I'm not going to pay you, of course. Certainly the discussions have changed, you know? I mean, going shopping. You know, some members are immuno-compromised and that had an effect on our team meeting that we were planning in Nashville: they were the first to say, “No way. I'm not going,” so, there’s that, but our day to day hasn't changed.


That's more of a social thing versus an operational thing. Safety first.


So the notion of, “Oh, I got to run out to the grocery store. I need to strap on a mask,” changes, but not the operation.


Right. Right. So...what do you think people would be surprised to know about ASF Infra?


I don't know if it'd be surprising, but we are global. We've got four people in the United States, one in Canada, one in Denmark, one used to be in Australia, but is now in the UK, which actually kind of hurt a little bit, because in Australia, that meant that we always had somebody in that time zone, but now we have kind of this gap of Australia/Asia time zones when...


A “Gavin” gap.


Yeah, well, I might be awake at that time, but I can't go and fix a MySQL server, so it does mean that we don't have that straight-up 24-hour coverage.


The notion that we are worldwide is kind of a neat thing about our team, and is what makes us pretty unique relative to other IT departments. I don't like being called an IT department, but that is essentially what we are. 


Surprise.


What's the name of that TV show? The one that's about IT...


“The IT Crowd”, is that what you’re referring to? The British show?


Yeah. So, you know, that's a funny show, but mostly when you think “IT department”, you think of some corporate people with button-up shirts, but ...most of us, we're in our pajamas.


Good one. What's your favorite part of the job?


I definitely like the team and that's why, nominally I'm part-time, but I'm pretty much constantly on the team channel and interacting, and so I think I just put that down as volunteer hours, where before I might work on Apache Subversion, but now I hang out with the team or I write some little tool or something like that. That's definitely been one of the more rewarding changes. Up until I started with this, I'd been a director for 15-and-a-half years, and that was kind of how I contributed to Apache. Now my work for Infrastructure is a new way to contribute to the Foundation. I'm also part of a new community, where before I would hang out with the httpd community, APR community, the Subversion people ...now it's the Infra people and my hobby time is kind of blended in with my work time, and vice versa. I mean, when your work time can also be seen as a hobby time, that's pretty cool.


I do think it's the team that makes it interesting. That's what I like the most, and that I'm working with a new, interesting community to contribute to the Foundation. 


Not only did you switch roles, you switched communities. What was your biggest challenge going into this new role?


I would say probably trying to delineate what I was going to handle for the guys and that I wasn't going to tell them what to do or how to do it. It's like, “OK, I'm here to assist, to unblock things, to enable you guys, rather than to block you or micromanage you.”


To earn that trust, that I wasn't going to be some pointy-haired boss telling them how to do their work. Now, I don't know if that was ever a problem for them, but that was certainly one of my initial concerns: how to properly create my role. This was the first time Apache's even had somebody fill this role, so I also had to define the role, which is, again, why I came up with “Infrastructure Administrator”: I wanted to define it as an enabler role, as an administrator, so they could get their work done but I would not be their manager. I would not be their boss: I was simply there to enable them.


So, what are you most proud of in your infra career to date?


Ooh. I don't know. I would say by being hands-on, being the “hands” of Infra, it means that VP Infra didn't run away screaming.


David said in January 2016, maybe earlier, he was like, “No way. I'm out.” And after I was on the job for about two months, he said, “Huh. All right.”


“I'm in!”


And so I get that feedback from him, “You know, you make the VP Infra hat quite easy for me.” I think that's probably what I really like about taking on the role, is that one of our volunteers got to stay rather than drop it because it was just causing so much anxiety and pain and time and frustration. Otherwise, most of the stuff I do is really boring. Not to me, but I don't have “accomplishments”. I push paperwork, basically, so the other guys can do accomplishments.


Speaking of the other guys, how would your co-workers describe you?


I have no idea. I don't know. I really don't know. (laughing)


Where I just got done talking about what I saw as an issue, trying to frame what my role would be, it might have been fine with them and I was overly worried about it, but it’s hard for me to know. We don't do 360 reviews in Infra, so I don't get any feedback, really, from the team on what they think about me or how I'm doing my job, so you'd have to ask them.


I have. Just kidding. So...what are the biggest “threats” that infrastructure managers or infrastructure administrators need to watch out for? What do you think is a “big thing” that people should be aware of, or is ASF so unique that you don’t feel like anyone really experiences what you experience?


There's our capacity issue with things like Travis, but I think you're asking a different question.


I am, but that's fine. What's your greatest piece of advice? What would you tell aspiring infra administrators?


Actually, one of my greatest fears is really, as a small charitable foundation, it's hard for us to compete with well-funded corporations and some well-funded start-ups.


Related to that, I touched on it earlier, is career development ...you go into Google or Microsoft and there's a career ladder; we simply don't have a career ladder. There's salary growth. There's bonuses. If you want to have a resume or a LinkedIn profile that shows changes in growth and titles and career ladder, we can't offer that, and that's going to cut out some people. It's a very hard problem for me to solve. You know, there's things I can maybe do, but I also want to keep the team egalitarian and sort of level, rather than, “Oh, well, this guy is now the team lead.”


Given what I talked about, our social aspects, because we are all equal peers, keeping everybody with the same title, same position on the ladder means that we are peers and it's a little easier to interact that way. It's a real, real difficult problem. You ask what's scary: that's scary.


But there's a counterpoint to that. You may not have a traditional career ladder path, but to say that you've worked in Infra for Apache carries weight. That's significant. 


I believe it does, especially when you can demonstrate the hundred different types of tasks...


Well, that's exactly it. The breadth of work and the scale of what you guys do and the skill sets that you have to have and the fact that you have to play nice in the sandbox, all of it. The demand is immense, so to be able to be there and thrive and develop something from yourself in terms of a career is tremendous. Our team is exceptional. I mean, they're not expecting a linear ladder or something that others have.


You know, in other jobs, somebody might say, “I was a MySQL administrator.” Here, you're a MySQL administrator, PostgreSQL administrator… They had one role; here you've got dozens. 


If you had a magic wand, what would you see happen with ASF infra?


I'd like to solve that CI problem. The other magic wand would be upgrading our mail server from 10-year-old technology to modern technology.


Is that happening or is that literally a wish list issue?


It's happening, but it's been happening for three years. The thing is that email is so central to the Foundation that we can't really experiment with that. There are certain things we can do, but most of it, not so much, and so it means that we're being super-careful. There's about 10-12 different moving parts to it, and we're upgrading each of those a little bit by a little bit, until we can finally pull that big, scary, Young Frankenstein lever to hit the lightning bolt, you know?


Yeah: I see the visual of that.


The magic wand would be to just make that all happen and make it work. Without the wand, it's going to take another 6-12 months.


Right. What else do we need to know that I haven't asked? What should I be aware of or what should I be sharing?


Oh, I don't know. This is where my creativity ends. Ask me a coding question.


Oh, no coding questions. All right. Our time has also ended. Before we go, who should I be interviewing next?


I would say Daniel (Gruno), because his role ... he's 20-30% system administration. The rest is tool development, so that makes his role rather unique in the team.


Perfect. Thanks so much, Greg. I really appreciate it. 


= = =

Greg is based in Austin on UTC -5. His favorite thing to drink during the workday is a big 32oz cup of Diet Mountain Dew.


Monday June 29, 2020

Inside Infra: Greg Stein --Part II

The "Inside Infra" interview continues with ASF Infrastructure Administrator Greg Stein, who shares his experience with Sally Khudairi, ASF VP Marketing & Publicity.




"Who are these crazy guys spread around the world that are keeping 200 machines up and running for all these different projects and committers and contributors?"



PART TWO.


How or what would you describe the Infra "brand" to be?


I don't really know. I've never really thought about branding or marketing ourselves, so ...


Well, you guys have a certain persona, you have those funky t-shirts you wear at ApacheCon ...there's definitely some kind of street cred that's different from everybody else. I was curious to see if that's part of your natural sense of hip, or is that something that you guys deliberately planned for.


The t-shirts and other things go back to the team bonding kind of thing. We'll give ourselves an identity, but haven't tried to create or market ourselves. I think it is something that we do need to take some control over. We hired a part-time writer in December and he's been organizing our content to provide a better and more useful front to Infrastructure.


There were a lot of pages on www.apache.org that have now moved over to infra.apache.org. That creates a more coherent Web space, if you will. We can really talk about those different channels. "How do you reach Infrastructure? Do I go to the Slack channel or do I file a JIRA ticket: how do I decide?" So he's helping to, while I wouldn't say "market a new face", he's certainly helping people figure out who we are, what we do, what we can help with and getting that information organized.


Which is good. That's new. Even to have you guys featured in a project like this, it's unusual and it's refreshing. I'm personally curious, and I'm sure other people are also curious about what's behind Infra.


Right, right. Who are these crazy guys spread around the world that are keeping 200 machines up and running for all these different projects and committers and contributors?


So Andrew (technical writer Andrew Wetmore) is primarily going to work on the infrastructure docs until those are whipped into shape because a lot of the material that we have, a lot of the Webpages, is really infrastructure related. He has been working with the team on those pages. What's going to be harder though is when he's kind of at a stopping point for that, what to turn his focus to, and that would be www.apache. But then it gets a lot more difficult because when he wants to update the How It Works page, who does he talk to? Who's authoritative? He can do some edits for flow and word consistency, punctuation, clarity, right, but he can't really update the process.


Right. Right. That's the Foundation thing.


Yeah. But the problem is we don't really even have a concept of who's in charge of that How It Works page, who is, you know, it's just there's nobody that the foundation is willing to say, "That person controls that process." You know what I mean?


I totally do --I come across the same pages and people go, "Are they yours?" It's hard to determine not only evolving processes, but who signs off on this or who gets it. I hear you.


I've recommended for the past year, or three, that Marketing is the owner of DubDubDub (www.), but you know, that's the "face" of Apache. You know? But the raw content, as you point out, who approves the raw content?


One thing that I asked Drew and Chris, and I'm always curious with people who are super busy and juggling 50 things, is to describe a typical workday for you.


I wake up, I look for email first, generally, sometimes I'll hop onto Slack because sometimes people ask me directly for something. Then I go look at email and sort through a number of different categories between direct team stuff, operations, the Apache Board, and then Apache in general. And then of course, if there's any vendor email to deal with. So there's a bunch of different categories in priority order. After I get through that initial work, then it's go and read all the back scroll in the team channel, which is anywhere from 200 to 400 lines of back scroll ...


Can you get any work done? Beyond just catching up on the communications?


Yes. But it does take like 30 minutes to read that back scroll. For me there's a lot in there about what the guys are doing and what they're working on, how to solve a particular problem when they're asking somebody else, "Hey, can you look at this? Can you help me with this?" But I don't, for the most part, "serve", you know ...they are the technical staff... I can do it: I have technical chops, but I let them do their jobs as they know best. I do like reading the back scroll because I'm also looking at it from the angle of "how is the team working together? Is that going well? Is there something that I need to poke and prod to improve how they're working? Are they getting jammed up on something that I can unblock for them so that they can get their work done?"


Stuff like that. That's what I look for when I go through that back scrolling, so it's important to me to read that back scroll. Most of the guys do tend to, when they first sign in in the morning, go back and scan for stuff where they might be needed. I've never really asked them how detailed they get, but I think pretty much everybody reads all of it to catch up, but they're going to be looking at it with a different lens than how I look at it. Mostly I'm looking at unblocking --are they running into problems that I can ease for them?


How do you keep your workload organized?


I don't.


Fair enough. Again, there's a lot, so it's curious to me, like everything at Apache, with the exception of a handful of things, everything could be a priority, if you're always on fire and always running around, putting out fires, you know? It's funny when I've talked to the Infra guys and you also, you all have the same reaction to that question, which is the laugh. I think that's the nature of the beast with the ASF.


Yes. That really is the nature of system administration work. My career has been product development, and you can reasonably plot that out. You can say, "We're going to develop these five new features, which is going to take us between two and four months." We'll see...we might cut a feature to try and limit our development time. The feature is going to change, unless we'll plan in time for change. But system administration is very reactive, so it's a very different beast. This is where, like I said, we were kind of treading water with four people, but we could see as Apache was growing we were not going to be able to keep up. And we certainly weren't going to be able to move ahead of the curve and do things like selfserve.apache.org where, you know, before we would get a dozen tickets to create repositories and that took time. Now we don't have to do anything.


It's all selfserve.apache.org, but we had to write the tool first and have enough air time to get that tool written. So I think we're ahead of the curve. We're getting some of our longer-term initiatives done, but it is still a very reactive thing. For myself, my back office work is pretty straightforward and it's a lot of email and Website work, you know, going in, paying an invoice, putting in the infrastructure credit card, sending out a purchase order, stuff like verifying and approving payroll, that doesn't require me sitting down and writing Python scripts.


The other half of my job is being present on that channel because I also help to set priorities. When something comes up, I ask, "Is this a thing that we want to do? Do we want to take on this new task? Do we want to provide this new tool to the projects?" You know, like a project is going to say, "Well, we want to integrate this thing into our GitHub repository," and we go and review it. It may require permissions that we simply don't want to allow. So there's some of those kinds of policy kind of things that I also help with. And there's always being present to help set policies and priorities.


OK... so how do you work with (VP Infrastructure) David Nalley? Are you making the decisions? Infra is an unusual type of group as opposed to other areas of activity operationally at the ASF. How do you work together?


Correct: I'm the day-to-day, so I look at it like he's the brains and I'm the hands. That said, he's like the strategic brain and I do all the tactical decisions.


I make all the tactical decisions. I am an officer of the corporation. I can make any decision that I need to, related to Infrastructure. If I feel it's a little bit weird, then I'll bounce that off David, but for most of the stuff, he doesn't feel a need to inject himself in. He feels comfortable letting me go ahead and run with the things, and rely upon me asking when it seems a little sketchy.


That's good: that process suits both of your personalities, both your sensibilities. It sounds like a good fit.


I report to the VP of Infrastructure, and that is still David, even though he became Executive VP and is now (ASF) President. He still holds that title. He's asked me, "Well, Greg, maybe you should just be VP Infra," and I said, "No way." Because we're paid people, but the Foundation is all volunteers. I told him I do not want to be a VP, because I want to report to a volunteer. I think that I (and the team) should report to a volunteer that always has a volunteer eye on the Foundation's long-term goals.


Because I manage all the day-to-day, it's a very lightweight hat for him. That VP hat is a tiny aspect compared to his President hat. One day, he'll find somebody to take over that VP Infra hat, but I've essentially mandated to him that it has to be a volunteer position.


It's not that I see we're going to go all out of control and we need a check from a volunteer; I just want a volunteer to always be able to say, "Okay, you guys are a little bit crazy, let's redirect our long-term thinking more in line with what the Foundation wants," and have a volunteer interpret what the Foundation wants.


That perfectly dovetails into what folks referred to in our ("Trillions and Trillions Served") documentary, where they were talking about Greg Stein's famous "plan for the ASF for 50 years..." This super long-term vision, which again, everyone goes back and says, "Greg Stein said..." What does that mean exactly, and how does that translate to Infra, considering that you can't really plan that far out? How does that work?


Well, actually we can plan that far out. I wrote that "50 years" in one of my Director's statements, I think it was 2014 or 2012 ...maybe earlier. Where I was going in that Director statement was the Board doesn't deal with the communities. The Board is there to support the communities. So we want the Foundation to exist for 50 years so that these communities can continue to run and see through evolution.


Some communities are going to move to the Attic, new ones are going to come along, but we want the Foundation to be viable. To say "forever" is okay. Nobody can really put that in their brain. So I just said, "OK, we can think what 50 years means." That is long enough out, but still within people's brain capacity to think, "Okay, what _does_ 50 years mean?"


And so that's where I came up with that. What does the Board need to think about to ensure that we are here 50 years from now and our projects are successful and can run through their lifetime, lifecycles. Apache HTTP and Tomcat, I don't think they are ever going to go away, but you could see maybe in 30 years they might. There might be some other mechanism in computing that would obsolete them, but the model of Apache does need to exist for at least that long.


Now, within Infra, I think we actually can plan that far out because we have growth curves. We see what kinds of computing resources people need. So we can plan for project growth, for machine growth. We can do long-term planning on how we allocate machines among our various cloud resources that we have, and start to schedule those further out. None of that really affects our day to day, but it is something that we can project out a ways and think about what kinds of resources we are going to need two, three, five years from now.


There isn't anything really that we can do for 50 years, but we can keep it in mind. Okay, that is going to be a larger team. That is going to need a larger staff, a full time manager, a full time HR person, a full time... There's different things that will change over that time, but we can actually do some of that projection, although we haven't bothered.


I do the five-year plans for the Board, but mostly that is a simple cost growth as opposed to actually changing the structure of the team or the role assignments, because like I said, I think probably within 10 years, we'll probably need to add one or two more staff on top of the head count of six that we have right now. And I think supporting that would still be fine for a part-time person like myself. But once it grows to 10 or 12, then I think it's going to need a real change, where we need to have a full-time person managing, and so we'll need to adjust the budget considerably to make that happen.


But if we ever get there, the Foundation is going to be likely in a very different position. We're talking 10 years from now. And so, who knows.


So with more than 350 projects and initiatives as we've discussed before, how do you guys stay ahead of the demand? And again, if you're trying to plan for five, ten years out, you mentioned earlier cloud computing. Not so long ago, cloud computing was a novelty. How do you plan for this?


And that is where we try and move more things to selfserve.apache.org, where we look at the kinds of requests that we're getting. The kinds of tasks that we’re performing and find a way to automate that workflow and create more self-serve options for the kinds of tasks that we regularly get tickets on.


Where we used to get tickets on creating Git repositories, we get zero now. And when we can see that over the past six months we've had 20 tickets to do X, we ask: is there a way that we can automate that, so we don't have to get our hands on that ourselves and can save our hands for doing things like machine upgrades, or for rebalancing some of our computer resources, where things are running on an old operating system and we need to get that onto a newer version? Right now, all of our machines are managed by a system called Puppet, which does the basic configuration work for us. But today, we're on two different versions of Puppet, a really old one and a reasonably new one.


And we're trying to get everything migrated off the old stuff onto the new, but once we finish that migration, we're going to have to start all over again, or maybe switch to a different tool. We're looking at a tool called Ansible to use instead of Puppet.


And so there's this never-ending ongoing set of tasks, but each time we do it, it reduces our workload by that much more. So when we upgrade from Puppet 3 to Puppet 6, we get an improvement in the maintainability of that server. And that means that we spend less time with that server going forward and have more time to do other things or to deal with project growth.


Regarding a scale of efficiency, how do you close your skills gaps? When I spoke to Chris and Drew, they both said, "We do everything." How do you do that? How do you know all of this? Do you look at this big picture and say, "Okay, we need a person to specialize in X and Y and Z," and then you send them out to learn about it? How do you cope with that?


The team definitely specializes. And the guys have specializations around different areas, but we do a little bit of cross training, but not a lot because as I mentioned, we've got like 200 machines, each individually doing their own thing. If we cross trained everybody in everything, we'd get nothing done. So, there's a little bit of cross training, but mostly some specialties. It does create a little bit of bus factor...


Which is very scary. I was just going to say, your bus factor is very scary. Talk about that.


The thing is that Puppet allows us to create configurations and that's in version control. If all of a sudden somebody leaves, another person can backfill them because if somebody leaves, it's not like they take their work with them: all the work is in version control. And so that work doesn't go with them, but we may need to backfill some education on that particular specialized area. For example, Chris (ASF Infra team member Chris Thistlethwaite) does a lot of our monitoring work. If he left, now we need somebody to get a little more familiar with NodePing and a little more familiar with Datadog, but that'll be like a week for somebody to pick that up.


It wouldn't be, "Oh my God, this is three years of expertise that we need to go backfill" ...we don't have anything that is that highly specialized.


Is that because the team is more well rounded or because you guys are more efficient or what about it? Because of technology evolution, or...


We don't deal with systems of that level of complexity. We've got 200 machines, like I said, each doing their thing, but it's not like we've got a cluster of 200 machines all trying to coordinate to create one particular outcome. It's, here's a MySQL server, here's a JIRA server, here's a Puppet server. Things like that, where the amount of technology is pretty small in each little pocket ... but we just have a hundred pockets on our pants.

[END OF PART II]

Tuesday June 09, 2020

Inside Infra: Greg Stein --Part I

The third "Inside Infra" interview is with ASF Infrastructure Administrator Greg Stein, who shares his experience with Sally Khudairi, ASF VP Marketing & Publicity.




"We've got about 200 different machines and each one runs something different"



PART ONE.


What is your name --how is it pronounced?

Greg Stein. "Gregg St-eye-n"

When people need to find you, are you at gstein@? Has that always been your handle for everything?

Ever since high school, actually. I was gjs@ for a bit in college, but went back to gstein@. I started at Google early April 2004, and Gmail launched on April 1, so I was able to get my work email ID, gstein@gmail. So it’s great, but also rather annoying, because there are a lot of Gary Steins and Gertrude Steins and George Steins, and I get all of their email ... I get plane tickets, hotel reservations ... I got a proposal from the Gates Foundation once. I had some crazy bitter angry lady yelling at her husband as they were getting divorced, and she could rant. I mean, wow: that lady had a pirate's mouth.

But she didn't have his email address.

Apparently not.

When and how did you get involved with the ASF?

I left Microsoft in 1998, and the product group I was working in was building WebDAV into various Microsoft products. I thought the concept of WebDAV was very cool, and wanted the Open Source world to have it. That meant writing a module for the Apache Web Server. I think it was September 1998 when I started posting to the Apache mailing list and looking at how to plug in a WebDAV module. That was Apache 1.3 at the time. I developed a module called mod_dav for Apache 1.3, and when we started Apache 2.0 in 2000, I donated the module to Apache, and it became a standard module in Apache 2.0.

I remember that: I did the press release for that way back when. I knew you were connected with mod_dav, but didn't realize the path as to how you got there. It's very interesting.

That's what brought me to Apache, when they started putting together the foundation: it was in the Spring of '99. I remember asking Roy if I could be one of the first members of the foundation, and Roy's answer was basically like, "We already had the set of people locked in. You'll probably get nominated and voted in at our first member meeting," which occurred in September 1999. So yes, I was in that first batch of new members rather than the original membership.

You've been a member of the ASF much longer than you've been involved with ASF Infra. What were the previous hats you were wearing at the ASF? You've been here for a while, and have had a lot of different configurations.

This is true. So I'm a committer on HTTPd (Apache HTTP Server) and then a PMC Member, an ASF Member. I helped start the APR (Apache Portable Runtime) project with some of the other Web server committers, we pulled that out of HTTPd and created APR, and we used that for 2.0. We used APR, whereas Apache 1.3 was essentially the combination of the two, one big code base. Then Justin Erenkrantz and I started Apache Serf, and that was a high performance C-based client library for HTTP. But we didn't have three people in the community, so it couldn't really be an Apache project. So we took it out of Apache and started working on it on our own, and then eventually Subversion started to use Serf, and so we got more committers on Serf, and the community kind of built up around it because of Subversion. So we ran Serf externally, but just like it was an Apache community, it was Apache licensed and so on. Eventually we wanted to move it back into Apache, and I don't recall off hand, but we went straight to a TLP from our external project back to Apache Serf.

Early 2000, it was January or February, (ASF co-Founder) Brian Behlendorf approached me about helping with the network protocol for this new version control system they were starting at CollabNet, because he knew my background in HTTP and WebDAV. That “V” stands for versioning. I got involved with the Subversion project that Spring. That was also run as a very egalitarian Open Source project, very similar to how we run stuff at Apache. I was really the only Apache person, but Karl Fogel just knows how to run a great community, and so all those values that we cherish in communities at Apache were part of Subversion from day one, but it was run by CollabNet. I was hired in 2001 to manage their development team. Eventually, CollabNet wanted to turn it into a vendor-neutral thing that wasn't only CollabNet, so they started a small LLC called the Subversion Corporation. Once the IP was transferred to the Subversion Corporation, people said, "Okay, let's move to Apache," because nobody wanted to deal with the overhead of the Subversion Corporation. We approached Apache at the end of 2009, and Subversion became Apache Subversion. I was the first VP for that. I think that's the only VP hat I've worn.

In 2001, I was elected to the Board at the Members meeting, and in 2002, Roy decided to step down as Chair and said, "Oh, Greg should be Chairman." He just kind of threw me under the bus that way, but I agreed, and that's when I became ASF Chairman. I was chairman until 2007, which makes me the longest-running chairman. I think Brett Porter did four years.

I think it was 2009 when you hosted us at the Harvard Club and Doug Cutting was appointed Chairman, but he said he didn't really want to travel, do much press stuff, or be a face of Apache. Roy came to the rescue, threw me under the bus again, and said, "Greg can be Vice Chairman, and we'll have the Vice Chairman do all that stuff." So I held the Vice Chairman role until September 2016, when I gave up my director position, the Vice Chairman position, and VP of Subversion, because that's when I became Infrastructure Administrator.

Over the years, I did a bunch of volunteer work for ASF Infrastructure. I helped out with what we call AP mail: adjusting moderators, changing aliases, things like that. So I've had AP mail access for quite a while when I was doing that. Upayavira wrote id.apache.org for people to review their Member records, change their passwords, etc. I helped him with some of that stuff. That was all written in Python, so I was able to help out.

Python before Python was popular.

I've been using Python since 1995, and I've contributed to Python itself. We set up the Python Software Foundation in 2001. When I say “we”, I mean myself and Dick Hardt from ActiveState. We took the Apache bylaws, and added a different class of membership to it so that companies would become... I forget what we called them, like corporate members or something. The normal people were called nominated members, as they were nominated by somebody else and voted in. But this gave corporations a vote at the table on the board and anything else that members would get a vote on. So the core of the Python Software Foundation came from Apache.

Back to ASF Infrastructure. In 2016 we had four people on staff in Infrastructure, and our volunteer VP of Infrastructure didn't have enough volunteer time to be able to provide support and management for those four people, plus we wanted to hire two more people. With six people, he was right out. So we spent a lot of 2016 trying to figure out how to create a “manager” for the Infra team. At the time the idea of an “executive director” type position was also thrown around, but a full-time position to manage four or six staff is completely overkill, and we certainly didn't have the budget for a full-time position. Somewhere around late August, I realized that there was an email that Ross (former ASF President Ross Gardler) sent and I thought, "I can do that. That's a half-time job. I'm certainly happy to do it. I've managed engineering teams before." Now, Infra's not an engineering team, they don't really develop products, but it's pretty close to engineering management. At a minimum, it's personnel management, which I've been doing since the '90s.

So I threw my hat in the ring. Ross ran it by the Board and the team, and nobody raised a strong concern, so in his authority as President, he went ahead and hired me half-time. It was the day of our Board meeting --I resigned all my positions, and we appointed my replacement for Vice Chairman and my Director position that day, both of which I believe were Sam. He filled my role as Director, and I started as Infrastructure Administrator.

What does “Infrastructure Administrator” mean? What does it entail: are you hands-on coding solutions like the rest of the team? Are you solving problems? What do you do?

I chose the title because I didn't want to be called “manager”: I didn't want to feel like I'm the boss. I wanted to help with the administrative side, make sure the guys get paid, deal with the invoices, handle what you might call back office kind of stuff, and let the team focus on what they do best, which is the system administration. (ASF Infrastructure Team Member) Daniel Gruno does some development work in addition. I do a little bit of development work. For me, it's more like where in my hobby time I might work on Subversion, but now my hobby time is coding Infrastructure type stuff, so it's not really part of my work duties. I deal with salaries, raises, bonuses, getting the payroll done, and for our contractors, getting them paid. I also deal with third party contracts for things like Travis CI, for lists.apache.org ... that's with PonyMail. I make sure that our vendors get paid, and our contractors and employees get paid.

How was the Infra team structured, and how many are in the team?


We have five full-time people that work on Infrastructure: all five are system administrators. Daniel Gruno does maybe 30% system administration and 70% tool development. We don't develop any products, because we're not an Apache community. We write tools, but don't actually develop any products. This is why PonyMail is in the Incubator: it was originally written by one of the people on Infra, but we didn't want to run that as an Infra community. With only five people, we don't really want to be a community lead or anything like that.

The joke is if somebody wants to move into my position, they lose half their salary, because my position is part-time. It's not really a promotion: it would be a loss to do anything. So unlike a corporation with 10,000 people on staff, career development is a little more difficult. It's really a job for people that enjoy Apache and enjoy our mission, and also enjoy working with the other people on the team.

Who does ASF infra serve?

Our primary users are all the communities at Apache. We've got over 200 communities, and those are the primary users. I don't like calling them “customers”, but in a corporate world, they would probably be considered our customers, and we serve those users. There's 8,000 people with accounts that are working on different projects, but the user base is way, way larger than that, because people can file JIRA tickets and work on the wiki and do things like that without actually being an Apache Committer. So the user base is even larger. Then you start looking at all the people subscribed to all of our mailing lists, and that number goes even higher. There's probably 10% of our work which is also supporting the administrative side of the Foundation itself.

For the Board, your role in PR, and Trademarks, and Legal, and the office of the President and various other operational type stuff, we spend 5-10% of our time. A lot of what we do applies naturally across all of the user base, because the foundation uses the same tool set as our communities. Subversion, mailing lists, JIRA, Confluence, etc. We help with account creation, the LDAP management, what sort of permissions people have to access different things...

One of the neat things that we've done, and I've actually had a couple of communities ask us about it, is our GitBox setup where our projects can use GitHub. But then we also mirror all that source code back to Apache so that we have a copy of it for provenance tracking. And in case GitHub does something dumb, we have our own copy of the code. Any changes made on GitHub get sent to our mailing list or get mirrored into JIRA. Our projects can see all the activity on GitHub, and it gets mirrored into our mailing lists where we prefer that our community work is performed.

That's actually a pretty cool feature that we've done at Apache.


It's interesting to see communities outside of Apache that emulate structures and processes and solutions that the ASF has created. It's cool to see it even happening on an infrastructure level. How does ASF infra differ from other organizations or other open source foundations?

Most of them don't really have teams. Most projects out there do their work on GitHub, and don't have their own source control. They don't have account management, they don't run mailing lists. We do all this stuff that most Open Source projects just don't deal with.

They also don't have the scale that we have.


Yes. Because they're one project, and we have over 200 projects. Most projects have some repositories hanging out on GitHub or on GitLab, or wherever else that they might host: if somebody wants to run a demonstration of that project, they buy their own virtual machine on AWS, and pay that out of pocket. At Apache, all of our projects can have virtual machines hosted by Infra, where they install their software for demonstration purposes. They can point people at that VM, so they can check out the product in live motion. So that ability to run VMs is also pretty unique to the Foundation. When you look at the Linux Foundation or the Eclipse Foundation, those are a little bit different. They're not a charitable organization like us. They're a 501(c)(6), which is really like a trade association.

Like a consortium.

Yes, a consortium. I believe that they do have infra teams, but their business model is quite different from ours. If you look at Mozilla, they have the Mozilla Foundation, but that's kind of a shell; Mozilla Corporation essentially runs everything, and the foundation is like a legal shell wrapped around the corporation.

You mentioned earlier that we have 200 projects: you're referring to 200 Top-Level Projects (TLPs), but we also have sub projects and initiatives. At Apache, we have more than 350 different activities going on --you guys touch all of those. It's not like there's any aspect of ASF that you're not involved with or you're not supporting.


That's correct. And I say 200 because I'm thinking mostly from a TLP thing.

Irrespective of the existence of sub projects, you're still dealing with other communities and projects: there's more than just the 200. Hats off to you guys. It's quite a lot of work.

We've got about 200 different machines and each one runs something different. Some companies have 50 copies of a machine that they'll start up in the cloud, running some container --we never do that. Each individual machine is configured one by one and they're all different. And so 200 machines to support the 350 initiatives. It's a lot of heterogeneous work and that can be kind of distracting, but it's also very interesting because we do support such a wide variety of stuff for our projects.

There's what, five Infra team members, and we have 350 projects and initiatives going on. That's a lot of stuff happening: is it non-stop?

Yeah, it's nonstop. That's why we went from four to six people, we were sort of treading water, but we weren't really able to move forward on a number of our longer term initiatives. So when we went to six people in November 2016, that made us a lot more hands-on, if you will. That meant that we could actually make some progress on this longer term work that we wanted to accomplish. Some of that is like https://selfserve.apache.org/ , where people can get things done instead of filing a JIRA ticket and having us do the work for them.

Is that popular? Do people use it?

Oh, absolutely. When somebody opens a JIRA ticket to say, "Can I have this Git repository?” or “Can you create a JIRA Space for me?" we close the ticket and say, "Go to selfserve.apache.org". Before, where everybody would file a ticket for a Git repository or file a ticket for JIRA, file a ticket for Confluence, or whatever, we just close them all down now, and they use selfserve.apache.org instead. We simply won't do those things anymore. So selfserve.apache.org is actually quite handy. And then about four months ago we added a feature called asf.yaml: it allows communities to control a lot of the finer grained aspects about how their repositories are used, like how do they publish Web pages from a repository, or if you make a change, where does the commit email go? Which mailing lists? Does it go to their development list? Or do they have a commit list? If somebody opens a PR on GitHub, where does notification of that go? Those used to all be tickets also, but people can deal with those just by editing a file in their repository now. So again, it reduces tickets and that's our goal where these routine tasks that all the different communities want to perform, we want to move those into a self-served mechanism so that we don't need hands-on all the time. And thus, we can support 350 different initiatives.
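To make the routing idea above concrete, here is a minimal Python sketch of how a per-repository settings file might decide which mailing list receives which kind of notification. It is only an illustration: the key names ("notifications", "commits", "pullrequests"), the example addresses, and the fallback list are assumptions made for this example, not the documented asf.yaml schema, and the dictionary stands in for the parsed contents of the file.

```python
# Hypothetical sketch: key names and addresses are illustrative assumptions,
# not the documented asf.yaml schema.
DEFAULT_LIST = "dev@example.apache.org"  # assumed fallback when no rule matches


def notification_target(settings: dict, event: str) -> str:
    """Return the mailing list that should receive the given event type."""
    notifications = settings.get("notifications", {})
    return notifications.get(event, DEFAULT_LIST)


# Stand-in for the parsed contents of a repository's settings file.
example_settings = {
    "notifications": {
        "commits": "commits@example.apache.org",
        "pullrequests": "dev@example.apache.org",
    }
}

for event in ("commits", "pullrequests", "issues"):
    print(event, "->", notification_target(example_settings, event))
```

The point of the design is that a community changes the routing by editing data in its own repository, so no ticket and no Infra involvement is needed for the routine case.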

That's great to help empower the communities to take care of their own needs, whether they're minor or major, but that also encourages autonomy. So that's really helpful for you guys: you don't need to have a team of 40 people to support the day-to-day.

We do stay busy. You're talking about the influx and we get requests from people through email, through our Slack channel and through JIRA. Of course our monitoring system will tell us when something goes down, so our monitoring systems also give us more work to do, so it is kind of an endless string of queries. Depending on what the task is, each of those different channels is appropriate. For a quick task, hitting us up on Slack is totally fine, but if there's going to be several days of work, we like JIRA tickets so that we can track the work as it progresses.

How do you encourage the team? How do you keep them motivated? What were your challenges with such a huge load to carry: how do you keep everyone going?

One of the big benefits that we have for our team is actually that we're all remote, so we all sit on a Slack channel. We have a team-only channel that we use for communicating, "What's going on? What beer are you drinking today? What are you having for dinner?" I think about my days when I worked at Microsoft or at Google where I sat in the office by myself and it's a very individual experience that I used to have, but now, our team is there all the time on our channel. It's a very social experience: I think that makes for a much tighter team. And it provides a very different experience than what you get at a more “normal” company. That sort of team experience really helps keep people motivated.

People enjoy their jobs more. From a management standpoint, I can certainly say, "If people are sitting there talking about what they're going to make for lunch, there's a drag on the team and maybe we're not seeing the highest productivity possible," but I think that would actually run counter. Our team is actually more productive as a result of this great team bonding. We have a conference call once a week for 30-60 minutes. And we don't really have to: the team knows what everybody's doing because we're all doing it right in front of everybody. We all get the commit messages. We have our Slack channel. We see the changes to JIRA. We know what each person is doing, but having the call actually gives us a chance to speak to another human so you're not working in your basement all day without any human contact.

We actually have that once a week, if you will, forced human voice contact.

Did that evolve organically? Or was that something planned?

The team was already doing weekly status calls. When I started, I said, "We're going to keep doing that. We're not going to switch out for just, you know, a status email or anything." Before I started, I think they were doing a group edit on a status Web page or something. I don't know if they had calls, but today I mandate the call because I want the team to get together. We've also been doing the group get-togethers at ApacheCon. We got together at ApacheCon Miami, and then the next year in Montreal. Last year we skipped the whole conference format and just got together as a team in New Orleans for four nights.

It was great because it was just us without the distractions of the conference. The conference is good because the INFRA team gets to meet the people that are their users, their customers, the people that we're actually trying to support, all those communities. And the people in the communities get to meet the team. You know, the people that asked, "Can you help me with X?" They get to put a face to those names.

There are times where one of the guys on the team will work with somebody in the community for a couple of weeks to track down some problem, get a virtual machine configured, whatever. All you see is a user ID and the kind of tone of their messages, but at the conference, you can actually put a face to that name, to that ID. That’s really good from a team standpoint. With the team bonding, we spent eight hours a day in this giant penthouse suite in New Orleans on the 30th floor looking out over the Mississippi River. It was very cool, it had space and a big dining table where we could all come in and work. And then I would go around the corner to Mothers and pick up—

Oh my gawd: the po boys ...the debris po boys.

Exactly, you know what I'm talking about.

I lived there. So, yes, I know.

It was literally a block away. So that was our lunch. Every day I was going down to the Mothers, getting a big brown shopping bag full of food and bringing it to the room. We did go there and eat once so the guys could get out of the room for lunch, and each evening we would go as a team out for dinner. After dinner, it's like, “OK, do whatever you want. It's New Orleans.” That was a really good team experience. We were set to go to Nashville this year and then, you know, pandemic ensued. So we called it off.

It's funny: I stumbled across your channel on Slack and, if I remember this correctly, someone was talking about grilling a whole steer or something along those lines. You guys deal with a lot of beef, there’s a lot of meat in this group. So ...

In the team channel, there's a lot of stories about food and beer and other forms of alcohol. We eventually created a cooking channel on Slack because there's other people like Ruth (ASF Executive Vice President Ruth Suehle) and Shane (ASF Vice Chair Shane Curcuru) and others who also like talking about making food. We still have a lot of that discussion on the team channel, but we’ve now got a dedicated channel with a larger set of people talking foodie type of stuff, so that’s very cool.

You were also talking about motivation: I work with each of the guys to find out what they're interested in exploring. Whether it's a new tool or a new product or to write a new tool to improve our workflow, it's like, "What are you interested in? Okay, take point on that, do the research, go do the experimenting." So each of the guys has gotten generally one or two long-term projects that interest them that they want to work on.


[END OF PART ONE]

Monday May 11, 2020

Success at Apache: Remote Collaboration in the Time of Coronavirus

by Marvin Humphrey

I "arrived" at the Apache Software Foundation in 2005, unreasonably angry about a bug in Apache Lucene.  By "arrived", I mean that I sent the first few emails among several thousand I would go on to send over the next 15 years — the ASF didn't have a physical office where I could show up to buttonhole and berate some unlucky customer service representative.  An unreasonably patient Lucene contributor named Doug Cutting talked me down.

Because the ASF has always been a virtual organization, the Coronavirus pandemic has had minimal impact on its day-to-day operations.  While individual contributors may be personally affected, at the collective level there's been no mad scramble to adapt.

Others have not been so fortunate.  All around the world organizations have been struggling to revamp their processes and infrastructure to comply with "social distancing" protocols.  Sadly, many have already laid off workers, or even closed their doors for good.

And yet, there is a huge pool of work which could conceivably be performed remotely but isn't yet — or which is suddenly being performed remotely but inefficiently.  If we can accelerate and streamline the transition to remote work, many jobs and businesses could be saved.  With some creativity, our interim "new normal" could be more prosperous, and perhaps sooner than we think!

Are you an Open Source contributor?  If so, you possess expertise in remote operations which is desperately needed in today's challenging economic environment.  Let's talk about what we know and how we can help.


The Internet Turns People Into Jerks

People type things at each other over the internet that they would never say to someone's face.  In person, we calibrate our language based on feedback we receive via facial expressions, tone of voice, and body language.  But when all communication is written, the feedback loop is broken — and all too easily, vicious words fall out of our fingertips.

Suddenly-remote workers may find themselves exposed to this phenomenon as conversations that once took place in the office migrate to Slack, email, and other text-centric communication channels.  But it can be tricky learning to recognize when a conversation being conducted via a text channel has gotten overheated — it takes an intuitive leap of empathy, possibly aided by dramatic reading of intemperate material a la Celebrities Read Mean Tweets https://www.youtube.com/playlist?list=PLs4hTtftqnlAkiQNdWn6bbKUr-P1wuSm0 on Jimmy Kimmel.

Open Source communities have grappled with incivility for as long as the movement has existed.  Over time, "ad hominem" personal attacks have gradually become taboo because of their insidious corrosive effect; there exists broad cultural consensus that you should attack the idea rather than the person behind it.

Defenses have become increasingly formalized and sophisticated as more and more communities have adopted a "code of conduct".  While the primary purpose of such documents is to guard against harassment and other serious misconduct, they often contain aspirational recommendations about how community members should treat each other — because serious misconduct is more likely to occur in an environment of constant low-grade incivility.

Regardless of whether your organization adopts a code of conduct, it won't hurt to raise awareness among remote team members of the susceptibility of text-based communications to incivility — so that they may identify and confront it in themselves and others and shunt everyone towards more constructive patterns of communication.


Keeping Everyone "In The Loop"

Coordination is a troublesome problem even when everyone works in the same office, but the difficulties are magnified in remote environments where it takes more effort to initiate and conduct conversations.  Teams can become fragmented and individuals can become isolated unless a culture is established of keeping everyone "in the loop".

At the ASF, the problem is especially acute because its virtual communities are spread out across the globe.  Due to time zone differences, it is typically infeasible to get all stakeholders together for a meeting — even a virtual meeting held via conference call or videochat.  Additionally, many stakeholders in ASF communities do not have the availability to participate in real-time conversations regularly because they are not employed to work on projects full-time.

"Synchronous" communication channels like face-to-face, videochat, phone, text chat, and so on are good for rapid-fire iteration and refinement of ideas, but they effectively exclude anyone who isn't following along in real-time.  Even if conversations are captured, such as with AV-recorded live meetings or logged text chats, it is inefficient and often confusing to review how things went down after the fact.

The solution that the ASF has adopted is to require that all meaningful project decisions be made in a single, asynchronous communication channel.

  • The channel must be canonical so that all participants can have confidence that if they at least skim everything that goes by in that one channel, they will not miss anything crucial.
  • The channel must be asynchronous to avoid excluding stakeholders with limited availability.

Synchronous conversations will still happen outside this canonical channel —and they should, because again, synchronous conversations are efficient for iterating on ideas!  However, the expectation is that a summary of any such offline conversation must be written up and posted to the canonical channel, allowing all stakeholders the opportunity to have their say.

At the ASF, the canonical channel must be an email list, but for other organizations different tools might be more appropriate: a non-technical task manager such as Asana, a wiki, even a spreadsheet (for a really small team). The precise technology doesn't matter: the point is that there are significant benefits which obtain if a channel exists which is 1) canonical, and 2) asynchronous.

Decision Making

In an office, decision makers can absorb a certain amount of information by osmosis — via overheard conversations, working lunches, impromptu collaborations, and so on.  That source of information goes disconcertingly dry on suddenly-remote teams, leaving only information siphoned through more deliberate action.

A canonical, asynchronous communication channel can compensate to some extent, providing transparency about what is being worked on and how well people are working together, and facilitating oversight even while most of the work gets done solo.  Because properly used asynchronous channels capture summaries rather than chaotic and verbose real-time exchanges, the information they provide is more easily consumed by observers watching from a distance.  The canonical channel also provides an arena for gauging consensus among stakeholders and for documenting signoff.

"Lazy consensus" is a particularly productive kind of signoff, where a proposal is posted to the canonical channel and if there are no objections within some time frame (72 hours at the ASF), the proposal is considered implicitly approved.  Provided that the channel is monitored actively enough that flawed proposals get flagged reliably, "lazy consensus" is a powerful tool for encouraging initiative — a precious quality in contributors collaborating remotely.
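The rule described above is simple enough to sketch in a few lines of Python. This is an illustration only, under stated assumptions: the function name, the three status labels, and the choice to ignore objections that arrive after the window closes are inventions for this example, not ASF policy or tooling. Only the 72-hour window comes from the text.

```python
from datetime import datetime, timedelta, timezone

REVIEW_WINDOW = timedelta(hours=72)  # the time frame cited above for the ASF


def lazy_consensus_status(posted_at, objections, now):
    """Classify a proposal as 'blocked', 'pending', or 'approved'.

    posted_at  -- when the proposal reached the canonical channel
    objections -- datetimes at which objections were raised
    now        -- the moment at which the status is evaluated
    """
    deadline = posted_at + REVIEW_WINDOW
    # Assumption: only objections raised inside the window block approval.
    if any(posted_at <= raised <= deadline for raised in objections):
        return "blocked"
    if now < deadline:
        return "pending"  # window still open, nobody has objected so far
    return "approved"     # window closed without objection


posted = datetime(2020, 5, 11, 12, 0, tzinfo=timezone.utc)
print(lazy_consensus_status(posted, [], posted + timedelta(hours=24)))
print(lazy_consensus_status(posted, [], posted + timedelta(hours=73)))
```

Note that silence is what moves a proposal forward here, which is why the rule only works when the channel is actively monitored.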

Conclusion

Organizations are adapting in myriad ways to the economic crisis brought on by the Coronavirus pandemic.  In the world of Open Source Software where countless projects have run over the internet for decades, we've accumulated a lot of hard-learned lessons about the possibilities and pitfalls of remote collaboration.  Perhaps our experiences can inform some of the suddenly-remote teams out there straining to find their way in these difficult times.  Let's help them to do their best!

Marvin Humphrey is a Member Emeritus of the ASF and a past VP Incubator, VP Legal Affairs, and member of the Board of Directors.  These days, he is focusing on family concerns and consulting part-time.

= = =

"Success at Apache" is a monthly blog series that focuses on the processes behind why the ASF "just works" https://blogs.apache.org/foundation/category/SuccessAtApache  

Monday May 04, 2020

Success at Apache: bringing the Apache Beam firefly to life

by Julián Bruno


Creating the Apache Beam firefly was the first opportunity I had to contribute my skills as a designer and illustration artist to an open source project. I didn’t know anybody working in open source until I moved to San Francisco from Buenos Aires, Argentina. I knew about open source software for video games, like Unity or Unreal Engine... This allowed gamers to make modifications, like adding new levels or creating new character models, and upload them to the same engine that hosted the original game for other gamers to use. This practice enabled a sense of community, where users can share ideas, passions, and express creativity. There are so many things you can do when you work in collaboration with others. This spirit of community is one of the things that made me excited about contributing to Apache Beam. 


Living in an area where technology is everywhere really piqued my interest and drove my curiosity to understand how technology evolves. When the opportunity came to contribute to Apache Beam, I was interested right away. I didn’t know about the project before I got involved, and I certainly didn’t know there was a community behind it, working together to build this amazing solution. Building a mascot for a group of people is different from working for a brand because this firefly represents a group of people and what they find valuable. There is an extra layer that makes it more human. For this type of work, designing a mascot is usually a decision reserved for a small group, and the larger community is not involved. It is refreshing and very meaningful that the community had a chance to step into the process. I saw it as an opportunity for self-expression, participation, and one more exercise in community building. 


In order for this process to be inclusive, I built a group-wide communication system for the community to give input during the process. I think that having open and frequent communication was key because, ideally, I wanted everyone to feel that the mascot represents them. I created questions that would help Apache Beam contributors understand what I needed as an illustrator. The questions helped me understand what they liked. This ensured that the mascot was aligned with the community’s taste. Some questions were about colors and visual styles they preferred, if the eyes are too big or small, and preferred line art style. There were 4 rounds of feedback, plus a final vote, where 18 people participated. Engagement increased with every new round. The Apache Way for communities to operate reminded me of a lot of animation forums I participated in during the early 2000s. I’m glad to see that some of these practices are still around, because they help make processes more inclusive and build a sense of community.




This communication with the Apache Beam community helped me to create a mascot with features that are unique to the project. When I started, I was given a few concepts that I needed to work with, such as: cute, innovative, fast, data processing, and futuristic. The first few decisions, like making the mascot look as aerodynamic as possible, were easy to make. Conveying "data processing" was a bit harder to figure out, but I eventually chose to communicate this concept by changing the mascot's color. What really gave the mascot its unique identity came from using a Pokémon-like character style. I built the rhetoric for Apache Beam's logo by combining two concepts that have nothing to do with each other, Pokémon and data streaming, and created something new. 





In the end, I created the Apache Beam mascot and its model sheet, so that anyone can reproduce it, a version of the mascot learning (a key focus for the project at the moment), and a version of the firefly doing what it does best… stream data! I really enjoyed working for Apache Beam and contributing my skills as an illustration artist to open source. I think the most interesting part is the community: creating something in collaboration with others adds a lot of value to what you are making for the world.



Julián is a digital artist based in San Francisco, California. He has spent over 10 years in the animation industry and has developed his skills in art direction, 2D animation, illustration, and visual art development. His passions include art and cartoon animation, as well as connecting with people and creating new projects. He was born and raised in Buenos Aires, Argentina, where he studied Graphic Design at University of Buenos Aires (UBA). Find Julián's work on Artstation and Instagram.

= = =

"Success at Apache" is a monthly blog series that focuses on the processes behind why the ASF "just works" https://blogs.apache.org/foundation/category/SuccessAtApache 

Monday April 27, 2020

Inside Infra: Drew Foulks

The second in the "Inside Infra" interview series with members of the ASF Infrastructure team features Drew Foulks, who shares his experience with Sally Khudairi, ASF VP Marketing & Publicity.


 



"I am in the business of making life easy for people who do phenomenal stuff."



What is your name --how is it pronounced?

My name is Drew Foulks. “Droo Follx”.

If folks were to find you at the ASF, like on Slack or elsewhere, what's your handle? How do they find you?


They'll find me at Warwalrux, spelled with an X, so W-A-R W-A-L-R-U-X.

So, “War Walrus”, but with an X at the end. Where did that come from?


Kind of embarrassing story actually. I got picked on a lot in middle school because I was always really good with computers, but as bad as it sounds, I never really wanted to be. I always wanted to be one of the cool kids and the cool kids were not going to computers. One day I got into a fight at school and one of my friends just absolutely made me lose it afterwards. I was sitting there on the ground crying and he said, "Man, you were a fighting walrus, like the walrus of war or something. It was awesome." I lost it. But ever since then, I’ve just been like, "You know what? I'm not even going to be ashamed about that anymore." I've been that since I started doing tech, which was actually not that long ago compared to the other guys on the team.

How long have you been in tech?

I'm 29, and have been in tech since I was 16, so 13 years.

When did you get involved with the ASF? How did you get here?

I was working at NASA for four and some change years, and I decided that I wanted to pursue some other opportunities because they really were not supportive of that work from home culture. And at the time I had a lot of stuff going on. My wife was sick, my daughter, my youngest, has special needs and stepson actually also has special needs, so being at home was something I had to do. A buddy of mine tipped me off on a Website called We Work Remotely. I ran across your ad there and thought, "There is no way that is who I think that is and I'm going to apply for the hell of it." Surprisingly, two months later, I got a call back.

You do understand how many interview candidates we had, right? A lot of people were competing against you.

It blows my mind. I heard the stories after I got hired and I was just like, "Man, that's nuts." And then when I got hired, I was actually told, jokingly, of course, the ASF was looking to launch its own brand of internet satellite. So that’s why we hired people from SpaceX and NASA.

The Infra guys have such a dry sense of humor! How long have you been a member of the team?

One year and one month, 13 months.

For some reason it feels like you’ve been part of the Apache family for years. What’s your role in ASF Infrastructure? What are you responsible for?

My latest contributions have been the Website builders, so I'm working on helping people migrate off of CMS. Some of the ways that I've chosen to do that are by working with Humbedooh (the handle for ASF Infrastructure team member Daniel Gruno) on his ASF.YAML project, that so many projects seemed to be really enjoying.

YAML? Yet Another Markup Language?

That's it. Yet Another Markup Language.

So basically, I built the system that lets you build Websites from ASF.YAML and you just specify your Website builder, whether it be Pelican or Jekyll --those are the two that we support right now. And you give it a source branch and a target branch and every time you check in, boom. It builds your website.
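As a rough illustration of the three pieces of information mentioned above (a builder, a source branch, and a target branch), the Python sketch below checks a settings dictionary before a build would be attempted. It is an assumption-laden example: the key names "builder", "source_branch" and "target_branch", the branch names, and the function itself are hypothetical and are not taken from the actual ASF build tooling. Only the two builder names come from the interview.

```python
# Hypothetical sketch: key names are illustrative, not the real schema.
SUPPORTED_BUILDERS = {"pelican", "jekyll"}  # the two builders named above


def validate_site_build(settings: dict) -> list:
    """Return a list of problems; an empty list means the settings look usable."""
    problems = []
    builder = settings.get("builder")
    if builder not in SUPPORTED_BUILDERS:
        problems.append(f"unsupported builder: {builder!r}")
    for key in ("source_branch", "target_branch"):
        if not settings.get(key):
            problems.append(f"missing {key}")
    return problems


good = {"builder": "pelican", "source_branch": "main", "target_branch": "asf-site"}
bad = {"builder": "hugo", "source_branch": "main"}

print(validate_site_build(good))
print(validate_site_build(bad))
```

Checking the settings up front means a typo in the file surfaces as a readable message rather than as a failed build on the next check-in.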

Who is this aimed at?

This is for Apache Projects building their TLP Websites. When you commit your Website to the repo, say any project, they've all got Websites, but some of them are generated via Jekyll. Some of them are generated with Pelican, some are generated in a custom way with a Jenkins job. It's just how each project is determined to generate their website, but we're trying to make it easy and provide lots of options for projects to migrate off of the old CMS. But still projects are allowed to be able to choose their own method of publishing or their method of creating a site, but you have to be able to enable all of that to happen.

Did you have to learn this or was this knowledge something that you came into the position with?

I learned it.

Was it difficult? How long did it take you to get this project up?

The Pelican one was a lot harder than the Jekyll one. So, Pelican took a couple of months. Really, Greg had a prototype when I came in that apparently had been kicking around for a little bit, so I tightened it up and pelicanized it. I think it works pretty well. I've not heard any complaints about it.

That took a while because, before, I wasn't doing primarily Python programming. I was doing lots of different ops things, just in a completely different way than what I do now. To be honest, I still haven't wrapped my head around exactly what it is I do here.

Do you mind sharing a little bit about that?

I came here from the government world, which is very siloed. I worked for the OCIO (Office of the Chief Information Officer) Data Center for NASA Langley, which is a very old NASA center. Older than NASA itself, actually. Their infrastructure, as you can probably guess, is not the newest: It's 100 years old. They have wind tunnels from the 1920s. There are parts of the infrastructure that are 100 years old and it's insane. Everybody has a specialty, everybody's a subject matter expert in something, and there's nothing more permanent than a temporary government program, so if you take something on, expect to be doing that for the rest of your life. It's very regimented. If you’ve ever seen Hidden Figures, the computational research facility where they’ve opened the Katherine Johnson Research Center was my data center.


And then to come to the ASF, it's like, "Okay, so we've got like 11 different Cloud providers and these are all the projects that we're supporting. Do you know this, this, this, this or this?" Jenkins, Buildbot, VMware, Docker, Puppet and all that stuff. Did I know any of these myriad Open Source technologies that one doesn't really get to use a lot of in the government sphere? I mean, I'd been doing Ansible there for three years.

It was very monolithic. We had VMware. I ran a data center. I had hardware. I had to track all of that. Coming here, everything is completely different. It's like, "We're juggling all these different Cloud providers, and oh, wait: we’ve got to migrate out of this one today, so let's do that. Okay. All right. Where are we going with this?" It's just like there's no end in sight. As technology progresses, so do we. It's just that we do it so much faster than anywhere else I've ever been.

Is that exciting or scary?

Oh, gosh. I've never stopped long enough to think about it. It is a bit of both. It is intimidating for sure, because before it was very siloed. Like I said, I did my thing and I had my interests, my extracurricular interests, running home network setups and private media servers and whatnot. Then I come here and those hobbies go away; now I’m doing that for the Foundation instead.

Yeah, that's cool, though.

It is. I'm a professional hobbyist.

To get paid for doing your hobby is pretty rewarding.

It is. Yeah.

This has become your hobby in a different way, of course, because I'm sure you weren't planning on dealing with ~11 different Cloud providers.

No, I was not.

In our chat with Chris Thistlethwaite last month, we learned more about who ASF Infra serves and the scope of the work that you provide. Can you tell me more about the who and how it works exactly? So, who does Infra serve and in what capacity, or what is it that you guys do? I get that every person's perspective is slightly different, but I keep getting the same "we do it all" answer. Is that true? I mean, so far, it sounds like it's true. I guess no one has a reason to embellish, but tell me more.

We serve Apache project developers and development teams. It’s not just the people who sit down and write the code: the people who orchestrate these very complex processes of building, testing, checking, doing the sanity work behind the scenes, the people coordinating releases, PMCs planning out the future of these projects, we serve them, too. And we have to serve them in a capacity beyond, "Hey, here's a build platform." It's: "We support your email communications, we’re there to facilitate the goings on of the Project." Infra's domain is almost everything but the coordinating and writing of code.

Taking care of their code management systems, providing them with the means to do build testing and having it not kill us in the process. That's a big, big addendum to that requirement. Like I mentioned, email, I call them the central services, things like LDAP, authentication, your virtualization services, file sharing, all of those things that make the business of a TLP easy(ish). I am in the business of making life easy for people who do phenomenal stuff. That's honestly how I view my job and it's very, very different than my old one.

In my old job, I had one customer who I bent over backwards for; here, it's very much, "Listen, my job is to provide these services and to facilitate what you guys do, not do it for you." Drawing that line sometimes becomes difficult for me personally because I don't have as much experience in the ASF, I think. But that seems to be a skill that the other guys have: knowing when to bounce back and say, "No, this is definitely a PMC issue that you guys should be dealing with because it sets a bad precedent if I make this decision. I'm not going to do this work for you." It wouldn't be right to pollute a project like that.

What you're saying doesn't come across as odd. One thing that I always want to know is how ASF compares with other infrastructure operations in general. Chris had said this also, here you have 300+ projects and all sorts of different groups that you're interfacing with, so it's a completely different type of interaction. Your response is totally legitimate: it takes a certain type of personality to be able to handle that because most people would likely be overwhelmed and run away. The fact that you're here and thriving and our projects are expanding is awesome.

Thank you. You can thank my wife for not letting me run away.

Based on my understanding, as a team you're autonomous yet coordinated. Is that the right way to describe how you work together?

Yes. That is a good way to describe how we work together.

Do you feel like that model works or do you think something else should be happening or how does that work for you?

That's a tough question because I'm not sure that the answer would make any sense, but I'll give it a go anyway. By constantly talking with each other, the team gets a sense for the direction that we need to be heading. Leadership is very organic and not spontaneous, but they're like a current guiding us towards the goal, really, whatever that is, so all of the decisions that we make on the daily really kind of help us towards that goal, because fighting the current is difficult.

In a lot of ways that long-term coordination is really facilitated by this, I'm going to call it “on a current of progress”. It's not forceful. That's kind of what it feels like. The team is driving towards something, it's not random, to be honest with you. It's typically a goal that we have in mind, but all of the work that we do is just like, "There's a cool idea that I had related to this, so let's just work on that." And we end up getting there. It's crazy.

Describe your typical workday. Are you on a rolling schedule? Do you guys work on a shift? How do you get it all done --and you're down one person now-- how do you get it done?

I have no idea. So really, personally, I have a nine-hour-a-day schedule that I follow every day. So basically I start work and I break it up into two- or two-and-a-half-hour chunks and I do four of those, take little breaks in between, try to keep myself sane, try to throw in a dog walk. Really, I just approach it like I approach any other job, one ticket at a time.

Do you work in shifts? How do you cover those 24/7? How do you balance the load?

So there's a one-week on-call rotation. So right now there are the... gosh, how many of us are there? Five? Anyway, so there's a one-week on-call rotation and that person is on 24/7 for the week, Monday to Monday. And then after that, it's pretty much just you cover your time zone. Yeah. So the scheduling, it's so loose that I mean really as long as you're putting in your eight hours a day, nobody really cares when you do that. I choose to have that nine-hour work day because of the kids, really. It's fantastic for having a family, but if you want to jump on at 1:00 in the morning and work for six hours, that's fine.

OK, so as long as someone's there, and it doesn't have to be you, you can work on your own timeframe. Are you guys usually slammed? Is it low-level? Is there a busy time for Infra on the whole? Is it like tax season if you're an accountant, or is it constantly just 24/7/365?

It's pretty much 24/7/365, but we do definitely have “seasons” as well. We do a one-week on-call rotation, so somebody's always on, but the scheduling is very relaxed. So, it's optional, the hours you'd like to keep. I choose to work a regular work day because of the family and that just kind of fits in nicely, actually. Some people may decide that, "I'm awake. It's 1:00. I can't sleep. I might as well get some work done," and do that. And I've certainly done that before. So, yeah, it's pretty whatever and we're all kind of, I don't want to call us workaholics because I think that's a bad word, but we're all …

“Work enthusiasts.”

I don't know that I've called them busy seasons as much as busy cycles.

What are they? What triggers them?

Typically? Releases. The most tickets coming in is when some project is putting out a build or is putting out a release. For a large project release, we'll have a lot of tickets sent in because they're utilizing a bunch of resources and stuff gets backed up. That's typically it.

So whoever is on call during that time period, it's really their responsibility to handle: it's not like when Apache Wombat or whatever Project has an issue, it becomes “Drew's issue”. You're not assigned to a project to facilitate that, it's whomever is there will help them however possible, correct?

Yeah. And I think that you said it earlier: everybody that you've talked to says that we do it all. I'm going to tell you that we do it all. It's every project from Apache Zeppelin to Airflow, whatever the first one is. That's not our work.

I don't know if this is actually the case, but I'm curious: is it possible for an ASF Infra team member to be an introvert or do you all have to be “client-facing”? I know that we don't have an office, and you see people from time to time at ApacheCon, but do you have a wall that you can hide behind or do you have to interface with people all the time?

Did you go to the end for Lightning Talks?


I was not at Lightning Talks at ApacheCon/Vegas, but I heard that quite an activity happened there. Chris told me about it during his interview, let's put it that way. No one said anything to me up until that interview, so I was surprised. Fill me in some more. What do I need to know?


[laughing] So, an introvert and two extroverts that are way too drunk, get up on a stage in front of people and proceed to just make fools of themselves for a minute. That's pretty much it.


I guess I know who the introvert was.


Yeah. So the original plan was to go up there and make thunder noises because that is the sound of lightning talking. That was a fun experience. Not one that I would do again, I think, but it was fun.


Let's go back to the daily schedule for a minute. This is always a curiosity for me for anyone who's super busy, which is pretty much everyone at Apache: how do you keep your workload organized? Your structure for your day is very impressive, I have to say, this two-and-a-half hours times four. I think it's fascinating. But your actual workload, for example, you get one of these huge releases, how do you manage all that?

Okay, so the first part of my day is typically spent organizing my day, as awful as that sounds. We get so much email that I'm pretty sure it's literally impossible to read it all, so the first order of the day is to sift through that while you drink your coffee. I catch up on the stuff that the team has been talking about, catch up on all the Slack channels, look at my tickets, prioritize my workload, and that usually takes about an hour. So right at 8:30, I'm ready to actually start doing stuff. Then it's usually tickets and then a break. I don't like to check my email too terribly often, because I think it gets me off task. I wish I could check it just three, four times a day, but that's not really something I have the luxury of being able to do all the time, so I do have to monitor my Ubuntu alerts as emails come in, scanning for anything important. But yeah, it's ticket work for the first half of the day and project work for the back half of the day. Right after lunch, I'll sit down and I'll figure out where I am on my project, and then try to move forward from there. Typically, that involves research, but yeah, I like to spend the last couple of hours of my day trying to do something. So, typically project work, because I don't like doing ticket changes at the end of the day.

Why is that?

Well, if you're going to nail your foot to the floor, don't be surprised when you can only run in circles.

I presume when you do ticket work, more things come out of it, too, so it never ends.

Yes. Typically, ticket work involves making a change of some sort, to something that's actually being used, whereas project work is kind of this nebulous, unused, non-production thing.

I'm hearing that you need to know a little bit about everything in addition to your own areas of expertise. How do you stay ahead of the curve? How do you learn about everything that you need to know especially if you don't know what you need to know? How do you do that?

I don't think that you do stay ahead of the curve. I really don't. I think that we do our best to ride it. Getting ahead is so immensely difficult. This technology essentially fractalizes into these many different various facets of high computing.

From virtualizing, networking, programming, you have all of these facets. Nobody can really, truly stay ahead of the curve. I mean, holy cow, the guys in the Infra team, they are all 12-pound brain-type dudes. They'll go from talking about hardware specs to talking about virtualization. They'll bounce around all these different facets of technology, and obviously you have strengths and weaknesses, but I don't think anybody can really stay ahead of the curve at this point, and I feel like it's been a long time since anybody has. Technology has just gotten so complicated. We've really tried to, without specializing too much ... kind of pick out some of the non-essential fluff, the stuff that we don't use. I mean, hypervisors aren't really like super in these days. It's all about the Cloud, which is really just an abstract hypervisor, but whatever.

So, we don't really have any “machines” anymore; spec-ing out a physical machine is not something many of us do very often. It's not part of our job anymore, but that's definitely one area of technology that continues to advance as they put out better processors and whatnot. Mostly we try to stay ahead on the DevOps side of things without focusing too much on this operational infrastructure portion. And that's where I came from, this operational infrastructure, the data centers, the servers, the hypervisors, making VMs for people. That's what I used to do and now it's a lot less of that and a lot more fine-tuning this nebulous system of intermeshed tools that I don't fully understand yet.

Seeing that you and others can't stay ahead of the curve, can ASF Infrastructure actually stay ahead of the demand? I mean, is there any way you aren’t constantly in a reactive mode of “this new thing we're responding to, or here's a new part.” Can you get your house in order, or is the house in order?

At the ASF, especially Infra, we do a very good job of listening to our projects because we as individuals cannot stay ahead of the curve *and* have every good new idea that there ever was to be had. Our community is large, and our community is very smart as people and as a group. We have a lot of really excellent ideas that come in from tickets and you say, "You know? I think I'm going to look into that today." And you look into it. You realize that it has all this potential and suddenly, that's the service that we're now using. Some things, like Travis, which is a third-party build validator, came to us in that way.

Since I've been here, some of them have come to us via tickets, where it's been, "Hey, I saw that GitHub has this new thing, you should check it out." So one of us will check it out and we’re like, "Dude, that's awesome. We should use that." I think that we're constantly being batted in front of the curve by a community, by a boots-on-the-ground community that knows what's up. We obviously have our own interests and our own passions, but I don't think it would look quite the same if we were left to our own devices and Apache TLPs couldn't put in tickets.

So it's been one year and one month, but how has Infra changed for you since you've come on board or has it changed?

Nope, still terrified. [chuckles]

How is the team coping with the ASF's unstoppable growth? We have 45 projects in the incubator and there are more than 300 projects out there … there's a geographic influence now on demand, an increase in users and committers and projects from China, for example. Are there any issues where the team feels like, "Oh boy, we've got to deal with this"? Is computing an international language, where it doesn't matter where you're from or what's happening? Are any shifts going on from the ASF’s growth impacting you guys beyond more of what you're already doing?

So, typically, all of my jobs really have been this kind of larger, national or international affairs so basically, since I was 20. I worked for a really large mortgage company, and then I left there and I went to a massive health insurance company. Lots of international folks and so, aside from the language barriers, yeah, I would say that computing is kind of an international thing. As far as the unlimited growth, I don't really know. I'm not sure. That sounds like a question that I would definitely advise you to go ask one of the board members about.

"Management."

Right: “Management”.

You had mentioned that you were working on the no-longer-CMS project. Is there another project that you're doing? Are you a go-to guy for something?

I don't think I'm the go-to guy for anything really. I just try to pick up whatever is there to be picked up. One of the things that I'm working on right now in the “demise of CMS” project is this custom builder. I'm still working on it, so it’s still a work in progress, but the idea is that you'll be able to have a custom build environment that would allow you to, from the ASF.YAML file, write a script, do a “thing” to create your own custom build environment so that we can really, really make a hardcore concerted effort to get off CMS.

Why? What was the issue with CMS? Why do we have to migrate from it? What was the problem?

To be honest with you, I've never actually used CMS. Fortunately, I have never been asked to. John (former Infra team member John Andrunas) was, but I was not. I was spared by the CMS gods; they shone their countenance upon me. It was pretty awesome. From what I understand, it's very cumbersome to use and not very friendly and also very old. My understanding is that although it works, there are changes we wish we could make to it that we cannot, so it might be time to just move on to something newer that maybe works a little bit better for us because our use case has changed.

You're still rather new to the role: when you first came on board, what was the biggest challenge or surprise? What really opened your eyes?


So, what really opened my eyes was how much of a learning curve there is. Man, that was rough.

Is that still the case?

Yes, that's still the case. It's just not as bad as it was. Where I was before, I was using all of the stuff that we're not using here, all the Enterprise Edition stuff. So I came in with a completely different toolbox than what I was handed, so the learning curve was massive. I had to relearn how to use the automation software. We were all Splunk, so I had to learn the ELK stack stuff, and we were Ansible, or they were Ansible, and the Foundation is using Puppet. Just all of it, down to the monitoring. We didn't have any third party monitoring because, “government”: we had this really unfathomably convoluted Xymon setup, which was interesting, and we were using RCS for everything, instead of git or Subversion or even CVS.

Yeah, they're stuck with their legacy, that's for sure.

Yeah. You got text files in there that have got 10,000 versions in RCS. It was like, "Oh, my God. What am I going to do with this?"

So, I tried to implement some of the new hotness there. The git workflow, gitflow, actually, the exact same kind of thing that we do here.

I had a good understanding of how ASF did business from an operational standpoint. I understood it, because I've helped implement it elsewhere, but this is the first time I've ever been fully immersed in the river of PRs and tickets and all that other stuff, so it's been a hell of a learning curve, like it has really, really kicked my butt.

But you're kicking it back. I mean, you're here. You're making it work.

Oh, yeah, hustle, man. That's really all you’ve got to have is hustle.

As you're describing the way the ASF is and you were talking about some of the tools and the orchestration requirements, is this a common direction that infrastructure in general is heading in today, or is it an anomaly, not only from your personal experience but from the way you see the industry? Does “infrastructure” in general seem to be headed in this direction, or is ASF really a unique animal in that way? Do people really have to be more jack-of-all-trades?

So the ASF is a unique animal. It is. Typically, people don't have 11 Cloud providers and if they do, they've usually got some sort of system underpinning all of that, whereas ours is tribal knowledge and text documents. We're really trying to get this knowledge codified, and our technical writer Andrew Wetmore was really doing a kick ass job with that. But, yeah, typically an infrastructure team of this sophistication would probably have a different set of tools.

It's surprising that we're not using tools like Vagrant and Packer and Terraform, which abstract away the way Cloud providers make VMs. We still make them by hand. It's work, and really the only way to be good at that is to know what you're doing and to be confident in that particular UI, which is always its own special kind of awkward, trying to get used to a new UI, finding out where all the options are, and we're doing all these things by hand … everybody just picks up this knowledge through osmosis, just by stumbling through these tickets from time to time, and it's really crazy to see sometimes how much process there is and how little documentation there is. So I'm really happy to have our documentation writer on board.

That's Andrew, right? Andrew Wetmore is working on the documentation?

Oh, yeah. Yep, and he's doing a really good job, helping us sort it out.

And he hasn't left screaming and running either, so that's a good sign. It's a lot of work.

That's true. Yeah. It is. It is a lot of work and he has not left running, but he is a really chill dude.

Our infrastructure is unique in that we do all of the things that are kind of necessary. There really isn't too much of a go-to guy for any of this stuff. If there's a problem in the build system, you take care of it. If there's a problem with a Web server, you take care of it. That's where the autonomous nature of Infra comes in. If there's a problem, you just take care of it. You have these tools, you know how to do it, you just do it.

How do you know that someone's not fixing it on their own at the same time? If something's broken, you're like, "Hey, this is broken. I'm dealing with it" or something else?

Just Slack, typically. I always check.

Yeah. Okay, what's your favorite part of the job?

Oh, gosh. My favorite part of the job is not feeling icky at the end of the day. I've worked for some companies that kind of made me feel a little ick in their mission. So one of the stories that my wife likes to tell is that I quit [MEDICAL INSURANCE COMPANY] because I disagreed with them as a company and I paid $5,000 to do so. But yeah, so I worked in the mortgage industry a little while shortly after the housing collapsed and I just thought about it. It was like, "Man, I really don't feel good about this job anymore." And then I moved to [REDACTED], which was arguably a bad move.

Big Health.

I was there for like 11 months. I signed a contract, I got a sign-on bonus, I moved to get there, so the stipulation was I stayed a year. I stayed 11 months and three weeks and I quit. I couldn't take it anymore. I'm just like, "I'm not doing this. I'm not doing this."

I was working on an image parser for the Affordable Care Act pipeline, which was awful. They were still implementing it. This was 2012, 2013.

It was really bad. So after that, I went to NASA and I finally felt good about what I was doing and to have made a move where, again, I agree ethically and morally with what we're doing. I mean, it really is noble work, not specifically the work that I do, but the work that the people that I support do, and so, by proxy, my work is also.

At Apache, we have volunteers that dedicate hours of their life to these projects that we distribute freely because it really does make the world a better place. I mean, where would the world be without HTTPd?

What you just said right now has totally touched me. I feel like I’m ready to burst into tears, that's amazing. Really: I mean, wow. That's from the heart. I totally get you about doing things for people you don't believe in. That's so hard.

That sucks so much.

I totally get it and you're right. This is such a crazy group. It should not work and it does, and it's incredible: 21 years of this. It's amazing.

Yeah, it's like trying to watch an eight-legged horse run.

[laughing] A what?!

An eight-legged horse. Somehow twice as fast, but you have no idea how it's working. Or which direction it's going to go.

I can’t stop laughing over the visual of that.

It's actually really funny because I'm a huge classics and mythology nerd. Technology was not my first choice in careers. I wanted to be a Latin teacher.

I love this. These are the backstories that everyone wants to know. You want to be a Latin teacher?!

I wanted to be a Latin teacher, yeah. I did Latin from freshman year in high school until I decided that college wasn't for me, so, sophomore year. I took six years of Latin, and it is really awesome what learning Latin does for your programming ability because it’s surprisingly similar to learning to code. But yeah, I make a lot of really, really stupid classics and mythology puns. So my daughter, her nickname is actually Livy, in reference to the famous historian, which is not something a lot of people get, but that's okay, it makes me chuckle. And Odin had an eight-legged horse that was twice as fast as the other horses, supposedly really fast because it had twice as many legs.

It's interesting with your career, you've worked at places that are big names and people would be very impressed with that, but you're stressing that just because it's a big name or big group, it's not what it's all cracked up to be. What are you most proud of with your career, your Infra career, with Infra as a whole? What makes you say “yay”?

To be honest, becoming an Apache Member was pretty freaking awesome. When I start a new job, I always try to set a goal for that job. Sometimes I get it and sometimes I don't, and sometimes I don't realize how hard it is to actually do what I'm setting out to do when I start. My goal at NASA was to win a Silver Snoopy, but that was never going to happen.

Silver Snoopy? What’s that?

That's an award given by astronauts to engineers. They don't typically give that to IT folks, but I didn't know at that time.

But here, it was to kind of become a Member and really to be accepted. I feel like I'm doing okay on that. That's pretty cool. That's going along really well.

You fast tracked. I mean, if you've been here for 13 months and you're in as a Member, that's pretty cool. That's good timing, good performance on you.

Well, thank you. I have no idea of how well or badly I am doing. I'm just doing things in the hope that they affect the universe in a positive way.

You're there, we couldn't do it without you.

That's excellent. Thank you.

You got to pat yourself on the back for the work that you're doing, because with our community, you know if you weren't doing it, you'd hear it. People would grump about it.

That's true. That's very true. But again, there's a mindset that's really prevalent in IT: the Tetris mindset. When you're playing Tetris, you fill up a row and it disappears. Those are your successes.

The Tetris mindset really is being bogged down by the monument to failure that you've built because really, when you're playing Tetris, that's what you're looking at is the monument of your failure, places you haven't quite gotten the row completed yet and shifted out of your bucket. And it's really easy to succumb to that mindset, especially in a place like this.

And I really, really enjoy the fact that the Apache Community seems eager to call out wins for other people, and that is an awesome attitude for a community. It's something I've not experienced a whole lot of, being called out for successes. I think that on the whole, the community and being embraced by the community has really kind of helped me not fall into that funk. That Tetris mindset just doesn't seem to be prevalent in this community, which is nice.

Do you think that puts people in a kind of "I'm not good enough" mindset because there's not a reward? You're young enough to be part of that community that likes or is accustomed to getting trophies for showing up. Apache doesn't allow that. It's nice for you to show up, but you're not going to be rewarded. Do you think there's an impact with that?

I was on a soccer team once and I did get a participation trophy. You know what? I couldn't even tell you what the name of that soccer team was because I didn't want to play soccer. So, really, I think that if you're coming to The Apache Software Foundation, you're not doing it for the participation trophy, you're doing it because you want to, so the reward doesn't matter. You're doing it because you want to. It's really weird to be surrounded by people who are motivated by nothing other than the fact that they want to be here doing this.

And it's refreshing and I love it. I do.

I love hearing that, that's great. Here come the somewhat personal questions: there's just a few of them. Chris was laughing hard when I was asking them; I don't know if you read the full Chris interview, but it's always interesting to hear what they have to say. So ... how would your co-workers describe you?

Less cool than my wife.

What is your greatest piece of advice... what would you tell aspiring infra people, sysadmins, people like yourself, what would you give them for work advice or career advice or life advice: what would you say?

Oof, that's tough. I guess I would have to say that if at the end of the day you don't feel like your job is worth it, it's probably not.

So, if you're going to do something, make it worth it. That's my advice.

If you had a magic wand, what would you see happen with ASF Infra?

What would I see happen? Well, obviously bonuses and pay raises, but I have no idea. If I had a magic wand, I'd probably turn it over to someone who I thought could make the wish better than I could, but yeah, I have no idea.

What else do we need to know that I haven't asked?

Oh, gosh. So many things, but none of them would make sense out of the context of this particular conversation. To be honest, I'm still under the impression that everybody knows more about this than I do still, so I don't know.


Drew is based in Tennessee on UTC -5. His favorite thing to drink during the workday is a black coffee prepared using a French press or the pour-over method.


# # #
