The Apache Software Foundation Blog
Inside Infra: Andrew Wetmore -- Part I
The "Inside Infra" series with members of the ASF Infrastructure team continues with Part I of the interview with Andrew Wetmore, who shares his experience with Sally Khudairi, ASF VP Marketing & Publicity.
What is your name and how is it pronounced?
I'm An-drew Wet-more. The "Wetmore" is like a rainy day, very easy to pronounce.
When and how did you get involved with the ASF?
I was a Flex and ColdFusion developer. When Flex came to end-of-support with Adobe and they passed it over to Apache, I followed along. I wasn't an active committer: I was a participant in the Apache Flex project and contributing in my little ways here and there. Then when the Apache Royale project split off Apache Flex, I went there, but I was not an active, not a heavily significant contributor. That is, I was helping with documentation, a bit of testing, a bit of organizing and helping. I was truly surprised when I was invited to become a Committer. Then at some point, somebody noted on the Apache Royale list that the Apache Software Foundation Infrastructure Team was looking for a documentation person. I thought, "Well, that's interesting. Maybe I would be able to contribute to that."
I followed that up. I wrote to (ASF Infrastructure Administrator) Greg Stein and introduced myself and said, "Oh, I'd be interested if this is something that's happening." Then for one reason and another, nothing happened for quite a long time. That was fine. He told me nothing was going to happen for a while. He was migrating some monstrous mountain of something. Then when that long time was up, I pinged him again and said, "I'm still around if that's an interesting possibility," and we got talking. He did that wonderful interviewer thing of saying, "Well, if you were going to hire someone for this sort of a job, that has this heading, what sort of job description would you write?"
He made me write the job description. I thought, "this is cute: I'm happy to help. I don't know what person not me is going to get this job, but I'm happy to write what I think is a good job description for this thing." Truly, I really expected this to go out and a whole bunch of people to apply for it and that I would get a participation trophy. I was very pleased when I was invited to join the team.
… You got the real trophy.
Yes, I did.
You got involved when Flex came to Apache, so that goes back to 2011. You've been with the Foundation for nine years or so?
I was aware and downloading builds as soon as there were builds to download and participating. I was still building my own Flex stuff, but I don't think I was really contributing significantly until around maybe 2015. Then I didn't become a Committer until 2018.
The other things you were doing prior to Infra were limited to Apache Flex and then onto Royale?
Yeah, I had a glancing awareness of Apache. Without even thinking about it, of course, I was using Apache tools like Apache Tomcat packages, but I really had a distant but benevolent appreciation of Apache until I started to get more and more involved with Royale and began to understand from that angle all the things that the Foundation does to support these little projects that could not survive without it. Of course, now that I've become part of the Infrastructure team, I'm awestruck by the amount of work that the team does to support all these little projects, so they can do their thing.
It's interesting with Apache projects because they're mostly ingredient brands versus a customer-facing final product. Of course we do have those too, but the majority of them power something else. A lot of times people aren't aware until they're in it: then they’re like, "Oh, wow, Apache is everywhere."
Well, I keep trying to improve myself and I go and choose a project at random and read its homepage and its "about our product" thing and see how far I can get before I've hit five things that I don't understand at all. I don't even understand what I would do with the thing that I do understand, which is not a knock on those projects. It's what you just said, they're not end-user-facing. I, as a Flex developer, was a Flex developer. I was using Flex and now Royale to build other things, not getting in with the toolkit and adjusting and tweaking Flex or Royale.
Right, like a commercial product. Explain your role within the Infra team. What is it exactly? What's your title? When did you get there?
Let's see. It was at the very end of last year. I'm coming up on 11 months on the team. I started two weeks before the end of 2019. My title is either editor/writer or writer/editor. My job is ... well, the situation of the Infrastructure team from the point of view of written material, of documentation, is that for the past 20 years people have been doing the very best they can to document what they do and how people should do things. They've been adding to that material and adding to that material. We have a built-up pile of stuff, some of which is no longer relevant, some of which contradicts other bits, some of which is written very provisionally for something that we now take for granted that we're always going to use.
My job is to go around with my little broom and dustpan and clean up things. I go and find a page in the Infrastructure documentation and do what I would do when I visit project pages. I start to read it. When I get lost, I start to edit it or I start to ask questions to the team and say, "Is this even a thing anymore?" The team sometimes says, "Oh, we were doing that in 2011, was it? I forget." We know that that page can go into history. There are other times when I run into things, there are many pages, far too many pages where there's a sentence like, "At this point, such and such is true." It doesn't say anywhere what "this point" is.
Then when I dig in a little bit, I find out it was again back several years, where either the conversation came to a stop or the conversation continued and whatever that uncertainty was on the page has been resolved. Now I'm able to update that page and say something that's more useful to the visitor like me who's coming now and doesn't want to read a historical document, so much as get help about doing the thing they want to do.
Right: "what does that mean for me?" This is the legacy dilemma.
Sure. Well, it's also a factor of people trying very hard to do too many things all at the same time. Let's just write enough explanation so people can get going, because surely they'll understand what we mean. That's often true for probably the vast majority of people who are active in an Apache project. The pages that stump me don't stump them.
We have this issue of tribal knowledge not scaling. Part of it is that people are moving on. Part of it is that processes are changing, technology is changing, people are forgetting about pages. You're in a very interesting position. If I'm understanding it correctly, it's trying to reconcile the intention of what's historically been in place with what people are doing now. Even establishing relevance and trying to find out who's the de facto source to go to for this, considering the number of contributors, is incredible.
There's another piece in this. I'm trying to come at my review of pages from the point of view of a person of good intention for whom English is not the first language. I'm trying to think, "What on Earth would one of my colleagues from Laos understand from this sentence that has a jokey little acronym or aphorism in it?" Then I try to figure out how I can say that more clearly and not in a boring way, but in a more accessible way. Our documentation tends to have three flaws. One is to write in an academic way. One is to write with a heavy use of acronyms. One is to use a heavy serving of passive voice in the material.
Let me explain what I mean by those three. The academic style: so I have a page here that's going to tell you about X. If I'm an academic, I might write, "The intention of this page is to explain the modalities by which a user might accomplish X." As the reader, especially the reader for whom English is not the first language, I'd like to see, "Here's how to do X."
The second one is the acronyms. Even acronyms that should be, you would think would be obvious to everyone. "The PMC of a TLP." It's like reading Hebrew where there are no vowels, but a simple rule is to always spell out what the thing is and put the acronym in parentheses the first time you use it in a document. It's easy to forget if we know the acronym so very well ourselves.
The third thing is passive voice. Passive voice correlates strongly to the academic approach. It has a way of hiding who does what. "After the file is uploaded, it is passed to the server and verified." I don't know who does any of those things. If I'm the owner of that file, I don't know if I have to do any of those things or if I just sit back and watch them happen.
Sometimes when I ask—I actually enjoy doing this because it is my naïve or newbie character—I go to the team and I say, "Here's this sentence. Who is doing this?" In the conversation about that sentence, sometimes the team discovers that there's a difference of opinion about what part of the Infrastructure does it or whether someone has to do it. We might end up not only fixing the text but improving the process.
What I'm hearing, it's beyond just writer/editor. It's writer/interpreter.
I came into software through the QA door. Over 15 years, I ran teams that were generally documentation and quality assurance, documentation and testing. I think the two are very tightly connected. When I go to edit a document, to the extent that I'm competent to do so, I test it. If it says, "You go here and do that," I go over there and look, is the thing that I can do available there?
It does what it says on the tin. It doesn't correct that.
Well, we have innocent blind alleys that are built into our documentation out of good intentions. Pages written about our software repositories, back just a few years, presumed that everything happens in a Subversion repository for projects.
For projects, almost everything happens in Git repositories now and the instructions of how to do something in the two repositories may be different. We need to go back and find those pages and make the path comfortable for someone who couldn't care less about Subversion, but they have to do something in Git.
Let's talk about the scale of what you're working on. How many pages are you handling?
Oh, good Lord, I don't know. I started out within Infrastructure itself, looking at several packages of pages. There was a set of pages on the Apache Website under the subhead dev, a set of pages on the Apache Website under infra.apache.org. There had been a set of pages under something called reference. There's a set of documents in a Subversion directory. Then there's a very large Infrastructure section of the Confluence wiki. One of the first things I had to work through is how did these relate to each other? How does anyone find their way? I proposed and the team seems to have accepted that the pages at the apache.org/dev area are the introduction. Here's what someone trying to figure out what goes on with dev and possibly the Infra team would need to look at.
The next step in is the infra.apache.org area. If you can't find what you need there, you follow the links through into the wiki. God help you if you have to go to the Subversion repository. Really, what should be in the Subversion repository is only, it seems to me, the instructions for restarting or rebuilding the Confluence wiki. I am gradually moving other stuff out of the Subversion repository.
This feels like a byproduct of individuals or projects or committees or communities actually having this "scratch your own itch" issue, right? "Hey, we need to document that somewhere on our thing and here's our particular experience." Apache is open enough. You can do that. The whole directive of integration or coordination or making sure that one hand is speaking to the other has never really been a mantra of ours. It's interesting to see as we're scaling, it reflects directly on what you’re experiencing.
There's the thing. If it were a team of six people in a room or at a project level, it wouldn't be a problem figuring out where to put what or how to fix a documentation collision. When you're talking about hundreds and hundreds and hundreds of people, some with decades of experience, literally two decades of experience in the Foundation, someone bopping in brand new can bring things to a standstill. People can get lost or disheartened.
The disorientation is very common. I hear people going, "How do I find …?" "Where do you start? Where do you go?" It's great that there's assistance.
In early January, I started with some thoughts. "I'll be a new person." "I'm going to read the apache.org site and go where it tells me." I got so depressed. I could not understand how the information on one page related to the information on this other page that I was looking at. There seemed to be two sets of instructions for doing many things. Part of my fun is trying to make it easier for people who come in: you have to help them find their way without giving up the whole thing and throwing the computer out the window.
Are you still at the audit and discovery stage or are you actually at the rewriting stage also? Are you in course-correction or new content?
I've moved almost everything out of the apache.org/dev area that shouldn't be there. What is left is introductory material. I've edited every page on the infra.apache.org area. I've edited almost every page, I think, on the wiki. I'm still digging around in the Subversion thing. I am doing a long march through the top level apache.org pages. I've been in most places, but I'm not done. Beyond this, I guess there are some areas I haven't touched or I need to ask permission. For instance, the Incubator has a set of interesting pages in which the material is clear, in large part. There are some suggestions I'd like to make for those pages, but I don't have any mandate to go in there and start changing things.
What I would do in that situation is write up a little sample report, "Here's how I would suggest changing this page," share it with the Incubator team and say, "This is yours to do as you'd like. Would you like some more?"
… Chances are they'd say yes.
Well, indeed, but I don't want to assume that. I know how I would feel if I turned around and found out someone was changing all my sentences.
I might very well feel that someone who had not passed through the Incubator wouldn't have a good grip on what needs to be said.
It's really important for us to have these fresh eyes with respect to what the outside world is seeing: people who are new to it, how is their interpretation or misinterpretation? Again historically, there's been this issue of, "well, it's obvious". Not only have Apache and its communities evolved, but Open Source has evolved. The expectation is very different. Similar to what you were saying before with Subversion and Git, it's a completely different space now. We have to grow with that. I think again because we're not a corporation, we don't have these marching orders of "go, bring it up to speed or bring it in alignment." It's great that you were there to audit, align, and course-correct.
There are pluses for this that I had suspected but hadn't been sure I would find ... I'll give you an example. One of my projects is about Apache's content management system that's nearing end-of-life. We have to migrate all the projects that use that content management system to generate their Project websites to some other technology. We've been working on this for over a year, I think, but there were 40 or 50 projects that hadn't gotten started on that migration. I started conversations with them, saying, "Hey, are you going to move? What do you need help moving? What do you need to know?"
As I was getting feedback, I was able to improve the documentation we provide on the wiki on how to migrate your project off the CMS. Along the way, I've met some really interesting people and am having fascinating conversations with people deeply engaged in Projects about whose output I know nothing. It's a lot of fun because these are very, very smart people. They're doing really significant stuff. I want to make it as easy as possible for them to turn from that highly significant stuff to this rather mundane thing of moving the way they built the Website from the current way, which is creaky but they know it, to a new way.
We long have had this "do your own thing" culture: no one's telling anyone that there's one official way to develop your Website. Hundreds of Apache Projects are developing their own Websites their own way. No doubt for some that are using the current CMS, there is an opportunity to offer them a different development direction versus a giant arm sweep stating, "We're going to pull down 300 project sites and rebuild them all at once," as would be done in other organizations when they choose to rebrand or upgrade their CMS or backend. It's like little mushrooms popping up where everyone is producing their own site at their own pace, using their preferred tools. It's very, very interesting.
It's educational to me also because the Infra team has a series of recommendations, "We really recommend you go to this technology to build your Website." The subtext is because that gives you the most options, the most flexibility, and means Infra has to do less to hold things together, but then we have other Projects that say, "Oh, no, we don't like that. We really would like to use this." The Infrastructure response is, "Show us how you can possibly use that in a way that matches these requirements we have." For instance, that the landing page for the project Website has to be a thing that can be branded as [projectname].apache.org and hosted on our servers.
When people, as a project, demonstrate, "Oh, they can use this technology" that we had not thought of, then we have the documentation for that, and that might encourage some other project that doesn't like the vanilla package we're providing to migrate using this new thing. We're down to about 20 projects, I think, that haven't really gotten very far on their migration.
… Is there a deadline for that?
At some point, the content management system is just going to fall over. We'd like to get everyone out before that happens. We set the end of the year as, "Let's do this before the end of this year," but there's not a switch. There's not the end of a license or something like that that's going to happen. We have a little bit of wiggle room.
… We're not pulling the plug, so to speak.
No. We're not pulling the plug, especially since treasurer.a.o. hasn't moved and we wouldn't want to annoy them.
Right. Are there additional responsibilities that you take care of?
Well, what I just described about helping or encouraging teams to migrate was not part of my job description. I just saw something that I could do that involved being engaged with projects to get them on the path, leaving the other team members available to do things that I can't do. I'm the least technologically savvy person on the team. I might as well do the stuff that involves words and interaction.
What's the process of sorting through 21 years of ASF history on apache.org? How far along are you? Is this a never-ending project or is there a specific milestone that you want to hit to say, "Hey, okay, we're done"? Is there an end in sight to this or …?
I think we'll get to a point where we'd say, "We're pretty well caught up. Now what?" That could happen within the next couple of months, but then remember, at that point, the process of doing doesn't end. Where new material is being created, technology changes. We're migrating server things from one kind of server to another kind of server. We have to document what that new server does. Git for instance, or GitHub, I guess, has provided a couple of new options for things projects can do. The Infra team has to learn how to support those, then we have to document them and help teams understand how they can use them for their benefit.
As long as the Foundation keeps doing stuff, the same problem of uncurated information silting up will recur. Hmm. That's my lifetime employment plan.
[laughs] Going to apache.org/dev, how did you decide where to start? Were there any active fires that you were told you had to put out, or was it more a bunch of low-simmer, "We'll get to that someday," types of sections of the site? Again, it just seems so much like a Medusa situation. How did you decide to divide, conquer, and get started?
For apache.org/dev, I just started at the top file or the top link that said anything about dev and went into it. "Why is this out here? This looks very much like the same thing we say over here in infra.a.o. Why are we saying it twice in two different places in two different ways?" I started pretty much by grabbing anything. It's like the way you might go into your grandparents' attic when they're downsizing to a smaller house and you're going to help them move. All you can do is pick up the first box and see what's in it and give the best guess about where that should go.
… Then there's those people who just grab it and just donate everything.
… Not even looking through it, they're just purging and starting afresh.
Fairly early, I tried to elaborate that tiered idea of Infra information, so that if you land on dev, you're getting high-level stuff. If you go to infra.a.o, you're getting more thorough stuff. If you need to, you can go over to the wiki to get code snippets or very detailed instructions. If you're an Infra team member, you go over there and get stuff. Only in the direst need do you go down into the Subversion repository.
Before, everything was mixed around. Where the most essential and the least essential stuff was, was not consistent or logical.
Have you had to learn about the Apache Way of community-led development or other processes in order to get the job done? Even if they're talking about a technical thing, you're testing it out. Are you kicking the tires along the way saying, "Okay, this doesn't make sense," or are you not at that stage yet in terms of content?
I'm doing a fair bit of tire kicking. Of course, as a participant in the Flex and in the Royale projects, I've engaged myself to understand the Apache Way from them. The PMCs I work with modeled the management and development style of Apache. I learned it organically. I'm not seeing a conflict between what I learned on the Flex and Royale teams and the larger Apache Way of doing things. I think that's really good. You stumble into a small project with a very minor, very focused goal to do this thing, this bit of technology. You take in through your skin how to make decisions and how to share information and how to support each other.
… Continuity for the win: that's good to hear. What kind of influence do you have on content development? You said you're adjusting a page if it's not saying what it's supposed to do, but beyond that, are you saying, "Look, this really needs to take a different approach"? Are you deciding on your own? Is there a review committee that has to oversee every edit or is the process completely autonomous? How do you know what you're writing is factually correct? Who signs off on that?
The Infra team is in constant contact, 24-hour chatter every day on the Slack channel. There's an asynchronous conversation going on. When I run into something I don't understand ... Well, there are two things that happen. I can suggest things there that might be useful, but also when I notice people discussing something that's new or something went wrong and what they have to do to make it right, I often say, "Is that something we should write down, do you think? Where should we write it down?" That begins the conversation about documenting whatever the thing is.
One of the first things I created on the wiki page, the Infra wiki site, is a page for me called the job jar. Each time I come up with something that has to be written, I start a new item on a checklist and write in what that thing is to the best of my knowledge. Then, if I can't see any way to write because I don't have a clue what that thing is about, I go to the Slack channel or I go to the team meeting, which we have every Thursday, and say, "Who can help me write this? I just want you to blurt out the facts and then I'll turn it into pretty language." I can't direct that we have to write anything, but we work interactively.
If I write a new thing, I post it on the Slack channel. Someone will come back and say, "Well, you totally missed this thing. Here, let me fix it for you." We go back and forth like that until it's ready to make available to the larger public. Greg, of course, keeps a close eye on me, so I don't accidentally delete everything.
We review regularly what needs to be added or what can be sliced away because often, if you say less, you can communicate more.
Going back to "delete everything", when I first joined W3C 25 years ago, I remember making copies of everything because I was terrified that I was going to delete the Web's original history. There were thousands of legacy pages. Do you do the same thing? Do you make copies of things and edit those, then just do merges? How do you actually do that?
I have a strong reliance on the team's guarantee that everything is version controlled. Actually, I'm more shy about changing things than they are about encouraging me to do it. They just said, "Go ahead and do that. We'll fix it later." In that sense, I'm truly not afraid of deleting everything. I am afraid of inadvertently causing annoyance. I have an example: when I first started to move pages from the a.o/dev area to the infra.a.o area, some of the pages I wanted to move had titles that didn't really match what was in the document. God, I've got to improve this. I changed the name of the file at the new location. Then that was a pain because how do you redirect from the old location to the new location?
I learned very quickly that I was causing trouble for my colleagues, but beyond that, I was causing trouble for people on projects who might have a link on their page to an Infra page. I really don't want to cause an information barrier because, in my mind, I'm making things more efficient. On the a.o/dev area, there are all sorts of pages sitting there now that are just stubs or just shells of their former selves. If you click on the link to go to that page, there's a little gear-cranking and all of a sudden you're over at the same page at infra.a.o … most of the time. Sometimes it just does not work and then I have some sad people.
It's interesting you were saying about not wanting to upset people, but I think this is actually a parallel with good documentation and good data management. It becomes unobtrusive and a natural byproduct of your experience online. The whole point is you don't want to say, "Hey, there's some underhanded entity there that's controlling it." It's natural in terms of what you're seeing, what you're reading. In terms of comprehension, it's great UX. It's a very interesting comment that you made about you not wanting to ... this "do no harm" approach: the outcome is very positive.
If you want to make a really highfalutin image, we're surgeons working on something together. There's a thing going on the table there that's going to ... Things are going to go bad if we don't do our job well. If I go moving around where the implements are that we're going to reach for on the tray from where they normally are just because I think they should be alphabetical or something, things are not going to go well for the patient.
… Someone might even die, right?
Fortunately, there's a limited amount of trouble I can cause because I'm not turning the nuts and bolts on the servers. Not yet.
But I am ... You asked earlier what sort of, I don't know, influence I bring to bear. I'm in there asking questions whenever I can understand a question to ask: "Shouldn't we update this list here of the servers? This doesn't look like it's been updated since 2014. Shouldn't we make this list more accessible to the people who have to look at it?" That makes it sound like my colleagues are bumbling along and inattentive. They're very attentive. They need to document what they're doing, and they're very patient with me when I get fixated about a semicolon while they've done everything else right on that page, except that damn semicolon.
It's important. Both parties: that's a good dovetail of talent, right? You're talking about a page that hasn't been touched since 2014. We have pages that have been untouched since 2001. I'm sure you're coming across them.
Here's a situation that probably is of low impact, except when it has high impact. I've been reading the memorial pages for past committers. I got to a page that said many kind things about the person, "who died in a car accident this last week". This "last week".
… *When* was that, right?
Yeah. I'm making little reports on those pages and the people who have ownership of the pages have to decide what to do with those reports. I'm not going in and changing those pages, but I suggested, "Let's figure out the year at least, maybe the month and make that more accurate, so someone like me now visiting this memorial page about a person who died before I joined Apache can understand what happened." In some ways, that's important for remembering and honoring the people who have been with us and are gone. There is more painful stuff when we haven't updated something or we've left a sentence that says, "As of this writing, so and so is the case, but I don't know if it's going to be that way for long." Again, there's no date. I think it's scary.
… There's no frame of reference at all.
Exactly. It makes the whole thing provisional. We have, under this COVID-19 crisis right now in the province of Ontario in Canada, a very complex Website that purports to tell you if you're in the city of Toronto what you can do in different parts of Toronto, what the lockdown level is. At the very top of the pages, it says, "Latest information."
If you go in there, it's three months old. The latest information is elsewhere in the page. To me, it throws the whole thing. If I'm someone who's trying to find something out from that site, I tend not to believe any of it.
[END OF PART ONE]
Posted at 12:26PM Nov 29, 2020 by Sally Khudairi in SuccessAtApache