The Apache Software Foundation Blog

Thursday June 04, 2020

The Apache Software Foundation Announces Apache® Hudi™ as a Top-Level Project

Open Source data lake technology for stream processing on top of Apache Hadoop in use at Alibaba, Tencent, Uber, and more.

Wakefield, MA —4 June 2020— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today Apache® Hudi™ as a Top-Level Project (TLP).

Apache Hudi (Hadoop Upserts Deletes and Incrementals) data lake technology enables stream processing on top of Apache Hadoop compatible cloud stores & distributed file systems. The project was originally developed at Uber in 2016 (code-named and pronounced "Hoodie"), open-sourced in 2017, and submitted to the Apache Incubator in January 2019.

"Learning and growing the Apache way in the incubator was a rewarding experience," said Vinoth Chandar, Vice President of Apache Hudi. "As a community, we are humbled by how far we have advanced the project together, while at the same time, excited about the challenges ahead."

Apache Hudi is used to manage petabyte-scale data lakes using stream processing primitives such as upserts and incremental change streams on the Apache Hadoop Distributed File System (HDFS) or cloud stores. Hudi data lakes provide fresh data while being an order of magnitude more efficient than traditional batch processing. Features (with a brief usage sketch following the list) include:

  • Upsert/delete support with fast, pluggable indexing
  • Transactional commit/rollback of data
  • Change capture from Hudi tables for stream processing
  • Support for Apache Hive, Apache Spark, Apache Impala, and Presto query engines
  • Built-in data ingestion tool supporting Apache Kafka, Apache Sqoop, and other common data sources
  • Query performance optimization through managed file sizes and storage layout
  • Fast row-based ingestion format with asynchronous compaction into columnar format
  • Timeline metadata for audit tracking
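
For readers who want to see what this looks like in practice, here is a brief, illustrative PySpark sketch of an upsert into a Hudi table. The table name, field names, and paths are hypothetical, and the option keys follow the Hudi Spark datasource documentation; treat this as a sketch rather than a definitive recipe.

```python
# Illustrative sketch of a Hudi upsert through the Spark datasource.
# Assumes Spark with the Hudi bundle on the classpath; paths and field names are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("hudi-upsert-sketch").getOrCreate()

# A hypothetical batch of new and updated records.
updates = spark.read.json("/tmp/incoming_trips.json")

(updates.write.format("hudi")
    .option("hoodie.table.name", "trips")
    .option("hoodie.datasource.write.recordkey.field", "trip_id")      # key used to match upserts
    .option("hoodie.datasource.write.precombine.field", "updated_at")  # latest version of a record wins
    .option("hoodie.datasource.write.operation", "upsert")
    .mode("append")
    .save("/data/lake/trips"))  # HDFS or cloud store path

# Downstream jobs can then pull only the records that changed since a given commit
# (incremental queries) instead of rescanning the whole table.
spark.stop()
```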

Apache Hudi is in use at organizations such as Alibaba Group, EMIS Health, Linknovate, Tathastu.AI, Tencent, and Uber, and is supported as part of Amazon EMR by Amazon Web Services. A partial list of those deploying Hudi is available at https://hudi.apache.org/docs/powered_by.html

"We are very pleased to see Apache Hudi graduate to an Apache Top-Level Project. Apache Hudi is supported in Amazon EMR release 5.28 and higher, and enables customers with data in Amazon S3 data lakes to perform record-level inserts, updates, and deletes for privacy regulations, change data capture (CDC), and simplified data pipeline development," said Rahul Pathak, General Manager, Analytics, AWS. “We look forward to working with our customers and the Apache Hudi community to help advance the project."

"At Uber, Hudi powers one of the largest transactional data lakes on the planet in near real time to provide meaningful experiences to users worldwide," said Nishith Agarwal, member of the Apache Hudi Project Management Committee. "With over 150 petabytes of data and more than 500 billion records ingested per day, Uber’s use cases range from business critical workflows to analytics and machine learning."

"Using Apache Hudi, end-users can handle either read-heavy or write-heavy use cases, and Hudi will manage the underlying data stored on HDFS/COS/CHDFS using Apache Parquet and Apache Avro," said Felix Zheng, Lead of Cloud Real-Time Computing Service Technology at Tencent.

"As cloud infrastructure becomes more sophisticated, data analysis and computing solutions gradually begin to build data lake platforms based on cloud object storage and computing resources," said Li Wei, Technical Lead on Data Lake Analytics, at Alibaba Cloud. "Apache Hudi is a very good incremental storage engine that helps users manage the data in the data lake in an open way and accelerate users' computing and analysis."

"Apache Hudi is a key building block for the Hopsworks Feature Store, providing versioned features, incremental and atomic updates to features, and indexed time-travel queries for features," said Jim Dowling, CEO/Co-Founder at Logical Clocks. "The graduation of Hudi to a top-level Apache project is also the graduation of the open-source data lake from its earlier data swamp incarnation to a modern ACID-enabled, enterprise-ready data platform."

"Hudi's graduation to a top-level Apache project is a result of the efforts of many dedicated contributors in the Hudi community," said Jennifer Anderson, Senior Director of Platform Engineering at Uber. "Hudi is critical to the performance and scalability of Uber's big data infrastructure. We're excited to see it gain traction and achieve this major milestone."

"Thus far, Hudi has started a meaningful discussion in the industry about the wide gaps between data warehouses and data lakes. We have also taken strides to bridge some of them, with the help of the Apache community," added Chandar. "But, we are only getting started with our deeply technical roadmap. We certainly look forward to a lot more contributions and collaborations from the community to get there. Everyone’s invited!"

Catch Apache Hudi in action at Virtual Berlin Buzzwords, 7-12 June 2020, as well as at MeetUps and other events.

Availability and Oversight
Apache Hudi software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Hudi, visit http://hudi.apache.org/ and https://twitter.com/apachehudi 

About the Apache Incubator
The Apache Incubator is the primary entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects enter the ASF through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/ 

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 813 individual Members and 7,800 Committers across six continents successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Aetna, Alibaba Cloud Computing, Amazon Web Services, Anonymous, Baidu, Bloomberg, Budget Direct, Capital One, CarGurus, Cerner, Cloudera, Comcast, Facebook, Google, Handshake, Huawei, IBM, Inspur, Leaseweb, Microsoft, Pineapple Fund, Red Hat, Target, Tencent, Union Investment, Verizon Media, and Workday. For more information, visit http://apache.org/ and https://twitter.com/TheASF 

© The Apache Software Foundation. "Apache", "Hudi", "Apache Hudi", "Hadoop", "Apache Hadoop", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Thursday May 28, 2020

The Apache Software Foundation Announces Apache® CloudStack® v 4.14

Mature Open Source Enterprise Cloud platform powers billions of dollars in transactions for the world's largest Cloud providers, Fortune 5 multinationals, educational institutions, and more.

Wakefield, MA —28 May 2020— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today Apache® CloudStack® 4.14, the latest version of the mature, turnkey enterprise Cloud orchestration platform.

Apache CloudStack is the proven, highly scalable IaaS platform of choice to rapidly and easily create private, public, and hybrid Cloud environments: it "just works". CloudStack originated at Cloud.com in 2010, which was acquired by Citrix in 2011. CloudStack was submitted to the Apache Incubator in 2012 and graduated as an Apache Top-Level Project (TLP) in March 2013.

Apache CloudStack includes the entire "stack" of features in an IaaS cloud: compute orchestration, Network-as-a-Service, user and account management, full and open native API, resource accounting, and a first-class user interface.
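
To give a feel for that native API, the sketch below signs and issues a single CloudStack API call from Python using the documented HMAC-SHA1 request-signing scheme. The endpoint URL, credentials, and the use of the third-party requests library are assumptions made for the example, not part of this announcement.

```python
# Minimal, illustrative CloudStack API client using HMAC-SHA1 request signing.
# Endpoint and credentials below are placeholders for a hypothetical deployment.
import base64
import hashlib
import hmac
import urllib.parse

import requests  # assumes the third-party 'requests' package is installed

ENDPOINT = "https://cloud.example.com/client/api"
API_KEY = "your-api-key"
SECRET_KEY = "your-secret-key"

def cloudstack_request(command, **params):
    """Sign and send one CloudStack API call, returning the parsed JSON response."""
    params.update({"command": command, "apikey": API_KEY, "response": "json"})
    # Signature: sort parameters, URL-encode values, lowercase the query string,
    # then HMAC-SHA1 it with the secret key and base64-encode the digest.
    query = "&".join(
        f"{k}={urllib.parse.quote(str(v), safe='')}" for k, v in sorted(params.items())
    )
    digest = hmac.new(SECRET_KEY.encode(), query.lower().encode(), hashlib.sha1).digest()
    params["signature"] = base64.b64encode(digest).decode()
    return requests.get(ENDPOINT, params=params).json()

# Example: list running virtual machines.
print(cloudstack_request("listVirtualMachines", state="Running"))
```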

"v4.14 is an exciting release for Apache Cloudstack and is the result of many months of collaboration by our community," said Sven Vogel, Vice President of Apache CloudStack. "We are introducing a number of major new features that have been driven by demand by users and operators of CloudStack based IaaS environments. At the same time, we have kept to the project's ethos of having a tightly defined scope and being the platform of choice on which to layer other services."

Of particular note are:

  • CloudStack Kubernetes Service gives operators the ability to deliver CaaS or K8aaS-style services with no change to underlying infrastructure or business processes

  • VM Ingestion gives operators the ability to easily “import” existing VMware environments into CloudStack

  • The new backup and recovery framework allows operators to integrate with any backup platform, giving a seamless user experience from the CloudStack UI/API

"Apache Cloudstack 4.14 ships with a Technical Preview of Cloudstack’s new User Interface," added Vogel. "This presents a new, ‘enterprise feel’ user experience and is earmarked to replace the current UI. We are encouraging all Cloudstack users to explore the Technical Preview and give feedback to the community. Thank you to all of the contributors across our community who have made this release possible."

More than 200 new features, enhancements, and fixes include:

  • New modern UI (Project Primate, Technical preview)
  • Backup and Recovery framework
  • Backup and Recovery provider for Veeam 
  • VM ingestion
  • CloudStack Kubernetes Service
  • L2 network PVLAN enhancements 
  • UEFI support
  • KVM rolling maintenance
  • Enable Direct Download for systemVM templates 
  • Template Direct Download support for Local and SharedMountPoint storages 
  • VR health checks
  • Download logs and diagnostics data from SSVM/CPVM/VRs
  • Enable additional configuration metadata for virtual machines


The full list of new features can be found in the project release notes at http://docs.cloudstack.apache.org/en/4.14.0.0/releasenotes/index.html .

Apache CloudStack powers thousands of clouds and billions of dollars in transactions across an array of organizations that include Apple, BT, INRIA, Royal Melbourne Institute of Technology (RMIT), SAP, Taiwan Mobile, Verizon, and WebMD, among others. A list of some of Apache CloudStack’s users is available at http://cloudstack.apache.org/users.html .

Highlighted in Forrester’s Enterprise Open Source Cloud Adoption report, Apache CloudStack "sits beneath hundreds of service provider Clouds", and is behind numerous elastic Cloud computing services, including those at Fortune 5 multinationals as well as solutions ranked as Gartner Magic Quadrant leaders.

Availability and Oversight
Apache CloudStack software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache CloudStack, visit http://cloudstack.apache.org/ and https://twitter.com/cloudstack .

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 813 individual Members and 7,800 Committers across six continents successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Aetna, Alibaba Cloud Computing, Amazon Web Services, Anonymous, Baidu, Bloomberg, Budget Direct, Capital One, CarGurus, Cerner, Cloudera, Comcast, Facebook, Google, Handshake, Huawei, IBM, Inspur, Leaseweb, Microsoft, Pineapple Fund, Red Hat, Target, Tencent, Union Investment, Verizon Media, and Workday. For more information, visit
http://apache.org/ and https://twitter.com/TheASF .

© The Apache Software Foundation. "Apache", "CloudStack", "Apache CloudStack", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Thursday May 28, 2020

The Apache Software Foundation Announces Apache® Subversion® 1.14.0-LTS

Community-led Version Control Software and Source Code Management Tool Available on Most Integration Servers, Integrated Development Environments, Issue Tracking Systems, and more.

Wakefield, MA —28 May 2020— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today Apache® Subversion® 1.14.0-LTS, the latest release of the popular centralized software version control system.

Apache Subversion ("SVN") provides a version controlled backing store for any kind of data. It records an accurate log of changes made to that data over time, and keeps track of who made them. Subversion allows users to commit files and directories, recover previous revisions, and even maintain multiple variations of their work in parallel. Able to service projects of any size, from individuals up to large scale collaborative efforts, Subversion is ideal for work in vast swaths of industries, from software development to semiconductor design, scientific research to medical technology. An Apache Top-Level Project for over a decade, Subversion celebrated its 20th Anniversary earlier this year.

"First and foremost, I'd like to thank all of our developers and community members who helped make this release possible," said Nathan Hartman, Vice President of Apache Subversion. "We are excited to publish our latest LTS release, and the first in the 1.14 line."

As an LTS release, 1.14.0 focuses on stability and availability. These are achieved through the project's policies: for any change in core code to be included in updates to 1.14.x, the change must first undergo a process of nomination and voting for backport. At least three Subversion developers must support the change, with none having concerns about it.

LTS (Long Term Support) is an industry designation that a particular release line is planned to be maintained for a longer period of time than regular, non-LTS releases. For the Subversion project, this means that later updates to the 1.14.x series may contain bug fixes and security updates only. Any bleeding edge new features, even if developed during the lifetime of 1.14.x, will have to be introduced in a separate release line. Server operators and system administrators usually prefer LTS releases for stability, while end users often choose the latest release (LTS or not) to get the newest features.

Numerous third parties provide Subversion install packages for Windows, macOS, Linux, OpenBSD, FreeBSD, and other operating systems. To maximize platform independence, Subversion is implemented with strict conformance to ISO C90, one of the most widely supported software coding standards worldwide. In addition, the Subversion developers provide bindings that enable integrations with software coded in popular web languages: Java, Ruby, Perl, and Python.

Particularly noteworthy for this release, Subversion's language bindings for Python received significant attention. Python 3 is supported, up from Python 2 in prior Subversion releases, an oft requested improvement that keeps Subversion 1.14.0-LTS current with the changing Python landscape. While this was a major undertaking, the project also tackled the challenge of maintaining compatibility with the older Python 2. This legacy support is expected to phase out gradually, as Python 3 continues to gain mindshare across the computing industry, but the Subversion project has a long tradition of maintaining compatibility wherever practical, giving operators of legacy systems some much-needed breathing room as they make the transition.

Among Subversion's strengths is its extensive support for working with giant repositories. The bedrock of this support is its centralized model, which allows users to check out only the portions of a repository that they need. The ASF uses Subversion this way in its own infrastructure, housing more than 80 of its Apache Top-Level Projects and sub-projects comprising millions of lines of code, including Subversion itself, in a single Subversion repository that makes all 1.8 million revisions of that information available to collaborators worldwide.
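
As a small illustration of that "check out only what you need" model, the following Python sketch drives the standard svn command-line client to perform a sparse checkout. The repository URL and directory names are placeholders, and a locally installed svn client is assumed.

```python
# Illustrative sparse-checkout sketch using the svn command-line client via subprocess.
# The repository URL and directory names are hypothetical placeholders.
import subprocess

REPO = "https://svn.example.org/repos/asf"

# Check out only the top level of a huge repository, without the contents below it.
subprocess.run(["svn", "checkout", "--depth=immediates", REPO, "asf"], check=True)

# Later, deepen just the one project directory that is actually needed.
subprocess.run(["svn", "update", "--set-depth=infinity", "asf/subversion"], check=True)
```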

When dealing with such vast amounts of data, including all of its revisions, one might wonder about storage costs. Subversion uses a variety of techniques to minimize storage, including temporal compression, spatial compression, and data deduplication.

Another improvement in Subversion 1.14.0-LTS is a new tool in support of deduplication that could help some administrators reduce future storage costs. The deduplication feature uses an internal database named rep-cache.db. If deduplication was previously disabled, the database may not contain all necessary entries. The new feature, known as the 'svnadmin build-repcache' command, allows re-adding such missing entries and provides a way for those who had previously turned off deduplication to regain some of its benefits.
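
A hedged example of what invoking the new command might look like follows; the repository path is a placeholder, and 'svnadmin help build-repcache' remains the authoritative reference for options such as revision ranges.

```python
# Illustrative only: rebuild missing rep-cache.db (deduplication) entries for a repository.
# The repository path below is a placeholder; Subversion 1.14 or later must be on PATH.
import subprocess

subprocess.run(["svnadmin", "build-repcache", "/srv/svn/repos/example"], check=True)
```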

The release also includes several experimental features. One of these, Shelving and Checkpointing, allows users to save, restore, and roll back snapshots of their work, without making commits to the central repository. This is useful for setting aside a work in progress to work on something else, or for taking temporary snapshots when a network connection to the server is unavailable. Another experimental feature, Viewspec, allows users who create different cross-sections or "views" into their version-controlled data to save the layouts of those views and easily recreate them later. These experimental features are designated as such because they are not yet considered feature-complete. In Subversion 1.14.0-LTS, they are turned off and hidden by default, but are made available on an opt-in basis to entice open source community members to help further their development.

Subversion users, developers, and other stakeholders routinely communicate with each other through email lists. One ongoing discussion taking place there centers around a proposal to make Subversion even stronger at handling big files. The discussion thread, titled "Who else is using SVN for large-binary-asset storage?" has already generated some enthusiasm.

"Apache Subversion is more than code, it's a community," added Hartman. "As an open source and purely volunteer-driven effort, we thrive on participation from enthusiastic users and developers worldwide. We welcome their involvement in the future of Subversion and on our email lists."

Subversion 1.14.0-LTS is available now. The complete software source code can be downloaded from https://subversion.apache.org/download.html , with a list of install packages which are maintained by numerous third parties at https://subversion.apache.org/packages.html .

Over its 20-year history, Subversion has grown to become the most popular version control system on the market, and remains the leading centralized versioning and revision control software today. Millions of users worldwide depend on the collaboration-friendly system to easily access all files and historical data simultaneously without code conflicts or corruption. 

Apache Subversion is used for mission-critical code distribution and collaboration workflow by Adobe Dreamweaver, Eclipse, Google, Halliburton, Microsoft Visual Studio, Python, Ruby, Skype, SourceForge, and WordPress, among many others. The ASF's infrastructure uses Apache Subversion across millions of lines of code and nearly two million commits by more than 300 Apache projects.

Availability and Oversight
Apache Subversion software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Subversion, visit http://subversion.apache.org/ .

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 813 individual Members and 7,800 Committers across six continents successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Aetna, Alibaba Cloud Computing, Amazon Web Services, Anonymous, Baidu, Bloomberg, Budget Direct, Capital One, CarGurus, Cerner, Cloudera, Comcast, Facebook, Google, Handshake, Huawei, IBM, Inspur, Leaseweb, Microsoft, Pineapple Fund, Red Hat, Target, Tencent, Union Investment, Verizon Media, and Workday. For more information, visit http://apache.org/ and https://twitter.com/TheASF 

© The Apache Software Foundation. "Apache", "Subversion", "Apache Subversion", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Wednesday May 13, 2020

The Apache Software Foundation Announces the 10th Anniversary of Apache® HBase™

Open Source distributed, scalable Big Data store celebrates a decade of processing zettabytes of data across highly scalable large tables for the Apache Hadoop ecosystem 

Wakefield, MA —13 May 2020— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today the 10th anniversary of Apache® HBase™, the distributed, scalable data store for the Apache Hadoop Big Data ecosystem.

"The success of Apache HBase is the success of Open Source," said Duo Zhang, Vice President of Apache HBase. "Ten years after graduating as a TLP, HBase is still among the most active projects at the ASF. We have hundreds of contributors all around the world. We speak different languages, we have different skills, but we all work together to make HBase better and better. Ten year anniversary is not the end, but a new beginning, I believe our strong community will lead the project to a bright future."

HBase originated at Powerset in 2006 as an Open Source system to run on Apache Hadoop’s Distributed File System (HDFS), similar to how BigTable ran on top of the Google File System. In 2007, a significant code contribution was added to the Apache Hadoop codebase and was integrated into the Apache Hadoop 0.15.0 release later that year. Development on HBase continued as a sub-project of Apache Hadoop, and graduated as an Apache Top-Level Project (TLP) in April 2010.

An Open Source, versioned, non-relational database, Apache HBase provides low latency random access to very large tables —billions of rows and millions of columns— atop clusters of non-specialized, commodity hardware. HBase reads, writes, and processes structured, semi-structured, and unstructured data in real-time environments.
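
As a hedged illustration of that random-access model, the short Python sketch below talks to HBase through the third-party happybase Thrift client. The host, table, column family, and row keys are invented for the example, and a running HBase Thrift server is assumed.

```python
# Illustrative sketch: low-latency reads and writes against HBase via the happybase Thrift client.
# Host, table, and column names are placeholders; requires a running HBase Thrift server.
import happybase

connection = happybase.Connection("hbase-thrift.example.org")
table = connection.table("web_metrics")

# Write one cell: row key -> {column family:qualifier -> value}.
table.put(b"page#home#2020-05-13", {b"stats:views": b"1024"})

# Low-latency point read of a single row by key.
print(table.row(b"page#home#2020-05-13"))

# Scan a narrow slice of a potentially billion-row table by row-key prefix.
for key, data in table.scan(row_prefix=b"page#home#", limit=10):
    print(key, data)

connection.close()
```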

Apache HBase is in use at thousands of organizations, including Adobe, Airbnb, Alibaba, Bloomberg, Flipkart, Huawei, HP, Hubspot, IBM, Microsoft, NetEase, Pinterest, Salesforce, Shopee, Tencent, Twitter, Xiaomi, and Yahoo! (now Verizon Media), among others.

Testimonials

"Congratulations on the 10th birthday of Apache HBase! Alibaba started to use HBase since January 2011 and has witnessed its growth and come along with the community through the years. The Apache HBase community has always been an open and powerful team that produced many stable, production-ready and widely used versions. Today at Alibaba, we have HBase clusters with more than 10k nodes serving hundreds of petabytes of data, as well as  more than 1,000 enterprise HBase users on Alibaba Cloud. We will continue collaborating with and contributing to the HBase community and wish us all ongoing success in future!"
—Chunhui Shen and Yu Li, members of the HBase team at Alibaba

"I have worked with Apache HBase for many years and I think it is a great product. it does what it says on the tin so to speak. Ironically if you look around the NoSQL competitors, most of them are supported by start-ups, whereas HBase is only supported as part of Apache suite of products by vendors like Cloudera, Hortonworks, MapR, etc. For those who would prefer to use SQL on top, there is Apache Phoenix around which makes life easier for the most SQL-savvy world to work on HBase: problem solved. For TCO, HBase is still value for money compared to others. You don't need expensive RAM or SSD with HBase. That makes it easy to onboard it in no time. Also HBase can be used in a variety of different business applications, whereas other commercial ones  are focused on narrower niche markets. Least but last happy 10th anniversary and hope HBase will go from strength to strength and we will keep using it for years to come!"
—Dr. Mich Talebzadeh, Chief Data Architect, Big Data

"Congratulations on the 10th anniversary of Apache HBase! Xiaomi started to use HBase in 2012, when our business started booming. Many key Xiaomi products and services, as well as Xiaomi's data analytics platform, require a new system to provide quick and random access to billions of rows of structured and semi-structured data. Traditional solutions are not able to handle the large volume of data brought by the quickly increasing Xiaomi user base. Among several available options, we choose HBase not only because it provides a rich set of features and excellent performance specs, but also because it has a very active, open and friendly community. Embracing open source has been part of Xiaomi's engineering culture, and our deep involvement in the development of Apache HBase demonstrates the best practices of Xiaomi's open source strategy. In the past several years, we have contributed tons of bug fixes and important features to HBase, and, in the meantime, we have contributed 9 committers and 3 PMC members to the HBase community. Looking forward, we will continue to work closely with the Apache HBase community to help the project grow, and we wish Apache HBase a wonderful future!"
—Dr. Baoqiu Cui, Vice President of Xiaomi Corporation and Technical Committee Chairman

“Congratulations on the 10th anniversary of Apache HBase, it’s great to see how the project has developed over the years and continues to have good community support around it! Salesforce has a large global footprint of Apache HBase in production storing multiple petabytes of customer data and serving several billions of queries per day for a wide variety of use cases including security, monitoring, collaboration portals, and performance caches to scale over RDBMS limitations. HBase has played a major role in Salesforce’s customer success in the BigData storage space and we continue to invest in it as one of the pillars of our multi-substrate database strategy along with Apache Phoenix for SQL access to data stored in HBase. We have contributed many features and bug fixes to HBase over the last several years, and we look forward to continue working with the Apache HBase community to develop the project further. Here’s to many more successful years for Apache HBase!”
—Sanjeev Lakshmanan, Senior Director, Software Development, Salesforce

“Happy 10th Apache HBase! It was around 8 years ago that we started looking at HBase to include as part of our Hosted Big Data Services stack. Fast-forward to today and it continues to be a critical offering in our stack, powering a diverse set of use cases and workloads such as ad targeting, content personalization, analytics, security, monitoring, etc. HBase enables these diverse workloads thanks to its high scalability, feature set, and performance, all of which have been continuously refined through the years. In turn, our footprint continues to grow, storing petabytes of data across thousands of machines. Our success is in part thanks to the project’s success, as we benefit from our collaborations, the contributions and other efforts by the community (e.g. mailing lists, meetups, HBaseCon, etc.). This is a testament to the open, friendly and dedicated community around Apache HBase which is necessary for the success of any open source project. We wish the project continued success for years to come as we continue to collaborate with and be part of the community cultivating the project.”
—Francis Liu and Thiruvel Thirumoolan,  HBase Big Data Team Members, Yahoo! (now Verizon Media)

“Congratulations on the 10th anniversary of Apache HBase! It’s great to see how this project has evolved from a big data project to one that runs business critical systems and continues to accelerate with a growing community and increasing pace of development! Cloudera has over 500 customers in production using it for a range of use cases ranging from mission critical transactional applications to supporting data warehousing. Our largest customers have footprints in excess of 7,000 nodes storing over 70PB of data. Our customers choose HBase because of its resilience with some customers able to realize 100% application uptime using HBase (over the past 3 years). We plan to continue to invest in HBase (and Apache Phoenix) to ensure that we can continue to both broaden support for a variety of hybrid transactional and analytical use cases and deepen support for existing use cases. Here's to many more successful years!"
—Arun C. Murthy, Chief Product Officer, Cloudera

“Many congratulations to the Apache HBase community on the 10th anniversary. Apache HBase provides rich functions and excellent performance, and has an open and friendly community. Huawei started using HBase in 2010: HBase is widely used by multiple Huawei solutions running on more than 10,000 nodes, storing hundreds of PBs of data to meet our requirements. Huawei FusionInsight provides the Best Practices of Huawei for HBase, which serves a lot of customers across many industries such as finance, operators, government, energy, medical, manufacturing, and transportation. Meanwhile, Huawei team members have contributed a lot of bug fixes and features to HBase, and successfully hosted the first HBase Asia technology conference, HBaseCon Asia 2017, in Shenzhen. Going forward, Huawei will continue to work closely with the Apache HBase community to promote community development.”
—Wei Zhi, Kai Mo and Pankaj Kumar, members of the HBase team at Huawei

“Happy 10th anniversary, HBase! At Ultra Tendency, you have been the backbone of our Dual Lambda Streaming Architecture for many years! You have served billions of queries to our customers without interruption and at low latency. Your architecture guaranteed that you were always there when we needed you, never letting us or our customers down. You are the reason why our European clients today are running flourishing new business models backed by low-latency streaming products. Our committers and contributors will continue to fix bugs and provide feature enhancements. Ultra Tendency wishes you a bright and successful future!”
—Jan Hentschel, Chief Information Officer, Ultra Tendency

“Congratulations on the 10th anniversary of Apache HBase! I can't believe it's been 10 years since the first day I tried to use Apache HBase and its ecosystem to help the business and company. Also, it is so great to see many colleagues and friends work, discuss, and cooperate together to make this system better. Some of them have also made great career progress, and some are still progressing. Shopee, one of the biggest e-commerce platforms in Southeast Asia, has several large Apache HBase clusters in production to support businesses that depend on several billions of queries per day. Apache HBase has played a significant role in Shopee, and its use is still expanding along with Shopee's business growth. Apache HBase, as well as the community, helps us a lot, and we will continue to make contributions to Apache HBase. We look forward to continuing to work with the Apache HBase community to develop the project and its ecosystem further.”
—Li Luo, Manager of Data Infra department, Shopee

“At Microsoft, our mission is to empower every person and every organization on the planet to achieve more, and it’s this mission that drives our commitment to open source. Congratulations to the Apache HBase community on its 10th anniversary. Microsoft has been part of the vibrant HBase community since 2014; today we are proud to serve the numerous enterprise customers across industries who are leveraging HBase in Azure HDInsight for their most critical business applications.”
—Tomas Talius, Director of Engineering, Azure Data Services, Microsoft

Availability and Oversight
Apache HBase software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache HBase, visit http://hbase.apache.org/ and https://twitter.com/HBase 

About The Apache Software Foundation (ASF)
Established in 1999, The Apache Software Foundation (ASF) is the world’s largest Open Source foundation, stewarding 200M+ lines of code and providing more than $20B+ worth of software to the public at 100% no cost. The ASF’s all-volunteer community grew from 21 original founders overseeing the Apache HTTP Server to 813 individual Members and 206 Project Management Committees who successfully lead 350+ Apache projects and initiatives in collaboration with 7,600+ Committers through the ASF’s meritocratic process known as "The Apache Way". Apache software is integral to nearly every end user computing device, from laptops to tablets to mobile devices across enterprises and mission-critical applications. Apache projects power most of the Internet, manage exabytes of data, execute teraflops of operations, and store billions of objects in virtually every industry. The commercially-friendly and permissive Apache License v2 is an Open Source industry standard, helping launch billion dollar corporations and benefiting countless users worldwide. The ASF is a US 501(c)(3) not-for-profit charitable organization funded by individual donations and corporate sponsors including Aetna, Alibaba Cloud Computing, Amazon Web Services, Anonymous, ARM, Baidu, Bloomberg, Budget Direct, Capital One, CarGurus, Cerner, Cloudera, Comcast, Facebook, Google, Handshake, Huawei, IBM, Indeed, Inspur, Leaseweb, Microsoft, ODPi, Pineapple Fund, Private Internet Access, Red Hat, Target, Tencent, Union Investment, Verizon Media, and Workday. For more information, visit http://apache.org/ and https://twitter.com/TheASF 


© The Apache Software Foundation. "Apache", "HBase", "Apache HBase", "Hadoop", "Apache Hadoop", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Tuesday May 05, 2020

Support Apache: #GivingTuesdayNow


Support The Apache Software Foundation (ASF) and help the world's largest Open Source foundation continue to provide $20B+ worth of software for the public good at 100% no cost.


Apache projects are helping millions of individuals and businesses struggling with the COVID-19 pandemic in numerous ways, including:

  1. Accessing one’s desktop remotely whilst working from home;

  2. Ordering, warehousing, picking, shipping, dispatching, and tracking critical supplies worldwide;

  3. Classifying and integrating 1.5B+ genetic data records;

  4. Storing, extracting, linking, and processing terabytes of electronic medical records across thousands of servers;

  5. Training large-scale, distributed deep learning libraries over clusters of hospital machines;

  6. Developing machine learning cardiovascular disease prediction models; and

  7. Supporting biobank research and real-time intensive care data analysis.


Your contribution helps ensure The Apache Software Foundation’s 350+ projects and initiatives remain accessible to all, absolutely free of charge. The not-for-profit ASF does not pay for development --all work on Apache projects is done by a volunteer community of more than 7,700 Committers on six continents.

 

Donate today at https://donate.apache.org/


We thank you for your support during this unprecedented time.

#GivingTuesdayNow is a new global day of giving and unity taking place on 5 May 2020 as an emergency response to the unprecedented need caused by the COVID-19 pandemic. The annual GivingTuesday movement takes place on the first Tuesday following Thanksgiving in the United States (1 December 2020) and is the second-largest giving day of the year.


# # #

Thursday April 30, 2020

The Apache Software Foundation Welcomes 34 New Members

The Apache Software Foundation (ASF) welcomes the following new Members who were elected during the annual ASF Members' Meeting on 31 March - 2 April 2020:

John Andrunas, Paul Angus, Zaheda Bhorat, Timothy Chen, Andrea Cosentino, Adina Crainiceanu, Griselda Cuevas, Fokko Driesprong, PJ Fanning, Julian Feinauer, Drew Foulks, Von Gosling, Susan Hinrich, Clay Leeds, Swapnil M Mane, Frank McQuillan, Gian Merlino, Andrew Musselman, François Papon, Jerry Shao, Shao Feng Shi, Mohammad Asif Siddiqui, Neil Smith, Casey Stella, Jincheng Sun, Wangda Tan, Luca Toscano, Xiaorui Wang, Geertjan Wielenga, Sheng Wu, Kete Yang, Awasum Yannick, Duo Zhang, and Zhe Zhang.

The ASF incorporated in 1999 with a core membership of 21 individuals who oversaw the progress of the Apache HTTP Server. This group grew with Committers —developers who contributed code, patches, documentation, and other work, and were subsequently granted the following by the Membership:

 - access to "commit" or "write" (contribute) directly to the code repository;

 - the right to vote on community-related decisions; and

 - the ability to propose an active user for Committership.


Those Committers who demonstrate merit in the Foundation's growth, evolution, and progress are nominated for ASF Membership by existing Members.

This election brings the total number of ASF Members to 813 today. Individuals elected as ASF Members legally serve as the "shareholders" of the Foundation https://www.apache.org/foundation/governance/members.html 

For more information on how the ASF works, visit http://www.apache.org/foundation/how-it-works.html , Apache Is Open https://blogs.apache.org/foundation/entry/apache-is-open , and Briefing: The Apache Way http://apache.org/theapacheway/ 

# # #

Monday April 27, 2020

Inside Infra: Drew Foulks

The second in the "Inside Infra" interview series with members of the ASF Infrastructure team features Drew Foulks, who shares his experience with Sally Khudairi, ASF VP Marketing & Publicity.


 



"I am in the business of making life easy for people who do phenomenal stuff."



What is your name --how is it pronounced?

My name is Drew Foulks. “Droo Follx”.

If folks were to find you at the ASF, like on Slack or elsewhere, what's your handle? How do they find you?


They'll find me at Warwalrux, spelled with an X, so W-A-R W-A-L-R-U-X.

So, “War Walrus”, but with an X at the end. Where did that come from?


Kind of embarrassing story actually. I got picked on a lot in middle school because I was always really good with computers, but as bad as it sounds, I never really wanted to be. I always wanted to be one of the cool kids and the cool kids were not into computers. One day I got into a fight at school and one of my friends just absolutely made me lose it afterwards. I was sitting there on the ground crying and he said, "Man, you were a fighting walrus, like the walrus of war or something. It was awesome." I lost it. But ever since then, I’ve just been like, "You know what? I'm not even going to be ashamed about that anymore." I've been that since I started doing tech, which was actually not that long ago compared to the other guys on the team.

How long have you been in tech?

I'm 29, and have been in tech since I was 16, so 13 years.

When did you get involved with the ASF? How did you get here?

I was working at NASA for four-and-some-change years, and I decided that I wanted to pursue some other opportunities because they really were not supportive of that work-from-home culture. And at the time I had a lot of stuff going on. My wife was sick, my daughter, my youngest, has special needs, and my stepson actually also has special needs, so being at home was something I had to do. A buddy of mine tipped me off to a Website called We Work Remotely. I ran across your ad there and thought, "There is no way that is who I think that is and I'm going to apply for the hell of it." Surprisingly, two months later, I got a call back.

You do understand how many interview candidates we had, right? A lot of people were competing against you.

It blows my mind. I heard the stories after I got hired and I was just like, "Man, that's nuts." And then when I got hired, I was actually told, jokingly, of course, the ASF was looking to launch its own brand of internet satellite. So that’s why we hired people from SpaceX and NASA.

The Infra guys have such a dry sense of humor! How long have you been a member of the team?

One year and one month, 13 months.

For some reason it feels like you’ve been part of the Apache family for years. What’s your role in ASF Infrastructure? What are you responsible for?

My latest contributions have been the Website builders, so I'm working on helping people migrate off of CMS. Some of the ways that I've chosen to do that are by working with Humbedooh (the handle for ASF Infrastructure team member Daniel Gruno) on his ASF.YAML project, that so many projects seemed to be really enjoying.

YAML? Yet Another Markup Language?

That's it. Yet Another Markup Language.

So basically, I built the system that lets you build Websites from ASF.YAML and you just specify your Website builder, whether it be Pelican or Jekyll --those are the two that we support right now. And you give it a source branch and a target branch and every time you check in, boom. It builds your website.
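
For readers unfamiliar with the mechanism being described, a rough, purely illustrative Python sketch of that flow might look like the following; the .asf.yaml keys, branch names, and build commands are assumptions for illustration, not the actual Infra implementation.

```python
# Purely illustrative sketch of the flow described above: read a project's .asf.yaml,
# pick the declared site builder, and publish from a source branch to a target branch.
# Keys, branch names, and build commands here are assumptions, not Infra's actual code.
import subprocess
import yaml  # PyYAML, assumed to be installed

def build_site(repo_dir: str) -> None:
    with open(f"{repo_dir}/.asf.yaml") as f:
        config = yaml.safe_load(f)

    for builder in ("pelican", "jekyll"):  # the two builders mentioned above
        if builder in config:
            source = config[builder].get("whoami", "main")      # branch holding the site sources
            target = config[builder].get("target", "asf-site")  # branch the built site is pushed to
            subprocess.run(["git", "-C", repo_dir, "checkout", source], check=True)
            build_cmd = ["pelican", "content"] if builder == "pelican" else ["jekyll", "build"]
            subprocess.run(build_cmd, cwd=repo_dir, check=True)
            # ...commit the generated output to `target` and push it for serving...
            print(f"Built {repo_dir} with {builder}: {source} -> {target}")
            return
    print("No recognized site builder declared in .asf.yaml")
```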

Who is this aimed at?

This is for Apache Projects building their TLP Websites. When you commit your Website to the repo, say any project, they've all got Websites, but some of them are generated via Jekyll. Some of them are generated with Pelican, some are generated in a custom way with a Jenkins job. It's just how each project is determined to generate their website, but we're trying to make it easy and provide lots of options for projects to migrate off of the old CMS. But still projects are allowed to be able to choose their own method of publishing or their method of creating a site, but you have to be able to enable all of that to happen.

Did you have to learn this or was this knowledge something that you came into the position with?

I learned it.

Was it difficult? How long did it take you to get this project up?

The Pelican one was a lot harder than the Jekyll one. So, Pelican took a couple of months. Really, Greg had a prototype when I came in that apparently had been kicking around for a little bit, so I tightened it up and pelicanized it. I think it works pretty well. I've not heard any complaints about it.

That took a while before I wasn't doing primarily Python programming, I was doing lots of different ops things just in a completely different way than what I do now. To be honest, I still haven't wrapped my head around exactly what it is I do here.

Do you mind sharing a little bit about that?

I came here from the government world, which is very siloed. I worked for the OCIO (Office of the Chief Information Officer) data center for NASA Langley, which is a very old NASA center. Older than NASA itself actually. Their infrastructure, as you can probably guess, is not the newest: It's 100 years old. They have wind tunnels from the 1920s. There are parts of the infrastructure that are 100 years old and it's insane. Everybody has a specialty, everybody's a subject matter expert in something, and there's nothing more permanent than a temporary government program, so if you take something on, expect to be doing that for the rest of your life. It's very regimented. If you’ve ever seen Hidden Figures, the computational research facility where they’ve opened the Katherine Johnson Research Center was my data center.


And then to come to the ASF, it's like, "Okay, so we've got like 11 different Cloud providers and these are all the projects that we're supporting. Do you know this, this, this, this or this?” Jenkins, Buildbot, VMware, any of the Docker, Puppet and all that stuff. Do I know any of these myriad Open Source technologies that one doesn't really get to use a lot of in the government sphere. I mean, I've been doing Ansible there for three years.

It was very monolithic. We had VMware. I ran a data center. I had hardware. I had to track all of that. Coming here, everything is completely different. It's like, "We're juggling all these different Cloud providers, and oh, wait: we’ve got to migrate out of this one today, so let's do that. Okay. All right. Where are we going with this?" It's just like there's no end in sight. As technology progresses, so do we. It's just that we do it so much faster than anywhere else I've ever been.

Is that exciting or scary?

Oh, gosh. I've never stopped long enough to think about it. It is a bit of both. It is intimidating for sure, because before it was very siloed. Like I said, I did my thing and I had my interests, my extracurricular interests, running home network setups and private media servers and whatnot. Then I come here and those hobbies go away; now I’m doing that for the Foundation instead.

Yeah, that's cool, though.

It is. I'm a professional hobbyist.

To get paid for doing your hobby is pretty rewarding.

It is. Yeah.

This has become your hobby in a different way, of course, because I'm sure you weren't planning on dealing with ~11 different Cloud providers.

No, I was not.

In our chat with Chris Thistlethwaite last month, we learned more about who ASF Infra serves and the scope of the work that you provide. Can you tell me more about the who and how it works exactly? So, who Infra serves and to what capacity or what is it that you guys do? Because I get every person's perspective is slightly different because I get the same, we do it all answer, and is that true? I mean, you're saying that so far, it sounds like it's true. I guess no one has a reason to expand upon it in terms of embellishment, but tell me more.

We serve Apache project developers and development teams. It’s not just the people who sit down and write the code; the people who orchestrate these very complex processes of building, testing, and checking, doing the sanity work behind the scenes, the people coordinating releases, PMCs planning out the future of these projects, we serve them, too, and we have to serve them in a capacity beyond, "Hey, here's a build platform," it's: "We support your email communications, we’re there to facilitate the goings-on of the Project." Infra's domain is almost everything but the coordinating and writing of code.

Taking care of their code management systems, providing them with the means to do build testing and having it not kill us in the process. That's a big, big addendum to that requirement. Like I mentioned, email, I call them the central services, things like LDAP, authentication, your virtualization services, file sharing, all of those things that make the business of a TLP easy(ish). I am in the business of making life easy for people who do phenomenal stuff. That's honestly how I view my job and it's very, very different than my old one.

In my old job, I had one customer who I bent over backwards for; here, it's very much, "Listen, my job is to provide these services and to facilitate what you guys do, not do it for you." Drawing that line sometimes becomes difficult for me personally because I don't have as much experience in the ASF, I think. But that seems to be a skill that the other guys have: knowing when to bounce back and say, "No, this is definitely a PMC issue that you guys should be dealing with because it sets a bad precedent if I make this decision. I'm not going to do this work for you." It wouldn't be right to pollute a project like that.

What you're saying doesn't come across as odd. One thing that I always want to know is how ASF compares with other infrastructure operations in general. Chris had said this also, here you have 300+ projects and all sorts of different groups that you're interfacing with, so it's a completely different type of interaction. Your response is totally legitimate: it takes a certain type of personality to be able to handle that because most people would likely be overwhelmed and run away. The fact that you're here and thriving and our projects are expanding is awesome.

Thank you. You can thank my wife for not letting me run away.

Based on my understanding, as a team you're autonomous yet coordinated. Is that the right way to describe how you work together?

Yes. That is a good way to describe how we work together.

Do you feel like that model works or do you think something else should be happening or how does that work for you?

That's a tough question because I'm not sure that the answer would make any sense, but I'll give it a go anyway. By constantly talking with each other, the team gets a sense for the direction that we need to be heading. Leadership is very organic and not spontaneous, but they're like a current guiding us towards the goal, really, whatever that is, so all of the decisions that we make on the daily really kind of help us towards that goal, because fighting the current is difficult.

In a lot of ways that long-term coordination is really facilitated by this, I'm going to call it “on a current of progress”. It's not forceful. That's kind of what it feels like. The team is driving towards something, it's not random, to be honest with you. It's typically a goal that we have in mind, but all of the work that we do is just like, "There's a cool idea that I had related to this, so let's just work on that." And we end up getting there. It's crazy.

Describe your typical workday. Are you on a rolling schedule? Do you guys work on a shift? How do you get it all done --and you're down one person now-- how do you get it done?

I have no idea. So really, personally, I have a nine-hour-a-day schedule that I follow every day. So basically I start work and I break it up into two- or two-and-a-half-hour chunks and I do four of those, take little breaks in between, try to keep myself sane, try to throw in a dog walk. Really, I just approach it like I approach any other job, one ticket at a time.

Do you work in shifts? How do you cover those 24/7? How do you balance the load?

So there's a one week on-call rotation. So right now there are the... gosh, how many of us are there? Five? Anyway, so there's one week on-call rotation and that person is on 24/7 for the week, Monday to Monday. And then after that, it's pretty much just you cover your time zone. Yeah. So the scheduling, it's so loose that I mean really as long as you're putting in your eight hours a day, nobody really cares when you do that. I choose to have that nine-hour work day because kids really. It's fantastic for having a family, but whether you want to jump on at 1:00 in the morning and work for six hours, that's fine.

OK, so as long as someone's there, and it doesn't have to be you, you can work on your own timeframe. Are you guys usually slammed? Is it low-level? Is there a busy time for Infra on the whole? Is it like tax season if you're an accountant, or is it constantly just 24/7/365?

It's pretty much 24/7/365, but we do definitely have “seasons” as well. We do a one-week on-call rotation, so somebody's always on, but the scheduling is very relaxed. So, it's optional, the hours you'd like to keep. I choose to work a work day because of the family and that just kind of fits in nicely actually. Some people may decide that, "I'm awake. It's 1:00. I can't sleep. I might as well get some work done," and I do that. And I've certainly done that before. So, yeah, it's pretty whatever and we're all kind of, I don't want to call us workaholics because I think that's a bad word, but we're all …

“Work enthusiasts.”

I don't know that I've called them busy seasons as much as busy cycles.

What are they? What triggers them?

Typically? Releases. The most tickets coming in is when some project is putting out a build or is putting out a release. For a large project release, we'll have a lot of tickets sent in because they're utilizing a bunch of resources and stuff gets backed up. That's typically it.

So whoever is on call during that time period, it's really their responsibility to handle: it's not like when Apache Wombat or whatever Project has an issue, it becomes “Drew's issue”. You're not assigned to a project to facilitate that, it's whomever is there will help them however possible, correct?

Yeah. And I think that you said it earlier: everybody that you've talked to says that we do it all. I'm going to tell you that we do it all. It's every project from Apache Zeppelin to Airflow, whatever the first one is. That's all our work.

I don't know if this is actually the case, but I'm curious: is it possible for an ASF Infra team member to be an introvert or do you all have to be “client-facing”? I know that we don't have an office, and you see people from time to time at ApacheCon, but do you have a wall that you can hide behind or do you have to interface with people all the time?

Did you go to the end for Lightning Talks?


I was not at Lightning Talks at ApacheCon/Vegas, but I heard it had quite an activity that happened there, Chris told me about it during his interview, let's put it that way. No one said anything to me up until that interview, so I was surprised. Fill me in with some more. What do I need to know?


[laughing] So, an introvert and two extroverts that are way too drunk, get up on a stage in front of people and proceed to just make fools of themselves for a minute. That's pretty much it.


I guess I know who the introvert was.


Yeah. So the original plan was to go up there and make thunder noises, because that is the sound of lightning talking. That was a fun experience. Not one that I would do again, I think, but it was fun.


Let's go back to the daily schedule for a minute. This is always a curiosity for me for anyone who's super busy, which is pretty much everyone at Apache: how do you keep your workload organized? Your structure for your day is very impressive, I have to say, this two-and-a-half hours times four. I think it's fascinating. But your actual workload, for example, you get one of these huge releases, how do you manage all that?

Okay, so the first part of my day is typically spent organizing my day, as awful as that sounds. We get so much email that I'm pretty sure it's literally impossible to read it all, so the first order of the day is to sift through that while I drink my coffee, because there's no way I can get through all of it. I catch up on the stuff that the team has been talking about, catch up on all the Slack channels, look at my tickets, prioritize my workload, and that usually takes about an hour. So right at 8:30, I'm ready to actually start doing stuff. Then it's usually tickets and then a break. And then I don't like to check my email too terribly often. I wish I could check it just three, four times a day, because I think it gets me off task, but that's not really something I have the luxury of being able to do all the time, so I do have to monitor my Ubuntu alerts as emails come in, scanning for anything important. But yeah, it's ticket work for the first half of the day and project work for the back half of the day. And then right after lunch, I'll sit down and I'll figure out where I am on my project, and then try to move forward from there. Typically, that involves research, but yeah, I like to spend the last couple of hours of my day trying to do something. So, typically project work, because I don't like doing ticket changes at the end of the day.

Why is that?

Well, if you're going to nail your foot to the floor, don't be surprised when you can only run in circles.

I presume when you do ticket work, more things come out of it, too, so it never ends.

Yes. Typically, ticket work involves making a change of some sort, to something that's actually being used, whereas project work is kind of this nebulous, unused, non-production thing.

I'm hearing that you need to know a little bit about everything in addition to your own areas of expertise. How do you stay ahead of the curve? How do you learn about everything that you need to know, especially if you don't know what you need to know? How do you do that?

I don't think that you do stay ahead of the curve. I really don't. I think that we do our best to ride it. Getting ahead is so immensely difficult. This technology essentially fractalizes into so many different facets of computing.

From virtualization to networking to programming, you have all of these facets. Nobody can really, truly stay ahead of the curve. I mean, holy cow, the guys on the Infra team, they are all 12-pound-brain-type dudes. They'll go from talking about hardware specs to talking about virtualization. They'll bounce around all these different facets of technology, and obviously you have strengths and weaknesses. I don't think anybody can really stay ahead of the curve at this point, and I feel like it's been a long time since anybody has. Technology has just gotten so complicated. We've really tried to, without specializing too much ... kind of pick out some of the non-essential fluff, the stuff that we don't use. I mean, hypervisors aren't really like super in these days. It's all about the Cloud, which is really just an abstract hypervisor, but whatever.

So, we don't really have any “machines” anymore; spec-ing out a physical machine is not something many of us do very often. It's not part of our job anymore, but that's definitely one area of technology that continues to advance as they put out better processors and whatnot. Mostly we try to stay ahead on the DevOps side of things without focusing too much on the operational infrastructure portion. And that's where I came from, this operational infrastructure, the data centers, the servers, the hypervisors, making VMs for people. That's what I used to do, and now it's a lot less of that and a lot more fine-tuning this nebulous system of intermeshed tools that I don't fully understand yet.

Seeing that you and others can't stay ahead of the curve, can ASF Infrastructure actually stay ahead of the demand? I mean, is there any way you aren't constantly in a reactive mode of “here's this new thing we're responding to, or here's a new part”? Can you get your house in order, or is the house in order?

At the ASF, especially Infra, we do a very good job of listening to our projects, because we as individuals cannot stay ahead of the curve *and* have every good new idea that there ever was to be had. Our community is large, and our community is very smart, as people and as a group. We have a lot of really excellent ideas that come in from tickets, and you say, "You know? I think I'm going to look into that today." And you look into it. You realize that it has all this potential, and suddenly that's the service that we're now using. Some things, like Travis, which is a third-party build validator, came to us in that way.

Since I've been here, some of them have come to us via tickets, where it's been, "Hey, I saw that GitHub has this new thing, you should check it out." So one of us will check it out and we’re like, "Dude, that's awesome. We should use that." I think that we're constantly being batted in front of the curve by a community, by a boots-on-the-ground community that knows what's up. We obviously have our own interests and our own passions, but I don't think that, left to our own devices, without Apache TLPs able to put in tickets, it would look quite the same.

So it's been one year and one month, but how has Infra changed for you since you've come on board or has it changed?

Nope, still terrified. [chuckles]

How is the team coping with the ASF's unstoppable growth? We have 45 projects in the Incubator and there are more than 300 projects out there … there's a geographic influence now on demand, an increase in users and committers and projects from China, for example. Are there any issues that the team feels like, "Oh boy, we've got to deal with this?" Is computing an international language, where it doesn't matter where you're from or what's happening? Are any shifts from the ASF’s growth impacting you guys beyond more of what you're already doing?

So, typically, all of my jobs really have been these kinds of larger, national or international affairs, basically since I was 20. I worked for a really large mortgage company, and then I left there and I went to a massive health insurance company. Lots of international folks, and so, aside from the language barriers, yeah, I would say that computing is kind of an international thing. As far as the unlimited growth, I don't really know. I'm not sure. That sounds like a question that I would definitely advise you to go ask one of the board members about.

"Management."

Right: “Management”.

You had mentioned that you were working on the no-longer-CMS project. Is there another project that you're doing? Are you a go-to guy for something?

I don't think I'm the go-to guy for anything really. I just try to pick up whatever is there to be picked up. One of the things that I'm working on right now in the “demise of CMS” project is this custom builder. It's still a work in progress, but the idea is that, from the ASF.YAML file, you'll be able to write a script, do a “thing”, and create your own custom build environment, so that we can really make a hardcore, concerted effort to get off CMS.

Why? What was the issue with CMS? Why do we have to migrate from it? What was the problem?

To be honest with you, I've never actually used CMS. Fortunately, I have never been asked to. John (former Infra team member John Andrunas) was, but I was not. I was spared by the CMS gods; they shone their countenance upon me. It was pretty awesome. From what I understand, it's very cumbersome to use and not very friendly, and also very old. My understanding is that although it works, there are changes we wish we could make to it that we cannot, so it might be time to just move on to something newer that maybe works a little bit better for us because our use case has changed.

You're still rather new to the role: when you first came on board, what was the biggest challenge or surprise? What really opened your eyes?


So, what really opened my eyes was how much of a learning curve there is. Man, that was rough.

Is that still the case?

Yes, that's still the case. It's just not as bad as it was. Where I was before, I was using all of the stuff that we're not using here, all the Enterprise Edition stuff. So I came in with a completely different toolbox than what I was handed, and the learning curve was massive. I had to relearn how to use the automation software. We were all Splunk, so I had to learn the ELK stack stuff, and we were Ansible, or they were Ansible, while the Foundation is using Puppet. Just all of it, down to the monitoring. We didn't have any third-party monitoring because, “government”: we had this really unfathomably convoluted Xymon setup, which was interesting, but we were using RCS for everything, instead of git or subversion or even CVS.

Yeah, they're stuck with their legacy, that's for sure.

Yeah. You got text files in there that have got 10,000 versions in RCS. It was like, "Oh, my God. What am I going to do with this?"

So, I tried to implement some of the new hotness there: the git workflow, gitflow actually, the exact same kind of thing that we do here.

I had a good understanding of how ASF did business from an operational standpoint. I understood it, because I've helped implement it elsewhere, but this is the first time I've ever been fully immersed in the river of PRs and tickets and all that other stuff, so it's been a hell of a learning curve, like it has really, really kicked my butt.

But you're kicking it back. I mean, you're here. You're making it work.

Oh, yeah, hustle, man. That's really all you've got to have: hustle.

As you're describing the way the ASF is, and you were talking about some of the tools and the orchestration requirements, is this a common thing, with infrastructure today in general heading in that direction, or is it an anomaly, not only from your personal experience, obviously, but from the way you see the industry? Does “infrastructure” in general seem to be headed in this direction, or is the ASF really a unique animal in that way? Do people really have to be more jack-of-all-trades?

So the ASF is a unique animal. It is. Typically, people don't have 11 Cloud providers, and if they do, they've usually got some sort of system underpinning all of that, whereas ours is tribal knowledge and text documents. We're really trying to get this knowledge codified, and our technical writer Andrew Wetmore is doing a kick-ass job with that. But, yeah, typically an infrastructure team of this sophistication would probably have a different set of tools.

It's surprising that we're not using, like, Vagrant and Packer and Terraform, which abstract away how Cloud providers make VMs. We still make them by hand. It's work, and really the only way to be good at that is to know what you're doing and to be confident in that particular UI, which is always its own special kind of awkward, trying to get used to a new UI, finding out where all the options are. And we're doing all these things by hand … everybody just picks up this knowledge through osmosis, just by stumbling through these tickets from time to time, and it's really crazy to see sometimes how much process there is and how little documentation there is. So I'm really happy to have our documentation writer on board.

That's Andrew, right? Andrew Wetmore is working on the documentation?

Oh, yeah. Yep, and he's doing a really good job, helping us sort it out.

And he hasn't left screaming and running either, so that's a good sign. It's a lot of work.

That's true. Yeah. It is. It is a lot of work and he has not left running, but he is a really chill dude.

Our infrastructure is unique in that we do all of the things that are kind of necessary. There really isn't too much of a go-to guy for any of this stuff. If there's a problem in the build system, you take care of it. If there's a problem with a Web server, you take care of it. That's where the autonomous nature of Infra comes in. If there's a problem, you just take care of it. You have these tools, you know how to do it, you just do it.

How do you know that someone's not fixing it on their own at the same time? If something's broken, you're like, "Hey, this is broken. I'm dealing with it" or something else?

Just Slack, typically. I always check.

Yeah. Okay, what's your favorite part of the job?

Oh, gosh. My favorite part of the job is not feeling icky at the end of the day. I've worked for some companies whose mission kind of made me feel a little icky. So one of the stories that my wife likes to tell is that I quit [MEDICAL INSURANCE COMPANY] because I disagreed with them as a company, and I paid $5,000 to do so. But yeah, I worked in the mortgage industry a little while, shortly after the housing collapse, and I just thought about it. It was like, "Man, I really don't feel good about this job anymore." And then I moved to [REDACTED], which was arguably a bad move.

Big Health.

I was there for like 11 months. I signed a contract, I got a sign-on bonus, I moved to get there, so the stipulation was I stayed a year. I stayed 11 months and three weeks and I quit. I couldn't take it anymore. I'm just like, "I'm not doing this. I'm not doing this."

I was working on an image parser for the Affordable Care Act pipeline, which was awful. They were still implementing it. This was 2012, 2013.

It was really bad. So after that, I went to NASA, and I finally felt good about what I was doing, having made a move where, again, I agreed ethically and morally with what we were doing. I mean, it really is noble work, not specifically the work that I do, but the work that the people that I support do, and so, by proxy, my work is also.

At Apache, we have volunteers that dedicate hours of their life to these projects that we distribute freely because it really does make the world a better place. I mean, where would the world be without HTTPd?

What you just said right now has totally touched me. I feel like I’m ready to burst into tears, that's amazing. Really: I mean, wow. That's from the heart. I totally get you about doing things for people you don't believe in. That's so hard.

That sucks so much.

I totally get it and you're right. This is such a crazy group. It should not work, and yet it does, and it's incredible: 21 years of this. It's amazing.

Yeah, it's like trying to watch an eight-legged horse run.

[laughing] A what?!

An eight-legged horse. Somehow twice as fast, but you have no idea how it's working. Or which direction it's going to go.

I can’t stop laughing over the visual of that.

It's actually really funny because I'm a huge classics and mythology nerd. Technology was not my first choice in careers. I wanted to be a Latin teacher.

I love this. These are the backstories that everyone wants to know. You want to be a Latin teacher?!

I wanted to be a Latin teacher, yeah. I did Latin from freshman year in high school until I decided that college wasn't for me, sophomore year. So I took six years of Latin, and it is really awesome what learning Latin does for your programming ability, because it's surprisingly similar to learning to code. But yeah, I make a lot of really, really stupid classics and mythology puns. So my daughter, her nickname is actually Livy, in reference to the famous historian, which is not something a lot of people get, but that's okay, it makes me chuckle. And Odin had an eight-legged horse that was twice as fast as the other horses, supposedly really fast because it had twice as many legs.

It's interesting with your career: you've worked at places that are big names and people would be very impressed with that, but you're stressing that just because it's a big name or a big group, it's not all it's cracked up to be. What are you most proud of with your career, your Infra career, with Infra as a whole? What makes you say “yay”?

To be honest, becoming an Apache Member was pretty freaking awesome. When I start a new job, I always try to set a goal for that job. Sometimes I get it and sometimes I don't, and sometimes I don't realize how hard it is to actually do what I'm setting out to do when I start. My goal at NASA was to win a Silver Snoopy, but that was never going to happen.

Silver Snoopy? What’s that?

That's an award given by astronauts to engineers. They don't typically give that to IT folks, but I didn't know that at the time.

But here, it was to kind of become a Member and really to be accepted. I feel like I'm doing okay on that. That's pretty cool. That's going along really well.

You fast-tracked. I mean, if you've been here for 13 months and you're in as a Member, that's pretty cool. That's good timing, good performance on you.

Well, thank you. I have no idea how well or badly I am doing. I'm just doing things in the hope that they affect the universe in a positive way.

You're there, we couldn't do it without you.

That's excellent. Thank you.

You got to pat yourself on the back for the work that you're doing, because with our community, you know if you weren't doing it, you'd hear it. People would grump about it.

That's true. That's very true. But again, there's a mindset that's really prevalent in IT: the Tetris mindset. When you're playing Tetris, you fill up a row and it disappears. Those are your successes.

The Tetris mindset really is being bogged down by the monument to failure that you've built, because when you're playing Tetris, what you're looking at is the monument of your failures: the places where you haven't quite gotten the row completed and shifted out of your bucket. And it's really easy to succumb to that mindset, especially in a place like this.

And I really, really enjoy the fact that the Apache community seems eager to call out wins for other people, and that is an awesome attitude for a community. It's something I've not experienced a whole lot, being called out for successes. I think that, on the whole, the community and being embraced by the community has really helped me not fall into that funk. That Tetris mindset just doesn't seem to be prevalent in this community, which is nice.

Do you think that puts people in a kind of "I'm not good enough" mindset because there's not a reward? You're young enough to be part of that community that likes or is accustomed to getting trophies for showing up. Apache doesn't allow that. It's nice for you to show up, but you're not going to be rewarded. Do you think there's an impact with that?

I was on a soccer team once and I did get a participation trophy. You know what? I couldn't even tell you what the name of that soccer team was because I didn't want to play soccer. So, really, I think that if you're coming to The Apache Software Foundation, you're not doing it for the participation trophy, you're doing it because you want to, so the reward doesn't matter. You're doing it because you want to. It's really weird to be surrounded by people who are motivated by nothing other than the fact that they want to be here doing this.

And it's refreshing and I love it. I do.

I love hearing that, that's great. Here come the somewhat personal questions: there's just a few of them. Chris was laughing hard when I was asking them; I don't know if you read the full Chris interview, but it's always interesting to hear what they have to say. So ... how would your co-workers describe you?

Less cool than my wife.

What is your greatest piece of advice... what would you tell aspiring infra people, sysadmins, people like yourself, what would you give them for work advice or career advice or life advice: what would you say?

Oof, that's tough. I guess I would have to say that if at the end of the day you don't feel like your job is worth it, it's probably not.

So, if you're going to do something, make it worth it. That's my advice.

If you had a magic wand, what would you see happen with ASF Infra?

What would I see happen? Well, obviously bonuses and pay raises, but I have no idea. If I had a magic wand, I'd probably turn it over to someone who I thought could make the wish better than I could, but yeah, I have no idea.

What else do we need to know that I haven't asked?

Oh, gosh. So many things, but none of them would make sense out of the context of this particular conversation. To be honest, I'm still under the impression that everybody knows more about this than I do, so I don't know.


Drew is based in Tennessee on UTC -5. His favorite thing to drink during the workday is a black coffee prepared using a French press or the pour-over method.


# # #

Thursday April 16, 2020

The Apache Software Foundation Announces Apache® ShardingSphere™ as a Top-Level Project

Open Source distributed Big Data middleware ecosystem used for partitioning data, distributed transactions, and database orchestration by more than 120 organizations, including video sharing site Bilibili, commercial bank China MINSHENG Bank, telecommunications and mobile provider China Telecom, eCommerce retailer JD.com, and delivery courier ZTO Express, among many others.

Wakefield, MA —16 April 2020— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today Apache® ShardingSphere™ as a Top-Level Project (TLP).

Apache ShardingSphere is a distributed Big Data middleware ecosystem. The project was originally developed at Dangdang Information Technology, and was submitted to the Apache Incubator in November 2018.

"Graduating as a Top-Level Project reflects the efforts of the Apache ShardingSphere community over the past year and a half," said Liang Zhang, Vice President of Apache ShardingSphere. "Since entering the Apache Incubator, ShardingSphere has evolved from a JDBC driver for sharding into a distributed ecosystem. We thank our mentors, contributors, and the Apache Incubator for their support, especially during the challenges with the coronavirus outbreak. Moreover, the community has been active and diverse, with more than 120 contributors from all over the world involved with the project."
 
The Apache ShardingSphere ecosystem comprises three sub-projects, nicknamed “JPS”, that form its database solutions:
 
  • ShardingSphere-JDBC —a lightweight Java framework that provides extra services at the Java JDBC (“Java Database Connectivity”) layer. It provides services in the form of a JAR (“Java ARchive”) that requires no additional deployment or dependencies. It can be considered an enhanced JDBC driver, fully compatible with JDBC and all kinds of ORM (Object/Relational Mapping) frameworks.

  • ShardingSphere-Proxy —a database proxy that provides a database server encapsulating the database binary protocol, to support any development language and any terminal (see the sketch after this list).
     
  • ShardingSphere-Sidecar (TODO) —a Cloud-native database agent for the Kubernetes environment that controls access to the database in the form of a sidecar (a supporting service deployed alongside the main application). It provides a mesh layer interacting with the database, known as “Database Mesh”.
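
The proxy model can be illustrated with a short, hypothetical sketch (not taken from the project's documentation): because ShardingSphere-Proxy speaks the database's own wire protocol, a client written in any language can point a stock driver at it and issue ordinary SQL against the logical, sharded schema. The host, port, credentials, schema, and table names below are placeholder assumptions.

```python
# Hypothetical sketch: querying a ShardingSphere-Proxy instance that exposes
# the MySQL wire protocol. Host, port, credentials, and schema are
# placeholders, not values from this announcement.
import pymysql

# The proxy looks like a single MySQL server to the client, even though the
# data behind it may be sharded across many physical databases.
conn = pymysql.connect(
    host="127.0.0.1",       # assumed proxy address
    port=3307,              # assumed proxy port
    user="root",
    password="root",
    database="sharding_db"  # logical schema defined in the proxy config
)

try:
    with conn.cursor() as cur:
        # Ordinary SQL: the proxy routes it to the relevant shard(s),
        # merges the results, and returns a single result set.
        cur.execute(
            "SELECT order_id, user_id FROM t_order WHERE user_id = %s", (10,))
        for row in cur.fetchall():
            print(row)
finally:
    conn.close()
```

The same SQL works unchanged whether the data sits in one database or many, which is the point of placing the sharding logic in a proxy layer rather than in the application.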
 
Apache ShardingSphere's highlights include:
 
  • Completely distributed database solution that provides data sharding, distributed transactions, data migration, as well as database and data governance features.

  • Independent SQL parser for multiple SQL dialects that can be used independently of ShardingSphere.

  • Pluggable micro-kernel that enables all SQL dialects, database protocols and features to be plugged-in and pulled-out by service provider interfaces.
 
Apache ShardingSphere is in use at more than 120 organizations, including the video sharing site Bilibili, China MINSHENG Bank, China Telecom Bestpay, DaoCloud, JD.com, Tingyun, and ZTO Express, among others.
 
"Glad to see the ShardingSphere community and contributors grow actively," said Hao Zheng, Senior Director of Jingdong Digital Technology Center. "It has already promoted and pushed the IT architecture of many enterprises to improve rapidly. ShardingSphere is widely used across JD.com, which validates its power and flexibility. Congratulations on graduation of Apache ShardingSphere from the Incubator!"
 
"In the past two years, we have witnessed ShardingSphere grow from small to large," said XiaoHu Zhang, General Manager and Senior Director of China Telecom Bestpay Technology Innovation Center, and Apache ShardingSphere committer. "It's a vibrant community with a group of contributors who are constantly contributing to it. Congratulations! We graduated!"

"Today, the number of customers and scenarios faced by the enterprise is increasing exponentially," said Grissom Wang, Products Vice President of DaoCloud. "Therefore, application architecture needs to transform from a traditional monolithic architecture to a microservice architecture. At the same time, more flexible data governance capability is needed, which can inherit the most familiar relationship database technology to meet the increasing data volume or new data usage scenarios. Relational database middleware is a suitable solution: it allows applications to continue to use the relational database access method, and at the same time fully and reasonably utilize the computing and storage capabilities of multiple relational databases in a distributed scenario. We researched many Open Source technologies, and chose ShardingSphere as the core component of DaoCloud database governance because of its functional characteristics, openness, scalability, and active community that meet the needs of the enterprise."

"Congratulations to the Apache ShardingSphere community," said Von Gosling, Apache ShardingSphere Incubator Mentor, original developer of Apache RocketMQ and OpenMessaging. "Graduation from the Incubator marks an important milestone for the Apache ShardingSphere project. This is recognition of the focus and hard work of the project members to learn The Apache Way and drive community around ShardingSphere. I am honored to have helped the project to successfully graduate, and wish its continued development in Cloud-Native Era."

"I am glad to see the ShardingSphere community becomegraduate from the Apache Incubator," said Dongxu Huang, Founder and CTO of PingCAP. "The community should be very proud of their abiity to develop such good Open Source software. With the continued efforts of the Apache ShardingSphere community, I am confident of their continued success in the future!"

"Apache ShardingSphere is a good Open Source distributed database middleware solution," said Lixun Peng, Member of MariaDB Foundation, Oracle ACE Director, and Vice President of ACMUG, the All China MySQL User Group. "Open Source is the mainstream of the world's software development. It's nice to see Chinese enterprises and developers become more invested in Open Source. I hope ShardingSphere continues to grow as part of the family of excellent Apache Open Source products."

"The construction and growth of the Apache ShardingSphere community has promoted the impressive development of Open Source products with new options for enterprise IT architecture," said Grace Guo, Sales Director of MySQL. "Congratulations on the graduation of Apache ShardingSphere! Looking forward to building collaboration between Apache ShardingSphere and MySQL communities to provide more diversified solutions for Open Source technologies and enterprises!"

"It's fantastic to see the work of the Apache ShardingSphere community being recognised," said Martin Woodward, Director of Developer Relations at GitHub, "We've been thrilled to see the community grow really well over the past two years with now over 120 direct contributors. This is thanks to the great work of the maintainers welcoming people to their project, with support from the Apache Software Foundation and their mentors. The team have also done a superb job on their documentation with easy-to-understand instructions available in both English and Chinese. Congratulations to everyone involved: a valuable addition for the whole Java community!"

"I'm very glad to hear that ShardingSphere graduated successfully," said Yanwei Zhou, Founder of ArkDB and Chairman of the Open Source database committee of China Computer Industry Association. "Another Open Source database project led by Chinese technology enthusiasts has officially become an Apache Project, which will further promote the development of Open Source database architecture, allowing more and more users to share the benefits of the technology ecosystem. I look forward to it continuing to get better and better."

"Congratulations to ShardingSphere for graduating as an Apache Top Level Project," said Yuchen Zhao, President of Tingyun, "In the past a few years, I've been very excited to see the progress that the ShardingSphere community has made, and expect the project will grow tremendously in the near future and make a deeper impact on database orchestration. As data becomes increasingly crucial for the digital world, Apache ShardingSphere provides an essential set of distributed database middleware solutions and implementations for making IT architecture easier, more robust, and secure. I recommend Apache ShardingSphere to anyone interested in building database solutions on massive and distributed data."

"Since entering the Apache Incubator, the ShardingSphere community has adopted The Apache Way of self-governance and has substantially increased the number of people using, developing, and supporting the project," said Craig Russell, Apache ShardingSphere Incubator Mentor. "The community has worked hard to make several releases under the Apache License and are expanding ShardingSphere’s functionality to meet the needs of the expanding number of Cloud-based enterprises that use the project."

"Apache ShardingSphere is on its way to becoming a standard distributed database solution," added Zhang. "When we were developing our initial architectural features and database dialects, it was clear that we needed contributions beyond those from our small group of dedicated individuals to accomplish the task. Thanks to our growing community, we are pleased to be graduating with our release goals completed. We welcome additional contributors to join the project to further diversify the Apache ShardingSphere community and develop a more flexible and lightweight platform together. It is a pleasure to collaborate with contributors in the open, and promote a fair and friendly atmosphere where we can enrich ShardingSphere and its community the Apache Way."

Catch Apache ShardingSphere in action at ShardingSphere Workshop (Beijing; 18 April 2020), DTCC (Beijing; 4-6 June 2020), and TiD (Beijing; 26-29 July 2020).

Availability and Oversight
Apache ShardingSphere software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache ShardingSphere, visit http://shardingsphere.apache.org/ and https://twitter.com/ShardingSphere  

About the Apache Incubator
The Apache Incubator is the primary entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. Code donations and communities from external organizations and existing external projects enter the ASF through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/  

About The Apache Software Foundation (ASF)
Established in 1999, The Apache Software Foundation (ASF) is the world’s largest Open Source foundation, stewarding 200M+ lines of code and providing more than $20B+ worth of software to the public at 100% no cost. The ASF’s all-volunteer community grew from 21 original founders overseeing the Apache HTTP Server to 765 individual Members and 206 Project Management Committees who successfully lead 350+ Apache projects and initiatives in collaboration with 7,600 Committers through the ASF’s meritocratic process known as "The Apache Way". Apache software is integral to nearly every end user computing device, from laptops to tablets to mobile devices across enterprises and mission-critical applications. Apache projects power most of the Internet, manage exabytes of data, execute teraflops of operations, and store billions of objects in virtually every industry. The commercially-friendly and permissive Apache License v2 is an Open Source industry standard, helping launch billion dollar corporations and benefiting countless users worldwide. The ASF is a US 501(c)(3) not-for-profit charitable organization funded by individual donations and corporate sponsors including Aetna, Alibaba Cloud Computing, Amazon Web Services, Anonymous, ARM, Baidu, Bloomberg, Budget Direct, Capital One, CarGurus, Cerner, Cloudera, Comcast, Facebook, Google, Handshake, Huawei, IBM, Indeed, Inspur, Leaseweb, Microsoft, ODPi, Pineapple Fund, Private Internet Access, Red Hat, Target, Tencent, Union Investment, Verizon Media, and Workday. For more information, visit http://apache.org/ and https://twitter.com/TheASF  

© The Apache Software Foundation. "Apache", "ShardingSphere", "Apache ShardingSphere", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.
 
# # #

Thursday April 02, 2020

Announcing New ASF Board of Directors

At The Apache Software Foundation (ASF) Members' Meeting held this week, the following individuals were elected to the ASF Board of Directors:

  • Shane Curcuru (re-elected Director)
  • Bertrand Delacretaz (former Director)
  • Roy Fielding (former Director)
  • Niclas Hedhman (new Director)
  • Justin Mclean (new Director)
  • Craig Russell (re-elected Director)
  • Sam Ruby (former Director)
  • Patricia Shanahan (new Director)
  • Sander Striker (former Director)

The ASF thanks Danny Angus, Rich Bowen, Ted Dunning, Dave Fisher, Myrle Krantz, Daniel Ruggeri, and Roman Shaposhnik for their service, and welcomes our new and returning directors.

An overview of the ASF's governance, along with the complete list of ASF Board of Directors, Executive Officers, and Project/Committee Vice Presidents, can be found at http://apache.org/foundation/ 

For more information on the Foundation's operations and structure, see http://apache.org/foundation/how-it-works.html#structure 

# # #

Tuesday March 31, 2020

Inside Infra: Chris Thistlethwaite

"Inside Infra" is a new interview series with members of the ASF Infrastructure team. The series opens with an interview with Chris Thistlethwaite, who shares his experience with Sally Khudairi, ASF VP Marketing & Publicity.




"I get very attached to the technology that I'm working with and the communities that I'm working with, so if a server goes down or a site's acting wonky, I take that very personally. That reflects on how I do my job.



Let’s start with you telling us your name --how is it pronounced?

It’s “Chris Thistle-wait” --I don’t correct people who say “thistle-th-wait”-- that’s also correct, but our branch of the family doesn’t pronounce the second “th”.

What’s your handle if people are trying to find you? I know you’re "christ" (pronounced "Chris T") on the internal ASF Slack channel.

Yeah --anything ASF-related is all under "christ".

Do people call you "Christ"?

They do! I first started in IT around Christmastime and was doing desktop support and office-type IT. When people started putting in tickets and my username was "christ" there, they were asking, "why is Christ logging into my computer right now?" and it became a thing. When I was hired at the ASF I told Greg (Stein; ASF Infrastructure Administrator) about that story, and he said "you gotta go with that for your Apache username."

When and how did you get involved with the ASF?


A long time ago I started getting into Linux and Open Source, and naturally progressed to httpd (Apache HTTP Server). Truth be told, that's where it started and stopped, but I've always been interested in Open Source and working with projects and within communities. Three years ago I was looking for a new job and stumbled across the Infra blog post for a job opening. I fired up an email, sent it off to VP Infra, and that's how everything started. The ramp-up of the job was diving deep into everything there is with the ASF and Open Source --which I am still doing. I don't think I've found the bottom of the ASF yet.


How long have you been a member of the Infrastructure team?


This November will be my fourth year.


What are you responsible for in ASF Infrastructure?


Infrastructure has a whole bunch of different services that are used by both Apache projects as well as the Foundation itself: the Infrastructure team builds, monitors, supports, and keeps all those things running. Anything from Jenkins to mailing lists to Git and SVN repositories; on the back end of things, we keep everything working for the Foundation itself within, say, SVN or mailing lists, keeping archives of those things, keeping your standard security and permissions set up and split out. Anyone you ask on the Infra team will say: "I do everything!" It's too hard to explain --it's quite possibly a little bit of everything that has anything to do with technology --as broad as it can possibly be.


So you really have to be a jack-of-all-trades. Do you have a specialty, or does everybody literally do everything?


Everyone on the team generally does everything --for the most part any one of us can jump into the role of anyone else on the team. Everyone has a deep knowledge of a particular or a handful of services that they’ll take care of --like, Gavin (McDonald; ASF Infrastructure team member) knows more about Jenkins and the buildbot and build services than most people on the team. At any one given point we’re on call and need to be able to fix something or take a look at something, so everyone needs to be versed enough in how to troubleshoot Jenkins. That can also be said for not just services that we offer, but also parts of technology, like MySQL or Postgres or our mail system or DNS: we do have actual physical hardware in some places, and we have VMs everywhere too, so sometimes we’re troubleshooting a bad backplane on a server or why a VM is acting the way it is. There's a very broad knowledge base that all of us have but there are specifics that some people know more about than others.


How does ASF Infrastructure differ from other organizations?


There are a lot of similarities but a ton of differences. A big part of how Infra is different is, to use a "Sally-ism": if you look at it on paper, it wouldn't work --I've heard you describe the ASF that way. If you explained the way things work at the Foundation to somebody, they would literally think that you're making it up and there's no way that it would possibly be working the way that it does. There's a lot of that with the Infrastructure team too: many people that I keep in contact with that I've worked with over the years, from my first job where we would buy servers, unbox them, rack them, wire them up, set them up, and run them from the office next door to us --I'd be impressed whenever I had 25 servers running in our little "data center" at that job, and now I talk to these guys about what we do at the ASF: we have 200 servers in 10+ different data centers that are vendor-agnostic and we make it all work. They ask: "how the heck do you do that?!" We just do --it's an interesting thing as to how it all works together because we solve problems that others have as well, but their problems are often centralized to one thing, or a data center that they control and own, or one cloud provider that they control and own, where they deal with a single vendor and possibly at most have to talk with the same vendor in two different geographical areas. We're having to deal with stuff with one cloud vendor that's a VM and other stuff on the other side of the world that's actual hardware in a co-location or data center running and the only thing that makes them the same is that they're on the Internet.


It's a good summation of the team too due to the fact that we’re all based out of worldwide locations, we’re not all in one spot doing something.


Describe your typical workday. Since you're all working on different things on such a huge scale, what's it like to be you?


"It's amazing" [laughs]. Everyone on the team generally has some project or projects that they are working on --long-running things for Infra. 


I'm currently working on rewriting a script for Apache ID creations. The process is: you put your ICLA in, send it off to the Secretary, the Secretary says, "OK, good," puts in all your data, and that gets put into a file in SVN ...currently, we have a script that we manually run that does a bunch of checks on the account and whatnot, and then creates it, sends off a welcome email, whatever. I'm rewriting that because it's an old script and it's in several different languages. It's actually six scripts that all run off of one script. I'm consolidating that into one massive script that's in a supported language for us, and then moving forward with it into something that we could potentially automate, versus me having to run a script manually a couple of times a day.


Fluxo (the ID/handle for Apache Infra team member Chris Lambertus) was working on some mail archive stuff in our mail servers. Gavin (Apache Infra team member Gavin McDonald) is working on some actual build stuff. Everyone has kind of "one-two punch" tasks that they work on during the day, and then the rest of the time is (Jira) tickets or staying on top of Slack, if people are asking questions in the Infra channel or in our team channel or something like that. The rest of it is bouncing around inside the ASF and checking things out, or finding out new projects to work on, or ways to improve such-and-such process. 


How many requests does Infra usually receive a day, in general?


Over the past three years, we've resolved an average of 6 Jira tickets a day, year-round. We've had 213 commits to puppet repositories in the last 30 days. We handle thousands of messages on our #asfinfra Slack channel, and have had 659 different email topics in the last year.


Dovetailing that, how do you keep your workload organized?


Everyone on the team does it their own personal way. I have a whiteboard and a Todoist list. We also have Jira to keep our actual tickets prioritized and running. We have a weekly team meeting/call to talk about things that are going on, which is the more social aspect of what we do week-to-week.


How do you get things done? You're juggling a lot of requests --what's the structure of the team? How do you prioritize when things are coming in? Is there a go-to person for certain things? If you're sharing everything, how do you balance it and who structures it? How does that work? 


To one end, the funnel to us starts with Greg and David (ASF Infrastructure Administrator Greg Stein and VP Infrastructure David Nalley). It's different from other places that I've worked, where I'm on a team of other systems administrator/engineering people and we have a singular, customer-facing site. Someone says, "Hey, this should be blue instead of red," there's a ticket, we make the change, and then it goes to production.


There're many different ways to get a hold of the Infrastructure team. Everyone gets emails about Jira tickets and gets updated as soon as one of those comes in. If it's something that you know about --say, the Windows nodes that we handle-- those all fall into my wheelhouse because I'm the last one to work with Windows extensively. Everyone else knows how to work with them, but it makes more sense for me to pick it up in some cases. 


Most of the stuff in Jira is very "break-fix" kinds of things. A lot of the requests on Slack are too, for example: "DNS is busted," and we fix DNS. It's a very quick, conversational, "Let me go change that," or, "I'm going to go fix that real quick." Of course, some of the Jira tickets are very long-running, but the end result is they're fixing something that used to work.


We were originally running git.apache.org, and Git WIP, so we hosted our own internal Git servers and we would read-only mirror those out to GitHub. Somewhere along the line, Humbedooh (the ID/handle for Apache Infrastructure team member Daniel Gruno) started writing Gitbox, or building Gitbox, based on the need to have writable GitHub repositories. He built Gitbox, set it up with the help of some other people on the team, got it going, and that became our replacement for git.apache.org. While we still host our own Git repositories, people are free to either write to ours or write to GitHub, and the changes are instantaneously mirrored between the two.


We had Git hosted at the ASF, and had GitHub as a read-only resource. The need arose to have write access on both sides: Humbedooh went and built out MATT (Merge All The Things), which does all of the sync between GitHub and our Git instance.


MATT started a while ago, and Humbedooh added on to that to do the writes to GitHub. Basically what all that does is, once your Apache ID is created, or if you have one already, you go on ID.apache.org, you add your GitHub username in there, and then MATT --there's another part of that called Grouper-- MATT and/or Grouper will run periodically, pull data from our LDAP system and say, "Oh, ChrisT at apache.org has ChrisT as his GitHub ID. I'll pull those down." It says, "ChrisT is in the Infrastructure group. Hey look, there's an Infrastructure group in GitHub. I'll give ChrisT write access to the GitHub project." In a nutshell, that's what that does.


There's a ton of other housecleaning things; if you get removed from the LDAP group ... we run LDAP and keep all this stuff straight. If you get removed from the Infrastructure group in LDAP, then MATT/Grouper will go and say, "Oh, this person's not in this LDAP group but they do have access in GitHub. Let me pull that so that they don't have access to that any more." It does housekeeping of everything as well as additions to groups and that kind of thing. There's a ton of technical backend to that, and that's what Humbedooh's doing.
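
To make that flow concrete, here is a minimal sketch of an LDAP-to-GitHub team sync in that spirit. It is emphatically not MATT or Grouper's actual code: the LDAP host, base DN, attribute names, GitHub organization and team slugs, and the Apache-ID-to-GitHub-username mapping are all assumptions made purely for illustration.

```python
# Hypothetical sketch of an LDAP -> GitHub team sync, in the spirit of
# MATT/Grouper as described above. Hosts, DNs, org/team names, and the
# ID-to-GitHub-username mapping are placeholders, not ASF configuration.
import requests
from ldap3 import Server, Connection, ALL

GITHUB_TOKEN = "ghp_placeholder"
ORG, TEAM = "example-org", "infrastructure"
HEADERS = {"Authorization": f"token {GITHUB_TOKEN}",
           "Accept": "application/vnd.github+json"}
API = "https://api.github.com"

def ldap_group_members(group_cn: str) -> set[str]:
    """Return the user IDs listed in an LDAP group (schema assumed)."""
    conn = Connection(Server("ldaps://ldap.example.org", get_info=ALL),
                      auto_bind=True)
    conn.search("ou=groups,dc=example,dc=org", f"(cn={group_cn})",
                attributes=["member"])
    ids = set()
    for entry in conn.entries:
        for dn in entry.member.values:
            # "uid=christ,ou=people,dc=example,dc=org" -> "christ"
            ids.add(dn.split(",")[0].split("=")[1])
    return ids

def github_team_members() -> set[str]:
    """Current members of the GitHub team mirroring the LDAP group."""
    r = requests.get(f"{API}/orgs/{ORG}/teams/{TEAM}/members", headers=HEADERS)
    r.raise_for_status()
    return {m["login"] for m in r.json()}

def sync(ldap_to_github: dict[str, str]) -> None:
    """Add people who joined the LDAP group, remove people who left it."""
    wanted = {ldap_to_github[uid] for uid in ldap_group_members(TEAM)
              if uid in ldap_to_github}
    current = github_team_members()
    for login in wanted - current:   # in LDAP, not yet on the GitHub team
        requests.put(f"{API}/orgs/{ORG}/teams/{TEAM}/memberships/{login}",
                     headers=HEADERS).raise_for_status()
    for login in current - wanted:   # removed from LDAP, still on GitHub
        requests.delete(f"{API}/orgs/{ORG}/teams/{TEAM}/memberships/{login}",
                        headers=HEADERS).raise_for_status()
```

A production tool would also need pagination, rate-limit handling, and auditing; the sketch only shows the add/remove core of the housekeeping described above.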


At first, when Git and GitHub were set up, it was fine: the ASF has to keep the canonical record of everything that goes into each project. You could only write to our Git repos. Then it was conveniently mirrored out to GitHub because there are a lot of tools that GitHub has that we didn't have or weren't prepared to set up. GitHub has a very familiar way of doing things for a lot of developers. Once GitHub Writable came along with Gitbox and the changes to MATT, that opened up a whole other world of tools for people on projects to use. If they wanted to use pull requests on GitHub, they could start using pull requests on GitHub to manage code. They could wire up their build systems to GitHub with Jenkins so that whenever a PR was submitted and got approved, it would kick off a build in Jenkins and go through unit tests and do all the lovely things that Jenkins does.


It was really an evolution of, "Here's the service that we have. Someone, somewhere, be it infrastructure or otherwise, once they have writable GitHub access, here we go." And here's the swath of things that that now opens up to projects inside the ASF that if they could come and set up a project with us, and then never, ever actually commit code to the ASF, it would always go to GitHub but still be safe and saved on our GitHub servers for ASF project reasons.


At the same point, we saw a need and said, "Let's build this out and go." Another funnel that comes into us is when we're on-call, something breaks and we ask, "Why do we do it this way? We should be doing it a different way." We then come up with a project to fix that or build it. It's a very interesting process of how work gets into the Infrastructure team.


It's been an interesting ride with that one.


There's always stuff that we're working on and fixing and making better. For the most part, Gitbox as it is now is kind of in a state of "it's getting worked on". If there are bugs that need fixing, they get fixed, but I don't know what the next feature request is on Gitbox. There's talk of other services ...like GitLab. If someone wanted to write code and put it in GitLab as opposed to GitHub, then someone would need to come in and write the connector from Gitbox to GitLab. So it's possible. I don't know if that's necessarily an Infrastructure need as much as it is a volunteer need for Infra. But it's a system that can be set up to work with any other Git service, as long as someone goes in and writes that.


You brought up an interesting point here, which is volunteers. Do volunteers contribute to Infra also? 


We sometimes have volunteers, yes. We have a lot of people on the infra mailing lists that will bounce ideas back to us or they'll work on a ticket or put in a pull request.


Well, the need is not as critical because you have a paid team, versus Apache projects. 


Right. That's exactly true. There's a bit of a wall that we have to have because we work with Foundation data, which not everyone has access to. Granted, we're a non-profit, Open Source company and everything's out there to begin with, but usernames and passwords of databases and things that we have encrypted that the team has access to aren't necessarily something that you would want any volunteer to have access to.


How do you stay ahead of demand? This is a really interesting thing because part of it is you're saying, "Necessity is the mother of invention." You guys are doing stuff because you've got those binary, "break-fix" types of scenarios. In an ideal situation, do you even have enough runway to be able to optimize your processes? How do you have the opportunity to fix things and improve things as you're going along if you're firefighting pretty much all day long?


That's a really good question about just how our workflow is. In other companies that I've been in, there's the operations people that are doing the "break-fix", and then there's the development people that are doing "the next big thing". The break-fix folks are spinning the plates and keeping them spun without breaking, and that's a lot of firefighting. That's literally all that job is. Even when you're not firefighting, you're sitting around thinking about firefighting in a sense of, “when is this going to fall over again? If it does fall over, what can we do to fix it so it doesn't do that anymore?" And in the past, the break-fix guys, the firefighters, would end up saying, "Hey, there's this thing that needs fixed." And it would fall over the wall to the developers. They would develop the fix for it, and then it would go back into production and then the cycle continues. 


To some extent, that's kind of where DevOps came from: if you merge the two of those together, then while you're firefighting you can also write the fix for the problem, and then you don't have to wait for the lag between the two. We don't have that split here. Everyone on the team is firefighting with one hand and typing out the solution with the other. And a lot of the time, our project work, like getting a new mail server spun up or my task to rewrite the workflow for new Apache ID creations, I've been working on that for a very long time because it will keep falling off ... it gets put on the backburner while we're like, "Hey, we found out that our TLP servers are getting hammered with downloads from apps and people trying to use them instead of the mirror servers." So, let's set up downloads.apache.org and we can funnel stuff over to that, so that that server can get hammered and do whatever it needs to do, and our www site and all the Apache Project websites stay up and running in a more reliable way.


What's the size of the teams that you were dealing with before that had a firefighting team and a dev team versus ASF infra?


The last "big" corporate job I had was ...six ops people that kept the site going, four database people, another eight technical operations-type people… all-told it was about thirty.


There were technically thirty firefighting people, and we had a NOC (network operations center) that was literally people who only watched dashboards and watched for alerts. Whenever those went off, they'd call the firefighting people. The NOC was another 20 people. And then the development teams were ... twenty to fifty people.


What kind of consumer base were they accommodating? Does it match the volume that ASF has? Was it more of a direct, enterprise type of, "We have a customer that's paying, we have to respond to them" situation? Or is it different?


This was at a financial services company that transacted on their Website: completely different from the type of stuff we're dealing with here at the ASF. Volume-wise, they were much smaller, but it was much more ...visible, as their big times were at the start of market and end of market. After end-of-market came all the processing for the day to get done before markets started the next day. The site had to be up 100% of the time. We had SLAs of five minutes. If you got paged or something broke, you had to get the page and respond to it in a way of, "Hey, this is what's going on and these are the people that I need involved with it," all within five minutes of it going off. That was the way the management structure was. It was intense.


In scale, Apache probably does way more: they do way more traffic across all of our services in any given day. If someone doesn't get mail for a little bit, then they come and tell us or we get alerted of it by our systems, and we go and we fix it and we take care of it. But with the financial services group, people were losing money: dealing with people and money is just a very stressful situation for anyone working in technology because you have to get it right and it has to be done as fast as possible before someone's kids can’t go to college anymore. It was a completely different minefield to navigate.


The type of stress that's involved or the type of demand or the pressure is different, but you also have the responsibility with ASF that systems have to be up and running. I understand it's not mission critical if something goes down for more than five minutes, which is different in the financial sector, but do you feel that same type of pressure? Is it there or is it completely different for you? 


No. I think I do because we also have SLAs here: they're just not five minutes. We have structure around that and the way that we handle uptime and that kind of thing. I get very attached to the technology that I'm working with and the communities that I'm working with, so if a server goes down or a site's acting wonky, I take that very personally. That reflects on how I do my job. If a server's not working or if something's broken either because of me or something externally that's going on, I want to get that up and running as fast as possible because that's how I would expect anyone to work in a field that has ...any technology field, for that matter. And generally, that's the same attitude the rest of the team has as well.


How has ASF Infra changed over the years?


It's matured quite a bit. When I first started, it was Gavin, Fluxo, Humbedooh, Pono (former ASF Infrastructure team member Daniel Takamori), and me. There were five of us. The amount of stuff that we got done, I'm like, "Man, there's no way that five people can do this."


That's kind of what I'm pointing at. If you're a team of eight or five or twelve or whatever, compared to the other thing that you did with the other job that had maybe a core team of twenty, thirty --that in itself is insane.


We were five people, everything was very, "Here's the shiny thing we're working on," and then something else would come up and we'd have to jump on that. Then something else would come up and we'd have to jump on that. We were very ...I don't want to say we were stretched thin, but there wasn't necessarily ...time for improvement.


There was a lot of stuff we still had on physical hardware, and a couple of vendors that we no longer use. But things were moving more towards a configuration-based infrastructure with Puppet, instead of one person building a machine, setting up all the configs themselves, installing everything, and then letting it go off into the ether to run and do its job. We were moving everything towards Puppet, where you configure Puppet to configure the server. Then if the server breaks, goes down, goes away, or we need to move vendors or whatever, all you need to do is spin up a new server somewhere else and give it the Puppet config; it configures itself and then goes off into the ether to run and do whatever it needs to do.
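
For readers less familiar with that configuration-as-code approach, here is a minimal, purely illustrative sketch of the idea in Python. The ASF team uses Puppet, and nothing below reflects their actual manifests or tooling; the package, service, and helper names are hypothetical. The pattern is the same, though: declare the desired end state, then let a converge step make only the changes needed, so the same definition can rebuild a lost server anywhere.

    # Toy, hypothetical sketch of declarative configuration (the idea behind Puppet):
    # describe the state you want, check reality, and change only what differs.
    # Package/service names are placeholders; this is NOT the ASF's actual setup.
    import subprocess

    DESIRED_STATE = {
        "packages": ["nginx"],   # packages that should be installed (example only)
        "services": ["nginx"],   # services that should be running (example only)
    }

    def package_installed(name):
        # 'dpkg -s' exits 0 when a Debian/Ubuntu package is installed.
        return subprocess.run(["dpkg", "-s", name], capture_output=True).returncode == 0

    def service_running(name):
        # 'systemctl is-active --quiet' exits 0 when a systemd unit is active.
        return subprocess.run(["systemctl", "is-active", "--quiet", name]).returncode == 0

    def converge(state):
        # Idempotent: running it twice in a row makes no further changes.
        for pkg in state["packages"]:
            if not package_installed(pkg):
                subprocess.run(["apt-get", "install", "-y", pkg], check=True)
        for svc in state["services"]:
            if not service_running(svc):
                subprocess.run(["systemctl", "start", svc], check=True)

    if __name__ == "__main__":
        converge(DESIRED_STATE)  # safe to re-run; it only fixes drift

Puppet of course does far more (dependency ordering, templating, reporting), but this convergence loop is the core idea behind "configure Puppet to configure the server."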


That's great. More automation.


Right. We were automating a lot more stuff right when I first started. Over the course of the next year, the team kind of ebbed and flowed a little bit until we were eight in the last year. We started to get to the point of "where can we point the gun next? What can we target next to get it taken care of and done?" That's where we started taking on more specific infra projects, for instance, mail. Our mail server has been around since the dawn of time, and it's virtualized so it moves servers every now and then, but the same base of it is quite old by technology standards.


Fluxo started moving this on to newer stuff and he got that going. We started taking care of projects that were not broken, but needed to be worked on. Instead of waiting for it to break, we're fixing and upgrading and moving down that path versus firefighting, break-fix, that kind of thing. We were moving more towards, "Hey, I see a problem. I have time. I'm going to take care of that and make that into a more serviceable system." 


Automation has helped quite a bit with that. I also think that as the team grew, tickets, emails, and chat were getting responded to quicker. And then we also could focus more on the tools that we use for the foundation. Like, HipChat was going away. We needed a new chat platform, so we chose Slack. We updated and moved everything over to Slack, and that's where we are with that. It started coming into its own, with workflows of, "Oh, okay. How do we get this done? Let's go do that."


What areas are you experiencing your biggest growth? Is it a technical area? Like, "Hey, all of a sudden mail's out of control"? Or, "Hey, we need to satiate the demand for more virtual machines," or is it a geographic influence that's coming in in terms of draw? Where are you guys pointing all your guns to?


Currently we're trying to get out to the projects more and talk to people more often. Not that we didn't do that before, because at ApacheCons and any Meetups that we had, Infra would always have a table. We were always accessible, but we were always passively accessible. We weren't really going out and talking to projects proactively to say, "Hey. What do you guys need from us? What are we doing with this?" So that's one part of it, something we're moving towards a little bit. It's not at all technical, but more of a foundation-broadening, community-broadening thing that we're doing.


That's one part of it. The other thing that we're doing, from a more technical or infrastructure standpoint, is really trying to get our arms around all of the services we provide, and then take a look at those and ask: how is this used inside the ASF? How is it used in the industry as a whole? Do we need to put more time and energy towards those things in order to make the offerings of the Infrastructure team a bit more of a solid platform? On top of any other automation and that kind of stuff, those are really the two spots where I see Infra growing a lot in the next year-ish: really boiling down our services to, "Hey, we've seen a lot of people using this. A lot more projects are using this. It's not just a flash in the pan. We need to build out more infra around blah service, so let's really do that and make it a solid platform to use."


What do you think people would be surprised to know about ASF Infra? When you tell someone something about your job and they go, "Whoa, I had no idea" or, "That's crazy." What would people be surprised to know?


That Apache has an infrastructure team. [laughs]


Why are you saying that?


Because honestly, I don't think a lot of people know about the Infrastructure team. Those that do have used us for something, talked to us about something, or worked with us on something. Those that don't are like, "Oh, I didn't know the ASF paid people to be here," --that kind of thing. Those are the two reactions I've gotten from people. It's like, "Oh, that's cool. You work for the infrastructure team." Shrug. And then the other people are like, "Oh, sweet. Yeah, that's great. I know Gav. I've worked with him on blah, blah, blah." But that's not necessarily surprising. I mean, it is in a sort of way.


When people ask, "What are you doing for work?" and you say you work for ASF, do people even know what that is? Do they know what you're doing? Do they care? Are they like, "Oh, okay. Whatever"?


There are literally three types of people that I've run into that ask, "Oh, what are you doing for work?" One is the person that has no idea what the ASF is, not even the vaguest hint of Apache, and they're like, "Oh, okay. That's cool." The next person does, and may or may not know about the ASF but knows of Apache, the Web server, or some other lineage of that. They're like, "Oh, whoa. That's super cool. It's impressive. That's wild." Then the third type asks, "Why are 'Indians'/Native Americans running software? That doesn't make any sense to me" and "Are you on a reserve?" I swear to God I've gotten that question before. I don't even know how to answer that. I'm like, "No, buddy."


Are these technologists or are these just guys off the street? Are they in the industry?


Guys off the street. I say Apache Software Foundation, and to them "Apache" and "software" together doesn't make sense. Actually, I've gotten mean tweets too whenever I've been tweeting about being at ApacheCon --things like I'm "taking away" from Native Americans and whatever...


We also get that on Twitter, on the Foundation side: we get included in tweets about some kind of violation along the lines of, "Stand up for the ..." I get it. From time to time we also get sent these "How dare you?" letters, that sort of thing. It's an interesting challenge, the whole "why are Native Americans running this thing?" misinterpretation.


Let’s move on. What's your favorite part of your job?


The whole job is my favorite part of the job.


That's funny because everyone at Infra ... You know how people have bad days or may be grumpy or whatever, in general you guys seem to all like each other. You all have a great camaraderie. You all get along. You work really closely together. It's a very interesting thing to see from the outside. Is that true? Or are you just playing it up? Does it really work that way?


That's absolutely true. I've found that generally speaking, when you get a bunch of nerds together, they either really like each other and everything works or they really don't like each other and nothing gets done. The team is great, and it's like no other team I've ever worked with before. But it's very odd because you go through the interview process, and the interviews are interviews. I mean, you get to know people in interviews, but not really. Then you start working with people, and at some point you start getting below the surface. And at some point you get deep enough to where you find out whether or not ...how you gel with all these people. 


It's very odd that all of us have the same general sense of humor. We'll talk about food non-stop in the channel, and recipes and cooking, and different beers or different whatevers. It's nice to get to that point with a team, where you're comfortable enough with everybody to ... like I said, I've been here three years and there is still so much that I don't know, both technical and non-technical, about the ASF. I ask very dumb questions in channel and say, "I have no idea why this is doing this this way," or, "Can someone else take a look?" or, "I don't know what I'm doing here." And never in the entire time I've been here, from day one until now, has anyone ever chastised me for not knowing something or said anything about the way that I work or something like that. Well, at least not in channel. At least not publicly.


Everyone's very supportive. It doesn't matter if you know everything there possibly is to know about one singular product or thing you're working on, or don't know anything about it. You can ask questions and really learn about why it was done the way it was done, or figure out how to fix a problem. No problem on the team. It's just like, "Okay, yeah. This is what you have to do." Or, "Here's a document. Read up on it." Or, "I don't know either." And then out of that comes an hour of conversation and then a document pops out, and then the next person that asks, we can say, "Here, go read the doc." Yeah. I mean, we're all very happy. Very happy.


Which is really good. Looking back when you first started, what was your biggest challenge when you came onto the team?


Oh man. I look back at that and I feel like the learning curve was ... It wasn't a curve. It was a wall. I've used Linux, I've used Ubuntu for a while and various other flavors of Debian and whatnot, so getting spun up on all of ...expanding my Linux knowledge was a big deal, expanding everything about the ASF and how it works. Which I'm still trying to figure out. If you know, send me something to read to figure out how that all works. I mean, I don't want to sound like I was completely out of my depth and I have no idea what I'm doing, but I feel like I was completely out of my depth and I had no idea what I was doing. 


There's a lot about the ASF that is just tribal knowledge, and there's a lot about Infra that's tribal knowledge. It's just that no one has anything written down --"the server's been running under Jim's desk for the last 15 years in a basement that has battery backups and redundant Internet, so it's never gone down. But don't ever touch that server, because if it goes down, then all of our mail goes down" or whatever. There was a lot of figuring all that out for myself and digging around. Which is, frankly, one of the parts that I really enjoy: "Hey, this thing broke. I've no idea what that thing is. I've no idea where it lives," and just diving in and trying to figure out what's going on with it and how it's built, and then the hair trigger that sets it off to crash and never work again. Yeah. That's an interesting question too.


What are you most proud of in your Infra career to date? You're talking about overcoming these challenges, I'm always curious just to see what people are like, "Yeah, I'm patting myself on the back for that one" or, "Ta-da. That's my ta-da moment."


I did lightning talks at ApacheCon Las Vegas and didn't get a phone call from you when I was done. [laughs]


I wasn't at lightning talks --what did you say? What would make me call you?


I didn't say it. We were on stage, and it's John (former ASF Infrastructure team member John Andrunas), Drew (ASF Infrastructure team member Drew Foulks), and I, and we figured we'd do lightning talks: "Hey, we're the new guys: ask us infrastructure questions." A week or two before ApacheCon, there was a massive outage at a particular vendor. It wasn't: "Oh, our server's down for a while," the server went down and then it was *gone*. It got erased from the vendor side. I can't remember what service it was. There was something that disappeared two weeks before Vegas and never came back. 


It wasn't just us, though: tons of companies had this issue. So we're on stage answering questions, and someone asks where this service went: "What happened to XYZ?" And John has the mic and he goes, "You should probably go ask [vendor name]." At that point it was very widely published that the vendor's response was along the lines of, "Whoops, someone tripped over the cord that powered the data center, and when it came back up, it deleted all of your VMs." They totally acknowledged it and they didn't give refunds for it, so it was a little bit of a PR kerfuffle for them. The vendor is in the other room handing out buttons and stickers, and John was like, "Oh yeah, go ask the [vendor] guys what happened to your server. That's their fault." He said it jokingly, but my jaw dropped.


[laughs] No one told me this story. No one said anything. Someone's trying to protect you. I had no idea this happened ...oh my gosh.


Well, David Nalley was in the back of the room, and he's screaming with his hands cupped around his mouth, "Don't badmouth the vendor and the sponsors." I deflected and quickly moved onto something else. [laughs]


But yes, that's another good question that I haven't actually reflected on. Looking back and seeing where Infra was when I first started and where it is now: it was a very runnable, very good team then, and it's a very runnable, very good team now. I feel like a lot of the work that I've done, and a lot of the work that the team has done over the last three years, has been getting from a spot of "everything's on fire, who's holding up what this weekend?" to things being stable and us nitpicking over whether or not something needs to be updated. That's huge. It's like the step from starting a company and treading water to being profitable and having resources to do other things, versus just keeping your employees paid. It's a big step for a company, and it's a big step for Infrastructure.


I love your talking about how you guys are tightly-knit and all that. How would your co-workers describe you?


The other odd part about that too is being completely remote and not having day-to-day, face-to-face interactions with people. You get a very odd sense of people through text for a 24-hour period that you're online reading stuff. It's a different perspective than if I was in the office every day, working on something and interacting with people. Even though every day, except for the weekends, I'm online talking to these guys and doing stuff. How would they describe me? Dashingly good looking and ... I don't know. [laughs]


I know that Infra's "just Infra," right --you guys are all under the Infra umbrella. Do you have a title? When you got hired, what did they call you?


We're all systems administrators. The only person that actually has a title is Greg, and he's Infrastructure Administrator.


What are the biggest threats you face? For infra folks or systems administrators or infrastructure administrators even, what do you need to watch out for these days? What's big in the industry? Is everyone saying, "Oh, XYZ's coming"? In terms of your role in the job: is there something that you need to keep your eye on? Is there something that you would advise other people, "If you're in this job keep an eye out for blah, this is a new threat" or anything along those lines?


General scope stuff. 16 years ago, everything was hardware: you bought hardware and you had to physically put it somewhere. And virtual machines came along about the same time. People were starting to do virtual stuff to where you could have a physical machine and then multiple machines running on that, sharing resources. Then cloud and infrastructure as a service, and everything's been moving more and more towards that over the years.


Of course, there are still people that work in office IT, doing desk support stuff or office infrastructure type things. Those are still a majority of how things run at companies. As everything moves more towards the cloud or hosted services, systems administrators are becoming more like software engineers, and software engineers are becoming more like systems administrators. They're melding into one big group of people. Of course, there are still people that only write software. But gone are the days where someone would write some code and say, "I need to deploy it and get it out to all these computers." They would write the code and hand it off to a systems person. Systems would go and configure whatever server to get it out to however many machines, hit the button, and go. The software developer never really needed to know the hardware specifics of the systems it was going to run on, and the systems people never really needed to know how the software was put together. There are exceptions to that, but for the most part ...


Over the years, it's fallen into a thing now where the software developer knows exactly what systems this is going to run on and how it's going to run there, so it's more efficient, things work better, and they're releasing less buggy code because they're closer to the hardware. And the systems people want to troubleshoot it more, work with it, and fix problems because they're closer to the software and know more about its internal workings and how it's going to run on systems. Everything is getting chunked down more and more: first it was VMs, then cloud, then containers with Docker and things like that, and it's going to get virtualized down further into that. So it pays to know about container orchestration and things like Kubernetes and Apache Mesos. The reality is people run Kubernetes, people run Docker, people run everything. That's the interesting thing about how we do it at the ASF: we don't require folks to do just one thing.


In terms of where the industry's going ... everything's getting pushed down to "a developer can work in a container on a set of systems, write software for that and then deploy that to a machine themselves, never involving a systems engineer at all, and build a product using that." It's getting stuff out the door faster, and it's also chipping away at that long-standing unicorn of the industry: even today, I develop this thing and it works on my machine, but if I move it over to another computer, it stops working. Why? What's the problem with that? Containers fix that problem. The container that runs on my system runs the same way on every system everywhere. It takes the "runs on my machine" thing out of the equation.


What's your greatest piece of advice? What would you tell aspiring sysadmins?


Part of the ASF is the community behind it, and a giant part of that is what makes it work. I mean, you could say all of it: that's what makes everything work with this. Right when I first started the sysadmin kind of thing, I didn't get into Meetups and Linux User Groups and any of that stuff. I didn't get into the network. I didn't go into the community that I had around me. And honestly, I don't know if that's because it didn't exist or because I didn't know about it or what, but now that I'm older and wiser, the community part of it really does have a massive benefit. Aside from socialization, or networking and how to get a better job through networking, getting together with like-minded people and talking through your problems is an amazing tool to use. I didn't do that enough when I was a sysadmin starting out, and looking back, something I regret not doing was really sharing knowledge with other people in the community and building a group of people that I could ping ideas off of, or help with other ideas, or share in the knowledge of, "Hey, this is what's going on in the industry" or, "Hey, I saw this at work the other day. How do we work around that?" It's much easier these days with social media: the never-ending amounts of social media. But it's a big, important part of my day-to-day now, that I wish I had 16 years ago.


That's powerful. OK, if you had a magic wand, what would you see happen with ASF Infra?


If I had a magic wand, I'd update our mail server instantly or maybe magic wand a few other projects.


Wait. I know you're joking, but what is the problem with the mail server?


It's running on an older version of FreeBSD that doesn't play well with our current tools. Some form of that server has been upgraded, patched, moved, migrated, and so on for the last 20 years. We want to bring it up to more modern standards. Mail runs fine for the most part, but it's probably the most critical service we have at the ASF and we want to make sure everything continues to hum along. Because of that, it's a huge project that touches a ton of different parts of our infrastructure.


How big is it?


It's all of our email. Every email that goes through an apache.org address.


This is a huge project and Chris (Lambertus) has been working on it for a while --it's not a simple thing to fix. It's very, very complicated. We couldn’t do it without him.


Back to the magic wand thing: I'd wish for more wands. 


Chris is based in Pennsylvania on UTC -4. His favorite thing to eat during the workday is chicken ramen.


# # #

Sunday March 08, 2020

The Apache Software Foundation Statement on the COVID-19 Coronavirus Outbreak

As a global organization with contributors on every continent, safeguarding our community is our highest concern, especially during the public health emergency presented by the COVID-19 coronavirus outbreak.


The World Health Organization and US Centers for Disease Control continue to release updates: we are actively monitoring the situation as part of our commitment to helping protect individuals from contracting or spreading the virus. 


Effective immediately, The Apache Software Foundation (ASF) strongly recommends suspending all travel associated with official ASF business and events through May 2020, after which we will reassess the restriction period. This applies to official Apache Conferences*, including Apache Roadshows in Washington DC (25 March) and Chicago (18-19 May), as well as beneficiaries of the ASF Travel Assistance Committee.


Of course, exceptions need to be considered. We implore those who must travel to review the WHO's Travel Advice https://www.who.int/emergencies/diseases/novel-coronavirus-2019/travel-advice and the Centers for Disease Control and Prevention's comprehensive Information for Travel reports at https://www.cdc.gov/coronavirus/2019-ncov/travelers/index.html 


With email being the ASF's primary method of communication for more than two decades, we do not anticipate significant disruption to ASF operations or to Apache Projects and their communities. Where possible, those organizing in-person assemblies may wish to consider holding virtual events or postponing, as opposed to cancelling.


Many members of our community work remotely. Whilst working from home may not be possible for some, we urge everyone to practice caution and be proactive with frequent hand-washing, using hand sanitizer, covering coughs and sneezes, and handling food safely. We urge those who are at risk or feeling unwell to stay home and take care of themselves. As symptoms can take more than three weeks to appear in those affected, we commend those who encourage their friends, family, and coworkers to take proper precautions.


We will continue to monitor this rapidly changing situation, and endeavor to provide updates as early as possible.


*Please follow the Notice on Apache 2020 Conferences at https://s.apache.org/zgm8m for the latest updates on Apache events.


# # #

Thursday March 05, 2020

Notice on Apache 2020 Conferences

In light of the World Health Organization raising the threat level about the COVID-19 coronavirus outbreak, we have decided, after much consideration, to cancel the following events:

Note that the Apache Roadshow/Seattle, scheduled for 10-12 June 2020, has been postponed.

The safety of our event attendees, speakers, sponsors, and staff is of the utmost importance. We are committed to minimizing our global community’s potential health risk, exposure to border health inspections, and increased travel restrictions.

Event organizers will be in contact with delegates regarding further updates.

= = =

UPDATES --9 March: added Chicago Roadshow to cancellation list; added postponement notice for Seattle Roadshow.

For the latest developments, follow @ApacheCon on Twitter and ASF Events on LinkedIn.

Wednesday March 04, 2020

The Apache Software Foundation Operations Summary: November 2019 - January 2020

FOUNDATION OPERATIONS SUMMARY

Third Quarter, Fiscal Year 2020 (November 2019 - January 2020)

"The Foundation's unique approach has created many industry standards and will likely continue to do so for many more years. Apache projects are famous not just for great technology, but for their longevity and vendor-independence."
Doug Cutting, ASF Member and Chief Architect at Cloudera (ASF Platinum Sponsor)


> Conferences and Events http://apachecon.com/

Q3 was fairly quiet for Conferences. We did not hold any major events during this period, but were busy with early planning for several upcoming events.

ApacheCon North America 2020 will be held in New Orleans in September https://www.apachecon.com/acna2020/

We will be holding several Apache Roadshows in the coming months:

Sponsorship opportunities and speaking opportunities are available for all of these events.

> Community Development http://community.apache.org/

One of the key themes this quarter was the discussion of how to encourage local ASF participation by establishing Apache Local Communities (ALC). The ALC comprises local groups of Apache enthusiasts, called ALC Chapters, that are responsible for organising local Apache-related events. To create the necessary oversight for these groups, we have agreed a set of governance processes covering how they are formed, roles and responsibilities, how events are to be organised, and how to dissolve a group if it is no longer active.

We have received requests to establish ALC Chapters in Beijing, Warsaw, and Budapest, and these are currently under consideration. Our existing active ALC Chapter in Indore ran an event on Open Source and ASF Awareness for school students.

We have applied on behalf of the ASF to be a GSoC mentoring organisation for 2020 and are waiting for the response. In preparation we have set up a wiki page to collect GSoC ideas from our Apache project communities.

During January we prepared for participation in FOSDEM, as we were once again allocated a booth at the event. Volunteers from many of our projects signed up to spend time at the booth or to make themselves available to talk to attendees. As usual, Community Development co-ordinated the booth and managed the giveaways for the event.

As well as ApacheCon and the Apache Roadshows planned for 2020, we are continuing to actively support any third party events that we can.

Despite the holiday season our mailing list traffic has increased slightly this quarter.

> Committers and Contributions http://apache.org/licenses/contributor-agreements.html

Over the past quarter, 1,581 contributors committed 42,338 changes that amount to 14,073,594 lines of code across Apache projects. The top 5 contributors, in order, were: Tilman Hausherr (1,010 commits), Andrea Cosentino (788 commits), Mark Robert Miller (771 commits), Mark Thomas (681 commits), and Jean-Baptiste Onofré (616 commits).

All individuals who are granted write access to the Apache repositories must submit an Individual Contributor License Agreement (ICLA). Corporations that have assigned employees to work on Apache projects as part of an employment agreement may sign a Corporate CLA (CCLA) for contributing intellectual property via the corporation. Individuals or corporations donating a body of existing software or documentation to one of the Apache projects need to execute a formal Software Grant Agreement (SGA) with the ASF.

During Q3 FY2020, the ASF Secretary processed 187 ICLAs, 6 CCLAs, and 6 Software Grants. History of Apache committer growth can be seen at https://projects.apache.org/timelines.html

> Brand Management http://apache.org/foundation/marks/

Operations —the work of the Brand Management team falls broadly into one of four categories:

- providing advice to projects

- granting permission to use our marks

- trademark transfers and registrations

- addressing potential infringements of our marks

The volume of work this quarter has again increased significantly compared to the previous quarter. This has mostly been driven by starting work on a number of draft policies that aim to clarify the rules around various uses of Apache marks.

The topics covered in the advice provided to projects this quarter included setting up an external package registry, podling naming, community managed sites, registration of marks, 'official' social media accounts, assignment of marks, name changes, event sponsorship and linking to external support services.

This quarter has seen requests to use Apache marks for marketing material, events, books, scientific papers, Websites, and t-shirts, with nearly all requests being granted, subject to our Trademark Usage Policy. The few requests that are not granted often relate to using a derivative of our logos --something we do not permit.

This quarter a number of the event approval discussions resulted in changes to the proposed event dates to avoid clashes with other planned ASF events.

Registrations —the registration of APACHE in the US completed this quarter.

A number of registrations came up for renewal this quarter. We review each renewal as it comes up and, as a result, opted not to renew some of those registrations. The remaining renewals are now in progress.

We also started a small number of new registrations this quarter.

Infringements —potential infringements are brought to our attention from both internal and external sources. The majority of infringements we see are accidental and our project communities are able to resolve these quickly and informally with occasional input from the Brand Management team. A small number of issues take longer to resolve. We made progress on some of these this quarter and hope that that progress will continue next quarter.

We continue to work to resolve the significant infringement mentioned in the last quarterly report. Alongside that, projects have resolved a number of minor issues during this quarter.

And finally…

The Brand Management team welcomes your comments and suggestions as well as any questions you might have. Please see https://www.apache.org/foundation/marks/contact for our contact details.

> Security http://apache.org/security/

We continued to work on handling incoming security issues, keeping projects reminded of their outstanding issues, allocation of CVE names, and other general oversight and advice.

For Q3 we tracked 94 new vulnerability reports across 46 projects. (Q3 last year for comparison was 88 reports). Those reports led to 37 published CVE vulnerabilities.

We published metrics for the whole of 2019 including discussion of high severity issues in a report https://s.apache.org/security2019 


> Privacy http://apache.org/foundation/policies/privacy.html

The board has rekindled the privacy effort. Currently we're working on three parallel tracks: developing a general policy from which we can derive day-to-day implementations and operating procedures; capturing and collecting the areas where we know we've historically dropped balls; and dealing with the day-to-day operational aspects (such as requests). The complexity is that on the one hand we have the purpose of the Apache Software Foundation: allowing a community to develop code for the common good, with all that that entails (such as a healthy, transparent community built on trust). On the other hand we have the rights and worries of both those in our community and our end users, whose privacy we would like to protect as well as we can. The two can collide; for example, for a software grant or matters having to do with finance, we need to keep a fair amount of personally identifiable information on file. At the same time, we want to protect the privacy of our community; yet for the health of our community a certain level of transparency is needed, as it is for some governance processes (e.g. those where developers approve a release as an official release of the Foundation). For the next two quarters the focus will likely shift to developing SOPs for day-to-day implementation (and automation) and hunting down where we have 'needless' data.


> Infrastructure http://apache.org/dev/infrastructure.html

This quarter has been relatively quiet for the Infrastructure team, given the holidays and New Year.

Our biggest highlight was hiring Andrew Wetmore as a Technical Writer and Editor, to bring his experience to our set of web pages, wiki content, and assorted documentation. For twenty years, the Foundation has organically written a large number of words. Andrew will corral this set of content into a coherent whole, with two goals in mind: to assist our development community with information about Infrastructure and its services, and to provide better guidance to users and new community members.

Continuing with a reflection of our history, we have decades of email archives. These have been provided on mail-archives.apache.org to the public. This quarter, we finally announced the decommission of our old archive system, in favor of the lists.apache.org service. The archive will be turned off some time during the next quarter, with redirects left in place to handle the myriad of links established over time.

For many years, the Foundation has been investing in CI/CD (Continuous Integration / Continuous Delivery), primarily through our Jenkins installation, but also through integrations with third-party services. We have begun testing new Jenkins-based tooling to improve our management of clusters of nodes for assignment and use by our projects.

Our hope is this will help us continue to scale with the increasing demands of the Apache communities.

> Fundraising http://apache.org/foundation/sponsorship.html

Fundraising is pleased to report another successful quarter of smooth operations. Renewals and business-as-usual work have been executed as planned. We've had a "typical" flow of new Sponsors and returning Sponsors, with a few exciting Sponsor "upgrades" this quarter. This quarter we also completed our first targeted cash donation to an Apache project (Cordova).

We're pleased to also report further participation and "cross department" collaboration within The ASF. Fundraising support for Events has remained a focus this quarter as we ramp up for the several 2020 events. Additional focus is being placed on documentation, process, repeatability, and ensuring our Event Sponsors have a smooth experience all around. TAC and Fundraising are also collaborating more to encourage Event participation via Targeted Sponsorships -- more to come!

Process-wise, we continue improving the internals of the Fundraising mechanics to ensure smooth operation as well as improved documentation. We've recently adopted an improved procedure for meeting minutes and action items to further ensure nothing falls through the cracks.

Our planned outreach activities are all on track for Sponsors and we remain responsive to changes in organizational structures as our contacts enter and depart roles. We enjoyed meeting several of our Sponsors at COSCon in Shanghai in early November. Finally, we also updated our link policy for the "thanks page" to comply with popular webmaster recommendations by adding rel="sponsored" tags to new links and upon Sponsor renewals.

We are delighted to share the results of a very successful individual giving campaign that ran from late November through the end of calendar year 2019. The proceeds of the campaign were $14,240 in total, which represents a 222% increase over previous years! The donations comprised 112 individual donations and 3 corporate gifts. We truly felt the love, as some donations included heartfelt notes of thanks and encouragement for our mission.

Thank you to all our Sponsors --

  • PLATINUM: Amazon Web Services, Cloudera, Comcast, Facebook, Google, LeaseWeb, Microsoft, Pineapple Fund, Verizon Media, Tencent
  • GOLD: Anonymous, ARM, Bloomberg, Handshake, Huawei, IBM, Indeed, Union Investment, Workday
  • SILVER: Aetna, Alibaba Cloud Computing, Baidu, Budget Direct, Capital One, Cerner, Inspur, ODPi, Private Internet Access, Red Hat, Target
  • BRONZE: Airport Rentals, The Blog Starter, Bookmakers, Cash Store, Bestecasinobonussen.nl, CarGurus, Casino2k, Cloudsoft, The Economic Secretariat, Emerio, Footprints Recruiting, Gundry MD, HostChecka.com, Host Advice, HostingAdvice.com, Journal Review, LeoVegas Indian Online Casino, Mutuo Kredit AG, Online Holland Casino, ProPrivacy, PureVPN, RX-M, SCAMS.info, Site Builder Report, Start a Blog by Ryan Robinson, Talend, The Best VPN, Top10VPN, Twitter, Web Hosting Secret Revealed, Xplenty
  • TARGETED PLATINUM: CloudBees, DLA Piper, JetBrains, Microsoft, OSU Open Source Labs, Sonatype, Verizon Media
  • TARGETED GOLD: Atlassian, The CrytpoFund, Datadog, PhoenixNAP, Quenda
  • TARGETED SILVER: Amazon Web Services, HotWax Systems, Rackspace
  • TARGETED BRONZE: Bintray, Education Networks of America, Google, Hopsie, No-IP, PagerDuty, Peregrine Computer Consultants Corporation, Sonic.net, SURFnet, Virtru

To sponsor The Apache Software Foundation, visit http://apache.org/foundation/sponsorship.html . To make a one-time or monthly recurring donation, please visit https://donate.apache.org/

= = =

Report prepared by Sally Khudairi, Vice President Marketing & Publicity, with contributions by Rich Bowen, Vice President Conferences; Mark Cox, Vice President Security; Sharan Foga, Vice President Community Development; Myrle Krantz, Treasurer; David Nalley, Vice President Infrastructure; Tom Pappas, Vice President Finance; Daniel Ruggeri, Vice President Fundraising; Greg Stein, ASF Infrastructure Administrator; Mark Thomas, Vice President Brand Management; and Dirk-Willem van Gulik, Vice President Data Privacy.

For more information, subscribe to the announce@apache.org mailing list and visit http://www.apache.org/, the ASF Blog at http://blogs.apache.org/, the @TheASF on Twitter, and https://www.linkedin.com/company/the-apache-software-foundation.

(c) The Apache Software Foundation 2020.

# # #

Tuesday March 03, 2020

The Apache Software Foundation Announces Apache® Brooklyn™ v1.0

Advanced Open Source framework for modelling, monitoring, and managing applications used by global systems integrators, Cloud software and service providers, and major enterprises across financial services, supply chain management, and more.

Wakefield, MA —3 March 2020— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, today announced Apache® Brooklyn™ v1.0, the latest version of the Open Source framework for modelling, monitoring, and managing applications.

"I am excited to see the 1.0 release of Apache Brooklyn," said Geoff Macartney, Vice President of Apache Brooklyn. "This reflects the maturity and stability that Brooklyn has reached after nearly five years as a Top-Level Apache project."

Apache Brooklyn provides a single tool that includes a REST API and GUI for:

  • managing provisioning and application deployment;
  • monitoring an application’s health and metrics;
  • understanding the dependencies between components; and 
  • applying complex policies to manage the application.

Apache Brooklyn uses declarative YAML blueprints to describe an application and all its components. Blueprints can be treated as an integral part of the application, and as modular components that can be composed and reused in many ways. Brooklyn blueprints incorporate policies that actively manage a deployed application by reacting to sensor data such as application health or load, and take actions such as replacing nodes or growing a cluster. Brooklyn’s design is influenced by Autonomic computing and promise theory and implements the OASIS CAMP and TOSCA standards.
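
To make the blueprint idea concrete, here is a hedged sketch of how one might submit a small YAML blueprint to a running Brooklyn server from Python over its REST API. The host, credentials, entity type, endpoint path, and response handling are illustrative assumptions rather than details from this announcement; consult the Apache Brooklyn documentation for the authoritative API.

    # Hypothetical sketch: submitting a declarative YAML blueprint to Apache Brooklyn.
    # Host, credentials, entity type, endpoint, and response shape are assumptions
    # for illustration only; see the Brooklyn docs for the real REST API details.
    # Requires the third-party 'requests' library (pip install requests).
    import textwrap
    import requests

    BROOKLYN_URL = "http://localhost:8081"   # assumed local Brooklyn server
    AUTH = ("admin", "password")             # placeholder credentials

    # A minimal blueprint: one service deployed to a placeholder location.
    BLUEPRINT = textwrap.dedent("""\
        name: demo-webapp
        location: localhost
        services:
          - type: org.apache.brooklyn.entity.webapp.tomcat.TomcatServer
        """)

    def deploy(blueprint):
        # POST the YAML plan; Brooklyn responds with details of the provisioning task.
        resp = requests.post(
            BROOKLYN_URL + "/v1/applications",   # assumed endpoint path
            data=blueprint,
            headers={"Content-Type": "text/yaml"},
            auth=AUTH,
        )
        resp.raise_for_status()
        return resp.json()   # task/application summary (shape assumed)

    if __name__ == "__main__":
        print(deploy(BLUEPRINT))

The same blueprint could equally be pasted into the Blueprint Composer or deployed via the CLI; the point is that the application description is a declarative document that Brooklyn's policies then manage over its lifetime.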

Apache Brooklyn 1.0 highlights include:

  • Support for public and private clouds, available out-of-the-box thanks to integrated Apache jclouds, as well as private infrastructure
  • A modern, user-friendly, web-based UI including the drag-and-drop Blueprint Composer
  • REST API and CLI tools, suitable for power users, automation and scripting
  • A stable blueprint language and API
  • “Batteries included” entities and policies covering clusters, auto-scaling, replacing unhealthy components, and more

"Apache Brooklyn has been in use for some time in production environments," said Richard Downer, Apache Brooklyn 1.0 release manager. "I’m delighted we can now announce our 1.0 release. Everyone should feel confident building on and deploying Apache Brooklyn 1.0 and know that the Brooklyn Project Management Committee has prioritised the long-term stability of Brooklyn."

Apache Brooklyn is in use by global systems integrators, providers of Cloud software and services, as well as mission-critical applications for major enterprises in financial services, supply chain management, and more.

"We are delighted to see Apache Brooklyn reach this milestone," said David Cairns, CTO for innovation at Fujitsu Digital Technology Services. "Apache Brooklyn powers Fujitsu AIOps solutions with policy-based autonomics to detect service deterioration or outage and can automatically re-locate Cloud applications and services from one cloud provider to another to elevate resilience and uptime." 

"Reaching v1.0 reflects the maturity of Apache Brooklyn and we appreciate the community’s effort," said Ross Gray, CEO at Cloudsoft. "Cloudsoft AMP is built on Apache Brooklyn and helps customers eliminate manual processes, cut effort by 75%, and reduce infrastructure spend by as much as 66%."

Apache Brooklyn blueprints for many well-known applications and tools, including ElasticSearch, clustered MySQL, and DNS management, as well as Apache projects such as Cassandra, CouchDB, Kafka, Solr, Storm, ZooKeeper and more, are all freely available under the Apache License v2. The Apache Brooklyn community warmly welcomes new code, testing, blueprints, documentation, presentations, and other contributions.

"Brooklyn is a powerful tool for unified modelling, deployment and lifetime management of applications," added Macartney. "This latest release is a great opportunity for a wider audience to try Brooklyn for themselves and find out how it can help them create and manage their applications, be it in the Cloud, on-premise, or in a hybrid environment. We look forward to growing our community as people discover all that Brooklyn can do."

Availability and Oversight
Apache Brooklyn software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Brooklyn, visit https://brooklyn.apache.org/ and https://twitter.com/ApacheBrooklyn

About The Apache Software Foundation (ASF)
Established in 1999, The Apache Software Foundation (ASF) is the world’s largest Open Source foundation, stewarding 200M+ lines of code and providing more than $20B+ worth of software to the public at 100% no cost. The ASF’s all-volunteer community grew from 21 original founders overseeing the Apache HTTP Server to 765 individual Members and 206 Project Management Committees who successfully lead 350+ Apache projects and initiatives in collaboration with 7,600 Committers through the ASF’s meritocratic process known as "The Apache Way". Apache software is integral to nearly every end user computing device, from laptops to tablets to mobile devices across enterprises and mission-critical applications. Apache projects power most of the Internet, manage exabytes of data, execute teraflops of operations, and store billions of objects in virtually every industry. The commercially-friendly and permissive Apache License v2 has become an industry standard within the Open Source world, helping launch billion dollar corporations and benefiting countless users worldwide. The ASF is a US 501(c)(3) not-for-profit charitable organization funded by individual donations and corporate sponsors including Aetna, Alibaba Cloud Computing, Anonymous, ARM, Baidu, Bloomberg, Budget Direct, Capital One, CarGurus, Cerner, Cloudera, Comcast, Facebook, Google, Handshake, Huawei, IBM, Indeed, Inspur, Leaseweb, Microsoft, ODPi, Pineapple Fund, Pivotal, Private Internet Access, Red Hat, Target, Tencent, Union Investment, Workday, and Verizon Media. For more information, visit http://apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "Brooklyn", "Apache Brooklyn", "Cassandra", "Apache Cassandra", "CouchDB", "Apache CouchDB", "jclouds", "Apache jclouds", "Kafka", "Apache Kafka", "Solr", “Apache Solr", "Storm", “Apache Storm", "ZooKeeper", and "Apache ZooKeeper" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Thursday February 27, 2020

The Apache Software Foundation Announces 20th Anniversary of Apache® Subversion®

Community-led Version Control Software and Source Code Management Tool Available on Most Integration Servers, Integrated Development Environments, Issue Tracking Systems, and more. 

Wakefield, MA —27 February 2020— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today the 20th Anniversary of Apache® Subversion®, the popular centralized software version control system.

Apache Subversion ("SVN") allows users to commit code, manage changes, and recover previous versions of all sorts of data across files and directories. Subversion is ideal for distributed teams who need to easily audit and act on modification logs and versioning history across projects. Subversion originated at CollabNet in 2000 as an effort to create an Open Source version-control system similar to the then-standard CVS (Concurrent Versions System) but with additional features and functionality. Subversion was submitted to the Apache Incubator In November 2009, and became an Apache Top-Level Project in February 2010.

"We are very proud of Subversion's long history, and remain committed to our mission statement," said Stefan Sperling, Vice President of Apache Subversion. "Subversion has moved well beyond its initial goal of creating a compelling replacement for CVS. In 2010 our mission statement was updated to ‘Enterprise-class centralized version control for the masses’.”

Over its 20-year history, Subversion has grown to become the most popular version control system on the market, and remains the leading centralized versioning and revision control software today. Millions of users worldwide depend on the collaboration-friendly system to easily access all files and historical data simultaneously without code conflicts or corruption. Subversion accommodates a wide variety of integrated development environments (IDEs), and is well-suited for large projects. 

Apache Subversion has been broadly adopted for mission-critical code distribution and collaboration workflow by Adobe Dreamweaver, Eclipse, Google, Halliburton, Microsoft Visual Studio, Python, Ruby, Skype, SourceForge, and WordPress, among many organizations and development communities. The ASF uses Apache Subversion in its own infrastructure, housing millions of lines of code in more than 1.8 million commits across 300 Apache Top-Level Projects and sub-projects.

"One of the best decisions of my life was emailing up Karl (Fogel) to see if he was interested in moving the Open Source community beyond CVS," said Brian Behlendorf, co-founder of CollabNet and co-founder of The Apache Software Foundation. "Essential to Subversion's success was the core team of Karl, Ben (Collins-Sussman), and Mike (Pilato) working publicly, spending the difficult time on design docs and helping newbies up the learning curve, with the goal of building as a community what three people (even the best) alone could not do. 20 years later I'm not surprised to see it continuing to innovate, to add features, to fix bugs, and to push the envelope forward. Git still needs competition :) But it's also the best example, and essential example, for why community matters more than code. It's the Subversion community that made it successful, that made the code continuously better, that left no CVS user behind, and that did so with the technical precision and super-human decency all other projects should aspire to."

"Twenty years later, Subversion is no longer the upstart -- it is mature software, and still going strong," said Karl Fogel, original founding developer of Subversion, and Partner at Open Tech Strategies. "Subversion continues to be widely used, especially in enterprise settings, because of its reliability, the simplicity of its conceptual model, its ability to handle large files, and features like path-based access control and optional file-locking. In situations where Subversion's centralized model is the right tool for the job, it really shines: we use it for our entire internal corporate tree, for example, because the path-based authorization is crucial. To get some other viewpoints on where Subversion has come over 20 years, I took a walk through the main project's support forums and the forums of TortoiseSVN, the popular open source SVN client application for Windows. I was delighted by what I saw: a diversity of uses and users, fast and helpful responses, and a focus on practical needs. Starting two decades ago, Subversion helped bring version control beyond developers to a wider audience, and it continues to do that today."

"Today we've got a plethora of fast, reliable, and efficient version control systems, but twenty years ago we had exactly zero: CVS was the only widely used version control system and it still failed in unpredictable ways (including bitrot that was undetectable until you tried to check out old code)," said Brian Fitzpatrick, one of Subversion’s earlier developers. "Even though most people use Git today in the Open Source world, Subversion was the catalyst that allowed folks to move from CVS to Git and so many other modern day version control systems. While the core team wrote a great deal of Subversion's code, we also spent a great deal of time communicating outside of our office in Chicago in an effort to build a larger Subversion community--an effort that eventually paid off more than tenfold."

"When we gathered in my basement in early 2000, thinking about what paths Subversion should follow, none of us imagined what would be accomplished over the next twenty years," said Greg Stein, an early developer of Subversion, and former Vice President of Apache Subversion. "We focused on improving the experience of CVS users and administrators. We overshot our own expectations within just a few years, creating a system that millions have found worthy. From our humble beginnings, I couldn't be more proud of what the community has accomplished."

"Technology is at its best when it brings people together," said Matt Mullenweg, Founder and Lead Developer at the WordPress Foundation. "SVN has brought countless people together over the years and I wish it much continued success."

"Reliable and powerful version management is essential for our product development. Today, more than 100 of our employees regularly use Apache Subversion with several million lines of source code in our Subversion repository," said Roland Wagner, Head of Product Marketing at CODESYS Group. "Our success with Subversion convinced us to become the first company to develop a connected product for the area of industrial automation with the launch of CODESYS SVN. Many of the over 100,000 CODESYS users worldwide work with CODESYS SVN whichsignificantly simplifies the development of their industrial IEC 61131-3 application software, when realizing automation projects for factories and plants, mobile machines, buildings and energy systems. We thank and congratulate the Subversion community on its 20th anniversary!"

"After 20 years, Apache Subversion continues to deliver on our goal with a stable and portable version control system that powers software projects of all sizes being developed on any of the popular operating system platforms," added Sperling. "Apache Subversion repositories store valuable mission-critical assets of companies and organizations across the globe. Subversion remains an essential source code management tool for developers at every level --we welcome their participation on our lists and community."

Availability and Oversight
Apache Subversion software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Subversion, visit http://subversion.apache.org/

About The Apache Software Foundation (ASF)
Established in 1999, The Apache Software Foundation is the world’s largest Open Source foundation, stewarding 200M+ lines of code and providing more than $20B+ worth of software to the public at 100% no cost. The ASF’s all-volunteer community grew from 21 original founders overseeing the Apache HTTP Server to 765 individual Members and 206 Project Management Committees who successfully lead 350+ Apache projects and initiatives in collaboration with 7,200 Committers through the ASF’s meritocratic process known as "The Apache Way". Apache software is integral to nearly every end user computing device, from laptops to tablets to mobile devices across enterprises and mission-critical applications. Apache projects power most of the Internet, manage exabytes of data, execute teraflops of operations, and store billions of objects in virtually every industry. The commercially-friendly and permissive Apache License v2 has become an industry standard within the Open Source world, helping launch billion dollar corporations and benefiting countless users worldwide. The ASF is a US 501(c)(3) not-for-profit charitable organization funded by individual donations and corporate sponsors including Aetna, Alibaba Cloud Computing, Anonymous, ARM, Baidu, Bloomberg, Budget Direct, Capital One, CarGurus, Cerner, Cloudera, Comcast, Facebook, Google, Handshake, Huawei, IBM, Indeed, Inspur, Leaseweb, Microsoft, ODPi, Pineapple Fund, Pivotal, Private Internet Access, Red Hat, Target, Tencent, Union Investment, Workday, and Verizon Media. For more information, visit http://apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "Subversion", "Apache Subversion", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #
