The Apache Software Foundation Blog

Tuesday Jan 27, 2015

The Apache Software Foundation Announces Apache™ Samza™ as a Top-Level Project

Open Source Big Data distributed stream processing framework used in business intelligence, financial services, healthcare, mobile applications, security, and software development, among other industries.

Forest Hill, MD –27 January 2015– The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Apache™ Samza™ has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles.

"The incubation process at Apache has been great. It has helped us cultivate a strong community, and provided us with the support and infrastructure to make Samza grow," said Chris Riccomini, Vice President of Apache Samza.

Apache Samza is a distributed stream processing framework, designed to handle fault tolerance, stateful processing, message durability, and scalability. Samza helps users to write light-weight processors that consume streams of data from messaging systems such as Apache Kafka. These processors empower organizations to understand and react to their data in real-time. In addition, Samza uses Apache Hadoop YARN to provide fault tolerance, processor isolation, security, and resource management.

Samza represents a different approach to stream processing. It has been purpose-built first and foremost as a production-grade system with operability and scalability in mind. Samza integrates tightly with Apache Kafka, which makes it a natural fit for those already running Kafka in their data pipeline. The framework also introduces the concept of stateful processing and aggregation as a first-class feature. Stateful processing gives Samza developers a completely new paradigm for aggregating stream data. These features help organizations do high performance stream processing at scale.
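
To make the idea of stateful processing more concrete, the following is a minimal Python sketch of a processor that keeps a running count per key while consuming a stream one message at a time. It does not use Samza's API: the `PageViewCounter` class, its `process` method, the `run` helper, and the in-memory dictionary standing in for a durable local store are all illustrative assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


class PageViewCounter:
    """Hypothetical stateful processor: counts messages per key."""

    def __init__(self) -> None:
        # Stand-in for a local state store. A real framework would make
        # this durable so that it can be restored after a failure.
        self.state: Dict[str, int] = defaultdict(int)

    def process(self, key: str, value: str) -> Tuple[str, int]:
        """Handle one message and return the updated aggregate for its key."""
        self.state[key] += 1
        return key, self.state[key]


def run(processor: PageViewCounter,
        stream: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Feed every (key, value) message to the processor, in order."""
    for key, value in stream:
        processor.process(key, value)
    return dict(processor.state)


if __name__ == "__main__":
    # Made-up messages: (user, page visited).
    messages = [("alice", "/home"), ("bob", "/home"), ("alice", "/jobs")]
    print(run(PageViewCounter(), messages))
```

The point of the sketch is only that the aggregate lives next to the processor and is updated per message, rather than being recomputed from scratch or fetched from a remote database on every event.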

Created to process tracking data, service log data, and for data ingestion pipelines for realtime services, Samza originated at LinkedIn, and was submitted to the Apache Incubator in July 2013. 

"LinkedIn is thrilled to see Apache Samza experience such strong adoption and now graduate to a Top-Level Project. Samza was developed to help solve some of LinkedIn's  toughest stream processing challenges and has become a central piece of our infrastructure," said Kevin Scott, Senior Vice President of Engineering and Operations at LinkedIn.

Apache Samza is used in an array of industries, applications, and organizations, including:
  • DoubleDutch, developers of mobile apps for events and conferences, uses Samza to power their analytics platform and stream data live into an event dashboard for real-time insights;
  • Fortscale's Big Data security analytics solutions use Samza to process security event logs as part of the data ingestion pipelines and online machine learning model creation process;
  • Happy Pancake, Northern Europe's largest internet dating service, uses Samza for all event handlers and data replication;
  • Advertising technology provider Improve Digital uses Samza as the foundation of a realtime processing capability performing data analytics and as the basis for an alerting system;
  • Jack Henry & Associates uses Samza to process user activity data across its Banno suite of products for financial institutions;
  • MobileAware uses Samza as a foundation for two mobile network products: real time analytics and multi channel notification (push, text message and HTML5);
  • Technology startup Project Florida uses Samza for real-time monitoring of data streams from wearable sensors, for preventative healthcare purposes;
  • Quantiply, providers of Cloud-based micro-applications, uses Samza to bring together user event, system performance, and business operational data for real-time visibility and decision support; and
  • Social media business intelligence solution VinTank uses Samza to power their analysis and natural language processing (NLP) pipeline.


"We've had great experiences with Samza at Improve Digital where it has enabled us to  build out our streaming data platform," said Garry Turkington, CTO of Improve Digital. "It's fantastic to see it graduate to a top-level project."

Jay Kreps, CEO of Confluent, said "Samza is a fantastic piece of infrastructure, and a great complement to Apache Kafka. We at Confluent are really excited to see it added as a top-level Apache project."

"Fortscale has been using Apache Samza successfully to build online machine learning algorithms and detect insider threats," said Dotan Patrich, Software Architect at Fortscale. "It's been a great experience building large scale streaming solution and using Samza's and enjoying it's unique state management architecture. It's fantastic to see it graduate to a Top-Level Project."

"I've been involved in Apache Samza's community since its inception. It's been thrilling to watch the community grow, and I'm very proud and excited to see that the project is graduating. Samza has a bright future, and I'm looking forward to what's to come," added Riccomini.

Availability and Oversight
As with all Apache products, Apache Samza software is released under the Apache License v2.0, and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For documentation and ways to become involved with Apache Samza, visit http://samza.apache.org/ and @SamzaStream on Twitter.

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 500 individual Members and 4,500 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Budget Direct, Cerner, Citrix, Cloudera, Comcast, Facebook, Google, Hortonworks, HP, Huawei, IBM, InMotion Hosting, iSigma, Matt Mullenweg, Microsoft, Pivotal, Produban, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ or follow https://twitter.com/TheASF.

© The Apache Software Foundation. "Apache", "Apache Samza", "Samza", "Apache Hadoop", "Hadoop", "Hadoop YARN", "Apache Kafka", "Kafka", "ApacheCon", and the Apache Samza logo are trademarks of The Apache Software Foundation. All other brands and trademarks are the property of their respective owners.

# # #

The Apache Software Foundation Announces Apache™ BookKeeper™ as a Top-Level Project

Open Source distributed Big Data logging service and publish/subscribe system used to reliably log streams of records

Forest Hill, MD –27 January 2015– The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Apache™ BookKeeper™ has graduated to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles.

Apache BookKeeper was established in 2011 as a sub-project of Apache ZooKeeper™ (Open Source API for highly reliable distributed coordination) to reliably log streams of records. It serves as a building block for reliable system consistency and recovery, and can be used to turn any standalone service into a highly available replicated service.

With disk/server failure rates up to 10% annually, replication is a must in today's always-on Cloud and Big Data services. One way to build a replicated service is to ensure that all write operations to the service are copied to all replicas; Apache BookKeeper's replicated logging service is well suited for this purpose. A database may have two replicas to ensure availability: if one crashes, the other can continue to serve traffic. However, ensuring that the data in these two replicas is consistent is not an easy problem to solve. Unlike naive solutions that run into problems like deadlock and inconsistency when one or both of the replicas fail, BookKeeper uses a combination of quorum writes, fencing, and, when necessary, outsourcing of consensus to ZooKeeper to ensure no state will be lost in the case of a replica failure. BookKeeper can similarly be applied to different classes of systems, such as messaging systems, filesystems and transaction processing systems.
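
As a rough illustration of the quorum-write and fencing ideas mentioned above, here is a toy in-memory Python model. It is not BookKeeper's client API or wire protocol: the `Replica` class, the `quorum_append` and `fence` functions, and the `ack_quorum` parameter are hypothetical names chosen for this sketch, and the coordination that the real system delegates to ZooKeeper is left out entirely.

```python
from typing import List


class FencedError(Exception):
    """Raised when a replica has been fenced and refuses new writes."""


class Replica:
    """Hypothetical storage node holding one copy of the log."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.entries: List[bytes] = []
        self.fenced = False
        self.alive = True

    def append(self, entry: bytes) -> None:
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        if self.fenced:
            raise FencedError(f"{self.name} is fenced")
        self.entries.append(entry)


def quorum_append(replicas: List[Replica], entry: bytes, ack_quorum: int) -> bool:
    """Send the entry to every replica; succeed if enough of them acknowledge."""
    acks = 0
    for replica in replicas:
        try:
            replica.append(entry)
            acks += 1
        except (ConnectionError, FencedError):
            continue  # a failed or fenced replica simply does not acknowledge
    return acks >= ack_quorum


def fence(replicas: List[Replica]) -> None:
    """Mark reachable replicas as fenced so a previous writer can no longer
    gather a quorum; a process taking over the log would do this first."""
    for replica in replicas:
        if replica.alive:
            replica.fenced = True


if __name__ == "__main__":
    nodes = [Replica("r1"), Replica("r2"), Replica("r3")]
    nodes[2].alive = False                       # one replica fails
    print(quorum_append(nodes, b"write-1", 2))   # r1 and r2 still acknowledge
    fence(nodes)                                 # a new owner takes over
    print(quorum_append(nodes, b"write-2", 2))   # the old writer gets no acks
```

In this model a write survives the loss of one replica because two copies still acknowledge it, while fencing keeps a writer that has been replaced from quietly adding entries that the new owner never sees.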

Apache BookKeeper is highly available (no single point of failure), and scales horizontally as more storage nodes are added. BookKeeper is used in production at many web-scale companies. At Yahoo, it is used as the persistence layer for its Cloud messaging infrastructure, which delivers tens of billions of messages a day. BookKeeper is used at Twitter as the replicated persistence backend for different messaging use cases, and is also used by Huawei as shared storage in their solution for HDFS NameNode High Availability.

"We're very proud to have BookKeeper become a Top-Level Project. It is a testament to the hard work that my fellow committers have put in over the years that the ASF would give us their stamp of approval," said Ivan Kelly, Vice President of Apache BookKeeper. "We hope that the increased exposure will bring even more contributions and use cases to the community."

Availability and Oversight
As with all Apache products, Apache BookKeeper software is released under the Apache License v2.0, and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For documentation and ways to become involved with Apache BookKeeper, visit http://bookkeeper.apache.org and https://twitter.com/asfbookkeeper.

© The Apache Software Foundation. "Apache", "Apache BookKeeper", "BookKeeper", "ApacheCon", and the Apache BookKeeper logo are trademarks of The Apache Software Foundation. All other brands and trademarks are the property of their respective owners.

# # #

Friday Jan 23, 2015

The Apache News Round-up: week ending 23 January 2015

This week's highlights from The Apache Software Foundation's 350+ projects and initiatives include:

ASF Legal Affairs Committee –responsible for establishing and managing legal policies based on the advice of legal counsel and the interests of the Foundation.
 - The Apache Software Foundation subpoenaed regarding Patent Claim https://blogs.apache.org/foundation/entry/the_apache_software_foundation_subpoenaed1

FINAL CALL for ApacheCon™ –the official conference series of The Apache Software Foundation
 - CFP closes on 1 February http://events.linuxfoundation.org/events/apachecon-north-america/program/cfp
 - Travel Assistance applications close on 6 February http://www.apache.org/travel/

Apache BookKeeper™ –distributed logging service called BookKeeper and a distributed publish/subscribe system built on top of BookKeeper called Hedwig.
 - Apache BookKeeper 4.2.4 released http://bookkeeper.apache.org/releases.html

Apache Directory™ LDAP client API –an ongoing effort to provide an enhanced LDAP API, as a replacement for JNDI and the existing LDAP APIs (jLdap and Mozilla LDAP API).
 - Apache Directory LDAP API 1.0.0-M28 released http://directory.apache.org/api/downloads.html

Apache Falcon™ –data processing and management solution for Apache Hadoop™, designed for data motion, coordination of data pipelines, lifecycle management, and data discovery.
 - The Apache Software Foundation Announces Apache Falcon as a Top-Level Project http://s.apache.org/GT2

Apache Flink™ –a system for distributed batch and real-time streaming data analysis that offers familiar collection-based programming APIs in Java and Scala
 - Apache Flink 0.8.0 released http://flink.apache.org/downloads.html

Apache HttpComponents™ Client for Android –can be deployed on Google Android in parallel to the outdated version shipped with the platform while remaining partially API compatible with Apache HttpClient 4.3.
 - HttpComponents Client for Android 4.3.5.1 released http://hc.apache.org/downloads.cgi

Apache Tomcat™ –Open Source software implementation of the Java Servlet, JavaServer Pages, Java Unified Expression Language and Java WebSocket technologies.
 - Apache Tomcat 8.0.17 available http://tomcat.apache.org/download-80.cgi

Apache Traffic Server™ –fast, scalable and extensible HTTP/1.1 compliant caching proxy server; can be used as a reverse, forward or even transparent HTTP proxy.
 - Apache Traffic Server 5.2.0 released http://trafficserver.apache.org/downloads

Apache Incubator™ –the entry path into The Apache Software Foundation (ASF) for projects and codebases wishing to become part of the Foundation's efforts. All code donations from external organisations and existing external projects wishing to join Apache enter through the Incubator.
 - OpenAz and TinkerPop accepted as new podlings this month http://incubator.apache.org/projects/index.html

Are your software solutions Powered by Apache?
 - Download & use our "Powered By" logos today! http://www.apache.org/foundation/press/kit/#poweredby

= = =

For real-time updates, sign up for Apache-related news at announce@apache.org and follow @TheASF on Twitter. For a broader spectrum from the Apache community, https://twitter.com/PlanetApache provides an aggregate of both Project activities and the personal blogs of select ASF Committers.

# # #

Thursday Jan 22, 2015

The Apache Software Foundation Subpoenaed Regarding Patent Claim

The Apache Software Foundation (ASF) has received a United States International Trade Commission subpoena requiring the production of documents and testimony related to U.S. Patent No. 6,691,302.


The request requires the Foundation to produce the required materials related to the above patent.
Apache will, of course, be complying with all court requirements. As an open development group, the majority of our documents are already publicly available. Pointers to our SVN history and mailing list archives have already been provided.

Please address any discussion to the ASF Legal mailing list.

Monday Jan 19, 2015

The Apache Software Foundation Announces Apache™ Falcon™ as a Top-Level Project

Open Source Big Data processing and management solution for Apache Hadoop™ in use at Hortonworks, InMobi, and Talend, among others.

Forest Hill, MD –19 January 2015– The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Apache™ Falcon™ has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles.

Apache Falcon is a data processing and management solution for Apache Hadoop™, designed for data motion, coordination of data pipelines, lifecycle management, and data discovery. Falcon provides enterprises higher quality and predictable outcomes for their data by enabling end consumers to quickly onboard their data and its associated processing and management tasks on Hadoop clusters. The platform is successfully deployed across various industries, including advertising, healthcare, mobile applications, software solutions, and technology.

"Apache Falcon solves a very important and critical problem in the big data space. Graduation to TLP marks an important step in progression of the project," said Srikanth Sundarrajan, Vice President of Apache Falcon. "Falcon has a robust road map to ease the pain of application developers and administrators alike in authoring and managing complex data management and processing applications."

"Graduation of Apache Falcon's is a proud moment for the community who came together to solve a very relevant problem of data processing and management in Hadoop ecosystem," said Mohit Saxena, CTO and co-founder InMobi, one of the largest users of Apache Falcon. "I also want to applaud the efforts of contributors, committers and user community who actively pitched in the development of Falcon and it is only because of their conviction and efforts project has graduated. I am hoping promotion of Falcon to TLP will increase the contribution and adoption across the community and help Falcon achieve newer heights." 

Falcon represents a significant step forward in the Hadoop platform by enabling easy data management. Users of the Falcon platform simply define infrastructure endpoints, data sets, and processing rules declaratively. These declarative configurations are expressed in such a way that the dependencies between the configured entities are explicitly described. This information about inter-dependencies between entities allows Falcon to orchestrate and manage various data management functions.
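
One way to picture how explicitly declared dependencies enable orchestration is a small Python sketch that orders entities so each one comes after everything it depends on. This is not Falcon's entity format or scheduler: the entity names in `ENTITIES` and the `schedule_order` helper are hypothetical, and the ordering uses Python's standard `graphlib.TopologicalSorter` (Python 3.9 or later) as a stand-in technique.

```python
from graphlib import TopologicalSorter
from typing import Dict, List

# Hypothetical declarative definitions: each entity lists the entities it
# depends on. The names are made up for illustration.
ENTITIES: Dict[str, List[str]] = {
    "primary-cluster": [],
    "raw-clicks-feed": ["primary-cluster"],
    "clean-clicks-feed": ["primary-cluster"],
    "cleanse-process": ["raw-clicks-feed", "clean-clicks-feed"],
}


def schedule_order(entities: Dict[str, List[str]]) -> List[str]:
    """Return an order in which every entity follows its dependencies."""
    return list(TopologicalSorter(entities).static_order())


if __name__ == "__main__":
    for name in schedule_order(ENTITIES):
        print(name)
```

Because the dependencies are data rather than code, a tool can derive a valid order (or detect a cycle, which `TopologicalSorter` reports as a `CycleError`) without the user spelling out the sequence by hand.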

"Falcon has evolved over the last couple of years into a mature data management solution for Apache Hadoop with many production deployments proving it to be very valuable for users to manage their data and associated processing on Hadoop clusters," said Venkatesh Seetharam, Apache Falcon Project Management Committee member. 

"As Hadoop usage patterns have matured, the highest value implementations are based on the data lake concept. Data lakes require prescriptive and reliable pipelines," explained Greg Pavlik, Vice President of Engineering at Hortonworks. "Apache Falcon represents the best and most mature --and therefore essential-- building block for modeling, managing and operating data lakes."

"Falcon has enabled our team to incrementally build up a complex pipeline comprised of over 90 processes and 200 feeds that would have been very challenging with Apache Oozie alone," said programmer Michael Miklavcic.

"I began to work on Falcon in my spare time for fun, but it quickly became interesting in relation to my job at Talend", said Jean-Baptise Onofré, Vice President of Apache Karaf and Software Architect at Talend. "As Talend DataIntegration provides features like CDC (Change Data Capture), and data notification, we are in the process of integrating Apache Falcon in Talend products." 

"Apache Falcon's graduation is a milestone for the project and a credit to its contributors. Its open, collaborative development has effected a robust community around software essential to the Hadoop ecosystem," said Chris Douglas, Falcon incubation mentor at the ASF. "By becoming a Top-Level Project, the ASF recognizes its demonstrated ability to self-govern. Congratulations to Falcon's users, to its contributors, and particularly to its new Project Management Committee on this achievement."

Availability and Oversight
As with all Apache products, Apache Falcon software is released under the Apache License v2.0, and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For documentation and ways to become involved with Apache Falcon, visit http://falcon.apache.org/ and @ApacheFalcon on Twitter.

© The Apache Software Foundation. "Apache", "Apache Falcon", "Falcon", "Apache Hadoop", "Hadoop", "Apache Oozie", "Oozie", "ApacheCon", and the Apache Falcon logo are trademarks of The Apache Software Foundation. All other brands and trademarks are the property of their respective owners.

# # #

Friday Jan 16, 2015

The Apache News Round-up: week ending 16 January 2015

Our 4,000+ Committers have been busily working on a variety of projects this week. Here are the highlights:

Not A Mirage: The Apache Software Foundation's official number of projects and initiatives "grows overnight" with census adjustment
 - https://blogs.apache.org/foundation/entry/not_a_mirage_the_apache

Upcoming deadlines for ApacheCon™ –the official conference series of The Apache Software Foundation
 - Call for Papers closes on 1 February http://events.linuxfoundation.org/events/apachecon-north-america/program/cfp
 - Apache Travel Assistance applications close on 6 February http://www.apache.org/travel/

Apache Commons Validator™ – provides the building blocks for both client side validation and server side data validation. It may be used standalone or with a framework like Struts.
 - Apache Commons Validator 1.4.1 released http://commons.apache.org/proper/commons-validator/download_validator.cgi

Apache Curator™ –Java libraries that make using Apache ZooKeeper much easier and more reliable.
 - Apache Curator 2.7.1 released https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12314425&version=12328938

Apache Flink™ –Open Source distributed Big Data system for expressive, declarative, and efficient batch and streaming data processing and analysis
 - The Apache Software Foundation Announces Apache™ Flink™ as a Top-Level Project http://s.apache.org/YrZ

Apache Jackrabbit™ –scalable, high-performance hierarchical content repository designed for use as the foundation of modern world-class Web sites and other demanding content applications.
 - Apache Jackrabbit Oak 1.1.4 released http://jackrabbit.apache.org/downloads.html

Apache Knox™ –a REST API Gateway for providing secure access to the data and processing resources of Hadoop clusters.
 - Apache Knox Gateway 0.5.1 released http://www.apache.org/dyn/closer.cgi/knox/0.5.1

Apache Qpid™ –implements the latest AMQP specification, the first open standard for enterprise messaging, and provides transaction management, queuing, distribution, security, management, clustering, federation and heterogeneous multi-platform support and a lot more.
 - CVE-2015-0203: Apache Qpid's qpidd can be crashed by authenticated user http://s.apache.org/PCe

Apache Tika™ –an ASFv2 licensed open source tool for extracting information from digital documents.
 - Apache Tika 1.7 released http://www.apache.org/dist/tika/CHANGES-1.7.txt

Apache Incubator™ –the entry path into The Apache Software Foundation (ASF) for projects and codebases wishing to become part of the Foundation's efforts. All code donations from external organisations and existing external projects wishing to join Apache enter through the Incubator.
 - Corinthia and Zeppelin were accepted as new podlings in December http://incubator.apache.org/projects/index.html

Are your software solutions Powered by Apache?
 - Download & use our "Powered By" logos today! http://www.apache.org/foundation/press/kit/#poweredby

= = =

For real-time updates, sign up for Apache-related news at announce@apache.org and follow @TheASF on Twitter. For a broader spectrum from the Apache community, https://twitter.com/PlanetApache provides an aggregate of both Project activities and the personal blogs of select ASF Committers.

# # #

Monday Jan 12, 2015

The Apache Software Foundation Announces Apache™ Flink™ as a Top-Level Project

Open Source distributed Big Data system for expressive, declarative, and efficient batch and streaming data processing and analysis

Forest Hill, MD –12 January 2015– The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Apache™ Flink™ has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles.

Apache Flink is an Open Source distributed data analysis engine for batch and streaming data. It offers programming APIs in Java and Scala, as well as specialized APIs for graph processing, with more libraries in the making.

"I am very happy that the ASF has become the home for Flink," said Stephan Ewen, Vice President of Apache Flink. "For a community-driven effort, I can think of no better umbrella. It is great to see the project is maturing and many new people are joining the community."

Flink uses a unique combination of streaming/pipelining and batch processing techniques to create a platform that covers and unifies a broad set of batch and streaming data analytics use cases. The project has put significant efforts into making a system that runs reliably and fast in a wide variety of scenarios. For that reason, Flink contained its own type serialization, memory management, and cost-based query optimization components from the early days of the project.

Apache Flink has its roots in the Stratosphere research project that started in 2009 at TU Berlin together with the Berlin and later the European data management communities, including HU Berlin, Hasso Plattner Institute, KTH (Stockholm), ELTE (Budapest), and others. Several Flink committers recently started data Artisans, a Berlin-based startup committed to growing Flink both in code and community as 100% Open Source. More than 70 people have by now contributed to Flink.

"Becoming a Top-Level Project in such short time is a great milestone for Flink and reflects the speed with which the community has been growing," said Kostas Tzoumas, co-founder and CEO of data Artisans. "The community is currently working on some exciting new features that make Flink even more powerful and accessible to a wider audience, and several companies around the world are including Flink in their data infrastructure."

"We use Apache Flink as part of our production data infrastructure," said Ijad Madisch, co-founder and CEO of ResearchGate. "We are happy all around and excited that Flink provides us with the opportunity for even better developer productivity and testability, especially for complex data flows. It’s with good reason that Flink is now a top-level Apache project."

"I have been experimenting with Flink, and we are very excited to hear that Flink is becoming a top-level Apache project," said Anders Arpteg, Analytics Machine Learning Manager at Spotify.

Denis Arnaud, Head of Data Science Development of Travel Intelligence at Amadeus said, "At Amadeus, we continually seek for better improvement in our analytic platform and our experiments with Apache Flink for analytics on our travel data show a lot of potential in the system for our production use."

"Flink was a pleasure to mentor as a new Apache project," said Alan Gates, Apache Flink Incubator champion at the ASF, and architect/co-founder at Hortonworks. "The Flink team learned The Apache Way very quickly. They worked hard at being open in their decision making and including new contributors. Those of us mentoring them just needed to point them in the right direction and then let them get to work."

Availability and Oversight
As with all Apache products, Apache Flink software is released under the Apache License v2.0, and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For documentation and ways to become involved with Apache Flink, visit http://flink.apache.org/ and @ApacheFlink on Twitter.

© The Apache Software Foundation. "Apache", "Apache Flink", "Flink", "ApacheCon", and the Apache Flink logo are trademarks of The Apache Software Foundation. All other brands and trademarks are the property of their respective owners.

# # #

Not A Mirage: The Apache Software Foundation's official number of projects and initiatives "grows overnight" with census adjustment

110 sub-projects now added to the Foundation's official activity count

Over the past 15 years, The Apache Software Foundation has been going from strength to strength. It's truly impressive to see all that our global community has achieved. 

In documenting the ASF's developments, I've always measured our growth across two metrics: people and projects.

The people behind Apache form a volunteer community comprising 588 individual Members and 4,166 Committers collaborating across six continents. We're a true 24/7 global operation.

Up until now, I have stated that there were more than 200 projects and initiatives at the ASF. To me, "projects and initiatives" = software projects (Top-Level Projects (TLPs) + podlings undergoing development at the Apache Incubator + Apache Labs (our innovation "sandbox" to test technical concepts)), plus various community initiatives such as ApacheCon. I never counted sub-projects as part of that census, as we always had just a few handfuls scattered amongst a small number of TLPs.

I was stunned when ASF Member Daniel Gruno recently informed me that my "200+" figure was wrong, and provided an updated tally of our initiatives as detailed on http://projects.apache.org

This audit surprised me, as it showed that we did not have a "few handfuls" of sub-projects as I had projected, but rather *110* sub-projects! They had grown into an entity unto themselves that must be recognized in our official count.

As such, the ASF appears to have grown overnight with the addition of sub-projects, although they were there all along. As of today, we have:

- 160 Top-Level Projects
- 110 sub-projects (sub-projects of existing TLPs)
- 36 podlings undergoing development in the Apache Incubator
- 39 initiatives in the Apache Labs

So there are currently 345 Open Source software projects and initiatives at the ASF. Add to that the ASF's special committees and activities such as Infrastructure, Travel Assistance, Security Team, Legal Affairs, Brand Management, and ApacheCon: we've exceeded 350.

With this knowledge, I stand corrected, even more impressed, and have adjusted our records accordingly. Thanks again, Daniel!

Join me in celebrating our amazing community –three cheers for Apache!

--Sally Khudairi, Vice President Marketing & Publicity


Friday Jan 09, 2015

The Apache News Round-up: week ending 9 January 2015

Our community of more than 4,000 contributors is busily working across six continents on all things Apache. Here's what's happened over the past week:

Apache Allura™ –Open Source implementation of a software forge, a Web site that manages source code repositories, bug reports, discussions, wiki pages, blogs, and more for any number of individual projects.
 - Apache Allura 1.2.0 released http://www.apache.org/dyn/closer.cgi/allura/allura-1.2.0.tar.gz

Apache Bloodhound™ –a tool to track progress and defects in software products.
 - Apache Bloodhound 0.8 released http://www.apache.org/dyn/closer.cgi/bloodhound/apache-bloodhound-0.8.tar.gz

Apache CloudStack™ –an integrated Infrastructure-as-a-Service (IaaS) software platform that allows users to build feature-rich public and private cloud environments.
 - Apache CloudStack 4.3.2 released http://docs.cloudstack.apache.org/projects/cloudstack-release-notes/en/4.3.2/

REMINDER: Upcoming deadlines for ApacheCon™ –the official conference series of The Apache Software Foundation
 - Call for Papers open until 1 February http://events.linuxfoundation.org/events/apachecon-north-america/program/cfp
 - Apache Travel Assistance applications accepted through 6 February http://www.apache.org/travel/

= = =

For real-time updates, sign up for Apache-related news at announce@apache.org and follow @TheASF on Twitter. For a broader spectrum from the Apache community, https://twitter.com/PlanetApache provides an aggregate of both Project activities and the personal blogs of select ASF Committers.

# # # 

Friday Jan 02, 2015

The Apache News Round-up: week ending 2 January 2015

The Apache Software Foundation wishes you a great 2015! With our community of more than 4,000 contributors collaborating across six continents on 200+ projects and initiatives, The ASF is truly a 24/7 operation. Here's what's happened over the past week:

Apache Commons™ –reusable Java components.
 - Apache Commons Math 3.4 released http://commons.apache.org/proper/commons-math/download_math.cgi
 - Apache Commons Pool 2.3 released http://commons.apache.org/proper/commons-pool/download_pool.cgi

Apache Directory™ –directory solutions entirely written in Java
 - Apache Directory LDAP API 1.0.0-M27 released http://directory.apache.org/api

Apache Lucene™ –a high-performance, full-featured text search engine library written entirely in Java
 - Apache Lucene 4.10.3 released http://lucene.apache.org/core/mirrors-core-latest-redir.html

Apache Lucy™ –search engine library that provides full-text search for dynamic programming languages
 - Apache Clownfish 0.4.2 and Apache Lucy 0.4.2 released http://lucy.apache.org/download.html

Apache OODT™ –component-based software architecture beyond simple science applications.
 - Apache OODT 0.8 released https://issues.apache.org/jira/browse/OODT/fixforversion/12326811/?selectedTab=com.atlassian.jira.jira-projects-plugin:version-summary-panel

Apache Solr™ –popular, blazing fast Open Source NoSQL search platform from the Apache Lucene project
 - Apache Solr 4.10.3 released http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

ASF Operations –behind the scenes of the day-to-day functions at The Apache Software Foundation
 - The ASF publishes long-overdue New Code of Conduct http://s.apache.org/dGR

ApacheCon™ –the official conference series of The Apache Software Foundation
 - Call for Papers open until 1 February http://events.linuxfoundation.org/events/apachecon-north-america/program/cfp
 - Get involved with the program selection process http://s.apache.org/60N
 - Applications accepted for Apache Travel Assistance through 6 February http://www.apache.org/travel/

= = =

For real-time updates, sign up for Apache-related news at announce@apache.org and follow @TheASF on Twitter. For a broader spectrum from the Apache community, https://twitter.com/PlanetApache provides an aggregate of both Project activities and the personal blogs of select ASF Committers.

# # # 

Friday Dec 26, 2014

The Apache News Round-up: week ending 26 December 2014

The Apache Software Foundation's 200+ projects and initiatives and community of more than 4,000 contributors wish you a very happy and healthy 2015! Here's what we've been working on over the past week:

Apache Commons™ –software library that provides a generic configuration interface, enabling an application to read configuration data from a variety of sources.
 - Commons Configuration 2.0-alpha2 Released http://www.apache.org/dist/commons/configuration/RELEASE-NOTES.txt

Apache DeltaSpike™ –not a CDI-container, but a portable CDI extension.
 - Apache DeltaSpike 1.2.1 released http://s.apache.org/DeltaSpike_1.2.1

Apache Ivy™ –a tool for managing (recording, tracking, resolving and reporting) project dependencies, characterized by flexibility, configurability, and tight integration with Apache Ant.
 - Apache Ivy 2.4.0 released http://ant.apache.org/ivy/download.cgi

Apache Jackrabbit™ –a scalable, high-performance hierarchical content repository designed for use as the foundation of modern world-class Web sites and other demanding content applications.
 - Apache Jackrabbit Oak 1.0.9 released http://jackrabbit.apache.org/downloads.html

Apache POI™ –Java APIs for manipulating various file formats based upon the Office Open XML standards (OOXML) and Microsoft's OLE 2 Compound Document format (OLE2), such as Excel, PowerPoint, Visio and Word.
 - Apache POI 3.11 released http://poi.apache.org/

ASF Operations –behind the scenes of the day-to-day functions at The Apache Software Foundation
 - The ASF publishes long-overdue New Code of Conduct http://s.apache.org/dGR

ApacheCon™ –the official conference series of The Apache Software Foundation
 - Call for Papers open until 1 February http://events.linuxfoundation.org/events/apachecon-north-america/program/cfp
 - Become involved with the program selection process --check out http://s.apache.org/60N
 - Applications accepted for Apache Travel Assistance through 6 February http://www.apache.org/travel/
 - Sign up to receive ApacheCon updates and announcements http://www.apache.org/foundation/mailinglists.html#foundation-apachecon

= = =

For real-time updates, sign up for Apache-related news at announce@apache.org and follow @TheASF on Twitter. For a broader spectrum from the Apache community, https://twitter.com/PlanetApache provides an aggregate of both Project activities and the personal blogs of select ASF Committers.

# # # 

Friday Dec 19, 2014

The Apache News Round-up: week ending 19 December 2014

There are more than 200 projects and initiatives under development at The ASF; below are our activities over the past week, and be sure to subscribe to announce@apache.org to keep an eye out for updates on your favorite projects!

Apache Jackrabbit™ –a scalable, high-performance hierarchical content repository designed for use as the foundation of modern world-class Web sites and other demanding content applications.
 - Apache Jackrabbit Oak 1.1.3 released http://jackrabbit.apache.org/downloads.html

Apache PDFBox™ –an Open Source Java tool for working with PDF documents.
 - Apache PDFBox 1.8.8 released http://pdfbox.apache.org/downloads.html

Apache Subversion™ –Open Source, centralized version control system characterized by its reliability as a safe haven for valuable data; the simplicity of its model and usage; and its ability to support the needs of a wide variety of users and projects, from individuals to large-scale enterprise operations.
 - Apache Subversion 1.7.19 released http://subversion.apache.org/download/#supported-releases
 - Apache Subversion 1.8.11 released http://subversion.apache.org/download/#recommended-release

ASF Operations –behind the scenes of the day-to-day functions at The Apache Software Foundation
 - The ASF publishes long-overdue New Code of Conduct http://s.apache.org/dGR
 - The ASF received 85 Individual Contributor License Agreements (ICLA), 11 Corporate Contributor License Agreements (CCLA), and four software grants over the past month https://www.apache.org/licenses/

ApacheCon™ –the official conference series of The Apache Software Foundation
 - Call for Papers open until 1 February http://events.linuxfoundation.org/events/apachecon-north-america/program/cfp
 - Become involved with the program selection process --check out http://s.apache.org/60N
 - Applications accepted for Apache Travel Assistance through 6 February http://www.apache.org/travel/
 - Sign up to receive ApacheCon updates and announcements http://www.apache.org/foundation/mailinglists.html#foundation-apachecon

= = =

For real-time updates, sign up for Apache-related news at announce@apache.org and follow @TheASF on Twitter. For a broader spectrum from the Apache community, https://twitter.com/PlanetApache provides an aggregate of both Project activities and the personal blogs of select ASF Committers. 


ASF publishes long-overdue Code Of Conduct

tl;dr: The ASF has published a Code of Conduct

We pride ourselves at The Apache Software Foundation on our principles of "community over code" and "don't be a jerk". But, alas, we've been slow to codify some of these things in public. Part of this, I'm sure, is that it’s easy to think we all just know how we're supposed to treat people, and so you shouldn't have to say, right?


But, of course, you do have to say. In part because some people don't know. And in part because it’s important that we communicate our values to the people in our community, and to people who might be considering joining our community. There has been a recent push in tech circles to include a Code of Conduct at events, conferences, etc. (Ashe Dryden maintains an introductory resource for learning more about how Codes of Conduct can help.) Increasingly, open source projects are adopting a Code of Conduct too, and we think this is a good idea that could help improve open source as a whole.

At ApacheCon, I was approached by Joan Touzet, an active member of the Apache CouchDB community, who had noted that we referenced a Code of Conduct on the main ASF website, but that no such document actually existed anywhere on our site. CouchDB has devoted a lot of time over the last few months crafting their Code of Conduct. It addresses everything from what's acceptable on the mailing lists, to how to report it if someone isn’t upholding community standards. This seemed like a great starting point, and so the ASF has adopted this as our initial Code of Conduct, with minor edits that remove the CouchDB-specific language. (It is my understanding that the CouchDB community now intends to use the Foundation level Code of Conduct, and will work with us to bring additional improvements to it.) 

No doubt, we'll get criticism for being so slow to do this, and we accept that. But it's never too late to take steps in the right direction, and we feel that this is an important one. Not just for the ASF, but for all open source projects and organisations.

You are encouraged to join the conversation on the Community Development mailing list, whether you have changes you'd like to see in that document or would like to discuss any other aspect of the Apache community. Any sort of community discussion topic is welcome. For example, Noah Slater, also from the CouchDB community, brought up the subject of punitive measures for infractions, which is an important but difficult issue. We'd love to hear your perspective on this, and to have your help as we continue to move in the right direction.

--Rich Bowen, Executive Vice President

Friday Dec 12, 2014

The Apache News Round-up: week ending 12 December 2014

Here's a snapshot of what's been happening at the Apache Software Foundation over the past week. With more than 200 projects and initiatives under development at The ASF, keep an eye out for updates on your favorite projects!

Apache Buildr™ –build system for Java-based applications, including support for Scala, Groovy and a growing number of JVM languages and tools.
 - Apache Buildr 1.4.21 released http://buildr.apache.org/

Apache cTAKES™ –clinical Text Analysis and Knowledge Extraction System (cTAKES) is an Open Source natural language processing system for information extraction from electronic medical record clinical free-text.
 - Apache cTAKES 3.2.1 released http://ctakes.apache.org/downloads.cgi

Apache DeltaSpike™ –a portable JSR-299 CDI (Contexts and Dependency Injection for Java) Extension library which contains lots of useful tools and helpers which are missing in the CDI core spec; not a CDI-container, but a portable CDI extension.
 - Apache DeltaSpike 1.2.0 released http://deltaspike.apache.org/download.html

Apache MetaModel™ –uniform connector and query API to many very different datastore types, including: Relational (JDBC) databases, CSV files, Excel spreadsheets, XML files, JSON files, Fixed width files, MongoDB, Apache CouchDB, Apache HBase, Apache Cassandra, ElasticSearch, OpenOffice.org databases, and more.
 - Apache MetaModel graduated as a Top-Level Project http://s.apache.org/2vF

ASF Infrastructure –leading the ASF's multi-datacenter, multi-cloud deployment running 24x7x365 on multiple continents, distributing terabytes of artifacts per week and archiving more than 11 million Apache email messages.
 - ASF SVN service outage post-mortem https://blogs.apache.org/infra/entry/svn_service_outage_postmortem

ApacheCon™ –the official conference series of The Apache Software Foundation
 - CFP open for ApacheCon North America in Austin http://s.apache.org/60N

Apache Software Foundation Graphics –graphical assets that can be used by third parties when referring to The Apache Software Foundation or one of its projects.
 - Download new "Powered By Apache" general and project logos http://apache.org/foundation/press/kit/#poweredby

= = =

For real-time updates, sign up for Apache-related news at announce@apache.org and follow @TheASF on Twitter

For a broader spectrum from the Apache community, https://twitter.com/PlanetApache provides an aggregate of both Project activities and the personal blogs of select ASF Committers.

Tuesday Dec 09, 2014

The Apache Software Foundation Announces Apache™ MetaModel™ as a Top-Level Project

Dynamic, metadata-driven Open Source framework provides uniform data access and code consolidation across various data stores. 

Forest Hill, MD –09 December 2014– The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 200 Open Source projects and initiatives, announced today that Apache™ MetaModel™ has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles. 

"It's a great privilege for us to have MetaModel graduated to a Top Level Project at Apache. It makes us proud and excited about welcoming more people into our community of coders and users," said Kasper Sørensen, Vice President of Apache MetaModel. "We've learned a lot about the Apache Way since entering the Apache Incubator in July 2013." 

Apache MetaModel is a data access framework that provides a common interface for the discovery, exploration, and querying of different types of data sources. Unlike traditional mapping frameworks, MetaModel emphasizes metadata of the data source itself and the ability to add more data sources at runtime. MetaModel's schema model and SQL-like query API are applicable to databases, CSV files, Excel spreadsheets, NoSQL databases, Cloud-based business applications, and even regular Java objects. This level of abstraction makes MetaModel great for dynamic data processing applications, less so for applications modelled strictly around a particular domain.

MetaModel is so called as it's a model for interacting with data based on metadata, enabling developers to go above the physical data layer and apply their application to just about any data. 

"MetaModel enables you to consolidate code and consolidate data a lot quicker than any other library out there," Sørensen explained. "In these 'Big Data days' there's a lot of focus on performance and scalability, and surely these topics also surround Apache MetaModel. The Big Data challenge is not always about massive loads of data, but instead massive variation and feeding a lot of different sources into a single application. Now to make such an application you both need a lot of connectivity capabilities and a lot of modelling flexibility. Those are the two aspects where Apache MetaModel shines. We make it possible for you to build applications that retain the complexity of your data – even if that complexity may change over time. The trick to achieve this is to model on the metadata and not on your assumptions." 

"The performance and flexibility of Apache MetaModel is a key building block for us to improve the usability and power for the thousands of users of DataCleaner – the leading Open Source data quality solution, supported by Neopost," said Enno Ebels, Executive Vice President of Customer Information Management at Neopost. 

"It's been a joy to follow the growth in the community and in functionality," added Sørensen. "Over the last year we've introduced connectivity for Apache HBase, JSON files, ElasticSearch, Apache Cassandra and a whole lot more. It's always a great pleasure to see the excitement in people's eyes when they realize that you can develop for these data sources using the same API." 

"Apache MetaModel is the core technology used underneath our MDM offering at Human Inference, providing us an abstraction layer above the different database schemes we currently support, including Postgres, DB2, Oracle, SQL Server, and ElasticSearch," said Ankit Kumar, Technical Lead at Human Inference and Member of the Apache MetaModel Project Management Committee.

"The MetaModel query language helps us write code agnostic of the underlying database. Within our MDM offering we have even implemented some virtual data stores using MetaModel," said Winfried van Holland, CTO of Neopost Customer Information Management. "These expose our data model in a custom view for our consultants - stripping away the technical complexities and exposing the business value in a data model that is natural for the business people to consume."

"Apache MetaModel is a key technology in Stratio Datavis, allowing us to manage metadata and create SQL-based connectors for a bunch of data stores," said David Morales, Big Data Architect at Stratio. "Thanks to Apache MetaModel, Datavis users can create beautiful dashboards using their SQL skills, instead of knowing several query languages. That's why we are proud to be contributors of MetaModel and we will continue to collaborate with this great project." 

Availability and Oversight 
As with all Apache products, Apache MetaModel software is released under the Apache License v2.0, and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For documentation and ways to become involved with Apache MetaModel, visit http://metamodel.apache.org and https://twitter.com/ApacheMetaModel

About The Apache Software Foundation (ASF) 
Established in 1999, the all-volunteer Foundation oversees more than two hundred leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 500 individual Members and 4,500 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Budget Direct, Citrix, Cloudera, Comcast, Facebook, Google, Hortonworks, HP, Huawei, IBM, InMotion Hosting, Matt Mullenweg, Microsoft, Pivotal, Produban, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ or follow @TheASF on Twitter

© The Apache Software Foundation. "Apache," "Apache MetaModel," "MetaModel," "ApacheCon," and the Apache MetaModel logo are trademarks of The Apache Software Foundation. All other brands and trademarks are the property of their respective owners.

# # # 