The Apache Software Foundation Blog

Wednesday May 17, 2017

The Apache Software Foundation Announces Apache® Beam™ v2.0.0

Open Source unified programming model for batch and streaming Big Data processing in use at Google Cloud, PayPal, and Talend, among others.

Forest Hill, MD —17 May 2017— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today the availability of Apache® Beam™ v2.0.0, the first stable release of the unified programming model for both batch and streaming Big Data processing.

An Apache Top-Level Project (TLP) since December 2016, Beam includes Java and Python software development kits used to define data processing pipelines and runners to execute them on Apache Apex, Apache Flink, Apache Spark, and Google Cloud Dataflow, among other execution engines.

Apache Beam has its roots in Google's internal work on data processing over the last decade, evolving from the initial MapReduce system, through FlumeJava and MillWheel, into Google Cloud Dataflow v1.x, which defined the unified programming model that became the heart of Apache Beam.

"The first stable release is an important milestone for the Apache Beam community," said Davor Bonaci, Vice President of Apache Beam. "This is a statement from the community that it intends to maintain API stability with all releases for the foreseeable future, making Beam suitable for enterprise deployment."

Apache Beam v2.0.0 improves user experience across the project, focusing on seamless portability across execution environments, including engines, operating systems, on-premises clusters, cloud providers, and data storage systems. Other highlights include:
  • API stability and future compatibility within this major version;
  • Stateful data processing paradigms that unlock efficient, data-dependent computations;
  • Support for user-extensible file systems, with built-in support for Hadoop Distributed File System, among others; and
  • A metrics subsystem for deeper insight into pipeline execution.

Apache Beam is in use at Google Cloud, PayPal, and Talend, among others.

"Apache Beam is a mature data processing API for the enterprise, with powerful semantics that solve real-world challenges of stream processing," said Tomer Pilossof, Big Data Manager at PayPal. "With Beam, we provide data processing solutions for a wide range of customers within the PayPal organization."

"We at Talend are thrilled to have contributed to Apache Beam reaching the 2.0.0 milestone and its first official stable release," said Laurent Bride, Chief Technology Officer at Talend. "Apache Beam is now part of the foundation of Talend products. Recently, we released Talend Data Preparation for Big Data which leverages Beam to create transformation pipelines that are portable across many execution engines. Later this year, we plan to deliver Talend Data Streams, taking the Apache Beam integration one step further by utilizing its powerful streaming semantics. Whether for batch, streaming, or real-time use cases, Apache Beam is a powerful framework that delivers the flexibility and advanced functionality our customers need."

"We congratulate the Apache Beam community for reaching the key milestone of a first stable release," said William Vambenepe, Lead Product Manager for Big Data, Google Cloud. "We look forward to our Google Cloud Dataflow customers taking full advantage of Beam's powerful programming model and newest features to run their data processing pipelines on Google Cloud."

Apache Beam v2.0.0 is making its debut at Apache: Big Data, taking place this week in Miami, FL, with four sessions featuring Apache Beam. Apache Beam will also be highlighted at numerous face-to-face meetups and conferences, including the Future of Data San Jose meetup, Strata Data Conference London, Berlin Buzzwords, and DataWorks Summit San Jose.

"I'd like to invite everyone to try out Apache Beam v2.0.0 today and consider joining our vibrant community," added Bonaci. "We welcome feedback, contribution and participation through our mailing lists, issue tracker, pull requests, and events."

Availability and Oversight
Apache Beam software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Beam, visit https://beam.apache.org/ and https://twitter.com/ApacheBeam

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server -- the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 680 individual Members and 6,000 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cash Store, Cerner, Cloudera, Comcast, Confluent, Facebook, Google, Hortonworks, HP, Huawei, IBM, InMotion Hosting, iSigma, LeaseWeb, Microsoft, ODPi, PhoenixNAP, Pivotal, Private Internet Access, Produban, Red Hat, Serenata Flowers, Target, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "Beam", "Apache Beam", "Apex", "Apache Apex", "Flink", "Apache Flink", "Spark", "Apache Spark", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Monday May 15, 2017

The Apache Software Foundation Announces Apache® Samza™ v0.13

Open Source Big Data distributed stream processing framework in production at Intuit, LinkedIn, Netflix, Optimizely, Redfin, and Uber, among other organizations.

Forest Hill, MD —15 May 2017— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today the availability of Apache® Samza™ v0.13, the latest version of the Open Source Big Data distributed stream processing framework.

An Apache Top-Level Project (TLP) since January 2015, Samza is designed to provide support for fault-tolerant, large-scale stream processing. Developers use Apache Samza to write applications that consume streams of data and to help organizations understand and respond to their data in real time. Apache Samza offers a unified API to process streaming data from pub-sub messaging systems like Apache Kafka and batch data from Apache Hadoop.

"The latest 0.13 release takes Apache Samza's data processing capabilities to the next level with multiple new features," said Yi Pan, Vice President of Apache Samza. "It also improves the simplicity and portability of real-time applications."

Apache Samza powers several real-time data processing needs including real-time analytics on user data, message routing, combating fraud, anomaly detection, performance monitoring, real-time communication, and more. Apache Samza can process up to 1.1 million messages per second on a single machine. v0.13 highlights include:
  • A higher-level API that developers can use to express complex processing pipelines on streams more concisely;
  • Support for running Samza applications as a lightweight embedded library without relying on YARN;
  • Support for flexible deployment options; 
  • Support for rolling upgrade of running Samza applications;
  • Improved monitoring and failure detection using a built-in heart beating mechanism;
  • Enabling better integrations with other cluster-manager frameworks and environments; and
  • Several bug fixes that improve the reliability, stability, and robustness of data processing.

Organizations such as Intuit, LinkedIn, Netflix, Optimizely, Redfin, TripAdvisor, and Uber rely on Apache Samza to power complex data architectures that process billions of events each day. A list of user organizations is available at https://cwiki.apache.org/confluence/display/SAMZA/Powered+By

"Apache Samza is a highly performant stream/data processing system that has been battle tested over the years of powering mission critical applications in a wide range of businesses," said Kartik Paramasivam, Head of Streams Infrastructure, and Director of Engineering at LinkedIn. "With this 0.13 release, the power of Samza is no longer limited to YARN based topologies. It can now be used in any hosting environment. In addition, it now has a new higher level API that makes it significantly easier to create arbitrarily complex processing pipelines."

"Apache Samza has been powering near real-time use cases at Uber for the last year and a half," said Chinmay Soman, Staff Software Engineer at Uber. "This ranges from analytical use cases such as understanding business metrics, feature extraction for machine learning as well as some critical applications such as Fraud detection, Surge pricing and Intelligent promotions. Samza has been proven to be robust in production and is currently processing about billions of messages per day, accounting for 100s of TB of data flowing through the system." 

"At Optimizely, we have built the world’s leading experimentation platform, which ingests billions of click-stream events a day from millions of visitors for analysis," said Vignesh Sukumar, Senior Engineering Manager at Optimizely. "Apache Samza has been a great asset to Optimizely's Event ingestion pipeline allowing us to perform large scale, real time stream computing such as aggregations (e.g. session computations) and data enrichment on a multiple billion events/day scale. The programming model, durability and the close integration with Apache Kafka fit our needs perfectly."

"It has been a phenomenal experience engaging with this vibrant international community of users and contributors, and I look forward to our continued growth. It is a great time to be involved in the project and we welcome new contributors to the Samza community," added Pan.

Catch Apache Samza in action at Apache: Big Data, 16-18 May 2017 in Miami, FL http://apachecon.com/ , where the community will be showcasing how Samza simplifies stream processing at scale.

Availability and Oversight
Apache Samza software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Samza, visit http://samza.apache.org/ , https://blogs.apache.org/samza/ , and https://twitter.com/samzastream

© The Apache Software Foundation. "Apache", "Hadoop", "Apache Hadoop", "Kafka", "Apache Kafka", "Samza", "Apache Samza", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Friday May 12, 2017

The Apache News Round-up: week ending 12 May 2017

As members of the Apache community prepare to convene in Miami for ApacheCon next week, we continue to be productive. Here's what we've been up to:

Support Apache –if Apache software has helped you, please consider making a donation, no matter the size. Every dollar counts. http://apache.org/foundation/contributing.html

ASF Board –management and oversight of the business and affairs of the corporation in accordance with the Foundation's bylaws.
 - Next Board Meeting: 17 May 2017. Board calendar and minutes http://apache.org/foundation/board/calendar.html

ASF Infrastructure –our distributed team on four continents keeps the ASF's infrastructure running around the clock.
 - 7M+ weekly checks yield forward performance at 99.82% uptime http://status.apache.org/

ApacheCon™ –Tomorrow's Technology Today: Big Data, CloudStack, Flex, IoT, Tomcat, and dozens of Apache projects across 200+ sessions, 150+ speakers, 4 subconferences, BarCampApache and more. http://apachecon.com/ Register today!
 - Support your favorite Apache projects and communities at ApacheCon by becoming an Apache Community and/or BarCamp Sponsor http://events.linuxfoundation.org/events/apachecon-north-america/sponsors/community-sponsor

Apache Arrow™ –columnar in-memory analytics layer designed to accelerate Big Data.
 - Apache Arrow 0.3.0 released http://arrow.apache.org/

Apache Directory™ Fortress –a standards-based access management system, written in Java, that provides role-based access control, delegated administration and password policy services with LDAP.
 - Apache Fortress 2.0.0-RC2 released http://directory.apache.org/fortress/

Apache HttpComponents™ Client –Java library implementing an HTTP client based on HttpCore components.
 - HttpComponents Client 5.0 alpha2 released http://hc.apache.org/httpcomponents-client/

Apache Ignite™ –In-Memory Data Fabric providing in-memory data caching, partitioning, processing, and querying components.
 - Apache Ignite 2.0.0 released https://ignite.apache.org/

Apache OpenWebBeans™ Meecrowave –a light Apache web server based on Tomcat and OpenWebBeans (à la microprofile fashion).
 - Apache Meecrowave 0.3.1 released http://openwebbeans.apache.org/meecrowave/

Apache NiFi™ –an easy to use, powerful, and reliable system to process and distribute data.
 - Apache NiFi 1.2.0 released https://nifi.apache.org/

Apache Qpid™ Proton –a messaging library for the Advanced Message Queuing Protocol 1.0 (AMQP 1.0, ISO/IEC 19464, http://www.amqp.org).
 - Apache Qpid Proton-J 0.19.0 released http://qpid.apache.org/

Apache Tomcat™ –an Open Source software implementation of the Java Servlet, JavaServer Pages, Java Unified Expression Language, Java WebSocket and JASPIC technologies.
 - Apache Tomcat 9.0.0.M21 released http://tomcat.apache.org/

Apache Trafodion (incubating) –a Web-scale SQL-on-Hadoop solution enabling transactional or operational workloads on Apache Hadoop.
 - Apache Trafodion 2.1.0-incubating released http://trafodion.incubator.apache.org/index.html


Did You Know?

 - Did you know that previews for ApacheCon + Apache: Big Data keynotes and select sessions are available exclusively on Feathercast? https://feathercast.apache.org/

 - Did you know that over the past quarter, Apache source code has been downloaded more than 2M times? https://projects.apache.org/statistics.html

 - Did you know that the RiskCo financial engineering and risk intelligence group uses Apache Wicket? http://wicket.apache.org/


Apache Community Notices:

 - "Success at Apache" is a blog series that focuses on the processes behind why the ASF "just works". 1) Project Independence https://s.apache.org/CE0V 2) All Carrot and No Stick https://s.apache.org/ykoG 3) Asynchronous Decision Making https://s.apache.org/PMvk 4) Rule of the Makers https://s.apache.org/yFgQ 5) JFDI --the unconditional love of contributors https://s.apache.org/4pjM 6) Meritocracy and Me https://s.apache.org/tQQh

 - The latest Apache Community Newsletter https://blogs.apache.org/comdev/entry/community-development-news-april-2017

 - Do friend and follow us on the Apache Community Facebook page https://www.facebook.com/ApacheSoftwareFoundation/ and Twitter account https://twitter.com/ApacheCommunity

 - The Apache Phoenix community will be holding PhoenixCon on 13 June in San Francisco https://www.eventbrite.com/e/phoenixcon-2017-tickets-32872245772

 - ASF Operations Summary - Q3 FY2017 https://s.apache.org/NKFz

 - The list of Apache project-related MeetUps can be found at http://apache.org/events/meetups.html

 - Find out how you can participate with Apache community/projects/activities --opportunities open with Apache HTTP Server, Avro, ComDev (community development), Directory, Incubator, OODT, POI, Polygene, Syncope, Tika, Trafodion, and more! https://helpwanted.apache.org/

 - ApacheCon North America + Apache: BigData, CloudStack Collaboration Conference, FlexJS Summit, Apache: IoT, and TomcatCon will be held 16-18 May 2017 in Miami http://apachecon.com/

 - Are your software solutions Powered by Apache? Download & use our "Powered By" logos http://www.apache.org/foundation/press/kit/#poweredby

= = =

For real-time updates, sign up for Apache-related news by sending mail to announce-subscribe@apache.org and follow @TheASF on Twitter. For a broader spectrum from the Apache community, https://twitter.com/PlanetApache provides an aggregate of Project activities as well as the personal blogs and tweets of select ASF Committers.

# # #

Friday May 05, 2017

The Apache News Round-up: week ending 5 May 2017

Welcome, May! The Apache community is getting ready for ApacheCon, and we hope to see you in Miami soon. Here's what's happened over the past week:

Support Apache –your donations help the world's largest Open Source foundation enrich the lives of countless users and developers. Every dollar counts. http://apache.org/foundation/contributing.html

ASF Board –management and oversight of the business and affairs of the corporation in accordance with the Foundation's bylaws.
 - Next Board Meeting: 17 May 2017. Board calendar and minutes http://apache.org/foundation/board/calendar.html
 - The Apache Software Foundation Welcomes 64 New Members https://s.apache.org/2Vt

Success at Apache –our sixth installment in the monthly blog series that focuses on the processes behind why the ASF "just works".
 - Meritocracy and Me, by Tom Barber https://s.apache.org/tQQh

ASF Infrastructure –our distributed team on four continents keeps the ASF's infrastructure running around the clock.
 - 7M+ weekly checks yield whizbang performance at 99.97% uptime http://status.apache.org/

ApacheCon™ –Tomorrow's Technology Today: Big Data, CloudStack, Flex, IoT, Tomcat, and dozens of Apache projects across 200+ sessions, 150+ speakers, 4 subconferences, BarCampApache and more. http://apachecon.com/ Register today!
 - Support your favorite Apache projects and communities at ApacheCon by becoming an Apache Community and/or BarCamp Sponsor http://events.linuxfoundation.org/events/apachecon-north-america/sponsors/community-sponsor

Apache CarbonData™ –Open Source Big Data analytics accelerator.
 - The Apache Software Foundation Announces Apache® CarbonData™ as a Top-Level Project https://s.apache.org/QmTI

Apache HBase™ –an Open Source, distributed, versioned, non-relational database.
 - Apache HBase 1.1.10 released https://hbase.apache.org/

Apache HttpComponents™ Core –a set of HTTP/1.1 and HTTP/2 transport components that can be used to build custom client and server side HTTP services with a minimal footprint.
 - Apache HttpComponents Core 5.0 alpha3 released http://hc.apache.org/

Apache Juneau (incubating) –a toolkit for marshalling POJOs to a wide variety of content types using a common framework, and for creating sophisticated self-documenting REST interfaces and microservices using very little code.
 - Apache Juneau 6.2.0 (incubating) released http://juneau.incubator.apache.org/

Apache Kylin™ –Open Source petabyte-scale Big Data Distributed Analytics Engine.
 - Apache Kylin 2.0.0 released https://kylin.apache.org/

Apache Mahout™ –Open Source scalable machine learning and data mining library for Big Data artificial intelligence.
 - The Apache Software Foundation Announces Apache® Mahout™ v0.13.0 https://s.apache.org/ioAa

Apache Qpid™ Dispatch –a router for the Advanced Message Queuing Protocol 1.0 (AMQP 1.0, ISO/IEC 19464, http://www.amqp.org).
 - Apache Qpid Dispatch 0.8.0 released http://qpid.apache.org/


Did You Know?

 - Did you know that previews for ApacheCon + Apache: Big Data keynotes and select sessions are available exclusively on Feathercast? https://feathercast.apache.org/

 - Did you know that the Top 5 closers (+ number of issues) over the past 30 days are: Wes McKinney (169), Sean Owen (75), Gavin McDonald (64), Aled Sage (57), and Jesse MacFadyen (55)? Well done, all! http://status.apache.org/

 - Did you know that you can support Apache by shopping at http://smile.amazon.com? 0.5% will be donated to the ASF!

 - Did you know that Air New Zealand's "Ask Oskar" online chatbot is trained by Apache OpenNLP? http://opennlp.apache.org/


Apache Community Notices:

 - "Success at Apache" is a new monthly blog series that focuses on the processes behind why the ASF "just works". 1) Project Independence https://s.apache.org/CE0V 2) All Carrot and No Stick https://s.apache.org/ykoG 3) Asynchronous Decision Making https://s.apache.org/PMvk 4) Rule of the Makers https://s.apache.org/yFgQ 5) JFDI --the unconditional love of contributors https://s.apache.org/4pjM 6) Meritocracy and Me https://s.apache.org/tQQh

 - Introducing the new Apache Community Newsletter https://blogs.apache.org/comdev/entry/community-development-news-march-2017 , Facebook page https://www.facebook.com/ApacheSoftwareFoundation/ , and Twitter account https://twitter.com/ApacheCommunity . Do friend and follow us.

 - ASF Operations Summary - Q3 FY2017 https://s.apache.org/NKFz

 - The list of Apache project-related MeetUps can be found at http://apache.org/events/meetups.html

 - Find out how you can participate with Apache community/projects/activities --opportunities open with Apache HTTP Server, Avro, ComDev (community development), Directory, Incubator, OODT, POI, Polygene, Syncope, Tika, Trafodion, and more! https://helpwanted.apache.org/

 - ApacheCon North America + Apache: BigData, CloudStack Collaboration Conference, FlexJS Summit, Apache: IoT, and TomcatCon will be held 16-18 May 2017 in Miami http://apachecon.com/

 - Are your software solutions Powered by Apache? Download & use our "Powered By" logos http://www.apache.org/foundation/press/kit/#poweredby

# # #

Wednesday May 03, 2017

The Apache Software Foundation Welcomes 64 New Members

The Apache Software Foundation welcomes the following new Members who were elected during the annual ASF Members' Meeting on 28-30 March 2017:

Taher A. Alkhateeb, Ryan Blue, Davor Bonaci, Michael Brohl, Cédric Champeau, Byung-Gon Chun, William Colen, Radu Cotescu, Jaroslaw Cwiklik, Dániel Dékány, Mike Drob, Eric Evans, Olaf Flebbe, Lars Francke, Roberto Galoppini, Robert Gemmell, Jorge Luis Betancourt Gonzalez, Thamme Gowda, Scott Gray, Stefan Hett, Jonathan Hsieh, Claus Ibsen, Bilgin Ismet Ibryam, Jeff Jirsa, Evgeny Kotkov, Andrew Kurth, Chris Lambertus, T Jake Luciani, Nicolas Malin, Stephen Mallette, Karl Heinz Marbaise, Sidney Markowitz, Gary Martin, Jan Materne, Nate McCall, Larry McCay, Robert Munteanu, Kay Ousterhout, Anil Patel, Christine Poerschke, Matt Post, Dominik Psenner, Chris Riccomini, Carlos Rovira, Daniel Ruggeri, Guergana K. Savova, Felix Schumacher, Anthony Shaw, Matt Sicker, Karanjeet Singh, Stian Soiland-Reyes, Lee Moon Soo, Michael Starch, Daniel Takamori, Josh Tynjala, Ashish Vijaywargiya, Jay Vyas, Andrew Wang, Claude Warren, Michael Semb Wever, Evans Ye, Piotr Zarzycki, Jeff Zhang, and Jordan Zimmerman.


When the ASF incorporated in 1999, the Foundation's core membership comprised 21 individuals who oversaw the progress of the Apache HTTP Server. This group grew with Committers --developers who contributed code, patches, or documentation, and were subsequently granted access by the Membership:
  1. to "commit" or "write" (contribute) directly to the code repository;

  2. the right to vote on community-related decisions; and

  3. the ability to propose an active user for Committership.

Those Committers who demonstrate merit in the Foundation’s growth, evolution, and progress are nominated for ASF Membership by existing members.

This election brings the total number of ASF Members to 684 today. Individuals elected as ASF Members legally serve as the "shareholders" of the Foundation https://www.apache.org/foundation/governance/members.html

For more information on how the ASF works, visit http://www.apache.org/foundation/how-it-works.html .

# # # 


Monday May 01, 2017

Success at Apache: Meritocracy and Me.

by Tom Barber

When Sally asked for volunteers to help with a blog post series "Success at Apache" I realised there was a very human story to tell about how the ASF helped me get to where I am today and hopefully where I'll go tomorrow. Over the years I have worked on and run a number of Open Source projects whilst working with an awful lot of Open Source software. One day I was browsing Slashdot as you do, yeah I know a lot of people disparage it, but it's an awfully hard habit to kick, and without it I wouldn't have got involved in the ASF so I owe it a lot. Anyway, one day when browsing Slashdot I saw this article (https://it.slashdot.org/story/11/01/08/1544204/apache-to-steward-nasa-built-middleware). I had been working in the Open Source business intelligence industry for a few years at that point and I spent a lot of time hacking around and managing data systems, so I wondered how I could get some help out of OODT (http://oodt.apache.org/). Also, as a kid I had always loved everything about space: I was a huge Apollo fan, had a small telescope, went to the total eclipse in the UK in 1999 and so on. I thought this OODT project would be a fun way for me to chat nonsense to a few NASA employees, find out how they did stuff and do a bit of Open Source hacking on the side, which would at least let me participate in some NASA related development work, and so it began.

For those of you who haven't heard of Apache OODT, it is a middleware layer for building data systems. Originally written by NASA JPL and then Open Sourced to The Apache Software Foundation, it provides data ingest capabilities, metadata extraction, data workflows and resource management. I started by asking pretty dumb questions on the IRC channel, posting stuff on the mailing lists and trying to figure out how this reasonably expansive stack of software even operated. Chris Mattmann and Sean Kelly guided me through the opening foray into OODT development and education. Eventually, having submitted a few bug fixes, I volunteered to be a release manager for an OODT release and that got me more involved. Not too long after, Sean asked if I fancied having a go at being the project chair, which I duly accepted. Behind the scenes cogs were turning and in a matter of a few weeks, I'd gone from a committer and PMC member to Chair to ASF member. It was certainly a hectic time, trying to keep up with all the new things I had to do, mailing lists to follow and so on, but what a period in my ASF experience, lots of fun!

That was just over 2 years ago and I'm still happily stewarding the OODT folks and keeping the cogs turning, releases happening and Jira tickets triaged. Alongside that it is truly an honour to be involved with the ASF as a member and although the politics can get tedious, the foundation is an amazing place for people to learn to work on great software as part of a distributed team. 18 months ago I was getting a little jaded with the monotony of the BI work I was doing; there are only so many sales databases and budget reports one guy can take, and after being in BI 8 or so years I felt like it was time for a change of scenery, I just didn't know what. So I blasted out an email to 10 or so people I knew or had had some contact with over the years who might be able to give me a job, or know someone who was looking for a Java developer, BI guy, Open Source advocate, that type of thing. I'd included some OODT folks on my email, not because I thought there was a chance of a job using it, but just in case they happened to know someone out in California needing some remote help. Everyone said no, except Chris Mattmann who said if I could hold on a few months he might have something for me at NASA! That response floored me. I'd never even considered that as an option and knew that with a young family it would be highly unlikely I'd be able to move to California, so I played along assuming it would fall through. But as the process dragged on and contracts got drawn up we got closer and closer to it becoming real and the excitement grew: there was the tangible possibility of me fulfilling at least a bit of a lifelong dream. No, I wouldn't be an astronaut, but there was the chance of employment by NASA.

Eventually, 6 months or so later, the paperwork was signed and I joined the ranks at NASA JPL, working as an Apache OODT and devops guy. What is great is that having 10 years of business and development experience, I feel like I can very much make a positive contribution to the team, and in part that is down to what I have learnt developing and coding at the ASF. It has been an amazing experience and a wonderful 12 months. I never dreamt an opportunity like that would arise and it is 100% down to the great work the ASF does in stewarding new projects through the incubation process and into mainstream adoption. Without the ASF I would likely still be a BI guy dealing with run of the mill data warehouses; instead I work on Genomics Search Engines, help hunt criminals on the dark web and a host of other stuff. Life sometimes throws you an opportunity that you don't expect and the ASF certainly facilitated that.

Last week I was in Pasadena, visiting the JPL facility and getting the guided tour, and doing a bit of work. It was amazing talking to such a dedicated group of people who clearly have a big passion for what they do. Getting to see their mission control, the Mars rovers and various satellite mock ups was awe inspiring but what excited me the most was getting to sit down and pick the brains of people with whom I have worked at the ASF for years yet not met in the flesh. Finally making that human connection means a lot.

What the ASF offers here is the ability to learn to work as a distributed team without the pressures of the "real world". Everyone at the ASF, pretty much, is a volunteer and other volunteers recognise that, and so it reduces the pressure, but whilst reducing the pressure it teaches you how to make binding decisions as a disparate group, how to keep records and how to ship good quality code whilst living in different timezones. At the ASF some of us might meet once or twice a year at ApacheCon, Fosdem or elsewhere, but largely all communication is done via mailing list. This can cause issues when people "just want to get it done" but it also provides an immutable record of what is going on in a project and who said what. This proves equally useful out in the "real world" where you want to track business decisions or look up historical records of why a certain choice was made. Dealing with people who you don't work with on a daily basis also helps you think more about your communication style, what is fine to say and what isn't, and how you structure your communications, which is very important in a business setting. Do you know the person? Do they understand your nuances? Is English their native language? etc.

The other thing I find the ASF offers is understanding. Last year I was diagnosed with Asperger's at the age of 33, which is pretty late. What is nice is that, generally, people like to listen, and if you have something that affects your personal or professional life, people who you've met at the ASF will often lend an understanding ear to allow you to offload or discuss something completely unrelated to the project you might be working on. Or, like me, you can just stand up at the front of an ApacheCon lightning talk and tell everyone! Either way, you can generally find someone in the Apache family who will provide a sounding board for anything you want to discuss.

These days I spend my spare time still working on OODT stuff, but also doing a lot of public speaking and mentoring, and whenever I do I make sure I talk up the Apache Software Foundation, because it has given me the chance of a lifetime and one that I'll be forever grateful for. If you aren't involved in development here at the ASF, get involved. You don't have to be a coder; you just need to like helping out in a fun, Open Source community.

As I mentioned at the start, this blog series is about success at Apache, and hopefully this proves that success can come in a number of ways. The ASF was selected by NASA as the home for its data middleware platform, which proves that NASA deemed the incubation process, the license, and the ecosystem acceptable; that is success for the Apache Software Foundation. Similarly, the Foundation has proved very successful in helping people from a range of different walks of life into new lines of work, and that is exactly what happened to me and the reason I wanted to share my story about success at Apache.

= = =

"Success at Apache" is a new monthly blog series that focuses on the processes behind why the ASF "just works". 1) Project Independence https://s.apache.org/CE0V 2) All Carrot and No Stick https://s.apache.org/ykoG 3) Asynchronous Decision Making https://s.apache.org/PMvk 4) Rule of the Makers https://s.apache.org/yFgQ 5) JFDI --the unconditional love of contributors https://s.apache.org/4pjM

The Apache Software Foundation Announces Apache® CarbonData™ as a Top-Level Project

Open Source Big Data analytics accelerator in use at Bank of Communications, Hulu, Huawei, SAIC Motor, and Zhejiang Mobile, among others.

Forest Hill, MD –1 May 2017– The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Apache® CarbonData™ has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles.

Apache CarbonData is an indexed columnar store file format for fast analytics on Big Data platforms (including Apache Hadoop and Apache Spark, among others) that helps speed up queries by an order of magnitude over petabytes of data.

"We are very proud to complete the incubation process and graduate as an Apache Top-Level Project," said Liang Chen, Vice President of Apache CarbonData. "The CarbonData community grew rapidly over the last ten months, both in terms of size and diversity. Since entering the Apache Incubator, we have completed 4 releases, and exceeded 90 contributors from 10 different organizations."

With the aim of using a unified file format to satisfy all kinds of data analysis cases, Apache CarbonData seamlessly integrates with Hadoop and Spark to improve Big Data analysis efficiency. In benchmarks, CarbonData's faster interactive query helps speed up queries by approximately 10x compared with standard column-oriented SQL-on-Hadoop data stores.

Highlights include:

  • Unique data organization to allow faster filtering and better compression;
  • Multi-level Indexing to enable faster search and speed up query processing;
  • Deep Apache Spark Integration for dataframe + SQL compliance;
  • Advanced push down optimization to minimize the amount of data being read, processed, converted, transmitted, and shuffled;
  • Efficient compression and global encoding schemes to further improve aggregation query performance;
  • Dictionary encoding for reduced storage space and faster processing; and
  • Data update + delete support using standard SQL syntax.
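
As a rough illustration of the dictionary-encoding idea named in the highlights above, the following minimal Python sketch replaces repeated column values with small integer codes and then filters on the codes. It is not CarbonData code and does not reflect its actual file layout; the function names `dictionary_encode` and `filter_equals` and the sample column are hypothetical, chosen only for this example.

```python
def dictionary_encode(values):
    """Replace each distinct value with a small integer code.

    Returns (dictionary, codes), where dictionary maps value -> code in
    first-seen order and codes is the encoded column.
    """
    dictionary = {}
    codes = []
    for value in values:
        # A new value receives the next unused code; a repeated value
        # reuses the code it was given the first time.
        code = dictionary.setdefault(value, len(dictionary))
        codes.append(code)
    return dictionary, codes


def filter_equals(dictionary, codes, wanted):
    """Return row positions whose value equals `wanted`.

    The value is looked up once in the dictionary; the scan then compares
    integers instead of the original (possibly long) strings.
    """
    code = dictionary.get(wanted)
    if code is None:
        return []  # value never occurs, so no rows can match
    return [row for row, c in enumerate(codes) if c == code]


# Hypothetical sample column
cities = ["Shanghai", "Beijing", "Shanghai", "Shenzhen", "Beijing"]
dictionary, codes = dictionary_encode(cities)
matches = filter_equals(dictionary, codes, "Beijing")
```

Storing one copy of each distinct value plus a column of integers is what gives the reduced storage space, and comparing integers during a filter is one reason processing can be faster.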


Apache CarbonData is in use at an array of organizations, including Bank of Communications, medical/pharma social platform DXY, Hulu, Huawei, group online retailer MEITUAN, SAIC Motor, and Zhejiang Mobile, among others.

"CarbonData has very good performance as a ‘SQL on Hadoop’ solution," said Tan Sheng, Director of SAIC Motor’s Big Data team. "It is suitable for SAIC Motor to adopt as a central Big Data platform component. Not only do we use Apache CarbonData, we also actively participate in its community as contributors." 

"Apache CarbonData is great, as it helped our audit business improve performance 7-10X based on 14 billion rows of data," said Wei Zhao, Senior Engineer at Bank of Communications.

"Apache CarbonData is very suitable for our filter query cases, and has averaged 20x improvement on performance," said William Zhu, Architecture team member at DXY. "And, as CarbonData supports data update and delete, this feature is very useful. We would consider CarbonData as our all-in-one solution to unify all analysis data."

CarbonData was first developed at Huawei in 2013. The project was submitted to the Apache Incubator in June 2016, and had its first official release two months later. The project won top honors in the Big Data category of the Black Duck 2016 Open Source Rookies of the Year.

"Apache CarbonData is a great example of the value of the incubation process," said Jean-Baptiste Onofré, Apache CarbonData Incubator Mentor and Project Management Committee member. "Helping grow the CarbonData developer and user communities has increased our visibility, which allowed us to extend our use cases and tests, and gather new ideas. The initial CarbonData committers did (and are still doing) great work to welcome new users and contributors, clearly understanding it's a step forward for the project."

"We will continue to put our efforts towards optimizing data format efficiency for the Big Data ecosystem and provide a unified and high performance data storage solution," added Liang. "The Apache CarbonData community welcomes interested contributors to work with us on our journey forward."

Catch Apache CarbonData in action at ApacheCon (16-18 May/Miami), and Spark Summit (5-7 June/San Francisco).

Availability and Oversight
Apache CarbonData software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache CarbonData, visit http://carbondata.apache.org/ , https://twitter.com/ApacheCarbonDat , and https://www.facebook.com/carbondata/

About the Apache Incubator
The Apache Incubator is the entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects wishing to join the ASF enter through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 620 individual Members and 6,000 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cash Store, Cerner, Cloudera, Comcast, Confluent, Facebook, Google, Hortonworks, HP, Huawei, IBM, InMotion Hosting, iSigma, LeaseWeb, Microsoft, ODPi, PhoenixNAP, Pivotal, Private Internet Access, Produban, Red Hat, Serenata Flowers, Target, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "CarbonData", "Apache CarbonData", "Hadoop", "Apache Hadoop", "Spark", "Apache Spark", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # # 

The Apache Software Foundation Announces Apache® Mahout™ v0.13.0

Open Source scalable machine learning and data mining library for Big Data artificial intelligence now more powerful and easier to use.

Forest Hill, MD —1 May 2017— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today the availability of Apache® Mahout™ v0.13.0, the latest version of the Open Source scalable machine learning library.

Apache Mahout provides an environment for quickly creating machine-learning applications that scale and run on the highest-performance parallel computation engines available. Mahout is the first scalable generalized tensor and linear algebra solving engine taking data scientists from interactive experiments to production use.

"Apache Mahout 0.13.0 is more powerful with its new algorithm framework that allows for easier implementation of machine learning algorithms," said Andrew Palumbo, Vice President of Apache Mahout. "The enhanced Mahout code base and development framework make machine learning even more accessible, which is a game changer in the field of artificial intelligence."

Mahout provides a wide variety of premade algorithms (Matrix Factorization, QR via ALS, SSVD, PCA, etc.) for Scala + Apache Spark, H2O, and Apache Flink, as well as on-GPU compute for performance improvements in very large tensor math. Apache Mahout provides the data science tools to automatically find meaningful patterns in Big Data sets by supporting the following main data science use cases:
  • Collaborative filtering – mines user behavior and makes product recommendations (such as eCommerce product recommenders);
  • Regression – estimates a numerical value based on values of other inputs;
  • Clustering – takes items in a particular class (such as Web pages or newspaper articles) and organizes them into naturally occurring groups, such that items belonging to the same group are similar to each other; and
  • Classifying – learns from existing categorizations and then assigns unclassified items to the best category.
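
To make the first of these use cases concrete, here is a minimal Python sketch of item co-occurrence, one simple way to do collaborative filtering. It does not use or represent Mahout's API or its distributed algorithms; the function names `cooccurrence_counts` and `recommend` and the sample purchase histories are hypothetical, chosen only for this example.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_counts(histories):
    """Count how often each pair of items appears in the same history."""
    counts = defaultdict(lambda: defaultdict(int))
    for items in histories:
        # Each unordered pair of distinct items is counted once per history.
        for a, b in combinations(sorted(set(items)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def recommend(counts, owned, top_n=3):
    """Rank items the user does not own by co-occurrence with owned items."""
    scores = defaultdict(int)
    for item in owned:
        for other, n in counts.get(item, {}).items():
            if other not in owned:
                scores[other] += n
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda name: (-scores[name], name))[:top_n]


# Hypothetical purchase histories, one list per user
histories = [
    ["tent", "stove", "lantern"],
    ["tent", "stove"],
    ["stove", "kettle"],
    ["tent", "lantern"],
]
counts = cooccurrence_counts(histories)
suggestions = recommend(counts, {"stove"})
```

Here a user who owns only a stove is shown the tent first, because tents and stoves were bought together most often. At scale, the same counts are usually expressed as matrix products over a user-item matrix, which is the kind of linear algebra a distributed engine is built to run.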

New in v0.13.0
Apache Mahout now makes it easier to do matrix math on graphics cards, which is relevant for most modern machine-learning and deep-learning methods. In addition, v0.13.0 allows shared-nothing computation on GPUs, on multi-core CPUs, or in the JVM as appropriate, and provides a simplified framework for building new algorithms. As Mahout comprises an interactive environment and library that support generalized scalable linear algebra and include many modern machine-learning algorithms, the project has also collaborated with developers on other projects, including the Open Source linear algebra library ViennaCL, the Java wrapper library interface JavaCPP, and the graphics processor technology manufacturer NVIDIA, to add CUDA bindings directly into Mahout for simplicity of development.

The v0.13.0 release reflects 62 separate JIRA issues from v0.12.2, including numerous enhancements to Mahout-Samsara, the vector math experimentation environment with R-like syntax that works at scale. Complete release notes are at http://mahout.apache.org/release-notes/Apache-Mahout-0.13.0-Release-Notes.pdf

Future versions of Mahout will include support for native iterative solvers, a more robust algorithm library, and smarter probing and optimization of multiplications, among other features.

A comprehensive list of users of Apache Mahout is available at https://mahout.apache.org/general/powered-by-mahout.html ; current users are mostly researchers and developers actively involved in building distributed machine-learning pipelines and tools.

"We thank our community of developers and users who helped make this milestone release possible, and welcome new contributors to help us advance machine learning," added Palumbo.

Catch Apache Mahout in action at Apache: Big Data, where attendees learn first-hand from many original project creators and companies from the greater Mahout community. Apache: Big Data will be held 16-18 May 2017 in Miami, FL. To register, and for more information, visit http://apachecon.com/

Availability and Oversight
Apache Mahout software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Mahout, visit http://mahout.apache.org/ and https://twitter.com/ApacheMahout

© The Apache Software Foundation. "Apache", "Flink", "Apache Flink", "Mahout", "Apache Mahout", "Spark", "Apache Spark", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Friday April 28, 2017

The Apache News Round-up: week ending 28 April 2017

April is coming to a close with the following activities from the Apache community:

Support Apache –billions of people benefit from Apache software. We are grateful to those who support the ASF by making a donation, no matter the size. Every dollar counts. http://apache.org/foundation/contributing.html

ASF Board –management and oversight of the business and affairs of the corporation in accordance with the Foundation's bylaws.
 - Next Board Meeting: 17 May 2017. Board calendar and minutes http://apache.org/foundation/board/calendar.html

ASF Infrastructure –our distributed team on four continents keeps the ASF's infrastructure running around the clock.
 - 7M+ weekly checks yield lyte performance at 99.92% uptime http://status.apache.org/

ApacheCon™ –Tomorrow's Technology Today: Big Data, Cloud, Flex, IoT, Tomcat, and dozens of Apache projects across 200+ sessions, 150+ speakers, 4 subconferences, BarCampApache and more. http://apachecon.com/ Register today!
 - PREVIEWS exclusively on Feathercast https://feathercast.apache.org/
 - Support your favorite Apache projects and communities at ApacheCon by becoming an Apache Community and/or BarCamp Sponsor http://events.linuxfoundation.org/events/apachecon-north-america/sponsors/community-sponsor

Apache Attic™ –provides process and solutions to make it clear when an Apache project has reached its end of life.
 - Apache Wink retired http://mail-archives.apache.org/mod_mbox/www-announce/201704.mbox/%3CF9AFF103-D166-4E0A-BAA6-30341D780FB8%40apache.org%3E

Apache Apex™ –a unified platform for Big Data stream and batch processing.
 - Reflections on the One Year Anniversary of Apache Apex http://www.atrato.io/blog/2017/04/25/one-year-apex/

Apache cTAKES™ –Widely adopted Open Source biomedical data extraction, annotation, and clinical information management platform now faster and easier to use.
 - The Apache Software Foundation Announces Apache® cTAKES™ v4.0 https://s.apache.org/OJJw

Apache Fineract™ –Open Source FinTech system for core banking platform enables financial services for billions of unbanked and underbanked individuals worldwide.
 - The Apache Software Foundation Announces Apache® Fineract™ as a Top-Level Project https://s.apache.org/QvFR

Apache Groovy™ –a powerful, optionally typed and dynamic language, with static-typing and static compilation capabilities, for the Java platform aimed at improving developer productivity thanks to a concise, familiar and easy to learn syntax.
 - Apache Groovy 2.4.11 released http://www.groovy-lang.org/download.html

Apache MINA™ FtpServer –a network application framework which helps users develop high performance and high scalability network applications easily.
 - Apache FtpServer 1.1.1 released http://mina.apache.org/ftpserver/downloads.html

Apache Kafka™ –a distributed, fault-tolerant, publish-subscribe messaging system.
 - Apache Kafka 0.10.2.1 released https://www.apache.org/dyn/closer.cgi?path=/kafka/0.10.2.1/kafka-0.10.2.1-src.tgz

Apache Libcloud™ –a Python library that abstracts away the differences among multiple Cloud provider APIs.
 - Apache Libcloud 2.0.0 released http://libcloud.apache.org/downloads.html

Apache Lucene™ –a high-performance, full-featured text search engine library written entirely in Java.
 - Apache Lucene 6.5.1 released http://www.apache.org/dyn/closer.lua/lucene/java/6.5.1
 - Apache Solr 6.5.1 released http://www.apache.org/dyn/closer.lua/lucene/solr/6.5.1

Apache Metron™ –Open Source Cyber Security Data Analytics Platform used for rapid detection and response to threats at massive scale.
 - The Apache Software Foundation Announces Apache® Metron™ as a Top-Level Project https://s.apache.org/e4Uh

Apache Open Climate Workbench™ –a comprehensive suite of algorithms, libraries, and interfaces designed to standardize and streamline the process of interacting with large quantities of observational data and conducting regional climate model evaluations.
 - Apache Open Climate Workbench 1.2.0 released http://climate.apache.org/downloads.html

Apache PredictionIO (incubating) –an Open Source Machine Learning Server built on top of a state-of-the-art Open Source stack that enables developers to manage and deploy production-ready predictive services for various kinds of machine learning tasks.
 - Apache PredictionIO 0.11.0-incubating released https://dist.apache.org/repos/dist/release/incubator/predictionio/0.11.0-incubating/

Apache Qpid™ –newer JMS client supporting the Advanced Message Queuing Protocol 1.0 (AMQP 1.0, ISO/IEC 19464, http://www.amqp.org), based around the Apache Qpid Proton protocol engine and implementing the AMQP JMS Mapping as it evolves at OASIS.
 - Apache Qpid JMS 0.22.0 released http://qpid.apache.org/download.html


Did You Know?

 - Did you know that shopping24.de uses Apache Solr for eCommerce search? http://solr.apache.org/

 - Did you know that Capital One uses Apache JMeter, Apache Kafka, Apache Metron, and Apache NiFi in its security intelligence framework? http://jmeter.apache.org/ http://kafka.apache.org/ http://metron.apache.org/ http://nifi.apache.org/

 - Did you know that Catalyst uses Apache JMeter as a load testing tool? http://jmeter.apache.org/


Apache Community Notices:

 - "Success at Apache" is a new monthly blog series that focuses on the processes behind why the ASF "just works". 1) Project Independence https://s.apache.org/CE0V 2) All Carrot and No Stick https://s.apache.org/ykoG 3) Asynchronous Decision Making https://s.apache.org/PMvk 4) Rule of the Makers https://s.apache.org/yFgQ 5) JFDI --the unconditional love of contributors https://s.apache.org/4pjM

 - Introducing the new Apache Community Newsletter https://blogs.apache.org/comdev/entry/community-development-news-march-2017 , Facebook page https://www.facebook.com/ApacheSoftwareFoundation/ , and Twitter account https://twitter.com/ApacheCommunity . Do friend and follow us.

 - ASF Operations Summary - Q3 FY2017 https://s.apache.org/NKFz

 - The list of Apache project-related MeetUps can be found at http://apache.org/events/meetups.html

 - Find out how you can participate with Apache community/projects/activities --opportunities open with Apache HTTP Server, Avro, ComDev (community development), Directory, Incubator, OODT, POI, Polygene, Syncope, Tika, Trafodion, and more! https://helpwanted.apache.org/

 - ApacheCon North America + Apache: BigData, CloudStack Collaboration Conference, FlexJS Summit, Apache: IoT, and TomcatCon will be held 16-18 May 2017 in Miami http://apachecon.com/

 - Are your software solutions Powered by Apache? Download & use our "Powered By" logos http://www.apache.org/foundation/press/kit/#poweredby

= = =

For real-time updates, sign up for Apache-related news by sending mail to announce-subscribe@apache.org and follow @TheASF on Twitter. For a broader spectrum from the Apache community, https://twitter.com/PlanetApache provides an aggregate of Project activities as well as the personal blogs and tweets of select ASF Committers.

# # #

Tuesday April 25, 2017

The Apache Software Foundation Announces Apache® cTAKES™ v4.0

Widely adopted Open Source biomedical data extraction, annotation, and clinical information management platform now faster and easier to use.

Forest Hill, MD —25 April 2017— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today the availability of Apache® cTAKES™ v4.0, the latest version of the Open Source natural language processing system for information extraction from health-related free-text.

Apache cTAKES (clinical Text Analysis Knowledge Extraction System) is a natural-language processing based information extraction platform for health-related text that identifies signals important for the biomedical domain including types of clinical named entities mapped to various biomedical terminologies/ontologies such as the Unified Medical Language System (UMLS) -- drugs, diseases/disorders, signs/symptoms, anatomical sites and procedures along with their associated attributes such as negation, uncertainty, and more.

"Apache cTAKES has helped considerably advance biomedical data extraction and clinical information management over the last several years," said Pei Chen, Vice President of Apache cTAKES. "We are proud to lead the development of a widely adopted, interoperable, community-driven solution for clinical decision support systems and clinical research. The improvements in v4 make cTAKES easier to use, thereby benefiting the greater medical community."

cTAKES was created in 2006 by a team of physicians, computer scientists, and software engineers at Mayo Clinic, and was submitted to the Apache Incubator in June 2012. cTAKES was built using the Apache UIMA (Unstructured Information Management Architecture) framework and the Apache OpenNLP machine-learning based toolkit for the processing of health-related natural language text. Apache cTAKES components create rich linguistic and semantic annotations that have been utilized for a variety of biomedical use cases including clinical decision support systems and clinical research.

Highlights of Apache cTAKES v4 include:
  • Dictionary Builder Graphical user interface (GUI) for easy dictionary selection and build-up;
  • Pipe Bits, implemented as Java annotations, which describe cTAKES modules for programs that help users create pipelines: they document each component's description, inputs, outputs, parameters, and dependencies, simplify pipeline builders, and indicate whether a component is a Collection Reader, Annotator, or CAS Consumer (Writer);
  • Piper files, allowing fast and easy creation and modification of custom pipelines with many capabilities;
  • Graphical user interface (GUI) for easy pipeline creation, used to select cTAKES components and view their descriptions, inputs, outputs, parameters, and dependencies, implemented using the new Pipe Bits;
  • Example Clinical Documents with manual expert annotations of clinical narratives (mock ups). The narratives were annotated using the Open Source Anafora annotation tool (https://github.com/weitechen/anafora);
  • Temporal module for extraction of events, time expressions, and temporal relations; and
  • Numerous bug fixes that resulted in a more stable, much faster, and more robust release.

"Apache cTAKES v4 release is a pivotal milestone that incorporates state-of-the-art methods for some of the most difficult tasks in clinical narrative processing and information extraction, namely coreference resolution and temporality. Integrating novel user-friendly interfaces and a scaled-up optimization of its core concept mapper, v4 provides the open-source and medical communities a stable, industrial-strength tool to mine clinical text," said Prof. Guergana Savova, ASF Member and Apache cTAKES Project Management Committee member, and Principal Investigator of the Natural Language Processing Lab at the Computational Health Informatics Program, Boston Children’s Hospital and faculty at Harvard Medical School. "The world-wide community involvement is exactly what we envisioned when we started cTAKES back in 2006. We are grateful to the community for its many contributions and are greatly appreciative of the efforts of Sean Finan and James Masanz, members of the Apache cTAKES Project Management Committee, for leading this milestone release."

"We are using Apache cTAKES v4 to link phenotypic and genomic/genetic data for the Boston Children’s Hospital Precision Link Biobank," said Kenneth D. Mandl, Director of the Computational Health Informatics Program at Boston Children’s Hospital.

"We are using cTAKES to help identify people with multiple sclerosis from the electronic health records and investigate disease trajectory and treatment response in this chronic neurological disorder," said Zongqi Xia, MD, PhD, an Assistant Professor of Neurology and Biomedical Informatics at University of Pittsburgh.

"We have been using cTAKES in the VA Radiology Reports to look for word tokens that correlate with lung, liver and other findings," said Dr. Joe Erdos, faculty at Yale School of Medicine and associated scientist at the Veterans Affairs (VA) in Connecticut.

"We have been frequent users of cTAKES since the 3.x days, and are excited by the cTAKES release," said Chris Mattmann, Principal Data Scientist in the Engineering & Science Directorate at NASA Jet Propulsion Laboratory, and member of the Apache cTAKES Project Management Committee. "Our Shangridocs tool that allows for interactive text extraction and analysis from science research papers in the bioinformatics/clinical domain is built around Apache cTAKES and Apache OpenNLP. We plan on upgrading ASAP to cTAKES 4.0 and contributing to the platform. We are very interested in cTAKES scalability and in the ability to extend the existing UMLS taxonomy with custom medical metadata and information, and cTAKES 4.0 (and beyond) is the perfect platform for growth in this area."

Availability and Oversight
Apache cTAKES software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache cTAKES, visit http://ctakes.apache.org/

© The Apache Software Foundation. "Apache", "cTAKES", "Apache cTAKES", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Monday April 24, 2017

The Apache Software Foundation Announces Apache® Fineract™ as a Top-Level Project

Open Source FinTech system for core banking platform enables financial services for billions of unbanked and underbanked individuals worldwide.

Forest Hill, MD –24 April 2017– The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Apache® Fineract™ has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles.

The first Apache project developed from within the financial services technology sector (known as "FinTech"), Apache Fineract is an Open Source system for core banking as a platform. Fineract provides a reliable, robust, and affordable solution for entrepreneurs, financial institutions, and service providers to offer financial services to the world’s 2 billion underbanked and unbanked.

"Core banking software is one of the oldest applications of digital electronic computing. Apache Fineract's graduation as a Top-Level Project is an important next step for Open Source into applications that affect us all," said Myrle Krantz, Vice President of Apache Fineract.

Fineract was created at the Mifos (Micro Finance Open Source) Initiative (a spin-off of the Grameen Foundation, launched by Dr. Muhammad Yunus, the father of microfinance), which began to develop software in 2004 to meet the demands of microfinance institutions (MFIs). In December 2015, the Mifos Initiative submitted Fineract to the Apache Incubator.

"In many ways Fineract broke new grounds as an Incubating project at the ASF," said Roman Shaposhnik, Apache Fineract Incubator Mentor, Director of Open Source Strategy at Pivotal and Vice President Technology for ODPi at Linux Foundation. "It was the first project that was originally developed by a non-profit: Mifos Initiative. It was the first project with an extremely important social agenda in mind: speed up the elimination of poverty. It was the first project that fully embraced the next generation microservices architecture. But there's one thing that gets me even more excited: how quickly the Fineract community embraced 'The Apache Way' of project governance. They truly made my job as a mentor smooth sailing and I wish them to grow by leaps and bounds now that they are a TLP at the ASF."

Apache Fineract can be deployed in any environment, whether Cloud or on-premise, on- or offline, on mobile or PC. Fineract is extensible enough to support any organizational type or delivery channel, and flexible enough to support any product, service, or methodology. For any organization, large or small, Apache Fineract provides the client data management, loan and savings portfolio management, integrated real-time accounting, and social and financial reporting needed to deliver digital financial services in a modern connected world.

"As the steward of the Open Source Mifos community since its inception at the Grameen Foundation in 2006, I'm excited for the next phase of growth and development for our ecosystem as it becomes a part of the Apache community," said Edward Cable, President/CEO, Mifos Initiative. "Organizations like Musoni Services and Conflux Technologies have blazed a trail in showing others how to build sustainable and impactful financial inclusion solutions on top of the Apache Fineract platform. I look forward to cultivating and catalyzing new innovations for financial inclusion amongst the developers across The Apache Software Foundation."

An array of organizations use Apache Fineract for rapid product development, support for multiple lending methodologies, full range of deposit products, and flexible calculation of interest, penalties, and fees.

"In my role as CTO of Musoni I have been working in the Fineract community since the very early days, and have seen the community and platform grow to the mature state it is in today," said Sander van der Heyden, CTO, Musoni. "We are using Fineract as one of the core components for our cloud-based core-banking platform, which is currently in use by over 90 MFI's across Africa. Together with the community our team has collaborated on a large number of the features that are in the platform today. We would not have been able to deliver these without the advantages of working together with others in the community, sharing experience, market knowledge and development capacity. The Fineract community is in a truly unique position to have access to hands-on knowledge of financial markets and products across almost all developing markets worldwide and to combine these into the platform. Now that the project is graduating we are hoping to see even more activity and growth in the community, especially looking forward to new versions more suitable for modern cloud based architectures, which are already in development."

"We are the innovation lab for Gentera in which we are devoted to create new business models for the base of the pyramid, through internal entrepreneurs working in specific projects and supporting and collaborating with startups focusing on this market. We have used Fineract because it allowed us to generate loans through its API in a very flexible way, which enables us to offer a completely digitized offer for our group loans clients," said Eduardo Lincona, Platforms Subdirector at Fiinlab. "It has contributed to our success because on our current core banking we were not able to create loans through an interface, therefore it was impossible for us to offer a 100% digitized experience for them. This functionality allowed us to test whether it is feasible to offer a digital loan to our customers. The community is very wide and has many on line materials, in addition Aleks, who has been our consultant, has done a tremendous work allowing us to react very quickly to the demands of our customers."

"Apache Fineract's breadth of functionality and ease of extensibility has enabled us to leverage the same for building 'Finflux', a next generation core-banking system serving multiple sectors of the financial services industry including digital banks, fin-tech lenders, microfinance institutions, co-operatives, credit unions, self-help groups and business correspondents," said Vishwas Babu, CTO of Conflux Technologies. "Engineers at Conflux have been deeply involved with Apache Fineract since its inception, and we look forward to continue contributing our mite towards the project's success."

"For many years, technology had been a barrier to our rapid expansion plans," said Samit Shetty, Director of Chaitanya India Fin Credit Pvt. Ltd. "With the help of Conflux Technologies and their solutions which are powered by Apache Fineract, technology is now our primary business accelerator."

"With the help of Conflux Technologies, we moved from a manual system to a core banking solution powered by Apache Fineract to consolidate varied groups to follow our standardized procedures and reduce reporting delays and efforts by 40%," said Ruhiu (Gathogo) John, ICT Officer at Caritas Nairobi.

"To fully participate in the digital world, all people need access to banking services and electronic money," said Paul Maritz, an industry executive and investor, and a long-time supporter of Mifos.org and Fineract. "Open Source is playing an increasingly important role in today’s software world, and Fineract is an important foundation to enable access by all to the digital world."

"The Apache Fineract community is proud to provide code which helps institutions extend banking services to the poorest of the poor, thus helping them to lift themselves out of poverty," added Krantz. "We welcome contributors who want to support the Fineract technology, from the Fineract code itself, to documentation, and best practices information."

Availability and Oversight
Apache Fineract software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Fineract, visit http://fineract.apache.org/ and https://twitter.com/ApacheFineract

About the Apache Incubator
The Apache Incubator is the entry path for projects and codebases wishing to become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects wishing to join the ASF enter through the Incubator to: 1) ensure all donations are in accordance with the ASF legal standards; and 2) develop new communities that adhere to our guiding principles. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. For more information, visit http://incubator.apache.org/

© The Apache Software Foundation. "Apache", "Fineract", "Apache Fineract", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

The Apache Software Foundation Announces Apache® Metron™ as a Top-Level Project

Open Source Cyber Security Data Analytics Platform used for rapid detection and response to threats at massive scale.

Forest Hill, MD –24 April 2017– The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today that Apache® Metron™ has graduated from the Apache Incubator to become a Top-Level Project (TLP), signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles.

Apache Metron provides a scalable, advanced security analytics framework to offer a centralized tool for security monitoring and analysis. Metron’s extensible platform integrates a number of elements from the Apache Hadoop ecosystem to effectively enable rapid detection and rapid response using both traditional rules and machine learning for advanced security threats.

"It is abundantly clear that cybersecurity challenges are becoming a bigger part of our reality," said Casey Stella, Vice President of Apache Metron. "Solving them effectively and at scale requires an Open Source, community-oriented approach built upon proven scalable technologies. This is what Metron is about at its core."

Metron is a unified platform for aggregating and enriching a wide variety of security-related data. Built atop Apache Storm, Apache HBase and Apache Kafka, Metron can ingest, transform and normalize any source of telemetry at scale, including full network packet capture. Data ingested into Metron can be enriched with valuable context such as geographic location or asset identifiers as it streams by. New enrichments can be specified with no downtime through user-defined functions and a robust scripting language. Security threats can be specified and triaged using either rules or machine learning models so that only the greatest threats are prioritized for threat response and investigation.

Highlights include:
  • Mechanism to capture, store, and normalize any type of security telemetry at extremely high rates;
  • Real-time processing and application of enrichments;
  • Efficient information storage;
  • Interface that provides a centralized view of data and alerts passed through the system; and
  • Use of statistical summary data structures (e.g. sketches) to perform security analytics even on the largest of data sets.

Apache Metron leverages Big Data and machine learning to enable users to rapidly detect and respond to cyber security threats, whether in application-specific environments such as an email service provider, or across the Internet of Things (IoT). Australia’s largest telecommunications, media, and Internet Service Provider, Telstra, uses Apache Metron to power enterprise-grade security operations centers (SOCs) in key service hubs.

"Going through the Apache incubation process really illuminated how valuable and important it was to build vibrant and inclusive communities around code. Having infrastructure support from the ASF and active mentors to shepherd us through the hurdles made all the difference in the world," added Stella. "The core ideals of openness, community, and transparency are prerequisites for solving cybersecurity challenges. Metron was a great fit in Apache because the ASF shares those core ideals. It really does take a village to solve the really hard problems."

Metron originated at Cisco in 2014 as OpenSOC. The project was submitted to the Apache Incubator in December 2015, and made its first release as Apache Metron in April 2016.

Catch Apache Metron in action at the DataWorks Summit, 13-15 June 2017 in San Jose.

Availability and Oversight
Apache Metron software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the Project's day-to-day operations, including community development and product releases. For downloads, documentation, and ways to become involved with Apache Metron, visit http://metron.apache.org/ and https://twitter.com/ApacheMetron 

© The Apache Software Foundation. "Apache", "Hadoop", "Apache Hadoop", "Metron", "Apache Metron", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Friday April 21, 2017

The Apache News Round-up: week ending 21 April 2017

As we're cranking through April, the Apache community has been working on:

Support Apache –if Apache software has helped you, please consider making a donation, no matter the size. Every dollar counts. http://apache.org/foundation/contributing.html

ASF Board –management and oversight of the business and affairs of the corporation in accordance with the Foundation's bylaws.
 - Next Board Meeting: 17 May 2017. Board calendar and minutes http://apache.org/foundation/board/calendar.html

ASF Infrastructure –our distributed team on four continents keeps the ASF's infrastructure running around the clock.
 - 7M+ weekly checks yield swift performance at 99.95% uptime http://status.apache.org/

ApacheCon™ –Tomorrow's Technology Today. The latest in Big Data, Cloud, Flex, IoT, Tomcat, and dozens of other leading Apache projects across 200+ sessions, 150+ speakers, 4 subconferences, BarCampApache and more. http://apachecon.com/ Register today!
 - PREVIEWS on ApacheCon and Apache: Big Data, exclusively on Feathercast https://feathercast.apache.org/
 - Support your favorite Apache project and community at ApacheCon by becoming an Apache Community and/or BarCamp Sponsor http://events.linuxfoundation.org/events/apachecon-north-america/sponsors/community-sponsor

The Apache Software Foundation Recognizes Apache Innovations Integral to the Pulitzer Prize-winning Panama Papers Investigation
 - Apache Open Source library, search, and document management tools used in investigating the biggest leak in journalism history. https://s.apache.org/UkEw

Apache Commons™ JEXL –a library intended to facilitate the implementation of dynamic and scripting features in applications and frameworks written in Java.
 - Apache Commons JEXL 3.1 released https://commons.apache.org/proper/commons-jexl/download_jexl.cgi 

Apache Jackrabbit™ Oak –a scalable, high-performance hierarchical content repository designed for use as the foundation of modern world-class Web sites and other demanding content applications.
 - Apache Jackrabbit Oak 1.2.25 released http://jackrabbit.apache.org/downloads.html

Apache JMeter™ –used to test performance both on static and dynamic resources such as files, Servlets, Perl scripts, Java Objects, Data Bases and Queries, FTP Servers, and more.
 - Apache JMeter 3.2 released http://jmeter.apache.org/download_jmeter.cgi

Apache Kudu™ –an Open Source storage engine for structured data which supports low-latency random access together with efficient analytical access patterns.
 - Apache Kudu 1.3.1 released http://kudu.apache.org/releases/1.3.1/

Apache Log4j™ –provides logging services for Java.
 - CVE-2017-5645: Apache Log4j socket receiver deserialization vulnerability http://mail-archives.apache.org/mod_mbox/www-announce/201704.mbox/%3CCACmp6kodWRpk7U6cguk3_TyP%2BFcA11sBExvb83EntE51Secn6A%40mail.gmail.com%3E

Apache Mahout™ –scalable machine learning library.
 - Apache Mahout 0.13.0 released http://www.apache.org/dist/mahout/0.13.0/

Apache OpenWebBeans™ –an ALv2-licensed implementation of the "Contexts and Dependency Injection for the Java EE platform" specification which is defined as JSR-299.
 - Apache OpenWebBeans-1.7.3 released http://www.apache.org/dyn/closer.cgi/openwebbeans/1.7.3/

Apache POI™ –Open Source Java library and APIs for various file formats based on Microsoft Office.
 - Apache POI 3.16 released https://poi.apache.org/download.html

Apache Syncope™ –an Open Source system for managing digital identities in enterprise environments, implemented in Java EE technology.
 - Apache Syncope 2.0.3 released http://syncope.apache.org/downloads.html

Apache Tomcat™ –an Open Source software implementation of the Java Servlet, JavaServer Pages, Java Unified Expression Language, Java WebSocket and JASPIC technologies.
 - Apache Tomcat 8.5.14 released http://tomcat.apache.org/download-80.cgi
 - Apache Tomcat 9.0.0.M20 released http://tomcat.apache.org/download-90.cgi


Did You Know?

 - Did you know that Apache TinkerPop Gremlin is Turing Complete? http://tinkerpop.apache.org/

 - Did you know that BarCamp Apache will be held prior to ApacheCon, and, per usual, will be free of charge to participate? http://events.linuxfoundation.org/events/apachecon-north-america/extend-the-experience/barcamp

 - Did you know that Walmart Labs uses Apache Storm and Apache Kafka? http://storm.apache.org/ and http://kafka.apache.org/


Apache Community Notices:

 - "Success at Apache" is a new monthly blog series that focuses on the processes behind why the ASF "just works". 1) Project Independence https://s.apache.org/CE0V 2) All Carrot and No Stick https://s.apache.org/ykoG 3) Asynchronous Decision Making https://s.apache.org/PMvk 4) Rule of the Makers https://s.apache.org/yFgQ 5) JFDI --the unconditional love of contributors https://s.apache.org/4pjM

 - Introducing the new Apache Community Newsletter https://blogs.apache.org/comdev/entry/community-development-news-march-2017 , Facebook page https://www.facebook.com/ApacheSoftwareFoundation/ , and Twitter account https://twitter.com/ApacheCommunity . Do friend and follow us.

 - ASF Operations Summary - Q3 FY2017 https://s.apache.org/NKFz

 - The list of Apache project-related MeetUps can be found at http://apache.org/events/meetups.html

 - Find out how you can participate with Apache community/projects/activities --opportunities open with Apache HTTP Server, Avro, ComDev (community development), Directory, Incubator, OODT, POI, Polygene, Syncope, Tika, Trafodion, and more! https://helpwanted.apache.org/

 - ApacheCon North America + Apache: BigData, CloudStack Collaboration Conference, FlexJS Summit, Apache: IoT, and TomcatCon will be held 16-18 May 2017 in Miami http://apachecon.com/

 - Are your software solutions Powered by Apache? Download & use our "Powered By" logos http://www.apache.org/foundation/press/kit/#poweredby

= = =

For real-time updates, sign up for Apache-related news by sending mail to announce-subscribe@apache.org and follow @TheASF on Twitter. For a broader spectrum from the Apache community, https://twitter.com/PlanetApache provides an aggregate of Project activities as well as the personal blogs and tweets of select ASF Committers.

# # #

Monday April 17, 2017

The Apache® Software Foundation Recognizes Apache Innovations Integral to the Pulitzer Prize-winning Panama Papers Investigation

Apache Open Source library, search, and document management tools used in investigating the biggest leak in journalism history.

Forest Hill, MD —17 April 2017— The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today the role played by several Apache projects in the investigation of the Panama Papers.

At 2.6 terabytes of data, the Panama Papers is the largest leak of all time, comprising 11.5M financial and legal records sent from an anonymous source. The journalistic cooperation involved more than 400 journalists from 100 publications on six continents over the course of a year. The discovery exposed a complex system of criminal and corrupt activities hidden by offshore concerns. The investigation recently received a Pulitzer Prize in the Explanatory Reporting category.

"The Apache Software Foundation incorporated 18 years ago with the mission to create software for the public good," said ASF President Sam Ruby. "We are honored that Apache software played a critical role with the Panama Papers, and congratulate the International Consortium of Investigative Journalists and their media partners on this prestigious award."

The discovery, exchange, and management of information that involved 214,488 entities was made possible by:
  • Tika --toolkit that detects and extracts metadata and structured text content from various documents. Used for document processing.

  • Solr --enterprise search server, based on the Lucene Java search library, with advanced highlighting, faceted search, caching, and replication capabilities. Used for search and indexing.

  • PDFBox --Open Source Java library for working with PDF documents. Used for capturing text from PDF documents.

  • POI --Open Source Java library and APIs for various file formats based on Microsoft Office. Used to extract and manipulate Excel, Word, and PowerPoint files.

  • Commons --40+ projects for reusable Open Source Java components. Used to boost cross-platform development and productivity.

In addition to Apache software, a number of other Open Source projects were also integral to the investigation. These include Tesseract-ocr (whose optical character recognition engine was used for capturing text from images), Project Blacklight (used as a discovery interface), and Jackcess (used for reading and writing MS Access databases): three examples of the millions of software solutions distributed under the Apache License v2.0, which allows for their free use, modification, and sharing.

Apache Open Source Projects
Many of the ASF's 300+ projects serve as the backbone for some of the world's most visible and widely used applications in Artificial Intelligence and Deep Learning, Big Data, Build Management, Cloud Computing, Content Management, DevOps, IoT and Edge Computing, Mobile, Servers, and Web Frameworks, among other categories.

Programmers, solutions architects, individual users, educators, researchers, corporations, governments, and enthusiasts worldwide depend on Apache software for development tools, libraries, frameworks, visualizers, end-user productivity solutions, and more.

75% of Apache's 150M lines of code have been developed over 65,000 person years, and are valued at US$7B. The ASF serves approximately 9M source code downloads from Apache mirrors on a yearly basis, excluding convenience binaries. Worldwide dependency on Apache software continues to grow, with Web requests received from every Internet-connected country on the planet.

The Apache Incubator is home to 63 projects undergoing development, with emerging innovations in Big Data, communication protocols, connected devices, cryptography, data science/machine learning/analytics, development frameworks, microfinances, remote desktop access, serverless computing, and more.

All Apache products are available to the public-at-large completely free of charge. All software development and project leadership is done entirely by volunteers. As a not-for-profit charitable organization, the ASF is funded through tax-deductible contributions from corporations, foundations, and private individuals. Approximately 75% of the ASF's US$1.2M annual budget is dedicated to running critical infrastructure support services that keep Apache services running 24x7x365 at near 100% uptime on an annual budget of less than US$5,000 per project.

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than 350 leading Open Source projects, including Apache HTTP Server --the world's most popular Web server software. Through the ASF's meritocratic process known as "The Apache Way," more than 620 individual Members and 6,000 Committers successfully collaborate to develop freely available enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions are distributed under the Apache License; and the community actively participates in ASF mailing lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings, and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations and corporate sponsors including Alibaba Cloud Computing, ARM, Bloomberg, Budget Direct, Capital One, Cerner, Cloudera, Comcast, Confluent, Facebook, Google, Hortonworks, HP, Huawei, IBM, InMotion Hosting, iSigma, LeaseWeb, Microsoft, ODPi, PhoenixNAP, Pivotal, Private Internet Access, Produban, Red Hat, Serenata Flowers, WANdisco, and Yahoo. For more information, visit http://www.apache.org/ and https://twitter.com/TheASF

© The Apache Software Foundation. "Apache", "Apache Commons", "PDFBox", "Apache PDFBox", "POI", "Apache POI", "Solr", "Apache Solr", "Tika", "Apache Tika", and "ApacheCon" are registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. All other brands and trademarks are the property of their respective owners.

# # #

Friday April 14, 2017

The Apache News Round-up: week ending 14 April 2017

The Apache community salutes the ASF's 18 years of Open Source leadership with ongoing productivity:

Support Apache –Apache projects benefit billions of users worldwide for less than $5K/project/year. Please consider making a donation, no matter the size. Every dollar counts. http://apache.org/foundation/contributing.html

ASF Board –management and oversight of the business and affairs of the corporation in accordance with the Foundation's bylaws.
 - Next Board Meeting: 19 April 2017. Board calendar and minutes available at http://apache.org/foundation/board/calendar.html

Apache Community Development –helps guide newcomers towards becoming a part of the Apache community.
 - Introducing the new Apache Community Newsletter https://blogs.apache.org/comdev/entry/community-development-news-march-2017 and Twitter account https://twitter.com/ApacheCommunity

ASF Infrastructure –our distributed team on four continents keeps the ASF's infrastructure running around the clock.
 - 7M+ weekly checks yield twinkling performance back in the "three nines" at 99.89% uptime http://status.apache.org/

ApacheCon™ –Tomorrow's Technology Today. We're looking forward to seeing you the week of 15 May in Miami.
 - Learn the latest in Big Data, Cloud, Flex, IoT, Tomcat, and dozens of other leading Apache projects across 200+ sessions, 150+ speakers, 4 subconferences, BarCampApache and more. http://apachecon.com/ Register by 17 April and save $200!
 - PREVIEWS on ApacheCon and Apache: Big Data, exclusively on Feathercast –featuring Roman Shaposhnik on IoT; Ismaël Mejia & Etienne Chauchot on Apache Beam; Mark Thomas on Tomcat; and Shawn McKinney on Java security https://feathercast.apache.org/
 - Support your favorite Apache project and community at ApacheCon by becoming an Apache Community Sponsor http://events.linuxfoundation.org/events/apachecon-north-america/sponsors/community-sponsor

Apache Ignite™ –In-Memory Data Fabric providing in-memory data caching, partitioning, processing, and querying components.
 - [CVE-2016-6805] Arbitrary File Read due to eXternal Xml Entity attack in Apache Ignite http://mail-archives.apache.org/mod_mbox/www-announce/201704.mbox/%3CB39FC5C0-9AC5-4E84-A450-AFF690B74D9C%40apache.org%3E

Apache Libcloud™ –a Python library that abstracts away the differences among multiple Cloud provider APIs.
 - Apache Libcloud 2.0.0rc2 released http://libcloud.apache.org/downloads.html

Apache Log4j™ –provides logging services for Java.
 - Apache Log4j 2.8.2 released https://logging.apache.org/log4j/2.x/

Apache Tomcat™ –an Open Source software implementation of the Java Servlet, JavaServer Pages, Java Unified Expression Language, Java WebSocket and JASPIC technologies.
 - Apache Tomcat 6.0.53 released http://tomcat.apache.org/download-60.cgi
 - [SECURITY] CVE-2017-5651 Apache Tomcat Information Disclosure http://mail-archives.apache.org/mod_mbox/www-announce/201704.mbox/%3C63a584ba-4db7-85d3-0206-c1164b9d26c6%40apache.org%3E
 - [SECURITY] CVE-2017-5650 Apache Tomcat Denial of Service http://mail-archives.apache.org/mod_mbox/www-announce/201704.mbox/%3C6d8077ef-1bcb-d07b-0bd0-f70ab0043faf%40apache.org%3E
 - [SECURITY] CVE-2017-5648 Apache Tomcat Information Disclosure http://mail-archives.apache.org/mod_mbox/www-announce/201704.mbox/%3C8a78e8fe-616e-1959-3c0e-26704fc72766%40apache.org%3E

Apache Twill™ –an abstraction over Apache Hadoop® YARN that reduces the complexity of developing distributed applications, allowing developers to focus instead on their application logic.
 - Apache Twill 0.11.0 released http://www.apache.org/dyn/closer.cgi/twill/0.11.0/src


Did You Know?

 - Did you know that, over the past week, 518 Apache Committers changed 917,176 lines of code over 2,740 commits? http://status.apache.org/#commits

 - Did you know that Apache innovation will be highlighted at ApacheCon and Apache: Big Data with presentations on numerous podlings, including Edgent, Gossip, Hivemall, iota, Mynewt, OpenWhisk, RocketMQ + more? http://incubator.apache.org/

 - Did you know that HBaseCon ASIA will be held on 4 August in Shenzhen, China? http://hbase.apache.org/


Apache Community Notices:

 - "Success at Apache" is a new monthly blog series that focuses on the processes behind why the ASF "just works". 1) Project Independence https://s.apache.org/CE0V 2) All Carrot and No Stick https://s.apache.org/ykoG 3) Asynchronous Decision Making https://s.apache.org/PMvk 4) Rule of the Makers https://s.apache.org/yFgQ 5) JFDI --the unconditional love of contributors https://s.apache.org/4pjM

 - We have been chosen as a Google Summer of Code (GSoC) Mentoring Organization for the 12th consecutive year --Apache Committer Mentors wanted! https://summerofcode.withgoogle.com/organizations/5416945173135360/

 - Feedback from The Apache Software Foundation on the Free and Open Source Security Audit (FOSSA) https://s.apache.org/romf

 - ASF Operations Summary - Q3 FY2017 https://s.apache.org/NKFz

 - The new Apache Community Facebook page is https://www.facebook.com/ApacheSoftwareFoundation/ Do friend and follow us.

 - The list of Apache project-related MeetUps can be found at http://apache.org/events/meetups.html

 - Find out how you can participate with Apache community/projects/activities --opportunities open with Apache HTTP Server, Avro, ComDev (community development), Directory, Incubator, OODT, POI, Polygene, Syncope, Tika, Trafodion, and more! https://helpwanted.apache.org/

 - ApacheCon North America + Apache: Big Data, CloudStack Collaboration Conference, FlexJS Summit, Apache: IoT, and TomcatCon will be held 16-18 May 2017 in Miami http://apachecon.com/

 - Are your software solutions Powered by Apache? Download & use our "Powered By" logos http://www.apache.org/foundation/press/kit/#poweredby

= = =

For real-time updates, sign up for Apache-related news by sending mail to announce-subscribe@apache.org and follow @TheASF on Twitter. For a broader spectrum from the Apache community, https://twitter.com/PlanetApache provides an aggregate of Project activities as well as the personal blogs and tweets of select ASF Committers.

# # #
