Tuesday Mar 13, 2012

Where did the time go? (A look at the Apache OpenOffice (incubating) timeline)

The above timeline shows just some of the accomplishments of the Apache OpenOffice project since we first started incubation at Apache last June.  I've arbitrarily categorized the items as Infrastructure, Community or Development, knowing full well that any such act of categorization is dubious at best.  There is a lot of overlap, and something I put under Infrastructure could arguably fall under other categories as well.  In the end, it is one project, with many aspects, and the pieces work together.

So what does this timeline tell us, other than the obvious fact that we've been busy?

Take a look at the box called "Removal of copyleft".  This is the work we did to get the OpenOffice.org code to conform to Apache policy regarding licensing.  In essence, Apache products are permissively licensed, so anyone is free to use them in open source or proprietary products.  To ensure that downstream consumers of Apache OpenOffice have maximum flexibility in that regard, and to encourage a broader ecosystem, we removed components that were incompatible with these goals.  In most cases we replaced the copyleft modules with equivalent or superior libraries that were also permissively licensed.  That effort took a couple of months.  This is important to know, since we sometimes hear, or read on the web, the statement that the Apache OpenOffice project has spent an inordinate amount of time removing or replacing copyleft components in OpenOffice.  But as the timeline above shows, this one-time cleanup effort actually took only a little time.  And it was time well spent.

As the timeline shows, most of our attention on the project has been spent on community building and infrastructure migration efforts. We're not engaging in a race to see how fast we can come out with a release, or to show how quickly we can crank out minor releases.  A huge portion of our effort has been to ensure continuity for the many millions of users of OpenOffice.org, by far the most popular open source productivity suite.  The OpenOffice ecosystem is not just a download site (though it certainly includes a download site that continues to see nearly 10 million downloads/month).  The ecosystem includes mailing lists, support forums, wikis, bug databases, documentation, extensions and templates repositories, etc.  These public-facing and user-facing services are critical to the entire ecosystem, not only to Apache OpenOffice.  To give a sense of the magnitude of this interdependence, the libreoffice.org domain contains 13,281 links to webpages hosted on openoffice.org domains.

Oracle has kindly allowed us the use of some legacy servers during the migration to Apache.  The last of these servers should be disconnected on or soon after March 16th.  At that point the Apache OpenOffice infrastructure will be entirely hosted by Apache, aside from the extensions and templates repositories, which are graciously hosted by SourceForge.  With that, our migration will be complete.  A round of thanks is due, from all sides, for the efforts of the Apache Infrastructure team and Apache OpenOffice volunteers who worked tirelessly to ensure that the OpenOffice web presence was preserved and can continue to be a valuable resource for OpenOffice users, as well as other projects based on the same codebase.


I am curious as to where your statement and its precision: "the libreoffice.org domain contains 13,281 links to webpages hosted on openoffice.org domains." comes from. A quick Google query for: "site:libreoffice.org link:openoffice.org" shows under 5000 hits, many of them bugs, archived mailing-list postings and so on. It also appears that Google link: searches are not quite as precise as might be hoped, e.g. apparently our homepage, donations page and more link (though in fact they just give fair credit). Anyhow - no doubt I'm doing something wrong in my queries - please educate me.

Posted by Michael Meeks on March 14, 2012 at 10:02 AM UTC #

Great account of what Apache has been up to. Do you know if Oracle plans to create builds of Apache OpenOffice for Oracle Solaris OS? Would make sense given its non-GPL license. (I know Apache OO for FreeBSD has already been ported). Best regards, FC

Posted by Fernando on March 14, 2012 at 12:21 PM UTC #

@Michael, the link numbers are from the Google Webmaster Tools. I would not expect you'll get the same numbers from a search engine query, which after all is optimized for relevancy, not for analytics. Google has a tech note you might want to glance at that explains some of the other factors that cause their numbers to vary depending on what Google service you use: https://support.google.com/webmasters/bin/answer.py?hl=en&answer=1213138 My own experience with the Google site/link type search queries is that they are almost always noisy and incomplete. Yahoo used to have a better "Backlinks" search feature, but I haven't checked recently.

Posted by Rob Weir on March 14, 2012 at 08:03 PM UTC #

@Fernando, I have not heard anything about restarting a Solaris port. We have the BSD port, as you know. We also have someone in the project making great progress with an OS/2 port. Of course, if anyone wants to join and work on a Solaris port, they would be very welcome!

Posted by Rob Weir on March 14, 2012 at 08:03 PM UTC #

@ Rob Weir, first thanks for the post and the great graphics. I have not heard a new port on solaris. I have the solaris port connected to the project and hope that it will further voluntary Nich!

Posted by Erwin West on March 15, 2012 at 06:46 AM UTC #

New UI?

Posted by Ivanisky on March 15, 2012 at 12:23 PM UTC #

Any details how the integration of Symphony with OO.org progresses?

Posted by Sebastian on March 16, 2012 at 04:23 AM UTC #

On our snapshot page (https://cwiki.apache.org/confluence/display/OOOUSERS/AOO+3.4+Unofficial+Developer+Snapshots) you can also find developer snapshots for Solaris Intel provided by project members. There is work ongoing!

Posted by Juergen Schmidt on March 16, 2012 at 09:11 AM UTC #

Could you please expand upon what exactly "Removal of Copyleft" means? I mean, while I don't really see this as something to be proud of, I do at the same time realize why you had to do it. Given that, though, how exactly and what exactly did you remove? OpenOffice is dependent on A LOT of GPLed software. Surely you didn't attempt to rewrite hunspell, for example, did you?

Posted by Steven Oliver on March 16, 2012 at 03:01 PM UTC #

@Sebastian, good idea. I'll plan on writing a future blog post to update on the Symphony work.

Posted by Rob Weir on March 16, 2012 at 03:41 PM UTC #

@Steve, 3rd paragraph says, "Take a look at the box called 'Removal of copyleft'. This is the work we did to get the OpenOffice.org code to conform to Apache policy regarding licensing". So it is not just removal. It is removal in some cases, replacement in others, segregation in other cases. In many cases it also involves aggregating the licenses and required notices. Apache projects have a uniform way of doing this, so that downstream consumers of Apache products know what they are getting and are clear what their obligations are. So when we do this work, we are doing it for the benefit of downstream consumers of the code, to make it easier for them to work with, at least from the IP perspective. You might find it interesting to read this page for a general overview: http://www.apache.org/legal/resolved.html

Posted by Rob Weir on March 16, 2012 at 03:48 PM UTC #

It's an interesting approach to try to determine interdependence using cross-linking. I did a bit of digging around this method and my conclusions are here: http://people.gnome.org/~michael/blog/2012-03-14.html From 10k feet though, I'm deeply skeptical of any substantial interdependence in either direction really.

Posted by Michael Meeks on March 16, 2012 at 08:16 PM UTC #

