Archive for the ‘Widgets’ Category

Zip files and Encoding – I hate you.

December 8th, 2008

I’ve written about some of the issues with depending on zip as a packaging format in the past. As people know, Web Apps is depending on Zip as the packaging format for Widgets.

Zip the good

Zip has a lot going for it. It is ubiquitous and dependable… so long as you don’t want to share files across cultures.

Zip the bad

The Zip spec does not seem to know that there are normalization forms for Unicode, when there are actually four (or more, because there are some non-standard ones too!). The Zip spec gives no guidance as to how file names inside zip files are to be normalized.

Consider: when a zip file is created on Linux, it just writes the bytes for the file name in the encoding of the underlying file system. So, if the file system is in ISO-8859-1, the bytes are written in ISO-8859-1. This may seem ok, but when you decompress the zip file on Windows, which uses the Windows-1252 encoding, the file names get all mangled. If the underlying encoding of the file system on Linux is something else, you won’t be able to share files with other systems at all. So in this case, it is not Windows’ fault.

The Zip spec says that the only supported encodings are CP437 and UTF-8, but everyone has ignored that. Implementers just encode file names however they want (usually byte for byte as they are in the OS… see table below).

It gets worse! Because Mac OS runs on some weird non-standard decomposed Unicode mode, you can only share zip files with other Mac OS users. According to this email, the LimeWire guys also ran into a similar problem with regards to encodings in Mac OS:

“for example a French, German or Spanish Windows user cannot exchange files that contain [file names with] French, German or Spanish accents with a French, German or Spanish Macintosh users”

The following table illustrates the problem:

Bytes that represent ñ in a Zip file (in hex)
File name | Zip in Windows | Zip in Linux | Zip in Mac OS
ñ | A4 (Extended US-ASCII/CP437) | C3 B1 (UTF-8 NFC) | 6E CC 83 (UTF-8 NFD)

Yes! Holy crap! Three different byte sequences, corresponding to different character encodings.
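The three byte sequences can be reproduced with a few lines of Python. This is only a sketch to double-check the bytes: it encodes the same character in the three ways listed in the table. It says nothing about what any particular zip tool actually writes.

```python
import unicodedata

name = "\u00f1"  # ñ as a single precomposed code point

# The three representations from the table.
encodings = {
    "CP437": name.encode("cp437"),
    "UTF-8 NFC": unicodedata.normalize("NFC", name).encode("utf-8"),
    "UTF-8 NFD": unicodedata.normalize("NFD", name).encode("utf-8"),
}

for label, raw in encodings.items():
    # Print each byte as two uppercase hex digits.
    print(label, " ".join(f"{b:02X}" for b in raw))
```

The NFD form is the plain letter n (6E) followed by a combining tilde (CC 83), which is why a byte-for-byte comparison of file names fails even though the text looks identical.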

The only way around this would be a *special* custom-built widget zipping tool that normalizes file name strings to NFC. If the widget engine needs to decompress the widget to disk, then it would take the NFC file names and convert them to the operating system’s native encoding (or store the files in memory, and reference them that way). This affects the URI scheme and DOM normalization of Widgets, so Web Apps will have to deal with it eventually… but I’m not sure exactly how.
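As a rough idea of what such a zipping tool could do, here is a minimal sketch using Python's standard zipfile module. The function names are made up for illustration. It assumes the file names in the source archive were decoded correctly in the first place, which zipfile only does reliably when the archive flags its names as UTF-8 (otherwise it falls back to CP437).

```python
import io
import unicodedata
import zipfile


def repack_with_nfc_names(src_bytes: bytes) -> bytes:
    """Copy every entry into a new zip, storing each file name in NFC."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(src_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            nfc_name = unicodedata.normalize("NFC", info.filename)
            dst.writestr(nfc_name, src.read(info.filename))
    return out.getvalue()


def make_demo_zip() -> bytes:
    """Build a tiny archive whose file name is in decomposed (NFD) form."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("man\u0303ana.txt", b"hola")
    return buf.getvalue()


fixed = repack_with_nfc_names(make_demo_zip())
with zipfile.ZipFile(io.BytesIO(fixed)) as zf:
    names = zf.namelist()
    content = zf.read(names[0])
print(names)
```

A real tool would also have to guess the original encoding of unflagged names, which is exactly the part the Zip spec gives no help with.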

W3C, Widgets

Widget spec is now Widget Specs

March 7th, 2008

In an effort to expedite the standardization of widgets, the Web Application Formats Working Group yesterday decided to split the Widgets 1.0 Specification into three (or more) specs:

Other specs may also follow, particularly:

Other documents are still under development too:

We are aiming to have all these done (i.e., at Last Call) by October. However, now that the document split has happened, I should be able to get the packaging format done fairly quickly.

We have more or less now settled on the configuration language format. The elements are going to be:

  • <widget width="" height="" id="">
    • <title>: the title/name of a widget
    • <description>: a description
    • <author email="" url="">: some details about the author
    • <license>: paste your GPL here! :)
    • <icon src="">: the icon
    • <access network="true|false" plugins="true|false">: if your widget needs to get online
    • <content src="">: some file in the widget archive

Only <widget> and <content> are mandatory at this point.

The processing model for the XML is going to be quite forgiving. The only thing that will cause an error is not having a well-formed document. For example, the following would result in “The Awesome Super Dude Widget” as the title:

<widget xmlns="http://www.w3.org/ns/widgets">
   <title>
     The <blink>Awesome</blink> 
     <author email="dude@example.com">Super Dude</author> Widget</title>
</widget>

The unrecognized elements are simply ignored, but their text content is extracted. This makes processing more forgiving and allows for extensibility and some graceful degradation. I also want to push that the widget should function if the namespace is omitted.
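To make the idea concrete, here is a small sketch of that kind of forgiving title extraction in Python. It is not the processing model from the spec (which is not finalized); the helper names are hypothetical, and matching elements by local name, so that a missing namespace still works, is my own assumption about how the "namespace omitted" case could be handled.

```python
import xml.etree.ElementTree as ET

CONFIG = """<widget xmlns="http://www.w3.org/ns/widgets">
   <title>
     The <blink>Awesome</blink>
     <author email="dude@example.com">Super Dude</author> Widget</title>
</widget>"""


def local_name(tag: str) -> str:
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def widget_title(xml_text: str) -> str:
    # A document that is not well-formed raises ET.ParseError here:
    # the only hard error in this sketch.
    root = ET.fromstring(xml_text)
    for child in root:
        if local_name(child.tag) == "title":
            # Ignore unknown child elements but keep their text content,
            # then collapse runs of whitespace into single spaces.
            text = "".join(child.itertext())
            return " ".join(text.split())
    return ""


print(widget_title(CONFIG))
```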

We are also currently investigating how we are going to deal with internationalization in the configuration document format. We are looking at following ideas from the Best Practices for XML Internationalization.

PhD, W3C, WAF-WG, Widgets

WAF and WebAPI are dead. Long Live WebApps Working Group!

December 19th, 2007

The charters of both the W3C Web Application Formats and WebAPI Working Groups have now expired (as of the 15th of November, 2007), meaning they are effectively dead (although still twitching!). From their ashes will rise a new merged working group called the Web Applications Working Group… hopefully by the 31st of January.

According to the new proposed charter, the mission of the new working group:

…is to provide specifications that enable improved client-side application development on the Web, including specifications both for application programming interfaces (APIs) for client-side development and for markup vocabularies for describing and controlling client-side application behavior.

The new Web Applications Working Group is chartered with the continued development of the following specifications:

Specification | FPWD | LC | CR | PR | Rec
ClipOps spec | 2007-Q2 | 2008-Q4 | 2009-Q2 | 2009-Q4 | 2010
DOM 3 Core bis spec |  |  |  |  |
DOM 3 Events spec | 2007-Q2 | 2008-Q2 | 2008-Q4 | 2009-Q4 | 2010
Element Traversal spec | 2007-Q2 | 2007-Q4 | 2008-Q2 | 2008-Q4 | 2008
Access Control spec | 2006-Q2 | 2008-Q1 | 2008-Q3 | 2009-Q4 | 2010
File Upload spec | 2007-Q2 | 2008-Q2 | 2008-Q4 | 2009-Q4 | 2010
Language Bindings spec | 2007-Q2 | 2008-Q2 | 2008-Q4 | 2009-Q4 | 2010
MAXIM spec | 2008-Q1 | 2008-Q3 | 2008-Q4 | 2009-Q2 | 2009
Network API spec | 2008-Q2 | 2009-Q1 | 2009-Q3 | 2010-Q2 | 2010
Progress Events spec | 2007-Q2 | 2008-Q2 | 2008-Q3 | 2009-Q2 | 2009
Selectors API spec | 2007-Q2 | 2007-Q4 | 2008-Q2 | 2008-Q4 | 2008
XHR Object spec | 2007-Q2 | 2008-Q2 | 2008-Q4 | 2009-Q4 | 2010
Widgets spec | 2006-Q4 | 2008-Q4 | 2009-Q1 | 2009-Q3 | 2009-Q4
Widgets Requirements | 2006-Q3 | 2008-Q4 | 2009-Q1 | 2009-Q3 | 2009-Q4
Window Object spec | 2007-Q2 | 2008-Q2 | 2008-Q4 | 2009-Q4 | 2010
XBL2 spec | 2006-Q2 | 2010 | 2011 | 2013 | 2013
XBL2 Primer | 2007-Q3 | 2010 | 2011 | 2013 | 2013

Another cool thing about the new working group is that it is modeled on the HTML Working Group, meaning that it is open and transparent (no secret chats on the members list), and anyone will be able to participate via the public mailing list.

I’ll continue to edit the Widget Spec and Requirements, and possibly continue to help out with the XBL Primer. I’ll continue to be part of this new working group for at least 1 year, as my PhD program ends in March 2009… and hopefully longer, if someone gives me a job to continue working on specs! ;)

PhD, W3C, WAF-WG, Widgets, Work

"OMG, I'm a server!": widgets and the exciting future of mobiles

December 17th, 2007

I’ve been doing my fair share of traveling lately. I went to the W3C TPAC in Boston, which was great, and I just got back from vacation in Tropical North Queensland (Port Douglas) a few days ago. I went whitewater rafting and snorkeling in the (sadly dying) Great Barrier Reef, and got to swim with a turtle and some sharks.

While I was in Boston for the TPAC, I bought myself an iPod touch and a Nokia N95. The first thing I did when I got my iPod was to jailbreak it. I have to say, the iPod touch is simply awesome… however, I won’t go into a rant because I don’t want to expose myself too much as an Apple fanboy :) The first thing that struck me as I was navigating the list of apps to install on the jailbroken device was the availability of the Apache Web Server and PHP. When I saw that, I instantly thought “OMG! This changes everything: I am a server!”. Sure enough, I installed them and they worked. I got my friends from Australia to log onto my iPod – very cool! It was only a few weeks later that I heard that Nokia was also going to release a phone with Apache, PHP, and MySQL (APM), which I’m keen to try out on my N95. I think this is a significant development while we wait for the standardization and eventual implementation of HTML5 (which will provide similar functionality).

Putting aside all security and privacy concerns for a minute, I think the idea of everyone now being a web server is a very exciting and disruptive innovation. Imagine a widgets ecosystem that intertwines phones and desktops and integrates ideas from social networking and the unique aspects of the mobile in a single container (widgets).

I don’t know what Nokia is going to do with their APM phones (and I am sure that the Apple iPhone/iPod and Google Android will both feature web servers really soon), but here is a simple future scenario: I buy a new phone with the APM capability. When I connect the phone to the internet, people can access the phone via its IP address (which kinda sucks, but is fixable… more on this later). Pre-installed with the phone is a widget engine, which allows the user to either manually install widgets or use pre-installed widgets. The widget engine provides an admin interface, accessible only via, say, “http://widgetengine/” or something, which allows me to add/customize/remove widgets. Widgets in this context are little PHP apps, packaged to conform with the Widgets 1.0 spec. Let’s say the default widget that ships with the phone is a Nokia-built one that shows some info about the phone and generates a photo gallery of the pictures stored on the device. Although impressive, it is not really of much use to me because everyone I care about is on Facebook (or some OpenSocial network).

Given that the phone has a widget engine that runs on top of the server, a developer could create a Facebook widget that gathers all the phone numbers and details from my Facebook friends list and packages them into a widget. When the widget is installed, all those phone numbers and details get stored in the MySQL database. I can then ask the widget to either SMS or simply message, via Facebook, all the preferred contacts to let them know that my phone server is up. Better still, the widget, via PHP, can monitor the phone to see when it is assigned an IP address, and automatically connect to Facebook to let my contacts know that I am online. From there, my contacts can check out, for example, photos that I have just taken on my phone or other things the widget may allow viewers to do.

The things that I would want to share as a user (my profile: things that define me publicly as an individual and associate me as part of a group) and some simple app ideas:

  • My location (exact (GPS) or derived (e.g., Brisbane) or abstract (e.g., the office))
    • Apps: Where am I now? Where I’ve been (recently, travelling, etc)? What exercise path did I take (and times, calories burnt)?
  • My pictures (sortable, in sets, searchable)
    • Apps: my picture gallery; my picture gallery combined with pictures taken from a similar location (e.g., mix locally stored pictures with Flickr)
  • My music (what I’ve got on my device, what I am listening to right now)
    • App: my music and music people around me are listening to?
  • My details (maybe my social wants and needs. link to my blog online)
    • App: a dating widget? Syndication of my blog combined with my locally stored pictures?

The effect of these apps is very interesting because it means that I can bypass services such as Flickr, or I can integrate both Flickr and my phone. I can also merge the means of communication with my contacts, via SMS or the web.

These applications require additional infrastructure to connect me to other users:

  • Global peer-to-peer infrastructure: when my phone connects to the internet, I want my contacts to know about it!
  • Local peer-to-peer infrastructure: when my phone connects to the internet in this place, let those near me know: eg, for playing location-based games, or other multiplayer games; or, for example, for letting people know at this place that I’ve arrived.

This also requires a place where phone widgets are distributed by developers and scrutinized by the community for security and quality.

The future looks pretty nice if APM-enabled phones and services take off… and if the security and privacy issues are handled with care.

HTML, Rant, Widgets

Widgets 1.0 (v2)

October 17th, 2007

Today the W3C published the Second Public Working Draft of the Widgets 1.0 Specification. It’s been nearly a year since we published the first public working draft (11 Nov, 2006) and much has changed and been added to the spec (…and it still has a long, long way to go before it will be finished!). The most notable additions to this version of the spec are the attempt to standardize a subset of the Zip specification and support for digital signatures using XML Digital Signatures. Unfortunately, a lot of exciting things that are under discussion by those participating in the standardization effort have not made it into this latest draft. For example, we are still trying to work out a nice model for automatic updates, but we should have something drafted up fairly soon.

The main problem I’ve been working on over the last two months is trying to specify a subset of Zip that should be used by widgets. My goal has been to define a subset that is interoperable across all platforms and devices in such a way that it also ensures longevity. As you might imagine, this has proven to be quite a challenge…

The issues with Zip

The Zip file format is what is commonly referred to as a de facto standard: it is not formally specified by any standards body, but it is so widely implemented that it is interoperable across OSs and devices. This seems great on the surface, but when you try to standardize it, it becomes quite a nightmare. The main issues are these:

  • There are competing Zip specifications and there are many versions of each of the Zip specifications.
  • Different versions of the Zip specification are implemented across different platforms and OSs.
  • There are many features in Zip that are desirable (eg. UTF-8 support), but are not widely implemented.
  • Zip is not an “open standard”, it is the property of PKWARE.
  • Zip is periodically updated and PKWARE does not provide any links to previous versions of their specs.

Competing Zip specifications

There are essentially two Zip specifications that applications make use of: the “official” PKWARE Zip Application Notes and the “unofficial” Info-Zip Application Notes (mostly on Unix). The unofficial notes basically take whatever PKWARE has officially published, which then gets modified, or otherwise clarified, by the guys at Info-Zip. In this sense, much of what one finds in the Info-Zip specs is identical to the PKWARE Zip spec. But, because PKWARE actually maintains the official spec, the PKWARE spec is always more up-to-date than what Info-Zip has on its website. For instance, the latest version of the Info-Zip notes covers version 6.2.0 of the official Zip spec (26 April 2004); the latest version of the Zip spec is version 6.3.2, which came out in September 2007, so Info-Zip is three years behind PKWARE!

Problem: the Info-Zip notes contain details that pertain to how Info-Zip works and may not be compatible/interoperable with the PKWARE Zip spec. For example, Info-Zip contains details about how to handle Unix permissions, while PKWARE’s Zip spec does not. This might not make the file formats incompatible, but it does make them physically different. You can try this out yourself: zip up a file using Info-Zip’s zip implementation and then zip up the same file using Windows’ Compressed Folders. The results will be different, but you should still be able to decompress the Info-Zip file using Windows’ native Zip implementation.
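If you don't have both tools handy, the effect can be imitated with Python's zipfile module. This sketch does not run Info-Zip or Compressed Folders; it just writes the same file twice, once with Unix-style metadata (host system 3 plus permission bits) and once with DOS-style metadata, which are my assumptions about the kind of difference involved. The helper names are hypothetical.

```python
import io
import zipfile

FIXED_DATE = (2007, 10, 17, 0, 0, 0)  # same timestamp for both archives


def make_zip(create_system: int, external_attr: int) -> bytes:
    """Write one file into an in-memory zip with the given host metadata."""
    info = zipfile.ZipInfo("hello.txt", date_time=FIXED_DATE)
    info.create_system = create_system  # 3 = Unix, 0 = MS-DOS/FAT
    info.external_attr = external_attr  # Unix mode lives in the high 16 bits
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(info, b"hello widgets")
    return buf.getvalue()


def read_back(data: bytes) -> bytes:
    """Extract hello.txt from an in-memory zip."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return zf.read("hello.txt")


unix_style = make_zip(create_system=3, external_attr=0o100644 << 16)
dos_style = make_zip(create_system=0, external_attr=0x20)

print("archives identical:", unix_style == dos_style)
print("contents identical:", read_back(unix_style) == read_back(dos_style))
```

The two archives differ byte for byte, yet both decompress to the same file, which is the "physically different but still compatible" situation described above.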

Different versions of the Zip specification are implemented across different platforms, OSs, Specs

Another significant issue from a standardization perspective is that packaging formats are making use of either some Info-Zip spec or some PKWARE spec. Significant examples include:

Java/JAR (including WAR and EAR):
Info-ZIP Application Note 19970311
Open Document Format (ODF):
Info-ZIP Application Note 19970311
Office Open XML – Open Packaging Conventions (OOXML-OPC):
PKWARE Zip Application Note (version 6.2.1), but with a bunch of clarifications.
OEBPS Container Format 1.0:
PKWARE Zip Application Note (no explicit version, but at least version 2.0 needed to extract and version 4.5 needed to extract Zip64).

I still have little idea as to what version of the Zip specification is actually implemented on each OS, let alone on mobile devices (information that seems to be quite difficult to come by!). As a result, and after some discussion with Jon Ferraiolo of IBM, I decided to base the Widget Spec on the OEBPS-OCF’s conformance requirements for Zip packages. I was tempted to make the widgets specification conform to the OOXML-OPC spec (put away your tomatoes!) because, in my opinion, the container aspects and conformance requirements are well specified (even if the rest of OOXML is “evil”).

Desirable features in Zip (6.3.2)

There are a number of really cool features in Zip that would make specifying a container format for widgets much better. They include:

  • Strong Encryption (using X.509 digital certificates): basically solves the digital signature problem, I think.
  • UTF-8 support: solves a significant part of the internationalization problem.
  • Zip64: future proofing.

To require widget engines to actually support these features puts a fair bit of strain on makers of widget engines. At this point, we have required that implementers support UTF-8 and Zip64.
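For a rough sense of what a widget engine would have to look at, here is a sketch that lists, for each entry in an archive, whether its name is flagged as UTF-8 and which "version needed to extract" it declares. It uses Python's zipfile module; the helper names are hypothetical, and the constants reflect my reading of the PKWARE notes (general purpose bit 11 for UTF-8 names, version 4.5 for Zip64).

```python
import io
import zipfile

UTF8_NAME_FLAG = 0x800  # general purpose bit 11: file name is UTF-8
ZIP64_VERSION = 45      # "version needed to extract" 4.5, stored as 45


def describe_entries(data: bytes):
    """Return (name, name_is_flagged_utf8, version_needed) per entry."""
    rows = []
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for info in zf.infolist():
            rows.append((info.filename,
                         bool(info.flag_bits & UTF8_NAME_FLAG),
                         info.extract_version))
    return rows


def make_demo_zip() -> bytes:
    """Small archive with one ASCII name and one non-ASCII name."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("plain.txt", b"a")
        zf.writestr("ma\u00f1ana.txt", b"b")
    return buf.getvalue()


rows = describe_entries(make_demo_zip())
for name, is_utf8, version in rows:
    print(name, "utf8-flag:", is_utf8, "version needed:", version)
```

Python only sets the UTF-8 flag when a name cannot be stored as plain ASCII, so the first entry comes back unflagged and the second flagged.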

Zip is not an open standard

The fact that Zip is proprietary might be something that comes back to bite us on the ass. I’m no lawyer, but there may be patent/IPR issues surrounding Zip. I’m also not sure how PKWARE will feel about WAF specifying a subset of their specification. I’ve emailed PKWARE and informed them of what we are doing and requested that they review the spec. They have responded and said that they will look into it.

Where to from here…

Looking forward, I’d really like to get all the physical and logical packaging stuff done. That includes:

  • Anything Zip related
  • The inter-package addressing model
  • How to handle decompression
  • How to name files in ASCII and UTF-8

I’d also really like to nail down the auto-updates model and make sure that the manifest language we are specifying covers all the common use cases. The security model is the elephant in the room :) No one wants to touch it at this point, but we know it’s a massive issue. Another massive issue is the APIs… but that’s not something I want to get into now. A big issue for me is internationalization. I’ve been blocked a number of times when I’ve proposed doing internationalization using folders… every widget engine except Opera does it, so I think we should do it too.
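To show what I mean by internationalization using folders, here is a purely hypothetical sketch: localized copies of a file live in a folder named after the language, and the engine falls back from the most specific locale to the unlocalized file. The folder layout and the function name are assumptions made up for this example, not anything the working group has agreed on.

```python
def resolve_localized(archive_paths, locale, wanted):
    """Pick the best match for `wanted` given the user's locale.

    Tries "<locale>/<wanted>" from most to least specific
    (e.g. "es-mx/", then "es/"), then the unlocalized file.
    Returns None when nothing matches.
    """
    available = set(archive_paths)
    parts = locale.lower().split("-")
    while parts:
        candidate = "-".join(parts) + "/" + wanted
        if candidate in available:
            return candidate
        parts.pop()  # drop the most specific subtag and retry
    return wanted if wanted in available else None


paths = ["index.html", "es/index.html", "fr/index.html"]
print(resolve_localized(paths, "es-MX", "index.html"))
print(resolve_localized(paths, "de", "index.html"))
```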

W3C, WAF-WG, Widgets

Web Directions South Conference

October 7th, 2007

Last week I attended the Web Directions South Conference in Sydney. I was invited to give a talk on Widgets as part of the conference’s W3C SIG day. Overall, I thought the conference was really good: very well organized, with lots of good, interesting talks. The slides for my talk are now hosted on SlideShare.

W3C, Widgets

July-August wrap-up…finally confirmed

September 7th, 2007

Wow, I can’t believe it’s September and how slack I have been about keeping this blog up-to-date…

July was mainly taken up by my PhD confirmation and August with the widgets spec, visiting Oslo for a WAF face-2-face, and a brief visit to the W3C office in Canberra to attend a standards symposium.

Confirmation dramas

My PhD confirmation went well enough… about 10 people were nice enough to come along. I basically presented the issues around widgets and then argued about why it was valid to do my research on their standardization. I got a bit of grief from the academic panel about my resistance to including the academic community in the research. I basically tried to argue that academics are out of touch with what is happening in this area and there would be little point in me including them (as I also have very little idea about what is going on in this area! ;)). Still, they insisted that I produce at least three academic papers out of my research and make that my PhD… I agreed.

I will now produce three papers for my PhD (I’m doing my PhD by publication, which means I just publish papers instead of writing a PhD thesis that no one will read):

  1. The first will be about the issues around widget engines and the problems that could be addressed by standardization.
  2. The second will be on the design decisions that have gone into the Widgets 1.0 Spec. This paper would cover, for instance, the reasons we are using XML for the manifest format instead of JSON… or why we may or may not have a namespace for the widget manifest format.
  3. The third would discuss the Widgets 1.0 Test-suite… particularly any interesting design decisions. There is not much literature out there on web-based test suites for specs. Well, not much that I could find as part of my research.

Anyway, I was asked to resubmit my confirmation document and list the above three publications. To be a smart-ass, I submitted the document with the following footnote (mostly to test if they would read any of my changes):

It is my position that producing peer-reviewed publications for journals and conferences is a waste of time (according to an article published CiteSeer in Nature (Lawrence, 2001), the average citation rate of a journal paper is 2.74, and articles freely available online are more highly cited). For greater impact and faster scientific progress, authors and publishers should aim to make research easy to access, something that is not possible when papers are published for profit on the medium of paper. Also, QUT’s Creative Industries faculty finds it difficult to deal with the concept of harnessing collective intelligence (O’Reilly, 2005) as it undermines what academics do, as they can no longer claim authorship over works/publications. This is unfortunate, sad, and a very archaic way of thinking. It is obvious that harnessing the collective intelligence, as is done in open standardization efforts, will more often yield higher quality output than that produced by pretending that everything is written in isolation.

That footnote did not go down too well, so I was asked to remove it (hehe, yes!! CENSORSHIP AT QUT, IT’S ALL TRUE!!!!)… anyway, I removed it and resubmitted a few days ago, so I am now finally confirmed. Glad that part of my life is over and done with.

WAF F2F in Oslo

Morning in Oslo...

The Web Application Formats (WAF) face-2-face meeting in Oslo went very well. It was held at Opera Software’s HQ in Oslo. We mostly talked about Widgets and decided to finally dump the Declarative Format For Applications and User Interfaces (DFAUI). (I’ll talk more about the widgets stuff we discussed in a bit.) The implication of dumping the DFAUI was that our working group charter is monumentally out of date! In fact, the current WAF charter does not even mention widgets. The WAF working group will have its charter reviewed in November (as the group also expires in November, 2007). I expect the group to be rechartered for another 2 years.

The discussions around widgets focused mostly on the editor’s drafts of the widget requirements and the widget spec. As I’ve written about previously, I’ve done a lot of work recently on the requirements. The requirements document is starting to become quite stable. However, the widget spec is still in a bit of a mess. I mostly blame myself for that, so I’ve started working on restructuring it. The reasons I think the widget spec is a bit of a mess are:

  • I still don’t have a complete overview of the overall problem space (particularly APIs).
  • I really want to make sure that a proposed solution is as compatible as possible with existing implementations, and also follows W3C principles. This means really understanding how the current market-leading widget engines actually work. This involves a lot of research, development, and testing on each platform, so it is quite time consuming.

Regardless of my document structuring inabilities, I think that the meeting was quite successful in at least nailing down what we need to make a priority in the short term. Namely:

  1. Manifest (language and processing)
  2. Packaging (bits of Zip that we want to use)
  3. Auto updates (an HTTP and XML based model)

I recently made a post to the WAF public list about automatic updates based on the discussion the working group had at the face-to-face meeting. In the coming days (weeks?), I’ll write a blog entry on current approaches to automated updates. I’ll mostly just focus on Firefox and Yahoo! Widgets, and compare them to a pure HTTP solution proposed by Mark Nottingham.

Standards Symposium, Canberra

The W3C Australia office and NICTA jointly organized a small standards symposium, which was held at the W3C office (CSIRO) in Canberra on the 26th of August. Overall the event was very interesting, covering a wide range of topics relating to standardization of various technologies (mostly built on XML).

One interesting talk was given by Anne Cregan about some work she is involved with regarding the Semantic Web. As someone who was once into the rhetoric of the semantic web, I was quite fascinated to hear that there is a working group trying to create what Anne described as an English-like expression of OWL. I then proceeded to ask Anne why there was no collaboration between the HTML WG and the Semantic Web working groups, given that the semantic web is supposed to interact with the Web (AKA, HTML+tag-soup). She mentioned GRDDL, which made me giggle as I can’t think of a more useless technology.

My conclusion from Anne’s talk is that the Semantic Web and HTML groups are still working completely separately from each other, which is probably why the Semantic Web movement got so badly pwned by the Microformats and Tagging communities. At least it’s good that they have started dumping all the XML syntax in favor of more human-readable alternatives. If the alternatives prove to be usable, then there may be hope yet to integrate them on the Web… particularly if any new semantic web language does not rely on XML… maybe someone will come up with a nice JSON alternative ;-)

WAF-WG, Widgets, XBL 2.0, monthly wrap-up

June Wrap-up

July 6th, 2007

June was a fairly busy month:

July is also going to be pretty intense:

  • Will try to get the First Public Working Draft of the XBL Primer out by the 13th of July.
  • Presenting my PhD confirmation on the 25th.
  • Going to Melbourne on the 26th to work with Cameron McCormack for two days on a model for the XBL 2.0 Test Suite.

Widgets Requirements (4.0)

July 6th, 2007

Widget Reqs on the W3C homepage

The Fourth Working Draft of the Widgets 1.0 Requirements has now been published by the W3C.

Next comes a lot of research into each requirement and getting the normative wording into the spec. I think I will start with the "low hanging fruit" (the manifest) while I research aspects related to persistent storage.

W3C, WAF-WG, Widgets

Widgets 1.0 Requirements Updated!

July 2nd, 2007

Today I put in a request for the Widgets 1.0 Requirements to be republished by the W3C. I pretty much completely reworked every aspect of the document. I hope it’s now much clearer what is needed from the Widgets 1.0 spec. Please send us feedback at public-appformats@w3.org.

What’s new in the document? Read more…

W3C, WAF-WG, Widgets