Wednesday, March 07, 2007

Atom Publishing Protocol, Where Are the Batch Semantics?

Batch support was proposed (PaceBatch) and heavily discussed in the Atom Publishing Protocol Working Group, but it was not adopted.

Google needed it for its Google Data API and went ahead with its PaceBatch idea. There is no need to explain further why batch support is needed: PaceBatch gives a good rationale for it, and Google has a real-world use case for it, Google Base.

But, in my opinion, the PaceBatch/Google-Data solution has some critical issues.
  • It mixes up the transport layer (the POST/PUT/DELETE operation and HTTP headers) with the data layer (the atom:entry element).
  • An Atom entry has to be different (it has to include transport information) just because it is submitted in a batch.
  • To support batch processing, code has to be rewritten at both the protocol level and the data handling level.
  • Large batch submissions cannot be handled with XML DOM parsers.
So, how about the following alternative (which, by the way, is suggested in the Limitations section of PaceBatch)?

[I have not considered HTTP 1.1 pipelining, as section 8.1.2.2, "Pipelining", of RFC 2616 clearly discourages pipelining non-idempotent methods; the recommended behavior is to serialize requests and responses when non-idempotent methods are used, which takes us back to square one.]
  • Use a MIME multipart (RFC 2046) document to post a batch.
  • Each part has a headers section, which would mimic the HTTP header section of a single operation, with an 'Atom-Operation' header indicating the operation for the entry in that part.
  • The data section of each part would be the Atom entry to insert, update, or delete (for a delete, the data section is empty).
  • The response would be a MIME multipart document with the same number of parts as the request, each part containing the operation status of the corresponding request part, plus other headers and the echoed Atom entry if necessary.
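To make the shape of such a batch concrete, here is a minimal sketch in Python that uses the standard library `email` package to assemble one. Only the 'Atom-Operation' header comes from the proposal above. The operation names (`insert`, `update`, `delete`), the `build_batch` helper, and the choice of `multipart/mixed` are assumptions made for illustration, and how a part names its target resource is left open.

```python
from email import message_from_string
from email.message import Message
from email.mime.multipart import MIMEMultipart

ATOM_ENTRY_TYPE = "application/atom+xml;type=entry"


def build_batch(operations):
    """Build a multipart batch from (operation, entry_xml) pairs.

    entry_xml is None for a delete, which leaves the data section empty.
    build_batch is a hypothetical helper, not part of any Atom library.
    """
    batch = MIMEMultipart("mixed")
    for operation, entry_xml in operations:
        part = Message()
        # Per-part headers mimic the HTTP headers of a single operation.
        part["Content-Type"] = ATOM_ENTRY_TYPE
        part["Atom-Operation"] = operation  # header name from the proposal
        part.set_payload(entry_xml or "")
        batch.attach(part)
    return batch


entry = '<entry xmlns="http://www.w3.org/2005/Atom"><title>Hello</title></entry>'
batch = build_batch([("insert", entry), ("update", entry), ("delete", None)])

# The serialized document is what would travel as the HTTP request body.
wire_text = batch.as_string()

# Reading it back shows the entry itself carries no transport information.
parsed = message_from_string(wire_text)
operations_seen = [part["Atom-Operation"] for part in parsed.get_payload()]
```

The entry XML is byte-for-byte what a single POST or PUT would carry; everything batch-specific lives in the part headers.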
This alternative addresses the four issues I've described above:
  • The transport and data layers remain separate from each other.
  • An Atom entry is the same whether it is sent as a single operation or as part of a batch operation.
  • Only the code handling the protocol level has to be rewritten; the data handling code remains the same.
  • An XML DOM parser can operate on each entry without having to process the full batch.
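The last point can be sketched as follows: the receiver splits the batch into parts and hands each entry to an ordinary XML parser, one at a time. This is only an illustration under assumptions: `iter_batch_entries`, the boundary string, and the operation names are hypothetical, and this version still reads the whole raw batch into memory, so a truly large batch would need a streaming MIME reader in place of `message_from_string`.

```python
import xml.etree.ElementTree as ET
from email import message_from_string

ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Hand-written sample batch; the boundary and operations are made up.
SAMPLE_BATCH = """\
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="batch-boundary"

--batch-boundary
Content-Type: application/atom+xml;type=entry
Atom-Operation: insert

<entry xmlns="http://www.w3.org/2005/Atom"><title>First</title></entry>
--batch-boundary
Content-Type: application/atom+xml;type=entry
Atom-Operation: delete

--batch-boundary--
"""


def iter_batch_entries(raw_batch):
    """Yield (operation, entry_element_or_None) for each part of a batch."""
    batch = message_from_string(raw_batch)
    for part in batch.get_payload():
        operation = part["Atom-Operation"]
        body = part.get_payload().strip()
        # Each entry gets its own small XML tree; an empty body (delete)
        # yields None instead of a parsed entry.
        entry = ET.fromstring(body) if body else None
        yield operation, entry


results = list(iter_batch_entries(SAMPLE_BATCH))
first_title = results[0][1].find(ATOM_NS + "title").text
```

The code that consumes each parsed entry is the same code that would handle a single, non-batch request.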
While this adds MIME multipart to the mix, it would be buried inside the batch implementation, with no exposure to the application developer.

Further thoughts:
  • An HTTP header on the request could indicate whether the batch submission fails as a whole or allows partial failure.
  • An HTTP header on the request could indicate that the response may be a simple HTTP OK if all entries are processed successfully (instead of responding with an HTTP status for each one).
  • A correlation header in the MIME Part could be used to correlate an entry with a status response.
  • Members with other content types (such as images) would be just another MIME part.
  • A correlation header in the MIME part could be used to correlate entries with members of other content types.
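As a rough sketch of the correlation idea, a client could read the multipart response and index each part's status by its correlation value. The header names `Correlation-Id` and `Status`, the sample response, and the helper `statuses_by_correlation_id` are all hypothetical; the post does not define them.

```python
from email import message_from_string

# Hand-written sample response; header names and values are assumptions.
SAMPLE_RESPONSE = """\
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="resp"

--resp
Correlation-Id: op-1
Status: 201 Created

--resp
Correlation-Id: op-2
Status: 409 Conflict

--resp--
"""


def statuses_by_correlation_id(raw_response):
    """Map each part's correlation id to its numeric status code."""
    response = message_from_string(raw_response)
    statuses = {}
    for part in response.get_payload():
        # "201 Created" -> 201
        code = int(part["Status"].split()[0])
        statuses[part["Correlation-Id"]] = code
    return statuses


statuses = statuses_by_correlation_id(SAMPLE_RESPONSE)
# Under partial-failure semantics, the client retries or reports only these.
failed = sorted(cid for cid, code in statuses.items() if code >= 400)
```

With an explicit correlation value, the client no longer depends on the response parts arriving in the same order as the request parts.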

5 Comments:

Anonymous Anonymous said...

Alejandro, are you still active on the ROME project? How can I reach you? dave.mcloughlin@openlogic.com

12:30 PM  
Anonymous John Roche said...

This doesn't seem to be a solution that browsers (i.e. UI developers) can easily use since they do not have a way of accessing the body of the multipart request if files are being sent. And if they were to send a multipart with no bytes and just atom then what you're proposing is essentially a new input format for the UI developer in question, with atom as a microformat. I'm about to try implementing feed posting in order to solve this problem. I don't have to send binary data though so maybe this solution wouldn't be suitable for your needs.

3:00 AM  
Anonymous Anonymous said...

Dear hadoop, How can i reach you?
Syed
9663330622/syedkams.net@gmail.com

5:16 AM  
Blogger Syed Kamran said...

Dear hadoop, How can i reach you? Sorry my last comment was Anonymous
Syed
9663330622/syedkams.net@gmail.com

5:18 AM  
