Thursday, October 13, 2011

XML Processing in Java EE 5

All of the new web services APIs require XML processing. Thankfully, Java EE 5 also changes how XML is handled, with a fresh batch of updates.

JAXB 2.0: Improves vastly over JAXB 1.0

Support for all W3C XML Schema features (JAXB 1.0 left some of them without bindings)

Adds Java-to-XML binding through the new javax.xml.bind.annotation package.

Reduction in generated schema-derived classes.

Validation via JAXP 1.3 validation APIs

Smaller runtime binaries.

Made up of a schema compiler, a schema generator, and a binding runtime framework.

JAXB 1.0 allowed validation at unmarshal time, plus on-demand validation of the content tree. JAXB 2.0 allows validation at both marshal time and unmarshal time.
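
To make the validation point a little more concrete, here is a minimal sketch of the JAXP 1.3 validation API used on its own, without JAXB. The class name, the tiny order schema, and the sample documents are made up for illustration. In JAXB 2.0 the same kind of Schema object would be handed to Unmarshaller.setSchema or Marshaller.setSchema rather than used directly.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class ValidationSketch {

    // Hypothetical schema: an <order> holding one positive integer <quantity>.
    static final String ORDER_XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='order'><xs:complexType><xs:sequence>"
        + "<xs:element name='quantity' type='xs:positiveInteger'/>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    // Compiles the schema text into a reusable Schema object.
    static Schema compile(String xsd) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return factory.newSchema(new StreamSource(new StringReader(xsd)));
        } catch (SAXException e) {
            throw new IllegalArgumentException("schema did not compile", e);
        }
    }

    // Returns true when the document validates; with no ErrorHandler set,
    // the validator throws a SAXException on the first validation error.
    static boolean isValid(Schema schema, String xml) {
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Schema schema = compile(ORDER_XSD);
        System.out.println(isValid(schema, "<order><quantity>3</quantity></order>"));
        System.out.println(isValid(schema, "<order><quantity>none</quantity></order>"));
    }
}
```

The first document should pass and the second should fail, because "none" is not a positive integer.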

Streaming API for XML (StAX)

StAX is the all-new, efficient API for XML, and it has a lot of great features:

  • Stream-oriented
  • Event-Driven
  • Pull-design
  • Read/Write

You can create fast, lightweight, bidirectional parsers that are easy on the heap.

The JAXP (Java API for XML Processing) family includes StAX, TrAX, SAX, and DOM. StAX suits applications with little memory to spare and limited extensibility requirements.

As a pull parser, StAX is simpler to program against than SAX and more memory-efficient than DOM.

SAX can’t write, and it isn’t bidirectional. DOM is way more powerful and flexible. One would dump SAX for StAX. If you want an iterative pull parser, pick StAX; if you want an event-driven push parser, go for SAX.
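
The read/write point is easier to see in code. The sketch below writes a small document with XMLStreamWriter and then pulls it back in with XMLStreamReader. The class name, the books/book element names, and the sample titles are invented for the example. Checked XMLStreamExceptions are wrapped in unchecked ones only to keep the sketch short.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

final class StaxRoundTrip {

    // Writes <books><book>title</book>...</books> straight to the output,
    // one call at a time, without building a tree in memory.
    static String write(List<String> titles) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter writer =
                XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            writer.writeStartDocument();
            writer.writeStartElement("books");
            for (String title : titles) {
                writer.writeStartElement("book");
                writer.writeCharacters(title);
                writer.writeEndElement();
            }
            writer.writeEndElement();
            writer.writeEndDocument();
            writer.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    // Pull parsing: the application asks for the next event when it is ready,
    // instead of being called back by the parser as in SAX.
    static List<String> read(String xml) {
        List<String> titles = new ArrayList<>();
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "book".equals(reader.getLocalName())) {
                    titles.add(reader.getElementText());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return titles;
    }

    public static void main(String[] args) {
        String xml = write(Arrays.asList("StAX", "DOM"));
        System.out.println(xml);
        System.out.println(read(xml));
    }
}
```

With SAX, the reading half would instead be a handler class whose methods the parser invokes, and there would be no writing half at all.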

I can’t see anyone using SAX anymore. Why would you? Unless you don’t want a cursor and iterator concept in your code, or you simply hate procedural code and believe XML processing should be nothing but read-only events. XMLStreamReader and XMLEventReader are the cursor and iterator APIs, respectively. The iterator API can do things the cursor API cannot: it is more extensible and flexible. The cursor API is efficient, performant, and memory-friendly, which makes it ideal for small JVMs and Java ME.
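
Here is a rough side-by-side of the two styles, assuming a made-up feed document with entry elements. The cursor version only ever looks at the reader's current position. The iterator version gets each event as an object, so events can be kept in a list and inspected after parsing has finished. The class and method names are illustrative only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

final class CursorVsIterator {

    // Hypothetical sample input.
    static final String FEED =
        "<feed><entry id='1'>first</entry><entry id='2'>second</entry></feed>";

    // Cursor API: one reader object whose state changes on every next() call.
    static int countEntriesWithCursor(String xml) {
        int count = 0;
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "entry".equals(reader.getLocalName())) {
                    count++;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return count;
    }

    // Iterator API: every event is an XMLEvent object that can be stored.
    static List<String> entryIdsWithIterator(String xml) {
        List<StartElement> kept = new ArrayList<>();
        try {
            XMLEventReader reader =
                XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()
                        && "entry".equals(event.asStartElement().getName().getLocalPart())) {
                    kept.add(event.asStartElement());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        // The stored events are still usable once the reader is done.
        List<String> ids = new ArrayList<>();
        for (StartElement start : kept) {
            ids.add(start.getAttributeByName(new QName("id")).getValue());
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(countEntriesWithCursor(FEED));
        System.out.println(entryIdsWithIterator(FEED));
    }
}
```

The cursor version allocates no event objects, which is where its memory advantage comes from. The iterator version pays for those objects and gets the flexibility in return.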