3.5.1 Choosing an XML parsing model

XML parsers can be classified in several ways, depending on their characteristics and on what you want to do with the document. The following questions may help you choose:

Making modifications or just processing?

  • For modifications: the parser creates a long-lived representation of the XML document, which is necessary for modifying it. Choose DOM or VTD.
    • You need to query or modify the objects (the parser creates nodes): DOM
    • You do not need the objects (the parser creates caches of integers and locations): VTD
  • For processing: the parser does not create long-lived objects. Choose SAX or StAX.
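
To make the "long-lived representation" idea concrete, here is a minimal sketch of the DOM style. It uses Python's standard xml.dom.minidom module as a stand-in, not the Pharo XMLParser API; the sample document and the retitle function are hypothetical names chosen for illustration.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document, used only for this illustration.
SOURCE = "<library><book id='1'><title>Old title</title></book></library>"

def retitle(xml_text, new_title):
    # The DOM parser builds a node tree that stays in memory after parsing.
    document = parseString(xml_text)
    title = document.getElementsByTagName("title")[0]
    # Modify the tree in place by changing the data of the text node.
    title.firstChild.data = new_title
    # Serialize the modified tree back to XML text.
    return document.documentElement.toxml()

result = retitle(SOURCE, "New title")
print(result)
```

Because the whole tree is kept in memory, the node can be located, changed, and written back out after parsing has finished, which a purely event-based parser does not offer.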

Type of Access

  • Back-and-forth: Access the data after the parsing is complete: DOM or VTD
    • Massive or very frequent access: Choose DOM
    • Rare or simple access: Choose VTD
  • Sequential: Access the data while you're processing the document: SAX or StAX
    • Processing all tokens: SAX
    • Processing custom tokens (allows skipping forward): StAX
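
The difference between "processing all tokens" and "skipping forward" can be sketched with a pull-style parser, where the application asks for the next event instead of being called back for every one. The example below assumes Python's standard xml.dom.pulldom module as a rough analogue of the StAX model; it is not the StAX API itself, and the log document and first_error function are hypothetical.

```python
from xml.dom import pulldom

# Hypothetical sample document, used only for this illustration.
SOURCE = (
    "<log>"
    "<entry level='info'>started</entry>"
    "<entry level='error'>disk full</entry>"
    "<entry level='info'>stopped</entry>"
    "</log>"
)

def first_error(xml_text):
    events = pulldom.parseString(xml_text)
    # The application pulls events one by one and ignores the ones it does not need.
    for event, node in events:
        if event == pulldom.START_ELEMENT and node.tagName == "entry":
            if node.getAttribute("level") == "error":
                # Only now build the subtree for this single element.
                events.expandNode(node)
                return "".join(
                    child.data
                    for child in node.childNodes
                    if child.nodeType == child.TEXT_NODE
                )
    return None

message = first_error(SOURCE)
print(message)
```

Here the caller decides which elements matter and stops pulling events as soon as it has what it needs; with a SAX-style parser the handler would be called for every token instead.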

Type of Application

  • Streaming applications (very large documents): SAX or StAX
  • Database applications: DOM or VTD
  • Hardware acceleration?: VTD

For the SAX and StAX parsers you need to know the XML token types. With XMLParser, for example, you would probably subclass SAXHandler and override one or more methods in the content category to do your own processing. For DOM usage examples, see http://community.ofset.org/index.php/Les_bases_de_XML_dans_Squeak (it is in French, but it is a good document).
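
The subclass-and-override pattern looks roughly like the sketch below. It uses Python's standard xml.sax module, whose ContentHandler plays a role comparable to a SAX handler class; it is not the XMLParser SAXHandler API, and the TitleCollector class and sample document are hypothetical.

```python
import xml.sax

class TitleCollector(xml.sax.ContentHandler):
    """Collects the text of every <title> element."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._buffer = None  # None means we are not inside a <title>

    def startElement(self, name, attrs):
        # Called for each start tag; begin buffering when a title opens.
        if name == "title":
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        # Called for each end tag; store the finished title.
        if name == "title" and self._buffer is not None:
            self.titles.append("".join(self._buffer))
            self._buffer = None

# Hypothetical sample document, used only for this illustration.
SOURCE = b"<library><book><title>Dune</title></book><book><title>Emma</title></book></library>"

handler = TitleCollector()
xml.sax.parseString(SOURCE, handler)
print(handler.titles)
```

Only the callbacks for the token types of interest (start tag, character data, end tag) are overridden, which is why knowing the token types matters for this family of parsers.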


Licensed under Creative Commons BY-NC-SA