Processing XML with C++ under Ubuntu with Xerces

Setting up

  • Install the packages: apt-get install libxerces-c3.1 libxmltooling-dev -y
  • Link the project against: /usr/lib/x86_64-linux-gnu/libxerces-c.so (the path might vary; use locate to find it)

After setting everything up, the first thing is to generate XML data that we can write to a file and to a string. The file is for storage and for us to read from in the second part; the string is for passing the data around.

First thing: initializing Xerces

Before any Xerces method is called, XMLPlatformUtils::Initialize(); must be executed. At the end of your program, don’t forget to call XMLPlatformUtils::Terminate();. Both are declared in xercesc/util/PlatformUtils.hpp.

The XML “factory”

The first step in creating DOM documents is to obtain the key object, the DOMImplementation. This is done like this:

DOMImplementation *pImplement = nullptr;
pImplement = DOMImplementationRegistry::getDOMImplementation(XMLString::transcode("LS"));

To create a document, Xerces requires us to create a document type first.

DOMDocumentType* pDoctype = nullptr;
pDoctype = pImplement->createDocumentType(XMLString::transcode("xml"), 0, 0);
DOMDocument* pDoc = pImplement->createDocument(XMLString::transcode("xml"),
                                               XMLString::transcode("xml"), pDoctype);

Sidebar: the XMLString

The XML string is a UTF-16 string (an array of XMLCh). After creating one with XMLString::transcode, you have to release it with XMLString::release or it will leak memory! I’m not doing this here to keep the code cleaner, but it must be done. There are many ways to handle this, and I will show one of them later.

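From this point on, creating elements is very easy and is done by calling

DOMDocument::createElement(const XMLCh *name). Again, the string passed in here must be released. The element itself belongs to the document that created it and is freed when the document is released with pDoc->release().

To add a child element to a parent, use appendChild(DOMNode *child), which DOMElement inherits from DOMNode.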

To extract the XML data to a string we use the DOMLSSerializer class. Like many others, this object is also obtained from the DOMImplementation object:

DOMImplementation *pImplement = nullptr;
pImplement = DOMImplementationRegistry::getDOMImplementation(XMLString::transcode("LS"));
DOMLSSerializer *pSerializer = pImplement->createLSSerializer();

(Of course, if you already have an instance, you don’t need to create a new one!)

Sidebar: nullptr is a keyword introduced in the C++11 standard. If you can’t compile with it, you can replace it with a regular NULL.

To extract the whole document to a string, we pass the root element to the serializer rather than the document itself. Here is how to do that:

string data=XMLString::transcode(pSerializer->writeToString(pRoot));

Fixing the XMLString memory leaks:

As I mentioned, there are many options to solve this problem. Here is one:

Define a buffer to hold the data, for example: XMLCh buffer[100];

Now you can use this buffer with the XMLString::transcode method. This method usually creates a new string (that you have to release!), but when used like this: XMLString::transcode("user", buffer, 99); it only copies the standard string into the XMLCh buffer, saving you all the releasing problems. Seeing this, you are probably thinking of a macro or an encapsulating class. If you are familiar with them, I strongly recommend that you use one of them!

Xerces API here: http://xerces.apache.org/xerces-c/apiDocs-3/classes.html