XML parsing in Java using SAX parser

A DOM parser is slow and unsuitable for large files, because it loads the entire document into memory before you can navigate through its nodes. A SAX parser works quite differently. This article covers XML parsing in Java using a SAX parser.



SAX stands for Simple API for XML. A SAX parser consumes less memory than a DOM parser because it reads the document sequentially and does not build a tree of nodes. Instead, it reports what it finds through callback methods. Let's begin learning about XML parsing in Java using a SAX parser.

The main callback methods of org.xml.sax.helpers.DefaultHandler are given below.

1) startDocument() – called at the start of an XML document

2) endDocument() – called at the end of an XML document

3) startElement() – called at the start of an element

4) endElement() – called at the end of an element

5) characters() – called with the text content found between the start and end tags
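To make the order of these callbacks concrete, here is a minimal sketch that records each callback as it fires. The class name CallbackTrace, the trace() helper, and the small inline XML string are assumptions made for illustration; only the DefaultHandler methods come from the list above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the original sample.
class CallbackTrace {

    // Parses the given XML text and returns the callbacks in the order they fired.
    static List<String> trace(String xml) throws Exception {
        List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startDocument() {
                events.add("startDocument");
            }

            @Override
            public void endDocument() {
                events.add("endDocument");
            }

            @Override
            public void startElement(String uri, String localName, String qName,
                    Attributes attributes) {
                events.add("startElement:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("endElement:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Record only non-blank text; whitespace between tags is skipped.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    events.add("characters:" + text);
                }
            }
        };
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        // Parse from an in-memory string so that no file path is needed.
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        for (String event : trace("<note><to>Reader</to></note>")) {
            System.out.println(event);
        }
    }
}
```

Running main prints the events in document order: startDocument comes first, each element's start is reported before its text and its end, and endDocument comes last.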

The DefaultHandler class plays an important role in parsing the XML file. The details will become clearer once we work through some sample code.

Reading XML using SAX Parser

First, let us look at student.xml, the input XML file that we are going to parse. It holds one student record with id, name, class and division elements.




<student>
    <id>1</id>
    <name>Bijoy</name>
    <class>10</class>
    <division>A</division>
</student>

As already discussed, a DefaultHandler instance is needed to parse with a SAX parser. We create a custom handler class by extending DefaultHandler. So let us see StudentXMLHandler.java first.

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class StudentXMLHandler extends DefaultHandler {
    // Name of the element whose text content is expected next
    private String currentElement = "";

    public StudentXMLHandler() {
        super();
    }

    @Override
    public void startDocument() {
        System.out.println("Start of document");
    }

    // qName is the element name as it is written in the XML file
    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) throws SAXException {
        currentElement = qName;
    }

    // Prints the text of the known elements; any other text is ignored
    @Override
    public void characters(char[] ch, int start, int length) {
        String text = new String(ch, start, length);
        if (currentElement.equalsIgnoreCase("id")) {
            System.out.println("Id = " + text);
        } else if (currentElement.equalsIgnoreCase("name")) {
            System.out.println("Name = " + text);
        } else if (currentElement.equalsIgnoreCase("class")) {
            System.out.println("Class = " + text);
        } else if (currentElement.equalsIgnoreCase("division")) {
            System.out.println("Division = " + text);
        }
        // Reset so that whitespace between elements is not printed
        currentElement = "";
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Nothing to do at the end of an element in this example
    }

    @Override
    public void endDocument() {
        System.out.println("End of document");
    }
}
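One caveat about the handler above: a SAX parser is allowed to deliver the text of a single element in several characters() calls, and this handler prints only the first chunk it receives. A common way around that is to buffer the text and use it in endElement(). The following is a minimal sketch of that idea, not a replacement for the sample. The class name BufferingStudentHandler, the parse() helper, and the student root element in the test string are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical variant of StudentXMLHandler that buffers text until endElement().
class BufferingStudentHandler extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    private final List<String> lines = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) {
        text.setLength(0); // a new element starts: discard text collected so far
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // may be called several times per element
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String label = labelFor(qName);
        if (label != null) {
            lines.add(label + " = " + text.toString().trim());
        }
        text.setLength(0);
    }

    // Maps the element names used in this article to the printed labels.
    private static String labelFor(String qName) {
        if (qName.equalsIgnoreCase("id")) return "Id";
        if (qName.equalsIgnoreCase("name")) return "Name";
        if (qName.equalsIgnoreCase("class")) return "Class";
        if (qName.equalsIgnoreCase("division")) return "Division";
        return null;
    }

    // Parses XML text and returns one "Label = value" line per known element.
    static List<String> parse(String xml) throws Exception {
        BufferingStudentHandler handler = new BufferingStudentHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.lines;
    }

    public static void main(String[] args) throws Exception {
        // The root element name "student" is assumed for this example.
        String xml = "<student><id>1</id><name>Bijoy</name>"
                + "<class>10</class><division>A</division></student>";
        for (String line : parse(xml)) {
            System.out.println(line);
        }
    }
}
```

Because the value is assembled in endElement(), it does not matter how the parser splits the text across characters() calls.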

As already discussed, the callback methods do the real work, so we have overridden the ones we need. The characters() method receives the text between the start and end tags of an element. Now let us see SAXParserSample.java, which contains the main method.

import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import java.io.File;
import java.io.IOException;

public class SAXParserSample {
    public SAXParserSample() {
    }

    public void readXML() {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser parser = factory.newSAXParser();
            DefaultHandler studentHandler = new StudentXMLHandler();
            System.out.println("Starting parsing");
            // Pass a File: parse(String, ...) expects a URI, not a file system path
            parser.parse(new File("C:\\Users\\My PC\\Projects\\Sample\\files\\student.xml"), studentHandler);
            System.out.println("Done");
        } catch (ParserConfigurationException e) {
            // The factory could not create a parser with the requested settings
            e.printStackTrace();
        } catch (SAXException e) {
            // The XML is not well formed, or the handler reported an error
            e.printStackTrace();
        } catch (IOException e) {
            // The file could not be read
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        SAXParserSample sample = new SAXParserSample();
        sample.readXML();
    }
}

The StudentXMLHandler object is passed to the parser along with the input XML file. Now let’s verify the output.

Output

Starting parsing

Start of document

Id = 1

Name = Bijoy

Class = 10

Division = A

End of document

Done
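The output above is for a well-formed file. When the input is not well formed, the parser throws an exception, which is why SAXParserSample catches SAXException. As a rough sketch of that behaviour, the example below parses XML text with a plain DefaultHandler and reports whether it is well formed. The class name WellFormedCheck and the isWellFormed() helper are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the original sample.
class WellFormedCheck {

    // Returns true if the text parses without error, false if the parser rejects it.
    static boolean isWellFormed(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        try {
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXParseException e) {
            // SAXParseException is a SAXException that also carries the error position
            System.out.println("Parse error at line " + e.getLineNumber()
                    + ": " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isWellFormed("<student><id>1</id></student>"));
        // The id element is never closed, so the parser rejects this input
        System.out.println(isWellFormed("<student><id>1</student>"));
    }
}
```

The first call reports true and the second reports false. The exact error message depends on the parser implementation in use.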


See also:

XML Processing in Java

DOM interface

StAX interface

JAXB