A DOM parser is slower than a SAX parser and is not suitable for large files, because it loads the entire document into memory as a tree of nodes before that tree can be navigated. A SAX parser works in an entirely different way. This article discusses XML parsing in Java using a SAX parser.
XML parsing in Java using SAX parser
SAX stands for Simple API for XML. A SAX parser consumes less memory than a DOM parser because it reads the document sequentially and reports what it finds through callback methods instead of building a tree in memory. Let's begin learning about XML parsing in Java using a SAX parser.
The main callback methods of the org.xml.sax.helpers.DefaultHandler class are given below.
1) startDocument() – Called once at the start of the XML document.
2) endDocument() – Called once at the end of the XML document.
3) startElement() – Called at the start of each element.
4) endElement() – Called at the end of each element.
5) characters() – Called with the text content found between the start and end tags. The parser may deliver the text of one element in more than one call.
The DefaultHandler class plays an important role in parsing the XML file. The details become clearer once we work through a sample program.
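The order in which these callbacks fire can be seen with a small tracing handler. The following is a minimal sketch and not part of the sample program: the class names CallbackOrderDemo and TracingHandler, the trace helper, and the one-element input document are all assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class CallbackOrderDemo {

    // Hypothetical handler that records each callback so the order can be inspected.
    static class TracingHandler extends DefaultHandler {
        final List<String> events = new ArrayList<>();

        @Override
        public void startDocument() {
            events.add("startDocument");
        }

        @Override
        public void endDocument() {
            events.add("endDocument");
        }

        @Override
        public void startElement(String uri, String localName, String qName,
                Attributes attributes) {
            events.add("startElement:" + qName);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            events.add("endElement:" + qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Skip whitespace-only text so only real content is recorded.
            String text = new String(ch, start, length).trim();
            if (!text.isEmpty()) {
                events.add("characters:" + text);
            }
        }
    }

    // Parses an XML string held in memory and returns the recorded callbacks.
    // Checked exceptions are wrapped to keep the sketch short.
    public static List<String> trace(String xml) {
        try {
            TracingHandler handler = new TracingHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.events;
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
    }

    public static void main(String[] args) {
        // Assumed input, chosen only to keep the trace short.
        trace("<student><id>1</id></student>").forEach(System.out::println);
    }
}
```

For this input the recorded events are startDocument, startElement:student, startElement:id, characters:1, endElement:id, endElement:student and endDocument, in that order.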
Reading XML using SAX Parser
The input for this example is an XML file named student.xml. It holds the id, name, class and division elements of a student, which are the elements the handler below looks for.
As already discussed, a DefaultHandler instance is needed to parse with a SAX parser. We create a custom handler class by extending DefaultHandler. Let us look at StudentXMLHandler.java first.
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class StudentXMLHandler extends DefaultHandler {

    // Text collected for the element currently being read.
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startDocument() {
        System.out.println("Start of document");
    }

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) throws SAXException {
        // A new element begins, so discard any text gathered so far.
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may deliver one element's text in several chunks, so append each one.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // The element is complete, so its full text is now available.
        String value = text.toString().trim();
        if (qName.equalsIgnoreCase("id")) {
            System.out.println("Id = " + value);
        } else if (qName.equalsIgnoreCase("name")) {
            System.out.println("Name = " + value);
        } else if (qName.equalsIgnoreCase("class")) {
            System.out.println("Class = " + value);
        } else if (qName.equalsIgnoreCase("division")) {
            System.out.println("Division = " + value);
        }
        text.setLength(0);
    }

    @Override
    public void endDocument() {
        System.out.println("End of document");
    }
}
As discussed, the callback methods do the real work, and we have overridden the ones we need. The characters() method receives the text between the start and end tags of an element. Now let us see SAXParserSample.java, which contains the main method.
import java.io.File;
import java.io.IOException;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SAXParserSample {

    public void readXML() {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser parser = factory.newSAXParser();
            DefaultHandler studentHandler = new StudentXMLHandler();
            System.out.println("Starting parsing");
            // Pass a File rather than a plain String: the String overload of parse()
            // expects a URI, and this path contains backslashes and a space.
            parser.parse(new File("C:\\Users\\My PC\\Projects\\Sample\\files\\student.xml"), studentHandler);
            System.out.println("Done");
        } catch (ParserConfigurationException | SAXException | IOException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        SAXParserSample sample = new SAXParserSample();
        sample.readXML();
    }
}
The StudentXMLHandler object is passed to the parser along with the input XML file. Now let's verify the output.
Output
Starting parsing
Start of document
Id = 1
Name = Bijoy
Class = 10
Division = A
End of document
Done
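To try the same idea without a file on disk, the document can be parsed from a string and the values collected instead of printed. This is only a sketch under stated assumptions: the XML layout in SAMPLE_XML (a students root with one student element) is a guess chosen to be consistent with the output above, not the actual student.xml, and the names InMemoryStudentDemo, FieldCollector and parseFields are hypothetical. The text is accumulated in a StringBuilder because characters() may be called more than once for a single element.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class InMemoryStudentDemo {

    // Assumed input: the element layout is illustrative, not the original file.
    static final String SAMPLE_XML =
            "<students>"
          + "<student>"
          + "<id>1</id><name>Bijoy</name><class>10</class><division>A</division>"
          + "</student>"
          + "</students>";

    // Collects the text of each non-empty element into a map keyed by element name.
    // With several student elements, later values would overwrite earlier ones.
    static class FieldCollector extends DefaultHandler {
        final Map<String, String> fields = new LinkedHashMap<>();
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName,
                Attributes attributes) {
            text.setLength(0); // start fresh for the new element
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length); // may arrive in several chunks
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            String value = text.toString().trim();
            if (!value.isEmpty()) {
                fields.put(qName, value);
            }
            text.setLength(0);
        }
    }

    // Parses the given XML string and returns the collected fields.
    // Checked exceptions are wrapped to keep the sketch short.
    public static Map<String, String> parseFields(String xml) {
        try {
            FieldCollector handler = new FieldCollector();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.fields;
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
    }

    public static void main(String[] args) {
        parseFields(SAMPLE_XML)
                .forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Returning a map rather than printing inside the handler keeps the parsing logic separate from what is done with the values, which also makes the handler easier to test.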