International Journal of Computer Trends and Technology

An Efficient Classification Approach for the XML Documents

	International Journal of Computer Trends and Technology (IJCTT)
	© - Issue 2013 by IJCTT Journal
	Volume-4 Issue-3
	Year of Publication : 2013
	Authors: Navya sree.Yarramsetti, G.Siva Nageswara Rao

MLA

Navya sree.Yarramsetti, G.Siva Nageswara Rao. "An Efficient Classification Approach for the XML Documents." International Journal of Computer Trends and Technology (IJCTT), V4(3):362-366, Issue 2013. ISSN 2231-2803. www.ijcttjournal.org. Published by Seventh Sense Research Group.

Abstract: Extensible Markup Language (XML) has become a standard format for data representation on the Internet. An XML document usually organizes a set of textual data according to a predefined logical structure. Because of this inherent structure, conventional text classification methods cannot be applied to XML documents directly. In this paper, we address learning with XML documents from three research areas. The first is knowledge representation: we use a method based on a typed higher-order logic formalism, and the main focus is how to represent an XML document with higher-order logic terms that capture both its content and its structure. The second is symbolic machine learning: we present a new decision-tree learning algorithm driven by the precision/recall breakeven point (PRDT) for the XML document classification problem. A precision/recall heuristic is used because XML documents are closely related to text documents. The third is semi-supervised learning: we present an algorithm based on PRDT and the co-training framework. Preliminary results show that the framework produces comprehensible theories and performs well with both learning techniques.
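
The precision/recall breakeven point named in the abstract is the value at which precision and recall are equal as the decision threshold varies. The following is a minimal sketch of how that measure can be approximated for a set of scored documents. It is not the PRDT algorithm itself, whose details the abstract does not give; the function names, the toy labels, and the scores are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def precision_recall(labels: List[int], scores: List[float],
                     threshold: float) -> Tuple[float, float]:
    """Precision and recall when documents scoring >= threshold are predicted positive."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
    fp = sum(1 for p, y in zip(predicted, labels) if p and y == 0)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def breakeven_point(labels: List[int], scores: List[float]) -> float:
    """Approximate the breakeven point.

    Every observed score is tried as a threshold; the threshold where
    precision and recall are closest is kept, and their mean is returned.
    """
    best_gap, best_value = None, 0.0
    for t in sorted(set(scores), reverse=True):
        p, r = precision_recall(labels, scores, t)
        gap = abs(p - r)
        if best_gap is None or gap < best_gap:
            best_gap, best_value = gap, (p + r) / 2
    return best_value


# Hypothetical toy data: 1 = document belongs to the class, 0 = it does not.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
print(round(breakeven_point(labels, scores), 3))
```

On this toy data, precision and recall coincide at the threshold 0.7, where the three top-scored documents are predicted positive and two of them are correct.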

Keywords: precision/recall, co-training, machine learning, knowledge representation, semi-supervised learning.
