
An Ontology-Based Learnable Focused Crawler Using Hybrid Latent Semantic Indexing in Data Mining

Pravendra Kumar

Abstract


In this paper, we propose an ontological model of an information system that provides precise definitions of fundamental concepts such as Query, Sub-query, and Coupling. Queries are mapped into the model's semantic space, and documents are retrieved using a similarity model. The performance of OLC (Ontological Semantic Indexing) in document retrieval is investigated and compared with traditional term-matching techniques, with the help of latent semantic indexing (LSI), LU decomposition (LUD), the Cholesky transform, and natural language processing, which reduces redundancy in search through stemming and stop-word processing. Singular value decomposition is then applied so that records and information can be fetched from documents that are the same in meaning and texture.

Keywords: Ontology Crawler, LSI, NLP, Cholesky Transform, SVM, LUD
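
The pipeline outlined in the abstract (stop-word removal, stemming, singular value decomposition, and similarity-based retrieval) can be illustrated with a minimal sketch of plain latent semantic indexing. This is not the paper's OLC method and it omits the LUD and Cholesky steps. The stop-word list, the `naive_stem` helper, the toy corpus `DOCS`, and all function names are assumptions introduced only for illustration.

```python
import re
import numpy as np

# Tiny illustrative stop-word list (an assumption; not taken from the paper).
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "for", "on", "to", "from", "with"}

# Hypothetical toy corpus: two crawling documents and two linear-algebra documents.
DOCS = [
    "Focused crawlers download web pages on a topic",
    "A crawler ranks links for topic-specific crawling",
    "Singular value decomposition factors a matrix",
    "Matrix factors from singular value analysis",
]


def naive_stem(word):
    """Crude suffix stripper standing in for a real stemmer (hypothetical helper)."""
    for suffix in ("ing", "ers", "er", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    """Lowercase, split on non-letters, drop stop words, then stem."""
    words = re.findall(r"[a-z]+", text.lower())
    return [naive_stem(w) for w in words if w not in STOP_WORDS]


def build_term_document_matrix(docs):
    """Return a (terms x documents) raw count matrix and the term -> row index map."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    index = {t: i for i, t in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(tokenized):
        for t in doc:
            matrix[index[t], j] += 1.0
    return matrix, index


def query_counts(query, index):
    """Term-count vector for a query; terms outside the vocabulary are ignored."""
    q = np.zeros(len(index))
    for t in tokenize(query):
        if t in index:
            q[index[t]] += 1.0
    return q


def lsi_fit(matrix, k):
    """Truncated SVD: returns U_k, the top-k singular values, and document coordinates V_k."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    if k > np.count_nonzero(s > 1e-12):
        raise ValueError("k exceeds the numerical rank of the matrix")
    return u[:, :k], s[:k], vt[:k, :].T


def fold_in(q, u_k, s_k):
    """Map a query count vector into the latent space: q^T U_k S_k^{-1}."""
    return (q @ u_k) / s_k


def cosine_scores(query_vec, doc_vecs):
    """Cosine similarity between one query vector and each row of doc_vecs."""
    norms = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec)
    return (doc_vecs @ query_vec) / np.where(norms == 0, 1.0, norms)


if __name__ == "__main__":
    matrix, index = build_term_document_matrix(DOCS)
    u_k, s_k, doc_vecs = lsi_fit(matrix, k=2)
    q = query_counts("download pages", index)
    print("term matching:", np.round(cosine_scores(q, matrix.T), 2))
    print("LSI (k=2):    ", np.round(cosine_scores(fold_in(q, u_k, s_k), doc_vecs), 2))
```

On this toy corpus the second document shares no term with the query "download pages", so term matching scores it zero, whereas the rank-2 latent space places it next to the first document. A real system would likely use a proper stemmer and term weighting such as TF-IDF instead of raw counts.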

Cite this Article

Kumar Pravendra and Solanki Arun. An Ontology-Based Learnable Focused Crawler Using Hybrid Latent Semantic Indexing in Data Mining. Journal of Web Engineering & Technology. 2015; 2(2): 11–17p.



