Sunday 1 January 2017

Web Information Extraction Using Deep Learning Algorithm

Vol. 9 Issue 2
Year: 2014
Issue: Oct-Dec
Title: Web Information Extraction Using Deep Learning Algorithm
Author Name: J. Sharmila Jahangir and Dr. A. Subramani
Synopsis:
Research on web mining is becoming more important because of the large amount of data managed through the internet. Web usage is growing rapidly, and a dedicated system is needed to handle such a large amount of data on the web. Web mining is classified into three major divisions: web content mining, web usage mining and web structure mining. Tak-Lam Wong and Wai Lam have proposed a web content mining approach based on Bayesian networks, in which they discuss web information extraction and attribute discovery. Inspired by their research, the authors propose a web content mining approach based on a deep learning algorithm. According to the authors, the deep learning algorithm has an advantage over Bayesian networks because a Bayesian network, unlike the proposed technique, is not built on a learning architecture. In the proposed approach, three features are considered for extracting web content: a concept feature, which deals with semantic relations on the web; a format feature, which deals with the format of the content; and a title feature, which deals with the web page title. These features produce model parameters, which are given as input to the deep learning algorithm. The algorithm then extracts content according to the input provided. Many approaches have been developed in the area of Web Information Extraction (IE), which is concerned with harvesting useful information from web pages for further analysis. Learning algorithms such as those for Deep Belief Networks have recently been proposed to tackle this problem with notable success. In this paper, a method is proposed for information extraction from the Web using a deep learning algorithm.
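
The synopsis does not say how the three features are computed, so the following is only a minimal sketch of what a three-part feature vector for one text fragment might look like. Every definition in it is an assumption made for illustration, not the authors' method: the concept feature is approximated by overlap with a hypothetical concept vocabulary, the format feature by whether the fragment sits in an emphasis tag, and the title feature by word overlap with the page title. The names `tokenize`, `feature_vector` and `EMPHASIS_TAGS` are hypothetical.

```python
import re

# Hypothetical set of tags treated as "emphasized" formatting (an assumption).
EMPHASIS_TAGS = {"h1", "h2", "h3", "b", "strong", "em"}


def tokenize(text):
    """Split text into lowercase alphanumeric tokens (deliberately simple)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def feature_vector(fragment, tag, page_title, concept_terms):
    """Return [concept, format, title] scores for one text fragment.

    These scoring rules are illustrative stand-ins, not the paper's definitions.
    """
    tokens = tokenize(fragment)
    token_set = set(tokens)
    title_set = set(tokenize(page_title))

    # Concept feature (assumed): share of tokens found in a concept vocabulary.
    concept = sum(t in concept_terms for t in tokens) / len(tokens) if tokens else 0.0

    # Format feature (assumed): 1.0 if the fragment is inside an emphasis tag.
    fmt = 1.0 if tag.lower() in EMPHASIS_TAGS else 0.0

    # Title feature (assumed): Jaccard overlap between fragment and title words.
    union = token_set | title_set
    title = len(token_set & title_set) / len(union) if union else 0.0

    return [concept, fmt, title]


if __name__ == "__main__":
    vec = feature_vector("Canon EOS camera", "b", "Canon camera review", {"camera", "lens"})
    print(vec)
```

Vectors of this kind, one per fragment, are what a learning model such as a Deep Belief Network would take as input; the model itself is left out here because the synopsis gives no details about its architecture or training.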
