International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 98 - Issue 9
Published: July 2014
Authors: Kopal Maheshwari, Namrata Tapaswi
DOI: 10.5120/17215-7448
Kopal Maheshwari, Namrata Tapaswi. Design and Implementation of Hidden based Web Retrieval using Innovative Vision-based Segmentation. International Journal of Computer Applications. 98, 9 (July 2014), 42-47. DOI=10.5120/17215-7448
@article{ 10.5120/17215-7448,
author = { Kopal Maheshwari and Namrata Tapaswi },
title = { Design and Implementation of Hidden based Web Retrieval using Innovative Vision-based Segmentation },
journal = { International Journal of Computer Applications },
year = { 2014 },
volume = { 98 },
number = { 9 },
pages = { 42-47 },
doi = { 10.5120/17215-7448 },
publisher = { Foundation of Computer Science (FCS), NY, USA }
}
%0 Journal Article
%D 2014
%A Kopal Maheshwari
%A Namrata Tapaswi
%T Design and Implementation of Hidden based Web Retrieval using Innovative Vision-based Segmentation
%J International Journal of Computer Applications
%V 98
%N 9
%P 42-47
%R 10.5120/17215-7448
%I Foundation of Computer Science (FCS), NY, USA
We integrate the information extracted from conference websites to obtain clean, high-quality academic data. This research makes the following contributions. We propose a novel vision-based page segmentation algorithm, which uses the DOM tree to compensate for the information loss of classical vision-based segmentation algorithms. We transform the difficult problem of extracting information from conference Web pages into a classification problem, and classify text blocks into predefined categories according to vision, keyword, and text content information. We improve the classification quality by post-processing. Our experimental results on real-world datasets show that our method is highly effective and efficient for extracting academic information from conference pages.
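
The pipeline outlined in the abstract (segment the page into text blocks, classify each block, then post-process the labels) can be illustrated with a minimal sketch. This is not the authors' algorithm: it uses only the DOM structure and omits the visual cues the paper relies on, and the block-level tag set, the category names, the keyword lists, and the smoothing rule are all hypothetical choices made for illustration.

```python
from html.parser import HTMLParser

# Hypothetical block-level tags used as segment boundaries (an assumption).
BLOCK_TAGS = {"div", "p", "li", "td", "h1", "h2", "h3"}

# Hypothetical categories and keyword lists (not taken from the paper).
KEYWORDS = {
    "date": ("deadline", "submission", "notification"),
    "topic": ("topics", "areas of interest"),
    "committee": ("chair", "committee"),
}


class BlockSegmenter(HTMLParser):
    """Collect text blocks together with the DOM path that contains them."""

    def __init__(self):
        super().__init__()
        self.path = []    # currently open block-level tags
        self.blocks = []  # (dom_path, text) pairs

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.path.append(tag)

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS and self.path and self.path[-1] == tag:
            self.path.pop()

    def handle_data(self, data):
        text = " ".join(data.split())  # normalize whitespace
        if text:
            self.blocks.append(("/".join(self.path), text))


def classify(text):
    """Assign the first category whose keywords occur in the text."""
    lowered = text.lower()
    for label, words in KEYWORDS.items():
        if any(word in lowered for word in words):
            return label
    return "other"


def post_process(labels):
    """Relabel an 'other' block that sits between two blocks sharing a label."""
    fixed = list(labels)
    for i in range(1, len(labels) - 1):
        if labels[i] == "other" and labels[i - 1] == labels[i + 1] != "other":
            fixed[i] = labels[i - 1]
    return fixed


def extract(html):
    """Segment, classify, and post-process; returns (dom_path, text, label)."""
    segmenter = BlockSegmenter()
    segmenter.feed(html)
    segmenter.close()
    labels = post_process([classify(text) for _, text in segmenter.blocks])
    return [(path, text, label)
            for (path, text), label in zip(segmenter.blocks, labels)]
```

In this sketch, a list item such as "June 15" that contains no keyword of its own would still be labeled as a date when the items before and after it are dates, which is the kind of correction a post-processing step is meant to provide. A real system would replace the keyword rules with a trained classifier and add layout features such as position and font size.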