International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 98 - Issue 9
Published: July 2014
Authors: Kopal Maheshwari, Namrata Tapaswi
Kopal Maheshwari, Namrata Tapaswi. Design and Implementation of Hidden based Web Retrieval using Innovative Vision-based Segmentation. International Journal of Computer Applications. 98, 9 (July 2014), 42-47. DOI=10.5120/17215-7448
@article{ 10.5120/17215-7448, author = { Kopal Maheshwari and Namrata Tapaswi }, title = { Design and Implementation of Hidden based Web Retrieval using Innovative Vision-based Segmentation }, journal = { International Journal of Computer Applications }, year = { 2014 }, volume = { 98 }, number = { 9 }, pages = { 42-47 }, doi = { 10.5120/17215-7448 }, publisher = { Foundation of Computer Science (FCS), NY, USA } }
%0 Journal Article %D 2014 %A Kopal Maheshwari %A Namrata Tapaswi %T Design and Implementation of Hidden based Web Retrieval using Innovative Vision-based Segmentation %J International Journal of Computer Applications %V 98 %N 9 %P 42-47 %R 10.5120/17215-7448 %I Foundation of Computer Science (FCS), NY, USA
We integrate the information extracted from conference websites to obtain clean, high-quality academic data. This research makes the following contributions. First, we propose a novel vision-based page segmentation algorithm that uses the DOM tree to compensate for the information loss of classical vision-based segmentation algorithms. Second, we transform the difficult problem of conference Web information extraction into a classification problem, and classify text blocks into predefined categories according to visual, keyword, text, and content information. Third, we improve the classification quality by post-processing. Our experimental results on real-world datasets show that our method is highly effective and efficient for extracting academic information from conference pages.
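The abstract does not specify the features, categories, or classifier used, so the following is only a minimal sketch of the general idea: treating extraction as classification of text blocks, followed by a post-processing pass. Every name here (`TextBlock`, `KEYWORDS`, `classify`, `post_process`), the thresholds, and the category labels are hypothetical and chosen for illustration; a simple rule-based classifier stands in for whatever model the authors actually used.

```python
from dataclasses import dataclass

# Hypothetical category keywords; the paper's real feature set is not given.
KEYWORDS = {
    "date": ("deadline", "submission", "notification"),
    "topic": ("topics", "areas of interest"),
    "committee": ("chair", "committee"),
}


@dataclass
class TextBlock:
    text: str         # text content of the segmented block
    font_size: float  # visual feature, assumed to come from the rendered page
    top: int          # vertical position in pixels, also a visual feature


def classify(block: TextBlock) -> str:
    """Toy rule-based classifier combining visual and keyword cues."""
    lowered = block.text.lower()
    # Visual cue (assumed thresholds): large text near the top is a title.
    if block.font_size >= 24 and block.top < 200:
        return "title"
    # Keyword cue: first category with a keyword present in the text.
    for label, words in KEYWORDS.items():
        if any(word in lowered for word in words):
            return label
    return "other"


def post_process(labels: list[str]) -> list[str]:
    """Relabel an 'other' block that lies between two blocks of one category."""
    fixed = list(labels)
    for i in range(1, len(labels) - 1):
        same_neighbours = labels[i - 1] == labels[i + 1] != "other"
        if labels[i] == "other" and same_neighbours:
            fixed[i] = labels[i - 1]
    return fixed


if __name__ == "__main__":
    # Made-up blocks standing in for the output of page segmentation.
    blocks = [
        TextBlock("Example Conference 2014", 30, 50),
        TextBlock("Submission deadline", 12, 300),
        TextBlock("1 March 2014", 12, 320),
        TextBlock("Notification of acceptance", 12, 340),
    ]
    raw = [classify(b) for b in blocks]
    print(raw)
    print(post_process(raw))
```

In this sketch the bare date string carries no keyword and is first labelled `other`; the post-processing pass then assigns it the category of its neighbours, which illustrates how a context-based correction step could improve per-block classification.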