Research Article

Design of Web Ranking Module using Genetic Algorithm

by Vikas Thada, Vivek Jaglan
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 97 - Issue 9
Published: July 2014
Authors: Vikas Thada, Vivek Jaglan
10.5120/17038-7346

Vikas Thada, Vivek Jaglan. Design of Web Ranking Module using Genetic Algorithm. International Journal of Computer Applications. 97, 9 (July 2014), 43-48. DOI=10.5120/17038-7346

@article{10.5120/17038-7346,
  author    = {Vikas Thada and Vivek Jaglan},
  title     = {Design of Web Ranking Module using Genetic Algorithm},
  journal   = {International Journal of Computer Applications},
  year      = {2014},
  volume    = {97},
  number    = {9},
  pages     = {43-48},
  doi       = {10.5120/17038-7346},
  publisher = {Foundation of Computer Science (FCS), NY, USA}
}
%0 Journal Article
%D 2014
%A Vikas Thada
%A Vivek Jaglan
%T Design of Web Ranking Module using Genetic Algorithm
%J International Journal of Computer Applications
%V 97
%N 9
%P 43-48
%R 10.5120/17038-7346
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Crawling is the process by which web search engines collect data from the web. Focused crawling is a special type of crawling in which the crawler looks for information related to a predefined topic [1]. This paper presents a method for finding the most relevant document among a set of documents for a given set of keywords. Relevance is checked with the Rogers-Tanimoto, Mountford, and Baroni-Urbani/Buser similarity coefficients. The method uses a genetic algorithm to show that the average similarity of documents to the query increases when the probability of mutation is low and the probability of crossover is high. It also compares the performance of the different similarity coefficients on the same set of documents and ranks the documents whose relevancy is highest among the three coefficients.
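The three coefficients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that the query and each document are encoded as equal-length binary keyword vectors, it uses the commonly cited textbook forms of the coefficients, and it returns 0.0 when a denominator is zero. The function names (contingency, rogers_tanimoto, mountford, baroni_urbani_buser, rank) are hypothetical, and the genetic algorithm itself is not shown.

```python
from math import sqrt


def contingency(query, doc):
    """Count (a, b, c, d) for two equal-length binary (0/1) vectors."""
    if len(query) != len(doc):
        raise ValueError("vectors must have the same length")
    a = sum(1 for q, x in zip(query, doc) if q and x)      # keyword in both
    b = sum(1 for q, x in zip(query, doc) if q and not x)  # only in the query
    c = sum(1 for q, x in zip(query, doc) if not q and x)  # only in the document
    d = len(query) - a - b - c                             # in neither
    return a, b, c, d


def rogers_tanimoto(a, b, c, d):
    # (a + d) / (a + d + 2(b + c))
    denom = a + d + 2 * (b + c)
    return (a + d) / denom if denom else 0.0  # zero fallback is an assumption


def mountford(a, b, c, d):
    # 2a / (a(b + c) + 2bc); d is unused but kept for a uniform signature
    denom = a * (b + c) + 2 * b * c
    return 2 * a / denom if denom else 0.0  # zero fallback is an assumption


def baroni_urbani_buser(a, b, c, d):
    # (sqrt(ad) + a) / (sqrt(ad) + a + b + c)
    root = sqrt(a * d)
    denom = root + a + b + c
    return (root + a) / denom if denom else 0.0  # zero fallback is an assumption


def rank(query, docs, coefficient):
    """Return document names sorted by descending similarity to the query."""
    scored = [(coefficient(*contingency(query, vec)), name)
              for name, vec in docs.items()]
    return [name for _, name in sorted(scored, key=lambda p: p[0], reverse=True)]
```

For example, rank([1, 1, 0, 1, 0], {"A": [1, 1, 0, 0, 0], "B": [0, 1, 1, 0, 1]}, rogers_tanimoto) places "A" ahead of "B". In this form the Mountford value is not confined to the interval [0, 1], so its scores are not directly comparable in magnitude with those of the other two coefficients.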

References
  • B. Novak, "A Survey Of Focused Web Crawling Algorithms", Proceedings of SIKDD, pp. 55–58, 12-15 Oct 2004.
  • http://www.wikipedia.org/Web_Crawler
  • B. Klabbankoh, O. Pinngern, "Applied Genetic Algorithms in Information Retrieval", Proceeding of IEEE, pp. 702-711, Nov 2004.
  • S. S. Satya and P. Simon, "Review on Applicability of Genetic Algorithm to Web Search", International Journal of Computer Theory and Engineering, vol. 1, no. 4, pp. 450-455, 2009.
  • M. Shokouhi, P. Chubak, Z. Raeesy, "Enhancing Focused Crawling with Genetic Algorithms", Vol: 4-6, pp. 503-508, 2005.
  • www.sequentix.de/gelquest/help/distance_measures.htm?
  • V. Consonni and R. Todeschini, "New Similarity Coefficients for Binary Data", Communications in Mathematical and in Computer Chemistry, pp. 581-592, 2012.
  • H. Wolda, "Similarity Indices, Sample Size and Diversity", Oecologia, Springer-Verlag, pp. 296-302, 1981.
  • M. A. Kauser, M. Nasar, S. K. Singh, "A Detailed Study on Information Retrieval using Genetic Algorithm", Journal of Industrial and Intelligent Information, vol. 1, no. 3, pp. 122-127, Sep 2013.
  • http://en.wikipedia/wiki/Fitness_Proportionate_Selection
  • J. R. Koza, "Survey Of Genetic Algorithms And Genetic Programming", Proceedings of the Wescon, pp. 589-595, 1995.
  • http://textalyser.net/
  • http://www.webconfs.com/keyword-density-checker.php
  • V. Thada, V. Jaglan, "Use of Genetic Algorithm in Web Information Retrieval", International Journal of Emerging Technologies in Computational and Applied Sciences, vol. 7, no. 3, pp. 278-281, Feb 2014.
Index Terms
Computer Science
Information Sciences
Keywords

Relevancy, similarity coefficients, genetic
