A NEW APPROACH TO ORGANIZE THE RESULTS OF SEARCHING THE WEB, USING A COMBINATION OF RANKING AND GENETIC STRUCTURE-BASED CLUSTERING
Vol. 2, Jan-Dec 2016 | Page: 80-94
Abstract
Web mining means searching the Web to find specific information, and it should return the best possible results to the user. Two of the most effective techniques in this area are clustering and ranking Web pages. The method proposed here combines both. First, the Web graph is clustered in two phases based on structural equivalence; next, each cluster is assigned a score according to its value; then, every page in the clusters is ranked; finally, the final rank of each Web page is the product of its cluster's score and its own page rank. Web pages are presented to the user in order of final rank. Comparison of the proposed algorithm (GCRM) with other methods indicates that it performs well at finding high-quality Web pages. Since quality is the main criterion in Web mining, GCRM focuses on improving the quality of the pages found, and the results indicate that it succeeds in this respect.
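The combination step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' GCRM implementation: the abstract does not specify the genetic clustering procedure, the cluster scoring rule, or the page ranking function. The sketch therefore assumes that cluster assignments and cluster scores are already available (the `cluster_of` and `cluster_score` inputs are hypothetical), and it swaps in a plain power-iteration PageRank as the per-page ranking. All function names and the toy graph are illustrative only.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over {page: [pages it links to]}.

    Stand-in for the per-page ranking step; the abstract does not say
    which ranking function GCRM actually uses.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outs in links.items():
            # A page without out-links spreads its rank over all pages.
            targets = outs if outs else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


def final_ranks(page_rank, cluster_of, cluster_score):
    """Final rank = score of the page's cluster * the page's own rank."""
    return {p: cluster_score[cluster_of[p]] * r for p, r in page_rank.items()}


def order_results(final):
    """Pages sorted by descending final rank, as shown to the user."""
    return sorted(final, key=final.get, reverse=True)


# Toy Web graph (hypothetical data).
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
# Assumed output of the clustering and cluster-scoring phases.
cluster_of = {"a": 0, "b": 0, "c": 0, "d": 1}
cluster_score = {0: 1.0, 1: 0.5}

page_rank = pagerank(links)
final = final_ranks(page_rank, cluster_of, cluster_score)
ordering = order_results(final)
```

Because the final rank is a product, a page in a low-scoring cluster is pushed down the result list even if its own rank is moderate, which is the effect the abstract attributes to combining clustering with ranking.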
Rakesh Kumar Giri
Research Scholar, Department of Computer Science & Engineering, Sunrise University, Alwar
Dr. Sudhir Dawara
Prof., I P University, Delhi
Received: 10-05-2016, Accepted: 15-06-2016, Published Online: 28-06-2016