Printed from https://ideas.repec.org/a/igg/jaec00/v10y2019i3p34-39.html

2 Way Crawling: A Review

Author

Listed:
  • Mayuri Anantrao Deshmukh

    (MIT College of Engineering Aurangabad, Pune, India)

Abstract

As the deep web grows at a very fast pace, there has been increased interest in techniques that help efficiently locate and check deep web interfaces. Given the large volume of web resources, it is important to achieve both wide coverage and high efficiency. To this end, we propose a multistage framework, Smart crawler: a two-stage crawler that efficiently harvests deep web interfaces. In the first stage, the crawler performs site-based searching for center pages and avoids visiting non-relevant sites. In the second stage, an adaptive link ranking technique helps search within relevant sites by excavating the most relevant links. To eliminate bias toward visiting only some of the highly relevant links hidden in web directories, a link tree data structure is designed to achieve wider coverage of a website. Experimental results on different domains show the agility and accuracy of the proposed framework, which retrieves deep-web interfaces from a large volume of sites and achieves higher harvest rates than other crawlers.
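
The abstract does not spell out how the link tree is built or how links are ranked, so the following is only a minimal sketch of the idea under stated assumptions: links are grouped by the first directory in their URL path, a toy keyword count stands in for the ranking (it is fixed, not adaptive), and links are then taken in turn from each branch so that no single directory dominates. The names `build_link_tree`, `score`, and `select_links` are hypothetical and do not come from the paper.

```python
from collections import defaultdict
from urllib.parse import urlparse


def build_link_tree(urls):
    """Group URLs by their first path directory (a one-level 'link tree').

    Assumption: the first path segment is a reasonable proxy for a
    branch of the site; the paper's actual tree may be deeper.
    """
    tree = defaultdict(list)
    for url in urls:
        segments = [s for s in urlparse(url).path.split("/") if s]
        # URLs with no directory component hang directly off the root ("").
        branch = segments[0] if len(segments) > 1 else ""
        tree[branch].append(url)
    return tree


def score(url, keywords):
    """Toy relevance score (an assumption): keywords found in the URL."""
    lowered = url.lower()
    return sum(1 for kw in keywords if kw in lowered)


def select_links(urls, keywords, budget):
    """Pick up to `budget` links, taking turns across tree branches."""
    tree = build_link_tree(urls)
    # Rank links inside each branch, best first.
    queues = [sorted(links, key=lambda u: -score(u, keywords))
              for _, links in sorted(tree.items())]
    # Visit the most promising branches first.
    queues.sort(key=lambda q: -score(q[0], keywords))
    selected = []
    while len(selected) < budget and any(queues):
        for queue in queues:
            if queue and len(selected) < budget:
                selected.append(queue.pop(0))
    return selected


if __name__ == "__main__":
    links = [
        "http://example.com/books/search",
        "http://example.com/books/advanced-search",
        "http://example.com/books/about",
        "http://example.com/cars/search",
        "http://example.com/contact",
    ]
    print(select_links(links, ["search"], 2))
```

With a budget of two, a plain score-sorted list would return both `books` links, whereas the round-robin over branches returns one link from `books` and one from `cars`, which is the kind of wider coverage the abstract attributes to the link tree.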

Suggested Citation

  • Mayuri Anantrao Deshmukh, 2019. "2 Way Crawling: A Review," International Journal of Applied Evolutionary Computation (IJAEC), IGI Global, vol. 10(3), pages 34-39, July.
  • Handle: RePEc:igg:jaec00:v:10:y:2019:i:3:p:34-39

    Download full text from publisher

    File URL: http://services.igi-global.com/resolvedoi/resolve.aspx?doi=10.4018/IJAEC.2019070105
    Download Restriction: no