Beginner's question

Discussion in 'ASP.NET Starter Kits' started by Phaedos, Jan 25, 2009.

  1. Hi, I'm pretty new to all this, so sorry if this is a bit dumb. At the moment I have a desktop app that I made with C#, which relies on an external screen-scraping program (from www.screen-scraper.com). This external program can run as a service, so my C# program just invokes it, makes it scrape the relevant websites, gets the data it has scraped, processes it, outputs it to screen, and so on. All of this works great, so now I am building an ASP.NET website that will basically do the same thing. However, I seem to have hit a bit of a stumbling block, since I obviously can't just install the external screen scraper on DiscountASP.NET, so I'm wondering how I should go about having it serve my website?

    The only thing I can think of so far is to buy another home machine, have the scraper run in an endless loop on it scraping the websites, and have it update a SQL database. Then have my website use that database?

    Just thinking there must be a better way than this? How are websites that dynamically update their content with info scraped off other websites normally made and hosted?

    Cheers for any help
     
  2. I'm already pulling the HTTP source; basically my website works like this:

    I use a program called screen-scraper (from the company ekiwi; not created by me, just a program I've installed on my laptop). Within this program I have written various scripts for specific websites to scrape data from. I have then written my own C# program that invokes this screen-scraper program, makes it loop over the scrapes indefinitely, takes the data from it, and writes it into a SQL Express database (also currently on my home machine).
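
    Roughly, my C# side is just an endless loop, something like this sketch (RunScrape is only a placeholder for however the screen-scraper program is actually invoked, not its real API, and the connection string and table names are made up):

        using System;
        using System.Data.SqlClient;
        using System.Threading;

        class ScrapeWorker
        {
            static void Main()
            {
                // Placeholder connection string to the local SQL Express instance
                const string connStr = @"Data Source=.\SQLEXPRESS;Initial Catalog=ScrapeDb;Integrated Security=True";
                string[] sites = { "site1", "site2" /* ...the ~100 site configs... */ };

                while (true)                               // loop over the scrapes forever
                {
                    foreach (string site in sites)
                    {
                        // Placeholder call: ask the external screen-scraper to run
                        // the scripts for this site and hand back the extracted data.
                        string payload = RunScrape(site);

                        using (SqlConnection conn = new SqlConnection(connStr))
                        using (SqlCommand cmd = new SqlCommand(
                            "INSERT INTO ScrapedData (Site, Payload, ScrapedAt) VALUES (@site, @payload, @at)", conn))
                        {
                            cmd.Parameters.AddWithValue("@site", site);
                            cmd.Parameters.AddWithValue("@payload", payload);
                            cmd.Parameters.AddWithValue("@at", DateTime.UtcNow);
                            conn.Open();
                            cmd.ExecuteNonQuery();         // write the scraped data into SQL Express
                        }
                    }
                    Thread.Sleep(TimeSpan.FromMinutes(5)); // pause between passes
                }
            }

            // Stand-in for the real screen-scraper invocation.
            static string RunScrape(string site) { return ""; }
        }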

    My website currently works by accessing this SQL database sitting on my home machine to get its dynamic content. If I left my laptop on all day and all night and kept it connected to the internet, everything would work a charm, but obviously this isn't very practical, and I was hoping I could host this screen-scraping program somewhere and just let it run night and day there.
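
    The website side is just ordinary data access in the code-behind, roughly like this (the connection string is a placeholder that currently points at the SQL Express instance on my home machine, and ResultsGrid is just a GridView I've put on the .aspx page):

        using System;
        using System.Data;
        using System.Data.SqlClient;
        using System.Web.UI;

        public partial class Results : Page
        {
            protected void Page_Load(object sender, EventArgs e)
            {
                // Placeholder connection string; right now it points at my home machine.
                const string connStr = "Data Source=my-home-machine;Initial Catalog=ScrapeDb;User ID=web;Password=secret";

                using (SqlConnection conn = new SqlConnection(connStr))
                using (SqlDataAdapter da = new SqlDataAdapter(
                    "SELECT Site, Payload, ScrapedAt FROM ScrapedData ORDER BY ScrapedAt DESC", conn))
                {
                    DataTable table = new DataTable();
                    da.Fill(table);                  // pull the latest scraped rows

                    ResultsGrid.DataSource = table;  // ResultsGrid is a GridView declared in the .aspx markup
                    ResultsGrid.DataBind();
                }
            }
        }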

    So this is my dilemma. It seems I either 1) buy another PC and just sit it in some room and let it run and run and run, or 2) get a dedicated or VPS-type server, but that would cost a bomb and it would probably be cheaper just to do option 1. Or is there another way?

    Even if I wrote the scraping code myself instead of relying on an external program, it just wouldn't be practical: some site scrapes take about 10 minutes to complete, and I will be scraping possibly 100 sites, all with specific features and layouts, so there is just no way I could put it in the code-behind page to be run when the user requests the page. If I wrote my own scraper, it would just be a clone of this screen-scraper program from ekiwi, and then I'd have the same problem of how to install it and have it run continuously anyway, right?

    So yeah, I would really appreciate any advice from someone who knows more about this.
     
  3. Bruce (DiscountASP.NET Staff)

    I don't think you can do the backend scraping on our server. It may be best to run the task on your own computer and populate the DB.

    Bruce

    DiscountASP.NET
    www.DiscountASP.NET
     
  4. Bruce (DiscountASP.NET Staff)

    Do you mean you want to pull the HTTP source of another site, or 'take a screenshot' of another website?

    Bruce

    DiscountASP.NET
    www.DiscountASP.NET
     
