I'm working on a website that scrapes product details (names, features, prices, etc.) from various websites, then processes and displays them. I'm considering running an update script every day to keep the data fresh. The pipeline is:
scrape the data
process it
store it in the database
read it from the database and display it
I'm already storing all the data in a SQL schema, but I'm not sure this is the right approach. After each update, all the old records are gone, so if the newly scraped data comes back corrupted somehow, there is nothing to show.
So, is there a common way to archive the old data? Which is more convenient: separate SQL schemas or XML files? Or something else?
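To make the question concrete, here is a minimal sketch of one option that stays inside a single SQL schema: every update is stored as a new run instead of overwriting the old rows, and the display step reads only the most recent run that passed a sanity check. This is an illustration, not a recommendation from any answer. It assumes Python with the standard sqlite3 module, and the table names (scrape_runs, products), the helper functions, and the validity rule in looks_valid are all hypothetical.

```python
import sqlite3


def init_db(conn):
    # Hypothetical layout: one row per scrape run, products tagged with run_id.
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS scrape_runs (
            run_id     INTEGER PRIMARY KEY AUTOINCREMENT,
            scraped_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
            is_valid   INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE IF NOT EXISTS products (
            run_id INTEGER NOT NULL REFERENCES scrape_runs(run_id),
            name   TEXT,
            price  REAL
        );
    """)


def looks_valid(rows):
    # Assumed sanity check: at least one row, every row has a name
    # and a positive price. A real check would be site-specific.
    return bool(rows) and all(
        name and price is not None and price > 0 for name, price in rows
    )


def store_run(conn, rows):
    # Insert a new run next to the old ones; nothing is deleted.
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO scrape_runs (is_valid) VALUES (?)",
            (int(looks_valid(rows)),),
        )
        run_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO products (run_id, name, price) VALUES (?, ?, ?)",
            [(run_id, name, price) for name, price in rows],
        )
    return run_id


def latest_valid_products(conn):
    # The display step reads only the newest run marked as valid.
    return conn.execute("""
        SELECT name, price FROM products
        WHERE run_id = (SELECT MAX(run_id) FROM scrape_runs WHERE is_valid = 1)
        ORDER BY name
    """).fetchall()
```

With this layout, a corrupted scrape is still stored for inspection, but the site keeps showing the previous good run. Old runs could later be pruned or exported by run_id if the table grows too large.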
Source: http://stackoverflow.com/questions/13686474/what-is-the-right-way-of-storing-screen-scraping-data