ASP.NET site recursively HTTP-requesting its own URLs

Eabweuiqovfkw

New Member
Background: I have an ASP.NET MVC web application. I want to capture its user-visible HTML content periodically and persist it somewhere so I can track how the content evolved over time. For example, I want to be able to pull the HTML of the homepage as it existed a year ago. This can be done with some kind of crawler that periodically runs through a list of URLs.

My question: Is it a good idea to have the website itself issue \[code\]HttpWebRequest\[/code\]s to its own URLs? I could launch a \[code\]Timer\[/code\] inside the web application that downloads and stores one URL per hour.

An alternative architecture would be to put the crawler in an external application such as a Windows Service, but that would be a much more complicated setup. In this question I'd rather not explore that option, because I'm trying to get away with a simpler architecture.

What can go wrong if an ASP.NET application requests its own URLs using \[code\]HttpWebRequest\[/code\]?

In pseudo-code:

\[code\]
StartTimer(TimeSpan.FromHours(1), () =>
{
    // requesting a page hosted in the current w3wp process
    var url = "http://localhost/SomePageInTheCurrentW3wpProcess.aspx";
    var data = new WebClient().DownloadString(url);
    Persist(data);
});
\[/code\]

I'm not sure what bad things could happen. I'm thinking of threading and reentrancy problems; I'd also have to be careful with distributed deadlocks and such.
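To make it concrete, here is a minimal sketch of the wiring I have in mind, assuming the timer is started from \[code\]Global.asax\[/code\]. The page URL, the snapshot folder, and the \[code\]Persist\[/code\] method are placeholders, not anything that exists yet:

\[code\]
using System;
using System.IO;
using System.Net;
using System.Threading;
using System.Web;
using System.Web.Hosting;

public class Global : HttpApplication
{
    // Hold a static reference so the timer isn't garbage-collected.
    private static Timer _snapshotTimer;

    protected void Application_Start()
    {
        // Take the first snapshot after an hour, then hourly.
        _snapshotTimer = new Timer(_ => CaptureSnapshot(), null,
                                   TimeSpan.FromHours(1), TimeSpan.FromHours(1));
    }

    private static void CaptureSnapshot()
    {
        try
        {
            // Hypothetical page served by the current w3wp process.
            const string url = "http://localhost/SomePageInTheCurrentW3wpProcess.aspx";
            string html;
            using (var client = new WebClient())
            {
                html = client.DownloadString(url);
            }
            Persist(html);
        }
        catch (Exception)
        {
            // Log and swallow: an unhandled exception on a timer thread
            // would otherwise tear down the worker process.
        }
    }

    private static void Persist(string html)
    {
        // Placeholder persistence: a timestamped file under App_Data.
        var folder = HostingEnvironment.MapPath("~/App_Data/snapshots");
        Directory.CreateDirectory(folder);
        var file = Path.Combine(folder, DateTime.UtcNow.ToString("yyyyMMdd-HHmmss") + ".html");
        File.WriteAllText(file, html);
    }
}
\[/code\]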
 