The framework I wanted should have the following features:
- Proxy any webpage, letting me insert my own scripts into the page
- Not require the server to be configured as a proxy in the browser; instead, I should be able to hit a specific URL to view the page
- Images, CSS, and even AJAX should work just as they would on the original page.
- Dynamically marking pages for automatic scraping, like dapper.net.
- Adding more data to a page and saving it as our version
- Tracking user mouse movements to study user behaviour.
- Walking users through sites
- Collaborative browsing
A rough implementation plan:
- For URLs in img and script src attributes, instead of rewriting the fetched HTML, we may as well have a handler at the root of our domain that redirects to the actual source. Absolute URLs work just fine as they are; relative URLs hit our server and get redirected
- Insert a script at the top of the page that substitutes XMLHttpRequest.open with a version that calls our proxyUrl.
- Use our script to rewrite the href attributes of anchor tags, giving us control over the target pages.
- Use our script to rewrite the targets of form submissions.
- Send cookies set in the browser to our proxy so it can relay them to the actual server.
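The redirect handler for relative URLs might look something like this. This is only a sketch under stated assumptions: the hard-coded base URL and the port are placeholders, and a real version would recover the original page's URL per session.

```typescript
import { createServer } from "node:http";

// Resolve a path that hit our server against the proxied page's
// original base URL, giving the Location value to redirect to.
export function resolveTarget(requestPath: string, originalBase: string): string {
  return new URL(requestPath, originalBase).toString();
}

// Minimal root handler: any request reaching our domain is assumed to
// be a relative asset URL from a proxied page, and we redirect the
// browser to the real origin. The base URL here is an assumption.
const server = createServer((req, res) => {
  const originalBase = "https://example.com/some/page"; // assumed per-session value
  res.writeHead(302, { Location: resolveTarget(req.url ?? "/", originalBase) });
  res.end();
});
// server.listen(8080) would start it; omitted in this sketch.
```

Note that absolute URLs never reach this handler at all, which is exactly the point: only relative references pay the extra round trip.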
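The XMLHttpRequest substitution could be sketched as below. The /proxy?url= route shape is an assumption, not a fixed design; the guard lets the same file load outside a browser.

```typescript
// Rewrite an absolute or relative request URL so it is fetched through
// the proxy. The /proxy?url= route shape is an assumption.
export function toProxyUrl(url: string, pageBase: string): string {
  const absolute = new URL(url, pageBase).toString();
  return "/proxy?url=" + encodeURIComponent(absolute);
}

// Injected at the top of the proxied page: wrap the native open() so
// every AJAX request is transparently routed through the proxy.
if (typeof XMLHttpRequest !== "undefined") {
  const nativeOpen = XMLHttpRequest.prototype.open;
  XMLHttpRequest.prototype.open = function (
    method: string,
    url: string | URL,
    ...rest: any[]
  ) {
    return nativeOpen.apply(
      this,
      [method, toProxyUrl(String(url), location.href), ...rest] as any
    );
  };
}
```

Because the wrapper delegates to the saved native open(), the page's own code keeps working unchanged; only the URL it ends up hitting differs.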
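The link and form rewriting could be sketched in the injected script like this. Again the /proxy?url= shape is an assumption, and the DOM walk is guarded so the file also loads outside a browser.

```typescript
// Route a URL through the proxy; the /proxy?url= shape is assumed.
export function proxied(url: string, pageBase: string): string {
  return "/proxy?url=" + encodeURIComponent(new URL(url, pageBase).toString());
}

// Point every link and form at the proxy so navigation and form
// submissions stay under our control.
export function rewriteLinksAndForms(doc: Document, pageBase: string): void {
  doc.querySelectorAll<HTMLAnchorElement>("a[href]").forEach((a) => {
    a.href = proxied(a.getAttribute("href") ?? "", pageBase);
  });
  doc.querySelectorAll<HTMLFormElement>("form").forEach((f) => {
    // A form with no action submits to the current page.
    f.action = proxied(f.getAttribute("action") ?? pageBase, pageBase);
  });
}

if (typeof document !== "undefined") {
  rewriteLinksAndForms(document, location.href);
}
```

Reading getAttribute rather than the href/action properties matters: the properties are already absolutized by the browser, while the attribute preserves what the original page actually wrote.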
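The cookie relay might look like this on the server side. The x-proxied-cookie header name is purely an assumption: the idea is that the injected script copies document.cookie into some custom header, and the proxy turns it back into a real Cookie header for the upstream request.

```typescript
// Build headers for the upstream request: cookies the injected script
// copied from document.cookie into a custom header (the name
// x-proxied-cookie is an assumption) are relayed as a real Cookie
// header to the actual server.
export function upstreamHeaders(
  clientHeaders: Record<string, string | undefined>
): Record<string, string> {
  const headers: Record<string, string> = {};
  const relayed = clientHeaders["x-proxied-cookie"];
  if (relayed) headers["cookie"] = relayed;
  return headers;
}

// Usage inside the proxy's request handler (sketch):
//   const upstream = await fetch(targetUrl, { headers: upstreamHeaders(req.headers) });
```

Relaying in the other direction (upstream Set-Cookie back to the browser) is the harder half, since the cookies belong to the original domain rather than ours; that part is left open here.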
The idea is crude, but I will post updates as I refine it.