Web site data scraping - C#

I am working on a web site scraping project where I am using the WebBrowser control in C#.NET.
I am almost done: it goes from page to page and collects data. This project has to run automatically.
One page has a JavaScript function that pops up an alert message. How can I get past this alert? Is there any way to comment it out or call a click method?
Please help with this.
Thank you in advance.

Related

URL Not loading in a System.Windows.Forms.WebBrowser

I'm using a System.Windows.Forms.WebBrowser control to load a webpage. My understanding is it is just a wrapper around Internet Explorer. I'm therefore struggling to understand why the web page loads ok in a desktop browser (IE 11) but not in the browser control.
The URL looks like this:
http://subdomain.domain.co.uk/?argumentName=argumentValue
My question is, firstly, is that a valid url (format wise)? Can you have a '?' directly after the '/' as shown? My theory is this could be getting removed behind the scenes when using a desktop browser and not happening when using the browser control. Unfortunately I cannot test that theory as the web page is behind a firewall. At the moment I know nothing else about the location of the web page other than the url.
If that is a valid URL, could anyone suggest reasons it would load in IE but not in the WebBrowser control?
To expand on how it's not working, the request times out after 30 seconds. The NavigateError event is raised (the status code for me is -2146697211, but that could well be different from the status code they get. I cannot find that out until I deploy a DLL with some logging info).
The webpage then displays as

How to download infinitely-scrolling web page

With the WebClient.DownloadString method it's fairly simple to load a normal web page's source into a string.
But is there an easy way to load pages that extend and load new content when you scroll down to the end?
You cannot "download" such a page, as it doesn't exist in full form. Such pages require user interaction.
You can use one of the forms of the WebBrowser control to browse to, and programmatically interact with a web site.
Hey, you can try this approach if you want to do it with WebClient.
See here. He is using Scrapy, but I think the approach can be adapted to WebClient too.
Basically, he uses Firebug or the Chrome developer tools to trace the AJAX web request. Once you know the request, you can get the content with WebClient.
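The idea in that last answer can be sketched as a loop that requests one "page" of content after another until the server returns nothing more. This is only a rough illustration in JavaScript: the `page` query parameter, the function names, and the shape of the response (an array of items) are all assumptions, and the real endpoint has to be read off the browser's network tab. The same loop could be written in C# around WebClient.DownloadString.

```javascript
// Hypothetical sketch: the endpoint shape and all names here are assumptions.

// Build the URL for one "page" of results, mirroring the AJAX request
// observed in the browser's developer tools.
function buildPageUrl(baseUrl, page) {
  const url = new URL(baseUrl);
  url.searchParams.set("page", String(page)); // "page" is an assumed parameter name
  return url.toString();
}

// Request pages one after another until a page comes back empty
// or a safety limit is reached.
async function collectAllItems(baseUrl, fetchPage, maxPages = 50) {
  const items = [];
  for (let page = 1; page <= maxPages; page++) {
    const batch = await fetchPage(buildPageUrl(baseUrl, page));
    if (!batch || batch.length === 0) break; // no more content to load
    items.push(...batch);
  }
  return items;
}
```

Here `fetchPage` is passed in so the loop does not care how the request is made; in a browser or a recent Node.js it might be something like `url => fetch(url).then(r => r.json())`, assuming the endpoint returns a JSON array.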

Post to fan page as application not as user (but the page is not showing on /me/accounts)

I'm trying to post to a page as an application using the C# Facebook SDK. I've browsed through lots of posts explaining how to do so and I think I understand how. The only problem is that when I access /me/accounts/ it just shows applications I have made; it doesn't show any page I'm an administrator of. Is there anything I need to do besides being an administrator to have the page I want to post to listed on /me/accounts/?
Thanks a lot

Need guidance on ASPX website automation

I'm working on automating a .NET ASPX timesheet website, where I have to detect whenever the user presses a submit button and send an email automatically.
Unfortunately we don't have access to any of the website's code, just the website itself, which we can use in Internet Explorer.
How can I detect the button press? How should I proceed? Do I need to write a custom browser plugin?
Thanks,
Anil
This makes no sense. You're asking how to get access to a website's sessions without having access to the website's code or server. I don't think this can be done; in fact, I would be really surprised if it can.
Think about the privacy issues.

Auto fill web page using ASP.Net C# web page

I need to write an ASP.NET C# web page that can open a web page, fill in the fields and click the submit button on that page automatically. My web page should launch the IE browser, navigate to a specified URL, fill in the form and submit it. I'm not sure where to start. Any help is greatly appreciated.
Thank you.
I'm not sure the browser will allow all your JS code to run.
But you can try with jQuery, or with another JavaScript library.
You can add an iframe tag to your page, then dynamically load the other web site into the iframe. You must also know the ids or class names of the HTML controls.
If you're going to use jQuery:
$(document).ready(function () {
    // .val(value) sets the field; .val() with no argument only reads it
    $('#textbox1').val("someText1");
    // ....
    $('#textboxN').val("someTextN");
    // trigger the submit button's click event
    $('#btnID').click();
});
To automatically start your page, you can use the Process.Start() method on the server side.
If you're trying to run it on the client side, you can use the Response.Redirect method.
Why not make a POST/GET call directly?
You could use HttpWebRequest:
http://www.codeproject.com/KB/webservices/HttpWebRequest_Response.aspx
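To make the "POST directly" idea concrete, here is a rough sketch of what such a request carries. It is written in JavaScript only for illustration, and the field names (`textbox1`, `textboxN`) are placeholders borrowed from the jQuery answer above, not the real form's names. The point is that a form submit is just an HTTP POST with a URL-encoded body, which HttpWebRequest can send equally well from C#.

```javascript
// Hypothetical sketch: the field names are placeholders, not the real form's.
function buildFormPost(fields) {
  // URLSearchParams produces an application/x-www-form-urlencoded body,
  // the same encoding a browser uses for a plain HTML form submit.
  const body = new URLSearchParams(fields).toString();
  return {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: body,
  };
}

const request = buildFormPost({ textbox1: "someText1", textboxN: "someTextN" });
console.log(request.body); // prints "textbox1=someText1&textboxN=someTextN"
```

The returned object has the shape expected by `fetch(url, request)`. Whatever client is used, the body would probably also need any hidden inputs the target form contains, which you can find by viewing the page source.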
