Value Scraping Error C# [closed]

I am new to coding in C#, and am making a little program to scrape the current Bitcoin value from Mt.Gox.
Here is the code I am currently using:
namespace BitcoinValueScraper
{
public partial class GetValue : Form
{
public GetValue()
{
InitializeComponent();
}
private void GetValue_Load(object sender, EventArgs e)
{
System.Windows.Forms.WebBrowser wb = new System.Windows.Forms.WebBrowser();
wb.Navigate("www.mtgox.com");
wb.Stop();
wb.Document.GetElementById("lastPrice").SetAttribute("value", textBox1.Text);
}
}
}
This returns with:
"An unhandled exception of type 'System.NullReferenceException'
occurred in BitcoinValueScraper.exe Additional information: Object
reference not set to an instance of an object."
Help please!

You have to handle the LoadCompleted event on the WebBrowser control. If you don't, the control's Document may still be null, because the page might not have been downloaded yet.
Example Code:
public WebBrowser webb;
public MainWindow()
{
InitializeComponent();
webb = new WebBrowser();
webb.LoadCompleted += webb_LoadCompleted;
webb.Navigate("http://www.google.com");
}
void webb_LoadCompleted(object sender, NavigationEventArgs e)
{
//NOW DOCUMENT SHOULD NOT BE NULL
MessageBox.Show("Completed loading the page");
mshtml.HTMLDocument doc = webb.Document as mshtml.HTMLDocument;
mshtml.HTMLInputElement obj = doc.getElementById("gs_taif0") as mshtml.HTMLInputElement;
mshtml.HTMLFormElement form = doc.forms.item(Type.Missing, 0) as mshtml.HTMLFormElement;
}
The above is for the Windows Presentation Foundation WebBrowser control. In Windows Forms I believe the event is DocumentCompleted. Reference: http://msdn.microsoft.com/en-us/library/system.windows.forms.webbrowser%28v=vs.110%29.aspx
Here is Windows Forms code (I just tested this):
private System.Windows.Forms.WebBrowser wb;
public Form1()
{
InitializeComponent();
GetValue_Load(null, EventArgs.Empty);
}
private void GetValue_Load(object sender, EventArgs e)
{
wb = new System.Windows.Forms.WebBrowser();
wb.DocumentCompleted += wb_DocumentCompleted;
wb.Navigate("http://www.google.com");
}
void wb_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
MessageBox.Show("Document loading completed");
//GET YOUR DOCUMENT HERE
}

While not a direct answer to the code problem you're currently encountering, I'd strongly suggest that you try doing things a different way. Pulling information out of the HTML of a website like that is extremely fragile (if they change their markup at all, your code is broken) and wrong on a lot of levels. Programmers usually rely on data APIs for this kind of information, because an API provides a standardized and (hopefully) tested way of querying for it. A quick Google search turned up some Bitcoin APIs offered by BlockChain, who seem to be pretty well regarded in the Bitcoin world. Here is a sample API call to query for Bitcoin exchange rates:
http://blockchain.info/api/exchange_rates_api
By making an HTTP request to their API you can much more reliably pull down the information that you're looking for and display it in your user interface.
Further Bitcoin API resources can be found here:
http://blockchain.info/api
Unfortunately, as you are new to both programming and interacting with APIs, it's hard to give you an answer without taking the time to write the code for you. However, I can say that your current approach is wrong. A WebBrowser object is not a suitable mechanism for interacting with a web API. A more suitable approach would be to make an HTTP call to the API URL that you posted and then read the JSON out of the response. This would then need to be parsed into some kind of format that makes sense for your application (such as a simple Price object). There are many articles online about parsing JSON with C# and about interacting with web-based APIs over HTTP. I'd definitely recommend that you start there.
Here is a great starting article that will walk you through creating a basic application for interacting with JSON APIs. Just replace the Bing URLs with the appropriate Bitcoin ones and you should have a good starting point.
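A minimal sketch of the "HTTP call, then parse the JSON" approach described above. It assumes the exchange-rate endpoint returns a JSON object keyed by currency code, where each entry has a numeric "last" field (for example {"USD": {"last": 123.45}}). The TickerParser class, its method names, and the URL passed in by the caller are illustrative assumptions, not something taken from the answer. System.Text.Json needs .NET Core 3.0 or later; a library such as Json.NET could play the same role on older frameworks.

```csharp
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

public static class TickerParser
{
    // One shared HttpClient for the whole application.
    private static readonly HttpClient Http = new HttpClient();

    // Parses a ticker document such as {"USD": {"last": 123.45}}.
    // ASSUMPTION: the "last" field name and the per-currency layout.
    public static decimal ParseLastPrice(string json, string currencyCode)
    {
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            // GetProperty throws KeyNotFoundException if the currency is absent.
            JsonElement entry = doc.RootElement.GetProperty(currencyCode);
            return entry.GetProperty("last").GetDecimal();
        }
    }

    // Downloads the ticker from the given URL and returns the last price.
    // The caller supplies the URL; check the provider's API documentation for it.
    public static async Task<decimal> FetchLastPriceAsync(string tickerUrl, string currencyCode)
    {
        string json = await Http.GetStringAsync(tickerUrl);
        return ParseLastPrice(json, currencyCode);
    }
}
```

In a Windows Forms handler marked async, the result could then be assigned with something like textBox1.Text = (await TickerParser.FetchLastPriceAsync(url, "USD")).ToString(); with no WebBrowser control involved.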

If anything try:
namespace BitcoinValueScraper
{
public partial class GetValue : Form
{
System.Windows.Forms.WebBrowser wb = new System.Windows.Forms.WebBrowser();
public GetValue()
{
InitializeComponent();
}
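private void GetValue_Load(object sender, EventArgs e)
{
// Subscribe before navigating so the completion event cannot be missed.
wb.DocumentCompleted += wb_LoadCompleted;
wb.Navigate("http://www.mtgox.com");
}
void wb_LoadCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
HtmlDocument doc = wb.Document;
HtmlElement lastPrice = doc.GetElementById("lastPrice");
// ToString() would only return the type name; InnerText holds the displayed value.
if (lastPrice != null)
textBox1.Text = lastPrice.InnerText;
}
}
}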

Related

How do I raise an external event in the Revit API from a WPF app?

I have been trying to create a Revit plugin that allows the user to view issues that are stored on a remote server on the local version of the file. It should create a new 3D perspective view from the stored data on the server and open it in Revit. However, whenever I attempt to run it, I get an exception warning thusly:
Exception thrown: 'Autodesk.Revit.Exceptions.InvalidOperationException' in RevitAPIUI.dll
An unhandled exception of type 'Autodesk.Revit.Exceptions.InvalidOperationException' occurred in RevitAPIUI.dll
Attempting to create an ExternalEvent outside of a standard API execution
I think I understand vaguely what this means, but I am unsure what exactly needs to be changed to fix it. I am defining a custom ExternalEventHandler and implementing its Execute method:
class CameraEventHandler : IExternalEventHandler
{
Issue issue;
int i;
public CameraEventHandler(Issue issue, int index)
{
this.issue = issue;
this.i = index;
}
public void Execute(UIApplication app)
{
Document doc = app.ActiveUIDocument.Document;
using (Transaction t = new Transaction(doc, "CameraTransaction"))
{
t.Start();
...
//Irrelevant code to set camera position programmatically
...
t.Commit();
}
}
public string GetName()
{
return "Camera event handler";
}
}
And then in one of my WPF forms, I create an ExternalEvent and call the Raise method:
private void RevitViewButton_Click(object sender, RoutedEventArgs e)
{
CameraEventHandler handler = new CameraEventHandler(issue, issueIndex);
ExternalEvent cameraEvent = ExternalEvent.Create(handler);
cameraEvent.Raise();
}
However, the exception is thrown when it reaches the ExternalEvent.Create method.
Edit: I feel it is worth mentioning that the WPF app I am using is launched as a Revit plugin.
The Revit API cannot be used outside a valid Revit API context:
Use of the Revit API Requires a Valid Context
Revit API Context Summary
That is not a bug, Gaz, it is by design.
The solution suggested by Gaz is absolutely correct!
Reading this blog it would appear to be a bug in Revit.
The solution appears to be to create your custom handler during IExternalCommand.Execute or IExternalApplication.OnStartup, rather than at the time of raising the event.
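The suggestion above could be sketched roughly as follows: the handler and its ExternalEvent are created once inside IExternalCommand.Execute (a valid Revit API context), and the WPF window later only updates the handler's data and calls Raise. CameraEventBridge, ShowIssuesCommand, and the IssueIndex property are hypothetical names introduced for illustration, and the sketch compiles only inside a project that references the Revit API assemblies.

```csharp
using Autodesk.Revit.Attributes;
using Autodesk.Revit.DB;
using Autodesk.Revit.UI;

public class CameraEventHandler : IExternalEventHandler
{
    // Set by the WPF window just before the event is raised (hypothetical property).
    public int IssueIndex { get; set; }

    public void Execute(UIApplication app)
    {
        Document doc = app.ActiveUIDocument.Document;
        using (Transaction t = new Transaction(doc, "CameraTransaction"))
        {
            t.Start();
            // Set the camera position for IssueIndex here.
            t.Commit();
        }
    }

    public string GetName()
    {
        return "Camera event handler";
    }
}

// Hypothetical holder so the WPF code can reach the pre-created event.
public static class CameraEventBridge
{
    public static CameraEventHandler Handler { get; private set; }
    public static ExternalEvent CameraEvent { get; private set; }

    // Must be called from within a valid Revit API context.
    public static void Initialize()
    {
        Handler = new CameraEventHandler();
        CameraEvent = ExternalEvent.Create(Handler);
    }
}

[Transaction(TransactionMode.Manual)]
public class ShowIssuesCommand : IExternalCommand
{
    public Result Execute(ExternalCommandData commandData, ref string message, ElementSet elements)
    {
        // Created here, inside the API context, rather than in a button click.
        CameraEventBridge.Initialize();
        // Show the WPF window here.
        return Result.Succeeded;
    }
}
```

The button click handler would then reduce to setting CameraEventBridge.Handler.IssueIndex and calling CameraEventBridge.CameraEvent.Raise(), without calling ExternalEvent.Create again.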

c# webbrowser control does not navigate to another page

I have a console application and I've defined a WebBrowser inside it.
First, I navigate to a page, fill in a login form, and invoke the submit button to log in.
After that, I want to go to another page on the same site using the same WebBrowser, but it does not navigate to that page. Instead, it navigates to the page that it is redirected to after login.
Here is my code for clarification; this code gives me the source code of www.websiteiwanttogo.com/default.aspx instead of product.aspx.
What is wrong here?
static WebBrowser wb = new WebBrowser();
[STAThread]
static void Main(string[] args)
{
wb.AllowNavigation = true;
wb.Navigate("https://www.thewebsiteiwanttogo.com/login.aspx");
wb.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(wb_DocumentCompleted);
Application.Run();
}
static void wb_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
if (wb.Url.ToString().IndexOf("login.aspx") > -1)
{
wb.Document.GetElementById("txtnumber").SetAttribute("value", "000001");
wb.Document.GetElementById("txtUserName").SetAttribute("value", "myusername");
wb.Document.GetElementById("txtPassword").SetAttribute("value", "mypassword");
wb.Document.GetElementById("btnLogin").InvokeMember("click");
}
else
{
//wb.Document.Body you are logged in do whatever you want here.
wb.Navigate("https://www.thewebsiteiwanttogo.com/product.aspx");
Console.WriteLine(wb.DocumentText);
Console.ReadLine();
Application.Exit();
}
}
There are a lot of different ways to accomplish this functionality. However, my guess is that:
Either the call to navigate to the next page is happening too quickly, or
The DocumentCompleted event is not firing properly after logging in (this is common, especially if the destination document contains dynamic scripts)
I've done a lot of web page automating (navigating from link to link, then performing some actions, then navigating to another link, etc.), and you should consider using async processes. In principle, it is probably always best when dealing with the webBrowser object to use async processes, simply because there are many instances where you need one process to run while you perform other functions.
Without going into too much detail, look at the answer to this question and study the code: Flow of WebBrowser Navigate and InvokeScript
Before trying that implementation, however, you could simply try awaiting a short delay before navigating to the page. (await Task.Delay() pauses the handler much like Thread.Sleep() would, but it doesn't block the UI thread, so the page can keep loading in the meantime.)
(Never heard of asynchronous processes before? Check out this tutorial on MSDN).
Try this first:
static async void wb_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
if (wb.Url.ToString().IndexOf("login.aspx") > -1)
{
wb.Document.GetElementById("txtnumber").SetAttribute("value", "000001");
wb.Document.GetElementById("txtUserName").SetAttribute("value", "myusername");
wb.Document.GetElementById("txtPassword").SetAttribute("value", "mypassword");
wb.Document.GetElementById("btnLogin").InvokeMember("click");
}
else
{
//wb.Document.Body you are logged in do whatever you want here.
await Task.Delay(1000); //wait for 1 second just to let the WB catch up
wb.Navigate("https://www.thewebsiteiwanttogo.com/product.aspx");
Console.WriteLine(wb.DocumentText);
Console.ReadLine();
Application.Exit();
}
}
If this doesn't help, consider the link above and try implementing a more robust navigating sequence with async processes.
If that doesn't work, and you'd like some help navigating through or waiting for dynamic pages to load, try this post: how to dynamically generate HTML code using .NET's WebBrowser or mshtml.HTMLDocument?
I've used this approach many times, and it works great.
Hope one of these methods helps! Let me know, and I can help you generate some more specific code snippets.
EDIT:
On second glance, I'm going to guess that Console.ReadLine() blocks the thread and freezes the navigation started by wb.Navigate("https://www.thewebsiteiwanttogo.com/product.aspx");, since navigation doesn't complete instantaneously. You'll probably want to add another branch in the DocumentCompleted handler so that the navigation to product.aspx can finish before you try to read wb.DocumentText. For example:
static async void wb_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
if (wb.Url.ToString().IndexOf("login.aspx") > -1)
{
wb.Document.GetElementById("txtnumber").SetAttribute("value", "000001");
wb.Document.GetElementById("txtUserName").SetAttribute("value", "myusername");
wb.Document.GetElementById("txtPassword").SetAttribute("value", "mypassword");
wb.Document.GetElementById("btnLogin").InvokeMember("click");
}
else if(wb.Url.ToString().IndexOf("product.aspx") > -1)
{
Console.WriteLine(wb.DocumentText);
Console.ReadLine();
Application.Exit();
}
else
{
//wb.Document.Body you are logged in do whatever you want here.
await Task.Delay(1000); //wait for 1 second just to let the WB catch up
wb.Navigate("https://www.thewebsiteiwanttogo.com/product.aspx");
}
}

Windows Phone 8.1 App crashes on ShowShareUI()

I'm trying to add share functionality to my Windows Phone App. The code behaves in an unpredictable way. Sometimes it works, but mostly it doesn't and I haven't been able to get any details about what's causing the crash. Could someone please go through the code below and let me know if I've missed something? Thanks!
public ArticlePage()
{
this.InitializeComponent();
//..
RegisterForShare();
}
private void RegisterForShare()
{
DataTransferManager dataTransferManager = DataTransferManager.GetForCurrentView();
dataTransferManager.DataRequested += new TypedEventHandler<DataTransferManager,
DataRequestedEventArgs>(this.ShareLinkHandler);
}
private void ShareLinkHandler(DataTransferManager sender, DataRequestedEventArgs e)
{
DataRequest request = e.Request;
DataRequestDeferral defferal = request.GetDeferral();
request.Data.Properties.Title = this.article.Title;
request.Data.Properties.Description = this.article.Summary;
request.Data.SetWebLink(new Uri(this.article.UrlDomain));
defferal.Complete();
}
private void ShareCommand_Click(object sender, RoutedEventArgs e)
{
DataTransferManager.ShowShareUI();
}
UPDATE
The code always works while I'm debugging from Visual Studio, but pretty much never otherwise. I made a release build, thinking there might be some code in the debug build causing the problem, but that didn't make any difference.
I also had that problem recently. The share UI crashes when one of the important parameters is not set. In your case I'd suspect that
this.article.UrlDomain
is null or not a valid Uri pattern. You should build an if-clause around it and make sure that you're dealing with a real Uri. To test your code you should insert hardcoded constants and run it again. If it doesn't crash, check your Title, Summary and UrlDomain one by one.
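The check described above could look roughly like this. ShareLinkValidator and TryGetShareUri are hypothetical names, and restricting the link to http/https is an assumption about what a shareable web link should be, not a documented requirement of the share API.

```csharp
using System;

public static class ShareLinkValidator
{
    // Returns true and a usable Uri only for absolute http/https links.
    // Null, empty, relative, or otherwise malformed strings yield false and a null Uri.
    public static bool TryGetShareUri(string candidate, out Uri uri)
    {
        uri = null;
        if (string.IsNullOrWhiteSpace(candidate))
        {
            return false;
        }

        Uri parsed;
        if (!Uri.TryCreate(candidate, UriKind.Absolute, out parsed))
        {
            return false;
        }

        // ASSUMPTION: only web links are acceptable for sharing.
        if (parsed.Scheme != "http" && parsed.Scheme != "https")
        {
            return false;
        }

        uri = parsed;
        return true;
    }
}
```

Inside the DataRequested handler, SetWebLink would then only be called when TryGetShareUri(this.article.UrlDomain, out uri) returns true; otherwise the request could be failed with request.FailWithDisplayText and a short message.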
Other places to investigate:
Try adding your handler in the OnNavigatedTo method and removing it when you leave the page:
protected override void OnNavigatedTo(NavigationEventArgs e)
{
base.OnNavigatedTo(e);
DataTransferManager.GetForCurrentView().DataRequested += SharePage_DataRequested;
}
protected override void OnNavigatingFrom(NavigatingCancelEventArgs e)
{
base.OnNavigatingFrom(e);
DataTransferManager.GetForCurrentView().DataRequested -= SharePage_DataRequested;
}
I also searched my code and looked at the official samples again and did not find any deferrals. Just to be sure, if I were you I'd strip all unnecessary lines from the code, get it as close as possible to the official examples, and then extend it back to where it was from there. That is why I would comment out these two lines as well:
void SharePage_DataRequested(DataTransferManager sender, DataRequestedEventArgs args)
{
DataRequest request = args.Request;
//DataRequestDeferral defferal = request.GetDeferral();
request.Data.Properties.Title = this.article.Title;
request.Data.Properties.Description = this.article.Summary;
request.Data.SetWebLink(new Uri(this.article.UrlDomain));
//defferal.Complete();
}
Okay, I had the same problem. ShowShareUI actually suspends your app, so if suspending your app fails, you get this error. It is actually a serialization problem.
If you want to look into the error, then while debugging, open the Lifecycle Events dropdown and choose Suspend; the app will now crash in debug mode too.
If you are passing a custom class as a parameter when navigating between pages, you will get the error. My suggestion is to convert the object to a JSON string, pass that, and convert it back on the other side.
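A minimal sketch of that suggestion, using DataContractJsonSerializer so that only a plain string is passed as the navigation parameter. ArticleInfo and NavigationJson are hypothetical names; the real article class from the question is not shown, so the two properties here are placeholders.

```csharp
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Json;
using System.Text;

// Hypothetical stand-in for the custom class passed between pages.
[DataContract]
public class ArticleInfo
{
    [DataMember]
    public string Title { get; set; }

    [DataMember]
    public string Url { get; set; }
}

public static class NavigationJson
{
    // Serializes any data-contract type to a JSON string.
    public static string ToJson<T>(T value)
    {
        var serializer = new DataContractJsonSerializer(typeof(T));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, value);
            byte[] bytes = stream.ToArray();
            return Encoding.UTF8.GetString(bytes, 0, bytes.Length);
        }
    }

    // Rebuilds the object from the JSON string on the target page.
    public static T FromJson<T>(string json)
    {
        var serializer = new DataContractJsonSerializer(typeof(T));
        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(json)))
        {
            return (T)serializer.ReadObject(stream);
        }
    }
}
```

The source page would call Frame.Navigate(typeof(ArticlePage), NavigationJson.ToJson(article)), and the target page would call NavigationJson.FromJson<ArticleInfo>((string)e.Parameter) in OnNavigatedTo. Because the parameter is a string, the frame's navigation state can be saved when the app is suspended.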
I've faced a similar problem (a crash on ShowShareUI).
After a very long investigation I found that it happens because of an unhandled exception in SaveFrameNavigationState (in the SuspensionManager class from the template project).
In my case, the SessionStateForFrame method failed while processing a class that couldn't be serialized.
Check what you're saving to the page state in the page's SaveState.
It happens not only on ShowShareUI but whenever the app is suspended.

How do you update ASP.NET web forms from different threads?

In the last few days I've been trying to learn how to use ASP.NET Web Forms together with multithreading the hard way by building a simple applet using both and I've been struggling with aspects of interactions between different threads and the UI.
I've resolved some multithreading issues in some other questions (and also learned after waaaaaay too long that web forms and WPF are not the same thing) but now I'm running into trouble finding the best way to update UI elements based on data acquired in multiple threads.
Here's my code:
Default.aspx
public partial class _Default : System.Web.UI.Page
{
private NlSearch _search;
private static int _counter = 0;
private static SortedList<long, SearchResult> resultsList = new SortedList<long, SearchResult>();
protected void Page_Load(object sender, EventArgs e)
{
_search = new NlSearch();
}
protected void AddSearchMethod(object sender, EventArgs e)
{
var text = SearchForm.Text;
new Task(() => MakeRequest(text));
}
protected void UpdateMethod(object sender, EventArgs e)
{
resultsLabel.Text = "";
foreach (var v in resultsList.Values)
{
resultsLabel.Text += v.SearchTerm + ": " + v.Count + " occurances<br/>";
}
}
protected void ClearSearchMethod(object sender, EventArgs e)
{
resultsLabel.Text = "";
resultsList.Clear();
}
protected void MakeRequest(string text)
{
_counter++;
SearchResult s = new SearchResult
{
SearchTerm = text,
Count = _search.MakeRequests(text)
};
resultsList.Add(_counter, s);
}
}
I've tried quite a few versions of the same basic thing. NlSearch.MakeRequest (called by MakeRequests) sends an HTTP POST request to an outside web site imitating a search bar input, and then extracts an integer from the markup indicating how many results came back.
The current simple UI revolves around a SearchForm text field, an "Add Search" button, an "Update Label" button, a "Clear Search" button, and a ResultsLabel that displays results. The "Add Search" button creates a new task that calls MakeRequest, which calls the method to send the HTTP request and then stores the results, in the order they were sent, in a static sorted list.
So now ideally in a good UI I would like to just update the label every time a thread returns, however I've tried using ContinueWhenAll and a few other task functions and the problem seems to be that other threads do not have the ability to change the UI.
I have also tried running a new thread on page load that updates the label every few seconds, but this likewise failed.
Because I haven't been able to implement this correctly, I've had to use the "Update Label" button, which literally just tells the label to display what's currently in the static list. I would really like to get rid of this button, but I can't figure out how to get my threads to make UI changes.
In general, trying to do threading in a web app is a bad idea. Web servers are designed for this, but spinning off new threads or processes should be avoided if at all possible. While there used to be a mechanism (and maybe there still is) to "push" results to a client, there are better solutions available today.
What you're describing is exactly the problem that AJAX is intended to solve.
You mentioned WPF in your question -- are you perhaps instead looking for a Windows application, like WinForms? I think that perhaps the term "web forms" has confused the situation. Web forms are just webpages with some (okay, a lot) of added in Microsoft functionality.
It sounds like you're trying to send updates to a webpage from a thread in code. The web doesn't work that way. I'd suggest reading through the ASP.NET Page Life Cycle Overview if you're actually trying to design webpages. Other answers have suggested AJAX functionality (which is where the web page executes some JavaScript that goes out and talks to a web server).
Have you ever heard of AJAX? I think you're thinking as an application dev instead of a web dev.
If you want to run your code asynchronously, you may want to use the async and await keywords instead of managing threads yourself. See the information about Asynchronous Programming with Async and Await.
Do not let your threads get tangled up ;)
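As a rough illustration of the async/await suggestion, the sketch below runs several searches concurrently and collects the counts in the order the terms were given, with no shared static list and no manually created threads. SearchRunner and CountAllAsync are hypothetical names, and the countAsync delegate stands in for an asynchronous version of NlSearch.MakeRequests, which the question does not show.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class SearchRunner
{
    // Starts one asynchronous count per search term and waits for all of them.
    // Task.WhenAll returns results in the same order as the tasks it was given,
    // so no counter or sorted list is needed to preserve ordering.
    public static async Task<IList<KeyValuePair<string, int>>> CountAllAsync(
        IEnumerable<string> terms,
        Func<string, Task<int>> countAsync)
    {
        List<string> termList = terms.ToList();
        int[] counts = await Task.WhenAll(termList.Select(countAsync));

        return termList
            .Select((term, i) => new KeyValuePair<string, int>(term, counts[i]))
            .ToList();
    }
}
```

The page would still have to await the result before the response is rendered; updating the label after the response has been sent requires a new request from the browser, which is what the AJAX answers describe.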

Web automation using .NET

I am a very new programmer. Does anyone know how to do web automation with C#?
Basically, I just want to automate some simple actions on the web.
After I have opened the web link, I want to perform the actions below automatically:
Automatically input some value and click the "Run" button.
Check an option in the ComboBox and click the "Download" button.
How can I do it with C#? My friend suggested PowerShell, but I guess .NET provides this kind of library too. Any suggestions or links I can refer to?
You can use the System.Windows.Forms.WebBrowser control (MSDN Documentation). For testing, it allows you to do the things that could be done in a browser. It executes JavaScript without any additional effort. If something goes wrong, you will be able to see visually what state the site is in.
example:
private void buttonStart_Click(object sender, EventArgs e)
{
webBrowser1.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(webBrowser1_DocumentCompleted);
webBrowser1.Navigate("http://www.wikipedia.org/");
}
void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
HtmlElement search = webBrowser1.Document.GetElementById("searchInput");
if(search != null)
{
search.SetAttribute("value", "Superman");
foreach(HtmlElement ele in search.Parent.Children)
{
if (ele.TagName.ToLower() == "input" && ele.Name.ToLower() == "go")
{
ele.InvokeMember("click");
break;
}
}
}
}
To answer your question: how to check a checkbox
for the HTML:
<input type="checkbox" id="testCheck"></input>
the code:
search = webBrowser1.Document.GetElementById("testCheck");
if (search != null)
search.SetAttribute("checked", "true");
actually, the specific "how to" depends greatly on what is the actual HTML.
For handling your multi-threaded problem:
private delegate void StartTestHandler(string url);
private void StartTest(string url)
{
if (InvokeRequired)
Invoke(new StartTestHandler(StartTest), url);
else
{
webBrowser1.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(webBrowser1_DocumentCompleted);
webBrowser1.Navigate(url);
}
}
InvokeRequired checks whether the current thread is the UI thread (actually, the thread that the form was created on). If it is not, Invoke runs StartTest on the required thread.
Check out SimpleBrowser, which is a fairly mature, lightweight browser automation library.
https://github.com/axefrog/SimpleBrowser
From the page:
SimpleBrowser is a lightweight, yet highly capable browser automation engine designed for automation and testing scenarios. It provides an intuitive API that makes it simple to quickly extract specific elements of a page using a variety of matching techniques, and then interact with those elements with methods such as Click(), SubmitForm() and many more. SimpleBrowser does not support JavaScript, but allows for manual manipulation of the user agent, referrer, request headers, form values and other values before submission or navigation.
If you want to simulate a real browser then WatiN will be a good fit for you. (Selenium is another alternative, but I do not recommend it for you).
If you want to work on the HTTP level, then use WebRequest and related classes.
You could use Selenium WebDriver.
A quick code sample below:
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Firefox;
// Requires reference to WebDriver.Support.dll
using OpenQA.Selenium.Support.UI;
class GoogleSuggest
{
static void Main(string[] args)
{
// Create a new instance of the Firefox driver.
// Note that it is wrapped in a using clause so that the browser is closed
// and the webdriver is disposed (even in the face of exceptions).
// Also note that the remainder of the code relies on the interface,
// not the implementation.
// Further note that other drivers (InternetExplorerDriver,
// ChromeDriver, etc.) will require further configuration
// before this example will work. See the wiki pages for the
// individual drivers at http://code.google.com/p/selenium/wiki
// for further information.
using (IWebDriver driver = new FirefoxDriver())
{
//Notice navigation is slightly different than the Java version
//This is because 'get' is a keyword in C#
driver.Navigate().GoToUrl("http://www.google.com/");
// Find the text input element by its name
IWebElement query = driver.FindElement(By.Name("q"));
// Enter something to search for
query.SendKeys("Cheese");
// Now submit the form. WebDriver will find the form for us from the element
query.Submit();
// Google's search is rendered dynamically with JavaScript.
// Wait for the page to load, timeout after 10 seconds
var wait = new WebDriverWait(driver, TimeSpan.FromSeconds(10));
wait.Until(d => d.Title.StartsWith("cheese", StringComparison.OrdinalIgnoreCase));
// Should see: "Cheese - Google Search" (for an English locale)
Console.WriteLine("Page title is: " + driver.Title);
}
}
}
The great thing (among others) about this approach is that you can easily switch the underlying browser implementations, just by specifying a different IWebDriver, like FirefoxDriver, InternetExplorerDriver, ChromeDriver, etc. This also means you can write 1 test and run it on multiple IWebDriver implementations, thus testing how the page works when viewed in Firefox, Chrome, IE, etc. People working in QA sector often use Selenium to write automated web page tests.
I'm using ObjectForScripting to automate the WebBrowser control: a JavaScript callback invokes a C# function, and that C# function then extracts data or automates other tasks.
I have explained this in detail at the following link:
Web Automation using Web Browser and C#
.NET does not have any built-in functionality for this. It does have the WebClient and HttpWebRequest/HttpWebResponse classes, but they are only building blocks.
You cannot easily automate client-side activity, like filling out forms or clicking buttons, from C#. However, if you look into JavaScript, you may be able to better automate some of those things. To really automate, you would need to reverse engineer the call made by clicking the button and connect to the URL directly, using the classes @John mentions.
