I am using HtmlElementCollection and HtmlElement to iterate through a website's HTML, get and set element attributes, and return the results to a ListView. Is it possible to get values from both website A and website B and return them to the same ListView?
HtmlElementCollection oCol1 = oDoc.Body.GetElementsByTagName("input");
foreach (HtmlElement oElement in oCol1)
{
if (oElement.GetAttribute("id").ToString() == "search")
{
oElement.SetAttribute("value", m_sPartNbr);
}
if (oElement.GetAttribute("id").ToString() == "submit")
{
oElement.InvokeMember("click");
}
}
HtmlElementCollection oRows = oDoc.Body.GetElementsByTagName("tr");
foreach (HtmlElement oElement1 in oRows)
{
if (oElement1.GetAttribute("data-mpn").ToString() == m_sPartNbr.ToUpper())
{
HtmlElementCollection oCol2 = oElement1.GetElementsByTagName("td");
foreach (HtmlElement oElement2 in oCol2)
{
if (oElement2 != null)
{
if (oElement2.InnerText != null)
{
if (oElement2.InnerText.StartsWith("$"))
{
string sPrice = oElement2.InnerText.Replace("$", "").Trim();
double dblPrice = double.Parse(sPrice);
if (dblPrice > 0)
m_dblPrices.Add(dblPrice);
}
}
}
}
}
}
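The price-parsing step in the loop above uses double.Parse, which throws on text such as "$1,234.50" under some cultures or on cells like "$ N/A". A minimal sketch of a more defensive version follows. The class and method names (PriceParser, TryParsePrice) are hypothetical, and it assumes prices are formatted US-style with a leading "$":

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of the original code.
public static class PriceParser
{
    // Returns true and sets price only when text looks like "$12.34"
    // (optionally with thousands separators) and the value is positive.
    public static bool TryParsePrice(string text, out decimal price)
    {
        price = 0m;
        if (string.IsNullOrWhiteSpace(text))
        {
            return false;
        }

        string trimmed = text.Trim();
        if (!trimmed.StartsWith("$", StringComparison.Ordinal))
        {
            return false;
        }

        // InvariantCulture is an assumption: '.' as decimal point, ',' as thousands separator.
        decimal parsed;
        bool ok = decimal.TryParse(
            trimmed.Substring(1),
            NumberStyles.Number,
            CultureInfo.InvariantCulture,
            out parsed);

        if (ok && parsed > 0m)
        {
            price = parsed;
            return true;
        }
        return false;
    }
}
```

Inside the inner loop, a call such as PriceParser.TryParsePrice(oElement2.InnerText, out price) could replace the nested null and StartsWith checks, since the helper handles null and non-price text itself.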
As one of the comments mentioned, the better approach is to use HttpWebRequest to send a GET request to www.bestbuy.com or whatever site you need. It returns the full HTML source of the page, which you can then parse. This approach keeps you from sending too many requests and getting blacklisted. If you need to click a button or type in a text field, it's best to mimic human input, again to avoid being blacklisted. I would suggest injecting a simple JavaScript snippet into the page header or body and executing it from the app to fire a click event on the button (which then returns a new page to parse or display) or to modify the value of a text field.
This example is in C++/CX, but it originally came from a C# example. The script sets the username and password text fields, then clicks the login button:
String^ script = "document.getElementById('username-text').value='myUserName';document.getElementById('password-txt').value='myPassword';document.getElementById('btn-go').click();";
auto args = ref new Platform::Collections::Vector<Platform::String^>();
args->Append(script);
create_task(wv->InvokeScriptAsync("eval", args)).then([this](Platform::String^ response){
//LOGIN COMPLETE
});
//notes: wv = webview
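Since the question is in C#, here is a rough sketch of building the same kind of script string in C#. The ScriptBuilder class and its methods are hypothetical names introduced for illustration. The escaping covers only backslashes, single quotes, and line breaks, which is an assumption about what the injected values may contain:

```csharp
using System;

// Hypothetical helper for composing the injected JavaScript; not from the original answer.
public static class ScriptBuilder
{
    // Escapes a value so it can sit inside a single-quoted JavaScript string literal.
    private static string Escape(string s)
    {
        return s.Replace("\\", "\\\\")
                .Replace("'", "\\'")
                .Replace("\r", "\\r")
                .Replace("\n", "\\n");
    }

    // document.getElementById('<id>').value='<value>';
    public static string SetValue(string elementId, string value)
    {
        return "document.getElementById('" + Escape(elementId) + "').value='" + Escape(value) + "';";
    }

    // document.getElementById('<id>').click();
    public static string Click(string elementId)
    {
        return "document.getElementById('" + Escape(elementId) + "').click();";
    }
}
```

Concatenating SetValue for the username and password fields with Click for the button gives a string equivalent to the script above, which can then be handed to whatever script-invocation call your web view exposes.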
EDIT:
As pointed out, the absolute best approach is to request access to an API. I was surprised to see the site that mason pointed out for Best Buy developers. Personally, I have only tried to work with auto parts stores, who either laugh and say I can't afford it, or have no idea what I'm asking for and hang up (when calling corporate).
EDIT 2: In my code, the site used was AutoZone. I had to use Chrome Developer Tools (F12) to get the names of the username field, the password field, and the button. From the developer tools you can also watch what is sent from your computer to the server. This lets you recreate everything and mimic JavaScript input and actions using POST/GET requests with HttpWebRequest.
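The HttpWebRequest approach described in this answer could be sketched roughly as follows. The PageFetcher class, the User-Agent string, and the timeout are illustrative assumptions, not values taken from any particular site. On newer .NET versions WebRequest.Create is marked obsolete in favour of HttpClient, so treat this as a sketch for the .NET Framework setting of the question:

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical helper; names and header values are assumptions for illustration.
public static class PageFetcher
{
    // Builds (but does not send) a GET request for the given URL.
    public static HttpWebRequest CreateGetRequest(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "GET";
        request.UserAgent = "Mozilla/5.0 (compatible; ExampleFetcher/1.0)"; // assumed value
        request.Timeout = 15000; // milliseconds
        return request;
    }

    // Sends the request and returns the response body as a string.
    // StreamReader defaults to UTF-8; pages in other encodings need an explicit Encoding.
    public static string DownloadHtml(string url)
    {
        HttpWebRequest request = CreateGetRequest(url);
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var stream = response.GetResponseStream())
        using (var reader = new StreamReader(stream))
        {
            return reader.ReadToEnd();
        }
    }
}
```

The returned string is the page source as the server sent it, so anything the page fills in later with JavaScript will not be present.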
I am new to C# programming. I am trying to scrape data from a div (I want to display the temperature from a web page in a Windows Forms application).
This is my code:
private void btnOnet_Click(object sender, EventArgs e)
{
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
HtmlWeb web = new HtmlWeb();
doc = web.Load("https://pogoda.onet.pl/");
var temperatura = doc.DocumentNode.SelectSingleNode("/html/body/div[1]/div[3]/div/section/div/div[1]/div[2]/div[1]/div[1]/div[2]/div[1]/div[1]/div[1]");
onet.Text = temperatura.InnerText;
}
This is the exception:
System.NullReferenceException:
temperatura was null.
You can use this:
public static bool TryGetTemperature(HtmlAgilityPack.HtmlDocument doc, out int temperature)
{
temperature = 0;
var temp = doc.DocumentNode.SelectSingleNode(
"//div[contains(@class, 'temperature')]/div[contains(@class, 'temp')]");
if (temp == null)
{
return false;
}
var text = temp.InnerText.EndsWith("&deg;") ?
temp.InnerText.Substring(0, temp.InnerText.Length - 5) :
temp.InnerText;
return int.TryParse(text, out temperature);
}
With XPath you can select your target more precisely. With your query, a small change in the HTML structure will break your application. Some points:
// searches anywhere in the document
The query looks for any div whose class contains "temperature" and, inside that node,
a child div whose class contains "temp"
If that node is found (!= null), the code tries to convert the degrees (after removing the '°' suffix)
Then use it like this:
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
HtmlWeb web = new HtmlWeb();
doc = web.Load("https://pogoda.onet.pl/");
if (TryGetTemperature(doc, out int temperature))
{
onet.Text = temperature.ToString();
}
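To illustrate why the class-based XPath is more robust than the absolute path from the question, here is a small sketch. It uses System.Xml on hypothetical well-formed fragments rather than HtmlAgilityPack and the real page, so the markup, the XPathDemo class, and the shortened absolute path are all assumptions made for the demonstration:

```csharp
using System;
using System.Xml;

// Hypothetical demo class; the fragments passed in are made up, not the real page.
public static class XPathDemo
{
    // Position-based path, in the style of the query from the question.
    public const string AbsolutePath = "/html/body/div[1]/div[1]";

    // Class-based path, in the style of the query from this answer.
    public const string ClassPath =
        "//div[contains(@class, 'temperature')]/div[contains(@class, 'temp')]";

    // Returns the inner text of the first match, or null when nothing matches.
    public static string Select(string xhtml, string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xhtml);
        XmlNode node = doc.SelectSingleNode(xpath);
        return node == null ? null : node.InnerText;
    }
}
```

If a banner div is later added before the temperature block, AbsolutePath points at the wrong element and returns null, while ClassPath still finds the value.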
UPDATE
I updated TryGetTemperature slightly because the degree sign is encoded. The main problem is the HTML. When you request the source, you get HTML that the browser later updates dynamically. So the HTML you receive is not useful as-is: it doesn't contain the temperature.
So, I see two alternatives:
You can use a browser control (Common Controls -> WebBrowser, in the Toolbox alongside Button, Label, ...), add it to your form, and navigate to the page. It's not difficult, but you need to learn a few things: wait for the event that signals the page has finished loading, then get the source from the control. Also, I suppose you'll want to hide the browser control. Be careful: sometimes the browser doesn't work correctly when hidden. In that case, you can use a visible Form positioned off-screen and handle the activation events so that this window is never activated. Also hide it from the task switcher (Alt+Tab). Things get harder this way, but sometimes it is the only way.
The simple way is to search for the location you want (e.g. Madryt) and look in DevTools at the request that is made (e.g. https://pogoda.onet.pl/prognoza-pogody/madryt-396099). Use this URL and you get valid HTML.
I'm looking for a way to get the document information (or document text) from another application's WebBrowser control (and possibly alter it).
The other application is written in .NET, but not by me.
I'm looking for an ability like this:
I would like an event handler for OnDocumentCompleted that gives me the information in that document.
If possible, I would also like to intercept certain pages, add some HTML, and send them back to the second app to be displayed.
Searching the web pointed me towards using 'hooks', but I found little about using hooks in this situation.
Hope you can help me out
Anthony
This code provides an example of HTML parsing that returns plain text (the parsing depends on the page content).
private string GetPlainText(WebBrowser webBrowser)
{
StringBuilder sb = new StringBuilder();
// Pick out a heading.
foreach (HtmlElement h1 in webBrowser.Document.GetElementsByTagName("H1"))
sb.Append(h1.InnerText + ". ");
// Select only some text, ignoring everything else.
foreach (HtmlElement div in webBrowser.Document.GetElementsByTagName("DIV"))
if (div.GetAttribute("classname") == "story-body")
foreach (HtmlElement p in div.GetElementsByTagName("P"))
{
string classname = p.GetAttribute("classname");
if (classname == "introduction" || classname == "") sb.Append(p.InnerText + " ");
}
return sb.ToString();
}
I've done quite a bit of searching (several hours actually) but I haven't been able to get this working. Basically, I have this button:
<asp:Button runat="server" Text="Go!" id="go" onClick="getDoc()" />
and this block of script:
<script type="c#" runat="server">
public void getDoc(object sender, EventArgs e) {
// Test to see if function was running (it's not...)
DocFrame.Attributes["src"] = "http://www.google.com";
// Get the current state of the dropdowns
String dropYear = (String)Year.SelectedValue;
String dropDiv = (String)Division.SelectedValue;
String dropControl = (String)Control.SelectedValue;
String dropQuart= (String)Quarter.SelectedValue;
// Get the Site where the list is
using (SPSite siteCol = new SPSite("http://portal/Corporate/IT/")) {
using (SPWeb web = siteCol.RootWeb){
// Get the list items we need
SPListItemCollection items = list.GetItems("Year", "Division", "Control", "Quarter");
SPListItem item = null;
// Loop through them until we find a matching everything
foreach (SPListItem it in items){
if(it.Year == dropYear && it.Division == dropDiv && it.Control == dropControl && it.Quarter == dropQuart){
item = it;
break;
}
}
// Assign the item as a string
String URL = (String)item["Title"];
// Set the iframe to the new URL
DocFrame.Attributes["src"] = URL;
}
}
}
</script>
It's all in the page where this is happening. Please keep in mind that I've been using SharePoint for less than a week and have only ever coded in C++, so I could be doing everything horribly wrong. Anyway, it seems that getDoc() is never even getting called, so can anyone point out what I'm doing wrong?
Instead of
onClick="getDoc()"
you should do
OnClick="getDoc"
That's the proper way to wire up an event.
By the way, you should consider following the C# naming guidelines. With better naming, it might look like this:
<asp:Button runat="server" Text="Go!" id="GoBtn" OnClick="GoBtn_Click" />
The common convention is to append the event name to the ID of the control. It's not required, but it looks cleaner, and other developers like to see it when they read your code.
Also, DocFrame.Attributes["src"] = "http://www.google.com"; is not a good way to check whether the function is running. It doesn't update the page in real time: the entire server-side function executes first, and only then are the results sent to the client. Instead, use your IDE's debugging tools to attach to the server and set breakpoints. Alternatively, what I do is have the code send me an email; I created a little utility library for that.
I am attempting to help a user log into their account using a custom WebBrowser control. I am trying to set the value of an input tag to the player's username using the WebBrowser's InvokeScript function. However, my current solution does nothing but render a blank white page.
My current code looks like this (web is the name for my WebBrowser control):
web.Navigate(CurrentURL, null, @"<script type='text/javascript'>
function SetPlayerData(input) {
username.value = input;
return true;
}
</script>");
web.Navigated += (o, e) =>
{
web.IsScriptEnabled = true;
web.InvokeScript("SetPlayerData", @"test");
};
As mentioned, this does not work right now. I am attempting to do this on Windows Phone, so a number of the examples I have found here and in other places will not work, as I do not have access to the same functions.
How would I perform this successfully?
EDIT: Perhaps I was not clear, but I am working with Windows Phone, which has a limited API available meaning I do not have access to the Document property and a number of other functions. I do have access to InvokeScript, but not much more.
webBrowser1.Document.GetElementById("navbar_username").InnerText ="Tester";
webBrowser1.Document.GetElementById("navbar_password").InnerText = "xxxxxxxxxxx";
foreach (HtmlElement HtmlElement1 in webBrowser1.Document.Body.All)
{
if (HtmlElement1.GetAttribute("value") == "Log in")
{
HtmlElement1.InvokeMember("click");
break;
}
}
You may find more here: http://deltahacker.gr/2011/08/15/ftiakste-to-diko-sas-robot/
It's been a long time since this question was posted, but I'll post an answer in case it helps someone who comes across the same situation.
try
{
webBrowser1.Document.GetElementById("navbar_username").SetAttribute("value", "your user");
webBrowser1.Document.GetElementById("navbar_password").SetAttribute("value", "your pass");
webBrowser1.Document.GetElementById("Log in").InvokeMember("click");
}
catch { }
I'm trying to send an email using Lotus Notes with the help of the "domino" DLL (programming language: C#).
I want to attach a mail signature to the body of the email, and I'm hoping to use a .jpg for the signature. I also have other email body formatting, so I decided to use HTML for the styling and the signature. After browsing the web, I found that NotesRichTextStyle has a property called PassThruHTML. The legal values for it, as per this link, are (-1), (0), and (255).
The issue is that when I set (-1), the app pops up a message saying "Style value must be True, False, or STYLE_NO_CHANGE (YES, NO, or MAYBE for Java)".
But the C# code accepts only int values, not the values named in the popup.
Following is the C# code for the answer given by Ken Pespisa's reference link.
NotesSession LNSession = new NotesSession();
NotesDatabase LNDatabase = null;
NotesDocument LNDocument;
NotesMIMEEntity LNBody;
NotesStream LNStream;
NotesMIMEHeader LNHeader;
try
{
LNSession.Initialize(txtPassword.Text);
LNDatabase = LNSession.GetDatabase(txtServer.Text, txtUserName.Text, false);
LNStream = LNSession.CreateStream();
LNSession.ConvertMime = false;
//Create an email
LNDocument = LNDatabase.CreateDocument();
LNDocument.ReplaceItemValue("Form", "Memo");
LNBody = LNDocument.CreateMIMEEntity();
LNHeader = LNBody.CreateHeader("Subject");
LNHeader.SetHeaderVal("Add your subject here");
LNHeader = LNBody.CreateHeader("To");
LNHeader.SetHeaderVal("Give your recipient email address");
LNStream.WriteText("<html>");
LNStream.WriteText("<body bgcolor=\"blue\" text=\"white\">");
LNStream.WriteText("<table border=\"2\">");
LNStream.WriteText("<tr>");
LNStream.WriteText("<td>Hello World!</td>");
LNStream.WriteText("</tr>");
LNStream.WriteText("</table>");
LNStream.WriteText("</body>");
LNStream.WriteText("</html>");
LNBody.SetContentFromText(LNStream, "text/HTML;charset=UTF-8", MIME_ENCODING.ENC_IDENTITY_7BIT);
LNDocument.Send(false);
}
catch (Exception e)
{
MessageBox.Show(e.Message);
}
If you're just sending email, you should look at the NotesMIMEEntity class and review this website for examples: http://www-01.ibm.com/support/docview.wss?uid=swg21098323
PassThruHTML won't help you much unless you're trying to display custom HTML in a browser when viewing a Notes document or form via Domino.