Parsing AJAX driven page - c#

I am trying to parse data from a page that is not filled in until after the page is finished loading. Because of this I cannot get a simple solution utilizing
while (wb.ReadyState != WebBrowserReadyState.Complete)
{
    Application.DoEvents();
}
to work. I have tried the solution found at View Generated Source (After AJAX/JavaScript) in C#, but I cannot figure out how to make it wait until the post-load data has been downloaded. Please help! The data is automatically filled into the page after it loads; no user interaction is required. Thanks!
I just found Waiting for WebBrowser ajax content, where the answer was to use a timer. I am not sure how to do this with a timer instead of Thread.Sleep() (which blocks the thread completely). Can someone help me understand the proper way to use it, with a quick code sample? Thanks again.
I am looking into the suggestion of calling the AJAX myself, but I think it would be better to use the timer. I am still looking for help on the subject. Thanks.

Take a look at the page you are dealing with in Firebug for Firefox. Its "Net" tab lets you see the raw data of every HTTP request, including the Ajax requests that occur while the page is loading (but after the initial part of the page has loaded).
By looking at this data, you will quite likely find JSON or XML that contains exactly what you are looking for, returned in response to a GET request containing an ID or something of that nature.
Using a 'fake' browser, as mentioned in that linked post, should be considered a last resort: it will yield the worst performance on your end, because you will likely be downloading and parsing a lot more data than necessary.
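As a rough illustration of calling the Ajax endpoint yourself, here is a minimal sketch. It assumes the request seen in the Net tab returns a JSON object; the class name, the endpoint URL you pass in, and the property names are all hypothetical, and System.Text.Json is assumed to be available (.NET Core 3.0+ or the NuGet package).

```csharp
using System;
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

public static class AjaxEndpointSketch
{
    private static readonly HttpClient Client = new HttpClient();

    // Downloads the raw response body of the request observed in the Net tab.
    // endpointUrl is whatever URL the page itself calls (an assumption here).
    public static Task<string> FetchAsync(string endpointUrl)
    {
        return Client.GetStringAsync(endpointUrl);
    }

    // Returns the named top-level property as text, or null if the root is
    // not a JSON object or the property is absent.
    public static string ReadProperty(string json, string propertyName)
    {
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            if (doc.RootElement.ValueKind == JsonValueKind.Object &&
                doc.RootElement.TryGetProperty(propertyName, out JsonElement value))
            {
                return value.ToString();
            }
            return null;
        }
    }
}
```

With this approach no browser control is involved at all: the response from FetchAsync is passed straight to ReadProperty (or to whatever parser fits the real payload).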

For my situation the following solved it:
while (wb.ReadyState != WebBrowserReadyState.Complete)
    Application.DoEvents();
while (wb.Document.GetElementById(elementId) != null && wb.Document.GetElementById(elementId).InnerHtml == null)
    Application.DoEvents();
The second while loop waits until a specified element is populated by the AJAX. In my situation, if an invalid store number is provided in the URL, the site forwards to a 404-type page. The first condition verifies that the element still exists on the page, which it won't if the browser gets sent to the 404 page. The second condition waits until the element is populated.
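For anyone who would rather not spin on Application.DoEvents(), the same wait can be sketched as an asynchronous poll with a timeout. This is not the code used above, only one possible alternative; the Poller class, its method name, and the timeout and interval values are assumptions, and it needs async/await support (.NET 4.5 or later).

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class Poller
{
    // Re-checks the condition every pollInterval until it returns true or the
    // timeout elapses. Returns true if the condition was met, false on timeout.
    public static async Task<bool> WaitUntilAsync(
        Func<bool> condition, TimeSpan timeout, TimeSpan pollInterval)
    {
        Stopwatch clock = Stopwatch.StartNew();
        while (!condition())
        {
            if (clock.Elapsed >= timeout)
            {
                return false;
            }
            // Unlike Thread.Sleep, this yields control while waiting.
            await Task.Delay(pollInterval);
        }
        return true;
    }
}

// Hypothetical use from an async WinForms event handler (values are assumptions):
// bool done = await Poller.WaitUntilAsync(() =>
// {
//     var el = wb.Document.GetElementById(elementId);
//     return el == null || el.InnerHtml != null;
// }, TimeSpan.FromSeconds(10), TimeSpan.FromMilliseconds(100));
```

When awaited on the UI thread, the continuation resumes on that same thread, so touching wb.Document inside the condition stays safe, and the timeout prevents the loop from running forever if the element is never populated.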
An interesting thing I found is that after the AJAX populates the page, wb.DocumentText and wb.DocumentStream still contain the originally downloaded HTML. Only the live document (for example wb.Document.Body.InnerHtml) is updated. In my situation I am creating an HtmlAgilityPack HtmlDocument from the results. Because the DocumentStream becomes outdated, I have to recreate my document like this:
htmlDoc.LoadHtml("<html><head><title>" + wb.DocumentTitle + "</title></head><body>" + wb.Document.Body.InnerHtml + "</body></html>");
In my situation I don't care about meta/scripts in the header, so this works. If someone cared about those things, they would obviously need to adapt that line of code for their own use.

Related

Asp.Net asynchronous loop not firing

I am attempting to send an image to the browser every 16 ms (~60Hz) from a specific file on the drive which is changing constantly (also at ~60Hz). To do this I am using Response.BinaryWrite(). Below is my code (it really is quite simple)
protected void Page_Load(object sender, EventArgs e)
{
    MainLoop();
}
private async Task MainLoop()
{
    while (true)
    {
        //Update image to latest from server
        Response.ContentType = "image/png";
        Response.BinaryWrite(File.ReadAllBytes(Server.MapPath("/Frames/frame.png")));
        Response.Flush();
        await Task.Delay(16);
    }
}
My problem is that it does not refresh. I have double and triple checked that the file is indeed changing, and I have also tried updating a label with the current time in milliseconds. I have found that even when doing that, it does not update at all after the page loads.
If I reload the page it displays both a new image, and a new time, so the issue doesn't seem to be in the Response writing. Rather, it's as if the page is simply ignoring the loop entirely, and only running once through.
If anybody has any advice on alternatives to try (keep in mind video is not an option due to the live nature of this) I would be glad to hear them. Perhaps I am missing something very simple here, but I just can't find it!
Thanks!
Thoughts on the problem:
So the problem is that the page doesn't render updated results every 16 ms. We can't solve this issue through server-side code (e.g. C# code like your snippet).
I'm sure the while loop is still running, but you will always need to refresh the page manually, since this is ASP.NET (a traditional web app with page refreshes). Here is a Stack Overflow answer that can back me up (in case you are skeptical about what I'm saying): refresh page after 3 seconds using c#
I'm not sure what the end goal here is, but I'm assuming you want to render your page with your updated PNG every 16 ms. Basically, from my two points above: don't try to update the image from the server side; instead, look into client-side options (e.g. JavaScript).
Of course, I would try to comment further on where exactly you should look, but I have never really dealt with updating an image every 16 ms before!
I'm just going to throw in some links that could generate some ideas:
Refresh an image in the browser every x milliseconds
Change image in HTML page every few seconds
http://www.labbookpages.co.uk/web/realtime.html
I hope this can push you along! Let me know how this goes because I'm curious about how you will do this :)
You need to not only get the new image but also ensure the browser does not use the cached image.
So, there are four steps to the solution:
1. Create a controller action that returns the current image and takes an argument. The argument becomes a query string parameter. Each time you call the action, the parameter value should be different. This way, the browser considers the URL to be new and will not use a cached image. Your URL should look something like http://mysite/mycontroller/getimage?param=12345678927483817
2. Create a JavaScript function that changes the src property of the image by pointing it to the URL of your new controller action. Use something that constantly changes for your query parameter. I suggest using the current date and time in milliseconds.
3. After the page loads, set a timer in JavaScript to call the function you created in Step 2.
4. Eat a cookie, because cookies are good.
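The changing query parameter from steps 1 and 2 is the same idea in any language. Here is a minimal C# sketch of building such a URL, for instance when rendering the initial img tag on the server; the CacheBuster class and its method name are hypothetical, and the client-side timer in step 3 would do the equivalent concatenation in JavaScript.

```csharp
using System;

public static class CacheBuster
{
    // Appends a value that changes every millisecond, so each generated URL
    // is distinct and the browser cannot reuse a cached copy of the image.
    public static string WithTimestamp(string imageUrl, DateTimeOffset now)
    {
        // Use '&' if the URL already carries a query string, '?' otherwise.
        string separator = imageUrl.Contains("?") ? "&" : "?";
        return imageUrl + separator + "param=" + now.ToUnixTimeMilliseconds();
    }
}
```

Calling CacheBuster.WithTimestamp(url, DateTimeOffset.UtcNow) at render time yields a fresh URL on every request.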

Selenium - PhantomJS - Findelements in DOM traversal is slow

For reasons, I'm trying to recurse through the DOM using Selenium/PhantomJS.
It works, but it's slow and I don't know why.
FindElements seems to take about 250 ms every time.
I've tried zeroing the implicit wait without much success. I've also tried using XPath, with no real change.
Here's the code. Any suggestions?
public static void RecurseDomFromTop()
{
    DomRecursor( pjsDriver.FindElement( By.TagName( "*" ) ) );
}
public static void DomRecursor( IWebElement node )
{
    ReadOnlyCollection<IWebElement> iwes = node.FindElements( By.TagName( "*" ) );
    foreach (IWebElement iwe in iwes)
    {
        DomRecursor( iwe );
    }
}
The approach you are taking to compare two DOMs is wrong. Every time you make a Selenium request, an HTTP request is created and sent to the driver, which sends it to the browser; the browser then replies to the driver, and the driver replies to your language binding. There is a lot of overhead involved in this.
Instead, use driver.PageSource to get the whole HTML in a single call. You can then use an HTML parsing library, which will be at least 10x faster than the approach you are taking now.
Take a look at the article below, which uses HtmlAgilityPack to get DOM data:
https://www.codeproject.com/Articles/659019/Scraping-HTML-DOM-elements-using-HtmlAgilityPack-H
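To make the suggestion concrete, here is a minimal sketch of walking the DOM in memory after a single PageSource call. It assumes the HtmlAgilityPack NuGet package is referenced; the DomWalker class and its method names are hypothetical, and what you do at each node is up to you (this version just records element names).

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

public static class DomWalker
{
    // Parses the HTML once, then visits every element node in document order
    // without any further round trips to the browser.
    public static List<string> ElementNames(string pageSource)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(pageSource);
        var names = new List<string>();
        Visit(doc.DocumentNode, names);
        return names;
    }

    private static void Visit(HtmlNode node, List<string> names)
    {
        foreach (HtmlNode child in node.ChildNodes)
        {
            // Skip text and comment nodes; only elements are of interest here.
            if (child.NodeType != HtmlNodeType.Element)
            {
                continue;
            }
            names.Add(child.Name);
            Visit(child, names);
        }
    }
}
```

With the driver from the question this would be called as DomWalker.ElementNames(pjsDriver.PageSource), which costs one Selenium request instead of one per node.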

How to set the start position for a video using AxShockwaveFlash (revised)

Revised question with added detail...
I'm trying to use AxShockwaveFlash to play a YouTube video and start it at a specific location. This is within a C# WinForms app. I have the basics: I can successfully start and stop the video. I cannot, however, figure out how to set the position and start at a specific time within the video. What property or method needs to be called to do this?
Here's what I have so far:
AxShockwaveFlashObjects.AxShockwaveFlash mFlashPlayer;
...
mFlashPlayer.Movie = @"http://www.youtube.com/v/9O9HfafzBPE?version=3&hl=ru_RU";
mFlashPlayer.Play();
I've tried setting various properties and calling various methods, but I'm just guessing. For example, I tried calling
mFlashPlayer.GotoFrame(200);
and it did nothing. It seems so simple that I'm starting to wonder if I'm encountering a bug.
I also tried the standard form used in a web browser, which is to encode the time directly in the URL. For example, the following did not work either:
mFlashPlayer.Movie = @"http ...blah... #t=77s";
mFlashPlayer.Play();
Thanks in advance for any suggestions.
ANSWER
Apparently, I'm not authorized to post an answer, so I'm editing the question and adding it here...
Got it! The magic trick is to set the 'start' parameter in the url that you provide to the Movie property or the LoadMovie() method.
Embedded AS3 player:
http://www.youtube.com/v/VIDEO_ID?version=3&start=NUMBER_SECONDS
Chromeless AS3 player:
http://www.youtube.com/apiplayer?video_id=VIDEO_ID&version=3&start=NUMBER_SECONDS
Replace VIDEO_ID and NUMBER_SECONDS with the desired values.
Example:
mFlashPlayer.Movie = @"http://www.youtube.com/v/1ZKz2KW87Y4?version=3&start=100";
I found the required info here:
https://developers.google.com/youtube/player_parameters
That was one of the more frustrating hunts for info that I've been on in a while. Hope this post saves other folks the headaches.

Trying to work with Request.Files in WebMatrix

Hi everyone,
I am trying to work with uploading images to my site, and I have successfully gotten it to work; however, I need to extend the functionality beyond that of one simple image. The examples I have seen use the WebMatrix File Helper (File Helper? Is that right? Oh well, it's a helper of some kind that auto-plots the HTML necessary for the input type="file" field). The line of code I have in the form:
@FileUpload.GetHtml(initialNumberOfFiles:1, allowMoreFilesToBeAdded:false, includeFormTag:false)
The line of code I have in (IsPost):
var UploadedPicture = Request.Files[0];
if(Path.GetFileName(UploadedPicture.FileName) != String.Empty)
{
    var ContentType = UploadedPicture.ContentType;
    var ContentLength = UploadedPicture.ContentLength;
    var InputStream = UploadedPicture.InputStream;
    Session["gMugshot"] = new byte[ContentLength];
    InputStream.Read((byte[])Session["gMugshot"], 0, ContentLength);
}
else
{
    Session["gMugshot"] = new byte[0];
}
More code in the (IsPost) after this stores it in the database as binary data, and I can get the image back on the page from that (I have no desire to save the actual image files in a folder on the server and use GUID, etc. etc. Binary data is fine, and I imagine takes up a lot less space).
I have it set up to click-ably scroll through pictures by using jQuery to read the clicks of manually created buttons and subsequently hide and unhide the divs that contain the images rendered by C# (which gets them from reading the database). Sorry if that's a little TMI, just trying to be thorough, but to refine the question: I don't know enough about file uploading to know how to work with the uploaded data that well yet. I tried researching this information, but I didn't find any information that seemed pertinent to me (actually, I didn't find much useful information on input type="file", or the FileUpload method, at all, really).
Would it be better to use input type="file" id="pic1id"? Is there something that I can use such as Request.Files["pic1id"] that could get the file from the id of the input element? Or does the program simply take all uploaded files, stick them in a logistical group somewhere waiting to be called by index like this: "Request.Files[0]". If so, what order does the index get put in? Do I need to use Request.Files.Count to test how many have been uploaded before I begin working with the data?
Please note that I want separate input type="file" fields (whether plotted by the helper or not). I do not want to accept multiple files in one input (mainly because of a lack of knowledge, e.g., I am afraid I won't know how to work with the data). So far, the plan is that the separate input type="file" fields will be within the divs that get hidden/unhidden upon scrolling through pictures, so each picture (space) will have its own input type="file" field. The hiding and unhiding of divs, (the one) picture being displayed, storing and receiving binary data from the database, and clicking through the picture placeholders all function great. Pretty much I just need to know how to work with more than one uploaded picture at a time for storage in their individual database "image" fields.
Any examples of the syntax I need to use will be much appreciated.
Sorry for so many questions, but I just couldn't find much useful information on this at all.
Thanks to any who try to help!
Okay, in order to solve this, I had to test and test and test, until something finally worked for me. Here's what I did:
First, I abandoned the part of the helper that plotted the HTML; that is, I took out:
@FileUpload.GetHtml(initialNumberOfFiles:1, allowMoreFilesToBeAdded:false, includeFormTag:false)
And added a regular input type="file" with a certain id, such as id="pic1".
Next, I was able to get the individual posted file by that identifier (the browser posts the input's name attribute, so the input needs name="pic1" as well). This was really the main thing I needed to know how to do, and it really was as simple as this:
Request.Files["pic1"];
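For storing each upload as binary data, here is a small sketch of reading a posted file's stream completely. The UploadReader class is hypothetical and not part of WebMatrix. It copies the stream in a loop internally, because a single Stream.Read call is not guaranteed to fill the whole buffer.

```csharp
using System.IO;

public static class UploadReader
{
    // Copies the entire input stream into a byte array.
    public static byte[] ReadFully(Stream input)
    {
        using (var buffer = new MemoryStream())
        {
            input.CopyTo(buffer);
            return buffer.ToArray();
        }
    }
}

// Hypothetical use inside the IsPost block, once per input ("pic2" etc. are assumed names):
// var pic1 = Request.Files["pic1"];
// if (pic1 != null && pic1.ContentLength > 0)
// {
//     Session["gMugshot"] = UploadReader.ReadFully(pic1.InputStream);
// }
```

The null and length checks cover an input that is missing from the form or left empty, so Request.Files.Count does not need to be inspected first.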

Locate only non-hidden elements using Selenium WebDriver in C#

I have a collection of records on a web page, and when a record is clicked, a 'Delete' link is displayed (actually 'unhidden', as it's always there).
When trying to access this 'Delete' link, I am using its value.
When I use Driver.FindElement, it returns the first Delete link, even though it's hidden, so I can't click it (and shouldn't, as it is not the right link).
So, what I basically want to do is find only non-hidden links. The code below works, but as it iterates through every Delete link I am afraid it may be inefficient.
Is there a better way?
public class DataPageModel : BasePageModel
{
    private static readonly By DeleteSelector = By.CssSelector("input[value=\"Delete\"]");
    private IWebElement DeleteElement
    {
        get
        {
            var elements = Driver.FindElements(DeleteSelector);
            foreach (var element in elements.Where(e => e.Displayed))
            {
                return element;
            }
            Assert.Fail("Could not locate a visible Delete Element");
            return null;
        }
    }
}
While I agree with @Torbjorn that you should be wary about where you spend your time optimizing, I do think this code is a bit inefficient.
Basically, what is slowing the code down is the back-and-forth checking of each element to see if it's displayed. To speed up the code, you need to get the element you want in one go.
Two options (both involve JavaScript):
jQuery
Take a look at the different ways to bring jQuery selectors to Selenium (I wrote about it here). Once you have that, you can make use of jQuery's :visible selector.
Alternatively if you know for sure the page already has jQuery loaded and you don't want to do all the extra code, you can simply use ExecuteScript:
IWebElement element = (IWebElement)driver.ExecuteScript("return $('input[value=\"Delete\"]:visible').first().get(0)");
JavaScript
If you want to avoid jQuery, you can just write a JavaScript function to do the same thing you are doing now in C#: get all the possible elements and return the first visible one.
Then you would do something similar:
string script = "/* your JavaScript here */";
IWebElement element = (IWebElement)driver.ExecuteScript(script);
You trade off readability to different degrees depending on which option you pick, but they should all be more efficient. Of course, these all require that JavaScript be enabled in the browser.
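As one possible shape for the plain JavaScript option, here is a sketch. The VisibleElementFinder class is hypothetical, and the visibility test (non-zero offsetWidth or offsetHeight) is only a heuristic. It is an assumption that this is close enough to Selenium's Displayed for your page, and it can differ for elements hidden with visibility:hidden, for example.

```csharp
using OpenQA.Selenium;

public static class VisibleElementFinder
{
    // JavaScript executed in the page: returns the first element matching the
    // CSS selector that occupies layout space, or null if there is none.
    public const string Script =
        "var nodes = document.querySelectorAll(arguments[0]);" +
        "for (var i = 0; i < nodes.length; i++) {" +
        "  var n = nodes[i];" +
        "  if (n.offsetWidth > 0 || n.offsetHeight > 0) { return n; }" +
        "}" +
        "return null;";

    // Finds the first visible match in a single round trip to the browser.
    public static IWebElement FirstVisible(IWebDriver driver, string cssSelector)
    {
        var js = (IJavaScriptExecutor)driver;
        return js.ExecuteScript(Script, cssSelector) as IWebElement;
    }
}
```

For the page model in the question this would be called as VisibleElementFinder.FirstVisible(Driver, "input[value=\"Delete\"]"), with a null result meaning no visible Delete element was found.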
