Access to the content of a process - C#

I create an instance of IE with this code:
System.Diagnostics.Process p = System.Diagnostics.Process.Start(
    "IEXPLORE.EXE",
    @"http://www.asnaf.ir/moreinfounit.php?sSdewfwo87kjLKH7624QAZMLLPIdyt75576rtffTfdef22de=1&iIkjkkewr782332ihdsfJHLKDSJKHWPQ397iuhdf87D3dffR=2009585&gGtkh87KJg89jhhJG75gjhu64HGKvuttt87guyr6e67JHGVt=117&cCli986gjdfJK755jh87KJ87hgf9871g00113kjJIZAEQ798=0a26e8ea07358781d128aa4bc98dd89a");
I want to get the contents of the opened window. Is it possible to read the HTML content through this process?

Use the following code:
using System.Net; // for WebClient

using (var client = new WebClient())
{
    string result = client.DownloadString("http://www.asnaf.ir/moreinfounit.php?sSdewfwo87kjLKH7624QAZMLLPIdyt75576rtffTfdef22de=1&iIkjkkewr782332ihdsfJHLKDSJKHWPQ397iuhdf87D3dffR=2009585&gGtkh87KJg89jhhJG75gjhu64HGKvuttt87guyr6e67JHGVt=117&cCli986gjdfJK755jh87KJ87hgf9871g00113kjJIZAEQ798=0a26e8ea07358781d128aa4bc98dd89a");
    // TODO: your logic here
}

No. Your processes run in different virtual address spaces; it would be a serious security vulnerability if one process could freely read the memory allocated by another.
Edit: Consider using something like a WebBrowser control in your original process. That way you could easily retrieve the page it displays.
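For example, a minimal sketch of that approach (assuming a Windows Forms app, since the control needs a message loop to raise its events):
using System.Windows.Forms;

// Rough sketch: navigate a WebBrowser control and read the HTML once loaded.
var browser = new WebBrowser();
browser.ScriptErrorsSuppressed = true;
browser.DocumentCompleted += (sender, e) =>
{
    string html = browser.DocumentText; // HTML of the page just loaded
    // ... parse the HTML here ...
};
browser.Navigate("http://www.asnaf.ir/moreinfounit.php?..."); // the URL from the question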

It might be possible, but I'd actually use an HttpWebRequest to obtain the HTML content. If you really just want the HTML for a given HTTP URL, launching IE as a separate process is definitely not the way to go.
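A minimal sketch of the HttpWebRequest approach (the URL is a placeholder; error handling omitted):
using System.IO;
using System.Net;

// Sketch: fetch the HTML for a URL directly, without launching a browser.
var request = (HttpWebRequest)WebRequest.Create("http://example.com/"); // your URL here
using (var response = (HttpWebResponse)request.GetResponse())
using (var reader = new StreamReader(response.GetResponseStream()))
{
    string html = reader.ReadToEnd(); // the raw HTML of the page
}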

You should use the WebClient class to retrieve web page content. Check this link:
http://msdn.microsoft.com/en-us/library/system.net.webclient(v=vs.80).aspx

Related

Streaming MP3 Chunks on ASP.NET

Currently, I have a feature on an ASP.NET website where the user can play back MP3 Files. The code looks something like this:
Response.Clear();
Response.ContentType = "audio/mpeg";
foreach (DataChunk leChunk in db.Mp3Files
    .First(mp3 => mp3.Mp3ResourceId.Equals(id))
    .Data.Chunks.OrderBy(chunk => chunk.ChunkOrder))
{
    Response.BinaryWrite(leChunk.Data);
}
Unfortunately, if a larger MP3 file is selected, the audio does not begin to play until the entire file is downloaded, which can cause a noticeable delay. Is there any way to get the MP3 to start playing immediately, even though the entire file may not yet be transferred?
You should be able to do what you want by writing to the output stream of the response, i.e.:
Response.OutputStream.Write
It is also probably a good idea to check Response.IsClientConnected beforehand and give up if the client has disconnected.
I found a demo that allows playback of MP3 files from an ASP.NET web application:
http://aspsnippets.com/Articles/Save-MP3-Audio-Files-to-database-and-display-in-ASPNet-GridView-with-Play-and-Download-option.aspx
Try this:
Response.BufferOutput = false; // enables chunked transfer encoding
Response.ContentType = "audio/mpeg";
using (var bw = new BinaryWriter(Response.OutputStream))
{
    foreach (DataChunk leChunk in db.Mp3Files
        .First(mp3 => mp3.Mp3ResourceId.Equals(id))
        .Data.Chunks.OrderBy(chunk => chunk.ChunkOrder))
    {
        if (!Response.IsClientConnected)
        {
            break; // stop writing; avoids the "host closed the connection" exception
        }
        bw.Write(leChunk.Data);
    }
}
Also, go to your web.config file and add this if you still have problems with chunked encoding:
<system.webServer>
    <asp enableChunkedEncoding="true" />
</system.webServer>
The "host closed the connection" error you reported probably happens because you open the page directly in the browser: when the browser reads the content type, it hands the stream off to the media player and closes its own connection, which is the connection your code was writing to. To avoid this, check periodically whether the client is still connected.
Finally, I would use a generic handler (.ashx) or a custom handler mapped to the .mp3 extension rather than an .aspx page, to avoid the unnecessary overhead of the page lifecycle.
I hope this helps.
Try setting Response.BufferOutput = false before streaming the response.
If the locations of the MP3 files are publicly available to your users, an alternative approach could be to just return the MP3's URL and use the HTML5 audio tag in your markup to stream the music. I am pretty sure the default behaviour of the audio tag is to stream the file rather than wait until the whole file has downloaded.
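For example (the src URL is a placeholder; actual streaming behaviour depends on the browser and on the server supporting range requests):
<audio controls src="/media/song.mp3">
    Your browser does not support the audio element.
</audio>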
One method to support this would be implementing HTTP byte-range requests.
By default I don't believe ASP.NET does this, and it definitely won't with any of the code in the question or the other answers.
You can implement it manually with a little work, though. Another option, which would be much less dev work, would be to let IIS serve the files statically; I assume that isn't an option here.
Here's an example implementation:
http://www.codeproject.com/Articles/820146/HTTP-Partial-Content-In-ASP-NET-Web-API-Video
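As a rough illustration of the idea, a simplified generic handler might parse the Range header and return a 206 response. This is only a sketch (single absolute range, no validation, and it buffers the whole file; GetMp3Bytes is a hypothetical helper), not a complete implementation:
using System;
using System.Web;

public class Mp3Handler : IHttpHandler
{
    public void ProcessRequest(HttpContext context)
    {
        byte[] data = GetMp3Bytes(context.Request.QueryString["id"]); // hypothetical helper
        long start = 0, end = data.Length - 1;

        string range = context.Request.Headers["Range"]; // e.g. "bytes=1000-"
        if (!string.IsNullOrEmpty(range) && range.StartsWith("bytes="))
        {
            // Note: suffix ranges like "bytes=-500" are not handled correctly here.
            string[] parts = range.Substring(6).Split('-');
            if (parts[0].Length > 0) start = long.Parse(parts[0]);
            if (parts.Length > 1 && parts[1].Length > 0) end = long.Parse(parts[1]);
            context.Response.StatusCode = 206; // Partial Content
            context.Response.AppendHeader("Content-Range",
                string.Format("bytes {0}-{1}/{2}", start, end, data.Length));
        }

        context.Response.ContentType = "audio/mpeg";
        context.Response.AppendHeader("Accept-Ranges", "bytes");
        context.Response.OutputStream.Write(data, (int)start, (int)(end - start + 1));
    }

    public bool IsReusable { get { return true; } }

    private byte[] GetMp3Bytes(string id) { /* load from the database */ throw new NotImplementedException(); }
}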

Download content from the internet with code

I have to download some content from a website every day, so I figure it would be nice to have a program that does it. The problem is that the website requires authentication.
My current solution is by using System.Windows.Forms.WebBrowser control. I currently do something like:
/* Create browser */
System.Windows.Forms.WebBrowser browser = new System.Windows.Forms.WebBrowser();
/* Navigate to the desired site */
browser.Navigate("http://stackoverflow.com/");
// ... wait for the browser to finish downloading the DOM ...
/* Get all tags of type input */
var elements = browser.Document.Body.GetElementsByTagName("input");
/* Look for the one we are interested in */
foreach (System.Windows.Forms.HtmlElement curInput in elements)
{
    if (curInput.GetAttribute("name") == "q")
    {
        curInput.SetAttribute("value", "I changed the value of this input");
        break;
    }
}
// etc
I think this approach works, but it is not the best solution. I have also tried the WebClient class, but it does not work; I believe the reason is that I have to save the cookies?
So my question is: how can I track all the bytes that get sent to the server and all the bytes that come back in the response, so that I can download what I need? In other words, I would like the WebClient to act as a web browser; once I get to the part I need, I should be able to parse the data I want straight out of the source.
I would appreciate it if someone could show me an example of how to do this. Google Chrome's developer tools do a pretty good job of displaying this kind of information.
Thanks in advance,
Antonio
Answering your question:
The best utility I know of for tracking traffic is Fiddler (it's free).
For sending advanced HTTP requests, you should use the System.Net.HttpWebRequest class, which has CookieContainer and Headers properties that let you do whatever you need.
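For example, a rough sketch of logging in and reusing the cookies (the URLs, form field names, and credentials are placeholders for whatever the site actually expects):
using System.IO;
using System.Net;
using System.Text;

// Sketch: POST a login form, keep the cookies, then request the protected page.
var cookies = new CookieContainer();

var login = (HttpWebRequest)WebRequest.Create("http://example.com/login"); // placeholder
login.Method = "POST";
login.ContentType = "application/x-www-form-urlencoded";
login.CookieContainer = cookies; // the server's session cookie lands here

byte[] body = Encoding.UTF8.GetBytes("username=me&password=secret"); // placeholder fields
login.ContentLength = body.Length;
using (var stream = login.GetRequestStream())
    stream.Write(body, 0, body.Length);
using (login.GetResponse()) { } // cookies are now stored in 'cookies'

var page = (HttpWebRequest)WebRequest.Create("http://example.com/protected"); // placeholder
page.CookieContainer = cookies; // send the same cookies back
using (var response = page.GetResponse())
using (var reader = new StreamReader(response.GetResponseStream()))
{
    string html = reader.ReadToEnd();
}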
Hope it helps.

Parsing XML from a WebBrowser control?

I need to parse an XML file (generated by PHP) from a WebBrowser control, because the page I am trying to parse requires cookies to track some things. When I use something like:
string xmlUrl = "urltophpfile";
XmlTextReader reader = new XmlTextReader(xmlUrl);
to parse it, cookies aren't enabled, so I need to use a WebBrowser control or something else that supports cookies.
The problem I am having is that when I put the WebBrowser text into a string (string info = webBrowser2.DocumentText.ToString();), it gives the full HTML source of the rendered page, so I can't parse it as XML.
Does anyone have any suggestions on how I can work this out please?
You should use HttpWebRequest and specify the CookieContainer property.
This URL has a good example of it: http://msdn.microsoft.com/en-us/library/system.net.httpwebrequest.cookiecontainer.aspx
EDIT: To be clear, I mean use HttpWebRequest to fetch the XML, and then load it with XmlReader.Create, using one of the overloads that accepts a Stream or TextReader.
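Putting those together, a minimal sketch (the cookie handling is a placeholder; you would populate the container with whatever the PHP page requires):
using System.Net;
using System.Xml;

var request = (HttpWebRequest)WebRequest.Create("urltophpfile"); // same URL as in the question
request.CookieContainer = new CookieContainer(); // add the required cookies here
using (var response = request.GetResponse())
using (XmlReader reader = XmlReader.Create(response.GetResponseStream()))
{
    while (reader.Read())
    {
        // ... handle the XML nodes here ...
    }
}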

How to get raw page source (not generated source) from C#

The goal is to get the raw source of the page; I mean, do not run the scripts or let the browser reformat the page at all. For example, suppose the source is <table><tr></table>; after the response I don't want to get <table><tbody><tr></tr></tbody></table>. How can I do this from C# code?
More info: for example, typing "view-source:http://feeds.gawker.com/kotaku/full" into the browser's address bar will give you an XML file, but if you just navigate to "http://feeds.gawker.com/kotaku/full" it will render an HTML page. What I want is the XML file. I hope this is clear.
Here's one way, but it's not really clear what you actually want.
using(var wc = new WebClient())
{
var source = wc.DownloadString("http://google.com");
}
If you mean when rendering your own page: you can get access to the raw page content using a response filter, or by overriding the page's Render method. I would question your motives for doing this, though.
Scripts run client-side, so they have no bearing on any C# code.
You can use a tool such as Fiddler to see what is actually being sent over the wire.
disclaimer: I think Fiddler is amazing

Read only the title and/or META tags of an HTML file, without loading the complete HTML file

Scenario :
I need to parse millions of HTML files/pages (as fast as I can) and then read only the Title or META part of each and dump it to a database.
What I am doing is using the System.Net.WebClient class's DownloadString(url_path) to download, and then saving it to the database with LINQ to SQL.
But this DownloadString function gives me the complete HTML source, and I need only the Title and META tag parts.
Any ideas how to download only that much content?
I think you can open a stream for this URL and use it to read just the first x bytes. I can't tell you the exact number, but I think you can set it to a reasonable value to get the title and the description.
HttpWebRequest fileToDownload = (HttpWebRequest)WebRequest.Create("YourURL");
using (WebResponse fileDownloadResponse = fileToDownload.GetResponse())
using (Stream fileStream = fileDownloadResponse.GetResponseStream())
using (StreamReader fileStreamReader = new StreamReader(fileStream))
{
    char[] x = new char[Number]; // Number = how many characters to read
    int read = fileStreamReader.Read(x, 0, Number); // may read fewer than Number chars
    string data = new string(x, 0, read); // avoids per-character string concatenation
}
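Once you have the partial content, you could pull the title out with a simple regex. A rough sketch (it assumes the <title> element fits inside the characters you read, and that the markup is well-formed enough for a regex):
using System.Text.RegularExpressions;

// Sketch: extract the <title> text from the partial HTML read above.
Match m = Regex.Match(data, @"<title[^>]*>\s*(.*?)\s*</title>",
    RegexOptions.IgnoreCase | RegexOptions.Singleline);
string title = m.Success ? m.Groups[1].Value : null;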
I suspect that WebClient will try to download the whole page first, in which case you'd probably want a raw client socket. Send the appropriate HTTP request (manually, since you're using raw sockets), start reading the response (which will not arrive immediately), and kill the connection when you've read enough. However, the rest will probably already have been sent from the server and be winging its way to your PC whether you want it or not, so you might not save much bandwidth, if anything.
Depending on what you want it for, many half decent websites have a custom 404 page which is a lot simpler than a known page. Whether that has the information you're after is another matter.
You can use the verb "HEAD" in an HttpWebRequest to return only the response headers (not the body). To get the full title or META elements you'll need to download the page and parse out the data you want.
var request = (HttpWebRequest)WebRequest.Create(uri);
request.Method = "HEAD";
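A minimal sketch of issuing the HEAD request and reading a few headers (the URL is a placeholder; note the headers will not include the <title> or META content):
using System;
using System.Net;

var headRequest = (HttpWebRequest)WebRequest.Create("http://example.com/"); // placeholder
headRequest.Method = "HEAD"; // headers only, no response body
using (var headResponse = (HttpWebResponse)headRequest.GetResponse())
{
    Console.WriteLine(headResponse.ContentType);
    Console.WriteLine(headResponse.ContentLength);
    Console.WriteLine(headResponse.LastModified);
}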
