I have done something similar before, but I'm not sure how to do it in a bigger project.
I'm trying to return the titles of all the stuff on the front page of reddit.
From this site:
http://www.reddit.com/r/all.json
I pasted the data into
http://json2csharp.com/#
to find out the class I need.
From here, though, I'm not sure how to proceed. If I want to return an array of all this data so I can easily get at the information, how can I do it?
Sorry for the vagueness of this question but I'm just at a loss and don't know what to do.
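Use WebClient to download the JSON:
string json;
using (var webClient = new System.Net.WebClient())
{
// Declared outside the using block so the text is still available afterwards
json = webClient.DownloadString("http://www.reddit.com/r/all.json");
}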
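For older .NET versions:
// Assumption: url is the address from the question
var url = "http://www.reddit.com/r/all.json";
var request = WebRequest.Create(url);
string text;
request.ContentType = "application/json; charset=utf-8";
var response = (HttpWebResponse) request.GetResponse();
using (var sr = new StreamReader(response.GetResponseStream()))
{
// Read the whole response body into a string
text = sr.ReadToEnd();
}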
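Once the JSON text is downloaded, the titles still have to be pulled out of it. Here is a minimal sketch of that step, assuming System.Text.Json is available (.NET Core 3.0 or later; that is my choice here, not something the answer above prescribes) and that the listing nests each title under data -> children -> data -> title. The class name RedditTitles and the sample JSON are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class RedditTitles
{
    // Walks data -> children[] -> data -> title in a listing document.
    public static List<string> ExtractTitles(string json)
    {
        var titles = new List<string>();
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            JsonElement children = doc.RootElement
                .GetProperty("data")
                .GetProperty("children");
            foreach (JsonElement child in children.EnumerateArray())
            {
                titles.Add(child.GetProperty("data").GetProperty("title").GetString());
            }
        }
        return titles;
    }

    public static void Main()
    {
        // Hand-written sample in the shape of a listing; real responses carry many more fields.
        string sample = "{\"kind\":\"Listing\",\"data\":{\"children\":["
            + "{\"kind\":\"t3\",\"data\":{\"title\":\"First post\"}},"
            + "{\"kind\":\"t3\",\"data\":{\"title\":\"Second post\"}}]}}";
        foreach (string title in ExtractTitles(sample))
        {
            Console.WriteLine(title);
        }
    }
}
```

If you prefer the classes generated by json2csharp, you can deserialize into those instead; walking the document directly just avoids declaring every field when only the titles are needed.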
Related
I spent all day trying to figure out what I was doing wrong yesterday.
Coming here to try and find some help.
I am new to APIs, so I am sure I am missing something really simple.
The following error is thrown when I call GetResponse:
You must provide a request body if you set ContentLength>0 or SendChunked==true. Do this by calling [Begin]GetRequestStream before [Begin]GetResponse.
Here is the code I am using to send JSON to the API. The payment object just holds the form values that were entered and the credentials for the correct account on the merchant's end.
var json = JsonConvert.SerializeObject(payment);
var apiUrl = new Uri($"Removed endpoint URL");
var postBytes = Encoding.UTF8.GetBytes(json);
ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;
var httpWebRequest = (HttpWebRequest)WebRequest.Create(apiUrl);
httpWebRequest.ContentType = "application/json";
httpWebRequest.Accept = "application/json";
httpWebRequest.Method = "POST";
httpWebRequest.ContentLength = postBytes.Length;
httpWebRequest.AllowWriteStreamBuffering = false;
//This is where the error triggers and drops to the catch.
var httpResponse = (HttpWebResponse)httpWebRequest.GetResponse();
using (var streamReader = new StreamReader(httpResponse.GetResponseStream()))
{
var result = streamReader.ReadToEnd();
}
I appreciate any help in advance. I may be doing this completely wrong; it's a series of things I threw together while trying to fix issues with the call.
Unless I missed it, you're not actually writing your payload data to the HttpWebRequest body before you're sending it.
using (Stream _reqStrm = httpWebRequest.GetRequestStream())
{
_reqStrm.Write(postBytes, 0, postBytes.Length);
}
var httpResponse = (HttpWebResponse)httpWebRequest.GetResponse();
....
Unrelated but if you can, consider HttpClient
Hth..
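For the HttpClient suggestion, a rough sketch of the same POST could look like the following. It assumes the API accepts a plain JSON body; the class name JsonPoster and the "amount" field are placeholders, and the endpoint is left to the caller because it was removed from the question.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

public static class JsonPoster
{
    // StringContent carries the body and sets the Content-Type header,
    // so there is no separate request stream to write to.
    public static StringContent BuildJsonContent(string json)
    {
        return new StringContent(json, Encoding.UTF8, "application/json");
    }

    public static async Task<string> PostJsonAsync(HttpClient client, Uri endpoint, string json)
    {
        using (StringContent content = BuildJsonContent(json))
        using (HttpResponseMessage response = await client.PostAsync(endpoint, content))
        {
            response.EnsureSuccessStatusCode(); // throws HttpRequestException on a non-2xx status
            return await response.Content.ReadAsStringAsync();
        }
    }

    public static void Main()
    {
        // No network call here: only show the content type that would be sent.
        StringContent content = BuildJsonContent("{\"amount\":10}");
        Console.WriteLine(content.Headers.ContentType);
    }
}
```

With HttpClient the body travels with the content object, so the "you must provide a request body" situation from the question cannot arise in the same way.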
I have a question that seems to have been asked before, but is a bit different. I'm trying to scrape data from this website, but the problem is that it seems to be loaded with AJAX. Because of that, my application is unable to find the IDs and classes in the HTML that I'm looking for.
You can reproduce this by inspecting an element or viewing the source. Whilst viewing the source I'm seeing a lot less than whilst inspecting an element.
I thought that I could track down the file that contains the AJAX to load this html by pressing F12, going to the network tab and selecting XHR, but I'm unable to find it.
My question is: how do I retrieve this data or find out what file is used to collect the data?
An example of my code (I'm unable to find the Timetable_toolbar_elementSelect_popup0):
private async Task GetHtmlDocument(string url)
{
HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create(url);
//request.Credentials = new LoginCredentials().Credentials;
try
{
WebResponse myResponse = await request.GetResponseAsync();
HtmlDocument htmlDoc = new HtmlDocument();
htmlDoc.OptionFixNestedTags = true;
htmlDoc.Load(myResponse.GetResponseStream());
var test = htmlDoc.GetElementbyId("Timetable_toolbar_elementSelect_popup0");
}
catch (Exception e)
{
}
}
I was going to leave this as a comment. But it got too big and too badly formatted. So here we go.
First, the site is updated dynamically using JavaScript that is triggered by an ajax command.
If you can open up a session and store the cookie containing the SESSIONID and the now "encrypted" schoolname then you can call the ajax commands as such.
https://roosters.windesheim.nl/ajaxCommand=getWeeklyTimetable&elementType=1&elementId=13090&date=20171126&formatId=7&departmentId=0&filterId=-2
This does however require you to know what elementType is and what elementId is.
In this case elementId (13090) refers to the Klas 1GLD, and formatId (7) refers to the Roosterformaat "Beknopt". You have to figure out what the remaining variables do. Even more important: if you succeed in making valid ajax commands to the server, you won't get HTML back as a response; you will receive the data as JSON.
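To avoid hand-editing that parameter string for every class and week, it can be assembled from typed values. This is only a sketch: the parameter names are copied from the URL above, the yyyyMMdd date format is inferred from the value 20171126, and the defaults for departmentId and filterId are simply the values seen in that URL. The class name TimetableCommand is made up.

```csharp
using System;
using System.Globalization;

public static class TimetableCommand
{
    // Builds the getWeeklyTimetable parameter string from typed values.
    public static string Build(int elementType, int elementId, DateTime date,
                               int formatId, int departmentId = 0, int filterId = -2)
    {
        // InvariantCulture keeps digits and the minus sign independent of the machine's locale.
        return string.Format(
            CultureInfo.InvariantCulture,
            "ajaxCommand=getWeeklyTimetable&elementType={0}&elementId={1}"
                + "&date={2:yyyyMMdd}&formatId={3}&departmentId={4}&filterId={5}",
            elementType, elementId, date, formatId, departmentId, filterId);
    }

    public static void Main()
    {
        // Same values as the example URL: Klas 1GLD (13090), week of 26 November 2017.
        Console.WriteLine(Build(1, 13090, new DateTime(2017, 11, 26), 7));
    }
}
```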
The easiest way to do what you want is to have all the classes in a separate file. And use that as reference point. Same goes for the other options.
And then use a headless browser like phantomjs.org with Selenium. This way you can find and click on the classes you want to scrape. Load the HTML into an HtmlAgilityPack.HtmlDocument and then do what you need to do. Selenium/PhantomJS will keep track of your cookies.
This method is slower - but a lot easier to do.
EDIT Storing cookies from a webrequest - the easy way.
I am not an expert on this subject, but the OP asked. If anybody knows a better way of doing it, please edit.
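// doc is used further down to hold the response; it was missing a declaration
HtmlDocument doc = new HtmlDocument();
CookieContainer cookies = new CookieContainer();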
try
{
string webAddr = "https://roosters.windesheim.nl/WebUntis/";
var httpWebRequest = (HttpWebRequest)WebRequest.Create(webAddr);
httpWebRequest.ContentType = "application/json; charset=utf-8";
httpWebRequest.Method = "POST";
httpWebRequest.CookieContainer = cookies;
httpWebRequest.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
httpWebRequest.Headers.Add("X-Requested-With", "XMLHttpRequest");
using (var streamWriter = new StreamWriter(httpWebRequest.GetRequestStream()))
{
string json = "ajaxCommand=getWeeklyTimetable&elementType=1&elementId=13092&date=20171126&formatId=7&departmentId=0&filterId=-2";
streamWriter.Write(json);
streamWriter.Flush();
}
var httpResponse = (HttpWebResponse)httpWebRequest.GetResponse();
using (var streamReader = new StreamReader(httpResponse.GetResponseStream()))
{
cookies.Add(httpWebRequest.CookieContainer.GetCookies(httpWebRequest.RequestUri));
//cookies.Add(httpResponse.Cookies);
var responseText = streamReader.ReadToEnd();
doc.LoadHtml(responseText);
foreach(Cookie c in httpResponse.Cookies)
{
Console.WriteLine(c.ToString());
}
}
}
catch (WebException ex)
{
Console.WriteLine(ex.Message);
}
Console.WriteLine(doc.DocumentNode.InnerHtml);
Console.ReadKey();
Solution where you call the ajax method using a webrequest.
So I got bored and figured most of it out. What is missing below is how to identify the Klase by id. The below example will fetch the klase '1GLD'. The reason why we need cookies is in order for the request to know which school we are fetching the Klase from. Also the below code only returns JSON - and not HTML since it is an ajax method we call.
CookieContainer cookies = new CookieContainer();
try
{
string webAddr = "https://roosters.windesheim.nl/";
var httpWebRequest = (HttpWebRequest)WebRequest.Create(webAddr);
httpWebRequest.ContentType = "application/json; charset=utf-8";
httpWebRequest.Method = "POST";
httpWebRequest.CookieContainer = cookies;
httpWebRequest.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
httpWebRequest.Headers.Add("X-Requested-With", "XMLHttpRequest");
var httpResponse = (HttpWebResponse)httpWebRequest.GetResponse();
using (var streamReader = new StreamReader(httpResponse.GetResponseStream()))
{
cookies.Add(httpWebRequest.CookieContainer.GetCookies(httpWebRequest.RequestUri));
}
}
catch (WebException ex)
{
Console.WriteLine(ex.Message);
}
//According to my web debugger the cookie will last until the 10th of December. So need to fix a new cookie until then.
//I noticed the url used unixtimestamps at the end of the url. So we just add the unixtimestamp at the end for each request.
long unixTimeStamp = new DateTimeOffset(DateTime.Now).ToUnixTimeMilliseconds() - 100;
//we are now ready to call the ajax method and get the JSON.
try
{
string webAddr = "https://roosters.windesheim.nl/WebUntis/Timetable.do?request.preventCache="+unixTimeStamp.ToString();
var httpWebRequest = (HttpWebRequest)WebRequest.Create(webAddr);
httpWebRequest.ContentType = "application/x-www-form-urlencoded; charset=utf-8";
httpWebRequest.Method = "POST";
httpWebRequest.CookieContainer = cookies;
httpWebRequest.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
httpWebRequest.Headers.Add("X-Requested-With", "XMLHttpRequest");
using (var streamWriter = new StreamWriter(httpWebRequest.GetRequestStream()))
{
string json = "ajaxCommand=getWeeklyTimetable&elementType=1&elementId=13090&date=20171126&formatId=7&departmentId=0&filterId=-2";
//The command below will return a JSON datastructure containing all the klases and their relevant ID.
//string otherJson = "ajaxCommand=getPageConfig&type=1&filter=-2"
streamWriter.Write(json);
streamWriter.Flush();
}
var httpResponse = (HttpWebResponse)httpWebRequest.GetResponse();
using (var streamReader = new StreamReader(httpResponse.GetResponseStream()))
{
var responseText = streamReader.ReadToEnd();
//THE RESULTS GETS PRINTED HERE.
Console.Write(responseText);
}
}
catch (WebException ex)
{
Console.WriteLine(ex.Message);
}
Other solution with Selenium with Firefox driver.
This is way easier to do, but it also takes more time to run. It gives you HTML to work with instead of JSON, just like you requested. Not all the thread sleeps are necessary, but I found them necessary in the last foreach loop.
public static void Main(string[] args)
{
HtmlDocument doc = new HtmlDocument();
//According to my web debugger the cookie will last until the 10th of December. So need to fix a new cookie until then.
//I noticed the url used unixtimestamps at the end of the url. So we just add the unixtimestamp at the end for each request.
long unixTimeStamp = new DateTimeOffset(DateTime.Now).ToUnixTimeMilliseconds() - 100;
string webAddr = "https://roosters.windesheim.nl/WebUntis/Timetable.do?request.preventCache="+unixTimeStamp.ToString();
var ffOptions = new FirefoxOptions();
ffOptions.BrowserExecutableLocation = @"C:\Program Files (x86)\Mozilla Firefox\firefox.exe";
ffOptions.LogLevel = FirefoxDriverLogLevel.Default;
ffOptions.Profile = new FirefoxProfile { AcceptUntrustedCertificates = true };
var service = FirefoxDriverService.CreateDefaultService();
var driver = new FirefoxDriver(service, ffOptions, TimeSpan.FromSeconds(120));
driver.Navigate().GoToUrl(webAddr);
driver.FindElement(By.XPath("//input[@id='school']")).SendKeys("Windesheim"+Keys.Enter);
Thread.Sleep(2000);
driver.FindElement(By.XPath("//span[@id='dijit_PopupMenuBarItem_0_text' and text() ='Lesrooster']")).Click();
driver.FindElement(By.XPath("//td[@id='dijit_MenuItem_0_text' and text() ='Klassen']")).Click();
Thread.Sleep(2000);
driver.FindElement(By.XPath("//div[@id='widget_Timetable_toolbar_elementSelect']//input[@class='dijitReset dijitInputField dijitArrowButtonInner']")).Click();
//we get all the options for Klase
doc.LoadHtml(driver.PageSource);
HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//div[@id='Timetable_toolbar_elementSelect_popup']/div[@item]");
List<String> options = new List<String>();
foreach (HtmlNode n in nodes)
{
options.Add(n.InnerText);
}
foreach(string s in options)
{
driver.FindElement(By.XPath("//input[@id='Timetable_toolbar_elementSelect']")).Clear();
driver.FindElement(By.XPath("//input[@id='Timetable_toolbar_elementSelect']")).SendKeys(s);
Thread.Sleep(2000);
driver.FindElement(By.XPath("//body")).SendKeys(Keys.Enter);
Thread.Sleep(2000);
doc.LoadHtml(driver.PageSource);
//Console.WriteLine(driver.Url); //Now we can see the id of the current Klase
}
Console.WriteLine(doc.DocumentNode.InnerHtml);
Console.ReadKey();
}
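The fixed Thread.Sleep(2000) calls can usually be replaced by polling for a condition, so the code continues as soon as the page is ready. Below is a library-independent sketch of that idea; the Poll class is hypothetical and not part of Selenium. Selenium's support package ships a WebDriverWait class built on the same principle, which is probably the better fit in real code.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class Poll
{
    // Returns true as soon as condition() is true, or false once the timeout has elapsed.
    public static bool Until(Func<bool> condition, TimeSpan timeout, TimeSpan interval)
    {
        Stopwatch clock = Stopwatch.StartNew();
        while (true)
        {
            if (condition())
            {
                return true;
            }
            if (clock.Elapsed >= timeout)
            {
                return false;
            }
            Thread.Sleep(interval); // short pause between checks instead of one long fixed sleep
        }
    }

    public static void Main()
    {
        // Stand-in condition: becomes true on the third check.
        int calls = 0;
        bool ready = Until(() => ++calls >= 3, TimeSpan.FromSeconds(2), TimeSpan.FromMilliseconds(10));
        Console.WriteLine(ready);
    }
}
```

In the loop above, the condition could be something like "the element list is present in driver.PageSource"; which check is reliable for this particular site is something you would have to try out.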
Last update
Using the Selenium solution I was able to get the ID's for all courses. I have included the file here so you can use it with your ajax and web requests.
I have a PHP script on my host that returns the link to the new version of my program. How can I get that link from PHP? I want to fetch the link and save it in a string.
I often use this code for something like this:
webbrowser.Navigate("MyPhp Uri");
webbrowser.Document.ExecCommand("SelectAll", false, null);
webbrowser.Document.ExecCommand("Copy", false, null);
Then I paste it into a TextBox:
textbox1.Paste();
But this does not seem like a proper way to get data from PHP.
Can you help me?
You should use webrequest instead.
I'm not posting a complete solution because I'm pretty sure you will find one as soon as you know what to search for:
using System;
using System.IO;
using System.Net;
//create a request object and server call
Uri requestUri = new Uri("MyPhp Uri");
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(requestUri);
//set all properties you need for the request, like
request.Method = "GET";
request.BeginGetResponse(new AsyncCallback(ProcessResponse), request);
//handle response
private void ProcessResponse(IAsyncResult asynchronousResult)
{
string responseData = string.Empty;
HttpWebRequest myrequest = (HttpWebRequest)asynchronousResult.AsyncState;
using (HttpWebResponse response = (HttpWebResponse)myrequest.EndGetResponse(asynchronousResult))
{
Stream responseStream = response.GetResponseStream();
using (var reader = new StreamReader(responseStream))
{
responseData = reader.ReadToEnd();
}
responseStream.Close();
}
//TODO: do something with your responseData
}
Please note: you should definitely add some try/catch blocks. This is only a short example to point you in the right direction.
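On newer frameworks the same fetch can be written with HttpClient and async/await. This is a sketch under assumptions: it presumes the PHP script prints nothing but the link as plain text, and the URLs are placeholders. FixedResponseHandler is a made-up test double so the example runs without a network; with a real server you would create the HttpClient without it.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Test double: answers every request with a fixed body instead of touching the network.
public sealed class FixedResponseHandler : HttpMessageHandler
{
    private readonly string _body;

    public FixedResponseHandler(string body)
    {
        _body = body;
    }

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        var response = new HttpResponseMessage(HttpStatusCode.OK)
        {
            Content = new StringContent(_body)
        };
        return Task.FromResult(response);
    }
}

public static class LinkFetcher
{
    // Downloads the response body as text and strips surrounding whitespace.
    public static async Task<string> FetchTextAsync(HttpClient client, Uri uri)
    {
        string body = await client.GetStringAsync(uri);
        return body.Trim();
    }

    public static async Task Main()
    {
        // Placeholder URLs; the handler stands in for the PHP script.
        using (var client = new HttpClient(new FixedResponseHandler("http://example.com/setup.exe\n")))
        {
            string link = await FetchTextAsync(client, new Uri("http://example.com/version.php"));
            Console.WriteLine(link);
        }
    }
}
```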
I am trying to use .Net WebRequest to POST a form. The form contains fields that are XML. (Among other things) I have tried the following code:
WebRequest req = WebRequest.Create(ctx.SvcUrl);
req.Method = "POST";
req.ContentType = "application/x-www-form-urlencoded";
using (var writer = new StreamWriter(req.GetRequestStream(), System.Text.Encoding.ASCII))
{
string reqBody = "first=<bill/>&last=smith"; //(embedded <>) - 500 Internal Server Error
writer.Write(reqBody);
}
rsp = req.GetResponse();
var strm = rsp.GetResponseStream();
var rdr = new StreamReader(strm);
string input = rdr.ReadToEnd();
The <> in reqBody causes a 500 - Internal Server error.
What's the right way to encode this? Or are multi-part forms the answer??
Try using:
string reqBody = string.Format("first={0}&last={1}", HttpUtility.HtmlEncode("<bill/>"), "smith");
You need to encode the request. Use the HttpEncoder class.
using System.Web.Util;
WebRequest req = WebRequest.Create(ctx.SvcUrl);
req.Method = "POST";
req.ContentType = "application/x-www-form-urlencoded";
using (var writer = new StreamWriter(req.GetRequestStream(),
System.Text.Encoding.ASCII))
{
var encoder = new HttpEncoder();
string reqBody = String.Format("first={0}&last={1}",
encoder.HtmlEncode("<bill/>"),
encoder.HtmlEncode("smith") );
writer.Write(reqBody);
}
rsp = req.GetResponse();
var strm = rsp.GetResponseStream();
var rdr = new StreamReader(strm);
string input = rdr.ReadToEnd();
I used String.Format() because I thought it looked nicer and made it clearer what I was doing, but it isn't necessary. You can build the string through string concatenation, too, as long as you pass it through HttpEncoder.HtmlEncode() first.
It turns out that UrlEncoding is being done automatically, so doing it myself can cause trouble. Also, the server I was connecting to couldn't handle any encoding. This muddied the water and made it difficult to see what was failing.
Bottom line solution was to get the server fixed to handle UrlEncoding.
As 'cheong00 on Microsoft's Forums' points out, to avoid the automatic encoding, use TcpClient. But the encoding should be there.
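For reference, this is roughly what a correctly encoded application/x-www-form-urlencoded body looks like when each name and value is percent-encoded separately. Note that this is URL encoding, not HTML encoding: HTML-encoding "<bill/>" would itself introduce "&" and ";" characters into the body. The sketch uses Uri.EscapeDataString; the FormBody class is a made-up helper.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FormBody
{
    // Percent-encodes every name and value, then joins the pairs with '&'.
    public static string Build(IEnumerable<KeyValuePair<string, string>> fields)
    {
        return string.Join("&", fields.Select(f =>
            Uri.EscapeDataString(f.Key) + "=" + Uri.EscapeDataString(f.Value)));
    }

    public static void Main()
    {
        var fields = new[]
        {
            new KeyValuePair<string, string>("first", "<bill/>"),
            new KeyValuePair<string, string>("last", "smith"),
        };
        Console.WriteLine(Build(fields)); // first=%3Cbill%2F%3E&last=smith
    }
}
```

The server then has to decode the values again, which matches the conclusion above that the real fix was on the server side.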
This may be a pathetically simple problem, but I cannot seem to format the post webrequest/response to get data from the Wikipedia API. I have posted my code below if anyone can help me see my problem.
string pgTitle = txtPageTitle.Text;
Uri address = new Uri("http://en.wikipedia.org/w/api.php");
HttpWebRequest request = WebRequest.Create(address) as HttpWebRequest;
request.Method = "POST";
request.ContentType = "application/x-www-form-urlencoded";
string action = "query";
string query = pgTitle;
StringBuilder data = new StringBuilder();
data.Append("action=" + HttpUtility.UrlEncode(action));
data.Append("&query=" + HttpUtility.UrlEncode(query));
byte[] byteData = UTF8Encoding.UTF8.GetBytes(data.ToString());
request.ContentLength = byteData.Length;
using (Stream postStream = request.GetRequestStream())
{
postStream.Write(byteData, 0, byteData.Length);
}
using (HttpWebResponse response = request.GetResponse() as HttpWebResponse)
{
// Get the response stream.
StreamReader reader = new StreamReader(response.GetResponseStream());
divWikiData.InnerText = reader.ReadToEnd();
}
You might want to try a GET request first because it's a little simpler (you will only need to POST for wikipedia login). For example, try to simulate this request:
http://en.wikipedia.org/w/api.php?action=query&prop=images&titles=Main%20Page
Here's the code:
HttpWebRequest myRequest =
(HttpWebRequest)WebRequest.Create("http://en.wikipedia.org/w/api.php?action=query&prop=images&titles=Main%20Page");
using (HttpWebResponse response = (HttpWebResponse)myRequest.GetResponse())
{
string ResponseText;
using (StreamReader reader = new StreamReader(response.GetResponseStream()))
{
ResponseText = reader.ReadToEnd();
}
}
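To plug the page title from the text box into that GET request, the query string can be built with the title percent-encoded. A small sketch, assuming the same action=query&prop=images request as above; WikiUrl is a hypothetical helper name.

```csharp
using System;

public static class WikiUrl
{
    // Builds the images query for one page title; the title is percent-encoded.
    public static string BuildImagesQuery(string apiBase, string title)
    {
        return apiBase + "?action=query&prop=images&titles=" + Uri.EscapeDataString(title);
    }

    public static void Main()
    {
        // In the question, the title would come from txtPageTitle.Text.
        Console.WriteLine(BuildImagesQuery("http://en.wikipedia.org/w/api.php", "Main Page"));
    }
}
```

The resulting string can be passed straight to WebRequest.Create in the code above.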
Edit: The other problem with the POST request was the exception "The remote server returned an error: (417) Expectation failed." It can be solved by setting:
System.Net.ServicePointManager.Expect100Continue = false;
(This is from: HTTP POST Returns Error: 417 "Expectation Failed.")
I'm currently in the final stages of implementing a C# MediaWiki API which allows easy scripting of most MediaWiki viewing and editing actions.
The main API is here: http://o2platform.googlecode.com/svn/trunk/O2%20-%20All%20Active%20Projects/O2_XRules_Database/_Rules/APIs/OwaspAPI.cs and here is an example of the API in use:
var wiki = new O2MediaWikiAPI("http://www.o2platform.com/api.php");
wiki.login(userName, password);
var page = "Test"; // "Main_Page";
wiki.editPage(page,"Test content2");
var rawWikiText = wiki.raw(page);
var htmlText = wiki.html(page);
return rawWikiText.line().line() + htmlText;
You seem to be pushing the input data on HTTP POST, but it seems you should use HTTP GET.
From the MediaWiki API docs:
The API takes its input through parameters in the query string. Every module (and every action=query submodule) has its own set of parameters, which is listed in the documentation and in action=help, and can be retrieved through action=paraminfo.
http://www.mediawiki.org/wiki/API:Data_formats