I can't seem to get the hang of HTTP POST requests. I have just learned how to use GET requests to retrieve webpages, but now I'm trying to fill in information on a webpage and can't get it working. The source code that comes back is always an invalid page (full of broken images/not the right information)
public static void jsonPOST(string url)
{
url = "http://treasurer.maricopa.gov/Parcel/TaxReceipt.aspx/GetTaxReceipt";
var httpWebRequest = (HttpWebRequest)WebRequest.Create(new Uri(url));
httpWebRequest.ContentType = "application/json; charset=utf-8";
httpWebRequest.Accept = "application/json, text/javascript, */*; q=0.01";
httpWebRequest.Headers.Add("Accept-Encoding: gzip, deflate");
httpWebRequest.CookieContainer = cookieJar;
httpWebRequest.Method = "POST";
httpWebRequest.Headers.Add("Accept-Language: en-US,en;q=0.5");
httpWebRequest.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW65; Trident/7.0; MAM5; rv:11.0) like Gecko";
httpWebRequest.Referer = "http://treasurer.maricopa.gov/Parcel/TaxReceipt.aspx";
string postData = "{\"startDate\":\"1/1/2013\",\"parcelNumber\":\"17609419\"}";
byte[] bytes = System.Text.Encoding.ASCII.GetBytes(postData);
httpWebRequest.ContentLength = bytes.Length;
System.IO.Stream os = httpWebRequest.GetRequestStream();
os.Write(bytes, 0, bytes.Length); //Push it out there
os.Close();
System.Net.WebResponse resp = httpWebRequest.GetResponse();
if (resp == null)
{
Console.WriteLine("null");
}
System.IO.StreamReader sr = new System.IO.StreamReader(resp.GetResponseStream());
string source = sr.ReadToEnd().Trim();
}
EDIT: I updated the code to reflect my new problem. What comes back now is not the page's source code but just the raw JSON. I can deserialize that to get the information I need, but I'm curious why the actual source code isn't coming back.
The source code that comes back is always an invalid page (full of broken images/not the right information)
It sounds like you saved the source code without accounting for relative paths. As long as the site uses relative paths, the page will not display correctly from your copy. You have to replace all the relative paths with absolute ones before the copy is useful.
http://webdesign.about.com/od/beginningtutorials/a/aa040502a.htm
Remember that cross-domain AJAX can be a problem in that situation.
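As a rough illustration of replacing relative paths, here is a minimal sketch that resolves a link against the URL of the page it came from using System.Uri. The class and method names are made up for this example, and finding the links inside the HTML is left out.

```csharp
using System;

public static class RelativeUrlDemo
{
    // Resolves a link found in a page against that page's own URL.
    // "ToAbsolute" is a hypothetical helper name, not part of any library.
    public static string ToAbsolute(string pageUrl, string link)
    {
        var baseUri = new Uri(pageUrl);
        // Uri(Uri, string) applies the usual relative-reference rules;
        // a link that is already absolute is returned unchanged.
        return new Uri(baseUri, link).AbsoluteUri;
    }
}
```

For example, with the page URL http://treasurer.maricopa.gov/Parcel/TaxReceipt.aspx, a (hypothetical) link "images/logo.png" resolves under /Parcel/, while "/css/site.css" resolves from the site root.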
You can see my code below. It should show the user ID of an Instagram user as the response, but it shows "�" instead.
private void button18_Click(object sender, EventArgs e)
{
string username = ""; // your username
string password = ""; // your password
HttpWebRequest httpWebRequest = (HttpWebRequest)WebRequest.Create("https://i.instagram.com/api/v1/accounts/login/");
httpWebRequest.Headers.Add("X-IG-Connection-Type", "WiFi");
httpWebRequest.Method = "POST";
httpWebRequest.ContentType = "application/x-www-form-urlencoded; charset=UTF-8";
httpWebRequest.Headers.Add("X-IG-Capabilities", "AQ==");
httpWebRequest.Accept = "*/*";
httpWebRequest.UserAgent = "Instagram 10.9.0 Android (23/6.0.1; 944dpi; 915x1824; samsung; SM-T185; gts210velte; qcom; en_GB)";
httpWebRequest.Headers.Add("Accept-Encoding", "gzip, deflate");
httpWebRequest.Headers.Add("Accept-Language", "en;q=1, ru;q=0.9, ar;q=0.8");
httpWebRequest.Headers.Add("Cookie", "mid=qzejldb8ph9eipis9e6nrd1n457b;csrftoken=a7nd2ov9nbxgqy473aonahi58y21i8ee");
httpWebRequest.Host = "i.instagram.com";
httpWebRequest.KeepAlive = true;
byte[] bytes = new ASCIIEncoding().GetBytes("ig_sig_key_version=5&signed_body=5128c31533802ff7962073bb1ebfa9972cfe3fd9c5e3bd71fe68be1d02aa92c8.%7B%22username%22%3A%22"+ username + "%22%2C%22password%22%3A%22"+ password +"%22%2C%22_uuid%22%3A%22D26E6E86-BDB7-41BE-8688-1D60DE60DAF6%22%2C%22_uid%22%3A%22%22%2C%22device_id%22%3A%22android-a4d01b84202d6018%22%2C%22_csrftoken%22%3A%22a7nd2ov9nbxgqy473aonahi58y21i8ee%22%2C%22login_attempt_count%22%3A%220%22%7D");
httpWebRequest.ContentLength = (long)bytes.Length;
using (Stream requestStream = httpWebRequest.GetRequestStream())
{
requestStream.Write(bytes, 0, bytes.Length);
}
string result = new StreamReader(((HttpWebResponse)httpWebRequest.GetResponse()).GetResponseStream()).ReadToEnd();
textBox1.Text = result;
}
It's answered in a comment by @Dour, but I'm going to add a bit of detail that isn't mentioned.
The following line tells the server: Hey server! Please send me a compressed response (to reduce size).
httpWebRequest.Headers.Add("Accept-Encoding", "gzip, deflate");
So the response you're getting is compressed.
Does that mean you should just remove this line?
Well, removing it will fix your problem, but that isn't the right way to do it.
A compressed response is better because it takes less time to download from the server.
What you really need is to make HttpWebRequest handle the decompression for you.
You can do that by setting the AutomaticDecompression property:
httpWebRequest.AutomaticDecompression = DecompressionMethods.Deflate | DecompressionMethods.GZip;
Note: With the previous line, you do NOT need to set Accept-Encoding yourself. HttpWebRequest will send it automatically for you. And when you call GetResponse, HttpWebRequest will handle the decompression process for you.
Aside from your problem:
I suggest wrapping the response in a using statement; with your current code, I don't think it gets disposed.
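Putting both suggestions together, a minimal sketch might look like the following. The class and method names are invented for this example, and decoding the body as UTF-8 is an assumption; use the charset the server actually reports.

```csharp
using System.IO;
using System.Net;
using System.Text;

public static class DecompressingClient
{
    // Builds a request that lets HttpWebRequest send Accept-Encoding
    // and undo gzip/deflate on the response by itself.
    public static HttpWebRequest CreateRequest(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.AutomaticDecompression =
            DecompressionMethods.GZip | DecompressionMethods.Deflate;
        return request;
    }

    // Reads the whole body; the using statements dispose the response,
    // the stream and the reader even if ReadToEnd throws.
    public static string ReadBody(HttpWebRequest request)
    {
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var stream = response.GetResponseStream())
        using (var reader = new StreamReader(stream, Encoding.UTF8)) // UTF-8 is assumed
        {
            return reader.ReadToEnd();
        }
    }
}
```

In the button handler above, the request would be created with CreateRequest, configured as before (minus the Accept-Encoding line), and read with ReadBody after the POST data has been written.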
This is a pretty standard use of HttpWebRequest, but whenever I pass a certain URL to get the HTML, it comes back with nothing but special characters. An example of what comes back is below.
The site uses SSL, so I'm wondering if that has something to do with it, but I've never had this problem before, and I've used this code with other SSL sites.
�
ServicePointManager.ServerCertificateValidationCallback = new System.Net.Security.RemoteCertificateValidationCallback(AcceptAllCertifications);
var request = (HttpWebRequest)WebRequest.Create(url);
using (var response = (HttpWebResponse)request.GetResponse())
{
Stream data = response.GetResponseStream();
HtmlDocument hDoc = new HtmlDocument();
using (StreamReader readURLContent = new StreamReader(data))
{
html = readURLContent.ReadToEnd();
hDoc.LoadHtml(html);
}
}
I can't find anything on this specific issue, so I'm kind of lost. If anybody could point me in the right direction, that would be awesome.
Edit: here's an image of what it looks like since I can't copy paste it
My guess is that the response is compressed. If you use a web debugger like Charles or Fiddler, you can see how the requests are structured and what data they contain, which makes it a lot easier to replicate the HTTP requests later in code. Try the following code.
try
{
string webAddr = url;
var httpWebRequest = (HttpWebRequest)WebRequest.Create(webAddr);
httpWebRequest.ContentType = "text/html; charset=utf-8";
httpWebRequest.UserAgent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:58.0) Gecko/20100101 Firefox/58.0";
httpWebRequest.AllowAutoRedirect = true;
httpWebRequest.Method = "GET";
httpWebRequest.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
var httpResponse = (HttpWebResponse)httpWebRequest.GetResponse();
using (var streamReader = new StreamReader(httpResponse.GetResponseStream(), Encoding.UTF8))
{
var responseText = streamReader.ReadToEnd();
doc.LoadHtml(responseText);
}
}
catch (WebException ex)
{
Console.WriteLine(ex.Message);
}
The code sets the content type and charset on the request. You can also set the encoding on the StreamReader when reading the response, as is done here. Automatic decompression is enabled as well.
I am submitting an ASPX form from my C# code-behind, which works fine; however, I get the HTML back directly, since I do not display it in a browser.
This is my C# code:
string getUrl = "https://www.facebook.com/login.php?login_attempt=1";
string email = "email@email.com";
string pw = "pwd";
string postData = String.Format("email={0}&pass={1}", email, pw);
HttpWebRequest getRequest = (HttpWebRequest)WebRequest.Create(getUrl);
getRequest.CookieContainer = new CookieContainer();
getRequest.CookieContainer.Add(cookies); //recover cookies First request
getRequest.Method = WebRequestMethods.Http.Post;
getRequest.UserAgent = "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/535.2 (KHTML, like Gecko) Chrome/15.0.874.121 Safari/535.2";
getRequest.AllowWriteStreamBuffering = true;
getRequest.ProtocolVersion = HttpVersion.Version11;
getRequest.AllowAutoRedirect = true;
getRequest.ContentType = "application/x-www-form-urlencoded";
byte[] byteArray = Encoding.ASCII.GetBytes(postData);
getRequest.ContentLength = byteArray.Length;
Stream newStream = getRequest.GetRequestStream(); //open connection
newStream.Write(byteArray, 0, byteArray.Length); // Send the data.
newStream.Close();
HttpWebResponse getResponse = (HttpWebResponse)getRequest.GetResponse();
string sourceCode = "";
using (StreamReader sr = new StreamReader(getResponse.GetResponseStream()))
{
sourceCode = sr.ReadToEnd();
}
This delivers the response I need; however, the page should run its JavaScript first and then be delivered back as HTML. Unfortunately, it does not do this.
I have been looking for a way to:
- Open this page on POST in a popup browser (or at least post this and wait until the JavaScript has finished loading)
- Get the loaded page back after the JavaScript is fully loaded, instead of the current response
Of course I prefer this to be before:
HttpWebResponse getResponse = (HttpWebResponse)getRequest.GetResponse();
I have tried to:
- use WebBrowser wb = new WebBrowser();, however this gives a single-thread error, and it does not seem possible to POST the page to the URL.
- while (wb.ReadyState != WebBrowserReadyState.Complete), which can't be used, as I do not actually load a page but only get the response
Does anybody have a good idea for how to load the page in a browser with the POST, wait until the JavaScript has executed, and then load the resulting HTML into my C# code?
I am trying to scrape a website to get the Textarea information.
I'm using:
HtmlDocument doc = this.webBrowser1.Document;
When I look at the view source it shows <textarea name="message" class="profile">
But when I try to access this textarea with:
HtmlDocument doc = this.webBrowser1.Document;
doc.GetElementsByTagName("textarea")
.GetElementsByName("message")[0]
.SetAttribute("value", "Hello");
It shows the error:
Value of '0' is not valid for 'index'. 'index' should be between 0 and -1.
Parameter name: index
Any Help?
For your current need you can simply use this:
doc.GetElementsByTagName("textarea")[0].InnerText = "Hello";
For more complex things you can use the HtmlDocument class together with MSHTML.
I can recommend HtmlAgilityPack to you!
I'd guess that you are trying to access a website that uses cookies to determine whether a user is logged in. If you are not logged in, it will force you to register or log in before you are allowed to see anything. Am I right?
Your browser stores those cookies; your C# code does not (broadly speaking).
You need to create a cookie container to solve that problem.
Your C# app can log in, request a cookie/session, grab the cookies from the response header, and then you should be able to scrape the profiles or whatever you want.
Get the POST data that is sent to the server. You can use tools/add-ons like Fiddler, Tamper, etc.
E.g. PostdataString: user_name=TESTUSER&password=TESTPASSWORD&language=en&action%3Asubmit=Submit
Here is a snippet you can use.
//Create the PostData
string strPostData = "user_name=" + txtUser.Text + "&password=" + txtPass.Text + "&language=en&action%3Asubmit=Submit";
CookieContainer tempCookies = new CookieContainer();
ASCIIEncoding encoding = new ASCIIEncoding();
byte[] data = encoding.GetBytes(strPostData);
//Create the Cookie
HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://www.website.com/login.php");
request.Method = "POST";
request.KeepAlive = true;
request.AllowAutoRedirect = false;
request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
request.ContentType = "application/x-www-form-urlencoded";
request.Referer = "http://www.website.com/login.php";
request.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:14.0) Gecko/20100101 Firefox/14.0.1";
request.ContentLength = data.Length;
Stream requestStream = request.GetRequestStream();
requestStream.Write(data, 0, data.Length);
HttpWebResponse response;
response = (HttpWebResponse)request.GetResponse();
string sRequestHeaderBuffer = Convert.ToString(response.Headers);
requestStream.Close();
//Stream(-output) of the new website
StreamReader postReqReader = new StreamReader(response.GetResponseStream());
//RichTextBox to see the new source.
richTextBox1.Text = postReqReader.ReadToEnd();
You will need to adjust the cookie parameters in between and add your current session ID to the code as well. This depends on the website you are requesting.
E.g.:
request.Headers.Add("Cookie", "language=en_US.UTF-8; StationID=" + sStationID + "; SessionID=" + sSessionID);
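As an alternative to building the Cookie header by hand, the cookie container mentioned earlier can be shared across requests. This is only a sketch: the class and method names are invented for illustration, and it assumes the site sets its session through ordinary Set-Cookie headers, which HttpWebRequest then stores in the container on its own.

```csharp
using System;
using System.Net;

public static class CookieJarDemo
{
    // One container for the whole session: cookies set by the login
    // response are kept here and sent again on later requests.
    public static readonly CookieContainer Jar = new CookieContainer();

    // Creates a request that reads from and writes to the shared container.
    public static HttpWebRequest CreateRequest(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.CookieContainer = Jar;
        return request;
    }

    // Returns the Cookie header value that would be sent to the given URL,
    // which is handy for checking what the container currently holds.
    public static string CookiesFor(string url)
    {
        return Jar.GetCookieHeader(new Uri(url));
    }
}
```

The login POST and every later scrape request would then both come from CreateRequest, so whatever session cookie the login returns travels along automatically.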
I tried to search for previous discussion of this issue but didn't find any, maybe because I didn't use the right keywords.
I am writing a small program which posts data to a webpage and gets the response. The site I'm posting data to does not provide an API. After some Googling I arrived at HttpWebRequest and HttpWebResponse. The code looks like this:
HttpWebRequest httpRequest = (HttpWebRequest)WebRequest.Create("https://www.site.com/index.aspx");
CookieContainer cookie = new CookieContainer();
httpRequest.CookieContainer = cookie;
String sRequest = "SomeDataHere";
httpRequest.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
httpRequest.Headers.Add("Accept-Encoding: gzip, deflate");
httpRequest.Headers.Add("Accept-Language: en-us,en;q=0.5");
httpRequest.Headers.Add("Cookie: SomecookieHere");
httpRequest.Host = "www.site.com";
httpRequest.Referer = "https://www.site.com/";
httpRequest.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:14.0) Gecko/20100101 Firefox/14.0.1";
httpRequest.ContentType = "application/x-www-form-urlencoded";
//httpRequest.Connection = "keep-alive";
httpRequest.ContentLength = sRequest.Length;
byte[] bytedata = Encoding.UTF8.GetBytes(sRequest);
httpRequest.ContentLength = bytedata.Length;
httpRequest.Method = "POST";
Stream requestStream = httpRequest.GetRequestStream();
requestStream.Write(bytedata, 0, bytedata.Length);
requestStream.Flush();
requestStream.Close();
HttpWebResponse httpWebResponse = (HttpWebResponse)httpRequest.GetResponse();
string sResponse;
using (Stream stream = httpWebResponse.GetResponseStream())
{
StreamReader reader = new StreamReader(stream, System.Text.Encoding.GetEncoding("iso-8859-1"));
sResponse = reader.ReadToEnd();
}
return sResponse;
I used Firefox's Firebug to get the headers and data to post.
My question is: when I store and display the response as a string, all I get are garbled characters, like:
?????*??????xV?J-4Si1?]R?r)f?|??;????2+g???6?N-?????7??? ?6?? x???q v ??? j?Ro??_*?e*??tZN^? 4s?????? ??Pwc??3???|??_????_??9???^??#?Y??"?k??,?a?H?Lp?A?$ ;???C#????e6'?N???L7?j#???ph??y=?I??=(e?V?6C??
By reading the response header using FireBug I got the content type of response:
Content-Type text/html; charset=ISO-8859-1
And it is reflected in my code. I have even tried other encodings such as UTF-8 and ASCII, still with no luck. Maybe I am heading in the wrong direction.
Please advise. A small code snippet will be even better.
Thank you.
You're telling the server that you can accept compressed responses with httpRequest.Headers.Add("Accept-Encoding: gzip, deflate");. Try removing that line, and you should get a clear-text response.
HttpWebRequest does have built in support for gzip and deflate if you want to allow compressed responses. Remove the Accept-Encoding header line, and replace it with
httpRequest.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
This will take care of adding the appropriate Accept-Encoding header for you, and handle decompressing the content automatically when you receive it.
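To see why the text looked garbled, the following sketch reproduces the effect locally, with no network involved. The class and method names are made up for this example: it gzips a string, and decoding those compressed bytes directly as ISO-8859-1 gives the same kind of junk, while running them through GZipStream first gives the original text back.

```csharp
using System.IO;
using System.IO.Compression;
using System.Text;

public static class GzipDemo
{
    // Compresses text the way a server would for "Content-Encoding: gzip".
    public static byte[] Compress(string text)
    {
        using (var buffer = new MemoryStream())
        {
            using (var gzip = new GZipStream(buffer, CompressionMode.Compress))
            {
                byte[] raw = Encoding.UTF8.GetBytes(text);
                gzip.Write(raw, 0, raw.Length);
            } // disposing the GZipStream flushes the final gzip block
            return buffer.ToArray(); // ToArray still works on a closed MemoryStream
        }
    }

    // Undoes the compression before decoding the bytes as text.
    public static string Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Calling Encoding.GetEncoding("iso-8859-1").GetString(GzipDemo.Compress(html)) yields unreadable characters regardless of which text encoding is tried, which matches the symptom in the question; AutomaticDecompression performs the equivalent of Decompress for you.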