HttpClient GetAsync with a hash in URL - c#

.NET Core 2.2 console application on Windows.
I'm exploring how to use HttpClient GetAsync on a Stack Overflow share-style URL, e.g. https://stackoverflow.com/a/29809054/26086, which returns a 302 redirect to a URL with a hash in it.
static async Task Main()
{
var client = new HttpClient();
// 1. Doesn't work - has a hash in URL
var url = "https://stackoverflow.com/questions/29808915/why-use-async-await-all-the-way-down/29809054#29809054";
HttpResponseMessage rm = await client.GetAsync(url);
Console.WriteLine($"Status code: {(int)rm.StatusCode}"); // 400 Bad Request
// 2. Does work - no hash
url = "https://stackoverflow.com/questions/29808915/why-use-async-await-all-the-way-down/29809054";
rm = await client.GetAsync(url);
Console.WriteLine($"Status code: {(int)rm.StatusCode}"); // 200 OK
// 3. Doesn't work as the 302 redirect goes to the first URL above with a hash
url = "https://stackoverflow.com/a/29809054/26086";
rm = await client.GetAsync(url);
Console.WriteLine($"Status code: {(int)rm.StatusCode}"); // 400 Bad Request
}
I'm crawling my blog which has many SO short codes in it.
Update/Workaround
With thanks to @rohancragg, I found that turning off AllowAutoRedirect and then reading the URI from the returned Location header worked:
// as some autoredirects fail due to #fragments in url, handle redirects manually
var handler = new HttpClientHandler { AllowAutoRedirect = false };
var client = new HttpClient(handler);
var url = "https://stackoverflow.com/a/29809054/26086";
HttpResponseMessage rm = await client.GetAsync(url);
// gives the desired new URL which can then GetAsync
Uri u = rm.Headers.Location;

As @Damien_The_Unbeliever implies in a comment, you just need to strip off the hash and everything after it. The fragment only tells the browser to jump to that anchor in the HTML page (see: https://w3schools.com/jsref/prop_anchor_hash.asp); it is not meant to be sent to the server.
You could also use the Uri class to parse the URL and ignore any fragment: https://learn.microsoft.com/en-us/dotnet/api/system.uri.fragment
Because the share-style URLs will only ever return a 302, I'd suggest capturing the URI that the 302 points to and, as suggested above, requesting just the path and ignoring the fragment.
So you need some mechanism (which I'm just looking up!) to handle a 302 gracefully, followed by option 2.
Update: this looks relevant: "How can I get System.Net.Http.HttpClient to not follow 302 redirects?"
Update 2: Steve Guidi has a very important bit of advice in a comment here: https://stackoverflow.com/a/17758758/5351
In response to the advice that you need to use HttpResponseMessage.RequestMessage.RequestUri:
it is very important to add HttpCompletionOption.ResponseHeadersRead
as the second parameter of the GetAsync() call
Disclaimer - I've not tried the above, this is just based on reading ;-)
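The manual redirect handling described above could be sketched roughly as follows. This is only an illustration under assumptions, not tested against Stack Overflow itself: the class name RedirectHelper, both method names, and the maxHops limit are made up for this example, and the client passed in is assumed to have been created with AllowAutoRedirect = false.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

static class RedirectHelper
{
    // Resolves a Location header value against the request URI
    // and drops any #fragment from the result.
    public static Uri ResolveWithoutFragment(Uri requestUri, Uri location)
    {
        Uri absolute = location.IsAbsoluteUri ? location : new Uri(requestUri, location);
        // UriPartial.Query keeps scheme, authority, path and query, but not the fragment.
        return new Uri(absolute.GetLeftPart(UriPartial.Query));
    }

    // Follows redirects by hand, following at most maxHops redirects (hypothetical limit).
    // The client is assumed to use a handler with AllowAutoRedirect = false.
    public static async Task<HttpResponseMessage> GetFollowingRedirectsAsync(
        HttpClient client, Uri uri, int maxHops = 5)
    {
        for (int hop = 0; hop <= maxHops; hop++)
        {
            HttpResponseMessage response = await client.GetAsync(uri);
            int status = (int)response.StatusCode;
            bool isRedirect = status >= 300 && status < 400
                              && response.Headers.Location != null;
            if (!isRedirect)
            {
                return response;
            }
            // Compute the next URI before disposing the redirect response.
            uri = ResolveWithoutFragment(uri, response.Headers.Location);
            response.Dispose();
        }
        throw new HttpRequestException("Too many redirects.");
    }
}
```

With a client built as in the workaround above, a call might look like: var rm = await RedirectHelper.GetFollowingRedirectsAsync(client, new Uri("https://stackoverflow.com/a/29809054/26086"));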

Maybe you need to encode your URL before sending the request, using the HttpUtility class, so that any special characters are escaped.
using System.Web;
var url = $"https://myurl.com/{HttpUtility.UrlEncode("#1234567")}";
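To make the effect of that encoding concrete, here is a small sketch. The class and method names are made up for illustration; myurl.com is the placeholder host from the snippet above. Note that an escaped hash (%23) is treated as a literal character in the path rather than as a fragment delimiter, so this may not be what you want when the hash really is an anchor.

```csharp
using System;
using System.Web;

static class HashEncoding
{
    // Builds a URL whose last path segment is the URL-encoded input.
    // "#" is encoded as "%23", so it no longer starts a fragment.
    public static Uri BuildUrl(string segment)
    {
        return new Uri("https://myurl.com/" + HttpUtility.UrlEncode(segment));
    }
}
```

For example, HashEncoding.BuildUrl("#1234567") yields a Uri whose Fragment property is empty, because the hash has become part of the path.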

Related

WebRequest returns unreadable string [duplicate]

I'm trying to download an HTML document from Amazon, but for some reason I get a badly encoded string like "��K��g��g�e".
Here's the code I tried:
using (var webClient = new System.Net.WebClient())
{
var url = "https://www.amazon.com/dp/B07H256MBK/";
webClient.Encoding = Encoding.UTF8;
var result = webClient.DownloadString(url);
}
Same thing happens when using HttpClient:
var url = "https://www.amazon.com/dp/B07H256MBK/";
var httpclient = new HttpClient();
var html = await httpclient.GetStringAsync(url);
I also tried reading the result as bytes and then converting it back to UTF-8, but I still get the same result. Also note that this DOES NOT always happen. For example, yesterday I ran this code for ~2 hours and got a correctly encoded HTML document. Today, however, I always get a badly encoded result. It happens every other day, so it's not a one-time thing.
==================================================================
However, when I use HtmlAgilityPack's wrapper, it works as expected every time:
var url = "https://www.amazon.com/dp/B07H256MBK/";
HtmlWeb htmlWeb = new HtmlWeb();
HtmlDocument doc = htmlWeb.Load(url);
What causes WebClient and HttpClient to return a badly encoded string even when I explicitly define the correct encoding? And how does HtmlAgilityPack's wrapper work by default?
Thanks for any help!
I fired up Firefox's web dev tools, requested that page, and looked at the response headers:
See that content-encoding: gzip? That means the response is gzip-encoded.
It turns out that Amazon gives you a response compressed with gzip even when you don't send an Accept-Encoding: gzip header (verified with another tool). This is a bit naughty, but not that uncommon, and easy to work around.
This wasn't a problem with character encodings at all. HttpClient is good at figuring out the correct encoding from the Content-Type header.
You can tell HttpClient to un-zip responses with:
HttpClientHandler handler = new HttpClientHandler()
{
AutomaticDecompression = DecompressionMethods.GZip,
};
using (var client = new HttpClient(handler))
{
// your code
}
This will be set automatically if you're using the NuGet package versions 4.1.0 to 4.3.2, otherwise you'll need to do it yourself.
You can do the same with WebClient, but it's harder.
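One possible way to handle this with WebClient, sketched under assumptions rather than taken from the answer, is to download the raw bytes (for example with DownloadData) and decompress them yourself when they start with the gzip magic bytes. The class and method names below are made up for illustration, and the body is assumed to be UTF-8 text.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

static class BodyDecoder
{
    // Decodes a response body as UTF-8 text, un-gzipping it first
    // if it starts with the gzip magic bytes 0x1F 0x8B.
    public static string DecodeUtf8Body(byte[] body)
    {
        bool looksGzipped = body.Length >= 2 && body[0] == 0x1F && body[1] == 0x8B;
        if (!looksGzipped)
        {
            return Encoding.UTF8.GetString(body);
        }

        using (var input = new MemoryStream(body))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Usage with the WebClient from the question might look like: byte[] raw = webClient.DownloadData(url); string html = BodyDecoder.DecodeUtf8Body(raw);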

How to send an access_token and id_token to an api using System.Net.Http

How can you send both the access_token and the id_token to your API using System.Net.Http? When I tested my API with Postman, it seemed to send both tokens and returned the user-specific information I needed (a list of products the user is selling). I am unsure how to do this in my Xamarin app and have been stuck on it for quite some time. I am able to send the access_token as shown below, but everything I have tried for sending both tokens has returned a 404 Not Found. (An unauthorized request returns a 401, so the access_token is still working.)
public async Task<string> GetResponseJsonString(string url)
{
string responseJsonString = null;
var access_token = CrossSecureStorage.Current.GetValue("access_token");
using (var httpClient = new HttpClient())
{
httpClient.DefaultRequestHeaders.Clear();
httpClient.DefaultRequestHeaders.Add("Authorization", "Bearer " + access_token);
// await instead of .Result, so the calling thread is not blocked
HttpResponseMessage response = await httpClient.GetAsync(url);
responseJsonString = await response.Content.ReadAsStringAsync();
}
return responseJsonString;
}
Note: I am aware that the id_token should contain the user information and that it should be decoded rather than sending requests for user information. I looked into this and was unable to find a library that works in a Xamarin PCL. I looked at JosePCL.Jwt but was unable to get it to work. Since any user information I need comes from my database anyway, I figured it made sense to send both tokens with the request and let my API look up the user information.
This is entirely dependent on the API you're calling. I've never seen an API that needs anything more than the access_token it has provided to you. It's possible you have the nomenclature incorrect here.
Do you mean "access key & secret"? Or are you certain you have an access_token?
In the former case, APIs normally expect the following:
Append the key & secret together separated by a ":"
Base64 Encode
Set the Authorization Bearer|Basic header with the result
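The three steps above can be sketched as follows for the Basic scheme. This is a minimal illustration, not the API's documented contract: the class name BasicAuth and the method name Create are made up, and whether your API expects Basic or Bearer here is something only its documentation can confirm.

```csharp
using System;
using System.Net.Http.Headers;
using System.Text;

static class BasicAuth
{
    // Joins key and secret with ':', Base64-encodes the UTF-8 bytes,
    // and wraps the result in a "Basic" Authorization header value.
    public static AuthenticationHeaderValue Create(string key, string secret)
    {
        string raw = key + ":" + secret;
        string encoded = Convert.ToBase64String(Encoding.UTF8.GetBytes(raw));
        return new AuthenticationHeaderValue("Basic", encoded);
    }
}
```

The result can then be assigned to a request, for example: req.Headers.Authorization = BasicAuth.Create(key, secret);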
It's also worth asking whether you've tried passing the id_token as the Authorization header.
It's also also worth asking whether you can provide a screen capture of the successful response from Postman (make sure you obfuscate the sensitive data).
It's also also also worth pointing out an optimization for your code. Since you're using async, you are probably somewhat concerned about performance. Have a look at this article discussing the disposability of HttpClient. As a better alternative, use HttpRequestMessage with a shared HttpClient as follows:
public async Task<string> GetResponseJsonString(string url)
{
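string responseJsonString = null;
// Read the token as in the question; "client" is the shared static HttpClient described below.
var access_token = CrossSecureStorage.Current.GetValue("access_token");
// Use the url parameter instead of a hard-coded placeholder path.
var req = new HttpRequestMessage(HttpMethod.Get, url);
req.Headers.Authorization = new AuthenticationHeaderValue("Bearer", access_token);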
using (var resp = await client.SendAsync(req))
using (var s = await resp.Content.ReadAsStreamAsync())
using (var sr = new StreamReader(s))
{
if (resp.IsSuccessStatusCode)
{
responseJsonString = await sr.ReadToEndAsync();
}
else
{
string errorMessage = await sr.ReadToEndAsync();
int statusCode = (int)resp.StatusCode;
//log your error
}
}
return responseJsonString;
}
Where client is a reference to a statically shared instance of HttpClient. My preferred way to do all this is to wrap my API calls, usually one file per service. I inject this service as a singleton, which will broker its own static instance of HttpClient. This setup is even more straightforward if you're using .NET Core.
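A service wrapper along those lines might look roughly like this. It is only a sketch of the idea: the class name ProductService and its method names are hypothetical, and returning null on failure is just one possible choice for error handling.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

public sealed class ProductService
{
    // One HttpClient shared by all calls made through this service.
    private static readonly HttpClient Client = new HttpClient();

    // Builds a GET request carrying the access token as a Bearer header.
    public static HttpRequestMessage BuildGet(string url, string accessToken)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, url);
        request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", accessToken);
        return request;
    }

    // Sends the request and returns the body, or null for a non-success status.
    public async Task<string> GetJsonAsync(string url, string accessToken)
    {
        using (var request = BuildGet(url, accessToken))
        using (var response = await Client.SendAsync(request))
        {
            return response.IsSuccessStatusCode
                ? await response.Content.ReadAsStringAsync()
                : null;
        }
    }
}
```

Setting the header per request, rather than on DefaultRequestHeaders, keeps the shared client free of per-user state.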

Getting httpresponse from other site

Need help. I want to get the data returned from this link: http://www.pse.com.ph/stockMarket/companyInfoSecurityProfile.html?method=getListedRecords&common=yes&ajax=true
However, if you copy and paste that link into your browser, you get Access Denied (see the tab title). But if you first load http://www.pse.com.ph and then paste the link above again, you get the data.
Here is my code. I am using RestSharp:
string url = "http://www.pse.com.ph/stockMarket/companyInfoSecurityProfile.html?method=getListedRecords&common=yes&ajax=true";
var client = new RestClient();
client.BaseUrl = new Uri(url);
var request = new RestRequest();
IRestResponse response = client.Execute(request);
var strResult = response.Content;
return Ok("OK");
It takes a long time to get the response from the site. Maybe that is because of the source site's behavior?
Thank you so much
I think it should be the response of your site.
Try testing another way around.
Maybe, because of the slow response, your host is blocking the request.

How do I set up HttpClient PostAsync to call a new web browser

I am using HttpClient PostAsync to send data to a URI. However, the following code doesn't behave as expected:
using (var client = new HttpClient())
{
var values = new Dictionary<string, string>
{
{"cpm_site_id",TOKEN},
{"apikey",API_KEY},
{"cpm_amount",input.Amount},
{"cpm_currency",input.Currency},
{"cpm_trans_id",input.Id},
{"cpm_custom",input.Custom},
};
// Get the parameters in the url encoded format
var content = new FormUrlEncodedContent(values);
//Send request
var response = await client.PostAsync(new Uri(Urls.GetUrl(Methods.Pay, IS_PRODUCTION_SITE)), content);
When the client closes their browser, I want to receive an event notification to call this code, send the above data to the client, and open a new browser instance to perform additional actions. However, this code doesn't accomplish this and I'm not sure exactly why.
I think you'll need to use something like Selenium to automate a web browser. The HttpClient can perform HTTP functions, but does not work like a web browser does.
See this SO post for a 'hello world' example.
See this SO post for an example of capturing the browser close event. I've not done this with C#, but I'd imagine it'll be similar to this Java example.

Preserve an escaped Uri with HttpClient

I'm trying to use HttpClient to create a GET request with the following Uri:
http://test.com?action=enterorder&ordersource=acme&resid=urn%3Auuid%3A0c5eea50-9116-414e-8628-14b89849808d
As you can see, the resid param is escaped with %3A, i.e. the ":" character.
When I use this Uri in the HttpClient request, the url becomes:
http://test.com?action=enterorder&ordersource=acme&resid=urn:uuid:0c5eea50-9116-414e-8628-14b89849808d and I receive an error from the server because %3A is expected.
Does anyone have any clue how to preserve the escaped Uri when sending the request? It seems HttpClient always unescapes characters in the string before sending it.
Here is the code used:
Uri uri = new Uri("http://test.com?action=enterorder&ordersource=acme&resid=urn%3Auuid%3A0c5eea50-9116-414e-8628-14b89849808d");
using (HttpClient client = new HttpClient())
{
var resp = client.GetAsync(uri);
if (resp.Result.IsSuccessStatusCode)
{
var responseContent = resp.Result.Content;
string content = responseContent.ReadAsStringAsync().Result;
}
}
You may want to test in .NET 4.5 as a bunch of improvements were made to Uri parsing for escaped chars.
You can also check out this SO question: GETting a URL with an url-encoded slash which has a hack posted that you can use to force the URI to not get touched.
As a workaround, you could try encoding this part of the URL again to circumvent the issue: %3A would become %253A.
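A minimal sketch of that double-encoding workaround, using Uri.EscapeDataString to escape the already-escaped value a second time. The class and method names are made up for illustration, and whether the server ends up seeing %3A depends on how the runtime treats the URI, so treat this as something to try rather than a guaranteed fix.

```csharp
using System;

static class DoubleEscape
{
    // Escapes an already-escaped value a second time,
    // so each "%3A" becomes "%253A".
    public static string EscapeAgain(string alreadyEscaped)
    {
        return Uri.EscapeDataString(alreadyEscaped);
    }

    // Builds the request URL from the question with a doubly escaped resid.
    public static string BuildUrl(string escapedResid)
    {
        return "http://test.com?action=enterorder&ordersource=acme&resid="
               + EscapeAgain(escapedResid);
    }
}
```

For example, DoubleEscape.BuildUrl("urn%3Auuid%3A0c5eea50-9116-414e-8628-14b89849808d") produces a resid of urn%253Auuid%253A0c5eea50-9116-414e-8628-14b89849808d.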
