How to fix an encoding issue in an API call - C#

I'm making calls to a third-party API over HTTP and passing a string in the parameters. This string is UTF-8 URL encoded. My API client is written in ASP.NET C#, whereas the API host is probably written in Java. When I have characters like parentheses/brackets () in the string parameter, the UTF-8 encoder does not encode them, whereas the API host encodes them as %28 and %29, and I get an incorrect response. Any suggestions on how to fix this encoding problem?
(The API documentation specifies UTF-8 encoding and recommends referring to this document: http://docs.oracle.com/javase/6/docs/api/java/net/URLEncoder.html)

There are some little-known Uri methods (documented on MSDN):
Uri.EscapeUriString and Uri.EscapeDataString may fit the bill. Unfortunately the MSDN documentation doesn't spell out exactly which characters they will encode, so you will just have to experiment.
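If you need the client to produce exactly what java.net.URLEncoder produces (which does escape parentheses), another option is a small custom encoder that uses the same character table. This is only a minimal sketch (the class and method names are made up, and supplementary non-BMP characters are not handled):

using System.Text;

static class JavaStyleUrlEncoder
{
    // Mimics java.net.URLEncoder: only A-Z, a-z, 0-9, '.', '-', '*' and '_'
    // pass through, space becomes '+', everything else (including '(' and ')')
    // is UTF-8 encoded and written as %XX.
    public static string Encode(string value)
    {
        var sb = new StringBuilder();
        foreach (char c in value)
        {
            bool safe = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                     || (c >= '0' && c <= '9') || c == '.' || c == '-'
                     || c == '*' || c == '_';
            if (safe)
            {
                sb.Append(c);
            }
            else if (c == ' ')
            {
                sb.Append('+');
            }
            else
            {
                // Characters outside the BMP would need surrogate-pair
                // handling, which is omitted in this sketch.
                foreach (byte b in Encoding.UTF8.GetBytes(c.ToString()))
                {
                    sb.AppendFormat("%{0:X2}", b);
                }
            }
        }
        return sb.ToString();
    }
}

For example, JavaStyleUrlEncoder.Encode("model (v2)") gives "model+%28v2%29", which should line up with what the linked Java URLEncoder documentation describes.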

Related

WebAPI get request parameters being UrlDecoded at the controller

I have an issue where I'm issuing a GET to a WebAPI controller, essentially:
$.getJSON('/api/feefo/getproductfeedback?id='+ encodeURIComponent(skuNum))
I'm using encodeURIComponent to URL-encode the skuNum parameter. Viewing the request in dev tools, I get the expected result for a skuNum that needs to be encoded:
The skuNum has gone from 1000EF+ to 1000EF%2B as expected.
However, when I view the id parameter in the WebAPI controller, it's coming through un-encoded:
It's as though the client-side URL encoding is being undone somehow - can anyone explain what's going on here? Obviously I can work around this by just doing the encoding in the controller, but I'd like to understand why this is happening.
That is by design. The Web API framework URL-decodes encoded parameters by default. The encoding should only be used for transporting the data; once it reaches the server, the developer shouldn't have to deal with decoding it (a cross-cutting concern). Use the value as intended.
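To make that concrete, here is a minimal sketch of the receiving side (the controller and action names are taken from the URL in the question, and action-based routing is assumed): by the time the action runs, model binding has already URL-decoded the parameter, so 1000EF%2B on the wire arrives as 1000EF+.

using System.Web.Http;

public class FeefoController : ApiController
{
    [HttpGet]
    public IHttpActionResult GetProductFeedback(string id)
    {
        // id is already decoded here ("1000EF+"), so use it directly;
        // do not call UrlDecode on it again.
        return Ok(new { sku = id });
    }
}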

What encoding is used for SAML conversations?

I'm setting up a service to be a SAML 2.0 Service Provider (SP). As such, I need to generate SAML Requests and I need to accept SAML Responses. SAML Responses (with IdP-initiated assertions) may come without a request. This is just the world of SSO and SAML, and I have this much working.
My sense is that SAML Requests or Responses may or may not be deflated. It seems to be good practice for an SP to deflate SAML Requests.
Requests and Responses are also Base 64 Encoded. But here lies my question. Let us say that I get a SAML Response. It is Base 64 Encoded. When I decode that, I get a byte array. Assuming that this is NOT deflated, I now need to get a string out of that byte array in order to treat it as XML.
What encoding should I assume for that string?
So, in the c#/.NET/MVC world:
public ActionResult ConsumeSamlAssertion(string samlResponse)
{
    if (string.IsNullOrWhiteSpace(samlResponse))
    {
        return Content("Consumption URL hit without a SAML Response");
    }

    // MVC already gives me this URL-decoded
    byte[] bytes = Convert.FromBase64String(samlResponse);

    // For this question, assume that this is not deflated.
    string samlXmlIfAscii = Encoding.ASCII.GetString(bytes);
    string samlXmlIfUtf8 = Encoding.UTF8.GetString(bytes);

    // Which is correct? Or is there a different one?
    // ...
}
Is this in some standard I have missed (which isn't for want of looking)?
Many thanks.
I can't find anything authoritative in the SAML2 specification on what encoding to use. I've used UTF-8 and it works.
Regarding the deflate step - that depends on the binding. In the redirect binding, where the message is passed in the query string, it is deflated. In the POST binding, where it is passed as a form field, it is not deflated.
Also I'd suggest that you look at existing SAML2 stacks for .NET instead of rolling your own. It's a lot of work doing SAML2 right, and it's easy to get security issues such as XML signature wrapping.
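As a rough sketch of that difference (not taken from the SAML specification itself, just how the two bindings are commonly handled in .NET): for the redirect binding you base64-decode and then inflate with a raw DeflateStream, while for the POST binding you skip the inflate step.

using System;
using System.IO;
using System.IO.Compression;
using System.Text;

static string DecodeRedirectBindingMessage(string samlParameter)
{
    // Redirect binding: the (already URL-decoded) value is base64(deflate(xml)).
    byte[] compressed = Convert.FromBase64String(samlParameter);
    using (var input = new MemoryStream(compressed))
    using (var deflate = new DeflateStream(input, CompressionMode.Decompress))
    using (var reader = new StreamReader(deflate, Encoding.UTF8))
    {
        return reader.ReadToEnd();   // the SAML XML as a string
    }
}

static string DecodePostBindingMessage(string samlParameter)
{
    // POST binding: just base64, no deflate step.
    return Encoding.UTF8.GetString(Convert.FromBase64String(samlParameter));
}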
SAML requests and responses are in XML format, so this boils down to the question how to encode XML data.
See for example: Meaning of - <?xml version="1.0" encoding="utf-8"?>
The default encoding for XML (if no XML declaration is present, or the declaration does not specify an encoding) is UTF-8. Therefore we can say that the XML specification authoritatively specifies that UTF-8 can be used.
Whether all SAML implementations, and the SAML specification itself, allow other encodings is unclear to me, but using UTF-8 should be safe.
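One way to avoid guessing at an Encoding at all is to hand the decoded bytes straight to the XML parser and let it honour any BOM or encoding declaration, falling back to the UTF-8 default. A small sketch:

using System.IO;
using System.Xml;

static XmlDocument LoadSamlXml(byte[] bytes)
{
    var doc = new XmlDocument();
    using (var stream = new MemoryStream(bytes))
    {
        doc.Load(stream);   // encoding is detected from the stream, not assumed
    }
    return doc;
}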

Removing utf-8 identifier (BOM) from Response Sent by WCF

I am creating a clone of the Facebook REST API in C# and testing it with the Facebook PHP SDK. The problem I am having is that the responses sent by my .NET REST service have a UTF-8 BOM at the front, and the Facebook SDK is not able to parse them correctly.
Any ideas on how to resolve this problem?
If you can specify a specific Encoding to your service, then you can use new UTF8Encoding(false) which is UTF-8 without BOM.
I don't know what you are returning in your service, but if it is a string like mine (I was returning JSON), you can simply return a Message object instead, with the string in it (Message is in System.ServiceModel.Channels - google it), and then at the end of your service method implementation just do this:
Encoding utf8 = new System.Text.UTF8Encoding(false); //false == no BOM
return WebOperationContext.Current.CreateTextResponse(stringWithContent, "application/json;charset=utf-8", utf8);
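For completeness, a sketch of how that can look in context (the contract name and URI template here are made up): the operation returns Message instead of string, so WCF writes exactly the bytes produced by the BOM-less encoder.

using System.ServiceModel;
using System.ServiceModel.Channels;
using System.ServiceModel.Web;
using System.Text;

[ServiceContract]
public interface IGraphCloneService
{
    [OperationContract]
    [WebGet(UriTemplate = "me")]
    Message GetMe();
}

public class GraphCloneService : IGraphCloneService
{
    public Message GetMe()
    {
        string json = "{\"id\":\"123\"}";          // whatever your service produces
        var utf8NoBom = new UTF8Encoding(false);   // UTF-8 without the BOM
        return WebOperationContext.Current.CreateTextResponse(
            json, "application/json;charset=utf-8", utf8NoBom);
    }
}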
The Wikipedia UTF-8 article suggests that the pretend-BOM that Windows applications frequently prepend to the actual content is three bytes long. Can you simply not send the first three bytes of your generated content?
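If you only have the finished byte stream to work with, a sketch of that suggestion looks like this (assuming you can intercept the payload before it is written out):

using System;

static byte[] StripUtf8Bom(byte[] payload)
{
    // The UTF-8 BOM is the three bytes EF BB BF; drop them if present.
    if (payload.Length >= 3 &&
        payload[0] == 0xEF && payload[1] == 0xBB && payload[2] == 0xBF)
    {
        var trimmed = new byte[payload.Length - 3];
        Array.Copy(payload, 3, trimmed, 0, trimmed.Length);
        return trimmed;
    }
    return payload;
}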

What is the correct encoding for querystrings?

I am trying to send a request to a URL like "http://mysite.dk/tværs?test=æ" from an ASP.NET application, and I am having trouble getting the querystring to encode correctly. Or maybe the querystring is encoded correctly and the service I am connecting to just doesn't understand it.
I have tried to send the request with different browsers and logging how they encode the request with Wireshark, and I get these results:
Firefox: http://mysite.dk/tv%C3%A6rs?test=%E6
Ie8: http://mysite.dk/tv%C3%A6rs?test=\xe6
Curl: http://mysite.dk/tv\xe6rs?test=\xe6
Firefox, IE and curl all receive the correct results from the service. Note that they encode the Danish special character 'æ' differently in the querystring.
When I send the request from my asp.net application using HttpWebRequest, the URL gets encoded this way:
http://mysite.dk/tv%C3%A6rs?test=%C3%A6
It encodes the querystring the same way as the path part of the url. The remote service does not understand this encoding, so I don't get a correct answer.
For the record, 'æ' (U+00E6) is %E6 in ISO-LATIN-1, and %C3%A6 in UTF-8.
I could change the remote service to accept the UTF-8 encoded querystring, but then the service would stop working in browsers and I am not really interested in that. Is there a way to specify to .NET that it shouldn't encode querystrings with UTF-8?
I am creating the webrequest like this:
var req = WebRequest.Create("http://mysite.dk/tværs?test=æ") as HttpWebRequest;
But the problem seems to originate from System.Uri which is apparently used inside WebRequest.Create:
var uri = new Uri("http://mysite.dk/tværs?test=æ");
// now uri.AbsoluteUri == "http://mysite.dk/tv%C3%A6rs?test=%C3%A6"
It looks like you're applying UrlEncode over the entire URL - this isn't correct; paths and query strings are encoded differently, as you've seen. What is doing the encoding of the URI - WebRequest?
You could manually build the various parts using a UriBuilder, or manually encode using UrlPathEncode for the path and UrlEncode for the query string names and values.
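A sketch of that idea, assuming the remote service wants the UTF-8 path (%C3%A6) but a Latin-1 query value (%E6), as in the browser traces above (the host and values are just the ones from the question):

using System;
using System.Net;
using System.Text;
using System.Web;   // HttpUtility lives in System.Web.dll

static HttpWebRequest BuildRequest()
{
    var builder = new UriBuilder("http", "mysite.dk");
    builder.Path = "tværs";                                  // escaped by UriBuilder as tv%C3%A6rs
    builder.Query = "test=" + HttpUtility.UrlEncode("æ",     // pre-encoded as %e6; the Query
        Encoding.GetEncoding("iso-8859-1"));                 // setter does not re-escape it
    return (HttpWebRequest)WebRequest.Create(builder.Uri);
}

Whether the pre-encoded %e6 survives Uri normalisation untouched can vary between framework versions, so it is worth checking the final request in Wireshark as you already did.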
Edit:
If the problem lies in the path, rather than the query string you could try turning on IRI support, via web.config
<configuration>
<uri>
<iriParsing enabled="true" />
</uri>
</configuration>
That should then leave international characters alone in the path.
Have you tried UrlEncode?
http://msdn.microsoft.com/en-us/library/zttxte6w.aspx
I ended up changing my remote webservice to expect the querystring to be UTF-8 encoded. It solves my immediate problem: the webservice can now be called correctly by both PHP and the .NET framework.
However, the behaviour is now strange in browsers. Copy-pasting a URL like "http://mysite.dk/tv%C3%A6rs?test=%C3%A6" into the browser and pressing return works; it even decodes the escaped characters and displays the location as "http://mysite.dk/tværs?test=æ". If I then reload the page (F5) it still works. But if I click on the location bar and press return again, the querystring gets encoded with Latin-1 and fails.
For anyone interested, here is an old Firefox bug report about the problem: https://bugzilla.mozilla.org/show_bug.cgi?id=284474 (thanks to @dtb).
So, it seems there is no good solution.
Thanks to everyone who helped though!

Does WCF automatically URL encode/decode streams?

I'm programming a service for a program that uses HTTP post/get requests, so I handle all incoming requests with a hook method that takes a System.IO.Stream and returns a System.IO.Stream.
When I parse the incoming request (the contents of an HTML form) by converting it to a string and then using System.Web.HttpUtility.ParseQueryString(string), it seems to URL-decode the data automatically. When I return a file path (a Windows UNC path - not going to explain why I do that), I initially URL-encoded the string before converting it to a stream and returning it with a return statement, and the client seems to get a doubly-encoded string.
So, just to be sure, does WCF automatically URL encode/decode streams for me as part of using System.ServiceModel.WebHttpBinding?
Apparently, it does:
"For RESTful services, WCF provides a binding named System.ServiceModel.WebHttpBinding.
This binding includes pieces that know how to read and write information using the HTTP and HTTPS transports, as well as encode messages suitable for use with HTTP."
from here.
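A small illustration of the asymmetry (the form field name and UNC path are made up): HttpUtility.ParseQueryString hands back already-decoded values, so the response body should be written as-is rather than URL-encoded a second time.

using System.IO;
using System.Text;
using System.Web;

static Stream HandleRequest(Stream request)
{
    string body = new StreamReader(request).ReadToEnd();        // e.g. "path=%5C%5Cserver%5Cshare"
    var form = HttpUtility.ParseQueryString(body);
    string uncPath = form["path"];                              // already decoded: \\server\share
    return new MemoryStream(Encoding.UTF8.GetBytes(uncPath));   // no extra UrlEncode here
}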
