I have to create a web request using HTTP GET, and I have to append to the URL a serialized, encoded JSON object containing all the parameters.
But I have a problem when encoding the serialized object: the encoder also encodes symbols like "{" and ":".
I would like to know what I have to do in order to encode the serialized object as shown below:
Serialized Object:
{\"Name\":\"Bob\"}
Encoded with HttpUtility.UrlEncode (or another encoder), all symbols are encoded, including "{" and ":":
"%7b%22Name%22%3a%22Bob%22%2
What I'm looking for:
http://tmpserviceURL.test?parameters={%20%22Name%22:%20%22Bob%22}
@Ant P is correct: you want those characters to be encoded. It is a bad idea not to encode them.
HttpUtility.UrlEncode and other similar methods encode {, } and : because they MUST do so per section 2.2 of the Uniform Resource Locators specification (RFC 1738).
From page 2:
Octets [within a URL] must be encoded if they have no corresponding graphic
character within the US-ASCII coded character set, if the use of the
corresponding character is unsafe, or if the corresponding character
is reserved for some other interpretation within the particular URL
scheme.
The spec goes on to define : as being in the set of "reserved" characters (those that have special meaning within a URL), and defines { and } as being in the set of "unsafe" characters (those that are known to be sometimes modified by gateways and other transport agents).
So, in short, if you send these characters unencoded in a URL, you risk the URL not being interpreted correctly, or the data being corrupted by the time it reaches its destination. It may work sometimes, but you can't rely on it always working.
If you really feel you must ignore the URL spec, then you will have to roll your own URL encoder that does not encode those particular characters. I doubt you're going to find an off-the-shelf encoder that allows you to do this.
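For comparison, here is a minimal sketch of the spec-compliant route: percent-encode the whole serialized JSON as a single query value and let the receiver decode it. The BuildUrl helper and the class name are hypothetical, the JSON literal stands in for whatever your serializer produces, and the host is the one from the question.

```csharp
using System;

public static class JsonQueryExample
{
    // Hypothetical helper: appends already-serialized JSON as one
    // fully percent-encoded query-string value.
    public static string BuildUrl(string baseUrl, string json)
    {
        return baseUrl + "?parameters=" + Uri.EscapeDataString(json);
    }

    public static void Main()
    {
        string json = "{\"Name\":\"Bob\"}";
        string url = BuildUrl("http://tmpserviceURL.test", json);
        Console.WriteLine(url);
        // http://tmpserviceURL.test?parameters=%7B%22Name%22%3A%22Bob%22%7D

        // The receiving side recovers the original JSON by decoding the value.
        string value = url.Substring(url.IndexOf('=') + 1);
        Console.WriteLine(Uri.UnescapeDataString(value)); // {"Name":"Bob"}
    }
}
```

A standards-compliant server decodes the value before handing it to your code, so the encoded braces and colons normally cost nothing on the receiving end.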
I have to download a file (using existing Flurl-Http endpoints [1]) whose name contains a "#", which of course has to be escaped to %23 so that it does not conflict with URI fragment detection.
But Flurl escapes everything else and leaves this character alone, resulting in a non-working URI where half of the path and all query parameters are missing because they are parsed as the URI fragment:
Url url = "http://server/api";
url.AppendPathSegment("item #123.txt");
Console.WriteLine(url.ToString());
Returns: http://server/api/item%20#123.txt
This means an HTTP request (using Flurl.Http) would only try to download the non-existent resource http://server/api/item%20.
Even when I pre-escape the segment, the result still becomes exactly the same:
url.AppendPathSegment("item %23123.txt");
Console.WriteLine(url.ToString());
Again returns: http://server/api/item%20#123.txt.
Is there any way to stop this "magic" from happening?
[1] This means I have delegates/interfaces where input is an existing Flurl.Url instance which I have to modify.
It looks like you've uncovered a bug. Here are the documented encoding rules Flurl follows:
Query string values are fully URL-encoded.
For path segments, reserved characters such as / and % are not encoded.
For path segments, illegal characters such as spaces are encoded.
For path segments, the ? character is encoded, since query strings get special treatment.
According to the 2nd point, it shouldn't encode # in the path, so how it handles AppendPathSegment("item #123.txt") is correct. However, when you encode the # to %23 yourself, Flurl certainly shouldn't unencode it. But I've confirmed that's what's happening. I invite you to create an issue on GitHub and it'll be addressed.
In the meantime, you could write your own extension method to cover this case. Something like this should work (and you wouldn't even need to pre-encode #):
public static Url AppendFileName(this Url url, string fileName) {
    url.Path += "/" + WebUtility.UrlEncode(fileName);
    return url;
}
I ended up using Uri.EscapeDataString(foo), because the suggested WebUtility.UrlEncode replaces spaces with '+', which I didn't want.
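As a rough sketch of that variant, the helper below works on plain strings so that it does not assume anything about Flurl's API; the class and method names are hypothetical. It relies on Uri.EscapeDataString escaping both the space and the # in the file name.

```csharp
using System;

public static class UrlPathHelper
{
    // Hypothetical helper: appends a file name as one fully escaped path
    // segment. Uri.EscapeDataString turns ' ' into %20 and '#' into %23.
    public static string AppendFileName(string baseUrl, string fileName)
    {
        return baseUrl.TrimEnd('/') + "/" + Uri.EscapeDataString(fileName);
    }

    public static void Main()
    {
        Console.WriteLine(AppendFileName("http://server/api", "item #123.txt"));
        // http://server/api/item%20%23123.txt
    }
}
```

Whether the resulting string survives being assigned back to a Flurl Url depends on the Flurl version and on the bug described above, so that part is left untested here.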
I'm generating an encoded value to pass in my URL. The issue is that our SEO manager configured the application to lowercase all URLs, and he says he won't change the configuration. So I have to encode my URL in some way that survives lowercasing, for example by encoding the uppercase letters (or the whole string) by character code, so that I can pass it without ruining the original value.
For example, my resulting base64 string is as follows:
aHR0cDovL2xvY2FsaG9zdDoxMzUwL2hvdGVscy9nMy8xMzk1LTA1LTEwLzEvOTI3MjIyZmY
but it turns into this when it is passed to the controller:
ahr0cdovl2xvy2fsag9zddoxmzuwl2hvdgvscy9nmy8xmzk1lta1ltewlzevoti3mjiyzmy
which can't be read: the changed case breaks decoding.
You cannot use base64 if the value will be transformed to lowercase outside of your control; base64 relies on uppercase characters.
If the configuration your manager insists on lowercases incoming or outgoing query string parameters, however, you should inform him that this violates the URI specification, specifically the query string section. Of course it is ultimately your company's internal choice whether you want only lowercase in your internal URIs, but you should not assume that other applications handling URIs will operate like this.
As @sachin stated above, if you can make this a POST request (instead of a GET, as I assume it is now), you can send this data in the POST body, provided that your manager is not lowercasing that as well :/
Alternatively, you could use Base32 to get around this. It relies on uppercase characters only, so you can simply convert the received value to uppercase before decoding the (now Base32) string. This is a pretty ridiculous solution, though...
Just to be clear: "lol" would encode in Base32 to "NRXWY===" which would then be lower cased to "nrxwy===" which you could then uppercase back to "NRXWY===" prior to decoding.
These are two NuGet packages that do Base32 encoding:
Base32 as per RFC 4648 here; the author claims it's tested and working correctly.
Another package, which looks appealing because it supports zBase32, here. The advantage of zBase32 is that it already uses lowercase characters only, so you won't have to worry about changing the case. The porter/author has included instructions on how to get zBase32 encoding.
Both of these (Base32 and zBase32) use a subset of the Base64 characters, so they'll both work fine with URIs: all of the characters used are valid in URIs. (The UTF-8 content is irrelevant since you're just encoding bytes, so you'll get the same bytes back when you decode from Base32.)
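To illustrate the round trip without depending on either package, here is a small hand-rolled sketch of RFC 4648 Base32. The Base32Sketch class and its methods are hypothetical and not the API of the NuGet packages above; the point is only that uppercasing before decoding undoes the forced lowercasing.

```csharp
using System;
using System.Text;

public static class Base32Sketch
{
    private const string Alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";

    public static string Encode(byte[] data)
    {
        var sb = new StringBuilder();
        int buffer = 0, bitsLeft = 0;
        foreach (byte b in data)
        {
            buffer = (buffer << 8) | b;
            bitsLeft += 8;
            while (bitsLeft >= 5)
            {
                sb.Append(Alphabet[(buffer >> (bitsLeft - 5)) & 31]);
                bitsLeft -= 5;
            }
            buffer &= (1 << bitsLeft) - 1; // keep only the unread bits
        }
        if (bitsLeft > 0) // left-align the remaining bits in a final group
            sb.Append(Alphabet[(buffer << (5 - bitsLeft)) & 31]);
        while (sb.Length % 8 != 0) sb.Append('='); // pad to a multiple of 8
        return sb.ToString();
    }

    public static byte[] Decode(string text)
    {
        // Uppercasing first makes decoding tolerant of lowercased input.
        string s = text.TrimEnd('=').ToUpperInvariant();
        var bytes = new byte[s.Length * 5 / 8];
        int buffer = 0, bitsLeft = 0, index = 0;
        foreach (char c in s)
        {
            int value = Alphabet.IndexOf(c);
            if (value < 0) throw new FormatException("Invalid Base32 character: " + c);
            buffer = (buffer << 5) | value;
            bitsLeft += 5;
            if (bitsLeft >= 8)
            {
                bytes[index++] = (byte)((buffer >> (bitsLeft - 8)) & 0xFF);
                bitsLeft -= 8;
                buffer &= (1 << bitsLeft) - 1;
            }
        }
        return bytes;
    }

    public static void Main()
    {
        string encoded = Encode(Encoding.UTF8.GetBytes("lol"));
        Console.WriteLine(encoded);                       // NRXWY===
        string mangled = encoded.ToLowerInvariant();      // what the lowercasing does
        Console.WriteLine(Encoding.UTF8.GetString(Decode(mangled))); // lol
    }
}
```

Note that the = padding may itself need percent-encoding depending on where the value sits in the URL; stripping it before sending is a common workaround, and this Decode does not require it.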
My site throws an exception every time a special kind of character is included in the request, or when the size of the URL exceeds a certain length.
How can I inspect the URL and transform it before processing it? For example, if the request was http://xwz.com/"ert, I want to turn it into http://xwz.com/ert.
I am using .NET and C#.
Use this: HttpServerUtility.UrlEncode Method (String)
You can use it like this:
System.Web.HttpUtility.UrlEncode("test t");
You will need this library: UrlEncode uses System.Web.HttpUtility.UrlEncode to encode strings.
Looking for HttpUtility.UrlEncode
The UrlEncode(String) method can be used to encode the entire URL, including query-string values. If characters such as blanks and punctuation are passed in an HTTP stream without encoding, they might be misinterpreted at the receiving end. URL encoding converts characters that are not allowed in a URL into character-entity equivalents; URL decoding reverses the encoding. For example, when the characters < and > are embedded in a block of text to be transmitted in a URL, they are encoded as %3c and %3e.
The code below removes any invalid characters from your URL (it replaces them with an empty string):
string cleanedUrl = System.Text.RegularExpressions.Regex.Replace(url, @"[^!#$&-;=?-\[\]_a-z~]", "");
I think this is what you're looking for:
System.Web.HttpUtility.UrlEncode(string url)
I'm using .NET 4.5 and I'm trying to parse a URI query string into a NameValueCollection. The right way seems to be to use HttpUtility.ParseQueryString(string query) which takes the string obtained from Uri.Query and returns a NameValueCollection. Uri.Query returns a string that is escaped according to RFC 2396, and HttpUtility.ParseQueryString(string query) expects a string that is URL-encoded. Assuming RFC 2396 and URL-encoding are the same thing, this should work fine.
However, the documentation for ParseQueryString claims that it "uses UTF8 format to parse the query string". There is also an overloaded method which takes a System.Text.Encoding and then uses that instead of UTF8.
My question is: what does it mean to use UTF8 as the encoding? The input is a string, which by definition (in C#) is UTF-16. How is that interpreted as UTF-8? What is the difference between using UTF8 and UTF16 as the encoding in this case? My concern is that since I'm accepting arbitrary user input, there might be some security risk if I botch the encoding (i.e. the user might be able to slip through some script exploit).
There is a previous question on this topic (How to parse a query string into a NameValueCollection in .NET) but it doesn't specifically address the encoding problem.
When parsing encoded values, it treats those values as UTF-8. Take the character ¢, for example. The UTF-8 encoding is C2 A2. So if it were in a query string, it would be encoded as %C2%A2.
Now, when ParseQueryString is decoding, it needs to know what encoding to use. The default is UTF-8, meaning that the character would be decoded correctly. But perhaps the user was using Microsoft's Cyrillic code page (Windows-1251), where C2 and A2 are two different characters. In that case, interpreting it as UTF-8 would be an error.
If this is a user interface application (i.e. the user is entering data directly), then you probably want to use whatever encoding is defined for the current UI culture. If you're getting this information from Web pages, then you'll want to use whatever encoding the page uses. And if you're writing a Web service then you can tell the users that their input has to be UTF-8 encoded.
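The effect of the encoding argument can be seen in a short sketch. It assumes System.Web's HttpUtility is available, and it uses ISO-8859-1 rather than Windows-1251 as the "wrong" encoding purely because that encoding is available on every .NET runtime without registering a code page provider; the Value helper is hypothetical.

```csharp
using System;
using System.Text;
using System.Web;

public static class QueryEncodingExample
{
    // Hypothetical helper: parses a query string with the given encoding
    // and returns the decoded value stored under "c".
    public static string Value(string query, Encoding encoding)
    {
        return HttpUtility.ParseQueryString(query, encoding)["c"];
    }

    public static void Main()
    {
        string query = "c=%C2%A2";

        // The bytes C2 A2 read as UTF-8 form one character: U+00A2 (cent sign).
        Console.WriteLine(Value(query, Encoding.UTF8).Length); // 1

        // The same bytes read as ISO-8859-1 form two characters: U+00C2, U+00A2.
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        Console.WriteLine(Value(query, latin1).Length); // 2
    }
}
```

In both cases the percent-escapes are first turned back into bytes; the encoding only decides how those bytes are mapped to the UTF-16 characters of the resulting string.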
Using HttpUtility.UrlEncode and passing via the URL the receiving page sees the variables as:
brand new -> brand+new
Airconaire+Ltd -> Airconaire+Ltd
Can you see how the first and the second both have a + in them where they didn't at the start? I'm assuming this is something to do with the encoding (specifically RFC3986 or RFC2396) but how do I solve this?
I think ideally the spaces should be converted to %20 but is this the best way forward?
Try using HttpUtility.UrlPathEncode rather than UrlEncode.
The UrlEncode() method can be used to encode the entire URL, including query-string values. If characters such as blanks and punctuation are passed in an HTTP stream, they might be misinterpreted at the receiving end. URL encoding converts characters that are not allowed in a URL into character-entity equivalents; URL decoding reverses the encoding. For example, when the characters < and > are embedded in a block of text to be transmitted in a URL, they are encoded as %3c and %3e.
You can encode a URL using the UrlEncode() method or the UrlPathEncode() method. However, the methods return different results. The UrlEncode() method converts each space character to a plus character (+). The UrlPathEncode() method converts each space character into the string "%20", which represents a space in hexadecimal notation. Use the UrlPathEncode() method when you encode the path portion of a URL in order to guarantee a consistent decoded URL, regardless of which platform or browser performs the decoding.
http://msdn.microsoft.com/en-us/library/4fkewx0t.aspx
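A quick side-by-side of the encoders discussed in this thread, as a sketch that assumes System.Web's HttpUtility is available. Uri.EscapeDataString is not mentioned in the answer above; it is added here only as another way to get %20, and it also escapes a literal + so that the receiver can tell it apart from an encoded space.

```csharp
using System;
using System.Web;

public static class SpaceEncodingExample
{
    public static void Main()
    {
        // Space handling differs between the encoders.
        Console.WriteLine(HttpUtility.UrlEncode("brand new"));       // brand+new
        Console.WriteLine(HttpUtility.UrlPathEncode("brand new"));   // brand%20new
        Console.WriteLine(Uri.EscapeDataString("brand new"));        // brand%20new

        // A literal plus sign must be escaped, or it decodes as a space.
        Console.WriteLine(Uri.EscapeDataString("Airconaire+Ltd"));   // Airconaire%2BLtd
        Console.WriteLine(HttpUtility.UrlDecode("Airconaire%2BLtd")); // Airconaire+Ltd
        Console.WriteLine(HttpUtility.UrlDecode("brand+new"));       // brand new
    }
}
```

If the receiving page shows a + where a space was sent, the value is most likely being displayed without being decoded, or is being encoded twice somewhere along the way.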