C# Parse HTML Post Data


I have MemoryStream data (HTML POST data) that I need to parse.
Converting it to a string gives a result like the one below:
key1=value+1&key2=val++2
The problem is that every + here was a space in the HTML form. I am not sure why the spaces are being converted to +.
This is how I am converting the MemoryStream to a string:
Encoding.UTF8.GetString(request.PostData.ToArray())

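If the Content-Type is application/x-www-form-urlencoded, the body is already URL encoded, and in that format a space is sent as +. You need to decode it, not encode it again.
Use System.Web.HttpUtility.UrlDecode() on a single value, or HttpUtility.ParseQueryString() to split and decode the whole body:
using System.Text;
using System.Web;
var form = HttpUtility.ParseQueryString(Encoding.UTF8.GetString(request.PostData.ToArray()));
See more in MSDN.
You can also use JSON format for POST.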
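For the body shown in the question, decoding could look roughly like the sketch below. The class and method names (FormBodyParser, Parse) are made up for illustration, and it assumes the body is UTF-8 and that System.Web.HttpUtility is available to the project.

```csharp
using System;
using System.Collections.Specialized;
using System.IO;
using System.Text;
using System.Web;

// Hypothetical helper, not part of any framework.
public static class FormBodyParser
{
    // Reads an application/x-www-form-urlencoded body from a stream.
    // ParseQueryString splits on '&' and '=' first and then decodes each
    // name and value, so '+' and %20 both come back as spaces.
    public static NameValueCollection Parse(MemoryStream postData)
    {
        string raw = Encoding.UTF8.GetString(postData.ToArray());
        return HttpUtility.ParseQueryString(raw, Encoding.UTF8);
    }
}
```

With the sample body key1=value+1&key2=val++2, Parse(...)["key1"] is "value 1" and Parse(...)["key2"] is "val  2" (two spaces). Decoding after splitting matters: calling UrlDecode on the whole string first would break values that contain an encoded & or =.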

I suppose that the data you are retrieving is encoded with URL rules.
You can find out why data is encoded in this format by reading this simple article from W3Schools.
To encode or decode your POST string you can use this pair of methods:
System.Web.HttpUtility.UrlEncode(yourString); // Encode
System.Web.HttpUtility.UrlDecode(yourString); // Decode
You can find more information about URL manipulation functions here.
Note: If you need to encode or decode an array of strings, you need to enumerate the collection with a for or foreach statement. Remember that inside a foreach loop you cannot assign to the loop variable, so you will probably need a temporary variable or a for loop that writes back by index.
Finally, to parse strings efficiently, I suggest you use the System.Text.RegularExpressions.Regex class and learn the regex "language".
You can find some examples of how to use Regex here; the Regex101 site also has a C# code generator that shows you how to translate your regex into code.

Related

Redis compressing string values in .net MVC

In my app I want to compress the data that get stored in redis string keys.
I don't want to compress all of them though because small data values don't compress well and I want to avoid the cpu overhead on them.
My question is how to detect that a value is compressed when I read the string key in order to perform decompression?
I tried some code to append a custom header to the zip stream, but I didn't have any luck.

A common pattern is to use a payload prefix combined with a delimiter.
For example, you could use a format like this:
[key];[encoding];[metatype];[version]\t[payload]
I use the delimiters ; and \t here. Choose other delimiters if you like them better. Of course, you must prevent these delimiters from occurring in the prefix tags themselves. [payload] contains, for example, binary data, string data, whatever. [encoding] can be, for example, zip, msgpack, utf8, base64 or json (just some ideas).
The benefit of using a payload prefix is that you don't have to deserialize or uncompress the payload itself to use it as an entity. In Redis-Lua, for example, you can't unzip. But you can do a simple read of the payload prefix and respond to client requests. Even where you can deserialize inside Redis-Lua, as with the JSON or MsgPack formats, you might not want to for performance reasons.
There are other options, of course. If you don't like prefixes with delimiters, you could also put the payload and encoding tag in an array and serialize it as MsgPack. Or use JSON for the prefix, then a null character, then the payload. Or even (a bit more memory efficient): use 4 or 8 bytes for the prefix size, MsgPack for the prefix, and use the prefix size to determine where the payload starts (the payload might even be MsgPack as well).
Final word of advice: don't mess with the payload itself (like altering the zip header), that's bound to get you in a whole lot of unnecessary trouble.
Hope this helps, TW
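As a rough illustration of the prefix idea for the compression question, here is a minimal sketch that uses a single marker byte instead of a delimited text prefix. The marker values, the 256-byte threshold and the class name PrefixedValue are assumptions chosen for the example; the resulting byte array is what you would store as the Redis string value.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper: a one-byte marker in front of the payload tells the
// reader whether the rest is plain UTF-8 or gzip-compressed UTF-8.
public static class PrefixedValue
{
    private const byte Raw = 0;         // payload is plain UTF-8
    private const byte Gzip = 1;        // payload is gzip-compressed UTF-8
    private const int Threshold = 256;  // assumed cut-off in bytes; tune for your data

    public static byte[] Encode(string value)
    {
        byte[] utf8 = Encoding.UTF8.GetBytes(value);
        using (var output = new MemoryStream())
        {
            if (utf8.Length < Threshold)
            {
                // Small values are stored as-is to avoid the CPU cost.
                output.WriteByte(Raw);
                output.Write(utf8, 0, utf8.Length);
            }
            else
            {
                output.WriteByte(Gzip);
                // leaveOpen keeps 'output' usable after the gzip stream is flushed.
                using (var gzip = new GZipStream(output, CompressionMode.Compress, leaveOpen: true))
                {
                    gzip.Write(utf8, 0, utf8.Length);
                }
            }
            return output.ToArray();
        }
    }

    public static string Decode(byte[] stored)
    {
        if (stored == null || stored.Length == 0)
            throw new ArgumentException("Stored value has no marker byte.", nameof(stored));

        if (stored[0] == Raw)
            return Encoding.UTF8.GetString(stored, 1, stored.Length - 1);

        if (stored[0] != Gzip)
            throw new InvalidDataException("Unknown encoding marker.");

        using (var input = new MemoryStream(stored, 1, stored.Length - 1))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var result = new MemoryStream())
        {
            gzip.CopyTo(result);
            return Encoding.UTF8.GetString(result.ToArray());
        }
    }
}
```

The zip stream itself is left untouched, in line with the advice above; only the byte in front of it is inspected when reading.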

Best datatype for storing html

I need to store HTML in one of the variables in my class. Which datatype would you suggest? Will string be OK, or is there a special datatype I can use for this kind of operation?

string if you're storing the raw HTML.
If you were planning on storing an object-representation of the HTML, then obviously you would use that object.
However, if it's just the raw HTML string, you'd use a string. There's nothing specially suited to any type of string content.
Actually, there kind of is, but it has a specialist usage: representing already-encoded HTML data that should not be encoded again (generally used to output raw HTML in ASP.NET). This isn't what you want, but just so that this answer is complete: HtmlString.

Replace ASCII characters with their equivalent

I am setting a value in the cookie using JavaScript and getting the contents of the cookie in the code behind.
The problem is that if I store a string with special characters or whitespace, the special symbols are converted into their ASCII equivalents when I retrieve the contents of the cookie.
For example, if I want to store Adam - (SET) in the cookie, it gets converted into Adam%20-%20%28SET%29 and stored, and when I retrieve it I get the same Adam%20-%20%28SET%29. But I want to get Adam - (SET) in the code behind.
How do I get this? Please help.

In C#
Use:
String decoded = HttpUtility.UrlDecode(EncodedString);
HttpUtility.UrlDecode() is the underlying function used by most of the other alternatives you can use in the .NET Framework (see below).
You may want to specify an encoding, if necessary.
Or:
String decoded = Uri.UnescapeDataString(s);
See Uri.UnescapeDataString()'s documentation for some caveats.
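The two calls can be compared side by side with a small sketch like the one below. The wrapper class CookieDecoding is hypothetical and only exists to make the calls easy to try.

```csharp
using System;
using System.Web;

// Hypothetical wrapper around the two framework calls named above.
public static class CookieDecoding
{
    // Form-style decoding: %XX sequences are unescaped and '+' becomes a space.
    public static string WithHttpUtility(string encoded)
    {
        return HttpUtility.UrlDecode(encoded);
    }

    // URI-style decoding: %XX sequences are unescaped and '+' is left alone.
    public static string WithUri(string encoded)
    {
        return Uri.UnescapeDataString(encoded);
    }
}
```

For the cookie value in the question, both methods return Adam - (SET). They differ on a literal plus sign: WithHttpUtility("a+b") gives "a b", while WithUri("a+b") gives "a+b", which is usually what you want for values written by JavaScript's encodeURIComponent.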
In JavaScript
var decoded = decodeURIComponent(s);
Before jumping to unescape as recommended in other questions, read "decodeURIComponent vs unescape, what is wrong with unescape?". You may also want to read "What is the difference between decodeURIComponent and decodeURI?".

You can use the unescape function in JS to do that.
var str = unescape('Adam%20-%20%28SET%29');

You are looking for HttpUtility.UrlDecode() (in the System.Web namespace, I think)

In JavaScript you can use the built-in decodeURIComponent, but I suspect that the string encoding is happening when the value is sent to the server, so the C# answers are what you want.

Insert image into xml file using c#

I've looked everywhere for the answer to this question but can't find anything, so I'm hoping you guys can help me here.
Basically, I want to insert an image into an element in an XML document that I have, using C#.
I understand I have to turn it into bytes, but I'm unsure how to do this and then insert it into the correct element.
Please help, as I am a newbie.

Read all the bytes into memory using File.ReadAllBytes().
Convert the bytes to a Base64 string using Convert.ToBase64String().
Write the Base64-encoded string to your element content.
Doneski!
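The three steps above can be sketched with LINQ to XML as follows. The class name XmlImage, the method names and the choice to look for the target element directly under the document root are assumptions made for the example.

```csharp
using System;
using System.IO;
using System.Xml.Linq;

// Hypothetical helper showing the read / Base64 / write steps.
public static class XmlImage
{
    // Stores the image file as Base64 text in a direct child of the root
    // element, creating that child if it does not exist yet.
    public static void EmbedFile(string xmlPath, string imagePath, string elementName)
    {
        XDocument doc = XDocument.Load(xmlPath);
        byte[] bytes = File.ReadAllBytes(imagePath);       // step 1
        string base64 = Convert.ToBase64String(bytes);     // step 2

        XElement target = doc.Root.Element(elementName);
        if (target == null)
        {
            target = new XElement(elementName);
            doc.Root.Add(target);
        }
        target.Value = base64;                             // step 3
        doc.Save(xmlPath);
    }

    // Reverses the encoding when the document is read back.
    public static byte[] ReadBytes(XElement element)
    {
        return Convert.FromBase64String(element.Value);
    }
}
```

Base64 makes the content roughly a third larger than the original file, so for big images a link to the file may be the better trade-off.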

Here's an example in C# for writing and reading images to/from XML.

You can use a CDATA section, or simply put all the bytes in their hexadecimal form as a string.
Another option is to use Base64 encoding.
The element you use is up to you.

http://www.dreamincode.net/code/snippet1335.htm seems to do exactly what you want to do. It might be something you might want to try out. Note that it is in VB.NET which you can easily convert to C#.

XML can only contain characters, it can't contain an image. There are various ways you can represent an image using characters, for example by encoding the image in PNG and then encoding the PNG in base64; or you could generate an element that contains a link to a URI from where the image can be retrieved. All such conventions have to be agreed between sender and recipient. So before you rush into base64 encoding, check that this is what the recipient expects.

How to get QueryString from a href?

I am trying to stop XSS attacks, so I am using the HTML Agility Pack to build my whitelist and the Microsoft Anti-Cross Site Scripting Library to deal with the rest.
Now I am looking at encoding all HTML hrefs. I get a big string of HTML code that can contain hrefs. According to the MS library docs there is a URL encode method, but if you encode the whole URL it can't be used. So in the example they just encode the query string:
UrlEncode: Untrusted input is used in a URL (such as a value in a querystring). Click Here!
http://msdn.microsoft.com/en-us/library/aa973813.aspx
So now my question is: how do I parse an href and find the query string? Is it always just "?" followed by the query string, or can it have spaces and be written in different ways?
Edit
These URLs will not be written by me but by the users who share them. That's why I need a way to make sure I get all query strings and not just the ones in a valid format. If an invalid format can still work, I have to grab those too. Hackers won't care whether it is a valid format or not, as long as it still does what they want.

I believe it is always the part after the ? but you can easily use the Uri class for this:
Uri uri = new Uri("http://foo.com/page.html?query");
string query = uri.Query;
That will include the ? itself. Of course, you can fetch the other bits as well, which could be handy.
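Since hrefs in user-supplied HTML are often relative, and Uri.Query throws on a relative Uri, a slightly more defensive sketch might resolve each href against a placeholder base first. The class name HrefQuery and the base address http://placeholder.invalid/ are assumptions for the example; in practice you would use the address of the page the HTML came from.

```csharp
using System;
using System.Collections.Specialized;
using System.Web;

// Hypothetical helper for pulling query parameters out of an href value.
public static class HrefQuery
{
    // Placeholder base used only so relative hrefs can be resolved.
    private static readonly Uri PlaceholderBase = new Uri("http://placeholder.invalid/");

    // Returns the decoded query parameters, or an empty collection when the
    // href cannot be parsed as a URI at all.
    public static NameValueCollection GetQueryParameters(string href)
    {
        Uri uri;
        if (!Uri.TryCreate(PlaceholderBase, href, out uri))
            return new NameValueCollection();

        // uri.Query includes the leading '?', which ParseQueryString ignores.
        return HttpUtility.ParseQueryString(uri.Query);
    }
}
```

Each value can then be validated or re-encoded individually before the link is rebuilt. An href that fails to parse is better rejected outright than passed through.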

What about using an encrypted query string that you decrypt in your code?
Or you can use Request.PathInfo, which means you don't need a ? in the URL at all.

Here's a W3C reference addressing the composition of URIs with querystrings, which says in part:
The question mark ("?", ASCII 3F hex) is used to delimit the boundary between the URI of a queryable object, and a set of words used to express a query on that object.

