I'm writing a class that receives a JSON string (representing an object) and deserializes it using JSON.NET from Newtonsoft. Since I don't know in advance the exact type of the object I need to deserialize, I use JSON.NET to get a Dictionary.
I process each property differently depending on its type. I can recognize dates and ints without problems by comparing the parsed token's JToken.Type with, for instance, JTokenType.Date.
However, JTokenType.Bytes does not seem to work. So my problem is that I have one string that represents a normal string and another string that represents a byte[]. How can I distinguish them, either with JTokenType or with some other type? Any ideas?
Many thanks.
Best regards
Here is an example of the JSON that I'm parsing:
{"firstName":"Mario","lastName":"Bross","users":[{"name":"Tony","country":"UK","telephone":"663242342"},{"name":"Ahmed","country":"UAE","telephone":"66934534"},{"name":"Alejandro","country":"ES","telephone":"666243098"}],"firstTable":true,"secondTable":false,"greeting":false,"headerInfo":"This is a test document","footerInfo":"This page ends here","date":"2016-03-29T00:00:00+02:00","remark":"Here we have some remarks","logo":"...mZjx+pfYrAejus9sOjAMAzUOOvxc98y+f/gu4TmWYcKWX/toSHT5BJf98ayyqQiXB1izthrBtEkIcKq+3xrx9syw/avveMvpiTQFWtPq668MXW5PtsgDFViZ3K8h9mvhbueZ2k9Mfbb9DqQ2+mSr9+V8SrEZJlE5s1OodgThNCt/a63NXGbjiYvAkYnmQd8xOfSQA1dITz4phH6f+POO2aUmUM+2Gdqh+iMc6Fg52Of8rqvs/SbTf+Sanf31V+nPSC+gwIo+NK3J0YAhOGgyW8CebDOPR/IINWrOOGfjXZF8elcM5QzIPYOWO4m3FvjzfEd5AzrKTrFzcNqr7F0qZfxH7d1Onv/NeAAaHdGRDFoZdpyBOdiT1zJp4UAFQwHVGMaMPU4+Q/POlrRm8B2xVMk4G8nrsV9PkfzO95Resv1VMX/k1hetYyvyfo2J/Z72AykSkaMlqtOzL8jYeDNRNhvgSe+zt3mD5OfNM4324/Kj92wlBUNLyqukkTFd2BXJdF+J3fC++dAAirMBzRE8EWqURsmXf9br2E7+3bQT6zsCyjxeJ87+B6S2/t3CdQHX/LpT5r6FOf2crAlWZPkykPet99Q2BXraBm8bTxJ7Zg2/w2+UXa79Xb/82bMG9cOsubcioJy4FPgvKrgCbXxN1DQrDIvYhelqI6i65uOObn7TXAOdDGFwYetuLQ3LX/ckq13yN6R1/RUwf+AHOPxZ+jibWXxtD0486/R3iBZSwA5FhHb7kpUDRZT1liwFulPKccPmFB+AkYALRG+9NXzzABiMAsCLmBoB9GVOOa4KKgDi5YZz9GXcOI4XSoc/cgdbZn1LiD2U/jPtayPVGrhcQ57+jxPpTdf8AsAMTKLY6h9VqN5CHAodDMa7LbDS3XZc/FuN2GLitH/3Adb/mTSDMQmckDIXo7sNPPBasTTwep7Q84tzOgBA76nPvQOZMjPp6SUr8LaWy32VM33PRUwtR9VBVsf4C2COgdwqiLJg06YZg3szJoxebDu7AJJ9IXbcd1PHm/OTNuOM1zeXGfj+45Gajy+Duv7zzoOY9yEM6HOn7mPi0988bs+YPyfXHrL+VlPjDsM98Kvu1oxA1qsZ9h3NZsJVmdQff90Qwbl4k+4wTevH/1w67TYE72rV52keS6/8o0zf9YNbfxZT462oo+6mOvwB6AXJCsB2LS8zgFvrrgDaeDLmpzKkPTyCa9+8pJe15xR4T13820xd9Yrdlbynxp8p+QfICREKwmVZ2aZ52AcvffCIQs/nNmnm2Zo131c9bZ4tk5KmaEGiKstNz82NKBy34mZWL+yfRfZH1f8Tg+mO1HXZdNFSJv9AkBHWGINaM9Zg02+lOPjNmH2b42W3fRQwbiO01dhUMR9CcFZgf8y3ur9rgDXL9MehjLWX9b+J6EdcBBte/skr8hYYh2IASMOks+87NTnz5ON2N8/pA9nG3kMPKA7CbvUapzOm2ZbxeNC4IDUqbb5OeYPuh0+9pIvxgzNcsyvrnMH23ZWvl+oc+FKhRXBVALTZn6fv+tvF+2ndkmRKfr6O57Soors/6yG9X6r
CeO+ZLYvsh7seUH4z4voPrZKav+EqnQ0e5/mEQClSh2msLjSBUtdFINnr1Nz419Yx5qEyZD4SfYJT4EKt7vYVGqfOaMmUfJf3kuB8z/tDpdx4RfjrRoVPDkPVXBiBENOFEIgi10bKy9TqOZ2M3HPVqZPfoRWU4/bcNma7GdcWS9p97mJJ+qPej0QcDPu8jth/ifiz46EaHTW1F9w2/fEBDcs3SWKt+V7P8TSftfPEY4CHX+GEI8DsFihjL+FdIeJXpXX5PENUX9f5bmd7mO4zi/jZ02CQqwk/45gPO1Uo0HXNvY/mbf3f3xSOxJ4M/GPG+0jDTEctOsoS6r0tkn/VS0g/1/lFcU5k+5KOBivvDOx9Qq5gfgHity7g72biiM2ZfPMZzGbfvqll9MQ1+kfHHgI85TO/yQ9KvL9X7G6u4P3L4AS2ZPpdtIOsxaZHRCBjBv2DQlIDG++2mvaDA5uPINnkdmOO1fn2qrwD/RmL6IeN/LZF9Minp15Tq/aLRR7n+YZoPkJOCqNNiL9sQlnTJYmEEZPDD/Q/0Rh7ovKe/UID2QdG+a3f6r9fgr9XqbQP4MdgT/f2Y7oOlnv3Jk2yukn6Rkw8QSUH0C2Ao49lM52sPZ90nLJ0zaOoZAf7d6UODwufH8AmstVZGwLcNQI7vL4TbX63Jmxbgv47KfdmU8W9JHmUCHS4q7o+gpGB1URmoERfX942uvV8Q4EdPf6DXcaGVFYMx9hw6UdzqiqEWoxb+y6/BFkodjfmN4D+f6TRfgL8VMf0SVdIv8vIBYpZgzYG1a7f7plefVwX4P+2R/r/EMat/DdbS0VXbvynuc4/02X8RX+rT+/rNwH+9BP7uBP565EnK4FcGIJLyAY927Nj6QErme8Unf7feb1QpX/5q1qDTEjZq5dFg3XwYcKF4+SFUTPTR6/yeTn4z8KuMfyQagA+Tk9sfSsv6rwD/W116rSBCBza33sQS6y9mwx7cH5R59o/sUCAMlfa+ai8x/P6P6vwbKNuvwB+tIcBX6em9D6dlHQTwf0rN+v2Fjkk38N+nkAG4iL74Wax8xQX8dNgZjFBAgTEEXX3tR37BdG4/6L1PEMnnISr1TaGEnwJ/NIF/T0rmoMPpWb8C/EdSMg8Vtu+IdU3o3cYAh5Fcx3O9kuI+jHa6nfWY9CIrMCcMKY3Qfv4GXd5nelffP5jO7UdP/2Ii+VxLpb5sKeGnwB/p4P8+LTOfg/8vgP9QSubuBS3bYGZbZ6ZvbBnMdQxZfXgBE5jO875COw1a9lvGch/9RQEownXAXUelTD9aercwvbHnAabTe68mkk9/Bf7oKf2V/z49cxoHvxbvH0jO+GhS/cbJROQA+GEIMMMtl3QM/X8O/czTDEL1xrewofd/qYAUoS5/0sRvpHgfyT7M8UNLL5Z3zuB6OX3/GXRvtKRSnwJ/JJN/9qdlLjjUh7r5kvu+3qdmzc7k9qcT+AH0URQCDCXrn04kIWgf8hDyNY8g+Zp/sILNfylgRYiOXvU7TfGBy/+cFO8j038X1+lcJ9L3j+8a9N7mRPJJVOCPUPAvadOm0v70zGUC/P/r1fe5RhUrtiPXDl/0IDrhR5ERGEi/T2J6gweYgmjxFM/pR4+dyBp1u5eNXPGDAliYa5+bfmDxVbdLLj8GeWBl94MU7yPheyElgJEIbs90bj/ovUaGnwJ/JIH/QHrWUwD/T2lZrq96pa8hQBvBD6s/nLK9qeT6AfSNKfaDC9iAToQOdJMM1ZKF5Stdz9Kmvqq8gTA99Vv0/VjK8j8lufwY5HEbxfsF9N1jew8292LTtGjsUeCPRAPwclJS9QN9+r/0I4H/yx5pS5jer42TvS+BP4fAP4xc/hRy/RD3oU+gBrl/CRQD1qKbAzdJD7ppCrSKQcNu97PhS75TwAsTTZ+6j5/6r0qnPjb2rKMSH1z+GynBO5rifYSDrcnQgyIOlmgFBf4IBP97vXtXO9gn618C/J8kpcwl8PcgF36wlNwbwv
SWTsT559Ipj1O/Gt0EFelGqEgnQnUyDq2k0uEoIg9NY13HP8NyHz2hQBgiHfrAL1J5D7H+k9Kpfz/T9/VNkVx+JIA70vcukn2in1919UWi6486P8B/JB28/tTZdGKLk1+AP4fAn0GuH1z7ZpT0kTO+5UnPYiVNRFXpcc2kkACvBcbYNSyx3h2s703vKt5AkOv67Ud+zsrFbZMy/Fuoti9O/Zsoy58nufw4GJpQvJ9o+N4V+COx1r8vLeNSxP2f9UyfSe58FynbP5x0EBmEHpTga+rmJiguJdLvK1FYUFMKCWBgsqh8OFHzBup3WMIGzPtCATSAmr/pDEu6dDeLr/aK5O4/Thn+5RTro7Z/LZ36OZT/6UJeXAMK9VS8HyUG4Kzv0jOu+TKlzy0EzrZk6TPp9B8qZfq7E3ibUHyfwNx3dskzBSoYQoKWlD9IJW/gPKazCW9mjXquYgPv+VoB1mHgJ0/+H0uo+5oEfJT20MSD7byLmE7nBcV7EtexdOr3oiy/7O3Fs9KDPBT4I9kAFHbuLGayN6Evuxe5+gMo2ZdGlYCzfcj4mnkDieQ9NCGD0p3+HhKMF9HpM5M17rGWDbx7twJwQICPOH8VlfbuYjqF+yoKy4aRB9iZEn0N6TtPUC5/lMX/rKTPvwaV8doSIFMI+L3pRmhDN4Jw/7zN+MregJwgRAKxBeUGepPRGUthAZZHzGaNktaxrNu/YAVFfytQ29TcdX9qrn5C3e0WwF9MwIfnN5kM70jy/JLoPmhKp341leWPYg+AlUz7aUBxXgeK+TqTR9DSEPv56v4ZvYF48gZEubA1K+kzGEIlw8vILZ3NqjVexlKmfMBv7pMK5BY6bPFx1n7kZyyukkjuiRh/owR8bOS9lTyti6m0ly1VdlpIZd0qFjkeJVFkACpKbnkjugFaEfCbULlHrvX6eyMYvYHKdMrUkcICQT6COzqOstG6IShfYSFrn/M8Gzz/O42vHuugz9t4mqVP38cadnmP6Qs4n6OsPmr5GyjGl098AH8CJWDhcSVTPqY1ff+16H6IV6d+DFQB6EsW5bpa5JYLRl+tANV6zXIDcljQjKoNohwpDMFlFBqgYnEvq9F0Netx6TssZ+nRmAI9DN+AeYdZ26Gf0mm/VXLzN3N9lOsKSu7Jrv7FEvBTpPCuscHdF/kdderHQB7gLMkICBZfVToFKgc48WP1HmqSG9rcYAhQlcinHMEUurHnMZBWap9dyHpdsYMNe/BQ1Cb0suYeYu1H7WSVa70igR7knS102sPNX0rlPBC6UMu/hmJ8K+DXJeCLxK5y92PUCMTRDSBrMFzAchaGINHEEHSjDDV4Cbl0YyN7faMWHuhtqotZ1YbrWce8N1jW7D1szNpTEQv6nGUnWOp1/2Mt+/2bn/SCrSdOegF69OYvYzpz7y7yjqYxfS7DBUxnXmZLrr4M/OqGOF+5+zFsBASD7yzJ/QvmSWB8H2aGoBnlCLrQSYYyJSjK4BFMIq9gBp1+Cyj2XcFqt32SdR73Lsuc9S0b+civYevWD130C0u9fi87e9DHLKHeKxLgn6JknnDvV0mgR1IP5J3pdNpPIC8JYVMGJfc6UowvXH13wFfgj2EjYKahNkiyIahBJ1cTuqGRte7JSmjLcHPHU9LwOjIGc5i+k/4Bco9Xsso1C1mLvv9kXS/cwfpM/5oNWXSE5T76RxA7705pMXzKtXvYuaP+wxp0fIuf8KJc9zS59UjibWI6S28NxfRo0lpIYc9MCfQTKUcygtx8VFK6UxWnFSX36kiuvgK+kojzTIQhEMlCVC0aUtWiLXkFcHMzKVeQS8ZgEmW+ARa0tN6hJRB1g/AQAWu1drLG1yxi9Tq9wFpnv8k65e1gPS//jPW96VvWb+b/2OAFh9ng+YdZztJf2KiVv5nqsEXH2KD5R9igew7z533HUqZ8w3pM2sXaj/yYNUt9h9Vq8yqLq4zY/RkJ6DjZiwjscOnX0Qm/nAB/PxmwOZTzmEphzw
QCPer3gq3Zk077s8lbakCJXAV8JVFlCCpRgrIqhQf1yCtoRadeN8kYDKY4eBxlwS+XDMKtBKy7KGR4gED3MNM74mAY1pLbDXAWkhu+hUAr9DETFf+2hZ4DgG+UQL6GgL6C3PnFBPb5dMLfToCfRqc8DBn4+WjOyaGTXoC+E4VGzaXTvoaUyK2ogK8kmsIUkbSUKwfV6bSrbzAG8Ax6UfKwP3kHMAgFlEQEsK6mkGE6gW4W5RHm0em7gMC5iID6EIUTy8hYWOkyetwSet4icuEX0OvOIwMEdx503BvIOF1Jbv14AvxIMmQwaOifSKKTXgZ9XTKGVaXTvgIrW85TwFcSNYZA9goEqchoDBpTmAC3GCzHrmQQ0ghQqCgMp/wBjMIF5F5PIiBOJuOAk/hGyivcSuHETNJZJjqTHnMrGZWbycBMpWQlTvUr6O9cTH83j4zTMHLrMwjw6MIUzMzW5N43pJPeDPRxTNXxlShjoBmDagSSOhQTNyGD0IYA1YkSZsIo9CNPYRABcQQZhzwKI86nk/kiAu4EN3oRue0A93lkYNDrMJpeV3RbZlEiM5Vc+q50wrclwItTvh7lPqrbBL0CvpKYMwRmxqCSiUGoTYACsJpKRqEteQqdCIhJZBxEg1RfMhKZBFx3mkmneD+K11MpN9GTXrcr/Z0O5M4LsDehE14AvgYBPoGVnrykQK9EiU1jYGUQqhqMQl0KHRpR+NCMlfRHtCGgwki0Iw/CnbYjbUshCADekkDelF6/If29OhS21KD3k0gnfDy934qGmF6BXokSPw1CnMEoxEuGIZGMQ3UCZS0yEHXISNSTtL6Jyj0Vdem5tcjQ1JDc+ETpZDeCPc4N4BXolShxyCAYWZAVJMNgNA5Cq5AmWGgVSSsbAF6JlaZbx7GyDEwFeCVKQmAQjIbBzEAI78GOnmWi7kCuAK9ESRgbhkCoEiVKlChRokSJEiVKlChRokSJEiVKlChRokSJEiVKlChRokSJEiVKlChRokSJEiVKlChRokSJEiVKlChRokSJEiVKlChRokSJEiVKlChRokSJEiVKlChRokSJEiVKlChRokSJEiVKlChRokSJEgv5f5QbBX6f1PscAAAAAElFTkSuQmCC"}
The last property (logo) is a picture (byte[]).
Given a string of printable characters, you want to determine whether that string has been Base64 encoded or not. In general, this is not possible. You can only determine whether a string may have been Base64 encoded:
How to check whether the string is base64 encoded or not
It is always possible for a user to construct a string that looks as if it was Base64 encoded but in fact was not.
If you have control over the data source, you should add an extra metadata field to indicate whether a field is Base64 encoded or not.
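As a rough illustration of why such a check can only ever answer "maybe", here is a minimal C# sketch, written as C# 9 top-level statements. LooksLikeBase64 is a hypothetical helper name, not part of any library; it relies only on Convert.FromBase64String throwing FormatException for invalid input.

```csharp
using System;

// Hypothetical helper: returns true when the text *could* be Base64.
// It cannot prove that the text was actually produced by a Base64 encoder.
static bool LooksLikeBase64(string text)
{
    if (string.IsNullOrEmpty(text))
    {
        return false;
    }

    try
    {
        Convert.FromBase64String(text);
        return true;
    }
    catch (FormatException)
    {
        return false;
    }
}

Console.WriteLine(LooksLikeBase64("Mario")); // False: 5 characters is not a valid Base64 length
Console.WriteLine(LooksLikeBase64("Tony"));  // True: an ordinary name that happens to be valid Base64
```

The second call is the false positive the answer warns about: "Tony" is plain text from the sample JSON, yet it decodes cleanly, which is why an explicit metadata field is the more reliable option.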
Related
I'm trying to deserialize json into a C# object. The json basically looks like this:
{ "hexValue": "0x9a7f" }
My POCO looks like this:
public class HexTest
{
    public int hexValue;
}
I've read in a link from this question that Newtonsoft supports deserializing hex values. But in all fairness, those release notes were published a decade ago. I've even read, in some source code on GitHub published here, what appears to be code to deserialize a hex-formatted string that starts with "0x". Yet, when I try to deserialize a hex value, I always get the following exception:
Could not convert string to int: 0x9a7f.
It doesn't matter what type I try. I've tried int, long, decimal, Decimal, etc. From reading the source, it looked like the Decimal type should have worked, but nothing works. Does Newtonsoft really have support for converting hex values defined as strings into a numeric data type of some kind?
Sure, I know I can use the information in the question I linked to above to implement custom support for it but I'd really rather use the built-in support if it's there.
Thanks to the comments on my original question above by Fildor, I was able to resolve the problem by removing the quotes around the value in the JSON, so it now reads like this:
{ "hexValue": 0x9a7f }
Also, further testing reveals that any of the numeric data types work for this in the POCO including int, long, and decimal. It is probably worth noting that (not sure about the latest standard) most if not all JSON validators will consider this invalid JSON because hexadecimal is not a valid JSON numeric data type.
Taking another look at the source, it's clear why this works and not the string. The parser will only call the method that detects the 0x prefix if it recognizes the json value token as a numeric type which, if quoted, it cannot do because by definition, that is a string.
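If the quotes have to stay (for instance, to keep the document valid JSON), the value can still be converted after it has been read as a string. The following is a minimal sketch under that assumption, written as C# 9 top-level statements. ParseHexString is a hypothetical helper name; the only library call is Convert.ToInt32(string, int), which accepts an optional 0x prefix when the base is 16.

```csharp
using System;

// Hypothetical helper: converts a quoted hex value such as "0x9a7f" to an int.
static int ParseHexString(string hex)
{
    // With base 16, Convert.ToInt32 accepts an optional "0x"/"0X" prefix.
    return Convert.ToInt32(hex, 16);
}

Console.WriteLine(ParseHexString("0x9a7f")); // 39551
```

A helper like this could be called from a custom converter or from a string-typed property on the POCO; which of those fits best depends on the surrounding code.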
I am serializing a JSON object to CBOR in C++ with the nlohmann::json library, and my use case involves reading the CBOR byte string output in C#. I've noticed that when dumping a JSON object to a string in C++ with nlohmann::json, JSON string values (i.e., case value_t::string) are escaped (a call to escape_string is made), whereas no such call is made for string values when serializing to CBOR.
I was reading the CBOR RFC 7049 and it seems that strings do not need to be escaped when serializing to CBOR.
The behavior in the nlohmann::json library is consistent: strings are not escaped when serializing, nor expected to be escaped when deserializing.
But it appears that Newtonsoft.Json (a C# library) expects that. Is it a valid expectation? Or am I doing something wrong in the process?
C++ side:
nlohmann::json json_doc;
json_doc["characters"] = nlohmann::json::array();
for (int i = 0; i < characters.size(); i++) {
    json_doc["characters"][i]["name"] = (characters[i] != nullptr) ? characters[i]->name() : "";
}
std::vector<uint8_t> cbor = nlohmann::json::to_cbor(json_doc);
output->assign((char*)&cbor[0], cbor.size());
C# side (cbor_bytes is the CBOR byte string, i.e. the C++ output vector):
CBORObject cbor = CBORObject.DecodeFromBytes(cbor_bytes);
output = cbor.ToString();
The resulting output string is malformed:
{"characters": [{"name": "Clara Oswald"}, {"name": "Kensi Blye"}, {"name": "Temperance "Bones" Brennan"}]}
and obviously cannot be parsed:
JObject output_obj = JObject.Parse(output);
CBOR (Concise Binary Object Representation) is not JSON (JavaScript Object Notation). Although CBOR may have borrowed some concepts from JSON, it is clearly a different format with different rules and goals. CBOR is a binary format; JSON is text. In CBOR, strings have length prefixes, whereas they do not in JSON. Furthermore, CBOR does not allow arbitrary whitespace between elements (it wouldn't make sense for a binary format), whereas JSON does (for human readability). Ultimately, CBOR does not need a mechanism to escape strings because it does not require delimiters to tell where a string starts and ends. JSON, on the other hand, requires double quotes to mark the beginning and end of each string. As a consequence, quotes and control characters within strings must be escaped with backslashes in JSON, as well as literal backslashes themselves. There is no getting around this rule if you want to ensure the JSON will be parsable.
In your code above you are using the CBORObject.ToString() method to turn the object into a string. If this CBORObject is from a third-party library, does the documentation state that ToString() will produce valid JSON? If so, then it definitely has a bug; it should be doing the proper escaping as required by the JSON spec. If there is no such promise of valid JSON, then you can't expect that Json.Net will be able to parse the string, even if it sort of looks like JSON. (You might check to see whether the CBORObject has some other dedicated method like ToJson() for performing this conversion.) If CBORObject is your own code, then it is on you to escape the strings properly when converting from CBOR to JSON.
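To make the escaping rule concrete, here is a minimal C# sketch (C# 9 top-level statements) of what a CBOR-to-JSON conversion has to do for each string. EscapeJsonString is a hypothetical helper written for illustration only; in practice a JSON serializer would normally do this for you.

```csharp
using System;
using System.Text;

// Hypothetical helper: escapes a raw string so it can sit between JSON double quotes.
static string EscapeJsonString(string raw)
{
    var sb = new StringBuilder(raw.Length + 8);
    foreach (char c in raw)
    {
        switch (c)
        {
            case '"':  sb.Append("\\\""); break;
            case '\\': sb.Append("\\\\"); break;
            case '\n': sb.Append("\\n"); break;
            case '\r': sb.Append("\\r"); break;
            case '\t': sb.Append("\\t"); break;
            default:
                if (c < 0x20)
                {
                    // Remaining control characters use the \uXXXX form.
                    sb.Append("\\u").Append(((int)c).ToString("x4"));
                }
                else
                {
                    sb.Append(c);
                }
                break;
        }
    }
    return sb.ToString();
}

string name = "Temperance \"Bones\" Brennan";
Console.WriteLine("{\"name\": \"" + EscapeJsonString(name) + "\"}");
// prints {"name": "Temperance \"Bones\" Brennan"}
```

With the inner quotes escaped this way, the text from the question becomes parsable JSON.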
I have a large amount of data that includes tables, fonts, bold text, sizes, etc. The data is stored as byte[] in a database.
When I retrieve the data, I need to convert the byte[] into a string because I need to do some find and replace on it. After I convert the string back into byte[], I lose the original data structure: tables, fonts, bold text, etc. no longer display properly. How can I find and replace within the byte[] by converting it to a string, while keeping the data in its original format?
The short answer is: don't. Figure out the format of the data and see what you can do to manipulate it. If the data is actually text, just stored as byte[], your approach would work, provided you encode the string correctly (i.e. if your DB expects UTF-8, use UTF-8 encoding; if it's Windows-1251, use that).
If you have a structure where a part of it is a string, what you're doing can't really work well. First, you probably want to modify just the relevant parts of the field. On MS SQL, you have handy functions for that. But even then, you should know what's actually stored there, not just assume that a string replace will magically work.
Now, a hack could be to use an explicit encoding that doesn't break the non-string data. That would be some single-byte encoding that doesn't do anything fancy. This is OK as long as you use the same encoding while reading the text data. However, if you use any variant of Unicode, you're out of luck; due to features like string normalization, you can't really guarantee that what comes in comes out the same way, byte for byte. It's generally a bad practice anyway.
Don't forget that it's quite possible the string you are looking for is actually somewhere outside of the text fields - even by pure chance, it can happen, and certain practices make that even more likely.
Again: figure out the data format inside that data field - then you can decide how to do what you want.
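The difference between a single-byte encoding and a Unicode encoding in the hack described above can be seen with a small round-trip test. This is a minimal sketch in C# 9 top-level statements; ISO-8859-1 is picked here as one example of a single-byte encoding that "doesn't do anything fancy", which is an assumption of this sketch rather than something the answer prescribes.

```csharp
using System;
using System.Linq;
using System.Text;

// Every possible byte value, standing in for arbitrary binary data.
byte[] original = Enumerable.Range(0, 256).Select(i => (byte)i).ToArray();

// ISO-8859-1 maps each byte to exactly one char, so the round trip is lossless.
Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");
byte[] viaLatin1 = latin1.GetBytes(latin1.GetString(original));

// UTF-8 replaces invalid byte sequences with U+FFFD, so the bytes change.
byte[] viaUtf8 = Encoding.UTF8.GetBytes(Encoding.UTF8.GetString(original));

Console.WriteLine(original.SequenceEqual(viaLatin1)); // True
Console.WriteLine(original.SequenceEqual(viaUtf8));   // False
```

Even when the round trip itself is lossless, a replacement that changes the length of the text may still invalidate offsets or length fields elsewhere in the binary structure, which is one more reason to work out the actual data format first.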
Try this:
string result = System.Text.Encoding.UTF8.GetString(byteArray);
To convert a byte[] to a string:
byte[] byteArray = new byte[10]; // put your byte array here
string stringTemp;

public void ByteToString()
{
    // Builds a hex representation such as "0A1B2C"; byteArray itself is not modified.
    stringTemp = BitConverter.ToString(byteArray).Replace("-", "");
}
And your data is still in byteArray. :)
If the byte array contains binary data rather than text, try converting it to Base64:
Convert.ToBase64String(yourByteArray);
I have some text that is a property of an object. The object gets XML-serialized, and after that there is an element in the XML called Text that represents the text from the object. I am wondering how to turn it back into a string.
THE TYPE OF SERIALIZATION: XmlSerializer serializer = new XmlSerializer(typeof(Act));
THE PROPERTY IN THE CLASS:
[System.Runtime.Serialization.OptionalFieldAttribute()]
private byte[] ActTextField;
In the XML file it looks something like this:
0M8R4KGxGuEAAAAAAAAAAAAAAAAAAAAAPgADAP7/CQAGAAAAAAAAAAAAAAABAAAALQAAAAAAAAAAEAAALwAAAAEAAAD+////AAAAACwAAAD////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////spcEAJ2AJBAAA8BK/AAAAAAAAEAAAAAAABgAAYB4AAA4AYmpiavbg9uAAAAAAAAAAAAAAAAAAAAAAAAACBBYALiIAAJSKAQCUigEAzwYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD//w8AAAAAAAAAAAD//w8AAAAAAAAAAAD//w8AAAAAAAAAAAAAAAAAAAAAAKQAAAAAANADAAAAAAAA0AMAANADAAAAAAAA0AMAAAAAAADQAwAAAAAAANADAAAAAAAA0AMAABQAAAAAAAAAAAAAAOQDAAAAAAAArAgAAAAAAACsCAAAAAAAAKwIAAAAAAAArAgAABQAAADACAAAFAAAAOQDAAAAAAAA/Q4AALYAAADgCAAAAAAAAOAIAAAAAAAA4AgAAAAAAADgCAAAAAAAAOAIAAAAAAAA4AgAAAAAAADgCAAAAAAAAOAIAAAAAAAAWA4AAAIAAABaDgAAAAAAAFoOAAAAAAAAWg4AAAAAAABaDgAAAAAAAFoOAAAAAAAAWg4AACQAAACzDwAAaAIAABsSAACSAAAAfg4AADkAAAAAAAAAAAAAAAAAAAAAAAAA0AMAAAAAAADgCAAAAAAAAAAAAAAAAAAAAAAAAAAAAADgCAAAAAAAAOAIAAAAAAAA4AgAAAAAAADgCAAAAAAAAH4OAAAAAAAAAAAAAAAAAADQAwAAAAAAANADAAAAAAAA4AgAAAAAAAAAAAAAAAAAAOAIAAAAAAAAtw4AABYAAAAkDgAAAAAAACQOAAAAAAAAJA4AAAAAAADgCAAAagMAANADAAAAAAAA4AgAAAAAAADQAwAAAAAAAOAIAAAAAAAAWA4AAAAAAAAAAAAAAAAAACQOAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA4AgAAAAAAABYDgAAAAAAAAAAAAAAAAAAJA4AAAAAAAAAAAAAAAAAACQOAAAAAAAA0AMAAAAAAADQAwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAJA4AAAAAAADgCAAAAAAAANQIAAAMAAAAUCGbpyopzgEAAAAAAAAAAKwIAAAAAAAASgwAANAAAAAkDgAAAAAAAAAAAAAAAAAAWA4AAAAAAADNDgAAMAAAAP0OAAAAAAAAJA4AAAAAAACtEgAAAAAAABoNAAD0AAAArRIAAAAAAAAkDgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAK0SAAAAAAAAAAAAAAAAAADQAwAAAAAAACQOAAA0AAAA4AgAAAAAAADgCAAAAAAAACQOAAAAAAAA4AgAAAAAAADgCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA4AgAAAAAAADgCAAAAAAAAOAIAAAAAAAAfg4AAAAAAAB+DgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADg4AABYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAOAIAAAAAAAA4AgAAAAAAADgCAAAAAAAAP0OAAAAAAAA4AgAAAAAAADgCAAAAAAAAOAIAAAAAAAA4AgAAAAAAAAAAAAAAAAAAOQDAAAAAAAA5AMAAAAAAADkAwAAJAMAAAgHAACkAQAA5AMAAAAAAADkAwAAAAAAAOQDAAAAAAAACAcAAAAAAADkAwAAAAAAAOQDAAAAAAAA5AMAAAAAAADQAwAAAAAAANADAAAAAAAA0AMAAAAAAADQAwAAAAAAANADAAAAAAAA0AMAAAAAAAD/////AAAAAAIADAEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAB4EIAAgAB8EIAAgACAEIAAgABUEIAAgABQEIAAgABUEIAAgABsEIAAgABUEIAAgAB0EIAAgABgEIAAgABUEIAANABUELgAgAEAEMAQ5BD4EPQQ1BD0EIABBBEoENAQgADIEIAA3BDAEOgRABDgEQgQ+BCAAQQRKBDQENQQxBD0EPgQgADcEMARBBDUENAQwBD0EOAQ1BCAAPQQwBCAANAQyBDAENAQ1BEEENQRCB
I cannot even guess what its encoding is or how to decode it. I tried to read it into a byte array, but it didn't actually work after applying a few decodings (Encoding.UTF8, Encoding.ASCII).
That looks like Base64 to me - just use
byte[] data = Convert.FromBase64String(base64Text);
It's odd that it's using base64 at all if this is really a text property though. I'd expect just the text.
To convert that binary data back to text you would need to know which encoding was used to convert it to the binary data to start with - and UTF-8 is the most likely - but all the repeated AAAAA... parts in there make this look pretty unlike text, to be honest.
EDIT: Now that we've seen the field declaration, we can see that it was a byte[] to start with, so that makes sense for it to be encoded in this way. Judging by comments, it sounds like it's actually a Word file - at which point extracting the text is a very separate problem.
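A quick way to check that guess is to decode just the start of the element content and look at the leading bytes. This is a minimal sketch in C# 9 top-level statements; the 12-character prefix is copied from the XML shown in the question, and the variable names are only illustrative.

```csharp
using System;

// First 12 characters of the XML element content shown in the question.
string prefix = "0M8R4KGxGuEA";
byte[] header = Convert.FromBase64String(prefix);

Console.WriteLine(BitConverter.ToString(header));
// prints D0-CF-11-E0-A1-B1-1A-E1-00
```

The first eight bytes, D0 CF 11 E0 A1 B1 1A E1, match the signature of the OLE2 compound file format used by legacy .doc files, which is consistent with the data being a Word document rather than encoded text.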
I'm using .NET 4.5 and I'm trying to parse a URI query string into a NameValueCollection. The right way seems to be to use HttpUtility.ParseQueryString(string query), which takes the string obtained from Uri.Query and returns a NameValueCollection. Uri.Query returns a string that is escaped according to RFC 2396, and HttpUtility.ParseQueryString(string query) expects a string that is URL-encoded. Assuming RFC 2396 and URL-encoding are the same thing, this should work fine.
However, the documentation for ParseQueryString claims that it "uses UTF8 format to parse the query string". There is also an overloaded method which takes a System.Text.Encoding and then uses that instead of UTF8.
My question is: what does it mean to use UTF8 as the encoding? The input is a string, which by definition (in C#) is UTF-16. How is that interpreted as UTF-8? What is the difference between using UTF8 and UTF16 as the encoding in this case? My concern is that since I'm accepting arbitrary user input, there might be some security risk if I botch the encoding (i.e. the user might be able to slip through some script exploit).
There is a previous question on this topic (How to parse a query string into a NameValueCollection in .NET) but it doesn't specifically address the encoding problem.
When parsing encoded values, it treats those values as UTF-8. Take the character ¢, for example. The UTF-8 encoding is C2 A2. So if it were in a query string, it would be encoded as %C2%A2.
Now, when ParseQueryString is decoding, it needs to know what encoding to use. The default is UTF-8, meaning that the character would be decoded correctly. But perhaps the user was using Microsoft's Cyrillic code page (Windows-1251), where C2 and A2 are two different characters. In that case, interpreting it as UTF-8 would be an error.
If this is a user interface application (i.e. the user is entering data directly), then you probably want to use whatever encoding is defined for the current UI culture. If you're getting this information from Web pages, then you'll want to use whatever encoding the page uses. And if you're writing a Web service then you can tell the users that their input has to be UTF-8 encoded.
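The effect of the encoding argument can be seen by decoding the same query string twice. This is a minimal sketch in C# 9 top-level statements, using the two HttpUtility.ParseQueryString overloads mentioned in the question (on .NET Framework this needs a reference to System.Web). ISO-8859-1 is used as a stand-in for a non-UTF-8 code page; that choice is an assumption of this sketch, and Windows-1251 would yield two different characters by the same mechanism.

```csharp
using System;
using System.Collections.Specialized;
using System.Text;
using System.Web;

string query = "?c=%C2%A2";

// Default overload: percent-encoded bytes are decoded as UTF-8.
NameValueCollection asUtf8 = HttpUtility.ParseQueryString(query);

// Explicit encoding: the same two bytes become two separate characters.
NameValueCollection asLatin1 =
    HttpUtility.ParseQueryString(query, Encoding.GetEncoding("ISO-8859-1"));

Console.WriteLine(asUtf8["c"].Length);   // 1 (the single character U+00A2)
Console.WriteLine(asLatin1["c"].Length); // 2 (U+00C2 followed by U+00A2)
```

So the encoding parameter does not describe the .NET string itself (which is always UTF-16 in memory); it describes how the percent-encoded byte values inside it are turned back into characters.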