Some text fields in my database have bad control characters embedded. I only noticed this when trying to serialize an object and get an xml error on char and . There are probably others.
How do I replace them using C#? I thought something like this would work:
text.Replace('\x2', ' ');
but it doesn't.
Any help appreciated.
Strings are immutable - you need to reassign:
text = text.Replace('\x2', ' ');
exactly as was said above, strings are immutable in C#. This means that the statement:
text.Replace('\x2', ' ');
returned the string you wanted,but didn't change the string you gave it. Since you didn't assign the return value anywhere, it was lost. That's why the statement above should fix the problem:
text = text.Replace('\x2', ' ');
If you have a string that you are frequently making changes to, you might look at the StringBuilder object, which works very much like regular strings, but they are mutable, and therefore much more efficient in some situatations.
Good luck!
-Craig
The larger problem you're dealing with is the XmlSerialization round trip problem. You start with a string, you serialize it to xml, and then you deserialize the xml to a string. One expects that this always results in a string that is equivalent to the first string, but if the string contains control characters, the deserialization throws an exception.
You can fix that by passing an XmlTextReader instead of a StreamReader to the Deserialize method. Set the XmlTextReader's Normalization property to false.
You should also be able to solve this problem by serializing the string as CDATA; see How do you serialize a string as CDATA using XmlSerializer? for more information.
Related
Isn't a string already a character array in c#? Why is there a explicit ToCharacterArray function? I stumbled upon this when I was looking upon ways to reverse a string and saw a few answers converting the string to a character array first before proceeding with the loop to reverse the string. I am a beginner in coding.
Sorry if this seems stupid, but I didn't get the answer by googling.
Isn't a string already a character array in c# ?
The underlying implementation is, yes.
But you are not allowed to directly access that. String is using encapsulation to be an immutable object.
The actual array is private and hidden from view. You can use an indexer (property) to read characters but you cannot change them. The indexer is read only.
So yes, you do need ToCharacterArray() for reversing and similar actions. Note that you always end up with a different string, you cannot alter the original.
Isn't a string already a character array in c# ?
No, a string is a CLASS that encapsulates a "sequential collection of characters" (see Docs). Notice it doesn't explicitly say an "Array of Char". Now, it may be true that the string class currently uses a character array to accomplish this, but that doesn't mean it ~must~ use a character array to achieve that end. This is a fundamental concept of Object Oriented Programming that combines information hiding and the idea of a "black box" that does something. It doesn't matter how the black box (class) accomplishes its task under the hood, as long is it doesn't change the public interface presented to the end user. Perhaps, in the next version of .Net, some new-fangled magical structure that is not an array of characters will be used to implement the string class. The end user may not be aware that this change has even occurred because they can still use the string class in the same way, and if they so desire, could still output the characters to an array with ToCharArray()...even though internally the string is no longer an array of characters.
Yes String type is a character array but string array is not an character array you must have to convert each string in your array in char type so that you can easily reverse its indexes and then convert it into temporary string and then add that string to array to be reversed
Internally, the text is stored as a sequential read-only collection of Char objects.
See Programming Guide Docs
Console.WriteLine(StringHelper.ReverseString("framework"));
Console.WriteLine(StringHelper.ReverseString("samuel"));
Console.WriteLine(StringHelper.ReverseString("example string"));
OR
public static string ReverseString(string s)
{
char[] arr = s.ToCharArray();
Array.Reverse(arr);
return new string(arr);
}
I have a string where I need to use as the body of a JSON object. I know its possible that the data could have quotes in it, so I parse through to add an escape character to those instance of quotes.. like so:
string NewComment = comment.Replace("\"", "\\\"");
However, somehow on some edgecases, a quote still makes it through. I don't know if this is something with UTF or some other issue, But I am trying to find a function that would safely create a json compatible string, I figured there has to be something like this out there, or a regex way of doing so.
Basically a TLDR is how to create a json syntax safe string from a c# string
The simple answer is don't do it this way. What if you have escaped quotes in your string? "Hello \"World\"" would become invalid with such a simple approach: "Hello \\"World\\"". JSON.Net or Newtonsoft are going to save you so many headaches in the long run.
Here is my prob, I wanted String.Format() function should take 4 objects and format string. But it throws "Input string not in a correct format error".
Here is my code,
string jsonData = string.Format("{{\"sectionTitle\":\"{0}\",\"strPushMsg\":\"{1}\",\"Language\":\"{2}\",}\",\"articleid\":\"{3}\"}}", urlsectiontitle, formatHeadline, Language, articleid);
\"{2}\",}\"
Looks like you need to escape that closing brace by doubling it:
string.Format("{{\"sectionTitle\":\"{0}\",\"strPushMsg\":\"{1}\",\"Language\":\"{2}\",}}\",\"articleid\":\"{3}\"}}", urlsectiontitle, formatHeadline, Language, articleid);
It appears you are creating JSON. This can use single quotes (which would avoid all the escaping), but even better use a tool like JSON.Net designed to create JSON. While your (partial) structure here is quite small (the unmatched } shows this is only partial), and the JSON gets bigger it is much easier to use a tool to get it right.
Currently my understanding of XML legal strings is that all is required is that you convert any instances of: &, ", ', <, > with & " ' < >
So I made the following parser:
private static string ToXmlCompliantStr(string uriStr)
{
string uriXml = uriStr;
uriXml = uriXml.Replace("&", "&");
uriXml = uriXml.Replace("\"", """);
uriXml = uriXml.Replace("'", "'");
uriXml = uriXml.Replace("<", "<");
uriXml = uriXml.Replace(">", ">");
return uriXml;
}
I am aware that there are similar questions out there with good answers (which is how I was able to write this function) I am writing this question to ask if this code will translate ANY string that C# can throw at it and have XDocument parse it as a part of a whole document without any complaints as all the questions out there that I've found state that these are the only escape characters, not that parsing them will cause 100% valid XML string. I've gone as far as reading through the decompiled XNode class trying to see how that parse it.
Thanks
Firstly, you should absolutely not do this yourself. Use an XML API - that way you can trust that to do the right thing, rather than worrying about covering corner cases etc. You generally shouldn't be trying to come up with an "escaped string" at all - you should pass the string to the XElement constructor (or XAttribute, or whatever your situation is).
In other words, I think you should try really hard to design your application so that you don't need a method of the kind you've shown in your question at all. Look at where you'd be using that method, and see whether you can just create an XElement (or whatever) instead. If you try to treat XML as a data structure in itself rather than just as text, you'll have a much better experience in my experience.
Secondly, you need to understand that in XML 1.0 at least, there are Unicode characters that cannot be validly represented in XML, no matter how much escaping you use. In particular, values U+0000 to U+001F are unrepresentable other than U+0009 (tab), U+000A (line feed) and U+000D (carriage return). Also if you have a string which contains invalid UTF-16 (e.g. an unmatched half of a surrogate pair), that can't be correctly represented in XML.
I am having a few problems with trying to replace backslashes in a date string on C# .net.
So far I am using:
string.Replace(#"\","-")
but it hasnt done the replacement. Could anyone please help?
string.Replace does not modify the string itself but returns a new string, which most likely you are throwing away. Do this instead:
myString= myString.Replace(#"\","-");
On a side note, this kind of operation is usually seen in code that manually mucks around with formatted date strings. Most of the time there is a better way to do what you want (which is?) than things like this.
as all of them saying you need to take value back in the variable.
so it should be
val1= val1.Replace(#"\","-");
Or
val1= val1.Replace("\\","-");
but not only .. below one will not work
val1.Replace(#"\","-");
Use it this way.
oldstring = oldstring.Replace(#"\","-");
Look for String.Replace return type.
Its a function which returns a corrected string. If it would have simply changed old string then it would had a void return type.
You could also use:
myString = myString.Replace('\\', '-'));
but just letting you know, date slashes are usually forward ones /, and not backslashes \.
As suggested by others that String.Replace doesn't update the original string object but it returns a new string instead.
myString= myString.Replace(#"\","-");
It's worthwhile for you to understand that string is immutable in C# basically to make it thread-safe. More details about strings and why they are immutable please see links here and here