Removing String Escape Codes

Removing String Escape Codes - c#

My program outputs strings like "Wzyryrff}av{v5~fvzu: Bb``igbuz~+\177Ql\027}C5]{H5LqL{" and the problem is the escape codes (\\\ instead of \, \177 instead of the character, etc.)
I need a way to unescape the string of all escape codes (mainly just the \\\ and octal \027 types). Is there something that already does this?
Thanks
Reference: http://www.tailrecursive.org/postscript/escapes.html
The strings are an encrypted value and I need to decrypt them, but I'm getting the wrong values since the strings are escaped

It sounds more like it's encoded rather than simply escaped (if \177 is really a character). So, try decoding it.

There is nothing built in to do exactly this kind of escaping.
You will need to parse and replace these sequences yourself.
The \xxx octal escapes can be found with a regex (\\\d{3}). Iterating over the matches lets you parse out the octal part and get the replacement character for it; a simple replace then does the rest.
The others appear to be simple to replace with string.Replace.
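A minimal sketch of that approach, assuming the only escapes in play are \\ and three-digit octal codes such as \177. The class and method names are made up for illustration. It handles both kinds in a single pass, so the second backslash of a \\ pair is never mistaken for the start of an octal escape:

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeCodes
{
    // A backslash followed by either another backslash or three octal digits (000-377).
    private static readonly Regex EscapePattern = new Regex(@"\\(\\|[0-3][0-7]{2})");

    public static string Unescape(string input)
    {
        return EscapePattern.Replace(input, match =>
        {
            string body = match.Groups[1].Value;
            if (body == @"\")
            {
                return @"\"; // an escaped backslash collapses to a single backslash
            }
            // Base 8 parse of the digits, e.g. "177" becomes 127.
            return ((char)Convert.ToInt32(body, 8)).ToString();
        });
    }
}
```

Doing two separate Replace calls instead would risk turning the tail of \\177 into an octal character, which is why the sketch uses one pattern with an alternation.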

If the string is encrypted then you probably need to treat it as binary and not text. You need to know how it is encoded and decode it accordingly. The fact that you can view it as text is incidental.
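As a rough sketch of "treat it as binary": once the escapes have been resolved, each character can be mapped to one byte before it goes to the decryption routine. This assumes every character is in the 0-255 range, which is true for \ooo octal codes but is only a guess about the rest of the data. BinaryText and ToBytes are hypothetical names.

```csharp
using System;

public static class BinaryText
{
    // Maps each character to exactly one byte.
    // Throws if a character does not fit in a single byte.
    public static byte[] ToBytes(string text)
    {
        var bytes = new byte[text.Length];
        for (int i = 0; i < text.Length; i++)
        {
            if (text[i] > 0xFF)
            {
                throw new ArgumentException("Character outside the single-byte range at index " + i);
            }
            bytes[i] = (byte)text[i];
        }
        return bytes;
    }
}
```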

If you want to replace specific contents you can just use the .Replace() method.
i.e. myInput.Replace("\\", @"\")
I am not sure why the "\" is a problem for you. If it is actually an escape code then it should be fine, since \\ represents \ in a string.
What is the reason you need to "remove" the escape codes?

Related

Regex not matching when input string contains an ampersand

I am trying to come up with a regex that starts with a letter followed by only letters, spaces, commas, dots, ampersands, apostrophes and hyphens.
However, the ampersand character is giving me headaches. Whenever it appears in the input string, the regex no longer matches.
I am using the regex in an ASP.net project using C# in the 'Format' property of a TextInput (a custom control created in the project). In it, I am using Regex.IsMatch(Text, Format) to match it.
For example, using this regex:
^[a-zA-Z][a-zA-Z&.,'\- ]*$
The results are:
John' william-david Pass
John, william'david allen--tony-'' Pass
John, william&david Fail
Whenever I put a & in the input string the regex no longer matches, but without it everything works fine.
How can I fix my issue? Why would the ampersand be causing a problem?
Notes:
I've tried to escape the ampersand with ^[a-zA-Z][a-zA-Z\&.,'\- ]*$ but it has the same issue
I've tried to put the ampersand at the beginning or end of the character class, ^[a-zA-Z][&a-zA-Z.,'\- ]*$ or ^[a-zA-Z][a-zA-Z.,'\-\& ]*$, but it also doesn't work

Your problem is somewhere else. The following expression evaluates to true:
Regex.IsMatch(@"John, william&david", @"^[a-zA-Z][a-zA-Z&.,'\- ]*$")
See https://dotnetfiddle.net/WDvQNP

You mentioned in the comments that your problem pertains to C#, so I'll answer your question in that context. If ampersand (&) is truly giving you issues in your character class, you should specify it in an alternate manner.
Luckily, C# supports hex escape sequences, which means that you can specify & as \x26.
For example, instead of:
^[a-zA-Z][a-zA-Z&.,'\- ]*$
use
^[a-zA-Z][a-zA-Z\x26.,'\- ]*$
If that doesn't fix your issue, then your issue is not the &, it's something else.
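A quick way to check this is to run both patterns against the failing input outside the web page. The sketch below uses a made-up helper class; the two patterns are the ones from this thread. One possible "something else" (purely an assumption, nothing in the question confirms it) is that the text arrives HTML-encoded, so the regex sees &amp; and fails on the ';', which is not in the character class.

```csharp
using System.Text.RegularExpressions;

public static class AmpersandCheck
{
    // Pattern from the question, with a literal ampersand.
    public const string Literal = @"^[a-zA-Z][a-zA-Z&.,'\- ]*$";

    // Same pattern with the ampersand written as a hex escape.
    public const string HexEscaped = @"^[a-zA-Z][a-zA-Z\x26.,'\- ]*$";

    public static bool Matches(string text, string pattern)
    {
        return Regex.IsMatch(text, pattern);
    }
}
```

If both patterns match the raw text but the control still rejects it, log the exact value of Text that reaches Regex.IsMatch and compare it with what was typed.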

Path/File name backslash in C#

I'm converting from VB to C#, and in C# I seem not to be able to simply write a path string to the application settings.
D:\Something becomes D:\\Something
I tried also @"D:\Something", but that also doesn't work.
So what is the correct way? Say I want to have two settings; path and filename. How shall I format them, for the purpose of Path.Combine to make this a valid file-path/name for a database, or in other words, to have single backslashes?

Your code is working correctly: when you read a string with doubled backslashes back, they become single backslashes again. This is called escaping. It is designed to let you enter special characters as sequences starting with \. A single backslash becomes special in this scheme, so you need to escape it with another backslash as well.
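To illustrate, here is a small sketch (PathDemo and the file name data.mdb are made up for the example). Both ways of writing the literal produce the same 12-character string with one backslash; the doubled form only exists in source code and in the debugger's display.

```csharp
using System.IO;

public static class PathDemo
{
    // Escaped literal: the compiler turns \\ into a single backslash.
    public const string Escaped = "D:\\Something";

    // Verbatim literal: backslashes are taken as-is.
    public const string Verbatim = @"D:\Something";

    // Joins a folder setting and a file name setting into one path.
    public static string FullPath(string folder, string fileName)
    {
        return Path.Combine(folder, fileName);
    }
}
```

On Windows, PathDemo.FullPath(PathDemo.Verbatim, "data.mdb") should give D:\Something\data.mdb. Printing the value with Console.WriteLine, rather than inspecting it in the debugger, shows the real contents.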

Regex for getting key:value from JSON in C#

What is the pattern for matching a-z, A-Z, 0-9, spaces, and special characters so that it detects a URL?
This is my input string:
{id:1622415796,name:Vincent Dagpin,picture:https://fbcdn-profile-a.akamaihd.net/hprofile-ak-snc4/573992_1622415796_217083925_q.jpg}
This is the pattern so far:
([a-z_]+):[ ]?([\d\s\w]*(,|}))
Expected Result:
id:1622415796
name:Vincent Dagpin
picture:https://fbcdn-profile-a.akamaihd.net/hprofile-ak-snc4/573992_1622415796_217083925_q.jpg
The problem is I can't get the last part, the picture URL.
Any help please.

If this is the only kind of json input you expect and further json parsing is very unlikely, a full json parser would be overkill.
A string split may be all you need, jsonString.Split(',', '{', '}');
The regex for that would be along the lines of [{},]([a-z_]+):[ ]?(.+?)(?=,|})
If you can modify the json string that's being sent, you can key the regex on something else, like double quotes. Here's one I'm using that requires knowing the json key name. new System.Text.RegularExpressions.Regex("(?<=\"" + key + "\"+ *: *\"+).*(?=\")");
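Here is a sketch of the first regex above applied to the input from the question. LooseJson and Parse are hypothetical names. It assumes values never contain a comma or a closing brace, because the lazy (.+?) stops at the first one it sees.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LooseJson
{
    // Group 1 is the key, group 2 is the value up to the next ',' or '}'.
    private static readonly Regex PairPattern =
        new Regex(@"[{},]([a-z_]+):[ ]?(.+?)(?=,|})");

    public static List<KeyValuePair<string, string>> Parse(string input)
    {
        var pairs = new List<KeyValuePair<string, string>>();
        foreach (Match m in PairPattern.Matches(input))
        {
            pairs.Add(new KeyValuePair<string, string>(m.Groups[1].Value, m.Groups[2].Value));
        }
        return pairs;
    }
}
```

The URL survives because the key must be preceded by '{', '}' or ',', so the colon inside https: does not start a new pair.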

I don't think a regex is the right solution. C# already contains the tools you need in JavaScriptSerializer. Check out the answer here to see how.

C# string - creating an unescaped backslash

I am using .NET (C#) code to write to a database that interfaces with a Perl application. When a single quote appears in a string, I need to "escape" it. IOW, the name O'Bannon should convert to O\'Bannon for the database UPDATE. However, all efforts at string manipulation (e.g. .Replace) generate an escape character for the backslash and I end up with O\\'Bannon.
I know it is actually generating the second backslash, because I can read the resulting database field's value (i.e. it is not just the IDE debug value for the string).
How can I get just the single backslash in the output string?

Well I did
"O'Bannon".Replace("'","\\'")
and result is
"O\'Bannon"
Is this what you want?

You can use "\\", which is the escape char followed by a backslash.
See the list of Escape Sequences here: http://msdn.microsoft.com/en-us/library/h21280bw.aspx

Even better, assign the result of the replace to a variable so that you can check it if needed
var RepName = "O'Bannon";
var Repstr = RepName.Replace("'","\\'");

You can also use a verbatim string
s = s.Replace("'", @"\'");
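A small sketch to confirm what these calls actually produce (QuoteEscaper is a made-up name). The escaped and verbatim forms give the same result, and the output contains exactly one backslash: "O'Bannon" has 8 characters and the result has 9.

```csharp
public static class QuoteEscaper
{
    // Puts a single backslash in front of every single quote.
    public static string EscapeSingleQuotes(string value)
    {
        // "\\'" is a two-character string: one backslash and one quote.
        return value.Replace("'", "\\'");
    }
}
```

Checking the Length, or printing the value with Console.WriteLine, avoids being misled by the debugger, which displays strings in escaped form.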

Escape string from file

I have to parse some files that contain some string that has characters in them that I need to escape. To make a short example you can imagine something like this:
var stringFromFile = "This is \\n a test \\u0085";
Console.WriteLine(stringFromFile);
The above results in the output:
This is \n a test \u0085
, but I want the text escaped. How do I do this in C#? The text contains unicode characters too.
To make clear; The above code is just an example. The text contains the \n and unicode \u00xx characters from the file.
Example of the file contents:
Fisika (vanaf Grieks, \u03C6\u03C5\u03C3\u03B9\u03BA\u03CC\u03C2,
\"Natuurlik\", en \u03C6\u03CD\u03C3\u03B9\u03C2, \"Natuur\") is die
wetenskap van die Natuur

Try it using: Regex.Unescape(string)
Should be the right way.
Att.
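A minimal sketch of this suggestion (the wrapper class is hypothetical; Regex.Unescape is the real .NET method). The verbatim string in the usage line stands in for text read from the file, where \n and \u0085 are literal backslash sequences.

```csharp
using System.Text.RegularExpressions;

public static class FileTextUnescaper
{
    // Turns literal sequences such as \n, \" and \u03C6 into the characters they name.
    public static string Unescape(string raw)
    {
        return Regex.Unescape(raw);
    }
}
```

For example, FileTextUnescaper.Unescape(@"This is \n a test \u0085") returns a string with a real newline and the U+0085 character. Keep in mind that Regex.Unescape was written for regex escapes, so as far as I know it throws an ArgumentException on sequences it does not recognize (for instance \q). It is worth testing against the real files.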

Don't use the @ symbol -- this interprets the string as 100% literal. Just take it off and all shall be well.
EDIT
I may have been a bit hasty with my reply. I think what you're asking is: how can I have C# turn the literal string '\n' into a newline, when read from a file (similar question for other escaped literals).
The answer is: you write it yourself. You need to search for "\\n" and convert it to "\n". Keep in mind that in C#, it's the compiler not the language that changes your strings into actual literals, so there's not some library call to do this (actually there could be -- someone look this up, quick).
EDIT
Aha! Eureka! Behold:
http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex.unescape.aspx

Since you are reading the string from a file, \n is not read as a unicode character but rather as two characters \ and n.
I would say you probably need a search and replace function to convert the string "\n" to its unicode character '\n' and so on.

I don't think there's any easy way to do this, because it's the job of the lexical analyzer to parse literals.
I would try generating and compiling a class via CodeDOM with the string inserted there as constant. It's not very fast but it will do all escaping.
