Best way to separate two base64 strings - c#

I am using standard input and output to pass two base64 strings from one application to another. What would be the best way to separate them so that I can read them as two separate strings in the other application? I was thinking of using a simple comma to separate them and then just using
string[] s = output.Split(',');
where output is the data I read in from standard output.
Example with the comma:
MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQCv5E5Y0Wrad5FrLjeUsA71Gipl3mhjIuCw1xhj
jDwXN87lIhpE32UvItf+mvp8flQ+fhi5H0PditDCzUFg8lXuiuOXxelLWEXA8hs7jc+4zzR5ps3R
fOv3M6H8K5XGkwWLhVNQX47sAGyY/43JdbfX7+FsYUFeHW/wa2yKSMZS3wIDAQAB
,HNJpFQyeyJoVbeEgkw/WNtzR0JTPIa1hlK1C8LbFcVcJfL33ssq3gbzi0zxn0n2WxBYKJZj2Kqbs
lVrmFbQJRgvq4ZNF4F8z+xjL9RVVE/rk5x243c3Szh05Phzx+IUyXJe6GkITDmsxcwovvzSaGhzU
3qQkNbhIN0fVyynpg0Kfm0WytuW71ku1eq45ibcczgwQLRJX1GKzC9wH7x/V36i6SpyrxZ/+uCIL
4QgnKt6x4QG7Gfk3Msam6h6JTFdzkeHJjq6JzeapdQn5LxeMY0jLGc4cadMCvy/Jdrcg02pG2wOO
/gJT77xvX+d1igi+BQ/YpFlwXI0BIuRwMAeLojmZdRYjJ+LY69auxgpnQvSF4A+Wc6Jo8m1pzzHB
yQvA8KyiRwbyijoBOsg+oK18UPFWeJ5hE3e+8l/WSEcii+oPgXyXTnK+seesGdOPeem3HukNyIps
L/StHZEkzeJFTr8LIB9HLqDikYU2mQjTiK5cIExoyy2Go+0ndL84rCzMZAlfFlffocL9x+SGyeer
M1mxmyDtmiQfDphEZixHOylciKUhWR00dhxkVRQ4Q9LYCeyGfDiewL+rm5se/ePCklWtTGycV9HM
H5vYLhgIkf5W6+XcqcJlE6vp4WWxmKHQYqRAdfW5MYWskx7jBDTMV2MLy7N6gQRQa/OpK8ruAbVf
MwWP1sGyhAxgrw/UxTH1tW498WI5JtQR3oub3+Uj5AqydhwzQtWM58WfVQXdv2bFZmGH7d9A+C95
DQ8QXKrV7Ot/wVq5KKLgpJy8iMe/G/iyXOmQhkLnZ3qvBaIJd+E2ZIVPty6XGMwgC4JebArr+a6V
Cb/SO+vR+eZmXLln/w==

All you have to do is use a separator that is not a valid Base64 character. The comma is not a Base64 character, so you can use it.
The Base64 characters are [0-9a-zA-Z/=+] (all digits, uppercase and lowercase letters, forward slash, plus, and the equals sign used for padding).
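A minimal sketch of the receiving side, assuming the whole payload has already been read from standard input into a string. The class and method names (Base64Pair, Decode) are made up for illustration. Convert.FromBase64String ignores whitespace, so the line breaks inside each block do not need to be removed first.

```csharp
using System;

public static class Base64Pair
{
    // Splits the payload on the comma and decodes both halves.
    // Hypothetical helper: the name and the exception choice are assumptions.
    public static byte[][] Decode(string output)
    {
        string[] parts = output.Split(',');
        if (parts.Length != 2)
            throw new FormatException("Expected exactly two comma-separated Base64 strings.");

        return new[]
        {
            Convert.FromBase64String(parts[0]), // whitespace and newlines are ignored
            Convert.FromBase64String(parts[1])
        };
    }
}
```

If you only need the two strings and not the decoded bytes, output.Split(',') on its own is enough.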

This seems like a good solution. The comma is not part of the Base64 index table, so it is a safe separator.

You can wrap it in some XML. The CDATA element is perfect for that.

Related

How do I replace all special characters with their respective hex codes?

I have an XML file that contains multiple special characters.
I want to replace all the special characters with their respective hex codes.
So & becomes &#x0026; and so on, but only for special characters.
Please help.
You can use HttpUtility.HtmlDecode to decode special characters. More in the official documentation: https://learn.microsoft.com/en-us/dotnet/api/system.web.httputility.htmldecode
But you cannot use this method on the whole XML string, because < and > will be replaced. So you need to apply it only to the text nodes and attribute values.

How to escape variable name when using Roslyn C# Syntax Factory?

So I'm using Roslyn SyntaxFactory to generate C# code.
Is there a way for me to escape variable names when generating a variable name using IdentifierName(string)?
Requirements:
It would be nice if Unicode is supported but I suppose ASCII can suffice
It would be nice if it's reversible
Always same result for same input ("a" is always "a")
Unique result for each input ("a?"->"a_" cannot be same as "a!"->"a_")
Can convert from 1 special character to multiple single ones
The implication from the API docs seems to be that it expects a valid C# identifier here, so Roslyn's not going to provide an escaping mechanism for you. Therefore, it falls to you to define a string transformation such that it achieves what you want.
The way to do this would be to look at how other things already do it. Look at HTML entities, which are always introduced using &. They can always be distinguished easily, and there's a way to encode a literal & as well so that you don't restrict your renderable character set. Or consider how C# strings allow you to include string delimiters and other special characters in the string through the use of \.
You need to pick a character which is valid in C# identifiers to be your 'marker' for a sequence which represents one of the non-identifier characters you want to encode, and a way to allow that character to also be represented. Then make a mapping table for what comes after the marker for each of the encoded characters. If you want to do all of Unicode, the easiest way is probably to just use Unicode codepoint numbers. The resulting identifiers might not be very readable, but maybe that doesn't matter in your use case.
Once you have a suitable system worked out, it should be pretty straightforward to write a string transformation function which implements it.
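One possible scheme along those lines, as a sketch rather than anything Roslyn provides: use _ as the marker and write every character that is not an ASCII letter or digit (including _ itself, and a leading digit) as _ followed by its four-digit hex UTF-16 code unit. The class name IdentifierEscaper is hypothetical. The sketch does not deal with empty input or with results that happen to be C# keywords.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class IdentifierEscaper
{
    // Encodes an arbitrary string using only ASCII letters, digits and '_'.
    public static string Escape(string name)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < name.Length; i++)
        {
            char c = name[i];
            bool isAsciiLetter = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
            bool isAsciiDigit = c >= '0' && c <= '9';

            // A digit is only allowed unescaped after the first position.
            if (isAsciiLetter || (isAsciiDigit && i > 0))
                sb.Append(c);
            else
                sb.Append('_').Append(((int)c).ToString("X4", CultureInfo.InvariantCulture));
        }
        return sb.ToString();
    }

    // Reverses Escape: every '_' is followed by exactly four hex digits.
    public static string Unescape(string identifier)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < identifier.Length; i++)
        {
            if (identifier[i] == '_')
            {
                sb.Append((char)Convert.ToInt32(identifier.Substring(i + 1, 4), 16));
                i += 4; // skip the hex digits just consumed
            }
            else
            {
                sb.Append(identifier[i]);
            }
        }
        return sb.ToString();
    }
}
```

With this scheme "a?" becomes a_003F and "a!" becomes a_0021, so distinct inputs stay distinct and the mapping is reversible. The result would then be passed to IdentifierName(string).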

Regex match a hash that has been split over multiple lines

I want to match a hash that has been word wrapped by an author, and received over multiple lines.
Example:
SHA256: AB76235776BC87DBAB76235776BC87DBAB76235776BC87
DBAB76235776BC87DB
Has been received. My usual regex to match a sha256 hash like this is of course: [0-9A-Fa-f]{64}
But this does not work. I would like to leave the file unmodified while searching for this match, any ideas on how to match the split hash without removing newlines?
I'd like to have a regex that basically says 'look for 64 sequential hexadecimal values, but allow for one or more newlines in the mix, kthx'
Thanks in advance. C# is the language.
Try this:
\b(?:[a-fA-F0-9]\s*){64}\b
It allows any kind of whitespace, not just line separators. If it really has to allow only line separators, you can use this:
\b(?:[a-fA-F0-9][\r\n]*){64}\b
This will also include the line separator following the number, if there is one, and if it's followed by a word character. You can prevent that like this:
\b(?:[a-fA-F0-9][\r\n]*){63}[a-fA-F0-9]\b
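A rough sketch of how the last pattern could be used from C#. The class and method names are made up for the example. The input text is left untouched and only the matched value has its line breaks stripped.

```csharp
using System.Text.RegularExpressions;

public static class WrappedHashFinder
{
    // 64 hex digits, with optional CR/LF characters allowed between digits.
    private static readonly Regex Sha256Pattern =
        new Regex(@"\b(?:[a-fA-F0-9][\r\n]*){63}[a-fA-F0-9]\b");

    // Returns the first hash found with line breaks removed, or null if none.
    public static string FindSha256(string text)
    {
        Match m = Sha256Pattern.Match(text);
        if (!m.Success)
            return null;

        return m.Value.Replace("\r", "").Replace("\n", "");
    }
}
```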
Change your regex to include newline characters:
[0-9A-Fa-f\r\n ]{64,}
You could modify the upper bound to include a restriction on the number of linebreaks.
In this case you need to keep in mind linebreaks can be 2 symbols long, depending on machine culture and OS.
1 linebreak --> 66 chars
2 linebreaks --> 68 chars
Continue as much as you like.
On a side note: while parsing a file, you generally leave the file itself untouched. All your modifications are made to the variables you read the file into. This is why I do not see the point of keeping the linebreaks.

Removing String Escape Codes

My program outputs strings like "Wzyryrff}av{v5~fvzu: Bb``igbuz~+\177Ql\027}C5]{H5LqL{" and the problem is the escape codes (\\\ instead of \, \177 instead of the character, etc.)
I need a way to unescape the string of all escape codes (mainly just the \\\ and octal \027 types). Is there something that already does this?
Thanks
Reference: http://www.tailrecursive.org/postscript/escapes.html
The strings are an encrypted value and I need to decrypt them, but I'm getting the wrong values since the strings are escaped
It sounds more like it's encoded rather than simply escaped (if \177 is really a character). So, try decoding it.
There is nothing built in to do exactly this kind of escaping.
You will need to parse and replace these sequences yourself.
The \xxx octal escapes can be found with a RegEx (\\\d{3}), iterating over the matches will allow you to parse out the octal part and get the replacement character for it (then a simple replace will do).
The others appear to be simple to replace with string.Replace.
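A sketch of that approach, assuming the only escapes that matter are a doubled backslash and backslash plus one to three octal digits. It handles both in a single Regex.Replace pass so that an unescaped backslash is never re-read as the start of an octal escape. The class name EscapeDecoder is hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeDecoder
{
    // A backslash followed by either 1-3 octal digits or another backslash.
    private static readonly Regex EscapePattern = new Regex(@"\\([0-7]{1,3}|\\)");

    public static string Unescape(string input)
    {
        return EscapePattern.Replace(input, m =>
        {
            string body = m.Groups[1].Value;
            if (body == "\\")
                return "\\"; // doubled backslash becomes a single backslash

            // Parse the digits as base 8 and emit the character with that code.
            return ((char)Convert.ToInt32(body, 8)).ToString();
        });
    }
}
```

Other PostScript escapes such as \n or \( are not handled here and would need their own cases.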
If the string is encrypted then you probably need to treat it as binary and not text. You need to know how it is encoded and decode it accordingly. The fact that you can view it as text is incidental.
If you want to replace specific contents you can just use the .Replace() method.
e.g. myInput.Replace(@"\\", @"\")
I am not sure why the "\" is a problem for you. If it is actually an escape code then it should be fine, since \\ represents a single \ in a string.
What is the reason you need to "remove" the escape codes?

Extract decimal separator

I'm importing a csv file in C#, sometimes with '.', sometimes with ',' as decimal separator.
Is there a better way of determining the decimal separator than scanning from the last character backwards to the first appearance of a separator?
Thanks in advance.
Franklin Albricias.
If you know the correct culture in advance (for example, because you know the user that created the file), you can try to parse the provided value using the appropriate CultureInfo or NumberFormatInfo:
Decimal value = Decimal.Parse(input, new CultureInfo("es-ES"));
But if the format is not known in advance, you'll have to check it manually by examining the characters until you find a separator. (And even that approach assumes that you are guaranteed to always have a decimal separator, such that one is written as 1.0 rather than 1.)
You can't just try each expected format one after the other because you may get false positives.
10,000 means something valid but different for both formats.
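To illustrate the false-positive problem, here is a small sketch that parses the same text under both conventions. It builds the two NumberFormatInfo instances by hand instead of relying on a named culture. The class and field names are made up for the example.

```csharp
using System.Globalization;

public static class DecimalSeparatorDemo
{
    // '.' as decimal separator, ',' as group separator.
    public static readonly NumberFormatInfo PointDecimal =
        new NumberFormatInfo { NumberDecimalSeparator = ".", NumberGroupSeparator = "," };

    // ',' as decimal separator, '.' as group separator.
    public static readonly NumberFormatInfo CommaDecimal =
        new NumberFormatInfo { NumberDecimalSeparator = ",", NumberGroupSeparator = "." };

    public static decimal Parse(string input, NumberFormatInfo format)
    {
        // NumberStyles.Number allows both group and decimal separators.
        return decimal.Parse(input, NumberStyles.Number, format);
    }
}
```

Parse("10,000", PointDecimal) gives ten thousand, while Parse("10,000", CommaDecimal) gives ten. Both calls succeed, so "try one format, then the other" cannot tell them apart.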
Why not use both as a separator?
Have a look at NumberFormatInfo
Edit:
For each value try to parse it with one of the separators.
If that fails try to parse it with the other.
This depends on the actual data stored in the csv file and the data separation character (';' or ',' or ' ').
If all data is always in floating point notation you can use a regular expression that checks both cases. You can use "\d+,\d+" to check for values using ',' as the decimal separator or "\d+\.\d+" for values using '.' as the separator.
Under the assumption that the file contains only numbers (no strings or anything else) and there are at least two columns, you can do the following.
Go through the first line and look for a semicolon. If you find one, you have semicolon separated numbers with commas as decimal separator, else comma separated numbers with points as decimal separator.
In all other cases you will have to use a heuristic (and sometimes get the wrong conclusion) or you have to strictly parse the file under both assumptions.
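The first-line check described above could look roughly like this. It only holds under the stated assumptions (numbers only, at least two columns, and either semicolon-separated with decimal commas or comma-separated with decimal points). The names are hypothetical.

```csharp
public static class CsvDecimalSeparator
{
    // Guesses the decimal separator from the first line of the file:
    // a semicolon means ';' separates fields and ',' is the decimal separator,
    // otherwise ',' separates fields and '.' is the decimal separator.
    public static char Guess(string firstLine)
    {
        return firstLine.Contains(";") ? ',' : '.';
    }
}
```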
