Extracing specific parts of a string - c#

I'm reading a byte[] from memory and then converting that into a string. The start of the string would be something like this "NTDLL.RtlnitializeSListHead\0" where the rest of the string after the \0 would things I don't care about (there would also be more \0 characters in the string.)
I am trying to extract only "NTDLL.RtlnitializeSListHead" part of the string - This name will vary between usages so some sort of pattern would be needed.
What would be the best way to get this part of the string? I'm not going to lie my string manipulation skills are subpar to what they should be which is why I am having trouble with. I have thought about using regex but I was just wondering if there was an easier way to do this.

Just trim the end
var myString = Encoding.UTF8.GetString(MyByteBuffer).TrimEnd((char)0);

Related

Parsing a string to get a specific value

I'm new to C#. I'm parsing for a lot number in a 2D barcode. The actual lot number 'A2351' is hidden in this barcode string "+M727PP011/$$3201001A2351S". I would like to break this barcode up in separate string blocks but the delimiters are not consistent.
The letter prefix in front of the 4 digit lot number can be a 'A', 'P', or a 'D' There is a single letter following the lot number that can be ignored.
string Delimiter = "/$$3";
//barcode format:M###PP###/$$3 ddmmyy lotnumprefix 'A' followed by lotNum
string lotNum= "+M727PP011/$$3201001A2351S";
string[] split = lotNum.Split(new[] {Delimiter}, StringSplitOptions.None);
How do I extract the lot number after the date?
Based on your initial example and then the subsequent edit in which you showed how you are solving this, it sounds like the lot number is always in the same place. It would be cleaner (and more in line with standard C# code) to use a single call to string.Substring(int,int) rather than the two lines you are using which also require pulling in the VB library. You just need to call Substring and give it the starting index and the length.
So this code:
string lotNum = Strings.Right(barcode, 6);
lotNum = lotNum.Remove((lotNum.Length - 1), 1);
Can be done with this single substring call:
string lotNum = barcode.Substring(barcode.Length - 6, 5);
Edit
Just further clarification on why it might be better to use the call to Substring. In C# string objects are immutable. That means that when you make the call to Strings.Right you are getting back a new string object. When you then call lotNum.Remove you do not "remove" a character from the existing string, a new string is allocated with the character(s) removed and is returned to you. So with your code there are two new string allocations when trying to extract the lot number. When you make the call to Substring you will get back a new string, but instead of getting a new string that you immediately then modify and get a second new string, you will only need to allocate one new string to extract the lot number. In the example you have given there probably would not be any noticeable performance/memory issue, but it is something that could potentially lead to trouble if this code was in a tight loop or something like that.
If you're just trying to get the lot number, it's really dependent on the format of the input string (is it a consistent length, are there any reliable prefixes/suffixes relative to the data you're trying to parse that you can reference from, etc). It looks like your data is definable by its static position in the string, so it looks like you could use the substring
(with an index of 20?) method to accomplish what you want.

"Evaluate" a c# string

I am reading a C# source file.
When I encounter a string, I want to get it's value.
For instance, in the following example:
public class MyClass
{
public MyClass()
{
string fileName = "C:\\Temp\\A Weird\"FileName";
}
}
I would like to retrieve
C:\Temp\A Weird"FileName
Is there an existing procedure to do that?
Coding a solution with all the possible cases should be quite tricky (#, escape sequences. ...).
I am convinced such procedure exists...
I would like to have the dual function too (to inject a string into a C# source file)
Thanks in advance.
Philippe
P.S:
I gave an example with a filename, but I look for a solution working for all kinds of strings.
I'm pretty sure you can use CodeDOM to read a C# code file and parse its elements. It generates a code tree, and then you can look for nodes representing strings.
http://www.codeproject.com/Articles/2502/C-CodeDOM-parser
Other CodeDom parsers:
http://www.codeproject.com/Articles/14383/An-Expression-Parser-for-CodeDom
NRefactory: https://github.com/icsharpcode/NRefactory and http://www.codeproject.com/Articles/408663/Using-NRefactory-for-analyzing-Csharp-code
There is a way of extracting these strings using a regular expression:
("(\\"|[^"])*")
This particular one works on your simple example and gives the filename (complete with leading and trailing quote characters); whether it would work on more complex ones I can't easily tell unfortunately.
For clarity, (\\"|[^"]) matches any character apart from ", except where it has a leading \ character.
Just use ".*" Regex to match all string values, then remove trailing inverted commas and unescape it.
this will allow \" and "" characters inside your string
so both "C:\\Temp\\A Weird\"FileName" and "Hello ""World""" will match

Parsing a String for Special characters in C#

I am getting a string in the following format in the query string:
Arnstung%20Chew(20)
I want to convert it to just Arnstung Chew.
How do I do it?
Also how do I make sure that the user is not passing a script or anything harmful in the query string?
string str = "Arnstung Chew (20)";
string replacedString = str.Substring(0, str.IndexOf("(") -1 ).Trim();
string safeString = System.Web.HttpUtility.HtmlEncode(replacedString);
It's impossible to provide a comprehensive answer without knowing what variations might appear on your input text. For example, will there always be two words separated by a space followed by a number in parentheses? Or might there be other variations as well?
I have a lot of parsing code on my Black Belt Coder site, including a sscanf() replacement for .NET that may potentially be useful in your case.

Declaring a looooong single line string in C#

Is there a decent way to declare a long single line string in C#, such that it isn't impossible to declare and/or view the string in an editor?
The options I'm aware of are:
1: Let it run. This is bad because because your string trails way off to the right of the screen, making a developer reading the message have to annoying scroll and read.
string s = "this is my really long string. this is my really long string. this is my really long string. this is my really long string. this is my really long string. this is my really long string. this is my really long string. this is my really long string. ";
2: #+newlines. This looks nice in code, but introduces newlines to the string. Furthermore, if you want it to look nice in code, not only do you get newlines, but you also get awkward spaces at the beginning of each line of the string.
string s = #"this is my really long string. this is my long string.
this line will be indented way too much in the UI.
This line looks silly in code. All of them suffer from newlines in the UI.";
3: "" + ... This works fine, but is super frustrating to type. If I need to add half a line's worth of text somewhere I have to update all kinds of +'s and move text all around.
string s = "this is my really long string. this is my long string. " +
"this will actually show up properly in the UI and looks " +
"pretty good in the editor, but is just a pain to type out " +
"and maintain";
4: string.format or string.concat. Basically the same as above, but without the plus signs. Has the same benefits and downsides.
Is there really no way to do this well?
There is a way. Put your very long string in resources. You can even put there long pieces of text because it's where the texts should be. Having them directly in code is a real bad practice.
If you really want this long string in the code, and you really don't want to type the end-quote-plus-begin-quote, then you can try something like this.
string longString = #"Some long string,
with multiple whitespace characters
(including newlines and carriage returns)
converted to a single space
by a regular expression replace.";
longString = Regex.Replace(longString, #"\s+", " ");
If using Visual Studio
Tools > Options > Text Editor > All Languages > Word Wrap
I'm sure any other text editor (including notepad) will be able to do this!
It depends on how the string is going to wind up being used. All the answers here are valid, but context is important. If long string "s" is going to be logged, it should be surrounded with a logging guard test, such as this Log4net example:
if (log.IsDebug) {
string s = "blah blah blah" +
// whatever concatenation you think looks the best can be used here,
// since it's guarded...
}
If the long string s is going to be displayed to a user, then Developer Art's answer is the best choice...those should be in resource file.
For other uses (generating SQL query strings, writing to files [but consider resources again for these], etc...), where you are concatenating more than just literals, consider StringBuilder as Wael Dalloul suggests, especially if your string might possibly wind up in a function that just may, at some date in the distant future, be called many many times in a time-critical application (All those invocations add up). I do this, for example, when building a SQL query where I have parameters that are variables.
Other than that, no, I don't know of anything that both looks pretty and is easy to type (though the word wrap suggestion is a nice idea, it may not translate well to diff tools, code print outs, or code review tools). Those are the breaks. (I personally use the plus-sign approach to make the line-wraps neat for our print outs and code reviews).
you can use StringBuilder like this:
StringBuilder str = new StringBuilder();
str.Append("this is my really long string. this is my long string. ");
str.Append("this is my really long string. this is my long string. ");
str.Append("this is my really long string. this is my long string. ");
str.Append("this is my really long string. this is my long string. ");
string s = str.ToString();
You can also use: Text files, resource file, Database and registry.
Does it have to be defined in the source file? Otherwise, define it in a resource or config file.
Personally I would read a string that big from a file perhaps an XML document.
You could use StringBuilder
For really long strings, I'd store it in XML (or a resource). For occasions where it makes sense to have it in the code, I use the multiline string concatenation with the + operator. The only place I can think of where I do this, though, is in my unit tests for code that reads and parses XML where I'm actually trying to avoid using an XML file for testing. Since it's a unit test I almost always want to have the string right there to refer to as well. In those cases I might segregate them all into a #region directive so I can show/hide it as needed.
I either just let it run, or use string.format and write the string in one line (the let it run method) but put each of the arguments in new line, which makes it either easier to read, or at least give the reader some idea what he can expect in the long string without reading it in detail.
Use the Project / Properties / Settings from the top menu of Visual Studio. Make the scope = "Application".
In the Value box you can enter very long strings and as a bonus line feeds are preserved. Then your code can refer to that string like this:
string sql = Properties.Settings.Default.xxxxxxxxxxxxx;

String Format Does Not Works For string

I have a serial number which is String type.
Like This;
String.Format("{0:####-####-####-####}", "1234567891234567" );
I need to see Like This, 1234-5678-9123-4567;
Bu this Code does not work?
Can you help me?
That syntax takes an int, try this:
String.Format("{0:####-####-####-####}", 1234567891234567);
Edit: If you want to use this syntax on a string try this:
String.Format("{0:####-####-####-####}", Convert.ToInt64("1234567891234567"))
For ####-####-####-####, you will need a number. But you're feeding it a string.
It would be more practical to pad the string with additional zero's on the left so it becomes exactly 16 characters. Then insert the dash in three locations inside the string. Converting it to an Int64 will also work but if these strings become bigger or start to contain non-numerics, then you will have a problem.
string Key = "123456789012345";
string FormattedKey = Key.PadLeft(16, '0').Insert(12, "-").Insert(8, "-").Insert(4, "-");
That should be an alternative to formatting. It makes the key exactly 16 characters, then inserts three dashes from right to left. (Easier to keep track of indices.)
There are probably plenty of other alternatives but this one works just fine.

Categories