Related
I'm new to C#. I'm parsing for a lot number in a 2D barcode. The actual lot number 'A2351' is hidden in this barcode string "+M727PP011/$$3201001A2351S". I would like to break this barcode up in separate string blocks but the delimiters are not consistent.
The letter prefix in front of the 4 digit lot number can be a 'A', 'P', or a 'D' There is a single letter following the lot number that can be ignored.
string Delimiter = "/$$3";
//barcode format:M###PP###/$$3 ddmmyy lotnumprefix 'A' followed by lotNum
string lotNum= "+M727PP011/$$3201001A2351S";
string[] split = lotNum.Split(new[] {Delimiter}, StringSplitOptions.None);
How do I extract the lot number after the date?
Based on your initial example and then the subsequent edit in which you showed how you are solving this, it sounds like the lot number is always in the same place. It would be cleaner (and more in line with standard C# code) to use a single call to string.Substring(int,int) rather than the two lines you are using which also require pulling in the VB library. You just need to call Substring and give it the starting index and the length.
So this code:
string lotNum = Strings.Right(barcode, 6);
lotNum = lotNum.Remove((lotNum.Length - 1), 1);
Can be done with this single substring call:
string lotNum = barcode.Substring(barcode.Length - 6, 5);
Edit
Just further clarification on why it might be better to use the call to Substring. In C# string objects are immutable. That means that when you make the call to Strings.Right you are getting back a new string object. When you then call lotNum.Remove you do not "remove" a character from the existing string, a new string is allocated with the character(s) removed and is returned to you. So with your code there are two new string allocations when trying to extract the lot number. When you make the call to Substring you will get back a new string, but instead of getting a new string that you immediately then modify and get a second new string, you will only need to allocate one new string to extract the lot number. In the example you have given there probably would not be any noticeable performance/memory issue, but it is something that could potentially lead to trouble if this code was in a tight loop or something like that.
If you're just trying to get the lot number, it's really dependent on the format of the input string (is it a consistent length, are there any reliable prefixes/suffixes relative to the data you're trying to parse that you can reference from, etc). It looks like your data is definable by its static position in the string, so it looks like you could use the substring
(with an index of 20?) method to accomplish what you want.
I usually wrap long strings by concatenating them:
Log.Debug("I am a long string. So long that I must " +
"be on multiple lines to be feasible.");
This is perfectly efficient, since the compiler handles concatenation of string literals. I also consider it the cleanest way to handle this problem (the options are weighed here).
This approach worked well with String.Format:
Log.Debug(String.Format("Must resize {0} x {1} image " +
"to {2} x {3} for reasons.", image.Width, image.Height,
resizedImage.Width, resizedImage.Height));
However, I now wish to never use String.Format again in these situations, since C# 6's string interpolation is much more readable. My concern is that I no longer have an efficient, yet clean way to format long strings.
My question is if the compiler can somehow optimize something like
Log.Debug($"Must resize {image.Width} x {image.Height} image " +
$"to {resizedImage.Width} x {resizedImage.Height} for reasons.");
into the above String.Format equivalent or if there's an alternative approach that I can use that won't be less efficient (due to the unnecessary concatenation) while also keeping my code cleanly structured (as per the points raised in the link above).
This program:
var name = "Bobby Tables";
var age = 8;
String msg = $"I'm {name} and" +
$" I'm {age} years old";
is compiled as if you had written:
var name = "Bobby Tables";
var age = 8;
String msg = String.Concat(String.Format("I'm {0} and", name),
String.Format(" I'm {0} years old", age));
You see the difficulty in getting rid of the Concat - the compiler has re-written our interpolation literals to use the indexed formatters that String.Format expects, but each string has to number its parameters from 0. Naively concatenating them would cause them both to insert name. To get this to work out correctly, there would have to be state maintained between invocations of the $ parser so that the second string is reformatted as " I'm {1} years old". Alternatively, the compiler could try to apply the same kind of analysis it does for concatenation of string literals. I think this would be a legal optimization even though string interpolation can have side effects, but I wouldn't be surprised if it turned out there was a corner case under which interpolated string concatenation changed program behavior. Neither sounds impossible, especially given the logic is already there to detect a similar condition for string literals, but I can see why this feature didn't make it into the first release.
I would write the code in the way that you feel is cleanest and most readable, and not worry about micro-inefficiencies unless they prove to be a problem. The old saying about code being primarily for humans to understand holds here.
Maybe it would be not as readable as with + but by all means, it is possible. You just have to break line between { and }:
Log.Debug($#"Must resize {image.Width} x {image.Height} image to {
resizedImage.Width} x {resizedImage.Height} for reasons.");
SO's colouring script does not handle this syntax too well but C# compiler does ;-)
In the specialized case of using this string in HTML (or parsing with whatever parser where multiple whitespaces does not matter), I could recommend you to use #$"" strings (verbatim interpolated string) eg.:
$#"some veeeeeeeeeeery long string {foo}
whatever {bar}"
In c# 6.0:
var planetName = "Bob";
var myName = "Ford";
var formattedStr = $"Hello planet {planetName}, my name is {myName}!";
// formattedStr should be "Hello planet Bob, my name is Ford!"
Then concatenate with stringbuilder:
StringBuilder stringBuilder = new StringBuilder();
stringBuilder.Append(formattedStr);
// Then add the strings you need
Append more strings to stringbuilder.....
I am getting a string in the following format in the query string:
Arnstung%20Chew(20)
I want to convert it to just Arnstung Chew.
How do I do it?
Also how do I make sure that the user is not passing a script or anything harmful in the query string?
string str = "Arnstung Chew (20)";
string replacedString = str.Substring(0, str.IndexOf("(") -1 ).Trim();
string safeString = System.Web.HttpUtility.HtmlEncode(replacedString);
It's impossible to provide a comprehensive answer without knowing what variations might appear on your input text. For example, will there always be two words separated by a space followed by a number in parentheses? Or might there be other variations as well?
I have a lot of parsing code on my Black Belt Coder site, including a sscanf() replacement for .NET that may potentially be useful in your case.
Is there a decent way to declare a long single line string in C#, such that it isn't impossible to declare and/or view the string in an editor?
The options I'm aware of are:
1: Let it run. This is bad because because your string trails way off to the right of the screen, making a developer reading the message have to annoying scroll and read.
string s = "this is my really long string. this is my really long string. this is my really long string. this is my really long string. this is my really long string. this is my really long string. this is my really long string. this is my really long string. ";
2: #+newlines. This looks nice in code, but introduces newlines to the string. Furthermore, if you want it to look nice in code, not only do you get newlines, but you also get awkward spaces at the beginning of each line of the string.
string s = #"this is my really long string. this is my long string.
this line will be indented way too much in the UI.
This line looks silly in code. All of them suffer from newlines in the UI.";
3: "" + ... This works fine, but is super frustrating to type. If I need to add half a line's worth of text somewhere I have to update all kinds of +'s and move text all around.
string s = "this is my really long string. this is my long string. " +
"this will actually show up properly in the UI and looks " +
"pretty good in the editor, but is just a pain to type out " +
"and maintain";
4: string.format or string.concat. Basically the same as above, but without the plus signs. Has the same benefits and downsides.
Is there really no way to do this well?
There is a way. Put your very long string in resources. You can even put there long pieces of text because it's where the texts should be. Having them directly in code is a real bad practice.
If you really want this long string in the code, and you really don't want to type the end-quote-plus-begin-quote, then you can try something like this.
string longString = #"Some long string,
with multiple whitespace characters
(including newlines and carriage returns)
converted to a single space
by a regular expression replace.";
longString = Regex.Replace(longString, #"\s+", " ");
If using Visual Studio
Tools > Options > Text Editor > All Languages > Word Wrap
I'm sure any other text editor (including notepad) will be able to do this!
It depends on how the string is going to wind up being used. All the answers here are valid, but context is important. If long string "s" is going to be logged, it should be surrounded with a logging guard test, such as this Log4net example:
if (log.IsDebug) {
string s = "blah blah blah" +
// whatever concatenation you think looks the best can be used here,
// since it's guarded...
}
If the long string s is going to be displayed to a user, then Developer Art's answer is the best choice...those should be in resource file.
For other uses (generating SQL query strings, writing to files [but consider resources again for these], etc...), where you are concatenating more than just literals, consider StringBuilder as Wael Dalloul suggests, especially if your string might possibly wind up in a function that just may, at some date in the distant future, be called many many times in a time-critical application (All those invocations add up). I do this, for example, when building a SQL query where I have parameters that are variables.
Other than that, no, I don't know of anything that both looks pretty and is easy to type (though the word wrap suggestion is a nice idea, it may not translate well to diff tools, code print outs, or code review tools). Those are the breaks. (I personally use the plus-sign approach to make the line-wraps neat for our print outs and code reviews).
you can use StringBuilder like this:
StringBuilder str = new StringBuilder();
str.Append("this is my really long string. this is my long string. ");
str.Append("this is my really long string. this is my long string. ");
str.Append("this is my really long string. this is my long string. ");
str.Append("this is my really long string. this is my long string. ");
string s = str.ToString();
You can also use: Text files, resource file, Database and registry.
Does it have to be defined in the source file? Otherwise, define it in a resource or config file.
Personally I would read a string that big from a file perhaps an XML document.
You could use StringBuilder
For really long strings, I'd store it in XML (or a resource). For occasions where it makes sense to have it in the code, I use the multiline string concatenation with the + operator. The only place I can think of where I do this, though, is in my unit tests for code that reads and parses XML where I'm actually trying to avoid using an XML file for testing. Since it's a unit test I almost always want to have the string right there to refer to as well. In those cases I might segregate them all into a #region directive so I can show/hide it as needed.
I either just let it run, or use string.format and write the string in one line (the let it run method) but put each of the arguments in new line, which makes it either easier to read, or at least give the reader some idea what he can expect in the long string without reading it in detail.
Use the Project / Properties / Settings from the top menu of Visual Studio. Make the scope = "Application".
In the Value box you can enter very long strings and as a bonus line feeds are preserved. Then your code can refer to that string like this:
string sql = Properties.Settings.Default.xxxxxxxxxxxxx;
Is there an easy way to create a multiline string literal in C#?
Here's what I have now:
string query = "SELECT foo, bar"
+ " FROM table"
+ " WHERE id = 42";
I know PHP has
<<<BLOCK
BLOCK;
Does C# have something similar?
You can use the # symbol in front of a string to form a verbatim string literal:
string query = #"SELECT foo, bar
FROM table
WHERE id = 42";
You also do not have to escape special characters when you use this method, except for double quotes as shown in Jon Skeet's answer.
It's called a verbatim string literal in C#, and it's just a matter of putting # before the literal. Not only does this allow multiple lines, but it also turns off escaping. So for example you can do:
string query = #"SELECT foo, bar
FROM table
WHERE name = 'a\b'";
This includes the line breaks (using whatever line break your source has them as) into the string, however. For SQL, that's not only harmless but probably improves the readability anywhere you see the string - but in other places it may not be required, in which case you'd either need to not use a multi-line verbatim string literal to start with, or remove them from the resulting string.
The only bit of escaping is that if you want a double quote, you have to add an extra double quote symbol:
string quote = #"Jon said, ""This will work,"" - and it did!";
As a side-note, with C# 6.0 you can now combine interpolated strings with the verbatim string literal:
string camlCondition = $#"
<Where>
<Contains>
<FieldRef Name='Resource'/>
<Value Type='Text'>{(string)parameter}</Value>
</Contains>
</Where>";
The problem with using string literal I find is that it can make your code look a bit "weird" because in order to not get spaces in the string itself, it has to be completely left aligned:
var someString = #"The
quick
brown
fox...";
Yuck.
So the solution I like to use, which keeps everything nicely aligned with the rest of your code is:
var someString = String.Join(
Environment.NewLine,
"The",
"quick",
"brown",
"fox...");
And of course, if you just want to logically split up lines of an SQL statement like you are and don't actually need a new line, you can always just substitute Environment.NewLine for " ".
One other gotcha to watch for is the use of string literals in string.Format. In that case you need to escape curly braces/brackets '{' and '}'.
// this would give a format exception
string.Format(#"<script> function test(x)
{ return x * {0} } </script>", aMagicValue)
// this contrived example would work
string.Format(#"<script> function test(x)
{{ return x * {0} }} </script>", aMagicValue)
Why do people keep confusing strings with string literals? The accepted answer is a great answer to a different question; not to this one.
I know this is an old topic, but I came here with possibly the same question as the OP, and it is frustrating to see how people keep misreading it. Or maybe I am misreading it, I don't know.
Roughly speaking, a string is a region of computer memory that, during the execution of a program, contains a sequence of bytes that can be mapped to text characters. A string literal, on the other hand, is a piece of source code, not yet compiled, that represents the value used to initialize a string later on, during the execution of the program in which it appears.
In C#, the statement...
string query = "SELECT foo, bar"
+ " FROM table"
+ " WHERE id = 42";
... does not produce a three-line string but a one liner; the concatenation of three strings (each initialized from a different literal) none of which contains a new-line modifier.
What the OP seems to be asking -at least what I would be asking with those words- is not how to introduce, in the compiled string, line breaks that mimick those found in the source code, but how to break up for clarity a long, single line of text in the source code without introducing breaks in the compiled string. And without requiring an extended execution time, spent joining the multiple substrings coming from the source code. Like the trailing backslashes within a multiline string literal in javascript or C++.
Suggesting the use of verbatim strings, nevermind StringBuilders, String.Joins or even nested functions with string reversals and what not, makes me think that people are not really understanding the question. Or maybe I do not understand it.
As far as I know, C# does not (at least in the paleolithic version I am still using, from the previous decade) have a feature to cleanly produce multiline string literals that can be resolved during compilation rather than execution.
Maybe current versions do support it, but I thought I'd share the difference I perceive between strings and string literals.
UPDATE:
(From MeowCat2012's comment) You can. The "+" approach by OP is the best. According to spec the optimization is guaranteed: http://stackoverflow.com/a/288802/9399618
Add multiple lines : use #
string query = #"SELECT foo, bar
FROM table
WHERE id = 42";
Add String Values to the middle : use $
string text ="beer";
string query = $"SELECT foo {text} bar ";
Multiple line string Add Values to the middle: use $#
string text ="Customer";
string query = $#"SELECT foo, bar
FROM {text}Table
WHERE id = 42";
You can use # and "".
string sourse = #"{
""items"":[
{
""itemId"":0,
""name"":""item0""
},
{
""itemId"":1,
""name"":""item1""
}
]
}";
In C# 11 [2022], you will be able to use Raw String literals.
The use of Raw String Literals makes it easier to use " characters without having to write escape sequences.
Solution for OP:
string query1 = """
SELECT foo, bar
FROM table
WHERE id = 42
""";
string query2 = """
SELECT foo, bar
FROM table
WHERE id = 42
and name = 'zoo'
and type = 'oversized "jumbo" grand'
""";
More details about Raw String Literals
See the Raw String Literals GitHub Issue for full details; and Blog article C# 11 Preview Updates – Raw string literals, UTF-8 and more!
I haven't seen this, so I will post it here (if you are interested in passing a string you can do this as well.) The idea is that you can break the string up on multiple lines and add your own content (also on multiple lines) in any way you wish. Here "tableName" can be passed into the string.
private string createTableQuery = "";
void createTable(string tableName)
{
createTableQuery = #"CREATE TABLE IF NOT EXISTS
["+ tableName + #"] (
[ID] INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
[Key] NVARCHAR(2048) NULL,
[Value] VARCHAR(2048) NULL
)";
}
Yes, you can split a string out onto multiple lines without introducing newlines into the actual string, but it aint pretty:
string s = $#"This string{
string.Empty} contains no newlines{
string.Empty} even though it is spread onto{
string.Empty} multiple lines.";
The trick is to introduce code that evaluates to empty, and that code may contain newlines without affecting the output. I adapted this approach from this answer to a similar question.
There is apparently some confusion as to what the question is, but there are two hints that what we want here is a string literal not containing any newline characters, whose definition spans multiple lines. (in the comments he says so, and "here's what I have" shows code that does not create a string with newlines in it)
This unit test shows the intent:
[TestMethod]
public void StringLiteralDoesNotContainSpaces()
{
string query = "hi"
+ "there";
Assert.AreEqual("hithere", query);
}
Change the above definition of query so that it is one string literal, instead of the concatenation of two string literals which may or may not be optimized into one by the compiler.
The C++ approach would be to end each line with a backslash, causing the newline character to be escaped and not appear in the output. Unfortunately, there is still then the issue that each line after the first must be left aligned in order to not add additional whitespace to the result.
There is only one option that does not rely on compiler optimizations that might not happen, which is to put your definition on one line. If you want to rely on compiler optimizations, the + you already have is great; you don't have to left-align the string, you don't get newlines in the result, and it's just one operation, no function calls, to expect optimization on.
If you don't want spaces/newlines, string addition seems to work:
var myString = String.Format(
"hello " +
"world" +
" i am {0}" +
" and I like {1}.",
animalType,
animalPreferenceType
);
// hello world i am a pony and I like other ponies.
You can run the above here if you like.
using System;
namespace Demo {
class Program {
static void Main(string[] args) {
string str = #"Welcome User,
Kindly wait for the image to
load";
Console.WriteLine(str);
}
}
}
Output
Welcome User,
Kindly wait for the image to
load