I'm using this line of code to insert a value from an array into a certain line, in a list of lines.
lineList[LineNumber].Insert(lineList[LineNumber].Count(), pArray[i]);
After debugging, all the variables are correct; pArray is passed in as a parameter and lineList is inherited from another class. I can't see why this wouldn't work. Why are all the lines that are added just empty?
This is because .NET strings are immutable; string.Insert returns a new string, rather than modifying an existing one. If you need to modify the string, add an assignment, like this:
lineList[LineNumber] = lineList[LineNumber]
.Insert(lineList[LineNumber].Count(), pArray[i]);
This should be equivalent to
lineList[LineNumber] += pArray[i];
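As a minimal sketch of the difference (the class name InsertDemo, the helper AppendToLine and the sample values are made up for illustration; only string.Insert and the list indexer come from the question):

```csharp
using System;
using System.Collections.Generic;

public static class InsertDemo
{
    // Hypothetical helper: appends value to the line at lineNumber
    // and stores the result back into the list.
    public static void AppendToLine(List<string> lines, int lineNumber, string value)
    {
        string current = lines[lineNumber];
        // Insert returns a new string; the assignment is what updates the list.
        lines[lineNumber] = current.Insert(current.Length, value);
    }

    public static void Main()
    {
        var lines = new List<string> { "", "abc" };

        lines[1].Insert(lines[1].Length, "X"); // result discarded, list unchanged
        Console.WriteLine(lines[1]);           // abc

        AppendToLine(lines, 1, "X");
        Console.WriteLine(lines[1]);           // abcX
    }
}
```

The first call compiles and runs without error, which is why the problem is easy to miss: the new string is simply thrown away.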
Related
I am trying to split a string using ; as a delimiter.
My output is weird: why is there an empty string at the end of the returned array?
string emails = "bitebari#gmail.com;abcd#gmail.com;";
string[] splittedEmails = emails.TrimEnd().Split(';');
foreach (var email in splittedEmails)
{
Console.WriteLine("Value is: " + email);
}
The console output looks like this:
Value is: bitebari#gmail.com
Value is: abcd#gmail.com
Value is:
The string.Split method doesn't remove empty entries by default, but you can tell it to do so by passing a StringSplitOptions value. Try calling it with the StringSplitOptions.RemoveEmptyEntries parameter.
string[] splittedEmails = emails.Split(';', StringSplitOptions.RemoveEmptyEntries);
Alternatively, pass ; to your TrimEnd method. Called without arguments, TrimEnd only trims white space, so your string keeps the ; at the end. Passing the delimiter results in the following:
string[] splittedEmails = emails.TrimEnd(';').Split(';');
Both of the solutions above work; it really comes down to preference, as the performance difference should be negligible.
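A small sketch comparing the two approaches (the class name SplitDemo, its helper methods and the sample addresses are hypothetical). It also shows one case where they differ: RemoveEmptyEntries drops empty entries in the middle of the string as well, while TrimEnd(';') only affects the end.

```csharp
using System;

public static class SplitDemo
{
    // Default behaviour: a trailing delimiter yields a trailing empty entry.
    public static string[] SplitKeepingEmpty(string s) => s.Split(';');

    // Drops every empty entry, wherever it occurs.
    public static string[] SplitRemovingEmpty(string s) =>
        s.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries);

    // Removes trailing delimiters first, then splits normally.
    public static string[] SplitAfterTrim(string s) => s.TrimEnd(';').Split(';');

    public static void Main()
    {
        string emails = "a#example.com;b#example.com;";
        Console.WriteLine(SplitKeepingEmpty(emails).Length);  // 3
        Console.WriteLine(SplitRemovingEmpty(emails).Length); // 2
        Console.WriteLine(SplitAfterTrim(emails).Length);     // 2

        string withGap = "a;;b;";
        Console.WriteLine(SplitRemovingEmpty(withGap).Length); // 2
        Console.WriteLine(SplitAfterTrim(withGap).Length);     // 3
    }
}
```

The array-based Split overload is used here because it is available on every .NET version; which variant fits depends on whether empty entries in the middle carry meaning.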
Edit
This behavior is considered to be 'standard' at least in C#, let me quote the MSDN for this one.
This behavior makes it easier for formats like comma separated values (CSV) files representing tabular data. Consecutive commas represent a blank column.
You can pass an optional StringSplitOptions.RemoveEmptyEntries parameter to exclude any empty strings in the returned array. For more complicated processing of the returned collection, you can use LINQ to manipulate the result sequence.
Also, there is no special case for a delimiter at the end of the string.
Let's say I have an array of strings from executing this Split method
string[] parsed = message.Split(' ');
And then I want to store a certain value of this array in a new string
string name = parsed[3];
Now I remove this string from the array using an extension method, this method removes the value at the specified index and shifts everything back to fill the gap
parsed = parsed.RemoveAt(3);
I am aware that because strings are reference types in C#, my name variable is now null. After searching a little, I've been told that making exact copies of strings in C# is useless. What is the best and correct way to set the name variable to a new instance so that it does not get deleted after the .RemoveAt() call?
EDIT:
This is the best way that I found so far
string name = new string(parsed[3].ToCharArray());
Another way proposed by Willy David Jr
parsed = parsed.Where((source, index) => index != 3).ToArray();
EDIT 2:
Please disregard this question and read the approved answer, I misunderstood how reference types work.
You're misunderstanding how reference types work. Removing an object from an array does not modify that object in any way - it just means that the array no longer contains a reference to the object.
You can test this yourself. Run the code you included in the debugger (or a console app) and then view (or print out) the value of name at the end.
The thing that can trip you up with reference types occurs when there are two variables (or arrays or whatever) that hold a reference to the same object. In this case, changes made to the object via one variable will be reflected when the object is accessed via another variable: it's the same object, but with two different variables referencing it. If you want both variables to refer to their own "copy" of the object, you have to create a copy yourself and assign it to one of the variables.
However, in C#, the string type is immutable, meaning that once a string object is created there is no way to change that object. So there is never a reason to create a copy of a string. If there is a variable that references a particular string, you can be sure that no other reference can change it out from under you.
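To check this without a debugger, a short sketch along these lines can be run as a console app (the class name ReferenceDemo, the helper RemoveAtIndex and the sample sentence are made up; the helper stands in for the question's RemoveAt extension method, which was not shown):

```csharp
using System;
using System.Linq;

public static class ReferenceDemo
{
    // Stand-in for the question's extension method: returns a new array
    // without the element at the given index.
    public static string[] RemoveAtIndex(string[] source, int index) =>
        source.Where((item, i) => i != index).ToArray();

    public static void Main()
    {
        string[] parsed = "say hello to Alice please".Split(' ');
        string name = parsed[3];           // name refers to the string "Alice"

        parsed = RemoveAtIndex(parsed, 3); // the array no longer references it

        Console.WriteLine(name);                     // Alice
        Console.WriteLine(string.Join(" ", parsed)); // say hello to please
    }
}
```

The name variable keeps its own reference to the string, so dropping the array's reference has no effect on it.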
Why do you think your name variable should be null? It stays untouched after the element is removed from the array. Your original code is enough to accomplish what you want.
Are you sure that there is a RemoveAt on your string array? There is a RemoveAt on collections such as List<string>, but not on string or on a string array as such.
You can do this instead:
List<string> lstParse = new List<string>();
foreach (var i in parsed)
{
lstParse.Add(i);
}
string name = parsed[3];
lstParse.RemoveAt(3);
To join the list back together into a single string:
string strResult = string.Join(" ", lstParse.ToArray());
If you really want to remove the element at an index without creating a separate list, disregard the code above and use this one line instead:
parsed = parsed.Where((source, index) => index != 3).ToArray();
I run a repeated Regex.Replace over a string, replacing certain "variables" with their "values". Thing is, some get replaced and some don't!
I have to analyze certain batch files (IBM JCL batch language, to be precise) and search them for JCL variables (rules: a JCL variable starts with "&" and ends with a space, ",", "." or the start of another variable, that being "&"). My function is supposed to take the string with variables and an array of variables and their values as input, then search the string and replace the JCL variables with their values. So I run a for loop, and for each variable-value struct in the array I run Regex.Replace (in order to prevent "&TOSP." from being mistaken for "&TO." and to adhere to the JCL variable rules, see above):
private string ReplaceDSNVarsWithValues(string _DSN,JCLvar[] VarsAndValues)
{
//FIXME: doesn't work for TIPfile and doesn't pick up all &var
for(int Fa=0;Fa<VarsAndValues.Length/2;++Fa)
{
_DSN = Regex.Replace(_DSN, "&"+VarsAndValues[Fa].JCLvariable+"[^A-Za-z0-9]", VarsAndValues[Fa].JCLvalue);
}
return _DSN;
}
Eg. I have this as a string to replace:
string _DSN = "&TOSP..COPY.&SYSTEM..SP&APL..BVSIN.SAVEC.D&MES.&DEN..V&VER.K99";
And then I have an array of struct containing couples of variable and value, eg.
JCLvar[1].variable = "APL",JCLvar[1].value = "PROD"
Combine that and it should result in the "SP&APL." part changing to "SPPROD".
The problem is, only SOME of the variables get replaced:
&TOSP..COPY.&SYSTEM..SP&APL..BVSIN.SAVEC.D&MES.&DEN..V&VER.K99 gets changed to SP.COPY.DBA0.SPPROD.BVSIN.SAVEC.D&MESDENV&VER.K99 as it should (disregard &MES and &DEN: these are not filled in the VarsAndValues array and therefore don't get replaced), but in
&TO..#ZDSK99.PODVYP.M&MES.U&DEN..SUC.RES, the "&TO." doesn't get changed at all - although it exists in the array and via debugging, I see that it is being passed to the regex /but it doesn't get changed/.
How the heck do SOME variables get replaced while others don't?
In the array VarsAndValues, the order of variables matters: if "TOSP" is first, it gets replaced and "&TO" does not, while if "TO" is first, it gets replaced and "&TOSP" doesn't. Therefore, I suspect that Regex.Replace somehow fails to do repeated replacements of similar expressions/variables in the same string, OR fails to recognize the variable/expression to be replaced. But I see no reason for the first possibility, and the second one is impossible, as the replaced expressions clearly stay there.
//Note - I know it's certainly not nice coding, but it's more a single-purpose script I wrote to save me weeks of manual work than anything else
I don't see anything wrong with your regex. But why are you iterating over only half of VarsAndValues?
for(int Fa=0;Fa<VarsAndValues.Length/2;++Fa)
tells me you're stopping halfway through the array, so if TOSP happens to fall in the second half, it won't be replaced.
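A sketch of the loop running over the whole array follows. Everything except Regex.Replace and Regex.Escape is an assumption: the JclVar struct here is a simplified stand-in for the question's type, and the pattern is a swapped-in variant (a negative lookahead plus one optional terminating period) rather than the question's original [^A-Za-z0-9], so that the character after the variable is not consumed unless it is the period.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical stand-in for the question's JCLvar struct.
public struct JclVar
{
    public string Name;
    public string Value;
    public JclVar(string name, string value) { Name = name; Value = value; }
}

public static class JclReplaceDemo
{
    public static string ReplaceVars(string dsn, JclVar[] vars)
    {
        foreach (JclVar v in vars) // every entry, not just the first half
        {
            // "&NAME" not followed by a letter or digit, then one optional
            // terminating period (assumed JCL rule, see lead-in).
            string pattern = "&" + Regex.Escape(v.Name) + @"(?![A-Za-z0-9])\.?";
            string value = v.Value;
            // A MatchEvaluator inserts the value literally, even if it contains '$'.
            dsn = Regex.Replace(dsn, pattern, m => value);
        }
        return dsn;
    }

    public static void Main()
    {
        var vars = new[] { new JclVar("TOSP", "SP"), new JclVar("TO", "DBA0") };
        Console.WriteLine(ReplaceVars("&TOSP..COPY.&TO..DATA", vars)); // SP.COPY.DBA0.DATA
    }
}
```

With the lookahead in place, "&TO" cannot match inside "&TOSP", so the order of the entries in the array should no longer matter.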
I'm about to build a solution where I receive a comma-separated list every night. It's a list with around 14000 rows, and I need to go through the list and select some of the values in it.
The document I receive is built up with around 50 semicolon separated values for every "case". How the document is structured:
"";"2010-10-17";"";"";"";"Period-Last24h";"Problem is that the customer cant find....";
and so on, with 43 more semicolon statements. And every "case" ends with the value "Total 515";
What I need to do is go through all these "cases" and extract some of the values in them. The "cases" are always built up in the same order, and I know that it's always the 3rd, 15th and 45th semicolon-separated value that I need to extract.
How can I do this in the easiest way?
I think you should decompose this problem into smaller problems. Here are the steps I'd take:
Each semi-colon separated record represents a single object. C# is an object-oriented language. Stop thinking in terms of .csv records and start thinking in terms of objects. Break up the input into semi-colon delimited records.
Given a single comma-separated record, the values represent the properties of your object. Give them meaningful names.
Parse a comma-separated record into an object. When you're done, you'll have a collection of objects that you can deal with.
Use C#'s collections and LINQ to filter your list based on those cases that you need to withdraw. When you're done, you'll have a collection of objects with the desired cases removed.
Don't worry about the "easiest" way. You need one way that works. Whatever you do, get something working and worry about optimizing it to make it easiest, fastest, smallest, etc. later on.
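The steps above can be sketched roughly as follows. The class CaseRecord and its property names are placeholders (the question does not say what the 3rd, 15th and 45th values mean), and the sketch assumes one record per line with no semicolons inside the values.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical object for one "case"; the property names are placeholders.
public class CaseRecord
{
    public string Third { get; set; }
    public string Fifteenth { get; set; }
    public string FortyFifth { get; set; }
}

public static class CaseParser
{
    // Parses one semicolon-separated line into an object.
    public static CaseRecord ParseLine(string line)
    {
        string[] parts = line.Split(';');
        if (parts.Length < 45)
            throw new FormatException("Expected at least 45 values, got " + parts.Length);

        return new CaseRecord
        {
            Third = parts[2].Trim('"'),      // strip the surrounding quotes
            Fifteenth = parts[14].Trim('"'),
            FortyFifth = parts[44].Trim('"')
        };
    }

    // Parses all non-empty lines into a list of objects.
    public static List<CaseRecord> ParseAll(IEnumerable<string> lines) =>
        lines.Where(l => !string.IsNullOrWhiteSpace(l))
             .Select(ParseLine)
             .ToList();
}
```

Once the records are objects, the usual LINQ operators (Where, Select, GroupBy) can be applied to the resulting list.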
Assuming the "rows" are lines and that you read line by line, your main tool should be string.Split:
foreach (string line in ... )
{
string[] parts = line.Split(';');
string part3 = parts[2];
string part15 = parts[14];
// etc
}
Note that this is a simple approach that will fail if the content of any column can contain ';'
You could use String.Split twice.
The first time using "Total 515"; as the split string using this overload. This will give you an array of cases.
The second time using ";" as the split character using this overload on each of the cases. This will give you a data array for each case. As the data is consistent you can extract the 3rd, 15th and 45th elements of this array.
I'd search for an existing csv library. The escaping rules are probably not that easily mapped to regex.
If writing a library myself I'd first parse each line into a list/an array of strings. And then in a second step(probably outside of the csv library itself) convert the stringlist to a strongly typed object.
A simple but slow approach would be reading single characters from the input (with the StringReader class, for example). Write a ReadItem method that reads a quote, continues to read until the next quote, and then looks at the next character. If it is a newline or a semicolon, one item has been read. If it is another quote, add a single quote to the item being read and keep going. Otherwise, throw an exception. Then use this method to split up the input data into a series of items, with each line stored e.g. in a string[number of items in a row] and the lines stored in a List<>. Then you can use this class to read the CSV data inside another class that decodes the data into objects that you can get your data out of.
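One possible reading of that description, as a rough sketch: the class name QuotedCsvReader and the method ReadAll are made up, and the code assumes that every item is quoted and that lines do not end with an extra semicolon (the sample line in the question does, so real input would need that case handled too).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class QuotedCsvReader
{
    // Reads one quoted item. endOfLine is set when the item was followed
    // by a newline or by the end of the input.
    public static string ReadItem(TextReader reader, out bool endOfLine)
    {
        if (reader.Read() != '"')
            throw new FormatException("Expected an opening quote.");

        var item = new StringBuilder();
        while (true)
        {
            int c = reader.Read();
            if (c == -1) throw new FormatException("Unterminated quoted item.");
            if (c != '"') { item.Append((char)c); continue; }

            // A quote was read; the next character decides what it means.
            int next = reader.Read();
            if (next == '"') { item.Append('"'); continue; }   // escaped quote
            if (next == ';') { endOfLine = false; return item.ToString(); }
            if (next == '\r' && reader.Peek() == '\n') next = reader.Read();
            if (next == '\n' || next == -1) { endOfLine = true; return item.ToString(); }
            throw new FormatException("Unexpected character after closing quote.");
        }
    }

    // Splits the whole input into rows of items.
    public static List<string[]> ReadAll(string input)
    {
        var rows = new List<string[]>();
        var row = new List<string>();
        using (var reader = new StringReader(input))
        {
            while (reader.Peek() != -1)
            {
                bool endOfLine;
                row.Add(ReadItem(reader, out endOfLine));
                if (endOfLine) { rows.Add(row.ToArray()); row.Clear(); }
            }
        }
        return rows;
    }
}
```

Unlike a plain Split(';'), this keeps semicolons that appear inside a quoted item, at the cost of reading the input one character at a time.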
I guess it has something to do with string being a reference type, but I don't get why simply calling string.Replace("X","Y") does not work.
Why do I need to do string A = stringB.Replace("X","Y")? I thought it is just a method to be done on specified instance.
EDIT: Thank you so far. I extend my question: Why does b+="FFF" work but b.Replace does not?
Because strings are immutable. Any time you change a string, .NET creates a new string object. It's a property of the class.
Immutable objects
String Object
Why doesn't stringA.Replace("X","Y") work?
Why do I need to do stringB = stringA.Replace("X","Y"); ?
Because strings are immutable in .NET. You cannot change the value of an existing string object, you can only create new strings. string.Replace creates a new string which you can then assign to something if you wish to keep a reference to it. From the documentation:
Returns a new string in which all occurrences of a specified string in the current instance are replaced with another specified string.
Emphasis mine.
So if strings are immutable, why does b += "FFF"; work?
Good question.
First note that b += "FFF"; is equivalent to b = b + "FFF"; (except that b is only evaluated once).
The expression b + "FFF" creates a new string with the correct result without modifying the old string. The reference to the new string is then assigned to b replacing the reference to the old string. If there are no other references to the old string then it will become eligible for garbage collection.
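A short sketch that makes both effects visible (the class name ConcatDemo and its two helper methods are made up for illustration):

```csharp
using System;

public static class ConcatDemo
{
    // Calls Replace but ignores the result: the input comes back unchanged.
    public static string ReplaceDiscarded(string s)
    {
        s.Replace("X", "Y"); // a new string is created and immediately lost
        return s;
    }

    // Assigns the result back, so the caller sees the replaced text.
    public static string ReplaceAssigned(string s)
    {
        s = s.Replace("X", "Y");
        return s;
    }

    public static void Main()
    {
        string b = "abc";
        string old = b;   // second reference to the same string object
        b += "FFF";       // b now refers to a newly created string

        Console.WriteLine(old);                             // abc
        Console.WriteLine(b);                               // abcFFF
        Console.WriteLine(object.ReferenceEquals(old, b));  // False

        Console.WriteLine(ReplaceDiscarded("aXb"));         // aXb
        Console.WriteLine(ReplaceAssigned("aXb"));          // aYb
    }
}
```

The variable old still prints "abc" because += never touched the original string object; it only pointed b at a different one.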
Strings are immutable, which means that once they are created, they cannot be changed anymore. This has several reasons, as far as I know mainly for performance (how strings are represented in memory).
See also (among many):
http://en.wikipedia.org/wiki/Immutable_object
http://channel9.msdn.com/forums/TechOff/58729-Why-are-string-types-immutable-in-C/
As a direct consequence of that, each string operation creates a new string object. In particular, if you do things like
foreach (string msg in messages)
{
totalMessage = totalMessage + msg;
totalMessage = totalMessage + "\n";
}
you actually create potentially dozens or hundreds of string objects. So, if you need to do heavier string manipulation, follow GvS's hint and use the StringBuilder.
Strings are immutable. Any operation changing them has to create a new string.
A StringBuilder supports an in-place Replace method.
Use the StringBuilder if you need to do a lot of string manipulation.
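For illustration, a minimal sketch of both uses of StringBuilder (the class name BuilderDemo and its method names are hypothetical; Append, Replace and ToString are the standard StringBuilder members):

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class BuilderDemo
{
    // Builds one string from many messages without creating an
    // intermediate string object on every iteration.
    public static string JoinMessages(IEnumerable<string> messages)
    {
        var sb = new StringBuilder();
        foreach (string msg in messages)
        {
            sb.Append(msg).Append('\n');
        }
        return sb.ToString();
    }

    // StringBuilder.Replace modifies the builder itself, so no
    // assignment of the result is needed.
    public static string ReplaceInPlace(string text, string oldValue, string newValue)
    {
        var sb = new StringBuilder(text);
        sb.Replace(oldValue, newValue);
        return sb.ToString();
    }

    public static void Main()
    {
        Console.Write(JoinMessages(new[] { "one", "two" }));
        Console.WriteLine(ReplaceInPlace("aXbX", "X", "Y")); // aYbY
    }
}
```

For a handful of concatenations the plain + operator is fine; the builder starts to pay off in loops over many items.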
Why does b += "FFF" work but b.Replace does not?
Because the += operator assigns the results back to the left hand operand, of course. It's just a short hand for b = b + "FFF";.
The simple fact is that you can't change any string in .Net. There are no instance methods for strings that alter the content of that string - you must always assign the results of an operation back to a string reference somewhere.
Yes, it's a method of System.String. But you can try
a = a.Replace("X","Y");
String.Replace is an instance method of the string class that returns a new string. It does not modify the current object. b.Replace("a","b") on its own would be similar to a line that only has c+1. So just like c=c+1 actually assigns the value of c+1 to c, b=b.Replace("a","b") assigns the new string that is returned to b.
As everyone above has said, strings are immutable.
This means that when you do your replace, you get a new string, rather than changing the existing string.
If you don't store this new string in a variable (such as in the variable that it was declared as) your new string won't be saved anywhere.
To answer your extended question, b+="FFF" is equivalent to b = b + "FFF", so basically you are creating a new string here also.
Just to be more explicit. string.Replace("X","Y") returns a new string...but since you are not assigning the new string to anything the new string is lost.