Efficient way to use a comma separator - C#

I work in C#. I have an array and need to separate its items with commas. I did it, but I don't think my approach is efficient. How can I do it without an if condition? Please don't use the Replace method. My code is below.
string container = "";
string[] s = "Hellow world how are you".Split(' ');
foreach (string item in s)
{
    if (container == "")
    {
        container += item;
    }
    else
    {
        container += "," + item;
    }
}
I have to keep the loop. I just want a solution like the one below.
string container = "";
string[] s = "Hellow world how are you".Split(' ');
foreach (string item in s)
{
    container += "," + item;
}
Thanks in advance. If you have any questions, please ask.

Use String.Join to join an array with comma separators.
string[] s = "Hello world how are you".Split(' ');
string container = String.Join(",", s);
Also, if you like getting help on this site, I recommend you start accepting a few answers.

Your problem is not the if statement. Your problem is that it is generally bad practice to perform string concatenation and other manipulations in a loop. The string class is immutable, so every change creates a new string and allocates new memory. As a result, this practice is slow and inefficient, much more so than your if statement will ever be. The more iterations your loop runs, the more you'll notice the inefficiency.
You should familiarize yourself with the StringBuilder class, which allows you to perform efficient manipulations of a string without repeatedly allocating new objects. It is particularly useful in loops like yours above.
An example of using a StringBuilder looks like the following:
StringBuilder builder = new StringBuilder();
foreach (string item in array)
{
    if (builder.Length != 0) builder.Append(",");
    builder.Append(item);
}
string finalOutput = builder.ToString();
With that said, string.Join is also a powerful tool for the type of concatenation you are performing.
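Since the question asks for a loop without an if condition, here is a minimal sketch of one common idiom: keep the separator in a variable that starts empty and is switched to the real separator after the first item. The class and method names (JoinSketch, JoinWithoutIf) are made up for illustration and are not part of any library.

```csharp
using System;
using System.Text;

public static class JoinSketch
{
    // Hypothetical helper: joins items with a separator without branching in the loop.
    public static string JoinWithoutIf(string[] items, string separator)
    {
        var builder = new StringBuilder();
        string prefix = "";              // nothing is written before the first item
        foreach (string item in items)
        {
            builder.Append(prefix).Append(item);
            prefix = separator;          // every later item is preceded by the separator
        }
        return builder.ToString();
    }

    public static void Main()
    {
        string[] words = "Hello world how are you".Split(' ');
        Console.WriteLine(JoinWithoutIf(words, ","));   // Hello,world,how,are,you
        // string.Join gives the same result in one call.
        Console.WriteLine(string.Join(",", words) == JoinWithoutIf(words, ","));   // True
    }
}
```

In everyday code string.Join is still the simpler choice; the loop form is mainly useful when other work has to happen per item.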

string.Split starts with a string and ends with an array of strings, and is very fast. It uses unsafe code to determine the indexes of the separators, then allocates an array of the correct size and cuts up the original string by allocating a bunch of other strings.
string.Join starts with an array of strings and ends with a string; it is also very fast and uses unsafe code. It creates a buffer and adds each item in the array to it, growing the string as it goes.
But since you want to start with a string and end with a string, your best bet is to use a method that uses unsafe code to change each ' ' to ','.
string s1 = "Hellow world how are you";
fixed (char* p = s1)
{
    for (int i = 0; i < s1.Length; i++)
    {
        if (p[i] == ' ')
        {
            p[i] = ',';
        }
    }
}
This is a really bad idea:
It only works because the source and target are the same length.
It requires unsafe code.
Since I'm mutating the string directly, all references to the string get updated.
There's probably a bunch of checks that I'm missing.
It's only marginally faster than String.Replace.
Just use String.Replace if you really need it to be very fast; it's also very safe.
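For comparison, the safe alternative recommended above is a one-liner. This is only a small sketch; ReplaceSketch and SpacesToCommas are illustrative names, and the only library call used is String.Replace(char, char).

```csharp
using System;

public static class ReplaceSketch
{
    // Hypothetical helper: swaps every space for a comma.
    public static string SpacesToCommas(string text)
    {
        // Replace returns a new string; the original instance is not modified.
        return text.Replace(' ', ',');
    }

    public static void Main()
    {
        string original = "Hello world how are you";
        string replaced = SpacesToCommas(original);
        Console.WriteLine(replaced);   // Hello,world,how,are,you
        Console.WriteLine(original);   // Hello world how are you
    }
}
```

Unlike the unsafe version, other references to the original string keep seeing the unchanged text.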

Related

Why does this function increase the memory usage?

This is my code.
private string ConvertOriginalDataListToString(List<string> originalDataList)
{
    string totalOriginalData = string.Empty;
    foreach (var eachOriginalData in originalDataList)
    {
        if (totalOriginalData == string.Empty)
        {
            totalOriginalData = eachOriginalData;
        }
        else
        {
            totalOriginalData = totalOriginalData + "," + eachOriginalData;
        }
    }
    return totalOriginalData;
}
Before running this function, the process usually uses about 60 MB of memory.
While the function is running, memory use goes over 200 MB.
[![enter image description here][1]][1]
But when I use StringBuilder instead, the memory issue is solved.
private string ConvertOriginalDataListToString(List<string> originalDataList)
{
    StringBuilder totalOriginalData = new StringBuilder();
    foreach (var eachOriginalData in originalDataList)
    {
        totalOriginalData.Append(eachOriginalData);
        totalOriginalData.Append(",");
    }
    // Drop the trailing comma; the guard avoids an exception for an empty list.
    if (totalOriginalData.Length > 0)
    {
        totalOriginalData.Remove(totalOriginalData.Length - 1, 1);
    }
    return totalOriginalData.ToString();
}
I think this issue is related to local variables and memory allocation.
Please let me know:
Why does this function use so much memory?
What is the difference between string + string and StringBuilder.Append?
Thanks!
[1]: https://i.stack.imgur.com/vTjXg.png
The StringBuilder class exists specifically for this reason. Every time you concatenate two string objects, you create a new string object that contains the combined characters of the other two. As strings get long, that obviously becomes a problem. If you do this:
var substrings = new[] {"1", "2", "3", "4", "5"};
var str = string.Empty;
foreach (var substring in substrings)
{
    str += substring;
}
then you will create the following string objects in that loop:
1
12
123
1234
12345
As you can imagine, if the substrings were long and numerous, that would eat up memory pretty quickly. There is an overhead with using a StringBuilder but I've read that, once you get to about a dozen concatenations, the additional efficiency of avoiding all the extra allocations overrides that overhead and the difference only increases. You should generally use methods of the String class, e.g. Concat or Join, in fairly simple cases and a StringBuilder directly for complex cases.
As a general rule, I never use more than a single concatenation in one place. I either use string interpolation (string.Format in older versions) or a method like string.Concat or a StringBuilder.
In your specific case, you should be using this:
var totalOriginalData = string.Join(",", originalDataList);
One line and done. You don't even need your method, because you can do that where you'd otherwise be calling your method.
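To put a rough number on the cost described in this answer, the sketch below adds up the lengths of the intermediate strings that repeated += would produce. It is a simplified model that ignores any runtime optimizations, and ConcatCostSketch and CharsCopiedByRepeatedConcat are names invented for this illustration.

```csharp
using System;

public static class ConcatCostSketch
{
    // Hypothetical model: total characters written into intermediate strings when
    // parts are appended one at a time, since each step copies the whole running result.
    public static long CharsCopiedByRepeatedConcat(string[] parts)
    {
        long total = 0;
        long runningLength = 0;
        foreach (string part in parts)
        {
            runningLength += part.Length;   // length of the new intermediate string
            total += runningLength;         // all of it is copied into a fresh object
        }
        return total;
    }

    public static void Main()
    {
        string[] substrings = { "1", "2", "3", "4", "5" };
        // Intermediates "1", "12", "123", "1234", "12345" hold 1 + 2 + 3 + 4 + 5 characters.
        Console.WriteLine(CharsCopiedByRepeatedConcat(substrings));   // 15
        // string.Join builds the final string without those intermediates.
        Console.WriteLine(string.Join(",", substrings));              // 1,2,3,4,5
    }
}
```

The total grows roughly with the square of the number of parts, which is consistent with the memory jump reported in the question for a long list.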

count occurrences in a string similar to run length encoding c#

Say I have a string like
MyString1 = "ABABABABAB";
MyString2 = "ABCDABCDABCD";
MyString3 = "ABCAABCAABCAABCA";
MyString4 = "ABABACAC";
MyString5 = "AAAAABBBBB";
and I need to get the following output
Output1 = "5(AB)";
Output2 = "3(ABCD)";
Output3 = "4(ABCA)";
Output4 = "2(AB)2(AC)";
Output5 = "5(A)5(B)";
I have been looking at RLE but I can't figure out how to do the above.
The code I have been using is
public static string Encode(string input)
{
    return Regex.Replace(input, @"(.)\1*", delegate(Match m)
    {
        return string.Concat(m.Value.Length, "(", m.Groups[1].Value, ")");
    });
}
This works for Output5 but can I do the other Outputs with Regex or should I be using something like Linq?
The purpose of the code is to display MyString in a simple manner as I can get MyString being up to a 1000 characters generally with a pattern to it.
I am not too worried about speed.
Using RLE with single characters is easy: there is never an overlap between matches. If the number of characters to repeat is variable, you have a problem:
AAABAB
Could be:
3(A)BAB
Or
AA2(AB)
You'll have to define what rules you want to apply. Do you want the absolute best compression? Does speed matter?
I doubt Regex can look forward and select "the best" combination of matches, so to answer your question I would say "no".
RLE is of no help here - it's just an extremely simple compression where you repeat a single code-point a given number of times. This was quite useful for e.g. game graphics and transparent images ("next, there's 50 transparent pixels"), but is not going to help you with variable-length code-points.
Instead, have a look at Huffman encoding. Expanding it to work with variable-length codewords is not exactly cheap, but it's a start - and it saves a lot of space, if you can afford having the table there.
But the first thing you have to ask yourself is, what are you optimizing for? Are you trying to get the shortest possible string on output? Are you going for speed? Do you want as few code-words as possible, or do you need to balance the repetitions and code-word counts in some way? In other words, what are you actually trying to do? :))
To illustrate this on your "expected" return values, Output4 results in a longer string than MyString4, so it's not the shortest possible representation. You're not trying for the fewest code-words either, because then Output5 would be 1(AAAAABBBBB). The fewest repetitions is of course silly (it would always be 1(...)). You're not optimizing for low overhead either, because that's again broken in Output4.
And whichever of those you are trying to do, I'm thinking it's not going to be possible with regular expressions. Those only work for regular languages, and encoding like this doesn't seem all that regular to me. The decoding does, of course; but I'm not so sure about the encoding.
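As the answers point out, the result depends on which rule is chosen. The sketch below assumes one possible rule, which is my own choice and not something stated in the question: at each position, pick the repeating unit that covers the most characters, prefer the shorter unit on a tie, and copy a character through unchanged when nothing repeats. RepeatEncoderSketch and its methods are hypothetical names, and the output is not guaranteed to be the shortest possible encoding.

```csharp
using System;
using System.Text;

public static class RepeatEncoderSketch
{
    // Counts how many times the unit input[start .. start+len) occurs back to back.
    private static int CountRepeats(string input, int start, int len)
    {
        int count = 1;
        int pos = start + len;
        while (pos + len <= input.Length &&
               string.CompareOrdinal(input, start, input, pos, len) == 0)
        {
            count++;
            pos += len;
        }
        return count;
    }

    // Greedy encoder under the assumed rule described above.
    public static string Encode(string input)
    {
        var result = new StringBuilder();
        int i = 0;
        while (i < input.Length)
        {
            int bestLen = 0, bestCount = 0;
            int maxLen = (input.Length - i) / 2;   // a unit must fit at least twice
            for (int len = 1; len <= maxLen; len++)
            {
                int count = CountRepeats(input, i, len);
                // Strict '>' keeps the shorter unit when two candidates cover the same span.
                if (count >= 2 && len * count > bestLen * bestCount)
                {
                    bestLen = len;
                    bestCount = count;
                }
            }
            if (bestCount >= 2)
            {
                result.Append(bestCount).Append('(').Append(input, i, bestLen).Append(')');
                i += bestLen * bestCount;
            }
            else
            {
                result.Append(input[i]);   // no repetition starts here; emit the character as is
                i++;
            }
        }
        return result.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Encode("ABABABABAB"));         // 5(AB)
        Console.WriteLine(Encode("ABCAABCAABCAABCA"));   // 4(ABCA)
        Console.WriteLine(Encode("ABABACAC"));           // 2(AB)2(AC)
        Console.WriteLine(Encode("AAAAABBBBB"));         // 5(A)5(B)
        Console.WriteLine(Encode("AAABAB"));             // 3(A)BAB
    }
}
```

Note that the encoded form becomes ambiguous if the input itself contains digits or parentheses, so a real format would need some escaping.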
Here is a Non-Regex way given the data that you provided. I'm not sure of any edge cases, right now, that would trip this code up. If so, I'll update accordingly.
string myString1 = "ABABABABAB";
string myString2 = "ABCDABCDABCD";
string myString3 = "ABCAABCAABCAABCA";
string myString4 = "ABABACAC";
string myString5 = "AAAAABBBBB";
CountGroupOccurrences(myString1, "AB");
CountGroupOccurrences(myString2, "ABCD");
CountGroupOccurrences(myString3, "ABCA");
CountGroupOccurrences(myString4, "AB", "AC");
CountGroupOccurrences(myString5, "A", "B");
CountGroupOccurrences() looks like the following:
private static void CountGroupOccurrences(string str, params string[] patterns)
{
    string result = string.Empty;
    while (str.Length > 0)
    {
        foreach (string pattern in patterns)
        {
            int count = 0;
            int index = str.IndexOf(pattern);
            while (index > -1)
            {
                count++;
                str = str.Remove(index, pattern.Length);
                index = str.IndexOf(pattern);
            }
            result += string.Format("{0}({1})", count, pattern);
        }
    }
    Console.WriteLine(result);
}
Results:
5(AB)
3(ABCD)
4(ABCA)
2(AB)2(AC)
5(A)5(B)
UPDATE
This worked with Regex
private static void CountGroupOccurrences(string str, params string[] patterns)
{
    string result = string.Empty;
    foreach (string pattern in patterns)
    {
        result += string.Format("{0}({1})", Regex.Matches(str, pattern).Count, pattern);
    }
    Console.WriteLine(result);
}

Proper way in C# to combine an arbitrary number of strings into a single string

I breezed through the documentation for the string class and didn't see any good tools for combining an arbitrary number of strings into a single string. The best procedure I could come up with in my program is
string[] assetUrlPieces = { Server.MapPath("~/assets/"),
                            "organizationName/",
                            "categoryName/",
                            (Guid.NewGuid().ToString() + "/"),
                            (Path.GetFileNameWithoutExtension(file.FileName) + "/")
                          };
string assetUrl = combinedString(assetUrlPieces);
private string combinedString(string[] pieces)
{
    string alltogether = "";
    // Append each piece to the running result.
    foreach (string thispiece in pieces) alltogether += thispiece;
    return alltogether;
}
but that seems like too much code and too much inefficiency (from the string addition) and awkwardness.
If you want to insert a separator between values, string.Join is your friend. If you just want to concatenate the strings, then you can use string.Concat:
string assetUrl = string.Concat(assetUrlPieces);
That's marginally simpler (and possibly more efficient, but probably insignificantly) than calling string.Join with an empty separator.
As noted in comments, if you're actually building up the array at the same point in the code that you do the concatenation, and you don't need the array for anything else, just use concatenation directly:
string assetUrl = Server.MapPath("~/assets/") +
                  "organizationName/" +
                  "categoryName/" +
                  Guid.NewGuid() + "/" +
                  Path.GetFileNameWithoutExtension(file.FileName) + "/";
... or potentially use string.Format instead.
I prefer using string.Join:
var result = string.Join("", pieces);
You can read about string.Join on MSDN
You want a StringBuilder, I think.
var sb = new StringBuilder(pieces.Count());
foreach (var s in pieces) {
    sb.Append(s);
}
return sb.ToString();
Update
@FiredFromAmazon.com: I think you'll want to go with the string.Concat solution offered by others, for two reasons:
Its sheer simplicity.
Higher performance. Under the hood, it uses FillStringChecked, which does pointer copies, whereas string.Join uses StringBuilder. See http://referencesource.microsoft.com/#mscorlib/system/string.cs,1512. (Thank you to @Bas.)
string.Concat is the most appropriate method for what you want.
var result = string.Concat(pieces);
Unless you want to put delimiters between the individual strings. Then you'd use string.Join
var result = string.Join(",", pieces); // comma delimited result.
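The two calls recommended in these answers can be put side by side in a small sketch. ConcatVsJoinSketch and its two wrapper methods are illustrative names only, and the sample path pieces stand in for the asker's real values.

```csharp
using System;

public static class ConcatVsJoinSketch
{
    // Hypothetical wrapper: plain concatenation with nothing between the pieces.
    public static string Combine(string[] pieces)
    {
        return string.Concat(pieces);
    }

    // Hypothetical wrapper: concatenation with a separator between the pieces.
    public static string CombineWith(string separator, string[] pieces)
    {
        return string.Join(separator, pieces);
    }

    public static void Main()
    {
        string[] pieces = { "assets/", "organizationName/", "categoryName/" };
        Console.WriteLine(Combine(pieces));                               // assets/organizationName/categoryName/
        Console.WriteLine(CombineWith(",", new[] { "a", "b", "c" }));     // a,b,c
        // Joining with an empty separator gives the same text as Concat.
        Console.WriteLine(CombineWith("", pieces) == Combine(pieces));    // True
    }
}
```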
A simple way to do this with a regular for loop
(since you can use the indices, plus I like these loops better than foreach loops):
private string combinedString(string[] pieces)
{
    string alltogether = "";
    for (int index = 0; index <= pieces.Length - 1; index++) {
        if (index != pieces.Length - 1) {
            alltogether += string.Format("{0}/", pieces[index]);
        } else {
            // Assumed completion: the last piece is appended without a trailing slash.
            alltogether += pieces[index];
        }
    }
    return alltogether;
}

How to split a string while preserving line endings?

I have a block of text and I want to get its lines without losing the \r and \n at the end. Right now, I have the following (suboptimal code):
string[] lines = tbIn.Text.Split('\n')
.Select(t => t.Replace("\r", "\r\n")).ToArray();
So I'm wondering - is there a better way to do it?
The following seems to do the job:
string[] lines = Regex.Split(tbIn.Text, @"(?<=\r\n)(?!$)");
(?<=\r\n) uses 'positive lookbehind' to match after \r\n without consuming it.
(?!$) uses negative lookahead to prevent matching at the end of the input and so avoids a final line that is just an empty string.
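A small sketch of the lookbehind split in action, using a literal string instead of the TextBox from the question. LookbehindSplitSketch and SplitKeepingCrLf are illustrative names; the pattern is the one from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookbehindSplitSketch
{
    // Hypothetical helper: splits after each "\r\n" while keeping it on the preceding line.
    public static string[] SplitKeepingCrLf(string text)
    {
        return Regex.Split(text, @"(?<=\r\n)(?!$)");
    }

    public static void Main()
    {
        foreach (string line in SplitKeepingCrLf("one\r\ntwo\r\nthree"))
        {
            // Make the line endings visible when printing.
            Console.WriteLine(line.Replace("\r", "\\r").Replace("\n", "\\n"));
        }
        // one\r\n
        // two\r\n
        // three
    }
}
```

Because the pattern only matches an empty position, nothing is consumed and every piece keeps its original line ending.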
Something along the lines of using this regular expression:
[^\n\r]*\r\n
Then use Regex.Matches().
The problem is you need Group(1) out of each match and create your string list from that. In Python you'd just use the map() function. Not sure the best way to do it in .NET, you take it from there ;-)
Dmitri, your solution is actually pretty compact and straightforward. The only thing more efficient would be to keep the string-splitting characters in the generated array, but the APIs simply don't allow for that. As a result, every solution will require iterating over the array and performing some kind of modification (which in C# means allocating new strings every time). I think the best you can hope for is to not re-create the array:
string[] lines = tbIn.Text.Split('\n');
for (int i = 0; i < lines.Length; ++i)
{
    lines[i] = lines[i].Replace("\r", "\r\n");
}
... but as you can see that looks a lot more cumbersome! If performance matters, this may be a bit better. If it really matters, you should consider manually parsing the string by using IndexOf() to find the '\r's one at a time, and then create the array yourself. This is significantly more code, though, and probably not necessary.
One of the side effects of both your solution and this one is that you won't get a terminating "\r\n" on the last line if there wasn't one already there in the TextBox. Is this what you expect? What about blank lines... do you expect them to show up in 'lines'?
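The manual IndexOf approach mentioned in this answer could look roughly like the sketch below. It searches for '\n' rather than '\r' (my own simplification, so that a preceding '\r' stays attached to its line), and the names IndexOfSplitSketch and SplitKeepingLineEndings are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class IndexOfSplitSketch
{
    // Hypothetical helper: cuts the text after every '\n', keeping the line ending
    // (including a preceding '\r') on the line it terminates.
    public static string[] SplitKeepingLineEndings(string text)
    {
        var lines = new List<string>();
        int start = 0;
        while (start < text.Length)
        {
            int newline = text.IndexOf('\n', start);
            if (newline < 0)
            {
                lines.Add(text.Substring(start));   // last line without a terminator
                break;
            }
            lines.Add(text.Substring(start, newline - start + 1));
            start = newline + 1;
        }
        return lines.ToArray();
    }

    public static void Main()
    {
        string[] lines = SplitKeepingLineEndings("one\r\ntwo\r\n\r\nthree");
        Console.WriteLine(lines.Length);   // 4
        foreach (string line in lines)
        {
            Console.WriteLine(line.Replace("\r", "\\r").Replace("\n", "\\n"));
        }
        // one\r\n
        // two\r\n
        // \r\n
        // three
    }
}
```

In this sketch blank lines show up as entries that contain only the line ending, and a final line without a terminator is returned as is, which speaks to the two questions raised above.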
If you are just going to replace the newline (\n) then do something like this:
string[] lines = tbIn.Text.Split('\n')
.Select(t => t + "\r\n").ToArray();
Edit: Regex.Split allows you to split on a string.
string[] lines = Regex.Split(tbIn.Text, "\r\n")
.Select(t => t + "\r\n").ToArray();
As always, extension method goodies :)
public static class StringExtensions
{
    public static IEnumerable<string> SplitAndKeep(this string s, string seperator)
    {
        string[] obj = s.Split(new string[] { seperator }, StringSplitOptions.None);
        for (int i = 0; i < obj.Length; i++)
        {
            string result = i == obj.Length - 1 ? obj[i] : obj[i] + seperator;
            yield return result;
        }
    }
}
usage:
string text = "One,Two,Three,Four";
foreach (var s in text.SplitAndKeep(","))
{
Console.WriteLine(s);
}
Output:
One,
Two,
Three,
Four
You can achieve this with a regular expression. Here's an extension method with it:
public static string[] SplitAndKeepDelimiter(this string input, string delimiter)
{
    MatchCollection matches = Regex.Matches(input, @"[^" + delimiter + "]+(" + delimiter + "|$)", RegexOptions.Multiline);
    string[] result = new string[matches.Count];
    for (int i = 0; i < matches.Count; i++)
    {
        result[i] = matches[i].Value;
    }
    return result;
}
I'm not sure if this is a better solution. Yours is very compact and simple.

Java equivalents of C# String.Format() and String.Join()

I know this is a bit of a newbie question, but are there equivalents to C#'s string operations in Java?
Specifically, I'm talking about String.Format and String.Join.
The Java String object has a format method (as of 1.5), but no join method.
To get a bunch of useful String utility methods not already included you could use org.apache.commons.lang.StringUtils.
String.format. As for join, you need to write your own:
static String join(Collection<?> s, String delimiter) {
    StringBuilder builder = new StringBuilder();
    Iterator<?> iter = s.iterator();
    while (iter.hasNext()) {
        builder.append(iter.next());
        if (!iter.hasNext()) {
            break;
        }
        builder.append(delimiter);
    }
    return builder.toString();
}
The above comes from http://snippets.dzone.com/posts/show/91
Guava comes with the Joiner class.
import com.google.common.base.Joiner;
Joiner.on(separator).join(data);
As of Java 8, join() is now available as two class methods on the String class. In both cases the first argument is the delimiter.
You can pass individual CharSequences as additional arguments:
String joined = String.join(", ", "Antimony", "Arsenic", "Aluminum", "Selenium");
// "Antimony, Arsenic, Aluminum, Selenium"
Or you can pass an Iterable<? extends CharSequence>:
List<String> strings = new LinkedList<String>();
strings.add("EX");
strings.add("TER");
strings.add("MIN");
strings.add("ATE");
String joined = String.join("-", strings);
// "EX-TER-MIN-ATE"
Java 8 also adds a new class, StringJoiner, which you can use like this:
StringJoiner joiner = new StringJoiner("&");
joiner.add("x=9");
joiner.add("y=5667.7");
joiner.add("z=-33.0");
String joined = joiner.toString();
// "x=9&y=5667.7&z=-33.0"
TextUtils.join is available on Android
You can also use variable arguments for strings as follows:
String join (String delim, String ... data) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < data.length; i++) {
        sb.append(data[i]);
        if (i >= data.length-1) {break;}
        sb.append(delim);
    }
    return sb.toString();
}
As for join, I believe this might look a little less complicated:
public String join (Collection<String> c) {
StringBuilder sb=new StringBuilder();
for(String s: c)
sb.append(s);
return sb.toString();
}
I don't get to use Java 5 syntax as much as I'd like (Believe it or not, I've been using 1.0.x lately) so I may be a bit rusty, but I'm sure the concept is correct.
edit addition: String appends can be slowish, but if you are working on GUI code or some short-running routine, it really doesn't matter if you take .005 seconds or .006, so if you had a collection called "joinMe" that you want to append to an existing string "target" it wouldn't be horrific to just inline this:
for(String s : joinMe)
target += s;
It's quite inefficient (and a bad habit), but not anything you will be able to perceive unless there are either thousands of strings or this is inside a huge loop or your code is really performance critical.
More importantly, it's easy to remember, short, quick and very readable. Performance isn't always the automatic winner in design choices.
Here is a pretty simple answer. Use += since it is less code and let the optimizer convert it to a StringBuilder for you. Using this method, you don't have to do any "is last" checks in your loop (performance improvement) and you don't have to worry about stripping off any delimiters at the end.
Iterator<String> iter = args.iterator();
output += iter.hasNext() ? iter.next() : "";
while (iter.hasNext()) {
output += "," + iter.next();
}
I didn't want to import an entire Apache library to add a simple join function, so here's my hack.
public String join(String delim, List<String> destinations) {
    StringBuilder sb = new StringBuilder();
    int delimLength = delim.length();
    for (String s: destinations) {
        sb.append(s);
        sb.append(delim);
    }
    // we have appended the delimiter to the end
    // in the previous for-loop. Let's now remove it.
    if (sb.length() >= delimLength) {
        return sb.substring(0, sb.length() - delimLength);
    } else {
        return sb.toString();
    }
}
If you wish to join (concatenate) several strings into one, you should use a StringBuilder. It is far better than using
for(String s : joinMe)
target += s;
There is also a slight performance win over StringBuffer, since StringBuilder does not use synchronization.
For a general purpose utility method like this, it will (eventually) be called many times in many situations, so you should make it efficient and not allocate many transient objects. We've profiled many, many different Java apps and almost always find that string concatenation and string/char[] allocations take up a significant amount of time/memory.
Our reusable collection -> string method first calculates the size of the required result and then creates a StringBuilder with that initial size; this avoids unnecessary doubling/copying of the internal char[] used when appending strings.
I wrote my own:
public static String join(Collection<String> col, String delim) {
    StringBuilder sb = new StringBuilder();
    Iterator<String> iter = col.iterator();
    if (iter.hasNext())
        sb.append(iter.next().toString());
    while (iter.hasNext()) {
        sb.append(delim);
        sb.append(iter.next().toString());
    }
    return sb.toString();
}
but Collection isn't supported by JSP, so for the tag function I wrote:
public static String join(List<?> list, String delim) {
    int len = list.size();
    if (len == 0)
        return "";
    StringBuilder sb = new StringBuilder(list.get(0).toString());
    for (int i = 1; i < len; i++) {
        sb.append(delim);
        sb.append(list.get(i).toString());
    }
    return sb.toString();
}
and put to .tld file:
<?xml version="1.0" encoding="UTF-8"?>
<taglib version="2.1" xmlns="http://java.sun.com/xml/ns/javaee">
<function>
<name>join</name>
<function-class>com.core.util.ReportUtil</function-class>
<function-signature>java.lang.String join(java.util.List, java.lang.String)</function-signature>
</function>
</taglib>
and use it in JSP files as:
<%@taglib prefix="funnyFmt" uri="tag:com.core.util,2013:funnyFmt"%>
${funnyFmt:join(books, ", ")}
StringUtils is a pretty useful class in the Apache Commons Lang library.
There is MessageFormat.format() which works like C#'s String.Format().
I see a lot of overly complex implementations of String.Join here. If you don't have Java 1.8, and you don't want to import a new library the below implementation should suffice.
public String join(Collection<String> col, String delim) {
    StringBuilder sb = new StringBuilder();
    for ( String s : col ) {
        if ( sb.length() != 0 ) sb.append(delim);
        sb.append(s);
    }
    return sb.toString();
}
ArrayList<Double> j = new ArrayList<>();
j.add(1.0);
j.add(.92);
j.add(3.0);
String ntop = j.toString(); // ntop = "[1.0, 0.92, 3.0]"
So basically, the String ntop stores the value of the entire collection with comma separators and brackets.
I would just use the string concatenation operator "+" to join two strings. s1 += s2;
