Split a string containing digits - c#

I have a string like
"abc kskd 8.900 prew"
and I need to split it so that I get "abc kskd" and "8.900 prew" as the result.
How can I achieve this with C#?

Get the index of the first digit using LINQ, then use Substring:
var input = "abc kskd 8.900 prew";
var index = input.Select( (x,idx) => new {x, idx})
.Where(c => char.IsDigit(c.x))
.Select(c => c.idx)
.First();
var part1 = input.Substring(0, index);
var part2 = input.Substring(index);

This should do if you don't need to do something complicated:
var data = "abc kskd 8.900 prew";
var digits = "0123456789".ToCharArray();
var idx = data.IndexOfAny(digits);
if (idx != -1)
{
// TrimEnd drops the space before the number and also works when idx is 0.
var firstPart = data.Substring(0, idx).TrimEnd();
var secondPart = data.Substring(idx);
}
IndexOfAny is actually very fast.
This could also be modified to separate the string into more parts (using the startIndex parameter), but you didn't ask for that.
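As a rough sketch of the multi-part variant mentioned above, the loop below keeps calling IndexOfAny with a moving startIndex and cuts the string in front of every number. The class and method names (DigitSplitter, SplitBeforeNumbers) are made up for illustration, and the sketch assumes that a number runs until the next whitespace character.

```csharp
using System;
using System.Collections.Generic;

public static class DigitSplitter
{
    private static readonly char[] Digits = "0123456789".ToCharArray();

    // Splits the input in front of every token that contains a digit.
    // Assumption: a number such as "8.900" ends at the next whitespace.
    public static List<string> SplitBeforeNumbers(string data)
    {
        var parts = new List<string>();
        int partStart = 0;   // where the current part begins
        int searchFrom = 0;  // startIndex passed to IndexOfAny
        while (searchFrom < data.Length)
        {
            int idx = data.IndexOfAny(Digits, searchFrom);
            if (idx == -1) break;
            if (idx > partStart)
            {
                parts.Add(data.Substring(partStart, idx - partStart).Trim());
                partStart = idx;
            }
            // Skip the rest of the current token before searching again.
            searchFrom = idx;
            while (searchFrom < data.Length && !char.IsWhiteSpace(data[searchFrom]))
            {
                searchFrom++;
            }
        }
        parts.Add(data.Substring(partStart).Trim());
        return parts;
    }

    public static void Main()
    {
        foreach (var part in SplitBeforeNumbers("abc kskd 8.900 prew 42 xyz"))
        {
            Console.WriteLine(part);
        }
    }
}
```

For the sample input this yields the three parts "abc kskd", "8.900 prew" and "42 xyz".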

Straightforward with a regular expression:
var str = "abc kskd 8.900 prew";
var result = Regex.Split(str, @"\W(\d.*)").Where(x => x!="").ToArray();

Try this,
public string[] SplitText(string text)
{
    var digits = "0123456789".ToCharArray();
    var startIndex = 0;
    while (startIndex < text.Length)
    {
        var index = text.IndexOfAny(digits, startIndex);
        if (index < 0)
        {
            break;
        }
        // Search backwards from the digit for the nearest space.
        var spaceIndex = text.LastIndexOf(' ', index);
        if (spaceIndex >= 0)
        {
            return new String[] { text.Substring(0, spaceIndex), text.Substring(spaceIndex + 1) };
        }
        // No space before this digit: continue after it to avoid looping forever.
        startIndex = index + 1;
    }
    return new String[] { text };
}

Something similar to what @Dominic Kexel provided, but only if you don't want to use LINQ.
string[] result = Regex.Split("abc kskd 8.900 prew", @"\w*(?=\d+\.\d)");

Related

Get count of unique characters between first and last letter

I'm trying to get the count of unique characters between the first and last letter of a word. For example, if I type Yellow the expected output is Y3w, if I type People the output should be P4e, and if I type Money the output should be M3y. This is what I tried:
//var strArr = wordToConvert.Split(' ');
string[] strArr = new[] { "Money","Yellow", "People" };
List<string> newsentence = new List<string>();
foreach (string word in strArr)
{
if (word.Length > 2)
{
//ignore 2-letter words
string newword = null;
int distinctCount = 0;
int k = word.Length;
int samecharcount = 0;
int count = 0;
for (int i = 1; i < k - 2; i++)
{
if (word.ElementAt(i) != word.ElementAt(i + 1))
{
count++;
}
else
{
samecharcount++;
}
}
distinctCount = count + samecharcount;
char frst = word[0];
char last = word[word.Length - 1];
newword = String.Concat(frst, distinctCount.ToString(), last);
newsentence.Add(newword);
}
else
{
newsentence.Add(word);
}
}
var result = String.Join(" ", newsentence.ToArray());
Console.WriteLine("Output: " + result);
Console.WriteLine("----------------------------------------------------");
With this code I get the expected output for Yellow, but it does not work for People and Money. How can I fix this? I'm also wondering whether there is a better way to do this, for example using LINQ or Regex.
Here's an implementation that uses Linq:
string[] strArr = new[]{"Money", "Yellow", "People"};
List<string> newsentence = new List<string>();
foreach (string word in strArr)
{
if (word.Length > 2)
{
// we want the first letter, the last letter, and the distinct count of everything in between
var first = word.First();
var last = word.Last();
var others = word.Skip(1).Take(word.Length - 2);
// Case sensitive
var distinct = others.Distinct();
// Case insensitive
// var distinct = others.Select(c => char.ToLowerInvariant(c)).Distinct();
string newword = first + distinct.Count().ToString() + last;
newsentence.Add(newword);
}
else
{
newsentence.Add(word);
}
}
var result = String.Join(" ", newsentence.ToArray());
Console.WriteLine(result);
Output:
M3y Y3w P4e
Note that this comparison is case sensitive, so the count for FiIsSh is 4 (the result is F4h).
Maybe not the most performant, but here is another example using linq:
var words = new[] { "Money","Yellow", "People" };
var transformedWords = words.Select(Transform);
var sentence = String.Join(' ', transformedWords);
public string Transform(string input)
{
if (input.Length < 3)
{
return input;
}
var count = input.Skip(1).SkipLast(1).Distinct().Count();
return $"{input[0]}{count}{input[^1]}";
}
You can implement it with the help of Linq. e.g. (C# 8+)
private static string EncodeWord(string value) => value.Length <= 2
? value
: $"{value[0]}{value.Substring(1, value.Length - 2).Distinct().Count()}{value[^1]}";
Demo:
string[] tests = new string[] {
"Money","Yellow", "People"
};
var report = string.Join(Environment.NewLine, tests
.Select(test => $"{test} :: {EncodeWord(test)}"));
Console.Write(report);
Outcome:
Money :: M3y
Yellow :: Y3w
People :: P4e
A lot of people have put up some good solutions. I have two solutions for you: one uses LINQ and the other does not.
LINQ, Probably not much different from others
if (str.Length < 3) return str;
var midStr = str.Substring(1, str.Length - 2);
var midCount = midStr.Distinct().Count();
return string.Concat(str[0], midCount, str[str.Length - 1]);
Non-LINQ
if (str.Length < 3) return str;
var uniqueLetters = new Dictionary<char, int>();
var midStr = str.Substring(1, str.Length - 2);
foreach (var c in midStr)
{
if (!uniqueLetters.ContainsKey(c))
{
uniqueLetters.Add(c, 0);
}
}
var midCount = uniqueLetters.Count; // Count property, so no LINQ is needed
return string.Concat(str[0], midCount, str[str.Length - 1]);
I tested this with the following 6 strings:
Yellow
Money
Purple
Me
You
Hiiiiiiiii
Output:
LINQ: Y3w, Non-LINQ: Y3w
LINQ: M3y, Non-LINQ: M3y
LINQ: P4e, Non-LINQ: P4e
LINQ: Me, Non-LINQ: Me
LINQ: Y1u, Non-LINQ: Y1u
LINQ: H1i, Non-LINQ: H1i
Performance-wise I'd guess they're pretty much the same, if not identical, but I haven't run any real perf test on the two approaches. The only real difference is that the second route expands Distinct() into what it probably does under the covers anyway (I haven't looked at the source to see if that's true, but that's a pretty common way to get a count of unique items). And the first route is certainly less code.
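A slightly shorter way to write the non-LINQ idea is to collect the middle characters in a HashSet<char>, which ignores duplicates by itself. This is only a sketch: the names WordEncoder and Encode are made up, and it makes no claim about how Distinct() is actually implemented.

```csharp
using System;
using System.Collections.Generic;

public static class WordEncoder
{
    // Returns first letter + number of distinct middle characters + last letter.
    // Words shorter than 3 characters are returned unchanged.
    public static string Encode(string str)
    {
        if (str.Length < 3) return str;

        var seen = new HashSet<char>();
        for (int i = 1; i < str.Length - 1; i++)
        {
            seen.Add(str[i]); // Add does nothing if the character is already present
        }
        return string.Concat(str[0], seen.Count, str[str.Length - 1]);
    }

    public static void Main()
    {
        foreach (var word in new[] { "Yellow", "Money", "People", "Me", "Hiiiiiiiii" })
        {
            Console.WriteLine(Encode(word));
        }
    }
}
```

Because the set only stores each character once, no separate ContainsKey check is needed.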
I would use LINQ for that purpose:
string[] words = new string[] { "Yellow" , "People", "Money", "Sh" }; // "Sh" covers 2-letter words (or you can print the 0 and remove the ternary operator)
foreach (string word in words)
{
int uniqueCharsInBetween = word.Substring(1, word.Length - 2).ToCharArray().Distinct().Count();
string result = word[0] + (uniqueCharsInBetween == 0 ? string.Empty : uniqueCharsInBetween.ToString()) + word[word.Length - 1];
Console.WriteLine(result);
}

Regex to split by a Targeted String up to a certain character

I have an LDAP query from which I need to build the domain.
So I want to split by "DC=" up to the next comma.
INPUT:
LDAP://DC=SOMETHINGS,DC=ELSE,DC=NET\account
RESULT:
SOMETHINGS.ELSE.NET
You can do it quite simply using the DC=(\w*) regex pattern.
var str = @"LDAP://DC=SOMETHINGS,DC=ELSE,DC=NET\account";
var result = String.Join(".", Regex.Matches(str, @"DC=(\w*)")
.Cast<Match>()
.Select(m => m.Groups[1].Value));
Without Regex you can do:
string ldapStr = @"LDAP://DC=SOMETHINGS,DC=ELSE,DC=NET\account";
int startIndex = ldapStr.IndexOf("DC=");
// End at the backslash so that the last component (NET) is included.
int endIndex = ldapStr.IndexOf('\\');
string output = null;
if (startIndex >= 0 && endIndex > startIndex)
{
    string domainComponentStr = ldapStr.Substring(startIndex, endIndex - startIndex);
    output = String.Join(".", domainComponentStr.Split(new[] {"DC=", ","}, StringSplitOptions.RemoveEmptyEntries));
}
If you are always going to get the string in a similar format, then you can also do:
string ldapStr = @"LDAP://DC=SOMETHINGS,DC=ELSE,DC=NET\account";
var outputStr = String.Join(".", ldapStr.Split(new[] {"DC=", ",","\\"}, StringSplitOptions.RemoveEmptyEntries)
.Skip(1)
.Take(3));
And you will get:
outputStr = "SOMETHINGS.ELSE.NET"

How to get a specific part of a string in C#

I have a string
string a = "(something is there),xyz,(something there)";
and, I use this
string s = "(something is there),xyz,(something there)";
int start = s.IndexOf("(") + 1;
int end = s.IndexOf(")", start);
string result = s.Substring(start, end - start);
but I want to get the second part, (something there).
How can I do it?
a.Split("(),".ToCharArray(),StringSplitOptions.RemoveEmptyEntries);
This will return an array with 3 strings: something is there, xyz, and something there
Not sure what exactly you're doing around this, however this does it in this specific case:
var last = s.Split(',').Last(); // "(something there)"
Or more verbosely for explanation:
var s = "(something is there),xyz,(something there)";
var split = s.Split(','); // [ "(something is there)", "xyz", "(something there)" ]
var last = split.Last(); // "(something there)"
And if you don't want the brackets (parentheses):
var content = last.Trim('(', ')'); // "something there"
If "last" is the same as "second" in this case you can use String.LastIndexOf:
string lastPart = null;
int lastStartIndex = a.LastIndexOf('(');
if (lastStartIndex >= 0)
{
int lastEndIndex = a.LastIndexOf(')');
if (lastEndIndex >= 0)
lastPart = a.Substring(++lastStartIndex, lastEndIndex - lastStartIndex);
}
Here is a solution which extracts all tokens from the string into a List<string>:
int startIndex = -1, endIndex = -1;
var tokens = new List<string>();
while (true)
{
startIndex = a.IndexOf('(', ++endIndex);
if (startIndex == -1) break;
endIndex = a.IndexOf(')', ++startIndex);
if (endIndex == -1) break;
tokens.Add(a.Substring(startIndex, endIndex - startIndex));
}
So now you could use the indexer or Enumerable.ElementAtOrDefault:
string first = tokens[0];
string second = tokens.ElementAtOrDefault(1);
If the list is too small, ElementAtOrDefault returns null (the indexer would throw instead). If you just want the last token, use tokens.Last().
You can use this:
string s = "(something is there),xyz,(something there)";
var start = s.Split(',')[2];
You can also use:
string s = "(something is there),xyz,(something there)";
Regex regex = new Regex(@"\([^()]*\)(?=[^()]*$)");
Match match = regex.Match(s);
var result = match.Value;
You could use the following if you just want the text:
var s = "(something is there),xyz,(something there)";
var splits = s.Split('(');
var text = splits[2].Trim(')');
If you want to get the text between the second '(' and ')', use the second parameter of IndexOf, which sets the starting index for the search:
start = s.IndexOf("(", end) + 1;
end = s.IndexOf(")", start);
string secondResult = s.Substring(start, end - start);
If you want to get the string after the last ) use this code:
string otherPart = s.Substring(end+1);

Splitting a string into a string array?

I am facing a problem while executing a SQL query in C#. The query throws an error when the IN clause contains more than 1000 entries. The string has more than 1000 substrings, each separated by ','.
I want to split the string into a string array, with each element containing 999 substrings separated by ','.
or
How can I find the nth occurrence of ',' in a string?
Pull the string from SQL Server using utility code like:
string strResult = String.Empty;
using (SqlCommand cmd = new SqlCommand())
{
cmd.Connection = conn;
cmd.CommandText = strSQL;
strResult = cmd.ExecuteScalar().ToString();
}
Get the returned string from SQL Server and split it on the ',':
string[] strResultArr = strResult.Split(',');
Then, to get the nth string separated by ',' (I think this is what you mean by "How can I find the nth occurrence of ',' in a string"), use
int n = someInt;
string nthEntry = strResultArr[n - 1];
I hope this helps.
You could use a regular expression and the Index property of the Match class:
// Long string of 2000 elements, separated by ','
var s = String.Join(",", Enumerable.Range(0,2000).Select (e => e.ToString()));
// find all ',' and use the '.Index' property to find the position in the string
// to find the first occurrence, n has to be 0, etc.
var nth_position = Regex.Matches(s, ",")[n].Index;
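If a regular expression feels heavy for this, the same position can be found by calling IndexOf repeatedly. The sketch below uses made-up names (NthOccurrence, IndexOfNth) and keeps the zero-based convention from the snippet above, so n = 0 means the first comma.

```csharp
using System;

public static class NthOccurrence
{
    // Returns the index of the n-th (zero-based) occurrence of 'value' in 's',
    // or -1 if there are not enough occurrences.
    public static int IndexOfNth(string s, char value, int n)
    {
        int index = -1;
        for (int i = 0; i <= n; i++)
        {
            // Continue searching right after the previous hit.
            index = s.IndexOf(value, index + 1);
            if (index == -1) return -1;
        }
        return index;
    }

    public static void Main()
    {
        Console.WriteLine(IndexOfNth("a,b,c,d", ',', 0));
        Console.WriteLine(IndexOfNth("a,b,c,d", ',', 2));
        Console.WriteLine(IndexOfNth("a,b,c,d", ',', 3));
    }
}
```

Unlike indexing into Regex.Matches, this returns -1 instead of throwing when the string contains fewer commas than requested.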
To create an array of strings of your required size, you could split your string, use LINQ's GroupBy to partition the result, and then join the resulting groups together:
var result = s.Split(',').Select((x, i) => new {Group = i/1000, Value = x})
.GroupBy(item => item.Group, g => g.Value)
.Select(g => String.Join(",", g));
result now contains two strings, each with 1000 comma-separated elements.
How's this:
int groupSize = 1000;
string[] parts = s.Split(',');
int numGroups = parts.Length / groupSize + (parts.Length % groupSize != 0 ? 1 : 0);
List<string[]> Groups = new List<string[]>();
for (int i = 0; i < numGroups; i++)
{
Groups.Add(parts.Skip(i * groupSize).Take(groupSize).ToArray());
}
Maybe something like this:
string line = "1,2,3,4";
var splitted = line.Split(new[] {','}).Select((x, i) => new {
Element = x,
Index = i
})
.GroupBy(x => x.Index / 1000)
.Select(x => x.Select(y => y.Element).ToList())
.ToList();
After this you should just String.Join each IList<string>.
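The final joining step could look roughly like the sketch below, which does the grouping and the String.Join in one pipeline. InClauseChunks and SplitIntoChunks are hypothetical names, and the chunk size is a parameter so that 999 or 1000 can be passed as needed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class InClauseChunks
{
    // Splits a comma-separated string into chunks of at most 'size' entries
    // and joins each chunk back into a comma-separated string.
    public static List<string> SplitIntoChunks(string line, int size)
    {
        return line.Split(',')
            .Select((x, i) => new { Element = x, Index = i })
            .GroupBy(x => x.Index / size)
            .Select(g => string.Join(",", g.Select(y => y.Element)))
            .ToList();
    }

    public static void Main()
    {
        foreach (var chunk in SplitIntoChunks("1,2,3,4,5", 2))
        {
            Console.WriteLine(chunk);
        }
    }
}
```

With the sample input and a size of 2, the chunks are "1,2", "3,4" and "5".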
//initial string of 10000 entries divided by commas
string s = string.Join(", ", Enumerable.Range(0, 10000));
//an array of entries, from the original string
var ss = s.Split(',');
//auxiliary index
int index = 0;
//divide into groups by 1000 entries
var words = ss.GroupBy(w =>
{
try
{
return index / 1000;
}
finally
{
++index;
}
})//join groups into "words"
.Select(g => string.Join(",", g));
//print each word
foreach (var word in words)
Console.WriteLine(word);
Or you may find the indices in the string and split it into substrings afterwards:
string s = string.Join(", ", Enumerable.Range(0, 100));
int index = 0;
var indeces =
Enumerable.Range(0, s.Length - 1).Where(i =>
{
if (s[i] == ',')
{
if (index < 9)
++index;
else
{
index = 0;
return true;
}
}
return false;
}).ToList();
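Console.WriteLine(s.Substring(0, indeces[0]));
for (int i = 0; i < indeces.Count - 1; i++)
{
    // Skip the comma itself at indeces[i].
    Console.WriteLine(s.Substring(indeces[i] + 1, indeces[i + 1] - indeces[i] - 1));
}
// Print the remainder after the last split point.
Console.WriteLine(s.Substring(indeces[indeces.Count - 1] + 1));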
However, I would consider whether it is possible to work with the entries before they are combined into one string, and whether the query can be redesigned so that such a large list does not have to be passed to the IN clause at all.
string foo = "a,b,c";
string [] foos = foo.Split(new char [] {','});
foreach(var item in foos)
{
Console.WriteLine(item);
}

Split a separated string into hierarchy using c# and linq

I have a string separated by dot ('.') characters that represents a hierarchy:
string source = "Class1.StructA.StructB.StructC.FieldA";
How can I use C# and linq to split the string into separate strings to show their hierarchy? Such as:
string[] result = new string[]
{
"Class1",
"Class1.StructA",
"Class1.StructA.StructB",
"Class1.StructA.StructB.StructC",
"Class1.StructA.StructB.StructC.FieldA"
};
Split the string by the delimiters taking 1...N of the different levels and rejoin the string.
const char DELIMITER = '.';
var source = "Class1.StructA.StructB.StructC.FieldA";
var hierarchy = source.Split(DELIMITER);
var result = Enumerable.Range(1, hierarchy.Length)
.Select(i => String.Join(".", hierarchy.Take(i)))
.ToArray();
Here's a more efficient way to do this without LINQ:
const char DELIMITER = '.';
var source = "Class1.StructA.StructB.StructC.FieldA";
var result = new List<string>();
for (int i = 0; i < source.Length; i++)
{
if (source[i] == DELIMITER)
{
result.Add(source.Substring(0, i));
}
}
result.Add(source); // assuming there is no trailing delimiter
Here is a solution that uses aggregation:
const string separator = ".";
const string source = "Class1.StructA.StructB.StructC.FieldA";
// Get the components.
string[] components = source.Split(new [] { separator }, StringSplitOptions.None);
List<string> results = new List<string>();
// Aggregate with saving temporary results.
string lastResult = components.Aggregate((total, next) =>
{
results.Add(total);
return string.Join(separator, total, next);
});
results.Add(lastResult);
Here's a solution completely without LINQ:
public static string[] GetHierarchy(this string path)
{
var res = path.Split('.');
string last = null;
for (int i = 0; i < res.Length; ++i)
{
last = string.Format("{0}{1}{2}", last, last != null ? "." : null, res[i]);
res[i] = last;
}
return res;
}
Shlemiel the painter approach is better than the "super Shlemiel" string.Join in this case.
const char DELIMITER = '.';
string soFar = "";
List<string> result = source.Split(DELIMITER).Select(s =>
{
if (soFar != "") { soFar += DELIMITER; };
soFar += s;
return soFar;
}).ToList();
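The same running-prefix idea can also be written as a plain loop with a StringBuilder, which avoids mutating a captured variable inside Select. This is just a sketch with made-up names (Hierarchy, Prefixes), not a measured improvement.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class Hierarchy
{
    // Returns every prefix of a delimited path:
    // "A.B.C" -> "A", "A.B", "A.B.C".
    public static List<string> Prefixes(string source, char delimiter = '.')
    {
        var result = new List<string>();
        var soFar = new StringBuilder();
        string[] parts = source.Split(delimiter);
        for (int i = 0; i < parts.Length; i++)
        {
            if (i > 0) soFar.Append(delimiter);
            soFar.Append(parts[i]);
            result.Add(soFar.ToString()); // snapshot of the prefix built so far
        }
        return result;
    }

    public static void Main()
    {
        foreach (var prefix in Prefixes("Class1.StructA.StructB.StructC.FieldA"))
        {
            Console.WriteLine(prefix);
        }
    }
}
```

Each iteration appends only the new segment to the builder instead of re-joining all previous segments.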
