How to split a string with non-numbers as delimiter? - c#

I want to split a string in C# on whatever text it contains. For example, given the string "41sugar1100", I want to split on the text inside it, which is "sugar". How can I do this?
NOTE: I cannot pass "sugar" directly as the delimiter, because the text can change in the next iteration. Wherever text appears in the string, the split should happen on that text.

Use Regex.Split:
string input = "44sugar1100";
string pattern = "[a-zA-Z]+"; // Split on any group of letters
string[] substrings = Regex.Split(input, pattern);
foreach (string match in substrings)
{
    Console.WriteLine("'{0}'", match);
}
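Since the question asks for non-numbers as the delimiter, the same idea can also be written with `\D+`, which matches any run of non-digit characters rather than only ASCII letters. The following is a minimal sketch, not a definitive implementation; the names `SplitDemo` and `SplitOnNonDigits` are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // Hypothetical helper: splits on every run of non-digit characters.
    public static string[] SplitOnNonDigits(string input)
    {
        // Regex.Split yields an empty string when the input starts or ends
        // with a delimiter, so empty entries are filtered out here.
        return Regex.Split(input, @"\D+")
                    .Where(part => part.Length > 0)
                    .ToArray();
    }

    public static void Main()
    {
        foreach (string part in SplitOnNonDigits("41sugar1100"))
        {
            Console.WriteLine(part);
        }
        // Prints:
        // 41
        // 1100
    }
}
```

The filter matters for inputs such as "sugar41", where the delimiter sits at the start and the unfiltered split would begin with an empty string.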

char[] array = "41sugar1100".ToCharArray();
StringBuilder sb = new StringBuilder();
// Keep each digit; replace every non-digit with the marker '#' to split on later
foreach (char c in array)
    sb.Append(Char.IsNumber(c) ? c : '#');
// Split on special char '#' and remove empty string items
string[] items = sb.ToString().Split('#').Where(s => s != string.Empty).ToArray();
foreach (string item in items)
    Console.WriteLine(item);
// Output:
// 41
// 1100

Use a char[] array of delimiter characters to split the string:
string s = "44sugar1100";
char[] c = new char[] { 's', 'u', 'g', 'a', 'r' };
string[] s1 = s.Split(c, StringSplitOptions.RemoveEmptyEntries);
string s2 = string.Join(" ", s1); // "44 1100"; s1.ToString() would only return the type name "System.String[]"

Regex regex = new Regex(@"(?<firstNumber>\d+)(?<word>[^\d]+)(?<secondNumber>\d+)", RegexOptions.CultureInvariant);
string s = "41sugar1100";
Match match = regex.Match(s);
if (match.Success)
{
    string firstNumber = match.Groups["firstNumber"].Value;
    string word = match.Groups["word"].Value;
    string secondNumber = match.Groups["secondNumber"].Value;
}

I would iterate over the characters of the string
and int.TryParse each one, for example:
string myString = "44sugar1100";
int num = 0; // for storage
string newString = ""; // for rebuilding
foreach (char ch in myString)
{
    // int.TryParse expects a string, so convert the char first
    if (int.TryParse(ch.ToString(), out num))
    {
        newString += num.ToString();
    }
}

string text = "41sugar1100";
string[] array = text.Split(new[] { "sugar" }, StringSplitOptions.None);

Related

C# - Identify the matching character when using String.Split(CharArray)

If I use the Split() function on a string, passing in various split characters as a char[] parameter, and given that the matching split character is removed from the string, how can I identify which character it matched & split on?
string inputString = "Hello, there| world";
char[] splitChars = new char[] { ',', '|' };
foreach (string section in inputString.Split(splitChars))
{
    Console.WriteLine(section); // [0] Hello [1] there [2] world (no splitChars)
}
I understand that it may not be possible to retain this information with my approach. If that's the case, could you suggest an alternative approach?
The C# Regex.Split() method can return the split characters as well as the words between them, provided the delimiters are wrapped in capturing groups.
string inputString = "Hello, there| world";
string pattern = @"(,)|([|])";
foreach (string result in Regex.Split(inputString, pattern))
{
    Console.WriteLine("'{0}'", result);
}
the result is:
'Hello'
','
' there'
'|'
' world'
Use the Regex.Split() method. I have wrapped this method in the following extension method that is as easy to use as string.Split() itself:
public static string[] ExtendedSplit(this string input, char[] splitChars)
{
    string pattern = string.Join("|", splitChars.Select(x => "(" + Regex.Escape(x.ToString()) + ")"));
    return Regex.Split(input, pattern);
}
Usage:
string inputString = "Hello, there| world";
char[] splitChars = new char[]{',', '|'};
foreach (string result in inputString.ExtendedSplit(splitChars))
{
Console.WriteLine("'{0}'", result);
}
Output:
'Hello'
','
' there'
'|'
' world'
No, but it's rather trivial to write one yourself. Remember, framework methods aren't magic; somebody wrote them. If something doesn't exactly match your needs, write one that does!
static IEnumerable<(string Sector, char Separator)> Split(
    this string s,
    IEnumerable<char> separators,
    bool removeEmptyEntries)
{
    var buffer = new StringBuilder();
    var separatorsSet = new HashSet<char>(separators);
    foreach (var c in s)
    {
        if (separatorsSet.Contains(c))
        {
            if (!removeEmptyEntries || buffer.Length > 0)
                yield return (buffer.ToString(), c);
            buffer.Clear();
        }
        else
            buffer.Append(c);
    }
    if (buffer.Length > 0)
        yield return (buffer.ToString(), default(char));
}
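The answer above gives the method without a call site, so here is a hedged usage sketch. It repeats the same logic inside a static class (extension methods need one) under the hypothetical name `SplitWithSeparators`, chosen only so it cannot be confused with `string.Split`; the class name `SeparatorSplitDemo` is also an assumption. It requires C# 7 or later for the tuple syntax.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class SeparatorSplitDemo
{
    // Same logic as the answer's method, under a hypothetical name.
    public static IEnumerable<(string Sector, char Separator)> SplitWithSeparators(
        this string s,
        IEnumerable<char> separators,
        bool removeEmptyEntries)
    {
        var buffer = new StringBuilder();
        var separatorsSet = new HashSet<char>(separators);
        foreach (var c in s)
        {
            if (separatorsSet.Contains(c))
            {
                // Emit the text collected so far together with the separator that ended it.
                if (!removeEmptyEntries || buffer.Length > 0)
                    yield return (buffer.ToString(), c);
                buffer.Clear();
            }
            else
                buffer.Append(c);
        }
        // Trailing text has no separator after it, so it is paired with '\0'.
        if (buffer.Length > 0)
            yield return (buffer.ToString(), default(char));
    }

    public static void Main()
    {
        var parts = "Hello, there| world".SplitWithSeparators(new[] { ',', '|' }, false);
        foreach (var (sector, separator) in parts)
        {
            string shown = separator == default(char) ? "(none)" : separator.ToString();
            Console.WriteLine("'{0}' followed by {1}", sector, shown);
        }
        // Prints:
        // 'Hello' followed by ,
        // ' there' followed by |
        // ' world' followed by (none)
    }
}
```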

Split string by List

I have SplitColl with delimiters:
xx
yy
..
..
And string like this:
strxx
When I try to split the string:
var formattedText = "strxx";
var lst = new List<String>();
lst.Add("xx");
lst.Add("yy");
var arr = formattedText.Split(lst.ToArray(), 10, StringSplitOptions.RemoveEmptyEntries);
I get "str" as the result.
But how can I skip this result? I want to get an empty array in this case (when the delimiter is part of a word).
I expect that when formattedText="str xx", the result is str.
EDIT:
I have many address delimiters, such as street, city, town, etc.
I am trying to extract strings like: city DC -> DC.
But when I get a word like cityacdc, I get acdc, which is not the name of a city.
It seems that you are not really using your keywords as delimiters but as search criteria. In this case you could use a regex to search for each pattern. Here is an example program to illustrate the procedure:
static void Main(string[] args)
{
    List<string> delim = new List<string> { "city", "street" };
    string formattedText = "strxx street BakerStreet cityxx city London";
    List<string> results = new List<string>();
    foreach (var del in delim)
    {
        // Keyword, one whitespace character, then the following word
        string s = Regex.Match(formattedText, del + @"\s\w+\b").Value;
        if (!String.IsNullOrWhiteSpace(s))
        {
            results.Add(s.Split(' ')[1]);
        }
    }
    Console.WriteLine(String.Join("\n", results));
    Console.ReadKey();
}
This handles the case:
And I try to get strings like: city DC --> DC
To handle the case where you want to find the word in front of your keyword:
I expect, that when formattedText="str xx", result is str
just swap the order of the matching criteria:
string s = Regex.Match(formattedText, @"\b\w+\s" + del).Value;
and take the first element of the split:
results.Add(s.Split(' ')[0]);
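A variation on the keyword search above, offered as a sketch rather than the answer's own method: all keywords go into one pattern, each keyword is escaped, and a capture group returns the following word directly so no second `Split(' ')` is needed. The names `KeywordSearchDemo` and `WordsAfterKeywords` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class KeywordSearchDemo
{
    // Hypothetical helper: returns the word that follows each standalone keyword.
    public static List<string> WordsAfterKeywords(string text, IEnumerable<string> keywords)
    {
        // Escape each keyword so regex metacharacters in it are treated literally.
        string alternation = string.Join("|", keywords.Select(k => Regex.Escape(k)));
        // \b requires the keyword to start a word, and \s+ requires whitespace
        // right after it, so "cityxx" is skipped. Group 1 holds the next word.
        string pattern = @"\b(?:" + alternation + @")\s+(\w+)";
        return Regex.Matches(text, pattern)
                    .Cast<Match>()
                    .Select(m => m.Groups[1].Value)
                    .ToList();
    }

    public static void Main()
    {
        string text = "strxx street BakerStreet cityxx city London";
        foreach (string word in WordsAfterKeywords(text, new[] { "city", "street" }))
        {
            Console.WriteLine(word);
        }
        // Prints:
        // BakerStreet
        // London
    }
}
```

Unlike the per-keyword loop, this returns matches in the order they appear in the text and finds every occurrence, not just the first one per keyword.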
Give this a try. First I remove any leading or trailing delimiters from the formattedText string (only if they are separated from the rest by a space). Then I split the remaining string on each delimiter that has spaces on both sides.
//usage
var result = FormatText(formattedText, delimiterlst);
//code
static string[] FormatText(string input, List<string> delimiters)
{
    delimiters.ForEach(d => {
        TrimInput(ref input, "start", d.ToCharArray());
        TrimInput(ref input, "end", d.ToCharArray());
    });
    return input.Split(delimiters.Select(d => $" {d} ").ToArray(), 10, StringSplitOptions.RemoveEmptyEntries);
}
static void TrimInput(ref string input, string pos, char[] delimiter)
{
    //backup
    string temp = input;
    //trim
    input = (pos == "start") ? input.TrimStart(delimiter) : input.TrimEnd(delimiter);
    string trimmed = (pos == "start") ? input.TrimStart() : input.TrimEnd();
    //update string
    input = (input != trimmed) ? trimmed : temp;
}

Fetch Occurrence of alphabet in a string c#

I have a string that looks like
E-1,E-2,F-3,F-1,G-1,E-2,F-5
Now I want the output in an array like
E, F, G
I want each letter that appears in the string only once.
My code sample is as follows:
string str1 = "E-1,E-2,F-3,F-1,G-1,E-2,F-5";
string[] newtmpSTR = str1.Split(new char[] { ',' });
Dictionary<string, string> tmpDict = new Dictionary<string, string>();
foreach (string str in newtmpSTR)
{
    string[] tmpCharPart = str.Split('-');
    if (!tmpDict.ContainsKey(tmpCharPart[0]))
    {
        tmpDict.Add(tmpCharPart[0], "");
    }
}
Is there an easy way to do this in C# using string functions? If yes, then how?
string input = "E-1,E-2,F-3,F-1,G-1,E-2,F-5";
string[] splitted = input.Split(new char[] { ',' });
var letters = splitted.Select(s => s.Substring(0, 1)).Distinct().ToList();
Maybe you can obtain the same result with a regular expression! :-)
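As a sketch of the regular-expression route mentioned above (one possible pattern, not the only one): match each run of letters that is immediately followed by a hyphen, then keep the distinct values. The names `PrefixDemo` and `DistinctPrefixes` are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PrefixDemo
{
    // Hypothetical helper: distinct letter prefixes, in order of first appearance.
    public static string[] DistinctPrefixes(string input)
    {
        // [A-Za-z]+ is the prefix; (?=-) is a lookahead that requires a hyphen
        // right after it without making the hyphen part of the match.
        return Regex.Matches(input, @"[A-Za-z]+(?=-)")
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .Distinct()
                    .ToArray();
    }

    public static void Main()
    {
        string[] letters = DistinctPrefixes("E-1,E-2,F-3,F-1,G-1,E-2,F-5");
        Console.WriteLine(string.Join(", ", letters)); // E, F, G
    }
}
```

Unlike `Substring(0, 1)`, this also copes with prefixes longer than one letter, such as "AB-1".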

Retrieve String Containing Specific substring C#

I have output in string format like the following:
"ABCDED 0000A1.txt PQRSNT 12345"
I want to retrieve the substring(s) containing .txt from the above string. For example, for the string above it should return 0000A1.txt.
Thanks
You can either split the string at whitespace boundaries, as has already been suggested, or repeatedly match the same regex like this:
var input = "ABCDED 0000A1.txt PQRSNT 12345 THE.txt FOO";
var match = Regex.Match(input, @"\b([\w\d]+\.txt)\b");
while (match.Success) {
    Console.WriteLine("TEST: {0}", match.Value);
    match = match.NextMatch();
}
Split will work if spaces are the separator. If you use other separators, you can add them as needed.
string input = "ABCDED 0000A1.txt PQRSNT 12345";
string filename = input.Split(' ').FirstOrDefault(f => System.IO.Path.HasExtension(f));
filename = "0000A1.txt", and this will work for any extension.
You can use a C# regex pattern match.
Here is the code; plug it in and try it. Please comment.
string test = "afdkljfljalf dkfjd.txt lkjdfjdl";
string ffile = Regex.Match(test, @"([a-z0-9]+\.txt)").Groups[1].Value;
Console.WriteLine(ffile);
I did something like this:
string subString = "";
char period = '.';
char[] chArString;
int iSubStrIndex = 0;
if (myString != null)
{
    chArString = myString.ToCharArray();
    for (int i = 0; i < myString.Length; i++)
    {
        if (chArString[i] == period)
            iSubStrIndex = i; // remembers the index of the last period
    }
    subString = myString.Substring(iSubStrIndex);
}
Hope that helps.
First split your string into an array using
char[] whitespace = new char[] { ' ', '\t' };
string[] ssizes = myStr.Split(whitespace);
Then find .txt in the array:
// Find the first element containing ".txt".
string value1 = Array.Find(ssizes,
    element => element.Contains(".txt"));
Now value1 will hold "0000A1.txt".
Happy coding.
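The question asks for the substring(s), plural, so a variant that returns every matching token may be closer to what is wanted. A minimal sketch, assuming tokens are separated by spaces or tabs and that the extension check should ignore case; `TxtTokenDemo` and `FindTxtTokens` are hypothetical names.

```csharp
using System;
using System.Linq;

public static class TxtTokenDemo
{
    // Hypothetical helper: every whitespace-separated token that ends with ".txt".
    public static string[] FindTxtTokens(string input)
    {
        return input.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries)
                    .Where(token => token.EndsWith(".txt", StringComparison.OrdinalIgnoreCase))
                    .ToArray();
    }

    public static void Main()
    {
        string input = "ABCDED 0000A1.txt PQRSNT 12345 THE.txt FOO";
        foreach (string token in FindTxtTokens(input))
        {
            Console.WriteLine(token);
        }
        // Prints:
        // 0000A1.txt
        // THE.txt
    }
}
```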

Removing Specified Punctuation From Strings

I have a String that I need to convert into a String[] containing each word in the string. However, I do not want any white space or punctuation EXCEPT hyphens and apostrophes that belong to a word.
Example Input:
Hello! This is a test and it's a short-er 1. - [ ] { } ___)
Example of the Array made from Input:
[ "Hello", "this", "is", "a", "test", "and", "it's", "a", "short-er", "1" ]
Currently this is the code I have tried
(Note: the 2nd gives an error later in the program when string.First() is called):
private string[] ConvertWordsFromFile(String NewFileText)
{
    char[] delimiterChars = { ' ', ',', '.', ':', '/', '|', '<', '>', '/', '@', '#', '$', '%', '^', '&', '*', '"', '(', ')', ';' };
    string[] words = NewFileText.Split(delimiterChars, StringSplitOptions.RemoveEmptyEntries);
    return words;
}
or
private string[] ConvertWordsFromFile(String NewFileText)
{
    return Regex.Split(NewFileText, @"\W+");
}
The second example crashes with the following code
private string GroupWordsByFirstLetter(List<String> words)
{
    var groups =
        from w in words
        group w by w.First();
    return FormatGroupsByAlphabet(groups);
}
specifically, when w.First() is called.
To remove unwanted characters from a String
string randomString = "thi$ is h#ving s*me inva!id ch#rs";
string excpList ="$#*!";
LINQ Option 1
var chRemoved = randomString
    .Select(ch => excpList.Contains(ch) ? (char?)null : ch);
var Result = string.Concat(chRemoved.ToArray());
LINQ Option 2
var Result = randomString.Split().Select(x => x.Except(excpList.ToArray()))
    .Select(c => new string(c.ToArray()))
    .ToArray();
Here is a little something I worked up. Splits on \n and removes any unwanted characters.
private string ValidChars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'-";
private IEnumerable<string> SplitRemoveInvalid(string input)
{
    string tmp = "";
    foreach (char c in input)
    {
        if (c == '\n')
        {
            if (!String.IsNullOrEmpty(tmp))
            {
                yield return tmp;
                tmp = "";
            }
            continue;
        }
        if (ValidChars.Contains(c))
        {
            tmp += c; // append the current character
        }
    }
    if (!String.IsNullOrEmpty(tmp)) yield return tmp;
}
Usage could be something like this:
string[] array = SplitRemoveInvalid("Hello! This is a test and it's a short-er 1. - [ ] { } _)")
    .ToArray();
I didn't actually test it, but it should work. If it doesn't, it should be easy enough to fix.
Use string.Split(char[])
string strings = "4,6,8\n9,4";
string[] split = strings.Split(new Char[] { ',', '\n' });
OR
Try the overload below if you get unwanted empty items: String.Split(Char[], StringSplitOptions)
string[] split = strings.Split(new Char[] { ',', '\n' },
    StringSplitOptions.RemoveEmptyEntries);
This can be done quite easily with a RegEx, by matching words. I am using the following RegEx, which will allow hyphens and apostrophes in the middle of words, but will strip them out if they occur at a word boundary.
\w(?:[\w'-]*\w)?
In C# it could look like this:
private string[] ConvertWordsFromFile(String NewFileText)
{
    // "from Match m" casts each item, since MatchCollection is not generic on older frameworks
    return (from Match m in new Regex(@"\w(?:[\w'-]*\w)?").Matches(NewFileText)
            select m.Value).ToArray();
}
I am using LINQ to get an array of words from the MatchCollection returned by Matches.
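To tie this back to the crash in the question: `Regex.Split` includes an empty string in its result when a match sits at the very start or end of the input, and `First()` throws on an empty string. The sample input ends with `)`, so the `\W+` split produces a trailing empty entry. The sketch below shows both behaviours side by side; `WordExtractionDemo` and `ExtractWords` are hypothetical names. Note that `\w` also matches the underscore, so `___` in the sample input is still returned as a word by the matching pattern.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class WordExtractionDemo
{
    // Hypothetical helper using the pattern from the answer above.
    public static string[] ExtractWords(string text)
    {
        return Regex.Matches(text, @"\w(?:[\w'-]*\w)?")
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        string text = "Hello! This is a test and it's a short-er 1. - [ ] { } ___)";

        // Splitting on \W+ leaves an empty last entry because the text ends
        // with a non-word character; calling First() on it would throw.
        string[] bySplit = Regex.Split(text, @"\W+");
        Console.WriteLine(bySplit[bySplit.Length - 1].Length); // 0

        // Matching words never yields empty entries.
        string[] byMatch = ExtractWords(text);
        Console.WriteLine(string.Join("|", byMatch));
        // Hello|This|is|a|test|and|it's|a|short-er|1|___
    }
}
```

If underscores should not count as word characters, the `\w` tokens in the pattern would need to be replaced with an explicit class such as `[A-Za-z0-9]`.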
