Check if lowercase and uppercase alternate - c#

string s = "This is an ExAmPlE of sTrInG";
Such words in this string: 2.
I understand how to convert a phrase to uppercase and lowercase words but i dont know how to find such words in string and count it

This should do the trick:
First you need to split the string in words and than you can check each word if is alternating
public void Counter(string phrase)
{
var qty = 0;
var words = phrase.Split(' ');
for (var i = 0; i < words.Length; ++i)
{
if (IsUpperLowerAltenated(words[i].Trim()))
++qty;
}
}
public bool IsUpperLowerAltenated(string word)
{
for (var i = 1; i < word.Length; ++i)
{
if(char.IsLower(word[i-1]) == char.IsLower(word[i]))
return false;
}
return true;
}

String is mixed case if it's not either lower or upper
bool IsUpper(string str) => str.Upper() == str;
bool IsLower(string str) => str.Lower() == str;
bool IsMixedCase(string str) => !IsUpper(str) && !IsLower(str);
string s = "This is an ExAmPlE of sTrInG";
foreach (var str in s.Split().Where(IsMixedCase))
{
Console.WriteLine(str);
}
Update
Solution without explicit loop. String is alternating if each pair of characters in different cases
public static IEnumerable<(T, T)> Pairwise<T>(this IEnumerable<T> input)
=> input.Zip(input.Skip(1), (a, b) => (a, b));
public static bool OfDifferentCase(char c1, char c2)
=> char.IsUpper(c1) && char.IsLower(c2) || char.IsLower(c1) && char.IsUpper(c2);
public static bool IsAlternating(string str)
=> str.Pairwise().All(p => OfDifferentCase(p.Item1, p.Item2));
string s = "This is an ExAmPlE of sTrInG";
foreach (var str in s.Split().Where(IsAlternating))
{
Console.WriteLine(str);
}

You can find count of upper and lower chars like this.
public int IsUpperCount(string myString)
{ int upperC=0;
var charArray = myString.ToCharArray();
foreach (var item in charArray)
{
if(item.toString()==item.toString().ToUpper())
{ upperC++;}
return upperC;
}
public int IsLowerCount(string myString)
{ int lowerC=0;
var charArray = myString.ToCharArray();
foreach (var item in charArray)
{
if(item.toString()==item.toString().ToUpper())
{ lowerC++;}
return lowerC;
}

string s = "This is an ExAmPlE of sTrInG";
var words = Regex.Matches(s, #"\b((\p{Ll}\p{Lu})+\p{Ll}?|(\p{Lu}\p{Ll})+\p{Lu}?)\b");
foreach (var w in words)
{
Console.WriteLine(w);
}
Reference:
Regex pattern search for alternating character case
RegEx for matching alternating case letters
Make regular expression not match empty string?
regex - match pattern of alternating characters
Supported Unicode general categories

Related

C# string.split() separate string by uppercase

I've been using the Split() method to split strings. But this work if you set some character for condition in string.Split(). Is there any way to split a string when is see Uppercase?
Is it possible to get few words from some not separated string like:
DeleteSensorFromTemplate
And the result string is to be like:
Delete Sensor From Template
Use Regex.split
string[] split = Regex.Split(str, #"(?<!^)(?=[A-Z])");
Another way with regex:
public static string SplitCamelCase(string input)
{
return System.Text.RegularExpressions.Regex.Replace(input, "([A-Z])", " $1", System.Text.RegularExpressions.RegexOptions.Compiled).Trim();
}
If you do not like RegEx and you really just want to insert the missing spaces, this will do the job too:
public static string InsertSpaceBeforeUpperCase(this string str)
{
var sb = new StringBuilder();
char previousChar = char.MinValue; // Unicode '\0'
foreach (char c in str)
{
if (char.IsUpper(c))
{
// If not the first character and previous character is not a space, insert a space before uppercase
if (sb.Length != 0 && previousChar != ' ')
{
sb.Append(' ');
}
}
sb.Append(c);
previousChar = c;
}
return sb.ToString();
}
I had some fun with this one and came up with a function that splits by case, as well as groups together caps (it assumes title case for whatever follows) and digits.
Examples:
Input -> "TodayIUpdated32UPCCodes"
Output -> "Today I Updated 32 UPC Codes"
Code (please excuse the funky symbols I use)...
public string[] SplitByCase(this string s) {
var ʀ = new List<string>();
var ᴛ = new StringBuilder();
var previous = SplitByCaseModes.None;
foreach(var ɪ in s) {
SplitByCaseModes mode_ɪ;
if(string.IsNullOrWhiteSpace(ɪ.ToString())) {
mode_ɪ = SplitByCaseModes.WhiteSpace;
} else if("0123456789".Contains(ɪ)) {
mode_ɪ = SplitByCaseModes.Digit;
} else if(ɪ == ɪ.ToString().ToUpper()[0]) {
mode_ɪ = SplitByCaseModes.UpperCase;
} else {
mode_ɪ = SplitByCaseModes.LowerCase;
}
if((previous == SplitByCaseModes.None) || (previous == mode_ɪ)) {
ᴛ.Append(ɪ);
} else if((previous == SplitByCaseModes.UpperCase) && (mode_ɪ == SplitByCaseModes.LowerCase)) {
if(ᴛ.Length > 1) {
ʀ.Add(ᴛ.ToString().Substring(0, ᴛ.Length - 1));
ᴛ.Remove(0, ᴛ.Length - 1);
}
ᴛ.Append(ɪ);
} else {
ʀ.Add(ᴛ.ToString());
ᴛ.Clear();
ᴛ.Append(ɪ);
}
previous = mode_ɪ;
}
if(ᴛ.Length != 0) ʀ.Add(ᴛ.ToString());
return ʀ.ToArray();
}
private enum SplitByCaseModes { None, WhiteSpace, Digit, UpperCase, LowerCase }
Here's another different way if you don't want to be using string builders or RegEx, which are totally acceptable answers. I just want to offer a different solution:
string Split(string input)
{
string result = "";
for (int i = 0; i < input.Length; i++)
{
if (char.IsUpper(input[i]))
{
result += ' ';
}
result += input[i];
}
return result.Trim();
}

Splitting a string array

I have a string array string[] arr, which contains values like N36102W114383, N36102W114382 etc...
I want to split the each and every string such that the value comes like this N36082 and W115080.
What is the best way to do this?
This should work for you.
Regex regexObj = new Regex(#"\w\d+"); # matches a character followed by a sequence of digits
Match matchResults = regexObj.Match(subjectString);
while (matchResults.Success) {
matchResults = matchResults.NextMatch(); #two mathches N36102 and W114383
}
If you have the fixed format every time you can just do this:
string[] split_data = data_string.Insert(data_string.IndexOf("W"), ",")
.Split(",", StringSplitOptions.None);
Here you insert a recognizable delimiter into your string and then split it by this delimiter.
Forgive me if this doesn't quite compile, but I'd just break down and write the string processing function by hand:
public static IEnumerable<string> Split(string str)
{
char [] chars = str.ToCharArray();
int last = 0;
for(int i = 1; i < chars.Length; i++) {
if(char.IsLetter(chars[i])) {
yield return new string(chars, last, i - last);
last = i;
}
}
yield return new string(chars, last, chars.Length - last);
}
If you use C#, please try:
String[] code = new Regex("(?:([A-Z][0-9]+))").Split(text).Where(e => e.Length > 0 && e != ",").ToArray();
in case you're only looking for the format NxxxxxWxxxxx, this will do just fine :
Regex r = new Regex(#"(N[0-9]+)(W[0-9]+)");
Match mc = r.Match(arr[i]);
string N = mc.Groups[1];
string W = mc.Groups[2];
Using the 'Split' and 'IsLetter' string functions, this is relatively easy in c#.
Don't forget to write unit tests - the following may have some corner case errors!
// input has form "N36102W114383, N36102W114382"
// output: "N36102", "W114383", "N36102", "W114382", ...
string[] ParseSequenceString(string input)
{
string[] inputStrings = string.Split(',');
List<string> outputStrings = new List<string>();
foreach (string value in inputstrings) {
List<string> valuesInString = ParseValuesInString(value);
outputStrings.Add(valuesInString);
}
return outputStrings.ToArray();
}
// input has form "N36102W114383"
// output: "N36102", "W114383"
List<string> ParseValuesInString(string inputString)
{
List<string> outputValues = new List<string>();
string currentValue = string.Empty;
foreach (char c in inputString)
{
if (char.IsLetter(c))
{
if (currentValue .Length == 0)
{
currentValue += c;
} else
{
outputValues.Add(currentValue);
currentValue = string.Empty;
}
}
currentValue += c;
}
outputValues.Add(currentValue);
return outputValues;
}

String formatting using C#

Is there a way to remove every special character from a string like:
"\r\n 1802 S St Nw<br>\r\n Washington, DC 20009"
And to just write it like:
"1802 S St Nw, Washington, DC 20009"
To remove special characters:
public static string ClearSpecialChars(this string input)
{
foreach (var ch in new[] { "\r", "\n", "<br>", etc })
{
input = input.Replace(ch, String.Empty);
}
return input;
}
To replace all double space with single space:
public static string ClearDoubleSpaces(this string input)
{
while (input.Contains(" ")) // double
{
input = input.Replace(" ", " "); // with single
}
return input;
}
You also may split both methods into a single one:
public static string Clear(this string input)
{
return input
.ClearSpecialChars()
.ClearDoubleSpaces()
.Trim();
}
two ways, you can use RegEx, or you can use String.Replace(...)
Use the Regex.Replace() method, specifying all of the characters you want to remove as the pattern to match.
You can use the C# Trim() method, look here:
http://msdn.microsoft.com/de-de/library/d4tt83f9%28VS.80%29.aspx
System.Text.RegularExpressions.Regex.Replace("\"\\r\\n 1802 S St Nw<br>\\r\\n Washington, DC 20009\"",
#"(<br>)*?\\r\\n\s+", "");
Maybe something like this, using ASCII int values. Assumes all html tags will be closed.
public static class StringExtensions
{
public static string Clean(this string str)
{
string[] split = str.Split(' ');
List<string> strings = new List<string>();
foreach (string splitStr in split)
{
if (splitStr.Length > 0)
{
StringBuilder sb = new StringBuilder();
bool tagOpened = false;
foreach (char c in splitStr)
{
int iC = (int)c;
if (iC > 32)
{
if (iC == 60)
tagOpened = true;
if (!tagOpened)
sb.Append(c);
if (iC == 62)
tagOpened = false;
}
}
string result = sb.ToString();
if (result.Length > 0)
strings.Add(result);
}
}
return string.Join(" ", strings.ToArray());
}
}

How do I remove all non alphanumeric characters from a string except dash?

How do I remove all non alphanumeric characters from a string except dash and space characters?
Replace [^a-zA-Z0-9 -] with an empty string.
Regex rgx = new Regex("[^a-zA-Z0-9 -]");
str = rgx.Replace(str, "");
I could have used RegEx, they can provide elegant solution but they can cause performane issues. Here is one solution
char[] arr = str.ToCharArray();
arr = Array.FindAll<char>(arr, (c => (char.IsLetterOrDigit(c)
|| char.IsWhiteSpace(c)
|| c == '-')));
str = new string(arr);
When using the compact framework (which doesn't have FindAll)
Replace FindAll with1
char[] arr = str.Where(c => (char.IsLetterOrDigit(c) ||
char.IsWhiteSpace(c) ||
c == '-')).ToArray();
str = new string(arr);
1 Comment by ShawnFeatherly
You can try:
string s1 = Regex.Replace(s, "[^A-Za-z0-9 -]", "");
Where s is your string.
Using System.Linq
string withOutSpecialCharacters = new string(stringWithSpecialCharacters.Where(c =>char.IsLetterOrDigit(c) || char.IsWhiteSpace(c) || c == '-').ToArray());
The regex is [^\w\s\-]*:
\s is better to use instead of space (), because there might be a tab in the text.
Based on the answer for this question, I created a static class and added these. Thought it might be useful for some people.
public static class RegexConvert
{
public static string ToAlphaNumericOnly(this string input)
{
Regex rgx = new Regex("[^a-zA-Z0-9]");
return rgx.Replace(input, "");
}
public static string ToAlphaOnly(this string input)
{
Regex rgx = new Regex("[^a-zA-Z]");
return rgx.Replace(input, "");
}
public static string ToNumericOnly(this string input)
{
Regex rgx = new Regex("[^0-9]");
return rgx.Replace(input, "");
}
}
Then the methods can be used as:
string example = "asdf1234!##$";
string alphanumeric = example.ToAlphaNumericOnly();
string alpha = example.ToAlphaOnly();
string numeric = example.ToNumericOnly();
Want something quick?
public static class StringExtensions
{
public static string ToAlphaNumeric(this string self,
params char[] allowedCharacters)
{
return new string(Array.FindAll(self.ToCharArray(),
c => char.IsLetterOrDigit(c) ||
allowedCharacters.Contains(c)));
}
}
This will allow you to specify which characters you want to allow as well.
Here is a non-regex heap allocation friendly fast solution which was what I was looking for.
Unsafe edition.
public static unsafe void ToAlphaNumeric(ref string input)
{
fixed (char* p = input)
{
int offset = 0;
for (int i = 0; i < input.Length; i++)
{
if (char.IsLetterOrDigit(p[i]))
{
p[offset] = input[i];
offset++;
}
}
((int*)p)[-1] = offset; // Changes the length of the string
p[offset] = '\0';
}
}
And for those who don't want to use unsafe or don't trust the string length hack.
public static string ToAlphaNumeric(string input)
{
int j = 0;
char[] newCharArr = new char[input.Length];
for (int i = 0; i < input.Length; i++)
{
if (char.IsLetterOrDigit(input[i]))
{
newCharArr[j] = input[i];
j++;
}
}
Array.Resize(ref newCharArr, j);
return new string(newCharArr);
}
I´ve made a different solution, by eliminating the Control characters, which was my original problem.
It is better than putting in a list all the "special but good" chars
char[] arr = str.Where(c => !char.IsControl(c)).ToArray();
str = new string(arr);
it´s simpler, so I think it´s better !
Here's an extension method using #ata answer as inspiration.
"hello-world123, 456".MakeAlphaNumeric(new char[]{'-'});// yields "hello-world123456"
or if you require additional characters other than hyphen...
"hello-world123, 456!?".MakeAlphaNumeric(new char[]{'-','!'});// yields "hello-world123456!"
public static class StringExtensions
{
public static string MakeAlphaNumeric(this string input, params char[] exceptions)
{
var charArray = input.ToCharArray();
var alphaNumeric = Array.FindAll<char>(charArray, (c => char.IsLetterOrDigit(c)|| exceptions?.Contains(c) == true));
return new string(alphaNumeric);
}
}
If you are working in JS, here is a very terse version
myString = myString.replace(/[^A-Za-z0-9 -]/g, "");
I use a variation of one of the answers here. I want to replace spaces with "-" so its SEO friendly and also make lower case. Also not reference system.web from my services layer.
private string MakeUrlString(string input)
{
var array = input.ToCharArray();
array = Array.FindAll<char>(array, c => char.IsLetterOrDigit(c) || char.IsWhiteSpace(c) || c == '-');
var newString = new string(array).Replace(" ", "-").ToLower();
return newString;
}
There is a much easier way with Regex.
private string FixString(string str)
{
return string.IsNullOrEmpty(str) ? str : Regex.Replace(str, "[\\D]", "");
}

How to word by word iterate in string in C#?

I want to iterate over string as word by word.
If I have a string "incidentno and fintype or unitno", I would like to read every word one by one as "incidentno", "and", "fintype", "or", and "unitno".
foreach (string word in "incidentno and fintype or unitno".Split(' ')) {
...
}
var regex = new Regex(#"\b[\s,\.-:;]*");
var phrase = "incidentno and fintype or unitno";
var words = regex.Split(phrase).Where(x => !string.IsNullOrEmpty(x));
This works even if you have ".,; tabs and new lines" between your words.
Slightly twisted I know, but you could define an iterator block as an extension method on strings. e.g.
/// <summary>
/// Sweep over text
/// </summary>
/// <param name="Text"></param>
/// <returns></returns>
public static IEnumerable<string> WordList(this string Text)
{
int cIndex = 0;
int nIndex;
while ((nIndex = Text.IndexOf(' ', cIndex + 1)) != -1)
{
int sIndex = (cIndex == 0 ? 0 : cIndex + 1);
yield return Text.Substring(sIndex, nIndex - sIndex);
cIndex = nIndex;
}
yield return Text.Substring(cIndex + 1);
}
foreach (string word in "incidentno and fintype or unitno".WordList())
System.Console.WriteLine("'" + word + "'");
Which has the advantage of not creating a big array for long strings.
Use the Split method of the string class
string[] words = "incidentno and fintype or unitno".Split(" ");
This will split on spaces, so "words" will have [incidentno,and,fintype,or,unitno].
Assuming the words are always separated by a blank, you could use String.Split() to get an Array of your words.
There are multiple ways to accomplish this. Two of the most convenient methods (in my opinion) are:
Using string.Split() to create an array. I would probably use this method, because it is the most self-explanatory.
example:
string startingSentence = "incidentno and fintype or unitno";
string[] seperatedWords = startingSentence.Split(' ');
Alternatively, you could use (this is what I would use):
string[] seperatedWords = startingSentence.Split(new char[] {' '}, StringSplitOptions.RemoveEmptyEntries);
StringSplitOptions.RemoveEmptyEntries will remove any empty entries from your array that may occur due to extra whitespace and other minor problems.
Next - to process the words, you would use:
foreach (string word in seperatedWords)
{
//Do something
}
Or, you can use regular expressions to solve this problem, as Darin demonstrated (a copy is below).
example:
var regex = new Regex(#"\b[\s,\.-:;]*");
var phrase = "incidentno and fintype or unitno";
var words = regex.Split(phrase).Where(x => !string.IsNullOrEmpty(x));
For processing, you can use similar code to the first option.
foreach (string word in words)
{
//Do something
}
Of course, there are many ways to solve this problem, but I think that these two would be the simplest to implement and maintain. I would go with the first option (using string.Split()) just because regex can sometimes become quite confusing, while a split will function correctly most of the time.
When using split, what about checking for empty entries?
string sentence = "incidentno and fintype or unitno"
string[] words = sentence.Split(new char[] { ' ', ',' ,';','\t','\n', '\r'}, StringSplitOptions.RemoveEmptyEntries);
foreach (string word in words)
{
// Process
}
EDIT:
I can't comment so I'm posting here but this (posted above) works:
foreach (string word in "incidentno and fintype or unitno".Split(' '))
{
...
}
My understanding of foreach is that it first does a GetEnumerator() and the calles .MoveNext until false is returned. So the .Split won't be re-evaluated on each iteration
public static string[] MyTest(string inword, string regstr)
{
var regex = new Regex(regstr);
var phrase = "incidentno and fintype or unitno";
var words = regex.Split(phrase);
return words;
}
? MyTest("incidentno, and .fintype- or; :unitno",#"[^\w+]")
[0]: "incidentno"
[1]: "and"
[2]: "fintype"
[3]: "or"
[4]: "unitno"
I'd like to add some information to JDunkerley's awnser.
You can easily make this method more reliable if you give a string or char parameter to search for.
public static IEnumerable<string> WordList(this string Text,string Word)
{
int cIndex = 0;
int nIndex;
while ((nIndex = Text.IndexOf(Word, cIndex + 1)) != -1)
{
int sIndex = (cIndex == 0 ? 0 : cIndex + 1);
yield return Text.Substring(sIndex, nIndex - sIndex);
cIndex = nIndex;
}
yield return Text.Substring(cIndex + 1);
}
public static IEnumerable<string> WordList(this string Text, char c)
{
int cIndex = 0;
int nIndex;
while ((nIndex = Text.IndexOf(c, cIndex + 1)) != -1)
{
int sIndex = (cIndex == 0 ? 0 : cIndex + 1);
yield return Text.Substring(sIndex, nIndex - sIndex);
cIndex = nIndex;
}
yield return Text.Substring(cIndex + 1);
}
I write a string processor class.You can use it.
Example:
metaKeywords = bodyText.Process(prepositions).OrderByDescending().TakeTop().GetWords().AsString();
Class:
public static class StringProcessor
{
private static List<String> PrepositionList;
public static string ToNormalString(this string strText)
{
if (String.IsNullOrEmpty(strText)) return String.Empty;
char chNormalKaf = (char)1603;
char chNormalYah = (char)1610;
char chNonNormalKaf = (char)1705;
char chNonNormalYah = (char)1740;
string result = strText.Replace(chNonNormalKaf, chNormalKaf);
result = result.Replace(chNonNormalYah, chNormalYah);
return result;
}
public static List<KeyValuePair<String, Int32>> Process(this String bodyText,
List<String> blackListWords = null,
int minimumWordLength = 3,
char splitor = ' ',
bool perWordIsLowerCase = true)
{
string[] btArray = bodyText.ToNormalString().Split(splitor);
long numberOfWords = btArray.LongLength;
Dictionary<String, Int32> wordsDic = new Dictionary<String, Int32>(1);
foreach (string word in btArray)
{
if (word != null)
{
string lowerWord = word;
if (perWordIsLowerCase)
lowerWord = word.ToLower();
var normalWord = lowerWord.Replace(".", "").Replace("(", "").Replace(")", "")
.Replace("?", "").Replace("!", "").Replace(",", "")
.Replace("<br>", "").Replace(":", "").Replace(";", "")
.Replace("،", "").Replace("-", "").Replace("\n", "").Trim();
if ((normalWord.Length > minimumWordLength && !normalWord.IsMemberOfBlackListWords(blackListWords)))
{
if (wordsDic.ContainsKey(normalWord))
{
var cnt = wordsDic[normalWord];
wordsDic[normalWord] = ++cnt;
}
else
{
wordsDic.Add(normalWord, 1);
}
}
}
}
List<KeyValuePair<String, Int32>> keywords = wordsDic.ToList();
return keywords;
}
public static List<KeyValuePair<String, Int32>> OrderByDescending(this List<KeyValuePair<String, Int32>> list, bool isBasedOnFrequency = true)
{
List<KeyValuePair<String, Int32>> result = null;
if (isBasedOnFrequency)
result = list.OrderByDescending(q => q.Value).ToList();
else
result = list.OrderByDescending(q => q.Key).ToList();
return result;
}
public static List<KeyValuePair<String, Int32>> TakeTop(this List<KeyValuePair<String, Int32>> list, Int32 n = 10)
{
List<KeyValuePair<String, Int32>> result = list.Take(n).ToList();
return result;
}
public static List<String> GetWords(this List<KeyValuePair<String, Int32>> list)
{
List<String> result = new List<String>();
foreach (var item in list)
{
result.Add(item.Key);
}
return result;
}
public static List<Int32> GetFrequency(this List<KeyValuePair<String, Int32>> list)
{
List<Int32> result = new List<Int32>();
foreach (var item in list)
{
result.Add(item.Value);
}
return result;
}
public static String AsString<T>(this List<T> list, string seprator = ", ")
{
String result = string.Empty;
foreach (var item in list)
{
result += string.Format("{0}{1}", item, seprator);
}
return result;
}
private static bool IsMemberOfBlackListWords(this String word, List<String> blackListWords)
{
bool result = false;
if (blackListWords == null) return false;
foreach (var w in blackListWords)
{
if (w.ToNormalString().Equals(word))
{
result = true;
break;
}
}
return result;
}
}

Categories