replacing words - c#

I want to swap the first and last words of a sentence that I type in the console.
If I type the following sentence in the console:
London is the Capital of UK.
I need this result:
UK is the capital of London.

You could use the following method together with String.Split and String.Join:
public static void SwapFirstAndLast<T>(IList<T>? items)
{
// Nothing to swap when there are fewer than two items
if (items == null || items.Count < 2) return;
T first = items[0];
items[0] = items[^1];
items[^1] = first;
}
string sentence = " London is the Capital of UK";
string[] wordDelimiters = { " " };
string[] words = sentence.Trim().Split(wordDelimiters, StringSplitOptions.RemoveEmptyEntries);
SwapFirstAndLast(words);
sentence = string.Join(" ", words);

In the more general case, when punctuation must be taken into account, e.g.
London, being a vast area, is the capital of UK
=> UK, being a vast area, is the capital of London
we can use regular expressions to match words. Assuming that a word is a
sequence of letters and apostrophes, we can use the
[\p{L}']+
pattern and do the following:
using System.Text.RegularExpressions;
...
string text = "London, being a vast area, is the capital of UK";
// All words matched
var words = Regex.Matches(text, @"[\p{L}']+");
// Replace matches: first into last,
// last into first, all the others keep intact
int index = -1;
var result = Regex.Replace(
text,
@"[\p{L}']+",
m => {
index += 1;
if (index == 0)
return words[^1].Value;
if (index == words.Count - 1)
return words[0].Value;
return m.Value;
});
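As a minimal sketch of the same idea packaged as a reusable method: the class name WordSwapper and the method name SwapFirstAndLastWords are hypothetical, and a "word" is assumed to be a run of letters and apostrophes, as above. Instead of a counting callback, it rebuilds the string around the first and last matches, so punctuation and spacing stay where they were.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordSwapper
{
    // Assumption: a word is a run of letters and apostrophes.
    private static readonly Regex Word = new Regex(@"[\p{L}']+");

    public static string SwapFirstAndLastWords(string text)
    {
        if (string.IsNullOrEmpty(text)) return text;

        MatchCollection words = Word.Matches(text);
        if (words.Count < 2) return text; // nothing to swap

        Match first = words[0];
        Match last = words[words.Count - 1];
        int afterFirst = first.Index + first.Length;

        // Prefix + last word + untouched middle + first word + suffix.
        return text.Substring(0, first.Index)
             + last.Value
             + text.Substring(afterFirst, last.Index - afterFirst)
             + first.Value
             + text.Substring(last.Index + last.Length);
    }
}
```

Calling WordSwapper.SwapFirstAndLastWords("London is the Capital of UK.") keeps the trailing period in place and only exchanges the two words.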

Related

Ignore regex separator between "display name"

I need to split a string on the separator ">,", but this separator is also allowed inside the double quotes that hold the display name, e.g.:
""display>, name>," <email@tegg.com>, "<display,> >" <display_a@email.com>";
I need it to be separated like:
[["display>, name>," <email@tegg.com>,] ["<display,> >" <display_a@email.com>"]]
I'm using at this moment this:
aux = Regex.Split(addresses, @"(?<=\>,)");
But this doesn't work when the display name contains ">,"
E.g str:
str = "\"Some name>,\" <example@email.com>, \"<display,> >\" <display_a@email.com>'";
You can use
var matches = Regex.Matches(str, @"""([^""]*)""\s*<([^<>]*)>")
.Cast<Match>()
.Select(x => new[] { x.Groups[1].Value, x.Groups[2].Value })
.ToArray();
See the C# demo:
var p = @"""([^""]*)""\s*<([^<>]*)>";
var str = "\"Some name>,\" <example@email.com>, \"<display,> >\" <display_a@email.com>'";
var matches = Regex.Matches(str, p).Cast<Match>().Select(x => new[] { x.Groups[1].Value, x.Groups[2].Value }).ToArray();
foreach (var pair in matches)
Console.WriteLine("{0} : {1}", pair[0],pair[1]);
Output:
Some name>, : example@email.com
<display,> > : display_a@email.com
See also the regex demo. Details:
" - a " char
([^"]*) - Group 1: any zero or more chars other than "
" - a " char
\s* - zero or more whitespaces
< - a < char
([^<>]*) - Group 2: any zero or more chars other than < and >
> - a > char

How to return the amount of duplicate letters in a string

I am trying to get a user's input so that I can return how many duplicate characters they have.
This is how I got the input
Console.WriteLine("Input a word to reveal duplicate letters");
string input = Console.ReadLine();
For example, the code should return something like:
List of duplicate characters in String 'Programming'
g : 2
r : 2
m : 2
How do I find duplicate letters and count them in a string?
Yes, you can do this with GroupBy() from System.Linq: group the string by character, then keep only the groups that contain more than one value, like so:
var word = "Hello World!";
var multipleChars = word.GroupBy(c => c).Where(group => group.Count() > 1);
foreach (var charGroup in multipleChars)
{
Console.WriteLine(charGroup.Key + " : " + charGroup.Count());
}
This version ignores case and excludes non-alphanumeric characters:
var sample = Console.ReadLine();
var letterCounter = sample.Where(char.IsLetterOrDigit)
.GroupBy(char.ToLower)
.Select(counter => new { Letter = counter.Key, Counter = counter.Count() })
.Where(c=>c.Counter>1);
foreach (var counter in letterCounter){
Console.WriteLine(String.Format("{0} = {1}", counter.Letter,counter.Counter));
}
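If the counts are needed as data rather than printed, the same grouping can be returned as a dictionary. This is only a sketch under assumptions: the class name DuplicateLetters and the method name Count are made up for illustration, and case is folded with char.ToLowerInvariant so that 'P' and 'p' count as the same letter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateLetters
{
    // Returns each letter or digit that occurs more than once, with its count.
    public static Dictionary<char, int> Count(string input)
    {
        return (input ?? string.Empty)
            .Where(c => char.IsLetterOrDigit(c))      // skip spaces and punctuation
            .GroupBy(c => char.ToLowerInvariant(c))   // case-insensitive grouping
            .Where(g => g.Count() > 1)                // keep duplicates only
            .ToDictionary(g => g.Key, g => g.Count());
    }
}
```

For the word "Programming" this yields three entries, for r, g and m, each with a count of 2.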

Find NOT matching characters in a string with regex?

I'm able to check whether a string contains invalid characters:
Regex r = new Regex("[^A-Z]$");
string myString = "SOMEString";
if (r.IsMatch(myString))
{
Console.WriteLine("invalid string!");
}
That works fine. But what if I would like to print out every invalid character in this string? In the example SOMEString, the invalid chars are t, r, i, n, g. Any ideas?
Use LINQ. The following will give you an array of the 5 characters that the regex flags as invalid.
char[] myCharacterArray = myString.Where(c => r.IsMatch(c.ToString())).ToArray();
foreach (char c in myCharacterArray)
{
Console.WriteLine(c);
}
Output will be:
t
r
i
n
g
EDIT:
It looks like, you want to treat all lower case characters as invalid string. You may try:
char[] myCharacterArray2 = myString
.Where(c => ((int)c) >= 97 && ((int)c) <= 122)
.ToArray();
In your example the regex matches only one character, because the $ anchor makes it look at just the last character, and the last character of your string is not uppercase.
The regex should be changed to Regex r = new Regex("[^A-Z]");.
(updated following @Chris's comments)
However, for your purpose a regex is actually what you want - just use Matches.
e.g.:
foreach (Match item in r.Matches(myString))
{
Console.WriteLine(item.ToString() + " is invalid");
}
Or, if you want one line:
string str = "";
foreach (Match item in r.Matches(myString))
{
str += item.ToString() + ", ";
}
Console.WriteLine(str + " are invalid");
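A small sketch of the same Matches idea that avoids the trailing ", " by using string.Join. The class name InvalidCharFinder and the method name Describe are hypothetical, and it assumes, as in the question, that only A-Z counts as valid.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class InvalidCharFinder
{
    // Assumption: anything outside A-Z is invalid (no $ anchor, so every position is checked).
    private static readonly Regex Invalid = new Regex("[^A-Z]");

    public static string Describe(string input)
    {
        // Collect every single-character match in order of appearance.
        string[] bad = Invalid.Matches(input)
                              .Cast<Match>()
                              .Select(m => m.Value)
                              .ToArray();

        return bad.Length == 0
            ? "no invalid characters"
            : string.Join(", ", bad) + " are invalid";
    }
}
```

With the string from the question, InvalidCharFinder.Describe("SOMEString") lists t, r, i, n and g.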
Try with this:
char[] list = new char[5];
Regex r = new Regex("[^A-Z]*$");
string myString = "SOMEString";
foreach (Match match in r.Matches(myString))
{
list = match.Value.ToCharArray();
break;
}
string str = "invalid chars are ";
foreach (char ch in list)
{
str += ch + ", ";
}
Console.Write(str);
OUTPUT: invalid chars are t, r, i, n, g

How to remove lowercase on a textbox?

I'm trying to remove the lower case letters on a TextBox..
For example, short alpha code representing the insurance (e.g., 'BCBS' for 'Blue Cross Blue Shield'):
txtDesc.text = "Blue Cross Blue Shield";
string Code = //This must be BCBS..
Is it possible? Please help me. Thanks!
Well you could use a regular expression to remove everything that wasn't capital A-Z:
using System;
using System.Text.RegularExpressions;
class Program
{
static void Main( string[] args )
{
string input = "Blue Cross Blue Shield 12356";
Regex regex = new Regex("[^A-Z]");
string output = regex.Replace(input, "");
Console.WriteLine(output);
}
}
Note that this would also remove any non-ASCII characters. An alternative regex would be:
Regex regex = new Regex(@"[^\p{Lu}]");
... I believe that should cover upper-case letters of all cultures.
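To make the difference between the two patterns concrete, here is a hedged side-by-side sketch. The class and method names (UppercaseFilter, KeepAsciiUpper, KeepUnicodeUpper) are invented for illustration; \p{Lu} is the Unicode "uppercase letter" category.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UppercaseFilter
{
    // Keeps only the ASCII capitals A-Z; accented capitals are removed too.
    public static string KeepAsciiUpper(string input)
    {
        return Regex.Replace(input, "[^A-Z]", "");
    }

    // Keeps every character in Unicode category Lu (uppercase letter).
    public static string KeepUnicodeUpper(string input)
    {
        return Regex.Replace(input, @"[^\p{Lu}]", "");
    }
}
```

For "Blue Cross Blue Shield 12356" both methods give "BCBS". For a name such as "Élan Österreich Bank", the ASCII version keeps only "B", while the \p{Lu} version keeps "ÉÖB".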
string Code = new String(txtDesc.Text.Where(c => char.IsUpper(c)).ToArray());
Here is my variant:
var input = "Blue Cross Blue Shield 12356";
var sb = new StringBuilder();
foreach (var ch in input) {
if (char.IsUpper(ch)) { // only keep uppercase
sb.Append(ch);
}
}
sb.ToString(); // "BCBS"
I normally like to use regular expressions, but I don't know how to select "only uppercase" in them without [A-Z] which will break badly on characters outside the English alphabet (even other Latin characters! :-/)
Happy coding.
But see Mr. Skeet's answer for the regex way ;-)
Without Regex:
string input = "Blue Cross Blue Shield";
string output = new string(input.Where(Char.IsUpper).ToArray());
Response.Write(output);
string Code = Regex.Replace(txtDesc.text, "[a-z]", "");
I'd map the value to your abbreviation in a dictionary like:
Dictionary<string, string> valueMap = new Dictionary<string, string>();
valueMap.Add("Blue Cross Blue Shield", "BCBS");
string Code = "";
if(valueMap.ContainsKey(txtDesc.Text))
Code = valueMap[txtDesc.Text];
else
// Handle
But if you still want the functionality you mention use linq:
string newString = new string(txtDesc.Text.Where(c => char.IsUpper(c)).ToArray());
You can try the 'Replace lowercase characters with star' implementation, but change '*' to '' (an empty string).
So the code would look something like this:
txtDesc.Text = "Blue Cross Blue Shield";
string TargetString = txtDesc.Text;
string MainString = TargetString;
for (int i = 0; i < TargetString.Length; i++)
{
if (char.IsLower(TargetString[i]))
{
// Removes every occurrence of this lowercase character
TargetString = TargetString.Replace( TargetString[ i ].ToString(), string.Empty );
// The string just got shorter, so re-check the same index
i--;
}
}
Console.WriteLine("The string {0} has converted to {1}", MainString, TargetString);
string caps = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
string.Join("",
"Blue Cross Blue Shield".Select(c => caps.IndexOf(c) > -1 ? c.ToString() : "")
.ToArray());
Rather than matching on all capitals, I think the specification would require matching the first character of each word. This would allow for inconsistent input but still be reliable in the long run. For this reason, I suggest using the following code. It runs an aggregate over each Match from the Regex object and appends the value to a string called output.
string input = "Blue Cross BLUE shield 12356";
Regex regex = new Regex("\\b\\w");
string output = regex.Matches(input).Cast<Match>().Aggregate("", (current, match) => current + match.Value);
Console.WriteLine(output.ToUpper()); // outputs BCBS1
This isn't perfect but should work (and passes your BCBS test):
private static string AlphaCode(String Input)
{
List<String> capLetter = new List<String>();
foreach (Char c in Input)
{
if (char.IsLetter(c))
{
String letter = c.ToString();
if (letter == letter.ToUpper()) { capLetter.Add(letter); }
}
}
return String.Join(String.Empty, capLetter.ToArray());
}
And this version will handle strange input scenarios (this makes sure the first letter of each word is capitalized).
private static string AlphaCode(String Input)
{
String capCase = System.Globalization.CultureInfo.CurrentCulture.TextInfo.ToTitleCase(Input.ToString().ToLower());
List<String> capLetter = new List<String>();
foreach (Char c in capCase)
{
if (char.IsLetter(c))
{
String letter = c.ToString();
if (letter == letter.ToUpper()) { capLetter.Add(letter); }
}
}
return String.Join(String.Empty, capLetter.ToArray());
}

Find substring ignoring specified characters

Do any of you know of an easy/clean way to find a substring within a string while ignoring some specified characters? I think an example would explain things better:
string: "Hello, -this- is a string"
substring to find: "Hello this"
chars to ignore: "," and "-"
found the substring, result: "Hello, -this"
Using Regex is not a requirement for me, but I added the tag because it feels related.
Update:
To make the requirement clearer: I need the resulting substring with the ignored chars, not just an indication that the given substring exists.
Update 2:
Some of you are reading too much into the example, sorry, i'll give another scenario that should work:
string: "?A&3/3/C)412&"
substring to find: "A41"
chars to ignore: "&", "/", "3", "C", ")"
found the substring, result: "A&3/3/C)41"
And as a bonus (not required per se), it would be great if the solution did not assume that the substring to find is free of the ignored chars, e.g. given the last example we should be able to do:
substring to find: "A3C412&"
chars to ignore: "&", "/", "3", "C", ")"
found the substring, result: "A&3/3/C)412&"
Sorry if I wasn't clear before, or still I'm not :).
Update 3:
Thanks to everyone who helped!, this is the implementation I'm working with for now:
http://www.pastebin.com/pYHbb43Z
And here are some tests:
http://www.pastebin.com/qh01GSx2
I'm using some custom extension methods I'm not including, but I believe they should be self-explanatory (I will add them if you like).
I've taken a lot of your ideas for the implementation and the tests, but I'm giving the answer to @PierrOz because he was one of the first, and pointed me in the right direction.
Feel free to keep giving suggestions as alternative solutions or comments on the current state of the impl. if you like.
In your example you would do:
string input = "Hello, -this-, is a string";
string ignore = "[-,]*";
Regex r = new Regex(string.Format("H{0}e{0}l{0}l{0}o{0} {0}t{0}h{0}i{0}s{0}", ignore));
Match m = r.Match(input);
return m.Success ? m.Value : string.Empty;
To do this dynamically, build the character class [-,] from all the characters to ignore and insert it between the characters of your query.
Take care with '-' inside a character class []: put it at the beginning or at the end.
So, more generically, it would give something like:
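public string Test(string query, string input, char[] ignoreList)
{
// Build a character class of ignored characters
string ignorePattern = "[";
for (int i = 0; i < ignoreList.Length; i++)
{
if (ignoreList[i] == '-')
{
// Put '-' first so it is not read as a range; strings are immutable, so keep the result
ignorePattern = ignorePattern.Insert(1, "-");
}
else
{
ignorePattern += ignoreList[i];
}
}
ignorePattern += "]*";
// Follow every character of the query with the ignore class
string pattern = "";
for (int i = 0; i < query.Length; i++)
{
pattern += Regex.Escape(query[i].ToString()) + ignorePattern;
}
Regex r = new Regex(pattern);
Match m = r.Match(input);
return m.Success ? m.Value : string.Empty;
}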
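A variation on the same interleaving idea, offered only as a sketch: the names IgnoringSearch, Find and EscapeForClass are hypothetical. It differs from the method above in two assumed design choices: characters that are special inside a character class are escaped, and the ignore class is placed only between query characters, so the match does not swallow ignored characters after the last one.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class IgnoringSearch
{
    // Escapes the characters that have a special meaning inside [...].
    private static string EscapeForClass(char c)
    {
        return (c == '\\' || c == ']' || c == '[' || c == '^' || c == '-')
            ? "\\" + c
            : c.ToString();
    }

    public static string Find(string input, string query, char[] ignore)
    {
        if (string.IsNullOrEmpty(query)) return string.Empty;

        // "[...]*" matches any run of ignored characters, including none.
        string gap = (ignore == null || ignore.Length == 0)
            ? ""
            : "[" + string.Concat(ignore.Select(c => EscapeForClass(c))) + "]*";

        // Join puts the gap between query characters, not after the last one.
        string pattern = string.Join(gap, query.Select(ch => Regex.Escape(ch.ToString())));

        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Value : string.Empty;
    }
}
```

With the inputs from the question, Find("Hello, -this- is a string", "Hello this", new[] { ',', '-' }) returns "Hello, -this", and the second scenario returns "A&3/3/C)41". Because the gap may match nothing, a query that itself contains ignored characters still matches through backtracking.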
Here's a non-regex string extension option:
public static class StringExtensions
{
public static bool SubstringSearch(this string s, string value, char[] ignoreChars, out string result)
{
if (String.IsNullOrEmpty(value))
throw new ArgumentException("Search value cannot be null or empty.", "value");
bool found = false;
int matches = 0;
int startIndex = -1;
int length = 0;
for (int i = 0; i < s.Length && !found; i++)
{
if (startIndex == -1)
{
if (s[i] == value[0])
{
startIndex = i;
++matches;
++length;
}
}
else
{
if (s[i] == value[matches])
{
++matches;
++length;
}
else if (ignoreChars != null && ignoreChars.Contains(s[i]))
{
++length;
}
else
{
startIndex = -1;
matches = 0;
length = 0;
}
}
found = (matches == value.Length);
}
if (found)
{
result = s.Substring(startIndex, length);
}
else
{
result = null;
}
return found;
}
}
EDIT: here's an updated solution addressing the points in your recent update. The idea is the same, except that a single-word substring needs the ignore pattern inserted between each of its characters. If the substring contains spaces, it is split on the spaces and the ignore pattern is inserted between the words. If you don't need the latter behavior (which was more in line with your original question), you can remove the Split and the if check that selects between the two patterns.
Note that this approach is not going to be the most efficient.
string input = @"foo ?A&3/3/C)412& bar A341C2";
string substring = "A41";
string[] ignoredChars = { "&", "/", "3", "C", ")" };
// builds up the ignored pattern and ensures a dash char is placed at the end to avoid unintended ranges
string ignoredPattern = String.Concat("[",
String.Join("", ignoredChars.Where(c => c != "-")
.Select(c => Regex.Escape(c)).ToArray()),
(ignoredChars.Contains("-") ? "-" : ""),
"]*?");
string[] substrings = substring.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
string pattern = "";
if (substrings.Length > 1)
{
pattern = String.Join(ignoredPattern, substrings);
}
else
{
pattern = String.Join(ignoredPattern, substring.Select(c => c.ToString()).ToArray());
}
foreach (Match match in Regex.Matches(input, pattern))
{
Console.WriteLine("Index: {0} -- Match: {1}", match.Index, match.Value);
}
Try this solution out:
string input = "Hello, -this- is a string";
string[] searchStrings = { "Hello", "this" };
string pattern = String.Join(@"\W+", searchStrings);
foreach (Match match in Regex.Matches(input, pattern))
{
Console.WriteLine(match.Value);
}
The \W+ matches one or more non-word characters (characters that \w does not match). If you feel like specifying them yourself, you can replace it with a character class of the characters to ignore, such as [ ,.-]+ (always place the dash character at the start or end to avoid unintended range specifications). Also, if you need case to be ignored, use RegexOptions.IgnoreCase:
Regex.Matches(input, pattern, RegexOptions.IgnoreCase)
If your substring is in the form of a complete string, such as "Hello this", you can easily get it into an array form for searchString in this way:
string[] searchString = substring.Split(new[] { ' ' },
StringSplitOptions.RemoveEmptyEntries);
This code will do what you want, although I suggest you modify it to fit your needs better:
string resultString = null;
try
{
resultString = Regex.Match(subjectString, "Hello[, -]*this", RegexOptions.IgnoreCase).Value;
}
catch (ArgumentException ex)
{
// Syntax error in the regular expression
}
You could do this with a single Regex but it would be quite tedious as after every character you would need to test for zero or more ignored characters. It is probably easier to strip all the ignored characters with Regex.Replace(subject, "[-,]", ""); then test if the substring is there.
Or the single Regex way
Regex.IsMatch(subject, "H[-,]*e[-,]*l[-,]*l[-,]*o[-,]* [-,]*t[-,]*h[-,]*i[-,]*s[-,]*")
Here's a non-regex way to do it using string parsing.
private string GetSubstring()
{
string searchString = "Hello, -this- is a string";
string searchStringWithoutUnwantedChars = searchString.Replace(",", "").Replace("-", "");
string desiredString = string.Empty;
if(searchStringWithoutUnwantedChars.Contains("Hello this"))
{
// Substring takes (startIndex, length), so compute the length from the two positions
int start = searchString.IndexOf("Hello");
int end = searchString.IndexOf("this", start) + "this".Length;
desiredString = searchString.Substring(start, end - start);
}
return desiredString;
}
You could do something like this, since almost all of these answers require rebuilding the string in some form.
string1 is the string you want to look through.
//Create a List<string> that contains the ignored characters
List<string> ignoredCharacters = new List<string>();
//Add all of the characters you wish to ignore in the method you choose
//Use a function here to get a return
public bool subStringExist(List<string> ignoredCharacters, string myString, string toMatch)
{
//Copy Your string to a temp
string tempString = myString;
bool match = false;
//Replace Everything that you don't want
foreach (string item in ignoredCharacters)
{
tempString = tempString.Replace(item, "");
}
//Check if your substring exist
if (tempString.Contains(toMatch))
{
match = true;
}
return match;
}
You could always use a combination of RegEx and string searching
public class RegExpression {
public static void Example(string input, string ignore, string find)
{
string output = string.Format("Input: {1}{0}Ignore: {2}{0}Find: {3}{0}{0}", Environment.NewLine, input, ignore, find);
if (SanitizeText(input, ignore).ToString().Contains(SanitizeText(find, ignore)))
Console.WriteLine(output + "was matched");
else
Console.WriteLine(output + "was NOT matched");
Console.WriteLine();
}
public static string SanitizeText(string input, string ignore)
{
Regex reg = new Regex("[^" + ignore + "]");
StringBuilder newInput = new StringBuilder();
foreach (Match m in reg.Matches(input))
{
newInput.Append(m.Value);
}
return newInput.ToString();
}
}
Usage would be like
RegExpression.Example("Hello, -this- is a string", "-,", "Hello this"); //Should match
RegExpression.Example("Hello, -this- is a string", "-,", "Hello this2"); //Should not match
RegExpression.Example("?A&3/3/C)412&", "&/3C\\)", "A41"); // Should match
RegExpression.Example("?A&3/3/C) 412&", "&/3C\\)", "A41"); // Should not match
RegExpression.Example("?A&3/3/C)412&", "&/3C\\)", "A3C412&"); // Should match
Output
Input: Hello, -this- is a string
Ignore: -,
Find: Hello this
was matched
Input: Hello, -this- is a string
Ignore: -,
Find: Hello this2
was NOT matched
Input: ?A&3/3/C)412&
Ignore: &/3C)
Find: A41
was matched
Input: ?A&3/3/C) 412&
Ignore: &/3C)
Find: A41
was NOT matched
Input: ?A&3/3/C)412&
Ignore: &/3C)
Find: A3C412&
was matched
