Find a string in a list of strings in c#

I am trying to find out whether a list of strings contains a specific string in C#.
For example, suppose I have 3 entries in my list:
List<string> s1 = new List<string>(){
"the lazy boy went to the market in a car",
"tom",
"balloon"};
string s2 = "market";
Now I want to return true if s1 contains s2, which it does in this case.
return s1.Contains(s2);
This returns false, which is not what I want. I was reading about Predicate but could not make much sense of it for this case.
Thanks in advance.

The simplest way is to search each string individually:
bool exists = s1.Any(s => s.Contains(s2));
The List<string>.Contains() method is going to check if any whole string matches the string you ask for. You need to check each individual list element to accomplish what you want.
Note that this can be a time-consuming operation if your list has a large number of elements or very long strings, especially when the string you're searching for either does not exist or is found only near the end of the list.
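To make the difference concrete, here is a minimal sketch that contrasts the two calls on the list from the question. The names ListSearch and ContainsSubstring are made up for illustration, and the optional StringComparison parameter is an extra that the question did not ask for.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ListSearch
{
    // True if any non-null element contains 'value' as a substring.
    public static bool ContainsSubstring(IEnumerable<string> items, string value,
        StringComparison comparison = StringComparison.Ordinal)
    {
        return items.Any(s => s != null && s.IndexOf(value, comparison) >= 0);
    }
}

public static class Program
{
    public static void Main()
    {
        var s1 = new List<string>
        {
            "the lazy boy went to the market in a car",
            "tom",
            "balloon"
        };

        // List<string>.Contains compares whole elements, so this prints False.
        Console.WriteLine(s1.Contains("market"));

        // Searching inside each element finds the substring, so this prints True.
        Console.WriteLine(ListSearch.ContainsSubstring(s1, "market"));

        // Optional: ignore case while searching. Prints True.
        Console.WriteLine(ListSearch.ContainsSubstring(s1, "MARKET", StringComparison.OrdinalIgnoreCase));
    }
}
```

Because Any stops at the first match, the whole list is only scanned when the string is missing or sits near the end.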

An alternative to Contains is IndexOf:
var res = s1.Any(s => s.IndexOf(s2, StringComparison.Ordinal) >= 0);
StringComparison.Ordinal is passed as the parameter because Contains() also uses an ordinal comparison internally.

Peter Duniho's answer is generally the best way. I am providing an alternate solution. This one does not require LINQ, lambdas, or loops. It only needs methods of the built-in string type.
string.Concat(listOfString).Contains("data");
Note: This approach can lead to incorrect results. For example:
string.Concat("da", "ta").Contains("data");
will return true when it should be false, because the match spans the boundary between two elements.

Related

c# Remove elements from List containing string [duplicate]

What would be the fastest way to check if a string contains any matches in a string array in C#? I can do it using a loop, but I think that would be too slow.

Using LINQ:
return array.Any(s => s.Equals(myString));
Granted, you might want to take culture and case into account, but that's the general idea.
Also, if equality is not what you meant by "matches", you can substitute whatever function you need for the match.

I really couldn't tell you if this is absolutely the fastest way, but one of the ways I have commonly done this is:
This will check if the string contains any of the strings from the array:
string[] myStrings = { "a", "b", "c" };
string checkThis = "abc";
if (myStrings.Any(checkThis.Contains))
{
MessageBox.Show("checkThis contains a string from string array myStrings.");
}
To check if the string contains all the strings (elements) of the array, simply change myStrings.Any in the if statement to myStrings.All.
I don't know what kind of application this is, but I often need to use:
if (myStrings.Any(checkThis.ToLowerInvariant().Contains))
That way, when you are checking user input, it doesn't matter whether the user enters the string in CAPITAL letters, as long as the entries in myStrings are lowercase. The same idea works in reverse with ToUpperInvariant().
Hope this helped!

That works fine for me:
string[] characters = new string[] { ".", ",", "'" };
bool contains = characters.Any(c => word.Contains(c));

You could combine the strings with regex "or" statements (alternation), and then "do it in one pass," but technically the regex would still be performing a loop internally. Ultimately, looping is necessary.

If the "array" will never change (or change only infrequently), and you'll have many input strings that you're testing against it, then you could build a HashSet<string> from the array. HashSet<T>.Contains is an O(1) operation, as opposed to a loop which is O(N).
But it would take some (small) amount of time to build the HashSet. If the array will change frequently, then a loop is the only realistic way to do it.
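As a rough sketch of the HashSet idea, assuming "matches" means whole-string equality: the names LookupBuilder and BuildLookup are invented for this example, and the case-insensitive comparer is an optional choice, not something the question requires.

```csharp
using System;
using System.Collections.Generic;

public static class LookupBuilder
{
    // Builds the set once; the comparer makes lookups case-insensitive.
    public static HashSet<string> BuildLookup(IEnumerable<string> items)
    {
        return new HashSet<string>(items, StringComparer.OrdinalIgnoreCase);
    }
}

public static class Program
{
    public static void Main()
    {
        string[] array = { "apple", "banana", "cherry" };
        HashSet<string> lookup = LookupBuilder.BuildLookup(array);

        // Each lookup is a hash probe rather than a scan of the array.
        Console.WriteLine(lookup.Contains("Banana")); // True
        Console.WriteLine(lookup.Contains("ban"));    // False: only whole strings match
    }
}
```

A set like this only answers "is this exact string in the array?". It does not help with substring searches, which still need a loop over the candidates.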

Build regular expression for replacing duplicated string into single word

I'm working on filtering comments. I'd like to replace a string like this:
llllolllllllllllooooooooooooouuuuuuuuuuuddddddddddddddllllollllllllllllloooooooooooooooooouuuuuuuuuuuuuuuuuuddddddddddddddllllollllllllllllloooooooooooooooooouuuuuuuuuuuuuuuuuuddddddddddddddllllollllllllllllloooooooooooouuuuuuuuuuuuuuuuudddddddddddddd
with two words: lol loud
string like this:
cuytwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww
with: cuytw
And string like this:
hyyuyuyuyuyuyuyuyuyuyuyuyuyu
with: hyu
but not modify strings like look, geek.
Is there any way to achieve this with single regular expression in C#?

I think I can answer this categorically.
This definitely can't be done with RegEx, or even with standard code, given your input and output requirements, without at minimum some sort of dictionary and an algorithm that tries to reduce doubled letters and checks the resulting permutations against legitimate words.
The result (at best) would give you a list of possible non-mutually-exclusive combinations of nonsense words and legitimate words with doubles.
In fact, I'd go as far as to say that, with your current requirements and no more specific rules, your input and output are impossible in general and could only be taken at face value for the cases you have given.

I'm not sure how to use RegEx for this problem, but here is an alternative which is arguably easier to read.*
Assuming you just want to return a string comprising the distinct letters of the input in order, you can use GroupBy:
private static string filterString(string input)
{
var groups = input.GroupBy(c => c);
var output = new string(groups.Select(g => g.Key).ToArray());
return output;
}
Passes:
Returns loud for llllolllllllllllooooooooooooouuuuuuuuuuuddddddddddddddllllollllllllllllloooooooooooooooooouuuuuuuuuuuuuuuuuuddddddddddddddllllollllllllllllloooooooooooooooooouuuuuuuuuuuuuuuuuuddddddddddddddllllollllllllllllloooooooooooouuuuuuuuuuuuuuuuudddddddddddddd
Returns cuytw for cuytwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww
Returns hyu for hyyuyuyuyuyuyuyuyuyuyuyuyuyu
Failures:
Returns lok for look
Returns gek for geek
* On second read, you want to leave words like look and geek alone, so this is only a partial answer.
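One partial heuristic that does leave look and geek alone is to collapse only runs of three or more identical characters. This is a sketch under that assumption, not a solution to the full question: it does not handle repeated pairs such as hyyuyuyu, and it cannot split the first example into "lol loud". The names RunCollapser and CollapseLongRuns are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RunCollapser
{
    // Matches one character followed by two or more copies of itself.
    private static readonly Regex LongRun = new Regex(@"(.)\1{2,}");

    // Replaces every run of three or more identical characters with one copy.
    public static string CollapseLongRuns(string input)
    {
        return LongRun.Replace(input, "$1");
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(RunCollapser.CollapseLongRuns("cuytwwwwwwww")); // cuytw
        Console.WriteLine(RunCollapser.CollapseLongRuns("loooool"));      // lol
        Console.WriteLine(RunCollapser.CollapseLongRuns("look"));         // look
        Console.WriteLine(RunCollapser.CollapseLongRuns("geek"));         // geek
    }
}
```

The threshold of three is arbitrary; a legitimate word with a tripled letter would still be altered.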

Most efficient way of adding/removing a character to beginning of string?

I was doing a small 'scalable' C# MVC project, with quite a bit of read/write to a database.
From this, I would need to add/remove the first letter of the input string.
'Removing' the first character is quite easy (using a Substring method) - using something like:
String test = "HHello world";
test = test.Substring(1,test.Length-1);
'Adding' a character efficiently seems to be messy/awkward:
String test = "ello World";
test = "H" + test;
Seeing as this will be done for a lot of records, would this be the most efficient way of doing these operations?
I am also testing whether a string starts with the letter 'T', and adding 'T' if it doesn't, by using:
String test = "Hello World";
if(test[0]!='T')
{
test = "T" + test;
}
and would like to know whether this is a suitable approach.

If you have several records and need to prepend a character to a field in each of them, you can use String.Insert with an index of 0 http://msdn.microsoft.com/it-it/library/system.string.insert(v=vs.110).aspx
yourString = yourString.Insert(0, "C");
This does pretty much the same as what you wrote in your original post, but since it seems you prefer to use a method rather than an operator...
If you have to prepend a character several times to a single string, then you're better off using a StringBuilder http://msdn.microsoft.com/it-it/library/system.text.stringbuilder(v=vs.110).aspx

Both are equally efficient, I think, because string is immutable and both approaches have to create a new string.
When doing this on the same string multiple times, a StringBuilder might come in handy when adding. It will perform better than repeated concatenation.
You could also opt to move this operation to the database side if possible. That might increase performance too.

For removing, I would use the Remove method, as this doesn't require knowing the length of the string:
test = test.Remove(0, 1);
For adding, you could use Insert:
test = test.Insert(0, "H");
If you are always removing and then adding a character, you can treat the string as a char array and just replace the first character:
char[] chars = test.ToCharArray();
chars[0] = 'H'; // overwrite the first character
test = new string(chars);
When doing lots of operations on the same string I would use a StringBuilder, though: it is more expensive to create but makes the operations on the string faster.
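The check from the question, test[0] != 'T', throws on an empty string. Below is a small sketch of the add/remove operations with that case guarded. PrefixHelper, EnsurePrefix and RemoveFirst are hypothetical names chosen for this example, and null input is not handled.

```csharp
using System;

public static class PrefixHelper
{
    // Returns 'value' unchanged if it already starts with 'prefix'; otherwise prepends it.
    public static string EnsurePrefix(string value, char prefix)
    {
        if (value.Length > 0 && value[0] == prefix)
        {
            return value;
        }
        return prefix + value; // char + string yields a new string
    }

    // Drops the first character; an empty string is returned unchanged.
    public static string RemoveFirst(string value)
    {
        return value.Length == 0 ? value : value.Substring(1);
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(PrefixHelper.EnsurePrefix("Hello World", 'T')); // THello World
        Console.WriteLine(PrefixHelper.EnsurePrefix("Test", 'T'));        // Test
        Console.WriteLine(PrefixHelper.RemoveFirst("HHello world"));      // Hello world
    }
}
```

Each call allocates at most one new string, which is the same cost as the concatenation and Substring calls in the question.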

How to compare strings by ignore " " prefix and postfix without call string.Trim() [duplicate]

Possible Duplicate:
String.comparison performance (with trim)
I would like to write a function that judges whether two strings are equal while ignoring the first string's leading and trailing whitespace, without calling string.Trim().
Please also consider case-insensitive comparison.
Suppose:
string str1 = " Abc ";
string str2 = "abc";
bool trueEqual = IsEqualWithoutWhiteSpace(str1, str2, /*ignore case?*/ true); // return true.

I assume you want to do this for performance reasons (this would be a valid reason in some cases - I'll spare you the usual premature optimization warnings).
First, count the number of whitespace chars in both strings at the beginning and at the end. If the non-whitespace portion is not of the same length, return false. Now we know it has the same length.
Next, call stringA.IndexOf(stringB, ...) with the appropriate start and count arguments to determine if a match was found. If there was a match, the strings are equal according to your definition.
If you don't need case-insensitivity you can use a loop to compare the middle part of both strings, too.
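The steps above can be sketched roughly as follows. Note that this version swaps IndexOf for the string.Compare overload that takes start indexes and a length, which compares the two regions directly without allocating trimmed copies. The class and method names are invented for illustration, and whitespace is skipped on both strings, as the answer describes.

```csharp
using System;

public static class WhitespaceInsensitive
{
    // Finds the [start, end) range of 's' that excludes leading and trailing whitespace.
    private static void FindCore(string s, out int start, out int end)
    {
        start = 0;
        end = s.Length;
        while (start < end && char.IsWhiteSpace(s[start])) start++;
        while (end > start && char.IsWhiteSpace(s[end - 1])) end--;
    }

    public static bool EqualsIgnoringOuterWhitespace(string a, string b, bool ignoreCase)
    {
        FindCore(a, out int startA, out int endA);
        FindCore(b, out int startB, out int endB);

        // Different core lengths can never be equal.
        int length = endA - startA;
        if (length != endB - startB) return false;

        StringComparison comparison = ignoreCase
            ? StringComparison.OrdinalIgnoreCase
            : StringComparison.Ordinal;

        // Compare only the non-whitespace regions, in place.
        return string.Compare(a, startA, b, startB, length, comparison) == 0;
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(WhitespaceInsensitive.EqualsIgnoringOuterWhitespace(" Abc ", "abc", true));  // True
        Console.WriteLine(WhitespaceInsensitive.EqualsIgnoringOuterWhitespace(" Abc ", "abc", false)); // False
    }
}
```

Whether this actually beats Trim() plus Equals would need to be measured; the sketch only shows that the comparison can be done without creating new strings.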

As the comments show, there is no direct reason not to use Trim. So if you are convinced that Trim is fine, here is one way to do it:
public static bool IsEqualWithoutWhiteSpace(this string aLhs, string aRhs)
{
var left = aLhs.Trim();
var right = aRhs.Trim();
return left.Equals(right, StringComparison.OrdinalIgnoreCase);
}
string str1 = " Abc ";
string str2 = "abc";
var b = str1.IsEqualWithoutWhiteSpace(str2);
If this is for performance reasons, please reconsider your question, because you are asking "Can somebody write a function for me" instead of "Can somebody explain why my code is not working".

look ma, no Trim()!
string str1 = " AbC ";
string str2 = "abc";
var r = new Regex("^[ \t]+|[ \t]+$");
var trimStr1 = r.Replace(str1, "");
var trimStr2 = r.Replace(str2, "");
return trimStr1.Equals(trimStr2, StringComparison.OrdinalIgnoreCase);
Although, like everybody else is saying, I would certainly use Trim() for this in one of my own projects.

To kick the can once more: Trim is your friend. However, another way to avoid using Trim would be something like:
bool found = string1.Contains(string2);
This works if you are sure that string1 is the one that potentially has the whitespace and you want the check to be case sensitive. Note that Contains also returns true when string2 is only a substring of the non-whitespace part of string1. If you don't want to be case sensitive, throw in a .ToLower() after each string:
bool found = string1.ToLower().Contains(string2.ToLower());
Finally, if you aren't sure which one has the whitespace but only one of them will (e.g. one of the two inputs comes from a reasonably reliable source, like a string in code, and the other from something a user added, but you aren't sure in which order a user of your code might pass them), you can use the or operator:
bool found = string1.Contains(string2) || string2.Contains(string1);
These are kind of special cases, though, and the .NET team does a pretty good job of making their code efficient, so calling Trim and comparing once is likely more efficient than comparing twice to avoid the Trim call. A comparison has to look at at least Len2 - Len1 combinations, whereas a trim can terminate once it finds a non-whitespace character, so it is logically a simpler problem for the function call to solve.

Using .NET RegEx to retrieve part of a string after the second '-'

This is my first stack message. Hope you can help.
I have several strings I need to break up for use later. Here are a couple of examples of what I mean:
fred-064528-NEEDED
frederic-84728957-NEEDED
sam-028-NEEDED
As you can see above, the string lengths vary greatly, so I believe regex is the only way to achieve what I want. What I need is the rest of the string after the second hyphen ('-').
I am very weak at regex, so any help would be great.
Thanks in advance.

Just to offer an alternative without using regex:
foreach (string s in list)
{
int x = s.LastIndexOf('-');
string sub = s.Substring(x + 1);
}
Add validation to taste.
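One way the validation might look is sketched below. Unlike the loop above, which takes the text after the last hyphen, this variant searches forward for the second hyphen, so any further hyphens stay in the result. HyphenParser and AfterSecondHyphen are hypothetical names, and returning null for malformed input is just one possible choice.

```csharp
using System;

public static class HyphenParser
{
    // Returns the text after the second '-', or null if there are fewer than two hyphens.
    public static string AfterSecondHyphen(string s)
    {
        int first = s.IndexOf('-');
        if (first < 0) return null;

        int second = s.IndexOf('-', first + 1);
        if (second < 0) return null;

        return s.Substring(second + 1);
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(HyphenParser.AfterSecondHyphen("fred-064528-NEEDED"));        // NEEDED
        Console.WriteLine(HyphenParser.AfterSecondHyphen("frederic-84728957-NEE-DED")); // NEE-DED
        Console.WriteLine(HyphenParser.AfterSecondHyphen("sam-028") == null);           // True
    }
}
```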

Something like this. It will take everything (except line breaks) after the second '-', including any further '-' signs.
var exp = @"^\w*-\w*-(.*)$";
var match = Regex.Match("frederic-84728957-NEE-DED", exp);
if (match.Success)
{
var result = match.Groups[1]; //Result is NEE-DED
Console.WriteLine(result);
}
EDIT: I answered another question which relates to this. Except, it asked for a LINQ solution and my answer was the following which I find pretty clear.
Pimp my LINQ: a learning exercise based upon another post
var result = String.Join("-", inputData.Split('-').Skip(2));
or
var result = inputData.Split('-').Skip(2).FirstOrDefault(); //If the last part is NEE-DED then only NEE is returned.
As mentioned in the other SO thread it is not the fastest way of doing this.

If they are part of larger text:
(\w+-){2}(\w+)
If they are presented as whole lines, and you know you don't have other hyphens, you may also use:
[^-]*$
Another option, if you have each line as a string, is to use split (again, depending on whether or not you're expecting extra hyphens, you may omit the count parameter, or use LastIndexOf):
string[] tokens = line.Split("-".ToCharArray(), 3);
string s = tokens.Last();

This should work:
.*?-.*?-(.*)

This should do the trick:
([^\-]+)\-([^\-]+)\-(.*?)$

the regex pattern will be
(?<first>.*)?-(?<second>.*)?-(?<third>.*)?(\s|$)
then you can get the named group "third" to get the text after the 2nd hyphen.
Alternatively, you can do a string.Split('-') and take the item at index 2 of the array.
