c# Remove elements from List containing string [duplicate] - c#

What would be the fastest way to check if a string contains any matches in a string array in C#? I can do it using a loop, but I think that would be too slow.

Using LINQ:
return array.Any(s => s.Equals(myString))
Granted, you might want to take culture and case into account, but that's the general idea.
Also, if equality is not what you meant by "matches", you can substitute whatever function you need for the "match".

I can't say whether this is the fastest way, but here is an approach I commonly use.
It checks whether the string contains any of the strings from the array:
string[] myStrings = { "a", "b", "c" };
string checkThis = "abc";
if (myStrings.Any(checkThis.Contains))
{
MessageBox.Show("checkThis contains a string from string array myStrings.");
}
To check if the string contains all the strings (elements) of the array, simply change myStrings.Any in the if statement to myStrings.All.
I don't know what kind of application this is, but I often need to use:
if (myStrings.Any(checkThis.ToLowerInvariant().Contains))
So if you are checking user input, it won't matter whether the user types the string in CAPITAL letters, provided the strings in the array are lowercase as well. The same idea works in reverse with ToUpperInvariant().
Hope this helped!

That works fine for me:
string[] characters = new string[] { ".", ",", "'" };
bool contains = characters.Any(c => word.Contains(c));

You could combine the strings into a single regex using "or" (alternation), and then "do it in one pass," but technically the regex would still be performing a loop internally. Ultimately, looping is necessary.

If the "array" will never change (or change only infrequently), and you'll have many input strings that you're testing against it, then you could build a HashSet<string> from the array. HashSet<T>.Contains is an O(1) operation, as opposed to a loop which is O(N).
But it would take some (small) amount of time to build the HashSet. If the array will change frequently, then a loop is the only realistic way to do it.
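To make the HashSet idea concrete, here is a minimal sketch. The class and method names (ExactMatchLookup, BuildSet, IsKnown) are made up for illustration, and the case-insensitive comparer is an assumption on my part rather than something the question asked for. Keep in mind that a set lookup only answers "is this whole string in the array?"; it does not find substrings.

```csharp
using System;
using System.Collections.Generic;

public static class ExactMatchLookup
{
    // Build the set once and reuse it for every input string.
    // StringComparer.OrdinalIgnoreCase is an assumption made for this sketch;
    // pass StringComparer.Ordinal if case must match exactly.
    public static HashSet<string> BuildSet(IEnumerable<string> words)
    {
        return new HashSet<string>(words, StringComparer.OrdinalIgnoreCase);
    }

    // Whole-string lookup only: "bet" is not found in a set containing "beta".
    public static bool IsKnown(HashSet<string> set, string candidate)
    {
        return set.Contains(candidate);
    }
}
```

Typical use would be to call BuildSet once at startup and then call IsKnown for each incoming string.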

Related

Build regular expression for replacing duplicated string into single word

I'm working on filtering comments. I'd like to replace a string like this:
llllolllllllllllooooooooooooouuuuuuuuuuuddddddddddddddllllollllllllllllloooooooooooooooooouuuuuuuuuuuuuuuuuuddddddddddddddllllollllllllllllloooooooooooooooooouuuuuuuuuuuuuuuuuuddddddddddddddllllollllllllllllloooooooooooouuuuuuuuuuuuuuuuudddddddddddddd
with two words: lol loud
string like this:
cuytwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww
with: cuytw
And string like this:
hyyuyuyuyuyuyuyuyuyuyuyuyuyu
with: hyu
but not modify strings like look, geek.
Is there any way to achieve this with single regular expression in C#?
I think I can answer this categorically.
This definitely can't be done with a regex, or even with ordinary code, given your input and output requirements, unless you have at minimum some sort of dictionary plus an algorithm that tries reducing doubled letters and checks the resulting permutations against legitimate words.
The result (at best) would give you a list of possible, not mutually exclusive, combinations of nonsense words and legitimate words with doubled letters.
In fact, I'd go as far as to say that with your current requirements and no more specific rules, your input and output are impossible to handle generically and could only be taken at face value for the cases you have given.
I'm not sure how to use RegEx for this problem, but here is an alternative which is arguably easier to read.*
Assuming you just want to return a string comprising the distinct letters of the input in order, you can use GroupBy:
private static string filterString(string input)
{
var groups = input.GroupBy(c => c);
var output = new string(groups.Select(g => g.Key).ToArray());
return output;
}
Passes:
Returns loud for llllolllllllllllooooooooooooouuuuuuuuuuuddddddddddddddllllollllllllllllloooooooooooooooooouuuuuuuuuuuuuuuuuuddddddddddddddllllollllllllllllloooooooooooooooooouuuuuuuuuuuuuuuuuuddddddddddddddllllollllllllllllloooooooooooouuuuuuuuuuuuuuuuudddddddddddddd
Returns cuytw for cuytwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww
Returns hyu for hyyuyuyuyuyuyuyuyuyuyuyuyuyu
Failures:
Returns lok for look
Returns gek for geek
* On second read you want to leave words like look and geek alone; this is a partial answer.
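For what a regex can do here, the following is only a partial sketch under a stated assumption: a run of three or more identical characters is treated as noise and collapsed to one character, while runs of two are left alone. That rule is my assumption, not something from the question, and the names RunCollapser and CollapseLongRuns are hypothetical. It keeps look and geek intact and handles the cuytwww... case, but it does not reduce hyyuyuyu... to hyu and it cannot produce "lol loud", for the dictionary reasons given in the first answer.

```csharp
using System.Text.RegularExpressions;

public static class RunCollapser
{
    // (.) captures one character; \1{2,} requires at least two more copies
    // of that same character, so only runs of length 3 or more match.
    private static readonly Regex LongRun = new Regex(@"(.)\1{2,}");

    // Replaces each long run with a single copy of the repeated character.
    public static string CollapseLongRuns(string input)
    {
        return LongRun.Replace(input, "$1");
    }
}
```

With this rule, CollapseLongRuns("look") returns "look" unchanged, and CollapseLongRuns("cuytwwwww") returns "cuytw".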

Find a string in a list of strings in c#

I am trying to find if a list of strings contains a specific string in C#.
for example: Suppose I have 3 entries in my list
List<string> s1 = new List<string>(){
"the lazy boy went to the market in a car",
"tom",
"balloon"};
string s2 = "market";
Now I want to return true if s1 contains s2, which it does in this case.
return s1.Contains(s2);
This returns false which is not what I want. I was reading about Predicate but could not make much sense out of it for this case.
Thanks in advance.
The simplest way is to search each string individually:
bool exists = s1.Any(s => s.Contains(s2));
The List<string>.Contains() method is going to check if any whole string matches the string you ask for. You need to check each individual list element to accomplish what you want.
Note that this may be a time-consuming operation, if your list has a large number of elements in it, very long strings, and especially in the case where the string you're searching for either does not exist or is found only near the end of the list.
An alternative to Contains is IndexOf:
var res = s1.Any(s => s.IndexOf(s2, StringComparison.Ordinal) >= 0);
StringComparison.Ordinal is passed as the parameter because Contains() also uses it internally.
Peter Duniho's answer is generally the best way. I am providing an alternate solution. This one does not require LINQ, lambdas, or loops. It only requires the built-in string type's methods.
string.Concat(listOfString).Contains("data");
Note: This approach can lead to incorrect results. For example:
string.Concat("da", "ta").Contains("data");
will return true when it should be false.
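Putting the per-element search together, here is a small sketch of a helper that checks each list element separately, so it avoids the concatenation pitfall. The names ListSearch and AnyElementContains are hypothetical, and the optional comparison parameter is my addition to show how a case-insensitive search could be done with IndexOf.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ListSearch
{
    // Returns true if any single element contains the fragment.
    // Null elements are skipped rather than throwing.
    public static bool AnyElementContains(
        IEnumerable<string> items,
        string fragment,
        StringComparison comparison = StringComparison.Ordinal)
    {
        return items.Any(s => s != null && s.IndexOf(fragment, comparison) >= 0);
    }
}
```

For the list in the question, ListSearch.AnyElementContains(s1, "market") is true, and passing StringComparison.OrdinalIgnoreCase would also accept "MARKET". Because elements are tested one at a time, a list holding "da" and "ta" does not match "data".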

Most efficient way of adding/removing a character to beginning of string?

I was doing a small 'scalable' C# MVC project, with quite a bit of read/write to a database.
From this, I would need to add/remove the first letter of the input string.
'Removing' the first character is quite easy (using a Substring method) - using something like:
String test = "HHello world";
test = test.Substring(1,test.Length-1);
'Adding' a character efficiently seems to be messy/awkward:
String test = "ello World";
test = "H" + test;
Seeing as this will be done for a lot of records, would this be the most efficient way of doing these operations?
I am also testing whether a string starts with the letter 'T', and adding 'T' if it doesn't, like this:
String test = "Hello World";
if(test[0]!='T')
{
test = "T" + test;
}
I would like to know whether this approach is suitable.
If you have several records and for each record's field you need to add a character at the beginning, you can use String.Insert with an index of 0 http://msdn.microsoft.com/it-it/library/system.string.insert(v=vs.110).aspx
yourString = yourString.Insert(0, "C");
This does pretty much the same as what you wrote in your original post, but since it seems you prefer to use a method and not an operator...
If you have to append a character several times, to a single string, then you're better using a StringBuilder http://msdn.microsoft.com/it-it/library/system.text.stringbuilder(v=vs.110).aspx
Both are equally efficient I think since both require a new string to be initialized, since string is immutable.
When doing this on the same string multiple times, a StringBuilder might come in handy. That will perform better than repeated concatenation.
You could also opt to move this operation to the database side if possible. That might increase performance too.
For removing I would use the Remove method, as it doesn't require knowing the length of the string:
test = test.Remove(0, 1);
You could also treat the string as an array for the Add and use
test = test.Insert(0, "H");
If you are always removing and then adding a character, you can copy the string to a char array and just replace the character:
char[] chars = test.ToCharArray(); chars[0] = 'H'; // strings are immutable, so edit a copy
test = new string(chars);
When doing lots of operations on the same string I would use a StringBuilder though; it is more expensive to create but gives faster operations on the string.
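One thing the snippets in this thread gloss over is that test[0] throws on an empty string and Substring(1) throws on one too. Below is a minimal sketch of guarded versions; PrefixHelper, EnsureStartsWith and RemoveFirst are hypothetical names, and returning just the prefix for null or empty input is a design choice I am assuming, not something the question specifies.

```csharp
using System;

public static class PrefixHelper
{
    // Adds the prefix character only when it is missing.
    // Assumption: null or empty input yields a string holding just the prefix.
    public static string EnsureStartsWith(string value, char prefix)
    {
        if (string.IsNullOrEmpty(value))
        {
            return prefix.ToString();
        }
        return value[0] == prefix ? value : prefix + value;
    }

    // Drops the first character; null or empty input is returned unchanged.
    public static string RemoveFirst(string value)
    {
        return string.IsNullOrEmpty(value) ? value : value.Substring(1);
    }
}
```

For example, PrefixHelper.EnsureStartsWith("Hello World", 'T') gives "THello World", while a string that already starts with 'T' comes back as is. Each call still allocates at most one new string, which matches the cost of the plain concatenation in the question.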

C# - using string contains with string array

I have a question regarding C#, strings and arrays.
I've searched for similar questions at stack overflow, but could not find any answers.
My problem:
I have a string array which contains words / word parts used to check file names. If all of the strings in the array match, the file name is "good".
String[] StringArray = new String[] { "wordpart1", "wordpart2", ".txt" };
Now I want to check if all these strings are a part of a filename. If this checkresult is true, I want to do something with this file.
How can I do that?
I already tried different approaches, but none of them work.
i.e.
e.Name.Contains(StringArray)
etc.
I want to avoid using a loop (for, foreach) to check all word parts. Is this possible?
Thanks in advance for any help.
Thanks to LINQ and method group conversions, it can be done easily like this:
bool check = StringArray.All(yourFileName.Contains);
Similar question: Using C# to check if string contains a string in string array
This uses LINQ:
if(stringArray.Any(stringToCheck.Contains))
This checks if stringToCheck contains any one of substrings from
stringArray. If you want to ensure that it contains all the
substrings, change Any to All:
if(stringArray.All(s => stringToCheck.Contains(s)))

How to generate a unique string from a string collection?

I need a way to convert a strings collection into a unique string. This means that I need to have a different string if any of the strings inside the collection has changed.
I'm working on a big solution, so I may not be able to use some better ideas. The required unique string will be used to compare the two collections, so different strings mean different collections. I cannot compare the strings inside one by one because the order may change, and the solution is already built to return a result based on a comparison of two strings. This is an add-on. The generated string will be passed as a parameter for this comparison.
Thank you!
These both work by deciding to use the separator character of ":" and also using an escape character to make it clear when we mean something else by the separator character. We therefore just need to escape all our strings before concatenating them with our separator in between. This gives us unique strings for every collection. All we need to do if we want to make collections the same regardless of order is to sort our collection before we do anything. I should add that my sample uses LINQ and thus assumes the collection implements IEnumerable<string> and that you have a using directive for System.Linq.
You can wrap that up in a function as follows
string GetUniqueString(IEnumerable<string> Collection, bool OrderMatters = true, string Escape = "/", string Separator = ":")
{
if(Escape == Separator)
throw new Exception("Escape character should never equal separator character because it fails in the case of empty strings");
if(!OrderMatters)
Collection = Collection.OrderBy(v=>v);//Sorting fixes ordering issues.
return Collection
.Select(v=>v.Replace(Escape, Escape + Escape).Replace(Separator,Escape + Separator))//Escape String
.Aggregate((a,b)=>a+Separator+b);
}
What about using a hash function?
Considering your constraints, use a delimited approach:
pick a delimiter and an escape method.
e.g. use ; and escape it within strings by \;, also escape \ by \\
So this list of strings...
"A;bc"
"D\ef;"
...becomes "A\;bc;D\\ef\;"
It ain't pretty, but considering that it has to be a string, the good old ways of CSV and its brethren aren't all that bad.
By a "collection string" you mean "collection of strings"?
Here's a naive (but working) approach: sort the collection (to eliminate dependency on order), concat them, and take a hash of that (MD5 for instance).
Trivial to implement, but not very clever performance-wise.
Are you saying that you need to encode a string collection as a string? So for example the collection {"abc", "def"} may be encoded as "sDFSDFSDFSD" but {"a", "b"} might be encoded as "SDFeg". If so, and you don't care about unique keys, then you could use something like SHA or MD5.
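The hash suggestions above can be sketched roughly as follows. This is one possible reading, not the posters' exact method: it sorts with an ordinal comparer so order does not matter, prefixes each string with its length so that {"ab", "c"} and {"a", "bc"} do not concatenate to the same text, and uses SHA-256 where the answers mention MD5 or SHA generally. CollectionFingerprint and Compute are hypothetical names. A hash is not strictly unique, so treat equal fingerprints as "almost certainly the same collection".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

public static class CollectionFingerprint
{
    // Returns a 64-character hex digest that is independent of item order.
    public static string Compute(IEnumerable<string> items)
    {
        var sb = new StringBuilder();
        // Ordinal sort removes dependence on input order and on culture.
        foreach (string s in items.OrderBy(x => x, StringComparer.Ordinal))
        {
            // Length prefix (e.g. "2:ab") makes element boundaries unambiguous.
            sb.Append(s.Length).Append(':').Append(s);
        }

        using (SHA256 sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(sb.ToString()));
            return BitConverter.ToString(hash).Replace("-", "");
        }
    }
}
```

Two collections with the same strings in a different order give the same fingerprint, and changing any one string changes it.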
