string.contains() vs string.equals() or string == performance [closed] - c#

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I'm returning a string from an API that has a length of 45 characters. There is one word that is unique for one condition that doesn't appear in the other condition.
I'm wondering if using string.Contains() is faster performance-wise than comparing the whole string with string.Equals() or string == "blah blah".
I don't know the inner workings of any of these methods, but logically, it seems like Contains() should be faster because it can stop traversing the string after it finds the match. Is this accurate? Incidentally, the word I want to check is the first word in the string.

I agree with D Stanley (comment). You should use String.StartsWith().
That said, I don't know the inner workings of each method either, but I can see your logic. However, String.Contains() may still load the entire string before processing it, in which case the performance difference would be very small.
As a final point, with a string length of only 45 characters, the performance difference should be extremely small. I was shocked when I wrote a junky method to substitute characters and found that it processes ~10 KB of text in a fraction of the blink of an eye. So unless you're doing some crazy handling elsewhere in your app, it shouldn't matter much.
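For illustration, here is a minimal sketch of the three checks side by side. The marker word "ERROR" and the class and method names are made up for the example, since the question does not say what the real word is. StartsWith is given StringComparison.Ordinal explicitly because its default overload is culture-sensitive, whereas Contains(string) and == already compare ordinally.

```csharp
using System;

public static class PrefixCheck
{
    // Hypothetical marker word; the real one is not given in the question.
    private const string Marker = "ERROR";

    // Only inspects the first Marker.Length characters of the response.
    public static bool StartsWithMarker(string response) =>
        response.StartsWith(Marker, StringComparison.Ordinal);

    // Scans the string until the marker is found or the end is reached.
    public static bool ContainsMarker(string response) =>
        response.Contains(Marker);

    // Compares the whole string; requires knowing the full expected text.
    public static bool EqualsWhole(string response, string expected) =>
        string.Equals(response, expected, StringComparison.Ordinal);
}
```

Since the word is the first one in the string, StartsWithMarker states the intent most directly, whichever check turns out to be marginally faster.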

Related

c# string.CompareOrdinal vs operator == [closed]

I want to compare two strings in a LINQ expression. Is there any advantage to using string.CompareOrdinal, or is it the same?
list.Where(str1 => string.CompareOrdinal(str1, str2) == 0);
list.Where(str1 => str1 == str2);
According to benchmarks done by someone else, string.CompareOrdinal can be slightly faster than == when doing a lot of comparisons:
Most of the board remained green up through 10,000 comparisons and didn’t register any time.
At the 100,000 and 1,000,000 marks, things started to get a bit more interesting in terms of time differences.
String.CompareOrdinal was the constant superstar. What surprised me is for the case-insensitive comparisons, String.CompareOrdinal outperformed most other methods by a whole decimal place.
For case sensitive comparisons, most programmers can probably stick with the “==” operator.
-- The Curious Consultant: Fastest Way to Compare Strings in C# .Net
Note, though, that we are talking about a total difference of 3 milliseconds for 100,000 case-sensitive string comparisons, and that no measurable differences have been observed for 10,000 and 1,000,000 comparisons.
Thus, it is very unlikely that this difference is relevant to your application (especially if you are using LINQ-to-objects), so the more readable == should be preferred.
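To make the equivalence concrete, the sketch below wraps the two filters from the question in helper methods. The class and method names are invented for the example. Both variants perform an ordinal, case-sensitive comparison, so they should select the same items; only readability and (at most) a tiny timing difference separate them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrdinalFilter
{
    // Filter using string.CompareOrdinal, as in the first line of the question.
    public static List<string> WithCompareOrdinal(IEnumerable<string> list, string str2) =>
        list.Where(str1 => string.CompareOrdinal(str1, str2) == 0).ToList();

    // Filter using the == operator, which is also an ordinal comparison.
    public static List<string> WithOperator(IEnumerable<string> list, string str2) =>
        list.Where(str1 => str1 == str2).ToList();
}
```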

C# Check if a string is a Sentence [closed]

Basically I want to check if a String is a Sentence ("Hello, I am Me!") or Symbol Spam ("HH,,,{''{"), without using the number of symbols as a factor as much as possible. Right now it just detects based on a counter of symbols, but when someone says something with lots of punctuation, they get kicked.
Help?
If the number of symbols in the text is not sufficient, and you don't want to use something too fancy (or bought), could I suggest implementing one or more of these further steps (in increasing order of difficulty):
1. Count all A-Za-z and space characters in the string and take the ratio of this count to the count of symbols. That way, if they write a sentence and then !!!!!!!!!!!!! at the end, it still doesn't snag, as the ratio is high enough.
2. If this still isn't discerning enough, add a further check for text that passes item 1: count the runs of consecutive A-Za-z characters in the string and work out the average length of these 'words'. If the average is too short, it is probably spam.
These can be done in RegEx reasonably easily. If you want more sophistication, then you have to use something written by someone else that has much more developed statistical methods (or start reading lexicographical university papers that are beyond me!)
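A rough sketch of the two checks with Regex might look like the following. The class name, the method names and both thresholds are assumptions picked for illustration and would need tuning against real chat messages. The ratio is expressed here as the share of letters and spaces among all characters, a small variation on "letters to symbols" that avoids dividing by zero when there are no symbols.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class SpamHeuristic
{
    // Illustrative thresholds only; not taken from the original answer.
    public const double MinLetterShare = 0.4;
    public const double MinAverageWordLength = 2.0;

    // Check 1: share of A-Za-z and space characters among all characters.
    public static double LetterShare(string text)
    {
        if (string.IsNullOrEmpty(text)) return 0.0;
        int lettersAndSpaces = Regex.Matches(text, "[A-Za-z ]").Count;
        return (double)lettersAndSpaces / text.Length;
    }

    // Check 2: average length of runs of consecutive A-Za-z characters.
    public static double AverageWordLength(string text)
    {
        MatchCollection words = Regex.Matches(text ?? string.Empty, "[A-Za-z]+");
        if (words.Count == 0) return 0.0;
        return words.Cast<Match>().Average(m => m.Length);
    }

    // Passes only if both checks pass.
    public static bool LooksLikeSentence(string text) =>
        LetterShare(text) >= MinLetterShare &&
        AverageWordLength(text) >= MinAverageWordLength;
}
```

With these thresholds the two strings from the question fall on opposite sides: "Hello, I am Me!" passes, and "HH,,,{''{" does not.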

How do I get two numbers between two words (C#) [closed]

I have a string "Building1Floor2" and it's always in that format. How do I cleanly get the building number (e.g. 1) and the floor number? I'm thinking I need a regex, but I'm not entirely sure that's the best way. I could just use the index if the format stays the same, but if I have a high floor number, e.g. 100, it will break.
P.S. I'm using C#.
Use a regex like this:
Building(\d+)Floor(\d+)
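In C# that pattern could be applied roughly as follows. The class and method names are invented for the example, and the ^ and $ anchors are an addition to the pattern above so that strings with extra text around them are rejected.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BuildingParser
{
    // Same pattern as above, anchored to the whole string (an assumption).
    private static readonly Regex Pattern =
        new Regex(@"^Building(\d+)Floor(\d+)$");

    // Returns true and fills both numbers when the input matches the format.
    public static bool TryParse(string input, out int building, out int floor)
    {
        building = 0;
        floor = 0;
        Match match = Pattern.Match(input);
        if (!match.Success) return false;

        // Group 1 is the building number, group 2 the floor number.
        return int.TryParse(match.Groups[1].Value, out building)
            && int.TryParse(match.Groups[2].Value, out floor);
    }
}
```

Because \d+ matches any number of digits, a floor such as 100 is handled without any index arithmetic.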
Regex would be an ok option here if "Building" and "Floor" could change. e.g.: "Floor1Room23"
You could use "[A-Za-z]+([0-9]{1,})[A-Za-z]+([0-9]{1,})"
With those groupings, $1 would now be the Building number, and $2 would be Floor.
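If "Building" and "Floor" never change, however, then regex might be overkill. You could split the string yourself:
find the index of the "F" and take substrings on either side of it.
int first = str.IndexOf('F');  // start of "Floor"
string building = str.Substring("Building".Length, first - "Building".Length);
string floor = str.Substring(first + "Floor".Length);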

Porter stemmer algorithm in information-retrieval [closed]

I need to create a simple search engine for my application. Let's simplify it to the following: we have some texts (a lot of them) and I need to search them and show relevant results.
I've based it on this great article, extended a few things, and it works pretty well for me.
But I have a problem with stemming words to terms. For example, the words "annotation", "annotations", etc. will be stemmed to "annot", but when you try to search for something you will see unexpected results:
"anno" - nothing
"annota" - nothing
etc.
Only the word "annot" gives a relevant result. So how should I improve my search to give the expected results? After all, "annot" contains "anno", and "annota" is only slightly longer than "annot". Using contains all the time obviously isn't the solution.
In the first case I could use something like a ternary search tree, but in the second case I don't know what to do.
Any ideas would be very helpful.
UPDATE
oleksii has pointed me to n-grams here, which may work for me, but I don't know how to index n-grams properly.
So the questions are:
Which data structure would be best for my needs?
How do I index my n-grams properly?
Stemming perhaps isn't very relevant here. Stemming will convert a plural to a singular form.
Given you have a tokeniser, a stemmer and a cleaner (to remove stop words, perhaps punctuation and numbers, short words, etc.), what you are looking at is a full-text search. I would advise you to get an off-the-shelf solution (like Elasticsearch, Lucene or Solr), but if you fancy a DIY approach I can suggest the following naive implementation.
Step 1
Create a search-orientated tokeniser. One example would be an n-gram tokeniser. It will take your word and split it into the following sequences:
annotation
1 - [a, n, n, o, ...]
2 - [an, nn, no, ot, ...]
3 - [ann, nno, not, ota, ...]
4 - [anno, nnot, nota, otat, ...]
....
Step 2
Sort n-grams for more efficient look-up
Step 3
Search n-grams for exact match using binary search
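The three steps can be sketched as below. This is a deliberately naive illustration: the class and method names are made up, the index holds only the n-grams themselves, and a real index would also map each n-gram back to the documents it came from.

```csharp
using System;
using System.Collections.Generic;

public static class NGramIndex
{
    // Step 1: every contiguous substring of the word, from length 1 upwards.
    public static IEnumerable<string> NGrams(string word)
    {
        for (int n = 1; n <= word.Length; n++)
            for (int start = 0; start + n <= word.Length; start++)
                yield return word.Substring(start, n);
    }

    // Step 2: collect the distinct n-grams of all words and sort them.
    public static List<string> Build(IEnumerable<string> words)
    {
        var unique = new HashSet<string>(StringComparer.Ordinal);
        foreach (string word in words)
            foreach (string gram in NGrams(word))
                unique.Add(gram);

        var sorted = new List<string>(unique);
        sorted.Sort(StringComparer.Ordinal);
        return sorted;
    }

    // Step 3: exact-match lookup with binary search on the sorted list.
    public static bool Contains(List<string> sortedGrams, string query) =>
        sortedGrams.BinarySearch(query, StringComparer.Ordinal) >= 0;
}
```

Indexing "annotation" this way makes both "anno" and "annota" findable, at the cost of storing every substring of every indexed word.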

String Concatenation in Front End Code (.ascx or .aspx)? [closed]

Is there any reason to use one of these techniques over the other?
There are several strings that get created in the code behind:
protected string String1;
protected string String2;
protected string String3;
protected string String4;
They are used in the front end code and can be printed to the screen using:
<%#String1%><%#String2%><%#String3%><%#String4%>
Alternatively these can be printed using:
<%#String1 + String2 + String3 + String4%>
The second technique seems a little easier to read. The thought popped into my head that it may be slightly less efficient, depending on how the <%# %> is evaluated compared to the +.
Is there a difference in efficiency that makes one way better than the other?
Well, in the second case that code is transformed into a single call to string.Concat, which is about as efficient a method as you can get for concatenating four C# strings.
I'm not positive how ASP goes about taking each component of the markup and building a single string out of the content, but I would be shocked to find out that it used a silly string concatenation method that ends up building an intermediate string and copying over the entire page's HTML every time a new component is added. I think it's a pretty safe bet to assume that some reasonably sensible method is used, most likely StringBuilder, writing the content to a stream, or some other comparable method of efficiently appending a series of strings.
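As a stand-in for the two styles, the sketch below writes the strings to a StringWriter. This is not the actual ASP.NET rendering code, which is not shown here; it only assumes that page output is appended to some writer, and the class and method names are invented. The first method mirrors four separate expressions, the second a single concatenated expression.

```csharp
using System;
using System.IO;

public static class RenderSketch
{
    // Four separate appends, one per expression.
    public static string SeparateWrites(string s1, string s2, string s3, string s4)
    {
        using (var writer = new StringWriter())
        {
            writer.Write(s1);
            writer.Write(s2);
            writer.Write(s3);
            writer.Write(s4);
            return writer.ToString();
        }
    }

    // One append of a string built with +, which the C# compiler
    // turns into a single string.Concat call.
    public static string SingleConcat(string s1, string s2, string s3, string s4)
    {
        using (var writer = new StringWriter())
        {
            writer.Write(s1 + s2 + s3 + s4);
            return writer.ToString();
        }
    }
}
```

Under that assumption both produce the same output; the second allocates one extra intermediate string, which for four short strings is unlikely to be measurable.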
The ASP.NET engine will not concatenate the following strings with each other; it just renders four times and places each result in the HTML:
<%#String1%><%#String2%><%#String3%><%#String4%>
The expression below, by contrast, is concatenated on the server side and rendered only once. It is definitely easier to read and understand.
<%#String1 + String2 + String3 + String4%>
