Searching a string array without built in methods - c#

If anyone can help out there, I'll be tremendously grateful.
Essentially I am working on a homework project where, for part of it, I need to search an array. The array is currently of type string, but it is essentially a collection of dates (in the format 05/06/2014).
I am just about at my wits' end trying to find a way to let the user search this array, in particular one that doesn't use built-in methods like Array.BinarySearch.
I tried to implement a binary search, but that didn't seem to work; I can provide code if you wish to see where I'm probably going wrong. Is there perhaps a better search I should use for this string type, or should I be converting the string array into a different type?
If anyone can help I would greatly appreciate it. I'm not necessarily asking for anyone to do my work for me; I'd just be thrilled if someone could bump me in the right direction, as this problem has been doing my nut in. Thanks!
Current Binary Search Code:
public static void BinarySearch(string[] dateArray, string searchTerm)
{
    int first = 0;
    int last = dateArray.Length - 1;
    int position = -1;
    bool found = false;
    int compCount = 0;
    while (found != true && first <= last)
    {
        int middle = (first + last) / 2;
        int comparisonSTR = string.Compare(dateArray[middle], searchTerm);
        if (dateArray[middle] == searchTerm)
        {
            found = true;
            position = middle;
            compCount++;
            Console.WriteLine("Your search has been found after " + compCount + "comparisons.");
        }
        else if (comparisonSTR > 0)
        {
            last = middle;
            compCount++;
        }
        else
        {
            first = middle;
            compCount++;
        }
    }
}

For an educational response, your binary search is correct*, if not very clean. Like @Alex said, you only have to make sure you're comparing them as DateTimes. The problem is with the line
int comparisonSTR = string.Compare(dateArray[middle], searchTerm);
because the string class doesn't know what a date is, so it can't give you a date comparison when you're searching for dates. It can only tell you whether one term sorts alphabetically before, equal to, or after another term.
Instead, if you convert them to DateTimes and use the comparer specific to DateTimes, you get back a comparison that you can use for binary search. Keep the arguments in the same order as in your string.Compare call so that your existing branches still apply. You can either convert them to DateTime in-line
int comparisonSTR = DateTime.Compare(Convert.ToDateTime(dateArray[middle]), Convert.ToDateTime(searchTerm));
or convert them outside of the loop, as the first thing you do in your method, to make it a little easier to read
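DateTime[] dates = Array.ConvertAll(dateArray, Convert.ToDateTime);
DateTime searchDate = Convert.ToDateTime(searchTerm);
while (found != true && first <= last)
{
    int middle = (first + last) / 2;
    // Same argument order as the original string.Compare call,
    // so a positive result still means dates[middle] is later than searchDate.
    int comparison = DateTime.Compare(dates[middle], searchDate);
    // ... rest of the loop as before, testing comparison instead of comparisonSTR
}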
Other than that, you're pretty much set. You might have already solved this by now, so I'm posting this partly to explain why string.Compare didn't work for comparing dates in this case. Note that Convert.ToDateTime parses with the current culture, so 05/06/2014 may be read as 5 June or 6 May depending on the machine's settings.
Edit: Make sure to test your edge cases (e.g. searching not only for the middle element, but also for the first and last elements, with varying array sizes), because on second review I suspect that your binary search may not be entirely correct.
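One likely reason for the suspicion in the edit above: `last = middle` and `first = middle` never exclude the element that was just examined, so with two elements left the loop can keep testing the same index forever. Below is a minimal sketch of a version that moves past the examined element. It assumes the array is sorted chronologically and that the dates are day-first (`dd/MM/yyyy`); the class name `DateSearch` and the choice to return the index (or -1) are illustrative, not part of the original code.

```csharp
using System;
using System.Globalization;

public static class DateSearch
{
    // Returns the index of searchTerm in dateArray, or -1 if it is absent.
    // Assumption: dateArray is sorted chronologically and uses dd/MM/yyyy.
    public static int BinarySearch(string[] dateArray, string searchTerm)
    {
        const string format = "dd/MM/yyyy"; // assumed format; adjust if month comes first
        DateTime target = DateTime.ParseExact(searchTerm, format, CultureInfo.InvariantCulture);

        int first = 0;
        int last = dateArray.Length - 1;
        while (first <= last)
        {
            int middle = first + (last - first) / 2;
            DateTime current = DateTime.ParseExact(dateArray[middle], format, CultureInfo.InvariantCulture);
            int comparison = DateTime.Compare(current, target);

            if (comparison == 0)
            {
                return middle;
            }
            if (comparison > 0)
            {
                last = middle - 1;  // target is earlier: drop middle and everything after it
            }
            else
            {
                first = middle + 1; // target is later: drop middle and everything before it
            }
        }
        return -1;
    }
}
```

Parsing inside the loop keeps the sketch short; converting the whole array once up front, as suggested above, avoids re-parsing if you search repeatedly.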

Related

String search in C# somewhat similar to LIKE operator in say VB

I am aware this question has been asked, and I am not really looking for a function to do so. I was hoping to get some tips on making a little method I wrote better. Basically, it takes a long string and searches for a smaller string inside of it. I am aware that there are always a million ways to do things better, and that is what brought me here.
Please take a look at the code snippet and let me know what you think. No, it's not very complex, and yes, it does work for my needs, but I am more interested in learning where the pain points would be: cases where I would assume it works but it does not, for such and such reason. I hope that makes sense. To give this question a way to be answered on SO: is this a strong way to perform this task? (I somewhat know the answer :) )
I'm very interested in constructive criticism, not just "that's bad". I implore you to elaborate on such a thought so I can get the most out of the responses.
public static Boolean FindTextInString(string strTextToSearch, string strTextToLookFor)
{
    //put the string to search into lower case
    string strTextToSearchLower = strTextToSearch.ToLower();
    //put the text to look for to lower case
    string strTextToLookForLower = strTextToLookFor.ToLower();
    //get the length of both of the strings
    int intTextToLookForLength = strTextToLookForLower.Length;
    int intTextToSearch = strTextToSearchLower.Length;
    //loop through the division amount so we can check each part of the search text
    for(int i = 0; i < intTextToSearch; i++)
    {
        //substring at multiple positions and see if it can be found
        if (strTextToSearchLower.Substring(i,intTextToLookForLength) == strTextToLookForLower)
        {
            //return true if we found a matching string within the search in text
            return true;
        }
    }
    //otherwise we will return false
    return false;
}
If you only care about finding a substring inside a string, just use String.Contains()
Example:
string string_to_search = "the cat jumped onto the table";
string string_to_find = "jumped onto";
return string_to_search.ToLower().Contains(string_to_find.ToLower());
You can reuse VB's Like operator this way:
1) Make a reference to Microsoft.VisualBasic.dll library.
2) Use the following code.
using Microsoft.VisualBasic;
using Microsoft.VisualBasic.CompilerServices;
if (LikeOperator.LikeString(Source: "11", Pattern: "11*", CompareOption: CompareMethod.Text))
{
    // Your code here...
}
To implement your function in a case-insensitive way, it may be more appropriate to use IndexOf instead of the combination of two ToLower() calls with Contains. This is both because ToLower() will generate a new string, and because of the Turkish İ Problem.
Something like the following should do the trick, where it returns False if either term is null, otherwise uses a case-insensitive IndexOf call to determine if the search term exists in the source string:
public static bool SourceContainsSearch(string source, string search)
{
    return search != null &&
        source?.IndexOf(search, StringComparison.OrdinalIgnoreCase) > -1;
}
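On the "pain points" the question asks about: the loop in FindTextInString lets `i` run up to the last character, so `Substring(i, length)` throws an ArgumentOutOfRangeException as soon as `i + length` passes the end of the text, which happens whenever no match is found earlier. A hedged sketch of the same loop with the bound fixed and without the per-position Substring allocations follows; the class name `TextSearch` and the null handling are assumptions made for illustration.

```csharp
using System;

public static class TextSearch
{
    // Case-insensitive "does textToSearch contain textToLookFor" check,
    // written as an explicit loop like the question's method.
    public static bool FindTextInString(string textToSearch, string textToLookFor)
    {
        if (textToSearch == null || textToLookFor == null)
        {
            return false; // assumed behaviour for null input
        }

        // Last index at which the search term still fits inside the text.
        int lastStart = textToSearch.Length - textToLookFor.Length;
        for (int i = 0; i <= lastStart; i++)
        {
            // Compare the region in place instead of allocating a substring.
            if (string.Compare(textToSearch, i, textToLookFor, 0,
                               textToLookFor.Length,
                               StringComparison.OrdinalIgnoreCase) == 0)
            {
                return true;
            }
        }
        return false;
    }
}
```

In practice the IndexOf call with StringComparison.OrdinalIgnoreCase shown above does the same job in one line; the explicit loop is only useful for seeing where the original goes out of range.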

How to convert String to One Int

I have the following problem in C# (working in VS Community 2015).
First off, I am fairly new to C#, so excuse me if this question turns out to have an easy fix.
I have a contact sensor giving me a string of numbers (a length measurement). I read them with the SerialPort methods and cut them down to the numbers that I need with Substring (the beginning of the string, the "SR00002", is useless to me).
In the end I end up with a string like "000.3422" or "012.2345". Now I want to convert that string to one solid int variable that I can work with, meaning subtract values from it and such.
Example: I want to calculate 012.234 - 000.3422 (or with , instead of . but I could change that beforehand).
I already tried Parse and Convert.ToInt32 (while iterating through the string), but the end result is always a string.
string b = serialPort2.ReadLine();
string[] b1 = Regex.Split(b, "SR,00,002,");
string b2 = b1[1].Substring(1);
foreach (char c in b2)
{
    Convert.ToInt32(c);
}
textBox2.Text = b2 + b2.GetType();
I know that once b2 is an int it cannot be printed in the TextBox directly, but I'll take care of that later.
When everything is converted properly, I'll move the conversion into its own method =)
The GetType call is just for testing and, as said, shows only System.String (which I don't want). Help would be much appreciated. I also used the search function and Google but couldn't find anything helpful. I wish any possible helpers a nice day. Regards, Chris.
Use int.Parse:
int.Parse("123")
You need to assign the converted values to a new variable or array that takes int or other numeric values that you want.
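int[] numbers = new int[b2.Length];
for (int i = 0; i < b2.Length; i++)
{
    // Convert.ToInt32(char) returns the character code (e.g. 51 for '3'),
    // so read the digit's numeric value instead; non-digits such as '.' give -1.
    numbers[i] = (int)char.GetNumericValue(b2[i]);
}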
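Since readings like "012.2345" have a fractional part, an int cannot hold them without losing the decimals; for the subtraction described in the question, parsing the whole string into a decimal is probably closer to what is wanted. A minimal sketch, assuming the reading has already been cut down to just the number and that either `.` or `,` may appear as the decimal separator (the class and method names are hypothetical):

```csharp
using System;
using System.Globalization;

public static class SensorValue
{
    // Parses a reading such as "012.2345" or "012,2345" into a decimal.
    public static decimal Parse(string reading)
    {
        // Normalise a decimal comma to a point so one culture-independent parse works.
        string normalized = reading.Trim().Replace(',', '.');
        return decimal.Parse(
            normalized,
            NumberStyles.AllowDecimalPoint | NumberStyles.AllowLeadingSign,
            CultureInfo.InvariantCulture);
    }
}
```

With this, `SensorValue.Parse("012.2345") - SensorValue.Parse("000.3422")` is an ordinary decimal subtraction. The original string stays a string, which is why `b2.GetType()` kept reporting System.String: the converted value has to be stored in a new variable.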

C# - efficiently check if string contains string at specific position (something like regionMatches)

For example, I might have the string "Hello world!", and I want to check if a substring starting at position 6 (0-based) is "world" - in this case true.
Something like "Hello world!".Substring(6).StartsWith("world", StringComparison.Ordinal) would do it, but it involves a heap allocation which ought to be unnecessary for something like this.
(In my case, I don't want a bounds error if the string starting at position 6 is too short for the comparison - I just want false. However, that's easy to code around, so solutions that would give a bounds error are also welcome.)
In Java, 'regionMatches' can be used to achieve this effect (with the bounds error), but I can't find an equivalent in C#.
Just to pre-empt - obviously Contains and IndexOf are bad solutions because they do an unnecessary search. (You know someone will post this!)
If all else fails, it's quick to code my own function for this - mainly I'm wondering if there is a built-in one that I've missed.
"obviously Contains and IndexOf are bad solutions because they do an unnecessary search"
Actually, that's not true: there is an overload of IndexOf that keeps you in control of how far it should go in search of the match. If you tell it to stay at one specific index, it would do exactly what you want to achieve.
Here is the three-argument overload of IndexOf that you could use. Passing the length of the target for the count parameter would prevent IndexOf from considering any other positions:
var big = "Hello world!";
var small = "world";
if (big.IndexOf(small, 6, small.Length) == 6) {
...
}
Or manually
int i = 0;
if (str.Length >= 6 + toFind.Length) {
    for (i = 0; i < toFind.Length; i++)
        if (str[i + 6] != toFind[i])
            break;
}
bool ok = i == toFind.Length;
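Another option that may come closest to Java's regionMatches is the five-argument `string.CompareOrdinal` overload, which compares a region of one string against a region of another without allocating. The sketch below wraps it with the bounds check the question asks for (returning false rather than throwing); the `RegionMatch.MatchesAt` name and the null handling are assumptions.

```csharp
using System;

public static class RegionMatch
{
    // True if `value` occurs in `text` starting exactly at `index`.
    // Returns false (instead of throwing) when the region would run past the end.
    public static bool MatchesAt(string text, int index, string value)
    {
        if (text == null || value == null)
        {
            return false; // assumed behaviour for null input
        }
        if (index < 0 || index > text.Length - value.Length)
        {
            return false; // value cannot fit at this position
        }
        // Ordinal, allocation-free comparison of the two regions.
        return string.CompareOrdinal(text, index, value, 0, value.Length) == 0;
    }
}
```

For example, `RegionMatch.MatchesAt("Hello world!", 6, "world")` compares only the five characters starting at index 6.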
here you are
static void Main(string[] args)
{
    string word = "Hello my friend how are you ?";
    if (word.Substring(0).Contains("Hello"))
    {
        Console.WriteLine("Match !");
    }
}

is String.Contains() faster than walking through whole array of char in string?

I have a function that is walking through the string looking for pattern and changing parts of it. I could optimize it by inserting
if (!text.Contains(pattern)) return;
But I am actually walking through the whole string and comparing parts of it with the pattern, so the question is: how does String.Contains() actually work? I know there was such a question (How does String.Contains work?), but the answer is rather unclear. If String.Contains() also walks through the whole array of chars and compares them to the pattern I am looking for, it wouldn't make my function faster, but slower.
So, is it a good idea to attempt such an optimization? And is it possible for String.Contains() to be even faster than a function that just walks through the whole array and compares every single character with some constant one?
Here is the code:
public static char colorchar = (char)3;
public static Client.RichTBox.ContentText color(string text, Client.RichTBox SBAB)
{
    if (text.Contains(colorchar.ToString()))
    {
        int color = 0;
        bool closed = false;
        int position = 0;
        while (text.Length > position)
        {
            if (text[position] == colorchar)
            {
                if (closed)
                {
                    text = text.Substring(position, text.Length - position);
                    Client.RichTBox.ContentText Link = new Client.RichTBox.ContentText(ProtocolIrc.decode_text(text), SBAB, Configuration.CurrentSkin.mrcl[color]);
                    return Link;
                }
                if (!closed)
                {
                    if (!int.TryParse(text[position + 1].ToString() + text[position + 2].ToString(), out color))
                    {
                        if (!int.TryParse(text[position + 1].ToString(), out color))
                        {
                            color = 0;
                        }
                    }
                    if (color > 9)
                    {
                        text = text.Remove(position, 3);
                    }
                    else
                    {
                        text = text.Remove(position, 2);
                    }
                    closed = true;
                    if (color < 16)
                    {
                        text = text.Substring(position);
                        break;
                    }
                }
            }
            position++;
        }
    }
    return null;
}
Short answer is that your optimization is no optimization at all.
Basically, String.Contains(...) just returns String.IndexOf(..) >= 0
You could improve your algorithm to:
int position = text.IndexOf(colorchar.ToString()...);
if (-1 < position)
{ /* Do it */ }
Yes.
And doesn't have a bug (ahhm...).
There are better ways of looking for multiple substrings in very long texts, but for most common usages String.Contains (or IndexOf) is the best.
Also IIRC the source of String.Contains is available in the .Net shared sources
Oh, and if you want a performance comparison you can just measure for your exact use-case
Check this similar post How does string.contains work
I think that you will not be able to simply do anything faster than String.Contains, unless you want to use standard CRT function wcsstr, available in msvcrt.dll, which is not so easy
Unless you have profiled your application and determined that the line with String.Contains is a bottle-neck, you should not do any such premature optimizations. It is way more important to keep your code's intention clear.
And while there are many ways to implement the methods in the .NET base classes, you should assume the default implementations are optimal enough for most people's use cases. For example, any (future) implementation of .NET might use the x86-specific instructions for string comparisons. That would then always be faster than what you can do in C#.
If you really want to be sure whether your custom string comparison code is faster than String.Contains, you need to measure them both using many iterations, each with a different string, for example by using the Stopwatch class to measure the time.
If you know details that you can exploit for optimization (beyond a simple contains check), then sure, you can make your method faster than string.Contains; otherwise, no.
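The measuring approach mentioned above could be sketched roughly as follows. The `ContainsBenchmark` class, the `ManualContains` baseline, and the iteration count are all hypothetical; a real comparison should use inputs that resemble your actual data, vary the strings between iterations, and be run in a Release build.

```csharp
using System;
using System.Diagnostics;

public static class ContainsBenchmark
{
    // Hypothetical hand-written scan used as the baseline to compare against.
    public static bool ManualContains(string text, char target)
    {
        for (int i = 0; i < text.Length; i++)
        {
            if (text[i] == target)
            {
                return true;
            }
        }
        return false;
    }

    // Runs `action` the given number of times and returns the elapsed milliseconds.
    public static long Measure(Func<bool> action, int iterations)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }
}
```

A comparison would then call `Measure(() => text.Contains(pattern), n)` and `Measure(() => ContainsBenchmark.ManualContains(text, colorchar), n)` with the same `text` and `n` and compare the two numbers; the absolute values depend entirely on the machine and runtime.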

String cannot contain any part of another string .NET 2.0

I'm looking for a simple way to discern if a string contains any part of another string (be that regex, built in function I don't know about, etc...). For Example:
string a = "unicorn";
string b = "cornholio";
string c = "ornament";
string d = "elephant";
if (a <comparison> b)
{
// match found ("corn" from 'unicorn' matched "corn" from 'cornholio')
}
if (a <comparison> c)
{
// match found ("orn" from 'unicorn' matched "orn" from 'ornament')
}
if (a <comparison> d)
{
// this will not match
}
something like if (a.ContainsAnyPartOf(b)) would be too much to hope for.
Also, I only have access to .NET 2.0.
Thanks in advance!
This method should work. You'll want to specify a minimum length for the "part" that might match. I'd assume you'd want to look for something of at least 2, but with this you can set it as high or low as you want. Note: error checking not included.
public static bool ContainsPartOf(string s1, string s2, int minsize)
{
    for (int i = 0; i <= s2.Length - minsize; i++)
    {
        if (s1.Contains(s2.Substring(i, minsize)))
            return true;
    }
    return false;
}
I think you're looking for this implementation of longest common substring?
Your best bet, according to my understanding of the question, is to compute the Levenshtein (or related values) distance and compare that against a threshold.
Your requirements are a little vague.
You need to define a minimum length for the match...but implementing an algorithm shouldn't be too difficult when you figure that part out.
I'd suggest breaking down the string into character arrays and then using tail recursion to find matches for the parts.
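For the longest-common-substring suggestion, a minimal dynamic-programming sketch is shown below. It avoids LINQ and `var`, so it should also compile against .NET 2.0; the class and method names and the `minSize` threshold parameter are illustrative assumptions, and null checks are left out.

```csharp
using System;

public static class CommonSubstring
{
    // Length of the longest run of consecutive characters that appears in both strings.
    public static int LongestCommonLength(string a, string b)
    {
        int best = 0;
        // previous[j] = length of the common run ending at a[i-2] and b[j-1].
        int[] previous = new int[b.Length + 1];
        for (int i = 1; i <= a.Length; i++)
        {
            int[] current = new int[b.Length + 1];
            for (int j = 1; j <= b.Length; j++)
            {
                if (a[i - 1] == b[j - 1])
                {
                    // Extend the run that ended one character earlier in both strings.
                    current[j] = previous[j - 1] + 1;
                    if (current[j] > best)
                    {
                        best = current[j];
                    }
                }
            }
            previous = current;
        }
        return best;
    }

    // True if the two strings share a common part of at least minSize characters.
    public static bool SharesPart(string a, string b, int minSize)
    {
        return LongestCommonLength(a, b) >= minSize;
    }
}
```

With the strings from the question and a minimum size of 2, "unicorn" shares "corn" with "cornholio" and "orn" with "ornament", while it shares only the single letter "n" with "elephant".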
