BinarySearch limits for numbers within range

BinarySearch limits for numbers within range - c#

This is my code:
SortedDictionary<int,int> Numbers = new SortedDictionary<int,int>();
List<int> onlyP = new List<int>(Numbers.Keys);
int Inferior = int.Parse(toks[0]);
int Superior = int.Parse(toks[1]);
int count = 0;
int inferiorindex = Array.BinarySearch(Numbers.Keys.ToArray(), Inferior);
if (inferiorindex < 0) inferiorindex = (inferiorindex * -1) - 1;
int superiorindex = Array.BinarySearch(Numbers.Keys.ToArray(), Superior);
if (superiorindex < 0) superiorindex = (superiorindex * -1) - 1;
count = Numbers[onlyP[superiorindex]] - Numbers[onlyP[inferiorindex]];
So what I'm trying to do is this: I've got a sorted dictionary with powers as keys, and a normal iteration as values. I've to print how many numbers of the keys fit within a specified range.
Example:
Some entries of the dict: [1,1],[4,2],[8,3],[9,4],[16,5],[25,6],[27,7],[32,8]
Limits: 2 and 10
Numbers within 2 - 10 : 4, 8, 9 = 3 numbers.
With BinarySearch I'm trying to quickly find the numbers I want and then substract Potencias[onlyP[superiorindex]] - Potencias[onlyP[inferiorindex]] to find how many numbers are within the range. Unfortunately it's not working for all the cases, and it sometimes gives less numbers than the actual amount. How can this be fixed? Thanks in advance.
[EDIT] Examples of the problems: If I select limits: 4 and 4... it returns 0, but the answer is 1.
limits: 1 and 10^9 (the whole range) returns 32669... But the answer is 32670.
The algorithm is ignoring powers.

Finally, having read the documentation. Notice the -1 on the upperIndex conversion and the +1 on the return value, these are important.
var numbers = new[] { 1, 4, 8, 9, 16, 25, 27, 32 };
var lowerBound = 4;
var upperBound = 17;
int lowerIndex = Array.BinarySearch(numbers, lowerBound);
if (lowerIndex < 0) lowerIndex = ~lowerIndex;
// - 1 here because we want the index of the item that is <= upper bound.
int upperIndex = Array.BinarySearch(numbers, upperBound);
if (upperIndex < 0) upperIndex = ~upperIndex - 1;
return (upperIndex - lowerIndex) + 1;
Explanation:
For the lower index we just take the complement because the BinarySearch returns the index of the first item >= lowerBound.
For the upper index we additionally minus one from the complement because we want the first item <= upperBound (not >= upperBound which is what BinarySearch returns).

It seems that you're not doing it the wright way for post processing the binary search return value :
http://msdn.microsoft.com/en-us/library/5kwds4b1.aspx
Should be :
if (inferiorindex < 0) inferiorindex = ~inferiorindex;
(untested)
Moreover, List supports a binary search, so you don't have to do the Array.BinarySearch thing, just work on onlyP.

int inferiorindex = Array.BinarySearch<int>(keys, Inferior);
if (inferiorindex < 0) {
inferiorindex = ~inferiorindex;
}
int superiorindex = Array.BinarySearch<int>(keys, Superior);
if (superiorindex < 0) {
// superiorindex is the binary complement of the next higher.
// -1 because we want the highest.
superiorindex = ~superiorindex - 1;
}
int count = superiorindex - inferiorindex + 1;

Related

Split int into chunks with a max value

I've been trying to do the following, I have a max value and a int, I want to split that int like this:
Max = 10
Int = 45
Result = [10, 10, 10, 10, 5]
I already search a lot and I didn't find nothing like the thing I want to do, and my head hurts for thinking and trying to do it.
Thank you for any help!

You just need to repeat the max the number of times it divides into your value. Then if it does not divide evenly add the remainder.
int value = 45;
int max = 10;
var results = Enumerable.Repeat(max, value/max).ToList();
if(value % max != 0)
results.Add(value % max);
Console.WriteLine(string.Join(",", results));

I think LINQ is the nicest way to do this, but you could also have a straightforward loop approach that finds how many times the size fits into your max value, then add these numbers to a List<int>, and add the leftover(if any) to the list at the end.
var size = 10;
var max = 45;
// Find how many times the size fits and leftover
var goesInto = max / size;
var leftover = max % size;
var result = new List<int>();
// Add the sizes that fit in first
for (var i = 0; i < goesInto; i++)
{
result.Add(size);
}
// Add leftover size at the end.
if (leftover > 0)
{
result.Add(leftover);
}

That looks a lot like Pseudo code. We however will write C# code, as that was the langauge tag.
//Need a list, or have to calculate the expected lenght. List is easier.
List<int> ResultList = new List<int>();
//Make a copy to work with
int temp = value;
//Now let us math down towards 0
while(temp>0){
//All those multiples of Max are added first
if(temp >= Max){
ResultList.Add(Max);
temp -= Max;
}
//We are down to the rest, here
else{
//If the rest is not 0, you can add it too
if(temp > 0){
ResultList.Add(temp);
temp = 0;
}
}
}

Here how you do in plain code without LINQ, List, etc. This will also take care of negatives
int val = -45; // negative
int max = 10;
int count = Math.Abs(val / max);
int rem = Math.Abs(val % max);
var output = new int[count + (rem == 0 ? 0 : 1)];
for(int i = 0; i < output.Length ; i++)
{
if (i == output.Length - 1)
output[i] = rem;
else
output[i] = max;
Console.WriteLine(output[i]);
}
return output;
10
10
10
10
5

Algorithm to get which values make sum of a given number from array

I don't know to search or google it so I ask it here.
I have an array of integers with fixed size and exactly with this logic.
sample [1,2,4,8,16,32]
Now I am given a number for example 26. And I shall find the numbers whose sum will make this number, in this case is [2,8,16]
for a number of 20 it will be [4,16]
for 40 it is [8,32]
and for 63 it is all of these numbers [1,2,4,8,16,32]
What is the proper algorithm for that?
I know strictly that there is always this continuation that the number is double of the previous value.
as well as only the numbers from the given array will sum up to the given number and each number will be used only for once or none
If it will be in C# method that takes array of ints and an int value and returns the array of ints that contains the ints that sum up this number from the given array will be preferred.
Thank you

As you can see, the number are base-2, which means you can easily use shift.
You could try this:
private IEnumerable<int> FindBits(int value)
{
// check for bits.
for (int i = 0; i < 32; i++)
{
// shift 1 by i
var bitVal = 1 << i; // you could use (int)Math.Pow(2, i); instead
// check if the value contains that bit.
if ((value & bitVal) == bitVal)
// yep, it did.
yield return bitVal;
}
}
This method will check what bits are set and return them as an ienumerable. (which can be converted to an array of list)
Usage:
// find the bits.
var res = FindBits(40).ToArray();
// format it using the string.join
var str = $"[{string.Join(",", res)}]";
// present the results
Console.WriteLine(str);
Results in [8,32]
Extra info:
counter
00000001 = 1 = 1 << 0
00000010 = 2 = 1 << 1
00000100 = 4 = 1 << 2
00001000 = 8 = 1 << 3
00010000 = 16 = 1 << 4
00100000 = 32 = 1 << 5
01000000 = 64 = 1 << 6
10000000 = 128 = 1 << 7
Instead of writing all combinations you make a for loop which does the counter.
Some extra non-sense:
If you like lambda's, you could replace the FindBits with this:
private Func<int, IEnumerable<int>> FindBits = (int value) => Enumerable
.Range(0, 31)
.Select(i => 2 << i).Where(i => (value & i) == i);
But it's better to keep it simpel/readable.

First you should notice that
( 1 2 4 8 16 ... ) = (2^0 2^1 2^2 2^3 2^4 ... )
And that this is the same as finding a binary encoding for a decimal number. What you are looking for is an algorithm to transform a decimal or base 10 number to a binary or base 2 number.
The algorithm is pretty simple:
public List<int> dec_to_bin(int num)
{
List<int> return_list = new List<int>();
int index = 0;
int remainder = num;
int bit = 0;
while (remainder > 0)
{
bit = remainder % 2;
if (bit == 1 )
{
return_list.Add((int)Math.Pow(2, index));
}
remainder = remainder / 2;
index = index + 1;
}
return return_list;
}
There is a better way however that just uses the underlying encoding of the number which is already binary.
public List<int> dec_to_bin(int num)
{
List<int> return_list = new List<int>();
int value = 1;
while( value < num )
{
if( (value & num) == value )
{
return_list.Add(value);
}
value = value * 2;
}
return return_list;
}

Another way to state your requirement is "What are the unique powers of 2 that sum to a given integer?" Since computers work with powers of 2 natively, there are built-in goodies in most languages to do this very succinctly.
As a bonus, you can use existing .Net types and methods to eliminate the need to write your own loops.
Here's one approach:
IEnumerable<int> GetCompositePowersOf2(int input) =>
//convert to enumerable of bools, one for each bit in the
//input value (true=1, false=0)
new BitArray(new[] { input }).Cast<bool>()
// get power of 2 corresponding to the position in the enumerable
// for each true value, gets 0 for false values.
.Select((isOne, pos) => isOne ? (1 << pos) : 0)
//filter out the 0 values
.Where(pow => pow > 0);

I don't quite get the " takes array of ints " part, since this creation of sums only works with numbers that are the power of 2.
private int[] count (int num)
{
int factor = 0;
List<int> facts = new List<int>();
while (num > 0)
{
int counter = 0;
int div = num;
int remainder = 0;
while (remainder == 0)
{
remainder = div % 2;
div = div / 2;
counter++;
}
factor = 1;
for (int i = 1; i < counter; i++)
factor *= 2;
num = num - factor;
facts.Add(factor);
}
return (facts.ToArray());
}

Finding the closest integer value, rounded down, from a given array of integers

I am trying to figure out the best way to find the closest value, ROUNDED DOWN, in a List of integers using any n that is between two other numbers that are stored in a List. The all integers in this situation will ALWAYS be unsigned, in case that helps.
The assumptions are as follows:
The List always starts at 0
The List is always sorted ASC
All integers in the List are unsigned (no need for Math.Abs)
The number for comparison is always unsigned
For example:
List<int> numbers = new List<int>() { 0, 2000, 4000, 8000, 8500, 9101, 10010 };
int myNumber = 9000;
int theAnswer; // should be 8500
for (int i = 0; i < numbers.Count; i++) {
if (i == numbers.Count - 1) {
theAnswer = numbers[i];
break;
} else if (myNumber < numbers[i + 1]) {
theAnswer = numbers[i];
break;
}
}
The previous code example works without any flaws.
Is there a better more succint way to do it?

You can use List<T>.BinarySearch instead of enumerating elements of list in sequence.
List<int> numbers = new List<int>() { 0, 2000, 4000, 8000, 8500, 9101, 10010 };
int myNumber = 9000;
int r=numbers.BinarySearch(myNumber);
int theAnswer=numbers[r>=0?r:~r-1];

Filter list obtaining all values less than the myNumber and return last one:
theAnswer = numbers.Where(x => x <= myNumber ).Last();

A list can be indexed.
Start at the index in the middle of the list. If you found the exact number, you are good. If the number is less than the target number, search in the middle of the range from the start of the list to one less than the middle of the list. If the number is greater than the target number, work with the opposite half of the list. Continue this binary search until you find either an exact match, or the adjacent numbers that are smaller and larger than the target number.
Select the smaller of the two.

Please try this code:
List<int> numbers = new List<int>() { 0, 2000, 4000, 8000, 8500, 9101, 10010 };
int myNumber = 9000;
int theAnswer = numbers[numbers.Count - 1];
if (theAnswer > myNumber)
{
int l = 0, h = numbers.Count - 1, m;
do
{
m = (int)((double)(myNumber - numbers[l]) / (double)(numbers[h] - numbers[l]) * (h - l) + l);
if (numbers[m] > myNumber) h = m; else l = m;
}
while ((h - l) != 1);
theAnswer = numbers[l];
}

C# Random Number Generator -1 or 1 only?

How would I go about using a random number generator to generate either 1 or -1 but not 0. I have so far: xFacingDirection = randomise.Next(-1,2); but that runs the risk of generating the number 0. How would I go about making not accept 0 or keeping trying for a number until its not 0.
Many thanks

What about this:
var result = random.Next(0, 2) * 2 - 1;
To explain this, random.Next(0, 2) can only produce a value of 0 or 1. When you multiply that by 2, it gives either 0 or 2. And simply subtract 1 to get either -1 or 1.
In general, to randomly produce one of n integers, evenly distributed between a minimum value of x and a maximum value of y, inclusive, you can use this formula:
var result = random.Next(0, n) * (y - x) / (n - 1) + x;
For example substituting n=4, x=10, and y=1 will generate randum numbers in the set { 1, 4, 7, 10 }.

You could also use a ternary operator to determine which number to use.
int number = (random.Next(0, 2) == 1) ? 1 : -1);
It evaluates to
if (random.Next(0, 2) == 1)
number = 1;
else
number = -1;

Here is my implementation. Similar to the above but slightly different. Figure more answers could help you out even more.
public int result()
{
// randomly generate a number either 1 or -1
int i;
Random rando = new Random();
i = rando.Next(0, 2);
if (i == 0)
{
i = -1;
}
return i;
}

How to calculate distance similarity measure of given 2 strings?

I need to calculate the similarity between 2 strings. So what exactly do I mean? Let me explain with an example:
The real word: hospital
Mistaken word: haspita
Now my aim is to determine how many characters I need to modify the mistaken word to obtain the real word. In this example, I need to modify 2 letters. So what would be the percent? I take the length of the real word always. So it becomes 2 / 8 = 25% so these 2 given string DSM is 75%.
How can I achieve this with performance being a key consideration?

I just addressed this exact same issue a few weeks ago. Since someone is asking now, I'll share the code. In my exhaustive tests my code is about 10x faster than the C# example on Wikipedia even when no maximum distance is supplied. When a maximum distance is supplied, this performance gain increases to 30x - 100x +. Note a couple key points for performance:
If you need to compare the same words over and over, first convert the words to arrays of integers. The Damerau-Levenshtein algorithm includes many >, <, == comparisons, and ints compare much faster than chars.
It includes a short-circuiting mechanism to quit if the distance exceeds a provided maximum
Use a rotating set of three arrays rather than a massive matrix as in all the implementations I've see elsewhere
Make sure your arrays slice accross the shorter word width.
Code (it works the exact same if you replace int[] with String in the parameter declarations:
/// <summary>
/// Computes the Damerau-Levenshtein Distance between two strings, represented as arrays of
/// integers, where each integer represents the code point of a character in the source string.
/// Includes an optional threshhold which can be used to indicate the maximum allowable distance.
/// </summary>
/// <param name="source">An array of the code points of the first string</param>
/// <param name="target">An array of the code points of the second string</param>
/// <param name="threshold">Maximum allowable distance</param>
/// <returns>Int.MaxValue if threshhold exceeded; otherwise the Damerau-Leveshteim distance between the strings</returns>
public static int DamerauLevenshteinDistance(int[] source, int[] target, int threshold) {
int length1 = source.Length;
int length2 = target.Length;
// Return trivial case - difference in string lengths exceeds threshhold
if (Math.Abs(length1 - length2) > threshold) { return int.MaxValue; }
// Ensure arrays [i] / length1 use shorter length
if (length1 > length2) {
Swap(ref target, ref source);
Swap(ref length1, ref length2);
}
int maxi = length1;
int maxj = length2;
int[] dCurrent = new int[maxi + 1];
int[] dMinus1 = new int[maxi + 1];
int[] dMinus2 = new int[maxi + 1];
int[] dSwap;
for (int i = 0; i <= maxi; i++) { dCurrent[i] = i; }
int jm1 = 0, im1 = 0, im2 = -1;
for (int j = 1; j <= maxj; j++) {
// Rotate
dSwap = dMinus2;
dMinus2 = dMinus1;
dMinus1 = dCurrent;
dCurrent = dSwap;
// Initialize
int minDistance = int.MaxValue;
dCurrent[0] = j;
im1 = 0;
im2 = -1;
for (int i = 1; i <= maxi; i++) {
int cost = source[im1] == target[jm1] ? 0 : 1;
int del = dCurrent[im1] + 1;
int ins = dMinus1[i] + 1;
int sub = dMinus1[im1] + cost;
//Fastest execution for min value of 3 integers
int min = (del > ins) ? (ins > sub ? sub : ins) : (del > sub ? sub : del);
if (i > 1 && j > 1 && source[im2] == target[jm1] && source[im1] == target[j - 2])
min = Math.Min(min, dMinus2[im2] + cost);
dCurrent[i] = min;
if (min < minDistance) { minDistance = min; }
im1++;
im2++;
}
jm1++;
if (minDistance > threshold) { return int.MaxValue; }
}
int result = dCurrent[maxi];
return (result > threshold) ? int.MaxValue : result;
}
Where Swap is:
static void Swap<T>(ref T arg1,ref T arg2) {
T temp = arg1;
arg1 = arg2;
arg2 = temp;
}

What you are looking for is called edit distance or Levenshtein distance. The wikipedia article explains how it is calculated, and has a nice piece of pseudocode at the bottom to help you code this algorithm in C# very easily.
Here's an implementation from the first site linked below:
private static int CalcLevenshteinDistance(string a, string b)
{
if (String.IsNullOrEmpty(a) && String.IsNullOrEmpty(b)) {
return 0;
}
if (String.IsNullOrEmpty(a)) {
return b.Length;
}
if (String.IsNullOrEmpty(b)) {
return a.Length;
}
int lengthA = a.Length;
int lengthB = b.Length;
var distances = new int[lengthA + 1, lengthB + 1];
for (int i = 0; i <= lengthA; distances[i, 0] = i++);
for (int j = 0; j <= lengthB; distances[0, j] = j++);
for (int i = 1; i <= lengthA; i++)
for (int j = 1; j <= lengthB; j++)
{
int cost = b[j - 1] == a[i - 1] ? 0 : 1;
distances[i, j] = Math.Min
(
Math.Min(distances[i - 1, j] + 1, distances[i, j - 1] + 1),
distances[i - 1, j - 1] + cost
);
}
return distances[lengthA, lengthB];
}

There is a big number of string similarity distance algorithms that can be used. Some listed here (but not exhaustively listed are):
Levenstein
Needleman Wunch
Smith Waterman
Smith Waterman Gotoh
Jaro, Jaro Winkler
Jaccard Similarity
Euclidean Distance
Dice Similarity
Cosine Similarity
Monge Elkan
A library that contains implementation to all of these is called SimMetrics
which has both java and c# implementations.

I have found that Levenshtein and Jaro Winkler are great for small differences betwen strings such as:
Spelling mistakes; or
ö instead of o in a persons name.
However when comparing something like article titles where significant chunks of the text would be the same but with "noise" around the edges, Smith-Waterman-Gotoh has been fantastic:
compare these 2 titles (that are the same but worded differently from different sources):
An endonuclease from Escherichia coli that introduces single polynucleotide chain scissions in ultraviolet-irradiated DNA
Endonuclease III: An Endonuclease from Escherichia coli That Introduces Single Polynucleotide Chain Scissions in Ultraviolet-Irradiated DNA
This site that provides algorithm comparison of the strings shows:
Levenshtein: 81
Smith-Waterman Gotoh 94
Jaro Winkler 78
Jaro Winkler and Levenshtein are not as competent as Smith Waterman Gotoh in detecting the similarity. If we compare two titles that are not the same article, but have some matching text:
Fat metabolism in higher plants. The function of acyl thioesterases in the metabolism of acyl-coenzymes A and acyl-acyl carrier proteins
Fat metabolism in higher plants. The determination of acyl-acyl carrier protein and acyl coenzyme A in a complex lipid mixture
Jaro Winkler gives a false positive, but Smith Waterman Gotoh does not:
Levenshtein: 54
Smith-Waterman Gotoh 49
Jaro Winkler 89
As Anastasiosyal pointed out, SimMetrics has the java code for these algorithms. I had success using the SmithWatermanGotoh java code from SimMetrics.

Here is my implementation of Damerau Levenshtein Distance, which returns not only similarity coefficient, but also returns error locations in corrected word (this feature can be used in text editors). Also my implementation supports different weights of errors (substitution, deletion, insertion, transposition).
public static List<Mistake> OptimalStringAlignmentDistance(
string word, string correctedWord,
bool transposition = true,
int substitutionCost = 1,
int insertionCost = 1,
int deletionCost = 1,
int transpositionCost = 1)
{
int w_length = word.Length;
int cw_length = correctedWord.Length;
var d = new KeyValuePair<int, CharMistakeType>[w_length + 1, cw_length + 1];
var result = new List<Mistake>(Math.Max(w_length, cw_length));
if (w_length == 0)
{
for (int i = 0; i < cw_length; i++)
result.Add(new Mistake(i, CharMistakeType.Insertion));
return result;
}
for (int i = 0; i <= w_length; i++)
d[i, 0] = new KeyValuePair<int, CharMistakeType>(i, CharMistakeType.None);
for (int j = 0; j <= cw_length; j++)
d[0, j] = new KeyValuePair<int, CharMistakeType>(j, CharMistakeType.None);
for (int i = 1; i <= w_length; i++)
{
for (int j = 1; j <= cw_length; j++)
{
bool equal = correctedWord[j - 1] == word[i - 1];
int delCost = d[i - 1, j].Key + deletionCost;
int insCost = d[i, j - 1].Key + insertionCost;
int subCost = d[i - 1, j - 1].Key;
if (!equal)
subCost += substitutionCost;
int transCost = int.MaxValue;
if (transposition && i > 1 && j > 1 && word[i - 1] == correctedWord[j - 2] && word[i - 2] == correctedWord[j - 1])
{
transCost = d[i - 2, j - 2].Key;
if (!equal)
transCost += transpositionCost;
}
int min = delCost;
CharMistakeType mistakeType = CharMistakeType.Deletion;
if (insCost < min)
{
min = insCost;
mistakeType = CharMistakeType.Insertion;
}
if (subCost < min)
{
min = subCost;
mistakeType = equal ? CharMistakeType.None : CharMistakeType.Substitution;
}
if (transCost < min)
{
min = transCost;
mistakeType = CharMistakeType.Transposition;
}
d[i, j] = new KeyValuePair<int, CharMistakeType>(min, mistakeType);
}
}
int w_ind = w_length;
int cw_ind = cw_length;
while (w_ind >= 0 && cw_ind >= 0)
{
switch (d[w_ind, cw_ind].Value)
{
case CharMistakeType.None:
w_ind--;
cw_ind--;
break;
case CharMistakeType.Substitution:
result.Add(new Mistake(cw_ind - 1, CharMistakeType.Substitution));
w_ind--;
cw_ind--;
break;
case CharMistakeType.Deletion:
result.Add(new Mistake(cw_ind, CharMistakeType.Deletion));
w_ind--;
break;
case CharMistakeType.Insertion:
result.Add(new Mistake(cw_ind - 1, CharMistakeType.Insertion));
cw_ind--;
break;
case CharMistakeType.Transposition:
result.Add(new Mistake(cw_ind - 2, CharMistakeType.Transposition));
w_ind -= 2;
cw_ind -= 2;
break;
}
}
if (d[w_length, cw_length].Key > result.Count)
{
int delMistakesCount = d[w_length, cw_length].Key - result.Count;
for (int i = 0; i < delMistakesCount; i++)
result.Add(new Mistake(0, CharMistakeType.Deletion));
}
result.Reverse();
return result;
}
public struct Mistake
{
public int Position;
public CharMistakeType Type;
public Mistake(int position, CharMistakeType type)
{
Position = position;
Type = type;
}
public override string ToString()
{
return Position + ", " + Type;
}
}
public enum CharMistakeType
{
None,
Substitution,
Insertion,
Deletion,
Transposition
}
This code is a part of my project: Yandex-Linguistics.NET.
I wrote some tests and it's seems to me that method is working.
But comments and remarks are welcome.

Here is an alternative approach:
A typical method for finding similarity is Levenshtein distance, and there is no doubt a library with code available.
Unfortunately, this requires comparing to every string. You might be able to write a specialized version of the code to short-circuit the calculation if the distance is greater than some threshold, you would still have to do all the comparisons.
Another idea is to use some variant of trigrams or n-grams. These are sequences of n characters (or n words or n genomic sequences or n whatever). Keep a mapping of trigrams to strings and choose the ones that have the biggest overlap. A typical choice of n is "3", hence the name.
For instance, English would have these trigrams:
Eng
ngl
gli
lis
ish
And England would have:
Eng
ngl
gla
lan
and
Well, 2 out of 7 (or 4 out of 10) match. If this works for you, and you can index the trigram/string table and get a faster search.
You can also combine this with Levenshtein to reduce the set of comparison to those that have some minimum number of n-grams in common.

Here's a VB.net implementation:
Public Shared Function LevenshteinDistance(ByVal v1 As String, ByVal v2 As String) As Integer
Dim cost(v1.Length, v2.Length) As Integer
If v1.Length = 0 Then
Return v2.Length 'if string 1 is empty, the number of edits will be the insertion of all characters in string 2
ElseIf v2.Length = 0 Then
Return v1.Length 'if string 2 is empty, the number of edits will be the insertion of all characters in string 1
Else
'setup the base costs for inserting the correct characters
For v1Count As Integer = 0 To v1.Length
cost(v1Count, 0) = v1Count
Next v1Count
For v2Count As Integer = 0 To v2.Length
cost(0, v2Count) = v2Count
Next v2Count
'now work out the cheapest route to having the correct characters
For v1Count As Integer = 1 To v1.Length
For v2Count As Integer = 1 To v2.Length
'the first min term is the cost of editing the character in place (which will be the cost-to-date or the cost-to-date + 1 (depending on whether a change is required)
'the second min term is the cost of inserting the correct character into string 1 (cost-to-date + 1),
'the third min term is the cost of inserting the correct character into string 2 (cost-to-date + 1) and
cost(v1Count, v2Count) = Math.Min(
cost(v1Count - 1, v2Count - 1) + If(v1.Chars(v1Count - 1) = v2.Chars(v2Count - 1), 0, 1),
Math.Min(
cost(v1Count - 1, v2Count) + 1,
cost(v1Count, v2Count - 1) + 1
)
)
Next v2Count
Next v1Count
'the final result is the cheapest cost to get the two strings to match, which is the bottom right cell in the matrix
'in the event of strings being equal, this will be the result of zipping diagonally down the matrix (which will be square as the strings are the same length)
Return cost(v1.Length, v2.Length)
End If
End Function

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

BinarySearch limits for numbers within range - c#

Related

Split int into chunks with a max value

Algorithm to get which values make sum of a given number from array

Finding the closest integer value, rounded down, from a given array of integers

C# Random Number Generator -1 or 1 only?

How to calculate distance similarity measure of given 2 strings?

Categories

Resources