C# QuickSort algorithm

Today I decided to learn how the QuickSort algorithm works. I studied some examples and watched a few YouTube videos on the matter.
So, after that, I decided to try writing it myself. I did use some pointers from a YouTube channel that I find extremely helpful.
When I try to run the program, however, I get a "Process is terminated due to StackOverflowException" and I'm unable to figure out why.
I would appreciate it if anyone could help me.
private static void QuickSort(int[] numbers, int beginning, int end)
{
int pivotLoc = 0;
if(beginning < end)
{
Partition(numbers, beginning, end, pivotLoc);
QuickSort(numbers, beginning, pivotLoc - 1);
QuickSort(numbers, pivotLoc + 1, end);
}
}
private static void Partition(int[] numbers, int beginning, int end,int pivotLoc)
{
int left = beginning;
int right = end;
int temp;
pivotLoc = left;
while (true)
{
while(numbers[pivotLoc] <= numbers[right] && pivotLoc != right)
{
right--;
}
if(pivotLoc == right)
{
break;
}
else if(numbers[pivotLoc] > numbers[right])
{
temp = numbers[right];
numbers[right] = numbers[pivotLoc];
numbers[pivotLoc] = temp;
pivotLoc = right;
}
while (numbers[pivotLoc] >= numbers[left] && pivotLoc != left)
{
left++;
}
if(pivotLoc == left)
{
break;
}
else if(numbers[pivotLoc] < numbers[left])
{
temp = numbers[left];
numbers[left] = numbers[pivotLoc];
numbers[pivotLoc] = temp;
pivotLoc = left;
}
}
}
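A likely cause of the StackOverflowException, offered as a sketch rather than a confirmed diagnosis: pivotLoc is an int passed by value, so the value Partition assigns never reaches QuickSort, where pivotLoc stays 0. The second recursive call then becomes QuickSort(numbers, 1, end) over and over. One possible fix is to have Partition return the pivot index. The class name QuickSortSketch and the Swap helper below are assumptions added for illustration.

```csharp
public static class QuickSortSketch
{
    public static void QuickSort(int[] numbers, int beginning, int end)
    {
        if (beginning < end)
        {
            // Partition returns the final pivot index instead of
            // writing to a by-value parameter the caller never sees.
            int pivotLoc = Partition(numbers, beginning, end);
            QuickSort(numbers, beginning, pivotLoc - 1);
            QuickSort(numbers, pivotLoc + 1, end);
        }
    }

    private static int Partition(int[] numbers, int beginning, int end)
    {
        int left = beginning;
        int right = end;
        int pivotLoc = left;
        while (true)
        {
            // Scan from the right for an element smaller than the pivot.
            while (numbers[pivotLoc] <= numbers[right] && pivotLoc != right)
            {
                right--;
            }
            if (pivotLoc == right)
            {
                break;
            }
            Swap(numbers, pivotLoc, right);
            pivotLoc = right;

            // Scan from the left for an element larger than the pivot.
            while (numbers[pivotLoc] >= numbers[left] && pivotLoc != left)
            {
                left++;
            }
            if (pivotLoc == left)
            {
                break;
            }
            Swap(numbers, pivotLoc, left);
            pivotLoc = left;
        }
        return pivotLoc;
    }

    // Hypothetical helper: exchanges two array elements.
    private static void Swap(int[] numbers, int i, int j)
    {
        int temp = numbers[i];
        numbers[i] = numbers[j];
        numbers[j] = temp;
    }
}
```

Calling QuickSortSketch.QuickSort(numbers, 0, numbers.Length - 1) sorts the whole array in place. Declaring the original parameter as ref int pivotLoc would be another way to get the value back to the caller.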

Related

How can I make my trie more efficient?

I'm working on a HackerRank problem and I believe my solution is correct. However, most of my test cases time out. Could someone suggest how to improve the efficiency of my code?
The important part of the code starts at the "Trie Class".
The exact question can be found here: https://www.hackerrank.com/challenges/contacts
using System;
using System.Collections.Generic;
using System.IO;
class Solution
{
static void Main(String[] args)
{
int N = Int32.Parse(Console.ReadLine());
string[,] argList = new string[N, 2];
for (int i = 0; i < N; i++)
{
string[] s = Console.ReadLine().Split();
argList[i, 0] = s[0];
argList[i, 1] = s[1];
}
Trie trie = new Trie();
for (int i = 0; i < N; i++)
{
switch (argList[i, 0])
{
case "add":
trie.add(argList[i, 1]);
break;
case "find":
Console.WriteLine(trie.find(argList[i, 1]));
break;
default:
break;
}
}
}
}
class Trie
{
Trie[] trieArray = new Trie[26];
private int findCount = 0;
private bool data = false;
private char name;
public void add(string s)
{
s = s.ToLower();
add(s, this);
}
private void add(string s, Trie t)
{
char first = Char.Parse(s.Substring(0, 1));
int index = first - 'a';
if(t.trieArray[index] == null)
{
t.trieArray[index] = new Trie();
t.trieArray[index].name = first;
}
if (s.Length > 1)
{
add(s.Substring(1), t.trieArray[index]);
}
else
{
t.trieArray[index].data = true;
}
}
public int find(string s)
{
int ans;
s = s.ToLower();
find(s, this);
ans = findCount;
findCount = 0;
return ans;
}
private void find(string s, Trie t)
{
if (t == null)
{
return;
}
if (s.Length > 0)
{
char first = Char.Parse(s.Substring(0, 1));
int index = first - 'a';
find(s.Substring(1), t.trieArray[index]);
}
else
{
for(int i = 0; i < 26; i++)
{
if (t.trieArray[i] != null)
{
find("", t.trieArray[i]);
}
}
if (t.data == true)
{
findCount++;
}
}
}
}
EDIT: I tried some of the suggestions in the comments, but realized that I can't replace s.Substring(1) with s[0], because I actually need s[1..n]. Also, s[0] returns a char, so I would need to call .ToString() on it anyway.
Also, to add a little more information: the idea is that it needs to count ALL names that start with a given prefix. For example:
Input: "He"
Trie Contains:
"Hello"
"Help"
"Heart"
"Ha"
"No"
Output: 3
I could just post a solution here that gives you 40 points, but I guess that wouldn't be any fun.
Use an IEnumerator<char> instead of any string operations
Count while adding values
Any character minus 'a' makes a good array index (gives you values between 0 and 25)
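The "count while adding" hint can be sketched roughly as follows. This is a minimal illustration, not the poster's solution: it assumes the input contains only the letters a to z, and the names PrefixCountingTrie, Node, and PrefixCount are made up for the example. Each node stores how many added words pass through it, so a lookup only walks the prefix and never traverses the subtree.

```csharp
public class PrefixCountingTrie
{
    private sealed class Node
    {
        public readonly Node[] Children = new Node[26];
        // Number of added words whose path passes through this node.
        public int PrefixCount;
    }

    private readonly Node root = new Node();

    public void Add(string word)
    {
        Node node = root;
        node.PrefixCount++; // every word shares the empty prefix
        foreach (char c in word.ToLowerInvariant())
        {
            int index = c - 'a'; // assumes c is in 'a'..'z'
            if (node.Children[index] == null)
            {
                node.Children[index] = new Node();
            }
            node = node.Children[index];
            node.PrefixCount++;
        }
    }

    public int Find(string prefix)
    {
        Node node = root;
        foreach (char c in prefix.ToLowerInvariant())
        {
            node = node.Children[c - 'a'];
            if (node == null)
            {
                return 0; // no added word starts with this prefix
            }
        }
        return node.PrefixCount;
    }
}
```

With the words from the question ("Hello", "Help", "Heart", "Ha", "No") added, Find("He") returns 3. Iterating over the characters with foreach also avoids allocating a new string per character, which Substring does.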
I think this link should be very helpful.
There you can find a good implementation of a trie data structure, although it is a Java implementation. I implemented a trie in C# based on that article a year ago, and used it to solve another HackerRank task as well. It was successful.
In my task I only needed a simple trie (my alphabet has size 2) and a single method for adding new nodes:
public class Trie
{
// for integer representation in binary system 2^32
public static readonly int MaxLengthOfBits = 32;
// alphabet size of the trie
public static readonly int N = 2;
class Node
{
public Node[] next = new Node[Trie.N];
}
private Node _root;
public void AddValue(bool[] binaryNumber)
{
_root = AddValue(_root, binaryNumber, 0);
}
private Node AddValue(Node node, bool[] val, int d)
{
if (node == null) node = new Node();
// once the least significant bit has been added,
// return the node
if (d == val.Length)
{
return node;
}
// get the 0 or 1 index into the next array (length 2)
int index = Convert.ToInt32(val[d]);
node.next[index] = AddValue(node.next[index], val, d + 1);
return node;
}
}

Postfix increment into if, c#

Code example:
using System;
public class Test {
public static void Main() {
int a = 0;
if(a++ == 0){
Console.WriteLine(a);
}
}
}
In this code the Console will write: 1. I can write this code in another way:
public static void Main() {
int a = 0;
if(a == 0){
a++;
Console.WriteLine(a);
}
}
These two examples work exactly the same (from what I know about postfix).
The problem is with this example coming from the Microsoft tutorials:
using System;
public class Document {
// Class allowing to view the document as an array of words:
public class WordCollection {
readonly Document document;
internal WordCollection (Document d){
document = d;
}
// Helper function -- search character array "text", starting
// at character "begin", for word number "wordCount". Returns
//false if there are less than wordCount words. Sets "start" and
//length to the position and length of the word within text
private bool GetWord(char[] text, int begin, int wordCount,
out int start, out int length) {
int end = text.Length;
int count = 0;
int inWord = -1;
start = length = 0;
for (int i = begin; i <= end; ++i){
bool isLetter = i < end && Char.IsLetterOrDigit(text[i]);
if (inWord >= 0) {
if (!isLetter) {
if (count++ == wordCount) {//PROBLEM IS HERE!!!!!!!!!!!!
start = inWord;
length = i - inWord;
return true;
}
inWord = -1;
}
} else {
if (isLetter) {
inWord = i;
}
}
}
return false;
}
//Indexer to get and set words of the containing document:
public string this[int index] {
get
{
int start, length;
if(GetWord(document.TextArray, 0, index, out start,
out length)) {
return new string(document.TextArray, start, length);
} else {
throw new IndexOutOfRangeException();
}
}
set {
int start, length;
if(GetWord(document.TextArray, 0, index, out start,
out length))
{
//Replace the word at start/length with
// the string "value"
if(length == value.Length){
Array.Copy(value.ToCharArray(), 0,
document.TextArray, start, length);
}
else {
char[] newText = new char[document.TextArray.Length +
value.Length - length];
Array.Copy(document.TextArray, 0, newText, 0, start);
Array.Copy(value.ToCharArray(), 0, newText, start, value.Length);
Array.Copy(document.TextArray, start + length, newText,
start + value.Length, document.TextArray.Length - start - length);
document.TextArray = newText;
}
} else {
throw new IndexOutOfRangeException();
}
}
}
public int Count {
get {
int count = 0, start = 0, length = 0;
while (GetWord(document.TextArray, start + length,
0, out start, out length)) {
++count;
}
return count;
}
}
}
// Class allowing the document to be viewed like an array
// of character
public class CharacterCollection {
readonly Document document;
internal CharacterCollection(Document d) {
document = d;
}
//Indexer to get and set character in the containing
//document
public char this[int index] {
get {
return document.TextArray[index];
}
set {
document.TextArray[index] = value;
}
}
//get the count of character in the containing document
public int Count {
get {
return document.TextArray.Length;
}
}
}
//Because the types of the fields have indexers,
//these fields appear as "indexed properties":
public WordCollection Words;
public readonly CharacterCollection Characters;
private char[] TextArray;
public Document(string initialText) {
TextArray = initialText.ToCharArray();
Words = new WordCollection(this);
Characters = new CharacterCollection(this);
}
public string Text {
get {
return new string(TextArray);
}
}
class Test {
static void Main() {
Document d = new Document(
"peter piper picked a peck of pickled peppers. How many pickled peppers did peter piper pick?"
);
//Change word "peter" to "penelope"
for(int i = 0; i < d.Words.Count; ++i){
if (d.Words[i] == "peter") {
d.Words[i] = "penelope";
}
}
for (int i = 0; i < d.Characters.Count; ++i) {
if (d.Characters[i] == 'p') {
d.Characters[i] = 'P';
}
}
Console.WriteLine(d.Text);
}
}
}
If I change the code marked above to this:
if (count == wordCount) {//PROBLEM IS HERE
start = inWord;
length = i - inWord;
count++;
return true;
}
I get an IndexOutOfRangeException, but I don't know why.
Your initial assumption is incorrect (that the two examples work exactly the same). In the following version, count is incremented regardless of whether or not it is equal to wordCount:
if (count++ == wordCount)
{
// Code omitted
}
In this version, count is ONLY incremented when it is equal to wordCount
if (count == wordCount)
{
// Other code omitted
count++;
}
EDIT
This causes the failure because, when you search for the second word (when wordCount is 1), the variable count never equals wordCount, since it is never incremented. GetWord therefore returns false, which triggers the else clause in your get accessor, which throws an IndexOutOfRangeException.
In your version of the code, count is only incremented when count == wordCount; in the Microsoft version, it is incremented whether the condition is met or not.
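The difference can be reproduced in a smaller setting. The sketch below is an illustration only: PostfixDemo and both method names are hypothetical, and "find the n-th even number" stands in for "find the n-th word". Both methods look for the index of the n-th (zero-based) even value and differ only in where count is incremented.

```csharp
public static class PostfixDemo
{
    // count++ runs on every check, so count tracks how many
    // even values have been seen so far.
    public static int NthEvenPostfix(int[] values, int n)
    {
        int count = 0;
        for (int i = 0; i < values.Length; i++)
        {
            if (values[i] % 2 == 0)
            {
                if (count++ == n)
                {
                    return i;
                }
            }
        }
        return -1;
    }

    // count is only incremented inside the branch that returns,
    // so it stays 0 and can never match any n greater than 0.
    public static int NthEvenIncrementInside(int[] values, int n)
    {
        int count = 0;
        for (int i = 0; i < values.Length; i++)
        {
            if (values[i] % 2 == 0)
            {
                if (count == n)
                {
                    count++;
                    return i;
                }
            }
        }
        return -1;
    }
}
```

For the array { 2, 4, 6 } and n = 1, NthEvenPostfix returns 1, while NthEvenIncrementInside returns -1. That -1 plays the same role as GetWord returning false in the modified Microsoft sample.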
using System;
public class Test {
public static void Main() {
int a = 0;
if(a++ == 0){
Console.WriteLine(a);
}
}
}
Is not quite the same as:
public static void Main() {
int a = 0;
if(a == 0){
a++;
Console.WriteLine(a);
}
}
In the second case a++ is executed only if a == 0. In the first case a++ is executed every time we check the condition.
Here is your mistake:
public static void Main() {
int a = 0;
if(a == 0){
a++;
Console.WriteLine(a);
}
}
It should be like this:
public static void Main() {
int a = 0;
if(a == 0){
a++;
Console.WriteLine(a);
}
else
a++;
}
This way a is always incremented. In your code example, count is only incremented when count == wordCount (in which case the method returns true anyway), so you basically never increase count.

Recursive edge search

I'm trying to write an algorithm that takes a list of points visited along an edge and a list of unvisited edges (pairs of points) that make up the rest of the object, and searches them for a path that completes the edge (that is, connects the start to the end). I currently have:
public static int PolygonSearch(Point start, Point end, List<Point> visitedPoints, List<Point[]> unvisitedEdges)
{
int count = 0;
for (int i = unvisitedEdges.Count - 1; i > -1; i--)
{
Point[] line = unvisitedEdges[i];
if (((Equal(line[0], start) && Equal(line[1], end))
|| (Equal(line[1], start) && Equal(line[0], end)))
&& visitedPoints.Count > 2)
{
return count + 1;
}
else if (Equal(start, line[0]))
{
unvisitedEdges.RemoveAt(i);
count += PolygonSearch(line[1], end, visitedPoints, unvisitedEdges);
}
else if (Equal(start, line[1]))
{
unvisitedEdges.RemoveAt(i);
count += PolygonSearch(line[0], end, visitedPoints, unvisitedEdges);
}
}
return count;
}
(start and end being the current start and end points of the line)
The obvious problem here is the removal, which messes up the outer loops, but I'm not sure how to correct for it. I tried creating a new list each time, but that didn't work. (I haven't yet implemented a way to return the path, only to count the valid ones.)
Any help fixing this would be greatly appreciated.
To avoid removing an object, you can set it as 'removed', then ignore it if it is so set.
The following uses a flag called Visited. If it is 'removed', Visited is set to true.
I haven't tested this obviously, but it should give you a general idea of what to do:
// A class rather than a struct, so that setting Visited through the
// list element changes the stored edge and not a copy of it.
public class Edge
{
public Point[] Points;
public bool Visited = false;
}
public static int PolygonSearch(Point start, Point end, List<Point> visitedPoints, List<Edge> unvisitedEdges)
{
int count = 0;
for (int i = unvisitedEdges.Count - 1; i > -1; i--)
{
Edge line = unvisitedEdges[i];
// Ignore edges that have already been marked as 'removed'.
if (line.Visited)
{
continue;
}
if (((Equal(line.Points[0], start) && Equal(line.Points[1], end))
|| (Equal(line.Points[1], start) && Equal(line.Points[0], end)))
&& visitedPoints.Count > 2)
{
return count + 1;
}
else if (Equal(start, line.Points[0]))
{
line.Visited = true;
count += PolygonSearch(line.Points[1], end, visitedPoints, unvisitedEdges);
}
else if (Equal(start, line.Points[1]))
{
line.Visited = true;
count += PolygonSearch(line.Points[0], end, visitedPoints, unvisitedEdges);
}
}
return count;
}

RichTextBox is finding my search terms in funny places

I'm trying to find instances of a string in a WPF RichTextBox. What I have now almost works, but it highlights the wrong section of the document.
private int curSearchLocation;
private void FindNext_Click(object sender, RoutedEventArgs e)
{
TextRange text = new TextRange(RichEditor.Document.ContentStart, RichEditor.Document.ContentEnd);
var location = text.Text.IndexOf(SearchBox.Text, curSearchLocation, StringComparison.CurrentCultureIgnoreCase);
if (location < 0)
{
location = text.Text.IndexOf(SearchBox.Text, StringComparison.CurrentCultureIgnoreCase);
}
if (location >= 0)
{
curSearchLocation = location + 1;
RichEditor.Selection.Select(text.Start.GetPositionAtOffset(location), text.Start.GetPositionAtOffset(location + SearchBox.Text.Length));
}
else
{
curSearchLocation = 0;
MessageBox.Show("Not found");
}
RichEditor.Focus();
}
This is what happens when I search for "document":
This is because GetPositionAtOffset includes non-text elements such as opening and closing tags in its offset, which is not what I want. I couldn't find a way to ignore these elements, and I also couldn't find a way to directly get a TextPointer to the text I want, which would also solve the problem.
How can I get it to highlight the correct text?
Unfortunately the TextRange.Text strips out non-text characters, so in this case the offset computed by IndexOf will be slightly too low. That is the main problem.
I worked on your problem and found a solution that works even when the formatted text spans many paragraphs.
Much of it is adapted from a CodeProject article, which is also worth reading.
int curSearchLocation;
private void FindNext_Click(object sender, RoutedEventArgs e)
{
TextRange text = new TextRange(RichEditor.Document.ContentStart, RichEditor.Document.ContentEnd);
var location = text.Text.IndexOf(SearchBox.Text, curSearchLocation, StringComparison.CurrentCultureIgnoreCase);
if (location < 0)
{
location = text.Text.IndexOf(SearchBox.Text, StringComparison.CurrentCultureIgnoreCase);
}
if (location >= 0)
{
curSearchLocation = location + 1;
Select(location, SearchBox.Text.Length);
}
else
{
curSearchLocation = 0;
MessageBox.Show("Not found");
}
RichEditor.Focus();
}
public void Select(int start, int length)
{
TextPointer tp = RichEditor.Document.ContentStart;
TextPointer tpLeft = GetPositionAtOffset(tp, start, LogicalDirection.Forward);
TextPointer tpRight = GetPositionAtOffset(tp, start + length, LogicalDirection.Forward);
RichEditor.Selection.Select(tpLeft, tpRight);
}
private TextPointer GetPositionAtOffset(TextPointer startingPoint, int offset, LogicalDirection direction)
{
TextPointer binarySearchPoint1 = null;
TextPointer binarySearchPoint2 = null;
// setup arguments appropriately
if (direction == LogicalDirection.Forward)
{
binarySearchPoint2 = this.RichEditor.Document.ContentEnd;
if (offset < 0)
{
offset = Math.Abs(offset);
}
}
if (direction == LogicalDirection.Backward)
{
binarySearchPoint2 = this.RichEditor.Document.ContentStart;
if (offset > 0)
{
offset = -offset;
}
}
// setup for binary search
bool isFound = false;
TextPointer resultTextPointer = null;
int offset2 = Math.Abs(GetOffsetInTextLength(startingPoint, binarySearchPoint2));
int halfOffset = direction == LogicalDirection.Backward ? -(offset2 / 2) : offset2 / 2;
binarySearchPoint1 = startingPoint.GetPositionAtOffset(halfOffset, direction);
int offset1 = Math.Abs(GetOffsetInTextLength(startingPoint, binarySearchPoint1));
// binary search loop
while (isFound == false)
{
if (Math.Abs(offset1) == Math.Abs(offset))
{
isFound = true;
resultTextPointer = binarySearchPoint1;
}
else
if (Math.Abs(offset2) == Math.Abs(offset))
{
isFound = true;
resultTextPointer = binarySearchPoint2;
}
else
{
if (Math.Abs(offset) < Math.Abs(offset1))
{
// this is simple case when we search in the 1st half
binarySearchPoint2 = binarySearchPoint1;
offset2 = offset1;
halfOffset = direction == LogicalDirection.Backward ? -(offset2 / 2) : offset2 / 2;
binarySearchPoint1 = startingPoint.GetPositionAtOffset(halfOffset, direction);
offset1 = Math.Abs(GetOffsetInTextLength(startingPoint, binarySearchPoint1));
}
else
{
// this is more complex case when we search in the 2nd half
int rtfOffset1 = startingPoint.GetOffsetToPosition(binarySearchPoint1);
int rtfOffset2 = startingPoint.GetOffsetToPosition(binarySearchPoint2);
int rtfOffsetMiddle = (Math.Abs(rtfOffset1) + Math.Abs(rtfOffset2)) / 2;
if (direction == LogicalDirection.Backward)
{
rtfOffsetMiddle = -rtfOffsetMiddle;
}
TextPointer binarySearchPointMiddle = startingPoint.GetPositionAtOffset(rtfOffsetMiddle, direction);
int offsetMiddle = GetOffsetInTextLength(startingPoint, binarySearchPointMiddle);
// two cases possible
if (Math.Abs(offset) < Math.Abs(offsetMiddle))
{
// 3rd quarter of search domain
binarySearchPoint2 = binarySearchPointMiddle;
offset2 = offsetMiddle;
}
else
{
// 4th quarter of the search domain
binarySearchPoint1 = binarySearchPointMiddle;
offset1 = offsetMiddle;
}
}
}
}
return resultTextPointer;
}
int GetOffsetInTextLength(TextPointer pointer1, TextPointer pointer2)
{
if (pointer1 == null || pointer2 == null)
return 0;
TextRange tr = new TextRange(pointer1, pointer2);
return tr.Text.Length;
}
I hope this code works for your case.

Comparing names

Is there any simple algorithm to determine the likelihood that two names represent the same person?
I'm not asking for something at the level that a Customs department might be using. Just a simple algorithm that would tell me if 'James T. Clark' is most likely the same name as 'J. Thomas Clark' or 'James Clerk'.
If there is an algorithm in C# that would be great, but I can translate from any language.
Sounds like you're looking for a phonetic algorithm, such as Soundex, NYSIIS, or Double Metaphone. The first is actually what several government departments use, and it is trivial to implement (with many implementations readily available). The second is a slightly more complicated and more precise version of the first. The last works with some non-English names and alphabets.
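To give an idea of how small Soundex is, here is a minimal sketch of the commonly described American Soundex rules: keep the first letter, map the remaining consonants to digits, collapse adjacent letters with the same code, and pad to four characters. The class and method names are assumptions for illustration, and characters outside a to z are simply skipped.

```csharp
using System.Text;

public static class Soundex
{
    // Digit code for each letter a..z; '0' marks letters without a code
    // (vowels plus h, w, y).
    private const string Codes = "01230120022455012623010202";

    public static string Encode(string name)
    {
        var result = new StringBuilder();
        char lastCode = '0';
        foreach (char raw in name)
        {
            char c = char.ToLowerInvariant(raw);
            if (c < 'a' || c > 'z')
            {
                continue; // skip spaces, dots, digits, and so on
            }
            char code = Codes[c - 'a'];
            if (result.Length == 0)
            {
                // The first letter is kept as-is; its code still counts
                // when collapsing duplicates that follow it.
                result.Append(char.ToUpperInvariant(c));
                lastCode = code;
            }
            else if (c == 'h' || c == 'w')
            {
                // h and w do not separate letters that share a code.
                continue;
            }
            else
            {
                if (code != '0' && code != lastCode)
                {
                    result.Append(code);
                }
                lastCode = code; // a vowel resets lastCode to '0'
                if (result.Length == 4)
                {
                    break;
                }
            }
        }
        return result.ToString().PadRight(4, '0');
    }
}
```

Soundex.Encode("Clark") and Soundex.Encode("Clerk") both give "C462", so the two surnames from the question compare as equal. Soundex is usually applied to one name part at a time, so initials and middle names would still need separate handling.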
Levenshtein distance is a definition of distance between two arbitrary strings. It gives you a distance of 0 between identical strings and non-zero between different strings, which might also be useful if you decide to make a custom algorithm.
Levenshtein is close, although maybe not exactly what you want.
I faced a similar problem and tried Levenshtein distance first, but it did not work well for me. I came up with an algorithm that gives you a "similarity" value between two strings (a higher value means more similar strings, with "1" for identical strings). This value is not very meaningful by itself (if not "1", it is always 0.5 or less), but it works quite well when you combine it with the Hungarian algorithm to find matching pairs from two lists of strings.
Use like this:
PartialStringComparer cmp = new PartialStringComparer();
tbResult.Text = cmp.Compare(textBox1.Text, textBox2.Text).ToString();
The code behind:
public class SubstringRange {
string masterString;
public string MasterString {
get { return masterString; }
set { masterString = value; }
}
int start;
public int Start {
get { return start; }
set { start = value; }
}
int end;
public int End {
get { return end; }
set { end = value; }
}
public int Length {
get { return End - Start; }
set { End = Start + value;}
}
public bool IsValid {
get { return MasterString.Length >= End && End >= Start && Start >= 0; }
}
public string Contents {
get {
if(IsValid) {
return MasterString.Substring(Start, Length);
} else {
return "";
}
}
}
public bool OverlapsRange(SubstringRange range) {
return !(End < range.Start || Start > range.End);
}
public bool ContainsRange(SubstringRange range) {
return range.Start >= Start && range.End <= End;
}
public bool ExpandTo(string newContents) {
if(MasterString.Substring(Start).StartsWith(newContents, StringComparison.InvariantCultureIgnoreCase) && newContents.Length > Length) {
Length = newContents.Length;
return true;
} else {
return false;
}
}
}
public class SubstringRangeList: List<SubstringRange> {
string masterString;
public string MasterString {
get { return masterString; }
set { masterString = value; }
}
public SubstringRangeList(string masterString) {
this.MasterString = masterString;
}
public SubstringRange FindString(string s){
foreach(SubstringRange r in this){
if(r.Contents.Equals(s, StringComparison.InvariantCultureIgnoreCase))
return r;
}
return null;
}
public SubstringRange FindSubstring(string s){
foreach(SubstringRange r in this){
if(r.Contents.StartsWith(s, StringComparison.InvariantCultureIgnoreCase))
return r;
}
return null;
}
public bool ContainsRange(SubstringRange range) {
foreach(SubstringRange r in this) {
if(r.ContainsRange(range))
return true;
}
return false;
}
public bool AddSubstring(string substring) {
bool result = false;
foreach(SubstringRange r in this) {
if(r.ExpandTo(substring)) {
result = true;
}
}
if(FindSubstring(substring) == null) {
bool patternfound = true;
int start = 0;
while(patternfound){
patternfound = false;
start = MasterString.IndexOf(substring, start, StringComparison.InvariantCultureIgnoreCase);
patternfound = start != -1;
if(patternfound) {
SubstringRange r = new SubstringRange();
r.MasterString = this.MasterString;
r.Start = start++;
r.Length = substring.Length;
if(!ContainsRange(r)) {
this.Add(r);
result = true;
}
}
}
}
return result;
}
private static bool SubstringRangeMoreThanOneChar(SubstringRange range) {
return range.Length > 1;
}
public float Weight {
get {
if(MasterString.Length == 0 || Count == 0)
return 0;
float numerator = 0;
int denominator = 0;
foreach(SubstringRange r in this.FindAll(SubstringRangeMoreThanOneChar)) {
numerator += r.Length;
denominator++;
}
if(denominator == 0)
return 0;
return numerator / denominator / MasterString.Length;
}
}
public void RemoveOverlappingRanges() {
SubstringRangeList l = new SubstringRangeList(this.MasterString);
l.AddRange(this);//create a copy of this list
foreach(SubstringRange r in l) {
if(this.Contains(r) && this.ContainsRange(r)) {
Remove(r);//try to remove the range
if(!ContainsRange(r)) {//see if the list still contains "superset" of this range
Add(r);//if not, add it back
}
}
}
}
public void AddStringToCompare(string s) {
for(int start = 0; start < s.Length; start++) {
for(int len = 1; start + len <= s.Length; len++) {
string part = s.Substring(start, len);
if(!AddSubstring(part))
break;
}
}
RemoveOverlappingRanges();
}
}
public class PartialStringComparer {
public float Compare(string s1, string s2) {
SubstringRangeList srl1 = new SubstringRangeList(s1);
srl1.AddStringToCompare(s2);
SubstringRangeList srl2 = new SubstringRangeList(s2);
srl2.AddStringToCompare(s1);
return (srl1.Weight + srl2.Weight) / 2;
}
}
The Levenshtein distance version is much simpler (adapted from http://www.merriampark.com/ld.htm):
public class Distance {
/// <summary>
/// Compute Levenshtein distance
/// </summary>
/// <param name="s">String 1</param>
/// <param name="t">String 2</param>
/// <returns>Distance between the two strings.
/// The larger the number, the bigger the difference.
/// </returns>
public static int LD(string s, string t) {
int n = s.Length; //length of s
int m = t.Length; //length of t
int[,] d = new int[n + 1, m + 1]; // matrix
int cost; // cost
// Step 1
if(n == 0) return m;
if(m == 0) return n;
// Step 2
for(int i = 0; i <= n; d[i, 0] = i++) ;
for(int j = 0; j <= m; d[0, j] = j++) ;
// Step 3
for(int i = 1; i <= n; i++) {
//Step 4
for(int j = 1; j <= m; j++) {
// Step 5
cost = (t.Substring(j - 1, 1) == s.Substring(i - 1, 1) ? 0 : 1);
// Step 6
d[i, j] = System.Math.Min(System.Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1), d[i - 1, j - 1] + cost);
}
}
// Step 7
return d[n, m];
}
}
I doubt there is, considering even the Customs Department doesn't seem to have a satisfactory answer...
If there is a solution to this problem I seriously doubt it's a part of core C#. Off the top of my head, it would require a database of first, middle and last name frequencies, as well as account for initials, as in your example. This is fairly complex logic that relies on a database of information.
I second Levenshtein distance. What language do you want? I was able to find an implementation in C# on CodeProject pretty easily.
In an application I worked on, the last name field was considered reliable.
So we presented all the records with the same last name to the user.
The user could sort by the other fields to look for similar names.
This solution was good enough to greatly reduce the issue of users creating duplicate records.
Basically, it looks like the issue will require human judgement.
