I have a list with a lot of strings (>5000) where I have to take the first element and compare it to all following elements. E.g., consider this list of strings:
{ one, two, three, four, five, six, seven, eight, nine, ten }. First I take one and compare it with two, three, four, and so on; afterwards I take two and compare it with three, four, ...
I believe the lookup is the reason this takes so long. On a normal HDD (7200 rpm) it takes about 30 seconds, on an SSD 10 seconds. I just don't know how I can speed this up even more. All strings inside the list are ordered ascending, and it is important to check them in this order. If it would speed things up considerably, I would not mind using an unordered list.
I took a look at HashSet, but I need the checking order, so that would not work even with its fast Contains method.
EDIT: As it looks like I was not clear enough, and as requested by Dusan, here is the complete code. My problem case: I have a lot of directories with similar names. I get a collection with only the directory names, compare them with each other for similarity, and write the results. Hence the comparison between HDD and SSD. That is weird, though, because I am not writing instantly; instead I put the results in a field and write them at the end. Still there is a difference in speed.
Why did I not include the whole code? Because I believe my core issue here is the lookup of values from the list and the comparison between two strings. Everything else should already be sufficiently fast: adding to the list, looking in the blacklist (a HashSet), and getting the list of directory names.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.IO;
using System.Text.RegularExpressions;
using System.Diagnostics;
using System.Threading;
namespace Similarity
{
/// <summary>
/// Credit http://www.dotnetperls.com/levenshtein
/// Contains approximate string matching
/// </summary>
internal static class LevenshteinDistance
{
/// <summary>
/// Compute the distance between two strings.
/// </summary>
public static int Compute(string s, string t)
{
int n = s.Length;
int m = t.Length;
int[,] d = new int[n + 1, m + 1];
// Step 1
if (n == 0)
{
return m;
}
if (m == 0)
{
return n;
}
// Step 2
for (int i = 0; i <= n; d[i, 0] = i++)
{
}
for (int j = 0; j <= m; d[0, j] = j++)
{
}
// Step 3
for (int i = 1; i <= n; i++)
{
//Step 4
for (int j = 1; j <= m; j++)
{
// Step 5
int cost = (t[j - 1] == s[i - 1]) ? 0 : 1;
// Step 6
d[i, j] = Math.Min(
Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
d[i - 1, j - 1] + cost);
}
}
// Step 7
return d[n, m];
}
}
internal class Program
{
#region Properties
private static HashSet<string> _blackList = new HashSet<string>();
public static HashSet<string> blackList
{
get
{
return _blackList;
}
}
public static void AddBlackListEntry(string line)
{
blackList.Add(line);
}
private static List<string> _similar = new List<string>();
public static List<string> similar
{
get
{
return _similar;
}
}
public static void AddSimilarEntry(string s)
{
similar.Add(s);
}
#endregion Properties
private static void Main(string[] args)
{
Clean();
var directories = Directory.EnumerateDirectories(Directory.GetCurrentDirectory(), "*", SearchOption.TopDirectoryOnly)
.Select(x => new DirectoryInfo(x).Name).OrderBy(name => name).ToList();
using (StreamWriter sw = new StreamWriter(@"result.txt"))
{
foreach (var item in directories)
{
Console.WriteLine(item);
sw.WriteLine(item);
}
Console.WriteLine("Amount of directories: " + directories.Count());
}
if (directories.Count != 0)
{
StartSimilarityCheck(directories);
}
else
{
Console.WriteLine("No directories");
}
WriteResult(similar);
Console.WriteLine("Finish. Press any key to exit...");
Console.ReadKey();
}
private static void StartSimilarityCheck(List<string> whiteList)
{
int counter = 0; // how many did we check yet?
var watch = Stopwatch.StartNew();
foreach (var dirName in whiteList)
{
bool insertDirName = true;
if (!IsBlackList(dirName))
{
// start the next element
for (int i = counter + 1; i <= whiteList.Count; i++)
{
// end of index reached
if (i == whiteList.Count)
{
break;
}
int similarity = LevenshteinDistance.Compute(dirName, whiteList[i]);
// low score means high similarity
if (similarity < 2)
{
if (insertDirName)
{
//Writer(dirName);
AddSimilarEntry(dirName);
insertDirName = false;
}
//Writer(whiteList[i]);
AddSimilarEntry(whiteList[i]);
AddBlackListEntry(whiteList[i]);
}
}
}
Console.WriteLine(counter);
//Console.WriteLine("Skip: {0}", dirName);
counter++;
}
watch.Stop();
Console.WriteLine("Time: " + watch.ElapsedMilliseconds / 1000);
}
private static void WriteResult(List<string> list)
{
using (StreamWriter sw = new StreamWriter(@"similar.txt", true, Encoding.UTF8, 65536))
{
foreach (var item in list)
{
sw.WriteLine(item);
}
}
}
private static void Clean()
{
// yeah, hardcoded file names incoming. Better than global variables??
if (File.Exists(@"similar.txt"))
{
File.Delete(@"similar.txt");
}
if (File.Exists(@"result.txt"))
{
File.Delete(@"result.txt");
}
}
private static void Writer(string s)
{
using (StreamWriter sw = new StreamWriter(@"similar.txt", true, Encoding.UTF8, 65536))
{
sw.WriteLine(s);
}
}
private static bool IsBlackList(string name)
{
return blackList.Contains(name);
}
}
To fix the bottleneck, which is the second for loop: insert an if condition that checks whether the similarity is >= the threshold we want; if that is the case, break out of the loop. Now it runs in 1 second. Thanks, everyone.
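In sketch form, the early exit inside the inner loop looks like this (threshold and the variable names are taken from the full code further down):
int similarity = LevenshteinDistance.Compute(dirName, whiteList[i]);
// The list is sorted ascending, so once the distance reaches the
// threshold, the following neighbours will not be any more similar.
if (similarity >= threshold)
{
    break;
}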
Your inner loop uses a strange double check. This may prevent an important JIT optimization: the removal of redundant range checks.
//foreach (var item in myList)
for (int j = 0; j < myList.Count-1; j++)
{
string item1 = myList[j];
for (int i = j + 1; i < myList.Count; i++)
{
string item2 = myList[i];
// if (i == myList.Count)
...
}
}
The amount of downvotes is crazy, but oh well... I found the reason for my performance issue / bottleneck thanks to the comments.
The second for loop inside StartSimilarityCheck() iterates over all entries, which in itself is no problem, but viewed in terms of performance and efficiency it is bad. The solution is to only check strings that are in the neighborhood, but how do we know which ones are?
First, we get a list ordered ascending. That gives us a rough grouping of similar strings. Now we define a threshold for the Levenshtein score (a smaller score means higher similarity between two strings). If the score is higher than the threshold, the strings are not very similar, and we can break out of the inner loop. That saves time and lets the program finish really fast. Note that this approach is not bulletproof, IMHO: if the first string is 0Directory, it will be near the beginning of the list, while a string like zDirectory will be further down and could be missed. Correct me if I am wrong...
private static void StartSimilarityCheck(List<string> whiteList)
{
var watch = Stopwatch.StartNew();
for (int j = 0; j < whiteList.Count - 1; j++)
{
string dirName = whiteList[j];
bool insertDirName = true;
int threshold = 2;
if (!IsBlackList(dirName))
{
// start with the next element
for (int i = j + 1; i < whiteList.Count; i++)
{
int similarity = LevenshteinDistance.Compute(dirName, whiteList[i]);
// low score means high similarity; since the list is sorted,
// entries past the threshold will not get more similar, so stop
if (similarity >= threshold)
{
break;
}
if (insertDirName)
{
AddSimilarEntry(dirName);
AddSimilarEntry(whiteList[i]);
AddBlackListEntry(whiteList[i]);
insertDirName = false;
}
else
{
AddBlackListEntry(whiteList[i]);
}
}
}
Console.WriteLine(j);
}
watch.Stop();
Console.WriteLine("Ms: " + watch.ElapsedMilliseconds);
Console.WriteLine("Similar entries: " + similar.Count);
}
Related
I have a List<Keyword> where the Keyword class is:
public string keyword;
public List<int> ids;
public int hidden;
public int live;
public bool worked;
Each Keyword has its own keyword string, a set of 20 ids, live set to 1 by default and hidden to 0.
I just need to iterate over the whole main list to invalidate those keywords that share too many ids with an earlier one (6 or more in the code below): comparing every pair, if the second one has 6 or more ids in common with the first, its hidden is set to 1 and its live to 0.
The algorithm is very basic, but it takes too long when the main list has many elements.
I'm trying to figure out whether there is any method I could use to increase the speed.
The basic algorithm I use is:
foreach (Keyword main_keyword in lista_de_keywords_live)
{
if (main_keyword.worked) {
continue;
}
foreach (Keyword keyword_to_compare in lista_de_keywords_live)
{
if (keyword_to_compare.worked || keyword_to_compare == main_keyword) continue;
int n_ids_same = 0;
foreach (int id in main_keyword.ids)
{
if (keyword_to_compare.ids.IndexOf(id) >= 0)
{
if (++n_ids_same >= 6) break;
}
}
if (n_ids_same >= 6)
{
keyword_to_compare.hidden = 1;
keyword_to_compare.live = 0;
keyword_to_compare.worked = true;
}
}
}
The code below is an example of how you could use a HashSet for your problem. However, I would not recommend using it in this scenario. On the other hand, the idea of sorting the ids to make the comparison faster still applies.
Run it in a Console Project to try it out.
Notice that once I'm done adding new ids to a keyword, I sort them. This makes the comparison faster later on.
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;
namespace KeywordExample
{
public class Keyword
{
public List<int> ids;
public int hidden;
public int live;
public bool worked;
public Keyword()
{
ids = new List<int>();
hidden = 0;
live = 1;
worked = false;
}
public override string ToString()
{
StringBuilder s = new StringBuilder();
if (ids.Count > 0)
{
s.Append(ids[0]);
for (int i = 1; i < ids.Count; i++)
{
s.Append(',' + ids[i].ToString());
}
}
return s.ToString();
}
}
public class KeywordComparer : EqualityComparer<Keyword>
{
public override bool Equals(Keyword k1, Keyword k2)
{
int equals = 0;
int i = 0;
int j = 0;
//based on sorted ids
while (i < k1.ids.Count && j < k2.ids.Count)
{
if (k1.ids[i] < k2.ids[j])
{
i++;
}
else if (k1.ids[i] > k2.ids[j])
{
j++;
}
else
{
equals++;
i++;
j++;
}
}
return equals >= 6;
}
public override int GetHashCode(Keyword keyword)
{
return 0;//notice that using the same hash for all keywords gives you an O(n^2) time complexity though.
}
}
class Program
{
static void Main(string[] args)
{
List<Keyword> listOfKeywordsLive = new List<Keyword>();
//add some values
Random random = new Random();
int n = 10;
int sizeOfMaxId = 20;
for (int i = 0; i < n; i++)
{
var newKeyword = new Keyword();
for (int j = 0; j < 20; j++)
{
newKeyword.ids.Add(random.Next(sizeOfMaxId) + 1);
}
newKeyword.ids.Sort(); //sorting the ids
listOfKeywordsLive.Add(newKeyword);
}
//solution here
HashSet<Keyword> set = new HashSet<Keyword>(new KeywordComparer());
set.Add(listOfKeywordsLive[0]);
for (int i = 1; i < listOfKeywordsLive.Count; i++)
{
Keyword keywordToCompare = listOfKeywordsLive[i];
if (!set.Add(keywordToCompare))
{
keywordToCompare.hidden = 1;
keywordToCompare.live = 0;
keywordToCompare.worked = true;
}
}
//print all keywords to check
Console.WriteLine(set.Count + "/" + n + " inserted");
foreach (var keyword in set)
{
Console.WriteLine(keyword);
}
}
}
}
The obvious source of inefficiency is the way you calculate the intersection of two lists (of ids). The algorithm is O(n^2). This is, by the way, a problem that relational databases solve for every join; your approach would be called a loop join. The main efficient strategies are the hash join and the merge join. For your scenario the latter may be the better fit, I guess, but you can also try HashSets if you like.
The second source of inefficiency is repeating everything twice. Since (a join b) equals (b join a), you do not need two full passes over the whole List<Keyword>. Actually, you only need to loop over the non-duplicate ones.
Using some code from here, you can write the algorithm like:
Parallel.ForEach(list, k => k.ids.Sort());
List<Keyword> result = new List<Keyword>();
foreach (var k in list)
{
if (result.Any(r => r.ids.IntersectSorted(k.ids, Comparer<int>.Default)
.Skip(5)
.Any()))
{
k.hidden = 1;
k.live = 0;
k.worked = true;
}
else
{
result.Add(k);
}
}
If you replace the LINQ with just the index-manipulation approach (see the link above), it would be a tiny bit faster, I guess.
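For reference, IntersectSorted is not a built-in LINQ method; it comes from the linked code. A minimal sketch of such an extension, assuming it does a classic merge-style intersection of two sorted lists, could look like this:
using System.Collections.Generic;
public static class SortedListExtensions
{
    // Merge-style intersection of two sorted lists in O(n + m).
    // Yields one element per matching pair of positions.
    public static IEnumerable<T> IntersectSorted<T>(this IList<T> first,
        IList<T> second, IComparer<T> comparer)
    {
        int i = 0, j = 0;
        while (i < first.Count && j < second.Count)
        {
            int cmp = comparer.Compare(first[i], second[j]);
            if (cmp < 0)
            {
                i++; // first list is behind, advance it
            }
            else if (cmp > 0)
            {
                j++; // second list is behind, advance it
            }
            else
            {
                yield return first[i];
                i++;
                j++;
            }
        }
    }
}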
I'm currently working on a project where I have to choose a file and then sort it along with other files. The files are just filled with numbers, but the files have linked data: the first number in one file is linked to the first number in the second file, and so on. I currently have code that lets me read a file and display it unsorted and sorted (using bubble sort). I am not sure how to apply this same principle to multiple files at once, so that I can choose a file and sort it in line with the same method I have for a single file.
Currently, the program loads and asks the user to choose between 1 and 2. Choosing 1 shows the data unsorted and 2 shows it sorted. I can read in the files; the problem is sorting and displaying them in order. Basically: how do I sort multiple files that are linked together? What steps do I need to take to do this?
An example of one file:
4
28
77
96
An example of the second file:
66.698
74.58
2.54
48.657
Any help would be appreciated.
Thanks
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
public class Program
{
public Program()
{
}
public void ReadFile(int[] numbers)
{
string path = "Data1/Day_1.txt";
using (StreamReader sr = new StreamReader(path))
{
for (int index = 0; index < numbers.Length; index++)
{
numbers[index] = Convert.ToInt32(sr.ReadLine());
}
} // the using block closes the file when done, even on an exception
}
public void SortArray(int[] numbers)
{
bool swap;
int temp;
do
{
swap = false;
for (int index = 0; index < (numbers.Length - 1); index++)
{
if (numbers[index] > numbers[index + 1])
{
temp = numbers[index];
numbers[index] = numbers[index + 1];
numbers[index + 1] = temp;
swap = true;
}
}
} while (swap == true);
}
public void DisplayArray(int[] numbers)
{
for(int index = 0; index < numbers.Length; index++)
{
Console.WriteLine("{0}", numbers[index]);
}
}
}
The main is in another file to keep work organised:
using System;
public class FileDemoTest
{
public static void Main(string[] args)
{
int[] numbers = new int[300];
Program obj = new Program();
int operation = 0;
Console.WriteLine("1 or 2 ?");
operation = Convert.ToInt32(Console.ReadLine());
// call the read from file method
obj.ReadFile(numbers);
if (operation == 1)
{
//Display unsorted values
Console.Write("Unsorted:");
obj.DisplayArray(numbers);
}
if (operation == 2)
{
//sort numbers and display
obj.SortArray(numbers);
Console.Write("Sorted: ");
obj.DisplayArray(numbers);
}
}
}
What I would do is create a class that holds the values from file 1 and file 2. Then you can populate a list of these classes by reading values from both files. After that, you can sort the list of classes on either field, and the relationship will be maintained (since the two values are stored in a single object).
Here's an example of the class that would hold the file data:
public class FileData
{
public int File1Value { get; set; }
public decimal File2Value { get; set; }
/// <summary>
/// Provides a friendly string representation of this object
/// </summary>
/// <returns>A string showing the File1 and File2 values</returns>
public override string ToString()
{
return $"{File1Value}: {File2Value}";
}
}
Then you can create a method that reads both files and returns an array of these objects (note that this uses LINQ, so it needs using System.Linq; along with using System.IO;):
public static FileData[] GetFileData(string firstFilePath, string secondFilePath)
{
// These guys hold the strongly typed version of the string values in the files
int intHolder = 0;
decimal decHolder = 0;
// Get a list of ints from the first file
var fileOneValues = File
.ReadAllLines(firstFilePath)
.Where(line => int.TryParse(line, out intHolder))
.Select(v => intHolder)
.ToArray();
// Get a list of decimals from the second file
var fileTwoValues = File
.ReadAllLines(secondFilePath)
.Where(line => decimal.TryParse(line, out decHolder))
.Select(v => decHolder)
.ToArray();
// I guess the file lengths should match, but in case they don't,
// use the size of the smaller one so we have matches for all items
var numItems = Math.Min(fileOneValues.Count(), fileTwoValues.Count());
// Populate an array of new FileData objects
var fileData = new FileData[numItems];
for (var index = 0; index < numItems; index++)
{
fileData[index] = new FileData
{
File1Value = fileOneValues[index],
File2Value = fileTwoValues[index]
};
}
return fileData;
}
Now we need to modify your sorting method to work on a FileData array instead of an int array. I also added an argument that, if set to false, sorts on the File2Value field instead of File1Value:
public static void SortArray(FileData[] fileData, bool sortOnFile1Data = true)
{
bool swap;
do
{
swap = false;
for (int index = 0; index < (fileData.Length - 1); index++)
{
bool comparison = sortOnFile1Data
? fileData[index].File1Value > fileData[index + 1].File1Value
: fileData[index].File2Value > fileData[index + 1].File2Value;
if (comparison)
{
var temp = fileData[index];
fileData[index] = fileData[index + 1];
fileData[index + 1] = temp;
swap = true;
}
}
} while (swap);
}
And finally, you can display the unsorted and sorted lists of data. Note I added a second question for the user if they choose "Sorted", where they can decide whether to sort by File1 or File2:
private static void Main()
{
Console.WriteLine("1 or 2 ?");
int operation = Convert.ToInt32(Console.ReadLine());
var fileData = GetFileData(@"f:\public\temp\temp1.txt", @"f:\public\temp\temp2.txt");
if (operation == 1)
{
//Display unsorted values
Console.WriteLine("Unsorted:");
foreach (var data in fileData)
{
Console.WriteLine(data);
}
}
if (operation == 2)
{
Console.WriteLine("Sort on File1 or File2 (1 or 2)?");
operation = Convert.ToInt32(Console.ReadLine());
//sort numbers and display
SortArray(fileData, operation == 1);
Console.WriteLine("Sorted: ");
foreach (var data in fileData)
{
Console.WriteLine(data);
}
}
Console.Write("\nDone!\nPress any key to exit...");
Console.ReadKey();
}
If you wanted to use an int to decide which field to sort on, you could do something like the following:
public static void SortArray(FileData[] fileData, int sortFileNumber = 1)
{
bool swap;
do
{
swap = false;
for (int index = 0; index < (fileData.Length - 1); index++)
{
bool comparison;
// Set our comparison to the field sortFileNumber
if (sortFileNumber == 1)
{
comparison = fileData[index].File1Value > fileData[index + 1].File1Value;
}
else if (sortFileNumber == 2)
{
comparison = fileData[index].File2Value > fileData[index + 1].File2Value;
}
else // default to File1Value for anything else
{
comparison = fileData[index].File1Value > fileData[index + 1].File1Value;
}
if (comparison)
{
var temp = fileData[index];
fileData[index] = fileData[index + 1];
fileData[index + 1] = temp;
swap = true;
}
}
} while (swap);
}
When I run the following locally it is fast, but when I submit it to Kattis, it only passes 2/5 of the test cases and then I get Time Limit Exceeded.
Any suggestions?
I have tried an input file with 10,000 numbers and it is still fast locally :S
using System;
namespace phonelist
{
class Program
{
static void Main(string[] args)
{
int nrOfPhoneNrs = 0;
bool prefixFound;
int nrOfTestCases = Convert.ToInt32(Console.ReadLine().Trim());
for (byte i = 0; i < nrOfTestCases; i++)
{
prefixFound = false;
nrOfPhoneNrs = Convert.ToInt32(Console.ReadLine().Trim());
string[] phList = new string[nrOfPhoneNrs];
int n = 0;
while (n < nrOfPhoneNrs)
{
phList[n] = Console.ReadLine();
n++;
}
Array.Sort(phList);
int runs = nrOfPhoneNrs - 1;
for (int p = 0; p < runs; p++)
{
if (phList[p + 1].StartsWith(phList[p]))
{
prefixFound = true;
break;
}
}
Console.WriteLine(prefixFound ? "NO" : "YES");
}
}
}
}
I think that your main problem is that you're using the StartsWith and Array.Sort methods.
I don't want to give you too detailed advice (so that you can still solve it by yourself), but let me just suggest considering a different data structure than an array of strings, perhaps a HashSet<string>.
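To hint at the idea without spelling out a full solution: put all numbers of a test case into a HashSet<string> and, for each number, look up its proper prefixes in the set. This is a sketch of my interpretation, not a complete program; it slots into the body of the per-test-case loop, and note that duplicate numbers would need extra handling, since a duplicate is also inconsistent but collapses in the set:
var numbers = new HashSet<string>(phList);
bool prefixFound = false;
foreach (var number in phList)
{
    // Check every proper prefix of this number against the set.
    for (int len = 1; len < number.Length && !prefixFound; len++)
    {
        prefixFound = numbers.Contains(number.Substring(0, len));
    }
    if (prefixFound) break;
}
Console.WriteLine(prefixFound ? "NO" : "YES");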
I was making some optimizations to an algorithm that finds the smallest number bigger than X in a given array, but then I stumbled on a strange difference. In the code below, "ForeachUpper" finishes in 625 ms, while "ForUpper" takes, I believe, a few hours (insanely slower). Why is that?
class Teste
{
public double Valor { get; set; }
public Teste(double d)
{
Valor = d;
}
public override string ToString()
{
return "Teste: " + Valor;
}
}
private static IEnumerable<Teste> GetTeste(double total)
{
for (int i = 1; i <= total; i++)
{
yield return new Teste(i);
}
}
static void Main(string[] args)
{
int total = 1000 * 1000 * 30;
double test = total / 2 + .7;
var ieTeste = GetTeste(total).ToList();
Console.WriteLine("------------");
ForeachUpper(ieTeste.Select(d=>d.Valor), test);
Console.WriteLine("------------");
ForUpper(ieTeste.Select(d => d.Valor), test);
Console.Read();
}
private static void ForUpper(IEnumerable<double> bigList, double find)
{
var start1 = DateTime.Now;
double uppper = 0;
for (int i = 0; i < bigList.Count(); i++)
{
var toMatch = bigList.ElementAt(i);
if (toMatch >= find)
{
uppper = toMatch;
break;
}
}
var end1 = (DateTime.Now - start1).TotalMilliseconds;
Console.WriteLine(end1 + " = " + uppper);
}
private static void ForeachUpper(IEnumerable<double> bigList, double find)
{
var start1 = DateTime.Now;
double upper = 0;
foreach (var toMatch in bigList)
{
if (toMatch >= find)
{
upper = toMatch;
break;
}
}
var end1 = (DateTime.Now - start1).TotalMilliseconds;
Console.WriteLine(end1 + " = " + upper);
}
Thanks
IEnumerable<T> is not indexable.
The Count() and ElementAt() extension methods that you call in every iteration of your for loop are O(n); they need to loop through the collection to find the count or the nth element.
Moral: Know thy collection types.
The reason for this difference is that your for loop executes bigList.Count() at every iteration. This is really costly in your case, because it executes the Select and iterates the complete result set.
Furthermore, you are using ElementAt, which again executes the Select and iterates it up to the index you provided.
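For illustration, one minimal fix that keeps the index-based style is to materialize the sequence once into a List<double> and index into that; this is a sketch, not the only option:
// Enumerates the Select exactly once; afterwards Count and the
// indexer are O(1) on List<double>.
List<double> items = bigList.ToList();
double upper = 0;
for (int i = 0; i < items.Count; i++)
{
    if (items[i] >= find)
    {
        upper = items[i];
        break;
    }
}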
A very strange thing occurred in my program. Here is the simplified code.
class Program
{
static void Main(string[] args)
{
ArrayList numbers = new ArrayList();
numbers.Add(1);
numbers.Add(3);
numbers.Add(4);
numbers.Add(2);
var it = Sorts.MergeSort((ArrayList)numbers.Clone());
Sorts.PrintArray(it, "mergesort");
Console.WriteLine("DONE");
Console.ReadLine();
}
}
public static class Sorts
{
public static ArrayList BubbleSort(ArrayList numbers)
{
for (int i = 0; i < numbers.Count; i++)
{
bool sorted = true; // reset each pass so the early exit works
for (int j = 1; j < numbers.Count; j++)
{
if ((int)numbers[j - 1] > (int)numbers[j])
{
int tmp = (int)numbers[j - 1];
numbers[j - 1] = numbers[j];
numbers[j] = tmp;
sorted = false;
}
}
if (sorted)
{
return numbers;
}
}
return numbers;
}
public static ArrayList MergeSort(ArrayList numbers, int switchLimit = 3)
{
//if I use this if - everything works
if (numbers.Count <= 1)
{
// return numbers;
}
//the moment I use this condition - it throws a System.InvalidOperationException in the Merge function at the line of the "for" loop
if (numbers.Count <=switchLimit)
{
return Sorts.BubbleSort(numbers);
}
ArrayList ret = new ArrayList();
int middle = numbers.Count / 2;
ArrayList L = numbers.GetRange(0, middle);
ArrayList R = numbers.GetRange(middle, numbers.Count - middle);
L = MergeSort(L);
R = MergeSort(R);
return Merge(L, R);
}
private static ArrayList Merge(ArrayList L, ArrayList R)
{
ArrayList ret = new ArrayList();
int l = 0;
int r = 0;
for (int i = 0; i < L.Count + R.Count; i++)
{
if (l == L.Count)
{
ret.Add(R[r++]);
}
else if (r == R.Count)
{
ret.Add(L[l++]);
}
else if ((int)L[l] < (int)R[r])
{
ret.Add(L[l++]);
}
else
{
ret.Add(R[r++]);
}
}
return ret;
}
//---------------------------------------------------------------------------------
public static void PrintArray(ArrayList arr, string txt = "", int sleep = 0)
{
Console.WriteLine("{1}({0}): ", arr.Count, txt);
for (int i = 0; i < arr.Count; i++)
{
Console.WriteLine(arr[i].ToString().PadLeft(10));
}
Console.WriteLine();
System.Threading.Thread.Sleep(sleep);
}
}
There is a problem with my Sorts.MergeSort function.
When I use it normally (take a look at the first if condition in the function), everything works perfectly.
But the moment I want it to switch to bubble sort for smaller inputs (the second if condition in the function), it throws a System.InvalidOperationException. I have no idea where the problem is.
Do you see it?
Thanks. :)
Remark: bubble sort itself works, so the problem shouldn't be in that sort...
The problem is with your use of GetRange:
This method does not create copies of the elements. The new ArrayList is only a view window into the source ArrayList. However, all subsequent changes to the source ArrayList must be done through this view window ArrayList. If changes are made directly to the source ArrayList, the view window ArrayList is invalidated and any operations on it will return an InvalidOperationException.
You're creating two views onto the original ArrayList and trying to work with both of them - but when one view modifies the underlying list, the other view is effectively invalidated.
If you change the code to create copies of the sublists - or if you work directly with the original list within specified bounds - then I believe it'll work fine.
(As noted in comments, I'd also strongly recommend that you use generic collections.)
Here's a short but complete program which demonstrates the problem you're running into:
using System;
using System.Collections;
class Program
{
static void Main()
{
ArrayList list = new ArrayList();
list.Add("a");
list.Add("b");
ArrayList view1 = list.GetRange(0, 1);
ArrayList view2 = list.GetRange(1, 1);
view1[0] = "c";
Console.WriteLine(view2[0]); // Throws an exception
}
}
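A minimal sketch of the copying fix inside MergeSort: wrap each range in a new ArrayList, so the recursive calls work on independent copies instead of views of the same list:
int middle = numbers.Count / 2;
// new ArrayList(collection) copies the elements, so L and R are
// independent of the source list and of each other.
ArrayList L = new ArrayList(numbers.GetRange(0, middle));
ArrayList R = new ArrayList(numbers.GetRange(middle, numbers.Count - middle));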
On the line R = MergeSort(R); you alter the range of numbers represented by L. This invalidates L. Sorry, I have to go, so I can't explain any further right now.