Delete row in 2D array if a column contains duplicate value - c#

I am working on a problem involving 2D arrays. In Excel, you can remove rows when a single column contains duplicate values. How can I accomplish the same thing in C#?
Please see the code below for reference. It removes a row only if the whole row is a duplicate of another row. I need the same thing, except that a row should be removed whenever its value in a single column is a duplicate.
Link:
Delete duplicate rows from two dimensional array
public static class MyExtensions
{
public static IEnumerable<List<T>> ToEnumerableOfEnumerable<T>(this T[,] array)
{
int rowCount = array.GetLength(0);
int columnCount = array.GetLength(1);
for (int rowIndex = 0; rowIndex < rowCount; rowIndex++)
{
var row = new List<T>();
for (int columnIndex = 0; columnIndex < columnCount; columnIndex++)
{
row.Add(array[rowIndex, columnIndex]);
}
yield return row;
}
}
public static T[,] ToTwoDimensionalArray<T>(this List<List<T>> tuples)
{
var list = tuples.ToList();
T[,] array = null;
for (int rowIndex = 0; rowIndex < list.Count; rowIndex++)
{
var row = list[rowIndex];
if (array == null)
{
array = new T[list.Count, row.Count];
}
for (int columnIndex = 0; columnIndex < row.Count; columnIndex++)
{
array[rowIndex, columnIndex] = row[columnIndex];
}
}
return array;
}
}
public class ListEqualityComparer<T> : IEqualityComparer<List<T>>
{
public bool Equals(List<T> x, List<T> y)
{
return x.SequenceEqual(y);
}
public int GetHashCode(List<T> obj)
{
int hash = 19;
foreach (var o in obj)
{
hash = hash * 31 + o.GetHashCode();
}
return hash;
}
}
Usage:
[TestClass]
public class UnitTest1
{
[TestMethod]
public void TestMethod1()
{
var array = new[,] { { 1, 2 }, { 3, 4 }, { 1, 2 }, { 7, 8 } };
array = array.ToEnumerableOfEnumerable()
.Distinct(new ListEqualityComparer<int>())
.ToList()
.ToTwoDimensionalArray();
}
}
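One possible way to get the single-column behaviour is sketched below. It is only a sketch under stated assumptions: the helper name RemoveRowsWithDuplicateColumn and the class name ColumnDistinctSketch are made up for illustration, the first row carrying a given value is kept, and values are compared with the default equality comparer.

```csharp
using System;
using System.Collections.Generic;

public static class ColumnDistinctSketch
{
    // Hypothetical helper (not from the linked answer): keeps the first row
    // seen for each distinct value in the chosen column and drops later rows.
    public static T[,] RemoveRowsWithDuplicateColumn<T>(T[,] source, int column)
    {
        int rowCount = source.GetLength(0);
        int columnCount = source.GetLength(1);

        var seen = new HashSet<T>();      // values already met in the key column
        var keptRows = new List<int>();   // indices of the rows that survive

        for (int row = 0; row < rowCount; row++)
        {
            // HashSet<T>.Add returns false when the value is already present.
            if (seen.Add(source[row, column]))
            {
                keptRows.Add(row);
            }
        }

        // Copy the surviving rows into a new, smaller 2D array.
        var result = new T[keptRows.Count, columnCount];
        for (int i = 0; i < keptRows.Count; i++)
        {
            for (int col = 0; col < columnCount; col++)
            {
                result[i, col] = source[keptRows[i], col];
            }
        }
        return result;
    }
}
```

For example, calling `ColumnDistinctSketch.RemoveRowsWithDuplicateColumn(new[,] { { 1, 2 }, { 3, 4 }, { 1, 9 }, { 7, 8 } }, 0)` would drop the third row, because its first column repeats the value 1, even though the row as a whole is not a duplicate.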

Related

How to print all the elements that are stored in class Vector?

public class Vector<T>
{
private const int DEFAULT_CAPACITY = 10;
private T[] data;
public int Count { get; private set; } = 0;
public int Capacity
{
get { return data.Length; }
}
public Vector(int capacity)
{
data = new T[capacity];
}
public Vector() : this(DEFAULT_CAPACITY) { }
public T this[int index]
{
get
{
if (index >= Count || index < 0) throw new IndexOutOfRangeException();
return data[index];
}
set
{
if (index >= Count || index < 0) throw new IndexOutOfRangeException();
data[index] = value;
}
}
private void ExtendData(int extraCapacity)
{
T[] newData = new T[Capacity + extraCapacity];
for (int i = 0; i < Count; i++) newData[i] = data[i];
data = newData;
}
public void Add(T element)
{
if (Count == Capacity) ExtendData(DEFAULT_CAPACITY);
data[Count++] = element;
}
public int IndexOf(T element)
{
for (var i = 0; i < Count; i++)
{
if (data[i].Equals(element)) return i;
}
return -1;
}
public void Insert(int index, T element)
{
if (Count == Capacity)
{
ExtendData(DEFAULT_CAPACITY);
}
if (index >= Count || index < 0)
throw new IndexOutOfRangeException();
int tmp = Count;
while (tmp != index)
{
//shuffle
data[tmp] = data[tmp - 1];
tmp--;
}
data[tmp] = element;
Count++;
}
}
I am working with arrays and vectors, and I am wondering how to write a function that displays all the elements in a vector. I am confused by the scope of the data field: I can write such a function for a simple array, but how do you do it when the array is wrapped in a whole class?
How do I create a function that prints all the elements stored in this vector?
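A minimal sketch of one approach follows, assuming the method is added inside the class so that it can reach the private array. VectorSketch is a cut-down, hypothetical stand-in for the Vector class above, and overriding ToString is just one option; a Print method with the same loop would work equally well.

```csharp
using System;
using System.Text;

// Hypothetical, reduced version of the Vector<T> class from the question.
public class VectorSketch<T>
{
    private T[] data = new T[10];
    public int Count { get; private set; }

    public void Add(T element)
    {
        // Grow by ten slots when the backing array is full.
        if (Count == data.Length) Array.Resize(ref data, data.Length + 10);
        data[Count++] = element;
    }

    // Members of the class can read the private array directly.
    // Only the first Count slots hold real elements; the rest is spare capacity.
    public override string ToString()
    {
        var sb = new StringBuilder("[");
        for (int i = 0; i < Count; i++)
        {
            if (i > 0) sb.Append(", ");
            sb.Append(data[i]);
        }
        return sb.Append(']').ToString();
    }

    // Writes the formatted elements to the console.
    public void Print()
    {
        Console.WriteLine(ToString());
    }
}
```

After adding 1, 2 and 3 to a `VectorSketch<int>`, `ToString()` returns `[1, 2, 3]`. Code outside the class can do the same with the public indexer and Count, looping from 0 to Count - 1.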

C# all common sequences in lists of strings

I am trying to find the longest common sequence of strings within the provided arrays.
I have 25,000 lists of sequences, with a total of 450,000 words, and I need the results ordered by length, then by count.
List<string> listA = new List<string>() {"Step1", "Step3", "Process", "System", "Process"};
List<string> listB = new List<string>() {"Process", "System", "Process"};
List<string> listC = new List<string>() {"Terminal", "Step1", "Step3"};
...
The desired output that prints all possible sequences and their length and count is:
Sequence Length Count
Step1->Step3->Process->System->Process 5 1
Step1->Step3->Process->System 4 1
Step3->Process->System->Process 4 1
Process->System->Process 3 2
Step1->Step3->Process 3 1
Step3->Process->System 3 1
Terminal->Step1->Step3 3 1
Step1->Step3 2 2
Process->System 2 2
System->Process 2 2
Step3->Process 2 1
Terminal->Step1 2 1
Process 1 4
Step1 1 2
Step3 1 2
System 1 2
Terminal 1 1
I could only find implementations that work on substrings of characters, not ones that work on whole words and accept multiple lists as input.
OK, so you can implement GetHashCode and Equals in a custom comparer so that whole strings are treated like the chars in a string. It is also reasonable to create a list segment type, so that you do not flood the runtime with many intermediate collections.
public class ListSegment<T>
{
private sealed class ListSegmentEqualityComparer : IEqualityComparer<ListSegment<T>>
{
public bool Equals(ListSegment<T> x, ListSegment<T> y)
{
if (x.Length != y.Length)
{
return false;
}
return x.Lst.Skip(x.Offset).Take(x.Length)
.SequenceEqual(y.Lst.Skip(y.Offset).Take(y.Length));
}
public int GetHashCode(ListSegment<T> obj)
{
unchecked
{
int hash = 17;
for (int i = obj.Offset; i < obj.Offset + obj.Length; i++)
{
hash = hash * 31 + obj.Lst[i].GetHashCode();
}
return hash;
}
}
}
public static IEqualityComparer<ListSegment<T>> Default { get; } = new ListSegmentEqualityComparer();
public List<T> Lst { get; set; }
public int Offset { get; set; }
public int Length { get; set; }
public IEnumerable<T> GetEnumerable()
{
return Lst.Skip(Offset).Take(Length);
}
public override string ToString()
{
return string.Join("->", GetEnumerable());
}
}
Then you run through the list of lists, counting the number of occurrences of each segment:
public List<KeyValuePair<ListSegment<string>, int>> GetOrderedPairs(List<List<string>> data)
{
var segmentsDictionary = new Dictionary<ListSegment<string>, int>(ListSegment<string>.Default);
foreach (var list in data)
{
for (int i = 0; i < list.Count; i++)
for (int j = i + 1; j <= list.Count; j++)
{
var segment = new ListSegment<string>
{
Lst = list,
Length = j-i,
Offset = i,
};
if (segmentsDictionary.TryGetValue(segment, out var val))
{
segmentsDictionary[segment] = val + 1;
}
else
{
segmentsDictionary[segment] = 1;
}
}
}
return segmentsDictionary.OrderByDescending(pair => pair.Key.Length).ThenByDescending(pair => pair.Value).ToList();
}
To test it, run the following:
List<string> listA = new List<string>() { "Step1", "Step3", "Process", "System", "Process" };
List<string> listB = new List<string>() { "Process", "System", "Process" };
List<string> listC = new List<string>() { "Terminal", "Step1", "Step3" };
var pairs = GetOrderedPairs(new List<List<string>>()
{
listA, listB, listC
});
foreach (var keyValuePair in pairs)
{
Console.WriteLine(keyValuePair.Key + " " + keyValuePair.Key.Length + " " + keyValuePair.Value);
}
Using some extension methods, you can create an IEqualityComparer that compares IEnumerable sequences. With it, you can use LINQ Distinct to compare by sequence:
public static class IEnumerableExt {
public static IEnumerable<IEnumerable<T>> DistinctIE<T>(this IEnumerable<IEnumerable<T>> src) => src.Distinct(Make.IESequenceEqualityComparer<T>());
// IEnumerable<string>
public static string Join(this IEnumerable<string> src, string sep) => String.Join(sep, src);
}
public static class Make {
public static IEqualityComparer<IEnumerable<T>> IESequenceEqualityComparer<T>() => new IEnumerableSequenceEqualityComparer<T>();
public static IEqualityComparer<IEnumerable<T>> IESequenceEqualityComparer<T>(T _) => new IEnumerableSequenceEqualityComparer<T>();
public class IEnumerableSequenceEqualityComparer<T> : IEqualityComparer<IEnumerable<T>> {
public bool Equals(IEnumerable<T> x, IEnumerable<T> y) =>
Object.ReferenceEquals(x, y) || (x != null && y != null && (x.SequenceEqual(y)));
public int GetHashCode(IEnumerable<T> src) {
var hc = new HashCode();
foreach (var v in src)
hc.Add(v);
return hc.ToHashCode();
}
}
}
With these tools, you can create an extension method to generate all the subsequences of a List and all the distinct subsequences:
public static class ListExt {
public static IEnumerable<IEnumerable<T>> Subsequences<T>(this List<T> src) {
IEnumerable<T> Helper(int start, int end) {
for (int j3 = start; j3 <= end; ++j3)
yield return src[j3];
}
for (int j1 = 0; j1 < src.Count; ++j1) {
for (int j2 = j1; j2 < src.Count; ++j2)
yield return Helper(j1, j2);
}
}
public static IEnumerable<IEnumerable<T>> DistinctSubsequences<T>(this List<T> src) => src.Subsequences().DistinctIE();
}
Now you can compute the answer.
First, compute all the subsequences and combine them:
var ssA = listA.DistinctSubsequences();
var ssB = listB.DistinctSubsequences();
var ssC = listC.DistinctSubsequences();
var ssAll = ssA.Concat(ssB).Concat(ssC).DistinctIE();
Then, create some helpers for counting occurrences:
var hA = ssA.ToHashSet(Make.IESequenceEqualityComparer<string>());
var hB = ssB.ToHashSet(Make.IESequenceEqualityComparer<string>());
var hC = ssC.ToHashSet(Make.IESequenceEqualityComparer<string>());
Func<IEnumerable<string>, HashSet<IEnumerable<string>>, int> testIn = (s, h) => h.Contains(s) ? 1 : 0;
Func<IEnumerable<string>,int> countIn = s => testIn(s,hA)+testIn(s,hB)+testIn(s,hC);
Finally, compute the answer:
var ans = ssAll.Select(ss => new { Sequence = ss.Join("->"), Length = ss.Count(), Count = countIn(ss) }).OrderByDescending(sc => sc.Length).ThenByDescending(sc => sc.Count);

How To get Positional Occurrence of strings in List using Linq?

var a = new List<string>() {"aa","aa","bb","bb","bb","cc","aa","aa","cc","cc" };
I want to count the consecutive occurrences of each string, together with the position where each run starts. For the list above, I want output like:
Index String Count
0 aa 2
2 bb 3
5 cc 1
6 aa 2
8 cc 2
But when I tried GroupBy, it gave the total count per string rather than the count of each consecutive run.
How can I get this?
Edit: I expect more than 10 million entries in the list.
You can do the following.
Solution using LINQ
var a = new List<string>() {"aa","aa","bb","bb","bb","cc","aa","aa","cc","cc" };
var result = Enumerable.Range(0, a.Count())
.Where(x => x == 0 || a[x - 1] != a[x])
.Select(x => new Stat<string>
{
Index = x, // start position of the run, as in the expected output
StringValue = a[x],
Count = a.Skip(x).TakeWhile(c => c == a[x]).Count()
});
Where Stat is defined as
public class Stat<T>
{
public int Index{get;set;}
public T StringValue {get;set;}
public int Count {get;set;}
}
Update
Using Iterator
public IEnumerable<Stat<T>> CountOccurance<T>(IEnumerable<T> source)
{
var lastItem = source.First();
var count = 1;
var index= 0;
foreach(var item in source.Skip(1))
{
if(item.Equals(lastItem))
{
count++;
}
else
{
yield return new Stat<T>
{
Index = index,
StringValue = lastItem,
Count = count
};
index += count; // move to the start position of the next run
count=1;
lastItem = item;
}
}
yield return new Stat<T>
{
Index = index,
StringValue = lastItem,
Count = count
};
}
You can then fetch the result as
var result = CountOccurance(a);
If you will use this logic multiple times you can consider writing your own extension method.
public static class ExtensionMethods
{
public static IEnumerable<SpecialGroup<TKey>> GroupAccordingToSuccessiveItems<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
{
int index = 0;
int count = 0;
TKey latestKey = default(TKey);
foreach (var item in source)
{
TKey key = keySelector(item);
if (index != 0 && !object.Equals(key, latestKey))
{
yield return new SpecialGroup<TKey>
{
Index = index - count,
Obj = latestKey,
Count = count
};
count = 0;
}
latestKey = key;
count++;
index++;
}
yield return new SpecialGroup<TKey>
{
Index = index - count,
Obj = latestKey,
Count = count
};
}
}
public class SpecialGroup<T>
{
public int Index { get; set; }
public T Obj { get; set; }
public int Count { get; set; }
}
Then you can call it,
var result =
a.
GroupAccordingToSuccessiveItems(i => i);
This iterates over the whole list only once.

Multi-Dim Array to 1D - getting it to work with protobuf

Protobuf doesn't support multi-dimensional arrays, so I decided to use this implementation to make a 1D array out of a 2D one.
I get a "Cannot cast from source type to destination type" error in the ToProtoArray method when I call the MultiLoop function. Any ideas on how to fix this?
public static ProtoArray<T> ToProtoArray<T>(this System.Array array)
{
// Copy dimensions (to be used for reconstruction).
var dims = new int[array.Rank];
for (int i = 0; i < array.Rank; i++) dims[i] = array.GetLength(i);
// Copy the underlying data.
var data = new T[array.Length];
var k = 0;
array.MultiLoop(indices => data[k++] = (T)array.GetValue(indices));
// ^^^^^^^^^^ cannot cast from source type to destination type
return new ProtoArray<T> { Dimensions = dims, Data = data };
}
public static System.Array ToArray<T>(this ProtoArray<T> protoArray)
{
// Initialize array dynamically.
var result = System.Array.CreateInstance(typeof(T), protoArray.Dimensions);
// Copy the underlying data.
var k = 0;
result.MultiLoop(indices => result.SetValue(protoArray.Data[k++], indices));
return result;
}
public static void MultiLoop(this System.Array array, System.Action<int[]> action)
{
array.RecursiveLoop(0, new int[array.Rank], action);
}
private static void RecursiveLoop(this System.Array array, int level, int[] indices, System.Action<int[]> action)
{
if (level == array.Rank)
{
action(indices);
}
else
{
for (indices[level] = 0; indices[level] < array.GetLength(level); indices[level]++)
{
RecursiveLoop(array, level + 1, indices, action);
}
}
}
[ProtoContract]
public class ProtoArray<T>
{
[ProtoMember(1)]
public int[] Dimensions { get; set; }
[ProtoMember(2)]
public T[] Data { get; set; }
}
Here's how I use this to serialize a 2D array:
[ProtoContract]
public class Tile
{
[ProtoMember(1)]
public int x;
[ProtoMember(2)]
public int y;
// ...
}
Tile[,] map; // meanwhile I assign the data to the array
map1d = Extensions.ToProtoArray<Tile[,]>(map);
using (var file = File.Create(path))
{
Serializer.Serialize(file, map1d);
}
I think this is what you need:
public class ProtoArray<T>
{
public ProtoArray(T[] array)
{
this.Data=array;
this.Dimensions=new[] { array.Length }; // one dimension whose size is the array length
}
public ProtoArray(T[,] array)
{
int n = array.GetLength(0);
int m = array.GetLength(1);
this.Data=new T[n*m];
for(int i = 0; i<n; i++)
{
for(int j = 0; j<m; j++)
{
// Row Major
Data[i*m+j]=array[i, j];
// For Column Major use Data[i+j*n]=array[i, j];
}
}
this.Dimensions=new[] { n, m };
}
public int[] Dimensions { get; set; }
public T[] Data { get; set; }
public T[] ToArray()
{
if(Dimensions.Length==1)
{
return Data.Clone() as T[];
}
else
{
throw new NotSupportedException();
}
}
public T[,] ToArray2()
{
if(Dimensions.Length==2)
{
int n = Dimensions[0], m = Dimensions[1];
T[,] array = new T[n, m];
for(int i = 0; i<n; i++)
{
for(int j = 0; j<m; j++)
{
array[i, j]=Data[i*m+j];
}
}
return array;
}
else
{
throw new NotSupportedException();
}
}
}
public class Tile
{
public int x;
public int y;
// ...
}
class Program
{
static void Main(string[] args)
{
Tile[,] map = new Tile[16, 4];
ProtoArray<Tile> array = new ProtoArray<Tile>(map);
//serialize array
//
// de-serialize array
Tile[,] serialized_map = array.ToArray2();
}
}
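As a side note on the cast error in the question: one likely cause, judging only from the code shown, is the call ToProtoArray<Tile[,]>(map). With T set to the array type Tile[,], the expression (T)array.GetValue(indices) tries to cast a single Tile to Tile[,], which fails at run time. The sketch below reproduces the idea without any protobuf dependency; Flatten and FlattenSketch are hypothetical names used only for illustration.

```csharp
using System;

public static class FlattenSketch
{
    // T must be the element type (for example int or Tile),
    // not the array type (int[,] or Tile[,]).
    public static T[] Flatten<T>(Array array)
    {
        var data = new T[array.Length];
        int k = 0;
        // foreach over a multi-dimensional array visits elements in row-major order.
        foreach (object item in array)
        {
            data[k++] = (T)item; // throws InvalidCastException when T is the array type
        }
        return data;
    }
}
```

With `int[,] grid = { { 1, 2 }, { 3, 4 } };`, the call `FlattenSketch.Flatten<int>(grid)` yields 1, 2, 3, 4, while `FlattenSketch.Flatten<int[,]>(grid)` throws an InvalidCastException. By the same reasoning, calling the question's method as ToProtoArray<Tile>(map) may be enough to remove the error.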

Insert, display, min and max in array list in C#

I'm new to C#. I have this program and I need to add functions for insert, min, max, and display.
namespace unSortedArrayAssignment
{
class unSortedArray
{
public int size;
public int[] array;
//Constructor for an empty unsorted array
public unSortedArray(int MAX_SIZE)
{
array = new int[MAX_SIZE]; //Create a C# array of size MAX_SIZE
size = 0; // Set size of unSortedArray to 0
}
//Append assuming array is not full
public void Append(int value)
{
array[size] = value;
size++;
}
//Remove the last item
public void Remove()
{
if (size != 0)
size--;
}
//Search for an item
public int Search(int value)
{
for (int counter = 0; counter < size; counter++)
{
if (array[counter] == value)
return counter;
}
return -1;
}
//Delete an item
public void Delete(int value)
{
int index = Search(value);
if (index != -1) // Search returns -1 when the value is not found
{
for (int counter = index; counter < size - 1; counter++)
array[counter] = array[counter + 1];
size--;
}
}
}
}
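class unSortedArray
{
public int size;
public int[] array;
public unSortedArray(int maxSize)
{
array = new int[maxSize];
size = 0;
}
// Min and Max need "using System.Linq;" and look only at the first `size` items
public int Min()
{
return array.Take(size).Min();
}
public int Max()
{
return array.Take(size).Max();
}
// Adds the value at the end, growing the array by one slot when it is full
public void Insert(int value)
{
if (size == array.Length)
Array.Resize(ref array, array.Length + 1);
array[size] = value;
size++;
}
}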
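The answer above does not cover display, or inserting at a given position. A minimal sketch of both follows, assuming a fixed maximum size as in the question; UnsortedArraySketch and its member names are hypothetical, and throwing when the array is full is only one possible policy.

```csharp
using System;
using System.Text;

public class UnsortedArraySketch
{
    private readonly int[] items;
    public int Size { get; private set; }

    public UnsortedArraySketch(int maxSize)
    {
        items = new int[maxSize];
    }

    // Inserts value at index, shifting the later items one slot to the right.
    public void Insert(int index, int value)
    {
        if (Size == items.Length)
            throw new InvalidOperationException("Array is full.");
        if (index < 0 || index > Size)
            throw new ArgumentOutOfRangeException(nameof(index));

        for (int i = Size; i > index; i--)
        {
            items[i] = items[i - 1];
        }
        items[index] = value;
        Size++;
    }

    // Formats only the first Size items, separated by single spaces.
    public override string ToString()
    {
        var sb = new StringBuilder();
        for (int i = 0; i < Size; i++)
        {
            if (i > 0) sb.Append(' ');
            sb.Append(items[i]);
        }
        return sb.ToString();
    }

    // Writes the stored items to the console.
    public void Display()
    {
        Console.WriteLine(ToString());
    }
}
```

Inserting 2 at index 0, then 3 at index 0, then 1 at index 1 leaves the items as 3 1 2, which is what Display would print.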
