How to search an array with array? - c#

I have 2 byte arrays:
Dim A() As Byte = {1, 2, 3, 4, 5, 6, 7, 8, 9}
Dim B() As Byte = {5, 6, 7}
Now I want to find the occurrence of the full B in A. I tried Array.IndexOf(A, B) with no luck. Is there a simple way to search an array for another array without using any loops?
It should find the index (position) of 5,6,7 in the same order as in B().
If A() contains {1,2,3,4,7,6,5,9} it should return false or -1 because they are not in the same order.

The following Linq statement will give an IEnumerable<int> containing the positions of b in a (or an empty set if none occur):
Enumerable
.Range( 0, 1 + a.Length - b.Length )
.Where( i => a.Skip(i).Take(b.Length).SequenceEqual(b) );
I have no idea how to translate to VB.NET.
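The query above can be wrapped so that it returns a single index, or -1 when there is no match, which is what the question asks for. This is only a sketch: the class name ArraySearch and the method name IndexOfSequence are made up for illustration, and the length guard is an addition that the original query does not have.

```csharp
using System.Linq;

public static class ArraySearch
{
    // Returns the index of the first occurrence of needle in haystack, or -1.
    public static int IndexOfSequence(byte[] haystack, byte[] needle)
    {
        // Guard first: Enumerable.Range throws on a negative count.
        if (needle.Length > haystack.Length) return -1;

        return Enumerable
            .Range(0, 1 + haystack.Length - needle.Length)
            .Where(i => haystack.Skip(i).Take(needle.Length).SequenceEqual(needle))
            .DefaultIfEmpty(-1) // no match: yield -1 instead of an empty sequence
            .First();
    }
}
```

Calling ArraySearch.IndexOfSequence(A, B) with the arrays from the question gives 4, the zero-based position of the 5. Each candidate position re-enumerates part of the haystack, so this is convenient rather than fast.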

This might work, but it's C# and uses a loop:
private static int[] GetIndicesOf(byte[] needle, byte[] haystack)
{
int[] foundIndices = new int[needle.Length];
int found = 0;
for (int i = 0; i < haystack.Length; i++)
{
if (needle[found] == haystack[i])
{
foundIndices[found++] = i;
if (found == needle.Length)
return foundIndices;
}
else
{
i -= found; // Re-evaluate from the start of the found sentence + 1
found = 0; // Gap found, reset; another occurrence of needle[0] may appear later in the haystack
continue;
}
}
return null;
}
Tested with input:
Byte[] haystack = { 5, 6, 7, 8, 9, 0, 5, 6, 7 };
Byte[] needle = { 5, 6, 7 };
// Returns {0, 1, 2}
Byte[] haystack = { 5, 6, 0, 8, 9, 0, 5, 6, 7 };
Byte[] needle = { 5, 6, 7 };
// Returns {6, 7, 8}
Byte[] haystack = { 5, 6, 0, 7, 9, 0, 5, 6, 8 };
Byte[] needle = { 5, 6, 7 };
// Returns null
Byte[] haystack = { 1, 2, 1, 2, 2 };
Byte[] needle = { 1, 2, 2 };
// Returns {2, 3, 4}
Byte[] haystack = { 1, 2, 1, 2, 1, 2, 3 };
Byte[] needle = { 1, 2, 1, 2, 3 };
// Returns {2, 3, 4, 5, 6}
Byte[] haystack = { 1, 1, 1, 1, 2 };
Byte[] needle = { 1, 2 };
// Returns {3, 4}
But the Linq implementation of @spender looks nicer. :-P

How about creating a method that:
Concatenates the elements of the searched list into one string
Concatenates the elements of the list to search for into one string
Looks in the first string for the presence of the second string
Like so:
public bool IsSubSetOf(IList<int> list1, IList<int> list2){
// Delimit every element so that {1, 23} is not mistaken for {12, 3}.
var string1 = "," + string.Join(",", list1) + ",";
var string2 = "," + string.Join(",", list2) + ",";
return string1.Contains(string2);
}
Not tested...

An efficient way of solving this problem in general is the KMP algorithm. Quick googling suggests that a .NET implementation may be found here. Its pseudocode is available from Wikipedia.
An inefficient, but harmlessly easy to code way is presented in one of the links above as follows:
int[] T = new[]{1, 2, 3, 4, 5};
int[] P = new[]{3, 4};
for (int i = 0; i != T.Length; i++)
{
int j = 0;
for (;i+j != T.Length && j != P.Length && T[i+j]==P[j]; j++);
if (j == P.Length) return i;
}

My take would be:
public static int Search<T>(T[] space, T[] searched) {
// Visit every index where the first element matches, not only the first such index.
for (var idx = Array.IndexOf(space, searched[0]); idx >= 0; idx = Array.IndexOf(space, searched[0], idx + 1)) {
if (space.ArraySkip(idx).Take(searched.Length).SequenceEqual(searched))
return idx;
}
return -1;
}
public static class Linqy {
public static IEnumerable<T> ArraySkip<T>(this T[] array, int index) {
for (int i = index; i < array.Length; i++) {
yield return array[i];
}
}
}
As always, it depends on your data whether this is "good enough" or whether you will have to resort to more complex but more efficient algorithms. I introduced ArraySkip because the Linq Skip only assumes the IEnumerable interface and would enumerate every element up to the index.

Related

C# Span2D CopyTo not copying a 2d range correctly

I'm trying to use the Span2D type to "roll" entries in a 2d array, but it's not working as expected.
By rolling I mean the following - given an array such as:
{
{ 1, 1, 1, 1, 1 },
{ 2, 2, 2, 2, 2 },
{ 0, 0, 0, 0, 0 },
}
I would like to copy the first two rows down one row, so the top row can be repopulated. After the roll operation the array should look like this:
{
{ 1, 1, 1, 1, 1 },
{ 1, 1, 1, 1, 1 },
{ 2, 2, 2, 2, 2 },
}
The Span2D CopyTo method seems perfect for this - I create a slice of the top two rows and a slice of the bottom two rows, then copy the first slice to the second slice. But instead of the expected result above, I get:
{
{ 1, 1, 1, 1, 1 },
{ 1, 1, 1, 1, 1 },
{ 1, 1, 1, 1, 1 },
}
Here's a runnable class that shows the problem:
public class SpanTest
{
public static void Main()
{
int[,] array =
{
{ 1, 2, 3, 4, 5 },
{ 0, 0, 0, 0, 0 },
{ 0, 0, 0, 0, 0 },
};
var h = array.GetLength(0) - 1;
var w = array.GetLength(1);
Console.WriteLine($"slice height:{h} width: {w}\n-----------");
Span2D<int> span = array;
Console.WriteLine($"{span.ToStringMatrix()}-----------");
var sourceSlice = span.Slice(0, 0, h, w);
Console.WriteLine($"{sourceSlice.ToStringMatrix()}-----------");
var targetSlice = span.Slice(1, 0, h, w);
Console.WriteLine($"{targetSlice.ToStringMatrix()}-----------");
sourceSlice.CopyTo(targetSlice);
Console.WriteLine($"{span.ToStringMatrix()}-----------");
}
}
with a helper for printing the Span2Ds:
public static class Utils
{
public static string ToStringMatrix<T>(this Span2D<T> arr)
{
var sb = new StringBuilder();
for (var i = 0; i < arr.Height; i++)
{
for (var j = 0; j < arr.Width; j++)
{
sb.Append($"{arr[i, j]} ");
}
sb.Append(Environment.NewLine);
}
return sb.ToString();
}
}
How can I make the copy operation behave as expected? Thanks
Well, the answer is quite obvious actually - at least it was when it occurred to me at 5am this morning!
Span2D wraps an array, copying to itself alters the backing array during the copy process. By the time the second row is copied, it already contains the contents of the first row. And hence, the first row gets propagated throughout the 2d array.
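One way around the overlap is sketched below. It swaps Span2D for a plain Array.Copy on the backing int[,], which is documented to behave as if the source values were saved before the destination is overwritten, so overlapping ranges are safe. The names GridRoll and RollDown are made up for this example.

```csharp
using System;

public static class GridRoll
{
    // Shifts every row down by one. The last row is dropped and row 0 keeps
    // its old contents, ready to be repopulated by the caller.
    public static void RollDown(int[,] grid)
    {
        int height = grid.GetLength(0);
        int width = grid.GetLength(1);
        if (height < 2) return;

        // Array.Copy treats a multidimensional array as one long row-major
        // sequence, so "rows 0..height-2" is a single contiguous block.
        Array.Copy(grid, 0, grid, width, (height - 1) * width);
    }
}
```

With the 3x5 array from the question (rows of 1s, 2s and 0s), RollDown leaves rows of 1s, 1s and 2s. Staying with Span2D, copying one row at a time from the bottom row upwards should avoid the problem for the same reason: no row is overwritten before it has been read.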

Get the shortest path between two linked object

I have a "Contact" object which contains a list called "Linked" containing the "Contact" linked to it.
public class Contact
{
public int Id { get; }
public string FirstName { get; set; }
public string LastName { get; set; }
public int Age { get; set; }
public List<Contact> Linked { get; set; }
}
For example, Contact "A" has 3 linked contacts: B, C and D.
As the links are made in both directions each time, B, C and D all have A in their "Linked" contacts.
B can then have E as a contact, and so on. There is no limit.
I have to make an algo which takes a starting contact and an ending contact as a parameter and which finds the shortest path that links them.
The result must be in the form: A > B > F > H > X, if I have to find the path that goes from A to X. We must therefore find all the steps of the path in the result.
I've tried a lot of stuff (recursion, ...) but it's still getting stuck somewhere.
Do you have an idea?
Dijkstra's algorithm is probably what you are looking for.
http://en.wikipedia.org/wiki/Dijkstra%27s_algorithm
Dijkstra's algorithm (/ˈdaɪkstrəz/ DYKE-strəz) is an algorithm for finding the shortest paths between nodes in a graph, which may represent, for example, road networks. It was conceived by computer scientist Edsger W. Dijkstra in 1956 and published three years later.
It should be relatively straightforward to find examples in your given language. Here is one in C#, stolen from programmingalgorithms.com
private static int MinimumDistance(int[] distance, bool[] shortestPathTreeSet, int verticesCount)
{
int min = int.MaxValue;
int minIndex = 0;
for (int v = 0; v < verticesCount; ++v)
{
if (shortestPathTreeSet[v] == false && distance[v] <= min)
{
min = distance[v];
minIndex = v;
}
}
return minIndex;
}
private static void Print(int[] distance, int verticesCount)
{
Console.WriteLine("Vertex Distance from source");
for (int i = 0; i < verticesCount; ++i)
Console.WriteLine("{0}\t {1}", i, distance[i]);
}
public static void Dijkstra(int[,] graph, int source, int verticesCount)
{
int[] distance = new int[verticesCount];
bool[] shortestPathTreeSet = new bool[verticesCount];
for (int i = 0; i < verticesCount; ++i)
{
distance[i] = int.MaxValue;
shortestPathTreeSet[i] = false;
}
distance[source] = 0;
for (int count = 0; count < verticesCount - 1; ++count)
{
int u = MinimumDistance(distance, shortestPathTreeSet, verticesCount);
shortestPathTreeSet[u] = true;
for (int v = 0; v < verticesCount; ++v)
if (!shortestPathTreeSet[v] && Convert.ToBoolean(graph[u, v]) && distance[u] != int.MaxValue && distance[u] + graph[u, v] < distance[v])
distance[v] = distance[u] + graph[u, v];
}
Print(distance, verticesCount);
}
It would then be used like so:
int[,] graph = {
{ 0, 4, 0, 0, 0, 0, 0, 8, 0 },
{ 4, 0, 8, 0, 0, 0, 0, 11, 0 },
{ 0, 8, 0, 7, 0, 4, 0, 0, 2 },
{ 0, 0, 7, 0, 9, 14, 0, 0, 0 },
{ 0, 0, 0, 9, 0, 10, 0, 0, 0 },
{ 0, 0, 4, 0, 10, 0, 2, 0, 0 },
{ 0, 0, 0, 14, 0, 2, 0, 1, 6 },
{ 8, 11, 0, 0, 0, 0, 1, 0, 7 },
{ 0, 0, 2, 0, 0, 0, 6, 7, 0 }
};
Dijkstra(graph, 0, 9);
The example above consists of 9 nodes, each with its distance to every other node. A 0 means there is no connection. In your case there is no weight, so either there is a connection (1) or there isn't (0).
You have to change the algorithm to take in a list of contacts instead of a two-dimensional array. Try to think of the two-dimensional array as a list of lists - very similar to a list of contacts, where each contact has another list of contacts.
Let's, for example, create a simple contacts list and their contacts:
Peter can contact Mary
Mary can contact Peter and John
John can contact Mary
This would be represented something like this in a two dimensional array:
int[,] contacts = new int[,]
{
{ 0, 1, 0 }, //Peter: Peter no, Mary yes, John no
{ 1, 0, 1 }, //Mary: Peter yes, Mary no, John yes
{ 0, 1, 0 } //John: Peter no, Mary yes, John no
};
You would also have to modify the algorithm to keep track of the current path. That should be a relatively straightforward change.
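Because every link has the same weight, a plain breadth-first search already finds a shortest path, and recording each contact's predecessor is enough to rebuild it. The sketch below swaps Dijkstra for BFS on that assumption. The Contact class is a trimmed copy of the one in the question, and ContactPaths, ShortestPath and Format are names invented for this example.

```csharp
using System.Collections.Generic;
using System.Linq;

// Trimmed copy of the question's class, just enough for the sketch.
public class Contact
{
    public int Id { get; set; }
    public string FirstName { get; set; } = "";
    public List<Contact> Linked { get; } = new List<Contact>();
}

public static class ContactPaths
{
    // Returns the contacts from start to end (inclusive), or null if unreachable.
    public static List<Contact> ShortestPath(Contact start, Contact end)
    {
        var previous = new Dictionary<Contact, Contact>(); // contact -> predecessor
        var visited = new HashSet<Contact> { start };
        var queue = new Queue<Contact>();
        queue.Enqueue(start);

        while (queue.Count > 0)
        {
            var current = queue.Dequeue();
            if (ReferenceEquals(current, end))
            {
                // Walk the predecessor chain back to start, then reverse it.
                var path = new List<Contact> { end };
                while (!ReferenceEquals(path[path.Count - 1], start))
                    path.Add(previous[path[path.Count - 1]]);
                path.Reverse();
                return path;
            }
            foreach (var next in current.Linked)
            {
                if (visited.Add(next)) // Add returns false if already seen
                {
                    previous[next] = current;
                    queue.Enqueue(next);
                }
            }
        }
        return null; // no chain of links connects the two contacts
    }

    // Renders a path in the "A > B > F" form requested in the question.
    public static string Format(IEnumerable<Contact> path) =>
        string.Join(" > ", path.Select(c => c.FirstName));
}
```

BFS visits contacts in order of increasing distance from the start, so the first time the end contact is dequeued its predecessor chain is a shortest one. If links ever get weights, this would need to become Dijkstra with a priority queue.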
Hope that it helps!

Increment Guid in C#

I have an application that has a guid variable which needs to be unique (of course). I know that statistically any guid should just be assumed to be unique, but due to dev/test environment reasons, the same value may be seen multiple times. So when that happens, I want to "increment" the value of the Guid, rather than just creating a whole new one. There does not seem to be a simple way to do this. I found a hack, which I will post as a possible answer, but want a cleaner solution.
You can get the byte components of the guid, so you can just work on that:
static class GuidExtensions
{
private static readonly int[] _guidByteOrder =
new[] { 15, 14, 13, 12, 11, 10, 9, 8, 6, 7, 4, 5, 0, 1, 2, 3 };
public static Guid Increment(this Guid guid)
{
var bytes = guid.ToByteArray();
bool carry = true;
for (int i = 0; i < _guidByteOrder.Length && carry; i++)
{
int index = _guidByteOrder[i];
byte oldValue = bytes[index]++;
carry = oldValue > bytes[index];
}
return new Guid(bytes);
}
}
EDIT: now with correct byte order
Thanks to Thomas Levesque's byte order, here's a nifty LINQ implementation:
static int[] byteOrder = { 15, 14, 13, 12, 11, 10, 9, 8, 6, 7, 4, 5, 0, 1, 2, 3 };
static Guid NextGuid(Guid guid)
{
var bytes = guid.ToByteArray();
var canIncrement = byteOrder.Any(i => ++bytes[i] != 0);
return new Guid(canIncrement ? bytes : new byte[16]);
}
Note it wraps around to Guid.Empty if you manage to increment it that far.
It would be more efficient if you were to keep incrementing a single copy of bytes rather than calling ToByteArray on each GUID in turn.
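A minimal sketch of that idea, assuming the same byte order as above: a generator that calls ToByteArray once and then keeps incrementing the same buffer. GuidSequence and From are hypothetical names, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GuidSequence
{
    // Least significant byte first, matching the string form of a Guid.
    private static readonly int[] ByteOrder =
        { 15, 14, 13, 12, 11, 10, 9, 8, 6, 7, 4, 5, 0, 1, 2, 3 };

    // Yields start, start + 1, start + 2, ... and wraps to Guid.Empty on overflow.
    public static IEnumerable<Guid> From(Guid start)
    {
        byte[] bytes = start.ToByteArray(); // the only ToByteArray call
        while (true)
        {
            yield return new Guid(bytes); // the constructor copies the bytes

            bool carry = true;
            for (int i = 0; i < ByteOrder.Length && carry; i++)
            {
                int index = ByteOrder[i];
                bytes[index] = unchecked((byte)(bytes[index] + 1));
                carry = bytes[index] == 0; // wrapped to 0: carry into the next byte
            }
        }
    }
}
```

The sequence is infinite, so callers are expected to bound it, for example with GuidSequence.From(guid).Skip(1).First() for the next value or .Take(n) for a batch.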
Possible solution -- I think this works (not really tested), but want a better solution.
public static Guid Increment(this Guid value)
{
var bytes = value.ToByteArray();
// Note that the order of bytes in the returned byte array is different from the string representation of a Guid value.
// Guid: 00112233-4455-6677-8899-aabbccddeeff
// byte array: 33 22 11 00 55 44 77 66 88 99 AA BB CC DD EE FF
// So the byte order of the following indexes indicates the true low-to-high sequence
if (++bytes[15] == 0) if (++bytes[14] == 0) if (++bytes[13] == 0) if (++bytes[12] == 0) if (++bytes[11] == 0) if (++bytes[10] == 0) // normal order
if (++bytes[9] == 0) if (++bytes[8] == 0) // normal order
if (++bytes[6] == 0) if (++bytes[7] == 0) // stored low byte first
if (++bytes[4] == 0) if (++bytes[5] == 0) // stored low byte first
if (++bytes[0] == 0) if (++bytes[1] == 0) if (++bytes[2] == 0) { ++bytes[3]; } // stored low byte first
return new Guid(bytes);
}
Edit: here is the code I ended up using; props to the answers above for the general technique, although without the "unchecked" clause they both would throw exceptions in some cases. But I also tried to make the below as readable as possible.
private static int[] _guidByteOrder = { 15, 14, 13, 12, 11, 10, 9, 8, 6, 7, 4, 5, 0, 1, 2, 3 };
public static Guid NextGuid(this Guid guid)
{
var bytes = guid.ToByteArray();
for (int i = 0; i < 16; i++)
{
var iByte = _guidByteOrder[i];
unchecked { bytes[iByte] += 1; }
if (bytes[iByte] != 0)
return new Guid(bytes);
}
return Guid.Empty;
}
Verified Solution for Ordered Strings:
private static Guid Increment(Guid guid)
{
byte[] bytes = guid.ToByteArray();
byte[] order = { 15, 14, 13, 12, 11, 10, 9, 8, 6, 7, 4, 5, 0, 1, 2, 3 };
for (int i = 0; i < 16; i++)
{
if (bytes[order[i]] == byte.MaxValue)
{
bytes[order[i]] = 0;
}
else
{
bytes[order[i]]++;
return new Guid(bytes);
}
}
throw new OverflowException("Congratulations you are one in a billion billion billion billion etc...");
}
Verification:
private static Guid IncrementProof(Guid guid, int start, int end)
{
byte[] bytes = guid.ToByteArray();
byte[] order = { 15, 14, 13, 12, 11, 10, 9, 8, 6, 7, 4, 5, 0, 1, 2, 3 };
for (int i = start; i < end; i++)
{
if (bytes[order[i]] == byte.MaxValue)
{
bytes[order[i]] = 0;
}
else
{
bytes[order[i]]++;
return new Guid(bytes);
}
}
throw new OverflowException("Congratulations you are one in a billion billion billion billion etc...");
}
static void Main(string[] args)
{
Guid temp = new Guid();
for (int j = 0; j < 16; j++)
{
for (int i = 0; i < 255; i++)
{
Console.WriteLine(temp.ToString());
temp = IncrementProof(temp, j, j + 1);
}
}
}

C# - Fastest Way To Sort Array Of Primitives And Track Their Indices

I need a float[] to be sorted. And I need to know where the old indices are in new array. That's why I can't use Array.Sort(); or whatever. So I would like to write a function that sorts the array for me and remembers from what index it took each value:
float[] input = new float[] {1.5f, 2, 0, 0.4f, -1, 96, -56, 8, -45};
// sort
float[] output; // {-56, -45, -1, 0, 0.4, 1.5, 2, 8, 96};
int[] indices; // {6, 8, 4, 2, 3, 0, 1, 7, 5};
Size of arrays would be around 500. How should I approach this ? What sorting algorithm etc.
After solved: It always surprises me how powerful C# is. I didn't even think of it being able to do that task on its own. And since I have already heard that Array.Sort() is very fast, I'll take it.
float[] input = new float[] { 1.5F, 2, 0, 0.4F, -1, 96, -56, 8, -45 };
int[] indices = new int[input.Length];
for (int i = 0; i < indices.Length; i++) indices[i] = i;
Array.Sort(input, indices);
// input and indices are now at the desired exit state
Basically, the 2-argument version of Array.Sort applies the same operations to both arrays, running the actual sort comparisons on the first array. This is normally used the other way around - to rearrange something by the desired indices; but this works too.
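As a small sketch of the same call that leaves the caller's array untouched, the keys can be cloned first. IndexedSort and SortWithIndices are names invented here for illustration.

```csharp
using System;
using System.Linq;

public static class IndexedSort
{
    // Returns the sorted values plus, for each sorted position, the index
    // that value had in the original array. The input is not modified.
    public static (float[] Sorted, int[] Indices) SortWithIndices(float[] input)
    {
        float[] sorted = (float[])input.Clone();
        int[] indices = Enumerable.Range(0, input.Length).ToArray();
        Array.Sort(sorted, indices); // sorts by 'sorted', permutes 'indices' alongside
        return (sorted, indices);
    }
}
```

For the input in the question this yields the indices {6, 8, 4, 2, 3, 0, 1, 7, 5}. Array.Sort is not a stable sort, so equal values may come back with their original indices in either order.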
You can use the overload of Array.Sort() which takes TWO arrays, and sorts the second one according to how it sorted the first one:
float[] input = new [] { 1.5f, 2, 0, 0.4f, -1, 96, -56, 8, -45 };
int[] indices = Enumerable.Range(0, input.Length).ToArray();
Array.Sort(input, indices);
You can create a new array of indices, and then sort both of them using Array.Sort and treating input as keys:
float[] input = new float[] { 1.5F, 2, 0, 0.4F, -1, 96, -56, 8, -45 };
int[] indicies = Enumerable.Range(0, input.Length).ToArray();
Array.Sort(input, indicies);
if you use linq:
float[] input = new float[] { 1.5F, 2, 0, 0.4F, -1, 96, -56, 8, -45 };
// Use the Select overload that passes the index; IndexOf is O(n) per element and wrong for duplicate values.
var result = input.Select((x, i) => new { Value = x, Index = i }).OrderBy(x => x.Value).ToList();
// sort
float[] output = result.Select(x => x.Value).ToArray();
int[] indices = result.Select(x => x.Index).ToArray();
result then holds objects with the values and their original indexes.
A List<KeyValuePair<int,float>> and a custom sorter would also work. the key for each pair holds the original index.
private void Form1_Load(object sender, EventArgs e)
{
List<KeyValuePair<int,float>> data = new List<KeyValuePair<int,float>>
{
new KeyValuePair<int,float>(0,1.5f),
new KeyValuePair<int,float>(1,2),
new KeyValuePair<int,float>(2,0),
new KeyValuePair<int,float>(3,0.4f),
new KeyValuePair<int,float>(4,-1),
new KeyValuePair<int,float>(5,96),
new KeyValuePair<int,float>(6,-56),
new KeyValuePair<int,float>(7,8),
new KeyValuePair<int,float>(8,-45)
};
data.Sort(SortByValue);
foreach (KeyValuePair<int, float> kv in data)
{
listBox1.Items.Add(kv.Key.ToString() + " - " + kv.Value.ToString());
}
}
private int SortByValue(KeyValuePair<int, float> a, KeyValuePair<int, float> b)
{
return a.Value.CompareTo(b.Value);
}

Log of a very large number

I'm dealing with the BigInteger class with numbers in the order of 2 raised to the power 10,000,000.
The BigInteger Log function is now the most expensive function in my algorithm and I am desperately looking for an alternative.
Since I only need the integral part of the log, I came across this answer which seems brilliant in terms of speed but for some reason I am not getting accurate values. I do not care about the decimal part but I do need to get an accurate integral part whether the value is floored or ceiled as long as I know which.
Here is the function I implemented:
public static double LogBase2 (System.Numerics.BigInteger number)
{
return (LogBase2(number.ToByteArray()));
}
public static double LogBase2 (byte [] bytes)
{
// Corrected based on [ronalchn's] answer.
return (System.Math.Log(bytes [bytes.Length - 1], 2) + ((bytes.Length - 1) * 8));
}
The values are now incredibly accurate except for corner cases. The values 7 to 7.99999, 15 to 15.9999, 23 to 23.9999 31 to 31.9999, etc. return -Infinity. The numbers seem to revolve around byte boundaries. Any idea what's going on here?
Example:
LogBase2( 1081210289) = 30.009999999993600 != 30.000000000000000
LogBase2( 1088730701) = 30.019999999613300 != 30.000000000000000
LogBase2( 2132649894) = 30.989999999389400 != 30.988684686772200
LogBase2( 2147483648) = 31.000000000000000 != -Infinity
LogBase2( 2162420578) = 31.009999999993600 != -Infinity
LogBase2( 4235837212) = 31.979999999984800 != -Infinity
LogBase2( 4265299789) = 31.989999999727700 != -Infinity
LogBase2( 4294967296) = 32.000000000000000 != 32.000000000000000
LogBase2( 4324841156) = 32.009999999993600 != 32.000000000000000
LogBase2( 545958373094) = 38.989999999997200 != 38.988684686772200
LogBase2( 549755813887) = 38.999999999997400 != 38.988684686772200
LogBase2( 553579667970) = 39.009999999998800 != -Infinity
LogBase2( 557430119061) = 39.019999999998900 != -Infinity
LogBase2( 561307352157) = 39.029999999998300 != -Infinity
LogBase2( 565211553542) = 39.039999999997900 != -Infinity
LogBase2( 569142910795) = 39.049999999997200 != -Infinity
LogBase2( 1084374326282) = 39.979999999998100 != -Infinity
LogBase2( 1091916746189) = 39.989999999998500 != -Infinity
LogBase2( 1099511627775) = 39.999999999998700 != -Infinity
Try this:
public static int LogBase2(byte[] bytes)
{
if (bytes[bytes.Length - 1] >= 128) return -1; // -ve bigint (invalid - cannot take log of -ve number)
int log = 0;
while ((bytes[bytes.Length - 1]>>log)>0) log++;
return log + bytes.Length*8-9;
}
The reason for the most significant byte being 0 is because the BigInteger is a signed integer. When the most significant bit of the high-order byte is 1, an extra byte is tacked on to represent the sign bit of 0 for positive integers.
Also changed from using the System.Math.Log function because if you only want the rounded value, it is much faster to use bit operations.
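The sign-byte handling can be checked with a slightly more defensive variant, sketched below. It is the same bit-shifting idea, but it returns a long, rejects non-positive input with an exception instead of returning -1, and uses made-up names (BigIntegerLog, FloorLog2).

```csharp
using System;
using System.Numerics;

public static class BigIntegerLog
{
    // floor(log2(value)) for value > 0, using only the top byte and the byte count.
    public static long FloorLog2(BigInteger value)
    {
        if (value.Sign <= 0)
            throw new ArgumentOutOfRangeException(nameof(value), "value must be positive");

        byte[] bytes = value.ToByteArray(); // little-endian, two's complement
        int top = bytes[bytes.Length - 1];  // 0 when it is only a sign-padding byte

        int bits = 0;                       // number of significant bits in the top byte
        while ((top >> bits) > 0) bits++;

        // A padding byte contributes bits == 0, so the result falls back to
        // the highest bit of the byte below it: (Length - 1) * 8 - 1.
        return (long)(bytes.Length - 1) * 8 + bits - 1;
    }
}
```

For 2147483648 (2^31) the byte array is {0, 0, 0, 0x80, 0}: the top byte is the padding zero, so the result is 4 * 8 + 0 - 1 = 31, where the Math.Log version from the question produced -Infinity.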
If you have Microsoft Solver Foundation (download at http://msdn.microsoft.com/en-us/devlabs/hh145003.aspx), then you can use the BitCount() function:
public static double LogBase2(Microsoft.SolverFoundation.Common.BigInteger number)
{
return number.BitCount;
}
Or you can use the java library. Add a reference to the vjslib library (found in the .NET tab - this is the J# implementation of the java library).
You can now add "using java.math" in your code.
java.math.BigInteger has a bitLength() function
BigInteger bi = new BigInteger(128);
int log = bi.Log2();
public static class BigIntegerExtensions
{
static int[] PreCalc = new int[] { 8, 7, 6, 6, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1};
public static int Log2(this BigInteger bi)
{
byte[] buf = bi.ToByteArray();
int len = buf.Length;
return len * 8 - PreCalc[buf[len - 1]] - 1;
}
}
Years late but maybe this will help someone else...
.NET 5 added BigInteger.GetBitLength(), which is basically log2 (but one too high for positive values, hence the - 1 below). Since it is built into .NET, I think this is about as fast as we can get.
// Create some example number
BigInteger myNum= (BigInteger)32;
// Get the Log2
BigInteger myLog2 = myNum.GetBitLength() - 1;
https://dotnetfiddle.net/7ggy4D
