I am given a string of characters, in which every consequent pair of characters comprises an edge. What I mean by that is this is the string: ABBCAD. Edges of the string are:
A->B
B->C
A->D
Shortest path distance is A->D
The task at hand is to build up a Directed Acyclic Graph in memory from the string using the above rule and find the shortest path staring at the root node(in the example given it's A label) ending at terminal node.
NJKUUGHBNNJHYAPOYJHNRMNIKAIILFGJSNAICZQRNM
I gather one of the approaches that suites the task is to use the Depth First Search algo.
This is not homework...
This is a job for Djikstra's Algorithm. Once you build a representation of your graph it should be easy enough to produce the lowest cost traversal ... since in your case it seems that all paths have an equal cost (1).
You can look here on CodeProject for a reasonably decent implementation of Djikstra in C#.
could you present me with a pseudo code of your version of the graph representation for this case?
There are multiple ways to represent a graph in such a problem. If the number of vertices in your graph are likely to be small - it would be sufficient to use an adjacency matrix for the representation of edges. In which case, you could just use a .NET multidimensional array. The trick here is that you need to convert vertices labelled with characters to ordinal values. One approach would be to write a wrapper class that maps character codes to indices into the adjacency matrix:
class AdjacencyMatrix
{
// representation of the adjacency matrix (AM)
private readonly int[,] m_Matrix;
// mapping of character codes to indices in the AM
private readonly Dictionary<char,int> m_Mapping;
public AdjacencyMatrix( string edgeVector )
{
// using LINQ we can identify and order the distinct characters
char[] distinctChars = edgeVector.Distinct().OrderBy(x => x);
m_Mapping = distinctChars.Select((c,i)=>new { Vertex = c, Index = i })
.ToDictionary(x=>x.Vertex, x=>x.Index);
// build the adjacency cost matrix here...
// TODO: I leave this up to the reader to complete ... :-D
}
// retrieves an entry from within the adjacency matrix given two characters
public int this[char v1, char v2]
{
get { return m_Matrix[m_Mapping[v1], m_Mapping[v2]];
private set { m_Matrix[m_Mapping[v1], m_Mapping[v2]] = value; }
}
}
For this particular case, Dijkstra's could be greatly simplified. I imagine something like this would do the trick.
class Node {
public object Value;
// Connected nodes (directed)
public Node[] Connections;
// Path back to the start
public Node Route;
}
Node BreadthFirstRoute(Node[] theStarts, Node[] theEnds) {
Set<Node> visited = new Set<Node>();
Set<Node> frontier = new Set<Node>();
frontier.AddRange(theStarts);
Set<Node> ends = new Set<Node>();
ends.AddRange(theEnds);
// Visit nodes one frontier at a time - Breadth first.
while (frontier.Count > 0) {
// Move frontier into visiting, reset frontier.
Set<Node> visiting = frontier;
frontier = new Set<Node>();
// Prevent adding nodes being visited to frontier
visited.AddRange(visiting);
// Add all connected nodes to frontier.
foreach (Node node in visiting) {
foreach (Node other in node.Connections) {
if (!visited.Contains(other)) {
other.Route = other;
if (ends.Contains(other)) {
return other;
}
frontier.Add(other);
}
}
}
}
return null;
}
Just for the record. This is your graph example:
Where the shortest path from A to M is marked in blue.
It is a very short program in Mathematica:
a = StringSplit["NJKUUGHBNNJHYAPOYJHNRMNIKAIILFGJSNAICZQRNM", ""]
b = Table[a[[i]] -> a[[i + 1]], {i, Length#a - 1}]
vertexNumber[g_, vName_] := Position[ Vertices[g, All], vName][[1, 1]];
Needs["Combinatorica`"]
c = ToCombinatoricaGraph[b]
sp = ShortestPath[c, vertexNumber[c, "A"], vertexNumber[c, "M"]]
vlsp = GetVertexLabels[c, sp]
vlspt = Table[{vlsp[[i]], vlsp[[i + 1]]}, {i, Length#vlsp - 1}]
GraphPlot[b, VertexLabeling -> True, ImageSize -> 250,
DirectedEdges -> True, Method -> {"SpringEmbedding"},
EdgeRenderingFunction ->
(If[Cases[vlspt, {First[#2], Last[#2]}] == {},
{Red, Arrowheads[Medium], Arrow[#1, .1]},
{Blue, Arrowheads[Medium], Arrow[#1, .1]}] &)]
LBushkin's answer is correct. Eric Lippert has a series about implementing the A* algorithm in C#. A* is a more general case of Dijkstra's algorithm : if your cost estimation function always returns 0, it is exactly equivalent to Dijkstra.
Another option would be to use a graph library that has various shortest path algorithms implemented. One I have used in the past and found to be good is QuickGraph. The documentation is quite good. For example, here is a page in the documentation that explains how to use Dijkstra's algorithm.
Because your graph is acyclic the Viterbi algorithm can be used, visit the states in topological order and update the costs in preceding vertices (states).
The below code implements the search algorithm and data structure. I wasn't 100% sure of the definition of the valid based on the original question. But it should be straight forward to modify the code to construct other topologies and solve various dynamic programming task using a DAG.
Changing the outer for loop when computing the state potentials to while loop with a queue will allow for different shortestpath algorithms to used easily changed by changing the queue discipline. For example a binary heap based queue will give Dijkstra algorithm or a FIFO queue will Bellman-Ford algorithm.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace DAGShortestPath
{
class Arc
{
public Arc(int nextstate, float cost)
{
Nextstate = nextstate;
Cost = cost;
}
public int Nextstate { get; set; }
public float Cost { get; set; }
};
class State
{
public bool Final { get; set; }
public List<Arc> Arcs { get; set; }
public void AddArc(int nextstate, float cost)
{
if (Arcs == null)
Arcs = new List<Arc>();
Arcs.Add(new Arc(nextstate, cost));
}
}
class Graph
{
List< State > _states = new List< State >();
int _start = -1;
public void AddArc(int state, int nextstate, float cost)
{
while (_states.Count <= state)
AddState();
while (_states.Count <= nextstate)
AddState();
_states[state].AddArc(nextstate, cost);
}
public List<Arc> Arcs(int state)
{
return _states[state].Arcs;
}
public int AddState()
{
_states.Add(new State());
return _states.Count - 1;
}
public bool IsFinal(int state)
{
return _states[state].Final;
}
public void SetFinal(int state)
{
_states[state].Final = true;
}
public void SetStart(int start)
{
_start = -1;
}
public int NumStates { get { return _states.Count; } }
public void Print()
{
for (int i = 0; i < _states.Count; i++)
{
var arcs = _states[i].Arcs;
for (int j = 0; j < arcs.Count; j++)
{
var arc = arcs[j];
Console.WriteLine("{0}\t{1}\t{2}", i, j, arc.Cost);
}
}
}
}
class Program
{
static List<int> ShortertPath(Graph graph)
{
float[] d = new float[graph.NumStates];
int[] tb = new int[graph.NumStates];
for (int i = 0; i < d.Length; i++)
{
d[i] = float.PositiveInfinity;
tb[i] = -1;
}
d[0] = 0.0f;
float bestscore = float.PositiveInfinity;
int beststate = -1;
for (int i = 0; i < graph.NumStates; i++)
{
if (graph.Arcs(i) != null)
{
foreach (var arc in graph.Arcs(i))
{
if (arc.Nextstate < i)
throw new Exception("Graph is not topologically sorted");
float e = d[i] + arc.Cost;
if (e < d[arc.Nextstate])
{
d[arc.Nextstate] = e;
tb[arc.Nextstate] = i;
if (graph.IsFinal(arc.Nextstate) && e < bestscore)
{
bestscore = e;
beststate = arc.Nextstate;
}
}
}
}
}
//Traceback and recover the best final sequence
if (bestscore == float.NegativeInfinity)
throw new Exception("No valid terminal state found");
Console.WriteLine("Best state {0} and score {1}", beststate, bestscore);
List<int> best = new List<int>();
while (beststate != -1)
{
best.Add(beststate);
beststate = tb[beststate];
}
return best;
}
static void Main(string[] args)
{
Graph g = new Graph();
String input = "ABBCAD";
for (int i = 0; i < input.Length - 1; i++)
for (int j = i + 1; j < input.Length; j++)
{
//Modify this of different constraints on-the arcs
//or graph topologies
//if (input[i] < input[j])
g.AddArc(i, j, 1.0f);
}
g.SetStart(0);
//Modify this to make all states final for example
//To compute longest sub-sequences and so on...
g.SetFinal(g.NumStates - 1);
var bestpath = ShortertPath(g);
//Print the best state sequence in reverse
foreach (var v in bestpath)
{
Console.WriteLine(v);
}
}
}
}
Related
I have a List<Keyword> where Keyword class is:
public string keyword;
public List<int> ids;
public int hidden;
public int live;
public bool worked;
Keyword has its own keyword, a set of 20 ids, live by default is set to 1 and hidden to 0.
I just need to iterate over the whole main List to invalidate those keywords whose number of same ids is greater than 6, so comparing every pair, if the second one has more than 6 ids repeated respect to the first one, hidden is set to 1 and live to 0.
The algorithm is very basic but it takes too long when the main list has many elements.
I'm trying to guess if there could be any method I could use to increase the speed.
The basic algorithm I use is:
foreach (Keyword main_keyword in lista_de_keywords_live)
{
if (main_keyword.worked) {
continue;
}
foreach (Keyword keyword_to_compare in lista_de_keywords_live)
{
if (keyword_to_compare.worked || keyword_to_compare.id == main_keyword.id) continue;
n_ids_same = 0;
foreach (int id in main_keyword.ids)
{
if (keyword_to_compare._lista_models.IndexOf(id) >= 0)
{
if (++n_ids_same >= 6) break;
}
}
if (n_ids_same >= 6)
{
keyword_to_compare.hidden = 1;
keyword_to_compare.live = 0;
keyword_to_compare.worked = true;
}
}
}
The code below is an example of how you would use a HashSet for your problem. However, I would not recommend using it in this scenario. On the other hand, the idea of sorting the ids to make the comparison faster still.
Run it in a Console Project to try it out.
Notice that once I'm done adding new ids to a keyword, I sort them. This makes the comparison faster later on.
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;
namespace KeywordExample
{
public class Keyword
{
public List<int> ids;
public int hidden;
public int live;
public bool worked;
public Keyword()
{
ids = new List<int>();
hidden = 0;
live = 1;
worked = false;
}
public override string ToString()
{
StringBuilder s = new StringBuilder();
if (ids.Count > 0)
{
s.Append(ids[0]);
for (int i = 1; i < ids.Count; i++)
{
s.Append(',' + ids[i].ToString());
}
}
return s.ToString();
}
}
public class KeywordComparer : EqualityComparer<Keyword>
{
public override bool Equals(Keyword k1, Keyword k2)
{
int equals = 0;
int i = 0;
int j = 0;
//based on sorted ids
while (i < k1.ids.Count && j < k2.ids.Count)
{
if (k1.ids[i] < k2.ids[j])
{
i++;
}
else if (k1.ids[i] > k2.ids[j])
{
j++;
}
else
{
equals++;
i++;
j++;
}
}
return equals >= 6;
}
public override int GetHashCode(Keyword keyword)
{
return 0;//notice that using the same hash for all keywords gives you an O(n^2) time complexity though.
}
}
class Program
{
static void Main(string[] args)
{
List<Keyword> listOfKeywordsLive = new List<Keyword>();
//add some values
Random random = new Random();
int n = 10;
int sizeOfMaxId = 20;
for (int i = 0; i < n; i++)
{
var newKeyword = new Keyword();
for (int j = 0; j < 20; j++)
{
newKeyword.ids.Add(random.Next(sizeOfMaxId) + 1);
}
newKeyword.ids.Sort(); //sorting the ids
listOfKeywordsLive.Add(newKeyword);
}
//solution here
HashSet<Keyword> set = new HashSet<Keyword>(new KeywordComparer());
set.Add(listOfKeywordsLive[0]);
for (int i = 1; i < listOfKeywordsLive.Count; i++)
{
Keyword keywordToCompare = listOfKeywordsLive[i];
if (!set.Add(keywordToCompare))
{
keywordToCompare.hidden = 1;
keywordToCompare.live = 0;
keywordToCompare.worked = true;
}
}
//print all keywords to check
Console.WriteLine(set.Count + "/" + n + " inserted");
foreach (var keyword in set)
{
Console.WriteLine(keyword);
}
}
}
}
The obvious source of inefficiency is the way you calculate intersection of two lists (of ids). The algorithm is O(n^2). This is by the way problem that relational databases solve for every join and your approach would be called loop join. The main efficient strategies are hash join and merge join. For your scenario the latter approach may be better I guess, but you can also try HashSets if you like.
The second source of inefficiency is repeating everything twice. As (a join b) is equal to (b join a), you do not need two cycles over the whole List<Keyword>. Actually, you only need to loop over the non duplicate ones.
Using some code from here, you can write the algorithm like:
Parallel.ForEach(list, k => k.ids.Sort());
List<Keyword> result = new List<Keyword>();
foreach (var k in list)
{
if (result.Any(r => r.ids.IntersectSorted(k.ids, Comparer<int>.Default)
.Skip(5)
.Any()))
{
k.hidden = 1;
k.live = 0;
k.worked = true;
}
else
{
result.Add(k);
}
}
If you replace the linq with just the index manipulation approach (see the link above), it would be a tiny bit faster I guess.
I'm self-learning c# and I'm a little confused on nodes in graph structures. I've cobbled together this code so far, but I have no idea how to remove and entire node from the list:
public class Graph
{
private int _V;
private int _E;
private LinkedList<Int32>[] adj;
public Graph(int V)
{
this._V = V;
adj = new LinkedList<Int32>[_V];
for (int v = 0; v < _V; v++)
{
adj[v] = new LinkedList<Int32>();
}
}
public void AddEdge(int v, int w)
{
_E++;
adj[v].AddFirst(w);
adj[w].AddFirst(v);
}
public void RemoveEdge(int v, int w)
{
_E--;
adj[v].Remove(w);
adj[w].Remove(v);
}
public IEnumerable<Int32> Adj(int v)
{
return adj[v];
}
public int V()
{
return _V;
}
public bool isLeaf(int v)
{
return adj[v].Count() == 1;
}
public int adjacencies(int v)
{
return adj[v].Count();
}
public String toString()
{
StringBuilder s = new StringBuilder();
String NEWLINE = Environment.NewLine;
s.Append(adj[1].Count + NEWLINE);
s.Append(_V + " vertices, " + _E + " edges " + NEWLINE);
for (int v = 0; v < _V; v++) {
s.Append(String.Format("{0:d}: ", v));
foreach (int w in adj[v]) {
s.Append(String.Format("{0:d} ", w));
}
s.Append(NEWLINE);
}
return s.ToString();
}
}
So, if I have four nodes where node 1 has an edge to 2, 2 has edges to both 3 and 4, and 3 and 4 share an edge with each other, I want to remove node 1 completely since it's a leaf. I can remove the edge easily enough, but that node still remains in my list (just without an edge). How do I get rid of 1 completely? I thought I should be able to just do a adj.Remove(1), but that throws an error.
I realize this is probably super easy, but the answers I've looked through in here seem to be describing something different or I'm simply not getting how this works.
Thanks!
Arrays do not support removal. If you really need to encode the concept of "does not exist" into adj, then you could set adj[v] to a value that represents "does not exist". For example,
adj[v] = null.
Alternatively, you could store adjacencies in a Dictionary<Int32>. Instead of
private LinkedList<Int32>[] adj;
you would have
private Dictionary<Int32, LinkedList<Int32>> adj;
and you could remove vertex v by
adj.Remove(v); // don't forget to remove the edges
In either case, you would probably want to update the other methods to handle such "removed" vertices.
I got a strange C# programming problem. There is a data retrieval in groups of random lengths of number groups. The numbers should be all unique, like:
group[1]{1,2,15};
group[2]{3,4,7,33,22,100};
group[3]{11,12,9};
// Now there is a routine that adds a number to a group.
// For the example, just imagine the active group looks like:
// group[active]=(10,5,0)
group[active].add(new_number);
// Now if 100 were to be added to the active group
// then the active group should be merged to group[2]
// (as that one already contained 100)
// And then as a result it would like
group[1]{1,2,15};
group[2]{3,4,7,33,22,100,10,5,0}; // 10 5 0 added to group[2]
group[3]{11,12,9};
// 100 wasn't added to group[2] since it was already in there.
If the number to be added is already used (not unique) in a previous group.
Then I should merge all numbers in the active group towards that previous group, so I don’t get double numbers.
So in the above example if number 100 was added to the active
group, then all numbers in the group[active] should be merged into group[2].
And then the group[active] should start clean fresh again without any items. And since 100 was already in group[2] it should not be added double.
I am not entirely sure on how to deal with this in a proper way.
As an important criteria here is that it has to work fast.
I will have around minimal 30 groups (upper-bound unknown might be 2000 or more), and their length on average contains five integer numbers, but it could be much longer or only one number.
I kind of feel that I am reinventing the wheel here.
I wonder what this problem is called (does it go by a name, some sorting, or grouping math problem)?, with a name I might find some articles related to such problems.
But maybe it’s indeed something new, then what would be recommended? Should I use list of lists or a dictionary of lists.. or something else? Somehow the checking if the number is already present should be done fast.
I'm thinking along this path now and am not sure if it’s the best.
Instead of a single number, I use a struct now. It wasn't written in the original question as I was afraid, explaining that would make it too complex.
struct data{int ID; int additionalNumber}
Dictionary <int,List<data>> group =new Dictionary<int, List<data>>();
I can step aside from using a struct in here. A lookup list could connect the other data to the proper index. So this makes it again more close to the original description.
On a side note, great answers are given.
So far I don’t know yet what would work best for me in my situation.
Note on the selected answer
Several answers were given here, but I went for the pure dictionary solution.
Just as a note for people in similar problem scenarios: I'd still recommend testing, and maybe the others work better for you. It’s just that in my case currently it worked best. The code was also quite short which I liked, and a dictionary adds also other handy options for my future coding on this.
I would go with Dictionary<int, HashSet<int>>, since you want to avoid duplicates and want a fast way to check if given number already exists:
Usage example:
var groups = new Dictionary<int, HashSet<int>>();
// populate the groups
groups[1] = new HashSet<int>(new[] { 1,2,15 });
groups[2] = new HashSet<int>(new[] { 3,4,7,33,22,100 });
int number = 5;
int groupId = 4;
bool numberExists = groups.Values.Any(x => x.Contains(number));
// if there is already a group that contains the number
// merge it with the current group and add the new number
if (numberExists)
{
var group = groups.First(kvp => kvp.Value.Contains(number));
groups[group.Key].UnionWith(groups[groupId]);
groups[groupId] = new HashSet<int>();
}
// otherwise just add the new number
else
{
groups[groupId].Add(number);
}
From what I gather you want to iteratively assign numbers to groups satisfying these conditions:
Each number can be contained in only one of the groups
Groups are sets (numbers can occur only once in given group)
If number n exists in group g and we try to add it to group g', all numbers from g' should be transferred to g instead (avoiding repetitions in g)
Although approaches utilizing Dictionary<int, HashSet<int>> are correct, here's another one (more mathematically based).
You could simply maintain a Dictionary<int, int>, in which the key would be the number, and the corresponding value would indicate the group, to which that number belongs (this stems from condition 1.). And here's the add routine:
//let's assume dict is a reference to the dictionary
//k is a number, and g is a group
void AddNumber(int k, int g)
{
//if k already has assigned a group, we assign all numbers from g
//to k's group (which should be O(n))
if(dict.ContainsKey(k) && dict[k] != g)
{
foreach(var keyValuePair in dict.Where(kvp => kvp.Value == g).ToList())
dict[keyValuePair.Key] = dict[k];
}
//otherwise simply assign number k to group g (which should be O(1))
else
{
dict[k] = g;
}
}
Notice that from a mathematical point of view what you want to model is a function from a set of numbers to a set of groups.
I have kept it as easy to follow as I can, trying not to impact the speed or deviate from the spec.
Create a class called Groups.cs and copy and paste this code into it:
using System;
using System.Collections.Generic;
namespace XXXNAMESPACEXXX
{
public static class Groups
{
public static List<List<int>> group { get; set; }
public static int active { get; set; }
public static void AddNumberToGroup(int numberToAdd, int groupToAddItTo)
{
try
{
if (group == null)
{
group = new List<List<int>>();
}
while (group.Count < groupToAddItTo)
{
group.Add(new List<int>());
}
int IndexOfListToRefresh = -1;
List<int> NumbersToMove = new List<int>();
foreach (List<int> Numbers in group)
{
if (Numbers.Contains(numberToAdd) && (group.IndexOf(Numbers) + 1) != groupToAddItTo)
{
active = group.IndexOf(Numbers) + 1;
IndexOfListToRefresh = group.IndexOf(Numbers);
foreach (int Number in Numbers)
{
NumbersToMove.Add(Number);
}
}
}
foreach (int Number in NumbersToMove)
{
if (!group[groupToAddItTo - 1].Contains(Number))
{
group[groupToAddItTo - 1].Add(Number);
}
}
if (!group[groupToAddItTo - 1].Contains(numberToAdd))
{
group[groupToAddItTo - 1].Add(numberToAdd);
}
if (IndexOfListToRefresh != -1)
{
group[IndexOfListToRefresh] = new List<int>();
}
}
catch//(Exception ex)
{
//Exception handling here
}
}
public static string GetString()
{
string MethodResult = "";
try
{
string Working = "";
bool FirstPass = true;
foreach (List<int> Numbers in group)
{
if (!FirstPass)
{
Working += "\r\n";
}
else
{
FirstPass = false;
}
Working += "group[" + (group.IndexOf(Numbers) + 1) + "]{";
bool InnerFirstPass = true;
foreach (int Number in Numbers)
{
if (!InnerFirstPass)
{
Working += ", ";
}
else
{
InnerFirstPass = false;
}
Working += Number.ToString();
}
Working += "};";
if ((active - 1) == group.IndexOf(Numbers))
{
Working += " //<active>";
}
}
MethodResult = Working;
}
catch//(Exception ex)
{
//Exception handling here
}
return MethodResult;
}
}
}
I don't know if foreach is more or less efficient than standard for loops, so I have made an alternative version that uses standard for loops:
using System;
using System.Collections.Generic;
namespace XXXNAMESPACEXXX
{
public static class Groups
{
public static List<List<int>> group { get; set; }
public static int active { get; set; }
public static void AddNumberToGroup(int numberToAdd, int groupToAddItTo)
{
try
{
if (group == null)
{
group = new List<List<int>>();
}
while (group.Count < groupToAddItTo)
{
group.Add(new List<int>());
}
int IndexOfListToRefresh = -1;
List<int> NumbersToMove = new List<int>();
for(int i = 0; i < group.Count; i++)
{
List<int> Numbers = group[i];
int IndexOfNumbers = group.IndexOf(Numbers) + 1;
if (Numbers.Contains(numberToAdd) && IndexOfNumbers != groupToAddItTo)
{
active = IndexOfNumbers;
IndexOfListToRefresh = IndexOfNumbers - 1;
for (int j = 0; j < Numbers.Count; j++)
{
int Number = NumbersToMove[j];
NumbersToMove.Add(Number);
}
}
}
for(int i = 0; i < NumbersToMove.Count; i++)
{
int Number = NumbersToMove[i];
if (!group[groupToAddItTo - 1].Contains(Number))
{
group[groupToAddItTo - 1].Add(Number);
}
}
if (!group[groupToAddItTo - 1].Contains(numberToAdd))
{
group[groupToAddItTo - 1].Add(numberToAdd);
}
if (IndexOfListToRefresh != -1)
{
group[IndexOfListToRefresh] = new List<int>();
}
}
catch//(Exception ex)
{
//Exception handling here
}
}
public static string GetString()
{
string MethodResult = "";
try
{
string Working = "";
bool FirstPass = true;
for(int i = 0; i < group.Count; i++)
{
List<int> Numbers = group[i];
if (!FirstPass)
{
Working += "\r\n";
}
else
{
FirstPass = false;
}
Working += "group[" + (group.IndexOf(Numbers) + 1) + "]{";
bool InnerFirstPass = true;
for(int j = 0; j < Numbers.Count; j++)
{
int Number = Numbers[j];
if (!InnerFirstPass)
{
Working += ", ";
}
else
{
InnerFirstPass = false;
}
Working += Number.ToString();
}
Working += "};";
if ((active - 1) == group.IndexOf(Numbers))
{
Working += " //<active>";
}
}
MethodResult = Working;
}
catch//(Exception ex)
{
//Exception handling here
}
return MethodResult;
}
}
}
Both implimentations contain the group variable and two methods, which are; AddNumberToGroup and GetString, where GetString is used to check the current status of the group variable.
Note: You'll need to replace XXXNAMESPACEXXX with the Namespace of your project. Hint: Take this from another class.
When adding an item to your List, do this:
int NumberToAdd = 10;
int GroupToAddItTo = 2;
AddNumberToGroup(NumberToAdd, GroupToAddItTo);
...or...
AddNumberToGroup(10, 2);
In the example above, I am adding the number 10 to group 2.
Test the speed with the following:
DateTime StartTime = DateTime.Now;
int NumberOfTimesToRepeatTest = 1000;
for (int i = 0; i < NumberOfTimesToRepeatTest; i++)
{
Groups.AddNumberToGroup(4, 1);
Groups.AddNumberToGroup(3, 1);
Groups.AddNumberToGroup(8, 2);
Groups.AddNumberToGroup(5, 2);
Groups.AddNumberToGroup(7, 3);
Groups.AddNumberToGroup(3, 3);
Groups.AddNumberToGroup(8, 4);
Groups.AddNumberToGroup(43, 4);
Groups.AddNumberToGroup(100, 5);
Groups.AddNumberToGroup(1, 5);
Groups.AddNumberToGroup(5, 6);
Groups.AddNumberToGroup(78, 6);
Groups.AddNumberToGroup(34, 7);
Groups.AddNumberToGroup(456, 7);
Groups.AddNumberToGroup(456, 8);
Groups.AddNumberToGroup(7, 8);
Groups.AddNumberToGroup(7, 9);
}
long MillisecondsTaken = DateTime.Now.Ticks - StartTime.Ticks;
Console.WriteLine(Groups.GetString());
Console.WriteLine("Process took: " + MillisecondsTaken);
I think this is what you need. Let me know if I misunderstood anything in the question.
As far as I can tell it's brilliant, it's fast and it's tested.
Enjoy!
...and one more thing:
For the little windows interface app, I just created a simple winforms app with three textboxes (one set to multiline) and a button.
Then, after adding the Groups class above, in the button-click event I wrote the following:
private void BtnAdd_Click(object sender, EventArgs e)
{
try
{
int Group = int.Parse(TxtGroup.Text);
int Number = int.Parse(TxtNumber.Text);
Groups.AddNumberToGroup(Number, Group);
TxtOutput.Text = Groups.GetString();
}
catch//(Exception ex)
{
//Exception handling here
}
}
I'm working on a hacker rank problem and I believe my solution is correct. However, most of my test cases are being timed out. Could some one suggest how to improve the efficiency of my code?
The important part of the code starts at the "Trie Class".
The exact question can be found here: https://www.hackerrank.com/challenges/contacts
using System;
using System.Collections.Generic;
using System.IO;
class Solution
{
static void Main(String[] args)
{
int N = Int32.Parse(Console.ReadLine());
string[,] argList = new string[N, 2];
for (int i = 0; i < N; i++)
{
string[] s = Console.ReadLine().Split();
argList[i, 0] = s[0];
argList[i, 1] = s[1];
}
Trie trie = new Trie();
for (int i = 0; i < N; i++)
{
switch (argList[i, 0])
{
case "add":
trie.add(argList[i, 1]);
break;
case "find":
Console.WriteLine(trie.find(argList[i, 1]));
break;
default:
break;
}
}
}
}
class Trie
{
Trie[] trieArray = new Trie[26];
private int findCount = 0;
private bool data = false;
private char name;
public void add(string s)
{
s = s.ToLower();
add(s, this);
}
private void add(string s, Trie t)
{
char first = Char.Parse(s.Substring(0, 1));
int index = first - 'a';
if(t.trieArray[index] == null)
{
t.trieArray[index] = new Trie();
t.trieArray[index].name = first;
}
if (s.Length > 1)
{
add(s.Substring(1), t.trieArray[index]);
}
else
{
t.trieArray[index].data = true;
}
}
public int find(string s)
{
int ans;
s = s.ToLower();
find(s, this);
ans = findCount;
findCount = 0;
return ans;
}
private void find(string s, Trie t)
{
if (t == null)
{
return;
}
if (s.Length > 0)
{
char first = Char.Parse(s.Substring(0, 1));
int index = first - 'a';
find(s.Substring(1), t.trieArray[index]);
}
else
{
for(int i = 0; i < 26; i++)
{
if (t.trieArray[i] != null)
{
find("", t.trieArray[i]);
}
}
if (t.data == true)
{
findCount++;
}
}
}
}
EDIT: I did some suggestions in the comments but realized that I can't replace s.Substring(1) with s[0]... because I actually need s[1..n]. AND s[0] returns a char so I'm going to need to do .ToString on it anyways.
Also, to add a little more information. The idea is that it needs to count ALL names after a prefix for example.
Input: "He"
Trie Contains:
"Hello"
"Help"
"Heart"
"Ha"
"No"
Output: 3
I could just post a solution here that gives you 40 points but i guess this wont be any fun.
Make use of Enumerator<char> instead of any string operations
Count while adding values
any charachter minus 'a' makes a good arrayindex (gives you values between 0 and 25)
I think, this link must be very helpfull
There, you can find the good implementation of a trie data structure, but it's a java implementation. I implemented trie on c# with that great article one year ago, and i tried to find solution to an another hackerrank task too ) It was successful.
In my task, i had to a simply trie (i mean my alphabet equals 2) and only one method for adding new nodes:
public class Trie
{
//for integer representation in binary system 2^32
public static readonly int MaxLengthOfBits = 32;
//alphabet trie size
public static readonly int N = 2;
class Node
{
public Node[] next = new Node[Trie.N];
}
private Node _root;
}
public void AddValue(bool[] binaryNumber)
{
_root = AddValue(_root, binaryNumber, 0);
}
private Node AddValue(Node node, bool[] val, int d)
{
if (node == null) node = new Node();
//if least sagnificient bit has been added
//need return
if (d == val.Length)
{
return node;
}
// get 0 or 1 index of next array(length 2)
int index = Convert.ToInt32(val[d]);
node.next[index] = AddValue(node.next[index], val, ++d);
return node;
}
I am trying this problem using dynamic programming
Problem:
Given a meeting room and a list of intervals (represent the meeting), for e.g.:
interval 1: 1.00-2.00
interval 2: 2.00-4.00
interval 3: 14.00-16.00
...
etc.
Question:
How to schedule the meeting to maximize the room utilization, and NO meeting should overlap with each other?
Attempted solution
Below is my initial attempt in C# (knowing it is a modified Knapsack problem with constraints). However I had difficulty in getting the result correctly.
bool ContainsOverlapped(List<Interval> list)
{
var sortedList = list.OrderBy(x => x.Start).ToList();
for (int i = 0; i < sortedList.Count; i++)
{
for (int j = i + 1; j < sortedList.Count; j++)
{
if (sortedList[i].IsOverlap(sortedList[j]))
return true;
}
}
return false;
}
public bool Optimize(List<Interval> intervals, int limit, List<Interval> itemSoFar){
if (intervals == null || intervals.Count == 0)
return true; //no more choice
if (Sum(itemSoFar) > limit) //over limit
return false;
var arrInterval = intervals.ToArray();
//try all choices
for (int i = 0; i < arrInterval.Length; i++){
List<Interval> remaining = new List<Interval>();
for (int j = i + 1; j < arrInterval.Length; j++) {
remaining.Add(arrInterval[j]);
}
var partialChoice = new List<Interval>();
partialChoice.AddRange(itemSoFar);
partialChoice.Add(arrInterval[i]);
//should not schedule overlap
if (ContainsOverlapped(partialChoice))
partialChoice.Remove(arrInterval[i]);
if (Optimize(remaining, limit, partialChoice))
return true;
else
partialChoice.Remove(arrInterval[i]); //undo
}
//try all solution
return false;
}
public class Interval
{
public bool IsOverlap(Interval other)
{
return (other.Start < this.Start && this.Start < other.End) || //other < this
(this.Start < other.Start && other.End < this.End) || // this covers other
(other.Start < this.Start && this.End < other.End) || // other covers this
(this.Start < other.Start && other.Start < this.End); //this < other
}
public override bool Equals(object obj){
var i = (Interval)obj;
return base.Equals(obj) && i.Start == this.Start && i.End == this.End;
}
public int Start { get; set; }
public int End { get; set; }
public Interval(int start, int end){
Start = start;
End = end;
}
public int Duration{
get{
return End - Start;
}
}
}
Edit 1
Room utilization = amount of time the room is occupied. Sorry for confusion.
Edit 2
for simplicity: the duration of each interval is integer, and the start/end time start at whole hour (1,2,3..24)
I'm not sure how you are relating this to a knapsack problem. To me it seems more of a vertex cover problem.
First sort the intervals as per their start times and form a graph representation in the form of adjacency matrix or list.
The vertices shall be the interval numbers. There shall be an edge between two vertices if the corresponding intervals overlap with each other. Also, each vertex shall be associated with a value equal to the interval's duration.
The problem then becomes choosing the independent vertices in such a way that the total value is maximum.
This can be done through dynamic programming. The recurrence relation for each vertex shall be as follows:
V[i] = max{ V[j] | j < i and i->j is an edge,
V[k] + value[i] | k < i and there is no edge between i and k }
Base Case V[1] = value[1]
Note:
The vertices should be numbered in increasing order of their start times. Then if there are three vertices:
i < j < k, and if there is no edge between vertex i and vertex j, then there cannot be any edge between vertex i and vertex k.
Good approach is to create class that can easily handle for you.
First I create helper class for easily storing intervals
public class FromToDateTime
{
private DateTime _start;
public DateTime Start
{
get
{
return _start;
}
set
{
_start = value;
}
}
private DateTime _end;
public DateTime End
{
get
{
return _end;
}
set
{
_end = value;
}
}
public FromToDateTime(DateTime start, DateTime end)
{
Start = start;
End = end;
}
}
And then here is class Room, where all intervals are and which has method "addInterval", which returns true, if interval is ok and was added and false, if it does not.
btw : I got a checking condition for overlapping here : Algorithm to detect overlapping periods
public class Room
{
private List<FromToDateTime> _intervals;
public List<FromToDateTime> Intervals
{
get
{
return _intervals;
}
set
{
_intervals = value;
}
}
public Room()
{
Intervals = new List<FromToDateTime>();
}
public bool addInterval(FromToDateTime newInterval)
{
foreach (FromToDateTime interval in Intervals)
{
if (newInterval.Start < interval.End && interval.Start < newInterval.End)
{
return false;
}
}
Intervals.Add(newInterval);
return true;
}
}
While the more general problem (if you have multiple number of meeting rooms) is indeed NP-Hard, and is known as the interval scheduling problem.
Optimal solution for 1-d problem with one classroom:
For the 1-d problem, choosing the (still valid) earliest deadline first solves the problem optimally.
Proof: by induction, the base clause is the void clause - the algorithm optimally solves a problem with zero meetings.
The induction hypothesis is the algorithm solves the problem optimally for any number of k tasks.
The step: Given a problem with n meetings, hose the earliest deadline, and remove all invalid meetings after choosing it. Let the chosen earliest deadline task be T.
You will get a new problem of smaller size, and by invoking the algorithm on the reminder, you will get the optimal solution for them (induction hypothesis).
Now, note that given that optimal solution, you can add at most one of the discarded tasks, since you can either add T, or another discarded task - but all of them overlaps T - otherwise they wouldn't have been discarded), thus, you can add at most one from all discarded tasks, same as the suggested algorithm.
Conclusion: For 1 meeting room, this algorithm is optimal.
QED
high level pseudo code of the solution:
findOptimal(list<tasks>):
res = [] //empty list
sort(list) //according to deadline/meeting end
while (list.IsEmpty() == false):
res = res.append(list.first())
end = list.first().endTime()
//remove all overlaps with the chosen meeting
while (list.first().startTine() < end):
list.removeFirst()
return res
Clarification: This answer assumes "Room Utilization" means maximize number of meetings placed in the room.
Thanks all, here is my solution based on this Princeton note on dynamic programming.
Algorithm:
Sort all events by end time.
For each event, find p[n] - the latest event (by end time) which does not overlap with it.
Compute the optimization values: choose the best between including/not including the event.
Optimize(n) {
opt(0) = 0;
for j = 1 to n-th {
opt(j) = max(length(j) + opt[p(j)], opt[j-1]);
}
}
The complete source-code:
namespace CommonProblems.Algorithm.DynamicProgramming {
public class Scheduler {
#region init & test
public List<Event> _events { get; set; }
public List<Event> Init() {
if (_events == null) {
_events = new List<Event>();
_events.Add(new Event(8, 11));
_events.Add(new Event(6, 10));
_events.Add(new Event(5, 9));
_events.Add(new Event(3, 8));
_events.Add(new Event(4, 7));
_events.Add(new Event(0, 6));
_events.Add(new Event(3, 5));
_events.Add(new Event(1, 4));
}
return _events;
}
public void DemoOptimize() {
this.Init();
this.DynamicOptimize(this._events);
}
#endregion
#region Dynamic Programming
public void DynamicOptimize(List<Event> events) {
events.Add(new Event(0, 0));
events = events.SortByEndTime();
int[] eventIndexes = getCompatibleEvent(events);
int[] utilization = getBestUtilization(events, eventIndexes);
List<Event> schedule = getOptimizeSchedule(events, events.Count - 1, utilization, eventIndexes);
foreach (var e in schedule) {
Console.WriteLine("Event: [{0}- {1}]", e.Start, e.End);
}
}
/*
Algo to get optimization value:
1) Sort all events by end time, give each of the an index.
2) For each event, find p[n] - the latest event (by end time) which does not overlap with it.
3) Compute the optimization values: choose the best between including/not including the event.
Optimize(n) {
opt(0) = 0;
for j = 1 to n-th {
opt(j) = max(length(j) + opt[p(j)], opt[j-1]);
}
display opt();
}
*/
int[] getBestUtilization(List<Event> sortedEvents, int[] compatibleEvents) {
int[] optimal = new int[sortedEvents.Count];
int n = optimal.Length;
optimal[0] = 0;
for (int j = 1; j < n; j++) {
var thisEvent = sortedEvents[j];
//pick between 2 choices:
optimal[j] = Math.Max(thisEvent.Duration + optimal[compatibleEvents[j]], //Include this event
optimal[j - 1]); //Not include
}
return optimal;
}
/*
Show the optimized events:
sortedEvents: events sorted by end time.
index: event index to start with.
optimal: optimal[n] = the optimized schedule at n-th event.
compatibleEvents: compatibleEvents[n] = the latest event before n-th
*/
List<Event> getOptimizeSchedule(List<Event> sortedEvents, int index, int[] optimal, int[] compatibleEvents) {
List<Event> output = new List<Event>();
if (index == 0) {
//base case: no more event
return output;
}
//it's better to choose this event
else if (sortedEvents[index].Duration + optimal[compatibleEvents[index]] >= optimal[index]) {
output.Add(sortedEvents[index]);
//recursive go back
output.AddRange(getOptimizeSchedule(sortedEvents, compatibleEvents[index], optimal, compatibleEvents));
return output;
}
//it's better NOT choose this event
else {
output.AddRange(getOptimizeSchedule(sortedEvents, index - 1, optimal, compatibleEvents));
return output;
}
}
//compatibleEvents[n] = the latest event which do not overlap with n-th.
int[] getCompatibleEvent(List<Event> sortedEvents) {
int[] compatibleEvents = new int[sortedEvents.Count];
for (int i = 0; i < sortedEvents.Count; i++) {
for (int j = 0; j <= i; j++) {
if (!sortedEvents[j].IsOverlap(sortedEvents[i])) {
compatibleEvents[i] = j;
}
}
}
return compatibleEvents;
}
#endregion
}
public class Event {
public int EventId { get; set; }
public bool IsOverlap(Event other) {
return !(this.End <= other.Start ||
this.Start >= other.End);
}
public override bool Equals(object obj) {
var i = (Event)obj;
return base.Equals(obj) && i.Start == this.Start && i.End == this.End;
}
public int Start { get; set; }
public int End { get; set; }
public Event(int start, int end) {
Start = start;
End = end;
}
public int Duration {
get {
return End - Start;
}
}
}
public static class ListExtension {
public static bool ContainsOverlapped(this List<Event> list) {
var sortedList = list.OrderBy(x => x.Start).ToList();
for (int i = 0; i < sortedList.Count; i++) {
for (int j = i + 1; j < sortedList.Count; j++) {
if (sortedList[i].IsOverlap(sortedList[j]))
return true;
}
}
return false;
}
public static List<Event> SortByEndTime(this List<Event> events) {
if (events == null) return new List<Event>();
return events.OrderBy(x => x.End).ToList();
}
}
}