Entity Framework - How to convert from String to Integer in query - c#

I have a legacy database with the same ID stored with multiple representations (string and integer). I need my query to join based on the key.
I know about SqlFunctions.StringConvert, but it doesn't work for my case, because the ID has 0-prefixes and the canonical representation of the number does not have string equivalence to other representations.
How can I convert my numeric string value into an integer from within my query?

Not sure if that is what you are looking for, but since your numeric string may have non-digit characters in it, you can extract just the digits from the string:
// Keep only the digit characters, then parse them (needs using System.Linq).
var getNumbers = Convert.ToInt32(new string((from t in stringToQuery
where char.IsDigit(t)
select t).ToArray()));

Maybe you should try something like this:
//example
List<string> texts = new List<string>();
List<int> integers = new List<int>();
for (int j = 1; j <= 10; j++)
{
texts.Add("00" + j.ToString());
integers.Add(j);
}
var a = from t in texts
join i in integers on Convert.ToInt32(t) equals i
select t;
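If the data is small enough to bring into memory, the join above can be run in LINQ to Objects, where parsing is unrestricted. The following is a minimal sketch under that assumption; `KeyJoin` and `JoinOnNumericKey` are hypothetical names, non-numeric keys are simply skipped, and nothing here shows a query translated by the database provider:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class KeyJoin
{
    // Hypothetical helper: pairs each zero-padded string key with the matching integer key.
    // Runs in memory (LINQ to Objects), not inside the database.
    public static List<Tuple<string, int>> JoinOnNumericKey(
        IEnumerable<string> texts, IEnumerable<int> integers)
    {
        var parsed = new List<Tuple<string, int>>();
        foreach (var text in texts)
        {
            int value;
            // NumberStyles.None accepts digits only; leading zeros are digits, so "007" parses to 7.
            if (int.TryParse(text, NumberStyles.None, CultureInfo.InvariantCulture, out value))
                parsed.Add(Tuple.Create(text, value));
        }

        // Join on the parsed integer rather than on the string representation.
        return (from p in parsed
                join i in integers on p.Item2 equals i
                select Tuple.Create(p.Item1, i)).ToList();
    }
}
```

Because the comparison happens on the parsed integer, "001" and "1" match the same key regardless of padding.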

Can't you just use TrimStart?
id.TrimStart('0');
(Edit) Actually, LINQ to Entities doesn't like that, so you need to try this instead to strip the leading zeros before the comparison:
user trimstart in entity framework query

I would create a class to store your representation.
public sealed class CanonicalInt: IEquatable<int>, IEquatable<string>
{
private int _number;
private string _canonical
{
get
{
return ""; //logic to turn int into format
}
set
{
_number = 0; ////logic to turn string into format
}
}
public CanonicalInt(int number)
{
_number = number;
}
public CanonicalInt(string number)
{
_canonical = number;
}
public bool Equals(int other)
{
return _number.Equals(other);
}
public bool Equals(string other)
{
if(other == null)
return false;
return _canonical.Equals(other);
}
public static implicit operator int(CanonicalInt canonicalInt)
{
return canonicalInt._number;
}
public static implicit operator string(CanonicalInt canonicalInt)
{
return canonicalInt._canonical;
}
}
Usage:
var number = new CanonicalInt(23);
var result = number == 23; // True

If your string always ends with the canonical number, then maybe something like a combination of PatIndex, DataLength and StringConvert would work? (Please replace the simulated SqlFunctions with the real ones; it should then run in a LINQ to Entities context on tables.)
string [] Strings = new string [] {"0015","0-00152","00-0012"};
int[] Integers = new int[] { 15,12};
var MixedResult = Strings.Where(s => Integers.Any(i => (PatIndex(StringConvert(i),s) + DataLength(StringConvert(i))) == DataLength(s))).ToList();
These are just simulated SqlFunctions:
private string StringConvert(int x)
{
return x.ToString();
}
private int PatIndex(string pattern,string target)
{
return target.IndexOf(pattern);
}
private int DataLength(string x)
{
return x.Length;
}

Related

How to quicksort pairs of numbers(int and double)

I created the pair class and the array class, but I'm lost on how to implement the quicksort algorithm.
I want to sort by the int, and if the ints are the same, then sort by the double. I was able to implement quicksort with one value per array index, but for this case I just can't find any resources.
Maybe you have some resources, or maybe you have had the same problem?
By the way, I'm trying to implement it in C#.
This is my pair class:
class Pair
{
public int integer = 0;
public double doubl = 0.0;
public Pair(int integer, double doubl)
{
this.integer = integer;
this.doubl = doubl;
}
public Pair()
{
}
public int Integer() { return integer; }
public double Doubl() { return doubl; }
}
And my data array class
class MyDataArray : DataArray
{
Pair[] data;
int operations = 0;
public MyDataArray(int n, int seed)
{
data = new Pair[n];
Random rand = new Random(seed);
for (int i = 0; i < n; i++)
{
data[i] = new Pair(rand.Next(1,100), rand.NextDouble());
}
}
public override int integer(int index)
{
return data[index].integer;
}
public override double doubl(int index)
{
return data[index].doubl;
}
public override void Swap(int i, int j)
{
Pair temp = data[i]; // c3 1
data[i] = data[j]; // c3 1
data[j] = temp; // c3 1
}
}
Your Pair class could implement IComparable<T>, and your quick sort algorithm could be implemented using the CompareTo method.
The IComparable<T> interface:
Defines a generalized comparison method that a value type or class implements to create a type-specific comparison method for ordering or sorting its instances.
You can see the documentation on the CompareTo method to see what the return values mean.
public class Pair : IComparable<Pair>
{
public int integer = 0;
public double doubl = 0.0;
public Pair(int integer, double doubl)
{
this.integer = integer;
this.doubl = doubl;
}
public Pair()
{
}
public int CompareTo(Pair other)
{
if (other == null)
{
return 1;
}
int result = integer.CompareTo(other.integer);
if (result == 0)
{
result = doubl.CompareTo(other.doubl);
}
return result;
}
public int Integer() { return integer; }
public double Doubl() { return doubl; }
}
If you prefer to use the comparison operators, you can implement them in terms of the CompareTo method. The documentation I linked has examples of how to do that.
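One way the quicksort itself could be built on CompareTo is sketched here. The `PairSorter` class and the Lomuto partition scheme are illustrative choices, not taken from the question, and the `Pair` shown is a trimmed copy of the class above so that the block compiles on its own:

```csharp
using System;

public class Pair : IComparable<Pair>
{
    public int integer;
    public double doubl;

    public Pair(int integer, double doubl)
    {
        this.integer = integer;
        this.doubl = doubl;
    }

    // Orders by the integer first; the double only breaks ties.
    public int CompareTo(Pair other)
    {
        if (other == null) return 1;
        int result = integer.CompareTo(other.integer);
        return result != 0 ? result : doubl.CompareTo(other.doubl);
    }
}

// Hypothetical helper class, not part of the original question.
public static class PairSorter
{
    public static void QuickSort(Pair[] data)
    {
        QuickSort(data, 0, data.Length - 1);
    }

    private static void QuickSort(Pair[] data, int low, int high)
    {
        if (low >= high) return; // zero or one element: already sorted
        int p = Partition(data, low, high);
        QuickSort(data, low, p - 1);
        QuickSort(data, p + 1, high);
    }

    // Lomuto partition: the last element is the pivot.
    private static int Partition(Pair[] data, int low, int high)
    {
        Pair pivot = data[high];
        int i = low;
        for (int j = low; j < high; j++)
        {
            // The only place ordering is decided: CompareTo handles both fields.
            if (data[j].CompareTo(pivot) < 0)
            {
                Swap(data, i, j);
                i++;
            }
        }
        Swap(data, i, high);
        return i;
    }

    private static void Swap(Pair[] data, int i, int j)
    {
        Pair temp = data[i];
        data[i] = data[j];
        data[j] = temp;
    }
}
```

The sort never looks at the fields directly, so changing the ordering rule only requires changing CompareTo.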
//sort for integer
var SortedIntegerList = data.OrderBy(x=>x.integer);
//sort for double
var SortedDoubleList = data.OrderBy(x=>x.doubl);
OrderBy for objects uses Quicksort (see "What Sorting Algorithm Is Used By LINQ "OrderBy"?"), so you can use that.
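To get the ordering the question asks for (by integer, with ties broken by the double), OrderBy can be chained with ThenBy. A small sketch, assuming a hypothetical `PairOrdering` helper and using tuples in place of the Pair class to keep it self-contained:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PairOrdering
{
    // Hypothetical helper: sorts by the integer first, then by the double for equal integers.
    public static List<Tuple<int, double>> SortPairs(IEnumerable<Tuple<int, double>> pairs)
    {
        return pairs.OrderBy(p => p.Item1)   // primary key: the integer
                    .ThenBy(p => p.Item2)    // secondary key: the double
                    .ToList();
    }
}
```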
To avoid writing a class that implements the IComparer<Pair> interface, you can construct a comparer with Comparer<T>.Create from just a comparison delegate:
var sorted = source.OrderBy(x => x, Comparer<Pair>.Create(
(p1, p2) => p1.Integer() - p2.Integer() != 0 ?
p1.Integer() - p2.Integer() :
Math.Sign(p1.Doubl() - p2.Doubl()))).ToList();

Why can't SortedSet be used as a Priority Queue or Min-Heap?

I was attempting to solve the running median problem (on hackerrank) using a sorted set. However, its elements don't appear to be properly sorted.
See it in action here: http://rextester.com/NGBN25779
public class RunningMedian{
List<int> list = new List<int>();
SortedSet<int> sorted = new SortedSet<int>();
public void Add(int num){
list.Add(num);
sorted.Add(num);
}
public double MedianNotWorking(){
return GetMedian(sorted.ToArray());
}
public double MedianWorking(){
int[] arr = list.ToArray();
Array.Sort(arr);
return GetMedian(arr);
}
public double GetMedian(int[] arr){
int idx = list.Count / 2;
if(arr.Length % 2 == 0){
return (double)((double)(arr[idx] + arr[idx-1]) / 2);
}else{
return arr[idx];
}
}
}
static void Main(String[] args) {
int n = Convert.ToInt32(Console.ReadLine());
int[] a = new int[n];
RunningMedian heap = new RunningMedian();
for(int i = 0; i < n; i++){
a[i] = Convert.ToInt32(Console.ReadLine());
heap.Add(a[i]);
//double median = heap.GetMedian();
double median = heap.MedianNotWorking();
Console.WriteLine(median.ToString("F1"));
}
}
For the most part the sorted set does work. However at larger input sizes it begins to give wrong answers. It may not be the optimal solution to the problem but I'm curious as to why it fails at all. C# doesn't have a min-heap / priority queue so why can't sorted sets be used as a substitute?
*Edited to include full code from hackerrank.
Here is an input file.
Input
http://textuploader.com/dovni
Expected
http://textuploader.com/dovnb
Output
http://textuploader.com/dovwj
Conflicts appear near the end
Expected
(Skipping 1-364)
54240.0
54576.5
54913.0
54576.5
54240.0
Results
(Skipping 1-364)
54240.0
54576.5
54913.0
54963.0
54576.5
SortedSet collections contain by definition only unique values. However your input file contains the number 21794 twice, which means that the second 21794 entry doesn't get added to your SortedSet. So your sorted set will contain fewer values than your list and your whole algorithm doesn't work anymore.
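The effect is easy to reproduce in isolation. This sketch uses a hypothetical `DuplicateDemo` helper (not part of the question's code) to add the same values to a List and a SortedSet and compare the resulting counts:

```csharp
using System;
using System.Collections.Generic;

public static class DuplicateDemo
{
    // Adds the same values to a List and a SortedSet and returns both counts.
    public static Tuple<int, int> CountBoth(IEnumerable<int> values)
    {
        var list = new List<int>();
        var sorted = new SortedSet<int>();
        foreach (int v in values)
        {
            list.Add(v);
            sorted.Add(v); // returns false and stores nothing when v is already present
        }
        return Tuple.Create(list.Count, sorted.Count);
    }
}
```

For an input containing one repeated value, the set ends up one element shorter than the list. Since GetMedian in the question takes its index from list.Count but reads from the array built from the set, the two fall out of step as soon as the first duplicate arrives.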
In general, this could be achieved by defining new IComparer behavior for the SortedSet comparison. For a min priority queue it would be something like this:
public class PriorityQueue<K,V> where K : IComparable
where V : IComparable
{
private SortedSet<Node<K,V>> _set;
private readonly int _amount;
public PriorityQueue(int amount)
{
_set = new SortedSet<Node<K,V>>(new PriorityComparer<K,V>());
_amount = amount;
}
public void Add(Node<K,V> value)
{
if (_amount > _set.Count)
_set.Add(value);
else
{
if (_set.Max.Val.CompareTo(value.Val) > 0) // CompareTo only guarantees the sign, not exactly 1
{
_set.Remove(_set.Max);
_set.Add(value);
}
}
}
public Node<K,V> ExtractMax()
{
var max = _set.Max;
_set.Remove(max);
return max;
}
public Node<K,V> ExtractMin()
{
var min = _set.Min;
_set.Remove(min);
return min;
}
public bool IsEmpty => _set.Count == 0;
}
public struct Node<K,V> where K : IComparable
where V : IComparable
{
public K Key;
public V Val;
public Node(K key, V val)
{
Val = val;
Key = key;
}
}
public class PriorityComparer<K,V> : IComparer<Node<K,V>> where K: IComparable
where V: IComparable
{
public int Compare(Node<K,V> i, Node<K,V> y)
{
var compareresult = i.Val.CompareTo(y.Val);
if (compareresult == 0)
return i.Key.CompareTo(y.Key);
return compareresult;
}
}

C# - Finding key in an array by the value of its complex structure

Is there a method in C# to find the key (index) of an item in an array by one of its "subvalues"? Some hypothetical function "findKeyofCorrespondingItem()"?
struct Items
{
public string itemId;
public string itemName;
}
int len = 18;
Items[] items = new Items[len];
items[0].itemId = "684656";
items[1].itemId = "411666";
items[2].itemId = "125487";
items[3].itemId = "756562";
// ...
items[17].itemId = "256569";
int key = findKeyofCorrespondingItem(items,itemId,"125487"); // returns 2
You can use Array.FindIndex. See https://msdn.microsoft.com/en-us/library/03y7c6xy(v=vs.110).aspx
using System;
...
int key = Array.FindIndex(items, e => e.itemId == "125487"); // -1 if no item matches
public static int findKeyofCorrespondingItem(Items[] items, string searchValue)
{
for (int i = 0; i < items.Length; i++)
{
if (items[i].itemId == searchValue)
{
return i;
}
}
return -1;
}
You can run a loop and check whether itemId is equal to the value you are searching for. Return -1 if no item matches the value.
Solution with Array.FindIndex:
public static int findKeyofCorrespondingItem(Items[] items, string searchValue)
{
return Array.FindIndex(items, (e) => e.itemId == searchValue);
}

C# replace null XML node with respective data type's default value

I have to loop over about 400 different XML files, and each time I will be getting a different XML file.
I have about 11 nodes in the XML (all coming in as strings), and I am parsing this XML and storing the XML elements' values in the database using Entity Framework (as different data types such as decimal, int, string and double).
I do not know which XML node will come in as null, and I do not want to add a null check for each and every node.
Is there a way to implement a common null check for the whole XML file in the loop, so that if any node comes in as null I can assign the default value of the respective data type in its entity? Something like the code snippet shown below:
foreach (XmlNode node in tableElements)
{
dcSearchTerm searchTermEntity = new dcSearchTerm();
//Reference keywords: creation & assignment
int IDRef = 0, salesRef = 0, visitsRef = 0, saleItemsRef = 0;
DateTime visitDateRef = new DateTime();
decimal revenueRef = 0;
int.TryParse(node["id"].InnerText, out IDRef);
searchTermEntity.SearchTerm = node["Search_x0020_Term"].InnerText;
searchTermEntity.ReferrerDomain = node["Referrer_x0020_Domain"].InnerText;
if (node["Country"] == null)
{
searchTermEntity.Country = "";
}
else
{
searchTermEntity.Country = node["Country"].InnerText;
}
DateTime.TryParse(node["Visit_x0020_Date"].InnerText, out visitDateRef);
searchTermEntity.VisitEntryPage = node["Visit_x0020_Entry_x0020_Page"].InnerText;
int.TryParse(node["Sales"].InnerText, out salesRef);
int.TryParse(node["Visits"].InnerText, out visitsRef);
decimal.TryParse(node["Revenue"].InnerText, out revenueRef);
int.TryParse(node["Sale_x0020_Items"].InnerText, out saleItemsRef);
// assigning reference values to the entity
searchTermEntity.ID = IDRef;
searchTermEntity.VisitDate = visitDateRef;
searchTermEntity.Sales = salesRef;
searchTermEntity.Visits = visitsRef;
searchTermEntity.Revenue = revenueRef;
searchTermEntity.SaleItems = saleItemsRef;
searches.Add(searchTermEntity);
}
return searches;
P.S.:- This is my first question on SO, please feel free to ask more details
Waiting for a flood of suggestions ! :)
OK, here is an extension class that adds methods to strings and XmlNodes:
public static class MyExtensions
{
// obviously these ToType methods can be implemented with generics
// to further reduce code duplication
public static int ToInt32(this string value)
{
Int32 result = 0;
if (!string.IsNullOrEmpty(value))
Int32.TryParse(value, out result);
return result;
}
public static decimal ToDecimal(this string value)
{
Decimal result = 0M;
if (!string.IsNullOrEmpty(value))
Decimal.TryParse(value, out result);
return result;
}
public static int GetInt(this XmlNode node, string key)
{
var str = node.GetString(key);
return str.ToInt32();
}
public static string GetString(this XmlNode node, string key)
{
if (node[key] == null || String.IsNullOrEmpty(node[key].InnerText))
return null;
else
return node[key].InnerText; // the child element's text, not the parent node's
}
// implement GetDateTime/GetDecimal as practice ;)
}
Now we can rewrite your code like:
foreach (XmlNode node in tableElements)
{
// DECLARE VARIABLES WHEN YOU USE THEM
// DO NOT DECLARE THEM ALL AT THE START OF YOUR METHOD
// http://programmers.stackexchange.com/questions/56585/where-do-you-declare-variables-the-top-of-a-method-or-when-you-need-them
dcSearchTerm searchTermEntity = new dcSearchTerm()
{
ID = node.GetInt("id"),
SearchTerm = node.GetString("Search_x0020_Term"),
ReferrerDomain = node.GetString("Referrer_x0020_Domain"),
Country = node.GetString("Country"),
VisitDate = node.GetDateTime("Visit_x0020_Date"),
VisitEntryPage = node.GetString("Visit_x0020_Entry_x0020_Page"),
Sales = node.GetInt("Sales"),
Visits = node.GetInt("Visits"),
Revenue = node.GetDecimal("Revenue"),
SaleItems = node.GetInt("Sale_x0020_Items")
};
searches.Add(searchTermEntity);
}
return searches;
Don't forget to implement GetDateTime and GetDecimal extensions- I've left those to you ;).
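For reference, the two missing extensions might look something like the sketch below. It is only one possible shape: the class name `XmlNodeDefaults` is hypothetical, and parsing with the invariant culture is an assumption (the code in the question uses the current culture), so adjust it to match your data:

```csharp
using System;
using System.Globalization;
using System.Xml;

// Hypothetical class name; these methods could equally live in MyExtensions.
public static class XmlNodeDefaults
{
    // Returns 0 when the child element is missing, empty, or not a valid decimal.
    public static decimal GetDecimal(this XmlNode node, string key)
    {
        XmlNode child = node[key];
        decimal result;
        // Assumption: values use invariant-culture formatting (e.g. "12.50").
        if (child != null && decimal.TryParse(child.InnerText, NumberStyles.Number,
                CultureInfo.InvariantCulture, out result))
            return result;
        return 0M;
    }

    // Returns default(DateTime) when the child element is missing, empty, or not a valid date.
    public static DateTime GetDateTime(this XmlNode node, string key)
    {
        XmlNode child = node[key];
        DateTime result;
        if (child != null && DateTime.TryParse(child.InnerText, CultureInfo.InvariantCulture,
                DateTimeStyles.None, out result))
            return result;
        return default(DateTime);
    }
}
```

Both methods look the child element up once and fall back to the type's default value, which mirrors what GetInt does for integers.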
You can use a monad style extension method like below. The sample provided acts only on structs. You can modify it to use for all types.
public static class NullExtensions
{
public delegate bool TryGetValue<T>(string input, out T value);
public static T DefaultIfNull<T>(this string value, TryGetValue<T> evaluator, T defaultValue) where T : struct
{
T result;
if (evaluator(value, out result))
return result;
return defaultValue;
}
public static T DefaultIfNull<T>(this string value, TryGetValue<T> evaluator) where T : struct
{
return value.DefaultIfNull(evaluator, default(T));
}
}
Example:
string s = null;
bool result = s.DefaultIfNull<bool>(bool.TryParse, true);
int r = s.DefaultIfNull<int>(int.TryParse);

Overloading Linq Except to allow custom struct with byte array

I am having a problem with a custom struct and overloading linq's except method to remove duplicates.
My struct is as follows:
public struct hashedFile
{
string _fileString;
byte[] _fileHash;
public hashedFile(string fileString, byte[] fileHash)
{
this._fileString = fileString;
this._fileHash = fileHash;
}
public string FileString { get { return _fileString; } }
public byte[] FileHash { get { return _fileHash; } }
}
Now, the following code works fine:
public static void test2()
{
List<hashedFile> list1 = new List<hashedFile>();
List<hashedFile> list2 = new List<hashedFile>();
hashedFile one = new hashedFile("test1", BitConverter.GetBytes(1));
hashedFile two = new hashedFile("test2", BitConverter.GetBytes(2));
hashedFile three = new hashedFile("test3", BitConverter.GetBytes(3));
hashedFile threeA = new hashedFile("test3", BitConverter.GetBytes(4));
hashedFile four = new hashedFile("test4", BitConverter.GetBytes(4));
list1.Add(one);
list1.Add(two);
list1.Add(threeA);
list1.Add(four);
list2.Add(one);
list2.Add(two);
list2.Add(three);
List<hashedFile> diff = list1.Except(list2).ToList();
foreach (hashedFile h in diff)
{
MessageBox.Show(h.FileString + Environment.NewLine + h.FileHash[0].ToString("x2"));
}
}
This code shows "threeA" and "four" just fine. But if I do the following.
public static List<hashedFile> list1(var stuff1)
{
//Generate a List here and return it
}
public static List<hashedFile> list2(var stuff2)
{
//Generate a List here and return it
}
List<hashedFile> diff = list1.except(list2);
"diff" becomes an exact copy of "list1". I should also mention that I am sending a byte array from ComputeHash from System.Security.Cryptography.MD5 to the byte fileHash in the list generations.
Any ideas on how to overload either the Except or GetHashCode method for linq to successfully exclude the duplicate values from list2?
I'd really appreciate it! Thanks!
~MrFreeman
EDIT: Here was how I was originally trying to use List<hashedFile> diff = newList.Except(oldList, new hashedFileComparer()).ToList();
class hashedFileComparer : IEqualityComparer<hashedFile>
{
public bool Equals(hashedFile x, hashedFile y)
{
if (Object.ReferenceEquals(x, y)) return true;
if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
return false;
return x.FileString == y.FileString && x.FileHash == y.FileHash;
}
public int GetHashCode(hashedFile Hashedfile)
{
if (Object.ReferenceEquals(Hashedfile, null)) return 0;
int hashFileString = Hashedfile.FileString == null ? 0 : Hashedfile.FileString.GetHashCode();
int hashFileHash = Hashedfile.FileHash.GetHashCode();
int returnVal = hashFileString ^ hashFileHash;
if (Hashedfile.FileString.Contains("blankmusic") == true)
{
Console.WriteLine(returnVal.ToString());
}
return returnVal;
}
}
If you want the type to handle its own comparisons in Except, the interface you need is IEquatable<T>. The IEqualityComparer<T> interface lets another type handle the comparisons, so that it can be passed into an overload of Except.
This achieves what you want (assuming you wanted both file string and hash compared).
public struct hashedFile : IEquatable<hashedFile>
{
string _fileString;
byte[] _fileHash;
public hashedFile(string fileString, byte[] fileHash)
{
this._fileString = fileString;
this._fileHash = fileHash;
}
public string FileString { get { return _fileString; } }
public byte[] FileHash { get { return _fileHash; } }
public bool Equals(hashedFile other)
{
return _fileString == other._fileString && _fileHash.SequenceEqual(other._fileHash);
}
}
Here is an example in a working console application.
public class Program
{
public struct hashedFile : IEquatable<hashedFile>
{
string _fileString;
byte[] _fileHash;
public hashedFile(string fileString, byte[] fileHash)
{
this._fileString = fileString;
this._fileHash = fileHash;
}
public string FileString { get { return _fileString; } }
public byte[] FileHash { get { return _fileHash; } }
public bool Equals(hashedFile other)
{
return _fileString == other._fileString && _fileHash.SequenceEqual(other._fileHash);
}
}
public static void Main(string[] args)
{
List<hashedFile> list1 = GetList1();
List<hashedFile> list2 = GetList2();
List<hashedFile> diff = list1.Except(list2).ToList();
foreach (hashedFile h in diff)
{
Console.WriteLine(h.FileString + Environment.NewLine + h.FileHash[0].ToString("x2"));
}
Console.ReadLine();
}
private static List<hashedFile> GetList1()
{
hashedFile one = new hashedFile("test1", BitConverter.GetBytes(1));
hashedFile two = new hashedFile("test2", BitConverter.GetBytes(2));
hashedFile threeA = new hashedFile("test3", BitConverter.GetBytes(4));
hashedFile four = new hashedFile("test4", BitConverter.GetBytes(4));
var list1 = new List<hashedFile>();
list1.Add(one);
list1.Add(two);
list1.Add(threeA);
list1.Add(four);
return list1;
}
private static List<hashedFile> GetList2()
{
hashedFile one = new hashedFile("test1", BitConverter.GetBytes(1));
hashedFile two = new hashedFile("test2", BitConverter.GetBytes(2));
hashedFile three = new hashedFile("test3", BitConverter.GetBytes(3));
var list1 = new List<hashedFile>();
list1.Add(one);
list1.Add(two);
list1.Add(three);
return list1;
}
}
This is becoming quite long, but I will continue: there is an issue with the above implementation if hashedFile is a class rather than a struct (and sometimes when it is a struct, possibly depending on the version). Except uses an internal Set class, and the problematic part is that it compares the hash codes first and only uses the comparer to check equality if they are equal.
int hashCode = this.InternalGetHashCode(value);
for (int i = this.buckets[hashCode % this.buckets.Length] - 1; i >= 0; i = this.slots[i].next)
{
if ((this.slots[i].hashCode == hashCode) && this.comparer.Equals(this.slots[i].value, value))
{
return true;
}
}
The fix for this, depending on performance requirements, is to simply return a hash code of 0. This means the comparer will always be used.
public override int GetHashCode()
{
return 0;
}
The other option is to generate a proper hash code. This matters sooner than I expected: the difference for 500 items is 7ms vs 1ms, and for 5000 items it is 650ms vs 13ms. So it is probably best to go with a proper hash code. The byte array hash code function is taken from https://stackoverflow.com/a/7244316/1002621
public override int GetHashCode()
{
var hashCode = 0;
// Concat keeps every byte; Union would drop repeated byte values.
var bytes = _fileHash.Concat(Encoding.UTF8.GetBytes(_fileString)).ToArray();
for (var i = 0; i < bytes.Length; i++)
hashCode = ((hashCode << 3) | (int)((uint)hashCode >> 29)) ^ bytes[i]; // Rotate left by 3 bits, then XOR in the next byte.
return hashCode;
}
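The same content-based comparison can also be packaged as a separate IEqualityComparer and passed to Except, which is closer to what the question originally attempted. This is a sketch only: `FileEntry` is a hypothetical stand-in for hashedFile, `FileEntryComparer` is an illustrative name, and the multiply-by-31 hash is just one simple choice:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the hashedFile struct, reduced to the two compared fields.
public struct FileEntry
{
    public readonly string Name;
    public readonly byte[] Hash;

    public FileEntry(string name, byte[] hash)
    {
        Name = name;
        Hash = hash;
    }
}

public sealed class FileEntryComparer : IEqualityComparer<FileEntry>
{
    public bool Equals(FileEntry x, FileEntry y)
    {
        if (x.Name != y.Name) return false;
        if (x.Hash == null || y.Hash == null) return x.Hash == y.Hash;
        return x.Hash.SequenceEqual(y.Hash); // compares contents, not references
    }

    public int GetHashCode(FileEntry entry)
    {
        unchecked
        {
            int hash = entry.Name == null ? 0 : entry.Name.GetHashCode();
            if (entry.Hash != null)
            {
                // Built from the bytes themselves, so arrays with equal contents hash alike.
                foreach (byte b in entry.Hash)
                    hash = hash * 31 + b;
            }
            return hash;
        }
    }
}
```

The key difference from the comparer in the question is that neither Equals nor GetHashCode relies on the byte array's reference identity, so two separately computed hashes of the same file are treated as equal.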
