Compare Two Liste <T> - c#

how can i compare 2 list ?
public class Pers_Ordre : IEqualityComparer<Pers_Ordre>
{
int _ordreId;
public int LettreVoidID
{
get { return _LettreVoidID; }
set { _LettreVoidID = value; }
}
string _OrdreCummul;
public string OrdreCummul
{
get { return _OrdreCummul; }
set { _OrdreCummul = value; }
}
// Products are equal if their names and product numbers are equal.
public bool Equals(Pers_Ordre x, Pers_Ordre y)
{
//Check whether the compared objects reference the same data.
if (Object.ReferenceEquals(x, y)) return true;
//Check whether any of the compared objects is null.
if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
return false;
//Check whether the products' properties are equal.
return x.LettreVoidID == y.LettreVoidID && x.OrdreCummul == y.OrdreCummul;
}
// If Equals() returns true for a pair of objects
// then GetHashCode() must return the same value for these objects.
public int GetHashCode(Pers_Ordre product)
{
//Check whether the object is null
if (Object.ReferenceEquals(product, null)) return 0;
//Get hash code for the Name field if it is not null.
int hashProductName = product.OrdreCummul == null ? 0 : product.OrdreCummul.GetHashCode();
//Get hash code for the Code field.
int hashProductCode = product.LettreVoidID.GetHashCode();
//Calculate the hash code for the product.
return hashProductName ^ hashProductCode;
}
}
and i compare like this:
private void simpleButton_Comparer_Click(object sender, EventArgs e)
{
string LeFile_Client = System.IO.Path.Combine(appDir, #"FA.csv");
string LeFile_Server = System.IO.Path.Combine(appDir, #"FA_Server.csv");
List<Pers_Ordre> oListClient = Outils.GetCsv(LeFile_Client).OrderBy(t => t.LettreVoidID).ToList();
List<Pers_Ordre> oListServert = Outils.GetCsvServer(LeFile_Server).OrderBy(t => t.LettreVoidID).ToList();
List<Pers_Ordre> LeDiff = new List<Pers_Ordre>();
LeDiff = oListServert.Except(oListClient).ToList();
string Noid = "", OdreID = "";
foreach (var oDiff in LeDiff)
{
Noid += oDiff.LettreVoidID + " ";
OdreID += oDiff.OrdreCummul + " ";
}
MessageBox.Show(Noid + "--" + OdreID);
}
i can not get the right result.
The Lists contain class objects and we would like to iterate through one list, looking for the same item in a second List and report any differences.
to get object that contains in List A but not in List B
and vice versa.

Your current .Except() call will find items from Server that are missing on the client, but it will not find items on the client that are missing on the server.
Try this:
private void simpleButton_Comparer_Click(object sender, EventArgs e)
{
string LeFile_Client = System.IO.Path.Combine(appDir, #"FA.csv");
string LeFile_Server = System.IO.Path.Combine(appDir, #"FA_Server.csv");
var ListClient = Outils.GetCsv(LeFile_Client).OrderBy(t => t.LettreVoidID);
var ListServer = Outils.GetCsvServer(LeFile_Server).OrderBy(t => t.LettreVoidID);
var LeDiff = ListServer.Except(ListClient).Concat(ListClient.Except(ListServer));
var result = new StringBuilder();
foreach (var Diff in LeDiff)
{
result.AppendFormat("{0} --{1} ", Diff.LettreVoidID, Diff.OrdreCummul);
}
MessageBox.Show(Noid.ToString() + "--" + OdreID);
}
This code should also be significantly faster than your original, as it avoids loading the results into memory until it builds the final string. This code in performs the equivalent of two separate sql LEFT JOINs. We could make it faster still by doing one FULL JOIN, but that would require writing our own linq operator method as well.

Related

C# - Compare two strings by using dots in the strings and using wildcards: *

I have two string variables that i want to compare.
var compareA = "something.somethingelse.another.something2"
var compareB = "*.another.something2"
I want to compare this, and the result is: True.
var compareC = "something.somethingelse.*"
compared to compareA, the result should also be: True.
Of course, the fact that both variables can contain N dots also complicates the task.
How would you start for him?
I was tried this:
static void Main(string[] args)
{
var A = CompareString("*.something", "other.Another.something"); //I need this is true!
var B = CompareString("something.Value.Other.*", "something.Value.Other.SomethingElse"); //I need this is true
var C = CompareString("something.Value.Other", "something.Value.Other.OtherElse"); //I need this is False
var D = CompareString("*.somethingElse", "other.another.Value"); //I Need this is false
Console.WriteLine("It is need True: {0}", A);
Console.WriteLine("It is need True: {0}", B);
Console.WriteLine("It is need False: {0}", C);
Console.WriteLine("It is need False: {0}", D);
}
private static bool CompareString(string first, string second)
{
var resume = false;
var firstSplit = first.Split('.');
var secondSplit = second.Split('.');
foreach (var firstItem in firstSplit)
{
foreach (var secondItem in secondSplit)
{
if (firstItem == "*" || secondItem == "*" || string.Equals(firstItem.ToLower(), secondItem.ToLower()))
{
resume = true;
}
else
{
resume = false;
}
}
}
return resume;
}
The results are good, but I think it can be done differently, and the reasoning may be wrong.
Assuming the following:
Compare 1 string to another and it's the full string of 1 contained inside the longer of the 2 strings.
Case to be ignored as well as culture.
The full stops are considered part of the string, not actually as a separator.
Wildcards can be used to state how 1 string can be contained within another.
You should be able to use the following
private bool HasMatch(string textToSearch, string searchText)
{
if (textToSearch.Length < searchText.Length) return false;
var wildCardIndex = searchText.IndexOf('*');
if (wildCardIndex == -1)
{
return textToSearch.Equals(searchText, StringComparison.InvariantCultureIgnoreCase);
}
else
{
if (wildCardIndex == 0)
{
var text = searchText.TrimStart('*');
return textToSearch.EndsWith(text, StringComparison.InvariantCultureIgnoreCase);
}
if (wildCardIndex == (searchText.Length - 1))
{
var text = searchText.TrimEnd('*');
return textToSearch.StartsWith(text, StringComparison.InvariantCultureIgnoreCase);
}
}
return false;
}

Getting expression text

I want to pass the name of a property of a model to a method. Instead of using the name as string, I am using lambda expression as it is easy to make a typo, and also property names may be changed. Now if the property is a simple property (e.g: model.Name) I can get the name from the expression. But if it is a nested property (e.g: model.AnotherModel.Name) then how can I get full text ("AnotherModel.Name") from the expression. For example, I have the following classes:
public class BaseModel
{
public ChildModel Child { get; set; }
public List<ChildModel> ChildList { get; set; }
public BaseModel()
{
Child = new ChildModel();
ChildList = new List<ChildModel>();
}
}
public class ChildModel
{
public string Name { get;set; }
}
public void GetExpressionText<T>(Expression<Func<T, object>> expression)
{
string expText;
//what to do??
return expText;
}
GetExpressionText<BaseModel>(b => b.Child); //should return "Child"
GetExpressionText<BaseModel>(b => b.Child.Name); //should return "Child.Name"
GetExpressionText<BaseModel>(b => b.ChildList[0].Name); //should return "ChildList[0].Name"
My first thought was to use expression.Body.ToString() and tweak that a bit, but you would still need to deal with Unary (convert) etc. Assuming this is for logging and you want more control, the below can be used for formatting as wanted (e.g. if you want Child->Name for display purposes, string.Join("->",..) can be used). It may not be complete, but should you find any unsupported types, they should be easy to add.
PS: this post was generated before the question was closed. Just noticed it was reopend and submitting it now, but I haven't checked if particulars have been changed.
public string GetName(Expression e, out Expression parent)
{
if(e is MemberExpression m){ //property or field
parent = m.Expression;
return m.Member.Name;
}
else if(e is MethodCallExpression mc){
string args = string.Join(",", mc.Arguments.SelectMany(GetExpressionParts));
if(mc.Method.IsSpecialName){ //for indexers, not sure this is a safe check...
return $"{GetName(mc.Object, out parent)}[{args}]";
}
else{ //other method calls
parent = mc.Object;
return $"{mc.Method.Name}({args})";
}
}
else if(e is ConstantExpression c){ //constant value
parent = null;
return c.Value?.ToString() ?? "null";
}
else if(e is UnaryExpression u){ //convert
parent= u.Operand;
return null;
}
else{
parent =null;
return e.ToString();
}
}
public IEnumerable<string> GetExpressionParts(Expression e){
var list = new List<string>();
while(e!=null && !(e is ParameterExpression)){
var name = GetName(e,out e);
if(name!=null)list.Add(name);
}
list.Reverse();
return list;
}
public string GetExpressionText<T>(Expression<Func<T, object>> expression) => string.Join(".", GetExpressionParts(expression.Body));
You could use the C# 6.0 feature: nameof(b.Child) "Used to obtain the simple (unqualified) string name of a variable, type, or member."
which will also change on renaming. But this will only return the propertyname and not the complete path. Returning a complete path will be difficult, because only one instance is passed.
Closest i know right now is by simply using expression.Body.ToString() which would result in b.ChildList.get_Item(0).Name as a result.
You would still have to remove the first b. from the string if not wanted, and you could go even further to your intended output with Regex by replacing the get_Item(0) with the typical Index-Accessor.
(Also i had to make the ChildList and the Name-Property of ChildModel public to get it to work)
This Should get you most of the way there:
public static string GetFullPath<T>(Expression<Func<T>> action)
{
var removeBodyPath = new Regex(#"value\((.*)\).");
var result = action.Body.ToString();
var replaced = removeBodyPath.Replace(result, String.Empty);
var seperatedFiltered = replaced.Split('.').Skip(1).ToArray();
return string.Join(".", seperatedFiltered);
}
It gets ugly quite quickly...
public static string GetExpressionText<T>(Expression<Func<T, object>> expression)
{
bool needDot = false;
Expression exp = expression.Body;
string descr = string.Empty;
while (exp != null)
{
if (exp.NodeType == ExpressionType.MemberAccess)
{
// Property or field
var ma = (MemberExpression)exp;
descr = ma.Member.Name + (needDot ? "." : string.Empty) + descr;
exp = ma.Expression;
needDot = true;
}
else if (exp.NodeType == ExpressionType.ArrayIndex)
{
// Array indexer
var be = (BinaryExpression)exp;
descr = GetParameters(new ReadOnlyCollection<Expression>(new[] { be.Right })) + (needDot ? "." : string.Empty) + descr;
exp = be.Left;
needDot = false;
}
else if (exp.NodeType == ExpressionType.Index)
{
// Object indexer (not used by C#. See ExpressionType.Call)
var ie = (IndexExpression)exp;
descr = GetParameters(ie.Arguments) + (needDot ? "." : string.Empty) + descr;
exp = ie.Object;
needDot = false;
}
else if (exp.NodeType == ExpressionType.Parameter)
{
break;
}
else if (exp.NodeType == ExpressionType.Call)
{
var ca = (MethodCallExpression)exp;
if (ca.Method.IsSpecialName)
{
// Object indexer
bool isIndexer = ca.Method.DeclaringType.GetDefaultMembers().OfType<PropertyInfo>().Where(x => x.GetGetMethod() == ca.Method).Any();
if (!isIndexer)
{
throw new Exception();
}
}
else if (ca.Object.Type.IsArray && ca.Method.Name == "Get")
{
// Multidimensiona array indexer
}
else
{
throw new Exception();
}
descr = GetParameters(ca.Arguments) + (needDot ? "." : string.Empty) + descr;
exp = ca.Object;
needDot = false;
}
}
return descr;
}
private static string GetParameters(ReadOnlyCollection<Expression> exps)
{
var values = new string[exps.Count];
for (int i = 0; i < exps.Count; i++)
{
if (exps[i].NodeType != ExpressionType.Constant)
{
throw new Exception();
}
var ce = (ConstantExpression)exps[i];
// Quite wrong here... We should escape string values (\n written as \n and so on)
values[i] = ce.Value == null ? "null" :
ce.Type == typeof(string) ? "\"" + ce.Value + "\"" :
ce.Type == typeof(char) ? "'" + ce.Value + "\'" :
ce.Value.ToString();
}
return "[" + string.Join(", ", values) + "]";
}
The code is quite easy to read, but it is quite long... There are 4 main cases: MemberAccess, that is accessing a property/field, ArrayIndex that is using the indexer of a single-dimensional array, Index that is unused by the C# compiler, but that should be using the indexer of an object (like the [...] of the List<> you are using), and Call that is used by C# for using an indexer or for accessing multi-dimensional arrays (new int[5, 4]) (and for other method calls, but we disregard them).
I support multidimensional arrays, jagged array s(arrays of arrays, new int[5][]) or arrays of indexable objects (new List<int>[5]) or indexable objects of indexable objects (new List<List<int>>). There is even support for multi-property indexers (indexers that use more than one key value, like obj[1, 2]). Small problem: printing the "value" of the indexers: I support only null, integers of various types, chars and strings (but I don't escape them... ugly... if there is a \n then it won't be printed as \n). Other types are not really supported... They will print what they will print (see GetParameters() if you want)

How to compare two csv files by 2 columns?

I have 2 csv files
1.csv
spain;russia;japan
italy;russia;france
2.csv
spain;russia;japan
india;iran;pakistan
I read both files and add data to lists
var lst1= File.ReadAllLines("1.csv").ToList();
var lst2= File.ReadAllLines("2.csv").ToList();
Then I find all unique strings from both lists and add it to result lists
var rezList = lst1.Except(lst2).Union(lst2.Except(lst1)).ToList();
rezlist contains this data
[0] = "italy;russia;france"
[1] = "india;iran;pakistan"
At now I want to compare, make except and union by second and third column in all rows.
1.csv
spain;russia;japan
italy;russia;france
2.csv
spain;russia;japan
india;iran;pakistan
I think I need to split all rows by symbol ';' and make all 3 operations (except, distinct and union) but cannot understand how.
rezlist must contains
india;iran;pakistan
I added class
class StringLengthEqualityComparer : IEqualityComparer<string>
{
public bool Equals(string x, string y)
{
...
}
public int GetHashCode(string obj)
{
...
}
}
StringLengthEqualityComparer stringLengthComparer = new StringLengthEqualityComparer();
var rezList = lst1.Except(lst2,stringLengthComparer ).Union(lst2.Except(lst1,stringLengthComparer),stringLengthComparer).ToList();
Your question is not very clear: for instance, is india;iran;pakistan the desired result primarily because russia is at element[1]? Isn't it also included because element [2] pakistan does not match france and japan? Even though thats unclear, I assume the desired result comes from either situation.
Then there is this: find all unique string from both lists which changes the nature dramatically. So, I take it that the desired results are because "iran" appears in column[1] no where else in column[1] in either file and even if it did, that row would still be unique due to "pakistan" in col[2].
Also note that a data sample of 2 leaves room for a fair amount of error.
Trying to do it in one step makes it very confusing. Since eliminating dupes found in 1.CSV is pretty easy, do it first:
// parse "1.CSV"
List<string[]> lst1 = File.ReadAllLines(#"C:\Temp\1.csv").
Select(line => line.Split(';')).
ToList();
// parse "2.CSV"
List<string[]> lst2 = File.ReadAllLines(#"C:\Temp\2.csv").
Select(line => line.Split(';')).
ToList();
// extracting once speeds things up in the next step
// and leaves open the possibility of iterating in a method
List<List<string>> tgts = new List<List<string>>();
tgts.Add(lst1.Select(z => z[1]).Distinct().ToList());
tgts.Add(lst1.Select(z => z[2]).Distinct().ToList());
var tmpLst = lst2.Where(x => !tgts[0].Contains(x[1]) ||
!tgts[1].Contains(x[2])).
ToList();
That results in the items which are not in 1.CSV (no matching text in Col[1] nor Col[2]). If that is really all you need, you are done.
Getting unique rows within 2.CSV is trickier because you have to actually count the number of times each Col[1] item occurs to see if it is unique; then repeat for Col[2]. This uses GroupBy:
var unique = tmpLst.
GroupBy(g => g[1], (key, values) =>
new GroupItem(key,
values.ToArray()[0],
values.Count())
).Where(q => q.Count == 1).
GroupBy(g => g.Data[2], (key, values) => new
{
Item = string.Join(";", values.ToArray()[0]),
Count = values.Count()
}
).Where(q => q.Count == 1).Select(s => s.Item).
ToList();
The GroupItem class is trivial:
class GroupItem
{
public string Item { set; get; } // debug aide
public string[] Data { set; get; }
public int Count { set; get; }
public GroupItem(string n, string[] d, int c)
{
Item = n;
Data = d;
Count = c;
}
public override string ToString()
{
return string.Join(";", Data);
}
}
It starts with tmpList, gets the rows with a unique element at [1]. It uses a class for storage since at this point we need the array data for further review.
The second GroupBy acts on those results, this time looking at col[2]. Finally, it selects the joined string data.
Results
Using 50,000 random items in File1 (1.3 MB), 15,000 in File2 (390 kb). There were no naturally occurring unique items, so I manually made 8 unique in 2.CSV and copied 2 of them into 1.CSV. The copies in 1.CSV should eliminate 2 if the 8 unique rows in 2.CSV making the expected result 6 unique rows:
NepalX and ItalyX were the repeats in both files and they correctly eliminated each other.
With each step it is scanning and working with less and less data, which seems to make it pretty fast for 65,000 rows / 130,000 data elements.
your GetHashCode()-Method in EqualityComparer are buggy. Fixed version:
public int GetHashCode(string obj)
{
return obj.Split(';')[1].GetHashCode();
}
now the result are correct:
// one result: "india;iran;pakistan"
btw. "StringLengthEqualityComparer"is not a good name ;-)
private void GetUnion(List<string> lst1, List<string> lst2)
{
List<string> lstUnion = new List<string>();
foreach (string value in lst1)
{
string valueColumn1 = value.Split(';')[0];
string valueColumn2 = value.Split(';')[1];
string valueColumn3 = value.Split(';')[2];
string result = lst2.FirstOrDefault(s => s.Contains(";" + valueColumn2 + ";" + valueColumn3));
if (result != null)
{
if (!lstUnion.Contains(result))
{
lstUnion.Add(result);
}
}
}
}
class Program
{
static void Main(string[] args)
{
var lst1 = File.ReadLines(#"D:\test\1.csv").Select(x => new StringWrapper(x)).ToList();
var lst2 = File.ReadLines(#"D:\test\2.csv").Select(x => new StringWrapper(x));
var set = new HashSet<StringWrapper>(lst1);
set.SymmetricExceptWith(lst2);
foreach (var x in set)
{
Console.WriteLine(x.Value);
}
}
}
struct StringWrapper : IEquatable<StringWrapper>
{
public string Value { get; }
private readonly string _comparand0;
private readonly string _comparand14;
public StringWrapper(string value)
{
Value = value;
var split = value.Split(';');
_comparand0 = split[0];
_comparand14 = split[14];
}
public bool Equals(StringWrapper other)
{
return string.Equals(_comparand0, other._comparand0, StringComparison.OrdinalIgnoreCase)
&& string.Equals(_comparand14, other._comparand14, StringComparison.OrdinalIgnoreCase);
}
public override bool Equals(object obj)
{
if (ReferenceEquals(null, obj)) return false;
return obj is StringWrapper && Equals((StringWrapper) obj);
}
public override int GetHashCode()
{
unchecked
{
return ((_comparand0 != null ? StringComparer.OrdinalIgnoreCase.GetHashCode(_comparand0) : 0)*397)
^ (_comparand14 != null ? StringComparer.OrdinalIgnoreCase.GetHashCode(_comparand14) : 0);
}
}
}

How to getting distinct values by linq or lambda?

I have a list of items, and i try to getting unique items by distinct keys.
The class:
class TempClass
{
public string One { get; set; }
public string Two { get; set; }
public string Key
{
get
{
return "Key_" + One + "_" + Two;
}
}
}
I build the dummy list as follows:
List<TempClass> l = new List<TempClass>()
{
new TempClass(){ One="Da" , Two = "Mi"},
new TempClass(){ One="Da" , Two = "Mi"},
new TempClass(){ One="Da" , Two = "Mi"},
new TempClass(){ One="Mi" , Two = "Da"},
new TempClass(){ One="Mi" , Two = "Da"},
};
My question is - how get only 1 item? by check that does exist only unique key? unique item means that should to check that have there only one key that equals to "Key_Da_Mi" or "Key_Mi_Da"?
how to achieve that?
Group each of the items on a HashSet of strings containing both keys, use HashSet's set comparer to compare the items as sets (sets are unordered) and then pull out the first (or whichever) item from each group:
var distinct = l.GroupBy(item => new HashSet<string>() { item.One, item.Two },
HashSet<string>.CreateSetComparer())
.Select(group => group.First());
You should either implement equality comparison, or implement IEqualityComparer<T> with your specific logic:
class TempClassEqualityComparer : IEqualityComparer<TempClass>
{
public bool Equals(TempClass x, TempClass y)
{
if (Object.ReferenceEquals(x, y)) return true;
if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
return false;
// For comparison check both combinations
return (x.One == y.One && x.Two == y.Two) || (x.One == y.Two && x.Two == y.One);
}
public int GetHashCode(TempClass x)
{
if (Object.ReferenceEquals(x, null)) return 0;
return x.One.GetHashCode() ^ x.Two.GetHashCode();
}
}
Then you can use this comparer in Distinct method:
var result = l.Distinct(new TempClassEqualityComparer());
Just order them before you create the key.
public string Key
{
get{
List<string> l = new List<string>{One, Two};
l = l.OrderBy(x => x).ToList();
return "Key_" + string.Join("_", l);
}
}

fill in delimited sequence

I have a file that contains a list of delimited sequence numbers as a record key. I need to fill in the missing sequence. So if I have
8
8.2
8.3.4.1
I need to add
8.1
8.3
8.3.1
8.3.2
8.3.3
8.3.4
I have come up with a few algorithms but they're all horribly complex and have too many cases. Is there an easy way to do this or do I have to plod through? I'm using c# but Java would do.
Not sure if my solution is easy to understand, but let's try. The idea is that we recursively insert missing sequences between existing ones.
First, you need to parse your file to create a List of items representing existing sequence. Every item should have reference to the next one (linked list idea).
public class Item
{
public int Value { get; set; }
public Item SubItem { get; set; }
public Item NextItem { get; set; }
public Item(int value, Item subItem)
{
Value = value;
SubItem = subItem;
}
public Item CreatePreviousItem()
{
if (SubItem == null)
{
return Value == 1 ? null : new Item(Value - 1, null);
}
return new Item(Value, SubItem.CreatePreviousItem());
}
public bool IsItemMissingPrior(Item item)
{
if (item == null)
{
return false;
}
return
item.Value - Value > 1
|| (SubItem == null && item.SubItem != null && item.SubItem.Value > 1) //edge case
|| (SubItem != null && SubItem.IsItemMissingPrior(item.SubItem));
}
public override string ToString()
{
return Value + (SubItem != null ? "." + SubItem : "");
}
}
Assuming that sequences are delimited by new line symbol, you can use the following Parse method.
private List<Item> Parse(string s)
{
var result = new List<Item>();
var numberLines = s.Split(new[] {Environment.NewLine}, StringSplitOptions.RemoveEmptyEntries);
foreach (var numberLine in numberLines)
{
var numbers = numberLine.Split(new[] {'.'}).Reverse();
Item itemInstance = null;
foreach (var number in numbers)
{
itemInstance = new Item(Convert.ToInt32(number), itemInstance);
}
if (result.Count > 0)
{
result.Last().NextItem = itemInstance;
}
result.Add(itemInstance);
}
return result;
}
Here is a recursive method which inserts missing sequences between two existing ones
private void UpdateSequence(Item item)
{
if (item.IsItemMissingPrior(item.NextItem))
{
var inBetweenItem = item.NextItem.CreatePreviousItem();
inBetweenItem.NextItem = item.NextItem;
item.NextItem = inBetweenItem;
UpdateSequence(item);
}
}
And finally the use case:
var inputItems = Parse(inputString);
foreach (var item in inputItems)
{
UpdateSequence(item);
}
That's it. To see the result, you just need to get the first item from the list and keep moving forward using NextItem property. For example
var displayItem = inputItems.FirstOrDefault();
while (displayItem != null)
{
Console.WriteLine(displayItem.ToString());
displayItem = displayItem.NextItem;
}
Hope it helps.
One of the easy ways (though not optimal for some cases) will be maintaining a set of existing keys. For each key in your initial sequence you can add all the preceding keys to that set. This can be done in two loops: in inner loop you add a key to a set and decrease last number of a key by one while the number is more than zero, in outer loop you decrease the length of a key by one.
And then you just need to output the set in sorted order.

Categories