remove duplicate values from a list with multiple properties - c#

I have a list of MyClass:
class MyClass
{
public DateTime? DueDate;
public string Desc;
public Decimal Amount;
}
var sample = new List<MyClass>();
This is what the sample data looks like:
DueDate Desc Amount
06-29-2015 ABC 100
06-29-2015 DEF 200
01-15-2015 ABC 100
01-15-2015 DEF 200
Output I want, in this format:
DueDate     Desc  Amount
06-29-2015  ABC   100
            DEF   200
01-15-2015  ABC   100
            DEF   200
So basically I would like to blank out repeated DueDate values while keeping the Desc and Amount values of every row.
I tried this, but it removes the other columns' values for the repeated rows as well:
var test = sample.GroupBy(d => d.DueDate).Select(a => a.First()).ToList();
Any suggestions?

Here's how to "remove" (set to null) duplicate, adjacent DueDates from the sample list:
sample.GroupBy(d => d.DueDate).ToList()
.ForEach(g => g.Skip(1).ToList().ForEach(o => o.DueDate = null));
This works by grouping on DueDate and, for each group, skipping the first element and setting DueDate to null on the rest. Note that GroupBy collects all items with the same date, whether or not they are adjacent in the list.
Output with format:
Console.WriteLine("DueDate Desc Amount");
foreach (var item in sample)
{
var dateString = item.DueDate != null
? item.DueDate.Value.ToString("MM-dd-yyyy")
: string.Empty;
Console.WriteLine(dateString.PadRight(12) + item.Desc + " " + item.Amount);
}
Result:
DueDate Desc Amount
06-29-2015  ABC 100
            DEF 200
01-15-2015  ABC 100
            DEF 200
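If modifying the original list is undesirable, the same grouping idea can produce display lines without touching the items. The following is only a sketch under that assumption: `DueDateFormatter` and `ToDisplayLines` are hypothetical names, and the invariant culture is used so the date text does not depend on machine settings.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

class MyClass
{
    public DateTime? DueDate;
    public string Desc;
    public decimal Amount;
}

static class DueDateFormatter
{
    // Hypothetical helper: returns one display line per item and shows the
    // date only on the first item of each DueDate group. The input items
    // are not modified.
    public static List<string> ToDisplayLines(IEnumerable<MyClass> items)
    {
        return items
            .GroupBy(i => i.DueDate)
            .SelectMany(g => g.Select((item, index) =>
            {
                var date = index == 0 && item.DueDate.HasValue
                    ? item.DueDate.Value.ToString("MM-dd-yyyy", CultureInfo.InvariantCulture)
                    : string.Empty;
                return date.PadRight(12) + item.Desc + " " + item.Amount;
            }))
            .ToList();
    }
}

class Program
{
    static void Main()
    {
        var sample = new List<MyClass>
        {
            new MyClass { DueDate = new DateTime(2015, 6, 29), Desc = "ABC", Amount = 100 },
            new MyClass { DueDate = new DateTime(2015, 6, 29), Desc = "DEF", Amount = 200 },
            new MyClass { DueDate = new DateTime(2015, 1, 15), Desc = "ABC", Amount = 100 },
            new MyClass { DueDate = new DateTime(2015, 1, 15), Desc = "DEF", Amount = 200 }
        };

        Console.WriteLine("DueDate Desc Amount");
        foreach (var line in DueDateFormatter.ToDisplayLines(sample))
            Console.WriteLine(line);
    }
}
```

Because nothing is set to null, the list can still be sorted or filtered by date afterwards.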

var finalData = data
    .GroupBy(d => d.DueDate)
    .Select(g => new
    {
        DueDate = g.Key,
        Values = g.Select(d2 => new { d2.Desc, d2.Amount })
    });
The final structure would be:
finalData = [
    {
        DueDate: '06-29-2015',
        Values: [{Desc: "ABC", Amount: 100}, {Desc: "DEF", Amount: 200}]
    },
    {...}
]
EDIT:
var finalData = data
    .GroupBy(d => d.DueDate)
    .Select(g => new
    {
        DueDate = g.Key,
        Values = g.Select(d2 => d2)
    })
    .ToDictionary(o => o.DueDate, o => o.Values);
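One caveat with the dictionary version: in the question DueDate is a DateTime?, and ToDictionary throws an ArgumentNullException when a key is null. A possible alternative, assuming items without a due date should be kept, is ToLookup, which accepts a null key. This is only a sketch; `DueDateIndex`, `Build`, and the "GHI" row are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class MyClass
{
    public DateTime? DueDate;
    public string Desc;
    public decimal Amount;
}

static class DueDateIndex
{
    // Hypothetical helper: indexes items by DueDate. Unlike ToDictionary,
    // ToLookup tolerates a null key, and asking for a missing key returns
    // an empty sequence instead of throwing.
    public static ILookup<DateTime?, MyClass> Build(IEnumerable<MyClass> items)
    {
        return items.ToLookup(i => i.DueDate);
    }
}

class Program
{
    static void Main()
    {
        var sample = new List<MyClass>
        {
            new MyClass { DueDate = new DateTime(2015, 6, 29), Desc = "ABC", Amount = 100 },
            new MyClass { DueDate = new DateTime(2015, 6, 29), Desc = "DEF", Amount = 200 },
            new MyClass { DueDate = null, Desc = "GHI", Amount = 300 } // made-up row without a date
        };

        var byDate = DueDateIndex.Build(sample);

        // All items due on 06-29-2015.
        foreach (var item in byDate[new DateTime(2015, 6, 29)])
            Console.WriteLine(item.Desc + " " + item.Amount);

        // Number of items that have no due date.
        Console.WriteLine(byDate[null].Count());
    }
}
```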

What you want is a pivot table. This is how it is done:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
MyClass myClass = new MyClass();
myClass.Load();
myClass.CreatePivotTable();
}
}
class MyClass
{
public static List<MyClass> samples = new List<MyClass>();
public DateTime dueDate { get; set; }
public string desc { get; set; }
public Decimal amount { get; set; }
public static DataTable dt = new DataTable();
public void Load()
{
samples = new List<MyClass>() {
new MyClass() { dueDate = DateTime.Parse("06-29-2015"), desc = "ABC", amount = 100},
new MyClass() { dueDate = DateTime.Parse("06-29-2015"), desc = "DEF", amount = 200},
new MyClass() { dueDate = DateTime.Parse("01-15-2015"), desc = "ABC", amount = 100},
new MyClass() { dueDate = DateTime.Parse("01-15-2015"), desc = "DEF", amount = 200}
};
}
public void CreatePivotTable()
{
string[] uniqueDescription = samples.Select(x => x.desc).Distinct().ToArray();
dt.Columns.Add("Due Date", typeof(DateTime));
foreach (string desc in uniqueDescription)
{
dt.Columns.Add(desc, typeof(decimal));
}
var groups = samples.GroupBy(x => x.dueDate);
foreach(var group in groups)
{
DataRow newRow = dt.Rows.Add();
newRow["Due Date"] = group.Key;
foreach (string col in uniqueDescription)
{
newRow[col] = group.Where(x => x.desc == col).Sum(x => x.amount);
}
}
}
}
}

I'd simply loop through your records once they are in the correct order. Start with a variable that holds the last date seen. If the next record has the same date, don't print the date. If the next record has a different date, print it and overwrite the variable for the following iterations.
Yes, I know LINQ and lambdas are cool (and I love them too), but in this case a plain loop seems appropriate to me.
// DueDate is a DateTime? in the question, so the tracker is nullable too.
DateTime? last = DateTime.MinValue;
foreach (var f in sample.OrderBy(x => x.DueDate))
{
    if (f.DueDate.Equals(last))
        Console.WriteLine("{0}\t{1}\t{2}", "SKIP DATE", f.Desc, f.Amount);
    else
    {
        // A null DueDate prints as an empty date column.
        Console.WriteLine("{0}\t{1}\t{2}", f.DueDate?.ToShortDateString(), f.Desc, f.Amount);
        last = f.DueDate;
    }
}

Based on your latest comments, I have edited my answer.
As I understand it, your requirements are:
Group by DueDate, and only allow the first item of each group to keep its DueDate.
The results must have the same structure as the input.
To clear DueDate on every item after the first in a group, the field must be nullable: public DateTime? DueDate;. You can then assign null to the subsequent items in the group.
//New list to hold our new items
var outputList = new List<MyClass>();
//Groups all the items together by DueDate
foreach(var grouping in samples.GroupBy(d => d.DueDate))
{
//Iterates through all items in a group (selecting the index as well)
foreach(var item in grouping.Select((Value, Index) => new { Value, Index }))
{
//If this is any item after the first one, we remove the due date
if(item.Index > 0)
{
item.Value.DueDate = null;
}
outputList.Add(item.Value);
}
}

Related

c# linq add a column to a new list from another single column list

My project is MVC5. I have a table with multiple rows for the same day, and I need the total for each day. I use the following:
var days = db.Nutrition.Where(x => x.Day >= fromDate
&& x.Day <= toDate).DistinctBy(x => x.Day).AsEnumerable().ToList();
List<double?> calories = new List<double?>();
foreach (var item in days)
{
calories.Add(days.Where(c => c.Day==item.Day).Select(x => x.Calories).Sum());
}
I get a list containing the totals. Now I need to make a new list that has two columns.
I made the following model:
public class Consumption
{
public virtual double? Calories { get; set; }
public virtual string Name { get; set; }
}
I tried to use the following to generate the new list:
List<Consumption> newList = new List<Consumption>();
var name = new Consumption { Name = "name" };
foreach (var item in calories)
{
newList.Add(name, Calories = (double)item.Value);
}
I get the following error:
The name 'Calories' does not exist in the current context
Edit
Thanks to Stephen's comment, I achieved the same result with a single statement:
var results = db.Nutrition.Where(x => x.Day >= fromDate && x.Day <= toDate).GroupBy(l => l.Day)
.Select(cl => new { Name = "name", Calories = cl.Sum(c => c.Calories)}).ToList();
Try with:
List<Consumption> newList = new List<Consumption>();
foreach (var item in calories)
{
    // Build each Consumption with an object initializer, then add it.
    var cal = new Consumption { Name = "name", Calories = (double)item.Value };
    newList.Add(cal);
}
You received this compiler error
The name 'Calories' does not exist in the current context
because the List<Consumption>.Add(Consumption item) method on your variable newList accepts only one argument of type Consumption.
From your intentions and the discussion in the comments with @StephenMuecke, it became clear that you want to GroupBy the DateTime property Day, Sum the double property Calories, and project the result into a List<Consumption>.
var dateTimeFormat = "yyyy-dd-MM";
var results = db.Nutrition.Where(x => x.Day >= fromDate && x.Day <= toDate)
.GroupBy(x => x.Day)
.Select(groupedX => new Consumption
{
Name = groupedX.Key.ToString(dateTimeFormat),
Calories = groupedX.Sum(y => y.Calories)
}).ToList();

DataTable: how to get the duplicates and the row number of the duplicates

I have the following DataTable:
Article Price
ART1 99
ART2 100
ART3 150
ART2 90
ART1 50
Now I need to create a new DataTable listing the positions of the duplicates, like this:
Article Duplicates
ART1 1,5
ART2 2,4
ART3
ART2 2,4
ART1 1,5
So the key is the "Article" column.
I have only found LINQ examples that show which values are duplicated and how many times they repeat.
How can I achieve something like this with LINQ?
Thank you.
You can use this approach:
var articleLookup = yourTable.AsEnumerable()
.Select((row, index) => new { Row = row, RowNum = index + 1 })
.ToLookup(x=> x.Row.Field<string>("Article"));
DataTable dupTable = new DataTable();
dupTable.Columns.Add("Article");
dupTable.Columns.Add("Duplicates");
foreach(DataRow row in yourTable.Rows)
{
DataRow addedRow = dupTable.Rows.Add();
string article = row.Field<string>("Article");
var dupRowNumList = articleLookup[article].Select(x => x.RowNum).ToList();
string dupRowNumText = dupRowNumList.Count == 1 ? "" : String.Join(",", dupRowNumList);
addedRow.SetField("Article", article);
addedRow.SetField("Duplicates", dupRowNumText);
}
Hi, I tried your exact requirement using a List of objects and could get the result you expect. The important part is the LINQ query that produces the result.
Here is the Main class:
class Program
{
static void Main(string[] args)
{
List<data> datas = new List<data>();
datas.Add(new data() {atricle = "ART1", price = 99});
datas.Add(new data() { atricle = "ART2", price = 100 });
datas.Add(new data() { atricle = "ART3", price = 150 });
datas.Add(new data() { atricle = "ART2", price = 90 });
datas.Add(new data() { atricle = "ART1", price = 50 });
Console.WriteLine("Article | Duplicates");
foreach (data templist in datas)
{
var duplicates = datas.Select((data, index) => new {atricle = data.atricle, Index = index + 1})
.Where(x => x.atricle == templist.atricle)
.GroupBy(pair => pair.atricle)
.Where(g => g.Count() > 1)
.Select(grp => grp.Select(g => g.Index.ToString()).ToArray())
.ToArray();
string joined = duplicates.Length>0 ? string.Join(",", duplicates[0].ToList()):"";
Console.WriteLine($"{templist.atricle} | {joined}");
}
Console.ReadLine();
}
}
Here is the Data model class
public class data{
public string atricle { get; set; }
public int price { get; set; }
}

Loop through a list and combine totals where an identifier is the same

I currently have a list of Car objects.
The properties are:
Make
Model
Service Cost
Let's say the list is filled with:
Ferrari, F50, 300
Porsche, 911, 700
Toyota, Camary, 300
Porsche, 911, 400
BMW, Z4, 1200
Porsche, 911, 900
Porsche, 356A, 700
As you can see, my list contains three records where the Porsche 911 has service costs.
How would I loop through my list, find the duplicate 911's and combine them to form one single record? So that I end up with:
Ferrari, F50, 300
Porsche, 911, 2000
Toyota, Camary, 300
BMW, Z4, 1200
Porsche, 356A, 700
What I've done so far is not going to work, as my records would probably end up in the wrong places:
List<Car> CombinedCarRecords = new List<Car>(CarDetailRecords); //Original list here being used
List<Car> NormalList = new List<Car>();
List<Car> NewList = new List<Car>();//Making new lists to keep the records in
Car normalRecord = new Car();
Car combinedRecord = new Car();//new objects to keep the values in and add the others
string oldVal = "";
string newVal = "";//used to find the same cars
foreach (var item in CombinedCarRecords )
{
normalRecord = new ClaimDetailRecord();
combinedRecord = new ClaimDetailRecord();
oldVal = newVal;
newVal = item.Model;
if (oldVal == newVal)
{
combinedRecord = item;
CombinedCarRecords.Add(combinedRecord);
}
else
{
normalRecord = item;
NormalList.Add(normalRecord);
}
}//I think the way I try to find the records here is not working, as the old and new values will always be different, if maybe not for some records where they are right after each other. But there is still that initial one
decimal totalP = 0;
if (CombinedTariffsRecords.Count > 0)
{
foreach (var item in CombinedTariffsRecords)
{
}
}
else
NewList = NormalList;
//This is where I'm supposed to add up the totals, but I think my code before will render this code useless
So in all, I have tried, but I cannot come up with a better way to store the values and combine my records.
The easiest way is to use LINQ's Enumerable.GroupBy and Sum:
var newCarList = NormalList
.GroupBy(c => new { c.Make, c.Model })
.Select(carGroup => new Car{
Make = carGroup.Key.Make,
Model = carGroup.Key.Model,
ServiceCost = carGroup.Sum(c => c.ServiceCost)
})
.ToList();
void Main()
{
var cars = new List<Car>
{
new Car { Make = "Ferrari", Model = "F50", ServiceCost = 1000 },
new Car { Make = "Ferrari", Model = "F50", ServiceCost = 2000 },
new Car { Make = "Porsche", Model = "911", ServiceCost = 2000 }
};
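// Group on an anonymous type so that Make and Model are compared separately
// (a concatenated string key could merge different Make/Model pairs).
var grouped = cars.GroupBy(car => new { car.Make, car.Model }).Select(g => new { g.Key, Costs = g.Sum(e => e.ServiceCost), Element = g.First() });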
foreach (var g in grouped)
{
Console.WriteLine("{0} {1}: {2}", g.Element.Make, g.Element.Model, g.Costs);
}
}
class Car
{
public string Model { get; set; }
public string Make { get; set; }
public decimal ServiceCost { get; set; }
}

Join and subtract values from 2 lists using linq

I have 2 lists that have objects of { DT (date), Value (double) }.
I want to join on date and subtract the two values. However, sometimes one list won't have a record for a given DT, in which case I want to use the value from the list that does. Because I'm using a join, I currently get no record at all for that DT. Is there a way to express this with SQL-like LINQ query syntax?
I know I could loop over one list myself and search for each date in the other, but doing it all in a single LINQ statement seems cleaner.
I believe this is what you can do:
var result = (from x in list1 select new Item() { date = x.date, value = x.value - (from y in list2 where x.date.Equals(y.date) select y.value).FirstOrDefault() }).ToList();
Feel free to run the test ConsoleApp I wrote:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace StackOverFlowConsoleApplication
{
class Program
{
static void Main(string[] args)
{
List<Item> list1 = new List<Item>()
{
new Item(){date = DateTime.Today, value=100},
new Item(){date = DateTime.Today.AddDays(-1), value=100}
};
List<Item> list2 = new List<Item>()
{
new Item(){date = DateTime.Today, value=50}
};
var result = (from x in list1 select new Item() { date = x.date, value = x.value - (from y in list2 where x.date.Equals(y.date) select y.value).FirstOrDefault() }).ToList();
}
class Item
{
public DateTime date { get; set; }
public double value { get; set; }
}
}
}
Say your class is named Blub and looks something like this:
public class Blub
{
public DateTime DT { get; set; }
public double Value { get; set; }
}
And you have two lists of it:
var list1 = new List<Blub>();
var list2 = new List<Blub>();
Then you can find the difference for each date using this LINQ query:
var differences = from x1 in list1
join x2 in list2 on x1.DT equals x2.DT into temp
from x2 in temp.DefaultIfEmpty()
select new Blub
{
DT = x1.DT,
Value = x1.Value - (x2 != null ? x2.Value : 0.0)
};
The DefaultIfEmpty() method turns the join into an outer join, ensuring you get a join pair of (x1, null) if there is no matching x2 for any given DT.
PS: It is surely a matter of personal taste, but I find this quite readable.
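Both queries above start from list1, so a date that exists only in list2 produces no result. If such dates should appear too, as the question suggests, one possible sketch is to take the union of the dates from both lists. It assumes each date occurs at most once per list (ToDictionary throws otherwise) and that a date found in only one list keeps that list's value unchanged; `BlubDiff` and `Subtract` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Blub
{
    public DateTime DT { get; set; }
    public double Value { get; set; }
}

public static class BlubDiff
{
    // Hypothetical helper: list1 minus list2 per date, covering every date
    // that appears in either list. Results are ordered by date.
    public static List<Blub> Subtract(IEnumerable<Blub> list1, IEnumerable<Blub> list2)
    {
        // Assumption: dates are unique within each list.
        var left = list1.ToDictionary(b => b.DT, b => b.Value);
        var right = list2.ToDictionary(b => b.DT, b => b.Value);

        return left.Keys.Union(right.Keys)
            .OrderBy(d => d)
            .Select(d =>
            {
                double l, r;
                bool hasLeft = left.TryGetValue(d, out l);
                bool hasRight = right.TryGetValue(d, out r);
                // Both sides present: subtract. Otherwise use whichever exists.
                double value = hasLeft && hasRight ? l - r : (hasLeft ? l : r);
                return new Blub { DT = d, Value = value };
            })
            .ToList();
    }
}

class Program
{
    static void Main()
    {
        var today = new DateTime(2015, 6, 29); // arbitrary example date
        var list1 = new List<Blub>
        {
            new Blub { DT = today, Value = 100 },
            new Blub { DT = today.AddDays(-1), Value = 100 }
        };
        var list2 = new List<Blub>
        {
            new Blub { DT = today, Value = 50 },
            new Blub { DT = today.AddDays(1), Value = 30 }
        };

        foreach (var b in BlubDiff.Subtract(list1, list2))
            Console.WriteLine(b.DT.ToString("yyyy-MM-dd") + " " + b.Value);
    }
}
```

If a date missing from list1 should instead count as zero (giving a negative result), replace the final fallback `r` with `-r`.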

Group by same value and contiguous date

var myDic = new SortedDictionary<DateTime,int> ()
{ { new DateTime(0), 0 },
{ new DateTime(1), 1 },
{ new DateTime(2), 1 },
{ new DateTime(3), 0 },
{ new DateTime(4), 0 },
{ new DateTime(5), 2 }
};
How can I group these items (with a LINQ query) like this:
group 1 :
startDate: 0, endDate:0, value:0
group 2 :
startDate: 1, endDate:2, value:1
group 3 :
startDate: 3, endDate:4, value:0
group 4 :
startDate: 5, endDate:5, value:2
Groups are defined by contiguous dates with the same value.
Is it possible with a GroupBy?
Just use a key-generating function. This example presumes your dates are already ordered in the source with no gaps.
int currentValue = 0;
int groupCounter = 0;
Func<KeyValuePair<DateTime, int>, int> keyGenerator = kvp =>
{
    // Start a new group whenever the value changes.
    if (kvp.Value != currentValue)
    {
        groupCounter += 1;
        currentValue = kvp.Value;
    }
    return groupCounter;
};
List<IGrouping<int, KeyValuePair<DateTime, int>>> groups =
    myDic.GroupBy(keyGenerator).ToList();
It looks like you are trying to group sequential dates, splitting whenever the value changes. I don't think you should use LINQ for the grouping. Instead, use LINQ to order the dates and then iterate over the sorted list to create your groups.
Addition 1
While you may be able to build your collections using .Aggregate(), I still think that is the wrong approach.
Does your data have to enter this function as a SortedDictionary?
I'm just guessing, but these are probably records ordered chronologically.
If so, do this:
public class Record
{
public DateTime Date { get; set; }
public int Value { get; set; }
}
public class Grouper
{
public IEnumerable<IEnumerable<Record>> GroupRecords(IEnumerable<Record> sortedRecords)
{
var groupedRecords = new List<List<Record>>();
var recordGroup = new List<Record>();
groupedRecords.Add(recordGroup);
foreach (var record in sortedRecords)
{
if (recordGroup.Count > 0 && recordGroup.First().Value != record.Value)
{
recordGroup = new List<Record>();
groupedRecords.Add(recordGroup);
}
recordGroup.Add(record);
}
return groupedRecords;
}
}
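To get the startDate/endDate/value shape shown in the question, the same single pass can track a range instead of collecting the records. This is a sketch, not the answer's code: `ValueRange`, `RunGrouper`, and `ToRanges` are hypothetical names, and it assumes the input is already sorted by date and has no gaps, since only a change in value starts a new range.

```csharp
using System;
using System.Collections.Generic;

public class ValueRange
{
    public DateTime StartDate { get; set; }
    public DateTime EndDate { get; set; }
    public int Value { get; set; }
}

public static class RunGrouper
{
    // Hypothetical helper: collapses consecutive entries with the same
    // value into one range. Assumes the entries arrive sorted by date.
    public static List<ValueRange> ToRanges(IEnumerable<KeyValuePair<DateTime, int>> sortedItems)
    {
        var ranges = new List<ValueRange>();
        ValueRange current = null;
        foreach (var kvp in sortedItems)
        {
            if (current == null || current.Value != kvp.Value)
            {
                // The value changed (or this is the first entry): open a new range.
                current = new ValueRange { StartDate = kvp.Key, EndDate = kvp.Key, Value = kvp.Value };
                ranges.Add(current);
            }
            else
            {
                // Same value as before: extend the current range.
                current.EndDate = kvp.Key;
            }
        }
        return ranges;
    }
}

class Program
{
    static void Main()
    {
        // Same data as in the question; a SortedDictionary enumerates in key order.
        var myDic = new SortedDictionary<DateTime, int>
        {
            { new DateTime(0), 0 },
            { new DateTime(1), 1 },
            { new DateTime(2), 1 },
            { new DateTime(3), 0 },
            { new DateTime(4), 0 },
            { new DateTime(5), 2 }
        };

        foreach (var r in RunGrouper.ToRanges(myDic))
            Console.WriteLine("startDate: {0}, endDate: {1}, value: {2}",
                r.StartDate.Ticks, r.EndDate.Ticks, r.Value);
    }
}
```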
