Arithmetic operations on a dictionary in C#

I have a dictionary declared as: Dictionary<string,string> lstTransList;
Each entry has the form |Key={Id}|Value={|QTY=4|PICKEDUP=2}|
I want to calculate the number of records and the difference between QTY and PICKEDUP for each record, and then the sum of QTY, the sum of PICKEDUP, and the sum of the differences.
Is there an efficient way to perform these arithmetic operations using LINQ?
I get the count with: int total_transactiondone = lstTransList.Count();
For the sum of QTY, I want to split each dictionary value on the 'QTY' and 'PICKEDUP' keys, but I don't know how to do that. Any suggestions?
I was thinking of using something like this:
decimal total_tickets = lstTransList.Sum(x => Convert.ToDecimal(x.Value));
Edit:
I am now using the following approach to achieve this. Is there a more efficient way to do it? Please suggest.
var lstTransList = objRedis.GetAllEntriesFromHash(strHashey);
DataTable dtTransList = new DataTable();
dtTransList.Columns.Add("TransId");
dtTransList.Columns.Add("Qty", typeof(int));
dtTransList.Columns.Add("PickedUp", typeof(int));
dtTransList.Columns.Add("UnPickedUp", typeof(int));
foreach (KeyValuePair<string, string> entry in lstTransList)
{
DataRow DR = dtTransList.NewRow();
if (!string.IsNullOrEmpty(entry.Key))
{
DR[0] = entry.Key;
}
if (!string.IsNullOrEmpty(entry.Value))
{
clsKeyValueParser objKV = new clsKeyValueParser(entry.Value);
DR[1] = Convert.ToInt32(objKV.strGetValue("QTY", "0"));
DR[2] = Convert.ToInt32(objKV.strGetValue("PICKEDUP", "0"));
DR[3] = Convert.ToInt32(DR[1]) - Convert.ToInt32(DR[2]);
}
dtTransList.Rows.Add(DR);
}
int total_transactiondone = dtTransList.Rows.Count;
int total_tickets = dtTransList.AsEnumerable().Sum(x => x.Field<int>(1));
int total_ticketsunpickedup = dtTransList.AsEnumerable().Sum(x => x.Field<int>(3));
int total_pickeduptransaction = dtTransList.AsEnumerable().Count(row => row.Field<int>(3) == 0);
int total_unpickeduptransaction = dtTransList.AsEnumerable().Count(row => row.Field<int>(3) != 0);

Try this to get the total of QTY (requires using System.Text.RegularExpressions):
decimal total_tickets_qty = lstTransList.Sum(x => Convert.ToDecimal(Regex.Match(x.Value.Split('|')[1], @"\d+").Value));
Total of PICKEDUP:
decimal total_tickets_pickedup = lstTransList.Sum(x => Convert.ToDecimal(Regex.Match(x.Value.Split('|')[2], @"\d+").Value));
Total of the difference:
decimal total_tickets = lstTransList.Sum(x => Convert.ToDecimal(Regex.Match(x.Value.Split('|')[1], @"\d+").Value) - Convert.ToDecimal(Regex.Match(x.Value.Split('|')[2], @"\d+").Value));
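The three queries above split and regex-match every value several times. As a rough alternative, each value can be parsed once and all totals derived from the parsed result. This is only a sketch: the class and method names (TransactionTotals, ParseFields, Summarize) are made up for illustration, and it assumes values look like |QTY=4|PICKEDUP=2, optionally wrapped in braces, with whole-number quantities.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class TransactionTotals
{
    // Parses a value such as "|QTY=4|PICKEDUP=2" into name/number pairs.
    // Assumption: every field has the form NAME=integer.
    public static Dictionary<string, int> ParseFields(string value)
    {
        return value
            .Trim('{', '}')
            .Split(new[] { '|' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(part => part.Split('='))
            .Where(pair => pair.Length == 2)
            .ToDictionary(
                pair => pair[0].Trim(),
                pair => int.Parse(pair[1].Trim(), CultureInfo.InvariantCulture),
                StringComparer.OrdinalIgnoreCase);
    }

    // Returns the record count and the three sums in a single pass over the parsed rows.
    public static (int Count, int Qty, int PickedUp, int UnPickedUp) Summarize(
        IDictionary<string, string> transactions)
    {
        var rows = transactions.Values
            .Where(v => !string.IsNullOrEmpty(v))
            .Select(v => ParseFields(v))
            .Select(f => new { Qty = Get(f, "QTY"), PickedUp = Get(f, "PICKEDUP") })
            .ToList();

        int qty = rows.Sum(r => r.Qty);
        int pickedUp = rows.Sum(r => r.PickedUp);
        return (rows.Count, qty, pickedUp, qty - pickedUp);
    }

    // Missing fields count as 0, mirroring the "0" default used in the question.
    private static int Get(Dictionary<string, int> fields, string name)
    {
        return fields.TryGetValue(name, out var v) ? v : 0;
    }
}
```

Calling TransactionTotals.Summarize(lstTransList) would then give the count and all three totals without building a DataTable. Note that, unlike the DataTable version in the question, entries with an empty value are skipped rather than counted.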

Related

Rounding to 2 decimal points a DataTable based NuGet package

I'm using ConsoleTableExt to print a formatted table, and I want to round a few decimal values to 2 places. The NuGet package takes a DataTable as input. Currently I'm using .ToString("f2"), but I read a post saying it is bad practice to store formatted strings in a DataTable. Any suggestions?
// Main()
var tableBuilder = ConsoleTableBuilder.From(backtest.ReportResults(backtestResults)).WithFormat(ConsoleTableBuilderFormat.Alternative);
tableBuilder.ExportAndWriteLine();
public static DataTable ReportResults(List<BacktestResult> backtestResults)
{
var table = new DataTable();
table.Columns.Add("Pair", typeof(string));
table.Columns.Add("Trades", typeof(int));
table.Columns.Add("Average Profit %", typeof(string));
table.Columns.Add("Cumulative Profit %", typeof(string));
table.Columns.Add($"Total Profit", typeof(string));
table.Columns.Add($"Total Profit %", typeof(string));
foreach (var pair in _backtestOptions.Pairs)
{
var results = backtestResults.Where(e => e.Pair.Equals(pair)).ToList();
var trades = results.Count;
var profitMean = results.Count > 0 ? results.Average(e => e.ProfitPercentage) : 0;
var profitMeanPercentage = results.Count > 0 ? results.Average(e => e.ProfitPercentage) * 100 : 0;
var profitSum = results.Sum(e => e.ProfitPercentage);
var profitSumPercentage = results.Sum(e => e.ProfitPercentage) * 100;
var profitTotalAbs = results.Sum(e => e.ProfitAbs);
var profitTotal = results.Sum(e => e.ProfitAbs) / 1;
var profitTotalPercentage = results.Sum(e => e.ProfitPercentage) * 100;
table.Rows.Add(pair, trades, profitMeanPercentage.ToString("f2"), profitSumPercentage.ToString("f2"), profitTotalAbs.ToString("f8"),
profitTotalPercentage.ToString("f2"));
}
return table;
}
You can use Math.Round to round the value. The overload Math.Round(decimal d, int decimals, MidpointRounding mode) also lets you specify how to round a value that lies midway between two other numbers.
Example:
decimal decimalVal = 123.456M;
decimal rounded = Math.Round(decimalVal, 2); // 123.46
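Applied to the DataTable from the question, this could mean typing the column as decimal and storing the rounded number instead of a formatted string. The following is a minimal sketch with made-up sample values and a hypothetical ReportRounding class; how ConsoleTableExt renders decimal cells is not covered here and would need checking.

```csharp
using System;
using System.Data;

public static class ReportRounding
{
    // Builds a small table whose numeric column stays numeric (decimal),
    // so it can still be sorted or summed after rounding.
    public static DataTable BuildTable()
    {
        var table = new DataTable();
        table.Columns.Add("Pair", typeof(string));
        table.Columns.Add("Average Profit %", typeof(decimal));

        // Hypothetical sample values, rounded to 2 places before storing.
        table.Rows.Add("PAIR-A", Math.Round(1.23456m, 2, MidpointRounding.AwayFromZero));
        table.Rows.Add("PAIR-B", Math.Round(0.125m, 2, MidpointRounding.AwayFromZero));
        return table;
    }
}
```

With MidpointRounding.AwayFromZero, 0.125 becomes 0.13; the default mode (MidpointRounding.ToEven) would give 0.12 instead.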

C# Constructing a Dynamic Query From DataTable

I'm trying to generate a dynamic LINQ query based on a DataTable that is returned to me. The column names in the DataTable will change, but I will know which columns I want to total and which ones I want to group by.
I can get this to work with loops, writing the output to a variable and then recasting the parts back into a DataTable, but I'm hoping there is a more elegant way of doing this.
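// C# (DynamicData1, DynamicData2 and DynamicCount are placeholders for the varying column names)
DataTable dt = new DataTable();
dt.Columns.Add(DynamicData1);
dt.Columns.Add(DynamicData2);
dt.Columns.Add(DynamicCount, typeof(int));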
In this case the columns are LastName, FirstName, Age. I want to total ages by LastName,FirstName columns (yes both in the group by). So one of my parameters would specify group by = LastName, FirstName and another TotalBy = Age. The next query may return different column names.
// Sample rows
dt.Rows.Add("Smith", "John", 10);
dt.Rows.Add("Smith", "John", 11);
dt.Rows.Add("Smith", "Sarah", 8);
Given these different potential columns names...I'm looking to generate a linq query that creates a generic group by and Total output.
Result:
LastName, FirstName, AgeTotal
Smith, John = 21
Smith, Sarah = 8
If you use a simple converter for LINQ, you can achieve that easily.
Here is some quick data generation I did for the sample:
// create dummy table
var dt = new DataTable();
dt.Columns.Add("LastName", typeof(string));
dt.Columns.Add("FirstName", typeof(string));
dt.Columns.Add("Age", typeof(int));
// action to create easily the records
var addData = new Action<string, string, int>((ln, fn, age) =>
{
var dr = dt.NewRow();
dr["LastName"] = ln;
dr["FirstName"] = fn;
dr["Age"] = age;
dt.Rows.Add(dr);
});
// add 3 datarows records
addData("Smith", "John", 10);
addData("Smith", "John", 11);
addData("Smith", "Sarah", 8);
This is how to use my simple transformation class :
// create a linq version of the table
var lqTable = new LinqTable(dt);
// make the group by query
var groupByNames = lqTable.Rows.GroupBy(row => row["LastName"].ToString() + "-" + row["FirstName"].ToString()).ToList();
// for each group create a brand new linqRow
var linqRows = groupByNames.Select(grp =>
{
//get all items. so we can use first item for last and first name and sum the age easily at the same time
var items = grp.ToList();
// return a new linq row
return new LinqRow()
{
Fields = new List<LinqField>()
{
new LinqField("LastName",items[0]["LastName"].ToString()),
new LinqField("FirstName",items[0]["FirstName"].ToString()),
new LinqField("Age",items.Sum(item => Convert.ToInt32(item["Age"]))),
}
};
}).ToList();
// create a new LinqTable, since it handles the DataTable format and converts to it directly
var finalTable = new LinqTable() { Rows = linqRows }.AsDataTable();
And finally, here are the custom classes that are used:
public class LinqTable
{
public LinqTable()
{
}
public LinqTable(DataTable sourceTable)
{
LoadFromTable(sourceTable);
}
public List<LinqRow> Rows = new List<LinqRow>();
public List<string> Columns
{
get
{
var columns = new List<string>();
if (Rows != null && Rows.Count > 0)
{
Rows[0].Fields.ForEach(field => columns.Add(field.Name));
}
return columns;
}
}
public void LoadFromTable(DataTable sourceTable)
{
sourceTable.Rows.Cast<DataRow>().ToList().ForEach(row => Rows.Add(new LinqRow(row)));
}
public DataTable AsDataTable()
{
var dt = new DataTable("data");
if (Rows != null && Rows.Count > 0)
{
Rows[0].Fields.ForEach(field =>
{
dt.Columns.Add(field.Name, field.DataType);
});
Rows.ForEach(row =>
{
var dr = dt.NewRow();
row.Fields.ForEach(field => dr[field.Name] = field.Value);
dt.Rows.Add(dr);
});
}
return dt;
}
}
public class LinqRow
{
public List<LinqField> Fields = new List<LinqField>();
public LinqRow()
{
}
public LinqRow(DataRow sourceRow)
{
sourceRow.Table.Columns.Cast<DataColumn>().ToList().ForEach(col => Fields.Add(new LinqField(col.ColumnName, sourceRow[col], col.DataType)));
}
public object this[int index]
{
get
{
return Fields[index].Value;
}
set
{
Fields[index].Value = value;
}
}
public object this[string name]
{
get
{
return Fields.Find(f => f.Name == name).Value;
}
set
{
var fieldIndex = Fields.FindIndex(f => f.Name == name);
if (fieldIndex >= 0)
{
Fields[fieldIndex].Value = value;
}
}
}
public DataTable AsSingleRowDataTable()
{
var dt = new DataTable("data");
if (Fields != null && Fields.Count > 0)
{
Fields.ForEach(field =>
{
dt.Columns.Add(field.Name, field.DataType);
});
var dr = dt.NewRow();
Fields.ForEach(field => dr[field.Name] = field.Value);
dt.Rows.Add(dr);
}
return dt;
}
}
public class LinqField
{
public Type DataType;
public object Value;
public string Name;
public LinqField(string name, object value, Type dataType)
{
DataType = dataType;
Value = value;
Name = name;
}
public LinqField(string name, object value)
{
DataType = value.GetType();
Value = value;
Name = name;
}
public override string ToString()
{
return Value.ToString();
}
}
I think I'd just use a dictionary:
public Dictionary<string, int> GroupTot(DataTable dt, string[] groupBy, string tot){
var d = new Dictionary<string, int>();
foreach(DataRow ro in dt.Rows){
string key = "";
foreach(string col in groupBy)
key += (string)ro[col] + '\n';
if(!d.ContainsKey(key))
d[key] = 0;
d[key]+= (int)ro[tot];
}
return d;
}
If you want the total on each row, we could get cute and create a column that is an array of one int instead of an int:
public void GroupTot(DataTable dt, string[] groupBy, string tot){
var d = new Dictionary<string, int[]>();
var dc = dt.Columns.Add("Total_" + tot, typeof(int[]));
foreach(DataRow ro in dt.Rows){
string key = "";
foreach(string col in groupBy)
key += (string)ro[col] + '\n'; //build a grouping key from first and last name
if(!d.ContainsKey(key)) //have we seen this name pair before?
d[key] = new int[1]; //no we haven't, ensure we have a tracker for our total, for this first+last name
d[key][0] += (int)ro[tot]; //add the total
ro[dc] = d[key]; //link the row to the total tracker
}
}
At the end of the operation every row has an int[] in the "Total_Age" column that holds the total for that First+Last name. I used int[] rather than int because int is a value type, whereas int[] is a reference type. As the table is iterated, each row is assigned a reference to an int[], and rows with the same First+Last name end up with references to the same object in memory, so incrementing the total for a later row also updates all the earlier ones (every "John Smith" row's total column holds a reference to the same int[]).
If we had made the column an int, every row would hold a different counter, because each time we say ro[dc] = d[key] the current value of d[key] would be copied into the row. Any reference type would do for this trick to work, but value types wouldn't. If you want the column to be a value type, you have to iterate the table again, or keep two dictionaries, one of which maps DataRow -> total, and iterate its keys, assigning the totals back into the rows.
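For comparison, the grouping can also be written with LINQ's GroupBy, using the same idea of a string key built from the group-by columns. This is a sketch under a few assumptions: the DynamicGrouping class and GroupAndTotal method are hypothetical names, the totalled column is assumed to convert to int, and the key uses '\n' as a separator on the assumption that it does not occur in the data.

```csharp
using System;
using System.Data;
using System.Linq;

public static class DynamicGrouping
{
    // Groups the rows of 'source' by the named columns and totals one column.
    // The result has the group-by columns plus "<totalBy>Total".
    public static DataTable GroupAndTotal(DataTable source, string[] groupBy, string totalBy)
    {
        var result = new DataTable();
        foreach (var col in groupBy)
            result.Columns.Add(col, source.Columns[col].DataType);
        result.Columns.Add(totalBy + "Total", typeof(int));

        // Composite key: the group-by values joined with a separator.
        var groups = source.AsEnumerable()
            .GroupBy(row => string.Join("\n", groupBy.Select(col => Convert.ToString(row[col]))));

        foreach (var group in groups)
        {
            var first = group.First();
            var values = groupBy
                .Select(col => first[col])
                .Concat(new object[] { group.Sum(row => Convert.ToInt32(row[totalBy])) })
                .ToArray();
            result.Rows.Add(values);
        }
        return result;
    }
}
```

For the sample data, GroupAndTotal(dt, new[] { "LastName", "FirstName" }, "Age") would produce two rows: Smith, John, 21 and Smith, Sarah, 8.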

How to replace values in 2-D string array

How to perform this task?
int Amount, x=1000,y=200;
string BasedOn="x*12/100+y*5/100";
//In place of x and y in BasedOn I want to replace with x,y values like (1000*12%+200*5%)
//and the calculated result(130) should be assigned to Amount variable
For now, I split the BasedOn string
string[][] strings = BasedOn
.Split(new char[] { '+' }, StringSplitOptions.RemoveEmptyEntries)
.Select(w => w.Split('*').ToArray())
.ToArray();
What to do next? Please help me.
I made your code more flexible
static void Main(string[] args)
{
Dictionary<string, object> variables = new Dictionary<string, object>();
variables.Add("x", 1000);
variables.Add("y", 200);
string equation = "x*12/100+y*5/100";
var result = Calculate(variables, equation);
}
static object Calculate(Dictionary<string, object> variables, string equation)
{
variables.ToList().ForEach(v => equation = equation.Replace(v.Key, v.Value.ToString()));
return new DataTable().Compute(equation, string.Empty);
}
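One caveat with plain string.Replace is that a variable name that is part of another name (say x and max) would be replaced in both places. A possible variation is to substitute whole identifiers with a regular expression before handing the string to DataTable.Compute. This is a sketch only; ExpressionEvaluator is a hypothetical name, and it assumes variable names start with a letter or underscore.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Globalization;
using System.Text.RegularExpressions;

public static class ExpressionEvaluator
{
    // Replaces each whole identifier found in 'variables' with its value,
    // then evaluates the resulting arithmetic expression.
    public static double Calculate(IDictionary<string, object> variables, string equation)
    {
        string substituted = Regex.Replace(equation, @"[A-Za-z_]\w*", match =>
            variables.TryGetValue(match.Value, out var value)
                ? Convert.ToString(value, CultureInfo.InvariantCulture)
                : match.Value); // unknown names are left untouched

        object result = new DataTable().Compute(substituted, string.Empty);
        return Convert.ToDouble(result, CultureInfo.InvariantCulture);
    }
}
```

With x = 1000 and y = 200, the equation "x*12/100+y*5/100" becomes "1000*12/100+200*5/100" and evaluates to 130.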
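You can take a look at the DataTable.Compute method.
In your case it can be used like this (percentages are written as /100, because % is the modulus operator in DataTable expressions):
using System.Data;
DataTable dt = new DataTable();
var Amount = dt.Compute("1000*12/100+200*5/100", "");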
For replacing "x" and "y" with numeric values you can use string.Replace
Take a look at NCalc (http://ncalc.codeplex.com/):
Expression e = new Expression("2 + 3 * 5");
Debug.Assert(17 == e.Evaluate());
If you want to replace the string AND calculate the result, you can do:
int Amount, x=1000,y=200;
string BasedOn=$"{x}*12/100+{y}*5/100";
DataTable dt = new DataTable();
var v = dt.Compute(BasedOn,"");
v will be your result (130).
EDIT: You have to replace each % with a division by 100, because the DataTable treats % as the modulus operator.
With DataColumn.Expression:
int Amount, x = 1000, y = 200; string BasedOn = "x*12%+y*5%";
var dt = new DataTable();
dt.Columns.Add("x", typeof(int)); // in Visual Studio 2015 you can use nameof(x) instead of "x"
dt.Columns.Add("y", typeof(int));
dt.Columns.Add("Amount", typeof(int), BasedOn.Replace("%", "/100")); // "x*12/100+y*5/100" ("%" is considered modulus operator)
Amount = (int)dt.Rows.Add(x, y)["Amount"]; // 130
With RegEx.Replace and DataTable.Compute:
string expression = Regex.Replace(BasedOn.Replace("%", "/100"), "[a-zA-Z]+",
m => (m.Value == "x") ? x + "" : (m.Value == "y") ? y + "" : m.Value); // "1000*12/100+200*5/100"
Amount = (int)(double)new DataTable().Compute(expression, ""); // 130.0 (double)

C# - Looking for the list of duplicated rows (need optimization)

I would like to optimize this C# code, if possible.
With fewer than 1000 lines it's fine, but with 10000 or more it starts to take a while.
Here is a small benchmark:
5000 lines => ~2s
15000 lines => ~20s
25000 lines => ~50s
I'm looking for duplicated lines.
The SequenceEqual call used to compare values may be the problem (in my "benchmark", 4 fields are treated as key fields).
Here is the code :
private List<DataRow> GetDuplicateKeys(DataTable table, List<string> keyFields)
{
Dictionary<List<object>, int> keys = new Dictionary<List<object>, int>(); // List of key values + their index in table
List<List<object>> duplicatedKeys = new List<List<object>>(); // List of duplicated keys values
List<DataRow> duplicatedRows = new List<DataRow>(); // Rows that are duplicated
foreach (DataRow row in table.Rows)
{
// Find keys fields values for the row
List<object> rowKeys = new List<object>();
keyFields.ForEach(keyField => rowKeys.Add(row[keyField]));
// Check if those keys are already defined
bool alreadyDefined = false;
foreach (List<object> keyValue in keys.Keys)
{
if (rowKeys.SequenceEqual(keyValue))
{
alreadyDefined = true;
break;
}
}
if (alreadyDefined)
{
duplicatedRows.Add(row);
// If first duplicate for this key, add the first occurence of this key
if (!duplicatedKeys.Contains(rowKeys))
{
duplicatedKeys.Add(rowKeys);
int i = keys[keys.Keys.First(key => key.SequenceEqual(rowKeys))];
duplicatedRows.Add(table.Rows[i]);
}
}
else
{
keys.Add(rowKeys, table.Rows.IndexOf(row));
}
}
return duplicatedRows;
}
Any ideas ?
I think this is the fastest and shortest way to find duplicate rows.
For 100,000 rows it executes in about 250 ms.
Main and test data:
static void Main(string[] args)
{
var dt = new DataTable();
dt.Columns.Add("Id");
dt.Columns.Add("Value1");
dt.Columns.Add("Value2");
var rnd = new Random(DateTime.Now.Millisecond);
for (int i = 0; i < 100000; i++)
{
var dr = dt.NewRow();
dr[0] = rnd.Next(1, 1000);
dr[1] = rnd.Next(1, 1000);
dr[2] = rnd.Next(1, 1000);
dt.Rows.Add(dr);
}
Stopwatch sw = new Stopwatch();
sw.Start();
var duplicates = GetDuplicateRows(dt, "Id", "Value1", "Value2");
sw.Stop();
Console.WriteLine(
"Found {0} duplicates in {1} milliseconds.",
duplicates.Count,
sw.ElapsedMilliseconds);
Console.ReadKey();
}
GetDuplicateRows with LINQ:
private static List<DataRow> GetDuplicateRows(DataTable table, params string[] keys)
{
var duplicates =
table
.AsEnumerable()
.GroupBy(dr => String.Join("-", keys.Select(k => dr[k])), (groupKey, groupRows) => new { Key = groupKey, Rows = groupRows })
.Where(g => g.Rows.Count() > 1)
.SelectMany(g => g.Rows)
.ToList();
return duplicates;
}
Explanation (for those who are new to LINQ):
The trickiest part is probably the GroupBy. Its first parameter takes a DataRow and, for each row, builds a group key by joining the values of the specified key columns into a string such as 1-1-2. The second parameter just selects the group key and the group's rows into a new anonymous object. Then I keep the groups that contain more than 1 row and flatten them back into a list with SelectMany.
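A related option is to keep the dictionary from the question but give it a comparer that compares key values element by element. A Dictionary keyed on List<object> compares keys by reference by default, which is why the original code has to scan every key with SequenceEqual. The sketch below uses hypothetical names (KeyValuesComparer, DuplicateFinder) and is not benchmarked here.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Compares key arrays by their contents instead of by reference.
public sealed class KeyValuesComparer : IEqualityComparer<object[]>
{
    public bool Equals(object[] a, object[] b)
    {
        return a.SequenceEqual(b);
    }

    public int GetHashCode(object[] values)
    {
        unchecked
        {
            int hash = 17;
            foreach (var v in values)
                hash = hash * 31 + (v == null ? 0 : v.GetHashCode());
            return hash;
        }
    }
}

public static class DuplicateFinder
{
    // Buckets rows by their key values with hashed lookups,
    // then returns every row whose key occurs more than once.
    public static List<DataRow> GetDuplicateRows(DataTable table, params string[] keyFields)
    {
        var groups = new Dictionary<object[], List<DataRow>>(new KeyValuesComparer());
        foreach (DataRow row in table.Rows)
        {
            var key = keyFields.Select(f => row[f]).ToArray();
            if (!groups.TryGetValue(key, out var rows))
            {
                rows = new List<DataRow>();
                groups[key] = rows;
            }
            rows.Add(row);
        }
        return groups.Values.Where(g => g.Count > 1).SelectMany(g => g).ToList();
    }
}
```

Because the key is never turned into a string, this variant also avoids the case where two different value combinations join to the same text (for example "1-1" + "2" versus "1" + "1-2").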
Try this. Using more LINQ may improve performance; also try PLINQ if possible.
private List<DataRow> GetDuplicateKeys(DataTable table, List<string> keyFields)
{
Dictionary<List<object>, int> keys = new Dictionary<List<object>, int>(); // List of key values + their index in table
List<List<object>> duplicatedKeys = new List<List<object>>(); // List of duplicated keys values
List<DataRow> duplicatedRows = new List<DataRow>(); // Rows that are duplicated
foreach (DataRow row in table.Rows)
{
// Find keys fields values for the row
List<object> rowKeys = new List<object>();
keyFields.ForEach(keyField => rowKeys.Add(row[keyField]));
// Check if those keys are already defined
bool alreadyDefined = keys.Keys.Any(keyValue => rowKeys.SequenceEqual(keyValue));
if (alreadyDefined)
{
duplicatedRows.Add(row);
// If first duplicate for this key, add the first occurence of this key
if (!duplicatedKeys.Contains(rowKeys))
{
duplicatedKeys.Add(rowKeys);
int i = keys[keys.Keys.First(key => key.SequenceEqual(rowKeys))];
duplicatedRows.Add(table.Rows[i]);
}
}
else
{
keys.Add(rowKeys, table.Rows.IndexOf(row));
}
}
return duplicatedRows;
}

How to count and sum total of DataTable with LINQ?

I have a DataTable with an "amount" column in each row, and I'd like to get the total sum over all rows. I'd also like to get the total number of rows in the DataTable. Could anyone show me how to do this with LINQ instead of the ordinary way?
Number of rows:
DataTable dt; // ... populate DataTable
var count = dt.Rows.Count;
Sum of the "amount" column:
DataTable dt; // ... populate DataTable
var sum = dt.AsEnumerable().Sum(dr => dr.Field<int>("amount"));
Aggregate allows you to avoid enumerating the rows twice (you could get the row count from the rows collection but this is more to show how to extract multiple aggregates in 1 pass):
var sumAndCount = table.AsEnumerable().Aggregate(new { Sum = 0d, Count = 0},
(data, row) => new { Sum = data.Sum + row.Field<double>("amount"), Count = data.Count + 1});
double sum = sumAndCount.Sum;
int count = sumAndCount.Count;
decimal[] Amount = {2,3,5 };
var sum = Amount.Sum();
var count = Amount.Count();
Based on Roy Goode's answer, you could also create an extension method:
public static int Sum(this DataTable table, string Column)
{
return table.AsEnumerable().Sum(dr => dr.Field<int>(Column));
}
Unfortunately you can't be more generic here, because there is no where T : numeric constraint.
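On .NET 7 or later, the generic math interfaces in System.Numerics come close to such a constraint. The following is a sketch that assumes a .NET 7+ target; SumColumn and DataTableSumExtensions are hypothetical names, and the column is assumed to contain no DBNull values.

```csharp
using System.Data;
using System.Linq;
using System.Numerics;

public static class DataTableSumExtensions
{
    // Sums a column of any numeric type that implements INumber<T>
    // (int, long, double, decimal, ...). Requires .NET 7 or later.
    public static T SumColumn<T>(this DataTable table, string column)
        where T : struct, INumber<T>
    {
        return table.AsEnumerable()
            .Aggregate(T.Zero, (total, row) => total + row.Field<T>(column));
    }
}
```

Usage would look like decimal total = dt.SumColumn<decimal>("amount"); where the type argument has to match the column's actual data type.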
