Extracting a dictionary from sparse csv file

I have a sparsely populated Excel file, saved as CSV, and I want to extract two of its columns into a dictionary in C#. I tried the following, but it fails when it reads the blank lines. Is there a cleaner way to achieve the same thing? I don't care about any other values; a mapping of AR ID to AR Type is all I need.
public class Table
{
private Dictionary<string, string> _ARID_ARTypeValues = new Dictionary<string, string>();
private string _arId;
public Table(string arId)
{
_arId = arId;
}
public void AddValue(string key, string value)
{
_ARID_ARTypeValues.Add(key, value);
}
}
public static IDictionary ParseCsvFile(StreamReader reader)
{
Dictionary<string, Table> tables = new Dictionary<string, Table>();
// First line contains column names.
var columnNames = reader.ReadLine().Split(',');
for (int i = 1; i < columnNames.Length; ++i)
{
var columnName = columnNames[i];
var ntable = new Table(columnName);
if ((columnName == "AR ID") || (columnName == "AR Type"))
{
tables.Add(columnName, ntable);
}
}
var line = reader.ReadLine();
while (line != null)
{
var columns = line.Split(',');
for (int j = 1; j < columns.Length; ++j)
{
var table = tables[columnNames[j]];
table.AddValue(columns[0], columns[j]);
}
line = reader.ReadLine();
}
return tables;
}

I would just use a CSV library, such as CsvHelper, and read the CSV file with that.
Dictionary<string, string> arIdToArTypeMapping = new Dictionary<string, string>();
using (var sr = File.OpenText("test.csv"))
{
var csvConfiguration = new CsvConfiguration
{
SkipEmptyRecords = true
};
using (var csvReader = new CsvReader(sr, csvConfiguration))
{
while (csvReader.Read())
{
string arId = csvReader.GetField("AR ID");
string arType = csvReader.GetField("AR Type");
if (!string.IsNullOrEmpty(arId) && !string.IsNullOrEmpty(arType))
{
arIdToArTypeMapping.Add(arId, arType);
}
}
}
}
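If pulling in a library is not an option, the same idea can be sketched with only the base class library. The `SparseCsv` class and `ReadMapping` method are hypothetical names, and the sketch assumes that no field contains a quoted comma, so a plain Split(',') is enough. It finds the two columns by header name and skips blank lines and rows where either cell is empty.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of CsvHelper or any other library.
public static class SparseCsv
{
    // Builds a keyColumn -> valueColumn map from CSV text.
    // Assumption: fields never contain quoted commas.
    public static Dictionary<string, string> ReadMapping(
        TextReader reader, string keyColumn, string valueColumn)
    {
        var mapping = new Dictionary<string, string>();

        var headerLine = reader.ReadLine();
        if (headerLine == null)
            return mapping; // empty input

        // Locate both columns by name instead of by fixed position.
        string[] headers = headerLine.Split(',');
        int keyIndex = Array.FindIndex(headers, h => h.Trim() == keyColumn);
        int valueIndex = Array.FindIndex(headers, h => h.Trim() == valueColumn);
        if (keyIndex < 0 || valueIndex < 0)
            throw new InvalidDataException("Required column not found in header.");

        for (var line = reader.ReadLine(); line != null; line = reader.ReadLine())
        {
            if (string.IsNullOrWhiteSpace(line))
                continue; // blank line

            string[] fields = line.Split(',');
            if (fields.Length <= Math.Max(keyIndex, valueIndex))
                continue; // row too short to hold both columns

            string key = fields[keyIndex].Trim();
            string value = fields[valueIndex].Trim();
            if (key.Length == 0 || value.Length == 0)
                continue; // sparse row: one of the two cells is empty

            mapping[key] = value; // on duplicate keys the last row wins
        }
        return mapping;
    }
}
```

A call such as `SparseCsv.ReadMapping(File.OpenText("test.csv"), "AR ID", "AR Type")` would then return the AR ID to AR Type mapping. For files with quoted fields, a real CSV parser remains the safer choice.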

You can use Cinchoo ETL, an open source library, to read the CSV and convert it to a dictionary with the few lines of code shown below:
using (var parser = new ChoCSVReader("Dict1.csv")
.WithField("AR_ID", 7)
.WithField("AR_TYPE", 8)
.WithFirstLineHeader(true)
.Configure(c => c.IgnoreEmptyLine = true)
)
{
var dict = parser.ToDictionary(item => item.AR_ID, item => item.AR_TYPE);
foreach (var kvp in dict)
Console.WriteLine(kvp.Key + " " + kvp.Value);
}
Hope this helps.
Disclaimer: I'm the author of this library.

Related

R/W csv with dynamic size of columns to a lookup table by CSVHelper

I have a csv data like this
,Header1,Header2,Header3,...,ValueN
Key1,Value11,Value12,Value13,...,Value1N
Key2,Value21,Value22,Value23,...,Value2N
Key3,Value31,Value32,Value33,...,Value3N
...,...,...,...,...,...
KeyN,ValueN1,ValueN2,ValueN3,...,ValueNN
The number of columns is dynamic.
I want to load it into a lookup table
Dictionary<string, Dictionary<string, string>> lookup_table
so I can get data by
data = lookup_table[key_name][header_name]
Furthermore, I have to write the data back to the CSV file if it changes.
How should I create my class and mapping to read and write it?
CsvHelper version = 28.0.1

Apart from @dbc's comment that the order of the items may change because Dictionary<TKey, TValue> is unordered, this should work.
void Main()
{
var lookup_table = new Dictionary<string, Dictionary<string, string>>();
using (var reader = new StringReader(",Header1,Header2,Header3\nKey1,value11,value12,value13\nKey2,value21,value22,value23"))
using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
{
csv.Read();
csv.ReadHeader();
var headerLength = csv.Context.Reader.HeaderRecord.Length;
var header = csv.Context.Reader.HeaderRecord;
while (csv.Read())
{
var key = csv.GetField(0);
lookup_table.Add(key, new Dictionary<string, string>());
for (int i = 1; i < headerLength; i++)
{
lookup_table[key][header[i]] = csv.GetField(i);
}
}
}
using (var csv = new CsvWriter(Console.Out, CultureInfo.InvariantCulture))
{
var headers = lookup_table.First().Value.Keys.ToList();
csv.WriteField(string.Empty);
foreach (var header in headers)
{
csv.WriteField(header);
}
csv.NextRecord();
foreach (KeyValuePair<string, Dictionary<string, string>> entry in lookup_table)
{
csv.WriteField(entry.Key);
for (int i = 0; i < headers.Count; i++)
{
csv.WriteField(entry.Value[headers[i]]);
}
csv.NextRecord();
}
}
}

How can I reduce memory usage when parse json in c#

I'm trying to parse a huge JSON file into a 2D array.
The parsing works, but it needs almost 10 times the file size in memory.
My sample.json file has 100,000 rows, each with different items.
If sample.json is 500 MB, this code needs 5 GB.
How can I reduce memory usage?
I use Newtonsoft.Json on .NET 6.0.
Read from JSON:
static void Read()
{
List<Dictionary<string, string>> rows = new List<Dictionary<string, string>>();
string path = @"D:\small.json";
using (FileStream fsRead = File.Open(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
using (BufferedStream bsRead = new BufferedStream(fsRead))
using (StreamReader srRead = new StreamReader(bsRead))
{
string? line;
while ((line = srRead.ReadLine()) != null)
{
JObject jsonObject = JObject.Parse(line);
MakeRowData(jsonObject, out var row);
rows.Add(row);
}
}
}
Make row
private static void MakeRowData(JObject jsonData, out Dictionary<string, string> row)
{
Dictionary<string, string> output = new Dictionary<string, string>();
foreach (var item in jsonData)
{
int childSize = 0;
if (item.Value != null)
{
childSize = item.Value.Children().Count();
///if Item has child, explore deep
if (childSize > 0)
{
ExploreChild(item.Value, ref output);
}
///or not just add new item
else
{
string str = item.Value.ToString();
output[item.Key] = str ?? "";
}
}
}
row = output;
}
private static void ExploreChild(JToken jToken, ref Dictionary<string, string> row)
{
foreach (var item in jToken)
{
int childSize = item.Children().Count();
///if Item has child, explore deep
if (childSize > 0)
{
ExploreChild(item, ref row);
}
///or not just add new item
else
{
string path = jToken.Path.Replace('[', '(').Replace(']', ')');
string str = jToken.First.ToString();
row[path] = str?? "";
}
}
}
EDIT
Adding Sample.json.
It is a set of JSON strings, one per line, and the fields are not fixed.
Sample.json
{Field1:0,Field2:1,Field2:3}
{Field1:0,Field5:1,Field6:3}
{Field1:0,Field7:1,Field9:3}
{Field1:0,Field13:1,Field50:3,Field57:3}
...

You can try replacing the recursive exploring children with the iterative one. Something like this:
private static void MakeRowData(JObject jsonData, out Dictionary<string, string> row)
{
Dictionary<string, string> output = new Dictionary<string, string>();
foreach (var item in jsonData)
{
if (item.Value != null)
{
///if Item has child, explore deep
if (item.Value.HasValues)
{
var queue = new Queue<JToken>();
queue.Enqueue(item.Value);
while (queue.Any())
{
var currItem = queue.Dequeue();
if (currItem.HasValues)
{
foreach (var child in currItem) // enumerate the dequeued token, not the outer key/value pair
queue.Enqueue(child);
}
else
{
// add item without children to row here
}
}
}
///or not just add new item
else
{
string str = item.Value.ToString();
output[item.Key] = str ?? "";
}
}
}
row = output;
}
Unless it is tail recursion, each recursive call keeps the stack frame of the method it was called from, which can lead to extensive memory usage.
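To make the iterative idea concrete, here is a minimal, self-contained sketch that also fills in the "add item without children" branch. It swaps in System.Text.Json (part of .NET 6) instead of Newtonsoft.Json so that it needs no external package; `JsonFlattener` and `Flatten` are hypothetical names. Unlike the sample in the question, System.Text.Json requires property names to be quoted. The sketch makes no claim about how much memory it saves; it only shows the traversal with an explicit stack instead of recursion.

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper illustrating a non-recursive flattening pass.
public static class JsonFlattener
{
    // Flattens one JSON document into path -> value pairs.
    // Nested objects use "parent.child"; array items use "parent(index)".
    public static Dictionary<string, string> Flatten(string json)
    {
        var row = new Dictionary<string, string>();
        using JsonDocument document = JsonDocument.Parse(json);

        // Explicit stack of (path, element) pairs replaces the recursive calls.
        var pending = new Stack<(string Path, JsonElement Element)>();
        pending.Push(("", document.RootElement));

        while (pending.Count > 0)
        {
            var (path, element) = pending.Pop();
            switch (element.ValueKind)
            {
                case JsonValueKind.Object:
                {
                    foreach (JsonProperty property in element.EnumerateObject())
                    {
                        string childPath = path.Length == 0
                            ? property.Name
                            : path + "." + property.Name;
                        pending.Push((childPath, property.Value));
                    }
                    break;
                }
                case JsonValueKind.Array:
                {
                    int index = 0;
                    foreach (JsonElement child in element.EnumerateArray())
                    {
                        pending.Push(($"{path}({index})", child));
                        index++;
                    }
                    break;
                }
                default:
                    // Leaf value: store its text form under the accumulated path.
                    row[path] = element.ToString();
                    break;
            }
        }
        return row;
    }
}
```

In the reading loop from the question, each line would be passed to `JsonFlattener.Flatten(line)` and the returned dictionary added to `rows`.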

Inserting data to .CSV file at the same time using foreach

I am new here and very new to C#.
In a nutshell, I am using C# in Visual Studio. I read data from a database and want to save it to a .csv file. The problem is that I want to write the data into two columns at the same time.
My code does write the values to the file, but they are shifted and do not land on the right rows.
Dictionary<string, string> elementNames = new Dictionary<string, string>();
Dictionary<string, string> elementTypes = new Dictionary<string, string>();
var nodes = webservice.nepService.GetAllElementsOfElementType(webservice.ext, "Busbar", ref elementNames, ref elementTypes);
Dictionary<string, string> nodeResults = new Dictionary<string, string>();
Dictionary<string, string> nodeResults1 = new Dictionary<string, string>();
foreach (var nodename in elementNames.Values)
{
var nodeRes = webservice.nepService.GetResultElementByName(webservice.ext, nodename, "Busbar", -1, "LoadFlow", null);
var Uvolt = GetXMLAttribute(nodeRes, "U");
nodeResults.Add(nodename, Uvolt);
var Upercentage = GetXMLAttribute(nodeRes, "Up");
nodeResults1.Add(nodename, Upercentage);
StringBuilder strBldr = new StringBuilder();
string outputFile = @"C:\Users\12.csv";
string separator = ",";
foreach (var res in nodeResults)
{
strBldr.AppendLine($"{res.Key}{separator}{res.Value}");
}
foreach (var res1 in nodeResults1)
{
strBldr.AppendLine($"{separator}{separator}{res1.Value}");
}
File.WriteAllText(outputFile, strBldr.ToString());
}
This is the output of the previous code:
https://ibb.co/T4trQC3
I want these shifted values to move up beside the other values, like this:
https://ibb.co/4S25v0h
Thank you

If you look at the code, you are using AppendLine:
strBldr.AppendLine($"{separator}{separator}{res1.Value}");
If you want to append to the same line, just use Append:
strBldr.Append($"{separator}{separator}{res1.Value}");
EDITED:
In LINQ you can use the Zip function to zip two lists:
// using System.Linq;
var results = Results.Zip(Results1, (firstList, secondList) => firstList.Key + "," + firstList.Value + "," + secondList.Value);
Edit: full example
public static IDictionary<string, string> Results { get; set; }
public static IDictionary<string, string> Results1 { get; set; }
private static void Main(string[] args)
{
StringBuilder strBldr = new StringBuilder();
string outputFile = @"D:\12.csv";
Results = new Dictionary<string, string>()
{
{"N1", "20"},
{"N2", "0.399992"},
{"N3", "0.369442"},
{"N4", "0.369976"}
};
Results1 = new Dictionary<string, string>()
{
{"N1", "100"},
{"N2", "99.9805"},
{"N3", "92.36053"},
{"N4", "92.49407"}
};
IEnumerable<string> results = Results.Zip(Results1,
(firstList, secondList) => firstList.Key + "," + firstList.Value + "," + secondList.Value);
foreach (string res1 in results)
{
strBldr.AppendLine(res1);
}
File.WriteAllText(outputFile, strBldr.ToString());
}
For faster code, you can try this:
HashSet<Tuple<string, string, string>> values = new HashSet<Tuple<string, string, string>>();
var nodes = webservice.nepService.GetAllElementsOfElementType(webservice.ext, "Busbar", ref elementNames, ref elementTypes);
foreach (var nodename in elementNames.Values)
{
var nodeRes = webservice.nepService.GetResultElementByName(webservice.ext, nodename, "Busbar", -1, "LoadFlow", null);
var Uvolt = GetXMLAttribute(nodeRes, "U");
var Upercentage = GetXMLAttribute(nodeRes, "Up");
values.Add(Tuple.Create(nodename, Uvolt, Upercentage));
}
var output = string.Join("\n", values.ToList().Select(tuple => $"{tuple.Item1},{tuple.Item2},{tuple.Item3}").ToList());
string outputFile = @"C:\Users\12.csv";
File.WriteAllText(outputFile, output);

If the row counts of Results and Results1 are the same and the keys are in the same order, try:
for (int i = 0; i < Results.Count; i++)
{
// A dictionary cannot be indexed by position, so use ElementAt (System.Linq).
var res = Results.ElementAt(i);
strBldr.AppendLine($"{res.Key}{separator}{res.Value}{separator}{Results1.ElementAt(i).Value}");
}
Or, if the rows are not in the same order, look up the second value by key:
foreach (var res in Results)
strBldr.AppendLine($"{res.Key}{separator}{res.Value}{separator}{Results1[res.Key]}");
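The key-based variant can be packaged as a small helper. This is a minimal sketch under the assumption that a key missing from the second dictionary should produce an empty cell rather than an exception; `CsvRows` and `Build` are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical helper that joins two dictionaries into "key,value1,value2" rows.
public static class CsvRows
{
    public static string Build(
        IDictionary<string, string> first,
        IDictionary<string, string> second,
        string separator = ",")
    {
        var builder = new StringBuilder();
        foreach (KeyValuePair<string, string> entry in first)
        {
            // Look the second value up by key; a missing key yields an empty cell.
            string other = second.TryGetValue(entry.Key, out var found) ? found : "";
            builder.Append(entry.Key).Append(separator)
                   .Append(entry.Value).Append(separator)
                   .AppendLine(other);
        }
        return builder.ToString();
    }
}
```

The result can be written in one call, for example `File.WriteAllText(outputFile, CsvRows.Build(Results, Results1))`, so each node name, voltage, and percentage end up on the same row regardless of the order of the second dictionary.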

C# Creating named object for each object in LDAP DirectorySearch to insert into SQL database

I've created a DirectorySearcher to pull multiple properties from each user.
objSearchADAM = new DirectorySearcher(objADAM);
objSearchADAM.PropertiesToLoad.Add("givenname");
objSearchADAM.PropertiesToLoad.Add("lastlogontimestamp");
etc...
objSearchResults = objSearchADAM.FindAll();
I then enumerate them, convert the Integer8 timestamps to standard date/time values, and save them to a CSV file with
List<string> timeProps = new List<string>() { "lastlogontimestamp", "accountexpires", "pwdlastset", "lastlogoff", "lockouttime", "maxstorage", "usnchanged", "usncreated", "usndsalastobjremoved", "usnlastobjrem", "usnsource" };
foreach (SearchResult objResult in objSearchResults)
{
objEntry = objResult.GetDirectoryEntry();
ResultPropertyCollection myResultProp = objResult.Properties;
foreach (string myKey in myResultProp.PropertyNames)
{
foreach (Object myCollection in myResultProp[myKey])
{
Object sample = myCollection;
if (timeProps.Contains(myKey))
{
String times = sample.ToString();
long ft = Int64.Parse(times);
DateTime date;
try
{
date = DateTime.FromFileTime(ft);
}
catch (ArgumentOutOfRangeException ex)
{
date = DateTime.MinValue;
Console.WriteLine("Out of range: " + ft);
Console.WriteLine(ex.ToString());
}
sample = date;
Console.WriteLine("{0}{1}", myKey.PadRight(25), sample);
objWriter.WriteLine("{0}{1}", myKey.PadRight(25), sample);
}
else
{
Console.WriteLine("{0}{1}", myKey.PadRight(25), sample);
objWriter.WriteLine("{0}{1}", myKey.PadRight(25), sample);
}
}
Now I need to create an object for each user, holding the strings from each result, that I can pass to a SQL command I've built. The mapping from the LDAP query to SQL would be givenname = FirstName, lastlogontimestamp = LastLogon, and so on.
StringBuilder sb = new StringBuilder();
sb.Append("INSERT INTO activedirectory.dimUserST (FirstName, LastName) VALUES (@FirstName, @LastName)");
loadStagingCommand.Parameters.AddWithValue("@FirstName", FirstName).DbType = DbType.AnsiString;
etc...
loadStagingCommand.CommandText = sb.ToString();
loadStagingCommand.ExecuteNonQuery();
I tried to use IDictionary in my first foreach (similar to the code found here: http://ideone.com/vChWD) but couldn't get it to work. I read about IList and reflection, but I'm not sure how I could incorporate these.
UPDATE
I researched and found ExpandoObject, and attempted to write code based on what I saw in Creating Dynamic Objects.
However, when I run this new code, it prints "employeenumber System.Collections.Generic.List`1[System.Dynamic.ExpandoObject]".
if(employeeNumber.Contains(myKey))
{
string[] columnNames = { "EmployeeNumber" };
List<string[]> listOfUsers = new List<string[]>();
for (int i = 0; i < 10; i++)
{
listOfUsers.Add(new[] { myKey});
}
var testData = new List<ExpandoObject>();
foreach (string[] columnValue in listOfUsers)
{
dynamic data = new ExpandoObject();
for (int j = 0; j < columnNames.Count(); j++)
{
((IDictionary<String, Object>)data).Add(columnNames[j], listOfUsers[j]);
}
testData.Add(data);
Console.WriteLine("{0}{1}", myKey.PadRight(25), testData);
objWriter.WriteLine("{0}{1}", myKey.PadRight(25), testData);
}
}
I am obviously missing something here and can't wrap my head around what the problem is. I might even be going about this the wrong way. Basically, all I need to do is pull users and their properties from Active Directory and put them into SQL database tables. I've worked out how to do both separately, but I can't figure out how to put it all together.

If the CSV is just being used to cache the results, you could use a Dictionary to store the contents of the search results instead. Separating your code into functions could be helpful:
private static object GetFirstValue(ResultPropertyCollection properties,
string propertyName)
{
var propertyValues = properties[propertyName];
var result = propertyValues.Count == 0 ? null : propertyValues[0];
return result;
}
Then you could either use a dictionary to hold the property values, or you could create a type:
var results = new List<Dictionary<string, object>>();
foreach(SearchResult objResult in objSearchResults)
{
var properties = objResult.Properties;
var propertyDictionary = new Dictionary<string, object> {
{"FirstName", GetFirstValue(properties, "givenname")},
{"LastName", GetFirstValue(properties, "sn")},
{"UserName", GetFirstValue(properties, "samaccountname")},
};
results.Add(propertyDictionary);
}
Now you have a list of property bags.
This could also be a simple LINQ statement:
var results = objSearchResults.OfType<SearchResult>()
.Select(s => s.Properties)
.Select(p => new {
FirstName = (string)GetFirstValue(p, "givenname"),
LastName = (string)GetFirstValue(p, "sn"),
UserName = (string)GetFirstValue(p, "samaccountname"),
// GetDateTimeValue is a helper that is not shown here
AccountExpires = GetDateTimeValue(p, "accountexpires")
});
Use the dictionaries like this:
foreach(var item in results)
{
var command = new SqlCommand();
...
command.Parameters.AddWithValue("firstName", item["FirstName"]);
...
}
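The LINQ statement above calls a GetDateTimeValue helper that the answer does not define. A minimal sketch of its conversion step follows, under the assumption that the raw property value is an Integer8 file time delivered as a long or as a numeric string; `AdTime` and `ToDateTime` are hypothetical names. Values of 0 and long.MaxValue, which Active Directory commonly uses for "never" or "not set", are mapped to null here.

```csharp
using System;

// Hypothetical helper for converting Active Directory Integer8 timestamps.
public static class AdTime
{
    // Returns the UTC date for a Windows file time, or null when the value
    // is missing, unparsable, or one of the "never / not set" sentinels.
    public static DateTime? ToDateTime(object rawValue)
    {
        if (rawValue == null)
            return null;

        long fileTime;
        if (rawValue is long direct)
            fileTime = direct;
        else if (!long.TryParse(rawValue.ToString(), out fileTime))
            return null;

        // 0 and long.MaxValue are treated as "no date" (assumption, see lead-in).
        if (fileTime <= 0 || fileTime == long.MaxValue)
            return null;

        try
        {
            return DateTime.FromFileTimeUtc(fileTime);
        }
        catch (ArgumentOutOfRangeException)
        {
            return null; // value lies outside the range DateTime can represent
        }
    }
}
```

A GetDateTimeValue wrapper could then pass the result of GetFirstValue(properties, propertyName) to `AdTime.ToDateTime`, and a null result could be sent to SQL as DBNull.Value.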

Dictionaries in C#

This program is meant to read in a CSV file and create a dictionary from it, which is then used to translate a word typed into a textbox (txtInput) and output the result to another textbox (txtOutput).
The program doesn't translate anything and always outputs "No translation found".
I've never used the Dictionary class before, so I don't know where the problem is coming from.
Thanks for any help you can give me.
Dictionary<string, string> dictionary;
private void CreateDictionary()
{
//Load file
List<string> list = new List<string>();
using (StreamReader reader = new StreamReader("dictionarylist.csv"))
{
string line;
while ((line = reader.ReadLine()) != null)
{
//Add to dictionary
dictionary = new Dictionary<string, string>();
string[] split = line.Split(',');
dictionary.Add(split[0], split[1]);
}
}
}
private void btnTranslate_Click(object sender, EventArgs e)
{
CreateDictionary();
string outputString = null;
if (dictionary.TryGetValue(txtInput.Text, out outputString))
{
txtOutput.Text = outputString;
}
else
{
txtOutput.Text = ("No translation found");
}
}

You are creating a new instance of a Dictionary each loop cycle, basically overwriting it each time you read a line. Move this line out of the loop:
// Instantiate a dictionary
var map = new Dictionary<string, string>();
Also, why not load the dictionary once? You are loading it on every button click, which is inefficient.
(.NET 3.5 and later) The same using LINQ ToDictionary():
using System.IO;
using System.Linq;
var map = File.ReadAllLines("dictionarylist.csv")
.Select(l =>
{
var pair = l.Split(',');
return new { First = pair[0], Second = pair[1] };
})
.ToDictionary(k => k.First, v => v.Second);

In your while loop, you create a new dictionary every single pass!
You want to create one dictionary, and add all the entries to that:
while ((line = reader.ReadLine()) != null)
{
//Add to dictionary
dictionary = new Dictionary<string, string>(); /* DON'T CREATE NEW DICTIONARIES */
string[] split = line.Split(',');
dictionary.Add(split[0], split[1]);
}
You should do it more like this:
List<string> list = new List<string>();
dictionary = new Dictionary<string, string>(); /* CREATE ONE DICTIONARY */
using (StreamReader reader = new StreamReader("dictionarylist.csv"))
{
string line;
while ((line = reader.ReadLine()) != null)
{
string[] split = line.Split(',');
dictionary.Add(split[0], split[1]);
}
}
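Putting both answers together, the loading step can be pulled into one method that is called a single time. This is a minimal sketch; `TranslationTable` and `Load` are hypothetical names, and two choices are assumptions rather than part of the original program: lookups are case-insensitive, and lines without a comma are skipped instead of throwing.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical loader: builds the whole dictionary once from "word,translation" lines.
public static class TranslationTable
{
    public static Dictionary<string, string> Load(TextReader reader)
    {
        // One dictionary for the whole file; case-insensitive keys are an assumption.
        var dictionary = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

        for (var line = reader.ReadLine(); line != null; line = reader.ReadLine())
        {
            string[] split = line.Split(',');
            if (split.Length < 2)
                continue; // malformed or blank line

            // Indexer assignment avoids the exception Add throws on duplicate keys.
            dictionary[split[0].Trim()] = split[1].Trim();
        }
        return dictionary;
    }
}
```

In the form, the result could be stored in the `dictionary` field once, for example in the constructor with `dictionary = TranslationTable.Load(new StreamReader("dictionarylist.csv"))`, so that btnTranslate_Click only performs the TryGetValue lookup.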
