All the following must be done in C#. Parsing the SQL table (SQL Server) will be done using methods in System.Data.Odbc.
Let's assume I have two .csv files, fi1 and fi2. The first csv file has two columns id and val1, and the second csv has two columns as well, id and val2.
I would like to read the two files, and parse the output to one SQL table with the following columns: id, val1, val2.
The problem is that the two files may have different entries in the id columns: in other words, some id's may have a val1 value but no val2 value, and vice versa, or they might have both values.
The table should contain the union of the id columns in the two files.
Example:
File 1
File2
This is how I would like the final SQL table to look:
Note that each file might contain duplicates, and we would want to exclude the duplicates when parsing the SQL table.
The thought I had is to create two dictionaries, dict1 and dict2, where the key would be the id, and the value would be val1 and val2. Dictionaries will be used to make sure that duplicates are not included:
Dictionary<string, string> dict1 = new Dictionary<string, string>();
string[] header1 = new string[]{};
using (StreamReader rdr = new StreamReader(fi1))
{
header1 = rdr.ReadLine().Split(',');
while (!rdr.EndOfStream)
{
string ln = rdr.ReadLine();
string[] split_ln = ln.Split(',');
// The indexer overwrites a repeated id; Add would throw an ArgumentException on a duplicate key.
dict1[split_ln[0]] = split_ln[1];
}
}
Dictionary<string, string> dict2 = new Dictionary<string, string>();
string[] header2 = new string[]{};
using (StreamReader rdr = new StreamReader(fi2))
{
header2 = rdr.ReadLine().Split(',');
while (!rdr.EndOfStream)
{
string ln = rdr.ReadLine();
string[] split_ln = ln.Split(',');
// The indexer overwrites a repeated id; Add would throw an ArgumentException on a duplicate key.
dict2[split_ln[0]] = split_ln[1];
}
}
However, after adding each file to a dictionary, I am not sure how to match the id's of both dictionaries.
Would anyone have a good hint as to how to deal with this problem?
I would actually use a list of tuples to hold the values here instead of a dictionary, so that all the information is in one place rather than matching keys. Each tuple corresponds to a table record:
var dict = new List<Tuple<string, string, string>>();
using (StreamReader rdr = new StreamReader(fi1))
{
while (!rdr.EndOfStream)
{
string ln = rdr.ReadLine();
string[] split_ln = ln.Split(',');
dict.Add(new Tuple<string, string, string>(split_ln[0], split_ln[1],null));
}
}
using (StreamReader rdr = new StreamReader(fi2))
{
while (!rdr.EndOfStream)
{
string ln = rdr.ReadLine();
string[] split_ln = ln.Split(',');
if (dict.Any(item => item.Item1 == split_ln[0]))
{
var item = dict.Find(i => i.Item1 == split_ln[0]);
var newtuple = new Tuple<string, string, string>(item.Item1, item.Item2, split_ln[1]);
dict.Remove(item);
dict.Add(newtuple);
}
else
{
dict.Add(new Tuple<string, string, string>(split_ln[0],null,split_ln[1]));
}
}
}
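If you would rather keep the two dictionaries from the question, matching the ids amounts to a full outer join on the keys. The following is only a sketch under assumptions: `ReadPairs` and `Merge` are hypothetical helper names, the files are taken to be simple `id,value` lines with a header row and no quoted fields, and the rows are sorted by id purely to make the output order predictable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class CsvMerge
{
    // Hypothetical helper: builds id -> value from "id,value" lines, skipping the header row.
    // The indexer overwrites on a repeated id, so duplicates collapse instead of throwing.
    public static Dictionary<string, string> ReadPairs(IEnumerable<string> lines)
    {
        var result = new Dictionary<string, string>();
        foreach (var line in lines.Skip(1))
        {
            var parts = line.Split(',');
            if (parts.Length < 2) continue; // ignore blank or malformed lines
            result[parts[0].Trim()] = parts[1].Trim();
        }
        return result;
    }

    // Union of the ids from both dictionaries; a value missing on one side stays null.
    public static List<(string Id, string Val1, string Val2)> Merge(
        Dictionary<string, string> dict1, Dictionary<string, string> dict2)
    {
        return dict1.Keys.Union(dict2.Keys)
            .OrderBy(id => id, StringComparer.Ordinal)
            .Select(id => (
                Id: id,
                Val1: dict1.TryGetValue(id, out var v1) ? v1 : null,
                Val2: dict2.TryGetValue(id, out var v2) ? v2 : null))
            .ToList();
    }

    static void Main()
    {
        // In-memory stand-ins for File.ReadLines(fi1) and File.ReadLines(fi2).
        var dict1 = ReadPairs(new[] { "id,val1", "1,a", "2,b", "2,b" });
        var dict2 = ReadPairs(new[] { "id,val2", "2,x", "3,y" });

        foreach (var row in Merge(dict1, dict2))
            Console.WriteLine(string.Join(",", row.Id, row.Val1 ?? "NULL", row.Val2 ?? "NULL"));
    }
}
```

Each merged row then maps to one record of the (id, val1, val2) table, with null standing in for a value that only the other file has.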
I have an Excel file (separated with commas) with two columns, City and Country.
Column A has countries and column B has cities. Each row therefore has a country and a city located in this country.
City Country
Madrid Spain
Barcelona Spain
Paris France
Valencia Spain
Rome Italy
Marseille France
Florence Italy
I am looking for a way to read this file in C# into a Dictionary<string, List<string>>, where the key is the country and the values are its cities, so that after reading it I will have the following:
{
"Spain": ["Madrid", "Barcelona", "Valencia"],
"France": ["Paris", "Marseille"],
"Italy": ["Rome", "Florence"]
}
What I have tried so far is creating this class:
class ReadCountryCityFile
{
Dictionary<string, List<string>> countrycitydict{ get; }
// constructor
public ReadCountryCityFile()
{
countrycitydict= new Dictionary<string, List<string>>();
}
public Dictionary<string, List<string>> ReadFile(string path)
{
using (var reader = new StreamReader(path))
{
List<string> listcountry = new List<string>();
List<string> listcity = new List<string>();
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
if (line != "Country;City")
{
List<string> citieslist = new List<string>();
var values = line.Split(';');
citieslist.Add(values[0]);
string country = values[1];
countrycitydict[country] = citieslist;
}
}
return countrycitydict;
}
}
}
But countrycitydict is not as expected. How could I do it?
How could I solve it if instead of
City Country
Madrid Spain
I had
City Country
Madrid Spain
Valencia
Providing that you use a simple CSV (with no quotations) you can try Linq:
Dictionary<string, string[]> result = File
.ReadLines(@"c:\MyFile.csv")
.Where(line => !string.IsNullOrWhiteSpace(line)) // To be on the safe side
.Skip(1) // If we want to skip the header (the very 1st line)
.Select(line => line.Split(';')) //TODO: put the right separator here
.GroupBy(items => items[0].Trim(),
items => items[1])
.ToDictionary(chunk => chunk.Key,
chunk => chunk.ToArray());
Edit: If you want (see comments below) Dictionary<string, string> (not Dictionary<string, string[]>) e.g. you want
...
{"Spain", "Madrid\r\nBarcelona\r\nValencia"},
...
instead of
...
{"Spain", ["Madrid", "Barcelona", "Valencia"]}
...
you can modify the last .ToDictionary into:
.ToDictionary(chunk => chunk.Key,
chunk => string.Join(Environment.NewLine, chunk));
While you loop over your input, check whether your dictionary already contains the key. If not, insert it, and then add the value under that key:
Dictionary<string, List<string>> countrycitydict{ get; }
public Dictionary<string, List<string>> ReadFile(string path)
{
using (var reader = new StreamReader(path))
{
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
if (line != "Country;City")
{
var values = line.Split(';');
// Try to get the entry for the current country
if(!countrycitydict.TryGetValue(values[0], out List<string> v))
{
// If not found build an entry for the country
List<string> cities = new List<string>();
countrycitydict.Add(values[0], cities) ;
}
// Now you can safely add the city
countrycitydict[values[0]].Add(values[1]);
}
}
return countrycitydict;
}
}
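For the follow-up case in the question, where a row leaves the country cell empty, one option is to remember the last country seen and reuse it. This is a sketch under assumptions: the lines are taken to be `Country;City` (the header the question's code checks for), an empty country is taken to mean "same as the previous row", and `CountryCityReader.Group` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

static class CountryCityReader
{
    // Groups cities by country; a blank country cell inherits the previous row's country.
    public static Dictionary<string, List<string>> Group(IEnumerable<string> lines)
    {
        var result = new Dictionary<string, List<string>>();
        string currentCountry = null;

        foreach (var line in lines)
        {
            if (string.IsNullOrWhiteSpace(line) || line == "Country;City") continue;

            var values = line.Split(';');
            if (values.Length < 2) continue; // malformed line

            var country = values[0].Trim();
            var city = values[1].Trim();

            if (country.Length > 0) currentCountry = country;            // new country starts here
            if (currentCountry == null || city.Length == 0) continue;    // nothing usable on this row

            if (!result.TryGetValue(currentCountry, out var cities))
            {
                cities = new List<string>();
                result.Add(currentCountry, cities);
            }
            cities.Add(city);
        }
        return result;
    }

    static void Main()
    {
        // In-memory stand-in for File.ReadLines(path).
        var grouped = Group(new[] { "Country;City", "Spain;Madrid", ";Valencia", "France;Paris" });
        foreach (var pair in grouped)
            Console.WriteLine(pair.Key + ": " + string.Join(", ", pair.Value));
    }
}
```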
I have the following code:
var tuple = new Tuple<string, string>("MyKey", "MyValue");
var list = new List<string>();
var str = tuple.ToString();
list.Add(str);
// str has the value "(MyKey, MyValue)"
I have a predefined object where I need to use a list of strings.
I decided to use a Tuple but I am not sure how I can cast the str value back to a Tuple.
How can I store a key value in a List so that I can use lambda to query it e.g. by the key?
All this code:
var tuple = new Tuple<string, string>("MyKey", "MyValue");
var list = new List<string>();
var str = tuple.ToString();
list.Add(str);
// str has the value "(MyKey, MyValue)"
Could be replaced by a dictionary:
Dictionary<string, string> values = new Dictionary<string, string>();
values.Add("MyKey", "MyValue");
Then you can use linq to query the dictionary if you'd like to do so:
var value = values.Where(x => x.Key == "MyKey");
You can get a list with all the keys as follows:
List<string> keys = values.Keys.ToList();
So no need to have a separate list for that.
If you want a list of strings with the two values separated by a comma, the dictionary will do too:
List<string> keysValues = (from item in values
select item.Key + "," + item.Value).ToList();
Use Dictionary.
var dictionary = new Dictionary<string, string>();
dictionary.Add("myKey", "myVal");
if (dictionary.ContainsKey("myKey"))
dictionary["myKey"] = "myVal1";
I suggest you use a Dictionary. But if you really need to do it this way:
To transform the string back into a Tuple (assuming that the key itself will never contain a comma followed by a space):
var tuple = Tuple.Create("MyKey", "MyValue");
var list = new List<string>();
var str = tuple.ToString();
list.Add(str);
// str has the value "(MyKey, MyValue)"
Console.WriteLine(str);
int comma = str.IndexOf(", ");
string key = str.Substring(1,comma-1);
string valuee = str.Substring(comma+2,str.Length-key.Length-4);
var tuple2 = Tuple.Create(key, valuee);
// 'tuple2' is now equal to the original 'tuple'
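The same parsing can be wrapped in a helper so the list of strings can be queried by key with a lambda. This is a sketch, not a robust serializer: `TupleStrings.Parse` is a hypothetical name, and it still assumes the key never contains ", " (the value may).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class TupleStrings
{
    // Parses "(key, value)" as produced by Tuple<string, string>.ToString().
    public static Tuple<string, string> Parse(string text)
    {
        if (text == null || text.Length < 2 || text[0] != '(' || text[text.Length - 1] != ')')
            throw new FormatException("Expected a string of the form (key, value).");

        // Drop the surrounding parentheses, then split on the first ", " only.
        string inner = text.Substring(1, text.Length - 2);
        string[] parts = inner.Split(new[] { ", " }, 2, StringSplitOptions.None);
        if (parts.Length != 2)
            throw new FormatException("Expected a string of the form (key, value).");

        return Tuple.Create(parts[0], parts[1]);
    }

    static void Main()
    {
        var list = new List<string>
        {
            Tuple.Create("MyKey", "MyValue").ToString(),
            Tuple.Create("Other", "a, b").ToString()
        };

        // Query the string list by key.
        var match = list.Select(s => Parse(s)).FirstOrDefault(t => t.Item1 == "MyKey");
        Console.WriteLine(match == null ? "not found" : match.Item2);
    }
}
```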
I have a C# application which retrieves an SQL result set in the following format:
customer_id date_registered date_last_purchase loyalty_points
1 2017-01-01 2017-05-02 51
2 2017-01-23 2017-06-21 124
...
How can I convert this to a JSON string, such that the first column (customer_id) is a key, and all other subsequent columns are values within a nested-JSON object for each customer ID?
Example:
{
1: {
date_registered: '2017-01-01',
date_last_purchase: '2017-05-02',
loyalty_points: 51,
...
},
2: {
date_registered: '2017-01-23',
date_last_purchase: '2017-06-21',
loyalty_points: 124,
...
},
...
}
Besides date_registered, date_last_purchase, and loyalty_points, there may be other columns in the future so I do not want to refer to these column names specifically. Therefore I have already used the code below to fetch the column names, but am stuck after this.
SqlDataReader sqlDataReader = sqlCommand.ExecuteReader();
var columns = new List<string>();
for (var i = 0; i < sqlDataReader.FieldCount; i++)
{
columns.Add(sqlDataReader.GetName(i));
}
while (sqlDataReader.Read())
{
rows.Add(columns.ToDictionary(column => column, column => sqlDataReader[column]));
}
You could use something like this to convert the data reader to a Dictionary<object, Dictionary<string, object>> and then use Json.NET to convert that to JSON:
var items = new Dictionary<object, Dictionary<string, object>>();
while (sqlDataReader.Read())
{
var item = new Dictionary<string, object>(sqlDataReader.FieldCount - 1);
for (var i = 1; i < sqlDataReader.FieldCount; i++)
{
item[sqlDataReader.GetName(i)] = sqlDataReader.GetValue(i);
}
items[sqlDataReader.GetValue(0)] = item;
}
var json = Newtonsoft.Json.JsonConvert.SerializeObject(items, Newtonsoft.Json.Formatting.Indented);
Update: JSON "names" are always strings, so I used object and GetValue for the keys.
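To try the same idea without a database, here is a self-contained sketch. It makes two swaps that are assumptions on my part, not part of the answer above: a `DataTable` with made-up rows stands in for the SQL result, and the built-in `System.Text.Json` serializer is used in place of Json.NET. `ReaderToJson.ToNested` is a hypothetical name, and the key is converted to a string explicitly.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Text.Json;

static class ReaderToJson
{
    // First column becomes the outer key; every other column goes into the nested object.
    public static Dictionary<string, Dictionary<string, object>> ToNested(IDataReader reader)
    {
        var items = new Dictionary<string, Dictionary<string, object>>();
        while (reader.Read())
        {
            var item = new Dictionary<string, object>();
            for (var i = 1; i < reader.FieldCount; i++)
            {
                // Map database NULL to a JSON null.
                item[reader.GetName(i)] = reader.IsDBNull(i) ? null : reader.GetValue(i);
            }
            items[Convert.ToString(reader.GetValue(0))] = item;
        }
        return items;
    }

    static void Main()
    {
        // Sample rows standing in for the SQL result set.
        var table = new DataTable();
        table.Columns.Add("customer_id", typeof(int));
        table.Columns.Add("date_registered", typeof(string));
        table.Columns.Add("loyalty_points", typeof(int));
        table.Rows.Add(1, "2017-01-01", 51);
        table.Rows.Add(2, "2017-01-23", 124);

        using (var reader = table.CreateDataReader())
        {
            Console.WriteLine(JsonSerializer.Serialize(ToNested(reader)));
        }
    }
}
```

Because the columns are read by index and named with GetName, columns added to the query later end up in the nested object without any code change.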
I'm trying to read a text file and print out into a table.
I want the output to be this
But now I am getting a different output
var column1 = new List<string>();
var column2 = new List<string>();
var column3 = new List<string>();
using (var rd = new StreamReader(@"C:\test.txt"))
{
while (!rd.EndOfStream)
{
var splits = rd.ReadLine().Split(';');
column1.Add(splits[0]);
column2.Add(splits[1]);
column3.Add(splits[2]);
}
}
Console.WriteLine("Date/Time \t Movie \t Seat");
foreach (var element in column1) Console.WriteLine(element);
foreach (var element in column2) Console.WriteLine(element);
foreach (var element in column3) Console.WriteLine(element);
You can use Linq to construct a convenient structure (e.g. List<String[]>) and then print out all the data wanted:
List<String[]> data = File
.ReadLines(@"C:\test.txt")
//.Skip(1) // <- uncomment this to skip caption if the csv has it
.Select(line => line.Split(';').Take(3).ToArray()) // 3 items only
.ToList();
// Table output (wanted one):
String report = String.Join(Environment.NewLine,
data.Select(items => String.Join("\t", items)));
Console.WriteLine(report);
// Column after column output (actual one)
Console.WriteLine(String.Join(Environment.NewLine, data.Select(item => item[0])));
Console.WriteLine(String.Join(Environment.NewLine, data.Select(item => item[1])));
Console.WriteLine(String.Join(Environment.NewLine, data.Select(item => item[2])));
EDIT: if you want to choose the movie, buy the ticket etc. elaborate the structure:
// Create a custom class where implement your logic
public class MovieRecord {
private DateTime m_Start;
private String m_Name;
private int m_Seats;
...
public MovieRecord(DateTime start, String name, int seats) {
...
m_Seats = seats;
...
}
...
public override String ToString() {
return String.Join("\t", m_Start, m_Name, m_Seats);
}
public void Buy() {...}
...
}
And then convert to a convenient structure:
List<MovieRecord> data = File
.ReadLines(@"C:\test.txt")
//.Skip(1) // <- uncomment this to skip caption if the csv has it
.Select(line => {
String[] items = line.Split(';');
return new MovieRecord(
DateTime.ParseExact(items[0], "PutActualFormat", CultureInfo.InvariantCulture),
items[1],
int.Parse(items[2]));
})
.ToList();
And the table output will be
Console.Write(String.Join(Environment.NewLine, data));
Don't use Console.WriteLine if you want to add a "column". You should also use a single List<string[]> instead of multiple List<string>.
List<string[]> allLineFields = new List<string[]>();
using (var rd = new StreamReader(@"C:\test.txt"))
{
while (!rd.EndOfStream)
{
var splits = rd.ReadLine().Split(';');
allLineFields.Add(splits);
}
}
Console.WriteLine("Date/Time \t Movie \t Seat");
foreach(string[] line in allLineFields)
Console.WriteLine(String.Join("\t", line));
In general you should use a real csv parser if you want to parse a csv-file, not string methods or regex.
You could use the TextFieldParser which is the only one available in the framework directly:
var allLineFields = new List<string[]>();
using (var parser = new Microsoft.VisualBasic.FileIO.TextFieldParser(@"C:\test.txt"))
{
parser.Delimiters = new string[] { ";" };
parser.HasFieldsEnclosedInQuotes = false; // very useful
string[] lineFields;
while ((lineFields = parser.ReadFields()) != null)
{
allLineFields.Add(lineFields);
}
}
You need to add a reference to the Microsoft.VisualBasic dll to your project.
There are other parsers available; see "Parsing CSV files in C#, with header".
You could attempt to solve this in a more Object-Orientated manner, which might make it a bit easier for you to work with:
You can declare a simple class to represent a movie seat:
class MovieSeat
{
public readonly string Date, Name, Number;
public MovieSeat(string source)
{
string[] data = source.Split(';');
Date = data[0];
Name = data[1];
Number = data[2];
}
}
And then you can read in and print out the data in a few lines of code:
// Read in the text file and create a new MovieSeat object for each line in the file.
// Iterate over all MovieSeat objects and print them to console.
foreach(var seat in File.ReadAllLines(@"C:\test.txt").Select(x => new MovieSeat(x)))
Console.WriteLine(string.Join("\t", seat.Date, seat.Name, seat.Number));
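Tabs only line up when the values in a column have similar lengths. If the columns should stay aligned regardless, one option is to pad every cell to the widest value in its column. The following is a sketch: `TablePrinter.Format` is a hypothetical name, and the sample rows are made up since the contents of the text file are not shown.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class TablePrinter
{
    // Pads each cell to the width of the longest value in its column.
    public static List<string> Format(IEnumerable<string[]> rows)
    {
        var data = rows.ToList();
        var output = new List<string>();
        if (data.Count == 0) return output;

        // Find the widest cell per column.
        var widths = new int[data.Max(r => r.Length)];
        foreach (var row in data)
            for (int i = 0; i < row.Length; i++)
                widths[i] = Math.Max(widths[i], row[i].Length);

        // Pad, join with two spaces, and drop trailing padding on the last column.
        foreach (var row in data)
        {
            var cells = row.Select((cell, i) => cell.PadRight(widths[i]));
            output.Add(string.Join("  ", cells).TrimEnd());
        }
        return output;
    }

    static void Main()
    {
        // Made-up rows standing in for the split lines of the text file.
        var rows = new List<string[]>
        {
            new[] { "Date/Time", "Movie", "Seat" },
            new[] { "2017-01-01 18:00", "Some Movie", "12" },
            new[] { "2017-01-02 20:30", "Another Long Movie Title", "7" }
        };
        foreach (var line in Format(rows))
            Console.WriteLine(line);
    }
}
```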
This program is meant to read in a csv file and create a dictionary from it, which is then used to translate a word typed into a textbox (txtINPUT) and output the result to another textbox (txtOutput).
The program doesn't translate anything and always outputs "No translation found".
I've never used the Dictionary class before, so I don't know where the problem is coming from.
Thanks for any help you can give me.
Dictionary<string, string> dictionary;
private void CreateDictionary()
{
//Load file
List<string> list = new List<string>();
using (StreamReader reader = new StreamReader("dictionarylist.csv"))
{
string line;
while ((line = reader.ReadLine()) != null)
{
//Add to dictionary
dictionary = new Dictionary<string, string>();
string[] split = line.Split(',');
dictionary.Add(split[0], split[1]);
}
}
}
private void btnTranslate_Click(object sender, EventArgs e)
{
CreateDictionary();
string outputString = null;
if (dictionary.TryGetValue(txtInput.Text, out outputString))
{
txtOutput.Text = outputString;
}
else
{
txtOutput.Text = ("No translation found");
}
}
You are creating a new instance of a Dictionary each loop cycle, basically overwriting it each time you read a line. Move this line out of the loop:
// Instantiate a dictionary
var map = new Dictionary<string, string>();
Also, why not load the dictionary once? You are reloading it on every button click, which is inefficient.
(.NET 3.5 or later) The same using LINQ ToDictionary():
using System.IO;
using System.Linq;
var map = File.ReadAllLines("dictionarylist.csv")
.Select(l =>
{
var pair = l.Split(',');
return new { First = pair[0], Second = pair[1] };
})
.ToDictionary(k => k.First, v => v.Second);
In your while loop, you create a new dictionary every single pass!
You want to create one dictionary, and add all the entries to that:
while ((line = reader.ReadLine()) != null)
{
//Add to dictionary
dictionary = new Dictionary<string, string>(); /* DON'T CREATE NEW DICTIONARIES */
string[] split = line.Split(',');
dictionary.Add(split[0], split[1]);
}
You should do it more like this:
List<string> list = new List<string>();
dictionary = new Dictionary<string, string>(); /* CREATE ONE DICTIONARY */
using (StreamReader reader = new StreamReader("dictionarylist.csv"))
{
string line;
while ((line = reader.ReadLine()) != null)
{
string[] split = line.Split(',');
dictionary.Add(split[0], split[1]);
}
}
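Putting both points together (one dictionary, built once), a sketch could look like this. `Translator` is a hypothetical class name, and two choices here are assumptions rather than anything the question asks for: lookups ignore case, and a repeated word overwrites the earlier entry instead of throwing.

```csharp
using System;
using System.Collections.Generic;

static class Translator
{
    // Builds the lookup once from "word,translation" lines.
    public static Dictionary<string, string> Load(IEnumerable<string> lines)
    {
        // Assumption: case-insensitive keys, so "Hello" and "hello" match.
        var map = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (var line in lines)
        {
            var split = line.Split(',');
            if (split.Length < 2) continue; // skip blank or malformed lines
            map[split[0].Trim()] = split[1].Trim();
        }
        return map;
    }

    // Returns the translation, or the fallback text used in the question.
    public static string Translate(Dictionary<string, string> map, string word)
    {
        return map.TryGetValue(word.Trim(), out var translation)
            ? translation
            : "No translation found";
    }

    static void Main()
    {
        // In-memory stand-in for File.ReadLines("dictionarylist.csv").
        var map = Load(new[] { "hello,hola", "cat,gato" });
        Console.WriteLine(Translate(map, "Hello"));
        Console.WriteLine(Translate(map, "dog"));
    }
}
```

In the form, `Load` would be called once (for example when the form is created) and the result kept in the `dictionary` field, so the click handler only performs the lookup.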