I need some help transforming txt files into easily searchable data in C#.
My txt files are something like this:
Field1: Data
Field2: Data
UselessField1: Data
UselessField2: Data
UselessField3: Data
Field3: Data
Field3: Data
Field3: Data
Field4: Data
Field4: Data
Field4: Data
Field1: Data
Field2: Data
UselessField1: Data
UselessField2: Data
UselessField3: Data
Field3: Data
Field4: Data
Field4: Data
Field4: Data
Field4: Data
Field3 and Field4 can span n lines, and it would be good to split them into separate lists, such as Field1 with Field3 and Field1 with Field4, so I can link them later.
I also want to skip the Useless fields.
Maybe this is something simple and I'm overcomplicating it. I would appreciate it if someone could help. Thanks.
First I'd create a class with a meaningful name and meaningful properties:
public class MeaningfulClassName
{
public string Property1 { get; set; }
public string Property2 { get; set; }
public string Property3 { get; set; }
public string Property4 { get; set; }
//...
}
Now you can use LINQ to keep only the relevant lines in the file and split each one into key and value at the colon. If you want a safe and clean approach, you need reflection and a dictionary that maps the text-file fields to the properties of the class. For example:
var myClassType = typeof(MeaningfulClassName);
var allowedProperties = new Dictionary<string, PropertyInfo>
{
{ "Field1", myClassType.GetProperty("Property1") },
{ "Field2", myClassType.GetProperty("Property2") },
{ "Field3", myClassType.GetProperty("Property3") },
{ "Field4", myClassType.GetProperty("Property4") }
};
The LINQ query to select only the relevant tokens which also skips your useless fields:
var dataLines = File.ReadLines(path)
.Select(l => l.Split(new[] { ':' }, 2).Select(t => t.Trim()).ToArray()) // split at the first colon only, so values may contain ':'
.Where(arr => arr.Length == 2 && allowedProperties.ContainsKey(arr[0]));
The following loop reads the data and adds the instances of the class to a list:
var myList = new List<MeaningfulClassName>();
MeaningfulClassName currentObject = null;
foreach (string[] token in dataLines)
{
string fieldName = token[0];
string fieldValue = token[1];
PropertyInfo pi = allowedProperties[fieldName];
// first field specifies the beginning of the next object
if (fieldName == "Field1")
{
if (currentObject != null)
myList.Add(currentObject);
currentObject = new MeaningfulClassName();
}
pi.SetValue(currentObject, fieldValue);
}
if (currentObject != null)
myList.Add(currentObject);
Now you have all the objects and can search them easily, for example with LINQ.
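Because Field3 and Field4 can repeat within a record, a variant that collects them into lists may be closer to what the question asks for. The following is only a minimal sketch under stated assumptions: the `Record` class and `RecordParser.Parse` method are hypothetical names, every record is assumed to start with a Field1 line, and switch statements replace the reflection-based mapping for brevity.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical record type: repeating fields are kept in lists.
public class Record
{
    public string Field1 { get; set; }
    public string Field2 { get; set; }
    public List<string> Field3 { get; } = new List<string>();
    public List<string> Field4 { get; } = new List<string>();
}

public static class RecordParser
{
    // Builds one Record per "Field1" line; unknown fields are skipped.
    public static List<Record> Parse(IEnumerable<string> lines)
    {
        var records = new List<Record>();
        Record current = null;
        foreach (string line in lines)
        {
            // Split at the first colon only, so values may contain ':'.
            string[] parts = line.Split(new[] { ':' }, 2);
            if (parts.Length != 2) continue;
            string name = parts[0].Trim();
            string value = parts[1].Trim();

            if (name == "Field1")
            {
                // Assumption: Field1 marks the beginning of the next record.
                current = new Record { Field1 = value };
                records.Add(current);
                continue;
            }
            if (current == null) continue; // data before the first Field1 is ignored

            switch (name)
            {
                case "Field2": current.Field2 = value; break;
                case "Field3": current.Field3.Add(value); break;
                case "Field4": current.Field4.Add(value); break;
                default: break; // anything else (e.g. UselessField1) is skipped
            }
        }
        return records;
    }
}
```

It could be called as `RecordParser.Parse(File.ReadLines(path))`, after which a query such as `records.Where(r => r.Field3.Contains("Data"))` searches the repeating values.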
This code will turn your data into a list of dictionaries, where each dictionary has field names as keys and lists of data as values.
List<Dictionary<string,List<string>>> alldata = new List<Dictionary<string,List<string>>>();
alldata.Add(new Dictionary<string, List<string>>());
// File.ReadAllLines handles both "\r\n" and "\n" line endings
foreach (var item in File.ReadAllLines("data.txt"))
{
// a blank line starts a new record
if (item == "") { alldata.Add(new Dictionary<string, List<string>>()); continue; }
var lst = alldata[alldata.Count - 1];
// split into key and value at the first colon only
string[] parts = item.Split(new[] { ':' }, 2);
if (parts.Length < 2) continue;
string key = parts[0].Trim();
if (key.StartsWith("Useless")) continue;
if (!lst.ContainsKey(key))
{
lst[key] = new List<string>();
}
// keep the whole value, not just its first word
lst[key].Add(parts[1].Trim());
}
I have an Excel file (comma-separated) with two columns, City and Country.
Column A has countries and column B has cities. Each row therefore has a country and a city located in this country.
City Country
Madrid Spain
Barcelona Spain
Paris France
Valencia Spain
Rome Italy
Marseille France
Florence Italy
I am looking for a way to read this file in C# into a Dictionary<string, List<string>>, where the key is the country and the values are its cities, so that after reading it I have the following:
{
"Spain": ["Madrid", "Barcelona", "Valencia"],
"France": ["Paris", "Marseille"],
"Italy": ["Rome", "Florence"]
}
What I have tried so far is creating this class:
class ReadCountryCityFile
{
Dictionary<string, List<string>> countrycitydict{ get; }
// constructor
public ReadCountryCityFile()
{
countrycitydict= new Dictionary<string, List<string>>();
}
public Dictionary<string, List<string>> ReadFile(string path)
{
using (var reader = new StreamReader(path))
{
List<string> listcountry = new List<string>();
List<string> listcity = new List<string>();
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
if (line != "Country;City")
{
List<string> citieslist = new List<string>();
var values = line.Split(';');
citieslist.Add(values[0]);
string country = values[1];
countrycitydict[country] = citieslist;
}
}
return countrycitydict;
}
}
}
But countrycitydict is not as expected. How could I do it?
How could I solve it if instead of
City Country
Madrid Spain
I had
City Country
Madrid Spain
Valencia
Provided that you use a simple CSV (with no quotations), you can try Linq:
Dictionary<string, string[]> result = File
.ReadLines(@"c:\MyFile.csv")
.Where(line => !string.IsNullOrWhiteSpace(line)) // To be on the safe side
.Skip(1) // If we want to skip the header (the very 1st line)
.Select(line => line.Split(';')) //TODO: put the right separator here
.GroupBy(items => items[0].Trim(),
items => items[1])
.ToDictionary(chunk => chunk.Key,
chunk => chunk.ToArray());
Edit: If you want (see comments below) Dictionary<string, string> (not Dictionary<string, string[]>) e.g. you want
...
{"Spain", "Madrid\r\nBarcelona\r\nValencia"},
...
instead of
...
{"Spain", ["Madrid", "Barcelona", "Valencia"]}
...
you can modify the last .ToDictionary into:
.ToDictionary(chunk => chunk.Key,
chunk => string.Join(Environment.NewLine, chunk));
While you loop over your input, check whether your dictionary already contains the key. If not, insert it, and then add the value under that key:
Dictionary<string, List<string>> countrycitydict{ get; }
public Dictionary<string, List<string>> ReadFile(string path)
{
using (var reader = new StreamReader(path))
{
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
if (line != "Country;City")
{
var values = line.Split(';');
// Try to get the entry for the current country
if(!countrycitydict.TryGetValue(values[0], out List<string> cities))
{
// If not found build an entry for the country
cities = new List<string>();
countrycitydict.Add(values[0], cities);
}
// Now you can safely add the city
cities.Add(values[1]);
}
}
return countrycitydict;
}
}
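Neither answer covers the follow-up case in which a row lists only a city. One possible reading is that such a row belongs to the country of the previous row. The sketch below works under that assumption; `CountryCityReader.Group` is a hypothetical name, the column order is assumed to be city first and country second as in the sample table, and the caller is expected to skip the header row.

```csharp
using System;
using System.Collections.Generic;

public static class CountryCityReader
{
    // Groups cities by country; a row with an empty country reuses the previous row's country.
    public static Dictionary<string, List<string>> Group(IEnumerable<string> rows, char separator = ';')
    {
        var result = new Dictionary<string, List<string>>();
        string lastCountry = null;
        foreach (string row in rows)
        {
            if (string.IsNullOrWhiteSpace(row)) continue;
            string[] values = row.Split(separator);
            string city = values[0].Trim();
            string country = values.Length > 1 ? values[1].Trim() : "";
            if (country.Length == 0) country = lastCountry;     // fill down from the previous row
            if (country == null || city.Length == 0) continue;  // nothing to attach the city to
            lastCountry = country;

            if (!result.TryGetValue(country, out List<string> cities))
            {
                cities = new List<string>();
                result.Add(country, cities);
            }
            cities.Add(city);
        }
        return result;
    }
}
```

A call such as `CountryCityReader.Group(File.ReadLines(path).Skip(1))` would skip the header and return the grouped dictionary.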
I'm trying to read a text file and print it out as a table.
I want the output to be this
But now I'm getting a different output
var column1 = new List<string>();
var column2 = new List<string>();
var column3 = new List<string>();
using (var rd = new StreamReader(@"C:\test.txt"))
{
while (!rd.EndOfStream)
{
var splits = rd.ReadLine().Split(';');
column1.Add(splits[0]);
column2.Add(splits[1]);
column3.Add(splits[2]);
}
}
Console.WriteLine("Date/Time \t Movie \t Seat");
foreach (var element in column1) Console.WriteLine(element);
foreach (var element in column2) Console.WriteLine(element);
foreach (var element in column3) Console.WriteLine(element);
You can use Linq to construct a convenient structure (e.g. List<String[]>) and then print out all the data wanted:
List<String[]> data = File
.ReadLines(@"C:\test.txt")
//.Skip(1) // <- uncomment this to skip caption if the csv has it
.Select(line => line.Split(';').Take(3).ToArray()) // 3 items only
.ToList();
// Table output (wanted one):
String report = String.Join(Environment.NewLine,
data.Select(items => String.Join("\t", items)));
Console.WriteLine(report);
// Column after column output (actual one)
Console.WriteLine(String.Join(Environment.NewLine, data.Select(item => item[0])));
Console.WriteLine(String.Join(Environment.NewLine, data.Select(item => item[1])));
Console.WriteLine(String.Join(Environment.NewLine, data.Select(item => item[2])));
EDIT: if you want to choose the movie, buy the ticket etc. elaborate the structure:
// Create a custom class where implement your logic
public class MovieRecord {
private DateTime m_Start;
private String m_Name;
private int m_Seats;
...
public MovieRecord(DateTime start, String name, int seats) {
...
m_Seats = seats;
...
}
...
public override String ToString() {
return String.Join("\t", m_Start, m_Name, m_Seats);
}
public void Buy() {...}
...
}
And then convert the data to this more convenient structure:
List<MovieRecord> data = File
.ReadLines(@"C:\test.txt")
//.Skip(1) // <- uncomment this to skip caption if the csv has it
.Select(line => {
String[] items = line.Split(';');
return new MovieRecord(
DateTime.ParseExact(items[0], "PutActualFormat", CultureInfo.InvariantCulture),
items[1],
int.Parse(items[2]));
})
.ToList();
And the table output will be
Console.Write(String.Join(Environment.NewLine, data));
Don't use Console.WriteLine if you want to add a "column". You should also use a single List<string[]> instead of multiple List<string>.
List<string[]> allLineFields = new List<string[]>();
using (var rd = new StreamReader(@"C:\test.txt"))
{
while (!rd.EndOfStream)
{
var splits = rd.ReadLine().Split(';');
allLineFields.Add(splits);
}
}
Console.WriteLine("Date/Time \t Movie \t Seat");
foreach(string[] line in allLineFields)
Console.WriteLine(String.Join("\t", line));
In general you should use a real csv parser if you want to parse a csv-file, not string methods or regex.
You could use the TextFieldParser which is the only one available in the framework directly:
var allLineFields = new List<string[]>();
using (var parser = new Microsoft.VisualBasic.FileIO.TextFieldParser(@"C:\test.txt"))
{
parser.Delimiters = new string[] { ";" };
parser.HasFieldsEnclosedInQuotes = false; // very useful
string[] lineFields;
while ((lineFields = parser.ReadFields()) != null)
{
allLineFields.Add(lineFields);
}
}
You need to add a reference to the Microsoft.VisualBasic dll to your project.
There are others available; see "Parsing CSV files in C#, with header".
You could attempt to solve this in a more Object-Orientated manner, which might make it a bit easier for you to work with:
You can declare a simple class to represent a movie seat:
class MovieSeat
{
public readonly string Date, Name, Number;
public MovieSeat(string source)
{
string[] data = source.Split(';');
Date = data[0];
Name = data[1];
Number = data[2];
}
}
And then you can read in and print out the data in a few lines of code:
// Read in the text file and create a new MovieSeat object for each line in the file.
// Iterate over all MovieSeat objects and print them to console.
foreach(var seat in File.ReadAllLines(@"C:\test.txt").Select(x => new MovieSeat(x)))
Console.WriteLine(string.Join("\t", seat.Date, seat.Name, seat.Number));
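Tab-separated output only lines up while every value fits within one tab stop. If the columns have to stay aligned regardless of value length, one option is to pad each cell to the widest value in its column. The sketch below illustrates that idea; `TableFormatter.Format` is a hypothetical helper, not something used in the answers above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TableFormatter
{
    // Pads every cell to the widest value in its column and joins cells with two spaces.
    public static List<string> Format(IList<string[]> rows)
    {
        if (rows.Count == 0) return new List<string>();

        // First pass: find the widest cell of each column.
        int columns = rows.Max(r => r.Length);
        var widths = new int[columns];
        foreach (string[] row in rows)
            for (int i = 0; i < row.Length; i++)
                widths[i] = Math.Max(widths[i], row[i].Length);

        // Second pass: pad each cell and drop trailing spaces of the last column.
        return rows
            .Select(row => string.Join("  ",
                row.Select((cell, i) => cell.PadRight(widths[i]))).TrimEnd())
            .ToList();
    }
}
```

With the list of split lines from the answers above, the header can be added as the first row and the result printed with `foreach (var line in TableFormatter.Format(rows)) Console.WriteLine(line);`.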
I have a string that looks like this:
TYPE Email Forwarding
SIGNATURE mysig.html
COMPANY Smith Incorp
CLIENT NAME James Henries
... heaps of others ....
I need to get the values of Type, Signature, Company and Client Name. There are others, but once I find a solution for these, I can do the rest. I have tried to split and trim the string, but then it splits keys like CLIENT NAME or values like Email Forwarding.
I would put all of the "key" values into a collection, and then parse the string into another collection and then compare the values of the collections.
Here is a rough outline of how you could get the values:
static void Main(string[] args)
{
//Assuming that you know all of the keys before hand
List<string> keys = new List<string>() { "TYPE", "SIGNATURE", "COMPANY", "CLIENT NAME" };
//Not sure of the origin of your string to parse. You would have to change
//this to read a file or query the DB or whatever
string multilineString =
@"TYPE Email Forwarding
SIGNATURE mysig.html
COMPANY Smith Incorp
CLIENT NAME James Henries";
//Split the string by newlines (handles both Windows and Unix line endings).
var lines = multilineString.Split(new string[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries);
//Iterate over keys because you probably have less keys than data in the event of duplicates
foreach (var key in keys)
{
//Reduce list of lines to check based on ones that start with a given key
var filteredLines = lines.Where(l => l.Trim().StartsWith(key)).ToList();
foreach (var line in filteredLines)
{
Console.WriteLine(line.Trim().Remove(0, key.Length + 1));
}
}
Console.ReadLine();
}
That will do your job.
If it is multiple lines then you can loop through each line and call KeyValue extension method as given below:
public static class Program
{
public static void Main()
{
var value = "TYPE Email Forwarding".KeyValue();
var value1 = "CLIENT NAME James Henries".KeyValue();
}
public static KeyValuePair<string, string> KeyValue(this string rawData)
{
var splitValue = rawData.Split(new[] { ' ' }, System.StringSplitOptions.RemoveEmptyEntries);
KeyValuePair<string, string> returnValue;
var key = string.Empty;
var value = string.Empty;
foreach (var item in splitValue)
{
if (item.ToUpper() == item)
{
if (string.IsNullOrWhiteSpace(key))
{
key += item;
}
else
{
key += " " + item;
}
}
else
{
if (string.IsNullOrWhiteSpace(value))
{
value += item;
}
else
{
value += " " + item;
}
}
}
returnValue = new KeyValuePair<string, string>(key, value);
return returnValue;
}
}
Please note that this logic will work only when keys are all upper and the values are not all upper case. Otherwise, there is no way to identify which one is key (without having a manual track on keys) and which one is not.
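When the keys are known in advance, the upper-case heuristic can be avoided by matching each line against the key list and returning a dictionary. The following is a rough sketch of that approach; `KeyedLineParser.Parse` is a hypothetical name, and trying longer keys first is an assumption made so that a key like "CLIENT NAME" is not mistaken for a shorter key such as "CLIENT".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KeyedLineParser
{
    // Returns key -> value for each line that starts with one of the known keys.
    public static Dictionary<string, string> Parse(IEnumerable<string> lines, IEnumerable<string> knownKeys)
    {
        // Try longer keys first so "CLIENT NAME" wins over a shorter key such as "CLIENT".
        List<string> keys = knownKeys.OrderByDescending(k => k.Length).ToList();
        var result = new Dictionary<string, string>();
        foreach (string raw in lines)
        {
            string line = raw.Trim();
            string key = keys.FirstOrDefault(k =>
                line.StartsWith(k + " ", StringComparison.Ordinal) || line == k);
            if (key == null) continue; // not a field we care about

            // Everything after the key is the value; a repeated key overwrites the earlier value.
            result[key] = line.Substring(key.Length).Trim();
        }
        return result;
    }
}
```

Because the value is taken as the remainder of the line, values in all upper case are handled as well.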
I retrieved a list of users from database, something like
List<User> users = <..list of users from db...>
Name, LastName, DateOfBirth //multidimensional array??
Now I want to store this list as a string and I want to be able to reuse it, i.e.
string strUsers = users.ToArray().ToString();
How to recreate a list of users from strUsers?
Is it possible?
Use the string.Join method, e.g.
var joined = string.Join(",", users.Select(u => u.Name));
This would give you a single string of user's names separated by ','.
Or for multiple columns:
var joined = string.Join(",",
users.Select(u => u.FirstName + " " + u.LastName ));
You can reverse the process using string.Split, e.g.
var split = joined.Split( new [] {','} );
If you have a lot of users and a lot of columns, it would be better to write your own custom converter class.
public static class UsersConverter
{
// Separates user properties.
private const char UserDataSeparator = ',';
// Separates users in the list.
private const char UsersSeparator = ';';
public static string ConvertListToString(IEnumerable<User> usersList)
{
var stringBuilder = new StringBuilder();
// Build the users string.
foreach (User user in usersList)
{
stringBuilder.Append(user.Name);
stringBuilder.Append(UserDataSeparator);
stringBuilder.Append(user.Age);
stringBuilder.Append(UsersSeparator);
}
// Remove trailing separator (skipped when the list is empty).
if (stringBuilder.Length > 0) stringBuilder.Remove(stringBuilder.Length - 1, 1);
return stringBuilder.ToString();
}
public static List<User> ParseStringToList(string usersString)
{
// Check that passed argument is not null.
if (usersString == null) throw new ArgumentNullException("usersString");
var result = new List<User>();
string[] userDatas = usersString.Split(UsersSeparator);
foreach (string[] userData in userDatas.Select(x => x.Split(UserDataSeparator)))
{
// Check that user data contains enough arguments.
if (userData.Length < 2) throw new ArgumentException("Users string contains invalid data.");
string name = userData[0];
int age;
// Try parsing age.
if (!int.TryParse(userData[1], out age))
{
throw new ArgumentException("Users string contains invalid data.");
}
// Add to result list.
result.Add(new User { Name = name, Age = age });
}
return result;
}
}
Using a StringBuilder to build the users string performs better than repeated string concatenation. You could also easily extend the converter to support different separators, additional logic, etc.
If you need a more generic solution (to be able to use for any class), you could create a converter which uses reflection to iterate over all the public fields, get/set properties to see what can be extracted as string and later reverse the process to convert your string back to the list.
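A reflection-based converter along those lines might look like the sketch below. It is an illustration under several assumptions rather than a complete solution: `ReflectionConverter` and `SampleUser` are hypothetical names, only public read/write properties of types that `Convert.ChangeType` can handle (strings, numbers, and similar simple types) are supported, properties are ordered by name to keep the format stable, and values must not contain the separator characters.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Reflection;

// Hypothetical sample type used only to illustrate the converter.
public class SampleUser
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class ReflectionConverter
{
    private const char ValueSeparator = ',';
    private const char ItemSeparator = ';';

    // Public instance properties that can be read and written, in a stable (name) order.
    private static PropertyInfo[] GetProperties(Type type)
    {
        return type.GetProperties(BindingFlags.Public | BindingFlags.Instance)
                   .Where(p => p.CanRead && p.CanWrite && p.GetIndexParameters().Length == 0)
                   .OrderBy(p => p.Name, StringComparer.Ordinal)
                   .ToArray();
    }

    public static string ConvertListToString<T>(IEnumerable<T> items)
    {
        PropertyInfo[] properties = GetProperties(typeof(T));
        return string.Join(ItemSeparator.ToString(), items.Select(item =>
            string.Join(ValueSeparator.ToString(), properties.Select(p =>
                Convert.ToString(p.GetValue(item), CultureInfo.InvariantCulture)))));
    }

    public static List<T> ParseStringToList<T>(string text) where T : class, new()
    {
        if (text == null) throw new ArgumentNullException("text");
        var result = new List<T>();
        if (text.Length == 0) return result;

        PropertyInfo[] properties = GetProperties(typeof(T));
        foreach (string chunk in text.Split(ItemSeparator))
        {
            string[] values = chunk.Split(ValueSeparator);
            if (values.Length != properties.Length)
                throw new ArgumentException("Text contains invalid data.");

            var item = new T();
            for (int i = 0; i < properties.Length; i++)
            {
                // Convert the text back to the property's declared type.
                object value = Convert.ChangeType(values[i], properties[i].PropertyType, CultureInfo.InvariantCulture);
                properties[i].SetValue(item, value);
            }
            result.Add(item);
        }
        return result;
    }
}
```

The trade-off compared with the hand-written converter is that the format now depends on the property names of the class, so renaming or adding a property changes what previously stored strings mean.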
I think what you're looking for is something that lets you dump all users to a string and get the users back from the string, correct?
I suggest something like this:
Add a method that returns an XElement to the Users type:
public XElement GetXElement()
{
return new XElement("User", new XElement("Name", this.FirstName)); //and so on...
}
and then one that decodes an element back into a user:
static User GetUserFromXElement(XElement element)
{
User user = new User();
foreach (XElement inner in element.Elements())
{
switch (inner.Name.LocalName)
{
case "Name":
// mirrors GetXElement, which writes FirstName into the "Name" element
user.FirstName = inner.Value;
break;
//whatever
}
}
return user;
}
And then do this:
public string UsersToElements (List<User> toWrite)
{
StringBuilder sb = new StringBuilder();
StringWriter sw = new StringWriter(sb);
XElement root = new XElement("root");
XDocument temp = new XDocument(root);
foreach (User user in toWrite)
{
root.Add(user.GetXElement());
}
temp.Save(sw);
return sw.ToString();
}
and this:
public List<User> ElementsToUsers (string xml)
{
List<User> usrsList = new List<User>();
// Parse reads XML from a string; Load would treat the argument as a file path
XDocument temp = XDocument.Parse(xml);
foreach (XElement e in temp.Root.Elements())
{
usrsList.Add(User.GetUserFromXElement(e));
}
return usrsList;
}
JSON solution (using JSON.NET)
public JObject GetJObject()
{
return new JObject(new JProperty("name", this.FirstName)); //so on
}
static User GetUserFromJObject(JObject obj)
{
return new User() { FirstName = (string)obj["name"] }; //so on
}
public string UsersToElements (List<User> users)
{
// a JSON object cannot hold several properties named "user", so use an array
JArray root = new JArray(from usr in users select usr.GetJObject());
return root.ToString();
}
public List<User> ElementsToUsers(string json)
{
List<User> users = new List<User>();
JArray temp = JArray.Parse(json);
foreach (JObject o in temp.Children<JObject>())
{
users.Add(User.GetUserFromJObject(o));
}
return users;
}
I have no idea if this works :/ (well, the XML I know it does, not so sure about the JSON)
Use this code
string combinedString = string.Join( ",", myList );
var array = combinedString.Split( new [] {','} );
I have a flat text file that contains the following data;
Following are the names and ages in a text file.
26|Rachel
29|Chris
26|Nathan
The data is kept on a server (e.g http://domain.com/info.dat), I'd like to read this text file and insert it into an array (age and name). I'd like to ignore the first line (Following are....).
I've sorted the code to grab the data file using a webclient and the code to open the dat file using streamreader as follows;
using (StreamReader sr = new StreamReader(path))
{
while (sr.Peek() >= 0)
{
string[] channels = sr.ReadLine().Split('|');
foreach (string s in channels)
{
}
}
}
The problem with the above code is when it comes to inputting it into an array with the correct columns. Could anyone give me some pointers?
Many thanks
How about an answer that uses some LINQ:
var results = from str in File.ReadAllLines(path).Skip(1)
where !String.IsNullOrEmpty(str)
let data = str.Split('|')
where data.Length == 2
select new Person { Age = Int32.Parse(data[0], NumberStyles.Integer, CultureInfo.CurrentCulture), Name = data[1] };
results is now IEnumerable<Person> which you can do ToList or ToArray on to get a List<Person> or Person[], or you can simply use the results with a foreach loop.
UPDATE: here is the Person class needed to make this more functional.
public class Person
{
public int Age { get; set; }
public string Name { get; set; }
}
You could do something like this. (There is no error checking; you might want to check for errors when parsing the age, etc.)
class Person
{
public string Name {get;set;}
public int Age {get;set;}
}
List<Person> people = new List<Person>();
string line;
using (StreamReader sr = new StreamReader(path))
{
sr.ReadLine();
while ((line = sr.ReadLine()) != null)
{
string[] channels = line.Split('|');
people.Add(new Person() {Age=int.Parse(channels[0]), Name=channels[1]});
}
}
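The error checking mentioned above could be added with int.TryParse, so that a malformed line is set aside instead of throwing. This is a minimal sketch under assumptions: `PersonRow` and `PersonParser.Parse` are hypothetical names, and collecting rejected lines in a list is just one possible way to report problems.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical row type mirroring the Person class above.
public class PersonRow
{
    public int Age { get; set; }
    public string Name { get; set; }
}

public static class PersonParser
{
    // Parses "age|name" lines; malformed lines are added to 'rejected' instead of throwing.
    public static List<PersonRow> Parse(IEnumerable<string> lines, List<string> rejected)
    {
        var people = new List<PersonRow>();
        foreach (string line in lines)
        {
            if (string.IsNullOrWhiteSpace(line)) continue;
            string[] parts = line.Split('|');
            int age;
            if (parts.Length != 2 ||
                !int.TryParse(parts[0].Trim(), NumberStyles.Integer, CultureInfo.InvariantCulture, out age))
            {
                rejected.Add(line); // wrong column count or non-numeric age
                continue;
            }
            people.Add(new PersonRow { Age = age, Name = parts[1].Trim() });
        }
        return people;
    }
}
```

The caption line can be skipped by the caller, for example with `File.ReadLines(path).Skip(1)`; if it is passed in, it ends up in the rejected list because it contains no `|`.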
You should use a Dictionary rather than an array to store the data. Ages repeat in the sample (26 appears twice), so the name is used as the key here.
Sample code:
Dictionary<string,int> dict = new Dictionary<string,int>();
string line = "";
using (StreamReader sr = new StreamReader("filename"))
{
sr.ReadLine(); //skip the first line
while( (line = sr.ReadLine()) != null)
{
string[] parts = line.Split('|');
dict[parts[1]] = int.Parse(parts[0]);
}
}