Strongly typed parsing of CSV-files - c#

So after about an hour's worth of pulling my hair in desperation, I decided to follow the advice from, like, everybody in here and not implement my own CSV-parser.
So I went with FileHelpers instead.
But I am having a bit of trouble using it correctly.
My CSV-file looks something like this:
50382018,50319368,eBusiness Manager,IT02,3350_FIB4,IT,2480
50370383,50373053,CRM Manager,IT01,3200_FIB3,xyz,2480
50320067,50341107,"VP, Business Information Officer",IT03,3200_FI89,xyz,2480
50299061,50350088,Project Expert,IT02,8118_FI09,abc,2480
My need for FileHelpers (and specifically CsvEngine) comes from line 3: notice that the third column is enclosed in quotes, since it contains a comma (which is otherwise used as the delimiter).
My code to read the file is this:
var co = new FileHelpers.Options.CsvOptions("Employee", columnDeliminator, 7);
var ce = new CsvEngine(co);
var records = ce.ReadFile(pathToCSVFile);
It works fine - sort of. It correctly parses the lines and recognizes the values with enclosed delimiters.
But.
The return value of the ReadFile() method is object[], and its contents appear to be some kind of dynamic type.
It looks something like this - where the columns are named "Field_1", "Field_2" etc.
I have created a "data class" intended to hold the parsed lines. It looks like this:
public class Employee
{
public string DepartmentPosition;
public string ParentDepartmentPosition;
public string JobTitle;
public string Role;
public string Location;
public string NameLocation;
public string EmployeeStatus;
}
Is there a way to have FileHelpers' CsvEngine class return strongly typed data?
If I could just use the "basic" parser of FileHelpers, I could use this code:
var engine = new FileHelperEngine<Employee>();
var records = engine.ReadFile("Input.txt");
Is there a way to have CsvEngine return instances of my "Employee" class? Or do I have to write my own mapping code to support this?

@shamp00 has the correct answer, which I also found under "FileHelper escape delimiter".
I took my model class and decorated each property on it as suggested:
(I probably don't need to decorate all properties, but it works for now)
[DelimitedRecord(",")]
public class Employee
{
[FieldQuoted('"', QuoteMode.OptionalForBoth)]
public string DepartmentPosition;
[FieldQuoted('"', QuoteMode.OptionalForBoth)]
public string ParentDepartmentPosition;
[FieldQuoted('"', QuoteMode.OptionalForBoth)]
public string JobTitle;
[FieldQuoted('"', QuoteMode.OptionalForBoth)]
public string Role;
[FieldQuoted('"', QuoteMode.OptionalForBoth)]
public string Location;
[FieldQuoted('"', QuoteMode.OptionalForBoth)]
public string NameLocation;
[FieldQuoted('"', QuoteMode.OptionalForBoth)]
public string EmployeeStatus;
}
Now I just need this code:
TextReader reader = new StreamReader(contents);
var engine = new FileHelperEngine<Employee>()
{
Options = { IgnoreFirstLines = 1 }
};
var myRecords = engine.ReadStream(reader);

The documentation showed me one simple way that worked:
First, your class needs a couple of decorators:
Edit: Use the FieldQuoted decorator to parse anything in quotes and ignore the enclosed comma.
[DelimitedRecord(",")]
class Person
{
[FieldQuoted]
public string Name { get; set; }
[FieldConverter(ConverterKind.Int32)]
public int Age { get; set; }
public string State { get; set; }
}
DelimitedRecord goes on the class and names the expected delimiter (this could be a problem if things change later),
and FieldConverter is needed, it appears, for anything other than string.
Then change your reading method slightly:
var fhr = new FileHelperEngine<Person>();
var readLines = fhr.ReadFile(pathToFile);
and then it works, strongly typed:
foreach(var person in readLines)
{
Console.WriteLine(person.Name);
}

Using CsvHelper as a viable alternative and assuming the CSV file has no headers,
a mapping can be created for the Employee class like
public sealed class EmployeeClassMap : ClassMap<Employee> {
public EmployeeClassMap() {
Map(_ => _.Location).Index(0);
Map(_ => _.NameLocation).Index(1);
Map(_ => _.JobTitle).Index(2);
//...removed for brevity
}
}
Where the index is mapped to a respective property on the strongly typed object model.
To use this mapping, you need to register the mapping in the configuration.
using (var textReader = new StreamReader(pathToCSVFile)) {
var csv = new CsvReader(textReader);
csv.Configuration.RegisterClassMap<EmployeeClassMap>();
var records = csv.GetRecords<Employee>();
//...
}

If this library does not work for you, you can also try the built-in .NET CSV parser, TextFieldParser. For example: https://coding.abel.nu/2012/06/built-in-net-csv-parser/
ADDED:
For typed values (with automatic conversion):
static void run()
{
// one CSV line, already split into fields by any CSV library
string[] line = new string[]{"john", "doe", "201"};
// names of the target properties, in column order
string[] propNames = "fname|lname|room".Split('|');
Person p = new Person();
parseLine<Person>(p, line, propNames);
}
static void parseLine<T>(T t, string[] line, string[] propNames)
{
for(int i = 0;i<propNames.Length;i++)
{
string sprop = propNames[i];
PropertyInfo prop = t.GetType().GetProperty(sprop);
object val = Convert.ChangeType(line[i], prop.PropertyType);
prop.SetValue(t, val );
}
}
class Person
{
public string fname{get;set;}
public string lname{get;set;}
public int room {get;set;}
}
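The two halves of this answer can be combined. The following is only a minimal sketch, assuming the file has no header row and the caller knows the column order: TextFieldParser (from Microsoft.VisualBasic.FileIO) splits each line and respects quoted fields, and a reflection loop like parseLine above fills the object. The names CsvMapper and ReadAll are made up for illustration, and Person is repeated only so the sketch compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Reflection;
using Microsoft.VisualBasic.FileIO; // TextFieldParser lives here

public static class CsvMapper
{
    // Hypothetical helper: maps column i of every row to the property named propNames[i].
    public static List<T> ReadAll<T>(TextReader reader, string[] propNames) where T : class, new()
    {
        var result = new List<T>();
        using (var parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true; // a quoted field may contain commas

            while (!parser.EndOfData)
            {
                string[] fields = parser.ReadFields();
                var item = new T();
                for (int i = 0; i < propNames.Length && i < fields.Length; i++)
                {
                    PropertyInfo prop = typeof(T).GetProperty(propNames[i]);
                    if (prop == null || !prop.CanWrite)
                        continue; // unknown or read-only property: skip the column

                    // Convert the text to the property's type (int, double, string, ...).
                    object value = Convert.ChangeType(fields[i], prop.PropertyType, CultureInfo.InvariantCulture);
                    prop.SetValue(item, value);
                }
                result.Add(item);
            }
        }
        return result;
    }
}

// Same shape as the Person class above.
public class Person
{
    public string fname { get; set; }
    public string lname { get; set; }
    public int room { get; set; }
}
```

It could be called as CsvMapper.ReadAll<Person>(new StreamReader(path), new[] { "fname", "lname", "room" }). Note that the Employee class in the question uses public fields, so for that class the lookup would need GetField instead of GetProperty.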

Related

Use attributes to make headers more human readable with CSVHelper

I am trying to use CSVHelper to serialize a database that is constructed out of multiple classes like shown below. I would like to make the csv a bit more human readable by adding information on units (when appropriate) and by ordering the data so that the "Name" always appears first. The rest can come in whatever order.
I have a class like shown below.
[DataContract(IsReference = true)]
public class OpaqueMaterial : LibraryComponent
{
[DataMember]
[Units("W/m.K")]
public double Conductivity { get; set; } = 2.4;
[DataMember]
public string Roughness { get; set; } = "Rough";
}
[DataContract]
public abstract class LibraryComponent
{
[DataMember, DefaultValue("No name")]
public string Name { get; set; } = "No name";
}
To avoid writing separate read/write functions for each class, I am reading and writing with generic (templated) functions like those given below:
public void writeLibCSV<T>(string fp, List<T> records)
{
using (var sw = new StreamWriter(fp))
{
var csv = new CsvWriter(sw);
csv.WriteRecords(records);
}
}
public List<T> readLibCSV<T>(string fp)
{
var records = new List<T>();
using (var sr = new StreamReader(fp))
{
var csv = new CsvReader(sr);
records = csv.GetRecords<T>().ToList();
}
return records;
}
I then use these in the code to read and write like this:
writeLibCSV<OpaqueMaterial>(folderPath + @"\OpaqueMaterial.csv", lib.OpaqueMaterial.ToList());
List<OpaqueMaterial> inOpaqueMaterial = readLibCSV<OpaqueMaterial>(folderPath + @"\OpaqueMaterial.csv");
The CSV output then looks like:
Conductivity, Roughness, Name
2.4, Rough, No Name
I would like it to come out as:
Name, Conductivity [W/m.K], Roughness
No Name, 2.4, Rough
I know that the reordering is possible using maps like:
public class MyClassMap : ClassMap<OpaqueMaterial>
{
public MyClassMap()
{
Map(m => m.Name).Index(0);
AutoMap();
}
}
I would like to make this abstract so that I don't have to apply a different mapping to every class. I was not able to find an example that could help with adding the custom headers. Any suggestions or help would be greatly appreciated.
You could create a generic version of ClassMap<T> that will automatically inspect the type T using reflection and then construct the mapping dynamically based on the properties it finds and based on the attributes that may or may not be attached to it.
Without knowing the CsvHelper library too well, something like this should work:
public class AutoMap<T> : ClassMap<T>
{
public AutoMap()
{
var properties = typeof(T).GetProperties();
// map the name property first
var nameProperty = properties.FirstOrDefault(p => p.Name == "Name");
if (nameProperty != null)
MapProperty(nameProperty).Index(0);
foreach (var prop in properties.Where(p => p != nameProperty))
MapProperty(prop);
}
private MemberMap MapProperty(PropertyInfo pi)
{
var map = Map(typeof(T), pi);
// set name
string name = pi.Name;
var unitsAttribute = pi.GetCustomAttribute<UnitsAttribute>();
if (unitsAttribute != null)
name = $"{name} [{unitsAttribute.Unit}]"; // brackets match the header asked for in the question
map.Name(name);
// set default
var defaultValueAttribute = pi.GetCustomAttribute<DefaultValueAttribute>();
if (defaultValueAttribute != null)
map.Default(defaultValueAttribute.Value);
return map;
}
}
Now, you just need to create an AutoMap<T> for every type T that you want to support.
I’ve added examples for a UnitsAttribute and the DefaultValueAttribute, that should give you an idea on how to proceed with more attributes if you need more.
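The question uses [Units("W/m.K")] but never shows the attribute class, so here is one possible definition, together with the header-name logic on its own. This is a sketch under assumptions: the Unit property name is chosen to match the code above, HeaderNames and SampleMaterial are hypothetical, and the square brackets come from the desired output in the question.

```csharp
using System;
using System.Reflection;

// Assumed definition: the original post does not include this class.
[AttributeUsage(AttributeTargets.Property)]
public sealed class UnitsAttribute : Attribute
{
    public string Unit { get; }

    public UnitsAttribute(string unit)
    {
        Unit = unit;
    }
}

public static class HeaderNames
{
    // Returns "Name [unit]" when the property carries [Units], otherwise just "Name".
    public static string For(PropertyInfo pi)
    {
        var units = pi.GetCustomAttribute<UnitsAttribute>();
        return units == null ? pi.Name : $"{pi.Name} [{units.Unit}]";
    }
}

// Hypothetical sample type used only to exercise the helper.
public class SampleMaterial
{
    [Units("W/m.K")]
    public double Conductivity { get; set; }

    public string Roughness { get; set; }
}
```

With the CsvHelper version used in the earlier answer, registration would presumably look like csv.Configuration.RegisterClassMap<AutoMap<OpaqueMaterial>>(); newer versions of the library may expose this differently.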

Misunderstanding CSV format

First of all, I want to say: "I know there are XML/JSON/YAML formats, and I know how they work". But now I'm facing the task of exporting to a CSV file.
I've read about CSV on Wikipedia and searched StackOverflow for CSV topics, but didn't find an answer.
From what I've read, it is a popular format for data that will later be displayed as Excel tables.
Okay, if I have a simple class with only simple-valued properties, it's just fine.
public class MyClass
{
public int ID { get; set; }
public string Name { get; set; }
public string ToCsvString()
{
return string.Format("{0};{1}", ID, Name);
}
public static MyClass FromCsvString(string source)
{
var parts = source.Split(';');
var id = int.Parse(parts[0]);
var name = parts[1];
return new MyClass()
{
ID = id,
Name = name,
};
}
}
But what if I have a slightly more complex class, for example one with a List<> of other objects?
public class MyClassWithList: MyClass
{
public MyClassWithList()
{
ItemsList = new List<string>();
}
public List<string> ItemsList { get; set; }
public string ToCsvString()
{
// How to format it for future according to CSV format?
return string.Format("{0};{1}", base.ToCsvString(), ItemsList.ToString());
}
public static MyClassWithList FromCsvString(string source)
{
var parts = source.Split(';');
var id = int.Parse(parts[0]);
var name = parts[1];
// How to get it back from CSV formatted string?
var itemsList = parts[2];
return new MyClassWithList()
{
ID = id,
Name = name,
ItemsList = new List<string>()
};
}
}
How should I serialize/deserialize it to CSV?
And the final question: how do I do the same when class A contains class B instances?
First off, you have to flatten your data.
If ClassA contains a ClassB, then you'll need to create a flattened POCO that has properties that access any nested properties, e.g. ClassB_PropertyA.
You can really only have one variable-length property, and it has to be the last one; every column after that point then represents one item of that single list property.
Secondly, there is no CSV serialization standard. There is https://www.ietf.org/rfc/rfc4180.txt but that only deals with reading text from fields. Something as simple as changing your locale can break CSV handling, because in cultures where a comma is the decimal separator, the field separator is typically a semicolon instead of a comma. There are also many bugs and edge cases in Excel that make serialization to strings problematic, and some data is automatically converted to dates or times. You need to determine which program you expect to open the CSV and learn how it handles CSV data.
Once you have a flat POCO, then a CSV is simply a header row with the name of each property followed by a row per object. There are libraries that can help you with this.
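To make the flattening concrete, here is a minimal sketch for the MyClassWithList case from the question: the fixed columns come first and the single variable-length list fills every column after them. The class and method names (CsvSketch, Escape, ToCsvLine) are made up for this example, the comma delimiter is an assumption, and the quoting follows RFC 4180 rather than any particular spreadsheet program.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class CsvSketch
{
    // Quote a field when it contains the delimiter, a double quote, or a line break;
    // embedded double quotes are doubled, as described in RFC 4180.
    public static string Escape(string field, char delimiter = ',')
    {
        if (field == null)
            return "";

        bool mustQuote = field.IndexOf(delimiter) >= 0
                         || field.IndexOfAny(new[] { '"', '\r', '\n' }) >= 0;
        return mustQuote ? "\"" + field.Replace("\"", "\"\"") + "\"" : field;
    }

    // Fixed columns (ID, Name) first; each list item then becomes one trailing column.
    public static string ToCsvLine(int id, string name, IEnumerable<string> items, char delimiter = ',')
    {
        var fields = new List<string> { id.ToString(CultureInfo.InvariantCulture), name };
        fields.AddRange(items);
        return string.Join(delimiter.ToString(), fields.Select(f => Escape(f, delimiter)));
    }
}
```

Reading the line back is the mirror image: parse the fields with a CSV library, take the first two as ID and Name, and treat everything from index 2 onwards as the list.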

Passing c# object as query string

I want to pass a C# object as a query string, and I used the following code to get the desired result.
class Program
{
public static string GetQueryString(object obj)
{
var properties = from p in obj.GetType().GetProperties()
where p.GetValue(obj, null) != null
select p.Name + "=" + HttpUtility.UrlEncode(p.GetValue(obj, null).ToString());
return String.Join("&", properties.ToArray());
}
static void Main(string[] args)
{
Filters fil = new Filters();
fil.Age = 10;
fil.Id = "some id";
fil.Divisions = new List<string>();
fil.Divisions.Add("div 1");
fil.Divisions.Add("div 2");
fil.Divisions.Add("div 3");
fil.Names = new List<string>();
fil.Names.Add("name 1");
fil.Names.Add("name 2");
fil.Names.Add("name 3");
var queryStr = GetQueryString(fil);
Console.ReadKey();
}
}
public class Filters
{
public List<string> Names { get; set; }
public List<string> Divisions { get; set; }
public int Age { get; set; }
public string Id { get; set; }
}
Using the above code gives me the following result:
Names=System.Collections.Generic.List%601%5bSystem.String%5d&Divisions=System.Collections.Generic.List%601%5bSystem.String%5d&Age=10&Id=some+id
The output is not a valid query string. I need help converting any POCO class into query string format.
I have a similar JavaScript object and I am able to convert it into a correct query string.
{
"id":"some id",
"age":10,
"division":["div 1","div 2","div 3"],
"names":["name 1","name 2","name 3"]
}
Using jQuery I can call $.param(obj), and this results in:
"id=some+id&age=10&division%5B%5D=div+1&division%5B%5D=div+2&division%5B%5D=div+3&names%5B%5D=name+1&names%5B%5D=name+2&names%5B%5D=name+3"
I want similar output using C#.
It looks like the problem is that you are calling ToString() on your objects. List<String>.ToString() returns the type name, "System.Collections.Generic.List`1[System.String]", which is what you're seeing, except URL-encoded.
You will need to either:
Provide an interface with a ToQueryString method:
public interface IQueryStringable
{
string ToQueryString();
}
and have all classes you might want to use as query strings implement it, or
Rewrite your reflection so that it iterates sequences. Something like (pseudocode):
Get property.
See if it is an instance of IEnumerable. If not, proceed as before
Otherwise:
for each item, construct a string consisting of the property name, "[]=" and the value of that item.
URL-encode each name and value, then join the produced strings with "&".
For sanity's sake, I would recommend option 1, and I enjoy playing with reflection. It gets more complex if you want to allow arbitrary nesting of classes.
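Option 2 can be sketched roughly as follows. This is only an illustration under assumptions: it handles one level of sequences (no nested classes), uses WebUtility.UrlEncode from System.Net instead of HttpUtility so that it does not need System.Web, and the name QueryStringSketch is invented for the example. Names and values are encoded separately so that the "=" and "&" separators stay intact.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Globalization;
using System.Net;

public static class QueryStringSketch
{
    public static string ToQueryString(object obj)
    {
        var pairs = new List<string>();
        foreach (var p in obj.GetType().GetProperties())
        {
            if (p.GetIndexParameters().Length > 0)
                continue; // indexers cannot be read without arguments

            object value = p.GetValue(obj, null);
            if (value == null)
                continue;

            // string is also IEnumerable (of char), so it must be treated as a scalar.
            var sequence = value as IEnumerable;
            if (sequence != null && !(value is string))
            {
                // One "name[]=item" pair per element, in the style of jQuery's $.param.
                foreach (object item in sequence)
                {
                    if (item != null)
                        pairs.Add(Pair(p.Name + "[]", item));
                }
            }
            else
            {
                pairs.Add(Pair(p.Name, value));
            }
        }
        return string.Join("&", pairs);
    }

    private static string Pair(string name, object value)
    {
        string text = Convert.ToString(value, CultureInfo.InvariantCulture);
        return WebUtility.UrlEncode(name) + "=" + WebUtility.UrlEncode(text);
    }
}
```

Calling QueryStringSketch.ToQueryString(fil) with the Filters object from the question would then emit one Names[] and one Divisions[] pair per list item. The property names keep their C# casing, so matching the lower-case keys of the JavaScript object would need an extra naming step.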

c# is there a method to serialize to UrlEncoded?

I want to use Facebook's API and I find it hard to convert objects to URL-encoded form.
So, for now I have something like:
string postData = JsonConvert.SerializeObject(req);
postData = postData.Replace(@"\", "");
postData = HttpUtility.UrlEncode(postData);
byte[] data = Encoding.UTF8.GetBytes(postData);
string facebookUrl = "https://graph.facebook.com/v2.5/";
The problem is that Facebook doesn't seem to accept JSON, only URL-encoded data; correct me if I'm wrong.
So I'm pretty sure converting objects to a URL-encoded string is impossible in .NET 4.5.1, because I've tried some of the answers to older questions about this and they are not working for me.
For example:
var result = new List<string>();
foreach (var property in TypeDescriptor.GetProperties(req))
{
result.Add(property.Name + "=" + property.GetValue(req));
}
postData = string.Join("&", result);
but .Name and .GetValue aren't defined at all.
I would like to get some help with that, TIA.
Objects i use:
internal sealed class FacebookValidationRequest
{
public string access_token;
public fbReq[] batch;
public string method;
public string format;
public int pretty;
public int suppress_http_code;
public string debug;
public FacebookValidationRequest(string appId, string userToken)
{
access_token = userToken;
batch = new[]
{
//test code
new fbReq("GET", "me"),
new fbReq("GET", "me/friends?limit=50") //,
//new fbReq("GET", "app?access_token=" + userToken)
};
method = "post";
format = "json";
pretty = 0;
suppress_http_code = 1;
debug = "all";
}
}
internal sealed class fbReq
{
public string method;
public string relative_url;
public fbReq(string m, string url)
{
method = m;
relative_url = url;
}
}
FacebookValidationRequest req = new FacebookValidationRequest(appToken, userToken);
Also, I took the token from the Facebook debugger site.
How Facebook wants the object to look after encoding:
access_token=mytoken&batch=%5B%7B%22method%22%3A%22GET%22%2C%20%22relative_url%22%3A%22me%22%7D%2C%7B%22method%22%3A%22GET%22%2C%20%22relative_url%22%3A%22me%2Ffriends%3Flimit%3D50%22%7D%5D&debug=all&fields=id%2Cname&format=json&method=post&pretty=0&suppress_http_code=1
Seems to me that the easiest way to do this is with Attributes to describe your properties, just like how the .Net Json's DataContract system does it. Basically, you assign an attribute to each property you want serialized, and make that attribute contain the name to serialize it as. I don't think you want to get into the mess of actually writing your own DataContractSerializer, though, so it might be easier to simply create your own Property class and a simple serializer using reflection.
The attribute class:
[AttributeUsage(AttributeTargets.Property)]
public sealed class UrlEncodeAttribute : System.Attribute
{
public String Name { get; private set; }
public UrlEncodeAttribute(String name)
{
this.Name = name;
}
}
Then, to apply to your data class... put the attributes on all properties:
internal sealed class FacebookValidationRequest
{
[UrlEncodeAttribute("access_token")]
public String AccessToken { get; set; }
[UrlEncodeAttribute("method")]
public String Method { get; set; }
[UrlEncodeAttribute("format")]
public String Format { get; set; }
[UrlEncodeAttribute("pretty")]
public Int32 Pretty { get; set; }
[UrlEncodeAttribute("suppress_http_code")]
public Int32 SuppressHttpCode { get; set; }
[UrlEncodeAttribute("debug")]
public string Debug { get; set; }
public fbReq[] Batch { get; set; }
[UrlEncodeAttribute("batch")]
public String BatchString
{
get
{
// Return the contents of Batch as a JSON string here; Json.NET,
// which the question already uses, is one option.
return JsonConvert.SerializeObject(Batch);
}
}
}
As you see, Batch does not have the UrlEncodeAttribute, while its string representation BatchString does. Its get is what will be called by the serializer, so you can put the conversion code in there.
Also note that thanks to the text names you give in the attributes, your properties don't need to have the names you actually get in the serialization, which looks much cleaner in my opinion. C#'s own serialization to xml and json works in the same way.
Now, let's take a look at the actual serialization, using reflection to get those properties:
public static String Serialize(Object obj, Boolean includeEmpty)
{
// go over the properties, see which ones have a UrlEncodeAttribute, and process them.
StringBuilder sb = new StringBuilder();
PropertyInfo[] properties = obj.GetType().GetProperties();
foreach (PropertyInfo p in properties)
{
object[] attrs = p.GetCustomAttributes(true);
foreach (Object attr in attrs)
{
UrlEncodeAttribute fldAttr = attr as UrlEncodeAttribute;
if (fldAttr == null) // not a UrlEncodeAttribute: try the next attribute
continue;
String objectName = fldAttr.Name;
Object objectDataObj = p.GetValue(obj, null);
String objectData = objectDataObj == null ? String.Empty : objectDataObj.ToString();
if (objectData.Length > 0 || includeEmpty)
{
objectData = HttpUtility.UrlEncode(objectData);
objectName= HttpUtility.UrlEncode(objectName);
if (sb.Length > 0)
sb.Append("&");
sb.Append(objectName).Append("=").Append(objectData);
}
break; // Only handle one UrlEncodeAttribute per property.
}
}
return sb.ToString();
}
A more advanced version of this could be made by including a serialization method property in the UrlEncodeAttribute class (probably best done with an enum), so you can simply specify to serialize the array on the fly using json. You'll obviously need to put the actual json converter into the Serialize function then. I thought using the getter on a dummy property as preparation method was simpler, here.
Obviously, calling it is simply this: (assuming here the Serialize() function is in a class called UrlEncodeSerializer)
FacebookValidationRequest fbreq = new FacebookValidationRequest();
// fill your data into fbreq here
// ...
// includeEmpty is set to true for testing here, but normally in
// UrlEncoded any missing property is just seen as empty anyway, so
// there should be no real difference.
String serialized = UrlEncodeSerializer.Serialize(fbreq, true);

Combine 4 parameters into something that only takes three arguments?

I've got some static data that I'm experimenting with in C#. The first piece looks like this, which basically just declares the fields:
public class PrivilegeProfiles
{
string PROFILE_ID;
string COMPANY_CODE;
string PRIVILEGE_CODE;
public PrivilegeProfiles(string PROFILE_ID, string COMPANY_CODE, string PRIVILEGE_CODE)
{
this.PROFILE_ID = PROFILE_ID;
this.COMPANY_CODE = COMPANY_CODE;
this.PRIVILEGE_CODE = PRIVILEGE_CODE;
}
}
That's all fine and good, but I've got a second method that calls .Add, and since the constructor only takes 3 arguments I can't add all the static data I need. PRIVILEGE_CODE has multiple bits of data, whereas PROFILE_ID and COMPANY_CODE only have one. Are there certain brackets I've got to use, or is there a way I have to format it for it to work?
public ServiceResponse GetPrivileges()
{
ServiceResponse sR = new ServiceResponse();
List<PrivilegeProfiles> privilegeProfiles;
privilegeProfiles.Add(new PrivilegeProfiles("Train Manager","GW",["DASuper" "DAAccess" "MRSuper", "MRAccess"]);
sR.DataResponse=privilegeProfiles;
return sR;
}
I think you might want the PRIVILEGE_CODE field to be an array of strings instead of a string. For example:
public class PrivilegeProfiles
{
string PROFILE_ID;
string COMPANY_CODE;
string[] PRIVILEGE_CODE;
public PrivilegeProfiles(string aPROFILE_ID, string aCOMPANY_CODE, string[] aPRIVILEGE_CODE)
{
this.PROFILE_ID = aPROFILE_ID;
this.COMPANY_CODE = aCOMPANY_CODE;
this.PRIVILEGE_CODE = aPRIVILEGE_CODE;
}
}
and
public ServiceResponse GetPrivileges()
{
ServiceResponse sR = new ServiceResponse();
List<PrivilegeProfiles> privilegeProfiles = new List<PrivilegeProfiles>(); // must be created before Add
privilegeProfiles.Add(new PrivilegeProfiles("Train Manager","GW", new string[] {"DASuper","DAAccess","MRSuper","MRAccess"}));
sR.DataResponse=privilegeProfiles;
return sR;
}
You either add more variables to your PrivilegeProfiles class that can hold all the information you have or you find a format so that all your PRIVILEGE_CODE data fits into a string. Some examples for your ["DASuper" "DAAccess" "MRSuper", "MRAccess"] as a string could be:
"DASuper,DAAccess,MRSuper,MRAccess"
"DASuper;DAAccess;MRSuper;MRAccess"
"DASuper DAAccess MRSuper MRAccess"
or whatever separator you please
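If you go the single-string route, a small helper keeps the packing and unpacking in one place. This is just a sketch: the PrivilegeCodes class and its Pack/Unpack methods are hypothetical, and it assumes that no privilege code ever contains the separator character.

```csharp
using System;

public static class PrivilegeCodes
{
    // Assumed separator; any character that cannot occur inside a code will do.
    private const char Separator = ';';

    // Joins the individual codes into the one string the 3-argument constructor accepts.
    public static string Pack(params string[] codes)
    {
        return string.Join(Separator.ToString(), codes);
    }

    // Splits the stored string back into the individual codes.
    public static string[] Unpack(string packed)
    {
        return string.IsNullOrEmpty(packed) ? new string[0] : packed.Split(Separator);
    }
}
```

The original constructor could then be called as new PrivilegeProfiles("Train Manager", "GW", PrivilegeCodes.Pack("DASuper", "DAAccess", "MRSuper", "MRAccess")).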
