I'm parsing a tab-delimited file using FileHelpers.
Null values are being ignored even after applying the FieldNullValue attribute, and I end up with this error in the log:
can't be found after the field 'field name' at line 4 (the record has less fields, the delimiter is wrong or the next field must be marked as optional).
Class definition of delimiter:
[DelimitedRecord("\t")]
Fields are all strings with the same attributes:
[FieldTrim(TrimMode.Both)]
[FieldNullValue("NULL")]
[FieldQuoted('"', QuoteMode.OptionalForRead, MultilineMode.AllowForRead)]
public String initials;
Looking at the imported file in a hex editor, I can see back-to-back tab characters (09 09), which I assume indicate a null field.
As you can see in the screen capture, fields 5 and 9 are null. These get ignored by the FileHelpers parser. Does anyone know why?
I think you have two problems going on.
Firstly, FileHelpers is expecting one more tab. One easy fix is to mark the last field with the [FieldOptional] attribute.
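As a minimal sketch (the class and field names here are placeholders, not taken from the asker's actual record), the optional last field would look like this:

```csharp
// Sketch: [FieldOptional] on the last field lets FileHelpers accept a
// record that ends without the trailing tab, instead of raising the
// "can't be found after the field ..." error.
using FileHelpers;

[DelimitedRecord("\t")]
public class MyRecord
{
    [FieldTrim(TrimMode.Both)]
    [FieldQuoted('"', QuoteMode.OptionalForRead, MultilineMode.AllowForRead)]
    public string initials;

    [FieldOptional] // this field may be absent from a line entirely
    public string lastField;
}
```

With this in place, a line that stops at the second-to-last field still parses, and lastField is simply left at its default (null).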
Secondly, FieldNullValue("NULL") means: If the value of the field in the file is null, set it to the string "NULL". The value in your file is "", not null. If you need to convert empty values to something else, you can use a custom converter as follows:
public class MyEmptyFieldConverter : ConverterBase
{
    protected override bool CustomNullHandling
    {
        // you need to tell the converter not
        // to handle empty values automatically
        get { return true; }
    }

    public override object StringToField(string from)
    {
        if (String.IsNullOrWhiteSpace(from))
            return "NULL";
        return from;
    }
}
And then add the attribute to your field.
[FieldConverter(typeof(MyEmptyFieldConverter))]
public string field9;
Removing the attribute:
[FieldTrim(TrimMode.Both)]
seems to have solved the problem.
I have a Web API middle layer which consumes an API which exposes a field which carries a timestamp as string (the field is string and it contains a value like "2016-05-31T14:12:45.753Z").
The proxy classes in the middle tier are generated using Visual Studio from Swagger endpoint and under the hood the object is deserialized using Json.NET.
I can see that the field was received as string (that's good):
inputObject {{ "When": "2016-05-31T14:12:45.753Z" }} Newtonsoft.Json.Linq.JToken {Newtonsoft.Json.Linq.JObject}
However, even though the target field is a string, the value of inputObject["When"] is parsed as a timestamp:
inputObject["When"] {31/05/2016 14:12:45} Newtonsoft.Json.Linq.JToken {Newtonsoft.Json.Linq.JValue}
Then:
JToken whenValue = inputObject["When"];
if (whenValue != null && whenValue.Type != JTokenType.Null)
{
    this.When = ((string)whenValue);
}
In the end this.When is a string with value 31/05/2016 14:12:45.
Is there an option to prevent json.net from parsing the date and then casting it to string again?
Please remember that this transformation happens in auto generated code so I'm looking for some way of decorating the field on the server side which would make Swagger mark it somehow and then the generated classes would avoid the deserialize/serialize issue.
Something like:
[JsonProperty("This really is a string, leave it alone")]
public string When { get; private set; }
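For what it's worth, Json.NET itself has a switch for this when you control the reader: setting DateParseHandling to None stops JObject loading from recognizing ISO 8601 strings as dates. A minimal sketch (whether this setting is reachable from the auto-generated proxy code is another matter):

```csharp
// Sketch: DateParseHandling.None makes the reader leave date-like
// strings untouched when loading into a JObject.
using System;
using System.IO;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

class Demo
{
    static void Main()
    {
        string json = @"{ ""When"": ""2016-05-31T14:12:45.753Z"" }";

        using (var reader = new JsonTextReader(new StringReader(json)))
        {
            // Leave ISO 8601 strings as strings instead of DateTime tokens.
            reader.DateParseHandling = DateParseHandling.None;
            JObject obj = JObject.Load(reader);
            Console.WriteLine(obj["When"].Type);    // String, not Date
            Console.WriteLine((string)obj["When"]); // original text preserved
        }
    }
}
```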
(Answering my own question)
I needed a solution quickly and this is my temporary solution, for the record.
I format the date as
"When": "2016-05-31 14:12:45"
and not
"When": "2016-05-31T14:12:45.753Z"
This prevents it from being interpreted. The front end (javascript) code knows that timestamps from the API are UTC and it appends 'Z' before transforming the timestamp to local time and formatting for display, e.g:
<td>{{vm.prettyDateTimeFormat(item.StatusDate+'Z')}}</td>
The ctrl code:
vm.prettyDateTimeFormat = function (dateString)
{
    var momentDate = moment(dateString, "YYYY-MM-DD HH:mm:ssZZ");
    if (typeof(momentDate) === "undefined" || (!momentDate.isValid()))
    {
        return dateString;
    }
    // The format needs to be sortable as it ends up in the grid.
    var nicePrettyDate = momentDate.format('YYYY-MM-DD HH:mm:ss');
    return nicePrettyDate;
}
As much as I dislike this solution, it carried us through the demo. The issue is now in the backlog to be addressed properly.
[JsonIgnore]
public string When { get; private set; }
Problem
I need to sanitize a collection of Strings from user input to a valid property name.
Context
We have a DataGrid that works with runtime-generated classes. These classes are generated based on some parameters, and the parameter names are converted into properties. Some of these parameter names come from user input. We implemented this and it all seemed to work great. Our logic for sanitizing strings was to allow only numbers and letters and convert everything else to an X.
const string regexPattern = @"[^a-zA-Z0-9]";
return "X" + Regex.Replace(input, regexPattern, "X"); // prefix with X in case the name starts with a number
The property names were always correct and we stored the original string in a dictionary so we could still show a user friendly parameter name.
However, where the trouble starts is when a string only differs in illegal characters like this:
Parameter Name
Parameter_Name
These were both converted into:
ParameterXName
A solution would be to just generate safe, unrelated names like A, B, C, etc., but I would prefer the name to still be recognizable while debugging, unless that behavior is too complicated to implement.
I looked at other questions on StackOverflow, but they all seem to remove illegal characters, which has the same problem.
I feel like I'm reinventing the wheel. Is there some standard solution or trick for this?
I can suggest changing the algorithm so that it generates safe, unrelated, and still recognizable names.
In C#, _ is a valid symbol in member names. Replace each invalid symbol (chr) not with X but with "_" + (short)chr + "_".
Demo:
public class Program
{
    public static void Main()
    {
        string[] props = { "Parameter Name", "Parameter_Name" };
        var validNames = props.Select(s => Sanitize(s)).ToList();
        Console.WriteLine(String.Join(Environment.NewLine, validNames));
    }

    private static string Sanitize(string s)
    {
        return String.Join("", s.AsEnumerable()
            .Select(chr => Char.IsLetter(chr) || Char.IsDigit(chr)
                ? chr.ToString()          // valid symbol
                : "_" + (short)chr + "_") // numeric code for invalid symbol
        );
    }
}
prints
Parameter_32_Name
Parameter_95_Name
I am using the CsvHelper package to write my C# models to Csv. I am using fluent class maps (inheriting from CsvClassMap) to define my field to property mappings.
The issue I have is that some of the property values look like dates to Excel, for example "2 - 4". I expect the end user to view these CSVs in Excel, and I do not want these values to show as dates, so I am looking to have CsvHelper surround this field with quotes. However, I want ONLY this field surrounded by quotes; there are OTHER fields containing data I WANT to be interpreted (e.g. dates). Can I configure my mapping to specify that this field should be quoted? I've played with using a type converter, but that's clearly the wrong approach, because a converter changes the VALUE rather than instructing how to format the field.
As of version 12 you can do this:
const int indexToQuote = 4;
csv.Configuration.ShouldQuote = (field, context) =>
    context.Record.Count == indexToQuote &&
    context.HasHeaderBeenWritten;
So, apparently quoting is not what I needed to do. Excel quite helpfully decides to treat numeric values that look remotely like dates as dates, unless the field begins with a space (which it then will not display). I feel like relying on this is rather hackish, but I'll take it. FWIW, here's the type converter I used:
public class LeadingSpaceTypeConverter : DefaultTypeConverter
{
    public override string ConvertToString(TypeConverterOptions options, object value)
    {
        if (value == null)
        {
            return String.Empty;
        }
        return String.Concat(" ", value.ToString());
    }
}
And the fluent code:
Map( m => m.CompanySize ).TypeConverter<LeadingSpaceTypeConverter>().Index( 4 );
Using FileHelpers.dll version 3.0.1.0 with .NET 4.0.
Code to reproduce issue:
Create a file Accounts.txt like this:
"AccountName","ExtraColumn"
"MR GREEN ","abc"
"MR SMITH ","def"
C# Account class:
[IgnoreFirst(1)]
[DelimitedRecord(",")]
[IgnoreEmptyLines()]
public class Account
{
    [FieldQuoted('"', QuoteMode.OptionalForBoth, MultilineMode.NotAllow)]
    public string AccountName;
}
To read the file:
string fname = @"C:\test\Accounts.txt";
FileHelperEngine engine = new FileHelperEngine(typeof(Account));
Account[] importNodes = (Account[])engine.ReadFile(fname); // XX
Now, I would have expected an exception to be raised at line XX, because there are more columns in the file (2) than in the Account class (1). However, no exception is raised, and the extra column seems to get silently ignored.
If you change the Account class to remove the "FieldQuoted" attribute, then an exception is indeed raised, like:
[FileHelpers.BadUsageException] {"Line: 2 Column: 0. Delimiter ',' found after the last field 'AccountName' (the file is wrong or you need to add a field to the record class)"} FileHelpers.BadUsageException
Can anyone provide some insight? Should the code with the FieldQuoted attribute indeed raise an exception? Am I doing something wrong?
EDIT: are there any workarounds so that an error is raised when there are more columns in the input file than expected?
[IgnoreFirst(1)] ignores first line of a file, not the first column.
See description here: IgnoreFirstAttribute
Removing FieldQuoted causes an error because you have quoted fields that can't be handled correctly without this attribute.
The FileHelpers library simply ignores fields in the file that don't have corresponding fields in your class.
To ignore the extra field in each row, you can just add a dummy field to your class definition like this:
[IgnoreFirst(1)]
[DelimitedRecord(",")]
[IgnoreEmptyLines()]
public class Account
{
    [FieldQuoted('"', QuoteMode.OptionalForBoth, MultilineMode.NotAllow)]
    public string AccountName;

    [FieldValueDiscarded]
    [FieldQuoted('"', QuoteMode.OptionalForBoth, MultilineMode.NotAllow)]
    public string Dummy;
}
As a workaround, I can suggest adding an ordinary string dummy field at the end. You can then check whether this field is filled in FileHelperEngine.AfterReadRecord and throw an exception, like this:
FileHelperEngine engine = new FileHelperEngine(typeof(Account));
engine.AfterReadRecord += engine_AfterReadRecord;
try
{
    Account[] importNodes = (Account[])engine.ReadFile(fname); // XX
}
catch (Exception e)
{
}

static void engine_AfterReadRecord(EngineBase engine, FileHelpers.Events.AfterReadEventArgs<object> e)
{
    if (!string.IsNullOrEmpty(((Account)e.Record).Dummy))
    {
        throw new ApplicationException("Unexpected field");
    }
}
I will receive a response in the form of a JSON string.
We have an existing tool, developed in C#, which takes input in XML format.
Hence I am converting the JSON string obtained from the server to an XML string using Newtonsoft.Json and passing it to the tool.
Problem:
When converting the JSON response to XML, I am getting an error:
"Failed to process request. Reason: The ' ' character, hexadecimal
value 0x20, cannot be included in a name."
The above error indicates that a JSON key contains a space [for example: \"POI Items\":[{\"lat\":{\"value\":\"00\"}]], which cannot be converted to an XML element name.
Is there any approach to identify JSON keys containing spaces ["POI Items"] and remove the spaces from them?
Alternatively, can you suggest a solution that doesn't require changing the existing tool?
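One standard BCL facility worth knowing in this situation: System.Xml.XmlConvert can escape characters that are illegal in XML names (a space becomes _x0020_) and decode them back later. A minimal sketch:

```csharp
// Sketch: XmlConvert.EncodeLocalName escapes characters that are illegal
// in XML element names; XmlConvert.DecodeName reverses the mapping.
using System;
using System.Xml;

class Demo
{
    static void Main()
    {
        string key = "POI Items";
        string encoded = XmlConvert.EncodeLocalName(key);
        Console.WriteLine(encoded);                        // POI_x0020_Items
        Console.WriteLine(XmlConvert.DecodeName(encoded)); // POI Items
    }
}
```

This would mean encoding the keys before the JSON-to-XML conversion, but it avoids hand-rolling the replacement logic.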
You can use Json.NET and replace the names while loading the JSON:
JsonSerializer ser = new JsonSerializer();
var jObj = ser.Deserialize(new JReader(new StringReader(json))) as JObject;
var newJson = jObj.ToString(Newtonsoft.Json.Formatting.None);
public class JReader : Newtonsoft.Json.JsonTextReader
{
    public JReader(TextReader r) : base(r)
    {
    }

    public override bool Read()
    {
        bool b = base.Read();
        if (base.CurrentState == State.Property && ((string)base.Value).Contains(' '))
        {
            // replace spaces in property names with underscores
            base.SetToken(JsonToken.PropertyName, ((string)base.Value).Replace(" ", "_"));
        }
        return b;
    }
}
Input : {"POI Items":[{"lat":{"value":"00","ab cd":"de fg"}}]}
Output: {"POI_Items":[{"lat":{"value":"00","ab_cd":"de fg"}}]}
I recommend using some form of Regex.Replace().
Search the input string for something like:
\"([a-zA-Z0-9]+) ([a-zA-Z0-9]+)\":
and replace it with (mind the missing space):
\"$1$2\":
The first pair of parentheses captures the first word of a variable name, and the second pair captures the second word; in the replacement, $1 and $2 refer back to those captures. The trailing : helps ensure the operation is done on variable names only (not on string data), since JSON variable names sit inside a pair of \"s.
Maybe it's not 100% correct, but you can start from this.
For details, check MSDN and some Regex examples:
http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex.replace.aspx
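A minimal runnable sketch of that approach (the pattern and sample input are illustrative):

```csharp
// Sketch of the Regex.Replace approach above: the pattern matches a
// quoted JSON key made of two alphanumeric runs separated by a single
// space, and $1/$2 splice the two captured words back together.
using System;
using System.Text.RegularExpressions;

class Demo
{
    static void Main()
    {
        string json = "{\"POI Items\":[{\"lat\":{\"value\":\"00\"}}]}";
        string fixedJson = Regex.Replace(json,
            "\"([a-zA-Z0-9]+) ([a-zA-Z0-9]+)\":",
            "\"$1$2\":");
        Console.WriteLine(fixedJson); // {"POIItems":[{"lat":{"value":"00"}}]}
    }
}
```

Note this only merges a single space between exactly two words; keys with several spaces or other characters would need a more general pattern.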