CSVhelper any way to ignore "=" present in CSV - c#

I have an application which generates CSV data. This application "helpfully" includes the Excel fix of using = as a preamble to quoted 0-filled numeric data, to prevent the Excel interpreter from eating the leading 0.
I want to use CSVHelper to read these records. However, when mapping to a number, CSVhelper reports an error for these values with the = prefix.
Other than search/replace to pull the "=" out, is there a way to tell CSVhelper to ignore the leading equals and process successfully? I see options for including = in the written output but not to allow them in the parse.
Here is an example record:
"XYZ INC","R1G202113","R2G",="202113","D-SRS PRO FLD SM/2",157.49,122.53,True,50,50,0.00,1,False,"N",4.00,6.00,8.00,6.00,""
Any help with this is appreciated.

This is a guess, and there may be a much better way, but perhaps you could do something like this with your mapping:
public class MyData
{
//map the raw input to this field as a string
public string MappedIntField {get;set;}=null;
// use an integer property not mapped to any column to shadow the string, and lazy-convert to an integer the first time you read it.
public int ActualIntField
{
get {
if (MappingReady || string.IsNullOrEmpty(MappedIntField)) return _ActualIntField;
//clean up the extra = character and, if the parser left them in place, the surrounding quotes.
if (MappedIntField[0] == '=') MappedIntField = MappedIntField.Substring(1).Trim('"');
int result;
if (int.TryParse(MappedIntField, out result))
{
_ActualIntField = result;
MappingReady = true;
return result;
}
return _ActualIntField;
}
set {
_ActualIntField = value;
MappingReady = true;
}
}
private int _ActualIntField;
// We don't want to re-parse the string on every read, so also flag when this work is done. You could also use a nullable int? to do this.
private bool MappingReady = false;
}
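The clean-up step in the getter above can also be pulled out into a small standalone helper, which makes it easy to test and to reuse from a mapping or a converter. This is only a sketch: the names ExcelField, StripFormulaPrefix and TryParseInt are made up for illustration and are not part of CsvHelper, and it assumes the raw field text arrives either as ="202113" or as =202113, depending on how the parser treats the quotes.

```csharp
using System.Globalization;

// Hypothetical helper, not part of CsvHelper.
public static class ExcelField
{
    // Removes a leading '=' and, if present, one pair of surrounding quotes.
    // Fields that do not start with '=' are returned unchanged.
    public static string StripFormulaPrefix(string raw)
    {
        if (string.IsNullOrEmpty(raw) || raw[0] != '=') return raw;

        string s = raw.Substring(1);
        if (s.Length >= 2 && s[0] == '"' && s[s.Length - 1] == '"')
        {
            s = s.Substring(1, s.Length - 2);
        }
        return s;
    }

    // Strips the prefix, then parses with the invariant culture so the
    // result does not depend on the machine's regional settings.
    public static bool TryParseInt(string raw, out int value)
    {
        return int.TryParse(StripFormulaPrefix(raw), NumberStyles.Integer,
                            CultureInfo.InvariantCulture, out value);
    }
}
```

Note that parsing to int drops any leading zeros again; if they matter, keep the stripped string instead of the number.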

Related

Code not converting text

I have a textbox that allows users to input decimal values, which are then stored in a table in the database. This piece of code works in the development environment, but now that I have published my project to the server it no longer accepts values with decimal places.
decimal ReceiptAmount;
decimal AmountDue;
decimal Change;
if (!string.IsNullOrEmpty(((TextBox)dl_Item.FindControl("tb_ReceiptAmount")).Text))
{
if (((TextBox)dl_Item.FindControl("tb_ReceiptAmount")).Text.Contains(".") == true)
{
ReceiptAmount = Convert.ToDecimal(((TextBox)dl_Item.FindControl("tb_ReceiptAmount")).Text.Replace(".", ","));
}
else
{
ReceiptAmount = Convert.ToDecimal(((TextBox)dl_Item.FindControl("tb_ReceiptAmount")).Text);
}
}
else
{
ReceiptAmount = 0;
}
if (!string.IsNullOrEmpty(((TextBox)dl_Item.FindControl("tb_AmountDue")).Text))
{
if (((TextBox)dl_Item.FindControl("tb_AmountDue")).Text.Contains(".") == true)
{
AmountDue = Convert.ToDecimal(((TextBox)dl_Item.FindControl("tb_AmountDue")).Text.Replace(".", ","));
}
else
{
AmountDue = Convert.ToDecimal(((TextBox)dl_Item.FindControl("tb_AmountDue")).Text);
}
}
else
{
AmountDue = 0;
}
if (!string.IsNullOrEmpty(((TextBox)dl_Item.FindControl("tb_Change")).Text))
{
if (((TextBox)dl_Item.FindControl("tb_Change")).Text.Contains(".") == true)
{
Change = Convert.ToDecimal(((TextBox)dl_Item.FindControl("tb_Change")).Text.Replace(".", ","));
}
else
{
Change = Convert.ToDecimal(((TextBox)dl_Item.FindControl("tb_Change")).Text);
}
}
else
{
Change = 0;
}
I am not too sure what the problem with this piece of code is. The textboxes are in a DataList that I loop through to get all of the values.
The Convert.ToDecimal overload that takes a string as input will parse the string using the CultureInfo.CurrentCulture. Probably your server has different regional settings. Depending on regional settings, a comma or point may be either interpreted as a thousand separator (and thus ignored) or as the decimal separator.
Instead, you should use Decimal.Parse directly, providing either a specific culture or the invariant culture, depending on your use case.
Ideally, you'd set the culture of the user somewhere. To achieve this there are multiple approaches, e.g. for ASP.Net Web forms: https://msdn.microsoft.com/en-us/library/bz9tc508.aspx
If you parse the string using the correct culture, you can get rid of the string manipulation for replacing . with ,.
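As a rough illustration of the point about cultures, the sketch below parses the same kind of input with an explicitly chosen culture instead of relying on CultureInfo.CurrentCulture. The class and method names (AmountParser, ParseAmount) are invented for this example, and treating blank input as 0 simply mirrors the original code.

```csharp
using System.Globalization;

// Hypothetical helper for illustration only.
public static class AmountParser
{
    // Parses user input with the culture the caller chooses, so the result
    // is the same on the development machine and on the server.
    public static decimal ParseAmount(string text, CultureInfo culture)
    {
        if (string.IsNullOrWhiteSpace(text)) return 0m;
        return decimal.Parse(text, NumberStyles.Number, culture);
    }
}
```

With CultureInfo.InvariantCulture, "12.50" parses as 12.50. With a culture whose decimal separator is a comma (for example new CultureInfo("de-DE"), where that culture is available), "12,50" parses as 12.50 without any Replace call. Passing "12,50" with the invariant culture would typically treat the comma as a group separator and yield 1250, which is the kind of silent change the question describes.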
First of all, lines like
if (!string.IsNullOrEmpty(((TextBox)dl_Item.FindControl("tb_ReceiptAmount")).Text))
look very ugly; let's extract a method (copy/paste is very, very bad practice):
private String FindDLText(String controlName) {
var box = dl_Item.FindControl(controlName) as TextBox;
return box == null ? null : box.Text;
}
Then you don't need to check Text.Contains(".") == true; just call Replace if you really need it:
private Decimal FindDLValue(String controlName) {
String text = FindDLText(controlName);
if (String.IsNullOrEmpty(text))
return 0.0M;
//TODO: check if you really need this
//text = text.Replace(".", ",");
// you have to specify Culture either InvariantCulture or some predefined one;
// say, new CultureInfo("ru-RU") // <- use Russian Culture to parse this
return Decimal.Parse(text, CultureInfo.InvariantCulture);
}
Finally, you can get
decimal ReceiptAmount = FindDLValue("tb_ReceiptAmount");
decimal AmountDue = FindDLValue("tb_AmountDue");
decimal Change = FindDLValue("tb_Change");
Feel the difference: three clear lines and two simple methods.

c# remove duplicate char from array

static string RemoveDuplicateChars(string key)
{
// --- Removes duplicate chars using string concats. ---
// Store encountered letters in this string.
string table = "";
// Store the result in this string.
string result = "";
// Loop over each character.
foreach (char value in key)
{
// See if character is in the table.
if (table.IndexOf(value) == -1)
{
// Append to the table and the result.
table += value;
result += value;
}
}
return result;
}
The above code-snippet is from http://www.dotnetperls.com/duplicate-chars. The question I have is why do you need the extra result variable when you can just use table? Is there a reason for both variables? Below is code I wrote that accomplishes the same purpose, I believe. Am I missing anything? Thanks again and look forward to contributing here!
Code re-written:
static string RemoveDuplicateChars(string key)
{
// --- Removes duplicate chars using string concats. ---
// Store encountered letters in this string.
string table = "";
// Loop over each character.
foreach (char value in key)
{
// See if character is in the table.
if (table.IndexOf(value) == -1)
{
// Append to the table and the result.
table += value;
}
}
return table;
}
There is nothing wrong with what you did. That should work just fine. That being said, in C# we also have LINQ. You could just take a char[] and do:
char[] result = inputCharArray.Distinct().ToArray();
Your code is correct and works perfectly. You could also use LINQ in C#:
stringName.Distinct()
The reason that dotnetperls uses two variables is that it is an introduction and tries to keep the logic as straightforward as possible to follow, to facilitate learning. Good catch!
It is not really necessary as both ways work fine. The choice is purely up to the developer.
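For longer inputs, one possible variation is to track seen characters in a HashSet<char> and build the output with a StringBuilder, which avoids the repeated IndexOf scans and string concatenations. This is a sketch of a swapped-in technique, not what dotnetperls shows; the class name DuplicateChars is made up.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical alternative to the string-concatenation version.
public static class DuplicateChars
{
    // Keeps the first occurrence of every character, in input order.
    public static string Remove(string key)
    {
        var seen = new HashSet<char>();
        var result = new StringBuilder(key.Length);
        foreach (char c in key)
        {
            // HashSet<T>.Add returns false when the item is already present.
            if (seen.Add(c)) result.Append(c);
        }
        return result.ToString();
    }
}
```

For short keys the difference is negligible, so the single-variable version from the question is perfectly reasonable.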

Round-trip-safe escaping of strings in C#

I am confused by all the different escaping mechanisms for strings in C#. What I want is an escaping/unescaping method that:
1) Can be used on any string
2) escape+unescape is guaranteed to return the initial string
3) Replaces all punctuation with something else. If that is too much to ask, then at least commas, braces, and #. I am fine with spaces not being escaped.
4) Is unlikely to ever change.
Does it exist?
EDIT: This is for purposes of serializing and deserializing app-generated attributes. So my object may or may not have values for Attribute1, Attribute2, Attribute3, etc. Simplifying a bit, the idea is to do something like the below. Goal is to have the encoded collection be brief and more-or-less human-readable.
I am asking what methods would make sense to use for Escape and Unescape.
public abstract class GenericAttribute {
const string key1 = "KEY1"; //It is fine to put some restrictions on the keys, i.e. no punctuation
const string key2 = "KEY2";
public abstract string Encode(); // NO RESTRICTIONS ON WHAT ENCODE MIGHT RETURN
public static GenericAttribute FromKeyValuePair (string key, string value) {
switch (key) {
case key1: return new ConcreteAttribute1(value);
case key2: return new ConcreteAttribute2(value);
// etc.
default: throw new ArgumentException("Unknown attribute key: " + key);
}
}
}
public class AttributeCollection {
Dictionary <string, GenericAttribute> Content {get;set;}
public string Encode() {
string r = "";
bool first = true;
foreach (KeyValuePair<string, GenericAttribute> pair in this.Content) {
if (first) {
first = false;
} else {
r+=",";
}
r+=(pair.Key + "=" + Escape(pair.Value.Encode()));
}
return r;
}
public AttributeCollection(string encodedCollection) {
// input string is the return value of the Encode method
this.Content = new Dictionary<string, GenericAttribute>();
string[] array = encodedCollection.Split(',');
foreach(string component in array) {
int equalsIndex = component.IndexOf('=');
string key = component.Substring(0, equalsIndex);
string value = component.Substring(equalsIndex+1);
GenericAttribute attribute = GenericAttribute.FromKeyValuePair(key, Unescape(value));
this.Content[key]=attribute;
}
}
}
I'm not entirely sure what you're asking, but I believe your intent is for the escaped character to be included, escape and all.
var content = @"\'Hello";
Console.WriteLine(content);
// Output:
\'Hello
By using the @ (a verbatim string literal), the backslash is kept as a literal part of your string. That covers the server side with C#; how to account for other languages and escape formats is something only you would know.
You can find some great information on C# escaping here:
MSDN Blog
Try using HttpServerUtility.UrlEncode and HttpServerUtility.UrlDecode. I think that will encode and decode all the things you want.
See the MSDN Docs and here is a description of the mapping on Wikipedia.
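As a sketch of the percent-encoding idea, the Escape and Unescape methods from the question could be built on Uri.EscapeDataString and Uri.UnescapeDataString, which live in System and do not need a reference to System.Web. This swaps in a different API from the HttpServerUtility one named above; the wrapper class name AttributeEscaping is invented for the example.

```csharp
using System;

// Hypothetical wrapper around the framework's percent-encoding helpers.
public static class AttributeEscaping
{
    // Percent-encodes everything except letters, digits and a few
    // unreserved marks, so ',', '=', '{', '}' and '#' never appear raw.
    public static string Escape(string value)
    {
        return Uri.EscapeDataString(value);
    }

    // Reverses Escape.
    public static string Unescape(string value)
    {
        return Uri.UnescapeDataString(value);
    }
}
```

Two caveats under this assumption: spaces are also encoded (as %20), which is more than the question asks for, and very long inputs may be rejected on some framework versions, so it is worth testing with the largest values you expect.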

Convert string array value to int when empty string value is possible

I am having trouble converting a value in a string array to int since the value could possibly be null.
StreamReader reader = File.OpenText(filePath);
string currentLine = reader.ReadLine();
string[] splitLine = currentLine.Split(new char[] { '|' });
object.intValue = Convert.ToInt32(splitLine[10]);
This works great except for when splitLine[10] is null.
An error is thrown: System.FormatException: Input string was not in a correct format.
Can someone provide me with some advice as to what the best approach in handling this would be?
Don't use Convert; it is better to use
int.TryParse()
e.g.
int val = 0;
if (int.TryParse(splitLine[10], out val))
obj.intValue = val;
You can use a TryParse method:
int value;
if(Int32.TryParse(splitLine[10], out value))
{
object.intValue = value;
}
else
{
// Do something with incorrect parse value
}
if (splitLine[10] != null)
object.intValue = Convert.ToInt32(splitLine[10]);
else
//do something else, if you want
You might also want to check that splitLine.Length > 10 before getting splitLine[10].
If you're reading something like a CSV file, and there's a chance it could get somewhat complicated, such as reading multiple values, it will probably make sense to read the file through a connection string or some other library. Get example connection strings from http://www.connectionstrings.com/textfile, using Delimited(|) to specify your delimiter, and then use them like using (var conn = new OleDbConnection(connectionString)). See the section in http://www.codeproject.com/Articles/27802/Using-OleDb-to-Import-Text-Files-tab-CSV-custom about using the Jet engine.
I would go with
object.intValue = int.Parse(splitLine[10] ?? "<int value you want>");
If you're looking for the least code to write, try
object.intValue = Convert.ToInt32(splitLine[10] ?? "0");
If you want to preserve the meaning of the null in splitLine[10], then you will need to change the type of intValue to be of type Nullable<Int32>, and then you can assign null to it. That's going to represent a lot more work, but that is the best way to use null values with value types like integers, regardless of how you get them.
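One detail worth keeping in mind when choosing between these answers: Convert.ToInt32 returns 0 for a null string, so a FormatException points to an empty or non-numeric field rather than a null one, and a null check or ?? alone will not catch it. The sketch below combines the TryParse, bounds-check and nullable suggestions in one place; the names FieldParser and ParseIntOrNull are hypothetical.

```csharp
using System.Globalization;

// Hypothetical helper combining the suggestions above.
public static class FieldParser
{
    // Returns the parsed value, or null when the index is out of range
    // or the field is null, empty, or not a valid integer.
    public static int? ParseIntOrNull(string[] fields, int index)
    {
        if (fields == null || index < 0 || index >= fields.Length) return null;

        int value;
        bool ok = int.TryParse(fields[index], NumberStyles.Integer,
                               CultureInfo.InvariantCulture, out value);
        return ok ? value : (int?)null;
    }
}
```

A caller that wants a default instead of null can write FieldParser.ParseIntOrNull(splitLine, 10) ?? 0.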

c# How to process the string?

I connect to a webservice that gives me a response something like this(This is not the whole string, but you get the idea):
sResponse = "{\"Name\":\" Bod\u00f8\",\"homePage\":\"http:\/\/www.example.com\"}";
As you can see, the "Bod\u00f8" is not as it should be.
Therefore I tried to convert the Unicode escape (\u00f8) to a char by doing this with the string:
public string unicodeToChar(string sString)
{
StringBuilder sb = new StringBuilder();
foreach (char chars in sString)
{
if (chars >= 32 && chars <= 255)
{
sb.Append(chars);
}
else
{
// Replacement character
sb.Append((char)chars);
}
}
sString = sb.ToString();
return sString;
}
But it won't work, probably because the string actually contains the literal characters \\u00f8 (a backslash followed by u00f8), and not the single character \u00f8.
Now it would not be a problem if \u00f8 was the only Unicode escape I had to convert, but I have many more of them.
That means that I can't just use the Replace function :(
Hope someone can help.
You're basically talking about converting from JSON (JavaScript Object Notation). Try this link--near the bottom you'll see a list of publicly available libraries, including some in C#, that might do what you need.
The excellent Json.NET library has no problems decoding unicode escape sequences:
var sResponse = "{\"Name\":\"Bod\u00f8\",\"homePage\":\"http://www.ex.com\"}";
var obj = (JObject)JsonConvert.DeserializeObject(sResponse);
var name = ((JValue)obj["Name"]).Value;
var homePage = ((JValue)obj["homePage"]).Value;
Debug.Assert(Equals(name, "Bodø"));
Debug.Assert(Equals(homePage, "http://www.ex.com"));
This also allows you to deserialize to real POCO objects, making the code even cleaner (although less dynamic).
var obj2 = JsonConvert.DeserializeObject<Response>(sResponse);
Debug.Assert(obj2.Name == "Bodø");
Debug.Assert(obj2.HomePage == "http://www.ex.com");
public class Response
{
public string Name { get; set; }
public string HomePage { get; set; }
}
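If adding Json.NET is not an option, the same decoding can be sketched with System.Text.Json, assuming a runtime where that namespace is available (it ships with .NET Core 3.0 and later). This is a different library from the one used in the answer above, and the ResponseReader class and ReadString method are made-up names for illustration.

```csharp
using System.Text.Json;

// Hypothetical helper using the built-in JSON parser instead of Json.NET.
public static class ResponseReader
{
    // Parses the JSON text and returns one string property from the root
    // object. Escapes such as \u00f8 and \/ are decoded by the parser.
    public static string ReadString(string json, string propertyName)
    {
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            return doc.RootElement.GetProperty(propertyName).GetString();
        }
    }
}
```

GetProperty throws if the property is missing, so real code would probably use TryGetProperty for optional fields.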
Perhaps you want to try:
string character = Encoding.UTF8.GetString(chars);
sb.Append(character);
I know this question is getting quite old, but I crashed into this problem as of today, while trying to access the Facebook Graph API. I was getting these strange \u00f8 and other variations back.
First I tried a simple replace as the OP also said (with the help from an online table). But I thought "no way!" after adding 2 replaces.
So after looking a little more at the "codes" it suddenly hit me...
The "\u" is a prefix, and the 4 characters after it are a hexadecimal-encoded char code! So writing a simple regex to find every \u followed by 4 alphanumeric characters, then converting those 4 characters to an integer and then to a character, did the trick.
My source is in VB.NET
Private Function DecodeJsonString(ByVal Input As String) As String
For Each m As System.Text.RegularExpressions.Match In New System.Text.RegularExpressions.Regex("\\u(\w{4})").Matches(Input)
Input = Input.Replace(m.Value, ChrW(CInt("&H" & m.Value.Substring(2))))
Next
Return Input
End Function
I also have a C# version here
private string DecodeJsonString(string Input)
{
foreach (System.Text.RegularExpressions.Match m in new System.Text.RegularExpressions.Regex(@"\\u(\w{4})").Matches(Input))
{
Input = Input.Replace(m.Value, ((char)(System.Int32.Parse(m.Value.Substring(2), System.Globalization.NumberStyles.AllowHexSpecifier))).ToString());
}
return Input;
}
I hope it can help someone out... I hate to add libraries when I really only need a few functions from them!
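A slightly tighter variant of the same regex idea, offered only as a sketch: it matches hexadecimal digits specifically (so input like \uzzzz is left alone instead of throwing during parsing) and uses Regex.Replace with a callback rather than a loop of string replacements. The class name JsonEscapes is invented here. It still handles only \uXXXX escapes and will misread an escaped backslash followed by u, so a real JSON parser remains the more robust choice.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; decodes only \uXXXX sequences, nothing else.
public static class JsonEscapes
{
    // A literal backslash, 'u', then exactly four hex digits.
    private static readonly Regex UnicodeEscape =
        new Regex(@"\\u([0-9a-fA-F]{4})");

    public static string Decode(string input)
    {
        // Each match is replaced by the character with that code unit.
        return UnicodeEscape.Replace(
            input,
            m => ((char)Convert.ToInt32(m.Groups[1].Value, 16)).ToString());
    }
}
```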
