Parse command line string into a list of strings [closed] - c#

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
I want to take the following string as input:
first-arg second-arg "third arg with spaces" "arg with \" quotes"
and return this list of strings as output
["first-arg", "second-arg", "third arg with spaces", "arg with \" quotes"]
Are there any nuget packages or built in functions that can do this? I want it to handle edge cases like arguments containing multiple words and arguments containing quotes.

string[] arguments = Environment.GetCommandLineArgs();
For more information see the MSDN website

This class satisfies the requirements. It's not the most efficient way, but it returns the right arguments.
public static class ArgumentLineParser
{
public static string[] ToArguments(string cmd)
{
if (string.IsNullOrWhiteSpace(cmd))
{
return new string[0];
}
var argList = new List<string>();
var parseStack = new Stack<char>();
bool insideLiteral = false;
for (int i = 0; i < cmd.Length; i++)
{
bool isLast = i + 1 >= cmd.Length;
if (char.IsWhiteSpace(cmd[i]) && insideLiteral)
{
// Whitespace within literal is kept
parseStack.Push(cmd[i]);
}
else if (char.IsWhiteSpace(cmd[i]))
{
// Whitespace delimits arguments
MoveArgumentToList(parseStack, argList);
}
else if (!isLast && '\\'.Equals(cmd[i]) && '"'.Equals(cmd[i + 1]))
{
//Escaped double quote
parseStack.Push(cmd[i + 1]);
i++;
}
else if ('"'.Equals(cmd[i]) && !insideLiteral)
{
// Begin literal
insideLiteral = true;
}
else if ('"'.Equals(cmd[i]) && insideLiteral)
{
// End literal
insideLiteral = false;
}
else
{
parseStack.Push(cmd[i]);
}
}
MoveArgumentToList(parseStack, argList);
return argList.ToArray();
}
private static void MoveArgumentToList(Stack<char> stack, List<string> list)
{
var arg = string.Empty;
while (stack.Count > 0)
{
arg = stack.Pop() + arg;
}
if (arg != string.Empty)
{
list.Add(arg);
}
}
}
It can be used like this:
var line = @"first-arg second-arg ""third arg with spaces"" ""arg with \"" quotes""";
var args = ArgumentLineParser.ToArguments(line);
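The answer notes that the stack-based class is not the most efficient way. As a rough sketch of one alternative, the same rules (whitespace splits arguments, double quotes group words, `\"` is a literal quote) can be applied with a single `StringBuilder`. The name `ArgumentSplitter` is hypothetical, and one behavior differs on purpose: an empty quoted argument (`""`) is kept as an empty string here, whereas the class above drops it.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class ArgumentSplitter // hypothetical name, not part of the answer above
{
    public static List<string> Split(string commandLine)
    {
        var args = new List<string>();
        if (string.IsNullOrWhiteSpace(commandLine))
        {
            return args;
        }

        var current = new StringBuilder();
        bool insideQuotes = false;
        bool hasToken = false; // true once an argument has started, so "" yields an empty argument

        for (int i = 0; i < commandLine.Length; i++)
        {
            char c = commandLine[i];
            if (c == '\\' && i + 1 < commandLine.Length && commandLine[i + 1] == '"')
            {
                // Backslash-escaped quote: keep the quote, drop the backslash.
                current.Append('"');
                hasToken = true;
                i++;
            }
            else if (c == '"')
            {
                // An unescaped quote toggles literal mode and is not part of the argument.
                insideQuotes = !insideQuotes;
                hasToken = true;
            }
            else if (char.IsWhiteSpace(c) && !insideQuotes)
            {
                // Unquoted whitespace ends the current argument, if one was started.
                if (hasToken)
                {
                    args.Add(current.ToString());
                    current.Clear();
                    hasToken = false;
                }
            }
            else
            {
                current.Append(c);
                hasToken = true;
            }
        }

        if (hasToken)
        {
            args.Add(current.ToString());
        }
        return args;
    }
}
```

Appending to a `StringBuilder` avoids the repeated string concatenation that happens when the stack is popped character by character; whether that matters depends on how long the inputs are.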


IF Statement: Determine what was chosen. C# [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 5 years ago.
I need to determine which condition was TRUE in an if statement if using ||.
Example:
if(trueone() || truetwo() || truethree()) {
if([magical code] == trueOne()) {
// ...do my code here...
}
}
If trueone() was true, then say that "trueOne was selected"
Or, if trueone() and truetwo() were true, then say that "trueOne and trueTwo were selected"
EDIT: No switches please.
EDIT 2:
Here's a bit more detail:
The program is meant to look through a file and its lines using a foreach statement. If the line contains a certain keyword, then print it out to the user.
Currently, the program looks like this:
foreach(string x in lines) {
if(x.Contains("stringtofind")) {
Console.WriteLine("Found stringtofind at line x");
}
if(x.Contains("stringtofind2")) {
Console.WriteLine("Found stringtofind2 at line x");
}
}
Anything more efficient that can accomplish the same task would be useful.
If this is the original code:
foreach(string x in lines) {
if(x.Contains("stringtofind")) {
Console.WriteLine("Found stringtofind at line x");
}
if(x.Contains("stringtofind2")) {
Console.WriteLine("Found stringtofind2 at line x");
}
...
}
we can see that there is a pattern repeated inside the foreach loop.
To remove the duplicated code, we can put all the strings to find in an array.
Like this:
var lines = new string[]
{
"line1 stringToFind1 stringToFind2",
"line2 ",
"line3 stringToFind3",
"line4 stringToFind4 stringToFind5",
};
var stringsToFind = new string[]
{
"stringToFind1",
"stringToFind2",
"stringToFind3",
"stringToFind4",
"stringToFind5",
};
foreach (string line in lines)
{
foreach (string stringToFind in stringsToFind)
{
if (line.Contains(stringToFind))
{
Console.WriteLine(string.Format("Found {0} at line {1}", stringToFind, line));
}
}
}
Now, if you want to print the line number instead of the line itself, you can either (a) use a counter or (b) replace the outer foreach with a for loop:
for (int i = 0; i < lines.Length; i++)
{
foreach (string stringToFind in stringsToFind)
{
if (lines[i].Contains(stringToFind))
{
// We use i+1 for line number to show that in a 'human' format.
Console.WriteLine(string.Format("Found {0} at line {1}", stringToFind, (i+1)));
}
}
}
I was saying:
if the list of things to check is long... make it an actual list (for example a List<Func<bool>>)
var allConditions = new List<Func<bool>> { trueone, truetwo, truethree };
var trueConditions = allConditions.Where(p => p != null && p()).ToArray();
if (trueConditions.Length > 0)
{
if (trueConditions.Contains(trueone))
{
// ...do my code here...
}
}
and loop on it:
var allConditions = new List<Func<bool>> { trueone, truetwo, truethree };
var trueConditions = allConditions.Where(p => p != null && p()).ToArray();
if (trueConditions.Length > 0)
{
foreach (var condition in allConditions)
{
if (trueConditions.Contains(condition))
{
// ...do my code here...
}
}
}
Now, that was before you said you wanted to check...
If the line contains a certain keyword
You can do that on the framework above:
var x = "some string here to search in";
var allConditions = new List<Func<bool>> { () => x.Contains("keywords"), () => x.Contains("to"), () => x.Contains("find") };
var trueConditions = allConditions.Where(p => p != null && p()).ToArray();
if (trueConditions.Length > 0)
{
foreach (var condition in allConditions)
{
if (trueConditions.Contains(condition))
{
// ...do my code here...
}
}
}
Yet, since you only need to check for strings, you can make a list of those instead:
var x = "some string here to search in";
var allKeywords = new List<string> { "keywords", "to", "find" };
var foundKeywords = allKeywords.Where(s => x.Contains(s)).ToArray();
if (foundKeywords.Length > 0)
{
foreach (var keyword in allKeywords)
{
if (foundKeywords.Contains(keyword))
{
// ...do my code here...
}
}
}
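To produce the message the question asks for ("trueOne and trueTwo were selected"), each condition needs a name attached to it. A minimal sketch, assuming the conditions are supplied as name/delegate pairs; `ConditionReporter` and both of its methods are hypothetical helpers, not part of any answer above:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ConditionReporter // hypothetical helper
{
    // Evaluates each named condition exactly once and returns the names
    // of those that were true, in the order they were supplied.
    public static List<string> NamesOfTrueConditions(
        IEnumerable<KeyValuePair<string, Func<bool>>> conditions)
    {
        return conditions
            .Where(pair => pair.Value != null && pair.Value())
            .Select(pair => pair.Key)
            .ToList();
    }

    // Builds a message such as "trueOne and trueTwo were selected".
    public static string Describe(IReadOnlyList<string> names)
    {
        if (names.Count == 0)
        {
            return "nothing was selected";
        }
        return string.Join(" and ", names)
            + (names.Count == 1 ? " was selected" : " were selected");
    }
}
```

Note that, like the list-based code above and unlike a chain of `||`, this evaluates every condition rather than stopping at the first true one, so it is only a drop-in replacement when the conditions have no side effects.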

C# performance - Regex vs. multiple Split [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I'm working with a rather large set of strings I have to process as quickly as possible.
The format is quite fixed:
[name]/[type]:[list ([key] = [value],)]
or
[name]/[type]:[key]
I hope my representation is okay. What this means is that I have a word (in my case I call it Name), then a slash, followed by another word (I call it Type), then a colon, and it is either followed by a comma-separated list of key-value pairs (key = value), or a single key.
Name, Type cannot contain any whitespaces, however the key and value fields can.
Currently I'm using Regex to parse this data, and a split:
var regex = @"(\w+)\/(\w+):(.*)";
var r = new Regex(regex, RegexOptions.IgnoreCase | RegexOptions.Singleline);
var m = r.Match(Id);
if (m.Success) {
Name = m.Groups[1].Value;
Type= m.Groups[2].Value;
foreach (var intern in m.Groups[3].Value.Split(','))
{
var split = intern.Trim().Split('=');
if (split.Length == 2)
Items.Add(split[0], split[1]);
else if (split.Length == 1)
Items.Add(split[0], split[0]);
}
}
Now I know this is not the most optimal approach, but I'm not sure which would be the fastest:
Split the string first by : and then by / for the first element and by , for the second, then process the resulting list and split each entry again by =
Use the current mixture as it is
Use a completely regex-based approach
Of course I'm open to suggestions, my main goal is to achieve the fastest processing of this single string.
It's always fun to implement a custom parser. Obviously, where code maintenance is concerned, Regex is probably the best choice, but if performance is the ultimate concern then you probably need a tailor-made parser, which even for the simplest syntaxes is quite a lot more work.
I've whipped one up really quickly (it might be a little hackish in some places) to see what it would take to implement one with some basic error recovery and information. This isn't tested in any way, but I'd be curious, if it's minimally functional, to know how well it stacks up against the Regex solution in terms of performance.
public class ParserOutput
{
public string Name { get; }
public string Type { get; }
public IEnumerable<Tuple<string, string>> KeyValuePairs { get; }
public bool ContainsKeyValuePairs { get; }
public bool HasErrors { get; }
public IEnumerable<string> ErrorDescriptions { get; }
public ParserOutput(string name, string type, IEnumerable<Tuple<string, string>> keyValuePairs, IEnumerable<string> errorDescriptions)
{
Name = name;
Type = type;
KeyValuePairs = keyValuePairs;
ContainsKeyValuePairs = keyValuePairs.FirstOrDefault()?.Item2?.Length > 0;
ErrorDescriptions = errorDescriptions;
HasErrors = errorDescriptions.Any();
}
}
public class CustomParser
{
private const char forwardSlash = '/';
private const char colon = ':';
private const char space = ' ';
private const char equals = '=';
private const char comma = ',';
StringBuilder buffer = new StringBuilder();
public ParserOutput Parse(string input)
{
var diagnosticsBag = new Queue<string>();
using (var enumerator = input.GetEnumerator())
{
var name = ParseToken(enumerator, forwardSlash, diagnosticsBag);
var type = ParseToken(enumerator, colon, diagnosticsBag);
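// ParseListOrKey is a lazy iterator over the shared enumerator, so materialize it once;
// otherwise Any() and the ParserOutput constructor would each re-run it from a different position.
var keyValuePairs = ParseListOrKey(enumerator, diagnosticsBag).ToList();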
if (name.Length == 0)
{
diagnosticsBag.Enqueue("Input has incorrect format. Name could not be parsed.");
}
if (type.Length == 0)
{
diagnosticsBag.Enqueue("Input has incorrect format. Type could not be parsed.");
}
if (!keyValuePairs.Any() ||
input.Last() == comma /*trailing comma is error?*/)
{
diagnosticsBag.Enqueue("Input has incorrect format. Key / Value pairs could not be parsed.");
}
return new ParserOutput(name, type, keyValuePairs, diagnosticsBag);
}
}
private string ParseToken(IEnumerator<char> enumerator, char separator, Queue<string> diagnosticsBag)
{
buffer.Clear();
var allowWhitespaces = separator != forwardSlash && separator != colon;
while (enumerator.MoveNext())
{
if (enumerator.Current == space && !allowWhitespaces)
{
diagnosticsBag.Enqueue($"Input has incorrect format. {(separator == forwardSlash ? "Name" : "Type")} cannot contain whitespaces.");
}
else if (enumerator.Current != separator)
{
buffer.Append(enumerator.Current);
}
else
return buffer.ToString();
}
return buffer.ToString();
}
private IEnumerable<Tuple<string, string>> ParseListOrKey(IEnumerator<char> enumerator, Queue<string> diagnosticsBag)
{
buffer.Clear();
var isList = false;
while (true)
{
var key = ParseToken(enumerator, equals, diagnosticsBag);
var value = ParseToken(enumerator, comma, diagnosticsBag);
if (key.Length == 0)
break;
yield return new Tuple<string, string>(key, value);
if (!isList && value.Length != 0)
{
isList = true;
}
else if (isList && value.Length == 0)
{
diagnosticsBag.Enqueue($"Input has incorrect format: malformed [key / value] list.");
}
}
}
}
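For comparison, the first option listed in the question (locate the `/` and `:` separators, then split the remainder on `,` and `=`) can be sketched without Regex or a character-by-character parser. This is only an illustration under stated assumptions: `SplitParser` and `TryParse` are hypothetical names, keys and values are trimmed, a repeated key overwrites the earlier entry instead of throwing, and Name and Type are not checked against `\w+` the way the regex in the question does.

```csharp
using System;
using System.Collections.Generic;

public static class SplitParser // hypothetical name
{
    // Returns false when the input does not contain "name/type:" in that order.
    public static bool TryParse(string input, out string name, out string type,
                                out Dictionary<string, string> items)
    {
        name = null;
        type = null;
        items = new Dictionary<string, string>();
        if (input == null) return false;

        int slash = input.IndexOf('/');
        if (slash <= 0) return false; // no slash, or empty name
        int colon = input.IndexOf(':', slash + 1);
        if (colon < 0 || colon == slash + 1) return false; // no colon, or empty type

        name = input.Substring(0, slash);
        type = input.Substring(slash + 1, colon - slash - 1);

        foreach (var part in input.Substring(colon + 1).Split(','))
        {
            var trimmed = part.Trim();
            if (trimmed.Length == 0) continue; // skip empty entries such as a trailing comma

            int eq = trimmed.IndexOf('=');
            if (eq < 0)
            {
                // Single key: mirror the question's convention of storing the key as its own value.
                items[trimmed] = trimmed;
            }
            else
            {
                items[trimmed.Substring(0, eq).Trim()] = trimmed.Substring(eq + 1).Trim();
            }
        }
        return true;
    }
}
```

Whether this beats the Regex version or the custom parser has to be measured on representative data, for example with `System.Diagnostics.Stopwatch` over a large batch of strings.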

Simplifying many possibilities in c# indexof condition [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 8 years ago.
Is there any way to simplify this code? The input is a string.
private string IfItIsPicture(string URI_obrazku)
{
if (URI_obrazku.IndexOf(".jpg") > -1 ||
URI_obrazku.IndexOf(".png") > -1 ||
URI_obrazku.IndexOf(".bmp") > -1 ||
URI_obrazku.IndexOf(".tiff") > -1 ||
URI_obrazku.IndexOf(".tif") > -1 ||
URI_obrazku.IndexOf(".jpeg") > -1 ||
URI_obrazku.IndexOf(".jpg") > -1 ||
URI_obrazku.IndexOf(".svg") > -1 ||
URI_obrazku.IndexOf(".gif") > -1)
{ ... some code }
return someString;
}
Thanks.
Use Path.GetExtension to get just the extension. Then check whether that extension is in your collection of known extensions.
private string IfItIsPicture(string URI_obrazku)
{
var knownExtensions = new [] { ".jpg",".png",".bmp", "..."};
var extension = Path.GetExtension(URI_obrazku);
if (knownExtensions.Contains(extension, StringComparer.OrdinalIgnoreCase))
{
// ... some code
}
return "someString";
}
Use LINQ:
var types = new List<string> { ".jpg", ".png", ... };
if (types.Any(t => URI_obrazku.IndexOf(t) >= 0))
{
return someString;
}
Yes, use an array
var values = new [] { ".jpg",".png",".bmp", ...};
if(values.Any(x => URI_obrazku.EndsWith(x)))
I don't know if this is better, but I bet it's faster since it isn't doing multiple IndexOfs:
private bool IsPicture(string URI_obrazku)
{
String Extension = Path.GetExtension(URI_obrazku);
switch (Extension)
{
case ".jpg": return true;
case ".png": return true;
// other extensions
default: return false;
}
}
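The `Path.GetExtension` idea and the "collection of known extensions" idea from the answers above can be combined with a case-insensitive `HashSet<string>`. A minimal sketch, assuming the input is a plain file path or a URI without a query string; `PictureCheck` is a hypothetical name and the extension list is copied from the question:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class PictureCheck // hypothetical name
{
    // Extensions taken from the question; the comparer makes ".JPG" match ".jpg".
    private static readonly HashSet<string> KnownExtensions =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            ".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff", ".svg", ".gif"
        };

    public static bool IsPicture(string path)
    {
        if (string.IsNullOrEmpty(path))
        {
            return false;
        }
        // Only the final extension is considered, so "archive.jpg.zip" is not a picture.
        return KnownExtensions.Contains(Path.GetExtension(path));
    }
}
```

Unlike the original `IndexOf` chain, this matches the extension only at the end of the name, which changes the result for inputs such as `archive.jpg.zip`.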

what to change to use data from csv file not from SQL db [duplicate]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking us to recommend or find a book, tool, software library, tutorial or other off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 8 years ago.
Does anyone know of an open-source library that allows you to parse and read .csv files in C#?
Here, written by yours truly to use generic collections and iterator blocks. It supports double-quote enclosed text fields (including ones that span multiple lines) using the double-escaped convention (so "" inside a quoted field reads as a single quote character). It does not support:
Single-quote enclosed text
\ -escaped quoted text
alternate delimiters (won't yet work on pipe or tab delimited fields)
Unquoted text fields that begin with a quote
But all of those would be easy enough to add if you need them. I haven't benchmarked it anywhere (I'd love to see some results), but performance should be very good - better than anything that's .Split() based anyway.
Now on GitHub
Update: felt like adding single-quote enclosed text support. It's a simple change, but I typed it right into the reply window so it's untested. Use the revision link at the bottom if you'd prefer the old (tested) code.
public static class CSV
{
public static IEnumerable<IList<string>> FromFile(string fileName, bool ignoreFirstLine = false)
{
using (StreamReader rdr = new StreamReader(fileName))
{
foreach(IList<string> item in FromReader(rdr, ignoreFirstLine)) yield return item;
}
}
public static IEnumerable<IList<string>> FromStream(Stream csv, bool ignoreFirstLine=false)
{
using (var rdr = new StreamReader(csv))
{
foreach (IList<string> item in FromReader(rdr, ignoreFirstLine)) yield return item;
}
}
public static IEnumerable<IList<string>> FromReader(TextReader csv, bool ignoreFirstLine=false)
{
if (ignoreFirstLine) csv.ReadLine();
IList<string> result = new List<string>();
StringBuilder curValue = new StringBuilder();
char c;
c = (char)csv.Read();
while (csv.Peek() != -1)
{
switch (c)
{
case ',': //empty field
result.Add("");
c = (char)csv.Read();
break;
case '"': //qualified text
case '\'':
char q = c;
c = (char)csv.Read();
bool inQuotes = true;
while (inQuotes && csv.Peek() != -1)
{
if (c == q)
{
c = (char)csv.Read();
if (c != q)
inQuotes = false;
}
if (inQuotes)
{
curValue.Append(c);
c = (char)csv.Read();
}
}
result.Add(curValue.ToString());
curValue = new StringBuilder();
if (c == ',') c = (char)csv.Read(); // either ',', newline, or endofstream
break;
case '\n': //end of the record
case '\r':
//potential bug here depending on what your line breaks look like
if (result.Count > 0) // don't return empty records
{
yield return result;
result = new List<string>();
}
c = (char)csv.Read();
break;
default: //normal unqualified text
while (c != ',' && c != '\r' && c != '\n' && csv.Peek() != -1)
{
curValue.Append(c);
c = (char)csv.Read();
}
result.Add(curValue.ToString());
curValue = new StringBuilder();
if (c == ',') c = (char)csv.Read(); //either ',', newline, or endofstream
break;
}
}
if (curValue.Length > 0) //potential bug: I don't want to skip on a empty column in the last record if a caller really expects it to be there
result.Add(curValue.ToString());
if (result.Count > 0)
yield return result;
}
}
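The double-escaped convention mentioned above ("" inside a quoted field reads as one quote character) is easier to see on a single line. The following is a simplified sketch that is independent of the `CSV` class: `CsvLineSketch` is a hypothetical name, only double quotes and commas are handled, and fields spanning several lines are deliberately out of scope.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CsvLineSketch // hypothetical name, independent of the CSV class above
{
    public static List<string> SplitLine(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (inQuotes)
            {
                if (c == '"' && i + 1 < line.Length && line[i + 1] == '"')
                {
                    current.Append('"'); // "" inside a quoted field is one literal quote
                    i++;
                }
                else if (c == '"')
                {
                    inQuotes = false; // closing quote
                }
                else
                {
                    current.Append(c); // commas inside quotes are ordinary text
                }
            }
            else if (c == '"' && current.Length == 0)
            {
                inQuotes = true; // opening quote at the start of a field
            }
            else if (c == ',')
            {
                fields.Add(current.ToString());
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        fields.Add(current.ToString()); // last field, possibly empty
        return fields;
    }
}
```

For real files, a reader that works on a stream (like the class above or a dedicated library) is still needed, because a quoted field may contain line breaks.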
Take a look at A Fast CSV Reader on CodeProject.
The last time this question was asked, here's the answer I gave:
If you're just trying to read a CSV file with C#, the easiest thing is to use the Microsoft.VisualBasic.FileIO.TextFieldParser class. It's actually built into the .NET Framework, instead of being a third-party extension.
Yes, it is in Microsoft.VisualBasic.dll, but that doesn't mean you can't use it from C# (or any other CLR language).
Here's an example of usage, taken from the MSDN documentation:
Using MyReader As New _
Microsoft.VisualBasic.FileIO.TextFieldParser("C:\testfile.txt")
MyReader.TextFieldType = FileIO.FieldType.Delimited
MyReader.SetDelimiters(",")
Dim currentRow As String()
While Not MyReader.EndOfData
Try
currentRow = MyReader.ReadFields()
Dim currentField As String
For Each currentField In currentRow
MsgBox(currentField)
Next
Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
MsgBox("Line " & ex.Message & _
"is not valid and will be skipped.")
End Try
End While
End Using
Again, this example is in VB.NET, but it would be trivial to translate it to C#.
I really like the FileHelpers library. It's fast, it's C# 100%, it's available for FREE, it's very flexible and easy to use.
I'm implementing Daniel Pryden's answer in C#, so it is easier to cut and paste and customize. I think this is the easiest method for parsing CSV files. Just add a reference and you are basically done.
Add the Microsoft.VisualBasic Reference to your project
Then here is sample code in C# from Joel's answer:
bool first = true; // used below to skip the header row
using (Microsoft.VisualBasic.FileIO.TextFieldParser MyReader = new
Microsoft.VisualBasic.FileIO.TextFieldParser(filename))
{
MyReader.TextFieldType = Microsoft.VisualBasic.FileIO.FieldType.Delimited;
MyReader.SetDelimiters(",");
while (!MyReader.EndOfData)
{
try
{
string[] fields = MyReader.ReadFields();
if (first)
{
first = false;
continue;
}
// This is how I treat my data, you'll need to throw this out.
//"Type" "Post Date" "Description" "Amount"
LineItem li = new LineItem();
li.date = DateTime.Parse(fields[1]);
li.description = fields[2];
li.Value = Convert.ToDecimal(fields[3]);
lineitems1.Add(li);
}
catch (Microsoft.VisualBasic.FileIO.MalformedLineException ex)
{
MessageBox.Show("Line " + ex.Message +
" is not valid and will be skipped.");
}
}
}
Besides parsing/reading, some libraries do other nice things, like converting the parsed data into objects for you.
Here is an example of using CsvHelper (a library I maintain) to read a CSV file into objects.
var csv = new CsvHelper( File.OpenRead( "file.csv" ) );
var myCustomObjectList = csv.Reader.GetRecords<MyCustomObject>();
By default, conventions are used for matching the headers/columns with the properties. You can change the behavior by changing the settings.
// Using attributes:
public class MyCustomObject
{
[CsvField( Name = "First Name" )]
public string StringProperty { get; set; }
[CsvField( Index = 0 )]
public int IntProperty { get; set; }
[CsvField( Ignore = true )]
public string ShouldIgnore { get; set; }
}
Sometimes you don't "own" the object you want to populate the data with. In this case you can use fluent class mapping.
// Fluent class mapping:
public sealed class MyCustomObjectMap : CsvClassMap<MyCustomObject>
{
public MyCustomObjectMap()
{
Map( m => m.StringProperty ).Name( "First Name" );
Map( m => m.IntProperty ).Index( 0 );
Map( m => m.ShouldIgnore ).Ignore();
}
}
You can use Microsoft.VisualBasic.FileIO.TextFieldParser
The code example below is taken from the linked article:
static void Main()
{
string csv_file_path=@"C:\Users\Administrator\Desktop\test.csv";
DataTable csvData = GetDataTabletFromCSVFile(csv_file_path);
Console.WriteLine("Rows count:" + csvData.Rows.Count);
Console.ReadLine();
}
private static DataTable GetDataTabletFromCSVFile(string csv_file_path)
{
DataTable csvData = new DataTable();
try
{
using(TextFieldParser csvReader = new TextFieldParser(csv_file_path))
{
csvReader.SetDelimiters(new string[] { "," });
csvReader.HasFieldsEnclosedInQuotes = true;
string[] colFields = csvReader.ReadFields();
foreach (string column in colFields)
{
DataColumn datecolumn = new DataColumn(column);
datecolumn.AllowDBNull = true;
csvData.Columns.Add(datecolumn);
}
while (!csvReader.EndOfData)
{
string[] fieldData = csvReader.ReadFields();
//Making empty value as null
for (int i = 0; i < fieldData.Length; i++)
{
if (fieldData[i] == "")
{
fieldData[i] = null;
}
}
csvData.Rows.Add(fieldData);
}
}
}
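catch (Exception ex)
{
// Report the failure instead of silently swallowing it.
Console.WriteLine("Failed to read CSV file: " + ex.Message);
}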
return csvData;
}

