Reading a delimited text file by line and by delimiter in C# - c#

I would just like to apologise for posting this. There are a lot of questions like this here, but I can't find one that is specific to my problem.
I have a list and each item contains a DateTime, int and a string. I have successfully written all list items to a .txt file which is delimited by commas.
For example 09/04/2015 22:12:00,10,Posting on Stackoverflow.
I need to loop through the file line by line, with each line's fields at index 0 through index 2. At the moment I can read index 3, which returns the DateTime of the second list item in the text file. The file is written line by line, but I am struggling to read it back using the delimiters and line breaks.
I am sorry if I am not making much sense, I will appreciate any help, thank you.

string[] lines = File.ReadAllLines( filename );
foreach ( string line in lines )
{
string[] col = line.Split(',');
// process col[0], col[1], col[2]
}

You can read all the lines at once via var lines = File.ReadAllLines(pathToFile);. Then you can split each line into an array of fields via:
foreach (var line in lines) {
String[] fields = line.Split(',');
}
If you don't have any stray commas in your file and the only commas are true delimiters, this will mean that fields will always be a 3 element array, with each of your fields in succession.
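If the text field may itself contain commas, one option is to cap the split at three pieces so that only the first two commas act as delimiters. The sketch below is illustrative only: LineParser and TryParseLine are made-up names, and the date format dd/MM/yyyy HH:mm:ss is an assumption (the sample line does not show whether the day or the month comes first).

```csharp
using System;
using System.Globalization;

public static class LineParser
{
    // Assumed format; change it to match how the file was actually written.
    private const string DateFormat = "dd/MM/yyyy HH:mm:ss";

    // Returns false instead of throwing when a line is malformed.
    public static bool TryParseLine(string line, out DateTime date, out int number, out string text)
    {
        date = default(DateTime);
        number = 0;
        text = null;

        // Limit the split to 3 pieces so commas inside the text field are kept.
        string[] parts = line.Split(new[] { ',' }, 3);
        if (parts.Length != 3)
            return false;

        if (!DateTime.TryParseExact(parts[0], DateFormat, CultureInfo.InvariantCulture,
                                    DateTimeStyles.None, out date))
            return false;

        if (!int.TryParse(parts[1], NumberStyles.Integer, CultureInfo.InvariantCulture, out number))
            return false;

        text = parts[2];
        return true;
    }
}
```

Inside the foreach loop you would call LineParser.TryParseLine(line, out date, out number, out text) and skip or log any line for which it returns false.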

Alternatively, you can do the following:
public static List<Values> GetValues(string path)
{
List<Values> valuesCollection = new List<Values>();
using (var f = new StreamReader(path))
{
string line = string.Empty;
while ((line = f.ReadLine()) != null)
{
var parts = line.Split(',');
valuesCollection.Add(new Values(Convert.ToDateTime(parts[0]), Convert.ToInt32(parts[1]), parts[2]));
}
}
return valuesCollection;
}
class Values
{
public DateTime Date { get; set; }
public int IntValue { get; set; }
public string StringValue { get; set; }
public Values()
{
}
public Values(DateTime date, int intValue, string stringValue)
{
this.Date = date;
this.IntValue = intValue;
this.StringValue = stringValue;
}
}
You could then iterate through the list or collection of values and access each object and its properties. For example:
Console.WriteLine(values.Date);
Console.WriteLine(values.IntValue);
Console.WriteLine(values.StringValue);

Related

Output data to CSV specific columns from Dictionary c#

I am trying to output the values from the dictionary to the CSV and am able to do this. But facing issue with the specific columns this need to output to the csv. I need the specific data value from dictionary to be output to a specific column in csv.
Dictionary<string, List<string>> file = new Dictionary<string, List<string>>();
for (var i = 0; i < stringList.Count(); i++)
{
string line = stringList[i];
string path = line.Replace("\r\n", "");
path = path.Replace(" ", "");
path = path.TrimEnd(':');
if (File.Exists(path))
{
file[path] = file.ContainsKey(path) ? file[path] : new List<string>();
for (var j = i + 1; j < stringList.Count(); j++)
{
string line2 = stringList[j];
string path2 = line2.Replace("\r\n", "");
path2 = path2.Replace(" ", "");
path2 = path2.TrimEnd(':');
if (File.Exists(path2))
{
i = j - 1;
break;
}
else
{
if (path2.Contains("Verified") | path2.Contains("Algorithm"))
{
var strings = path2.Split(':');
var listValue = strings[1].Trim();
file[path].Add(listValue);
}
}
}
}
}
using (var writer = new StreamWriter(outputdir + "\\output_" +
DateTime.Now.ToString("yyyy_MM_dd_HHmmss") + ".csv"))
{
writer.WriteLine("FilePath,Signature,HashValueSHA1, HashValueSHA2, HashValueMD5, Other");
foreach (var keyvaluepair in file)
{
if (!keyvaluepair.Value.Contains("Unsigned"))
{
var values = String.Join(",", keyvaluepair.Value.Distinct().Select(x => x.ToString()).ToArray());
writer.WriteLine("{0},{1}", keyvaluepair.Key, values);
}
}
}
Current Output looks like below:
Sample output I need as below:
The Dictionary key(string) would hold the file path and the values(List) would hold something like below:
Signed
sha1RSA
md5RSA
md5RSA
Signed
sha1RSA
sha1RSA
sha256RSA
sha256RSA
Please suggest how can I get the one as required output.
I had a longer answer typed, but I see the problem.
On this line
var values = String.Join(",", keyvaluepair.Value.Distinct().Select(x => x.ToString()).ToArray());
take out Distinct. It looks like you have the correct number of items in each string, but if a list contains multiple blank entries Distinct is eliminating the duplicates. If a list contains two or three blank entries you need all of them. If you delete duplicate blanks your columns won't line up.
Also, when you use Distinct there's no guarantee that items will come back in any particular order. In this case the order is very important so that values end up in the right columns.
So in your example above, even though there's a blank in the third column of the first row, the value from the fourth column ends up in the third column and the blank goes to the end.
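To see the effect in isolation, here is a small sketch (DistinctDemo and its method names are hypothetical, not from the question) that joins the same list with and without Distinct:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctDemo
{
    // Joins the values as-is, so every column position is preserved.
    public static string JoinAll(IEnumerable<string> values)
    {
        return string.Join(",", values);
    }

    // Joins after Distinct(), which collapses repeated values,
    // including repeated blank entries.
    public static string JoinDistinct(IEnumerable<string> values)
    {
        return string.Join(",", values.Distinct());
    }
}
```

For a made-up list containing Signed, sha1RSA, two blank entries and md5RSA, JoinAll produces five comma-separated fields while JoinDistinct produces only four, so every value after the dropped blank lands one column too far to the left.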
That will likely fix the immediate problem. I'd recommend not using a List<string> when you're expecting a certain number of values (they need to match up with columns) because a List<string> can contain any number of values.
Instead, try something like this:
public class WhateverThisIs
{
public string Signature { get; set; }
public string HashValueSha1 { get; set; }
public string HashValueSha2 { get; set; }
public string HashValueMd5 { get; set; }
public string Other { get; set; }
}
Then, as a starting point, use Dictionary<string, WhateverThisIs>.
Then the part that outputs lines would look more like this:
var value = keyvaluepair.Value;
var values = String.Join(",", value.Signature, value.HashValueSha1, value.HashValueSha2,
value.HashValueMd5, value.Other);
(and yes, that accounts for null values.)
If you want to replace nulls or empty values with "N/A" then you'd need a separate function for that, like
string ReplaceNullWithNa(string value)
{
return string.IsNullOrEmpty(value) ? "N/A" : value;
}

Is there a way to include null values in a list in C#?

I have a text file filled with several lines of data, and I would like to split it into 5 different elements like so..
I am successfully reading in the data and putting it into an array. Now I would like to split each part of the text up into different lists so I can compare the data against one another.
I have currently managed to read in the first 4 elements of each line into their relevant lists but the 5th one is throwing me the error "System.IndexOutOfRangeException" which I can only assume is because the first line it reads in has no value for the 5th element?
So my question is, is there a way to populate null values when writing them to a number of lists?
I've tried manually assigning the size of the array and lists but I still get the same error.
Here is my code:
class Program
{
static void Main(string[] args)
{
// Reading in file containing data from BT Code Evaluation sheet (for testing purposes).
// Each line gets stored into a string array, each element is one line of the data.txt file.
//String[] lines = System.IO.File.ReadAllLines(@"C:\Users\Ad\Desktop\data.txt");
String[] lines = new String[5] {"monitorTime", "localTime", "actor", "action", "actor2"};
lines = System.IO.File.ReadAllLines(@"C:\Users\Ad\Desktop\data.txt");
char delimiter = ' ';
List<String> monitorTime = new List<String>();
List<String> localTime = new List<String>();
List<String> actor = new List<String>();
List<String> action = new List<String>();
List<String> actor2 = new List<String>();
// Foreach loop displays the lines of text in the data file.
foreach (String line in lines)
{
// Writes the data to the console.
Console.WriteLine(line);
String[] data = new String[5] { "monitorTime", "localTime", "actor", "action", "actor2" };
data = line.Split(delimiter);
monitorTime.Add(data[0]);
localTime.Add(data[1]);
actor.Add(data[2]);
action.Add(data[3]);
actor2.Add(data[4]);
}
foreach (String time in monitorTime)
{
Console.WriteLine(time);
}
foreach (String time in localTime)
{
Console.WriteLine(time);
}
foreach (String name in actor)
{
Console.WriteLine(name);
}
foreach (String actions in action)
{
Console.WriteLine(actions);
}
foreach (String name in actor2)
{
if (name != null)
{
Console.WriteLine("UNKNOWN");
}
else
{
Console.WriteLine(actor2);
}
}
// Creates an empty line between the data and the following text.
Console.WriteLine("");
// Displays message in console.
Console.WriteLine("Press any key to analyse data and create report...");
Console.ReadKey();
}
}
You need to check the bounds of your array before you try to add. If there aren't enough items, you can add null instead.
For example:
actor2.Add(data.Length > 4 ? data[4] : null);
(Note: You could do the same type of check on the other items as well, unless you are positive that the last item is the only one that might be null)
This is using the ternary operator, but you could also use a simple if/else, but it'll be more verbose. It's equivalent to:
if (data.Length > 4)
{
actor2.Add(data[4]);
}
else
{
actor2.Add(null);
}
This, along with Console.WriteLine(name); instead of Console.WriteLine(actor2);, should fix your immediate problem.
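If you want the same bounds check on every column, it can be wrapped in a small helper. This is only a sketch; SafeFields and GetOrNull are hypothetical names, not something from the question or from the framework.

```csharp
public static class SafeFields
{
    // Returns the element at the given index, or null when the array is too short.
    public static string GetOrNull(string[] data, int index)
    {
        return index >= 0 && index < data.Length ? data[index] : null;
    }
}
```

Each Add call then becomes, for example, actor2.Add(SafeFields.GetOrNull(data, 4)); so all five lists always receive exactly one entry per line and stay the same length.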
However, a much better design here would be to have a single list of objects with MonitorTime, LocalTime, Actor, Action and Actor2 properties. That way you don't ever have to worry that the 5 parallel arrays might get out of sync.
For example, create a class like this:
public class DataItem
{
public string MonitorTime { get; set; }
public string LocalTime { get; set; }
public string Actor { get; set; }
public string Action { get; set; }
public string Actor2 { get; set; }
}
Then instead of your 5 List<String>, you have one List<DataItem>:
List<DataItem> dataList = new List<DataItem>();
Then in your loop to populate it you'd do something like:
data = line.Split(delimiter);
dataList.Add(new DataItem()
{
MonitorTime = data[0],
LocalTime = data[1],
Actor = data[2],
Action = data[3],
Actor2 = data.Length > 4 ? data[4] : null
});
Then you can access them later with something like:
foreach (var item in dataList)
{
Console.WriteLine(item.MonitorTime);
//...
}
In your foreach loop you should check whether the index exists before adding the value to the list.
foreach (String line in lines)
{
// Writes the data to the console.
Console.WriteLine(line);
String[] data = new String[5] { "monitorTime", "localTime", "actor", "action", "actor2" };
data = line.Split(delimiter);
monitorTime.Add(data[0]);
localTime.Add(data[1]);
actor.Add(data[2]);
action.Add(data[3]);
if (data.Length > 4) {
actor2.Add(data[4]);
}
}
There are better ways to do this, but this is a simple solution for now.

What is the easiest way to split columns from a txt file

I've been looking around a bit but haven't really found a good example with what I'm struggling right now.
I have a .txt file with a couple of columns as follows:
# ID,YYYYMMDD, COLD,WATER, OD, OP,
52,20120406, 112, 91, 20, 130,
53,20130601, 332, 11, 33, 120,
And I'm reading these from the file into a string[] array.
I'd like to split them into a list
for example
List results, and [0] index will be the first index of the columns
results[0].ID
results[0].COLD
etc..
Now I've been looking around, and came up with the "\\\s+" split
but I'm not sure how to go about it since each entry is under another one.
string[] lines = File.ReadAllLines(path);
List<Bus> results = new List<Bus>();
//Bus = class with all the vars in it
//such as Bus.ID, Bus.COLD, Bus.YYYYMMDD
foreach (line in lines) {
var val = line.Split("\\s+");
//not sure where to go from here
}
Would greatly appreciate any help!
Kind regards, Venomous.
I suggest using Linq, something like this:
List<Bus> results = File
.ReadLines(@"C:\MyFile.txt") // we have no need to read All lines in one go
.Skip(1) // skip file's title
.Select(line => line.Split(','))
.Select(items => new Bus( //TODO: check constructor's syntax
int.Parse(items[0]), // ID
int.Parse(items[2]), // COLD
DateTime.ParseExact(items[1].Trim(), "yyyyMMdd", CultureInfo.InvariantCulture))) // YYYYMMDD
.ToList();
I would do
public class Foo
{
public int Id {get; set;}
public string Date {get; set;}
public double Cold {get; set;}
//...more
}
Then read the file
var l = new List<Foo>();
foreach (var line in lines)
{
var sp = line.Split(',');
var foo = new Foo
{
Id = int.Parse(sp[0].Trim()),
Date = sp[1].Trim(), // or parse the date into a DateTime struct
Cold = double.Parse(sp[2].Trim())
};
l.Add(foo);
}
//now l contains a list filled with Foo objects
I would probably keep a List of properties and use reflection to populate the object, something like this :
var columnMap = new[]{"ID","YYYYMMDD","COLD","WATER","OD","OP"};
var properties = columnMap.Select(typeof(Bus).GetProperty).ToList();
var resultList = new List<Bus>();
foreach(var line in lines)
{
var val = line.Split(',');
var adding = new Bus();
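// Set the i-th property from the i-th column; stop at whichever runs out first
// (a trailing comma yields one extra empty element in val).
for(int i=0;i<val.Length && i<properties.Count;i++)
{
properties[i].SetValue(adding,val[i]);
}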
resultList.Add(adding);
}
This is assuming that all of your properties are strings however
Something like this perhaps...
results.Add(new Bus
{
ID = val[0],
YYYYMMDD = val[1],
COLD = val[2],
WATER = val[3],
OD = val[4],
OP = val[5]
});
Keep in mind that all of the values in the val array are still strings at this point. If the properties of Bus are typed, you will need to parse them into the correct types e.g. assume ID is typed as an int...
ID = string.IsNullOrEmpty(val[0]) ? default(int) : int.Parse(val[0]),
Also, if the column headers are actually present in the file in the first line, you'll need to skip/disregard that line and process the rest.
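As a rough sketch of both points (typed parsing and skipping the header line), the following uses TryParse so that a malformed row is skipped rather than throwing. BusRow and BusParser are hypothetical names, only the first three columns are mapped to keep it short, and the remaining columns would follow the same pattern.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical typed row; only three of the six columns are shown.
public class BusRow
{
    public int Id { get; set; }
    public DateTime Date { get; set; }
    public int Cold { get; set; }
}

public static class BusParser
{
    public static List<BusRow> Parse(IEnumerable<string> lines)
    {
        var rows = new List<BusRow>();
        foreach (string line in lines)
        {
            // Skip blank lines and the "# ID,YYYYMMDD,..." header line.
            if (string.IsNullOrWhiteSpace(line) ||
                line.TrimStart().StartsWith("#", StringComparison.Ordinal))
                continue;

            string[] val = line.Split(',');
            int id, cold;
            DateTime date;

            // Only add the row when every mapped column parses cleanly.
            if (val.Length >= 3
                && int.TryParse(val[0].Trim(), NumberStyles.Integer, CultureInfo.InvariantCulture, out id)
                && DateTime.TryParseExact(val[1].Trim(), "yyyyMMdd", CultureInfo.InvariantCulture,
                                          DateTimeStyles.None, out date)
                && int.TryParse(val[2].Trim(), NumberStyles.Integer, CultureInfo.InvariantCulture, out cold))
            {
                rows.Add(new BusRow { Id = id, Date = date, Cold = cold });
            }
        }
        return rows;
    }
}
```

You could call it as BusParser.Parse(File.ReadLines(path)). Whether bad rows should be skipped silently, logged, or treated as errors depends on your data.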
Given that we have the Bus class with all the variables from your textfile:
class Bus
{
public int id;
public DateTime date;
public int cold;
public int water;
public int od;
public int op;
public Bus(int _id, DateTime _date, int _cold, int _water, int _od, int _op)
{
id = _id;
date = _date;
cold = _cold;
water = _water;
od = _od;
op = _op;
}
}
Then we can list them all in the results list like this:
List<Bus> results = new List<Bus>();
foreach (string line in File.ReadAllLines(path))
{
if (line.StartsWith("#"))
continue;
string[] parts = line.Replace(" ", "").Split(','); // Remove all spaces and split at commas
results.Add(new Bus(
int.Parse(parts[0]),
DateTime.ParseExact(parts[1], "yyyyMMdd", CultureInfo.InvariantCulture),
int.Parse(parts[2]),
int.Parse(parts[3]),
int.Parse(parts[4]),
int.Parse(parts[5])
));
}
And access the values as you wish:
results[0].id;
results[0].cold;
//etc.
I hope this helps.

How to read file that contains one row with multiple records- C# [closed]

I have this text file that only has one row. Each file contains one customer name but multiple items and descriptions.
Record starting with 00 (Company Name) has a char length of 10
01 (Item#) - char length of 10
02 (Description) - char length of 50
I know how to read a file, but I don't have any idea of how to loop through only one line, find records 00, 01, 02 and grab the text based on the length, finally start at the position of the last records and start the loop again. Can someone please give me an idea of how to read files like this?
output:
companyName 16622 Description
companyName 15522 Description
input text file example
00Init 0115522 02Description 0116622 02Description
This solution assumes that the data is fixed width, and that the item number will precede the description (01 before 02). It emits a record every time a description record is encountered, and it handles multiple products for the same company.
First, define a class to hold your data:
public class Record
{
public string CompanyName { get; set; }
public string ItemNumber { get; set; }
public string Description { get; set; }
}
Then, iterate through your string, returning a record when you've got a description:
public static IEnumerable<Record> ReadFile(string input)
{
// Alter these as appropriate
const int RECORDTYPELENGTH = 2;
const int COMPANYNAMELENGTH = 41;
const int ITEMNUMBERLENGTH = 8;
const int DESCRIPTIONLENGTH = 48;
int index = 0;
string companyName = null;
string itemNumber = null;
while (index < input.Length)
{
string recordType = input.Substring(index, RECORDTYPELENGTH);
index += RECORDTYPELENGTH;
if (recordType == "00")
{
companyName = input.Substring(index, COMPANYNAMELENGTH).Trim();
index += COMPANYNAMELENGTH;
}
else if (recordType == "01")
{
itemNumber = input.Substring(index, ITEMNUMBERLENGTH).Trim();
index += ITEMNUMBERLENGTH;
}
else if (recordType == "02")
{
string description = input.Substring(index, DESCRIPTIONLENGTH).Trim();
index += DESCRIPTIONLENGTH;
yield return new Record
{
CompanyName = companyName,
ItemNumber = itemNumber,
Description = description
};
}
else
{
throw new FormatException("Unexpected record type " + recordType);
}
}
}
Note that your field lengths in the question don't match the sample data, so I adjusted them so that the solution worked with the data you provided. You can adjust the field lengths by adjusting the constants.
Use this like the following:
string input = "00CompanyName 0115522 02Description 0116622 02Description ";
foreach (var record in ReadFile(input))
{
Console.WriteLine("{0}\t{1}\t{2}", record.CompanyName, record.ItemNumber, record.Description);
}
If you read the whole file into a string, you have a couple options.
One is to use String.Split.
Another is to use String.IndexOf to find each record marker; once you have the index, you can extract the text with String.Substring.
Assuming fixed-width fields as specified, let's create two simple classes to hold a client and its related data as a list:
// can hold as many items (data) as there are in the line
public class Client
{
public string name;
public List<ClientData> data;
};
// one single item in the client data
public class ClientData
{
public string code;
public string description;
};
To parse a single line (which is assumed to have a single client and a successive list of item/description), we can do this (note: for simplification I'm just creating a static class with a static method in it):
// this parser will read as many items as there are in the line
// and return a Client instance with those inside.
public static class Parser
{
public static Client ParseData(string line)
{
Client client = new Client ();
client.data = new List<ClientData> ();
client.name = line.Substring (2, 10);
// remove the client name
line = line.Substring (12);
while (line.Length > 0)
{
// create new item
ClientData data = new ClientData ();
data.code = line.Substring (2, 10);
data.description = line.Substring (14, 50);
client.data.Add (data);
// next item
line = line.Substring (64);
}
return client;
}
}
So, in your main loop, just after reading a new line from the file, you can call the above method to receive a new client. Something like this:
// should be from a file but this is just an example
string[] lines = {
"00XXXXXXXXXX01YYYYYYYYYY02XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXXX",
"00XXXXXXXXXX01YYYYYYYYYY02XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXXX01YYYYYYYYYY02XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXXX",
"00XXXXXXXXXX01YYYYYYYYYY02XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXXX",
"00XXXXXXXXXX01YYYYYYYYYY02XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXXX",
"00XXXXXXXXXX01YYYYYYYYYY02XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXX.XXXXXXXXXX",
};
// loop through each line
// (lines can have multiple items)
foreach (string line in lines)
{
Client client = Parser.ParseData (line);
Console.WriteLine ("Read: " + client.name);
}
Contents of Sample.txt:
00Company1 0115522 02This is a description for company 1. 00Company2 0115523 02This is a description for company 2. 00Company3 0115524 02This is a description for company 3
Note that in the code below, the fields are 2 characters longer than those specified in the original question. This is because I am including the headings in the length of each field, so a field with a length of 10 is effectively 12 once the 00 from the heading is included. If this is undesirable, tweak the entries in the fieldLengths array.
String directory = Environment.GetFolderPath(Environment.SpecialFolder.Desktop);
String file = "Sample.txt";
String path = Path.Combine(directory, file);
Int32[] fieldLengths = new Int32[] { 12, 12, 52 };
List<RowData> rows = new List<RowData>();
Byte[] buffer = new Byte[fieldLengths.Sum()];
using (var stream = File.OpenRead(path))
{
while (stream.Read(buffer, 0, buffer.Length) > 0)
{
List<String> fieldValues = new List<String>();
Int32 offset = 0;
for (int i = 0; i < fieldLengths.Length; i++)
{
var value = Encoding.UTF8.GetString(buffer, offset, fieldLengths[i]);
fieldValues.Add(value);
offset += fieldLengths[i];
}
String companyName = fieldValues[0];
String itemNumber = fieldValues[1];
String description = fieldValues[2];
var row = new RowData(companyName, itemNumber, description);
rows.Add(row);
}
}
Class definition for RowData:
public class RowData
{
public String Company { get; set; }
public String Number { get; set; }
public String Description { get; set; }
public RowData(String company, String number, String description)
{
Company = company;
Number = number;
Description = description;
}
}
The results will be in the rows variable.
You would have to split rows based on a delimiter. It would seem that in your case you are using whitespace as a delimiter.
The method you are looking for is String.Split(), it should cover your needs :) Documentation is located at https://msdn.microsoft.com/en-us/library/system.string.split(v=vs.110).aspx - It also includes examples.
I'd do something like this:
string myLineOfText = "MyCompany 12345 The description of my company";
string[] partsOfMyLine = myLineOfText.Split(new string[] { " " }, StringSplitOptions.RemoveEmptyEntries);
Best of luck! :)

Removing quotes in file helpers

I have a .csv file(I have no control over the data) and for some reason it has everything in quotes.
"Date","Description","Original Description","Amount","Type","Category","Name","Labels","Notes"
"2/02/2012","ac","ac","515.00","a","b","","javascript://"
"2/02/2012","test","test","40.00","a","d","c",""," "
I am using filehelpers and I am wondering what the best way to remove all these quotes would be? Is there something that says "if I see quotes remove. If no quotes found do nothing"?
This messes with the data, as I will have "\"515.00\"" with unneeded extra quotes (especially since in this case I want it to be a decimal, not a string).
I am also not sure what the "javascript" is all about and why it was generated but this is from a service I have no control over.
edit
this is how I consume the csv file.
using (TextReader textReader = new StreamReader(stream))
{
engine.ErrorManager.ErrorMode = ErrorMode.SaveAndContinue;
object[] transactions = engine.ReadStream(textReader);
}
You can use the FieldQuoted attribute, which is described on the FileHelpers attributes page. Note that the attribute can be applied to any FileHelpers field (even if it is of type Decimal). (Remember that the FileHelpers class describes the spec for your import file, so when you mark a Decimal field as FieldQuoted, you are saying that this field will be quoted in the file.)
You can even specify whether or not the quotes are optional with
[FieldQuoted('"', QuoteMode.OptionalForBoth)]
Here is a console application which works with your data:
class Program
{
[DelimitedRecord(",")]
[IgnoreFirst(1)]
public class Format1
{
[FieldQuoted]
[FieldConverter(ConverterKind.Date, "d/M/yyyy")]
public DateTime Date;
[FieldQuoted]
public string Description;
[FieldQuoted]
public string OriginalDescription;
[FieldQuoted]
public Decimal Amount;
[FieldQuoted]
public string Type;
[FieldQuoted]
public string Category;
[FieldQuoted]
public string Name;
[FieldQuoted]
public string Labels;
[FieldQuoted]
[FieldOptional]
public string Notes;
}
static void Main(string[] args)
{
var engine = new FileHelperEngine(typeof(Format1));
// read in the data
object[] importedObjects = engine.ReadString(@"""Date"",""Description"",""Original Description"",""Amount"",""Type"",""Category"",""Name"",""Labels"",""Notes""
""2/02/2012"",""ac"",""ac"",""515.00"",""a"",""b"","""",""javascript://""
""2/02/2012"",""test"",""test"",""40.00"",""a"",""d"",""c"","""","" """);
// check that 2 records were imported
Assert.AreEqual(2, importedObjects.Length);
// check the values for the first record
Format1 customer1 = (Format1)importedObjects[0];
Assert.AreEqual(DateTime.Parse("2/02/2012"), customer1.Date);
Assert.AreEqual("ac", customer1.Description);
Assert.AreEqual("ac", customer1.OriginalDescription);
Assert.AreEqual(515.00, customer1.Amount);
Assert.AreEqual("a", customer1.Type);
Assert.AreEqual("b", customer1.Category);
Assert.AreEqual("", customer1.Name);
Assert.AreEqual("javascript://", customer1.Labels);
Assert.AreEqual("", customer1.Notes);
// check the values for the second record
Format1 customer2 = (Format1)importedObjects[1];
Assert.AreEqual(DateTime.Parse("2/02/2012"), customer2.Date);
Assert.AreEqual("test", customer2.Description);
Assert.AreEqual("test", customer2.OriginalDescription);
Assert.AreEqual(40.00, customer2.Amount);
Assert.AreEqual("a", customer2.Type);
Assert.AreEqual("d", customer2.Category);
Assert.AreEqual("c", customer2.Name);
Assert.AreEqual("", customer2.Labels);
Assert.AreEqual(" ", customer2.Notes);
}
}
(Note, your first line of data seems to have 8 fields instead of 9, so I marked the Notes field with FieldOptional).
Here’s one way of doing it:
string[] lines = new string[]
{
"\"Date\",\"Description\",\"Original Description\",\"Amount\",\"Type\",\"Category\",\"Name\",\"Labels\",\"Notes\"",
"\"2/02/2012\",\"ac\",\"ac\",\"515.00\",\"a\",\"b\",\"\",\"javascript://\"",
"\"2/02/2012\",\"test\",\"test\",\"40.00\",\"a\",\"d\",\"c\",\"\",\" \"",
};
string[][] values =
lines.Select(line =>
line.Trim('"')
.Split(new string[] { "\",\"" }, StringSplitOptions.None)
.ToArray()
).ToArray();
The lines array represents the lines in your sample. Each " character must be escaped as \" in C# string literals.
For each line, we start off by removing the first and last " characters, then proceed to split it into a collection of substrings, using the "," character sequence as the delimiter.
Note that the above code will not work if you have " characters occurring naturally within your values (even if escaped).
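If your values might contain commas or quotes, a character-by-character splitter is one way around that limitation. The sketch below is not part of the answer above: CsvLine and SplitQuoted are hypothetical names, and it assumes that a quote inside a quoted field is escaped by doubling it and that no field spans more than one line.

```csharp
using System.Collections.Generic;
using System.Text;

public static class CsvLine
{
    // Splits one CSV line into fields, honouring double-quoted fields.
    public static List<string> SplitQuoted(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (inQuotes)
            {
                if (c == '"')
                {
                    // A doubled quote inside a quoted field is a literal quote.
                    if (i + 1 < line.Length && line[i + 1] == '"')
                    {
                        current.Append('"');
                        i++;
                    }
                    else
                    {
                        inQuotes = false; // closing quote
                    }
                }
                else
                {
                    current.Append(c); // commas are ordinary characters here
                }
            }
            else if (c == '"')
            {
                inQuotes = true; // opening quote
            }
            else if (c == ',')
            {
                fields.Add(current.ToString()); // end of field
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        fields.Add(current.ToString()); // last field
        return fields;
    }
}
```

Unquoted fields pass through unchanged, so the same method covers the "remove quotes if present, otherwise do nothing" case. For anything beyond simple files, a dedicated CSV library is likely the safer choice.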
Edit: If your CSV is to be read from a stream, all you need to do is:
var lines = new List<string>();
using (var streamReader = new StreamReader(stream))
while (!streamReader.EndOfStream)
lines.Add(streamReader.ReadLine());
The rest of the above code would work intact.
Edit: Given your new code, check whether you’re looking for something like this:
for (int i = 0; i < transactions.Length; ++i)
{
object oTrans = transactions[i];
string sTrans = oTrans as string;
if (sTrans != null &&
sTrans.StartsWith("\"") &&
sTrans.EndsWith("\""))
{
transactions[i] = sTrans.Substring(1, sTrans.Length - 2);
}
}
I have the same predicament and I replace the quotes when I load the value into my list object:
using System;
using System.Collections.Generic;
using System.IO;
using System.Windows.Forms;
namespace WindowsFormsApplication6
{
public partial class Form1 : Form
{
public Form1()
{
InitializeComponent();
}
private void Form1_Load(object sender, EventArgs e)
{
LoadCSV();
}
private void LoadCSV()
{
List<string> Rows = new List<string>();
string m_CSVFilePath = "<Path to CSV File>";
using (StreamReader r = new StreamReader(m_CSVFilePath))
{
string row;
while ((row = r.ReadLine()) != null)
{
Rows.Add(row.Replace("\"", ""));
}
foreach (var Row in Rows)
{
if (Row.Length > 0)
{
string[] RowValue = Row.Split(',');
//Do something with values here
}
}
}
}
}
}
This code, which I developed, might help:
using (StreamReader r = new StreamReader("C:\\Projects\\Mactive\\Audience\\DrawBalancing\\CSVFiles\\Analytix_ABC_HD.csv"))
{
string row;
int outCount;
StringBuilder line=new StringBuilder() ;
string token="";
char chr;
string Eachline;
while ((row = r.ReadLine()) != null)
{
outCount = row.Length;
line = new StringBuilder();
for (int innerCount = 0; innerCount <= outCount - 1; innerCount++)
{
chr=row[innerCount];
if (chr != '"')
{
line.Append(row[innerCount].ToString());
}
else if(chr=='"')
{
token = "";
innerCount = innerCount + 1;
for (; innerCount < outCount - 1; innerCount++)
{
chr=row[innerCount];
if(chr=='"')
{
break;
}
token = token + chr.ToString();
}
if(token.Contains(",")){token=token.Replace(",","");}
line.Append(token);
}
}
Eachline = line.ToString();
Console.WriteLine(Eachline);
}
}
