How to read multiple CSV files in a folder using CsvHelper - C#

How can I read multiple CSV files in a folder?
I have a program that maps a CSV file into its correct format using the CsvHelper library. Here is my code:
static void Main()
{
var filePath = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "test.csv");
var tempFilePath = Path.GetTempFileName();
using (var reader = new StreamReader(filePath))
using (var csvReader = new CsvReader(reader))
using (var writer = new StreamWriter(tempFilePath))
using (var csvWriter = new CsvWriter(writer))
{
csvReader.Configuration.RegisterClassMap<TestMapOld>();
csvWriter.Configuration.RegisterClassMap<TestMapNew>();
csvReader.Configuration.Delimiter = ",";
var records = csvReader.GetRecords<Test>();
csvReader.Configuration.PrepareHeaderForMatch = header =>
{
var newHeader = Regex.Replace(header, @"\s", string.Empty);
newHeader = newHeader.Trim();
newHeader = newHeader.ToLower();
return newHeader;
};
csvWriter.WriteRecords(records);
}
File.Delete(filePath);
File.Move(tempFilePath, filePath);
}

Since this is homework (and/or you are seemingly new to coding), I'll give you a very suitable answer that will give you many hours of fun and excitement as you go through the documentation and samples provided
You need
GetFiles(String, String)
Returns the names of files (including their paths) that match the
specified search pattern in the specified directory.
searchPattern
String The search string to match against the names of files in path.
This parameter can contain a combination of valid literal path and
wildcard (* and ?) characters, but it doesn't support regular
expressions.
foreach, in (C# reference)
The foreach statement executes a statement or a block of statements
for each element in an instance of the type that implements the
System.Collections.IEnumerable or
System.Collections.Generic.IEnumerable<T> interface.
I'll leave the details up to you.
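For reference, the two pieces named above can be combined roughly as follows. This is only a minimal sketch: the class name CsvFolderSketch and the helpers FindCsvFiles and ProcessFile are hypothetical, and the CsvHelper mapping code from the question is assumed to move into ProcessFile unchanged.

```csharp
using System;
using System.IO;

static class CsvFolderSketch
{
    // Hypothetical helper: full paths of the files directly inside `folder`
    // whose names match the "*.csv" search pattern.
    public static string[] FindCsvFiles(string folder)
    {
        return Directory.GetFiles(folder, "*.csv");
    }

    // Hypothetical placeholder: the per-file CsvHelper work from the question
    // (read, map, write to a temp file, replace the original) would go here.
    static void ProcessFile(string filePath)
    {
        Console.WriteLine("Processing " + filePath);
    }

    static void Main()
    {
        // Assumption: the CSV files live on the Desktop, as in the question.
        var folder = Environment.GetFolderPath(Environment.SpecialFolder.Desktop);
        foreach (var filePath in FindCsvFiles(folder))
        {
            ProcessFile(filePath);
        }
    }
}
```

The only real change from the single-file version is that filePath comes from the loop instead of being hard-coded.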

Related

CSV quoted values with line break inside data

This question is specific to ChoETL CSV reader
Take this example
"Header1","Header2","Header3"
"Value1","Val
ue2","Value3"
(Notepad++ screenshot)
All headers and values are quoted
There's a line break in "Value2"
I've been playing with ChoETL options, but I can't get it to work:
foreach (dynamic e in new
ChoCSVReader(@"test.csv")
.WithFirstLineHeader()
.MayContainEOLInData(true)
.MayHaveQuotedFields()
//been playing with these too
//.QuoteAllFields()
// .ConfigureHeader(c => c.IgnoreColumnsWithEmptyHeader = true)
//.AutoIncrementDuplicateColumnNames()
//.ConfigureHeader(c => c.QuoteAllHeaders = true)
//.IgnoreEmptyLine()
)
{
System.Console.WriteLine(e["Header1"]);
}
This fails with:
Missing 'Header2' field value in CSV file
The error varies depending on the reader configuration
What is the correct configuration to read this text?
It is a bug in handling one of the cases (i.e. a header with quotes, the csv2 text below). A fix has been applied. Take the ChoETL.NETStandard.1.2.1.35-beta1 package and give it a try.
string csv1 = @"Header1,Header2,Header3
""Value1"",""Val
ue2"",""Value3""";
string csv2 = @"""Header1"",""Header2"",""Header3""
""Value1"",""Val
ue2"",""Value3""";
string csv3 = @"Header1,Header2,Header3
Value1,""Value2"",Value3";
using (var r = ChoCSVReader.LoadText(csv1)
.WithFirstLineHeader()
.MayContainEOLInData(true)
.QuoteAllFields())
r.Print();
using (var r = ChoCSVReader.LoadText(csv2)
.WithFirstLineHeader()
.MayContainEOLInData(true)
.QuoteAllFields())
r.Print();
using (var r = ChoCSVReader.LoadText(csv3)
.WithFirstLineHeader()
.MayContainEOLInData(true)
.QuoteAllFields())
r.Print();
Sample fiddle: https://dotnetfiddle.net/VubCDR

How to find all files containing many different strings and document it (Windows)

I am trying to find all references to SQL tables in C# code. There are 500 tables and several hundred files in the solution. What I would like to end up with is a file containing a delimited list like this:
Table1, codefile.cs
Table1, codefile2.cs
Table2, codefile.cs
Table3, codefile4.cs
The best answer I can come up with is to create a batch file that runs findstr and then massage the data into a table. I am just wondering if there is a tool that will do something like this for me.
thank you
KevCri
Assuming
You have your table names in tables.txt, one table per line
All your .cs files are in your current directory
Then the following batch script should give you what you want
@echo off
>tableReferences.txt (
for /f "usebackq delims=" %%T in ("tables.txt") do (
for /f "delims=" %%F in ('findstr /mirc:"\<%%T\>" *.cs') do (
echo %%T, %%F
)
)
)
If you need to search a file hierarchy, then add /S to the FINDSTR options.
I'd write a C# program like this.
NOTE:
Make sure to pass the project/solution directory in place of Directory.GetCurrentDirectory().
static void Main(string[] args)
{
List<string> TableNames = new List<string>() { "Table1", "Table3", "Table5", "Table7" };
var files = Directory.EnumerateFiles(Directory.GetCurrentDirectory(), "*.cs", SearchOption.AllDirectories);
List<string> results = new List<string>();
foreach (var file in files)
{
using (FileStream fs = new FileStream(file, FileMode.Open, FileAccess.Read))
{
using (StreamReader sr = new StreamReader(fs))
{
var contents = sr.ReadToEnd();
var result = TableNames.FindAll(contents.Contains);
if (result.Count > 0)
{
var fileName = Path.GetFileName(file);
foreach (var table in result)
{
results.Add(string.Format("{0},{1}", table, fileName));
}
}
}
}
}
Console.ReadLine();
}
And write the results to a file if you would like:
File.WriteAllLines("Output.txt", results.ToArray());
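One caveat with the contents.Contains check above: it matches substrings, so Table1 would also be reported for a file that only mentions Table10. A possible variation, sketched under the assumption that table names are plain identifiers, uses a word-boundary regular expression (similar in spirit to the \<...\> in the FINDSTR answer). The names TableReferenceSketch and FindReferencedTables are hypothetical, and ignoring case is an assumption borrowed from the /i switch in the batch script.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class TableReferenceSketch
{
    // Returns the table names that occur in `contents` as whole words.
    public static List<string> FindReferencedTables(string contents, IEnumerable<string> tableNames)
    {
        var found = new List<string>();
        foreach (var table in tableNames)
        {
            // Regex.Escape guards against names containing regex metacharacters.
            var pattern = @"\b" + Regex.Escape(table) + @"\b";
            if (Regex.IsMatch(contents, pattern, RegexOptions.IgnoreCase))
            {
                found.Add(table);
            }
        }
        return found;
    }

    static void Main()
    {
        // Made-up sample text: mentions Table10 and Table3, but not Table1.
        var contents = "SELECT * FROM Table10 JOIN Table3 ON 1 = 1";
        var hits = FindReferencedTables(contents, new[] { "Table1", "Table3" });
        Console.WriteLine(string.Join(", ", hits)); // prints "Table3"
    }
}
```

With 500 tables and several hundred files this runs one regex per table per file, which is probably acceptable for a one-off report.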

Using GetFiles but splitting the results to show full path and just the filename

I am using “GetFiles” to extract files in a specified folder as shown below:
Directory.GetFiles(_dirPath, File_Filter);
This creates a string array which is fine but now I need to do something similar but create a Tuple array where each Tuple holds the full path including filename and also just the filename. So at the moment I have this in my string
C:\temp\test.txt
The new code needs to create a Tuple where each one looks similar to this:
Item1 = C:\temp\test.txt
Item2 = test.txt
I’m guessing I could do this with a bit of LINQ but am not sure. Speed and efficiency are not much of a problem as the lists will be very small.
You should use Directory.EnumerateFiles with LINQ which can improve performance if you don't need to consume all files. Then use the Path class:
Tuple<string, string>[] result = Directory.EnumerateFiles(_dirPath, File_Filter)
.Select(fn => Tuple.Create( fn, Path.GetFileName(fn) ))
.ToArray();
Use DirectoryInfo.GetFiles to return FileInfo instances, which have both file name and full name:
var directory = new DirectoryInfo(_dirPath);
var files = directory.GetFiles(File_Filter);
foreach(var file in files)
{
// use file.FullName
// use file.Name
}
That's much more readable than having tuple.Item1 and tuple.Item2. Or at least use some anonymous type instead of tuple, if you don't need to pass this data between methods:
var files = from f in Directory.EnumerateFiles(_dirPath, File_Filter)
select new {
Name = Path.GetFileName(f),
FullName = f
};
string[] arrayOfFiles = Directory.GetFiles(PathName.Text); // PathName contains the Path to your folder containing files//
string fullFilePath = "";
string fileName = "";
foreach (var file in arrayOfFiles)
{
fullFilePath = file;
fileName = System.IO.Path.GetFileName(file);
}

Writing a collection of collections to a CSV file using CsvHelper

I am using CsvHelper to write data to a CSV file. I am using C# and VS2010. The object I wish to write is a complex object of type Dictionary<long, List<HistoricalProfile>>.
Below is the code I am using to write the data to the csv file:
Dictionary<long, List<HistoricalProfile>> profiles
= historicalDataGateway.GetAllProfiles();
var fileName = CSVFileName + ".csv";
CsvWriter writer = new CsvWriter(new StreamWriter(fileName));
foreach (var items in profiles.Values)
{
writer.WriteRecords(items);
}
writer.Dispose();
When it loops the second time I get an error
The header record has already been written. You can't write it more than once.
Can you tell me what I am doing wrong here? My final goal is to have a single CSV file with a huge list of records.
Thanks in advance!
Have you seen the FileHelpers library (http://www.filehelpers.net/)? It makes it very easy to read and write CSV files.
Then your code would just be
var profiles = historicalDataGateway.GetAllProfiles(); // should return a list of HistoricalProfile
var engine = new FileHelperEngine<HistoricalProfile>();
// To Write Use:
engine.WriteFile("FileOut.txt", profiles);
I would go more low-level and iterate through the collections myself:
var fileName = CSVFileName + ".csv";
var writer = new StreamWriter(fileName);
foreach (var items in profiles.Values)
{
writer.WriteLine(/*header goes here, if needed*/);
foreach(var item in items)
{
writer.WriteLine(item.property1 + "," + item.property2...);
}
}
writer.Close();
If you want to make the routine more useful, you could use reflection to get the list of properties you wish to write out and construct your record from there.
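A rough sketch of that reflection idea follows. Everything here is illustrative: ReflectionCsvSketch, ToCsvLines and the Profile class (a stand-in for HistoricalProfile, whose real properties are not shown in the question) are hypothetical names. The nested lists are flattened with SelectMany first, so the header is produced only once. No quoting or escaping is done, and GetProperties does not guarantee any particular property order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

static class ReflectionCsvSketch
{
    // Hypothetical stand-in for HistoricalProfile.
    public class Profile
    {
        public long Id { get; set; }
        public string Name { get; set; }
    }

    // Builds one header line plus one line per item from the public instance
    // properties of T. Values containing commas or line breaks are NOT escaped.
    public static List<string> ToCsvLines<T>(IEnumerable<T> items)
    {
        PropertyInfo[] props = typeof(T).GetProperties(BindingFlags.Public | BindingFlags.Instance);
        var lines = new List<string>();
        lines.Add(string.Join(",", props.Select(p => p.Name)));
        foreach (var item in items)
        {
            lines.Add(string.Join(",", props.Select(p => Convert.ToString(p.GetValue(item, null)))));
        }
        return lines;
    }

    static void Main()
    {
        // Made-up sample data in the same shape as the question's dictionary.
        var profiles = new Dictionary<long, List<Profile>>
        {
            { 1, new List<Profile> { new Profile { Id = 1, Name = "a" } } },
            { 2, new List<Profile> { new Profile { Id = 2, Name = "b" }, new Profile { Id = 3, Name = "c" } } }
        };

        // Flatten the lists into one sequence so there is a single header.
        var all = profiles.Values.SelectMany(list => list);
        foreach (var line in ToCsvLines(all))
        {
            Console.WriteLine(line);
        }
    }
}
```

The same flattening step may also be worth trying with the original CsvHelper code, passing the flattened sequence to a single WriteRecords call instead of calling it once per list; that variant is untested here.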

Using List<KeyValuePair> to store Keys and Values

I'm still trying to understand KeyValuePairs, but I believe this idea should work. My code below searches through a large string and extracts two substrings. One substring (keep in mind that the value between the quotes varies) is something like Identity="EDN\username"; the other is something like FrameworkSiteID="Desoto". I was thinking about combining these strings before adding them to the List, but here is my problem: the login strings below are unique values that I need to use in a SQL statement to select records in SQL Server, and the framew strings need to be lined up with the login strings (and with all the columns and rows of data coming from SQL Server) when I output this to a text file. Should I make the login strings KEYS and the framew strings VALUES? If so, how do I do that? I hope that makes sense; I can explain further if need be.
Regex reg = new Regex("Identity=\"[^\"]*\"");
Regex reg1 = new Regex("FrameworkSiteID=\"[^\"]*\"");
foreach (FileInfo file in Files)
{
string line = "";
using (StreamReader sr = new StreamReader(file.FullName))
{
while (!String.IsNullOrEmpty(line = sr.ReadLine()))
{
if (line.ToUpper().Contains("IDENTITY="))
{
string login = reg.Match(line).Groups[0].Value;
string framew = reg1.Match(line).Groups[0].Value; //added
IdentityLines.Add(new KeyValuePair<string, string>(file.Name, login + " " + framew));
//This is probably not what I need
}
else
{
IdentityLines.Add(new KeyValuePair<string, string>(file.Name, "NO LOGIN"));
}
}
}
}
KeyValuePair<TKey,TValue> is a structure used by the Dictionary<TKey,TValue> class. Instead of keeping a list of KeyValuePair<TKey,TValue> objects, just create a Dictionary<TKey,TValue> and add keys/values to it.
Example:
Dictionary<string,string> identityLines = new Dictionary<string,string>();
foreach (FileInfo file in Files)
{
string line = "";
using (StreamReader sr = new StreamReader(file.FullName))
{
while (!String.IsNullOrEmpty(line = sr.ReadLine()))
{
if (line.ToUpper().Contains("IDENTITY="))
{
string login = reg.Match(line).Groups[0].Value;
string framew = reg1.Match(line).Groups[0].Value; //added
identityLines.Add(login, framew);
}
}
}
}
This will create an association between logins and framews. If you want to sort these by file, you can make a Dictionary<string, Dictionary<string,string>> and associate each identityLines dictionary with a specific filename. Note that the key values of the Dictionary<TKey, TValue> type are unique - you will get an error if you try to add a key that has already been added.
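To illustrate that last point: Dictionary.Add throws an ArgumentException when the key is already present, so a file that repeats a login would stop the loop above. One possible way around it, sketched here with made-up sample values, is to check ContainsKey first and keep the first value seen. DuplicateKeySketch and AddOrKeepFirst are hypothetical names; whether to keep the first value, overwrite, or report the duplicate depends on the data.

```csharp
using System;
using System.Collections.Generic;

static class DuplicateKeySketch
{
    // Adds the pair only if the key is new. Returns false (and leaves the
    // existing value untouched) when the key was already present.
    public static bool AddOrKeepFirst(Dictionary<string, string> map, string key, string value)
    {
        if (map.ContainsKey(key))
        {
            return false;
        }
        map.Add(key, value);
        return true;
    }

    static void Main()
    {
        var identityLines = new Dictionary<string, string>();
        AddOrKeepFirst(identityLines, "Identity=\"EDN\\username\"", "FrameworkSiteID=\"Desoto\"");

        // Same key again: ignored instead of throwing ArgumentException.
        bool added = AddOrKeepFirst(identityLines, "Identity=\"EDN\\username\"", "FrameworkSiteID=\"Other\"");

        Console.WriteLine(added);               // False
        Console.WriteLine(identityLines.Count); // 1
    }
}
```

If the later value should win instead, assigning through the indexer (identityLines[login] = framew) overwrites without throwing.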
I'm not clear what the purpose of this is. You don't seem to be using the KeyValuePairs as pairs of a Key and a Value. Are you using them as a general pair class? It's a reasonable use (I do this myself), but I'm not sure what help you are seeking.
The intended purpose of KeyValuePair is as a helper-class in the implementation of Dictionaries. This would be useful if you are going to look up values based on having a key, though it doesn't seem from your explanation that you are.
Why are you using the filename as the key? Does it matter?
I also don't see why you are loading all of this stuff into a list. Why not just yield them out and use them as they are found?
foreach (FileInfo file in Files)
{
using (StreamReader sr = new StreamReader(file.FullName))
{
for(string line = sr.ReadLine(); !string.IsNullOrEmpty(line); line = sr.ReadLine())
{
if(line.IndexOf("IDENTITY=", StringComparison.InvariantCultureIgnoreCase) != -1)
{
string login = reg.Match(line).Groups[0].Value;
string framew = reg1.Match(line).Groups[0].Value; //added
yield return new KeyValuePair<string, string>(login, framew);
}
}
}
}
On the other hand, if you do want to use them as keyed values:
Dictionary<string, string> logins = new Dictionary<string, string>();
foreach (FileInfo file in Files)
{
using (StreamReader sr = new StreamReader(file.FullName))
{
for(string line = sr.ReadLine(); !string.IsNullOrEmpty(line); line = sr.ReadLine())
{
if(line.IndexOf("IDENTITY=", StringComparison.InvariantCultureIgnoreCase) != -1)
{
string login = reg.Match(line).Groups[0].Value;
string framew = reg1.Match(line).Groups[0].Value; //added
logins.Add(login, framew);
}
}
}
}
Now logins[login] returns the related framew. If you want this to be case-insensitive then use new Dictionary<string, string>(StringComparer.InvariantCultureIgnoreCase) or new Dictionary<string, string>(StringComparer.CurrentCultureIgnoreCase) as appropriate.
Finally, are you sure there will be no blank lines until the end of the file? If there could be, you should use line != null rather than !string.IsNullOrEmpty() to avoid stopping your file read prematurely.
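The difference between the two loop conditions can be seen in a small sketch that reads from an in-memory string instead of a file. The class ReadLoopSketch and its two methods are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class ReadLoopSketch
{
    // Stops at the first blank line OR at end of input, whichever comes first.
    public static List<string> ReadUntilBlank(TextReader reader)
    {
        var lines = new List<string>();
        for (string line = reader.ReadLine(); !string.IsNullOrEmpty(line); line = reader.ReadLine())
        {
            lines.Add(line);
        }
        return lines;
    }

    // Stops only at end of input (ReadLine returns null there).
    public static List<string> ReadUntilEnd(TextReader reader)
    {
        var lines = new List<string>();
        for (string line = reader.ReadLine(); line != null; line = reader.ReadLine())
        {
            lines.Add(line);
        }
        return lines;
    }

    static void Main()
    {
        // Three lines, the middle one blank.
        const string text = "first\n\nthird";
        Console.WriteLine(ReadUntilBlank(new StringReader(text)).Count); // 1
        Console.WriteLine(ReadUntilEnd(new StringReader(text)).Count);   // 3
    }
}
```

With the second form, blank lines reach the loop body, so any per-line logic there should be prepared to see an empty string.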
