I want to compare two CSV files and write the differences to a file. I currently use the code below to remove a row. Can I change this code so that it compares two CSV files, or is there a better way in C# to compare CSV files?
List<string> lines = new List<string>();
using (StreamReader reader = new StreamReader(System.IO.File.OpenRead(path)))
{
string line;
while ((line = reader.ReadLine()) != null)
{
if (line.Contains(csvseperator))
{
string[] split = line.Split(Convert.ToChar(csvseperator));
if (split[selectedRow] == value)
{
}
else
{
line = string.Join(csvseperator, split);
lines.Add(line);
}
}
}
}
using (StreamWriter writer = new StreamWriter(path, false))
{
foreach (string line in lines)
writer.WriteLine(line);
}
}
Here is another way to find differences between CSV files, using Cinchoo ETL, an open source library.
For the sample CSV files below:
sample1.csv
id,name
1,Tom
2,Mark
3,Angie
sample2.csv
id,name
1,Tom
2,Mark
4,Lu
METHOD 1:
Using Cinchoo ETL, the code below shows how to find the rows that differ in any column:
var input1 = new ChoCSVReader("sample1.csv").WithFirstLineHeader().ToArray();
var input2 = new ChoCSVReader("sample2.csv").WithFirstLineHeader().ToArray();
using (var output = new ChoCSVWriter("sampleDiff.csv").WithFirstLineHeader())
{
output.Write(input1.OfType<ChoDynamicObject>().Except(input2.OfType<ChoDynamicObject>(), ChoDynamicObjectEqualityComparer.Default));
output.Write(input2.OfType<ChoDynamicObject>().Except(input1.OfType<ChoDynamicObject>(), ChoDynamicObjectEqualityComparer.Default));
}
sampleDiff.csv
id,name
3,Angie
4,Lu
Sample fiddle: https://dotnetfiddle.net/nwLeJ2
METHOD 2:
If you want to compare the rows by the id column only:
var input1 = new ChoCSVReader("sample1.csv").WithFirstLineHeader().ToArray();
var input2 = new ChoCSVReader("sample2.csv").WithFirstLineHeader().ToArray();
using (var output = new ChoCSVWriter("sampleDiff.csv").WithFirstLineHeader())
{
output.Write(input1.OfType<ChoDynamicObject>().Except(input2.OfType<ChoDynamicObject>(), new ChoDynamicObjectEqualityComparer(new string[] { "id" })));
output.Write(input2.OfType<ChoDynamicObject>().Except(input1.OfType<ChoDynamicObject>(), new ChoDynamicObjectEqualityComparer(new string[] { "id" })));
}
Sample fiddle: https://dotnetfiddle.net/t6mmJW
If you only want to compare one column you can use this code:
List<string> lines = new List<string>();
List<string> lines2 = new List<string>();
//pad and pad2 are the paths of the two files; seperator is the delimiter string
//The using statements dispose both readers even if an exception is thrown
using (StreamReader reader = new StreamReader(pad))
using (StreamReader read = new StreamReader(pad2))
{
string line;
string line2;
//With this you can change the cells you want to compare
int comp1 = 1;
int comp2 = 1;
while ((line = reader.ReadLine()) != null && (line2 = read.ReadLine()) != null)
{
if (line.Contains(seperator) && line2.Contains(seperator))
{
string[] split = line.Split(Convert.ToChar(seperator));
string[] split2 = line2.Split(Convert.ToChar(seperator));
//Skip lines that do not have enough cells
if (split.Length > comp1 && split2.Length > comp2)
{
if (split[comp1] != split2[comp2])
{
//It is not the same
}
else
{
//It is the same
}
}
}
}
}
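The loop above compares the files line by line, so it only works when both files list their rows in the same order. As a rough sketch of an order-independent comparison on a single column, the following collects the key values of the second file in a HashSet and keeps the lines of the first file whose key is missing there. The names CsvColumnDiff and OnlyInLeft are made up for this illustration, and the code assumes plain delimited lines without quoted fields.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of any library.
public static class CsvColumnDiff
{
    // Returns the lines of 'left' whose value in 'keyColumn' does not occur
    // in the same column of any line in 'right'.
    // Lines with too few columns are ignored. Assumes no quoted fields.
    public static List<string> OnlyInLeft(
        IEnumerable<string> left, IEnumerable<string> right,
        char separator, int keyColumn)
    {
        // Collect every key value that appears in the right-hand file.
        var rightKeys = new HashSet<string>(
            right.Select(l => l.Split(separator))
                 .Where(cols => cols.Length > keyColumn)
                 .Select(cols => cols[keyColumn]));

        // Keep the left-hand lines whose key is not in that set.
        return left.Where(l =>
        {
            string[] cols = l.Split(separator);
            return cols.Length > keyColumn && !rightKeys.Contains(cols[keyColumn]);
        }).ToList();
    }
}
```

To write the result to a file, something like File.WriteAllLines("diff.csv", CsvColumnDiff.OnlyInLeft(File.ReadLines(pad), File.ReadLines(pad2), ',', 1)) should do; calling it a second time with the arguments swapped gives the rows that exist only in the second file.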
I want to remove a column with a specific value. The code below is what I used to remove a row. Can I reverse this to remove a column?
int row = comboBox1.SelectedIndex;
string verw = Convert.ToString(txtChange.Text);
List<string> lines = new List<string>();
using (StreamReader reader = new StreamReader(System.IO.File.OpenRead(filepath)))
{
string line;
while ((line = reader.ReadLine()) != null)
{
if (line.Contains(","))
{
string[] split = line.Split(',');
if (split[row] == verw)
{
//after split you fill in the row index
}
else
{
line = string.Join(",", split);
lines.Add(line);
}
}
}
}
using (StreamWriter writer = new StreamWriter(filepath, false))
{
foreach (string line in lines)
writer.WriteLine(line);
}
Assuming we ignore the subtleties of writing CSV, this should work:
public void RemoveColumnByIndex(string path, int index)
{
List<string> lines = new List<string>();
using (StreamReader reader = new StreamReader(path))
{
var line = reader.ReadLine();
List<string> values = new List<string>();
while(line != null)
{
values.Clear();
var cols = line.Split(',');
for (int i = 0; i < cols.Length; i++)
{
if (i != index)
values.Add(cols[i]);
}
var newLine = string.Join(",", values);
lines.Add(newLine);
line = reader.ReadLine();
}
}
using (StreamWriter writer = new StreamWriter(path, false))
{
foreach (var line in lines)
{
writer.WriteLine(line);
}
}
}
The code essentially loads each line, breaks it down into columns, loops through the columns ignoring the column in question, then puts the values back together into a line.
This is an over-simplified method, of course. I am sure there are more performant ways.
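As one possible variation, the same split, skip and join steps can be written with LINQ so that lines are streamed rather than collected in a list first. This is only a sketch: the class name CsvColumns and the separate destination path are assumptions of this example, and it still ignores quoted fields.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper for illustration.
public static class CsvColumns
{
    // Drops the column at 'index' from a single line.
    // Assumes plain comma-separated values without quoted fields.
    // A line with fewer columns than 'index' is returned unchanged.
    public static string RemoveColumn(string line, int index)
    {
        return string.Join(",", line.Split(',').Where((value, i) => i != index));
    }

    // Streams 'source' into 'destination' one line at a time.
    // The two paths must differ, because File.ReadLines keeps the
    // source file open while the destination is being written.
    public static void RemoveColumnByIndex(string source, string destination, int index)
    {
        File.WriteAllLines(
            destination,
            File.ReadLines(source).Select(line => RemoveColumn(line, index)));
    }
}
```

Writing to a second file and then replacing the original avoids holding the whole file in memory, at the cost of an extra file on disk.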
To remove the column by name, here is a little modification to your code example.
List<string> lines = new List<string>();
using (StreamReader reader = new StreamReader(System.IO.File.OpenRead(path)))
{
string target = "";//the name of the column to skip
int? targetPosition = null; //this will be the position of the column to remove if it is available in the csv file
string line;
List<string> collected = new List<string>();
while ((line = reader.ReadLine()) != null)
{
string[] split = line.Split(',');
collected.Clear();
//get the position of the column to skip; stop searching once it has been found
//(normally on the header line), so a data cell with the same text cannot change it
for (int i = 0; targetPosition == null && i < split.Length; i++)
{
if (string.Equals(split[i], target, StringComparison.OrdinalIgnoreCase))
{
targetPosition = i;
}
}
//iterate and skip the column position if exist
for (int i = 0; i < split.Length; i++)
{
if (targetPosition != null && i == targetPosition.Value) continue;
collected.Add(split[i]);
}
lines.Add(string.Join(",", collected));
}
}
using (StreamWriter writer = new StreamWriter(path, false))
{
foreach (string line in lines)
writer.WriteLine(line);
}
I want to add a new column with a checkbox. My data comes from a CSV file, and I show it in a DataGridView with this code:
DataTable dtDataSource = new DataTable();
string[] fileContent = File.ReadAllLines(@"\data.csv");
if (fileContent.Count() > 0)
{
//Create data table columns
dtDataSource.Columns.Add("ID");
dtDataSource.Columns.Add("Data 1");
dtDataSource.Columns.Add("Data 2");
dtDataSource.Columns.Add("Status");
//Add row data dynamically
for (int i = 1; i < fileContent.Count(); i++)
{
string[] rowData = fileContent[i].Split(',');
dtDataSource.Rows.Add(rowData);
}
if (dtDataSource != null)
{
dataGridView1.DataSource = dtDataSource;
}
}
I also need to check whether each checkbox is checked: if it is, the value in the "Status" column must change to 1, and if it is unchecked, the value must be 0. This applies to every row of the DataGridView.
Example:
ID,Data1,Data2,Status,checkbox
1,aaa,bbb,0,✓
2,ccc,ddd,1,(unchecked)
3,eee,fff,1,(unchecked)
When you click the save button, the CSV file should look like this:
ID,Data1,Data2,Status
1,aaa,bbb,1
2,ccc,ddd,0
3,eee,fff,0
What should I do? Any ideas? CSV files are a little difficult for me.
Thank you!
I resolved this, thanks anyway..
This is the code:
string id;
for (int i = 0; i < dataGridView1.RowCount; i++) {
String path = "\\registros.csv";
List<String> lines = new List<String>();
if (File.Exists(path))
{
using (StreamReader reader = new StreamReader(path))
{
String line;
while ((line = reader.ReadLine()) != null)
{
id = (string)dataGridView1.Rows[i].Cells[2].Value;
if (line.Contains(","))
{
String[] split = line.Split(',');
if (split[1].Equals(id) && (bool)dataGridView1.Rows[i].Cells[0].FormattedValue == true)
{
split[10] = "" + 1;
line = String.Join(",", split);
}
if (split[1].Equals(id) && (bool)dataGridView1.Rows[i].Cells[0].FormattedValue == false)
{
split[10] = "" + 0;
line = String.Join(",", split);
}
}
lines.Add(line);
}
}
using (StreamWriter writer = new StreamWriter(path, false))
{
foreach (String line in lines)
writer.WriteLine(line);
}
}
}
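The code above reads and rewrites the whole file once for every grid row. A possible alternative, sketched here under assumptions, is to collect the checkbox states in a dictionary first and then update all lines in a single pass. StatusUpdater and Apply are hypothetical names, the dictionary would have to be filled from the DataGridView by the caller, and the column positions are passed in rather than hard-coded.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration.
public static class StatusUpdater
{
    // Sets the status column to "1" or "0" for every line whose id is in
    // 'checkedById'. Lines with an unknown id (such as the header) or with
    // too few columns are passed through unchanged.
    public static List<string> Apply(
        IEnumerable<string> lines, IDictionary<string, bool> checkedById,
        int idColumn, int statusColumn)
    {
        var result = new List<string>();
        foreach (string line in lines)
        {
            string[] cols = line.Split(',');
            bool isChecked;
            if (cols.Length > Math.Max(idColumn, statusColumn)
                && checkedById.TryGetValue(cols[idColumn], out isChecked))
            {
                cols[statusColumn] = isChecked ? "1" : "0";
                result.Add(string.Join(",", cols));
            }
            else
            {
                result.Add(line);
            }
        }
        return result;
    }
}
```

The returned list can then be written back with a single File.WriteAllLines call.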
I have a list of words. I want the program to scan a text file for several of these words.
This is what I already have:
int counter = 0;
string line;
StringBuilder sb = new StringBuilder();
string[] words = { "var", "bob", "for", "example"};
try
{
using (StreamReader file = new StreamReader("test.txt"))
{
while ((line = file.ReadLine()) != null)
{
if (line.Contains(Convert.ToChar(words)))
{
sb.AppendLine(line.ToString());
}
}
}
listResults.Text += sb.ToString();
}
catch (Exception ex)
{
listResults.ForeColor = Color.Red;
listResults.Text = "---ERROR---";
}
So I want to scan the file for a word, and if it's not there, scan for the next word...
String.Contains() only takes one argument: a string. What your call to Contains(Convert.ToChar(words)) does, is probably not what you expect.
As explained in Using C# to check if string contains a string in string array, you might want to do something like this:
using (StreamReader file = new StreamReader("test.txt"))
{
while ((line = file.ReadLine()) != null)
{
foreach (string word in words)
{
if (line.Contains(word))
{
sb.AppendLine(line);
}
}
}
}
Or if you want to follow your exact problem statement ("scan the file for a word, and if it's not there, scan for the next word"), you might want to take a look at Return StreamReader to Beginning:
using (StreamReader file = new StreamReader("test.txt"))
{
foreach (string word in words)
{
while ((line = file.ReadLine()) != null)
{
if (line.Contains(word))
{
sb.AppendLine(line);
}
}
if (sb.Length == 0)
{
// Rewind file to prepare for next word
file.BaseStream.Position = 0;
file.DiscardBufferedData();
}
else
{
return sb.ToString();
}
}
}
But this will think "bob" is part of "bobcat". If you don't agree, see String compare C# - whole word match, and replace:
line.Contains(word)
with
string wordWithBoundaries = "\\b" + word + "\\b";
Regex.IsMatch(line, wordWithBoundaries);
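A small self-contained sketch of that replacement is shown below. It additionally passes the word through Regex.Escape, so that a search word containing regex metacharacters is treated literally. The names WordMatch and ContainsWholeWord are invented for this example, and it assumes the search words start and end with a letter or digit, because \b only matches next to a word character.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration.
public static class WordMatch
{
    // True when 'word' occurs in 'line' as a whole word,
    // so "bob" does not match inside "bobcat".
    public static bool ContainsWholeWord(string line, string word)
    {
        string pattern = @"\b" + Regex.Escape(word) + @"\b";
        return Regex.IsMatch(line, pattern);
    }
}
```

In the loops above, line.Contains(word) would then become WordMatch.ContainsWholeWord(line, word).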
StringBuilder sb = new StringBuilder();
string[] words = { "var", "bob", "for", "example" };
string[] file_lines = File.ReadAllLines("filepath");
for (int i = 0; i < file_lines.Length; i++)
{
string[] split_words = file_lines[i].Split(' ');
foreach (string str in split_words)
{
foreach (string word in words)
{
if (str == word)
{
sb.AppendLine(file_lines[i]);
}
}
}
}
This works a treat:
var query =
from line in System.IO.File.ReadLines("test.txt")
where words.Any(word => line.Contains(word))
select line;
To get these out as a single string, just do this:
var results = String.Join(Environment.NewLine, query);
Couldn't be much simpler.
If you want to match only whole words it becomes only a little more complicated. You can do this:
Regex[] regexs =
words
.Select(word => new Regex(String.Format(@"\b{0}\b", Regex.Escape(word))))
.ToArray();
var query =
from line in System.IO.File.ReadLines(fileName)
where regexs.Any(regex => regex.IsMatch(line))
select line;
I have a question about writing data to a CSV file.
I have a file named test.csv which contains 2 fields: accountnumber and relation ID.
Now I want to add another field next to them: IBAN.
The IBAN is derived from the first column (the account number), which is validated by the SOAP function BBANtoIBAN.
How can I keep the two columns of data (account numbers and relation IDs) in the CSV and add the IBAN as a third column?
This is my code so far:
using (var client = new WebService.BANBICSoapClient("IBANBICSoap"))
{
List<List<string>> dataList = new List<List<string>>();
TextFieldParser parser = new TextFieldParser(@"C:\CSV\test.csv");
parser.TextFieldType = FieldType.Delimited;
parser.SetDelimiters(";");
while (!parser.EndOfData)
{
List<string> data = new List<string>();
string row = parser.ReadLine();
try
{
string resultIBAN = client.BBANtoIBAN(row);
if (resultIBAN != string.Empty)
data.Add(resultIBAN);
else
data.Add("Accountnumber is not correct.");
}
catch (Exception msg)
{
Console.WriteLine(msg);
}
dataList.Add(data);
}
}
I see it as:
StreamReader sr = new StreamReader(@"C:\CSV\test.csv");
StreamWriter sw = new StreamWriter(@"C:\CSV\testOut.csv");
while (sr.Peek() >= 0)
{
string line = sr.ReadLine();
try
{
string[] rowsArray = line.Split(';');
string row = rowsArray[0];
string resultIBAN = client.BBANtoIBAN(row);
if (resultIBAN != string.Empty)
{
line +=";"+ resultIBAN;
}
else
{
line +=";"+"Accountnumber is not correct.";
}
}
catch (Exception msg)
{
Console.WriteLine(msg);
}
sw.WriteLine(line);
}
sr.Close();
sw.Close();
I would do something like this to parse the csv file, and add an extra item to the data list:
List<List<string>> dataList = new List<List<string>>();
string filename = @"C:\CSV\test.csv";
using (StreamReader sr = new StreamReader(filename))
{
string fileContent = sr.ReadToEnd();
foreach (string line in fileContent.Split(new string[] {Environment.NewLine},StringSplitOptions.RemoveEmptyEntries))
{
List<string> data = new List<string>();
foreach (string field in line.Split(';'))
{
data.Add(field);
}
try
{
string resultIBAN = client.BBANtoIBAN(data[0]);
if (resultIBAN != string.Empty)
{
data.Add(resultIBAN);
}
else
{
data.Add("Accountnumber is not correct.");
}
}
catch (Exception msg)
{
Console.WriteLine(msg);
}
dataList.Add(data);
}
}
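The answers above either append the IBAN to the line or add it to the list of fields. As a sketch of the first idea in a form that can be tried without the web service, the conversion is passed in as a delegate. CsvAppend, AppendColumn and the convert parameter are hypothetical names; in the real program the delegate would wrap client.BBANtoIBAN, which is assumed here to return an empty string for an invalid account number, as the code above suggests.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration.
public static class CsvAppend
{
    // Appends one extra column to every non-empty line. The new value is
    // computed from the first column by 'convert' (a stand-in for the web
    // service call); 'fallback' is used when it returns null or "".
    public static List<string> AppendColumn(
        IEnumerable<string> lines, char separator,
        Func<string, string> convert, string fallback)
    {
        var result = new List<string>();
        foreach (string line in lines)
        {
            if (line.Length == 0) continue; // skip blank lines
            string firstField = line.Split(separator)[0];
            string value = convert(firstField);
            result.Add(line + separator + (string.IsNullOrEmpty(value) ? fallback : value));
        }
        return result;
    }
}
```

The result could then be written with File.WriteAllLines, for example to a separate output file so the original stays untouched.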
I have two CSV files. In the first file I have a list of users, and in the second file I have a list of duplicate users. I'm trying to remove the rows in the first file that also appear in the second file.
Here's the code I have so far:
StreamWriter sw = new StreamWriter(path3);
StreamReader sr = new StreamReader(path2);
string[] lines = File.ReadAllLines(path);
foreach (string line in lines)
{
string user = sr.ReadLine();
if (line != user)
{
sw.WriteLine(line);
}
File 1 example:
Modify,ABAMA3C,Allpay - Free State - HO,09072701
Modify,ABCG327,Processing Centre,09085980
File 2 Example:
Modify,ABAA323,Group HR Credit Risk & Finance
Modify,ABAB959,Channel Sales & Service,09071036
Any suggestions?
Thanks.
All you'd have to do is change the following file paths in the code below and you will get a file back (file one) without the duplicate users from file 2. This code was written with the idea in mind that you want something that is easy to understand. Sure there are other more elegant solutions, but I wanted to make it as basic as possible for you:
(Paste this in the main method of your program)
string line;
StreamReader sr = new StreamReader(@"C:\Users\J\Desktop\texts\First.txt");
StreamReader sr2 = new StreamReader(@"C:\Users\J\Desktop\texts\Second.txt");
List<String> fileOne = new List<string>();
List<String> fileTwo = new List<string>();
while (sr.Peek() >= 0)
{
line = sr.ReadLine();
if(line != "")
{
fileOne.Add(line);
}
}
sr.Close();
while (sr2.Peek() >= 0)
{
line = sr2.ReadLine();
if (line != "")
{
fileTwo.Add(line);
}
}
sr2.Close();
var t = fileOne.Except(fileTwo);
StreamWriter sw = new StreamWriter(@"C:\Users\justin\Desktop\texts\First.txt");
foreach(var z in t)
{
sw.WriteLine(z);
}
sw.Close(); // Close flushes the buffer and releases the file
If this is not homework but production code, and you can reference additional assemblies, you'll save 3 hours of your life if you swallow your pride and use a piece of the VB library (Microsoft.VisualBasic.FileIO.TextFieldParser):
CSV has many special cases (CR/LF inside quoted fields is legal, there are different types of quotes, etc.). This parser will handle anything Excel will export or import.
Sample code to load a 'Person' class pulled from a program I used it in:
Using Reader As New Microsoft.VisualBasic.FileIO.TextFieldParser(CSVPath)
Reader.TextFieldType = Microsoft.VisualBasic.FileIO.FieldType.Delimited
Reader.Delimiters = New String() {","}
Reader.TrimWhiteSpace = True
Reader.HasFieldsEnclosedInQuotes = True
While Not Reader.EndOfData
Try
Dim st2 As New List(Of String)
st2.addrange(Reader.ReadFields())
If iCount > 0 Then ' ignore first row = field names
Dim p As New Person
p.CSVLine = st2
p.FirstName = st2(1).Trim
If st2.Count > 2 Then
p.MiddleName = st2(2).Trim
Else
p.MiddleName = ""
End If
p.LastNameSuffix = st2(0).Trim
If st2.Count > 5 Then ' index 5 needs at least 6 fields
p.TestCase = st2(5).Trim
End If
If st2(3) > "" Then
p.AccountNumbersFromCase.Add(st2(3))
End If
While p.CSVLine.Count < 15
p.CSVLine.Add("")
End While
cases.Add(p)
End If
Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
MsgBox("Line " & ex.Message & " is not valid and will be skipped.")
End Try
iCount += 1
End While
End Using
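The same parser can be called from C#. The following is a minimal sketch, assuming the project can reference the Microsoft.VisualBasic assembly; QuotedCsv and Parse are names invented for this example. It reads from a TextReader so it works with a file (StreamReader) as well as with an in-memory string.

```csharp
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

// Hypothetical helper for illustration.
public static class QuotedCsv
{
    // Parses comma-delimited text into rows of fields. Quoted fields may
    // contain the delimiter; the quotes themselves are removed.
    public static List<string[]> Parse(TextReader input)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(input))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;
            while (!parser.EndOfData)
            {
                rows.Add(parser.ReadFields());
            }
        }
        return rows;
    }
}
```

For a file, the call would look like QuotedCsv.Parse(new StreamReader(path)); the parser disposes the reader when it is done.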
Use this to close the streams properly:
using(var sw = new StreamWriter(path3))
using(var sr = new StreamReader(path2))
{
string[] lines = File.ReadAllLines(path);
foreach (string line in lines)
{
string user = sr.ReadLine();
if (line != user)
{
sw.WriteLine(line);
}
}
}
For help with the actual removal or comparison logic, please answer El Ronnoco's comment above...
You need to close the streams, or use a using statement:
sw.Close();
using(StreamWriter sw = new StreamWriter(@"c:\test3.txt"))
You can use LINQ...
class Program
{
static void Main(string[] args)
{
var fullList = "TextFile1.txt".ReadAsLines();
var removeThese = "TextFile2.txt".ReadAsLines();
//Change this line if you need to change the filter results.
//Note: this assumes you want to remove results from the first
// list when the entire record matches. If you want to match on
// only part of the list you will need to split/parse the records
// and then filter your results.
var cleanedList = fullList.Except(removeThese);
cleanedList.WriteAsLinesTo("result.txt");
}
}
public static class Tools
{
public static IEnumerable<string> ReadAsLines(this string filename)
{
using (var reader = new StreamReader(filename))
while (!reader.EndOfStream)
yield return reader.ReadLine();
}
public static void WriteAsLinesTo(this IEnumerable<string> lines, string filename)
{
using (var writer = new StreamWriter(filename) { AutoFlush = true, })
foreach (var line in lines)
writer.WriteLine(line);
}
}
using(var sw = new StreamWriter(path3))
using(var sr = new StreamReader(path))
{
string[] arrRemove = File.ReadAllLines(path2);
//Upper-cased values of the second column of every line in the second file
HashSet<string> listRemove = new HashSet<string>();
foreach(string s in arrRemove)
{
string[] sa = s.Split(',');
if( sa.Length < 2 ) continue;
listRemove.Add(sa[1].ToUpperInvariant());
}
string line = sr.ReadLine();
while( line != null )
{
string[] sa = line.Split(',');
if( sa.Length < 2 )
sw.WriteLine(line);
else if( !listRemove.Contains(sa[1].ToUpperInvariant()) )
sw.WriteLine(line);
line = sr.ReadLine();
}
}