Extracting email addresses and names from a text file - c#

I will try to explain the problem as well as I can. I have a text file with email addresses and names. It looks like this: Barb Beney "de.mariof@vienna.aa", "Beny Beney" bet@catering.at, etc., all on the same line. This is just an example; I have thousands of such entries in one big text file. I want to extract the emails and names so that I get something like this in the end:
Beny Beney bet@catering.at - separate, next to each other, on one line and without quote marks. In the end it should also eliminate all duplicate addresses from the file.
I wrote the code for extracting email addresses and it works, but I don't know how to do the rest: how to extract the names, put them on the same line as the addresses, and eliminate duplicates. I hope I described it properly so you know what I'm trying to do. This is the code I have:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Text.RegularExpressions;
using System.IO;
namespace Email
{
class Program
{
static void Main(string[] args)
{
ExtractEmails(@"C:\Users\drake\Desktop\New.txt", @"C:\Users\drake\Desktop\Email.txt");
}
public static void ExtractEmails(string inFilePath, string outFilePath)
{
string data = File.ReadAllText(inFilePath);
Regex emailRegex = new Regex(@"\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*",
RegexOptions.IgnoreCase);
MatchCollection emailMatches = emailRegex.Matches(data);
StringBuilder sb = new StringBuilder();
foreach (Match emailMatch in emailMatches)
{
sb.AppendLine(emailMatch.Value);
}
File.WriteAllText(outFilePath, sb.ToString());
}
}
}

For the new desired formatting, you could do something like this:
private string[] parseEmails(string bigStringIn){
string[] output;
string bigString;
// remove the quote marks, then split the entries on commas
bigString = bigStringIn.Replace("\"", "");
output = bigString.Split(",".ToCharArray());
return output;
}
It takes the string with the mail addresses, removes the quote marks, then splits the string into a string array whose entries have the format: name lastname email@some.com
To delete the duplicate entries, a nested for loop should do the trick, checking (maybe after a .Split()) for matching strings.
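The split-and-strip idea above can be combined with the e-mail regex from the question so that each entry yields a "name address" line and duplicates are dropped in a single pass. What follows is only a sketch under assumptions: `ContactParser` and `Parse` are hypothetical names, entries are assumed to be separated by commas (so a name containing a comma would break it), and two addresses that differ only in letter case are treated as duplicates.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ContactParser
{
    // Same e-mail pattern as in the question.
    private static readonly Regex EmailRegex =
        new Regex(@"\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*");

    // Returns "Name address" lines, one per distinct address (first occurrence wins).
    public static List<string> Parse(string data)
    {
        var seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        var result = new List<string>();

        foreach (string rawEntry in data.Split(','))
        {
            string entry = rawEntry.Replace("\"", "").Trim();
            Match email = EmailRegex.Match(entry);
            if (!email.Success)
                continue; // no address in this entry, skip it

            // Whatever is left after removing the address is treated as the name.
            string name = entry.Remove(email.Index, email.Length).Trim();

            // HashSet.Add returns false when the address was already seen.
            if (seen.Add(email.Value))
                result.Add((name + " " + email.Value).Trim());
        }
        return result;
    }
}

public static class Program
{
    public static void Main()
    {
        string data = "Barb Beney \"de.mariof@vienna.aa\", \"Beny Beney\" bet@catering.at, Beny Beney BET@catering.at";
        foreach (string line in ContactParser.Parse(data))
            Console.WriteLine(line);
    }
}
```

The HashSet replaces the nested loop: each lookup is a constant-time check instead of a scan over all earlier entries, which matters with thousands of addresses. The resulting list can be written out with File.WriteAllLines.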

Welcome! You can use this code; it creates a new file that contains all the e-mails without duplicates:
static void Main(string[] args)
{
TextWriter w = File.CreateText(@"C:\Users\drake\Desktop\NonDuplicateEmails.txt");
ExtractEmails(@"C:\Users\drake\Desktop\New.txt", @"C:\Users\drake\Desktop\Email.txt");
TextReader r = File.OpenText(@"C:\Users\drake\Desktop\Email.txt");
RemovingAllDupes(r, w);
}
public static void RemovingAllDupes(TextReader reader, TextWriter writer)
{
string currentLine;
HashSet<string> previousLines = new HashSet<string>();
while ((currentLine = reader.ReadLine()) != null)
{
// Add returns true if it was actually added,
// false if it was already there
if (previousLines.Add(currentLine))
{
writer.WriteLine(currentLine);
}
}
writer.Close();
}

You can also use this code with big files:
static void Main(string[] args)
{
ExtractEmails(@"C:\Users\drake\Desktop\New.txt", @"C:\Users\drake\Desktop\Email.txt");
var sr = new StreamReader(File.OpenRead(@"C:\Users\drake\Desktop\Email.txt"));
var sw = new StreamWriter(File.OpenWrite(@"C:\Users\drake\Desktop\NonDuplicateEmails.txt"));
RemovingAllDupes(sr, sw);
}
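public static void RemovingAllDupes(StreamReader str, StreamWriter stw)
{
// Only the hash codes are stored to save memory. Note that two different
// lines can share a hash code, in which case the later one is skipped too.
var lines = new HashSet<int>();
while (!str.EndOfStream)
{
string line = str.ReadLine();
int hc = line.GetHashCode();
if (lines.Contains(hc))
continue;
lines.Add(hc);
stw.WriteLine(line);
}
stw.Flush();
stw.Close();
str.Close();
}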
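One caveat with the hash-code variant: string.GetHashCode can return the same value for two different lines, so a distinct address may occasionally be dropped as a false duplicate. A possible alternative, sketched here with the hypothetical names `LineDeduplicator`, `DistinctLines` and `RemoveDuplicates`, keeps the streaming behaviour but compares the actual strings. It assumes that addresses differing only in letter case count as duplicates.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class LineDeduplicator
{
    // Yields each line the first time it is seen; the comparison ignores case.
    public static IEnumerable<string> DistinctLines(IEnumerable<string> lines)
    {
        var seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (string line in lines)
        {
            if (seen.Add(line))
                yield return line;
        }
    }

    // File.ReadLines reads lazily, so only the distinct lines are kept in memory.
    public static void RemoveDuplicates(string inPath, string outPath)
    {
        File.WriteAllLines(outPath, DistinctLines(File.ReadLines(inPath)));
    }
}

public static class Program
{
    public static void Main()
    {
        var sample = new[] { "a@x.at", "b@x.at", "A@x.at", "a@x.at" };
        foreach (string line in LineDeduplicator.DistinctLines(sample))
            Console.WriteLine(line);
    }
}
```

The memory cost is higher than storing integers, but for a file with thousands of short lines that is unlikely to matter.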

Related

Read and write to text file efficiently

I have a homework assignment to create a C# console program. It should create a text file with 2 phrases:
Hello, World!
Goodbye, Cruel World!
Then I also must create a program to read the 2 phrases from the file.
After two hours this is what I have. It works, but I want to rewrite the program to be more efficient. I am mainly struggling with how to output the file into a runnable .cs file.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.IO;
namespace ConsoleApplication3
{
class Program
{
static void Main(string[] args)
{
//structure.txt contains the program we will enter our values into.
String filePath = "Structure.txt";
WriteToFile(filePath);
}
public static void WriteToFile(string filePath)
{
//create a string array to gather our text file information.
StreamReader reader = new StreamReader(filePath);
StreamReader info = new StreamReader("Structure.txt");
StreamWriter writer = new StreamWriter("Hello.cs", true);
String temp = String.Empty;
while (!info.EndOfStream)
{
String tempstring = String.Empty;
tempstring = reader.ReadLine();
while (!reader.EndOfStream)
{
temp = reader.ReadLine();
writer.WriteLine(temp);
if (temp == "//break")
{
writer.WriteLine("String1 = {}", tempstring);
}
}
}
reader.Close();
info.Close();
writer.Close();
}
}
}
More efficient? Sure:
// write
string[] lines = new [] {"Hello, World!", "Goodbye, Cruel World!"};
File.WriteAllLines("c:\\myFile.txt", lines);
// read (a different variable name, so both snippets compile in the same scope)
string[] readLines = File.ReadAllLines("c:\\myFile.txt");
That is all.

Delete all but {x} C# string

I'm trying to cycle through a .txt to build a test function for another application I'm building.
I've got a list of UK based lat/long values that are formatted like this:
Latitude: 57°39′55″N  57.665198
Longitude: 6°57′27″W  -6.95739395
Distance: 184.8338 mi Bearing: 329.815°
with the intended result of this small application being just the lat/long values:
57.665198
-6.95739395
So far I've got a StreamReader working with a myString.StartsWith("Latitude") {} but I'm stuck.
How do I detect a separator of 2 spaces "  " inside a string and delete everything before it? My code so far is this:
static void Main(string[] args)
{
string text = "";
using (var streamReader = new StreamReader(@"c:\mb\latlong.txt", Encoding.UTF8))
{
text = streamReader.ReadToEnd();
if (text.Trim().StartsWith("Latitude: "))
{
text.Split()
} else if (text.StartsWith("Distance: "))
{
} else if (text.StartsWith(""))
{
}
streamReader.ReadLine();
}
Console.ReadKey();
}
Thanks in advance
You can try using regular expressions:
var result = File
.ReadLines(@"C:\MyFile.txt")
.SelectMany(line => Regex
.Matches(line, @"(?<=\s)-?[0-9]+(\.[0-9]+)*$")
.OfType<Match>()
.Select(match => match.Value));
Test
// 57.665198
// -6.95739395
Console.Write(String.Join(Environment.NewLine, result));
Use string.IndexOf("  ") to find the position of the two spaces in the string. Then you can use string.Substring(position) to get the string after that point.
In your code:
if (text.Trim().StartsWith("Latitude: "))
{
var positionOfTwoSpaces = text.IndexOf("  ");
var latString = text.Substring(positionOfTwoSpaces);
var latValue = float.Parse(latString);
}
You can try the regular expression solution. (You might need to fix up the space counts in the regex definitions.)
static void Main(string[] args)
{
Regex lat = new Regex("Latitude: .+? (.+)");
Regex lon = new Regex("Longitude: .+? (.+)");
using (var streamReader = new StreamReader(@"c:\mb\latlong.txt", Encoding.UTF8))
{
string line;
while ((line = streamReader.ReadLine()) != null)
{
if (lat.IsMatch(line))
Console.WriteLine(lat.Match(line).Groups[1].Value.Trim()); // latitude
else if (lon.IsMatch(line))
Console.WriteLine(lon.Match(line).Groups[1].Value.Trim()); // longitude
}
}
Console.ReadKey();
}
A simple solution would be:
string[] fileLines = System.IO.File.ReadAllLines("input file path");
List<string> resultLines = new List<string>();
foreach (string line in fileLines) {
string[] parts = line.Split(new[] { "  " }, StringSplitOptions.None); // double space
if (parts.Length > 1) {
string lastPart = parts.LastOrDefault();
if (!string.IsNullOrEmpty(lastPart)) {
resultLines.Add(lastPart);
}
}
}
System.IO.File.WriteAllLines("output file path", resultLines.ToArray());
As I already suggested in my comment, you can look for the last occurrence of the space and take the substring from there.
using System;
using System.IO;
using System.Text;
public class Test
{
public static void Main()
{
// same file as in the question
using (var streamReader = new StreamReader(@"c:\mb\latlong.txt", Encoding.UTF8))
{
String line;
// stops at the end of the file or at the first empty line
while(!String.IsNullOrEmpty((line = streamReader.ReadLine())))
{
if(line.StartsWith("Latitude:"))
{
line = line.Substring(line.LastIndexOf(' ') + 1);
Console.WriteLine(line);
}
}
}
Console.ReadKey();
}
}
Working example.
I didn't provide all the code because the longitude case is just copy and paste. I think you can do that on your own. :)
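For completeness, the last-space idea can cover both the latitude and the longitude lines in one pass. This is a minimal sketch, not the answerer's code: `LatLongExtractor` and `Extract` are hypothetical names, and it assumes the decimal value is always the last space-separated token on the line and uses a dot as the decimal separator.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class LatLongExtractor
{
    // Returns the decimal value at the end of each Latitude/Longitude line.
    public static List<double> Extract(IEnumerable<string> lines)
    {
        var values = new List<double>();
        foreach (string rawLine in lines)
        {
            string line = rawLine.Trim();
            bool wanted = line.StartsWith("Latitude:", StringComparison.Ordinal)
                       || line.StartsWith("Longitude:", StringComparison.Ordinal);
            if (!wanted)
                continue; // skips the "Distance: ..." lines and blank lines

            // Everything after the last space is the decimal value.
            string last = line.Substring(line.LastIndexOf(' ') + 1);

            // Invariant culture, so "57.665198" parses the same on every machine.
            double value;
            if (double.TryParse(last, NumberStyles.Float, CultureInfo.InvariantCulture, out value))
                values.Add(value);
        }
        return values;
    }
}

public static class Program
{
    public static void Main()
    {
        var lines = new[]
        {
            "Latitude: 57°39′55″N  57.665198",
            "Longitude: 6°57′27″W  -6.95739395",
            "Distance: 184.8338 mi  Bearing: 329.815°"
        };
        foreach (double v in LatLongExtractor.Extract(lines))
            Console.WriteLine(v.ToString(CultureInfo.InvariantCulture));
    }
}
```

To run it against the real file, File.ReadLines with the path from the question can be passed in place of the sample array.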

Need help to get IP from strings in c#

So I'm working on a little side project in c# and want to read a long text file and when it encounters the line "X-Originating-IP: [192.168.1.1]" I would like to grab the IP and display to console just the recognized IP #, so just 192.168.1.1 etc. I am having trouble understanding regex. Anyone who could get me started is much appreciated. What I have so far is below.
namespace x.Originating.Ip
{
class Program
{
static void Main(string[] args)
{
int counter = 0;
string line;
System.IO.StreamReader file =
new System.IO.StreamReader("C:\\example.txt");
while ((line = file.ReadLine()) != null)
{
if (line.Contains("X-Originating-IP: "))
Console.WriteLine(line);
counter++;
}
file.Close();
Console.ReadLine();
}
}
}
Try this example:
//Add this namespace
using System.Text.RegularExpressions;
String input = @"X-Originating-IP: [192.168.1.1]";
Regex IPAd = new Regex(@"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b");
MatchCollection MatchResult = IPAd.Matches(input);
Console.WriteLine(MatchResult[0]);
You don't need to use regular expression:
if (line.Contains("X-Originating-IP: ")) {
string ip = line.Split(':')[1].Trim(new char[] {'[', ']', ' '});
Console.WriteLine(ip);
}
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
System.Net.WebClient webclient = new System.Net.WebClient();
string ip = webclient.DownloadString("http://whatismyip.org/");
Regex reg = new Regex("((2[0-4]\\d|25[0-5]|[01]?\\d\\d?)\\.){3}(2[0-4]\\d|25[0-5]|[01]?\\d\\d?)");
if (reg.Match(ip).Success)
{
Console.WriteLine(reg.Match(ip).ToString ());
Console.WriteLine("Success");
}
// Console.Write (ip);
Console.ReadLine();
}
}
}
I'm not sure, but I suppose your text file contains one IP address per row. If so, your code can be simplified like this:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Text.RegularExpressions;
namespace x.Originating.Ip
{
class Program
{
static void Main(string[] args)
{
string[] lines = System.IO.File.ReadAllLines("Your path & filename.extension");
Regex reg = new Regex("((2[0-4]\\d|25[0-5]|[01]?\\d\\d?)\\.){3}(2[0-4]\\d|25[0-5]|[01]?\\d\\d?)");
for (int i = 0; i < lines.Length; ++i)
{
if (reg.Match(lines[i]).Success)
{
//Do what you want........
}
}
}
}
}
The following regular expression should get you what you want:
(?<=X-Originating-IP: +\[?)((2[0-4]\d|25[0-5]|[01]?\d\d?)\.){3}(2[0-4]\d|25[0-5]|[01]?\d\d?)
This uses a positive lookbehind to assert that "X-Originating-IP: " (optionally followed by the opening bracket shown in your sample line) comes directly before an IPv4 address. Only the IP address itself is returned as the match value.
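A lookbehind pattern of this kind could be applied line by line roughly as follows. This is a sketch under assumptions: `OriginatingIp` and `Find` are hypothetical names, and the lookbehind here includes an optional `\[?` because the sample line in the question wraps the address in square brackets.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class OriginatingIp
{
    // .NET allows variable-length lookbehinds, so " +" and "\[?" are fine here.
    private static readonly Regex IpRegex = new Regex(
        @"(?<=X-Originating-IP: +\[?)((2[0-4]\d|25[0-5]|[01]?\d\d?)\.){3}(2[0-4]\d|25[0-5]|[01]?\d\d?)");

    // Returns the address from every line that carries the header.
    public static List<string> Find(IEnumerable<string> lines)
    {
        var ips = new List<string>();
        foreach (string line in lines)
        {
            Match m = IpRegex.Match(line);
            if (m.Success)
                ips.Add(m.Value); // the lookbehind text is not part of the match
        }
        return ips;
    }
}

public static class Program
{
    public static void Main()
    {
        var lines = new[]
        {
            "Subject: hello",
            "X-Originating-IP: [192.168.1.1]",
            "X-Originating-IP: 10.24.36.17"
        };
        foreach (string ip in OriginatingIp.Find(lines))
            Console.WriteLine(ip);
    }
}
```

Because the header text sits inside the lookbehind, `m.Value` holds only the address and no capture group index is needed.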
Rather than using a regex: it looks like you are parsing a MIME email, so consider LumiSoft.Net.MIME, which lets you access the headers through a defined API.
Alternatively, use the built-in IPAddress.Parse method, which supports both IPv4 and IPv6:
const string x_orig_ip = "X-Originating-IP:";
string header = "X-Originating-IP: [10.24.36.17]";
header = header.Trim();
if (header.StartsWith(x_orig_ip, StringComparison.OrdinalIgnoreCase))
{
string sIpAddress = header.Substring(x_orig_ip.Length, header.Length - x_orig_ip.Length)
.Trim(new char[] { ' ', '\t', '[', ']' });
var ipAddress = System.Net.IPAddress.Parse(sIpAddress);
// do something with IP address.
return ipAddress.ToString();
}
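IPAddress.Parse throws a FormatException when the text is not a valid address, which is awkward when scanning a file with malformed headers. A variant built on IPAddress.TryParse might look like the sketch below; `HeaderIp` and `TryGetIp` are hypothetical names, not part of any library.

```csharp
using System;
using System.Net;

public static class HeaderIp
{
    private const string Prefix = "X-Originating-IP:";

    // Returns true and sets 'address' when the header line holds a parsable IP.
    public static bool TryGetIp(string header, out IPAddress address)
    {
        address = null;
        if (header == null)
            return false;

        header = header.Trim();
        if (!header.StartsWith(Prefix, StringComparison.OrdinalIgnoreCase))
            return false;

        // Drop the prefix, then the surrounding whitespace and brackets.
        string candidate = header.Substring(Prefix.Length).Trim(' ', '\t', '[', ']');
        return IPAddress.TryParse(candidate, out address);
    }
}

public static class Program
{
    public static void Main()
    {
        IPAddress ip;
        if (HeaderIp.TryGetIp("X-Originating-IP: [10.24.36.17]", out ip))
            Console.WriteLine(ip);
    }
}
```

Note that TryParse is lenient about some short numeric forms, so stricter validation may still be needed depending on the input.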

Writing into .txt file without erasing previous data C#

I am trying to split a string in a .txt file by commas (,) into a string[] and then replace every item of the string[] with another format, for example:
"Marko Kostic, Faculty of Technical Sciences, University of Novi Sad,
Trg D. Obradovica 6, 21125 Novi Sad, Serbia"
I want to split this string at the commas between the words, put every value on a separate line like a list, and then wrap each value in a tag, so that "Marko Kostic" becomes
<addr-line>Marko Kostic<\addr-line>
The problem is that the writer writes only the last value of the string[] and erases the previous values.
Any suggestions?
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Text.RegularExpressions;
using Microsoft.Office.Interop;
using Microsoft.Office.Interop.Word;
using System.Diagnostics;
using System.Reflection;
using System.Collections;
using System.Runtime.InteropServices;
namespace AffiliationParser
{
class Program
{
static void Main(string[] args)
{
Microsoft.Office.Interop.Word.Application oWord = new Microsoft.Office.Interop.Word.Application();
object missing = System.Reflection.Missing.Value;
object isVisible = false;
using (StreamReader batch = new StreamReader(@"D:\Developing\REF\AffiliationParser\AffiliationParser\AffiliationParser\bin\Debug\Run.bat"))
{
string bat;
while (!batch.EndOfStream)
{
bat = batch.ReadLine();
// do your processing with batch command
if (bat == "pause")
{
continue;
}
string fpath = bat.Substring(bat.IndexOf(" \""));
string path = fpath.Replace("\"", "").Replace(" ","");
string[] name = Directory.GetFiles(path, "*.txt");
string words = name.Min();
string word = words.Substring(words.LastIndexOf("\\")).Replace("\\", "");
Console.WriteLine("Processing........");
Console.WriteLine(word);
string Npath = path + @"\Arr" + word;
if (File.Exists(Npath))
{
System.Windows.Forms.MessageBox.Show("The file Arr" + word + " alredy exist in " + path);
continue;
}
else
{
File.Copy(words, Npath);
StreamReader temp = new StreamReader(Npath, Encoding.UTF8);
string tempstring = temp.ReadToEnd();
string[] temp3 = tempstring.Split(',');
temp.Close();
foreach (string item in temp3)
{
string Nitem = item.TrimStart().TrimEnd();
//Match MatchCont = Regex.Match(Nitem, @"Afganistan|Albania|Algeria|American\s+Samoa|Andorra|Angola|Anguilla|Antarctica|Antigua\s+and\s+Barbuda|Argentina|Armenia|Aruba|Australia|Austria|Azerbaijan|Bahamas|Bahrain|Bangladesh|Barbados|Belarus|Belgium|Belize|Benin|Bermuda|Bhutan|Bolivia|Bosnia\s+and\s+Herzegovina|Botswana|Bouvet\s+Island|Brazil|British\s+Indian\s+Ocean\s+Territory|Brunei\s+Darussalam|Bulgaria|Burkina\s+Faso|Burundi|Cambodia|Cameroon|Canada|Cape\s+Verde|Cayman\s+Islands|Central\s+African\s+Republic|Chad|Chile|China|Christmas\s+Island|Cocos\s+\(Keeling\)\s+Islands|Colombia|Comoros|Democratic\s+People's\s+Republic\s+of\s+Korea|Democratic\s+Republic\s+of\s+Congo|Cook\s+Islands|Costa\s+Rica|Cote\s+D'Ivoire|Croatia|Cuba|Cyprus|Czech\s+Republic|Republic\s+of\s+Korea|Denmark|Djibouti|Dominica|Dominican\s+Republic|East\s+Timor|Ecuador|Egypt|El\s+Salvador|Equatorial\s+Guinea|Eritrea|Estonia|Ethiopia|Falkland\s+Islands\s+\(Malvinas\)|Faroe\s+Islands|Fiji|Finland|France\s+Metropolitan|France|French\s+Guiana|French\s+Polynesia|French\s+Southern\s+Territories|Gabon|Gambia|Georgia|Germany|Ghana|Gibraltar|Greece|Greenland|Grenadaf|Guadeloupe|Guam|Guatemala|Guinea|Guinea\-Bissau|Guyana|Haiti|Heard\s+Island\s+and\s+McDonald\s+Island|Honduras|Hong\s+Kong|Hungary|Iceland|India|Indonesia|Iran|Iraq|Ireland|Northern\s+Ireland|Isle\s+Of\s+Man|Israel|Italy|Jamaica|Japan|Jordan|Kazakhstan|Kenya|Kiribati|Kuwait|Kyrgyzstan|Lao\s+People'S\s+Democratic\s+Republic|Latvia|Lebanon|Lesotho|Liberia|Libya|Liechtenstein|Lithuania|Luxembourg|Macau|Macedonia|Madagascar|Malawi|Malaysia|Maldives|Mali|Malta|Marshall\s+Islands|Martinique|Mauritania|Mauritius|Mayotte|Mexico|Micronesia|Moldova|Monaco|Mongolia|Montserrat|Morocco|Mozambique|Myanmar|Namibia|Nauru|Nepal|Netherlands\s+Antilles|New\s+Caledonia|New\s+Zealand|Nicaragua|Nigeria|Niger|Niue|Norfolk\s+Island|Northern\s+Mariana\s+Islands|Norway|Oman|Pakistan|Palau|Palestine|Panama|Papua\s+New\s+Guinea|Paraguay|Peru|Philippines|Pitcairn|Poland|Portugal|Puerto\s+Rico|Qatar|Reunion|Romania|Russia|Rwanda|Saint\s+Kitts\s+and\s+Nevis|Saint\s+Lucia|Saint\s+Vincent\s+and\s+The\s+Grenadines|Samoa|San\s+Marino|Sao\s+Tome\s+and\s+Principe|Saudi\s+Arabia|Scotland|Senegal|Serbia|Kosovo|Montenegro|Seychelles|Sierra\s+Leone|Singapore|Slovakia|Slovenia|Solomon\s+Islands|Somalia|South\s+Africa|South\s+Georgia\s+and\s+The\s+South\s+Sandwich\s+Islands|Spain|Sri\s+Lanka|St.\s+Helena|St.\s+Pierre\s+and\s+Miquelon|Sudan|Suriname|Svalbard\s+and\s+Jan\s+Mayen\s+Islands|Swaziland|Sweden|Switzerland|Syria|Taiwan|Tajikistan|Tanzania|Thailand|The\s+Netherlands|Togo|Tokelau|Tonga|Trinidad\s+and\s+Tobago|Tunisia|Turkey|Turkmenistan|Turks\s+and\s+Caicos\s+Islands|Tuvalu|Uganda|Ukraine|United\s+Arab\s+Emirates|UAE|UK|United\s+States\s+Minor\s+Outlying\s+Islands|Uruguay|USA|Uzbekistan|Vanuatu|Vatican\s+City\s+State\s+\(Holy\s+See\)|Venezuela|Vietnam|British\s+Virgin\s+Islands|USA\s+Virgin\s+Islands|Wallis\s+and\s+Futuna\s+Islands|Western\s+Sahara|West\s+Indies|Yemen|Zambia|Zimbabwe|Abkhazia|Afghanistan|Akrotiri\s+and\s+Dhekelia|Aland|Ascension\s+Island|The\s+Bahamas|Brunei|Central\s+Africa|Cocos|Congo|Cote\s+d'lvoire|Czech|Dominican|Falkland\s+Islands|Cambia,\s+The|Grenada|Guemsey|Isle\s+of\s+Man|Jersey|Korea|Laos|Macao|Nagorno\-Karabakh|Netherlands|Northern\s+Cyprus|Pitcaim\s+Islands|Sahrawi\s+Arab\s+Democratic|Saint\-Barthelemy|Saint\s+Helena|Saint\s+Martin|Saint\s+Pierre\s+and\s+Miquelon|Saint\s+Vincent\s+and\s+Grenadines|Samos|Somaliland|South\s+Ossetia|Svalbard|Transnistria|Tristan\s+da\s+Cunha|United\s+Kingdom|Vatican\s+City|Virgin\s+Islands|Wallis\s+and\s+Futuna|España|Witsch|United\s+States|Prague\s+Czech\s+Republic", RegexOptions.Singleline | RegexOptions.Compiled | RegexOptions.ExplicitCapture | RegexOptions.IgnoreCase);
//if (MatchCont.Success==true)
//{
// MatchCont.Result(@"<country>" + Nitem + @"<\country>");
//}
}
}
}
}
}
}
}
Try to include code in your question; it's not a best practice to simply hand out answers. That being said, you'll want to look at the String.Split method, String.Trim and the File.AppendText method.
Simple ways to do this:
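string[] stuff = data.Split(',');
// AppendText adds to the end of the file instead of overwriting it;
// the using block makes sure the writer is flushed and closed.
using (StreamWriter sW = File.AppendText(pathToFile))
{
foreach(string parts in stuff)
{
sW.WriteLine(parts.Trim());
}
}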
Very, very basic, and not giving you the answer without some work on your part. Good luck!
Here's some references: File.AppendText and String.Trim
string input="a,b,c,d";
string [] parts=input.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries);
List<string> output=new List<string>();
foreach(string s in parts)
{
// do sth you like;
var newStr="<abc>"+s+"</abc>";
output.Add(newStr);
}
return output.ToArray();
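Applied to the affiliation string from the question, the split-and-wrap approach could be sketched as follows. `AffiliationFormatter` and `ToAddrLines` are hypothetical names, and the closing tag is assumed to be `</addr-line>` rather than the `<\addr-line>` written in the question.

```csharp
using System;
using System.Collections.Generic;

public static class AffiliationFormatter
{
    // Splits on commas and wraps every non-empty part in <addr-line> tags.
    public static List<string> ToAddrLines(string text)
    {
        var lines = new List<string>();
        foreach (string part in text.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries))
        {
            // A part may start on a new line in the source file; flatten it first.
            string value = part.Replace("\r", " ").Replace("\n", " ").Trim();
            if (value.Length > 0)
                lines.Add("<addr-line>" + value + "</addr-line>");
        }
        return lines;
    }
}

public static class Program
{
    public static void Main()
    {
        string text = "Marko Kostic, Faculty of Technical Sciences, University of Novi Sad,\n"
                    + "Trg D. Obradovica 6, 21125 Novi Sad, Serbia";
        foreach (string line in AffiliationFormatter.ToAddrLines(text))
            Console.WriteLine(line);
    }
}
```

Collecting all lines first and writing them once, for example with File.WriteAllLines, avoids the situation where a writer opened inside the loop overwrites the file on every iteration and leaves only the last value.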

Merging 2 Text Files in C#

Firstly, I'd just like to mention that I only started learning C# a few days ago, so my knowledge of it is limited.
I'm trying to create a program that will parse text files for certain phrases input by the user and then output them into a new text document.
At the moment, I have the program searching the original input file for the text entered by the user, copying those lines out, creating new text files, merging them together and then deleting the temporary files afterwards.
I'm guessing that this is not the most efficient way of doing this, but I just created it and had it work in a logical manner that I can understand as a novice.
The code is as follows:
private void TextInput1()
{
using (StreamReader fileOpen = new StreamReader(txtInput.Text))
{
using (StreamWriter fileWrite = new StreamWriter(@"*DIRECTORY*\FIRSTFILE.txt"))
{
string file;
while ((file = fileOpen.ReadLine()) != null)
{
if (file.Contains(txtFind.Text))
{
fileWrite.Write(file + "\r\n");
}
}
}
}
}
private void TextInput2()
{
using (StreamReader fileOpen = new StreamReader(txtInput.Text))
{
using (StreamWriter fileWrite = new StreamWriter(@"*DIRECTORY*\SECONDFILE.txt"))
{
string file;
while ((file = fileOpen.ReadLine()) != null)
{
if (file.Contains(txtFind2.Text))
{
fileWrite.Write("\r\n" + file);
}
}
}
}
}
private static void Combination()
{
ArrayList fileArray = new ArrayList();
using (StreamWriter writer = File.CreateText(@"*DIRECTORY*\FINALOUTPUT.txt"))
{
using (StreamReader reader = File.OpenText(@"*DIRECTORY*\FIRSTFILE.txt"))
{
writer.Write(reader.ReadToEnd());
}
using (StreamReader reader = File.OpenText(@"*DIRECTORY*\SECONDFILE.txt"))
{
writer.Write(reader.ReadToEnd());
}
}
}
private static void Delete()
{
if (File.Exists(@"*DIRECTORY*\FIRSTFILE.txt"))
{
File.Delete(@"*DIRECTORY*\FIRSTFILE.txt");
}
if (File.Exists(@"*DIRECTORY*\SECONDFILE.txt"))
{
File.Delete(@"*DIRECTORY*\SECONDFILE.txt");
}
}
The output file that is being created simply contains the first text input followed by the second. I am wondering if it is possible to merge them into one file, one line at a time: it is a consecutive file, so I need a line from input 1 followed by a line from input 2, rather than all of 1 and then all of 2.
Thanks, Neil.
To combine the contents of the two files into one merged file line by line, you could replace your Combination() code with this:
string[] file1 = File.ReadAllLines(@"*DIRECTORY*\FIRSTFILE.txt");
string[] file2 = File.ReadAllLines(@"*DIRECTORY*\SECONDFILE.txt");
using (StreamWriter writer = File.CreateText(@"*DIRECTORY*\FINALOUTPUT.txt"))
{
int lineNum = 0;
while(lineNum < file1.Length || lineNum < file2.Length)
{
if(lineNum < file1.Length)
writer.WriteLine(file1[lineNum]);
if(lineNum < file2.Length)
writer.WriteLine(file2[lineNum]);
lineNum++;
}
}
This also works when the two files don't contain the same number of lines.
Try this method. It takes three paths: file 1, file 2 and the output file.
public void MergeFiles(string pathFile1, string pathFile2, string pathResult)
{
File.WriteAllText(pathResult, File.ReadAllText(pathFile1) + File.ReadAllText(pathFile2));
}
If the pathResult file exists, the WriteAllText method will overwrite it. Remember to include the System.IO namespace.
Important: this is not recommended for large files! Use one of the other options in this thread.
If your input files are quite large and you run out of memory, you could also try wrapping the two readers like this:
using (StreamWriter writer = File.CreateText(@"*DIRECTORY*\FINALOUTPUT.txt"))
{
using (StreamReader reader1 = File.OpenText(@"*DIRECTORY*\FIRSTFILE.txt"))
{
using (StreamReader reader2 = File.OpenText(@"*DIRECTORY*\SECONDFILE.txt"))
{
string line1 = null;
string line2 = null;
while ((line1 = reader1.ReadLine()) != null)
{
writer.WriteLine(line1);
line2 = reader2.ReadLine();
if(line2 != null)
{
writer.WriteLine(line2);
}
}
}
}
}
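Note that this loop stops as soon as the first file runs out of lines, so any remaining lines of the second file are dropped; you have to have an idea of how many lines your input files contain. Still, I think it gives you the general idea to proceed.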
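If the line counts are not known in advance, the streaming loop can keep going until both readers are exhausted. The sketch below uses the hypothetical names `LineInterleaver` and `Interleave` and works on TextReader/TextWriter, so it accepts the StreamReader and StreamWriter objects from the snippet above as well as in-memory strings.

```csharp
using System;
using System.IO;

public static class LineInterleaver
{
    // Writes lines alternately from both readers; once one input ends,
    // the remaining lines of the other are still written.
    public static void Interleave(TextReader first, TextReader second, TextWriter output)
    {
        string line1 = first.ReadLine();
        string line2 = second.ReadLine();
        while (line1 != null || line2 != null)
        {
            if (line1 != null)
            {
                output.WriteLine(line1);
                line1 = first.ReadLine();
            }
            if (line2 != null)
            {
                output.WriteLine(line2);
                line2 = second.ReadLine();
            }
        }
    }
}

public static class Program
{
    public static void Main()
    {
        using (var first = new StringReader("a\nb\nc"))
        using (var second = new StringReader("1"))
        {
            LineInterleaver.Interleave(first, second, Console.Out);
        }
    }
}
```

Only one line per input is held in memory at a time, so this also suits the large-file case that motivated the nested readers.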
Using a FileInfo extension you could merge one or more files by doing the following:
public static class FileInfoExtensions
{
public static void MergeFiles(this FileInfo fi, string strOutputPath , params string[] filesToMerge)
{
var fiLines = File.ReadAllLines(fi.FullName).ToList();
fiLines.AddRange(filesToMerge.SelectMany(file => File.ReadAllLines(file)));
File.WriteAllLines(strOutputPath, fiLines.ToArray());
}
}
Usage
FileInfo fi = new FileInfo("input");
fi.MergeFiles("output", "File2", "File3");
I appreciate this question is almost old enough to (up)vote (itself), but for an extensible approach:
const string FileMergeDivider = "\n\n";
public void MergeFiles(string outputPath, params string[] inputPaths)
{
if (!inputPaths.Any())
throw new ArgumentException(nameof(inputPaths) + " required");
if (inputPaths.Any(string.IsNullOrWhiteSpace) || !inputPaths.All(File.Exists))
throw new ArgumentNullException(nameof(inputPaths), "contains invalid path(s)");
File.WriteAllText(outputPath, string.Join(FileMergeDivider, inputPaths.Select(File.ReadAllText)));
}
