Fast text reader in C#

I am using the following code to read a number of text files and combine them into one string:
foreach (string path in filePaths)
{
StreamReader singfile = new StreamReader(path);
string file_text = singfile.ReadToEnd();
combinetexts += file_text + "\n";
singfile.Close();
}
As far as I know, the string combinetexts will be copied once per file, so n times for n file paths. Is it possible to do this procedure using a StringBuilder? I tried, but it doesn't work.
Thanks in advance.

Here's a short LINQ way of doing it:
string result = string.Join("\n", filePaths.Select(x => File.ReadAllText(x)));
Or with C# 4 (which has better handling of type inference wrt method group conversions):
string result = string.Join("\n", filePaths.Select(File.ReadAllText));
If you're using .NET 3.5 you'll need to create an array of the strings, as string.Join didn't have as many overloads then:
string result = string.Join("\n", filePaths.Select(x => File.ReadAllText(x))
.ToArray());
This has the disadvantage of reading all of the files before performing the concatenation, admittedly, but it's still better than the repeated concatenation in the original code. It might also be more efficient than using StringBuilder; that depends on the string.Join implementation.
See my article on StringBuilder for why the original code is really inefficient.
EDIT: Note that this does not include a trailing \n at the end. If you really want to add that, you can :)

Here's your example using a StringBuilder instead of a string:
var sb = new StringBuilder();
foreach (string path in filePaths)
sb.AppendLine(File.ReadAllText(path));
string result = sb.ToString();
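(I've also taken the liberty of shortening and simplifying your code a bit. File.ReadAllText reads the complete contents of a file without having to open a StreamReader manually. In addition, AppendLine automatically appends a line terminator after each file; note that this is Environment.NewLine, i.e. \r\n on Windows, rather than a bare \n.)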

Of course it's possible, use
StringBuilder combinetexts = new StringBuilder();
...
combinetexts.Append(file_text);
combinetexts.Append("\n");

It's more efficient to use a StringBuilder when you build up a string through repeated concatenation.
http://www.codeproject.com/Articles/14936/StringBuilder-vs-String-Fast-String-Operations-wit
Best regards

Try the below code:
StringBuilder strBuilder= new StringBuilder();
foreach (string path in filePaths)
{
StreamReader singfile = new StreamReader(path);
string file_text = singfile.ReadToEnd();
strBuilder.AppendLine(file_text);
singfile.Close();
}
Console.WriteLine(strBuilder.ToString());

Yes, it is possible to use StringBuilder, and there are various ways to "optimize" this code.
TL;DR: Skip to the last part of this post for the best way to do this.
Here's stage 1 changes to your code:
StringBuilder combinetexts = new StringBuilder();
foreach (string path in filePaths)
{
StreamReader fs = new StreamReader(path);
string file_text = fs.ReadToEnd();
combinetexts.Append(file_text).Append("\n");
fs.Close();
}
Secondly, before building the StringBuilder you can calculate how much space you actually need. This reduces the chance of the string being copied even further:
long totalSize = 0;
foreach (string path in filePaths)
totalSize += new FileInfo(path).Length + 1; // +1 = \n
StringBuilder sb = new StringBuilder(Convert.ToInt32(totalSize));
foreach (string path in filePaths)
{
StreamReader fs = new StreamReader(path);
string file_text = fs.ReadToEnd();
sb.Append(file_text).Append("\n");
fs.Close();
}
Lastly I would use using (...) instead of the fs.Close(); call:
long totalSize = 0;
foreach (string path in filePaths)
totalSize += new FileInfo(path).Length + 1; // +1 = \n
StringBuilder sb = new StringBuilder(Convert.ToInt32(totalSize));
foreach (string path in filePaths)
{
using (StreamReader fs = new StreamReader(path))
{
string file_text = fs.ReadToEnd();
sb.Append(file_text).Append("\n");
}
}
Then I would use LINQ a bit more and switch to using File.ReadAllText instead of an explicit StreamReader, and then combine the lines of code a bit:
long totalSize = filePaths.Sum(path => new FileInfo(path).Length + 1);
StringBuilder sb = new StringBuilder(Convert.ToInt32(totalSize));
foreach (string path in filePaths)
{
sb.Append(File.ReadAllText(path)).Append("\n");
}
However, as it turns out, there's an even better way to do this:
string combinetexts = String.Join("\n", filePaths.Select(path => File.ReadAllText(path)));
or in C# 4.0 which can better infer the right way to handle method group conversions:
string combinetexts = String.Join("\n", filePaths.Select(File.ReadAllText));
This will do all of the above:
It reads in all the files
String.Join calculates the total size needed to hold the entire string
It then combines all the texts, with a \n between each
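The three steps above can be written out by hand, which may make it clearer why this avoids the repeated copying. This is only a sketch of the idea, not the actual String.Join implementation; the class and method names (FileCombiner, CombineManually) are made up for illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;

public static class FileCombiner
{
    // Hypothetical helper: reads every file, sizes the buffer once,
    // then copies each text exactly once with "\n" between the files.
    public static string CombineManually(IEnumerable<string> filePaths)
    {
        // Step 1: read in all the files.
        List<string> texts = filePaths.Select(path => File.ReadAllText(path)).ToList();
        if (texts.Count == 0)
        {
            return string.Empty;
        }

        // Step 2: total size = all characters plus one separator between each pair.
        int totalLength = texts.Sum(text => text.Length) + (texts.Count - 1);

        // Step 3: combine the texts into a buffer that never has to grow.
        var sb = new StringBuilder(totalLength);
        for (int i = 0; i < texts.Count; i++)
        {
            if (i > 0)
            {
                sb.Append('\n');
            }
            sb.Append(texts[i]);
        }
        return sb.ToString();
    }
}
```

Calling FileCombiner.CombineManually(filePaths) should give the same result as the String.Join one-liner, including the missing trailing \n.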

Related

How can I read data from specific location in .asc file using c#

I have an .asc file which has thousands of rows. Each column in a row has a fixed length and the columns are separated by one space. I want to read the email id column, which starts at position 296 and ends at position 326 in a row.
Is there any way to read such data from .asc file?
This might do the trick for you. It just reads the email ids from the file, whatever its extension might be (txt or asc). It also doesn't matter if the email address is located somewhere other than positions 296 to 326.
public void ExtractAllEmails()
{
string datafrmAsc = File.ReadAllText(YourASCFile); //read File
Regex emailRegex = new Regex(@"\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*", RegexOptions.IgnoreCase);
MatchCollection emailMatches = emailRegex.Matches(datafrmAsc);
StringBuilder sb = new StringBuilder();
foreach (Match emailMatch in emailMatches)
{
sb.AppendLine(emailMatch.Value);
}
File.WriteAllText(SomeTxtFile, sb.ToString());
}
Assuming it is a large text file, you can do something like this:
List<string> emailsList = new List<string>();
int startIndex = 295;
int endIndex = 325;
using (FileStream stream = File.Open("c:\\test.asc", FileMode.Open))
using (StreamReader sr = new StreamReader(stream))
{
string line = string.Empty;
while ((line = sr.ReadLine()) != null)
{
emailsList.Add(line.Substring(startIndex, endIndex - startIndex).Trim());
}
}
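One thing to watch with Substring is that it throws if a line is shorter than expected (a blank last line, for example). Here is a small sketch of a guarded version. The helper name FixedWidth.Field is made up, and it assumes the positions in the question are 1-based and inclusive, which gives 31 characters for 296 to 326 rather than the 30 used above; adjust if your layout differs.

```csharp
using System;

public static class FixedWidth
{
    // Hypothetical helper: returns the trimmed text between the 1-based,
    // inclusive character positions firstPos and lastPos.
    // Returns an empty string when the line ends before firstPos.
    public static string Field(string line, int firstPos, int lastPos)
    {
        int start = firstPos - 1;              // convert to a 0-based index
        if (start >= line.Length)
        {
            return string.Empty;               // line too short to contain the field
        }
        // Wanted length, clamped to what the line actually contains.
        int length = Math.Min(lastPos - start, line.Length - start);
        return line.Substring(start, length).Trim();
    }
}
```

Inside the ReadLine loop this would be used as emailsList.Add(FixedWidth.Field(line, 296, 326));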

Change the end of line of a row with File.WriteAllLines();

I'm trying to write a file with this File class method in C#.
public static void WriteAllLines(string path, IEnumerable<string> contents);
The end of line is CRLF but I need this to be LF.
WriteAllLines uses a StreamWriter to write the lines to a file, using the newline string specified in the NewLine property.
You can use the StreamWriter in your own code and use \n instead of \r\n. This has the benefit that you avoid string concatenations and generating temporary strings :
using (var writer = new StreamWriter(path))
{
writer.NewLine = "\n";
foreach (var line in contents)
{
writer.WriteLine(line);
}
}
Using a StreamWriter directly allows you to use asynchronous methods as well:
public async Task MyMethod()
{
....
using (var writer = new StreamWriter(path))
{
writer.NewLine = "\n";
foreach (var line in contents)
{
await writer.WriteLineAsync(line);
}
}
....
}
This can be a big benefit when writing large files, and in server and web applications where you want to keep blocking to a minimum.
There are so many ways of writing to a file, I'd just go with a different one - only a couple lines:
using (var writer = new StreamWriter(path)) {
foreach (var line in contents) {
writer.Write(line + "\n");
}
}
Instead of using WriteAllLines(), you can join the strings yourself and use WriteAllText():
File.WriteAllText(path, string.Join("\n", contents) + "\n");
var builder = new StringBuilder();
for (int i = 0; i < 99999; i++)
{
builder.Append(i.ToString() + '\n');
}
File.WriteAllText("asd.txt", builder.ToString());
That is obviously with boilerplate code. Keep in mind that using a StringBuilder instead of a string[] is also faster.
I'd go with this, it avoids re-writing to memory and works quickly. This assumes you are only using ASCII and don't need to overwrite the file - otherwise use a different encoding and change the file mode accordingly.
public static void WriteAllLines(string path, IEnumerable<string> contents)
{
using (var s = new FileStream(path, FileMode.Append))
{
foreach (var line in contents)
{
var bytes = Encoding.ASCII.GetBytes($"{line}\n"); // "\n" (LF) is the line ending asked for
s.Write(bytes,0,bytes.Length);
}
s.Flush();
s.Close();
}
}

File cannot be accessed because it is being used by another program

I am trying to remove the space at the end of line and then that line will be written in another file.
But when the program reaches the StreamWriter, it gives me the following error:
Process can't be accessed because it is being used by another process.
The Code is as below.
private void FrmCounter_Load(object sender, EventArgs e)
{
string[] filePaths = Directory.GetFiles(@"D:\abc", "*.txt", SearchOption.AllDirectories);
string activeDir = @"D:\dest";
System.IO.StreamWriter fw;
string result;
foreach (string file in filePaths)
{
result = Path.GetFileName(file);
System.IO.StreamReader f = new StreamReader(file);
string newFileName = result;
// Combine the new file name with the path
string newPath = System.IO.Path.Combine(activeDir, newFileName);
File.Create(newPath);
fw = new StreamWriter(newPath);
int counter = 0;
int spaceAtEnd = 0;
string line;
// Read the file and display it line by line.
while ((line = f.ReadLine()) != null)
{
if (line.EndsWith(" "))
{
spaceAtEnd++;
line = line.Substring(0, line.Length - 1);
}
fw.WriteLine(line);
fw.Flush();
counter++;
}
MessageBox.Show("File Name : " + result);
MessageBox.Show("Total Space at end : " + spaceAtEnd.ToString());
f.Close();
fw.Close();
}
}
File.Create itself returns a stream.
Use that stream to write the file. You are receiving this error because the stream returned by File.Create is still open when you try to open the same file again for writing.
Either close the stream returned by File.Create or, better, use that stream for writing the file:
Stream newFile = File.Create(newPath);
fw = new StreamWriter(newFile);
Even though you solved your initial problem, if you want to write everything into a new file in the original location, you can try to read all of the data into an array and close the original StreamReader. Performance note: If your file is sufficiently large though, this option is not going to be the best for performance.
And you don't need File.Create as the StreamWriter will create a file if it doesn't exist, or overwrite it by default or if you specify the append parameter as false.
result = Path.GetFileName(file);
String[] f = File.ReadAllLines(file); // major change here...
// now f is an array containing all lines
// instead of a stream reader
int spaceAtEnd = 0; // declared before the using block so it is still in scope for the MessageBox calls below
using(var fw = new StreamWriter(result, false))
{
int counter = f.Length; // you aren't using counter anywhere, so I don't know if
// it is needed, but now you can just access the `Length`
// property of the array and get the length without a
// counter
// Read the file and display it line by line.
foreach (var item in f)
{
var line = item;
if (line.EndsWith(" "))
{
spaceAtEnd++;
line = line.Substring(0, line.Length - 1);
}
fw.WriteLine(line);
fw.Flush();
}
}
MessageBox.Show("File Name : " + result);
MessageBox.Show("Total Space at end : " + spaceAtEnd.ToString());
Also, you will not remove multiple spaces from the end of the line using this method. If you need to do that, consider replacing line = line.Substring(0, line.Length - 1); with line = line.TrimEnd(' ');
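To illustrate the TrimEnd suggestion, here is a minimal sketch that strips all trailing spaces from each line and counts how many lines were changed. The class and method names (TrailingSpaces, Strip) are invented for this example and are not part of the original code.

```csharp
public static class TrailingSpaces
{
    // Hypothetical helper: returns a copy of the lines with every trailing
    // space removed, and reports how many lines actually ended in a space.
    public static string[] Strip(string[] lines, out int linesChanged)
    {
        linesChanged = 0;
        var result = new string[lines.Length];
        for (int i = 0; i < lines.Length; i++)
        {
            // TrimEnd(' ') removes one or more trailing spaces (not tabs).
            result[i] = lines[i].TrimEnd(' ');
            if (result[i].Length != lines[i].Length)
            {
                linesChanged++;
            }
        }
        return result;
    }
}
```

The result could then be written in one call, for instance with File.WriteAllLines(newPath, TrailingSpaces.Strip(File.ReadAllLines(file), out spaceAtEnd));, which also avoids holding a reader and a writer open at the same time. (The inline out argument assumes spaceAtEnd has already been declared.)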
You have to close any files you are reading before you attempt to write to them in your case.
Wrap the stream in a using statement, like this:
using (System.IO.StreamReader f = new StreamReader(file))
{
//your code goes here
}
EDIT:
Zafar is correct, however, maybe this will clear things up.
File.Create returns a stream, and that stream has opened your destination file. This will make things clearer:
File.Create(newPath).Close();
Using the above line makes it work; however, I would suggest rewriting the code properly. This is just for illustrative purposes.

Parsing individual lines in a robots.txt file with C#

I'm working on an application to parse robots.txt. I wrote myself a method that pulls the file from a web server and throws the output into a textbox. I would like the output to display a single line of text for every line that's in the file, just as it would appear if you were looking at the robots.txt normally; however, the output in my textbox is all of the lines of text without carriage returns or line breaks. So I thought I'd be crafty, make a string[] for all the lines, make a foreach loop, and all would be well. Alas, that did not work, so then I thought I would try System.Environment.NewLine, and it is still not working. Here's the code as it stands now. How can I change this so I get all the individual lines of robots.txt as opposed to a bunch of text cobbled together?
public void getRobots()
{
WebClient wClient = new WebClient();
string url = String.Format("http://{0}/robots.txt", urlBox.Text);
try
{
Stream data = wClient.OpenRead(url);
StreamReader read = new StreamReader(data);
string[] lines = new string[] { read.ReadToEnd() };
foreach (string line in lines)
{
textBox1.AppendText(line + System.Environment.NewLine);
}
}
catch (WebException ex)
{
MessageBox.Show(ex.Message, null, MessageBoxButtons.OK);
}
}
You are reading the entire file into the first element of the lines array:
string[] lines = new string[] {read.ReadToEnd()};
So all your loop is doing is adding the whole contents of the file into the TextBox, followed by a newline character. Replace that line with these:
string content = read.ReadToEnd();
string[] lines = content.Split(new string[] { "\r\n", "\n" }, StringSplitOptions.None);
And see if that works.
Edit: an alternative and perhaps more efficient way, as per Fish's comment below about reading line by line—replace the code within the try block with this:
Stream data = wClient.OpenRead(url);
StreamReader read = new StreamReader(data);
while (read.Peek() >= 0)
{
textBox1.AppendText(read.ReadLine() + System.Environment.NewLine);
}
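A variation on the loop above is to test the result of ReadLine() for null instead of calling Peek(). As far as I know, the documentation for StreamReader.Peek notes that it can return -1 on streams that do not support seeking, which may apply to a web response stream, so this form might be more robust. The sketch below takes any TextReader so it can be tried without a network connection; the names LineReader and ReadLines are made up.

```csharp
using System.Collections.Generic;
using System.IO;

public static class LineReader
{
    // Hypothetical helper: reads lines until ReadLine() returns null,
    // which is how a TextReader signals the end of its input.
    // ReadLine() accepts "\r\n", "\n" and "\r" as line endings.
    public static List<string> ReadLines(TextReader reader)
    {
        var lines = new List<string>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            lines.Add(line);
        }
        return lines;
    }
}
```

With the code from the question this would be something like textBox1.Lines = LineReader.ReadLines(read).ToArray();, assuming textBox1 has Multiline set to true.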
You need to make the textBox1 multiline. Then I think you can simply go
textBox1.Lines = lines;
but let me check that
Try
public void getRobots()
{
WebClient wClient = new WebClient();
string robotText;
string[] robotLines;
System.Text.StringBuilder robotStringBuilder;
robotText = wClient.DownloadString(String.Format("http://{0}/robots.txt", urlBox.Text));
robotLines = robotText.Split(new string[] { "\r\n", "\n" }, StringSplitOptions.None);
robotStringBuilder = new StringBuilder();
foreach (string line in robotLines)
{
robotStringBuilder.Append(line);
robotStringBuilder.Append(Environment.NewLine);
}
textBox1.Text = robotStringBuilder.ToString();
}
Try using .ReadLine() in a while loop instead of .ReadToEnd(). I think you're just getting the entire file as one element in your lines array. Debug and check the length of lines to verify this.
Edit: Here's a bit of sample code. I haven't tested it, but I think it should work OK:
Stream data = wClient.OpenRead(url);
StreamReader read = new StreamReader(data);
List<string> lines = new List<string>();
string nextLine = read.ReadLine();
while (nextLine != null)
{
lines.Add(nextLine);
nextLine = read.ReadLine();
}
textBox1.Lines = lines.ToArray();

How to format data in a text file

I have a StringBuilder that contains email ids (thousands of them):
StringBuilder sb = new StringBuilder();
foreach (DataRow dr2 in dtResult.Rows)
{
strtxt = dr2[strMailID].ToString()+";";
sb.Append(strtxt);
}
string filepathEmail = Server.MapPath("Email");
using (StreamWriter outfile = new StreamWriter(filepathEmail + "\\" + "Email.txt"))
{
outfile.Write(sb.ToString());
}
now data is getting stored in text file like this:
abc#gmail.com;ab#gmail.com;abc#gmail.com;ab#gmail.com;abc#gmail.com;ab#gmail.com;abc#gmail.com;ab#gmail.com;abc#gmail.com;ab#gmail.com;abc#gmail.com;ab#gmail.com;abc#gmail.com;ab#gmail.com;abc#gmail.com;ab#gmail.com;abc#gmail.com;ab#gmail.com;abc#gmail.com;ab#gmail.com;abc#gmail.com;ab#gmail.com;abc#gmail.com;ab#gmail.com;abc#gmail.com;ab#gmail.com;
abc#gmail.com;ab#gmail.com;abc#gmail.com;ab#gmail.com;abc#gmail.com;ab#gmail.com;abc#gmail.com;ab#gmail.com;
But I need to store them so that every row contains only 10 email ids, so that it looks good.
Any idea how to format the data like this in a .txt file? Any help would be great.
Just add a counter in your loop and append a line break after every 10 items.
int counter = 0;
StringBuilder sb = new StringBuilder();
foreach (DataRow dr2 in dtResult.Rows)
{
counter++;
strtxt = dr2[strMailID].ToString()+";";
sb.Append(strtxt);
if (counter % 10 == 0)
{
sb.Append(Environment.NewLine);
}
}
Use a counter and add a line break each tenth item:
StringBuilder sb = new StringBuilder();
int cnt = 0;
foreach (DataRow dr2 in dtResult.Rows) {
sb.Append(dr2[strMailID]).Append(';');
if (++cnt == 10) {
cnt = 0;
sb.AppendLine();
}
}
string filepathEmail = Path.Combine(Server.MapPath("Email"), "Email.txt");
File.WriteAllText(filepathEmail, sb.ToString());
Notes:
Concatenate strings using the StringBuilder instead of first concatenating and then appending.
Use Path.Combine to combine the path and file name, this works on any platform.
You can use the File.WriteAllText method to save the string in a single call instead of writing to a StreamWriter.
As already suggested, you can add a line break. I would also suggest adding a '\t' (tab) after each address, so that the file is tab-delimited and can be imported into Excel, for instance.
Use a counter to keep track of number of mail already written, like this:
int i = 0;
foreach (string mail in mails) {
var strtxt = mail + ";";
sb.Append(strtxt);
i++;
if (i % 10==0)
sb.AppendLine();
}
Every 10 mails written, i modulo 10 equals 0, so you put an end line in the string builder.
Hope this can help.
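For what it's worth, the counter idea can be pulled out into a small method that can be checked on its own. This is just a sketch: EmailFormatter and Format are made-up names, and it appends a plain '\n' rather than Environment.NewLine so the output is the same on every platform.

```csharp
using System.Collections.Generic;
using System.Text;

public static class EmailFormatter
{
    // Hypothetical helper: writes each address followed by ';' and starts
    // a new line after every perLine addresses.
    public static string Format(IEnumerable<string> emails, int perLine)
    {
        var sb = new StringBuilder();
        int count = 0;
        foreach (string email in emails)
        {
            sb.Append(email).Append(';');
            count++;
            if (count % perLine == 0)
            {
                sb.Append('\n'); // the row is full, so start the next one
            }
        }
        return sb.ToString();
    }
}
```

With the DataTable from the question, the input could be built with something like dtResult.Rows.Cast<DataRow>().Select(r => r[strMailID].ToString()) and perLine set to 10.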
Here's an alternate method using LINQ if you don't mind any overheads.
string filepathEmail = Server.MapPath("Email");
using (StreamWriter outfile = new StreamWriter(filepathEmail + "\\" + "Email.txt"))
{
var rows = dtResult.Rows.Cast<DataRow>(); //make the rows enumerable
var lines = from ivp in rows.Select((dr2, i) => new {i, dr2})
group ivp.dr2[strMailID] by ivp.i / 10 into line //group every 10 emails
select String.Join(";", line); //put them into a string
foreach (string line in lines)
outfile.WriteLine(line);
}
