C# character counter when writing to new line

Basically I'm trying to read a really big text file and, when the characters of a line reach X amount, write to a new line, but I can't seem to get the character count to work. Any help is appreciated!
using (FileStream fs = new FileStream(betaFilePath,FileMode.Open))
using (StreamReader rdr = new StreamReader(fs))
{
while (!rdr.EndOfStream)
{
string betaFileLine = rdr.ReadLine();
int stringline = 0;
if (betaFileLine.Contains("þTEMP"))
{
//sb.AppendLine(@"C:\chawkster\workfiles\New Folder\GEL_ALL_PRODUCTS_CONCORD2.DAT");
string checkline = betaFileLine.Length.ToString();
foreach (string cl in checkline)
{
stringline++;
File.AppendAllText(@"C:\chawkster\workfiles\New Folder\GEL_ALL_PRODUCTS_CONCORD3.DAT", cl);
if(stringline == 1200)
{
File.AppendAllText(@"C:\chawkster\workfiles\New Folder\GEL_ALL_PRODUCTS_CONCORD3.DAT","\n");
stringline = 0;
}
}
}
}
Error:
foreach (string cl in checkline)
Error 1 Cannot convert type 'char' to 'string'

I don't understand why you have string checkline = betaFileLine.Length.ToString(); that just takes the current line and gives you its length as a number in string form. Don't you want all the characters in the current line? I'm not sure why you want the numeric length there.
Not really sure what you are doing exactly, but try:
// Get the current line as an array of characters
char[] checkline = betaFileLine.ToCharArray();
// Iterate over each character and add it to your file?
foreach (char cl in checkline)

I would use a Regular Expression to split the input string into chunks of the desired amount of characters. Here's an example:
string input = File.ReadAllText(inputFilePath);
MatchCollection lines = Regex.Matches(input, ".{1,1200}", RegexOptions.Singleline); // matches up to 1200 of any character (including \n), so a final shorter chunk is kept
StringBuilder output = new StringBuilder();
foreach (Match line in lines)
{
output.AppendLine(line.Value);
}
File.AppendAllText(outputFilePath, output.ToString());

System.String implements IEnumerable<char>, so the loop variable has to be a char:
foreach (char cl in checkline)
{
...
File.AppendAllText(fileName, cl.ToString());
}
I'd also suggest you put it all into an in-memory stream or StringBuilder and persist it all to the file in one go, rather than writing each character to the FileStream separately.
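The suggestion above (buffer the output instead of appending one character at a time) can be sketched as follows. This is only a minimal sketch under stated assumptions: the LineWrapper class and its Wrap method are hypothetical names, and it wraps every input line, whereas the question only processes lines containing "þTEMP".

```csharp
using System;
using System.IO;

public static class LineWrapper
{
    // Copies text from reader to writer, starting a new line after every
    // `width` characters. Line breaks already present in the input are kept.
    public static void Wrap(TextReader reader, TextWriter writer, int width)
    {
        if (width <= 0) throw new ArgumentOutOfRangeException(nameof(width));

        string line;
        while ((line = reader.ReadLine()) != null)
        {
            // Emit full chunks of `width` characters.
            int start = 0;
            while (line.Length - start > width)
            {
                writer.WriteLine(line.Substring(start, width));
                start += width;
            }
            // Emit the remainder (this also handles empty lines).
            writer.WriteLine(line.Substring(start));
        }
    }
}
```

Passing a StreamReader over the input file and a StreamWriter over the output file, with a width of 1200, would open each file only once, because the StreamWriter buffers the writes.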

Related

C# Split a string and build a stringarray out of the string [duplicate]

I need to split a string into newlines in .NET and the only way I know of to split strings is with the Split method. However that will not allow me to (easily) split on a newline, so what is the best way to do it?
To split on a string you need to use the overload that takes an array of strings:
string[] lines = theText.Split(
new string[] { Environment.NewLine },
StringSplitOptions.None
);
Edit:
If you want to handle different types of line breaks in a text, you can use the ability to match more than one string. This will correctly split on either type of line break, and preserve empty lines and spacing in the text:
string[] lines = theText.Split(
new string[] { "\r\n", "\r", "\n" },
StringSplitOptions.None
);
What about using a StringReader?
using (System.IO.StringReader reader = new System.IO.StringReader(input)) {
string line = reader.ReadLine();
}
Try to avoid using string.Split for a general solution, because you'll use more memory everywhere you use the function -- the original string, and the split copy, both in memory. Trust me that this can be one hell of a problem when you start to scale -- run a 32-bit batch-processing app processing 100MB documents, and you'll crap out at eight concurrent threads. Not that I've been there before...
Instead, use an iterator like this;
public static IEnumerable<string> SplitToLines(this string input)
{
if (input == null)
{
yield break;
}
using (System.IO.StringReader reader = new System.IO.StringReader(input))
{
string line;
while ((line = reader.ReadLine()) != null)
{
yield return line;
}
}
}
This will allow you to do a more memory efficient loop around your data;
foreach(var line in document.SplitToLines())
{
// one line at a time...
}
Of course, if you want it all in memory, you can do this;
var allTheLines = document.SplitToLines().ToArray();
You should be able to split your string pretty easily, like so:
aString.Split(Environment.NewLine.ToCharArray());
Based on Guffa's answer, in an extension class, use:
public static string[] Lines(this string source) {
return source.Split(new string[] { "\r\n", "\n" }, StringSplitOptions.None);
}
Regex is also an option:
private string[] SplitStringByLineFeed(string inpString)
{
string[] locResult = Regex.Split(inpString, "[\r\n]+");
return locResult;
}
For a string variable s:
s.Split(new string[]{Environment.NewLine},StringSplitOptions.None)
This uses your environment's definition of line endings. On Windows, line endings are CR-LF (carriage return, line feed) or in C#'s escape characters \r\n.
This is a reliable solution, because if you recombine the lines with String.Join, this equals your original string:
var lines = s.Split(new string[]{Environment.NewLine},StringSplitOptions.None);
var reconstituted = String.Join(Environment.NewLine,lines);
Debug.Assert(s==reconstituted);
What not to do:
Use StringSplitOptions.RemoveEmptyEntries, because this will break markup such as Markdown where empty lines have syntactic purpose.
Split on the separator Environment.NewLine.ToCharArray(), because on Windows this will create one empty string element for each new line.
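To illustrate the two points above, here is a small sketch that compares splitting on newline strings with splitting on the individual characters. The NewlineSplitDemo class and its method names are made up for this example, and the line endings are hard-coded so the behaviour does not depend on the platform.

```csharp
using System;

public static class NewlineSplitDemo
{
    // Treats "\r\n" as ONE separator because it is listed before "\r" and "\n".
    public static string[] SplitOnStrings(string text)
    {
        return text.Split(new[] { "\r\n", "\r", "\n" }, StringSplitOptions.None);
    }

    // Treats '\r' and '\n' as two independent separators, so every "\r\n"
    // pair produces an extra empty element.
    public static string[] SplitOnChars(string text)
    {
        return text.Split(new[] { '\r', '\n' });
    }
}
```

For the input "a\r\n\r\nb" (one genuinely empty line in the middle), the string version yields three elements and the char version yields five, which is why removing empty entries afterwards cannot tell real empty lines from artifacts.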
I just thought I would add my two-bits, because the other solutions on this question do not fall into the reusable code classification and are not convenient.
The following block of code extends the string object so that it is available as a natural method when working with strings.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Collections;
using System.Collections.ObjectModel;
namespace System
{
public static class StringExtensions
{
public static string[] Split(this string s, string delimiter, StringSplitOptions options = StringSplitOptions.None)
{
return s.Split(new string[] { delimiter }, options);
}
}
}
You can now use the .Split() function from any string as follows:
string[] result;
// Call it on a string literal, passing the delimiter
result = "My simple string".Split(" ");
// Split an existing string by delimiter only
string foo = "my - string - i - want - split";
result = foo.Split("-");
// You can even pass the split options parameter. When omitted it is
// set to StringSplitOptions.None
result = foo.Split("-", StringSplitOptions.RemoveEmptyEntries);
To split on a newline character, simply pass "\n" or "\r\n" as the delimiter parameter.
Comment: It would be nice if Microsoft implemented this overload.
Starting with .NET 6 we can use the new String.ReplaceLineEndings() method to canonicalize cross-platform line endings, so these days I find this to be the simplest way:
var lines = input
.ReplaceLineEndings()
.Split(Environment.NewLine, StringSplitOptions.None);
I'm currently using this function (based on other answers) in VB.NET:
Private Shared Function SplitLines(text As String) As String()
Return text.Split({Environment.NewLine, vbCrLf, vbLf}, StringSplitOptions.None)
End Function
It tries to split on the platform-local newline first, and then falls back to each possible newline.
I've only needed this inside one class so far. If that changes, I will probably make this Public and move it to a utility class, and maybe even make it an extension method.
Here's how to join the lines back up, for good measure:
Private Shared Function JoinLines(lines As IEnumerable(Of String)) As String
Return String.Join(Environment.NewLine, lines)
End Function
Well, actually split should do:
//Constructing string...
StringBuilder sb = new StringBuilder();
sb.AppendLine("first line");
sb.AppendLine("second line");
sb.AppendLine("third line");
string s = sb.ToString();
Console.WriteLine(s);
//Splitting multiline string into separate lines
string[] splitted = s.Split(new string[] {System.Environment.NewLine}, StringSplitOptions.RemoveEmptyEntries);
// Output (separate lines)
for( int i = 0; i < splitted.Count(); i++ )
{
Console.WriteLine("{0}: {1}", i, splitted[i]);
}
string[] lines = text.Split(
Environment.NewLine.ToCharArray(),
StringSplitOptions.RemoveEmptyEntries);
The RemoveEmptyEntries option will make sure you don't have empty entries due to \n following a \r.
(Edit to reflect comments:) Note that it will also discard genuine empty lines in the text. This is usually what I want but it might not be your requirement.
I did not know about Environment.NewLine, but I guess this is a very good solution.
My try would have been:
string str = "Test Me\r\nTest Me\nTest Me";
var splitted = str.Split('\n').Select(s => s.Trim()).ToArray();
The additional .Trim removes any \r or \n that might be still present (e. g. when on windows but splitting a string with os x newline characters). Probably not the fastest method though.
EDIT:
As the comments correctly pointed out, this also removes any whitespace at the start of the line or before the new line feed. If you need to preserve that whitespace, use one of the other options.
The examples here are great and helped me with a current "challenge": splitting RSA keys so they can be presented in a more readable way. Based on Steve Cooper's solution:
string Splitstring(string txt, int n = 120, string AddBefore = "", string AddAfterExtra = "")
{
//Split the text into a list of strings of n characters each
var Lines = Enumerable.Range(0, txt.Length / n).Select(i => txt.Substring(i * n, n)).ToList();
//Check if there are any characters left after split, if so add the rest
if(txt.Length > ((txt.Length / n)*n) )
Lines.Add(txt.Substring((txt.Length/n)*n));
//Create return text, with extras
string txtReturn = "";
foreach (string Line in Lines)
txtReturn += AddBefore + Line + AddAfterExtra + Environment.NewLine;
return txtReturn;
}
Presenting an RSA key 33 characters wide and wrapped in quotes is then simply:
Console.WriteLine(Splitstring(RSAPubKey, 33, "\"", "\""));
Hopefully someone finds it useful.
Silly answer: write to a temporary file so you can use the venerable File.ReadLines:
var s = "Hello\r\nWorld";
var path = Path.GetTempFileName();
using (var writer = new StreamWriter(path))
{
writer.Write(s);
}
var lines = File.ReadLines(path);
using System.IO;
string textToSplit;
if (textToSplit != null)
{
List<string> lines = new List<string>();
using (StringReader reader = new StringReader(textToSplit))
{
for (string line = reader.ReadLine(); line != null; line = reader.ReadLine())
{
lines.Add(line);
}
}
}
Very easy, actually.
VB.NET:
Private Function SplitOnNewLine(input As String) As String()
Return input.Split({Environment.NewLine}, StringSplitOptions.None)
End Function
C#:
string[] SplitOnNewLine(string input)
{
return input.Split(new[] { Environment.NewLine }, StringSplitOptions.None);
}

How would I access a txt file and split the links

Alright, I have a program that grabs links off a website and puts them into a txt file, BUT the links aren't separated onto their own lines, and I need to do that without doing it manually. Here is the code used to grab the links off the website, write them to a text file, then read the text file back.
private void linkLabel1_LinkClicked(object sender, LinkLabelLinkClickedEventArgs e)
{
var client = new WebClient();
string text = client.DownloadString("https://currentlinks.com");
File.WriteAllText("C:/ProgramData/oof.txt", text);
string searchKeyword = "https://foobar.to/showthread.php";
string fileName = "C:/ProgramData/oof.txt";
string[] textLines = File.ReadAllLines(fileName);
List<string> results = new List<string>();
foreach (string line in textLines)
{
if (line.Contains(searchKeyword))
{
results.Add(line);
}
var sb = new StringBuilder();
foreach (var item in results)
{
sb.Append(item);
}
textBox1.Text = sb.ToString();
var parsed = textBox1;
TextWriter tw = new StreamWriter("C:/ProgramData/parsed.txt");
// write lines of text to the file
tw.WriteLine(parsed);
// close the stream
tw.Close();
}
}
You are getting all the links (URLs) in one single string. There is no straightforward way to get the URLs individually without making some assumptions.
With the sample data you shared, I assume that the URLs in the string follow a simple format without anything fancy: each one starts with http, and no URL contains another http inside it.
With the above assumptions, I suggest the following code.
// Sample data as shared by the OP
string data = "https://forum.to/showthread.php?tid=22305https://forum.to/showthread.php?tid=22405https://forum.to/showthread.php?tid=22318";
//Splitting the string by string `http`
var items = data.Split(new [] {"http"},StringSplitOptions.RemoveEmptyEntries).ToList();
//At this point all the strings in items collection will be without "http" at the start.
//So they will look like as following.
// s://forum.to/showthread.php?tid=22305
// s://forum.to/showthread.php?tid=22405
// s://forum.to/showthread.php?tid=22318
//So we need to add "http" at the start of each of the item as following.
items = items.Select(i => "http" + i).ToList();
// After this they will become like following.
// https://forum.to/showthread.php?tid=22305
// https://forum.to/showthread.php?tid=22405
// https://forum.to/showthread.php?tid=22318
//Now we need to create a single string with newline character between two items so
//that they represent a single line individually.
var text = String.Join("\r\n", items);
// Then write the text to the file.
File.WriteAllText("C:/ProgramData/oof.txt", text);
This should help you resolve your issue.
.Split way
Could you use yourString.Split("https://");?
Example:
//This simple example assumes that all links are https (not http)
string contents = "https://www.example.com/dogs/poodles/poodle1.htmlhttps://www.example.com/dogs/poodles/poodle2.html";
const string Prefix = "https://";
var linksWithoutPrefix = contents.Split(Prefix, StringSplitOptions.RemoveEmptyEntries);
//using System.Linq
var linksWithPrefix = linksWithoutPrefix.Select(l => Prefix + l);
foreach (var match in linksWithPrefix)
{
Console.WriteLine(match);
}
Regex way
Another option is to use a regular expression.
This attempt fails: I could not find or write the right regex ... got to go now
string contents = "http://www.example.com/dogs/poodles/poodle1.htmlhttp://www.example.com/dogs/poodles/poodle2.html";
//From https://regexr.com/
var rgx = new Regex(@"(?<Protocol>\w+):\/\/(?<Domain>[\w@][\w.:@]+)\/?[\w\.?=%&=\-@/$,]*");
var matches = rgx.Matches(contents);
foreach(var match in matches )
{
Console.WriteLine(match);
}
//This finds 'http://www.example.com/dogs/poodles/poodle1.htmlhttp' (note the htmlhttp at the end)
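One way the regex route might be made to work is to split at a zero-width lookahead instead of trying to match a whole URL. The sketch below is an illustration rather than a general URL parser: UrlSplitter and SplitUrls are hypothetical names, and it relies on the same assumption as the other answer, namely that no URL contains "http://" or "https://" inside it.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class UrlSplitter
{
    // Splits a run of concatenated URLs at every position where "http://"
    // or "https://" begins. The lookahead consumes nothing, so each piece
    // keeps its scheme.
    public static string[] SplitUrls(string contents)
    {
        return Regex.Split(contents, @"(?=https?://)")
                    // A match at index 0 yields a leading empty string; drop it.
                    .Where(part => part.Length > 0)
                    .ToArray();
    }
}
```

The resulting array can then be written one link per line with File.WriteAllLines.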

Replacing string in txt using foreach

I'm trying to replace a string in a text file with everything there is in the other file, for my HTML email variable.
But whenever I try to run the foreach, it gives me the error that it cannot convert char to string. How would one go about doing this in a different way?
StreamReader myreader = new StreamReader("VUCresult.txt");
StreamReader myreaderhtml = new StreamReader("htmlemail.html");
string lines = myreader.ReadToEnd();
string htmlmailbody = myreaderhtml.ReadToEnd();
if (lines == "Der er ikke nogen udmeldinger idag")
{
htmlmailbody.Replace("ingen", lines);
}
else
{
foreach (string s in lines)
{
htmlmailbody = htmlmailbody.Replace("Row2", s);
}
htmlmailbody = htmlmailbody.Replace("Row1", lines);
htmlmailbody = htmlmailbody.Replace("Row3", DateTime.Now.ToString());
}
You are using foreach over a very long string (that happens to include newlines); that will return you each individual character.
To get all the lines in a file (as a collection) just use File.ReadAllLines:
string htmlmailbody;
using (StreamReader myreaderhtml = new StreamReader("htmlemail.html"))
{
htmlmailbody = myreaderhtml.ReadToEnd();
}
string[] lines = File.ReadAllLines("VUCresult.txt");
foreach (string s in lines)
{
...
}
Your original if check won't make sense here since you have a collection of lines instead of the entire file, and your second replace statements in the else won't make sense either. You need to decide which thing you are really trying to look at.
You don't need the foreach here as lines is a string already and calling foreach on it would make variable 's' a char. What's more, statement s.ToString() does nothing as it returns a string, leaving s itself unchanged.
Furthermore, consider enclosing readers in 'using' statement.
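A minimal sketch of one way the replacement could look, assuming the goal is to put all lines of the result file into the "Row2" placeholder. The MailTemplate class, the Fill method and the "<br>" separator are illustrative assumptions, not something taken from the question.

```csharp
using System;

public static class MailTemplate
{
    // Joins the lines into one HTML fragment and substitutes it for the
    // "Row2" placeholder. string.Replace returns a NEW string, so the
    // result has to be returned or assigned; the original is not modified.
    public static string Fill(string template, string[] lines)
    {
        string rows = string.Join("<br>", lines);
        return template.Replace("Row2", rows);
    }
}
```

The inputs could come from File.ReadAllText("htmlemail.html") and File.ReadAllLines("VUCresult.txt"), which also avoids leaving readers open.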

Verifying and parsing csv to 2D array in C# Visual Studios

Just trying out C# to make a button that loads csv files verify them and parse them:
protected void Upload_Btn_Click(object sender, EventArgs e)
{
string test = PNLdataLoader.FileName;
//checks if file is csv
Regex regex = new Regex("*.csv");
Match match = regex.Match(test);
if (match.Success)
{
string CSVFileAsString = System.Text.Encoding.ASCII.GetString(PNLdataLoader.FileBytes);
System.IO.MemoryStream MS = new System.IO.MemoryStream(PNLdataLoader.FileBytes);
System.IO.StreamReader SR = new System.IO.StreamReader(MS);
//Store each line in CSVlines array of strings
string[] CSVLines = new string[0];
while (!SR.EndOfStream)
{
System.Array.Resize(ref CSVLines, CSVLines.Length + 1);
CSVLines[CSVLines.Length - 1] = SR.ReadLine();
}
}
So far I got it to store the lines in CSVLines but I am not sure what is wrong with the regex. Is there a more efficient way to do this?
That isn't a valid expression: * means "match the preceding character zero or more times", and since there is no character before it, the pattern fails to parse.
This will probably match most file names; it does not allow special characters in the name.
Regex regex = new Regex(@"[a-zA-Z0-9]+\.csv$");
You could also do this instead:
if(test.EndsWith(".csv"))
Lastly, I would change your array to a List<T> or something like that, as further explained here: What is more efficient: List<T>.Add() or System.Array.Resize()?
//Store each line in CSVlines array of strings
List<string> CSVLines = new List<string>();
while (!SR.EndOfStream)
{
CSVLines.Add(SR.ReadLine());
}
EDIT:
List<T> is in System.Collections.Generic
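Putting the pieces together, a rough sketch of the extension check plus parsing into a two-dimensional (jagged) array might look like this. CsvLoader, HasCsvExtension and ParseRows are hypothetical names, and the comma split is deliberately naive: quoted fields that contain commas are not handled.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class CsvLoader
{
    // True when the file name ends in ".csv", ignoring case.
    public static bool HasCsvExtension(string fileName)
    {
        return string.Equals(Path.GetExtension(fileName), ".csv",
                             StringComparison.OrdinalIgnoreCase);
    }

    // Reads every line and splits it on commas, giving rows[r][c].
    public static string[][] ParseRows(TextReader reader)
    {
        var rows = new List<string[]>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            rows.Add(line.Split(','));
        }
        return rows.ToArray();
    }
}
```

In the button handler, the reader could be the StreamReader already built over PNLdataLoader.FileBytes.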

Search and replace values in text file with C#

I have a text file with a certain format. First comes an identifier followed by three spaces and a colon. Then comes the value for this identifier.
ID1 :Value1
ID2 :Value2
ID3 :Value3
What I need to do is search, e.g., for ID2 : and replace Value2 with a new value NewValue2. What would be a way to do this? The files I need to parse won't get very large. The largest will be around 150 lines.
If the file isn't that big you can do a File.ReadAllLines to get a collection of all the lines and then replace the line you're looking for like this
using System.IO;
using System.Linq;
using System.Collections.Generic;
List<string> lines = new List<string>(File.ReadAllLines("file"));
int lineIndex = lines.FindIndex(line => line.StartsWith("ID2 :"));
if (lineIndex != -1)
{
lines[lineIndex] = "ID2 :NewValue2";
File.WriteAllLines("file", lines);
}
Here's a simple solution which also creates a backup of the source file automatically.
The replacements are stored in a Dictionary object. They are keyed on the line's ID, e.g. 'ID2' and the value is the string replacement required. Just use Add() to add more as required.
StreamWriter writer = null;
Dictionary<string, string> replacements = new Dictionary<string, string>();
replacements.Add("ID2", "NewValue2");
// ... further replacement entries ...
using (writer = File.CreateText("output.txt"))
{
foreach (string line in File.ReadLines("input.txt"))
{
bool replacementMade = false;
foreach (var replacement in replacements)
{
if (line.StartsWith(replacement.Key))
{
writer.WriteLine(string.Format("{0} :{1}",
replacement.Key, replacement.Value));
replacementMade = true;
break;
}
}
if (!replacementMade)
{
writer.WriteLine(line);
}
}
}
File.Replace("output.txt", "input.txt", "input.bak");
You'll just have to replace input.txt, output.txt and input.bak with the paths to your source, destination and backup files.
Ordinarily, for any text searching and replacement, I'd suggest some sort of regular expression work, but if this is all you're doing, that's really overkill.
I would just open the original file and a temporary file; read the original a line at a time, and just check each line for "ID2 :"; if you find it, write your replacement string to the temporary file, otherwise, just write what you read. When you've run out of source, close both, delete the original, and rename the temporary file to that of the original.
Something like this should work. It's very simple, not the most efficient thing, but for small files, it would be just fine:
private void setValue(string filePath, string key, string value)
{
string[] lines= File.ReadAllLines(filePath);
for(int x = 0; x < lines.Length; x++)
{
string[] fields = lines[x].Split(':');
if (fields[0].TrimEnd() == key)
{
lines[x] = fields[0] + ':' + value;
File.WriteAllLines(filePath, lines);
break;
}
}
}
You can use regex and do it in 3 lines of code
string text = File.ReadAllText("sourcefile.txt");
text = Regex.Replace(text, @"(?i)(?<=^id2\s*?:\s*?)\w*?(?=\s*?$)", "NewValue2",
RegexOptions.Multiline);
File.WriteAllText("outputfile.txt", text);
In the regex, (?i)(?<=^id2\s*?:\s*?)\w*?(?=\s*?$) means, find anything that starts with id2 with any number of spaces before and after :, and replace the following string (any alpha numeric character, excluding punctuations) all the way 'till end of the line. If you want to include punctuations, then replace \w*? with .*?
You can use regexes to achieve this.
Regex re = new Regex(@"^ID\d+ :Value(\d+)\s*$", RegexOptions.IgnoreCase | RegexOptions.Compiled);
string[] lines = File.ReadAllLines("mytextfile");
foreach (string line in lines) {
string replaced = re.Replace(line, processMatch);
//Now do what you are going to do with the value
}
string processMatch(Match m)
{
var number = m.Groups[1].Value;
return String.Format("ID{0} :NewValue{0}", number);
}
