How can I specify region of reading from file? - c#

I have the following txt file
//test.txt
information needed[12334,56565]important numbers
I want to read from [ until ]
string print = File.ReadAllText(@"C:/Users/kokos/Desktop/test.txt");
Console.WriteLine(print);
The above reads the whole file, but I want to print only
[12334,56565]

string pattern = @"\[(.*?)\]";
string print = File.ReadAllText(@"C:/Users/kokos/Desktop/test.txt");
var result = Regex.Matches(print, pattern);
foreach (Match r in result)
{
// r.Value is the whole match including the brackets, e.g. [12334,56565];
// use r.Groups[1].Value instead to get only the text between them.
Console.WriteLine(r.Value);
}
As mentioned by Matthew, here is a solution using regex. At the top of your .cs file, add the line: using System.Text.RegularExpressions;
Note
This answer assumes the OP desires to load in the entire file to memory.

You can do this with LINQ.
var text = File.ReadAllText(@"C:/Users/kokos/Desktop/test.txt");
var print = new string(text.SkipWhile(c => c != '[')
.TakeWhile(c => c != ']')
.ToArray())+"]";
// print = "[12334,56565]"
... if you don't want the leading [ then do this...
var print = new string(text.SkipWhile(c => c != '[').Skip(1)
.TakeWhile(c => c != ']')
.ToArray());
// print = "12334,56565"
Here are a few more options if you just want to mess around with the string. (these are more error prone.)
var print = text.Substring(text.IndexOf('['), text.IndexOf(']') - text.IndexOf('[') + 1);
... or ...
var print = "[" + text.Split('[')[1].Split(']')[0] + "]";
... regex would probably look nicer.
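The Substring and Split one-liners above throw when a bracket is missing. A minimal sketch of a more defensive variant follows; the class and method names (BracketExtractor, ExtractBracketed) are hypothetical and chosen only for illustration.

```csharp
using System;

public static class BracketExtractor
{
    // Returns the first "[...]" region including both brackets,
    // or null when the text has no complete pair.
    public static string ExtractBracketed(string text)
    {
        if (text == null) return null;

        int start = text.IndexOf('[');
        if (start < 0) return null;

        // Search for the closing bracket only after the opening one.
        int end = text.IndexOf(']', start + 1);
        if (end < 0) return null;

        return text.Substring(start, end - start + 1);
    }
}
```

Calling BracketExtractor.ExtractBracketed(text) on the sample line from the question should give [12334,56565], and null for input without a bracket pair.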

var data = Encoding.UTF8.GetBytes("here is a simulated file [here's the data I'm after]");
var read = new StringBuilder();
var inScope = false;
using(var ms = new MemoryStream(data)) {
using(var sr = new StreamReader(ms)) {
while(!sr.EndOfStream) {
var by = sr.Read();
if (((char)by) == '[') {
inScope = true;
continue;
}
else if (((char)by) == ']') {
inScope = false;
break;
}
if (inScope) {
read.Append((char)by);
}
}
}
}
read.ToString().Dump();
The above code is a LINQPad snippet that shows how you can read a stream character by character and pull out the data you're after without loading the whole thing into memory.
Instead of using a memory stream, just use a file stream for the file you want to read.
This is less than optimal with all the casting (just do it once), but it should be enough to demonstrate the basic idea.
The output of this is: "here's the data I'm after"
WARNING: Be sure to use the Encoding object that matches the encoding your file is using!
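As a rough sketch of the file-based variant described above: the same loop, with the cast done once per character and the reader opened on a file instead of a MemoryStream. The names BracketReader and ReadFirstBracketed are hypothetical, and UTF-8 in the usage line is an assumption about the file. Unlike the snippet above, this version ignores a ] that appears before any [.

```csharp
using System.IO;
using System.Text;

public static class BracketReader
{
    // Reads characters until the first "[...]" region has been consumed
    // and returns the text between the brackets ("" if none was found).
    public static string ReadFirstBracketed(TextReader reader)
    {
        var result = new StringBuilder();
        bool inScope = false;
        int next;
        while ((next = reader.Read()) != -1)
        {
            char c = (char)next; // cast once per character
            if (c == '[') { inScope = true; continue; }
            if (c == ']' && inScope) break;
            if (inScope) result.Append(c);
        }
        return result.ToString();
    }

    // Convenience overload: opens the file with the given encoding.
    public static string ReadFirstBracketed(string path, Encoding encoding)
    {
        using (var reader = new StreamReader(path, encoding))
        {
            return ReadFirstBracketed(reader);
        }
    }
}
```

Usage, assuming a UTF-8 file: var data = BracketReader.ReadFirstBracketed(@"C:/Users/kokos/Desktop/test.txt", Encoding.UTF8);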

Related

C# search line and then overwrite it

How can I find a line in C# and overwrite it (.sii file)?
string result = string.Empty;
var lines = File.ReadAllLines(Path);
foreach (var line in lines)
{
if (line.Contains("my_truck_placement: ("))
{
var text = line.Replace("my_truck_placement: ", "");
result = text.Trim();
File.WriteAllText(Path, result);
}
}
Your main problem is that you write to the file too early, before you have finished analyzing its content.
// use implicit types wherever possible
// Good to explicitly initiate with string.Empty :)
var result = string.Empty;
var lines = File.ReadAllLines(Path);
// I prefer a for loop here, as we are going to modify the content of
// the collection being iterated over.
for (var i = 0; i < lines.Length; i++)
{
var line = lines[i];
if (line.Contains("my_truck_placement: ("))
{
lines[i] = line.Replace("my_truck_placement: ", "");
}
}
// Here, after all manipulations, you are able to write to file.
File.WriteAllLines(Path, lines);
You could simplify even further, for example loop body:
lines[i] = lines[i].Replace("my_truck_placement: (", "(");
If you are sure the phrase will only happen at the beginning of the line.
You could even limit yourself to such code
File.WriteAllLines(
Path,
File.ReadAllLines(Path)
.Select(x => x.Replace("my_truck_placement: (", "("))
.ToArray());

How can I read a Lync conversation file containing HTML?

I'm having trouble reading a local file into a string in C#.
Here's what I have come up with so far:
string file = @"C:\script_test\{5461EC8C-89E6-40D1-8525-774340083829}.html";
using (StreamReader reader = new StreamReader(file))
{
string line = "";
while ((line = reader.ReadLine()) != null)
{
textBox1.Text += line.ToString();
}
}
And it's the only solution that seems to work.
I've tried some other suggested methods for reading a file, such as:
string file = @"C:\script_test\{5461EC8C-89E6-40D1-8525-774340083829}.html";
string html = File.ReadAllText(file).ToString();
textBox1.Text += html;
Yet it does not work as expected.
Here are the first few lines of the file I'm trying to read:
As you can see, it has some funky characters; honestly, I don't know if that's the cause of this weird behavior.
But in the first case, the code seems to skip those lines, printing only "Document generated by Office Communicator..."
Your task would be easier if you could use an API or the SDK, or at least had a description of the format you are trying to read. However, the binary format does not look that complicated, and with a hex viewer installed I got far enough to extract the HTML from the example you provided.
To parse non-text files, fall back on BinaryReader and use one of its Read methods to read the correct type from the byte stream. I used ReadByte and ReadInt32. Note that the description of each method explains how many bytes it reads, which comes in handy when you try to decipher your file.
private string ParseHist(string file)
{
using (var f = File.Open(file, FileMode.Open))
{
using (var br = new BinaryReader(f))
{
// read 4 bytes as an int
var first = br.ReadInt32();
// read integer / zero ended byte arrays as string
var lead = br.ReadInt32();
// until we have 4 zero bytes
while (lead != 0)
{
var user = ParseString(br);
Trace.Write(lead);
Trace.Write(":");
Trace.Write(user.Length);
Trace.Write(":");
Trace.WriteLine(user);
lead = br.ReadInt32();
// weird special case
if (lead == 2)
{
lead = br.ReadInt32();
}
}
// at the start of the html block
var htmllen = br.ReadInt32();
Trace.WriteLine(htmllen);
// parse the html
var html = ParseString(br);
Trace.Write(htmllen);
Trace.Write(":");
Trace.Write(html.Length);
Trace.Write(":");
Trace.WriteLine(html);
// other structures follow, left unparsed
return html.ToString();
}
}
}
// a string seems to be ascii encoded and ends with a zero byte.
private static string ParseString(BinaryReader br)
{
var ch = br.ReadByte();
var sb = new StringBuilder();
while (ch != 0)
{
sb.Append((char)ch);
ch = br.ReadByte();
}
return sb.ToString();
}
You could use the simple parsing logic in a winform application as follows:
private void button1_Click(object sender, EventArgs e)
{
webBrowser1.DocumentText = ParseHist(@"5461EC8C-89E6-40D1-8525-774340083829-Copia.html");
}
Keep in mind that this is neither bulletproof nor the recommended way, but it should get you started. For files that don't parse well, you'll need to go back to the hex viewer and work out which byte structures are new or different from what you already handled. That is not something I intend to help you with; it is left as an exercise for you to figure out.
I don't know if it's the right way to answer this, but here's what I've managed to do so far:
string file = @"C:\script_test\{1C0365BC-54C6-4D31-A1C1-586C4575F9EA}.hist";
string outText = "";
//Encoding iso = Encoding.GetEncoding("ISO-8859-1");
Encoding utf8 = Encoding.UTF8;
StreamReader reader = new StreamReader(file, utf8);
char[] text = reader.ReadToEnd().ToCharArray();
//skip first n chars
/*
for (int i = 250; i < text.Length; i++)
{
outText += text[i];
}
*/
for (int i = 0; i < text.Length; i++)
{
//skips non printable characters
if (!Char.IsControl(text[i]))
{
outText += text[i];
}
}
string source = "";
source = WebUtility.HtmlDecode(outText);
HtmlAgilityPack.HtmlDocument htmlDoc = new HtmlAgilityPack.HtmlDocument();
htmlDoc.LoadHtml(source);
string html = "<html><style>";
foreach (HtmlNode node in htmlDoc.DocumentNode.SelectNodes("//style"))
{
html += node.InnerHtml+ Environment.NewLine;
}
html += "</style><body>";
foreach (HtmlNode node in htmlDoc.DocumentNode.SelectNodes("//body"))
{
html += node.InnerHtml + Environment.NewLine;
}
html += "</body></html>";
richTextBox1.Text += html+Environment.NewLine;
webBrowser1.DocumentText = html;
The conversation displays correctly, both style and encoding.
So it's a start for me.
Thank you all for the support!
EDIT
Char.IsControl(char)
skips non printable characters :)

Getting Data From IEnumerable

I have text file which contains airport Codes in this format:
"AAA","","Anaa Arpt","PF","","","AAA","2","N","272"
I used a StreamReader to read each line from the file, added each line to a string list, and finally converted that list to an IEnumerable.
Can you please help me get only three values from each line? For example:
AAA is the airport code
Anaa Arpt is the airport name
PF is the country code
I want to get only these three values from each row.
Please find below the code.
using (StreamReader sr = new StreamReader("C:/AirCodes/RAPT.TXT"))
{
String line;
while ((line = sr.ReadLine()) != null)
{
aircodesFromTravelPort.Add(line);
Console.WriteLine(line);
}
}
var codes = (IEnumerable<String>)aircodesFromTravelPort;
foreach (var aircode in codes)
You can try using LINQ, something like this:
var codes = File
.ReadLines(@"C:/AirCodes/RAPT.TXT")
.Select(line => line.Split(','))
.Select(items => new {
// I've created a simple anonymous class;
// you'd probably want to create your own one
Code = items[0].Trim('"'), //TODO: Check numbers
Airport = items[2].Trim('"'),
Country = items[3].Trim('"')
})
.ToList();
...
foreach(var item in codes)
Console.WriteLine(item);
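If you prefer a named type over the anonymous class, a minimal sketch could look like the following. AirportInfo, AirportParser, ParseLine and ParseFile are hypothetical names, and the code assumes that no field contains a comma inside its quotes (a real CSV parser would be needed otherwise).

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

public sealed class AirportInfo
{
    public string Code { get; set; }
    public string Name { get; set; }
    public string Country { get; set; }
}

public static class AirportParser
{
    // Parses one line such as "AAA","","Anaa Arpt","PF",...
    // Returns null when the line has fewer than four fields.
    public static AirportInfo ParseLine(string line)
    {
        var items = line.Split(',');
        if (items.Length < 4) return null;

        return new AirportInfo
        {
            Code = items[0].Trim('"'),
            Name = items[2].Trim('"'),
            Country = items[3].Trim('"')
        };
    }

    // Streams the file line by line and skips lines that do not parse.
    public static List<AirportInfo> ParseFile(string path)
    {
        return File.ReadLines(path)
            .Select(line => ParseLine(line))
            .Where(info => info != null)
            .ToList();
    }
}
```

Usage: var codes = AirportParser.ParseFile(@"C:/AirCodes/RAPT.TXT"); then read item.Code, item.Name and item.Country in the loop.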
You'll probably want to make use of String's Split function on each line to get the values into an array.
while ((line = sr.ReadLine()) != null)
{
var values = line.Split(','); // here you have an array of strings containing the values between commas
var airportCode = values[0];
var airportName = values[2];
var airportCountry = values[3];
var airportInfo = airportCode + "," + airportName + "," + airportCountry;
aircodesFromTravelPort.Add(airportInfo);
// what you actually do with the values is up to you, I just tried to make it as close to the original as possible.
Console.WriteLine(airportInfo);
}
Hope this helps!
I like Regex with named groups:
var line = @"""AAA"","""",""Anaa Arpt"",""PF"","""","""",""AAA"",""2"",""N"",""272""";
var pattern = @"^""(?<airportCode>\w+)"",""(\w*)"",""(?<airportName>[\w\s]+)"",""(?<countryCode>\w+)""";
Match match = Regex.Match(line, pattern, RegexOptions.IgnoreCase);
if (match.Success)
{
string airportCode = match.Groups["airportCode"].Value;
string airportName = match.Groups["airportName"].Value;
string countryCode = match.Groups["countryCode"].Value;
}

Fastest way to find strings in a file

I have a log file that is usually no more than 10 KB (it can grow to 2 MB at most), and I want to find out whether at least one group of these strings occurs in the file. The strings are on different lines, like this:
ACTION:.......
INPUT:...........
RESULT:..........
I need to know whether at least one such group exists in the file. I have to do this about 100 times for a test (the log is different each time, so I have to reload and reread it), so I am looking for the fastest and best way to do this.
I searched the forums for the fastest way, but I don't think my file is big enough to warrant those solutions.
Thanks for looking.
I would read it line by line and check the conditions. Once you have seen a group you can quit. This way you don't need to read the whole file into memory. Like this:
public bool ContainsGroup(string file)
{
using (var reader = new StreamReader(file))
{
var hasAction = false;
var hasInput = false;
var hasResult = false;
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
if (!hasAction)
{
if (line.StartsWith("ACTION:"))
hasAction = true;
}
else if (!hasInput)
{
if (line.StartsWith("INPUT:"))
hasInput = true;
}
else if (!hasResult)
{
if (line.StartsWith("RESULT:"))
hasResult = true;
}
if (hasAction && hasInput && hasResult)
return true;
}
return false;
}
}
This code checks if there is a line starting with ACTION then one with INPUT and then one with RESULT. If the order of those is not important then you can omit the if () else if () checks. In case the line does not start with the strings replace StartsWith with Contains.
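A minimal sketch of the order-independent variant mentioned above, where each flag is set whenever its prefix is seen. The names LogGroupChecker and ContainsGroupAnyOrder are hypothetical; the method takes a TextReader so it can be fed a StreamReader for a file or a StringReader for a quick check.

```csharp
using System;
using System.IO;

public static class LogGroupChecker
{
    // Returns true as soon as ACTION:, INPUT: and RESULT: lines
    // have all been seen, in any order.
    public static bool ContainsGroupAnyOrder(TextReader reader)
    {
        bool hasAction = false, hasInput = false, hasResult = false;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.StartsWith("ACTION:", StringComparison.Ordinal)) hasAction = true;
            else if (line.StartsWith("INPUT:", StringComparison.Ordinal)) hasInput = true;
            else if (line.StartsWith("RESULT:", StringComparison.Ordinal)) hasResult = true;

            // Stop early; the rest of the file does not need to be read.
            if (hasAction && hasInput && hasResult) return true;
        }
        return false;
    }
}
```

Usage: using (var reader = new StreamReader(file)) { bool found = LogGroupChecker.ContainsGroupAnyOrder(reader); }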
Here's one possible way to do it:
string fileContents;
string[] logFiles = Directory.GetFiles(@"C:\Logs");
foreach (string file in logFiles)
{
using (StreamReader sr = new StreamReader(file))
{
// StreamReader has no ReadAllText method; ReadToEnd returns the rest of the stream.
fileContents = sr.ReadToEnd();
if (fileContents.Contains("ACTION:") || fileContents.Contains("INPUT:") || fileContents.Contains("RESULT:"))
{
// Do what you need to here
}
}
}
You may need to do some variation based on your exact implementation needs - for example, what if the word spans two lines, does the line need to start with the word, etc.
Added
Alternate line-by-line check:
string[] logFiles = Directory.GetFiles(@"C:\Logs");
foreach (string file in logFiles)
{
// StreamReader has no ReadAllLines method; File.ReadLines streams the file one line at a time.
foreach (string line in File.ReadLines(file))
{
if (line.Contains("ACTION:") || line.Contains("INPUT:") || line.Contains("RESULT:"))
{
// Do what you need to here
}
}
}
Take a look at How to Read Text From a File. You might also want to take a look at the String.Contains() method.
Basically you will loop through all the files. For each file read line-by-line and see if any of the lines contains 1 of your special "Sections".
You don't have much of a choice with text files when it comes to efficiency. The easiest way would definitely be to loop through each line of data. When you grab a line in a string, split it on the spaces. Then match those words to your words until you find a match. Then do whatever you need.
I don't know how to do it in c# but in vb it would be something like...
Dim yourString as string
Dim words as string()
Do While objReader.Peek() <> -1
yourString = objReader.ReadLine()
words = yourString.split(" ")
For Each word in words()
If Myword = word Then
do stuff
End If
Next
Loop
Hope that helps
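For reference, a rough C# equivalent of the VB outline above: read line by line, split each line on spaces, and compare every word with the one you are looking for. WordSearch and ContainsWord are hypothetical names used only for this sketch.

```csharp
using System.IO;

public static class WordSearch
{
    // Returns true when any space-separated word in the input
    // equals the target word exactly.
    public static bool ContainsWord(TextReader reader, string target)
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            foreach (string word in line.Split(' '))
            {
                if (word == target)
                {
                    return true; // "do stuff" would go here
                }
            }
        }
        return false;
    }
}
```

Usage: using (var reader = new StreamReader(file)) { bool found = WordSearch.ContainsWord(reader, "RESULT:"); }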
This code sample searches for strings in a large text file. The words are contained in a HashSet. It writes the found lines in a temp file.
if (File.Exists(@"temp.txt")) File.Delete(@"temp.txt");
String line;
String oldLine = "";
using (var fs = File.OpenRead(largeFileName))
using (var sr = new StreamReader(fs, Encoding.UTF8, true))
{
HashSet<String> hash = new HashSet<String>();
hash.Add("house");
using (var sw = new StreamWriter(@"temp.txt"))
{
while ((line = sr.ReadLine()) != null)
{
foreach (String str in hash)
{
if (oldLine.Contains(str))
{
sw.WriteLine(oldLine);
// write the next line as well (optional)
sw.WriteLine(line + "\r\n");
}
}
oldLine = line;
}
}
}

Delete specific line from a text file?

I need to delete an exact line from a text file, but I cannot for the life of me work out how to go about doing this.
Any suggestions or examples would be greatly appreciated.
Related Questions
Efficient way to delete a line from a text file (C#)
If the line you want to delete is based on the content of the line:
string line = null;
string line_to_delete = "the line i want to delete";
using (StreamReader reader = new StreamReader("C:\\input")) {
using (StreamWriter writer = new StreamWriter("C:\\output")) {
while ((line = reader.ReadLine()) != null) {
if (String.Compare(line, line_to_delete) == 0)
continue;
writer.WriteLine(line);
}
}
}
Or if it is based on line number:
string line = null;
int line_number = 0;
int line_to_delete = 12;
using (StreamReader reader = new StreamReader("C:\\input")) {
using (StreamWriter writer = new StreamWriter("C:\\output")) {
while ((line = reader.ReadLine()) != null) {
line_number++;
if (line_number == line_to_delete)
continue;
writer.WriteLine(line);
}
}
}
The best way to do this is to open the file in text mode, read each line with ReadLine(), and then write it to a new file with WriteLine(), skipping the one line you want to delete.
There is no generic delete-a-line-from-file function, as far as I know.
One way to do it if the file is not very big is to load all the lines into an array:
string[] lines = File.ReadAllLines("filename.txt");
string[] newLines = RemoveUnnecessaryLine(lines);
File.WriteAllLines("filename.txt", newLines);
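RemoveUnnecessaryLine is not defined in the snippet above. One possible sketch is shown below; the extra lineToDelete parameter and the LineFilter class are assumptions added for illustration, and the comparison is an exact, case-sensitive match.

```csharp
using System;
using System.Linq;

public static class LineFilter
{
    // Returns a new array without the lines that equal lineToDelete.
    public static string[] RemoveUnnecessaryLine(string[] lines, string lineToDelete)
    {
        return lines
            .Where(line => !string.Equals(line, lineToDelete, StringComparison.Ordinal))
            .ToArray();
    }
}
```

Usage: string[] newLines = LineFilter.RemoveUnnecessaryLine(lines, "the line i want to delete");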
Hope this simple and short code will help.
List<string> linesList = File.ReadAllLines("myFile.txt").ToList();
linesList.RemoveAt(0);
File.WriteAllLines("myFile.txt", linesList.ToArray());
OR use this
public void DeleteLinesFromFile(string strLineToDelete)
{
string strFilePath = "Provide the path of the text file";
string strSearchText = strLineToDelete;
string strOldText;
string n = "";
StreamReader sr = File.OpenText(strFilePath);
while ((strOldText = sr.ReadLine()) != null)
{
if (!strOldText.Contains(strSearchText))
{
n += strOldText + Environment.NewLine;
}
}
sr.Close();
File.WriteAllText(strFilePath, n);
}
You can actually use C# generics for this to make it real easy:
var file = new List<string>(System.IO.File.ReadAllLines("C:\\path"));
file.RemoveAt(12);
File.WriteAllLines("C:\\path", file.ToArray());
This can be done in three steps:
// 1. Read the content of the file
string[] readText = File.ReadAllLines(path);
// 2. Empty the file
File.WriteAllText(path, String.Empty);
// 3. Fill up again, but without the deleted line
using (StreamWriter writer = new StreamWriter(path))
{
foreach (string s in readText)
{
if (!s.Equals(lineToBeRemoved))
{
writer.WriteLine(s);
}
}
}
Read and remember each line
Identify the one you want to get rid of
Forget that one
Write the rest back over the top of the file
I cared about the file's original end-of-line characters ("\n" or "\r\n") and wanted to maintain them in the output file (not overwrite them with whatever the current environment's characters are, as the other answers appear to do). So I wrote my own method to read a line without removing the end-of-line characters, then used it in my DeleteLines method (I wanted the option to delete multiple lines, hence the use of a collection of line numbers to delete).
DeleteLines was implemented as a FileInfo extension and ReadLineKeepNewLineChars a StreamReader extension (but obviously you don't have to keep it that way).
public static class FileInfoExtensions
{
public static FileInfo DeleteLines(this FileInfo source, ICollection<int> lineNumbers, string targetFilePath)
{
var lineCount = 1;
using (var streamReader = new StreamReader(source.FullName))
{
using (var streamWriter = new StreamWriter(targetFilePath))
{
string line;
while ((line = streamReader.ReadLineKeepNewLineChars()) != null)
{
if (!lineNumbers.Contains(lineCount))
{
streamWriter.Write(line);
}
lineCount++;
}
}
}
return new FileInfo(targetFilePath);
}
}
public static class StreamReaderExtensions
{
private const char EndOfFile = '\uffff';
/// <summary>
/// Reads a line, similar to ReadLine method, but keeps any
/// new line characters (e.g. "\r\n" or "\n").
/// </summary>
public static string ReadLineKeepNewLineChars(this StreamReader source)
{
if (source == null)
throw new ArgumentNullException(nameof(source));
char ch = (char)source.Read();
if (ch == EndOfFile)
return null;
var sb = new StringBuilder();
while (ch != EndOfFile)
{
sb.Append(ch);
if (ch == '\n')
break;
ch = (char)source.Read();
}
return sb.ToString();
}
}
Are you on a Unix operating system?
You can do this with the "sed" stream editor. Read the man page for "sed"
What?
Open the file, seek to the position, then overwrite the line through a stream.
Got it? Simple, stream-based, fast, and no array that eats memory.
This works in VB. For example, search for the line culture=id, where culture is the name and id is the value, and change it to culture=en.
Fileopen(1, "text.ini")
dim line as string
dim currentpos as long
while true
line = lineinput(1)
dim namevalue() as string = split(line, "=")
if namevalue(0) = "line name value that i want to edit" then
currentpos = seek(1)
fileclose()
dim fs as filestream("test.ini", filemode.open)
dim sw as streamwriter(fs)
fs.seek(currentpos, seekorigin.begin)
sw.write(null)
sw.write(namevalue + "=" + newvalue)
sw.close()
fs.close()
exit while
end if
msgbox("org ternate jua bisa, no line found")
end while
that's all..use #d
