Search multiple words in a text file - c#

I wrote some code to search for several words in a text file, but only the last word seems to be searched. How can I fix it?
Code:
string txt_text;
string[] words = {
"var",
"bob",
"for",
"example"
};
StreamReader file = new StreamReader("test.txt");
foreach(string _words in words) {
while ((txt_text = file.ReadToEnd()) != null) {
if (txt_text.Contains(_words)) {
textBox1.Text = "founded";
break;
} else {
textBox1.Text = "nothing founded";
break;
}
}
}

First of all, you can get rid of the StreamReader and the loop and query the file with the help of LINQ:
using System.Linq;
using System.IO;
...
textBox1.Text = File
.ReadLines("test.txt")
.Any(line => words.Any(word => line.Contains(word)))
? "found"
: "nothing found";
If you insist on a loop, you should drop the else:
// using - do not forget to Dispose IDisposable
using StreamReader file = new StreamReader("test.txt");
// shorter version is
// string txt_text = File.ReadAllText("test.txt");
string txt_text = file.ReadToEnd();
bool found = false;
foreach (string word in words)
if (txt_text.Contains(word)) {
// If any word has been found, stop further searching
found = true;
break;
} // no else here: keep on looping for other words
textBox1.Text = found
? "found"
: "nothing found";

I'd save the text in a variable and then loop over your words to check whether each one exists in the file. Something like this:
string[] words = { "var", "bob", "for", "example"};
var text = file.ReadToEnd();
List<string> foundWords = new List<string>();
foreach (var word in words)
{
if (text.Contains(word))
foundWords.Add(word);
}
Then, the list foundWords contains all matching words.
(PS: Don't forget to put your StreamReader in a using statement so it gets disposed correctly)
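Note that Contains matches substrings, so "for" is also found inside "before". If whole words are what you need, a minimal sketch (assuming a regular-expression word boundary is an acceptable definition of "word") could look like this. The sample text is a stand-in for the result of file.ReadToEnd():

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

string[] words = { "var", "bob", "for", "example" };
// Stand-in for the file contents (assumption for illustration only).
string text = "before the variable example";

var foundWords = new List<string>();
foreach (string word in words)
{
    // \b is a word boundary; Regex.Escape protects words that contain regex metacharacters.
    if (Regex.IsMatch(text, @"\b" + Regex.Escape(word) + @"\b"))
        foundWords.Add(word);
}

// "var" and "for" only occur inside longer words here, so they are not reported.
Console.WriteLine(string.Join(", ", foundWords));
```

With plain Contains the same input would report "var", "for" and "example".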

Related

Replace multiple strings in text files with different texts

I have a text file like so:
template.txt
hello my name is [MYNAME], and i am of age [AGE].
i live in [COUNTRY].
i love to eat [FOOD]
and I am trying to replace whatever is in the square brackets with strings from a list, for example:
//                 name    country  age   food
p.Add(new Person("jack", "NZ", "20", "Prawns"));
p.Add(new Person("ana", "AUS", "23", "Chicken"));
p.Add(new Person("tom", "USA", "30", "Lamb"));
p.Add(new Person("ken", "JAPAN", "15", "Candy"));
So far I have tried the function below, which I call inside a loop:
//loop
static void Main(string[] args)
{
int count = 0;
foreach (var l in p)
{
FindAndReplace("template.txt","output"+count+".txt" ,"[MYNAME]",l.name);
FindAndReplace("template.txt","output"+count+".txt" ,"[COUNTRY]",l.country);
FindAndReplace("template.txt","output"+count+".txt" ,"[AGE]",l.age);
FindAndReplace("template.txt","output"+count+".txt" ,"[FOOD]",l.food);
count++;
}
}
//find and replace function
private static void FindAndReplace(string template_path,string save_path,string find,string replace)
{
using (var sourceFile = File.OpenText(template_path))
{
// Open a stream for the temporary file
using (var tempFileStream = new StreamWriter(save_path))
{
string line;
// read lines while the file has them
while ((line = sourceFile.ReadLine()) != null)
{
// Do the word replacement
line = line.Replace(find, replace);
// Write the modified line to the new file
tempFileStream.WriteLine(line);
}
}
}
}
This is what I have done, but the output I get is this:
output1.txt
hello my name is [MYNAME], and i am of age [AGE].
i live in [COUNTRY].
i love to eat Prawns
output2.txt
hello my name is [MYNAME], and i am of age [AGE].
i live in [COUNTRY].
i love to eat Chicken
Only the last text is replaced.
Every time you call FindAndReplace, you overwrite the file written by the previous call.
The first call reads the template file, replaces one placeholder ([MYNAME]) with a value, and writes the result to a new file.
The next call reads the original template again, so [MYNAME] is a placeholder once more; it replaces only [COUNTRY] and writes to the same output file, overwriting its content. This repeats until the last call.
That is why only [FOOD] is replaced.
Try replacing all the placeholders in one go and then writing the result to the file.
Instead of a function, try something like this:
static void Main(string[] args)
{
    int count = 0;
    foreach (var l in p)
    {
        using (var sourceFile = File.OpenText("template.txt"))
        {
            // Open a stream for the output file
            using (var tempFileStream = new StreamWriter("output" + count + ".txt"))
            {
                string line;
                // read lines while the file has them
                while ((line = sourceFile.ReadLine()) != null)
                {
                    line = line.Replace("[MYNAME]", l.name);
                    line = line.Replace("[COUNTRY]", l.country);
                    line = line.Replace("[AGE]", l.age);
                    line = line.Replace("[FOOD]", l.food);
                    tempFileStream.WriteLine(line);
                } // end of while loop
            }
        }
        count++;
    } // end of foreach loop
} // end of Main
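Since the template is small, the same idea can also be sketched without streams: load the text once, apply every replacement to the in-memory string, and write the result a single time per person. FillTemplate and the dictionary of placeholders are hypothetical names introduced for this illustration, and the template text is inlined so the sketch runs on its own:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: applies every placeholder/value pair to the template text.
static string FillTemplate(string template, IEnumerable<KeyValuePair<string, string>> values)
{
    foreach (var pair in values)
        template = template.Replace(pair.Key, pair.Value);
    return template;
}

// Inlined copy of template.txt (in real code: File.ReadAllText("template.txt")).
string template =
    "hello my name is [MYNAME], and i am of age [AGE].\n" +
    "i live in [COUNTRY].\n" +
    "i love to eat [FOOD]";

var jack = new Dictionary<string, string>
{
    ["[MYNAME]"] = "jack",
    ["[COUNTRY]"] = "NZ",
    ["[AGE]"] = "20",
    ["[FOOD]"] = "Prawns",
};

string output = FillTemplate(template, jack);
Console.WriteLine(output);
// In real code the result would be written once per person,
// e.g. with System.IO.File.WriteAllText("output0.txt", output).
```

Because all four replacements happen before anything is written, no call can undo the work of an earlier one.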

extracting a substring within a multiline string

I have a text file containing the following lines:
<TestInfo."Content">
{
<Label> "Content"
<Visible> "true"
"This is the text I want to get"
}
<TestInfo."Content2">
{
<Label> "Content2"
<Visible> "true"
"I don't want e.g. this"
}
I want to extract This is the text I want to get.
I tried e.g. the following:
string tmp = File.ReadAllText(textfile);
string result = Regex.Match(tmp, @"<Label> ""Content"" \n\s+ <Visible> ""true"" \n\s+ ""(.+?)""", RegexOptions.Singleline).Groups[1].Value;
However, in this case I get only the first word.
So, my output is: This
And I have no idea why...
I would appreciate any help. Thanks!
If you want the entire line after the line that starts with <Visible>, you'd better read the file line by line instead of using File.ReadAllText and a regular expression:
string result = null; // stays null if no <Visible> line is found
using (StreamReader sr = new StreamReader(textfile))
{
while (sr.Peek() >= 0)
{
string line = sr.ReadLine();
if (line.StartsWith("<Visible>"))
{
result = sr.ReadLine();
break;
}
}
}
Try this:
var tmp = File.ReadAllText("TextFile1.txt");
var result = Regex.Match(tmp, "This is the text I want to get", RegexOptions.Multiline);
if (result.Success) // Groups.Count is always at least 1, so test Success instead
for (int i = 0; i < result.Groups.Count; i++)
Console.WriteLine(result.Groups[i].Value);
else
Console.WriteLine("string not found.");
You could change your regex this way:
var result = Regex.Match(tmp, @"<Visible> ""true""\s*""([\S ]+)""", RegexOptions.Singleline).Groups[1].Value;
If you want to get all the matches, not only the first one, you could use Regex.Matches
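A minimal sketch of the Regex.Matches variant, using the pattern from this answer. The input is the sample from the question, built inline so the snippet runs on its own; in real code it would come from File.ReadAllText:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Inline copy of the sample file from the question.
string tmp = string.Join("\n",
    "<TestInfo.\"Content\">", "{",
    "<Label> \"Content\"", "<Visible> \"true\"",
    "\"This is the text I want to get\"", "}",
    "<TestInfo.\"Content2\">", "{",
    "<Label> \"Content2\"", "<Visible> \"true\"",
    "\"I don't want e.g. this\"", "}");

var results = new List<string>();
// [\S ]+ accepts non-whitespace characters and plain spaces, so it never crosses a line break.
foreach (Match m in Regex.Matches(tmp, @"<Visible> ""true""\s*""([\S ]+)"""))
    results.Add(m.Groups[1].Value);

foreach (string r in results)
    Console.WriteLine(r);
```

Each Match carries its own Groups collection, so Groups[1] is the quoted text of that particular block.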
Thanks a lot for your input! This helped me to find a final solution:
First, I extracted only a small part containing the string I want to extract to avoid ambiguities:
string[] tmp = File.ReadAllLines(textfile);
List<string> Content = new List<string>();
bool dumpA = false;
Regex regBEGIN = new Regex(@"<TestInfo\.""Content"">");
Regex regEND = new Regex(@"<TestInfo\.""Content2"">");
foreach (string line in tmp)
{
if (dumpA)
Content.Add(line.Trim());
if (regBEGIN.IsMatch(line))
dumpA = true;
if (regEND.IsMatch(line)) break;
}
Then I can extract the (now only once existing) line starting with '"':
string result = "";
foreach (string line in Content)
{
if (line.StartsWith("\""))
{
result = line;
result = result.Replace("\"", "");
result = result.Trim();
}
}

Reading in first line of csv file and saving to a list without using ',' split

I have a CSV file, and I need to read its first line and save it to a List. The problem is that some of the fields contain commas, so a split on ',' breaks in the middle of a field when I need it not to. Unfortunately I cannot change the data, so what's there needs to stay. I also write the data to the CSV myself, so I was thinking that I could use a different character instead of a comma. Does anyone know if this is possible? I have been researching but have not come up with a proper answer. Here is my code:
using System;
using System.CodeDom;
using System.IO;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;
namespace TestJSON
{
class Program
{
static void Main()
{
var data = JsonConvert.DeserializeObject<dynamic>(File.ReadAllText(
@"C:\Users\nphillips\workspace\2016R23\UITestAutomation\SeedDataGenerator\src\staticresources\seeddata.resource"));
string fileName = "";
var bundles = data.RecordSetBundles;
foreach (var bundle in bundles)
{
var records = bundle.Records;
foreach (var record in records)
{
var test = record.attributes;
foreach (var testagain in test)
{
// Getting the object Name Ex. Location, Item, etc.
var jprop = testagain as JProperty;
if (jprop != null)
{
fileName = jprop.First.ToString().Split('_')[2]+ ".csv";
}
break;
}
string header = "";
string value = "";
foreach (var child in record)
{
var theChild = child as JProperty;
if (theChild != null && !theChild.Name.Equals("attributes"))
{
header += child.Name + ",";
value += child.Value.ToString() + ",";
}
}
value += "+" + Environment.NewLine;
if (!File.Exists(fileName))
{
header += "+" + Environment.NewLine;
File.WriteAllText(fileName, header);
}
else
{
// Need to read in here
var readCSV = new StreamReader(fileName);
var splits = readCSV.ReadLine();
}
File.AppendAllText(fileName, value);
}
}
}
}
}
You need to know how the file is delimited. I would guess that this file is tab delimited, so split on that instead.
Assuming your line is called myCSVLine, i.e.:
string separator = "\t";
string[] splitLine = myCSVLine.Split(separator.ToCharArray());
splitLine would now hold all of your strings, including the ones that contain commas.
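Because the question mentions that the same program also writes the file, a rough sketch of the round trip might look like this. It assumes that no field ever contains a tab; the header values and the temporary file path are made up for the illustration:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Assumption: a tab never appears inside a field, so it is safe as a separator.
const char Separator = '\t';

// Made-up header values; the second one contains a comma on purpose.
string[] header = { "Name", "Address, City", "Notes" };

// Temporary file used only for this demonstration.
string path = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
File.WriteAllText(path, string.Join(Separator.ToString(), header) + Environment.NewLine);

// Read only the first line back and close the reader before touching the file again.
string firstLine;
using (var reader = new StreamReader(path))
{
    firstLine = reader.ReadLine();
}
var columns = new List<string>(firstLine.Split(Separator));
File.Delete(path);

foreach (string column in columns)
    Console.WriteLine(column);
```

Disposing the reader before the next write matters here, since an open StreamReader would keep the file locked for a later File.AppendAllText.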

Counting names in CSV files

I am trying to write a program for a school project that reads a CSV file containing a name on each line and outputs each name, along with the number of times it occurs, in a list box. I would prefer that it not be preset for a specific name, but I guess that would work too. So far I have the code below, but now I'm stuck. The CSV file has one name on each line, with a comma after each name. Any help would be great, thanks.
This is what I have so far:
string[] csvArray;
string line;
StreamReader reader;
OpenFileDialog openFileDialog = new OpenFileDialog();
//set filter for dialog control
const string FILTER = "CSV Files|*.csv|All Files|*.*";
openFileDialog.Filter = FILTER;
//if user opens file and clicks ok
if (openFileDialog.ShowDialog() == DialogResult.OK)
{
//open input file
reader = File.OpenText(openFileDialog.FileName);
//while not end of stream
while (!reader.EndOfStream)
{
//read line from file
line = reader.ReadLine().ToLower();
//split values
csvArray = line.Split(',');
Using LINQ we can do the following:
static IEnumerable<Tuple<int,string>> CountOccurences(IEnumerable<string> data)
{
return data.GroupBy(t => t).Select(t => Tuple.Create(t.Count(),t.Key));
}
Test:
var strings = new List<string>();
strings.Add("John");
strings.Add("John");
strings.Add("John");
strings.Add("Peter");
strings.Add("Doe");
strings.Add("Doe");
foreach (var item in CountOccurences(strings)) {
Console.WriteLine (String.Format("{0} = {1}", item.Item2, item.Item1));
}
John = 3
Peter = 1
Doe = 2
To use in your case:
string filePath = @"c:\myfile.txt";
foreach (var item in CountOccurences(File.ReadAllLines(filePath).Select(t => t.Split(',').First())))
Console.WriteLine (String.Format("{0} = {1}", item.Item2, item.Item1));
You can use a dictionary to store the number of occurrences of each name:
Dictionary<string,int> NameOcur=new Dictionary<string,int>();
...
while (!reader.EndOfStream)
{
//read line from file
line = reader.ReadLine().ToLower();
//split values
csvArray = line.Split(',');
if (NameOcur.ContainsKey(csvArray[0]))
{
///Name exists in Dictionary increase count
NameOcur[csvArray[0]]++;
}
else
{
//Does not exist add with value 1
NameOcur.Add(csvArray[0],1);
}
}
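A slightly more compact sketch of the same dictionary idea is shown below. CountNames is a hypothetical helper name; it assumes the name is the first comma-separated field, skips empty lines, and uses a case-insensitive comparer in place of ToLower. The sample lines stand in for the file contents:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: counts the first comma-separated field of each line, ignoring case.
static Dictionary<string, int> CountNames(IEnumerable<string> lines)
{
    var result = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
    foreach (string line in lines)
    {
        string name = line.Split(',')[0].Trim();
        if (name.Length == 0)
            continue; // skip blank lines

        // TryGetValue leaves current at 0 when the name has not been seen yet.
        result.TryGetValue(name, out int current);
        result[name] = current + 1;
    }
    return result;
}

// Sample lines standing in for File.ReadLines(openFileDialog.FileName).
var nameCounts = CountNames(new[] { "John,", "john,", "Peter,", "", "Doe,", "Doe," });
foreach (var pair in nameCounts)
    Console.WriteLine($"{pair.Key} = {pair.Value}");
```

In the WinForms program each formatted string could be added to the list box's Items collection instead of being written to the console.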

Fastest way to find strings in a file

I have a log file that is usually no more than 10 KB (the file size can go up to 2 MB at most), and I want to find out whether at least one group of these strings occurs in the file. The strings will be on different lines, like this:
ACTION:.......
INPUT:...........
RESULT:..........
I need to know whether at least one such group exists in the file. I have to do this about 100 times per test (the log is different each time, so I have to reload and read it), so I am looking for the fastest and best way to do this.
I looked through the forums for the fastest way, but I don't think my file is big enough to need those solutions.
Thanks for looking.
I would read it line by line and check the conditions. Once you have seen a group you can quit. This way you don't need to read the whole file into memory. Like this:
public bool ContainsGroup(string file)
{
using (var reader = new StreamReader(file))
{
var hasAction = false;
var hasInput = false;
var hasResult = false;
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
if (!hasAction)
{
if (line.StartsWith("ACTION:"))
hasAction = true;
}
else if (!hasInput)
{
if (line.StartsWith("INPUT:"))
hasInput = true;
}
else if (!hasResult)
{
if (line.StartsWith("RESULT:"))
hasResult = true;
}
if (hasAction && hasInput && hasResult)
return true;
}
return false;
}
}
This code checks if there is a line starting with ACTION then one with INPUT and then one with RESULT. If the order of those is not important then you can omit the if () else if () checks. In case the line does not start with the strings replace StartsWith with Contains.
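For the case where the order does not matter, a sketch of the simplified check might look like this. ContainsAllMarkers is a hypothetical name, and it takes a sequence of lines so that it can be fed from File.ReadLines or from test data:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: true once ACTION:, INPUT: and RESULT: lines have all been seen, in any order.
static bool ContainsAllMarkers(IEnumerable<string> lines)
{
    bool hasAction = false, hasInput = false, hasResult = false;
    foreach (string line in lines)
    {
        // Independent checks: no else-if chain, so the order of the lines is irrelevant.
        if (line.StartsWith("ACTION:", StringComparison.Ordinal)) hasAction = true;
        if (line.StartsWith("INPUT:", StringComparison.Ordinal)) hasInput = true;
        if (line.StartsWith("RESULT:", StringComparison.Ordinal)) hasResult = true;

        // Stop reading as soon as the group is complete.
        if (hasAction && hasInput && hasResult)
            return true;
    }
    return false;
}

bool anyOrder = ContainsAllMarkers(new[] { "RESULT: ok", "noise", "INPUT: x", "ACTION: y" });
bool incomplete = ContainsAllMarkers(new[] { "ACTION: a", "INPUT: b" });
Console.WriteLine($"{anyOrder} {incomplete}"); // prints "True False"
```

Passing System.IO.File.ReadLines(file) as the argument keeps the early exit, because that method reads the file lazily, one line at a time.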
Here's one possible way to do it:
string fileContents;
string[] logFiles = Directory.GetFiles(@"C:\Logs");
foreach (string file in logFiles)
{
using (StreamReader sr = new StreamReader(file))
{
// ReadToEnd returns the rest of the stream as a single string
fileContents = sr.ReadToEnd();
if (fileContents.Contains("ACTION:") || fileContents.Contains("INPUT:") || fileContents.Contains("RESULT:"))
{
// Do what you need to here
}
}
}
You may need to do some variation based on your exact implementation needs - for example, what if the word spans two lines, does the line need to start with the word, etc.
Added
Alternate line-by-line check:
string[] logFiles = Directory.GetFiles(@"C:\Logs");
foreach (string file in logFiles)
{
    // File.ReadLines streams the file one line at a time
    foreach (string line in File.ReadLines(file))
    {
        if (line.Contains("ACTION:") || line.Contains("INPUT:") || line.Contains("RESULT:"))
        {
            // Do what you need to here
        }
    }
}
Take a look at How to Read Text From a File. You might also want to take a look at the String.Contains() method.
Basically you will loop through all the files. For each file read line-by-line and see if any of the lines contains 1 of your special "Sections".
You don't have much of a choice with text files when it comes to efficiency. The easiest way would definitely be to loop through each line of data. When you grab a line in a string, split it on the spaces. Then match those words to your words until you find a match. Then do whatever you need.
I don't know how to do it in c# but in vb it would be something like...
' objReader (a StreamReader) and Myword (the word to find) are assumed to be defined elsewhere
Dim yourString As String
Dim words As String()
Do While objReader.Peek() <> -1
    yourString = objReader.ReadLine()
    words = yourString.Split(" "c)
    For Each word In words
        If Myword = word Then
            ' do stuff
        End If
    Next
Loop
Hope that helps
This code sample searches for strings in a large text file. The words are contained in a HashSet. It writes the found lines in a temp file.
if (File.Exists(@"temp.txt")) File.Delete(@"temp.txt");
String line;
String oldLine = "";
using (var fs = File.OpenRead(largeFileName))
using (var sr = new StreamReader(fs, Encoding.UTF8, true))
{
HashSet<String> hash = new HashSet<String>();
hash.Add("house");
using (var sw = new StreamWriter(@"temp.txt"))
{
while ((line = sr.ReadLine()) != null)
{
foreach (String str in hash)
{
if (oldLine.Contains(str))
{
sw.WriteLine(oldLine);
// write the next line as well (optional)
sw.WriteLine(line + "\r\n");
}
}
oldLine = line;
}
}
}
