get line number of string in file C# - c#

That code makes filter on the files with the regular expression, and to shape the results. But how can I get a line number of each matches in the file?
I have no idea how to do it...
Please help me
class QueryWithRegEx
enter code here
IEnumerable<System.IO.FileInfo> fileList = GetFiles(startFolder);
System.Text.RegularExpressions.Regex searchTerm =
new System.Text.RegularExpressions.Regex(#"Visual (Basic|C#|C\+\+|Studio)");
var queryMatchingFiles =
from file in fileList
where file.Extension == ".htm"
let fileText = System.IO.File.ReadAllText(file.FullName)
let matches = searchTerm.Matches(fileText)
where matches.Count > 0
select new
{
name = file.FullName,
matchedValues = from System.Text.RegularExpressions.Match match in matches
select match.Value
};
Console.WriteLine("The term \"{0}\" was found in:", searchTerm.ToString());
foreach (var v in queryMatchingFiles)
{
string s = v.name.Substring(startFolder.Length - 1);
Console.WriteLine(s);
foreach (var v2 in v.matchedValues)
{
Console.WriteLine(" " + v2);
}
}
Console.WriteLine("Press any key to exit");
Console.ReadKey();
}
static IEnumerable<System.IO.FileInfo> GetFiles(string path)
{
...}

Give this a go:
var queryMatchingFiles =
from file in Directory.GetFiles(startFolder)
let fi = new FileInfo(file)
where fi.Extension == ".htm"
from line in File.ReadLines(file).Select((text, index) => (text, index))
let match = searchTerm.Match(line.text)
where match.Success
select new
{
name = file,
line = line.text,
number = line.index + 1
};

You can Understand from this example:
Read lines number 3 from string
string line = File.ReadLines(FileName).Skip(14).Take(1).First();
Read text from a certain line number from string
string GetLine(string text, int lineNo)
{
string[] lines = text.Replace("\r","").Split('\n');
return lines.Length >= lineNo ? lines[lineNo-1] : null;
}

Retrieving the line number from a Regex match can be tricky, I'd use the following approach:
private static void FindMatches(string startFolder)
{
var searchTerm = new Regex(#"Visual (Basic|C#|C\+\+|Studio)");
foreach (var file in Directory.GetFiles(startFolder, "*.htm"))
{
using (var reader = new StreamReader(file))
{
int lineNumber = 1;
string lineText;
while ((lineText = reader.ReadLine()) != null)
{
foreach (var match in searchTerm.Matches(lineText).Cast<Match>())
{
Console.WriteLine($"Match found: <{match.Value}> in file <{file}>, line <{lineNumber}>");
}
lineNumber++;
}
}
}
}

Related

Search a string to see if a word occurs more than once

I need to check if the given word is contained in a line inside the path, and print it. Here is my code:
using (StreamReader reading = new StreamReader(path))
{
string user= Console.ReadLine();
string line = user;
Console.WriteLine();
while ((line = reading.ReadLine()) != null)
{
if (line.Contains(user))
{
Console.WriteLine(line);
}
}
}
This is working, but if in the stream the word is found twice it gives two strings as an output. How can i check if the word is found twice?
If you just want to display the lines with the user and display the total count of how many lines contains user, you can easily do this with some LINQ:
var linesWithUser = File.ReadLines(filePath).Where(x => x.Contains(user)).ToList();
//Prints the count
Console.WriteLine(linesWithUser.Count);
//Prints all the lines that contain the user, maybe do other things...
foreach(var line in linesWithUser)
{
Console.WriteLine(line);
}
put your counting outside of the while, like this:
int count = 0;
string line = null;
while ((line = reading.ReadLine()) != null)
{
if (line.Contains(user))
{
count++;
}
}
if (count > 0)
{
Console.WriteLine(user +" found " + count +" time");
}
else
{
Console.WriteLine(user + " not found!");
}
Something like that:
bool ContainsWordMultipleTimes(string word, string input)
{
var regex = new Regex(string.Format(#"\b{0}\b", word),
RegexOptions.IgnoreCase);
return regex.Matches(input).Count > 1;
}
Or even extend string like this:
public static class StringWordsExtensions
{
public static bool ContainsMultipleTimes(this string input, string word)
{
var regex = new Regex(string.Format(#"\b{0}\b", word),
RegexOptions.IgnoreCase);
return regex.Matches(input).Count > 1;
}
}

String formatting in C#?

I have some problems to format strings from a List<string>
Here's a picture of the List values:
Now I managed to manipulate some of the values but others not, here's what I used to manipulate:
string prepareStr(string itemToPrepare) {
string first = string.Empty;
string second = string.Empty;
if (itemToPrepare.Contains("\"")) {
first = itemToPrepare.Replace("\"", "");
}
if (first.Contains("-")) {
int beginIndex = first.IndexOf("-");
second = first.Remove(beginIndex, first.Length - beginIndex);
}
return second;
}
Here's a picture of the Result:
I need to get the clear Path without the (-startup , -minimzed , MSRun , double apostrophes).
What am I doing wrong here?
EDIT my updated code:
void getStartUpEntries() {
var startEntries = StartUp.getStartUp();
if (startEntries != null && startEntries.Count != 0) {
for (int i = 0; i < startEntries.Count; i++) {
var splitEntry = startEntries[i].Split(new string[] { "||" }, StringSplitOptions.None);
var str = splitEntry[1];
var match = Regex.Match(str, #"\|\|""(?<path>(?:\""|[^""])*)""");
var finishedPath = match.Groups["path"].ToString();
if (!string.IsNullOrEmpty(finishedPath)) {
if (File.Exists(finishedPath) || Directory.Exists(finishedPath)) {
var _startUpObj = new StartUp(splitEntry[0], finishedPath,
"Aktiviert: ", new Uri("/Images/inWatch.avOK.png", UriKind.RelativeOrAbsolute),
StartUp.getIcon(finishedPath));
_startUpList.Add(_startUpObj);
}
else {
var _startUpObjNo = new StartUp(splitEntry[0], finishedPath,
"Aktiviert: ", new Uri("/Images/inWatch.avOK.png", UriKind.RelativeOrAbsolute),
StartUp.getIcon(string.Empty));
_startUpList.Add(_startUpObjNo);
}
}
var _startUpObjLast = new StartUp(splitEntry[0], splitEntry[1],
"Aktiviert: ", new Uri("/Images/inWatch.avOK.png", UriKind.RelativeOrAbsolute),
StartUp.getIcon(string.Empty));
_startUpList.Add(_startUpObjLast);
}
lstStartUp.ItemsSource = _startUpList.OrderBy(item => item.Name).ToList();
}
You could use a regex to extract the path:
var str = #"0Raptr||""C:\Program Files (x86)\Raptr\raptrstub.exe"" --startup"
var match = Regex.Match(str, #"\|\|""(?<path>(?:\""|[^""])*)""");
Console.WriteLine(match.Groups["path"]);
This will match any (even empty) text (either an escaped quote, or any character which is not a quote) between two quote characters preceeded by two pipe characters.
Similarly, you could simply split on the double quotes as I see that's a repeating occurrence in your examples and take the second item in the split array:
var path = new Regex("\"").Split(s)[1];
This is and update to your logic without using any Regex:
private string prepareStr(string itemToPrepare)
{
string result = null;
string startString = #"\""";
string endString = #"\""";
int startPoint = itemToPrepare.IndexOf(startString);
if (startPoint >= 0)
{
startPoint = startPoint + startString.Length;
int EndPoint = itemToPrepare.IndexOf(endString, startPoint);
if (EndPoint >= 0)
{
result = itemToPrepare.Substring(startPoint, EndPoint - startPoint);
}
}
return result;
}

Clearing a line of a txt file by the line ID

I have looked all over for the answer to this, but I can't find it anywhere. I need to be able to clear a line from a txt file by the last integer in the line (the ID number), but I have no idea how to do that. Please help? Basically I was thinking that I need to find the last integer, and if it does not equal to the input, then it would move to the next line until it finds the right integer. Then that line is cleared. Here is some of my code that obviously doesn't work:
public static void TicID(CommandArgs args)
{
if (args.Parameters.Count == 1)
{
if (i == 1)
{
try
{
string idToDelete = args.Parameters[0];
StreamReader idreader = new StreamReader("Tickets.txt");
StreamWriter iddeleter = new StreamWriter("Tickets.txt");
string id = Convert.ToString(idreader.Read());
string line = null;
while (idreader.Peek() >= 0)
{
if (String.Compare(id, idToDelete) == 0)
{
iddeleter.WriteLine(line);
}
else
{
idreader.ReadLine();
}
}
}
The most straightforward way to delete lines is to write the lines that should not be deleted:
var idToDelete = "1";
var path = #"C:\Temp\Test.txt";
var lines = File.ReadAllLines(path);
using (var writer = new StreamWriter(path, false)) {
for (var i = 0; i < lines.Length; i++) {
var line = lines[i];
//assuming it's a CSV file
var cols = line.Split(',');
var id = cols[cols.Length - 1];
if (id != idToDelete) {
writer.WriteLine(line);
}
}
}
This is the LINQ-way:
var lines = from line in File.ReadAllLines(path)
let cols = line.Split(',')
let id = cols[cols.Length - 1]
where id != idToDelete
select line;
File.WriteAllLines(path, lines);
Create a StreamReader object and read line by line into a string array or something like that using StreamReader's instance method ReadLine() and now find your line of choice in your text array and delete it.
Important note:
do { /*see description above*/ } while (streamReader.Peak() != -1);

Code to trim part of a text file in C#

I have a situation where I am given a text file with text formatted as follows:
C:\Users\Admin\Documents\report2011.docx: My Report 2011
C:\Users\Admin\Documents\newposter.docx: Dinner Party Poster 08
How would it be possible to trim the text file, so to trim the ":" and all characters after it.
E.g. so the output would be like:
C:\Users\Admin\Documents\report2011.docx
C:\Users\Admin\Documents\newposter.docx
Current Code:
private void button1_Click(object sender, EventArgs e)
{
using (StreamWriter sw = File.AppendText(#"c:\output.txt"))
{
StreamReader sr = new StreamReader(#"c:\filename.txt");
Regex reg = new Regex(#"\w\:(.(?!\:))+");
List<string> parsedStrings = new List<string>();
while (sr.EndOfStream)
{
sw.WriteLine(reg.Match(sr.ReadLine()).Value);
}
}
}
Not working :(
int index = myString.LastIndexOf(":");
if (index > 0)
myString= myString.Substring(0, index);
Edit - Added answer based on modified question. It can be condensed slightly, but left expanded for clarity of what's going on.
using (StreamWriter sw = File.AppendText(#"c:\output.txt"))
{
using(StreamReader sr = new StreamReader(#"input.txt"))
{
string myString = "";
while (!sr.EndOfStream)
{
myString = sr.ReadLine();
int index = myString.LastIndexOf(":");
if (index > 0)
myString = myString.Substring(0, index);
sw.WriteLine(myString);
}
}
}
Edited
StreamReader sr = new StreamReader("yourfile.txt");
Regex reg = new Regex(#"\w\:(.(?!\:))+");
List<string> parsedStrings = new List<string>();
while (!sr.EndOfStream)
{
parsedStrings.Add(reg.Match(sr.ReadLine()).Value);
}
sure. while reading each line, do a
Console.WriteLine(line.Substring(0,line.IndexOf(": "));
i'd use the answer given here Code to trim part of a text file in C# and find the 2nd occurrence and then use it in substring.
var s = #"C:\Users\Admin\Documents\report2011.docx: My Report 2011";
var i = GetNthIndex(s,':',2);
var result = s.Substring(i+1);
public int GetNthIndex(string s, char t, int n)
{
int count = 0;
for (int i = 0; i < s.Length; i++)
{
if (s[i] == t)
{
count++;
if (count == n)
{
return i;
}
}
}
return -1;
}
Assuming this is being done where the files are supposed to exist, you could handle this taking into account any colons in the (what I assume is) description by checking for the existence of the files after you get the index of the colon.
List<string> files = new List<string>();
files.Add(#"C:\Users\Admin\Documents\report2011.docx: My Report 2011");
files.Add(#"C:\Users\Admin\Documents\newposter.docx: Dinner Party Poster 08");
files.Add(#"C:\Users\Admin\Documents\newposter.docx: Dinner Party: 08");
int lastColon;
string filename;
foreach (string s in files)
{
bool isFilePath = false;
filename = s;
while (!isFilePath)
{
lastColon = filename.LastIndexOf(":");
if (lastColon > 0)
{
filename = filename.Substring(0, lastColon);
if (File.Exists(filename))
{
Console.WriteLine(filename);
isFilePath = true;
}
}
else
{
throw new FileNotFoundException("File not found", s);
}
}
}
Try this:
more faster:
string s = #"C:\Users\Admin\Documents\report2011.docx: My Report 2011";
string path = Path.GetDirectoryName(s) + s.Split(new char[] { ':' }) [1];
Console.WriteLine(path); //C:\Users\Admin\Documents\report2011.docx
you need import System.IO
you could use split:
string[] splitted= myString.Split(':');
Then you get an array where you take the first one.
var mySplittedString = splitted[0]
Have a look here if you need more information on this.
EDIT: In your case you get an array with the size of at least 3 so you need to get splitted[0] and splitted[ 1]

Getting rid of *consecutive* periods in a filename

I was wondering how I'd get rid of periods in a filename if i have a filename like:
Test....1.txt to look like Test 1.txt ? I do NOT want files like : 1.0.1 Test.txt to be touched. Only files with consecutive periods should be replaced with a space. Any ideas?
This is my current code but as you can see, it replaces every period aside from periods in extension names:
public void DoublepCleanUp(List<string> doublepFiles)
{
Regex regExPattern2 = new Regex(#"\s{2,}");
Regex regExPattern4 = new Regex(#"\.+");
Regex regExPattern3 = new Regex(#"\.(?=.*\.)");
string replace = " ";
List<string> doublep = new List<string>();
var filesCount = new Dictionary<string, int>();
try
{
foreach (string invalidFiles in doublepFiles)
{
string fileOnly = System.IO.Path.GetFileName(invalidFiles);
string pathOnly = System.IO.Path.GetDirectoryName(fileOnly);
if (!System.IO.File.Exists(fileOnly))
{
string filewithDoublePName = System.IO.Path.GetFileName(invalidFiles);
string doublepPath = System.IO.Path.GetDirectoryName(invalidFiles);
string name = System.IO.Path.GetFileNameWithoutExtension(invalidFiles);
//string newName = name.Replace(".", " ");
string newName = regExPattern4.Replace(name, replace);
string newName2 = regExPattern2.Replace(newName, replace);
string filesDir = System.IO.Path.GetDirectoryName(invalidFiles);
string fileExt = System.IO.Path.GetExtension(invalidFiles);
string fileWithExt = newName2 + fileExt;
string newPath = System.IO.Path.Combine(filesDir, fileWithExt);
System.IO.File.Move(invalidFiles, newPath);
DataGridViewRow clean = new DataGridViewRow();
clean.CreateCells(dataGridView1);
clean.Cells[0].Value = doublepPath;
clean.Cells[1].Value = filewithDoublePName;
clean.Cells[2].Value = fileWithExt;
dataGridView1.Rows.Add(clean);
}
else
{
if (filesCount.ContainsKey(fileOnly))
{
filesCount[fileOnly]++;
}
else
{
filesCount.Add(fileOnly, 1);
string newFileName = String.Format("{0}{1}{2}",
System.IO.Path.GetFileNameWithoutExtension(fileOnly),
filesCount[fileOnly].ToString(),
System.IO.Path.GetExtension(fileOnly));
string newFilePath = System.IO.Path.Combine(System.IO.Path.GetDirectoryName(fileOnly), newFileName);
System.IO.File.Move(fileOnly, newFilePath);
DataGridViewRow clean = new DataGridViewRow();
clean.CreateCells(dataGridView1);
clean.Cells[0].Value = pathOnly;
clean.Cells[1].Value = fileOnly;
clean.Cells[2].Value = newFileName;
dataGridView1.Rows.Add(clean);
}
}
}
}
catch(Exception e)
{
//throw;
StreamWriter doublepcleanup = new StreamWriter(#"G:\DoublePeriodCleanup_Errors.txt");
doublepcleanup.Write("Double Period Error: " + e + "\r\n");
doublepcleanup.Close();
}
}
string name = "Text...1.txt";
Regex r = new Regex("[.][.]+");
string result = r.Replace(name, " ");
Well, to do that with a string:
string st = "asdf..asdf...asfd...asdf.asf.asdf.s.s";
Regex r = new Regex("\\.\\.+");
st = r.Replace(st, " ");
This will replace any instance of 2 or more '.'s with a space.
I would throw this into a method:
public static string StringReplace(string original,
string regexMatch, string replacement) {
Regex r = new Regex(regexMatch);
return r.Replace(original, replacement);
}
How about this?
string newFileName = String.Join(".", fileName.Split('.').Select(p => !String.IsNullOrEmpty(p) ? p : " ").ToArray())
Why not use something like this?
string newName = name;
while (newName.IndexOf("..") != -1)
newName = newName.Replace("..", " ");
static string CleanUpPeriods(string filename)
{
StringBuilder sb = new StringBuilder();
if (filename.Length > 0) sb.Append(filename[0]);
for (int i = 1; i < filename.Length; i++)
{
char last = filename[i - 1];
char current = filename[i];
if (current != '.' || last != '.') sb.Append(current);
}
return sb.ToString();
}
You could use use regular expressions, something like this
string fileName = new Regex(#"[.][.]+").Replace(oldFileName, "");
Continuing from dark_charlie's solution, isn't
string newName = name;
while (newName.IndexOf("..") != -1)
newName = newName.Replace("..", ".");
enough?
I have tested this code on a number of cases, and it appears to exhibit the requested behavior.
private static string RemoveExcessPeriods(string text)
{
if (string.IsNullOrEmpty(text))
return string.Empty;
// If there are no consecutive periods, then just get out of here.
if (!text.Contains(".."))
return text;
// To keep things simple, let's separate the file name from its extension.
string extension = Path.GetExtension(text);
string fileName = Path.GetFileNameWithoutExtension(text);
StringBuilder result = new StringBuilder(text.Length);
bool lastCharacterWasPeriod = false;
bool thisCharacterIsPeriod = fileName.Length > 0 && fileName[0] == '.';
bool nextCharacterIsPeriod;
for (int index = 0; index < fileName.Length; index++)
{
// Includes both the extension separator and other periods.
nextCharacterIsPeriod = fileName.Length == index + 1 || fileName[index + 1] == '.';
if (!thisCharacterIsPeriod)
result.Append(fileName[index]);
else if (thisCharacterIsPeriod && !lastCharacterWasPeriod && !nextCharacterIsPeriod)
result.Append('.');
else if (thisCharacterIsPeriod && !lastCharacterWasPeriod)
result.Append(' ');
lastCharacterWasPeriod = thisCharacterIsPeriod;
thisCharacterIsPeriod = nextCharacterIsPeriod;
}
return result.ToString() + extension;
}
I just made a change to handle some edge cases. Here are some test results for this version.
"Test....1.txt" => "Test 1.txt"
"1.0.1..Test.txt" => "1.0.1 Test.txt"
"Test......jpg" => "Test .jpg"
"Test.....jpg" => "Test .jpg"
"one.pic.jpg" => "one.pic.jpg"
"one..pic.jpg" => "one pic.jpg"
"one..two..three.pic.jpg" => "one two three.pic.jpg"
"one...two..three.pic.jpg" => "one two three.pic.jpg"
"one..two..three.pic..jpg" => "one two three.pic .jpg"
"one..two..three..pic.jpg" => "one two three pic.jpg"
"one..two..three...pic...jpg" => "one two three pic .jpg"
Combining some other answers...
static string CleanUpPeriods(string filename)
{
string extension = Path.GetExtension(filename);
string name = Path.GetFileNameWithoutExtension(filename);
Regex regex = new Regex(#"\.\.+");
string s = regex.Replace(name, " ").Trim();
if (s.EndsWith(".")) s = s.Substring(0, s.Length - 1);
return s + extension;
}
Sample Output
"Test........jpg" -> "Test.jpg"
"Test....1.jpg" -> "Test 1.jpg"
"Test 1.0.1.jpg" -> "Test 1.0.1.jpg"
"Test..jpg" -> "Test.jpg"
void ReplaceConsecutive(string src, int lenght, string replace)
{
char last;
int count = 0;
StringBuilder ret = new StringBuilder();
StringBuilder add = new StringBuilder();
foreach (char now in src)
{
if (now == last)
{
add.Append(now);
if (count > lenght)
{
ret.Append(replace);
add = new StringBuilder();
}
count++;
}
else
{
ret.Append(add);
add = new StringBuilder();
count = 0;
ret.Append(now);
}
}
return ret.ToString();
}
Untested, but this should work.
src is the string you want to check for consecutives, lenght is the number of equal chars followed by each other until they get replaced with replace.
This is AFAIK also possible in Regex, but I'm not that good with Regex's that I could do this.

Categories