How can I find a phrase anywhere in a String Array? - c#

I need to see if any phrase, such as "duckbilled platypus", appears in a string array.
In the case I'm testing, the phrase does exist in the string list.
Yet when I look for that phrase, the search fails to find it. I never get past the "if (found)" gauntlet in the code below.
Here is the code that I'm using to traverse the contents of one doc to see if any phrase (two words or more) is found in both documents:
private void FindAndStorePhrasesFoundInBothDocs()
{
    string[] doc1StrArray;
    string[] doc2StrArray;
    slPhrasesFoundInBothDocs = new List<string>();
    slAllDoc1Words = new List<string>();
    int iCountOfWordsInDoc1 = 0;
    int iSearchStartIndex = 0;
    int iSearchEndIndex = 1;
    string sDoc1PhraseToSearchForInDoc2;
    string sFoundPhrase;
    bool found;
    int iLastWordIndexReached = iSearchEndIndex;
    try
    {
        doc1StrArray = File.ReadAllLines(sDoc1Path, Encoding.UTF8);
        doc2StrArray = File.ReadAllLines(sDoc2Path, Encoding.UTF8);
        foreach (string line in doc1StrArray)
        {
            string[] subLines = line.Split();
            foreach (string whirred in subLines)
            {
                if (String.IsNullOrEmpty(whirred)) continue;
                slAllDoc1Words.Add(whirred);
            }
        }
        iCountOfWordsInDoc1 = slAllDoc1Words.Count();
        sDoc1PhraseToSearchForInDoc2 = slAllDoc1Words[iSearchStartIndex] + ' ' + slAllDoc1Words[iSearchEndIndex];
        while (iLastWordIndexReached < iCountOfWordsInDoc1 - 1)
        {
            sFoundPhrase = string.Empty;
            // Search for the phrase from doc1 in doc2;
            found = doc2StrArray.Contains(sDoc1PhraseToSearchForInDoc2);
            if (found)
            {
                sFoundPhrase = sDoc1PhraseToSearchForInDoc2;
                iSearchEndIndex++;
                sDoc1PhraseToSearchForInDoc2 = sDoc1PhraseToSearchForInDoc2 + ' ' + slAllDoc1Words[iSearchEndIndex];
            }
            else //if not found, inc vals of BOTH int args and, if sFoundPhrase not null, assign to sDoc1PhraseToSearchForInDoc2 again.
            {
                iSearchStartIndex = iSearchEndIndex;
                iSearchEndIndex = iSearchStartIndex + 1;
                if (!string.IsNullOrWhiteSpace(sFoundPhrase)) // add the previous found phrase if there was one
                {
                    slPhrasesFoundInBothDocs.Add(sFoundPhrase);
                }
                sDoc1PhraseToSearchForInDoc2 = slAllDoc1Words[iSearchStartIndex] + ' ' + slAllDoc1Words[iSearchEndIndex];
            } // if/else
            iLastWordIndexReached = iSearchEndIndex;
        } // while
    } // try
    catch (Exception ex)
    {
        MessageBox.Show("FindAndStorePhrasesFoundInBothDocs(); iSearchStartIndex = " + iSearchStartIndex.ToString() + "iSearchEndIndex = " + iSearchEndIndex.ToString() + " iLastWordIndexReached = " + iLastWordIndexReached.ToString() + " " + ex.Message);
    }
}
doc2StrArray does contain the phrase sought, so why does doc2StrArray.Contains(sDoc1PhraseToSearchForInDoc2) fail?

This should do what you want:
found = Array.Exists(doc2StrArray, s => s.Contains(sDoc1PhraseToSearchForInDoc2));

On a List<T> (or an array), Contains() looks for an element equal to the given T. In your code, found is true only when an entire element of the array equals the phrase, not when the phrase is merely part of an element.
Try this:
var _list = doc2StrArray.ToList();
var found = _list.FirstOrDefault( w => w.Contains( sDoc1PhraseToSearchForInDoc2 ) ) != null;
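
The difference between the two kinds of Contains can be sketched as follows. This is only an illustration, not the asker's code: the class name PhraseSearch and both method names are hypothetical.

```csharp
using System;
using System.Linq;

// Hypothetical helper class, named only for this illustration.
public static class PhraseSearch
{
    // Enumerable.Contains on a string[] compares whole elements:
    // true only if some line EQUALS the phrase.
    public static bool AnyLineEquals(string[] lines, string phrase)
    {
        return lines.Contains(phrase);
    }

    // Any + String.Contains checks each line for the phrase as a substring.
    public static bool AnyLineContains(string[] lines, string phrase)
    {
        return lines.Any(line => line.Contains(phrase));
    }
}
```

For an array holding the single line "The duckbilled platypus swims.", AnyLineEquals(lines, "duckbilled platypus") is false while AnyLineContains(lines, "duckbilled platypus") is true. Both checks work line by line and are case-sensitive, so a phrase that wraps from one line to the next is not found by either.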

Related

Getting a 'System.StackOverflowException' error

I am getting this error: An unhandled exception of type 'System.StackOverflowException' occurred in mscorlib.dll. I know you are not supposed to have an infinite loop, but this is not an infinite loop, because it only has to run until it reaches a file number that has not been created yet. How can I go about this in a better way?
private int x = 0;
public string clients = @"F:\Internal Jobs\Therm-Air Files\Program\P-1-2.0\Clients\";
public string tdate = DateTime.Today.ToString("MM-dd-yy");
public void saveloop()
{
    string path = LoadPO.Text.Substring(0, LoadPO.Text.LastIndexOf("\\"));
    string name = Path.GetFileName(path);
    string t = Convert.ToString(x);
    if (!File.Exists(path + @"\" + name + ".xlsx")) // This Line throws error
    {
        oSheet.SaveAs(path + @"\" + name + "-" + t + ".xlsx");
        string prop = /* snipped for space reasons, just string concats */
        string Combine = string.Empty;
        int b = 0;
        int c = cf.cPanel.Controls.Count;
        string[] items = new string[c];
        foreach (WProduct ewp in cf.cPanel.Controls)
        {
            string item = /* snipped for space reasons, just string concats */
            items[b] = item;
            b += 1;
        }
        Combine = prop + "^<";
        foreach (var strings in items)
        {
            Combine += strings + "<";
        }
        File.WriteAllText(path + @"\" + name + ".txt", Combine);
    }
    else
    {
        x += 1;
        saveloop();
    }
}
The code above fails because the counter (x, converted to t) is not part of the file name passed to File.Exists. No matter how often you increment it, the name being checked never changes, so saveloop keeps calling itself until the stack overflows.
You need to separate the creation of the file name from the code that does the writing. Think of it as writing code in blocks of functionality.
public static string GetFileName(string path, string name)
{
    var fileName = $@"{path}\{name}.xlsx";
    int i = 0;
    while (System.IO.File.Exists(fileName))
    {
        i++;
        fileName = $@"{path}\{name}({i}).xlsx";
    }
    return fileName;
}
public void saveloop()
{
    var fileName = GetFileName(path, name);
    // use fileName from this point on
}

Openxml in C# updating only the first MERGEFIELD in a paragraph

I have approximately 10 MERGEFIELDs in a document, and I'm trying to replace the text of each with some value. Here's the code.
using (WordprocessingDocument document = WordprocessingDocument.Open(destinationFileName, true))
{
document.ChangeDocumentType(DocumentFormat.OpenXml.WordprocessingDocumentType.Document);
MainDocumentPart docPart = document.MainDocumentPart;
docPart.AddExternalRelationship("http://schemas.openxmlformats.org/officeDocument/2006/relationships/attachedTemplate", new Uri(destinationFileName, UriKind.RelativeOrAbsolute));
docPart.Document.Save();
IEnumerable<FieldChar> fldChars = document.MainDocumentPart.RootElement.Descendants<FieldChar>();
if (fldChars == null) { return; }
string fieldList = string.Empty;
FieldChar fldCharStart = null;
FieldChar fldCharEnd = null;
FieldChar fldCharSep = null;
FieldCode fldCode = null;
string fldContent = String.Empty;
int i = 1;
foreach(var fldChar in fldChars)
{
System.Diagnostics.Debug.WriteLine(i + ": " + fldChar);
i++;
string fldCharPart = fldChar.FieldCharType.ToString();
System.Diagnostics.Debug.WriteLine("Field Char Length: " + fldChar.Count());
System.Diagnostics.Debug.WriteLine("Field Char part: " + fldCharPart);
switch(fldCharPart)
{
case "begin": // start of the field
fldCharStart = fldChar;
System.Diagnostics.Debug.WriteLine("Field Char Start: " + fldCharStart);
// get the field code, which will be an instrText element
// either as sibling or as a child of the parent sibling
fldCode = fldCharStart.Parent.Descendants<FieldCode>().FirstOrDefault();
System.Diagnostics.Debug.WriteLine("Field Code: " + fldCode);
if (fldCode == null)
{
fldCode = fldCharStart.Parent.NextSibling<Run>().Descendants<FieldCode>().FirstOrDefault();
System.Diagnostics.Debug.WriteLine("New Field Code: " + fldCode);
}
if (fldCode != null && fldCode.InnerText.Contains("MERGEFIELD"))
{
fldContent = getFieldValue(query, prescriber, fldCode.InnerText);
fieldList += fldContent + "\n";
System.Diagnostics.Debug.WriteLine("Field content: " + fldContent);
}
break;
case "end": // end of the field
fldCharEnd = fldChar;
System.Diagnostics.Debug.WriteLine("Field char end: " + fldCharEnd);
break;
case "separate": // complex field with text result
fldCharSep = fldChar;
break;
default:
break;
}
if((fldCharStart != null) && (fldCharEnd != null))
{
if(fldCharSep != null)
{
Text elemText = (Text)fldCharSep.Parent.NextSibling().Descendants<Text>().FirstOrDefault();
elemText.Text = fldContent;
System.Diagnostics.Debug.WriteLine("Element text: " + elemText);
// Delete all field chars with their runs
DeleteFieldChar(fldCharStart);
DeleteFieldChar(fldCharEnd);
DeleteFieldChar(fldCharSep);
fldCode.Remove();
}
else
{
Text elemText = new Text(fldContent);
fldCode.Parent.Append(elemText);
fldCode.Remove();
System.Diagnostics.Debug.WriteLine("Element Text !sep: " + elemText);
DeleteFieldChar(fldCharStart);
DeleteFieldChar(fldCharEnd);
DeleteFieldChar(fldCharSep);
}
fldCharStart = null;
fldCharEnd = null;
fldCharSep = null;
fldCode = null;
fldContent = string.Empty;
}
}
System.Diagnostics.Debug.WriteLine("Field list: " + fieldList);
}
It works to some extent. The problem arises when there is more than one field in a paragraph. This document has about 4 merge fields in one paragraph and one field in each paragraph after that. Only the first merge field in a paragraph is updated; the remaining fields in that paragraph are left untouched. Then the code moves to the next paragraph and looks for its field. How can I fix this?
It looks like you are overcomplicating a simple mail merge replacement. Instead of looping through paragraphs, you could get all the merge fields in the document and replace them.
private const string FieldDelimeter = " MERGEFIELD ";
foreach (FieldCode field in doc.MainDocumentPart.RootElement.Descendants<FieldCode>())
{
    var fieldNameStart = field.Text.LastIndexOf(FieldDelimeter, System.StringComparison.Ordinal);
    var fieldName = field.Text.Substring(fieldNameStart + FieldDelimeter.Length).Trim();
    foreach (Run run in doc.MainDocumentPart.Document.Descendants<Run>())
    {
        foreach (Text txtFromRun in run.Descendants<Text>().Where(a => a.Text == "«" + fieldName + "»"))
        {
            txtFromRun.Text = "Replace what the merge field here";
        }
    }
}
doc.MainDocumentPart.Document.Save();
Here doc is of type WordprocessingDocument.
This will replace all merge fields regardless of the number of fields in a paragraph.

What is the code for renaming non-space filename

I have a document named erwin_01problem.doc. If the file name has no space between 01 and problem, I want to find the index of problem and insert a space there, so that the output is erwin_01 problem.doc.
Here is the code I tried, but it still doesn't put a space between 01 and problem.
if (!string.IsNullOrEmpty(job.ProblemPath))
{
    job.HasProblemFile = true;
    var problemDocFname = Path.GetFileName(job.ProblemPath);
    if (!Regex.IsMatch(problemDocFname, @"\sproblem\.doc$"))
    {
        ProgM.JobStatus = "Checking space between filename and problem...";
        Thread.Sleep(1000);
        problemDocFname = problemDocFname.Insert(problemDocFname.IndexOf("problem.doc", StringComparison.Ordinal), " ");
        //problemDocFname = problemDocFname.Replace("problem", " problem");
    }
    problemDocFname = Path.Combine(job.FilePath, problemDocFname);
    var docProblemCount = 0;
    ProgM.JobStatus = "Correcting the Format of Problem Doc...";
    Thread.Sleep(1000);
    MicrosoftWord.CorrectProblemDocFormatting(problemDocFname, ref docProblemCount);
}
jobs.Add(job);
I believe you don't really need regular expressions. You can just do something like:
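string key = "problem.doc";
// Skip names that already have the space, and names that consist of the key only.
if (problemDocFname.EndsWith(key) && !problemDocFname.EndsWith(" " + key) && problemDocFname.Length > key.Length)
{
    // Strings are immutable: Replace returns a new string, so assign the result.
    problemDocFname = problemDocFname.Replace(key, " " + key);
}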
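
The same idea can be sketched with the Path helpers, so that the extension is not hard-coded. This is a minimal sketch under stated assumptions: the class ProblemFileName and the method EnsureSpaceBeforeProblem are hypothetical names, and the input is assumed to be a bare file name (as returned by Path.GetFileName), not a full path.

```csharp
using System;
using System.IO;

// Hypothetical helper, named only for this illustration.
public static class ProblemFileName
{
    // Inserts a space before a trailing "problem" in the file name
    // (extension excluded) when no whitespace precedes it already.
    // Assumes a bare file name without directory parts.
    public static string EnsureSpaceBeforeProblem(string fileName)
    {
        const string key = "problem";
        string stem = Path.GetFileNameWithoutExtension(fileName);
        string extension = Path.GetExtension(fileName);

        bool endsWithKey = stem.EndsWith(key, StringComparison.Ordinal);
        bool hasPrefix = stem.Length > key.Length;
        if (!endsWithKey || !hasPrefix)
        {
            return fileName; // nothing to fix
        }

        int keyStart = stem.Length - key.Length;
        if (char.IsWhiteSpace(stem[keyStart - 1]))
        {
            return fileName; // already separated
        }

        // Insert returns a new string; the original is left unchanged.
        return stem.Insert(keyStart, " ") + extension;
    }
}
```

With this sketch, EnsureSpaceBeforeProblem("erwin_01problem.doc") yields "erwin_01 problem.doc", while a name that already contains the space, or one that is just "problem.doc", is returned unchanged.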

XPATH Input Type Radio doesn't work c#

This is my code:
private DataTable ParseTable(string html)
{
HtmlDocument doc = new HtmlDocument();
DataTable dt = new DataTable();
String[] datasc;
String[] valueTemp = new String[30];
int index;
doc.LoadHtml("<table><tr><td><p><input id=\"ControlGroupScheduleSelectView_AvailabilityInputScheduleSelectView_RadioButtonMkt1Fare7\" type=\"radio\" name=\"ControlGroupScheduleSelectView$AvailabilityInputScheduleSelectView$market1\" value=\"0~N~~N~RGFR~~1~X|QG~ 885~ ~~BTH~05/19/2014 07:00~KNO~05/19/2014 08:20~\" />Rp.445,000 ( N/Cls;4 )</p></td></tr></table>");
for (int z = 0; z < 4; z++)
{
var getInputSchedule = doc.DocumentNode.SelectNodes("//table//input");
datasc = new String[getInputSchedule.Count];
for (int i = 0; i < getInputSchedule.Count; i = i+1)
{
string removeClassFare = string.Empty;
String[] selectValueSplit = getInputSchedule[i].Attributes["value"].Value.Split('|');
valueTemp[i] = selectValueSplit[1];
String[] getAlphaSC = selectValueSplit[0].Split('~');
try
{
index = getInputSchedule[i].ParentNode.InnerText.IndexOf("(");
if (index != -1)
{
removeClassFare = getInputSchedule[i].ParentNode.InnerText.Substring(0, index);
removeClassFare = System.Text.Encoding.ASCII.GetString(System.Text.Encoding.ASCII.GetBytes(removeClassFare)).Replace("??", "").Replace("Rp.", "").Trim();
}
}
catch (Exception e) {
//removeClassFare = getInputSchedule[i].ParentNode.InnerText;
}
if (!dt.Columns.Contains(getAlphaSC[1]))
{
dt.Columns.Add(getAlphaSC[1], typeof(string));
}
if (i == 0)
{
datasc[i] = "<div align=\"center\"><input <input onclick='faredetail(this.value, this.name)' id=\"" + getInputSchedule[i].Attributes["id"].Value + "\" type=\"radio\" value=\"" + getInputSchedule[i].Attributes["value"].Value + "\" name=\"" + getInputSchedule[i].Attributes["name"].Value + "\"><br>" + removeClassFare + "</div>";
}
else
{
if (selectValueSplit[1].Equals(valueTemp[i - 1],StringComparison.Ordinal))
{
datasc[i] = "<div align=\"center\"><input <input onclick='faredetail(this.value, this.name)' id=\"" + getInputSchedule[i].Attributes["id"].Value + "\" type=\"radio\" value=\"" + getInputSchedule[i].Attributes["value"].Value + "\" name=\"" + getInputSchedule[i].Attributes["name"].Value + "\"><br>" + removeClassFare + "</div>";
}
else
{
break;
}
}
getInputSchedule[i].Remove();
}
datasc = datasc.Where(x => !string.IsNullOrEmpty(x)).ToArray();
dt.Rows.Add(datasc);
}
return dt;
}
When I run this, I get the error message "Object reference not set to an instance of an object." But if I remove the id attribute from the element, like this:
doc.LoadHtml("<table><tr><td><p><input type=\"radio\" name=\"ControlGroupScheduleSelectView$AvailabilityInputScheduleSelectView$market1\" value=\"0~N~~N~RGFR~~1~X|QG~ 885~ ~~BTH~05/19/2014 07:00~KNO~05/19/2014 08:20~\">Rp.445,000 ( N/Cls;4 )</p></td></tr></table>");
everything works OK.
Why does the id attribute cause my XPath to fail?
Please help. Thank you.
I stand corrected. SelectNodes does return null if it can't find any nodes.
But the behavior you are witnessing has nothing to do with the id attribute (in fact, removing the id attribute causes an exception to happen sooner), and everything to do with your code.
At the end of your inner loop, you are doing this:
getInputSchedule[i].Remove();
which removes the <input> element from the HTML document.
Your outer loop is set up to execute four times, so the second time it executes, the input element is already gone, and doc.DocumentNode.SelectNodes("//table//input") returns null, and that is the cause of your error.
I'm not really sure why you're removing the input elements from the document as you go through it, or why you're looping through the whole thing 4 times, but hopefully that gets you going in the right direction.

replacing text in a text file with \r\n

I am currently building an agenda with extra options.
For testing purposes I store the data in a simple .txt file
(later it will be connected to the agenda of a virtual assistant).
I have a problem changing or deleting text in this .txt file.
Although the part of the content that needs to be replaced and the search string are exactly the same, the text in content is not replaced.
Code:
Change method
public override void Change(List<object> oldData, List<object> newData)
{
int index = -1;
for (int i = 0; i < agenda.Count; i++)
{
if(agenda[i].GetType() == "Task")
{
Task t = (Task)agenda[i];
if(t.remarks == oldData[0].ToString() && t.datetime == (DateTime)oldData[1] && t.reminders == oldData[2])
{
index = i;
break;
}
}
}
string search = "Task\r\nTo do: " + oldData[0].ToString() + "\r\nDateTime: " + (DateTime)oldData[1] + "\r\n";
reminders = (Dictionary<DateTime, bool>) oldData[2];
if(reminders.Count != 0)
{
search += "Reminders\r\n";
foreach (KeyValuePair<DateTime, bool> rem in reminders)
{
if (rem.Value)
search += "speak " + rem.Key + "\r\n";
else
search += rem.Key + "\r\n";
}
}
// get new data
string newRemarks = (string)newData[0];
DateTime newDateTime = (DateTime)newData[1];
Dictionary<DateTime, bool> newReminders = (Dictionary<DateTime, bool>)newData[2];
string replace = "Task\r\nTo do: " + newRemarks + "\r\nDateTime: " + newDateTime + "\r\n";
if(newReminders.Count != 0)
{
replace += "Reminders\r\n";
foreach (KeyValuePair<DateTime, bool> rem in newReminders)
{
if (rem.Value)
replace += "speak " + rem.Key + "\r\n";
else
replace += rem.Key + "\r\n";
}
}
Replace(search, replace);
if (index != -1)
{
remarks = newRemarks;
datetime = newDateTime;
reminders = newReminders;
agenda[index] = this;
}
}
Replace method
private void Replace(string search, string replace)
{
StreamReader reader = new StreamReader(path);
string content = reader.ReadToEnd();
reader.Close();
content = Regex.Replace(content, search, replace);
content.Trim();
StreamWriter writer = new StreamWriter(path);
writer.Write(content);
writer.Close();
}
When running in debug I get the correct info:
content "-- agenda --\r\n\r\nTask\r\nTo do: test\r\nDateTime: 16-4-2012 15:00:00\r\nReminders:\r\nspeak 16-4-2012 13:00:00\r\n16-4-2012 13:30:00\r\n\r\nTask\r\nTo do: testing\r\nDateTime: 16-4-2012 9:00:00\r\nReminders:\r\nspeak 16-4-2012 8:00:00\r\n\r\nTask\r\nTo do: aaargh\r\nDateTime: 18-4-2012 12:00:00\r\nReminders:\r\n18-4-2012 11:00:00\r\n" string
search "Task\r\nTo do: aaargh\r\nDateTime: 18-4-2012 12:00:00\r\nReminders\r\n18-4-2012 11:00:00\r\n" string
replace "Task\r\nTo do: aaargh\r\nDateTime: 18-4-2012 13:00:00\r\nReminders\r\n18-4-2012 11:00:00\r\n" string
But it doesn't change the text. How do I make sure that the Regex.Replace finds the right piece of content?
PS. I did check several topics on this, but none of the solutions mentioned there work for me.
You missed a : right after Reminders. Just check it again :)
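
In the debug output above, content holds "Reminders:\r\n" while search holds "Reminders\r\n", so the two texts are not the same and nothing is replaced. Separately, because search is meant as literal text rather than a pattern, a plain string replacement (or an escaped pattern) avoids surprises from regex metacharacters such as "." in other date formats. A minimal sketch, assuming hypothetical names LiteralReplace, Plain and Escaped:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, named only for this illustration.
public static class LiteralReplace
{
    // Replaces literal text; no character in 'search' is treated as a pattern.
    public static string Plain(string content, string search, string replacement)
    {
        return content.Replace(search, replacement);
    }

    // Same result through Regex, with the search text escaped first.
    public static string Escaped(string content, string search, string replacement)
    {
        // "$" is doubled so the replacement is not read as a substitution such as $1.
        return Regex.Replace(content, Regex.Escape(search), replacement.Replace("$", "$$"));
    }
}
```

Either method leaves content untouched when search lacks the colon, and replaces the block once search reads "Reminders:\r\n" exactly as the file does.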
You could try using a StringBuilder to build up what you want to write out to the file.
I just knocked up a quick example in a console app; it appears to work for me, and I think it might be what you are looking for.
StringBuilder sb = new StringBuilder();
sb.Append("Tasks\r\n");
sb.Append("\r\n");
sb.Append("\tTask 1 details");
Console.WriteLine(sb.ToString());
StreamWriter writer = new StreamWriter("Tasks.txt");
writer.Write(sb.ToString());
writer.Close();
