Read a Word file, identify headings, and get the content - C#

I want to compare each heading in a Word file with a string and, if it matches, display the content under that heading.
Suppose a Word file contains 2-4 paragraphs, each with a heading. I want to compare each heading with the string and display the matching content using C#.

I'm not sure how far you have gotten on the project, but this is how you can compare the heading to your selected string and then display the matching heading text.
char[] separ = new char[] { ' ' };
// Split both strings into words, dropping empty entries caused by repeated spaces.
string[] yourSelectedHeaderText = YourString.Split(separ, StringSplitOptions.RemoveEmptyEntries);
string[] docHeader = HeaderString.Split(separ, StringSplitOptions.RemoveEmptyEntries);
// Compare only as many words as both arrays have, to avoid an IndexOutOfRangeException.
for (int i = 0; i < Math.Min(docHeader.Length, yourSelectedHeaderText.Length); i++) {
    if (docHeader[i] == yourSelectedHeaderText[i]) {
        Console.WriteLine(docHeader[i]);
    }
}
The first step is to put everything into arrays or some other collection you can iterate over. The loop then walks through the heading words one by one, and the if statement inside it catches each word that matches the corresponding word of your selected string. The line inside the if statement writes the matching word to the console.
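To show the matching step end to end, here is a minimal sketch that looks up a heading and returns the content stored under it. It assumes the headings and their paragraphs have already been extracted from the Word file into heading/content pairs; the `HeadingLookup` class, the `FindContent` method and the sample data are hypothetical names made up for illustration, and reading the .docx itself is not shown.

```csharp
using System;
using System.Collections.Generic;

public static class HeadingLookup
{
    // Returns the content stored under the first heading that equals `wanted`
    // (ignoring case and surrounding whitespace), or null if no heading matches.
    public static string FindContent(IEnumerable<KeyValuePair<string, string>> sections, string wanted)
    {
        foreach (var section in sections)
        {
            if (string.Equals(section.Key.Trim(), wanted.Trim(), StringComparison.OrdinalIgnoreCase))
            {
                return section.Value;
            }
        }
        return null;
    }

    public static void Main()
    {
        // Hypothetical data standing in for headings and paragraphs read from the Word file.
        var sections = new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("Introduction", "This document describes the project."),
            new KeyValuePair<string, string>("Installation", "Run the installer and follow the prompts."),
        };

        Console.WriteLine(FindContent(sections, "installation") ?? "No matching heading.");
    }
}
```

Comparing whole headings with `StringComparison.OrdinalIgnoreCase` is a design choice of this sketch; a word-by-word comparison like the one above would work on the same data.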

Related

split string with comma and newspace in 2d or jagged array and access items in c#

Hi there, I have a text file like this, with many lines and 2 columns:
CF7CED1BF035345269118A15EF2D45A06, product1
CF7CED1BF035345269118A15EF2D45A09, product2
....
...
...
...
I need to split this and access each field. More precisely, I need a loop that creates many files named product1.txt, product2.txt, etc., each containing the code to the left of its product name.
So I need to create files whose names come from column 2 of each line, with the value of column 1 of that line as the content.
I know how to do basic things with arrays, like reading all lines and storing them, but I don't know how to write a loop that reads field 1 and then field 2 of line 1, creates the file and stores the value (I know how to read and save to a file), then moves on to line 2, reads field 1 and field 2 again, and so on.
Someone suggested that using jagged arrays would be faster than 2D arrays.
"When you use ReadLines, you can start enumerating the collection of
strings before the whole collection is returned; when you use
ReadAllLines, you must wait for the whole array of strings be
returned before you can access the array. Therefore, when you are
working with very large files, ReadLines can be more efficient."
string path_read = @"c:\read\file.txt";
// Path to save resulting files.
string path = @"c:\temp\";
char[] comma = new char[] { ',' };
// ASSUMPTION: every row has 2 comma-separated values.
// File.ReadLines streams the file line by line instead of loading it all at once.
foreach (var currentLine in File.ReadLines(path_read))
{
    string[] itemArray = currentLine.Split(comma, StringSplitOptions.RemoveEmptyEntries);
    // itemArray now holds the 2 column values from the current row.
    // Trim them, because the sample data has a space after the comma.
    File.WriteAllText(path + itemArray[1].Trim() + ".txt", itemArray[0].Trim(), Encoding.UTF8);
}
Do you need to keep the contents for any further use? If the intent is only to read the contents and save them to separate files, there is no need for a separate array.
using (var reader = new StreamReader(@"input.txt"))
{
    while (!reader.EndOfStream)
    {
        var inputText = reader.ReadLine();
        var splitText = inputText.Split(',');
        // Trim both fields so the space after the comma does not end up in the file name.
        File.AppendAllLines(splitText[1].Trim() + ".txt", new List<string> { splitText[0].Trim() });
    }
}

Search text file for text above which pattern matches input

I am trying to make my program display the nearest line above the input text that matches a pattern I set.
For example, if the user inputs "FastModeIdleImmediateCount"=dword:00000000, I should get the closest HKEY line above it, which in this case is [HKEY_CURRENT_CONFIG\System\CurrentControlSet\Enum\SCSI\Disk&Ven_ATA&Prod_TOSHIBA_MQ01ABD0\4&6a0976b&0&000000].
[HKEY_CURRENT_CONFIG\System\CurrentControlSet\Enum\SCSI\Disk&Ven_ATA&Prod_TOSHIBA_MQ01ABD0\4&6a0976b&0&000000]
"StandardModeIdleImmediateCount"=dword:00000000
"FastModeIdleImmediateCount"=dword:00000000
[HKEY_CURRENT_CONFIG\System\CurrentControlSet\SERVICES]
[HKEY_CURRENT_CONFIG\System\CurrentControlSet\SERVICES\TSDDD]
[HKEY_CURRENT_CONFIG\System\CurrentControlSet\SERVICES\TSDDD\DEVICE0]
"Attach.ToDesktop"=dword:00000001
Could anyone please show me how I can code something like that? I tried playing around with regular expressions to match text in brackets, but I am not sure how to make it search only the text above my input.
I'm assuming your file is a .txt file, although it most probably isn't; the logic is the same either way.
It is not hard at all; a simple for loop does the trick.
Code with explanatory comments:
string[] lines = File.ReadAllLines(@"d:\test.txt"); // replace with your own path; reads all lines of the text file
string inputToSearchFor = "\"FastModeIdleImmediateCount\"=dword:00000000"; // the string to search for
int indexOfMatchingLine = Array.FindIndex(lines, line => line == inputToSearchFor); // index of the line equal to the input, or -1 if there is none
string nearestKey = String.Empty;
for (int i = indexOfMatchingLine; i >= 0; i--) // walk upwards from the matched line; the loop is skipped when the index is -1
{
    if (lines[i].StartsWith("[HKEY_", StringComparison.Ordinal)) // a line beginning with "[HKEY_" is a registry key
    {
        nearestKey = lines[i]; // remember the key
        break; // stop at the first (nearest) key
    }
}
if (nearestKey != String.Empty) // a key was found above the input line
{
    //add code...
}
You could split the text into lines, find the index of the line that contains your text (whether you use an exact match or a regex doesn't matter), and then search backwards from that index for the first key. Reversing the order of the lines first (not sorting them) might make that search easier to write.
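The backward search described above can be sketched with `Array.IndexOf` and `Array.FindLastIndex`, which scans from a start index towards the beginning of the array. The `RegistryExport` class, the `FindNearestKey` method and the shortened sample lines are hypothetical names and data for illustration; an exact, case-sensitive line match is an assumption of this sketch.

```csharp
using System;

public static class RegistryExport
{
    // Returns the nearest line starting with "[HKEY_" at or above the first line
    // equal to `input`, or null when the input line or a key line is not found.
    public static string FindNearestKey(string[] lines, string input)
    {
        int match = Array.IndexOf(lines, input);
        if (match < 0)
        {
            return null;
        }

        // FindLastIndex searches backwards, starting at `match` and ending at index 0.
        int key = Array.FindLastIndex(lines, match,
            line => line.StartsWith("[HKEY_", StringComparison.Ordinal));
        return key < 0 ? null : lines[key];
    }

    public static void Main()
    {
        // Shortened, made-up sample in the same shape as the export in the question.
        string[] lines =
        {
            @"[HKEY_CURRENT_CONFIG\System\Example\Disk]",
            "\"StandardModeIdleImmediateCount\"=dword:00000000",
            "\"FastModeIdleImmediateCount\"=dword:00000000",
            @"[HKEY_CURRENT_CONFIG\System\Example\SERVICES]",
        };

        Console.WriteLine(FindNearestKey(lines, "\"FastModeIdleImmediateCount\"=dword:00000000"));
    }
}
```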

C# so I need to split out a string, I think

So I have this application that I inherited from someone who is long gone. The gist of the application is that it reads in a .csv file of about 5800 lines and copies it to another .csv file, which it creates anew each time, after stripping out a few characters (#, ', &). Everything worked great until about a month ago. I started checking into it, and what I have found so far is that about 131 items are missing from the spreadsheet. I read someplace that the maximum amount of data a string can hold is over 1,000,000,000 chars, and my spreadsheet is way under that, at around 800,000 chars, but the only culprit I can think of is the string object.
So anyway, here is the code in question. This piece appears to both read in from the existing file and output to the new file:
StreamReader s = new StreamReader(File);
//Read the rest of the data in the file.
string AllData = s.ReadToEnd();
//Split off each row at the Carriage Return/Line Feed
//Default line ending in most windows exports.
//You may have to edit this to match your particular file.
//This will work for Excel, Access, etc. default exports.
string[] rows = AllData.Split("\r\n".ToCharArray(), System.StringSplitOptions.RemoveEmptyEntries);
//Now add each row to the DataSet
foreach (string r in rows)
{
//Split the row at the delimiter.
string[] items = r.Split(delimiter.ToCharArray());
//Add the item
result.Rows.Add(items);
}
If anyone can help me I would really appreciate it. I either need to figure out how to split the data better, or I need to figure out why the last 131 lines are being cut when copying from the existing Excel file to the new one.
One easier way to do this, since you're using "\r\n" for lines, would be to just use the built-in line reading method: File.ReadLines(path)
foreach(var line in File.ReadLines(path))
{
var items = line.Split(',');
result.Rows.Add(items);
}
You may want to check out the TextFieldParser class, which is part of the Microsoft.VisualBasic.FileIO namespace (yes, you can use this with C# code)
Something along the lines of:
using(var reader = new TextFieldParser("c:\\path\\to\\file"))
{
//configure for a delimited file
reader.TextFieldType = FieldType.Delimited;
//configure the delimiter character (comma)
reader.Delimiters = new[] { "," };
while(!reader.EndOfData)
{
string[] row = reader.ReadFields();
//do stuff
}
}
This class can help with some of the issues of splitting a line into its fields, when the field may contain the delimiter.
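As an illustration of that last point, here is a small sketch that parses a field containing the delimiter. It reads from an in-memory string through a `StringReader` instead of a file so that it is self-contained; the `CsvDemo` class, the `ParseCsv` method and the sample rows are made up for this example, and it assumes the project can reference the Microsoft.VisualBasic assembly.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class CsvDemo
{
    // Parses comma-delimited text into rows of fields, honouring double-quoted fields.
    public static List<string[]> ParseCsv(string text)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(new StringReader(text)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.Delimiters = new[] { "," };
            // Quoted fields may contain the delimiter without being split.
            parser.HasFieldsEnclosedInQuotes = true;

            while (!parser.EndOfData)
            {
                rows.Add(parser.ReadFields());
            }
        }
        return rows;
    }

    public static void Main()
    {
        // Made-up sample: the second field of the first row contains a comma.
        string sample = "1001,\"Widget, large\",4.50\r\n1002,Bolt,0.10\r\n";

        foreach (string[] row in ParseCsv(sample))
        {
            Console.WriteLine(row.Length + " fields: " + string.Join(" | ", row));
        }
    }
}
```

A plain `line.Split(',')` would break the quoted field in the first sample row into two pieces, which is the kind of problem this class avoids.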

Split string from text file

I'm trying to convert strings from a text file to keys, and I need to split the text.
For example:
C# code:
string[] controls = File.ReadAllLines(FilePath);
Keys moveUp = (Keys)Enum.Parse(typeof(Keys),controls[1].Split("|", StringSplitOption.None), true);
In the text file, at line [1], I have:
moveUp |W;
I want to set the character W as the key.
Thanks for replying, and sorry if my English looks weird.
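If you are interested in the string after |, then this:
controls[1].Split("|", StringSplitOption.None)
should be replaced with this:
controls[1].Split('|')[1]
The trailing [1] returns the second element (index 1) of the array created by Split(). Enum.Parse expects a single string, not the whole array. With the sample line that element is "W;", so the trailing ; still has to be removed (for example with TrimEnd(';')) before parsing.
If you are trying to read from the first line of the file, then controls[1] should be controls[0], because arrays are zero-based.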
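Putting those steps together, here is a minimal sketch of parsing one binding line. It uses `System.ConsoleKey` as a stand-in for the Windows Forms `Keys` enum so that it runs in a plain console project; that substitution, the `KeyBindingParser` class and the `ParseBinding` method are assumptions made for illustration. With `Keys`, the same `Enum.Parse` call applies with `typeof(Keys)`.

```csharp
using System;

public static class KeyBindingParser
{
    // Parses a line such as "moveUp |W;" and returns the key named after the '|'.
    public static ConsoleKey ParseBinding(string line)
    {
        string[] parts = line.Split('|');
        if (parts.Length < 2)
        {
            throw new FormatException("Expected 'name |key;' but got: " + line);
        }

        // Drop surrounding whitespace and the trailing ';' before parsing.
        string keyName = parts[1].Trim().TrimEnd(';').Trim();

        // The third argument makes the match case-insensitive.
        return (ConsoleKey)Enum.Parse(typeof(ConsoleKey), keyName, true);
    }

    public static void Main()
    {
        Console.WriteLine(ParseBinding("moveUp |W;"));
    }
}
```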

"\r\n" appears as small square boxes in a Word document, C#

I am appending some text containing '\r\n' to a Word document at run-time.
But when I open the Word document, the line breaks have been replaced with small square boxes :-(
I tried replacing them with System.Environment.NewLine, but I still see the small boxes.
Any idea?
The answer is to use \v (vertical tab); Word treats it as a manual line break.
Have you tried using one or the other in isolation, i.e. only \r or only \n? Word interprets a carriage return and a line feed separately, so it handles the pair differently from a plain text editor. The only time you would use Environment.NewLine is in a pure ASCII text file. You could even try a Ctrl+M sequence. Try that and, if it does not work, please post the code.
Word uses the <w:br/> XML element for line breaks.
After much trial and error, here is a function that sets the text for a Word XML node, and takes care of multiple lines:
//Sets the text for a Word XML <w:t> node
//If the text is multi-line, it replaces the single <w:t> node for multiple nodes
//Resulting in multiple Word XML lines
private static void SetWordXmlNodeText(XmlDocument xmlDocument, XmlNode node, string newText)
{
//Is the text a single line or multiple lines?
//Check for either character, since the split below also breaks on a lone \n or \r
if (newText.IndexOfAny("\n\r".ToCharArray()) >= 0)
{
//The new text is a multi-line string, split it to individual lines
var lines = newText.Split("\n\r".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
//And add XML nodes for each line so that Word XML will accept the new lines
var xmlBuilder = new StringBuilder();
for (int count = 0; count < lines.Length; count++)
{
//Ensure the "w" prefix is set correctly, otherwise docFrag.InnerXml will fail with exception
xmlBuilder.Append("<w:t xmlns:w=\"http://schemas.microsoft.com/office/word/2003/wordml\">");
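//Escape &, <, > and quotes so the fragment stays valid XML
xmlBuilder.Append(System.Security.SecurityElement.Escape(lines[count]));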
xmlBuilder.Append("</w:t>");
//Not the last line? add line break
if (count != lines.Length - 1)
{
xmlBuilder.Append("<w:br xmlns:w=\"http://schemas.microsoft.com/office/word/2003/wordml\" />");
}
}
//Create the XML fragment with the new multiline structure
var docFrag = xmlDocument.CreateDocumentFragment();
docFrag.InnerXml = xmlBuilder.ToString();
node.ParentNode.AppendChild(docFrag);
//Remove the single line child node that was originally holding the single line text, only required if there was a node there to start with
node.ParentNode.RemoveChild(node);
}
else
{
//Text is not multi-line, let the existing node have the text
node.InnerText = newText;
}
}
