OpenXML: Read text between two document fields using OpenXML SDK - c#

I'm new to programming with the OpenXML SDK and I've tried extensively to locate and read text that is between two document fields, but never really succeeded. There are tons of samples and tutorials on the web about almost everything you can think of doing with the OpenXML SDK, from setting watermarks to doing mail merge, but not a single one about processing document fields.
My word document looks something like this:
{ Field1 } data { Field2 }
and what I want to do is read the data that is between Field1 and Field2.
I have got as far as locating all the fields I need, like this:
var qryFieldCode = (from p in procDoc.MainDocumentPart.Document.Body.Descendants()
where p.GetType() == typeof(FieldCode)
select p).ToList();
But what can I do to read the text that is between those fields I found?
Any help is greatly appreciated.

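Find your first field (much like above), then take the elements that follow it with ElementsAfter() and TakeWhile, stopping as soon as p.GetType() == typeof(FieldCode). Read the InnerText of those elements and you'll have your text. (ElementsAfterSelf and .Value are the LINQ to XML names; on Open XML SDK elements the equivalents are ElementsAfter() and InnerText. A FieldCode normally sits inside a Run, so you will probably need to start from its parent run.) This won't be a great solution if you have things like tables between your two fields, but for your example above, it will work.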
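The approach above can be sketched roughly as follows. This is only a sketch under assumptions: it assumes complex fields (begin / instruction / end markers, each in its own run), all within one paragraph and not nested, and it builds a hypothetical paragraph in memory instead of opening a real file. The class and method names (FieldTextDemo, TextAfterField) are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Text;
using DocumentFormat.OpenXml.Wordprocessing;

static class FieldTextDemo
{
    // Returns the text of the runs between the end marker of the field that
    // owns firstCode and the begin marker of the next field in the paragraph.
    public static string TextAfterField(FieldCode firstCode)
    {
        var run = firstCode.Ancestors<Run>().First();
        var text = new StringBuilder();
        bool insideFirstField = true;

        foreach (var sibling in run.ElementsAfter().OfType<Run>())
        {
            var fieldChar = sibling.GetFirstChild<FieldChar>();
            if (insideFirstField)
            {
                // Skip everything up to and including the "end" marker of the first field.
                if (fieldChar != null && fieldChar.FieldCharType.Value == FieldCharValues.End)
                    insideFirstField = false;
                continue;
            }
            // A "begin" marker means the next field starts here, so stop.
            if (fieldChar != null && fieldChar.FieldCharType.Value == FieldCharValues.Begin)
                break;
            text.Append(sibling.InnerText);
        }
        return text.ToString();
    }

    static void Main()
    {
        // Hypothetical paragraph shaped like: { Field1 } data { Field2 }
        var paragraph = new Paragraph(
            new Run(new FieldChar { FieldCharType = FieldCharValues.Begin }),
            new Run(new FieldCode(" Field1 ")),
            new Run(new FieldChar { FieldCharType = FieldCharValues.End }),
            new Run(new Text(" data ")),
            new Run(new FieldChar { FieldCharType = FieldCharValues.Begin }),
            new Run(new FieldCode(" Field2 ")),
            new Run(new FieldChar { FieldCharType = FieldCharValues.End }));

        var first = paragraph.Descendants<FieldCode>().First();
        Console.WriteLine("[" + TextAfterField(first) + "]");
    }
}
```

With a real document, the FieldCode elements from the query in the question could be passed to TextAfterField one by one. Fields that span several paragraphs or contain other fields would need extra handling.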


How do I append to a CSV file using Filehelpers with multiple record types that have distinct headers?

As the question says, using the FileHelpers library I am attempting to generate a CSV file alongside a report file. The report file may have different (but finite) inputs/data structures, and hence my CSV generation method is not explicitly typed. The CSV contains all of the report data as well as the report's header information. For my headers, I am using the class object properties because they are descriptive enough for my purposes.
My relevant code snippet is below:
// File location, where the .csv goes and gets stored.
string filePath = Path.Combine(destPath, fileName);
// First, write report header details based on header list
Type type = DetermineListType(headerValues);
var headerEngine = new FileHelperEngine(type);
headerEngine.HeaderText = headerEngine.GetFileHeader();
headerEngine.WriteFile(filePath, (IEnumerable<object>)headerValues);
// Next, append the report data below the report header data.
type = DetermineListType(reportData);
var reportDataEngine = new FileHelperEngine(type);
reportDataEngine.HeaderText = reportDataEngine.GetFileHeader();
reportDataEngine.AppendToFile(filePath, (IEnumerable<object>)reportData);
When this is executed, the CSV is successfully generated; however, the .AppendToFile() method does not add the reportDataEngine.HeaderText. From the documentation I do not see this functionality for .AppendToFile(), and I am wondering if anyone has a known workaround for this or a suggestion for how to output the headers of two different class objects in a single CSV file using FileHelpers.
The desired output would look something like this, but in a single CSV file (this would obviously be one contiguous CSV, not tables):
Report_Name    Operator   Timestamp
Access Report  User1      14:50:12 28 Dec 2020

UserID  Login_Time  Logout_Time
User4   09:33:23    10:45:34
User2   11:32:11    11:44:11
User4   15:14:22    16:31:09
User1   18:55:32    19:10:10
I have also looked at the MultiRecordEngine in FileHelpers, and while I think it may be helpful, I cannot figure out from the examples how to actually write a multi-record CSV file in the layout shown above, if it is possible at all.
Thank you!
The best way is to merge the columns into one big table, then make your classes match the columns you need so you can separate them out again when reading. CSV only allows the first row to define the column names, and even that is optional depending on your use case. Look at CsvHelper https://joshclose.github.io/CsvHelper/; it has a lot of built-in features and plenty of examples. Let me know if you need additional help.
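If the two-section layout from the question is still wanted, one possible workaround is to write each section yourself. The sketch below does not use FileHelpers or CsvHelper at all; it uses plain reflection and a TextWriter, and the record classes (ReportHeader, AccessRow) and the helper names are hypothetical stand-ins for the question's types. Note that most CSV readers will not understand a second header row in the middle of a file.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical record types standing in for the question's classes.
class ReportHeader
{
    public string Report_Name { get; set; }
    public string Operator { get; set; }
    public string Timestamp { get; set; }
}

class AccessRow
{
    public string UserID { get; set; }
    public string Login_Time { get; set; }
    public string Logout_Time { get; set; }
}

static class CsvSectionsDemo
{
    // Writes one header line built from the public property names of the type,
    // followed by one line per record.
    static void WriteSection(TextWriter writer, Type type, IEnumerable<object> records)
    {
        // Property order is the declaration order in practice, but that is not guaranteed.
        var props = type.GetProperties();
        writer.WriteLine(string.Join(",", props.Select(p => p.Name)));
        foreach (var record in records)
            writer.WriteLine(string.Join(",", props.Select(p => Quote(p.GetValue(record)))));
    }

    // Quotes a value only when it contains a comma, a quote or a line break.
    static string Quote(object value)
    {
        var s = Convert.ToString(value) ?? "";
        return s.IndexOfAny(new[] { ',', '"', '\n', '\r' }) >= 0
            ? "\"" + s.Replace("\"", "\"\"") + "\""
            : s;
    }

    static void Main()
    {
        var header = new List<ReportHeader>
        {
            new ReportHeader { Report_Name = "Access Report", Operator = "User1", Timestamp = "14:50:12 28 Dec 2020" }
        };
        var rows = new List<AccessRow>
        {
            new AccessRow { UserID = "User4", Login_Time = "09:33:23", Logout_Time = "10:45:34" },
            new AccessRow { UserID = "User2", Login_Time = "11:32:11", Logout_Time = "11:44:11" }
        };

        // A StreamWriter opened on the target file path could be used instead.
        using (var writer = new StringWriter())
        {
            WriteSection(writer, typeof(ReportHeader), header);
            WriteSection(writer, typeof(AccessRow), rows);
            Console.Write(writer.ToString());
        }
    }
}
```

Because both sections go through the same writer, the second header line lands directly below the first section, which is what AppendToFile() was not doing in the question.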

How to get tables from a word file and store them into a datagridview?

I am working with C# in VS 2013. In my program, I want to take a Word file as input from an OpenFileDialog, then open it, extract the tables it contains and finally store them in a DataGridView.
Please, I need a tutorial to follow.
Thank you!!
I presume you are working with the OpenXML SDK. In that case, something like this may give you access to all of the tables:
Body body = doc.MainDocumentPart.Document.Body;
foreach (Table t in body.Descendants<Table>())
{
...
}
See this as well: https://msdn.microsoft.com/en-us/library/office/cc850835(v=office.14).aspx
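To get from a Table element to something a DataGridView can show, one option is to copy the cell text into a DataTable. The following is a minimal sketch, assuming the first table row holds unique, non-empty column headings and that there are no merged cells; the table is built in memory here, and the names WordTableDemo and ToDataTable are hypothetical.

```csharp
using System;
using System.Data;
using System.Linq;
using DocumentFormat.OpenXml.Wordprocessing;

static class WordTableDemo
{
    // Copies the cell text of a Word table into a DataTable.
    // Assumption: the first row holds unique, non-empty column headings.
    public static DataTable ToDataTable(Table wordTable)
    {
        var result = new DataTable();
        var rows = wordTable.Elements<TableRow>().ToList();
        if (rows.Count == 0) return result;

        foreach (var cell in rows[0].Elements<TableCell>())
            result.Columns.Add(cell.InnerText);

        foreach (var row in rows.Skip(1))
        {
            // Extra cells beyond the heading count are dropped.
            var values = row.Elements<TableCell>()
                            .Select(c => (object)c.InnerText)
                            .Take(result.Columns.Count)
                            .ToArray();
            result.Rows.Add(values);
        }
        return result;
    }

    static TableCell Cell(string text)
    {
        return new TableCell(new Paragraph(new Run(new Text(text))));
    }

    static void Main()
    {
        // Hypothetical two-column table built in memory.
        var table = new Table(
            new TableRow(Cell("Name"), Cell("City")),
            new TableRow(Cell("Ann"), Cell("Oslo")),
            new TableRow(Cell("Bob"), Cell("Lima")));

        var data = ToDataTable(table);
        foreach (DataRow row in data.Rows)
            Console.WriteLine(string.Join(" | ", row.ItemArray));
    }
}
```

In a Windows Forms application the result could then be assigned to the grid, for example dataGridView1.DataSource = ToDataTable(t); inside the loop above (dataGridView1 being whatever the grid control is called in your form).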

Insert a table into a middle of word processing document (Open XML SDK)

I have a template Word document which I fill with details using the OpenXML SDK 2.0 (in C#).
I also need to insert a table into the file, and I found this tutorial on MSDN.
But - the example is appending the table to the end of the document, and I want it to be somewhere in the middle.
I may need to replace this line:
doc.MainDocumentPart.Document.Body.Append(table);
with something else. (The full code is in the link above).
Please help me. I have found nothing so far.
Thanks.
One way to do this may be to use Content Controls as placeholders and insert the table into them from code.
// Find the block-level content control whose alias (title) is "myTablePlaceholder".
var myContentControl = doc.MainDocumentPart.Document.Body.Descendants<SdtBlock>()
.FirstOrDefault(e => e.Descendants<SdtAlias>().FirstOrDefault()?.Val?.Value == "myTablePlaceholder");
if (myContentControl != null)
{
// Reuse the control's existing content element if it has one; otherwise add one.
var content = myContentControl.GetFirstChild<SdtContentBlock>()
?? myContentControl.AppendChild(new SdtContentBlock());
content.Append(table); // Your table
}
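If the template cannot use content controls, a different technique is to mark the position with a placeholder paragraph and insert the table right after it. The sketch below assumes a marker paragraph whose whole text is "[[TABLE]]" (a made-up marker) and works on a Body built in memory; InsertTableDemo and InsertAfterMarker are hypothetical names.

```csharp
using System;
using System.Linq;
using DocumentFormat.OpenXml.Wordprocessing;

static class InsertTableDemo
{
    // Inserts the table directly after the first top-level paragraph whose text equals marker.
    // Returns false when no such paragraph exists.
    public static bool InsertAfterMarker(Body body, string marker, Table table)
    {
        var anchor = body.Elements<Paragraph>()
                         .FirstOrDefault(p => p.InnerText.Trim() == marker);
        if (anchor == null) return false;

        anchor.InsertAfterSelf(table);
        return true;
    }

    static void Main()
    {
        // Hypothetical document body with a marker paragraph in the middle.
        var body = new Body(
            new Paragraph(new Run(new Text("Intro"))),
            new Paragraph(new Run(new Text("[[TABLE]]"))),
            new Paragraph(new Run(new Text("Outro"))));

        var table = new Table(
            new TableRow(new TableCell(new Paragraph(new Run(new Text("cell"))))));

        bool inserted = InsertAfterMarker(body, "[[TABLE]]", table);
        Console.WriteLine("Inserted: " + inserted);

        // List the body's children in order to see where the table ended up.
        foreach (var child in body.ChildElements)
            Console.WriteLine(child.LocalName);
    }
}
```

With a real document, body would be doc.MainDocumentPart.Document.Body, and the marker paragraph could be removed afterwards with anchor.Remove() if it should not appear in the output.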

c# getting information from xml file

I want to get all the information located in the <forecast_conditions></forecast_conditions> tags. For the first tag I use
var for_cod = from currentCond in xdoc.Root.Descendants("current_conditions")
select currentCond;
I don't know how to get the information from the 2nd and 3rd <forecast_conditions></forecast_conditions> tags because they all have the same name. Do you have any ideas?
xml file: http://www.google.com/ig/api?weather=vilnius&hleng=eng
I am not sure what exactly you need. If you want to get all forecast_conditions in one result set, you can use this simple query
var query = from t in doc.Descendants("forecast_conditions")
select t;
You may also want to see "Weather Information from Google Weather using ASP.NET and LINQ to XML" and the article "Using Google Weather API In A C# Application", although the latter does not use LINQ. Also check out the thread "C# Pull XML data from google's weather API".
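Since elements with the same name simply come back as a sequence, the second and third forecasts can be reached by position. Here is a small sketch using an inline XML fragment; the fragment is made up and only assumed to resemble the feed's layout (child elements carrying their value in a data attribute), so the element names below it are assumptions too.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class ForecastDemo
{
    static void Main()
    {
        // Made-up sample; the real feed's structure is assumed, not verified.
        var xml = @"<xml_api_reply>
  <weather>
    <current_conditions><condition data='Clear'/></current_conditions>
    <forecast_conditions><day_of_week data='Mon'/><low data='1'/><high data='7'/></forecast_conditions>
    <forecast_conditions><day_of_week data='Tue'/><low data='2'/><high data='9'/></forecast_conditions>
    <forecast_conditions><day_of_week data='Wed'/><low data='0'/><high data='5'/></forecast_conditions>
  </weather>
</xml_api_reply>";

        var doc = XDocument.Parse(xml);

        // Project every forecast_conditions element into a small anonymous object.
        var forecasts = doc.Descendants("forecast_conditions")
            .Select(f => new
            {
                Day = (string)f.Element("day_of_week")?.Attribute("data"),
                Low = (string)f.Element("low")?.Attribute("data"),
                High = (string)f.Element("high")?.Attribute("data")
            })
            .ToList();

        foreach (var f in forecasts)
            Console.WriteLine(f.Day + ": " + f.Low + " to " + f.High);

        // A single forecast can be picked by position, e.g. the second one.
        var second = forecasts.Skip(1).FirstOrDefault();
        if (second != null)
            Console.WriteLine("Second forecast is for " + second.Day);
    }
}
```

Casting the attribute to string returns null instead of throwing when an element or attribute is missing, which keeps the query tolerant of incomplete entries.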

I need to parse an HTML formatted country list into SQL inserts. Is there an easier way to do this?

There are about 2000 lines of this, so doing it manually would probably take more work than figuring out a way to do this programmatically. It only needs to work once, so I'm not concerned with performance or anything.
<tr><td>Canada (CA)</td><td>Alberta (AB)</td></tr>
<tr><td>Canada (CA)</td><td>British Columbia (BC)</td></tr>
<tr><td>Canada (CA)</td><td>Manitoba (MB)</td></tr>
Basically it's formatted like this, and I need to divide it into 4 parts: Country Name, Country Abbreviation, Division Name and Division Abbreviation.
In keeping with my complete lack of efficiency, I was planning just to do a string.Replace on the HTML tags after I broke them up, then find the index of the opening brackets and grab the space-delimited strings that remain. Then I realized I have no way of keeping track of which is the country and which is the division, or of grouping them by country.
So is there a better way to do this? Or better yet, an easier way to populate a database with countries and provinces/states? I looked around SO, and the only readily available databases I can find don't provide the full names of the countries or the provinces/states, or use IPs instead of geographic names.
Paste it into a spreadsheet. Some spreadsheets will parse the HTML table for you.
Save it as a .CSV file and process it that way. Or add a column to the spreadsheet that says something like the following:
="INSERT INTO COUNTRY(CODE,NAME) VALUES ('" & A1 & "','" & B1 & "');"
Then you have a column of INSERT statements that you can cut, paste and execute.
Edit
Be sure to include the <table> tag when pasting into a spreadsheet.
<table><tr><th>country</th><th>name</th></tr>
<tr><td>Canada (CA)</td><td>Alberta (AB)</td></tr>
<tr><td>Canada (CA)</td><td>British Columbia (BC)</td></tr>
<tr><td>Canada (CA)</td><td>Manitoba (MB)</td></tr>
</table>
Processing a CSV file requires almost no parsing. It's got quotes and commas. Much easier to live with than XML/HTML.
/<tr><td>(.+?)\s\(([^)]+)\)<\/td><td>(.+?)\s\(([^)]+)\)<\/td><\/tr>/
Then you should have 4 captures with the 4 pieces of data from any PCRE engine :)
Alternatively, something like http://jacksleight.com/assets/blog/really-shiny/scripts/table-extractor.txt provides more completeness.
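As a rough illustration of the regex approach in C#, the sketch below captures the four parts of each row and prints one INSERT statement per row. The table and column names (DIVISION, COUNTRY_CODE and so on) are hypothetical, and the pattern assumes every row looks exactly like the sample lines in the question.

```csharp
using System;
using System.Text.RegularExpressions;

static class CountryRowsDemo
{
    // Groups: 1 = country name, 2 = country code, 3 = division name, 4 = division code.
    // The lazy (.+?) lets names contain spaces, e.g. "British Columbia".
    static readonly Regex Row = new Regex(
        @"<tr><td>(.+?)\s\(([^)]+)\)</td><td>(.+?)\s\(([^)]+)\)</td></tr>");

    // Doubles single quotes so names such as "Cote d'Ivoire" stay valid SQL literals.
    static string Sql(string s)
    {
        return s.Replace("'", "''");
    }

    static void Main()
    {
        var html = @"<tr><td>Canada (CA)</td><td>Alberta (AB)</td></tr>
<tr><td>Canada (CA)</td><td>British Columbia (BC)</td></tr>
<tr><td>Canada (CA)</td><td>Manitoba (MB)</td></tr>";

        foreach (Match m in Row.Matches(html))
        {
            // Hypothetical target table and columns.
            Console.WriteLine(
                "INSERT INTO DIVISION (COUNTRY_NAME, COUNTRY_CODE, DIVISION_NAME, DIVISION_CODE) " +
                "VALUES ('{0}', '{1}', '{2}', '{3}');",
                Sql(m.Groups[1].Value), Sql(m.Groups[2].Value),
                Sql(m.Groups[3].Value), Sql(m.Groups[4].Value));
        }
    }
}
```

For a one-off job the printed statements can be pasted into a SQL console; grouping by country then becomes a matter of querying on COUNTRY_CODE, or of inserting distinct countries into a separate table first.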
Sounds like a problem easily solved by a Regex.
I recently learned that if you open a url from Excel it will try and parse out the table data.
If you are able to see this table in the browser (Internet explorer), you can select the entire table, right click & "Export to Microsoft Excel"
That should help you get data into separate columns, I guess.
Do you have to do this programmatically? If not, may I suggest just copying and pasting the table (from the browser) into MS Excel and then clearing all formats? This way you get a nice table that can then be imported into your database without problems.
just a suggestion... hth
An assembly exists for .NET called System.Xml; you can reference the assembly and load your HTML document into a System.Xml.XmlDocument (this works only if the HTML is well-formed XML). You can then easily pinpoint the node that contains your required data and use its child nodes to build your inserts. This requires little string parsing on your part.
Load the HTML data as XElements, use LINQ to grab the values you need, and then INSERT.
Blowing my own trumpet here but my FOSS tool CSVfix will do it with a combination of the read_xml and sql_insert commands.
