I am using the OpenXML Spreadsheet API to generate an .xlsx file from a given template and a dictionary of variables. My problem is updating values in the SharedStringTable, since the InnerText of a SharedStringItem is read-only.
My template is an .xlsx file. The variables I need to replace have the prefix "$$$". For example, for "$$$abc" the dictionary may contain the pair <"abc", "Payson">. If the dictionary does not contain the key "abc", "$$$abc" should be left as it is.
What I have done is something like this:
private void UpdateValue(WorkbookPart wbPart, Dictionary<string, string> dictionary)
{
var stringTablePart = wbPart.GetPartsOfType<SharedStringTablePart>().FirstOrDefault();
if (stringTablePart == null)
{
stringTablePart = wbPart.AddNewPart<SharedStringTablePart>();
}
var stringTable = stringTablePart.SharedStringTable;
if (stringTable == null)
{
stringTable = new SharedStringTable();
}
// Iterate through all the items in the SharedStringTable.
// If any text starts with $$$, extract the variable name,
// look it up in the dictionary,
// and replace the text if found, or keep it if not.
foreach (SharedStringItem item in stringTable.Elements<SharedStringItem>())
{
if (item.InnerText.StartsWith("$$$"))
{
string variableName = item.InnerText.Substring(3);
if (!string.IsNullOrEmpty(variableName) && dictionary.ContainsKey(variableName))
{
// The problem is here since InnerText is read only.
item.InnerText = dictionary[variableName];
}
}
}
}
The documentation is here: http://msdn.microsoft.com/en-us/library/documentformat.openxml.openxmlcompositeelement.innertext(v=office.14).aspx
Although the documentation says that InnerText can be set, the property has no set accessor.
Does anyone know how to set the InnerText? I may have many cells with the same variable name "$$$abc", and I would like to replace all of them with "Payson".
It appears that you need to replace the InnerXml, then save the SharedStringTable. See Set text value using SharedStringTable with xml sdk in .net.
I tried setting the Text property of the item instead, and it works:
Text newText = new Text();
newText.Text = dictionary[variableName];
item.Text = newText;
stringTable.Save();
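The lookup rule from the question (replace "$$$name" only when the dictionary has the key, otherwise leave the text alone) can be kept separate from the OpenXML calls. The following is only a sketch under that reading of the question; the Placeholders class and the Resolve method are hypothetical names, not part of the OpenXML SDK.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the OpenXML SDK.
public static class Placeholders
{
    private const string Prefix = "$$$";

    // Returns the dictionary value for a "$$$name" placeholder,
    // or the original text when it is not a known placeholder.
    public static string Resolve(string text, IReadOnlyDictionary<string, string> values)
    {
        if (text == null || !text.StartsWith(Prefix, StringComparison.Ordinal))
        {
            return text;
        }

        string name = text.Substring(Prefix.Length);
        return values.TryGetValue(name, out string replacement) ? replacement : text;
    }
}
```

The returned string can then be wrapped in a new Text and assigned to item.Text as shown above.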
Using PDFSharp, I would like to create one PDF-document for each distinct value in the property of the List I am looping through.
Grouping my collection, and creating a List:
var listName = collectionName
.GroupBy(p => new { p.propertyName })
.ToList();
Trying to execute my PDFsharp-code for every propertyName in listName:
foreach (var trip in paidTrip) {
// Getting just the name string from the specific propertyName key
string[] remove = { "{", "}", "propertyName", "=" };
string pnString = trip.Key.ToString();
foreach (string item in remove) {
pnString = pnString.Replace(item, string.Empty);
}
Right here is where I believe I drop the ball; how can I bring each name with me to their distinct PDF-document? I am missing that connection.
So, underneath this, I start creating my PDF-document(s):
// Continued
System.Text.Encoding.RegisterProvider(System.Text.CodePagesEncodingProvider.Instance);
PdfDocument doc = new();
PdfPage page = doc.AddPage();
XGraphics gfx = XGraphics.FromPdfPage(page);
// Adding fonts and brushes...
// Adding some text, to check whether I am able to grab the names of propertyName (which I am, but just the first one, once)...
gfx.DrawString(pnString, new XFont("Arial", 40, XFontStyle.Bold), myGreen, new XPoint(5, 250));
// And then, saving the PDF-document
doc.Save(@"C:\My\File\Path\Test.pdf");
But as I said, this just saves one PDF for the first name found.
I believe it is trying to save one file per name found, but it can't, because the file has already been created with the file name specified.
So my question is: how can I make sure that each name found in the foreach loop is brought with me to when I create the PDFs, and save one PDF-document for each of them?
At the end when saving your file, you need to change the filename to be the key from your list like so:
// And then, saving the PDF-document
doc.Save($@"C:\My\File\Path\{pnString}.pdf");
This will save an individual file for each of the different propertyName Keys you performed the GroupBy on previously.
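A possible simplification, sketched under assumptions: grouping on the property itself rather than on an anonymous object makes each group's Key a plain string, so the brace-stripping step is no longer needed. The Trip class, its PropertyName property, and the BuildPaths helper are hypothetical names used only for illustration, and the PDFsharp calls are left out.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical item type standing in for the elements of the collection.
public class Trip
{
    public string PropertyName { get; set; }
}

public static class PdfPaths
{
    // Returns one target file path per distinct PropertyName.
    public static List<string> BuildPaths(IEnumerable<Trip> trips, string folder)
    {
        return trips
            .GroupBy(t => t.PropertyName)                       // Key is a plain string here
            .Select(g => Path.Combine(folder, g.Key + ".pdf")) // one file name per key
            .ToList();
    }
}
```

Inside the loop over the groups, each path would be passed to doc.Save for the document built for that group.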
Put your PDF-creation code inside the foreach loop :)
foreach (var trip in paidTrip) {
// Getting just the name string from the specific propertyName key
string[] remove = { "{", "}", "propertyName", "=" };
string pnString = trip.Key.ToString();
foreach (string item in remove) {
pnString = pnString.Replace(item, string.Empty);
}
// Continued
// [..] the rest of the code here
}
I am looking for a safe and efficient way to update the value of a cell whose text may be stored in the SharedStringTable (which appears to be the case for any spreadsheet created by MS Excel).
As the name implies, the SharedStringTable contains strings that may be used in multiple cells.
So simply finding the item in the string table and updating its value is NOT the way to go, as the item may be in use by other cells as well.
As far as I understand, one must do the following:
1. Check if the cell is using the string table.
2. If so, check if the new string is already there, in which case just use it (remember to remove the item with the old string if it is no longer in use by any other cells!).
3. If not, check if the item with the old string is referred to by any other cells in the spreadsheet.
   - If so, create a new item with the new string and refer to it.
   - If not, just update the existing item with the new string.
Is there an easier solution to this using the OpenXML SDK?
Also consider that one may want to update not just one cell but set new (different) values for several cells.
So we may be calling the update-cell method in a loop ...
Here is my first take on this. It appears to work for my particular case.
But it must be possible to improve on it or, even better, to do it in a totally different way:
private static void UpdateCell(SharedStringTable sharedStringTable,
Dictionary<string, SheetData> sheetDatas, string sheetName,
string cellReference, string text)
{
Cell cell = sheetDatas[sheetName].Descendants<Cell>()
.FirstOrDefault(c => c.CellReference.Value == cellReference);
if (cell == null) return;
if (cell.DataType == null || cell.DataType != CellValues.SharedString)
{
cell.RemoveAllChildren();
cell.AppendChild(new InlineString(new Text { Text = text }));
cell.DataType = CellValues.InlineString;
return;
}
// Cell is referring to the string table. Check if the new text is already in the string table; if so, use it.
IEnumerable<SharedStringItem> sharedStringItems
= sharedStringTable.Elements<SharedStringItem>();
int i = 0;
foreach (SharedStringItem sharedStringItem in sharedStringItems)
{
if (sharedStringItem.InnerText == text)
{
cell.CellValue = new CellValue(i.ToString());
// TODO: Should clean up, ie remove item with old text from string table if it is no longer in use.
return;
}
i++;
}
// New text is not in the string table. Check if any other cell in the workbook refers to the item with the old text.
foreach (SheetData sheetData in sheetDatas.Values)
{
var cells = sheetData.Descendants<Cell>();
foreach (Cell cell0 in cells)
{
if (cell0.Equals(cell)) continue;
if (cell0.DataType != null
&& cell0.DataType == CellValues.SharedString
&& cell0.CellValue.InnerText == cell.CellValue.InnerText)
{
// Other cells refer to item with old text so we cannot update it. Add new item.
sharedStringTable.AppendChild(new SharedStringItem(new Text(text)));
cell.CellValue.Text = (i).ToString();
return;
}
}
}
// No other cells referred to the old item. Update it.
sharedStringItems.ElementAt(int.Parse(cell.CellValue.InnerText)).Text = new Text(text);
}
....
private static void DoIt(string filePath)
{
using (SpreadsheetDocument spreadSheet = SpreadsheetDocument.Open(filePath, true))
{
SharedStringTable sharedStringTable
= spreadSheet.WorkbookPart.GetPartsOfType<SharedStringTablePart>()
.First().SharedStringTable;
Dictionary<string, SheetData> sheetDatas = new Dictionary<string, SheetData>();
foreach (var sheet in spreadSheet.WorkbookPart.Workbook.Descendants<Sheet>())
{
SheetData sheetData
= (spreadSheet.WorkbookPart.GetPartById(sheet.Id) as WorksheetPart)
.Worksheet.GetFirstChild<SheetData>();
sheetDatas.Add(sheet.Name, sheetData);
}
UpdateCell(sharedStringTable, sheetDatas, "Sheet1", "A2", "Mjau");
}
}
WARNING: Do NOT use the above as is; it works with one particular spreadsheet, and it very likely leaves things unhandled in other situations.
This is my first attempt at OpenXML for spreadsheets.
I ended up following the suggestion made by George Polevoy.
It is much easier and appears to have no ill side effects. (That said, there are a million other issues to handle when manipulating spreadsheets that may be edited outside your control...)
As you can see, updating the shared string table really keeps developers busy.
In my experience, the shared string table does not add anything in terms of performance or file size. The OpenXml format is compressed inside a packaging container anyway, so even massively duplicated strings should have little effect on the file size.
Microsoft Excel writes everything to the shared string table, even when there's no duplication.
I'd recommend simply converting everything to InlineStrings before modifying the document; further operations then become as simple as it gets.
You can write the document back with InlineStrings, and it will be a functionally equivalent file.
Microsoft Excel will convert it back to shared strings when the file is edited, but who cares.
I would suggest removing the shared string table feature in future versions of the standard, unless it is justified by some sound benchmarks.
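The conversion to InlineStrings recommended above could look roughly like the sketch below. It assumes the DocumentFormat.OpenXml package is referenced; SharedStringInliner and InlineSharedStrings are hypothetical names. It also assumes plain text is enough: InnerText flattens rich-text runs, so any formatting inside a shared string is dropped, and the shared string table itself is left in place.

```csharp
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

// Hypothetical helper; a sketch, not a complete implementation.
public static class SharedStringInliner
{
    // Rewrites every shared-string cell in the workbook as an inline string.
    public static void InlineSharedStrings(WorkbookPart workbookPart)
    {
        var tablePart = workbookPart.SharedStringTablePart;
        if (tablePart == null || tablePart.SharedStringTable == null) return;

        // Snapshot the items once so each cell lookup is a list index.
        var items = tablePart.SharedStringTable.Elements<SharedStringItem>().ToList();

        foreach (var worksheetPart in workbookPart.WorksheetParts)
        {
            foreach (var cell in worksheetPart.Worksheet.Descendants<Cell>())
            {
                if (cell.DataType == null || cell.DataType.Value != CellValues.SharedString) continue;
                if (cell.CellValue == null || !int.TryParse(cell.CellValue.Text, out int index)) continue;
                if (index < 0 || index >= items.Count) continue;

                // Assumption: plain text only; formatting runs are flattened.
                string text = items[index].InnerText;
                cell.RemoveAllChildren();
                cell.AppendChild(new InlineString(new Text(text)));
                cell.DataType = CellValues.InlineString;
            }
            worksheetPart.Worksheet.Save();
        }
    }
}
```

After this pass, updating a cell is the same as in the non-shared branch of UpdateCell: replace the InlineString child and leave every other cell untouched.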
So I tried some research, but I just don't know how to google this.
For example, I have a .db file (it works the same as a .txt file for me), written like this:
DIRT: 3;
STONE: 6;
So far, I have code that can put items in a comboBox when the file looks like this:
DIRT,
STONE,
This will put DIRT and STONE in the comboBox. This is the code I'm using for that:
string[] lineOfContents = System.IO.File.ReadAllLines(dbfile);
foreach (var line in lineOfContents)
{
string[] tokens = line.Split(',');
comboBox1.Items.Add(tokens[0]);
}
How do I expand this so that it puts e.g. DIRT and STONE in the comboBox and keeps the rest (3 and 6) in variables (ints, like int varDIRT = 3)?
It doesn't have to be a .txt or .db file; I heard XML is used for config files too.
Try doing something like this:
cmb.DataSource = File.ReadAllLines("filePath").Select(d => new
{
// The file uses "NAME: VALUE;", so split on the colon.
Name = d.Split(':').First().Trim(),
Value = Convert.ToInt32(d.Split(':').Last().Replace(";", ""))
}).ToList();
cmb.DisplayMember = "Name";
cmb.ValueMember = "Value";
Remember that this requires using System.Linq; (and using System.IO; for File).
If you want to reference the selected value of the combobox, you can use
cmb.SelectedValue;
cmb.SelectedText;
I think you've really got two questions, so I'll try to answer them separately.
The first question is "How can I parse a file that looks like this...
DIRT: 3;
STONE: 6;
into names and integers?" You could remove all the whitespace and semicolons from each line, and then split on colon. A cleaner way, in my opinion, would be to use a regular expression:
// load your file
var fileLines = new[]
{
"DIRT: 3;",
"STONE: 6;"
};
// This regular expression will match anything that
// begins with some letters, then has a colon followed
// by optional whitespace ending in a number and a semicolon.
var regex = new Regex(@"(\w+):\s*([0-9]+);", RegexOptions.Compiled);
foreach (var line in fileLines)
{
// Puts the tokens into an array.
// The zeroth token will be the entire matching string.
// Further tokens will be the contents of the parentheses in the expression.
var tokens = regex.Match(line).Groups;
// This is the name from the line, i.e "DIRT" or "STONE"
var name = tokens[1].Value;
// This is the numerical value from the same line.
var value = int.Parse(tokens[2].Value);
}
If you're not familiar with regular expressions, I encourage you to check them out; they make it very easy to format strings and pull out values. http://regexone.com/
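For comparison, the simpler approach mentioned earlier (strip the whitespace and semicolon, then split on the colon) might look like the sketch below. InventoryParser and Parse are hypothetical names, and skipping malformed lines silently is an assumption about the desired behavior rather than something the question specifies.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper illustrating the split-based alternative.
public static class InventoryParser
{
    // Parses lines such as "DIRT: 3;" into name/value pairs.
    // Lines that do not fit the "NAME: NUMBER;" shape are skipped.
    public static Dictionary<string, int> Parse(IEnumerable<string> lines)
    {
        var result = new Dictionary<string, int>();
        foreach (string line in lines)
        {
            // Drop surrounding whitespace and the trailing semicolon.
            string cleaned = line.Trim().TrimEnd(';');
            string[] parts = cleaned.Split(':');
            if (parts.Length != 2) continue;

            string name = parts[0].Trim();
            if (name.Length == 0) continue;

            if (int.TryParse(parts[1].Trim(), out int value))
            {
                result[name] = value;
            }
        }
        return result;
    }
}
```

Calling InventoryParser.Parse(File.ReadAllLines(dbfile)) would give a dictionary in which "DIRT" maps to 3 and "STONE" maps to 6 for the sample file.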
The second question, "how do I store the value alongside the name?", I'm not sure I fully understand. If what you want is to back each item with the numerical value specified in the file, then dub stylee's advice is good for you: set the name as the display member and the value as the value member. However, since your data is not in a table, you'll have to put it somewhere accessible so that the properties you want to use can be named. I recommend a dictionary:
// This is your ComboBox.
var comboBox = new ComboBox();
// load your file
var fileLines = new[]
{
"DIRT: 3;",
"STONE: 6;"
};
// This regular expression will match anything that
// begins with some letters, then has a colon followed
// by optional whitespace ending in a number and a semicolon.
var regex = new Regex(@"(\w+):\s*([0-9]+);", RegexOptions.Compiled);
// This does the same as the foreach loop did, but it puts the results into a dictionary.
var dictionary = fileLines.Select(line => regex.Match(line).Groups)
.ToDictionary(tokens => tokens[1].Value, tokens => int.Parse(tokens[2].Value));
// When you enumerate a dictionary, you get the entries as KeyValuePair objects.
foreach (var kvp in dictionary) comboBox.Items.Add(kvp);
// DisplayMember and ValueMember need to be set to
// the names of usable properties on the item type.
// KeyValuePair has "Key" and "Value" properties.
comboBox.DisplayMember = "Key";
comboBox.ValueMember = "Value";
In this version, I have used Linq to construct the dictionary. If you don't like the Linq syntax, you can use a loop instead:
var dictionary = new Dictionary<string, int>();
foreach (var line in fileLines)
{
var tokens = regex.Match(line).Groups;
dictionary.Add(tokens[1].Value, int.Parse(tokens[2].Value));
}
You could also use the FileHelpers library. First define your data record:
[DelimitedRecord(":")]
public class Record
{
public string Name;
[FieldTrim(TrimMode.Right,';')]
public int Value;
}
Then you read in your data like so:
FileHelperEngine engine = new FileHelperEngine(typeof(Record));
//Read from file
Record[] res = engine.ReadFile("FileIn.txt") as Record[];
// write to file
engine.WriteFile("FileOut.txt", res);
I am trying to populate a Word template in C#. The template contains a table with several cells, and I need to be able to identify each cell by a unique id. I have not found a way to store and read a unique id for each cell or piece of text in Word. My approach is to put the unique id in each cell as hidden text, and then format the cell (for example, change its background color) based on this id.
My problem is reading this hidden text in each cell from C#.
Any suggestions would be of great help!
Thanks!
Here it comes! You can iterate over a document and find hidden text:
foreach (Microsoft.Office.Interop.Word.Range p in objDoc.Range().Words)
{
if (p.Font.Hidden != 0) //Hidden text found
{
// Do something
}
}
The values returned by p.Font.Hidden are:
0: Text visible
-1: Text Hidden
That's what I did for a whole Word document, but if you are able to iterate over the content of your cells, the same check may help you.
To read hidden text in your code, you just need to set
rangeObject.TextRetrievalMode.IncludeHiddenText = true;
If you want to make the hidden words visible, for example, you can iterate through all words, check the Font.Hidden property, and then clear it:
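Word.Document document = ThisAddIn.Instance.Application.ActiveDocument;
var rangeAll = document.Range();
rangeAll.TextRetrievalMode.IncludeHiddenText = true;
string texts = string.Empty; // collects the text of every word, hidden or not
int count = 0; // number of hidden words made visible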
foreach (Microsoft.Office.Interop.Word.Range p in rangeAll.Words)
{
texts += p.Text;
if (p.Font.Hidden != 0) //Hidden text found
{
p.Font.Hidden = 0;
count++;
}
}
I am using the Microsoft.Office.Interop.Word namespace in a console application to get form data from an MS Word document. The document contains fields that have each been assigned a bookmark, which I am using as an id.
I would like to retrieve both the value of a field and its bookmark, and store them in a dictionary. So far I am only able to get the value of each field, not the bookmark AND the value.
Is there a way to do something like wdField.Result.Bookmark to get a field's bookmark? I looked at the MSDN documentation but am having a difficult time getting this right. Here is the foreach loop that I am enumerating with:
foreach (Field wdField in oWordDoc.Fields)
{
wdField.Select();
string fieldText = wdField.Result.Text;
Console.WriteLine(fieldText);
//string fieldBookMark = wdField.Result.BookMark
}
KazJaw is right: if you have all the target text "bookmarked", you can rely just on BookMarks. Sample code:
foreach (Bookmark bookMark in oWordDoc.Bookmarks)
{
string bmName = bookMark.Name;
Range bmRange = bookMark.Range;
string bmText = bmRange.Text;
}
Or:
Range bmRange = oWordDoc.Bookmarks["bookmark name"].Range;