I am using the OpenXML Spreadsheet API to generate an .xlsx file from a given template and a dictionary of variables. I am stuck on updating values in the SharedStringTable, because the InnerText of a SharedStringItem is read-only.
My template is an .xlsx file. The variables I need to replace have the prefix "$$$". For example, for "$$$abc" the dictionary may contain the pair <"abc", "Payson">. If the dictionary does not contain the key "abc", "$$$abc" should be left as is.
What I have done is something like this:
private void UpdateValue(WorkbookPart wbPart, Dictionary<string, string> dictionary)
{
    var stringTablePart = wbPart.GetPartsOfType<SharedStringTablePart>().FirstOrDefault();
    if (stringTablePart == null)
    {
        stringTablePart = wbPart.AddNewPart<SharedStringTablePart>();
    }
    var stringTable = stringTablePart.SharedStringTable;
    if (stringTable == null)
    {
        stringTable = new SharedStringTable();
    }
    // Iterate through all the items in the SharedStringTable.
    // If any text starts with $$$, extract the variable name,
    // look it up in the dictionary,
    // and replace it if found, or keep it if not.
    foreach (SharedStringItem item in stringTable.Elements<SharedStringItem>())
    {
        if (item.InnerText.StartsWith("$$$"))
        {
            string variableName = item.InnerText.Substring(3);
            if (!string.IsNullOrEmpty(variableName) && dictionary.ContainsKey(variableName))
            {
                // The problem is here since InnerText is read only.
                item.InnerText = dictionary[variableName];
            }
        }
    }
}
The documentation is here: http://msdn.microsoft.com/en-us/library/documentformat.openxml.openxmlcompositeelement.innertext(v=office.14).aspx
Even though the documentation says that InnerText can be set, there is no set accessor.
Does anyone know how to set the InnerText? I may have many cells with the same variable name "$$$abc", and I would like to replace all of them with "Payson".
It appears that you need to replace the InnerXml, then save the SharedStringTable. See Set text value using SharedStringTable with xml sdk in .net.
I tried editing the Text property instead, and it works:
Text newText = new Text();
newText.Text = dictionary[variableName];
item.Text = newText;
stringTable.Save();
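The lookup rule from the question (replace "$$$name" when the dictionary has the key, otherwise leave the text alone) can be kept separate from the OpenXML calls. The following is only a minimal sketch of that rule; the class name PlaceholderResolver and the method name Resolve are made up for illustration and are not part of the OpenXML SDK.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the OpenXML SDK.
public static class PlaceholderResolver
{
    // Prefix that marks a template variable, as described in the question.
    private const string Prefix = "$$$";

    // Returns the dictionary value for "$$$name" when the key exists;
    // otherwise returns the input text unchanged.
    public static string Resolve(string text, IDictionary<string, string> values)
    {
        if (text == null || !text.StartsWith(Prefix, StringComparison.Ordinal))
        {
            return text;
        }

        string variableName = text.Substring(Prefix.Length);
        return values.TryGetValue(variableName, out string replacement)
            ? replacement
            : text;
    }
}
```

Inside the foreach loop, the resolved value could then be assigned with item.Text = new Text(resolved) whenever it differs from item.InnerText, with a single stringTable.Save() after the loop.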
I need to programmatically retrieve the columns in a SharePoint document library, in order to set file properties from outside SharePoint.
I've found that setting the metadata property is not hard as long as you already know the name of the column, which I cannot expect users to input themselves.
As it does not seem possible to do this through the SharePoint Web Services, I have created my own custom web service so I have access to the Client Object Model.
Using this code I am able to retrieve the custom columns I have created; however, I am not able to distinguish between the ones that are editable in the item properties section and those that aren't.
SPList list = web.Lists[specificList];
foreach (SPField field in list.Fields)
{
    if (!field.Hidden)
    {
        var title = field.Title;
        var description = field.Description;
        var parentList = field.ParentList;
        var references = field.FieldReferences; // contains names of fields referenced in computed fields
        if (references != null)
        {
            foreach (string reference in references)
            {
                var test = parentList.Fields.GetField(reference);
            }
        }
    }
}
I get extra properties such as:
Copy Source
Content Type
Checked Out To
Checked In Comment
Type
File
Size
Edit
Version
Source Version
Source Name
I have also tried retrieving the column fields from the SPFolder item, but again this returns many extra properties and is even less filterable.
foreach (SPListItem folderItem in list.Folders)
{
    SPFolder folder = folderItem.Folder;
    System.Collections.Hashtable oHashtable = folder.Properties;
    System.Collections.ICollection collKeys = oHashtable.Keys;
    foreach (var key in collKeys)
    {
        string keyName = key.ToString();
    }
}
Is there a standard way to retrieve the column fields I need? Or will I have to manually exclude the default ones such as "Checked Out To"?
First you have to know which form you are viewing. Is it the EditForm or NewForm?
You can filter the columns visible on a specific form by getting the fields of the ContentType and then check if they are getting displayed on the NewForm (or whatever form):
SPList list = web.Lists[specificList];
var contentType = list.ContentTypes[0]; // Select the first content type. Change this if you need a different one
foreach (SPField field in contentType.Fields)
{
    if (!field.Hidden
        && (field.ShowInEditForm == null // null means the default, which is to show the field
            || field.ShowInEditForm.Value)) // Replace ShowInEditForm with the form you need
    {
        var title = field.Title;
        var description = field.Description;
        var parentList = field.ParentList;
        var references = field.FieldReferences; // contains names of fields referenced in computed fields
        if (references != null)
        {
            foreach (string reference in references)
            {
                var test = parentList.Fields.GetField(reference);
            }
        }
    }
}
I think the best way to go is to get the fields from the content type and not the list itself. That way you'll get only the fields visible in the form.
var list = web.Lists[specificList];
var contentType = list.ContentTypes["Document"];
foreach (SPField field in contentType.Fields)
{
    if (!field.Reorderable || contentType.FieldLinks[field.Id].Hidden)
    {
        continue;
    }
    // Process fields
}
You may ask "Why skip fields with Reorderable == false?". Custom fields generally do not set this property to false, so the check is a convenient way to keep them while filtering out the rest.
Also, I didn't invent this code. It is taken (using reflection) from the code-behind class of SharePoint's standard content type field reorder page.
I am using the Microsoft.Office.Interop.Word namespace in a console application to get form data from an MS Word document. The document contains fields that have each been assigned a bookmark, which I am using as an id.
I would like to retrieve the value of a field together with its bookmark and store both in a dictionary. So far I am only able to get the value of each field, not the bookmark and the field together.
Is there a way that I could do something like wdField.Result.Bookmark to get a field's bookmark? I looked at the MSDN documentation but am having a difficult time getting this right. Here is the foreach loop that I am enumerating with:
foreach (Field wdField in oWordDoc.Fields)
{
    wdField.Select();
    string fieldText = wdField.Result.Text;
    Console.WriteLine(fieldText);
    //string fieldBookMark = wdField.Result.BookMark
}
KazJaw is right: if you have all the target text "bookmarked", you can rely just on BookMarks. Sample code:
foreach (Bookmark bookMark in oWordDoc.Bookmarks)
{
    string bmName = bookMark.Name;
    Range bmRange = bookMark.Range;
    string bmText = bmRange.Text;
}
Or:
Range bmRange = oWordDoc.Bookmarks["bookmark name"].Range;
I am trying to use OpenXML to produce automated Excel files. One problem I am facing is reconciling my object model with the OpenXML object model for Excel. I have come to realise that the order in which I append the child elements of a worksheet matters.
For Example:
workSheet.Append(sheetViews);
workSheet.Append(columns);
workSheet.Append(sheetData);
workSheet.Append(mergeCells);
workSheet.Append(drawing);
The above ordering does not give any error.
But the following:
workSheet.Append(sheetViews);
workSheet.Append(columns);
workSheet.Append(sheetData);
workSheet.Append(drawing);
workSheet.Append(mergeCells);
gives an error.
So I cannot create a drawing object whenever I want and simply append it to the worksheet. This forces me to create these elements before using them.
Can anyone tell me whether I have understood the problem correctly? I believe we should be able to open any Excel file, create a new child element for a worksheet if necessary, and append it. But now this might break the order in which these elements are supposed to be appended.
Thanks.
According to the Standard ECMA-376 Office Open XML File Formats, CT_Worksheet has a required sequence:
The reason the following is crashing:
workSheet.Append(sheetViews);
workSheet.Append(columns);
workSheet.Append(sheetData);
workSheet.Append(drawing);
workSheet.Append(mergeCells);
is because you have drawing before mergeCells, while the schema places mergeCells ahead of drawing. As long as you append mergeCells before drawing, your code should work fine.
Note: You can find the full XSD in ECMA-376 3rd edition Part 1 (.zip) -> OfficeOpenXML-XMLSchema-Strict -> sml.xsd.
I found that for all "singleton" children for which the parent object defines a property (such as Worksheet.SheetViews), you should assign the new object to that property instead of using Append. The class itself then ensures that the order is correct.
workSheet.Append(sheetViews);
workSheet.Append(columns);
workSheet.Append(sheetData); // bad idea(though it does work if the order is good)
workSheet.Append(drawing);
workSheet.Append(mergeCells);
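More correct format (for the children that have such a property):
workSheet.SheetViews = sheetViews; // order doesn't matter.
// Columns may not have such a property in your SDK version; if not, insert it at its schema position.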
...
As Joe Masilotti already explained, the order is defined in the schema.
Unfortunately, the OpenXML library does not ensure the correct order of child elements in the serialized XML as required by the underlying XML schema. Applications may not be able to parse the XML successfully if the order is not correct.
Here is a generic solution which I am using in my code:
private T GetOrCreateWorksheetChildCollection<T>(Spreadsheet.Worksheet worksheet)
    where T : OpenXmlCompositeElement, new()
{
    T collection = worksheet.GetFirstChild<T>();
    if (collection == null)
    {
        collection = new T();
        if (!worksheet.HasChildren)
        {
            worksheet.AppendChild(collection);
        }
        else
        {
            // compute the positions of all child elements (existing + new collection)
            List<int> schemaPositions = worksheet.ChildElements
                .Select(e => _childElementNames.IndexOf(e.LocalName)).ToList();
            int collectionSchemaPos = _childElementNames.IndexOf(collection.LocalName);
            schemaPositions.Add(collectionSchemaPos);
            schemaPositions = schemaPositions.OrderBy(i => i).ToList();
            // now get the index where the position of the new child is
            int index = schemaPositions.IndexOf(collectionSchemaPos);
            // this is the index to insert the new element
            worksheet.InsertAt(collection, index);
        }
    }
    return collection;
}

// names and order of possible child elements according to the openXML schema
private static readonly List<string> _childElementNames = new List<string>() {
    "sheetPr", "dimension", "sheetViews", "sheetFormatPr", "cols", "sheetData",
    "sheetCalcPr", "sheetProtection", "protectedRanges", "scenarios", "autoFilter",
    "sortState", "dataConsolidate", "customSheetViews", "mergeCells", "phoneticPr",
    "conditionalFormatting", "dataValidations", "hyperlinks", "printOptions",
    "pageMargins", "pageSetup", "headerFooter", "rowBreaks", "colBreaks",
    "customProperties", "cellWatches", "ignoredErrors", "smartTags", "drawing",
    "drawingHF", "picture", "oleObjects", "controls", "webPublishItems", "tableParts",
    "extLst"
};
The method always inserts the new child element at the correct position, ensuring that the resulting document is valid.
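The index computation at the heart of this approach does not depend on OpenXML at all, so it can be tried out on plain element names. The following is a rough sketch under that assumption; SchemaOrder and InsertIndex are illustrative names, and existing names that are missing from the schema list are simply treated as sorting first.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper operating on element names only, not on OpenXML objects.
public static class SchemaOrder
{
    // Returns the index at which newName must be inserted among existingNames
    // so that the sequence still follows schemaOrder.
    public static int InsertIndex(
        IList<string> existingNames, string newName, IList<string> schemaOrder)
    {
        int newRank = schemaOrder.IndexOf(newName);
        if (newRank < 0)
        {
            throw new ArgumentException("Unknown element name: " + newName);
        }

        int index = 0;
        foreach (string name in existingNames)
        {
            // Every existing child that sorts before the new one pushes it one slot right.
            if (schemaOrder.IndexOf(name) < newRank)
            {
                index++;
            }
        }
        return index;
    }
}
```

For the worksheet from the question, with existing children sheetViews, cols, sheetData and drawing, inserting mergeCells gives index 3, which places it directly ahead of drawing.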
For those who end up here via Google like I did, the function below solves the ordering problem after the child element has been inserted:
public static T ReorderChildren<T>(T element) where T : OpenXmlElement
{
    Dictionary<Type, int> childOrderHashTable = element.GetType()
        .GetCustomAttributes()
        .Where(x => x is ChildElementInfoAttribute)
        .Select((x, idx) => new KeyValuePair<Type, int>(((ChildElementInfoAttribute)x).ElementType, idx))
        .ToDictionary(x => x.Key, x => x.Value);
    List<OpenXmlElement> reorderedChildren = element.ChildElements
        .OrderBy(x => childOrderHashTable[x.GetType()])
        .ToList();
    element.RemoveAllChildren();
    element.Append(reorderedChildren);
    return element;
}
The generated types in the DocumentFormat.OpenXml library have custom attributes that can be used to reflect metadata from the OOXML schema. This solution relies on System.Reflection and System.Linq (i.e., it is not very fast) but eliminates the need to hardcode a list of strings to correctly order the child elements for a specific type.
I call this function after validation, passing the ValidationErrorInfo.Node property, and it cleans up the newly created element by reference. That way I don't have to apply this method recursively across an entire document.
helb's answer is beautiful - thank you for that, helb.
It has the slight drawback that it does not test if there are already problems with the order of child elements. The following slight modification makes sure there are no pre-existing problems when adding a new element (you still need his _childElementNames, which is priceless) and it's slightly more efficient:
private static int getChildElementOrderIndex(OpenXmlElement collection)
{
    int orderIndex = _childElementNames.IndexOf(collection.LocalName);
    if (orderIndex < 0)
        throw new InvalidOperationException($"Internal: worksheet part {collection.LocalName} not found");
    return orderIndex;
}

private static T GetOrCreateWorksheetChildCollection<T>(Worksheet worksheet) where T : OpenXmlCompositeElement, new()
{
    T collection = worksheet.GetFirstChild<T>();
    if (collection == null)
    {
        collection = new T();
        if (!worksheet.HasChildren)
        {
            worksheet.AppendChild(collection);
        }
        else
        {
            int collectionSchemaPos = getChildElementOrderIndex(collection);
            int insertPos = 0;
            int lastOrderNum = -1;
            for (int i = 0; i < worksheet.ChildElements.Count; ++i)
            {
                int thisOrderNum = getChildElementOrderIndex(worksheet.ChildElements[i]);
                // equal values are allowed: some elements (e.g. conditionalFormatting) may repeat
                if (thisOrderNum < lastOrderNum)
                    throw new InvalidOperationException($"Internal: worksheet parts {_childElementNames[lastOrderNum]} and {_childElementNames[thisOrderNum]} out of order");
                lastOrderNum = thisOrderNum;
                if (thisOrderNum < collectionSchemaPos)
                    ++insertPos;
            }
            // this is the index to insert the new element
            worksheet.InsertAt(collection, insertPos);
        }
    }
    return collection;
}
Basically, I have a single element inside an XML file where I store settings for my application. This element mirrors a class that I have built. What I'm trying to do, using LINQ, is select that single element and then store its values in an instance of my class, all in a single statement.
Right now I'm selecting the element separately and then copying the values from that element into the different properties. Of course this turns into about six separate statements. Is it possible to do this in a single statement?
It would be better if you could show your XML, but you can get the general idea from the code below:
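XDocument doc = XDocument.Load("myxml.xml"); // placeholder path; load your XML document here
var instance = (from item in doc.Descendants("ElementName")
                select new YourClass()
                {
                    // fill the properties using item
                }).Single(); // the query yields a sequence; Single() picks the one element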
You can use LINQ to XML, e.g.
var document = XDocument.Load("myxml.xml");
var settings = document.Element("rootElement").Elements("myElement").Select(e =>
    new MySettingsClass
    {
        MyProperty = e.Attribute("myattribute").Value,
        MyOtherProperty = e.Attribute("myotherattribute").Value
    }).Single(); // Select needs a sequence, so use Elements(...) and take the single result
See http://msdn.microsoft.com/en-us/library/bb387098.aspx for more details.
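To try the single-statement projection without a file on disk, the XML can be parsed from a string. This is a small sketch assuming the element and attribute names used above (rootElement, myElement, myattribute, myotherattribute); SettingsLoader is a made-up name, and the real settings XML was not shown in the question.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Mirrors the settings element; the property names follow the answer above.
public class MySettingsClass
{
    public string MyProperty { get; set; }
    public string MyOtherProperty { get; set; }
}

// Hypothetical helper for illustration.
public static class SettingsLoader
{
    // Parses the XML text and projects the single <myElement> into a settings object.
    public static MySettingsClass Load(string xml)
    {
        return XDocument.Parse(xml)
            .Descendants("myElement")
            .Select(e => new MySettingsClass
            {
                // The (string) conversion yields null when the attribute is missing.
                MyProperty = (string)e.Attribute("myattribute"),
                MyOtherProperty = (string)e.Attribute("myotherattribute")
            })
            .Single(); // throws if there is not exactly one <myElement>
    }
}
```

Casting the attribute to string instead of reading .Value avoids a NullReferenceException for optional attributes; whether a missing setting should be tolerated is a design choice for the application.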