I'm coding in C# and have a project where the user will be able to upload a .zip file.
An .xml file inside that archive should be read, and chapters based on its XML tags
should be displayed dynamically.
Right now both the archive and the .xml file inside it are hardcoded.
How do I read an .xml file from a zip file and display its contents dynamically in C#?
The Microsoft documentation has an example of this: extracting specific files from a zip archive into a directory:
https://learn.microsoft.com/en-us/dotnet/standard/io/how-to-compress-and-extract-files#example-2-extract-specific-file-extensions.
Your next step is then to process all or some of the files in that directory: https://learn.microsoft.com/en-us/dotnet/api/system.io.directory.getfiles?view=netframework-4.7.2#System_IO_Directory_GetFiles_System_String_System_String_
If you don't want to unzip, you can open the ZipArchiveEntry directly: https://learn.microsoft.com/en-us/dotnet/api/system.io.compression.ziparchiveentry.open?view=netframework-4.7.2
With multiple XML files in the zip, each deserialized to myType, the code should boil down to:
string zipPath = @".\result.zip";
List<myType> listResults ;
using (ZipArchive archive = ZipFile.OpenRead(zipPath))
{
XmlSerializer serializer = new XmlSerializer(typeof(myType));
listResults =
archive
.Entries
.Where(entry => entry.FullName
.EndsWith(".xml", StringComparison.OrdinalIgnoreCase))
.Select(entry => (myType)serializer.Deserialize(entry.Open()))
.ToList();
}
For any missing reference in your project follow the light bulb 💡!
Add the System.IO.Compression.FileSystem assembly to your project references. Once you have done that, you can open an archive like this:
static void Main(string[] args)
{
var zipPath = "Path-To-Your-Zipfile";
using (var archive = ZipFile.OpenRead(zipPath))
{
var xmlFile = archive.Entries.FirstOrDefault(e => e.FullName.EndsWith(".xml"));
if(xmlFile == null)
return;
using (var stream = xmlFile.Open())
{
using (var reader = new StreamReader(stream))
{
var fileContents = reader.ReadToEnd();
}
}
}
}
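To get from the raw entry to a list of chapters, the entry stream can be passed straight to XDocument.Load instead of reading it into a string first. The following is only a sketch: it assumes each chapter is a <chapter> element with a title attribute, and both of those names, as well as ZipChapterReader and ReadChapterTitles, are hypothetical and need to be adapted to the real file. It takes a Stream so it can work on an uploaded file without saving it to disk.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

public static class ZipChapterReader
{
    // Returns the chapter titles found in the first .xml entry of the archive.
    // "chapter" and "title" are assumed names, not taken from the question.
    public static List<string> ReadChapterTitles(Stream zipStream)
    {
        using (var archive = new ZipArchive(zipStream, ZipArchiveMode.Read, leaveOpen: true))
        {
            var entry = archive.Entries.FirstOrDefault(
                e => e.FullName.EndsWith(".xml", StringComparison.OrdinalIgnoreCase));
            if (entry == null)
                return new List<string>();

            using (var entryStream = entry.Open())
            {
                // Parse the entry directly from the archive; nothing is extracted to disk.
                var doc = XDocument.Load(entryStream);
                return doc.Descendants("chapter")
                          .Select(c => (string)c.Attribute("title") ?? c.Value)
                          .ToList();
            }
        }
    }
}
```

For a file on disk it could be called as ZipChapterReader.ReadChapterTitles(File.OpenRead(zipPath)); the returned list can then be bound to whatever UI control shows the chapters.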
I'm trying to edit an XML file,
but document.Save() only works if I use a different file name.
Is there any way to save to the same file, or another method I could use? Thank you!
string path = "test.xml";
using (FileStream xmlFile = File.OpenRead(path))
{
XDocument document = XDocument.Load(xmlFile);
var setupEl = document.Root;
var groupEl = setupEl.Elements().ElementAt(0);
var valueEl = groupEl.Elements().ElementAt(1);
valueEl.Value = "Test2";
document.Save("test-result.xml");
// document.Save("test.xml"); I want to use this line.
}
I receive the error:
The process cannot access the file '[...]\test.xml' because it is being used by another process.
The problem is that you are trying to write to the file while you still have it open. However, you have no need to have it open once you've loaded the XML file. Simply scoping your code more granularly will solve the issue:
string path = "test.xml";
XDocument document;
using (FileStream xmlFile = File.OpenRead(path))
{
document = XDocument.Load(xmlFile);
}
// the rest of your code
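Put together, the whole load, edit, and save cycle might look like the sketch below. The element positions are copied from the question; the class and method names (XmlInPlaceEditor, SetSecondValue) are made up for illustration.

```csharp
using System.IO;
using System.Linq;
using System.Xml.Linq;

public static class XmlInPlaceEditor
{
    // Loads the file, changes the second child of the first group,
    // and saves the result over the same path.
    public static void SetSecondValue(string path, string newValue)
    {
        XDocument document;
        using (FileStream xmlFile = File.OpenRead(path))
        {
            document = XDocument.Load(xmlFile);
        } // the read handle is released here

        var groupEl = document.Root.Elements().ElementAt(0);
        var valueEl = groupEl.Elements().ElementAt(1);
        valueEl.Value = newValue;

        // Nothing holds the file open any more, so saving to the same path works.
        document.Save(path);
    }
}
```

Calling XmlInPlaceEditor.SetSecondValue("test.xml", "Test2") would then overwrite test.xml. XDocument.Load(path) with a file name instead of a stream would also avoid the lock, since it closes the file once loading is finished.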
Hi, I need to store some data in an Excel file (.xlsx) so I can send the file to our customers and extract the data from it again later, when they send the file back to us to update some content.
The customer should not be able to see the data we store in the file, and certainly should not be able to remove it (accidentally). In fact, they should not be aware of it in any way. We also want to add this info from a service on a system that doesn't have Excel installed.
I know that a .xlsx file is basically a zip file, so I can extract it, add a file to it, zip it again, and have a valid file that can be opened by Excel. The only problem is that after saving that file in Excel, my custom XML file is removed from the package. So I need to know how to fix this.
What I have:
XNamespace MyNamespace = "http://stackoverflow.com/questions/ask";
XNamespace ExcelNamespace = "http://schemas.openxmlformats.org/package/2006/relationships";
string ExtractionPath = @"C:\temp\test\";
string ExcelFile = @"C:\temp\example.xlsx";
Directory.CreateDirectory(ExtractionPath);
System.IO.Compression.ZipFile.ExtractToDirectory(ExcelFile, ExtractionPath);
var Root = new XElement(MyNamespace + "tracker",
new XAttribute("version", "1.0.0.1"),
new XElement("connections"));
var file = Path.Combine(ExtractionPath, "connectioninfo.xml");
if (!File.Exists(file))
{
var relsPath = Path.Combine(ExtractionPath, "_rels", ".rels");
var rels = XElement.Load(relsPath);
rels.Add(new XElement(ExcelNamespace + "Relationship",
new XAttribute("Id", "XXXX"),
new XAttribute("Type", "http://schemas.openxmlformats.org/officeDocument/2006/relationships/customXml"),
new XAttribute("Target", "connectioninfo.xml")));
rels.Save(relsPath);
}
Root.Save(file);
if (File.Exists(ExcelFile)) File.Delete(ExcelFile);
System.IO.Compression.ZipFile.CreateFromDirectory(ExtractionPath, ExcelFile, System.IO.Compression.CompressionLevel.NoCompression, false);
When I run this code I end up with a file that contains my connectioninfo.xml file and that I can open in Excel. But when I save that file in Excel and unzip the package again, the connectioninfo.xml file is gone.
So the question is: what am I missing to keep the file in the package after saving?
PS: I have also tried following code, but same problem ...
(using System.IO.Packaging;)
using (Package package = Package.Open(ExcelFile, FileMode.Open, FileAccess.ReadWrite))
{
Uri uriPartTarget = new Uri("/customXml/example.xml", UriKind.Relative);
if (!package.PartExists(uriPartTarget))
{
PackagePart customXml = package.CreatePart(uriPartTarget,
"application/vnd.openxmlformats-officedocument.customXmlProperties+xml");
using (Stream partStream = customXml.GetStream(FileMode.Create,
FileAccess.ReadWrite))
{
var doc = new XElement("test", new XElement("content","Hello world!"));
doc.Save(partStream);
}
}
}
I want to read data, such as strings, from a .docx file in C# code. I looked through some existing questions but didn't understand which approach to use.
I'm trying to use ApplicationClass Application = new ApplicationClass(); but I get this error:
The type 'Microsoft.Office.Interop.Word.ApplicationClass' has no constructors defined
And I want to get the full text from my docx file, NOT SEPARATE WORDS!
foreach (FileInfo f in docFiles)
{
Application wo = new Application();
object nullobj = Missing.Value;
object file = f.FullName;
Document doc = wo.Documents.Open(ref file, .... . . ref nullobj);
doc.Activate();
doc. == ??
}
I want to know how I can get the whole text from a docx file.
This is what I was looking for to extract the whole text from a docx file:
using (ZipFile zip = ZipFile.Read(filename))
{
MemoryStream stream = new MemoryStream();
zip.Extract(@"word/document.xml", stream);
stream.Seek(0, SeekOrigin.Begin);
XmlDocument xmldoc = new XmlDocument();
xmldoc.Load(stream);
string PlainTextContent = xmldoc.DocumentElement.InnerText;
}
Try the Word.Application interface instead of ApplicationClass.
Understanding Office Primary Interop Assembly Classes and Interfaces
The .docx format, like the other Microsoft Office formats that end with "x", is simply a ZIP package that you can open, modify, and compress.
So use an Office Open XML library like this.
Enjoy.
Make sure you are using .NET Framework 4.5 or later.
using NUnit.Framework;
[TestFixture]
public class GetDocxInnerTextTestFixture
{
private string _inputFilepath = @"../../TestFixtures/TestFiles/input.docx";
[Test]
public void GetDocxInnerText()
{
string documentText = DocxInnerTextReader.GetDocxInnerText(_inputFilepath);
Assert.IsNotNull(documentText);
Assert.IsTrue(documentText.Length > 0);
}
}
using System.IO;
using System.IO.Compression;
using System.Xml;
public static class DocxInnerTextReader
{
public static string GetDocxInnerText(string docxFilepath)
{
string folder = Path.GetDirectoryName(docxFilepath);
string extractionFolder = folder + "\\extraction";
if (Directory.Exists(extractionFolder))
Directory.Delete(extractionFolder, true);
ZipFile.ExtractToDirectory(docxFilepath, extractionFolder);
string xmlFilepath = extractionFolder + "\\word\\document.xml";
var xmldoc = new XmlDocument();
xmldoc.Load(xmlFilepath);
return xmldoc.DocumentElement.InnerText;
}
}
First, you need to add references to these assemblies:
System.Xml
System.IO.Compression.FileSystem
Second, make sure these using directives are in your class file:
using System.IO;
using System.IO.Compression;
using System.Xml;
Then you can use the code below:
public string DocxToString(string docxPath)
{
// Destination of your extraction directory
string extractDir = Path.GetDirectoryName(docxPath) + "\\" + Path.GetFileName(docxPath) + ".tmp";
// Delete old extraction directory
if (Directory.Exists(extractDir)) Directory.Delete(extractDir, true);
// Extract all media and XML documents into the destination directory
ZipFile.ExtractToDirectory(docxPath, extractDir);
XmlDocument xmldoc = new XmlDocument();
// Load the extracted XML file that contains the document text
xmldoc.Load(extractDir + "\\word\\document.xml");
// Delete extraction directory
Directory.Delete(extractDir, true);
// Read all text of your document from the XML
return xmldoc.DocumentElement.InnerText;
}
Enjoy...
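The temporary extraction directory is not strictly required: the same System.IO.Compression types can open word/document.xml straight from the archive. Below is a sketch of that variant; DocxTextReader and ReadText are hypothetical names, and like the versions above it assumes that concatenating InnerText is good enough for your purposes.

```csharp
using System.IO;
using System.IO.Compression;
using System.Xml;

public static class DocxTextReader
{
    // Reads the main document part directly from the .docx package
    // without extracting anything to disk.
    public static string ReadText(Stream docxStream)
    {
        using (var archive = new ZipArchive(docxStream, ZipArchiveMode.Read, leaveOpen: true))
        {
            ZipArchiveEntry entry = archive.GetEntry("word/document.xml");
            if (entry == null)
                throw new InvalidDataException("word/document.xml not found; is this a .docx file?");

            using (Stream entryStream = entry.Open())
            {
                var xmldoc = new XmlDocument();
                xmldoc.Load(entryStream);
                return xmldoc.DocumentElement.InnerText;
            }
        }
    }
}
```

It could be called as DocxTextReader.ReadText(File.OpenRead(docxPath)). Note that InnerText joins all text nodes with no separator, so paragraphs run together; if paragraph breaks matter, you would need to walk the w:p elements yourself.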
Read sequentially through the XML files (e.g. C:\Application\XML) and get the xml for all the files.
You can read XML files as shown below:
List<string> files = Directory.GetFiles("c:\\MyDir", "*.xml").ToList();
foreach(string fileLocation in files)
{
XmlDocument obj = new XmlDocument();
obj.Load(fileLocation);
//Your code to place the xml in a queue.
}
What you need to do is implement a producer-consumer model. Have a look here: http://www.albahari.com/threading/part4.aspx and scroll down to the "Producer/Consumer Queue" part.
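As a rough illustration of that model, here is a sketch that uses BlockingCollection<T> from System.Collections.Concurrent rather than the hand-written queue in the linked article. One task loads the XML files (the producer) while the calling thread processes them (the consumer). XmlFolderPipeline, Process, and the capacity of 16 are arbitrary choices for the example.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;
using System.Xml.Linq;

public static class XmlFolderPipeline
{
    // Loads every *.xml file in 'folder' on a producer task and passes each
    // parsed document to 'handle' on the calling thread (the consumer).
    // Returns the number of documents handled.
    public static int Process(string folder, Action<XDocument> handle)
    {
        int handled = 0;
        using (var queue = new BlockingCollection<XDocument>(boundedCapacity: 16))
        {
            Task producer = Task.Run(() =>
            {
                try
                {
                    foreach (string file in Directory.EnumerateFiles(folder, "*.xml"))
                        queue.Add(XDocument.Load(file)); // blocks while the queue is full
                }
                finally
                {
                    queue.CompleteAdding(); // lets the consumer loop end
                }
            });

            foreach (XDocument doc in queue.GetConsumingEnumerable())
            {
                handle(doc);
                handled++;
            }

            producer.Wait(); // surfaces any exception thrown while loading
        }
        return handled;
    }
}
```

The bounded capacity keeps the producer from loading the whole folder into memory when the consumer is slower than the disk.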
For some classic C# XML API read here: http://msdn.microsoft.com/en-us/magazine/cc302158.aspx
foreach (var file in Directory.EnumerateFiles(path, "*.xml"))
{
var xdoc = XDocument.Load(file);
...
}
I am trying to manipulate the XML of a Word 2007 document in C#. I have managed to find and manipulate the node that I want but now I can't seem to figure out how to save it back. Here is what I am trying:
// Open the document from memoryStream
Package pkgFile = Package.Open(memoryStream, FileMode.Open, FileAccess.ReadWrite);
PackageRelationshipCollection pkgrcOfficeDocument = pkgFile.GetRelationshipsByType(strRelRoot);
foreach (PackageRelationship pkgr in pkgrcOfficeDocument)
{
if (pkgr.SourceUri.OriginalString == "/")
{
Uri uriData = new Uri("/word/document.xml", UriKind.Relative);
PackagePart pkgprtData = pkgFile.GetPart(uriData);
XmlDocument doc = new XmlDocument();
doc.Load(pkgprtData.GetStream());
NameTable nt = new NameTable();
XmlNamespaceManager nsManager = new XmlNamespaceManager(nt);
nsManager.AddNamespace("w", nsUri);
XmlNodeList nodes = doc.SelectNodes("//w:body/w:p/w:r/w:t", nsManager);
foreach (XmlNode node in nodes)
{
if (node.InnerText == "{{TextToChange}}")
{
node.InnerText = "success";
}
}
if (pkgFile.PartExists(uriData))
{
// Delete the existing "/word/document.xml" part so it can be recreated
pkgFile.DeletePart(uriData);
}
PackagePart newPkgprtData = pkgFile.CreatePart(uriData, "application/xml");
StreamWriter partWrtr = new StreamWriter(newPkgprtData.GetStream(FileMode.Create, FileAccess.Write));
doc.Save(partWrtr);
partWrtr.Close();
}
}
pkgFile.Close();
I get the error 'Memory stream is not expandable'. Any ideas?
I would recommend that you use Open XML SDK instead of hacking the format by yourself.
Using OpenXML SDK 2.0, I do this:
public void SearchAndReplace(Dictionary<string, string> tokens)
{
using (WordprocessingDocument doc = WordprocessingDocument.Open(_filename, true))
ProcessDocument(doc, tokens);
}
private string GetPartAsString(OpenXmlPart part)
{
string text = String.Empty;
using (StreamReader sr = new StreamReader(part.GetStream()))
{
text = sr.ReadToEnd();
}
return text;
}
private void SavePart(OpenXmlPart part, string text)
{
using (StreamWriter sw = new StreamWriter(part.GetStream(FileMode.Create)))
{
sw.Write(text);
}
}
private void ProcessDocument(WordprocessingDocument doc, Dictionary<string, string> tokenDict)
{
ProcessPart(doc.MainDocumentPart, tokenDict);
foreach (var part in doc.MainDocumentPart.HeaderParts)
{
ProcessPart(part, tokenDict);
}
foreach (var part in doc.MainDocumentPart.FooterParts)
{
ProcessPart(part, tokenDict);
}
}
private void ProcessPart(OpenXmlPart part, Dictionary<string, string> tokenDict)
{
string docText = GetPartAsString(part);
foreach (var keyval in tokenDict)
{
Regex expr = new Regex(_starttag + keyval.Key + _endtag);
docText = expr.Replace(docText, keyval.Value);
}
SavePart(part, docText);
}
From this you could write a GetPartAsXmlDocument, do what you want with it, and then stream it back with SavePart(part, xmlString).
Hope this helps!
You should use the OpenXML SDK to work on docx files and not write your own wrapper.
Getting Started with the Open XML SDK 2.0 for Microsoft Office
Introducing the Office (2007) Open XML File Formats
How to: Manipulate Office Open XML Formats Documents
Manipulate Docx with C# without Microsoft Word installed with OpenXML SDK
The problem appears to be doc.Save(partWrtr), which is built using newPkgprtData, which is built using pkgFile, which loads from a memory stream... Because you loaded from a memory stream it's trying to save the document back to that same memory stream. This leads to the error you are seeing.
Instead of saving it to the memory stream try saving it to a new file or to a new memory stream.
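One detail that may be relevant if the document has to stay in memory: a MemoryStream created with new MemoryStream(byte[]) wraps that array and cannot grow, whereas one created with the parameterless constructor can. The sketch below copies the bytes into a resizable stream before the package is opened. It assumes the document arrives as a byte array; StreamHelpers and ToExpandableStream are names invented for the example.

```csharp
using System;
using System.IO;

public static class StreamHelpers
{
    // Copies 'bytes' into a MemoryStream that is allowed to grow.
    public static MemoryStream ToExpandableStream(byte[] bytes)
    {
        var stream = new MemoryStream();       // parameterless constructor: resizable
        stream.Write(bytes, 0, bytes.Length);
        stream.Position = 0;                   // rewind so readers start at the beginning
        return stream;
    }
}
```

The resulting stream can then be passed to Package.Open in place of the original one, and stream.ToArray() returns the edited document after the package has been closed.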
The short and simple answer to the issue with getting 'Memory stream is not expandable' is:
Do not open the document from memoryStream.
So in that respect the earlier answer is correct, simply open a file instead.
Opening from a MemoryStream and then editing the document can (in my experience) easily lead to 'Memory stream is not expandable'.
I suppose the message appears when an edit requires the memory stream to expand.
I have found that I can make some edits, but nothing that adds to the size.
So, for example, deleting a custom XML part is OK, but adding one with some data is not.
So if you actually need to open a memory stream, you must figure out how to open an expandable MemoryStream if you want to add to it.
I have a need for this and hope to find a solution.
Stein-Tore Erdal
PS: just noticed the answer from "Jan 26 '11 at 15:18".
Don't think that is the answer in all situations.
I get the error when trying this:
var ms = new MemoryStream(bytes);
using (WordprocessingDocument wd = WordprocessingDocument.Open(ms, true))
{
...
using (MemoryStream msData = new MemoryStream())
{
xdoc.Save(msData);
msData.Position = 0;
ourCxp.FeedData(msData); // Memory stream is not expandable.