I have an XML file that I want to base some unit tests off of. Currently I load the XML file from disk in the class initialize method. I would rather have this XML generated in the test instead of reading the file from disk. Are there any utilities that will automatically generate the LINQ to XML code to generate a given XML file?
Or are there better ways to do this? Is loading from disk OK for unit tests?
I would embed the XML file directly into the assembly - no need for a string resource or anything like that, just include it as an embedded resource (create a file, go to the properties in Visual Studio, and select "Embedded Resource").
Then you can read it using Assembly.GetManifestResourceStream, load the XML from that as you would any other stream, and you're away.
I've used this technique several times - it makes it a lot easier to see the data you're interested in.
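A minimal sketch of that approach, assuming the file was marked "Embedded Resource"; the resource name "MyTests.TestData.xml" is a placeholder, since the actual manifest name is normally the default namespace plus any folder names plus the file name:

```csharp
using System.IO;
using System.Reflection;
using System.Xml.Linq;

public static class TestData
{
    public static XDocument LoadXml()
    {
        Assembly assembly = Assembly.GetExecutingAssembly();
        // "MyTests.TestData.xml" is a placeholder manifest name.
        using (Stream stream = assembly.GetManifestResourceStream("MyTests.TestData.xml"))
        {
            if (stream == null)
                throw new FileNotFoundException(
                    "Embedded resource not found; check Assembly.GetManifestResourceNames().");
            return XDocument.Load(stream);
        }
    }
}
```

If the stream comes back null, dumping Assembly.GetManifestResourceNames() is the quickest way to see what the real name is.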
It's probably better to use a resource file, for example a .resx file where you put the XML as a string resource. That's fast enough for a unit test and you don't have to do any magic. Reading from disk is not OK for various reasons (speed, the need for configuration, etc.).
I have a translation file with about 13,000 lines. At app startup I read it from a manifest resource
var resourceStream = Assembly.GetExecutingAssembly().GetManifestResourceStream("filename.csv");
and parse it with CsvParser.
This is a slow operation (it takes ~2 seconds). I am looking for a way to pre-parse it at build time, so I can access it like this:
var lines = SomeCode.ParsedLines;
Any recommendations on how I can do that? I could just write a gigantic .cs file like
ParsedLines = new string[,] { { "title1", "title2" }, { "word1", "word2" } };
but the problem is that the .csv file is modified frequently. My best guess is to create a code generator that produces this .cs file on each build, but I am wondering if there are better approaches.
This is one of the textbook use cases for source generators. Using a source generator you can parse your csv file at build time and generate source for a class that will be compiled in the subsequent compilation steps.
Another useful article is Introducing C# Source Generators. You may also find the source generators cookbook and my sandbox source generator project useful.
You can also try processing the csv file manually without CsvHelper (since you control it and you are sure about formatting, escaping, etc.), using the standard File.ReadLines or System.IO.Pipelines to improve performance.
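A minimal sketch of the manual route, assuming the file you control has no quoted fields or embedded commas; the file path "translations.csv" and the class name are placeholders:

```csharp
using System;
using System.IO;
using System.Linq;

public static class TranslationTable
{
    // Lazy so the file is read and split exactly once, on first access.
    private static readonly Lazy<string[][]> _lines = new Lazy<string[][]>(() =>
        File.ReadLines("translations.csv")      // placeholder path
            .Where(line => line.Length > 0)     // skip blank lines
            .Select(line => line.Split(','))    // naive split: no quoting support
            .ToArray());

    public static string[][] ParsedLines => _lines.Value;
}
```

This avoids the CSV library's overhead entirely, but only because the format is under your control; the moment a field can contain a comma or a quote, a real parser (or the source generator route) is the safer choice.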
I have been asked to create a project which involves loading profiles into a UI in which the user can edit the values. I need to be able to load data from a file within the project, allow the user to make changes, and save back to that file.
All of this has to be contained within an executable, but I am unsure of the best way to approach this. I was thinking of using an XML file, or a text file and just string-splitting it, or even a resources file and just calling out to it.
I thought I would put my problem up on here and see what the community suggest, thanks!
Embedded resources are not intended to be changed at runtime. A database is really easy to auto-create these days using code-first EF, but a file containing XML or JSON would also be a good option (there are third-party libraries to help you parse the result). Hand-rolling your own string.Split solution is not recommended, because if requirements get more complex in the future your code may become unmanageable.
I may be missing something very simple here, but what's the benefit of using reflection to retrieve an embedded resource from the same assembly that contains the resource, as opposed to simply retrieving it via a .resx file? I see this a lot but don't get it. Is there a reason to use Assembly.GetExecutingAssembly().GetManifestResourceStream(resource) compared to a .resx file's Resources.resource? Even Microsoft does it: How to embed and access resources.
What I mean exactly: suppose I have an assembly MyAssembly that contains an embedded resource Config.xml. The assembly has MyClass that implements a method that returns said resource as a string:
public string GetConfigXML() // returns the content of Config.xml as a string
Often, I see this implemented like this, using reflection to retrieve the resource:
public string GetConfigXML()
{
    using (Stream xmlStream = Assembly.GetExecutingAssembly()
        .GetManifestResourceStream("MyAssembly.Config.xml"))
    using (var reader = new StreamReader(xmlStream))
    {
        return reader.ReadToEnd();
    }
}
Why use GetManifestResourceStream() when you can:
add a resource file (Resource.resx) to the MyAssembly project in Visual Studio;
add Config.xml to the resource's 'Files';
get the content of Config.xml in a much simpler way: string xml = Resource.Config;
I don't know how Visual Studio handles .resx files internally, but I doubt it simply copies the resource into the .resx file (in which case you'd end up with duplicated resources). I assume it doesn't use reflection internally either, so why not simply use .resx files in situations like this, which seems much more performance-friendly to me?
but what's the benefit of using reflection to retrieve an embedded resource
The common benefit that's behind any reason to convert data from one format to another. Speed, speed, speed and convenience.
XML is a pretty decent format to keep your resources stored in. You'll have a very good guarantee that you can still retrieve the original resource 10 years from now, when the original got lost in the fog of time and a couple of machine changes without good backups. But it is quite a sucky format to have to read from: XML is very verbose, and locating a fragment requires reading from the start of the file.
Those problems disappear when Resgen.exe compiles the .xml file into a .resources file: a binary format that is fit to be linked into your assembly metadata and contains the original bytes of the resource. It is directly mapped into memory when your assembly is loaded; there is no need to find another file, open it, read it, and convert the data. Big difference.
Do use the Resource Designer to avoid having to use GetManifestResourceStream() directly. Yet more convenience.
What is the "recommended" approach for processing very large XML files in .NET 3.5?
For writing, I want to generate an element at a time then append to a file.
For reading, I would likewise want to read an element at a time (in the same order as written).
I have a few ideas how to do it using strings and File.Append, but does .NET 3.5 provide XML Api's for dealing with arbitrarily large XML files?
Without going into specifics this isn't easy to answer. .NET offers different methods to process XML files:
XmlDocument creates a DOM, supports XPath queries but loads the entire XML file into memory.
XElement/XDocument has support for LINQ and also reads the entire XML file into memory.
XmlReader is a forward-only reader. It does not read the entire file into memory.
XmlWriter is just like the XmlReader, except for writing.
Based on what you say an XmlReader/XmlWriter combination seems like the best approach.
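A minimal sketch of that combination, writing one element at a time and reading them back forward-only (the file name and element names are placeholders; all the APIs used here exist in .NET 3.5):

```csharp
using System;
using System.Xml;

class StreamingXmlDemo
{
    public static void Main()
    {
        // Write: emit one element at a time; only the writer's state is in memory.
        using (XmlWriter writer = XmlWriter.Create("big.xml"))
        {
            writer.WriteStartElement("items");
            for (int i = 0; i < 3; i++)            // imagine millions of these
            {
                writer.WriteStartElement("item");
                writer.WriteAttributeString("id", i.ToString());
                writer.WriteString("value" + i);
                writer.WriteEndElement();
            }
            writer.WriteEndElement();
        }

        // Read: forward-only, one element at a time, in document order.
        using (XmlReader reader = XmlReader.Create("big.xml"))
        {
            while (reader.ReadToFollowing("item"))
            {
                string id = reader.GetAttribute("id");
                // ReadElementContentAsString consumes the element and moves on.
                Console.WriteLine(id + ": " + reader.ReadElementContentAsString());
            }
        }
    }
}
```

Neither side ever holds the whole document, so memory use stays flat no matter how large the file grows.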
As Dirk said, using an XmlWriter/XmlReader combo sounds like the best approach. It can be very lengthy and if your XML file is fairly complex it gets very unwieldy. I had to do something similar recently with some strict memory constraints. My SO question might come in handy.
But personally, I found this method here on MSDN blogs to be very easy to implement and it neatly handles appending to the end of the XML file without fragments.
Try to make an *.xsd file out of your *.xml. You can then generate a *.cs file from the *.xsd file. After that, load your *.xml file into your object. It should take less memory than loading the whole file.
There is a plugin for VS2010 that gives the option to generate a *.cs file from an *.xsd. It is called XSD2Code. In that plugin you have an option to decorate properties for serialization. For your *.xsd file named Settings you would get Settings.cs. You would then do something like this.
using (StreamReader reader = new StreamReader("SomeFolder\\YourFile.xml"))
{
    XmlSerializer xmlSer = new XmlSerializer(typeof(Settings));
    Settings m_settings = (Settings)xmlSer.Deserialize(reader);
}
You can then query your list of objects with LINQ.
My application has historically used an ini file on the same file server as the data it consumes is located to store per user settings so that they roam if the user logs on from multiple computers. To do this we had a file that looked like:
[domain\username1]
value1=foo
value2=bar
[domain\username2]
value1=foo
value2=baz
For this release we're trying to migrate away from ini files, due to limitations in the Win32 ini read/write functions, without having to write a custom ini file parser.
I've looked at app.config and user settings files and neither appear to be suitable. The former needs to be in the same folder as the executable, and the latter doesn't provide any means to create new values at runtime.
Is there a built in option I'm missing, or is my best path to write a preferences class of my own and use the framework's XML serialization to write it out?
I have found that the fastest way here is to just create an XML file that does what you want, then use XSD.exe to create a class and serialize the data. It is fast, only a few lines of code, and works quite well.
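A minimal sketch of that round trip. The settings class here is a hand-written stand-in for what xsd.exe would generate from your schema, and the class and property names are placeholders:

```csharp
using System.IO;
using System.Xml.Serialization;

// Stand-in for an xsd.exe-generated class; your real one would mirror your schema.
public class UserSettings
{
    public string Value1 { get; set; }
    public string Value2 { get; set; }
}

public static class SettingsStore
{
    private static readonly XmlSerializer _serializer =
        new XmlSerializer(typeof(UserSettings));

    public static void Save(UserSettings settings, string path)
    {
        using (var writer = new StreamWriter(path))
            _serializer.Serialize(writer, settings);
    }

    public static UserSettings Load(string path)
    {
        using (var reader = new StreamReader(path))
            return (UserSettings)_serializer.Deserialize(reader);
    }
}
```

Since the path is just an argument, the same class handles per-user files on a file server, which keeps the roaming behavior the ini approach had.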
Have you checked out, or heard of, Nini, a third-party ini handler? I found it quite easy to use and simple for reading from and writing to an ini file.
For your benefit, it would mean very few changes, and it is easier to use.
The conversion from ini to another format needs to be weighed up: the code impact, the ease of programming, and so on (a nitpicky aside: changing the code to use XML may be easy, but it is limiting in that you cannot write to it). What the benefit would be of ripping out the ini code and replacing it with XML is a question you have to decide.
There may well be a knock-on effect, such as having to change and adapt the code. But for the foreseeable future, sure, ini is a bit outdated and old, but it is still in use. I cannot see Microsoft dropping the ini API support, as it is very much alive and in use behind the scenes for driver installation; think of the .inf files used to specify where drivers go and how they are installed. It is here to stay, as driver manufacturers have adopted it as the de facto standard way of distributing drivers.
Hope this helps,
Best regards,
Tom.