How to read and output the XML within an SPFile? - c#

I have this line of code that retrieves an XML file and saves it to an SPFile
SPFile XMLFile = SPContext.Current.Web.GetFile("C:\\Users\\maleem\\Documents\\XMLTest.xml");
I want to get the XML/Text within it and output it to a literal, I tried
StreamReader reader = new StreamReader(XMLFile.OpenBinaryStream());
And a few variants, but it's not working.

If you use the OpenBinary method of SPFile, the return value is a byte array, which you can then convert into a string.
Depending on the encoding you can try this:
For default encoding:
string str = System.Text.Encoding.Default.GetString(XMLFile.OpenBinary());
For UTF8:
string str = System.Text.Encoding.UTF8.GetString(XMLFile.OpenBinary());
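One caveat with GetString: if the file starts with a byte order mark, the BOM stays in the resulting string as U+FEFF, which can trip up XML parsing later. The following is a minimal sketch, not SharePoint-specific code. The hard-coded byte array is a stand-in for XMLFile.OpenBinary() so the example runs on its own, and the Decode helper is a hypothetical name. It wraps the bytes in a StreamReader, which detects and drops the BOM.

```csharp
using System;
using System.IO;
using System.Text;

class Program
{
    // Decodes bytes to text, honoring a BOM if one is present
    // and falling back to UTF-8 otherwise.
    static string Decode(byte[] bytes)
    {
        using (var ms = new MemoryStream(bytes))
        using (var reader = new StreamReader(ms, Encoding.UTF8, true))
        {
            return reader.ReadToEnd();
        }
    }

    static void Main()
    {
        // Stand-in for XMLFile.OpenBinary(): a UTF-8 BOM followed by "<a/>".
        byte[] bytes = { 0xEF, 0xBB, 0xBF, 0x3C, 0x61, 0x2F, 0x3E };

        string viaGetString = Encoding.UTF8.GetString(bytes); // keeps the BOM as U+FEFF
        string viaReader = Decode(bytes);                     // BOM removed

        Console.WriteLine(viaGetString.Length);
        Console.WriteLine(viaReader.Length);
    }
}
```

The string from GetString is one character longer than the one from the reader, because the BOM survives the direct conversion.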

Related

XmlWriter trimming my string

I am trying to return an XML string as a CLOB from an Oracle stored procedure to a C# string.
Then I write this string to a file using the XmlWriter class.
My code looks like the following:
string myString= ((Oracle.ManagedDataAccess.Types.OracleClob)(cmd.Parameters["paramName"].Value)).Value.ToString();
string fileName = DateTime.Now.ToString("yyyyMMddHHmmss");
var stream = new MemoryStream();
var writer = XmlWriter.Create(stream);
writer.WriteRaw(myString);
stream.Position = 0;
var fileStreamResult = File(stream, "application/octet-stream", "ABCD"+fileName+".xml");
return fileStreamResult;
When I check the CLOB output, it is returned completely in myString.
But when I check the end result, the XML file is trimmed at the end.
My string is huge, for example a length of 3382563 or more.
Is there any setting that makes XmlWriter write the complete string to the file?
Thanks in advance.
Sounds like all you want to do is grab some string value out of your Database, and write that string value in a text file. The string being xml does not actually force you into using an XML specific class or method unless you want to do XML specific operations, which I do not see in your snippet. Therefore, I suggest you simply grab the string value and spit it out in a file in the easiest way.
string myString = " blah blah blah keep my spaces ";
using (StreamWriter sw = new StreamWriter(@"M:\StackOverflowQuestionsAndAnswers\XMLWriterTrimmingString_45380476\bin\Debug\outputfile.xml"))
{
sw.Write(myString);
}
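If you do want to keep XmlWriter, a likely cause of the truncation is that the writer buffers its output and the question's code rewinds the stream without flushing it first. The sketch below is an illustration under that assumption, not a confirmed diagnosis. The payload is a hypothetical stand-in for the CLOB value, and the Fragment conformance level is chosen here only so that WriteRaw is accepted at the top level.

```csharp
using System;
using System.IO;
using System.Xml;

class Program
{
    static void Main()
    {
        // Hypothetical payload standing in for the CLOB value.
        string myString = "<root>" + new string('x', 1000000) + "</root>";

        var stream = new MemoryStream();
        var settings = new XmlWriterSettings { ConformanceLevel = ConformanceLevel.Fragment };
        var writer = XmlWriter.Create(stream, settings);
        writer.WriteRaw(myString);

        long beforeFlush = stream.Length;
        writer.Flush(); // push whatever is still buffered into the stream
        long afterFlush = stream.Length;

        Console.WriteLine("Before Flush: " + beforeFlush);
        Console.WriteLine("After Flush:  " + afterFlush);

        // Rewind only after everything has been written.
        stream.Position = 0;
    }
}
```

If the two lengths differ, the tail of the document was still sitting in the writer's buffer, which matches the "trimmed at the end" symptom.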

Extract XML Data from String

I have an input stream that is generated when I upload a file (XML type). I need the XML data in the code-behind. I read the data into a string using
StreamReader stream = new StreamReader(Request.InputStream);
string x = stream.ReadToEnd();
The string also contains the following data at the start:
------WebKitFormBoundary8na5dBbHc4ydfxVU
Content-Disposition: form-data; name="MyFile"; filename="Test 123.vfc"
Content-Type: application/octet-stream
and the following at the end of the string:
------WebKitFormBoundary8na5dBbHc4ydfxVU--
I do not need this data. Please help me get just the XML string.
First, remove the first three lines and the last line from your string.
int n = 3;
// Split on whole newline sequences; splitting on the individual '\r' and '\n' characters yields empty entries.
string[] lines = str.Split(new[] { Environment.NewLine }, StringSplitOptions.None).Skip(n).ToArray();
string output = string.Join(Environment.NewLine, lines);
// Compute the index on output, not on the original str.
output = output.Remove(output.LastIndexOf(Environment.NewLine));
If your XML string doesn't have a root node, add one as follows.
string xmlTxt = "<ROOT>" + xmlString + "</ROOT>";
If you already have a root node, skip the step above. For a well-formed XML string you can just use the code below.
XmlDocument xmlDocument = new XmlDocument();
xmlDocument.InnerXml = xmlTxt;
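Counting lines is fragile, because the part headers are followed by a blank line and their number can vary. A slightly more robust sketch, assuming a single uploaded file part and CRLF line endings as in the sample above, looks for the blank line that ends the headers and for the closing boundary. ExtractBody is a hypothetical helper name. In ASP.NET the uploaded file is usually also reachable through Request.Files, which avoids this parsing altogether.

```csharp
using System;

class Program
{
    // Returns the text between the part headers and the closing boundary.
    static string ExtractBody(string raw)
    {
        const string headerEnd = "\r\n\r\n"; // blank line after the part headers
        int start = raw.IndexOf(headerEnd, StringComparison.Ordinal);
        if (start < 0) return raw;           // no headers found, return input unchanged
        start += headerEnd.Length;

        // The closing boundary starts on its own line with "--".
        int end = raw.LastIndexOf("\r\n--", StringComparison.Ordinal);
        if (end < start) end = raw.Length;

        return raw.Substring(start, end - start);
    }

    static void Main()
    {
        string raw =
            "------WebKitFormBoundary8na5dBbHc4ydfxVU\r\n" +
            "Content-Disposition: form-data; name=\"MyFile\"; filename=\"Test 123.vfc\"\r\n" +
            "Content-Type: application/octet-stream\r\n" +
            "\r\n" +
            "<ROOT><item>1</item></ROOT>\r\n" +
            "------WebKitFormBoundary8na5dBbHc4ydfxVU--\r\n";

        Console.WriteLine(ExtractBody(raw));
    }
}
```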

Convert from Unicode Characters to blank in XML File

How can I convert specific Unicode characters in an XML file, which are not valid in XML, to blanks at deserialization time? I tried the Regex below, but without success.
string strXML = File.ReadAllText("Xml File Path", Encoding.UTF8);
System.Text.RegularExpressions.Regex _invalidXMLChars = new System.Text.RegularExpressions.Regex(@"(?<![\uD800-\uDBFF])[\uDC00-\uDFFF]|[\uD800-\uDBFF](?![\uDC00-\uDFFF])|[\x00-\x08\x0B\x0C\x0E-\x1F\x7F-\x9F\uFEFF\uFFFE\uFFFF\00EF\00BB\00BF]", System.Text.RegularExpressions.RegexOptions.Compiled);
strXML = _invalidXMLChars.Replace(strXML, "");
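As an alternative to a hand-written pattern, the framework's own character test can be used. The sketch below assumes .NET 4.0 or later, where XmlConvert.IsXmlChar and XmlConvert.IsXmlSurrogatePair are available. StripInvalidXmlChars is a hypothetical helper name. Note that it removes only what XML 1.0 forbids, so unlike the Regex above it keeps U+FEFF and the \x7F-\x9F range.

```csharp
using System;
using System.Text;
using System.Xml;

class Program
{
    // Copies only characters that are legal in XML 1.0 into the result.
    static string StripInvalidXmlChars(string text)
    {
        var sb = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (XmlConvert.IsXmlChar(c))
            {
                sb.Append(c);
            }
            else if (i + 1 < text.Length && XmlConvert.IsXmlSurrogatePair(text[i + 1], c))
            {
                // c is a high surrogate followed by a matching low surrogate: keep both.
                sb.Append(c).Append(text[i + 1]);
                i++;
            }
            // Anything else (control characters, lone surrogates, U+FFFE, U+FFFF) is dropped.
        }
        return sb.ToString();
    }

    static void Main()
    {
        // The control character and U+FFFF are dropped; the letters are kept.
        string dirty = "a\u0001b\uFFFFc";
        Console.WriteLine(StripInvalidXmlChars(dirty));
    }
}
```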

How to read a file into a string with CR/LF preserved?

If I asked the question "how to read a file into a string" the answer would be obvious. However -- here is the catch with CR/LF preserved.
The problem is, File.ReadAllText strips those characters. StreamReader.ReadToEnd just converted LF into CR for me, which led to a long investigation into where I had a bug in pretty obvious code ;-)
So, in short, if I have file containing foo\n\r\nbar I would like to get foo\n\r\nbar (i.e. exactly the same content), not foo bar, foobar, or foo\n\n\nbar. Is there some ready to use way in .Net space?
The outcome should be always single string, containing entire file.
Are you sure that those methods are the culprits that are stripping out your characters?
I tried to write up a quick test; StreamReader.ReadToEnd preserves all newline characters.
string str = "foo\n\r\nbar";
using (Stream ms = new MemoryStream(Encoding.ASCII.GetBytes(str)))
using (StreamReader sr = new StreamReader(ms, Encoding.UTF8))
{
string str2 = sr.ReadToEnd();
Console.WriteLine(string.Join(",", str2.Select(c => ((int)c))));
}
// Output: 102,111,111,10,13,10,98,97,114
// f o o \n \r \n b a r
An identical result is achieved when writing to and reading from a temporary file:
string str = "foo\n\r\nbar";
string temp = Path.GetTempFileName();
File.WriteAllText(temp, str);
string str2 = File.ReadAllText(temp);
Console.WriteLine(string.Join(",", str2.Select(c => ((int)c))));
It appears that your newlines are getting lost elsewhere.
This piece of code will preserve LF and CR:
string r = File.ReadAllText(@".\TestData\TR120119.TRX", Encoding.ASCII);
The outcome should be always single string, containing entire file.
It takes two hops. First one is File.ReadAllBytes() to get all the bytes in the file. Which doesn't try to translate anything, you get the raw data in the file so the weirdo line-endings are preserved as-is.
But that's bytes, you asked for a string. So second hop is to apply Encoding.GetString() to convert the bytes to a string. The one thing you have to do is pick the right Encoding class, the one that matches the encoding used by the program that wrote the file. Given that the file is pretty messed up if it contains \n\r\n sequences, and you didn't document anything else about the file, your best bet is to use Encoding.Default. Tweak as necessary.
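The two hops can be sketched as follows. This is a minimal example that writes its own temporary file so it runs standalone. Encoding.Default is used as suggested above and is an assumption about how the file was written.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

class Program
{
    static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            // Create a file with the mixed line endings from the question.
            File.WriteAllBytes(path, Encoding.ASCII.GetBytes("foo\n\r\nbar"));

            // Hop 1: raw bytes, nothing is translated.
            byte[] raw = File.ReadAllBytes(path);

            // Hop 2: decode with the encoding the writer used.
            string text = Encoding.Default.GetString(raw);

            // Print the character codes to show that \n\r\n is intact.
            Console.WriteLine(string.Join(",", text.Select(c => (int)c)));
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```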
You can read the contents of a file using File.ReadAllLines, which will return an array of the lines. Then use String.Join to merge the lines together using a separator.
string[] lines = File.ReadAllLines(@"C:\Users\User\file.txt");
string allLines = String.Join("\r\n", lines);
Note that this will lose the precision of the actual line terminator characters. For example, if the lines end in only \n or \r, the resulting string allLines will have replaced them with \r\n line terminators.
There are of course other ways of achieving this without losing the true EOL terminator; however, ReadAllLines is handy in that it can detect many types of text encoding by itself, and it also takes up very few lines of code.
ReadAllText doesn't return carriage returns.
This method opens a file, reads each line of the file, and then adds each line as an element of a string. It then closes the file. A line is defined as a sequence of characters followed by a carriage return ('\r'), a line feed ('\n'), or a carriage return immediately followed by a line feed. The resulting string does not contain the terminating carriage return and/or line feed.
From MSDN - https://msdn.microsoft.com/en-us/library/ms143368(v=vs.110).aspx
This is similar to the accepted answer, but I wanted to be more to the point. sr.ReadToEnd() will read the bytes as desired:
string myFilePath = @"C:\temp\somefile.txt";
string myEvents = String.Empty;
FileStream fs = new FileStream(myFilePath, FileMode.Open);
StreamReader sr = new StreamReader(fs);
myEvents = sr.ReadToEnd();
sr.Close();
fs.Close();
You could even also do those in cascaded using statements. But I wanted to describe how the way you write to that file in the first place will determine how to read the content from the myEvents string, and might really be where the problem lies. I wrote to my file like this:
using System.Reflection;
using System.IO;
private static void RecordEvents(string someEvent)
{
string folderLoc = Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location);
if (!folderLoc.EndsWith(@"\")) folderLoc += @"\";
folderLoc = folderLoc.Replace(@"\\", @"\"); // replace double-slashes with single slashes
string myFilePath = folderLoc + "myEventFile.txt";
if (!File.Exists(myFilePath))
File.Create(myFilePath).Close(); // must .Close() since will conflict with opening FileStream, below
FileStream fs = new FileStream(myFilePath, FileMode.Append);
StreamWriter sr = new StreamWriter(fs);
sr.Write(someEvent + Environment.NewLine);
sr.Close();
fs.Close();
}
Then I could use the code farther above to get the string of the contents. Because I was going further and looking for the individual strings, I put this code after THAT code, up there:
if (myEvents != String.Empty) // we have something
{
// (char)2660 is decimal 2660, i.e. U+0A64; ♠ itself would be (char)0x2660.
// Any delimiter I did not expect to find in my text would work here.
myEvents = myEvents.Replace(Environment.NewLine, ((char)2660).ToString());
string[] eventArray = myEvents.Split((char)2660);
foreach (string s in eventArray)
{
if (!String.IsNullOrEmpty(s))
{
// do whatever with the individual strings from your file
}
}
}
And this worked fine. So I know that myEvents had to have the Environment.NewLine characters preserved because I was able to replace it with (char)2660 and do a .Split() on that string using that character to divide it into the individual segments.

Using XDocument to write raw XML

I'm trying to create a spreadsheet in XML Spreadsheet 2003 format (so Excel can read it). I'm writing out the document using the XDocument class, and I need to get a newline in the body of one of the <Cell> tags. Excel, when it reads and writes, requires the files to have the literal string &#10; embedded in the string to correctly show the newline in the spreadsheet. It also writes it out as such.
The problem is that XDocument is writing CR-LF (\r\n) when I have newlines in my data, and it automatically escapes ampersands for me when I try to do a .Replace() on the input string, so I end up with &amp;#10; in my file, which Excel just happily writes out as a string literal.
Is there any way to make XDocument write out the literal &#10; as part of the XML stream? I know I can do it by deriving from XmlTextWriter, or literally just writing out the file with a TextWriter, but I'd prefer not to if possible.
I wonder if it might be better to use XmlWriter directly, and WriteRaw?
A quick check shows that XmlDocument makes a slightly better job of it, but xml and whitespace gets tricky very quickly...
I battled with this problem for a couple of days and finally came up with this solution. I used the XmlDocument.Save(Stream) method, then got the formatted XML string from the stream. Then I replaced the &amp;#10; occurrences with &#10; and used the TextWriter to write the string to a file.
string xml = "<?xml version=\"1.0\"?><?mso-application progid='Excel.Sheet'?><Workbook xmlns=\"urn:schemas-microsoft-com:office:spreadsheet\" xmlns:o=\"urn:schemas-microsoft-com:office:office\" xmlns:x=\"urn:schemas-microsoft-com:office:excel\" xmlns:ss=\"urn:schemas-microsoft-com:office:spreadsheet\" xmlns:html=\"http://www.w3.org/TR/REC-html40\">";
xml += "<Styles><Style ss:ID=\"s1\"><Alignment ss:Vertical=\"Center\" ss:WrapText=\"1\"/></Style></Styles>";
xml += "<Worksheet ss:Name=\"Default\"><Table><Column ss:Index=\"1\" ss:AutoFitWidth=\"0\" ss:Width=\"75\" /><Row><Cell ss:StyleID=\"s1\"><Data ss:Type=\"String\">Hello&amp;#10;&amp;#10;World</Data></Cell></Row></Table></Worksheet></Workbook>";
System.Xml.XmlDocument doc = new System.Xml.XmlDocument();
doc.LoadXml(xml); //load the xml string
System.IO.MemoryStream stream = new System.IO.MemoryStream();
doc.Save(stream); //save the xml as a formatted string
stream.Position = 0; //reset the stream position since it will be at the end from the Save method
System.IO.StreamReader reader = new System.IO.StreamReader(stream);
string formattedXML = reader.ReadToEnd(); //fetch the formatted XML into a string
formattedXML = formattedXML.Replace("&amp;#10;", "&#10;"); //Replace the unhelpful &amp;#10;'s with the wanted endline entity
System.IO.TextWriter writer = new System.IO.StreamWriter("C:\\Temp\\test1.xls"); //note the escaped backslash; "\t" would be a tab
writer.Write(formattedXML); //write the XML to a file
writer.Close();
