Generate HTML/XML Markup in C# Console Application? - c#

I am working on a console application that generates XML/HTML output. Currently I have the code below, which builds the markup from hard-coded strings.
Is it possible to use Razor or some other tool to make this cleaner?
foreach (var file in _files)
{
    TagLib.File f = TagLib.File.Create(file.FullName);
    string title = f.Tag.Title;
    string url = _urlPrefix + file.Name;
    StringBuilder sb = new StringBuilder();
    sb.Append("<item>\n");
    sb.AppendFormat("\t<title>{0}</title>\n", HttpUtility.HtmlEncode(title));
    sb.AppendFormat("\t<pubDate>{0}</pubDate>\n", file.CreationTimeUtc.ToString("r"));
    sb.AppendFormat("\t<guid isPermaLink=\"false\">{0}</guid>\n", url);
    sb.AppendFormat("\t<enclosure url=\"{0}\" length=\"{1}\" type=\"audio/mpeg\" />\n", url, file.Length);
    sb.Append("\t<itunes:subtitle></itunes:subtitle>\n");
    sb.Append("\t<itunes:summary></itunes:summary>\n");
    sb.Append("</item>\n");
    items.AppendLine(sb.ToString());
}

Just use the System.Xml classes to help you.
https://msdn.microsoft.com/pt-br/library/system.xml(v=vs.110).aspx
sample code:
XmlDocument doc = new XmlDocument();
XmlElement el = (XmlElement)doc.AppendChild(doc.CreateElement("Foo"));
el.SetAttribute("Bar", "some & value");
el.AppendChild(doc.CreateElement("Nested")).InnerText = "data";
Console.WriteLine(doc.OuterXml);

There is an open-source project that lets you use Razor as a general-purpose templating engine; it's called RazorEngine.
string template = "Hello @Model.Name! Welcome to Razor!";
string result = Razor.Parse(template, new { Name = "World" });
I think this will be helpful for you.

You don't have to use a StringBuilder to output XML, you can use LINQ to XML (https://msdn.microsoft.com/en-us/library/bb387061.aspx) or XmlWriter (https://msdn.microsoft.com/en-us/library/system.xml.xmlwriter(v=vs.110).aspx) or the DOM API (https://msdn.microsoft.com/en-us/library/t058x2df(v=vs.110).aspx).
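As a rough illustration of the LINQ to XML option, the sketch below builds one feed item without any string concatenation. The class name, the BuildItem helper, and its parameters are hypothetical stand-ins for the values the question reads from TagLib and FileInfo; the namespace URI is the usual iTunes podcast namespace, and the surrounding rss/channel elements are assumed rather than taken from the question.

```csharp
using System;
using System.Xml.Linq;

static class FeedItemSketch
{
    // Namespace for the itunes:* elements (assumed; the question does not show it).
    static readonly XNamespace Itunes = "http://www.itunes.com/dtds/podcast-1.0.dtd";

    // Hypothetical helper: builds one <item> from plain values.
    public static XElement BuildItem(string title, string url, long length, DateTime createdUtc)
    {
        return new XElement("item",
            new XElement("title", title), // text content is escaped by the library
            new XElement("pubDate", createdUtc.ToString("r")),
            new XElement("guid", new XAttribute("isPermaLink", "false"), url),
            new XElement("enclosure",
                new XAttribute("url", url),
                new XAttribute("length", length),
                new XAttribute("type", "audio/mpeg")),
            new XElement(Itunes + "subtitle", ""),
            new XElement(Itunes + "summary", ""));
    }

    static void Main()
    {
        XElement item = BuildItem("Rock & Roll", "http://example.com/a.mp3", 12345,
            new DateTime(2015, 1, 2, 3, 4, 5, DateTimeKind.Utc));

        // Declaring xmlns:itunes on the root lets the children serialize with the itunes: prefix.
        var rss = new XElement("rss",
            new XAttribute("version", "2.0"),
            new XAttribute(XNamespace.Xmlns + "itunes", Itunes),
            new XElement("channel", item));

        Console.WriteLine(rss);
    }
}
```

Because the title is passed as a value rather than spliced into a string, characters such as & and < are escaped on output, so no separate HtmlEncode call is needed.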

Related

Return a XDocument generated XML file to browser in C#?

I created an XML 1.0 file using the XDocument class in the .NET Framework.
The question is: how do I return the generated file to the browser to allow the user to save it?
I have var generatedXMLfile = generateXml(Parameters param); this method returns an XDocument instance with my XML.
Then I need to take the generatedXMLfile and return it to the browser as an XML file.
I don't want to write a file on the server and then pass it to the browser; maybe it would be better to keep a temporary file in memory.
Thank You.
You need to convert it to a string and then return the string with an XML content type such as application/xml or text/xml.
You can convert the document to a string like this:
var sb = new StringBuilder();
var tr = new StringWriter(sb);
xmlDoc.Save(tr);
var xmlToSendToClient = sb.ToString();
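One detail worth knowing about the StringWriter approach is that a StringWriter reports UTF-16 as its encoding, so the saved XML declaration says utf-16. If the response is sent as UTF-8, a possible alternative is to serialize straight to bytes in memory. The following is only a sketch: the class and method names are made up for illustration, and how the bytes are then written to the response (content type, Content-Disposition header) depends on the web framework in use.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Linq;

static class XmlDownloadSketch
{
    // Hypothetical helper: serializes the document to UTF-8 bytes without touching the disk.
    public static byte[] ToUtf8Bytes(XDocument doc)
    {
        var settings = new XmlWriterSettings
        {
            // UTF-8 without a byte order mark; the declaration will state encoding="utf-8".
            Encoding = new UTF8Encoding(false),
            Indent = true
        };

        using (var stream = new MemoryStream())
        {
            using (var writer = XmlWriter.Create(stream, settings))
            {
                doc.Save(writer);
            } // disposing the writer flushes everything into the stream

            return stream.ToArray();
        }
    }

    static void Main()
    {
        var doc = new XDocument(new XElement("root", new XElement("child", "value")));
        byte[] bytes = ToUtf8Bytes(doc);
        Console.WriteLine(Encoding.UTF8.GetString(bytes));
    }
}
```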

Weird encoded characters (â€™) appearing from a feed

I've got a question regarding an XML feed and XSL transformation I'm doing. In a few parts of the outputted feed on an HTML page, I get weird characters (such as â€™) appearing on the page.
On another site (that I don't own) that's using the same feed, it isn't getting these characters.
Here's the code I'm using to grab and return the transformed content:
string xmlUrl = "http://feedurl.com/feed.xml";
string xmlData = new System.Net.WebClient().DownloadString(xmlUrl);
string xslUrl = "http://feedurl.com/transform.xsl";
XsltArgumentList xslArgs = new XsltArgumentList();
xslArgs.AddParam("type", "", "specifictype");
string resultText = Utils.XslTransform(xmlData, xslUrl, xslArgs);
return resultText;
And my Utils.XslTransform function looks like this:
static public string XslTransform(string data, string xslurl, XsltArgumentList xslArgs)
{
    TextReader textReader = new StringReader(data);
    XmlReaderSettings settings = new XmlReaderSettings();
    settings.DtdProcessing = DtdProcessing.Ignore;
    XmlReader xmlReader = XmlReader.Create(textReader, settings);
    XmlReader xslReader = new XmlTextReader(Uri.UnescapeDataString(xslurl));
    XslCompiledTransform myXslT = new XslCompiledTransform();
    myXslT.Load(xslReader);
    StringBuilder sb = new StringBuilder();
    using (TextWriter tw = new StringWriter(sb))
    {
        // Pass the caller's arguments through so the "type" parameter reaches the stylesheet.
        myXslT.Transform(xmlReader, xslArgs, tw);
    }
    string transformedData = sb.ToString();
    return transformedData;
}
I'm not extremely knowledgeable with character encoding issues and I've been trying to nip this in the bud for a bit of time and could use any suggestions possible. I'm not sure if there's something I need to change with how the WebClient downloads the file or something going weird in the XslTransform.
Thanks!
Give HtmlEncode a try. So in this case you would reference System.Web and then make this change (just call the HtmlEncode function on the last line):
string xmlUrl = "http://feedurl.com/feed.xml";
string xmlData = new System.Net.WebClient().DownloadString(xmlUrl);
string xslUrl = "http://feedurl.com/transform.xsl";
XsltArgumentList xslArgs = new XsltArgumentList();
xslArgs.AddParam("type", "", "specifictype");
string resultText = Utils.XslTransform(xmlData, xslUrl, xslArgs);
return HttpUtility.HtmlEncode(resultText);
The character â is the telltale sign of a multibyte UTF-8 sequence (here â€™, the three bytes of ’) being decoded as a single-byte encoding such as Windows-1252. So, I guess, you generate an HTML file in UTF-8 while the browser interprets it otherwise. I see two ways to fix it:
The simplest solution would be to update the XSLT to include the HTML meta tag that will hint the correct encoding to browser: <meta charset="UTF-8">.
If your transform already defines a different encoding in meta tag and you'd like to keep it, this encoding needs to be specified in the function that saves XML as file. I assume this function took ASCII by default in your example. If your XSLT was configured to generate XML files directly to disk, you could adjust it with XSLT instruction <xsl:output encoding="ASCII"/>.
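To see where the stray â comes from, the small sketch below encodes the right single quotation mark as UTF-8 and then deliberately decodes the bytes with a single-byte encoding. It uses ISO-8859-1 because that encoding is available on every .NET runtime without extra setup; browsers typically fall back to Windows-1252, which differs only in how the second and third bytes are displayed. The class and method names are illustrative only.

```csharp
using System;
using System.Text;

static class MojibakeSketch
{
    // Encodes text as UTF-8, then decodes the bytes as ISO-8859-1 (one character per byte).
    public static string Misdecode(string text)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        return Encoding.GetEncoding("ISO-8859-1").GetString(utf8Bytes);
    }

    static void Main()
    {
        string quote = "\u2019"; // right single quotation mark

        // The UTF-8 form of U+2019 is three bytes.
        Console.WriteLine(BitConverter.ToString(Encoding.UTF8.GetBytes(quote)));

        // Decoding those bytes one at a time yields three characters, the first being U+00E2 (â).
        string wrong = Misdecode(quote);
        Console.WriteLine(wrong.Length);
        Console.WriteLine(wrong[0] == '\u00E2');
    }
}
```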
To use WebClient.DownloadString you have to know what encoding the server is going to use and tell the WebClient in advance. It's a bit of a Catch-22.
But there is no need to do that. Use WebClient.DownloadData or WebClient.OpenRead and let an XML library figure out which encoding to use.
using (var web = new WebClient())
using (var stream = web.OpenRead("http://unicode.org/repos/cldr/trunk/common/supplemental/windowsZones.xml"))
using (var reader = XmlReader.Create(stream, new XmlReaderSettings { DtdProcessing = DtdProcessing.Parse }))
{
reader.MoveToContent();
//… use reader as you will, including var doc = XDocument.ReadFrom(reader);
}

Read content of Web Browser in WPF

Hello developers, I want to read external content from a website, such as the element inside a given tag. I am using the WebBrowser control, and here is my code; however, this code just fills my WebBrowser control with the web page:
public MainWindow()
{
InitializeComponent();
wbMain.Navigate(new Uri("http://www.annonymous.com", UriKind.RelativeOrAbsolute));
}
You can use the Html Agility Pack library to parse any HTML-formatted data.
HtmlDocument doc = new HtmlDocument();
// LoadHtml parses markup from a string; Load expects a file path or stream.
doc.LoadHtml(wbMain.DocumentText);
var nodes = doc.DocumentNode.SelectNodes("//a[@href]");
NOTE: The method SelectNodes accepts XPath, not CSS or jQuery selectors.
var node = doc.DocumentNode.SelectNodes("id('my_element_id')");
As I understood from your question, you are only trying to parse the HTML data, and you don't need to show the actual web page.
If that is the case, then you can take a very simple approach and use HttpWebRequest:
var _plainText = string.Empty;
var _request = (HttpWebRequest)WebRequest.Create("http://www.google.com");
_request.Timeout = 5000;
_request.Method = "GET";
_request.ContentType = "text/plain";
using (var _webResponse = (HttpWebResponse)_request.GetResponse())
{
var _webResponseStatus = _webResponse.StatusCode;
var _stream = _webResponse.GetResponseStream();
using (var _streamReader = new StreamReader(_stream))
{
_plainText = _streamReader.ReadToEnd();
}
}
Try this:
dynamic doc = wbMain.Document;
var htmlText = doc.documentElement.InnerHtml;
edit: Taken from here.

HtmlAgilityPack: how to create indented HTML?

So, I am generating html using HtmlAgilityPack and it's working perfectly, but html text is not indented. I can get indented XML however, but I need HTML. Is there a way?
HtmlDocument doc = new HtmlDocument();
// gen html
HtmlNode table = doc.CreateElement("table");
table.Attributes.Add("class", "tableClass");
HtmlNode tr = doc.CreateElement("tr");
table.ChildNodes.Append(tr);
HtmlNode td = doc.CreateElement("td");
td.InnerHtml = "—";
tr.ChildNodes.Append(td);
// write text, no indent :(
using(StreamWriter sw = new StreamWriter("table.html"))
{
table.WriteTo(sw);
}
// write xml, nicely indented but it's XML!
XmlWriterSettings settings = new XmlWriterSettings();
settings.OmitXmlDeclaration = true;
settings.Indent = true;
settings.ConformanceLevel = ConformanceLevel.Fragment;
using (XmlWriter xw = XmlTextWriter.Create("table.xml", settings))
{
table.WriteTo(xw);
}
AngleSharp is fast, reliable, pure C#, and .NET Core compatible.
You can parse the HTML with AngleSharp, which provides a way to auto-indent:
var parser = new HtmlParser();
var document = parser.ParseDocument(text);
using (var writer = new StringWriter())
{
document.ToHtml(writer, new PrettyMarkupFormatter
{
Indentation = "\t",
NewLine = "\n"
});
var indentedText = writer.ToString();
}
No, and it's a "by design" choice. There is a big difference between XML (or XHTML, which is XML, not HTML), where most of the time whitespace has no specific meaning, and HTML.
This is not such a minor improvement, as changing whitespace can change the way some browsers render a given HTML chunk, especially malformed HTML (which is in general handled well by the library). The Html Agility Pack was designed to preserve the way the HTML is rendered, not to tidy up the way the markup is written.
I'm not saying it's not feasible or plain impossible. Obviously you can convert to XML and voilà (and you could write an extension method to make this easier) but the rendered output may be different, in the general case.
As far as I know, HtmlAgilityPack cannot do this. But you could look through html tidy packs which are proposed in similar questions:
Html Agility Pack: make code look neat
Which is the best HTML tidy pack? Is there any option in HTML agility pack to make HTML webpage tidy?
I had the same experience: even though HtmlAgilityPack is great for reading and modifying HTML (or, in my case, ASP) files, you cannot create readable output with it.
However, I ended up writing a few lines of code that work for me.
Given an HtmlDocument named "m_htmlDocument", I create my HTML file as follows:
// The using block flushes and closes the file once all nodes are written.
using (var file = new System.IO.StreamWriter(_sFullPath))
{
    if (m_htmlDocument.DocumentNode != null)
        foreach (var node in m_htmlDocument.DocumentNode.ChildNodes)
            WriteNode(file, node, 0);
}
and
void WriteNode(System.IO.StreamWriter _file, HtmlNode _node, int _indentLevel)
{
    // check parameter
    if (_file == null) return;
    if (_node == null) return;
    // init
    string INDENT = " ";
    string NEW_LINE = System.Environment.NewLine;
    // case: no children
    if (_node.HasChildNodes == false)
    {
        for (int i = 0; i < _indentLevel; i++)
            _file.Write(INDENT);
        _file.Write(_node.OuterHtml);
        _file.Write(NEW_LINE);
    }
    // case: node has children
    else
    {
        // indent
        for (int i = 0; i < _indentLevel; i++)
            _file.Write(INDENT);
        // open tag
        _file.Write(string.Format("<{0} ", _node.Name));
        if (_node.HasAttributes)
            foreach (var attr in _node.Attributes)
                _file.Write(string.Format("{0}=\"{1}\" ", attr.Name, attr.Value));
        _file.Write(string.Format(">{0}", NEW_LINE));
        // children
        foreach (var chldNode in _node.ChildNodes)
            WriteNode(_file, chldNode, _indentLevel + 1);
        // close tag
        for (int i = 0; i < _indentLevel; i++)
            _file.Write(INDENT);
        _file.Write(string.Format("</{0}>{1}", _node.Name, NEW_LINE));
    }
}

Getting nothing while parsing XML response from the .aspx page of VS 2005 in JQuery

I am not able to get xml response data from .aspx page of VS 2005. Following is the function which writes xml response on the client end:
protected void GetMailContents(double pdblMessageID)
{
string lstrMailContents = "";
DataSet lobjDs = new DataSet();
StringBuilder stringBuilder = new StringBuilder("<MailContents>");
lobjDs = mobjCProfileAndMail.GetMailContents(pdblMessageID);
if (lobjDs != null)
{
stringBuilder.Append("<Contents><From>");
stringBuilder.Append(lobjDs.Tables[0].Rows[0]["From"].ToString());
stringBuilder.Append("</From><To>");
stringBuilder.Append(lobjDs.Tables[0].Rows[0]["To"].ToString());
stringBuilder.Append("</To><Subject>");
stringBuilder.Append(lobjDs.Tables[0].Rows[0]["Subject"].ToString());
stringBuilder.Append("</Subject><Message>");
stringBuilder.Append(lobjDs.Tables[0].Rows[0]["Message"].ToString());
stringBuilder.Append("</Message></Contents>");
}
stringBuilder.Append("</MailContents>");
lstrMailContents = "<?xml version=\"1.0\" encoding=\"utf-8\"?> \n ";
lstrMailContents += stringBuilder.ToString();
Response.ContentEncoding = Encoding.UTF8;
Response.Write(lstrMailContents);
Response.End();
}
Code on the client end:
$(document).ready(function()
{
var varURL = document.URL;
var varArr = varURL.split('=');
var varMessageID = varArr[1];
$.get("AjaxData.aspx?Mode=MODALDIALOG."+varMessageID, function(data)
{
$(data).find('Contents').each(function()
{
var varFrom = $(this).find('From').text();
var varTo = $(this).find('To').text();
var varSubject = $(this).find('Subject').text();
var varMessage = $(this).find('Message').text();
alert(varFrom);
});
});
});
I have written an alert for the data coming from the callback, but I get nothing. If I parse fixed XML it works fine, but the response from the .aspx page yields nothing. Can anyone help me with this problem?
Thanks.
First - writing XML through concatenation is really flaky - consider using XmlWriter / XDocument / XmlDocument instead, which will automatically escape any necessary symbols (<, &, etc.) rather than produce invalid XML.
Did you clear the response buffer before writing to it? In reality, it would be a lot simpler to do this from a raw handler (ashx) than from within the aspx page life cycle. Or switch to MVC, which works similarly to the ashx result.
Also - from within the jQuery, you should probably specify the type as xml - see here.
Here's an example of a suitable handler:
public void ProcessRequest(HttpContext context)
{
context.Response.ContentType = "text/xml";
XmlDocument doc = new XmlDocument();
XmlElement root = (XmlElement) doc.AppendChild(doc.CreateElement("Contents"));
root.AppendChild(doc.CreateElement("From")).InnerText = "some text";
root.AppendChild(doc.CreateElement("To")).InnerText = "some more text";
root.AppendChild(doc.CreateElement("Subject")).InnerText = "this & that";
root.AppendChild(doc.CreateElement("Message")).InnerText = "a < b > c";
doc.Save(context.Response.Output);
}
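The handler above uses XmlDocument; for comparison, here is a minimal sketch of the XmlWriter option mentioned in the same answer, producing the MailContents/Contents structure that the question's jQuery code looks for. The class name, method name, and parameters are hypothetical, and the values would come from the DataSet in the real page.

```csharp
using System;
using System.IO;
using System.Xml;

static class MailXmlSketch
{
    // Hypothetical helper: writes the mail fields as XML, letting XmlWriter do the escaping.
    public static string BuildMailXml(string from, string to, string subject, string message)
    {
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };

        using (var text = new StringWriter())
        {
            using (var writer = XmlWriter.Create(text, settings))
            {
                writer.WriteStartElement("MailContents");
                writer.WriteStartElement("Contents");
                writer.WriteElementString("From", from);
                writer.WriteElementString("To", to);
                writer.WriteElementString("Subject", subject);
                writer.WriteElementString("Message", message);
                writer.WriteEndElement(); // </Contents>
                writer.WriteEndElement(); // </MailContents>
            } // disposing the writer flushes into the StringWriter

            return text.ToString();
        }
    }

    static void Main()
    {
        Console.WriteLine(BuildMailXml("a@example.com", "b@example.com", "this & that", "a < b"));
    }
}
```

With concatenation, a subject containing & or a message containing < would yield a document that the client-side parser rejects, which is one plausible reason for an empty result.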
First off, no need for aspx here - plain IHttpHandler will do and will also be a more natural solution.
As for the question, make sure you clear the output stream before writing out XML and ensure that HTTP headers (specifically Content-Type) are correct. Use Fiddler to see what's going on under the hood.
