Parse XML With Additional String

I need to parse XML that arrives inside an email body, with extra text before and after it.
I've tried the HTML Agility Pack, but it does not remove the non-XML text.
So how do I cleanse a string which contains a complete XML document mixed with other text around it?
var bodyXmlPart = @"Hi please see below client <?xml version=""1.0"" encoding=""UTF-8""?>" +
"<ac_application>" +
" <primary_applicant_data>" +
" <first_name>Ross</first_name>" +
" <middle_name></middle_name>" +
" <last_name>Geller</last_name>" +
" <ssn>123456789</ssn>" +
" </primary_applicant_data>" +
"</ac_application> thank you, \n john ";
//How do I clean up the body xml part before loading into xml
//This will fail:
var xDoc = XDocument.Parse(bodyXmlPart);

If you mean that the body can contain any XML and not just ac_application, you can use the following code:
var bodyXmlPart = @"Hi please see below client " +
"<ac_application>" +
" <primary_applicant_data>" +
" <first_name>Ross</first_name>" +
" <middle_name></middle_name>" +
" <last_name>Geller</last_name>" +
" <ssn>123456789</ssn>" +
" </primary_applicant_data>" +
"</ac_application> thank you, \n john ";
StringBuilder pattern = new StringBuilder();
Regex regex = new Regex(@"<\?xml.*\?>", RegexOptions.Singleline);
var match = regex.Match(bodyXmlPart);
if (match.Success) // There is an xml declaration
{
    pattern.Append(@"<\?xml.*");
}
Regex regexFirstTag = new Regex(@"\s*<(\w+:)?(\w+)>", RegexOptions.Singleline);
var match1 = regexFirstTag.Match(bodyXmlPart);
if (match1.Success) // xml has body and we got the first tag
{
    pattern.Append(match1.Value.Trim().Replace(">", @"\>" + ".*"));
    string firstTag = match1.Value.Trim();
    Regex regexFullXmlBody = new Regex(pattern.ToString() + @"<\/" + firstTag.Trim('<', '>') + @"\>", RegexOptions.None);
    var matchBody = regexFullXmlBody.Match(bodyXmlPart);
    if (matchBody.Success)
    {
        string xml = matchBody.Value;
    }
}
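This code can extract any XML, not just ac_application.
The assumption is that the body normally contains an XML declaration, although the code also handles a body without one, as in the sample above.
The code looks for the XML declaration and then finds the first opening tag in the body. That first tag is treated as the root element and is used to extract the entire XML. Note that the first-tag pattern only matches a root tag without attributes, and the final match does not span line breaks.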
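A regex-free variant of the same idea can be sketched as follows. It assumes the surrounding email text contains no '<' or '>' characters, so the XML runs from the declaration (or, failing that, the first '<') to the last '>'. The names EmailXmlExtractor and ExtractXml are hypothetical and not part of the answer above.

```csharp
using System;
using System.Xml.Linq;

public static class EmailXmlExtractor
{
    // Cuts the XML out of the surrounding text and parses it.
    // Assumption: the text before and after the XML contains no angle brackets.
    public static XDocument ExtractXml(string body)
    {
        // Prefer the XML declaration as the start; fall back to the first '<'.
        int start = body.IndexOf("<?xml", StringComparison.Ordinal);
        if (start < 0)
        {
            start = body.IndexOf('<');
        }

        // The last '>' closes the root element under the assumption above.
        int end = body.LastIndexOf('>');
        if (start < 0 || end <= start)
        {
            throw new FormatException("No XML found in the body.");
        }

        return XDocument.Parse(body.Substring(start, end - start + 1));
    }
}
```

Because the result goes through XDocument.Parse, a body whose extracted slice is not well-formed XML still fails loudly instead of yielding a partial document.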

I'd probably do something like this...
using System.Diagnostics;
using System.Text.RegularExpressions;
namespace Test {
class Program {
static void Main(string[] args) {
var bodyXmlPart = @"Hi please see below client <?xml version=""1.0"" encoding=""UTF-8""?>" +
"<ac_application>" +
" <primary_applicant_data>" +
" <first_name>Ross</first_name>" +
" <middle_name></middle_name>" +
" <last_name>Geller</last_name>" +
" <ssn>123456789</ssn>" +
" </primary_applicant_data>" +
"</ac_application> thank you, \n john ";
Regex regex = new Regex(@"(?<pre>.*)(?<xml>\<\?xml.*</ac_application\>)(?<post>.*)", RegexOptions.Singleline);
var match = regex.Match(bodyXmlPart);
if (match.Success) {
Debug.WriteLine($"pre={match.Groups["pre"].Value}");
Debug.WriteLine($"xml={match.Groups["xml"].Value}");
Debug.WriteLine($"post={match.Groups["post"].Value}");
}
}
}
}
This outputs...
pre=Hi please see below client
xml=<?xml version="1.0" encoding="UTF-8"?><ac_application> <primary_applicant_data> <first_name>Ross</first_name> <middle_name></middle_name> <last_name>Geller</last_name> <ssn>123456789</ssn> </primary_applicant_data></ac_application>
post= thank you,
john

Related

How to obtain regex matched string's file path?

I have successfully regex matched multiple strings from a folder of .txt files with StreamReader, but I also need to obtain the file path of each matched string.
How can I obtain the matched strings' file paths?
static void abnormalitiescheck()
{
int count = 0;
Regex regex = new Regex(@"(#####)");
DirectoryInfo di = new DirectoryInfo(txtpath);
Console.WriteLine("No" + "\t" + "Name and location of file" + "\t" + "||" +" " + "Abnormal Text Detected");
Console.WriteLine("=" + "\t" + "=========================" + "\t" + "||" + " " + "=======================");
foreach (string files in Directory.GetFiles(txtpath, "*.txt"))
{
using (StreamReader reader = new StreamReader(files))
{
string line;
while ((line = reader.ReadLine()) != null)
{
Match match = regex.Match(line);
if (match.Success)
{
count++;
Console.WriteLine(count + "\t\t\t\t\t" + match.Value + "\n");
}
}
}
}
}
If possible, I want the output to include each matched string's file path as well.
For e.g.,
C:/..../email_4.txt
C:/..../email_7.txt
C:/..../email_8.txt
C:/..../email_9.txt

As you already have the DirectoryInfo, you could use its FullName property.
You also have the file name in the loop variable called files. To get the name and location of the file, you could use Path.Combine.
Your updated code could look like:
Console.WriteLine(count + "\t" + Path.Combine(di.FullName , Path.GetFileName(files)) + "\t" + match.Value + "\n");
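As a minimal sketch of the whole loop: Directory.GetFiles already returns each file name with the directory you passed in, so the loop variable itself carries the location, and Path.GetFullPath makes it absolute. AbnormalityScanner and FindMatches are hypothetical names; the folder and pattern are supplied by the caller.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

public static class AbnormalityScanner
{
    // Returns one (full file path, matched text) pair per matching line.
    public static List<KeyValuePair<string, string>> FindMatches(string folder, Regex regex)
    {
        var results = new List<KeyValuePair<string, string>>();
        foreach (string filePath in Directory.GetFiles(folder, "*.txt"))
        {
            // Normalize to an absolute path once per file.
            string fullPath = Path.GetFullPath(filePath);
            foreach (string line in File.ReadLines(filePath))
            {
                Match match = regex.Match(line);
                if (match.Success)
                {
                    results.Add(new KeyValuePair<string, string>(fullPath, match.Value));
                }
            }
        }
        return results;
    }
}
```

Each pair can then be printed with Console.WriteLine(pair.Key + "\t" + pair.Value), which gives the file location next to the abnormal text.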

I'm guessing that we just want to match some .txt file paths. If so, let's start with a simple expression that collects everything from the start of each input line up to the extension, with \.txt as the right boundary (the dot is escaped so that it matches only a literal dot):
^(.+?)(\.txt)$
using System;
using System.Text.RegularExpressions;
public class Example
{
public static void Main()
{
string pattern = @"^(.+?)(\.txt)$";
string input = @"C:/..../email_4.txt
C:/..../email_7.txt
C:/..../email_8.txt
C:/..../email_9.txt";
RegexOptions options = RegexOptions.Multiline;
foreach (Match m in Regex.Matches(input, pattern, options))
{
Console.WriteLine("'{0}' found at index {1}.", m.Value, m.Index);
}
}
}

Can't deserialize xml with element starting with 'μ'

I have an xml configuration file:
<Instruments>
<Instrument Name="uEyeFF1" Assembly="Instruments" Class="IDSuEye1240SE">
<Settings>
<IDSuEye1240SESettings>
<Serial>4102801225</Serial>
<µmPerPixel>5.3</µmPerPixel>
<Color>true</Color>
</IDSuEye1240SESettings>
</Settings>
</Instrument>
<!-- more Instruments -->
</Instruments>
and a class to which the <IDSuEye1240SESettings> node is deserialized:
[Serializable]
public class IDSuEye1240SESettings
{
[XmlElement]
public string Serial { get; set; }
[XmlElement]
public double µmPerPixel { get; set; }
[XmlElement]
public bool Color { get; set; }
}
But when deserializing, I get the following error:
'There is an error in XML document (69, 10).' in Utilities.Load() as System.Void
at path hidden 'Name cannot begin with the 'µ' character, hexadecimal value 0xB5. Line 69, position 10.' in System.Xml.Throw() as System.Void
rest of stack trace...
On a different PC, an earlier compiled version of the same application is running, and I am seemingly able to deserialize the xml to the class. But the developer who wrote it is not around.
As far as I know, the working code doing the deserialization should be the same as the current code:
// the <Settings> node's children is value below
_settings = ConvertNode<IDSuEye1240SESettings>(((XmlNode[])value).First());
public T ConvertNode<T>(XmlNode node)
{
MemoryStream ms = new MemoryStream();
StreamWriter sw = new StreamWriter(ms);
sw.Write(node.OuterXml);
sw.Flush();
ms.Position = 0;
XmlSerializer ser = new XmlSerializer(typeof(T));
T result = (T)ser.Deserialize(ms);
return result;
}
Is it possible that there is something different in the compiled code which works to allow the 'µ' character to start an xml element name?
I found a difference in the first line of the XML files between the two PCs:
Working:
<?xml version="1.0"?>
<Instruments
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xsi:schemaLocation="http://www.w3schools.com Instruments.xsd">
Not working:
<?xml version="1.0" encoding="utf-8"?>
<Instruments
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xsi:schemaLocation="http://www.w3schools.com Instruments.xsd">
I removed the encoding attribute, but that only removed the red underline below the 'µ' when the new file is open in the IDE. It didn't solve the serialization issue.
The schema files are identical for both PCs, if that matters.
Is it possible to allow the illegal character to start an xml node?

I decompiled System.Xml to find out why this exception might occur and I think I found an answer...
In short
There is no way to crack into it. But it is possible to solve your problem without changing XML.
In long
Unfortunately, the XML 1.0 Fifth Edition differs considerably in its definition of the characters that can start tag names. Here are the decompiled constants from System.Xml that show this:
#if XML10_FIFTH_EDITION
// StartNameChar without ':' -- see Section 2.3 production [4]
const string s_NCStartName =
"\u0041\u005a\u005f\u005f\u0061\u007a\u00c0\u00d6" +
"\u00d8\u00f6\u00f8\u02ff\u0370\u037d\u037f\u1fff" +
"\u200c\u200d\u2070\u218f\u2c00\u2fef\u3001\ud7ff" +
"\uf900\ufdcf\ufdf0\ufffd";
// NameChar without ':' -- see Section 2.3 production [4a]
const string s_NCName =
"\u002d\u002e\u0030\u0039\u0041\u005a\u005f\u005f" +
"\u0061\u007a\u00b7\u00b7\u00c0\u00d6\u00d8\u00f6" +
"\u00f8\u037d\u037f\u1fff\u200c\u200d\u203f\u2040" +
"\u2070\u218f\u2c00\u2fef\u3001\ud7ff\uf900\ufdcf" +
"\ufdf0\ufffd";
#else
const string s_NCStartName =
"\u0041\u005a\u005f\u005f\u0061\u007a" +
"\u00c0\u00d6\u00d8\u00f6\u00f8\u0131\u0134\u013e" +
"\u0141\u0148\u014a\u017e\u0180\u01c3\u01cd\u01f0" +
"\u01f4\u01f5\u01fa\u0217\u0250\u02a8\u02bb\u02c1" +
"\u0386\u0386\u0388\u038a\u038c\u038c\u038e\u03a1" +
"\u03a3\u03ce\u03d0\u03d6\u03da\u03da\u03dc\u03dc" +
"\u03de\u03de\u03e0\u03e0\u03e2\u03f3\u0401\u040c" +
"\u040e\u044f\u0451\u045c\u045e\u0481\u0490\u04c4" +
"\u04c7\u04c8\u04cb\u04cc\u04d0\u04eb\u04ee\u04f5" +
"\u04f8\u04f9\u0531\u0556\u0559\u0559\u0561\u0586" +
"\u05d0\u05ea\u05f0\u05f2\u0621\u063a\u0641\u064a" +
"\u0671\u06b7\u06ba\u06be\u06c0\u06ce\u06d0\u06d3" +
"\u06d5\u06d5\u06e5\u06e6\u0905\u0939\u093d\u093d" +
"\u0958\u0961\u0985\u098c\u098f\u0990\u0993\u09a8" +
"\u09aa\u09b0\u09b2\u09b2\u09b6\u09b9\u09dc\u09dd" +
"\u09df\u09e1\u09f0\u09f1\u0a05\u0a0a\u0a0f\u0a10" +
"\u0a13\u0a28\u0a2a\u0a30\u0a32\u0a33\u0a35\u0a36" +
"\u0a38\u0a39\u0a59\u0a5c\u0a5e\u0a5e\u0a72\u0a74" +
"\u0a85\u0a8b\u0a8d\u0a8d\u0a8f\u0a91\u0a93\u0aa8" +
"\u0aaa\u0ab0\u0ab2\u0ab3\u0ab5\u0ab9\u0abd\u0abd" +
"\u0ae0\u0ae0\u0b05\u0b0c\u0b0f\u0b10\u0b13\u0b28" +
"\u0b2a\u0b30\u0b32\u0b33\u0b36\u0b39\u0b3d\u0b3d" +
"\u0b5c\u0b5d\u0b5f\u0b61\u0b85\u0b8a\u0b8e\u0b90" +
"\u0b92\u0b95\u0b99\u0b9a\u0b9c\u0b9c\u0b9e\u0b9f" +
"\u0ba3\u0ba4\u0ba8\u0baa\u0bae\u0bb5\u0bb7\u0bb9" +
"\u0c05\u0c0c\u0c0e\u0c10\u0c12\u0c28\u0c2a\u0c33" +
"\u0c35\u0c39\u0c60\u0c61\u0c85\u0c8c\u0c8e\u0c90" +
"\u0c92\u0ca8\u0caa\u0cb3\u0cb5\u0cb9\u0cde\u0cde" +
"\u0ce0\u0ce1\u0d05\u0d0c\u0d0e\u0d10\u0d12\u0d28" +
"\u0d2a\u0d39\u0d60\u0d61\u0e01\u0e2e\u0e30\u0e30" +
"\u0e32\u0e33\u0e40\u0e45\u0e81\u0e82\u0e84\u0e84" +
"\u0e87\u0e88\u0e8a\u0e8a\u0e8d\u0e8d\u0e94\u0e97" +
"\u0e99\u0e9f\u0ea1\u0ea3\u0ea5\u0ea5\u0ea7\u0ea7" +
"\u0eaa\u0eab\u0ead\u0eae\u0eb0\u0eb0\u0eb2\u0eb3" +
"\u0ebd\u0ebd\u0ec0\u0ec4\u0f40\u0f47\u0f49\u0f69" +
"\u10a0\u10c5\u10d0\u10f6\u1100\u1100\u1102\u1103" +
"\u1105\u1107\u1109\u1109\u110b\u110c\u110e\u1112" +
"\u113c\u113c\u113e\u113e\u1140\u1140\u114c\u114c" +
"\u114e\u114e\u1150\u1150\u1154\u1155\u1159\u1159" +
"\u115f\u1161\u1163\u1163\u1165\u1165\u1167\u1167" +
"\u1169\u1169\u116d\u116e\u1172\u1173\u1175\u1175" +
"\u119e\u119e\u11a8\u11a8\u11ab\u11ab\u11ae\u11af" +
"\u11b7\u11b8\u11ba\u11ba\u11bc\u11c2\u11eb\u11eb" +
"\u11f0\u11f0\u11f9\u11f9\u1e00\u1e9b\u1ea0\u1ef9" +
"\u1f00\u1f15\u1f18\u1f1d\u1f20\u1f45\u1f48\u1f4d" +
"\u1f50\u1f57\u1f59\u1f59\u1f5b\u1f5b\u1f5d\u1f5d" +
"\u1f5f\u1f7d\u1f80\u1fb4\u1fb6\u1fbc\u1fbe\u1fbe" +
"\u1fc2\u1fc4\u1fc6\u1fcc\u1fd0\u1fd3\u1fd6\u1fdb" +
"\u1fe0\u1fec\u1ff2\u1ff4\u1ff6\u1ffc\u2126\u2126" +
"\u212a\u212b\u212e\u212e\u2180\u2182\u3007\u3007" +
"\u3021\u3029\u3041\u3094\u30a1\u30fa\u3105\u312c" +
"\u4e00\u9fa5\uac00\ud7a3";
const string s_NCName =
"\u002d\u002e\u0030\u0039\u0041\u005a\u005f\u005f" +
"\u0061\u007a\u00b7\u00b7\u00c0\u00d6\u00d8\u00f6" +
"\u00f8\u0131\u0134\u013e\u0141\u0148\u014a\u017e" +
"\u0180\u01c3\u01cd\u01f0\u01f4\u01f5\u01fa\u0217" +
"\u0250\u02a8\u02bb\u02c1\u02d0\u02d1\u0300\u0345" +
"\u0360\u0361\u0386\u038a\u038c\u038c\u038e\u03a1" +
"\u03a3\u03ce\u03d0\u03d6\u03da\u03da\u03dc\u03dc" +
"\u03de\u03de\u03e0\u03e0\u03e2\u03f3\u0401\u040c" +
"\u040e\u044f\u0451\u045c\u045e\u0481\u0483\u0486" +
"\u0490\u04c4\u04c7\u04c8\u04cb\u04cc\u04d0\u04eb" +
"\u04ee\u04f5\u04f8\u04f9\u0531\u0556\u0559\u0559" +
"\u0561\u0586\u0591\u05a1\u05a3\u05b9\u05bb\u05bd" +
"\u05bf\u05bf\u05c1\u05c2\u05c4\u05c4\u05d0\u05ea" +
"\u05f0\u05f2\u0621\u063a\u0640\u0652\u0660\u0669" +
"\u0670\u06b7\u06ba\u06be\u06c0\u06ce\u06d0\u06d3" +
"\u06d5\u06e8\u06ea\u06ed\u06f0\u06f9\u0901\u0903" +
"\u0905\u0939\u093c\u094d\u0951\u0954\u0958\u0963" +
"\u0966\u096f\u0981\u0983\u0985\u098c\u098f\u0990" +
"\u0993\u09a8\u09aa\u09b0\u09b2\u09b2\u09b6\u09b9" +
"\u09bc\u09bc\u09be\u09c4\u09c7\u09c8\u09cb\u09cd" +
"\u09d7\u09d7\u09dc\u09dd\u09df\u09e3\u09e6\u09f1" +
"\u0a02\u0a02\u0a05\u0a0a\u0a0f\u0a10\u0a13\u0a28" +
"\u0a2a\u0a30\u0a32\u0a33\u0a35\u0a36\u0a38\u0a39" +
"\u0a3c\u0a3c\u0a3e\u0a42\u0a47\u0a48\u0a4b\u0a4d" +
"\u0a59\u0a5c\u0a5e\u0a5e\u0a66\u0a74\u0a81\u0a83" +
"\u0a85\u0a8b\u0a8d\u0a8d\u0a8f\u0a91\u0a93\u0aa8" +
"\u0aaa\u0ab0\u0ab2\u0ab3\u0ab5\u0ab9\u0abc\u0ac5" +
"\u0ac7\u0ac9\u0acb\u0acd\u0ae0\u0ae0\u0ae6\u0aef" +
"\u0b01\u0b03\u0b05\u0b0c\u0b0f\u0b10\u0b13\u0b28" +
"\u0b2a\u0b30\u0b32\u0b33\u0b36\u0b39\u0b3c\u0b43" +
"\u0b47\u0b48\u0b4b\u0b4d\u0b56\u0b57\u0b5c\u0b5d" +
"\u0b5f\u0b61\u0b66\u0b6f\u0b82\u0b83\u0b85\u0b8a" +
"\u0b8e\u0b90\u0b92\u0b95\u0b99\u0b9a\u0b9c\u0b9c" +
"\u0b9e\u0b9f\u0ba3\u0ba4\u0ba8\u0baa\u0bae\u0bb5" +
"\u0bb7\u0bb9\u0bbe\u0bc2\u0bc6\u0bc8\u0bca\u0bcd" +
"\u0bd7\u0bd7\u0be7\u0bef\u0c01\u0c03\u0c05\u0c0c" +
"\u0c0e\u0c10\u0c12\u0c28\u0c2a\u0c33\u0c35\u0c39" +
"\u0c3e\u0c44\u0c46\u0c48\u0c4a\u0c4d\u0c55\u0c56" +
"\u0c60\u0c61\u0c66\u0c6f\u0c82\u0c83\u0c85\u0c8c" +
"\u0c8e\u0c90\u0c92\u0ca8\u0caa\u0cb3\u0cb5\u0cb9" +
"\u0cbe\u0cc4\u0cc6\u0cc8\u0cca\u0ccd\u0cd5\u0cd6" +
"\u0cde\u0cde\u0ce0\u0ce1\u0ce6\u0cef\u0d02\u0d03" +
"\u0d05\u0d0c\u0d0e\u0d10\u0d12\u0d28\u0d2a\u0d39" +
"\u0d3e\u0d43\u0d46\u0d48\u0d4a\u0d4d\u0d57\u0d57" +
"\u0d60\u0d61\u0d66\u0d6f\u0e01\u0e2e\u0e30\u0e3a" +
"\u0e40\u0e4e\u0e50\u0e59\u0e81\u0e82\u0e84\u0e84" +
"\u0e87\u0e88\u0e8a\u0e8a\u0e8d\u0e8d\u0e94\u0e97" +
"\u0e99\u0e9f\u0ea1\u0ea3\u0ea5\u0ea5\u0ea7\u0ea7" +
"\u0eaa\u0eab\u0ead\u0eae\u0eb0\u0eb9\u0ebb\u0ebd" +
"\u0ec0\u0ec4\u0ec6\u0ec6\u0ec8\u0ecd\u0ed0\u0ed9" +
"\u0f18\u0f19\u0f20\u0f29\u0f35\u0f35\u0f37\u0f37" +
"\u0f39\u0f39\u0f3e\u0f47\u0f49\u0f69\u0f71\u0f84" +
"\u0f86\u0f8b\u0f90\u0f95\u0f97\u0f97\u0f99\u0fad" +
"\u0fb1\u0fb7\u0fb9\u0fb9\u10a0\u10c5\u10d0\u10f6" +
"\u1100\u1100\u1102\u1103\u1105\u1107\u1109\u1109" +
"\u110b\u110c\u110e\u1112\u113c\u113c\u113e\u113e" +
"\u1140\u1140\u114c\u114c\u114e\u114e\u1150\u1150" +
"\u1154\u1155\u1159\u1159\u115f\u1161\u1163\u1163" +
"\u1165\u1165\u1167\u1167\u1169\u1169\u116d\u116e" +
"\u1172\u1173\u1175\u1175\u119e\u119e\u11a8\u11a8" +
"\u11ab\u11ab\u11ae\u11af\u11b7\u11b8\u11ba\u11ba" +
"\u11bc\u11c2\u11eb\u11eb\u11f0\u11f0\u11f9\u11f9" +
"\u1e00\u1e9b\u1ea0\u1ef9\u1f00\u1f15\u1f18\u1f1d" +
"\u1f20\u1f45\u1f48\u1f4d\u1f50\u1f57\u1f59\u1f59" +
"\u1f5b\u1f5b\u1f5d\u1f5d\u1f5f\u1f7d\u1f80\u1fb4" +
"\u1fb6\u1fbc\u1fbe\u1fbe\u1fc2\u1fc4\u1fc6\u1fcc" +
"\u1fd0\u1fd3\u1fd6\u1fdb\u1fe0\u1fec\u1ff2\u1ff4" +
"\u1ff6\u1ffc\u20d0\u20dc\u20e1\u20e1\u2126\u2126" +
"\u212a\u212b\u212e\u212e\u2180\u2182\u3005\u3005" +
"\u3007\u3007\u3021\u302f\u3031\u3035\u3041\u3094" +
"\u3099\u309a\u309d\u309e\u30a1\u30fa\u30fc\u30fe" +
"\u3105\u312c\u4e00\u9fa5\uac00\ud7a3";
#endif
As to why your problem occurs: the s_NCStartName constant defines which characters may start a name. Each string is a list of range endpoints taken in pairs (the fifth-edition list is much shorter because its ranges are wider), and 0xB5 falls inside none of the start ranges, so the reader rejects it. There is no way around this: the table is a constant hardcoded into System.Xml. Well, it is Microsoft; they don't care about backward compatibility.
As to solution
You have four options:
1. Attach the old System.Xml, which in my opinion is the worst solution, but the easiest.
2. Preprocess your XML: clean it of invalid name-start characters.
3. Report the loss of backward compatibility to Microsoft on their issue tracker.
4. Use another serialization framework, for example Xml.Net.
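The preprocessing option could be sketched roughly as below. It assumes you can read the raw file text before any XML parser touches it, and that renaming the element to umPerPixel is acceptable; XmlNameSanitizer, the umPerPixel element name, and the MicrometersPerPixel property are all hypothetical choices made for this example.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;
using System.Xml.Serialization;

public class IDSuEye1240SESettings
{
    public string Serial { get; set; }

    // Hypothetical mapping: reads the sanitized element name "umPerPixel".
    [XmlElement("umPerPixel")]
    public double MicrometersPerPixel { get; set; }

    public bool Color { get; set; }
}

public static class XmlNameSanitizer
{
    // Matches "<µ" or "</µ": the micro sign (U+00B5) at the start of an element name.
    private static readonly Regex MicroAtNameStart = new Regex(@"(</?)\u00B5");

    // Replaces the leading micro sign with a plain 'u' in opening and closing tags.
    public static string Sanitize(string rawXml)
    {
        return MicroAtNameStart.Replace(rawXml, "${1}u");
    }

    // Sanitizes the raw text first, then deserializes it.
    public static T Deserialize<T>(string rawXml)
    {
        var serializer = new XmlSerializer(typeof(T));
        using (var reader = new StringReader(Sanitize(rawXml)))
        {
            return (T)serializer.Deserialize(reader);
        }
    }
}
```

The replacement is purely textual, so it would also rewrite a "<µ" sequence inside a comment or CDATA section; for a small configuration file that is usually an acceptable trade-off.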
PS
A small piece of advice: you can rewrite ConvertNode like this:
public static T ConvertNode<T>(XmlNode node)
{
using (var reader = new XmlNodeReader(node))
{
var ser = new XmlSerializer(typeof(T));
return (T) ser.Deserialize(reader);
}
}

Microsoft.Office.Interop.Word - Overwriting text in current document, how to stop this?

I am trying to add text to a document and format it appropriately.
It works as long as there is no text after the insertion point, but if there is then it overwrites it. Why is that?
Here is my code in which the text is written. Again, this works if there is nothing after it.
// Header
var p = p2.Range.Paragraphs.Add();
var x = p.Range.Paragraphs.Count;
p.Range.Text = String.Format(headerText + "\r\n");
p.Range.set_Style("Req Level " + layerNumber.ToString() + " - Body");
// Description
p2 = p.Range.Paragraphs.Add();
p2.Range.Text = String.Format(bodyText + "\r\n");
p2.Range.set_Style("Req Level " + layerNumber.ToString());

If you want to put the description at the same level as header:
var pp = p2.Range.Paragraphs.Add();
pp.Range.Text = String.Format(bodyText + "\r\n");
pp.Range.set_Style("Req Level " + layerNumber.ToString());

Why 'innerhtml' does not work properly for 'select' tag

I am trying to set the innerHTML of an HTML select tag, but it has no effect; therefore, I have to use the outerHTML property instead. This way, not only is my code hardcoded, it is also preposterous. I have already read 'InnerHTML IE 8 doesn't work properly? Resetting form', but it did not help.
I would really appreciate it if you could tell me how to set the innerHTML property of an HTML select tag.
My C# code:
public void SetDefaultValue(string ControlID, string ControlValue)
{
System.Windows.Forms.HtmlDocument doc = webBrowser1.Document;
HtmlElement HTMLControl = doc.GetElementById(ControlID);
string ListResult;
string ListInnerHTML = "";
ListInnerHTML += "<OPTION value = " + LstString + ">" + LstString + "</OPTION>";
ListResult = "<SELECT id = " + '"' + HTMLControl.Id + '"' + " type = " + '"' + HTMLControl.GetAttribute("type") + '"' + " title = " + '"' +
HTMLControl.GetAttribute("title") + '"' + " name = " + '"' + HTMLControl.Name + '"' + " value = " + '"' + HTMLControl.GetAttribute("value") +
'"' + " size = \"" + HTMLControl.GetAttribute("size") + '"' + HTMLControl.GetAttribute("multiple").ToString() + "\">" + ListInnerHTML + "</SELECT>";
HTMLControl.OuterHtml = ListResult;
}
or
string _lsthtml = _htmlel.OuterHtml;
string[] _parts = ControlValue.Split(new char[] { ',' });
string _lstinner = "";
foreach (string _lst in _parts)
_lstinner += "<option value=" + _lst + ">" + _lst + "</option>";
_lsthtml = _lsthtml.Insert(_lsthtml.IndexOf(">") + 1, _lstinner);
_htmlel.OuterHtml = _lsthtml;
This code works but I need something efficient and clean.
The ReturnControlType function returns the type of an html tag.

This is an official Internet Explorer bug:
BUG: Internet Explorer Fails to Set the innerHTML Property of the Select Object.
One workaround
You may try adding one of the following meta tags in your document's head:
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
or
<meta http-equiv="X-UA-Compatible" content="IE=10" />
You should also properly quote the option tag's value attribute (enclose LstString in single quotes):
ListInnerHTML += "<OPTION value='" + LstString + "'>" + LstString + "</OPTION>";
A more reliable solution
As the fixes above are only workarounds for your code, I would suggest a more reliable approach. Consider adding a reference to Microsoft.mshtml to your project and modifying your method like this:
// add this to the top of the file containing your class
using mshtml;
public void SetDefaultValue(string ControlID, string ControlValue)
{
System.Windows.Forms.HtmlDocument doc = webBrowser1.Document;
IHTMLDocument2 document = doc.DomDocument as IHTMLDocument2;
var sel = doc.GetElementById(ControlID);
HTMLSelectElement domSelect = (HTMLSelectElement)sel.DomElement;
domSelect.options.length = 0;
HTMLOptionElement option;
// here you can dynamically add the options to the select element
for (int i = 0; i < 10; i++)
{
option = (HTMLOptionElement)document.createElement("option");
option.text = String.Format("text{0}", i);
option.value = String.Format("value{0}", i);
domSelect.options.add(option, 0);
}
}

I really don't know why innerHTML is not working for you.
If it just doesn't, you could try an alternative:
http://innerdom.sourceforge.net/

In this thread it is suggested that you use the Items collection of the control:
How to add items to dynamically created select (html) Control
refer to this page for a complete example:
http://msdn.microsoft.com/en-us/library/system.web.ui.htmlcontrols.htmlselect.items.aspx

XML Schema for Sitemap binding it together

I have the following code that writes the XML Schema header for a sitemap. I'm having a problem keeping the lines together. It worked fine before.
public static FileContentResult WriteTo(SiteMapFeed feedToFormat)
{
var siteMap = feedToFormat;
//TODO: DO something, next codes are just DEMO
var header = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
+ Environment.NewLine + "<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\""
+ Environment.NewLine + "xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\""
+ Environment.NewLine + "xsi:schemaLocation=\""
+ Environment.NewLine + "http://www.sitemaps.org/schemas/sitemap/0.9"
+ Environment.NewLine + "http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd\">";
var urls = new System.Text.StringBuilder();
foreach (var site in siteMap.Items)
{
urls.Append(string.Format("<url><loc>http://www.{0}/</loc><lastmod>{1}</lastmod><changefreq>{2}</changefreq><priority>{3}</priority></url>", site.Url, site.LastMod, site.ChangeFreq, site.Priority));
}
byte[] fileContent = System.Text.Encoding.UTF8.GetBytes(header + urls + "</urlset>");
return new FileContentResult(fileContent, "text/xml");
}
So this is now causing the following error:
What am I doing wrong? Thanks

I think the problem is in the xsi:schemaLocation bit: if I remember correctly, you can't have multiple lines between the quotes of an XML attribute. Try keeping the value on a single line, with the namespace and the schema URL separated by a space (schemaLocation expects namespace/location pairs):
var header = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
+ Environment.NewLine + "<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\""
+ Environment.NewLine + "xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\""
+ Environment.NewLine + "xsi:schemaLocation=\"http://www.sitemaps.org/schemas/sitemap/0.9 "
+ "http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd\">" + Environment.NewLine;
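An alternative that avoids hand-built strings altogether is to let System.Xml.Linq produce the header and escape the values. This is only a sketch: SitemapWriter and Build are hypothetical names, and the example writes just the loc element for each URL, where the question's code also writes lastmod, changefreq, and priority.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

public static class SitemapWriter
{
    // Builds the sitemap text for the given absolute URLs.
    public static string Build(IEnumerable<string> urls)
    {
        XNamespace ns = "http://www.sitemaps.org/schemas/sitemap/0.9";
        XNamespace xsi = "http://www.w3.org/2001/XMLSchema-instance";

        var urlset = new XElement(ns + "urlset",
            // Declares the xsi prefix on the root element.
            new XAttribute(XNamespace.Xmlns + "xsi", xsi),
            // schemaLocation is a namespace/location pair separated by a space.
            new XAttribute(xsi + "schemaLocation",
                "http://www.sitemaps.org/schemas/sitemap/0.9 " +
                "http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd"));

        foreach (string url in urls)
        {
            urlset.Add(new XElement(ns + "url", new XElement(ns + "loc", url)));
        }

        var doc = new XDocument(new XDeclaration("1.0", "UTF-8", null), urlset);

        // XDocument.ToString() omits the declaration, so prepend it explicitly.
        return doc.Declaration + Environment.NewLine + doc.ToString();
    }
}
```

The returned string can be passed to System.Text.Encoding.UTF8.GetBytes in place of header + urls + "</urlset>" in the original method.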
