I am trying to build a regex parser for a single XML block.
I know people will say that regex is not a good plan for XML, but I am working with stream data and I just need to know whether a complete XML block has been broadcast and is sitting in the buffer.
I am trying to handle anything between the opening and closing tags of the XML block, as well as any data in the parameters of the main block header.
My example code is below the broken-down regular expression. If anyone has any input on how to make this as comprehensive as possible, I would greatly appreciate it.
Here is my regular expression, formatted as a visual aid.
I am balancing the TAG group as well as the inTAG group, and validating that they do not exist at the end of the expression segments.
/*
^(?<TAG>[<]
(?![?])
(?<TAGNAME>[^\s/>]*)
)
(?<ParamData>
(
(\"
(?>
\\\"|
[^"]|
\"(?<quote>)|
\"(?<-quote>)
)*
(?(quote)(?!))
\"
)|
[^/>]
)*?
)
(?:
(?<HASCONTENT>[>])|
(?<-TAG>
(?<TAGEND>/[>])
)
)
(?(HASCONTENT)
(
(?<CONTENT>
(
(?<inTAG>[<]\<TAGNAME>)(?<-inTAG>/[>])?|
(?<-inTAG>[<]/\<TAGNAME>[>])|
([^<]+|[<](?![/]?\<TAGNAME>))
)*?
(?(inTAG)(?!))
)
)
(?<TAGEND>(?<-TAG>)[<]/\<TAGNAME>[>])
)
(?(TAG)(?!))
*/
Within my class, I expect that any null object returned means there was no XML block on the queue.
Here is the class I am using.
(I used a verbatim string literal (@"") to limit the escaping required; all " characters were replaced with "" so that it formats properly.)
public class XmlDataParser
{
// xmlObjectExpression defined below to limit code highlight errors
private Regex _xmlRegex;
private Regex xmlRegex
{
get
{
if (_xmlRegex == null)
{
_xmlRegex = new Regex(xmlObjectExpression);
}
return _xmlRegex;
}
}
private string backingStore = "";
public bool HasObject()
{
return (backingStore != null) && xmlRegex.IsMatch(backingStore);
}
public string GetObject()
{
string result = null;
if (HasObject())
{
lock (this)
{
Match obj = xmlRegex.Match(backingStore);
result = obj.Value;
backingStore = backingStore.Substring(result.Length);
}
}
return result;
}
public void AddData(byte[] bytes)
{
lock (this)
{
backingStore += System.Text.Encoding.Default.GetString(bytes);
}
}
private static string xmlObjectExpression = @"^(?<TAG>[<](?![?])(?<TAGNAME>[^\s/>]*))(?<ParamData>((\""(?>\\\""|[^""]|\""(?<quote>)|\""(?<-quote>))*(?(quote)(?!))\"")|[^/>])*?)(?:(?<HASCONTENT>[>])|(?<-TAG>(?<TAGEND>/[>])))(?(HASCONTENT)((?<CONTENT>((?<inTAG>[<]\<TAGNAME>)(?<-inTAG>/[>])?|(?<-inTAG>[<]/\<TAGNAME>[>])|([^<]+|[<](?![/]?\<TAGNAME>)))*?(?(inTAG)(?!))))(?<TAGEND>(?<-TAG>)[<]/\<TAGNAME>[>]))(?(TAG)(?!))";
}
Just use XmlReader and feed it a TextReader. To read streams, you want to change the ConformanceLevel to Fragment.
XmlReaderSettings settings = new XmlReaderSettings();
settings.ConformanceLevel = ConformanceLevel.Fragment;
using (XmlReader reader = XmlReader.Create(tr,settings))
{
while (reader.Read())
{
switch (reader.NodeType)
{
// this is from my code. You'll rewrite this part :
case XmlNodeType.Element:
if (t != null)
{
t.SetName(reader.Name);
}
else if (reader.Name == "event")
{
t = new Event1();
t.Name = reader.Name;
}
else if (reader.Name == "data")
{
t = new Data1();
t.Name = reader.Name;
}
else
{
throw new Exception("");
}
break;
case XmlNodeType.Text:
if (t != null)
{
t.SetValue(reader.Value);
}
break;
case XmlNodeType.XmlDeclaration:
case XmlNodeType.ProcessingInstruction:
break;
case XmlNodeType.Comment:
break;
case XmlNodeType.EndElement:
if (t != null)
{
if (t.Name == reader.Name)
{
t.Close();
t.Write(output);
t = null;
}
}
break;
case XmlNodeType.Whitespace:
break;
}
}
}
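For the original buffering question (has a complete XML block arrived yet?), the same XmlReader idea can be turned into a yes/no probe. This is only a sketch under some assumptions: the names FragmentProbe and HasCompleteElement are made up for illustration, the buffer is assumed to already be decoded into a string, and a truncated buffer is assumed to surface as an XmlException, which is treated as "not yet". It reports whether one whole top-level element is present, not where it ends in the buffer.

```csharp
using System;
using System.IO;
using System.Xml;

static class FragmentProbe
{
    // Hypothetical helper: true once one whole top-level element
    // (start tag through matching end tag, or <x/>) is in the buffer.
    public static bool HasCompleteElement(string buffer)
    {
        var settings = new XmlReaderSettings
        {
            ConformanceLevel = ConformanceLevel.Fragment
        };
        try
        {
            using (XmlReader reader = XmlReader.Create(new StringReader(buffer), settings))
            {
                while (reader.Read())
                {
                    // Only top-level nodes can close the block.
                    if (reader.Depth != 0)
                        continue;
                    if (reader.NodeType == XmlNodeType.Element && reader.IsEmptyElement)
                        return true;
                    if (reader.NodeType == XmlNodeType.EndElement)
                        return true;
                }
            }
        }
        catch (XmlException)
        {
            // Truncated or malformed input: treat as "no complete block yet".
        }
        return false;
    }

    static void Main()
    {
        // Block still open: the end tag for <event> has not arrived.
        Console.WriteLine(HasCompleteElement("<event id=\"1\"><data>x</data>"));
        // Block closed, followed by the start of the next (partial) block.
        Console.WriteLine(HasCompleteElement("<event id=\"1\"><data>x</data></event><ev"));
    }
}
```

The probe stops as soon as the top-level end tag is read, so trailing partial data after a complete block should not be parsed at all.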
I'm relatively new to C# and I'm trying to get my head around a problem that I believe should be pretty simple in concept, but I just can't get it.
I am currently trying to display a message on the console when the program is run from the command line with two arguments, if a sequence ID from a query text file does not exist in a text file full of sequence IDs and DNA sequences. For example, args[0] is a text file that contains 41534 lines of sequences, which means I cannot load the entire file into memory:
NR_118889.1 Amycolatopsis azurea strain NRRL 11412 16S ribosomal RNA, partial sequence
GGTCTNATACCGGATATAACAACTCATGGCATGGTTGGTAGTGGAAAGCTCCGGCGT
NR_118899.1 Actinomyces bovis strain DSM 43014 16S ribosomal RNA, partial sequence
GGGTGAGTAACACGTGAGTAACCTGCCCCNNACTTCTGGATAACCGCTTGAAAGGGTNGCTAATACGGGATATTTTGGCCTGCT
NR_074334.1 Archaeoglobus fulgidus DSM 4304 16S ribosomal RNA, complete sequence >NR_118873.1 Archaeoglobus fulgidus DSM 4304 strain VC-16 16S ribosomal RNA, complete sequence >NR_119237.1 Archaeoglobus fulgidus DSM 4304 strain VC-16 16S ribosomal RNA, complete sequence
ATTCTGGTTGATCCTGCCAGAGGCCGCTGCTATCCGGCTGGGACTAAGCCATGCGAGTCAAGGGGCTT
args[1] is a query text file with some sequence IDs:
NR_118889.1
NR_999999.1
NR_118899.1
NR_888888.1
So when the program is run, all I want is for the sequence IDs from args[1] that were not found in args[0] to be displayed.
NR_999999.1 could not be found
NR_888888.1 could not be found
I know this is probably super simple, and I have spent far too long trying to figure this out by myself, to the point where I want to ask for help.
Thank you in advance for any assistance.
You can try this.
It loads the contents of each file and compares them with each other.
static void Main(string[] args)
{
if ( args.Length != 2 )
{
Console.WriteLine("Usage: {exename}.exe [filename 1] [filename 2]");
Console.ReadKey();
return;
}
string filename1 = args[0];
string filename2 = args[1];
bool checkFiles = true;
if ( !File.Exists(filename1) )
{
Console.WriteLine($"{filename1} not found.");
checkFiles = false;
}
if ( !File.Exists(filename2) )
{
Console.WriteLine($"{filename2} not found.");
checkFiles = false;
}
if ( !checkFiles )
{
Console.ReadKey();
return;
}
var lines1 = System.IO.File.ReadAllLines(args[0]).Where(l => l != "");
var lines2 = System.IO.File.ReadAllLines(args[1]).Where(l => l != "");
foreach ( var line in lines2 )
// lines1 is a sequence of strings, so test each of its lines for the ID prefix
if ( !lines1.Any(l => l.StartsWith(line)) )
{
Console.WriteLine($"{line} could not be found");
checkFiles = false;
}
if (checkFiles)
Console.WriteLine("There is no difference.");
Console.ReadKey();
}
This works, but it only processes the first line of the files...
using( System.IO.StreamReader sr1 = new System.IO.StreamReader(args[1]))
{
using( System.IO.StreamReader sr2 = new System.IO.StreamReader(args[2]))
{
string line1,line2;
while ((line1 = sr1.ReadLine()) != null)
{
while ((line2 = sr2.ReadLine()) != null)
{
if(line1.Contains(line2))
{
found = true;
WriteLine("{0} exists!",line2);
}
if(found == false)
{
WriteLine("{0} does not exist!",line2);
}
}
}
}
}
var saved_ids = new List<String>();
foreach (String args1line in File.ReadLines(args[1]))
{
foreach (String args2line in File.ReadLines(args[2]))
{
if (args1line.Contains(args2line))
{
saved_ids.Add(args2line);
}
}
}
using (System.IO.StreamReader sr1 = new System.IO.StreamReader(args[1]))
{
using (System.IO.StreamReader sr2 = new System.IO.StreamReader(args[2]))
{
string line1, line2;
while ((line1 = sr1.ReadLine()) != null)
{
while ((line2 = sr2.ReadLine()) != null)
{
if (line1.Contains(line2))
{
saved_ids.Add(line2);
break;
}
if (!line1.StartsWith(">"))
{
break;
}
if (saved_ids.Contains(line1))
{
break;
}
if (saved_ids.Contains(line2))
{
break;
}
if (!line1.Contains(line2))
{
saved_ids.Add(line2);
WriteLine("The sequence ID {0} does not exist", line2);
}
}
if (line2 == null)
{
sr2.DiscardBufferedData();
sr2.BaseStream.Seek(0, System.IO.SeekOrigin.Begin);
continue;
}
}
}
}
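For the sequence ID question above, one alternative to nested loops is to keep only the (small) query list in memory and stream the large file once. The following is a sketch, not a definitive implementation: the names MissingIds and FindMissing are hypothetical, and it assumes that an ID counts as found when it appears as a whole token (separated by spaces, tabs or '>') on any line of the large file.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class MissingIds
{
    // Hypothetical helper: returns the query IDs, in their original order,
    // that never appear as a token in any line of the database.
    public static List<string> FindMissing(IEnumerable<string> databaseLines,
                                           IEnumerable<string> queryIds)
    {
        var order = new List<string>();
        var remaining = new HashSet<string>();
        foreach (string id in queryIds)
        {
            string trimmed = id.Trim();
            if (trimmed.Length > 0 && remaining.Add(trimmed))
                order.Add(trimmed);
        }

        char[] separators = { ' ', '\t', '>' };
        foreach (string line in databaseLines)
        {
            if (remaining.Count == 0)
                break; // every query ID has been seen already
            foreach (string token in line.Split(separators, StringSplitOptions.RemoveEmptyEntries))
                remaining.Remove(token);
        }

        order.RemoveAll(id => !remaining.Contains(id));
        return order;
    }

    static void Main(string[] args)
    {
        if (args.Length != 2)
        {
            Console.WriteLine("Usage: <sequence file> <query file>");
            return;
        }
        // File.ReadLines enumerates lazily, so the large file is never
        // held in memory as a whole.
        foreach (string id in FindMissing(File.ReadLines(args[0]), File.ReadLines(args[1])))
            Console.WriteLine("{0} could not be found", id);
    }
}
```

Because the large file is read exactly once, there is no need to rewind the second reader for every line of the first.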
We are getting false positives while using rule S2538 in the following code
EventLogLevel[] eventLevels = null;
bool reachedEnd = false;
while(!reachedEnd && jsonReader.Read())
{
switch(jsonReader.TokenType)
{
case JsonToken.PropertyName:
string propertyName = jsonReader.Value.ToString();
switch(propertyName)
{
case nameof(EventLevels):
eventLevels = EventSettingsJson.ParseEventLogLevelsArray(nameof(EventLevels), jsonReader);
break;
default:
throw new JsonParserException($"Invalid property: {propertyName}");
}
break;
case JsonToken.EndObject:
reachedEnd = true;
break;
default:
throw new JsonParserException($"Unexpected Token Type while parsing json properties. TokenType: {jsonReader.TokenType}");
}
}
if(eventLevels != null)
{
return new EventLogCollectionSettings(eventLogName, eventLevels);
}
The last if (eventLevels != null) shows the warning with the message:
[Change this condition so that it does not always evaluate to "false"].
I couldn't create a testcase to reproduce it.
We know about this limitation in our data flow analysis engine. It's related to this ticket: https://jira.sonarsource.com/browse/SLVS-1091. We have no fix for it yet.
I am new to Windows Metro apps and totally stuck here. textBox1.Text displays the correct data inside the function, but Aya remains null outside the function. How can I solve this problem? I think recursion is causing the problem, but how do I solve it?
public async void Aya_Parse()
{
// Initialize http client.
HttpClient httpClient = new HttpClient();
Stream stream = await httpClient.GetStreamAsync("some link");
// Load html document from stream provided by http client.
HtmlDocument htmlDocument = new HtmlDocument();
htmlDocument.OptionFixNestedTags = true;
htmlDocument.Load(stream);
Aya_ParseHtmlNode(htmlDocument.DocumentNode);
}
int aia = 0;
string Aya = null;
private void Aya_ParseHtmlNode(HtmlNode htmlNode)
{
foreach (HtmlNode childNode in htmlNode.ChildNodes)
{
if (childNode.NodeType == HtmlNodeType.Text && aia == 1)
{
Aya += " " + childNode.InnerText.ToString(); aia = 0;
}
else if (childNode.NodeType == HtmlNodeType.Element)
{
Aya += " "; // removing this causes null exception at textbox1.text
switch (childNode.Name.ToLower())
{
case "span":
Aya += childNode.NextSibling.InnerText.ToString();
Aya_ParseHtmlNode(childNode);
break;
case "td":
aia = 1;
Aya_ParseHtmlNode(childNode);break;
default:
Aya_ParseHtmlNode(childNode); break;
}
}
}
textBox1.Text = Aya;
}
You never assign a starting value to Aya, so even though you try to add text to it in your Aya_ParseHtmlNode(HtmlNode htmlNode) method, you can't add text to a null value. This can be fixed by doing a check for null on the value and setting it to a default. I'm surprised you aren't getting a NullReferenceException inside your method...
public async void Aya_Parse()
{
// Initialize http client.
HttpClient httpClient = new HttpClient();
Stream stream = await httpClient.GetStreamAsync("some link");
// Load html document from stream provided by http client.
HtmlDocument htmlDocument = new HtmlDocument();
htmlDocument.OptionFixNestedTags = true;
htmlDocument.Load(stream);
// greetingOutput.Text = htmlDocument.DocumentNode.InnerText.ToString();
// Parse html node, this is a recursive function which call itself until
// all the childs of html document has been navigated and parsed.
Aya_ParseHtmlNode(htmlDocument.DocumentNode);
}
int aia = 0;
string Aya = null;
private void Aya_ParseHtmlNode(HtmlNode htmlNode)
{
if (Aya == null)
{
Aya = String.Empty;
}
foreach (HtmlNode childNode in htmlNode.ChildNodes)
{
if (childNode.NodeType == HtmlNodeType.Text && aia == 1)
{
Aya += " " + childNode.InnerText.ToString(); aia = 0;
}
else if (childNode.NodeType == HtmlNodeType.Element)
{
Aya += " ";
switch (childNode.Name.ToLower())
{
case "span":
Aya += childNode.NextSibling.InnerText.ToString();
Aya_ParseHtmlNode(childNode);
break;
case "td":
aia = 1;
Aya_ParseHtmlNode(childNode);break;
default:
Aya_ParseHtmlNode(childNode); break;
}
}
}
textBox1.Text = Aya;
}
Using a StringBuilder might also be a better idea here, since you could recurse and generate a very large string; a StringBuilder would be easier on your memory:
public void Aya_Parse()
{
// Initialize http client.
HttpClient httpClient = new HttpClient();
Stream stream = httpClient.GetStreamAsync("some link").Result;
// Load html document from stream provided by http client.
HtmlDocument htmlDocument = new HtmlDocument();
htmlDocument.OptionFixNestedTags = true;
htmlDocument.Load(stream);
// greetingOutput.Text = htmlDocument.DocumentNode.InnerText.ToString();
// Parse html node, this is a recursive function which call itself until
// all the childs of html document has been navigated and parsed.
//you marked the method Async, and
//since Aya is in the class, if multiple threads call this
//method, you could get inconsistent results
//I have changed it to a parameter here so this doesn't happen
StringBuilder Aya = new StringBuilder();
Aya_ParseHtmlNode(htmlDocument.DocumentNode, Aya);
//I would also move your textbox update here, so you aren't calling
//ToString() all the time, wasting all of the memory benefits
textBox1.Text = Aya.ToString();
}
int aia = 0;
private void Aya_ParseHtmlNode(HtmlNode htmlNode, StringBuilder Aya)
{
foreach (HtmlNode childNode in htmlNode.ChildNodes)
{
if (childNode.NodeType == HtmlNodeType.Text && aia == 1)
{
Aya.Append(childNode.InnerText); aia = 0;
}
else if (childNode.NodeType == HtmlNodeType.Element)
{
Aya.Append(" ");
switch (childNode.Name.ToLower())
{
case "span":
Aya.Append(childNode.NextSibling.InnerText);
Aya_ParseHtmlNode(childNode, Aya);
break;
case "td":
aia = 1;
Aya_ParseHtmlNode(childNode, Aya);break;
default:
Aya_ParseHtmlNode(childNode, Aya); break;
}
}
}
}
Edit: Your issue actually probably comes from your use of the async keyword on Aya_Parse(), which means that a call to Aya_Parse() may return to the caller before any processing has actually been done. So if you check the value of Aya right after calling Aya_Parse(), the computation has likely not finished yet. I recommend removing the async keyword, or changing Aya_Parse() to return the value of Aya when it finishes. Check here for some good info on how to use the async keyword with return values.
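To illustrate the "return the value when it finishes" option, here is a minimal sketch. It is not the asker's code: AyaLoader and ParseAsync are hypothetical names, Task.Delay merely stands in for the download and parse step, and an async Main assumes C# 7.1 or later.

```csharp
using System;
using System.Threading.Tasks;

class AyaLoader
{
    // Stand-in for the download + parse step. Returning Task<string>
    // (instead of async void) lets the caller await the result.
    public async Task<string> ParseAsync()
    {
        await Task.Delay(50); // simulates the asynchronous work
        return "parsed text";
    }
}

static class Program
{
    static async Task Main()
    {
        var loader = new AyaLoader();
        // Execution resumes here only after ParseAsync has produced its
        // value, so aya cannot be observed in a half-finished state.
        string aya = await loader.ParseAsync();
        Console.WriteLine(aya);
    }
}
```

With async void there is nothing for the caller to await, which is why the field can still be null when it is read right after the call.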
It could be. It's behaving as if your string variable is passed into the method by value rather than holding the reference.
Keep in mind that by using Async methods you are effectively multi threading, so multiple threads would be contending for the same module level variable. The compiler is likely choosing to make your code threadsafe for you.
If you declare a separate string inside your async method and pass it in by ref, it should behave as you expect.
I would also suggest you do the same with your module level int.
OR... you could remove the async from the Aya_Parse and use the Task library (and toss in a Wait call below) to get your stream.
When you create a new XDocument using XDocument.Load, does it open the XML file and keep a local copy, or does it continuously read the document from the hard drive? If it does continuously read, is there a faster way to parse XML?
XDocument x = XDocument.Load("file.xml");
There are a couple of measurements to consider:
Linear traversal speed (e.g. reading/loading)
On-demand query speed
To answer the immediate question: XDocument uses an XmlReader to load the document into memory by reading each element and creating corresponding XElement instances (see code below). As such, it should be quite fast (fast enough for most purposes), but it may consume a large amount of memory when parsing a large document.
A raw XmlReader is an excellent choice for traversal if your needs are limited to that which can be done without retaining the document in memory. It will outperform other methods since no significant structure is created nor resolved with relation to other nodes (e.g. linking parent and child nodes). However, on-demand query ability is almost non-existent; you can react to values found in each node, but you can't query the document as a whole. If you need to look at the document a second time, you have to traverse the whole thing again.
By comparison, an XDocument will take longer to traverse because it instantiates new objects and performs basic structural tasks. It will also consume memory proportionate to the size of the source. In exchange for these trade-offs, you gain excellent query abilities.
It may be possible to combine the approaches, as mentioned by Jon Skeet and shown here: Streaming Into LINQ to XML Using C# Custom Iterators and XmlReader.
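A rough sketch of that combined approach, using XmlReader for traversal and XNode.ReadFrom to materialize one element at a time, could look like the following. The names XmlStreaming and StreamElements, the element name "item" and the sample XML are assumptions made for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

static class XmlStreaming
{
    // Hypothetical helper: lazily yields each element with the given name
    // as an XElement, without building a tree for the whole document.
    public static IEnumerable<XElement> StreamElements(TextReader input, string elementName)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == elementName)
                {
                    // ReadFrom consumes the whole element and leaves the
                    // reader on the node that follows it, so no Read() here.
                    yield return (XElement)XNode.ReadFrom(reader);
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }

    static void Main()
    {
        string xml = "<items><item id=\"1\">a</item><item id=\"2\">b</item></items>";
        foreach (XElement item in StreamElements(new StringReader(xml), "item"))
            Console.WriteLine("{0}: {1}", (string)item.Attribute("id"), item.Value);
    }
}
```

Each yielded XElement can be queried with LINQ to XML, while memory use stays proportional to one element rather than to the whole document.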
Source for XDocument Load()
public static XDocument Load(Stream stream, LoadOptions options)
{
XmlReaderSettings xmlReaderSettings = XNode.GetXmlReaderSettings(options);
XDocument result;
using (XmlReader xmlReader = XmlReader.Create(stream, xmlReaderSettings))
{
result = XDocument.Load(xmlReader, options);
}
return result;
}
// which calls...
public static XDocument Load(XmlReader reader, LoadOptions options)
{
if (reader == null)
{
throw new ArgumentNullException("reader");
}
if (reader.ReadState == ReadState.Initial)
{
reader.Read();
}
XDocument xDocument = new XDocument();
if ((options & LoadOptions.SetBaseUri) != LoadOptions.None)
{
string baseURI = reader.BaseURI;
if (baseURI != null && baseURI.Length != 0)
{
xDocument.SetBaseUri(baseURI);
}
}
if ((options & LoadOptions.SetLineInfo) != LoadOptions.None)
{
IXmlLineInfo xmlLineInfo = reader as IXmlLineInfo;
if (xmlLineInfo != null && xmlLineInfo.HasLineInfo())
{
xDocument.SetLineInfo(xmlLineInfo.LineNumber, xmlLineInfo.LinePosition);
}
}
if (reader.NodeType == XmlNodeType.XmlDeclaration)
{
xDocument.Declaration = new XDeclaration(reader);
}
xDocument.ReadContentFrom(reader, options);
if (!reader.EOF)
{
throw new InvalidOperationException(Res.GetString("InvalidOperation_ExpectedEndOfFile"));
}
if (xDocument.Root == null)
{
throw new InvalidOperationException(Res.GetString("InvalidOperation_MissingRoot"));
}
return xDocument;
}
// which calls...
internal void ReadContentFrom(XmlReader r, LoadOptions o)
{
if ((o & (LoadOptions.SetBaseUri | LoadOptions.SetLineInfo)) == LoadOptions.None)
{
this.ReadContentFrom(r);
return;
}
if (r.ReadState != ReadState.Interactive)
{
throw new InvalidOperationException(Res.GetString("InvalidOperation_ExpectedInteractive"));
}
XContainer xContainer = this;
XNode xNode = null;
NamespaceCache namespaceCache = default(NamespaceCache);
NamespaceCache namespaceCache2 = default(NamespaceCache);
string text = ((o & LoadOptions.SetBaseUri) != LoadOptions.None) ? r.BaseURI : null;
IXmlLineInfo xmlLineInfo = ((o & LoadOptions.SetLineInfo) != LoadOptions.None) ? (r as IXmlLineInfo) : null;
while (true)
{
string baseURI = r.BaseURI;
switch (r.NodeType)
{
case XmlNodeType.Element:
{
XElement xElement = new XElement(namespaceCache.Get(r.NamespaceURI).GetName(r.LocalName));
if (text != null && text != baseURI)
{
xElement.SetBaseUri(baseURI);
}
if (xmlLineInfo != null && xmlLineInfo.HasLineInfo())
{
xElement.SetLineInfo(xmlLineInfo.LineNumber, xmlLineInfo.LinePosition);
}
if (r.MoveToFirstAttribute())
{
do
{
XAttribute xAttribute = new XAttribute(namespaceCache2.Get((r.Prefix.Length == 0) ? string.Empty : r.NamespaceURI).GetName(r.LocalName), r.Value);
if (xmlLineInfo != null && xmlLineInfo.HasLineInfo())
{
xAttribute.SetLineInfo(xmlLineInfo.LineNumber, xmlLineInfo.LinePosition);
}
xElement.AppendAttributeSkipNotify(xAttribute);
}
while (r.MoveToNextAttribute());
r.MoveToElement();
}
xContainer.AddNodeSkipNotify(xElement);
if (r.IsEmptyElement)
{
goto IL_30A;
}
xContainer = xElement;
if (text != null)
{
text = baseURI;
goto IL_30A;
}
goto IL_30A;
}
case XmlNodeType.Text:
case XmlNodeType.Whitespace:
case XmlNodeType.SignificantWhitespace:
if ((text != null && text != baseURI) || (xmlLineInfo != null && xmlLineInfo.HasLineInfo()))
{
xNode = new XText(r.Value);
goto IL_30A;
}
xContainer.AddStringSkipNotify(r.Value);
goto IL_30A;
case XmlNodeType.CDATA:
xNode = new XCData(r.Value);
goto IL_30A;
case XmlNodeType.EntityReference:
if (!r.CanResolveEntity)
{
goto Block_25;
}
r.ResolveEntity();
goto IL_30A;
case XmlNodeType.ProcessingInstruction:
xNode = new XProcessingInstruction(r.Name, r.Value);
goto IL_30A;
case XmlNodeType.Comment:
xNode = new XComment(r.Value);
goto IL_30A;
case XmlNodeType.DocumentType:
xNode = new XDocumentType(r.LocalName, r.GetAttribute("PUBLIC"), r.GetAttribute("SYSTEM"), r.Value, r.DtdInfo);
goto IL_30A;
case XmlNodeType.EndElement:
{
if (xContainer.content == null)
{
xContainer.content = string.Empty;
}
XElement xElement2 = xContainer as XElement;
if (xElement2 != null && xmlLineInfo != null && xmlLineInfo.HasLineInfo())
{
xElement2.SetEndElementLineInfo(xmlLineInfo.LineNumber, xmlLineInfo.LinePosition);
}
if (xContainer == this)
{
return;
}
if (text != null && xContainer.HasBaseUri)
{
text = xContainer.parent.BaseUri;
}
xContainer = xContainer.parent;
goto IL_30A;
}
case XmlNodeType.EndEntity:
goto IL_30A;
}
break;
IL_30A:
if (xNode != null)
{
if (text != null && text != baseURI)
{
xNode.SetBaseUri(baseURI);
}
if (xmlLineInfo != null && xmlLineInfo.HasLineInfo())
{
xNode.SetLineInfo(xmlLineInfo.LineNumber, xmlLineInfo.LinePosition);
}
xContainer.AddNodeSkipNotify(xNode);
xNode = null;
}
if (!r.Read())
{
return;
}
}
goto IL_2E1;
Block_25:
throw new InvalidOperationException(Res.GetString("InvalidOperation_UnresolvedEntityReference"));
IL_2E1:
throw new InvalidOperationException(Res.GetString("InvalidOperation_UnexpectedNodeType", new object[]
{
r.NodeType
}));
}
It will parse the incoming stream (whether it is from a file or a string doesn't matter) when you call Load() and then keep a local instance of the document in memory. Since the source can be anything (could be a NetworkStream, a DataReader, a string entered by the user) it couldn't go back and try to read the data again since it wouldn't know the state of it (streams being closed etc).
If you really want speed, on the other hand, XDocument isn't the fastest option (although it is easier to work with), since it needs to first parse the document and then retain it in memory. If you are working with really large documents, an approach using System.Xml.XmlReader is usually much faster, since it can read the document as a stream and doesn't need to retain anything except the current element. This benchmark shows some interesting figures about this.
I do not think it continuously reads; the nice thing about the XDocument.Load method is that it uses XmlReader to read the XML into an XML tree. Since you now have a tree that is held in memory, it no longer reads the document from disk. It manipulates the tree, and because it is an in-memory tree, all your reading and modification are done a lot faster. Although it does not implement IDisposable, it is reclaimed automatically by the garbage collector.
I'm trying to debug my C# application that checks MIPS syntax, but it's not allowing me to debug it. No matter where I put my breakpoint, it gets ignored, including on the first line of the Main() function. It's also throwing this error.
'add a b c' works fine if I don't call HasValidParams()
'add a b' throws an exception in the same situation
Neither works when calling HasValidParams()
program.cs
private static void Main(string[] args)
{
var validator = new MipsValidator();
Console.Write("Please enter a MIPS statement: ");
string input = Console.ReadLine();
List<string> arguments = input.Split(new char[0]).ToList();
Response status = validator.IsSyntaxValid(arguments);
//Check syntax
if (status.Success.Equals(true))
{
Response stat = validator.HasValidParams(arguments);
//Check parameters
if (stat.Success.Equals(true))
{
Console.WriteLine(string.Format("'{0}' is a valid mips instruction ", input));
}
else
{
foreach (var reason in stat.Reasons)
{
Console.WriteLine(reason);
}
}
}
else
{
foreach (string reason in status.Reasons)
{
Console.WriteLine(reason);
}
}
}
mips-validator.cs
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
namespace mips_validator.utils
{
public class MipsValidator : IMipsValidator
{
#region Implementation of IMipsValidator
public Response IsSyntaxValid(List<string> args)
{
var response = new Response {Success = true};
var op = (Operator) Enum.Parse(typeof (Operator), args[0]);
switch (op)
{
case Operator.addi:
case Operator.add:
case Operator.beq:
if (args.Count != 4)
{
response.Reasons.Add(string.Format("4 operands required for {0}, {1} parameters provided.",
op, args.Count));
response.Success = false;
}
break;
case Operator.j:
if (args.Count != 2)
{
response.Reasons.Add(string.Format("1 operands required for {1}, {0} parameters provided.",
args.Count, op));
response.Success = false;
}
break;
default:
response.Reasons.Add(string.Format("{0} is an unknown mips operation", op));
response.Success = false;
break;
}
return response;
}
public Response HasValidParams(List<string> parameters)
{
string op1, op2, op3;
var temporary = new Regex(@"/\$t\d+/");
var store = new Regex(@"/\$s\d+/");
var zero = new Regex(@"/\$zero/");
var osReserved = new Regex(@"/\$k0|1/");
var memory = new Regex(@"");
var constant = new Regex(@"/-?\d*/");
var label = new Regex(@"/.*\:/");
Operator operation;
var response = new Response {Success = true};
string opString = parameters[0];
Enum.TryParse(opString.Replace("$", string.Empty), true, out operation);
switch (operation)
{
case Operator.add:
{
op1 = parameters[1];
op2 = parameters[2];
if (!temporary.IsMatch(op1) && !store.IsMatch(op1) && !zero.IsMatch(op1))
{
response.Reasons.Add(string.Format("{0}: error register expected", op1));
response.Success = false;
}
if (!temporary.IsMatch(op2) && !store.IsMatch(op2) && !zero.IsMatch(op2))
{
response.Reasons.Add(string.Format("{0}: error register expected", op2));
response.Success = false;
}
}
break;
case Operator.addi:
{
op1 = parameters[1];
op2 = parameters[2];
if (!temporary.IsMatch(op1) && !store.IsMatch(op1) && !zero.IsMatch(op1))
{
response.Reasons.Add(string.Format("{0}: error register expected", op1));
response.Success = false;
}
if (!constant.IsMatch(op2) && !zero.IsMatch(op2))
{
response.Reasons.Add(string.Format("{0}: error constant expected", op2));
response.Success = false;
}
}
break;
case Operator.beq:
{
op1 = parameters[1];
op2 = parameters[2];
op3 = parameters[3];
if (!temporary.IsMatch(op1) && !store.IsMatch(op1) && !zero.IsMatch(op1))
{
response.Reasons.Add(string.Format("{0}: error register expected", op1));
response.Success = false;
}
if (!temporary.IsMatch(op2) && !store.IsMatch(op2) && !zero.IsMatch(op2))
{
response.Reasons.Add(string.Format("{0}: error register expected", op2));
response.Success = false;
}
if (!label.IsMatch(op3) && !constant.IsMatch(op3))
{
response.Reasons.Add(string.Format("{0}: error label or constant expected", op3));
response.Success = false;
}
}
break;
}
return response;
}
#endregion
}
}
SOLUTION-------
Response.cs(old)
public class Response
{
public List<string> Reasons;
public bool Success = true;
}
Response.cs(current)
public class Response
{
public Response()
{
Reasons = new List<string>();
Success = true;
}
public List<string> Reasons;
public bool Success = true;
}
I can't tell if you're looking for a way to debug your project or if you'd prefer to be told about potential issues in your code.
For the latter:
Make sure Response.Reasons is initialized by the constructor of Response (or a field initializer).
You're not showing the Response class, so make sure Reasons is actually set to a collection you can add to and not left to the default, null.
Edit: The possible cause for a crash below was pointed out by @nodakai not to be one at all; it turns out an empty char array is a special case that splits on whitespace.
*You calculate arguments by doing
List<string> arguments = input.Split(new char[0]).ToList();
...which as far as I can tell does absolutely nothing except put the original string inside a List. You probably want to split on new char[] {' '} instead to split on spaces.*
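The whitespace special case mentioned in the edit above can be checked with a few lines. This is just an illustrative sketch; the SplitDemo class, the CountParts helper and the sample input are made up for the purpose.

```csharp
using System;

static class SplitDemo
{
    // Hypothetical helper: number of parts produced by an empty separator array.
    public static int CountParts(string input)
    {
        // An empty (or null) separator array makes Split use
        // white-space characters as the delimiters.
        return input.Split(new char[0]).Length;
    }

    static void Main()
    {
        string input = "add $t0 $t1\t$t2";
        Console.WriteLine(CountParts(input)); // 4: "add", "$t0", "$t1", "$t2"

        // Consecutive white-space characters yield empty entries unless
        // they are removed explicitly.
        string[] parts = "add   $t0".Split(new char[0], StringSplitOptions.RemoveEmptyEntries);
        Console.WriteLine(parts.Length); // 2
    }
}
```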
Check if your breakpoint looks like this:
If it does, your source code differs from the code the assembly was actually compiled with. Make sure your project is built properly (right click on the solution and select "Rebuild") and check your current configuration:
Hope this helps...