Today, I encountered this new syntax in a C# program:
var sb = new StringBuilder();
foreach (var ns in this.Namespaces)
{
_ = sb.AppendFormat(CultureInfo.InvariantCulture, " {0}", ns.Value);
}
The underscore inside the loop is never defined, yet the code above compiles just fine.
So I think the underscore is syntactic sugar of some kind, but my Google skills failed me this time (I only found Perl and Python information about the underscore, or a JS library named underscore).
So the meaning of the underscore in this snippet is not clear to me. Can someone clarify it?
The code is taken from this library:
https://www.nuget.org/packages/OpenGraph-Net/4.0.2-alpha.0.6
It is a .NET 6 library written in C#.
The term you need to search for is "discards": the underscore indicates that a value is intentionally unused. Here it discards the StringBuilder that AppendFormat returns.
See here for more details.
I have a very simple grammar that (I think) should only allow additions of two elements like 1+1 or 2+3
grammar dumbCalculator;
expression: simple_add EOF;
simple_add: INT ADD INT;
INT:('0'..'9');
ADD : '+';
I generate my C# classes using the official ANTLR jar file
java -jar "antlr-4.9-complete.jar" C:\Users\Me\source\repos\ConsoleApp2\ConsoleApp2\dumbCalculator.g4 -o C:\Users\Me\source\repos\ConsoleApp2\ConsoleApp2\Dumb -Dlanguage=CSharp -no-listener -visitor
No matter what I try, the parser keeps accepting trailing elements although they shouldn't be allowed.
For example, "1+1+1" is parsed without complaint into this tree:
expression
simple_add
1
+
1
+
1
This happens even though I specifically wrote that expression must be simple_add followed by EOF, and simple_add is just INT ADD INT. I have no idea why the rest is accepted; I expected ANTLR to throw an exception here.
This is how I test my parser :
var inputStream = new AntlrInputStream("1+1+1");
var lexer = new dumbCalculatorLexer(inputStream);
lexer.RemoveErrorListeners();
lexer.AddErrorListener(new ThrowExceptionErrorListener());
var commonTokenStream = new CommonTokenStream(lexer);
var parser = new dumbCalculatorParser(commonTokenStream);
parser.RemoveErrorListeners();
parser.AddErrorListener(new ThrowExceptionErrorListener());
var ex = parser.expression();
ExploreAST(ex);
Why is the rest of the input being accepted?
Classic scenario: I found my error 5 minutes after posting on Stack Overflow.
For anyone encountering a similar scenario, this happened because I did not explicitly set the ErrorHandler on my parser.
Naively, I expected the listeners registered with AddErrorListener to handle the errors, but the parser's default error strategy recovers from a syntax error and keeps parsing, so there is a specific thing to do if you need parsing to stop before you visit the tree.
I needed to add
parser.ErrorHandler = new BailErrorStrategy();
After this, I indeed got the exceptions on wrong input strings.
This is probably not the right thing to do; I'll let someone who knows ANTLR better comment on this.
I need to create a C#/.NET program that searches for specific words in a Microsoft Word document and replaces them with other words. For example, my Word file contains the text “LeadSoft IT”, which should be replaced by “LeadSoft IT Limited”. The problem is that the first run correctly replaces LeadSoft IT with LeadSoft IT Limited, but if I run the program again it matches LeadSoft IT again, and the text becomes LeadSoft IT Limited Limited. Can anyone suggest how to solve this with C# code that replaces words in a Word document?
If you already have some script for this, feel free to post it and I'll try and help more.
I'm not sure what functionality you're using to find the text instance, but I would suggest looking into regex, and using something like (LeadSoft IT(?! Limited)).
Regex: https://regexr.com/
A good regex tester: https://www.regextester.com/109925
Edit: I made a Python script that uses regex to replace the instances:
import re
word_doc = "We like working " \
"here at Leadsoft IT.\n" \
"We are not limited here at " \
"Leadsoft It Limited."
replace_str = "Leadsoft IT Limited"
reg_str = '(Leadsoft IT(?!.?Limited))'
fixed_str = re.sub(reg_str, replace_str, word_doc, flags=re.IGNORECASE)
print(fixed_str)
# Prints:
# We like working here at Leadsoft IT Limited.
# We are not limited here at Leadsoft It Limited.
Edit 2: Code re-created in C#: https://gist.github.com/Zylvian/47ecd6d1953b8d8c3900dc30645efe98
The regex checks the entire string for instances where Leadsoft IT is NOT followed by Limited, and for all those instances, replaces Leadsoft IT with Leadsoft IT Limited.
The regex uses what's called a "negative lookahead (?!)" which makes sure that the string to the left is not followed by the string to the right. Feel free to edit the regex how you see fit, but be aware that the matching is very strong.
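The point of the negative lookahead is that the replacement becomes safe to run repeatedly. A minimal sketch of that property, assuming the simpler pattern suggested earlier (the helper name add_suffix and the sample sentence are made up for illustration):

```python
import re

# Match "LeadSoft IT" only when it is not already followed by " Limited".
PATTERN = re.compile(r"LeadSoft IT(?! Limited)")

def add_suffix(text: str) -> str:
    """Expand bare occurrences; already-expanded ones are left alone."""
    return PATTERN.sub("LeadSoft IT Limited", text)

once = add_suffix("Welcome to LeadSoft IT. LeadSoft IT Limited is hiring.")
twice = add_suffix(once)  # simulates running the program a second time

print(once)
print(twice == once)  # True: the second run changes nothing
```

The same pattern should carry over to Regex.Replace in C#, since .NET regular expressions also support (?!...) lookaheads.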
If you want to understand the regex string better, feel free to copy it into https://www.regextester.com/.
Let me know if that helps!
Simplistically, you can just run another replace to fix the problem you cause:
s = s.Replace("LeadSoft IT", "LeadSoft IT Limited").Replace("LeadSoft IT Limited Limited", "LeadSoft IT Limited");
If you're after a more generic fix that doesn't hard-code the problem string, check whether the string you search for occurs inside the string you replace it with; if it does, the problem will occur. In that case you need to run a second replacement on the document that finds the result of applying the replacement to the replacement string itself:
var find = "LeadSoft IT";
var repl = "LeadSoft IT Limited";
var result = document.Replace(find, repl);
var problemWillOccur = repl.Contains(find);
if(problemWillOccur){
var fixProblemByFinding = repl.Replace(find, repl); //is "LeadSoft IT Limited Limited"
result = result.Replace(fixProblemByFinding, repl);
}
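The same idea can be checked quickly outside of Word. This is only a sketch of the two-pass approach on a plain string, written in Python to match the script in the earlier answer; replace_once is a hypothetical helper name, not part of any library:

```python
def replace_once(document: str, find: str, repl: str) -> str:
    """Replace `find` with `repl`, then undo any doubled replacement."""
    result = document.replace(find, repl)
    if find in repl:
        # What an already-replaced occurrence turns into on a later pass,
        # e.g. "LeadSoft IT Limited Limited".
        doubled = repl.replace(find, repl)
        result = result.replace(doubled, repl)
    return result

text = "LeadSoft IT builds software."
first = replace_once(text, "LeadSoft IT", "LeadSoft IT Limited")
second = replace_once(first, "LeadSoft IT", "LeadSoft IT Limited")

print(first)
print(second == first)  # True: a second run leaves the text unchanged
```

One caveat: if the document legitimately contains the doubled string, this approach collapses it as well.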
You may be interested in how I solved this problem.
At first I was using NPOI, but it was making a mess of the document, so I discovered that a DOCX file is simply a ZIP archive containing XML files.
https://github.com/kubala156/DociFlow/blob/main/DociFlow.Lib/Word/SeekAndReplace.cs
Usage:
var vars = new Dictionary<string, string>()
{
{ "testtag", "Test tag value" }
};
using (var doci = new DociFlow.Lib.Word.SeekAndReplace())
{
// test.docx contains text with tag "{{testtag}}" it will be replaced with "Test tag value"
doci.Open("test.docx");
doci.FindAndReplace(vars, "{{", "}}");
}
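To illustrate the observation that a DOCX file is a ZIP archive of XML parts, here is a rough sketch that rewrites the word/document.xml entry with Python's standard zipfile module. It is not how the linked library works internally. The function name replace_in_docx_bytes is hypothetical, and the toy archive below holds a single entry, so it is not a document Word could open:

```python
import io
import zipfile

def replace_in_docx_bytes(docx_bytes: bytes, find: str, repl: str) -> bytes:
    """Return a copy of the archive with `find` replaced in word/document.xml."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as src, \
         zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "word/document.xml":
                # The main document body is stored as UTF-8 XML.
                data = data.decode("utf-8").replace(find, repl).encode("utf-8")
            dst.writestr(item, data)  # all other entries are copied unchanged
    return out.getvalue()

# Toy archive standing in for a real .docx file (assumption for the demo).
toy = io.BytesIO()
with zipfile.ZipFile(toy, "w") as z:
    z.writestr("word/document.xml", "<w:t>Hello {{testtag}}</w:t>")

updated = replace_in_docx_bytes(toy.getvalue(), "{{testtag}}", "Test tag value")
with zipfile.ZipFile(io.BytesIO(updated)) as z:
    text = z.read("word/document.xml").decode("utf-8")
print(text)
```

Two caveats apply to real documents: Word may split a visible phrase across several text runs, so a plain string search on the XML can miss it, and replacement text containing characters such as & or < has to be XML-escaped first.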
NPOI 2.5.4 provides a ReplaceText method to help you replace placeholders in a Word file.
Here is an example.
https://github.com/nissl-lab/npoi-examples/blob/main/xwpf/ReplaceTexts/Program.cs
I'm now writing a C# grammar using ANTLR 3, based on this grammar file.
But I found some definitions I can't understand.
NUMBER:
Decimal_digits INTEGER_TYPE_SUFFIX? ;
// For the rare case where 0.ToString() etc is used.
GooBall
@after
{
CommonToken int_literal = new CommonToken(NUMBER, $dil.text);
CommonToken dot = new CommonToken(DOT, ".");
CommonToken iden = new CommonToken(IDENTIFIER, $s.text);
Emit(int_literal);
Emit(dot);
Emit(iden);
Console.Error.WriteLine("\tFound GooBall {0}", $text);
}
:
dil = Decimal_integer_literal d = '.' s=GooBallIdentifier
;
fragment GooBallIdentifier
: IdentifierStart IdentifierPart* ;
The above fragments contain the definition of 'GooBall'.
I have some questions about this definition.
Why is GooBall needed?
Why does this grammar define lexer rules to parse '0.ToString()' instead of parser rules?
It's because that's a valid expression that's not handled by any of the other rules. I guess you'd call it something like an anonymous object, for lack of a better term, similar to "hello world".ToUpper(). Normally method calls are only valid on variable identifiers or return values, à la GetThing().Method(), or otherwise bare.
Sorry. I found the reason from the official FAQ pages.
Now if you want to add '..' range operator so 1..10 makes sense, ANTLR has trouble distinguishing 1. (start of the range) from 1. the float without backtracking. So, match '1..' in NUM_FLOAT and just emit two non-float tokens:
I'm trying to use a DLL generated by ikvmc from a jar file compiled from Scala code (yeah my day is THAT great). The Scala compiler seems to generate identifiers containing dollar signs for operator overloads, and IKVM uses those in the generated DLL (I can see it in Reflector). The problem is, dollar signs are illegal in C# code, and so I can't reference those methods.
Any way to work around this problem?
You should be able to access the funky methods using reflection. Not a nice solution, but at least it should work. Depending on the structure of the API in the DLL it may be feasible to create a wrapper around the methods to localise the reflection code. Then from the rest of your code just call the nice wrapper.
The alternative would be to hack on the IL in the target DLL and change the identifiers. Or do some post-build IL-hacking on your own code.
Perhaps you can teach IKVM to rename these identifiers so that they have no dollar sign? I'm not super familiar with it, but a quick search pointed me at these:
http://weblog.ikvm.net/default.aspx?date=2005-05-02
What is the format of the Remap XML file for IKVM?
String and complex data types in Map.xml for IKVM!
Good Hunting
Write synonyms for those methods:
def +(a:A,b:A) = a + b
val plus = + _
I fear that you will have to use Reflection in order to access those members. Escaping simply doesn't work in your case.
But for those of you who are interested in the escaping mechanics, I've written an explanation.
In C# you can use the @ sign to escape keywords and use them as identifiers. However, this does not help to escape invalid characters:
bool @bool = false;
There is a way to write identifiers differently by using a Unicode escape sequence:
int i\u0064; // '\u0064' == 'd'
id = 5;
Yes this works. However, even with this trick you can still not use the $-sign in an identifier. Trying...
int i\u0024; // '\u0024' == '$'
... gives the compiler error "Unexpected character '\u0024'". The identifier must still be a valid identifier! The C# compiler probably resolves the escape sequence in a kind of pre-processing step and treats the resulting identifier as if it had been entered normally.
So what is this escaping good for? Maybe it can help you if someone uses a foreign-language character that is not on your keyboard.
int \u00E4; // German a-Umlaut
ä = 5;
I am trying to write a regular expression that checks whether variable names follow camel casing.
The expression I have got so far is:
(?xm-isn:(?:\b\w*(?:-)\w*\s*\=)|(?:\b[A-Z0-9_-]+(?=\s*\W*\b)\s*\=))
which works fine.
The question is, how can I make an exception for the following part of the code so it doesn't consider this naming convention for that particular part of the code in the file?
public enum ProjectType
{
[DisplayName("All")]
All = 0,
[DisplayName("All .NET - Windows Forms and Web Forms")]
AllNet = 1,
}
Regex is great for pattern matching but not for lexical analysis. I suggest you look into a lexer generator such as Gardens Point LEX (GPLEX).