What is the equivalent to &THROW_PUBLIC in C#, or what does &THROW_PUBLIC mean in ABL? I don't know much about ABL but need to convert it to C#.
Code snippet from a Data Directive in Epicor 9.05:
If ttPart.ProdCode = '' then do:
If lookup(SUBSTRING(ttPart.PartNum,1,3),'12-,13-,14-,15-',',') <> 0 then do:
{lib\PublishEx.i &ExMsg = "'Please select a Product Group for this Part'"}
{&THROW_PUBLIC}.
End.
End.
&THROW_PUBLIC is not a Progress ABL keyword or any standard feature at all. Most likely it's something specific to Epicor.
For better understanding of the problem perhaps you should post more code!
However: beginning with an ampersand might be a clue to it being a preprocessor. A preprocessor can be defined with GLOBAL-DEFINE or SCOPED-DEFINE - look for that in your code.
When the program is compiled, any reference to the preprocessor (written like {&name-of-preprocessor}) will be replaced with its definition. Certain limited checks can be done (for instance, which OS is used for compiling).
Here's an example of two preprocessors being defined and used.
&GLOBAL-DEFINE THROW_PUBLIC1 MESSAGE "HELLO 1".
&SCOPED-DEFINE THROW_PUBLIC2 MESSAGE "HELLO 2".
{&THROW_PUBLIC1}
{&THROW_PUBLIC2}
After the preprocessor has run, the program will simply look like this:
MESSAGE "HELLO 1".
MESSAGE "HELLO 2".
The THROW part might indicate some kind of error handling being used like a THROW-CATCH-situation or similar. They are written something like this:
BLOCK-LEVEL ON ERROR UNDO, THROW.
DEFINE VARIABLE i AS INTEGER NO-UNDO.
ASSIGN
i = INTEGER("hello").
CATCH err AS Progress.Lang.Error:
MESSAGE "Error is caught here" VIEW-AS ALERT-BOX.
END.
FINALLY:
MESSAGE "This is run in the end" VIEW-AS ALERT-BOX.
END.
{&THROW_PUBLIC}. is defined in the Epicor include file manager\Exception.i (this is probably xcoded so you can't read it). This basically skips to the end of the {&TRY_PUBLIC} / {&CATCH_PUBLIC} block and publishes an exception.
In C# it is much easier:
throw new Ice.BLException("Please select a Product Group for this Part");
You can use a standard System.Exception but Ice.BLException (defined in Epicor.ServiceModel.dll) has some overloads to record extra information which other parts of the framework can display/log.
In an Antlr 3 grammar, is it possible to print out the full text matching a rule in a grammar targeting c#? Something like below:
rule : FIRST SECOND
{ Console.WriteLine($rule.text); };//does not work.
FIRST: 'first';
SECOND: 'second';
If $rule.text doesn't work (as @dana suggested), you might try $rule.Text or even $rule.GetText().
If all fails, please tell us which version of the C# port you're using (and where it can be downloaded), then I (or someone else) can perhaps give it a try.
E.g:
isValidCppIdentifier("_foo") // returns true
isValidCppIdentifier("9bar") // returns false
isValidCppIdentifier("var'") // returns false
I wrote some quick code but it fails:
my regex is "[a-zA-Z_$][a-zA-Z0-9_$]*"
and I simply do regex.IsMatch(inputString).
Thanks..
It should work with some added anchoring:
"^[a-zA-Z_][a-zA-Z0-9_]*$"
If you really need to support ludicrous identifiers using Unicode, feel free to read one of the various versions of the standard and add all the ranges into your regexp (for example, pages 713 and 714 of http://www-d0.fnal.gov/~dladams/cxx_standard.pdf)
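The reason the unanchored pattern fails is that an "is match" style call succeeds as soon as any substring matches, so "9bar" passes because "bar" matches. A minimal sketch of the anchored check, written in Python rather than C# for brevity; the function name is_valid_cpp_identifier is taken from the question, and the sketch assumes ASCII identifiers only and does not reject C++ keywords:

```python
import re

# ASCII-only identifier: a letter or underscore, then letters, digits, underscores.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_cpp_identifier(text: str) -> bool:
    """Return True only if the WHOLE string is an identifier.

    fullmatch() plays the role of the ^...$ anchors: a partial match
    somewhere inside the string is not enough.
    """
    return _IDENTIFIER.fullmatch(text) is not None


def contains_identifier(text: str) -> bool:
    """The unanchored check from the question, shown for comparison."""
    return _IDENTIFIER.search(text) is not None
```

With these definitions, contains_identifier("9bar") is true while is_valid_cpp_identifier("9bar") is false, which is the difference the anchors make.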
Matti's answer will work to sanitize identifiers before inserting into C++ code, but won't handle C++ code as input very well. It will be annoying to separate things like L"wchar_t string", where L is not an identifier. And there's Unicode.
Clang, Apple's compiler which is built on a philosophy of modularity, provides a set of tokenizer functions. It looks like you would want clang_createTranslationUnitFromSourceFile and clang_tokenize.
I didn't check to see if it handles \Uxxxx or anything. Can't make any kind of guarantees. Last time I used LLVM was five years ago and it wasn't the greatest experience… but not the worst either.
On the other hand, GCC certainly has it, although you have to figure out how to use cpp_lex_direct.
I'm attempting to write an application to extract properties and code from proprietary IDE design files. The file format looks something like this:
HEADING
{
SUBHEADING1
{
PropName1 = PropVal1;
PropName2 = PropVal2;
}
SUBHEADING2
{
{ 1 ; PropVal1 ; PropValue2 }
{ 2 ; PropVal1 ; PropValue2 ; OnEvent1=BEGIN
MESSAGE('Hello, World!');
{ block comments are between braces }
//inline comments are after double-slashes
END;
PropVal3 }
{ 1 ; PropVal1 ; PropVal2; PropVal3 }
}
}
What I am trying to do is extract the contents under the subheading blocks. In the case of SUBHEADING2, I would also separate each token as delimited by the semicolons. I had reasonably good success with just counting the brackets and keeping track of what subheading I'm currently under. The main issue I encountered involves dealing with the code comments.
This language happens to use {} for block comments, which interferes with the brackets in the file format. To make it even more interesting, it also needs to take into account double-slash inline comments and ignore everything up to the end of the line.
What is the best approach to tackling this? I looked at some of the compiler libraries discussed in another article (ANTLR, Doxygen, etc.) but they seem like overkill for solving this specific parsing issue.
I'd suggest writing a tokenizer and parser; this will give you more flexibility. The tokenizer basically does a simple text-wise breakdown of the source code and puts it into a more usable data structure; the parser figures out what to do with it, often leveraging recursion.
Terms to google: tokenizer, parser, compiler design, grammars
Math expression evaluator: http://www.codeproject.com/KB/vb/math_expression_evaluator.aspx
(you might be able to take an example like this and hack it apart into what you want)
More info about parsing: http://www.codeproject.com/KB/recipes/TinyPG.aspx
You won't have to go nearly as far as those articles go, but, you're going to want to study a bit on this one first.
You should be able to put something together in a few hours, using regular expressions in combination with some code that uses the results.
Something like this should work:
- Initialize the process by loading the file into a string.
- Pull each top-level block from the string, using regex tags to separately identify the block keyword and contents.
- If a block is found:
  - Make a decision based on the keyword.
  - Pass the content to this process recursively.
Following this, you would process HEADING, then the first SUBHEADING, then the second SUBHEADING, then each sub-block. For the sub-block containing the block comment, you would presumably know based on the block's lack of a keyword that any sub-block is a comment, so there is no need to process the sub-blocks.
No matter which solution you will choose, I'm pretty sure the best way is to have 2 parsers/tokenizers. One for the main file structure with {} as grouping characters, and one for the code blocks.
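The two-mode idea can be sketched as a single scanner that switches rules when it enters a code section. This is only an illustration in Python, under assumptions that are not confirmed by the question: code sections start at the word BEGIN and end at the first END; (no nested BEGIN/END), string literals use single quotes, and the helper names are hypothetical.

```python
def _is_word_at(text: str, i: int, word: str) -> bool:
    """True if `word` starts at index i and is not part of a longer name."""
    if not text.startswith(word, i):
        return False
    j = i + len(word)
    before_ok = i == 0 or not (text[i - 1].isalnum() or text[i - 1] == "_")
    after_ok = j >= len(text) or not (text[j].isalnum() or text[j] == "_")
    return before_ok and after_ok


def _skip_past(text: str, needle: str, i: int) -> int:
    """Index just after the next `needle` at or after i, or end of text."""
    j = text.find(needle, i)
    return len(text) if j == -1 else j + len(needle)


def find_block_end(text: str, start: int) -> int:
    """Return the index of the '}' that closes the '{' at text[start].

    Structure mode: braces nest. Code mode (BEGIN ... END;): braces are
    block comments, // starts a line comment, '...' is a string literal.
    """
    if text[start] != "{":
        raise ValueError("start must point at an opening brace")
    depth, i, in_code = 0, start, False
    while i < len(text):
        if in_code:
            if text.startswith("//", i):
                i = _skip_past(text, "\n", i)        # line comment
            elif text[i] == "{":
                i = _skip_past(text, "}", i)         # block comment
            elif text[i] == "'":
                i = _skip_past(text, "'", i + 1)     # string literal
            elif _is_word_at(text, i, "END") and text.startswith(";", i + 3):
                in_code, i = False, i + 4            # back to structure mode
            else:
                i += 1
        elif _is_word_at(text, i, "BEGIN"):
            in_code, i = True, i + 5
        elif text[i] == "{":
            depth, i = depth + 1, i + 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return i
            i += 1
        else:
            i += 1
    raise ValueError("no matching closing brace")
```

Once the end of a record is known, the text between the braces can be handed to whichever tokenizer fits: a semicolon splitter for the properties, a separate one for the code.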
Does anyone have any suggestions as to how I can clean the body of incoming emails? I want to strip out disclaimers, images and maybe any previous email text that may be also be present so that I am left with just the body text content. My guess is it isn't going to be possible in any reliable way, but has anyone tried it? Are there any libraries geared towards this sort of thing?
In email, there are a couple of agreed-upon markings that indicate something you may wish to strip. You can look for these lines using regular expressions. I doubt you can really "sanitize" your emails well, but here are some things you can look for:
Line starting with "> " (greater than then whitespace) marks a quote
Line with "-- " (two hyphens then whitespace then linefeed) marks the beginning of a signature, see Signature block on Wikipedia
Multipart messages, boundaries start with --, beyond that you need to do some searching to separate the message body parts from unwanted parts (like base64 images)
As for an actual C# implementation, I leave that for you or other SOers.
A few obvious things to look at:
If the mail is anything but pure plain text, the message will be multi-part MIME. Any part whose type is "image/*" (image/jpeg, etc.) can probably be dropped. In all likelihood any part whose type is not "text/*" can go.
A HTML message will probably have a part of type "multipart/alternative" (I think), and will have 2 parts, one "text/plain" and one "text/html". The two parts should be just about equivalent, so you can drop the HTML part. If the only part present is the HTML bit, you may have to do a HTML to plain text conversion.
The usual format for quoted text is to precede the text by a ">" character. You should be able to drop these lines, unless the line starts ">From", in which case the ">" has been inserted to prevent the mail reader from thinking that the "From " is the start of a new mail.
The signature should start with "-- \r\n", though there is a very good chance that the trailing space will be missing.
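The quoting and signature conventions described above can be sketched as a small line filter. This is a Python illustration rather than C#, the function name is hypothetical, and it assumes the body has already been decoded to plain text:

```python
def strip_quotes_and_signature(body: str) -> str:
    """Drop '>' quoted lines and everything from the signature marker on."""
    kept = []
    for line in body.splitlines():
        # "-- " marks the signature; the trailing space is often missing.
        if line in ("-- ", "--"):
            break
        # ">From " is an escaped "From " line, not a quote: undo the escape.
        if line.startswith(">From "):
            kept.append(line[1:])
            continue
        # Any other line starting with ">" is quoted text from an earlier mail.
        if line.startswith(">"):
            continue
        kept.append(line)
    return "\n".join(kept).strip()
```

Replies that quote with a different prefix, or that put the new text below an "On ... wrote:" header, are not handled by this sketch.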
Version 3 of OSBF-Lua has a mail-parsing library that will handle the MIME and split a message into its MIME parts and so on. I currently have a mess of Lua scripts that do stuff like ignore most non-text attachments, prefer plain text to HTML, and so on. (I also wrap long lines to 80 characters while trying to preserve quoting.)
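The "prefer plain text to HTML, ignore non-text attachments" rule can be sketched with Python's standard email package instead of OSBF-Lua. The function name is hypothetical, and converting an HTML-only message to text is left out:

```python
from email import message_from_string, policy


def preferred_text_body(raw_message: str) -> str:
    """Return the text/plain part if present, else text/html, else ''."""
    # policy.default yields EmailMessage objects, which provide get_body().
    message = message_from_string(raw_message, policy=policy.default)
    # Attachments such as images are never candidates for the body.
    part = message.get_body(preferencelist=("plain", "html"))
    return part.get_content() if part is not None else ""
```

For a multipart/alternative message this returns the plain-text alternative; for an HTML-only message it returns the HTML markup unchanged.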
As far as removing previously quoted mail, the suggestions above are all good (you must subscribe to some ill-mannered mailing lists).
Removing disclaimers reliably is probably going to be hard. My first cut would be simply to maintain a library of disclaimers that would be stripped off the end of each mail message; I would write a script to make it easy for me to add to the library. For something more sophisticated I would try some kind of machine learning.
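The "library of disclaimers" first cut could look something like the following Python sketch. The function name and the idea of exact, end-of-message matching are assumptions made for illustration; real disclaimers vary in whitespace and wording, which is where this approach starts to struggle:

```python
from typing import Iterable


def strip_known_disclaimers(body: str, disclaimers: Iterable[str]) -> str:
    """Remove any known disclaimer texts found at the end of the body."""
    known = [d.strip() for d in disclaimers if d.strip()]
    text = body.rstrip()
    changed = True
    while changed:  # repeat, since several disclaimers may be stacked
        changed = False
        for disclaimer in known:
            if text.endswith(disclaimer):
                text = text[: -len(disclaimer)].rstrip()
                changed = True
    return text
```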
I've been working on spam filtering since Feb 2007 and I've learned that anything to do with email is a mess. A good rule of thumb is that whatever you want to do is a lot harder than you think it is :-(
Given your question "Is it possible to programmatically ‘clean’ emails?", I'd answer "No, not reliably".
The danger you face isn't really a technological one, but a sociological one.
It's easy enough to spot, and filter out, some aspects of the messages - like images. Filtering out signatures and disclaimers is, likewise, possible to achieve (though more of a challenge).
The real problem is the cost of getting it wrong.
What happens if your filter happens to remove a critical piece of the message? Can you trace it back to find the missing piece, or is your filtering destructive? Worse, would you even notice that the piece was missing?
There's a classic comedy sketch I saw years ago that illustrates the point. Two guys working together on a car. One is underneath doing the work, the other sitting nearby reading instructions from a service manual - it's clear that neither guy knows what he's doing, but they're doing their best.
Manual guy, reading aloud: "Undo the bolt in the centre of the oil pan ..." [turns page]
Tool guy: "Ok, it's out."
Manual guy: "... under no circumstances."
If you are creating your own application, I'd look into Regex to find text and replace it. To make the application a little nicer, I'd create a class called Email with a property called RAW and a property called Stripped.
Just some hints, you'll gather the rest when you look into regex!
SigParser has an assembly you can use in .NET. It gives you the body back in both HTML and text forms with the rest of the stuff stripped out. If you give it an HTML email it will convert the email to text if you need that.
var parser = new SigParser.EmailParsing.EmailParser();
var result = await parser.GetCleanedBodyAsync(new SigParser.EmailParsing.Models.CleanedBodyInput {
FromEmailAddress = "john.smith@example.com",
FromName = "John Smith",
TextBody = @"Hi Mark,
This is my message.
Thanks
John Smith
888-333-4434"
});
// This would print "Hi Mark,\r\nThis is my message."
Console.WriteLine(result.CleanedBodyPlain);
I am wondering if it is possible to extract the index position in a given string where a Regex failed when trying to match it?
For example, if my regex was "abc" and I tried to match that with "abd" the match would fail at index 2.
Edit for clarification. The reason I need this is to allow me to simplify the parsing component of my application. The application is an Assembly language teaching tool which allows students to write, compile, and execute assembly-like programs.
Currently I have a tokenizer class which converts input strings into Tokens using regexes. This works very well. For example:
The tokenizer would produce the following tokens given the following input = "INP :x:":
Token.OPCODE, Token.WHITESPACE, Token.LABEL, Token.EOL
These tokens are then analysed to ensure they conform to a syntax for a given statement. Currently this is done using IF statements and is proving cumbersome. The upside of this approach is that I can provide detailed error messages. For example:
if(token[2] != Token.LABEL) { throw new SyntaxError("Expected label");}
I want to use a regular expression to define a syntax instead of the annoying IF statements. But in doing so I lose the ability to return detailed error reports. I therefore would at least like to inform the user of WHERE the error occurred.
I agree with Colin Younger, I don't think it is possible with the existing Regex class. However, I think it is doable if you are willing to sweat a little:
Get the Regex class source code (e.g. http://www.codeplex.com/NetMassDownloader to download the .Net source).
Change the code to have a readonly property with the failure index.
Make sure your code uses that Regex rather than Microsoft's.
I guess such an index would only have meaning in some simple case, like in your example.
If you take a regex like "ab*c*z" (where by * I mean any character) and a string "abbbcbbcdd", what should the index you are talking about be?
It will depend on the algorithm used for matching...
Could fail on "abbbc..." or on "abbbcbbc..."
I don't believe it's possible, but I am intrigued why you would want it.
In order to do that you would need either callbacks embedded in the regex (which AFAIK C# doesn't support) or preferably hooks into the regex engine. Even then, it's not clear what result you would want if backtracking was involved.
It is not possible to tell where a regex fails, so you need to take a different approach: you need to compare strings. Use a regex to remove all the things that could vary, and compare what is left with the string that you know does not change.
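For the token-based case in the question, one alternative that keeps the error position is to describe each statement's syntax as data and compare token by token. The following Python sketch assumes every statement form is a fixed sequence of token kinds; the token names come from the question, while the function name and the string representation of tokens are hypothetical:

```python
from typing import Optional, Sequence


def first_mismatch(expected: Sequence[str], actual: Sequence[str]) -> Optional[int]:
    """Return the index of the first token that breaks the syntax, or None."""
    for index, wanted in enumerate(expected):
        # Too few tokens, or the wrong kind of token at this position.
        if index >= len(actual) or actual[index] != wanted:
            return index
    # Extra tokens after a complete statement are also an error.
    if len(actual) > len(expected):
        return len(expected)
    return None


# Syntax of the "INP :x:" statement, written as data instead of IF statements.
INP_SYNTAX = ["OPCODE", "WHITESPACE", "LABEL", "EOL"]
```

The returned index identifies the offending token, and the expected kind at that index gives the message, for example "Expected LABEL". Optional or repeated tokens would need a richer description than a flat list.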
I ran into the same problem, came across your answer, and had to work out my own solution. Here it is:
https://stackoverflow.com/a/11730035/637142
hope it helps