I have this solution (that works), but I would like to know if there's a way to write a loop that checks whether the method name is posted and, if it is, runs the method. Current code:
if (HttpContext.Current.Request["FunctionName"] != null)
{
    switch (HttpContext.Current.Request["FunctionName"])
    {
        case "DoStuff":
            DoStuff();
            break;
        //... etc
    }
}
Hope you get the idea, otherwise I'll elaborate.
Thanks in advance!
You could call GetType().GetMethod(HttpContext.Current.Request["FunctionName"], new Type[]{}) which would return a MethodInfo that you could invoke. I wouldn't though for a few reasons:
The general diciness of do-whatever-the-user-tells-you is high enough that even with the assurance that this was done in a class where every method (including inherited) was safe to run, I'd rather be more active in parsing requests from potentially malicious users.
There'd have to be a lot of such methods before the convenience of this outweighed the relative cost, and at that point I'd wonder about the specification of the resource in question. URIs should map to a resource with well defined meanings, rather than including everything but the kitchen sink. There should only be a small number of possible values for the function name anyway.
The title says you're taking this from the query string, which suggests you're reacting to a GET by doing different things. GETs should be "look at" operations, that return the state of the thing looked at. This can certainly involve doing quite a bit (classic example is a search that does a lot of complicated comparisons, possibly against a variety of different sources, but is still a "look at" operation). The query string should not select a choice of actions, that should be done by examining the information POSTed to the resource - or better yet POSTed to the resources with completely different URIs for each sort of operation.
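One way to be "more active in parsing requests" while still avoiding a long switch is an explicit table of allowed names. The following is only a minimal sketch: FunctionDispatcher, DoOtherStuff and LastCalled are hypothetical names introduced for illustration, and the handlers stand in for whatever the real methods do.

```csharp
using System;
using System.Collections.Generic;

public static class FunctionDispatcher
{
    // Records the last handler that ran (illustration only).
    public static string LastCalled = "";

    // Only names listed here can ever be invoked, no matter what the client sends.
    private static readonly Dictionary<string, Action> Allowed =
        new Dictionary<string, Action>(StringComparer.Ordinal)
        {
            { "DoStuff", DoStuff },
            { "DoOtherStuff", DoOtherStuff },
        };

    // Returns false for a missing or unknown name instead of reflecting over the class.
    public static bool TryDispatch(string functionName)
    {
        if (functionName == null) return false;
        Action handler;
        if (!Allowed.TryGetValue(functionName, out handler)) return false;
        handler();
        return true;
    }

    private static void DoStuff() { LastCalled = "DoStuff"; }
    private static void DoOtherStuff() { LastCalled = "DoOtherStuff"; }
}
```

In the handler this would be called as FunctionDispatcher.TryDispatch(HttpContext.Current.Request["FunctionName"]). Unlike GetMethod, a name such as "Finalize" or an inherited method can never be reached, because it is simply not in the table.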
Based on the follow-up comments, I would create context-specific handlers rather than one handler to process all generic requests. Otherwise, integrate an MVC framework into the WebForms project and let the MVC framework handle object/method delegation.
To give my users more flexibility and to let them write their own expressions, I want to allow them to write very simple C# statements in a text field that are executed on server side to do some custom calculations.
I am achieving this with Roslyn.
A good example to start for me can be found here.
I let users inject code inside an evaluation function, like this:
string codeToCompile = @"
using System;
using System.Collections.Generic;
namespace Evaluator
{
    public class Evaluator
    {
        public string Eval()
        {
            " + {POTENTIALLY_DANGEROUS_CODE_GOES_HERE} + @"
        }
    }
}";
You can see that the injected code is always inside an Eval-Function and should return a string in the end.
The user can decide how this string is calculated.
I am now thinking of security, because I do not have any control of the injected code.
Actually my users should only be able to:
Use mathematical expressions
Primitive variables
if statements
So an example injected code could look like this:
int a = 5;
int b = 10;
if (a < b)
{
    return "a is smaller";
}
else
{
    return "a is bigger or equal";
}
You can see in the sample code above that the namespaces are limited to "System" and "System.Collections.Generic", so a lot of things won't be possible anymore (like reading something from the file system of the server and outputting this information as a string).
I also remove all occurrences of loops, so keywords like while, for, foreach etc. are stripped from the string.
But I am still pretty unsure whether this solution is secure.
What could a potential attacker still do now? (Especially with the options of the two provided namespaces.)
Is there any best practice I could follow in this case to prevent attacks?
Depending on your needs, this is very hard to get right. Very hard. "If you have to ask how to do it you might be over your head" hard. Some fun things to consider:
Just because you limit the using directives at the top of the file doesn't mean somebody can't fully qualify a type from a different namespace in their code snippet. So what's important is that you walk the entire code to see if there are uses of any other types. I can't tell if your explicit list of things you're allowing implicitly disallows all method calls or object creation.
Be careful with assuming anything in System is safe. Consider System.Activator, which lets you call CreateInstance and pass in the string name of another type to create it. That type alone lets you bypass any other checks you might do. And that was just the first one that jumped out when I pulled up the docs in the System namespace alphabetically!
...and of course don't just block System.Activator specifically. Any time you update which framework people are writing code against, there might be new types that are problematic.
Also consider your types of potential security attacks: even if you can't write files, can you still leak information from your server (like the username or machine name) that might allow the user to break into your system some other way. Or they just write an infinite loop which consumes server resources. You mentioned that you'll remove loops, but don't forget things like goto, or just writing some sort of recursive function that does a stack overflow.
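To make the first points concrete, here is a small sketch of a snippet body that compiles inside the question's wrapper even though the file only has "using System;". The particular values it returns are just examples chosen for illustration; nothing here is specific to the asker's application.

```csharp
using System;

namespace Evaluator
{
    public class Evaluator
    {
        public string Eval()
        {
            // A fully qualified name needs no matching using directive.
            string tempDir = global::System.IO.Path.GetTempPath();

            // The System namespace itself exposes details about the host.
            string machine = Environment.MachineName;

            // System.Activator creates an instance from nothing but a type name string.
            object created = Activator.CreateInstance(Type.GetType("System.Text.StringBuilder"));

            return machine + "|" + tempDir + "|" + created.GetType().FullName;
        }
    }
}
```

None of these lines contains a loop or an extra using directive, so string-level filtering of those two things would let all of them through.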
I'm not going to say "just do X and it's safe", because I don't even trust myself to write that. But:
Use your OS to help you isolate: run this in a separate process with less or no permissions, etc. If you can do a separate VM/container, great. The more you can isolate here the better.
If you're going to do code inspection, don't reject patterns that you know are bad; instead write code that only accepts patterns that you know are "safe". And that might result in a lot of work to opt in silly things, but the alternative requires you to enumerate all that is bad.
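As an illustration of "accept only what you know" rather than "reject what you know is bad", here is a deliberately tiny token-level allow-list. It is a sketch under stated assumptions, not a security boundary: SnippetAllowList and its rules are hypothetical, it only knows the handful of constructs the question lists (primitive variables, arithmetic, if/else, return), and it should be combined with the process isolation described above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SnippetAllowList
{
    private static readonly HashSet<string> TypeKeywords =
        new HashSet<string> { "int", "double", "bool", "string" };

    private static readonly HashSet<string> OtherKeywords =
        new HashSet<string> { "if", "else", "return", "true", "false" };

    // One alternative per accepted token kind. Anything else (for example
    // '.', '[', ':', '$' or '@') matches nothing and makes the snippet invalid.
    private static readonly Regex Token = new Regex(
        @"\G\s*(?:(?<str>""[^""\\\r\n]*"")|(?<num>\d+(?:\.\d+)?)|(?<word>[A-Za-z_][A-Za-z0-9_]*)|(?<punct>[{}();=+\-*/%<>!&|,]))");

    public static bool IsAllowed(string snippet)
    {
        var declared = new HashSet<string>();
        string previousWord = null;
        int pos = 0;
        while (true)
        {
            Match m = Token.Match(snippet, pos);
            if (!m.Success) break;
            pos = m.Index + m.Length;

            if (!m.Groups["word"].Success) { previousWord = null; continue; }

            string word = m.Groups["word"].Value;
            bool isKeyword = TypeKeywords.Contains(word) || OtherKeywords.Contains(word);
            if (!isKeyword)
            {
                // A non-keyword word is accepted only as a variable: either it is
                // being declared right after a type keyword, or it was declared earlier.
                if (previousWord != null && TypeKeywords.Contains(previousWord)) declared.Add(word);
                else if (!declared.Contains(word)) return false;
            }
            previousWord = word;
        }
        // Everything must have been consumed; leftovers mean an unknown token.
        return snippet.Substring(pos).Trim().Length == 0;
    }
}
```

With these rules the example snippet from the question is accepted, while "return System.Environment.MachineName;", "while(true){}" and "goto x;" are rejected, because System, while and goto are neither allowed keywords nor declared variables. The cost is exactly what the answer predicts: every additional construct has to be opted in by hand.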
Maybe you do not need C# code? There are multiple options to integrate a script language. They will be interpreted and slower but that is usually not an issue.
I can recommend https://github.com/sebastienros/jint for JavaScript, for example. (I'm not affiliated with the project.) It has good interoperability with the hosting C# code. It also has built-in safety guards for "resource" attacks like endless loops or excessive memory consumption.
Small word of caution: be careful what kind of .NET interoperability you allow, or you might open up a security risk even with interpreted code.
While going through MVC concepts, I have read that it is not good practice to have code inside a 'GET' action which changes the state of server objects (DB updates etc.).
'Caching of return data' has been given as a reason for this.
Could someone please explain this?
Thanks in advance!
This is by HTTP standard. The GET verb is one that should be idempotent and safe.
9.1.1 Safe Methods
Implementors should be aware that the software represents the user in
their interactions over the Internet, and should be careful to allow
the user to be aware of any actions they might take which may have an
unexpected significance to themselves or others.
In particular, the convention has been established that the GET and
HEAD methods SHOULD NOT have the significance of taking an action
other than retrieval. These methods ought to be considered "safe".
This allows user agents to represent other methods, such as POST, PUT
and DELETE, in a special way, so that the user is made aware of the
fact that a possibly unsafe action is being requested.
Naturally, it is not possible to ensure that the server does not
generate side-effects as a result of performing a GET request; in
fact, some dynamic resources consider that a feature. The important
distinction here is that the user did not request the side-effects, so
therefore cannot be held accountable for them.
http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
Browsers can cache GET requests, generally on static data, like images or scripts. But you can also allow browsers to cache GET requests to controller actions as well, using [OutputCache] or other similar ways, so if caching is turned on for a GET controller action, it's possible that clicking on a link leading to /Home/Index doesn't actually run the Index method on the server, but rather allows the browser to serve up the page from its own cache.
With this line of thinking, you can safely turn on caching on GET actions in which the data you're serving up doesn't change (or doesn't change often), with the knowledge that your server action won't fire every time.
POSTs won't be cached by the browser, so any POST is guaranteed to make it to the server.
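The effect of caching on a state-changing GET can be simulated without any web framework. The sketch below is only a model: FakeServer and CachingClient are hypothetical stand-ins for a controller action and a caching browser, not real ASP.NET types.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for a server whose GET handler (incorrectly) changes state.
public class FakeServer
{
    public int RowsDeleted { get; private set; }

    public string HandleGet(string url)
    {
        RowsDeleted++; // side effect hidden behind a GET
        return "page for " + url;
    }
}

// Stand-in for a browser that is allowed to cache GET responses.
public class CachingClient
{
    private readonly FakeServer _server;
    private readonly Dictionary<string, string> _cache = new Dictionary<string, string>();

    public CachingClient(FakeServer server) { _server = server; }

    public string Get(string url)
    {
        string cached;
        // Cache hit: the response is served locally and the server never runs.
        if (_cache.TryGetValue(url, out cached)) return cached;

        string body = _server.HandleGet(url);
        _cache[url] = body;
        return body;
    }
}
```

If a user "clicks" /Home/Index three times through CachingClient.Get, RowsDeleted ends up at 1, not 3. Whether the side effect happens then depends on the cache rather than on what the user did, which is why state changes belong in POST actions.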
Ignore caching for a moment. Another way of thinking about this is that search engines store HTTP GET links during their indexing/crawling process, so those links will show up in search results.
Suppose your /Home/Index is implemented as a GET but, let's say, deletes a row in your database. Every time this link shows up on a search engine and somebody clicks it, a row is deleted, and soon you have a lot of deleted rows.
The HTTP spec states that GET and HEAD are expected to be safe (and therefore idempotent), i.e. they should not change server state.
One practical aspect of this is that search robots will issue a GET against any link to your site they know of. If such a GET changes user data it was not meant to change, you are in trouble.
Being safe has the added benefit that clients can cache the result of a GET (use HTTP headers to control this).
In my application we have multi-lingual language strings which are stored in custom tables, as the user can edit, delete, import new languages etc... via a UI
Currently, at the beginning of each request, I go off and get all the language strings (from our database) for the currently selected language and stick them in a dictionary.
I then have a Html Helper extension method which I use in the razor views (See below), which fishes in the dictionary I got at the beginning of the request to pull out the correct language based on the key supplied in the helper.
Html.LanguageString("MyLanguage.KeyHere")
Now this works fine. However, as the application gets bigger, we are getting more and more language strings. It's not an issue right now, as it's still very fast with only around 200 strings to get.
But this also means I'm getting all of them, even if a page has, say, one on it. I'd ideally like a way of processing the LanguageString("") calls beforehand and doing a query to get just those that are needed at the beginning of the request. Or maybe my own LINQ-based language that can be processed to produce a more efficient call.
I'm looking for some advice on how to do this, as I'd like the application to be as efficient as possible. Any advice, help or tips are gratefully received. Thanks.
I'd suggest caching language strings on the application basis rather than fetching them for every request. For example, this can be done by maintaining a static dictionary and invalidating the cache only when the user makes changes to these strings. This will make your application more responsive as well as save you from implementing (imho) rather more complex and not necessarily efficient technique of loading this data on-demand.
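A minimal sketch of such an application-level cache might look like the following. LanguageStringCache, Loader and Invalidate are hypothetical names; in the real application the loader would query the custom tables, and Invalidate would be called from the UI code that edits, deletes or imports strings.

```csharp
using System;
using System.Collections.Generic;

public static class LanguageStringCache
{
    private static readonly object Sync = new object();

    // One dictionary of strings per language, shared by all requests.
    private static readonly Dictionary<string, Dictionary<string, string>> ByLanguage =
        new Dictionary<string, Dictionary<string, string>>();

    // Hypothetical loader; replace with the real database query.
    public static Func<string, Dictionary<string, string>> Loader =
        language => new Dictionary<string, string>();

    public static string Get(string language, string key)
    {
        Dictionary<string, string> strings;
        lock (Sync)
        {
            // Hit the database only the first time a language is requested.
            if (!ByLanguage.TryGetValue(language, out strings))
            {
                strings = Loader(language);
                ByLanguage[language] = strings;
            }
        }
        string value;
        // Fall back to the key itself when a translation is missing.
        return strings.TryGetValue(key, out value) ? value : key;
    }

    // Call this whenever the strings for a language are changed through the UI.
    public static void Invalidate(string language)
    {
        lock (Sync) { ByLanguage.Remove(language); }
    }
}
```

The Html.LanguageString helper could then delegate to LanguageStringCache.Get with the currently selected language, so a normal request performs no query at all. The loaded dictionaries are treated as read-only after loading; that is an assumption of this sketch.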
As a side note I'd add the following: it's usually a good practice to address these kinds of problems when they arise (rather than fixing something that is not broken) and focus on more important things. I totally agree that performance implications of a given solution must always be taken into consideration, I'm just saying that premature optimizations are not always a good idea.
I have the following question:
What is the preferred way to use the status in code: an enum or a singleton?
The status values are stored in a DB together with their IDs. If the statuses change in the DB, the code would also need some changes.
Does anyone know which is preferred, based on conventions?
I've been looking on the internet but couldn't find a clear answer.
It depends in part on whether the ids for your statuses have guaranteed values, or whether the ids could change per-database (via an IDENTITY). Personally, for statuses I prefer fixed - which gives you the most flexibility, and least overhead - you can choose to use an enum (or maybe some consts if more convenient), and you never have to add an indirection, i.e. "get the id that is open".
This isn't always possible, though, and when it isn't it is still definitely useful to cache and re-use them (to avoid hitting the DB for that lookup). However, I would avoid a singleton, not least because it won't play nicely if you ever need to talk to more than one database - the ids in each could well be different. However, any suitable cache implementation (or maybe IoC/DI) should allow you to store the appropriate data (probably some kind of dictionary). Singletons are also just a bit of a pain generally if you like testing etc.
But: an enum and fixed id values is a lot simpler.
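A sketch of the "enum with fixed id values" option could look like this. The numeric values are assumptions made for illustration (they would have to match the ids actually stored in the status table), and StatusMapper is a hypothetical helper.

```csharp
using System;

// Explicit values so the enum cannot drift from the ids stored in the database.
public enum Status
{
    Open = 1,
    Pending = 2,
    Closed = 3,
    Deferred = 4
}

public static class StatusMapper
{
    // Converts a database id to the enum, rejecting ids the code does not know.
    public static Status FromDatabaseId(int id)
    {
        if (!Enum.IsDefined(typeof(Status), id))
            throw new ArgumentOutOfRangeException("id", id, "Unknown status id");
        return (Status)id;
    }

    public static int ToDatabaseId(Status status)
    {
        return (int)status;
    }
}
```

Because the ids are fixed, code can compare against Status.Open directly and no lookup such as "get the id that is open" is needed.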
Note that under any implementation, changing the status list is a non-trivial operation, not least it will be a big UPDATE (or several if you are denormalized).
If you intend to use the status across the application and it is standardised throughout, then an enum is the best fit:
enum Status
{ Open, Pending, Closed, Deferred }
This also makes the code more readable.
What is the simplest way to identify if a given method is reading or writing a member variable or property?
I am writing a tool to assist in an RPC system, in which access to remote objects is expensive. Being able to detect if a given object is not used in a method could allow us to avoid serializing its state. Doing it on source code is perfectly reasonable (but being able to do it on compiled code would be amazing)
I think I can either write my own simple parser, or try to use one of the existing C# parsers and work with the AST. I am not sure if it is possible to do this with assemblies using Reflection. Are there any other ways? What would be the simplest?
EDIT: Thanks for all the quick replies. Let me give some more information to make the question clearer. I definitely prefer correct, but it definitely shouldn't be extremely complex. What I mean is that we can't go too far checking for extremes or impossibles (as the passed-in delegates that were mentioned, which is a great point). It would be enough to detect those cases and assume everything could be used and not optimize there. I would assume that those cases would be relatively uncommon.
The idea is for this tool to be handed to developers outside of our team, that should not be concerned about this optimization. The tool takes their code and generates proxies for our own RPC protocol. (we are using protobuf-net for serialization only, but no wcf nor .net remoting). For this reason, anything we use has to be free or we wouldn't be able to deploy the tool for licensing issues.
You can have simple or you can have correct - which do you prefer?
The simplest way would be to parse the class and the method body. Then identify the set of tokens which are properties and field names of the class. The subset of those tokens which appears in the method body are the properties and field names you care about.
This trivial analysis of course is not correct. If you had
class C
{
    int Length;
    void M() { int x = "".Length; }
}
Then you would incorrectly conclude that M references C.Length. That's a false positive.
The correct way to do it is to write a full C# compiler, and use the output of its semantic analyzer to answer your question. That's how the IDE implements features like "go to definition".
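The simple (and knowingly incorrect) token-based analysis described above can be sketched in a few lines. NaiveMemberUsage is a hypothetical name, and the regular expression is a crude stand-in for a real lexer: it also picks up words inside comments and string literals.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class NaiveMemberUsage
{
    // Returns the member names that occur as identifier-like tokens in the method body.
    public static ISet<string> MembersMentioned(IEnumerable<string> memberNames, string methodBody)
    {
        var tokens = new HashSet<string>(
            Regex.Matches(methodBody, @"[A-Za-z_][A-Za-z0-9_]*")
                 .Cast<Match>()
                 .Select(m => m.Value));

        // Keep only the tokens that are also member names of the class.
        tokens.IntersectWith(memberNames);
        return tokens;
    }
}
```

Calling MembersMentioned(new[] { "Length" }, "int x = \"\".Length;") reports Length, which reproduces the false positive for class C: the token matches even though the member being accessed belongs to string.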
Before attempting to write this kind of logic yourself, I would check to see if you can leverage NDepend to meet your needs.
NDepend is a code dependency analysis tool ... and much more. It implements a sophisticated analyzer for examining relationships between code constructs and should be able to answer that question. It also operates on both source and IL, if I'm not mistaken.
NDepend exposes CQL - Code Query Language - which allows you to write SQL-like queries against the relationships between structures in your code. NDepend has some support for scripting and is capable of being integrated with your build process.
To complete LBushkin's answer on NDepend (disclaimer: I am one of the developers of this tool), NDepend can indeed help you with that. The Code LINQ Query (CQLinq) below matches methods that...
shouldn't provoke any RPC calls but
that are reading/writing any fields of any RPC types,
or that are reading/writing any properties of any RPC types.
Notice how we first define the four sets typesRPC, fieldsRPC, propertiesRPC and methodsThatShouldntUseRPC, and then match the methods that violate the rule. Of course this CQLinq rule needs to be adapted to match your own typesRPC and methodsThatShouldntUseRPC:
warnif count > 0
// First define the types whose calls are RPC
let typesRPC = Types.WithNameIn("MyRpcClass1", "MyRpcClass2")
// Define instance fields of RPC types
let fieldsRPC = typesRPC.ChildFields()
    .Where(f => !f.IsStatic).ToHashSet()
// Define instance property getters and setters of RPC types
let propertiesRPC = typesRPC.ChildMethods()
    .Where(m => !m.IsStatic && (m.IsPropertyGetter || m.IsPropertySetter))
    .ToHashSet()
// Define methods that shouldn't provoke RPC calls
let methodsThatShouldntUseRPC =
    Application.Methods.Where(m => m.NameLike("XYZ"))
// Filter methods that shouldn't do any RPC call
// but that are using any RPC fields (reading or writing) or properties
from m in methodsThatShouldntUseRPC.UsingAny(fieldsRPC).Union(
    methodsThatShouldntUseRPC.UsingAny(propertiesRPC))
let fieldsRPCUsed = m.FieldsUsed.Intersect(fieldsRPC)
let propertiesRPCUsed = m.MethodsCalled.Intersect(propertiesRPC)
select new { m, fieldsRPCUsed, propertiesRPCUsed }
My intuition is that detecting which member variables will be accessed is the wrong approach. My first guess at a way to do this would be to just request serialized objects on an as-needed basis (preferably at the beginning of whatever function needs them, not piecemeal). Note that TCP/IP (i.e. Nagle's algorithm) should stuff these requests together if they are made in rapid succession and are small
Eric has it right: to do this well, you need what amounts to a compiler front end. What he didn't emphasize enough is the need for strong flow analysis capabilities (or a willingness to accept very conservative answers possibly alleviated by user annotations). Maybe he meant that in the phrase "semantic analysis" although his example of "goto definition" just needs a symbol table, not flow analysis.
A plain C# parser could only be used to get very conservative answers (e.g., if method A in class C contains identifier X, assume it reads class member X; if A contains no calls then you know it can't read member X).
The first step beyond this is having a compiler's symbol table and type information (if method A refers to class member X directly, then assume it reads member X; if A contains no calls and mentions identifier X only in the context of accesses to objects which are not of this class type then you know it can't read member X). You have to worry about qualified references, too; Q.X may read member X if Q is compatible with C.
The sticking point is calls, which can hide arbitrary actions. An analysis based on just parsing and symbol tables could determine that if there are calls, the arguments refer only to constants or to objects which are not of the class which A might represent (possibly inherited).
If you find an argument that has a C-compatible class type, now you have to determine whether that argument can be bound to this, requiring control and data flow analysis:
method A( ) { Object q=this;
...
...q=that;...
...
foo(q);
}
foo might hide an access to X. So you need two things: flow analysis to determine whether the initial assignment to q can reach the call foo (it might not; q=that may dominate all calls to foo), and call graph analysis to determine what methods foo might actually invoke, so that you can analyze those for accesses to member X.
You can decide how far you want to go with this, simply making the conservative assumption "A reads X" anytime you don't have enough information to prove otherwise. This will give you a "safe" answer (if not "correct", or what I'd prefer to call "precise").
Of frameworks that might be helpful, you might consider Mono, which surely parses and builds symbol tables. I don't know what support it provides for flow analysis or call graph extraction; I would not expect the Mono-to-IL front-end compiler to do a lot of that, as people usually hide that machinery in the JIT part of JIT-based systems. A downside is that Mono may be behind the "modern C#" curve; last time I heard, it handled only C# 2.0 but my information may be stale.
An alternative is our DMS Software Reengineering Toolkit and its C# Front End.
(Not an open source product).
DMS provides general source code parsing, tree building/inspection/analysis, general symbol table support and built-in machinery for implementing control-flow analysis, data flow analysis, points-to analysis (needed for "What does object O point to?"), and call graph construction. This machinery has all been tested by fire with DMS's Java and C front ends, and the symbol table support has been used to implement full C++ name and type resolution, so it's pretty effective. (You don't want to underestimate the work it takes to build all that machinery; we've been working on DMS since 1995.)
The C# Front End provides for full C# 4.0 parsing and full tree building. It presently does not build symbol tables for C# (we're working on this) and that's a shortcoming compared to Mono. With such a symbol table, however, you would have access to all that flow analysis machinery (which has been tested with DMS's Java and C front ends) and that might be a big step up from Mono if it doesn't provide that.
If you want to do this well, you have a considerable amount of work in front of you. If you want to stick with "simple", you'll have to do with just parsing the tree and being OK with being very conservative.
You didn't say much about knowing if a method wrote to a member. If you are going to minimize traffic the way you describe, you want to distinguish "read", "write" and "update" cases and optimize messages in both directions. The analysis is obviously pretty similar for the various cases.
Finally, you might consider processing MSIL directly to get the information you need; you'll still have the flow analysis and conservative analysis issues. You might find the following technical paper interesting; it describes a fully-distributed Java object system that has to do the same basic analysis you want to do, and does so, IIRC, by analyzing class files and doing massive byte code rewriting.
Java Orchestra System
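For the "process MSIL directly" route, plain Reflection is enough to list the field instructions in a single method body. The sketch below is an assumption-laden illustration, not a complete analysis: FieldAccessScanner and Sample are hypothetical names, it looks only at direct field opcodes (property access appears as calls to get_/set_ methods), it does not follow calls, it assumes a little-endian host, and ldflda (taking a field's address) is reported as a read even though the address could be written through.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Reflection.Emit;

public static class FieldAccessScanner
{
    private static readonly OpCode[] OneByte = new OpCode[256];
    private static readonly OpCode[] TwoByte = new OpCode[256];

    static FieldAccessScanner()
    {
        // Build opcode lookup tables from the definitions in System.Reflection.Emit.OpCodes.
        foreach (FieldInfo f in typeof(OpCodes).GetFields(BindingFlags.Public | BindingFlags.Static))
        {
            if (f.FieldType != typeof(OpCode)) continue;
            var op = (OpCode)f.GetValue(null);
            if (op.Size == 1) OneByte[op.Value & 0xFF] = op;
            else TwoByte[op.Value & 0xFF] = op;
        }
    }

    // Returns one (field, isWrite) pair per field instruction found in the method body.
    public static List<KeyValuePair<FieldInfo, bool>> Scan(MethodBase method)
    {
        var result = new List<KeyValuePair<FieldInfo, bool>>();
        MethodBody body = method.GetMethodBody();
        if (body == null) return result; // abstract or extern: nothing to inspect

        byte[] il = body.GetILAsByteArray();
        Type[] typeArgs = method.DeclaringType != null && method.DeclaringType.IsGenericType
            ? method.DeclaringType.GetGenericArguments() : null;
        Type[] methodArgs = method.IsGenericMethod ? method.GetGenericArguments() : null;

        int pos = 0;
        while (pos < il.Length)
        {
            byte b = il[pos++];
            OpCode op = b == 0xFE ? TwoByte[il[pos++]] : OneByte[b]; // 0xFE prefixes two-byte opcodes
            if (op.OperandType == OperandType.InlineField)
            {
                int token = BitConverter.ToInt32(il, pos);
                FieldInfo field = method.Module.ResolveField(token, typeArgs, methodArgs);
                bool isWrite = op == OpCodes.Stfld || op == OpCodes.Stsfld;
                result.Add(new KeyValuePair<FieldInfo, bool>(field, isWrite));
            }
            pos += OperandSize(op, il, pos);
        }
        return result;
    }

    private static int OperandSize(OpCode op, byte[] il, int pos)
    {
        switch (op.OperandType)
        {
            case OperandType.InlineNone: return 0;
            case OperandType.ShortInlineBrTarget:
            case OperandType.ShortInlineI:
            case OperandType.ShortInlineVar: return 1;
            case OperandType.InlineVar: return 2;
            case OperandType.InlineI8:
            case OperandType.InlineR: return 8;
            case OperandType.InlineSwitch: return 4 + 4 * BitConverter.ToInt32(il, pos);
            default: return 4; // metadata tokens, 32-bit immediates, long branch targets, float32
        }
    }
}

// Hypothetical sample type used only to demonstrate the scanner.
public class Sample
{
    public int Length;
    public int ReadsField() { return Length; }
    public int UsesStringLength() { return "".Length; }
    public void WritesField() { Length = 1; }
}
```

Scanning Sample.UsesStringLength finds no field access, so the false positive of the token-based approach does not occur here, because the IL refers to members by metadata token rather than by name. The conservative fallback still applies: any method containing calls would have to be treated as "may touch everything" unless the callees are scanned as well.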
By RPC do you mean .NET Remoting? Or DCOM? Or WCF?
All of these offer the opportunity to monitor cross process communication and serialization via sinks and other constructs, but they are all platform specific, so you'll need to specify the platform...
You could listen for the event that a property is being read/written to with an interface similar to INotifyPropertyChanged (although you obviously won't know which method effected the read/write.)
I think the best you can do is explicitly maintain a dirty flag.
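An explicitly maintained dirty flag can be as small as the following sketch. TrackedPoint and MarkClean are hypothetical names; the idea is only that each setter records the change so the RPC layer can skip serializing objects that were never modified.

```csharp
public class TrackedPoint
{
    private int _x;

    // True once any tracked property has changed since the last MarkClean call.
    public bool IsDirty { get; private set; }

    public int X
    {
        get { return _x; }
        set
        {
            if (_x == value) return; // unchanged values do not dirty the object
            _x = value;
            IsDirty = true;
        }
    }

    // Call after the object's state has been sent or saved.
    public void MarkClean()
    {
        IsDirty = false;
    }
}
```

This only tracks writes; detecting reads the same way would need a similar flag set in each getter.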