High-level question here:
I have spent a lot of time today educating myself on basic high-level concepts such as APIs, static and dynamic libraries, DLLs and marshaling in C#. Gaining all of this knowledge led me to what seems like a pretty basic question, and probably demonstrates a hole in my understanding of these concepts:
What I know:
DLLs may contain classes, which in turn contain various class members such as methods and fields, several of which I might want to use in my program
In C# we use the keyword "using" at the top of the code, to define a namespace we
want to include in our program
What I do not get:
I was under the impression that the actual methods were defined in the DLLs. How does my program find the actual functions that are defined in the DLLs, when all I give it is a namespace? It seems more intuitive to me to have "using XYZ.dll" at the top, rather than "using XYZ_namespace".
Thanks a lot for helping me fill in the gaps here.
EDIT: Modified post to be specific to C#.
EDIT 2: For other people that wonder how their C# application actually gets a hold of the types made available through "using namespaceX", this is a good resource (in addition to the helpful posts below): http://broadcast.oreilly.com/2010/07/understanding-c-namespaces-and.html.
Basically the type you would like to use resides in libraries and you have to set Visual Studio to reference these libraries in order to make it possible to "use" its namespace in your code.
DLLs contain many routines / methods we might want to use in our
programs
Partially correct. .Net DLLs contain Classes, and these classes contain Members (Fields, Constants, Methods, Properties, Events, Operators, Indexers).
.Net is strictly OOP, and it does not allow code "floating in limbo". Everything is defined inside classes.
Classes are organized in Namespaces just to keep a naming separation and organization. Think of namespaces as "folders" that contain one or more classes, and that might be defined in one or more assemblies (DLLs).
For example, Classes inside the System namespace are defined in 2 assemblies (DLLs): mscorlib.dll and System.dll.
At the same time, these 2 assemblies contain many different namespaces, so you can think of the Assembly-to-Namespace relation as Many-to-Many.
When you put a using directive at the beginning of a C# code file, you're telling the compiler "I want to use classes defined in this Namespace, no matter what assembly they come from". You will be able to use all classes defined in such namespace, inside all assemblies Referenced from within the current project.
In C#, DLLs (aka assemblies) contain classes (and other types). These types typically have long full names, like System.Collections.Generic.List<T>. These types can contain methods.
In your References area, you have references to assemblies (this is part of your .csproj file). In a .cs file, you don't need to include any using to reference this DLL, because it's already referenced in your .csproj file.
If you include a line like using System.Collections.Generic;, that tells the C# compiler to look for System.Collections.Generic.List<T> when you type List<T>. You don't need to do it that way, however: you can simply type System.Collections.Generic.List<T>.
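As a minimal sketch of that equivalence (the class and method names here are made up for illustration), both methods below create the same type; the only difference is how much of the name has to be typed:

```csharp
using System.Collections.Generic;

// Hypothetical demo class; only List<T> comes from the framework.
public static class UsingDemo
{
    // The using directive above lets the compiler resolve the short name
    // List<int> to System.Collections.Generic.List<int>.
    public static List<int> ShortName()
    {
        return new List<int> { 1, 2, 3 };
    }

    // The same type written with its full name. This line would compile
    // even if the using directive were removed.
    public static System.Collections.Generic.List<int> FullName()
    {
        return new System.Collections.Generic.List<int> { 1, 2, 3 };
    }

    // Both methods return instances of exactly the same runtime type.
    public static bool SameType()
    {
        return ShortName().GetType() == FullName().GetType();
    }
}
```

In both cases the assembly that defines List<T> still has to be referenced by the project; the directive only affects name lookup.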
I was under the impression that the actual methods were defined in the
DLLs. How does my program find the actual functions that are defined
in the DLLs, when all i give them is a namespace?
The process of finding the correct code occurs through static or dynamic binding and also assembly binding. When you compile the code static binding will tell you if you wrote bad code or forgot to add a reference:
// This will generally bind statically and cause a compiler error
// if you forgot to add a reference to DifferentAssembly.
ClassInADifferentAssembly.M();
Unless you are dealing with dynamic or reflection, you have static binding. Assembly binding is a different process. The overall process is complex, but basically assemblies are discovered in the GAC or in the application's own location, or you can even handle an event yourself, AppDomain.AssemblyResolve, to supply an assembly that could not be found.
So when you add a using statement then static binding can successfully find the correct code in the context. However, you can still receive a runtime error if later the assembly fails to bind at runtime.
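To illustrate the "handle an event yourself" option, here is a rough sketch of a handler for AppDomain.AssemblyResolve, the event the runtime raises when it cannot locate an assembly through normal probing. The PluginResolver class and the "plugins" folder are assumptions made up for this example, not something from the answers above:

```csharp
using System;
using System.IO;
using System.Reflection;

// Hypothetical helper: looks for missing assemblies in a "plugins" subfolder.
public static class PluginResolver
{
    // Assumed location; adjust to wherever the extra DLLs actually live.
    public static string ProbeDirectory =
        Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "plugins");

    // Call once at startup, before any type from the missing assembly is used.
    public static void Register()
    {
        AppDomain.CurrentDomain.AssemblyResolve += Resolve;
    }

    // Called by the runtime only after its normal assembly binding has failed.
    public static Assembly Resolve(object sender, ResolveEventArgs args)
    {
        // args.Name is the full display name, e.g. "Foo, Version=1.0.0.0, ...".
        string simpleName = new AssemblyName(args.Name).Name;
        string candidate = Path.Combine(ProbeDirectory, simpleName + ".dll");

        // Returning null tells the runtime this handler could not help.
        return File.Exists(candidate) ? Assembly.LoadFrom(candidate) : null;
    }
}
```

If no handler supplies the assembly, the failure surfaces at run time (typically as a FileNotFoundException), even though the code compiled cleanly.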
DLL is short for dynamic link library. It can be a class library containing classes, methods, etc., which can all be put under different namespaces.
So first you have to add a reference to the DLL into your project. When that is done, you then use a keyword such as "using" to basically shorten the path to reach the methods/classes in that particular namespace.
Example namespaces
Namespace.Something.SomethingMore.Finally.Just.One.More
Namespace.Something.SomethingMore.Finally.Just.One.More2
To reach classes under those namespaces you can do either of the following
using Namespace.Something.SomethingMore.Finally.Just.One.More;
using Namespace.Something.SomethingMore.Finally.Just.One.More2;
// Now you can access classes under those namespaces without typing the whole namespace
// Like in the row below
Class.GetData();
If you did not have the usings, you would still be able to access those classes, but you would then have to type
Namespace.Something.SomethingMore.Finally.Just.One.More.Class.GetData();
Namespace.Something.SomethingMore.Finally.Just.One.More2.AnotherClass.GetData();
DLLs have a collection of functions.
You can call these functions in one of 2 ways:
link with the DLL's export library (a .lib file), or do the link at run time:
Call LoadLibrary()
Call GetProcAddress and provide the name of the function you want. You'll need to cast it to the actual type (function pointer).
Call the function via the new function pointer.
Pretty simple stuff, just read it on MSDN.
C++ namespaces are just a part of the function name.
You can view what functions are exported from a DLL by using a tool called Dependency Walker.
I am currently working with a piece of software known as Kofax TotalAgility or KTA for short.
This is Business Process Automation Software, which I have the "pleasure" of expanding with custom .net libraries.
I have been creating a MS Graph library to perform actions with the MS Graph API. The API works great and I am quite pleased with how it turned out.
However, due to the way KTA accesses methods in classes, I have used "data classes" (I don't know if that is the right term) as input parameters for my methods. To be clear, these classes have no functionality other than storing data for methods to use. The reason I am doing this is the way things are structured in the KTA class inspector (I am assuming that KTA uses the IL code from my library to create a list of classes and methods).
This is what I am expecting the user is shown when they are using my methods. As you can see by using classes as input parameters I get this nice hierarchical structure.
Using classes as input parameters causes another issue: my "data classes" are shown in the list of classes, which produces a lot of unnecessary clutter.
Is there a way to hide these classes from the inspector? I get that it might be an internal KTA issue, which of course would mean I am not asking in the right place, and it is an internal Kofax issue.
However if there is some C# or .NET way of doing this, that would be preferable.
There are a number of different terms for the data/parameter classes that you mention, such as DTO (data transfer objects), POCO (plain old C# objects), or the one that you can see in the KTA product dlls: model classes.
There is not a direct way to hide public classes from KTA. However, when you use the KTA API via the TotalAgility.Sdk.dll, you notice that you don’t see all of the parameter classes mixed in with the list of the classes that hold the SDK functions. The reason is just that these objects are in a separate referenced assembly: Agility.Sdk.Model.dll. When you are configuring a .NET activity/action in KTA, it will only list the classes directly in the assembly that you specify, not referenced assemblies.
If you are using local assembly references in KTA, then this should work because you can just have your referenced assembly in the same folder as your main dll. However, if you are ILMerging into a single dll so that you can add it to the .NET assembly store, then this approach won’t work.
When ILMerged together, the best you can do is to have your parameter classes grouped in a namespace that helps make it clear. What I do is have a main project with just one class that acts as a wrapper for any functions I want to expose. Then use ILMerge with the internalize option, which changes visibility to internal for any types not in the primary assembly. To allow the model classes to still be public, I keep them in a specific namespace and add that namespace to the exclude list for the internalize command. See Internalizing Assemblies with ILMerge for more detail.
Keep in mind that anyone seeing this list is configuring a function call with your dll. Even if they are not a skilled developer, they should at least have some competence for this type of task (hopefully). So even if the list shows a bunch of model classes, it shouldn’t be too hard to follow instructions if you tell them which class is to be used.
I've been looking everywhere for a possible solution to this but can't seem to find an answer. My issue is that I have a few classes that need to be completely hidden from Assembly.GetTypes, as I'm writing a plugin for an application, and it's picking up types that I need to remain hidden (this happens even if they are declared as private or internal classes).
Does anyone know how to either alter what Assembly.GetTypes returns, or of an attribute that will keep those types from being listed?
This is quite a hack and is very fragile, but could work.
Create 2 assemblies -- one for the plug-in and the second for the other types. The second would be placed in another known directory and loaded dynamically into the first when needed. (For example, via Assembly.LoadFrom.)
The first assembly would then be placed in the plug-in directory and only ever publish its own types. This is very fragile because you would likely have to hard-code a path to the second assembly, and you run the risk of the file getting deleted or moved.
EDIT
@SLaks' comment takes away the fragility of this solution. If you embed the second assembly as a resource and load it at run-time, the app calling Assembly.GetTypes won't see the types you want hidden.
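A rough sketch of the embedded-resource idea, assuming the second assembly has been added to the plug-in project as an embedded resource. The EmbeddedLoader name and the resource name in the usage note are hypothetical:

```csharp
using System.IO;
using System.Reflection;

// Hypothetical helper living in the plug-in (first) assembly.
public static class EmbeddedLoader
{
    // Loads an assembly stored as an embedded resource of this assembly.
    // Returns null when no resource with that name exists.
    public static Assembly LoadEmbedded(string resourceName)
    {
        Assembly host = typeof(EmbeddedLoader).Assembly;
        using (Stream stream = host.GetManifestResourceStream(resourceName))
        {
            if (stream == null)
            {
                return null;
            }

            using (var buffer = new MemoryStream())
            {
                stream.CopyTo(buffer);
                // Loading from raw bytes yields a separate Assembly object,
                // so its types are not part of the plug-in's own GetTypes().
                return Assembly.Load(buffer.ToArray());
            }
        }
    }
}
```

Usage would look something like EmbeddedLoader.LoadEmbedded("MyPlugin.Hidden.dll"), where the exact resource name depends on the project's default namespace and folder layout.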
This is not possible.
Sorry.
Code that calls Assembly.GetTypes() should typically filter for only public types.
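For reference, this is roughly what that filtering looks like on the host side. The type names are invented for the example; GetTypes() and GetExportedTypes() are the real reflection calls:

```csharp
using System.Linq;
using System.Reflection;

// Two sample types in the same assembly: one public, one internal.
public class VisiblePlugin { }
internal class HiddenHelper { }

public static class TypeListing
{
    // GetTypes() returns every type defined in the assembly,
    // including internal and private nested ones.
    public static string[] AllTypeNames(Assembly assembly)
    {
        return assembly.GetTypes().Select(t => t.Name).ToArray();
    }

    // GetExportedTypes() returns only the types visible outside the assembly.
    public static string[] PublicTypeNames(Assembly assembly)
    {
        return assembly.GetExportedTypes().Select(t => t.Name).ToArray();
    }
}
```

A plug-in author cannot force the host to use the second form, which is why the answers above say the types cannot be hidden from the plug-in side.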
Welcome to managed code. Complete type information is necessary to .NET's type verifier. Only native code can be hidden from .NET metadata, and then you give up the portability and permissions supported by pure MSIL.
In C#, is it possible to restrict who can call a method at compile time?
I've looked into directives, but that didn't work since I can't assign values to symbols.
#define WHO VisualStudioUser.Current // does not work
I also looked into Code Access Security (CAS) but that's runtime enforcement, not compile time.
The requirement is to restrict access to a method at compile time for specific developers given the method exists in a pre-compiled assembly.
here's more details...
I'm building a framework, or a series of assemblies, for a team of developers. Because of our software license restrictions, I can only allow a few developers to write code that calls some restricted methods. The developers will not have access to the source code of the framework but they'll have access to the compiled framework assemblies.
The quick answer will be: No this isn't possible, and if you need to do it, you're Doing It Wrong.
How would this even work? Does it depend on who's running the code or who wrote it?
Edit: There's kind of a way, using InternalsVisibleTo and restricting access in source control to the assemblies that InternalsVisibleTo is specified for. See Jordão's answer.
The requirement is to restrict access to a method at compile time for specific developers given the method exists in a pre-compiled assembly.
One way is to mark the method private or internal, it won't be callable by anyone outside the assembly. UPDATE: Also take a look at the InternalsVisibleTo attribute, which is used to define which assemblies can "see" internals of your assembly.
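A small sketch of the InternalsVisibleTo approach. The Framework class and the friend assembly name "Licensed.Tools" are placeholders invented for this example:

```csharp
using System.Runtime.CompilerServices;

// Grants the (hypothetical) assembly "Licensed.Tools" access to internal members.
// For a strong-named framework, the attribute must also include the friend's public key.
[assembly: InternalsVisibleTo("Licensed.Tools")]

public static class Framework
{
    // Callable from any assembly that references the framework.
    public static string Open()
    {
        return "available to everyone";
    }

    // Callable only from this assembly and from Licensed.Tools;
    // other callers get a compile-time accessibility error.
    internal static string Restricted()
    {
        return "available to friend assemblies only";
    }
}
```

Note that this restricts assemblies, not people: the compile-time restriction only holds as long as the restricted developers cannot build an assembly with the friend's identity.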
Another way is to divide the code you want to distribute from the code you don't want people to call into separate assemblies. Maybe you just share an assembly consisting mostly of interfaces with your users, which they then compile against; and you have a separate assembly with implementations that they shouldn't reference directly. Your internal team would have access to the implementation assembly. This is just a common form of dependency management, the dependency inversion principle.
Draft:
Compile the restricted code into (obfuscated) DLLs: TypeA.dll, TypeB.dll etc.
Define an interface for each type, and compile them into separate DLLs: ITypeA.dll, ITypeB.dll etc.
Create a "guard assembly", and embed all restricted assemblies into it: Guard.dll. This has a ResolveEventHandler, and methods to instantiate different types defined in the embedded restricted DLLs. Instances are returned through their interface.
Developers get the interface DLLs and the Guard.dll. Each developer can get a Guard.dll with special authentication tokens in it. For example, a Guard.dll can be bound to a PC, an IP address, a GUID issued to the developer, anything.
The developer can instantiate those types for which she has the proper authentication code, and uses the object instance through an interface.
Sorry this is a bit fuzzy, because it was more than a year ago when I used these techniques. I hope the main idea is clear.
You could try Extensible C#, developed by ResolveCorp. Some links for study and implementation:
http://zef.me/782/extensible-c
http://www.codeproject.com/KB/architecture/DbCwithXCSharp.aspx
http://weblogs.asp.net/nunitaddin/archive/2003/02/14/2412.aspx
http://www.devx.com/dotnet/Article/11579/0/page/5
Question in the title.
I'd like to avoid recompiling since the source code I'm modifying is third party and I'd like to use the original binaries where possible, and replace only the assembly which contains the class I modified. But I'm not sure if this is a safe thing to do. In C++ for example this is definitely a bad idea.
No.
The assemblies that reference your library refer to methods and types using (some form of) name, so as long as you don't change the names of public types and methods (used by other assemblies), you don't need to recompile any of the assemblies - they will work with the updated version of the library.
In most cases Tomas's answer is correct, but there are some cases where it is not true:
When using strong naming (signing), changing a single character leads to a new signature, thus leading to a new strong name.
When the Specific Version property of the assembly reference in your project is set to true and the version number is changed, manually or automatically, in AssemblyInfo.cs.
No. All other assemblies will automatically work with the newly updated library.
I just finished watching an episode of Bob Martin at NDC where he said "using" directives in C# at the top of a page are bad because of the tight coupling they create/imply between components.
What way are there to use external .dlls without adding a project reference and a using statement?
I remember VB6 used to let you create an object from the string of its ProgId. I'm not sure that's the technique I'm looking for, but it's an example of a language that didn't need a project reference to use a dll.
EDIT: Here is a link to the conference. Sorry I don't have the exact quote or minute in the presentation, I'm going by memory.
I believe Bob Martin is actually referring to early versus late binding.
In .NET late binding is possible through reflection and more specifically the Activator class that allows creation of a type in an external assembly using a filename or assembly name.
Normally, using directives (not the using statement) go hand in hand with directly referencing an external assembly. ie. You add a reference to an assembly and then add using directives to avoid needing to type the full namespace hierarchy when you use the external types.
So if you find your code has a large number of using directives at the top, it is possible that you are referencing many other types directly and so increasing the coupling/dependency of your code on these types.
I would guess this is why Bob is referring to them as bad. The answer to the question "is this actually bad?" is a very subjective and context dependent one.
In general though, de-coupling of components is almost always a good goal to aim for in designing software. This is because it allows you to change parts of your system with minimal impact on the rest of the system. Having read one or two of Bob Martin's books, I would expect this is what he is getting at.
It's not the using directive itself that is bad - it's if you get too many of them.
A directive such as using System; is rarely a problem in itself, but if you have lots (I'd say more than 3-6, depending on which ones) in the same code file, it could be an indication of tight coupling.
You could just as well apply a similar rule of thumb to the number of references in a project itself.
The solution to tight coupling is programming to interfaces and Dependency Injection (DI).
The ProgId way of doing things that you can remember from VB was simply COM in action. In essence, you used that ProgId to get a reference to an instance that implemented the desired interface. The downside was that this only worked when the COM object was universally registered. Remember dll hell?
You can still apply the same principle using certain flavors of DI, only that now the interface is a .NET type and not defined in IDL, and you need some sort of DI Container to supply the concrete implementation.
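As a minimal illustration of programming to an interface with constructor injection (all type names are invented for this sketch, and no DI container is involved; the caller plays that role by hand):

```csharp
// The abstraction that consumers depend on.
public interface IMessageSender
{
    string Send(string text);
}

// One concrete implementation; it could live in a different assembly.
public sealed class ConsoleSender : IMessageSender
{
    public string Send(string text)
    {
        return "console: " + text;
    }
}

// Notifier knows only the interface, never the concrete class.
public sealed class Notifier
{
    private readonly IMessageSender _sender;

    // The dependency is handed in from outside (constructor injection).
    public Notifier(IMessageSender sender)
    {
        _sender = sender;
    }

    public string Notify(string text)
    {
        return _sender.Send(text);
    }
}
```

With this shape, new Notifier(new ConsoleSender()) is written once at the application's entry point, and swapping the implementation does not touch Notifier at all.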
using directives are just shortcuts to namespaces; they are not references to external files. Therefore, blaming them for coupling does not really make sense.
Anyways, what one can do is have an interface DLL (a DLL with only interfaces), so that you dynamically load and use different assemblies and create types (through reflection) which you can cast to the well-known interfaces. This is the proper way of loosening external references while keeping the benefits of the strongly typed language and early binding.
Have a look at the Assembly and AppDomain classes to load assemblies, and Activator to create type instances by name.
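A sketch of that pattern, under the assumption that IGreeter lives in a shared interface assembly and the implementation in another. For the example to stand alone, both are declared in one file here, and all names are hypothetical:

```csharp
using System;
using System.Reflection;

// Would normally live in the shared "interface DLL".
public interface IGreeter
{
    string Greet(string name);
}

// Would normally live in a separately loaded implementation assembly.
public class EnglishGreeter : IGreeter
{
    public string Greet(string name)
    {
        return "Hello, " + name;
    }
}

public static class PluginFactory
{
    // Creates an instance by type name and hands it back through the interface,
    // so callers keep compile-time checking on everything they call.
    public static IGreeter Create(Assembly assembly, string typeName)
    {
        // throwOnError: true gives a clear exception if the type is missing.
        Type type = assembly.GetType(typeName, throwOnError: true);
        return (IGreeter)Activator.CreateInstance(type);
    }
}
```

In a real setup the Assembly argument would come from something like Assembly.LoadFrom with the path of the implementation DLL, and only the interface assembly would be referenced at compile time.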
You could use reflection:
using System;
using System.Reflection;

// Load the assembly
Assembly assembly = Assembly.LoadFrom(@"c:\path\Tools.dll");
// Select a type
Type type = assembly.GetType("Tools.Utility");
// Invoke a public static method on this type
type.InvokeMember("SomeMethod", BindingFlags.InvokeMethod | BindingFlags.Public | BindingFlags.Static, null, null, new object[0]);
You can do what you are referring to through reflection. You can load the assembly at runtime, and reflect through it to get the classes etc. and call them dynamically.
Personally, I wouldn't do this though to avoid coupling. To me that is a bad use of reflection, and I would much rather add it to the project and reference it, unless there is a specific reason why not to do so. Reflection adds overhead to the system, and you don't get the advantage of compile time safety.