Control when referenced .NET assemblies are loaded

Normally a referenced assembly of an assembly is loaded when the first method from a type in the referenced assembly is executed.
Does it make sense to force loading all referenced assemblies at a point where the application flow can tolerate a delay to avoid it in further execution where it might not be tolerable (e.g. in a time critical method)?
If yes, what's the best way to do that? (Reflection, ...)

One of my present employer's products gets a list of all the DLLs from the directory of the entry assembly. It then loads them all using Assembly.LoadFrom. It does this while the splash screen is up. Frankly, the code scares me. We've had to put in some hacks to avoid certain DLLs. We've had to change the installer to wipe the target directory clean before updating. It's a very insecure plan.
At a previous job, I wrote a similar function that used the GetReferencedAssemblies method. Starting with the entry assembly, it would recursively call that followed by Assembly.LoadFrom. It would stop the recursion after it loaded an assembly that was not shipped with our product. It worked, but I have since decided it was unnecessary.
In the present product I work on, we use Autofac to build the full dependency tree for the application. The bootstrapper code to configure that references all the services in the entire project -- I would guess that's at least 70% of the code. Again, this is triggered while the splash screen is up. This is the right approach. It balances "loading the necessities" with "taking time to load everything including stuff that may never be used".
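For reference, the recursive preload described above could look roughly like the following. This is a minimal sketch, not the code from either product: it uses Assembly.Load with the AssemblyName returned by GetReferencedAssemblies instead of Assembly.LoadFrom, and the shouldLoad predicate (a name chosen for this example) stands in for the "was it shipped with our product" check.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection;

public static class AssemblyPreloader
{
    // Walks the reference graph starting at 'root' and loads every referenced
    // assembly accepted by 'shouldLoad'. Returns the full names of all visited assemblies.
    public static ISet<string> Preload(Assembly root, Func<AssemblyName, bool> shouldLoad)
    {
        var visited = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        Visit(root, shouldLoad, visited);
        return visited;
    }

    private static void Visit(Assembly assembly, Func<AssemblyName, bool> shouldLoad, ISet<string> visited)
    {
        if (!visited.Add(assembly.FullName)) return; // already handled, avoids cycles

        foreach (AssemblyName reference in assembly.GetReferencedAssemblies())
        {
            if (!shouldLoad(reference)) continue; // e.g. skip anything not shipped with the product

            try
            {
                Visit(Assembly.Load(reference), shouldLoad, visited);
            }
            catch (Exception ex) when (ex is FileNotFoundException ||
                                       ex is FileLoadException ||
                                       ex is BadImageFormatException)
            {
                // An unresolvable reference is skipped rather than aborting startup.
            }
        }
    }
}
```

It could be called while the splash screen is up, for example as AssemblyPreloader.Preload(Assembly.GetEntryAssembly(), name => name.Name.StartsWith("MyCompany.")), where the "MyCompany." prefix is only a placeholder for however you recognise your own assemblies.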

Related

Partially reference a DLL

I have a library DLL full of sort algorithms, parsers, validators, converters etc. The DLL is about 40 MB (that is not much, I know, but still). Now I would like to reference just the parsers of that DLL. The point is to get those parsers out without shipping 40 MB to the customer.
Is there a way, every time I make a release build, to take just the up-to-date parsers from my library, store them in some kind of .partialDll file and deliver only those to the customer? The result would be that I keep all my helper classes in one big library which keeps growing, and the customers get just what they ordered.
I guess I would need to deal with a lot of reflection to achieve something like this, right? Any ideas?

Let me start with a quote from MSDN:
"Assemblies are the building blocks of .NET Framework applications; they form the fundamental unit of deployment […]."
Note that the quote is about assemblies, not about DLLs. There's a difference!
Although most .NET assemblies consist of exactly one DLL file, that is not a strict requirement: An assembly can in fact consist of more than one file; such a "multi-file assembly" can, for instance, consist of several DLLs, which in turn are called "netmodules". (A netmodule might have a .netmodule file extension by convention, but it's really a DLL containing .NET metadata and bytecode.) Each multi-file assembly has exactly one "main" module which carries the metadata that references all the other assembly files and so ties them together into a logical whole.
While an assembly has to be deployed in full (as per the above quote), the .NET runtime can load only those netmodules that are actually required for JIT code compilation and execution.
So you can split up an assembly into several parts, and have the runtime load only what is actually needed; but you cannot do the same to a netmodule / DLL file. A DLL file can only be deployed and loaded in its entirety.
Note also that Visual Studio's support for netmodules is non-existent for all practical purposes, so most people don't use them, which is why you see so few multi-file assemblies in the real world.
The bottom line is this: In practice, if you or your clients are interested in only a part of an assembly ("DLL"), then it's usually easier to split a large assembly (that is, one large Visual Studio project) into several inter-dependent assemblies (several smaller Visual Studio projects).

In general, no, there is no way to achieve that. Once you pack "everything" into a module and compile it, you can't split that module into smaller ones later. (Well, OK, you can analyze the bytecode and rewrite the assembly; see the end of this post.)
To me, your premise seems wrong. You don't need to work with "one huge library that keeps all your helper classes", and really, you don't want to, or eventually you won't want to. If you don't feel that way now, I assure you that in time, years maybe, you will hate such a one-to-have-it-all approach.
This is exactly what you want to escape from, and this is why .NET and many other languages/environments support the concept of "libraries" or "modules" and allow you to use several of them. It is also why most of the projects you see everywhere aren't created as "one huge EXE". It's much easier to reuse, analyze and even hunt bugs in code that comes in smaller chunks.
--
However, if you insist, there are (ugly) ways to achieve something like what you have in mind. I assume that the "huge DLL" is written in C# and is controlled by you.
The first, somewhat naive but working way is to use "file links". In Visual Studio you can have a project that contains tons of files and produces a big DLL, "all.dll", and right next to it you can create another project that contains no files of its own, only links to the first project's files. Use the typical "Add a file.." option on the project and note that next to the final "Add" button there's a down arrow that expands to "Add as link..".
This will cause the file to stay in HugeProject, but SmallProject will see the file too, and when SmallProject is compiled, it will pull in the code from that file as well.
Note that this way you will actually build two separate assemblies, a big one and a small one, and your final product will need to reference the small one.
This way is naive and ugly: it is just as if you had manually copied/split the huge project into smaller ones, with the tiny advantage that you don't need to copy the code files around.
--
intermission for side-thoughts:
you can use #if to conditionally turn off some currently-unused code, however setting the flags that drive those IFs will be cumbersome
you can edit .csproj files and use MSBuild conditional clauses to automatically exclude unused code files from your HugeProject during final builds, however setting the flags that drive those IFs will be cumbersome too
--
The second way is to keep everything in the HugeProject, and to have your application(s) reference it directly, and then after building and testing everything, just before packing that and sending to customer - use some kind of trimming utility that will check what parts of code are referenced and that will remove all dead code from the assemblies. I can't give you any name for such utility, but many obfuscators come with such feature.
They will run through your compiled code, cross-reference everything, change/remove/trash class/method/property names, and as a bonus they may also remove unused bits. Then they'll write the mangled assemblies back to disk, ensuring that they reference each other and not the original ones from before mangling.
example: See a question related to that
example: See an example of such a utility; also consider ILMerge for better results.
Cons: the utility may leave behind some trash when it couldn't decide whether it is used or not; finding/testing/buying it may take some time and resources; you can run into signing problems, since the stripped assembly will be a brand new assembly; etc. Such utilities also have problems if you invoke some code only by reflection, and may require you to provide extra hints or to make sure the code "seems to be used". (Example: a whole namespace of "plugins" that implement "IPlugin", which your app searches for types and instantiates with Activator.CreateInstance. There are no hard-linked usages, so the trimmer may decide to remove all plugins as "unused"; you'll need to configure the trimmer carefully or be surprised.)
A few other ways could probably be found too, but seriously, most of the time you don't want to waste your time on that, especially manually. So just tidy up your code and split it into small libraries, or start looking for an automatic obfuscator/trimmer.

Is there a good reason for preferring reflection over reference?

Going over some legacy code, I ran into a piece of code that used reflection to load some DLLs whose source code was available (they were another project in the solution).
I was cracking my skull trying to figure out why it was done this way (naturally the code was not documented...).
My question is, can you think of any good reason for preferring to load an assembly via reflection rather than referencing it?

Yes, if you have a dynamic module system, where different DLLs should be loaded depending on conditions at runtime. We do this where I work; we do a license check for different optional modules that may be loaded into our system, and then only load the DLLs associated with each module if the license checks out. This prevents code that should never be executed from being loaded, which can both improve performance slightly and prevent bugs.
Dynamically loading DLLs may also allow you to drastically change functionality without changing any source code. The main assembly may for instance set in motion a discovery process where it finds all classes that implement some interface, and chooses which one to use depending on some runtime criterion.
These days you'll typically want to use MEF for this kind of task, but that's only been around since .NET 4.0, so there are probably many codebases out there that do it manually. (I don't know much about MEF. Maybe you have to do this part manually there as well.)
But anyway, the answer to your question is that there certainly are good reasons to dynamically load DLLs using reflection. Whether it applies in your case is impossible to say without more details.
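The discovery process mentioned above (find all classes that implement some interface, then pick one at runtime) can be sketched as follows. IModule, ReportingModule and ModuleDiscovery are hypothetical names invented for this example; in a real system the interface would live in an assembly shared by the host and the modules, and the assembly passed in would typically come from Assembly.LoadFrom after the license check.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical contract shared by the host application and its optional modules.
public interface IModule
{
    string Name { get; }
}

// Sample implementation, included only so the sketch has something to discover.
public sealed class ReportingModule : IModule
{
    public string Name => "Reporting";
}

public static class ModuleDiscovery
{
    // Creates one instance of every concrete IModule in the assembly
    // that has a public parameterless constructor.
    public static List<IModule> CreateModules(Assembly assembly)
    {
        return assembly.GetTypes()
            .Where(t => typeof(IModule).IsAssignableFrom(t) && t.IsClass && !t.IsAbstract)
            .Where(t => t.GetConstructor(Type.EmptyTypes) != null)
            .Select(t => (IModule)Activator.CreateInstance(t))
            .ToList();
    }
}
```

Note that GetTypes can throw ReflectionTypeLoadException when some of the assembly's own dependencies are missing, so production code would probably want to handle that case.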

Without knowing your specific project, no one here can tell you why it was done that way in your case.
But the general reasons are:
Updateability: you can simply recompile and replace the updated library instead of having to recompile and replace the whole application.
Cooperation: if the interface is clear, multiple teams can work together this way, one on the main application and others on the DLLs.
Reusability: sometimes you need the same functionality in multiple projects, so the same DLL can be used again and again.
Extensibility: in some cases you want to be able to extend your program later with plugins that were not present at shipment time. This can be realized using DLLs.
I hope this helps you understand some of your setup.

Reason for loading an assembly via reflection rather than referencing it?
Consider a scenario with three classes that each have a DoWork() method returning a string, and you pick which one to call by checking a condition (strongly typed).
Now two more classes arrive in two different DLLs. How would you cope with the change?
1) You can add references to the new DLLs, change the conditional check and make it work.
2) You can use reflection and pass in the condition and the assembly name at run time. This allows you to add any amount of functionality at run time without changing any code in the primary application.

Accessing resources in registered .NET DLL via res protocol

I have a .NET DLL that I register with regasm.exe as a Browser Helper Object. I embedded several Win32 resources in the DLL using .res files and /win32 switch to csc.exe.
image.png HTML "image.png"
I am able to access the resources using the res protocol as long as I include the absolute path of the DLL.
res://C:\Path\To\My\Dll\Assembly.dll/image.png
This is rather unfortunate. I would much rather just reference the assembly name as I have seen in many other examples:
res://Assembly.dll/image.png
But this results in an error because the browser can't find the resource.
Ideas?

I think there are a couple things that can be done. First, I believe your assembly needs to be a part of the global assembly cache if you don't want to be forced to use the full path.
Add assembly to global assembly cache.
It's not pretty and you must also keep in mind that a newer build of the DLL will be different from the version in cache. So it would essentially be another extra step to take which would just lead us back to accepting having to put in the full path in the first place.
Second, another solution I believe that works would be to place the DLL in the same directory as the VS utility or the other resources you're trying to use. This could be applicable to multiple different things that one might want to do, but requires manually moving your files around.
Third, you can create custom environment variables that represent the path you want. So instead of typing the full path in the future, you just type your variable.
The third option is my favorite and something that I use for multiple different things I need to accomplish from the command-line.

Best way to only perform a function if a (.NET) DLL is loaded?

I am not sure the best way to explain this so please leave comments if you do not understand.
Basically, I have a few libraries for various tasks to work with different programs - notification is just one example.
Now, I am building a new program, and I want it to be as lightweight as possible. Whilst I would like to include my notification engine, I do not think many people would actually use its functionality, so, I would rather not include it by default - just as an optional download.
How would I program this?
With unmanaged DLLs and P/Invoke, I can basically wrap the whole lot in a try/catch block, but I am not sure about the managed version.
So far, the best way I can think of is to check if the DLL file exists upon startup then set a field bool or similar, and every time I would like a notification to be fired, I could do an if/check the bool and fire...
I have seen from the debug window that DLL files are only loaded as they are needed. The program would obviously compile as all components will be visible to the project, but would it run on the end user's machine without the DLL?
More importantly, is there a better way of doing this?
I would ideally like to have nothing about notifications in my application and somehow have it so that if the DLL file is downloaded, it adds this functionality externally. It really is not the end of the world to have a few extra bytes calling notification("blabla"); (or similar), but I am thinking a lot further down the line when I have much bigger intentions and just want to know best practices for this sort of thing.

"I do not think many people would actually use its functionality, so, I would rather not include it by default - just as an optional download."
Such things are typically described as plugins (or add-ons, or extensions).
Since .NET 4, the standard way to do that is with the Managed Extensibility Framework. It is included in the framework as the System.ComponentModel.Composition assembly and namespace. To get started, it is best to read the MSDN article and the MEF programming guide.

You can use System.Reflection.Assembly and its LoadFile method to dynamically load a DLL. You can then use the methods in Assembly to get Classes, types etc. embedded in the DLL and call them.
If you just check if the .dll exists or load every .dll in a plugin directory you can get what you want.

To your question of whether the program will run on the user's machine without the DLLs being present: yes, the program would run. As long as you don't do something that needs the runtime to load the classes defined in the DLL, it does not matter if the DLL is missing from the machine. As for loading the DLL on demand, I think you are well off using some sort of configuration plus Reflection (either directly or through some IoC strategy).

Try to load the plugin at startup.
Instead of checking a boolean all over the place, you can create a delegate field for the notification and initialize it to a no-op function. If loading the plugin succeeds, assign the delegate to the plugin implementation. Then everywhere the event occurs can just call the delegate, without worrying about the fact that the plugin might or might not be available.
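A minimal sketch of that delegate approach, under some loud assumptions: the plugin is taken to expose a public static void Send(string) method on a type called NotificationEngine.Notifier. Both names are made up for this example and would need to match whatever the real notification DLL provides.

```csharp
using System;
using System.IO;
using System.Reflection;

public static class Notifications
{
    // Starts out as a no-op, so callers never have to check whether the plugin exists.
    public static Action<string> Notify { get; private set; } = _ => { };

    // Tries to bind Notify to a public static method in the plugin DLL.
    // The default type and method names are assumptions made for this sketch.
    public static bool TryLoadPlugin(string dllPath,
                                     string typeName = "NotificationEngine.Notifier",
                                     string methodName = "Send")
    {
        if (!File.Exists(dllPath)) return false;

        try
        {
            Assembly plugin = Assembly.LoadFrom(dllPath);
            MethodInfo method = plugin.GetType(typeName)?.GetMethod(methodName, new[] { typeof(string) });
            if (method == null || !method.IsStatic) return false;

            Notify = (Action<string>)Delegate.CreateDelegate(typeof(Action<string>), method);
            return true;
        }
        catch (Exception ex) when (ex is IOException ||
                                   ex is BadImageFormatException ||
                                   ex is ArgumentException)
        {
            return false; // loading or binding failed, keep the no-op
        }
    }
}
```

At startup the application would call Notifications.TryLoadPlugin once with the expected path of the optional DLL; everywhere else it simply calls Notifications.Notify("blabla"), which does nothing when the plugin was never installed.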

ASP.NET websites referencing common, updatable Assembly

I'm developing a component (HttpModule) that's used by a number of web applications on a .NET website, and I want the component to be easily maintainable. I've come up with something outlined below but wanted to see if there were any positive/negative thoughts or general feedback on the implementation, as I'm not 100% familiar with Assembly loading, especially in terms of memory overhead.
(I don't really want to do this: Create Your Own .NET Assembly Cache)
The lightweight HttpModule itself is in the GAC and referenced from the site's root web.config. On each request it opens a text file (stored in the web's root/bin) that contains just a strong-named assembly's name (e.g. "My.MyLibrary, Version=1.1.0.0, Culture=en, PublicKeyToken=03689116d3a4ae33") and then checks the current AppDomain to see if it is already loaded (iterates over GetAssemblies()). If not, it calls Assembly.Load to load My.MyLibrary and uses basic Reflection to Invoke() a custom method in My.MyLibrary that actually does the intended processing work of the HttpModule.
My.MyLibrary itself is also in the GAC. To upgrade the app without any app restarts, put a new version in the GAC, and just edit the string in the text file. I'm using the text file because a) it's fast and b) I didn't want to have to update a machine/web.config and cause a recycle to redirect the HttpModule to use a new version of My.MyLibrary. It seems to work okay. The old version can be uninstalled from the GAC when it's finally ready to be. So hopefully the only time an app pool/iis reset would be needed would be to change the HttpModule part.
Any replies much appreciated!
-Will

Personally I would say if you can avoid using any late binding it would be better, but as you want to be able to have the freedom to just throw a new assembly at your application then it does seem like late binding makes sense.
With regards to your method of storing and retrieving the list of assemblies, I would use an XML object and load it from the file. You will find adding extra information to it simpler this way, otherwise you will have to maintain your own file format.
You may also want to consider adding some code to catch errors generated from these assemblies, unload them and put a flag in your file telling your HttpModule not to load it until it has been updated.
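For anyone trying to picture the lookup-then-load step from the question, here is a rough sketch. It is not the asker's actual code: LibraryResolver and its method names are invented for illustration, and the invoked method is assumed to be public and static with no overloads.

```csharp
using System;
using System.Linq;
using System.Reflection;

public static class LibraryResolver
{
    // Returns the assembly with the given display name, loading it only
    // if the current AppDomain does not already contain it.
    public static Assembly GetOrLoad(string assemblyFullName)
    {
        Assembly loaded = AppDomain.CurrentDomain.GetAssemblies()
            .FirstOrDefault(a => string.Equals(a.FullName, assemblyFullName,
                                               StringComparison.OrdinalIgnoreCase));
        return loaded ?? Assembly.Load(assemblyFullName);
    }

    // Late-bound call of a public static method; throws if the type or method is missing.
    public static object InvokeStatic(Assembly assembly, string typeName, string methodName,
                                      params object[] args)
    {
        Type type = assembly.GetType(typeName, throwOnError: true);
        MethodInfo method = type.GetMethod(methodName, BindingFlags.Public | BindingFlags.Static);
        if (method == null) throw new MissingMethodException(typeName, methodName);
        return method.Invoke(null, args);
    }
}
```

Keep in mind that every version loaded this way stays in the AppDomain until the application recycles, so memory use grows with each upgrade that happens between restarts.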
