How do you go about finding unused icons, images, strings in .resx files that may have become 'orphaned' and are no longer required?
Recently ResXManager 1.0.0.41 added a feature to show the number of references to a string resource.
I couldn't find any existing solution that would search for string resource references in XAML files and batch-remove unused ones.
So I wrote this: https://github.com/Microsoft/RESX-Unused-Finder
It searches a project directory for references to string resources, then displays a list of ones it couldn't find a match for. You can specify a template to search for so it can find references in XAML files.
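The approach described above can be sketched in a few lines of Python. This is not the linked tool's code, only a minimal illustration under some assumptions: the .resx file is plain XML with `<data name="...">` entries, the generated `*.Designer.cs`/`*.Designer.vb` wrapper is skipped because it mentions every key, and the function names, the default template `Resources.{key}` and the extension list are made up for the example.

```python
import re
import xml.etree.ElementTree as ET
from pathlib import Path


def resx_keys(resx_path):
    """Return the names of all <data> entries in a .resx file."""
    root = ET.parse(resx_path).getroot()
    return [d.get("name") for d in root.iter("data") if d.get("name")]


def find_unused(project_dir, resx_path, templates=("Resources.{key}",),
                extensions=(".cs", ".vb", ".xaml")):
    """Return keys for which no template expansion occurs in any source file."""
    sources = []
    for path in Path(project_dir).rglob("*"):
        if not path.is_file() or path.suffix.lower() not in extensions:
            continue
        if path.name.lower().endswith((".designer.cs", ".designer.vb")):
            continue  # assumption: the generated wrapper mentions every key
        sources.append(path.read_text(encoding="utf-8", errors="ignore"))
    text = "\n".join(sources)

    unused = []
    for key in resx_keys(resx_path):
        # The lookahead stops "Resources.Hello" from matching "Resources.HelloWorld".
        patterns = [re.escape(t.format(key=key)) + r"(?![A-Za-z0-9_])"
                    for t in templates]
        if not any(re.search(p, text) for p in patterns):
            unused.append(key)
    return unused
```

A XAML template would be passed as an extra entry in `templates`; literal braces in such a template have to be doubled because `str.format` is used. Like any text search, this cannot see keys that are built at run time.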
I created a free, open-source VS extension that looks for unused images in a project and just published the first version: https://marketplace.visualstudio.com/items?itemName=Jitbit1.VSUnusedImagesFinder
This is not information an algorithm can reliably compute. The inspected program could fetch a list of all resources and do something with them, like letting the user choose from several icons.
Your best bet is probably to search for all references to your resource-access API of choice and inspect those manually. Using grep/sed you might be able to reduce the sites you have to inspect manually by handling all "easy" ones where a simple string is used.
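As a rough illustration of the "handle the easy ones automatically" idea, the Python sketch below splits calls to a resource accessor into those with a plain string literal (whose key can be collected automatically) and those that need a manual look. The accessor name `GetString` is only an assumption; substitute whatever API your project uses. The regex is deliberately simple and errs on the side of flagging a call for manual review.

```python
import re

# Hypothetical accessor name; replace with your own resource-access API.
CALL = re.compile(r"\bGetString\s*\(\s*(?P<arg>[^)]*)\)")
# An argument that is nothing but one double-quoted literal without escapes.
LITERAL = re.compile(r'^"(?P<key>[^"\\]*)"\s*$')


def classify_call_sites(source):
    """Split accessor calls into literal keys and calls needing manual review."""
    literal_keys, manual = set(), []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for match in CALL.finditer(line):
            arg = match.group("arg").strip()
            lit = LITERAL.match(arg)
            if lit:
                literal_keys.add(lit.group("key"))
            else:
                # Computed key, nested call, etc.: a human has to look at it.
                manual.append((line_no, arg))
    return literal_keys, manual
```

Everything in `manual` is a site where the key is computed, which is exactly the case a static search cannot decide.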
Since I could not find a simple and fast solution yet, I found at least a solution that gets me the result I'm looking for, even if it takes some time (ideal for a lazy Sunday afternoon).
The solution involves Visual Studio .NET 2010 and ReSharper (I'm using version 7.1) and goes like the following.
Step-by-step solution
1.) Right-click your primary RESX file in VS.NET and select "Find Usages" from the context menu:
This will bring up ReSharper's "Find Results" window.
2.) Double-click each occurrence in the solution window:
This will open the source code window with the resource.
3.) Rename this resource from within the source code window:
It will bring up ReSharper's "Rename Resource" dialog.
4.) Give the resource a new name with a unique prefix. In my example this is "TaskDialog_":
It will rename both the resource and also the auto-generated C# wrapper/access class.
5.) Repeat the above steps 2, 3 and 4 for all resources in the "Usages" window.
6.) Open the RESX file in Visual Studio's resource editor and select all files without the prefix:
7.) Now click the "Remove Resource" button on the top of the window or simply press the Del key:
You finally have a RESX file with only the actually used resources in your file.
8.) (Optionally) If you have resources in multiple languages (e.g. "Resources.de.resx" for German), repeat steps 6 and 7 for those RESX files, too.
Warning
Please note that this will not work if you access your strings other than through the strongly-typed, auto-generated C# class Resources.
I recently built a tool that detects and removes unused string resources. I used the information in this post as a reference. The tool may not be perfect, but it does the heavy lifting and will be useful if you have a big project with a long history. We used this tool internally to consolidate resource files and remove unused resources (we got rid of 4,000+ resources out of 10,000).
You can look at the source code, or just install ClickOnce from here: https://resxutils.codeplex.com/
I had a similar problem: several thousand resource strings that I'd created for a translation table, many of which were no longer required or referenced by code. With around 180 dependent code files, there was no way I was going to manually go through each resource string.
The following code (in vb.net) will go through your project finding orphaned resources (in the project resources, not any individual forms' resources). It took around 1 minute for my project. It can be modified to find strings, images or any other resource type.
In summary, it:
1) Uses the solution project file to gather all the included code modules and appends them into a single string variable;
2) Loops through all the project resource objects, and creates a list (in my case) of those which are strings;
3) Does a string search finding resource string codes in the combined project text variable;
4) Reports resource objects that are not referenced.
The function returns the object names on the windows clipboard for pasting in a spreadsheet or as a list array of the resource names.
edit : example call in module : modTest
? modTest.GetUnusedResources("C:\Documents and Settings\me\My Documents\Visual Studio 2010\Projects\myProj\myProj.vbproj", True, true)
'project file is the vbproj file for my solution
Public Function GetUnusedResources(projectFile As String, useClipboard As Boolean, strict As Boolean) As List(Of String)
    Dim myProjectFiles As New List(Of String)
    Dim baseFolder = System.IO.Path.GetDirectoryName(projectFile) + "\"
    'get list of project files
    Dim reader As Xml.XmlTextReader = New Xml.XmlTextReader(projectFile)
    Do While (reader.Read())
        Select Case reader.NodeType
            Case Xml.XmlNodeType.Element 'start of an element
                If reader.Name.ToLowerInvariant() = "compile" Then ' only get compile included files
                    If reader.HasAttributes Then 'If attributes exist
                        While reader.MoveToNextAttribute()
                            If reader.Name.ToLowerInvariant() = "include" Then myProjectFiles.Add((reader.Value))
                        End While
                    End If
                End If
        End Select
    Loop
    reader.Close() 'release the project file
    'now collect files into a single string
    Dim fileText As New System.Text.StringBuilder
    For Each fileItem As String In myProjectFiles
        Dim textFileStream As System.IO.TextReader
        textFileStream = System.IO.File.OpenText(baseFolder + fileItem)
        fileText.Append(textFileStream.ReadToEnd)
        textFileStream.Close()
    Next
    ' Debug.WriteLine(fileText)
    ' Create a ResXResourceReader for the project's Resources.resx file.
    Dim rsxr As New System.Resources.ResXResourceReader(baseFolder + "My Project\Resources.resx")
    rsxr.BasePath = baseFolder + "Resources"
    Dim resourceList As New List(Of String)
    ' Iterate through the resources and collect the names of the string resources.
    For Each resourceValue As DictionaryEntry In rsxr
        ' Debug.WriteLine(resourceValue.Key.ToString())
        If TypeOf resourceValue.Value Is String Then ' or bitmap or other type if required
            resourceList.Add(resourceValue.Key.ToString())
        End If
    Next
    rsxr.Close() 'Close the reader.
    'finally search file string for occurrences of each resource string
    Dim unusedResources As New List(Of String)
    Dim clipBoardText As New System.Text.StringBuilder
    Dim searchText = fileText.ToString()
    For Each resourceString As String In resourceList
        Dim resourceCall = "My.Resources." + resourceString ' find code reference to the resource name
        Dim resourceAttribute = "(""" + resourceString + """)" ' find attribute reference to the resource name
        Dim searchResult As Boolean = False
        searchResult = searchResult Or searchText.Contains(resourceCall)
        searchResult = searchResult Or searchText.Contains(resourceAttribute)
        If Not strict Then searchResult = searchResult Or searchText.Contains(resourceString)
        If Not searchResult Then ' resource name not found so add to list
            unusedResources.Add(resourceString)
            clipBoardText.Append(resourceString + vbCrLf)
        End If
    Next
    'make clipboard object
    If useClipboard Then
        Dim dataObject As New DataObject ' Make a DataObject clipboard
        dataObject.SetData(DataFormats.Text, clipBoardText.ToString()) ' Add the data in string format.
        Clipboard.SetDataObject(dataObject) ' Copy data to the clipboard.
    End If
    Return unusedResources
End Function
I use ReSharper to find unused resource fields and then remove them manually if the project contains only a small number of resources. A short script can be used if we already have a list of unused items.
The solution is as follows:
show all unused members as described in this article
temporarily remove *.Designer.cs from the Generated file masks (ReSharper → Options → Code Inspection → Generated Code)
also comment out or remove the comment (the one indicating that the code is auto-generated) from the top of the Designer.cs file attached to the resource file.
You will then have a list of all unused resources; all that is left is to remove them from the resx.
I've been considering this myself and I believe I have two options. Both of these rely on the fact that I use a helper method to extract the required resource from the resource files.
Logging
Add some code to the "getresource" method or methods so that every time a resource is accessed, the resource key is written to a log. Then try to access every part of the site (a testing script might be helpful here). The resultant log entries should give a list of all the active resource keys, the rest can be junked.
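The logging idea can be sketched as a thin wrapper around the resource lookup. The Python below is only an illustration: the class name `TrackingResources` and its methods are hypothetical, and a real site would write the keys to a persistent log rather than keep them in memory.

```python
class TrackingResources:
    """Wraps a key -> value resource table and records which keys are read."""

    def __init__(self, resources):
        self._resources = dict(resources)
        self.accessed = set()

    def get(self, key):
        # Record the key before the lookup so every request is tracked.
        self.accessed.add(key)
        return self._resources[key]

    def never_accessed(self):
        """Keys that exist in the table but were never requested."""
        return sorted(set(self._resources) - self.accessed)
```

After exercising the site (manually or with a test script), `never_accessed()` lists the candidates for deletion. The list is only as trustworthy as the coverage of that run.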
Code Analysis
I am looking at whether T4 is capable of working through the solution and creating a list of all references to the "getresource" helper method. The resultant list of keys will be active, the rest can be deleted.
There are limitations to both methods. The logging method is only as good as the code covered by the test, and the code analysis might not always find keys rather than strings containing the keys, so there will be some extra manual work required there.
I think I'll try both. I'll let you know how it goes.
Rename your current image directory and then create a new one, do a find-in-files search within VS for your image path, i.e. '/content/images', multiselect all the used images and drag them into the new image folder.
You can then exclude the old directory from the project, or just delete it.
Related
I have an issue I've been stuck with for over a year now. I made a Forms application in VB.net which allows the user to type in some information and select items which represent docx-files with tables with special formatting, pictures and other formatting quirks in them.
At the end the software creates a Word document via Office.Interop, using the information the user provided in text fields in the Forms and the items they selected (e.g. it creates a table in Word, listing the user's selections with some extra info) and then appends the content from multiple docx-files depending on the user's selection to the document created via Interop.
The problem is: To achieve this I had to use a pretty dirty method:
I open the respective docx-files, select all content (Range.Wholestory()) and copy it (Range.Copy()). Then I insert this content from the clipboard into my newly created document with the following option:
Selection.PasteAndFormat (wdFormatOriginalFormatting)
This produces a satisfactory result but it feels super dirty since it uses the user's clipboard (which I save at the beginning of the runtime and restore at the end).
I originally tried to use the Selection.InsertFile-Method and tried this again today but it completely screws the formatting.
When the content of the docx is inserted this way it neither has the formatting of the original docx nor the one of the file I created with the program. E.g. the SpaceBefore and SpaceAfter values are wrong, even if I explicitly define them in my created file. Changing the formatting afterwards is no option since the source files contain a lot of special formatting and can change all the time.
Another factor which makes it hard: I cannot save the file before it is presented to the user, using temp folder is not an option in the environment this application is deployed into, so basically everything happens in RAM.
Summary:
Basically what I want is to create the same outcome as with my "Copy and Paste" method utilizing the OriginalFormatting WITHOUT using the clipboard. The problem is, the InsertFile-Method doesn't provide an option for the formatting.
Any idea or help would be greatly appreciated.
Edit:
The FormattedText option as suggested by Rich Michaels produces the same result as the InsertFile-Method. Here is the relevant part of what I did (word is the Microsoft.Office.Interop.Word.Application):
'Opening the source file
Dim doctemp As Microsoft.Office.Interop.Word.Document
doctemp = word.Documents.Open(doctempfilepath)
'Selecting whole document; this is what I did for the "Copy/Paste"-Method, too
doctemp.Range.WholeStory()
Dim insert_range As wordoptions.Range
doc_destination.Activate()
'Jumping to the end and selecting the range
word.Selection.EndKey(Unit:=Microsoft.Office.Interop.Word.WdUnits.wdStory)
insert_range = word.Selection.Range
'Inserting the text
insert_range.FormattedText = doctemp.Range.FormattedText
doctemp.Close(False)
This is the problem:
Use the Range.FormattedText property. It doesn't touch the clipboard and it maintains the source formatting. The process is ...
Set the range in the Source document you want "copied" and set the insertion point in the Destination document and then,
DestinationRange.FormattedText = SourceRange.FormattedText
I want to convert an HTML page to a PDF page. I have a windows application.
I have read many articles but did not find a working solution. I am also facing an image path issue and some other issues, such as "input string is not of the correct format". Please help me find a solution that I can use in my Windows application.
I am using the following code
Private Sub Button2_Click_1(sender As Object, e As EventArgs) Handles Button2.Click
    Dim document As New Document()
    Try
        PdfWriter.GetInstance(document, New FileStream(AppDomain.CurrentDomain.BaseDirectory + "\SCRA_Resources\SCRA.pdf", FileMode.Create))
        document.Open()
        Dim wc As New WebClient()
        Dim htmlText As String = wc.DownloadString(AppDomain.CurrentDomain.BaseDirectory + "\SCRA_Resources\SCRA.html")
        Dim htmlarraylist = HTMLWorker.ParseToList(New StringReader(htmlText), Nothing)
        For k As Integer = 0 To htmlarraylist.Count - 1
            document.Add(DirectCast(htmlarraylist(k), IElement))
        Next
        document.Close()
    Catch
    End Try
End Sub
When I run this code I get the error: Could not find file 'C:\TestProjects\MergePDfs\MergePDfs\bin\Debug\help.gif'.
I put the images in the same folder where my HTML file is saved, but the HTML worker cuts the path off two folders earlier. It also does not fully apply the CSS.
Let me go through your code to explain a couple of things.
First get rid of your Try and Catch and avoid ever using them in the future. Sounds weird, I know. But everything in code is technically "try this" because every line of code can fail. The only reason to ever use the actual Try command is if you have a valid Catch block that actually does something useful. Logging is one thing. Showing an error message is another, but since you're in VS that's covered already.
Next are these two lines:
Dim htmlText As String = wc.DownloadString(AppDomain.CurrentDomain.BaseDirectory + "\SCRA_Resources\SCRA.html")
Dim htmlarraylist = HTMLWorker.ParseToList(New StringReader(htmlText), Nothing)
The right part of the first line is "get some HTML from a very specific location" and the left part is "and put that into a variable as a string that's totally unaware of the original specific location". Read this a couple of times if it doesn't make sense because it should explain why the second line can't find the images.
Your image links are all relative but relative to what? I know you want it to be your specific folder but you didn't actually specify that in any way. HTML has (or maybe had, I haven't done this in a decade probably) a way to do this via the base tag but I don't know if iText supports that. So instead you need to tell iText "when I say relative, I mean relative to this folder".
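To make the "relative to what?" point concrete, here is a small language-neutral sketch in Python (it is not iText code). The function name `resolve_image` is hypothetical; it simply joins a relative `src` onto an explicitly chosen base folder and leaves absolute paths alone, which is the decision the HTML parser cannot make on its own.

```python
from pathlib import PureWindowsPath


def resolve_image(src, base_dir):
    """Resolve an image src against base_dir unless src is already absolute."""
    src_path = PureWindowsPath(src)
    if src_path.is_absolute():
        return src_path
    return PureWindowsPath(base_dir) / src_path


# The folder that holds the HTML file in the question.
html_dir = r"C:\TestProjects\MergePDfs\MergePDfs\bin\Debug\SCRA_Resources"
print(resolve_image("help.gif", html_dir))
```

Without an explicit base folder, a relative name like `help.gif` is resolved against the process's working directory, which matches the path in the error message.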
Before continuing, it is important to understand that you are using a very old, officially obsoleted and no longer supported helper class that lacks many features and will eventually cause you a lot of grief. The HTMLWorker class was replaced with the XMLWorker class many years ago. Although the HTMLWorker class sounds like something that's more appropriate, think of the XMLWorker as "XHTML" instead of "XML".
Okay, so if you're stuck using HTMLWorker, you can solve this by implementing the iTextSharp.text.html.simpleparser.IImageProvider interface. If you do this and you are using the 5.x series you should hopefully get a bunch of warnings because, as was said above, HTMLWorker is officially obsoleted. The GetImage method of this interface will be called for every image in your document. Below is a very simple implementation that takes a single parameter for the constructor that specifies what the new location should be. Ideally you should add some error handling (this is a good candidate for a Try\Catch because your Catch could be to include an explicit "image not found image") and if you have a mixture of absolute and relative images you should check for that, too.
Public Class RelativeRootImageProvider
    Implements iTextSharp.text.html.simpleparser.IImageProvider

    Public Property BasePath As String

    Public Sub New(basePath As String)
        Me.BasePath = basePath
    End Sub

    Public Function GetImage(src As String,
                             attrs As IDictionary(Of String, String),
                             chain As iTextSharp.text.html.simpleparser.ChainedProperties,
                             doc As IDocListener) As iTextSharp.text.Image Implements iTextSharp.text.html.simpleparser.IImageProvider.GetImage
        ''//This should also check to see if src is absolute and maybe try getting it first before the below.
        ''//The below could also have a File.Exists() check, too.
        Dim newSrc = System.IO.Path.Combine(BasePath, src)
        Return iTextSharp.text.Image.GetInstance(newSrc)
    End Function
End Class
To use this you just need to create a special collection and add it to it:
''//Pick a folder
Dim RelativeImageRootPath = Environment.GetFolderPath(Environment.SpecialFolder.Desktop)
''//Collection of providers
Dim providers As New System.Collections.Generic.Dictionary(Of String, Object)()
''//Add our image provider pointed to our specific folder
providers.Add(HTMLWorker.IMG_PROVIDER, New RelativeRootImageProvider(RelativeImageRootPath))
And then pass the providers as the third parameter of the ParseToList method:
Dim htmlarraylist = HTMLWorker.ParseToList(New StringReader(htmlText), Nothing, providers)
I have a folder, full of 38,000+ .pdf files. I was not the genius to put them all into one folder, but I now have the task of separating them. The files that are of value to us, all have the same basic naming convention, for example:
123456_20130604_NEST_IV
456789_20120209_VERT_IT
What I'm trying to do, if possible, is search the folder for only those files with that particular naming convention. As in, search only for files that have 6 digits, an underscore, and then 8 digits followed by another underscore. Kind of like *****_********. I've searched online but I haven't had much luck. Any help would be great!
var regex = new Regex(@"^\d{6}_\d{8}_", RegexOptions.Compiled);
string[] files = Directory.GetFiles(folderPath)
    .Where(path => regex.Match(Path.GetFileName(path)).Success)
    .ToArray();
files will contain the paths of the files that match the criteria.
For my example C:\Temp\123456_20130604_NEST_IV 456789_20120209_VERT_IT.pdf, which I've added beforehand.
As a bonus, here is a PowerShell script to do this (assuming you are in the correct folder, otherwise use dir "C:\temp" instead of dir):
dir | Where-Object {$_ -match "^\d{6}_\d{8}_"}
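For comparison, the same filter can be sketched in Python. This is only an illustration of the pattern from the question (6 digits, underscore, 8 digits, underscore); the function name `matching_pdfs` is made up, and the `.pdf` extension check is an added assumption.

```python
import re
from pathlib import Path

# re.ASCII restricts \d to 0-9; without it Python also accepts other Unicode digits.
PATTERN = re.compile(r"^\d{6}_\d{8}_", re.ASCII)


def matching_pdfs(folder):
    """Return the PDF files in folder whose names start with NNNNNN_NNNNNNNN_."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf" and PATTERN.match(p.name)
    )
```

The returned paths can then be moved or copied, for example with `shutil.move`.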
? - single character
* - multiple characters
So, I would say use ??????_????????_????_??.* to get all your files. Note that ? matches any character, not only digits.
You can use move or copy command from a command prompt to do that.
If you want to do advanced searches such as pattern matching, use windows grep: http://www.wingrep.com/
Are you familiar with regular expressions? If not, they are a generalized way to search for strings of a special format. I see you tagged your question with C# so assuming you are writing a C# script you might try the .NET regular expression module.
http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex.aspx
If you are a beginner, you may want to start here.
http://www.codeproject.com/Articles/9099/The-30-Minute-Regex-Tutorial
There are numerous ways to handle this. What I like to do is to divide work into different steps with clear output/data in each step. Hence I would tackle this in the following way (since this seems easier for me instead of writing a master program in c# that does everything):
Open a Windows command prompt (Start/Run/cmd), navigate to the correct folder and then run "dir *.pdf > pdf_files.txt". This gives you a file listing all PDF files inside that folder.
Open the text file (pdf_files.txt) in Notepad++, press Ctrl+F (Find) and activate the "Regular expression" radio button.
Type [0-9]{6}_[0-9]{8}_.*\.pdf and press "Find All in Current Document".
Copy the results and save them to a new .txt file.
Now you have a text file containing all pdf-files that you can do what you want with (create a c# program that parses the files and move them depending on their name or whatever)
I am wondering say I have this string "Hi my name is chobo2" and I want to find all the files that have this string. Normally I would do ctrl + f in VS 2010 and do a find.
How do I find this string if it is in a resource file? Right now I have a string in a solution that has many projects. I know the project has at least one resource file but I cannot find the string I am looking for. I might have missed it, as the file seems to have many strings in it.
Is there any easy way to locate this string value in the resource file? This way I can find the "resource name" and thus find where the string is used in the project.
Edit
Just a side note
I opened up the resource file and tried to do a ctrl + f on it(search by current document) but it only searches on the "name" column not the "value" column
I think this will explain all you need to find the string. Cheers to VS2010.
In resharper it is very easy -> find usages (or something similar)
But you can also use Ctrl + Shift + F (select Entire Solution), type *.* or *.resx in the file types box and click Find. All occurrences should appear in the panel below; double-clicking one should take you directly to the resx file (XML).
Press Ctrl+Shift+F and select Entire Solution. Then use F8 to navigate through the found items. In addition, to find the related resource files, go to Tools -> Options -> Projects and Solutions and check Track Active Item.
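If you prefer a script over Find in Files, the lookup can be sketched in Python, since a .resx file is plain XML with `<data name="..."><value>...</value></data>` entries. The function name `find_resource_names` is hypothetical and the search is a simple case-insensitive substring match.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def find_resource_names(solution_dir, text):
    """Return (resx file name, resource name) pairs whose value contains text."""
    needle = text.casefold()
    hits = []
    for resx in sorted(Path(solution_dir).rglob("*.resx")):
        root = ET.parse(resx).getroot()
        for data in root.iter("data"):
            value = data.findtext("value") or ""
            if needle in value.casefold():
                hits.append((resx.name, data.get("name")))
    return hits
```

The resource name it reports is what you would then search for in code to find where the string is used.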
I want to get the title of a shortcut: not the file name, not the description, but the title.
How can I get it?
I have learned how to resolve its target path from here: How to resolve a .lnk in c#
but I can't find any method to get its title.
It sounds like you might be trying to get the title of the file the link points to, as JRL suggests.
If you're not trying to do that, I'd recommend opening up one of these .lnk files in a hex editor like XVI32. You can probably tell from there whether the Chinese name displayed is embedded in the .lnk file or is somewhere else.
If it's somewhere else, it may be an Extended File Property. There's some source code that may help with retrieving that info: Extended File Properties
If by some chance it is inside the .lnk file, I recommend looking at the Windows Shortcut Specification to get offset information and such on the location of that data.
There is a Desktop.ini hidden file in shortcuts directory, the Desktop.ini file records display strings info of shortcuts.
Desktop.ini file sample:
[LocalizedFileNames]
Windows Update.lnk=@%SystemRoot%\system32\wucltux.dll,-1
Default Programs.lnk=@%SystemRoot%\system32\sud.dll,-1
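Reading that section is straightforward with any INI parser. The Python sketch below is an illustration only: the function name `localized_names` is made up, and it takes the file content as a string because real Desktop.ini files are often not UTF-8 (frequently UTF-16), so the caller has to decode them appropriately. It returns the raw values; turning a DLL resource reference into the displayed text is a separate step that is not shown here.

```python
import configparser


def localized_names(ini_text):
    """Map shortcut file names to their raw [LocalizedFileNames] entries."""
    # interpolation=None keeps '%' characters literal; only '=' separates key and value.
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.optionxform = str  # keep file names exactly as written (no lowercasing)
    parser.read_string(ini_text)
    if not parser.has_section("LocalizedFileNames"):
        return {}
    return dict(parser.items("LocalizedFileNames"))
```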
You can use the property system APIs in the latest release of the Code Pack:
(all the 670+ properties in the system are accessible using simple property accessors)
http://code.msdn.microsoft.com/WindowsAPICodePack
I know your current need is limited to the title of .lnk files. Using the above library, the sample code might look like:
ShellLink myLink = (ShellLink)ShellObject.FromParsingName(@"c:\somepath\myLink.lnk");
string title = myLink.Properties.System.Title.Value;
// This is what its pointing to...
string target = myLink.Properties.System.TargetParsingPath.Value;
Please define "title". The only attributes that sound relevant are the shortcut's file name, the target's file name, and the .lnk file's description data.
Assuming you mean the title of the file the link points to, not the link itself, and that you are talking about Windows, then it's done via a feature in NTFS, alternative streams. You can access those streams using code in this article.
Looking around on creating shortcuts, looks like there's a lot of jumping through hoops with scripting objects. But am I missing something? If you have a path to the shortcut, the name should be exactly what you find in the path, not some attribute you have to look up.
Dim f As FileInfo = New FileInfo("C:\Name of shortcut.lnk")
Dim title As String = f.Name.Replace(".lnk", String.Empty)