I am struggling to find a reliable way to get the content/text of the window that is currently in the foreground. Ideally, it should work for any program the user happens to be using.
What I tried:
Take a screenshot of the currently active window, apply some filters and run an OCR algorithm (Tesseract .NET wrapper). This works, but takes a long time and is not very accurate.
Then I tried some Windows API functions (FindWindow and SendMessage), as described here. I could make it work for the standard editor (Notepad), for example, but not for most other programs.
I also tried to make it work with AutoHotkey and the WinGetText function, again through a .NET wrapper. Here, I just get some info about the window, but not its text.
Unfortunately, I don't have any other ideas and I am stuck. Does someone have experience with this or know a way that works? Any suggestion is very much appreciated.
It will be difficult to find a single solution to retrieve text from applications. Different methods for different programs will be required.
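To illustrate the "different methods for different programs" idea, here is a minimal sketch of a fallback chain that tries one extraction method after another until one returns text. The three extractor functions are hypothetical stand-ins (they do not call any real API); real versions might wrap a window message, an accessibility lookup, or OCR.

```python
from typing import Callable, Iterable, Optional

# An extractor takes a window handle and returns text, or None if it cannot help.
Extractor = Callable[[int], Optional[str]]


def first_text(hwnd: int, extractors: Iterable[Extractor]) -> Optional[str]:
    """Return the first non-empty result, trying the extractors in order."""
    for extract in extractors:
        try:
            text = extract(hwnd)
        except Exception:
            # One failing method should not stop the remaining ones.
            continue
        if text and text.strip():
            return text
    return None


# Hypothetical stand-ins that only simulate the three outcomes.
def via_window_message(hwnd: int) -> Optional[str]:
    return None  # pretend the control does not return any text


def via_accessibility(hwnd: int) -> Optional[str]:
    raise RuntimeError("no accessible object")  # pretend the lookup failed


def via_ocr(hwnd: int) -> Optional[str]:
    return "text recognised from a screenshot"


if __name__ == "__main__":
    print(first_text(0, [via_window_message, via_accessibility, via_ocr]))
```

Ordering the list from cheapest and most accurate to slowest (OCR last) keeps the expensive method as a last resort.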
For AutoHotkey, AccViewer, which makes use of Acc.ahk, is the best method of first resort. Acc works on a large variety of controls and also on elements within controls, so it can cover far more control types than AutoHotkey's ControlGet command.
Acc Library [AHK_L] (updated 09/27/2012) - Scripts and Functions - AutoHotkey Community
https://autohotkey.com/board/topic/77303-acc-library-ahk-l-updated-09272012/
Accessible Info Viewer - Alpha Release (2012-09-20) - Scripts and Functions - AutoHotkey Community
https://autohotkey.com/board/topic/77888-accessible-info-viewer-alpha-release-2012-09-20/
A link describing some further text retrieval methods:
AutoHotKey ControlGet
Note also:
COM (Component Object Model) is handled natively by AutoHotkey. It can be used to retrieve the text from web elements in Internet Explorer, and via VBA code, text can be retrieved from MS Office programs such as MS Excel and MS Word.
Related
Sorry for the noob question, I just want some pointers on what I need to learn to be able to achieve such tasks.
I want to know what skill set and tools I will need to automate the control of any particular piece of software. My goal is to simplify tasks, similar to creating a macro.
However, I understand that a lot of macro programs use screen x and y coordinates, and I believe a better method would be reading memory with the help of something like Cheat Engine. Is that the tool I will need, or is there an alternative which suits the task better?
Basically I want, let's say, a C# WinForms app with certain buttons which will help me execute a series of commands. It would be similar to a game bot program, but made for office-related tasks rather than games, such as opening files, basic editing and closing.
for example:
open excel
open file xyz
read cell B4 value (perhaps I can use that value elsewhere and display it on the WinForm, or even grab it, do some further calculation in C# and throw it back into Excel)
move to F1 and enter value 1234
save file
exit excel
I'm basically looking for a way to make a macro for any program, not just Excel, but without the downside of using x and y coordinates, because if the program window moves for any reason, the macro would malfunction.
Therefore, is reading the memory of the program considered the best solution? That way I could interact with files, data, and commands for any program, with the intention of doing some desktop automation.
"Reading memory of the program" is entertaining, but not necessarily a reliable way to automate anything.
It is useful for cheats because a person who cheats is normally willing to go a long way to make the cheat work: they will check for the correct versions, get a particular patch/version of the cheat, disable all protection features and so on.
For general macro creation it is much less useful, as you'll run into built-in barriers (like security ACLs, address space layout randomization (ASLR), auto updates, or JIT compilation) and explicitly created ones (like debugger protection). I believe normal people are less likely to disable such features just to make a macro work...
I'm going to develop a small windows application using C# .NET in VS 2010. The app should read the personnel's data and fill a card layout's fields and then user can click the print button in order to print the card. What is the best solution for printing the card and displaying it to the user?
Like all things in programming, it depends on how much work you want to do. In our app (not sure if I am allowed to post a link, so better not) we take the data from the user in a fairly standard form and then use standard graphics-style calls to draw the card. This same code can then draw either into an image control for showing to the user OR to a printer device to produce the final output. We have (several) abstraction layers so that the calls for drawing into either type of output are the same.
In general we have found it much more productive to develop our own custom solutions rather than rely on a reporting component. The custom solution is easier to change and in most cases the functionality actually required takes only a day or so of work.
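A minimal sketch of that "same drawing code, different output" idea, written in Python for brevity. All names here (`Surface`, `RecordingSurface`, `draw_card`, the field names) are hypothetical and not taken from the app described above; in WinForms the shared parameter would typically be a `Graphics` object obtained either from an image or from the print event.

```python
from dataclasses import dataclass, field
from typing import List, Protocol, Tuple


class Surface(Protocol):
    """Anything the card can be drawn on (screen preview, printer, ...)."""

    def draw_text(self, x: int, y: int, text: str) -> None: ...


@dataclass
class RecordingSurface:
    """Stand-in for a real output device: it only records the drawing calls."""

    name: str
    calls: List[Tuple[int, int, str]] = field(default_factory=list)

    def draw_text(self, x: int, y: int, text: str) -> None:
        self.calls.append((x, y, text))


def draw_card(surface: Surface, person: dict) -> None:
    """Lay out one card; the same code runs for every kind of surface."""
    surface.draw_text(10, 10, person["name"])
    surface.draw_text(10, 30, "ID: " + person["id"])
    surface.draw_text(10, 50, person["department"])


if __name__ == "__main__":
    person = {"name": "Ada Example", "id": "0042", "department": "Engineering"}
    preview = RecordingSurface("preview")
    printer = RecordingSurface("printer")
    draw_card(preview, person)   # what the user sees on screen
    draw_card(printer, person)   # what goes to paper
    print(preview.calls == printer.calls)
```

Because the layout lives in one function, the preview cannot drift away from the printed result.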
The ReportViewer control (http://msdn.microsoft.com/en-us/library/ms251671.aspx) is a possible candidate. It is free of charge if you have Visual Studio, and it can export the report to PDF too. You can bind it to a custom data source (it does not need a database behind it), and once that is done, customizing takes minutes.
I think it's possible to somehow hook into the Windows environment (specifically explorer.exe) and trigger specific things, for example launching Control Panel and using it as if I had a mouse (meaning I'm clicking the interface from code).
Basically what I'm trying to do is automate some redundant tasks I do often; I just don't know how it's done, or even what it's called. Can anyone point me in the right direction?
Thanks!
Forget about "automated clicking". GUI tools are just front-ends to control the system. You can control the system like they do, it will be much easier.
The Microsoft Management Console gives you huge possibilities. Each "snap-in" can be accessed via COM. Some of them have GUI front-ends; find and run the "*.msc" files (somewhere in the Windows directory) to try them.
There are many command-line tools, e.g. the "net" command has extensive networking-related abilities.
PowerShell may be a better choice instead of C# or C++, it's designed for task automation. You can easily use COM, .NET, MMC ...
Windows Explorer has a COM object model that you can call from both C# and C++. (Most of the examples on MSDN are in Javascript or VBScript, which I guess aren't your languages of choice, but they demonstrate that the API is straightforward to call.)
AutoHotKey is a scripting environment specifically designed for this sort of task
If you mostly want to launch Control Panel items, you can do so using the RunDll32 interface that exists in most Control Panel applets. See http://www.osattack.com/windows-7/huge-list-of-windows-7-shell-commands/ , http://support.microsoft.com/kb/167012 or http://www.winvistaclub.com/t57.html for example. For the corresponding API see http://support.microsoft.com/kb/164787.
Another option is to use control.exe (see http://msdn.microsoft.com/en-us/library/cc144191.aspx and http://vlaurie.com/computers2/Articles/control.htm).
If you google more, you will find many more examples that let you automate a lot of things without resorting to general-purpose GUI automation.
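As a rough sketch of the control.exe route, the snippet below builds the command line for opening an applet by its .cpl file name and only launches it when running on Windows. The helper names are made up for illustration, and timedate.cpl (Date and Time) is just one example applet.

```python
import subprocess
import sys
from typing import List


def control_panel_command(applet: str) -> List[str]:
    """Build the argv for opening a Control Panel applet by its .cpl file name."""
    if not applet.lower().endswith(".cpl"):
        raise ValueError("expected a .cpl file name, got %r" % applet)
    return ["control.exe", applet]


def open_applet(applet: str) -> bool:
    """Launch the applet on Windows; do nothing elsewhere. True if launched."""
    if sys.platform != "win32":
        return False
    subprocess.Popen(control_panel_command(applet))
    return True


if __name__ == "__main__":
    print(control_panel_command("timedate.cpl"))
    open_applet("timedate.cpl")
```

Keeping the command construction separate from the launch makes it easy to log or review what would be run before actually running it.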
At more or less the lowest level within Win32, you can use the SendMessage() API to send raw click messages to the windows of interest. This relies on a lot of intimate knowledge about the apps you intend to drive. However, you could fairly easily implement a "click recorder" that replays click sequences captured from user interaction.
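A minimal sketch of what "raw click messages" look like, assuming a left click is simulated with a WM_LBUTTONDOWN / WM_LBUTTONUP pair whose lParam packs client-area coordinates (x in the low word, y in the high word). The helper names are illustrative; only `send_click` touches the Win32 API, and it refuses to run off Windows. Not every application reacts to synthesized messages, so treat this as a starting point.

```python
import sys
from typing import List, Tuple

WM_LBUTTONDOWN = 0x0201
WM_LBUTTONUP = 0x0202
MK_LBUTTON = 0x0001  # wParam flag: left button is down


def make_lparam(x: int, y: int) -> int:
    """Pack client coordinates like MAKELPARAM: x in the low word, y in the high word."""
    return ((y & 0xFFFF) << 16) | (x & 0xFFFF)


def click_messages(x: int, y: int) -> List[Tuple[int, int, int]]:
    """Return (message, wParam, lParam) tuples for one left click at (x, y)."""
    lparam = make_lparam(x, y)
    return [(WM_LBUTTONDOWN, MK_LBUTTON, lparam), (WM_LBUTTONUP, 0, lparam)]


def send_click(hwnd: int, x: int, y: int) -> None:
    """Send the click to a window handle. Windows only."""
    if sys.platform != "win32":
        raise OSError("SendMessage is only available on Windows")
    import ctypes
    from ctypes import wintypes

    send = ctypes.windll.user32.SendMessageW
    send.argtypes = [wintypes.HWND, wintypes.UINT, wintypes.WPARAM, wintypes.LPARAM]
    send.restype = ctypes.c_ssize_t  # LRESULT is pointer-sized
    for msg, wparam, lparam in click_messages(x, y):
        send(hwnd, msg, wparam, lparam)


if __name__ == "__main__":
    # A "recording" is then just a list of coordinates to replay in order.
    recording = [(10, 20), (200, 35)]
    for x, y in recording:
        print(click_messages(x, y))
```

The coordinates are relative to the target window's client area, which is why such a replay keeps working when the window is moved.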
I've inherited a C# Windows application that I'm not real crazy about. I've got a looming deadline and I'm scared to death that some of my changes might be having adverse effects on existing functionality.
I've got a hobbyist background to RoR and I'm fairly comfortable with testing in that framework (using both RSpec and Cucumber).
I love having test scripts that can be run on a regular basis and I'm willing to spend my personal time developing those for this particular project. I purchased a book from PragProg.com on scripted GUI testing with Ruby (http://pragprog.com/titles/idgtr/scripted-gui-testing-with-ruby). So far, I'm digging what I'm seeing and I think that this should work well.
Unfortunately, I've got a fundamental lack of understanding concerning Windows app development. I'm making calls to FindWindowEx (via Win32API) to "attempt" to retrieve sub-controls in my application.
A big part of my confusion is how I should retrieve the Class Name of the control that I'm trying to capture. The example provided in the text is as follows:
edit = find_window_ex.call #main_window, 0, 'ATL:00434310', nil
Where #main_window is my application's main window handle, and 'ATL:...' is the class of a text box area. There is no explanation given as to how the author arrived at 'ATL:...'.
I've read some very old posts concerning MS's Spy++, but those seem to be obsolete (or for some reason it wasn't installed when I installed VS2010).
So, what's the best way for me to find control classes to be used with the FindWindowEx call? I do have the source code - should I be pulling from there? What if I don't have the source code and I want to automate an application? Is there a utility that allows you to somehow "browse" controls on a running application?
Sorry for the length - thanks in advance for the help!
Bob
The best option is to install the components that include Spy++. This is the best way I know of to get the actual class names, especially if you do not have the source for the original controls, which might come from a library or possibly from some standard ActiveX controls that Microsoft ships.
The ATL class name is probably for controls developed using the Microsoft Active Template Library (ATL). This is a C++ template library which significantly simplifies the development of ActiveX controls, COM objects, etc. in C++.
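If installing Spy++ is not an option, a similar listing can be produced from a script. The sketch below is only an assumption-laden illustration: it uses Python's ctypes with the Win32 functions EnumChildWindows and GetClassNameW to list the class name of every descendant window, and a hypothetical helper to pick handles by class-name prefix. The enumeration part runs on Windows only.

```python
import sys
from typing import List, Tuple


def child_window_classes(parent_hwnd: int) -> List[Tuple[int, str]]:
    """Return (handle, class name) for every descendant window of parent_hwnd."""
    if sys.platform != "win32":
        raise OSError("window enumeration requires Windows")
    import ctypes
    from ctypes import wintypes

    user32 = ctypes.windll.user32
    enum_proc_type = ctypes.WINFUNCTYPE(wintypes.BOOL, wintypes.HWND, wintypes.LPARAM)
    user32.EnumChildWindows.argtypes = [wintypes.HWND, enum_proc_type, wintypes.LPARAM]
    user32.GetClassNameW.argtypes = [wintypes.HWND, wintypes.LPWSTR, ctypes.c_int]

    results: List[Tuple[int, str]] = []

    def on_child(hwnd, _lparam):
        buf = ctypes.create_unicode_buffer(256)
        user32.GetClassNameW(hwnd, buf, len(buf))
        results.append((hwnd, buf.value))
        return True  # keep enumerating

    user32.EnumChildWindows(parent_hwnd, enum_proc_type(on_child), 0)
    return results


def handles_with_class_prefix(children: List[Tuple[int, str]], prefix: str) -> List[int]:
    """Hypothetical helper: handles whose class name starts with the given prefix."""
    return [hwnd for hwnd, cls in children if cls.startswith(prefix)]


if __name__ == "__main__":
    # Made-up sample data in the shape child_window_classes() returns.
    sample = [(1001, "ATL:00434310"), (1002, "Button"), (1003, "Edit")]
    print(handles_with_class_prefix(sample, "ATL:"))
```

On Windows, the main window handle passed to `child_window_classes` could come from FindWindow, as in the book's examples; the printed class names are the strings FindWindowEx expects.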
My main language is VB/C# .NET and I'd like to make a console program, but with a menu system.
If any of you have worked with "DOS"-like programs or iSeries from IBM, then that's the style I am going for.
So, I was wondering if anyone knows of a "WinForms" library that will make my form look like this. I don't mind a "fake WinForms look" or a console application, but that's the look I'd like.
I've used iSeries extensively and I remember exactly what you're talking about. To simulate this look and feel in a C# app, you'll want to create a console project and write text to different areas of the screen with the help of the Console.CursorTop and Console.CursorLeft properties, then calling Console.Write or Console.WriteLine to write out the text in the previously set position. To change colors, before calling WriteLine you'll want to use the Console.ForegroundColor and Console.BackgroundColor properties.
You'll need to listen for input and upon finding a tab character, your program can use its own internal logic to determine where the cursor should appear next (on the next line in the same column, for instance, to simulate those left columns of input fields in your screenshot).
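A small sketch of the same two ideas (positioned writes and Tab-driven field order), in Python rather than C#. Note the swapped-in technique: instead of Console.CursorTop and Console.CursorLeft, it emits ANSI escape sequences, which assumes a terminal that understands them. The field layout and function names are invented for illustration.

```python
from typing import List, Tuple

CSI = "\x1b["  # ANSI Control Sequence Introducer


def move_to(row: int, col: int) -> str:
    """Escape sequence that moves the cursor to a 1-based row and column."""
    return f"{CSI}{row};{col}H"


def render_form(fields: List[Tuple[int, int, str]]) -> str:
    """Clear the screen, then write each label at its (row, col) position."""
    out = [CSI + "2J"]
    for row, col, label in fields:
        out.append(move_to(row, col) + label)
    return "".join(out)


def next_field(current: int, count: int) -> int:
    """Index of the field that gets focus after Tab, wrapping at the end."""
    return (current + 1) % count


if __name__ == "__main__":
    # A left column of input labels, one per line, as on an iSeries screen.
    form = [(2, 4, "Customer . . :"), (3, 4, "Order no. . :"), (4, 4, "Quantity  . :")]
    print(render_form(form), end="")
    focus = 0
    focus = next_field(focus, len(form))  # Tab pressed once: second field
    row, col, label = form[focus]
    print(move_to(row, col + len(label) + 1), end="", flush=True)
```

Building the whole screen as one string and writing it once tends to flicker less than many small positioned writes.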
Doing this with a Windows Forms app will be a little trickier and you'd definitely want to write your own control for it (possibly sub-classed from one of the many types of standard multi-line text controls already available).
It's a good question. For many Use Cases the standard Windows (or other windowing) paradigm can be overkill, intimidating, and confusing.
Back in DOS days there were a number of "Windowing" libraries that created various abstractions for doing this.
[After Googling]
Here's a site that lists various libraries, including several that appear to be of interest.
A resource like this would also be handy for Mobile apps, where mouse-driven window apps tend to be not the best fit, especially for workflow-type processes. The Console is a pretty universal lowest-common-denominator abstraction available in most every environment.
You are looking for a curses-like library, but for Windows, and usable from VB and C#.
Curses provides for an even richer text-based UI than iSeries. All sorts of widgetry!
Windows is not really supportive of text interfaces, whether on purpose or not, so you are out of luck.
But ...
Well, how about MonoCurses? I don't know if it will work though. Also look at PDCurses.
And if you don't mind using Python for just the front-end see this.
There are a couple of webifiers or screen-scraping programs for iSeries that will create a web or Windows user interface on top of your iSeries application. I have never used any of those, so there is not a particular one that I can recommend, but you might want to look there for inspiration or reuse.