Sorry for the noob question, I just want some pointers on what I need to learn to be able to achieve such tasks.
I want to know what skill-set and tools I will need to automate control any particular software. My goal is to simplify tasks which is similar to creating a micro.
However, I understand a lot of macro programs uses screen x and y coordinates, but I believe a better method would be reading memories with the help such as cheat engine perhaps? is that the tool which I will need? or there is alternative which suites the tasks better?
basically I want let say, a C# winform perhaps, with certain buttons which will help me execute a series of commands. It would be similar to a game bot program but not made for games, but for other office related tasks. such as open files, basic editing and close.
for example:
open excel
open file xyz
read cell B4 value(perhaps I can use that value elsewhere and display it on the winform or even grab and do some further calculation in C# and throw it back into excel)
move to F1 and enter value 1234
save file
exit excel
I'm basically looking for a way to make a macro any program not just excel, but without the downside of using x and y coordinates because if program window moves by any chance, it would cause the macro to malfunction.
Therefore, is reading memory of the program consider the best solution? So I can interact with files, data, and commands for any program with the intention to do some desktop usage automation.
"reading memory of the program" is entertaining, but not necessary reliable way to automate anything.
It is useful for cheats as person who cheat normally willing to go long way to make cheat work, so will check for correct versions, get particular cheat patch/version, disable all protection features and so on.
For general macro creation it is much less useful as you'll go against built in barriers (like security ACLs, address space randomization - ASR, auto updates, or JIT compilation) and explicitly created ones (like debugger protection). I believe normal people are less likely to restrict features to make macro to work...
Related
I am struggling to find a reliable way to get the content/text of the window that is currently in the foreground. It should be able to determine the text from every possible program that a user is currently using, if possible
What I tried:
Take a screenshot of the currently active window, apply some filters and run an OCR algorithm (tesseract .Net wrapper). This works, but takes a long time and is not very accurate.
Then I tried some Windows API functions (FindWindow and SendMessage), as described here. I could make it run for the standard Editor (notepad) for example, but not for most other programs
I also tried to make it work with AutoHotKey and the WinGetText function and again a .Net Wrapper. Here, I just get some info about the window, but in no way the text of it...
Unfortunately, now, I don't have any other idea what to do as I am stuck in every way... Does someone have experience with this or knows a way that works? Any suggestion is really much appreciated
It will be difficult to find a single solution to retrieve text from applications. Different methods for different programs will be required.
For AutoHotkey, AccViewer, which makes use of Acc.ahk is the best method of first resort. Acc works on a large variety of controls and also elements within controls, it can cover far more control types than AutoHotkey's ControlGet command.
Acc Library [AHK_L] (updated 09/27/2012) - Scripts and Functions - AutoHotkey Community
https://autohotkey.com/board/topic/77303-acc-library-ahk-l-updated-09272012/
Accessible Info Viewer - Alpha Release (2012-09-20) - Scripts and Functions - AutoHotkey Community
https://autohotkey.com/board/topic/77888-accessible-info-viewer-alpha-release-2012-09-20/
A link describing some further text retrieval methods:
AutoHotKey ControlGet
Note also:
COM (Component Object Model), is handled natively by AutoHotkey. It can be used to retrieve the text from web elements in Internet Explorer, and via VBA code, text can be retrieved from MS Office programs such as MS Excel and MS Word.
i need to block any screen capture software on the computer from taking screen shots. Since all of them are work on standard API-functions, i think i could monitor and block them.
I need to use C#.
All i have found is how to monitor and block them in a certain program (screen capture program). They are looking for a function in the program, then they change it address on mine function address.
But how can i do it, if i haven't any certain programs? I need to block anyone which tries to take a screenshot.
If your final goal is possible or not I don't know, but for the hooking the API portion I can help you out.
I have used the library EasyHook many times in the past, this will let you hook and intercept system function calls from C# code fairly easily. Just read through the PDF tutorial for setup instructions.
For actually finding the API's I recommend Rohitab's API Monitor, it's still in Alpha stages but it works really well and is free. You just hook it on to a processes and it tells you every external DLL call it makes (with the parameters it passed if you have the xml definition file for the DLL, the program comes with almost all of the windows API dll's pre-defined).
The combination of EasyHook and API Monitor is a great 1-2 punch for mucking with other program's calls.
It is not possible to prevent screenshots from being taken. The battle is already lost because of the DWM (Desktop Window Manager). It's lower level than Win32 and device contexts.
If you want to protect the text in your program, there are a lot easier ways to extract it than doing screenshots and OCR. TextOut and/or Direct2D hooking and accessibility APIs.
If there's a lot of IP in your program. Then don't make it all available onscreen. Make sure it's tedious to crawl the GUI for text, and hard to automate it. And don't load whole texts in memory of the program.
Possible solutions:
1. To prevent copying of text. Draw the text as an image.
2. To prevent accessibility technologies, like screen readers - override WndProc in your control, handle and ignore the window message WM_GETOBJECT.
3. To make it harder if they try to use OCR. Draw graphics behind the text. Human readable, but much harder for a machine to interpret it.
Neither of these methods are invasive for the user.
** A very invasive suggestion **:
If you are really serious about preventing anyone from "stealing" your content.
Implement mouse and keyboard hooks. Filter out typical copy shortcuts. Prevent the mouse from leaving the boundaries of your application.
Allow your application to only run when the OS runs well-known processes and services.
If any process starts which you don't recognize, black out the application and notify the user about it, and request the user to close it. And ofc make sure someone is not just spoofing a well-known process.
Monitor the clipboard as you suggested yourself.
You can ofc soften some of these suggestions based on the context of your application.
As Scott just posted it likely can be prevented with API hooks to see that paint events only go to desktop bound handles and not others, and refuse to paint otherwise. However, you need to consider the following scenarios and see if they're relevant threat to your approach or not:
Your software may be running in a virtual machine like VMWare. Such software has capapbilities to capture screen that does so at "virtual hardware" level, and your API hooks will not be able to discern it - and this would be the easiest way approach if I wanted to bypass your protections.
As a post suggests here, nothing also prevents someone to take monitor cable and plug it into another computer's capture card, and take screenshot that way. Again, your hooks will be helpless here.
Bottom line, you can make it somewhat harder to do, but bypassing such protection may be pretty trivial thing to do.
My 2c.
I'm writing a small c# program, I don't want the final user to take screenshots while using my program, is it possible? Or even if he takes one, how can I know it?
Thanks in advance and sorry if this is a poor-content question due to my lack of experience in c# coding.
You can create a system-wide keyboard hook using the low-level keyboard filter and cancel any printscreen keyboard combination. But if someone has also installed a helper application (like Gadwin or something) it'll become a lot more difficult because you won't know beforehand what keyboard shortcut you should catch (most tools allow to specify your own hooks).
Here's an article on using hooks in C#
and here's a ready-made keyboard hook library for .net that uses global mouse and keyboard hooks (use Google to find more freeware and commercial libraries and tools).
On a side note: it's generally not preferred to change the system behavior. Screenshots are system behavior and serve a distinguished purpose for trouble shooting. If you prevent this, users will not be able to show you a screenshot of something wrong. But if you must do it, you can do it.
EDIT: on a deeper level, you can install an API hook. All screenshot applications use API calls to get the content of a (part of) the screen. But API hooks are hard to get right. A more trivial way is probably by writing a user-level driver. While you can prevent all this, it is really worth all the trouble?
You might want a keyboard hook. But it'll tell you if the user pressed the "print screen" key, not if someone programmatically take a screenshot using some GDI function.
I doubt it's possible to prevent all the ways of taking a screenshot.
General answer: No. It's not possible to detect this - especially from C#. There are dozens of ways to take screenshot and even applications written in C++/WinAPI can only detect some of them, but not all.
Also consider - what if user is running your app in virtual machine? He'll be able to take screenshots at host machine and you can do absolutely nothing to detect (not even prevent) this.
all.
i wanna write a tool for GUI automation which can locate text label on current screen(the absolute location) so that i can drive the mouse cursor to click on it.
The signature of the needed function should like this:
Point GetTextCoordination(string text)
Any one have an idea how implement this? I don't wanna use the OCR or computer vision technology for performance issue. Is the hooking of the TextOut win32api function a feasible way?
I don't think that hooking the TextOut function is a feasible solution (though it's certainly possible). You don't have any guarantee that the text you want to find was drawn using this function. Trying to use OCR would be similarly fraught with difficulties.
I suspect that for your purposes, it would be sufficient to enumerate the windows of the target application (using GetWindow and related functions) and examing the text of each (using GetWindowText) to look for the one you want. That would give you a window handle and from that you could get the window boundaries or send it a message directly.
You want to use a GUI automation toolkit, such as the UIAutomation library or the white library (which is a wrapper around UIAutomation), or AutoIT.
(Alternatively there are commercial tools for this - if you are looking into setting up a test automation program, then you'd be better off with one of the commercial tools, as they have a lot of features that make this kind of thing easier.)
I think it's possible to somehow hook with the windows environment (specifically explorer.exe) and trigger specific things, for example launching control panel and using it as if I had mouse (meaning I'm clicking the interface from the code).
Basically what I'm trying to do is automate some redundant tasks I do often, just I don't know how it's done, or even how it's called. Anyone can point me in right direction?
Thanks!
Forget about "automated clicking". GUI tools are just front-ends to control the system. You can control the system like they do, it will be much easier.
Huge possibilities can give you Microsoft Management Console. Each "snap-in" can be accessed via COM model. Some of them have GUI front-ends, find and fire "*.msc" files (somewhere in Windows directory) to try them.
There is many command line tools i.e. "net" command has huge abilities related to networking.
PowerShell may be a better choice instead of C# or C++, it's designed for task automation. You can easily use COM, .NET, MMC ...
Windows Explorer has a COM object model that you can call from both C# and C++. (Most of the examples on MSDN are in Javascript or VBScript, which I guess aren't your languages of choice, but they demonstrate that the API is straightforward to call.)
AutoHotKey is a scripting environment specifically designed for this sort of task
If you want mostly to launch control panel you can do using RunDll32 interface existing in the most control panel applets. See http://www.osattack.com/windows-7/huge-list-of-windows-7-shell-commands/ , http://support.microsoft.com/kb/167012 or http://www.winvistaclub.com/t57.html for example. For the corresponding API see http://support.microsoft.com/kb/164787.
Another option is usage of control.exe (see http://msdn.microsoft.com/en-us/library/cc144191.aspx and http://vlaurie.com/computers2/Articles/control.htm).
If you google more you will find much more examples which you can to automate a lot of things without using of some general ways to automate GUI.
At more or less the lowest level within Win32, you can use the SendMessage() API to send raw click messages to windows of interest. This will rely on a lot of intrusive knowledge about the apps you intend to drive. However, you could easily implement a "click recorder" that could replay click sequences captured from user interaction.