While talking with a friend over Yahoo Messenger, I told him it would be really cool to make a bot that answers with generic messages when someone starts a conversation. Thinking about it afterwards, I realized it would be quite interesting to actually build something like that. The problem is that I don't know much about Win32.
So my question is this: how do you 'link' a process to both another process and the Windows environment? The goal would be to have an application running in the background which periodically queries which windows are open and, when a new Yahoo Messenger conversation window appears, sends a list of keystroke events to that window.
I could use either C# or VC++ for the programming part and I can use any help: either specific answers or tips that could help me - e.g.: what to google for. So far my google research only came up with some apps/dlls/code that do that for you and some scripting stuff and I'm not exactly searching for that. I want to do all the work myself so I can learn from it.
It seems like you basically want to control other applications.
There are roughly two ways to do this on Windows:
1 - Use the low-level Windows API to blindly fire keyboard and mouse events at your target application.
The basic way this works is using the Win32 SendInput function, but there's a ton of other work you have to do first, such as finding window handles.
2 - Use a higher level UI automation API to interact with the application in a more structured manner.
The best (well, newest anyway) way to do this is using the Microsoft UI Automation API, which shipped in Windows Vista and 7 (it's available on XP as well). Here's the MSDN starter page for it.
We use the Microsoft UI Automation API at my job for automated UI testing of our apps, and it's not too bad. Beware though: no matter how you choose to solve this problem, it is fraught with peril, and whether or not it works at all depends on the target application.
Good luck
Not quite the same domain as what you're looking for, BUT this series of blog posts will tell you what you need to know (and some other cool stuff).
http://www.codingthewheel.com/archives/how-i-built-a-working-poker-bot
If you really want to learn everything from scratch, then you should use C++ and native Win32 API functions.
If you want to play a bit with C#, then you should look at the pinvoke.net site and the Managed Windows API project.
What you'll surely need is the Spy++ tool.
http://pinvoke.net/ seems to be the website you are looking for. The site explains how to use Windows API functions in higher level languages. Search on pinvoke for any of the functions I've listed below and it gives you the code necessary to be able to use these functions in your application.
You'll likely want to use the FindWindow function to find the window in which you're interested.
You'll need the process ID, so use GetWindowThreadProcessId to grab it.
Next, you'll need to use OpenProcess to allow reading of the process's memory.
Afterwards, you'll want to use ReadProcessMemory to read the process's memory and see what is happening inside it.
Lastly, you'll want to use the PostMessage function to send key presses to the window handle.
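To make the steps above a bit more concrete, here is a minimal sketch of the FindWindow, GetWindowThreadProcessId and PostMessage part using Python's ctypes (the same calls are what you would P/Invoke from C#). It skips the OpenProcess/ReadProcessMemory steps, since they aren't needed just to send keys. The helper names keystroke_messages and send_text are made up for illustration, and posting WM_CHAR to a top-level window is an assumption: many applications only react when the message goes to the child control that has focus, so treat this as a starting point rather than something guaranteed to work with a given messenger window.

```python
import ctypes
import sys

WM_CHAR = 0x0102  # Win32 message carrying one translated character


def keystroke_messages(text):
    """Return one (message, wParam, lParam) tuple per character of text."""
    return [(WM_CHAR, ord(ch), 0) for ch in text]


def send_text(window_title, text):
    """Post text to the first top-level window with exactly this title.

    Returns the owning process ID, or None if no such window was found.
    Hypothetical helper for illustration; Windows only.
    """
    if sys.platform != "win32":
        raise OSError("this sketch only works on Windows")

    user32 = ctypes.WinDLL("user32", use_last_error=True)
    # Declare signatures so handles are not truncated on 64-bit Python.
    user32.FindWindowW.argtypes = [ctypes.c_wchar_p, ctypes.c_wchar_p]
    user32.FindWindowW.restype = ctypes.c_void_p
    user32.GetWindowThreadProcessId.argtypes = [
        ctypes.c_void_p, ctypes.POINTER(ctypes.c_ulong)]
    user32.GetWindowThreadProcessId.restype = ctypes.c_ulong
    user32.PostMessageW.argtypes = [
        ctypes.c_void_p, ctypes.c_uint, ctypes.c_size_t, ctypes.c_ssize_t]
    user32.PostMessageW.restype = ctypes.c_int

    hwnd = user32.FindWindowW(None, window_title)  # None = any window class
    if not hwnd:
        return None

    pid = ctypes.c_ulong()
    user32.GetWindowThreadProcessId(hwnd, ctypes.byref(pid))

    # PostMessage queues each message and returns without waiting.
    for message, wparam, lparam in keystroke_messages(text):
        user32.PostMessageW(hwnd, message, wparam, lparam)
    return pid.value
```

Spy++ is the easiest way to find out the real title and class of the window (or child control) you should be targeting.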
Welcome to the wonderful world of Windows API programming.
Check out Autohotkey. This is the fastest way to do what you want.
I'm writing a small C# program and I don't want the end user to take screenshots while using it. Is that possible? Or, if they do take one, how can I detect it?
Thanks in advance and sorry if this is a poor-content question due to my lack of experience in c# coding.
You can create a system-wide keyboard hook using the low-level keyboard filter and cancel any Print Screen key combination. But if someone has also installed a helper application (like Gadwin or something similar), it becomes a lot more difficult, because you won't know beforehand which keyboard shortcut you should catch (most tools let you specify your own shortcuts).
Here's an article on using hooks in C#
and here's a ready-made keyboard hook library for .net that uses global mouse and keyboard hooks (use Google to find more freeware and commercial libraries and tools).
On a side note: it's generally not a good idea to change system behavior. Screenshots are system behavior and serve a distinct purpose for troubleshooting. If you prevent them, users will not be able to show you a screenshot of something that went wrong. But if you must do it, you can.
EDIT: on a deeper level, you can install an API hook. All screenshot applications use API calls to get the content of (part of) the screen. But API hooks are hard to get right. A more trivial way is probably writing a user-level driver. While you can prevent all this, is it really worth all the trouble?
You might want a keyboard hook. But it will only tell you that the user pressed the "Print Screen" key, not that someone programmatically took a screenshot using some GDI function.
I doubt it's possible to prevent all the ways of taking a screenshot.
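For what it's worth, here is a rough sketch of the low-level keyboard hook idea in Python with ctypes (from C# you would P/Invoke the same functions). It only notices the Print Screen key itself, which is exactly the limitation described above. The names is_print_screen and watch_print_screen and the block parameter are made up for this example; the constants and the KBDLLHOOKSTRUCT layout are the documented Win32 ones.

```python
import ctypes
import sys

WH_KEYBOARD_LL = 13    # low-level keyboard hook id
WM_KEYDOWN = 0x0100
WM_SYSKEYDOWN = 0x0104  # key pressed while Alt is held
VK_SNAPSHOT = 0x2C      # virtual-key code of Print Screen


def is_print_screen(message, vk_code):
    """True if this hook event is a Print Screen key press."""
    return message in (WM_KEYDOWN, WM_SYSKEYDOWN) and vk_code == VK_SNAPSHOT


class KBDLLHOOKSTRUCT(ctypes.Structure):
    _fields_ = [
        ("vkCode", ctypes.c_ulong),
        ("scanCode", ctypes.c_ulong),
        ("flags", ctypes.c_ulong),
        ("time", ctypes.c_ulong),
        ("dwExtraInfo", ctypes.c_size_t),
    ]


def watch_print_screen(on_detect, block=False):
    """Call on_detect() on every Print Screen press; optionally swallow it.

    Hypothetical helper; runs a message loop until the thread gets WM_QUIT.
    """
    if sys.platform != "win32":
        raise OSError("this sketch only works on Windows")
    from ctypes import wintypes

    user32 = ctypes.WinDLL("user32", use_last_error=True)
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    HOOKPROC = ctypes.WINFUNCTYPE(
        ctypes.c_ssize_t, ctypes.c_int, ctypes.c_size_t, ctypes.c_ssize_t)

    kernel32.GetModuleHandleW.argtypes = [ctypes.c_wchar_p]
    kernel32.GetModuleHandleW.restype = ctypes.c_void_p
    user32.SetWindowsHookExW.argtypes = [
        ctypes.c_int, HOOKPROC, ctypes.c_void_p, ctypes.c_ulong]
    user32.SetWindowsHookExW.restype = ctypes.c_void_p
    user32.CallNextHookEx.argtypes = [
        ctypes.c_void_p, ctypes.c_int, ctypes.c_size_t, ctypes.c_ssize_t]
    user32.CallNextHookEx.restype = ctypes.c_ssize_t
    user32.UnhookWindowsHookEx.argtypes = [ctypes.c_void_p]
    user32.GetMessageW.argtypes = [
        ctypes.POINTER(wintypes.MSG), ctypes.c_void_p,
        ctypes.c_uint, ctypes.c_uint]

    def handler(n_code, w_param, l_param):
        if n_code == 0:  # HC_ACTION: l_param points at a KBDLLHOOKSTRUCT
            info = ctypes.cast(
                l_param, ctypes.POINTER(KBDLLHOOKSTRUCT)).contents
            if is_print_screen(w_param, info.vkCode):
                on_detect()
                if block:
                    return 1  # non-zero result stops further processing
        return user32.CallNextHookEx(None, n_code, w_param, l_param)

    callback = HOOKPROC(handler)  # keep a reference while the hook is active
    hook = user32.SetWindowsHookExW(
        WH_KEYBOARD_LL, callback, kernel32.GetModuleHandleW(None), 0)
    if not hook:
        raise ctypes.WinError(ctypes.get_last_error())

    msg = wintypes.MSG()
    try:
        # The hook is only called while this thread pumps messages.
        while user32.GetMessageW(ctypes.byref(msg), None, 0, 0) > 0:
            pass
    finally:
        user32.UnhookWindowsHookEx(hook)
```

Anything that grabs the screen through GDI, a remapped hotkey, or a virtual machine host never passes through this hook, so it is a deterrent at best.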
General answer: No. It's not possible to detect this - especially from C#. There are dozens of ways to take screenshot and even applications written in C++/WinAPI can only detect some of them, but not all.
Also consider: what if the user is running your app in a virtual machine? They'll be able to take screenshots on the host machine, and you can do absolutely nothing to detect (let alone prevent) this.
I'm writing an app (in C#) which, as part of its job, must simulate and send some keystrokes to another application. I'm using the http://inputsimulator.codeplex.com/ project for simulating keys, and it works in many applications, but in some it doesn't, e.g. Mortal Kombat 4.
I've googled it and found many answers varying from 'it's impossible' to 'you must use XXX library' etc. Those answers scared me a lot, and nearly convinced me that I wouldn't be able to do it at all, BUT...
Microsoft's virtual keyboard works. It works in ALL applications. So it IS possible... Do any of you clever guys know how I can achieve this?
OK, I think I finally got it to work. I used API Monitor, recommended by Neal P, and it showed only minimal differences between the OSK calls and mine. A bit later I tried making my calling thread sleep for a short time between sending the key-press and key-release messages, and that was it.
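The fix described here (pausing between key-down and key-up) can be sketched like this in Python. Note that this is not the asker's C# code: for brevity it uses the older keybd_event function instead of SendInput, and the names key_events, play and win32_send_key as well as the 50 ms default hold time are assumptions made up for the example.

```python
import ctypes
import sys
import time

KEYEVENTF_KEYUP = 0x0002  # keybd_event flag: this is a key release
MAPVK_VK_TO_VSC = 0       # MapVirtualKey mode: virtual key -> scan code


def key_events(vk_codes, hold_seconds=0.05):
    """Expand virtual-key codes into down / wait / up steps."""
    steps = []
    for vk in vk_codes:
        steps.append(("down", vk))
        steps.append(("wait", hold_seconds))  # the pause that made it work
        steps.append(("up", vk))
    return steps


def play(steps, send_key, sleep=time.sleep):
    """Run the steps, calling send_key(vk, is_down) for each key event."""
    for kind, value in steps:
        if kind == "wait":
            sleep(value)
        else:
            send_key(value, kind == "down")


def win32_send_key(vk, is_down):
    """Inject one key event with keybd_event (Windows only)."""
    if sys.platform != "win32":
        raise OSError("this sketch only works on Windows")
    user32 = ctypes.WinDLL("user32")
    user32.MapVirtualKeyW.argtypes = [ctypes.c_uint, ctypes.c_uint]
    user32.MapVirtualKeyW.restype = ctypes.c_uint
    user32.keybd_event.argtypes = [
        ctypes.c_ubyte, ctypes.c_ubyte, ctypes.c_ulong, ctypes.c_size_t]
    user32.keybd_event.restype = None
    scan = user32.MapVirtualKeyW(vk, MAPVK_VK_TO_VSC) & 0xFF
    flags = 0 if is_down else KEYEVENTF_KEYUP
    user32.keybd_event(vk, scan, flags, 0)
```

Usage would be something like play(key_events([0x41]), win32_send_key) to tap the A key. Games that poll the keyboard once per frame can miss a press that is released immediately, which is a plausible reason why the delay matters.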
Although you were able to achieve your purpose, the way you achieved it does not fundamentally answer your question: How to simulate keyboard input in ALL applications?
There's a bunch of situations where the common user mode Microsoft API already mentioned does not work, like game applications that use the DirectInput API or protected games.
I have built a library that can help in these situations. It provides a simple C API that internally communicates with device filter drivers. It can send input to DirectInput-based games and can also bypass some game protections. I have checked, and it is still able to bypass some known game protections when using the x64 version of the library, because game protections commonly hook only the x86 system APIs. Well, at least as of now, 18 February 2012, this is what I'm seeing.
Take a look at SendKeys on MSDN
I'm trying to automate a hidden .NET application from another .NET application (written in C#) in the easiest way possible. It's NOT for testing purposes; it's a way to make up for the lack of scripting in this application.
I already tried the White framework, but there is one major problem with it: the way it works. It's slow and it doesn't work on hidden windows and controls (unlike the WinAPI). What's more, when "clicking", White moves the mouse, brings its target window to the front and so on.
I was also thinking about using a user32.dll wrapper, because the way it handles its target is what I need, but I've read it doesn't work with .NET applications. It would also be awkward to work with, because my target application has 5 buttons labeled "...", and it would be really hard to find the 2 I need. I would also like to use the controls' .NET IDs (the names the developer gave the controls when designing the GUI).
BTW, my target application is MeGUI, if that helps. We do a lot of video encoding and a tool like this would help us a lot. I need MeGUI to be hidden because I'm the only programmer; the others using my tool shouldn't see what happens in the background, not to mention the many windows popping up all around.
You can add a reference to the exe from your project and then create an AppDomain to run its main method. From there, it should be possible to queue delegates to its main thread's loop. With a bit of reflection, you could have those delegates invoke the click events and whatnot directly.
I've never attempted this approach, but it should work.
You should try Stephen's idea instead of scripting a hidden app. A .NET Windows Forms app (EXE) is still a .NET assembly, which means you can use it the same way as a DLL: just add a reference and use the public classes.
If you still want to try some scripting, take a look at the "Microsoft UI Automation" API and the "System.Windows.Automation" namespace.
Nice article here: http://msdn.microsoft.com/en-us/magazine/cc163465.aspx
MSDN Doc: http://msdn.microsoft.com/en-us/library/system.windows.automation.aspx
I think it's possible to somehow hook into the Windows environment (specifically explorer.exe) and trigger specific things, for example launching Control Panel and using it as if I had a mouse (meaning I'm clicking the interface from code).
Basically, what I'm trying to do is automate some repetitive tasks I do often; I just don't know how it's done, or even what it's called. Can anyone point me in the right direction?
Thanks!
Forget about "automated clicking". GUI tools are just front-ends to control the system. You can control the system like they do, it will be much easier.
The Microsoft Management Console gives you a lot of possibilities. Each "snap-in" can be accessed via COM. Some of them have GUI front-ends; find and run the "*.msc" files (somewhere in the Windows directory) to try them.
There are many command-line tools, e.g. the "net" command has extensive networking-related abilities.
PowerShell may be a better choice than C# or C++, since it's designed for task automation. You can easily use COM, .NET, MMC ...
Windows Explorer has a COM object model that you can call from both C# and C++. (Most of the examples on MSDN are in Javascript or VBScript, which I guess aren't your languages of choice, but they demonstrate that the API is straightforward to call.)
AutoHotKey is a scripting environment specifically designed for this sort of task
If you mostly want to launch Control Panel items, you can do so through the RunDll32 interface that most Control Panel applets expose. See http://www.osattack.com/windows-7/huge-list-of-windows-7-shell-commands/ , http://support.microsoft.com/kb/167012 or http://www.winvistaclub.com/t57.html for examples. For the corresponding API see http://support.microsoft.com/kb/164787.
Another option is to use control.exe (see http://msdn.microsoft.com/en-us/library/cc144191.aspx and http://vlaurie.com/computers2/Articles/control.htm).
If you google some more, you will find many more examples that let you automate a lot of things without resorting to general-purpose GUI automation.
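As a small illustration of both options, here is a sketch that builds and launches the command lines from Python. The function names are made up for this example, and appwiz.cpl (Programs and Features) is just one applet picked to demonstrate; check the links above for the applet you actually need.

```python
import subprocess
import sys


def control_command(applet):
    """Command line that opens a Control Panel applet via control.exe."""
    return ["control.exe", applet]


def rundll32_command(applet):
    """Same applet, opened through the RunDll32 Control_RunDLL entry point."""
    return ["rundll32.exe", "shell32.dll,Control_RunDLL", applet]


def open_applet(applet, use_rundll32=False):
    """Launch the applet and return the Popen object (Windows only)."""
    if sys.platform != "win32":
        raise OSError("Control Panel applets only exist on Windows")
    if use_rundll32:
        command = rundll32_command(applet)
    else:
        command = control_command(applet)
    return subprocess.Popen(command)
```

For example, open_applet("appwiz.cpl") should bring up Programs and Features without any simulated clicking.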
At more or less the lowest level within Win32, you can use the SendMessage() API to send raw click messages to windows of interest. This will rely on a lot of intrusive knowledge about the apps you intend to drive. However, you could easily implement a "click recorder" that could replay click sequences captured from user interaction.
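A rough sketch of the replay half of such a "click recorder", in Python with ctypes: the mouse position is packed into lParam the same way the MAKELPARAM macro does, and a button-down/button-up pair is sent per recorded click. The names make_lparam, click_messages and replay are invented for this example, and it assumes you already have the handle of the exact child window that should receive the click, with coordinates relative to that window's client area.

```python
import ctypes
import sys

WM_LBUTTONDOWN = 0x0201
WM_LBUTTONUP = 0x0202
MK_LBUTTON = 0x0001  # wParam flag: left button is held down


def make_lparam(x, y):
    """Pack client coordinates like MAKELPARAM: x in the low word, y high."""
    return ((y & 0xFFFF) << 16) | (x & 0xFFFF)


def click_messages(x, y):
    """Return the (message, wParam, lParam) pair that makes one left click."""
    lparam = make_lparam(x, y)
    return [
        (WM_LBUTTONDOWN, MK_LBUTTON, lparam),
        (WM_LBUTTONUP, 0, lparam),
    ]


def replay(hwnd, recorded_clicks):
    """Send each recorded (x, y) click to the window handle (Windows only)."""
    if sys.platform != "win32":
        raise OSError("this sketch only works on Windows")
    user32 = ctypes.WinDLL("user32")
    user32.SendMessageW.argtypes = [
        ctypes.c_void_p, ctypes.c_uint, ctypes.c_size_t, ctypes.c_ssize_t]
    user32.SendMessageW.restype = ctypes.c_ssize_t
    for x, y in recorded_clicks:
        for message, wparam, lparam in click_messages(x, y):
            # SendMessage blocks until the target window has handled it.
            user32.SendMessageW(hwnd, message, wparam, lparam)
```

Whether the target reacts depends on the application; some controls check the real cursor position or input state and ignore synthetic messages like these.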
We are tasked with basically emulating a browser to fetch web pages, with the aim of automating tests on different pages. This will be used for (ideally) console-ish applications that run in the background and generate reports.
We tried going with .NET and the WatiN library, but it is built on a marshalled IE, so it lacked many features that we had to hack in with calls to unmanaged native code. At the end of the day, IE is neither thread safe nor process safe, many of the needed features could only be implemented by changing registry values, and the whole thing was terribly inflexible.
Proxy support
JavaScript support- we have to be able to parse the actual DOM after any javascript has executed (and hopefully an event is raised to handle any ajax calls)
Ability to save entire contents of page including images FROM THE loaded page's CACHE to a separate location
ability to clear cookies/cache, get the cookies/cache, etc.
Ability to set headers and alter post data for any browser call
Process and/or thread safe would be ideal
And for the love of drogs, an API that isn't completely cryptic
Languages acceptable C++, C#, Python, anything that can be a simple little background application that is somewhat bearable and doesn't have a completely "untraditional" syntax like Ruby.
From my own research, and believe me I am terrible at google searches, I have heard good things about WebKit... would the Qt module QtWebKit handle all these features?
You might try one of these:
http://code.google.com/p/spynner/
http://code.google.com/p/pywebkitgtk/
I know you mentioned you don't like Ruby syntax (neither do I), but I just have to chime in and say that Watir is probably the best thing out there for what you are trying to do.
EDIT: There appears to be a Java counterpart called Watij
I've only been digging into this recently myself, so I couldn't say that this does everything you've listed, but check out GeckoFx.
From the site: GeckoFX is an open-source component which makes it easy to embed Mozilla Gecko (Firefox) into any .NET Windows Forms application. Written in clean, fully commented C#, GeckoFX is the perfect replacement for the default Internet Explorer-based WebBrowser control.
As for my own impressions: it has blown away the default .NET WebBrowser in both performance and stability.