How to ScrollIntoView when the browser is minimized through Selenium and C#

How can I scroll to elements while my browser is minimized?
Currently I have this code:
IWebElement scrollnextpage = driver.FindElement(By.XPath("//a[" + x + "][contains(@class, 'paged-nav-item')]"));
js.ExecuteScript("arguments[0].scrollIntoView({behavior: 'smooth', block: 'center'});", scrollnextpage);
This works fine, but when I minimize my browser it stops working.
Any solutions?

Answering straight, the Browser Client shouldn't be kept minimized while you initiate the test execution. When you use Selenium to execute your program/script, Selenium needs the focus on the Browser Client which renders the HTML DOM.
Why browser shouldn't be minimized?
Software Test Automation is an art. Test Execution must be performed in a controlled environment for optimized performance.
Particularly when your @Tests are Selenium based, test execution should be conducted with the Viewport maximized because of the following reasons:
At the lowest level, the behavior of actions class is intended to mimic the remote end's behavior with an actual input device as closely as possible, and the implementation strategy may involve e.g. injecting synthesized events into a browser event loop. Therefore the steps to dispatch an action will inevitably end up in implementation-specific territory. However there are certain content observable effects that must be consistent across implementations. To accommodate this, the specification requires that remote ends perform implementation-specific action dispatch steps, along with a list of events and their properties. This list is not comprehensive; in particular the default action of the input source may cause additional events to be generated depending on the implementation and the state of the browser (e.g. input events relating to key actions when the focus is on an editable element, scroll events, etc.).
Moreover,
An activation trigger generated by the WebDriver API user needs to be indistinguishable from those generated by a real user interacting with the browser. In particular, the dispatched events will have the isTrusted attribute set to true. The most robust way to dispatch these events is by creating them in the browser implementation itself. Sending OS-specific input messages to the browser's window has the disadvantage that the browser being automated may not be properly isolated from a user accidentally modifying input source state. Use of an OS-level accessibility API has the disadvantage that the browser's window must be focused, and as a result, multiple WebDriver instances cannot run in parallel.
An advantage of an OS-level accessibility API is that it guarantees that inputs correctly mirror user input, and allows interaction with the host OS if necessary. This might, however, have performance penalties from a machine utilisation perspective.
Additionally,
Robot Class is used to generate native system input events for the purposes of test automation, self-running demos, and other applications where control of the mouse and keyboard is needed. The primary purpose of Robot is to facilitate automated testing of Java platform implementations. Using the class to generate input events differs from posting events to the AWT event queue or AWT components in that the events are generated in the platform's native input queue. For example, Robot.mouseMove will actually move the mouse cursor instead of just generating mouse move events.
Finally, as per Internet Explorer and Native Events:
As the InternetExplorerDriver is Windows-only, it attempts to use so-called "native", or OS-level events to perform mouse and keyboard operations in the browser. This is in contrast to using simulated JavaScript events for the same operations. The advantage of using native events is that it does not rely on the JavaScript sandbox, and it ensures proper JavaScript event propagation within the browser. However, there are currently some issues with mouse events when the IE browser window does not have focus, and when attempting to hover over elements.
Browser Focus:
The challenge is that IE itself appears to not fully respect the Windows messages we send the IE browser window (WM_MOUSEDOWN and WM_MOUSEUP) if the window doesn't have the focus. Specifically, the element being clicked on will receive a focus window around it, but the click will not be processed by the element. Arguably, we shouldn't be sending messages at all; rather, we should be using the SendInput() API, but that API explicitly requires the window to have the focus. We have two conflicting goals with the WebDriver project.
First, we strive to emulate the user as closely as possible. This means using native events rather than simulating the events using JavaScript.
Second, we want to not require focus of the browser window being automated. This means that just forcing the browser window to the foreground is suboptimal.
Conclusion
Always keep the Browser maximized while initiating the test execution.
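As a rough illustration of that advice, the sketch below maximizes the window from the test itself before scrolling. It assumes Selenium's .NET bindings (the OpenQA.Selenium namespace) and a driver that also implements IJavaScriptExecutor, as ChromeDriver does. The names ScrollHelper, NavItemXPath, and ScrollToNavItem are hypothetical, and the locator is the one from the question with @class in the XPath.

```csharp
using OpenQA.Selenium;

public static class ScrollHelper
{
    // Builds the XPath from the question for the paging link at the given position.
    public static string NavItemXPath(int index)
    {
        return "//a[" + index + "][contains(@class, 'paged-nav-item')]";
    }

    // Maximizes the browser window first, then scrolls the paging link into view.
    // Assumption: the driver instance can be cast to IJavaScriptExecutor.
    public static IWebElement ScrollToNavItem(IWebDriver driver, int index)
    {
        driver.Manage().Window.Maximize();

        IWebElement link = driver.FindElement(By.XPath(NavItemXPath(index)));
        ((IJavaScriptExecutor)driver).ExecuteScript(
            "arguments[0].scrollIntoView({behavior: 'smooth', block: 'center'});",
            link);
        return link;
    }
}
```

A test would call ScrollHelper.ScrollToNavItem(driver, x) wherever the original two lines were used.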

Related

C# forward mouse events to another window without losing focus

My app screencaptures another window that runs on a second monitor. Now I'd also like to forward mouse clicks made in my app to that window. I tried using SendMessage in user32.dll for this, but this also makes window focus switch, which causes some issues, like the two windows rapidly fighting for focus. Is there a way to place those mouse events without making the hidden window active and losing focus on the main app?
Is there a way to place those mouse events without making the hidden window active and losing focus on the main app?
No, there is not even a way to forward mouse input to another receiver. Messages are only part of the input processing. The system also does internal bookkeeping and you cannot replicate that.
The only reliable way to inject input is by calling SendInput. Doing so doesn't allow you to specify a receiver. Input goes to whichever thread is determined to be the receiver by the system.
Although, more often than not, this question is asked when the problem that needs to be solved is a different one altogether: How do you automate a UI? The answer to that question is UI Automation.

c# detect which messagebox (originated in another app) button has been pressed

Another application displays a messagebox (with a unique text inside it), user chooses Yes/No.
How to detect what he pressed in c#? (best in .Net up to 3.5). I could do polling with FindWindowEx (on another thread) but how to detect what button had been pressed? Also I don't think polling is the best way to do the job.
I need to know what the user has chosen in another app, so I can react accordingly in my own app. I don't have access to the other app's source code. Also to make it clear I don't want to click any of the buttons myself. I'm not afraid of a bit of c++, winapi and pinvoke
To monitor UI events in another application you can use UI Automation. To solve your specific problem you need to subscribe to a particular event (see Subscribing to UI Automation Events). To do so call IUIAutomation::AddAutomationEventHandler with a UIA_Invoke_InvokedEventId Event Identifier.
While UI Automation can be used to solve your problem, it is an assistive technology, mainly to enable accessibility needs and automated UI testing.
You could use either Anonymous or Named Pipes or WCF(Windows Communications Foundation).

Is it possible to create a touch application to interact with another application, "sharing" focus between the two?

What I am trying to do is have a helper application that a user can use touch input to affect a second application. I have been able to send keystrokes to the second application, but the problem I am having is when I want to hold a button down.
For example, in my application I want to be able to hold down a button which would simulate a ctrl key down. And while this button is touched, I want to be able to interact with the second application. And if the user lets go of the button, then the ctrl key is released. I can kind of get this working, except when the user does anything on the second application, the button that was held down is unpressed (because the other application gained focus).
I don't care if I have to go WPF or windows forms, just as long as I can get it working. Windows 8 or 8.1 only is acceptable as well (all clients will be 8.1).
Any help would be appreciated!
Note I added to a comment below.
The second application is one I haven't created; it could be anything really. A scenario would be my application having a ctrl button that you could press and hold, for example, and in Outlook click a link. Or pressing and holding a shift button in my app, while drawing with a pen in Photoshop to draw a straight line. I am able to send keystrokes, but just can't handle the "hold" touch command.
Since it's been so long, I'm creating a new answer. I did the research, and I'm pretty sure I know what's going on. But I'm going to mention all the official resources I examined before coming to my conclusion.
Possible packaged solutions
First off, the new Windows Input Simulator might fix all your troubles right out of the box. If you need the Windows API, which I'll be talking about below, check PInvoke.net first to see if they have documentation for the call you're trying to make.
The Windows API way
The best place to start is the User Interaction article on MSDN. There's a bunch of new Win8 Touch APIs there, but you're probably interested in the legacy Keyboard input article.
Every window for an application must have a window procedure (a.k.a. WindowProc) that's responsible for reacting to messages it cares about (e.g. a button click, a message indicating the window needs to draw its GUI, or the WM_QUIT event that alerts it to gracefully dispose of the resources held by the window). This procedure is also responsible for handling messages from input devices, like mouse-clicks and keys on the keyboard.
In your case, you're more interested in making the Window think there's a message from the keyboard when there isn't. That's what the SendInput API call is for; it lets you insert an array of INPUT messages, be they keyboard, mouse, or other input device directly into the queue, bypassing the need for the user to physically act. This easy API call specifically accepts MOUSEINPUT, KEYBDINPUT, or HARDWAREINPUT messages.
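A minimal P/Invoke sketch of such a call for the "hold Ctrl" case might look like the following. The struct layouts and constants mirror the Win32 definitions of INPUT, KEYBDINPUT, INPUT_KEYBOARD, KEYEVENTF_KEYUP, and VK_CONTROL; the class name KeyInjector and the helpers BuildKey and Send are hypothetical. Send only works on Windows, and the injected key goes to whichever window currently has keyboard focus.

```csharp
using System;
using System.Runtime.InteropServices;

public static class KeyInjector
{
    public const uint INPUT_KEYBOARD = 1;
    public const uint KEYEVENTF_KEYUP = 0x0002;
    public const ushort VK_CONTROL = 0x11;

    [StructLayout(LayoutKind.Sequential)]
    public struct MOUSEINPUT
    {
        public int dx; public int dy; public uint mouseData;
        public uint dwFlags; public uint time; public IntPtr dwExtraInfo;
    }

    [StructLayout(LayoutKind.Sequential)]
    public struct KEYBDINPUT
    {
        public ushort wVk; public ushort wScan;
        public uint dwFlags; public uint time; public IntPtr dwExtraInfo;
    }

    [StructLayout(LayoutKind.Sequential)]
    public struct HARDWAREINPUT
    {
        public uint uMsg; public ushort wParamL; public ushort wParamH;
    }

    // The three members overlap, like the anonymous union in the native INPUT struct.
    [StructLayout(LayoutKind.Explicit)]
    public struct InputUnion
    {
        [FieldOffset(0)] public MOUSEINPUT mi;
        [FieldOffset(0)] public KEYBDINPUT ki;
        [FieldOffset(0)] public HARDWAREINPUT hi;
    }

    [StructLayout(LayoutKind.Sequential)]
    public struct INPUT
    {
        public uint type;
        public InputUnion u;
    }

    [DllImport("user32.dll", SetLastError = true)]
    private static extern uint SendInput(uint nInputs, INPUT[] pInputs, int cbSize);

    // Builds a keyboard INPUT record for a key press or a key release.
    public static INPUT BuildKey(ushort virtualKey, bool keyUp)
    {
        INPUT input = new INPUT();
        input.type = INPUT_KEYBOARD;
        input.u.ki.wVk = virtualKey;
        input.u.ki.dwFlags = keyUp ? KEYEVENTF_KEYUP : 0u;
        return input;
    }

    // Injects one key event; returns true if the system accepted it.
    public static bool Send(ushort virtualKey, bool keyUp)
    {
        INPUT[] inputs = { BuildKey(virtualKey, keyUp) };
        return SendInput((uint)inputs.Length, inputs, Marshal.SizeOf(typeof(INPUT))) == 1;
    }
}
```

With this in place, the touch button's press handler could call KeyInjector.Send(KeyInjector.VK_CONTROL, false) and its release handler KeyInjector.Send(KeyInjector.VK_CONTROL, true).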
For the keyboard, you'll get a message when a key is pressed (WM_KEYDOWN) and when it is released (WM_KEYUP), so to determine hotkeys like CTRL+C, you have to watch for a WM_KEYDOWN message for the letter C that was received after a WM_KEYDOWN for the CTRL key but before its WM_KEYUP message.
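The CTRL+C bookkeeping described here can be sketched as a tiny state tracker. The message and virtual-key values are the documented Win32 constants; the class HotkeyTracker and its Process method are hypothetical names, and the sketch ignores details such as key repeat and left/right control keys.

```csharp
public class HotkeyTracker
{
    public const int WM_KEYDOWN = 0x0100;
    public const int WM_KEYUP = 0x0101;
    public const int VK_CONTROL = 0x11;
    public const int VK_C = 0x43; // virtual-key code for the letter C

    // True between a WM_KEYDOWN and the matching WM_KEYUP for the CTRL key.
    private bool ctrlDown;

    // Feed keyboard messages in arrival order; returns true when CTRL+C is detected.
    public bool Process(int message, int virtualKey)
    {
        if (virtualKey == VK_CONTROL)
        {
            if (message == WM_KEYDOWN) ctrlDown = true;
            else if (message == WM_KEYUP) ctrlDown = false;
            return false;
        }

        return message == WM_KEYDOWN && virtualKey == VK_C && ctrlDown;
    }
}
```

A window procedure would forward each keyboard message to Process and react when it returns true.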
Managing input device messages
To simulate input devices, use SendInput to inject key-down and key-up events; the system delivers them as WM_KEYDOWN and/or WM_KEYUP message(s) to whichever window has keyboard focus. SendInput does not take a target window, and an application can have more than one window. There are API calls to get the different windows, but it'll be up to you to write code to find the right one and make sure it has focus before you call SendInput.
To find out what a window believes about an input device, use GetAsyncKeyState. You may not be able to trust it if you've meddled with APIs related to input devices.
There is a BlockInput call which denies all messages except SendInput calls from the thread which blocked it. In most cases, re-enabling input as soon as possible is the right thing. The documentation says that if the blocking thread dies, BlockInput is disabled. A similar but less harsh call is EnableWindow, which prevents a window from receiving input focus.
The API for windows includes the ability to register hooks, which let you specify kinds of messages and/or certain windows to be reviewed by a user-specified function.
I would really like to know why you need this to be in two different applications, but here's the best I can think of.
In the applications, you should be able to subscribe to KeyDown, KeyUp, Focus, and Blur (lost focus). I'm not clear on whether this is an actual button or touch input, but whatever the case may be, assume KeyDown is whatever event fires when the user is "simulating" the ctrl key being pressed, and KeyUp is whatever event fires when the user ceases to "simulate" the ctrl key being down.
Set up App1 so when it gains focus, it communicates the state to App2: depressed, or not depressed. Every time KeyDown or KeyUp fires, send a message to App2.
When App1's Blur event fires, stop sending messages to App2. Even though App1 will no longer have the button depressed, App2 won't know it and can continue to behave as though the button was depressed until App1 regains focus and can go back to sending messages again.
If it were me, I would have App2 have all the same logic as App1, so the moment App2 gets Focus, it begins handling the up/down state itself. You may want to have the two applications do some kind of "handshake" when a blur/focus event happens to make sure the state is preserved when switching between them. When App2 gets the Blur event, it transfers the state to App1 and they shake hands again, so App1 knows it's now responsible for managing the state.
This is basically having the apps cooperate via "tag-team." They keep some state synchronized between each other, "handing off" the responsibility when the blur/focus events fire. Since you cannot know that Blur will fire on one app before Focus fires on the other, you will need to use the same mechanism that communicates the state of this "simulated button" to coordinate the apps so they never interfere with each other.
Something tells me that this doesn't completely solve your problem, but hearing why it doesn't will certainly get everyone closer to thinking out the rest of the way. Let me know the twist ending, eh?

Silverlight IsolatedStorageFile.IncreaseQuotaTo

Msdn doc for IsolatedStorageFile.IncreaseQuotaTo states that:
To increase the quota, you must call this method from a user-initiated event, such as in an event handler for a button-click event. When you call the IncreaseQuotaTo method, the common language runtime in Silverlight presents a dialog box for the user to approve the request. If the user declines the request, this method returns false and the quota remains the same size.
How does Silverlight know that the method was called from a user-initiated event like a button click and not from some other thread?
More specifically: What is a user initiated event? Is there any way to overcome this limitation?
And another question:
I do some automatic downloads of files when user first accesses my application, but I don't want the user to press "Download" and then when I detect more space is needed call IncreaseQuota and have the "Silverlight dialog" appearing asking for more space.
I want to start the download automatically (not user initiated), and if I detect more space is needed, call IncreaseQuota and hence have the "Silverlight dialog" appear. (No user pressing download).
After much digging, I did find out what a user-initiated event is. It seems the MSDN docs specify what a user-initiated event is in the section related to "Events Overview", but there's no link between the documentation of IsolatedStorageFile.IncreaseQuotaTo and Events Overview.
So a user initiated event according to the definition is:
Silverlight enforces that certain operations are only permitted in the context of a handler that handles a user-initiated event. The following is a list of such operations:
Setting IsFullScreen.
Showing certain dialogs. This includes SaveFileDialog, OpenFileDialog, and the print dialog displayed by PrintDocument.Print.
Navigating from a HyperlinkButton.
Accessing the primary Clipboard API.
Silverlight user-initiated events include the mouse events (such as MouseLeftButtonDown), and the keyboard events (such as KeyDown). Events of controls that are based on such events (such as Click) are also considered user-initiated.
API calls that require user initiation should be called as soon as possible in an event handler. This is because the Silverlight user initiation concept also requires that the calls occur within a certain time window after the event occurrence. In Silverlight 4, this time window is approximately one second.
User-initiated event restrictions also apply to usages of JavaScript API for Silverlight.
When Silverlight is in full-screen mode, some input events are deliberately limited for security reasons, although this can be mitigated for out-of-browser applications using elevated trust. For more information, see Full-Screen Support.
Although I don't see "IncreaseQuotaTo" inside the list of "operations", I'm guessing they just forgot it, since the behavior/limitations are the same as the ones described in the doc.
I was curious how exactly Silverlight knows what a user-initiated event is, but after digging through the .NET Framework source code I hit a dead end:
if ((browserService == null) || !browserService.InPrivateMode())
{
//..
}
return false; //means that IncreaseQuota will fail
where browserService.InPrivateMode is:
[SecuritySafeCritical]
public bool InPrivateMode()
{
bool privateMode = false;
return (NativeMethods.SUCCEEDED(UnsafeNativeMethods.DOM_InPrivateMode(this._browserServiceHandle, out privateMode)) && privateMode);
}
where DOM_InPrivateMode is declared with [DllImport("agcore")], which according to Microsoft is confidential :(
So it looks like I won't find out soon how they're detecting user initiated events.
Thinking more about it, I guess Microsoft didn't want a user to have many tabs open in a browser and then poof: I automatically call IncreaseQuotaTo.
The IncreaseQuotaTo prompt is a browser modal dialog. This means you can't navigate to other browser tabs while it is active.
So if the user has now moved from my page to the tab with google.com, and if I would be able to call IncreaseQuotaTo with a delay, the user might think that google.com is asking for more storage :).
This would be a security breach indeed.
Had they implemented this with a page level dialog, then that would have been probably more easily hacked (or worked around).
So all in all, thinking of it, I'm starting to see why they implemented it like this and why these limitations exist.
The documentation isn't incomplete.
If I do this... button_click(..) { new UserControl() }... Does this still count as a user initiated event?
Yes. But what has that little bit of extra code really achieved?
What I've personally never experimented with is exactly what constitutes a user event; in other words, is a mouse-over considered a user event? This will be very simple for you to try, and there are a multitude of other things you can experiment with. If necessary you could have a splash screen pop up that welcomes the user and they have to click on it to dismiss it, at which point you make the request. It may seem a bit corny, but you can get away with things like this if you present it well.
Note that the prompt is a one-time thing. If you prompt the user and they accept, that storage is persisted for your application between visits, which means you don't need to prompt them again the next time they use your control, your quota is still increased from last time (unless the user has deliberately deleted it, which they can do by right clicking on the Silverlight control and then going to the Application Storage tab).
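To make the splash-screen idea concrete, here is a rough sketch of the quota request that the click handler could make. It uses the Quota, AvailableFreeSpace, and IncreaseQuotaTo members of IsolatedStorageFile; the class QuotaHelper and its two methods are hypothetical names, and the sizing rule (ask only for the shortfall) is an assumption rather than anything the documentation prescribes.

```csharp
using System.IO.IsolatedStorage;

public static class QuotaHelper
{
    // Returns the quota to request, or the current quota if enough space is already free.
    public static long RequiredQuota(long currentQuota, long availableFreeSpace, long bytesNeeded)
    {
        if (bytesNeeded <= availableFreeSpace)
        {
            return currentQuota;
        }
        return currentQuota + (bytesNeeded - availableFreeSpace);
    }

    // Call this directly from a user-initiated handler, such as the splash screen's Click,
    // so the request falls inside the time window Silverlight allows.
    public static bool EnsureSpace(long bytesNeeded)
    {
        using (IsolatedStorageFile store = IsolatedStorageFile.GetUserStoreForApplication())
        {
            long target = RequiredQuota(store.Quota, store.AvailableFreeSpace, bytesNeeded);
            if (target == store.Quota)
            {
                return true; // no prompt needed
            }
            // Shows the approval dialog; returns false if the user declines.
            return store.IncreaseQuotaTo(target);
        }
    }
}
```

If EnsureSpace returns true, the automatic download can start; if it returns false, the application would need to fall back to whatever it does when storage is short.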

Send keys to WPF Browser control

Can I programmatically send [UserID]{TAB}[Password]{CARRIAGE RETURN} to a WebBrowser control which has a user ID field, a password field and a Sign-in button? I want to use my own virtual keyboard in my application. Any tips here?
Sorry for the late answer but I've just finished a similar project and as part of the work am in the process of open sourcing two projects to Codeplex.
The first is the Windows Input Simulator which is a simple .NET wrapper around the Win32 SendInput written in C#.
The second is a very customisable on screen keyboard or touch screen keyboard control and toolkit called WpfKB and will be available as an initial release tomorrow. Hope these are of help to you or anyone else who comes across the projects.
I recently had to implement automatic authentication through a WPF browser control, and I looked into simulating keystrokes. I didn't need a full virtual keyboard so interacting with the DOM of the login page through IHTMLDocument2 ended up being the best approach, but I looked into keystroke automation before making that decision and found a few options.
You can raise the appropriate routed events on the control as described in Simulating basic keyboard events and Simulating text input. I don't know of any specific problems with this approach but I opted against it simply because I wasn't comfortable simulating input without looking at how the CLR handles the actual input, and without at least raising the complete lifetime (PreviewKeyDown, KeyDown, PreviewKeyUp, KeyUp) I was wary of unintended consequences.
Take a look at WOSK on CodePlex. It's a good example of how to invoke Win32 keybd_event and SendInput functions to generate the low-level input messages via Managed Windows API to simulate input. There's some unnecessary fluff (eg transparency) and some odd WPF usage, such as using a CommandParameter with a Click event instead of a Command on the buttons, but the general approach is sane and it's reasonably complete.
You can also invoke the windows on-screen keyboard as alluded to by Jeroen. I didn't try this because I didn't need a virtual keyboard, but if you're going to call into Win32 anyway, you might as well follow the WOSK model and build the UI the way you want it.
