I've received an interesting requirement for my Selenium tests where I need my code to RDP into a server, open up a browser and then run my tests on that.
Initially I thought Browserstack or even Selenium Grid, but the requirement is to actually open the RDP session and run tests through that.
Is there a way to achieve this? I wanted to try using something like Microsoft UI Automation to open the RDP session, but then my Selenium tests would just run locally after starting up RDP, right? I'm having some trouble getting started on this and can't seem to find a good place to start.
The RDP window is basically an image of a desktop and Selenium cannot control the web browser through it. You would need to deploy your Selenium tests on the remote machine and run them from there.
This is theoretically possible, but not exactly in the way you are describing it. You seem to be saying that you want to interact with things inside an RDP window as if the RDP window were a standard browser or native app and the things inside it were elements in a DOM: running Selenium on a host machine, opening an RDP session, and then clicking within the RDP window from the original host machine.
It is impossible to connect like this (in the manner this approach would require) directly over RDP, if for no other reason than security. For example, you may have watched people on YouTube infect VMs with crazy viruses to show off their destructive effect without hurting the computer they are RDP'ed in from. But there seem to be ways to communicate over TCP connections via RDP, which would facilitate other ways to interact with the remote machine.
Additionally, you could just SSH/Enter-PSSession into the machine, and set up a communication channel between the machines outside of the RDP session. If you had a custom built application on the machine being remoted into, you could use System32 libraries to tell the primary machine x y coordinates of things to click on. Then, from the primary machine, you can maximize the RDP window, and click on the provided X Y coordinates.
I suspect that this would successfully send the click all the way through down to the VM, in the manner you are suggesting. If Appium cannot send the click down in this manner, you may need to develop your own abstraction of the user32.dll to perform clicks using the core Windows logic for keyboards and mouse clicks.
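As a rough illustration of the side-channel idea above: a helper on the remote machine could report where an element is, and the primary machine could read those coordinates before clicking inside the maximized RDP window. The sketch below is in Python and covers only the coordinate exchange over a plain TCP socket. The JSON message format and all function names are made up for this example, the loopback address stands in for the remote machine, and the actual click is left out entirely.

```python
import json
import socket
import threading


def encode_click(x, y):
    """Serialize a click target as one newline-terminated JSON message."""
    return (json.dumps({"x": int(x), "y": int(y)}) + "\n").encode("utf-8")


def decode_click(line):
    """Parse one message back into an (x, y) tuple."""
    data = json.loads(line.decode("utf-8"))
    return int(data["x"]), int(data["y"])


def serve_one_click(server_sock, x, y):
    """Remote side: accept one connection and report a single coordinate pair."""
    conn, _ = server_sock.accept()
    with conn:
        conn.sendall(encode_click(x, y))


def fetch_click(host, port, timeout=5.0):
    """Primary side: connect and read one coordinate pair."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with sock.makefile("rb") as stream:
            return decode_click(stream.readline())


def demo():
    # Loopback stands in for the remote machine in this demo.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    worker = threading.Thread(target=serve_one_click, args=(server, 640, 360))
    worker.start()
    try:
        return fetch_click("127.0.0.1", port)
    finally:
        worker.join()
        server.close()


if __name__ == "__main__":
    print(demo())
```

In a real setup the coordinates would still have to be translated from the remote desktop's coordinate space to wherever the RDP window sits on the primary screen, which is part of why this approach gets expensive quickly.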
All in all, this is an INSANELY in-depth project that accomplishes (seemingly) nothing for the effort. I would push back on whoever is giving you this requirement, by investigating some of the things I mentioned above (along with any other advice), and explaining the cost versus the return on investment.
Hopefully, they will just let you use Selenium Grid to communicate with VMs or attached devices, push out applications, and test them using industry-standard approaches.
Here is an interesting and relevant read: https://superuser.com/questions/130552/tunneling-a-tcp-ip-connection-through-remote-desktop-connection
We have a C# tool (that I wrote) that records online broadcasts taking place in a custom-written Flash app (that we wrote). (There are no DRM or copyright issues here.)
We've coded up a system whereby this tool is installed on a Windows Server 2012 R2 Amazon AWS instance. After we boot the instance, the tool loads, waits for the right time to start recording, launches a browser and passes the command line argument of the URL to access the broadcast. The browser will then load the Flash app and the interview audio and video will start arriving at the browser instance on AWS.
By way of a virtual audio cable driver, screen/audio capture DirectShow filters and ffmpeg, a screen recording is taken. The C# tool calls ffmpeg, and ffmpeg will record the screen reliably for the entire interview; then the tool shuts the whole thing down.
The problem I'm having is that both Chrome and the Electron browser sometimes simply don't draw themselves on the screen, so all ffmpeg ends up recording is a blank desktop and the audio of the broadcast (hence, the browser IS running).
We found this out when recordings started turning up with X hours of merely recording the windows desktop and the tool's main window with a countdown timer.
A screenshotting facility was built into the tool and added to its web control interface, and this way we can test whether the browser is visible - a human looks at the screenshot of every broadcast, just after recording has started (the browser is supposed to be on show by this time)
We notice that 50% of the time, the browser isn't drawing itself on screen. By 50% I mean that every other recording that the AWS instance carries out, will be blank: AWS starts, records ok, shuts down. AWS starts again an hour later for a different broadcast, recording is blank, shuts down.. Starts/ok/shutdown. Starts/blank/shutdown. Repeat ad infinitum
What's even more strange is that if I run VNCviewer on my dev machine and connect up to an instance that is having a problem, the instant that the VNC connection is up and the remote desktop is showing on my screen, the browser suddenly appears as if nothing was ever wrong. A screenshot from before the VNC connect shows a blank desktop; connect VNC, take another screenshot and the browser is there. All through it the audio is fine - the browser connected to the broadcast is fine, for sure.
It's as though Chrome/Electron thinks "you know what, no one is looking at me so I'm not going to bother drawing myself". No screen saver is set, though the power plan has the setting "turn off the display after 15 minutes".
Perhaps Chrome/Electron have a test that amounts to "if the display is off, don't draw". I can't explain the inconsistency though - the recorder launches at least 1 hour before it's needed, and sits there idle until it's time to start the browser. You'd hence imagine that the "power off the monitor after 15 mins" setting would reliably have ensured the "monitor" is "off" by the time every recording start comes around.
This behaviour doesn't happen with any of the other browsers (but unfortunately the app doesn't and cannot work in them because it uses some weird chrome-only technology/API).
Can anyone suggest anything to look at to help debug this, or anything I can build into the C# tool to overcome the problem? Coding it up to connect to itself via VNC for a few seconds after it has launched the browser.. Well that just tastes nasty.
Naturally, as soon as I connect to the machine via VNC (rather than RDP - RDP isn't usable because the recording context is in a logged on session for a particular user) the problem goes away, which makes it frustratingly hard to debug.
I am not sure what exactly causes your problem, but it sounds like interacting with the system prevents it. One way to interact with a system is to use the keyboard, and this can be automated.
You could try sending a keystroke (like "F15") every so many seconds in C# using:
Windows Input Simulator or maybe SendKeys.Send, and
some kind of Timer to trigger it.
Maybe take a quick peek at this app called Caffeine... it presses the "F15" key for you every so many seconds. They claim "F15" generally doesn't trigger anything in Windows (since a release they made back in 2010).
Caffeine App
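To make the timer-plus-keystroke idea concrete, here is a minimal sketch. It is written in Python rather than C#, purely for illustration, and it swaps in the older `keybd_event` function from user32 (called through `ctypes`) in place of Windows Input Simulator or SendKeys. The `KeepAlive` class and `press_f15` helper are names invented for this example, and whether periodic key presses actually cure the drawing problem is untested.

```python
import ctypes
import threading

VK_F15 = 0x7E             # Windows virtual-key code for F15
KEYEVENTF_KEYUP = 0x0002  # flag marking a key release


def press_f15():
    """Send an F15 key press and release via user32 (Windows only)."""
    user32 = ctypes.windll.user32  # attribute exists only on Windows
    user32.keybd_event(VK_F15, 0, 0, 0)
    user32.keybd_event(VK_F15, 0, KEYEVENTF_KEYUP, 0)


class KeepAlive:
    """Call `send_key` every `interval` seconds on a background thread."""

    def __init__(self, interval, send_key=press_f15):
        self.interval = interval
        self.send_key = send_key
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        # Event.wait returns False on timeout and True once stop() is called.
        while not self._stop.wait(self.interval):
            self.send_key()

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()
```

Usage would be along the lines of `keeper = KeepAlive(59); keeper.start()` just before launching the browser, and `keeper.stop()` once the recording ends. Passing a different `send_key` callable makes it possible to try other kinds of input without changing the timer logic.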
I'm using the Kitware VTK library to display 2D images. I've recently begun work using the vtkWindowToImageFilter to output images in various formats. Everything was looking great until I worked at home today and I began to realize that VTK rendering doesn't seem to work when you are running software on your work machine through Remote Desktop.
When I output an image while NOT running in Remote Desktop, the image that gets output only consists of the VTK window. But when I run this same process while using Remote Desktop, the output image comes out in the correct size, but does just a normal screenshot basically, and other UI elements outside of the VTK window are showing up.
Question:
What is it about Remote Desktop and VTK that causes the differences I'm seeing? Is there anything that can be done to support outputting images from VTK windows while using Remote Desktop?
Thanks in advance!
From the VTK mailing list, I received the following response:
Remote desktop swap the video card driver hence the issue you are seeing. But if you use VNC instead, you should be good.
Hope this helps someone make the decision I had to make: whether or not to go forward with development on this feature knowing that if used remotely, the results would be unusable. I decided to go ahead with development with the assumption that our users who are in the phase of their workflow where this feature will be used are normally going to be in the office sitting in front of their work machines.
I am working on using a Raspberry Pi in an embedded project that will utilize wifi to communicate with external devices. The device should be able to act either as a standalone wifi hotspot that devices can connect to or, where an existing wifi network is present, connect to that network so that the user does not have to give up his internet connection in order to connect to the device. I plan on making the device start up in hotspot mode. The user can then use the web interface to enter the details of a network that he wants the device to connect to. Whenever the specified parameters fail to establish a connection, it defaults back to hotspot mode.
Now the technical stuff I am struggling with is that I want to implement the control software in C# running with Mono on Arch Linux on the Raspberry Pi. I am struggling to find the APIs or libraries needed to manage the Linux wifi connection. On Windows it seems as if managedwifi.codeplex.com can be used, but it does not seem to be compatible with Linux.
I could execute shell commands and then parse their outputs, but considering how crude and possibly unreliable that would be, this is obviously my last resort.
Any ideas regarding what I should do?
PS. Another thing I might consider before using shell scripts, if it makes a difference is to use Raspbian or some other distro instead.
Actually, calling shell commands from managed code is not a very bad idea. They are reliable, very well tested and mostly lightweight, and sometimes just a wrapper around functions of the kernel or other modules. This also seems to be the same method Node.js modules use when they want to access something low-level or related to networking. For example see this source code: node-wireless/node_modules/wireless/index.js
If you don't like it this way there is always "Interop". The same way that you can DllImport() libraries in Windows, you can do in Linux. See here: http://www.mono-project.com/Interop_with_Native_Libraries
IMHO the second solution isn't worth the effort. Calling shell commands is elegant and neat enough.
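For a feel of what the shell-command route looks like in practice, here is a small sketch. It uses Python rather than C# with Mono, and it assumes the `iw` tool is installed and that `iw dev <interface> link` prints a line of the form `SSID: <name>` when connected. That output format is an assumption and not a stable interface, so treat the parser as illustrative. The function names are invented for this example.

```python
import subprocess


def parse_iw_link(output):
    """Return the SSID found in `iw dev <iface> link` output, or None."""
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("SSID:"):
            return line[len("SSID:"):].strip()
    return None


def current_ssid(interface="wlan0"):
    """Run `iw` and return the connected SSID, or None if unavailable."""
    try:
        result = subprocess.run(
            ["iw", "dev", interface, "link"],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:
        return None  # `iw` is not installed on this system
    if result.returncode != 0:
        return None
    return parse_iw_link(result.stdout)


if __name__ == "__main__":
    print(current_ssid())
```

Keeping the parsing in its own function means it can be checked against captured output without touching the wifi hardware, which takes some of the sting out of the "crude and unreliable" worry.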
My Pi starts wlan0 as an AP with hostapd. When it is running as an AP, lighttpd also starts and provides a web interface to change the settings. The web interface verifies the input and writes it to a sqlite3 database. A second script then applies my changes (like adding, editing or removing wifi networks in wpa_supplicant, resetting wlan0 to be part of an existing WLAN, e.g. setting it to DHCP, telling wpa_supplicant...).
Except for lighttpd and sqlite3, all components are already on the Raspberry Pi. You don't need Mono or any C libraries.
For writing the scripts I use Python, but Perl also works (even PHP for the frontend).
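The "start as hotspot, try the configured network, fall back on failure" flow from the question could be driven by a script roughly like the one below. It is only a sketch of the control logic: `connect` and `start_hotspot` are hypothetical callables standing in for whatever actually reconfigures wpa_supplicant and hostapd, and the retry count and delay are arbitrary.

```python
import time


def join_or_fall_back(connect, start_hotspot, attempts=3, delay=0.0):
    """Try to join the configured network; fall back to hotspot mode on failure.

    `connect` should return True once the device is on the network.
    `start_hotspot` should switch the interface back to access-point mode.
    Returns "client" or "hotspot" to report the resulting mode.
    """
    for attempt in range(attempts):
        if connect():
            return "client"
        if attempt < attempts - 1:
            time.sleep(delay)  # wait before the next attempt
    start_hotspot()
    return "hotspot"
```

Because the two operations are passed in, the same function works whether they are implemented as shell commands, D-Bus calls, or something else.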
Hi all I used Selenium some time ago to create a program to carry out automated actions on a website I enjoy using.
I managed to use Selenium to do what I wanted before without much trouble; the only issue I had was using it in the background.
I couldn't use it without it affecting other things I was doing on the PC. I did think of using virtual machines, but I would like to try and avoid this.
Last night I was playing around with the WebBrowser class in C#, and it's nice but limited. I like how it was self-contained within the Windows Forms application, so this is what I am looking for.
Does anyone know the best way of integrating a visual representation of a browser within a Windows Forms application that would still allow me to mimic key entry etc. but would run in the background?
I have heard of things like WatiN, GeckoFX, MozNet etc., but from what I read I am not sure any of these would work.
In general, when you are attempting to automate a web page using a browser, you have two options for simulating user events. You can either simulate them via JavaScript, or you can use OS-level mechanisms (so-called "native events") for simulating mouse and keyboard events. Both approaches have their pitfalls.
Simulated events using JavaScript only would probably allow the window being automated to remain in the background, without system focus, while carrying out the tasks you desire. Selenium RC used this method, and Selenium WebDriver offers the ability to use simulated events for Firefox and IE. However, there are some drawbacks to this approach. Simulated events may lack the fidelity and accuracy you require. For example, "drop-down menus" on a page that work via the CSS :hover pseudo-class cannot be triggered via JavaScript, so this approach is doomed to failure in these cases. Additionally, since you're using JavaScript, you're restricted to the JavaScript sandbox, which means that cross-domain frames and the like may be strictly out of bounds.
Native events, on the other hand, are far more closely representational of a user's actual mouse and keyboard operations. In general, they'll allow the correct events to fire on the web page, and in the correct order, without you having to guess which events to fire on which elements. The downside to using them is that to implement them correctly, the window being automated must have the system focus to receive the events properly. You can attempt to hack around this using the SendMessage API if you're on Windows, but this is the Wrong Thing to do, as it's error prone and absolutely not guaranteed to work. The correct way to use native events is to use the SendInput API, but that API sends the input to the window with the system focus. WebDriver defaults to using native events for simulating user input, but it defaults to the flawed SendMessage approach. For IE, at least, it does provide an option to use the more correct SendInput approach.
If you're dead set on not requiring a browser window in the foreground, you really ought to look into a headless option. PhantomJS is a great option, and WebDriver also has a driver for it, which means you can still write your automation code in C#. Otherwise, you're limited to one of the other approaches outlined above.
Does the application need to be hosted within a window?
I have used Selenium and WatiN for automation; unfortunately, they do interfere with what you are doing, and I have not managed to find a way around this.
I have used the .Net WebBrowser class too, but for automating I am not sure without testing if it is a fully featured IE, with regard to JavaScript running inside it. I think it does execute JavaScripts though, but you would need to check.
If you do not need to see what is happening there are headless options available too, even for Selenium I think:
Is it possible to hide the browser in Selenium RC?
Here is a list of headless versions if that is viable for you:
https://gist.github.com/evandrix/3694955
From https://learn.microsoft.com/en-us/visualstudio/test/use-ui-automation-to-test-your-code?view=vs-2022
Visual Studio 2019 is the last version where Coded UI Test will be
fully available. We recommend using Playwright for testing web apps
and Appium with WinAppDriver for testing desktop and UWP apps.
Consider Xamarin.UITest for testing iOS and Android apps using the
NUnit test framework. To reduce the impact on users some minimum
support will still be available in Visual Studio 2022 Preview 4 or
later.
I've been trying to figure out how to detect this for months without any real progress. Whatever road I try while testing C++/C# code, they all turn out to be dead ends.
The problem:
I have two computers, Comp1 and Comp2. The first is remotely administered by Comp2 (via local network).
Now, on Comp1 I want to be able to detect whether Comp2 is running code that is screen scraping (specific C#/C++ functions), or taking screenshots of, this computer via the remote administration window on that machine.
Is this impossible? (if infecting Comp2 with some sort of trojan or virus is out of the question - which it definitely is!)
I can relatively easily get detailed information about the state of the remote administration itself (whether it is being administered at this time) as well as other stuff (like the IP that is administering), but I cannot detect exactly what I want.
My next step is to see what the .dll files (that the remote software is using) can tell me. My knowledge here is somewhat limited, however.
What information could I get by hooking the "video driver" that was installed and is being used by the remote administration software? Is that another dead end?
Another thing that struck me would be to monitor the actual data traffic on specific ports (relevant to the current remote software), but that should fail as well because only data of what's being sent to Comp2, or mouse/keyboard emulation being sent from the same, can be obtained (?)
I'd appreciate all ideas, suggestions or pointers to library entries (e.g. MSDN).
Thanks in advance
I am not entirely sure I understand your question, but I will try to answer it anyway.
First let me rephrase the question the way I understand it:
Is there some way the Host for a remote session can detect snapshots/screen scraping done by software running on the Client?
The short answer is "No".
A simple analogy would be to consider a camera pointed at the client machine. This camera directly records the monitor on the client machine - there is no way (unless the camera chooses to report to you by some custom interface) that you can know this is happening.
The same holds for the screen scraping software.
Screen scraping software records whatever goes on in a given machine.
The fact that some other machine is being viewed by the scraping software is not transmitted to the Host (Unless you designed the scraping software to do exactly that).
The only information that goes back to the host is what the client chooses to transmit.
Typically this is just the keyboard/mouse operations when the window showing the host is active.