How to hide an AutomationElement?

I've been working on this for a few days and would appreciate, if nothing more, some confirmation or a few hints as to where to go next on this task.
I've been tasked with hiding certain notifications within Chrome that Chrome doesn't provide a native option to hide, even in kiosk or incognito mode. The approach I have been using is Microsoft's UI Automation API to get access to these objects. That part is relatively easy, because I can find them by class name or by some of the text contained within the object, which is what I need. The challenge now is that I need to hide the element and/or its container without closing Chrome :)
Easy enough to get the main handle to Chrome:
private IntPtr GetChromeHandle()
{
IntPtr chWnd = IntPtr.Zero;
Process[] procsChrome = Process.GetProcessesByName("chrome");
foreach (Process chrome in procsChrome)
{
// the chrome process must have a window
if (chrome.MainWindowHandle == IntPtr.Zero)
{
continue;
}
else
{
chWnd = chrome.MainWindowHandle;
}
}
return chWnd;
}
I can get one of these elements pretty easily from there by doing the following:
PropertyCondition pcFullScreen = new PropertyCondition(AutomationElement.NameProperty, "F11", PropertyConditionFlags.IgnoreCase);
AutomationElement fsTest = chromeWindow.FindFirst(TreeScope.Descendants, pcFullScreen);
The challenge here is how to close that element, or how to navigate up to a higher-level container that I can close.
Alternate approach tried looks like this:
PropertyCondition pcTest = new PropertyCondition(AutomationElement.ClassNameProperty, "Intermediate D3D Window");
AutomationElement newTestElm = chromeWindow.FindFirst(TreeScope.Descendants, pcTest);
Problem here is that while I can "hide/close" by using the handle of the class, I can't seem to narrow it down to an instance of the class that contains the text I'm looking for. Any suggestions would be greatly appreciated.
Per a comment, tried to access via WindowPattern and get "null" based on this code:
private WindowPattern GetWindowPattern(AutomationElement targetControl)
{
WindowPattern windowPattern = null;
try
{
windowPattern =
targetControl.GetCurrentPattern(WindowPattern.Pattern)
as WindowPattern;
}
catch (InvalidOperationException)
{
// object doesn't support the WindowPattern control pattern
return null;
}
// Make sure the element is usable.
if (false == windowPattern.WaitForInputIdle(10000))
{
// Object not responding in a timely manner
return null;
}
return windowPattern;
}
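One way to narrow the class-name matches down to the instance that contains the text could be sketched as follows. This is a minimal sketch that has not been tested against Chrome: it assumes the text element really is a descendant of the window found by class name, and that this window exposes a native handle. ElementHider and HideByClassAndText are hypothetical names introduced for illustration.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Windows.Automation;

static class ElementHider
{
    private const int SW_HIDE = 0;

    [DllImport("user32.dll")]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool ShowWindow(IntPtr hWnd, int nCmdShow);

    // Hides the first element of the given class whose subtree contains
    // an element with the given name. Returns false if nothing matched.
    public static bool HideByClassAndText(AutomationElement root, string className, string name)
    {
        var byClass = new PropertyCondition(AutomationElement.ClassNameProperty, className);
        var byName = new PropertyCondition(AutomationElement.NameProperty, name,
                                           PropertyConditionFlags.IgnoreCase);

        foreach (AutomationElement candidate in root.FindAll(TreeScope.Descendants, byClass))
        {
            // Skip instances of the class that do not contain the text.
            if (candidate.FindFirst(TreeScope.Subtree, byName) == null)
                continue;

            // Only elements backed by a real window can be hidden this way.
            int hwnd = candidate.Current.NativeWindowHandle;
            if (hwnd == 0)
                continue;

            ShowWindow(new IntPtr(hwnd), SW_HIDE);
            return true;
        }
        return false;
    }
}
```

Usage, assuming chromeWindow was obtained with AutomationElement.FromHandle(GetChromeHandle()): ElementHider.HideByClassAndText(chromeWindow, "Intermediate D3D Window", "F11");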

Related

UIAutomation failing to find child nodes

Trying to automate IIS (InetMgr) using UIAutomation, I'm getting mixed results. I can get some of the screen elements fine, but others, even immediate child nodes, I can't get with either Find[First|All] or a TreeWalker (content|control|raw). Any suggestions on what to use to drive the UI and automate it?
Windows 10/11 desktop environment.
Here is a C# console app that dumps InetMgr's items from the "Connections" pane (2 levels max).
It must be started as Administrator, otherwise it will fail (though not immediately). In general, UIA clients must run at the same UAC level as the apps they automate.
To determine what can be retrieved from the tree, or whether something can be done at all, we can use the Inspect tool from the Windows SDK or the more recent Accessibility Insights before writing any code.
Also, I use Windows' UIAutomationClient COM object, not the old .NET one from the Windows XP era, as it misses lots of stuff.
The code iterates over all tree items recursively and expands them, if they are not already expanded, using the ExpandCollapse control pattern. This is needed because InetMgr's tree is lazily loaded: it can potentially contain hundreds of thousands of items (mapped to disk folders at some point).
class Program
{
// add a COM reference to UIAutomationClient (don't use .NET legacy UIA .dll)
private static readonly CUIAutomation8 _automation = new CUIAutomation8(); // using UIAutomationClient;
static void Main()
{
var process = Process.GetProcessesByName("InetMgr").FirstOrDefault();
if (process == null)
{
Console.WriteLine("InetMgr not started.");
return;
}
var inetmgr = _automation.ElementFromHandle(process.MainWindowHandle);
if (inetmgr == null)
return;
// note: set "Embed Interop Types" to false on the UIAutomationClient reference node, or redefine all UIA_* constants used here manually
// also: you *must* run this program as administrator to see this panel
var connections = inetmgr.FindFirst(TreeScope.TreeScope_Subtree,
_automation.CreatePropertyCondition(UIA_PropertyIds.UIA_AutomationIdPropertyId, "_hierarchyPanel"));
if (connections == null)
return;
var treeRoot = connections.FindFirst(TreeScope.TreeScope_Subtree,
_automation.CreatePropertyCondition(UIA_PropertyIds.UIA_ControlTypePropertyId, UIA_ControlTypeIds.UIA_TreeItemControlTypeId));
Dump(0, treeRoot);
}
static void Dump(int indent, IUIAutomationElement element)
{
var s = new string(' ', indent);
Console.WriteLine(s + "name: " + element.CurrentName);
// get expand/collapse pattern. All treeitem support that
var expandPattern = (IUIAutomationExpandCollapsePattern)element.GetCurrentPattern(UIA_PatternIds.UIA_ExpandCollapsePatternId);
if (expandPattern != null && expandPattern.CurrentExpandCollapseState != ExpandCollapseState.ExpandCollapseState_Expanded)
{
try
{
expandPattern.Expand();
}
catch
{
// cannot be expanded
}
}
// tree can be huge,only dump 2 levels max
if (indent > 2)
return;
var children = element.FindAll(TreeScope.TreeScope_Children, _automation.CreateTrueCondition());
for (var i = 0; i < children.Length; i++)
{
Dump(indent + 1, children.GetElement(i));
}
}
}

Selenium. Bring-up window on the front

If you run the following code, then on each iteration of the loop the browser window is brought to the front and takes focus.
public class Program
{
private static void Main()
{
var driver = new ChromeDriver();
driver.Navigate().GoToUrl("https://i.imgur.com/cdA7SBB.jpg");
for (int i = 0; i < 100; i++)
{
var ss = ((ITakesScreenshot)driver).GetScreenshot();
ss.SaveAsFile("D:/imgs/i.jpg");
}
}
}
The question is: why does this happen, and can it be turned off? Headless mode is not an option.
It seems that this always happens when Selenium needs to save / read the file or start the process.
To take a screenshot, chromedriver activates the window. It's by design and there's no option to avoid it even though it's technically possible.
For the relevant sources have a look at window_commands.cc.
You could however avoid the effect by moving the window off-screen:
driver.Manage().Window.Position = new Point(-32000, -32000);
or by launching the browser off-screen:
var options = new ChromeOptions();
options.AddArgument("--window-position=-32000,-32000");
var driver = new ChromeDriver(options); // pass the options so the argument takes effect
UPDATE
You can avoid the activation by taking the screenshot directly via the DevTools API. Here's a class to override GetScreenshot:
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Remote;
using JObject = System.Collections.Generic.Dictionary<string, object>;
class ChromeDriverEx : ChromeDriver
{
public ChromeDriverEx(ChromeOptions options = null)
: base(options ?? new ChromeOptions()) {
var repo = base.CommandExecutor.CommandInfoRepository;
repo.TryAddCommand("send", new CommandInfo("POST", "/session/{sessionId}/chromium/send_command_and_get_result"));
}
public new Screenshot GetScreenshot() {
object response = Send("Page.captureScreenshot", new JObject {{"format", "png"}, {"fromSurface", true}});
string base64 = (string)((JObject)response)["data"];
return new Screenshot(base64);
}
protected object Send(string cmd, JObject args) {
return this.Execute("send", new JObject {{"cmd", cmd}, {"params", args}}).Value;
}
}
usage:
var driver = new ChromeDriverEx();
driver.Url = "https://stackoverflow.com";
driver.GetScreenshot().SaveAsFile("/tmp/screenshot.png");
driver.Quit();
When you invoke the Navigate().GoToUrl("url") method from your automation script, it is expected that the script will interact with some of the elements on the webpage. To interact with those elements, Selenium needs focus. Hence opening the browser, bringing it to the front and giving it focus is the default behavior of Navigate().GoToUrl("url").
Now Default Mode or Headless Mode is controlled by the ChromeOption/FirefoxOptions class which is passed as an argument while initializing the WebDriver instance and will call Navigate().GoToUrl("url"). So, Navigate().GoToUrl("url") would have no impact how the WebDriver instance is controlling the Mode of Operation i.e. Default Mode or Headless Mode.
Now when you try to invoke the method from ITakesScreenshot Interface i.e. ITakesScreenshot.GetScreenshot Method which is defined as :
Gets a Screenshot object representing the image of the page on the screen.
In case of WebDriver instance which extends ITakesScreenshot, makes the best effort depending on the browser to return the following in order of preference:
Entire page
Current window
Visible portion of the current frame
The screenshot of the entire display containing the browser
There may be some instances when the browser loses focus. In that case you can use IJavaScriptExecutor to regain focus as follows:
((IJavaScriptExecutor)driver).ExecuteScript("window.focus();");
I was struggling with an issue when generic GetScreenshot() in parallel testing was causing browser to lose focus. Some elements were being removed from DOM and my tests were failing. I've come up with a working solution for Edge and Chrome 100+ with Selenium 4.1:
public Screenshot GetScreenshot()
{
IHasCommandExecutor executor = webDriverInstance as IHasCommandExecutor;
var sessionId = ((WebDriver)webDriverInstance).SessionId;
var command = new HttpCommandInfo(HttpCommandInfo.PostCommand, $"/session/{sessionId}/chromium/send_command_and_get_result");
executor.CommandExecutor.TryAddCommand("Send", command);
var response = Send(executor, "Page.captureScreenshot", new JObject { { "format", "png" }, { "fromSurface", true } });
var base64 = ((Dictionary<string, object>)response.Value)["data"];
return new Screenshot(base64.ToString());
}
private Response Send(IHasCommandExecutor executor, string cmd, JObject args)
{
var json = new JObject { { "cmd", cmd }, { "params", args } };
var command = new Command("Send", json.ToString());
return executor.CommandExecutor.Execute(command);
}

Detect if process is visible or not [duplicate]

I cannot seem to find a way to determine whether a Process has a user interface e.g. a window, which is visible to the user?
Environment.UserInteractive is not useful for external processes
process.MainWindowHandle != IntPtr.Zero appears to always return false in my tests?
I would like to differentiate between say Notepad and conhost
Find out the process ID from your Process instance.
Enumerate the top-level windows with EnumWindows.
Call GetWindowThreadProcessId and see if it matches the target PID.
Call IsWindowVisible and/or IsIconic to test if that window is visible to the user.
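The steps above can be sketched in C# with P/Invoke. This is a minimal sketch, not a definitive implementation: WindowProbe and HasVisibleWindow are hypothetical names, and treating a minimized window as "not visible" is an assumption about what the asker wants.

```csharp
using System;
using System.Runtime.InteropServices;

static class WindowProbe
{
    private delegate bool EnumWindowsProc(IntPtr hWnd, IntPtr lParam);

    [DllImport("user32.dll")]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool EnumWindows(EnumWindowsProc lpEnumFunc, IntPtr lParam);

    [DllImport("user32.dll")]
    private static extern uint GetWindowThreadProcessId(IntPtr hWnd, out uint lpdwProcessId);

    [DllImport("user32.dll")]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool IsWindowVisible(IntPtr hWnd);

    [DllImport("user32.dll")]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool IsIconic(IntPtr hWnd);

    // Returns true if the process owns at least one top-level window
    // that is visible and not minimized.
    public static bool HasVisibleWindow(int processId)
    {
        bool found = false;
        EnumWindowsProc callback = (hWnd, lParam) =>
        {
            GetWindowThreadProcessId(hWnd, out uint ownerPid);
            if (ownerPid == (uint)processId && IsWindowVisible(hWnd) && !IsIconic(hWnd))
            {
                found = true;
                return false; // stop enumerating
            }
            return true; // continue enumerating
        };
        EnumWindows(callback, IntPtr.Zero);
        GC.KeepAlive(callback); // keep the delegate alive for the duration of the native call
        return found;
    }
}
```

Usage: WindowProbe.HasVisibleWindow(process.Id), where process is the Process instance in question.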
The MSDN article about System.Diagnostics.Process.MainWindowHandle states the following
If you have just started a process and want to use its main window handle, consider using the WaitForInputIdle method to allow the process to finish starting, ensuring that the main window handle has been created. Otherwise, an exception will be thrown.
What they are implying is that the Window might take several seconds to render after you've made the call for the MainWindowHandle, returning IntPtr.Zero even though you can clearly see a Window is shown.
See https://msdn.microsoft.com/en-us/library/system.diagnostics.process.mainwindowhandle(v=vs.110).aspx for reference
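The documentation's advice can be sketched as follows: wait for the process to become idle, refresh the cached process information, and only then read the handle. This is a minimal sketch; MainWindowProbe and WaitForMainWindow are hypothetical names, and returning IntPtr.Zero for processes without a GUI is a design choice made here for illustration.

```csharp
using System;
using System.Diagnostics;

static class MainWindowProbe
{
    // Returns the main window handle once the process is idle,
    // or IntPtr.Zero if it has no window or has already exited.
    public static IntPtr WaitForMainWindow(Process process, int timeoutMs)
    {
        try
        {
            // Throws InvalidOperationException for processes without
            // a graphical interface and for processes that have exited.
            process.WaitForInputIdle(timeoutMs);

            // Discard cached values so MainWindowHandle is looked up again.
            process.Refresh();
            return process.MainWindowHandle;
        }
        catch (InvalidOperationException)
        {
            return IntPtr.Zero;
        }
    }
}
```

Usage: var handle = MainWindowProbe.WaitForMainWindow(Process.Start("notepad.exe"), 5000);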
Following @David Heffernan, this is what I did:
HWND FindTopWindow(DWORD pid)
{
std::pair<HWND, DWORD> params = { 0, pid };
// Enumerate the windows using a lambda to process each window
BOOL bResult = EnumWindows([](HWND hwnd, LPARAM lParam) -> BOOL
{
auto pParams = (std::pair<HWND, DWORD>*)(lParam);
DWORD processId;
if (GetWindowThreadProcessId(hwnd, &processId) && processId == pParams->second)
{
if (IsWindowVisible(hwnd)) {
// Stop enumerating
SetLastError(-1);
pParams->first = hwnd;
return FALSE;
}
return TRUE;
}
// Continue enumerating
return TRUE;
}, (LPARAM)&params);
if (!bResult && GetLastError() == -1 && params.first)
{
return params.first;
}
return 0;
}

Multi-threaded C# Selenium WebDriver automation with Uris not known beforehand

I need to perform some simultaneous webdrivers manipulation, but I am uncertain as to how to do this.
What I am asking here is:
What is the correct way to achieve this ?
What is the reason for the exception I am getting (revealed below)
After some research I ended up with:
1. The way I see people doing this (and the one I ended up using after playing with the API, before searching) is to loop over the window handles my WebDriver has at hand, and perform a switch to and out of the window handle I want to process, closing it when I am finished.
2. Selenium Grid does not seem like an option for me. Am I wrong, or is it intended for parallel processing across machines? Since I am running everything on a single computer, it would be of no use to me.
In trying the 1st option, I have the following scenario (a code sample is available below; I skipped anything that is not relevant or that repeats itself, marking those places with three dots):
I have a html page, with several submit buttons, stacked.
Clicking each of them will open a new browser/tab (interestingly enough, using ChromeDriver opens tabs, while FirefoxDriver opens separate windows for each.)
As a side note: I can't determine the URIs of each submit beforehand (they must be determined by JavaScript, and at this point let's just assume I want to handle everything knowing nothing about the client code).
Now, after looping over all the submit buttons, and issuing webElement.Click() on the corresponding elements, the tabs/windows open. The code flows to create a list of tasks to be executed, one for each new tab/window.
The problem is: since all tasks all depend upon the same instance of webdriver to switch to the window handles, seems I will need to add resource sharing locks/control. I am uncertain as whether I am correct, since I saw no mention of locks/resource access control in searching for multi-threaded web driver examples.
On the other hand, if I am able to determine the tabs/windows uris beforehand, I would be able to skip all the automation steps needed to reach this point, and then creating a webDriver instance for each thread, via Navigate().GoToUrl() would be straightforward. But this looks like a deadlock! I don't see webDriver's API providing any access to the newly opened tab/window without performing a switch. And I only want to switch if I do not have to repeat all the automation steps that lead me to the current window !
...
In any case, I keep getting the exception:
Element belongs to a different frame than the current one - switch to its containing frame to use it
at
IWebElement element = cell.FindElement
inside the ToDictionary() block.
I obviously checked that all my selectors are returning results, in chrome's console.
foreach (WebElement resultSet in resultSets)
resultSet.Click();
foreach(string windowHandle in webDriver.WindowHandles.Skip(1))
{
dataCollectionTasks.Add(Task.Factory.StartNew<List<DataTable>>(obj =>
{
List<DataTable> collectedData = new List<DataTable>();
string window = obj as string;
if (window != null)
{
webDriver.SwitchTo().Window(windowHandle);
List<WebElement> dataSets = webDriver.FindElements(By.JQuerySelector(utils.GetAppSetting("selectors.ResultSetData"))).ToList();
DataTable data = null;
for (int i = 0; i < dataSets.Count; i += 2)
{
data = new DataTable();
data.Columns.Add("Col1", typeof(string));
data.Columns.Add("Col2", typeof(string));
data.Columns.Add("Col3", typeof(string));
///...
//data set header
if (i % 2 != 0)
{
IWebElement headerElement = dataSets[i].FindElement(OpenQA.Selenium.By.CssSelector(utils.GetAppSetting("selectors.ResultSetDataHeader")));
data.TableName = string.Join(" ", headerElement.Text.Split().Take(3));
}
//data set records
else
{
Dictionary<string, string> cells = dataSets[i]
.FindElements(OpenQA.Selenium.By.CssSelector(utils.GetAppSetting("selectors.ResultSetDataCell")))
.ToDictionary(
cell =>
{
IWebElement element = cell.FindElement(OpenQA.Selenium.By.CssSelector(utils.GetAppSetting("selectors.ResultSetDataHeaderColumn")));
return element == null ? string.Empty : element.Text;
},
cell =>
{
return cell == null ? string.Empty : cell.Text;
});
string col1Value, col2Value, col3Value; //...
cells.TryGetValue("Col1", out col1Value);
cells.TryGetValue("Col2", out col2Value);
cells.TryGetValue("Col3", out col3Value);
//...
data.Rows.Add(col1Value, col2Value, col3Value /*...*/);
}
}
collectedData.Add(data);
}
webDriver.SwitchTo().Window(mainWindow);
webDriver.Close();
return collectedData;
}, windowHandle));
} //foreach
Task.WaitAll(dataCollectionTasks.ToArray());
foreach (Task<List<DataTable>> dataCollectionTask in dataCollectionTasks)
{
results.AddRange(dataCollectionTask.Result);
}
return results;
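On the locking concern raised in the question: a single WebDriver instance is generally not safe to use from several threads at once, and SwitchTo().Window() changes state that every task shares. A minimal sketch, assuming one shared driver, that serializes the switch-and-read step and copies the text out as plain strings before releasing the lock (WindowScraper, DriverLock, and ReadTexts are hypothetical names, not part of Selenium):

```csharp
using System.Collections.Generic;
using System.Linq;
using OpenQA.Selenium;

static class WindowScraper
{
    // One lock for the one shared driver: only a single task may
    // switch windows and query elements at any given time.
    private static readonly object DriverLock = new object();

    public static List<string> ReadTexts(IWebDriver driver, string windowHandle, string cssSelector)
    {
        lock (DriverLock)
        {
            driver.SwitchTo().Window(windowHandle);

            // Materialize the text while this window is still current;
            // IWebElement references must not outlive the lock.
            return driver.FindElements(By.CssSelector(cssSelector))
                         .Select(element => element.Text)
                         .ToList();
        }
    }
}
```

Because only strings leave the lock, no IWebElement is used after another task has switched windows, which is one plausible cause of the "different frame" error; this has not been verified against the page in the question. Note that with a single driver the tasks effectively run one after another, so the parallelism gains little.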

How to check if google chrome is running

I can close Google chrome via C# as follows:
Process[] chromeInstances = Process.GetProcessesByName("chrome");
foreach (Process p in chromeInstances)
{
p.Kill();
}
but I do not know of a way to check if Google Chrome is running.
I would like to know how to check whether Google Chrome is running first, and only then close it via C#.
Simply check the array you got:
Process[] chromeInstances = Process.GetProcessesByName("chrome");
if (chromeInstances.Length > 0)
{
// Chrome is running
}
else
{
// Chrome is not running
}
If you would like to practice with dealing with the Chrome instances via the Process object you can do code snippets with LinqPad. Once you have this downloaded you can change your Language drop down to C# Program and paste this code in. Take your time and play here and try things before posting another question. I see that you kind of asked a question before, got a semi answer, took that semi answer then created a new question off of it that is still not 100% clear what you are looking for. StackOverflow is not here to do every step for you, make attempts first. If you are still stuck post YOUR code with a proper question to get help.
void Main()
{
var chromeProcess = new ChromeProcess();
Console.WriteLine(chromeProcess.AnyInstancesRunning());
Console.WriteLine(chromeProcess.NumberOfInstancesRunning());
chromeProcess.ChromeInstanceIds().Dump("Chrome Instance Ids");
chromeProcess.KillChromeInstance(2816);
//open and close a few chrome windows
chromeProcess.RefreshInstances();
Console.WriteLine(chromeProcess.AnyInstancesRunning());
Console.WriteLine(chromeProcess.NumberOfInstancesRunning());
chromeProcess.ChromeInstanceIds().Dump("Chrome Instance Ids");
}
// Define other methods and classes here
public class ChromeProcess
{
private const string ImageName = "chrome";
private IEnumerable<Process> _Instances;
public ChromeProcess()
{
_Instances = Process.GetProcessesByName(ImageName);
}
public bool AnyInstancesRunning()
{
return _Instances.Any();
}
public int NumberOfInstancesRunning()
{
return _Instances.Count();
}
public IEnumerable<int> ChromeInstanceIds()
{
return _Instances.Select(i => i.Id).ToArray();
}
public void KillChromeInstance(int id)
{
var process = Process.GetProcessById(id);
if(process.ProcessName != ImageName)
{
throw new Exception("Not a chrome instance.");
}
process.Kill();
}
public void RefreshInstances()
{
_Instances = Process.GetProcessesByName(ImageName);
}
}
