Sorry, my English is not good.
I use Selenium to get data from a web page.
Here is my code:
var workGroups = e.WebDriver.FindElements(By.XPath("//div[@class='workgroup']"));
Console.WriteLine($"Item List: {workGroups.Count} Items");
foreach (var workgroup in workGroups)
{
    string workName = workgroup.FindElement(By.XPath("//div[@class='worktitle']/label")).Text;
    var detail = workgroup.FindElements(By.XPath("//div[@class='col-4 high']"));
    Console.WriteLine($"Item Name: {workName}, Number of Pictures: {detail.Count}");
}
And this is the result:
result
It seems to catch the first item's name every time, along with all the pictures on the page.
I use ChromeDriver.
I don't know what is wrong.
Please help me.
Thank you very much.
Try to use:
string workName = workgroup.FindElement(By.XPath("./div[@class='worktitle']/label")).Text;
var detail = workgroup.FindElements(By.XPath("./div[@class='col-4 high']"));
I didn't test this, but since you call FindElement on the workgroup element, I assume you want only the elements inside that workgroup. For that, the XPath has to be relative to the current element: start it with ./ (direct children) or .// (any descendant). A path that starts with // is evaluated from the root of the HTML document and searches the entire page, even when it is called on a child element.
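The difference between the three path forms can be checked without a browser. Here is a minimal sketch that uses System.Xml.XPath on a made-up document (the markup is an assumption, since the real page is not shown). Selenium's By.XPath should follow the same XPath rules for //, .// and ./.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

// Hypothetical markup standing in for the real page.
var doc = XDocument.Parse(@"
<body>
  <div class='workgroup'>
    <div class='worktitle'><label>First</label></div>
    <div class='col-4 high'/>
  </div>
  <div class='workgroup'>
    <div class='worktitle'><label>Second</label></div>
    <div class='col-4 high'/>
    <div class='col-4 high'/>
  </div>
</body>");

var groups = doc.XPathSelectElements("//div[@class='workgroup']").ToList();
var second = groups[1];

// "//" restarts from the document root, even when called on an element.
int fromRoot = second.XPathSelectElements("//div[@class='col-4 high']").Count();

// ".//" searches only the descendants of the context element.
int fromContext = second.XPathSelectElements(".//div[@class='col-4 high']").Count();

// "./" matches direct children of the context element only.
string title = second.XPathSelectElement("./div[@class='worktitle']/label").Value;

Console.WriteLine($"{title}: fromRoot={fromRoot}, fromContext={fromContext}");
```

If the pictures are nested more than one level below the workgroup div, .// is the safer choice than ./ because it does not depend on the exact nesting depth.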
I am trying to find out whether a string contains a word that changes but always starts with the letter 'F' (C#). The output from the service call looks like this:
Exception_Remote_Call--VNQ DN ERROR CODE found ERROR CODE= F0123,
ERROR DESCRIPTION= NOT AVAILABLE
In the above string, the word F0123 changes depending on the error code. I tried the following, but it only works for F0123 and not when the output is F0111. I would like to detect any code that starts with 'F'.
var isStartsWithF = s.Contains("F0123");
I would really appreciate any help. Thank you in advance!
This is a job for regular expressions. To make things clearer and easier to spot for future maintainers, I might include the ERROR CODE = as part of the expression:
var data = "Exception_Remote_Call--VNQ DN ERROR CODE found ERROR CODE = F0123, ERROR DESCRIPTION= NOT AVAILABLE";
var exp = new Regex(@"ERROR CODE\s?= (F\d{4,5})");
var result = exp.Match(data).Groups[1].Value;
See it work here:
https://dotnetfiddle.net/nOXOCt
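To handle codes other than F0123 and inputs with no code at all, Match.Success can be checked before reading the group. A small sketch follows. The helper name TryGetErrorCode is made up for illustration, and the pattern is loosened (as an assumption about the input) to allow optional whitespace on both sides of the '=' and any number of digits:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: extracts the F-code that follows "ERROR CODE=", if any.
static bool TryGetErrorCode(string message, out string code)
{
    var match = Regex.Match(message, @"ERROR CODE\s*=\s*(F\d+)");
    code = match.Success ? match.Groups[1].Value : "";
    return match.Success;
}

var sample = "Exception_Remote_Call--VNQ DN ERROR CODE found ERROR CODE= F0111, ERROR DESCRIPTION= NOT AVAILABLE";

bool found = TryGetErrorCode(sample, out var errorCode);
bool foundInOther = TryGetErrorCode("no code here", out var none);

Console.WriteLine($"found={found}, code={errorCode}");
Console.WriteLine($"foundInOther={foundInOther}");
```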
I have posted this question before, but I am posting it again since I have not received any answers yet.
I am trying to get some information (such as a tagName or id, using the GetElementsByTagName or GetElementById method) from a content page of a website, using WinForms.
As you can see in the attached pictures, no matter which selection you make (select1, select2, select3, etc.), the web address stays the same. However, the content shown under each selection is different.
I am trying to access a tagName (or id) in the content under a specific selection, not in the selections themselves.
I have debugged this, and it seems I cannot access a tagName (or id) in any of the content under a specific selection.
It seems I can only access a tagName (or id) on the main page. Picture 3 helps explain terms such as main page and content page.
I tried to explain in detail; if my question is still not clear, please let me know.
My code looks like this:
var countGetFile = webBrowser1.Document.GetElementsByTagName("IFRAME");
foreach (HtmlElement l in countGetFile)
{
    if (l.GetAttribute("width").Equals("100%"))
    {
        MessageBox.Show(l.GetAttribute("height").ToString());
        MessageBox.Show(l.GetAttribute("outerText").ToString());
    }
}
I was not able to grab information two levels below #document in the HTML.
The HTML looks something like this:
...
<src="..." id="A" ... >
#document
...
<src="..." id="B" ... >
#document
...
<span="C" ...>
...
I could grab the span information (the third tag above) with code like this:
HtmlWindow frame1 = webBrowser1.Document.GetElementById("A").Document.Window.Frames["A"];
HtmlWindow frame2 = frame1.Document.GetElementById("B").Document.Window.Frames["B"];
foreach (HtmlElement elm in frame2.Document.All)
{
    if (elm.GetAttribute("tagName").Equals("C"))
    {
        // your command
    }
}
To use Document.Window.Frames I had to add using System.Collections;
By the way, there is a problem. To access the information in the third tag, I need to wait between frame1 and frame2 so that frame2 has enough time to load before I go one level deeper.
I found a hack to get around this: show a message box to create a short delay, or add a non-blocking delay with async code like this:
async Task PutTaskDelay()
{
    await Task.Delay(5000); // 5 seconds
}
This is only a temporary solution for accessing the second level. I would appreciate hearing from anyone who knows a better way to solve this problem.
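Rather than a fixed five-second pause, one option is to poll until the inner frame is ready and give up after a timeout. This is only a sketch: WaitUntilAsync is a made-up helper, and the readiness check passed to it (for example, that frame1.Document.GetElementById("B") is no longer null) is an assumption about what "ready" means for this page. The stand-in condition below just becomes true on the third check so that the example runs on its own.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper: polls a condition without blocking the calling thread.
static async Task<bool> WaitUntilAsync(Func<bool> condition, TimeSpan timeout, TimeSpan pollInterval)
{
    var deadline = DateTime.UtcNow + timeout;
    while (DateTime.UtcNow < deadline)
    {
        if (condition()) return true;
        await Task.Delay(pollInterval);
    }
    return condition(); // one last check at the deadline
}

// Stand-in for a real readiness check; becomes true on the third call.
int checks = 0;
bool ready = await WaitUntilAsync(
    () => ++checks >= 3,
    TimeSpan.FromSeconds(2),
    TimeSpan.FromMilliseconds(50));

Console.WriteLine($"ready={ready} after {checks} checks");
```

When awaited from a WinForms event handler, the continuation normally resumes on the UI thread, so the condition can read from webBrowser1 directly.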
To preface, I know there are similar threads about this, but I am using C#, not Java, Python, or PHP. Some threads provide a solution for a single URL, which is not universal. Thanks for not flagging me.
So I am using Google Alerts to get links to articles via email. I have already written a program that can strip the URLs out of the email as well as another program to scrape the websites. My issue is that the links in the Google Alerts email look like this:
https://www.google.com/url?rct=j&sa=t&url=http://www.foxnews.com/health/2016/08/19/virtual-reality-treadmills-help-prevent-falls-in-elderly.html&ct=ga&cd=CAEYACoTOTc2NjE4NjYyNzMzNzc3NDcyODIaODk2NWUwYzRjMzdmOGI4Nzpjb206ZW46VVM&usg=AFQjCNGyK2EyVBLoKnNkdxIBDf8a_B3Ung. Yeah, ugly.
Because this redirects to the actual article through Google, my scraping program does not work on these links. I have tried a million different regexes from questions here and other sources. I managed to strip off everything up to the http:// of the actual article, but the link still has the tail end that breaks it. Here is what I have so far. The links now look like:
http://www.foxnews.com/health/2016/08/19/virtual-reality-treadmills-help-prevent-falls-in-elderly.html&ct=ga&cd=CAEYACoTOTc2NjE4NjYyNzMzNzc3NDcyODIaODk2NWUwYzRjMzdmOGI4Nzpjb206ZW46VVM&usg=AFQjCNGyK2EyVBLoKnNkdxIBDf8a_B3Ung
private List<string> GetLinks(string message)
{
    List<string> list = new List<string>();
    Regex urlRx = new Regex(@"((http|ftp|https):\/\/[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:/~\+#]*[\w\-\@?^=%&/~\+#])?)", RegexOptions.IgnoreCase);
    MatchCollection matches = urlRx.Matches(message);
    foreach (Match match in matches)
    {
        if (!match.ToString().Contains("news.google.com/news") && !match.ToString().Contains("google.com/alerts"))
        {
            string find = "=http";
            int ind = match.ToString().IndexOf(find);
            list.Add(match.ToString().Substring(ind + 1));
        }
    }
    return list;
}
Some help getting rid of the endings would be awesome, be it a new RegEx or some extra code. Thanks in advance.
You can use HttpUtility.ParseQueryString to retrieve the url part of the query string. It is located in the System.Web namespace (reference required).
var uri = new Uri("https://www.google.com/url?rct=j&sa=t&url=http://www.foxnews.com/health/2016/08/19/virtual-reality-treadmills-help-prevent-falls-in-elderly.html&ct=ga&cd=CAEYACoTOTc2NjE4NjYyNzMzNzc3NDcyODIaODk2NWUwYzRjMzdmOGI4Nzpjb206ZW46VVM&usg=AFQjCNGyK2EyVBLoKnNkdxIBDf8a_B3Ung");
var queries = HttpUtility.ParseQueryString(uri.Query);
var foxNews = queries["url"]; //http://www.foxnews.com/health/2016/08/19/virtual-reality-treadmills-help-prevent-falls-in-elderly.html
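The same call can be wrapped so that it is safe to run over every extracted link, including links that are not Google redirects. A minimal sketch follows. UnwrapRedirect is a made-up helper name, the wrapped URL is a shortened version of the one in the question, and example.com is a placeholder. On .NET Framework this needs a reference to System.Web, as noted above.

```csharp
using System;
using System.Web;

// Hypothetical helper: returns the "url" query parameter if present,
// otherwise returns the link unchanged.
static string UnwrapRedirect(string link)
{
    if (!Uri.TryCreate(link, UriKind.Absolute, out var uri)) return link;
    var target = HttpUtility.ParseQueryString(uri.Query)["url"];
    return string.IsNullOrEmpty(target) ? link : target;
}

// Shortened form of the redirect link from the question.
var wrapped = "https://www.google.com/url?rct=j&sa=t&url=http://www.foxnews.com/health/2016/08/19/virtual-reality-treadmills-help-prevent-falls-in-elderly.html&ct=ga";
// Placeholder for a link that is not a redirect.
var direct = "http://example.com/article.html";

var unwrapped = UnwrapRedirect(wrapped);
var untouched = UnwrapRedirect(direct);

Console.WriteLine(unwrapped);
Console.WriteLine(untouched);
```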
Later update
So, I spent some time trying to figure this out and got to this:
In sitecore.config there is a section where you define your websites. While trying to set up a custom login page, I added a "loginPage" attribute to the "website" site element, but it didn't work.
At one point I realized that I had to change the name from "website" to "myhost_name", and the login page started to work as expected. However, removing the "website" site element turned out not to be a very bright idea, because the website started to behave unstably.
Does anyone know the right setup for this situation? I don't find the Sitecore documentation very clear on this matter.
Thanks
I have the following issue (I'm kind of new to Sitecore development, so it might be something easy, but I can't figure it out).
I have a template for some error messages that I will show on the website, and a folder under content where I store these items.
There are 3 fields I added to the template:
- Type
- ResultKey
- Message
All of them are Single-Line Text.
Now, in Visual Studio I have a routine which does this:
/// <summary>
/// Get an Item by path
/// </summary>
public Item GetItemByPath(string itemPath)
{
    return Sitecore.Context.Database.GetItem(itemPath);
}
And I have another one which should return a ViewModel:
public ModelValidation GetMessageByName(string itemName, string xpath)
{
    var mess = GetItemByPath(xpath + itemName);
    if (mess == null) return new ModelValidation(3, itemName);
    int type;
    string stype = "";
    string message = "";
    mess.Fields.ReadAll();
    if (mess.Fields["Type"] == null)
        stype = "3";
    else
        stype = mess.Fields["Type"].Value;
    if (!int.TryParse(stype, out type))
        type = 3;
    if (String.IsNullOrEmpty(mess.Fields["Message"].Value))
        message = itemName;
    else
        message = mess.Fields["Message"].Value;
    return new ModelValidation(type, message);
}
The issue:
The item is returned and all the fields are in place, but the value of each field is "" (String.Empty).
What am I doing wrong?
The items have values in Sitecore and they are published (I checked the Web database).
Context
Sitecore 8.1
VS 2013 MVC 5.2.3
Thank you
Below are the possible reasons for getting empty results:
- The language version is not passed properly when fetching the item.
- The path of the item might be incorrect.
- The context database might be wrong: the data might be in master while you are reading from the Web database.
- Try fetching the item by its item ID instead of its path.
- There might be several versions of the same item, and you might be fetching a version that has empty values.
Try changing your query to: return Sitecore.Context.Database.GetItem(itemPath, Sitecore.Context.Language);
If you don't specify the language, the returned item can be in a different language than the one in your context, and because of that your data is empty.
Best regards,
Łukasz Skowroński
Thank you all for your advice. The problem was that there was another site config file. It still does not explain the unexpected behavior, but at least I solved the issue.
I would like to know how I can get the replies to a tweet.
I am not sure whether this could be accomplished by using a trend, or by passing a different API URL in an option file to the Retweets method. I don't know offhand how to do it; any assistance will be well received.
To solve this, you need to do a Search:
TwitterResponse<TwitterSearchResultCollection> replies = TwitterSearch.Search(tokens, "term", options);
And loop through the results:
foreach (var reply in replies.ResponseObject)
{ }
Make sure to use:
if (reply.InReplyToScreenName != null && reply.InReplyToScreenName.ToLower().Equals("term")) { }
to get only the replies to the right user (the one you searched for).
"term" is to be replaced by the screen name you are looking for, e.g. @rodbh08.