My goal is to pull XML data from the API and load it into a SQL Server database. The first step I'm attempting here is to access the data and display it. Once I get this to work, I'll loop through each row and insert the values into a SQL Server database. When I try to run the code below, nothing happens, and when I paste the URL directly into the browser I get this error:
"2010-03-08 04:24:17 Wallet exhausted: retry after 2010-03-08 05:23:58. 2010-03-08 05:23:58"
It seems to me that every iteration of the foreach loop makes a call to the site, and I get blocked for an hour. Am I retrieving data from the API in an incorrect manner? Is there some way to load the data into memory or an array and then loop through that?
Here's the bit of code I hacked together.
using System;
using System.Data.SqlClient;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;
using System.Xml;
using System.Data;
public partial class _Default : System.Web.UI.Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        try
        {
            string userID = "123";
            string apiKey = "abc456";
            string characterID = "789";
            string url = "http://api.eve-online.com/char/WalletTransactions.xml.aspx?userID=" + userID + "&apiKey=" + apiKey + "&characterID=" + characterID;

            XmlDocument xmldoc = new XmlDocument();
            xmldoc.Load(url);

            // The document root is <eveapi>, so the path has to start there;
            // "result/rowset/row" on its own matches nothing.
            XmlNamespaceManager xnm1 = new XmlNamespaceManager(xmldoc.NameTable);
            XmlNodeList nList1 = xmldoc.SelectNodes("eveapi/result/rowset/row", xnm1);

            foreach (XmlNode xNode in nList1)
            {
                // Each <row/> is an empty element whose data lives in its
                // attributes, so InnerXml is blank; OuterXml shows them.
                Response.Write(xNode.OuterXml + "<br />");
            }
        }
        catch (Exception em)
        {
            // Load() throws WebException/XmlException here; a SqlException
            // can only appear once the SQL insert step is actually added.
            Response.Write(em.Message);
        }
    }
}
Here's a sample of the XML:
<eveapi version="2">
<currentTime>2010-03-06 17:38:35</currentTime>
<result>
<rowset name="transactions" key="transactionID" columns="transactionDateTime,transactionID,quantity,typeName,typeID,price,clientID,clientName,stationID,stationName,transactionType,transactionFor">
<row transactionDateTime="2010-03-06 17:16:00" transactionID="1343566007" quantity="1" typeName="Co-Processor II" typeID="3888" price="1122999.00" clientID="1404318579" clientName="unseenstrike" stationID="60011572" stationName="Osmeden IX - Moon 6 - University of Caille School" transactionType="sell" transactionFor="personal" />
<row transactionDateTime="2010-03-06 17:15:00" transactionID="1343565894" quantity="1" typeName="Co-Processor II" typeID="3888" price="1150000.00" clientID="1404318579" clientName="unseenstrike" stationID="60011572" stationName="Osmeden IX - Moon 6 - University of Caille School" transactionType="sell" transactionFor="personal" />
</rowset>
</result>
<cachedUntil>2010-03-06 17:53:35</cachedUntil>
</eveapi>
Some quick searching (Google) shows that this is the EVE API cache mechanism kicking in. If you query the same data with the same key and IP, it tells you to reuse what you already have. So make sure you write any results you do get to disk or a database straight away. Some cache times are 1 day...
- Jarek
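On the "write it to a database straight away" point: since the stated goal is SQL Server, a minimal sketch of a parameterized insert for one <row> element could look like the following. The WalletTransactions table name and column mapping are made up for illustration, not taken from the question.
using System.Data.SqlClient;
using System.Globalization;
using System.Xml;

class WalletInsertSketch
{
    // Inserts the attributes of one <row/> element. The table and its
    // columns are hypothetical placeholders.
    static void InsertRow(SqlConnection conn, XmlNode row)
    {
        using (SqlCommand cmd = new SqlCommand(
            "INSERT INTO WalletTransactions (TransactionID, TypeName, Quantity, Price) " +
            "VALUES (@id, @type, @qty, @price)", conn))
        {
            cmd.Parameters.AddWithValue("@id", long.Parse(row.Attributes["transactionID"].Value));
            cmd.Parameters.AddWithValue("@type", row.Attributes["typeName"].Value);
            cmd.Parameters.AddWithValue("@qty", int.Parse(row.Attributes["quantity"].Value));
            cmd.Parameters.AddWithValue("@price", decimal.Parse(row.Attributes["price"].Value, CultureInfo.InvariantCulture));
            cmd.ExecuteNonQuery();
        }
    }
}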
You could store the XML file as a local cache in your web server's file structure, e.g. .../Users/thisusersid/mycustomfile_02102011.xml. When the user logs in again, just pull the files, split the file names, check the dates, and use the most recent file as the default. Then you can allow the user to update their data from the EVE API manually by clicking a button/link to retrieve the latest and greatest.
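Here's a minimal sketch of that file-cache idea, keyed off the <cachedUntil> element visible in the sample above; the class name and file handling are illustrative only:
using System;
using System.Globalization;
using System.IO;
using System.Xml;

public static class WalletCache
{
    // Returns a cached document if its cachedUntil timestamp is still in
    // the future; otherwise fetches a fresh copy and saves it to disk.
    public static XmlDocument Load(string url, string cachePath)
    {
        if (File.Exists(cachePath))
        {
            XmlDocument cached = new XmlDocument();
            cached.Load(cachePath);
            XmlNode until = cached.SelectSingleNode("/eveapi/cachedUntil");
            DateTime expiry;
            if (until != null &&
                DateTime.TryParse(until.InnerText, CultureInfo.InvariantCulture,
                                  DateTimeStyles.AssumeUniversal | DateTimeStyles.AdjustToUniversal,
                                  out expiry) &&
                DateTime.UtcNow < expiry)
            {
                return cached; // still fresh: no API call made
            }
        }

        XmlDocument fresh = new XmlDocument();
        fresh.Load(url);       // exactly one call to the API
        fresh.Save(cachePath); // persist immediately, as suggested above
        return fresh;
    }
}
Loaded this way, repeated page loads inside the cache window never touch the API, which sidesteps the wallet-exhausted block.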
I've scraped a website with HtmlAgilityPack in C#, and I'm trying to open every link inside it and scrape them with the same method.
But when I call the method below, the page is downloaded by the library as if I had AdBlock active. In fact, I can't find any tables, and the downloaded HTML says "ADblock detected".
This is strange because I've filtered the oddsmath website in my Google Chrome, and I can download the master page without any problem. Has anyone faced this problem?
This is the function; the "Console.WriteLine" is just for testing, to see the full HTML code.
public void GetMatchesDetails()
{
    List<String> matchDetails = new List<string>();
    foreach (Oddsmath om in oddsmathGoodMatches)
    {
        matchDetails.Add("http://www.oddsmath.com" + om.matchUrl);
    }
    foreach (String om in matchDetails)
    {
        HtmlDocument doc = new HtmlWeb().Load(om);
        foreach (HtmlNode table in doc.DocumentNode.SelectNodes("html"))
        {
            Console.WriteLine("Found: " + table.OuterHtml);
            foreach (HtmlNode row in table.SelectNodes("tr"))
            {
                Console.WriteLine("row");
                foreach (HtmlNode cell in row.SelectNodes("th|td"))
                {
                    Console.WriteLine("cell: " + cell.InnerText);
                }
            }
        }
    }
}
EDIT
Going a little deeper, I've noticed that maybe it's not a problem with my application or something related to AdBlock; it seems connected to the website I'm trying to scrape... In fact, if you look at a page like this: oddsmath.com/football/international/afc-champions-league-1053/… you can see that the content is loaded correctly in the browser, but the tables are empty in the source code. Why? Is it JavaScript that prevents the page from loading?
First: use whatever you are most comfortable with, HAP vs. AngleSharp, unless time is really a factor in your application. And in this case it is not.
Second: use a web debugger like Fiddler or Charles to understand what you are actually getting from the server when you make a request. You are not getting any HTML created with JavaScript or API calls; you only get the page source, which is why the tables are empty. They are generated with JavaScript.
For instance, I just used a web debugger to see that the site makes an API call to:
http://www.oddsmath.com/api/v1/dropping-odds.json/?sport_type=soccer&provider_id=7&cat_id=0&interval=60&sortBy=1&limit=30&language=en
JavaScript then uses this JSON object to create the rest of the page.
And this returns a nice JSON object that is easier to navigate than the HTML is with either HAP or AngleSharp. I recommend using Newtonsoft.Json.
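If it helps, here is a rough sketch of calling that endpoint directly and parsing the response generically with Newtonsoft.Json. The payload's field names are not documented here, so the sketch just dumps the parsed object rather than assuming a schema:
using System;
using System.Net.Http;
using System.Threading.Tasks;
using Newtonsoft.Json.Linq;

class OddsApiSketch
{
    static async Task Main()
    {
        string url = "http://www.oddsmath.com/api/v1/dropping-odds.json/?sport_type=soccer&provider_id=7&cat_id=0&interval=60&sortBy=1&limit=30&language=en";

        using (var client = new HttpClient())
        {
            string json = await client.GetStringAsync(url);

            // Parse generically with JObject so no schema has to be
            // assumed up front; inspect the dump, then pick out fields.
            JObject root = JObject.Parse(json);
            Console.WriteLine(root.ToString());
        }
    }
}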
If you are adamant about using HtmlAgilityPack, then you need to combine it with Selenium, because then you can wait until the page is fully loaded before parsing the HTML.
[Edit]
Further digging:
Api-request to get all the leagues and their ids:
http://www.oddsmath.com/api/v1/menu-leagues.json/?language=en
Api-request for just the asian champions league:
http://www.oddsmath.com/api/v1/events-by-league.json/?language=en&country_code=GB&league_id=1053
An alternative solution with Selenium and the Firefox driver.
Even though I highly recommend that you use the API and Newtonsoft.Json for your solution, I will show how it can be done with Selenium.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using HtmlAgilityPack;
using OpenQA.Selenium.Firefox;
using OpenQA.Selenium;
using System.Threading;
namespace SeleniumHap
{
    class Program
    {
        static void Main(string[] args)
        {
            HtmlDocument doc = new HtmlDocument();
            string url = "http://www.oddsmath.com/football/sweden/division-1-1195/2019-04-26/if-sylvia-vs-nykopings-bis-2858046/";
            //string url = "http://www.oddsmath.com/";

            FirefoxOptions options = new FirefoxOptions();
            //options.AddArguments("--headless");
            IWebDriver driver = new FirefoxDriver(options);
            driver.Navigate().GoToUrl(url);

            while (true)
            {
                doc.LoadHtml(driver.PageSource);
                // Note the XPath attribute syntax: @id, not #id.
                HtmlNode n = doc.DocumentNode.SelectSingleNode("//table[@id='table-odds-cat-0']//*[self::th or self::td]");
                if (n != null)
                {
                    n = n.SelectSingleNode(".//div[@class='live-odds-loading']");
                    if (n == null)
                    {
                        break;
                    }
                }
                Thread.Sleep(1000);
            }
            Console.WriteLine("Exited loop. Meaning the page is done loading since we could get a td. A crude method but it works");

            HtmlNodeCollection tables = doc.DocumentNode.SelectNodes("//table");
            foreach (HtmlNode table in tables)
            {
                Console.WriteLine(table.GetAttributeValue("id", "No id"));
                HtmlNodeCollection tableContent = table.SelectNodes(".//*[self::th or self::td]");
                foreach (HtmlNode n in tableContent)
                {
                    Console.WriteLine(n.InnerHtml);
                }
                break;
            }
            Console.ReadKey();
        }
    }
}
As you can see, I use Firefox as my driver instead of Chrome. With either one you might have to set the 'BrowserExecutableLocation' option to tell the driver where the browser's executable is.
As you can see, I am using a while loop in a crude way to make sure the browser fully loads the page before reading the HTML.
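As an aside, Selenium's own explicit-wait helper can replace the hand-rolled polling loop. A sketch, assuming the Selenium.Support package is installed:
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Firefox;
using OpenQA.Selenium.Support.UI;

class ExplicitWaitSketch
{
    static void Main()
    {
        IWebDriver driver = new FirefoxDriver();
        driver.Navigate().GoToUrl("http://www.oddsmath.com/");

        // Block for up to 30 seconds until at least one table cell exists,
        // i.e. until the JavaScript has populated the tables.
        var wait = new WebDriverWait(driver, TimeSpan.FromSeconds(30));
        wait.Until(d => d.FindElements(By.CssSelector("table td")).Count > 0);

        Console.WriteLine("Tables are populated; safe to hand PageSource to HAP.");
        driver.Quit();
    }
}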
In an effort to minimize time spent reading error logs, I have created the following small plugin that parses the relevant information contained within Elmah error logs and will eventually produce an Excel spreadsheet. For the time being I am using WriteLine to test. The issue I am facing is that inside the second foreach loop I get a null reference exception on each of the node.Attributes lookups. The expected outcome of this code is a list of attribute values from within the <error> tag of each of the XML documents.
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;
using System.Xml;
namespace ExcelSortingAutomation
{
    public class Program
    {
        public static void Main(string[] args)
        {
            DirectoryInfo Exceptions = new DirectoryInfo("C:\\ErrorsMay2017");
            FileInfo[] ExceptionFiles = Exceptions.GetFiles("*.xml");
            foreach (var exception in ExceptionFiles)
            {
                string xml = "<error></error>";
                XmlDocument doc = new XmlDocument();
                doc.LoadXml(xml);
                foreach (XmlNode node in doc.SelectNodes("//error"))
                {
                    string errorId = node.Attributes["errorId"].Value;
                    string type = node.Attributes["type"].Value;
                    string message = node.Attributes["message"].Value;
                    string time = node.Attributes["time"].Value;
                    Console.WriteLine("{0} - {1} - {2} = {3}", errorId, type, message, time);
                }
            }
        }
    }
}
My question is: why would the logs, which should have been pulled and parsed by this point, not be receiving the attribute values?
Edit 1: upon closer inspection, the XmlNode value comes back with HasAttributes false.
The line string xml = "<error></error>"; should be replaced with string xml = File.ReadAllText(exception.FullName); as written, the code parses a hard-coded empty <error></error> stub instead of the contents of each log file, so there are no attributes to read.
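Putting it together, here is a sketch of the corrected loop, with null-safe attribute access added in case a given <error> element lacks one of the attributes (the placeholder text is illustrative):
using System;
using System.IO;
using System.Xml;

public class Program
{
    public static void Main(string[] args)
    {
        DirectoryInfo exceptions = new DirectoryInfo("C:\\ErrorsMay2017");
        foreach (FileInfo file in exceptions.GetFiles("*.xml"))
        {
            XmlDocument doc = new XmlDocument();
            doc.Load(file.FullName); // load the real log file, not a hard-coded stub

            foreach (XmlNode node in doc.SelectNodes("//error"))
            {
                // Attributes["x"] is null when the attribute is absent, so
                // fall back to a placeholder instead of throwing.
                string errorId = node.Attributes["errorId"]?.Value ?? "(none)";
                string type = node.Attributes["type"]?.Value ?? "(none)";
                string message = node.Attributes["message"]?.Value ?? "(none)";
                string time = node.Attributes["time"]?.Value ?? "(none)";
                Console.WriteLine("{0} - {1} - {2} - {3}", errorId, type, message, time);
            }
        }
    }
}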
Creating a simple iteration path in TFS 2013 is described here. Traversing a whole tree of unknown depth is described here. I need to create an iteration path whose exact path I know in advance, and which contains sub-levels, as in \{ProjectName}\Iteration\{Year}\{Iteration}.
EDIT:
In order to do that safely, I need to first check the existence of the iteration path, which requires me to check the existence of {Year} and {Iteration}. Otherwise an exception is thrown and I'd like to avoid exception-based logic.
I can find only one way of doing that, and it is level by level, using the method CommonStructureService.GetNodesXml(), but then I have to parse XML and I lose the advantage of using the provided API types such as NodeInfo. Is there a better way to check for existence of a deeper child with a known path while retaining the API domain model?
You can create the iterations one by one: create the node {Year} first, then create the {Iteration} under {Year}. See the following code for details:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Microsoft.TeamFoundation.Client;
using Microsoft.TeamFoundation.Server;
namespace AAAPI
{
    class Program
    {
        static void Main(string[] args)
        {
            string project = "https://xxx.xxx.xxx.xxx/tfs";
            string projectName = "XXX";
            string node1 = "Year";
            string node2 = "Iter1";

            TfsTeamProjectCollection tpc = new TfsTeamProjectCollection(new Uri(project));
            tpc.Authenticate();

            Console.WriteLine("Creating node " + node1);
            var css = tpc.GetService<ICommonStructureService>();
            string rootNodePath = string.Format("\\{0}\\Iteration", projectName);
            var pt = css.GetNodeFromPath(rootNodePath);
            css.CreateNode(node1, pt.Uri);
            Console.WriteLine("Created " + node1 + " successfully");

            Console.WriteLine("Creating node " + node2);
            string parentNodePath = string.Format("\\{0}\\Iteration\\{1}", projectName, node1);
            var pt1 = css.GetNodeFromPath(parentNodePath);
            css.CreateNode(node2, pt1.Uri);
            Console.WriteLine("Created " + node2 + " successfully");

            Console.ReadLine();
        }
    }
}
As no valid answers have come in, I'm going to assume the following answer:
No, there's no way to read any deeper part than the top level while retaining the API typed domain model.
XML is currently the only option.
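For the record, here is a sketch of that XML-based existence check via GetNodesXml. The shape of the returned XML (Node elements carrying a Path attribute) is an assumption here; dump the element once on your own server to confirm it before relying on it:
using System;
using System.Linq;
using System.Xml;
using Microsoft.TeamFoundation.Client;
using Microsoft.TeamFoundation.Server;

class IterationPathCheck
{
    static bool IterationPathExists(TfsTeamProjectCollection tpc, string projectName, string fullPath)
    {
        var css = tpc.GetService<ICommonStructureService>();

        // Locate the iteration root node of the project.
        ProjectInfo project = css.GetProjectFromName(projectName);
        NodeInfo root = css.ListStructures(project.Uri)
                           .First(n => n.StructureType == "ProjectLifecycle");

        // Fetch the whole subtree once, then search it for the target path.
        // Assumption: each returned Node element exposes a Path attribute.
        XmlElement tree = css.GetNodesXml(new[] { root.Uri }, true);
        return tree.SelectSingleNode(string.Format("//Node[@Path='{0}']", fullPath)) != null;
    }
}
Called as IterationPathExists(tpc, "XXX", "\\XXX\\Iteration\\Year\\Iter1"), this avoids the exception that GetNodeFromPath throws for a missing node.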
The XML file is like this; there are about 20 nodes (modules) like this.
<list>
<module code="ECSE502">
<code>ECSE502</code>
<name>Algorithms and Data structures</name>
<semester>1</semester>
<prerequisites>none</prerequisites>
<lslot>0</lslot>
<tslot>1</tslot>
<description>all about algorythms and data structers with totorials and inclass tests</description>
</module>
</list>
Here is the code I wrote, but when I debugged it, it didn't even go inside the foreach loop.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
namespace ModuleEnrolmentCW
{
    class XMLRead
    {
        public string[] writeToXML(string s)
        {
            string text = s;
            string[] arr = new string[6];
            XmlDocument xml = new XmlDocument();
            xml.Load("modules.xml");
            XmlNodeList xnList = xml.SelectNodes("list/module[@code='" + text + "']");
            foreach (XmlNode xn in xnList)
            {
                arr[0] = xn.SelectSingleNode("code").InnerText;
                arr[1] = xn.SelectSingleNode("name").InnerText;
                arr[2] = xn.SelectSingleNode("semester").InnerText;
                arr[3] = xn.SelectSingleNode("prerequisites").InnerText;
                arr[4] = xn.SelectSingleNode("lslot").InnerText;
                arr[5] = xn.SelectSingleNode("tslot").InnerText;
            }
            return arr;
        }
    }
}
Please tell me where I'm going wrong?
Here is the rest of the code
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;
namespace ModuleEnrolmentCW
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        string selected;

        private void listBox1_SelectedIndexChanged(object sender, EventArgs e)
        {
            XMLRead x = new XMLRead();
            selected = (string)listBox1.SelectedItem;
            string[] arr2 = x.writeToXML(selected);
            label11.Text = arr2[0];
        }
    }
}
Make sure you are specifying the correct path for your XML file. It works for me.
This line:
XmlNodeList xnList = xml.SelectNodes("list/module[@code='" + text + "']");
should read:
XmlNodeList xnList = xml.SelectNodes("list/module"); //Does not answer full scope of the question
Edit following a reread of the question:
The OP's code works fine in my tests. Either the file path is not correct, or the string s passed into text does not match the case of the code value by which you are reading the nodes.
The SelectNodes XPath as you have it is case sensitive.
You appear to be working with XPath 1.0, which doesn't appear to support case insensitivity out of the box, if that's the issue. See this link for a way to perform case-insensitive XPath searches: http://blogs.msdn.com/b/shjin/archive/2005/07/22/442025.aspx
See also this link: case-insensitive matching in xpath?
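Applied to the question's file, the translate() workaround from those links would look something like this sketch (the class name is just for illustration):
using System;
using System.Xml;

class CaseInsensitiveLookup
{
    static void Main()
    {
        const string lower = "abcdefghijklmnopqrstuvwxyz";
        const string upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
        string text = "ecse502"; // lower-case input should still match ECSE502

        XmlDocument xml = new XmlDocument();
        xml.Load("modules.xml");

        // Upper-case both sides before comparing; translate() is the
        // standard XPath 1.0 workaround for case-insensitive matching.
        string xpath = string.Format(
            "list/module[translate(@code, '{0}', '{1}') = translate('{2}', '{0}', '{1}')]",
            lower, upper, text);

        XmlNodeList xnList = xml.SelectNodes(xpath);
        Console.WriteLine(xnList.Count); // 1 if the module is present
    }
}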
Your code is correct, provided the input is really the one you've shown and s points to a code that is actually present. Since you are locating the file by a relative path, make sure you are loading the file you really expect.
Found the error. I was passing a wrong value to the writeToXML method. Instead of passing the code, I passed the name.
I have a page, default.aspx, with a button. On clicking it, I send an HTTP POST request with query string parameters to some server, which returns JSON data and also redirects me back to default.aspx.
Now I wish to see what the request looked like and which query string parameters were sent.
However, I can't see it in Firebug's Params section. How do I view it?
Isn't it as simple as enabling Persist on the Net panel in Firebug and looking at the details of each entry?
http://getfirebug.com/wiki/index.php/Net_Panel#Persist
When this option is enabled, the entries of the requests list are not deleted when reloading the page. Instead they are grouped by page request, which means that when reloading the page several times you will get several request trees having the page title as root.
If you are sending query parameters, it's a GET request. You should not mix the POST and GET methods, or you will run into trouble.
This code will log POST or GET data to your Firebug console if any is found; place it in the page you request with AJAX.
using System;
using System.Collections.Generic;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;
using System.Collections.Specialized;
namespace WebApplication1
{
    public partial class _Default : System.Web.UI.Page
    {
        protected void Page_Load(object sender, EventArgs e)
        {
            NameValueCollection n = Request.QueryString;
            int x = 0;
            Response.Write("<script>");
            foreach (string s in n)
            {
                // Emit each key/value pair as a console.log call so it
                // shows up in the Firebug console.
                string k = n.GetKey(x);
                string v = n.Get(x);
                Response.Write("console.log('[" + k + "] => ");
                Response.Write(v + "');");
                x++;
            }
            if (x == 0)
            {
                Response.Write("console.log('QueryString is empty!');");
            }
            Response.Write("</script>");
        }
    }
}