Using a foreach loop to consume a large number of WebServices - C#

Essentially, my program needs to consume about 100 (and this number will expand) WebServices, pull a piece of data from each, store it, parse it, and then display it. I've written the code for storing, parsing, and displaying.
My problem is this: I can't find any tutorial online about how to loop through a list of WebReferences and query each one. Am I doomed to writing 100 WebReferences and manually writing queries for each one, or is it possible to store a List or Array of the URLs (or something) and loop through it? Or is there another, better way of doing this?
I've specifically done research on this and haven't found anything; I've done my due diligence. I'm not asking about how to consume a WebService; there's plenty of information on that and it's not that hard.
Current foreach loop (not sufficient, as I need to pass login credentials and get a response):
// Retrieve the XML string from the server.
// The ServerURLList is just a giant list of URLs; I didn't include it.
var client = new WebClient { Credentials = new NetworkCredential("LoginCredentials", "LoginCredentialsPass") };
// Notice it takes the string URL from the DataTable provided, so that it can do all 100 customers while parsing the response.
var XMLStringFromServer = client.DownloadString((string)dr[0]);
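For what it's worth, here is a minimal sketch of that loop generalized to a plain list of URLs rather than 100 separate WebReferences. Everything here is illustrative: serviceUrls, ParseAndStore, and the credential values are placeholders, not names from the original project.

// Sketch only: reuse one WebClient (System.Net) with the same credentials for every URL.
// "serviceUrls" and "ParseAndStore" are placeholders for the asker's own list and parsing code.
public void PollAll(IEnumerable<string> serviceUrls)
{
    using (var client = new WebClient
    {
        Credentials = new NetworkCredential("LoginCredentials", "LoginCredentialsPass")
    })
    {
        foreach (string url in serviceUrls)
        {
            // Each service returns an XML string; hand it off to the existing
            // storing/parsing/displaying code.
            string xml = client.DownloadString(url);
            ParseAndStore(xml);
        }
    }
}

The URLs could just as easily come straight out of the DataTable shown above instead of a separate list.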

Related

Cleansing a TextFile / Log File WinForms C#

I am currently writing a WinForms C# application that will allow users to cleanse text / log files. At present the app is working, but if the file is massive in size, i.e. 10MB, it is taking an age!
The first cleanse it does is for users' Windows Auth, i.e. who was logged in at the time. I have a text file of all users in our organisation, roughly 10,000.
I load this into a
List<string> loggedUsers = new List<string>();
string[] userList = System.IO.File.ReadAllLines(@"C:\temp\logcleaner\users.txt");
foreach (string line in userList)
{
    loggedUsers.Add(line);
}
Next I take a text file and show it in a RichTextBox (rtbOrgFile), allowing the user to see what information is currently there. The user then clicks a button which does the following:
foreach (var item in loggedUsers)
{
    if (rtbOrgFile.Text.Contains(item))
    {
        if (foundUsers.Items.Contains(item))
        {
            // already in list
        }
        else
        {
            foundUsers.Items.Add(item);
        }
    }
}
My question is, is this the most efficient way? Or is there a far better way to go about this? The code is working fine, but as you start to get into big files it is incredibly slow.
First, I would advise the following for loading your List:
List<string> loggedUsers = System.IO.File.ReadAllLines("[...]users.txt").ToList();
You didn't specify how large the text file that you load into the RichTextBox is, but I assume it is quite large, since it takes so long.
I found this in another answer: it suggests the Lucene.NET search engine, but it also provides a simple way to multi-thread the search without that engine, making it faster.
I would translate the example to:
var foundUsers = loggedUsers.AsParallel().Where(user => rtbOrgFile.Text.Contains(user)).ToList();
This way, it checks for multiple logged users at once.
You need at least .NET 4.0 for Parallel LINQ (which this example uses), as far as I know. If you don't have access to .NET 4.0, you could try to manually create one or two Threads and let each one handle an equal part of loggedUsers to check. They would each make a separate foundUsers list and then report it back to you, where you would merge them to a single list using List<T>.AddRange(anotherList).
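If .NET 4.0 isn't available, a rough sketch of that manual two-thread split (assuming loggedUsers is already populated and the text has been read from the RichTextBox; the local names here are made up for illustration) could look like this:

// Pre-.NET 4.0 sketch: two threads each scan half of loggedUsers against a snapshot
// of the RichTextBox text, then the partial result lists are merged.
// Requires using System.Threading; and using System.Linq; (for Take/Skip).
string text = rtbOrgFile.Text;                       // snapshot the text once, on the UI thread
var firstHalf = loggedUsers.Take(loggedUsers.Count / 2).ToList();
var secondHalf = loggedUsers.Skip(loggedUsers.Count / 2).ToList();

var found1 = new List<string>();
var found2 = new List<string>();

var t1 = new Thread(() => { foreach (var u in firstHalf) if (text.Contains(u)) found1.Add(u); });
var t2 = new Thread(() => { foreach (var u in secondHalf) if (text.Contains(u)) found2.Add(u); });
t1.Start(); t2.Start();
t1.Join(); t2.Join();

// Merge the two partial lists into one result set.
var allFound = new List<string>();
allFound.AddRange(found1);
allFound.AddRange(found2);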

ColdFusion - How to update table cells in real time?

I am relatively new to ColdFusion (using ColdFusion 10) and I have a question regarding creating a real-time updated table.
Currently I have a C# application writing stock prices to a CSV (text) file every 2 seconds, and I would like to reflect these changes as they happen in a table on a web page. I know I could have the entire table refresh every 2 seconds, but this would generate a lot of requests to the server, and I would like to know if there is a better way of doing it. Could this be easily achieved using ColdFusion 10's new HTML5 WebSockets functionality?
Any advice/guidance on which way to proceed or how to achieve this would be greatly appreciated!
Thanks, AlanJames.
I think you could rewrite your question and get at least 5 answers in the first hour.
Now to answer it, assuming I've understood what you're asking.
IMHO WebSockets aren't there yet: if your website is aimed at the general population and you are not 100% sure they're coming with the most recent Chrome or FF, forget it.
You could use a JavaScript WebSocket library which gracefully falls back to Flash or AJAX HTTP polling, like http://socket.io/, or a cloud service like pusher.com. But this will complicate your life, because you have 2-3 times more work in the backend if you implement both polling and WebSockets.
Regarding the number of requests: if you want real-time data on screen, you have to have a server that can support it.
You could optimize by making one request that refreshes the data for the whole table, not one per cell. You'd get all the new data at once and use jQuery to update only the cells that changed, so you're not pulling all the data again, or the whole table's HTML, just a minimal amount of data.
AJAX polling would certainly help with the number of requests, though how long each request stays open is another possible problem. You could do polling with BlazeDS, which is available even in ColdFusion 9.
Some pages to look at:
http://www.bennadel.com/blog/2351-ColdFusion-10-Using-WebSockets-To-Push-A-Message-To-A-Target-User.htm
http://www.bennadel.com/blog/1956-Very-Simple-Pusher-And-ColdFusion-Powered-Chat.htm
http://nil.checksite.co.uk/index.cfm/2010/1/28/CF-BlazeDS-AJAX-LongPolling-Part1
There isn't a way to get live updates every 2 seconds without making some kind of request from your page to your server, otherwise how would it know if anything has changed?
Personally I would write a CFC method to read in your text file and see if it's changed, then poll that method every few seconds using jQuery to return whether it has changed or not, and pass back any updated content.
Without knowing the details of your text file etc. it's hard to write anything accurate. Fundamentally your CFC method would have to store (in a SESSION var probably) a copy of the text file data, so it could compare it with the latest read-in data and tell if anything has changed. If it has changed then send a structure back with the updates, or return a response saying it's unchanged.
Your CFC code would look something like this:
<cffunction name="check_update" access="remote" output="false">
    <cfset response = structNew()>
    <cffile action="read"
        file="path\to\your\textfile.txt"
        variable="file_content">
    <!--- seed the session copy on the first call so the comparison below doesn't fail --->
    <cfif NOT structKeyExists(SESSION, "file_content")>
        <cfset SESSION.file_content = "">
    </cfif>
    <cfif file_content NEQ SESSION.file_content>
        <cfset response.updated = true>
        <cfset SESSION.file_content = file_content>
        <cfset response.content = structNew()>
        <!--- code here to populate 'content' variable with updated info --->
    <cfelse>
        <cfset response.updated = false>
    </cfif>
    <cfreturn response>
</cffunction>
Then the jQuery code to poll that data would look like this:
var update_interval;
var update_pause = 3000;

function check_update() {
    var request = {
        returnformat: 'json',
        queryformat: 'column',
        method: 'check_update'
    };
    $.getJSON("path/to/your/service.cfc", request, function(data) {
        if (data.UPDATED == true) {
            /* code here to iterate through data.CONTENT */
            /* and render out your updated info to your table */
        }
    });
}

$(document).ready(function () {
    // pass the function reference, not the result of calling it
    update_interval = setInterval(check_update, update_pause);
});
So once the DOM is ready we create an interval that in this case fires every 3 seconds (3000ms) and calls the check_update() function. That function makes a call to your CFC, and checks the response. If the response UPDATED value is true then it runs whatever code to render your updates.
That's the most straightforward method of achieving what you need, and should work regardless of browser. In my experience the overhead of polling a CFC like that is really very small indeed, and the amount of data you're transferring will be tiny, so it should be no problem to handle.
I don't think there's any other method that could be more lightweight / easy to put together. The benefits of long polling or SSE (with dodgy browser support) are negligible and not worth the programming overhead.
Thanks, Henry

Tips for XML performance optimization on WP7

I have an application on the phone and it takes in about 50 pages of XML; each XML has about 100 nodes in it. So if you do the math, that is about 5000 nodes I am parsing. Sometimes these nodes are not set up the same. Example: maybe 75% have a different schema than the other 25%, so there is code to handle this and parse them differently.
I can't optimize the HTTP calls any more than I have, as the web services only serve up data 100 "items" at a time, so I basically have to hit the web service 50 times to get all the pages of data. Here is the high-level process.
Call web service (webclient)
Parse XML (take note of the total pages in the XML; it will say Page 1 of 100)
Add results to collection
Call web service again for page 2
Parse
Add results to collection
....rinse and repeat 100 times.
The parsing code is really the only place I can optimize. All I am doing is using LINQ to parse the XML and separate out the nodes into an IEnumerable, and then I parse them and place them in a custom object I created. I'm looking for some high-level ideas on how to optimize this entire process. Maybe I'm missing something.
Some code... just imagine the below repeated 1000 times or more, and with more attributes; this is a small example. Most have around 30 attributes that need parsing. Also, I have no access to a real schema, and no control over schema changes.
XElement eventData = XElement.Parse(e.Result);
IEnumerable<XElement> feed =
    (eventData.Element("results").Elements("event").Select(el => el)).Distinct();
foreach (XElement el in feed)
{
    _brokenItem = el.ToString();
    thisFeeditem.InternalGuid = Guid.NewGuid().ToString();
    thisFeeditem.ServiceIcon = GetServiceIcon(thisFeeditem.ServiceType);
    thisFeeditem.Description = el.Attribute("displayName").Value;
    thisFeeditem.EventURL = el.Attribute("uri").Value;
    thisFeeditem.Guid = el.Attribute("id").Value;
    thisFeeditem.Latitude = el.Element("venue").Attribute("lat").Value;
    thisFeeditem.Longitude = el.Element("venue").Attribute("lng").Value;
}
Without seeing your code, it is not easy to optimise it. However, there is one general point you should consider:
Linq-to-XML is a DOM-based parser, in that it reads the entire XML document into a model which resides in memory. All queries are executed against the DOM. For large documents, constructing the DOM can be memory and CPU intensive. Also, your Linq-to-XML queries, if written inefficiently can navigate the same tree nodes multiple times.
As an alternative, consider using a serial parser like XmlReader. Parsers of this type do not create a memory-based model of your document, and operate in a forward-only manner, forcing you to read each element just once.
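For illustration only, here is a bare-bones sketch of that forward-only approach using XmlReader. The element and attribute names ("event", "displayName", "uri", "id") are taken from the question's snippet; FeedItem and the items collection are placeholders.

// Forward-only parse: no DOM is built and each node is visited exactly once.
// Requires using System.Xml; and using System.IO;
using (XmlReader reader = XmlReader.Create(new StringReader(e.Result)))
{
    while (reader.Read())
    {
        if (reader.NodeType == XmlNodeType.Element && reader.Name == "event")
        {
            var item = new FeedItem                     // placeholder type
            {
                Description = reader.GetAttribute("displayName"),
                EventURL = reader.GetAttribute("uri"),
                Guid = reader.GetAttribute("id")
            };
            // The nested venue lat/lng values would need a further read of the
            // child element here before moving on.
            items.Add(item);                            // placeholder collection
        }
    }
}

Whether this is actually faster than the LINQ-to-XML version for ~100 nodes per page is worth measuring; the main win is memory, since the whole page is never held in a tree.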
You could change the architecture.
Create a web service that does the collection and filtering of the XML data and on the phone retrieve the data from that web service.
This way you move the heavy processing to a (scalable?) server, and you only have to modify the service when the XML sources change instead of having to update all clients.
You can also cache results and prevent duplicates.
Now you are in full control of what happens on the phone.
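As a very rough sketch only, such a server-side endpoint might aggregate the 50 pages once and cache the merged result so every phone doesn't trigger the full fetch. All names here (EventAggregator, FetchAndMergeAllPages, the cache key and timeout) are invented for illustration.

// Hypothetical server-side aggregation with a simple in-memory cache.
using System;
using System.Runtime.Caching;   // MemoryCache (.NET 4)

public class EventAggregator
{
    private static readonly MemoryCache Cache = MemoryCache.Default;

    public string GetAggregatedEvents()
    {
        var cached = Cache.Get("events") as string;
        if (cached != null)
            return cached;                                // serve the cached copy

        // The existing "call web service, parse, add to collection" loop moves here.
        string merged = FetchAndMergeAllPages();
        Cache.Set("events", merged, DateTimeOffset.Now.AddMinutes(5));
        return merged;
    }

    private string FetchAndMergeAllPages()
    {
        return "<results/>";                              // placeholder
    }
}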

In .Net 4.0, can DirectorySearch return LDAP results in a way that allows me to page through them?

I am working in C#, and am trying to use DirectorySearch to query the groups of an extremely large Microsoft ActiveDirectory LDAP server.
So, in my application, I'm going to have a paged list of groups, with searching capability. Naturally, I don't want to hammer my LDAP server with passing me the entire result set for these queries every time I hit "Next Page".
Is there a way, using DirectorySearch, to retrieve ONLY a single arbitrary page's results, rather than returning the entire result-set in one method call?
Similar questions:
DirectorySearch.PageSize = 2 doesn't work
c# Active Directory Services findAll() returns only 1000 entries
Many questions like these exist, where someone asks about paging (meaning from LDAP server to app server), and gets responses involving PageSize and SizeLimit. However, those properties only affect paging between the C# server and the LDAP server, and in the end, the only relevant methods that DirectorySearch has are FindOne() and FindAll().
What I'm looking for is basically "FindPaged(pageSize, pageNumber)", the pageNumber being the really important bit. I don't just want the first 1000 results; I want (for example) the 100th set of 1000 results. The app can't wait for 100,000 records to be passed from the LDAP server to the app server, even if they are broken up into 1,000-record chunks.
I understand that DirectoryServices.Protocols has SearchRequest, which (I think?) allows you to use a "PageResultRequestControl", which looks like it has what I'm looking for (although it looks like the paging information comes in "cookies", which I'm not sure how I'd be supposed to retrieve). But if there's a way to do this without rewriting the entire thing to use Protocols instead, I'd rather not have to do so.
I just can't imagine there's no way to do this... Even SQL has Row_Number.
UPDATE:
The PageResultRequestControl does not help - It's forward-only and sequential (You must call and get the first N results before you can get the "cookie" token necessary to make a call to get result N+1).
However, the cookie does appear to have some sort of reproducible ordering... On a result set I was working on, I iterated one by one through the results, and each time the cookie came out thusly:
1: {8, 0, 0, 0}
2: {11, 0, 0, 0}
3: {12, 0, 0, 0}
4: {16, 0, 0, 0}
When I iterated through two by two, I got the same numbers (11, 16).
This makes me think that if I could figure out the code of how those numbers are generated, I could create a cookie ad-hoc, which would give me exactly the paging I'm looking for.
The PageResultRequestControl is indeed the way to do this, it's part of the LDAP protocol. You'll just have to figure out what that implies for your code, sorry. There should be a way to use it from where you are, but, having said that, I'm working in Java and I've just had to write a dozen or so request controls and extended-operation classes for use with JNDI so you might be out of luck ... or you might have to do like I did. Warning, ASN.1 parsing follows not that far behind :-|
Sadly, it appears there may not be a way to do this given current C# libraries.
All of the standard C# 4.0 LDAP libraries return Top-N results (as in, FindAll(), which returns every result; FindOne(), which returns the first result; or SearchResult with PageResultRequestControl, which returns results N through N+M but requires you to retrieve results 1 through N-1 before you'll have a cookie token that you can pass with the request in order to get the next set).
I haven't been able to find any third-party LDAP libraries that allow this, either.
Unless a better solution is found, my path forward will be to modify the interface to instead display the top X results, with no client paging capabilities (obviously still using server-side paging as appropriate).
I may pursue a forward-only paging system at a later date, by passing the updated cookie to the client with the response, and passing it back with a click of a "More Results" type of button.
It might be worth investigating at a later date whether or not these cookies can be hand-crafted.
UPDATE:
I spoke with Microsoft Support and confirmed this - There is no way to do dynamic paging with LDAP servers. This is a limitation of LDAP servers themselves.
You can use Protocols and the Paging control (if your LDAP server supports it) to step forward at will, but there is no cross-server (or even cross-version) standard for the cookie, so you can't reasonably craft your own, and there's no guarantee that the cookie can be reused for repeated queries.
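For reference, a bare-bones sketch of that forward-only paging with System.DirectoryServices.Protocols (server name, base DN, filter, and page size are placeholders):

// Forward-only paging with the LDAP paged-results control: each response's cookie
// must be copied into the next request to get the following page.
using System.DirectoryServices.Protocols;

var connection = new LdapConnection("your-ldap-server");                 // placeholder
var request = new SearchRequest("DC=Fabrikam,DC=COM",                    // placeholder base DN
                                "(objectClass=group)", SearchScope.Subtree);
var pageControl = new PageResultRequestControl(1000);                    // page size
request.Controls.Add(pageControl);

while (true)
{
    var response = (SearchResponse)connection.SendRequest(request);
    foreach (SearchResultEntry entry in response.Entries)
    {
        // process one entry
    }

    PageResultResponseControl pageResponse = null;
    foreach (DirectoryControl control in response.Controls)
        if (control is PageResultResponseControl)
            pageResponse = (PageResultResponseControl)control;

    if (pageResponse == null || pageResponse.Cookie.Length == 0)
        break;                                  // no more pages
    pageControl.Cookie = pageResponse.Cookie;   // continue where the server left off
}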
A full solution involves using Protocols (with Paging as above) to pull your pageable result set into SQL, whether into a temp table or a permanent storage table, and allow your user to page and sort through THAT result set in the traditional manner. Bear in mind your results won't be precisely up to date, but with some smart cache updating you can minimize that risk.
Maybe you want to iterate through your "pages" using the range-attribute accordingly:
----copy & paste----
This sample retrieves entries 0-500, inclusively.
DirectoryEntry group = new DirectoryEntry("LDAP://CN=Sales,DC=Fabrikam,DC=COM");
DirectorySearcher groupMember = new DirectorySearcher
    (group, "(objectClass=*)", new string[] { "member;Range=0-500" }, SearchScope.Base);
SearchResult result = groupMember.FindOne();

// Each entry contains a property name and the path (ADsPath).
// The following code returns the property name from the PropertyCollection.
String propName = String.Empty;
foreach (string s in result.Properties.PropertyNames)
{
    if (s.ToLower() != "adspath")
    {
        propName = s;
        break;
    }
}
foreach (string member in result.Properties[propName])
{
    Console.WriteLine(member);
}
----copy & paste----
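The sample above only pulls the first range (0-500). A rough sketch of iterating through the subsequent ranges, under the assumption that the final chunk comes back with a property name ending in "*" (per the range-retrieval docs linked below), might look like this; the step size and variable names are illustrative:

// Hypothetical loop over member;Range=low-high until AD marks the last chunk.
int low = 0, step = 500;
var members = new List<string>();
bool lastPage = false;

while (!lastPage)
{
    string rangedAttribute = string.Format("member;Range={0}-{1}", low, low + step);
    var searcher = new DirectorySearcher(group, "(objectClass=*)",
                                         new[] { rangedAttribute }, SearchScope.Base);
    SearchResult result = searcher.FindOne();

    foreach (string propName in result.Properties.PropertyNames)
    {
        if (propName.ToLower() == "adspath")
            continue;
        // The final chunk is returned as e.g. "member;range=1501-*".
        if (propName.EndsWith("*"))
            lastPage = true;
        foreach (string member in result.Properties[propName])
            members.Add(member);
    }
    low += step + 1;
}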
For more information, see:
Enumerating Members in a Large Group
https://msdn.microsoft.com/en-us/library/ms180907.aspx
Range Retrieval of Attribute Values
https://msdn.microsoft.com/en-us/library/cc223242.aspx
Searching Using Range Retrieval
https://msdn.microsoft.com/en-us/library/aa367017.aspx

Receiving XML files every day - 12 types, need to search these every day

ASP.NET - C#.NET
I need advice regarding the design problem below:
I receive XML files every day. The quantity varies: e.g. yesterday 10 XML files were received, today 56, and maybe tomorrow 161, etc.
There are 12 types (12 XSDs), and at the top there is an attribute called FormType, e.g. FormType="1", FormType="2", FormType="12", etc., up to 12 form types.
All of them have common fields like Name, Address, Phone.
But e.g. FormType=1 is for Construction, FormType=2 is for IT, FormType=3 is for Hospital, FormType=4 is for Advertisement, etc.
As I said, all of them have common attributes.
Requirements:
I need a search screen so the user can search the contents of these XML files, but I don't have any clue how to approach this. E.g. search the text in some attributes for the XMLs received between Date_From and Date_To.
Problem:
I've heard about putting the XMLs in a binary field and running XPath queries against them or whatever, but I don't know the words to search for on Google.
I was thinking of creating a big database table, reading all the XMLs, and putting them in that table. But the issue is that some XML attributes are very large, like 2-3 pages, while the same attributes in other XML files are empty.
So with an NVARCHAR(MAX) column for every XML attribute in table.field, after some period my database will be a big, big monster...
Can someone advise what the best approach is to handle this issue?
I'm not 100% sure I understand your problem. I'm guessing that the query's supposed to return individual XML documents that meet some kind of user-specified criteria.
In that event, my starting point would probably be to implement a method for querying a single XML document, i.e. one that returns true if the document's a hit and false otherwise. In all likelihood, I'd make the query parameter an XPath query, but who knows? Here's a simple example:
// Requires using System.Linq; using System.Xml.Linq; and using System.Xml.XPath;
public bool TestXml(XDocument d, string query)
{
    return d.XPathSelectElements(query).Any();
}
Next, I need a store of XML documents to query. Where does that store live, and what form does it take? At a certain level, those are implementation details that my application doesn't care about. They could live in a database, or the file system. They could be cached in memory. I'd start by keeping it simple, something like:
public IEnumerable<XDocument> XmlDocuments()
{
    DirectoryInfo di = new DirectoryInfo(XmlDirectoryPath);
    foreach (FileInfo fi in di.GetFiles())
    {
        yield return XDocument.Load(fi.FullName);
    }
}
Now I can get all of the documents that fulfill a request like this:
public IEnumerable<XDocument> GetDocuments(string query)
{
    return XmlDocuments().Where(x => TestXml(x, query));
}
The thing that jumps out at me when I look at this problem: I have to parse my documents into XDocument objects to query them. That's going to happen whether they live in a database or the file system. (If I stick them in a database and write a stored procedure that does XPath queries, as someone suggested, I'm still parsing all of the XML every time I execute a query; I've just moved all that work to the database server.)
That's a lot of I/O and CPU time that gets spent doing the exact same thing over and over again. If the volume of queries is anything other than tiny, I'd consider building a List<XDocument> the first time GetDocuments() is called and come up with a scheme of keeping that list in memory until new XML documents are received (or possibly updating it when new XML documents are received).
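A sketch of that caching scheme, building directly on the methods above (the _cache field and InvalidateCache are additions of mine, not part of the answer's original code):

// Parse every document once, then serve queries from the in-memory list until
// new XML arrives and the cache is invalidated.
private List<XDocument> _cache;

public IEnumerable<XDocument> GetDocuments(string query)
{
    if (_cache == null)
        _cache = XmlDocuments().ToList();     // the only time the files are parsed

    return _cache.Where(x => TestXml(x, query));
}

// Call this whenever new XML documents are received.
public void InvalidateCache()
{
    _cache = null;
}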
