Polling SSIS execution status - C#

I have an SSIS package that launches another SSIS package in a Foreach container; because the container reports completion as soon as it has launched all the packages it needed to launch, I need a way to make it wait until all the "child" packages have completed.
So I implemented a little sleep-wait loop that basically pulls the Execution objects off SSISDB for the IDs I'm interested in.
The problem I'm facing is that a grand total of 0 Dts.Events.FireProgress events get fired. If I uncomment the Dts.Events.FireInformation call in the do loop, then every second I get a message saying 23 packages are still running... yet if I check SSISDB's Active Operations window, I see that most have already completed and only 3 or 4 are actually running.
What am I doing wrong? Why wouldn't runningCount contain the number of actually running executions?
using ssis = Microsoft.SqlServer.Management.IntegrationServices;

public void Main()
{
    const string serverName = "REDACTED";
    const string catalogName = "SSISDB";
    var ssisConnectionString = $"Data Source={serverName};Initial Catalog=msdb;Integrated Security=SSPI;";
    var ids = GetExecutionIDs(serverName);
    var idCount = ids.Count();
    var previousCount = -1;
    var iterations = 0;
    try
    {
        var fireAgain = true;
        const int secondsToSleep = 1;
        var sleepTime = TimeSpan.FromSeconds(secondsToSleep);
        var maxIterations = TimeSpan.FromHours(1).TotalSeconds / sleepTime.TotalSeconds;
        IDictionary<long, ssis.Operation.ServerOperationStatus> catalogExecutions;
        using (var connection = new SqlConnection(ssisConnectionString))
        {
            var server = new ssis.IntegrationServices(connection);
            var catalog = server.Catalogs[catalogName];
            do
            {
                catalogExecutions = catalog.Executions
                    .Where(execution => ids.Contains(execution.Id))
                    .ToDictionary(execution => execution.Id, execution => execution.Status);
                var runningCount = catalogExecutions.Count(kvp => kvp.Value == ssis.Operation.ServerOperationStatus.Running);
                System.Threading.Thread.Sleep(sleepTime);
                //Dts.Events.FireInformation(0, "ScriptMain", $"{runningCount} packages still running.", string.Empty, 0, ref fireAgain);
                if (runningCount != previousCount)
                {
                    previousCount = runningCount;
                    decimal completed = idCount - runningCount;
                    decimal percentCompleted = completed / idCount;
                    Dts.Events.FireProgress($"Waiting... {completed}/{idCount} completed", Convert.ToInt32(100 * percentCompleted), 0, 0, "", ref fireAgain);
                }
                iterations++;
                if (iterations >= maxIterations)
                {
                    Dts.Events.FireWarning(0, "ScriptMain", $"Timeout expired, requesting cancellation.", string.Empty, 0);
                    Dts.Events.FireQueryCancel();
                    Dts.TaskResult = (int)Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Canceled;
                    return;
                }
            }
            while (catalogExecutions.Any(kvp => kvp.Value == ssis.Operation.ServerOperationStatus.Running));
        }
    }
    catch (Exception exception)
    {
        if (exception.InnerException != null)
        {
            Dts.Events.FireError(0, "ScriptMain", exception.InnerException.ToString(), string.Empty, 0);
        }
        Dts.Events.FireError(0, "ScriptMain", exception.ToString(), string.Empty, 0);
        Dts.Log(exception.ToString(), 0, new byte[0]);
        Dts.TaskResult = (int)ScriptResults.Failure;
        return;
    }
    Dts.TaskResult = (int)ScriptResults.Success;
}
The GetExecutionIDs function simply returns all execution IDs for the child packages, from my metadata database.
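For context, GetExecutionIDs isn't shown in the question; here is a minimal sketch of what it might look like. The metadata table and column names are hypothetical, not from the actual code:
private IEnumerable<long> GetExecutionIDs(string serverName)
{
    // Hypothetical metadata query; dbo.ChildPackageExecutions and its
    // column names are illustrative only.
    var results = new List<long>();
    var connectionString = $"Data Source={serverName};Initial Catalog=Metadata;Integrated Security=SSPI;";
    using (var connection = new SqlConnection(connectionString))
    using (var command = new SqlCommand("SELECT ExecutionId FROM dbo.ChildPackageExecutions", connection))
    {
        connection.Open();
        using (var reader = command.ExecuteReader())
        {
            while (reader.Read())
            {
                results.Add(reader.GetInt64(0));
            }
        }
    }
    return results;
}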

The problem is that you're re-using the same connection at every iteration. Turn this:
using (var connection = new SqlConnection(ssisConnectionString))
{
    var server = new ssis.IntegrationServices(connection);
    var catalog = server.Catalogs[catalogName];
    do
    {
        catalogExecutions = catalog.Executions
            .Where(execution => ids.Contains(execution.Id))
            .ToDictionary(execution => execution.Id, execution => execution.Status);
Into this:
do
{
    using (var connection = new SqlConnection(ssisConnectionString))
    {
        var server = new ssis.IntegrationServices(connection);
        var catalog = server.Catalogs[catalogName];
        catalogExecutions = catalog.Executions
            .Where(execution => ids.Contains(execution.Id))
            .ToDictionary(execution => execution.Id, execution => execution.Status);
    }
And you'll get a correct execution status every time. I'm not sure why the connection can't be reused, but keeping connections as short-lived as possible is always a good idea, and this is one more piece of evidence for it.
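If you'd rather not depend on the object model's caching behavior at all, another option is to poll the catalog.executions view directly. This is a sketch under the assumption that the standard SSISDB catalog views are available (status = 2 means "running" there):
// Sketch: count still-running executions straight from SSISDB,
// bypassing the IntegrationServices object model entirely.
private static int CountRunningExecutions(string connectionString, IEnumerable<long> ids)
{
    // The IDs come from our own metadata database, not user input,
    // so inlining them here is acceptable for a sketch.
    var idList = string.Join(",", ids);
    var sql = $"SELECT COUNT(*) FROM SSISDB.catalog.executions WHERE status = 2 AND execution_id IN ({idList})";
    using (var connection = new SqlConnection(connectionString))
    using (var command = new SqlCommand(sql, connection))
    {
        connection.Open();
        return (int)command.ExecuteScalar();
    }
}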

Related

ZeroMQ Broadcast with Proxy losing messages

I am doing some performance tests on ZeroMQ in order to compare it with others like RabbitMQ and ActiveMQ.
In my broadcast tests, to avoid "The Dynamic Discovery Problem" described in the ZeroMQ documentation, I have used a proxy. In my scenario I am using 50 concurrent publishers, each sending 500 messages with a 1 ms delay between sends. Each message is then read by 50 subscribers. And as I said, I am losing messages: each subscriber should receive a total of 25000 messages, but each is receiving only between 5000 and 10000.
I am using Windows and the C# .NET client clrzmq4 (4.1.0.31).
I have already tried some solutions that I found in other posts:
I have set Linger to TimeSpan.MaxValue
I have set ReceiveHighWatermark to 0 (it is presented as infinite, but I have also tried Int32.MaxValue)
I have checked for slow-starting receivers: I made the receivers start some seconds before the publishers
I made sure the socket instances are not garbage collected (Linger should take care of this, but just to be sure)
I have a similar scenario (with similar logic) using NetMQ and it works fine. That scenario does not use security, though, and this one does (which is also why I use clrzmq here: I need client authentication with certificates, which is not yet possible in NetMQ).
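One thing not on the list above that may be worth checking (an assumption on my part, not something the post confirms): for PUB/XPUB sockets, ZeroMQ drops messages on the send side when the send high-water mark (ZMQ_SNDHWM) is reached, so ReceiveHighWatermark on the publisher does nothing, and the CURVE handshake makes subscribers effectively slower to drain, which makes send-side drops more likely. A sketch, assuming clrzmq4 exposes ZMQ_SNDHWM as SendHighWatermark mirroring ReceiveHighWatermark; these lines would slot in next to the existing ReceiveHighWatermark assignments:
// Sketch: raise the send-side HWM, where PUB/XPUB drops actually happen.
// 0 means "no limit" in ZeroMQ; memory use is then unbounded.
publisher.SendHighWatermark = 0;   // PUB socket: drops occur on send
xpubSocket.SendHighWatermark = 0;  // XPUB side of the proxy
xsubSocket.SendHighWatermark = 0;  // XSUB forwards to XPUB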
EDIT:
public class MCVEPublisher
{
    public void publish(int numberOfMessages)
    {
        string topic = "TopicA";
        ZContext ZContext = ZContext.Create();
        ZSocket publisher = new ZSocket(ZContext, ZSocketType.PUB);

        //Security
        // Create or load certificates
        ZCert serverCert = Main.GetOrCreateCert("publisher");
        var actor = new ZActor(ZContext, ZAuth.Action, null);
        actor.Start();
        // send CURVE settings to ZAuth
        actor.Frontend.Send(new ZFrame("VERBOSE"));
        actor.Frontend.Send(new ZMessage(new List<ZFrame>()
            { new ZFrame("ALLOW"), new ZFrame("127.0.0.1") }));
        actor.Frontend.Send(new ZMessage(new List<ZFrame>()
            { new ZFrame("CURVE"), new ZFrame(".curve") }));
        publisher.CurvePublicKey = serverCert.PublicKey;
        publisher.CurveSecretKey = serverCert.SecretKey;
        publisher.CurveServer = true;

        publisher.Linger = TimeSpan.MaxValue;
        publisher.ReceiveHighWatermark = Int32.MaxValue;
        publisher.Connect("tcp://127.0.0.1:5678");
        Thread.Sleep(3500);
        for (int i = 0; i < numberOfMessages; i++)
        {
            Thread.Sleep(1);
            var update = $"{topic} {"message"}";
            using (var updateFrame = new ZFrame(update))
            {
                publisher.Send(updateFrame);
            }
        }
        // just to make sure it does not end instantly
        Thread.Sleep(60000);
        // just to make sure publisher is not garbage collected
        ulong Affinity = publisher.Affinity;
    }
}
public class MCVESubscriber
{
    private ZSocket subscriber;
    private List<string> prints = new List<string>();

    public void read()
    {
        string topic = "TopicA";
        var context = new ZContext();
        subscriber = new ZSocket(context, ZSocketType.SUB);

        //Security
        ZCert serverCert = Main.GetOrCreateCert("xpub");
        ZCert clientCert = Main.GetOrCreateCert("subscriber");
        subscriber.CurvePublicKey = clientCert.PublicKey;
        subscriber.CurveSecretKey = clientCert.SecretKey;
        subscriber.CurveServer = true;
        subscriber.CurveServerKey = serverCert.PublicKey;

        subscriber.Linger = TimeSpan.MaxValue;
        subscriber.ReceiveHighWatermark = Int32.MaxValue;

        // Connect
        subscriber.Connect("tcp://127.0.0.1:1234");
        subscriber.Subscribe(topic);
        while (true)
        {
            using (var replyFrame = subscriber.ReceiveFrame())
            {
                string messageReceived = replyFrame.ReadString();
                messageReceived = Convert.ToString(messageReceived.Split(' ')[1]);
                prints.Add(messageReceived);
            }
        }
    }

    public void PrintMessages()
    {
        Console.WriteLine("printing " + prints.Count);
    }
}
public class Main
{
    static void Main(string[] args)
    {
        broadcast(500, 50, 50, 30000);
    }

    public static void broadcast(int numberOfMessages, int numberOfPublishers, int numberOfSubscribers, int timeOfRun)
    {
        new Thread(() =>
        {
            using (var context = new ZContext())
            using (var xsubSocket = new ZSocket(context, ZSocketType.XSUB))
            using (var xpubSocket = new ZSocket(context, ZSocketType.XPUB))
            {
                //Security
                ZCert serverCert = GetOrCreateCert("publisher");
                ZCert clientCert = GetOrCreateCert("xsub");
                xsubSocket.CurvePublicKey = clientCert.PublicKey;
                xsubSocket.CurveSecretKey = clientCert.SecretKey;
                xsubSocket.CurveServer = true;
                xsubSocket.CurveServerKey = serverCert.PublicKey;

                xsubSocket.Linger = TimeSpan.MaxValue;
                xsubSocket.ReceiveHighWatermark = Int32.MaxValue;
                xsubSocket.Bind("tcp://*:5678");

                //Security
                serverCert = GetOrCreateCert("xpub");
                var actor = new ZActor(ZAuth.Action0, null);
                actor.Start();
                // send CURVE settings to ZAuth
                actor.Frontend.Send(new ZFrame("VERBOSE"));
                actor.Frontend.Send(new ZMessage(new List<ZFrame>()
                    { new ZFrame("ALLOW"), new ZFrame("127.0.0.1") }));
                actor.Frontend.Send(new ZMessage(new List<ZFrame>()
                    { new ZFrame("CURVE"), new ZFrame(".curve") }));
                xpubSocket.CurvePublicKey = serverCert.PublicKey;
                xpubSocket.CurveSecretKey = serverCert.SecretKey;
                xpubSocket.CurveServer = true;

                xpubSocket.Linger = TimeSpan.MaxValue;
                xpubSocket.ReceiveHighWatermark = Int32.MaxValue;
                xpubSocket.Bind("tcp://*:1234");

                using (var subscription = ZFrame.Create(1))
                {
                    subscription.Write(new byte[] { 0x1 }, 0, 1);
                    xpubSocket.Send(subscription);
                }

                Console.WriteLine("Intermediary started, and waiting for messages");
                // proxy messages between frontend / backend
                ZContext.Proxy(xsubSocket, xpubSocket);
                Console.WriteLine("end of proxy");

                // just to make sure it does not end instantly
                Thread.Sleep(60000);
                // just to make sure xpubSocket and xsubSocket are not garbage collected
                ulong Affinity = xpubSocket.Affinity;
                int ReceiveHighWatermark = xsubSocket.ReceiveHighWatermark;
            }
        }).Start();

        Thread.Sleep(5000); // to make sure proxy started
        List<MCVESubscriber> Subscribers = new List<MCVESubscriber>();
        for (int i = 0; i < numberOfSubscribers; i++)
        {
            MCVESubscriber ZeroMqSubscriber = new MCVESubscriber();
            new Thread(() =>
            {
                ZeroMqSubscriber.read();
            }).Start();
            Subscribers.Add(ZeroMqSubscriber);
        }

        Thread.Sleep(10000); // to make sure all subscribers started
        for (int i = 0; i < numberOfPublishers; i++)
        {
            MCVEPublisher ZeroMqPublisherBroadcast = new MCVEPublisher();
            new Thread(() =>
            {
                ZeroMqPublisherBroadcast.publish(numberOfMessages);
            }).Start();
        }

        Thread.Sleep(timeOfRun);
        foreach (MCVESubscriber Subscriber in Subscribers)
        {
            Subscriber.PrintMessages();
        }
    }

    public static ZCert GetOrCreateCert(string name, string curvpath = ".curve")
    {
        ZCert cert;
        string keyfile = Path.Combine(curvpath, name + ".pub");
        if (!File.Exists(keyfile))
        {
            cert = new ZCert();
            Directory.CreateDirectory(curvpath);
            cert.SetMeta("name", name);
            cert.Save(keyfile);
        }
        else
        {
            cert = ZCert.Load(keyfile);
        }
        return cert;
    }
}
This code produces the expected number of messages when security is disabled, but with security turned on it does not.
Does anyone know of something else to check? Or has this happened to anyone before?
Thanks

Execution Timeout Expired errors using Entity Framework

My code below starts out great and seems to run pretty quickly, but after a while I start getting this error message. I realize that Entity Framework's DbContext is not thread safe, and that is probably causing the issue. Is there a way to change this code so that it keeps the same performance but doesn't leak connections or contexts across threads, or is there another way to speed up this process? I have over 9000 symbols to download and insert into a database; I tried basic for loops with await, but that was extremely slow and took more than 10 times longer to achieve the same results.
public static async Task startInitialMarketSymbolsDownload(string market)
{
    try
    {
        List<string> symbolList = new List<string>();
        symbolList = getStockSymbols(market);
        var historicalGroups = symbolList.Select((x, i) => new { x, i })
            .GroupBy(x => x.i / 50)
            .Select(g => g.Select(x => x.x).ToArray());
        await Task.WhenAll(historicalGroups.Select(g => Task.Run(() => getLocalHistoricalStockData(g, market))));
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.Message);
    }
}

public static async Task getLocalHistoricalStockData(string[] symbols, string market)
{
    // download data for list of symbols and then upload to db tables
    string symbolInfo = null;
    try
    {
        using (financeEntities context = new financeEntities())
        {
            foreach (string symbol in symbols)
            {
                symbolInfo = symbol;
                List<HistoryPrice> hList = Get(symbol, new DateTime(1900, 1, 1), DateTime.UtcNow);
                var backDates = context.DailyAmexDatas.Where(c => c.Symbol == symbol).Select(d => d.Date).ToList();
                List<HistoryPrice> newHList = hList.Where(c => backDates.Contains(c.Date) == false).ToList<HistoryPrice>();
                foreach (HistoryPrice h in newHList)
                {
                    DailyAmexData amexData = new DailyAmexData();
                    // set the data then add it
                    amexData.Symbol = symbol;
                    amexData.Open = h.Open;
                    amexData.High = h.High;
                    amexData.Low = h.Low;
                    amexData.Close = h.Close;
                    amexData.Volume = h.Volume;
                    amexData.AdjustedClose = h.AdjClose;
                    amexData.Date = h.Date;
                    context.DailyAmexDatas.Add(amexData);
                }
                // now save everything
                await context.SaveChangesAsync();
                Console.WriteLine(symbol + " added to the " + market + " database!");
            }
        }
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.InnerException.Message);
    }
}
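One way to keep the parallelism but stop overwhelming the connection pool is to throttle the groups with a SemaphoreSlim, so each concurrent task still gets its own DbContext but only a fixed number are in flight at once. This is a sketch under the assumption that getStockSymbols and getLocalHistoricalStockData behave as in the question; the limit of 4 is an arbitrary starting point to tune:
public static async Task startInitialMarketSymbolsDownload(string market)
{
    var symbolList = getStockSymbols(market);
    var historicalGroups = symbolList.Select((x, i) => new { x, i })
        .GroupBy(x => x.i / 50)
        .Select(g => g.Select(x => x.x).ToArray());

    // Allow at most 4 groups (and therefore 4 contexts/connections) at a time.
    using (var throttler = new SemaphoreSlim(4))
    {
        var tasks = historicalGroups.Select(async g =>
        {
            await throttler.WaitAsync();
            try { await getLocalHistoricalStockData(g, market); }
            finally { throttler.Release(); }
        });
        await Task.WhenAll(tasks);
    }
}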

The underlying provider failed on Open / The operation is not valid for the state of the transaction

Here is my code
public static string UpdateEmptyCaseReviewerSet() {
    string response = string.Empty;
    using (System.Transactions.TransactionScope tran = new System.Transactions.TransactionScope()) {
        using (var db = new Entities.WaveEntities()) {
            var maxCaseReviewersSetID = db.CaseReviewerSets.Select(crs => crs.CaseReviewersSetId).Max();
            var emptyCHList = db.CaseHistories.Where(ch => ch.CaseReviewersSetID == null && ch.IsLatest == true && ch.StatusID != 100).ToList();
            for (int i = 0; i < emptyCHList.Count; i++) {
                var emptyCH = emptyCHList[i];
                var newCaseReviewerSET = new Entities.CaseReviewerSet();
                newCaseReviewerSET.CreationCHID = emptyCH.CHID;
                db.CaseReviewerSets.Add(newCaseReviewerSET);
                emptyCH.CaseReviewerSet = newCaseReviewerSET;
            }
            db.SaveChanges();
        }
        tran.Complete();
    }
    return response;
}
The exception occurs on "db.SaveChanges()".
I saw in another post with the same error message something about "it seems I cannot have two connections opened to the same database within the TransactionScope block", but I don't think that has anything to do with my case.
Additionally, the number of records to insert and update is 2700 in total, which is not that many really. But it does take quite a long time to complete the for statement (10 minutes or so). Since everything happening within the for statement is actually happening in memory, can someone please explain why it is taking so long?
You can try as shown below, using the newer db.Database.BeginTransaction API.
Note: use foreach instead of for.
using (var db = new Entities.WaveEntities())
{
    using (var dbContextTransaction = db.Database.BeginTransaction())
    {
        try
        {
            var maxCaseReviewersSetID = db.CaseReviewerSets.Select(crs => crs.CaseReviewersSetId).Max();
            var emptyCHList = db.CaseHistories.Where(ch => ch.CaseReviewersSetID == null && ch.IsLatest == true && ch.StatusID != 100).ToList();
            foreach (var ch in emptyCHList) {
                var newCaseReviewerSET = new Entities.CaseReviewerSet();
                newCaseReviewerSET.CreationCHID = ch.CHID;
                db.CaseReviewerSets.Add(newCaseReviewerSET);
            }
            db.SaveChanges();
            dbContextTransaction.Commit();
        }
        catch (Exception)
        {
            dbContextTransaction.Rollback();
        }
    }
}
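As for why the loop takes 10 minutes: that can't be confirmed from the code alone, but the usual suspect in EF6 is automatic change detection, since DetectChanges runs on every Add and makes a loop of thousands of Adds effectively quadratic. A sketch of the common mitigation, assuming the same entities as above:
using (var db = new Entities.WaveEntities())
{
    // Turn off per-Add change detection for bulk inserts; EF still
    // runs DetectChanges once inside SaveChanges.
    db.Configuration.AutoDetectChangesEnabled = false;
    try
    {
        var emptyCHList = db.CaseHistories
            .Where(ch => ch.CaseReviewersSetID == null && ch.IsLatest == true && ch.StatusID != 100)
            .ToList();
        foreach (var ch in emptyCHList)
        {
            var newCaseReviewerSET = new Entities.CaseReviewerSet { CreationCHID = ch.CHID };
            db.CaseReviewerSets.Add(newCaseReviewerSET);
            ch.CaseReviewerSet = newCaseReviewerSET;
        }
        db.SaveChanges();
    }
    finally
    {
        db.Configuration.AutoDetectChangesEnabled = true;
    }
}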

SqlWorkflowInstanceStore WaitForEvents returns HasRunnableWorkflowEvent but LoadRunnableInstance fails

Hi all,
Please help me with restoring delayed (and persisted) workflows.
I'm trying to check, on a self-hosted workflow store, whether there is any instance that was delayed and can be resumed. For testing purposes I've created a dummy activity that is delayed and persists on delay.
Generally, the resume process looks like this:
get the WF definition
configure the SQL instance store
call WaitForEvents
if there is an event named HasRunnableWorkflowEvent.Value, then
create a WorkflowApplication object and execute the LoadRunnableInstance method
It works fine if the store is created/initialized, WaitForEvents is called, and the store is closed. In that case the store reads all available workflows from the persistence DB and throws a timeout exception if there are no workflows available to resume.
The problem happens if the store is created once and the loop runs only WaitForEvents (the same thing happens with BeginWaitForEvents). In that case it reads all available workflows from the DB (with the proper IDs), but then, instead of a timeout exception, it goes on to read one more instance (I know exactly how many workflows are ready to be resumed, because I'm using a separate test database) and fails with InstanceNotReadyException. In the catch I check workflowApplication.Id, but it is not an ID my test saved before.
I've tried running on a new (empty) persistence database and the result is the same :(
This code fails:
using (var storeWrapper = new StoreWrapper(wf, connStr))
    for (int q = 0; q < 5; q++)
    {
        var id = Resume(storeWrapper); // InstanceNotReadyException here when all activities are resumed
But this one works as expected:
for (int q = 0; q < 5; q++)
    using (var storeWrapper = new StoreWrapper(wf, connStr))
    {
        var id = Resume(storeWrapper); // timeout exception here, or BeginWaitForEvents continues to wait
What is the best solution in such a case? Adding an empty catch for InstanceNotReadyException and ignoring it?
Here are my tests
const int delayTime = 15;
string connStr = "Server=db;Database=AppFabricDb_Test;Integrated Security=True;";

[TestMethod]
public void PersistOneOnIdleAndResume()
{
    var wf = GetDelayActivity();
    using (var storeWrapper = new StoreWrapper(wf, connStr))
    {
        var id = CreateAndRun(storeWrapper);
        Trace.WriteLine(string.Format("done {0}", id));
    }
    using (var storeWrapper = new StoreWrapper(wf, connStr))
        for (int q = 0; q < 5; q++)
        {
            var id = Resume(storeWrapper);
            Trace.WriteLine(string.Format("resumed {0}", id));
        }
}

Activity GetDelayActivity(string addName = "")
{
    var name = new Variable<string>(string.Format("incr{0}", addName));
    Activity wf = new Sequence
    {
        DisplayName = "testDelayActivity",
        Variables = { name, new Variable<string>("CustomDataContext") },
        Activities =
        {
            new WriteLine
            {
                Text = string.Format("before delay {0}", delayTime)
            },
            new Delay
            {
                Duration = new InArgument<TimeSpan>(new TimeSpan(0, 0, delayTime))
            },
            new WriteLine
            {
                Text = "after delay"
            }
        }
    };
    return wf;
}

Guid CreateAndRun(StoreWrapper sw)
{
    var idleEvent = new AutoResetEvent(false);
    var wfApp = sw.GetApplication();
    wfApp.Idle = e => idleEvent.Set();
    wfApp.Aborted = e => idleEvent.Set();
    wfApp.Completed = e => idleEvent.Set();
    wfApp.Run();
    idleEvent.WaitOne(40 * 1000);
    var res = wfApp.Id;
    wfApp.Unload();
    return res;
}

Guid Resume(StoreWrapper sw)
{
    var res = Guid.Empty;
    var events = sw.GetStore().WaitForEvents(sw.Handle, new TimeSpan(0, 0, delayTime));
    if (events.Any(e => e.Equals(HasRunnableWorkflowEvent.Value)))
    {
        var idleEvent = new AutoResetEvent(false);
        var obj = sw.GetApplication();
        try
        {
            // InstanceNotReadyException here if the same store has read all
            // instances from the DB and no delayed ones are left
            obj.LoadRunnableInstance();
            obj.Idle = e => idleEvent.Set();
            obj.Completed = e => idleEvent.Set();
            obj.Run();
            idleEvent.WaitOne(40 * 1000);
            res = obj.Id;
            obj.Unload();
        }
        catch (InstanceNotReadyException)
        {
            Trace.TraceError("failed to resume {0} {1} {2}", obj.Id
                , obj.DefinitionIdentity == null ? null : obj.DefinitionIdentity.Name
                , obj.DefinitionIdentity == null ? null : obj.DefinitionIdentity.Version);
            foreach (var e in events)
            {
                Trace.TraceWarning("event {0}", e.Name);
            }
            throw;
        }
    }
    return res;
}
Here is the store wrapper definition I'm using for the test:
public class StoreWrapper : IDisposable
{
    Activity WfDefinition { get; set; }
    public static readonly XName WorkflowHostTypePropertyName = XNamespace.Get("urn:schemas-microsoft-com:System.Activities/4.0/properties").GetName("WorkflowHostType");

    public StoreWrapper(Activity wfDefinition, string connectionStr)
    {
        _store = new SqlWorkflowInstanceStore(connectionStr);
        HostTypeName = XName.Get(wfDefinition.DisplayName, "ttt.workflow");
        WfDefinition = wfDefinition;
    }

    SqlWorkflowInstanceStore _store;

    public SqlWorkflowInstanceStore GetStore()
    {
        if (Handle == null)
        {
            InitStore(_store, WfDefinition);
            Handle = _store.CreateInstanceHandle();
            var view = _store.Execute(Handle, new CreateWorkflowOwnerCommand
            {
                InstanceOwnerMetadata = { { WorkflowHostTypePropertyName, new InstanceValue(HostTypeName) } }
            }, TimeSpan.FromSeconds(30));
            _store.DefaultInstanceOwner = view.InstanceOwner;
            //Trace.WriteLine(string.Format("{0} owns {1}", view.InstanceOwner.InstanceOwnerId, HostTypeName));
        }
        return _store;
    }

    protected virtual void InitStore(SqlWorkflowInstanceStore store, Activity wfDefinition)
    {
    }

    public InstanceHandle Handle { get; protected set; }
    XName HostTypeName { get; set; }

    public void Dispose()
    {
        if (Handle != null)
        {
            var deleteOwner = new DeleteWorkflowOwnerCommand();
            //Trace.WriteLine(string.Format("{0} frees {1}", Store.DefaultInstanceOwner.InstanceOwnerId, HostTypeName));
            _store.Execute(Handle, deleteOwner, TimeSpan.FromSeconds(30));
            Handle.Free();
            Handle = null;
            _store = null;
        }
    }

    public WorkflowApplication GetApplication()
    {
        var wfApp = new WorkflowApplication(WfDefinition);
        wfApp.InstanceStore = GetStore();
        wfApp.PersistableIdle = e => PersistableIdleAction.Persist;
        Dictionary<XName, object> wfScope = new Dictionary<XName, object> { { WorkflowHostTypePropertyName, HostTypeName } };
        wfApp.AddInitialInstanceValues(wfScope);
        return wfApp;
    }
}
I'm not a Workflow Foundation expert, so my answer is based on the official samples from Microsoft. The first is WF4 host resumes delayed workflow (CSWF4LongRunningHost) and the second is Microsoft.Samples.AbsoluteDelay. In both samples you will find code similar to yours, i.e.:
try
{
    wfApp.LoadRunnableInstance();
    ...
}
catch (InstanceNotReadyException)
{
    //Some logging
}
Taking this into account, the answer is that you are right: the empty catch for InstanceNotReadyException is a good solution.

First time creating a method that uses parallel linq and getting out of memory exception

I wrote a method to download data from the internet and save it to my database. I wrote it using PLINQ to take advantage of my multi-core processor, and because it downloads thousands of different files in a very short period of time. I have added comments in my code below to show where it stops, but the program just sits there, and after a while I get an out-of-memory exception. This being my first time using the TPL and PLINQ, I'm extremely confused, so I could really use some advice on what to do to fix this.
UPDATE: I found out that I was constantly getting a WebException because the WebClient was timing out. I fixed this by increasing the maximum number of connections according to this answer here. I was then getting exceptions for the connection not opening, and I fixed that by using this answer here. I'm now getting connection timeout errors for the database even though it is a local SQL Server. I still haven't been able to get any of my code to run, so I could totally use some advice.
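(For reference, the connection-limit change mentioned above is typically applied once at startup along these lines; the value 100 is just an illustration:)
// Raise the default cap on concurrent HTTP connections per host,
// before any WebClient/HttpWebRequest is created.
System.Net.ServicePointManager.DefaultConnectionLimit = 100;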
static void Main(string[] args)
{
    try
    {
        while (true)
        {
            // start the download process for market info
            startDownload();
        }
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.Message);
        Console.WriteLine(ex.StackTrace);
    }
}

public static void startDownload()
{
    DateTime currentDay = DateTime.Now;
    List<Task> taskList = new List<Task>();
    if (Helper.holidays.Contains(currentDay) == false)
    {
        List<string> markets = new List<string>() { "amex", "nasdaq", "nyse", "global" };
        Parallel.ForEach(markets, market =>
        {
            Downloads.startInitialMarketSymbolsDownload(market);
        });
        Console.WriteLine("All downloads finished!");
    }
    // wait 24 hours before you do this again
    Task.Delay(TimeSpan.FromHours(24)).Wait();
}

public static void startInitialMarketSymbolsDownload(string market)
{
    try
    {
        List<string> symbolList = new List<string>();
        symbolList = Helper.getStockSymbols(market);
        var historicalGroups = symbolList.AsParallel().Select((x, i) => new { x, i })
            .GroupBy(x => x.i / 100)
            .Select(g => g.Select(x => x.x).ToArray());
        historicalGroups.AsParallel().ForAll(g => getHistoricalStockData(g, market));
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.Message);
        Console.WriteLine(ex.StackTrace);
    }
}

public static void getHistoricalStockData(string[] symbols, string market)
{
    // download data for list of symbols and then upload to db tables
    Uri uri;
    string url, line;
    decimal open = 0, high = 0, low = 0, close = 0, adjClose = 0;
    DateTime date;
    Int64 volume = 0;
    string[] lineArray;
    List<string> symbolError = new List<string>();
    Dictionary<string, string> badNameError = new Dictionary<string, string>();
    Parallel.ForEach(symbols, symbol =>
    {
        url = "http://ichart.finance.yahoo.com/table.csv?s=" + symbol + "&a=00&b=1&c=1900&d=" + (DateTime.Now.Month - 1) + "&e=" + DateTime.Now.Day + "&f=" + DateTime.Now.Year + "&g=d&ignore=.csv";
        uri = new Uri(url);
        using (dbEntities entity = new dbEntities())
        using (WebClient client = new WebClient())
        using (Stream stream = client.OpenRead(uri))
        using (StreamReader reader = new StreamReader(stream))
        {
            while (reader.EndOfStream == false)
            {
                line = reader.ReadLine();
                lineArray = line.Split(',');
                // if it isn't the very first line
                if (lineArray[0] != "Date")
                {
                    // set the data for each array here
                    date = Helper.parseDateTime(lineArray[0]);
                    open = Helper.parseDecimal(lineArray[1]);
                    high = Helper.parseDecimal(lineArray[2]);
                    low = Helper.parseDecimal(lineArray[3]);
                    close = Helper.parseDecimal(lineArray[4]);
                    volume = Helper.parseInt(lineArray[5]);
                    adjClose = Helper.parseDecimal(lineArray[6]);
                    switch (market)
                    {
                        case "nasdaq":
                            DailyNasdaqData nasdaqData = new DailyNasdaqData();
                            var nasdaqQuery = from r in entity.DailyNasdaqDatas.AsParallel().AsEnumerable()
                                              where r.Date == date
                                              select new StockData { Close = r.AdjustedClose };
                            List<StockData> nasdaqResult = nasdaqQuery.AsParallel().ToList(); // hits this line
                            break;
                        default:
                            break;
                    }
                }
            }
            // now save everything
            entity.SaveChanges();
        }
    });
}
Async lambdas work like async methods in one regard: they do not complete synchronously; instead they return a Task. In your parallel loop you are simply generating tasks as fast as you can. Those tasks hold onto memory and other resources, such as DB connections.
The simplest fix is probably to just use synchronous database commits. This will not result in a loss of throughput, because the database cannot deal with a high volume of concurrent DML anyway.
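A sketch of that advice against the question's own getHistoricalStockData, assuming the same dbEntities context and helpers; the degree of parallelism of 4 is an arbitrary starting point. Bounding the parallelism explicitly and committing synchronously means only a fixed number of contexts and connections is ever alive:
public static void getHistoricalStockData(string[] symbols, string market)
{
    // Cap the number of simultaneous workers so WebClient/DbContext
    // pairs (and their connections) stay bounded.
    var options = new ParallelOptions { MaxDegreeOfParallelism = 4 };
    Parallel.ForEach(symbols, options, symbol =>
    {
        using (var entity = new dbEntities())
        using (var client = new WebClient())
        {
            // ... download and parse the CSV exactly as in the original loop ...
            entity.SaveChanges(); // synchronous commit completes before the worker exits
        }
    });
}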
