I have a C# script task in an SSIS job that calls an API for geocoding. The API is proprietary and works roughly like this: it receives a request, takes the address string, and attempts to string-match it against a huge list of addresses (millions). If it cannot find a match, it goes out to another service, such as Google, to get the geodata.
As you can imagine, this string matching takes a lot of time per request. Sometimes it's as slow as one request per minute, and I have 4M addresses to process. Getting any dev work done on the API side is not an option. To give a better picture of the process, here is what I'm doing currently:
I pull a list of addresses from the database (about 4M), put it in a DataTable, and set variables:
// Fill c# datatable with query results
sdagetGeoData.Fill(dtGeoData);
// check to ensure the datatable has rows
if (dtGeoData.Rows.Count > 0)
{
// if the datatable has rows, set the variables for every row
foreach (System.Data.DataRow row in dtGeoData.Rows)
{
localID = row[0].ToString();
address = row[1].ToString();
city = row[2].ToString();
state = row[3].ToString();
zip = row[4].ToString();
country = row[5].ToString();
// after the variables are set, run this method to post, get the response and insert the result
GetGLFromAddress();
}
}
GetGLFromAddress() works like this:
Take the variables from above and form the JSON. Send the JSON using "POST" and httpWebRequest. Wait for the response (time consuming). Read the response. Set new variables from it. Use those variables to update/insert back into the database, THEN move on to the next row in the original DataTable.
It's important to understand this flow because I need to be able to keep the localID variable with each request so I can update the correct record in the database.
Here is GetGLFromAddress():
private void GetGLFromAddress()
{
// Request JSON data with Payload
var httpWebRequest = (HttpWebRequest)WebRequest.Create("http:");
httpWebRequest.Headers.Add("Authorization", "");
httpWebRequest.ContentType = "application/json";
httpWebRequest.Method = "POST";
using (var streamWriter = new StreamWriter(httpWebRequest.GetRequestStream()))
{
// this takes the variables from your c# datatable and formats them for json post
var jS = new JavaScriptSerializer();
var newJson = jS.Serialize(new SeriesPost()
{
AddressLine1 = address,
City = city,
StateCode = state,
CountryCode = country,
PostalCode = zip,
CreateSiteIfNotFound = true
});
// So you can see the JSON that's output
System.Diagnostics.Debug.WriteLine(newJson);
streamWriter.Write(newJson);
streamWriter.Flush();
streamWriter.Close();
}
try
{
var httpResponse = (HttpWebResponse)httpWebRequest.GetResponse();
using (var streamReader = new StreamReader(httpResponse.GetResponseStream()))
{
var result = streamReader.ReadToEnd();
// javascript serializer... deserializing the returned json so that way you can set the variables used for insert string
var p1 = new JavaScriptSerializer();
// after this line, obj holds the fully deserialized JSON. Notice how I reference obj[x].fieldnames below.
// If you ever want to change the fields or bring more in, this is how you do it.
var obj = p1.Deserialize<List<RootObject>>(result);
// you must ensure the values returned are not null before trying to set the variable. You can see when that happens, I'm manually setting the variable value to null.
if (string.IsNullOrWhiteSpace(obj[0].MasterSiteId))
{
retGLMID = "null";
}
else
{
retGLMID = obj[0].MasterSiteId.ToString();
}
if (string.IsNullOrWhiteSpace(obj[0].PrecisionName))
{
retAcc = "null";
}
else
{
retAcc = obj[0].PrecisionName.ToString();
}
if (string.IsNullOrWhiteSpace(obj[0].PrimaryAddress.AddressLine1Combined))
{
retAddress = "null";
}
else
{
retAddress = obj[0].PrimaryAddress.AddressLine1Combined.ToString();
}
if (string.IsNullOrWhiteSpace(obj[0].Latitude))
{
retLat = "null";
}
else
{
retLat = obj[0].Latitude.ToString();
}
if (string.IsNullOrWhiteSpace(obj[0].Longitude))
{
retLong = "null";
}
else
{
retLong = obj[0].Longitude.ToString();
}
retNewRecord = obj[0].IsNewRecord.ToString();
// Build insert string... notice how I use the recently created variables
// string insertStr = retGLMID + ", '" + retAcc + "', '" + retAddress + "', '" + retLat + "', '" + retLong + "', '" + localID;
string insertStr = "insert into table " +
"(ID,GLM_ID,NEW_RECORD_IND,ACCURACY) " +
" VALUES " +
"('" + localID + "', '" + retGLMID + "', '" + retNewRecord + "', '" + retAcc + "')";
string connectionString = "Data Source=; Initial Catalog=; Trusted_Connection=Yes";
using (SqlConnection connection = new SqlConnection(connectionString))
{
SqlCommand cmd = new SqlCommand(insertStr);
cmd.CommandText = insertStr;
cmd.CommandType = CommandType.Text;
cmd.Connection = connection;
connection.Open();
cmd.ExecuteNonQuery();
connection.Close();
}
}
}
catch
{
// The request failed or the response could not be read: record the address as Not_Found
string insertStr2 = "insert into table " +
"(ID,GLM_ID,NEW_RECORD_IND,ACCURACY) " +
" VALUES " +
"('" + localID + "', null, null, 'Not_Found')";
string connectionString2 = "Data Source=; Initial Catalog=; Trusted_Connection=Yes";
using (SqlConnection connection = new SqlConnection(connectionString2))
{
SqlCommand cmd = new SqlCommand(insertStr2);
cmd.CommandText = insertStr2;
cmd.CommandType = CommandType.Text;
cmd.Connection = connection;
connection.Open();
cmd.ExecuteNonQuery();
connection.Close();
}
}
}
When I attempted to use Parallel.ForEach, I had issues with the variables. I'd like to have multiple requests run at once, but keep a separate instance of each variable per request, if that makes sense. I have no way to pass the localID to the API and get it back, or that would be ideal.
Is this even possible?
And how would I need to structure this call to achieve what I am after?
Essentially, I want to be able to send multiple calls at once to speed up the entire process.
EDIT: added the code for GetGLFromAddress(). Yes, I am a newb, so please be kind :)
Put all your data in an array so that you can issue more than one request at a time. It is best to call the API using multiple tasks or async methods.
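A minimal sketch of that idea, under stated assumptions: `AddressRow`, `GeoResult`, `ParallelGeocoder` and the `lookup` delegate are hypothetical names, and `lookup` stands in for the real HTTP POST. The point it illustrates is that each request works on its own local `row`, so the localID stays attached to its result without the API having to echo it back, while a `SemaphoreSlim` caps how many requests are in flight.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// One input row (hypothetical type); each request gets its own instance.
public sealed class AddressRow
{
    public string LocalId;
    public string Address;
}

// One result (hypothetical type), still tagged with the LocalId it belongs to.
public sealed class GeoResult
{
    public string LocalId;
    public string MasterSiteId;
}

public static class ParallelGeocoder
{
    // Runs at most maxConcurrency lookups at once and returns every result
    // paired with the LocalId of the row that produced it.
    public static async Task<List<GeoResult>> GeocodeAllAsync(
        IEnumerable<AddressRow> rows,
        Func<AddressRow, Task<string>> lookup, // assumption: wraps the real API call
        int maxConcurrency)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = rows.Select(async row =>
            {
                await gate.WaitAsync();
                try
                {
                    // 'row' is local to this lambda, so another in-flight
                    // request cannot overwrite its LocalId.
                    string siteId = await lookup(row);
                    return new GeoResult { LocalId = row.LocalId, MasterSiteId = siteId };
                }
                catch (Exception)
                {
                    // Mirrors the original fallback: treat a failure as "not found".
                    return new GeoResult { LocalId = row.LocalId, MasterSiteId = null };
                }
                finally
                {
                    gate.Release();
                }
            }).ToList();

            // Task.WhenAll returns results in the same order as the tasks.
            return (await Task.WhenAll(tasks)).ToList();
        }
    }
}
```

With 4M rows it would probably make sense to feed this in batches and write each batch back to the database before starting the next one. The right `maxConcurrency` depends on how much load the API tolerates, so it is something to tune rather than a known value.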
Related
Good day,
In C#, I am trying to run a MySQL update query to update one record, based on its id. Everything goes well as long as I'm not using parameters.
The problem appears as soon as I add one or more parameters. I tested with only one parameter and got the same problem.
What am I missing here?
Thank you very much for your help.
public static void editCustomerTest(ClsCustomerTest pTest)
{
MySqlConnection l_Connection = null;
string l_SpName = string.Empty;
MySqlCommand l_MyCommand = null;
try
{
l_Connection = ClsIconEnv.getDataAccess().MySqlConnection;
ClsDataAccess.OpenConnection(l_Connection);
l_SpName = "update tbTestCustomers " +
"set sName = '@sLastName', " +
"sFirstName = '@sFirstName', " +
"sAddress = '@sAddress' " +
"Where id = @id);";
l_MyCommand = new MySqlCommand(l_SpName, l_Connection);
l_MyCommand.Parameters.Add("@sLastName", pTest.Last_Name);
l_MyCommand.Parameters.Add("@sFirstName", pTest.First_name);
l_MyCommand.Parameters.Add("@sAddress", pTest.Address);
l_MyCommand.Parameters.Add("@id", pTest.id);
l_MyCommand.ExecuteNonQuery(); // <----- This is the line at which the execution stops
ClsDataAccess.CloseConnection(l_Connection);
}
catch (Exception exc)
{
ClsIconErrorManager.manageException(exc);
}
finally
{
}
}
You do not need to wrap the parameter placeholders in quotes (inside quotes they are treated as literal text, not as parameters), and you have to use AddWithValue instead of Add if you don't want to specify the type explicitly, like this:
l_SpName = "update tbTestCustomers " +
"set sName = @sLastName, " +
"sFirstName = @sFirstName, " +
"sAddress = @sAddress " +
"Where id = @id;";
l_MyCommand.Parameters.AddWithValue("@sLastName", pTest.Last_Name);
l_MyCommand.Parameters.AddWithValue("@sFirstName", pTest.First_name);
l_MyCommand.Parameters.AddWithValue("@sAddress", pTest.Address);
l_MyCommand.Parameters.AddWithValue("@id", pTest.id);
Like this:
l_SpName = @"update tbTestCustomers
set sName = @sLastName,
sFirstName = @sFirstName,
sAddress = @sAddress
Where id = @id";
l_MyCommand = new MySqlCommand(l_SpName, l_Connection);
l_MyCommand.Parameters.AddWithValue("@sLastName", pTest.Last_Name);
l_MyCommand.Parameters.AddWithValue("@sFirstName", pTest.First_name);
l_MyCommand.Parameters.AddWithValue("@sAddress", pTest.Address);
l_MyCommand.Parameters.AddWithValue("@id", pTest.id);
l_MyCommand.ExecuteNonQuery();
I have a script that is connecting to multiple databases and writing a query from each into a text file. However when I run it, I'm not getting the results expected. After some cross checking, it looks like it is writing the same results from the first database query instead of finding new results from the next connection. I inserted an IP string to verify the IPs are being grabbed by the for loop but it seems like I need some way of clearing the reader?
for (int z=0; z<2;z++) {
using (OleDbConnection connLocal = new OleDbConnection("Provider=SAOLEDB;LINKS=tcpip(host=" + ips[z] + ",PORT=2638);ServerName=EAGLESOFT;Integrated Security = True; User ID = dba; PWD = sql"))
try
{
connLocal.Open();
using (OleDbCommand cmdLocal = new OleDbCommand("SELECT tran_num, '" + ips[z] + "', provider_id, amount, tran_date, collections_go_to, impacts, type, '" + clinics[z] + "' AS Clinic FROM transactions WHERE tran_date LIKE '2015-11-23%'", connLocal))
using (StreamWriter sqlWriter = File.AppendText(@"C:\Users\Administrator\Desktop\Clinic.txt"))
{
using (OleDbDataReader readLocal = cmdLocal.ExecuteReader())
{
while (readLocal.Read())
{
sqlWriter.WriteLine("{0}|{1}|{2}|{3}|{4}|{5}|{6}|{7}|{8}",
readLocal.GetValue(0).ToString(),
readLocal.GetValue(1).ToString(),
readLocal.GetValue(2).ToString(),
readLocal.GetValue(3).ToString(),
readLocal.GetValue(4).ToString(),
readLocal.GetValue(5).ToString(),
readLocal.GetValue(6).ToString(),
readLocal.GetValue(7).ToString(),
readLocal.GetValue(8).ToString());
}
readLocal.Close();
}
sqlWriter.Close();
connLocal.Close();
}
}
catch (Exception connerr) { Debug.WriteLine(connerr.Message); }
}
As always, any insight is much appreciated!
I have been browsing Stack Overflow and MSDN and cannot find a control (or make sense of the ones I have) that gives direct access to the data of a DetailsView. I'm using C# in a .NET web application.
I think what I am looking for is the DetailsView equivalent of the GridView's row.Cells[1].Value. Can anybody help with the accessor for the DetailsView cells?
What I am trying to do is access the exact data values I have bound to DetailsView1.
.Text is sufficient for all the numbers and strings (only two shown in the example), but not for the timestamp MTTS (a datetime): it loses the milliseconds, and the SQL query I run afterwards cannot find the correct values in the db without them. Will I also need to change the way I have bound the data, or change some setting, to give the bound data millisecond accuracy?
Code example:
Decimal RUN_ID = 0;
DateTime MTTS = new DateTime();
foreach(DetailsViewRow row in DetailsView1.Rows)
{
switch(row.Cells[0].Text)
{
case "RUN_ID":
RUN_ID = Decimal.Parse(row.Cells[1].Text);
break;
case "MTTS":
MTTS = DateTime.Parse(row.Cells[1].ToString());
break;
}
}
I have tried
row.Cells[1].ID = "MTTS";
MTTS = (DateTime)((DataRowView)DetailsView1.DataItem)["MTTS"];
But it does not recognize MTTS, and I am not sure how to set the parameter. I have tried a few different things already with no success.
The workaround was messy: essentially, I rebuilt the query that gathered the data for the GridView, and then wrote a function that grabs the MTTS directly as a DateTime, using the parameters from inside the GridView.
In my opinion this was a bad way of doing things, but it worked. I would prefer a better solution.
MTTS = GetMTTS(JOB_PLAN, JOB_NAME,JOB_NAME_ID,RUN_ID,JOB_STATUS);
public DateTime GetMTTS(string JOB_PLAN, string JOB_NAME, string JOB_NAME_ID, Decimal RUN_ID, string JOB_STATUS){
string myEnvName = XXX;
TableName = XXX.ToString();
ConnectionString = System.Configuration.ConfigurationManager.ConnectionStrings[myEnvName].ToString();
string thisRUN_ID = RUN_ID.ToString();
cmdText = @"SELECT MTTS FROM " + TableName +
" WHERE JOB_PLAN = '" + JOB_PLAN + "'"
+ " AND JOB_NAME = '" + JOB_NAME + "'"
+ " AND JOB_NAME_ID = '" + JOB_NAME_ID + "'"
+ " AND RUN_ID = '" + thisRUN_ID + "'"
+ " AND JOB_STATUS = '" + JOB_STATUS + "'";
DataSet ds = new DataSet();
using (SqlConnection conn = new SqlConnection(ConnectionString))
{
conn.Open();
try
{
SqlCommand SQLcc = new SqlCommand(cmdText,conn);
SqlDataReader reader;
reader = SQLcc.ExecuteReader();
while (reader.Read())
{
MTTS = reader.GetDateTime(0);
}
reader.Dispose();
}
catch (Exception e)
{
Console.WriteLine("{0} Exception caught.", e);
}
}
return MTTS;
}
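The millisecond loss described in the question is what one would expect if the bound value is rendered with DateTime's default formatting, which has no fractional seconds. The sketch below (the class and method names are hypothetical) shows a text form that does survive a round trip: the standard "o" format specifier. Whether the DetailsView can be made to render that form, for example through the field's DataFormatString, is an assumption that would need to be checked against the actual binding.

```csharp
using System;
using System.Globalization;

public static class TimestampRoundTrip
{
    // Formats with the round-trip ("o") specifier, which keeps fractional seconds.
    public static string ToRoundTripText(DateTime value)
    {
        return value.ToString("o", CultureInfo.InvariantCulture);
    }

    // Parses text produced by ToRoundTripText back into the same DateTime.
    public static DateTime FromRoundTripText(string text)
    {
        return DateTime.ParseExact(text, "o", CultureInfo.InvariantCulture,
                                   DateTimeStyles.RoundtripKind);
    }
}
```

By contrast, calling `ToString()` with no format and parsing the result back yields a value whose `Millisecond` is 0, which would explain why the follow-up query finds no matching row.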
I'm selecting about 20,000 records from the database and then I update them one by one.
I looked for this error and I saw that setting the CommandTimeout will help, but not in my case.
public void Initialize()
{
MySqlConnectionStringBuilder SQLConnect = new MySqlConnectionStringBuilder();
SQLConnect.Server = SQLServer;
SQLConnect.UserID = SQLUser;
SQLConnect.Password = SQLPassword;
SQLConnect.Database = SQLDatabase;
SQLConnect.Port = SQLPort;
SQLConnection = new MySqlConnection(SQLConnect.ToString());
}
public MySqlDataReader SQL_Query(string query)
{
MySqlCommand sql_command;
sql_command = SQLConnection.CreateCommand();
sql_command.CommandTimeout = int.MaxValue;
sql_command.CommandText = query;
MySqlDataReader query_result = sql_command.ExecuteReader();
return query_result;
}
public void SQL_NonQuery(string query)
{
MySqlCommand sql_command;
sql_command = SQLConnection.CreateCommand();
sql_command.CommandTimeout = int.MaxValue;
sql_command.CommandText = query;
sql_command.ExecuteNonQuery();
}
And here is my method which makes the select query:
public void CleanRecords()
{
SQLActions.Initialize();
SQLActions.SQL_Open();
MySqlDataReader cashData = SQLActions.SQL_Query("SELECT `cash`.`id`, SUM(`cash`.`income_money`) AS `income_money`, `cash_data`.`total` FROM `cash_data` JOIN `cash` ON `cash`.`cash_data_id` = `cash_data`.`id` WHERE `user`='0' AND `cash_data`.`paymentterm_id`='0' OR `cash_data`.`paymentterm_id`='1' GROUP BY `cash_data_id`");
while(cashData.Read()){
if(cashData["income_money"].ToString() == cashData["total"].ToString()){
UpdateRecords(cashData["id"].ToString());
}
}
SQLActions.SQL_Close();
}
And here is the method which makes the update:
public void UpdateRecords(string rowID)
{
SQLActions.Initialize();
SQLActions.SQL_Open();
SQLActions.SQL_NonQuery("UPDATE `cash_data` SET `end_date`='" + GetMeDate() + "', `user`='1' WHERE `id`='" + rowID + "'");
SQLActions.SQL_Close();
}
Changing the database structure is not an option for me.
I thought that setting the timeout to the max value of int would solve my problem, but it looks like this won't work in my case.
Any ideas? :)
EDIT:
The error I get is "Fatal error encountered during data read".
UPDATE:
public void CleanRecords()
{
StringBuilder dataForUpdate = new StringBuilder();
string delimiter = "";
SQLActions.Initialize();
SQLActions.SQL_Open();
MySqlDataReader cashData = SQLActions.SQL_Query("SELECT `cash`.`id`, SUM(`cash`.`income_money`) AS `income_money`, `cash_data`.`total` FROM `cash_data` JOIN `cash` ON `cash`.`cash_data_id` = `cash_data`.`id` WHERE `user`='0' AND `cash_data`.`paymentterm_id`='0' OR `cash_data`.`paymentterm_id`='1' GROUP BY `cash_data_id`");
while (cashData.Read())
{
if (cashData["income_money"].ToString() == cashData["total"].ToString())
{
dataForUpdate.Append(delimiter);
dataForUpdate.Append("'" + cashData["id"].ToString() + "'");
delimiter = ",";
}
}
SQLActions.SQL_Close();
UpdateRecords(dataForUpdate.ToString());
}
public void UpdateRecords(string rowID)
{
SQLActions.Initialize();
SQLActions.SQL_Open();
SQLActions.SQL_NonQuery("UPDATE `cash_data` SET `end_date`='" + GetMeDate() + "', `user`='1' WHERE `id` IN (" + rowID + ")");
SQLActions.SQL_Close();
}
You may be able to use
UPDATE cash_data .... WHERE id IN (SELECT ....)
and do everything in one go. Otherwise, you could do it in two steps: first the select collects all the ids, then you close the connection and do the update in one go with all the ids.
The code for the second option might look something like this:
public void CleanRecords()
{
StringBuilder builder = new StringBuilder();
string delimiter = "";
SQLActions.Initialize();
SQLActions.SQL_Open();
MySqlDataReader cashData = SQLActions.SQL_Query("SELECT `cash`.`id`, SUM(`cash`.`income_money`) AS `income_money`, `cash_data`.`total` FROM `cash_data` JOIN `cash` ON `cash`.`cash_data_id` = `cash_data`.`id` WHERE `user`='0' AND `cash_data`.`paymentterm_id`='0' OR `cash_data`.`paymentterm_id`='1' GROUP BY `cash_data_id`");
while(cashData.Read()){
if(cashData["income_money"].ToString() == cashData["total"].ToString()){
builder.Append(delimiter);
builder.Append("'" + cashData["id"].ToString() + "'");
delimiter = ",";
}
}
SQLActions.SQL_Close();
UpdateRecords(builder.ToString());
}
public void UpdateRecords(string rowIDs)
{
SQLActions.Initialize();
SQLActions.SQL_Open();
SQLActions.SQL_NonQuery("UPDATE `cash_data` SET `end_date`='" + GetMeDate() + "', `user`='1' WHERE `id` IN (" + rowIDs + ")");
SQLActions.SQL_Close();
}
There are multiple problems:
First: You are reading around 20K rows with a data reader and then doing the updates one by one inside the read loop. The reader holds the connection open until you are finished, so this is not a good way to do it. Solution: read the information using a data adapter instead.
Second: Rather than updating row by row, you can update in bulk in one go. There are multiple options for bulk operations. In SQL Server you can send the data in XML format or use a table-valued parameter (TVP) (http://www.codeproject.com/Articles/22205/ADO-NET-and-OPENXML-to-Perform-Bulk-Database-Opera); for MySQL, see LOAD XML (http://dev.mysql.com/doc/refman/5.5/en/load-xml.html).
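As a rough illustration of the bulk idea without string-concatenating the ids into the SQL, the sketch below splits the ids into batches and generates one placeholder per id for an IN (...) clause. `InClauseBuilder`, `Batch` and `Placeholders` are hypothetical helpers, not part of any library, and the batch size is an arbitrary choice.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class InClauseBuilder
{
    // Splits ids into consecutive chunks of at most batchSize items.
    public static IEnumerable<List<string>> Batch(IEnumerable<string> ids, int batchSize)
    {
        var current = new List<string>(batchSize);
        foreach (string id in ids)
        {
            current.Add(id);
            if (current.Count == batchSize)
            {
                yield return current;
                current = new List<string>(batchSize);
            }
        }
        // Emit the final, possibly shorter, chunk.
        if (current.Count > 0)
        {
            yield return current;
        }
    }

    // Returns "@id0,@id1,..." with one placeholder per id in the batch.
    public static string Placeholders(int count)
    {
        return string.Join(",", Enumerable.Range(0, count).Select(i => "@id" + i));
    }
}
```

For each batch, the UPDATE text would end in `WHERE id IN (` + `Placeholders(batch.Count)` + `)`, and each value would be bound with `Parameters.AddWithValue("@id" + i, batch[i])`, so a single statement handles the whole batch.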
I have the following code in Mono, using the MySQL Connector/Net:
try
{
MatchPersonResult mpr = personServ.MatchPerson(p, "MatchAndStore", null);
using(MySqlCommand successcmd = new MySqlCommand())
{
successcmd.CommandText = "UPDATE myccontacts SET mcid = @mcid, matchresult = @mr, datetimematched = @dtm WHERE id = @id";
successcmd.Connection = conn;
successcmd.Parameters.Add("@mcid", MySqlDbType.Int32).Value = int.Parse(mpr.PersonID);
successcmd.Parameters.Add("@mr", MySqlDbType.Enum).Value = mpr.MatchResultStatus;
successcmd.Parameters.Add("@dtm", MySqlDbType.DateTime).Value = DateTime.Now.Year.ToString() + "-" + DateTime.Now.Month.ToString() + "-" + DateTime.Now.Day.ToString() + " " + DateTime.Now.Hour.ToString() + ":" + DateTime.Now.Minute.ToString() + ":" + DateTime.Now.Second.ToString();
successcmd.Parameters.Add("@id", MySqlDbType.Int32).Value = person["id"];
successcmd.ExecuteNonQuery();
Console.WriteLine(mpr.PersonID);
}
}
When the query is executed, the table isn't actually updated with anything. I set a breakpoint on the Console.WriteLine call so I can check what's happening and when it's hit, I load the row with the id mentioned in the code and it has not been updated. Even if I don't debug but just let the code execute, I see that nothing is happening to the database. For clarity's sake - personServ.MatchPerson is actually a web reference imported into my solution, so I can check on the other end and do in fact see that the proper data were sent over and that the db update should take place.
Anyone know what to do?
TIA,
Benjy
P.S.: Everything except for the db updates is working - the catch block here (not posted for brevity's sake) is never hit.
Could you try this code?
try
{
MatchPersonResult mpr = personServ.MatchPerson(p, "MatchAndStore", null);
using(MySqlCommand successcmd = new MySqlCommand())
{
successcmd.CommandText = "UPDATE myccontacts SET mcid = @mcid, matchresult = @mr, datetimematched = @dtm WHERE id = @id";
successcmd.Connection = conn;
successcmd.Parameters.AddWithValue("@mcid", int.Parse(mpr.PersonID));
successcmd.Parameters.AddWithValue("@mr", (int)mpr.MatchResultStatus);
// HH is the 24-hour clock; hh would store afternoon times as morning ones
successcmd.Parameters.AddWithValue("@dtm", DateTime.Now.ToString("yyyy-MM-dd HH:mm:ss"));
successcmd.Parameters.AddWithValue("@id", Convert.ToInt32(person["id"]));
successcmd.Connection.Open();
successcmd.ExecuteNonQuery();
successcmd.Connection.Close();
Console.WriteLine(mpr.PersonID);
}
}
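One detail worth checking whenever a date is turned into text for a DATETIME column: in .NET format strings, lowercase hh is the 12-hour clock and uppercase HH is the 24-hour clock. A small sketch, with `MySqlDateText` being a hypothetical helper name:

```csharp
using System;
using System.Globalization;

public static class MySqlDateText
{
    // Formats a DateTime as zero-padded "yyyy-MM-dd HH:mm:ss" on the 24-hour clock.
    public static string Format(DateTime value)
    {
        return value.ToString("yyyy-MM-dd HH:mm:ss", CultureInfo.InvariantCulture);
    }
}
```

For 23 November 2015 at 14:05:09 this gives "2015-11-23 14:05:09", whereas the same pattern with hh gives "2015-11-23 02:05:09". Passing the DateTime itself as the parameter value, rather than a string, may avoid the formatting question altogether.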