General Query Method - c#

I find that my C# apps do a lot of queries with a lot of boilerplate that clutters up my code space. I also want to avoid repetition, but I'm not sure how I could write a method to do this generically.
I am accessing an Oracle database using ODP. I can't use Linq because our data warehouse people refuse to designate primary keys, and ODP support for Linq appears to be, well ... they'd rather have you use their platform.
I can't really return a List because every query returns different numbers of different types.
string gufcode = String.Empty;
double cost = 0.0;
OracleCommand GUFCommand2 = thisConnection.CreateCommand();
String GUFQuery2 = "SELECT GUF_ID, COST_RATE FROM SIMPLE_TABLE";
GUFCommand2.CommandText = GUFQuery2;
OracleDataReader GUFReader2 = GUFCommand2.ExecuteReader();
while (GUFReader2.Read())
{
if (GUFReader2[0/**GUF_CODE**/] != DBNull.Value)
{
gufcode = Convert.ToString(GUFReader2[0]);
}
if (GUFReader2[1/**COST_RATE**/] != DBNull.Value)
{
cost = Convert.ToDouble(GUFReader2[1]);
}
effortRatioDictionary.Add(gufcode, cost);
}
GUFReader2.Close();
But there are really a lot more terms and a lot more queries like this: about 15 queries, some returning as many as 15 fields.
Copy/pasting this boilerplate everywhere leads to a lot of fires: for example, if I don't update everything in the pasted copy, I'll close the wrong reader or (worse) send a different query string to the database.
I'd like to be able to do something like this:
string gufQuery = "SELECT GUF_ID, COST_RATE FROM SIMPLE_TABLE";
List<something> gufResponse = miracleProcedure(gufQuery, thisConnection);
And so most of the boilerplate goes away.
I'm looking for something simple.

I think the main reason you can't abstract this into a function is that the returned data is going to be different every time.
That means the number of reads is going to be different every time as well.
You could just return GUFReader2, but then you lose the ability to close it inside the function, which is what you want.
I would say return an array (or list) of objects.
Inside the procedure, just read through every row, close the connection, and return the list.
Your calling function will always know what data to expect and in which order. It will have to cast the object data too, but you are already doing that inside this procedure anyway.
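A minimal sketch of that idea, under some assumptions: the names QueryHelper, ReadAllRows and Query are hypothetical, and the code is written against the IDataReader and IDbConnection interfaces (which the ODP.NET reader and connection classes implement) rather than against Oracle types directly. DBNull is mapped to null here; that is a choice made for the sketch, not something the answer prescribes.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class QueryHelper
{
    // Reads every remaining row into an object[] and disposes the reader,
    // so the caller never has to remember to close it.
    public static List<object[]> ReadAllRows(IDataReader reader)
    {
        var rows = new List<object[]>();
        using (reader)
        {
            while (reader.Read())
            {
                var values = new object[reader.FieldCount];
                reader.GetValues(values);
                for (int i = 0; i < values.Length; i++)
                {
                    // Assumption for this sketch: database NULL becomes null.
                    if (values[i] is DBNull) values[i] = null;
                }
                rows.Add(values);
            }
        }
        return rows;
    }

    // Hypothetical wrapper: runs the SQL on an already open connection.
    public static List<object[]> Query(IDbConnection connection, string sql)
    {
        using (IDbCommand command = connection.CreateCommand())
        {
            command.CommandText = sql;
            return ReadAllRows(command.ExecuteReader());
        }
    }
}
```

The caller still casts each element, for example Convert.ToString(row[0]) and Convert.ToDouble(row[1]), but the command and reader handling lives in one place.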

Some hints:
Deriving from IDisposable allows for cleaner code with using statement.
IMO your magic method should look more like this :
List<T> list = doMagic("SIMPLE_TABLE", columns);
columns could be an array of small structs like this one :
struct Column
{
public string Name;
public Type DataType;
}
You might be able to use enums if you use the same tables/columns very often.
Or
You can take some inspiration from VertexDeclaration, VertexElement, VertexElementFormat and VertexElementUsage types that are in XNA : http://msdn.microsoft.com/en-us/library/bb197344.
This has proven to be very helpful when dealing with a different number of 'inputs' in a random order.
In my case I've been able to build an easy to use, XNA-like framework for OpenGL with such stuff.
Regarding the return type of your list, refer to my 2nd suggestion.
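One way to get a typed List<T> back is sketched below. Note that it swaps in a different technique from the column descriptors suggested above: the caller passes a mapping delegate that turns one record into a T. The names TypedQuery and ReadAll are hypothetical, and the reader is taken as an IDataReader so the sketch does not depend on a particular provider.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class TypedQuery
{
    // Applies 'map' to each row and collects the results.
    // The reader is disposed when the loop finishes or throws.
    public static List<T> ReadAll<T>(IDataReader reader, Func<IDataRecord, T> map)
    {
        var results = new List<T>();
        using (reader)
        {
            while (reader.Read())
            {
                results.Add(map(reader));
            }
        }
        return results;
    }
}
```

A call could look like TypedQuery.ReadAll(command.ExecuteReader(), r => r.GetString(0)), where the lambda is the only part that changes from query to query.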

Linq was the right answer. I give credit to David M, above, but I can't mark it as the correct answer since he only left a comment.
I was able to do a semi-generalized method using ArrayLists:
public static ArrayList GeneralQuery(string thisQuery, OracleConnection myConnection)
{
ArrayList outerAL = new ArrayList();
OracleCommand GeneralCommand = myConnection.CreateCommand();
GeneralCommand.CommandText = thisQuery;
OracleDataReader GeneralReader = GeneralCommand.ExecuteReader();
while (GeneralReader.Read())
{
// One inner list per row, holding that row's fields in column order
ArrayList innerAL = new ArrayList();
for (int i = 0; i < GeneralReader.FieldCount; i++)
{
if (GeneralReader[i] != DBNull.Value)
{
innerAL.Add(GeneralReader[i]);
}
else
{
innerAL.Add(0); // nulls are reported as 0
}
}
outerAL.Add(innerAL);
}
GeneralReader.Close();
return outerAL;
}
And the code that calls this method looks like this:
thisConnection.Open();
List<ProjectWrapper> liProjectCOs = new List<ProjectWrapper>();
String ProjectQuery = "SELECT SF_CLIENT_PROJECT.ID, SF_CLIENT_PROJECT.NAMEX, SF_CHANGE_ORDER.ID, SF_CHANGE_ORDER.END_DATE, ";
ProjectQuery += "SF_CLIENT_PROJECT.CONTRACTED_START_DATE, SF_CHANGE_ORDER.STATUS, SF_CHANGE_ORDER.TYPE, SF_CLIENT_PROJECT.ESTIMATED_END_DATE, SF_CLIENT_PROJECT.CONTRACTED_END_DATE ";
ProjectQuery += "FROM SF_CLIENT_PROJECT, SF_CHANGE_ORDER ";
ProjectQuery += "WHERE SF_CHANGE_ORDER.TYPE = 'New' ";
ProjectQuery += "AND SF_CLIENT_PROJECT.ID = SF_CHANGE_ORDER.PROJECT";
ArrayList alProjects = GeneralQuery(ProjectQuery, thisConnection);
foreach( ArrayList proj in alProjects ) {
ProjectWrapper pw = new ProjectWrapper();
pw.projectId = Convert.ToString(proj[0]);
pw.projectName = Convert.ToString(proj[1]);
pw.changeOrderId = Convert.ToString(proj[2]);
pw.newEndDate = Convert.ToDateTime(proj[3]);
pw.startDate = Convert.ToDateTime(proj[4]);
pw.status = Convert.ToString(proj[5]);
pw.type = Convert.ToString(proj[6]);
if ( Convert.ToString(proj[7]) != "0" ) // 0 returned by generalquery if null
pw.oldEndDate = Convert.ToDateTime(proj[7]);
else
pw.oldEndDate = Convert.ToDateTime(proj[8]);
liProjectCOs.Add(pw);
}
There are a lot of obvious disadvantages here (although it is a lot better than what I was trying to do earlier). It is so much worse than Linq that I renegotiated with our data warehouse people. There's a new guy there, and he was a lot more helpful.
Linq reduces the lines of code from above by a factor of 2. It is a factor of 4 from the non-encapsulated way I was doing it before that.

Related

Saving unknown number of rows/columns into a string list

I'm fairly new to C#, so please bear with me. I have a class FixData:
private class FixData
{
public int ID { get; set; }
public List<string> content { get; set; }
}
And there's also private List<FixData> IDList = new List<FixData>();
I'm querying data from a SQL database using the IDs already stored in IDList and a SqlDataReader, and then trying to save it into IDList.content. But that's where the hard part starts. I don't really know how many rows or columns this data has, and trying to read that from the debugger made me so much more confused (in other words: I fail to read it). Despite this, I tried to save it in so many ways and so many times that I'm completely lost at this point. Here's the code:
foreach (var record in IDList)
{
SqlCommand nonQuerycmd = new SqlCommand(NonQuery, connection);
nonQuerycmd.Parameters.Add(new SqlParameter("ScenarioID", record.ID));
nonQuerycmd.ExecuteNonQuery();
SqlCommand cmd = new SqlCommand(FixQuery, connection);
sqlreader = cmd.ExecuteReader();
ArrayList rowList = new ArrayList();
while (sqlreader.Read())
{
object[] values = new object[sqlreader.FieldCount];
sqlreader.GetValues(values);
rowList.Add(values);
record.content = values.Cast<object>().Select(x => x.ToString()).ToList();
}
sqlreader.Close();
}
Could you please help me and point me to an explanation or link or something that could help me understand how I should solve this?
Edit
I've managed to scramble something, but I'm not sure if this works as it was intended to.
Take a look at the MSDN documentation for SqlDataReader class. It should get you started.
The examples and other classes linked to there should help with proper usage of SqlCommand and other classes as well.
Without knowing what's in your table, the best I can do is:
if (read.Read())
{
for(int i = 0; i < read.FieldCount; i++)
{
record.content.Add(Convert.ToString(read[i]));
}
}
This will add every field selected to record.content as a string. I just changed "while" to "if" because you will only be handling one row (I think). If you need more help, let us know more information about your data and what you need. If, for some reason, you are putting multiple fields from multiple rows, use while. If you only want one field from multiple rows, change if to while, take out the for/next and change [i] to [0]. Shannon is not Carnac the Magnificent.
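The multiple-rows case can be sketched as follows. This is only an illustration under assumptions: RowReader and ReadRowsAsStrings are hypothetical names, and the method takes an IDataReader (which SqlDataReader implements) so it can be tried without a database.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class RowReader
{
    // Returns one List<string> per row; each inner list holds every field as text.
    public static List<List<string>> ReadRowsAsStrings(IDataReader reader)
    {
        var rows = new List<List<string>>();
        // 'while' visits every row; an 'if' here would stop after the first one.
        while (reader.Read())
        {
            var fields = new List<string>(reader.FieldCount);
            for (int i = 0; i < reader.FieldCount; i++)
            {
                // Convert.ToString gives an empty string for DBNull.
                fields.Add(Convert.ToString(reader[i]));
            }
            rows.Add(fields);
        }
        return rows;
    }
}
```

Because the row and column counts come from the reader itself (Read() and FieldCount), nothing about the table has to be known in advance.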

Make a list containing distinct list items from dataset containing duplicates

I'm really stuck, and being as many of you guys that post solutions on here are by and large brilliant (IMHO), I figured I'd see what the best of you could make of this problem.
Some background
I'm trying to create a list that must contain only distinct items in a specific sequence.
(It's a primary key and thus must be distinct. I didn't make it a primary key, but I have to work with what I'm given; you know how it goes.)
For ease of understanding this requirement, think of creating a distinct list of recipe steps from a book of recipes. My problem is that the "cooks" of these "recipes" often change the order in which they create their masterpieces.
For instance:
Recipe 1
Whisk eggs using fork
Melt margarine in a skillet
Pour in the eggs
stir constantly
Plate
Add salt and pepper as desired
Recipe 2
Break eggs into bowl
Whisk eggs using fork
Melt margarine in a skillet over low heat
Pour in the eggs
stir constantly
Plate
Serve
Add salt and pepper as desired
Recipe 3
Whisk eggs using fork
Add salt and pepper as desired
Melt margarine in a skillet over low heat
Pour in the eggs
stir constantly
Plate
As you can tell "Add salt and pepper..." can't be number 2 in Recipe 3 and still be in the correct sequence in Recipes 1 and 2.
I think if I can ID the "offending" list item and add a period to the end of it, thus making it unique, this would work as a solution.
How do I do this in C# given a dataset (gotten by a SQL query) with duplicates in the correct sequence and placed into a List of type string? LINQ is not a requirement here, but I'm not afraid to use it if it provides a solution.
Specifically, code (or pseudo-code) that:
IDs the list item that needs to be duplicated and modified.
Determine WHERE in the newly created large list (assuming) is the newly modified list item to be placed.
If your first question is going to be "show me your work", please be advised that I've done quite a bit of work on this, and the code is generally long.
I am happy to work from either pseudo-code or try your code with my dataset.
I'm also happy reading other solutions that may be pertinent.
Thanks, I look forward to seeing your solutions.
--edit: I'm starting to get the impression people don't like it if you don't post code.
So here it goes (I said above it was long). The code works but it doesn't solve the problem. It returns a distinct list in order with no duplicates.
(If the formatting below is bad please forgive)
public void GetNewRecipeItemsFromDB(string RequestedRecipeName)
{
string strGetRecipeElements_Sql = "SQL that returns the dataset";
string connString = GetConnectionString();
using (SqlConnection conn = new SqlConnection(connString))
{
SqlCommand cmd = conn.CreateCommand();
cmd.CommandType = CommandType.Text;
cmd.CommandText = strGetRecipeElements_Sql;
SqlDataReader reader = null;
try
{
conn.Open();
SqlDataAdapter adapter = new SqlDataAdapter(strGetRecipeElements_Sql, conn);
DataSet RecipeItems = new DataSet();
adapter.Fill(RecipeItems, "RecipeItems");
reader = cmd.ExecuteReader();
List<string> RecipeItemList = new List<string>();
//Create an array with existing RecipeItems
int readerCount = 0;
while (reader.Read())
{
RecipeItems GSI = new RecipeItems();
GSI.RecipeItem = reader[0].ToString();
GSI.Sequence = Convert.ToInt32(reader[1].ToString());
GSI.Rank = Convert.ToInt32(reader[2].ToString());
RecipeItemList.Add(GSI.RecipeItem.ToString());
readerCount++;
}
string[] CurrentRecipeItemArray = new string[readerCount];
string[] UpdatedRecipeItemArray = new string[readerCount];
//RecipeItemList.Sort();
label1.Text = "";
textBox1.Text = "";
CurrentRecipeItemArray = RecipeItemList.ToArray();
for (int y = CurrentRecipeItemArray.Length - 1; y >= 0; y--)
{
textBoxDBOrginal.Text += CurrentRecipeItemArray[y].ToString() + Environment.NewLine;
}
string[] lines = textBoxDBOrginal.Text.ToString().Split(new string[] { Environment.NewLine }, StringSplitOptions.None);
List<string> UniqueRecipeItemsInvertedList = new List<string>();
if (lines.Length > 0)
{
//OK it's not null lets look at it.
int lineCount = lines.Length;
string NewCompare = string.Empty;
for (int z = 0; z < lineCount; z++)
{
NewCompare = lines[z];
if (!UniqueRecipeItemsInvertedList.Contains(NewCompare))
{
UniqueRecipeItemsInvertedList.Add(NewCompare);
}
}
}
UniqueRecipeItemsInvertedList.Reverse();
foreach (string s in UniqueRecipeItemsInvertedList)
{
if (!string.IsNullOrEmpty(s))
{
listBox7.Items.Add(s.ToString());
}
}
}
catch (SqlException ex)
{
MessageBox.Show(ex.Errors.ToString());
}
conn.Close();
}
}
The answer was already on this site.
How to rename duplicates in list using LINQ
Code is:
IEnumerable<String> GetUnique(IEnumerable<String> list)
{
HashSet<String> itms = new HashSet<String>();
foreach(string itm in list)
{
string itr = itm;
while(itms.Contains(itr))
{
itr = itr + "_";
}
itms.Add(itr);
yield return itr;
}
}
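For illustration, here is a variant of the same idea that appends a period, as the question proposed, instead of an underscore. DuplicateRenamer and MakeUnique are hypothetical names, and the suffix parameter is an addition for this sketch; it must be non-empty or the loop would never end.

```csharp
using System;
using System.Collections.Generic;

public static class DuplicateRenamer
{
    // Yields the items in their original order, appending 'suffix' to any
    // item that has already been seen until it becomes unique.
    public static IEnumerable<string> MakeUnique(IEnumerable<string> items, string suffix = ".")
    {
        var seen = new HashSet<string>();
        foreach (string item in items)
        {
            string candidate = item;
            // HashSet.Add returns false when the value is already present.
            while (!seen.Add(candidate))
            {
                candidate += suffix;
            }
            yield return candidate;
        }
    }
}
```

Given "Whisk", "Plate", "Whisk", "Whisk", this yields "Whisk", "Plate", "Whisk.", "Whisk..", so every occurrence keeps its position in the sequence.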
I've come to the conclusion that although this can be done, and I got close, I just don't have the skills/knowledge to pull it off.
It amounts to:
Cycle through total number of recipes and place recipe name in a list
For each recipe name in list of recipes Get Recipe Steps And Sequence From DB and place in a sorted list (and this is an iffy bit).
At this point you have all the data you need if you just wanted the distinct items. ListName.Distinct()
Cycling through the SortedList to see if the key/value exists in the proper sequence continues to be my death knell. I kept running into key already exists / Key doesn't exists exceptions. If I can crack this nut, I'll have solved the problem.
I learned quite a bit about list<>, sortedList<> and List and the power of having your own class(es) and methods. For example: RecipeInfo.RecipeItemsList makes life so much easier.
I still haven't figured out why no one here wanted to touch this or why it was down graded. This experience will likely result in me hesitating before posting another question to stackoverflow.com.
A Dictionary won't allow duplicate keys (Add throws an unhandled ArgumentException), so it handles the "heavy lifting" of ensuring uniqueness and sequence order (still testing that one). I think I'm using GSI.Sequence wrong because there could be multiple sequences for recipe items. (This is NOT the answer, but a place I could put code. I hope I did it right.) Hat tip to http://williablog.net/williablog/post/2011/08/30/Generic-AddOrUpdate-Extension-for-IDictionary.aspx
while(reader.Read())
{
RecipeItems GSI = new RecipeItems();
GSI.RecipeItem = reader[0].ToString();
GSI.Sequence = Convert.ToInt32(reader[1].ToString());
GSI.RecipeName = reader[2].ToString();
GSI.MaxSequence = Convert.ToInt32(reader[3].ToString());
if (dictionary.ContainsKey(GSI.RecipeItem))
{
dictionary[GSI.RecipeItem] = GSI.Sequence;
}
else
{
dictionary.Add(GSI.RecipeItem, GSI.Sequence);
}
}
I think the final answer here ended up being something that I didn't necessarily foresee or desire. With approximately 94 unique items, you will end up with a list of about 428 unique recipe items over the course of 20 recipes. This would give me a list where I could have the appropriate recipe item in the right sequence. I still think my logic on this one is BAD, but it makes sense when you figure that a couple of recipe items per recipe are out of order and must be duplicated, and then you multiply that by the number of recipes.

How can I pass an unknown number of arguments to C#'s "Database.Open("DatabaseName").Query()" method?

I have tried Googling to find a solution, but all I get is totally irrelevant results or results involving 2 dimensional arrays, like the one here: http://social.msdn.microsoft.com/Forums/vstudio/en-US/bb4d54d3-14d7-49e9-b721-db4501db62c8/how-does-one-increment-a-value-in-a-two-dimensional-array, which does not apply.
Say I have this declared:
var db = Database.Open("Content");
var searchTerms = searchText.Split('"').Select((element, index) => index % 2 == 0 ? element.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries) : new string[] { element }).SelectMany(element => element).ToList();
int termCount = searchTerms.Count;
(Note: All that you really need to know about searchTerms is that it holds a number of search terms typed into a search bar by the user. All the LINQ expression is doing is ensuring that text wrapped in quotes is treated as a single search term. It is not really necessary to know all of this for the purpose of this question.)
Then I have compiled (using for loops that loop for each number of items in the searchTerms list) a string to be used as a SELECT SQL query.
Here is an example that shows part of this string being compiled with the @0, @1, etc. placeholders so that my query is parameterized.
searchQueryString = "SELECT NULL AS ObjectID, page AS location, 'pageSettings' AS type, page AS value, 'pageName' AS contentType, ";
for (int i=0; i<termCount; i++)
{
if(i != 0)
{
searchQueryString += "+ ";
}
searchQueryString += "((len(page) - len(replace(UPPER(page), UPPER(@" + i + "), ''))) / len(@" + i + ")) ";
}
searchQueryString += "AS occurences ";
(Note: All that you really need to know about the above code is that I am concatenating the incrementing value of i to the @ symbol to dynamically compile the placeholder value.)
All of the above works fine, but later, I must use something along the lines of this (only I don't know how many arguments I will need until runtime):
foreach (var row in db.Query(searchQueryString, searchTerms[0]))
{
@row.occurences
}
(For Clarification: I will need a number of additional arguments (i.e., in addition to the searchQueryString argument) equal to the number of items in the searchTerms list AND they will have to be referencing the correct index (effectively referencing each index from lowest to highest, in order, separated by commas, of course.)
Also, I will, of course need to use an incrementing value to reference the appropriate index of the list, if I can even get that far, and I don't know how to do that either. Could I use i++ somehow for that?
I know C# is powerful, but maybe I am asking too much?
Use the params keyword for a variable number of parameters. With params, the arguments passed to a function are collected by the compiler into a temporary array.
static int AddParameters(params int[] values)
{
int total = 0;
foreach (int value in values)
{
total += value;
}
return total;
}
and can be called as
int add1 = AddParameters(1);
int add2 = AddParameters(1, 2);
int add3 = AddParameters(1, 2, 3);
int add4 = AddParameters(1, 2, 3, 4);
//-----------Edited Reply based on comments below---
You can use something like this to be used with SQL
void MYSQLInteractionFunction(String myConnectionString)
{
String searchQueryString = "SELECT NULL AS ObjectID, page AS location, 'pageSettings' AS type, page AS value, 'pageName' AS contentType, ";
SqlConnection myConnection = new SqlConnection(myConnectionString);
SqlCommand myCommand = new SqlCommand(searchQueryString, myConnection);
myConnection.Open();
SqlDataReader queryCommandReader = myCommand.ExecuteReader();
// Create a DataTable object to hold all the data returned by the query.
DataTable dataTable = new DataTable();
// Use the DataTable.Load(SqlDataReader) function to put the results of the query into a DataTable.
dataTable.Load(queryCommandReader);
List<String> myStringList = new List<String>();
Int32 rowID = 0; // first row only; loop over dataTable.Rows to handle more
foreach (DataColumn column in dataTable.Columns)
{
myStringList.Add(dataTable.Rows[rowID][column.ColumnName] + " | ");
}
myConnection.Close();
String[] myStringArray = myStringList.ToArray();
UnlimitedParameters(myStringArray);
}
static void UnlimitedParameters(params string[] values)
{
foreach (string strValue in values)
{
// Do whatever you want to do with this strValue
}
}
I'm not sure I quite understand what you need from the question, but it looks like you're substituting a series of placeholders in the SQL with another value. If that's the case, you can use String.Format to replace the values like this:
object val = "a";
object anotherVal = 2.0;
var result = string.Format("{0} - {1}", new[] { val, anotherVal });
This way, you can substitute as many values as you need by simply creating the arguments array to be the right size.
If you're creating a SQL query on the fly, then you need to be wary of SQL injection, and substituting user-supplied text directly into a query is a bit of a no-no from this point of view. The best way to avoid this is to use parameters in the query, which automatically then get sanitised to prevent SQL injection. You can still use a 'params array' argument though, to achieve what you need, for example:
public IDataReader ExecuteQuery(string sqlQuery, params string[] searchTerms)
{
var cmd = new SqlCommand { CommandText = sqlQuery };
for (int i = 0; i < searchTerms.Length; i++)
{
cmd.Parameters.AddWithValue(i.ToString(), searchTerms[i]);
}
return cmd.ExecuteReader();
}
Obviously, you could also build up the SQL string within this method if you needed to, based on the length of the searchTerms array.
Well, for how complex the question probably seemed, the answer ended up being something pretty simple. I suppose it was easy to overlook because I had never done this before and others may have thought it too obvious to be what I was looking for. However, I had NEVER tried to pass a variable length of parameters before and had no clue if it was even possible or not (okay, well I guess I knew it was possible somehow, but could have been very far from my method for all I knew).
In any case, I had tried:
foreach (var row in db.Query(searchQueryString, searchTerms)) //searchTerms is a list of strings.
{
//Do something for each row returned from the sql query.
}
I assumed that if it could handle a variable number of arguments (remember, each argument passed after the first in the Database.Query() method is treated as a filler for the placeholders in the query string, e.g., @0, @1, @2, etc.), it could accept them from a list if it could from an array.
I couldn't really have been any more wrong with that assumption, as passing a list throws an error. However, I was surprised when I finally broke down, converted the list to an array, and tried passing the array instead of the list.
Indeed, and here is the short answer to my original question, if I simply give it an array it works easily (a little too easily, which I suppose is why I was so sure it wouldn't work):
string[] searchTermsArray = searchTerms.ToArray();
foreach (var row in db.Query(searchQueryString, searchTermsArray)) //searchTermsArray is an array of strings.
{
//Do something for each row returned from the sql query.
}
The above snippet of code is really all that is needed to successfully answer my original question.
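A likely explanation for the difference, sketched with a stand-in method: assuming Query declares its extra arguments as params object[], a string[] is passed through as the parameter array itself, while a List<string> is wrapped as a single element. ParamsDemo and CountArgs are hypothetical names used only to make that visible.

```csharp
using System;
using System.Collections.Generic;

public static class ParamsDemo
{
    // Stand-in with the same shape as a query method that takes
    // a command text followed by "params object[]" values.
    public static int CountArgs(string commandText, params object[] args)
    {
        return args.Length;
    }
}
```

With a list of three terms, CountArgs("q", terms) returns 1 because the whole list becomes one argument, whereas CountArgs("q", terms.ToArray()) returns 3 because the array supplies the arguments directly. In the first case only placeholder @0 would receive a value, and that value would be a list rather than a string.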

C#: Iterating through a huge datatable

Iterating through a datatable that contains about 40 000 records using a for-loop takes almost 4 minutes. Inside the loop I'm just reading the value of a specific column of each row and concatenating it to a string.
I'm not opening any DB connections or anything; it's a function which receives a datatable, iterates through it and returns a string.
Is there any faster way of doing this?
Code goes here:
private string getListOfFileNames(DataTable listWithFileNames)
{
string whereClause = "";
if (listWithFileNames.Columns.Contains("Filename"))
{
whereClause = "where filename in (";
for (int j = 0; j < listWithFileNames.Rows.Count; j++)
whereClause += " '" + listWithFileNames.Rows[j]["Filename"].ToString() + "',";
}
whereClause = whereClause.Remove(whereClause.Length - 1, 1);
whereClause += ")";
return whereClause;
}
Are you using a StringBuilder to concat the strings rather than just regular string concatenation?
Are you pulling back any more columns from the database than you really need to? If so, try not to. Only pull back the column(s) that you need.
Are you pulling back any more rows from the database than you really need to? If so, try not to. Only pull back the row(s) that you need.
How much memory does the computer have? Is it maxing out when you run the program or getting close to it? Is the processor at the max much or at all? If you're using too much memory then you may need to do more streaming. This means not pulling the whole result set into memory (i.e. a datatable) but reading each line one at a time. It also might mean that rather than concatting the results into a string (or StringBuilder ) that you might need to be appending them to a file so as to not take up so much memory.
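As a rough sketch of the StringBuilder suggestion applied to the question's method: WhereClauseBuilder and BuildInClause are hypothetical names, and doubling single quotes inside the values is an extra precaution added here, not something from the original code.

```csharp
using System;
using System.Data;
using System.Text;

public static class WhereClauseBuilder
{
    // Builds "where filename in ('a', 'b', ...)" from one column of the table.
    // Returns an empty string if the column is missing or there are no rows.
    public static string BuildInClause(DataTable table, string columnName)
    {
        if (!table.Columns.Contains(columnName) || table.Rows.Count == 0)
        {
            return string.Empty;
        }

        DataColumn column = table.Columns[columnName]; // look the column up once
        var sb = new StringBuilder("where filename in (");
        for (int i = 0; i < table.Rows.Count; i++)
        {
            if (i > 0) sb.Append(", ");
            // Double any single quote so the value stays inside its literal.
            string value = table.Rows[i][column].ToString().Replace("'", "''");
            sb.Append('\'').Append(value).Append('\'');
        }
        sb.Append(')');
        return sb.ToString();
    }
}
```

Appending to a StringBuilder avoids copying the whole accumulated string on every iteration, which is what makes repeated += on a long string slow.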
The following LINQ statement has a where clause on the first column and concatenates the third column into a variable.
string CSVValues = String.Join(",", dtOutput.AsEnumerable()
.Where(a => a[0].ToString() == value)
.Select(b => b[2].ToString()));
Step 1 - run it through a profiler, make sure you're looking at the right thing when optimizing.
Case in point, we had an issue we were sure was slow database interactions and when we ran the profiler the db barely showed up.
That said, possible things to try:
If you have the memory available, convert the query to a list; this will force a full db read. Otherwise the linq will probably load in chunks, doing multiple db queries.
Push the work to the db: if you can create a query that trims down the data you are looking at, or even calculates the string for you, that might be faster.
If this is something where the query is run often but the data rarely changes, consider copying the data to a local db (e.g. sqlite) if you're using a remote db.
If you're using the local sql-server, try sqlite; it's faster for many things.
var value = dataTable
.AsEnumerable()
.Select(row => row.Field<string>("columnName"));
var colValueStr = string.Join(",", value.ToArray());
Try adding a dummy column in your table with an expression. Something like this:
DataColumn dynColumn = new DataColumn();
{
dynColumn.ColumnName = "FullName";
dynColumn.DataType = System.Type.GetType("System.String");
dynColumn.Expression = "LastName+' '-ABC";
}
UserDataSet.Tables[0].Columns.Add(dynColumn);
Later in your code you can use this dummy column instead. You don't need to run any loop to concatenate a string.
Try using parallel for loop..
Here's the sample code..
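object strLock = new object();
Parallel.ForEach(dataTable.AsEnumerable(),
item => { lock (strLock) { str += ((item as DataRow)["ColumnName"]).ToString(); } }); // the lock is needed because += on a shared string is not thread-safe; the order of the pieces is not guaranteed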
I've separated the job into small pieces and let each piece be handled by its own Thread. You can fine-tune the number of threads by varying nthreads. Try it with different numbers so you can see the difference in performance.
private string getListOfFileNames(DataTable listWithFileNames)
{
string whereClause = String.Empty;
if (listWithFileNames.Columns.Contains("Filename"))
{
int nthreads = 8; // You can play with this parameter to fine-tune and get your best time.
int load = listWithFileNames.Rows.Count / nthreads; // How many items each thread must process.
List<ManualResetEvent> mres = new List<ManualResetEvent>(); // These let the method know when the work is done.
List<StringBuilder> sbuilders = new List<StringBuilder>(); // One builder per thread, each holding a piece of the big string.
for (int i = 0; i < nthreads; i++)
{
sbuilders.Add(new StringBuilder()); // Create a new string builder
mres.Add(new ManualResetEvent(false)); // Create a non-signaled ManualResetEvent.
if (i == 0) // We know where to put the very beginning of your where clause
{
sbuilders[0].Append("where filename in (");
}
// Calculate the last item to be processed by the current thread
int end = i == (nthreads - 1) ? listWithFileNames.Rows.Count : i * load + load;
// Create a new thread to deal with a part of the big table.
Thread t = new Thread(new ParameterizedThreadStart((x) =>
{
// This is the inside of the thread, we must unbox the parameters
object[] vars = x as object[];
int lIndex = (int)vars[0];
int uIndex = (int)vars[1];
ManualResetEvent ev = vars[2] as ManualResetEvent;
StringBuilder sb = vars[3] as StringBuilder;
bool coma = false;
// Concatenate the rows in the string builder
for (int j = lIndex; j < uIndex; j++)
{
if (coma)
{
sb.Append(", ");
}
else
{
coma = true;
}
sb.Append("'").Append(listWithFileNames.Rows[j]["Filename"]).Append("'");
}
// Tell the parent Thread that your job is done.
ev.Set();
}));
// Start the thread with the calculated params
t.Start(new object[] { i * load, end, mres[i], sbuilders[i] });
}
// Wait for all child threads to finish their job
WaitHandle.WaitAll(mres.ToArray());
// Concatenate the big string.
for (int i = 1; i < nthreads; i++)
{
sbuilders[0].Append(", ").Append(sbuilders[i]);
}
sbuilders[0].Append(")"); // Close your where clause
// Return the finished where clause
return sbuilders[0].ToString();
}
// Returns empty
return whereClause;
}

Optimize query on unknown database

A third party application creates one database per project. All the databases have the same tables and structure. New projects may be added at anytime so I can't use any EF schema.
What I do now is:
private IEnumerable<Respondent> getListRespondentWithStatuts(string db)
{
return query("select * from " + db + ".dbo.respondent");
}
private List<Respondent> query(string sqlQuery)
{
using (var sqlConx = new SqlConnection(Settings.Default.ConnectionString))
{
sqlConx.Open();
var cmd = new SqlCommand(sqlQuery, sqlConx);
return transformReaderIntoRespondentList(cmd.ExecuteReader());
}
}
private List<Respondent> transformReaderIntoRespondentList(SqlDataReader sqlDataReader)
{
var listeDesRépondants = new List<Respondent>();
while (sqlDataReader.Read())
{
var respondent = new Respondent
{
CodeRépondant = (string)sqlDataReader["ResRespondent"],
IsActive = (bool?)sqlDataReader["ResActive"],
CodeRésultat = (string)sqlDataReader["ResCodeResult"],
Téléphone = (string)sqlDataReader["Resphone"],
IsUnContactFinal = (bool?)sqlDataReader["ResCompleted"]
};
listeDesRépondants.Add(respondent);
}
return listeDesRépondants;
}
This works fine, but it is deadly slow (20 000 records per minute). Do you have any hints on which strategy would be faster? For info, what is really slow is the transformReaderIntoRespondentList method.
Thanks!!
Generally speaking, SELECT * FROM is bad practice, and here it may also be pulling back more data than is actually required. The transform operates on only a few columns; are more columns being returned than it needs? Consider replacing with:
private IEnumerable<Respondent> getListRespondentWithStatuts(string db)
{
return query("select ResRespondent, ResActive, ResCodeResult, Resphone, ResCompleted from " + db + ".dbo.respondent");
}
Also, guard against SQL injection attacks; concatenating strings into SQL queries is very dangerous.
When pulling data from a DataReader, I find that using the non-named lookups work best:
var respondent = new Respondent
{
CodeRépondant = sqlDataReader.GetString(0),
IsActive = sqlDataReader.IsDBNull(1) ? (Boolean?)null : sqlDataReader.GetBoolean(1),
CodeRésultat = sqlDataReader.GetString(2),
Téléphone = sqlDataReader.GetString(3),
IsUnContactFinal = sqlDataReader.IsDBNull(4) ? (Boolean?)null : sqlDataReader.GetBoolean(4)
};
I have not explicitly tested the performance difference in a long while, but that used to make a notable difference. The ordinal accessors do not have to do a named lookup and also avoid boxing/unboxing values.
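If hard-coded ordinals feel fragile, a middle ground is to resolve each column name once with GetOrdinal before the loop and use the typed getters inside it. The sketch below assumes the column names from the question; RespondentRow and RespondentReader are hypothetical stand-ins for the question's Respondent type, reduced to two columns, and the reader is taken as an IDataReader so it is not tied to SqlDataReader.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical, trimmed-down stand-in for the question's Respondent class.
public sealed class RespondentRow
{
    public string Code { get; set; }
    public bool? IsActive { get; set; }
}

public static class RespondentReader
{
    public static List<RespondentRow> ReadAll(IDataReader reader)
    {
        // Resolve names to ordinals once, not once per row.
        int codeOrdinal = reader.GetOrdinal("ResRespondent");
        int activeOrdinal = reader.GetOrdinal("ResActive");

        var result = new List<RespondentRow>();
        while (reader.Read())
        {
            result.Add(new RespondentRow
            {
                Code = reader.GetString(codeOrdinal),
                // Typed getters throw on NULL, so check IsDBNull first.
                IsActive = reader.IsDBNull(activeOrdinal)
                    ? (bool?)null
                    : reader.GetBoolean(activeOrdinal)
            });
        }
        return result;
    }
}
```

This keeps the code independent of the column order in the SELECT list while still avoiding the per-row name lookup.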
Other than that, without more info it is hard to say... do you need all 20,000 records?
UPDATE
Ran a simple local test case with 300,000 records and reduced the time to load all data by almost 50%. I imagine these results will vary depending on the type of data being retrieved; but it still does make a difference on overall execution time. That being said, in my environment we are talking a drop from 650ms to just over 300ms.
NOTE
If respondent is a view, what is likely "really slow" is the database building up the result set. Although the data reader will start processing information as soon as records are available, the ultimate bottleneck will be the database itself and/or network latency. Other than the above optimizations, there is not going to be much that you can do with your code unless you can index the view/table to optimize the query and/or reduce the information required.
