EDIT: It seems that the files (mostly? only?) stay locked when a transactional error occurs. I have to restart the database for things to work again, even though it is not actively processing anything (no CPU/RAM/HDD activity).
Environment
I have an ASP.NET application that uses the Neo4jClient NuGet package to talk to a Neo4j database. I have N SimpleNode objects that need to be inserted, where N can be anything from 100 to 50,000. There are other objects for the M edges that need to be inserted, where M can range from 100 to 1,000,000.
Code
Inserting with normal Cypher statements is too slow, with 8,000 nodes taking about 80 seconds using the following code:
Client.Cypher
.Unwind(sublist, "node")
.Merge("(n:Node { label: node.label })")
.OnCreate()
.Set("n = node")
.ExecuteWithoutResults();
Therefore I switched to the LOAD CSV import with the following code:
using (var sw = new StreamWriter(File.OpenWrite("temp.csv")))
{
sw.Write(SimpleNodeModel.Header + "\n");
foreach (var simpleNodeModel in nodes)
{
sw.Write(simpleNodeModel.ToCSVWithoutID() + "\n");
}
}
var f = new FileInfo("temp.csv");
Client.Cypher
.LoadCsv(new Uri("file://" + f.FullName), "csvNode", true)
.Merge("(n:Node {label:csvNode.label, source:csvNode.source})")
.ExecuteWithoutResults();
While still slow, it is an improvement.
Problem
The problem is that the CSV files stay locked by the Neo4j client (not by C# or any of my own code, it seems). I would like to overwrite the temporary .csv files so the disk does not fill up, or delete them after use. That is currently impossible because the process keeps them locked and I cannot touch them. It also means that running this code twice crashes the program, as the file cannot be written to the second time.
The nodes are inserted and do appear normally, so it is not the case that the database is still working on them. After some unknown and widely varying amount of time, the files do seem to unlock.
Question
How can I stop the Neo4j client from locking the files long after use, and why does it lock them for so long? Also: is there a better way of doing this in C#? I am aware of the Java importer, but I would like my tool to stay in the ASP.NET environment. Surely it must be possible to insert 8,000 simple nodes within 2 seconds from C#?
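One alternative worth sketching is to skip the CSV file entirely and push the nodes in batches through the UNWIND query shown above, so there is no temp file to get locked. This is an untested sketch: it assumes nodes is the same List of SimpleNodeModel used to write the CSV, the batch size of 1,000 is a guess, and the index mentioned in the comment would need to be created separately (e.g. in the Neo4j browser).
// Untested sketch: batch the UNWIND-based MERGE so no temp CSV file is needed.
// Creating an index first (e.g. CREATE INDEX ON :Node(label) in the Neo4j
// browser) should stop each MERGE from scanning every :Node.
// Requires System.Linq for Skip/Take; the batch size is an assumption to tune.
const int batchSize = 1000;
for (int i = 0; i < nodes.Count; i += batchSize)
{
    var batch = nodes.Skip(i).Take(batchSize).ToList();
    Client.Cypher
        .Unwind(batch, "node")
        .Merge("(n:Node { label: node.label })")
        .OnCreate()
        .Set("n = node")
        .ExecuteWithoutResults();
}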
SimpleNode class
public class SimpleNodeModel
{
public long id { get; set; }
public string label { get; set; }
public string source { get; set; } = "";
public override string ToString()
{
return $"label: {label}, source: {source}, id: {id}";
}
public SimpleNodeModel(string label, string source)
{
this.label = label;
this.source = source;
}
public SimpleNodeModel() { }
public static string Header => "label,source";
public string ToCSVWithoutID()
{
return $"{label},{source}";
}
}
Related
There are tons of resources on finding parent/child processes from a PID, too many to even list. I am finding that they are all incredibly slow.
Let's say I'm trying to implement a simple version of Process Explorer. I can enumerate all the processes easily and quickly with Process.GetProcesses(). However, getting the parent/child relationships takes forever.
I have tried using the NtQueryInformationProcess method and found that it takes ~0.1 seconds PER QUERY, that is, to get one parent from one process. To get children you basically have to build the whole tree and run this on every running process.
I then tried using ManagementObject to query for a process's parent and found it to be even slower at ~0.12 seconds.
I tried going the other way and using ManagementObject to query the children directly rather than querying parents and building a tree. Querying for the children took between 0.25 and 0.5 seconds PER QUERY.
With any of these methods, that means populating a model of the current process tree takes 6-15 full seconds. That just seems crazy to me, like an order of magnitude higher than I would have guesstimated. I can open Process Explorer and the parent/child relationships in the whole tree are just right there, immediately.
Is there something wrong with my computer to make it this slow? Is this just something that for some reason takes way longer to discover than you'd think? How can Process Explorer do it so fast?
So, I came up with the following, which is a significant improvement. Since it's apparently WMI that is dog slow, I reduced it to a single query for all processes and their parents, and then fill in the children in code so they are easily retrievable.
Notes on this code: it's a static class, sue me. It's called ProcessTree, but the data is not stored as a tree; it just models the tree, again sue me. Also, there might be bugs; this is a proof of concept.
This gets the speed down to a third of a second, instead of 5-15 seconds. Since this is still expensive, it caches data for the interval specified in 'timeout'. This cache is not built to work with threads.
public static class ProcessTree
{
private static Dictionary<int, ProcessTreeNode> processes;
private static DateTime timeFilled = DateTime.MinValue;
private static TimeSpan timeout = TimeSpan.FromSeconds(3);
static ProcessTree()
{
processes = new Dictionary<int, ProcessTreeNode>();
}
public static List<int> GetChildPids(int pid)
{
if (DateTime.Now > timeFilled + timeout) FillTree();
return processes[pid].Children;
}
public static int GetParentPid(int pid)
{
if (DateTime.Now > timeFilled + timeout) FillTree();
return processes[pid].Parent;
}
private static void FillTree()
{
processes.Clear();
var results = new List<Process>();
string queryText = "select processid, parentprocessid from win32_process";
using (var searcher = new ManagementObjectSearcher(queryText))
{
foreach (var obj in searcher.Get())
{
object pidObj = obj.Properties["processid"].Value;
object parentPidObj = obj.Properties["parentprocessid"].Value;
if (pidObj != null)
{
int pid = Convert.ToInt32(pidObj);
if (!processes.ContainsKey(pid)) processes.Add(pid, new ProcessTreeNode(pid));
ProcessTreeNode currentNode = processes[pid];
if (parentPidObj != null)
{
currentNode.Parent = Convert.ToInt32(parentPidObj);
}
}
}
}
foreach (ProcessTreeNode node in processes.Values)
{
if (node.Parent != 0 && processes.ContainsKey(node.Parent)) processes[node.Parent].Children.Add(node.Pid);
}
timeFilled = DateTime.Now;
}
}
public class ProcessTreeNode
{
public List<int> Children { get; set; }
public int Pid { get; private set; }
public int Parent { get; set; }
public ProcessTreeNode(int pid)
{
Pid = pid;
Children = new List<int>();
}
}
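A quick usage sketch for the classes above (assuming System.Diagnostics is referenced): the first call triggers the single WMI query, after which lookups are plain dictionary reads until the 3-second cache expires.
// Usage sketch for ProcessTree; pids are resolved from the cached WMI snapshot.
int currentPid = Process.GetCurrentProcess().Id;
int parentPid = ProcessTree.GetParentPid(currentPid);        // fills the cache on first use
List<int> childPids = ProcessTree.GetChildPids(currentPid);  // served from the cache
Console.WriteLine("Parent of {0}: {1}, children: {2}", currentPid, parentPid, childPids.Count);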
I will leave the question unanswered in case someone has a better solution.
You could P/Invoke Process32First, Process32Next and CreateToolhelp32Snapshot. The PROCESSENTRY32 structures returned include a parent process ID, which you can use to establish the parent relationship. These APIs are blazingly fast.
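A rough sketch of that P/Invoke approach is below; the declarations follow the usual pinvoke.net pattern, the helper name GetParentMap is just for illustration, and it should be treated as a starting point rather than a drop-in implementation.
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;
static class ProcessSnapshot
{
    private const uint TH32CS_SNAPPROCESS = 0x00000002;
    // Layout mirrors the native PROCESSENTRY32 (ANSI) structure.
    [StructLayout(LayoutKind.Sequential)]
    private struct PROCESSENTRY32
    {
        public uint dwSize;
        public uint cntUsage;
        public uint th32ProcessID;
        public IntPtr th32DefaultHeapID;
        public uint th32ModuleID;
        public uint cntThreads;
        public uint th32ParentProcessID;
        public int pcPriClassBase;
        public uint dwFlags;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 260)]
        public string szExeFile;
    }
    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern IntPtr CreateToolhelp32Snapshot(uint dwFlags, uint th32ProcessID);
    [DllImport("kernel32.dll")]
    private static extern bool Process32First(IntPtr hSnapshot, ref PROCESSENTRY32 lppe);
    [DllImport("kernel32.dll")]
    private static extern bool Process32Next(IntPtr hSnapshot, ref PROCESSENTRY32 lppe);
    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern bool CloseHandle(IntPtr hObject);
    // Returns pid -> parent pid for every running process in one snapshot.
    public static Dictionary<int, int> GetParentMap()
    {
        var map = new Dictionary<int, int>();
        IntPtr snapshot = CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0);
        if (snapshot == IntPtr.Zero || snapshot == new IntPtr(-1))
            return map;                                   // snapshot failed; return empty map
        try
        {
            var entry = new PROCESSENTRY32 { dwSize = (uint)Marshal.SizeOf(typeof(PROCESSENTRY32)) };
            if (Process32First(snapshot, ref entry))
            {
                do
                {
                    map[(int)entry.th32ProcessID] = (int)entry.th32ParentProcessID;
                }
                while (Process32Next(snapshot, ref entry));
            }
        }
        finally
        {
            CloseHandle(snapshot);
        }
        return map;
    }
}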
I am creating a Windows Forms application where I select a folder that contains multiple *.txt files. Their length may vary from a few thousand lines (kilobytes) up to 50 million lines (1 GB). Every line of a file holds three pieces of information: a date as a long, a location ID as an int, and a value as a float, all separated by semicolons (;). I need to calculate the minimum and maximum value across all those files, say which file each occurs in, and then find the most frequent value.
I already have these files verified and stored in an ArrayList. I open a thread to read the files one by one, and I read the data line by line. It works fine, but when there are 1 GB files I run out of memory. I tried to store the values in a dictionary where the key would be the date and the value would be an object containing all the information loaded from the line, along with the file name. I see I cannot use a dictionary this way, because at about 6M values I ran out of memory. So I should probably do it with multiple threads. I thought I could run two threads: one that reads the file and puts the information into some kind of container, and another that reads from it, does the calculations and then deletes the values from the container. But I don't know what container could do such a thing. Moreover, I need to calculate the most frequent value, so the values need to be stored somewhere, which leads me back to some kind of dictionary, and I already know I will run out of memory. I don't have much experience with threads either, so I don't know what is possible. Here is my code so far:
GUI:
namespace STI {
public partial class GUI : Form {
private String path = null;
public static ArrayList txtFiles;
public GUI() {
InitializeComponent();
_GUI1 = this;
}
//I run it in a thread. I thought I would run the second
//one here that would work with the values put in some container
private void buttonRun_Click(object sender, EventArgs e) {
ThreadDataProcessing processing = new ThreadDataProcessing();
Thread t_process = new Thread(processing.runProcessing);
t_process.Start();
//ThreadDataCalculating calculating = new ThreadDataCalculating();
//Thread t_calc = new Thread(calculating.runCalculation());
//t_calc.Start();
}
}
}
ThreadProcessing.cs
namespace STI.thread_package {
class ThreadDataProcessing {
public static Dictionary<long, object> finalMap = new Dictionary<long, object>();
public void runProcessing() {
foreach (FileInfo file in GUI.txtFiles) {
using (FileStream fs = File.Open(file.FullName.ToString(), FileMode.Open))
using (BufferedStream bs = new BufferedStream(fs))
using (StreamReader sr = new StreamReader(bs)) {
String line;
String[] splitted;
try {
while ((line = sr.ReadLine()) != null) {
splitted = line.Split(';');
if (splitted.Length == 3) {
long date = long.Parse(splitted[0]);
int location = int.Parse(splitted[1]);
float value = float.Parse(splitted[2], CultureInfo.InvariantCulture);
Entry entry = new Entry(date, location, value, file.Name);
if (!finalMap.ContainsKey(entry.getDate())) {
finalMap.Add(entry.getDate(), entry);
}
}
}
GUI._GUI1.update("File \"" + file.Name + "\" completed\n");
}
catch (FormatException ex) {
GUI._GUI1.update("Wrong file format.");
}
catch (OutOfMemoryException) {
GUI._GUI1.update("Out of memory");
}
}
}
}
}
}
and the object in which I put the values from lines:
Entry.cs
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace STI.entities_package {
class Entry {
private long date;
private int location;
private float value;
private String fileName;
private int count;
public Entry(long date, int location, float value, String fileName) {
this.date = date;
this.location = location;
this.value = value;
this.fileName = fileName;
this.count = 1;
}
public long getDate() {
return date;
}
public int getLocation() {
return location;
}
public String getFileName() {
return fileName;
}
}
}
I don't think that multithreading is going to help you here - it could help you separate the IO-bound tasks from the CPU-bound tasks, but your CPU-bound tasks are so trivial that I don't think they warrant their own thread. All multithreading is going to do is unnecessarily increase the problem complexity.
Calculating the min/max in constant memory is trivial: just maintain min and max variables (plus minFile and maxFile for the file names) that get updated whenever the current value is less than the minimum or greater than the maximum seen so far. Finding the most frequent value is going to require more memory, but with only a few million distinct values you ought to have enough RAM to store a Dictionary<float, int> that maintains the frequency of each value, after which you iterate through the map to determine which value had the highest frequency.
If for some reason you don't have enough RAM (make sure your files are being closed and garbage collected if you're running out of memory, because a Dictionary<float, int> with a few million entries ought to fit in less than a gigabyte of RAM), then you can make multiple passes over the files: on the first pass store the values in a Dictionary<interval, int>, where you've split the range between MIN_FLOAT and MAX_FLOAT into a few thousand sub-intervals, then on the next pass ignore all values that didn't fall into the interval with the highest frequency, thus shrinking the dictionary's size. However, the Dictionary<float, int> ought to fit into memory, so unless you start processing billions of values instead of millions you probably won't need a multi-pass procedure.
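A minimal sketch of that single-pass approach, assuming the date;location;value layout from the question and that txtFiles is the verified list of FileInfo objects, might look like this:
// Single pass: stream each file line by line (File.ReadLines never loads the
// whole file), track min/max plus the file they came from, and count value
// frequencies. Requires System.IO, System.Globalization, System.Collections.Generic.
var frequencies = new Dictionary<float, int>();
float min = float.MaxValue, max = float.MinValue;
string minFile = null, maxFile = null;
foreach (FileInfo file in txtFiles)
{
    foreach (string line in File.ReadLines(file.FullName))
    {
        string[] parts = line.Split(';');
        if (parts.Length != 3) continue;
        float value = float.Parse(parts[2], CultureInfo.InvariantCulture);
        if (value < min) { min = value; minFile = file.Name; }
        if (value > max) { max = value; maxFile = file.Name; }
        int count;
        frequencies.TryGetValue(value, out count);
        frequencies[value] = count + 1;
    }
}
// One pass over the dictionary gives the most frequent value.
float mostFrequent = 0f;
int bestCount = -1;
foreach (var kvp in frequencies)
{
    if (kvp.Value > bestCount) { mostFrequent = kvp.Key; bestCount = kvp.Value; }
}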
I have a list consisting of all the US Zip codes, each with 3 elements. Thus the list is ~45,000 x 3 strings. What is the best way to load this, essentially the most efficient/optimized? Right now I have a foreach loop running it, and every time it gets to the loading point it hangs. Is there a better approach?
Edit
The purpose of this is for the user to be able to type in a zip code and have the city and state displayed in two other text boxes. Right now I have it set to check as the user types, and after the fifth digit is entered it freezes up, I believe at the ZipCodes codes = new ZipCodes() line.
This is the code I'm currently using. I left one of the zipCodes.Add statements in, but deleted the other 44,999.
struct ZipCode
{
private String cvZipCode;
private String cvCity;
private String cvState;
public string ZipCodeID { get { return cvZipCode; } }
public string City { get { return cvCity; } }
public string State { get { return cvState; } }
public ZipCode(string zipCode, string city, string state)
{
cvZipCode = zipCode;
cvCity = city;
cvState = state;
}
public override string ToString()
{
return City.ToString() + ", " + State.ToString();
}
}
class ZipCodes
{
private List<ZipCode> zipCodes = new List<ZipCode>();
public ZipCodes()
{
zipCodes.Add(new ZipCode("97475","SPRINGFIELD","OR"));
}
public IEnumerable<ZipCode> GetLocation()
{
return zipCodes;
}
public IEnumerable<ZipCode> GetLocationZipCode(string zipCode)
{
return zipCodes;
}
public IEnumerable<ZipCode> GetLocationCities(string city)
{
return zipCodes;
}
public IEnumerable<ZipCode> GetLocationStates(string state)
{
return zipCodes;
}
}
private void LocateZipCode(TextBox source, TextBox destination, TextBox destination2 = null)
{
ZipCodes zips = new ZipCodes();
string tempZipCode;
List<ZipCode> zipCodes = new List<ZipCode>();
try
{
if (source.Text.Length == 5)
{
tempZipCode = source.Text.Substring(0, 5);
dataWorker.RunWorkerAsync();
destination.Text = zipCodes.Find(searchZipCode => searchZipCode.ZipCodeID == tempZipCode).City.ToString();
if (destination2.Text != null)
{
destination2.Text = zipCodes.Find(searchZipCode => searchZipCode.ZipCodeID == tempZipCode).State.ToString();
}
}
else destination2.Text = "";
}
catch (NullReferenceException)
{
destination.Text = "Invalid Zip Code";
if (destination2 != null)
{
destination2.Text = "";
}
}
}
There are several options that depend on your use case and target client machines.
Use paged controls. Use an existing paged control variant (e.g. Telerik) which supports paging. This way you will deal with a smaller subset of the available data.
Use search/filter controls. Force users to enter partial data to reduce the size of the data you need to show.
Using an observable collection will cause performance problems, as the framework-provided class does not support bulk loading. Make your own observable collection that supports bulk loading (i.e. one that does not raise a collection-changed event for every element you add). On a list of 5,000-10,000 members I've seen loading times reduced from 3 s to 0.03 s.
Use async operations when loading data from db. This way UI stays responsive and you have a chance to inform users about the current operation. This improves the perceived performance immensely.
Instead of loading all of the items, try loading on demand. For instance, when the user enters the first three letters, query the list and return only the matching items. Many controls exist for this purpose in both Silverlight and AJAX.
Thanks for all the responses, I really do appreciate them. A couple I didn't really understand, but I know that's my own lack of knowledge in certain areas of C#. In researching them, though, I stumbled across a different solution that has worked beautifully: using a Dictionary instead of a List. Even without using a BackgroundWorker, it loads on app start-up in about 5 seconds. I had heard of Dictionary before, but until now had never had a cause to use or research it, so this was doubly beneficial to me. Thanks again for all the assistance!
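For a concrete picture of the Dictionary-based lookup, a sketch is below; reading the data from a CSV file (rather than 45,000 hard-coded Add calls) and the file name are illustrative assumptions, while the ZipCode struct is the one from the question.
// Sketch: O(1) zip lookups via a Dictionary instead of List<T>.Find's linear scan.
// Requires System.IO and System.Collections.Generic; the file format is assumed
// to be "97475,SPRINGFIELD,OR" per line.
class ZipCodeLookup
{
    private readonly Dictionary<string, ZipCode> zipCodes = new Dictionary<string, ZipCode>();
    public ZipCodeLookup(string csvPath)
    {
        foreach (string line in File.ReadLines(csvPath))
        {
            string[] parts = line.Split(',');
            if (parts.Length == 3)
                zipCodes[parts[0]] = new ZipCode(parts[0], parts[1], parts[2]);
        }
    }
    public bool TryGetLocation(string zip, out ZipCode result)
    {
        return zipCodes.TryGetValue(zip, out result);
    }
}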
Seeking some advice, best practice etc...
Technology: C# .NET 4.0, WinForms, 32-bit
I am seeking some advice on how I can best tackle large data processing in my C# WinForms application, which experiences high memory usage (working set) and the occasional OutOfMemoryException.
The problem is that we perform a large amount of data processing "in-memory" when a "shopping-basket" is opened. In simplistic terms when a "shopping-basket" is loaded we perform the following calculations;
For each item in the "shopping-basket", retrieve its historical price going all the way back to the date the item first appeared in stock (this could be two months, two years or two decades of data). Historical price data is retrieved from text files, over the internet, or any other format supported by a price plugin.
For each item, for each day since it first appeared in stock, calculate various metrics, which builds a historical profile for each item in the "shopping-basket".
The result is that we can potentially perform hundreds, thousands and/or millions of calculations depending upon the number of items in the "shopping-basket". If the basket contains too many items we run the risk of hitting an OutOfMemory exception.
A couple of caveats;
This data needs to be calculated for each item in the "shopping-basket" and the data is kept until the "shopping-basket" is closed.
Even though we perform steps 1 and 2 in a background thread, speed is important, as the number of items in the "shopping-basket" can greatly affect overall calculation speed.
Memory is salvaged by the .NET garbage collector when a "shopping-basket" is closed. We have profiled our application and ensure that all references are correctly disposed and closed when a basket is closed.
After all the calculations are completed the resultant data is stored in an IDictionary. CalculatedData is a class whose properties are the individual metrics calculated by the above process.
Some ideas I've thought about;
Obviously my main concern is to reduce the amount of memory being used by the calculations; however, the volume of memory used can only be reduced if I
1) reduce the number of metrics being calculated for each day, or
2) reduce the number of days used for the calculation.
Neither of these options is viable if we wish to fulfill our business requirements.
Memory Mapped Files
One idea has been to use memory mapped files which will store the data dictionary. Would this be possible/feasible and how can we put this into place?
Use a temporary database
The idea is to use a separate (not in-memory) database which can be created for the life-cycle of the application. As "shopping-baskets" are opened we can persist the calculated data to the database for repeated use, alleviating the requirement to recalculate for the same "shopping-basket".
Are there any other alternatives that we should consider? What is best practice when it comes to calculations on large data and performing them outside of RAM?
Any advice is appreciated....
The easiest solution is a database, perhaps SQLite. Memory-mapped files don't automatically become dictionaries; you would have to code all the memory management yourself, and thereby fight the .NET GC system itself for ownership of the data.
If you're interested in trying the memory mapped file approach, you can try it now. I wrote a small native .NET package called MemMapCache that in essence creates a key/val database backed by MemMappedFiles. It's a bit of a hacky concept, but the program MemMapCache.exe keeps all references to the memory mapped files so that if your application crashes, you don't have to worry about losing the state of your cache.
It's very simple to use and you should be able to drop it in your code without too many modifications. Here is an example using it: https://github.com/jprichardson/MemMapCache/blob/master/TestMemMapCache/MemMapCacheTest.cs
Maybe it'd be of some use to you to at least further figure out what you need to do for an actual solution.
Please let me know if you do end up using it. I'd be interested in your results.
However, long-term, I'd recommend Redis.
As an update for those stumbling upon this thread...
We ended up using SQLite as our caching solution. The SQLite database we employ exists separately from the main data store used by the application. We persist calculated data to SQLite (the diskCache) as it's required and have code controlling cache invalidation etc. This was a suitable solution for us, as we were able to achieve write speeds of around 100,000 records per second.
For those interested, this is the code that controls inserts into the diskCache. Full credit for this code goes to JP Richardson (shown answering a question here) for his excellent blog post.
internal class SQLiteBulkInsert
{
#region Class Declarations
private SQLiteCommand m_cmd;
private SQLiteTransaction m_trans;
private readonly SQLiteConnection m_dbCon;
private readonly Dictionary<string, SQLiteParameter> m_parameters = new Dictionary<string, SQLiteParameter>();
private uint m_counter;
private readonly string m_beginInsertText;
#endregion
#region Constructor
public SQLiteBulkInsert(SQLiteConnection dbConnection, string tableName)
{
m_dbCon = dbConnection;
m_tableName = tableName;
var query = new StringBuilder(255);
query.Append("INSERT INTO ["); query.Append(tableName); query.Append("] (");
m_beginInsertText = query.ToString();
}
#endregion
#region Allow Bulk Insert
private bool m_allowBulkInsert = true;
public bool AllowBulkInsert { get { return m_allowBulkInsert; } set { m_allowBulkInsert = value; } }
#endregion
#region CommandText
public string CommandText
{
get
{
if(m_parameters.Count < 1) throw new SQLiteException("You must add at least one parameter.");
var sb = new StringBuilder(255);
sb.Append(m_beginInsertText);
foreach(var param in m_parameters.Keys)
{
sb.Append('[');
sb.Append(param);
sb.Append(']');
sb.Append(", ");
}
sb.Remove(sb.Length - 2, 2);
sb.Append(") VALUES (");
foreach(var param in m_parameters.Keys)
{
sb.Append(m_paramDelim);
sb.Append(param);
sb.Append(", ");
}
sb.Remove(sb.Length - 2, 2);
sb.Append(")");
return sb.ToString();
}
}
#endregion
#region Commit Max
private uint m_commitMax = 25000;
public uint CommitMax { get { return m_commitMax; } set { m_commitMax = value; } }
#endregion
#region Table Name
private readonly string m_tableName;
public string TableName { get { return m_tableName; } }
#endregion
#region Parameter Delimiter
private const string m_paramDelim = ":";
public string ParamDelimiter { get { return m_paramDelim; } }
#endregion
#region AddParameter
public void AddParameter(string name, DbType dbType)
{
var param = new SQLiteParameter(m_paramDelim + name, dbType);
m_parameters.Add(name, param);
}
#endregion
#region Flush
public void Flush()
{
try
{
if (m_trans != null) m_trans.Commit();
}
catch (Exception ex)
{
throw new Exception("Could not commit transaction. See InnerException for more details", ex);
}
finally
{
if (m_trans != null) m_trans.Dispose();
m_trans = null;
m_counter = 0;
}
}
#endregion
#region Insert
public void Insert(object[] paramValues)
{
if (paramValues.Length != m_parameters.Count)
throw new Exception("The values array count must be equal to the count of the number of parameters.");
m_counter++;
if (m_counter == 1)
{
if (m_allowBulkInsert) m_trans = m_dbCon.BeginTransaction();
m_cmd = m_dbCon.CreateCommand();
foreach (var par in m_parameters.Values)
m_cmd.Parameters.Add(par);
m_cmd.CommandText = CommandText;
}
var i = 0;
foreach (var par in m_parameters.Values)
{
par.Value = paramValues[i];
i++;
}
m_cmd.ExecuteNonQuery();
if (m_counter == m_commitMax)
{
try
{
if(m_trans != null) m_trans.Commit();
}
catch(Exception)
{ }
finally
{
if(m_trans != null)
{
m_trans.Dispose();
m_trans = null;
}
m_counter = 0;
}
}
}
#endregion
}
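To make the intended use of the class a little clearer, here is a hypothetical usage sketch; the connection string, table name, column names and the calculatedRows source are placeholders rather than anything from the original post.
// Hypothetical usage of SQLiteBulkInsert: rows are committed in batches of
// CommitMax, and Flush commits whatever remains of the last batch.
using (var connection = new SQLiteConnection("Data Source=diskCache.sqlite"))
{
    connection.Open();
    var bulk = new SQLiteBulkInsert(connection, "CalculatedData");
    bulk.AddParameter("ItemId", DbType.Int32);
    bulk.AddParameter("MetricDate", DbType.DateTime);
    bulk.AddParameter("MetricValue", DbType.Double);
    foreach (var row in calculatedRows)   // placeholder source of calculated metrics
        bulk.Insert(new object[] { row.ItemId, row.Date, row.Value });
    bulk.Flush();
}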
We need to index (in ASP.NET) all our records stored in a SQL Server table. That table has around 2M records with text (nvarchar) data too in each row.
Is it okay to fetch all records in one go as we need to index them (for search)? What is the other option (I want to avoid pagination)?
Note: I am not displaying these records, just need all of them in one go so that I can index them via a background thread.
Do I need to set any long timeouts for my query? If so, what is the most effective way to set a longer timeout if I am running the query from an ASP.NET page?
If I needed something like this, just thinking about it from the database side, I'd probably export it to a file. Then that file can get moved around pretty easily. Moving around data sets that large is a huge pain to all involved. You can use SSIS, sqlcmd or even bcp in a batch command to get it done.
Then, you just have to worry about what you're doing with it on the app side, no worries about locking & everything on the database side once you've exported it.
I don't think a page is a good place for this regardless. There should be a different process or program that does this. On a related note maybe something like http://incubator.apache.org/lucene.net/ would help you?
Is it okay to fetch all records in one go as we need to index them
(for search)? What is the other option (I want to avoid pagination)?
Memory Management Issue / Performance Issue
You can face a System.OutOfMemoryException if you bring back 2 million records in one go, as you will be keeping all those records in a DataSet and the DataSet is held entirely in RAM.
Do I need to set any long time outs for my query? If yes, what is the
most effective method for setting longer time outs if I am running the
query from ASP.NET page?
using (System.Data.SqlClient.SqlCommand cmd = new System.Data.SqlClient.SqlCommand())
{
cmd.CommandTimeout = 0;
}
Suggestion
It's better to filter the records at the database level...
Alternatively, fetch all records from the database and save them to a file, then access that file for any intermediate operations.
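As a sketch of that export-to-file approach from C#: the connection string, table and column names below are placeholders, and CommandBehavior.SequentialAccess keeps the reader streaming so the full 2M-row result set never sits in memory.
// Illustrative only: stream rows straight to a flat file instead of loading
// them into a DataSet. Requires System.Data, System.Data.SqlClient and System.IO.
using (var connection = new SqlConnection(connectionString))   // placeholder connection string
using (var command = new SqlCommand("SELECT Id, Title, Body FROM dbo.Records", connection))
using (var writer = new StreamWriter("records-export.txt"))
{
    command.CommandTimeout = 0;   // long-running export, no timeout
    connection.Open();
    using (var reader = command.ExecuteReader(CommandBehavior.SequentialAccess))
    {
        while (reader.Read())
        {
            writer.Write(reader.GetInt32(0));
            writer.Write('\t');
            writer.Write(reader.GetString(1));
            writer.Write('\t');
            writer.WriteLine(reader.GetString(2));
        }
    }
}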
What you describe is Extract, Transform, Load (ETL). There are two options I'm aware of:
SSIS, which is part of SQL Server
Rhino.ETL
I prefer Rhino.ETL as it's completely written in C#, you can create scripts in Boo, and it's much easier to test and compose ETL processes. The library is also built to handle large sets of data, so memory management is built in.
One final note: while ASP.NET might be the entry point that starts the indexing process, I wouldn't run the process within ASP.NET, as it could take minutes or hours depending on the number of records and the processing involved.
Instead, have ASP.NET be the entry point that fires off a background task to process the records, ideally one completely independent of ASP.NET so you avoid any timeout or shutdown issues.
Process your records in batches. You are going to have two main issues. (1) You need to index all of the existing records. (2) You will want to update the index with records that were added, updated or deleted. It might sound easier just to drop the index and recreate it, but that should be avoided if possible. Below is an example of processing the [Production].[TransactionHistory] table from the AdventureWorks2008R2 database in batches of 10,000 records. It does not load all of the records into memory. Output on my local computer shows Processed 113443 records in 00:00:00.2282294. Obviously, this doesn't take into account a remote database or the processing time for each record.
class Program
{
private static string ConnectionString
{
get { return ConfigurationManager.ConnectionStrings["db"].ConnectionString; }
}
static void Main(string[] args)
{
int recordCount = 0;
int lastId = -1;
bool done = false;
Stopwatch timer = Stopwatch.StartNew();
do
{
done = true;
IEnumerable<TransactionHistory> transactionDataRecords = GetTransactions(lastId, 10000);
foreach (TransactionHistory transactionHistory in transactionDataRecords)
{
lastId = transactionHistory.TransactionId;
done = false;
recordCount++;
}
} while (!done);
timer.Stop();
Console.WriteLine("Processed {0} records in {1}", recordCount, timer.Elapsed);
}
/// Get a new open connection
private static SqlConnection GetOpenConnection()
{
SqlConnection connection = new SqlConnection(ConnectionString);
connection.Open();
return connection;
}
private static IEnumerable<TransactionHistory> GetTransactions(int lastTransactionId, int count)
{
const string sql = "SELECT TOP(@count) [TransactionID],[TransactionDate],[TransactionType] FROM [Production].[TransactionHistory] WHERE [TransactionID] > @LastTransactionId ORDER BY [TransactionID]";
return GetData<TransactionHistory>((connection) =>
{
SqlCommand command = new SqlCommand(sql, connection);
command.Parameters.AddWithValue("#count", count);
command.Parameters.AddWithValue("#LastTransactionId", lastTransactionId);
return command;
}, DataRecordToTransactionHistory);
}
// function to convert a data record to a TransactionHistory object
private static TransactionHistory DataRecordToTransactionHistory(IDataRecord record)
{
TransactionHistory transactionHistory = new TransactionHistory();
transactionHistory.TransactionId = record.GetInt32(0);
transactionHistory.TransactionDate = record.GetDateTime(1);
transactionHistory.TransactionType = record.GetString(2);
return transactionHistory;
}
private static IEnumerable<T> GetData<T>(Func<SqlConnection, SqlCommand> commandBuilder, Func<IDataRecord, T> dataFunc)
{
using (SqlConnection connection = GetOpenConnection())
{
using (SqlCommand command = commandBuilder(connection))
{
using (IDataReader reader = command.ExecuteReader())
{
while (reader.Read())
{
T record = dataFunc(reader);
yield return record;
}
}
}
}
}
}
public class TransactionHistory
{
public int TransactionId { get; set; }
public DateTime TransactionDate { get; set; }
public string TransactionType { get; set; }
}