cannot shuffle rows of DataTable - c#

I need to shuffle the rows of a DataTable, because accessing indexes randomly would not work in my scenario. dt1 holds the base data that I have to shuffle, and dt is the DataTable that should receive the shuffled data. My code is:
int j;
for (int i = 0; i < dt1.Rows.Count - 1; i++)
{
j = rnd.Next(0, dt1.Rows.Count - 1);
DataRow row = dt1.Rows[j];
dt.ImportRow(row);
}
There is no syntax error, but when I run my code and later access dt, some of the same rows have been imported twice. What am I doing wrong here?

A DataRow can only belong to one DataTable, so create a new row with the values from the existing DataRow.
dt.Rows.Add(row.ItemArray);
Or
dt.ImportRow(row);
Update:
Another approach to randomize any collection (From this Link).
public static class Extensions
{
private static Random random = new Random();
public static IEnumerable<T> OrderRandomly<T>(this IEnumerable<T> items)
{
List<T> randomly = new List<T>(items);
while (randomly.Count > 0)
{
Int32 index = random.Next(randomly.Count);
yield return randomly[index];
randomly.RemoveAt(index);
}
}
}
Now you can randomize any collection just by calling this extension function.
var dt = dt1.AsEnumerable()
.OrderRandomly()
.CopyToDataTable();
Check this Example

Here's an extension method I wrote for DataTables. Put it in a static DataTableExtensions class:
public static DataTable Shuffle(this DataTable table) {
int n = table.Rows.Count;
List<DataRow> shuffledRows = new List<DataRow>();
foreach (DataRow row in table.Rows) {
shuffledRows.Add(row);
}
while (n > 1) {
n--;
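// Random.Range is UnityEngine.Random.Range (upper bound exclusive for ints).
// Outside Unity, call Next(0, n + 1) on a shared System.Random instance instead.
int k = Random.Range(0, n + 1);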
DataRow value = shuffledRows[k];
shuffledRows[k] = shuffledRows[n];
shuffledRows[n] = value;
}
DataTable shuffledTable = table.Clone();
foreach (DataRow row in shuffledRows) {
shuffledTable.ImportRow(row);
}
return shuffledTable;
}
It is probably not the most efficient approach, but it works.
Usage:
DataTable shuffledTable = otherDataTable.Shuffle();

Here is my solution.
Just pass your table to the function and it will randomize the rows within the table.
public static void RandomizeTable(DataTable RPrl)
{
System.Security.Cryptography.RNGCryptoServiceProvider provider = new System.Security.Cryptography.RNGCryptoServiceProvider();
int n = RPrl.Rows.Count;
while (n > 1)
{
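// A single random byte can only pick among n <= 255 rows; for larger tables
// Byte.MaxValue / n is 0 and the rejection loop below never exits.
byte[] box = new byte[1];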
do
{
provider.GetBytes(box);
}
while (!(box[0] < n * (System.Byte.MaxValue / n)));
int k = (box[0] % n);
n--;
object[] tmp = RPrl.Rows[k].ItemArray;
RPrl.Rows[k].ItemArray = RPrl.Rows[n].ItemArray;
RPrl.Rows[n].ItemArray = tmp;
}
}
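The single-byte version above cannot handle tables with more than 255 rows. A minimal sketch of the same in-place shuffle without that limit follows. It assumes .NET Core 3.0 or later, where the static RandomNumberGenerator.GetInt32 method is available; the class name SecureTableShuffle is only illustrative.

```csharp
using System;
using System.Data;
using System.Security.Cryptography;

public static class SecureTableShuffle
{
    // Fisher-Yates shuffle of the rows' values, in place.
    // Assumes .NET Core 3.0 or later (RandomNumberGenerator.GetInt32).
    public static void RandomizeTable(DataTable table)
    {
        for (int n = table.Rows.Count; n > 1; n--)
        {
            // Uniform index in [0, n); the upper bound is exclusive.
            int k = RandomNumberGenerator.GetInt32(n);

            // Swap the values of row k with those of the last unshuffled row.
            object[] tmp = table.Rows[k].ItemArray;
            table.Rows[k].ItemArray = table.Rows[n - 1].ItemArray;
            table.Rows[n - 1].ItemArray = tmp;
        }
    }

    public static void Main()
    {
        var table = new DataTable();
        table.Columns.Add("Id", typeof(int));
        for (int i = 0; i < 1000; i++) table.Rows.Add(i);

        RandomizeTable(table);
        foreach (DataRow row in table.Rows) Console.Write(row["Id"] + " ");
    }
}
```

If cryptographic randomness is not required, a shared System.Random instance with Next(n) would do the same job with less overhead.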

Related

How to divide DataTable to multiple DataTables by row index

I have one DataTable that I need to split into multiple DataTables, with the same structure, by rows.
The way I need to split the tables is:
If I have a table with 40 rows, each individual new table can have a maximum of 17 rows. So the first DataTable should have rows 1-17, the second rows 18-34 and the third rows 35-40. Finally, I would add these tables to the DataSet.
I tried creating copies of the tables and deleting rows by index, but that didn't work.
You can use table.AsEnumerable() with Skip(startRowIndex) for the index of the first row and Take(size) for the size of each table:
var t1 = table.AsEnumerable().Skip(0).Take(17).CopyToDataTable();
var t2 = table.AsEnumerable().Skip(17).Take(17).CopyToDataTable();
...
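The repeated Skip/Take calls above can be generalized so the number of tables does not have to be written out by hand. The following is a sketch only: it assumes .NET 6 or later for Enumerable.Chunk, and the names ChunkSplitDemo and SplitByChunk are made up for illustration.

```csharp
using System;
using System.Data;
using System.Linq;

public static class ChunkSplitDemo
{
    // Splits a table into consecutive tables of at most `size` rows each.
    // Enumerable.Chunk requires .NET 6 or later (assumed here).
    public static DataTable[] SplitByChunk(DataTable table, int size)
    {
        return table.AsEnumerable()
                    .Chunk(size)                             // DataRow[] groups of up to `size` rows
                    .Select(rows => rows.CopyToDataTable())  // chunks are never empty
                    .ToArray();
    }

    public static void Main()
    {
        var table = new DataTable();
        table.Columns.Add("Id", typeof(int));
        for (int i = 1; i <= 40; i++) table.Rows.Add(i);

        var ds = new DataSet();
        ds.Tables.AddRange(SplitByChunk(table, 17));

        foreach (DataTable part in ds.Tables)
            Console.WriteLine(part.Rows.Count);
    }
}
```

For the 40-row table this yields three tables with 17, 17 and 6 rows. An empty source table produces no tables at all, which avoids the exception CopyToDataTable throws for an empty sequence.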
A reusable way that handles cases like empty tables or tables that contain fewer rows than the split-count is this method:
public static IEnumerable<DataTable> SplitTable(DataTable table, int splitCount)
{
if (table.Rows.Count <= splitCount)
{
yield return table.Copy(); // always create a new table
yield break;
}
for (int i = 0; i + splitCount <= table.Rows.Count; i += splitCount)
{
yield return CreateCloneTable(table, i, splitCount);
}
int remaining = table.Rows.Count % splitCount;
if (remaining > 0)
{
yield return CreateCloneTable(table, table.Rows.Count - remaining, splitCount);
}
}
private static DataTable CreateCloneTable(DataTable mainTable, int startIndex, int length)
{
DataTable tClone = mainTable.Clone(); // empty but same schema
for (int r = startIndex; r < Math.Min(mainTable.Rows.Count, startIndex + length); r++)
{
tClone.ImportRow(mainTable.Rows[r]);
}
return tClone;
}
Here is your case with 40 rows and 17 split-count:
DataTable table = new DataTable();
table.Columns.Add();
for (int i = 1; i <= 40; i++) table.Rows.Add(i.ToString());
DataSet ds = new DataSet();
ds.Tables.AddRange(SplitTable(table, 17).ToArray());
Demo: https://dotnetfiddle.net/8XUBjH

C# / I fill up a DataGridView from a string array by picking the next cell randomly and do not want to pick used cell again

I have a string array and have to fill a DataGridView with it by picking a random next cell each time. I want to use all items of the array but avoid picking a cell I have already filled. It is fine if some cells stay empty, but I have to use all items.
What I tried:
foreach (var item in myarray)
{
y = random.Next(0, 5);
x = random.Next(0, t.Rows.Count);
t.CurrentCell = t[y, x];
for (int j = 0; j < t.Columns.Count; j++)
{
for (int k = 0; k < t.Rows.Count; k++)
{
if (t.Rows[k].Cells[j].Value == null)
{
t.CurrentCell.Value = item;
}
}
}
}
Many thanks!
You can number the cells from left to right, row by row; for a 4x4 grid the cells are numbered 0 to 15. Then shuffle the array of integers 0...15 and iterate over it, assigning each string to the corresponding cell. A cell number (cellNumber) converts back to coordinates like this:
(cellNumber / columnsCount, cellNumber % columnsCount).
Consider following source code:
private static Random rng = new Random();
public static void Shuffle<T>(this IList<T> list)
{
int n = list.Count;
while (n > 1) {
n--;
int k = rng.Next(n + 1);
T value = list[k];
list[k] = list[n];
list[n] = value;
}
}
public static (int x, int y) GetCoordinates(int cellNumber, int rowsCount, int columnsCount) {
// Row-major numbering: divide by, and take the remainder of, the number of columns.
return (cellNumber / columnsCount, cellNumber % columnsCount);
}
public static void Main()
{
var rowsCount = 4;
var columnsCount = 4;
// The second argument of Enumerable.Range is a count, not an upper bound.
var cellNumbers = Enumerable.Range(0, rowsCount * columnsCount).ToList();
cellNumbers.Shuffle();
foreach (var cellCoordinates in cellNumbers.Select(x => GetCoordinates(x, rowsCount, columnsCount))) {
Console.WriteLine($"{cellCoordinates.x},{cellCoordinates.y}");
}
}
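To connect this back to the question, here is a minimal sketch that places every item of an array into a distinct random cell. It uses a plain string[,] as a stand-in for the grid, and the names RandomCellFill and Fill are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RandomCellFill
{
    private static readonly Random Rng = new Random();

    // Returns a rows x columns grid in which every item occupies a distinct random cell.
    // Cells are numbered row by row: number = row * columns + column.
    public static string[,] Fill(IReadOnlyList<string> items, int rows, int columns)
    {
        if (items.Count > rows * columns)
            throw new ArgumentException("More items than cells.");

        // Fisher-Yates shuffle of all cell numbers.
        int[] cells = Enumerable.Range(0, rows * columns).ToArray();
        for (int n = cells.Length - 1; n > 0; n--)
        {
            int k = Rng.Next(n + 1);
            (cells[k], cells[n]) = (cells[n], cells[k]);
        }

        // The first items.Count shuffled cell numbers are distinct by construction.
        var grid = new string[rows, columns];
        for (int i = 0; i < items.Count; i++)
        {
            int row = cells[i] / columns;
            int column = cells[i] % columns;
            grid[row, column] = items[i];
        }
        return grid;
    }

    public static void Main()
    {
        string[,] grid = Fill(new[] { "a", "b", "c", "d", "e" }, 3, 4);
        for (int r = 0; r < grid.GetLength(0); r++)
        {
            for (int c = 0; c < grid.GetLength(1); c++)
                Console.Write((grid[r, c] ?? ".") + " ");
            Console.WriteLine();
        }
    }
}
```

With a DataGridView the assignment inside the loop would presumably become t[column, row].Value = items[i], since the DataGridView indexer takes the column index first.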

What is the best way to get cell index of a word table without using for loop in c#?

I am getting the index of a cell in a Word table using a for loop, which takes a lot of time for bigger tables. Is there any way to do this without a for loop?
public static int[] GetColumnIndex(Xceed.Words.NET.Table table, string columnName, int endRow,int k)
{
int[] data = { -1, -1 };
for (int j = k; j < endRow; j++)
{
for (int i = 0; i < table.Rows[j].Cells.Count; ++i)
{
if (table.Rows[j].Cells[i].Paragraphs[0].Text.Equals("«" + columnName + "»"))
{
data[0] = j;
data[1] = i;
return data;
}
}
}
return data;
}
and I am calling this function from another function
int startRow = 0, endRow = 0;
int[] ind;
DocX doc;
doc = DocX.Load(fileName);
Xceed.Words.NET.Table t;
t = doc.Tables[0];
endRow = t.Rows.Count;
System.Data.DataTable dt = new DataTable();
dt = reader(report.Query);
foreach (DataColumn col in dt.Columns)
{
ind = GetColumnIndex(t, col.ColumnName, endRow,2);
//...more code here...
}
Based on your access pattern, you search the same table many times: once per column name, so the total work grows quickly as the table gets bigger. It is therefore worth transforming the table's content into a data structure indexed by the cell text (for example, a SortedDictionary).
First, create a class that holds the content of the table. When you want to search the same table again, you can reuse the same instance and avoid rebuilding the dictionary:
public class XceedTableAdapter
{
private readonly SortedDictionary<string, (int row, int column)> dict;
public XceedTableAdapter(Xceed.Words.NET.Table table)
{
this.dict = new SortedDictionary<string, (int, int)>();
// Copy the content of the table into the dict.
// If you have duplicate words you need a SortedDictionary<string, List<(int, int)>> type. This is not clear in your question.
for (var i = 0; i < table.Rows.Count; i++)
{
for (var j = 0; j < table.Rows[i].Cells.Count; j++)
{
// this will overwrite the index if the text was previously found:
this.dict[table.Rows[i].Cells[j].Paragraphs[0].Text] = (i, j);
}
}
}
public (int, int) GetColumnIndex(string searchText)
{
if(this.dict.TryGetValue(searchText, out var index))
{
return index;
}
return (-1, -1);
}
}
Now you loop the entire table only once and the subsequent searches will happen in O(log n). If Xceed has a function to transform data table to a dictionary, that would be quite handy. I'm not familiar with this library.
Now you can search it like:
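var searchableTable = new XceedTableAdapter(doc.Tables[0]);
foreach (DataColumn col in dt.Columns)
{
// The table stores field names wrapped in « », as in the original search.
var ind = searchableTable.GetColumnIndex("«" + col.ColumnName + "»");
}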
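One behavioral difference is worth noting: the original nested loop returns the first matching cell, while the adapter above keeps the last one. The sketch below keeps the first occurrence instead. It is an illustration only: it uses a jagged string array as a hypothetical stand-in for the cell texts, and a plain Dictionary (average O(1) lookups) rather than a SortedDictionary.

```csharp
using System;
using System.Collections.Generic;

public static class CellIndexSketch
{
    // Builds a lookup from cell text to the (row, column) of its first occurrence.
    // `cells` is a hypothetical stand-in for the text of each table cell.
    public static Dictionary<string, (int row, int column)> BuildIndex(string[][] cells)
    {
        var index = new Dictionary<string, (int row, int column)>();
        for (int r = 0; r < cells.Length; r++)
        {
            for (int c = 0; c < cells[r].Length; c++)
            {
                // Keep the first position only, like the original nested loop.
                if (!index.ContainsKey(cells[r][c]))
                    index[cells[r][c]] = (r, c);
            }
        }
        return index;
    }

    public static void Main()
    {
        string[][] cells =
        {
            new[] { "Header", "Other" },
            new[] { "«Name»", "«Age»" },
            new[] { "«Name»", "footer" },
        };

        var index = BuildIndex(cells);
        (int row, int column) pos =
            index.TryGetValue("«Name»", out var found) ? found : (-1, -1);
        Console.WriteLine($"{pos.row},{pos.column}"); // prints 1,0
    }
}
```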

How to create (n) objects in c# based on loop

Say I want to create (n) DataTables named DT(n)... how can I go about that in a loop.
Pseudo code below
int n=99;
for (int i = 0; i < n; i++)
{
DataTable DT + n = new DataTable(); // <--- this
}
Is this possible?
Store them in a data structure.
Enumerable.Range(0,n).Select(x => new DataTable()).ToArray()
No, you can't have dynamic variable names in C#.
You can however, put them into an array (this is a much better approach anyways):
int n=99;
DataTable[] DT = new DataTable[n];
for (int i = 0; i < n; i++)
{
DT[i] = new DataTable();
}
You can't create dynamically named variables in C#.
For your purpose you are better off using Arrays or Dictionaries ( http://msdn.microsoft.com/en-us/library/xfhwa508%28v=vs.110%29.aspx )
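If the tables really need to be looked up by a name such as DT5, a dictionary keyed by that name gets close to what the question asks for. This is a minimal sketch; the NamedTables class and the "DT" + index key scheme are just illustrations.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class NamedTables
{
    // Creates n tables keyed "DT0" .. "DT{n-1}"; the key scheme is only an illustration.
    public static Dictionary<string, DataTable> Create(int n)
    {
        var tables = new Dictionary<string, DataTable>();
        for (int i = 0; i < n; i++)
        {
            string name = "DT" + i;
            tables[name] = new DataTable(name); // the key doubles as the TableName
        }
        return tables;
    }

    public static void Main()
    {
        var tables = Create(99);
        tables["DT5"].Columns.Add("Id", typeof(int));
        Console.WriteLine(tables.Count); // prints 99
    }
}
```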
try
List<DataTable> DTs = new List<DataTable>();
int n =99;
for (int i = 0; i < n; i++)
{
DataTable DT = new DataTable();
DTs.Add(DT);
}
Can you try this one
public List<T> CreateObjects<T>(int numbers) where T: new()
{
List<T> _return = new List<T>();
for (int i = 0; i < numbers; i++) // '<' so that exactly `numbers` objects are created
{
_return.Add(new T());
}
return _return;
}
To use it:
var myList = CreateObjects<DataTable>(100);
var otherList = CreateObjects<AnyClass>(100);
Any type with a public parameterless constructor works.

Adding multiple rows to DataTable

I know two ways to add new row with data to a DataTable
string[] arr2 = { "one", "two", "three" };
dtDeptDtl.Columns.Add("Dept_Cd");
for (int a = 0; a < arr2.Length; a++)
{
DataRow dr2 = dtDeptDtl.NewRow();
dr2["Dept_Cd"] = arr2[a];
dtDeptDtl.Rows.Add(dr2);
}
for (int a = 0; a < arr2.Length; a++)
{
dtDeptDtl.Rows.Add();
dtDeptDtl.Rows[a]["Dept_Cd"] = arr2[a];
}
Both of the above methods give me the same result, i.e. one, two and three are added to the DataTable in separate rows.
My question is: what is the difference between the two approaches, and which one is better performance-wise?
Some decompiler observations
In both scenarios, a different overload of the System.Data.DataRowCollection.Add method is being used.
The first approach uses:
public void Add(DataRow row)
{
this.table.AddRow(row, -1);
}
The second approach will use:
public DataRow Add(params object[] values)
{
int record = this.table.NewRecordFromArray(values);
DataRow dataRow = this.table.NewRow(record);
this.table.AddRow(dataRow, -1);
return dataRow;
}
Now, take a look at this little beast:
internal int NewRecordFromArray(object[] value)
{
int count = this.columnCollection.Count;
if (count < value.Length)
{
throw ExceptionBuilder.ValueArrayLength();
}
int num = this.recordManager.NewRecordBase();
int result;
try
{
for (int i = 0; i < value.Length; i++)
{
if (value[i] != null)
{
this.columnCollection[i][num] = value[i];
}
else
{
this.columnCollection[i].Init(num);
}
}
for (int j = value.Length; j < count; j++)
{
this.columnCollection[j].Init(num);
}
result = num;
}
catch (Exception e)
{
if (ADP.IsCatchableOrSecurityExceptionType(e))
{
this.FreeRecord(ref num);
}
throw;
}
return result;
}
Especially, note the this.columnCollection[i][num] = value[i];, which will call:
public DataColumn this[int index]
{
get
{
DataColumn result;
try
{
result = (DataColumn)this._list[index];
}
catch (ArgumentOutOfRangeException)
{
throw ExceptionBuilder.ColumnOutOfRange(index);
}
return result;
}
}
Moving forward, we discover that actually _list is an ArrayList:
private readonly ArrayList _list = new ArrayList();
Conclusion
To summarize: if you use dtDeptDtl.Rows.Add(); instead of dtDeptDtl.Rows.Add(dr2);, you take a performance hit that increases as the number of columns grows. The line responsible is the call to NewRecordFromArray, which loops over every column of the table (through the ArrayList-backed column collection) for each row added. That loop is a single pass per row, so the extra cost should grow roughly linearly with the column count rather than exponentially.
Note: This can easily be tested if you add, say, 8 columns to the table and run each approach in a for loop 1000000 times.
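The test suggested in the note could look roughly like the sketch below. It is a rough Stopwatch timing, not a rigorous benchmark; the column names, the AddRowTiming class and its method names are assumptions made for illustration, and the results will vary by runtime and machine.

```csharp
using System;
using System.Data;
using System.Diagnostics;

public static class AddRowTiming
{
    private static DataTable NewTable(int columns)
    {
        var table = new DataTable();
        for (int c = 0; c < columns; c++) table.Columns.Add("Col" + c);
        return table;
    }

    // Approach 1: NewRow, fill, then Rows.Add(DataRow).
    public static TimeSpan TimeNewRow(int columns, int rows)
    {
        DataTable table = NewTable(columns);
        var sw = Stopwatch.StartNew();
        for (int a = 0; a < rows; a++)
        {
            DataRow dr = table.NewRow();
            dr["Col0"] = "value";
            table.Rows.Add(dr);
        }
        sw.Stop();
        return sw.Elapsed;
    }

    // Approach 2: Rows.Add() with no values, then fill through the row index.
    public static TimeSpan TimeAddThenSet(int columns, int rows)
    {
        DataTable table = NewTable(columns);
        var sw = Stopwatch.StartNew();
        for (int a = 0; a < rows; a++)
        {
            table.Rows.Add();
            table.Rows[a]["Col0"] = "value";
        }
        sw.Stop();
        return sw.Elapsed;
    }

    public static void Main()
    {
        const int columns = 8;
        const int rows = 1000000;

        // Warm-up so JIT compilation does not distort the first measurement.
        TimeNewRow(columns, 1000);
        TimeAddThenSet(columns, 1000);

        Console.WriteLine("NewRow + Add(row): " + TimeNewRow(columns, rows));
        Console.WriteLine("Add() + indexer:   " + TimeAddThenSet(columns, rows));
    }
}
```

Varying the columns constant should show whether the gap between the two approaches widens with the column count.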
