C#: Storing Filesize in Database

I'm storing objects in a database as varbinary(MAX) and want to know their file size. Without getting into the pros and cons of using the varbinary(MAX) datatype, what is the best way to read the file size of an object stored in the database?
Is it:
A. Better to just read the column from the DB and call the .Length property of System.Data.Linq.Binary.
OR
B. Better to determine the file size of the object before it is added to the DB and create another column called Size.
The files I'm dealing with are generally between 0 and 3 MB with a skew towards the smaller size. It doesn't necessarily make sense to hit the DB again for the file size, but it also doesn't really make sense to read through the entire item to determine its length.

Why not add a calculated column in your database that would be DATALENGTH([your_col])?
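The two approaches can be put side by side in a rough sketch. The class, table, and column names here (StoredFile, dbo.Files, Content, FileSize, Id) are hypothetical and not taken from the question; the SQL strings are only meant to show the shape of the computed-column idea.

```csharp
using System;

// Hypothetical row model for option B: the size is captured once, at insert time.
public sealed class StoredFile
{
    public byte[] Content { get; }
    public long Size { get; }   // value for a hypothetical Size column

    public StoredFile(byte[] content)
    {
        Content = content ?? throw new ArgumentNullException(nameof(content));
        Size = content.LongLength;   // known before the row is written
    }
}

public static class FileSizeSql
{
    // T-SQL for the computed-column suggestion; table and column names are assumptions.
    public const string AddComputedColumn =
        "ALTER TABLE dbo.Files ADD FileSize AS DATALENGTH(Content);";

    // Reads only the size, so the blob itself is not sent to the client.
    public const string SelectSize =
        "SELECT DATALENGTH(Content) FROM dbo.Files WHERE Id = @id;";
}
```

Either way the client avoids pulling the whole varbinary value just to call .Length on it.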

Related

Storing files to byte array

I have a database object that has a column to store files as varbinary. I have tried to store a single file using C# and byte arrays and it worked. How can I add multiple files to this column? Please help.
I suppose you'd need to concatenate the byte arrays from each file into one giant byte array and then insert that byte array into the field, but then how would you know where one file begins and the next ends?
You could try to put in a magic set of bytes between each file byte array, but then what happens when one of those files randomly has that magic set of bytes?
If the files are the same exact type, say images, you could look for the magic bytes certain image file types always start with to separate them out, but again, there's still the random chance you might find those magic bytes in the middle of one of the files.
There are also memory concerns, both when saving and when retrieving, if the combined files are too large.
This idea also violates database design / normalization.
I would do what Jeremy Lakeman recommends: create a child table.
For example:
Files Table Columns:
ParentID (foreign key to parent table)
FileID (Autonumber / primary key)
File (varbinary)
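A minimal in-memory sketch of that child-table shape, assuming hypothetical names (FileRow, FileRows.ForParent) and a counter standing in for the database's autonumber:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory mirror of the Files child table described above.
public sealed record FileRow(int FileId, int ParentId, byte[] File);

public static class FileRows
{
    // One row per file: boundaries are implicit, so no delimiter bytes are needed.
    public static List<FileRow> ForParent(int parentId, IEnumerable<byte[]> files)
    {
        var rows = new List<FileRow>();
        int nextId = 1; // stands in for the database's autonumber
        foreach (byte[] file in files)
        {
            rows.Add(new FileRow(nextId++, parentId, file));
        }
        return rows;
    }
}
```

Because each file lives in its own row, a single file can be read back without loading the others into memory.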

EF Code First - Change column max length without maxing out DB

I am dealing with a problem where most of our columns were created with the default EF behaviour, which maps string to nvarchar(max). However, that doesn't combine well with indexes.
I tried putting the [MaxLength(100)] attribute on the specific column and generating a migration. That generates an ALTER TABLE statement that, when run on a database with a lot of data, spikes the DTU and basically trashes the DB.
I am now looking for a safe way to proceed with this (let's say that the column name is "FileName"):
1. Create a column FileNameV2 with [MaxLength(100)].
2. Copy data from the FileName column to FileNameV2.
3. Delete the FileName column.
4. Rename FileNameV2 to FileName.
Would this approach work or is there any better / easier way (especially one that doesn't upset EF)?
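The four steps above could be written out as T-SQL along the following lines. This is only a sketch: the table name dbo.Files, the batch size, and the idea of copying in batches with UPDATE TOP are assumptions added for illustration, and LEFT(FileName, 100) silently truncates longer values. The statements would have to be run outside EF or through a migration's raw-SQL hook.

```csharp
using System.Collections.Generic;

public static class ColumnShrinkPlan
{
    // Returns the T-SQL for each step; table name and batch size are assumptions.
    public static IReadOnlyList<string> Build(string table = "dbo.Files", int batchSize = 5000) =>
        new[]
        {
            // Step 1: add the narrower column.
            $"ALTER TABLE {table} ADD FileNameV2 nvarchar(100) NULL;",
            // Step 2: copy in batches; rerun this statement until it affects zero rows.
            $"UPDATE TOP ({batchSize}) {table} SET FileNameV2 = LEFT(FileName, 100) " +
            "WHERE FileNameV2 IS NULL AND FileName IS NOT NULL;",
            // Step 3: drop the old column (any index on it must be dropped first).
            $"ALTER TABLE {table} DROP COLUMN FileName;",
            // Step 4: rename the new column into place.
            $"EXEC sp_rename '{table}.FileNameV2', 'FileName', 'COLUMN';",
        };
}
```

Copying in small batches keeps each transaction short, which may help with the DTU spike, though that would need to be measured on the actual database.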
The main issue, which I found out later, was that our SQL Azure database had a max size of 2 GB. The DB was at 1.5 GB when I made the change, so it probably hit that limit during the transition from nvarchar(max) to nvarchar(100). The lesson is to double-check the max size of your DB on Azure so you don't hit this threshold.

SQLXML Bulk Load or manual iteration?

I am looking to insert a 20-25MB xml file into a database on a daily basis. The issue is that each entry needs an extra column added with a calculated value. So what I am wondering is if the most efficient way to do this would be using the SQLXML Bulk Load tools after editing the xml file, running through the xml file and add the new column then loading each item, or using the Bulk Load followed by going through the database adding the new column values.
Comments = answer
There is no need to store this value separately. Since it's a value calculated from the data in each record, you can calculate it on the fly instead of storing it as its own value. A mix of WHERE and/or HAVING clauses will allow filtering (searching) of results based on that calculated value.

How to validate column before importing into database

I am a complete newbie to SSIS.
I have a c#/sql server background.
I would like to know whether it is possible to validate data before it goes into a database. I am grabbing text from a |(pipe) delimited text file.
For example, if a certain datapoint is null, then change it to 0 or if a certain datapoint's length is 0, then change to "nada".
I don't know if this is even possible with SSIS, but it would be most helpful if you can point me into the right direction.
Anything is possible with SSIS!
After your flat file data source, use a Derived Column Transformation to derive a new column with an expression like the following:
ISNULL(ColumnName) ? "nada" : ColumnName
Then use this new column in your data source destination.
Hope it helps.
I don't know if you're dead set on using SSIS, but the basic method I've generally used to import textfile data into a database generally takes two stages:
Use BULK INSERT to load the file into a temporary staging table on the database server; each column in this staging table is a type reasonably tolerant of the data it contains, like varchar(max).
Write up validation routines to update the data in the temporary table and double-check to make sure that it's well-formed according to your needs, then convert the columns into their final formats and push the rows into the destination table.
I like this method mostly because BULK INSERT can be a bit cryptic about the errors it spits out; with a temporary staging table, it's a lot easier to look through your dataset and fix errors on the fly as opposed to rooting through a text file.
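Since the asker has a C# background, the two rules from the question (null becomes 0, zero length becomes "nada") can be sketched in C# as a point of comparison. The names PipeRowCleaner, Normalize, and CleanLine are hypothetical. Note that splitting a text line never produces null, so the null rule only matters for values arriving from elsewhere, such as a staging table.

```csharp
using System;
using System.Linq;

public static class PipeRowCleaner
{
    // Applies the two rules from the question to one field.
    public static string Normalize(string value)
    {
        if (value == null) return "0";          // null -> 0
        if (value.Length == 0) return "nada";   // zero length -> "nada"
        return value;
    }

    // Splits one pipe-delimited line and normalizes each field.
    public static string[] CleanLine(string line) =>
        line.Split('|').Select(Normalize).ToArray();
}
```

For example, CleanLine("a||c") yields the fields a, nada, and c.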

C#: Is it possible to store a Decimal Array in an SQL database?

I'm working on an application for a lab project and I'm making it in C#. It's supposed to import results from a text file that is exported from the application we use to run the tests and so far, I've hit a road block.
I've gotten the program to save around 250 decimal values as a single-dimension array, but now I'm trying to save the array itself in an SQL database so that I can later retrieve it and use the decimal values to construct a plot of the points.
I need the entire array to be imported into the database as one single value though because the lab project has several specimens each with their own set of 250 or so Decimal points (which will be stored as arrays, too)
Thanks for your help.
EDIT: Thanks for the quick replies, guys, but the problem is that it's not just results from a specimen with only 1 test run. Each specimen has the same test performed on it at different decibel levels over 15 times. Each test has its own set of 250 results and we have many specimens.
Also, the specimens already have a unique ID assigned to them and it'd be stored as a String not an Int. What I'm planning on doing is having a separate table in the DB for each specimen and have each row include info on the decibel level of the test and store the array serialized...
I think this would work because we will NOT need to access individual points in the data straight from the database; I'm just using the database to store the data out of memory since there's so much of it. I'm going to query the database for the array and other info and then use zedgraph to plot the points in the array and compare multiple specimens simultaneously.
Short answer is absolutely not. These are two completely different data structures. There are workarounds like putting it in a blob or comma-separating it in a text column. But I really hate those. They don't allow you to do math at the SQL Server level.
IMO, the best option includes having more than one column in your table. Add an identifier so you know which array the data point belongs to.
For example:
AutoId  Specimen  Measurement
1       A         42
2       A         45.001
3       B         47.92
Then, to get your results:
select measurement
from mytable
where specimen = 'A'
order by autoid asc
Edit: You're planning on doing a separate 250 row table for each specimen? That's absolutely overkill. Just use one table, have the specimen identifier as a column (as shown), and index that column. SQL Server can handle millions upon millions of rows markedly well. Databases are really good at that. Why not play to their strengths instead of trying to recreate C# data structures?
"I need the entire array to be imported into the database as one single value though because the lab project has several specimens each with their own set of 250 or so Decimal points (which will be stored as arrays, too)"
So you're trying to pound a nail, should you use an old shoe or a glass bottle?
The answer here isn't "serialize the array into XML and store it in a record". You really want to strive for correct database design, and in your case the simplest design is:
Specimens
---------
specimenID (pk int not null)
SpecimenData
------------
dataID (pk int not null)
specimenID (fk int not null, points to Specimens table)
awesomeValue (decimal not null)
Querying for data is very straightforward:
SELECT * FROM SpecimenData where specimenID = @specimenID
As long as you don't need to access the individual values in your queries, you can serialize the array and store it as a blob in the database.
Presumably you could serialize the decimal array in C# to a byte array, and save that in a binary field on a table. Your table would have two fields: SpecimenID, DecimalArrayBytes
Alternatively, you could have a one-to-many table and not store the array in one piece, with fields SpecimenID and DecimalValue, and use SQL like
SELECT DecimalValue FROM Table WHERE SpecimenID = X
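The byte-array route mentioned above might look like the following sketch. The class name DecimalArrayCodec and the layout (an Int32 count followed by the values) are assumptions chosen for illustration; BinaryWriter writes each decimal as 16 bytes.

```csharp
using System;
using System.IO;

public static class DecimalArrayCodec
{
    // Writes a count followed by each decimal (16 bytes apiece).
    public static byte[] ToBytes(decimal[] values)
    {
        using var ms = new MemoryStream();
        using (var writer = new BinaryWriter(ms))
        {
            writer.Write(values.Length);
            foreach (decimal v in values) writer.Write(v);
        }
        // ToArray still works after the writer has closed the stream.
        return ms.ToArray();
    }

    // Reads the count, then that many decimals, in the original order.
    public static decimal[] FromBytes(byte[] bytes)
    {
        using var reader = new BinaryReader(new MemoryStream(bytes));
        var values = new decimal[reader.ReadInt32()];
        for (int i = 0; i < values.Length; i++) values[i] = reader.ReadDecimal();
        return values;
    }
}
```

The resulting byte[] can be passed as a varbinary parameter. With this layout, 250 values take 4 + 250 * 16 = 4004 bytes.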
You can serialize the array and store it as a single chunk of xml/binary/json. Here is an example of serializing it as xml.
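// Requires: using System.Runtime.Serialization; using System.Text; using System.Xml;
public static string Serialize<T>(T obj)
{
    var sb = new StringBuilder();
    var ser = new DataContractSerializer(typeof(T));
    // Dispose the writer so the buffered XML is flushed into the StringBuilder.
    using (XmlWriter writer = XmlWriter.Create(sb))
    {
        ser.WriteObject(writer, obj);
    }
    return sb.ToString();
}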
You want two tables. One to store an index, the other to store the decimal values. Something like this:
create table arrayKey (
    arrayId int identity(1,1) not null
)

create table arrayValue (
    arrayID int not null,
    sequence int identity(1,1) not null,
    storedDecimal decimal(12,2) not null
)
Insert into arrayKey to get an ID to use. All of the decimal values would get stored into arrayValue using the ID and the decimal value to store. Insert them one at a time.
When you retrieve them, you can group them by arrayID so that they all come out together. If you need to retrieve them in the same order you stored them, sort by sequence.
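On the C# side, rebuilding the arrays from rows of that second table could be sketched as below. The record ArrayValueRow and the method Rebuild are hypothetical names; the rows are assumed to have been read from arrayValue already, in any order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical mirror of one arrayValue row.
public sealed record ArrayValueRow(int ArrayId, int Sequence, decimal StoredDecimal);

public static class ArrayRebuilder
{
    // Groups rows by array id and restores the original order via the sequence column.
    public static Dictionary<int, decimal[]> Rebuild(IEnumerable<ArrayValueRow> rows) =>
        rows.GroupBy(r => r.ArrayId)
            .ToDictionary(
                g => g.Key,
                g => g.OrderBy(r => r.Sequence).Select(r => r.StoredDecimal).ToArray());
}
```

The same grouping and ordering can be pushed into the SQL query instead; doing it in memory just keeps the sketch self-contained.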
Although any given example might be impractical, via programming you can engineer any shape of peg into any shape of hole.
You could serialize your data for storage in a varbinary, XML-ize it for storage into a SQL Server XML type, etc.
Pick a route to analyze and carefully consider. You can create custom CLR libraries for SQL as well so the virtual sky is the limit.
