I am not entirely sure how GridFS works in MongoDB. All the examples I have seen so far just involve grabbing a file and uploading it to a db through the API, but I want to know
a) can you have large files embedded in your typical JSON style documents or do they have to be stored in their own special GridFS collection or db?
b) how can I handle this type of situation where I have an object which has some typical fields in it, strings ints etc but also has a collection of attachment files which could be anything from small txt files to fairly large video files?
for example
class bug
{
    public int Id { get; protected set; }
    public string Name { get; protected set; }
    public string Description { get; protected set; }
    public string StackTrace { get; protected set; }
    public List<File> Attachments { get; protected set; } // pictures/videos of the bug in action or text files with user config data in it, etc.
}
a) can you have large files embedded in your typical JSON style documents or do they have to be stored in their own special GridFS collection or db?
You can, as long as the file size doesn't exceed MongoDB's 16 MB document size limit. But you will need to serialize/deserialize the data and do some extra work yourself.
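As a hedged illustration of that embedded approach (the class names below are mine, not from the question): small attachments can live directly on the document as byte arrays, and the 16 MB limit then applies to the whole document including all attachments.

using System.Collections.Generic;

// Sketch only: embedding small attachments directly in the bug document.
// byte[] maps to BSON binary with the MongoDB .NET driver; the entire
// document (bug fields plus all attachments) must stay under 16 MB.
public class EmbeddedAttachment
{
    public string FileName { get; set; }
    public byte[] Content { get; set; }
}

public class BugWithEmbeddedFiles
{
    public int Id { get; set; }
    public string Name { get; set; }
    public List<EmbeddedAttachment> Attachments { get; set; }
}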
b) how can I handle this type of situation where I have an object which has some typical fields in it, strings ints etc but also has a collection of attachment files which could be anything from small txt files to fairly large video files?
If you do decide to store your attachments in MongoDB, the better way is to go with GridFS. You simply store the file in GridFS, and in the Attachments collection you store the id of that file plus any metadata (file name, size, etc.). Then you can easily get the file content by id from the embedded Attachments collection.
MongoDB GridFS is a simple layer on top of MongoDB that splits big files into chunks, stores the chunks in MongoDB, and reads the files back from those chunks. To get started with C# and GridFS, read this answer.
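For illustration, here is a minimal sketch of that pattern using the GridFSBucket API from the current MongoDB .NET driver (the linked answer may use the older MongoGridFS class instead); the class and database names are my own placeholders, not part of the original question.

using System.Collections.Generic;
using System.IO;
using MongoDB.Bson;
using MongoDB.Driver;
using MongoDB.Driver.GridFS;

// The bug document keeps only a reference (the GridFS id) plus metadata per attachment.
public class AttachmentRef
{
    public ObjectId FileId { get; set; }
    public string FileName { get; set; }
    public long Size { get; set; }
}

public class BugDocument
{
    public int Id { get; set; }
    public string Name { get; set; }
    public List<AttachmentRef> Attachments { get; set; }
}

public static class AttachmentStore
{
    public static AttachmentRef Upload(IMongoDatabase db, string path)
    {
        var bucket = new GridFSBucket(db);
        using (var stream = File.OpenRead(path))
        {
            // GridFS splits the stream into chunks behind the scenes.
            ObjectId fileId = bucket.UploadFromStream(Path.GetFileName(path), stream);
            return new AttachmentRef
            {
                FileId = fileId,
                FileName = Path.GetFileName(path),
                Size = new FileInfo(path).Length
            };
        }
    }

    public static void Download(IMongoDatabase db, AttachmentRef attachment, Stream destination)
    {
        var bucket = new GridFSBucket(db);
        // Reassembles the chunks and writes the original file content to destination.
        bucket.DownloadToStream(attachment.FileId, destination);
    }
}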
I have the property shown below, for which I need to ingest data from an Excel report with a column named ENTRY NUMBER. I'm looking for a custom method that can take in a string so I can map the spreadsheet column name to the property below.
public string EntryNumber { get; set; }
I tried looking into libraries similar to Xml.Serialization, which uses XmlAttribute(nameof()) to map a property name in the class to what is in the actual XML file, and then applying that idea to an xlsx context.
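There is no built-in xlsx counterpart to XmlAttribute, but a minimal sketch of rolling your own attribute-based mapping could look like the following. The ExcelColumn attribute and the header-to-value dictionary are hypothetical; the dictionary would come from whatever library you use to read the sheet (EPPlus, ClosedXML, etc.).

using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical attribute mapping a property to a spreadsheet column header.
[AttributeUsage(AttributeTargets.Property)]
public class ExcelColumnAttribute : Attribute
{
    public string Header { get; private set; }
    public ExcelColumnAttribute(string header) { Header = header; }
}

public class ReportRow
{
    [ExcelColumn("ENTRY NUMBER")]
    public string EntryNumber { get; set; }
}

public static class ExcelMapper
{
    // row: header text -> cell text for one spreadsheet row, read however you like.
    // Only string properties are handled, to keep the sketch short.
    public static T Map<T>(IDictionary<string, string> row) where T : new()
    {
        var item = new T();
        foreach (var prop in typeof(T).GetProperties())
        {
            var attr = prop.GetCustomAttribute<ExcelColumnAttribute>();
            var header = attr != null ? attr.Header : prop.Name;
            string value;
            if (row.TryGetValue(header, out value) && prop.PropertyType == typeof(string))
                prop.SetValue(item, value);
        }
        return item;
    }
}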
This is my first time using the CsvReader - note that it requires a custom class that defines the headers found in the CSV file.
class DataRecord
{
    // Should have properties which correspond to the Column Names in the file
    public String Amount { get; set; }
    public String InvoiceDate { get; set; }
    // ......
}
The example given then uses the class as follows:
using (var sr = new StreamReader(@"C:\Data\Invoices.csv"))
{
    var reader = new CsvReader(sr);
    // CsvReader will now read the whole file into an enumerable
    IEnumerable<DataRecord> records = reader.GetRecords<DataRecord>();
    // First 5 records in CSV file will be printed to the Output Window
    foreach (DataRecord record in records.Take(5))
    {
        Debug.Print("{0} {1}, {2}", record.Amount, record.InvoiceDate, ....);
    }
}
Two questions:
1. The app will be loading in files with differing headers, so I need to be able to update this class on the fly - is this possible, and how? (I am able to extract the headers from the CSV file.)
2. The CSV file is potentially multi-millions of rows (GB size), so is this the best / most efficient way of importing the file?
The destination is a SQLite DB - the Debug line above is just used as an example.
Thanks
The app will be loading in files with differing headers so I need to be able to update this class on the fly - is this possible & how?
Although it is definitely possible with reflection or third-party libraries, creating an object per row will be inefficient for such big files. Moreover, using C# for such a scenario is a bad idea (unless you have some business data transformation). I would consider something like this, or perhaps an SSIS package.
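That said, if part of the work does stay in C# (for example a transformation step), a hedged sketch of handling unknown headers without rebuilding the class is CsvHelper's dynamic support, which streams rows lazily instead of materializing the whole file; the constructor signature below assumes a recent CsvHelper version.

using System.Collections.Generic;
using System.Globalization;
using System.IO;
using CsvHelper;

class DynamicCsvImport
{
    static void Import(string path)
    {
        using (var sr = new StreamReader(path))
        using (var csv = new CsvReader(sr, CultureInfo.InvariantCulture))
        {
            // GetRecords<dynamic>() yields one row at a time, so multi-GB files
            // are never loaded into memory in full.
            foreach (dynamic row in csv.GetRecords<dynamic>())
            {
                var fields = (IDictionary<string, object>)row; // header name -> value
                // ... write the row to SQLite here, ideally in batched transactions
            }
        }
    }
}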
I am writing a Windows Phone messenger application and I have a dilemma about how to store my messages. At the moment my message class looks like:
public class MessageModel
{
    public string Side { get; set; }
    public string Message { get; set; }
    public DateTime TimeStamp { get; set; }
}
I don't know whether it is a good idea to have a class like the one above and store instances of it in IsolatedStorage. Is it a better solution to have a file and save the messages in XML or JSON format? Or maybe some database? On the other hand, having a MessageModel class makes binding much easier. I would like to keep my messages in a dictionary: Dictionary<username, ObservableCollection<MessageModel>>, where the username key is a string. Any advice about this would be really appreciated.
This is an interesting question. I did some tests. The test algorithm for one iteration:
1. Open the data source (a DB connection for LINQ to SQL; prepare the isolated store and streams for reading/writing XML and JSON).
2. Prepare a new MessageModel. Every message text has 150 characters.
3. Append the new message and save it.
4. Release the data sources.
The table below shows the results for 1,000 and 10,000 iterations, tested on the emulator.
In your scenario I don't think you'll have many records. If you don't need any complex queries, updates, etc., XML is a good choice. It is easy to use, the resulting file is readable, and it doesn't need any third-party library.
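A minimal sketch of that XML route, assuming one file per username in isolated storage and the MessageModel class from the question (XmlSerializer cannot serialize the dictionary directly, so each user's collection is stored separately):

using System.Collections.ObjectModel;
using System.IO;
using System.IO.IsolatedStorage;
using System.Xml.Serialization;

public static class MessageStore
{
    public static void Save(string username, ObservableCollection<MessageModel> messages)
    {
        var serializer = new XmlSerializer(typeof(ObservableCollection<MessageModel>));
        using (var store = IsolatedStorageFile.GetUserStoreForApplication())
        using (var stream = store.OpenFile(username + ".xml", FileMode.Create))
        {
            serializer.Serialize(stream, messages);
        }
    }

    public static ObservableCollection<MessageModel> Load(string username)
    {
        using (var store = IsolatedStorageFile.GetUserStoreForApplication())
        {
            if (!store.FileExists(username + ".xml"))
                return new ObservableCollection<MessageModel>();

            var serializer = new XmlSerializer(typeof(ObservableCollection<MessageModel>));
            using (var stream = store.OpenFile(username + ".xml", FileMode.Open))
            {
                return (ObservableCollection<MessageModel>)serializer.Deserialize(stream);
            }
        }
    }
}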
In my project I have a fairly big database with about 60 tables.
I need to save and collect many image files (about 5,000), averaging about 2 MB each.
The estimated size of my database would therefore be 10 GB or even higher!
Consider these models in code first:
class Document
{
    [Key]
    public int Id { get; set; }
    // ...
    public virtual ICollection<ImageDocument> Images { get; set; }
}
and
class ImageDocument
{
    [Key]
    public int Id { get; set; }
    // ...
    public Document Document { get; set; }
}
As you can see, every Document has some ImageDocuments.
My solution:
Consider the following two steps:
1. Add the ImageDocuments to the related Document, then add the resulting Document by calling the Add and SaveChanges methods on the Entity Framework DbContext.
2. Call a stored procedure for every ImageDocument of the related Document. The stored procedure uses the bcp command to extract the image file from the database and save it to a specific path on the server, then removes the ImageDocument's data from the database.
It works, but I have some problems with this approach:
I cannot create an integrated backup file.
Atomicity violation, because my save transaction gets broken into several small transactions.
Consistency violation: the system may crash while the stored procedure is being called.
Durability violation, because the ImageDocument record is deleted to release database space.
Now my question is: is there a better solution that avoids these problems?
It would be great if we could have a file field in SQL Server that keeps the content in files separate from the database file.
If you are using SQL Server, you should be using FILESTREAM. Straight to disk, via a SQL proxy.
http://msdn.microsoft.com/en-us/library/gg471497.aspx
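For illustration, here is a hedged sketch of reading a FILESTREAM column from C# with SqlFileStream. The ImageDocuments table and its Content column are placeholders for whatever FILESTREAM-enabled schema you create; they are not generated by the code-first models above.

using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using System.IO;

public static class ImageFileStreamReader
{
    public static byte[] Read(string connectionString, int imageId)
    {
        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();
            using (var tx = conn.BeginTransaction())
            using (var cmd = new SqlCommand(
                "SELECT Content.PathName(), GET_FILESTREAM_TRANSACTION_CONTEXT() " +
                "FROM ImageDocuments WHERE Id = @id", conn, tx))
            {
                cmd.Parameters.AddWithValue("@id", imageId);

                string path;
                byte[] txContext;
                using (var reader = cmd.ExecuteReader())
                {
                    reader.Read();
                    path = reader.GetString(0);
                    txContext = (byte[])reader[1];
                }

                byte[] data;
                // SqlFileStream reads the blob straight from the NTFS file store,
                // so large images never travel through the normal row data path.
                using (var fs = new SqlFileStream(path, txContext, FileAccess.Read))
                using (var ms = new MemoryStream())
                {
                    fs.CopyTo(ms);
                    data = ms.ToArray();
                }

                tx.Commit();
                return data;
            }
        }
    }
}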
If I want to save binary data to an Azure Table, when I create a class inheriting from TableServiceEntity, what data type should I use? And how do I check the length of that data to ensure it is not over 64 KB?
public class SomeEntity : TableServiceEntity
{
    public whattype BinaryData { get; set; }
}
For binary data, a byte[] of length <= 64 KB is all that is necessary. The table storage client will convert it into Base64 for transport purposes only; storage will be in binary.
If you want to store more than 64K you can split it across multiple columns.
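A small sketch of that byte[] approach, with a length guard that I added myself (it is not part of the storage client), assuming the older Microsoft.WindowsAzure.StorageClient SDK that TableServiceEntity comes from:

using System;
using Microsoft.WindowsAzure.StorageClient;

public class SomeEntity : TableServiceEntity
{
    private byte[] _binaryData;

    public byte[] BinaryData
    {
        get { return _binaryData; }
        set
        {
            // 64 KB is the per-property limit for a binary column in table storage.
            if (value != null && value.Length > 64 * 1024)
                throw new ArgumentException("BinaryData exceeds the 64 KB property limit.");
            _binaryData = value;
        }
    }
}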
I have written an alternative Azure table storage client, Lucifure Stash, which supports large data columns (> 64 KB), arrays, enums, serialization, public and private properties and fields, and more. It is open source and available at http://lucifurestash.codeplex.com and via NuGet.