Azure Table Storage: maximum variable size? - C#

I will be using Table Storage to store a lot of blob names in a single string, appended to each other using some special character. This string will skyrocket in size pretty soon. Is there a maximum length for a property of a particular entity (in my case, the string)?

The maximum string size for a single property is 64 KB. If you take the Fat Entity approach as defined by Lokad.Cloud, you can have a 1 MB property instead (leveraging the maximum entity size).

Maximum string size is 64 KB; an individual entity cannot exceed 1 MB.
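If one 64 KB property is not enough, the fat-entity idea can be approximated by splitting the string across several string properties of a single entity, staying under the 1 MB entity cap. A minimal, SDK-neutral sketch in C# (the "Blobs" property-name pattern and the 32,000-character chunk size are illustrative assumptions, not part of any SDK):

using System;
using System.Collections.Generic;
using System.Text;

static class PropertyChunker
{
    // A table-storage string property holds up to 64 KB of UTF-16,
    // i.e. ~32K characters; we stay slightly under that (assumed margin).
    const int MaxCharsPerProperty = 32_000;

    // Split one long string into property name/value pairs: "Blobs0", "Blobs1", ...
    public static Dictionary<string, string> Split(string value)
    {
        var parts = new Dictionary<string, string>();
        for (int i = 0, n = 0; i < value.Length; i += MaxCharsPerProperty, n++)
            parts["Blobs" + n] = value.Substring(i, Math.Min(MaxCharsPerProperty, value.Length - i));
        return parts;
    }

    // Reassemble the original string when reading the entity back.
    public static string Join(IReadOnlyDictionary<string, string> parts)
    {
        var sb = new StringBuilder();
        for (int n = 0; parts.ContainsKey("Blobs" + n); n++)
            sb.Append(parts["Blobs" + n]);
        return sb.ToString();
    }
}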

Related

In SQLite what is the maximum capacity of TEXT?

According to this answer, TEXT has a maximum capacity of 65,535 characters (or 64 KB).
However, I just built a test in which I stored a JSON string taken from a JSON file that is 305 KB into a TEXT column without problems.
I am wondering if there is some property of TEXT that allows this.
By default the limit is 1 billion bytes, not characters, so the encoding has to be considered. However, it can be changed.
The complete section regarding this is:
Maximum length of a string or BLOB
The maximum number of bytes in a string or BLOB in SQLite is defined
by the preprocessor macro SQLITE_MAX_LENGTH. The default value of this
macro is 1 billion (1 thousand million or 1,000,000,000). You can
raise or lower this value at compile-time using a command-line option
like this:
-DSQLITE_MAX_LENGTH=123456789
The current implementation will only support a string or BLOB length up to 2^31-1 (2147483647). And some
built-in functions such as hex() might fail well before that point. In
security-sensitive applications it is best not to try to increase the
maximum string and blob length. In fact, you might do well to lower
the maximum string and blob length to something more in the range of a
few million if that is possible.
During part of SQLite's INSERT and SELECT processing, the complete
content of each row in the database is encoded as a single BLOB. So
the SQLITE_MAX_LENGTH parameter also determines the maximum number of
bytes in a row.
The maximum string or BLOB length can be lowered at run-time using the
sqlite3_limit(db,SQLITE_LIMIT_LENGTH,size) interface.
Limits In SQLite
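The asker's experiment is easy to reproduce from C#. A minimal sketch using the Microsoft.Data.Sqlite package (the table and file names are illustrative):

using System;
using Microsoft.Data.Sqlite;

class LargeTextTest
{
    static void Main()
    {
        using var conn = new SqliteConnection("Data Source=test.db");
        conn.Open();

        var create = conn.CreateCommand();
        create.CommandText = "CREATE TABLE IF NOT EXISTS docs (id INTEGER PRIMARY KEY, body TEXT)";
        create.ExecuteNonQuery();

        // ~305 KB of text: far above 64 KB, far below the default
        // SQLITE_MAX_LENGTH of 1,000,000,000 bytes.
        string payload = new string('x', 305 * 1024);

        var insert = conn.CreateCommand();
        insert.CommandText = "INSERT INTO docs (body) VALUES ($body)";
        insert.Parameters.AddWithValue("$body", payload);
        insert.ExecuteNonQuery();

        var check = conn.CreateCommand();
        check.CommandText = "SELECT length(body) FROM docs";
        Console.WriteLine(check.ExecuteScalar()); // 312320 - stored without problems
    }
}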

Maximum length for string ActorId

What is the maximum length for a string-based ActorId? And if there is a difference between the maximum possible length and the maximum recommended length, what is the difference and why?
ActorId per se does not have a specified limit on the length of a string-based id. However, when choosing a length for a string-based ActorId, you should keep the following points in mind:
1) The ActorStateProvider (implementation of IActorStateProvider) stores named state(s) and reminder(s) for the actor. Depending on the implementation, it may have a specific limit on the length of a string-based ActorId, as internally it uses a combination of the ActorId, actor state-name, and reminder-name (and possibly some internal metadata tags) to uniquely identify the persisted named state(s) and reminder(s) for a given actor.
2) The default ActorStateProvider for actors is KvsActorStateProvider. It is implemented on top of a key-value store and has a key length limit of 872 characters. I would recommend leaving 50 characters for internal metadata tagging; the remaining characters can then be distributed between your string-based ActorId(s) and your actor state-names/reminder-names based on your naming scheme (see the sketch below).
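Putting those numbers together, a minimal C# sketch that budgets a string-based ActorId against the KvsActorStateProvider key limit before constructing it (the 872-character limit and 50-character reserve come from the answer above; the 64-character state-name allowance is an illustrative assumption):

using System;
using Microsoft.ServiceFabric.Actors;

static class ActorIdFactory
{
    const int KvsKeyLimit = 872;      // key length limit of KvsActorStateProvider
    const int MetadataReserve = 50;   // suggested reserve for internal metadata tagging
    const int LongestStateName = 64;  // assumption: longest state/reminder name in use

    // Budget left for the string id itself.
    const int MaxIdLength = KvsKeyLimit - MetadataReserve - LongestStateName; // 758

    public static ActorId Create(string id)
    {
        if (id.Length > MaxIdLength)
            throw new ArgumentException(
                $"String-based ActorId is {id.Length} chars; budget allows {MaxIdLength}.");
        return new ActorId(id);
    }
}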

What is the max number of indexes lucene.net can handle in a document

Lucene does not document the limitations of the storage engine. Does anyone know the maximum number of indexed fields allowed per document?
When referring to term numbers, Lucene's current implementation uses a Java int to hold the term index, which means the maximum number of unique terms in any single index segment is ~2.1 billion times the term index interval (default 128) = ~274 billion. This is technically not a limitation of the index file format, just of Lucene's current implementation.
Similarly, Lucene uses a Java int to refer to document numbers, and the index file format uses an Int32 on-disk to store document numbers. This is a limitation of both the index file format and the current implementation. Eventually these should be replaced with either UInt64 values, or better yet, VInt values which have no limit.
http://lucene.apache.org/core/4_0_0/core/org/apache/lucene/codecs/lucene40/package-summary.html#Limitations
As with all types of indexes (Lucene, RDBMS, or otherwise), indexing the lowest possible number of fields is recommended, because it keeps your index size small and reduces the run-time overhead of reading from the index.
That said, the field count is limited by your system resources. Fields are identified by their name (case-sensitive) rather than by an arbitrary numeric ID, which is what typically becomes the limiting factor in these sorts of systems. Theoretical field count limitations are also hard to predict in a system like Lucene that has no strict maximum field name length.
I've personally used more than 200 analyzed fields across more than 2 billion documents without issue. At the same time, performance for that index was not what I had come to expect from smaller indexes on a medium-sized Azure VM.
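For a sense of scale, a minimal Lucene.NET 4.8 sketch that indexes a single document carrying 200 analyzed fields (the generated field names and index path are illustrative):

using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Store;
using Lucene.Net.Util;

class ManyFieldsDemo
{
    static void Main()
    {
        var analyzer = new StandardAnalyzer(LuceneVersion.LUCENE_48);
        var config = new IndexWriterConfig(LuceneVersion.LUCENE_48, analyzer);
        using var dir = FSDirectory.Open("index");
        using var writer = new IndexWriter(dir, config);

        var doc = new Document();
        // Field names are case-sensitive strings, not numeric IDs.
        for (int i = 0; i < 200; i++)
            doc.Add(new TextField($"field_{i}", $"value {i}", Field.Store.NO));

        writer.AddDocument(doc);
        writer.Commit();
    }
}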

StringBuilder used with PadLeft/Right OutOfMemoryException

All, I have the following Append which I am performing when producing a single line for a fixed-width text file:
formattedLine.Append(this.reversePadding ?
strData.PadLeft(this.maximumLength) :
strData.PadRight(this.maximumLength));
This particular exception happens on the PadLeft() where this.maximumLength = 1,073,741,823 [the field length of an NVARCHAR(MAX) column gathered from SQL Server]. formattedLine = "101102AA-1" at the time of the exception, so why is this happening? Shouldn't I have a maximum allowed length of 2,147,483,647?
I am wondering if https://stackoverflow.com/a/1769472/626442 might be the answer here; however, I am managing memory with the appropriate Dispose() calls on all disposable objects and using blocks where possible.
Note: this fixed-width text export is being done on a background thread.
Thanks for your time.
This particular exception happens on the PadLeft() where this.maximumLength = 1,073,741,823
Right. So you're trying to create a string with over a billion characters in.
That's not going to work, and I very much doubt that it's what you really want to do.
Note that each char in .NET is two bytes, and also strings in .NET are null-terminated... and have some other fields beyond the data (the length, for one). That means you'd need at least 2147483652 bytes + object overhead, which pushes you over the 2GB-per-object limit.
If you're running on a 64-bit version of Windows, in .NET 4.5, there's a special app.config setting of <gcAllowVeryLargeObjects> that allows arrays bigger than 2GB. However, I don't believe that will change your particular use case:
Using this element in your application configuration file enables arrays that are larger than 2 GB in size, but does not change other limits on object size or array size:
The maximum number of elements in an array is UInt32.MaxValue.
The maximum index in any single dimension is 2,147,483,591 (0x7FFFFFC7) for byte arrays and arrays of single-byte structures, and 2,146,435,071 (0X7FEFFFFF) for other types.
The maximum size for strings and other non-array objects is unchanged.
What would you want to do with such a string after creating it, anyway?
In order to allocate memory for this operation, the OS must find contiguous memory that is large enough to perform the operation.
Memory fragmentation can cause that to be impossible, especially when using a 32-bit .NET implementation.
I think there might be a better approach to what you are trying to accomplish. Presumably, this StringBuilder is going to be written to a file (that's what it sounds like from your description), and apparently, you are also potentially dealing with large (huge) database records.
You might consider a streaming approach that won't require allocating such a huge block of memory (a sketch follows below).
To accomplish this you might investigate the following:
The SqlDataReader class exposes a GetChars() method, that allows you to read a chunk of a single large record.
Then, instead of using a StringBuilder, perhaps using a StreamWriter ( or some other TextWriter derived class) to write each chunk to the output.
This will only require having one buffer-full of the record in your application's memory space at one time. Good luck!
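A minimal C# sketch of that streaming approach (the table/column names, connection string, and 8 KB buffer size are illustrative):

using System.Data;
using System.IO;
using Microsoft.Data.SqlClient;

class StreamedExport
{
    static void Main()
    {
        using var conn = new SqlConnection("...connection string...");
        conn.Open();

        using var cmd = new SqlCommand("SELECT Payload FROM Export", conn);
        // SequentialAccess makes GetChars() stream the column
        // instead of buffering the whole value in memory.
        using var reader = cmd.ExecuteReader(CommandBehavior.SequentialAccess);
        using var writer = new StreamWriter("output.txt");

        var buffer = new char[8192];
        while (reader.Read())
        {
            long offset = 0, read;
            // One buffer-full of the huge record in memory at a time.
            while ((read = reader.GetChars(0, offset, buffer, 0, buffer.Length)) > 0)
            {
                writer.Write(buffer, 0, (int)read);
                offset += read;
            }
            writer.WriteLine();
        }
    }
}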

Storing a 16-byte string in 4 bytes of memory (compression) in RFID tags

I hope that this question will not come across as vague. I am working on an RFID project using passive tags. These tags store only 4 bytes of data, i.e. 32 bits. I am trying to store more information, as a string, in the tag's data bank. I searched the internet for string compression algorithms but didn't find any of them suitable. How can I save more data in this 4-byte data bank? Should I use some other storage strategy, and if so, what? I am using C# on a handheld Windows CE device.
I'd appreciate it if someone could help me.
It depends on your tag. For example, the Alien tag (http://www.alientechnology.com/docs/products/Alien-Technology-Higgs-3-ALN-9662-Short.pdf) has EPC memory, and I think you are using that EPC memory, but you can also use the User Memory in your tag. You don't have to compress anything; just use the User Memory. Furthermore, I would rather not save much data on the tag itself: I use my own coding on the 32 bits and map it to the full data in my software, saving that data on my hard disk. It is safer, too.
There is obviously no compression that can reduce arbitrary 16-byte values to 4-byte values. That is mathematically impossible; see the pigeonhole principle for details.
Store the actual data in some kind of database, and have the 4 bytes encode an integer that acts as a key for the row you want to refer to, for example an auto-increment primary key or an index into an array. This works with up to 4 billion rows.
If you have fewer than 2^32 strings, simply enumerate them and then save the string's index (in your "dictionary") inside your 4-byte data bank.
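A minimal C# sketch of that lookup-key approach; the in-memory dictionary stands in for whatever database the reader-side software actually uses:

using System;
using System.Collections.Generic;

class TagKeyStore
{
    // Stand-in for the real database table: key -> full payload string.
    readonly Dictionary<uint, string> rows = new Dictionary<uint, string>();
    uint nextKey; // behaves like an auto-increment primary key

    // Register a payload and get the 4 bytes to write to the tag's data bank.
    public byte[] Store(string payload)
    {
        uint key = nextKey++;
        rows[key] = payload;
        return BitConverter.GetBytes(key); // exactly 4 bytes
    }

    // Read 4 bytes back off the tag and resolve the full payload.
    public string Load(byte[] tagData)
    {
        uint key = BitConverter.ToUInt32(tagData, 0);
        return rows[key];
    }
}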
A compression scheme can't guarantee such high compression ratios.
The only way I can think of with 32 bits is to store an int in them and construct a local/remote URL out of it, which points to the actual data.
You could also make the stored value point to entries in a local look-up table on the device.
Unless you know a lot about the format of your string, it is impossible to do this. This is evident from the pigeonhole principle: you have a theoretical 2^128 different 16-byte strings, but only 2^32 different values to choose from.
In other words, no compression algorithm will guarantee that an arbitrary string in your possible input set will map to a 4-byte value in the output set.
It may be possible to devise an algorithm which will work in your particular case, but unless your data set is sufficiently restricted (at most 1 in 79,228,162,514,264,337,593,543,950,336 possible strings may be valid) and has a meaningful structure, then your only option is to store some mapping externally.
