I am building a web site similar to Craigslist. I would like to know how to store the html formatted text (bold / italics / font size etc) in a sql 2008 database?
In other words, the user would enter their text, format it with font size, bold etc. and save the information. What's the most efficient way to store that in a database?
Save it to a nvarchar(max) field. Make sure you use parameterized queries for security. Read http://www.aspnet101.com/2007/03/parameterized-queries-in-asp-net/
Make sure to allow only a limited set of HTML tags, or else you risk a cross-site scripting (XSS) attack.
For example, don't allow your user to input <script> or <style> tags. I suggest you read more about cross-site scripting before you move on! Good luck
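The allow-list idea above can be sketched roughly as follows. This is a minimal illustration in Python using only the standard library, not a hardened sanitizer: the names ALLOWED_TAGS, DROP_WITH_CONTENT and sanitize are made up for this example, all attributes are dropped, and a real site would more likely rely on a vetted sanitizing library.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allow-list: only these tags survive, and without attributes.
ALLOWED_TAGS = {"b", "i", "em", "strong", "u", "p", "br", "ul", "ol", "li"}
# Tags whose content is discarded entirely, not just the tag itself.
DROP_WITH_CONTENT = {"script", "style"}


class AllowListSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
        elif self.skip_depth == 0 and tag in ALLOWED_TAGS:
            self.out.append(f"<{tag}>")  # attributes are intentionally dropped

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0 and tag in ALLOWED_TAGS and tag != "br":
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.out.append(escape(data))  # text is re-escaped on the way out


def sanitize(html_text):
    parser = AllowListSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


cleaned = sanitize('<b onclick="x()">Hi</b><script>alert(1)</script> <i>there</i>')
print(cleaned)
```

Tags that are not on the list (for example <font>) are removed while their text is kept, so the ad stays readable.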
I would probably just store the ad text as a nvarchar(max) datatype
I would simply stuff it, as is, into a NVARCHAR(MAX) field.
Of course, you would use a parameterized query for this.
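As a rough sketch of "one big text column plus a parameterized query", here is a Python example. It uses the standard-library sqlite3 module with an in-memory database purely as a stand-in for SQL Server, and the table name ads and the helpers save_ad and load_ad are hypothetical; with SQL Server the column would be NVARCHAR(MAX) and the placeholder syntax would depend on the driver.

```python
import sqlite3


def save_ad(conn, body_html):
    # The value travels as a parameter, never as part of the SQL string.
    cur = conn.execute("INSERT INTO ads (body) VALUES (?)", (body_html,))
    return cur.lastrowid


def load_ad(conn, ad_id):
    row = conn.execute("SELECT body FROM ads WHERE id = ?", (ad_id,)).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
# TEXT here plays the role of NVARCHAR(MAX) in SQL Server.
conn.execute("CREATE TABLE ads (id INTEGER PRIMARY KEY, body TEXT)")

html_body = "<b>Mountain bike</b> <i>like new</i>"
ad_id = save_ad(conn, html_body)
print(load_ad(conn, ad_id))
```

The markup comes back exactly as it was stored; deciding which tags are allowed is a separate step before saving or before display.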
I would say just use a NVARCHAR(max) or Text data type as opposed to the XML data type.
This will allow easy access to the content, whereas the xml data type would need to be converted somewhere along the line.
I would put it in an nvarchar(MAX) field if you are using SQL Server 2005 or above. If you are using SQL Server 2000 or lower and the number of characters will stay below 2000, you could use an nvarchar(2000) type. If that is too restrictive, use a text type.
How do you view ALL text from an NTEXT or NVARCHAR(max) in SQL Server Management Studio? By default, it only seems to return the first few hundred characters (255?) but sometimes I just want a quick way of viewing the whole field, without having to write a program to do it. Even SSMS 2012 still has this problem :(
I was able to get the full text (99,208 chars) out of a NVARCHAR(MAX) column by selecting (Results To Grid) just that column and then right-clicking on it and then saving the result as a CSV file. To view the result open the CSV file with a text editor (NOT Excel). Funny enough, when I tried to run the same query, but having Results to File enabled, the output was truncated using the Results to Text limit.
The work-around that @MartinSmith described as a comment to the (currently) accepted answer didn't work for me (got an error when trying to view the full XML result complaining about "The '[' character, hexadecimal value 0x5B, cannot be included in a name").
Quick trick-
SELECT CAST('<A><![CDATA[' + CAST(LogInfo as nvarchar(max)) + ']]></A>' AS xml)
FROM Logs
WHERE IDLog = 904862629
In newer versions of SSMS it can be configured in the (Query/Query Options/Results/Grid/Maximum Characters Retrieved) menu:
Old versions of SSMS
Options (Query Results/SQL Server/Results to Grid Page)
To change the options for the current queries, click Query Options on the Query menu, or right-click in the SQL Server Query window and select Query Options.
...
Maximum Characters Retrieved
Enter a number from 1 through 65535 to specify the maximum number of characters that will be displayed in each cell.
Maximum is, as you see, 64k. The default is much smaller.
BTW, Results to Text has an even more drastic limitation:
Maximum number of characters displayed in each column
This value defaults to 256. Increase this value to display larger result sets without truncation. The maximum value is 8,192.
I have written an add-in for SSMS and this problem is fixed there. You can use one of 2 ways:
you can use "Copy current cell 1:1" to copy original cell data to clipboard:
http://www.ssmsboost.com/Features/ssms-add-in-copy-results-grid-cell-contents-line-with-breaks
Or, alternatively, you can open cell contents in external text editor (notepad++ or notepad) using "Cell visualizers" feature: http://www.ssmsboost.com/Features/ssms-add-in-results-grid-visualizers
(The feature lets you open the contents of a field in any external application, so if you know that it is text, you use a text editor to open it. If the contents are binary data with a picture, you select view as picture.)
Return data as XML
SELECT CONVERT(XML, [Data]) AS [Value]
FROM [dbo].[FormData]
WHERE [UID] LIKE '{my-uid}'
Make sure you set a reasonable limit in the SSMS options window, depending on the result you're expecting.
This will work if the text you're returning doesn't contain unencoded characters like & instead of &amp; that will cause the XML conversion to fail.
Returning data using PowerShell
For this you will need the PowerShell SQL Server module installed on the machine on which you'll be running the command.
If you're all set up, configure and run the following script:
Invoke-Sqlcmd -Query "SELECT [Data] FROM [dbo].[FormData] WHERE [UID] LIKE '{my-uid}'" -ServerInstance "database-server-name" -Database "database-name" -Username "user" -Password "password" -MaxCharLength 10000000 | Out-File -filePath "C:\db_data.txt"
Make sure you set the -MaxCharLength parameter to a value that suits your needs.
I was successful with this method today. It's similar to the other answers in that it also converts the contents to XML, just using a different method. As I didn't see FOR XML PATH mentioned amongst the answers, I thought I'd add it for completeness:
SELECT [COL_NVARCHAR_MAX]
FROM [SOME_TABLE]
FOR XML PATH(''), ROOT('ROOT')
This will deliver a valid XML containing the contents of all rows, nested in an outer <ROOT></ROOT> element. The contents of the individual rows will each be contained within an element that, for this example, is called <COL_NVARCHAR_MAX>. The name of that can be changed using an alias via AS.
Special characters like &, < or > or similar will be converted to their respective entities. So you may have to convert &lt;, &gt; and &amp; back to their original character, depending on what you need to do with the result.
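If the XML output is copied out of SSMS and needs its entities turned back into plain characters, that step can be done outside SQL Server. A minimal Python sketch, assuming the text has already been saved or pasted into a string (the sample value is invented):

```python
from html import unescape

# Text as it might look after FOR XML has escaped the special characters.
escaped = "&lt;p&gt;Tom &amp; Jerry&lt;/p&gt;"

# unescape() turns &lt;, &gt; and &amp; back into <, > and &.
restored = unescape(escaped)
print(restored)
```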
EDIT
I just realized that CDATA can be specified using FOR XML too. I find it a bit cumbersome though. This would do it:
SELECT 1 as tag, 0 as parent, [COL_NVARCHAR_MAX] as [COL_NVARCHAR_MAX!1!!CDATA]
FROM [SOME_TABLE]
FOR XML EXPLICIT, ROOT('ROOT')
PowerShell Alternative
This is an old post and I read through the answers. Still, I found it a bit too painful to output multi-line large text fields unaltered from SSMS. I ended up writing a small C# program for my needs, but got to thinking it could probably be done using the command line. Turns out, it is fairly easy to do so with PowerShell.
Start by installing the SqlServer module from an administrative PowerShell.
Install-Module -Name SqlServer
Use Invoke-Sqlcmd to run your query:
$Rows = Invoke-Sqlcmd -Query "select BigColumn from SomeTable where Id = 123" `
-MaxCharLength 2147483647 -ConnectionString $ConnectionString
This will return an array of rows that you can output to the console as follows:
$Rows[0].BigColumn
Or output to a file as follows:
$Rows[0].BigColumn | Out-File -FilePath .\output.txt -Encoding UTF8
The result is a beautiful un-truncated text written to a file for viewing/editing. I am sure there is a similar command to save back the text to SQL Server, although that seems like a different question.
EDIT: It turns out that there was an answer by @dvlsc that described this approach as a secondary solution. I think I missed it in the first place because it was listed as a secondary answer. I am going to leave my answer, which focuses on the PowerShell approach, but wanted to at least give credit where it was due.
If you only have to view it, I've used this:
print cast(dbo.f_functiondeliveringbigformattedtext(seed) as text)
The end result is that I get line feeds and all the content in the messages window of SSMS.
Of course, it only allows for a single cell - if you want to do a single cell from a number of rows, you could do this:
declare @T varchar(max)=''
select @T=@T
+ isnull(dbo.f_functiondeliveringbigformattedtext(x.a),'NOTHINGFOUND!')
+ replicate(char(13),4)
from x -- table containing multiple rows and a value in column a
print @T
I use this to validate JSON strings generated by SQL code. Too hard to read otherwise!
Use Visual Studio Code with the SQL Server plugin. Super useful for JSON.
Alternative 1: Right Click to copy cell and Paste into Text Editor (hopefully with utf-8 support)
Alternative 2: Right click and export to CSV File
Alternative 3: Use SUBSTRING function to visualize parts of the column. Example:
SELECT SUBSTRING(fileXml,2200,200) FROM mytable WHERE id=123456
The easiest way to quickly view large varchar/text column:
declare @t varchar(max)
select @t = long_column from table
print @t
I'm working with C# and MySQL now. I've searched around the internet for days to find out why I can't use the AddWithValue method to add Unicode characters. When I add them manually in MySQL, it works, but from the C# code with the MySQL connector for .NET it doesn't. Everything other than the Unicode characters is fine.
cmd.CommandText = "INSERT INTO tb_osm VALUES (@id, @timestamp, @user)";
cmd.Parameters.AddWithValue("@id", osmobj.ID);
cmd.Parameters.AddWithValue("@timestamp", osmobj.TimeStamp);
cmd.Parameters.AddWithValue("@user", osmobj.User);
cmd.ExecuteNonQuery();
For example: osmobj.User = "ສະບາຍດີ", it will be "???????" in the database.
Please T^T
does this link help you?
read/write unicode data in MySql
Basically it says, you should append your connection string with charset=utf8;
Like so:
id=my_user;password=my_password;database=some_db123;charset=utf8;
You have to be sure that unicode characters are supported at every level of the process, all the way from the input into C# to the column stored in MySql.
The C# level is easy, because strings are already utf-16 by default. As long as you're not using some weird gui toolkit, reading from a bad file or network stream, or running in a weird console app environment with no unicode support, you'll be in good shape.
The next layer is the parameter definition. Here, you're better off avoiding the AddWithValue() method anyway. The link pertains to Sql Server, but the same reasoning applies to MySql, even if MySql is less strict with your data than it should be. You should use an Add() overload that lets you explicitly declare the type of your parameters as NVarChar, instead of making the ADO.Net provider try to guess.
Next up is the connection between your application and the database. Here, you want to make sure to include the charset=utf8 clause (or better) as part of the connection string.
Then we need to think about the collation of the database itself. You have to be sure that an NVarChar column in MySql will be able to support your data. One of the answers from the question at previous link also covers how to handle this.
Finally, make sure the column is defined with the NVarChar type, instead of just VarChar.
Yes, utf8 at all stages -- byte-encoding in client, conversion on the wire (charset=utf8), and on the column. I do not know whether C# converts from utf16 to utf8 before exposing the characters; if it does not, then charset=utf16 (or no setting) may be the correct step.
Because you got multiple ?, the likely cause is trying to transform non-latin1 characters into a CHARACTER SET latin1 column. Since latin1 has no codes for Lao, ? was substituted. Probably you said nothing about the column, but depended on the DEFAULT on the table and/or database, which happened to be latin1.
The ສະບາຍດີ is lost and cannot be recovered from ???????.
Once you have changed things, check that it is stored correctly by doing SELECT col, HEX(col) .... For the string ສະບາຍດີ, you should get hex E0BAAAE0BAB0E0BA9AE0BAB2E0BA8DE0BA94E0BAB5. Notice how that is groups of E0BAxx, which is the range of utf8 values for Lao.
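Both points above (the ? substitution and the expected hex) can be reproduced without a database. The following Python sketch only illustrates the encodings; it does not touch MySQL, and the string is built from code points so the example does not depend on how this page is encoded.

```python
# The Lao greeting from the question, written as explicit code points.
text = "\u0eaa\u0eb0\u0e9a\u0eb2\u0e8d\u0e94\u0eb5"

# latin1 has no codes for Lao, so every character is replaced by "?".
as_latin1 = text.encode("latin-1", errors="replace")

# In utf8 each of these characters takes three bytes of the form E0 BA xx.
as_utf8_hex = text.encode("utf-8").hex().upper()

print(as_latin1)
print(as_utf8_hex)
```

The hex string is what SELECT HEX(col) should show once the connection, table and column all use utf8.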
If you still have troubles, please provide the HEX for further analysis.
I'm helping a friend with a winforms app loaded with crystal reports (two things I generally try to avoid so pardon my ignorance). Anyhow if I have a varchar database field:
2123456789
And want to display it on the Crystal report as:
(212)-345-6789
How would I go about that without changing the stored procs or the database data type (not trying to open that can of worms)? From what I've been able to surmise, if it was a numeric or int field then I would be able to use the Format Object number tab. However, this is not an option due to the data type.
EDIT
My goal is Formatting the data in the crystal report or back end code of the crystal report not the database or t-sql. Thanks
Set the field's display-string formula to:
Picture(CurrentFieldValue,"(XXX) XXX-XXXX")
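To make the mask idea concrete outside Crystal Reports, here is a small Python sketch of what such a picture mask does: each X consumes one character of the value and every other mask character is copied literally. The function apply_picture is hypothetical and only mimics the idea; appending leftover characters is a choice made for this sketch, not a statement about how Crystal's Picture function handles them.

```python
def apply_picture(value, mask):
    """Fill each X in the mask with the next character of value."""
    chars = iter(value)
    out = []
    for m in mask:
        if m in ("X", "x"):
            out.append(next(chars, ""))  # nothing is added once value runs out
        else:
            out.append(m)                # literal mask character
    out.extend(chars)                    # sketch-only choice: keep leftovers
    return "".join(out)


print(apply_picture("2123456789", "(XXX) XXX-XXXX"))
print(apply_picture("2123456789", "(XXX)-XXX-XXXX"))
```

The second mask produces the exact layout asked for in the question, so the same change to the mask string should apply in the display-string formula.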
try this:
"(" + Mid("2123456789",1,3) + ")-" + Mid("2123456789",4,3) + "-" + Mid("2123456789",7,4)
I have some text,
e.g
line one \r\n
line two \r\n
line threee.
In my database, I have a column defined as type text. I use Entity Framework to map that column, and the code generated by Entity Framework uses type string.
I can successfully save that text into the column in the database. However, in Management Studio I can't see the line separators: when I copy the value and paste it into Notepad, it has become one line of text.
Anyone know what the problem is?
Thanks.
The problem is simply that Management Studio doesn't support all characters. It removes the line breaks when you copy the value.
If you change the result from grid to text (Query -> Results To -> Results to text, or ctrl+T), and select the value, you will see that the text comes out as separate lines.
You should stick to NVarchar when it comes to storing text data (one of the reasons is that it supports more characters).
Here's an extract from MSDN:
"ntext, text, and image data types will be removed in a future version of Microsoft SQL Server. Avoid using these data types in new development work, and plan to modify applications that currently use them. Use nvarchar(max), varchar(max), and varbinary(max) instead".
More info:
ntext, text, and image deprecated :http://msdn.microsoft.com/en-us/library/ms187993.aspx
replace text: http://blog.sqlauthority.com/2007/05/26/sql-server-2005-replace-text-with-varcharmax-stop-using-text-ntext-image-data-types/
This could quite simply be a problem with the management studio. Try querying the database from C# and printing the field's value. You will see immediately if the CRLFs are there.
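The same check can be done from any client language. The answer suggests C#; the sketch below uses Python with the standard-library sqlite3 module as a stand-in database, and the table name notes is invented. The point is only that printing the value with its escape sequences visible shows whether the CRLFs were stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")

original = "line one\r\nline two\r\nline three"
conn.execute("INSERT INTO notes (body) VALUES (?)", (original,))

stored = conn.execute("SELECT body FROM notes").fetchone()[0]

# repr() makes \r\n visible instead of rendering it as a line break.
print(repr(stored))
crlf_count = stored.count("\r\n")
print(crlf_count)
```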
Sometimes I need to format specific data, or part of it, that comes from the database.
For example:
If I have a description (stored in the DB) like this:
HTML 4 has been tweaked, stretched and augmented beyond its initial scope to bring high levels of interactivity and multimedia to Web sites. Plugins like Flash, Silverlight and Java have added media integration to the Web, but not without some cost.
and I want to format the last line, for example change the font and color.
What's the best practice to do this?
Embedding HTML tags in my database? Is this safe and the best practice, or is there some way to separate the structure layer from the presentation layer and from the behavior layer?
If you plan to manipulate or search upon the stored data then do not store HTML markup in your database. Imagine that at some point you're asked to change the fonts from Tahoma to Georgia, change <b> tags to <strong> or allow the users to search on the HTML column; and searches for strong end up returning irrelevant information because strong is also a frequently used HTML tag.
Storing HTML markup in your database is also a bad idea if you do not check what is being stored. A malicious script tag such as <script>location = 'http://otherwebsite'</script> is just one naive example.
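One simple guard, if the stored text is not supposed to contain markup at all, is to escape it when it is written into the page. A minimal Python sketch using the standard library (the variable names are illustrative):

```python
from html import escape

# The naive malicious input from the example above.
user_input = "<script>location = 'http://otherwebsite'</script>"

# escape() replaces <, >, & and quotes with entities, so the browser
# shows the text instead of running it.
safe = escape(user_input)
print(safe)
```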
Ideally you should store the data as-is or use some kind of markup such as (wiki or markdown) to store basic formatting information.
There IS some way to separate the data from the presentation. You keep them separate! If you want to do some formatting on that text that you pulled from the database, go ahead and do that in your application code. Note that structural markup is an entirely different topic from presentation markup (font, color, layout, etc)
http://en.wikipedia.org/wiki/Separation_of_presentation_and_content talks about this very point and makes a clear separation between presentation markup and structural markup in the paragraph under Intended Meaning.
Storing formatting tags in your data generally points to poor separation between the two layers or a data model that isn't sufficient to represent your data properly. As the author is storing data in a database, that might indicate that he has just a single field for holding the "content block" of an article rather than multiple fields for the author, title, body, references, etc. For user input data, we often fall back to a markup inside the user content for designating structure. That happens through "fake" html tags or even real html/xml tags like <h1>, <em>, <a>, etc.
Note that I'm not objecting to structural markup on principle but I would look carefully at why it's required if you're storing it in a database. I am objecting to presentation markup on principle.
It depends on where the data comes in to DB.
If you're the only one who changes the DB content, then it is perfectly normal to store HTML tags in it.
Otherwise, if you store your users input in DB, there are two approaches:
1) To sanitize the input supplied by your users (either on store or on display) to make sure no malicious data will be displayed.
2) To use some intermediate markup language with the limited possibilities (such as BBCode), and to compile it to HTML (again, either on store or on display).
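The second approach can be sketched as follows: escape everything the user typed first, then turn a small, fixed set of BBCode-style tags into HTML. This Python example is only an illustration; the rule list BBCODE_RULES and the function bbcode_to_html are made up here, and it supports just [b] and [i].

```python
import re
from html import escape

# Hypothetical rule set: each BBCode pair maps to one HTML element.
BBCODE_RULES = [
    (re.compile(r"\[b\](.*?)\[/b\]", re.S | re.I), r"<strong>\1</strong>"),
    (re.compile(r"\[i\](.*?)\[/i\]", re.S | re.I), r"<em>\1</em>"),
]


def bbcode_to_html(text):
    # Escape first, so any raw HTML typed by the user becomes harmless text.
    html_text = escape(text)
    for pattern, replacement in BBCODE_RULES:
        html_text = pattern.sub(replacement, html_text)
    return html_text


print(bbcode_to_html("[b]Hi[/b] <script>"))
```

Because only the tags in the rule list are ever emitted, the set of HTML that can reach the page is limited by construction.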
I cannot recommend storing any HTML tags inside your database. In the end you will find yourself lost as your codebase gets bigger, and also when you want to change your HTML, for example to add attributes such as classes to your tags. You would need to "fix" all the HTML tags with SQL statements. The same applies when you want to do something else with your data, for example create RSS feeds or export it to another format such as an Excel sheet.
Why do you want to do it anyway? I am sure there is a better solution to your problem.
Try to separate the content from the application layer. Normalize your data and put paragraphs, for example, in a new dataset. If you really need to, for example, color one word, I would follow the suggestion that has already been posted: use some syntax of your own like [color-a][/color-a]. The export problem could, however, be solved by striptags().
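For the export case, stripping tags can be sketched like this in Python with the standard library. The helper strip_tags below is written for this example and is not the striptags() function the answer refers to; it keeps all text nodes, so it is only suitable for content without <script> or <style> blocks.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes and ignores every tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(html_text):
    parser = TextExtractor()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.parts)


plain = strip_tags("<p>HTML 4 has been <b>tweaked</b> &amp; stretched</p>")
print(plain)
```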
You can use blob field, however you won't be able to do full searching on it iirc. If you have a column with the template name as a value and a blob with the html template value then this will work out just fine.
IMO it's perfectly fine to store HTML in your database. You sound smart enough to not allow just anything to go into the DB without validation.
You just need to be careful about how it's updated. If you are inserting to the database via code:
INSERT INTO myTable Values(x + y + z)
if the variable x has some HTML in it with single quotes for example, no bueno.
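The failure described above is easy to reproduce. The Python sketch below uses the standard-library sqlite3 module as a stand-in for the real database, with an invented table my_table: the concatenated statement breaks on the single quotes, while the parameterized one stores the same value unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (body TEXT)")

x = "<a href='/ads'>Bob's ad</a>"  # HTML containing single quotes

# Concatenation: the quotes in x end the SQL string literal early.
try:
    conn.execute("INSERT INTO my_table VALUES ('" + x + "')")
    concat_ok = True
except sqlite3.Error:
    concat_ok = False

# Parameterized: the value is passed separately from the SQL text.
conn.execute("INSERT INTO my_table VALUES (?)", (x,))
stored = conn.execute("SELECT body FROM my_table").fetchone()[0]

print(concat_ok)
print(stored)
```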
I think the content of the string you store in the database has nothing to do with the presentation layer. It only affects how your business layer provides the HTML string to the presentation layer (read directly from the database, or decorated later).