I have a project in ASP.NET MVC 3 and a MySQL database that contains a table of string values for phone numbers (the phone numbers can be stored as 123 456-789, 12345 6789, 123456789, or however the user entered them) and another table with keyword data for those users.
The thing is that I already have a search that looks for users in the keywords table (a fulltext table), but I'm writing a method that searches the phone table instead whenever the search query matches a certain regular expression.
I have two questions:
- What would that regular expression look like on the C# side, so the code can decide which method to execute (SearchByKeyword or SearchByNumber)?
- Using the same regular expression, I think, I must build the MySQL query that searches the phones table... how can I do that?
I hope I have explained this well, and sorry if my English is a little bit bad.
It's best, when you capture the data, to standardise the format it is saved in, i.e.:
111-111-1-111-11
1111-1111-1111
111111111111
1-1-1-1-1-1-1-1-1-1-1-1
All will save as
111111111111
You can then do a simple LIKE query:
SELECT * FROM tblNumbers WHERE number LIKE '%111%'
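A minimal C# sketch of that normalisation step, assuming the raw user input arrives as a string; Regex.Replace simply strips every non-digit before the value is saved (or before it is used to build the LIKE pattern):
using System.Text.RegularExpressions;

// "123 456-789", "12345 6789" and "123456789" all normalise to "123456789".
public static string NormalizePhone(string input)
{
    return Regex.Replace(input ?? string.Empty, @"\D", string.Empty);
}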
I would say that it's a bad idea to feed a phone number from the input directly into the query variable.
A better way would be:
input -> parser, error check procedure -> MySQL query request.
And before adding the phone number to the query, you can remove all extra symbols like (, ), -, spaces, etc.
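To answer the original two questions, here is a hedged sketch of that parser/dispatch step (SearchByKeyword and SearchByNumber stand for your own methods, and the pattern is only an assumption about what counts as "phone-like" input): if the query contains nothing but digits and common separators, strip the separators and search the phones table; otherwise run the fulltext keyword search.
using System.Text.RegularExpressions;

// "Phone-like": at least three characters, all of them digits, spaces, dashes, dots, plus signs or parentheses.
private static readonly Regex PhoneLike = new Regex(@"^[\d\s\-\.\+\(\)]{3,}$");

public SearchResult Search(string query)          // SearchResult is a placeholder type
{
    if (PhoneLike.IsMatch(query))
    {
        // Keep only the digits so "123 456-789" matches a stored "123456789".
        string digits = Regex.Replace(query, @"\D", string.Empty);
        return SearchByNumber(digits);            // e.g. WHERE number LIKE CONCAT('%', @digits, '%') in MySQL
    }
    return SearchByKeyword(query);                // existing fulltext search on the keywords table
}
The LIKE comparison on the MySQL side only works if the stored numbers were normalised to digits, as the answer above suggests; otherwise you would also have to strip the separators inside the query, for example with nested REPLACE calls.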
I have a simple SQLite database where data (names) is added with a C# application. The names usually get copied and pasted from .pdf files. I found out that sometimes copying a name from a .pdf generates some weird symbols. While browsing the data with SQLite DB Browser, I saw that some records in my database have things mingled in between, like 'DC3', 'FS', 'US' and so on.
This messes with the WHERE clause in my queries; for example, the following query would yield 0 results:
SELECT Id FROM tblPerson WHERE Name = 'Alex Denelgo';
Can someone explain what these symbols are, and how I can write a query to find all the "corrupted" name records? I can't go through them one by one manually in the browser, since the data already contains thousands of different names.
It seems these symbols are non-printable ASCII control characters.
The way I found the "corrupted" records is with a regex. If you have the same problem as me, you can use the following query to find these kinds of records. I am selecting all records minus the records that contain only letters from A-Z, spaces, and dots; you can modify the regex for your case, of course:
SELECT Name FROM tblPerson
EXCEPT
SELECT Name FROM tblPerson WHERE Name REGEXP "^[A-Za-z .]+$";
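If you also want to keep new "corrupted" names from getting in, a small C# sketch (an assumption about your insert path, not part of the original question) is to drop control characters before the INSERT:
using System.Linq;

// Removes ASCII control characters (DC3, FS, US, ...) that get picked up when copying from PDFs.
public static string StripControlChars(string name)
{
    return new string(name.Where(c => !char.IsControl(c)).ToArray());
}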
I have an MVC autocomplete that will search for any number of strings entered into a textbox to find an address.
For example, if they enter John Doe New York, my query will do a LIKE on all the columns in the customer table (first, last, address, city, state, zip) to see if it matches the term. Then it will move to the next search word and do the same.
My question is: is it better to hit the SQL Server DB 4 times (in this example), doing LIKEs for each search term against each field, or would it be better to return approx 10,000 rows and search them in memory as a List?
The first would require a lot more DB I/O as it searches the tables, but the 2nd would require a lot more data coming into the app.
None of the data in the Customers table is full text search indexed and, at best, would have a SQL Index on the individual columns.
general part
It is better to let the DB do its job.
If you go with the 4-queries approach, you will have:
time for each query: 6 comparisons per row for 1 word, so 6*4 comparisons across the four queries; call it 24*q1 (q1 = average number of rows)
time to transmit 4 result sets: q2*4 (q2 = average number of filtered rows)
time to merge/filter the results on the client side, which is actually almost the same work as point 1: 24 comparisons per row, i.e. 24*q2
If you go with the fully-DB approach, you will have:
time for the one query: 6 comparisons for each of the 4 words = 24 comparisons per row, i.e. 24*q1
time to transmit one result set: q2_filtered (q2_filtered < q2)
Since 24*q1 + q2*4 + 24*q2 > 24*q1 + q2_filtered, the answer is obvious: the database should filter the records.
If you want to keep the whole customer table in memory, it will of course be faster to perform your own search, which still takes 24*q1; you only get rid of the transmission part, but it consumes the web server's memory and you will have problems keeping memory and the DB in sync.
some details
Depending on how you use LIKE, you can get very different performance: for example, LIKE 'ABC%' can use an index, but LIKE '%ABC%' cannot.
Some tricks are possible here, like this one: concatenate all the columns into one, sort the characters in it and remove duplicates, or store the characters in separate columns, and the same for words; this helps a bit because it can use indexes, but you will get some false-positive matches.
If you really need to fetch data fast, use full-text indexes or other approaches dedicated to this genuinely huge and well-known problem.
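As a rough sketch of the single-query approach (table and column names are taken from the question's description and are assumptions; the prefix-only wildcard keeps indexes usable, per the point above):
using System;
using System.Data.SqlClient;
using System.Text;

// Builds one parameterized command in which every search term must match at least one of the six columns.
public static SqlCommand BuildSearchCommand(string searchText, SqlConnection conn)
{
    string[] terms = searchText.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    var sql = new StringBuilder("SELECT * FROM Customers WHERE 1=1");
    var cmd = new SqlCommand { Connection = conn };
    for (int i = 0; i < terms.Length; i++)
    {
        sql.Append(" AND (FirstName LIKE @t" + i + " OR LastName LIKE @t" + i +
                   " OR Address LIKE @t" + i + " OR City LIKE @t" + i +
                   " OR State LIKE @t" + i + " OR Zip LIKE @t" + i + ")");
        cmd.Parameters.AddWithValue("@t" + i, terms[i] + "%");   // prefix match only, so LIKE can still use an index
    }
    cmd.CommandText = sql.ToString();
    return cmd;
}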
I have a database which has only 2 tables, Object and User.
Obviously the User table has information about all the users, and the Object table has millions of records:
ObjectTable
Id (int)
Text1 (nvarchar(max))
Text2 (nvarchar(max))
I am trying to make a translator. First of all, I put all the data in the database like the following:
1 : Good , Well
2 : Bad , NotWell
3 : Man , Husband Of Women
Suppose I have 2 text boxes on my site and the user enters the following text:
Good Bad Man
Then I split that string on spaces, so I have an array of strings. Now I take the first element of the array and go to the server to find whether there is any match for Good in my database; if I find a match, I replace that value with its Text2 (like we have Well for Good). This takes too much time to translate and sometimes it gives Request Timed Out. So, what is the best way to deal with it?
You do not provide a lot of information to help with your timeout problem.
The only things I can tell you are:
First of all, check that there are indexes on your Text1 and Text2 columns. If not, add them.
Look at your SQL query. It should be SELECT * FROM OBJECT WHERE TEXT1='Good' (maybe add a TOP 100 to avoid returning too many rows with LIKE).
For the index to be used, there should be no function called on column TEXT1 (such as UPPER or TRIM). If you use a LIKE, it should be TEXT1 LIKE 'good%' (% at the end only).
If you use a LIKE, beware of returning your whole table on an empty, % or _ entry (LIKE '%', LIKE '%%', LIKE '_%' can be bad things).
Why use nvarchar(max) if you are just storing words?
Don't forget to sanitize your entries to avoid SQL injection; a parameterized query (sketched below) takes care of that.
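A minimal sketch of that lookup from C#, assuming SQL Server and the ObjectTable layout shown in the question (the parameter covers the injection concern, and TOP 1 keeps the result small):
using System.Data.SqlClient;

// Looks up the translation (Text2) for a single word; returns null if there is no match.
public static string Translate(string word, SqlConnection conn)
{
    using (var cmd = new SqlCommand("SELECT TOP 1 Text2 FROM ObjectTable WHERE Text1 = @word", conn))
    {
        cmd.Parameters.AddWithValue("@word", word);
        return cmd.ExecuteScalar() as string;
    }
}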
I have a database with a lot of words to be used in a tag system. I have created the necessary code for an autocomplete box, but I am not sure of how to fetch the matching entries from the database in the most efficient way.
I know of the LIKE command, but it seems to me that it is more of an EQUAL command. I get only the words that look exactly like the word I enter.
My plan is to read every row, and then use C#'s string.StartsWith() and string.Contains() functions to find words that may fit, but I am thinking that with a large database, it may be inefficient to read every row and then filter them.
Is there a way to read only the rows that start with or contain a given string from SQL Server?
When using LIKE, you provide a % sign as a wildcard. If you want strings that start with Hello, you would use LIKE 'Hello%'. If you want strings with Hello anywhere in the string, you would use LIKE '%Hello%'.
As for efficiency, using LIKE is not optimal. You should look into full-text search.
I know of the LIKE command, but it seems to me that it is more of an EQUAL command. I get only the words that look exactly like the word I enter.
That's because you aren't using wildcards:
WHERE column LIKE 'abc%'
...will return rows where the column value starts with "abc". I'll point out that when using wildcards, this is the only version that can make use of an index on the column.
WHERE column LIKE '%abc%'
...will return rows where the column value contains "abc" anywhere in it. Wildcarding the left side of a LIKE guarantees that an index cannot be used.
SQL Server doesn't natively support regular expressions - you have to use CLR functions to gain access to the functionality. But it performs on par with LIKE.
Full Text Search (FTS) is the best means of searching text.
You can also implement StartsWith functionality using the following statements:
LEFT('String in which you search', X) = 'abc'
CHARINDEX('abc', 'String in which you search') = 1
'String in which you search' LIKE 'abc%'
Use the one which performs best.
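For the autocomplete itself, a hedged C# sketch of the prefix search (the Tags table and Name column are assumptions; the wildcard is appended to the parameter so the comparison stays a "starts with" and an index on the column can still help):
using System.Collections.Generic;
using System.Data.SqlClient;

// Returns up to ten tag names that start with the text typed so far.
public static List<string> SuggestTags(string prefix, SqlConnection conn)
{
    var results = new List<string>();
    using (var cmd = new SqlCommand(
        "SELECT TOP 10 Name FROM Tags WHERE Name LIKE @prefix + '%' ORDER BY Name", conn))
    {
        cmd.Parameters.AddWithValue("@prefix", prefix);
        using (var reader = cmd.ExecuteReader())
        {
            while (reader.Read())
                results.Add(reader.GetString(0));
        }
    }
    return results;
}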
You can use CONTAINS in T-SQL, but I'm pretty sure you have to be using full-text indexing for the table involved in your query.
Contains
Getting started with Full-Text Search
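As a rough illustration (assuming a full-text index has already been created on the column; the table and column names are placeholders), a prefix term with CONTAINS looks like this:
using System.Data.SqlClient;

// Finds rows whose Name contains a word starting with "data"; the double quotes plus * make it a prefix term.
public static SqlCommand BuildContainsQuery(SqlConnection conn)
{
    var cmd = new SqlCommand("SELECT Name FROM Tags WHERE CONTAINS(Name, @term)", conn);
    cmd.Parameters.AddWithValue("@term", "\"data*\"");
    return cmd;
}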
I have a page with 26 sections - one for each letter of the alphabet. I'm retrieving a list of manufacturers from the database and, for each one, creating a link using a different field in the database. So currently, I leave the connection open, then do a new SELECT for each letter, WHERE the Name LIKE that letter. It's very slow, though.
What's a better way to do this?
TIA
Since you are going to fetch them all anyway, you might find it faster to fetch them in one go and split them into letter-groups in the code.
Looking at it from the other end, why do you need to fetch all the lists just to build a set of links? Shouldn't you fetch a single letter when its link is clicked?
It sounds like you are doing up to 26 queries, which will never be fast. Often a single db query can take at least 40 ms, due to network latency, establishing connection, etc. So, doing this 26 times means that it will take around 40 x 26 ms, or more than one second. Of course, it can take much longer depending on your schema, data set, hardware, etc., but this is a rule of thumb that gives you a rough idea of the impact of queries on overall page render time.
One way I deal with this kind of situation is to use a DataTable. Fetch all the records into the DataTable, and then you can iterate through the alphabet, and use the Select method to filter.
DataTable myData = GetMyData();
foreach(string letter in lettersOfTheAlphabet)
{
    // DataTable has no Filter method; Select applies the filter expression and returns the matching rows.
    DataRow[] rowsForLetter = myData.Select(String.Format("Name LIKE '{0}%'", letter));
    //create your links here from rowsForLetter
}
Depending on your model layer you may wish to filter in a different way, but this is the basic idea that should improve the performance a lot.
Assuming you are querying to determine which letters are used, so that you know which links to render, you could actually just query for the letters themselves, like this:
select distinct substring(ManufacturerName, 1, 1) as FirstCharacter
from MyTable
order by 1
Get one result set from one query and split that up. There is quite a lot of overhead in going out to the database 26 times to do basically the same work!
You could probably do it smarter with a stored procedure. Let the SP return all the information you need in one call, and suddenly you only have one database interaction instead of 26...
Bring back all the items in one set (dataset, etc.), either through a stored procedure or a query, including the field left(col1,1), and sort by that field.
select left(col1,1) as LetterGroup, col1, url_column from table1 order by left(col1,1)
Then look through the whole resultset, changing sections when the letter changes.
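A small C# sketch of that loop, assuming the command runs the ordered query above and the columns come back in the order they are selected there:
using System.Data.SqlClient;

// Walks the ordered result set once, starting a new section whenever LetterGroup changes.
public static void RenderSections(SqlCommand cmd)
{
    string currentLetter = null;
    using (var reader = cmd.ExecuteReader())
    {
        while (reader.Read())
        {
            string letter = reader.GetString(0);      // LetterGroup
            if (letter != currentLetter)
            {
                currentLetter = letter;
                // start a new section / emit the letter heading here
            }
            // emit the link for this row here (col1, url_column)
        }
    }
}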
The first letter of the name sucks (sorry) as a discriminator. You do not actually need to split them (you could just ask for WHERE name LIKE 'a%'), but whatever you run for that only gives you on average a 1/26 or so split of the names. Not extremely efficient.
What do you mean by "creating a link - using a different field in the Database"? That sounds like a bad design to me.
There are a couple of ways you can do this. 1) Create a view in your DB that has all the manufacturers and their website links, and then continue to hit the view for each letter. 2) Select all the manufacturers once, store them in a .NET dataset, and then use that dataset to populate your links.
This seems dirty to me, but you could create a first-letter CHAR column and a trigger to populate it. Have the first letter of the manufacturer name stored in that column and index it. Then SELECT * FROM table WHERE FirstLetter = 'A'.
Or create a lookup table with rows A-Z and set up a foreign key in the manufacturer table. Again, you would probably need a trigger to keep this information up to date. Then you could inner join the lookup table to the manufacturer table.
Then instead of putting 26 datasets in the page, have a list of links (A-Z) which select and show each dataset one at a time.
If I read you right, you're making a query for every manufacturer to get the "different field" you need to construct the link. If so, that's your problem, not the 26 alphabetic queries (though those don't help either).
In a case like that, the faster way is this one query:
SELECT manufacturer_name, manufacturer_id, different_field
FROM manufacturers m
INNER JOIN different_field_table d
ON m.manufacturer_id = d.manufacturer_id
ORDER BY manufacturer_name
In your server code, loop through the records as usual. If you want, emit a heading when the first letter of the manufacturer_name changes.
For additional speed:
Put that in a stored procedure.
Index different_field_table on manufacturer_id.