I got an error -
Column 'Employee.EmpID' is invalid in the select list because it is
not contained in either an aggregate function or the GROUP BY clause.
select loc.LocationID, emp.EmpID
from Employee as emp full join Location as loc
on emp.LocationID = loc.LocationID
group by loc.LocationID
This situation fits into the answer given by Bill Karwin.
correction for above, fits into answer by ExactaBox -
select loc.LocationID, count(emp.EmpID) -- not count(*), don't want to count nulls
from Employee as emp full join Location as loc
on emp.LocationID = loc.LocationID
group by loc.LocationID
ORIGINAL QUESTION -
For the SQL query -
select *
from Employee as emp full join Location as loc
on emp.LocationID = loc.LocationID
group by (loc.LocationID)
I don't understand why I get this error. All I want to do is join the tables and then group all the employees in a particular location together.
I think I have a partial explanation for my own question. Tell me if it's OK -
To group all employees that work in the same location we have to first mention the LocationID.
Then, we cannot/do not mention each employee ID next to it. Rather, we mention the total number of employees in that location, i.e. we should SUM() the employees working in that location. Why we do it the latter way, I am not sure.
So, this explains the "it is not contained in either an aggregate function" part of the error.
What is the explanation for the GROUP BY clause part of the error?
Suppose I have the following table T:
a b
--------
1 abc
1 def
1 ghi
2 jkl
2 mno
2 pqr
And I do the following query:
SELECT a, b
FROM T
GROUP BY a
The output should have two rows, one row where a=1 and a second row where a=2.
But what should the value of b show on each of these two rows? There are three possibilities in each case, and nothing in the query makes it clear which value to choose for b in each group. It's ambiguous.
This demonstrates the single-value rule, which prohibits the undefined results you get when you run a GROUP BY query, and you include any columns in the select-list that are neither part of the grouping criteria, nor appear in aggregate functions (SUM, MIN, MAX, etc.).
Fixing it might look like this:
SELECT a, MAX(b) AS x
FROM T
GROUP BY a
Now it's clear that you want the following result:
a x
--------
1 ghi
2 pqr
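To try this out, here is a minimal sketch using Python's built-in sqlite3 module with the same table T. One caveat: SQLite is lenient and accepts the ambiguous `SELECT a, b ... GROUP BY a` without raising an error, so this only demonstrates the fixed query, not the error message itself.

```python
import sqlite3

# In-memory database holding the example table T from above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (a INTEGER, b TEXT)")
conn.executemany(
    "INSERT INTO T (a, b) VALUES (?, ?)",
    [(1, "abc"), (1, "def"), (1, "ghi"), (2, "jkl"), (2, "mno"), (2, "pqr")],
)

# MAX(b) reduces each group to exactly one value, so every output row is well defined.
rows = conn.execute(
    "SELECT a, MAX(b) AS x FROM T GROUP BY a ORDER BY a"
).fetchall()
print(rows)
```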
Your query will work in MySQL if the ONLY_FULL_GROUP_BY server mode is disabled (which it is by default in versions before 5.7.5). But in this case you are using a different RDBMS. So to make your query work, add all non-aggregated columns to your GROUP BY clause, e.g.
SELECT col1, col2, SUM(col3) totalSUM
FROM tableName
GROUP BY col1, col2
A non-aggregated column is one that is not passed into an aggregate function such as SUM, MAX, COUNT, etc.
Basically, what this error is saying is that if you are going to use the GROUP BY clause, then your result is going to be a relation/table with one row for each group. So in your SELECT list you can only select the columns that you are grouping by, plus aggregate functions applied to the other columns, because a column that is neither grouped nor aggregated has no single value per row of the resulting table.
"All I want to do is join the tables and then group all the employees
in a particular location together."
It sounds like what you want is for the output of the SQL statement to list every employee in the company, but first all the people in the Anaheim office, then the people in the Buffalo office, then the people in the Cleveland office (A, B, C, get it, obviously I don't know what locations you have).
In that case, lose the GROUP BY clause. All you need is ORDER BY loc.LocationID
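As a rough sketch of the difference between the two readings of the question, the snippet below runs both queries in SQLite via Python's sqlite3 module. The sample rows and the City column are made up for illustration, and it uses a LEFT JOIN from Location rather than the FULL JOIN in the question, because older SQLite versions do not support FULL JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Location (LocationID INTEGER PRIMARY KEY, City TEXT);
    CREATE TABLE Employee (EmpID INTEGER PRIMARY KEY, LocationID INTEGER);
    INSERT INTO Location VALUES (1, 'Anaheim'), (2, 'Buffalo'), (3, 'Cleveland');
    INSERT INTO Employee VALUES (10, 1), (11, 2), (12, 1);
""")

# Reading 1: list every employee, sorted so each location's staff appear together.
listing = conn.execute("""
    SELECT loc.LocationID, emp.EmpID
    FROM Employee AS emp
    JOIN Location AS loc ON emp.LocationID = loc.LocationID
    ORDER BY loc.LocationID, emp.EmpID
""").fetchall()

# Reading 2: one row per location with a head count.
# COUNT(emp.EmpID) skips the NULL produced for a location with no employees,
# while COUNT(*) counts that padded row as 1.
counts = conn.execute("""
    SELECT loc.LocationID, COUNT(emp.EmpID), COUNT(*)
    FROM Location AS loc
    LEFT JOIN Employee AS emp ON emp.LocationID = loc.LocationID
    GROUP BY loc.LocationID
    ORDER BY loc.LocationID
""").fetchall()

print(listing)
print(counts)
```

The empty Cleveland office shows why the question's corrected query uses count(emp.EmpID) rather than count(*).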
When my query is:
select * from table_name where name='jim'
everything is fine.
But when my query is:
select * from table where ='a statement with 2 and more word'
For example this query:
select columns from table where ='jim carrey'
The query just considers 'jim'. In other words, the query just considers the first word and does not consider whatever comes after that.
SQL does not work like that. If you take the following three queries:
select * from users where name = 'Frank Jones'
select * from users where name = 'Frank'
select * from users where name like 'Frank%'
If I run these on my SQL Server database (after changing back to our real data structure), I will get one result for the first query: the person who is actually named 'Frank Jones'. I will not get 'Frank Jones III'.
Since both first and last names are in the name column, if I run the second query I will get no results.
If I run the third query I will get everyone whose first name is Frank, but I will not get 'Jason Franks', because I only have a wildcard at the end of the phrase I am searching for. If I wanted everyone who had any portion of Frank in their name, I would write this query:
select * from users where name like '%Frank%'
These are standard rules on what the various where clauses mean, and they apply to every database I have ever seen (although some might have a different wildcard symbol).
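If you want to check this behaviour yourself, here is a small sketch in SQLite through Python's sqlite3 module. The users table and its four names are invented sample data, and the `names` helper is just a convenience for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany(
    "INSERT INTO users (name) VALUES (?)",
    [("Frank Jones",), ("Frank Jones III",), ("Frank Smith",), ("Jason Franks",)],
)

def names(where_clause, param):
    """Run one WHERE clause against users and return the matching names, sorted."""
    sql = f"SELECT name FROM users WHERE {where_clause} ORDER BY name"
    return [row[0] for row in conn.execute(sql, (param,))]

exact = names("name = ?", "Frank Jones")     # whole string must match
first_only = names("name = ?", "Frank")      # no row is exactly 'Frank'
prefix = names("name LIKE ?", "Frank%")      # starts with 'Frank'
anywhere = names("name LIKE ?", "%Frank%")   # contains 'Frank' anywhere

print(exact, first_only, prefix, anywhere, sep="\n")
```

The equality comparison uses the full multi-word string, so a query that seems to match only the first word is more likely being mangled before it reaches the database.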
You don't say what platform you are using which makes answering your question harder but I will give an answer that will be close.
You need to parse the first word in the string. So
SELECT aColumn
FROM aTable
WHERE name = LEFT('Jim Carrey', CHARINDEX(' ', 'Jim Carrey') - 1)
would be an example in SQL Server (CHARINDEX takes the text to find first, then the string to search).
The name of these functions changes for each platform.
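As an illustration of how the function names differ, here is the same idea sketched for SQLite, which uses instr and substr instead of CHARINDEX and LEFT. The table contents are made up, and input that contains no space is not handled here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE aTable (aColumn TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO aTable VALUES (?, ?)",
    [("actor", "Jim"), ("other", "Carrey")],
)

full_name = "Jim Carrey"

# instr(X, Y) gives the 1-based position of Y inside X; everything before
# that position is the first word.
first_word = conn.execute(
    "SELECT substr(?, 1, instr(?, ' ') - 1)", (full_name, full_name)
).fetchone()[0]

rows = conn.execute(
    "SELECT aColumn FROM aTable WHERE name = substr(?, 1, instr(?, ' ') - 1)",
    (full_name, full_name),
).fetchall()

print(first_word, rows)
```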
First of all, I am sorry if this question is too obvious, since I am quite new to SQL.
So, I have a list of IDs (variable, depending how many products the user chooses). And I want to check if all of them are in a table. If one of them is not, the result of the query should be null. If all of them are there, the result should be all the rows where those IDs are.
How can I do this?
Best regards,
Flavio
Do a LEFT JOIN from the list to the table on the ID field. You'll get a null if there is no record
You can even put a WHERE clause like 'WHERE List.ID IS NULL' to only see those that aren't in the table
Edit: Original Poster did not say they were using C# when I wrote this answer
UNTESTED:
Not sure if this is the most efficient but it seems like it should work.
First it generates a count of the items in the table that match your list. Next it cross joins that single result record to a query over the table, ensuring the count matches the number of items in your provided list and limiting the results to your list.
SELECT *
FROM Table
CROSS JOIN (
SELECT count(*) cnt
FROM table
WHERE ID in (yourlist)) b
WHERE b.cnt = yourCount
and ID IN (YourList)
Running two in statements seems like it would be terribly slow overall; but my first step when writing SQL is usually to get something that works and then seek to improve performance if needed.
Get the list of IDs into a table (you can pass them to a stored procedure as a table-valued parameter), then in the stored procedure write the following,
assuming the list of IDs from C# is in the table variable @idList:
Select * from myTable
Where id in (Select id from @idList)
and not exists
(Select * from @idList
where id Not in
(Select id from myTable))
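For comparison, here is a simplified variation that does the "all or nothing" check in application code rather than in SQL: fetch the matching rows, then return them only if every requested ID was found. It is a sketch in Python with sqlite3, not the stored procedure approach above; the label column and the function name `rows_if_all_present` are hypothetical, and it returns an empty list where the question asks for null.

```python
import sqlite3

def rows_if_all_present(conn, ids):
    """Return the myTable rows for ids, or [] unless every id exists."""
    wanted = sorted(set(ids))
    if not wanted:
        return []
    # One "?" placeholder per id keeps the query parameterised.
    marks = ", ".join("?" for _ in wanted)
    rows = conn.execute(
        f"SELECT id, label FROM myTable WHERE id IN ({marks}) ORDER BY id",
        wanted,
    ).fetchall()
    found = {row[0] for row in rows}
    return rows if found == set(wanted) else []

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (id INTEGER PRIMARY KEY, label TEXT)")
conn.executemany("INSERT INTO myTable VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])

all_there = rows_if_all_present(conn, [1, 3])   # both ids exist
one_missing = rows_if_all_present(conn, [1, 4])  # id 4 does not exist
print(all_there, one_missing)
```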
I have a set of (not very well normalised or relational) tables named
PLAN,
GROUP,
PRODUCT
CLIENT
Most have linkage i.e.
PLAN -> CLIENT on clno
GROUP to PRODUCT on PRODCD
However, the linkage between PLAN and GROUP is tricky. A plan has 2 fields of interest, GRPNO and PRODCD.
What I want to do is if GRPNO != 0 then join GROUP on GRPNO. However if GRPNO = 0 then I want to join GROUP on PRODCD.
The frustrating thing is that the fields I want to return in my queries are the same across the board; I just need to be able to vary the join, or join the same table twice.
The best I can come up with is 2 queries and merge them using datasets, or possibly using a union.
Is there a nifty way to do this in one select?
I should point out I am accessing FoxPro over ODBC to do this.
Thank you!
You can do:
JOIN GROUP AS G ON
(PL.GRPNO = 0 AND G.PRODCD = PL.PRODCD) OR
(PL.GRPNO !=0 AND G.GRPNO = PL.GRPNO)
However it would surprise me if this is faster than using UNION ALL.
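Here is a sketch of both forms side by side, run in SQLite through Python's sqlite3 module. Whether FoxPro over ODBC accepts exactly this syntax is an assumption I have not verified. The tables are renamed plan_tbl and group_tbl to avoid reserved words, and the planno and grpname columns and sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE group_tbl (grpno INTEGER, prodcd TEXT, grpname TEXT);
    CREATE TABLE plan_tbl (planno INTEGER, grpno INTEGER, prodcd TEXT);
    INSERT INTO group_tbl VALUES (5, 'AAA', 'by number'), (9, 'BBB', 'by product');
    INSERT INTO plan_tbl VALUES (1, 5, 'ZZZ'), (2, 0, 'BBB');
""")

# Single select: the join condition switches on whether GRPNO is zero.
or_join = conn.execute("""
    SELECT PL.planno, G.grpname
    FROM plan_tbl AS PL
    JOIN group_tbl AS G ON
        (PL.grpno = 0 AND G.prodcd = PL.prodcd) OR
        (PL.grpno != 0 AND G.grpno = PL.grpno)
    ORDER BY PL.planno
""").fetchall()

# Same result as two simple joins glued together with UNION ALL.
union_all = conn.execute("""
    SELECT PL.planno, G.grpname
    FROM plan_tbl AS PL JOIN group_tbl AS G ON G.grpno = PL.grpno
    WHERE PL.grpno != 0
    UNION ALL
    SELECT PL.planno, G.grpname
    FROM plan_tbl AS PL JOIN group_tbl AS G ON G.prodcd = PL.prodcd
    WHERE PL.grpno = 0
    ORDER BY 1
""").fetchall()

print(or_join)
print(union_all)
```

The two branches of the UNION ALL are mutually exclusive (GRPNO is either zero or not), so no row is returned twice.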
I have a DataSet that contains two tables. One is considered to be nested in the other. All I want is for it to not be nested and for there to be one table. .Merge() and LINQ just aren't doing the trick.
Here is a sample of what the main table would look like
student-id ID
--------------------
123456789 1
654987321 2
But each of these has multiple rows that they correspond to in the next table
ID Col1 Col2 etc.
----------------------
1 fact1 fact2
1 fact3 fact4
2 fact5 fact6
I want to combine them so they would look like this...
student-id Col1 Col2
-------------------------------
123456789 fact1 fact2
123456789 fact3 fact4
654987321 fact5 fact6
Every time I try the merge it doesn't work: I get an error that I can't duplicate the primary key, which is "ID", and since the merge is based on the primary key (I believe) I can't remove it.
I can't use LINQ because I want to make this generic so that the second table could have any number of columns, and I can't get the select to work for that.
UPDATE: MY SOLUTION
I ended up cloning the second table to a new data table, then adding a column called 'student-id' and deleting the ID column. Then I looped through the rows of the main table, found the related rows in the second table, combined all the data in an array, and created a row in the final table.
The LINQ isn't as bad as you suggest. You can just use an anonymous type that holds two DataRows:
var result = from t1 in table1.AsEnumerable()
join t2 in table2.AsEnumerable() on (int)t1["ID"] equals (int)t2["ID"]
select new
{
Student = t1,
Facts = t2
};
foreach(var s in result)
Console.WriteLine("{0} {1} {2}", s.Student["student-id"], s.Facts["Col1"], s.Facts["Col2"]);
That way, you're not including specific columns in your output, so you can access them after the fact.
That being said, the other poster's suggestion of using a pivot table is probably a better direction to go.
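Let's try it in SQL.
Let the 1st table be Table1 and the 2nd table be Table2. The column name student-id contains a hyphen, so it has to be quoted (square brackets in SQL Server; some databases use double quotes instead).
SQL:
Select
X.[student-id], Y.Col1, Y.Col2
From Table1 As X Inner Join Table2 As Y On Y.ID = X.ID
I think if you try it in SQL it's easy to do!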
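Assuming the two tables live in a database rather than only in an in-memory DataSet, the join can be sketched like this in SQLite via Python's sqlite3 module, using the sample rows from the question. SQLite quotes the hyphenated column name with double quotes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 ("student-id" TEXT, ID INTEGER);
    CREATE TABLE Table2 (ID INTEGER, Col1 TEXT, Col2 TEXT);
    INSERT INTO Table1 VALUES ('123456789', 1), ('654987321', 2);
    INSERT INTO Table2 VALUES
        (1, 'fact1', 'fact2'), (1, 'fact3', 'fact4'), (2, 'fact5', 'fact6');
""")

# Each Table2 row picks up the student-id of the Table1 row with the same ID.
rows = conn.execute("""
    SELECT X."student-id", Y.Col1, Y.Col2
    FROM Table1 AS X
    INNER JOIN Table2 AS Y ON Y.ID = X.ID
    ORDER BY X.ID, Y.Col1
""").fetchall()

for row in rows:
    print(row)
```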
Sounds like what you need is a Pivot table.
This will essentially allow you to display the data how you want.
Here are a couple of tutorials/projects
http://www.codeproject.com/Articles/25167/Simple-Advanced-Pivots-with-C-and-ASP-NET
http://www.codeproject.com/Articles/46486/Pivoting-DataTable-Simplified
Update
You may find it better to do the 'pivot' part in MS SQL as a stored procedure and then populate your DataTable with the results of calling that stored procedure. This example is a good starting point:
http://blogs.msdn.com/b/spike/archive/2009/03/03/pivot-tables-in-sql-server-a-simple-sample.aspx
I have a database that contains:
user_id | category_id | liked_id | disliked_id
(thanks to stack overflow users for helping me get my database setup properly in the first place!!)
Last time I used food as an example but this time I'm going to use people.
The user is given 2 images (male vs male or female vs female) and he/she simply chooses which one he/she thinks is more attractive. The user repeats this process as long as he/she wishes. Each selection is entered into the database showing which person they liked and which they disliked (also a button would be available if you think the two are similar).
Now that I have my table full of entries, I'm trying to develop an algorithm that will take all of those "votes" and translate it into a ranked list of who the user finds most attractive (based on hundreds or maybe even thousands of ranking entries).
I've been at the drawing board for hours and can't seem to think of an effective way of doing this.
Any help would be appreciated.
P.S.: The idea is also to have this be a multi-user thing, where other users can see your "like" tables and also have globally averaged tables showing how all users in general rank things.
So you posted your question in the c# group. I want to give you, however, a solution that is implemented in the database, making it more independent of your program.
What you probably want to do first is to get the number of times an image has been liked and disliked. This SQL statement should do that for you (if you are using a database supporting grouping sets it would probably be easier to write):
SELECT t1.liked_id as id, t1.c_liked, t2.c_disliked
FROM
(SELECT liked_id, COUNT(*) as c_liked FROM table GROUP BY liked_id) t1
LEFT JOIN
(SELECT disliked_id, COUNT(*) c_disliked FROM table GROUP BY disliked_id) t2
ON
t1.liked_id = t2.disliked_id
Then it's up to you what you do with the numbers. In the outermost SELECT-statement, you could put a very complicated function, e.g. you could choose to weigh the dislikes less than the likes. To give you an idea of a possible very simple function:
SELECT t1.liked_id as id,
(t1.c_liked/(t1.c_liked + t2.c_disliked) - t2.c_disliked/(t1.c_liked + t2.c_disliked)) as score
This returns values in the range [-1, 1] (which you could normalize to [0, 1] if you like, but don't have to), which you can then sort on, as in this example:
SELECT t1.liked_id as id,
(t1.c_liked/(t1.c_liked + t2.c_disliked) - t2.c_disliked/(t1.c_liked + t2.c_disliked)) as score
FROM
(SELECT liked_id, COUNT(*) as c_liked FROM table GROUP BY liked_id) t1
LEFT JOIN
(SELECT disliked_id, COUNT(*) c_disliked FROM table GROUP BY disliked_id) t2
ON
t1.liked_id = t2.disliked_id
ORDER BY score
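Two practical details are worth watching when running a query like this. In many databases (SQL Server, PostgreSQL and SQLite among them) dividing one integer count by another truncates to an integer. Also, the LEFT JOIN leaves c_disliked as NULL for an image that was never disliked. The sketch below is one possible way to handle both, run in SQLite through Python's sqlite3 module. The table name votes and the sample rows are made up, and the score is written as (likes - dislikes) / (likes + dislikes), which is algebraically the same as the formula above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE votes (user_id INTEGER, category_id INTEGER, "
    "liked_id INTEGER, disliked_id INTEGER)"
)
conn.executemany(
    "INSERT INTO votes VALUES (1, 1, ?, ?)",
    [(1, 2), (1, 3), (1, 2), (2, 3), (4, 1)],  # (liked_id, disliked_id) pairs
)

ranking = conn.execute("""
    SELECT t1.liked_id AS id,
           -- COALESCE turns a missing dislike count into 0;
           -- multiplying by 1.0 forces floating-point division.
           (t1.c_liked - COALESCE(t2.c_disliked, 0)) * 1.0
               / (t1.c_liked + COALESCE(t2.c_disliked, 0)) AS score
    FROM (SELECT liked_id, COUNT(*) AS c_liked
          FROM votes GROUP BY liked_id) t1
    LEFT JOIN (SELECT disliked_id, COUNT(*) AS c_disliked
               FROM votes GROUP BY disliked_id) t2
      ON t1.liked_id = t2.disliked_id
    ORDER BY score DESC, id
""").fetchall()

for image_id, score in ranking:
    print(image_id, round(score, 3))
```

Note that image 3, which was only ever disliked, does not appear at all, because the query starts from the liked side of the join. Covering such images would need a different starting set, for example a list of all image IDs.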