Inner Join
- SQL is a relational database
- The relation part refers to relationships between tables
- Think of tables as classes where the rows are the instance data, and the rows are instances
- The attributes we have seen so far have been primitive (int, float, str, …)
- Just like it is possible to have instance data that refer to other objects of other classes
- It is possible to have tables that refer to other tables
- To create a query with more than on table, use INNER JOIN - SELECT * FROM Table1 INNER JOIN Table2;
- This creates the Cartesian produce of the two tables
- Every row in Table1 is joined with a all the rows in Table2
- Note, inner join is so common, you can leave the INNER part out - SELECT * FROM Table1 INNER JOIN Table2;
- So what is the relationship?
- It related every row to every other row, which is not very useful
- A more useful relationship relates the values in Table1 to the values in Table2 - SELECT * FROM Table1 INNER JOIN Table2 ON id = id;
- Which produces an error, because roleID is two different tables
- SQL is not sure which roleID it refers to
- To fix this, use the table name in the column reference - SELECT * FROM Table1 INNER JOIN Table2 ON Table1.id = Table2.id;
- Note that the order of the tables does not matter
- It just changes the order of the columns in the retrieved table - SELECT * FROM Table2 INNER JOIN Table1 ON Table1.id = Table2.id;
- This works a lot like the WHERE clause, it filters the rows to only include the ones that meet the condition
- The columns that are used for the join often, do not have another purpose and are left out of the retrieved table - SELECT name, value FROM Table1 INNER JOIN Table2 ON (id);
- Can also use the table names in the column selection to make the query more readable - SELECT Table1.name, Table2.value FROM Table1 INNER JOIN Table2 ON (id);
- Note that naming columns in two different columns the same thing is common
- It makes it clear that the two columns form a relationship
- It is so common that there is a syntactic convenience - SELECT * FROM Table1 INNER JOIN Table2 USING (id);
- Note, that this doesn’t retrieve the joining column twice like ON does
- Two tables are often used to implement the equivalent of an enum
- That is restricting values that a column can have by using a second table that contains the possible values
- This can save memory if the values are text
- It makes changing an enum value easy, because it only exists once
- And it can prevent database errors
- If the enum value only exists in one place then there can’t be two different versions (think typo on entry) 
Keys
- Note that it is possible for multiple rows in Table1 to have the same value for id
- And this makes sense, when thinking about it like an enum
- Multiple objects can have the same value
- It is also possible for there to be multiple rows in the Table2 that have the same value for id
- This does not make sense when thinking about it like an enum
- That would mean it is possible for a value to have two different meanings
- To see what happens, add a row with a duplicate id - INSERT INTO Table2 VALUES(id, 'value'); SELECT * FROM Table1 INNER JOIN Table2 USING (id);
- The row gets duplicated with the two different values
- To remove the row - DELETE FROM Table2 value = 'value';
- To prevent this, columns in a table can be designated as a primary key or a foreign key
- A primary key can not have a duplicate value in a table
- A foreign key refers to a primary key in another table and can be duplicated in the table
- When you describe the table from the textbook, columns labeled as primary key or foreign key
- And if you try to drop a table that has a primary key that is another table’s foreign key, you get an error 
Alias Tables
- Just like columns can be named, table can be named - SELECT * FROM Table1 AS T1 INNER JOIN Table2 AS T2 ON T1.id = T2.id;
- What’s a little weird is that the ALIAS can also be used before the AS - SELECT T1.name, T2.value FROM Table1 AS T1 INNER JOIN Table2 AS T2 ON T1.id = T2.id;
- And that the new name is now mandatory
- This will produce an error - SELECT Table1.name, T2.value FROM Table1 AS T1 INNER JOIN Table2 AS T2 ON T1.id = T2.id;
- But T1 and T2 are actually worse names
- If table is created correctly everything already has good names, so why bother?
- Could be necessary with more complex joins 
Embedding Select
- Can filter with a join as normal - SELECT * FROM Table1 INNER JOIN Table2 ON Table1.id = Table2.id WHERE name = 'something';
- Since what is retrieved from the database is a table, and a JOIN has tables as operands, it is possible to use a select as an operand
- This can be more efficient because it filters before joining, which reduces the number of rows in the Cartesian product - SELECT * FROM (SELECT * FROM Table1 WHERE name = 'something') INNER JOIN Table2 ON Table1.id = Tabl2.id;
- But this produces an error ‘Every derived table must have its own alias’
- So add an alias - SELECT * FROM (SELECT * FROM Table1 WHERE name = 'something') AS P INNER JOIN ON Table1.id = Tabl2.id;
- And this produces another error because it doesn’t know what Table1 is anymore, so use the alias - SELECT * FROM (SELECT * FROM Table1 WHERE name = 'something') AS T1 INNER JOIN Table2 ON T1.id = Tabl2.id;
- Note that the alias is required even if using USING
- This is an error - SELECT * FROM (SELECT * FROM Table1 WHERE name = 'something') INNER JOIN USING id;
- This works even though the alias is not used - SELECT * FROM (SELECT * FROM Table1 WHERE name = 'something') AS T1 INNER JOIN USING id;
- Now that we’ve got queries inside of queries, they’re getting a little too long and difficult to read
- Can use line wrapping and indentation to make it easier - SELECT * FROM (SELECT * FROM Table1 WHERE name = 'something') AS T1 INNER JOIN Table2 ON T1.id = Tabl2.id;
- Each part of the query goes on its own line
- Sub-queries get indented
- Emacs can handle indenting these, so writing these larger queries is easier in a text editor than on the command line 
Joining Joins
- Can even use queries that have joins as subqueries to another join - SELECT * FROM (SELECT * FROM Table1 INNER JOIN Table2 USING id) AS T12 INNER JOIN Table3 USING value;
- This is such a common pattern that there is a convenience syntax - SELECT * FROM Table1 INNER JOIN Table2 USING id INNER JOIN Table3 USING value;
- Added bonus, don’t need to name the subquery