Having read "An Introduction to Database Systems" by C.J. Date (http://www.amazon.com/Introduction-Database-Systems-8th/dp/0...), I have a tremendous appreciation for relational algebra and theory. As I understand it, that is what gives SQL its closed, functional nature: a query over relations returns a relation, and this holds across the various normal forms (1st-5th).
Perhaps due to my own lack of understanding, I always assumed that without an underlying relational model SQL loses its closure. So I always thought the distinction between SQL and NoSQL was whether or not the underlying database model was relational.
Sure, for many of the queries against a non-relational model you could adopt a SQL-like syntax (which many languages seek to do), but it would no longer be operating in the closed relational algebra space.
Could you comment on this and help me understand the finer points/where I went wrong?
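To make concrete what I mean by closure, here is a minimal sketch using Python's built-in sqlite3 module. The `emp` table, its rows, and the `well_paid_depts` helper are made up for illustration. The inner SELECT produces a table-shaped result, and the outer SELECT queries that result exactly as if it were a base table. (Strictly speaking SQL tables are bags rather than sets, hence the DISTINCT.)

```python
import sqlite3


def make_emp_db():
    # Hypothetical sample data, only for illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE emp (name TEXT, dept TEXT, salary INTEGER)")
    conn.executemany(
        "INSERT INTO emp VALUES (?, ?, ?)",
        [("ann", "eng", 120), ("bob", "eng", 90),
         ("cy", "ops", 130), ("di", "hr", 80)],
    )
    conn.commit()
    return conn


def well_paid_depts(conn, threshold):
    # Inner query: restrict and project emp; its result is again a table.
    # Outer query: treats that result like any other table (closure).
    rows = conn.execute(
        "SELECT DISTINCT dept "
        "FROM (SELECT name, dept FROM emp WHERE salary > ?) AS well_paid "
        "ORDER BY dept",
        (threshold,),
    ).fetchall()
    return [dept for (dept,) in rows]


if __name__ == "__main__":
    print(well_paid_depts(make_emp_db(), 100))
```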
Yes, SQL is not the best implementation; theoretically we could try out Tutorial D/D4/whatever, but it's good enough, especially in the Postgres flavour, which is built on decades of academic and commercial research. PostgreSQL is almost entirely declarative (the relational model itself is declarative) - how do you propose to improve on that?
As a quick reminder, these are the things you have to build manually into your application if you're not using a relational DBMS (or check that they are included in your new black box solution for managing data):
- data being shared (concurrent access, writes, etc.);
- avoiding redundancy and inconsistency (which you get for free by centralising the data into a single copy instead of having one copy per thread/program/user);
- transaction (= logical unit of work) atomicity (all or nothing) - say you want to transfer money from A to B, decrease A, then increase B; what if "increase B" fails? A relational database will ensure your transaction does not half go through;
- integrity - impossible things are ruled out by constraints, anything from "an employee logging 400 hours of work this week" to having to parse different date formats (because the date is stored as a string, because it's an Entity Attribute Value antipattern, because "the schema has to be flexible");
- easily enforced security ("nobody but finance accesses payroll tables");
- and obviously data independence, such as freedom from having to specify the physical representation of data and access techniques in your application code. (OK, this is one place where SQL is not perfect; but it's pretty good)
(*thanks to C. J. Date for the list)
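The atomicity and integrity points can be sketched in a few lines with Python's built-in sqlite3 module. This is only an illustration under assumptions: the `account` table, the 0 to 1000 balance limit, and the `transfer` helper are made up, and SQLite stands in for whatever relational DBMS you use. The transfer decreases A and then increases B; when "increase B" violates the constraint, the whole transaction is rolled back and A gets its money back.

```python
import sqlite3


def make_bank_db():
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: the CHECK constraint encodes a made-up business
    # rule (balances must stay between 0 and 1000).
    conn.execute(
        "CREATE TABLE account ("
        " name TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance BETWEEN 0 AND 1000))"
    )
    conn.executemany(
        "INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 950)]
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one logical unit of work."""
    try:
        # 'with conn' commits if the block succeeds and rolls back
        # everything in it if any statement raises.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except sqlite3.IntegrityError:
        # A constraint was violated; neither update is kept.
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))


if __name__ == "__main__":
    db = make_bank_db()
    print(transfer(db, "A", "B", 100), balances(db))
    print(transfer(db, "A", "B", 50), balances(db))
```

The point of the design is that the application code never checks balances itself; the rule lives in the schema, and the database refuses to commit a half-finished transfer.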
The sad thing is that this stuff does not seem to be taught anymore; I get a lot of business from the fact that most frameworks encourage antipatterns by design (Bill Karwin's book on the subject [1] is a great, easy read for those who can't stomach C. J. Date's 1000-page "Introduction" [2]).
[1] http://www.amazon.com/SQL-Antipatterns-Programming-Pragmatic...
[2] http://www.amazon.com/Introduction-Database-Systems-8th/dp/0...
[1] https://www.coursera.org/courses?search=sql
[2] http://www.amazon.com/Introduction-Database-Systems-8th-Edit...
_An Introduction to Database Systems_ (8th Ed.) By C. J. Date http://www.amazon.com/Introduction-Database-Systems-8th/dp/0...