3.1 Layer 0 - The Core
3.1.1 Fragments
The above example uses 4 different kinds of fragments: where, select, bool, and scalar. Note that where and select are special - they belong in "root position" of the query, and are called "clauses". (All clauses are fragments; not all fragments are clauses.)
(define (Bar x) (scalar x".Bar")) (define (Foo x) (scalar x".Foo")) (define (my= a b) (bool "("a" = "b")")) (from x "X" (where (my= (Bar x) 3)) (select (Foo x)))
The reason for using non-clause fragments like scalar won’t become clear until later sections. For now, just know that fragments exist, and some of them are clauses.
3.1.2 From
(from x "Y")
We can see in the generated SQL that the table name is "Y", with an alias of "x". We can also see that, since there are no clauses, Plisqin generated select x.* to make a valid query.
3.1.2.1 Appending Queries
> (define (query-1) (from x "X" (select x".one") (select x".two")))
> (define (query-2) (from y (query-1) (select y".three") (select y".four"))) > (displayln (to-sql (query-2)))
select
x.one
, x.two
, x.three
, x.four
from X x
In the example above, query-2 appends to query-1. You can see in the generated SQL that all 4 select clauses are present in query-2. You can also see that the original alias from query-1 (x) is used, but this behavior is not guaranteed.
3.1.2.2 Always Use the Alias!
This is because queries in Plisqin are highly composable, so even if a query is unambiguous locally, it might become ambiguous if another query appends to it. If someone appends to the above example and adds a join from Rental to Copy, the naked "CopyId" becomes ambiguous, and the generated SQL is invalid.
3.1.2.3 Query Contents
> (define (my-query) (from x "X" (if #t (select x".truth") (error "true is false?!")) (if #t (list (where "1=1") (where "2=2")) (error "true is false?!")))) > (display (to-sql (my-query)))
select
x.truth
from X x
where 1=1
and 2=2
> (define (my-query) (from x "X" ; define a value: (define rating (scalar x".Rating")) ; define a procedure: (define (count-when predicate) (scalar "sum(case when "predicate" then 1 else 0 end)")) ; the definitions are available until the (from ...) is closed (select (count-when (bool rating" >= 7")) " as NumGoodRatings") (select (count-when (bool rating" >= 9")) " as NumGreatRatings"))) > (display (to-sql (my-query)))
select
sum(case when x.Rating >= 7 then 1 else 0 end) as NumGoodRatings
, sum(case when x.Rating >= 9 then 1 else 0 end) as NumGreatRatings
from X x
3.1.3 Join
(define (Foo-of x) (join f "Foo" (join-on f".FooId = "x".FooId"))) (from b "Bar" (select (Foo-of b)".Baz"))
(define (Foo-of x) (join f "Foo" (join-on f".FooId = "x".FooId"))) (from b "Bar" ; The colon in b:foo means nothing. It could be any identifier. (join b:foo (Foo-of b)) (select b:foo".Baz"))
These 3 variations will all generate the same SQL in this particular example, but there is a subtle difference that the next section will explain.
3.1.3.1 Attached vs Detached Joins
> (define (variation-one) (from x "X" (define y (join y "Y" (join-on y".YID = "x".YID"))) (where (exists (from z "Z" (where z".Foo = "y".Foo"))))))
> (define (variation-two) (from x "X" (join y (join y "Y" (join-on y".YID = "x".YID"))) (where (exists (from z "Z" (where z".Foo = "y".Foo"))))))
> (display (to-sql (variation-one)))
select x.*
from X x
where exists (
select z.*
from Z z
inner join Y y
on y.YID = x.YID
where z.Foo = y.Foo)
> (display (to-sql (variation-two)))
select x.*
from X x
inner join Y y
on y.YID = x.YID
where exists (
select z.*
from Z z
where z.Foo = y.Foo)
In variation-one, y is a "detached" join. This join is located inside the subquery z so that is where it gets rendered in SQL. You might think that y has some special relationship to the main query x because y mentions x in its join-on clause. But that is not the case. Plisqin does not try to guess where a join belongs based on the contents of its join-on clauses; doing so would be even more confusing! (Especially when you remember that the presence or absence of a join can affect the number of rows returned even if it is not used in any clauses.)
(define (make-join a) (join x "X" (join-attach y "Y" (join-on y".XID = "x".XID")) (join-on y".AID = "a".AID")))
(define (make-join a) (join x "X" ; the first word following this comment means join-attach (join y "Y" (join-on y".XID = "x".XID")) (join-on y".AID = "a".AID")))
3.1.3.2 Joined Subqueries
> (define (Foo-of x) (join f "Foo" (select f".*") (where "1=1") (join-on f".FooId = "x".FooId")))
> (define (my-query) (from b "Bar" (select b".*") (select (Foo-of b)".blah"))) > (display (to-sql (my-query)))
select
b.*
, f.blah
from Bar b
inner join (
select
f.*
from Foo f
where 1=1) f
on f.FooId = b.FooId
This makes it more clear that the select and where clauses live inside the subquery, while the join-on clause lives outside of it. There is a clarity/verbosity tradeoff here; you can decide your own preference. But I prefer the first, less verbose version.
3.1.3.3 Join-to-Query Conversion
> (define (Copies-for-Item i) (join c "Copy" (join-on c".ItemId = "i".ItemId")))
> (define (my-query) (from i "Item" (where (exists (from copy (Copies-for-Item i) ; append a clause just for fun: (where "1=1")))))) > (display (to-sql (my-query)))
select i.*
from Item i
where exists (
select c.*
from Copy c
where c.ItemId = i.ItemId
and 1=1)
You can see that c.ItemId = i.ItemId was a join-on clause that got converted to a where clause.
3.1.4 Injections
> (define (Rentals-by-Copy copy) (join r "Rental" (group-by r".CopyId") (join-on r".CopyId = "copy".CopyId")))
> (define (my-query) (from c "Copy" (join r (Rentals-by-Copy c)))) > (display (to-sql (my-query)))
select c.*
from Copy c
inner join (
select r.*
from Rental r
group by r.CopyId) r
on r.CopyId = c.CopyId
> (define (Rentals-by-Copy copy) (join r "Rental" (group-by r".CopyId") ; manually select the CopyId column here: (select r".CopyId") (join-on r".CopyId = "copy".CopyId"))) > (display (to-sql (my-query)))
select c.*
from Copy c
inner join (
select
r.CopyId
from Rental r
group by r.CopyId) r
on r.CopyId = c.CopyId
> (define (Rentals-by-Copy copy) (join r "Rental" (group-by (scalar r".CopyId")) (join-on (scalar r".CopyId")" = "(scalar copy".CopyId")))) > (display (to-sql (my-query)))
select c.*
from Copy c
inner join (
select
r.CopyId as __INJECT1
from Rental r
group by r.CopyId) r
on r.__INJECT1 = c.CopyId
The above example demonstrates the first type of injection, called scalar injection. Plisqin could recognize that the (scalar r ".CopyId") expression which occurs inside the join-on clause must be rewritten. The scalar gets "injected into" the grouped join as a select clause with the synthesized name __INJECT1. Then, outside of the grouped join, it is accessed by that synthesized name.
> (define (my-query) (from c "Copy" (join r (Rentals-by-Copy c)) (select (count r)" as NumRentals") (select (avg r".PricePaid")" as AveragePricePaid") (select c".*"))) > (display (to-sql (my-query)))
select
r.__INJECT2 as NumRentals
, r.__INJECT3 as AveragePricePaid
, c.*
from Copy c
inner join (
select
r.CopyId as __INJECT1
, count(*) as __INJECT2
, avg(r.PricePaid) as __INJECT3
from Rental r
group by r.CopyId) r
on r.__INJECT1 = c.CopyId
In the above example, both (count r) and (avg r ".PricePaid") get injected into the grouped join in a similar way to the scalar injection we already saw. I don’t have to manually manage the select list to include these aggregates. This is very useful: I have the ability to perform aggregate operations on a grouped join from outside that grouped join, without changing its definition, or even parameterizing it!
3.1.4.1 Aggregate Injection in Depth
> (attach-callstacks)
> (define (make-grouped-join x) (join gj "GJ" (group-by (scalar gj".GroupKey")) (join-on (scalar gj".GroupKey")" = "x)))
> (aggregate (make-grouped-join 1) (make-grouped-join 2)) Aggregate expression has 2 targets. At most one target is
allowed.
Target 1 was:
#<syntax:eval:58:0 (make-grouped-join 1)>
Target 2 was:
#<syntax:eval:58:0 (make-grouped-join 2)>
To recap, if an aggregate has no target then no injection occurs. An aggregate is not allowed to have more than 1 target. So the rest of this section is only interested in aggregates that have 1 target.
> (define (my-query) (from x "X" (join j (make-grouped-join (scalar x".Id"))) (select (avg j".foo")" as AverageFoo") (select (max j".bar")" as MaxBar")))
> (display (to-sql (my-query)))
select
gj.__INJECT2 as AverageFoo
, gj.__INJECT3 as MaxBar
from X x
inner join (
select
gj.GroupKey as __INJECT1
, avg(gj.foo) as __INJECT2
, max(gj.bar) as __INJECT3
from GJ gj
group by gj.GroupKey) gj
on gj.__INJECT1 = x.Id
3.1.4.2 Nested Aggregates
The only caveat left to consider is nesting. I would recommend avoiding nesting if possible, because it can be confusing. I’m going to use the Video Rental Example Schema here. An Item has many Copys, and a Copy has many Rentals.
I also need to introduce of/g which is pronounced "of" but carries extra meaning. The /g signifies "grouping", so "the Copies of/g the Item" means "the (group of) Copies of the Item." And per this naming convention, (Copies-of/g item) returns a grouped join.
(count (Rentals-of/g item)) ; or (sum (count (Rentals-of/g (Copies-of/g item))))
The first is a non-nested case that we have already seen: we are just counting a grouped join of Rentals grouped by Item. The second involves nesting, which is what we are interested in. Here is my best attempt at translating the code to its almost-English meaning:
item |
the Item |
(Copies-of/g item) |
the Copies of/g the Item |
(Rentals-of/g (Copies-of/g item)) |
(count (Rentals-of/g (Copies-of/g item))) |
(sum (count (Rentals-of/g (Copies-of/g item)))) |
(the sum of (the count of the Rentals of/g) the Copies of/g) the Item |
I added the parentheses to show how aggregates resolve grouped joins. Specifically, when the count expression appears, it finds that "the Rentals of/g" is unresolved, and resolves it. When the sum expression appears, it finds that "the Copies of/g" is unresolved, and resolves it. Importantly, when we reach the final step, all the grouped joins are resolved. (Visually, every of/g is enclosed in an aggregate.) If the result had had an unresolved grouped join, it would not be a valid scalar. Plisqin does not currently detect this error; it will just generate SQL that will produce an error when it is run.
> (define (my-query) (from item Item (select (sum (count (Rentals-of/g (Copies-of/g item)))) " as NumRentals"))) > (display (to-sql (my-query)))
select
_copy.__INJECT2 as NumRentals
from Item item
inner join (
select
_copy.ItemId as __INJECT1
, sum(_rental.__INJECT2) as __INJECT2
from Copy _copy
inner join (
select
_rental.CopyId as __INJECT1
, count(*) as __INJECT2
from Rental _rental
group by _rental.CopyId) _rental
on _rental.__INJECT1 = _copy.CopyId
group by _copy.ItemId) _copy
on _copy.__INJECT1 = item.ItemId
> (define (my-query) (from item Item (select (count (Rentals-of/g item)) " as NumRentals"))) > (display (to-sql (my-query)))
select
_rental.__INJECT2 as NumRentals
from Item item
inner join (
select
_copy.ItemId as __INJECT1
, count(*) as __INJECT2
from Rental _rental
inner join Copy _copy
on _copy.CopyId = _rental.CopyId
group by _copy.ItemId) _rental
on _rental.__INJECT1 = item.ItemId