On this page:
6.1 What Next?
6.2 Design Notes
6.2.1 Why do sources need UIDs?
6.2.2 On Parameters
6.3 Tests
7.7

6 For Developers

This documentation is not written for a user of Plisqin, but you might find something useful anyway.

6.1 What Next?

There are a lot of things I could try to do next.
  • Work on parameters (see notes below).

  • Can Plisqin be designed in such a way that it can generate SQL or directly integrate with Postgres?

  • Is the previous question important enough to care about right now?

  • Should I try direct integration with SQLite before trying Postgres?

  • What type system should Plisqin have? Dynamic, sure, but what about...

  • Can (scalar x ".Foo") be used in addition? (A: Maybe?)

  • Can (scalar x ".Foo" #:typed-as 'decimal) be used in addition? (A: Always.)

  • Can (scalar x ".Foo" #:typed-as 'varchar) be used in addition? (A: Never.)

  • If I integrate with Postgres/SQLite, could it detect the type errors after I translate the query to its native format?

  • Would I want it to? Probably not... I can’t imagine the semantics being similar enough.

  • How should Plisqin handle NULL?

  • Some aggregate operations have reasonable answers for empty grouped joins (specifically sum and count which should return 0.) But what about other aggregates? What is the average of an empty set? Force the programmer to specify what they want?

  • What does #lang plisqin look like?

  • Can I satisfy people who have never seen (or hate) Lisp without sacrificing power?

  • Can Plisqin provide a uniform API for different databases?

6.2 Design Notes

6.2.1 Why do sources need UIDs?

For some reason I keep thinking "can I just have all UIDs be 0 by default and use something like object equality?" And the answer is still NO. Consider this:

(define (duplicates-of x)
  (from t "Title"
        (where t".PrimaryTitle = "x".PrimaryTitle")
        (where t".TitleID < "x".TitleID")))
(from t "Title"
      (where (exists (duplicates-of t))))

If all the sources are '(Source "Title" "t" 0) then t and x will refer to the same thing. This is clearly wrong.

6.2.2 On Parameters

It’s very convenient to be able to put raw SQL right into a clause, like this:
(from x "X"
      (where x".foo = 'bar'"))

But this comes with the risk of SQL injection. Here is what I am thinking.
  1. Create a new data structure trusted-string.

  2. Tighten the contract (sql-token? probably) to replace string? with trusted-string?.

  3. Create a Racket parameter that says what to do if a string? gets passed in. Either error or automatically convert to an SQL parameter. (Oh yeah, I’ll need to add parameters too.)

  4. Create a rewrite rule so that a syntax object #': immediately left of a syntax object #'"string literal" gets rewritten to (trust "string literal").

  5. Should we do the same for numbers? Probably...

Then you can do
(from x "X"
      {where x :".foo = 'bar'"})

Which is almost as convenient.

If you don’t have a string literal, you would have to do (trust the-string).

I think that if (syntax->datum stx) returns a string? then you know that stx is a string literal.

6.3 Tests

This section is just a dumping ground for tests that might be educational.

-

Aliases are made to be unique if needed. Both of these want the alias "x" but the subquery gets renamed to "x1":

> (define (my-subquery parent)
    (from x "Sub"
          (where x".Something = "parent".Something")))
> (define my-query
    (from x "Parent"
          (where "not "(exists (my-subquery x)))))
> (display (to-sql my-query))

select x.*

from Parent x

where not exists (

    select x1.*

    from Sub x1

    where x1.Something = x.Something)

-

Joins are included even if they don’t appear in a fragment:

> (define my-query
    (from x "X"
          (join y "Y"
                (join-on "1=1"))))
> (display (to-sql my-query))

select x.*

from X x

inner join Y y

on 1=1

-

The query-building macros accept a single clause or a list for greater composability:

> (define my-query
    (from x "X"
          (if #t
              (select x".First")
              (list))
          (if #t
              (list
               (select x".Second")
               (select x".Third"))
              (list))
          (if #f
              (select x".Fourth")
              (list))))
> (display (to-sql my-query))

select

  x.First

  , x.Second

  , x.Third

from X x

-

Aggregate functions don’t need to be injected if they refer only to the main query and "simple" joins:
> (define my-query
    (from t "Title"
          (join r "Rating"
                (join-on r".TitleID = "t".TitleID"))
          (select (sum t".TitleID + "r".Score"))
          (group-by t".TitleID")))
> (display (to-sql my-query))

select

  sum(t.TitleID + r.Score)

from Title t

inner join Rating r

on r.TitleID = t.TitleID

group by t.TitleID

-

The SQL ‘having’ clause:
> (define my-query
    (from r "Rating"
          (group-by r".TitleID")
          (having "count("r".TitleID) > 100")
          (select r".TitleID")
          (select (avg r".Score"))))
> (display (to-sql my-query))

select

  r.TitleID

  , avg(r.Score)

from Rating r

group by r.TitleID

having count(r.TitleID) > 100

-

We can deduplicate injections!
> (define my-query
    (from t "Title"
          (join ratings "Rating"
                (group-by ratings".TitleID")
                (join-on (scalar ratings".TitleID")" = "t".TitleID"))
          (select (avg ratings".Score")" as AvgScore")
          (select (avg ratings".Score")" as AvgScore2")))
> (display (to-sql my-query))

select

  ratings.__INJECT2 as AvgScore

  , ratings.__INJECT2 as AvgScore2

from Title t

inner join (

    select

      ratings.TitleID as __INJECT1

      , avg(ratings.Score) as __INJECT2

    from Rating ratings

    group by ratings.TitleID) ratings

on ratings.__INJECT1 = t.TitleID

-

The old resolve-joins was designed on the assumption that we would not find the source of a join in the fragments. In other words, that all joins would be inlined. This is no longer true. Let’s try to create a failing test here:
> (define (TitleType-of/s title)
    (join tt "TitleType"
          (join-on tt".TitleTypeID = "title".TitleTypeID")))
> (define my-query
    (from t "Title"
          (join tt1 (TitleType-of/s t))
          (join tt2 (TitleType-of/s t))
          (select tt1".a1")
          (select tt2".a2")
          (select (TitleType-of/s t)".a3")
          (select (TitleType-of/s t)".a4")))
> (display (to-sql my-query))

select

  tt.a1

  , tt.a2

  , tt.a3

  , tt.a4

from Title t

inner join TitleType tt

on tt.TitleTypeID = t.TitleTypeID

-

Regression test. A joined subquery should be able to explicitly join other other joins.
> (define my-query
    (from x "X"
          (join y "Y"
                (join-on "'y'='y'")
                (join z "Z"
                      (join-on "'z'='z'"))
                (select y".Foo")
                (select z".Bar"))))
> (display (to-sql my-query))

select x.*

from X x

inner join (

    select

      y.Foo

      , z.Bar

    from Y y

    inner join Z z

    on 'z'='z') y

on 'y'='y'

-

Auto-inject scalars in join-on clauses:
> (define my-query
    (from x "X"
          (join y "Y"
                (group-by y".XID")
                (join-on (scalar y".XID")" = "(scalar x".XID")))
          (select x".*")
          (select (count y)" as NumYs")))
> (display (to-sql my-query))

select

  x.*

  , y.__INJECT2 as NumYs

from X x

inner join (

    select

      y.XID as __INJECT1

      , count(*) as __INJECT2

    from Y y

    group by y.XID) y

on y.__INJECT1 = x.XID

-

Auto-inject scalars in join-on clauses (TODO no tests here)
> (require (only-in racket remove-duplicates))
> (define (jZ y [group? #f])
    (join z "Z"
          (if group?
              (group-by (scalar z".ZID"))
              (list))
          (join-on (scalar z".ZID")" = "(scalar y".ZID"))))
> (define (jY x [group? #f])
    (join y "Y"
          (if group?
              (group-by (scalar y".YID"))
              (list))
          (join-on (scalar y".YID")" = "(scalar x".YID"))))
> (define-syntax-rule (q1 linkY linkZ)
    (from x "X"
  
          (linkY y (jY x))
          (linkZ z (jZ y #t))
          (select (count z)" as Blah")))
> (define sql-strings
    (remove-duplicates
     (list
      (to-sql (q1 join join))
      "\n--- a ---\n"
      (to-sql (q1 join define))
      "\n--- b ---\n"
      (to-sql (q1 define join))
      "\n--- c ---\n"
      (to-sql (q1 define define)))))
> (length sql-strings)

4

> (displayln sql-strings)

(select

  z.__INJECT2 as Blah

from X x

inner join Y y

on y.YID = x.YID

inner join (

    select

      z.ZID as __INJECT1

      , count(*) as __INJECT2

    from Z z

    group by z.ZID) z

on z.__INJECT1 = y.ZID

--- a ---

 

--- b ---

 

--- c ---

)

-

Auto-inject with duplicate inline grouped joins (TODO no tests here)
> (define my-query
    (from x "X"
          (define (z) (jZ (jY x) #t))
          (select (count (z))" as Blah")
          (select (sum (z)".Stuff")" as Total")))
> (displayln (to-sql my-query))

select

  z.__INJECT2 as Blah

  , z.__INJECT3 as Total

from X x

inner join Y y

on y.YID = x.YID

inner join (

    select

      z.ZID as __INJECT1

      , count(*) as __INJECT2

      , sum(z.Stuff) as __INJECT3

    from Z z

    group by z.ZID) z

on z.__INJECT1 = y.ZID

-

Regression test. This was broken because the auto-injected scalars used different sources (from the questionable global scope) which broke equality for the "auto-inject aggregates" algorithm. The solution is to call deduplicate between both algorithms. The better solution would be some way to avoid the global scope when creating sources.
> (define (my-query)
    (from cust Customer
          (define (is-movie rental)
            (bool (ItemTypeId rental)" = 1"))
          (define (is-recent rental)
            (bool (CheckoutTime rental)" > '2018-01-01'"))
          (define (count-when predicate)
            (sum "case when "predicate" then 1 else 0 end"))
          (join rentals (Rentals-of/g cust))
          (select cust".*")
          (select (count rentals)
                  " as TotalRentalCount")
          (select (count-when (is-movie rentals))
                  " as MovieRentalCount")
          (select (count-when (is-recent rentals))
                  " as RecentRentalCount")
          (select (count-when (bool (is-movie rentals)
                                    " and "
                                    (is-recent rentals)))
                  " as RecentMovieRentalCount")))
> (display (to-sql (my-query)))

select

  cust.*

  , _rental.__INJECT2 as TotalRentalCount

  , _rental.__INJECT3 as MovieRentalCount

  , _rental.__INJECT4 as RecentRentalCount

  , _rental.__INJECT5 as RecentMovieRentalCount

from Customer cust

inner join (

    select

      _checkout.CustomerId as __INJECT1

      , count(*) as __INJECT2

      , sum(case when _item.ItemTypeId = 1 then 1 else 0 end) as __INJECT3

      , sum(case when _checkout.CheckoutTime > '2018-01-01' then 1 else 0 end) as __INJECT4

      , sum(case when _item.ItemTypeId = 1 and _checkout.CheckoutTime > '2018-01-01' then 1 else 0 end) as __INJECT5

    from Rental _rental

    inner join Checkout _checkout

    on _checkout.CheckoutId = _rental.CheckoutId

    inner join Copy _copy

    on _copy.CopyId = _rental.CopyId

    inner join Item _item

    on _item.ItemId = _copy.ItemId

    group by _checkout.CustomerId) _rental

on _rental.__INJECT1 = cust.CustomerId

-

Tests of count.

> (define (q)
    (from x "X"
          (select (count x))))
> (display (to-sql (q)))

select

  count(*)

from X x

> (define (q)
    (from x "X"
          (select (count x".foo"))))
> (display (to-sql (q)))

select

  count(x.foo)

from X x

> (define (q)
    (from x "X"
          (select (count x".foo" #:distinct? #t))))
> (display (to-sql (q)))

select

  count(distinct x.foo)

from X x

> (define (q)
    (from x "X"
          (join y "Y")
          (select (count y))))
> (display (to-sql (q)))

select

  count(*)

from X x

inner join Y y

on 1=1

> (define (q)
    (from x "X"
          (join y "Y")
          (select (count y".foo"))))
> (display (to-sql (q)))

select

  count(y.foo)

from X x

inner join Y y

on 1=1

> (define (q)
    (from x "X"
          (join y "Y")
          (select (count y".foo" #:distinct? #t))))
> (display (to-sql (q)))

select

  count(distinct y.foo)

from X x

inner join Y y

on 1=1

> (define (q)
    (from x "X"
          (join y "Y"
                (group-by y".foo"))
          (select  (count y))))
> (display (to-sql (q)))

select

  y.__INJECT1

from X x

inner join (

    select

      count(*) as __INJECT1

    from Y y

    group by y.foo) y

on 1=1

> (define (q)
    (from x "X"
          (join y "Y"
                (group-by y".foo"))
          (select (count y".bar"))))
> (display (to-sql (q)))

select

  y.__INJECT1

from X x

inner join (

    select

      count(y.bar) as __INJECT1

    from Y y

    group by y.foo) y

on 1=1

> (define (q)
    (from x "X"
          (join y "Y"
                (group-by y".foo"))
          (select (count y".bar" #:distinct? #t))))
> (display (to-sql (q)))

select

  y.__INJECT1

from X x

inner join (

    select

      count(distinct y.bar) as __INJECT1

    from Y y

    group by y.foo) y

on 1=1

-

Tests of exists.
> (define (q)
    (from x "X"
          (where (exists "select * from blah"))))
> (display (to-sql (q)))

select x.*

from X x

where exists (select * from blah)

> (define (q)
    (from x "X"
          (where (exists (from y "Y"
                               (where y".Foo = "x".Bar"))))))
> (display (to-sql (q)))

select x.*

from X x

where exists (

    select y.*

    from Y y

    where y.Foo = x.Bar)

> (define (q)
    (from x "X"
          (where (exists (join y "Y"
                               (join-on y".Foo = "x".Bar"))))))
> (display (to-sql (q)))

select x.*

from X x

where exists (

    select y.*

    from Y y

    where y.Foo = x.Bar)