Cloud spanner best practice INTERLEAVE questions

1.4k Views Asked by At

Let's take the tables define in docs:

CREATE TABLE Singers (
  SingerId   INT64 NOT NULL,
  FirstName  STRING(1024),
  LastName   STRING(1024),
  SingerInfo BYTES(MAX),
) PRIMARY KEY (SingerId);

CREATE TABLE Albums (
  SingerId     INT64 NOT NULL,
  AlbumId      INT64 NOT NULL,
  AlbumTitle   STRING(MAX),
) PRIMARY KEY (SingerId, AlbumId),
  INTERLEAVE IN PARENT Singers ON DELETE CASCADE;

CREATE TABLE Songs (
  SingerId     INT64 NOT NULL,
  AlbumId      INT64 NOT NULL,
  TrackId      INT64 NOT NULL,
  SongName     STRING(MAX),
) PRIMARY KEY (SingerId, AlbumId, TrackId),
  INTERLEAVE IN PARENT Albums ON DELETE CASCADE;

So we have 3 tables Singers, Albums and Songs. Table Album INTERLEAVE Singers and table Songs INTERLEAVE Singers and Albums.

My question is if we want so search all the information about a particular singer can we search in table Songs if the singer have an album but don't have any song yet ? If not what's the best practice to retrive all the data of a singer(all albums and songs(if he have any)). I thought to search in table Songs if we don't find anything in Songs(because the singer can have an album but songs are in development) search in table Album and after in Singer(because even Album can be in development) but I don't think it's the best solution.

In my case the user who make the query doesn't know if singer have any songs or albums but want to retrive all the info about singer(in one split if possible).

2

There are 2 best solutions below

0
On BEST ANSWER

I have come to two solutions:

  1. In this case we have 3 table scan: Singers, Albums, Songs.

    select singers.singerId, albums.albumId, songs.trackId
    from singers
    left join albums ON singers.singerId = albums.singerid
    left join songs ON albums.albumid = songs.albumid

  2. Have one table like:

Table Schema:

CREATE TABLE Singers (
  SingerId   INT64 NOT NULL,
  AlbumId INT64,
  SongId INT64,
  .
  .(informations about Singer, Album and Song)
  .
) PRIMARY KEY (SingerId);

So we'll have something like:

SingerId AlbumId  SongId  SingerName AlbumName SongName
   1                        Singer 1
   1        1                          Album 1
   1        1       1                           Song 1
   1        1       2                           Song 2
   1        1       3                           Song 3
   1        1                          Album 2
   1        2       1                           Song 1
   1        2       2                           Song 2
   1        2       3                           Song 3

And with 1 query we can recive all data about Singer.(We have 1 big table scan not 3, but I don't know if it's best practive because again, the server will split data between servers so we'll end with multiple selects between splits).

What solution you think works the best and if you have something that I miss please explain.

0
On

I would recommend to use JOINs, potentially avoiding the 3 separate reads (something along the lines.. )

select singers.singerId, albums.albumId, songs.trackId 
from singers left join albums ON singers.singerId = albums.singerid 
left join songs ON songs.SingerId = singers.singerId
order by singerId, albumId;

Reads from child/interleaved tables with no corresponding rows return empty results, thereby requiring 3 separate read requests -

 
  select * from albums order by singerId, albumId;
  SingerId      AlbumId     AlbumTitle

   1            1           Total Junk  
   1            2           Go, Go, Go  
   .. more rows .. 
 

Though querying a child table - this returns no results, since the table does not have Songs for singerId = 1:

select * from songs where singerId = 1 order by singerId, albumId; 

No results. The query did not return any rows.

PS - not sure what u mean by "split" here - "to retrive all the info about singer(in one split if possible)."