class: center, middle, inverse, title-slide .title[ # Databases with R ] .subtitle[ ##
https://pos.it/databases-24
] .author[ ### Kirill Müller ] .author[ ### cynkra GmbH ] .date[ ### August 12, 2024 ] --- <style type="text/css"> .pull-left { margin-top: -25px; } .pull-right { margin-top: -25px; } .r.hljs.remark-code { font-size: 20px; } .remark-code { font-size: 20px; } .fansi.fansi-error { color: #D70040; } pre { white-space: pre-wrap; /* css-3 */ white-space: -moz-pre-wrap; /* Mozilla, since 1999 */ white-space: -pre-wrap; /* Opera 4-6 */ white-space: -o-pre-wrap; /* Opera 7 */ word-wrap: break-word; /* Internet Explorer 5.5+ */ margin-bottom: 0px; } .remark-slide-content { padding-top: 0px; padding-bottom: 0px; } .remark-slide-scaler { overflow-y: auto; } .font17 { font-size: 17px; } .font14 { font-size: 14px; } .script-number-font { color: inherit !important; opacity: 0.9 !important; transform: scale(1.8); transform-origin: top left; } .script-number { color: inherit !important; opacity: 0.9 !important; position: absolute; top: 12px; right: 20px; transform: scale(1.8); transform-origin: top right; } .exercise-timer { position: fixed; bottom: 12px; left: 20px; transform: scale(1.8); transform-origin: bottom left; } .exercise-timer img { max-width: none; /* Prevent the image from being constrained by the container */ } </style> # https://pos.it/databases-24 .right[Welcome to posit::conf(2024)] - WiFi: `Posit Conf 2024` Password: `conf2024` - There are gender-neutral bathrooms located on levels 3, 4, 5, 6 & 7 - There is a meditation/prayer room is located in 503. Available Mon & Tues 7am - 7pm, and Wed 7am - 5pm. - The lactation room is located in 509, same timings as above. - Participants who do not wish to be photographed have red lanyards; please note everyone’s lanyard colors before taking a photo and respect their choices. - The Code of Conduct can be found at https://posit.co/code-of-conduct. Please review them carefully. You can report Code of Conduct violations in person, by email, or by phone. Please see the policy linked above for contact information. --- # https://pos.it/databases-24 .right[Schedule] | Time | Activity | | :------------ | :--------------- | | 09:00 - 10:30 | Talking to the database | | 10:30 - 11:00 | *Coffee break* | | 11:00 - 12:30 | Working with files | | 12:30 - 13:30 | *Lunch break* | | 13:30 - 15:00 | Digging in deeper | | 15:00 - 15:30 | *Coffee break* | | 15:30 - 17:00 | Exercises - Bring your own data | --- # https://pos.it/databases-24 .right[General remarks] - This course is interactive: go ahead and ask if anything is unclear! - More extended Q&A in the last section of the workshop - Discord channel: *#workshop-databases* - Join through <https://pos.it/conf-event-portal> --- .pull-left[ # Roles gravitating around a database - Database researcher / developer - Database operator / administrator - Data engineer - **Analytical engineer** - **Data scientist** - **Data analyst** ] .pull-right[ # Exercise Talk to your neighbors. - Which of the above-mentioned roles do you cover in your organization? - What do you expect to take home from the workshop? ] <div class="exercise-timer"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="84" height="20" role="img" aria-label="Timer: 05:00"><title>Timer: 05:00</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="84" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="41" height="20" fill="#555"/><rect x="41" width="43" height="20" fill="#007ec6"/><rect width="84" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="215" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="310">Timer</text><text x="215" y="140" transform="scale(.1)" fill="#fff" textLength="310">Timer</text><text aria-hidden="true" x="615" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">05:00</text><text x="615" y="140" transform="scale(.1)" fill="#fff" textLength="330">05:00</text></g></svg></div> --- # https://pos.it/databases-24 .right[Course material] ```r # install.packages("usethis") usethis::use_course("posit-conf-2024/databases") # Slides for offline use: index.html # Materials: materials/ ``` posit.cloud: <https://posit.cloud/spaces/537049> GitHub: <https://github.com/posit-conf-2024/databases> --- background-image: url("data:image/png;base64,#images/10.webp") background-size: contain background-position: 100% 100% # First steps .pull-left[ - Connect - Discover <div class="script-number-font"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_10.R"><title>Script: db_10.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_10.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_10.R</text></g></svg></div> ``` r library(DBI) ``` ] --- background-image: url("data:image/png;base64,#images/10-frame.webp") background-size: contain background-position: 100% 100% # First steps .pull-left[ - Connect - Discover <div class="script-number-font"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_10.R"><title>Script: db_10.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_10.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_10.R</text></g></svg></div> ``` r library(DBI) ``` ] --- # Connect to the database First step when accessing the database. ``` r con <- dbConnect(duckdb::duckdb()) con ``` ``` <duckdb_connection 0faa0 driver=<duckdb_driver 22170 dbdir=':memory:' read_only=FALSE>> ``` <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_10.R"><title>Script: db_10.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_10.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_10.R</text></g></svg></div> --- # Discover tables Where is my data? ``` r dbListTables(con) ``` ``` character(0) ``` <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_10.R"><title>Script: db_10.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_10.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_10.R</text></g></svg></div> --- background-image: url("data:image/png;base64,#images/magic-hat-and-wand-nath-r.svg") background-size: 25% background-position: 100% 100% # Prepare database Normally done by database administrators and data engineers. ``` r dm::copy_dm_to( con, dm::dm_pixarfilms(), set_key_constraints = FALSE, temporary = FALSE ) ``` ``` Note: method with signature 'DBIConnection#Id' chosen for function 'dbExistsTable', target signature 'duckdb_connection#Id'. "duckdb_connection#ANY" would also be valid ``` <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_10.R"><title>Script: db_10.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_10.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_10.R</text></g></svg></div> --- # Discover tables Where is my data? .pull-left[ ``` r dbListTables(con) ``` ``` [1] "academy" "box_office" [3] "genres" "pixar_films" [5] "pixar_people" "public_response" ``` ] .pull-right[ ``` r dbListFields(con, "box_office") ``` ``` [1] "film" [2] "budget" [3] "box_office_us_canada" [4] "box_office_other" [5] "box_office_worldwide" ``` ] <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_10.R"><title>Script: db_10.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_10.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_10.R</text></g></svg></div> --- background-image: url("data:image/png;base64,#images/10-frame.webp") background-size: contain background-position: 100% 100% # First steps: Exercises .pull-left[ 1. List all columns from the `pixar_films` table. 2. Review the help for `dbListFields()` and `dbListTables()`, and the index on <https://dbi.r-dbi.org/reference/>. ] <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_10.R"><title>Script: db_10.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_10.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_10.R</text></g></svg></div> <div class="exercise-timer"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="84" height="20" role="img" aria-label="Timer: 05:00"><title>Timer: 05:00</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="84" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="41" height="20" fill="#555"/><rect x="41" width="43" height="20" fill="#007ec6"/><rect width="84" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="215" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="310">Timer</text><text x="215" y="140" transform="scale(.1)" fill="#fff" textLength="310">Timer</text><text aria-hidden="true" x="615" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">05:00</text><text x="615" y="140" transform="scale(.1)" fill="#fff" textLength="330">05:00</text></g></svg></div> --- # Learning goals <img src="data:image/png;base64,#databases_files/figure-html/fig-learn-base-1.png" style="display: block; margin: auto;" /> --- # Learning goals <img src="data:image/png;base64,#databases_files/figure-html/fig-learn-most-1.png" style="display: block; margin: auto;" /> --- # Learning goals <img src="data:image/png;base64,#databases_files/figure-html/fig-learn-all-1.png" style="display: block; margin: auto;" /> --- # R Packages | Section|Topic |R Packages | |-------:|:-----------------------|:------------------| | 1|Talking to the database |DBI, dbplyr | | 2|Working with files |+ duckdb, duckplyr | | 3|Digging in deeper |+ dm | --- .pull-left[ # Database Set of concepts: - relationships - data integrity - data redundancy - data control ] .pull-right[ # DBMS Software that embodies those concepts. > A Database Management System (DBMS) is a software system that uses a standard method to store and organize data. It allows users to create, retrieve, update, and manage data efficiently and provides tools for ensuring data consistency, integrity, and security. ] .center[These two terms are typically used interchangeably despite the differences.] --- # Database types .pull-left[ ## Relational (SQL database) or RDBMS Tables - Microsoft SQL Server - Oracle - PostgreSQL - MySQL / MariaDB - Exasol / Clickhouse - **duckdb** ] .pull-right[ ## Non-relational (NoSQL database) Key-value, document, graph, ... - MongoDB - Redis - Amazon DynamoDB ] .center[Server-based, can be found in "corporate" settings.] --- .pull-left[ # Server-based - It requires a separate server and dedicated resources (RAM, CPU, memory and storage) - A client (you / your application) access it over a network - Suitable for multi-user access (e.g. a "corporate" setting) - Deployment requires knowledge and expertise - Requires administration ] .pull-right[ # Embedded - It runs as part of the application process (e.g., in the R session of your laptop) - Does not require a separate database server - Minimal setup and configuration - **Handy in a workshop setting!** ] --- ## Lightweight, embedded and easy-to-configure RDBMS .pull-left[ # SQLite (2000) - Designed for embedded use in applications: mobile apps, web browser - "Most widely deployed and used database engine" ] .pull-right[ # DuckDB (2018) - Tailored towards heavy analytical workloads and data science applications - **"The SQLite for analytics"** ] --- .pull-left[ # Row data storage - SQLite - Good for transactional processing - Each row represented in a table is stored as an array on disk Easy to update individual rows. ] .pull-right[ # Columnar data storage - DuckDB - Good for analytics - Each column is stored as a chunked array on disk and in memory - Memory locality - Column compression **Faster queries for reading.** ] --- class: middle <img src="data:image/png;base64,#images/direct.svg" width="82%" height="82%" style="display: block; margin: auto;" /> --- background-image: url("data:image/png;base64,#images/11.webp") background-size: contain background-position: 100% 100% # Read whole tables .pull-left[ - {DBI} package - Read - Query <div class="script-number-font"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_11.R"><title>Script: db_11.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_11.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_11.R</text></g></svg></div> ``` r library(tidyverse) library(DBI) ``` ] --- background-image: url("data:image/png;base64,#images/11-frame.webp") background-size: contain background-position: 100% 100% # Read whole tables .pull-left[ - {DBI} package - Read - Query <div class="script-number-font"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_11.R"><title>Script: db_11.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_11.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_11.R</text></g></svg></div> ``` r library(tidyverse) library(DBI) ``` ] --- # Connect to the database First step when accessing the database. ```r con <- dbConnect(duckdb::duckdb()) con ``` ``` <duckdb_connection 0faa0 driver=<duckdb_driver 22170 dbdir=':memory:' read_only=FALSE>> ``` ## Caveat: server, credentials, ... ```r dbConnect( odbc::odbc(), # Alternatively: RPostgres::Postgres(), RPostgres::Redshift(), ... dsn = "...", # driver = "...", server = "...", uid = "...", pwd = "...", ... ) ``` <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_11.R"><title>Script: db_11.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_11.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_11.R</text></g></svg></div> --- # Discover tables Where is my data? .pull-left[ ``` r dbListTables(con) ``` ``` [1] "academy" "box_office" [3] "genres" "pixar_films" [5] "pixar_people" "public_response" ``` ] .pull-right[ ``` r dbListFields(con, "box_office") ``` ``` [1] "film" [2] "budget" [3] "box_office_us_canada" [4] "box_office_other" [5] "box_office_worldwide" ``` ] ## Caveat: schemas, catalogs, ... ```sql SELECT * FROM INFORMATION_SCHEMA.TABLES ``` ```r dbListFields(con, SQL("datasets.movies.box_office")) ``` <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_11.R"><title>Script: db_11.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_11.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_11.R</text></g></svg></div> --- # Read tables Read entire tables into your local session, if you can afford it. ``` r df_pixar_films <- dbReadTable(con, "pixar_films") df_pixar_films ``` ``` number film release_date run_time film_rating 1 1 Toy Story 1995-11-22 81 G 2 2 A Bug's Life 1998-11-25 95 G 3 3 Toy Story 2 1999-11-24 92 G 4 4 Monsters, Inc. 2001-11-02 92 G 5 5 Finding Nemo 2003-05-30 100 G 6 6 The Incredibles 2004-11-05 115 PG 7 7 Cars 2006-06-09 117 G 8 8 Ratatouille 2007-06-29 111 G 9 9 WALL-E 2008-06-27 98 G 10 10 Up 2009-05-29 96 PG 11 11 Toy Story 3 2010-06-18 103 G 12 12 Cars 2 2011-06-24 106 G 13 13 Brave 2012-06-22 93 PG 14 14 Monsters University 2013-06-21 104 G 15 15 Inside Out 2015-06-19 95 PG 16 16 The Good Dinosaur 2015-11-25 93 PG 17 17 Finding Dory 2016-06-17 97 PG 18 18 Cars 3 2017-06-16 102 G 19 19 Coco 2017-11-22 105 PG 20 20 Incredibles 2 2018-06-15 118 PG [ reached 'max' / getOption("max.print") -- omitted 7 rows ] ``` <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_11.R"><title>Script: db_11.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_11.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_11.R</text></g></svg></div> --- # Read tables Use `as_tibble()` to convert to a tibble for better display and more robust operation. ``` r df_pixar_films <- dbReadTable(con, "pixar_films") as_tibble(df_pixar_films) ``` <PRE class="fansi fansi-output"><CODE><span style='color: #949494;'># A tibble: 27 × 5</span> <span style='font-weight: bold;'>number</span> <span style='font-weight: bold;'>film</span> <span style='font-weight: bold;'>release_date</span> <span style='font-weight: bold;'>run_time</span> <span style='font-weight: bold;'>film_rating</span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><date></span> <span style='color: #949494; font-style: italic;'><dbl></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #BCBCBC;'>1</span> 1 Toy Story 1995-11-22 81 G <span style='color: #BCBCBC;'>2</span> 2 A Bug's Life 1998-11-25 95 G <span style='color: #BCBCBC;'>3</span> 3 Toy Story 2 1999-11-24 92 G <span style='color: #BCBCBC;'>4</span> 4 Monsters, Inc. 2001-11-02 92 G <span style='color: #BCBCBC;'>5</span> 5 Finding Nemo 2003-05-30 100 G <span style='color: #BCBCBC;'>6</span> 6 The Incredibles 2004-11-05 115 PG <span style='color: #949494;'># ℹ 21 more rows</span> </CODE></PRE> <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_11.R"><title>Script: db_11.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_11.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_11.R</text></g></svg></div> --- # Execute queries Write SQL code to define what data you want to see. ``` r dbGetQuery(con, "SELECT * FROM pixar_films") ``` ``` number film release_date run_time film_rating 1 1 Toy Story 1995-11-22 81 G 2 2 A Bug's Life 1998-11-25 95 G 3 3 Toy Story 2 1999-11-24 92 G 4 4 Monsters, Inc. 2001-11-02 92 G 5 5 Finding Nemo 2003-05-30 100 G 6 6 The Incredibles 2004-11-05 115 PG 7 7 Cars 2006-06-09 117 G 8 8 Ratatouille 2007-06-29 111 G 9 9 WALL-E 2008-06-27 98 G 10 10 Up 2009-05-29 96 PG 11 11 Toy Story 3 2010-06-18 103 G 12 12 Cars 2 2011-06-24 106 G 13 13 Brave 2012-06-22 93 PG 14 14 Monsters University 2013-06-21 104 G 15 15 Inside Out 2015-06-19 95 PG 16 16 The Good Dinosaur 2015-11-25 93 PG 17 17 Finding Dory 2016-06-17 97 PG 18 18 Cars 3 2017-06-16 102 G 19 19 Coco 2017-11-22 105 PG 20 20 Incredibles 2 2018-06-15 118 PG [ reached 'max' / getOption("max.print") -- omitted 7 rows ] ``` <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_11.R"><title>Script: db_11.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_11.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_11.R</text></g></svg></div> --- # Execute queries Write complex SQL code to define what data you want to see. ``` r sql <- " SELECT * FROM pixar_films WHERE release_date >= '2020-01-01' " ``` ``` r dbGetQuery(con, sql) ``` ``` number film release_date run_time film_rating 1 22 Onward 2020-03-06 102 PG 2 23 Soul 2020-12-25 100 PG 3 24 Luca 2021-06-18 151 N/A 4 25 Turning Red 2022-03-11 NA N/A 5 26 Lightyear 2022-06-17 NA N/A 6 27 <NA> 2023-06-16 155 Not Rated ``` <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_11.R"><title>Script: db_11.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_11.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_11.R</text></g></svg></div> --- # Execute queries R 4.0 or later: use new-style string literals for mixing quotes. ``` r sql <- r"( SELECT * FROM "pixar_films" WHERE "release_date" >= '2020-01-01' )" ``` ``` r dbGetQuery(con, sql) ``` ``` number film release_date run_time film_rating 1 22 Onward 2020-03-06 102 PG 2 23 Soul 2020-12-25 100 PG 3 24 Luca 2021-06-18 151 N/A 4 25 Turning Red 2022-03-11 NA N/A 5 26 Lightyear 2022-06-17 NA N/A 6 27 <NA> 2023-06-16 155 Not Rated ``` <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_11.R"><title>Script: db_11.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_11.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_11.R</text></g></svg></div> --- background-image: url("data:image/png;base64,#images/magic-hat-and-wand-nath-r.svg") background-size: 25% background-position: 100% 100% # Further pointers .pull-left[ ## Quoting ``` r dbQuoteIdentifier(con, "from") ``` ``` <SQL> "from" ``` ``` r dbQuoteLiteral(con, "Toy Story") ``` ``` <SQL> 'Toy Story' ``` ``` r dbQuoteLiteral(con, as.Date("2020-01-01")) ``` ``` <SQL> '2020-01-01'::date ``` ``` r glue::glue_sql(...) ``` ] .pull-right[ ## Parameterized queries ``` r sql <- "SELECT count(*) FROM pixar_films WHERE release_date >= ?" ``` ``` r dbGetQuery(con, sql, params = list(as.Date("2020-01-01")) ) ``` ``` count_star() 1 6 ``` ] <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_11.R"><title>Script: db_11.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_11.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_11.R</text></g></svg></div> --- background-image: url("data:image/png;base64,#images/11-frame.webp") background-size: contain background-position: 100% 100% # Read whole tables: Exercises .pull-left[ 1. Read the `academy` table. 2. Read all records from the `academy` table that correspond to awards won - Hint: `"SELECT * FROM academy WHERE status = 'Won'"` 3. Use quoting and/or a query parameter to make the previous query more robust. - Hint: `sql <- paste0("SELECT * FROM academy WHERE ", quoted_column, " = ?")` ] <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_11.R"><title>Script: db_11.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_11.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_11.R</text></g></svg></div> <div class="exercise-timer"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="84" height="20" role="img" aria-label="Timer: 05:00"><title>Timer: 05:00</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="84" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="41" height="20" fill="#555"/><rect x="41" width="43" height="20" fill="#007ec6"/><rect width="84" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="215" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="310">Timer</text><text x="215" y="140" transform="scale(.1)" fill="#fff" textLength="310">Timer</text><text aria-hidden="true" x="615" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">05:00</text><text x="615" y="140" transform="scale(.1)" fill="#fff" textLength="330">05:00</text></g></svg></div> --- # Learning goals <img src="data:image/png;base64,#databases_files/figure-html/fig-learn-dbplyr-1.png" style="display: block; margin: auto;" /> --- class: middle <img src="data:image/png;base64,#images/connecting-dots-simple.svg" width="82%" height="82%" style="display: block; margin: auto;" /> --- background-image: url("data:image/png;base64,#images/12.webp") background-size: 40% background-position: 100% 100% # Downsizing on the database .pull-left[ - {dbplyr} package (part of the tidyverse) - Lazy tables - `collect()` - `select()` - `filter()` <div class="script-number-font"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_12.R"><title>Script: db_12.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_12.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_12.R</text></g></svg></div> ``` r library(tidyverse) ``` ] --- background-image: url("data:image/png;base64,#images/12-frame.webp") background-size: 40% background-position: 100% 100% # Downsizing on the database .pull-left[ - {dbplyr} package (part of the tidyverse) - Lazy tables - `collect()` - `select()` - `filter()` <div class="script-number-font"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_12.R"><title>Script: db_12.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_12.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_12.R</text></g></svg></div> ``` r library(tidyverse) ``` ] --- # Lazy tables A pointer to a SQL table. The data is still on the database! ``` r pixar_films <- tbl(con, "pixar_films") pixar_films ``` <PRE class="fansi fansi-output"><CODE><span style='color: #949494;'># Source: table<pixar_films> [?? x 5]</span> <span style='color: #949494;'># Database: DuckDB v1.0.0</span> <span style='font-weight: bold;'>number</span> <span style='font-weight: bold;'>film</span> <span style='font-weight: bold;'>release_date</span> <span style='font-weight: bold;'>run_time</span> <span style='font-weight: bold;'>film_rating</span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><date></span> <span style='color: #949494; font-style: italic;'><dbl></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #BCBCBC;'>1</span> 1 Toy Story 1995-11-22 81 G <span style='color: #BCBCBC;'>2</span> 2 A Bug's Life 1998-11-25 95 G <span style='color: #BCBCBC;'>3</span> 3 Toy Story 2 1999-11-24 92 G <span style='color: #BCBCBC;'>4</span> 4 Monsters, Inc. 2001-11-02 92 G <span style='color: #BCBCBC;'>5</span> 5 Finding Nemo 2003-05-30 100 G <span style='color: #BCBCBC;'>6</span> 6 The Incredibles 2004-11-05 115 PG <span style='color: #949494;'># ℹ more rows</span> </CODE></PRE> <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_12.R"><title>Script: db_12.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_12.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_12.R</text></g></svg></div> --- # Read the whole table ``` r df_pixar_films <- pixar_films |> collect() df_pixar_films ``` <PRE class="fansi fansi-output"><CODE><span style='color: #949494;'># A tibble: 27 × 5</span> <span style='font-weight: bold;'>number</span> <span style='font-weight: bold;'>film</span> <span style='font-weight: bold;'>release_date</span> <span style='font-weight: bold;'>run_time</span> <span style='font-weight: bold;'>film_rating</span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><date></span> <span style='color: #949494; font-style: italic;'><dbl></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #BCBCBC;'>1</span> 1 Toy Story 1995-11-22 81 G <span style='color: #BCBCBC;'>2</span> 2 A Bug's Life 1998-11-25 95 G <span style='color: #BCBCBC;'>3</span> 3 Toy Story 2 1999-11-24 92 G <span style='color: #BCBCBC;'>4</span> 4 Monsters, Inc. 2001-11-02 92 G <span style='color: #BCBCBC;'>5</span> 5 Finding Nemo 2003-05-30 100 G <span style='color: #BCBCBC;'>6</span> 6 The Incredibles 2004-11-05 115 PG <span style='color: #949494;'># ℹ 21 more rows</span> </CODE></PRE> <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_12.R"><title>Script: db_12.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_12.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_12.R</text></g></svg></div> --- # Select columns With `select()`, like with data frames. .pull-left[ ``` r pixar_films |> select(1:3) ``` <PRE class="fansi fansi-output"><CODE><span style='color: #949494;'># Source: SQL [?? x 3]</span> <span style='color: #949494;'># Database: DuckDB v1.0.0</span> <span style='font-weight: bold;'>number</span> <span style='font-weight: bold;'>film</span> <span style='font-weight: bold;'>release_date</span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><date></span> <span style='color: #BCBCBC;'>1</span> 1 Toy Story 1995-11-22 <span style='color: #BCBCBC;'>2</span> 2 A Bug's Life 1998-11-25 <span style='color: #BCBCBC;'>3</span> 3 Toy Story 2 1999-11-24 <span style='color: #BCBCBC;'>4</span> 4 Monsters, Inc. 2001-11-02 <span style='color: #BCBCBC;'>5</span> 5 Finding Nemo 2003-05-30 <span style='color: #BCBCBC;'>6</span> 6 The Incredibles 2004-11-05 <span style='color: #949494;'># ℹ more rows</span> </CODE></PRE> ] .pull-right[ ## Under the hood ``` r pixar_films |> select(1:3) |> show_query() ``` <PRE class="fansi fansi-output"><CODE><SQL> <span style='color: #0000BB;'>SELECT</span> number, film, release_date <span style='color: #0000BB;'>FROM</span> pixar_films </CODE></PRE> ] <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_12.R"><title>Script: db_12.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_12.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_12.R</text></g></svg></div> --- # Select columns and read .pull-left[ ``` r df_pixar_films_3 <- pixar_films |> select(1:3) |> collect() df_pixar_films_3 ``` <PRE class="fansi fansi-output"><CODE><span style='color: #949494;'># A tibble: 27 × 3</span> <span style='font-weight: bold;'>number</span> <span style='font-weight: bold;'>film</span> <span style='font-weight: bold;'>release_date</span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><date></span> <span style='color: #BCBCBC;'>1</span> 1 Toy Story 1995-11-22 <span style='color: #BCBCBC;'>2</span> 2 A Bug's Life 1998-11-25 <span style='color: #BCBCBC;'>3</span> 3 Toy Story 2 1999-11-24 <span style='color: #BCBCBC;'>4</span> 4 Monsters, Inc. 2001-11-02 <span style='color: #BCBCBC;'>5</span> 5 Finding Nemo 2003-05-30 <span style='color: #BCBCBC;'>6</span> 6 The Incredibles 2004-11-05 <span style='color: #949494;'># ℹ 21 more rows</span> </CODE></PRE> ] .pull-right[ ## Data on the database not affected ``` r pixar_films |> collect() ``` <PRE class="fansi fansi-output"><CODE><span style='color: #949494;'># A tibble: 27 × 5</span> <span style='font-weight: bold;'>number</span> <span style='font-weight: bold;'>film</span> <span style='font-weight: bold;'>release_…¹</span> <span style='font-weight: bold;'>run_t…²</span> <span style='font-weight: bold;'>film_…³</span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><date></span> <span style='color: #949494; font-style: italic;'><dbl></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #BCBCBC;'>1</span> 1 Toy Story 1995-11-22 81 G <span style='color: #BCBCBC;'>2</span> 2 A Bug's L… 1998-11-25 95 G <span style='color: #BCBCBC;'>3</span> 3 Toy Story… 1999-11-24 92 G <span style='color: #BCBCBC;'>4</span> 4 Monsters,… 2001-11-02 92 G <span style='color: #BCBCBC;'>5</span> 5 Finding N… 2003-05-30 100 G <span style='color: #BCBCBC;'>6</span> 6 The Incre… 2004-11-05 115 PG <span style='color: #949494;'># ℹ 21 more rows</span> <span style='color: #949494;'># ℹ abbreviated names: ¹​release_date, …</span> </CODE></PRE> ] <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_12.R"><title>Script: db_12.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_12.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_12.R</text></g></svg></div> --- # Select rows With `filter()`, like with data frames. .pull-left[ ``` r pixar_films |> filter(release_date >= "2020-01-01") ``` <PRE class="fansi fansi-output"><CODE><span style='color: #949494;'># Source: SQL [6 x 5]</span> <span style='color: #949494;'># Database: DuckDB v1.0.0</span> <span style='font-weight: bold;'>number</span> <span style='font-weight: bold;'>film</span> <span style='font-weight: bold;'>release_…¹</span> <span style='font-weight: bold;'>run_t…²</span> <span style='font-weight: bold;'>film_…³</span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><date></span> <span style='color: #949494; font-style: italic;'><dbl></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #BCBCBC;'>1</span> 22 Onward 2020-03-06 102 PG <span style='color: #BCBCBC;'>2</span> 23 Soul 2020-12-25 100 PG <span style='color: #BCBCBC;'>3</span> 24 Luca 2021-06-18 151 N/A <span style='color: #BCBCBC;'>4</span> 25 Turning R… 2022-03-11 <span style='color: #BB0000;'>NA</span> N/A <span style='color: #BCBCBC;'>5</span> 26 Lightyear 2022-06-17 <span style='color: #BB0000;'>NA</span> N/A <span style='color: #BCBCBC;'>6</span> 27 <span style='color: #BB0000;'>NA</span> 2023-06-16 155 Not Ra… <span style='color: #949494;'># ℹ abbreviated names: ¹​release_date,</span> <span style='color: #949494;'># ²​run_time, ³​film_rating</span> </CODE></PRE> ] .pull-right[ ## Under the hood ``` r pixar_films |> filter(release_date >= "2020-01-01") |> show_query() ``` <PRE class="fansi fansi-output"><CODE><SQL> <span style='color: #0000BB;'>SELECT</span> pixar_films.* <span style='color: #0000BB;'>FROM</span> pixar_films <span style='color: #0000BB;'>WHERE</span> (release_date >= '2020-01-01') </CODE></PRE> ] <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_12.R"><title>Script: db_12.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_12.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_12.R</text></g></svg></div> --- # Select rows and read .pull-left[ ``` r df_pixar_films_202x <- pixar_films |> filter(release_date >= "2020-01-01") |> collect() df_pixar_films_202x ``` <PRE class="fansi fansi-output"><CODE><span style='color: #949494;'># A tibble: 6 × 5</span> <span style='font-weight: bold;'>number</span> <span style='font-weight: bold;'>film</span> <span style='font-weight: bold;'>release_…¹</span> <span style='font-weight: bold;'>run_t…²</span> <span style='font-weight: bold;'>film_…³</span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><date></span> <span style='color: #949494; font-style: italic;'><dbl></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #BCBCBC;'>1</span> 22 Onward 2020-03-06 102 PG <span style='color: #BCBCBC;'>2</span> 23 Soul 2020-12-25 100 PG <span style='color: #BCBCBC;'>3</span> 24 Luca 2021-06-18 151 N/A <span style='color: #BCBCBC;'>4</span> 25 Turning R… 2022-03-11 <span style='color: #BB0000;'>NA</span> N/A <span style='color: #BCBCBC;'>5</span> 26 Lightyear 2022-06-17 <span style='color: #BB0000;'>NA</span> N/A <span style='color: #BCBCBC;'>6</span> 27 <span style='color: #BB0000;'>NA</span> 2023-06-16 155 Not Ra… <span style='color: #949494;'># ℹ abbreviated names: ¹​release_date,</span> <span style='color: #949494;'># ²​run_time, ³​film_rating</span> </CODE></PRE> ] .pull-right[ ## Data on the database not affected ``` r pixar_films |> collect() ``` <PRE class="fansi fansi-output"><CODE><span style='color: #949494;'># A tibble: 27 × 5</span> <span style='font-weight: bold;'>number</span> <span style='font-weight: bold;'>film</span> <span style='font-weight: bold;'>release_…¹</span> <span style='font-weight: bold;'>run_t…²</span> <span style='font-weight: bold;'>film_…³</span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><date></span> <span style='color: #949494; font-style: italic;'><dbl></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #BCBCBC;'>1</span> 1 Toy Story 1995-11-22 81 G <span style='color: #BCBCBC;'>2</span> 2 A Bug's L… 1998-11-25 95 G <span style='color: #BCBCBC;'>3</span> 3 Toy Story… 1999-11-24 92 G <span style='color: #BCBCBC;'>4</span> 4 Monsters,… 2001-11-02 92 G <span style='color: #BCBCBC;'>5</span> 5 Finding N… 2003-05-30 100 G <span style='color: #BCBCBC;'>6</span> 6 The Incre… 2004-11-05 115 PG <span style='color: #949494;'># ℹ 21 more rows</span> <span style='color: #949494;'># ℹ abbreviated names: ¹​release_date, …</span> </CODE></PRE> ] <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_12.R"><title>Script: db_12.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_12.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_12.R</text></g></svg></div> --- background-image: url("data:image/png;base64,#images/12-frame.webp") background-size: 40% background-position: 100% 100% # Downsizing on the database: Exercises 1 * Find several ways to select the 3 first columns * What happens if you include the name of a variable multiple times in a `select()` call? * Select all columns that contain underscores (use `contains()`) * Use `all_of()` to select 2 columns of your choice <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_12.R"><title>Script: db_12.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_12.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_12.R</text></g></svg></div> <div class="exercise-timer"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="84" height="20" role="img" aria-label="Timer: 10:00"><title>Timer: 10:00</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="84" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="41" height="20" fill="#555"/><rect x="41" width="43" height="20" fill="#007ec6"/><rect width="84" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="215" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="310">Timer</text><text x="215" y="140" transform="scale(.1)" fill="#fff" textLength="310">Timer</text><text aria-hidden="true" x="615" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">10:00</text><text x="615" y="140" transform="scale(.1)" fill="#fff" textLength="330">10:00</text></g></svg></div> --- background-image: url("data:image/png;base64,#images/12-frame.webp") background-size: 40% background-position: 100% 100% # Downsizing on the database: Exercises 2 .pull-left[ Find all films that 1. Are rated "PG" 2. Had a run time below 95 3. Had a rating of "N/A" or "Not Rated" 4. Were released after and including year 2020 5. Have a missing name (`film` column) or `run_time` 6. Are a first sequel (the name ends with "2", as in "Toy Story 2") - Hint: Bring the data into the R session before filtering ] <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_12.R"><title>Script: db_12.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_12.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_12.R</text></g></svg></div> <div class="exercise-timer"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="84" height="20" role="img" aria-label="Timer: 10:00"><title>Timer: 10:00</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="84" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="41" height="20" fill="#555"/><rect x="41" width="43" height="20" fill="#007ec6"/><rect width="84" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="215" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="310">Timer</text><text x="215" y="140" transform="scale(.1)" fill="#fff" textLength="310">Timer</text><text aria-hidden="true" x="615" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">10:00</text><text x="615" y="140" transform="scale(.1)" fill="#fff" textLength="330">10:00</text></g></svg></div> --- background-image: url("data:image/png;base64,#images/12-frame.webp") background-size: 40% background-position: 100% 100% # Downsizing on the database .pull-left[ - {dbplyr} package (part of the tidyverse) - Lazy tables - `collect()` - `summarize()` - `count()` - `left_join()` <div class="script-number-font"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_13.R"><title>Script: db_13.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_13.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_13.R</text></g></svg></div> ``` r library(tidyverse) ``` ] --- # Aggregate .pull-left[ - With `summarize()` - Use `.by` for grouped operations. ``` r pixar_films |> summarize(.by = film_rating, n = n()) ``` <PRE class="fansi fansi-output"><CODE><span style='color: #949494;'># Source: SQL [4 x 2]</span> <span style='color: #949494;'># Database: DuckDB v1.0.0</span> <span style='font-weight: bold;'>film_rating</span> <span style='font-weight: bold;'>n</span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><dbl></span> <span style='color: #BCBCBC;'>1</span> Not Rated 1 <span style='color: #BCBCBC;'>2</span> G 13 <span style='color: #BCBCBC;'>3</span> N/A 3 <span style='color: #BCBCBC;'>4</span> PG 10 </CODE></PRE> ] .pull-right[ Old syntax: ``` r pixar_films %>% group_by(film_rating) %>% summarize(n = n()) %>% ungroup() ``` <PRE class="fansi fansi-output"><CODE><span style='color: #949494;'># Source: SQL [4 x 2]</span> <span style='color: #949494;'># Database: DuckDB v1.0.0</span> <span style='font-weight: bold;'>film_rating</span> <span style='font-weight: bold;'>n</span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><dbl></span> <span style='color: #BCBCBC;'>1</span> Not Rated 1 <span style='color: #BCBCBC;'>2</span> G 13 <span style='color: #BCBCBC;'>3</span> PG 10 <span style='color: #BCBCBC;'>4</span> N/A 3 </CODE></PRE> ] <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_13.R"><title>Script: db_13.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_13.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_13.R</text></g></svg></div> --- # Aggregate .pull-left[ ``` r pixar_films |> summarize(.by = film_rating, n = n()) ``` <PRE class="fansi fansi-output"><CODE><span style='color: #949494;'># Source: SQL [4 x 2]</span> <span style='color: #949494;'># Database: DuckDB v1.0.0</span> <span style='font-weight: bold;'>film_rating</span> <span style='font-weight: bold;'>n</span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><dbl></span> <span style='color: #BCBCBC;'>1</span> Not Rated 1 <span style='color: #BCBCBC;'>2</span> PG 10 <span style='color: #BCBCBC;'>3</span> G 13 <span style='color: #BCBCBC;'>4</span> N/A 3 </CODE></PRE> ] .pull-right[ ## Under the hood ``` r pixar_films |> summarize(.by = film_rating, n = n()) |> show_query() ``` <PRE class="fansi fansi-output"><CODE><SQL> <span style='color: #0000BB;'>SELECT</span> film_rating, COUNT(*)<span style='color: #0000BB;'> AS </span>n <span style='color: #0000BB;'>FROM</span> pixar_films <span style='color: #0000BB;'>GROUP BY</span> film_rating </CODE></PRE> ] <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_13.R"><title>Script: db_13.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_13.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_13.R</text></g></svg></div> --- # Aggregate With `count()`, like with data frames. .pull-left[ ``` r pixar_films |> # # count(film_rating) ``` <PRE class="fansi fansi-output"><CODE><span style='color: #949494;'># Source: SQL [4 x 2]</span> <span style='color: #949494;'># Database: DuckDB v1.0.0</span> <span style='font-weight: bold;'>film_rating</span> <span style='font-weight: bold;'>n</span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><dbl></span> <span style='color: #BCBCBC;'>1</span> Not Rated 1 <span style='color: #BCBCBC;'>2</span> G 13 <span style='color: #BCBCBC;'>3</span> PG 10 <span style='color: #BCBCBC;'>4</span> N/A 3 </CODE></PRE> ] .pull-right[ ## Under the hood ``` r pixar_films |> # # count(film_rating) |> show_query() ``` <PRE class="fansi fansi-output"><CODE><SQL> <span style='color: #0000BB;'>SELECT</span> film_rating, COUNT(*)<span style='color: #0000BB;'> AS </span>n <span style='color: #0000BB;'>FROM</span> pixar_films <span style='color: #0000BB;'>GROUP BY</span> film_rating </CODE></PRE> ] <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_13.R"><title>Script: db_13.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_13.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_13.R</text></g></svg></div> --- # Aggregate and read .pull-left[ ``` r df_pixar_films_by_rating <- pixar_films |> count(film_rating) |> collect() df_pixar_films_by_rating ``` <PRE class="fansi fansi-output"><CODE><span style='color: #949494;'># A tibble: 4 × 2</span> <span style='font-weight: bold;'>film_rating</span> <span style='font-weight: bold;'>n</span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><dbl></span> <span style='color: #BCBCBC;'>1</span> PG 10 <span style='color: #BCBCBC;'>2</span> G 13 <span style='color: #BCBCBC;'>3</span> Not Rated 1 <span style='color: #BCBCBC;'>4</span> N/A 3 </CODE></PRE> ] .pull-right[ ## Data on the database not affected ``` r pixar_films |> collect() ``` <PRE class="fansi fansi-output"><CODE><span style='color: #949494;'># A tibble: 27 × 5</span> <span style='font-weight: bold;'>number</span> <span style='font-weight: bold;'>film</span> <span style='font-weight: bold;'>release_…¹</span> <span style='font-weight: bold;'>run_t…²</span> <span style='font-weight: bold;'>film_…³</span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><date></span> <span style='color: #949494; font-style: italic;'><dbl></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #BCBCBC;'>1</span> 1 Toy Story 1995-11-22 81 G <span style='color: #BCBCBC;'>2</span> 2 A Bug's L… 1998-11-25 95 G <span style='color: #BCBCBC;'>3</span> 3 Toy Story… 1999-11-24 92 G <span style='color: #BCBCBC;'>4</span> 4 Monsters,… 2001-11-02 92 G <span style='color: #BCBCBC;'>5</span> 5 Finding N… 2003-05-30 100 G <span style='color: #BCBCBC;'>6</span> 6 The Incre… 2004-11-05 115 PG <span style='color: #949494;'># ℹ 21 more rows</span> <span style='color: #949494;'># ℹ abbreviated names: ¹​release_date, …</span> </CODE></PRE> ] <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_13.R"><title>Script: db_13.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_13.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_13.R</text></g></svg></div> --- # A second table .pull-left[ ``` r academy <- tbl(con, "academy") academy |> count(status) ``` <PRE class="fansi fansi-output"><CODE><span style='color: #949494;'># Source: SQL [5 x 2]</span> <span style='color: #949494;'># Database: DuckDB v1.0.0</span> <span style='font-weight: bold;'>status</span> <span style='font-weight: bold;'>n</span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><dbl></span> <span style='color: #BCBCBC;'>1</span> Award not yet introduced 3 <span style='color: #BCBCBC;'>2</span> Nominated 36 <span style='color: #BCBCBC;'>3</span> Won Special Achievement 1 <span style='color: #BCBCBC;'>4</span> Ineligible 23 <span style='color: #BCBCBC;'>5</span> Won 17 </CODE></PRE> ] .pull-right[ ``` r academy ``` <PRE class="fansi fansi-output"><CODE><span style='color: #949494;'># Source: table<academy> [?? x 3]</span> <span style='color: #949494;'># Database: DuckDB v1.0.0</span> <span style='font-weight: bold;'>film</span> <span style='font-weight: bold;'>award_type</span> <span style='font-weight: bold;'>status</span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #BCBCBC;'>1</span> Toy Story Animated Feature Award not yet… <span style='color: #BCBCBC;'>2</span> Toy Story Original Screenplay Nominated <span style='color: #BCBCBC;'>3</span> Toy Story Adapted Screenplay Ineligible <span style='color: #BCBCBC;'>4</span> Toy Story Original Score Nominated <span style='color: #BCBCBC;'>5</span> Toy Story Original Song Nominated <span style='color: #BCBCBC;'>6</span> Toy Story Other Won Special A… <span style='color: #949494;'># ℹ more rows</span> </CODE></PRE> ] <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_13.R"><title>Script: db_13.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_13.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_13.R</text></g></svg></div> --- # Left join, the most frequent kind of join .pull-left[ ## Unsafe ``` r academy |> left_join(pixar_films) ``` ``` Joining with `by = join_by(film)` ``` <PRE class="fansi fansi-output"><CODE><span style='color: #949494;'># Source: SQL [?? x 7]</span> <span style='color: #949494;'># Database: DuckDB v1.0.0</span> <span style='font-weight: bold;'>film</span> <span style='font-weight: bold;'>award_t…¹</span> <span style='font-weight: bold;'>status</span> <span style='font-weight: bold;'>number</span> <span style='font-weight: bold;'>release_…²</span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><date></span> <span style='color: #BCBCBC;'>1</span> Toy Story Animated… Award… 1 1995-11-22 <span style='color: #BCBCBC;'>2</span> Toy Story Original… Nomin… 1 1995-11-22 <span style='color: #BCBCBC;'>3</span> Toy Story Adapted … Ineli… 1 1995-11-22 <span style='color: #BCBCBC;'>4</span> Toy Story Original… Nomin… 1 1995-11-22 <span style='color: #BCBCBC;'>5</span> Toy Story Original… Nomin… 1 1995-11-22 <span style='color: #BCBCBC;'>6</span> Toy Story Other Won S… 1 1995-11-22 <span style='color: #949494;'># ℹ more rows</span> <span style='color: #949494;'># ℹ abbreviated names: ¹​award_type, …</span> </CODE></PRE> ] .pull-right[ ## Better ``` r academy |> left_join(pixar_films, join_by(film)) ``` <PRE class="fansi fansi-output"><CODE><span style='color: #949494;'># Source: SQL [?? x 7]</span> <span style='color: #949494;'># Database: DuckDB v1.0.0</span> <span style='font-weight: bold;'>film</span> <span style='font-weight: bold;'>award_t…¹</span> <span style='font-weight: bold;'>status</span> <span style='font-weight: bold;'>number</span> <span style='font-weight: bold;'>release_…²</span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><date></span> <span style='color: #BCBCBC;'>1</span> Toy Story Animated… Award… 1 1995-11-22 <span style='color: #BCBCBC;'>2</span> Toy Story Original… Nomin… 1 1995-11-22 <span style='color: #BCBCBC;'>3</span> Toy Story Adapted … Ineli… 1 1995-11-22 <span style='color: #BCBCBC;'>4</span> Toy Story Original… Nomin… 1 1995-11-22 <span style='color: #BCBCBC;'>5</span> Toy Story Original… Nomin… 1 1995-11-22 <span style='color: #BCBCBC;'>6</span> Toy Story Other Won S… 1 1995-11-22 <span style='color: #949494;'># ℹ more rows</span> <span style='color: #949494;'># ℹ abbreviated names: ¹​award_type, …</span> </CODE></PRE> ] <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_13.R"><title>Script: db_13.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_13.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_13.R</text></g></svg></div> --- # Left join, the most frequent kind of join Computed on the database, original data unchanged. ``` r academy |> left_join(pixar_films, join_by(film)) |> show_query() ``` <PRE class="fansi fansi-output"><CODE><SQL> <span style='color: #0000BB;'>SELECT</span> academy.*, number, release_date, run_time, film_rating <span style='color: #0000BB;'>FROM</span> academy <span style='color: #0000BB;'>LEFT JOIN</span> pixar_films <span style='color: #0000BB;'>ON</span> (academy.film = pixar_films.film) </CODE></PRE> <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_13.R"><title>Script: db_13.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_13.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_13.R</text></g></svg></div> --- # Join with preparation .pull-left[ ## Prepare RHS ``` r academy_won <- academy |> filter(status == "Won") |> count(film, name = "n_won") academy_won ``` <PRE class="fansi fansi-output"><CODE><span style='color: #949494;'># Source: SQL [?? x 2]</span> <span style='color: #949494;'># Database: DuckDB v1.0.0</span> <span style='font-weight: bold;'>film</span> <span style='font-weight: bold;'>n_won</span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><dbl></span> <span style='color: #BCBCBC;'>1</span> The Incredibles 2 <span style='color: #BCBCBC;'>2</span> WALL-E 1 <span style='color: #BCBCBC;'>3</span> Toy Story 3 2 <span style='color: #BCBCBC;'>4</span> Coco 2 <span style='color: #BCBCBC;'>5</span> Up 2 <span style='color: #BCBCBC;'>6</span> Monsters, Inc. 1 <span style='color: #949494;'># ℹ more rows</span> </CODE></PRE> ] .pull-right[ ``` r academy_won |> right_join(pixar_films, join_by(film)) ``` <PRE class="fansi fansi-output"><CODE><span style='color: #949494;'># Source: SQL [?? x 6]</span> <span style='color: #949494;'># Database: DuckDB v1.0.0</span> <span style='font-weight: bold;'>film</span> <span style='font-weight: bold;'>n_won</span> <span style='font-weight: bold;'>number</span> <span style='font-weight: bold;'>release_…¹</span> <span style='font-weight: bold;'>run_t…²</span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><dbl></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><date></span> <span style='color: #949494; font-style: italic;'><dbl></span> <span style='color: #BCBCBC;'>1</span> Monsters, I… 1 4 2001-11-02 92 <span style='color: #BCBCBC;'>2</span> Finding Nemo 1 5 2003-05-30 100 <span style='color: #BCBCBC;'>3</span> The Incredi… 2 6 2004-11-05 115 <span style='color: #BCBCBC;'>4</span> Ratatouille 1 8 2007-06-29 111 <span style='color: #BCBCBC;'>5</span> WALL-E 1 9 2008-06-27 98 <span style='color: #BCBCBC;'>6</span> Up 2 10 2009-05-29 96 <span style='color: #949494;'># ℹ more rows</span> <span style='color: #949494;'># ℹ abbreviated names: ¹​release_date, …</span> </CODE></PRE> ] <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_13.R"><title>Script: db_13.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_13.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_13.R</text></g></svg></div> --- # Join with postprocessing .pull-left[ ## Raw result ``` r academy_won |> right_join(pixar_films, join_by(film)) |> arrange(release_date) ``` <PRE class="fansi fansi-output"><CODE><span style='color: #949494;'># Source: SQL [?? x 6]</span> <span style='color: #949494;'># Database: DuckDB v1.0.0</span> <span style='color: #949494;'># Ordered by: release_date</span> <span style='font-weight: bold;'>film</span> <span style='font-weight: bold;'>n_won</span> <span style='font-weight: bold;'>number</span> <span style='font-weight: bold;'>release_…¹</span> <span style='font-weight: bold;'>run_t…²</span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><dbl></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><date></span> <span style='color: #949494; font-style: italic;'><dbl></span> <span style='color: #BCBCBC;'>1</span> Toy Story <span style='color: #BB0000;'>NA</span> 1 1995-11-22 81 <span style='color: #BCBCBC;'>2</span> A Bug's Life <span style='color: #BB0000;'>NA</span> 2 1998-11-25 95 <span style='color: #BCBCBC;'>3</span> Toy Story 2 <span style='color: #BB0000;'>NA</span> 3 1999-11-24 92 <span style='color: #BCBCBC;'>4</span> Monsters, I… 1 4 2001-11-02 92 <span style='color: #BCBCBC;'>5</span> Finding Nemo 1 5 2003-05-30 100 <span style='color: #BCBCBC;'>6</span> The Incredi… 2 6 2004-11-05 115 <span style='color: #949494;'># ℹ more rows</span> <span style='color: #949494;'># ℹ abbreviated names: ¹​release_date, …</span> </CODE></PRE> ] .pull-right[ ## After postprocessing ``` r academy_won |> right_join(pixar_films, join_by(film)) |> mutate(n_won = coalesce(n_won, 0L)) |> arrange(release_date) ``` <PRE class="fansi fansi-output"><CODE><span style='color: #949494;'># Source: SQL [?? x 6]</span> <span style='color: #949494;'># Database: DuckDB v1.0.0</span> <span style='color: #949494;'># Ordered by: release_date</span> <span style='font-weight: bold;'>film</span> <span style='font-weight: bold;'>n_won</span> <span style='font-weight: bold;'>number</span> <span style='font-weight: bold;'>release_…¹</span> <span style='font-weight: bold;'>run_t…²</span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><dbl></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><date></span> <span style='color: #949494; font-style: italic;'><dbl></span> <span style='color: #BCBCBC;'>1</span> Toy Story 0 1 1995-11-22 81 <span style='color: #BCBCBC;'>2</span> A Bug's Life 0 2 1998-11-25 95 <span style='color: #BCBCBC;'>3</span> Toy Story 2 0 3 1999-11-24 92 <span style='color: #BCBCBC;'>4</span> Monsters, I… 1 4 2001-11-02 92 <span style='color: #BCBCBC;'>5</span> Finding Nemo 1 5 2003-05-30 100 <span style='color: #BCBCBC;'>6</span> The Incredi… 2 6 2004-11-05 115 <span style='color: #949494;'># ℹ more rows</span> <span style='color: #949494;'># ℹ abbreviated names: ¹​release_date, …</span> </CODE></PRE> ] <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_13.R"><title>Script: db_13.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_13.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_13.R</text></g></svg></div> --- # Join with processing Computed on the database, original data unchanged. .pull-left[ ``` r pixar_films |> left_join(academy_won, join_by(film)) |> mutate(n_won = coalesce(n_won, 0L)) |> arrange(release_date) |> show_query() ``` ] .pull-right[ <PRE class="fansi fansi-output"><CODE><SQL> <span style='color: #0000BB;'>SELECT</span> number, film, release_date, run_time, film_rating, COALESCE(n_won, 0)<span style='color: #0000BB;'> AS </span>n_won <span style='color: #0000BB;'>FROM</span> ( <span style='color: #0000BB;'>SELECT</span> pixar_films.*, n_won <span style='color: #0000BB;'>FROM</span> pixar_films <span style='color: #0000BB;'>LEFT JOIN</span> ( <span style='color: #0000BB;'>SELECT</span> film, COUNT(*)<span style='color: #0000BB;'> AS </span>n_won <span style='color: #0000BB;'>FROM</span> ( <span style='color: #0000BB;'>SELECT</span> academy.* <span style='color: #0000BB;'>FROM</span> academy <span style='color: #0000BB;'>WHERE</span> (status = 'Won') ) q01 <span style='color: #0000BB;'>GROUP BY</span> film ) RHS <span style='color: #0000BB;'>ON</span> (pixar_films.film = RHS.film) ) q01 <span style='color: #0000BB;'>ORDER BY</span> release_date </CODE></PRE> ] <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_13.R"><title>Script: db_13.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_13.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_13.R</text></g></svg></div> --- # Tables must be on the same source Use `copy = TRUE` to enforce, the result is a lazy table if the LHS is a lazy table. .pull-left[ ``` r academy |> left_join( pixarfilms::pixar_films, join_by(film)) ``` <PRE class="fansi fansi-error"><CODE><span style='color: #BBBB00; font-weight: bold;'>Error</span><span style='font-weight: bold;'> in `auto_copy()`:</span> <span style='color: #BBBB00;'>!</span> `x` and `y` must share the same src. <span style='color: #00BBBB;'>ℹ</span> `x` is a <tbl_duckdb_connection/tbl_dbi/tbl_sql/tbl_lazy/tbl> object. <span style='color: #00BBBB;'>ℹ</span> `y` is a <tbl_df/tbl/data.frame> object. <span style='color: #00BBBB;'>ℹ</span> Set `copy = TRUE` if `y` can be copied to the same source as `x` (may be slow). </CODE></PRE> ] .pull-right[ ``` r academy |> left_join( pixarfilms::pixar_films, join_by(film), copy = TRUE) ``` <PRE class="fansi fansi-output"><CODE><span style='color: #949494;'># Source: SQL [?? x 7]</span> <span style='color: #949494;'># Database: DuckDB v1.0.0</span> <span style='font-weight: bold;'>film</span> <span style='font-weight: bold;'>award_t…¹</span> <span style='font-weight: bold;'>status</span> <span style='font-weight: bold;'>number</span> <span style='font-weight: bold;'>release_…²</span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><date></span> <span style='color: #BCBCBC;'>1</span> Toy Story Animated… Award… 1 1995-11-22 <span style='color: #BCBCBC;'>2</span> Toy Story Original… Nomin… 1 1995-11-22 <span style='color: #BCBCBC;'>3</span> Toy Story Adapted … Ineli… 1 1995-11-22 <span style='color: #BCBCBC;'>4</span> Toy Story Original… Nomin… 1 1995-11-22 <span style='color: #BCBCBC;'>5</span> Toy Story Original… Nomin… 1 1995-11-22 <span style='color: #BCBCBC;'>6</span> Toy Story Other Won S… 1 1995-11-22 <span style='color: #949494;'># ℹ more rows</span> <span style='color: #949494;'># ℹ abbreviated names: ¹​award_type, …</span> </CODE></PRE> ] <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_13.R"><title>Script: db_13.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_13.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_13.R</text></g></svg></div> --- # Copying is expensive! A temporary table is created on the LHS database. If the RHS comes from a different database, results are temporarily loaded into the local session! ``` r academy |> left_join(pixarfilms::pixar_films, join_by(film), copy = TRUE) |> show_query() ``` <PRE class="fansi fansi-output"><CODE><SQL> <span style='color: #0000BB;'>SELECT</span> academy.*, number, release_date, run_time, film_rating <span style='color: #0000BB;'>FROM</span> academy <span style='color: #0000BB;'>LEFT JOIN</span> dbplyr_ZoLmXZFBoc <span style='color: #0000BB;'>ON</span> (academy.film = dbplyr_ZoLmXZFBoc.film) </CODE></PRE> <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_13.R"><title>Script: db_13.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_13.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_13.R</text></g></svg></div> --- # Joining data frames with lazy tables The result is a data frame too. .pull-left[ ``` r pixarfilms::academy |> left_join(pixar_films, join_by(film)) ``` <PRE class="fansi fansi-error"><CODE><span style='color: #BBBB00; font-weight: bold;'>Error</span><span style='font-weight: bold;'> in `auto_copy()`:</span> <span style='color: #BBBB00;'>!</span> `x` and `y` must share the same src. <span style='color: #00BBBB;'>ℹ</span> `x` is a <tbl_df/tbl/data.frame> object. <span style='color: #00BBBB;'>ℹ</span> `y` is a <tbl_duckdb_connection/tbl_dbi/tbl_sql/tbl_lazy/tbl> object. <span style='color: #00BBBB;'>ℹ</span> Set `copy = TRUE` if `y` can be copied to the same source as `x` (may be slow). </CODE></PRE> ] .pull-right[ ``` r pixarfilms::academy |> left_join(pixar_films, join_by(film), copy = TRUE) ``` <PRE class="fansi fansi-output"><CODE><span style='color: #949494;'># A tibble: 80 × 7</span> <span style='font-weight: bold;'>film</span> <span style='font-weight: bold;'>award_t…¹</span> <span style='font-weight: bold;'>status</span> <span style='font-weight: bold;'>number</span> <span style='font-weight: bold;'>release_…²</span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><chr></span> <span style='color: #949494; font-style: italic;'><date></span> <span style='color: #BCBCBC;'>1</span> Toy Story Animated… Award… 1 1995-11-22 <span style='color: #BCBCBC;'>2</span> Toy Story Original… Nomin… 1 1995-11-22 <span style='color: #BCBCBC;'>3</span> Toy Story Adapted … Ineli… 1 1995-11-22 <span style='color: #BCBCBC;'>4</span> Toy Story Original… Nomin… 1 1995-11-22 <span style='color: #BCBCBC;'>5</span> Toy Story Original… Nomin… 1 1995-11-22 <span style='color: #BCBCBC;'>6</span> Toy Story Other Won S… 1 1995-11-22 <span style='color: #949494;'># ℹ 74 more rows</span> <span style='color: #949494;'># ℹ abbreviated names: ¹​award_type, …</span> </CODE></PRE> ] <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_13.R"><title>Script: db_13.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_13.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_13.R</text></g></svg></div> --- background-image: url("data:image/png;base64,#images/12-frame.webp") background-size: 40% background-position: 100% 100% # Downsizing on the database: Exercises 3 1. How many films are stored in the table? 2. How many films released after 2005 are stored in the table? 3. What is the total run time of all films? - Hint: Use `summarize(sum(...))`, watch out for the warning 4. What is the total run time of all films, per rating? - Hint: Use `.by` <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_13.R"><title>Script: db_13.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_13.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_13.R</text></g></svg></div> <div class="exercise-timer"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="84" height="20" role="img" aria-label="Timer: 10:00"><title>Timer: 10:00</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="84" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="41" height="20" fill="#555"/><rect x="41" width="43" height="20" fill="#007ec6"/><rect width="84" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="215" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="310">Timer</text><text x="215" y="140" transform="scale(.1)" fill="#fff" textLength="310">Timer</text><text aria-hidden="true" x="615" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">10:00</text><text x="615" y="140" transform="scale(.1)" fill="#fff" textLength="330">10:00</text></g></svg></div> --- background-image: url("data:image/png;base64,#images/12-frame.webp") background-size: 40% background-position: 100% 100% # Downsizing on the database: Exercises 4 .pull-left[ 1. How many rows does the join between `academy` and `pixar_films` contain? Try to find out without loading all the data into memory. Explain. 2. Which films are not yet listed in the `academy` table? What does the resulting SQL query look like? - Hint: Use `anti_join()` ] <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_13.R"><title>Script: db_13.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_13.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_13.R</text></g></svg></div> <div class="exercise-timer"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="84" height="20" role="img" aria-label="Timer: 10:00"><title>Timer: 10:00</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="84" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="41" height="20" fill="#555"/><rect x="41" width="43" height="20" fill="#007ec6"/><rect width="84" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="215" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="310">Timer</text><text x="215" y="140" transform="scale(.1)" fill="#fff" textLength="310">Timer</text><text aria-hidden="true" x="615" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">10:00</text><text x="615" y="140" transform="scale(.1)" fill="#fff" textLength="330">10:00</text></g></svg></div> --- background-image: url("data:image/png;base64,#images/12-frame.webp") background-size: 40% background-position: 100% 100% # Downsizing on the database: Exercises 5 1. Plot a bar chart with the number of awards won and nominated per year. Compute as much as possible on the database. - Hint: "Long form" or "wide form"? <div class="script-number"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="100" height="20" role="img" aria-label="Script: db_13.R"><title>Script: db_13.R</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="100" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="43" height="20" fill="#555"/><rect x="43" width="57" height="20" fill="#fe7d37"/><rect width="100" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="225" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">Script</text><text x="225" y="140" transform="scale(.1)" fill="#fff" textLength="330">Script</text><text aria-hidden="true" x="705" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="470">db_13.R</text><text x="705" y="140" transform="scale(.1)" fill="#fff" textLength="470">db_13.R</text></g></svg></div> <div class="exercise-timer"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="84" height="20" role="img" aria-label="Timer: 10:00"><title>Timer: 10:00</title><linearGradient id="s" x2="0" y2="100%"><stop offset="0" stop-color="#bbb" stop-opacity=".1"/><stop offset="1" stop-opacity=".1"/></linearGradient><clipPath id="r"><rect width="84" height="20" rx="3" fill="#fff"/></clipPath><g clip-path="url(#r)"><rect width="41" height="20" fill="#555"/><rect x="41" width="43" height="20" fill="#007ec6"/><rect width="84" height="20" fill="url(#s)"/></g><g fill="#fff" text-anchor="middle" font-family="Verdana,Geneva,DejaVu Sans,sans-serif" text-rendering="geometricPrecision" font-size="110"><text aria-hidden="true" x="215" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="310">Timer</text><text x="215" y="140" transform="scale(.1)" fill="#fff" textLength="310">Timer</text><text aria-hidden="true" x="615" y="150" fill="#010101" fill-opacity=".3" transform="scale(.1)" textLength="330">10:00</text><text x="615" y="140" transform="scale(.1)" fill="#fff" textLength="330">10:00</text></g></svg></div> --- class: inverse, middle, center # https://posit-conf-2024.github.io/databases/part-2.html