library(dplyr)
library(dbplyr)
library(DBI)
con <- dbConnect(
odbc::databricks(),
HTTPPath = "/sql/1.0/warehouses/300bd24ba12adf8e"
)8 Working with LLMs in Databricks
Catch up
8.1 Accessing AI functions
Use the sentiment classification function
- Create a quick review table using the following code:
reviews <- tribble(
~name, ~review,
"adam", "This is the best toaster I have ever bought",
"berry", "Toaster arrived broken, waiting for replancement",
"charles", "The washing machine is as advertised, can't wait to use it",
"dan", "Not sure how to feel about this tevelision, nice brightness but bad definition"
) |>
select(review) - Copy the
reviewsdata frame to your SQL session. Assign it to a variable calledtbl_reviews
tbl_reviews <- copy_to(con, reviews, overwrite = TRUE)- Create a new field called “sentiment”, use
ai_analyze_sentiment()to analyze the “review” field
tbl_reviews |>
mutate(sentiment = ai_analyze_sentiment(review))
#> # Source: SQL [4 x 2]
#> # Database: Spark SQL 3.1.1[token@Spark SQL/hive_metastore]
#> review sentiment
#> <chr> <chr>
#> 1 This is the best toaster I have ever bought positive
#> 2 Toaster arrived broken, waiting for replancement negative
#> 3 The washing machine is as advertised, cant wait to use it positive
#> 4 Not sure how to feel about this tevelision, nice brightness but b… mixed8.2 Specify array
Using array() to run the classification function
- Use
ai_classify()to find out if we need to follow up with customer. The two options should be: ‘order complete’, and ‘need follow up’. Usearray()as if you would be using thec()function. Name the new field “follow_up”
tbl_reviews |>
mutate(
follow_up = ai_classify(review, array("order complete", "need follow up"))
)
#> # Source: SQL [4 x 2]
#> # Database: Spark SQL 3.1.1[token@Spark SQL/hive_metastore]
#> review follow_up
#> <chr> <chr>
#> 1 This is the best toaster I have ever bought order complete
#> 2 Toaster arrived broken, waiting for replancement need follow up
#> 3 The washing machine is as advertised, cant wait to use it order complete
#> 4 Not sure how to feel about this tevelision, nice brightness … need follow up- Add a step that keeps only those orders that need follow up
tbl_reviews |>
mutate(
follow_up = ai_classify(review, array("order complete", "need follow up"))
) |>
filter(follow_up == "need follow up")
#> # Source: SQL [2 x 2]
#> # Database: Spark SQL 3.1.1[token@Spark SQL/hive_metastore]
#> review follow_up
#> <chr> <chr>
#> 1 Toaster arrived broken, waiting for replancement need follow up
#> 2 Not sure how to feel about this tevelision, nice brightness … need follow up8.3 Process complex output
Working STRUCT output from an ‘ai’ function
- Use
ai_extract()to pull the type of product being referred to in the review. Pass ‘product’ as the extract argument, and pass it inside anarray()call. Name the new field “product”
tbl_reviews |>
mutate(product = ai_extract(review, array("product")))
#> # Source: SQL [4 x 2]
#> # Database: Spark SQL 3.1.1[token@Spark SQL/hive_metastore]
#> review product
#> <chr> <chr>
#> 1 This is the best toaster I have ever bought "{\"product\":\"toa…
#> 2 Toaster arrived broken, waiting for replancement "{\"product\":\"Toa…
#> 3 The washing machine is as advertised, cant wait to use… "{\"product\":\"was…
#> 4 Not sure how to feel about this tevelision, nice brigh… "{\"product\":\"tel…- Append a
compute()step, and assign to a new variable calledtbl_review
tbl_review <- tbl_reviews |>
mutate(product = ai_extract(review, array("product"))) |>
compute()- Preview
tbl_review
tbl_review
#> # Source: table<`dbplyr_MjqcmuPoJy`> [4 x 2]
#> # Database: Spark SQL 3.1.1[token@Spark SQL/hive_metastore]
#> review product
#> <chr> <chr>
#> 1 This is the best toaster I have ever bought "{\"product\":\"toa…
#> 2 Toaster arrived broken, waiting for replancement "{\"product\":\"Toa…
#> 3 The washing machine is as advertised, cant wait to use… "{\"product\":\"was…
#> 4 Not sure how to feel about this tevelision, nice brigh… "{\"product\":\"tel…- Pass
tbl_reviewtoshow_query()to confirm that it is pulling from a new temporary table
tbl_review |>
show_query()
#> <SQL>
#> SELECT *
#> FROM `dbplyr_MjqcmuPoJy`- Coerce “product” to a character
tbl_review |>
mutate(product = as.character(product))
#> # Source: SQL [4 x 2]
#> # Database: Spark SQL 3.1.1[token@Spark SQL/hive_metastore]
#> review product
#> <chr> <chr>
#> 1 This is the best toaster I have ever bought {toaster}
#> 2 Toaster arrived broken, waiting for replancement {Toaster}
#> 3 The washing machine is as advertised, cant wait to use it {washing machine}
#> 4 Not sure how to feel about this tevelision, nice brightne… {television}- Wrap the
as.character()call, inside atolower()call
tbl_review |>
mutate(product = tolower(as.character(product)))
#> # Source: SQL [4 x 2]
#> # Database: Spark SQL 3.1.1[token@Spark SQL/hive_metastore]
#> review product
#> <chr> <chr>
#> 1 This is the best toaster I have ever bought {toaster}
#> 2 Toaster arrived broken, waiting for replancement {toaster}
#> 3 The washing machine is as advertised, cant wait to use it {washing machine}
#> 4 Not sure how to feel about this tevelision, nice brightne… {television}- Add a count step, that breaks down the reviews by product
tbl_review |>
mutate(product = tolower(as.character(product))) |>
count(product)
#> # Source: SQL [3 x 2]
#> # Database: Spark SQL 3.1.1[token@Spark SQL/hive_metastore]
#> product n
#> <chr> <int64>
#> 1 {toaster} 2
#> 2 {washing machine} 1
#> 3 {television} 18.4 Introducing chattr
Request a simple example from LLM
- Load the
chattrlibrary
library(chattr)- Call
chattr_app()and then select the first model (DBRX)
chattr_app()In the app type, and run: “show me a simple example of a ggplot using the mtcars data set”
Copy the code into a script and test