2 - Deploy your model

Intro to MLOps with vetiver

Plan for this workshop

  • Versioning
    • Managing change in models ✅
  • Deploying
    • Putting models in REST APIs 🎯
  • Monitoring
    • Tracking model performance 👀

Fit a random forest 🌳🌴🌲🌲🌴🌳🌴🌳🌲

R

library(tidyverse)
library(tidymodels)
library(arrow)

path <- here::here("data", "housing.parquet")
housing <- read_parquet(path)

# Model log10(price) and hold out 20% of the rows for testing
set.seed(123)
housing_split <- housing |>
  mutate(price = log10(price)) |>
  initial_split(prop = 0.8)
housing_train <- training(housing_split)
housing_test <- testing(housing_split)

housing_fit <-
  workflow(
    price ~ bedrooms + bathrooms + sqft_living + yr_built,
    rand_forest(trees = 200, mode = "regression")
  ) |>
  fit(data = housing_train)

Python

import pandas as pd
import numpy as np
from sklearn import preprocessing, ensemble, pipeline, compose, model_selection

housing = pd.read_parquet("../data/housing.parquet", engine="pyarrow")

# Same four predictors as the R model; the outcome is log10(price)
X = housing[["bedrooms", "bathrooms", "sqft_living", "yr_built"]]
y = np.log10(housing["price"])
X_train, X_test, y_train, y_test = model_selection.train_test_split(X, y, test_size=0.2)

housing_fit = ensemble.RandomForestRegressor(n_estimators=200).fit(X_train, y_train)

Create your vetiver model and version

R

library(vetiver)
library(pins)

v <- vetiver_model(housing_fit, "user.name/seattle-housing-rstats")
board <- board_connect()
board |> vetiver_pin_write(v)

Python

from vetiver import VetiverModel, vetiver_pin_write
from pins import board_connect
from dotenv import load_dotenv
load_dotenv()

v = VetiverModel(housing_fit, "user.name/seattle-housing-python", prototype_data = X_train)
board = board_connect(allow_pickle_read = True)
vetiver_pin_write(board, v)

Make it easy to do the right thing

  • Robust and human-friendly checking of new data
  • Track and document software dependencies of models
  • Model cards for transparent, responsible reporting

Your turn 🏺

Activity

Open the Model Card template, and spend a few minutes exploring how you might create a Model Card for this housing model.

Discuss something you notice about the Model Card with your neighbor.

05:00

You can deploy your model as a…

REST API

What is a REST API?

An interface that can connect applications in a standard way
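
To make "a standard way" concrete, here is a minimal sketch that uses only Python's standard library: a tiny server with a single GET /ping endpoint that returns JSON, and a client that calls it over HTTP. The handler name, the response body, and the use of a random local port are illustrative assumptions; this is not the code vetiver generates for you.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen


class PingHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers GET /ping with a small JSON document."""

    def do_GET(self):
        if self.path == "/ping":
            body = json.dumps({"status": "online"}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def ping_local_server():
    # Port 0 asks the operating system for any free port
    server = HTTPServer(("127.0.0.1", 0), PingHandler)
    port = server.server_address[1]
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        # Any HTTP client, in any language, could make this same request
        with urlopen(f"http://127.0.0.1:{port}/ping") as response:
            return response.status, json.loads(response.read())
    finally:
        server.shutdown()
        server.server_close()


status, payload = ping_local_server()
print(status, payload)
```

The client only needs a URL, an HTTP method, and a shared data format (JSON here), which is why an R client can call a Python API and vice versa.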

Create a vetiver REST API

R

library(plumber)

pr() |>
  vetiver_api(v) |>
  pr_run()

Python

from vetiver import VetiverAPI

api = VetiverAPI(v)
api.run()

Your turn 🏺

Activity

Create a vetiver API for your model and run it locally.

Explore the visual documentation.

How many endpoints are there?

Discuss what you notice with your neighbor.

07:00

What does “deploy” mean?

Where does vetiver work?

  • Posit’s pro products, like Connect

  • AWS SageMaker (R only, for now)

  • A public or private cloud, using Docker

Deploy to Posit Connect

R

vetiver_deploy_rsconnect(board, "user.name/seattle-housing-rstats")

Python

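import os

import pins
import vetiver
from rsconnect.api import RSConnectServer

# Assumes CONNECT_SERVER and CONNECT_API_KEY are set in your environment (or .env file)
rsc_url = os.getenv("CONNECT_SERVER")
api_key = os.getenv("CONNECT_API_KEY")

connect_server = RSConnectServer(url = rsc_url, api_key = api_key)
board = pins.board_connect(allow_pickle_read = True)

vetiver.deploy_rsconnect(
    connect_server = connect_server,
    board = board,
    pin_name = "user.name/seattle-housing-python",
)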

Your turn 🏺

Activity

Deploy your model to your Posit Connect server.

Give your API a vanity URL.

Set your API as accessible to “Anyone”, for convenience.

Compare the results to your local API. Is anything different?

If you visit Connect, do you see your neighbor’s API?

07:00

You did it! 🥳

How do you make a request of your new API?

R

url <- "https://pub.demo.posit.team/public/seattle-housing-rstats/metadata"
r <- httr::GET(url)
metadata <- httr::content(r, as = "text", encoding = "UTF-8")
jsonlite::fromJSON(metadata)
#> $user
#> list()
#> 
#> $version
#> [1] "356"
#> 
#> $url
#> [1] "https://pub.demo.posit.team/content/af63b874-734d-4f31-af4a-7afe3ee319ba/_rev356/"
#> 
#> $required_pkgs
#> [1] "parsnip"   "ranger"    "workflows"

Python

import requests

url = "https://pub.demo.posit.team/public/seattle-housing-python/metadata"
print(requests.get(url).content)
#> b'{"user":{},"version":"362","url":"https://pub.demo.posit.team/content/7189ede9-7720-47f1-a783-0e3ed835a7f0/","required_pkgs":["scikit-learn"],"python_version":[3,12,4,"final",0]}'
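
The Python response above is printed as raw bytes, which is hard to read. As a small sketch, the same bytes (copied from the output above) can be parsed into a dictionary with the standard library; with requests, calling `.json()` on the response should give the same result.

```python
import json

# Raw bytes exactly as printed by requests.get(url).content above
raw = b'{"user":{},"version":"362","url":"https://pub.demo.posit.team/content/7189ede9-7720-47f1-a783-0e3ed835a7f0/","required_pkgs":["scikit-learn"],"python_version":[3,12,4,"final",0]}'

metadata = json.loads(raw)  # json.loads accepts UTF-8 encoded bytes directly

# The first three entries of python_version are major, minor, and micro
python_version = ".".join(str(part) for part in metadata["python_version"][:3])

print(metadata["version"])
print(metadata["required_pkgs"])
print(python_version)
```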

How do you make a request of your new API?

  • Python or R packages like requests or httr (or httr2!)
  • curl
  • There is special support in vetiver for the /predict endpoint

Create a vetiver endpoint

You can treat your model API much like it is a local model in memory!

R

library(vetiver)

url <- "https://pub.demo.posit.team/public/seattle-housing-rstats/predict"
endpoint <- vetiver_endpoint(url)
predict(endpoint, slice_sample(housing_test, n = 5))
#> # A tibble: 5 × 1
#>   .pred
#>   <dbl>
#> 1  5.89
#> 2  5.60
#> 3  5.38
#> 4  6.16
#> 5  5.56

Python

from vetiver.server import predict, vetiver_endpoint

url = "https://pub.demo.posit.team/public/seattle-housing-python/predict"
endpoint = vetiver_endpoint(url)
predict(endpoint = endpoint, data = X_test.head(5))
#>     predict
#> 0  5.680102
#> 1  5.714398
#> 2  5.561758
#> 3  5.763545
#> 4  5.451287
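
Both models were trained on log10(price), so the API returns predictions on the log10 scale. A short sketch of converting the Python predictions shown above back to dollars (the values are copied from that output):

```python
# Predictions on the log10(price) scale, copied from the output above
predictions_log10 = [5.680102, 5.714398, 5.561758, 5.763545, 5.451287]

# Undo the log10 transform applied before training: price = 10 ** prediction
prices = [10 ** p for p in predictions_log10]

for log_price, price in zip(predictions_log10, prices):
    print(f"{log_price:.3f} -> ${price:,.0f}")
```

A difference of 0.3 on this scale is roughly a factor of two in price, which is worth keeping in mind when comparing predictions.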

Your turn 🏺

Activity

Create a vetiver endpoint object for your API.

Predict with your endpoint for new data.

Optional: call another endpoint like /ping or /metadata.

10:00

Your turn 🏺

Activity

Create a vetiver endpoint object for your neighbor’s API.

Predict with your endpoint for new data.

You get extra credit if your neighbor’s model is in a different language than yours!

05:00

Create a vetiver endpoint

What if your model API requires authentication?

R

library(vetiver)

url <- "https://pub.demo.posit.team/public/seattle-housing-rstats/predict"
endpoint <- vetiver_endpoint(url)
predict(endpoint, slice_sample(housing_test, n = 10))

Python

from vetiver.server import predict, vetiver_endpoint

url = "https://pub.demo.posit.team/public/seattle-housing-python/predict"
endpoint = vetiver_endpoint(url)
predict(endpoint = endpoint, data = X_test)

Create a vetiver endpoint

What if your model API requires authentication?

R

library(vetiver)

url <- "https://pub.demo.posit.team/public/seattle-housing-rstats/predict"
endpoint <- vetiver_endpoint(url)
apiKey <- Sys.getenv("CONNECT_API_KEY")
predict(endpoint, slice_sample(housing_test, n = 10), 
        httr::add_headers(Authorization = paste("Key", apiKey)))

Python

import os
from vetiver.server import predict, vetiver_endpoint

url = "https://pub.demo.posit.team/public/seattle-housing-python/predict"
endpoint = vetiver_endpoint(url)
api_key = os.getenv("CONNECT_API_KEY")  # read the key from the environment, never hard-code it
h = { 'Authorization': f'Key {api_key}' }
predict(endpoint = endpoint, data = X_test, headers = h)
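
If you are calling the API without vetiver (for example from another service), the same `Authorization: Key ...` header goes on a plain HTTP request. The sketch below builds such a request with the standard library but does not send it. The helper name `build_predict_request` and the placeholder key are hypothetical, and sending the rows as a JSON array of records is an assumption about the payload; check your API's visual documentation for the exact shape it expects.

```python
import json
from urllib.request import Request


def build_predict_request(url, records, api_key):
    """Hypothetical helper: build (but do not send) an authenticated POST request."""
    body = json.dumps(records).encode("utf-8")
    headers = {
        "Authorization": f"Key {api_key}",  # same header as in the examples above
        "Content-Type": "application/json",
    }
    return Request(url, data=body, headers=headers, method="POST")


request = build_predict_request(
    "https://pub.demo.posit.team/public/seattle-housing-python/predict",
    # One row with the model's four inputs (assumed payload shape)
    [{"bedrooms": 4, "bathrooms": 2.5, "sqft_living": 3520, "yr_built": 1998}],
    api_key="not-a-real-key",
)

print(request.get_method(), request.full_url)
print(request.get_header("Authorization"))
```

Passing the request to `urllib.request.urlopen` would send it; in practice, read the key from an environment variable such as CONNECT_API_KEY instead of typing it into code.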

Model input prototype

inputs ➡️ outputs

Model input prototype

What are the inputs to your model?

R

glimpse(housing_train)
#> Rows: 11,706
#> Columns: 9
#> $ price       <dbl> 5.569374, 5.676694, 5.440909, 5.597695, 5.491362, 5.712229…
#> $ date        <date> 2014-06-10, 2014-06-10, 2014-10-08, 2014-09-10, 2014-11-1…
#> $ bedrooms    <int> 3, 3, 2, 5, 3, 4, 3, 3, 5, 5, 2, 3, 3, 4, 4, 3, 4, 3, 4, 4…
#> $ bathrooms   <dbl> 1.00, 2.25, 1.00, 2.75, 1.50, 2.50, 1.50, 1.75, 3.00, 2.25…
#> $ sqft_living <int> 890, 1630, 870, 2840, 1140, 2920, 1320, 1400, 3190, 2710, …
#> $ yr_built    <int> 1951, 2005, 2004, 1960, 1988, 2003, 1970, 1925, 2013, 1955…
#> $ waterfront  <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FA…
#> $ lat         <dbl> 47.7100, 47.5493, 47.5702, 47.7618, 47.5701, 47.7351, 47.5…
#> $ long        <dbl> -122.286, -121.998, -122.287, -122.253, -122.017, -121.975…

Python

X_train
#>        bedrooms  bathrooms  sqft_living  yr_built
#> 7215          4       2.50         3520      1998
#> 9863          3       1.75         2200      1964
#> 12777         3       2.50         1600      2005
#> 1253          3       1.00         1070      1951
#> 8703          4       2.50         2540      1984
#> ...         ...        ...          ...       ...
#> 5612          4       1.75         1730      1949
#> 14434         4       2.25         2280      1977
#> 1566          4       2.75         3650      1951
#> 10201         3       1.00         1520      1953
#> 3131          4       2.25         2870      1926
#> 
#> [11706 rows x 4 columns]

Your turn 🏺

Activity

Call the prototype endpoints for both the Python and R models.

How do they compare?

05:00

Model input prototype

R

url <- "https://pub.demo.posit.team/public/seattle-housing-rstats/prototype"
r <- httr::GET(url)
prototype <- httr::content(r, as = "text", encoding = "UTF-8")
jsonlite::fromJSON(prototype)
#> $bedrooms
#> $bedrooms$type
#> [1] "integer"
#> 
#> $bedrooms$example
#> NULL
#> 
#> $bedrooms$details
#> list()
#> 
#> 
#> $bathrooms
#> $bathrooms$type
#> [1] "numeric"
#> 
#> $bathrooms$example
#> NULL
#> 
#> $bathrooms$details
#> list()
#> 
#> 
#> $sqft_living
#> $sqft_living$type
#> [1] "integer"
#> 
#> $sqft_living$example
#> NULL
#> 
#> $sqft_living$details
#> list()
#> 
#> 
#> $yr_built
#> $yr_built$type
#> [1] "integer"
#> 
#> $yr_built$example
#> NULL
#> 
#> $yr_built$details
#> list()

Python

import requests

url = "https://pub.demo.posit.team/public/seattle-housing-python/prototype"
print(requests.get(url).content)
#> b'{"properties":{"bedrooms":{"example":5.0,"type":"number"},"bathrooms":{"example":1.75,"type":"number"},"sqft_living":{"example":2110.0,"type":"number"},"yr_built":{"example":1962.0,"type":"number"}},"required":["bedrooms","bathrooms","sqft_living","yr_built"],"title":"prototype","type":"object"}'

Model input prototype

  • In Python, you supply the model’s input prototype via prototype_data
  • In R, the model input prototype is found automatically in most cases, but you can override this default via save_prototype
  • In both cases, it is ultimately up to you to decide what your API’s inputs should be!
  • The vetiver framework has sensible defaults but is extensible for more complex use cases
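
To build intuition for what an input prototype buys you, here is a hand-rolled sketch that checks a record against the prototype returned by the Python API's /prototype endpoint (the bytes are copied from that output). The `check_record` helper and the type mapping are hypothetical illustrations; vetiver performs its own checking, and this is not its implementation.

```python
import json

# Prototype as returned by the Python API's /prototype endpoint
prototype = json.loads(
    b'{"properties":{"bedrooms":{"example":5.0,"type":"number"},'
    b'"bathrooms":{"example":1.75,"type":"number"},'
    b'"sqft_living":{"example":2110.0,"type":"number"},'
    b'"yr_built":{"example":1962.0,"type":"number"}},'
    b'"required":["bedrooms","bathrooms","sqft_living","yr_built"],'
    b'"title":"prototype","type":"object"}'
)

# Assumed mapping from JSON type names to Python types
JSON_TYPES = {
    "number": (int, float),
    "integer": (int,),
    "string": (str,),
    "boolean": (bool,),
}


def check_record(record, prototype):
    """Hypothetical helper: return a list of human-readable problems."""
    problems = []
    for name in prototype.get("required", []):
        if name not in record:
            problems.append(f"missing required field: {name}")
    for name, value in record.items():
        spec = prototype["properties"].get(name)
        if spec is None:
            problems.append(f"unexpected field: {name}")
            continue
        expected = JSON_TYPES.get(spec.get("type"))
        if expected is None:
            continue  # unknown type name: skip rather than guess
        # bool is a subclass of int in Python, so reject it for numeric fields
        is_bool = isinstance(value, bool)
        if (is_bool and bool not in expected) or not isinstance(value, expected):
            problems.append(
                f"{name} should be {spec['type']}, got {type(value).__name__}"
            )
    return problems


good = {"bedrooms": 3, "bathrooms": 1.0, "sqft_living": 890, "yr_built": 1951}
bad = {"bedrooms": "three", "bathrooms": 1.0, "sqft_living": 890}

print(check_record(good, prototype))
print(check_record(bad, prototype))
```

Checking inputs at the API boundary means a caller gets a clear message about what is wrong with the request, rather than an opaque failure from inside the model.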

Your turn 🏺

Activity

Let’s say you need to customize your model API’s inputs for a more complex use case.

Make a new vetiver model object and change the input data prototype.

Run an API locally for your new vetiver model object and explore the visual documentation. (Note that making predictions will not work now, since we haven’t updated the API behavior to match these inputs.)

Discuss with your neighbor a situation where you might use this.

07:00