Package: vitals 0.3.0.9000

Simon Couch

vitals: Large Language Model Evaluation

A port of 'Inspect', a widely adopted 'Python' framework for large language model evaluation. Specifically aimed at 'ellmer' users who want to measure the effectiveness of their large language model-based products, the package supports prompt engineering, tool usage, multi-turn dialog, and model-graded evaluations.

Authors: Simon Couch [aut, cre], Max Kuhn [ctb], Hadley Wickham [ctb], Mine Cetinkaya-Rundel [ctb], Posit Software, PBC [cph, fnd]

vitals_0.3.0.9000.tar.gz
vitals_0.3.0.9000.zip (r-4.7), vitals_0.3.0.9000.zip (r-4.6), vitals_0.3.0.9000.zip (r-4.5)
vitals_0.3.0.9000.tgz (r-4.6-any), vitals_0.3.0.9000.tgz (r-4.5-any)
vitals_0.3.0.9000.tar.gz (r-4.7-any), vitals_0.3.0.9000.tar.gz (r-4.6-any)
vitals_0.3.0.9000.tgz (r-4.6-emscripten)

# Install 'vitals' in R:
install.packages('vitals', repos = c('https://tidyverse.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/tidyverse/vitals/issues

Pkgdown/docs site: https://vitals.tidyverse.org

Datasets:
  • are - An R Eval

7.65 score, 54 stars, 76 scripts, 457 downloads, 15 exports, 36 dependencies

Last updated from: de58d99bc6. Checks: 9 OK. Indexed: yes.

Target                 Result  Time
linux-devel-x86_64     OK      150
source / vignettes     OK      203
linux-release-x86_64   OK      153
macos-release-arm64    OK      134
macos-oldrel-arm64     OK      87
windows-devel          OK      129
windows-release        OK      94
windows-oldrel         OK      102
wasm-release           OK      126

Exports: detect_answer, detect_exact, detect_includes, detect_match, detect_pattern, generate, generate_structured, model_graded_fact, model_graded_qa, Task, vitals_bind, vitals_bundle, vitals_log_dir, vitals_log_dir_set, vitals_view

Dependencies: askpass, cli, coro, cpp11, curl, dplyr, ellmer, fastmap, generics, glue, httpuv, httr2, jsonlite, later, lifecycle, magrittr, openssl, otel, pillar, pkgconfig, promises, purrr, R6, rappdirs, Rcpp, rlang, S7, stringi, stringr, sys, tibble, tidyr, tidyselect, utf8, vctrs, withr

Readme and manuals

Help Manual

Help page: Topics
  • An R Eval: are
  • Convert a chat to a solver function: generate
  • Convert a chat to a solver function with structured output: generate_structured
  • Scoring with string detection: detect_answer, detect_exact, detect_includes, detect_match, detect_pattern, scorer_detect
  • Model-based scoring: model_graded_fact, model_graded_qa, scorer_model
  • Creating and evaluating tasks: Task
  • Concatenate task samples for analysis: vitals_bind
  • Prepare logs for deployment: vitals_bundle
  • The log directory: vitals_log_dir, vitals_log_dir_set
  • Interactively view local evaluation logs: vitals_view