Changes in version 0.3.0.9000 Changes in version 0.3.0 (2026-01-23) Breaking changes - In ragnar_find_links(), the default children_only = FALSE now returns all links on a page. If you relied on the previous default, set children_only = TRUE (#115). - ragnar_register_tool_retrieve() now uses search_{store@name} as the default tool name prefix (instead of rag_retrieve_from_{store@name}), so you may need to update any code that refers to the tool name explicitly (#123, #127). New features - New embed_azure_openai() supports embeddings from Azure AI Foundry (#144). - New embed_snowflake() supports embeddings via the Snowflake Cortex Embedding API (#148). - New mcp_serve_store() lets local MCP clients (e.g. Codex CLI or Claude Code) search a RagnarStore (#123). - ragnar_retrieve() (and the corresponding ellmer retrieval tool) now accepts a vector of queries (#150). - New ragnar_store_atlas() visualizes store embeddings (#124). - New ragnar_store_ingest() prepares documents in parallel with mirai and inserts them into a store (#133). Minor improvements and fixes - embed_ollama() now defaults to the embeddinggemma model (#121). - embed_openai() error messages are now surfaced to the user (#112). - Embedding helpers now share a generalized request retry policy, configurable via options(ragnar.embed.req_retry = ...) (#138). - ragnar now requires mirai >= 2.5.1 (#139). - print() on a RagnarStore now shows the store location (#116). - ragnar_retrieve_bm25() now orders results by descending score (#122). - ragnar_retrieve() no longer returns duplicate rows when called with multiple queries (#153). - The ellmer retrieval tool now omits score columns from its output (#130). - ragnar_store_inspect() now includes keyboard shortcuts, a draggable divider, improved preview linkification, better metadata display, and other UI tweaks (#120, #117, #118). - ragnar_find_links() works better with local HTML files (#115). - ragnar_store_insert() and ragnar_store_update() (v2 stores) now handle stores that are missing store@schema metadata (#146). - read_as_markdown() once again fetches YouTube transcripts and now supports youtube_transcript_formatter, so you can add timestamps or links to the transcript output (#149). - read_as_markdown() gains an origin argument to customize the @origin recorded on returned documents (#128). - read_as_markdown() now correctly reads plain-text files with non-ASCII characters (#151). - Vignette heading levels were fixed (#129). - Added an example using sentence-transformers embeddings (#131). Changes in version 0.2.1 (2025-08-19) - ragnar_register_tool_retrieve() now registers a tool that will not return previously returned chunks, enabling the LLM to perform deeper searches of a ragnar store with repeated tool calls (#106). - Updates for ellmer v0.3.0 and duckdb v1.3.1 (#99) - Improved docs and error message in ragnar_store_insert() (@mattwarkentin, #88) - ragnar_find_links() can now parse sitemap.xml files. It also gains a validate argument, allowing for sending a HEAD request to each link and filtering out broken links (#83). - ragnar_inspector() now renders all urls as clickable links in the chunk markdown viewer, even if url is not a formal markdown link (#82). - Before running examples and tests we now check if ragnar can load DuckDB extensions. This fixes issues in environments where DuckDB pre-built binaries for extensions are not compatible with the installed DuckDB version (#94). - Added embed_lm_studio to use LMStudio as an embedding provider (#100). - Fixed a bug causing ragnar_retrieve() to fail when documents were inserted without an origin (#102). - We now suppress a "Couldn't find ffmpeg or avconv" warning when importing markitdown when using read_as_markdown(). The warning would only be relevant for users doing audio transcription (#103). - Added embed_google_gemini to use Google Gemini API as an embedding provider (#105). Changes in version 0.2.0 (2025-07-12) - ragnar_store_create() gains a new argument: version, with default 2. Store version 2 adds support for chunk deoverlapping on retrieval and automatic chunk augmentation with headings. To support these features, the internal schema and ingestion requirements are different. See markdown_chunk() and new S7 classes MarkdownDocument and MarkdownDocumentChunks. Backwards compatibility is maintained with version = 1. (#58, #39, #36) - ragnar_store_create() now supports Date and POSIXct classes supplied to extra_cols. - ragnar_store_create() now supports remote MotherDuck Databases specified with md: as the location argument. (#50) - ragnar_retrieve() and friends gain a filter argument, adding support for efficiently filtering retrieval results. - ragnar_retrieve_bm25() gains arguments b, k, and conjunctive (#56). - ragnar_retrieve_vss() gains argument query_vector, supporting workflows that preprocess the query string before embedding. - ragnar_retrieve_vss() set of valid method choices have been updated to a narrower set to ensure that an HNSW index scan is used. - Passing a tbl(store) to ragnar_retrieve() is deprecated. - New chunker markdown_chunk() with support for chunk heading context generation, semantic boundary selection, overlapping chunks, document segmentation, and more. (#56) - New function embed_google_vertex() (@dfalbel, #49) - New function embed_databricks() (@atheriel, #45) - New function ragnar_chunks_view() for quickly previewing chunks (#42) - ragnar_register_tool_retrieve() gains optional name and title arguments to allow for more descriptive tool registration. These values can also be set in ragnar_store_create() (#43). - ragnar_read() and read_as_markdown() now accept paths that begin with ~ (@topepo, #46, #48). - Changes to read_as_markdown() HTML conversion (#40, #51): - New arguments html_extract_selectors and html_zap_selectors provide a flexible way to exclude some html page elements from being included in the converted markdown. - code blocks now include the language, if available. - Fixed handling of nested code fences in markdown output. Changes in version 0.1.0 (2025-05-30) - Initial CRAN submission.