kibana/src
Dario Gieselaar 7d20301289
Load huggingface content datasets (#224543)
Implements a Hugging Face dataset loader for RAG evals - see
[x-pack/platform/packages/shared/kbn-ai-tools-cli/src/hf_dataset_loader/README.md](https://github.com/dgieselaar/kibana/blob/hf-dataset-loader/x-pack/platform/packages/shared/kbn-ai-tools-cli/src/hf_dataset_loader/README.md).
Additionally, a `@kbn/cache-cli` tool was added that allows tooling
authors to cache results to disk (remote storage may be supported later).
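
To illustrate the disk-caching idea, here is a minimal sketch of a file-per-key cache that a tooling script could use to avoid recomputing expensive results. It is not the `@kbn/cache-cli` API: the `DiskCache` class, its method names, and the temp-directory location are all hypothetical and use only Node.js built-ins.

```typescript
import { createHash } from 'node:crypto';
import { existsSync, mkdirSync, mkdtempSync, readFileSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Hypothetical helper (NOT the @kbn/cache-cli API): one JSON file per cache key.
// Values must be JSON-serializable.
class DiskCache {
  private readonly dir: string;

  constructor(dir: string) {
    this.dir = dir;
    mkdirSync(dir, { recursive: true });
  }

  // Hash the key so that arbitrary strings map to safe file names.
  private pathFor(key: string): string {
    const digest = createHash('sha256').update(key).digest('hex');
    return join(this.dir, `${digest}.json`);
  }

  get<T>(key: string): T | undefined {
    const file = this.pathFor(key);
    if (!existsSync(file)) return undefined;
    return JSON.parse(readFileSync(file, 'utf8')) as T;
  }

  set<T>(key: string, value: T): void {
    writeFileSync(this.pathFor(key), JSON.stringify(value), 'utf8');
  }

  // Return the cached value, or compute it once and store it on disk.
  async getOrCompute<T>(key: string, compute: () => Promise<T>): Promise<T> {
    const cached = this.get<T>(key);
    if (cached !== undefined) return cached;
    const fresh = await compute();
    this.set(key, fresh);
    return fresh;
  }
}

// Usage: the second lookup is served from disk, so `compute` runs only once.
async function demo(): Promise<void> {
  const cache = new DiskCache(mkdtempSync(join(tmpdir(), 'disk-cache-demo-')));
  let calls = 0;
  const loadRows = async () => {
    calls += 1; // stands in for an expensive download or transformation
    return ['row-1', 'row-2'];
  };
  await cache.getOrCompute('example-dataset', loadRows);
  await cache.getOrCompute('example-dataset', loadRows);
  console.log(`compute calls: ${calls}`);
}

demo().catch(console.error);
```

Hashing the key keeps file names valid regardless of what characters the key contains, at the cost of making the cache directory harder to inspect by hand.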

Used o3 to find datasets on Hugging Face and to do an initial pass on
a line-by-line dataset processor ([see
conversation](https://chatgpt.com/share/6853e49a-e870-8000-9c65-f7a5a3a72af0)).

Libraries added:

- `cache-manager`, `cache-manager-fs-hash`, `keyv`,
`@types/cache-manager-fs-hash`: caching libraries and plugins. Could not
find any existing caching libraries in the repo.
- `@huggingface/hub`: API client for Hugging Face.

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2025-06-26 17:24:45 +02:00
cli [Core] Rename o11y essentials tier to logs_essential (#223546) 2025-06-12 17:02:46 +02:00
cli_encryption_keys Updated js-yaml to v4 (#190678) 2024-09-19 12:25:03 +02:00
cli_health_gateway Adds AGPL 3.0 license (#192025) 2024-09-06 19:02:41 -06:00
cli_keystore Adds AGPL 3.0 license (#192025) 2024-09-06 19:02:41 -06:00
cli_plugin Upgrade Node.js to 22.16.0 (#205983) 2025-06-18 08:55:18 -05:00
cli_setup Adds AGPL 3.0 license (#192025) 2024-09-06 19:02:41 -06:00
cli_verification_code Adds AGPL 3.0 license (#192025) 2024-09-06 19:02:41 -06:00
core Remapping iInCircle and questionInCircle and deprecating help icon (#223142) 2025-06-25 14:52:04 -05:00
dev [ska] [xpack] relocate platform tests (#225223) 2025-06-25 17:01:04 +02:00
platform Load huggingface content datasets (#224543) 2025-06-26 17:24:45 +02:00
setup_node_env Upgrade Node.js to 22.16.0 (#205983) 2025-06-18 08:55:18 -05:00