Commit 725e6e1

hanouticelina, Wauplin, Copilot, and google-labs-jules[bot] authored
[CLI] Add Inference Endpoints Commands (#3428)
* [1.0] Httpx migration (#3328) * first httpx integration * more migration * some fixes * download workflow should work * Fix repocard and error utils tests * fix hf-file-system * gix http utils tests * more fixes * fix some inference tests * fix test_file_download tests * async inference client * async code should be good * Define RemoteEntryFileNotFound explicitly (+some fixes) * fix async code quality * torch ok * fix hf_file_system * fix errors tests * mock * fix test_cli mock * fix commit scheduler * add fileno test * no more requests anywhere * fix test_file_download * tmp requests * Update src/huggingface_hub/utils/_http.py Co-authored-by: célina <hanouticelina@gmail.com> * Update src/huggingface_hub/utils/_http.py Co-authored-by: célina <hanouticelina@gmail.com> * Update src/huggingface_hub/hf_file_system.py Co-authored-by: célina <hanouticelina@gmail.com> * not async * fix tests --------- Co-authored-by: célina <hanouticelina@gmail.com> * Bump minimal version to Python3.9 (#3343) * Bump minimal version to Python3.9 * use built-in generics * code quality * new batch * yet another btach * fix dataclass_with_extra * fix * keep Type for strict dataclasses * fix test * Remove `HfFolder` and `InferenceAPI` classes (#3344) * Remove HfFolder * Remove InferenceAPI * more recent gradio * bump pytest * fix python 3.9? 
* install gradio only on python 3.10+ * fix tests * fix tests * fix * [v1.0] Remove more deprecated stuff (#3345) * remove constants.-hf_cache_home * remove smoothly_deprecate_use_auth_token * remove get_token_permission * remove update_repo_visibility * remove is_write_action arg * remove write_permission arg from login methods * new parameter skip_if_logged_in in login methods * Remove resume_download / force_filename parameters * Remove deprecated local_dir_use_symlinks parameter * Remove deprecated language, library, task, tags from list_models * Return commit URL in upload_file/upload_folder (previously url to file/folder on the Hub) * fix upload_file/upload_folder tests * smoothly_deprecate_legacy_arguments everywhere * code quality * fix tests * fix xet tests * [v1.0] Remove `Repository` class (#3346) * Remove Repository class + adapt docs * remove fr git_vs_http * bump to 1.0.0.dev0 * Remove _deprecate_positional_args on login methods (#3349) * [v1.0] Remove imports kept only for backward compatibility (#3350) * Remove imports kept only for backward compatibility * fix tests * [v1.0] Remove keras2 utilities (#3352) * Remove keras2 utilities * remove keras from init * [v1.0] Remove anything tensorflow-related + deps (#3354) * Remove anything tensorflow-related + deps * init * fix tests * fix conflicts in tests * Release: v1.0.0.rc0 * [v1.0] Update "HTTP backend" docs + `git_vs_http` guide (#3357) * HTTP configuration docs * http configuration docs * refactored git_vs_http * fix import * fix docs? 
* Update docs/source/en/package_reference/utilities.md Co-authored-by: célina <hanouticelina@gmail.com> --------- Co-authored-by: célina <hanouticelina@gmail.com> * Refactor CLI implementation using Typer (#3372) * Refactor CLI implementation using Typer (#3365) * migrate CLI to typer * (#3364) disable rich in all cases * update tests * make typer-slim a required dep * use Annotated * fix linting issues * fix tests * refactoring * update docs * use built in types * fix mypy * call whoami directly * lint * Apply suggestions from code review Co-authored-by: Lucain <lucain@huggingface.co> * import Annotated from typing * Use Enums * set verbosity globally * refactor scan cache and update version docstring * centralize where Typer is defined * no need for ... * rename enum * no need for extra param name * docstring * revert * centralize arguments and options definition * add library version when initializing HfApi * add auto-completion * sort commands alphabetically * suggestions * centralize jobs params and HfApi initialization * fix --------- Co-authored-by: Lucain <lucain@huggingface.co> * update docs --------- Co-authored-by: Lucain <lucain@huggingface.co> * Make HfHubHTTPError inherit from OSError (#3387) * Release: v1.0.0.rc1 * Add new HF commands (#3384) * [1.0] Httpx migration (#3328) * first httpx integration * more migration * some fixes * download workflow should work * Fix repocard and error utils tests * fix hf-file-system * gix http utils tests * more fixes * fix some inference tests * fix test_file_download tests * async inference client * async code should be good * Define RemoteEntryFileNotFound explicitly (+some fixes) * fix async code quality * torch ok * fix hf_file_system * fix errors tests * mock * fix test_cli mock * fix commit scheduler * add fileno test * no more requests anywhere * fix test_file_download * tmp requests * Update src/huggingface_hub/utils/_http.py Co-authored-by: célina <hanouticelina@gmail.com> * Update 
src/huggingface_hub/utils/_http.py Co-authored-by: célina <hanouticelina@gmail.com> * Update src/huggingface_hub/hf_file_system.py Co-authored-by: célina <hanouticelina@gmail.com> * not async * fix tests --------- Co-authored-by: célina <hanouticelina@gmail.com> * Bump minimal version to Python3.9 (#3343) * Bump minimal version to Python3.9 * use built-in generics * code quality * new batch * yet another btach * fix dataclass_with_extra * fix * keep Type for strict dataclasses * fix test * Remove `HfFolder` and `InferenceAPI` classes (#3344) * Remove HfFolder * Remove InferenceAPI * more recent gradio * bump pytest * fix python 3.9? * install gradio only on python 3.10+ * fix tests * fix tests * fix * [v1.0] Remove more deprecated stuff (#3345) * remove constants.-hf_cache_home * remove smoothly_deprecate_use_auth_token * remove get_token_permission * remove update_repo_visibility * remove is_write_action arg * remove write_permission arg from login methods * new parameter skip_if_logged_in in login methods * Remove resume_download / force_filename parameters * Remove deprecated local_dir_use_symlinks parameter * Remove deprecated language, library, task, tags from list_models * Return commit URL in upload_file/upload_folder (previously url to file/folder on the Hub) * fix upload_file/upload_folder tests * smoothly_deprecate_legacy_arguments everywhere * code quality * fix tests * fix xet tests * [v1.0] Remove `Repository` class (#3346) * Remove Repository class + adapt docs * remove fr git_vs_http * bump to 1.0.0.dev0 * Remove _deprecate_positional_args on login methods (#3349) * [v1.0] Remove imports kept only for backward compatibility (#3350) * Remove imports kept only for backward compatibility * fix tests * [v1.0] Remove keras2 utilities (#3352) * Remove keras2 utilities * remove keras from init * [v1.0] Remove anything tensorflow-related + deps (#3354) * Remove anything tensorflow-related + deps * init * fix tests * fix conflicts in tests * Release: 
v1.0.0.rc0 * [v1.0] Update "HTTP backend" docs + `git_vs_http` guide (#3357) * HTTP configuration docs * http configuration docs * refactored git_vs_http * fix import * fix docs? * Update docs/source/en/package_reference/utilities.md Co-authored-by: célina <hanouticelina@gmail.com> --------- Co-authored-by: célina <hanouticelina@gmail.com> * Refactor CLI implementation using Typer (#3372) * Refactor CLI implementation using Typer (#3365) * migrate CLI to typer * (#3364) disable rich in all cases * update tests * make typer-slim a required dep * use Annotated * fix linting issues * fix tests * refactoring * update docs * use built in types * fix mypy * call whoami directly * lint * Apply suggestions from code review Co-authored-by: Lucain <lucain@huggingface.co> * import Annotated from typing * Use Enums * set verbosity globally * refactor scan cache and update version docstring * centralize where Typer is defined * no need for ... * rename enum * no need for extra param name * docstring * revert * centralize arguments and options definition * add library version when initializing HfApi * add auto-completion * sort commands alphabetically * suggestions * centralize jobs params and HfApi initialization * fix --------- Co-authored-by: Lucain <lucain@huggingface.co> * update docs --------- Co-authored-by: Lucain <lucain@huggingface.co> * add hf repo delete command * add repo settings, repo move, repo branch commands * fix test * Apply suggestions from code review Co-authored-by: Lucain <lucain@huggingface.co> --------- Co-authored-by: Lucain <lucain@huggingface.co> Co-authored-by: Lucain Pouget <lucainp@gmail.com> * Release: v1.0.0.rc2 * Document new HF commands (#3393) * add docs for new hf commands * typo * add cli examples in repository * Add cross-platform CLI Installers (#3378) * [1.0] Httpx migration (#3328) * first httpx integration * more migration * some fixes * download workflow should work * Fix repocard and error utils tests * fix hf-file-system * gix http 
utils tests * more fixes * fix some inference tests * fix test_file_download tests * async inference client * async code should be good * Define RemoteEntryFileNotFound explicitly (+some fixes) * fix async code quality * torch ok * fix hf_file_system * fix errors tests * mock * fix test_cli mock * fix commit scheduler * add fileno test * no more requests anywhere * fix test_file_download * tmp requests * Update src/huggingface_hub/utils/_http.py Co-authored-by: célina <hanouticelina@gmail.com> * Update src/huggingface_hub/utils/_http.py Co-authored-by: célina <hanouticelina@gmail.com> * Update src/huggingface_hub/hf_file_system.py Co-authored-by: célina <hanouticelina@gmail.com> * not async * fix tests --------- Co-authored-by: célina <hanouticelina@gmail.com> * Bump minimal version to Python3.9 (#3343) * Bump minimal version to Python3.9 * use built-in generics * code quality * new batch * yet another btach * fix dataclass_with_extra * fix * keep Type for strict dataclasses * fix test * Remove `HfFolder` and `InferenceAPI` classes (#3344) * Remove HfFolder * Remove InferenceAPI * more recent gradio * bump pytest * fix python 3.9? 
* install gradio only on python 3.10+ * fix tests * fix tests * fix * [v1.0] Remove more deprecated stuff (#3345) * remove constants.-hf_cache_home * remove smoothly_deprecate_use_auth_token * remove get_token_permission * remove update_repo_visibility * remove is_write_action arg * remove write_permission arg from login methods * new parameter skip_if_logged_in in login methods * Remove resume_download / force_filename parameters * Remove deprecated local_dir_use_symlinks parameter * Remove deprecated language, library, task, tags from list_models * Return commit URL in upload_file/upload_folder (previously url to file/folder on the Hub) * fix upload_file/upload_folder tests * smoothly_deprecate_legacy_arguments everywhere * code quality * fix tests * fix xet tests * [v1.0] Remove `Repository` class (#3346) * Remove Repository class + adapt docs * remove fr git_vs_http * bump to 1.0.0.dev0 * Remove _deprecate_positional_args on login methods (#3349) * [v1.0] Remove imports kept only for backward compatibility (#3350) * Remove imports kept only for backward compatibility * fix tests * [v1.0] Remove keras2 utilities (#3352) * Remove keras2 utilities * remove keras from init * [v1.0] Remove anything tensorflow-related + deps (#3354) * Remove anything tensorflow-related + deps * init * fix tests * fix conflicts in tests * Release: v1.0.0.rc0 * [v1.0] Update "HTTP backend" docs + `git_vs_http` guide (#3357) * HTTP configuration docs * http configuration docs * refactored git_vs_http * fix import * fix docs? 
* Update docs/source/en/package_reference/utilities.md Co-authored-by: célina <hanouticelina@gmail.com> --------- Co-authored-by: célina <hanouticelina@gmail.com> * Refactor CLI implementation using Typer (#3372) * Refactor CLI implementation using Typer (#3365) * migrate CLI to typer * (#3364) disable rich in all cases * update tests * make typer-slim a required dep * use Annotated * fix linting issues * fix tests * refactoring * update docs * use built in types * fix mypy * call whoami directly * lint * Apply suggestions from code review Co-authored-by: Lucain <lucain@huggingface.co> * import Annotated from typing * Use Enums * set verbosity globally * refactor scan cache and update version docstring * centralize where Typer is defined * no need for ... * rename enum * no need for extra param name * docstring * revert * centralize arguments and options definition * add library version when initializing HfApi * add auto-completion * sort commands alphabetically * suggestions * centralize jobs params and HfApi initialization * fix --------- Co-authored-by: Lucain <lucain@huggingface.co> * update docs --------- Co-authored-by: Lucain <lucain@huggingface.co> * add installers * fix windows * fix log * fix workflow? 
* fix workflow again * add debugging steps * fix * remove bin dir and install dir params * update workflow * remove version param * document usage * [1.0] Httpx migration (#3328) * first httpx integration * more migration * some fixes * download workflow should work * Fix repocard and error utils tests * fix hf-file-system * gix http utils tests * more fixes * fix some inference tests * fix test_file_download tests * async inference client * async code should be good * Define RemoteEntryFileNotFound explicitly (+some fixes) * fix async code quality * torch ok * fix hf_file_system * fix errors tests * mock * fix test_cli mock * fix commit scheduler * add fileno test * no more requests anywhere * fix test_file_download * tmp requests * Update src/huggingface_hub/utils/_http.py Co-authored-by: célina <hanouticelina@gmail.com> * Update src/huggingface_hub/utils/_http.py Co-authored-by: célina <hanouticelina@gmail.com> * Update src/huggingface_hub/hf_file_system.py Co-authored-by: célina <hanouticelina@gmail.com> * not async * fix tests --------- Co-authored-by: célina <hanouticelina@gmail.com> * Bump minimal version to Python3.9 (#3343) * Bump minimal version to Python3.9 * use built-in generics * code quality * new batch * yet another btach * fix dataclass_with_extra * fix * keep Type for strict dataclasses * fix test * Remove `HfFolder` and `InferenceAPI` classes (#3344) * Remove HfFolder * Remove InferenceAPI * more recent gradio * bump pytest * fix python 3.9? 
* install gradio only on python 3.10+ * fix tests * fix tests * fix * [v1.0] Remove more deprecated stuff (#3345) * remove constants.-hf_cache_home * remove smoothly_deprecate_use_auth_token * remove get_token_permission * remove update_repo_visibility * remove is_write_action arg * remove write_permission arg from login methods * new parameter skip_if_logged_in in login methods * Remove resume_download / force_filename parameters * Remove deprecated local_dir_use_symlinks parameter * Remove deprecated language, library, task, tags from list_models * Return commit URL in upload_file/upload_folder (previously url to file/folder on the Hub) * fix upload_file/upload_folder tests * smoothly_deprecate_legacy_arguments everywhere * code quality * fix tests * fix xet tests * [v1.0] Remove `Repository` class (#3346) * Remove Repository class + adapt docs * remove fr git_vs_http * bump to 1.0.0.dev0 * Remove _deprecate_positional_args on login methods (#3349) * [v1.0] Remove imports kept only for backward compatibility (#3350) * Remove imports kept only for backward compatibility * fix tests * [v1.0] Remove keras2 utilities (#3352) * Remove keras2 utilities * remove keras from init * [v1.0] Remove anything tensorflow-related + deps (#3354) * Remove anything tensorflow-related + deps * init * fix tests * fix conflicts in tests * Release: v1.0.0.rc0 * [v1.0] Update "HTTP backend" docs + `git_vs_http` guide (#3357) * HTTP configuration docs * http configuration docs * refactored git_vs_http * fix import * fix docs? 
* Update docs/source/en/package_reference/utilities.md Co-authored-by: célina <hanouticelina@gmail.com> --------- Co-authored-by: célina <hanouticelina@gmail.com> * Refactor CLI implementation using Typer (#3372) * Refactor CLI implementation using Typer (#3365) * migrate CLI to typer * (#3364) disable rich in all cases * update tests * make typer-slim a required dep * use Annotated * fix linting issues * fix tests * refactoring * update docs * use built in types * fix mypy * call whoami directly * lint * Apply suggestions from code review Co-authored-by: Lucain <lucain@huggingface.co> * import Annotated from typing * Use Enums * set verbosity globally * refactor scan cache and update version docstring * centralize where Typer is defined * no need for ... * rename enum * no need for extra param name * docstring * revert * centralize arguments and options definition * add library version when initializing HfApi * add auto-completion * sort commands alphabetically * suggestions * centralize jobs params and HfApi initialization * fix --------- Co-authored-by: Lucain <lucain@huggingface.co> * update docs --------- Co-authored-by: Lucain <lucain@huggingface.co> * Make HfHubHTTPError inherit from OSError (#3387) * Release: v1.0.0.rc1 * print relevant message based on the linux distro * better warning message * log info instead of warning * copilot suggestions * Update utils/installers/install.sh Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * update docs --------- Co-authored-by: Lucain <lucain@huggingface.co> Co-authored-by: Lucain Pouget <lucainp@gmail.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * update installers paths (#3400) * [v1.0] feat: add migration guide for v1.0 (#3360) * [1.0] Httpx migration (#3328) * first httpx integration * more migration * some fixes * download workflow should work * Fix repocard and error utils tests * fix hf-file-system * gix http utils tests * more fixes * fix some inference tests 
* fix test_file_download tests * async inference client * async code should be good * Define RemoteEntryFileNotFound explicitly (+some fixes) * fix async code quality * torch ok * fix hf_file_system * fix errors tests * mock * fix test_cli mock * fix commit scheduler * add fileno test * no more requests anywhere * fix test_file_download * tmp requests * Update src/huggingface_hub/utils/_http.py Co-authored-by: célina <hanouticelina@gmail.com> * Update src/huggingface_hub/utils/_http.py Co-authored-by: célina <hanouticelina@gmail.com> * Update src/huggingface_hub/hf_file_system.py Co-authored-by: célina <hanouticelina@gmail.com> * not async * fix tests --------- Co-authored-by: célina <hanouticelina@gmail.com> * Bump minimal version to Python3.9 (#3343) * Bump minimal version to Python3.9 * use built-in generics * code quality * new batch * yet another btach * fix dataclass_with_extra * fix * keep Type for strict dataclasses * fix test * Remove `HfFolder` and `InferenceAPI` classes (#3344) * Remove HfFolder * Remove InferenceAPI * more recent gradio * bump pytest * fix python 3.9? 
* install gradio only on python 3.10+ * fix tests * fix tests * fix * [v1.0] Remove more deprecated stuff (#3345) * remove constants.-hf_cache_home * remove smoothly_deprecate_use_auth_token * remove get_token_permission * remove update_repo_visibility * remove is_write_action arg * remove write_permission arg from login methods * new parameter skip_if_logged_in in login methods * Remove resume_download / force_filename parameters * Remove deprecated local_dir_use_symlinks parameter * Remove deprecated language, library, task, tags from list_models * Return commit URL in upload_file/upload_folder (previously url to file/folder on the Hub) * fix upload_file/upload_folder tests * smoothly_deprecate_legacy_arguments everywhere * code quality * fix tests * fix xet tests * [v1.0] Remove `Repository` class (#3346) * Remove Repository class + adapt docs * remove fr git_vs_http * bump to 1.0.0.dev0 * Remove _deprecate_positional_args on login methods (#3349) * [v1.0] Remove imports kept only for backward compatibility (#3350) * Remove imports kept only for backward compatibility * fix tests * [v1.0] Remove keras2 utilities (#3352) * Remove keras2 utilities * remove keras from init * [v1.0] Remove anything tensorflow-related + deps (#3354) * Remove anything tensorflow-related + deps * init * fix tests * fix conflicts in tests * HTTP configuration docs * http configuration docs * refactored git_vs_http * feat: add migration guide for v1.0 This commit adds a comprehensive migration guide for the v1.0 release of the `huggingface_hub` library. The guide is located at `docs/source/en/concepts/migration.md` and provides a detailed list of main changes and breaking changes, along with instructions on how to adapt to them. 
The migration guide covers the following topics: - HTTPX migration - Python 3.9+ requirement - Removal of deprecated features - Removal of the `Repository`, `HfFolder`, and `InferenceApi` classes - Removal of TensorFlow and Keras 2 integrations This guide is intended to help users migrate their existing code to the new version of the library smoothly. Fixes #3358 [Auto-generated by https://jules.google.com/] * rewrite migration guide * fix import * fix docs? * add why httpx section * Update docs/source/en/concepts/migration.md --------- Co-authored-by: Lucain <lucain@huggingface.co> Co-authored-by: célina <hanouticelina@gmail.com> Co-authored-by: Lucain Pouget <lucainp@gmail.com> Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com> * prepare rc3 * Remove contrib test suite (#3403) * Strict typed dict validator (#3408) * Strict typed dict validator * better type hint * work with Required / NotREquired * Make it work with Python 3.11 * correctly handle total=False * fix python 3.9+ * fix python 3.9 * Implement dry run mode in download CLI (#3407) * Implement dry run mode * docs * more docs * quality * quality * fix cli test * Apply suggestions from code review Co-authored-by: célina <hanouticelina@gmail.com> * fix test on widnwso --------- Co-authored-by: célina <hanouticelina@gmail.com> * Remove `huggingface-cli` entirely in favor of `hf` (#3404) * Remove huggingface-cli entirely in favor of hf * dup * Fix proxy environment variables not used in v1.0 (#3412) * Use EventHooks instead of custom Transport * add test * Update tests/test_utils_http.py Co-authored-by: célina <hanouticelina@gmail.com> --------- Co-authored-by: célina <hanouticelina@gmail.com> * reset * Release: v1.0.0.rc3 * [hf CLI] check for updates and notify user (#3418) * [hf CLI] check for updates and notify user * no alpha or beta * dirty check * check once every 24h * move ANSI / tabulate utils to their own module to avoid circular import issues * do not 
touch installers CI * Update src/huggingface_hub/cli/_cli_utils.py Co-authored-by: célina <hanouticelina@gmail.com> * docstring * update powershell command --------- Co-authored-by: célina <hanouticelina@gmail.com> * Fix forward ref validation if total false (#3423) * Release: v1.0.0.rc4 * Disable rich in CLI (#3427) * Print version only in CLI * add inference endpoints cli * fix naming * update docs * wording * remove logging * don't instantiate logger when not needed * refactor * remove unused import * nit * nit * Apply suggestions from code review Co-authored-by: Lucain <lucain@huggingface.co> * use docstring * rework CLI UX * fix merge conflicts * some fixes * fix * generate cli reference * Update src/huggingface_hub/cli/inference_endpoints.py Co-authored-by: Lucain <lucain@huggingface.co> --------- Co-authored-by: Lucain <lucain@huggingface.co> Co-authored-by: Lucain Pouget <lucainp@gmail.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
1 parent ede63d1 commit 725e6e1

6 files changed: +970 -2 lines changed

docs/source/en/guides/cli.md

Lines changed: 45 additions & 0 deletions
@@ -35,6 +35,20 @@ On Windows:
 >>> powershell -ExecutionPolicy ByPass -c "irm https://hf.co/cli/install.ps1 | iex"
 ```

+Alternatively, you can install the `hf` CLI with a single command:
+
+On macOS and Linux:
+
+```bash
+>>> curl -LsSf https://hf.co/cli/install.sh | sh
+```
+
+On Windows:
+
+```powershell
+>>> powershell -ExecutionPolicy ByPass -c "irm https://hf.co/cli/install.ps1 | iex"
+```
+
 Once installed, you can check that the CLI is correctly setup:

 ```
@@ -1016,3 +1030,34 @@ Manage scheduled jobs using
 # Delete a scheduled job
 >>> hf jobs scheduled delete <scheduled_job_id>
 ```
+
+## hf endpoints
+
+Use `hf endpoints` to list, deploy, describe, and manage Inference Endpoints directly from the terminal. The legacy
+`hf inference-endpoints` alias remains available for compatibility.
+
+```bash
+# Lists endpoints in your namespace
+>>> hf endpoints ls
+
+# Deploy an endpoint from Model Catalog
+>>> hf endpoints catalog deploy --repo openai/gpt-oss-120b --name my-endpoint
+
+# Deploy an endpoint from the Hugging Face Hub
+>>> hf endpoints deploy my-endpoint --repo gpt2 --framework pytorch --accelerator cpu --instance-size x2 --instance-type intel-icl
+
+# List catalog entries
+>>> hf endpoints catalog ls
+
+# Show status and metadata
+>>> hf endpoints describe my-endpoint
+
+# Pause the endpoint
+>>> hf endpoints pause my-endpoint
+
+# Delete without confirmation prompt
+>>> hf endpoints delete my-endpoint --yes
+```
+
+> [!TIP]
+> Add `--namespace` to target an organization, `--token` to override authentication.

docs/source/en/guides/inference_endpoints.md

Lines changed: 46 additions & 0 deletions
@@ -33,6 +33,16 @@ The first step is to create an Inference Endpoint using [`create_inference_endpo
 ... )
 ```

+Or via CLI:
+
+```bash
+hf endpoints deploy my-endpoint-name --repo gpt2 --framework pytorch --accelerator cpu --vendor aws --region us-east-1 --instance-size x2 --instance-type intel-icl --task text-generation
+
+# Deploy from the catalog with a single command
+hf endpoints catalog deploy my-endpoint-name --repo openai/gpt-oss-120b
+```
+
+
 In this example, we created a `protected` Inference Endpoint named `"my-endpoint-name"`, to serve [gpt2](https://huggingface.co/gpt2) for `text-generation`. A `protected` Inference Endpoint means your token is required to access the API. We also need to provide additional information to configure the hardware requirements, such as vendor, region, accelerator, instance type, and size. You can check out the list of available resources [here](https://api.endpoints.huggingface.cloud/#/v2%3A%3Aprovider/list_vendors). Alternatively, you can create an Inference Endpoint manually using the [Web interface](https://ui.endpoints.huggingface.co/new) for convenience. Refer to this [guide](https://huggingface.co/docs/inference-endpoints/guides/advanced) for details on advanced settings and their usage.

 The value returned by [`create_inference_endpoint`] is an [`InferenceEndpoint`] object:
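
The `hf endpoints deploy` invocation added in the hunk above carries one flag per hardware setting named in the guide (vendor, region, accelerator, instance type, and size). As a rough illustration of how a script might assemble that command, here is a minimal sketch. The helper `build_deploy_command` is hypothetical and not part of `huggingface_hub`; it only maps keyword arguments onto the flags that appear in this diff and does not run anything.

```python
import shlex


def build_deploy_command(name: str, **options: str) -> list[str]:
    """Assemble argv for `hf endpoints deploy` (hypothetical helper, not a library API).

    Keyword names are turned into flags, e.g. `instance_size` -> `--instance-size`.
    Only flags shown in the diff above are assumed to exist.
    """
    argv = ["hf", "endpoints", "deploy", name]
    for key, value in options.items():
        argv += [f"--{key.replace('_', '-')}", str(value)]
    return argv


command = build_deploy_command(
    "my-endpoint-name",
    repo="gpt2",
    framework="pytorch",
    accelerator="cpu",
    vendor="aws",
    region="us-east-1",
    instance_size="x2",
    instance_type="intel-icl",
    task="text-generation",
)

# Print a shell-safe rendering of the command without executing it.
print(shlex.join(command))
```

Keeping the command as a list until the last moment avoids quoting problems if it is later passed to `subprocess.run`.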
@@ -42,6 +52,12 @@ The value returned by [`create_inference_endpoint`] is an [`InferenceEndpoint`]
 InferenceEndpoint(name='my-endpoint-name', namespace='Wauplin', repository='gpt2', status='pending', url=None)
 ```

+Or via CLI:
+
+```bash
+hf endpoints describe my-endpoint-name
+```
+
 It's a dataclass that holds information about the endpoint. You can access important attributes such as `name`, `repository`, `status`, `task`, `created_at`, `updated_at`, etc. If you need it, you can also access the raw response from the server with `endpoint.raw`.

 Once your Inference Endpoint is created, you can find it on your [personal dashboard](https://ui.endpoints.huggingface.co/).
@@ -101,6 +117,14 @@ InferenceEndpoint(name='my-endpoint-name', namespace='Wauplin', repository='gpt2
 [InferenceEndpoint(name='aws-starchat-beta', namespace='huggingface', repository='HuggingFaceH4/starchat-beta', status='paused', url=None), ...]
 ```

+Or via CLI:
+
+```bash
+hf endpoints describe my-endpoint-name
+hf endpoints ls --namespace huggingface
+hf endpoints ls --namespace '*'
+```
+
 ## Check deployment status

 In the rest of this guide, we will assume that we have a [`InferenceEndpoint`] object called `endpoint`. You might have noticed that the endpoint has a `status` attribute of type [`InferenceEndpointStatus`]. When the Inference Endpoint is deployed and accessible, the status should be `"running"` and the `url` attribute is set:
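
The status check described in the context line above boils down to refreshing the endpoint until `status` reads `"running"`. The sketch below shows that loop against a stand-in object so it can run offline. It is not the library's implementation: `wait_until_running` and `FakeEndpoint` are hypothetical names, and the sketch assumes the endpoint object exposes a `fetch()` method that refreshes `status` in place.

```python
import time


def wait_until_running(endpoint, timeout=None, fetch_every=5.0, sleep=time.sleep):
    """Poll `endpoint` until its status is "running" (illustrative sketch only).

    `timeout` and `fetch_every` are in seconds; `timeout=None` waits indefinitely.
    Assumes `endpoint.fetch()` refreshes `endpoint.status` in place.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while endpoint.status != "running":
        if deadline is not None and time.monotonic() >= deadline:
            raise TimeoutError("endpoint did not reach 'running' in time")
        sleep(fetch_every)
        endpoint.fetch()
    return endpoint


class FakeEndpoint:
    """Offline stand-in: stays 'pending' after the first refresh, then turns 'running'."""

    def __init__(self):
        self._states = iter(["pending", "running"])
        self.status = "pending"
        self.url = None

    def fetch(self):
        self.status = next(self._states)
        if self.status == "running":
            self.url = "https://example.invalid"  # placeholder URL
        return self


ready = wait_until_running(FakeEndpoint(), timeout=60, fetch_every=0)
print(ready.status, ready.url)
```

Injecting `sleep` as a parameter is a design choice made here for testability; a real script would normally rely on the default.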
@@ -117,6 +141,12 @@ Before reaching a `"running"` state, the Inference Endpoint typically goes throu
 InferenceEndpoint(name='my-endpoint-name', namespace='Wauplin', repository='gpt2', status='pending', url=None)
 ```

+Or via CLI:
+
+```bash
+hf endpoints describe my-endpoint-name
+```
+
 Instead of fetching the Inference Endpoint status while waiting for it to run, you can directly call [`~InferenceEndpoint.wait`]. This helper takes as input a `timeout` and a `fetch_every` parameter (in seconds) and will block the thread until the Inference Endpoint is deployed. Default values are respectively `None` (no timeout) and `5` seconds.

 ```py
@@ -189,6 +219,14 @@ InferenceEndpoint(name='my-endpoint-name', namespace='Wauplin', repository='gpt2
 # Endpoint is not 'running' but still has a URL and will restart on first call.
 ```

+Or via CLI:
+
+```bash
+hf endpoints pause my-endpoint-name
+hf endpoints resume my-endpoint-name
+hf endpoints scale-to-zero my-endpoint-name
+```
+
 ### Update model or hardware requirements

 In some cases, you might also want to update your Inference Endpoint without creating a new one. You can either update the hosted model or the hardware requirements to run the model. You can do this using [`~InferenceEndpoint.update`]:
@@ -207,6 +245,14 @@ InferenceEndpoint(name='my-endpoint-name', namespace='Wauplin', repository='gpt2
 InferenceEndpoint(name='my-endpoint-name', namespace='Wauplin', repository='gpt2-large', status='pending', url=None)
 ```

+Or via CLI:
+
+```bash
+hf endpoints update my-endpoint-name --repo gpt2-large
+hf endpoints update my-endpoint-name --min-replica 2 --max-replica 6
+hf endpoints update my-endpoint-name --accelerator cpu --instance-size x4 --instance-type intel-icl
+```
+
 ### Delete the endpoint

 Finally if you won't use the Inference Endpoint anymore, you can simply call [`~InferenceEndpoint.delete()`].
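
For scripted cleanup, the CLI guide in this commit shows `hf endpoints delete my-endpoint --yes` as the way to skip the confirmation prompt. A minimal sketch of wrapping that call from Python follows. The wrapper `delete_endpoint` is hypothetical, it defaults to a dry run so nothing is deleted by accident, and it assumes `delete` accepts `--namespace` the same way the `ls` examples above do.

```python
import subprocess
from typing import Optional


def delete_endpoint(name: str, namespace: Optional[str] = None, dry_run: bool = True) -> list[str]:
    """Build, and optionally run, `hf endpoints delete <name> --yes` (hypothetical wrapper).

    With `dry_run=True` (the default) the command is only returned, never executed.
    Passing `--namespace` to `delete` is an assumption based on the CLI tip in this commit.
    """
    argv = ["hf", "endpoints", "delete", name, "--yes"]
    if namespace is not None:
        argv += ["--namespace", namespace]
    if not dry_run:
        # Raises CalledProcessError if the CLI exits with a non-zero status.
        subprocess.run(argv, check=True)
    return argv


# Dry run: show what would be executed.
print(delete_endpoint("my-endpoint-name"))
```

Because deletion is irreversible, defaulting to a dry run mirrors the role of the interactive prompt that `--yes` bypasses.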
