Hugging Face launches one-command OpenAI-compatible LLM endpoint with HF Jobs
Hugging Face now offers a private, OpenAI-compatible LLM endpoint that can be deployed with a single command via HF Jobs. The feature utilizes the official vllm/vllm-openai image and requires a payment method or a positive prepaid credit balance. Users can query the endpoint from anywhere, and it is billed at $0.01 per minute per GPU hour of usage. This capability is suitable for tests, evals, or batch generation.