NVIDIA Open-Sources NVCF for GPU-Accelerated AI Workloads

Hacker News · 1d ago · 5 min read

Key takeaway

NVIDIA has open-sourced NVCF, a platform that deploys and scales GPU-accelerated AI workloads across multiple clusters and regions. The platform provides unified control, load-balanced routing, multi-cluster autoscaling, and support for mixed GPU types. Together, these features let developers run compute-heavy AI workloads without building custom infrastructure. The code is now publicly available for inspection, modification, and community contribution.

3 Key Points

What happened
NVIDIA has released NVIDIA Cloud Functions (NVCF), an open-source platform for deploying and managing GPU-accelerated workloads at scale. The monorepo includes service code, deployment assets, documentation, CLI tools, and examples that allow users to run inference, streaming, and other GPU work across multi-region clusters.
Why it matters
NVCF lets organizations scale demanding GPU workloads while building and operating less infrastructure themselves. The platform handles routing, load balancing, autoscaling across clusters, and mixed GPU support, capabilities that previously required custom engineering. Because the code is open source, developers can inspect, modify, and contribute to the platform rather than rely solely on a proprietary offering.
What to watch
The public roadmap for the current quarter is tracked in GitHub issue #27. The project is new and under active development; users can file bugs, feature ideas, and documentation requests as GitHub issues, or use GitHub Discussions for support.
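
To make the routing and autoscaling ideas under "Why it matters" more concrete, here is a minimal Python sketch of least-loaded routing across clusters with mixed GPU types, plus a simple target-tracking replica rule. This is not NVCF's actual algorithm or API: the Cluster fields, the pick_cluster and desired_replicas functions, and the example cluster names are all hypothetical and exist only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Cluster:
    """Hypothetical view of one GPU cluster; not an NVCF data type."""
    name: str
    region: str
    gpu_type: str
    capacity: int   # assumed maximum number of concurrent requests
    in_flight: int  # requests currently being served

    @property
    def load(self) -> float:
        return self.in_flight / self.capacity


def pick_cluster(clusters, gpu_type):
    """Return the least-loaded cluster that has the requested GPU type
    and spare capacity, or None if no cluster qualifies."""
    candidates = [
        c for c in clusters
        if c.gpu_type == gpu_type and c.in_flight < c.capacity
    ]
    return min(candidates, key=lambda c: c.load, default=None)


def desired_replicas(in_flight, per_replica_target, max_replicas):
    """Target-tracking rule: enough replicas to keep each one at or below
    its target concurrency, clamped to the range [1, max_replicas]."""
    needed = math.ceil(in_flight / per_replica_target)
    return max(1, min(max_replicas, needed))


clusters = [
    Cluster("cluster-a", "us-west", "H100", capacity=10, in_flight=8),
    Cluster("cluster-b", "eu-central", "H100", capacity=10, in_flight=2),
    Cluster("cluster-c", "eu-central", "L40S", capacity=10, in_flight=0),
]
chosen = pick_cluster(clusters, "H100")
```

In this sketch the request goes to cluster-b, the H100 cluster with the lowest load. A production router would also need to weigh factors such as region, latency, and queue depth, which the sketch leaves out.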

FAQ

What types of workloads does NVCF support?
NVCF supports two kinds of workloads. Functions are long-running and invokable, and suit inference, streaming, and service-style GPU workflows. Tasks are asynchronous and run to completion, and suit batch inference, evaluation, fine-tuning, and data preparation. Both can be packaged as containers or Helm charts.
How do I get started with NVCF?
After installing a self-managed NVCF deployment and configuring the CLI, you initialize a project with nvcf-cli init, generate an API key, create a function from a configuration file, deploy it, and invoke it from the command line. Full setup and quickstart guides are in the docs/user/ directory.
What is the architecture of NVCF?
NVCF runs as Kubernetes services with three main planes: a control plane that manages function state and secrets, an invocation plane that routes HTTP, streaming, and gRPC requests, and a compute plane that integrates GPU clusters through the NVIDIA Cluster Agent (NVCA). Observability and telemetry help operators monitor health and debug workloads.
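
The function/task distinction from the first FAQ answer can be summarized as a small data model. The Python sketch below is purely illustrative: WorkloadKind, Packaging, Workload, USE_CASES, and plan_workload are hypothetical names, not part of NVCF or its CLI, and the mapping simply restates the use cases listed in the FAQ.

```python
from dataclasses import dataclass
from enum import Enum


class WorkloadKind(Enum):
    FUNCTION = "function"  # long-running, invokable
    TASK = "task"          # asynchronous, run-to-completion


class Packaging(Enum):
    CONTAINER = "container"
    HELM_CHART = "helm_chart"


# Use cases named in the FAQ, mapped to the workload kind they fall under.
USE_CASES = {
    "inference": WorkloadKind.FUNCTION,
    "streaming": WorkloadKind.FUNCTION,
    "service-style GPU workflow": WorkloadKind.FUNCTION,
    "batch inference": WorkloadKind.TASK,
    "evaluation": WorkloadKind.TASK,
    "fine-tuning": WorkloadKind.TASK,
    "data preparation": WorkloadKind.TASK,
}


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of a workload; not an NVCF config schema."""
    name: str
    kind: WorkloadKind
    packaging: Packaging


def plan_workload(name: str, use_case: str, packaging: Packaging) -> Workload:
    """Choose function vs. task for a use case.

    Raises KeyError if the use case is not one of those listed above.
    """
    return Workload(name=name, kind=USE_CASES[use_case], packaging=packaging)


job = plan_workload("nightly-finetune", "fine-tuning", Packaging.HELM_CHART)
service = plan_workload("chat-endpoint", "inference", Packaging.CONTAINER)
```

Here job is classified as a task and service as a function; either packaging option is valid for both kinds, as the FAQ states.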
