vAccel is a lightweight, modular framework that exposes hardware acceleration functionality to workloads in virtualized or otherwise isolated environments — write once, accelerate anywhere.
## Architecture Overview

- **Application Code**: user workload in C, Python, Go, or Rust
- **vAccel API**: unified operations interface covering ML and compute operations (TensorFlow, Torch, image, BLAS, exec)
- **Plugin Layer**:
  - *Acceleration plugins*: TensorFlow, Torch, TVM
  - *Transport plugins*: RPC, VirtIO for remote execution
  - *Utility plugins*: Exec, MBench, NoOp
- **Hardware Layer**:
  - *GPUs*: NVIDIA, AMD, Intel
  - *TPUs / NPUs*: neural accelerators
  - *FPGAs*: custom fabric acceleration
  - *Remote backends*: network-attached accelerators
vAccel provides a unified abstraction layer that allows applications to offload compute-intensive operations — such as image processing, machine-learning inference, and cryptographic functions — to a wide range of accelerators without requiring platform-specific code. Built to integrate seamlessly with existing runtimes, vAccel supports local and remote acceleration backends, offering a consistent API across GPUs, FPGAs, NPUs, and custom devices. Its design emphasises portability, minimal overhead, and interoperability with containerised and unikernel-based workloads.
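To make this concrete, the sketch below shows the general shape of the C API: initialize a session, issue an operation, release the session. It is a minimal sketch rather than a verbatim example from the vAccel docs; the function names and signatures used here (`vaccel_session_init`, `vaccel_image_classification`, `vaccel_session_release`) may differ between vAccel versions, so treat them as illustrative and check the API reference.

```c
/* Minimal sketch of offloading image classification through the
 * vAccel C API. Names and signatures are illustrative and may differ
 * across vAccel versions; check the API reference for your release. */
#include <stdio.h>
#include <stdlib.h>
#include <vaccel.h>

int main(int argc, char **argv)
{
	if (argc != 2) {
		fprintf(stderr, "usage: %s <image-file>\n", argv[0]);
		return 1;
	}

	/* Load the input image into a memory buffer. */
	FILE *fp = fopen(argv[1], "rb");
	if (!fp)
		return 1;
	fseek(fp, 0, SEEK_END);
	long size = ftell(fp);
	rewind(fp);
	unsigned char *img = malloc(size);
	if (!img || fread(img, 1, size, fp) != (size_t)size) {
		fclose(fp);
		return 1;
	}
	fclose(fp);

	/* A session groups operations and binds them to a plugin. */
	struct vaccel_session sess;
	if (vaccel_session_init(&sess, 0) != VACCEL_OK)
		return 1;

	/* Issue the operation; the selected plugin decides where it
	 * actually runs (local GPU, remote backend, ...). */
	char out_text[512] = {0}, out_imgname[512] = {0};
	int ret = vaccel_image_classification(&sess, img,
			(unsigned char *)out_text,
			(unsigned char *)out_imgname, size,
			sizeof(out_text), sizeof(out_imgname));
	if (ret == VACCEL_OK)
		printf("classification: %s\n", out_text);

	vaccel_session_release(&sess);
	free(img);
	return ret;
}
```

Which backend executes the call is decided at run time by the plugin configuration, so the same binary can target a local GPU or a remote accelerator without recompilation.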
## Key Capabilities

- **Hardware Agnostic**: write accelerated code once and deploy it on any backend (GPUs, TPUs, NPUs, or FPGAs) without vendor lock-in or driver-level integration.
- **Plugin Architecture**: modular plugin system supporting acceleration (TensorFlow, PyTorch, TVM), transport (RPC, VirtIO), and utility backends.
- **Remote Acceleration**: transparent remote execution via the RPC and VirtIO transport plugins, enabling acceleration from VMs and other constrained environments.
- **Multi-Language Bindings**: native bindings for C, Python, Go, and Rust, with a unified operations API across all supported languages.
- **Minimal Overhead**: lightweight framework with low per-call dispatch cost, designed for highly constrained serverless and edge environments (see the sketch after this list).
- **Runtime Integration**: seamless interoperability with containers, unikernels (via urunc), and Kubernetes for cloud-native accelerated workloads.
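One way to gauge the framework's per-call cost yourself is to time the NoOp operation, which goes through the full vAccel dispatch path but does no work. The sketch below assumes a `vaccel_noop()` entry point as exposed by the NoOp plugin; as before, the exact name and signature are illustrative, not guaranteed.

```c
/* Rough per-call overhead measurement using the NoOp operation.
 * vaccel_noop() is assumed to be the NoOp plugin's entry point;
 * verify against the API reference of your vAccel version. */
#include <stdio.h>
#include <time.h>
#include <vaccel.h>

#define ITERATIONS 100000

int main(void)
{
	struct vaccel_session sess;
	if (vaccel_session_init(&sess, 0) != VACCEL_OK)
		return 1;

	struct timespec start, end;
	clock_gettime(CLOCK_MONOTONIC, &start);
	for (int i = 0; i < ITERATIONS; i++)
		if (vaccel_noop(&sess) != VACCEL_OK)
			break;
	clock_gettime(CLOCK_MONOTONIC, &end);

	double elapsed = (end.tv_sec - start.tv_sec) +
			 (end.tv_nsec - start.tv_nsec) / 1e9;
	printf("avg dispatch cost: %.2f us/call\n",
	       elapsed / ITERATIONS * 1e6);

	vaccel_session_release(&sess);
	return 0;
}
```

Running this with the NoOp plugin selected measures pure framework dispatch cost; selecting a transport plugin instead (plugin selection is part of vAccel's runtime configuration, described in the docs) gives a feel for the added round-trip cost of remote execution.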
## Plugin Ecosystem
| Category | Plugin | Operations |
|---|---|---|
| Acceleration | TensorFlow | Image classification, object detection, model inference |
| Acceleration | PyTorch | Tensor ops, model inference, JIT forward |
| Acceleration | TVM | Compiled model execution, cross-platform inference |
| Transport | RPC | Network-based remote acceleration dispatch |
| Transport | VirtIO | Paravirtual VM-to-host acceleration transport |
| Utility | Exec, MBench, NoOp | Generic exec, benchmarking, testing |
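The Exec utility plugin is the escape hatch of the ecosystem: it invokes an arbitrary function from a shared library through the same session API as every other operation. The sketch below assumes a `vaccel_exec()` operation and a `struct vaccel_arg` argument descriptor roughly as described in the vAccel docs; both, along with the `libmykernel.so` / `my_kernel` names, are illustrative placeholders.

```c
/* Sketch of the generic Exec operation: run a named symbol from a
 * shared object through vAccel. vaccel_exec() and struct vaccel_arg
 * are assumed to follow the shape documented by vAccel; exact fields
 * and signatures may differ between versions. */
#include <stdio.h>
#include <vaccel.h>

int main(void)
{
	struct vaccel_session sess;
	if (vaccel_session_init(&sess, 0) != VACCEL_OK)
		return 1;

	int input = 42;
	int output = 0;

	/* Argument descriptors for the target function. */
	struct vaccel_arg read_args[] = {
		{ .size = sizeof(input), .buf = &input },
	};
	struct vaccel_arg write_args[] = {
		{ .size = sizeof(output), .buf = &output },
	};

	/* "libmykernel.so" and "my_kernel" are hypothetical; substitute
	 * your own shared object and exported symbol. */
	int ret = vaccel_exec(&sess, "libmykernel.so", "my_kernel",
			      read_args, 1, write_args, 1);
	if (ret == VACCEL_OK)
		printf("my_kernel wrote %d\n", output);

	vaccel_session_release(&sess);
	return ret;
}
```

Because the call goes through the regular session API, the same shared object can, in combination with the transport plugins, be executed on a remote host without changing the caller.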
## Use Cases

- **ML Inference at the Edge**: run TensorFlow and PyTorch models on edge devices, with transparent offloading to available accelerators and no code changes needed.
- **Serverless Acceleration**: enable hardware acceleration in serverless functions and lightweight VMs without requiring direct driver access or GPU passthrough.
- **Image & Video Processing**: offload compute-intensive image classification, object detection, and video analytics to heterogeneous accelerators at scale.
- **Kubernetes GPU Sharing**: deploy accelerated workloads as standard Kubernetes pods, with vAccel managing hardware multiplexing across tenants.
## Explore vAccel

Dive into the documentation, API reference, tutorials, and plugin development guides: visit the vAccel docs.