Welcome to Optimium 👋

Welcome aboard! We're excited to invite you to the world of Optimium, where we give your AI models wings 🦅

🧐 What is Optimium?

Optimium, our AI inference optimization engine, helps you accelerate AI model inference. It seamlessly optimizes and deploys models on your target devices without any manual engineering effort.

Optimium makes AI model inference faster and lets you deploy on machines with minimal libraries and dependencies. It also makes deployment convenient across a wide range of hardware, including AMD, Intel, and Arm, all from a single tool.
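To make the "single tool, many targets" idea concrete, here is a minimal sketch of what an optimize-and-deploy flow could look like. Every name below, including the `optimium` module and its functions, is a hypothetical placeholder for illustration, not Optimium's actual API.

```python
# Hypothetical sketch only -- module and function names below are
# illustrative placeholders, not Optimium's actual API.
import optimium  # assumed package name

# Load a trained model exported from your framework of choice.
model = optimium.load("mobilenet_v3.tflite")

# Optimize for a specific target device; the same call would
# target Arm, Intel, or AMD hardware by changing one argument.
optimized = optimium.optimize(model, target="arm64-cortex-a76")

# Save the optimized artifact for deployment on the device.
optimized.save("mobilenet_v3.optimium")
```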

Why does AI inference speed matter?

Slow AI models may drop camera frames or miss incoming requests. In applications such as autonomous driving systems, those dropped frames and delayed judgments could result in tragic accidents.

Faster AI models can also reduce serving costs. For example, if a client requires a throughput of 20 requests per second and you currently meet that demand with five inference devices, a 25% increase in model speed would let you serve the same load with four devices, cutting serving costs by 20%. The arithmetic is sketched below.
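A quick worked version of that calculation, using the numbers from the example above:

```python
import math

# Numbers from the example above.
required_rps = 20                 # client's required throughput (req/s)
devices_before = 5
per_device_rps = required_rps / devices_before   # 4 req/s per device

# A 25% speedup raises per-device throughput to 5 req/s...
per_device_rps_fast = per_device_rps * 1.25

# ...so ceil(20 / 5) = 4 devices now cover the same load.
devices_after = math.ceil(required_rps / per_device_rps_fast)

cost_saving = 1 - devices_after / devices_before
print(devices_after, f"{cost_saving:.0%}")  # -> 4 20%
```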

🔥 Why should I use Optimium?

Optimium provides the fastest inference speed without requiring any manual engineering effort. Most importantly, it achieves this without compromising model accuracy, unlike compression techniques such as pruning and quantization. In fact, Optimium works complementarily with those compression techniques, since it also accepts compressed models as input.

We currently support x86, x64, and Arm64 CPUs, and GPU support is planned for early next year. Sign up here to try it out on your models!

Performance Benchmark

| Model | Target Hardware | Reference Tool | Speedup vs Reference |
| --- | --- | --- | --- |
| MobileNetV3 | AWS Graviton 3 | TensorFlow | 21.09x |
| ShuffleNetV2 | AWS Graviton 3 | TF Lite/XNNPACK | 2.30x |
| MobileNetV3 | AMD Ryzen 9 7950X | Modular MAX Engine | 2.26x |
| MobileNetV3 | AMD Ryzen 9 7950X | PyTorch | 3.26x |
| MobileNetV3 | AMD Ryzen 9 7950X | TensorFlow | 9.6x |
| NasNet Mobile | AMD Ryzen 9 7950X | Modular MAX Engine | 4.28x |
| NasNet Mobile | AMD Ryzen 9 7950X | PyTorch | 5.07x |
| NasNet Mobile | AMD Ryzen 9 7950X | TensorFlow | 16.84x |
| MediaPipe Pose Landmark (Lite) | Raspberry Pi 5 (Cortex-A76) | TF Lite/XNNPACK | 1.71x |
| MediaPipe Face Landmark | AMD Ryzen 9 7950X | OpenVINO | 1.61x |
| MediaPipe Palm Detection (Full) | Qualcomm Kryo 585 Gold (Cortex-A77) | TF Lite/XNNPACK | 1.57x |
| MediaPipe Pose Landmark (Lite) | Raspberry Pi 4 (Cortex-A72) | TF Lite/XNNPACK | 1.57x |
| MediaPipe Palm Detection (Full) | Qualcomm Kryo 585 Gold (Cortex-A77) | TF Lite/XNNPACK | 1.55x |
| MediaPipe Pose Landmark (Lite) | Raspberry Pi 4 (Cortex-A72) | TF Lite/XNNPACK | 1.50x |

📨 Contact Us

Please contact [email protected] for a prompt response. Sign up here for the Optimium Beta.