Welcome to Optimium 👋
Welcome on board! We're excited to invite you to the world of Optimium, where we give your AI models wings 🦅
🧐 What is Optimium?
Optimium, our AI inference optimization engine, helps users accelerate AI model inference. It seamlessly optimizes and deploys models on target devices without any manual engineering effort.
Optimium makes AI model inference faster and enables deployment on machines with minimal libraries and dependencies. It also simplifies deployment across diverse hardware, including but not limited to AMD, Intel, and Arm, using a single tool.
Why does AI inference speed matter?
Slow AI models may skip camera frames or miss incoming requests. In applications such as autonomous driving systems, skipped frames and delayed judgments can lead to tragic accidents.
Faster AI models can also reduce serving costs. For example, if a client requires an AI throughput of 20 requests per second and you currently meet that demand with five inference devices, each device handles 4 requests per second. A 25% increase in model speed raises that to 5 requests per second per device, so four devices would suffice — removing one of the five cuts serving costs by 20%.
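The cost arithmetic above can be sketched in a few lines of Python. The request rates and device counts are the example's hypothetical numbers, not Optimium measurements:

```python
import math

def devices_needed(required_rps: float, per_device_rps: float) -> int:
    """Minimum number of devices to serve the required throughput."""
    return math.ceil(required_rps / per_device_rps)

# Baseline: five devices meet 20 req/s, so each handles 4 req/s.
baseline = devices_needed(20, 4.0)

# A 25% speedup raises per-device throughput to 5 req/s.
accelerated = devices_needed(20, 4.0 * 1.25)

# Fraction of serving cost saved by dropping the unneeded devices.
savings = 1 - accelerated / baseline
```

Running this gives `baseline = 5`, `accelerated = 4`, and `savings = 0.2`, i.e. a 20% cost reduction.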
🔥 Why should I use Optimium?
Optimium provides the fastest inference speed without requiring any manual engineering effort. Most importantly, it achieves this without compromising model accuracy, unlike compression techniques such as pruning and quantization. In fact, Optimium works in tandem with those compression technologies, since it also accepts compressed models as input.
We currently support x86, x64, and Arm64 CPUs, and will support GPUs early next year. Sign up here to try it out on your models!
Performance Benchmark
| Model | Target Hardware | Reference Tool | Speedup vs. Reference |
| --- | --- | --- | --- |
| MobileNetV3 | AWS Graviton 3 | TensorFlow | 21.09x |
| ShuffleNetV2 | AWS Graviton 3 | TF Lite/XNNPACK | 2.30x |
| MobileNetV3 | AMD Ryzen 9 7950X | Modular MAX Engine | 2.26x |
| MobileNetV3 | AMD Ryzen 9 7950X | PyTorch | 3.26x |
| MobileNetV3 | AMD Ryzen 9 7950X | TensorFlow | 9.60x |
| NasNet Mobile | AMD Ryzen 9 7950X | Modular MAX Engine | 4.28x |
| NasNet Mobile | AMD Ryzen 9 7950X | PyTorch | 5.07x |
| NasNet Mobile | AMD Ryzen 9 7950X | TensorFlow | 16.84x |
| MediaPipe Pose Landmark (Lite) | Raspberry Pi 5 (Cortex-A76) | TF Lite/XNNPACK | 1.71x |
| MediaPipe Face Landmark | AMD Ryzen 9 7950X | OpenVINO | 1.61x |
| MediaPipe Palm Detection (Full) | Qualcomm Kryo 585 Gold (Cortex-A77) | TF Lite/XNNPACK | 1.57x |
| MediaPipe Pose Landmark (Lite) | Raspberry Pi 4 (Cortex-A72) | TF Lite/XNNPACK | 1.57x |
| MediaPipe Palm Detection (Full) | Qualcomm Kryo 585 Gold (Cortex-A77) | TF Lite/XNNPACK | 1.55x |
| MediaPipe Pose Landmark (Lite) | Raspberry Pi 4 (Cortex-A72) | TF Lite/XNNPACK | 1.50x |
📨 Contact Us
Please contact [email protected] for a prompt response. Sign up here for Optimium Beta.