Welcome to Optimium 👋

Welcome aboard! We're excited to invite you to the world of Optimium, where we give your AI models wings 🦅

🧐 What is Optimium?

Optimium, our AI inference optimization engine, helps you accelerate AI model inference. It seamlessly optimizes and deploys models on your target devices without any manual engineering effort.

Optimium makes AI model inference faster and lets you deploy on machines with minimal libraries and dependencies. It also makes deployment convenient across a wide range of hardware, including AMD, Intel, and Arm, all from a single tool.
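To make the "single tool, many targets" idea concrete, here is a minimal sketch of what an optimize-and-deploy flow could look like. Every name below, including the `optimium` module and its functions, is a hypothetical placeholder for illustration, not Optimium's actual API.

```python
# Hypothetical sketch only -- module and function names below are
# illustrative placeholders, not Optimium's actual API.
import optimium  # assumed package name

# Load a trained model exported from your framework of choice.
model = optimium.load("mobilenet_v3.tflite")

# Optimize for a specific target device; the same call would
# target Arm, Intel, or AMD hardware by changing one argument.
optimized = optimium.optimize(model, target="arm64-cortex-a76")

# Save the optimized artifact for deployment on the device.
optimized.save("mobilenet_v3.optimium")
```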

Why does AI inference speed matter?

Slow AI models may drop camera frames or miss incoming requests. In applications such as autonomous driving systems, those dropped frames and delayed judgments could result in tragic accidents.

Faster AI models can also reduce serving costs. For example, if a client requires a throughput of 20 requests per second and you currently meet that demand with five inference devices, a 25% increase in model speed would let you serve the same load with four devices, cutting serving costs by 20%. The arithmetic is sketched below.
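A quick worked version of that calculation, using the numbers from the example above:

```python
import math

# Numbers from the example above.
required_rps = 20                 # client's required throughput (req/s)
devices_before = 5
per_device_rps = required_rps / devices_before   # 4 req/s per device

# A 25% speedup raises per-device throughput to 5 req/s...
per_device_rps_fast = per_device_rps * 1.25

# ...so ceil(20 / 5) = 4 devices now cover the same load.
devices_after = math.ceil(required_rps / per_device_rps_fast)

cost_saving = 1 - devices_after / devices_before
print(devices_after, f"{cost_saving:.0%}")  # -> 4 20%
```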

🔥 Why should I use Optimium?

Optimium provides the fastest inference speed without requiring any manual engineering effort. Most importantly, it achieves this without compromising model accuracy, unlike compression techniques such as pruning and quantization. In fact, Optimium works complementarily with those compression techniques, since it also accepts compressed models as input.

We currently support x86, x64, and Arm64 CPUs, and GPU support is planned for early next year. Sign up here to try it out on your models!

Performance Benchmark

| Model | Target Hardware | Reference Tool | Speedup vs Reference |
| --- | --- | --- | --- |
| MobileNetV3 | AWS Graviton 3 | TensorFlow | 21.09x |
| ShuffleNetV2 | AWS Graviton 3 | TF Lite/XNNPACK | 2.30x |
| MobileNetV3 | AMD Ryzen 9 7950X | Modular MAX Engine | 2.26x |
| MobileNetV3 | AMD Ryzen 9 7950X | PyTorch | 3.26x |
| MobileNetV3 | AMD Ryzen 9 7950X | TensorFlow | 9.6x |
| NasNet Mobile | AMD Ryzen 9 7950X | Modular MAX Engine | 4.28x |
| NasNet Mobile | AMD Ryzen 9 7950X | PyTorch | 5.07x |
| NasNet Mobile | AMD Ryzen 9 7950X | TensorFlow | 16.84x |
| MediaPipe Pose Landmark (Lite) | Raspberry Pi 5 (Cortex-A76) | TF Lite/XNNPACK | 1.71x |
| MediaPipe Face Landmark | AMD Ryzen 9 7950X | OpenVINO | 1.61x |
| MediaPipe Palm Detection (Full) | Qualcomm Kryo 585 Gold (Cortex-A77) | TF Lite/XNNPACK | 1.57x |
| MediaPipe Pose Landmark (Lite) | Raspberry Pi 4 (Cortex-A72) | TF Lite/XNNPACK | 1.57x |
| MediaPipe Palm Detection (Full) | Qualcomm Kryo 585 Gold (Cortex-A77) | TF Lite/XNNPACK | 1.55x |
| MediaPipe Pose Landmark (Lite) | Raspberry Pi 4 (Cortex-A72) | TF Lite/XNNPACK | 1.50x |

📨 Contact Us

Please contact [email protected] for a prompt response. Sign up here for the Optimium Beta.