FAQs
Kompact AI provides a runtime that enables AI models to run efficiently on CPU architectures without performance degradation, allowing enterprises to achieve high-performance inference without relying on GPUs. Key features include:
- Cost-effective AI – Drastically lower inference costs compared to GPU-based deployments.
- High-performance AI – Optimized models deliver fast, reliable results even at scale.
- Accessible AI – Easy to deploy across a range of cloud and on-prem environments.
- Environmentally sustainable AI – Reduced energy consumption through efficient CPU usage.
Kompact AI does not train models or alter their weights. Instead, we focus on optimising token generation throughput and reducing system latency on standard CPU infrastructure. Once a model is optimised, we evaluate its performance using the same benchmark tests as those published by the original model developers. Optimisation continues iteratively until the model matches the original accuracy scores, ensuring no compromise in output quality.
Kompact AI enables developers to build and deploy fast, cost-efficient AI applications using models with fewer than 50 billion parameters. These models can run seamlessly on standard CPUs, both in cloud and on-prem environments.
Developers can use Kompact AI to:
- Implement RAG systems for context-rich question answering.
- Create agentic AI workflows that automate multi-step tasks and decision-making processes.
- Build lightweight LLM applications such as:
  - A Knowledge Q&A App with enterprise-grade performance.
  - A Big Data Querying Agent that interprets natural language prompts.
  - A Productivity Assistant tailored for internal workflows.
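As a concrete illustration of the RAG pattern listed above, here is a minimal, dependency-free sketch: documents are ranked by bag-of-words cosine similarity and the top matches are folded into a grounded prompt. In a real deployment the prompt would be sent to a Kompact AI-optimised model; the retrieval scheme and all names below are purely illustrative, not part of the Kompact AI API.

```python
from collections import Counter
import math

def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    num = sum(a[t] * b[t] for t in set(a) & set(b))
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def retrieve(query, docs, k=2):
    # Rank documents by similarity to the query and keep the top k.
    q = Counter(query.lower().split())
    ranked = sorted(docs, key=lambda d: cosine(q, Counter(d.lower().split())), reverse=True)
    return ranked[:k]

def build_prompt(query, docs):
    # Fold retrieved passages into a grounded prompt for the model.
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Kompact AI runs optimised models on standard CPUs.",
    "The runtime supports cloud and on-prem deployments.",
    "Autoscaling capabilities are planned.",
]
prompt = build_prompt("Which hardware does the runtime target?", docs)
```

A production system would swap the toy retriever for a vector store and embedding model; the prompt-assembly step stays structurally the same.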
Yes, Kompact AI integrates seamlessly with tools like LangChain, LlamaIndex, SuperAGI, etc. Since KAI provides a runtime to execute optimised models, the way these models are used within different applications and workflows, including those built with LangChain, remains unchanged.
New users are walked through a guided Experience Centre session to understand deployments and use cases. We also provide quick-start guides and technical documentation to support setup and integration.
Yes: as long as the model is open source and you can share the architecture details, we can work with it. We don't support optimisation for proprietary or closed-source models. Once we have the required inputs, we can optimise the model for the specified CPU hardware.
Kompact AI currently supports inference only. Fine-tuning capabilities will be available soon, but model training is not supported at this time.
Deployment is simple and takes a few minutes. Customers share their target environment details, and we provide a ready-to-run package — cloud machine images (e.g., AMIs for AWS) or on-prem virtual images (e.g., OVA).
Kompact AI is not available as an open source framework.
No. We use the original model weights without quantization or distillation. Our focus is on optimising token generation throughput and reducing system latency on standard CPU infrastructure — not modifying the model architecture or accuracy.
Support for running Kompact AI-optimised models on edge devices—including laptops and mobile phones—is planned for Q4 2025.
Please visit kompact.ai to view the list of optimised models.
Yes, Kompact AI supports inference for RAG workflows. We've developed our own RAG-based application using a KAI-optimised model.
Performance scaling depends heavily on the available CPU architecture, memory bandwidth, and the batching strategy employed. We recommend working with models under 50B parameters for cost-effective CPU-based inference.
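The interplay between batching and throughput can be sketched with simple arithmetic (the numbers below are invented for illustration, not Kompact AI benchmarks): batching a decode step raises its latency somewhat, but multiplies the tokens emitted per step, so aggregate throughput scales.

```python
def effective_throughput(tokens_per_step, step_latency_s, batch_size):
    # Aggregate tokens/sec: each decode step emits one token per sequence in the batch.
    return batch_size * tokens_per_step / step_latency_s

# Hypothetical numbers: a single sequence decodes one token per 50 ms step;
# batching 8 sequences pushes the step to 80 ms but emits 8 tokens per step.
single = effective_throughput(1, 0.050, 1)   # ~20 tokens/s
batched = effective_throughput(1, 0.080, 8)  # ~100 tokens/s
```

The same arithmetic explains the trade-off to watch on CPUs: batch sizing buys throughput at the cost of per-request latency, within the limits of cores and memory bandwidth.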
We use the same benchmarks adopted by the original model developers depending on the model type and task. This ensures a like-for-like comparison of accuracy before and after optimisation.
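The like-for-like comparison described above can be sketched as a simple parity check (the benchmark labels, predictions, and tolerance below are hypothetical, not Kompact AI's actual evaluation harness): score the model before and after optimisation on the same set, and accept only if the optimised score stays within tolerance of the baseline.

```python
def accuracy(preds, labels):
    # Fraction of exact-match predictions on a benchmark set.
    return sum(p == l for p, l in zip(preds, labels)) / len(labels)

def matches_baseline(base_acc, opt_acc, tol=0.001):
    # Accept the optimised model only if accuracy is within tolerance of the baseline.
    return opt_acc >= base_acc - tol

labels = ["A", "C", "B", "D"]                        # gold answers (illustrative)
baseline = accuracy(["A", "C", "B", "A"], labels)    # 0.75
optimised = accuracy(["A", "C", "B", "A"], labels)   # 0.75
```

In practice the check would run over the full benchmark suites published by the model developers, per task, rather than a single accuracy number.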
The optimisation timeline depends on several factors, including the chosen LLM, model architecture, target hardware, and performance expectations. We assess these inputs and provide an estimated timeline on a case-by-case basis.
We plan to release the software for testing and benchmarking in the coming weeks and will notify you once it's available. Alongside the release, we also intend to publish white papers and technical reports to provide deeper insights into the platform.
Yes. We will be introducing Autoscaling capabilities in Kompact AI.