Frequently Asked Questions

**What is Kompact AI?**
Kompact AI is a complete, end-to-end platform for AI inference, enabling LLMs of varying sizes to run on CPUs without any loss in performance. It provides everything needed to build lightweight LLM applications:
- The runtime that executes the models.
- A remote REST-based server for serving model inferences remotely.
- Observability to track model and system performance.
- OpenAI-compatible client-side SDKs in Go, Python, Java, .NET, and JavaScript for writing downstream applications that use Kompact AI models (a short sketch follows this list).
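
Because the SDKs are OpenAI-compatible, an existing OpenAI client can simply be pointed at a Kompact AI server. Below is a minimal sketch in Python; the base URL, API key, and model name are illustrative placeholders, not values from a real deployment:

```python
# Minimal sketch: calling a Kompact AI-served model through the standard
# OpenAI Python SDK. base_url, api_key, and model are placeholders --
# substitute the values from your own Kompact AI deployment.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # hypothetical Kompact AI endpoint
    api_key="not-needed",                 # use a real key if your server enforces one
)

response = client.chat.completions.create(
    model="llama-3-8b-instruct",          # placeholder model identifier
    messages=[{"role": "user", "content": "Summarise Kompact AI in one sentence."}],
)
print(response.choices[0].message.content)
```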

**What is the Kompact AI runtime?**
The runtime is the component of Kompact AI that executes models on a CPU.

**Can I run open-source models on Kompact AI?**
Yes, you can run any open-source model on the Kompact AI runtime.

**Can I run models downloaded from Hugging Face?**
Yes, you can download models from Hugging Face and run them on the Kompact AI runtime.

**Can I run proprietary or closed-source models?**
Yes, you can run proprietary or closed-source models on the Kompact AI runtime. Connect with us, and we'll guide you through the steps to run your model on Kompact AI.

**Does the runtime support fine-tuned models?**
Yes, the Kompact AI runtime supports fine-tuned models as well. Connect with us, and we'll guide you through the steps to run your model on Kompact AI.

**How are models served?**
Models are served via a remote REST-based server hosted on NGINX, which supports pluggable modules for implementing custom access controls.

**What does Kompact AI's observability cover?**
Kompact AI's observability tracks inputs, outputs, SLAs, user requests, and system metrics such as CPU, memory, and network usage. With OpenTelemetry support, it integrates seamlessly with tools like Prometheus and Grafana for monitoring.
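
As an illustration of the Prometheus side, the sketch below queries a Prometheus server that scrapes the deployment. The Prometheus URL and metric name are assumptions for illustration; check which metrics your deployment actually exports:

```python
# Minimal sketch: reading one metric from a Prometheus server that scrapes
# the Kompact AI deployment. The URL and the metric name are assumptions.
import requests

resp = requests.get(
    "http://localhost:9090/api/v1/query",           # hypothetical Prometheus server
    params={"query": "process_cpu_seconds_total"},  # a standard process-level metric
    timeout=10,
)
resp.raise_for_status()
for series in resp.json()["data"]["result"]:
    print(series["metric"], series["value"])
```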

**How do I access a model for inferencing?**
There are two ways to access a model for inferencing in Kompact AI:
- Plain HTTP(S) calls (a sketch follows this list).
- OpenAI-compatible client-side SDKs in Go, Python, Java, .NET, and JavaScript, which require no code changes.
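
For the plain HTTP(S) route, here is a minimal sketch assuming the server exposes an OpenAI-style /v1/chat/completions route; the URL, key, and model name are placeholders:

```python
# Minimal sketch: plain HTTP(S) access with the requests library, assuming
# an OpenAI-style chat-completions route. URL, key, and model name are
# placeholders for your deployment's values.
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",  # hypothetical endpoint
    headers={"Authorization": "Bearer not-needed"},
    json={
        "model": "llama-3-8b-instruct",           # placeholder model identifier
        "messages": [{"role": "user", "content": "Hello!"}],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```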

**Does Kompact AI alter model weights?**
No. Kompact AI does not alter the weights of the model.

**Does Kompact AI distil models?**
No. Kompact AI does not perform any distillation of a model.

**Is there any performance degradation in optimised models?**
No. There is no performance degradation in models optimised by Kompact AI.

**How do you ensure there is no loss in quality?**
After optimisation, we benchmark the model against the original developers' tests and iterate until it matches the original accuracy, ensuring no loss in quality.

**What model sizes do you support?**
We currently focus on models with under 50 billion parameters. Support for larger models will be available from Q1 2026.

**Can I build a RAG application?**
Yes, you can build a RAG application using a model optimised by Kompact AI. Connect with us, and we'll guide you through the steps to run the model on Kompact AI.

**Can I build an agentic AI application?**
Yes, you can build an agentic AI application using a model optimised by Kompact AI. Connect with us, and we'll guide you through the steps to run the model on Kompact AI.

**Does Kompact AI integrate with LangChain?**
Yes, Kompact AI integrates seamlessly with LangChain.
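
A minimal sketch of that integration, pointing LangChain's OpenAI-compatible chat client at a Kompact AI server; base_url, api_key, and model are placeholders:

```python
# Minimal sketch: using LangChain's ChatOpenAI against an OpenAI-compatible
# Kompact AI endpoint. All connection values are placeholders.
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="http://localhost:8080/v1",  # hypothetical Kompact AI endpoint
    api_key="not-needed",
    model="llama-3-8b-instruct",          # placeholder model identifier
)
print(llm.invoke("What is CPU inference?").content)
```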

**Does Kompact AI integrate with LlamaIndex?**
Yes, Kompact AI integrates seamlessly with LlamaIndex.
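
A minimal sketch of the LlamaIndex side, using its OpenAILike wrapper for OpenAI-compatible servers; api_base, api_key, and model are placeholders:

```python
# Minimal sketch: using LlamaIndex's OpenAILike LLM against an
# OpenAI-compatible Kompact AI endpoint. All connection values are placeholders.
from llama_index.llms.openai_like import OpenAILike

llm = OpenAILike(
    api_base="http://localhost:8080/v1",  # hypothetical Kompact AI endpoint
    api_key="not-needed",
    model="llama-3-8b-instruct",          # placeholder model identifier
    is_chat_model=True,
)
print(llm.complete("What is CPU inference?"))
```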

**Does Kompact AI support fine-tuning or training?**
Kompact AI currently supports inference only. Fine-tuning capabilities will be available soon, but model training is not supported at this time.

**Can I run quantised models?**
Yes, very much. You can run any quantised model as long as the CPU supports it. Connect with us, and we'll guide you through the steps to run the model on Kompact AI.

**Where are Kompact AI model images available?**
Kompact AI model images are available on Google Cloud, Microsoft Azure, and AWS.

**Is Kompact AI open source?**
No, Kompact AI is not available as an open-source framework.

**What are the system requirements?**
System requirements vary based on the model being executed and the desired throughput; they are determined case by case, depending on the specific use case.

**Can I deploy on-premises?**
Yes, you can deploy Kompact AI-optimised models on on-premises servers.

**Do you use quantisation or distillation to achieve performance?**
No. We use original model weights without quantisation or distillation, focusing solely on boosting throughput and reducing latency on standard CPUs without altering architecture or accuracy.

**Which models have you optimised?**
Please have a look at our models page, which lists the models we have optimised. If your model is not listed, let us know and we will incorporate it. Alternatively, you can use Kompact AI and optimise it on your own.

**Does Kompact AI support RAG workflows?**
Yes, Kompact AI supports inference for RAG workflows. We've developed our own RAG-based application using a KAI-optimised model.
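
As a hedged illustration, the sketch below wires a naive keyword-overlap retriever to an OpenAI-compatible chat call; a real application would use a vector store and proper embeddings, and every connection value below is a placeholder:

```python
# Minimal RAG sketch over a Kompact AI-served model: retrieve the document
# sharing the most words with the query, then answer with that context.
# Endpoint, key, and model name are placeholders.
from openai import OpenAI

docs = [
    "Kompact AI runs LLM inference on CPUs.",
    "Models are served via a REST-based server hosted on NGINX.",
]

def retrieve(query: str) -> str:
    # Naive retriever: pick the document with the largest word overlap.
    words = set(query.lower().split())
    return max(docs, key=lambda d: len(words & set(d.lower().split())))

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")
question = "Where does Kompact AI run inference?"
answer = client.chat.completions.create(
    model="llama-3-8b-instruct",  # placeholder model identifier
    messages=[
        {"role": "system", "content": f"Answer using this context: {retrieve(question)}"},
        {"role": "user", "content": question},
    ],
)
print(answer.choices[0].message.content)
```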

**Which CPUs does Kompact AI support?**
Currently, Kompact AI models run on Intel CPUs. We plan to release models optimised for AMD, Ampere, Qualcomm, and ARM very soon.

**How is model memory managed?**
Kompact AI itself manages model memory.

**Do you use PyTorch?**
No, we do not use PyTorch.

**Does Kompact AI autoscale?**
Yes, Kompact AI autoscales. We can work with your DevOps team and show you how it can be achieved.

**How do I get help with my use case?**
We'd be happy to help. Please write to us at contact@ziroh.com with a brief description of what you're building, and our team will get back to you.