A unified platform for AI development and deployment, including MAX🧑‍🚀 and Mojo🔥.
The Modular Platform is an open and fully-integrated suite of AI libraries and tools that accelerates model serving and scales GenAI deployments. It abstracts away hardware complexity so you can run the most popular open models with industry-leading GPU and CPU performance without any code changes.
You don't need to clone this repo. You can install Modular as a pip or conda package and then start an OpenAI-compatible endpoint with a model of your choice.
You can start a local LLM endpoint with just two commands:
```sh
pip install modular
max serve --model-path=modularai/Llama-3.1-8B-Instruct-GGUF
```
Then send inference requests to the Llama 3 model using our OpenAI-compatible REST API.
Or try running hundreds of other models from our model repository.
For a complete walkthrough, see the quickstart guide.
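Once the server is running, any OpenAI-compatible client can talk to it. Here is a minimal sketch using only the Python standard library; the endpoint URL (`http://localhost:8000/v1`, the default port published above) and the prompt are illustrative assumptions, and the model name must match the `--model-path` you served:

```python
# Sketch: send a chat completion request to the local MAX endpoint.
# Assumes the server from `max serve` is listening on localhost:8000.
import json
from urllib import request

# The request body follows the OpenAI chat completions schema.
payload = {
    "model": "modularai/Llama-3.1-8B-Instruct-GGUF",
    "messages": [{"role": "user", "content": "What is the capital of France?"}],
}

def chat(url="http://localhost:8000/v1/chat/completions"):
    """POST the payload and return the assistant's reply text."""
    req = request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Because the endpoint speaks the OpenAI protocol, the official `openai` Python client also works against it by pointing `base_url` at the local server.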
The MAX container is our Kubernetes-compatible Docker container for convenient
deployment, using the same inference server you get from the max serve
command shown above.
You can try it now with this command:
```sh
docker run --gpus=1 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  -p 8000:8000 \
  docker.modular.com/modular/max-nvidia-full:latest \
  --model-path modularai/Llama-3.1-8B-Instruct-GGUF
```
For more information, see our MAX container docs or the Modular Docker Hub repository.
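The container serves the same OpenAI-compatible API on the published port, so a quick way to check that it is up is to list the models it serves. This sketch assumes the standard OpenAI `GET /v1/models` route is exposed at the default address:

```python
# Sketch: verify the containerized server is reachable by listing
# its served models via the OpenAI-compatible /v1/models route.
import json
from urllib import request

def list_models(base_url="http://localhost:8000/v1"):
    """Return the ids of models the endpoint reports serving."""
    with request.urlopen(f"{base_url}/models") as resp:
        return [m["id"] for m in json.load(resp)["data"]]
```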
We're constantly open-sourcing more of the Modular Platform, and you can find all of it here.
Highlights include:
- Mojo standard library: /mojo/stdlib
- MAX GPU and CPU kernels: /max/kernels (Mojo kernels)
- MAX inference serving layer: /max/serve (OpenAI-compatible endpoint)
- MAX model pipelines: /max/pipelines (Python-based graphs)
- Code examples: /examples
- Tutorials: /tutorials
This repo has two primary branches:
- The `main` branch, which is in sync with the nightly build and subject to new bugs. Use this branch for contributions, or if you installed the nightly build.
- The `stable` branch, which is in sync with the last stable released version of Mojo. Use the examples here if you installed the stable build.
Thanks for your interest in contributing to this repository!
We accept contributions to the Mojo standard library and MAX AI kernels. We currently aren't accepting contributions for other parts of the repository.
Please see the Contribution Guide for instructions.
We also welcome your bug reports. If you have a bug, please file an issue here.
If you'd like to chat with the team and other community members, join us on our Discord channel or our forum.
This repository and its contributions are licensed under the Apache License v2.0 with LLVM Exceptions (see the LLVM License). Modular, MAX and Mojo usage and distribution are licensed under the Modular Community License.
You are entirely responsible for checking and validating the licenses of third-party software and libraries (e.g., models downloaded from Hugging Face).