Mirror of https://github.com/ml-explore/mlx-examples.git (synced 2025-08-29)
Export LLMs to C++
Export language model inference from Python to run directly in C++.
To run, first install the requirements:
pip install -U mlx-lm
Then generate text from Python with:
python export.py generate "How tall is K2?"
To export the generation function, run:
python export.py export
Then build the C++ code (requires CMake):
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build
And run the generation from C++ with:
./build/main llama3.1-instruct-4bit "How tall is K2?"