In this tutorial, we walk you through setting up and building the MPS backend for ExecuTorch and running a simple model on it.
The MPS backend device maps machine learning computational graphs and primitives onto the MPS Graph framework and the tuned kernels provided by MPS (Metal Performance Shaders).
::::{grid} 2
:::{grid-item-card} What you will learn in this tutorial:
:class-card: card-prerequisites
- In this tutorial you will learn how to export the MobileNet V3 model to the MPS delegate.
- You will also learn how to compile and deploy the ExecuTorch runtime with the MPS delegate on macOS and iOS.
:::
:::{grid-item-card} Tutorials we recommend you complete before this:
:class-card: card-prerequisites
- Introduction to ExecuTorch
- Getting Started
- Building ExecuTorch with CMake
- ExecuTorch iOS Demo App
- ExecuTorch iOS LLaMA Demo App
:::
::::
To build and run a model with the MPS backend for ExecuTorch, you'll need the following hardware and software components:
- A Mac for tracing the model
Step 1. Complete the steps in Getting Started to set up the ExecuTorch development environment.
You will also need a local clone of the ExecuTorch repository. See Building ExecuTorch from Source for instructions. All commands in this document should be run from the executorch repository.
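
Since every command in this tutorial assumes a Mac and a working directory inside the repository, a small preflight check can catch a wrong directory early. The sketch below is not part of ExecuTorch: `preflight` and `EXPECTED_PATHS` are hypothetical names, and the listed paths are simply the ones that appear in the commands of this tutorial (the `.py` path is an assumption derived from the module name `examples.apple.mps.scripts.mps_example`).

```python
import platform
from pathlib import Path

# Paths taken from the commands used in this tutorial, relative to the
# repository root. The mps_example.py entry is inferred from the module name
# and is an assumption, not something the tutorial states.
EXPECTED_PATHS = (
    "install_executorch.sh",
    "examples/apple/mps/scripts/mps_example.py",
    "examples/apple/mps/scripts/build_mps_executor_runner.sh",
    "scripts/build_apple_frameworks.sh",
)


def preflight(repo_root=".", system=None):
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    # platform.system() reports "Darwin" on macOS.
    system = platform.system() if system is None else system
    if system != "Darwin":
        problems.append(f"this tutorial expects a Mac, but the system is {system!r}")
    root = Path(repo_root)
    for rel in EXPECTED_PATHS:
        if not (root / rel).exists():
            problems.append(f"missing {rel}; is {root} the executorch repository?")
    return problems
```

Calling `preflight()` from the repository directory and printing each returned string is enough for a quick manual check; the `system` parameter exists only so the function can be exercised on other platforms.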
Compiling the model for the MPS delegate:
- In this step, you will generate a simple ExecuTorch program that lowers the MobileNetV3 model to the MPS delegate. You'll then pass this Program (the `.pte` file) to the runtime to run it using the MPS backend.
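
As a quick orientation, the export flags determine which `.pte` file you later hand to the runner. The sketch below assembles the export command as an argument list and guesses the resulting filename. Both helpers are hypothetical, and the naming scheme is an assumption inferred from the single filename this tutorial shows (`mv3_mps_float16_bundled.pte`); in particular, the `float32` and non-bundled variants are guesses, so check the script's actual output.

```python
def export_command(model_name, use_fp16=True, bundled=False):
    """Build the mps_example invocation as an argument list (for subprocess.run)."""
    cmd = [
        "python3", "-m", "examples.apple.mps.scripts.mps_example",
        f"--model_name={model_name}",
        # Both flag spellings appear in this tutorial's commands.
        "--use_fp16" if use_fp16 else "--no-use_fp16",
    ]
    if bundled:
        cmd.append("--bundled")
    return cmd


def expected_pte_name(model_name, use_fp16=True, bundled=False):
    """Guess the output filename. ASSUMPTION: inferred from one example name."""
    dtype = "float16" if use_fp16 else "float32"  # "float32" is a guess
    suffix = "_bundled" if bundled else ""        # non-bundled form is a guess
    return f"{model_name}_mps_{dtype}{suffix}.pte"
```

For the export used in this tutorial, `expected_pte_name("mv3", use_fp16=True, bundled=True)` gives `mv3_mps_float16_bundled.pte`, which is the `--model_path` passed to the runner.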
```bash
cd executorch
# Note: by default, the `mps_example` script uses the MPSPartitioner for ops that are not yet supported by the MPS delegate. To turn it off, pass `--no-use_partitioner`.
python3 -m examples.apple.mps.scripts.mps_example --model_name="mv3" --bundled --use_fp16

# To see all options, run the following command:
python3 -m examples.apple.mps.scripts.mps_example --help
```
Building the MPS executor runner:
```bash
# In this step, you'll build the `mps_executor_runner`, which can run MPS-lowered modules:
cd executorch
./examples/apple/mps/scripts/build_mps_executor_runner.sh

# Run the generated mv3 model with the runner:
./cmake-out/examples/apple/mps/mps_executor_runner --model_path mv3_mps_float16_bundled.pte --bundled_program
```
- You should see the following results. Note that no output file will be generated in this example:
```
I 00:00:00.003290 executorch:mps_executor_runner.mm:286] Model file mv3_mps_float16_bundled.pte is loaded.
I 00:00:00.003306 executorch:mps_executor_runner.mm:292] Program methods: 1
I 00:00:00.003308 executorch:mps_executor_runner.mm:294] Running method forward
I 00:00:00.003311 executorch:mps_executor_runner.mm:349] Setting up non-const buffer 1, size 606112.
I 00:00:00.003374 executorch:mps_executor_runner.mm:376] Setting up memory manager
I 00:00:00.003376 executorch:mps_executor_runner.mm:392] Loading method name from plan
I 00:00:00.018942 executorch:mps_executor_runner.mm:399] Method loaded.
I 00:00:00.018944 executorch:mps_executor_runner.mm:404] Loading bundled program...
I 00:00:00.018980 executorch:mps_executor_runner.mm:421] Inputs prepared.
I 00:00:00.118731 executorch:mps_executor_runner.mm:438] Model executed successfully.
I 00:00:00.122615 executorch:mps_executor_runner.mm:501] Model verified successfully.
```
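
If you run the runner from a script, you may want to check this log automatically instead of reading it by eye. The following is a minimal sketch that parses lines in the format shown above and reports whether the "executed" and "verified" messages appeared. The function names are hypothetical, the regular expression is derived only from the sample output, and the time between "Inputs prepared." and "Model executed successfully." is at best a rough proxy for execution time, not a benchmark.

```python
import re

# Matches lines such as:
# I 00:00:00.118731 executorch:mps_executor_runner.mm:438] Model executed successfully.
LOG_LINE = re.compile(
    r"^(?P<level>[A-Z]) (?P<h>\d+):(?P<m>\d+):(?P<s>\d+\.\d+) "
    r"executorch:(?P<file>[^:]+):(?P<line>\d+)\] (?P<msg>.*)$"
)


def parse_runner_log(text):
    """Return (seconds, message) tuples for every line that looks like a log line."""
    events = []
    for raw in text.splitlines():
        match = LOG_LINE.match(raw.strip())
        if match is None:
            continue  # skip anything that is not in the expected format
        seconds = int(match["h"]) * 3600 + int(match["m"]) * 60 + float(match["s"])
        events.append((seconds, match["msg"]))
    return events


def summarize(events):
    """Report whether the run executed and verified, plus a rough elapsed time."""
    def stamp(prefix):
        # Timestamp of the first message starting with the prefix, or None.
        return next((t for t, msg in events if msg.startswith(prefix)), None)

    start = stamp("Inputs prepared")
    end = stamp("Model executed successfully")
    return {
        "executed": end is not None,
        "verified": stamp("Model verified successfully") is not None,
        "execute_seconds": None if start is None or end is None else end - start,
    }
```

To use it, capture the runner's output (for example with `subprocess.run(..., capture_output=True, text=True)`) and pass the text to `summarize(parse_runner_log(text))`.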
- Make sure `pybind` MPS support was installed:
```bash
CMAKE_ARGS="-DEXECUTORCH_BUILD_MPS=ON" ./install_executorch.sh
```
- Run the `mps_example` script to trace the model and run it directly from Python:
```bash
cd executorch
# Check correctness between the PyTorch eager forward pass and the ExecuTorch MPS delegate forward pass
python3 -m examples.apple.mps.scripts.mps_example --model_name="mv3" --no-use_fp16 --check_correctness
# You should see the following output: `Results between ExecuTorch forward pass with MPS backend and PyTorch forward pass for mv3_mps are matching!`

# Check performance between the PyTorch MPS forward pass and the ExecuTorch MPS forward pass
python3 -m examples.apple.mps.scripts.mps_example --model_name="mv3" --no-use_fp16 --bench_pytorch
```
- [Optional] Generate an ETRecord while you're exporting your model.
```bash
cd executorch
python3 -m examples.apple.mps.scripts.mps_example --model_name="mv3" --generate_etrecord -b
```
- Run your Program on the ExecuTorch runtime and generate an ETDump.
```bash
./cmake-out/examples/apple/mps/mps_executor_runner --model_path mv3_mps_float16_bundled.pte --bundled_program --dump-outputs
```
- Create an instance of the Inspector API by passing in the ETDump you sourced from the runtime, along with the ETRecord you optionally generated while exporting the model.
```bash
python3 -m devtools.inspector.inspector_cli --etdump_path etdump.etdp --etrecord_path etrecord.bin
```
Step 1. Create the ExecuTorch core and MPS delegate frameworks to link on iOS
```bash
cd executorch
./scripts/build_apple_frameworks.sh --mps
```
`mps_delegate.xcframework` will be in the `cmake-out` folder, along with `executorch.xcframework` and `portable_delegate.xcframework`:
```bash
cd cmake-out && ls
```
Step 2. Link the frameworks into your Xcode project:
Go to the project target's Build Phases > Link Binary With Libraries, click the + sign, and add the following frameworks, located in the `Release` folder:
- `executorch.xcframework`
- `portable_delegate.xcframework`
- `mps_delegate.xcframework`

From the same page, include the libraries needed by the MPS delegate:
- `MetalPerformanceShaders.framework`
- `MetalPerformanceShadersGraph.framework`
- `Metal.framework`
In this tutorial, you have learned how to lower a model to the MPS delegate, build the `mps_executor_runner`, and run a lowered model through the MPS delegate, either with the runner or directly on device using the MPS delegate static library.
If you encounter any bugs or issues while following this tutorial, please file an issue on the ExecuTorch repository with the hashtag #mps.