Reasoning models are a new class of large language models (LLMs) designed to tackle highly complex tasks by employing chain-of-thought (CoT) reasoning, with the tradeoff of taking longer to respond. DeepSeek R1 is a recently released frontier reasoning model that has been distilled into highly capable smaller models. Deploying these DeepSeek R1 distilled models on AMD Ryzen™ AI processors and Radeon™ graphics cards is incredibly easy and available now through LM Studio.
Reasoning models add a “thinking” stage before the final output, which you can inspect by expanding the “thinking” window before the model gives its final answer. Unlike conventional LLMs, which produce the response in one shot, CoT LLMs perform extensive reasoning before answering. The assumptions and self-reflection the LLM performs are visible to the user, and this improves the model’s reasoning and analytical capability – albeit at the cost of a significantly longer time to the first token of the final output.
A reasoning model may first spend thousands of tokens (and you can view this chain of thought!) analyzing the problem before giving a final response. This makes the model excellent at complex problem-solving tasks involving math and science, attacking a problem from all angles before deciding on a response. Depending on your AMD hardware, each of these models will offer state-of-the-art reasoning capability on your AMD Ryzen™ AI processor or Radeon™ graphics cards.
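If you work with the model’s raw output programmatically, the visible chain of thought and the final answer can be separated. DeepSeek R1 distills typically wrap their reasoning in `<think>...</think>` tags; a minimal sketch of splitting the two (the function name and sample text are illustrative, not part of any official API):

```python
import re

def split_reasoning(response: str) -> tuple[str, str]:
    """Separate the chain-of-thought from the final answer.

    DeepSeek R1 distills typically wrap their reasoning in
    <think>...</think> tags; everything after the closing tag
    is the final answer.
    """
    match = re.search(r"<think>(.*?)</think>", response, re.DOTALL)
    if match is None:
        # No thinking block found: treat the whole output as the answer.
        return "", response.strip()
    reasoning = match.group(1).strip()
    answer = response[match.end():].strip()
    return reasoning, answer

sample = "<think>2 + 2: add the units digits... 4.</think>The answer is 4."
reasoning, answer = split_reasoning(sample)
```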
How to run DeepSeek R1 Distilled “Reasoning” Models on AMD Ryzen™ AI and Radeon™ Graphics Cards
Follow these simple steps to get up and running with DeepSeek R1 distillations in just a few minutes (dependent upon download speed).
Please make sure you are using the optional driver Adrenalin 25.1.1, which can be downloaded directly by clicking this link.
Step 1: Make sure you are on the 25.1.1 Optional or higher Adrenalin driver.
Step 2: Download LM Studio 0.3.8 or above from lmstudio.ai/ryzenai
Step 3: Install LM Studio and skip the onboarding screen.
Step 4: Click on the discover tab.
Step 5: Choose your DeepSeek R1 Distill. Smaller distills like the Qwen 1.5B offer blazing fast performance (and are the recommended starting point) while bigger distills will offer superior reasoning capability. All of them are extremely capable. The table below details the maximum recommended DeepSeek R1 Distill size:
| Processor | Memory | DeepSeek R1 Distill* (Max Supported) |
|---|---|---|
| AMD Ryzen™ AI Max+ 395 | 32 GB, 64 GB and 128 GB | DeepSeek-R1-Distill-Llama-70B (64 GB and 128 GB only); DeepSeek-R1-Distill-Qwen-32B |
| AMD Ryzen™ AI HX 370 and 365 | 24 GB and 32 GB | DeepSeek-R1-Distill-Qwen-14B |
| AMD Ryzen™ 8040 and Ryzen™ 7040 | 32 GB | DeepSeek-R1-Distill-Llama-14B |
| Graphics Card | DeepSeek R1 Distill* (Max Supported) |
|---|---|
AMD Radeon™ RX 7900 XTX | DeepSeek-R1-Distill-Qwen-32B |
AMD Radeon™ RX 7900 XT | DeepSeek-R1-Distill-Qwen-14B |
AMD Radeon™ RX 7900 GRE | DeepSeek-R1-Distill-Qwen-14B |
AMD Radeon™ RX 7800 XT | DeepSeek-R1-Distill-Qwen-14B |
AMD Radeon™ RX 7700 XT | DeepSeek-R1-Distill-Qwen-14B |
AMD Radeon™ RX 7600 XT | DeepSeek-R1-Distill-Qwen-14B |
AMD Radeon™ RX 7600 | DeepSeek-R1-Distill-Llama-8B |
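The table above roughly tracks available memory: a larger distill needs a larger download and more VRAM. As a back-of-the-envelope sketch (assuming Q4_K_M averages around 4.85 bits per weight, which varies by architecture, and ignoring context cache and runtime overhead):

```python
def q4_model_size_gb(params_billion: float, bits_per_weight: float = 4.85) -> float:
    """Rough on-disk size of a Q4_K_M quantized model in GB.

    The 4.85 bits-per-weight figure is an assumption for Q4_K_M;
    the exact average differs per model. KV cache and runtime
    overhead are not included, so leave headroom in VRAM.
    """
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# A 14B distill works out to roughly 8-9 GB at Q4_K_M.
size_14b = q4_model_size_gb(14)
```

This is why the 14B distills line up with 16 GB-class cards, while the 32B distill calls for a 24 GB card like the RX 7900 XTX.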
Step 6: On the right-hand side, make sure the “Q4_K_M” quantization is selected and click “Download”.
Step 7: Once downloaded, head back to the chat tab, select the DeepSeek R1 distill from the drop-down menu, and make sure “manually select parameters” is checked.
Step 8: In “GPU offload layers”, move the slider all the way to the max.
Step 9: Click model load.
Step 10: Interact with a reasoning model running completely on your local AMD hardware!
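Beyond the chat window, LM Studio can also serve the model over a local, OpenAI-compatible HTTP endpoint (by default at http://localhost:1234/v1 once the server is started from LM Studio’s developer/server view). A minimal sketch of building such a request with only the standard library; the model identifier shown is an assumption and should match whatever name LM Studio displays for your loaded distill:

```python
import json
import urllib.request

# Default address of LM Studio's local OpenAI-compatible server.
URL = "http://localhost:1234/v1/chat/completions"

def build_request(prompt: str, model: str = "deepseek-r1-distill-qwen-7b") -> urllib.request.Request:
    """Construct a chat-completion request for the local server."""
    payload = {
        "model": model,  # use the identifier LM Studio shows for your model
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.6,  # a moderate temperature commonly suggested for R1 distills
    }
    return urllib.request.Request(
        URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

req = build_request("How many primes are there below 100?")
# With the server running, send it via: urllib.request.urlopen(req)
```

Everything, including the chain of thought in the response, stays on your local AMD hardware.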
Covered By: NCN MAGAZINE / AMD