The easiest way to run an LLM locally on your Mac
Last Update: Jun 7, 2024
I’ve written about running LLMs (large language models) on your local machine for a while now. I play with this sort of thing nearly every day. So, I’m always looking for cool things to do in this space and easy ways to introduce others (like you) to the world of LLMs. This is my latest installment.
While I’ve been using Ollama a ton lately, I saw a new product called LM Studio come up and thought I’d give it a shot. I installed it and tried it out, and this is my impression of LM Studio. But there’s a twist!
I’m doing this on my trusty old Mac Mini! Usually, I do LLM work on my Digital Storm PC, which runs Windows 11 and Arch Linux with an NVIDIA RTX 4090, and it runs local models really fast. But I’ve wanted to try this stuff on my M1 Mac for a while now, so I decided to try LM Studio on it.
Will it work? Will it be fast? Let’s find out!
Installing LM Studio on Mac
This installation process couldn’t be any easier. I went to the LM Studio website and clicked the download button.
Then, of course, you just drag the app to your Applications folder.
The first screen that comes up is the LM Studio home screen, and it’s pretty cool. It has a bunch of models listed, and you can click on them to see more information about them.
You can select a model and download it.
Running a Model Under an Inference Server
There’s a cool option here to run it as an inference server and write code to talk to it.
Running as an “inference server” loads the model behind a lightweight interface with minimal overhead. That way, you can talk to the model directly through an API and customize the interaction however you like.
It even provides the code to run in several languages if you want to connect to it.
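To give you a feel for it, here’s a rough Python sketch of talking to that server. I’m assuming the defaults: LM Studio listening on port 1234 and exposing an OpenAI-compatible chat completions endpoint. The model name and prompt below are just placeholders, so swap in whatever you’ve actually loaded.

```python
# Rough sketch: send one chat request to a model served by LM Studio's local
# inference server. Assumes the default address (http://localhost:1234) and the
# OpenAI-compatible /v1/chat/completions endpoint.
import requests

response = requests.post(
    "http://localhost:1234/v1/chat/completions",
    json={
        "model": "local-model",  # placeholder; the server uses whichever model you loaded
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Explain what an inference server is in one sentence."},
        ],
        "temperature": 0.7,
    },
    timeout=120,
)

print(response.json()["choices"][0]["message"]["content"])
```

Because the endpoint mimics the OpenAI API, most existing client libraries and snippets should work against it with little more than a changed base URL.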
Running a Model as a Chat Server
You can run a chat server if you’re more familiar with things like ChatGPT.
Custom Options
There are a few configuration options for working with these models, including preset styles and a toggle to use the Apple Metal GPU. Cool!
I found the Microsoft Phi 2 model to be very responsive, generating clean results quickly. It runs at “real-time” chat speed and produces large amounts of text fairly quickly.
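If you want that same token-by-token, real-time feel when you’re hitting the server instead of the chat window, streaming is the way to go. Here’s a rough sketch using the openai Python package pointed at the local server; again I’m assuming the default port, and the API key is just a dummy value since a local server doesn’t check it.

```python
# Rough sketch: stream tokens from the local LM Studio server as they're generated,
# using the openai Python client pointed at the local base URL.
from openai import OpenAI

# The API key is a placeholder; the local server doesn't validate it.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

stream = client.chat.completions.create(
    model="local-model",  # placeholder; whichever model you loaded (e.g., Phi 2)
    messages=[{"role": "user", "content": "Summarize why local LLMs are useful."}],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)  # tokens print as they arrive
print()
```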
How Does It Run a 7B Model?
Back on the home screen, let’s pick out a 7B model. I run these routinely on my Windows machine with an RTX 4090, and I don’t think my M1 will get anywhere close, but it’s certainly worth a try.
I loaded it up and found it surprisingly quick for this hardware. It’s still too slow for a chat model you’d run on a web page, for instance, if you wanted to simulate chatting with a real person, but it’s certainly usable for question-and-answer prompting or code generation. Not bad at all!
Let’s throw a programming question at it.
It generated some cool code pretty fast. I’ll experiment with this more in the coming days; it’s a really neat interface.
Should You Try This on Your Mac?
I was surprised by two things. First, the installation is SO easy: it’s as easy as installing any other application, and you can get going in minutes.
Second, it’s fast on the M1 Mac, and I can only imagine it’s much quicker on an M2 or M3 machine. It’s nowhere near as fast as these models run on my 4090, but it’s much faster than I expected.
Whether you’re a seasoned LLM expert or just a little curious about LLMs and Generative AI, this is a great product to try out. Heck, it’s free, so why not?
You can download LM Studio for Mac, Linux, and Windows from the LM Studio website.
Now go make some cool stuff!
Questions? Comments? Let me know!