With iOS 26, Apple introduces the Foundation Models framework, a privacy-first, on-device AI toolkit that brings the same language models behind Apple Intelligence right into your apps. The framework is available across Apple platforms, including iOS, macOS, iPadOS, and visionOS, and it provides developers with a streamlined Swift API for integrating advanced AI features directly into their apps.
Unlike cloud-based LLMs such as ChatGPT or Claude, which run on powerful servers and require internet access, Apple's LLM is designed to run entirely on-device. This architectural difference gives it a unique advantage: all data stays on the user's device, ensuring privacy, lower latency, and offline access.
The framework opens the door to a whole range of intelligent features you can build right out of the box. You can generate and summarize content, classify information, and even build in semantic search and personalized learning experiences. Whether you want to create a smart in-app guide, generate unique content for each user, or add a conversational assistant, you can now do it with just a few lines of Swift code.
In this tutorial, we'll explore the Foundation Models framework. You'll learn what it is, how it works, and how to use it to generate content with Apple's on-device language models.
Ready to get started? Let's dive in.
The Demo App: Ask Me Anything

It's always great to learn new frameworks or APIs by building a demo app, and that's exactly what we'll do in this tutorial. We'll create a simple yet powerful app called Ask Me Anything to explore how Apple's new Foundation Models framework works in iOS 26.
The app lets users type in any question and provides an AI-generated response, all processed on-device using Apple's built-in LLM.
By building this demo app, you'll learn how to integrate the Foundation Models framework into a SwiftUI app. You'll also understand how to create prompts and capture both full and partial generated responses.
Using the Default System Language Model
Apple provides a built-in model called SystemLanguageModel, which gives you access to the on-device foundation model that powers Apple Intelligence. For general-purpose use, you can access the base version of this model via the default property. It is optimized for text generation tasks and serves as a great starting point for building features like content generation or question answering in your app.
To use it in your app, you first need to import the FoundationModels framework:
import FoundationModels
With the framework imported, you can get a handle on the default system language model. Here's the sample code to do that:
struct ContentView: View {
    private var model = SystemLanguageModel.default

    var body: some View {
        switch model.availability {
        case .available:
            mainView
        case .unavailable(let reason):
            Text(unavailableMessage(reason))
        }
    }

    private var mainView: some View {
        ScrollView {
            .
            .
            .
        }
    }

    private func unavailableMessage(_ reason: SystemLanguageModel.Availability.UnavailableReason) -> String {
        switch reason {
        case .deviceNotEligible:
            return "The device is not eligible for using Apple Intelligence."
        case .appleIntelligenceNotEnabled:
            return "Apple Intelligence is not enabled on this device."
        case .modelNotReady:
            return "The model isn't ready because it's downloading or because of other system reasons."
        @unknown default:
            return "The model is unavailable for an unknown reason."
        }
    }
}
Since Foundation Models only works on devices with Apple Intelligence enabled, it's important to verify that a model is available before using it. You can check its readiness by inspecting the availability property.
Implementing the UI
Let's proceed to build the UI of the mainView. We first add two state variables to store the user's question and the generated answer:
@State private var answer: String = ""
@State private var question: String = ""
For the UI implementation, update the mainView like this:
private var mainView: some View {
    ScrollView {
        VStack {
            Text("Ask Me Anything")
                .font(.system(.largeTitle, design: .rounded, weight: .bold))

            TextField("", text: $question, prompt: Text("Type your question here"), axis: .vertical)
                .lineLimit(3...5)
                .padding()
                .background {
                    Color(.systemGray6)
                }
                .font(.system(.title2, design: .rounded))

            Button {
            } label: {
                Text("Get answer")
                    .frame(maxWidth: .infinity)
                    .font(.headline)
            }
            .buttonStyle(.borderedProminent)
            .controlSize(.extraLarge)
            .padding(.top)

            Rectangle()
                .frame(height: 1)
                .foregroundColor(Color(.systemGray5))
                .padding(.vertical)

            Text(LocalizedStringKey(answer))
                .font(.system(.body, design: .rounded))
        }
        .padding()
    }
}
The implementation is pretty straightforward – I just added a touch of basic styling to the text field and button.

Generating Responses with the Language Model
Now we've come to the core part of the app: sending the question to the model and generating the response. To handle this, we create a new function called generateAnswer():
private func generateAnswer() async {
    let session = LanguageModelSession()
    do {
        let response = try await session.respond(to: question)
        answer = response.content
    } catch {
        answer = "Failed to answer the question: \(error.localizedDescription)"
    }
}
As you can see, it only takes a few lines of code to send a question to the model and receive a generated response. First, we create a session using the default system language model. Then, we pass the user's question, known as a prompt, to the model using the respond method.
The call is asynchronous because it usually takes a few seconds (or even longer) for the model to generate the response. Once the response is ready, we can access the generated text through the content property and assign it to answer for display.
To invoke this new function, we also need to update the closure of the "Get answer" button like this:
Button {
    Task {
        await generateAnswer()
    }
} label: {
    Text("Show me the answer")
        .frame(maxWidth: .infinity)
        .font(.headline)
}
You can test the app directly in the preview pane, or run it in the simulator. Just type in a question, wait a few seconds, and the app will generate a response for you.
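If you want more control over the output, the respond call can also accept a GenerationOptions value. The snippet below is a minimal sketch, assuming the GenerationOptions initializer with a temperature parameter from the iOS 26 SDK; lower temperatures make answers more focused and deterministic, higher ones more varied.

```swift
import FoundationModels

// Sketch: tuning generation with GenerationOptions. The temperature
// parameter is assumed from the iOS 26 SDK; adjust to the actual API.
func generateCreativeAnswer(to question: String) async -> String {
    let session = LanguageModelSession()
    let options = GenerationOptions(temperature: 1.5)
    do {
        let response = try await session.respond(to: question, options: options)
        return response.content
    } catch {
        return "Failed to answer the question: \(error.localizedDescription)"
    }
}
```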

Reusing the Session
The code above creates a new session for each question, which works well when the questions are unrelated.
But what if you want users to ask follow-up questions and keep the context? In that case, you can simply reuse the same session each time you call the model.
For our demo app, we can move the session variable out of the generateAnswer() function and turn it into a state variable:
@State private var session = LanguageModelSession()
After making the change, try testing the app by first asking: "What are the must-try foods when visiting Japan?" Then follow up with: "Suggest me some restaurants."
Since the session is retained, the model understands the context and knows you're looking for restaurant recommendations in Japan.

If you don't reuse the same session, the model won't recognize the context of your follow-up question. Instead, it'll respond with something like this, asking for more details:
"Sure! To provide you with the best suggestions, could you please let me know your location or the type of cuisine you're interested in?"
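Keeping context is not always what you want, though. If you later add a "New chat" feature, one simple approach (a sketch, not part of the original demo) is to replace the stored session with a fresh one, which discards the previous conversation:

```swift
// Hypothetical reset action: assigning a new LanguageModelSession
// discards the accumulated conversation, so the next question
// starts from a clean context.
Button("New chat") {
    session = LanguageModelSession()
    answer = ""
    question = ""
}
```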
Disabling the Button During Response Generation
Since the model takes time to generate a response, it's a good idea to disable the "Get answer" button while waiting for the answer. The session object includes a property called isResponding that lets you check whether the model is currently working.
To disable the button during that time, simply apply the .disabled modifier and pass in the session's status like this:
Button {
    Task {
        await generateAnswer()
    }
} label: {
    .
    .
    .
}
.disabled(session.isResponding)
Working with Streaming Responses
The current user experience isn't ideal: since the on-device model takes time to generate a response, the app only shows the result after the entire response is ready.
If you've used ChatGPT or similar LLMs, you've probably noticed that they start displaying partial results almost immediately. This creates a smoother, more responsive experience.
The Foundation Models framework also supports streaming output, which lets you display responses as they're being generated, rather than waiting for the complete answer. To implement this, use the streamResponse method instead of the respond method. Here's the updated generateAnswer() function that works with streaming responses:
private func generateAnswer() async {
    do {
        answer = ""
        let stream = session.streamResponse(to: question)
        for try await streamData in stream {
            answer = streamData.asPartiallyGenerated()
        }
    } catch {
        answer = "Failed to answer the question: \(error.localizedDescription)"
    }
}
Just like with the respond method, you pass the user's question to the model when calling streamResponse. The key difference is that instead of waiting for the full response, you loop through the streamed data and update the answer variable with each partial result, displaying it on screen as it's generated.
Now when you test the app again and ask a question, you'll see the response appear incrementally as it's generated, creating a much more responsive user experience.

Summary
In this tutorial, we covered the basics of the Foundation Models framework and showed how to use Apple's on-device language model for tasks like question answering and content generation.
This is just the beginning: the framework offers much more. In future tutorials, we'll dive deeper into other new features such as the @Generable and @Guide macros, and explore additional capabilities like content tagging and tool calling.
If you're looking to build smarter, AI-powered apps, now is the perfect time to explore the Foundation Models framework and start integrating on-device intelligence into your projects.
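As a quick preview of what's ahead, here's a hedged sketch of structured output with @Generable and @Guide; the TravelTip type and its properties are invented for illustration, not part of this tutorial's demo app.

```swift
import FoundationModels

// Sketch of structured output: the model fills in a typed value
// instead of returning plain text. TravelTip is a made-up example type.
@Generable
struct TravelTip {
    @Guide(description: "A short title for the tip")
    var title: String

    @Guide(description: "One or two sentences of practical advice")
    var detail: String
}

// Ask the session to generate a TravelTip rather than a String.
func fetchTip() async throws -> TravelTip {
    let session = LanguageModelSession()
    let response = try await session.respond(
        to: "Give me one travel tip for visiting Japan.",
        generating: TravelTip.self
    )
    return response.content
}
```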