We are close to having a basic, working local AI system. The last technical issue to address is streaming the response. To stream a response, we need to enable the server_sent_events option on our client:
# app/models/conversation.rb
def ollama_client
  @client ||= Ollama.new(
    credentials: { address: "http://localhost:11434" },
    options: { server_sent_events: true }
  )
end
Next, we break the thread update into two steps: creating the new, empty assistant message, then updating it as each chunk of the response arrives:
class Conversation < ApplicationRecord
  def send_thread
    # collect the preceding messages
    messages = self.messages.order(:created_at).map { {
      role: _1.role,
      content: _1.message
    } }

    # create the blank initial response
    response = self.messages.create({
      role: "assistant",
      message: ""
    })

    ollama_client.chat({
      model: "llama3",
      stream: true,
      messages: messages
    }) do |event, raw|
      response.message += event["message"]["content"]
      response.save
    end
  end
end
Notably, we enable the stream: true option and provide a block to collect the responses as they arrive.
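Each event yielded to the block is one parsed chunk of Ollama's chat response. The exact fields can vary between versions, but a streamed chunk looks roughly like this (an illustrative sketch, not captured output):

{
  "model" => "llama3",
  "created_at" => "2024-05-01T12:00:00.000000Z",
  "message" => { "role" => "assistant", "content" => "Hel" },
  "done" => false
}

The final chunk typically arrives with "done" => true and an empty "content", so appending it to the saved message is harmless.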
Because we save after each piece of the message is received, every chunk triggers an update to the message, which is broadcast to the page.
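The broadcast itself comes from the Turbo Streams callbacks on the Message model. Your wiring from earlier may differ, but a minimal sketch looks something like this (the callbacks and stream key here are assumptions, not taken from the original setup):

# app/models/message.rb
class Message < ApplicationRecord
  belongs_to :conversation

  # Append the message partial when it is first created, then replace it
  # in place on every subsequent save so the streamed text fills in live.
  after_create_commit -> { broadcast_append_to conversation }
  after_update_commit -> { broadcast_replace_to conversation }
end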
If you are looking to have a conversation to see if sprinkling some AI love onto your application makes sense, book a free call.