Editing article

Title

Summary

Content

<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*MaT34TRoitiU1RMM.jpg" /></figure>Lately, for my Generative AI powered Java apps, I’ve used the <a href="https://deepmind.google/technologies/gemini/#introduction">Gemini</a> multimodal large language model from Google. But there’s also <a href="https://blog.google/technology/developers/gemma-open-models/">Gemma</a>, its little sister model.Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models. Gemma is available in two sizes: 2B and 7B. Its weights are freely available, and its small size means you can run it on your own, even on your laptop. So I was curious to give it a run with <a href="https://docs.langchain4j.dev/">LangChain4j</a>.<h3>How to run Gemma</h3>There are many ways to run Gemma: in the cloud, via <a href="https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/335">Vertex AI</a> with a click of a button, or <a href="https://cloud.google.com/kubernetes-engine/docs/tutorials/serve-gemma-gpu-vllm">GKE</a> with some GPUs, but you can also run it locally with <a href="https://github.com/tjake/Jlama">Jlama</a> or <a href="https://github.com/google/gemma.cpp">Gemma.cpp</a>.Another good option is to run Gemma with <a href="https://ollama.com/">Ollama</a>, a tool that you install on your machine, and which lets you run small models, like Llama 2, Mistral, and <a href="https://ollama.com/library">many others</a>. They quickly added support for <a href="https://ollama.com/library/gemma">Gemma</a> as well.Once installed locally, you can run:<pre>ollama run gemma:2b ollama run gemma:7b</pre>Cherry on the cake, the <a href="https://glaforge.dev/posts/2024/04/04/calling-gemma-with-ollama-and-testcontainers/">LangChain4j</a> library provides an <a href="https://docs.langchain4j.dev/integrations/language-models/ollama">Ollama module</a>, so you can plug Ollama supported models in your Java applications easily.<h3>Containerization</h3>After a great discussion with my colleague <a href="https://twitter.com/ddobrin">Dan Dobrin</a> who had worked with Ollama and TestContainers (<a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/blob/main/sessions/next24/books-genai-vertex-langchain4j/src/test/java/services/OllamaContainerTest.java">#1</a> and<a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/blob/main/sessions/next24/books-genai-vertex-langchain4j/src/test/java/services/OllamaChatModelTest.java#L37">#2</a>) in his <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/tree/main">serverless production readiness workshop</a>, I decided to try the approach below.Which brings us to the last piece of the puzzle: Instead of having to install and run Ollama on my computer, I decided to use Ollama within a container, handled by <a href="https://testcontainers.com/">TestContainers</a>.TestContainers is not only useful for testing, but you can also use it for driving containers. There’s even a specific <a href="https://java.testcontainers.org/modules/ollama/">OllamaContainer</a> you can take advantage of!So here’s the whole picture:<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*lHxJaKf0ALEEnoJS.png" /></figure><h3>Time to implement this approach!</h3>You’ll find the code in the Github <a href="https://github.com/glaforge/gemini-workshop-for-java-developers/blob/main/app/src/main/java/gemini/workshop/CallGemma.java">repository</a> accompanying my recent <a href="https://codelabs.developers.google.com/codelabs/gemini-java-developers">Gemini workshop</a>Let’s start with the easy part, interacting with an Ollama supported model with LangChain4j:<pre>OllamaContainer ollama = createGemmaOllamaContainer(); ollama.start(); ChatLanguageModel model = OllamaChatModel.builder() .baseUrl(String.format(&quot;http://%s:%d&quot;, ollama.getHost(), ollama.getFirstMappedPort())) .modelName(&quot;gemma:2b&quot;) .build(); String response = model.generate(&quot;Why is the sky blue?&quot;); System.out.println(response);</pre><ul><li>You run an Ollama test container.</li><li>You create an Ollama chat model, by pointing at the address and port of the container.</li><li>You specify the model you want to use.</li><li>Then, you just need to call model.generate(yourPrompt) as usual.</li></ul>Easy? Now let’s have a look at the trickier part, my local method that creates the Ollama container:<pre>// check if the custom Gemma Ollama image exists already List&lt;Image&gt; listImagesCmd = DockerClientFactory.lazyClient() .listImagesCmd() .withImageNameFilter(TC_OLLAMA_GEMMA_2_B) .exec(); if (listImagesCmd.isEmpty()) { System.out.println(&quot;Creating a new Ollama container with Gemma 2B image...&quot;); OllamaContainer ollama = new OllamaContainer(&quot;ollama/ollama:0.1.26&quot;); ollama.start(); ollama.execInContainer(&quot;ollama&quot;, &quot;pull&quot;, &quot;gemma:2b&quot;); ollama.commitToImage(TC_OLLAMA_GEMMA_2_B); return ollama; } else { System.out.println(&quot;Using existing Ollama container with Gemma 2B image...&quot;); // Substitute the default Ollama image with our Gemma variant return new OllamaContainer( DockerImageName.parse(TC_OLLAMA_GEMMA_2_B) .asCompatibleSubstituteFor(&quot;ollama/ollama&quot;)); }</pre>You need to create a derived Ollama container that pulls in the Gemma model. Either this image was already created beforehand, or if it doesn’t exist yet, you create it.Use the Docker Java client to check if the custom Gemma image exists. If it doesn’t exist, notice how TestContainers let you create an image derived from the base Ollama image, pull the Gemma model, and then commit that image to your local Docker registry.Otherwise, if the image already exists (ie. you created it in a previous run of the application), you’re just going to tell TestContainers that you want to substitute the default Ollama image with your Gemma-powered variant.<h3>And voila!</h3>You can call Gemma locally on your laptop, in your Java apps, using LangChain4j, without having to install and run Ollama locally (but of course, you need to have a Docker daemon running).Big thanks to <a href="https://twitter.com/ddobrin">Dan Dobrin</a> for the approach, and to <a href="https://twitter.com/bsideup">Sergei</a>, <a href="https://twitter.com/EdduMelendez">Eddú</a> and <a href="https://twitter.com/shelajev">Oleg</a> from TestContainers for the help and useful pointers.Originally published at <a href="https://glaforge.dev/posts/2024/04/04/calling-gemma-with-ollama-and-testcontainers/">https://glaforge.dev</a> on April 3, 2024.<img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=fbfe220ca715" width="1" height="1" alt=""><hr><a href="https://medium.com/google-cloud/calling-gemma-with-ollama-testcontainers-and-langchain4j-fbfe220ca715">Calling Gemma with Ollama, TestContainers, and LangChain4j</a> was originally published in <a href="https://medium.com/google-cloud">Google Cloud - Community</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.

Author

Link

Published date

Image url

Feed url

Guid

Hidden blurb

--- !ruby/object:Feedjira::Parser::RSSEntry
title: Calling Gemma with Ollama, TestContainers, and LangChain4j
published: 2024-04-05 04:50:25.000000000 Z
categories:
- gcp-app-dev
- ollama
- google-cloud-platform
- testcontainer
- langchain4j
entry_id: !ruby/object:Feedjira::Parser::GloballyUniqueIdentifier
 is_perma_link: 'false'
 guid: https://medium.com/p/fbfe220ca715
carlessian_info:
 news_filer_version: 2
 newspaper: Google Cloud - Medium
 macro_region: Blogs
content: '<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*MaT34TRoitiU1RMM.jpg"
 /></figure>Lately, for my Generative AI powered Java apps, I’ve used the <a href="https://deepmind.google/technologies/gemini/#introduction">Gemini</a>
 multimodal large language model from Google. But there’s also <a href="https://blog.google/technology/developers/gemma-open-models/">Gemma</a>,
 its little sister model.Gemma is a family of lightweight, state-of-the-art
 open models built from the same research and technology used to create the Gemini
 models. Gemma is available in two sizes: 2B and 7B. Its weights are freely available,
 and its small size means you can run it on your own, even on your laptop. So I was
 curious to give it a run with <a href="https://docs.langchain4j.dev/">LangChain4j</a>.<h3>How
 to run Gemma</h3>There are many ways to run Gemma: in the cloud, via <a href="https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/335">Vertex
 AI</a> with a click of a button, or <a href="https://cloud.google.com/kubernetes-engine/docs/tutorials/serve-gemma-gpu-vllm">GKE</a>
 with some GPUs, but you can also run it locally with <a href="https://github.com/tjake/Jlama">Jlama</a>
 or <a href="https://github.com/google/gemma.cpp">Gemma.cpp</a>.Another good
 option is to run Gemma with <a href="https://ollama.com/">Ollama</a>, a tool that
 you install on your machine, and which lets you run small models, like Llama 2,
 Mistral, and <a href="https://ollama.com/library">many others</a>. They quickly
 added support for <a href="https://ollama.com/library/gemma">Gemma</a> as well.Once
 installed locally, you can run:<pre>ollama run gemma:2b ollama run gemma:7b</pre>Cherry
 on the cake, the <a href="https://glaforge.dev/posts/2024/04/04/calling-gemma-with-ollama-and-testcontainers/">LangChain4j</a>
 library provides an <a href="https://docs.langchain4j.dev/integrations/language-models/ollama">Ollama
 module</a>, so you can plug Ollama supported models in your Java applications easily.<h3>Containerization</h3>After
 a great discussion with my colleague <a href="https://twitter.com/ddobrin">Dan Dobrin</a>
 who had worked with Ollama and TestContainers (<a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/blob/main/sessions/next24/books-genai-vertex-langchain4j/src/test/java/services/OllamaContainerTest.java">#1</a>
 and<a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/blob/main/sessions/next24/books-genai-vertex-langchain4j/src/test/java/services/OllamaChatModelTest.java#L37">#2</a>)
 in his <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/tree/main">serverless
 production readiness workshop</a>, I decided to try the approach below.Which
 brings us to the last piece of the puzzle: Instead of having to install and run
 Ollama on my computer, I decided to use Ollama within a container, handled by <a
 href="https://testcontainers.com/">TestContainers</a>.TestContainers is not
 only useful for testing, but you can also use it for driving containers. There’s
 even a specific <a href="https://java.testcontainers.org/modules/ollama/">OllamaContainer</a>
 you can take advantage of!So here’s the whole picture:<figure><img alt=""
 src="https://cdn-images-1.medium.com/max/1024/0*lHxJaKf0ALEEnoJS.png" /></figure><h3>Time
 to implement this approach!</h3>You’ll find the code in the Github <a href="https://github.com/glaforge/gemini-workshop-for-java-developers/blob/main/app/src/main/java/gemini/workshop/CallGemma.java">repository</a>
 accompanying my recent <a href="https://codelabs.developers.google.com/codelabs/gemini-java-developers">Gemini workshop</a>Let’s
 start with the easy part, interacting with an Ollama supported model with LangChain4j:<pre>OllamaContainer
 ollama = createGemmaOllamaContainer(); ollama.start(); ChatLanguageModel
 model = OllamaChatModel.builder() .baseUrl(String.format(&quot;http://%s:%d&quot;,
 ollama.getHost(), ollama.getFirstMappedPort())) .modelName(&quot;gemma:2b&quot;) .build(); String
 response = model.generate(&quot;Why is the sky blue?&quot;); System.out.println(response);</pre><ul><li>You
 run an Ollama test container.</li><li>You create an Ollama chat model, by pointing
 at the address and port of the container.</li><li>You specify the model you want
 to use.</li><li>Then, you just need to call model.generate(yourPrompt) as usual.</li></ul>Easy?
 Now let’s have a look at the trickier part, my local method that creates the Ollama
 container:<pre>// check if the custom Gemma Ollama image exists already List&lt;Image&gt;
 listImagesCmd = DockerClientFactory.lazyClient() .listImagesCmd() .withImageNameFilter(TC_OLLAMA_GEMMA_2_B) .exec(); if
 (listImagesCmd.isEmpty()) { System.out.println(&quot;Creating a new Ollama
 container with Gemma 2B image...&quot;); OllamaContainer ollama = new OllamaContainer(&quot;ollama/ollama:0.1.26&quot;); ollama.start(); ollama.execInContainer(&quot;ollama&quot;,
 &quot;pull&quot;, &quot;gemma:2b&quot;); ollama.commitToImage(TC_OLLAMA_GEMMA_2_B); return
 ollama; } else { System.out.println(&quot;Using existing Ollama container
 with Gemma 2B image...&quot;); // Substitute the default Ollama image with
 our Gemma variant return new OllamaContainer( DockerImageName.parse(TC_OLLAMA_GEMMA_2_B) .asCompatibleSubstituteFor(&quot;ollama/ollama&quot;)); }</pre>You
 need to create a derived Ollama container that pulls in the Gemma model. Either
 this image was already created beforehand, or if it doesn’t exist yet, you create it.Use
 the Docker Java client to check if the custom Gemma image exists. If it doesn’t
 exist, notice how TestContainers let you create an image derived from the base Ollama
 image, pull the Gemma model, and then commit that image to your local Docker registry.Otherwise,
 if the image already exists (ie. you created it in a previous run of the application),
 you’re just going to tell TestContainers that you want to substitute the default
 Ollama image with your Gemma-powered variant.<h3>And voila!</h3>You can call
 Gemma locally on your laptop, in your Java apps, using LangChain4j, without
 having to install and run Ollama locally (but of course, you need to have a Docker
 daemon running).Big thanks to <a href="https://twitter.com/ddobrin">Dan Dobrin</a>
 for the approach, and to <a href="https://twitter.com/bsideup">Sergei</a>, <a href="https://twitter.com/EdduMelendez">Eddú</a>
 and <a href="https://twitter.com/shelajev">Oleg</a> from TestContainers for the
 help and useful pointers.Originally published at <a href="https://glaforge.dev/posts/2024/04/04/calling-gemma-with-ollama-and-testcontainers/">https://glaforge.dev</a>
 on April 3, 2024.<img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=fbfe220ca715"
 width="1" height="1" alt=""><hr><a href="https://medium.com/google-cloud/calling-gemma-with-ollama-testcontainers-and-langchain4j-fbfe220ca715">Calling
 Gemma with Ollama, TestContainers, and LangChain4j</a> was originally published
 in <a href="https://medium.com/google-cloud">Google Cloud - Community</a> on Medium,
 where people are continuing the conversation by highlighting and responding to this
 story.'
rss_fields:
- title
- published
- categories
- entry_id
- content
- url
- author
url: https://medium.com/google-cloud/calling-gemma-with-ollama-testcontainers-and-langchain4j-fbfe220ca715?source=rss----e52cf94d98af---4
author: Guillaume Laforge

Language

Active

Ricc internal notes

Ricc source

Show this article Back to articles