♊️ GemiNews 🗞️
(dev)
🏡
📰 Articles
🏷️ Tags
🧠 Queries
📈 Graphs
☁️ Stats
💁🏻 Assistant
💬
🎙️
Demo 1: Embeddings + Recommendation
Demo 2: Bella RAGa
Demo 3: NewRetriever
Demo 4: Assistant function calling
Editing article
Title
Summary
Content
<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*MaT34TRoitiU1RMM.jpg" /></figure><p>Lately, for my Generative AI powered Java apps, I’ve used the <a href="https://deepmind.google/technologies/gemini/#introduction">Gemini</a> multimodal large language model from Google. But there’s also <a href="https://blog.google/technology/developers/gemma-open-models/">Gemma</a>, its little sister model.</p><p>Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models. Gemma is available in two sizes: 2B and 7B. Its weights are freely available, and its small size means you can run it on your own, even on your laptop. So I was curious to give it a run with <a href="https://docs.langchain4j.dev/">LangChain4j</a>.</p><h3>How to run Gemma</h3><p>There are many ways to run Gemma: in the cloud, via <a href="https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/335">Vertex AI</a> with a click of a button, or <a href="https://cloud.google.com/kubernetes-engine/docs/tutorials/serve-gemma-gpu-vllm">GKE</a> with some GPUs, but you can also run it locally with <a href="https://github.com/tjake/Jlama">Jlama</a> or <a href="https://github.com/google/gemma.cpp">Gemma.cpp</a>.</p><p>Another good option is to run Gemma with <a href="https://ollama.com/">Ollama</a>, a tool that you install on your machine, and which lets you run small models, like Llama 2, Mistral, and <a href="https://ollama.com/library">many others</a>. They quickly added support for <a href="https://ollama.com/library/gemma">Gemma</a> as well.</p><p>Once installed locally, you can run:</p><pre>ollama run gemma:2b<br>ollama run gemma:7b</pre><p>Cherry on the cake, the <a href="https://glaforge.dev/posts/2024/04/04/calling-gemma-with-ollama-and-testcontainers/">LangChain4j</a> library provides an <a href="https://docs.langchain4j.dev/integrations/language-models/ollama">Ollama module</a>, so you can plug Ollama supported models in your Java applications easily.</p><h3>Containerization</h3><p>After a great discussion with my colleague <a href="https://twitter.com/ddobrin">Dan Dobrin</a> who had worked with Ollama and TestContainers (<a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/blob/main/sessions/next24/books-genai-vertex-langchain4j/src/test/java/services/OllamaContainerTest.java">#1</a> and<a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/blob/main/sessions/next24/books-genai-vertex-langchain4j/src/test/java/services/OllamaChatModelTest.java#L37">#2</a>) in his <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/tree/main">serverless production readiness workshop</a>, I decided to try the approach below.</p><p>Which brings us to the last piece of the puzzle: Instead of having to install and run Ollama on my computer, I decided to use Ollama within a container, handled by <a href="https://testcontainers.com/">TestContainers</a>.</p><p>TestContainers is not only useful for testing, but you can also use it for driving containers. There’s even a specific <a href="https://java.testcontainers.org/modules/ollama/">OllamaContainer</a> you can take advantage of!</p><p>So here’s the whole picture:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*lHxJaKf0ALEEnoJS.png" /></figure><h3>Time to implement this approach!</h3><p>You’ll find the code in the Github <a href="https://github.com/glaforge/gemini-workshop-for-java-developers/blob/main/app/src/main/java/gemini/workshop/CallGemma.java">repository</a> accompanying my recent <a href="https://codelabs.developers.google.com/codelabs/gemini-java-developers">Gemini workshop</a></p><p>Let’s start with the easy part, interacting with an Ollama supported model with LangChain4j:</p><pre>OllamaContainer ollama = createGemmaOllamaContainer();<br>ollama.start();<br><br>ChatLanguageModel model = OllamaChatModel.builder()<br> .baseUrl(String.format("http://%s:%d", ollama.getHost(), ollama.getFirstMappedPort()))<br> .modelName("gemma:2b")<br> .build();<br><br>String response = model.generate("Why is the sky blue?");<br><br>System.out.println(response);</pre><ul><li>You run an Ollama test container.</li><li>You create an Ollama chat model, by pointing at the address and port of the container.</li><li>You specify the model you want to use.</li><li>Then, you just need to call model.generate(yourPrompt) as usual.</li></ul><p>Easy? Now let’s have a look at the trickier part, my local method that creates the Ollama container:</p><pre>// check if the custom Gemma Ollama image exists already<br>List<Image> listImagesCmd = DockerClientFactory.lazyClient()<br> .listImagesCmd()<br> .withImageNameFilter(TC_OLLAMA_GEMMA_2_B)<br> .exec();<br><br>if (listImagesCmd.isEmpty()) {<br> System.out.println("Creating a new Ollama container with Gemma 2B image...");<br> OllamaContainer ollama = new OllamaContainer("ollama/ollama:0.1.26");<br> ollama.start();<br> ollama.execInContainer("ollama", "pull", "gemma:2b");<br> ollama.commitToImage(TC_OLLAMA_GEMMA_2_B);<br> return ollama;<br>} else {<br> System.out.println("Using existing Ollama container with Gemma 2B image...");<br> // Substitute the default Ollama image with our Gemma variant<br> return new OllamaContainer(<br> DockerImageName.parse(TC_OLLAMA_GEMMA_2_B)<br> .asCompatibleSubstituteFor("ollama/ollama"));<br>}</pre><p>You need to create a derived Ollama container that pulls in the Gemma model. Either this image was already created beforehand, or if it doesn’t exist yet, you create it.</p><p>Use the Docker Java client to check if the custom Gemma image exists. If it doesn’t exist, notice how TestContainers let you create an image derived from the base Ollama image, pull the Gemma model, and then commit that image to your local Docker registry.</p><p>Otherwise, if the image already exists (ie. you created it in a previous run of the application), you’re just going to tell TestContainers that you want to substitute the default Ollama image with your Gemma-powered variant.</p><h3>And voila!</h3><p>You can <strong>call Gemma locally on your laptop, in your Java apps, using LangChain4j</strong>, without having to install and run Ollama locally (but of course, you need to have a Docker daemon running).</p><p>Big thanks to <a href="https://twitter.com/ddobrin">Dan Dobrin</a> for the approach, and to <a href="https://twitter.com/bsideup">Sergei</a>, <a href="https://twitter.com/EdduMelendez">Eddú</a> and <a href="https://twitter.com/shelajev">Oleg</a> from TestContainers for the help and useful pointers.</p><p><em>Originally published at </em><a href="https://glaforge.dev/posts/2024/04/04/calling-gemma-with-ollama-and-testcontainers/"><em>https://glaforge.dev</em></a><em> on April 3, 2024.</em></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=fbfe220ca715" width="1" height="1" alt=""><hr><p><a href="https://medium.com/google-cloud/calling-gemma-with-ollama-testcontainers-and-langchain4j-fbfe220ca715">Calling Gemma with Ollama, TestContainers, and LangChain4j</a> was originally published in <a href="https://medium.com/google-cloud">Google Cloud - Community</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>
Author
Link
Published date
Image url
Feed url
Guid
Hidden blurb
--- !ruby/object:Feedjira::Parser::RSSEntry title: Calling Gemma with Ollama, TestContainers, and LangChain4j published: 2024-04-05 04:50:25.000000000 Z categories: - gcp-app-dev - ollama - google-cloud-platform - testcontainer - langchain4j entry_id: !ruby/object:Feedjira::Parser::GloballyUniqueIdentifier is_perma_link: 'false' guid: https://medium.com/p/fbfe220ca715 carlessian_info: news_filer_version: 2 newspaper: Google Cloud - Medium macro_region: Blogs content: '<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*MaT34TRoitiU1RMM.jpg" /></figure><p>Lately, for my Generative AI powered Java apps, I’ve used the <a href="https://deepmind.google/technologies/gemini/#introduction">Gemini</a> multimodal large language model from Google. But there’s also <a href="https://blog.google/technology/developers/gemma-open-models/">Gemma</a>, its little sister model.</p><p>Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models. Gemma is available in two sizes: 2B and 7B. Its weights are freely available, and its small size means you can run it on your own, even on your laptop. So I was curious to give it a run with <a href="https://docs.langchain4j.dev/">LangChain4j</a>.</p><h3>How to run Gemma</h3><p>There are many ways to run Gemma: in the cloud, via <a href="https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/335">Vertex AI</a> with a click of a button, or <a href="https://cloud.google.com/kubernetes-engine/docs/tutorials/serve-gemma-gpu-vllm">GKE</a> with some GPUs, but you can also run it locally with <a href="https://github.com/tjake/Jlama">Jlama</a> or <a href="https://github.com/google/gemma.cpp">Gemma.cpp</a>.</p><p>Another good option is to run Gemma with <a href="https://ollama.com/">Ollama</a>, a tool that you install on your machine, and which lets you run small models, like Llama 2, Mistral, and <a href="https://ollama.com/library">many others</a>. They quickly added support for <a href="https://ollama.com/library/gemma">Gemma</a> as well.</p><p>Once installed locally, you can run:</p><pre>ollama run gemma:2b<br>ollama run gemma:7b</pre><p>Cherry on the cake, the <a href="https://glaforge.dev/posts/2024/04/04/calling-gemma-with-ollama-and-testcontainers/">LangChain4j</a> library provides an <a href="https://docs.langchain4j.dev/integrations/language-models/ollama">Ollama module</a>, so you can plug Ollama supported models in your Java applications easily.</p><h3>Containerization</h3><p>After a great discussion with my colleague <a href="https://twitter.com/ddobrin">Dan Dobrin</a> who had worked with Ollama and TestContainers (<a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/blob/main/sessions/next24/books-genai-vertex-langchain4j/src/test/java/services/OllamaContainerTest.java">#1</a> and<a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/blob/main/sessions/next24/books-genai-vertex-langchain4j/src/test/java/services/OllamaChatModelTest.java#L37">#2</a>) in his <a href="https://github.com/GoogleCloudPlatform/serverless-production-readiness-java-gcp/tree/main">serverless production readiness workshop</a>, I decided to try the approach below.</p><p>Which brings us to the last piece of the puzzle: Instead of having to install and run Ollama on my computer, I decided to use Ollama within a container, handled by <a href="https://testcontainers.com/">TestContainers</a>.</p><p>TestContainers is not only useful for testing, but you can also use it for driving containers. There’s even a specific <a href="https://java.testcontainers.org/modules/ollama/">OllamaContainer</a> you can take advantage of!</p><p>So here’s the whole picture:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*lHxJaKf0ALEEnoJS.png" /></figure><h3>Time to implement this approach!</h3><p>You’ll find the code in the Github <a href="https://github.com/glaforge/gemini-workshop-for-java-developers/blob/main/app/src/main/java/gemini/workshop/CallGemma.java">repository</a> accompanying my recent <a href="https://codelabs.developers.google.com/codelabs/gemini-java-developers">Gemini workshop</a></p><p>Let’s start with the easy part, interacting with an Ollama supported model with LangChain4j:</p><pre>OllamaContainer ollama = createGemmaOllamaContainer();<br>ollama.start();<br><br>ChatLanguageModel model = OllamaChatModel.builder()<br> .baseUrl(String.format("http://%s:%d", ollama.getHost(), ollama.getFirstMappedPort()))<br> .modelName("gemma:2b")<br> .build();<br><br>String response = model.generate("Why is the sky blue?");<br><br>System.out.println(response);</pre><ul><li>You run an Ollama test container.</li><li>You create an Ollama chat model, by pointing at the address and port of the container.</li><li>You specify the model you want to use.</li><li>Then, you just need to call model.generate(yourPrompt) as usual.</li></ul><p>Easy? Now let’s have a look at the trickier part, my local method that creates the Ollama container:</p><pre>// check if the custom Gemma Ollama image exists already<br>List<Image> listImagesCmd = DockerClientFactory.lazyClient()<br> .listImagesCmd()<br> .withImageNameFilter(TC_OLLAMA_GEMMA_2_B)<br> .exec();<br><br>if (listImagesCmd.isEmpty()) {<br> System.out.println("Creating a new Ollama container with Gemma 2B image...");<br> OllamaContainer ollama = new OllamaContainer("ollama/ollama:0.1.26");<br> ollama.start();<br> ollama.execInContainer("ollama", "pull", "gemma:2b");<br> ollama.commitToImage(TC_OLLAMA_GEMMA_2_B);<br> return ollama;<br>} else {<br> System.out.println("Using existing Ollama container with Gemma 2B image...");<br> // Substitute the default Ollama image with our Gemma variant<br> return new OllamaContainer(<br> DockerImageName.parse(TC_OLLAMA_GEMMA_2_B)<br> .asCompatibleSubstituteFor("ollama/ollama"));<br>}</pre><p>You need to create a derived Ollama container that pulls in the Gemma model. Either this image was already created beforehand, or if it doesn’t exist yet, you create it.</p><p>Use the Docker Java client to check if the custom Gemma image exists. If it doesn’t exist, notice how TestContainers let you create an image derived from the base Ollama image, pull the Gemma model, and then commit that image to your local Docker registry.</p><p>Otherwise, if the image already exists (ie. you created it in a previous run of the application), you’re just going to tell TestContainers that you want to substitute the default Ollama image with your Gemma-powered variant.</p><h3>And voila!</h3><p>You can <strong>call Gemma locally on your laptop, in your Java apps, using LangChain4j</strong>, without having to install and run Ollama locally (but of course, you need to have a Docker daemon running).</p><p>Big thanks to <a href="https://twitter.com/ddobrin">Dan Dobrin</a> for the approach, and to <a href="https://twitter.com/bsideup">Sergei</a>, <a href="https://twitter.com/EdduMelendez">Eddú</a> and <a href="https://twitter.com/shelajev">Oleg</a> from TestContainers for the help and useful pointers.</p><p><em>Originally published at </em><a href="https://glaforge.dev/posts/2024/04/04/calling-gemma-with-ollama-and-testcontainers/"><em>https://glaforge.dev</em></a><em> on April 3, 2024.</em></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=fbfe220ca715" width="1" height="1" alt=""><hr><p><a href="https://medium.com/google-cloud/calling-gemma-with-ollama-testcontainers-and-langchain4j-fbfe220ca715">Calling Gemma with Ollama, TestContainers, and LangChain4j</a> was originally published in <a href="https://medium.com/google-cloud">Google Cloud - Community</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>' rss_fields: - title - published - categories - entry_id - content - url - author url: https://medium.com/google-cloud/calling-gemma-with-ollama-testcontainers-and-langchain4j-fbfe220ca715?source=rss----e52cf94d98af---4 author: Guillaume Laforge
Language
Active
Ricc internal notes
Imported via /usr/local/google/home/ricc/git/gemini-news-crawler/webapp/db/seeds.d/import-feedjira.rb on 2024-04-05 09:23:03 +0200. Content is EMPTY here. Entried: title,published,categories,entry_id,content,url,author. TODO add Newspaper: filename = /usr/local/google/home/ricc/git/gemini-news-crawler/webapp/db/seeds.d/../../../crawler/out/feedjira/Blogs/Google Cloud - Medium/2024-04-05-Calling_Gemma_with_Ollama,_TestContainers,_and_LangChain4j-v2.yaml
Ricc source
Show this article
Back to articles