[Blogs]
🌎 https://medium.com/google-cloud/visualize-palm-based-llm-tokens-8760b3122c0f?source=rss-431147437aeb------2
------------------------------
Title: Visualize PaLM-based LLM tokens
[content]
As I was working on tweaking the Vertex AI text embedding model in LangChain4j, I wanted to better understand how the textembedding-gecko model tokenizes the text, in particular when we implement the Retrieval Augmented Generation approach.

The various PaLM-based models offer a computeTokens endpoint, which returns a list of tokens (encoded in Base64) and their respective IDs.

Note: At the time of this writing, there's no equivalent endpoint for Gemini models.
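To make the endpoint concrete, here's a minimal sketch of calling computeTokens over REST from Java. The project ID, region, and access token are placeholders, and the request/response shapes follow the v1beta1 API as I understand it, so double-check the Vertex AI documentation:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class ComputeTokensSketch {
        public static void main(String[] args) throws Exception {
            // Placeholders: set your own project, and fetch a token with
            // `gcloud auth print-access-token`, for example.
            String project = "your-project-id";
            String location = "us-central1";
            String model = "textembedding-gecko";
            String accessToken = System.getenv("ACCESS_TOKEN");

            String url = "https://" + location + "-aiplatform.googleapis.com/v1beta1/projects/"
                + project + "/locations/" + location
                + "/publishers/google/models/" + model + ":computeTokens";

            // The endpoint takes the text to tokenize as an "instances" array.
            String body = "{\"instances\": [{\"content\": \"Hello, tokenization!\"}]}";

            HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("Authorization", "Bearer " + accessToken)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

            HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

            // The JSON response holds a tokensInfo array with the Base64-encoded
            // tokens and their numeric IDs.
            System.out.println(response.body());
        }
    }

Each Base64-encoded token can then be decoded with java.util.Base64 to display its text, keeping in mind that some tokens are partial UTF-8 sequences.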
So I decided to create a small application that lets users:
- input some text,
- select a model,
- calculate the number of tokens,
- and visualize them with some nice pastel colors (one possible coloring approach is sketched after the list of models).

The available PaLM-based models are:
- textembedding-gecko
- textembedding-gecko-multilingual
- text-bison
- text-unicorn
- chat-bison
- code-gecko
- code-bison
- codechat-bison
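For the pastel colors, one simple approach (a sketch of the general idea, not necessarily the app's actual code) is to derive a soft HSL color from each token's position, with fixed saturation and lightness:

    // Hypothetical helper: spread hues around the color wheel while keeping
    // saturation and lightness in pastel territory.
    public class PastelColors {
        public static String pastelColor(int tokenIndex) {
            int hue = (tokenIndex * 47) % 360; // 47 keeps neighboring tokens visually distinct
            return "hsl(" + hue + ", 70%, 85%)"; // usable as a CSS background per token span
        }

        public static void main(String[] args) {
            for (int i = 0; i < 5; i++) {
                System.out.println(pastelColor(i));
            }
        }
    }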
You can try the application online, and also have a look at the source code on GitHub. It's a Micronaut application; I serve the static assets as explained in my recent article. I deployed the application on Google Cloud Run, the easiest way to deploy a container and let it auto-scale for you. I did a source-based deployment, as explained at the bottom here.
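A source-based deployment boils down to a single command run from the project directory, along these lines (the service name and region are my placeholders):

    gcloud run deploy token-visualizer --source . --region us-central1 --allow-unauthenticated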
And voilà, I can visualize my LLM tokens!

Originally published at https://glaforge.dev on February 5, 2024.
[/content]
Author: Guillaume Laforge
PublishedDate: 2024-02-05
Category: Blogs
NewsPaper: Guillaume Laforge - Medium
Tags: llm, google-cloud-platform, gcp-app-dev, vertex-ai, generative-ai-tools