<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*F0bDfSj3tKoBaHPjgukktA.jpeg" /><figcaption>Image of stacked newspapers representing the use case we are addressing in this article</figcaption></figure><h3>Introduction</h3>Imagine a world where language models understand the specific needs of your business or industry and can generate text tailored to your unique domain or task. This isn’t science fiction; it’s the power of fine-tuning, and Google Cloud’s Vertex AI is making it accessible to everyone.<h3>Why Fine-Tuning Matters</h3>Tuning a foundation model can improve its performance. Foundation models are trained for general purposes and sometimes don’t perform tasks as well as you’d like them to. This might be because the tasks are specialized and difficult to teach a model through prompt design alone. In these cases, you can use model tuning to improve the model’s performance on specific tasks. Model tuning can also help the model adhere to specific output requirements when instructions aren’t sufficient. Think of large language models (LLMs) as incredibly talented students. They’ve learned a vast amount of information and can perform many tasks, but they excel when given specialized training. Fine-tuning is that training: it lets you adapt a pre-trained LLM to your specific needs, whether that is:<ul><li>Generating creative content: Imagine an LLM that writes marketing copy in your brand voice or composes poems in the style of your favorite poet.</li><li>Summarizing complex information: Need to quickly grasp the key points of a lengthy research paper or news article in a specialty such as medicine or law? A fine-tuned LLM can do that for you.</li><li>Translating languages with nuance: Go beyond literal translations and capture the cultural context and subtle meanings with a fine-tuned LLM.</li></ul><blockquote>This article provides an overview of model tuning, describes the tuning options available on Vertex AI, and implements fine-tuning using the supervised tuning approach for one of the models. More details about model customization are available <a href="https://cloud.google.com/vertex-ai/generative-ai/docs/models/tune-models">here</a>.</blockquote><h3>Use case: From News Articles to Headlines</h3>Let’s see how this works in practice. Imagine you want to automatically generate headlines for news articles. Using Vertex AI, you can fine-tune a large language model to generate a suitably summarized title in the specific style that the news channel follows. We will use the BBC full-text data, available as the BigQuery public dataset bigquery-public-data.bbc_news.fulltext. We will fine-tune an LLM (text-bison@002) into a new model called “bbc-news-summary-tuned” and compare its responses to those of the base model. 
A sample JSONL file is available for the implementation; feel free to upload it to your Cloud Storage bucket to run the fine-tuning steps:<ol><li>Prepare your data: Start with a dataset of news articles and their corresponding headlines, like the BBC News dataset used in the example code.</li><li>Fine-tune a pre-trained model: Choose a base model like “text-bison@002” and fine-tune it on your news data using Vertex AI’s Python SDK.</li><li>Evaluate the results: Compare the performance of your fine-tuned model with the base model to see the improvement in headline generation quality.</li><li>Deploy and use your model: Make your fine-tuned model available through an API endpoint and start generating headlines for new articles automatically.</li></ol>For this we are going to use the <a href="https://cloud.google.com/vertex-ai/generative-ai/docs/tuning/supervised-tuning">Supervised Tuning</a> approach. Supervised tuning improves the performance of a model by teaching it a new skill. Data that contains hundreds of labeled examples is used to teach the model to mimic a desired behavior or task. 
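To make "labeled examples" concrete, here is a minimal sketch that writes prompt/response pairs as JSON Lines, one record per line. The field names `input_text` and `output_text`, the helper `to_jsonl`, the prompt prefix, and the two sample articles are assumptions for illustration only; check the supervised tuning documentation for the exact schema your model expects.

```python
import json

# Hypothetical (article, headline) pairs; a real dataset would hold hundreds.
examples = [
    ("Shares rose sharply after the retailer reported record holiday sales.",
     "Retailer shares jump on record sales"),
    ("The city council approved funding for two new cycle lanes on Tuesday.",
     "Council backs new cycle lanes"),
]

# Assumed prompt prefix, mirroring the prompt used at prediction time.
PROMPT_PREFIX = "Summarize this text to generate a title: \n "


def to_jsonl(pairs):
    """Serialize (article, headline) pairs as one JSON object per line."""
    lines = []
    for article, headline in pairs:
        record = {
            "input_text": PROMPT_PREFIX + article,  # the prompt
            "output_text": headline,                # the desired response
        }
        lines.append(json.dumps(record))
    return "\n".join(lines)


jsonl_text = to_jsonl(examples)
print(jsonl_text)
```

Keeping the same prompt prefix in the training records and in later prediction calls is a design choice of this sketch, not something the tuning service requires.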
We are going to provide a labeled dataset of input text (prompt) and output text (response) to teach the model how to customize its responses for our specific use case. Let’s dive in!<h3>Vertex AI: Your Fine-Tuning Partner</h3>Vertex AI provides a comprehensive suite of tools and services to guide you through the entire fine-tuning journey:<ul><li>Vertex AI Pipelines: Streamline your workflow by building and managing end-to-end machine learning pipelines, including data preparation, model training, evaluation, and deployment.</li><li>Vertex AI Evaluation Services: Assess the performance of your fine-tuned model with metrics tailored to your specific task, ensuring it meets your quality standards.</li><li>Vertex AI Model Registry: Keep track of all your models, including different versions and their performance metrics, in a centralized repository.</li><li>Vertex AI Endpoints: Deploy your fine-tuned model as an API endpoint, making it easily accessible for integration into your applications.</li></ul><h3>High Level Flow Diagram</h3>This diagram represents the flow of data and the steps involved in the implementation. The owner of each step is noted in the text beneath it.<figure><img alt="" src="https://cdn-images-1.medium.com/max/947/1*5233hdmreanFqQ91GB22iQ.png" /><figcaption>High Level Flow of Model Fine Tuning Steps</figcaption></figure><h3>Industry Use cases</h3>With Vertex AI, the possibilities are endless. You can fine-tune LLMs for sentiment analysis, chatbot development, code generation, and much more. This technology is democratizing access to powerful language models, allowing businesses and individuals to unlock new levels of creativity and efficiency.<h3>Hands-on Time</h3>This implementation uses the Vertex AI Python SDK for generative AI models. 
You can also perform fine-tuning in other ways: over HTTP, with cURL commands, with the Java SDK, or in the console. In 5 easy steps, you can fine-tune and evaluate your model for your customized responses!<ol><li>Install and Import dependencies</li></ol><pre>!pip install google-cloud-aiplatform
!pip install --user datasets
!pip install --user google-cloud-pipeline-components</pre>Follow the rest of the steps as shown in the .ipynb file in the repo. Make sure you replace PROJECT_ID and BUCKET_NAME with your own values.<pre>import os
os.environ[&#39;TF_CPP_MIN_LOG_LEVEL&#39;] = &#39;3&#39;  # silence TensorFlow log output
import warnings
warnings.filterwarnings(&#39;ignore&#39;)

import sys
import uuid
import json

import kfp
import pandas as pd
import vertexai
from google.auth import default
from datasets import load_dataset
from google.cloud import aiplatform
from vertexai.preview.language_models import TextGenerationModel, EvaluationTextSummarizationSpec

# PROJECT_ID and REGION must already be set to your own project ID and region
vertexai.init(project=PROJECT_ID, location=REGION)</pre>2. Prepare &amp; Load Training Data: Upload the sample <a href="https://github.com/AbiramiSukumaran/LLMFineTuningSupervised/blob/main/TRAIN.jsonl">TRAIN.jsonl</a> training data file to your bucket and replace YOUR_BUCKET with the bucket name.<pre>json_url = &#39;https://storage.googleapis.com/YOUR_BUCKET/TRAIN.jsonl&#39;
# Each line of the file is one JSON record
df = pd.read_json(json_url, lines=True)
print(df)</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/582/1*XRuJQcjxOyZ-09A5lVwc8Q.png" /><figcaption>Dataframe of the training dataset</figcaption></figure>3. 
Fine Tune a Large Language Model<pre>model_display_name = &#39;bbc-finetuned-model&#39; # @param {type:&quot;string&quot;}
# Load the base model, then tune it on the training DataFrame
tuned_model = TextGenerationModel.from_pretrained(&quot;text-bison@002&quot;)
tuned_model.tune_model(
    training_data=df,
    train_steps=100,
    tuning_job_location=&quot;europe-west4&quot;,
    tuned_model_location=&quot;europe-west4&quot;,
)</pre>The code above takes the pretrained model “text-bison@002” and tunes it with the data frame that holds the training data we loaded in the previous step. This step will take a few hours to complete. You can track the progress of the tuning pipeline through the pipeline job link that this step prints. 4. Predict with the new Fine Tuned Model: Once the fine-tuning job is complete, you will be able to predict with your new model.<pre>response = tuned_model.predict(&quot;Summarize this text to generate a title: \n Ever noticed how plane seats appear to be getting smaller and smaller? With increasing numbers of people taking to the skies, some experts are questioning if having such packed out planes is putting passengers at risk. They say that the shrinking space on aeroplanes is not only uncomfortable it it&#39;s putting our health and safety in danger. More than squabbling over the arm rest, shrinking space on planes putting our health and safety in danger? This week, a U.S consumer advisory group set up by the Department of Transportation said at a public hearing that while the government is happy to set standards for animals flying on planes, it doesn&#39;t stipulate a minimum amount of space for humans.&quot;)
print(response.text)</pre>Here is the output:<figure><img alt="" src="https://cdn-images-1.medium.com/proxy/0*Y6cIFGtNVpsaxsb-" /><figcaption>Output of prediction using the new fine-tuned model</figcaption></figure>Predict with Base Model (text-bison@002) for comparison:<pre>base_model = TextGenerationModel.from_pretrained(&quot;text-bison@002&quot;)
response = base_model.predict(&quot;Summarize this text to generate a title: \n Ever noticed how plane seats appear to be getting smaller and smaller? With increasing numbers of people taking to the skies, some experts are questioning if having such packed out planes is putting passengers at risk. They say that the shrinking space on aeroplanes is not only uncomfortable it it&#39;s putting our health and safety in danger. More than squabbling over the arm rest, shrinking space on planes putting our health and safety in danger? This week, a U.S consumer advisory group set up by the Department of Transportation said at a public hearing that while the government is happy to set standards for animals flying on planes, it doesn&#39;t stipulate a minimum amount of space for humans.&quot;)
print(response.text)</pre>Here is the output:<figure><img alt="" src="https://cdn-images-1.medium.com/max/897/0*z-tfR03xCsgkqow3" /><figcaption>Output of prediction using the base model</figcaption></figure>Even though both generated titles look appropriate, the first one (generated with the fine-tuned model) is more in tune with the style of titles used in the dataset. Load the fine tuned model: Using a model that you have just fine-tuned is straightforward. 
In step 3 the tuning call ran in the same session, so the variable tuned_model still holds the tuned model. But what if you want to invoke a model that was tuned in the past? To do this, call the get_tuned_model() method on the LLM class with the full resource name of the tuned model (projects/…/locations/…/models/…) from Vertex AI Model Registry.<pre>tuned_model_1 = TextGenerationModel.get_tuned_model(&quot;projects/273845608377/locations/europe-west4/models/4220809634753019904&quot;)
print(tuned_model_1.predict(&quot;YOUR_PROMPT&quot;))</pre>5. Model Evaluation: This is a big topic in itself, and we will reserve the detailed discussion for another day. For now, we will see how to get some evaluation metrics on the fine-tuned model and compare them against the base model. Load the <a href="https://github.com/AbiramiSukumaran/LLMEvaluationAuto/blob/main/EVALUATE.jsonl">EVALUATION dataset</a>:<pre>json_url = &#39;https://storage.googleapis.com/YOUR_BUCKET/EVALUATE.jsonl&#39;
df = pd.read_json(json_url, lines=True)
print(df)</pre>Evaluate:<pre># Define the evaluation specification for a text summarization task on the fine tuned model
task_spec = EvaluationTextSummarizationSpec(
    task_name=&quot;summarization&quot;,
    ground_truth_data=df,
)
# Assumption: the original snippet stops at the spec; the run itself is sketched here
# with the SDK&#39;s evaluate() method. See the notebook for the exact call.
evaluation_result = tuned_model.evaluate(task_spec=task_spec)</pre>This step will take a few minutes to complete. You can track the progress using the pipeline job link in the step result. Once it completes, you can view the evaluation result:<figure><img alt="" src="https://cdn-images-1.medium.com/max/776/0*QoP9htlaWlXnszrw" /><figcaption>Evaluation Metric Output</figcaption></figure>rougeLSum: This is the ROUGE-L score for the summary. ROUGE-L is a recall-based metric that measures the overlap between a summary and a reference summary. 
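To make the metric concrete, here is a minimal pure-Python sketch of an LCS-based recall score. It is an illustration only: the whitespace tokenization, the lowercasing, the function names `lcs_length` and `rouge_l_recall`, and the two sample strings are all assumptions, and the value reported by the Vertex AI evaluation pipeline may be computed with different tokenization and aggregation.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    # prev[j] is the LCS length of the tokens of `a` seen so far and b[:j]
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_recall(candidate, reference):
    """LCS length divided by the number of reference tokens."""
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()
    if not ref_tokens:
        return 0.0
    return lcs_length(cand_tokens, ref_tokens) / len(ref_tokens)


score = rouge_l_recall(
    "plane seats are shrinking",
    "shrinking plane seats put passengers at risk",
)
print(round(score, 3))  # 2 shared in-order tokens / 7 reference tokens -> 0.286
```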
The score is calculated by dividing the length of the longest common subsequence (LCS) of the two summaries by the length of the reference summary. The rougeLSum score in the output above is 0.36600753600753694, which means that the generated summaries overlap with the reference summaries by about 36.6% on this measure. If you run the evaluation step on the baseline model, you will observe that the score is relatively higher for the fine-tuned model. You can find the evaluation results in the Cloud Storage output directory that you specified when creating the evaluation job. The file is named evaluation_metrics.json. For tuned models, you can also view evaluation results in the Google Cloud console on the Vertex AI Model Registry page.<blockquote>Important Considerations</blockquote><blockquote>Model Support: Always check the model <a href="https://cloud.google.com/vertex-ai/generative-ai/docs/tuning/supervised-tuning#models_that_support_supervised_tuning">documentation</a> for the latest compatibility.</blockquote><blockquote>Rapid Development: The field of LLMs advances quickly. A newer, more powerful model could potentially outperform a fine-tuned model built on an older base. The good news is that you can apply these fine-tuning techniques to newer models when the capability becomes available.</blockquote><blockquote>LoRA: LoRA is a technique for efficiently fine-tuning LLMs. It introduces trainable, low-rank decomposition matrices into the layers of the existing pre-trained model. Instead of updating all the parameters of a massive LLM, LoRA learns pairs of small matrices whose product is added to the original model’s weight matrices. This significantly reduces the number of parameters that must be trained during fine-tuning. Read more about it <a href="https://cloud.google.com/vertex-ai/generative-ai/docs/model-garden/lora-qlora">here</a>.</blockquote><h3>Conclusion</h3>Fine-tuning is a powerful technique that allows you to customize LLMs to your domain and tasks. 
With Vertex AI, you have the tools and resources you need to fine-tune your models efficiently and effectively. Explore the GitHub repositories and experiment with the sample code to experience <a href="https://github.com/AbiramiSukumaran/LLMFineTuningSupervised">fine-tuning</a> and <a href="https://github.com/AbiramiSukumaran/LLMEvaluationAuto">evaluation</a> firsthand. Consider how fine-tuned LLMs can address your specific needs, from generating targeted marketing copy to summarizing complex documents or translating languages with cultural nuance. Use the suite of tools and services offered by <a href="https://cloud.google.com/vertex-ai/generative-ai/docs/models/tune-models">Vertex AI</a> to build, train, evaluate, and deploy your fine-tuned models. Register for the upcoming season of <a href="https://codevipassana.dev">Code Vipassana</a> (Season 6), where we will be building these out in instructor-led virtual hands-on sessions. Also, if you are coming to Google Cloud NEXT on April 9, 10, and 11, 2024, in Las Vegas, see this in action or just come say hi at our Innovator’s Hive End to End AI demo station!<hr><a href="https://medium.com/google-cloud/fine-tuning-large-language-models-how-vertex-ai-takes-llms-to-the-next-level-3c113f4007da">Fine Tuning Large Language Models: How Vertex AI Takes LLMs to the Next Level</a> was originally published in <a href="https://medium.com/google-cloud">Google Cloud - Community</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.


--- !ruby/object:Feedjira::Parser::RSSEntry
title: 'Fine Tuning Large Language Models: How Vertex AI Takes LLMs to the Next Level'
url: https://medium.com/google-cloud/fine-tuning-large-language-models-how-vertex-ai-takes-llms-to-the-next-level-3c113f4007da?source=rss----e52cf94d98af---4
author: Abirami Sukumaran
categories:
- vertex-ai
- generative-ai
- fine-tuning
- google-cloud-platform
- supervised-learning
published: 2024-04-08 06:27:19.000000000 Z
entry_id: !ruby/object:Feedjira::Parser::GloballyUniqueIdentifier
 is_perma_link: 'false'
 guid: https://medium.com/p/3c113f4007da
carlessian_info:
 news_filer_version: 2
 newspaper: Google Cloud - Medium
 macro_region: Blogs
rss_fields:
- title
- url
- author
- categories
- published
- entry_id
- content
