♊️ GemiNews 🗞️
Editing article
Title
Summary
Content
<p>In this blog post, we highlight some of the keynotes from Kubeflow Summit Europe 2024, which was held this year in Paris. Unfortunately, I couldn’t attend in person, but I recently watched the CNCF playlist on YouTube and put together a short wrap-up.</p>
<h3>💻 What is Kubeflow?</h3>
<p>Kubeflow is a Kubernetes-native, open-source framework for developing, managing, and running machine learning (ML) workloads. It is an AI/ML platform that brings together several tools covering the main AI/ML use cases: data exploration, data pipelines, model training, and model serving.</p>
<h3>What is Kubeflow used for?</h3>
<p>Kubeflow solves many of the challenges involved in orchestrating machine learning pipelines by providing a set of tools and <a href="https://www.redhat.com/en/topics/api/what-are-application-programming-interfaces">APIs</a> that simplify the process of training and deploying ML models at scale.</p>
<figure><img alt="" src="https://cdn-images-1.medium.com/max/271/1*rYv0ZLAHY6tYZydMto3eBw.png" /><figcaption>Kubeflow pipeline</figcaption></figure>
<p>Now that we have defined what Kubeflow is, let’s look at the keynotes and what they brought us this year:</p>
<h3>🤖 Scalable Platform for Training and Inference Using Kubeflow at CERN</h3>
<p>The <strong>European Organization for Nuclear Research</strong>, known as <strong>CERN</strong>, is an intergovernmental organization that operates the largest particle physics laboratory in the world.</p>
<p>This talk goes into the details of how a Kubeflow-based machine learning platform handles every step, from data preparation and interactive analysis to distributed training and inference.</p>
<p>The requirements at CERN:</p>
<ul><li>The platform should manage the full machine learning lifecycle: using multiple services can be confusing and hard to integrate.</li></ul>
<figure><img alt="" src="https://cdn-images-1.medium.com/max/703/1*G5s1ZtE7MPx3xDGqwQcwdw.png" /><figcaption>MLOps lifecycle</figcaption></figure>
<ul><li>The 
platform needs to be integrated with CERN systems: auth, storage systems, etc.</li><li>The platform should be centralized to ensure easy and efficient access to GPUs and other accelerators.</li></ul>
<figure><img alt="" src="https://cdn-images-1.medium.com/max/697/1*YtYQ7G-HtoFD9KrjjFlbgQ.png" /><figcaption>Reasons for centralizing resources</figcaption></figure>
<ul><li>The platform should be easy to use: many scientists are not infrastructure experts.</li></ul>
<h4>⚛ How MLOps and Kubeflow are used at CERN</h4>
<p>ATLAS is one of two general-purpose detectors at the Large Hadron Collider (LHC). It investigates a wide range of physics, from the Higgs boson to extra dimensions and particles that could make up dark matter.</p>
<blockquote>I want to find Higgs bosons in the recorded collisions to study them.</blockquote>
<p>This is the pipeline workflow they use to study Higgs bosons:</p>
<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*uyu37xBGm6rGxCFU11_IYg.png" /><figcaption>CERN ATLAS pipeline</figcaption></figure>
<p><strong>Salt: </strong>General-purpose software to train multi-modal, multi-task transformer models.</p>
<p><strong>Katib:</strong> Used within Kubeflow to tune model hyperparameters.</p>
<p><strong>Kubeflow Notebooks:</strong> Store notebooks to be run in containers.</p>
<p><strong>Ceph: </strong>An open-source, distributed storage system.</p>
<h3>Transforming Data Science at PepsiCo: The Kubeflow Revolution</h3>
<p>Kubeflow is also used at PepsiCo, for several reasons:</p>
<ul><li>We already have K8s clusters and an infrastructure team to maintain them</li><li>Lots of data deserves lots of models</li><li>Hyperparameter tuning -> Katib</li><li>Serving models -> KServe</li><li>Model training -> training operators</li></ul>
<h4>The need for Kubeflow</h4>
<p>There were several reasons for adopting Kubeflow at PepsiCo:</p>
<ul><li>Production is PAINFUL</li><li>With all the gaps, Data Science was left to fend for itself.</li><li>A lot 
of inefficient work; going to production (or even staging) is a slog.</li></ul>
<p>This led to building several solutions that work with Kubeflow to bring the best to the AI/ML ecosystem, such as a monorepo for all Data Science/AI projects:</p>
<figure><img alt="" src="https://cdn-images-1.medium.com/max/541/1*qHcDzBczY0PuyoPinSss-w.png" /><figcaption>Monorepo benefits</figcaption></figure>
<p>Or even the Prometheus CLI, which is built on top of the <strong>kfp</strong> SDK:</p>
<figure><img alt="" src="https://cdn-images-1.medium.com/max/444/1*p8yRdZztyx5OlA-p5UASkA.png" /><figcaption>Prometheus CLI features</figcaption></figure>
<h4>🔄 Culture Shift at PepsiCo</h4>
<ul><li>None of the code we built matters without rethinking our relationship to Kubeflow.</li><li>If all we built was better tooling for a broken workflow, there would be no fundamental change.</li></ul>
<h3>The Good, the Bad, and the Missing Parts of Kubeflow</h3>
<figure><img alt="" src="https://cdn-images-1.medium.com/max/326/1*cP606VqXOXJkKPWdIZBjxg.png" /><figcaption>Kubeflow</figcaption></figure>
<h4>😄 The good parts:</h4>
<ul><li>Pipelines</li><li>Notebooks, Katib, KServe</li></ul>
<h4>😞 The bad parts:</h4>
<ul><li>Documentation, tutorials, installation</li></ul>
<h4>🤔 The missing parts:</h4>
<ul><li>Monitoring models</li><li>Model registry</li><li>Initial setup</li></ul>
<h4>What’s coming for Kubeflow?</h4>
<ul><li>Finish CNCF graduation</li><li>Establish a TOC (Technical Oversight Committee)</li><li>arm64 support</li><li>Conformance testing</li></ul>
<h3>🧠 AutoML & Training Working Group Updates</h3>
<p>The AutoML working group (WG) is responsible for all aspects of AutoML features in Kubeflow, with Katib as its sub-project. 
Katib is a Kubernetes-native project with rich support for hyperparameter tuning, neural architecture search, and early stopping algorithms.</p>
<figure><img alt="" src="https://cdn-images-1.medium.com/max/769/1*NJyayYjUBsFkBpzEGVvvmQ.png" /><figcaption>Katib features</figcaption></figure>
<h4>Katib architecture</h4>
<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*DFrIJ9Std7EY_gkp7RePHA.png" /><figcaption>Katib architecture</figcaption></figure>
<h4>Katib’s future</h4>
<figure><img alt="" src="https://cdn-images-1.medium.com/max/635/1*nDdJCyqqquAxV_FviVpf0g.png" /></figure>
<h4>Training Operator overview</h4>
<p>Kubeflow Training Operator is a Kubernetes-native project for fine-tuning and scalable distributed training of machine learning (ML) models created with various ML frameworks such as PyTorch, TensorFlow, XGBoost, and others.</p>
<p>Users can integrate other ML libraries such as <a href="https://huggingface.co/">HuggingFace</a>, <a href="https://github.com/microsoft/DeepSpeed">DeepSpeed</a>, or <a href="https://github.com/NVIDIA/Megatron-LM">Megatron</a> with Training Operator to orchestrate their ML training on Kubernetes.</p>
<figure><img alt="" src="https://cdn-images-1.medium.com/max/790/1*Cv8dwiSuJZaPcVLHF7j80Q.png" /><figcaption>Training Operator features</figcaption></figure>
<figure><img alt="" src="https://cdn-images-1.medium.com/max/719/1*gkEN7mITUsh_BdYoTdTghg.png" /><figcaption>Example of distributed training for PyTorch</figcaption></figure>
<h4>Training Operator roadmap:</h4>
<figure><img alt="" src="https://cdn-images-1.medium.com/max/756/1*Yz5Mr_HJfvACp_jTazPUBQ.png" /></figure>
<h4>Conclusion:</h4>
<p>Given the current AI trend and the need for performant, cost-effective deployment strategies for ML models, Kubeflow can be an interesting option for companies that haven’t already migrated to cloud-native environments.</p>
<p>Was this helpful? Confusing? 
If you have any questions, feel free to contact me!</p>
<p>Before you leave:</p>
<p>👏 Clap for the story</p>
<p>📰 Subscribe for more posts like this @malek.zaag ⚡️</p>
<p>👉👈 Please follow me: <a href="https://github.com/Malek-Zaag">GitHub</a> | <a href="https://www.linkedin.com/in/malekzaag/">LinkedIn</a></p>
<hr>
<p><a href="https://medium.com/google-cloud/kubeflow-summit-europe-2024-ed60cada2f03">Kubeflow Summit Europe 2024 ✨</a> was originally published in <a href="https://medium.com/google-cloud">Google Cloud - Community</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>
Author
Link
Published date
Image url
Feed url
Guid
Hidden blurb
--- !ruby/object:Feedjira::Parser::RSSEntry title: Kubeflow Summit Europe 2024 ✨ published: 2024-04-12 12:26:03.000000000 Z categories: - ai - kubernetes - machine-learning - kubeflow - cloud-native url: https://medium.com/google-cloud/kubeflow-summit-europe-2024-ed60cada2f03?source=rss----e52cf94d98af---4 entry_id: !ruby/object:Feedjira::Parser::GloballyUniqueIdentifier is_perma_link: 'false' guid: https://medium.com/p/ed60cada2f03 carlessian_info: news_filer_version: 2 newspaper: Google Cloud - Medium macro_region: Blogs rss_fields: - title - published - categories - url - entry_id - content - author author: Malek ZAAG
Language
Active
Ricc internal notes
Imported via /Users/ricc/git/gemini-news-crawler/webapp/db/seeds.d/import-feedjira.rb on 2024-04-16 21:08:38 +0200. Content is EMPTY here. Entries: title,published,categories,url,entry_id,content,author. TODO add Newspaper: filename = /Users/ricc/git/gemini-news-crawler/webapp/db/seeds.d/../../../crawler/out/feedjira/Blogs/Google Cloud - Medium/2024-04-12-Kubeflow_Summit_Europe_2024_✨-v2.yaml
Ricc source