Editing article
Title
Summary
<div class="block-paragraph_advanced"><p><strong style="font-style: italic; vertical-align: baseline;">Editor’s note:</strong><span style="font-style: italic; vertical-align: baseline;"> Stanford University Assistant Professor Paul Nuyujukian and his team at the Brain Inferencing Laboratory explore motor systems neuroscience and neuroengineering applications as part of an effort to create brain-machine interfaces for medical conditions such as stroke and epilepsy. This blog explores how the team is using Google Cloud data storage, computing and analytics capabilities to streamline the collection, processing, and sharing of that scientific data, for the betterment of science and to adhere to funding agency regulations. </span></p> <p><span style="vertical-align: baseline;">Scientific discovery, now more than ever, depends on large quantities of high-quality data and sophisticated analyses performed on those data. In turn, the ability to reliably capture and store data from experiments and process them in a scalable and secure fashion is becoming increasingly important for researchers. Furthermore, collaboration and peer-review are critical components of the processes aimed at making discoveries accessible and useful across a broad range of audiences. </span></p> <p><span style="vertical-align: baseline;">The cornerstones of scientific research are rigor, reproducibility, and transparency — critical elements that ensure scientific findings can be trusted and built upon [</span><a href="https://grants.nih.gov/policy/reproducibility/index.htm" rel="noopener" target="_blank"><span style="text-decoration: underline; vertical-align: baseline;">1</span></a><span style="vertical-align: baseline;">]. 
Recently, US federal funding agencies have adopted strict guidelines around the availability of research data, so applying data best practices is no longer just practical and beneficial for science; it is now compulsory [</span><a href="https://www.nature.com/articles/d41586-022-00402-1" rel="noopener" target="_blank"><span style="text-decoration: underline; vertical-align: baseline;">2</span></a><span style="vertical-align: baseline;">, </span><a href="https://grants.nih.gov/grants/guide/notice-files/NOT-OD-21-013.html" rel="noopener" target="_blank"><span style="text-decoration: underline; vertical-align: baseline;">3</span></a><span style="vertical-align: baseline;">, </span><a href="https://www.whitehouse.gov/ostp/news-updates/2023/01/11/fact-sheet-biden-harris-administration-announces-new-actions-to-advance-open-and-equitable-research" rel="noopener" target="_blank"><span style="text-decoration: underline; vertical-align: baseline;">4</span></a><span style="vertical-align: baseline;">, </span><a href="https://www.whitehouse.gov/wp-content/uploads/2022/08/08-2022-OSTP-Public-Access-Memo.pdf" rel="noopener" target="_blank"><span style="text-decoration: underline; vertical-align: baseline;">5</span></a><span style="vertical-align: baseline;">]. Fortunately, Google Cloud provides a wealth of data storage, computing and analytics capabilities that can be used to streamline the collection, processing, and sharing of scientific data. </span></p> <p><span style="vertical-align: baseline;">Prof. Paul Nuyujukian and his </span><a href="https://bil.stanford.edu" rel="noopener" target="_blank"><span style="text-decoration: underline; vertical-align: baseline;">research team</span></a><span style="vertical-align: baseline;"> at Stanford’s Brain Inferencing Laboratory explore motor systems neuroscience and neuroengineering applications. 
Their work involves studying how the brain controls movement and recovers from injury, and establishing brain-machine interfaces as a platform technology for a variety of brain-related medical conditions, particularly stroke and epilepsy. The relevant data is obtained from experiments on preclinical models and human clinical studies. The raw experimental data collected in these experiments is extremely valuable and virtually impossible to reproduce exactly (not to mention the potential costs involved).</span></p></div> <div class="block-image_full_width"> <div class="article-module h-c-page"> <div class="h-c-grid"> <figure class="article-image--large h-c-grid__col h-c-grid__col--6 h-c-grid__col--offset-3 " > <img src="https://storage.googleapis.com/gweb-cloudblog-publish/images/stanford-gitlab-post-figure-1.max-1000x1000.jpg" alt="stanford-gitlab-post-figure-1"> </a> <figcaption class="article-image__caption "><p data-block-key="4dq7h">Fig. 1: Schematic representation of a scientific computation workflow</p></figcaption> </figure> </div> </div> </div> <div class="block-paragraph_advanced"><p><span style="vertical-align: baseline;">To address the challenges outlined above, Prof. Nuyujukian has developed a sophisticated data collection and analysis platform that is in large part inspired by the practices that make up the </span><a href="https://en.wikipedia.org/wiki/DevOps" rel="noopener" target="_blank"><span style="text-decoration: underline; vertical-align: baseline;">DevOps approach</span></a><span style="vertical-align: baseline;"> common in software development [</span><a href="https://doi.org/10.48550/arXiv.2310.08247" rel="noopener" target="_blank"><span style="text-decoration: underline; vertical-align: baseline;">6</span></a><span style="vertical-align: baseline;">, Fig. 2]. Keys to the success of this system are standardization, automation, repeatability and scalability. 
The platform allows for both standardized analyses and “one-off” or ad-hoc analyses in a heterogeneous computing environment. The critical components of the system are containers, Git, CI/CD (leveraging </span><a href="https://docs.gitlab.com/runner/" rel="noopener" target="_blank"><span style="text-decoration: underline; vertical-align: baseline;">GitLab Runners</span></a><span style="vertical-align: baseline;">), and high-performance compute clusters, both on-premises and in cloud environments such as Google Cloud, in particular </span><a href="https://cloud.google.com/kubernetes-engine/docs/concepts/autopilot-overview"><span style="text-decoration: underline; vertical-align: baseline;">Google Kubernetes Engine</span></a><span style="vertical-align: baseline;"> (GKE) running in </span><a href="https://cloud.google.com/kubernetes-engine/docs/concepts/autopilot-overview"><span style="text-decoration: underline; vertical-align: baseline;">Autopilot</span></a><span style="vertical-align: baseline;"> mode.</span></p></div> <div class="block-image_full_width"> <div class="article-module h-c-page"> <div class="h-c-grid"> <figure class="article-image--large h-c-grid__col h-c-grid__col--6 h-c-grid__col--offset-3 " > <img src="https://storage.googleapis.com/gweb-cloudblog-publish/images/stanford-gitlab-post-figure-2.max-1000x1000.png" alt="stanford-gitlab-post-figure-2"> </a> <figcaption class="article-image__caption "><p data-block-key="c5vlt">Fig. 2: Leveraging DevOps for Scientific Computing</p></figcaption> </figure> </div> </div> </div> <div class="block-paragraph_advanced"><p><span style="vertical-align: baseline;">Google Cloud provides a secure, scalable, and highly interoperable framework for the various analyses that need to be run on the data collected from scientific experiments (spanning basic science and clinical studies). 
</span><a href="https://docs.gitlab.com/ee/ci/pipelines/" rel="noopener" target="_blank"><span style="text-decoration: underline; vertical-align: baseline;">GitLab Pipelines</span></a><span style="vertical-align: baseline;"> specify the transformations and analyses that need to be applied to the various datasets. GitLab Runner instances running on GKE (or other on-premises cluster/high-performance computing environments) are used to execute these pipelines in a scalable and cost-effective manner. Autopilot environments in particular provide substantial advantages to researchers since they are fully managed and require only minimal customization or ongoing “manual” maintenance. Furthermore, they scale automatically with the demand for analyses that need to be run, even with </span><a href="https://cloud.google.com/kubernetes-engine/docs/concepts/spot-vms"><span style="text-decoration: underline; vertical-align: baseline;">spot VM pricing</span></a><span style="vertical-align: baseline;">, allowing for cost-effective computation. They then scale down to near-zero when idle and scale back up as demand increases, all without intervention by the researcher.</span></p> <p><span style="vertical-align: baseline;">GitLab pipelines have a clear and well-organized structure defined in YAML files. Data transformations are often multi-stage, and GitLab’s framework explicitly supports such an approach. Defaults can be set for an entire pipeline, such as the various data transformation stages, and can be overridden for particular stages where necessary. Since the exact steps of a data transformation pipeline can be context- or case-dependent, conditional logic is supported along with dynamic definition of pipelines, e.g., definitions depending on the outcome of previous analysis steps. 
Critically, different stages of a GitLab pipeline can be executed by different runners, facilitating the execution of pipelines across heterogeneous environments, for example transferring data from experimental acquisition systems and processing them in cloud or on-premises computing spaces [Fig. 3].</span></p></div> <div class="block-image_full_width"> <div class="article-module h-c-page"> <div class="h-c-grid"> <figure class="article-image--large h-c-grid__col h-c-grid__col--6 h-c-grid__col--offset-3 " > <img src="https://storage.googleapis.com/gweb-cloudblog-publish/images/stanford-gitlab-post-figure-3.max-1000x1000.png" alt="stanford-gitlab-post-figure-3"> </a> <figcaption class="article-image__caption "><p data-block-key="wni3p">Fig. 3: Architecture of the Google Cloud based scientific computation workflow via GitLab Runners hosted on Google Kubernetes Engine</p></figcaption> </figure> </div> </div> </div> <div class="block-paragraph_advanced"><p><span style="vertical-align: baseline;">Cloud computing resources can provide exceptional scalability, while pipelines allow for parallel execution of stages to take advantage of this scalability, allowing researchers to execute transformations at scale and substantially speed up data processing and analysis. Parametrization of pipelines allows researchers to automate the validation of processing protocols across many acquired datasets or analytical variations, yielding robust, reproducible, and sustainable data analysis workflows.</span></p> <p><span style="vertical-align: baseline;">Collaboration and data sharing is another critical, and now mandatory, aspect of scientific discovery. Multiple generations of researchers, from the same lab or different labs, may interact with particular datasets and analysis workflows over a long period of time. 
Standardized pipelines like the ones described above can play a central role in providing transparency on how data is collected and how it is processed, since they are essentially self-documenting. That, in turn, allows for scalable and repeatable discovery. Data provenance, for example, is explicitly supported by this framework. Through the extensive use of containers, workflows are also well encapsulated and no longer depend on specifically tuned local computing environments. This consequently leads to increased rigor, reproducibility and transparency, enabling a large audience to interact productively with datasets and data transformation workflows.</span></p> <p><span style="vertical-align: baseline;">In conclusion, by using the computing, data storage, and transformation technologies available from Google Cloud along with workflow capabilities of CI/CD engines like GitLab, researchers can build highly capable and cost-effective scientific data-analysis environments that aid efforts to increase rigor, reproducibility, and transparency, while also achieving compliance with relevant government regulations.</span></p> <p><span style="vertical-align: baseline;">References:</span></p> <ol> <li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"> <p role="presentation"><a href="https://grants.nih.gov/policy/reproducibility/index.htm" rel="noopener" target="_blank"><span style="text-decoration: underline; vertical-align: baseline;">Enhancing Reproducibility through Rigor and Transparency</span></a></p> </li> <li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"> <p role="presentation"><a href="https://www.nature.com/articles/d41586-022-00402-1" rel="noopener" target="_blank"><span style="text-decoration: underline; vertical-align: baseline;">NIH issues a seismic mandate: share data publicly</span></a></p> </li> <li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"> <p role="presentation"><a 
href="https://grants.nih.gov/grants/guide/notice-files/NOT-OD-21-013.html" rel="noopener" target="_blank"><span style="text-decoration: underline; vertical-align: baseline;">Final NIH Policy for Data Management and Sharing</span></a></p> </li> <li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"> <p role="presentation"><a href="https://www.whitehouse.gov/ostp/news-updates/2023/01/11/fact-sheet-biden-harris-administration-announces-new-actions-to-advance-open-and-equitable-research" rel="noopener" target="_blank"><span style="text-decoration: underline; vertical-align: baseline;">FACT SHEET: Biden-Harris Administration Announces New Actions to Advance Open and Equitable Research</span></a></p> </li> <li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"> <p role="presentation"><a href="https://www.whitehouse.gov/wp-content/uploads/2022/08/08-2022-OSTP-Public-Access-Memo.pdf" rel="noopener" target="_blank"><span style="text-decoration: underline; vertical-align: baseline;">MEMORANDUM FOR THE HEADS OF EXECUTIVE DEPARTMENTS AND AGENCIES</span></a></p> </li> <li aria-level="1" style="list-style-type: decimal; vertical-align: baseline;"> <p role="presentation"><a href="https://doi.org/10.48550/arXiv.2310.08247" rel="noopener" target="_blank"><span style="text-decoration: underline; vertical-align: baseline;">Leveraging DevOps for Scientific Computing</span></a></p> </li> </ol></div>
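<div class="block-paragraph_advanced"><p><span style="vertical-align: baseline;">The pipeline features described above (multiple stages, pipeline-wide defaults with per-job overrides, conditional logic, parallel parametrized runs, and different runners for different stages) can be sketched in a single GitLab CI file. This is a minimal illustration under stated assumptions, not the lab's actual configuration: the image name, runner tags, script names, and variables (DATASET_ID, RUN_ANALYSIS, ANALYSIS) are all hypothetical.</span></p></div>

```yaml
# Hypothetical .gitlab-ci.yml sketch. Image, tags, scripts, and variables
# are illustrative assumptions, not taken from the original article.
stages:
  - transfer
  - preprocess
  - analyze

variables:
  DATASET_ID: "example-session"      # hypothetical dataset identifier
  RUN_ANALYSIS: "true"               # toggles the conditional analyze job

# Pipeline-wide defaults: every job uses this container image and is
# picked up by runners carrying this tag unless the job overrides them.
default:
  image: registry.example.org/lab/analysis:latest   # hypothetical image
  tags:
    - gke-autopilot                  # hypothetical tag for runners on GKE

transfer-raw-data:
  stage: transfer
  tags:
    - acquisition-rig                # override: run next to the acquisition system
  script:
    - ./scripts/transfer.sh "$DATASET_ID"

preprocess:
  stage: preprocess
  script:
    - python preprocess.py --dataset "$DATASET_ID"
  artifacts:
    paths:
      - preprocessed/                # handed to later stages

analyze:
  stage: analyze
  rules:
    - if: '$RUN_ANALYSIS == "true"'  # conditional logic: skip unless enabled
  parallel:
    matrix:
      - ANALYSIS: [spectral, decoding]   # one parallel job per analysis variant
  script:
    - python analyze.py --dataset "$DATASET_ID" --analysis "$ANALYSIS"
```

<div class="block-paragraph_advanced"><p><span style="vertical-align: baseline;">In this sketch the job-level tags entry replaces the default, so the transfer job runs on a runner near the acquisition hardware while the remaining jobs run on the cluster-hosted runners.</span></p></div>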
Content
empty
Author
Link
Published date
Image url
Feed url
Guid
Hidden blurb
--- !ruby/object:Feedjira::Parser::RSSEntry published: 2024-03-22 16:00:00.000000000 Z entry_id: !ruby/object:Feedjira::Parser::GloballyUniqueIdentifier guid: https://cloud.google.com/blog/products/containers-kubernetes/stanford-team-uses-devops-tools-to-manage-research-data/ title: Using GKE and applying DevOps principles for scientific research at Stanford categories: - DevOps & SRE - Containers & Kubernetes carlessian_info: news_filer_version: 2 newspaper: Google Cloud Blog macro_region: Technology url: https://cloud.google.com/blog/products/containers-kubernetes/stanford-team-uses-devops-tools-to-manage-research-data/ rss_fields: - title - url - summary - author - categories - published - entry_id author: Paul Nuyujukian
Language
Active
Ricc internal notes
Imported via /Users/ricc/git/gemini-news-crawler/webapp/db/seeds.d/import-feedjira.rb on 2024-03-31 23:23:57 +0200. Content is EMPTY here. Entried: title,url,summary,author,categories,published,entry_id. TODO add Newspaper: filename = /Users/ricc/git/gemini-news-crawler/webapp/db/seeds.d/../../../crawler/out/feedjira/Technology/Google Cloud Blog/2024-03-22-Using_GKE_and_applying_DevOps_principles_for_scientific_research-v2.yaml
Ricc source