♊️ GemiNews 🗞️
(dev)
Editing article
Title
Summary
Content
<h3>Secure Together — Federated Learning for Decentralized Security on GCP</h3><p>Integrating security mechanisms to enhance organizational security posture with FL</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/500/0*yCOc3CDRbhGEVJwn.jpeg" /></figure><p>As I might have emphasized enough, I am not a machine learning guy, nor am I the AI boss who can talk shop about models and all the jargon that goes with them. But you can rest assured that if you’re reading this article to learn, you will learn: if I could, you can as well.</p><p>Federated Learning (FL) enables cooperative training on decentralized data. By keeping sensitive data on individual devices or inside organizational silos, this approach promotes security and privacy in security-sensitive applications. Google Cloud provides a stable platform for implementing FL workflows, which makes it a desirable choice for building decentralized security solutions.</p><p>This article explores the fundamental ideas of Federated Learning (FL), looks at how it can support decentralized security on Google Cloud, and presents use cases along with tools and code samples.</p><h3>Understanding FL</h3><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*if1Dz3C2Ej8Nx9D5.png" /></figure><p>Traditional machine learning algorithms frequently require large volumes of data to be gathered in one central location for training. This approach raises privacy issues, particularly when handling sensitive data such as medical records or financial transactions. Federated learning offers a strong alternative.</p><p>In FL, the training procedure is managed by a central coordinator that never has direct access to the individual data points. The workflow breaks down as follows (a minimal sketch of the aggregation step follows the list):</p><ul><li>Model Distribution: The coordinator distributes a preliminary global model to the participating devices or organizations.</li><li>Local Training: Each participant trains the model locally on its own data. Because the raw data never leaves the device or silo, this localized training preserves privacy.</li><li>Model Updates: Instead of sending raw data, participants send the coordinator only the model updates (gradients), greatly cutting down on communication overhead.</li><li>Global Model Aggregation: The coordinator aggregates the received updates and applies them to improve the global model.</li><li>Iteration: Steps 1–4 are repeated for a number of rounds, iteratively improving the global model without jeopardizing data privacy.</li></ul>
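<p>To make the aggregation step concrete, here is a minimal, framework-free sketch of federated averaging (FedAvg), the canonical FL aggregation rule. It is an illustration under assumed shapes and names, not code from a specific library: the coordinator averages the participants’ weights, weighted by how many local samples each participant trained on.</p><pre>import numpy as np<br><br>def federated_average(updates, sample_counts):<br>    # updates: one list of weight arrays per participant<br>    # sample_counts: number of local training samples per participant<br>    total = float(sum(sample_counts))<br>    # Weight each participant's contribution by its share of the data<br>    return [<br>        sum(w[layer] * (n / total) for w, n in zip(updates, sample_counts))<br>        for layer in range(len(updates[0]))<br>    ]<br><br># One training round: three participants, two weight tensors each<br>rng = np.random.default_rng(0)<br>updates = [[rng.normal(size=(4, 2)), rng.normal(size=(2,))] for _ in range(3)]<br>global_model = federated_average(updates, sample_counts=[100, 50, 250])</pre><p>In a real deployment the updates travel over a secure channel and the averaged weights are redistributed for the next round; frameworks such as TensorFlow Federated implement this loop for you.</p>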
<h4>So what are the benefits?</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*cUasHA5oNG5hzkMY.png" /></figure><p>FL offers a number of benefits for building privacy-preserving, secure solutions on Google Cloud:</p><ul><li>Enhanced Data Privacy: By keeping data decentralized, FL reduces the risk of data breaches and unauthorized access. Organizations handling sensitive security data, such as threat intelligence or user behavior patterns, benefit especially from this.</li><li>Improved Regulatory Compliance: By minimizing data collection and sharing, FL can help businesses comply with stringent data privacy laws such as the California Consumer Privacy Act and the General Data Protection Regulation.</li><li>Collaborative Threat Intelligence Sharing: FL allows security teams from different organizations to collaborate securely. Without disclosing their proprietary threat intelligence datasets, they can jointly train a threat detection model, promoting a more thorough understanding of the evolving threat landscape.</li><li>On-Device Security Training: FL enables security models to be trained directly on user devices. This protects user privacy while enabling real-time, personalized threat detection and anomaly identification.</li><li>Federated Learning with Secure Multi-party Computation (SMC): FL can be combined with SMC techniques to perform secure computations on sensitive data distributed among several parties. This opens opportunities for sophisticated, privacy-preserving analytics in security applications.</li></ul><h3>Getting to work</h3><figure><img alt="" src="https://cdn-images-1.medium.com/max/453/0*Nk2dhSQSVM-GBvrX.jpg" /></figure><p>Let’s talk about some of the ways we can use FL to strengthen security posture.</p><h4><strong>Collaborative Malware Detection</strong></h4><p>Conventional malware detection frequently relies on signature-based techniques, which compare files against known malicious patterns. However, signature-based methods struggle to identify zero-day attacks, where attackers employ novel tactics.</p><p>Collaborative malware detection addresses this limitation by sharing threat intelligence among various systems (a toy sketch of such an exchange follows the list). This intelligence may consist of:</p><ul><li>File hashes of known malware: By exchanging file hashes, systems can swiftly recognize malware that has already been encountered elsewhere.</li><li>Data from behavioral analysis: Exchanging information about how files interact with the system makes it easier to spot suspicious patterns of behavior.</li><li>Indicators of Compromise (IOCs): Collective defense is strengthened when information related to malware campaigns, such as URLs, IP addresses, and domain names, is shared.</li></ul><p>By pooling this shared intelligence, collaborative detection systems are better able to recognize new malware variants and emerging threats.</p>
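<p>As a minimal illustration of the first item, here is a hypothetical sketch (the digests and helper names are placeholders, not from a real feed) in which each organization contributes only SHA-256 digests of known-bad files, so the samples themselves never leave their silos:</p><pre>import hashlib<br><br>def file_sha256(path: str) -> str:<br>    # Hash the file contents; only this digest is ever shared<br>    with open(path, "rb") as f:<br>        return hashlib.sha256(f.read()).hexdigest()<br><br># Each organization publishes digests of malware it has already seen<br>org_a_hashes = {"9f86d081884c7d65...", "60303ae22b998861..."}  # placeholder digests<br>org_b_hashes = {"fd61a03af4f77d87...", "9f86d081884c7d65..."}<br><br># The shared blocklist is the union of everyone's contributions<br>shared_blocklist = org_a_hashes | org_b_hashes<br><br>def is_known_malware(path: str) -> bool:<br>    return file_sha256(path) in shared_blocklist</pre><p>In practice such digests would live in a shared, access-controlled datastore; FL then adds the behavioral model on top of this simple exchange.</p>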
<p><strong><em>Prepping ourselves</em></strong></p><ul><li>Collect Data: Compile a wide range of benign and malware samples, such as PE and APK files. Public malware datasets are available online, but make sure to observe ethical and legal requirements.</li></ul><pre>import apache_beam as beam<br><br>class IngestMalware(beam.DoFn):<br>    def process(self, element):<br>        # element: one CSV line of sample metadata, e.g. "filename,source"<br>        file_name, source = element.split(',')<br>        # download_and_save_malware is a user-supplied helper that fetches<br>        # the sample from its source and uploads it to Cloud Storage<br>        download_and_save_malware(file_name, source)<br>        yield {'filePath': f'gs://your-bucket/{file_name}'}<br><br>with beam.Pipeline() as pipeline:<br>    malware_data = (<br>        pipeline<br>        | 'ReadMetadata' >> beam.io.ReadFromText('path/to/metadata.csv', skip_header_lines=1)<br>        | 'IngestMalware' >> beam.ParDo(IngestMalware())<br>    )</pre><ul><li>Data Labeling: Assign a malicious or benign label to every file. Security experts can do this manually, or crowdsourcing platforms can help.</li><li>Data Preprocessing: Clean and prepare the data according to the requirements of the selected machine learning model. This could entail formatting, normalization, and feature extraction.</li></ul><pre>import kfp.components as comp<br><br># NOTE: each "comp...." below is an elided component factory in the original;<br># in practice you would build these with comp.create_component_from_func<br><br># Download and pre-process internal security data<br>download_security_data = comp....(source="internal_security_logs")<br>preprocess_security_data = comp....(inputs=[download_security_data.outputs["data"]])<br><br># Download and pre-process public threat intelligence data<br>download_threat_intel = comp....(source="public_threat_feed_url")<br>preprocess_threat_intel = comp....(inputs=[download_threat_intel.outputs["data"]])<br><br># Merge both pre-processed datasets<br>merged_data = comp....(inputs=[preprocess_security_data.outputs["data"], preprocess_threat_intel.outputs["data"]])<br><br># Assemble the components into a pipeline (schematic; the real KFP API wires<br># components inside a function decorated with @kfp.dsl.pipeline)<br>training_pipeline = comp.pipeline(<br>    name="data_preprocessing_pipeline",<br>    description="Preprocesses data for malware detection model training",<br>    components=[<br>        download_security_data,<br>        preprocess_security_data,<br>        download_threat_intel,<br>        preprocess_threat_intel,<br>        merged_data,<br>    ],<br>)</pre><p>I know you guys are professionals, so we won’t delve deeper into this with code. Moving on!</p><p><strong><em>Training Our Model</em></strong></p><ul><li>Select a Model: Choose a machine learning model appropriate for the format of your data (e.g., image-style classification for executables, NLP for scripts). TensorFlow and scikit-learn models are popular options.</li><li>Create a Training Script: Write a Python script that loads, preprocesses, and trains the model on your labeled data. Use Vertex AI Training for resource management and distributed training, as in the sketch below.</li></ul><pre>from google.cloud import aiplatform<br><br># A minimal sketch using the Vertex AI SDK; adjust names and URIs to your setup<br>project = "your-project-id"<br>location = "us-central1"<br><br>aiplatform.init(project=project, location=location)<br><br># Endpoint that will later serve the trained model<br>endpoint = aiplatform.Endpoint.create(display_name="malware-detection-endpoint")<br><br># Register the labeled feature data (e.g., a CSV produced by the<br># preprocessing pipeline) as a managed dataset<br>dataset = aiplatform.TabularDataset.create(<br>    display_name="malware-dataset",<br>    gcs_source="gs://your-bucket/labeled_features.csv",<br>)<br><br># Custom training job that runs your training script on managed hardware<br>job = aiplatform.CustomTrainingJob(<br>    display_name="malware-detection-training",<br>    script_path="train.py",  # the training script from the previous step<br>    container_uri="us-docker.pkg.dev/vertex-ai/training/tf-cpu.2-8:latest",<br>)<br><br>job.run(<br>    dataset=dataset,<br>    training_fraction_split=0.8,  # 80/10/10 train/validation/test split<br>    validation_fraction_split=0.1,<br>    test_fraction_split=0.1,<br>    machine_type="n1-standard-4",  # adjust machine type as needed<br>)<br><br># Retrain periodically (e.g., every 30 days via Cloud Scheduler) to stay<br># up to date; monitor progress in the Vertex AI console</pre><p><strong><em>Alert generation</em></strong></p><p>This code sample shows a Cloud Function triggered by a Pub/Sub message that carries a malware detection finding from Vertex AI. Based on the collaborative detection results, the function reads the threat type of the finding and, if it indicates malware, generates an alert.</p><pre>import base64<br>import json<br><br>def analyze_malware_finding(event, context):<br>    # Decode the Pub/Sub message payload (Pub/Sub base64-encodes event data)<br>    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))<br>    finding = payload["finding"]<br><br>    # Check if the finding indicates malware based on collaborative detection results<br>    if finding["threat_type"] == "MALWARE":<br>        # Generate an alert with details from the finding<br>        alert_message = f"Potential Malware Detected: {finding['file_hash']}"<br>        # Send the alert using a notification service (e.g., Cloud Monitoring)</pre>
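<p>To exercise this function end to end, you could publish a synthetic finding to the triggering topic. A hypothetical sketch, where the topic name and message schema are assumptions matching the function above:</p><pre>import json<br>from google.cloud import pubsub_v1<br><br>publisher = pubsub_v1.PublisherClient()<br># "malware-findings" is an assumed topic name<br>topic_path = publisher.topic_path("your-project-id", "malware-findings")<br><br># Assumed finding schema, matching what the Cloud Function expects<br>finding = {<br>    "finding": {<br>        "threat_type": "MALWARE",<br>        "file_hash": "9f86d081884c7d65...",  # placeholder digest<br>    }<br>}<br><br># Pub/Sub delivers the payload base64-encoded to the function<br>future = publisher.publish(topic_path, json.dumps(finding).encode("utf-8"))<br>print(f"Published finding: {future.result()}")</pre>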
<p><strong><em>Alert Integration (Cloud Monitoring API)</em></strong></p><pre>from google.cloud import monitoring_v3<br><br>project = "your-project-id"<br>client = monitoring_v3.AlertPolicyServiceClient()<br><br># Define the alert policy details<br>alert_policy = monitoring_v3.AlertPolicy(<br>    display_name="malware_detection_alert",<br>    # ... conditions, combiner, and notification channels go here<br>)<br><br># Create the alert policy under the project<br>client.create_alert_policy(<br>    name=f"projects/{project}",<br>    alert_policy=alert_policy,<br>)</pre><p><strong>Note: </strong>This is a simplified overview. You’ll need to fill in the details based on your specific requirements and chosen tools. Refer to the Vertex AI and Cloud Monitoring documentation for comprehensive instructions and code examples.</p><h3>Resources</h3><ul><li>Vertex AI Pipelines: <a href="https://cloud.google.com/vertex-ai/docs/pipelines/introduction">https://cloud.google.com/vertex-ai/docs/pipelines/introduction</a></li><li>Custom Training in Vertex AI: <a href="https://cloud.google.com/vertex-ai/docs/training/overview">https://cloud.google.com/vertex-ai/docs/training/overview</a></li><li>Cloud Monitoring Metrics: <a href="https://cloud.google.com/monitoring/api/metrics_gcp">https://cloud.google.com/monitoring/api/metrics_gcp</a></li><li>Alerting Policies in Cloud Monitoring: <a href="https://cloud.google.com/monitoring/alerts">https://cloud.google.com/monitoring/alerts</a></li><li>Federated Learning comic from Google AI: <a href="https://federated.withgoogle.com/">https://federated.withgoogle.com/</a></li></ul><h3>Get in Touch</h3><p><a href="https://imranfosec.linkb.org/">Imran Roshan</a></p><hr><p><a href="https://medium.com/google-cloud/secure-together-federated-learning-for-decentralized-security-on-gcp-4c6219ba8f09">Secure Together — Federated Learning for Decentralized Security on GCP</a> was originally published in <a href="https://medium.com/google-cloud">Google Cloud - Community</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>
Author
Link
Published date
Image url
Feed url
Guid
Hidden blurb
--- !ruby/object:Feedjira::Parser::RSSEntry title: Secure Together — Federated Learning for Decentralized Security on GCP url: https://medium.com/google-cloud/secure-together-federated-learning-for-decentralized-security-on-gcp-4c6219ba8f09?source=rss----e52cf94d98af---4 author: Imran Roshan categories: - technology - ai - google-cloud-platform - machine-learning - python published: 2024-03-28 10:20:14.000000000 Z entry_id: !ruby/object:Feedjira::Parser::GloballyUniqueIdentifier is_perma_link: 'false' guid: https://medium.com/p/4c6219ba8f09 carlessian_info: news_filer_version: 2 newspaper: Google Cloud - Medium macro_region: Blogs rss_fields: - title - url - author - categories - published - entry_id - content
Language
Active
Ricc internal notes
Imported via /Users/ricc/git/gemini-news-crawler/webapp/db/seeds.d/import-feedjira.rb on 2024-03-31 22:53:31 +0200. Content is EMPTY here. Entried: title,url,author,categories,published,entry_id,content. TODO add Newspaper: filename = /Users/ricc/git/gemini-news-crawler/webapp/db/seeds.d/../../../crawler/out/feedjira/Blogs/Google Cloud - Medium/2024-03-28-Secure__Together — Federated_Learning_for_Decentralized_Security-v2.yaml
Ricc source