web analytics

Archive for the ‘GCP’ Category

Where is the OSI model in the public cloud?

When talking about the public cloud, I always like the analogy to the OSI model.

“The Open Systems Interconnection model (OSI model) is a conceptual model. Communications between a computing system are split into seven different abstraction layers: Physical, Data Link, Network, Transport, Session, Presentation, and Application” (Wikipedia)

A similar and shorter model of the OSI model is the TCP/IP model.

Here is a comparison of the two models:

In the public cloud, we find a similar concept when talking about the shared responsibility model, where we draw the line of responsibility between the public cloud provider and the customers, in the different cloud service models, usually in terms of security, as we can see in the diagram below:

Where do public cloud services fit in the OSI model?

There are many networks related services in each of the major public cloud providers.

To make things easy to understand, I have prepared the following diagram, comparing common network-related services to the various OSI model layers:

Encryption / Cryptography and the OSI Model

Layer 6 of the OSI model is the presentation layer.

Among the things, we can find in this layer is data encryption.

Encryption in this context is about encryption at rest – from object storage, block storage, file storage, and various data services.

Encryption includes symmetric and asymmetric encryption keys, secrets, passwords, API keys, certificates, etc.

The process includes the generation, storage, retrieval, and rotation of encryption keys.

Here are the most common encryption /cryptography-related services:

Identity Management and the OSI Model

Layer 7 of the OSI model is the application layer.

Among the things we can find in this layer are related to authentication and authorization, or the entire identity management.

Identity management is about managing the entire lifecycle of identity – from an end user, service account, computer accounts, etc.

The process includes account provisioning, password management (and MFA), permission management (role assignments), and finally account de-provisioning.

Here are the most common identity-related services:

How does everything come together?

When reviewing a cloud architecture, I like to compare the various services in the architecture to the different layers of the OSI model, from the bottom up:

  • Network connectivity and traffic flow
  • Encryption (according to data sensitivity)
  • Authentication and Authorization (according to the least privilege principle)

The OSI model analogy, assist me to make sure I do not forget any important aspect when reviewing an architecture for a cloud workload.

Sustainability in the cloud era

When thinking about cloud computing, we immediately think about technology.

Have we ever stopped to think about how much energy this sort of technology requires to operate an average cloud data center, and what is the environmental effect of running such huge data centers around the world?

Data centers generate around 1% of the energy consumed around the world, daily.

Data centers consume a lot of energy – electricity (for running the servers) and water (for cooling the servers).

The more energy a common data center consumes, the bigger its carbon footprint (the total amount of greenhouse gases that is generated by running a data center).

In the past couple of years, there is a new concept for professionals working with cloud services, with high environmental awareness called cloud sustainability.

The idea behind it (from a cloud provider’s point of view) is to achieve 100% renewable energy – replace fuel-based electricity with wind and solar power, within a few years.

All major cloud providers (AWS, Azure, and GCP) put a lot of effort into building a new data center to be powered by green energy and making changes to the existing data center to lower their emissions as much as possible and use green energy as well.

To remain transparent to their customers, the major cloud providers have created carbon footprint tools:

  • AWS customer carbon footprint tool

https://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/what-is-ccft.html

  • Microsoft Sustainability Calculator

https://aka.ms/SustainabilityCalculator

  • GCP Carbon Footprint

https://cloud.google.com/carbon-footprint

  • Cloud Carbon Footprint (Open source) tool

https://www.cloudcarbonfootprint.org/docs/getting-started

Indeed, most of the responsibility for keeping the cloud data centers green is under the responsibility of the cloud providers, since they build and maintain their data centers, but what is our responsibility as consumers?

As an example, here is AWS’s point of view regarding the shared responsibility model, in the context of sustainability:

Source: https://docs.aws.amazon.com/wellarchitected/latest/sustainability-pillar/the-shared-responsibility-model.html

How to act as responsible cloud consumers?

Region selection

Review business requirements (compliance, latency, cost, service, and features), and pay attention to regions with a low carbon footprint.

Additional information:

  • AWS – What to Consider when Selecting a Region for your Workloads

https://aws.amazon.com/blogs/architecture/what-to-consider-when-selecting-a-region-for-your-workloads

  • Carbon-free energy for Google Cloud regions

https://cloud.google.com/sustainability/region-carbon

  • Measuring greenhouse gas emissions in data centers: the environmental impact of cloud computing

https://www.climatiq.io/blog/measure-greenhouse-gas-emissions-carbon-data-centres-cloud-computing

Architecture design considerations

Use cloud-native design patterns:

  • Microservices – use containers (and Kubernetes) to deploy your applications and leverage the scaling capabilities of the cloud
  • Serverless – use serverless (or function as a service) whenever you can decouple your applications into small functions
  • Use message queues as much as possible, to decouple your applications and lower the number of requests between the various services/components
  • Use caching mechanisms to lower the number of queries to backend systems

Infrastructure considerations

Embed the following as part of your infrastructure considerations:

  • Right-sizing – when using VMs, always remember to right-size the VM size to your application demands
  • Use up-to-date hardware – when using VMs, always use the latest VM family types and the latest block storage type, to suit your application demands
  • ARM-based processors – consider using ARM processors (such as AWS Graviton Processor, Azure Ampere Altra Arm-based processors, GCP Ampere Altra Arm processors, and more), whenever your application supports the ARM technology (for better performance and lower cost)
  • Idle hardware – monitor and shut down (or even delete) unused or idle hardware (VMs, databases, etc.)
  • GPU – use GPUs only for tasks that are considered more efficient than CPUs (such as machine learning, rendering, transcoding, etc.)
  • Spot instances – use spot instances, whenever your application supports sudden interruptions
  • Schedule automatic start and stop of VMs – use scheduling capabilities (such as AWS Instance scheduler, Azure Start/Stop VMs, GCP start and stop virtual machine (VM) instances, etc.) to control the behavior of your workload VMs
  • Managed services – prefer to use PaaS or managed services (from databases, storage, load-balancers, and more)
  • Data lifecycle management – use object storage (or file storage) lifecycle policies to archive or remove unused or unnecessary data
  • Auto-scaling – use the cloud built-in capabilities to scale horizontally according to your application load
  • Content Delivery Network – use CDN (such as Amazon CloudFront, Azure Content Delivery Network, Google Cloud CDN, etc.) to lower the amount of customer traffic to your publicly exposed workloads

Summary

Sustainability and green computing are here to stay.

Although the large demand for cloud services has a huge environmental impact, I strongly believe that the use of cloud services is much more environmentally friendly than any use of legacy data center, for the following reasons:

  • Efficient hardware utilization (nearly 100% of hardware utilization)
  • Fast hardware replacement (due to high utilization)
  • Better energy use (high use of renewable energy sources to support the electricity requirements)

I advise all cloud customers, to put sustainability higher in their design considerations.

Additional reading materials

  • AWS Well-Architected Framework – Sustainability Pillar

https://docs.aws.amazon.com/wellarchitected/latest/sustainability-pillar/sustainability-pillar.html

  • Microsoft Azure Well-Architected Framework – Sustainability

https://learn.microsoft.com/en-us/azure/architecture/framework/sustainability/sustainability-get-started

  • Google Cloud – Design for environmental sustainability

https://cloud.google.com/architecture/framework/system-design/sustainability

Journey for writing my first book about cloud security

My name is Eyal, and I am a cloud architect.

I have been in the IT industry since 1998 and began working with public clouds in 2015.

Over the years I have gained hands-on experience working on the infrastructure side of AWS, Azure, and GCP.

The more I worked with the various services from the three major cloud providers, the more I had the urge to compare the cloud providers’ capabilities, and I have shared several blog posts comparing the services.

In 2021 I was approached by PACKT publishing after they came across one of my blog posts on social media, and they offered me the opportunity to write a book about cloud security, comparing AWS, Azure, and GCP services and capabilities.

Over the years I have published many blog posts through social media and public websites, but this was my first experience writing an entire book with the support and assistance of a well-known publisher.

As with any previous article, I began by writing down each chapter title and main headlines for each chapter.

Once the chapters were approved, I moved on to write the actual chapters.

For each chapter, I first wrote down the headlines and then began filling them with content.

Before writing each chapter, I have done research on the subject, collected references from the vendors’ documentation, and looked for security best practices.

Once I have completed a chapter, I submitted it for review by the PACKT team.

PACKT team, together with external reviewers, sent me their input, things to change, additional material to add, request for relevant diagrams, and more.

Since copyright and plagiarism are important topics to take care of while writing a book, I have prepared my diagrams and submitted them to PACKT.

Finally, after a lot of review and corrections, which took almost a year, the book draft was submitted to another external reviewer and once comments were fixed, the work on the book (at least from my side as an author) was completed.

From my perspective, the book is unique by the fact that it does not focus on a single public cloud provider, but it constantly compares between the three major cloud providers.

From a reader’s point of view or someone who only works with a single cloud provider, I recommend focusing on the relevant topics according to the target cloud provider.

For each topic, I made a list of best practices, which can also be referenced as a checklist for securing the cloud providers’ environment, and for each recommendation I have added reference for further reading from the vendors’ documentation.

If you are interested in learning how to secure cloud environments based on AWS, Azure, or GCP, my book is available for purchase in one of the following book stores:

  • Amazon:

https://www.amazon.com/Cloud-Security-Handbook-effectively-environments/dp/180056919X

  • Barnes & Noble:

https://www.barnesandnoble.com/w/cloud-security-handbook-eyal-estrin/1141215482?ean=9781800569195

  • PACKT

https://www.packtpub.com/product/cloud-security-handbook/9781800569195

Not all cloud providers are built the same

When organizations debate workload migration to the cloud, they begin to realize the number of public cloud alternatives that exist, both U.S hyper-scale cloud providers and several small to medium European and Asian providers.

The more we study the differences between the cloud providers (both IaaS/PaaS and SaaS providers), we begin to realize that not all cloud providers are built the same.

How can we select a mature cloud provider from all the alternatives?

Transparency

Mature cloud providers will make sure you don’t have to look around their website, to locate their security compliance documents, allow you to download their security controls documentation, such as SOC 2 Type II, CSA Star, CSA Cloud Controls Matrix (CCM), etc.

What happens if we wish to evaluate the cloud provider by ourselves?

Will the cloud provider (no matter what cloud service model), allow me to conduct a security assessment (or even a penetration test), to check the effectiveness of his security controls?

Global presence

When evaluating cloud providers, ask yourself the following questions:

  1. Does the cloud provider have a local presence near my customers?
  2. Will I be able to deploy my application in multiple countries around the world?
  3. In case of an outage, will I be able to continue serving my customers from a different location with minimal effort?

Scale

Deploying an application for the first time, we might not think about it, but what happens in the peak scenario?

Will the cloud provider allow me to deploy hundreds or even thousands of VM’s (or even better, containers), in a short amount of time, for a short period, from the same location?

Will the cloud provider allow me infinite scale to store my data in cloud storage, without having to guess or estimate the storage size?

Multi-tenancy

As customers, we expect our cloud providers to offer us a fully private environment.

We never want to hear about “noisy neighbor” (where one customer is using a lot of resources, which eventually affect other customers), and we never want to hear a provider admits that some or all of the resources (from VMs, database, storage, etc.) are being shared among customers.

Will the cloud provider be able to offer me a commitment to a multi-tenant environment?

Stability

One of the major reasons for migrating to the cloud is the ability to re-architect our services, whether we are still using VMs based on IaaS, databases based on PaaS, or fully managed CRM services based on SaaS.

In all scenarios, we would like to have a stable service with zero downtime.

Will the cloud provider allow me to deploy a service in a redundant architecture, that will survive data center outage or infrastructure availability issues (from authentication services, to compute, storage, or even network infrastructure) and return to business with minimal customer effect?

APIs

In the modern cloud era, everything is based on API (Application programming interface).

Will the cloud provider offer me various APIs?

From deploying an entire production environment in minutes using Infrastructure as Code, to monitoring both performances of our services, cost, and security auditing – everything should be allowed using API, otherwise, it is simply not scale/mature/automated/standard and prone to human mistakes.

Data protection

Encrypting data at transit, using TLS 1.2 is a common standard, but what about encryption at rest?

Will the cloud provider allow me to encrypt a database, object storage, or a simple NFS storage using my encryption keys, inside a secure key management service?

Will the cloud provider allow me to automatically rotate my encryption keys?

What happens if I need to store secrets (credentials, access keys, API keys, etc.)? Will the cloud provider allow me to store my secrets in a secured, managed, and audited location?

In case you are about to store extremely sensitive data (from PII, credit card details, healthcare data, or even military secrets), will the cloud provider offer me a solution for confidential computing, where I can store sensitive data, even in memory (or in use)?

Well architected

A mature cloud provider has a vast amount of expertise to share knowledge with you, about how to build an architecture that will be secure, reliable, performance efficient, cost-optimized, and continually improve the processes you have built.

Will the cloud provider offer me rich documentation on how to achieve all the above-mentioned goals, to provide your customers the best experience?

Will the cloud provider offer me an automated solution for deploying an entire application stack within minutes from a large marketplace?

Cost management

The more we broaden our use of the IaaS / PaaS service, the more we realize that almost every service has its price tag.

We might not prepare for this in advance, but once we begin to receive the monthly bill, we begin to see that we pay a lot of money, sometimes for services we don’t need, or for an expensive tier of a specific service.

Unlike on-premise, most cloud providers offer us a way to lower the monthly bill or pay for what we consume.

Regarding cost management, ask yourself the following questions:

Will the cloud provider charge me for services when I am not consuming them?

Will the cloud provider offer me detailed reports that will allow me to find out what am I paying for?

Will the cloud provider offer me documents and best practices for saving costs?

Summary

Answering the above questions with your preferred cloud provider, will allow you to differentiate a mature cloud provider, from the rest of the alternatives, and to assure you that you have made the right choice selecting a cloud provider.

The answers will provide you with confidence, both when working with a single cloud provider, and when taking a step forward and working in a multi-cloud environment.

References

Security, Trust, Assurance, and Risk (STAR)

https://cloudsecurityalliance.org/star/

SOC 2 – SOC for Service Organizations: Trust Services Criteria

https://www.aicpa.org/interestareas/frc/assuranceadvisoryservices/aicpasoc2report.html

Confidential Computing and the Public Cloud

https://eyal-estrin.medium.com/confidential-computing-and-the-public-cloud-fa4de863df3

Confidential computing: an AWS perspective

https://aws.amazon.com/blogs/security/confidential-computing-an-aws-perspective/

AWS Well-Architected

https://aws.amazon.com/architecture/well-architected

Azure Well-Architected Framework

https://docs.microsoft.com/en-us/azure/architecture/framework/

Google Cloud’s Architecture Framework

https://cloud.google.com/architecture/framework

Oracle Architecture Center

https://docs.oracle.com/solutions/

Alibaba Cloud’s Well-Architectured Framework

https://www.alibabacloud.com/architecture/index

The Future of Data Security Lies in the Cloud

We have recently read a lot of posts about the SolarWinds hack, a vulnerability in a popular monitoring software used by many organizations around the world.

This is a good example of supply chain attack, which can happen to any organization.

We have seen similar scenarios over the past decade, from the Heartbleed bug, Meltdown and Spectre, Apache Struts, and more.

Organizations all around the world were affected by the SolarWinds hack, including the cybersecurity company FireEye, and Microsoft.

Events like these make organizations rethink their cybersecurity and data protection strategies and ask important questions.

Recent changes in the European data protection laws and regulations (such as Schrems II)  are trying to limit data transfer between Europe and the US.

Should such security breaches occur? Absolutely not.

Should we live with the fact that such large organization been breached? Absolutely not!

Should organizations, who already invested a lot of resources in cloud migration move back workloads to on-premises? I don’t think so.

But no organization, not even major financial organizations like banks or insurance companies, or even the largest multinational enterprises, have enough manpower, knowledge, and budget to invest in proper protection of their own data or their customers’ data, as hyperscale cloud providers.

There are several of reasons for this:

  1. Hyperscale cloud providers invest billions of dollars improving security controls, including dedicated and highly trained personnel.
  2. Breach of customers’ data that resides at hyperscale cloud providers can drive a cloud provider out of business, due to breach of customer’s trust.
  3. Security is important to most organizations; however, it is not their main line of expertise.
    Organization need to focus on their core business that brings them value, like manufacturing, banking, healthcare, education, etc., and rethink how to obtain services that support their business goals, such as IT services, but do not add direct value.

Recommendations for managing security

Security Monitoring

Security best practices often state: “document everything”.
There are two downsides to this recommendation: One, storage capacity is limited and two, most organizations do not have enough trained manpower to review the logs and find the top incidents to handle.

Switching security monitoring to cloud-based managed systems such as Azure Sentinel or Amazon Guard​Duty, will assist in detecting important incidents and internally handle huge logs.

Encryption

Another security best practice state: “encrypt everything”.
A few years ago, encryption was quite a challenge. Will the service/application support the encryption? Where do we store the encryption key? How do we manage key rotation?

In the past, only banks could afford HSM (Hardware Security Module) for storing encryption keys, due to the high cost.

Today, encryption is standard for most cloud services, such as AWS KMS, Azure Key Vault, Google Cloud KMS and Oracle Key Management.

Most cloud providers, not only support encryption at rest, but also support customer managed key, which allows the customer to generate his own encryption key for each service, instead of using the cloud provider’s generated encryption key.

Security Compliance

Most organizations struggle to handle security compliance over large environments on premise, not to mention large IaaS environments.

This issue can be solved by using managed compliance services such as AWS Security Hub, Azure Security Center, Google Security Command Center or Oracle Cloud Access Security Broker (CASB).

DDoS Protection

Any organization exposing services to the Internet (from publicly facing website, through email or DNS service, till VPN service), will eventually suffer from volumetric denial of service.

Only large ISPs have enough bandwidth to handle such an attack before the border gateway (firewall, external router, etc.) will crash or stop handling incoming traffic.

The hyperscale cloud providers have infrastructure that can handle DDoS attacks against their customers, services such as AWS Shield, Azure DDoS Protection, Google Cloud Armor or Oracle Layer 7 DDoS Mitigation.

Using SaaS Applications

In the past, organizations had to maintain their entire infrastructure, from messaging systems, CRM, ERP, etc.

They had to think about scale, resilience, security, and more.

Most breaches of cloud environments originate from misconfigurations at the customers’ side on IaaS / PaaS services.

Today, the preferred way is to consume managed services in SaaS form.

These are a few examples: Microsoft Office 365, Google Workspace (Formerly Google G Suite), Salesforce Sales Cloud, Oracle ERP Cloud, SAP HANA, etc.

Limit the Blast Radius

To limit the “blast radius” where an outage or security breach on one service affects other services, we need to re-architect infrastructure.

Switching from applications deployed inside virtual servers to modern development such as microservices based on containers, or building new applications based on serverless (or function as a service) will assist organizations limit the attack surface and possible future breaches.

Example of these services: Amazon ECS, Amazon EKS, Azure Kubernetes Service, Google Kubernetes Engine, Google Anthos, Oracle Container Engine for Kubernetes, AWS Lambda, Azure Functions, Google Cloud Functions, Google Cloud Run, Oracle Cloud Functions, etc.

Summary

The bottom line: organizations can increase their security posture, by using the public cloud to better protect their data, use the expertise of cloud providers, and invest their time in their core business to maximize value.

Security breaches are inevitable. Shifting to cloud services does not shift an organization’s responsibility to secure their data. It simply does it better.

Confidential Computing and the Public Cloud

What exactly is “confidential computing” and what are the reasons and benefits for using it in the public cloud environment?

Introduction to data encryption

To protect data stored in the cloud, we usually use one of the following methods:

· Encryption at transit — Data transferred over the public Internet can be encrypted using the TLS protocol. This method prohibits unwanted participants from entering the conversation.

· Encryption at rest — Data stored at rest, such as databases, object storage, etc., can be encrypted using symmetric encryption which means using the same encryption key to encrypt and decrypt the data. This commonly uses the AES256 algorithm.

When we wish to access encrypted data, we need to decrypt the data in the computer’s memory to access, read and update the data.

This is where confidential computing comes in — trying to protect the gap between data at rest and data at transit.

Confidential Computing uses hardware to isolate data. Data is encrypted in use by running it in a trusted execution environment (TEE).

As of November 2020, confidential computing is supported by Intel Software Guard Extensions (SGX) and AMD Secure Encrypted Virtualization (SEV), based on AMD EPYC processors.

Comparison of the available options

 Intel SGXIntel SGX2AMD SEV 1AMD SEV 2
PurposeMicroservices and small workloadsMachine Learning and AICloud and IaaS workloads (above the hypervisor), suitable for legacy applications or large workloadsCloud and IaaS workloads (above the hypervisor), suitable for legacy applications or large workloads
Cloud VM support (November 2020)
Cloud containers support (November 2020)
Operating system supportedWindows, LinuxLinuxLinuxLinux
Memory limitationUp to 128MBUp to 1TBUp to available RAMUp to available RAM
Software changesRequire software rewriteRequire software rewriteNot required

Reference Architecture

AMD SEV Architecture:

Azure Kubernetes Service (AKS) Confidential Computing:

References

· Confidential Computing: Hardware-Based Trusted Execution for Applications and Data

https://confidentialcomputing.io/wp-content/uploads/sites/85/2020/10/ConfidentialComputing_Outreach_Whitepaper-8-5×11-1.pdf

· Google Cloud Confidential VMs vs Azure Confidential Computing

https://msandbu.org/google-cloud-confidential-vms-vs-azure-confidential-computing/

· A Comparison Study of Intel SGX and AMD Memory Encryption Technology

https://caslab.csl.yale.edu/workshops/hasp2018/HASP18_a9-mofrad_slides.pdf

· SGX-hardware listhttps://github.com/ayeks/SGX-hardware

· Performance Analysis of Scientific Computing Workloads on Trusted Execution Environments

https://arxiv.org/pdf/2010.13216.pdf

· Helping Secure the Cloud with AMD EPYC Secure Encrypted Virtualization

https://developer.amd.com/wp-content/resources/HelpingSecuretheCloudwithAMDEPYCSEV.pdf

· Azure confidential computing

https://azure.microsoft.com/en-us/solutions/confidential-compute/

· Azure and Intel commit to delivering next generation confidential computing

https://azure.microsoft.com/en-us/blog/azure-and-intel-commit-to-delivering-next-generation-confidential-computing/

· DCsv2-series VM now generally available from Azure confidential computing

https://azure.microsoft.com/en-us/blog/dcsv2series-vm-now-generally-available-from-azure-confidential-computing/

· Confidential computing nodes on Azure Kubernetes Service (public preview)

https://docs.microsoft.com/en-us/azure/confidential-computing/confidential-nodes-aks-overview

· Expanding Google Cloud’s Confidential Computing portfolio

https://cloud.google.com/blog/products/identity-security/expanding-google-clouds-confidential-computing-portfolio

· A deeper dive into Confidential GKE Nodes — now available in preview

https://cloud.google.com/blog/products/identity-security/confidential-gke-nodes-now-available

· Using HashiCorp Vault with Google Confidential Computing

https://www.hashicorp.com/blog/using-hashicorp-vault-with-google-confidential-computing

· Confidential Computing is cool!

https://medium.com/google-cloud/confidential-computing-is-cool-1d715cf47683

· Data-in-use protection on IBM Cloud using Intel SGX

https://www.ibm.com/cloud/blog/data-use-protection-ibm-cloud-using-intel-sgx

· Why IBM believes Confidential Computing is the future of cloud security

https://venturebeat.com/2020/10/16/why-ibm-believes-confidential-computing-is-the-future-of-cloud-security/

· Alibaba Cloud Released Industry’s First Trusted and Virtualized Instance with Support for SGX 2.0 and TPM

https://www.alibabacloud.com/blog/alibaba-cloud-released-industrys-first-trusted-and-virtualized-instance-with-support-for-sgx-2-0-and-tpm_596821

Running MySQL Managed Database in the Cloud

Today, more and more organizations are moving to the public cloud and choosing open source databases. They are choosing this for a variety of reasons, but license cost is one of the main ones.

In this post, we will review some of the common alternatives for running MySQL database inside a managed environment.

Legacy applications may be a reason for manually deploying and managing MySQL database.

Although it is possible to deploy a virtual machine, and above it manually install MySQL database (or even a MySQL cluster), unless your organization have a dedicated and capable DBA, I recommend looking at what brings value to your organization. Unless databases directly influence your organization’s revenue, I recommend paying the extra money and choosing a managed solution based on a Platform as a Service model.

It is important to note that several cloud providers offer data migration services to assist migrating existing MySQL (or even MS-SQL and Oracle) databases from on-premise to a managed service in the cloud.

Benefits of using managed database solutions

  • Easy deployment – With a few clicks from within the web console, or using CLI tools, you can deploy fully managed MySQL databases (or a MySQL cluster)
  • High availability and Read replica – Configurable during the deployment phase and after the product has already been deployed, according to customer requirements
  • Maintenance – The entire service maintenance (including database fine-tuning, operating system, and security patches, etc.) is done by the cloud provider
  • Backup and recovery – Embedded inside the managed solution and as part of the pricing model
  • Encryption at transit and at rest – Embedded inside the managed solution
  • Monitoring – As with any managed solution, cloud providers monitor service stability and allow customers access to metrics for further investigation (if needed)

Alternatives for running managed MySQL database in the cloud

Summary

As you can read in this article, running MySQL database in a managed environment in the cloud is a viable option, and there are various reasons for taking this step (from license cost, decrease man power maintaining the database and operating system, backups, security, availability, etc.)

References

How to run HPC in the cloud?

Is it feasible to run HPC in the cloud? How different is it from running a local HPC cluster? What are some of the common alternatives for running HPC in the cloud?

Introduction

Before beginning our discussion about HPC (High Performance Computing) in the cloud, let us talk about what exactly HPC really means?

“High Performance Computing most generally refers to the practice of aggregating computing power in a way that delivers much higher performance than one could get out of a typical desktop computer or workstation in order to solve large problems in science, engineering, or business.” (https://www.usgs.gov/core-science-systems/sas/arc/about/what-high-performance-computing)

In more technical terms – it refers to a cluster of machines composed of multiple cores (either physical or virtual cores), a lot of memory, fast parallel storage (for read/write) and fast network connectivity between cluster nodes.

HPC is useful when you need a lot of compute resources, from image or video rendering (in batch mode) to weather forecasting (which requires fast connectivity between the cluster nodes).

The world of HPC is divided into two categories:

  • Loosely coupled – In this scenario you might need a lot of compute resources, however, each task can run in parallel and is not dependent on other tasks being completed.

Common examples of loosely coupled scenarios: Image processing, genomic analysis, etc.

  • Tightly coupled – In this scenario you need fast connectivity between cluster resources (such as memory and CPU), and each cluster node depends on other nodes for the completion of the task. Common examples of tightly coupled scenarios: Computational fluid dynamics, weather prediction, etc.

Pricing considerations

Deploying an HPC cluster on premise requires significant resources. This includes a large investment in hardware (multiple machines connected in the cluster, with many CPUs or GPUs, with parallel storage and sometimes even RDMA connectivity between the cluster nodes), manpower with the knowledge to support the platform, a lot of electric power, and more.

Deploying an HPC cluster in the cloud is also costly. The price of a virtual machine with multiple CPUs, GPUs or large amount of RAM can be very high, as compared to purchasing the same hardware on premise and using it 24×7 for 3-5 years.

The cost of parallel storage, as compared to other types of storage, is another consideration.

The magic formula is to run HPC clusters in the cloud and still have the benefits of (virtually) unlimited compute/memory/storage resources is to build dynamic clusters.

We do this by building the cluster for a specific job, according to the customer’s requirements (in terms of number of CPUs, amount of RAM, storage capacity size, network connectivity between the cluster nodes, required software, etc.). Once the job is completed, we copy the job output data and take down the entire HPC cluster in-order to save unnecessary hardware cost.

Alternatives for running HPC in the cloud

Summary

As you can see, running HPC in the public cloud is a viable option. But you need to carefully plan the specific solution, after gathering the customer’s exact requirements in terms of required compute resources, required software and of course budget estimation.

Product documentation

  • Azure Batch

https://azure.microsoft.com/en-us/services/batch/

  • Azure CycleCloud

https://azure.microsoft.com/en-us/features/azure-cyclecloud/

  • AWS ParallelCluster

https://aws.amazon.com/hpc/parallelcluster/

  • Slurm on Google Cloud Platform

https://github.com/SchedMD/slurm-gcp

  • HPC on Oracle Cloud Infrastructure

https://www.oracle.com/cloud/solutions/hpc.html

What makes a good cloud architect?

Virtually any organization active in the public cloud needs at least one cloud architect to be able to see the big picture and to assist designing solutions.

So, what makes a cloud architect a good cloud architect?

In a word – be multidisciplinary.

Customer-Oriented

While the position requires good technical skills, a good cloud architect must have good customer facing skills. A cloud architect needs to understand the business needs, from the end-users (usually connecting from the Internet) to the technological teams. That means being able to speak many “languages,” and translate from one to the another while navigating the delicate nuances of each. All in the same conversation.

At the end of the day, the technology is just a means to serve your customers.

Sometimes a customer may ask for something non-technical at all (“Draw me a sheep…”) and sometimes it could be very technical (“I want to expose an API to allow read and update backend database”).

A good cloud architect knows how to take make a drawing of a sheep into a full-blown architecture diagram, complete with components, protocols, and more. In other worlds, translating a business or customer requirement into a technical requirement.

Technical Skills

Here are a few of the technical skills good cloud architects should have under their belts.

  • Operating systems – Know how to deploy and troubleshoot problems related to virtual machines, based on both Windows and Linux.
  • Cloud services – Be familiar with at least one public cloud provider’s services (such as AWS, Azure, GCP, Oracle Cloud, etc.). Even better to be familiar with at least two public cloud vendors since the world is heading toward multi-cloud environments.
  • Networking – Be familiar with network-related concepts such as OSI model, TCP/IP, IP and subnetting, ACLs, HTTP, routing, DNS, etc.
  • Storage – Be familiar with storage-related concepts such as object storage, block storage, file storage, snapshots, SMB, NFS, etc.
  • Database – Be familiar with database-related concepts such as relational database, NoSQL database, etc.
  • Architecture – Be familiar with concepts such as three-tier architecture, micro-services, serverless, twelve-factor app, API, etc.

Information Security

A good cloud architect can read an architecture diagram and knows which questions to ask and which security controls to embed inside a given solution.

  • Identity management – Be familiar with concepts such as directory services, Identity and access management (IAM), Active Directory, Kerberos, SAML, OAuth, federation, authentication, authorization, etc.
  • Auditing – Be familiar with concepts such as audit trail, access logs, configuration changes, etc.
  • Cryptography – Be familiar with concepts such as TLS, public key authentication, encryption at transit & at rest, tokenization, hashing algorithms, etc.
  • Application Security – Be familiar with concepts such as input validation, OWASP Top10, SDLC, SQL Injection, etc.

Laws, Regulation and Standards

In our dynamic world a good cloud architect needs to have at least a basic understanding of the following topics:

  • Laws and Regulation – Be familiar with privacy regulations such as GDPR, CCPA, etc., and how they affect your organization’s cloud environments and products
  • Standards – Be familiar with standards such as ISO 27001 (Information Security Management), ISO 27017 (Cloud Security), ISO 27018 (Protection of PII in public clouds), ISO 27701 (Privacy), SOC 2, CSA Security Trust Assurance and Risk (STAR), etc.
  • Contractual agreements – Be able to read contracts between customers and public cloud providers, and know which topics need to appear in a typical contract (SLA, business continuity, etc.)

Code

Good cloud architects, like a good DevOps guys or gals, are not afraid to get their hands dirty and be able read and write code, mostly for automation purposes.

The required skills vary from scenario to scenario, but in most cases include:

  • CLI – Be able to run command line tools, in-order to query existing environment settings up to updating or deploying new components.
  • Scripting – Be familiar with at least one scripting language, such as PowerShell, Bash scripts, Python, Java Script, etc.
  • Infrastructure as a Code – Be familiar with at least one declarative language, such as HashiCorp Terraform, AWS Cloud​Formation, Azure Resource Manager, Google Cloud Deployment Manager, RedHat Ansible, etc.
  • Programming languages – Be familiar with at least one programming language, such as Java, Microsoft .NET, Ruby, etc.

Sales

A good cloud architect needs to be able to “sell” a solution to various audiences. Again the required skills vary from scenario to scenario, but in most cases include:

Summary

Recruiting a good cloud architect is indeed challenging. The role requires multidisciplinary skills – from soft skills (been a customer-oriented and salesperson) to deep technical skills (technology, cloud services, information security, etc.)

There is no alternative to years of hands-on experience. The more areas of experience cloud architects have, the better they will succeed at the job.

References

  • What is a cloud architect? A vital role for success in the cloud.

https://www.cio.com/article/3282794/what-is-a-cloud-architect-a-vital-role-for-success-in-the-cloud.html

  • Want to Become a Cloud Architect? Here’s How

https://www.businessnewsdaily.com/10767-how-to-become-a-cloud-architect.html

How to Achieve Long Term Cost Savings Using Cloud Services

The relatively high cost of cloud computing resources, compared to on-premise solutions, is a major challenge for organizations migrating to public cloud services. In this post, we will review several available plans for long-term cost saving of compute resources.

Background

The Pay-As-You-Go, or Pay on-Demand, is the most common option for paying for actual usage when consuming cloud resources. This method is suitable when the required compute power is changing or unpredictable. A good example of this is for services migrated from on-premise to the public cloud (Lift & Shift), or new environments (Dev/Test), and more.

The second most common pricing option is called Spot (Amazon EC2 Spot Instances, Azure Spot Virtual Machines or Google Preemptible Virtual Machines). These options can potentially deliver a discount of up to 90% and are best when there is a demand for large amounts of compute power, and the service is not sensitive to disruptions. Spot Instances are suitable for scenarios when compute power is required by another paying customer and the cloud provider claims the machine back, with 30-second to 2-minute notifications. This method is suitable for image/video processing, batch processing, HPC services, etc.

Reserved Instance

This is the most common pricing option for saving costs. Users commit to one to three years of usage in advance, with a potential savings of up to 70%.

Reserved Instances are available with various payment methods. These range from

  • “All Upfront” – where you pay the entire server cost in advance for the entire commitment period
  • “Partial Upfront” – where you pay the server costs on monthly basis in installments, until the end of the commitment period
  •  “No Upfront” – where you pay a fixed price for the server cost until till the end of the commitment period

There are also options for more flexibility on Reserved Instance options. These include Standard RI, where you commit to a certain instance type (instance family type, operating system, payment method, etc.), and Convertible RI, where you are allowed to change the instance type (instance family type, operating system, etc.) during the commitment period.

Additional information about Reserved Instance options can be found at:

It is important to note that these cost-saving options are not limited to virtual servers. It is possible to purchase a commitment for managed services, such as Amazon RDS Reserved Instances, Azure SQL Database reserved capacity, Azure Blob storage reserved capacity, and more.

AWS Saving Plans

AWS has a flexible pricing option, like AWS Reserved Instances, which allows up to 72% discount.

These plans include two alternatives:

  • Compute Saving Plans – This plan allows you to commit to resource consumption in advance, with the flexibility to choose, and change, instance family type, instance size (ratio between CPU/Memory), region, availability zone and operating systems. The Compute Saving plan covers compute resources from virtual machines (EC2 instances), through AWS Fargate and up to Amazon Lambda.
  • EC2 Instance Saving Plans – This plan allows you to save on virtual servers’ costs. However, it is limited to virtual servers from specific instance family types, in a specific region. It is still possible to change instance size (ratio between CPU/Memory), availability zone and operating systems.

Additional information can be found at: https://aws.amazon.com/savingsplans/faq/

Google Sustained Use Discounts

This plan is designed to encourage customers to commit to long term use of Google compute resources, such as virtual servers or Google Kubernetes Engine, for any constant time, longer than 25% of the month. This plan grants an automatic discount of between 20% and 30% of the price list. No action needs to be taken; the discount is applied when reaching the plan’s minimum consumption level of compute resources.

Additional information can be found at: https://cloud.google.com/compute/docs/sustained-use-discounts

Conclusion

The first step toward enjoying long-term cost savings is understanding your compute demands. Studying up and staying up to date on vendors’ various pricing plans and options, then matching those to your needs and environments, is the key to achieving the most cost-effective public cloud solution.