Benefits of using a managed database as a service in the cloud

When using public cloud services for relational databases, you have two options:

  • IaaS solution – Install a database server on top of a virtual machine
  • PaaS solution – Connect to a managed database service

In the traditional data center, organizations had to maintain the operating system and the database by themselves.

The benefit of this approach is clear – full control over the entire stack.

The downside: the organization must handle availability, license costs, and security (access control, patch levels, hardening, auditing, etc.) on its own.

Today, all the major public cloud vendors offer managed services for databases in the cloud.

To connect to the database and begin working, all a customer needs is a DNS name, port number and credentials.
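As an illustration, here is a minimal Python sketch of assembling those three pieces of information into a connection string. The endpoint, database name, and credentials below are made-up placeholders, and the commented-out connect call assumes a driver such as psycopg2 is installed:

```python
def build_dsn(host, port, dbname, user, password):
    """Assemble a PostgreSQL connection string from the details the
    cloud provider hands you: a DNS name, a port number, and credentials."""
    return f"postgresql://{user}:{password}@{host}:{port}/{dbname}"

# Hypothetical endpoint of a managed instance -- the provider supplies these values.
dsn = build_dsn(
    host="mydb.example.us-east-1.rds.amazonaws.com",  # DNS name from the provider
    port=5432,
    dbname="appdb",
    user="app_user",
    password="s3cret",
)

# With a driver such as psycopg2 installed, connecting is a single call:
# conn = psycopg2.connect(dsn)
print(dsn)
```

Notice there is nothing here about operating systems, patches, or storage – the provider handles all of that behind the DNS name.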

The benefits of a managed database service are:

  • Easy administration – No need to maintain the operating system (including patch level for the OS and for the database, system hardening, backup, etc.)
  • Scalability – The number of virtual machines in the cluster grows automatically with load, as does the storage space allocated for the data
  • High availability – The cluster can be configured to span across multiple availability zones (physical data centers)
  • Performance – Usually the cloud provider installs the database on SSD storage
  • Security – Encryption at rest and in transit
  • Monitoring – Built into the service
  • Cost – Pay only for what you use

Not all features available in the on-premises version of a database are available in its PaaS version, and not all common databases are offered as managed services by the major cloud providers.

Amazon RDS

Amazon's managed database service currently (as of April 2018) supports the following database engines:

Azure Managed Databases

Microsoft Azure managed database services currently (as of April 2018) support the following database engines:

Google Cloud SQL

Google managed database services currently (as of April 2018) support the following database engines:

Oracle Database Cloud Service

Oracle managed database services currently (as of April 2018) support the following database engines:

Cloud Provider Service Limits

When working with cloud service providers, you will at some point run into service/quota limits.

Some limits are per account/subscription, some are per region, and some are per pricing tier (free tier vs. billable).

Here are some of the most common reasons for service / quota limitations:

  • Performance on the cloud provider’s side – running many virtual machines in the same data center consumes significant resources from the cloud provider
  • Avoiding usage spikes – protecting against a situation where one customer consumes so many resources that other customers are affected, eventually causing a denial of service

For more information about default cloud service limits, see:

Default limits can usually be raised by contacting the cloud service provider’s support and requesting a change to the default limitation.
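On the client side, you can also track your own usage against known quotas before provisioning new resources. The following Python sketch is illustrative only – the limit names and values are invented for the example, not any provider's real defaults:

```python
# Illustrative quota table -- real defaults vary by provider, account,
# region, and pricing tier.
DEFAULT_LIMITS = {"vcpus_per_region": 32, "vms_per_region": 20}

def can_provision(current_usage, requested, limits=DEFAULT_LIMITS):
    """Return True only if the request stays within every known quota."""
    return all(
        current_usage.get(key, 0) + requested.get(key, 0) <= limit
        for key, limit in limits.items()
    )

# Requesting 4 more vCPUs when 30 of 32 are used exceeds the quota.
print(can_provision({"vcpus_per_region": 30, "vms_per_region": 10},
                    {"vcpus_per_region": 4, "vms_per_region": 1}))
```

A check like this lets automation fail fast with a clear message instead of hitting the provider's quota error mid-deployment.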

For instructions on how to change the service limitations, see:

Steps to Protect Your Website from Hackers This Year

The term hacking is terrifying when it comes to online business. With the growth of online business over the years, there has been ever more to say about hacking and phishing activity, and in the past couple of years the number of Internet users, online shoppers, and bloggers has increased immensely. That makes security an important factor to emphasize. The case studies and research done on several companies’ data are a little frightening. For example, roughly 50,000–180,000 unauthorized login attempts are seen on WordPress-hosted sites and other websites, many of them brute-force attacks – trying many permutations and combinations of usernames and passwords to gain access to a site. Other websites come under serious attack from hackers who steal important information using various strategic methods. According to an ITRC Data Breach Report, nearly 200 million personal records were left unprotected across 780 data security breaches in 2015. We aren’t trying to scare you, but this is a concern websites face every year, and many struggle to handle it.

Now that you realize how vulnerable your website can be, this article will walk you through the best practices for avoiding threats, hacking, and other intrusions on your website this year. As website developers or owners, we understand the challenges of facing these situations and preventing websites from being hacked. Here is a checklist of essential steps to protect your website this year.

Ban users and set up a lockdown

A lockdown feature limits the number of failed login attempts a visitor can make. This means repeated failed logins – which can eventually become a huge problem – are blocked, putting an end to continuous brute-force attempts. With this feature, whenever repeated hacking and login attempts occur, the site is automatically locked and you are immediately notified of the unauthorized activity. For example, the iThemes Security plugin works quite well here: you can specify a number of failed login attempts after which the plugin bans the attacker’s IP address.
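If you are not using a plugin, the core of such a lockdown feature is simple to sketch. The Python example below is illustrative only – the thresholds are arbitrary, and a real implementation would persist state and expire bans:

```python
import time

class LoginLockdown:
    """Illustrative lockdown: after max_attempts failed logins from one IP
    within a sliding window of `window` seconds, that IP is banned."""

    def __init__(self, max_attempts=5, window=300.0):
        self.max_attempts = max_attempts
        self.window = window
        self._failures = {}   # ip -> list of failure timestamps
        self.banned = set()

    def record_failure(self, ip, now=None):
        now = time.time() if now is None else now
        # Keep only the failures still inside the window, then add this one.
        recent = [t for t in self._failures.get(ip, []) if now - t < self.window]
        recent.append(now)
        self._failures[ip] = recent
        if len(recent) >= self.max_attempts:
            self.banned.add(ip)

    def is_allowed(self, ip):
        return ip not in self.banned
```

A plugin like iThemes Security wraps this same idea with IP banning and notifications built in.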

Always use two-factor authentication

Two-factor authentication (2FA) gives a real boost to your security measures. With 2FA, the user logs in with two separate components, and you can decide the channel for the secret code – email, text message, or phone call. This keeps the website secure while remaining convenient to use. The Google Authenticator plugin lets you deploy this feature in just a few clicks.
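Under the hood, authenticator apps typically implement the standard TOTP algorithm (RFC 6238), which builds on HOTP (RFC 4226). A self-contained Python sketch of both, for illustration only – in production, rely on a vetted plugin or library rather than rolling your own:

```python
import hashlib
import hmac
import struct
import time

def hotp(secret, counter, digits=6):
    """RFC 4226 HOTP: HMAC-SHA1 over an 8-byte counter, dynamically truncated."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                      # last nibble picks the offset
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

def totp(secret, step=30, digits=6):
    """RFC 6238 TOTP: HOTP computed over the current 30-second time window."""
    return hotp(secret, int(time.time()) // step, digits)
```

The server and the user's app share the secret; because both compute the same code for the same time window, a stolen password alone is no longer enough to log in.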

An extra layer of security is not a bad idea

Yes, we are talking about using a Secure Sockets Layer (SSL) certificate to ensure that third-party programs run smoothly and there is no chance of a hijack or shutdown. SSL offers an extra layer of protection by securing all the data and transactions exchanged between the browser and the website. There are many types of SSL certificates, depending on the validation they offer and the type of business they are used for. For example, a Wildcard SSL certificate is widely used to secure a root domain and its multiple subdomains. These certificates help customers build trust in an online business while making their transactions.

Content management system – choose strategically

When working on website security, it is also important to understand your CMS, which plays an important part in protecting the site. There are hundreds of Content Management Systems used to manage digital content; popular ones include Joomla, Drupal, WordPress, Wix, Weebly, and Magnolia. Each has its own security features that contribute to the website’s protection, so keep track of the latest versions and updates of your chosen CMS to keep your website secure.

Get strong passwords

Today every application, email account, or website login needs a password for security reasons. Passwords are essential for protecting your data and other personal information. Use strong passwords that combine lowercase and uppercase letters, numbers, and special characters. Hackers routinely send out viruses and malware that can harm a website’s data and other crucial information, so a strong password, changed regularly, keeps them guessing and helps avert malware and phishing activity. Also store passwords hashed, never in plain text, for further security.
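"Hashed passwords" means the server never stores the password itself, only a salted one-way hash. A minimal sketch using Python's standard library – the iteration count here is an illustrative choice; tune it to current guidance:

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None, iterations=200_000):
    """Derive a salted PBKDF2-HMAC-SHA256 hash; store (salt, digest), never the password."""
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest

def verify_password(password, salt, expected, iterations=200_000):
    """Recompute the hash and compare in constant time to resist timing attacks."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, expected)
```

If the database leaks, the attacker gets salted hashes rather than passwords, and each guess costs them the full iteration count.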

Try validating server side

Browsers can be tricked into submitting malicious input, given how much data and information is shared over the internet. It is therefore very important to validate all forms on the server, not just in the browser. Perform deeper validation on the server side so that malicious or scripted input has no chance of harming your website.
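As a sketch of what server-side validation looks like in practice – the field names and rules below are invented for the example:

```python
import re

EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")

def validate_signup(form):
    """Re-check a submitted form on the server; never trust client-side checks alone."""
    errors = []
    email = form.get("email", "")
    username = form.get("username", "")
    if not EMAIL_RE.match(email):
        errors.append("invalid email")
    # A strict whitelist of allowed characters rejects <script> tags and
    # other markup outright instead of trying to strip them out.
    if not re.fullmatch(r"[A-Za-z0-9_]{3,30}", username):
        errors.append("invalid username")
    return errors
```

Whitelisting what input may contain is generally safer than blacklisting what it may not.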

Secure your website with security tools

There are some good white-hat security tools that help protect your website from hackers and other cyber threats. Some of these products are commercial and some are free to use. It is advisable to test for XSS and SQL injection. OpenVAS is one of the most advanced security scanners and is excellent for finding vulnerabilities, while the Xenotix XSS Exploit Framework is a good choice for testing XSS.

A messy website is not a good idea

A clean website is always better and safer online. Since the latest plugins and updates are available, get rid of the old ones: update to the latest versions of your software and add-ons for a fresh feel and new security features. Give your website fresh air by cleaning up the mess – install new themes, add new plugins, and upgrade your software and other add-ons – and you will avoid security breaches that exploit outdated components.

Best practices for using AWS access keys

AWS access keys enable programmatic or AWS CLI access to services, in a manner similar to using a username and password.

AWS access keys have account privileges – for better and for worse.

For example, if you save the access keys (credentials) of a root account inside code, anyone who obtains this code can do severe damage to your AWS account.

Many stories have been published about security breaches due to access key exposure, especially combined with open source version control systems such as GitHub and GitLab.

In order to avoid security breaches, here is a list of best practices for securing your environment when using access keys:
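One concrete, easily automated practice is scanning code for hard-coded keys before pushing it to a repository. AWS access key IDs follow a well-known pattern, so a simple pre-commit check can be sketched in a few lines of Python (the sample key in the usage test is Amazon's documented example value, not a real credential):

```python
import re

# AWS access key IDs are "AKIA" followed by 16 uppercase alphanumerics.
ACCESS_KEY_RE = re.compile(r"\b(AKIA[0-9A-Z]{16})\b")

def find_access_keys(text):
    """Flag anything that looks like a hard-coded AWS access key ID,
    e.g. as a check run before committing code to a public repository."""
    return ACCESS_KEY_RE.findall(text)
```

Running a scan like this over a diff before every push is a cheap way to catch the exact mistake behind many of the published breach stories.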

 

How I passed the CCSP exam

The CCSP is one of the hardest vendor-neutral cloud related certifications in the industry.

The CCSP exam tests the candidate’s knowledge in the following domains:

  • Architectural Concepts and Design Requirements
  • Cloud Data Security
  • Cloud Platform and Infrastructure Security
  • Cloud Application Security
  • Operations
  • Legal and Compliance

 

I strongly recommend taking this exam if you are a solutions or cloud security architect who is passionate about cloud computing.

Holding the CISSP certification gives you an advantage when taking the exam, given the similar amount of study material, number of exam questions, and exam length.

 

Here are the steps I took in order to pass the exam:

Official Cloud Security Alliance course and exam – I attended the CCSK course and took the official exam.

As part of the CCSK exam preparation, I read the following documents:

Official CCSP CBK training – I took the official live online training. Most of the study was based on the official book – “Official (ISC)2 Guide to the CCSP CBK”.

As per the instructor’s recommendations, I summarized key aspects of the material and reviewed the summary a couple of times (instead of reading the 600 pages of the CBK more than once).

The online training was not cheap, but an exam voucher (for one year) was included.

Extra reading – I read the “CCSP (ISC)2 Certified Cloud Security Professional Official Study Guide”.

Purchasing this book gave me access to Wiley’s test bank of more than 700 practice exam questions, which let me better test my knowledge and prepare for a long, timed exam.

Mobile applications – I installed the following free applications with practice exam questions:

Free CBT – I watched Cybrary’s free CCSP training, which covers the exam material.

Work experience – I have no doubt that work experience gave me a lot of the knowledge needed to pass some of the tough scenario questions.

I have not measured the time it took me to review the written material and prepare for the exam, but I would guess a couple of months of preparation.

I am proud to hold the CCSP (Certified Cloud Security Professional) certification.

A Guide to NDAs for Startups and Entrepreneurs

Nobody wants to sign your NDA. VCs, software engineers, freelancers — will all likely throw some shade your way when you bring up “non-disclosure agreement.” But if you’re an entrepreneur, you still need to protect your confidential information in certain circumstances.

In this post, we will cover why you need an NDA, who you should expect to use one with, what should be in it, how to draft the document, the proper timing for signature requests, and how to enforce a non-disclosure agreement.

Here we go.

What is an NDA?


An NDA is a legal document that is intended to set up a confidential relationship between two or more parties, made up of a party disclosing information and a party receiving information. The non-disclosure agreement stipulates that information shared between parties should be used only for the purpose of that specific partnership, so as to protect the market position and the competitive advantage of the disclosing party.

Basically, the two parties agree to share information in order to help each other — while promising not to use that information outside of the relationship in such a way that could damage the other party’s interests.

NDAs can be written as a section of an employment contract or separately drawn up. A non-disclosure agreement can also be referred to as: Confidentiality Agreement, a Confidential Disclosure Agreement, a Proprietary Information Agreement, a Secrecy Agreement, a Proprietary Information and Inventions Agreement, or for that matter, any other arrangement of words indicating confidentiality that a disclosing party might prefer.

Why Do You Need an NDA?

If you are building a business, presumably there is certain information you wouldn’t want getting into the hands of potential competitors. Here are some of the main assets businesses seek to protect through an NDA.

Intellectual Property

Because of the incremental but speedy nature of software development, Intellectual Property Rights (IPR) patents are quickly becoming irrelevant in the industry. Most developers working on new projects are free to use code from open source libraries like ReactJS or AngularJS, which means the wheel is hardly ever reinvented for new software.

In most cases, it would be counterproductive to file a patent for code that would be outdated by the time said patent was delivered. Instead, companies can protect their software innovation by including a work-for-hire clause that requires that the rights to the code and products developed by employees and freelancers are automatically transferred to the company.

This can be included as a clause within the NDA, or for those who are really concerned their pot of gold will be pilfered, a separate Intellectual Property Rights Agreement like this one can be used. However, this may feel excessive to some and can be taken as an insult to one’s professional integrity. In fact, legal experts suggest that common law will usually be enough to uphold an employer’s rights to all of the creative works developed by employees in their service. However, this is not true of independent contractors, who will retain rights to their work unless otherwise agreed upon. Therefore, entrepreneurs should take care to include a work-for-hire or transfer of rights clause in an NDA or contract when working with freelancers.

Proprietary Information and Trade Secrets

There is a slight distinction between proprietary information and trade secrets, and a business may reasonably want to protect both. Proprietary information is any unique information that a business uses to operate, including:

  • Suppliers
  • Manufacturing agreements
  • Marketing strategy
  • Development processes
  • Pricing
  • Customer lists/client info
  • Research & data
  • Formulas and algorithms
  • Unique code
  • Test results
  • Product development plans

The list goes on. Not all proprietary information is, or needs to be, confidential. The business must decide what proprietary information they want to mark as confidential, thereby making it a trade secret. The Uniform Trade Secrets Act defines a trade secret as:

(i) information that derives independent economic value, actual or potential, from not being generally known to, and not being readily ascertainable by proper means by, other persons who can obtain economic value from its disclosure or use, and (ii) is the subject of efforts that are reasonable under the circumstances to maintain its secrecy.

Basically, this is your “secret sauce.” Information that helps you make a profit and is not readily available to others — the stuff that you wouldn’t want your competitors to have.

The most important reason for using an NDA is to guard against future legal expenses. You can hopefully avoid future litigation by protecting your trade secrets and insisting on ownership of the intellectual property developed at your business.

The NDA should be designed to provide a measure of protection against having to sue someone who has improperly used your information and, likewise, avert the risk of being sued for rights or royalties by a former employee or freelancer who wants to claim rights to products they helped develop.

What’s in an NDA?

This section will go over the eight clauses of an NDA, two of which are optional and should be carefully considered if written in.

  1. Definition and Scope
  2. Non-use
  3. Non-compete
  4. Representative Provision
  5. Duration
  6. Return Clause
  7. Arbitration Clause
  8. Attorney Fee Provision

Sample NDA

1. Definition and Scope

The definition and scope of what will be considered confidential are the most important parts of the NDA. It’s best to be specific when designating what proprietary information will be a trade secret. Without further clarification, terms like “proprietary information” or “business practices” are vague, and will likely not be enforceable if put to a legal test. Plus, under such broad terms, all information exchanged could be considered confidential. In a one-way NDA, the receiving party would be wise to walk away.

Whether it be your codebase, algorithms, client lists, product roadmap, or any other valuable proprietary information, it should be specified as confidential in the NDA. Future communications or documentation including these confidential items should also be marked as such with a “confidential footer.”

2. Non-use

As we’ve mentioned in an earlier section of this article, the whole purpose of an NDA is to formally agree that the information exchanged is only to be used for the parties entering into partnership. Therefore, your NDA will need to have a non-use clause. In addition to agreeing not to disclose your confidential information, the non-use clause also prohibits the receiving party from making use of the information in such a way that would be damaging to the vital business interests of the disclosing party.

The non-use clause is intended to prevent:

  • Formation of new businesses in direct competition to the disclosing party
  • Receiving parties from using confidential proprietary information as a bargaining chip for personal gain or new job opportunities
  • Existing competitors from soliciting current employees or freelancers for the knowledge they have of your business secrets

The non-use clause should not prevent:

  • The receiving party from using new skills learned in the future
  • The receiving party from working on projects with similar but technically different applications in the future

🌟 To make the non-use clause more acceptable to the receiving party, consider making the restriction last only as long as the duration of the project or employment contract.

3. Non-compete

The jury is out on whether or not it’s appropriate to use a non-compete clause in your NDA. Any developer worth their salt is probably not going to sign anything that restricts their future employment options.

Freelancers will be especially sensitive to any non-compete language. If you must have a non-compete clause, it should only be designed to prevent the employee or freelance developer from taking your technology and business model to a direct competitor, or from soliciting your employees to start a competing business venture. It typically should not prevent them from working in the same industry or location for any period of time.

In truth, both of these objectives can be achieved with the non-use clause. Another reason to tread lightly when it comes to non-compete is that many state courts deem non-compete restrictions a barrier to free trade and seldom enforce this part of an NDA anyway.

4. Representative Provision

To provide your partners with more flexibility and freedom to complete the project as they see fit, you should consider including a representative provision. The representative provision permits confidential information to be shared with associates of the receiving party for the purpose of completing the project. Once information is shared with a representative, he or she will be bound by terms of the NDA as well.

This clause should define who may be considered a “representative,” and can also require that the recipient party inform the disclosing party of any additional associates who will be collaborating on the project.

5. Duration

When drawing up an NDA, you will want to define a reasonable length of time for the agreement to remain in force. Too short, and you expose yourself to the risk of competitors learning your trade secrets before you’re able to establish a firm competitive edge. Too long, and skilled developers will not want to work with you. Most experts suggest anywhere from two to five years as a fair term of obligation, after which the agreement will automatically terminate.

6. Return Clause

Throughout the course of a project, you will have likely transferred a lot of files, instructions, data, communications, and other materials containing your business’s confidential information. Upon termination or completion of the project, you will want the return or destruction of the most sensitive of those documents. This clause stipulates that upon the written request of the disclosing party, the recipient of confidential information returns or destroys the material in their possession.

Just like confidential information should be specifically defined, what must be returned or destroyed should also be specified. Unless you are a government defense contractor, it won’t be necessary to ask someone to delete every single email or file they have received from you — just protect the key ingredients of your secret sauce!

7. Arbitration Clause

Should things go awry, having a mutual understanding of how to handle a dispute can save you money and hassle. Lawyering up and going to court to file a lawsuit is very, very expensive and should be considered a last resort. The arbitration clause is the place to outline alternatives to official litigation.

Your preferred path should be good-faith mediation between the parties – essentially a discussion facilitated by a moderator to air grievances and explore solutions. If informal mediation fails to resolve differences, you can opt for a binding (or non-binding) arbitration committee. This is like a mini-trial outside the official judicial system; arbitration committees can deliver decisions, which are typically a good indicator of how an actual court proceeding would go. Your last and most resource-draining option is to bring your dispute to trial.

Lastly, your arbitration clause should identify the territory whose laws will be used to govern the agreement, including any disputes.

8. Attorney Fee Provision

An attorney fee provision requires the “losing” party in a lawsuit to pay the legal fees of the prevailing party. This helps ease the burden of litigation, but for the same reason it works against mediation and arbitration: when passions are inflamed and both sides believe they are in the right, removing the threat of financial consequences makes the parties less likely to cooperate. With that in mind, you may choose whether or not to include an attorney fee provision in your NDA.

 

What’s Not in an NDA?


Public knowledge. As mentioned earlier in this article, most software applications today are developed using snippets of open source code. If the codebase used to develop your project is publicly available, it cannot be included as part of your confidential information. That said, new changes made to the source code may become part of your business advantage and can be protected under the NDA.

Prior knowledge and independently developed knowledge. Rights to knowledge or innovations credited to the freelancer prior to the partnership are retained by the freelancer. The same goes for knowledge developed independently of the project, even if it was developed during the partnership. Freelancers would be wise to make an itemized list of any valuable knowledge or previous inventions to be covered under this clause.

Third Party Information. Freelance software developers often work with various clients, and they may have more than one non-disclosure agreement in force concurrently. The NDA should specifically note that information obtained by a third party is not confidential. This protects the freelancer from being unfairly sued for breach of contract in the case that information disclosed from various parties overlaps.

 

Now you know what is and isn’t in an NDA, who should you get to sign your NDA and when? Should you ask for signatures from freelancers or VCs? When should you lawyer up and how does one enforce an NDA? Make sure to read the Guide to NDAs for Startups and Entrepreneurs for answers.

Cloud Computing Journey – Part 2

Cloud service provider questionnaire

In my previous post I gave you a short introduction to cloud computing.

When engaging with a cloud service provider, it is important to evaluate the provider’s maturity level by asking as many questions as possible, so that you are comfortable enough to sign a contract.

Below is a sample questionnaire I recommend presenting to the cloud service provider.

Privacy related questions:

  • Does the cloud service provider have an official privacy policy?
  • Where are the cloud service provider data centers located around the world?
  • Are the cloud service provider data centers compliant with the EU Directive 95/46/EC?
  • Are the cloud service provider data centers compliant with the General Data Protection Regulation (GDPR)?

 

Availability related questions:

  • What is the SLA of the cloud service provider? (Please elaborate)
  • Does the cloud service provider publish information about system issues or outages?
  • What compensation does the cloud service provider offer in case of potential financial loss due to lack of availability?
  • Does the cloud service provider sync data between more than one data center in the same region?
  • How many data centers does the cloud service provider have in the same region?
  • Does the cloud service provider have business continuity processes and procedures? (Please elaborate)
    • What is the cloud service provider’s RTO?
    • What is the cloud service provider’s RPO?
  • What is the cloud service provider disaster recovery strategy?
  • Does the cloud service provider have change management processes? (Please elaborate)
  • Does the cloud service provider have backup processes? (Please elaborate)

 

Interoperability related questions:

  • Does the cloud service provider support security event monitoring using an API? (Please elaborate)
  • Does the cloud service provider support infrastructure related event monitoring using an API? (Please elaborate)

 

Security related questions:

  • What is the cloud service provider’s audit trail process for my organizational data stored or processed? (Please elaborate)
  • What logical controls does the cloud service provider use for my organizational data stored or processed? (Please elaborate)
  • What physical controls does the cloud service provider use for my organizational data stored or processed? (Please elaborate)
  • Does the cloud service provider encrypt data in transit? (Please elaborate)
  • Does the cloud service provider encrypt data at rest? (Please elaborate)
    • What encryption algorithm is being used?
    • What encryption key size is being used?
    • Where are the encryption keys stored?
    • At what interval does the cloud service provider rotate the encryption keys?
    • Does the cloud service provider support BYOK (bring your own key)?
    • Does the cloud service provider support HYOK (hold your own key)?
    • At what level is data at rest encrypted? (Storage, database, application, etc.)
  • What security controls does the cloud service provider use to protect the cloud service itself?
  • Is there an ongoing firewall rule review process performed by the cloud service provider? (Please elaborate)
  • Are all of the cloud service provider’s platforms (operating system, database, middleware, etc.) hardened according to best practices? (Please elaborate)
  • Does the cloud service provider perform an ongoing patch management process for all hardware and software? (Please elaborate)
  • What security controls does the cloud service provider use to protect against data leakage in a multi-tenant environment?
  • What is the cloud service provider’s access management process? (Please elaborate)
  • Does the cloud service provider enforce two-factor authentication for access to all management interfaces?
  • Is authentication to the cloud service based on standard protocols such as SAML, OAuth, or OpenID?
  • How many employees at the cloud service provider will have access to my organizational data? (Infrastructure and database level)
  • Do the cloud service provider’s third-party suppliers have access to my organizational data?
  • Does the cloud service provider enforce separation between production and development/test environments? (Please elaborate)
  • What is the cloud service provider’s password policy (operating system, database, network components, etc.) for systems that store or process my organizational data?
  • Is it possible to schedule a security assessment and penetration test on the systems that store my organizational data?
  • Does the cloud service provider have incident response processes and procedures? (Please elaborate)
  • What are the escalation processes in case of a security incident related to my organizational data? (Please elaborate)
  • What are the cloud service provider’s processes and controls against distributed denial-of-service attacks? (Please elaborate)
  • Does the cloud service provider have vulnerability management processes? (Please elaborate)
  • Does the cloud service provider have a secure development lifecycle (SDLC) process? (Please elaborate)

 

Compliance related questions:

  • Is the cloud service provider compliant with certifications or standards? (Please elaborate)
  • What is the level of compliance with the Cloud Security Alliance Matrix (https://cloudsecurityalliance.org/research/ccm)?
  • Is it possible to receive a copy of an internal audit report performed on the cloud service in the last 12 months?
  • Is it possible to receive a copy of an external audit report performed on the cloud service in the last 12 months?
  • Is it possible to perform an on-site audit of the cloud service provider’s data center and activity?

 

Contract termination related questions:

  • What are the cloud service provider’s contract termination options?
  • What options does the cloud service provider offer for exporting my organizational data stored in the cloud?
  • Is there a process for data deletion in case of contract termination?
  • What standard does the cloud service provider use for data deletion?

 

Stay tuned for my next article.

 

Here are some recommended articles:

Separation Anxiety: A Tutorial for Isolating Your System with Linux Namespaces

With the advent of tools like Docker, Linux Containers, and others, it has become super easy to isolate Linux processes into their own little system environments. This makes it possible to run a whole range of applications on a single real Linux machine and ensure no two of them can interfere with each other, without having to resort to using virtual machines. These tools have been a huge boon to PaaS providers. But what exactly happens under the hood?

These tools rely on a number of features and components of the Linux kernel. Some of these features were introduced fairly recently, while others still require you to patch the kernel itself. But one of the key components, using Linux namespaces, has been a feature of Linux since version 2.6.24 was released in 2008.

Anyone familiar with chroot already has a basic idea of what Linux namespaces can do and how to use them. Just as chroot allows processes to see any arbitrary directory as the root of the system (independent of the rest of the processes), Linux namespaces allow other aspects of the operating system to be independently modified as well. This includes the process tree, networking interfaces, mount points, inter-process communication resources and more.

Why Use Namespaces for Process Isolation?

In a single-user computer, a single system environment may be fine. But on a server, where you want to run multiple services, it is essential to security and stability that the services are as isolated from each other as possible. Imagine a server running multiple services, one of which gets compromised by an intruder. In such a case, the intruder may be able to exploit that service and work their way to the other services, and may even be able to compromise the entire server. Namespace isolation can provide a secure environment to eliminate this risk.

For example, using namespacing, it is possible to safely execute arbitrary or unknown programs on your server. Recently, there has been a growing number of programming contest and “hackathon” platforms, such as HackerRank, TopCoder, Codeforces, and many more. A lot of them utilize automated pipelines to run and validate programs that are submitted by the contestants. It is often impossible to know in advance the true nature of contestants’ programs, and some may even contain malicious elements. By running these programs namespaced in complete isolation from the rest of the system, the software can be tested and validated without putting the rest of the machine at risk. Similarly, online continuous integration services, such as Drone.io, automatically fetch your code repository and execute the test scripts on their own servers. Again, namespace isolation is what makes it possible to provide these services safely.

Namespacing tools like Docker also allow better control over processes’ use of system resources, making such tools extremely popular for use by PaaS providers. Services like Heroku and Google App Engine use such tools to isolate and run multiple web server applications on the same real hardware. These tools allow them to run each application (which may have been deployed by any of a number of different users) without worrying about one of them using too many system resources, or interfering and/or conflicting with other deployed services on the same machine. With such process isolation, it is even possible to have entirely different stacks of software dependencies (and versions) for each isolated environment!

If you’ve used tools like Docker, you already know that these tools are capable of isolating processes in small “containers”. Running processes in Docker containers is like running them in virtual machines, only these containers are significantly lighter than virtual machines. A virtual machine typically emulates a hardware layer on top of your operating system, and then runs another operating system on top of that. This allows you to run processes inside a virtual machine, in complete isolation from your real operating system. But virtual machines are heavy! Docker containers, on the other hand, use some key features of your real operating system, including namespaces, and ensure a similar level of isolation, but without emulating the hardware and running yet another operating system on the same machine. This makes them very lightweight.

Process Namespace

Historically, the Linux kernel has maintained a single process tree. The tree contains a reference to every process currently running in a parent-child hierarchy. A process, given it has sufficient privileges and satisfies certain conditions, can inspect another process by attaching a tracer to it or may even be able to kill it.

With the introduction of Linux namespaces, it became possible to have multiple “nested” process trees. Each process tree can have an entirely isolated set of processes. This can ensure that processes belonging to one process tree cannot inspect or kill – in fact cannot even know of the existence of – processes in other sibling or parent process trees.

Every time a computer with Linux boots up, it starts with just one process, with process identifier (PID) 1. This process is the root of the process tree, and it initiates the rest of the system by performing the appropriate maintenance work and starting the correct daemons/services. All the other processes start below this process in the tree. The PID namespace allows one to spin off a new tree, with its own PID 1 process. The process that does this remains in the parent namespace, in the original tree, but makes the child the root of its own process tree.

With PID namespace isolation, processes in the child namespace have no way of knowing of the parent process’s existence. However, processes in the parent namespace have a complete view of processes in the child namespace, as if they were any other process in the parent namespace.

 

It is possible to create a nested set of child namespaces: one process starts a child process in a new PID namespace, and that child process spawns yet another process in a new PID namespace, and so on.

With the introduction of PID namespaces, a single process can now have multiple PIDs associated with it, one for each namespace it falls under. In the Linux source code, we can see that a struct named pid, which used to keep track of just a single PID, now tracks multiple PIDs through the use of a struct named upid:

struct upid {
  int nr;                     // the PID value
  struct pid_namespace *ns;   // namespace where this PID is relevant
  // ...
};

struct pid {
  // ...
  int level;                  // number of upids
  struct upid numbers[0];     // array of upids
};

To create a new PID namespace, one must call the clone() system call with a special flag CLONE_NEWPID. (C provides a wrapper to expose this system call, and so do many other popular languages.) Whereas the other namespaces discussed below can also be created using the unshare() system call, a PID namespace can only be created at the time a new process is spawned using clone(). Once clone() is called with this flag, the new process immediately starts in a new PID namespace, under a new process tree. This can be demonstrated with a simple C program:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

static char child_stack[1048576];

static int child_fn(void *arg) {
  printf("PID: %ld\n", (long)getpid());
  return 0;
}

int main() {
  pid_t child_pid = clone(child_fn, child_stack+1048576, CLONE_NEWPID | SIGCHLD, NULL);
  printf("clone() = %ld\n", (long)child_pid);

  waitpid(child_pid, NULL, 0);
  return 0;
}

Compile and run this program with root privileges and you will notice an output that resembles this:

clone() = 5304
PID: 1

The PID, as printed from within the child_fn, will be 1.

Even though this namespace tutorial code above is not much longer than “Hello, world” in some languages, a lot has happened behind the scenes. The clone() function, as you would expect, has created a new process by cloning the current one and started execution at the beginning of the child_fn() function. However, while doing so, it detached the new process from the original process tree and created a separate process tree for the new process.

Try replacing the static int child_fn() function with the following, to print the parent PID from the isolated process’s perspective:

static int child_fn(void *arg) {
  printf("Parent PID: %ld\n", (long)getppid());
  return 0;
}

Running the program this time yields the following output:

clone() = 11449
Parent PID: 0

Notice how the parent PID from the isolated process’s perspective is 0, indicating no parent. Try running the same program again, but this time, remove the CLONE_NEWPID flag from within the clone() function call:

pid_t child_pid = clone(child_fn, child_stack+1048576, SIGCHLD, NULL);

This time, you will notice that the parent PID is no longer 0:

clone() = 11561
Parent PID: 11560

However, this is just the first step in our tutorial. These processes still have unrestricted access to other common or shared resources. For example, the networking interface: if the child process created above were to listen on port 80, it would prevent every other process on the system from being able to listen on it.

Linux Network Namespace

This is where a network namespace becomes useful. A network namespace allows each of these processes to see an entirely different set of networking interfaces. Even the loopback interface is different for each network namespace.

Isolating a process into its own network namespace involves introducing another flag to the clone() function call: CLONE_NEWNET.

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>


static char child_stack[1048576];

static int child_fn(void *arg) {
  printf("New `net` Namespace:\n");
  system("ip link");
  printf("\n\n");
  return 0;
}

int main() {
  printf("Original `net` Namespace:\n");
  system("ip link");
  printf("\n\n");

  pid_t child_pid = clone(child_fn, child_stack+1048576, CLONE_NEWPID | CLONE_NEWNET | SIGCHLD, NULL);

  waitpid(child_pid, NULL, 0);
  return 0;
}

Output:

Original `net` Namespace:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
    link/ether 00:24:8c:a1:ac:e7 brd ff:ff:ff:ff:ff:ff


New `net` Namespace:
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

What’s going on here? The physical ethernet device enp4s0 belongs to the global network namespace, as indicated by the “ip” tool run from this namespace. However, the physical interface is not available in the new network namespace. Moreover, the loopback device is active in the original network namespace, but is “down” in the child network namespace.

In order to provide a usable network interface in the child namespace, it is necessary to set up additional “virtual” network interfaces which span multiple namespaces. Once that is done, it is then possible to create Ethernet bridges, and even route packets between the namespaces. Finally, to make the whole thing work, a “routing process” must be running in the global network namespace to receive traffic from the physical interface, and route it through the appropriate virtual interfaces to the correct child network namespaces. Maybe you can see why tools like Docker, which do all this heavy lifting for you, are so popular!

 

To do this by hand, you can create a pair of connected virtual Ethernet devices spanning a parent and a child namespace by running a single command from the parent namespace:

ip link add name veth0 type veth peer name veth1 netns <pid>

Here, <pid> should be replaced by the process ID of the process in the child namespace as observed by the parent. Running this command establishes a pipe-like connection between these two namespaces. The parent namespace retains the veth0 device, and passes the veth1 device to the child namespace. Anything that enters one of the ends, comes out through the other end, just as you would expect from a real Ethernet connection between two real nodes. Accordingly, both sides of this virtual Ethernet connection must be assigned IP addresses.
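To finish the setup, each end needs an address and must be brought up. The commands below are an illustrative sketch (they require root, the `10.1.1.0/24` subnet is an arbitrary assumption, and `nsenter` from util-linux is just one way to run commands inside the child’s network namespace):

```shell
# Parent namespace: configure and enable its end of the pair
ip addr add 10.1.1.1/24 dev veth0
ip link set veth0 up

# Child namespace: configure veth1 and the loopback device from outside,
# by entering the child's network namespace via its PID
nsenter --net -t <pid> ip addr add 10.1.1.2/24 dev veth1
nsenter --net -t <pid> ip link set veth1 up
nsenter --net -t <pid> ip link set lo up
```

After this, the two namespaces can reach each other over the 10.1.1.0/24 link, just like two hosts on a real Ethernet segment.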

Mount Namespace

Linux also maintains a data structure for all the mountpoints of the system. It includes information like what disk partitions are mounted, where they are mounted, whether they are read-only, and so on. With Linux namespaces, one can have this data structure cloned, so that processes under different namespaces can change the mountpoints without affecting each other.

Creating a separate mount namespace has an effect similar to doing a chroot(). chroot() is good, but it does not provide complete isolation, and its effects are restricted to the root mountpoint only. Creating a separate mount namespace allows each of these isolated processes to have a completely different view of the entire system’s mountpoint structure from the original one. This allows you to have a different root for each isolated process, as well as other mountpoints that are specific to those processes. Used with care, this lets you avoid exposing any information about the underlying system.

 

The clone() flag required to achieve this is CLONE_NEWNS:

clone(child_fn, child_stack+1048576, CLONE_NEWPID | CLONE_NEWNET | CLONE_NEWNS | SIGCHLD, NULL)

Initially, the child process sees the exact same mountpoints as its parent process would. However, being under a new mount namespace, the child process can mount or unmount whatever endpoints it wants to, and the change will affect neither its parent’s namespace, nor any other mount namespace in the entire system. For example, if the parent process has a particular disk partition mounted at root, the isolated process will see the exact same disk partition mounted at the root in the beginning. But the benefit of isolating the mount namespace is apparent when the isolated process tries to change the root partition to something else, as the change will only affect the isolated mount namespace.

Interestingly, this actually makes it a bad idea to spawn the target child process directly with the CLONE_NEWNS flag. A better approach is to start a special “init” process with the CLONE_NEWNS flag, have that “init” process change the “/”, “/proc”, “/dev” or other mountpoints as desired, and then start the target process. This is discussed in a little more detail near the end of this namespace tutorial.

Other Namespaces

There are other namespaces that these processes can be isolated into, namely user, IPC, and UTS. The user namespace allows a process to have root privileges within the namespace, without granting it those privileges for processes outside of the namespace. Isolating a process by the IPC namespace gives it its own interprocess communication resources, for example, System V IPC and POSIX messages. The UTS namespace isolates two specific identifiers of the system: nodename and domainname.

A quick example showing how the UTS namespace is isolated:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/utsname.h>
#include <sys/wait.h>
#include <unistd.h>


static char child_stack[1048576];

static void print_nodename() {
  struct utsname utsname;
  uname(&utsname);
  printf("%s\n", utsname.nodename);
}

static int child_fn(void *arg) {
  printf("New UTS namespace nodename: ");
  print_nodename();

  printf("Changing nodename inside new UTS namespace\n");
  sethostname("GLaDOS", 6);

  printf("New UTS namespace nodename: ");
  print_nodename();
  return 0;
}

int main() {
  printf("Original UTS namespace nodename: ");
  print_nodename();

  pid_t child_pid = clone(child_fn, child_stack+1048576, CLONE_NEWUTS | SIGCHLD, NULL);

  sleep(1);

  printf("Original UTS namespace nodename: ");
  print_nodename();

  waitpid(child_pid, NULL, 0);

  return 0;
}

This program yields the following output:

Original UTS namespace nodename: XT
New UTS namespace nodename: XT
Changing nodename inside new UTS namespace
New UTS namespace nodename: GLaDOS
Original UTS namespace nodename: XT

Here, child_fn() prints the nodename, changes it to something else, and prints it again. Naturally, the change happens only inside the new UTS namespace.

More information on what each of the namespaces provides and isolates can be found in the Linux namespaces(7) man page.

Cross-Namespace Communication

Often it is necessary to establish some sort of communication between the parent and the child namespace. This might be for doing configuration work within an isolated environment, or it can simply be to retain the ability to peek into the condition of that environment from outside. One way of doing that is to keep an SSH daemon running within that environment. You can have a separate SSH daemon inside each network namespace. However, having multiple SSH daemons running uses a lot of valuable resources like memory. This is where having a special “init” process proves to be a good idea again.

The “init” process can establish a communication channel between the parent namespace and the child namespace. This channel can be based on UNIX sockets or can even use TCP. To create a UNIX socket that spans two different mount namespaces, you need to first create the child process, then create the UNIX socket, and then isolate the child into a separate mount namespace. But how can we create the process first, and isolate it later? Linux provides unshare(). This special system call allows a process to isolate itself from the original namespace, instead of having the parent isolate the child in the first place. For example, the following code has the exact same effect as the code previously mentioned in the network namespace section:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>


static char child_stack[1048576];

static int child_fn(void *arg) {
  // calling unshare() from inside the init process lets you create a new namespace after a new process has been spawned
  unshare(CLONE_NEWNET);

  printf("New `net` Namespace:\n");
  system("ip link");
  printf("\n\n");
  return 0;
}

int main() {
  printf("Original `net` Namespace:\n");
  system("ip link");
  printf("\n\n");

  pid_t child_pid = clone(child_fn, child_stack+1048576, CLONE_NEWPID | SIGCHLD, NULL);

  waitpid(child_pid, NULL, 0);
  return 0;
}

And since the “init” process is something you have devised, you can make it do all the necessary work first, and then isolate itself from the rest of the system before executing the target child.

Conclusion

This tutorial is just an overview of how to use namespaces in Linux. It should give you a basic idea of how a Linux developer might start to implement system isolation, an integral part of the architecture of tools like Docker or Linux Containers. In most cases, it would be best to simply use one of these existing tools, which are already well-known and tested. But in some cases, it might make sense to have your very own, customized process isolation mechanism, and in that case, this namespace tutorial will help you out tremendously.

There is a lot more going on under the hood than I’ve covered in this article, and there are more ways you might want to limit your target processes for added safety and isolation. But, hopefully, this can serve as a useful starting point for someone who is interested in knowing more about how namespace isolation with Linux really works.

Originally from Toptal

Cloud Computing Journey – Part 1

So, you have decided to migrate a system to the cloud. It may be a business or an IT initiative, but what does switching from on-premises to the cloud really mean?

For ages, we used to manage our IT infrastructure ourselves, in our own data centers (or network communication cabinets, for small companies…), using hardware we purchased, while maintaining and troubleshooting every software/hardware/network problem.

In the cloud, things change. We are one of many customers sharing compute resources in a multi-tenant environment. We have no control over the hardware or the chosen platform technology (from the server hardware vendor to the storage vendor), we barely control the virtualization layer, and don’t even get me started on troubleshooting the network layer.

There are 3 cloud service models:

  • IaaS (Infrastructure as a Service) – In this service model, the customer controls (almost) everything, from the virtual server’s operating system, through the application layer, up to the data itself.
  • PaaS (Platform as a Service) – In this service model, the customer controls the application layer, up to the data itself.
  • SaaS (Software as a Service) – In this service model, the customer has access to a closed application, but the customer remains the data owner and can control permissions (and auditing) of the data.

 

Now that we understand these basic rules, let’s analyze what migrating to the cloud really means.

The word that differentiates a mature cloud service provider from a rookie is transparency.

A mature cloud service provider won’t hesitate to answer tough questions such as “Can I have a copy of your last external audit report?”, “Do you have a business continuity plan?”, “Do you have an SDLC (Software development lifecycle) process?”, etc.

When engaging with a cloud service provider, it is important to learn as many details about the provider as you can, until you are comfortable enough to sign a contract.

A solid contract will give you assurance about the cloud service provider’s ability to fulfill its obligations and will increase the chances of project success.

In the next couple of articles, I will try to pinpoint important tips for a successful cloud project.

Stay tuned for my next article.

Here are some recommended articles:

JSON Web Token Tutorial: An Example in Laravel and AngularJS