Ellery Yang

I write about products and the PM role.

What is cloud?

Posted on August 26, 2016 (updated April 25, 2020) by Ellery

I want to talk about cloud. Everyone seems to have heard of it, but not everyone understands what it means. In my opinion, the essence of cloud is resource aggregation, and the two pillars that make it possible are the ability to coordinate resources at large scale and the separation of resource ownership from resource usage.

Resource aggregation: the WHAT

Oversimplifying, cloud is resource aggregation. AWS, Azure and Google Cloud are each estimated to run hundreds of thousands of servers, perhaps millions or more, all branded and accessed as one entity.

This astronomical scale lets them solve big problems that no regular data center can. To support a SQL database service handling 100 queries per second, any traditional hosting will do; to support one handling millions of queries per second, you have to leave it to a big cloud.

Another example is big data. People usually hear the words “big data” and “cloud” together because cloud, with its aggregated computing resources, is one of the most promising ways to solve big data problems. Big data problems are big: they mean digging through a haystack of billions of data points for a few needles of insight. A problem of that scale simply cannot be solved without, say, 10,000 computers. This is where cloud comes in.
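To make the haystack idea concrete, here is a minimal sketch of splitting one big search across several workers and adding up their partial answers. It is only an illustration: the function names are made up for this post, and a local thread pool stands in for the thousands of machines a real cloud would use.

```python
from concurrent.futures import ThreadPoolExecutor


def count_needles(chunk, needle):
    """Work done by one 'machine': count matches in its own slice."""
    return sum(1 for item in chunk if item == needle)


def split(data, parts):
    """Cut data into at most `parts` roughly equal slices."""
    size = max(1, -(-len(data) // parts))  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def distributed_count(data, needle, machines=4):
    """Hand each slice to a worker, then combine the partial counts."""
    chunks = split(data, machines)
    with ThreadPoolExecutor(max_workers=machines) as pool:
        partials = pool.map(lambda c: count_needles(c, needle), chunks)
        return sum(partials)


# Hypothetical data: a small haystack with three needles hidden at the end.
haystack = ["hay"] * 9_997 + ["needle"] * 3
total = distributed_count(haystack, "needle", machines=8)
print(total)
```

The point is the shape of the work, not the speed: each worker only ever sees its own slice, so adding more machines lets the same approach cover a bigger haystack.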

Division of ownership and usage: the HOW, part 1

Separating resource ownership from usage is the key to making cloud possible. Why? Because there are too many users with different computing needs.

Imagine a law that forces everyone who needs computing to do it on a device he owns. A user could run into all kinds of trouble in that world. When his computer is off, its CPU sits wasted. When he has a huge project to run, his computer feels too slow, but it would be crazy to buy a faster one because this type of task comes up only once every 6 months. When his computer gets a virus, he cannot work. And when he travels, he has to bring his big desktop computer along because his work cannot be processed on a smartphone CPU.

But when users become subscribers to resources instead of owners, everything seems peachy. All a user needs is a small device to log in to his cloud account. There, he can ask the cloud to do the computing at any time and send the result back to his device. The task can be big or small; the cloud will deal with it and charge the user for CPU time. And the cloud never idles, because there is always another user's task to run.

Consider OneDrive (or Google Drive, iCloud, what have you). When you store a file in OneDrive rather than on your local drive (note that storage is a type of computing service!), you have access to that file as long as you have your Microsoft account. You could even sell all your computers and phones, go on a world trip for a year, come back, and still be able to access that file, because you never had to own the storage device!

Coordinating the resources: the HOW, part 2

The ability to coordinate resources empowers clouds to do more with aggregated resources.

The coordination is usually done by software. As a matter of fact, an often-seen definition of “cloud” is “the software that runs the data center”: for example, AWS, Azure, OpenStack or WAP. These software systems are masters of performance optimization and divide-and-conquer strategy. They know how to make the best use of each computing part in their pools, and they provide failover mechanisms: if a cluster fails, computing can be carried over to another one with little added latency.
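The failover idea can be sketched in a few lines. This is a toy illustration under my own assumptions, not how any real cloud implements it: the cluster names and the two stand-in functions are hypothetical, and a "cluster" here is just a Python function that either returns a result or raises an error.

```python
def run_with_failover(task, clusters):
    """Try each cluster in order; return (name, result) from the first that succeeds."""
    errors = []
    for name, run in clusters:
        try:
            return name, run(task)
        except Exception as exc:  # a failed cluster: remember why, move on
            errors.append((name, exc))
    raise RuntimeError(f"all clusters failed: {errors}")


# Hypothetical clusters for the demo.
def broken_cluster(task):
    raise ConnectionError("cluster-a is down")


def healthy_cluster(task):
    return sum(task)


clusters = [("cluster-a", broken_cluster), ("cluster-b", healthy_cluster)]
used, result = run_with_failover([1, 2, 3], clusters)
print(used, result)
```

The caller asks for the work to be done and gets an answer; which cluster actually did it is the coordinator's business, not the user's.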

Without resource coordination, a data center with 1000 computers is no different from a collection of 1000 random computers around town. Coordination is what maximizes performance and minimizes cost through economies of scale.

For example, imagine there are 1000 people who each want to keep a copy of the Photoshop CS6 installation package. Say that package is 1 GB in size. Without cloud (OneDrive, iCloud…), each of these 1000 people needs a computer with at least 1 GB of free hard drive space. But with cloud, all it takes is 1 GB of disk space on one of the servers in the data center, thanks to resource coordination.

Here is how the software would handle these 1000 requests. When person no. 1 uploads the file, the cloud stores it, identifies it uniquely (with an MD5 hash, for example), and marks person no. 1 as its owner. When person no. 2 uploads the same file, the cloud detects that it already exists and simply marks person no. 2 as an additional owner. This goes on for everyone else. Even if person no. 1 later deletes the file from his cloud account, it remains on the server's hard drive for the other owners. The cloud ends up storing only one 1 GB file, with a list of its owners.
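The upload-and-share-ownership steps above can be sketched as a tiny in-memory store. This is a simplified illustration, not how OneDrive or any real service works: the class and method names are made up, and I use SHA-256 from Python's hashlib in place of the MD5 mentioned above, since MD5 collisions are practical to produce.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: one copy per unique file, many owners."""

    def __init__(self):
        self.blobs = {}   # content hash -> file bytes
        self.owners = {}  # content hash -> set of owner ids

    def upload(self, owner, data):
        """Store the bytes only if they are new; always record the owner."""
        key = hashlib.sha256(data).hexdigest()
        if key not in self.blobs:
            self.blobs[key] = data
        self.owners.setdefault(key, set()).add(owner)
        return key

    def delete(self, owner, key):
        """Drop one owner; free the bytes only when nobody owns them."""
        owners = self.owners.get(key, set())
        owners.discard(owner)
        if not owners:
            self.blobs.pop(key, None)
            self.owners.pop(key, None)

    def stored_bytes(self):
        return sum(len(blob) for blob in self.blobs.values())


# Hypothetical stand-in for the 1 GB installer.
store = DedupStore()
package = b"pretend this is the Photoshop CS6 installer"
for person in range(1, 1001):
    key = store.upload(f"person-{person}", package)

store.delete("person-1", key)  # person no. 1 removes the file
print(store.stored_bytes() == len(package), len(store.owners[key]))
```

After 1000 uploads and one delete, the store still holds a single copy of the bytes and a list of the 999 remaining owners. Only when the last owner deletes the file is the space actually freed.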

The benefits of resource coordination include not only better performance and lower cost, but also robustness, environmental benefits, etc.

Summary

Cloud is a way to solve problems by aggregating resources, based on the idea of separating resource ownership and usage, and the technology to coordinate such resources. Said technology is often referred to as “cloud” itself.

© 2025 Ellery Yang