Sunday, December 3, 2023

Exam Prep 5 – Designing Storage Systems

The next chapter is on storage solutions. There are a lot of them, to cover the different needs applications can have. The main categories are:

  • object storage
  • persistent local and attached storage
  • relational and NoSQL databases

Flowchart for decisions: https://cloud.google.com/architecture/storage-advisor#decision_tree

Google Cloud Storage is the object storage solution on GCP. It is not a file system: there is no real directory structure in it, and the files (objects) are treated atomically, which means that a part of a file cannot be changed in place, only the whole object can be replaced. The files are arranged into buckets. Files in a bucket share access controls. Bucket names must be globally unique, so it is advisable to include a unique identifier in the name. It has four tiers:

  • Regional – data is frequently accessed, and present in one region
  • Multiregional – data is frequently accessed from multiple regions
  • Nearline – for data accessed less than once per month
  • Coldline – data is accessed once per year or less
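
To make the naming advice and the tier list a bit more concrete, here is a minimal sketch in Python. Both helper names (`unique_bucket_name`, `pick_tier`) are my own, the random suffix is just one way to get a unique identifier, and the thresholds are simply the ones from the list above expressed as reads per year. It does not call any Google API.

```python
import uuid


def unique_bucket_name(prefix: str) -> str:
    """Append a random suffix so the name is unlikely to collide globally."""
    name = f"{prefix}-{uuid.uuid4().hex[:12]}".lower()
    # Bucket names are limited to 3-63 characters.
    if not 3 <= len(name) <= 63:
        raise ValueError(f"bucket name has invalid length: {name!r}")
    return name


def pick_tier(reads_per_year: float, multi_region: bool = False) -> str:
    """Map an expected access frequency to one of the four tiers above."""
    if reads_per_year <= 1:
        return "Coldline"   # accessed once per year or less
    if reads_per_year < 12:
        return "Nearline"   # accessed less than once per month
    # Frequently accessed data: the choice depends on where it is read from.
    return "Multiregional" if multi_region else "Regional"
```

For example, `pick_tier(6)` gives Nearline, while `pick_tier(365, multi_region=True)` gives Multiregional.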

Cloud Filestore is a network-attached storage service. It is used mostly with Compute Engine (GCE) and Kubernetes Engine (GKE), and a file share can be attached to multiple instances at the same time.

There are multiple databases available as well. Relational DBs provide ACID transactions (atomicity, consistency, isolation, durability).
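
A small sketch of what the "A" in ACID means in practice. I am using Python's built-in `sqlite3` module as a stand-in for a managed relational database, and the `accounts` table and `transfer` function are made up for the illustration. The point is that a failed transfer leaves no half-applied change behind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # This update violates the CHECK constraint if src lacks funds.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

In the failing case the credit to the destination account has already been executed when the debit is rejected, and the rollback undoes it.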

Cloud SQL is a managed relational database offering. It provides managed MySQL, PostgreSQL and Microsoft SQL Server instances.

Cloud Spanner is a globally scalable SQL DB on GCP. It is ideal for applications that need to be available in multiple regions of the world.

BigQuery is a data warehouse used for analytics. It supports SQL. With on-demand pricing, queries are billed by the amount of data they scan rather than by how long they run; storage is billed separately.
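
The pay-per-use idea boils down to simple arithmetic. A minimal sketch, assuming the bill is proportional to the bytes a query reads; the function name is mine and `price_per_tib` is a placeholder you would have to fill in from the current price list, not a real quote.

```python
TIB = 2 ** 40  # bytes in one tebibyte


def estimate_query_cost(bytes_scanned: int, price_per_tib: float) -> float:
    """Cost of one query when billing is proportional to the bytes it reads."""
    return bytes_scanned / TIB * price_per_tib
```

With a hypothetical price of 5.0 per TiB, a query that reads half a TiB would cost `estimate_query_cost(TIB // 2, 5.0)`, which is 2.5. The practical takeaway is that reading less data (fewer columns, narrower filters on partitions) is what keeps the cost down.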

NoSQL databases use flexible schemas.

Cloud Bigtable is a wide-column NoSQL database used for data analytics on large volumes of data. It is a good fit for IoT projects.

Cloud Datastore is a document-based NoSQL DB. Its successor is Firestore, which is advised for web applications that require a flexible schema.

Cloud Memorystore is a managed Redis service. It is used for caching.
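
What caching looks like from the application side can be sketched with the cache-aside pattern. This is not Memorystore or Redis client code: `TTLCache` is a tiny in-memory stand-in I wrote for the illustration, and `get_user` and `load_from_db` are hypothetical names.

```python
import time


class TTLCache:
    """Tiny in-memory stand-in for a Redis-style cache with expiring keys."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # expired: behave like a miss
            return None
        return value

    def set(self, key, value):
        self._data[key] = (value, time.monotonic() + self.ttl)


def get_user(user_id, cache, load_from_db):
    """Cache-aside read: try the cache first, fall back to the database."""
    cached = cache.get(user_id)
    if cached is not None:
        return cached
    value = load_from_db(user_id)  # slow path, only taken on a miss
    cache.set(user_id, value)
    return value
```

The first call for a given user goes to the database and fills the cache; repeated calls within the TTL are served from memory.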

GCP encrypts data at rest by default. The user has to take care of data retention and lifecycle management. Networking and latency have to be taken into consideration as well when designing an application using cloud storage.
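
As an illustration of lifecycle management, here is a sketch that builds a bucket lifecycle configuration as JSON, in the shape I understand Cloud Storage lifecycle rules to take (a list of rules, each with an action and a condition). The 30 and 365 day ages are example values I picked, not recommendations, and the snippet only produces the JSON text; it does not apply it to a bucket.

```python
import json

lifecycle = {
    "rule": [
        {
            # Move objects to a colder storage class after 30 days.
            "action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
            "condition": {"age": 30},
        },
        {
            # Delete objects once they are a year old.
            "action": {"type": "Delete"},
            "condition": {"age": 365},
        },
    ]
}

# Serialized form that could be saved to a file and reviewed.
lifecycle_json = json.dumps(lifecycle, indent=2)
```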

The review questions in this chapter went better than the others so far. I did go through them twice, to make sure that my usual issue of writing down a different letter than what I chose is not happening – I’ll keep doing that for future chapters.
