AWS Database Questions And Answers

What do you know about the Amazon Database?

AWS offers a family of managed database services covering relational databases, NoSQL, a fully managed petabyte-scale data warehouse, and in-memory caching as a service. Four of these services are covered here, and a user can choose one or several of them to meet their requirements:

  • RDS
  • DynamoDB
  • Redshift
  • ElastiCache

 

Explain Amazon Relational Database Service (RDS).

RDS stands for Relational Database Service, which lets a user easily set up, manage, and scale a relational database in the cloud. You can focus on your application and business instead of on time-consuming database administration work. The database is accessible from anywhere, scales easily, and is cost-effective.

The code and applications that you use today with existing databases such as MySQL, MariaDB, Oracle, and SQL Server work with Amazon RDS. It automatically backs up the database and keeps the database software patched with recent versions. Amazon RDS supports several database engines:

  • MySQL
  • Oracle
  • PostgreSQL
  • SQL Server
  • MariaDB
  • Amazon Aurora

 

What is a key-value store?

A key-value store is a database that stores, updates, and retrieves objects, each of which is identified by a unique key. Every object consists of a key and a value, and the value holds the actual content that is stored.
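
The idea can be illustrated with a minimal in-memory sketch. The class name `KeyValueStore` and the sample key `user:42` are made up for illustration and are not part of any AWS API; a real service adds durability, partitioning, and access control on top of this.

```python
from typing import Any, Dict, Optional


class KeyValueStore:
    """Toy in-memory key-value store: every object is addressed by its key."""

    def __init__(self) -> None:
        self._items: Dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        # Storing and updating are the same operation: the key decides the slot.
        self._items[key] = value

    def get(self, key: str) -> Optional[Any]:
        # Lookup by key; returns None when the key is absent.
        return self._items.get(key)

    def delete(self, key: str) -> None:
        # Deleting a missing key is a no-op.
        self._items.pop(key, None)


store = KeyValueStore()
store.put("user:42", {"name": "Ada", "plan": "free"})
store.put("user:42", {"name": "Ada", "plan": "pro"})  # update in place
print(store.get("user:42"))
```

Note that the store never inspects the value; it only needs the key to find it, which is what makes lookups fast.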

 

Which AWS database service is a serverless NoSQL database that delivers consistent single-digit millisecond latency at any scale?

Amazon DynamoDB

Amazon DynamoDB is a fully managed, serverless, key-value NoSQL database service. It has many essential features, such as built-in security, in-memory caching, continuous backups, data export tools, and automated multi-Region replication. You can use it to run high-performance applications at any scale. For instance, it supports internet-scale applications that require high concurrency and connections for many users, with millions of requests per second.

 

List some of the benefits of using Amazon DynamoDB.

Amazon DynamoDB is a NoSQL service that provides a number of benefits to users. Some of the benefits of AWS DynamoDB are –

  • Being a fully managed service, DynamoDB does not require experts for setup, installation, cluster management, etc.
  • It provides fast and predictable performance.
  • It is highly scalable, available, and durable.
  • It provides very high throughput at low latency.
  • It is highly cost-effective.
  • It allows the creation of tables with flexible, multi-valued attributes, i.e. its schema is flexible.

 

What is a DynamoDB Mapper Class?

The DynamoDBMapper class, part of the AWS SDK for Java, is the entry point to DynamoDB. It provides access to a DynamoDB endpoint and lets users work with the data stored in various tables: run queries and scans against the tables and perform CRUD operations on the data items.

 

What do you understand by DynamoDB Auto Scaling?

DynamoDB Auto Scaling is a feature that automatically scales the provisioned read and write capacity of a table or global secondary index up and down in response to traffic.
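
The scaling decision can be sketched as simple arithmetic, assuming a target-utilization policy in which capacity is adjusted so that consumed capacity stays near a chosen percentage of provisioned capacity. The function `desired_capacity` and all numbers below are hypothetical; the real service reacts to CloudWatch metrics over time rather than to a single reading.

```python
def desired_capacity(consumed_units: int, target_utilization_pct: int,
                     min_units: int, max_units: int) -> int:
    """Capacity that would put utilization at the target, within the bounds."""
    if not 0 < target_utilization_pct <= 100:
        raise ValueError("target_utilization_pct must be in (0, 100]")
    # Ceiling division with integers avoids floating-point rounding surprises.
    needed = -(-(consumed_units * 100) // target_utilization_pct)
    # Never go below the configured minimum or above the configured maximum.
    return max(min_units, min(max_units, needed))


print(desired_capacity(350, 70, min_units=5, max_units=1000))   # 500
print(desired_capacity(2000, 70, min_units=5, max_units=1000))  # 1000 (capped)
print(desired_capacity(1, 70, min_units=5, max_units=1000))     # 5 (floor)
```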

 

What is a data warehouse, and how can Amazon Redshift play a vital role in storage?

A data warehouse can be thought of as a repository where the data generated by the company’s systems and other sources is collected and stored. A data warehouse typically has a three-tier architecture:

  • In the bottom tier, we have the tools which cleanse and collect the data.
  • In the middle tier, we have tools that transform the data using an Online Analytical Processing (OLAP) server.
  • In the top tier, we have the front-end tools where data analysis and data mining are performed.

Setting up and managing a data warehouse costs a lot of money, because the data in an organization keeps growing and the organization has to keep upgrading its data storage servers. This is where Amazon Redshift comes in: companies store their data in a cloud-based warehouse provided by Amazon.

 

What is Amazon Redshift and why is it popular among other cloud data warehouses?

Amazon Redshift is a fast, scalable data warehouse that is easy to use and offers a cost-effective way to manage all of an organization’s data. It scales from gigabytes to petabytes of data. No knowledge of a new programming language is needed: once a cluster is launched, users can start working with Redshift through SQL and the tools they already know.

 

How to load data in Amazon Redshift?

Amazon DynamoDB, Amazon EMR, AWS Glue, and AWS Data Pipeline are some of the sources and services from which you can load data into a Redshift data warehouse. Clients can also connect to Redshift over ODBC or JDBC and issue SQL INSERT statements to load data.

 

What is Amazon ElastiCache?

Amazon ElastiCache is an in-memory key-value store that supports two engines: Redis and Memcached. It is fully managed, requires no administration, and is hardened by Amazon. With ElastiCache, you can either build a new high-performance application or improve an existing one. ElastiCache is used in fields such as gaming and healthcare.

 

What is the use of Amazon ElastiCache?

Caching information that is used again and again can improve the performance of web applications, because in-memory data can be accessed very quickly. With ElastiCache there is no need to manage a separate caching server. You can easily deploy and run an open-source-compatible in-memory data store with high throughput and low latency.
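
A common way to use such a cache is the cache-aside pattern: look in the cache first and fall back to the database only on a miss. The sketch below is an assumption-laden illustration, not ElastiCache client code: a plain dictionary stands in for the cache, and `slow_lookup` is a hypothetical stand-in for a database query.

```python
from typing import Callable, Dict


class CacheAside:
    """Serve repeated reads from memory; hit the backing store only on a miss."""

    def __init__(self, load_from_db: Callable[[str], str]) -> None:
        self._cache: Dict[str, str] = {}   # stand-in for Redis or Memcached
        self._load_from_db = load_from_db
        self.db_reads = 0                  # counts how often the database is hit

    def get(self, key: str) -> str:
        if key in self._cache:
            return self._cache[key]        # cache hit: no database round trip
        self.db_reads += 1
        value = self._load_from_db(key)    # cache miss: read from the database
        self._cache[key] = value           # populate the cache for next time
        return value


def slow_lookup(key: str) -> str:
    # Hypothetical database query; a real one would involve network and disk I/O.
    return f"profile-for-{key}"


reader = CacheAside(slow_lookup)
reader.get("alice")
reader.get("alice")
print(reader.db_reads)  # 1: the second read was served from the cache
```

A production cache would also need an expiry or invalidation policy so that cached entries do not become stale after the underlying data changes.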

 

What are the benefits of Amazon ElastiCache?

There are various benefits of using Amazon ElastiCache, some of which are discussed below:

  • Cache node failures are automatically detected and recovered.
  • It integrates easily with other AWS services to provide a high-performance, secure in-memory cache.
  • ElastiCache handles administrative tasks such as setup, configuration, and monitoring, so the user can focus on higher-value application work.
  • It greatly improves performance for applications that need very low response times.
  • ElastiCache can easily scale up or down according to need.

 

Explain the types of engines in ElastiCache.

There are two types of engines supported in ElastiCache: Memcached and Redis.

Memcached

It is a popular in-memory data store that developers use as a high-performance cache to speed up applications. By storing data in memory instead of on disk, Memcached can retrieve it in less than a millisecond. It works by storing each value under a unique key, which identifies the item and lets Memcached find the record quickly.

Redis

Today’s applications need low latency and high throughput for real-time processing. Redis is favored by many developers for its performance, simplicity, and capabilities. It provides high performance for real-time apps with sub-millisecond latency. It supports rich data types such as strings, hashes, lists, and sets, and has backup and restore capabilities. While Memcached limits a value to 1 MB by default, Redis supports string values up to 512 MB.

 

Can you differentiate DynamoDB, RDS, and RedShift?

DynamoDB, RDS, and Redshift are three database services offered by Amazon. They can be differentiated as follows:

Amazon DynamoDB is a NoSQL database service that deals with unstructured data. It offers a high level of scalability with fast and predictable performance.

Amazon RDS is a managed service for relational databases; it handles upgrades, fixes, patches, and backups of the database without your intervention. RDS is solely a database management service for structured data.

Amazon Redshift is different from both RDS and DynamoDB: it is a data warehouse product used for data analysis.

Features | Amazon DynamoDB | Amazon RDS | Amazon Redshift
Primary Usage | Database for dynamically modified unstructured data | Conventional databases | Data warehouse
Computing Resources | Non-specified, SaaS (Software-as-a-Service) | Instances with 64 vCPU and 244 GB RAM | Nodes with vCPU and 244 GB RAM
Database Engine | NoSQL | MySQL, SQL Server, Oracle, Aurora, PostgreSQL, MariaDB | Redshift
Maintenance Window | No impact | 30 minutes every week | 30 minutes every week
Multi-AZ Replication | In-built | Additional service | Manual

 

Is it possible to run multiple DB instances for free for Amazon RDS?

Yes, under the AWS Free Tier it is possible to run more than one Single-AZ micro DB instance for Amazon RDS for free. The free allowance of 750 instance hours is counted across all RDS Single-AZ micro DB instances, in all Regions and for all database engines; any usage beyond 750 instance hours is billed at standard Amazon RDS pricing.

For example, suppose we run 2 Single-AZ micro DB instances for 400 hours each in one month. The accumulated usage is 800 instance hours, of which 750 are free, so you are billed for the remaining 50 hours at standard Amazon RDS pricing.
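
The same arithmetic, written as a short sketch. The function name `billable_hours` is made up for illustration, and the 750-hour allowance is the figure quoted in this answer.

```python
from typing import List

FREE_TIER_HOURS = 750  # monthly allowance quoted above, shared by all instances


def billable_hours(instance_hours: List[int]) -> int:
    """Hours billed after the shared free allowance is used up."""
    total = sum(instance_hours)           # usage is pooled across instances
    return max(0, total - FREE_TIER_HOURS)


print(billable_hours([400, 400]))  # 50
print(billable_hours([300, 300]))  # 0: still inside the free allowance
```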

 

What will happen to the DB snapshots and backups if a user deletes the DB instance?

When a DB instance is deleted, the user is given the option of creating a final DB snapshot. If you create one, you can later restore your database from that snapshot. Amazon RDS retains this final snapshot, along with all other manually created DB snapshots, after the DB instance is deleted. Automated backups, by contrast, are deleted together with the instance; only manually created DB snapshots are preserved.

 

What is Amazon Aurora?

Amazon Aurora is a MySQL- and PostgreSQL-compatible relational database. It combines the performance of traditional databases with the simplicity and cost-effectiveness of open-source databases. Amazon Aurora is fully managed by Amazon RDS, which automates tasks such as hardware provisioning, database setup, backups, and patching. It also has a self-healing storage system that can scale up to 128 TB per database instance.

 

What is the function of the DynamoDB Accelerator?

DynamoDB Accelerator (DAX) is a fully managed in-memory cache for DynamoDB that improves read performance by up to 10 times. It serves data within microseconds, handles millions of requests per second, and helps lower operational costs.

 

How does AWS RDS ensure high availability and reliability?

AWS RDS allows multi-AZ deployment to ensure high availability and reliability.

With the use of a multi-AZ deployment feature, AWS automatically provisions and maintains a synchronous standby replica in a different Availability Zone.

AWS synchronously replicates the data from the primary to the secondary database instance.

Failover support: if the primary database instance fails or shuts down for any reason, AWS automatically fails over to the standby instance.

 

Which RDS database engines support multi-AZ deployment?

Multi-AZ deployments are available for the following RDS database engines (SQL Server also supports Multi-AZ, using database mirroring or Always On availability groups):

  • MariaDB
  • MySQL
  • Oracle
  • PostgreSQL

 

What are the consistency models for modern DBs offered by AWS?

Eventual Consistency – The data will become consistent eventually, but not necessarily immediately. Client requests are served faster, but some reads issued shortly after a write may return stale data. This type of consistency is preferred in systems where data need not be real-time. For example, if you don’t see recent tweets on Twitter or recent posts on Facebook for a couple of seconds, it is acceptable.

Strong Consistency – The data is consistent across all the DB servers immediately. This model may take some time to make the data consistent before it starts serving requests again. However, it guarantees that all responses always contain consistent data.
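
The difference between the two models can be sketched with a toy primary/replica pair. Everything here is a simplification for illustration: `ReplicatedStore`, its `replicate` step, and the `consistent` flag are hypothetical names, and a real database replicates in the background rather than on an explicit call.

```python
from typing import Dict, List, Optional, Tuple


class ReplicatedStore:
    """Toy model: writes land on the primary and reach the replica later."""

    def __init__(self) -> None:
        self.primary: Dict[str, str] = {}
        self.replica: Dict[str, str] = {}
        self._pending: List[Tuple[str, str]] = []  # writes not yet replicated

    def write(self, key: str, value: str) -> None:
        self.primary[key] = value
        self._pending.append((key, value))

    def replicate(self) -> None:
        # Simulates replication catching up after some lag.
        for key, value in self._pending:
            self.replica[key] = value
        self._pending.clear()

    def read(self, key: str, consistent: bool = False) -> Optional[str]:
        # Strongly consistent reads go to the primary; eventually consistent
        # reads go to the replica, which may not have the latest write yet.
        source = self.primary if consistent else self.replica
        return source.get(key)


db = ReplicatedStore()
db.write("post:1", "hello")
print(db.read("post:1"))                   # None: replica is still stale
print(db.read("post:1", consistent=True))  # hello
db.replicate()
print(db.read("post:1"))                   # hello: replica has caught up
```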

 

If you launched a standby RDS, will it be launched in the same availability zone as your primary?

No, standby instances are automatically launched in a different Availability Zone than the primary, which makes them physically independent infrastructure. The whole purpose of a standby instance is to protect against infrastructure failure. If the primary goes down, the standby takes over with an up-to-date copy of the data.

 

Your organization is developing a new multi-tier web application in AWS. Being a new and small organization, there’s a limited staff. But, the organization requires high availability. This new application comprises complex queries and table joins. Which Amazon service will be the best solution for your organization’s requirements?

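Amazon RDS with a Multi-AZ deployment is the right choice here. Complex queries and table joins call for a relational database, which DynamoDB, as a key-value NoSQL store, is not designed for. RDS is fully managed, which suits a small staff, and a Multi-AZ deployment provides the required high availability.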

 

Your organization is using DynamoDB for its application. This application collects data from its users every 10 minutes and stores it in DynamoDB. Then every day, after a particular time interval, the data (respective to each user) is extracted from DynamoDB and sent to S3. Then, the application visualizes this data to the users. You are asked to propose a solution to help optimize the backend of the application for latency at a lower cost. What would you recommend?

ElastiCache. Amazon ElastiCache is an in-memory caching solution offered by Amazon.

It can hold frequently requested data in memory, so that when users make requests the application can answer from the cache instead of going back to the data store each time, and hence latency is reduced.

 

How would you handle a situation where the relational database engine crashes often whenever the traffic to your RDS instances increases, given that the replica of the RDS instance is not promoted as the master instance?

Choose a larger RDS instance type to handle the higher traffic, and create manual or automated snapshots so that data can be recovered if the RDS instance goes down.

 

Your organization wants to monitor the read-and-write IOPS for its AWS MySQL RDS instance and then send real-time alerts to its internal operations team. Which service offered by Amazon can help your organization achieve this scenario?

Amazon CloudWatch would help us achieve this. CloudWatch is the monitoring service offered by Amazon: it collects RDS metrics such as ReadIOPS and WriteIOPS, and a CloudWatch alarm can notify the operations team when a metric crosses a threshold.
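
The alerting logic behind such an alarm can be sketched as follows. This is a simplified model, not the CloudWatch API: the function `alarm_state` is hypothetical, the sample values are made up, and it assumes the alarm fires only when every datapoint in the evaluation window is above the threshold.

```python
from typing import List


def alarm_state(datapoints: List[float], threshold: float,
                evaluation_periods: int) -> str:
    """Return a simplified alarm state for the most recent datapoints."""
    if evaluation_periods < 1:
        raise ValueError("evaluation_periods must be at least 1")
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    recent = datapoints[-evaluation_periods:]
    # Alarm only when every recent datapoint breaches the threshold.
    if all(value > threshold for value in recent):
        return "ALARM"
    return "OK"


write_iops = [850.0, 1200.0, 1350.0, 1500.0]  # made-up samples, one per period
print(alarm_state(write_iops, threshold=1000.0, evaluation_periods=3))  # ALARM
```

Requiring several consecutive breaches, rather than one, keeps a single short spike from paging the operations team.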

 

 

You have created a VPC with private and public subnets. In what kind of subnet would you launch the database servers?

Database servers should ideally be launched in private subnets. Private subnets are well suited to the backend services and databases of applications, since these are not meant to be accessed directly by the users of the applications, and private subnets are not routable from the Internet.

 

You have deployed multiple EC2 instances across multiple Availability Zones to run your website. You have also deployed a Multi-AZ RDS MySQL Extra Large DB instance. The site performs a high number of small read and write operations per second. After some time, you observe read contention on RDS MySQL. What would be your approach to resolve the contention and optimize your website?

We can deploy an ElastiCache in-memory cache in every Availability Zone, so that frequently read data is served from memory in each zone. We can also add an RDS MySQL read replica in each Availability Zone to improve read performance. Together these offload reads from the primary RDS MySQL instance and resolve the contention issue.
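
On the application side, using read replicas means sending writes to the primary and spreading reads across the replicas. The sketch below shows one simple way to do that routing; the class `ReadWriteRouter` and the endpoint names are hypothetical, and treating every statement that starts with SELECT as a read is a deliberate simplification.

```python
import itertools
from typing import List


class ReadWriteRouter:
    """Send writes to the primary and rotate reads across the read replicas."""

    def __init__(self, primary: str, replicas: List[str]) -> None:
        self.primary = primary
        # Round-robin iterator over replicas; None when there are no replicas.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def endpoint_for(self, sql: str) -> str:
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self._replicas is not None:
            return next(self._replicas)
        return self.primary  # writes, and reads when no replica exists


router = ReadWriteRouter("primary-db", ["replica-az-a", "replica-az-b"])
print(router.endpoint_for("SELECT * FROM orders"))      # replica-az-a
print(router.endpoint_for("SELECT * FROM customers"))   # replica-az-b
print(router.endpoint_for("INSERT INTO orders VALUES (1)"))  # primary-db
```

Because replicas can lag slightly behind the primary, reads that must see a write made a moment earlier should still go to the primary.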

 

Which of the following Amazon Services would you choose if you want complex querying capabilities but not a whole data warehouse?

Amazon RDS

Which service offered by Amazon will you choose if you want to collect and process e-commerce data for near real-time analysis? (Choose any two)

  • DynamoDB
  • Redshift
  • Aurora
  • SimpleDB

DynamoDB and Redshift. DynamoDB is a fully managed NoSQL database service that can be fed any type of unstructured data, which makes it the most suitable choice for collecting data from e-commerce websites. For near real-time analysis, we can use Amazon Redshift.

 
