diff --git a/README.md b/README.md
index ad415dc..1b47bbb 100644
--- a/README.md
+++ b/README.md
@@ -365,6 +365,124 @@ To pass the AWS Certified Solutions Architect - Professional exam, you ha
 4. Capabilities to provide best practices guidance on the architectural design across multiple applications, projects, or the enterprise.
+You also have to:
+
+Demonstrate ability to architect the appropriate level of availability based on stakeholder requirements
+
+1. "Stakeholder requirements" is the key phrase here – look at what the requirements are before deciding the best way to architect the solution
+2. What is availability? Basically, uptime. Does the customer need 99.99% uptime or less? Which products may need to be used to meet this requirement?
+3. Look at products which are single AZ, multi AZ and multi region. A couple of instances in a single AZ may suffice if cost is a factor
+4. CloudWatch alarms can trigger EC2 or Auto Scaling actions when status checks fail or metric thresholds are exceeded
+
+Demonstrate ability to implement DR for systems based on RPO and RTO
+
+1. What is DR? Disaster recovery: the recovery of systems, services and applications after an unplanned period of downtime.
+2. What is RPO? Recovery Point Objective. Which point in time do we need to get back to when DR processes are invoked?
+3. This would come from a customer requirement – when systems are recovered, data is consistent to 30 minutes before the outage, or 1 hour, or 4 hours, etc. What is acceptable to the stakeholder?
+4. What is RTO? Recovery Time Objective. How quickly must systems and services be recovered after invoking DR processes? It may be that all critical systems must be back online within a maximum of four hours.
+5. RTO and RPO are often paired to give end users an SLA stating when services will be fully restored and how much data may be lost. For example, an RTO of 2 hours and an RPO of 15 minutes would mean all systems are recovered in two hours or less and are consistent to within 15 minutes of the failure.
+6. How can a low RTO be achieved? For example, by using elastic scaling, or by using monitoring scripts to power up new instances through the AWS API.
You may also use Multi-AZ services such as RDS to provide additional resilience
+7. How can a low RPO be achieved? By using application-aware, consistent backup tools, usually native ones such as VSS-aware tools from Microsoft or RMAN for Oracle. Databases and real-time systems may need to be quiesced to obtain an application-consistent backup; standard snapshot tools may not provide this. RMAN can back up to S3, or you can use point-in-time snapshots with RDS. RMAN is supported on EC2. Use Data Pump to move large databases.
+8. AWS has multi-AZ and multi-region options, and services like S3, which has 11 nines of durability and supports cross-region replication
+9. Glacier – long-term archive storage. Cheap but not appropriate for fast recovery (retrieval SLA of several hours)
+10. Storage Gateway is a software appliance that sits on premises and can operate in three modes – gateway cached (hot data kept locally but most data stored in S3), gateway stored (all data kept locally but also replicated to S3) and gateway VTL, the virtual tape library (virtual tapes stored in S3, virtual tape shelf stored in Glacier)
+11. You should use gateway cached when the requirement is for low-cost primary storage with hot data stored locally
+12. Gateway stored keeps all data locally but takes asynchronous snapshots to S3
+13. Gateway cached volumes can store 32TB of data each, and 32 volumes are supported (32 x 32TB = 1PB)
+14. Gateway stored volumes are 16TB in size, and 12 volumes are supported (16TB x 12 = 192TB)
+15. Virtual tape library supports 1500 virtual tapes in S3 (150TB total)
+16. Virtual tape shelf supports unlimited tapes (uses Glacier)
+17. Storage Gateway can run on premises or on EC2. It can also schedule snapshots, and it supports Direct Connect and bandwidth throttling.
+18. Storage Gateway supports ESXi or Hyper-V and needs 7.5GB RAM, 75GB storage and 4 or 8 vCPUs for installation. To use the Marketplace appliance, you must choose an xlarge or larger instance from the m3, i2, c3, c4, r3, d2 or m4 instance types
+19.
Gateway cached requires separate storage for an upload buffer and a cache area
+20. Gateway stored requires enough space to hold your full data set plus an upload buffer
+VTL also requires an upload buffer and cache area
+21. Ports required for Storage Gateway include 443 (HTTPS) to AWS, port 80 for initial activation only, port 3260 for iSCSI internally and port 53 for DNS (internal)
+22. Gateway stored snapshots are stored in S3 and can be used to recover data quickly. EBS snapshots can also be used to create a volume to attach to new EC2 instances
+23. You can also use gateway snapshots to create a new volume on the gateway itself
+24. Snapshots can also be used to migrate cached volumes into stored volumes and stored volumes into cached volumes, or to create a new EBS volume to attach to an instance
+25. Use System Resource Check from the appliance menu to ensure the appliance has enough virtual resources to run (RAM, vCPU, etc.)
+26. VTL virtual tape retrieval is instantaneous, whereas Tape Shelf (Glacier) retrieval can take up to 24 hours
+27. VTL supports Backup Exec 2012-15, Veeam 7 and 8, NetBackup 7, System Center Data Protection Manager 2012, Dell NetVault 10
+28. Snapshots can either be scheduled or taken ad hoc
+29. Writes to S3 get throttled as the write buffer gets close to capacity – you can monitor this with CloudWatch
+30. EBS – Elastic Block Store – block-based storage replicated across hosts in a single AZ in a region
+31. Direct Connect – a connection directly into AWS’s data centre via a trusted third party. This can be backed up with standby Direct Connect links or even a software VPN
+32. Route 53 also has a 100% uptime SLA; Elastic Load Balancing and VPC can also provide a level of resilience if required
+33. DynamoDB keeps three copies per region and can also perform multi-region replication
+34. RDS also supports Multi-AZ deployments and read-only replicas of data: 5 read replicas for MySQL, MariaDB and PostgreSQL, 15 for Aurora
+35.
There are four DR models in the AWS white paper:
+