First, JetPack, a WordPress service that provides monitoring and statistics for WordPress blogs, reported that my blog was not reachable. Typically this is something that resolves itself quickly, thanks to monit and the other tools I have in place. Then, nearly an hour later, AWS determined there was a problem and sent this:
Dear Amazon EC2 Customer,
We have important news about your account (AWS Account ID: 117905818048). EC2 has detected degradation of the underlying hardware hosting your Amazon EC2 instance (instance-ID: i-91a71e55) in the us-west-2 region. Due to this degradation, your instance could already be unreachable. After 2016-07-13 08:00 UTC your instance, which has an EBS volume as the root device, will be stopped.
You can see more information on your instances that are scheduled for retirement in the AWS Management Console (https://console.aws.amazon.com/ec2/v2/home?region=us-west-2#Events)
* How does this affect you?
Your instance’s root device is an EBS volume and the instance will be stopped after the specified retirement date. You can start it again at any time. Note that if you have EC2 instance store volumes attached to the instance, any data on these volumes will be lost when the instance is stopped or terminated as these volumes are physically attached to the host computer
* What do you need to do?
You may still be able to access the instance. We recommend that you replace the instance by creating an AMI of your instance and launch a new instance from the AMI. For more information please see Amazon Machine Images (http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AMIs.html) in the EC2 User Guide. In case of difficulties stopping your EBS-backed instance, please see the Instance FAQ (http://aws.amazon.com/instance-help/#ebs-stuck-stopping).
* Why retirement?
AWS may schedule instances for retirement in cases where there is an unrecoverable issue with the underlying hardware. For more information about scheduled retirement events please see the EC2 user guide (http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instance-retirement.html). To avoid single points of failure within critical applications, please refer to our architecture center for more information on implementing fault-tolerant architectures: http://aws.amazon.com/architecture
If you have any questions or concerns, you can contact the AWS Support Team on the community forums and via AWS Premium Support at: http://aws.amazon.com/support
Amazon Web Services
So this morning I attempted to create an AMI from my instance, and it failed. My last backup was over a month old, so what to do? First, I created a snapshot of the instance's volume; this succeeded. Then I used a previously created AMI to launch a new EC2 instance, stopped that instance, detached its volume, and created and attached a new volume from the snapshot I had just made. After a little finagling with my Elastic IP, boom, we're back up.
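For anyone who would rather script this than click through the console, here is a rough sketch of the same recovery steps in Python. It is not what I actually ran. It assumes `ec2` is a boto3 EC2 client (for example `boto3.client("ec2", region_name="us-west-2")`), and the function name and all of its arguments are my own placeholders.

```python
def restore_from_snapshot(ec2, old_volume_id, base_ami_id, instance_type, allocation_id):
    """Rebuild an instance around a fresh snapshot of the old root volume.

    `ec2` is assumed to be a boto3 EC2 client; every ID is supplied by the caller.
    """
    # 1. Snapshot the volume that belongs to the degraded instance.
    snap_id = ec2.create_snapshot(
        VolumeId=old_volume_id, Description="pre-retirement rescue"
    )["SnapshotId"]
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snap_id])

    # 2. Launch a replacement from the older AMI, then stop it.
    inst = ec2.run_instances(
        ImageId=base_ami_id, InstanceType=instance_type, MinCount=1, MaxCount=1
    )["Instances"][0]
    instance_id = inst["InstanceId"]
    zone = inst["Placement"]["AvailabilityZone"]
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    # 3. Find and detach the stale root volume that came with the old AMI.
    desc = ec2.describe_instances(InstanceIds=[instance_id])
    details = desc["Reservations"][0]["Instances"][0]
    root_device = details["RootDeviceName"]
    stale_volume_id = next(
        m["Ebs"]["VolumeId"]
        for m in details["BlockDeviceMappings"]
        if m["DeviceName"] == root_device
    )
    ec2.detach_volume(VolumeId=stale_volume_id)
    ec2.get_waiter("volume_available").wait(VolumeIds=[stale_volume_id])

    # 4. Create a volume from the snapshot (same availability zone) and attach it
    #    under the root device name.
    new_volume_id = ec2.create_volume(
        SnapshotId=snap_id, AvailabilityZone=zone
    )["VolumeId"]
    ec2.get_waiter("volume_available").wait(VolumeIds=[new_volume_id])
    ec2.attach_volume(Device=root_device, InstanceId=instance_id, VolumeId=new_volume_id)
    ec2.get_waiter("volume_in_use").wait(VolumeIds=[new_volume_id])

    # 5. Start the instance and move the Elastic IP over to it.
    ec2.start_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
    ec2.associate_address(
        InstanceId=instance_id, AllocationId=allocation_id, AllowReassociation=True
    )
    return instance_id
```

The sketch leaves the stale volume from the old AMI detached but not deleted, so cleaning it up (and terminating the retired instance) is still a manual step.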
- Prepare for hardware to fail at any time.
- Automated backups are your friends (even for non-critical things like a blog).
- Design for HA to ensure minimal downtime for a production application in the cloud.
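As an illustration of the automated-backup lesson, here is a minimal sketch of a job that takes a tagged snapshot of a volume and prunes old ones. It assumes `ec2` is a boto3 EC2 client and that the job is run on a schedule (cron, for instance). The `CreatedBy=auto-backup` tag, the function names, and the 14-day retention are all made up for the example.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical tag so the job only ever deletes snapshots it created itself.
BACKUP_TAG = {"Key": "CreatedBy", "Value": "auto-backup"}


def expired_snapshot_ids(snapshots, now, keep_days):
    """Return IDs of snapshots that started before the retention cutoff."""
    cutoff = now - timedelta(days=keep_days)
    return [s["SnapshotId"] for s in snapshots if s["StartTime"] < cutoff]


def backup_and_prune(ec2, volume_id, keep_days=14, now=None):
    """Take a new tagged snapshot, then delete tagged ones older than keep_days."""
    now = now or datetime.now(timezone.utc)
    new_id = ec2.create_snapshot(
        VolumeId=volume_id,
        Description=f"auto-backup {now:%Y-%m-%d}",
        TagSpecifications=[{"ResourceType": "snapshot", "Tags": [BACKUP_TAG]}],
    )["SnapshotId"]

    # Only this volume's snapshots carrying our tag are candidates.
    # (A long history would need pagination; omitted to keep the sketch short.)
    existing = ec2.describe_snapshots(
        OwnerIds=["self"],
        Filters=[
            {"Name": "volume-id", "Values": [volume_id]},
            {"Name": "tag:CreatedBy", "Values": [BACKUP_TAG["Value"]]},
        ],
    )["Snapshots"]
    for snap_id in expired_snapshot_ids(existing, now, keep_days):
        ec2.delete_snapshot(SnapshotId=snap_id)
    return new_id
```

Filtering on the tag is a deliberate choice: hand-made snapshots, like the one that saved me here, are never touched by the pruning step.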
Thanks, AWS, for the notice, but I would have imagined that a billion-dollar company would be able to detect a hardware issue before JetPack by WordPress determined my site was down. Then again, perhaps I should have had my CloudWatch alarms notify me about this.
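For the record, the kind of CloudWatch alarm I have in mind would look roughly like the sketch below. It assumes `cloudwatch` is a boto3 CloudWatch client and that an SNS topic already exists to deliver the notification. The alarm name, the function names, and the two-consecutive-minutes threshold are my own choices, not anything AWS prescribes.

```python
def status_check_alarm_params(instance_id, sns_topic_arn):
    """Build put_metric_alarm arguments for a failed EC2 system status check."""
    return {
        "AlarmName": f"{instance_id}-system-status-check",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        # Goes to 1 when AWS detects a problem with the underlying host.
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,            # seconds per evaluation period
        "EvaluationPeriods": 2,  # alarm after two failing minutes in a row
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [sns_topic_arn],  # SNS topic that emails or pages me
    }


def create_status_check_alarm(cloudwatch, instance_id, sns_topic_arn):
    """Create or update the alarm; `cloudwatch` is assumed to be a boto3 client."""
    cloudwatch.put_metric_alarm(**status_check_alarm_params(instance_id, sns_topic_arn))
```

Whether this would have beaten JetPack to the punch depends on when the status check itself started failing, which I have no way of knowing after the fact.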