Dec 3, 2011 - Business, IT, Startups    1 Comment

Amazon’s cloud computing platform as a disruptive technology

You may have heard about Amazon’s Elastic Compute Cloud (EC2) platform. It’s a way to run virtual computers in the cloud, and it’s changing the way a lot of IT businesses operate.

Err… Virtual machines? In a cloud? Get real!

Ok, these terms sound confusing to non-technical folks. However, they’re actually straightforward. If you’re savvy and already know them, feel free to skip to the next section.

Still with me? Ok, let’s cover virtual machines first. Traditionally, an operating system (an OS like Windows, Linux or Mac OS) runs directly on top of hardware, and several applications then run on top of the OS. You can see this in the figure below.

[Figure: Block diagram of virtualization]

Now, virtualization relies on a small but powerful piece of software that runs on top of real hardware. Its only job is to run other OSs on top of itself. The traditional OSs that run on top, the ones marked as OS 1, OS 2, OS N etc. above? They think they are running on “real” hardware while they are, in reality, running on the virtualization layer. Each of them, together with the simulated hardware it sees, is a virtual machine. This virtualization software layer, called a hypervisor, runs many OSs on itself, and each such OS can run multiple applications.

So why bother with this complexity? As usual, it’s about the money. The real hardware machine costs money. If one OS is not busy, the other OS can consume the real hardware, effectively sharing the cost of the real hardware. It’s analogous to sharing the freeway system with other drivers/taxpayers or sharing the cellphone towers with other cellphone users.

Ok, virtual machines done. What about the cloud? Well, if you take that real hardware (the actual computer box you can touch and feel) and put it inside a secure building with continuous electricity, backup generators, air conditioning/cooling systems and some really FAST internet access, you have basically put your virtual machines in the cloud. And that secure building? That’s called a data center, and it usually hosts thousands of ‘real’ hardware servers. Why so many physical servers? To share the costs of the expensive building itself. It looks something like the figure below.

[Figure: Data center diagram]

Back on track

With that quick education, let’s resume our story. I’ve been using Amazon’s Elastic Compute Cloud (EC2) platform a fair amount recently. I find it a great way to rent servers/computers on the fly on an hourly basis, with no contractual commitments (e.g. one year). So you want to use a powerful quad-core server for 4 hours? Or maybe you need a monster 16-core cluster computer to grind through some heavy-duty number crunching? Well, with EC2 you can create and start these as virtual machines in under 5 minutes!

The economics

In case this doesn’t hit you, it’s a dramatic change from 2+ years back, when you’d need to call up Dell/HP/Sun, order that server hardware for many thousands of dollars, wait for it to arrive and only then start using it. And once you’re done with your computational tasks, then what? Well, you’ve lost a lot of money, so many such projects never saw the light of day. To put realistic numbers on it, that monster of a 16-core cluster computer would cost you about $6000 to buy. The same processing power is available for about $2/hour from Amazon EC2. So you can now have your own supercomputer for 2 days and get your work done, all for under $100! On top of that, with Amazon’s EC2 you don’t have to worry about hardware failures or maintenance; it’s all covered by Amazon. To draw a common analogy, imagine that instead of renting you had to BUY a taxi cab each time you needed one!
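To make the arithmetic concrete, here is a minimal Python sketch using only the two approximate figures quoted above ($6000 to buy, about $2/hour to rent). The function and constant names are made up for illustration, and the sketch ignores storage, bandwidth and any other charges.

```python
# Approximate figures quoted in the post; names are illustrative only.
PURCHASE_PRICE_USD = 6000   # rough cost to buy the 16-core machine
HOURLY_RATE_USD = 2         # rough EC2 rental rate per hour


def rental_cost(hours, hourly_rate=HOURLY_RATE_USD):
    """Total rental cost in dollars for the given number of hours."""
    return hours * hourly_rate


two_days = rental_cost(hours=2 * 24)  # 48 hours of rental
print(f"Renting for 2 days: ${two_days}")
print(f"Buying outright:    ${PURCHASE_PRICE_USD}")
```

Two full days of rental come to 48 hours, which is where the “under $100” figure comes from.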

That enables an entire class of software applications and services to spring into existence. They can all of a sudden access vast amounts of computing power without large capital expenditure. Business is growing? Great, fire up new EC2 computers! You can scale the IT side within a matter of minutes. EC2 basically allows you to switch from a large fixed cost (buying the server) to much smaller variable costs (per-hour rental of the server). This is great because computer servers are depreciating assets (unlike, in normal times, say, real estate). So the allure of ‘owning’ the asset doesn’t exist either. Score one more for EC2!
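One way to look at the fixed-versus-variable tradeoff is to ask how many rental hours it takes before renting costs as much as buying. The sketch below is a rough estimate under stated assumptions: it reuses the approximate $6000 and $2/hour figures from earlier, the function name is hypothetical, and it ignores electricity, maintenance and depreciation on the owned machine, all of which would push the break-even point further out.

```python
def break_even_hours(purchase_price, hourly_rate):
    """Hours of rental at which total rent equals the purchase price."""
    return purchase_price / hourly_rate


hours = break_even_hours(purchase_price=6000, hourly_rate=2)
days = hours / 24  # equivalent days of continuous, around-the-clock use
print(f"Break-even after {hours:.0f} rental hours ({days:.0f} days of 24/7 use)")
```

Unless the machine would be busy around the clock for months, the hourly rental comes out ahead on these numbers.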

What else

But all the attention on the high-end supercomputer side doesn’t mean there is no use at the low end. In fact, Amazon makes it very inexpensive to quickly fire up a handful (e.g. 6) of modest, cheap computers that do mundane tasks like serving web pages or reading from and writing to databases. Here each computer is not very powerful, but collectively they distribute the workload and get some real tasks done.

There is yet another category of computing that I guess will gain popularity over time: moving your desktop computer into the cloud. That’s what I’ve been doing recently. I needed another computer system for accounting, and instead of buying another physical computer, I simply created a virtual EC2 computer in the cloud. When it’s time to do accounting, I fire it up and my rental starts. When I’m done, I shut it down and my rental stops/pauses. If my accountant needs to review the financial statements, she simply remotes into the system (via Windows Remote Desktop, RDP) and enters her credentials. It won’t work for every scenario (e.g. on airplanes without internet access), but for the most part cloud desktops like this are enabling.

Ending notes

I suspect we’ll be seeing a lot of innovation in the coming months and years. Small companies can now compete with the big boys in certain areas that were previously capital intensive. As more players enter the cloud computing market, the prices for renting cloud computers should also decrease. At a personal level, I’ll be tinkering with EC2 for some more time too, so expect some more articles here as time goes on.

Dec 3, 2011 - Miscellaneous    No Comments

Migrating Amazon EC2 Windows virtual machine between regions

Many of you might know of Amazon’s Elastic Compute Cloud (EC2) platform. If not, check this article I wrote about it recently.

Anyway, I had to migrate an Amazon computer from their Virginia data center to their new Oregon data center. The rental costs are the same, but Oregon is closer to California, so it makes for more responsive computing. Unfortunately, the Amazon Web Services webpage from where you control everything doesn’t offer a simple point-and-click option to achieve this, so I had to do it by hand. Rather unproductive :(

Newbie checkpoint

Before we move on, I must say this one is a fairly technical article. I’ve written it more as a note to myself for future reference than as an article for widespread consumption. If you’re new to EC2, you probably don’t care about the details. This article I wrote on EC2 recently may be more appropriate.

However, if a Google search brought you here, then you know what you’re looking for.

Migrating EC2 Windows virtual machine between regions

Setting it up

We’re going to use Linux to copy the Windows volume, byte-for-byte, across the network. It’s actually very simple; I’ve just given detailed instructions to assist newbies.

1. Backup your Windows EBS volume at the source region

  1. Shut down your Windows EC2 instance.
  2. Detach the volume: Volumes section -> the Windows volume (right click) -> Detach Volume.
  3. Create a snapshot: right click the volume again -> Create Snapshot.
  4. Note the ‘zone’ of the volume (e.g. ‘us-east-1b’) somewhere. You will need it shortly.

2. Create a Linux instance in your source region, in the same zone. I used Ubuntu (search for “ubuntu/images” AMIs). Alternatively, go to http://alestic.com/ and you’ll find links to the latest official Ubuntu AMIs at the top. I used AMI ami-20f97410 as reference.

  1. Pick an instance type; t1.micro worked for me. My Ubuntu 11.10 32-bit instance was at 40% CPU throughout the disk read operation.
  2. Place the instance in the SAME zone as your Windows volume (e.g. ‘us-east-1b’).
  3. Attach the Windows volume to your Linux instance (right click -> attach -> pick the Linux instance) as something like /dev/sdf (the Web UI may change this to /dev/xvdf in future). If you don’t see your Linux instance, the Windows volume and the Linux instance are likely in different ‘zones’ (e.g. ‘us-east-1a’ vs ‘us-east-1b’).
  4. Ensure that port 9999 is open in the security groups.
  5. Boot up the Linux instance.
  6. Do NOT mount the Windows volume (we only want the Windows ‘hard drive’ attached to the ‘machine’; we do NOT want the Linux OS to mount the filesystem inside that ‘hard drive’ device).
  7. Install cpipe via sudo apt-get install cpipe

3. Create a blank EBS volume at your destination region

  1. Destination region (top left) -> Volumes -> Create Volume.
  2. I used the same size (30GB) but this is a chance to grow the volume.
  3. Note down the zone (e.g. us-west-2a). Ensure it’s in the same region where you’d like your Windows EC2 instance to be.

4. Create a Linux instance in your destination region, in the same zone. Read the above section (#2) about the Linux instance. Again, make sure the Linux instance and the Windows volume are in the same zone within the destination region. Port 9999 must be open in this instance’s security group in particular, since this is where netcat will listen. My Ubuntu 11.10 32-bit ‘server’ t1.micro instance was hitting 80% CPU during this disk write operation.

  1. Boot the instance.
  2. Attach the blank destination Windows volume to this Linux instance.
  3. Do NOT mount the Windows volume.

Start the copy process

1. Destination: Log into the Linux EC2 instance and enter

sudo sh -c 'netcat -p 9999 -l > /dev/xvdf' 

2. Source: Log into the Linux EC2 instance and enter

sudo sh -c 'cpipe -vt -b 1024 < /dev/xvdf | netcat -q 1 dest-aws-ip-dns-address.com 9999' 

Note 1: Newer kernels will use /dev/xvdf for the Windows volume. Older kernels refer to it as /dev/sdf.

Note 2: netcat is sometimes aliased to nc, in case you’re using another Linux distro. cpipe just gives you feedback about progress and transfer rates. netcat runs over TCP/IP, so it’s robust against dropped/corrupt packets. You’d still want to perform a Windows disk check on the destination once the transfer completes.

3. Wait: I was getting a steady 3.00 MB/sec (Mbytes, not Mbits) throughout this operation from the east to the west coast. My 30GB image took 2 hours, 50 minutes.

4. Create a new Windows EC2 instance: We have a volume, but an EC2 instance/machine is needed to actually boot anything. If you use the AWS console GUI to go volume -> snapshot -> AMI, AWS incorrectly creates a Linux (?!?) VM that doesn’t boot because the EC2 configuration is now garbage. We work around that as follows:

  1. Create a new Windows EC2 instance in the SAME zone as your volume. You will have to choose from stock Windows AMIs, so pick the one closest to your actual Windows volume. My source image was Windows 2003 R2, 32-bit, EBS backed, so I fired off a (generic) Windows 2003 R2, 32-bit, EBS backed AMI too.
  2. Boot this instance completely. Verify by logging in via RDP.
  3. Shut down this instance.
  4. Detach the EBS volume that was created along with this instance in step 1 just above.
  5. Attach the EBS volume you copied over as /dev/sda1 (NOT the default /dev/xvdf in the Web UI dialog).
  6. Boot the instance again.
  7. Safety measure: Check the resulting system’s drive (chkdsk).
  8. Enjoy!

5. Cleanup: Remember that you’ve ONLY done the following:

  1. Migrated the volume
  2. Recreated the EC2 instance

If you terminate the above, it’s gone forever, with no AMI to relaunch it. So I would highly recommend that you:

  1. Build an AMI out of the currently working instance (right click instance -> create image AMI). This is now your new “day 0” reference point.
  2. Make a snapshot of the volume too. Yeah, I know the AMI backup above helps, but I keep weekly backups, so I’d have a “day 7” reference next week anyway. Plus, since snapshots are differential, I did this right away.
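If you want to estimate the wait in step 3 of the copy process before you start, the arithmetic is simple. Here is a minimal Python sketch; the function name is made up for illustration, and it assumes 1 GB = 1024 MB and a constant transfer rate, which is roughly what I saw but is not guaranteed.

```python
def transfer_time(size_gb, rate_mb_per_s):
    """Return (hours, minutes, seconds) needed to copy size_gb at a constant rate.

    Assumes 1 GB = 1024 MB and that the whole volume is copied.
    """
    total_seconds = round(size_gb * 1024 / rate_mb_per_s)
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return hours, minutes, seconds


h, m, s = transfer_time(size_gb=30, rate_mb_per_s=3.0)
print(f"Estimated copy time: {h}h {m}m {s}s")
```

For a 30GB volume at 3.00 MB/sec this gives about 2 hours 50 minutes, which matches what I observed.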

Conclusion

It’s a LONG writeup, but I left out no details for newbies. The above worked 100% for me migrating from US-East (Virginia) to US-West (Oregon). I’m in CA and did this to bring the VM closer to me without hitting northern CA’s higher charges.
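An optional extra check, beyond the chkdsk in the steps above, is to compare a checksum of the source and destination volumes while they are still attached to the two Linux instances. This is not something I did in the steps above; it is only a sketch, assuming Python 3 is available on both instances and that the volumes show up as /dev/xvdf. The function name is hypothetical. Since the destination volume may be larger than the source, the sketch can hash just the first N bytes.

```python
import hashlib


def sha256_prefix(path, nbytes=None, chunk_size=1024 * 1024):
    """SHA-256 of the first nbytes of a file or block device.

    If nbytes is None, the whole file or device is hashed.
    """
    digest = hashlib.sha256()
    remaining = nbytes
    with open(path, "rb") as f:
        while remaining is None or remaining > 0:
            want = chunk_size if remaining is None else min(chunk_size, remaining)
            chunk = f.read(want)
            if not chunk:  # end of file/device reached
                break
            digest.update(chunk)
            if remaining is not None:
                remaining -= len(chunk)
    return digest.hexdigest()


# Example (run as root on each Linux instance, then compare the two outputs):
#   print(sha256_prefix("/dev/xvdf", nbytes=30 * 1024**3))
```

If the two hex strings match, the first 30GB of both volumes are identical. Reading the whole volume again takes a while, so this is a tradeoff between time and peace of mind.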

Dec 1, 2011 - IT, Technology    No Comments

Virtual machine performance for software development

I needed a mobile development station in case I needed to jump into code along with my other engineers. Don’t get me wrong: my absolute first choice for development is a desktop with a large keyboard, a full-sized mouse and a large screen. But now I need mobility. My laptop is a 2008 aluminum Core2Duo MacBook, and most of the time it’s quite quick. For development, I loaded Windows 7 Professional on it via Boot Camp but noticed really poor battery life (40% of what I get under Mac OS). Sadly, Apple doesn’t have decent Win7 drivers for that machine due to its age. Newer MacBook Pros/Airs don’t have this issue with Windows 7. But if I buy a new machine right now, I’ll need 8GB RAM to run both OSs in parallel. And I don’t want to spend $2700 on a MacBook Pro while constantly resenting its size. Nor do I want to settle for a MacBook Air with just 4GB RAM (Apple’s top limit right now).

So I thought I’d try running Windows 7 inside a virtual machine hosted on Mac OS. Mac OS’s drivers are very well tuned, so battery life would be nice. Not as good as running Mac OS X alone, because the Windows virtual machine loads the processor (=> more electricity => less battery life). But as was later confirmed, the battery tradeoff was worth it.

So, what’s the problem?

When I tried that, I noticed a massive slowdown in the virtualized Windows 7 running on the laptop. Take a look at the numbers below: the same operations take about 4.25x longer!

(Table of Benchmarks)

This was mind boggling since I knew even the humble Core2Duo had some hardware support for virtualization. I tried the same virtual machine on my desktop and there I was getting near native performance. It was admittedly a lot better than I was expecting but I won’t complain.

And what might cause that problem?

OS: I tried both Windows 7 (VM) on top of native Windows 7 AND Windows 7 (VM) on top of native Mac OS X 10.7 (columns 4 and 5). No difference. So the hosting OS isn’t the issue. Thankfully.

RAM: Then I started suspecting that 4GB RAM on the laptop may be tight for both Mac OS X 10.7.2 + the Windows 7 VM, so I fired up an older WinXP VM I had, installed Visual Studio 2010 there and tried again. Although it’s still not at near-native speed, the WinXP VM is a lot more acceptable at about 2.66x the native time (i.e. 1.66x, or 166%, extra time compared to native). But on the flip side, even after taking RAM out of the equation, there is STILL a LOT of degradation in the Core2Duo’s virtual machine performance. There is something else …
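Since “times as long” and “percent extra time” are easy to mix up, here is a tiny Python sketch of the conversion. It uses only the two slowdown factors mentioned in this article (4.25x for the Windows 7 VM, 2.66x for the WinXP VM); the function name is made up for illustration.

```python
def extra_time_percent(slowdown_factor):
    """Convert 'takes N times as long as native' into percent extra time."""
    return (slowdown_factor - 1) * 100


for label, factor in [("Windows 7 VM", 4.25), ("WinXP VM", 2.66)]:
    print(f"{label}: {factor}x native time = {extra_time_percent(factor):.0f}% extra")
```

So a 2.66x slowdown means the job takes 166% longer than native, not 266% longer.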

CPU: The only major thing left is the CPU. Of course, the Core i5 and the Core2Duo CPUs are different. But we’re comparing processor “X” native performance vs processor “X” virtualized performance, so the experiment should be self-controlled to a large extent. We need to look closer, and we can do so via a neat tool, CPU-Z. I’ve put screenshots below.

Unfortunately, it’s not clear why virtualized performance is SO radically different between the near-native performance of the Core i5 and the 2.66x slowdown on the Core2Duo. A dump of the CPU-Z info is also shown below; you’ll notice that the CPU extensions are similar, meaning they should have the same level of hardware virtualization support.

[Figure: CPU-Z on desktop]

[Figure: CPU-Z on laptop]

Now we can see that the faster RAM bandwidth and the larger cache on the desktop Core i5 could really outpace the laptop Core2Duo. So even when the virtual machine gets the same quantity of RAM, it’s not the same quality. The desktop RAM has much higher bandwidth, so it can push/pull data between the RAM and the CPU much quicker. This means the desktop CPU is NOT starved for data the way the laptop’s is. On top of that, the cache on the desktop Core i5 is way larger (8MB of L3 cache!).

Conclusion?

For the time being I’ll stick to Mac OS X for email, office docs, presentations and spreadsheets. When I need to do some development AND I’m on the road, I’ll fire up Windows XP and work in it. Same for QuickBooks if I need to catch up with accounting; that’s Windows only too. When Apple starts making 8GB MacBook Airs, I might upgrade.

For the curious, the software test was basically compiling a Visual Studio 2010 solution which contained 3 projects:

  • 1x DLL library
  • 2x ASP.NET 4.0 websites

Nov 11, 2011 - Miscellaneous    No Comments

X marks the spot

I’ve been wanting to have my own blog for quite some time now, so I’m glad to have finally put one up here. I intend to collect my thoughts and questions here, accumulating them over the years as a digital diary.