Archive for July, 2009

PHD Virtual have just announced support for vSphere in ESXpress 3.6. For full details on the product, visit http://www.phdvirtual.com/products/esxpress-virtual-backup

PHD Virtual have been a market leader within the virtualised backup industry since the ESX 2.x days. They are one of “the” original virtualisation backup vendors, and they noticed a gap that needed to be bridged: backing up VMs successfully without sacrificing ESX host and VM resources to backup agents and frameworks.


Version 3.6 includes the following features:

  • Full vSphere 4 support
  • Global side deduplication of virtual machine backups (note that this is deduplication across the whole of your backups; I’ve seen the compression rate and it’s fantastic)
  • Multi User Instant file level restore
  • Up to 16 Concurrent Backups per ESX Host (VCB Recommendation is 4-6)
  • Built in Incremental backups
  • And many more niche technologies which give them a competitive edge over other vendors

In brief, ESXpress is different by design from most other virtualisation backup products in that it doesn’t use the VMware VCB framework; it utilises a Virtual Backup Appliance (VBA) on your ESX host to back up the VMs located on that host. It also has some exceptionally clever deduplication algorithms, which also separate it from competitors by deduplicating the complete

Installation and configuration of critical backup jobs is a breeze; how exceptionally easy it is to install is obvious from this video: http://download.phdvirtual.com/docs/esXpress3-5.swf . I will certainly be installing and evaluating the new release from the PHD Virtual website at some point, and I highly recommend you have a look yourself. When I’ve had a chance to look at the new release I hope to post some reviews.

Someone in my office recently got a nice shiny new Solid State Drive for their laptop. It does sound really cool to have one of these, and I’m sure most people expect it to remove what is probably the weakest link in most laptops today: boot times, defrag issues and the general slowdown of the OS environment over time.

When I looked at it through my sceptical eyes, though, my first thought was: are the currently available operating systems optimized to use the capability in SSD yet? My colleague has Windows 7 on the SSD, so all very swish. The thing is, is W7 optimized for SSD, given that it is a “next generation” operating system and possibly one of Microsoft’s last OSs in the architectural sense that we know today?

After having a look around some websites to back this up, I found http://apcmag.com/windows_7_gets_ssdfriendly.htm and http://ajaymatharu.wordpress.com/2008/11/09/windows-7-and-ssds/ . It appears we are still in the very early days of being able to exploit the full capability of SSDs, and a lot of enabling technology hasn’t made it through development into Windows 7 yet. I know that Windows 7 is not GA yet, but I am pretty sure this functionality won’t make it through to GA in the next few months. So what is it that will propel SSD across the early adopter gap and into being fully commoditised?

Future OS’s

Google announced an OS last week which I predict will be designed to cater for future hardware such as SSD. Netbooks today run many Linux variants such as Ubuntu and Fedora, which are in continual open development; due to the nature of open source, these will also be OSs that exploit the benefits for notebooks and PCs.

The TV

I compare the SSD experience to the TV. We all love our TV; it’s been a device in the home that has functionality available within seconds: press the big red button on your remote control and you’re tuned in and ready for action. This is exactly what I would expect from a computer running SSD. I want to be able to press the button and boot into my programs; I don’t want to wait and make a cup of tea while it boots, and I don’t want to sit waiting for it to turn off either! I also think that anyone who is a Mac user will know how damn good it is to just open and shut your MacBook to resume where you left off. I hardly turn mine off. The pitfall of this is gradual battery drain, but it’s still what I would want to see from a future machine built with SSD, only from boot.

It appears it is in the hands of the ISVs for us to start seeing instantly available functionality become de facto for OSs. I do predict that Google and Apple will be ahead of this curve before Microsoft (please post if I’m wrong, MS, and you have SSD in your roadmap). Building an OS from the ground up may be a benefit to Mr Google; if it is based on a Linux kernel, it certainly gives them the opportunity to evolve the kernel continuously around ever-evolving technology a lot more easily.

Hopefully this post has shown that SSD in the laptop/PC is certainly not something you will obtain large benefits from if you are looking to become an early adopter; however, I am more than sure it will provide better benefits than what is available in PATA/SATA today. SSDs in a SAN, on the other hand, are a different subject and a different post altogether….:)

UPDATE: Readded diagram due to link issue, apologies RSS Subscribers

This isn’t an educational post on how many VMs to place per VMFS volume or how to plan your VMFS LUNs; it’s a thinking-matter post in response to a question that Steve Chambers raised on possible ways to script optimal placement of your VMs on VMFS/RDM storage LUNs to gain the best performance. This got me thinking (highly dangerous, yes) about some possible responses. The main topic for debate in this post is “Is VMDK placement on LUNs really something that should be decided by scripting logic, or evenly balanced with an algorithm like VMware DRS does?”

Storage virtualisation and the natural decoupling within the ESX architecture mean that you can design and build a virtual machine with one virtual disk, such as the main OS disk or a partition for flat file copies, hosted on lower-end SATA or networked storage, while other disks on the same VM that require higher IO, such as log and DB volumes, run on more capable Fibre Channel or EFD media. This capability ensures that you can obtain the dedicated IOPS needed for running the virtualised workload and, more importantly, allows organisations to reduce cost by not using higher-end storage for lower-end storage demands. The diagram below hopefully provides a simplified view of this.

Optimal VMDK Placement

It is important to ensure optimal placement of VMs upon any storage volumes and to plan ahead for expected workload. It is also important to ensure that the spindle count and RAID level are suited to the running workload. Longer term, these factors are probably more important than rightsizing your virtual hardware. In early virtualisation projects you could quite happily operate most VMs on just a RAID 5 set with 5 or 7 disks, mainly because those workloads were low-hanging fruit that had been heavily underutilised on their original physical platform. More recently, the major scalability benefits available in vSphere allow scaling to large amounts of vCPU and RAM, which means you are almost certainly going to want to exploit this new capability to virtualise Tier 1 applications and databases. You’d be silly not to.
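To put some rough numbers on why spindle count and RAID level matter, here is a minimal Python sketch of the usual rule-of-thumb sizing arithmetic. The write penalties (2 for RAID 10, 4 for RAID 5, 6 for RAID 6) are the common planning figures; the 180 IOPS per disk and the 2000 IOPS, 30% write workload are assumptions picked purely for illustration, and array cache is ignored.

```python
import math

# Rule-of-thumb back-end I/Os generated by one front-end write.
RAID_WRITE_PENALTY = {"raid10": 2, "raid5": 4, "raid6": 6}


def backend_iops(frontend_iops, write_percent, raid_level):
    """Back-end disk IOPS needed to serve a front-end workload."""
    penalty = RAID_WRITE_PENALTY[raid_level]
    reads = frontend_iops * (100 - write_percent) / 100
    writes = frontend_iops * write_percent / 100
    return reads + writes * penalty


def spindles_needed(frontend_iops, write_percent, raid_level, iops_per_disk):
    """Minimum number of disks, rounding up to whole spindles."""
    needed = backend_iops(frontend_iops, write_percent, raid_level)
    return math.ceil(needed / iops_per_disk)


# Assumed workload: 2000 IOPS at 30% writes on disks good for ~180 IOPS each.
for level in ("raid5", "raid10"):
    total = backend_iops(2000, 30, level)
    disks = spindles_needed(2000, 30, level, 180)
    print(f"{level}: {total:.0f} back-end IOPS, {disks} disks")
# raid5: 3800 back-end IOPS, 22 disks
# raid10: 2600 back-end IOPS, 15 disks
```

On these assumed figures, the same workload that fits on 15 spindles in RAID 10 needs 22 in RAID 5, which is a long way from the 5 or 7 disk sets that were fine for the low-hanging fruit.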

Engagement work is needed at the architectural planning and design stage to gain a prediction, jointly from your application teams and ISVs, of what requirements the workload will have based on the business requirement of that application. Gaining storage-relevant statistics such as how many IOPS and the expected disk read/write characteristics of the running workload is paramount to deciding where to host VMs. In most projects, however, there are inherent problems with this: most virtualisation/server ops guys struggle to engage with application owners/support or obtain this information from them, unless it is easily accessible for an off-the-shelf application/DB such as MS Exchange or SQL. Issues also exist with bespoke applications or web services, where the application developers or the ISV tend not to have the technical resources or performance information available for you to factor into your design.

The problem is…

In most provisioning scenarios, IT Operations create VMFS LUNs and present them to ESX hosts in preparation for VM requirements. When deployment occurs, the virtual admin will put the new VM/VMDK on a LUN that is still under a set maximum number of VMs per LUN, a limit chosen to avoid excessive HBA path thrashing; some will just put VMs on LUNs based on the spare space available. Unfortunately, when it comes to catering for high-IO workloads, both approaches will at some point likely lead to performance problems due to bad placement. Current best practices and guidelines from VMware work effectively, so you can avoid hot spots on LUNs to a degree with prior planning for placement at both the SAN and virtualisation layers. However, the more you attempt to virtualise Tier 1 workloads that demand constant amounts of compute resource, the harder it will be to operate such workloads effectively without constraining resources within the virtual estate.
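To make the difference concrete, here is a small Python sketch comparing the "most spare space" habit with a placement rule that also looks at IO headroom. Everything in it is hypothetical: the Lun fields, the VM cap and the IOPS estimates are illustrative inputs you would have to gather yourself, not values exposed by any VMware API.

```python
from dataclasses import dataclass


@dataclass
class Lun:
    name: str
    free_gb: int
    vm_count: int
    max_vms: int         # admin-chosen cap on VMs per LUN
    iops_capacity: int   # estimate of what the backing spindles sustain
    iops_committed: int  # expected IOPS of the VMDKs already placed

    def iops_headroom(self):
        return self.iops_capacity - self.iops_committed


def pick_by_free_space(luns, vmdk_gb):
    """The common habit: choose the LUN with the most spare space."""
    fits = [lun for lun in luns if lun.free_gb >= vmdk_gb]
    return max(fits, key=lambda lun: lun.free_gb) if fits else None


def pick_by_io_headroom(luns, vmdk_gb, vmdk_iops):
    """Require space, a free VM slot and enough IOPS headroom,
    then choose the LUN with the most IOPS left over."""
    fits = [
        lun for lun in luns
        if lun.free_gb >= vmdk_gb
        and lun.vm_count < lun.max_vms
        and lun.iops_headroom() >= vmdk_iops
    ]
    return max(fits, key=lambda lun: lun.iops_headroom()) if fits else None


# Made-up estate: lun_a has the space, lun_b has the IO headroom.
luns = [
    Lun("lun_a", free_gb=500, vm_count=10, max_vms=16,
        iops_capacity=2000, iops_committed=1900),
    Lun("lun_b", free_gb=200, vm_count=8, max_vms=16,
        iops_capacity=2000, iops_committed=800),
]
print(pick_by_free_space(luns, 100).name)        # lun_a
print(pick_by_io_headroom(luns, 100, 300).name)  # lun_b
```

The space-based rule happily lands a 300 IOPS VMDK on a LUN with only 100 IOPS to spare, which is exactly the kind of bad placement that shows up later as a hot spot.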

FAST for VMware

One possible future solution for moving towards the panacea of automated balancing is something that will feature in EMC’s new Symmetrix V-Max high-end array called FAST (Fully Automated Storage Tiering). In a nutshell, FAST works by monitoring the storage LUNs and migrating workloads to more suitable tiers. I will spare the full detail on how the EMC solution works, as Barry Burke provides this on his EMC blog at http://tinyurl.com/cmawre . Similar technology, which will be almost the same as the initial release of FAST, is also available today within DMX as Symm Optimiser. Symm Optimiser reports on hot and cold spots on your LUNs and balances them to prioritise workloads against others.

Automated tiering that balances according to monitoring of utilised or underutilised VMDKs at the VMFS layer, rather than monitoring of the complete LUN, would be seriously cool. Imagine your SAN array receiving a trap from the ESX host saying that a VMDK is IO constrained and needs to be migrated onto a VMFS volume that can cope, such as one on Solid State Disk. Or say you plan a regular monthly migration under a defined policy to move a VM from SATA to Fibre Channel for certain periods of time, such as when payroll runs or when you run a batch job; once it’s complete, you move the VM back to sit on SATA.
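As a thought experiment only, the two triggers imagined above (an IO-constrained VMDK, and a scheduled monthly window) could be expressed as a tiny policy function. This Python sketch is entirely hypothetical: the TierPolicy fields, the tier names, the 30 ms latency limit and the payroll window on days 25 to 28 are all made-up assumptions, and nothing here reflects how FAST or any array actually decides.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TierPolicy:
    home_tier: str           # where the VMDK normally lives
    boost_tier: str          # tier used during the scheduled window
    boost_days: range        # days of the month in the window
    relief_tier: str         # tier used when the VMDK is IO constrained
    latency_limit_ms: float  # latency above this counts as constrained


def target_tier(policy, today, observed_latency_ms):
    """Return the tier a VMDK should sit on right now."""
    # An IO-constrained VMDK always wins over the schedule.
    if observed_latency_ms > policy.latency_limit_ms:
        return policy.relief_tier
    # Otherwise follow the calendar, e.g. the payroll run.
    if today.day in policy.boost_days:
        return policy.boost_tier
    return policy.home_tier


payroll = TierPolicy(
    home_tier="sata",
    boost_tier="fc",
    boost_days=range(25, 29),  # 25th to 28th inclusive (assumed window)
    relief_tier="ssd",
    latency_limit_ms=30.0,
)

print(target_tier(payroll, date(2009, 7, 10), 12.0))  # sata
print(target_tier(payroll, date(2009, 7, 26), 12.0))  # fc
print(target_tier(payroll, date(2009, 7, 10), 45.0))  # ssd
```

The interesting design question is the one the sketch dodges: who evaluates the policy, the ESX host or the array, and what a move costs while it is in flight.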

The Goal

The premise of using automated tiering at array level is to remove any dependency on human activity within the operations teams, who today perform excessive numbers of live Storage VMotions that, on a grand scale, are reactive, point-driven solutions to problems. Another benefit of automated tiering is reducing the cold migrations of VMDKs that follow from best-case “guesstimates” of VM placement; cold means downtime, which unfortunately costs businesses of any size money and pain.

Using intuitive monitoring techniques across both the ESX host and the storage array, and offloading resource balancing from the virtualisation stack to the array, means the beefy storage array can control optimal placement activity. This in turn reduces the overhead imposed on the ESX host, so more compute resources are available to the running VMs to virtualise more or larger workloads.

Other options

Svmotion’ing in response to storage thresholds being reported in vCenter is another option, as an interim action plan until wacky new ideas and technology like automated tiering become mainstream. Within vCenter you can now use alarm thresholds for:

  • VM Disk Usage (KBps)
  • Total Disk Latency (ms)
  • VM Disk Aborts
  • VM Disk Resets

When thresholds are met, these can all trigger actions. Before you think of automatic migrations invoked by a script here, I seriously recommend Svmotion’ing constrained VMs manually; at the moment you need to seriously consider the load this type of activity imposes on the VM, the ESX host and the storage array. I am no scripter, but I am sure it would be feasible to output lists of VMs that are constrained on disk IO, assess what is utilised and underutilised, and then choose what to migrate with minimal impact.
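The list-building part might look something like the Python sketch below. It assumes you have already exported per-VM figures for latency, aborts and resets from somewhere (the sample records and the 25 ms latency threshold are invented for illustration), and it deliberately stops at producing a list for a human to review rather than migrating anything.

```python
# Assumed thresholds, mirroring the alarm metrics listed above.
THRESHOLDS = {"latency_ms": 25.0, "aborts": 0, "resets": 0}


def constrained_vms(samples, thresholds=THRESHOLDS):
    """Return (vm, latency_ms, reasons) for every VM that breaches
    at least one threshold, worst latency first."""
    flagged = []
    for sample in samples:
        reasons = [
            metric for metric, limit in thresholds.items()
            if sample[metric] > limit
        ]
        if reasons:
            flagged.append((sample["vm"], sample["latency_ms"], reasons))
    return sorted(flagged, key=lambda row: row[1], reverse=True)


# Hypothetical exported statistics, one record per VM.
samples = [
    {"vm": "file01", "latency_ms": 8.0, "aborts": 0, "resets": 0},
    {"vm": "exch01", "latency_ms": 31.5, "aborts": 2, "resets": 0},
    {"vm": "sql01", "latency_ms": 42.0, "aborts": 0, "resets": 0},
]

for vm, latency, reasons in constrained_vms(samples):
    print(f"{vm}: {latency} ms, breached {', '.join(reasons)}")
# sql01: 42.0 ms, breached latency_ms
# exch01: 31.5 ms, breached latency_ms, aborts
```

Keeping the output as a report rather than an action fits the advice above: the admin still decides when the VM, host and array can afford the migration.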

Summary

Fail to prepare, prepare to fail is the motto for this post. You seriously need to plan and design storage for virtualised environments for a variety of workloads before you implement anything in production. Planning for workload requirements needs full scoping and detailed workshops with the application bods, ISVs and SIs. This may be impossible with some bespoke applications, but someone will know what the characteristics of the workload are, whether it’s a “one man and their dog” developer shop or Microsoft. If they don’t know, then seriously consider the consequences of the application being run within your environment and highlight the risks before it gives virtualisation a bad name!

About Me

My name is Daniel Eason, a forward-thinking Infrastructure Architect with an eye to ensuring that infrastructure technology provides a solid foundational platform that enables business growth and stability.

Provided in this blog is content on all things related to infrastructure strategy and technology design, plus commentary and predictions on emergent trends.
