Strategies for optimizing public cloud costs

Cloud computing has the potential to deliver considerable cost savings for organisations. There are, however, some strategies that can lead to additional savings in cloud spend as well as guard against undesirable expense.

Tailor cloud instances to specific solution requirements

  • Not all cloud workloads fit with cloud provider standard offerings
  • Application resource requirements differ immensely in terms of memory, storage and networking

Public cloud providers such as Amazon Web Services (AWS), Microsoft Azure and Google Cloud offer standard instances. These instances are not a fit for all business requirements. Where a specific fit is not available, creating custom instance types for enterprise-specific workloads is advised.

For example, an organisation may want to deploy a CPU-intensive application that does not require large amounts of memory or storage. Standard offerings with the required CPU performance would waste funds due to the accompanying memory and storage specifications and costs. A custom instance with the required CPU specification and reduced memory and storage specifications would avoid spend on irrelevant resources.

A customized Amazon Machine Image (AMI) is a template for a cloud instance which may be re-used for additional custom instances. These custom instances provide the specific resources required without the surplus capacity.
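
As a rough illustration, the sketch below uses the AWS SDK for Python (boto3) to capture a configured instance as a custom AMI and then launch further instances from it; the instance ID, image name and instance type are placeholders for this example, not values from any particular deployment.

    import boto3

    ec2 = boto3.client("ec2", region_name="eu-west-1")

    # Capture an already-configured instance as a reusable template (AMI).
    image = ec2.create_image(
        InstanceId="i-0123456789abcdef0",          # placeholder instance
        Name="cpu-optimised-app-v1",
        Description="Template for a CPU-heavy workload with minimal storage",
    )
    ec2.get_waiter("image_available").wait(ImageIds=[image["ImageId"]])

    # Launch further instances from the custom AMI, choosing a
    # compute-optimised type rather than a larger general-purpose one.
    ec2.run_instances(
        ImageId=image["ImageId"],
        InstanceType="c5.large",   # CPU-focused; avoids paying for unused RAM
        MinCount=1,
        MaxCount=1,
    )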

Virtual machines (VMs) are created from Azure managed VM images using either PowerShell or the Azure portal. A managed VM image contains all of the information required to create a VM, and resource allocations are configurable on the resulting VM.

Google Cloud’s Compute Engine allows for the creation of instances with hardware settings customized as required.
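
As a sketch only, the snippet below uses the google-cloud-compute Python client to request a Compute Engine custom machine type (8 vCPUs with 8 GB of RAM); the project, zone, boot image and machine shape are illustrative assumptions rather than recommendations.

    from google.cloud import compute_v1

    project, zone = "my-project", "europe-west1-b"          # placeholders

    instance = compute_v1.Instance()
    instance.name = "cpu-heavy-custom"
    # Custom machine type: 8 vCPUs with only 8 GB of RAM, instead of a
    # standard type that bundles more memory than the workload needs.
    instance.machine_type = f"zones/{zone}/machineTypes/custom-8-8192"

    boot_disk = compute_v1.AttachedDisk(
        boot=True,
        auto_delete=True,
        initialize_params=compute_v1.AttachedDiskInitializeParams(
            source_image="projects/debian-cloud/global/images/family/debian-12",
            disk_size_gb=20,       # modest disk for a compute-bound workload
        ),
    )
    instance.disks = [boot_disk]
    instance.network_interfaces = [
        compute_v1.NetworkInterface(network="global/networks/default")
    ]

    operation = compute_v1.InstancesClient().insert(
        project=project, zone=zone, instance_resource=instance
    )
    operation.result()             # block until the instance is created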

Reduction of active resources

Unused and underutilized resources raise costs. These costs may be reduced through auto scaling services, tagging of cloud resources and load balancing. The foundation of public cloud is scalability, but to control costs resources should only be scaled up when needed and scaled back down when utilisation drops.

Pinpoint superfluous resource allocations and workloads

Cloud service providers offer methods to manage object lifecycles on their platforms. Examples are AWS's lifecycle configuration for its Simple Storage Service (S3), which allows objects to be moved to cheaper storage[1] or deleted; Azure Blob Storage lifecycle management, which offers rule-based policies to transition objects between storage tiers[2]; and Google Cloud Storage's Object Lifecycle Management, which automatically performs actions on objects[3].
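
As one concrete example of rule-based lifecycle management, the boto3 sketch below applies the S3 rules described in note [1] to a hypothetical bucket: objects move to STANDARD_IA after 30 days and are archived to Glacier after one year.

    import boto3

    s3 = boto3.client("s3")

    # "example-bucket" is a placeholder; the rule applies to every object.
    s3.put_bucket_lifecycle_configuration(
        Bucket="example-bucket",
        LifecycleConfiguration={
            "Rules": [
                {
                    "ID": "tier-then-archive",
                    "Status": "Enabled",
                    "Filter": {"Prefix": ""},
                    "Transitions": [
                        {"Days": 30, "StorageClass": "STANDARD_IA"},
                        {"Days": 365, "StorageClass": "GLACIER"},
                    ],
                }
            ]
        },
    )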

Tagging cloud resources allows for the identification of resources that are superfluous to requirements[4].
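
By way of illustration, the boto3 sketch below tags an instance with a cost centre and project, then uses the resource groups tagging API to list everything still carrying the tag of a finished project; the tag keys, tag values and instance ID are hypothetical.

    import boto3

    ec2 = boto3.client("ec2")

    # Tag an instance with a cost centre and project (hypothetical values).
    ec2.create_tags(
        Resources=["i-0123456789abcdef0"],
        Tags=[
            {"Key": "CostCentre", "Value": "analytics"},
            {"Key": "Project", "Value": "reporting-poc"},
        ],
    )

    # Later, list every resource still tagged to the finished project so
    # it can be reviewed and decommissioned.
    tagging = boto3.client("resourcegroupstaggingapi")
    leftovers = tagging.get_resources(
        TagFilters=[{"Key": "Project", "Values": ["reporting-poc"]}]
    )
    for resource in leftovers["ResourceTagMappingList"]:
        print(resource["ResourceARN"])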

Applications should only be built with the features that are required and within specification[5].

Adopt cloud auto scaling

Auto scaling allows for the automatic addition and removal of cloud resources as demand changes.

  • The Auto Scaling feature of AWS Auto Scaling groups will add resources to groups when utilization peaks and remove resources from those groups when usage declines[6].
  • Microsoft Azure VM Scale Sets allow the number of VM instances to automatically increase or decrease in response to demand or on a defined schedule.
  • Google Cloud Platform's managed instance groups offer autoscaling capabilities to automatically add or delete instances from a managed instance group based on load requirements.

Network configuration needs to be kept in mind when utilising auto scaling[7].
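
As a minimal sketch of the AWS case, the boto3 snippet below attaches a target-tracking policy to a hypothetical Auto Scaling group so that instances are added at peaks and removed when demand drops; the group name, load-balancer target group ARN and 60% CPU target are assumptions for this example.

    import boto3

    autoscaling = boto3.client("autoscaling")

    # Keep average CPU across the group near 60%: instances are added when
    # utilisation peaks and removed when it declines.
    autoscaling.put_scaling_policy(
        AutoScalingGroupName="web-tier-asg",        # placeholder group
        PolicyName="keep-cpu-near-60",
        PolicyType="TargetTrackingScaling",
        TargetTrackingConfiguration={
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization"
            },
            "TargetValue": 60.0,
        },
    )

    # Attaching the group to a load balancer target group routes traffic
    # to new instances automatically and drains it from removed ones.
    autoscaling.attach_load_balancer_target_groups(
        AutoScalingGroupName="web-tier-asg",
        TargetGroupARNs=["arn:aws:elasticloadbalancing:..."],   # placeholder
    )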

Design around cloud regions/zones

Regions and availability zones have an impact on cloud costs. The cost of transferring data between cloud regions varies among vendors and between geographical regions[8]. The same applies to computing and storage resources.

Selection of regions/zones will also be influenced by:

  • Location of users
  • Data sovereignty issues
  • Location of on-premises data centres in hybrid environments
  • Location of other cloud providers
  • Taxes and duties in different regions

Cloud storage fees

Some factors that may lead to an increase in cloud storage fees are:

  • Vendor Lock-in
  • Lack of understanding of service offerings
  • Shadow IT
  • Hoarding data

Avoid Vendor Lock-in

The cloud storage market is fiercely competitive. Providers are constantly changing pricing plans for tiered cloud storage.

  • Shorter-term contracts may allow organisations to take advantage of competitive pricing
  • Costs need to be projected over the medium term when negotiating storage[9]
  • Understand exit costs

Understand the full service offering and its costs

Storage pricing increases with add-ons, performance and configuration.

Some items to be mindful of are:

  • Backups and disaster recovery[10]
  • Storage tier pricing[11]
  • Storage location[12]

Impede shadow IT

Shadow IT refers to cloud services implemented within organisations that are not managed by their respective IT departments and, in some instances, not even known to those departments. These services have the potential to create complications for organisations, among them:

  • One of the advantages of cloud services is access to volume pricing for large organisations. This advantage is lost with Shadow IT procured services[13].
  • Services and applications procured outside of normal IT procurement do not always provide enterprise level security and disaster recovery options[14].
  • Data governance becomes very complex when IT departments are kept out of the loop via Shadow IT initiatives. This situation is exacerbated when cloud services are involved, causing confusion about the “source of truth” of data.

Organisations need to guard against Shadow IT, mainly through pre-emptive identification of user requirements and proactive delivery of solutions, thus negating the need for business users to engage in Shadow IT initiatives. Investigating the reasons for the existence of Shadow IT within an organisation and addressing those reasons has more value than taking an adversarial approach to the problem[15].

Deduplicate data before migration

Data deduplication reduces the impact of redundant data on storage costs by analysing new data in chunks before committing it to storage. If an identical chunk is already committed to storage, a reference to the stored chunk is saved rather than the full chunk, eliminating redundancy without compromising data fidelity or integrity. Cloud storage costs may be reduced by deduplicating data before it is transmitted to cloud storage. This reduces the amount of data in cloud storage, reduces network traffic and lowers the amount of time required for backup and disaster recovery.
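
A minimal sketch of the chunk-and-reference idea in Python (not any vendor's actual implementation): files are split into fixed-size chunks, each chunk is identified by its SHA-256 digest, and only previously unseen chunks consume storage.

    import hashlib

    CHUNK_SIZE = 4 * 1024 * 1024        # 4 MiB chunks; real products vary

    def deduplicate(path, store):
        """Store each unique chunk of a file once; return the references
        (chunk digests) needed to rebuild the file later."""
        references = []
        with open(path, "rb") as f:
            while chunk := f.read(CHUNK_SIZE):
                digest = hashlib.sha256(chunk).hexdigest()
                if digest not in store:     # only new chunks consume storage
                    store[digest] = chunk
                references.append(digest)   # a duplicate costs one reference
        return references

    def rebuild(references, store):
        """Reassemble the original bytes from the stored chunks."""
        return b"".join(store[digest] for digest in references)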

Some deduplication vendors are:

  • Compellent
  • Druva
  • Opendedup
  • Quantum
  • Permabit
  • StarDQ
  • StorReduce

Dispose of obsolete data

Obsolete/outdated data stored in the cloud needs to be purged or archived on a regular schedule. This process needs to be undertaken as a team effort by IT, the organisation's records-management officers and legal/compliance teams. Data may be purged, archived to less costly cloud storage or moved to non-cloud storage to reduce cloud storage costs.
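
As a rough sketch of the purge step only (the bucket name and retention period are assumptions, and the policy itself must come from records management and legal rather than IT alone), the boto3 snippet below deletes S3 objects older than an agreed cut-off.

    import boto3
    from datetime import datetime, timedelta, timezone

    s3 = boto3.client("s3")
    bucket = "example-archive-bucket"                       # placeholder
    cutoff = datetime.now(timezone.utc) - timedelta(days=7 * 365)

    # Walk the bucket and delete objects older than the agreed retention
    # period; archiving to a cheaper tier would be an alternative action.
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        for obj in page.get("Contents", []):
            if obj["LastModified"] < cutoff:
                s3.delete_object(Bucket=bucket, Key=obj["Key"])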

Check for and manage cloud sprawl

Cloud sprawl occurs when cloud resources proliferate in an organisation due to uncontrolled implementation and adoption of cloud services. Some reasons for cloud sprawl in organisations are:

  • Overprovisioning of cloud resources
  • Decentralised control over cloud spending and resourcing
  • Unconsolidated use of multiple cloud service providers

Unless steps are taken to minimize cloud sprawl, it could thwart project plans and lead to unnecessary cloud resource costs.

Muzzle overprovisioning

Cloud computing costs vary greatly according to the amount of computing power purchased. Factors that influence these costs are:

  • Amount of RAM
  • Storage capacity and performance
  • Number of processor cores
  • Server running time
  • Enabled features (scaling, load balancing etc.)

Scaling back on any of these factors will save funds; it is advised to specify only the computing resources that meet requirements and nothing above that.
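
One way to spot over-provisioned instances is to compare purchased capacity with observed utilisation. The boto3 sketch below pulls two weeks of CloudWatch CPU data for a hypothetical instance and flags it as a downsizing candidate below an assumed 10% threshold.

    import boto3
    from datetime import datetime, timedelta, timezone

    cloudwatch = boto3.client("cloudwatch")

    def average_cpu(instance_id, days=14):
        """Average CPU utilisation over the last `days`, sampled hourly."""
        end = datetime.now(timezone.utc)
        stats = cloudwatch.get_metric_statistics(
            Namespace="AWS/EC2",
            MetricName="CPUUtilization",
            Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
            StartTime=end - timedelta(days=days),
            EndTime=end,
            Period=3600,
            Statistics=["Average"],
        )
        points = [point["Average"] for point in stats["Datapoints"]]
        return sum(points) / len(points) if points else None

    # An instance averaging under ~10% CPU is a candidate for a smaller type.
    cpu = average_cpu("i-0123456789abcdef0")        # placeholder instance
    if cpu is not None and cpu < 10:
        print("Consider downsizing this instance")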

Guard against zombie resources

One of the primary benefits of cloud computing is consumption-based costing, allowing organisations to spin up new services rapidly when required and to turn off services that are not required. A typical use of cloud resources in this manner is software development and testing, which requires constant changes in computing capacity. In practice, resources are frequently activated for testing and not shut down once testing is completed. Development resources are also frequently left running during non-working hours. Both of these situations drive up cloud costs.
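
A common mitigation is a scheduled job that stops non-production instances outside working hours. The boto3 sketch below assumes instances are tagged with an Environment tag whose values include "dev" and "test"; the tag name and values are conventions of this example, not an AWS default.

    import boto3

    ec2 = boto3.client("ec2")

    # Find running instances tagged as development or test and stop them;
    # a cron job or scheduled AWS Lambda function can run this each evening.
    reservations = ec2.describe_instances(
        Filters=[
            {"Name": "tag:Environment", "Values": ["dev", "test"]},
            {"Name": "instance-state-name", "Values": ["running"]},
        ]
    )["Reservations"]

    instance_ids = [
        instance["InstanceId"]
        for reservation in reservations
        for instance in reservation["Instances"]
    ]
    if instance_ids:
        ec2.stop_instances(InstanceIds=instance_ids)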

Co-ordinate cloud policy centrally

Unlike traditional IT spending, cloud computing does not have a capital component; it is handled as an operational expense. This leads to cloud service procurement often not following the same rigorous procurement processes that would be used for traditional IT or capital projects. Cloud services may thus be procured without rigorous review from within the purchasing organisation.

A cloud policy enables organisations to manage cloud services as a virtual data centre in the same way as a traditional data centre would be managed. Via a cloud policy, an organisation's IT team is able to provide guidelines for cloud resource procurement and use, providing maximum value to the organisation.

  1. Such as moving objects to the STANDARD_IA storage class 30 days after creation, or archiving objects to the GLACIER storage class one year after creation.
  2. For example, moving blobs to a cooler storage tier (Hot to Cool, Hot to Archive, or Cool to Archive).
  3. Such as downgrading objects of a certain age to Coldline Storage, deleting objects of a certain age and only keeping defined versions of objects.
  4. Tagging allows for the discrimination of cloud objects in a manner that makes sense to organisations. Tags could reflect technical attributes/infrastructure, usage, cost centres, projects or any other discernible attributes.
  5. Redundancy, elasticity, geo-dispersion and other growth and resilience features all come at a cost. These features should only be added if required.
  6. Typical resources such as Amazon Elastic Compute Cloud (EC2) instances are added.
  7. It must be kept in mind that network traffic needs to be re-configured to achieve the full benefit of auto scaling. Network traffic needs to be redirected to new compute instances when scaling up and consolidated to the reduced instances when scaling down.
  8. For example, Google Cloud Platform charges less for data transfers between US regions than between non-US regions, and AWS offers EC2 users a lower transfer rate between its US East (Ohio) and US East (N. Virginia) regions than between any other regions.
  9. Keeping in mind aspects such as data growth, client growth by region etc.
  10. Have these been included in the offering?
  11. Which tiers are included? Are policies enabled to move content to less costly tiers as required?
  12. Issues such as data sovereignty, user/application location etc. need to be kept in mind.
  13. Specifically, the ability to negotiate from a position of strength due to the volume of services procured for the whole enterprise.
  14. Non-IT decision makers cannot be expected to analyse prospective cloud offerings with the same level of rigour as IT departments.
  15. More direct (and adversarial) options are tightening of network security, locking down of corporate data sources and stringent control of IT spend throughout the enterprise.