Virtual Thoughts

Virtualisation, Storage and various other ramblings.

Category: Storage

Homelab Networking Refresh

Adios, Netgear router

In hindsight, I shouldn’t have bought a Netgear D7000 router. The reviews were good, but after about six months of ownership it started to exhibit some pretty awful symptoms. One of these was completely and indiscriminately dropping all wireless clients, regardless of device type, range, band or frequency. Weirdly, reconnecting to the wireless network would prompt for the passphrase again, and even after entering it the client still wouldn’t connect. The only way to rectify this was to physically reboot the router.

Netgear support was pretty poor too. The support representative wanted me to downgrade firmware versions just to “see if it helps”, despite confirming that the issue wasn’t known in any of the published firmware versions.

Netgear support also suggested I change the 2.4GHz network band. Simply put, they weren’t listening, or couldn’t comprehend what I was saying.

Anyway, rant over. Amazon refunded me the £130 for the Netgear router after I explained the situation and Netgear’s poor support. Amazing service, really.

Hola, Ubiquiti

I’ve been eyeing up Ubiquiti for a while now but never had a reason to get any of their kit until now. With me predominantly working from home when I’m not on the road, and my other half running a business from home, stable connectivity is pretty important to both of us.

The EdgeMAX range from Ubiquiti looked like it fit the bill. I’d say it sits above the consumer-level kit from the likes of Netgear, Asus and TP-Link, and just below enterprise-level kit from the likes of Juniper and Cisco. Beyond the usual array of features found on devices of this type, I particularly wanted to mess around with BGP/OSPF from my homelab when creating networks in VMware NSX.

With that in mind, I cracked open Visio and started diagramming, eventually ending up with the following:

[Diagram: homelab network topology]

I noted the following observations:

  • Ubiquiti EdgeRouters do not have a built-in VDSL modem, so for connections such as mine a separate modem is required.
  • The EdgeRouter Lite has no hardware switching module, so it should be used purely as a router (makes sense).
  • The EdgeRouter X has a hardware switching module with routing capabilities, but a lower total packets-per-second (pps) throughput.

Verdict

I managed to set up the pictured environment over the weekend fairly easily. The Ubiquiti software is modern, slick, easy to use and responsive – leaps and bounds ahead of what I’ve found on consumer-grade equipment.

I have but one criticism of the Ubiquiti routers: not everything is easily configurable through the UI (yet). From what I’ve read, Ubiquiti are making good progress here, but I had to resort to the CLI to finish my OSPF peering configuration.
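
For reference, this is roughly the sort of thing involved – a minimal sketch using EdgeOS’s Vyatta-derived configuration syntax, where the router ID and network prefixes are hypothetical and should be substituted for your own:

  configure
  set protocols ospf parameters router-id 10.0.0.1
  set protocols ospf area 0.0.0.0 network 10.0.0.0/24
  set protocols ospf area 0.0.0.0 network 172.16.10.0/24
  commit
  save
  exit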

The wireless access point is decent: good coverage, and the ability to provision an isolated guest network with a custom portal is a very nice touch.

At around £80, the EdgeRouter Lite represents good value for money considering the feature set it provides. I wouldn’t recommend it for everyday casual network users, but then again, that isn’t Ubiquiti’s market.

The Ubiquiti community is active and very helpful as well.

VMware Cloud on AWS

Perhaps one of VMware’s most significant announcements made in recent times is the partnership with Amazon Web Services (AWS), including the ability to leverage AWS’s infrastructure to provision vSphere managed resources. What exactly does this mean and what benefits could this bring to the enterprise?

Collaboration of Two Giants

To understand and appreciate the significance of this partnership we must acknowledge the position and perspective of each.

VMware:

  • Market leader in private cloud offerings
  • Deep roots and history in virtualisation
  • Expanding portfolio

AWS:

  • Market leader in public cloud offerings
  • Broad and expanding range of services
  • Global scale

VMware has a significant presence in the on-premise datacentre, in contrast to AWS, which focuses entirely on the public cloud space. VMware Cloud on AWS sits in the middle as a true hybrid cloud solution, leveraging the established, industry-leading technologies and software developed by VMware together with the infrastructure capabilities provided by AWS.

How it Works

In a typical setup, an established vSphere private cloud already exists. Customers can then provision an AWS-backed vSphere environment using a modern HTML5 based client. The environment created by AWS leverages the following technologies:

  • ESXi on bare metal servers
  • vSphere management
  • vSAN
  • NSX

The connection between the on-premise and AWS hosted vSphere environments is facilitated by Hybrid Linked Mode. This allows customers to manage both on-premise and AWS hosted environments through a single management interface. This also allows us to, for example, migrate and manage workloads between the two.

Advantages

Existing vSphere customers may already be leveraging AWS resources in a different way; however, there are significant advantages associated with implementing VMware Cloud on AWS, such as:

Delivered as a service from VMware – The entire ecosystem of this hybrid cloud solution is sold, delivered and supported by VMware. This simplifies support, management and billing, amongst other activities such as patching and updates.

Consistent operational model – Existing private cloud users use the same tools, processes and technologies to manage the solution. This includes integration with other VMware products included in the vRealize product suite.

Enterprise-grade capabilities – This solution leverages the extensive AWS hardware capabilities, which include the latest in low-latency, SSD-based storage and high-performance networking.

Access to native AWS resources – This solution can be further expanded to access and consume native AWS technologies pertaining to databases, AI, analytics and more.

Use Cases

VMware Cloud on AWS has several applications, including (but not limited to) the following:

Datacenter Extension

Because of how rapidly an AWS-backed software-defined datacenter can be provisioned, expanding an on-premise environment becomes a trivial task. Once completed, these additional resources can be consumed to meet various business and technical demands.

Dev / Test

Adding additional capabilities to an existing private cloud environment enables the division of duties/responsibilities. This enables organisations to separate out specific environments for the purposes of security, delegation and management.

Application Migration

VMware Cloud on AWS enables us to migrate N-tier applications to an AWS-backed vSphere environment without the need to re-architect or convert our virtual machine/compute and storage constructs. This is because we’re using the same software-defined data centre technologies across our entire estate (vSphere, NSX and vSAN).

Conclusion

There are a number of viable applications for VMware Cloud on AWS and it’s a very strong offering considering the pedigree of both VMware and AWS. Combining the strengths from each creates a very compelling option for anyone considering a hybrid cloud adoption strategy.

To learn more about VMware Cloud on AWS please review the following:

https://aws.amazon.com/vmware/

https://cloud.vmware.com/vmc-aws

Intel Skylake/Kaby Lake processors: broken hyper-threading

Overview

Source: https://lists.debian.org/debian-devel/2017/06/msg00308.html

It appears some Intel Xeon CPUs are susceptible to a recently discovered Hyper-Threading bug. However, these are limited to E3 v5/v6-based Xeon systems, which are found mostly in entry-level, single-socket servers. Dual-socket systems currently leverage E5-based Xeons, which don’t appear to be affected.

Currently, the easiest way to mitigate this bug is simply to disable Hyper-Threading. The bug also appears to be OS-agnostic.
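
On ESXi hosts, for example, Hyper-Threading can typically be disabled either in the system BIOS/UEFI or at the VMkernel level. A rough sketch using esxcli follows; the kernel setting corresponds to the VMkernel.Boot.hyperthreading advanced option, the exact name may vary by ESXi version, and a host reboot is required for the change to take effect:

  esxcli hardware cpu global get                                # check the Hyperthreading Supported/Enabled/Active fields
  esxcli system settings kernel set -s hyperthreading -v FALSE  # disable HT from the next boot onwards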

Just Servers?

The focus on social media has predominantly been on run-of-the-mill servers: the kind you typically purchase from the likes of Dell, HP, etc. However, there could be many bespoke devices that use susceptible processors, such as NAS/SAN heads. If you do find such a device, it’s unlikely that HT can simply be disabled, but it is something to be aware of.

List of Intel processors code-named “Skylake”
List of Intel processors code-named “Kaby Lake”

Homelab v2 – Part 1

Out with the old

My previous homelab, although functional, was starting to hit the limits of its 32GB of RAM, particularly when running vCenter, vSAN, NSX, etc. concurrently.

A family member had use for my old lab so I decided to sell it and get a replacement whitebox.

Requirements

  • Quiet – As this would live in my office and be powered on pretty much 24/7, it needed to be a silent-running machine
  • Power efficient – I’d rather not rack up the electric bill
  • 64GB RAM support

Nice to have

  • 10GbE
  • IPMI / Remote Access
  • Mini-ITX

Order List

I’ve had an interest in the Xeon-D boards for quite some time; the low power footprint, SR-IOV support, integrated 10GbE, IPMI and 128GB RAM support make them an attractive offering. I spotted a good deal and decided to take the plunge on a Supermicro X10SDV-4C+-TLN4F.

As for a complete list:

Motherboard – Supermicro X10SDV-4C+-TLN4F

RAM – 64GB (4x16GB) ADATA DDR4

Case – TBC; undecided between a Supermicro 1U case and a standard desktop ITX case

Network – Existing gigabit switch. 10GbE switches are still quite expensive, but it’s nice to have future support for it on the motherboard.

I’ve yet to take delivery of all the components; part 2 will cover assembly.

Achievement Unlocked : Dell Compellent Certified Deployment Professional

I’ve only recently started focusing more on developing my storage skills, which I personally believe to be a good complement to my existing VMware knowledge. I’ve been working with Compellent systems for a few months now and thought it was a good time to get officially certified.

The exam itself put me a little out of my comfort zone, as in the past my storage level knowledge was limited to administrator level on EqualLogic setups. This exam was tough but rewarding.

Now I get to enjoy a week-long holiday and some relaxation.

My return will start my VCAP6-Deploy exam prep…

An introduction to vSphere Metro Storage Cluster with Compellent Live Volume

Intro

VMware vSphere Metro Storage Cluster (vMSC) is a suite of infrastructure configurations that facilitate a stretched cluster setup. It’s not a feature like HA/DRS that we can simply switch on; it requires architectural design decisions that specifically contribute to this configuration. The foundation is a stretched cluster and, with regards to the Compellent suite of solutions, Live Volume.

Stretched Cluster

Stretched clusters are pretty much self-explanatory. In contrast to configurations where compute clusters reside within the same physical room, stretched clusters spread compute capacity over more than one physical location. This can still be internal (different server rooms within the same building) or further apart over geographically dispersed sites.

Having stretched clusters gives us greater flexibility and, when implemented correctly, potentially better RPO/RTO for mission-critical workloads. Risk and load are spread across more than one location. Failover scenarios can be further enhanced with the automatic failover features that come with solutions like Compellent Live Volume.

From a networking perspective, we ideally have a stretched and trunked layer 2 network across both sites, facilitated by redundant connections. I will touch on the requirements later in this post.

What is Live Volume?

Live Volume is a specific feature of Dell Compellent Storage Centers. Broadly speaking, Live Volume virtualises the volume presentation, separating it from the disk and RAID groups within each storage system. This decouples the volume presented to the host from its physical location on a particular storage array. As a result, promoting the secondary storage array to primary status is transparent to the hosts and can be done automatically with auto-failover. Why is this important for vMSC? Because in certain failure scenarios we can fail over between both sites automatically and gracefully.

[Diagram: Live Volume]

Requirements

Specifically regarding the Dell Compellent solution:

  • SCOS 6.7 or newer
  • High-bandwidth, low-latency link between the two sites
    • Latency must be no greater than 10ms; 5ms or less is recommended
    • Bandwidth is dependent on load; it is not uncommon to see redundant 10Gb/40Gb links between sites
  • Uniform or non-uniform presentation
  • Fixed or Round Robin path selection (see the example after this list)
  • No support for physical mode RDMs
    • Very important when considering traditional MSCS
  • For auto-failover, a third site is required with the Enterprise Manager software installed to act as a tiebreaker
    • Maximum latency to both Storage Center networks must not exceed 200ms RTT
  • Redundant vMotion network supporting a minimum throughput of 250Mbps
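
As an illustration of the path selection point above, the Path Selection Policy can be set per device, or as the default for the claiming SATP, with esxcli. The device identifier and SATP name below are placeholders; verify which SATP actually claims your Compellent volumes before changing any defaults:

  esxcli storage nmp device list
  esxcli storage nmp device set --device naa.xxxxxxxxxxxxxxxx --psp VMW_PSP_RR
  esxcli storage nmp satp set --satp VMW_SATP_DEFAULT_AA --default-psp VMW_PSP_RR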

Presentation Modes – Uniform

For vMSC we have two options for presenting our storage: uniform and non-uniform. Below is a diagram depicting a traditional uniform configuration. Uniform configurations are commonly referred to as “mesh” topologies, because the compute layer has access to primary and secondary storage both locally and via the inter-site link.

[Diagram: uniform storage presentation]

Key considerations about uniform presentation:

  • Both Primary and Secondary Live Volumes presented on active paths to all ESXi hosts.
  • Typically used in environments where both sites are in close proximity.
  • Greater pressure/dependency on inter-site link compared to non-uniform.
  • Different reactions to failure scenarios compared to non-uniform, because of the storage paths and how Live Volume works.
  • Attention needs to be paid to IO paths. For example, a Storage Center that holds the secondary volume will simply act as a proxy for write requests it receives, redirecting the IO over the replication network to the Storage Center that holds the primary volume. This adds delay. Under some conditions, Live Volume is intelligent enough to swap the roles for a volume when all IO requests for it come from a specific site.

Presentation Modes – Non Uniform

Non-uniform presentation restricts primary volume access to the confines of the local site. Key differences and observations are around how vCenter/ESXi will react to certain failure scenarios. It could be argued that non-uniform presentation isn’t as resilient as uniform, but this depends on the implementation.

[Diagram: non-uniform storage presentation]

Key considerations about Non-uniform presentation:

  • Primary and Secondary Live Volumes presented via active paths to ESXi hosts within their local site only
  • Typically used in environments where both sites are not in close proximity
  • Less pressure/dependency on inter-site connectivity
  • A path/storage failure would invoke an “All Paths Down” condition; consequently, affected VMs will be rebooted on the secondary site. With uniform presentation they would not, because paths would still be active.

Synchronous Replication Types

With Dell Compellent storage centers we have two methods of achieving synchronous replication:

  • High Consistency
    • Rigidly follows storage industry specifications pertaining to synchronous replication.
    • Guarantees data consistency between the replication source and target.
    • Sensitive to latency.
    • If a write cannot be committed at the destination, it will not be committed at the source; consequently, the IO appears as failed to the OS.
  • High Availability
    • Adopts a level of flexibility when adhering to industry specifications.
    • Under normal conditions, behaves the same as High Consistency.
    • If the replication link or the destination storage becomes unavailable or exceeds a latency threshold, Storage Center automatically removes the dual write committal requirement at the destination volume.
    • IO is then journaled at the source.
    • When the destination volume returns to a healthy state, the journaled IO is flushed to the destination.

Most people tend to opt for High Availability mode for flexibility, unless they have some specific internal or external regulatory requirements.

Are there HA/DRS considerations?

Short answer: yes. Long answer: it depends on the storage vendor, but as this is a Compellent-centric post I wanted to discuss a (really cool) feature that can potentially alleviate some headaches. It doesn’t remove all HA/DRS considerations, as these are still valid design factors.

[Diagram: Live Volume automatic role swap]

In this example we have a Live Volume configured across two SANs, leveraging synchronous replication in a uniform presentation.

If, for any reason, a VM is migrated to the second site, where the secondary volume resides, we will observe IO requests being proxied over to the Storage Center that currently holds the primary Live Volume.

However, Live Volume is intelligent enough to identify this and, under these conditions, will perform an automatic role swap in an attempt to make all IO as efficient as possible.

I really like this feature, but it is only effective if a VM has its own volume, or if the VMs that reside on one volume are grouped together. If Live Volume sees IO from both sites to the same Live Volume, it will not perform a role swap. Prior to this feature, and under different design considerations, we would need to leverage DRS affinity rules (should, not must) to place VMs optimally for the shortest IO path; a sketch of such rules follows.
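
As a sketch of that approach (assuming a recent PowerCLI version; the cluster, VM and host names are hypothetical):

  $cluster = Get-Cluster "Stretched-Cluster"
  # Group the VMs on the site A Live Volume together with the site A hosts
  New-DrsClusterGroup -Name "SiteA-VMs" -Cluster $cluster -VM (Get-VM "app01","app02")
  New-DrsClusterGroup -Name "SiteA-Hosts" -Cluster $cluster -VMHost (Get-VMHost "esxi-a1","esxi-a2")
  # A "should run" rule keeps IO local under normal conditions, while still allowing HA to restart VMs at site B
  New-DrsVMHostRule -Name "SiteA-VMs-on-SiteA-Hosts" -Cluster $cluster -VMGroup "SiteA-VMs" -VMHostGroup "SiteA-Hosts" -Type ShouldRunOn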

Other considerations include, but not limited to:

  • Admission Control
    • Set to 50% to allow enough resources for a complete site failover
  • Isolation Addresses
    • Specify two, one for each physical site
  • Datastore Heartbeating
    • Increase the number of heartbeat datastores from two to four in a stretched cluster – two datastores per site (see the sketch below)
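
By way of example, the isolation address and heartbeat datastore settings above map to vSphere HA advanced options, which can be set through the UI or with PowerCLI. A hedged sketch, where the cluster name and gateway addresses are hypothetical:

  $cluster = Get-Cluster "Stretched-Cluster"
  New-AdvancedSetting -Entity $cluster -Type ClusterHA -Name "das.usedefaultisolationaddress" -Value "false" -Force
  New-AdvancedSetting -Entity $cluster -Type ClusterHA -Name "das.isolationaddress0" -Value "192.168.10.1" -Force   # site A gateway
  New-AdvancedSetting -Entity $cluster -Type ClusterHA -Name "das.isolationaddress1" -Value "192.168.20.1" -Force   # site B gateway
  New-AdvancedSetting -Entity $cluster -Type ClusterHA -Name "das.heartbeatDsPerHost" -Value "4" -Force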

Why we need a third site

Live Volume can work without a third site, but you won’t get automatic failover. The third site is integral for establishing quorum during unplanned outages and for preventing split-brain conditions during network partitioning. With Compellent, we just need to install Enterprise Manager at a third site that has connectivity to both Storage Centers with less than 200ms latency; it can be a physical or virtual Windows machine.

Conclusion

As you can imagine, a lot of care and attention is required when designing and implementing a vMSC solution. Compellent has some very useful features to facilitate it, and with advancements in network technology there is a growing trend towards stretched clusters, for many reasons.

© 2019 Virtual Thoughts
