Virtual Thoughts

Virtualisation, Storage and various other ramblings.

Category: NSX/SDDC

Container Packet Inspection with NSX-T

For troubleshooting (or just being a bit nosey) we have a number of tools that allow us to inspect the traffic between two endpoints. When it comes to containers, however, our approach has to be adjusted slightly. When using NSX-T as a CNI, we have some of these tools available to us out of the box.

App Overview

 

For this example, I’ve deployed a standard WordPress deployment consisting of a frontend (web) pod and a backend (DB) pod. The objective is to identify and capture the traffic between these pods.
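
To identify the pod IPs involved (a quick sketch assuming the WordPress app lives in a namespace called wordpress; adjust for your environment), something like the following can be used:

# List the WordPress pods along with their IP addresses and the worker nodes they run on
kubectl get pods -n wordpress -o wide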

 

Configure NSX-T Port Mirroring

Navigate to the “Port Mirroring Session” section in NSX-T via “Advanced Networking & Security” and click “Add”

Define a Local SPAN session:

Define the transport node (source ESXi host)

Select the physical NIC(s) that participate in the N-VDS configuration that we want to capture packets from.

Important: Enable “Encapsulated Packet” (disabled by default) so we can inspect the underlying Geneve overlay packets. However, the NIC we want to mirror to must support the increased overall packet size, so adjust the MTU of the NIC on the Wireshark host accordingly.
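
As a rough sketch (assuming a Linux Wireshark host whose capture interface is eth0; the interface name and MTU value will vary), the MTU can be raised like so:

# Raise the MTU on the capture interface so mirrored Geneve-encapsulated frames are not truncated
sudo ip link set dev eth0 mtu 1700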

 

Select the source VNIC to capture from – This is the NIC of the Kubernetes worker node VM.

 

 

Finally, select the destination for traffic mirroring. For this example, it’s a NIC on a VM that has Wireshark installed.

Traffic Inspection

With all the hard work out of the way, we can simply provide a filter in Wireshark to show only packets originating from my WordPress Web Pod and terminating at the WordPress DB pod:
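
As an illustration (the pod IPs 172.16.17.2 and 172.16.17.3 and the capture interface ens192 are made-up values; substitute your own), an equivalent capture could be run from the CLI with tshark. Wireshark/tshark display filters match the inner, Geneve-encapsulated headers as well as the outer ones:

# Show only traffic between the web pod and the DB pod (MySQL listens on 3306 by default)
tshark -i ens192 -Y "ip.addr == 172.16.17.2 && ip.addr == 172.16.17.3 && tcp.port == 3306"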

Notes:

  • The overall frame size of 1600 bytes further validates the requirement for the traffic mirroring destination to have an interface with its MTU configured accordingly.
  • Because we previously enabled the mirroring of encapsulated packets, we can see the overlay information.
  • We can see the entire packet structure consisting of:
    • Outer Ethernet (NSX-T Transport Node MAC)
    • Outer IP (NSX-T VTEP IP)
    • Geneve UDP
    • Geneve Data
    • Inner Ethernet (Pod Ethernet addresses)
    • Inner IP (Pod IP addresses)
    • Inner TCP (MySQL connection)
    • Inner Payload (MySQL query) – Highlighted

 

Removing Unused PKS LoadBalancer IP allocations in NSX-T

I’m constantly building, removing, destroying, breaking and generally tinkering in my lab, which unfortunately can leave behind a bit of technical mess. Most recently I’ve been messing around in PKS and consequently had a load of load balancer VIPs no longer in use due to clusters being destroyed and/or recreated.

Unfortunately, there’s no way to remove these in the UI (for now, at least), but you can remove them using the API:


curl -k -u <NSX-USERNAME>:<NSX-PASSWORD> -X POST 'https://<NSX-IP-OR-FQDN>/api/v1/pools/ip-pools/<POOL-ID>?action=RELEASE' -H "X-Allow-Overwrite: true" -d '{"allocation_id":"10.10.10.2"}' -H "Content-Type: application/json"

This is great for the odd IP, but I had tens of IP addresses that needed releasing. A quick and easy Bash script to the rescue!


for i in {2..254}
do
curl -k -u <NSX-USERNAME>:<NSX-PASSWORD> -X POST 'https://<NSX-IP-OR-FQDN>/api/v1/pools/ip-pools/<POOL-ID>?action=RELEASE' -H "X-Allow-Overwrite: true" -d '{"allocation_id":"10.10.10.'"$i"'"}' -H "Content-Type: application/json"
done

Modify the for loop for your environment. I typically have .1 as my default gateway.

Note: After releasing an IP, it seems to take a while before it’s reflected in the UI, therefore give it a few minutes.
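
If you don’t want to wait for the UI to catch up, I believe the pool’s current allocations can also be queried directly via the API (check the API guide for your NSX-T version):

# List the IPs currently allocated from the pool
curl -k -u <NSX-USERNAME>:<NSX-PASSWORD> -X GET 'https://<NSX-IP-OR-FQDN>/api/v1/pools/ip-pools/<POOL-ID>/allocations'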

Bootstrapping Prometheus, Grafana and Alertmanager to PKS deployed K8s Clusters

PKS is a comprehensive platform for the provisioning and management of Kubernetes clusters, which can be further enhanced by leveraging its extensibility options. In this post, we will modify a plan to deploy a yaml manifest file which provisions Prometheus, Grafana, and Alertmanager backed by NSX-T load balancers.

Why Prometheus, Grafana and Alertmanager?

The Cloud Native Computing Foundation accepted Prometheus as its second incubated project, the first being Kubernetes. Originally developed by SoundCloud, it has quickly become a popular platform for monitoring Kubernetes environments. Built upon a powerful analytics engine, extensive and highly flexible data modeling can be accomplished with relative ease.

Grafana is, amongst other things, a visualisation tool that enables users to graph, chart, and generally visually represent data from a wide range of sources, Prometheus being one of them.

Alertmanager handles alerts that are sent by applications such as Prometheus and performs a number of operations such as deduplicating, grouping and routing.

What I wanted to do, as someone unfamiliar with these tools, was to devise a way to deploy these components in an automated fashion so that I can destroy and recreate them with ease. The topology of the solution is depicted below:

 

 

Constructing the manifest file

TLDR; I’ve placed the entire manifest file here, which creates the following:

  • Create the “monitoring” namespace.
  • Create a service account for Prometheus.
  • Create a cluster role required for the Prometheus service account.
  • Create a cluster role binding for the Prometheus service account and the Prometheus cluster role.
  • Create a config map for Prometheus that:
    • Defines the Alertmanager target.
    • Defines K8S master, K8S worker, and cAdvisor scrape targets.
    • Defines where to input alert rules from.
  • Create a config map for Prometheus that:
    • Provides a template for alerting rules.
  • Create a single replica deployment for Prometheus.
    • Expose this deployment via “LoadBalancer” (NSX-T), as sketched after this list.
  • Create a single replica deployment for Grafana.
    • Expose this deployment via “LoadBalancer” (NSX-T).
  • Create a single replica deployment for Alertmanager.
    • Expose this deployment via “Loadbalancer” (NSX-T).
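
As a condensed sketch of the relevant part of that manifest (the service name, port and selector here are illustrative and may differ slightly from the full file linked above), exposing Prometheus via an NSX-T load balancer looks something like this:

apiVersion: v1
kind: Service
metadata:
  name: prometheus
  namespace: monitoring
spec:
  type: LoadBalancer   # NSX-T provisions a load balancer VIP for this service
  ports:
  - port: 80
    targetPort: 9090   # Prometheus listens on 9090 inside the pod
  selector:
    app: prometheus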

 

Bootstrapping the manifest file in PKS

Log into Ops manager and select the PKS tile:

Select a plan from the left-hand side, record the plan name for later:

Scroll down and paste the aforementioned YAML file into the Add-ons section

Save and then provision a cluster using the aforementioned plan:


david@mgmt-jumpbox:~$ pks create-cluster k8s --external-hostname k8s.virtualthoughts.co.uk --plan medium-with-prometheus-grafana-alertmanager --num-nodes 2

Afterwards, execute the following to acquire the list of load balancer IP addresses for the respective services:

david@mgmt-jumpbox:~$ kubectl get svc -n monitoring
NAME           TYPE           CLUSTER-IP       EXTERNAL-IP                 PORT(S)        AGE
alertmanager   LoadBalancer   10.100.200.218   100.64.96.5,172.16.12.129   80:31747/TCP   100m
grafana        LoadBalancer   10.100.200.89    100.64.96.5,172.16.12.128   80:31007/TCP   100m
prometheus     LoadBalancer   10.100.200.75    100.64.96.5,172.16.12.127   80:31558/TCP   100m

Prometheus – Quick tour

Browsing to http://prometheus-lb-vip/targets will display the list of scrape targets for Prometheus, which have been configured via the respective configmap and include:

  • API server (Master)
  • Nodes (Workers)
  • cAdvisor (Pods)
  • Prometheus (self)

Which can be graphed / modelled / queried:
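
Queries can also be issued directly against the Prometheus HTTP API (prometheus-lb-vip standing in for whichever external IP your environment assigns), for example:

# Instant query showing which scrape targets are currently up
curl 'http://prometheus-lb-vip/api/v1/query?query=up'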

 

Grafana – Quick Tour

Out of the box, Grafana has very limited configuration applied – I struggled a little bit with constructing a configmap that would automatically add Prometheus as a data source, so a little bit of manual configuration is required (for now). Accessing http://grafana-lb-vip will prompt for a logon; admin/admin is the default.

 

Add a data source:

Hint: use “prometheus.monitoring.svc.cluster.local” as the source URL.
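
Alternatively, the data source can be added via Grafana’s HTTP API. A sketch using the default admin/admin credentials (the exact payload fields may vary between Grafana versions):

# Register Prometheus as a Grafana data source
curl -X POST http://admin:admin@grafana-lb-vip/api/datasources \
  -H "Content-Type: application/json" \
  -d '{"name":"Prometheus","type":"prometheus","url":"http://prometheus.monitoring.svc.cluster.local","access":"proxy","isDefault":true}'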

After creating a dashboard (or importing one) we can validate Grafana is extracting information from Prometheus

 

 

Alertmanager – Quick Tour

Alertmanager can be accessed via http://alertmanager-lb-vip. From the YAML manifest it has a vanilla config, but Prometheus is configured to use it as an alert target via the configmap:

  prometheus.yml: |-
    global:
      scrape_interval: 5s
      evaluation_interval: 5s
      
    alerting:
      alertmanagers:
        - static_configs:
          - targets: 
            - "alertmanager.monitoring.svc.cluster.local:80"

Alert rules need to be configured in Prometheus in order for Alertmanager to have anything to ingest, deduplicate and forward.
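
As a minimal sketch (the rule name and threshold are arbitrary examples, and the rules file needs to be referenced from the rule_files section of prometheus.yml), an alerting rule could look like this:

groups:
- name: example-rules
  rules:
  - alert: InstanceDown
    expr: up == 0        # fires when a scrape target stops responding
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "Instance {{ $labels.instance }} is down"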

As an example, I tested some integration with Discord:

Practical to send alerts to a Discord server? Probably not.

Fun? Yes!

PKS, Harbor and the importance of container registries

What are container registries and why do we need them?

A lot of the time, particularly when individuals and organisations are evaluating, testing and experimenting with containers, they will use public container registries such as Docker Hub. These public registries provide an easy-to-use, simple way to access images. As developers, application owners, system admins and others gain familiarity and experience, additional operational considerations need to be explored, such as:

  • Organisation – How can we organise container images in a meaningful way? Such as by environment state (Prod/Dev/Test) and application type?
  • RBAC – How can we implement role-based access control to a container registry?
  • Vulnerability Scanning – How can we scan container images for known vulnerabilities?
  • Efficiency – How can we centrally manage all our container images and deploy an application from them?
  • Security – Some images need to be kept under lock and key, rather than being hosted on an external service like Docker Hub.

Introducing VMware Harbor Registry

VMware Harbor Registry has been designed to address these considerations as an enterprise-class container registry solution with integration into PKS. In this post, we’ll have a quick primer on getting up and running with Harbor in PKS and explore some of its features. To begin, we need to download PKS Harbor from the Pivotal site and import it into Ops Manager.

After which the tile will be added. When doing this for the first time it will have an orange bar at the bottom; click the tile to configure it.

The following need to be defined with applicable parameters to suit your environment.

  • Availability Zone and Networks – This is where the Harbor VM will reside, and the respective configuration will be dependent on your setup.
  • General – Hostname and IP address settings
  • Certificate – Generate a self-signed certificate, or BYOC (bring your own certificate)
  • Credentials – Define the local admin password
  • Authentication – Choose between
    • Internal
    • LDAP
    • UAA in PKS
    • UAA in PAS
  • Container Registry store – Choose where to store container images pushed to Harbor
    • Local file system
    • NFS Server
    • S3 Bucket
    • Google Cloud Storage
  • Clair Proxy Settings
  • Notary settings
  • Resource Config

VMware Harbor Registry – Organisation

Harbor employs the concept of “projects”. Projects are a way of collecting images for a specific application or service. When images are pushed to Harbor, they reside within a project:

 

Projects can either be private or public and can be configured during, or after, project creation:

A project comprises a number of components:

 

VMware Harbor Registry – RBAC

In Harbor, we have three role types we can assign to projects:

 


Image source: https://github.com/goharbor/harbor/blob/master/docs/user_guide.md#managing-projects

  • Guest – Read-only access, can pull images
  • Developer – Read/write access, can pull and push images
  • Admin – Read/Write access, as well as project-level activities, such as modifying parameters and permissions.

As a practical example, AD groups can be created to facilitate these roles:

These AD groups can then be mapped to the respective permissions within the project:

 

This facilitates RBAC within our Harbor environment. Pretty handy.

VMware Harbor Registry – Vulnerability Scanning

The ability to identify, evaluate and remediate vulnerabilities is a standard operation in modern software development and deployment. Thankfully, Harbor addresses this through integration with Clair – an open source project that addresses the identification, categorisation and analysis of vulnerabilities within containers. As a demonstration, we need to first push an image to Harbor:
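
A rough sketch of the push (harbor.virtualthoughts.co.uk and the vt-web project are placeholder names; use your own registry FQDN and project):

# Authenticate against Harbor, tag a local image into a project and push it
docker login harbor.virtualthoughts.co.uk
docker tag nginx:latest harbor.virtualthoughts.co.uk/vt-web/nginx:latest
docker push harbor.virtualthoughts.co.uk/vt-web/nginx:latest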

After initiating a scan, Harbor can inform us of what vulnerabilities exist within this container image:

We can then explore more details about these vulnerabilities, including when they were fixed:

 

Conclusion

Harbor provides us with an enterprise-level container registry solution. This blog post has only scratched the surface, and with constant development being invested into the project, expect more features and improvements.

 

My VCAP6-NV Experience

Preamble (pun intended)

I’ve been eyeing up the VCAP6-NV exam for quite some time now, but due to work and personal projects I’ve not been able to focus on it. Having some time between jobs, I decided to start revising and push myself into taking the exam. At the time of writing, there is still no NV-Design exam; therefore, anyone who sits and passes the VCAP6-NV Deploy exam is automatically given the VCIX6-NV certification.

VMware Certified Implementation Expert 6 – Network Virtualization

I’m not a full-on networking person. Back in the day (~10 years ago) I wanted to be, but the opportunities didn’t exist for me and I consequently went down a generic sysadmin path until ending up where I am today, primarily focusing on SDDC and cloud technologies. A lot of people who dive into NSX do come from traditional networking backgrounds. For me, I found this quite intimidating; however, the point is you do not have to be some kind of Cisco god to appreciate NSX, or to pass this exam.

Preparation

Read any blog post, forum or reddit post about any VMware-based exam and it won’t take long until someone says something like:

“Read, study, understand and master the contents of the blueprint.”

And it’s absolutely correct. The version of the blueprint for the exam I sat can be found at https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/certification/vmw-vcap6-nv-deploy-3v0-643-guide.pdf

This, as well as vZealand.com’s excellent guide were my primary study resources and I would prepare by doing the following:

  1. Go over each section of the blueprint via vZealand’s guide in my own lab, following the instructions on the site.
  2. Go over each section of the blueprint without any initial external assistance, assessing accuracy after each objective by checking the guide.
  3. Go over each section of the blueprint until I was confident in completing the objectives from practice alone.

Also, bear in mind the exam is based on NSX 6.2, so it’s a good idea to have a lab running on this version, as there have been a number of significant changes since then.

After I had accomplished all three, I felt confident enough to sit the exam.

The exam itself

You’re looking at 205 minutes in total for 23 questions that cover the blueprint in its entirety. Without breaking NDA, my observations are:

  • Time management – This is the third VCAP exam I’ve taken and with each the time has flown by. It really doesn’t feel like a long time when you’ve finished. I personally found the NSX VCAP exam much more demanding on time compared to the VCAP-DCV Deploy exam. I didn’t complete all my questions in the NV exam whereas I had about 30-40mins left when I took the DCV exam.
  • Content – I feel like the exam was a pretty good reflection of the blueprint and was fairly well represented.
  • HOL Interface – The Exam simulation feels very similar to the VMware HOL, including (unfortunately) the latency. The performance for my exam wasn’t great, but wasn’t terrible.
  • Skip questions you’re not sure on – Time is an expensive commodity in this exam, if you’re struggling with some questions skip and move on. You may have time to come back to it later. I skipped a couple of questions.

 

Result

I passed, but not by much. But a pass is a pass and I was pretty chuffed. It’s definitely the hardest VCAP exam I’ve taken to date.

 

NSX-T, Kubernetes and Microsegmentation

For the uninitiated, VMware NSX comes in two “flavours”: NSX-V, which is heavily integrated with vSphere, and NSX-T, which is more IaaS-agnostic. NSX-T also has more emphasis on facilitating container-based applications, providing a number of features to our container ecosystem. In this blog post, we discuss the microsegmentation capabilities provided by NSX-T in combination with container technology.

What is Microsegmentation?

Prior to software-defined networking, firewall functions were largely centralised, typically manifested as edge devices which were, and still are, good for controlling traffic to and from the datacenter, otherwise known as north-south traffic:

The problem with this model, however, is the lack of control for resources that reside within the datacenter, aka east-west traffic. Thankfully, VMware NSX (-V or -T) can facilitate this, manifested by the distributed firewall.

Because of the distributed firewall, we have complete control over lateral movement within our datacenter. In the example above we can define firewall rules between logical tiers of our application which enforce permitted traffic.

 

But what about containers?

Containers are fundamentally different from virtual machines in both how they’re instantiated and how they’re managed. Containers run on hosts that are usually VMs themselves, so how can we achieve the same level of lateral network security we have with virtual machines, but with containers?

 

Introducing the NSX-T Container Plugin

The NSX-T container plugin facilitates the exposure of container “pods” as NSX-T logical switch ports, and because of this, we can implement microsegmentation rules as well as expose pods to the wider NSX ecosystem, using the same approach we have with virtual machines.

Additionally, we can leverage other NSX-T constructs with our YAML files. For example, we can request load balancers from NSX-T to facilitate our application, which I will demonstrate further on. For this example, I’ve leveraged PKS to facilitate the Kubernetes infrastructure.

Microsegmentation in action

Talk is cheap, so here’s a demonstration of the concepts previously discussed. First, we need a multitier app. For my example, I’m simply using a bunch of nginx images, but with some imagination you can think of more relevant use cases:

Declaring Load balancers

To begin with, I declare two load balancers, one for each tier of my application. Inclusion into these load balancers is determined by tags.


apiVersion: v1
kind: Service
metadata:
  name: web-loadbalancer
  namespace: vt-web
spec:
  type: LoadBalancer
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: web-frontend
    tier: frontend
---
apiVersion: v1
kind: Service
metadata:
  name: app-loadbalancer
  namespace: vt-web
spec:
  type: LoadBalancer
  ports:
  - port: 8080
    protocol: TCP
    targetPort: 80
  selector:
    app: web-midtier
    tier: midtier
---

Declaring Containers

Next, I define the containers I want to run for this application.

---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: web-frontend
  namespace: vt-web
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: web-frontend
        tier: frontend
    spec:
      containers:
      - name: web-frontend
        image: nginx:latest
        ports:
        - containerPort: 80
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: web-midtier
  namespace: vt-web
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: web-midtier
        tier: midtier
    spec:
      containers:
      - name: web-midtier
        image: nginx:latest
        ports:
        - containerPort: 80

Logically, this app looks like this:

 

 

Deploying app

david@ubuntu_1804:~/vt-webapp$ kubectl create namespace vt-web
namespace "vt-web" created
david@ubuntu_1804:~/vt-webapp$ kubectl apply -f webappv2.yaml
service "web-loadbalancer" created
service "app-loadbalancer" created
deployment "web-frontend" created
deployment "web-midtier" created

 

Testing Microsegmentation

At this stage, we’re not leveraging the microsegmentation capabilities of NSX-T. To validate this, we can simply do a traceflow between two web-frontend containers over port 80:

 

As expected, traffic between these two containers is permitted. So, let’s change that. In the NSX-T web interface, go to Inventory -> Groups and click “Add”. Give it a meaningful name.

As for membership criteria, we can select the tags we’ve previously defined, namely tier and app.

Click “add”. After which we can validate:

We can then create a firewall rule to block TCP 80 between members of this group:

Consequently, if we run the same traceflow exercise:

Conclusion

NSX-T provides an extremely comprehensive framework for containerised applications. Given the nature of containers in general, I think containers plus microsegmentation are a winning combination to secure these workloads. Dynamic inclusion adheres to the automated mentality of containers, and with very little effort we can implement microsegmentation using a framework that is agnostic – the principles are the same between VMs and containers.

Hybrid Cloud monitoring with VMware vRealize Operations

Applications and the underlying infrastructure, be it public, private or hybrid cloud are becoming increasingly sophisticated. Because of this, the way in which we monitor and observe these environments requires more sophisticated tools. In this blog post, we look at vRealize Operations and how it can be a facilitator of true hybrid cloud monitoring.

What is vRealize Operations?

vRealize Operations forms part of the overall vRealize suite from VMware – a collection of products targeted to accommodate cloud management and automation. In particular, vRealize Operations, as the name implies, primarily caters to operations management with full visibility across physical, virtual and cloud-based environments. The anatomy of vRealize Operations is depicted below

 

Integrated Cloud Operations Console – A single, unified frontend to access, modify and view all related vRealize Operations components.

Integrated Management Disciplines – vRealize Operations has built-in intelligence to assimilate, dissect and report back on a number of key operational metrics pertaining to performance, capacity, planning and more. Essentially, vRealize Operations “learns” about your environment and is able to make recommendations, predictions and much more based on your specific workloads.

Platform Services – vRealize Operations is able to perform a number of platform management disciplines based on your specific environment. As an example, vRealize Operations can automate the addition of virtual machine memory based on monitored load, therefore proactively addressing potential issues before they surface.

Extensibility – Available from the VMware Marketplace, Management Packs extend the functionality of vRealize Operations. Examples include:

  • Microsoft Azure Management Pack from Blue Medora
  • AWS Management Pack from VMware
  • Docker Management Pack from Blue Medora
  • Dell | EMC Management Pack from Blue Medora
  • vRealize Operations Compliance Pack for PCI from VMware

The examples above demonstrate vRealize Operations’ capability to monitor AWS and Azure environments in addition to on-premises workloads, making vRealize Operations a true platform for hybrid cloud monitoring and operations management.

Practical Example – Cluster Monitoring / Troubleshooting

In this example, we leverage one of the vRealize Operation’s built-in dashboards to check the performance of a specific cluster. A dashboard in vRealize operations terminology is a collection of objects and their state, represented in a visual fashion.

 

One of the ways vRealize Operations understands the underlying environment is to establish and map dependencies in a logical manner. In this example, we have a top-level datacentre object (ISH), of which the child objects (cluster and hosts) are descendants. This dashboard identifies key aspects of the cluster on a single page:

  • Cluster activity / utilisation
  • Health state of associated objects
  • CPU contention information
  • Memory contention information
  • Disk latency information

Without vRealize Operations it would be common for an administrator to try and collate these metrics manually, looking at individual performance charts, DRS scheduling information, and vCenter health alarms. With vRealize Operations, however, this data is collected and centralised for easy, effortless consumption.

 

Practical Example – Workload Planning

In this example, we have an upcoming project that we want to forecast into our environment, particularly around disk space demand. We facilitate this by creating a “Project” in vRealize Operations, but before that, let’s look at the project UI in a bit more detail:

 

We can access this section by navigating to Environment > vSphere Object, at which point we can select the resource we’re interested in forecasting against. The chart in the middle projects the disk space demand for this specific vSphere object (a cluster, in this example). Note how we have an upward trend in disk space demand, which is typical of a production environment; however, we are within capacity for the time period specified (90 days).

To add a project, we click the green “plus” icon below the chart:

 

Next, we fill in details pertaining to the demand. In this case, I’m adding demand in the form of five virtual machines, and I’m populating the specification of these VMs based on an existing VM in my environment, with an implementation date of June 19th.

 

 

If we add this project to the forecast chart, the chart changes to accommodate this change in our environment:

 

 

By adding this project we have obviously created more demand; consequently, the date on which our disk space resources will be exhausted has been brought forward.

With this knowledge we can plan our capacity requirements ahead of time. In this example, I decide to add another project to add resources prior to the commissioning of the aforementioned VMs:

Because we can combine projects into a single chart, we can see based on observed metrics what effect adding demand and capacity to our environment has.

This is one of a vast number of features in vRealize Operations. vRealize Operations can be an incredibly useful tool to have for a number of reasons. Its intelligent analytics, breadth of extensibility options and unified experience make it a compelling choice for modern cloud-based operations.

 

Understanding Data Center traffic flow using NSX-V capabilities

The defining characteristic of the Software Defined Data Center (SDDC), as the name implies, is to bring the intelligence and operations of various datacenter functions into software. This type of integration provides us with the ability to gain insights and analytics in a much more controlled, tightly integrated fashion.

VMware NSX is the market leader in network virtualisation. In this post, we have a look at a selection of tools which come with NSX, enabling a greater understanding of exactly what is transpiring in our NSX environment.

 

What we do now

Before diving into NSX-V traffic flow capabilities, let’s take a step back and look at how some organisations currently approach identifying traffic flows, using a simple example issue:

“Server A can’t talk to Server B on port 8443”

In this example, we assume that Server B is listening on port 8443.

Here are a few tools/methods that can be used to help identify the root cause

 

What these tools/methods have in common are:

 

  • Disjointed – Treated as separate, discrete exercises.
  • Isolated – Requires specific tools/skillsets.
  • Decentralised – Analysis requires output to be cross-referenced and analysed manually.

 

How NSX-V native tools can help

NSX-V provides us with a number of tools to help us gain a deeper understanding of our network environment, as well as providing accelerated troubleshooting and root cause analysis. These can be found via the vCenter client:

 

Flow Monitoring

Flow Monitoring is one of the traffic analysis tools that provides a detailed view of the traffic originating and terminating at virtual machines. One example use case is determining, in real time, the traffic flows originating from a virtual machine, as the example below demonstrates. No agent or VM configuration is needed, unlike with Wireshark – NSX does this all natively without any modifications to the VM:

 

The VM in the example above has an IP of 172.16.201.10. We can see that it is making DNS calls out to 8.8.8.8 as well as communicating with another machine with an IP of 172.16.200.10 over port 8443.

Endpoint Monitoring

 

Endpoint Monitoring enables us to map specific processes inside a guest operating system to network connections that are facilitating this traffic. This is helpful for gaining insight into application-layer details. The examples shown below demonstrate NSX’s ability to identify:

  • The source of the flow (process or application)
  • The source VM
  • The destination (can be any destination)
  • Traffic Type

 

 

 

Traceflow

Traceflow acts as a very useful diagnostic tool. Compared to Flow Monitoring, which takes a real-time view of network traffic, Traceflow allows us to simulate traffic by synthetically “injecting” it into our environment and monitoring the data path. In this example, a test was executed for connectivity from a web server to an app server over port 8443:

 

NSX has informed us that this packet was dropped due to a firewall rule – it also gives us the rule ID in question. We can click on this link to get more information about this rule:

 

Once this rule has been modified, we can re-run the test, which shows the traffic being successfully delivered to the target VM.

Traceflow also gives us an idea of the journey our packet has travelled. From the above output we can see that this packet has traversed two logical switches, two ESXi hosts and one distributed logical router, and was forwarded through the distributed firewall running on the vNICs of two VMs:

 

 

Packet Capture

The Packet Capture feature in NSX-V enables us to generate packet traces from physical ESXi hosts should we wish to perform any troubleshooting at that level.

These captures are done on a per-host level and we can specify to gather packet captures from one of the following interface types:

  • Physical NIC
  • VMKernel Interface
  • vNIC
  • vDR Port

Or from one of the respective filter types. Once started, NSX will begin gathering packet captures. Once the session has stopped, these can be downloaded as .pcap files, which can be opened with a tool such as Wireshark.

 

Conclusion

As organisations are adopting software-defined technologies, the tools and processes we use must also change. Thankfully, NSX-V has a plethora of native capabilities to observe, identify and troubleshoot software-defined networks.

vRealize Log Insight – Frequently Overlooked Centralised Log Management

Log analysis has always been a standardised practice for activities such as root cause analysis or advanced troubleshooting. However, ingesting and analysing these logs from different devices, types, locations and formats can be a challenge. In this post, we have a look at vRealize Log Insight and what it can deliver.

 

What is it?

vRealize Log Insight is a product in the vRealize suite specifically designed for heterogeneous and scalable log management across physical, virtual and cloud-based environments. It is designed to be agnostic in terms of what it can ingest logs from and is therefore a valid candidate in a lot of deployments.

Additionally, any customer with a vCenter Server Standard or above license is entitled to a free 25 OSI pack. OSI is known as “Operating System Instance” and is broadly defined as a managed entity which is capable of generating logs. For example, a 25 OSI pack license can be used to cover a vCenter server, a number of ESXi hosts and other devices covered either natively or via VMware Content Packs (with the exception of Custom and 3rd party content packs – standalone vRealize Log Insight is required for this feature).

 

Current Challenges

Modern datacenters and cloud environments are rarely built from homogeneous solutions. Customers use a number of different technologies from different vendors and operating systems. With this comes a number of challenges:

 

  • The inconsistent format of log types – vCenter/ESXi uses syslog for logging, Windows has a bespoke method, and applications may simply write data to a file in a specific format. This can require a number of tools/skills to read, interpret and act on this data.
  • Silos of information – The decentralised nature of dispersed logging causes this information to be siloed in different areas. This can have an impact on resolution times for incidents and accuracy of root cause analysis.
  • Manual analysis – Simply logging information can be helpful, but the reason why this is required is to perform the analysis. In some environments, this is a manual process performed by a systems administrator.
  • Not scalable – As environments grow larger and more complex, having silos of differing logging types and formats becomes unwieldy to manage.
  • Cost – Man hours used to perform manual analysis can be costly.
  • No correlation – Siloed logs don’t cater for any correlation of events/activities across an environment. This can greatly impede efforts in performing activities such as root cause analysis.

 

Addressing Challenges With vRealize Log Insight

Below are examples of how vRealize Log Insight can address the aforementioned challenges.

 

  • Create structure from unstructured data – Collected data is automatically analysed and structured for ease of reporting.
  • Centralised logging – vRealize Log Insight centrally collates logs from a number of sources which can then be accessed through a single management interface.
  • Automatic analysis – Logs are collected in near real-time and alerts can be configured to inform users of potential issues and unexpected events.
  • Scalable – Advanced licenses of vRealize Log Insight include additional features such as Clustering, High Availability, Event Forwarding and Archiving to facilitate a highly scalable, centralised log management solution. vRealize Log Insight is also designed to analyse massive amounts of log data.
  • Cost – Automatic analysis of logs and alerting can assist with reducing man-hours spent manually analysing logs, freeing up IT staff to perform other tasks.
  • Log Correlation – Because logs are centralised and structured events across multiple devices/services can be correlated to identify trends and patterns.

 

Extensibility

vRealize Log Insight’s capabilities can be extended by the use of content packs. Content packs are available from the VMware marketplace (https://marketplace.vmware.com/vsx/?contentType=2)

Content packs are published either by VMware directly or from vendors to support their own devices/solutions. Examples include:

  • Apache Web Service
  • Brocade Devices
  • Cisco Devices
  • Dell | EMC Devices
  • F5 Devices
  • Juniper Devices
  • Microsoft Active Directory
  • Nimble Devices
  • VMware SRM

 

Closing Thoughts

It’s surprising how underused vRealize Log Insight is, considering it comes bundled as part of any valid vSphere Standard or above license. The modular design of the solution, allowing third-party content packs, adds a massive degree of flexibility which is not common amongst other centralised logging tools.

Homelab Networking Refresh

Adios, Netgear router

In hindsight, I shouldn’t have bought a Netgear D7000 router. The reviews were good, but after about 6 months of ownership it decided to exhibit some pretty awful symptoms. One of these was to completely and indiscriminately drop all wireless clients, regardless of device type, range, band or frequency. A reconnect to the wireless network would prompt for the passphrase again, weirdly. Even after putting in the passphrase (again) it wouldn’t connect. The only way to rectify this was to physically reboot the router.

Netgear support was pretty poor too. The support representative wanted me to downgrade firmware versions just to “see if it helps” despite confirming that this issue is not known in any of the published firmware versions.

Netgear support also suggested I change the 2.4GHz network band. Simply put, they weren’t listening, or couldn’t comprehend what I was saying.

Anyway, rant over. Amazon refunded me the £130 for the Netgear router after I explained the situation regarding Netgear’s poor support. Amazing service, really.

Hola, Ubiquiti

I’ve been eyeing up Ubiquiti for a while now but never had a reason to get any of their kit until now.  With me predominantly working from home when I’m not on the road and my other half running a business from home, stable connectivity is pretty important to both of us.

The EdgeMAX range from Ubiquiti looked like it fit the bill. I’d say it sits above the consumer-level stuff from the likes of Netgear, Asus, TP-Link etc and just below enterprise level kit from the likes of Juniper, Cisco, etc. Apart from the usual array of features found on devices of this type I particularly wanted to mess around with BGP/OSPF from my homelab when creating networks in VMware NSX.

With that in mind, I cracked open Visio and started diagramming, eventually ending up with the following:

 

I noted the following observations:

  • Ubiquiti EdgeRouters do not have a built-in VDSL modem, therefore for connections such as mine a separate modem is required.
  • The EdgeRouter Lite has no hardware switching module, therefore it should be used purely as a router (makes sense).
  • The EdgeRouter X has a hardware switching module with routing capabilities (but a lower total pps (packets per second)).

Verdict

I managed to set up the pictured environment over the weekend fairly easily. The Ubiquiti software is very modern, slick, easy to use and responsive; leaps and bounds ahead of what I’ve found on consumer-grade equipment.

I have but one criticism of the Ubiquiti routers, and that is that not everything is easily configurable through the UI (yet). From what I’ve read, Ubiquiti are making good progress with this, but I had to resort to the CLI to finish my OSPF peering configuration.
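
For reference, a rough sketch of what that looks like in the EdgeOS CLI (the router ID, area and network values are placeholders from my lab; adjust to suit):

configure
# Set a stable router ID and advertise the transit network towards the NSX edge
set protocols ospf parameters router-id 192.168.1.1
set protocols ospf area 0 network 192.168.1.0/24
commit ; save ; exit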

The wireless access point is decent: good coverage, and the ability to provision an isolated guest network with a custom portal is a very nice touch.

Considering the EdgeRouter Lite costs about £80, I personally think it represents good value for money given the feature set it provides. I wouldn’t recommend it for everyday casual network users, but then again, that isn’t Ubiquiti’s market.

The Ubiquiti community is active and very helpful as well.

 

 

 

 
