VSAN, NSX on Cisco Nexus, vSphere Containers, NSX Future & a chat with VMware CEO – Highlights Of My Day 2 at VMworld 2016 US

In this post,  I will aim to highlight the various breakout sessions I’ve attended during the day 2 at VMworld 2016 US, key items / notes / points learnt and few other interesting things I was privy to  during the day that is worth mentioning, along with my thoughts on them…!!

Day 2 – Breakout Session 1 – Understanding the availability features of VSAN

vsan-net-deploy-support

  • Session ID: STO8179R
  • Presenters:
    • GS Khalsa – Sr. Technical Marketing manager – VMware (@gurusimran)
    • Jeff Hunter – Staff Technical Marketing Architect – VMware (@Jhuntervmware)

In all honesty, I wasn’t quite sure why I signed up to this breakout session as I know VSAN fairly well, including its various availability features as I’ve been working with testing & analysing its architecture and performance when VSAN was first launched to then designing and deploying VSAN solutions on behalf of my customers for a while. However, having attended the session it reminded me of a key fact that I normally try to never forget which is “you always learn something new” even when you think you know most of it.

Anyways, about the session itself, it was good and was mainly aimed at the beginners to VSAN but I did manage to learn few new things as well as refresh my memory on few other facts, regarding VSAN architecture. The key new ones I learnt are as follows

  • VSAN component statuses (as shown within vSphere Web Client) and their meanings
    • Absent
      • This means VSAN things the said component will probably return. Examples are,
        • Host rebooted
        • Disk pulled
        • NW partition
        • Rebuild starts after 60 mins
      • When an item is detected / marked as absent, VSNA typically wait for 60 minutes before a rebuild is started in order to allow temporary failure to rectify itself
        • This means for example, pulling disks out of VSAN will NOT trigger an instant rebuild / secondary copy…etc. so it wont be an accurate test of VSAN
    • Degraded
      • This typically means the device / component is unlikely to return. Examples include,
        • A permeant Device Loss (PDL) or a failed disk
      • When a degraded item is noted, a rebuild started immediately
    • Active-Stale
      • This means the device is back online from a failure (i.e. was absent) but the data residing on it are NOT up to date.
  • VSAN drive degradation monitoring is proactively logged in the following log files
    • vmkernel.log indicating LSOM errors
  • Dedupe and Compression during drive failures
    • During a drive failure, de-duplication and compression (al flash only) is automatically disabled – I didn’t know this before

 

Day 2 – Breakout Session 2 – How to deploy VMware NSX with Cisco Nexus / UCS Infrastructure

  • Session ID: NET8364R
  • Presenters:
    • Paul Mancuso – Technical Product Manager (VMware)
    • Ron Fuller – Staff System Engineer (VMware)

This session was about a deployment architecture for NSX which is becoming increasingly popular, which is about how to design & deploy NSX on top of Cisco Nexus switches with ACI as the underlay network and Cisco UCS hardware. Pretty awesome session and a really popular combination too. (FYI – I’ve been touting that both these solutions are better together since about 2 years back and its really good to see both companies recognising this and now working together on providing guidance stuff like these). Outside of this session I also found out that the Nexus 9k switches will soon have the OVS DB support so that they can be used as TOR switches too with NSX (hardware VTEP to bridge VXLANs to VLANs to communication with physical world), much like the Arista switches with NSX – great great news for the customers indeed.

ACI&NSX-2

I’m not going to summarise the content of this session but wold instead like to point people at the following 2 documentation sets from VMware which covers everything that this session was based on, its content and pretty simply, everything you need to know when designing NSX solutions together with Cisco ACI using Nexus 9K switches and Cisco UCS server hardware (blades & rack mounts)

One important thing to keep in mind for all Cisco folks though: Cisco N1K is NOT supported for NSX. All NSX prepped clusters must use vDS. I’m guessing this is very much expected and probably only a commercial decision rather than a technical one.

Personally I am super excited to see VMware ands Cisco are working together again (at least on the outset) when it comes to networking and both companies finally have realised the use cases of ACI and NSX are somewhat complementary to each other (i.e. ACI cannot do most of the clever features NSX is able to deliver in the virtual world, including public clouds and NSX cannot do any of the clever features ACI can offer to a physical fabric). So watch this space for more key joint announcements from both companies…!!

Day 2 – Breakout Session 3 – Containers for the vSphere admin

Capture

  • Session ID: CNA7522
  • Presenters:
    • Ryan Kelly – Staff System Engineer (VMware)

A session about how VMware approaches the massive buzz around containerisation through their own vSphere integrated solution (VIC) as well as a brand new hypervisor system designed from ground up with containerisation in mind (Photon platform). This was more of a refresher session for than anything else and I’m not going to summarise all of it but instead, will point you to the dedicated post I’ve written about VMware’s container approach here.

Day 2 – Breakout Session 4 – The architectural future of Network Virtualisation

the-vision-for-the-future-of-network-virtualization-with-vmware-nsx-27-638

  • Session ID: NET8193R
    Presenters: Bruce Davie – CTO, Networking (VMware)

Probably the most inspiring session of the day 2 as Bruce went through the architectural future of NSX where he described what the NSX team within VMware are focusing on as key improvements & advancements of the NSX platform. The summary of the session is as follows

  • NSX is the bridge from solving today’s requirement to solving tomorrow’s IT requirements
    • Brings remote networking closer easily (i.e. Stretched L2)
    • Programtically (read automatically) provisoned on application demand
    • Security ingrained at a kernel level and every hop outwards from the applications
  • Challenges NSX is trying address (future)
    • Developers – Need to rapidly provision and destroy complex networks as a pre-reqs for applications demanded by developers
    • Micro services – Container networking ands security
    • Containers
    • Unseen future requirements
  • Current NSX Architecture
    • Cloud consumption plane
    • Management plane
    • Control plane
    • Data plane
  • Future Architecture – This is what the NSX team is currently looking at for NSX’s future.
    • Management plane scale out
      • Management plane now needs to be highly available in order to constantly keep taking large number of API calls for action from cloud consumption systems such as OpenStack, vRA..etc – Developer and agile development driven workflows….etc.
      • Using & scaling persistent memory for the NSX management layer is also being considered – This is to keep API requests in persistent memory in a scalable way providing write and read scalability & Durability
      • Being able to take consistent NSX snapshots – Point in time backups
      • Distributed log capability is going to be key in providing this management plane scale out whereby distributed logs that store all the API requests coming from Cloud Consumption Systems will be synchronously stored across multiple nodes providing up to date visibility of the complete state across to all nodes, while also increasing performance due to management node scale out
    • Control plane evolution
      • Heterogeneity
        • Currently vSphere & KVM
        • Hyper-V support coming
        • Control plane will be split in to 2 layers
          • Central control plane
          • Local control plane
            • Data plane (Hyper-V, vSphere, KVM) specific intelligence
    • High performance data plane
      • Use the Intel DPDK – A technology that optimize packet processing in Intel CPU
        • Packet switching using x86 chips will be the main focus going forward and new technologies such as DPDK will only make this better and better
        • DPDK capacities are best placed to optimise iterative processing rather than too many context switching
        • NSX has these optimisation code built in to its components
          • Use DPDK CPUs in the NSX Edge rack ESXi servers is  a potentially good design decision?
  • Possible additional NSX use cases being considered
    • NSX for public clouds
      • NSX OVS and an agent is deployed to in guest – a technical preview of this solution was demoed by Pat Gelsinger during the opening key note on day 1 of VMworld.
    • NSX for containers
      • 2 vSwitches
        • 1 in guest
        • 1 in Hypervisor

 

My thoughts

I like what I heard from the Bruce about the key development focus areas for NSX and looks like all of us, partners & customers of VMware NSX alike, are in for some really cool, business enabling treats from NSX going forward, which kind of reminds me of when vSphere first came out about 20 years ago :-). I am extremely excited about the opportunities NSX present to remove what is often the biggest bottleneck enterprise or corporate IT teams have to overcome to simply get things done quickly and that is the legacy network they have. Networks in most organisations  are still very much managed by an old school minded, networking team that do not necessarily understand the convergence of networking with other silos in the data center such as storage and compute, and most importantly when it comes to convergence with modern day applications. It is a fact that software defined networking will bring the efficiency to the networking the way vSphere brought efficiency to compute (want examples how this SDN efficiency is playing today? Look at AWS and Azure as the 2 biggest use cases) where the ability to spin up infrastructure, along with a “virtual” networking layer significantly increases the convenience for the businesses to consume IT (no waiting around for weeks for your networking team to set up new switches with some new VLANs…etc.) as well as significantly decreasing the go to market time for those businesses when it comes to launching new products / money making opportunities. All in all, NSX will act as a key enabler for any business, regardless of the size to have an agile approach to IT and even embrace cloud platforms.

From my perspective, NSX will provide the same, public cloud inspired advantages to customers own data center and not only that but it will go a step further by effectively converting your WAN to an extended LAN by bridging your LAN with a remote network / data center / Public cloud platform to create something like a LAN/WAN (Read LAN over WAN – Trade mark belongs to me :-))which can automatically get deployed, secured (encryption) while also being very application centric (read “App developers can request networking configuration through an API as a part of the app provisioning stage which can automatically apply all the networking settings including creating various networking segments, routing in between & the firewall requirements…etc. Such networking can be provisioned all the way from a container instance where part of the app is running (i.e. DB server instance as a container service) to a public cloud platform which host the other parts (i.e. Web servers).

I’ve always believed that the NSX solution offering is going to be hugely powerful given its various applications and use cases and natural evolution of the NSX platform through the focus areas like those mentioned above will only make it an absolute must have for all customers, in my humble view.

 

Day 2 – Meeting with Pat Gelsinger and Q&A’s during the exclusive vExpert gathering

vExpert IMG_5750

As interesting as the breakout sessions during the day have been, this was by far the most significant couple of hours for me on the day. As a #vExpert, I was invited to an off site, vExpert only gathering held at Vegas Mob Museum which happened to include VMware CEO, Pat Gelsinger as the guest of honour. Big thanks to the VMware community team lead by Corey Romero (@vCommunityGuy) for organising this event.

This was an intimate gathering for about 80-100 VMware vExperts who were present at VMworld to meet up at an off site venue and discuss things and also to give everyone a chance to meet with VMware CEO and ask him direct questions, which is something you wouldn’t normally get as an ordinary person so it was pretty good. Pat was pretty awesome as he gave a quick speech about the importance of vExpert community to VMware followed up by a Q&A session where we all had a chance to ask him questions on various fronts. I myself started the Q&A session by asking him the obvious question, “What would be the real impact on VMware once the Dell-EMC merger completes” and Pats answer was pretty straight forward. As Michael Dell (who happened to come on stage during the opening day key note speech said it himself), Dell is pretty impressed with the large ecosystem of VMware partners (most of whom are Dell competitors) and will keep that ecosystem intact going forward and Pat echoed the same  message, while also hinting that Dell hardware will play a key role in all VMware product integrations, including using Dell HW by default in most pre-validated and hyper-converged solution offerings going forward, such as using Dell rack mount servers in VCE solutions….etc. (in Pat’s view, Cisco will still play a big role in blade based VCE solution offerings and they are unlikely to walk away from it all just because of Dell integration given the substantial size of revenue that business brings to Cisco).

If I read in between the lines correctly (may be incorrect interpretations from my end here),  he also alluded that the real catch of the EMC acquisition as far as Dell was concerned was VMware. Pat explained that most of the financing charges behind the capital raised by Dell will need to be paid through EMC business’s annual run rate revenue (which by the way is roughly the same as the financing interest) so in a way, Dell received VMware for free and given their large ecosystem of partners all contributing towards VMware’s revenue, it is very likely Dell will continue to let VMware run as an independent entity.

There were other interesting questions from the audience and some of the key points made by Pat in answering those questions were,

  • VMware are fully committed to increasing NSX adoption by customers and sees NSX as a key revenue generator due to what it brings to the table – I agree 100%
  • VMware are working on the ability to provide networking customers through NSX, a capability similar to VMotion for compute as one of their (NSX business units) key goals. Pat mentioned that engineering in fact have this figured out already and testing internally but not quite production ready.
  • In relation to VMware’s Cross Cloud Services as a service offering (announced by Pat during the event opening keynote speech), VMware are also working on offering NSX as a service – Though the detail were not discussed, I’m guessing this would be through the IBM and vCAN partners
  • Hinted that a major announcement on the VMware Photon platform  (One of the VMware vSphere container solutions) will be taking place during VMworld Barcelona – I’ve heard the same from the BU’s engineers too and look forward to Barcelona announcements
  • VMware’s own cloud platform, vCloud air WILL continue to stay focused on targeted use cases while the future scale of VMware’s cloud business will be expected to come from the vCAN partners (hosting providers that use VMware technologies and as a result are part of the VMware vCloud Air Network…i.e IBM)
  • Pat also mentioned about the focus VMware will have on IOT and to this effect, he mentioned about the custom IOT solution VMware have already built or working on (I cannot quite remember which was it) for monitoring health devices through the Android platform – I’m guessing this is through their project ICE and LIOTA (Little IOT Agent) platform which already had similar device monitoring solutions being demoed in the solutions exchange during VMworld 2016. I mentioned about that during my previous post here

It was really good to have had the chance to listen to Pat up close and be able to ask direct questions and get frank answers which was a fine way to end a productive and an education day for me at VMworld 2016 US

Image credit goes to VMware..!!

Cheers

Chan

 

 

VVDs, Project Ice, vRNI & NSX – Summary Of My Breakout Sessions From Day 1 at VMworld 2016 US –

Capture

Quick post to summerise the sessions I’ve attended on day 1 at @VMworld 2016 and few interesting things I’ve noted. First up are the 3 sessions I had planned to attend + the additional session I managed to walk in to.

Breakout Session 1 – Software Defined Networking in VMware validated Designs

  • Session ID: SDDC7578R
  • Presenter: Mike Brown – SDDC Integration Architect (VMware)

This was a quick look at the VMware Validated Designs (VVD) in general and the NSX design elements within the SDDC stack design in the VVD. If you are new to VVD’s and are typically involved in designing any solutions using the VMware software stack, it is genuinely worth reading up on and should try to replicate the same design principles (within your solution design constraints) where possible. The diea being this will enable customers to deploy robust solutions that have been pre-validated by experts at VMware in order to ensure the ighest level of cross solution integrity for maximum availability and agility required for a private cloud deployment. Based on typical VMware PSO best practices, the design guide (Ref architecture doc) list out each design decision applicable to each of the solution components along with the justification for that decision (through an explanation) as well as the implication of that design decision. An example is given below

NSX VVD

I first found out about the VVDs during last VMworld in 2015 and mentioned in my VMworld 2015 blog post here. At the time, despite the annoucement of availability, not much content were actually avaialble as design documents but its now come a long way. The current set of VVD documents discuss every design, planning, deployment and operational aspect of the following VMware products & versions, integrated as a single solution stack based on VMware PSO best practises. It is based on a multi site (2 sites) production solution that customers can replicate in order to build similar private cloud solutions in their environments. These documentation set fill a great big hole that VMware have had for a long time in that, while their product documentation cover the design and deployment detail for individual products, no such documentaiton were available for when integrating multiple products and with VVD’s, they do now. In a way they are similar to CVD documents (Cisco Validated Designs) that have been in use for the likes of FlexPod for VMware…etc.

VVD Products -1

VVD Products -2

VVD’s generally cover the entire solution in the following 4 stages. Note that not all the content are fully available yet but the key design documents (Ref Architecture docs) are available now to download.

  1. Reference Architecture guide
    1. Architecture Overview
    2. Detailed Design
  2. Planning and preperation guide
  3. Deployment Guide
    1. Deployment guide for region A (primary site) is now available
  4. Operation Guide
    1. Monitoring and alerting guide
    2. backup and restore guide
    3. Operation verification guide

If you want to find out more about VVDs, I’d have a look at the following links. Just keep in mind that the current VVD documents are based on a fairly large, no cost barred type of design and for those of you who are looking at much smaller deployments, you will need to exercise caution and common sense to adopt some of the recommended design decisions to be within the appplicable cost constraints (for example, current NSX design include deploying 2 NSX managers, 1 integrated with the management cluster vCenter and the other with the compute cluster vCenter, meaning you need NSX licenses on the management clutser too. This may be an over kill for most as typically, for most deployments, you’d only deploy a single NSX manager integrated to the compute cluster)

As for the Vmworld session itself, the presenter went over all the NSX related design decisions and explained them which was a bit of a waste of time for me as most people would be able to read the document and understand most of those themselves. As a result I decided the leave the session early, but have downloaded the VVD documents in order to read throughly at leisure. 🙂

Breakout Session 2 – vRA, API, Ci Oh My!

  • Session ID: DEVOP7674
  • Presenters

vRA Jenkins Plugin

As I managd to leave the previous session early, I manage to just walk in to this session which had just started next door and both Kris and Ryan were talking about the DevOps best practises with vRealize Automation and vrealize Code Stream. they were focusing on how developpers who are using agile development that want to invoke infrastructure services can use these products and invoke their capabilities through code, rather than through the GUI. One of the key focus areas was the vRA plugin for Jenkins and if you were a DevOps person of a developper, this session content would be great value. if you can gain access to the slides or the session recordings after VMworld (or planning to attend VMworld 2016 Europe), i’d highly encourage you to watch this session.

Breakout Session 3 – vRealize, Secure and extend your data center to the cloud suing NSX: A perspective for service providers and end users

  • Session ID: HBC7830
  • Presenters
    • Thomas Hobika – Director, America’s Service Provider solutions engineering & Field enablement, vCAN, vCloud Proviuder Software business unit (VMware)
    • John White – Vice president of product strategy (Expedient)

Hosted Firewall Failover

This session was about using NSX and other products (i.e. Zerto) to enable push button Disaster Recovery for VMware solutions presented by Thomas, and John was supposed to talk about their involvement in designing this solution.  I didn’t find this session content that relevent to the listed topic to be honest so left failrly early to go to the blogger desks and write up my earlier blog posts from the day which I thought was of better use of my time. If you would like more information on the content covered within this sesstion, I’d look here.

 

Breakout Session 4 – Practical NSX Distributed Firewall Policy Creation

  • Session ID: SEC7568
  • Presenters
    • Ron Fuller – Staff Systems Engineer (VMware)
    • Joseph Luboimirski – Lead virtualisation administrator (University of Michigan)

Fairly useful session focusing about NSX distributed firewall capability and how to effectively create a zero trust security policy on ditributed firewall using vairous tools. Ron was talking about various different options vailablle including manual modelling based on existing firewall rules and why that could potentially be inefficient and would not allow customers to benefit from the versatality available through the NSX platform. He then mentioned other approaches such as analysing traffic through the use of vRealize Network Insight (Arkin solution) that uses automated collection of IPFIX & NetFlow information from thre virtual Distributed Switches to capture traffic and how that capture data could potentialy be exported out and be manipulated to form the basis for the new firewall rules. He also mentioned the use of vRealize Infrastructure Navigator (vIN) to map out process and port utilisation as well as using the Flow monitor capability to capture exisitng communication channels to design the basis of the distributed firewall. The session also covered how to use vRealize Log Insight to capture syslogs as well.

All in all, a good session that was worth attending and I would keep an eye out, especially if you are using / thinking about using NSx for advanced security (using DFW) in your organisation network. vRealize Network Insight really caught my eye as I think the additional monitoring and analytics available through this platform as well as the graphical visualisation of the network activities appear to be truely remarkeble (explains why VMware integrated this to the Cross Cloud Services SaS platform as per this morning’s announcement) and I cannot wait to get my hands on this tool to get to the nitty gritty’s.

If you are considering large or complex deployment of NSX, I would seriously encourage you to explore the additional features and capabilities that this vRNI solution offers, though it’s important to note that it is licensed separately form NSX at present.

vNI         vNI 02

 

Outside of these breakout sessions I attended and the bloggin time in between, I’ve managed to walk around the VM Village to see whats out there and was really interested in the Internet Of Things area where VMware was showcasing their IOT related solutions currently in R&D. VMware are currently actively developing an heterogeneous IOT platform monitoring soluton (internal code name: project Ice). The current version of the project is about partnering up with relevent IOT device vendors to develop a common monitoring platform to monitor and manage the various IOT devices being manufacured by various vendors in various areas. If you have a customer looking at IOT projects, there are opportunities available now within project Ice to sign up with VMware as a beta tester and co-develop and co-test Ice platform to perform monitoring of these devices.

An example of this is what VMware has been doing with Coca Cola to monitor various IOT sensors deployed in drinks vending machines and a demo was available in the booth for eall to see

IOT - Coke

Below is a screenshot of Project Ice monitoring screen that was monitoring the IOT sensors of this vending machine.   IOT -

The solution relies on an Open-Source, vendor neutral SDK called LIOTA (Little IOT Agent) to develop a vendor neutral agent to monitor each IOT sensor / device and relay the information back to the Ice monitoring platform. I would keep and eye out on this as the use cases of such a solution is endless and can be applied on many fronts (Auto mobiles, ships, trucks, Air planes as well as general consumer devices). One can argue that the IOT sensor vendors themselves should be respornsible for developping these mo nitoring agents and platforms but most of these device vendors do not have the knowledge or the resources to build such intelligent back end platforms which is where VMware can fill that gap through a partship.

If you are in to IOT solutions, this is defo a one to keep your eyes on for further developments & product releases. This solution is not publicly available as of yet though having spoken to the product manager (Avanti Kenjalkar), they are expecting a big annoucement within 2 months time which is totally exciting.

Some additional details can be found in the links below

Cheers

Chan

#vRNI #vIN #VVD # DevOps #Push Button DR # Arkin Project Ice # IOT #LIOTA

VMware NSX vExpert 2016

vExpert-NSX-Badge

Its been a while since I’ve managed to post anything due to various reasons but most of them revolved around being just too busy. Anyhow I’m planning to change that with some new exciting posts over the next few weeks & months, starting with this one, which was a little overdue.

On Friday the 17th of June, I was notified by VMware that I have been selected as one of the first ever NSX vExperts (there’s only 115 in total globally). NSX vExpert program is a sub program off the general VMware vExpert program (their global evangelism and advocacy program) which has now been running for some years (I’ve been a vExpert in 2014 and again in 2016). The NSX vExpert program is however quite new, only started this year for the first time. VMware have only picked the NSX vExperts from the current pool of general vExperts and to be short listed within those 115 people is quite an honour.

As a part of the NSX vExperts program, we are entitled to a number of benefits such as NFR license keys for full NSX suite, access to NSX product management, exclusive webinars & NDA meetings, access to preview builds of the new software and also get a chance to provide feedback to the product management team on behalf of our clients which is great.

NSX is a truly great product that lets you bring the operational model of the VM to your network in order to maximise its utilisation while increasing the efficiency by many-folds and lots of customers who are looking at automation and operation in their data center are looking at NSX as the software layer to underpin all such requirements. New functionalities keep coming with each new version, and I’m sure that will keep all of us NSX vExperts quite busy for the foreseeable future.

 

VMworld Europe 2015 – Day 1 & 2 summary

The day 1 of the VMworld Europe began with the usual general session in the morning down at the hall 7.0. It was continuing the VMworld US theme of “Ready for any” during the European event too. It has become a standard for VMware to announce new products (or repeat announce new products following VMworld US) during this which, by now are somewhat public knowledge and this was no different this year. Also of special note was that they played a recorded video message from their new boss, Michael Dell (im sure everyone’s aware of the Dell’s acquisition of EMC on Monday) where he assured that VMware would remain as a publicly listed company and is a key part of the Dell-EMC enterprise.

To summarise the key message from the general session, VMware are planning to deliver 3 main messages

  • One Cloud – Seamless integration facilitated by VMware products between your private cloud / on-premise and various public clouds such as  AWS, Azure, Google…etc. Things like long distance VMotion, provided by vSphere 6,  Stretched L2 connectivity provided by NSX will make this a possibility
  • Any Application – VMware will build their SDDC product set to support containerisation of traditional (legacy client-server type) apps as well as new Cloud Native Apps going forward. Some work is already underway with the introduction of vSphere Integrated containers which I’d encourage you to have a look as well as VMware Photon platform
  • Any Device – Facilitate connectivity to any cloud / any application from Any end user device

Additional things announced also included vRealize Automation version 7.0 (urrently BETA, looks totally cool), VMware vCloud NFV platform availability for the Telco companies…etc.

Also worth mentioning that 2 large customers, Nova Media and Telefornica had their CEO’s on stage to explain how they managed to gain agility and market edge through the use to VMware’s SDDC technologies such as vSphere, NSX, vRealize Automation…etc. which was really good to see.

There were few other speakers at the general session such as Bill Fathers (about Cloud services – mainly vCloud Air) which I’m not going to mention in detail but sufficient to say that VMware’s overall product positioning and the corporate message to customers sound very catchy I think….and is very relevant to what’s going on out there too…

During the rest of the day 1, I attended a number of breakout sessions. 1st of which was the Converged Blueprints session presented by Kal De who was the VP or VMware R&D. This was based on the new vRealize Automation (version 7.0) and needless to say this was of total interest to me. So much so, straight after the event I managed to get in on the BETA programme for vRA 7.0 straight away (may be closed to public though now). Given below were some highlights from the session FYI

  • An improved, more integrated blueprint canvass where blueprints can be build through a drag and drop approach. Makes it a whole lot easier to build blueprints now.
  • Additional NSX integration to provide out of the box workflows….etc
  • Announcement of converged blueprints including IaaS, XaaS and Application services all in one blueprints…. Awesome…!!
  • Various other improvements & features….
  • Some potential (non committal of course) roadmap information also shared such as potential future ability to provision single Blueprint for multi-platform and multi-clouds, Blueprints to support container based Cloud Native Apps, Aligning vRA as a code with industry standards such as OASIS TOSCA, Open source HEAT…etc.

Afterwards, I had some time to spare, so I went to the Solutions Exchange and had a browse around at as many vendor stands as possible. Most of the key vendors were there with their usual tech, EMC (or Dell now??) and the VCE stands being the loudest (no surprise there then??). However I want to mention the following 2 new, VMware partner start-ups I came across that really caught my attention. These were both new to me and I really liked what both of them had to offer.

  • RuneCast:
    • This is a newly formed Czech start-up and basically what they do is hoover in all VMware KB articles with configuration best practises and bug fix instructions and assessing your vSphere environment components against these information to warn you of the configuration drift from the recommended state. Almost like a best practise analyser…. Best part is the cost is fairly cheap at $25 per CPU per month (list price which usually get heavily discounted)… Really simple, but a good idea made more appealing due to the low cost. Check it out…!!
  • Velvica:
    • These guys provide a billing and management platform to cloud service providers (especially small to medium scale cloud service providers) so they don’t have to build such capabilities ground up on their own. If you are a CSP, all that is required is for you to have VMware vCloud Director instance and you can simply point Velvica portal at the vCD to present a service serviceable public Cloud portal to customers. Can also be used within an organisation internally if you have a private cloud. Again, I hadn’t come across this before and I thought their offering helps many small CSP’s to get to market quicker while providing a good platform for corporate & enterprise customers to introduce utility computing internally without much initial delay or cost.

During the rest of the Day 1, I attended few more breakout sessions such as the vCenter 6.0 HA deepdive. While this was not as good a session as I had expected, I did learn few little things such as prior to vSphere 6 u1, vCenter database NOT being officially supported on SQL AAG (Always on Availability Groups), Platform Service Controller being clusterable without a load balancer (require manual failover tasks of course) as well as a tech preview of the native HA capability going to be available for vCenter (no need for vCenter heartbeat or any 3rd party products anymore) that looked pretty cool.

On day 2, there was another general session on the morning where VMware discussed the strategy and new announcement on EYUC, security & SDN…etc. with various BU leaders on stage. VMware CEO Pat Gelsinger also came on stage to discuss future direction of the organization (though I suspect most of this may be influenced by Dell if they remain a part of Dell??).

Following on from the general session on day 2, I attended a breakout session about NSX Micro Segmentation automation deep dive which was presented by 2 VMware Professional Services team members from US. This was really cool as they showed a live demo of how to create custom vRO workflows to perform NSX operations and how they can be used to automate NSX operations. While they didn’t mention this, it should be noted that these workflows can naturally be accessed from vRealize Automation where performing routine IT tasks can now be made available through a pre-configured service blueprint that users (IT staff themselves) can consume via the vRA self serviceable portal.

While I had few other breakout sessions booked for afterwards, unfortunately I was not able to attend these due to a last minute meeting I had to attend onsite at VMworld, with a customer to discuss a specific requirement they have.

I will be spending the rest of the afternoon looking at more vendor stands at Solutions Exchange until the VMware official party begins where im planning to catch up with few clients as well as some good friends…

Will provide an update if I come across any other interesting vendors from the Solutions Exchange in tomorrow’s summary

Cheers

Chan

 

 

VMware NSX Upgrade from 6.1.2 to 6.1.3

OK, I’ve had NSX 6.1.2 deployed in my lab, simulating a complex enterprise environment and the DLR (Distributed Logical Routers) kept on causing numerous issues (one of which I’ve blogged about in a previous article here). Anyways, VMware support has requested that I upgrade to the latest NSX version 6.1.3 which is supposed to have fixed most of the issues I’ve been encountering so I’ve decided to upgrade my environment and thought I’d write a blog about it too, especially since there aren’t very many precise NSX related documentation out there.

Note that despite the title of this article (upgrading 6.1.2 to 6.1.3), these steps are also applicable to upgrading from any NSX 6.0.x version to 6.1.x version.

High level steps involved

  1. First thing to do is to check for compatibility and I have vSphere 5.5 which is fully supported with NSX 6.1.3 as per the compatibility Matrix here.
  2. Then have a look at the 6.1.3 release notes to make sure you are up to date with what’s fixed (not all the fixes are listed here btw), what are the known issues…etc. That is available right here.
  3. Next thing is to download the upgrade media from the NSX / Nicira portal. If you have an active NSX subscription, you may download it from My VMware portal or else, (if you have taken the course and were deemed to be suitable to have access to the eval, you can download it from Nicira portal here. Once you’ve downloaded the Nicira upgrade media (VMware-NSX-Manager-upgrade-bundle-6.1.3-2591148.tar.gz file), you may also download the install & upgrade guide here.(Note that there don’t appear to be a separate set of documentation for 6.1.3 as it appears to be for the whole 6.1.x platform).
  4. If you are upgrading vSphere to v6.0 also at the same time, follow the steps below
    1. Upgrade NSX Manager and all NSX components to 6.1.3 in the following order.
      1. NSX Manager
      2. NSX controller(s)
      3. Clusters and Logical Switches
      4. NSX Edge
      5. Guest Introspection and Data Security
    2. Upgrade vCenter Server to 6.0.
    3. Upgrade ESXi to 6.0. – When the hosts boot up, NSX Manager pushes the ESXi 6.0 VIBs to the hosts. When the hosts show reboot required on the Hosts and Clusters tab in the left hand side of the vSphere Web Client, reboot the hosts a second time.  NSX VIBs for ESXi 6.0 are now enabled.
  5. If you are not upgrading vSphere at the same time (this is the case for me), follow the step below
    1. Upgrade NSX Manager and all NSX components to 6.1.3 in the following order.
      1. NSX Manager
      2. NSX controller(s)
      3. Clusters and Logical Switches
      4. NSX Edge
      5. Guest Introspection and Data Security

Now lets look at the actual upgrade process of NSX

NSX Upgrade steps

  1. NSX Manager Upgrade
    1. I don’t have Data security deployed. If you did, uninstall it first
    2. Snapshot the NSX Manager virtual appliance (as a precaution)
    3. Login to the NSX manager interface and go to the upgrade tab. Click on the upgrade button on top right and select the previously downloaded VMware-NSX-Manager-upgrade-bundle-6.1.3-2591148.tar.gz file & continue.          1.3
    4. Select to enable SSH and click upgrade.                                                           1.4
    5. Upgrade of the NSX Manager appliance will now start.       1.5
    6. Once the upgrade completes, the appliance will automatically reboot (this is not so obvious unless you look at the console of the appliance). Once rebooted, login to the NSX manager instance and confirm the version details (Top right).    1.6
  2. NSX controller(s) Upgrade
    1. Notes
      1. Upgrades are done at cluster level and would appear with an upgrade link in NSX manager.
      2. It’s recommended that no new VM’s are provisioned, no VMs are moved (manually or via DRS) during the controller upgrade
    2. Backup controller data by taking a snapshot
      1. If the existing NSX version is < 6.1, This has to be done using a REST API call
      2. If the existing NSX version is => 6.1, you can do this using the GUI of the web client as follows 2.2.2
    3. Login to the Web client, and go to the Networking & Security -> Installation and click on upgrade available link as shown below 2.3
    4. You’ll now see the upgrade being in progress (may ask you to reload the web client). Controller(s) will reboot during the upgrade 2.4
    5. The upgrade should complete within 30 mins normally. If not completed by then, upgrade may need to be launched using the upgrade available link again. Once successful, the upgrade status should state upgraded and the software version should be updated with the new build number. Note that if the upgrade status says failed, be sure to refresh the web client prior to assuming that its actually failed. It may state its failed due to the 30 standard timeout even though the controller upgrade has completed successfully (happened to me couple of times) 2.5
  3. Clusters and Logical Switches
    1. Go to the host preparation tab and click update                           3.1
    2. Installation and upgrade guide states that each host will automatically be put in to the maintenance mode & rebooted. If the upgrade failed with a Not ready status for each node, you may need to click resolve each time to retry (happened o me once) and it will proceed. Some host may require manual evacuation of VM’s too. Also worth noting that if you keep getting the task “DRS recommends hosts to evacuate” logged in vSphere client, you may want to temporarily disable HA so that the ESXi host can be put in maintenance mode by the NSX cluster installer (Arguably, in a well managed cluster, this should not happen).    3.2
    3. When the cluster is updated, the Installation Status column displays the software version that you have updated to.    3.3
  4. NSX Edge (Including the Distributed Logic Router instances)
    1. Go to the NSX Edges section and for each instance, select the Upgrade Versions option from the Actions menu.   4.1
    2. once the upgrade is complete, it will show the new version of the Edge device, weather that’s a DLR or an Edge gateway device.   4.2
  5. Guest Introspection and Data Security
    1. Upgrading these components are relatively straight forward and I’d encourage you to look at the install & upgrade guide…
  6. Now, create a NSX backup (You may already have routine backup scheduled on the NSX manager)
  7. If all is well, remove the snapshot created earlier, on the NSX manager appliance.

 

All in all, it was a fairly straight forward upgrade but the time it took due to somewhat finicky nature of some of the components (especially the cluster upgrade where a number of manual retries were required) unnecessarily added time. Other than that, all appears to be well in my NSX deployment.

Cheers

Chan

NSX 6.1.2 Bug – DLR interface communication issues & How to troubleshoot using net-vdr command

I have NSX 6.1.2 deployed on vSphere 5.5 and wanted to share the details of a specific bug applicable to this version of NSX that I came across, especially since it doesn’t have a relevant KB article published for it.

After setting up the NSX manager and all the host preparations including VXLAN prep, I had deployed a number of DLR instances, each with multiple internal interfaces and an uplink interface which in tern, was connected to an Edge Gateway instance for external connectivity. And I kept on getting communication issues between certain interfaces of the DLR, where by for example, the internal interfaces connected to vNIC 1 & 2 would communicate with one another (I can ping from a VM on VXLAN 1 attached to the internal interface on vNIC 1 to a VM on VXLAN 2 attached to the internal interface on vNIC 2) but none of them would talk to the internal interface 3, attached to the DLR vNic 3 or even uplink interface on vNic 0 (Cannot ping the Edge gateway). The interfaces that cannot communicate were completely random however and was persistent on multiple DLR instances deployed. All of them had one thing in common which was that no internal interface would talk to the uplink interface IP (of the Edge gateway attached to the other end of the uplink interface).

One symptom of the issue was what was described in this blog post I posed on the VMware communities page, at https://communities.vmware.com/thread/505542

Finally I had to log a call with NSX support at VMware GSS and according to their diagnosis, it turned out to be an inconsistency issue with the netcpa daemon running on the ESXi hosts and its communication with the NSX controllers. (FYI – netcpa gets deployed during the NSX host preparation stage, as a part of the User World Agents and is responsible for communication between DLR and the NSX controllers as well as the VXLAN and NSX controllers- see the diagram here)

During the troubleshooting, it transpired that some details (such as VXLAN details) were out of sync between the hosts and the controllers (different from what was shown as the VXLAN & VNI configuration in the GUI) and the temporary fix was to stop and start the netcpa daemon on each of the hosts in the compute & edge cluster ESXi nodes (commands “/etc/init.d/netcpad stop” followed up by “/etc/init.d/netcpad start” on the ESXi shell as root).

Having analysed the logs thereafter, VMware GSS confirmed that this was indeed an internally known issue with NSX 6.1.2. Their message was “This happens due to a problem in the tcp connection handshake between netcpa and the manager once the last one is rebooted (one of the ACKs is not received by netcpa, and it does not retry the connection). This fix added re-connect for every 5 seconds until the ACK is received“. Unfortunately, there’s no published KB article out (as of now) for this issue which doesn’t help much. expecially if you deploying this in a lab…etc.

This issue has (allegedly) been resolved in NSX version 6.1.3 even thought its not explicitly stated within the release notes as of yet (the support engineer I dealt with mentioned that he’s requested this to be added)

So, if you have similar issues with NSX 6.1.2 (or lower possibly), this may well be the way to go about it.

One thing I did lean during the troubleshooting process (which was probably the most important thing that came out of this whole issue for me, personally) was the understanding of the importance of the net-vdr command, which I should emphasize here though. Currently, there are no other ways to check if the agents running on the ESXi hosts have the correct configuration settings to do with NSX other than looking at it on command line…. I mean, you can force a resync of the appliance configuration or redeploy the appliances themselves using NSX GUI but that doesn’t necessarily update the ESXi agents and was of no help in my case mentioned above.

net-vdr command lets you perform a number of useful operations relating to DLR, from basic operations such as adding / deleting a new VDR (Distributed Router) instance, dumpi9ng the instance info, configure DRl settings including changing the controller details, adding / deleting DLR route details, listing all DLR instances, ARP operations such as show, add & delete ARP entries in the DLR & DLR Bridge operations and has turned out to be real handy for me to do various troubleshooting & verification operations regarding DLR settings. Unfortunately, there don’t appear to be much documentation on this command and its use, not on the VMware NSX documentation NOR within the vSphere documentation, at least as of yet, hence why I thought I’d mention it here.

Given below is an output of the commands usage,

net-vdr

So, couple of examples….

If you want to list all the DLR instances deployed and their internal names, net-vdr –instance -l will list out all the DLR instances as how ESXi sees them.

net-vdr --instance -l

If you NSX GUI says that you have 4 VNI’s 5000, 5001, 5002 & 5003 defined and you need to check whether if all 4 VNI configurations are also present on the ESXi hosts, net-vdr -L -l default+edge-XX (where XX is the unique number assigned to each of your DLR) will show you all the DLR interface configuration, as how the ESXi host sees it.

net-vdr -L l

If you want to see the ARP information within each DLR,  net-vdr –nbr -l default+edge-XX (where XX is the unique number assigned to each DLR) will show you the ARP info.

net-vdr --nbr -l default+edge-XX

 

Hope this would be of some use to those early adaptors of NSX out there….

Cheers

Chan

 

7. NSX L2 Bridging

Next Article: ->

NSX L2 bridging is used to bridge a VXLAN to a VLAN to enable direct Ethernet connectivity between VMs in a logical switch, or a distributed dvPG (or between VMs and physical devices through an uplink to the external physical network). This provides direct L2 connectivity rather than L3 routed connectivity which could have been achieved via attaching the VXLAN network on to an internal interface of the DLR and also attaching the VLAN tagged port  group to another interface to route traffic. While the use cases of this may be limited, it can be handy during P2V migrations where direct L2 access from the physical network to the VM network is required. 0 Overview Given below are some key points to note during design.

  • L2 bridging is enabled in a DLR using the “Bridging” tab (DLR is a pre-requisite)
  • Only VXLAN and VLAN bridging is supported (no VLAN & VLAN or VXLAN & VXLAN bridging)
  • All participants of the VXLAN and VLAN bridge must be in the same datacentre
  • Once configured on the DLR, the actual bridging takes place on the specific ESXi server, designated as the bridge instance (usually the host where DLR control VM runs).
  • If the ESXi host acting as the bridge instance fails, NSX controller will move the role to a different server and pushes a copy of the MAC table to the new bridge instance to keep it synchronised.
  • L2 bridging and distributed routing cannot be enabled on the same logic switch at present (meaning, the VMs attached to the logical switch cannot use the DLR as the default gateway)
  • Bridge instances are limited to the throughput of a single ESXi server.
    • Since, for each bridge, the bridging happens in a single ESXi server, all the related traffic are hair-pinned to that server
    • Therefore, if deploying multiple bridges, its better to use multiple DLR’s where the control VM’s are spread across multiple ESXi servers to get aggregate throughput from multiple bridge instances
  • VXLAN & VLAN port groups must be on the same distributed virtual switch
  • Bridging to a VLAN id of 0 is NOT supported (similar to an uplink interface not being able to be mapped to a dvPG with no VLAN tag)

 

  • Given below is an illustration of the packet flow, during an ARP request from the VXLAN to a physical device RP-VXLAN to physical]
    • 1. The ARP request from VM1 comes to the ESXi host with the IP address of a host on the physical network
    • 2. The ESXi host does not know the destination MAC address. So the ESXi host contacts NSX Controller to find the destination MAC address
    • 3. The NSX Controller instance is unaware of the MAC address. So the ESXi host sends a broadcast to the VXLAN segment 5001
    • 4. All ESXi hosts on the VXLAN segment receive the broadcast and forward it up to their virtual machines
    • 5. VM2 receives the request because it is a broadcast and disregards the frame and drops it. 6. The designated instance receives the broadcast
    • 7. The designated instance forwards the broadcast to VLAN 100 on the physical network
    • 8. The physical switch receives the broadcast on the VLAN 100 and forwards it out to all ports on VLAN 100 including the desired destination device.
    • 9. The Physical server responds

 

  • Given below is an illustration of the packet flow, during the ARP response to the above, from the physical device in the VLAN to the VM in the VXLAN    2. ARP response
    • 1. The physical host creates an ARP response for the machine. The source MAC address is the physical host’s MAC and the destination MAC is the virtual machine’s MAC address
    • 2. The physical host puts the frame on the wire
    • 3. The physical switch sends the packet out of the port where the ARP request originated
    • 4. The frame is received by the bridge instance
    • 5. The bridge instance examines the MAC address table, sends the packet to the VNI that contains the virtual machine’s MAC address, and sends the frame. The bridge instance also stores the MAC address of the physical server in the MAC address table
    • 6. The ESXi host receives the frame and stores the MAC address of the physical server in its own local MAC address table.
    • 7. The virtual machine receives the frame
  • Given below is an illustration of the packet flow, from the VM to the physical server / device, after the initial ARP request is resolved (above)  3. Unicast
    • 1. The virtual machine sends a packet destined for the physical server
    • 2. The ESXi host locates the destination MAC address in its MAC address table
    • 3. The ESXi host sends the traffic to the bridge instance
    • 4. The bridge instance receives the packet and locates the destination MAC address
    • 5. The bridge instance forwards the packet to the physical network
    • 6. The switch on the physical server receives the traffic and forwards the traffic to the physical host.
    • 7. The physical host receives the traffic.
  • Given below is an illustration of the packet flow, during an ARP request from the physical network (VLAN) to the VXLAN vm. 4. ARP from Physical#
    • 1. An ARP request is received from the physical server on the VLAN that is destined for a virtual machine on the VXLAN through broadcast
    • 2. The frame is sent to the physical switch where it is forwarded to all ports on VLAN 100
    • 3. The ESXi host receives the frame and passes it up to the bridge instance
    • 4. The bridge instance receives the frame and looks up the destination IP address in its MAC address table
    • 5. Because the bridge instance does not know the destination MAC address, it sends a broadcast on VXLAN 5001 to resolve the MAC address
    • 6. All ESXi hosts on the VXLAN receive the broadcast and forward the frame to their virtual machines
    • 7. VM2 drops the frame, but VM1 sends an ARP response

Deployment of the L2 bridge is pretty easy and given below are the high level steps involved (Unfortunately, I cannot provide screenshots due to the known bug on NSX for vSphere 6.1.x as per documented in the VMware KB article 2099414).

 

Prerequisites

  • An NSX logical router must be deployed in the environment.

 

High level deployment steps involved

  1. Log in to the vSphere Web Client.
  2. Click Networking & Security and then click NSX Edges.
  3. Double-click an NSX Edge.
  4. Click Manage and then click Bridging.
  5. Click the Add icon.
  6. Type a name for the bridge.
  7. Select the logical switch that you want to create a bridge for.
  8. Select the distributed virtual port group that you want to bridge the logical switch to.
  9. Click OK.

Hope these make sense. In the next post of the series, we’ll look at other NSX edge services gateway functions available.

The slide credit goes to VMware….!!

Thanks

Chan

 

Next Article: ->

6. NSX Distributed Logical Router

Next: 7. NSX L2 Bridging ->

In the previous article of this VMware NSX tutorial, we looked at the VXLAN & Logical switch deployment within NSX. Its now time to look at another key component of NSX, which is the Distributed Logical Router, also known as DLR.

VMware NSX provide the ability to do traffic routing (between 2 different L2 segments for example) within the hypervisor without ever having to send the packet out to a physical router. (It has always been the case with traditional vSphere solutions where, for example, if the application server VM in vlan 101 need to talk to the DB server VM in vlan 102, the packet need to go out of the vlan 101 tagged port group via the uplink ports to the L3 enabled physical switch which will perform the routing and send the packet back to the vlan 102 tagged portgroup, even if both VM’s reside on the same ESXi server). This new ability to route within the hypervisor is made available as follows

  • East-West routing = Using the Distributed Logical Router (DLR)
  • North-South routing = NSX Edge Gateway device

This post aim to summarise the DLR architecture, key points and its deployment.

DLR Architectural Highlights

  • DLR data plane capability is provided to each ESXi host during the host preparation stage, through the deployment of the DLR VIB component (as explained in a previous article of this topic)
  • DLR control plane capability is provided through a dedicated control virtual machine, deployed during the DLR configuration process, by the NSX manager. 0. Architecture
    • This control VM may reside on any ESXi server host on the computer & edge cluster.
    • Since the separation of data and control planes, a failure of the control VM doesn’t affect routing operations (no new routes are learnt during the unavailability however)
    • Control VM doesn’t perform any routing
    • Can be deployed in high availability mode (active VM and a passive control VM)
    • Its main function is to establish routing protocol sessions with other routers
    • Supports OSPF and BGP routing protocols for dynamic route learning (within the virtual as well as physical routing infrastructure) as well as static routes which you can manually configure.

 

  • Management, Control and Data path communication looks like below

1. mgmt & control & datapath

  • Logical Interfaces (LIF)
    • A single DLR instance can have a number of logical interfaces (LIF) similar to a physical router having multiple interfaces.
    • A DLR can have up to 1000 LIFs
    • A LIF connects to Logical Switches (VXLAN virtual wire) or distributed portgroups (tagged with a vlan)
      • VXLAN LIF – Connected to a NSX logical switch
        • A virtual MAC address (vMAC) assigned to the LIF is used by all the VMs that connect to that LIF as their default gateway MAC address, across all the hosts in the cluster.
      • VLAN LIF – Connected to a distributed portgroup with one or more vlans (note that you CANNOT connect to a dvPortGroup with no vlan tag or vlan id 0)
        • A physical MAC address (pMAC) assigned to an uplink through which the traffic flows to the physical network is used by the VLAN LIF.
        • Each ESXi host will maintain a pMAC for the VLAN LIF at any point in time, but only one host responds to ARP requests for the VLAN LIF and this host is called the designated host
          • Designated host is chosen by the NSX controller
          • All incoming traffic (from the physical world) to the VLAN LIF is received by the designated instance
          • All outgoing traffic from the VLAN LIF (to the physical world) is sent directly from the originating ESXi server rather than through the designated host.
        • One designated instance (an ESXi server the LIF is used by all the VMs that connect to that LIF as their default gateway MAC address, across all the hosts in the cluster.
    • The LIF configuration is distributed to each host
    • An ARP table is maintained per each LIF
    • DLR can route between 2 VXLAN LIFs (web01 VM on VNI 5001 on esxi01 server talking to app01 VM on VNI 5002 on the same or different ESXi hosts) or between physical subnets / VLAN LIFs (web01 VM on VLAN 101 on esxi01 server talking to app01 VM on VLAN 102 on the same or different ESXi hosts)

 

  • DLR deployment scenarios
    • One tier deployment with an NSX edge 2. One tier deployment
    • Two tier deployment with an NSX edge  3. two tier routing

 

  • DLR traffic flows – Same host     4. Traffic Flow same host
    • 1. VM1 on VXLAN 5001 attempts to communicate with VM2 on VXLAN 5002 on the same host
    • 2. VM1 sends a frame with L3 IP on the payload to its default gateway. The default gateway uses the destination IP to determine that it is directly connected to that subnet.
    • 3. the default gateway checks its ARP table and sees the correct MAC for that destination
    • 4. VM2 is running on the same host. Default gateway passes the frame to VM2, packet never leaves the host.

 

  • DLR traffic flow – Different hosts    5. Traffic flow - different hosts
    • 1. VM1 on VXLAN 5001 attempts to communicate with VM2 on VXLAN 5002 on a different host. Since the VM2 is on a different host, VM1 sends the frame to the default gateway
    • 2. The default gateway sends the traffic to the router and the router determines that the destination IP is on a directly connected interface
    • 3. The router checks its ARP table to obtain the MAC adders of the destination VM but the MAC is not listed. The router sends the frame to the logical switch for VXLAN 5002.
    • 4. The source and destination MAC addresses on the internal frame are changed. So the destination MAC is the address for the VM2 and the source MAC is the vMAC LIF for that subnet. The logical switch in the source host determines that the destination is on host 2
    • 5. The logical switch puts the Ethernet frame in a VXLAN frame and sends the frame to host 2
    • 6. Host 2 takes out the L2 frame, looks at the destination MAC and delivers it to the destination VM.

DLR Deployment Steps

Given below are the key deployment steps.

 

  1. Go to vSphere Web client -> Networking & Security and click on the NSX Edges on the left had pane. Click the plus sign at the top and select Logical Router and provide a name & click next.0
  2. Now provide the CLI (SSH) username and password. Note that the password here need to be of a minimum of 12 digits and must include a special character. Click on enable SSH to be able to putty on to it (note that you also need to enable the appropriate firewall rules later without which SSH wont work. Enabling HA will deploy a second VM as a standby. Click next. 1. CLI
  3. Select the cluster location and the datastore where the NSX Edge appliances (provides the DLR capabilities) will be deployed. Note that both appliances will be deployed on the same datastore and if you require that they be deployed in different datastores for HA purposes, you’d need to svmotion one to a different datastore manually. 2. Edge location
  4. In the next screen, you configure the interfaces (LIF – explained above). There are 2 main different types of interfaces here. Management interface is used to manage the NSX edge device (i.e. SSH on to) and is usually mapped to the management network. All other interfaces are mapped to either VXLAN networks or VLAN backed portgroups. First, we create the management interface by mapping / connecting the management interface to the management distributed port group and providing an IP address on the management subnet for this interface. 3. Management interface
  5. And then, click the plus sign at the bottom to create & configure other interfaces used to connect to VXLAN networks  or VLAN tagged port groups. These interfaces fall in to 2 parts. Uplink interface and internal interfaces. Uplink interface would be to connect the DLR /Edge device to an external network and often this would be connected to one of the “VM Network” portgroups to connect the internal interfaces to the outside world. Internal interfaces are typically mapped to NSX virtual wires (VXLAN networks) or a dvPortGroup. Below, we create 2 interfaces and map them to the 2 VXLAN networks called App-Tier and Web-Tier (created in a previous post of this series during the VXLAN & Logical switch deployment). For each interface you create, an interface subnet must be specified along with an ip address for the interface. Often, this would be the default gateway IP address to all the VM’s belonging to the VXLAN network / dvPortGroup mapped to that interface). Below we create 3 different interfaces
    1. Uplink Interface – This uplink interface would map to the “VM network” dvPortGroup and would provide external connectivity from the internal interfaces of the DLR to the outside world. It will have an IP address from the external network IP subnet and is reachable from the outside work using this IP (after the appropriate firewall rules). Note that this dvPG need to have a vlan tag other than 0 (a VLAN ID must be defined on the connected portgroup) 5.1 Interface Uplink
    2. We then create 2 internal interfaces, one for the Web-tier (192.168.2.0/24) and another for the App-Tier (192.168.1.0/24). The interface IP would be the default gateway for the VMs.4. Interface Interface 1 5. Interface Interface 2
  6. Once all the 3 interfaces are configured, verify settings and click next. 8. Config Summary
  7. Next screen allows you to create a default gateway and had the Uplink interface been correctly configured, this uplink interface would need to be selected as the vNIC and gateway IP would have been the default gateway of the external network. In my example below, I’m not configuring this as I do not need my VM traffic (configured on VXLAN network) to go to outside world. 7. Default GW
  8. In the final screen, review all settings and click finish for the NSX DLR (edge devices) to be deployed as appliances. These would be the control VM’s referred to earlier in the post. 8. Config Summary
  9. Once the apliances have been deployed on to the vSphere cluster (compute & Edge clusters), you can see the Edge devices under the NSX Edges section as shown below 9. DLR deployed
  10. You can double click on the edge device to go the configuration details as shown below 9.1
  11. You can make further configurations here including adding additional interfaces or removing existing interfaces…etc. 9.2
  12. By default, all NSX edge devices contain a built in firewall which bocks all traffic due to a global deny rule. If you need to be able to ping the management address / external uplink interface address or putty in to the management IP from the outside network, you’d need to enable the appropriate firewall rules within the firewall section of the DLR. Example rules shown below. 10. Firewall rules
  13. That is it. You now have a DLR with 3 interfaces deployed as follows
    1. Uplink interface – Connected to the external network using a VLAN tagged dvPortGroup
    2. Web-Interface (internal) – An internal interface connected to a VXLAN network (virtual wire) where all the Web server VMs on IP subnet 192.168.2.0/24 resides. The interface IP of 192.168.2.1 is set as the default gateway for all the VMs on this network
    3. App-Interface (internal) – An internal interface connected to a VXLAN network (virtual wire) where all the App server VMs on IP subnet 192.168.1.0/24 resides. The interface IP of 192.168.1.1 is set as the default gateway for all the VMs on this network
  14. App VM’s and Web VMs could not communicate with each other before, as there was no way of being able to route between the 2 networks. Once the DLR has been deployed and connected to the interfaces as listed above, each VM can now talk to the other from the other subnet.

 

That’s is it and its as simple as that. Now obviously you can configure these DLR’s to have dynamic routing via OSPF or BGP…etc should you deploy these in an enterprise network with external connectivity which I’m not going to go in to but the above should give you a high level, decent understanding of how to deploy the DLR and get things going to begin with.

In the next post, we’d look at Layer 2 bridging.

Slide credit goes to VMware…!!

Cheers

Chan

Next: 7. NSX L2 Bridging ->

 

5. VXLAN & Logical Switch Deployment

Next: 6. NSX Distributed Logical Router ->

Ok, this is the part 5 of the series where we are looking at the VXLAN & Logical switch configuration

  • VXLAN Architecture – Key Points

    • VXLAN – Virtual Extensible LAN
      • A formal definition can be found here but put it simply, its an extensible overlay network that can deploy a L2 network on top of an existing L3 network fabric though encapsulating a L2 frame inside a UDP packet and transfer over an underlying transport network which could be another L2 network or even across L3 boundaries. Kind of similar to Cisco OVT or even Microsoft NVGRE for example. But I’m sure many folks are aware of what VXLAN is and what and where its used for already.
      • VXLAN encapsulation adds 50 bytes to the original frame if no VLAN’s used or 54 bytes if the VXLAN endpoint is on a VLAN tagged transport network   0.1 VXLAN frame
      • Within VMware NSX, this is the primary (and only) IP overlay technology that will be used to achieve L2 adjacency within the virtual network
      • A minimum MTU of 1600 is required to be configured end to end, in the underlying transport network as the VXLAN traffic sent between VTEPs (do not support fragmentation) – You can have MTU 9000 too
      • VXLAN traffic can be sent between VTEPs (below) in 3 different modes
        • Unicast – Default option that is supported with vSphere 5.5 and above. This has a slightly more overhead on the VTEPs 0.2 VXLAN Unicast
        • Multicast – Supported with ESXi 5.0 and above. Relies in the Multicasting being fully configured on the transport network with L2-IGMP and L3-PIM 0.3 VXLAN Multicast
        • Hybrid – Unicast for remote traffic and Multicast for local segment traffic.0.4 VXLAN Hybrid
      • Within NSX, VXLAN is an overlay network only between ESXi hosts and VM’s have no information of the underlying VXLAN fabric.

     

    • VNI – VXLAN Network Identifier (similar to a VLAN ID)
      • Each VXLAN network identified by a unique VNI is an isolated logical network
      • Its a 24 bit number that gets added to the VXLAN frame which allows a theoretical limit of 16 million separate networks (but note that in NSX version 6.0, the only supported limit is 20,000 NOT 16 million as VMware marketing may have you believe)
      • The VNI uniquely identifies the segment that the inner Ethernet frame belongs to
      • VMware NSX VNI range starts from 5000-16777216

     

    • VTEP – VXLAN Tunnel End Point
      • VTEP is the end point that is responsible for encapsulating the L2 Ethernet frame in a VXLAN header and forward that on to the transport network as well as the reversal of that process (receive an incoming VXLAN frame from the transport network, strip off everything and just forward on the original L2 Ethernet frame to the virtual network)
      • Within NSX, a VTEP is essentially a VMkernal port group that gets created on each ESXi server automatically when you prepare the clusters for VXLAN (which we will do later on)
      • A VTEP proxy is a VTEP (specific VMkernal port group in a remote ESXi server) that receive VXLAN traffic from a remote VTEP and then forward that on to its local VTEPs (in the local subnet). The VTEP proxy is selected by the NSX controller and is per VNI.
        • In Unicast mode – this proxy is called a UTEP
        • In Multicast or Hybrid mode – this proxy is called MTEP

     

    • Transport Zone
      • Transport zone is a configurable boundary for a given VNI (VXLAN network segment)
      • Its like a container that house NSX Logical Switches created along with their details that present them to all hosts (across all clusters) that are configured to be a part of that transport zone (if you want to restrict certain hosts from seeing certain Logical switches, you’d have to configure multiple Transport Zones)
      • Typically, a single Transport Zone across all you vSphere clusters (managed by a single vCenter) is sufficient.
  •  Logical Switch Architecture – Key Points

    • Logical Switch within NSX is a virtual network segment which is represented on vSphere via a distributed port group tagged with a unique VNI on a distributed switch.
    • Logical Switch can also span distributed switch by associating with a port group in each distributed switch.
    • VMotion is supported amongst the hosts that are part of the same vDS.
    • This distributed port group is automatically created when you add a Logical Switch, on all the VTEPs (ESXi hosts) that are part of the same underlying Transport Zone.
    • A VM’s vNIC then connects to each Logic Switch as appropriate.

 

 VXLAN Network Preparation

  1. First step involved is to prepare the hosts. (Host Preparation)
    1. Launch vSphere web client, go to Netwrking & Security -> Installation (left) and click on Host preparation tab 1. Host prep 1
    2.  Select the Compute & Edge clusters where the NSX controllers & compute & edge VM’s reside where you need to enable VXLAN and click install. 1. Host prep 2
    3. While the installation is taking place, you can monitor the progress of vSphere web / c# client 1. Host prep 3
    4. During this host preparation, 3 VIB’s will be installed on the ESXi servers (as mentioned in the previous post of this series) & you can notice this in vSphere client.   1. Host prep 4
    5. Once complete, it will be shown as Ready under the installation status as follows 1. Host prep 5
  2. Configure the VXLAN networking
    1. Click configure within the same window as above (Host Preparation Tab) under VXLAN. This is where you configure the VTEP settings including the MTU size and if the VTEP connectivity is happening through a dedicated VLAN within the underlying transport network.  Presumably this is going to be common as you still have your normal physical network for all other things such as VMotion network (vlan X), storage network (vlan Y)…etc.
      1. In my example I’ve configured a dedicated VXLAN VLAN in my underlying switch with the MTU size set to 9000 (this could have been slightly more than 1600 for VXLAN to work).
      2. In a corporate / enterprise network, ensure that this vlan has the correct MTU size specified and also in any other vlans that the remote VTEP are tagged with. The communication between all VTEPs across vlans need to have at least MTU 1600 end to end.
      3. You also create an IP pool for the VTEP VMkernal port group to use on each ESXi host. Ensure that there’s sufficient capacity for all the ESXi hosts. 2. Co0nfigure VXLAN networking
    2. Once you click ok, you can see on vSphere client / web client the creation of the VTEP VMkernal portgroup with an IP assigned from the VTEP Pool defined along with the appropriate MTU size (Note that I’ve only set MTU 1600 in the previous step but it seems to have worked out that my underlying vDS and the physical network is set to MTU 9000 and used that here automatically). 3. VXLAN networking prep check
    3. Once complete, VXLAN will appear as complete under the host preparation tab as follows 4. VXLAN Enabled
    4. In Logical Network Preparation tab, you’ll notice that the VTEP vlan and the MTU size with all the VMkernal IP assignment for that VXLAN transport network as show below 5. VXLAN Transport enabled
  3. Create a segment ID (VNI pool) – Go to Segment ID and provide a range for a pool of VNI’s 6. VXLAN Segment ID pool (VNI)
  4. No go to the Transport Zone tab and create a Global transport zone.  7. VXLAN Global Transport Zone
  5. The logical switch is created using the section on the left and create a logical switch. Provide a name and the transport zone and select the Multicast / Unicast / Hybrid mode as appropriate (my example uses Unicast mode).
    1. Enable IP discovery: Enables the ARP suppression available within NSX. ARP traffic is generated as a broadcast in a network when the destination IP is known but not the MAC. However within NSX, the NSX controller maintains and ARP table which negates the use of ARP broadcast traffic.
    2. Enabled MAC learning: Useful if the VM’s have multiple MAC addresses or using vmnics with trunking. Enabling MAC Learning builds a VLAN/MAC pair learning table on each vNic. This table is stored as part of the dvfilter data. During vMotion, dvfilter saves and restores the table at the new location. The switch then issues RARPs for all the VLAN/MAC entries in the table.  8. Add logical Switch
  6. You can verify the creation of the associated port group by looking at the vSphere client. 10. Check the port group creaton
  7. In order to confirm that your VTEPs (behind the logical switches) are fully configured and can communicate with one another, you can double click the logical switch created, go to monitor and do a ping test using 2 ESXi servers. Note that the switch ports where the uplink NIC adaptors plugged in to need to be configured for appropriate MTU size (as shown)10.1 Verify the VTEP communication
  8. You can create multiple Logical Switches as required. Once created, select the logical switch and using Actions menu, select add VM to migrate a VM’s networking connectivity to the Logical switch.  11. Add VM step 1 12. Add VM step 2 13. Add VM step 3

 

There you have it. your NSX environment now has Logical switches that all your existing and new VM’s should be connecting to instead of standard or distributed switches.

As it stands now, these logical networks are somewhat unusable as they are isolated bubbles and the traffic cannot go outside of these networks. Following posts will look at introducing NSX routing using DLR – Distributed Logical Routers to route between different NXLAN networks (Logical switches) and introducing Layer 2 bridging to enable traffic within the Logical Switch network to communicate with the outside world.

Thanks

Chan

 

Next: 6. NSX Distributed Logical Router ->

4. NSX Controller Architecture & Deployment

Next: 5. VXLAN & Logical Switches ->

In the previous step of this series of NSX posts, we looked at the NSX Manager and its deployment. In this article, we are going to have a quick look at the NSX Controller architecture at a high level and how to deploy them.

  • NSX Controller Architecture – Key points

    • They provide
      • VXLAN distribution & Distributed Logical Router (DLR) workload handling & providing information to ESXi hosts.
      • Workload distribution through slicing dynamically amongst all controllers
      • Removal of multicast
      • ARP broadcast traffic suppression in VXLAN networks
    • They store
      • ARP Table:          VM ARP requests for a MAC are intercepted by the hosts and sent to NSX controllers. If the NSX controllers has the ARP, it’s returned to the host that then replies to the VM locally resulting in no ARP broadcast.
      • VTEP table
      • MAC table
      • Routing table:    Routing tables are obtained from the DRL control VM
    • Cluster of 3 NSX controllers is always recommended to avoid a split brain scenario
    • 4 VCU & 4GB RAM per each controller
    • Should be deployed on the vCenter linked to NSX manager (meaning, on the compute or service & edge cluster, NOT the management cluster)
    • User interaction with NSX controllers is through CLI
    • Control plane communication is secured by SSL certificates
  • NSX Manager interaction with NSX Controller

    • NSX mgr and vCenter systems are linked 1:1
    • Install UWA, and few kernel modules (VXLAN, DLR VIB, DFW VIB) on the ESXi servers of the clusters managed by the linked vCenter server during the host preparation stage                                                                                                  
      • UWA=User World Agent
        • Run as a service daemon called netcpa (/etc/init.d/netcpad status)
        • Mediates between NSX controller and hypervisor kernel module communication except for DFW
        • Maintains logs at /var/log/netcpa.log on the ESXi host of the compute & edge clusters
      • Kernel modules
        • Distributed Firewall VIB: Communicate directly with NSX Manager through vsfwd service running on the host
        • Distributed Logical Router VIB: Communicate with NSX controllers through UWA
        • VXLAN VIB: Communicate with NSX controllers through UWA1.6. UWA

     

    • NSX Manager also configures the NSX controller nodes through the REST API                                                                1.3. Controller high level

     

    • For each NSX role (such as VXLAN, Logical routers….etc) a master controller is required
    • Uses slicing as a way to divide NSX controller workload in to different slices and allocate to each controller (controlled by the master) 1.5. Slicing

     

    • Highlighted below in the diagram are the typical communication channels between NSX controllers and other NSX components.0. NSX mgr communication

 

NSX Controller Deployment

Deploying the NSX controllers (3 recommended as stated above) is fairly straight forward

  1. Launch the vSphere Web client (for the compute or edge cluster, NOT the management cluster vCenter server) and select Networking and Security – note that you need to have logged in to vSphere web client as a NSX enterprise admin user (how to set up rights was covered in the previous post of this series)
  2. Select Installation from the left pane
  3. At the bottom, under NSX controller nodes section, select the plus sign to add the first NSX controller node and provide all the information requested in the next screen. Note the below
    1. Connected to: You need to select the management network port group here
    2. IP Pool:  Need an IP pool of at least 3 (for 3 NSX controllers)
    3. Password: NSX controller CLI password specified here. All subsequent controller nodes deployed will use the same password.   3. Add controller wizard 4. Add NSX-Controller-Pool
  4.  Once complete, click OK and you can see the first controller is being deployed                                            6. 1st NSC COntroller deployment
  5.  Once deployed, you can putty in to the CLI using the IP (first IP of the pool you specified above) and verify the control cluster status 6.1 Show control-cluster status
  6. Now, follow the same steps and deploy the 2nd and 3rd NSX Controller nodes too and verify the CLI access 7. 2nd & 3rd Controller node deployment8. Deploy all 3 controller nodes

 

That’s it, you now have your NSX controller clusters fully deployed and configured.

In the next post of the series, we will look at Logical switches and VXLAN overlays..

Next: VXLAN & Logical Switches ->

Cheers

Chan