From Nothing to Data Center with VXLAN / EVPN or How to Cook Cumulus Linux. Part 2

Hello. This is the continuation of the first part. As promised, in this article I want to go over the main options for building a VXLAN/EVPN fabric and explain why we chose one solution or another for our data center.

Choosing an Underlay design

Foreword

The first thing to decide when building a fabric is the Underlay design: how do we want to build our VXLAN tunnels (or rather, how do we organize VTEP discovery)?

We have 3 options:

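1. Static: every VTEP is specified by hand, which is comparable to configuring RSVP-TE tunnels manually. It fits setups that are driven by SDN.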

2. Multicast: we did not go this way; Juniper, for example, describes it as resource-intensive.

3. BGP: EVPN itself runs over BGP, so why not use BGP for the underlay as well? It can be either iBGP or eBGP.

iBGP

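iBGP requires an IGP such as OSPF or IS-IS, since the sessions are built between Loopback addresses and something has to make those addresses reachable. iBGP also requires a full mesh of sessions between all participants.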

The Spine switches can act as Route-Reflectors.

2 full-mesh RR. , , , .
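
To put numbers on the full-mesh problem, here is a minimal sketch that counts iBGP sessions with and without Route-Reflectors. The fabric size (20 Leafs, 2 Spines) is a made-up example rather than our real topology, and the function names are purely illustrative.

```python
def full_mesh_sessions(routers: int) -> int:
    """iBGP sessions needed when every router peers with every other one."""
    return routers * (routers - 1) // 2


def route_reflector_sessions(clients: int, reflectors: int) -> int:
    """Sessions needed when each client peers only with the reflectors.

    The reflectors are assumed to be fully meshed among themselves.
    """
    return clients * reflectors + full_mesh_sessions(reflectors)


if __name__ == "__main__":
    # Hypothetical fabric: 2 Spines acting as RRs, 20 Leafs as clients.
    print(full_mesh_sessions(22))           # 231
    print(route_reflector_sessions(20, 2))  # 41
```

The full-mesh count grows quadratically with the number of switches, while the Route-Reflector count grows linearly.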

eBGP

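With eBGP there is no need for a Route-Reflector or for an IGP, since the sessions are established directly over the p2p links. MLAG needs separate attention: both switches of an MLAG pair use the same AS, because for VXLAN/EVPN the pair looks like a single VTEP. An iBGP session runs over the peerlink, so that a switch that loses its links to the Spines can still reach the fabric through its neighbor.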

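Picking AS numbers by hand is tedious. Cumulus 4.2 can assign the AS automatically: the Leaf AS is derived from the switch MAC address and taken from the 32-bit private AS range.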

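All Spine switches use the same AS. A Spine connects only to Leaf switches, so a route never legitimately passes through two Spines, and with a shared AS the AS-path check drops such paths. Cumulus follows the same approach and always assigns one AS to every Spine.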

P.S. AS , AS . 32-bit AS .

BGP Unnumbered

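To avoid assigning addresses to every p2p link and keep addresses on loopbacks only, there is BGP Unnumbered. Both Cumulus and Cisco use the term, though the implementations differ. Cumulus uses the IPv6 link-local address of the interface to establish the eBGP session. This relies on the extended-nexthop capability (in other words, the IPv4 family is advertised with an IPv6 next hop).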

P.S. - IPv4 /30 /31 , BGP . broadcast , p2p.
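
The link-local addresses that Unnumbered sessions use do not have to be configured by hand. As a sketch, assuming the interface derives its link-local address from the MAC with the classic EUI-64 method (a common Linux default, but not the only possible mode), the address can be computed like this. The MAC below is illustrative: it was picked so that the result matches a link-local address seen in the BFD output later in this article.

```python
import ipaddress


def eui64_link_local(mac: str) -> ipaddress.IPv6Address:
    """Build an fe80::/64 address from a MAC using modified EUI-64."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError(f"not a 48-bit MAC address: {mac}")
    octets[0] ^= 0x02  # flip the universal/local bit
    # Insert ff:fe in the middle to get the 64-bit interface identifier.
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    return ipaddress.IPv6Address(b"\xfe\x80" + bytes(6) + interface_id)


if __name__ == "__main__":
    print(eui64_link_local("1c:34:da:a6:b5:3d"))  # fe80::1e34:daff:fea6:b53d
```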

BGP+BFD

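By default BGP reacts to failures too slowly, which is unacceptable for traffic such as iSCSI or VSAN. This is what BFD is for. Cumulus supports timers down to 2x50 ms, Cisco down to 2x33 ms.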
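
As a rough sanity check of what the `bfd 3 50 50` setting used in the configuration in this article means in practice, the sketch below computes the detection time for BFD asynchronous mode as described in RFC 5880: the detect multiplier received from the remote side times the larger of the local receive interval and the remote transmit interval. It assumes both ends are configured with the same values; the function name is illustrative.

```python
def bfd_detection_time_ms(remote_detect_mult: int,
                          local_min_rx_ms: int,
                          remote_min_tx_ms: int) -> int:
    """Time without received BFD packets after which the session is declared down."""
    negotiated_interval = max(local_min_rx_ms, remote_min_tx_ms)
    return remote_detect_mult * negotiated_interval


if __name__ == "__main__":
    # Both ends configured with multiplier 3 and 50 ms rx/tx intervals.
    print(bfd_detection_time_ms(3, 50, 50))  # 150
```

With these values a dead neighbor is detected in about 150 ms, compared with the seconds to minutes that BGP hold timers need.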

What did we choose?

1. eBGP with automatic AS assignment + iBGP inside each MLAG pair

2. Addresses on loopbacks only, the links are Unnumbered

3. BFD

1. Automatic AS in Cumulus

# Cumulus can assign the AS automatically
# P.S. All Spine switches get the same AS, so loops are cut off by the AS-path check
cumulus@Switch1:mgmt:~$ net add bgp autonomous-system
    <1-4294967295>  :  An integer from 1 to 4294967295
    leaf            :  Auto configure a leaf ASN in the 4-byte private range 4200000000 - 4294967294 based on the switch
                       MAC
    spine           :  Auto configure a spine AS-number in the 4-byte private ASN range. The value 4200000000 is always
                       used

# Apply "net add bgp autonomous-system leaf"
cumulus@Switch1:mgmt:~$ net add bgp autonomous-system leaf
cumulus@Switch1:mgmt:~$ net pending
+router bgp 4252968529 # BGP process with the automatically assigned AS
+end
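
The help text above says the automatic AS comes from the 4-byte private range. As a small sketch (the helper name is illustrative), this is how one could check that an AS number, whether generated or chosen by hand, really falls into one of the private ranges reserved by RFC 6996:

```python
# Private-use AS numbers according to RFC 6996 (both ends inclusive).
PRIVATE_ASN_2BYTE = range(64_512, 65_534 + 1)
PRIVATE_ASN_4BYTE = range(4_200_000_000, 4_294_967_294 + 1)


def is_private_asn(asn: int) -> bool:
    """True if the AS number is reserved for private use."""
    return asn in PRIVATE_ASN_2BYTE or asn in PRIVATE_ASN_4BYTE


if __name__ == "__main__":
    print(is_private_asn(4252968529))  # True, the AS from "net pending" above
    print(is_private_asn(35083))       # False, a public AS
```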

2. BGP+Unnumbered

#loopback
net add loopback lo clag vxlan-anycast-ip 10.223.250.30 # shared anycast IP of the MLAG pair
net add loopback lo ip address 10.223.250.1/32

#AS+Router ID
net add bgp autonomous-system leaf
net add bgp router-id 10.223.250.1

# Common neighbor settings go into a peer-group
net add bgp neighbor fabric peer-group # create the peer-group
net add bgp neighbor fabric remote-as external # neighbors in the group are eBGP
net add bgp neighbor fabric bfd 3 50 50 # BFD: multiplier 3, 50 ms intervals
net add bgp neighbor fabric capability extended-nexthop # IPv4 over IPv6

# Neighbors on interfaces (Unnumbered)
net add bgp neighbor swp2 interface peer-group fabric
net add bgp neighbor peerlink.4094 interface remote-as internal # iBGP over the Peerlink
net add bgp ipv4 unicast neighbor peerlink.4094 next-hop-self # set ourselves as next-hop towards the iBGP peer

net add bgp ipv4 unicast redistribute connected # advertise connected networks into BGP

#EVPN
net add bgp l2vpn evpn neighbor fabric activate # enable the family for the eBGP neighbors
net add bgp l2vpn evpn neighbor peerlink.4094 activate # enable the family for the iBGP neighbor
net add bgp l2vpn evpn advertise-all-vni # advertise all local VNIs via BGP

3. BGP Summary

cumulus@Switch1:mgmt:~$ net show bgp summary
#IPv4
show bgp ipv4 unicast summary
=============================
BGP router identifier 10.223.250.1, local AS number 4252968145 vrf-id 0
BGP table version 84
RIB entries 29, using 5568 bytes of memory
Peers 3, using 64 KiB of memory
Peer groups 1, using 64 bytes of memory

Neighbor                       V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd
Switch3(swp2)              4 4252424337    504396    484776        0    0    0 02w0d23h            3
Switch4(swp49)             4 4208128255    458840    485146        0    0    0 3d12h03m            9
Switch2(peerlink.4094) 4 4252968145    460895    456318        0    0    0 02w0d23h           14

Total number of neighbors 3

#EVPN
show bgp l2vpn evpn summary
===========================
BGP router identifier 10.223.250.1, local AS number 4252968145 vrf-id 0
BGP table version 0
RIB entries 243, using 46 KiB of memory
Peers 3, using 64 KiB of memory
Peer groups 1, using 64 bytes of memory

Neighbor                       V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd
Switch3(swp2)              4 4252424337    504396    484776        0    0    0 02w0d23h          237
Switch4(swp49)             4 4208128255    458840    485146        0    0    0 3d12h03m          563
Switch2(peerlink.4094) 4 4252968145    460895    456318        0    0    0 02w0d23h          807

Total number of neighbors 3
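
In a summary like the one above, an established session shows a prefix count in the last column, while a session that is down shows a state name instead. The following sketch scans such output for neighbors that are not established. It assumes the column layout shown above; the `Switch9(swp3)` row is an invented example of a failed session and does not come from our fabric.

```python
def peers_not_established(summary: str) -> list[str]:
    """Return neighbors whose State/PfxRcd column is not a prefix count."""
    down = []
    for line in summary.splitlines():
        fields = line.split()
        # Neighbor rows have at least 10 columns, with BGP version 4 in the second one.
        if len(fields) < 10 or fields[1] != "4":
            continue
        state = " ".join(fields[9:])
        if not state.isdigit():
            down.append(fields[0])
    return down


SAMPLE = """\
Neighbor                       V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd
Switch3(swp2)              4 4252424337    504396    484776        0    0    0 02w0d23h            3
Switch4(swp49)             4 4208128255    458840    485146        0    0    0 3d12h03m            9
Switch9(swp3)              4 4252000001         0         0        0    0    0    never         Idle
"""

if __name__ == "__main__":
    print(peers_not_established(SAMPLE))  # ['Switch9(swp3)']
```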

4. net show route

cumulus@Switch1:mgmt:~$ net show route
show ip route
=============
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued route, r - rejected route
# IPv4 prefixes are reachable via IPv6 link-local next hops (Cumulus also prints the weight)
C>* 10.223.250.1/32 is directly connected, lo, 02w0d23h #Loopback
B>* 10.223.250.2/32 [200/0] via fe80::ba59:9fff:fe70:e50, peerlink.4094, weight 1, 02w0d23h
B>* 10.223.250.6/32 [20/0] via fe80::ba59:9fff:fe70:e5c, swp49, weight 1, 3d12h19m
B>* 10.223.250.7/32 [20/0] via fe80::ba59:9fff:fe70:e5c, swp49, weight 1, 3d12h19m
B>* 10.223.250.9/32 [20/0] via fe80::ba59:9fff:fe70:e5c, swp49, weight 1, 3d12h19m
C>* 10.223.250.30/32 is directly connected, lo, 02w0d23h #MLAG Loopback
B>* 10.223.250.101/32 [20/0] via fe80::1e34:daff:fe9e:67ec, swp2, weight 1, 02w0d23h
B>* 10.223.250.102/32 [20/0] via fe80::1e34:daff:fe9e:67ec, swp2, weight 1, 02w0d23h
B>* 10.223.250.103/32 [20/0] via fe80::1e34:daff:fe9e:67ec, swp2, weight 1, 02w0d23h
B>* 10.223.252.11/32 [200/0] via fe80::ba59:9fff:fe70:e50, peerlink.4094, weight 1, 02w0d06h
B>* 10.223.252.12/32 [200/0] via fe80::ba59:9fff:fe70:e50, peerlink.4094, weight 1, 02w0d06h
B>* 10.223.252.20/32 [200/0] via fe80::ba59:9fff:fe70:e50, peerlink.4094, weight 1, 02w0d06h
B>* 10.223.252.101/32 [200/0] via fe80::ba59:9fff:fe70:e50, peerlink.4094, weight 1, 02w0d06h
B>* 10.223.252.102/32 [200/0] via fe80::ba59:9fff:fe70:e50, peerlink.4094, weight 1, 02w0d06h
B>* 10.223.252.103/32 [200/0] via fe80::ba59:9fff:fe70:e50, peerlink.4094, weight 1, 02w0d06h
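
In the table above, the first number in square brackets is the administrative distance: 20 for routes learned over eBGP and 200 for routes learned over iBGP (the usual defaults, assumed here to be unchanged). A short sketch that pulls the prefix, the session type and the outgoing interface out of such a line; the regular expression is written for the exact format shown above and is not a general parser.

```python
import re

BGP_ROUTE = re.compile(
    r"^B>\*\s+(?P<prefix>\S+)\s+\[(?P<distance>\d+)/\d+\]"
    r"\s+via\s+(?P<nexthop>[^,\s]+),\s+(?P<ifname>[^,\s]+)"
)


def classify(line: str):
    """Return (prefix, 'eBGP' | 'iBGP' | 'other', interface), or None for non-BGP lines."""
    match = BGP_ROUTE.match(line)
    if match is None:
        return None
    kind = {20: "eBGP", 200: "iBGP"}.get(int(match["distance"]), "other")
    return match["prefix"], kind, match["ifname"]


if __name__ == "__main__":
    line = ("B>* 10.223.250.2/32 [200/0] via fe80::ba59:9fff:fe70:e50, "
            "peerlink.4094, weight 1, 02w0d23h")
    print(classify(line))  # ('10.223.250.2/32', 'iBGP', 'peerlink.4094')
```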

5. traceroute + BFD

#traceroute   (      TCP )
cumulus@Switch1:mgmt:~$ traceroute -s 10.223.250.1 10.223.250.6
vrf-wrapper.sh: switching to vrf "default"; use '--no-vrf-switch' to disable
traceroute to 10.223.250.6 (10.223.250.6), 30 hops max, 60 byte packets
 1  10.223.250.7 (10.223.250.7)  1.002 ms  1.010 ms  0.981 ms # the transit switch replies from its loopback address
 2  10.223.250.6 (10.223.250.6)  0.933 ms  0.917 ms  1.018 ms

# BFD
cumulus@Switch1:mgmt:~$ net show bfd
------------------------------------------------------------------------------------------
port   peer                       state  local                      type       diag  vrf

------------------------------------------------------------------------------------------
swp2   fe80::1e34:daff:fe9e:67ec  Up     fe80::1e34:daff:fea6:b53d  singlehop  N/A   N/A
swp49  fe80::ba59:9fff:fe70:e5c   Up     fe80::1e34:daff:fea6:b510  singlehop  N/A   N/A

The Underlay is sorted out; let's move on to the Overlay.

Overlay

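For the Overlay we use VXLAN between VTEPs, with EVPN as the control plane. There are 3 options for where to place the SVIs, that is, where routing happens.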

Centralized IRB

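In this model all routing is done on a centralized switch (an active-active pair): it holds all L3 interfaces, while the Leaf switches work at L2 only. The centralized switch advertises EVPN type-2 (MAC+IP) routes with the Default Gateway community (0x03). The centralized switch has to be configured with every VNI.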

* All Leaf switches are in MLAG pairs

, , , Leaf L3. .. L3 , , .

Asymmetric IRB

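Unlike Symmetric IRB (described below), routing happens only on the ingress VTEP, and the egress VTEP performs L2 forwarding only. As a result, every Leaf has to carry every VNI+VLAN, but no L3VNI is needed.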

P.S. , , ( VNI). , VLAN VRF. , , c .

Symmetric IRB

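L3 traffic between VTEPs is carried in a dedicated L3VNI, which is mapped to its own VLAN. A VTEP only needs the VNIs that are actually used on it.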

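For example, VM3 wants to reach VM4. The packet arrives at Leaf03/04, which performs a route-lookup, sees that VM4 is behind Leaf05/06, and sends the packet to that nexthop through the L3VNI. Leaf05/06 receives the packet, strips the VXLAN header and hands it over to SVI20.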
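
The practical difference between the asymmetric and symmetric models is which VNIs every Leaf has to carry. A minimal sketch of that bookkeeping for one tenant VRF follows. The VLAN numbers other than 999 are made up, and the `vni_base` offset only mirrors the VLAN 999 to VNI 20999 naming used in this article; neither is required by the protocol.

```python
def required_vnis(mode, local_vlans, tenant_vlans, l3vni, vni_base=20000):
    """VNIs that one Leaf has to configure for a single tenant VRF."""
    if mode == "asymmetric":
        # The ingress Leaf routes and then bridges into the destination VNI,
        # so it must know every VLAN/VNI of the tenant.
        return {vni_base + vlan for vlan in tenant_vlans}
    if mode == "symmetric":
        # Only the locally attached VNIs plus the shared L3VNI are needed.
        return {vni_base + vlan for vlan in local_vlans} | {l3vni}
    raise ValueError(f"unknown IRB mode: {mode}")


if __name__ == "__main__":
    tenant = {997, 998, 999}  # hypothetical VLANs of the tenant
    print(sorted(required_vnis("asymmetric", {999}, tenant, 200000)))  # [20997, 20998, 20999]
    print(sorted(required_vnis("symmetric", {999}, tenant, 200000)))   # [20999, 200000]
```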

Overlay.

Overlay , VTEP .

1. L3VNI

#VRF+VNI
net add vrf Test vrf-table auto # create the VRF; RD and RT are assigned automatically
net add vrf Test vni 200000 # bind the VNI to the VRF
net add vxlan vniTest vxlan id 200000 # create the L3VNI
net add vxlan vniTest bridge learning off # disable MAC learning, EVPN takes care of it
net add vxlan vniTest vxlan local-tunnelip 10.223.250.1 # tunnel source address

#VLAN
net add vlan 2000 hwaddress 44:38:39:BE:EF:AC # required for MLAG: both Active-Active switches must use the same MAC
net add vlan 2000 vlan-id 2000 # VLAN ID
net add vlan 2000 vlan-raw-device bridge # attach to the bridge
net add vlan 2000 vrf Test # place into the VRF
net add vxlan vniTest bridge access 2000 # map the VLAN to the L3VNI

2. VNI

#VNI
net add vxlan vni-20999 vxlan id 20999
net add vxlan vni-20999 bridge arp-nd-suppress on # enable ARP suppression to reduce BUM traffic
net add vxlan vni-20999 bridge learning off # disable MAC learning, EVPN takes care of it
net add vxlan vni-20999 stp bpduguard # enable bpduguard
net add vxlan vni-20999 stp portbpdufilter # enable bpdufilter
net add vxlan vni-20999 vxlan local-tunnelip 10.223.250.1 # tunnel source address

# VLAN
net add vlan 999 ip address 10.223.255.253/24 # IP address of the VLAN
net add vlan 999 ip address-virtual 44:39:39:ff:01:01 10.223.255.254/24 # virtual address (the same gateway on every Leaf)
net add vlan 999 vlan-id 999 # VLAN ID
net add vlan 999 vlan-raw-device bridge # attach to the bridge
net add vlan 999 vrf Test # place into the VRF

net add vxlan vni-20999 bridge access 999 # map the VLAN to the VNI

# P.S. Configuration for an L2-only VLAN (the VNI part stays the same)
net add vlan 999 ip forward off
net add vlan 999 vlan-id 999
net add vlan 999 vlan-raw-device bridge
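
Since every routed VLAN repeats the same dozen commands, they are easy to generate. Below is a sketch of such a generator; it only formats the NCLU lines shown above and does not talk to a switch. The function name, its parameters and the `vni_base` offset (VLAN 999 becomes VNI 20999) are assumptions made for illustration.

```python
def l2vni_commands(vlan, svi_ip, gateway_ip, gateway_mac, vrf, tunnel_ip,
                   vni_base=20000):
    """Return the NCLU commands for one VLAN with an SVI and its L2 VNI."""
    vni = vni_base + vlan
    name = f"vni-{vni}"
    return [
        f"net add vxlan {name} vxlan id {vni}",
        f"net add vxlan {name} bridge arp-nd-suppress on",
        f"net add vxlan {name} bridge learning off",
        f"net add vxlan {name} stp bpduguard",
        f"net add vxlan {name} stp portbpdufilter",
        f"net add vxlan {name} vxlan local-tunnelip {tunnel_ip}",
        f"net add vlan {vlan} ip address {svi_ip}",
        f"net add vlan {vlan} ip address-virtual {gateway_mac} {gateway_ip}",
        f"net add vlan {vlan} vlan-id {vlan}",
        f"net add vlan {vlan} vlan-raw-device bridge",
        f"net add vlan {vlan} vrf {vrf}",
        f"net add vxlan {name} bridge access {vlan}",
    ]


if __name__ == "__main__":
    for command in l2vni_commands(999, "10.223.255.253/24", "10.223.255.254/24",
                                  "44:39:39:ff:01:01", "Test", "10.223.250.1"):
        print(command)
```

Called with the values from this section, it prints the same commands as the listing above, without the comments.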

3. BGP in the VRF

# Configure BGP inside the VRF
net add bgp vrf Test autonomous-system 4252424337
net add bgp vrf Test router-id 10.223.250.101
net add bgp vrf Test neighbor 100.64.1.105 remote-as 35083
net add bgp vrf Test ipv4 unicast redistribute connected
net add bgp vrf Test ipv4 unicast redistribute static
net add bgp vrf Test ipv4 unicast neighbor 100.64.1.105 route-map Next-Hop-VRR_Vl997 out #    MLAG+VSS  BGP+SVI,       
#P.S.          /32    EVPN,         .

net add bgp vrf Test l2vpn evpn advertise ipv4 unicast # advertise the IPv4 routes of the VRF into EVPN

Now let's check that everything works:

#VNI
cumulus@Switch1:mgmt:~$ net show evpn vni
VNI     Type VxLAN IF      MACs   ARPs   Remote VTEPs  Tenant VRF
20995   L2   vni-20995     5        3        1               default # VNI without an IP address on the SVI
20999   L2   vni-20999     26       24       4               Test    #VNI
200000  L3   vniTest        3        3        n/a             Test    # L3VNI

# VNI details
cumulus@Switch1:mgmt:~$ net show evpn vni 20999
VNI: 20999
 Type: L2
 Tenant VRF: Test
 VxLAN interface: vni-20999
 VxLAN ifIndex: 73
 Local VTEP IP: 10.223.250.30 # with MLAG this is the vxlan-anycast-ip
 Mcast group: 0.0.0.0 # not used
 Remote VTEPs for this VNI: # remote VTEPs that also have this VNI
  10.223.250.9 flood: HER # Head-end Replication (also known as Ingress replication)
  10.223.252.103 flood: HER
  10.223.252.20 flood: HER
  10.223.250.103 flood: HER
 Number of MACs (local and remote) known for this VNI: 26
 Number of ARPs (IPv4 and IPv6, local and remote) known for this VNI: 24
 Advertise-gw-macip: No # Default Gateway community, used for centralized IRB
# MAC addresses known to EVPN in the VNI
cumulus@Switch3:mgmt:~$ net show evpn mac vni 20999
Number of MACs (local and remote) known for this VNI: 28
Flags: B=bypass N=sync-neighs, I=local-inactive, P=peer-active, X=peer-proxy
MAC               Type   Flags Intf/Remote ES/VTEP            VLAN  Seq #'s
0c:59:9c:b9:d8:dc remote       10.223.250.30                        0/0
0c:42:a1:95:79:7c remote       10.223.250.30                        0/0

# MAC addresses learned on the VNI interface
cumulus@Switch3:mgmt:~$ net show bridge macs vlan 999

VLAN  Master  Interface  MAC                TunnelDest  State      Flags         LastSeen
----  ------  ---------  -----------------  ----------  ---------  ------------  -----------------
 999  bridge  bridge     1c:34:da:9e:67:68              permanent                24 days, 04:35:13
 999  bridge  bridge     44:39:39:ff:01:01              permanent                14:26:45
 999  bridge  peerlink   1c:34:da:9e:67:48              permanent                31 days, 17:22:36
 999  bridge  peerlink   1c:34:da:9e:67:e8              static     sticky        24 days, 04:14:16
 999  bridge  vni-20999  00:16:9d:9e:dd:41                         extern_learn  19 days, 04:17:21
 999  bridge  vni-20999  00:21:1c:2e:86:42                         extern_learn  19 days, 04:17:21
 999  bridge  vni-20999  00:22:0c:de:30:42                         extern_learn  19 days, 04:17:21

# Router MACs of the remote VTEPs
cumulus@Switch3:mgmt:~$ net show evpn rmac vni all
VNI 200000 RMACs 3

RMAC              Remote VTEP
44:38:39:be:ef:ac 10.223.250.30
44:39:39:ff:40:94 10.223.252.103
44:38:39:be:ef:ae 10.223.250.9

#Arp-cache
cumulus@Switch3:mgmt:~$ net show evpn arp-cache vni 20999
Number of ARPs (local and remote) known for this VNI: 28
Flags: I=local-inactive, P=peer-active, X=peer-proxy
Neighbor                  Type   Flags State    MAC               Remote ES/VTEP                 Seq #'s
10.223.255.242            local        active   1c:34:da:9e:67:68                                0/0
10.223.255.13             remote       active   0c:42:a1:96:d2:44 10.223.250.30                  0/0
10.223.255.243            local        inactive 06:73:4a:02:27:8a                                0/0
10.223.255.7              remote       active   0c:59:9c:b9:f8:fa 10.223.250.30                  0/0
10.223.255.14             remote       active   0c:42:a1:95:79:7c 10.223.250.30                  0/0
# Routes in the VRF
cumulus@Switch1:mgmt:~$ net show route vrf Test | grep "10.3.53"
# the next-hop is reached via vlan2000, that is, through the L3VNI
C * 10.223.255.0/24 [0/1024] is directly connected, vlan999-v0, 03w3d04h
C>* 10.223.255.0/24 is directly connected, vlan999, 03w3d04h
B>* 10.223.255.1/32 [20/0] via 10.223.250.30, vlan2000 onlink, weight 1, 02w5d04h
B>* 10.223.255.2/32 [20/0] via 10.223.250.30, vlan2000 onlink, weight 1, 02w5d04h
B>* 10.223.255.3/32 [20/0] via 10.223.250.30, vlan2000 onlink, weight 1, 02w5d04h

# EVPN routes for the VNI
cumulus@Switch3:mgmt:~$ net show bgp evpn route vni 20999
BGP table version is 1366, local router ID is 10.223.250.101
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
EVPN type-1 prefix: [1]:[ESI]:[EthTag]:[IPlen]:[VTEP-IP]
EVPN type-2 prefix: [2]:[EthTag]:[MAClen]:[MAC]:[IPlen]:[IP]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
EVPN type-4 prefix: [4]:[ESI]:[IPlen]:[OrigIP]
EVPN type-5 prefix: [5]:[EthTag]:[IPlen]:[IP]
   Network          Next Hop            Metric LocPrf Weight Path
*> [2]:[0]:[48]:[00:16:9d:9e:dd:41]
                    10.223.250.30                          0 4252968145 i
                    RT:9425:20999 RT:9425:200000 ET:8 Rmac:44:38:39:be:ef:ac
*> [2]:[0]:[48]:[00:22:56:ac:f3:42]:[32]:[10.223.255.1]
                    10.223.250.30                          0 4252968145 i
                    RT:9425:20999 RT:9425:200000 ET:8 Rmac:44:38:39:be:ef:ac
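
Two things in this output are worth decoding: the bracketed type-2 prefix, whose layout is given in the legend above, and the route targets. In the output above the RT values equal the low 16 bits of the AS number followed by the VNI (4252968145 gives 9425). The sketch below reproduces both; treat the RT formula as an observation about this output, since it only applies when route targets are derived automatically, and the helper names are illustrative.

```python
import re


def parse_type2(prefix: str) -> dict:
    """Split '[2]:[EthTag]:[MAClen]:[MAC]:[IPlen]:[IP]' (the IP part is optional)."""
    fields = re.findall(r"\[([^\]]*)\]", prefix)
    if len(fields) not in (4, 6) or fields[0] != "2":
        raise ValueError(f"not an EVPN type-2 prefix: {prefix}")
    return {
        "eth_tag": int(fields[1]),
        "mac": fields[3],
        "ip": fields[5] if len(fields) == 6 else None,
    }


def auto_route_target(asn: int, vni: int) -> str:
    """RT as seen in the output above: low 16 bits of the AS, then the VNI."""
    return f"{asn & 0xFFFF}:{vni}"


if __name__ == "__main__":
    print(parse_type2("[2]:[0]:[48]:[00:22:56:ac:f3:42]:[32]:[10.223.255.1]"))
    print(auto_route_target(4252968145, 20999))   # 9425:20999
    print(auto_route_target(4252968145, 200000))  # 9425:200000
```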

That's it: the VXLAN/EVPN fabric is up. The Underlay is built on Unnumbered BGP, and the Overlay is VXLAN/EVPN with Symmetric IRB.

, Spine, . , - . , , .



