Hello. Here is the continuation of the first part. As promised, in this article I want to cover the main options for building a VXLAN/EVPN fabric and explain why we chose particular solutions for our data center.
Choosing an Underlay design
Foreword
The first thing to decide when building a fabric is the Underlay design: how do we want to build our VXLAN tunnels (or rather, organize VTEP discovery)?
We have 3 options:
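1. Static - every VTEP is configured by hand, much like static RSVP-TE tunnels. This mostly makes sense when an SDN controller does the work.
2. Multicast - Cumulus does not recommend it, and Juniper calls it resource-intensive.
3. BGP - EVPN already runs over BGP, so using BGP for VTEP reachability as well is the natural choice. What remains is to choose between iBGP and eBGP for the underlay.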
iBGP
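iBGP needs an IGP underneath, such as OSPF or IS-IS, because sessions are built between Loopback addresses and those must be reachable first. On top of that, iBGP requires a full mesh of sessions.
The alternative is to make the Spine a Route-Reflector.
That leaves 2 options: full-mesh or RR.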
eBGP
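With eBGP there is no need for a Route-Reflector or an IGP, since the sessions run directly over the p2p links. MLAG needs special care: both switches of an MLAG pair share one AS, because for VXLAN/EVPN they act as a single VTEP. An iBGP session is run over the peerlink, so that a switch that loses its uplinks to the Spine can still reach the fabric through its peer.
The drawback is that every Leaf (or MLAG pair) needs its own AS. In Cumulus 4.2 this is easy, since the AS can be generated automatically from the switch MAC (within the 32-bit private AS range).
All Spines share one AS, which is also what the Cumulus auto-configuration does for the Spine role.
P.S. AS numbers should come from the private range. The 32-bit private AS range is large enough for any fabric.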
BGP Unnumbered
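To avoid addressing every p2p link and keep addresses only on the loopback, we use Unnumbered. Cumulus and Cisco implement it differently. Cumulus uses IPv6 link-local addresses, and the eBGP session is established over them. This requires extended-nexthop (that is, the IPv4 family is carried with an IPv6 next hop).
P.S. The alternative is IPv4 with a /30 or /31 on each link. In either case the link should be treated as p2p rather than broadcast.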
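The neighbors in the outputs of this article all appear as `fe80::...` addresses. A minimal Python sketch of how such an address relates to an interface MAC, assuming the usual modified EUI-64 derivation (an assumption, not something the article states; the function name is hypothetical). The MAC used is one that appears in the bridge table later in the article.

```python
import ipaddress


def mac_to_link_local(mac: str) -> str:
    """Build an IPv6 link-local address from a MAC using modified EUI-64.

    Assumption: the interface address is auto-derived from the MAC.
    """
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6:
        raise ValueError(f"not a 48-bit MAC: {mac!r}")
    octets[0] ^= 0x02  # flip the universal/local bit
    eui64 = octets[:3] + [0xFF, 0xFE] + octets[3:]  # insert ff:fe in the middle
    raw = bytes([0xFE, 0x80] + [0] * 6 + eui64)  # fe80::/64 prefix + interface ID
    return str(ipaddress.IPv6Address(raw))


print(mac_to_link_local("1c:34:da:9e:67:68"))  # fe80::1e34:daff:fe9e:6768
```

The result has the same `fe80::1e34:daff:fe...` shape as the peers shown in the BGP and BFD outputs.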
BGP+BFD
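By default BGP detects a failed neighbor slowly, which is unacceptable for traffic such as iSCSI or VSAN. The answer is BFD. Cumulus supports timers of 2x50 ms; for Cisco the figure given is 233.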
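To put numbers on the BFD timers, here is a small Python sketch of the detection-time arithmetic for asynchronous mode. It assumes the three arguments of the `bfd 3 50 50` setting used in this article's configuration mean detect multiplier, min-rx and min-tx in milliseconds, and that both ends are configured identically; the function name is hypothetical.

```python
def bfd_detection_time_ms(remote_detect_mult: int,
                          local_min_rx_ms: int,
                          remote_min_tx_ms: int) -> int:
    """Time without BFD packets after which the session is declared down.

    The agreed interval is the slower of what we are willing to receive
    and what the peer wants to send, multiplied by the peer's multiplier.
    """
    agreed_interval = max(local_min_rx_ms, remote_min_tx_ms)
    return remote_detect_mult * agreed_interval


# Assumed reading of "bfd 3 50 50": multiplier 3, min-rx 50 ms, min-tx 50 ms.
print(bfd_detection_time_ms(3, 50, 50))  # 150
```

Under these assumptions a dead neighbor is detected in about 150 ms, instead of waiting for the BGP hold timer.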
What did we choose?
1. eBGP with a separate AS per Leaf + iBGP inside the MLAG pair
2. Addresses only on the loopback, links are Unnumbered
3. BFD
1. Automatic AS assignment in Cumulus
# Cumulus can generate the AS automatically
#P.S. All Spines get the same AS, which matters for the AS-path check
cumulus@Switch1:mgmt:~$ net add bgp autonomous-system
<1-4294967295> : An integer from 1 to 4294967295
leaf : Auto configure a leaf ASN in the 4-byte private range 4200000000 - 4294967294 based on the switch
MAC
spine : Auto configure a spine AS-number in the 4-byte private ASN range. The value 4200000000 is always
used
# Choose "net add bgp autonomous-system leaf"
cumulus@Switch1:mgmt:~$ net add bgp autonomous-system leaf
cumulus@Switch1:mgmt:~$ net pending
+router bgp 4252968529 # the generated BGP AS
+end
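The help text above gives the 4-byte private range used for auto-generated leaf ASNs (4200000000 - 4294967294). A minimal Python check against that range, useful when ASNs are assigned by hand instead; the function name is hypothetical, and the way Cumulus maps a MAC to a concrete number is not reproduced here.

```python
# Bounds taken from the CLI help text quoted above (inclusive on both ends).
PRIVATE_4BYTE_FIRST = 4200000000
PRIVATE_4BYTE_LAST = 4294967294


def is_private_4byte_asn(asn: int) -> bool:
    """True if the ASN lies in the 4-byte private range from the help text."""
    return PRIVATE_4BYTE_FIRST <= asn <= PRIVATE_4BYTE_LAST


for asn in (4252968529, 4200000000, 35083):
    print(asn, is_private_4byte_asn(asn))
```

The generated leaf AS and the fixed spine AS both fall inside the range, while an AS such as 35083, which appears later as an external neighbor, does not.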
2. BGP+Unnumbered
#loopback
net add loopback lo clag vxlan-anycast-ip 10.223.250.30 # shared MLAG IP (anycast VTEP address)
net add loopback lo ip address 10.223.250.1/32
#AS+Router ID
net add bgp autonomous-system leaf
net add bgp router-id 10.223.250.1
# Create the peer-group
net add bgp neighbor fabric peer-group # peer group for the fabric neighbors
net add bgp neighbor fabric remote-as external # eBGP
net add bgp neighbor fabric bfd 3 50 50 # enable BFD
net add bgp neighbor fabric capability extended-nexthop # IPv4 over IPv6
# Add neighbors by interface (Unnumbered)
net add bgp neighbor swp2 interface peer-group fabric
net add bgp neighbor peerlink.4094 interface remote-as internal # iBGP over the Peerlink
net add bgp ipv4 unicast neighbor peerlink.4094 next-hop-self # rewrite the next-hop for the iBGP peer
net add bgp ipv4 unicast redistribute connected # advertise connected networks into BGP
#EVPN
net add bgp l2vpn evpn neighbor fabric activate # enable the family for the eBGP neighbors
net add bgp l2vpn evpn neighbor peerlink.4094 activate # enable the family for the iBGP neighbor
net add bgp l2vpn evpn advertise-all-vni # advertise all local VNIs via BGP
3. BGP Summary
cumulus@Switch1:mgmt:~$ net show bgp summary
#IPv4
show bgp ipv4 unicast summary
=============================
BGP router identifier 10.223.250.1, local AS number 4252968145 vrf-id 0
BGP table version 84
RIB entries 29, using 5568 bytes of memory
Peers 3, using 64 KiB of memory
Peer groups 1, using 64 bytes of memory
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd
Switch3(swp2) 4 4252424337 504396 484776 0 0 0 02w0d23h 3
Switch4(swp49) 4 4208128255 458840 485146 0 0 0 3d12h03m 9
Switch2(peerlink.4094) 4 4252968145 460895 456318 0 0 0 02w0d23h 14
Total number of neighbors 3
#EVPN
show bgp l2vpn evpn summary
===========================
BGP router identifier 10.223.250.1, local AS number 4252968145 vrf-id 0
BGP table version 0
RIB entries 243, using 46 KiB of memory
Peers 3, using 64 KiB of memory
Peer groups 1, using 64 bytes of memory
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd
Switch3(swp2) 4 4252424337 504396 484776 0 0 0 02w0d23h 237
Switch4(swp49) 4 4208128255 458840 485146 0 0 0 3d12h03m 563
Switch2(peerlink.4094) 4 4252968145 460895 456318 0 0 0 02w0d23h 807
Total number of neighbors 3
4. net show route
cumulus@Switch1:mgmt:~$ net show route
show ip route
=============
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued route, r - rejected route
# IPv4 prefixes resolve via IPv6 link-local next hops
C>* 10.223.250.1/32 is directly connected, lo, 02w0d23h #Loopback
B>* 10.223.250.2/32 [200/0] via fe80::ba59:9fff:fe70:e50, peerlink.4094, weight 1, 02w0d23h
B>* 10.223.250.6/32 [20/0] via fe80::ba59:9fff:fe70:e5c, swp49, weight 1, 3d12h19m
B>* 10.223.250.7/32 [20/0] via fe80::ba59:9fff:fe70:e5c, swp49, weight 1, 3d12h19m
B>* 10.223.250.9/32 [20/0] via fe80::ba59:9fff:fe70:e5c, swp49, weight 1, 3d12h19m
C>* 10.223.250.30/32 is directly connected, lo, 02w0d23h #MLAG Loopback
B>* 10.223.250.101/32 [20/0] via fe80::1e34:daff:fe9e:67ec, swp2, weight 1, 02w0d23h
B>* 10.223.250.102/32 [20/0] via fe80::1e34:daff:fe9e:67ec, swp2, weight 1, 02w0d23h
B>* 10.223.250.103/32 [20/0] via fe80::1e34:daff:fe9e:67ec, swp2, weight 1, 02w0d23h
B>* 10.223.252.11/32 [200/0] via fe80::ba59:9fff:fe70:e50, peerlink.4094, weight 1, 02w0d06h
B>* 10.223.252.12/32 [200/0] via fe80::ba59:9fff:fe70:e50, peerlink.4094, weight 1, 02w0d06h
B>* 10.223.252.20/32 [200/0] via fe80::ba59:9fff:fe70:e50, peerlink.4094, weight 1, 02w0d06h
B>* 10.223.252.101/32 [200/0] via fe80::ba59:9fff:fe70:e50, peerlink.4094, weight 1, 02w0d06h
B>* 10.223.252.102/32 [200/0] via fe80::ba59:9fff:fe70:e50, peerlink.4094, weight 1, 02w0d06h
B>* 10.223.252.103/32 [200/0] via fe80::ba59:9fff:fe70:e50, peerlink.4094, weight 1, 02w0d06h
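The `[20/0]` and `[200/0]` values in the table follow from the session type. A short Python sketch that classifies the three neighbors from the summary above by comparing AS numbers; the distances 20 and 200 are read off the route table shown here, and the names `session_type` and `DISTANCE` are hypothetical.

```python
LOCAL_AS = 4252968145  # local AS from the BGP summary

# Neighbor -> remote AS, copied from the BGP summary.
NEIGHBORS = {
    "Switch3(swp2)": 4252424337,
    "Switch4(swp49)": 4208128255,
    "Switch2(peerlink.4094)": 4252968145,
}

# Administrative distance per session type, as seen in the route table.
DISTANCE = {"eBGP": 20, "iBGP": 200}


def session_type(local_as: int, remote_as: int) -> str:
    """Same AS on both ends means iBGP, anything else is eBGP."""
    return "iBGP" if local_as == remote_as else "eBGP"


for name, remote_as in NEIGHBORS.items():
    kind = session_type(LOCAL_AS, remote_as)
    print(f"{name}: {kind}, distance {DISTANCE[kind]}")
```

Only the peerlink neighbor shares the local AS, which is why routes learned over `peerlink.4094` carry distance 200 and lose to the same prefix learned from an uplink.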
5. traceroute + BFD
#traceroute ( TCP )
cumulus@Switch1:mgmt:~$ traceroute -s 10.223.250.1 10.223.250.6
vrf-wrapper.sh: switching to vrf "default"; use '--no-vrf-switch' to disable
traceroute to 10.223.250.6 (10.223.250.6), 30 hops max, 60 byte packets
1 10.223.250.7 (10.223.250.7) 1.002 ms 1.010 ms 0.981 ms # loopback
2 10.223.250.6 (10.223.250.6) 0.933 ms 0.917 ms 1.018 ms
# BFD
cumulus@Switch1:mgmt:~$ net show bfd
------------------------------------------------------------------------------------------
port peer state local type diag vrf
------------------------------------------------------------------------------------------
swp2 fe80::1e34:daff:fe9e:67ec Up fe80::1e34:daff:fea6:b53d singlehop N/A N/A
swp49 fe80::ba59:9fff:fe70:e5c Up fe80::1e34:daff:fea6:b510 singlehop N/A N/A
That completes the Underlay; now on to the Overlay.
Overlay
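In the Overlay, VXLAN tunnels are built between VTEPs and controlled by EVPN. The main design question is routing: there are 3 options for where to place the SVI.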
Centralized IRB
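Here all routing is done on a centralized switch (or an active-active pair): it holds the L3 interfaces, while the Leaf switches work purely at L2. The centralized switch advertises an EVPN type-2 (MAC+IP) route with the Default Gateway community (0x03). The centralized switch must have every VNI configured.
This option fits when the Leaf switches cannot do L3 themselves. The downside is that all L3 traffic passes through the centralized switch, which becomes a bottleneck.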
Asymmetric IRB
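Simpler than Symmetric IRB (described below). Routing is done on the ingress VTEP, and the egress VTEP only does L2. As a result, every Leaf must have every VNI+VLAN configured, but no L3VNI is needed.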
P.S. , , ( VNI). , VLAN VRF. , , c .
Symmetric IRB
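Traffic between subnets is routed through an L3VNI, which is mapped to its own VLAN. A VTEP only needs the VNIs it actually uses.
For example, VM3 sends a packet to VM4. The packet arrives at Leaf03/04, which does a route-lookup, finds that VM4 sits behind Leaf05/06, and sends the packet to that nexthop over the L3VNI. Leaf05/06 decapsulates the VXLAN packet and delivers it through SVI20. The destination SVI does not have to exist on the ingress Leaf, because transit goes through the L3VNI.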
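The practical difference between the two distributed models is which VNIs a Leaf has to carry. A rough Python sketch of that rule, using the VNI numbers that appear in this article; the assumption that this particular Leaf hosts only VNI 20999 locally is hypothetical, as is the function name.

```python
def vnis_required(model: str, local_vnis: set, all_vnis: set, l3vni: int) -> set:
    """Which VNIs must be configured on one Leaf under each IRB model."""
    if model == "asymmetric":
        # The ingress VTEP routes into the destination VNI itself,
        # so it needs every L2 VNI in the tenant, used locally or not.
        return set(all_vnis)
    if model == "symmetric":
        # Only locally attached VNIs plus the per-VRF transit L3VNI.
        return set(local_vnis) | {l3vni}
    raise ValueError(f"unknown IRB model: {model!r}")


ALL_L2_VNIS = {20995, 20999}
LOCAL_L2_VNIS = {20999}   # hypothetical: only this VNI has hosts on the Leaf
L3VNI = 200000

print(sorted(vnis_required("asymmetric", LOCAL_L2_VNIS, ALL_L2_VNIS, L3VNI)))
print(sorted(vnis_required("symmetric", LOCAL_L2_VNIS, ALL_L2_VNIS, L3VNI)))
```

With two L2 VNIs the difference is small, but in the asymmetric model the per-Leaf set grows with the whole tenant, while in the symmetric model it grows only with what is attached locally.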
Overlay configuration.
The Overlay configuration below is applied on each VTEP.
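1. Creating the L3VNI
#VRF+VNI
net add vrf Test vrf-table auto # create the VRF; RD+RT are generated automatically
net add vrf Test vni 200000 # bind the VNI to the VRF
net add vxlan vniTest vxlan id 200000 # create the L3VNI
net add vxlan vniTest bridge learning off # disable data-plane MAC learning, since EVPN handles it
net add vxlan vniTest vxlan local-tunnelip 10.223.250.1 # tunnel source address
#VLAN
net add vlan 2000 hwaddress 44:38:39:BE:EF:AC # for MLAG: the same MAC on both Active-Active peers
net add vlan 2000 vlan-id 2000 # VLAN ID
net add vlan 2000 vlan-raw-device bridge # attach to the bridge
net add vlan 2000 vrf Test # place the VLAN in the VRF (the VRF created above is named Test)
net add vxlan vniTest bridge access 2000 # map the VLAN to the L3VNI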
2. Creating an L2 VNI
#VNI
net add vxlan vni-20999 vxlan id 20999
net add vxlan vni-20999 bridge arp-nd-suppress on # ARP suppression - reduces BUM traffic
net add vxlan vni-20999 bridge learning off # disable data-plane MAC learning, since EVPN handles it
net add vxlan vni-20999 stp bpduguard # enable bpduguard
net add vxlan vni-20999 stp portbpdufilter # enable bpdufilter
net add vxlan vni-20999 vxlan local-tunnelip 10.223.250.1 # tunnel source address
# VLAN
net add vlan 999 ip address 10.223.255.253/24 # IP address of the VLAN
net add vlan 999 ip address-virtual 44:39:39:ff:01:01 10.223.255.254/24 # virtual address (the gateway shared by the Leafs)
net add vlan 999 vlan-id 999 # VLAN ID
net add vlan 999 vlan-raw-device bridge # attach to the bridge
net add vlan 999 vrf Test # place the VLAN in the VRF
net add vxlan vni-20999 bridge access 999 # map the VLAN to the VNI
#P.S. For an L2-only VLAN (the VNI is configured the same way)
net add vlan 999 ip forward off
net add vlan 999 vlan-id 999
net add vlan 999 vlan-raw-device bridge
3. BGP in the VRF
# Configure BGP in the VRF
net add bgp vrf Test autonomous-system 4252424337
net add bgp vrf Test router-id 10.223.250.101
net add bgp vrf Test neighbor 100.64.1.105 remote-as 35083
net add bgp vrf Test ipv4 unicast redistribute connected
net add bgp vrf Test ipv4 unicast redistribute static
net add bgp vrf Test ipv4 unicast neighbor 100.64.1.105 route-map Next-Hop-VRR_Vl997 out # needed for MLAG+VSS when BGP runs over an SVI
#P.S. /32 EVPN, .
net add bgp vrf Test l2vpn evpn advertise ipv4 unicast # advertise the VRF's IPv4 routes into EVPN
Now let's check the result:
# List of VNIs
cumulus@Switch1:mgmt:~$ net show evpn vni
VNI Type VxLAN IF MACs ARPs Remote VTEPs Tenant VRF
20995 L2 vni-20995 5 3 1 default # L2-only VNI, no IP on the SVI
20999 L2 vni-20999 26 24 4 Test #VNI
200000 L3 vniTest 3 3 n/a Test # L3VNI
# Details of a single VNI
cumulus@Switch1:mgmt:~$ net show evpn vni 20999
VNI: 20999
Type: L2
Tenant VRF: Test
VxLAN interface: vni-20999
VxLAN ifIndex: 73
Local VTEP IP: 10.223.250.30 # MLAG, vxlan-anycast-ip
Mcast group: 0.0.0.0 # multicast is not used
Remote VTEPs for this VNI: # VTEPs where this VNI is also configured
10.223.250.9 flood: HER # Head-end Replication ( Ingress replication)
10.223.252.103 flood: HER
10.223.252.20 flood: HER
10.223.250.103 flood: HER
Number of MACs (local and remote) known for this VNI: 26
Number of ARPs (IPv4 and IPv6, local and remote) known for this VNI: 24
Advertise-gw-macip: No # Default Gateway community, used for centralized IRB
# MAC addresses known via EVPN for the VNI
cumulus@Switch3:mgmt:~$ net show evpn mac vni 20999
Number of MACs (local and remote) known for this VNI: 28
Flags: B=bypass N=sync-neighs, I=local-inactive, P=peer-active, X=peer-proxy
MAC Type Flags Intf/Remote ES/VTEP VLAN Seq #'s
0c:59:9c:b9:d8:dc remote 10.223.250.30 0/0
0c:42:a1:95:79:7c remote 10.223.250.30 0/0
# MAC table of the VLAN; for remote MACs the Interface column shows the VNI
cumulus@Switch3:mgmt:~$ net show bridge macs vlan 999
VLAN Master Interface MAC TunnelDest State Flags LastSeen
---- ------ --------- ----------------- ---------- --------- ------------ -----------------
999 bridge bridge 1c:34:da:9e:67:68 permanent 24 days, 04:35:13
999 bridge bridge 44:39:39:ff:01:01 permanent 14:26:45
999 bridge peerlink 1c:34:da:9e:67:48 permanent 31 days, 17:22:36
999 bridge peerlink 1c:34:da:9e:67:e8 static sticky 24 days, 04:14:16
999 bridge vni-20999 00:16:9d:9e:dd:41 extern_learn 19 days, 04:17:21
999 bridge vni-20999 00:21:1c:2e:86:42 extern_learn 19 days, 04:17:21
999 bridge vni-20999 00:22:0c:de:30:42 extern_learn 19 days, 04:17:21
# Router MACs (RMAC) of the remote VTEPs
cumulus@Switch3:mgmt:~$ net show evpn rmac vni all
VNI 200000 RMACs 3
RMAC Remote VTEP
44:38:39:be:ef:ac 10.223.250.30
44:39:39:ff:40:94 10.223.252.103
44:38:39:be:ef:ae 10.223.250.9
#Arp-cache
cumulus@Switch3:mgmt:~$ net show evpn arp-cache vni 20999
Number of ARPs (local and remote) known for this VNI: 28
Flags: I=local-inactive, P=peer-active, X=peer-proxy
Neighbor Type Flags State MAC Remote ES/VTEP Seq #'s
10.223.255.242 local active 1c:34:da:9e:67:68 0/0
10.223.255.13 remote active 0c:42:a1:96:d2:44 10.223.250.30 0/0
10.223.255.243 local inactive 06:73:4a:02:27:8a 0/0
10.223.255.7 remote active 0c:59:9c:b9:f8:fa 10.223.250.30 0/0
10.223.255.14 remote active 0c:42:a1:95:79:7c 10.223.250.30 0/0
# Routing table of the VRF
cumulus@Switch1:mgmt:~$ net show route vrf Test | grep "10.3.53"
# the next-hop points to vlan2000, i.e. the L3VNI
C * 10.223.255.0/24 [0/1024] is directly connected, vlan999-v0, 03w3d04h
C>* 10.223.255.0/24 is directly connected, vlan999, 03w3d04h
B>* 10.223.255.1/32 [20/0] via 10.223.250.30, vlan2000 onlink, weight 1, 02w5d04h
B>* 10.223.255.2/32 [20/0] via 10.223.250.30, vlan2000 onlink, weight 1, 02w5d04h
B>* 10.223.255.3/32 [20/0] via 10.223.250.30, vlan2000 onlink, weight 1, 02w5d04h
# EVPN routes for the VNI
cumulus@Switch3:mgmt:~$ net show bgp evpn route vni 20999
BGP table version is 1366, local router ID is 10.223.250.101
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
EVPN type-1 prefix: [1]:[ESI]:[EthTag]:[IPlen]:[VTEP-IP]
EVPN type-2 prefix: [2]:[EthTag]:[MAClen]:[MAC]:[IPlen]:[IP]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
EVPN type-4 prefix: [4]:[ESI]:[IPlen]:[OrigIP]
EVPN type-5 prefix: [5]:[EthTag]:[IPlen]:[IP]
Network Next Hop Metric LocPrf Weight Path
*> [2]:[0]:[48]:[00:16:9d:9e:dd:41]
10.223.250.30 0 4252968145 i
RT:9425:20999 RT:9425:200000 ET:8 Rmac:44:38:39:be:ef:ac
*> [2]:[0]:[48]:[00:22:56:ac:f3:42]:[32]:[10.223.255.1]
10.223.250.30 0 4252968145 i
RT:9425:20999 RT:9425:200000 ET:8 Rmac:44:38:39:be:ef:ac
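The type-2 lines follow the legend printed at the top of the output, so they are easy to take apart in a script. A small Python sketch that parses a type-2 prefix and reproduces the `RT:9425:...` values. The rule "low 16 bits of the AS, then the VNI" is an observation that matches this output, not a documented guarantee, and both function names are hypothetical.

```python
import re


def parse_type2(prefix: str) -> dict:
    """Split '[2]:[EthTag]:[MAClen]:[MAC]' with optional ':[IPlen]:[IP]'."""
    fields = re.findall(r"\[([^\]]*)\]", prefix)
    if len(fields) not in (4, 6) or fields[0] != "2":
        raise ValueError(f"not an EVPN type-2 prefix: {prefix!r}")
    route = {
        "eth_tag": int(fields[1]),
        "mac_len": int(fields[2]),
        "mac": fields[3],
        "ip": None,  # MAC-only routes carry no IP part
    }
    if len(fields) == 6:
        route["ip_len"] = int(fields[4])
        route["ip"] = fields[5]
    return route


def auto_rt(asn: int, vni: int) -> str:
    """Assumed auto-derived route target: low 16 bits of the AS, then the VNI."""
    return f"{asn & 0xFFFF}:{vni}"


print(parse_type2("[2]:[0]:[48]:[00:22:56:ac:f3:42]:[32]:[10.223.255.1]"))
print(auto_rt(4252968145, 20999))   # 9425:20999
print(auto_rt(4252968145, 200000))  # 9425:200000
```

Both derived values match the output: one RT for the L2 VNI 20999 and one for the L3VNI 200000, which is the sign that the route is usable for Symmetric IRB.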
That is how our VXLAN/EVPN fabric is built. The Underlay runs Unnumbered BGP, and the Overlay runs VXLAN/EVPN with Symmetric IRB.
, Spine, . , - . , , .