This chapter provides information about Network Address Translation (NAT) and implementation notes.
Topics in this chapter include:
BNG Subscriber — A broader term than the ESM subscriber, independent of the platform on which the subscriber is instantiated. It includes ESM subscribers on the 7750 SR as well as subscribers instantiated on third-party BNGs. Some NAT functions, such as Subscriber Aware Large Scale NAT44 utilizing standard RADIUS attributes, work with subscribers independently of the platform on which they are instantiated.
Deterministic NAT — A mode of operation where mappings between the NAT subscriber and the outside IP address and port range are allocated at configuration time. Each subscriber is permanently mapped to an outside IP address and a dedicated port block, referred to as the deterministic port block. Logging is not needed because the reverse mapping can be obtained using a known formula. The subscriber's ports can be expanded by allocating a dynamic port block if all ports in the deterministic port block are exhausted. In that case, logging of the dynamic port block allocation and de-allocation is required.
Enhanced Subscriber Management (ESM) subscriber — A host or a collection of hosts instantiated on the 7750 SR Broadband Network Gateway (BNG). The ESM subscriber represents a household or a business entity for which various services with committed Service Level Agreements (SLAs) can be delivered. The NAT function is not part of basic ESM functionality.
L2-Aware NAT — In the context of the 7750 SR platform, L2-Aware NAT combines the Enhanced Subscriber Management (ESM) subscriber ID and the inside IP address to perform translation into a unique outside IP address and outside port. This is in contrast with the classical NAT technique, where only the inside IP address is considered for address translation. Because the subscriber ID alone is sufficient to make the address translation unique, L2-Aware NAT allows many ESM subscribers to share the same inside IP address. The scalability, performance, and reliability requirements are the same as in LSN.
Large Scale NAT (LSN) — Refers to a collection of network address translation techniques used in service provider networks, implemented on highly scalable, high-performance hardware that facilitates various intra- and inter-node redundancy mechanisms. The purpose of the LSN term is to delineate the high-scale, high-performance NAT functions found in service provider networks from enterprise NAT, which usually serves a much smaller customer base at lower speeds. The following NAT techniques can be grouped under the LSN name:
Each distinct NAT technique is referred to by its corresponding name (Large Scale NAT44 [or CGN], DS-Lite, and NAT64), with the understanding that in the context of the 7750 SR platform they are all part of LSN (and not enterprise-based NAT).
The term Large Scale NAT44 can be used interchangeably with the term Carrier Grade NAT (CGN), which by its name implies high reliability, high scale, and high performance. These are typical requirements found in service provider (carrier) networks.
The term L2-Aware NAT refers to a separate category of NAT, defined outside of LSN.
NAT RADIUS accounting — Reporting (or logging) of address-translation-related events (port block allocation and de-allocation) via the RADIUS accounting facility. NAT RADIUS accounting uses regular RADIUS accounting messages (start/interim-update/stop) as defined in RFC 2866, RADIUS Accounting, with NAT-specific VSAs.
NAT RADIUS accounting — Can be used interchangeably with the term NAT RADIUS logging.
NAT Subscriber — In NAT terminology, a NAT subscriber is an inside entity whose true identity is hidden from the outside. There are a few types of NAT implementations in the 7750 SR, and the subscribers for each implementation are defined as follows:
Non-deterministic NAT — A mode of operation where all outside IP address and port block allocations are made dynamically at the time of subscriber instantiation. Logging is required in this case.
Port block — A collection of ports assigned to a subscriber. A deterministic LSN subscriber can have only one deterministic port block, which can be extended by multiple dynamic port blocks. A non-deterministic LSN subscriber can be assigned only dynamic port blocks. All port blocks for an LSN subscriber must be allocated from a single outside IP address.
Port-range — A collection of ports that can span multiple port blocks of the same type. For example, the deterministic port range includes all ports reserved for deterministic consumption. Similarly, the dynamic port range is the total collection of ports that can be allocated in the form of dynamic port blocks. Other types of port ranges are well-known ports and static port forwards.
The 7750 SR supports Network Address (and port) Translation (NAPT) to provide continuity of legacy IPv4 services during the migration to native IPv6. By equipping the multi-service ISA (MS ISA) in an IOM3-XP, the 7750 SR can operate in two different modes, known as:
These two modes both perform source address and port translation, as commonly deployed for shared Internet access. The 7750 SR with NAT is used to provide consumer broadband or business Internet customers with access to IPv4 Internet resources using a shared pool of IPv4 addresses, such as may occur around the forecast IPv4 exhaustion. During this time, it is expected that native IPv6 services will still be growing and a significant amount of Internet content will remain IPv4.
Network Address Translation devices modify the IP headers of packets between a host and server, changing some or all of the source address, destination address, source port (TCP/UDP), destination port (TCP/UDP), or ICMP query ID (for ping). The 7750 SR in both NAT modes performs Source Network Address and Port Translation (S-NAPT). S-NAPT devices are commonly deployed in residential gateways and enterprise firewalls to allow multiple hosts to share one or more public IPv4 addresses to access the Internet. In the context of NAT, the common terms inside and outside refer to devices inside the NAT (that is, behind or masqueraded by the NAT) and outside the NAT, on the public Internet.
TCP/UDP connections use ports for multiplexing, with 65,536 ports available for every IP address. When many hosts try to share a single public IP address, there is a chance of port collision, where two different hosts use the same source port for a connection. S-NAPT devices avoid this collision by translating the source port and tracking it in a stateful manner. All S-NAPT devices are stateful in nature and must monitor connection establishment and traffic to maintain translation mappings. The 7750 SR NAT implementation does not use the well-known port range (1 to 1023).
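The stateful port-translation behavior above can be sketched as follows. This is a minimal illustration, not the 7750 SR implementation; the class and data structures are assumptions.

```python
# Minimal sketch of stateful S-NAPT source-port selection. Two inside hosts
# that pick the same source port are given distinct outside ports, and an
# existing mapping is reused for subsequent packets of the same flow.

class Snapt:
    def __init__(self, outside_ip):
        self.outside_ip = outside_ip
        self.mappings = {}     # (inside_ip, inside_port, proto) -> (outside_ip, outside_port)
        self.used_ports = set()

    def translate(self, inside_ip, inside_port, proto):
        key = (inside_ip, inside_port, proto)
        if key in self.mappings:          # stateful: reuse the existing mapping
            return self.mappings[key]
        # Skip the well-known range 1-1023, as the text above describes.
        port = next(p for p in range(1024, 65536) if p not in self.used_ports)
        self.used_ports.add(port)
        self.mappings[key] = (self.outside_ip, port)
        return self.mappings[key]

nat = Snapt("192.0.2.1")
# Two inside hosts using the same source port no longer collide:
a = nat.translate("10.0.0.1", 5000, "udp")
b = nat.translate("10.0.0.2", 5000, "udp")
```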
In most circumstances, S-NAPT requires the inside host to establish a connection to the public Internet host or server before a mapping and translation occurs. With the initial outbound IP packet, the S-NAPT knows the inside IP, inside port, remote IP, remote port, and protocol. With this information, the S-NAPT device can select an IP and port combination (referred to as the outside IP and outside port) from its pool of addresses and create a unique mapping for this flow of data.
Any traffic returned from the server uses the outside IP and outside port in the destination IP/port fields, matching the unique NAT mapping. The mapping then provides the inside IP and inside port for translation.
The requirement to create a mapping with inside port and IP, outside port and IP, and protocol generally prevents new connections from being established from the outside to the inside, as may occur when an inside host wishes to act as a server.
Applications which operate as servers (such as HTTP or SMTP) or peer-to-peer applications can have difficulty operating behind an S-NAPT, because traffic from the Internet can reach the NAT without a mapping in place.
Different methods can be employed to overcome this, including:
The 7750 SR supports all three methods, following the best-practice RFCs for TCP (RFC 5382, NAT Behavioral Requirements for TCP) and UDP (RFC 4787, Network Address Translation (NAT) Behavioral Requirements for Unicast UDP). Port Forwarding is supported on the 7750 SR to allow servers which operate on well-known ports below 1024 (such as HTTP and SMTP) to request the appropriate outside port for permanent allocation.
STUN is facilitated by the support of Endpoint-Independent Filtering and Endpoint-Independent Mapping (RFC 4787) in the NAT device, allowing STUN-capable applications to detect the NAT and allow inbound P2P connections for that specific application. Many new SIP clients and IM chat applications are STUN capable.
An Application Layer Gateway (ALG) allows the NAT to monitor the application running over TCP or UDP and make appropriate changes to the NAT translations. The 7750 SR has an FTP ALG enabled, following the recommendation of the IETF BEHAVE RFC for NAT (RFC 5382).
Even with these three mechanisms, some applications will still experience difficulty operating behind a NAT. As this is an industry-wide issue, forums such as UPnP, the IETF, and operator and vendor communities are seeking technical alternatives for application developers to traverse NAT (including STUN support). In many cases, the alternative of an IPv6-capable application will give better long-term support without the cost or complexity associated with NAT.
Large Scale NAT represents the most common deployment of S-NAPT in carrier networks today; it is already employed by mobile operators around the world for handset access to the Internet.
A Large Scale NAT is typically deployed in a central network location with two interfaces, the inside towards the customers, and the outside towards the Internet. A Large Scale NAT functions as an IP router and is located between two routed network segments (the ISP network and the Internet).
Traffic can be sent to the Large Scale NAT function on the 7750 SR using IP filters (ACLs) applied to SAPs or by installing static routes with a next-hop of the NAT application. These two methods allow for increased flexibility in deploying the Large Scale NAT, especially in environments where IP/MPLS VPNs are used; in that case, the NAT function can be deployed on a single PE and perform NAT for any number of other PEs by simply exporting the default route.
The 7750 SR NAT implementation supports NAT in the base routing instance and in VPRNs; through NAT, traffic may originate in one VPRN (the inside) and leave through another VPRN or the base routing instance (the outside). This technique can be employed to provide customers of an IP/MPLS VPN with Internet access by introducing a default static route in the customer VPRN and NATing it into the Internet routing instance.
As Large Scale NAT is deployed between two routed segments, the IP addresses allocated to hosts on the inside must be unique to each host within the VPRN. While RFC 1918 private addresses have typically been used for this in enterprise or mobile environments, challenges can occur in fixed residential environments where a subscriber has an existing S-NAPT in their residential gateway. In these cases, the RFC 1918 private address in the home network may conflict with the address space assigned to the residential gateway WAN interface. Some of these issues are documented in draft-shirasaki-nat444-isp-shared-addr-02. Should a conflict occur, many residential gateways will fail to forward IP traffic.
The S-NAPT service on the 7750 SR BNG incorporates a port range block feature to address the scalability of a NAT mapping solution. With a single BNG capable of hundreds of thousands of NAT mappings every second, logging each mapping as it is created and destroyed for later retrieval (as may be required by law enforcement) could quickly overwhelm the fastest of databases and messaging protocols. Port range blocks address the issues of logging and customer location functions by allocating a block of contiguous outside ports to a single subscriber. Rather than logging each NAT mapping, a single log entry is created when the first mapping is created for a subscriber and a final log entry when the last mapping is destroyed. This can reduce the number of log entries by 5000x or more. An added benefit is that, as the range is allocated on the first mapping, external applications or customer location functions may be populated with this data to perform real-time subscriber identification, rather than having to query the NAT for the subscriber identity in real time and possibly delay applications.
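The logging reduction above is simple arithmetic; the numbers below are illustrative assumptions chosen to show how a block-level log can replace thousands of per-mapping logs.

```python
# Back-of-the-envelope sketch: per-mapping logging vs. per-block logging.
# Values are illustrative, not measured.

mappings_per_block = 5000          # mappings a subscriber creates over a block's lifetime
per_mapping_logs = 2 * mappings_per_block   # a create log and a destroy log per mapping
per_block_logs = 2                 # one log when the block is allocated, one when freed

reduction = per_mapping_logs // per_block_logs
```

With these assumed numbers, the reduction factor is 5000, matching the order of magnitude quoted in the text.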
Port range blocks are configurable as part of outside pool configuration, allowing the operator to specify the number of ports allocated to each subscriber when a mapping is created. Once a range is allocated to the subscriber, these ports are used for all outbound dynamic mappings and are assigned in a random manner to minimize the predictability of port allocations (draft-ietf-tsvwg-port-randomization-05).
Port range blocks also serve another useful function in a Large Scale NAT environment, and that is to manage the fair allocation of the shared IP resources among different subscribers.
When a subscriber exhausts all ports in their block, further mappings are prohibited. As with any enforcement system, some exceptions are allowed, and the NAT application can be configured with reserved ports to allow high-priority applications access to outside port resources even when the block has been exhausted by low-priority applications.
Reserved ports allow an operator to set aside a small number of ports for designated applications should a port range block be exhausted. Such a scenario may occur when a subscriber is unwittingly subjected to a virus or engaged in extreme cases of P2P file transfers. In these situations, rather than blocking all new mappings indiscriminately, the 7750 SR NAT application allows operators to nominate a number of reserved ports and then designate a 7750 SR forwarding class as containing high-priority traffic for the NAT application. Whenever traffic matching a priority-session forwarding class reaches the NAT application, reserved ports are consumed to improve the chances of success. Priority sessions could be used by the operator for services such as DNS, web portal, e-mail, and VoIP, to permit these applications even when a subscriber has exhausted their ports.
The outside IP address is always shared between the subscriber's port forwards (static or via PCP) and its dynamically allocated port blocks, provided that the port from the port forward is in the range above 1023. This behavior can lead to starvation of dynamic port blocks for the subscriber. An example of this scenario is shown in Figure 52.
Eventually the PCs in Home 1 come online and try to connect to the Internet. Due to the dynamic port block exhaustion for the IP address 3.3.3.1 (which is mandated by the static port forward for the Web Server), the dynamic port block allocation fails and consequently the PCs cannot access the Internet. There is no additional attempt within the CGN to allocate another outside IP address. In the CGN there is no distinction between the PCs in Home 1 and the Web Server when it comes to the source IP address; they both share the same source IP address 2.2.2.1 on the CPE.
To prevent starvation of dynamic port blocks for subscribers that utilize port forwards, a dynamic port block (or blocks) is reserved during the lifetime of the port forward. Those reserved dynamic port blocks are associated with the same subscriber that created the port forward. However, a log is not generated until the dynamic port block is actually used and mappings within that block are created.
At the time of the port forward creation, the dynamic port block will be reserved in the following fashion:
The reserved dynamic port block (even without any mappings) continues to be associated with the subscriber as long as a port forward for the subscriber is present. The log (syslog or RADIUS) is generated only when there is no active mapping within the dynamic port block and all port forwards for the subscriber have been deleted.
Additional considerations with dynamic port block reservation:
Creating a NAT mapping is only one half of the problem; removing a NAT mapping at the appropriate time maximizes the shared port resource. Having ports mapped when an application is no longer active reduces solution scale and may impact the customer experience should they exhaust their port range block. The NAT application provides timeout configuration for TCP, UDP, and ICMP.
TCP state is tracked for all TCP connections, supporting both three-way handshake and simultaneous TCP SYN connections. Separate and configurable timeouts exist for TCP SYN, TCP transition (between SYN and Open), established and time-wait state. Time-wait assassination is supported and enabled by default to quickly remove TCP mappings in the TIME WAIT state.
UDP does not have the concept of connection state and is subject to a simple inactivity timer. Company-sponsored research into applications and NAT behavior suggested that some applications, like the BitTorrent Distributed Hash Table (DHT) protocol, can make a large number of outbound UDP connections that are unsuccessful. Rather than waiting the default five (5) minutes to time these out, the 7750 SR NAT application supports a udp-initial timeout which defaults to 15 seconds. When the first outbound UDP packet is sent, the 15-second timer starts; only after subsequent packets (inbound or outbound) does the default UDP timer become active, greatly reducing the number of UDP mappings.
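The two-stage UDP timeout above can be sketched as a small state machine. This is an illustrative model only; the class, field names, and time handling are assumptions, not the SR OS implementation.

```python
# Sketch of the two-stage UDP timeout: a short initial timeout (15 s) applies
# until any subsequent packet is seen, after which the normal inactivity
# timeout (5 min) applies. Times are plain seconds for simplicity.

UDP_INITIAL = 15    # udp-initial timeout default
UDP_IDLE = 300      # default UDP inactivity timeout (5 minutes)

class UdpMapping:
    def __init__(self, now):
        self.last_seen = now
        self.established = False    # becomes True on any subsequent packet

    def packet(self, now):
        self.last_seen = now
        self.established = True

    def expired(self, now):
        timeout = UDP_IDLE if self.established else UDP_INITIAL
        return now - self.last_seen > timeout

m = UdpMapping(now=0)
# A one-shot outbound probe with no reply times out after 15 s, not 5 min:
unanswered_gone = m.expired(now=20)
```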
It is possible to define watermarks to monitor the actual usage of sessions and/or ports.
For each watermark, a high and a low value must be set. Once the high value is reached, a notification is sent. As soon as the usage drops below the low watermark, another notification is sent.
Watermarks can be defined at the nat-group, pool, and policy levels.
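The high/low watermark pair described above amounts to simple hysteresis, which avoids a flood of notifications when usage oscillates around a single threshold. A minimal sketch, with illustrative threshold values and message strings (assumptions, not SR OS output):

```python
# Hysteresis sketch of session/port usage watermarks: one notification when
# usage crosses the high value, one when it falls back below the low value,
# and nothing while it moves in between.

class Watermark:
    def __init__(self, high, low):
        self.high, self.low = high, low
        self.raised = False

    def update(self, usage):
        """Return a notification string when a threshold is crossed, else None."""
        if not self.raised and usage >= self.high:
            self.raised = True
            return "high-watermark reached"
        if self.raised and usage <= self.low:
            self.raised = False
            return "usage below low-watermark"
        return None

wm = Watermark(high=90, low=70)
events = [wm.update(u) for u in (50, 95, 92, 85, 69)]
```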
NAT is supported for DHCP, PPPoE, and L2TP hosts; static and ARP hosts are not supported.
In an effort to address issues of conflicting address space raised in draft-shirasaki-nat444-isp-shared-addr-02, an enhancement to Large Scale NAT was co-developed to give every broadband subscriber their own NAT mapping table, yet still share a common outside pool of IPs.
Layer-2 Aware (or subscriber aware) NAT is combined with Enhanced Subscriber Management on the 7750 SR BNG to overcome the issues of colliding address space between home networks and the inside routed network between the customer and Large Scale NAT.
Layer-2 Aware NAT permits every broadband subscriber to be allocated the exact same IPv4 address on their residential gateway WAN link and then proceeds to translate this into a public IP through the NAT application. In doing so, L2-Aware NAT avoids the issues of colliding address space raised in draft-shirasaki without any change to the customer gateway or CPE.
Layer-2-Aware NAT is supported on any of the ESM access technologies, including PPPoE, IPoE (DHCP), and L2TP LNS. For IPoE, both n:1 (VLAN per service) and 1:1 (VLAN per subscriber) models are supported. A subscriber device operating with L2-Aware NAT needs no modification or enhancement: existing address mechanisms (DHCP or PPP/IPCP) are identical to a public IP service, and the 7750 SR BNG simply translates all IPv4 traffic into a pool of IPv4 addresses, allowing many L2-Aware NAT subscribers to share the same IPv4 address.
More information on L2-Aware NAT can be found in draft-miles-behave-l2nat-00.
In 1:1 NAT, each source IP address is translated in 1:1 fashion to a corresponding outside IP address. However, the source ports are passed transparently without translation.
The mapping between the inside IP addresses and outside IP addresses in 1:1 NAT supports two modes:
The dynamic version of 1:1 NAT is protocol dependent. Only the TCP, UDP, and ICMP protocols are allowed to traverse such a NAT. All other protocols are discarded, with the exception of PPTP with ALG; in this case, only GRE traffic associated with PPTP is allowed through dynamic 1:1 NAT.
The static version of 1:1 NAT is protocol agnostic. This means that all IP-based protocols are allowed to traverse static 1:1 NAT.
The following is applicable to 1:1 NAT:
In static 1:1 NAT, inside IP addresses are statically mapped to the outside IP addresses. In this fashion, devices on the outside can predictably initiate traffic to the devices on the inside.
Static configuration is based on the CLI concepts used in deterministic NAT. For example:
Static mappings are configured according to the map statements. A map statement can be configured manually by the operator or automatically by the system. IP addresses from the automatically generated map statements are sequentially mapped onto available outside IP addresses in the pool:
The following mappings apply to the example from above:
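The sequential mapping behavior can be sketched as follows. The prefixes and outside pool addresses below are purely hypothetical illustrations (they are not taken from the CLI example this text refers to).

```python
# Sketch of automatic 1:1 map generation: each inside address in a prefix is
# mapped sequentially onto consecutive available outside pool addresses.

import ipaddress

def sequential_map(inside_prefix, outside_first):
    """Map every address in inside_prefix onto outside addresses starting
    at outside_first, in order."""
    inside = list(ipaddress.ip_network(inside_prefix))
    first = ipaddress.ip_address(outside_first)
    return {str(ip): str(first + i) for i, ip in enumerate(inside)}

# Hypothetical example: a /30 of inside hosts onto an outside pool at 192.0.2.0
mapping = sequential_map("10.10.10.0/30", "192.0.2.0")
```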
Although static 1:1 NAT is protocol agnostic, state maintenance for TCP and UDP traffic is still required in order to support ALGs. Because of this, the existing scaling limits related to the number of supported flows still apply.
Protocol agnostic behavior in 1:1 NAT is a property of a NAT pool:
The application agnostic command is a pool create-time parameter. This command automatically pre-sets the following pool parameters:
Once pre-set, these parameters cannot be changed while the pool is operating in protocol-agnostic mode.
The deterministic port-reservation 65536 command configures the pool to operate in static (or deterministic) mode.
Parameters in static 1:1 NAT can be changed according to the following rules:
For best traffic distribution over ISAs, the value of the classic-lsn-max-subscriber-limit parameter should be set to 1.
This means that traffic is load-balanced over ISAs based on inside IP addresses. In static 1:1 NAT this is possible because the subscriber-limit parameter at the pool level is preset to a fixed value of 1.
However, if 1:1 static NAT is used simultaneously with regular (many-to-one) deterministic NAT, where the subscriber-limit parameter can be set to a value greater than 1, then the classic-lsn-max-subscriber-limit parameter also has to be set to a value greater than 1. The consequence of this is that traffic is load-balanced based on consecutive blocks of IP addresses (subnets) rather than individual IP addresses. Further information on this topic is provided in the sections describing deterministic NAT behavior.
Traffic match criteria used in the selection of a specific NAT policy in static 1:1 NAT (the deterministic part of the configuration) must not overlap with the traffic match criteria used in the selection of a specific NAT policy in filters or in the destination-prefix statement (these are used for traffic diversion to NAT). Otherwise, traffic is dropped in the ISA.
A specific NAT policy in this context refers to a non-default NAT policy, that is, a NAT policy directly referenced in a filter, in a destination-prefix command, or in a deterministic prefix command.
The following example is used to clarify this point further:
Traffic is diverted to NAT using the specific nat-policy pol-2:
The deterministic (source) prefix 10.10.10.0/30 is configured to be mapped to the specific nat-policy pol-1, which points to a protocol-agnostic 1:1 NAT pool.
A packet received in the ISA has srcIP 10.10.10.1 and destIP 192.168.10.10.
If no NAT mapping for this traffic exists in the ISA, a NAT policy (and with it the NAT pool) needs to be determined in order to create the mapping. Traffic is diverted to NAT using nat-policy pol-2, while the deterministic mapping says that nat-policy pol-1 should be used (and thus a different pool from the one referenced in nat-policy pol-2). Due to this specific NAT policy conflict, the traffic is dropped in the ISA.
In order to successfully pass traffic between two subnets through NAT while simultaneously using static 1:1 NAT and regular LSN44, a default (non-specific) NAT policy can be used for regular LSN44.
For example:
In this case, the four hosts from the prefix 10.10.10.0/30 are mapped in 1:1 fashion to four IP addresses from the pool referenced in the specific nat-policy pol-1, while all other hosts from the 10.10.10.0/24 network are mapped to the NAPT pool referenced by the default nat-policy pol-2. In this fashion, a nat-policy conflict is avoided.
In summary, a specific nat-policy (in a filter, a destination-prefix command, or a deterministic prefix command) always takes precedence over the default nat-policy. However, traffic that matches classification criteria (in a filter, a destination-prefix command, or a deterministic prefix command) leading to multiple specific nat-policies is dropped.
Static 1:1 NAT mappings are explicitly configured, and therefore their lifetime is tied to the configuration.
The logging mechanism for static mappings is the same as in deterministic NAT: configuration changes are logged via syslog, enhanced with reverse querying on the system.
Static 1:1 NAT is supported only for LSN44 (no support for DS-Lite/NAT64 or L2-aware NAT).
In 1:1 NAT, certain ICMP messages contain an additional IP header embedded in the ICMP header. For example, when an ICMP message is sent to the source due to the inability to deliver a datagram to its destination, the ICMP-generating node includes the original IP header of the packet plus 64 bits of the original datagram. This information helps the source node match the ICMP message to the process associated with it.
When such messages are received in the downstream direction (on the outside), 1:1 NAT recognizes them and changes the IP address not only in the outside header but also in the IP header embedded within the ICMP message. In other words, a lookup in the downstream direction is performed in the ISA to determine whether the packet is an ICMP packet of a specific type. Depending on the outcome, the IP address in the embedded header is changed (reverted to the original source IP address).
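The embedded-header fix-up described above can be sketched with a simplified packet model. The dictionary-based packet structure and addresses here are assumptions for illustration; a real implementation operates on raw IP/ICMP headers.

```python
# Conceptual sketch of the downstream ICMP fix-up in 1:1 NAT: translate the
# outer destination (outside -> inside) and also revert the source address in
# the embedded original IP header, so the inside host can match the error to
# the datagram it originally sent.

def translate_icmp_error(pkt, outside_ip, inside_ip):
    assert pkt["dst"] == outside_ip
    pkt["dst"] = inside_ip                  # outer header: outside -> inside
    embedded = pkt.get("embedded")          # original datagram's IP header, if present
    if embedded and embedded["src"] == outside_ip:
        embedded["src"] = inside_ip         # revert to the original source IP
    return pkt

# Hypothetical ICMP "destination unreachable" arriving on the outside:
pkt = {"dst": "192.0.2.1",
       "embedded": {"src": "192.0.2.1", "dst": "198.51.100.7"}}
translate_icmp_error(pkt, "192.0.2.1", "10.0.0.5")
```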
Messages which carry the original IP header within the ICMP header are:
In deterministic NAT, the subscriber is deterministically mapped into an outside IP address and a port block. The algorithm that performs this deterministic mapping is reversible, which means that a NAT subscriber can be uniquely derived from the outside IP address and the outside port (and the routing instance). Thus, logging in deterministic NAT is not needed.
The deterministic [subscriber <-> outside-ip, deterministic-port-block] mapping can be automatically extended by a dynamic port block in case the deterministic port block becomes exhausted of ports. Extending the original deterministic port block of the NAT subscriber by a dynamic port block yields a satisfactory compromise between deterministic and non-deterministic NAT. There is no logging as long as the translations are in the domain of the deterministic NAT. Once a dynamic port block is allocated for port extension, logging is automatically activated.
NAT subscribers in deterministic NAT are not assigned an outside IP address and deterministic port block on a first-come-first-served basis. Instead, deterministic mappings are pre-created at configuration time, regardless of whether the NAT subscriber is active or not. In other words, overbooking of the outside address pool is not supported in deterministic NAT. Consequently, all configured deterministic subscribers (for example, inside IP addresses in LSN44 or IPv6 addresses/prefixes in DS-Lite) are guaranteed access to NAT resources.
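A reversible mapping of this kind can be sketched with simple modular arithmetic. All parameter values, names, and the pool layout below are illustrative assumptions, not the actual SR OS formula.

```python
# Sketch of a reversible deterministic mapping: each subscriber index maps by
# formula to an (outside IP, deterministic port block), and the subscriber can
# be recovered from any (outside IP, port) pair without logging.

SUBS_PER_IP = 128     # subscriber-limit: must be a power of 2
BLOCK_SIZE = 400      # deterministic port-reservation (ports per block)
DET_BASE = 4024       # first deterministic port (after well-known + static range)

def forward(sub_index, outside_ips):
    """Subscriber index -> (outside IP, first port, last port) of its block."""
    ip = outside_ips[sub_index // SUBS_PER_IP]
    start = DET_BASE + (sub_index % SUBS_PER_IP) * BLOCK_SIZE
    return ip, start, start + BLOCK_SIZE - 1

def reverse(ip, port, outside_ips):
    """(outside IP, outside port) -> subscriber index, by formula alone."""
    return outside_ips.index(ip) * SUBS_PER_IP + (port - DET_BASE) // BLOCK_SIZE

ips = ["192.0.2.1", "192.0.2.2"]
mapping = forward(200, ips)
subscriber = reverse(mapping[0], mapping[1], ips)
```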
The routers support deterministic LSN44 and deterministic DS-Lite. The basic deterministic NAT principle applies equally to both NAT flavors. The difference between the two stems from the difference in the interpretation of the subscriber: in LSN44 a subscriber is an IPv4 address, whereas in DS-Lite the subscriber is an IPv6 address or prefix (configuration dependent).
With the exception of classic-lsn-max-subscriber-limit and dslite-max-subscriber-limit commands in the inside routing context, the deterministic NAT configuration blocks are for the most part common to LSN44 and DS-Lite.
Deterministic DS-Lite, covered at the end of this section, focuses on the features specific to DS-Lite.
The outside pools in deterministic NAT can contain an arbitrary number of address ranges, where each address range can contain an arbitrary number of IP addresses (up to the ISA maximum).
The maximum number of NAT subscribers that can be mapped to a single outside IP address is configurable using the subscriber-limit command under the pool hierarchy. For deterministic NAT, this number is restricted to a power of 2 (2^n). The consequence of this is that the number of NAT subscribers must be organized, configuration-wise, in ranges whose boundaries are powers of 2.
For example, in LSN44, where the NAT subscriber is an IP address, the deterministic subscribers would be configured with prefixes (for example, 10.10.10.0/24 for 256 subscribers) rather than an IP address range that could contain an arbitrary number of addresses (for example, 10.10.10.10 to 10.10.10.50).
On the other hand, in DS-Lite the deterministic subscribers are for the most part already determined by the prefix length set with the subscriber-prefix-length command under the DS-Lite configuration node.
The number of subscribers per outside IP (the subscriber-limit command, 2^n) multiplied by the number of IP addresses over all address ranges in an outside pool determines the maximum number of subscribers that a deterministic pool can support.
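The capacity rule above is a single multiplication; the pool sizes below are illustrative assumptions.

```python
# Deterministic pool capacity: subscribers-per-IP (a power of 2) multiplied by
# the total number of outside IP addresses across all address ranges in the pool.

subscriber_limit = 128                # must be 2^n in deterministic NAT
address_range_sizes = [256, 64]       # IPs in each address-range of the pool (assumed)

max_subscribers = subscriber_limit * sum(address_range_sizes)
```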
In deterministic NAT, the outside pool can be shared amongst subscribers from multiple routing instances. Also, NAT subscribers from a single routing instance can be selectively mapped to different outside pools.
The number of deterministic mappings that a single outside IP address can sustain is determined through the configuration of the outside pool.
The port allocation per outside IP address is shown in Figure 54.
The well-known ports are predetermined and are in the range 0 to 1023.
The upper limit of the port range for static port forwards (wildcard range) is determined by the existing port-forwarding-range command.
The range of ports allocated for deterministic mappings (DetP) is determined by multiplying the number of subscribers per outside IP (the subscriber-limit command) by the number of ports per deterministic block (the deterministic>port-reservation command). The number of subscribers per outside IP in deterministic NAT must be a power of 2 (2^n).
The remaining ports, extending from the end of the deterministic port range to the end of the total port range (65,535), are used for dynamic port allocation. The size of each dynamic port block is determined with the existing port-reservation command.
The deterministic>port-reservation command enables the deterministic mode of operation for the pool.
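The port partitioning described above (well-known ports, static port-forwarding range, deterministic range, dynamic range) can be computed directly. The parameter values below are illustrative assumptions.

```python
# Sketch of the per-outside-IP port partitioning: well-known ports, the static
# port-forwarding range, the deterministic range (DetP = subscribers x ports
# per deterministic block), and the remaining dynamic range.

TOTAL_PORTS = 65536
WELL_KNOWN = 1024              # ports 0-1023
forwarding_range_end = 4023    # port-forwarding-range upper limit (assumed)
subscriber_limit = 128         # subscribers per outside IP, must be 2^n
det_port_reservation = 400     # ports per deterministic block (assumed)

det_start = forwarding_range_end + 1
det_size = subscriber_limit * det_port_reservation
dyn_start = det_start + det_size
dyn_ports = TOTAL_PORTS - dyn_start    # ports left for dynamic blocks
```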
Examples:
The following shows three examples of deterministic Large Scale NAT44 where the requirements are:
In the first case, the ideal case is examined, where an arbitrary number of subscribers per outside IP address is allocated according to the requirements outlined above. Then the limitation that the number of subscribers must be a power of 2 is factored in.
Well-Known Ports* | Static Port Range* | Number of Ports in Deterministic Block* | Number of Deterministic Blocks | Number of Ports in Dynamic Block* | Number of Dynamic Blocks | Number of Inside IP Addresses per Outside IP Address* | Block Limit per Inside IP Address* | Wasted Ports |
0-1023 | 1024-4023 | 300 | 153 | 100 | 153 | 153 | 5 | 312 |
0-1023 | 1024-4023 | 500 | 102 | 100 | 102 | 102 | 5 | 312 |
0-1023 | 1024-4023 | 700 | 76 | 100 | 76 | 76 | 5 | 712 |
The example in Table 30 shows how port ranges would be carved out in an ideal scenario.
* — Signifies the fixed parameters (requirements).
The other values are calculated according to the fixed requirements.
port-block-limit includes the deterministic port block plus all dynamic port-blocks.
Next, a more realistic example is considered, in which the number of subscribers is equal to 2^n. The ratios between deterministic and dynamic ports per port-block from the example above (3:1, 5:1 and 7:1) are preserved. In this case, the number of ports per port-block is dictated by the number of subscribers per outside IP address.
Well-Known Ports* | Static Port Range* | Number of Ports in Deterministic Block* | Number of Deterministic Blocks | Number of Ports in Dynamic Block* | Number of Dynamic Blocks | Number of Inside IP Addresses per Outside IP Address* | Block Limit per Inside IP Address* | Wasted Ports |
0-1023 | 1024-4023 | 180 | 256 | 60 | 256 | 256 | 5 | 72 |
0-1023 | 1024-4023 | 400 | 128 | 80 | 128 | 128 | 5 | 72 |
0-1023 | 1024-4023 | 840 | 64 | 120 | 64 | 64 | 5 | 72 |
* — Signifies the fixed parameters (requirements).
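The arithmetic behind these rows can be sketched as follows. This is an illustrative sketch only: the function name and arguments are hypothetical, and the 61,512-port budget assumes the 0-4023 well-known/static range described above.

```python
def block_sizes(total_ports, subscribers, det_ratio):
    """Split the per-subscriber port budget into deterministic and dynamic
    blocks while preserving a det_ratio:1 deterministic-to-dynamic split."""
    per_sub = total_ports // subscribers      # ports available to each subscriber
    dyn = per_sub // (det_ratio + 1)          # dynamic block size
    det = dyn * det_ratio                     # deterministic block size
    wasted = total_ports - subscribers * (det + dyn)
    return det, dyn, wasted

# 65,536 total ports minus the 0-4023 well-known/static range
budget = 65536 - 4024                         # 61,512 det+dyn ports

for subs, ratio in ((256, 3), (128, 5), (64, 7)):
    print(subs, block_sizes(budget, subs, ratio))
```

Running this reproduces the three table rows above, including the 72 wasted ports per outside IP address in each case.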
The final example is similar to Table 30, with the difference that the number of ports per deterministic block is kept fixed, as in the original example (300, 500 and 700).
Well-Known Ports | Static Port Range | Number of Ports in Deterministic Block | Number of Deterministic Blocks | Number of Ports in Dynamic Block | Number of Dynamic Blocks | Number of Inside IP Addresses per Outside IP Address | Block Limit per Inside IP Address | Wasted Ports |
0-1023 | 1024-4023 | 300 | 128 | 180 | 128 | 128 | 5 | 72 |
0-1023 | 1024-4023 | 500 | 64 | 461 | 64 | 64 | 5 | 8 |
0-1023 | 1024-4023 | 700 | 64 | 261 | 64 | 64 | 5 | 8 |
The three examples above give a perspective on the size of deterministic and dynamic port blocks in relation to the number of subscribers (2^n) per outside IP address. Operators should run a similar dimensioning exercise before configuring deterministic NAT.
The CLI for the highlighted case in Table 30 is displayed:
Where:
128 subs * 300 ports = 38,400 deterministic port range
128 subs * 180 ports = 23,040 dynamic port range
Det+dyn available ports = 65,536 - 4,024 = 61,512 ports
Det+dyn usable ports = 128*300 + 128*180 = 61,440 ports
72 ports per outside IP are wasted.
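The arithmetic above can be reproduced as a sanity check. The variable names below are illustrative, not CLI keywords:

```python
subscriber_limit = 128          # subscribers per outside IP (2^7)
det_block = 300                 # deterministic ports per subscriber
dyn_block = 180                 # ports per dynamic block
port_block_limit = 5            # 1 deterministic + 4 dynamic blocks

available = 65536 - 4024                        # det+dyn ports: 61,512
det_range = subscriber_limit * det_block        # 38,400 deterministic ports
dyn_range = subscriber_limit * dyn_block        # 23,040 dynamic ports
wasted = available - (det_range + dyn_range)    # 72 wasted ports

# maximum ports a single subscriber can hold under this configuration
ports_per_sub = det_block + (port_block_limit - 1) * dyn_block   # 1,020 ports
```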
This configuration allows 128 subscribers (inside IP addresses in LSN44) for each outside address (a compression ratio of 128:1), with each subscriber being assigned up to 1020 ports (300 deterministic ports and 720 dynamic ports over 4 dynamic port blocks).
The outside IP addresses in the pool and their corresponding port ranges are organized as shown in Figure 55.
Assuming that the above graph depicts an outside deterministic pool, the number of subscribers that can be accommodated by this deterministic pool is represented by purple squares (number of IP addresses in an outside pool * subscriber-limit). The number of subscribers across all configured prefixes on the inside that are mapped to the same deterministic pool must be less than the outside pool can accommodate. In other words, an outside address pool in deterministic NAT cannot be oversubscribed.
The following is a CLI representation of a deterministic pool definition including the outside IP ranges:
The common building block on the inside in the deterministic LSN44 configuration is an IPv4 prefix. The NAT subscribers (inside IPv4 addresses) from the configured prefix are deterministically mapped to the outside IP addresses and corresponding deterministic port-blocks. Any inside prefix in any routing instance can be mapped to any pool in any routing instance (including the one in which the inside prefix is defined).
The mapping between the inside prefix and the deterministic pool is achieved through a NAT policy that can be referenced per each individual inside IPv4 prefix. IPv4 addresses from the prefixes on the inside will be distributed over the IP addresses defined in the outside pool referenced by the NAT policy.
The mapping itself is represented by the map command under the prefix hierarchy:
The purpose of the map statement is to split the number of subscribers within the configured prefix over available sequences of outside IP addresses. The key parameter that governs mappings between the inside IPv4 addresses and outside IPv4 addresses in deterministic LSN44 is defined by the outside>pool>subscriber-limit command. This parameter must be a power of 2 and it limits the maximum number of NAT subscribers that can be mapped to the same outside IP address.
The following rules govern the configuration of the map statement:
If the number of subscribers (IP addresses in LSN44) in the map statement is larger than the subscriber-limit per outside IP, the subscribers must be split over a block of consecutive outside IP addresses, where the outside-ip-address in the map statement represents only the first outside IP address in that block.
The number of subscribers (range of inside IP addresses in LSN44) in the map statement does not have to be a power of 2. Rather, it must be a multiple of a power of two, m * 2^n, where m is the number of consecutive outside IP addresses to which the subscribers are mapped and 2^n is the subscriber-limit per outside IP.
An example of the map statement is given below:
In this case, the configured 10.0.0.0/24 prefix is represented by the range of IP addresses in the map statement (10.0.0.0-10.0.0.255). Since the range of 256 IP addresses in the map statement cannot be mapped into a single outside IP address (subscriber-limit=128), this range must be further implicitly split within the system and mapped into multiple outside IP addresses. The implicit split will create two IP address ranges, each with 128 IP addresses (10.0.0.0/25 and 10.0.0.128/25) so that addresses from each IP range are mapped to one outside IP address. The hosts from the range 10.0.0.0-10.0.0.127 will be mapped to the first IP address in the pool (128.251.0.1) as explicitly stated in the map statement (to statement). The hosts from the second range, 10.0.0.128-10.0.0.255 will be implicitly mapped to the next consecutive IP address (128.251.0.2).
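Under the stated assumptions (subscriber-limit of 128, 300-port deterministic blocks starting immediately after the 0-4023 static range), the implicit split can be sketched as follows. `det_map` is a hypothetical helper, and the exact ordering of deterministic port blocks within an outside IP address is an assumption made for illustration:

```python
import ipaddress

def det_map(inside_ip, range_start, outside_base, sub_limit, det_start, block_size):
    """Map an inside IPv4 host to its outside IP address and deterministic
    port block, following the implicit 2^n split described above."""
    offset = int(ipaddress.IPv4Address(inside_ip)) - int(ipaddress.IPv4Address(range_start))
    # every sub_limit consecutive inside hosts share one outside IP
    outside = ipaddress.IPv4Address(int(ipaddress.IPv4Address(outside_base)) + offset // sub_limit)
    block = offset % sub_limit                       # subscriber's slot within that outside IP
    first = det_start + block * block_size           # first port of the deterministic block
    return str(outside), (first, first + block_size - 1)
```

For example, host 10.0.0.127 maps to 128.251.0.1 while 10.0.0.128 maps to 128.251.0.2, matching the implicit split described above.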
Alternatively, the map statement can be configured as:
In this case the IP address range in the map statement is split into two non-consecutive outside IP addresses. This gives the operator more freedom in configuring the mappings.
However, the following configuration is not supported:
Considering that the subscriber-limit is 128 (2^n, where n=7), the lower n bits of the start address in the second map statement (map start 10.0.0.64 end 10.0.0.127 to 128.251.0.3) are not 0. This violates rule #1 governing the provisioning of the map statement.
Assuming that we use the same pool with 128 subscribers per outside IP address, the following scenario is also not supported (configured prefix in this example is different than in previous example):
Although the lower n bits in both map statements are 0, both statements reference the same outside IP address (128.251.0.1). This violates rule #2 governing the provisioning of the map statement. Each of the prefixes in this case has to be mapped to a different outside IP address, which leads to underutilization of outside IP addresses (half of the deterministic port-blocks in each of the two outside IP addresses will not be utilized).
In conclusion, considering that the number of subscribers per outside IP (subscriber-limit) must be 2^n, the inside IP addresses from the configured prefix are split on the 2^n boundary so that every deterministic port-block of an outside IP is utilized. If the originally configured prefix contains fewer subscribers (IP addresses in LSN44) than an outside IP address can accommodate (2^n), all subscribers from that prefix are mapped to a single outside IP. Since the outside IP cannot be shared with NAT subscribers from other prefixes, some of the deterministic port-blocks for this particular outside IP address are not utilized.
Each configured prefix can evaluate into multiple map commands. The number of map commands will depend on the length of the configured prefix, the subscriber-limit command and fragmentation of outside address-range within the pool with which the prefix is associated.
Support for multiple MS-ISAs in the nat-group calls for traffic hashing on the inside in the ingress direction. This ensures fair load balancing of the traffic amongst multiple MS-ISAs. While hashing in non-deterministic LSN44 can be performed per source IP address, hashing in deterministic LSN44 is based on subnets instead of individual IP addresses. The length of the hashing subnet is common for all configured prefixes within an inside routing instance. If prefixes from an inside routing instance reference multiple pools, the common hashing prefix length is chosen according to the pool with the highest number of subscribers per outside IP address. This ensures that subscribers mapped to the same outside IP address are always hashed to the same MS-ISA.
In general, load distribution based on hashing is dependent on the sample. A larger and more diverse sample ensures better load balancing. Therefore, the efficiency of load distribution between the MS-ISAs depends on the number and diversity of subnets that the hashing algorithm takes into consideration within the inside routing context.
A simple rule for good load balancing is to configure a large number of subscribers relative to the largest subscriber-limit parameter in any pool that is referenced from the inside routing instance.
The configuration example shown in Figure 56 depicts a case in which prefixes from multiple routing instances are mapped to the same outside pool and, at the same time, prefixes from a single inside routing instance are mapped to different pools (the latter is not supported with non-deterministic NAT).
Note: In this example, the inside prefix 10.10.10.0/24 is present in both VPRN 1 and VPRN 2. In both VPRNs, this prefix is mapped to the same pool, pool-1, with a subscriber-limit of 64. Four outside IP addresses per prefix per VPRN (eight in total) are allocated to accommodate the mappings for all hosts in prefix 10.10.10.0/24. However, the hashing prefix length in VPRN 1 is based on the subscriber-limit 64 (VPRN 1 references only pool-1), while the hashing prefix length in VPRN 2 is based on the subscriber-limit 256 in pool-2 (VPRN 2 references both pool-1 and pool-2, so the larger subscriber-limit must be selected). The consequence is that traffic from subnet 10.10.10.0/24 in VPRN 1 can be load balanced over 4 MS-ISAs (hashing prefix length is 26), while traffic from the subnet 10.10.10.0/24 in VPRN 2 is always sent to the same MS-ISA (hashing prefix length is 24).
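The hashing prefix lengths quoted in this example follow from the largest subscriber-limit of any pool referenced within the routing instance. A minimal sketch (the function name is hypothetical):

```python
import math

def hashing_prefix_length(referenced_subscriber_limits):
    """Inside hashing is performed on subnets large enough to cover all
    subscribers that share one outside IP address: 32 minus log2 of the
    largest subscriber-limit referenced from the routing instance."""
    return 32 - int(math.log2(max(referenced_subscriber_limits)))

# VPRN 1 references only pool-1 (subscriber-limit 64)  -> /26 hashing subnets
# VPRN 2 references pool-1 (64) and pool-2 (256)       -> /24 hashing subnets
vprn1 = hashing_prefix_length([64])
vprn2 = hashing_prefix_length([64, 256])
```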
Distribution of outside IP addresses across the MS-ISAs is dependent on the ingress hashing algorithm. Since traffic from the same subscriber is always pre-hashed to the same MS-ISA, the corresponding outside IP address also must reside on the same ISA. CPM runs the hashing algorithm in advance to determine on which MS-ISA the traffic from particular inside subnet will land and then the corresponding outside IP address (according to deterministic NAT mapping algorithm) will be configured in that particular MS-ISA.
Sharing of the deterministic pools between LSN44 and DS-Lite is supported.
Simultaneous support for deterministic and non-deterministic NAT inside of the same routing instance is supported. However, an outside pool can only be deterministic (although expandable by dynamic port blocks) or non-deterministic at any given time.
Ingress hashing for all NATed traffic within the VRF will in this case be performed based on the subnets driven by the classic-lsn-max-subscriber-limit parameter.
Deterministic NAT does not change the way traffic is selected for the NAT function; it only defines a predictable way for translating subscribers into outside IP addresses and port-blocks.
Traffic is still diverted to NAT using the existing methods:
The inverse mapping can be performed with a MIB locally on the node or externally via a script sourced in the router. In both cases, the input parameters are <outside routing instance, outside IP, outside port>. The output from the mapping is the subscriber and the inside routing context in which the subscriber resides.
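Assuming the same example configuration as above (inside prefix 10.0.0.0/24 mapped to outside 128.251.0.1, subscriber-limit 128, 300-port deterministic blocks starting at port 4024), the known formula for the inverse mapping can be sketched as follows. `det_reverse` is a hypothetical helper, and the block ordering within an outside IP address is an assumption made for illustration:

```python
import ipaddress

def det_reverse(outside_ip, outside_port, outside_base, range_start,
                sub_limit, det_start, block_size):
    """Recover the inside IPv4 host from <outside IP, outside port>."""
    ip_off = int(ipaddress.IPv4Address(outside_ip)) - int(ipaddress.IPv4Address(outside_base))
    # subscriber index = outside IP offset * subscribers-per-IP + port-block index
    sub_idx = ip_off * sub_limit + (outside_port - det_start) // block_size
    return str(ipaddress.IPv4Address(int(ipaddress.IPv4Address(range_start)) + sub_idx))
```

For example, outside address 128.251.0.2 with port 4500 resolves back to inside host 10.0.0.129 under these assumptions.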
Reverse mapping information can be obtained using the following command:
Example:
Output:
Inside router 10 ip 20.0.5.171 -- outside router Base ip 85.0.0.2 port 2333 at Mon Jan 7 10:02:02 PST 2013
Instead of querying the system directly, a Python script can be generated on the router and exported to an external node. This Python script contains the mapping logic for the deterministic NAT configured in the router. The script can then be queried off-line to obtain mappings in either direction. The external node must have Python installed, along with the following modules: getopt, math, os, socket and sys.
The purpose of this off-line approach is to provide fast queries without accessing the router. Exporting the Python script for reverse querying is a manual operation that needs to be repeated every time there is a configuration change in deterministic NAT.
The script is exported outside of the box to a remote location (assuming that writing permissions on the external node are correctly set). The remote location is specified with the following command:
The status of the script is shown using the following command:
Once the script location is specified, the script can be exported to that location with the following command:
This needs to be repeated manually every time the configuration affecting deterministic NAT changes.
The script itself can be run to obtain mapping in forward or backward direction:
The following displays an example in which source addresses are mapped in the following manner:
The forward query for this example will be performed as:
user@external-server:/home/ftp/pub/det-nat-script$ ./det-nat.py -f -s 10 -a 20.0.5.10
Output:
The reverse query for this example will be performed as:
Output:
Every configuration change concerning the deterministic pool will be logged and the script (if configured for export) will be automatically updated (although not exported). This is needed to keep current track of deterministic mappings. In addition, every time a deterministic port-block is extended by a dynamic block, the dynamic block will be logged just as it is today in non-deterministic NAT. The same logic is followed when the dynamic block is de-allocated.
All static port forwards (including PCP) are also logged.
PCP allocates static port forwards from the wildcard-port range.
A subscriber in non-deterministic DS-Lite is defined as a v6 prefix, with the prefix length being configured under the DS-Lite NAT node:
All incoming IPv6 traffic with source IPv6 addresses falling under a unique v6 prefix configured with the subscriber-prefix-length command is considered a single subscriber. As a result, all source IPv4 addresses carried within that IPv6 prefix are mapped to the same outside IPv4 address.
The concept of deterministic DS-Lite is very similar to deterministic LSN44. The DS-lite subscribers (IPv6 addresses/prefixes) are deterministically mapped to outside IPv4 addresses and corresponding deterministic port-blocks.
Although the subscriber in DS-Lite is considered to be either a B4 element (IPv6 address) or the aggregation of B4 elements (IPv6 prefix determined by the subscriber-prefix-length command), only the IPv4 source addresses and ports carried inside of the IPv6 tunnel are actually translated.
The prefix statement for deterministic DS-lite remains under the same deterministic CLI node as for the deterministic LSN44. However, the prefix statement parameters for deterministic DS-Lite differ from the one for deterministic LSN44 in the following fashion:
Example:
In this case, 16 v6 prefixes (from ABCD:FF::/60 to ABCD:FF:00:F0::/60) are considered DS-Lite subscribers. The source IPv4 addresses/ports inside the IPv6 tunnels are mapped into respective deterministic port blocks within an outside IPv4 address according to the map statement.
The map statement contains minor modifications as well. It maps DS-Lite subscribers (IPv6 address or prefix) to corresponding outside IPv4 addresses. Continuing on the previous example:
map start ABCD:FF::/60 end ABCD:FF:00:F0::/60 to 128.251.1.1
The prefix length (/60) in this case MUST be the same as configured subscriber-prefix-length. If we assume that the subscriber-limit in the corresponding pool is set to 8 and outside IP address range is 128.251.1.1 - 128.251.1.10, then the actual mapping is the following:
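Continuing the example (sixteen /60 subscriber prefixes starting at ABCD:FF::/60, subscriber-limit 8, outside range starting at 128.251.1.1), the subscriber-to-outside-IP mapping can be sketched as follows; `dslite_map` is a hypothetical helper:

```python
import ipaddress

def dslite_map(b4_prefix, map_start, outside_base, sub_limit, prefix_len=60):
    """Map a DS-Lite subscriber (an IPv6 /60 prefix) to its outside IPv4
    address: every sub_limit consecutive /60 prefixes share one outside IP."""
    step = 1 << (128 - prefix_len)   # distance between consecutive /60 subscribers
    idx = (int(ipaddress.IPv6Address(b4_prefix)) -
           int(ipaddress.IPv6Address(map_start))) // step
    return str(ipaddress.IPv4Address(int(ipaddress.IPv4Address(outside_base)) + idx // sub_limit))
```

Under these assumptions, the first eight /60 prefixes (ABCD:FF::/60 through ABCD:FF:0:70::/60) map to 128.251.1.1, and the next eight map to 128.251.1.2.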
The ingress hashing and load distribution between the ISAs in Deterministic DS-Lite is governed by the highest number of configured subscribers per outside IP address in any pool referenced within the given inside routing context.
This limit is configured under:
While ingress hashing in non-deterministic DS-Lite is governed by the subscriber-prefix-length command, in deterministic DS-Lite the ingress hashing is governed by the combination of the dslite-max-subscriber-limit and subscriber-prefix-length commands. This ensures that all DS-Lite subscribers that are mapped to a single outside IP address are always sent to the same MS-ISA (on which that outside IPv4 address resides). In essence, as soon as deterministic DS-Lite is enabled, ingress hashing is performed on aggregated sets of 2^n contiguous subscribers, where n = log2(dslite-max-subscriber-limit) is the number of bits used to represent the largest number of subscribers within an inside routing context that is mapped to the same outside IP address in any pool referenced from this inside routing context (referenced through the NAT policy).
Once deterministic DS-Lite is enabled (a prefix command under the deterministic CLI node is configured), the ingress hashing influenced by the dslite-max-subscriber-limit is in effect for both flavors of DS-Lite (deterministic and non-deterministic) within the inside routing context, assuming that both flavors are configured simultaneously.
With introduction of deterministic DS-lite, the configuration of the subscriber-prefix-length must adhere to the following rule:
This can be clarified by the two following examples:
This means that 64 DS-Lite subscribers will be mapped to the same outside IP address. Consequently the prefix length of those subscribers must be reduced by 6 bits for hashing purposes (so that chunks of 64 subscribers are always hashed to the same ISA).
According to our rule, the prefix of those subscribers (subscriber-prefix-length) can be only in the range of [38..64], and no longer in the range [32..64, 128].
This means that each DS-lite subscriber is mapped to its own outside IPv4 address. Consequently, there is no need to aggregate subscribers for hashing purposes, since each DS-lite subscriber is mapped to an entire outside IPv4 address (with all ports). Since the subscriber prefix length is not contracted in this case, the prefix length can be configured in the range [32..64, 128].
In other words, the shortest prefix length that can be configured for a deterministic DS-lite subscriber is 32+n, where n = log2(dslite-max-subscriber-limit). The subscriber prefix length can extend up to 64 bits. Beyond 64 bits, only one value is allowed for the subscriber prefix length: 128. In that case, n must be 0, which means that the mapping between B4 elements (or IPv6 addresses) and outside IPv4 addresses is a 1:1 ratio (no sharing of outside IPv4 addresses).
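The rule above can be sketched as follows (a hypothetical helper; the two branches mirror the two examples above):

```python
import math

def allowed_subscriber_prefix_lengths(dslite_max_subscriber_limit):
    """Allowed subscriber-prefix-length values for a given
    dslite-max-subscriber-limit: the shortest usable prefix is 32+n,
    and /128 is only allowed when no aggregation is needed (n == 0)."""
    n = int(math.log2(dslite_max_subscriber_limit))
    if n == 0:
        # 1:1 mapping to outside IPv4 addresses; /128 is also allowed
        return list(range(32, 65)) + [128]
    # aggregation of 2^n subscribers needed for hashing; /128 not allowed
    return list(range(32 + n, 65))
```

For example, with dslite-max-subscriber-limit of 64 (n=6), the allowed range is [38..64], and 128 is no longer permitted.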
The dependency between the subscriber definition in DS-Lite (based on the subscriber-prefix-length) and the subscriber hashing mechanism on ingress (based on the dslite-max-subscriber-limit value), will influence the order in which deterministic DS-lite is configured.
Configure deterministic DS-Lite in the following order.
Modifying the dslite-max-subscriber-limit requires that all nat-policies be removed from the inside routing context.
To migrate a non-deterministic DS-Lite configuration to a deterministic DS-Lite configuration, the non-deterministic DS-Lite configuration must be first removed from the system. The following steps should be followed:
NAT Pool
NAT Policy
NAT Group
Deterministic Mappings (prefix and map statements)
Similarly, the map statements can be added or removed only if the prefix node is in a shutdown state.
The outside-ip-address in the map statements must be unique amongst all map statements referencing the same pool. In other words, two map statements cannot reference the same <outside-ip-address> in a pool.
Configuration Parameters
Miscellaneous
Destination NAT (DNAT) in SR OS is supported for LSN44 and L2-Aware NAT. DNAT can be used for traffic steering where the destination IP address of the packet is rewritten. In this fashion traffic can be redirected to an appliance or set of servers that are in control of the operator, without the need for a separate transport service (for example, PBR plus LSP). Applications utilizing traffic steering via DNAT normally require some form of inline traffic processing, such as inline content filtering (parental control, antivirus/spam, firewalling), video caching, etc.
Once the destination IP address of the packet is translated, traffic is naturally routed based on the destination IP address lookup. DNAT will translate the destination IP address in the packet while leaving the original destination port untranslated.
Similar to source based NAT (Source Network Address and Port Translation (SNAPT)), the SR OS will maintain state of DNAT translations so that the source IP address in the return (downstream) packet is translated back to the original address.
Traffic selection for DNAT processing in MS-ISA is performed via a NAT classifier.
In certain cases SNAPT is required along with DNAT. In other cases only DNAT is required without SNAPT. The following table shows the supported combinations of SNAPT and DNAT in SR OS.
 | SNAPT | DNAT-Only | SNAPT + DNAT |
LSN44 | X | X | X |
L2-Aware | X | | X |
The SNAPT/DNAT address translations are shown in Figure 57.
NAT forwarding in SR OS is implemented in two stages:
As part of the NAT state maintenance, the SR OS maintains the following fields for each DNATed flow:
<inside host/port, outside IP/port, foreign IP address/port, destination IP address/port, protocol (TCP, UDP, ICMP)> Note that the inside host in LSN is the inside IP address, while in L2-Aware NAT it is the <inside IP address + subscriber-index>. The subscriber index is carried in the session-id of the L2TP.
The foreign IP address represents the destination IP address in the original packet, while the destination IP address represents the DNAT address (translated destination IP address).
Traffic intended for DNAT processing is selected via a nat classifier. The nat classifier has configurable protocol and destination ports. The inclusion of the classifier in the NAT policy is the trigger for performing DNAT. The configuration of the nat classifier determines whether:
The classifier cannot drop traffic (there is no drop action). However, a non-reachable destination IP address in DNAT causes traffic to be black-holed.
DNAT is enabled in the config>service>nat>nat-policy context.
The DNAT function is triggered by the presence of the nat classifier (nat-classifier command) referenced in the NAT policy. The dnat-only option is configured in cases where SNAPT is not required. This command is necessary to determine the outside routing context and the nat-group when SNAPT is not configured. The pool (relevant to SNAPT) and dnat-only configuration options within the NAT policy are mutually exclusive.
DNAT traffic selection is performed via a nat-classifier. The nat-classifier is defined under the config>service>nat hierarchy and is referenced within the nat-policy.
default-dnat-ip-address is used in all match criteria that contain a DNAT action without a specific destination IP address. However, the default-dnat-ip-address is ignored in cases where an IP address is explicitly configured as part of the action within the match criteria.
default-action is applied to all packets that do not satisfy any match criteria.
forward (forwarding action) has no effect on the packets and will transparently forward packets through the nat-classifier.
By default, packets that do not match any matching criteria are transparently passed through the classifier.
To forward upstream and downstream traffic for the same NAT binding to the same MS-ISA, the original source IP address space must be known in advance and consequently hashed on the inside ingress towards the MS-ISAs and micro-netted on the outside. This is performed with the following CLI:
The classic-lsn-max-subscriber-limit parameter was introduced by deterministic NAT and is reused here. This parameter affects the distribution of traffic across multiple MS-ISAs in the upstream direction. A hashing mechanism based on source IPv4 addresses/prefixes is used to distribute incoming traffic on the inside (private side) across the MS-ISAs. Hashing based on the entire IPv4 address produces the most granular traffic distribution, while hashing based on the IPv4 prefix (determined by prefix length) produces less granular hashing. For further details about this command, consult the CLI command description. The source IP prefix is defined in the nat-prefix-list and then applied under the dnat-only node in the inside routing context. This instructs the SR OS to create micro-nets in the outside routing context. The number of routes installed in this fashion is limited by the following configuration:
The configurable range is 1-128K with a default value of 32K. The DNAT provisioning concept is shown in Figure 58.
The following restrictions apply to multiple NAT policies per inside routing context:
The selection of the NAT pool and the outside routing context is performed through the NAT policy. Multiple NAT policies can be used within an inside routing context. This feature effectively allows selective mapping of the incoming traffic within an inside routing context to different NAT pools (with different mapping properties, such as port-block size, subscriber-limit per pool, address-range, port-forwarding-range, deterministic vs non-deterministic behavior, port-block watermarks, etc.) and to different outside routing contexts. NAT policies can be configured:
The concept of the NAT pool selection mechanism based on the destination of the traffic via routing is shown in Figure 59.
Diversion of the traffic to NAT based on the source of the traffic is shown in Figure 60.
Only filter-based diversion solution is supported for this case. The filter-based solution can be extended to a 5 tuple matching criteria.
The following considerations must be taken into account when deploying multiple NAT policies per inside routing context:
The routing approach relies on upstream traffic being directed (or diverted) to the NAT function based on the destination-prefix command in the configure>service>vprn/router>nat>inside CLI context. In other words, the upstream traffic will be NATed only if it matches a preconfigured destination IP prefix. The destination-prefix command creates a static route in the routing table of the inside routing context. This static route will divert all traffic with the destination IP address that matches the created entry, towards the MS-ISA. The NAT function itself will be performed once the traffic is in the proper context in the MS-ISA.
The CLI for multiple NAT policies per inside routing context with routing based diversion to NAT is the following:
or, for example:
Different destination prefixes can reference a single NAT policy (policy-1 in this case).
If the destination-prefix command does not directly reference a NAT policy, the default NAT policy is used. The default NAT policy is configured directly in the vprn/router>nat>inside context.
Once a destination-prefix command referencing the NAT policy is configured, an entry is created in the routing table that directs the traffic to the MS-ISA.
A filter-based approach will divert traffic to NAT based on the ip matching criteria shown in the CLI below.
The CLI for the filter-based diversion in conjunction with multiple NAT policies is shown below:
The association with the NAT policy is made once the filter is applied to the SAP.
DS-Lite and NAT64 diversion to NAT with multiple nat-policies is supported only through IPv6 filters:
Where the nat-type parameter can be either dslite or nat64.
The DS-Lite AFTR address and NAT64 destination prefix configuration under the corresponding (DS-Lite or NAT64) router/vprn>nat>inside context is mandatory. This is required even when only filters are used for traffic diversion to NAT.
For example, every AFTR address and NAT64 prefix that is configured as a match criteria in the filter, must also be duplicated in the router/vprn>nat>inside context. However, the opposite is not required.
IPv6 traffic with the destination address outside of the AFTR/NAT64 address/prefix will follow normal IPv6 routing path within the 7750 SR.
The default nat-policy is always mandatory and must be configured under the router/vprn>nat>inside context. This default NAT policy can reference any configured pool in the desired ISA group. The pool referenced in the default NAT policy can be then overridden by the NAT policy associated with the destination-prefix in LSN44 or by the NAT policy referenced in the ipv4/ipv6-filter used for NAT diversion in LSN44/DS-Lite/NAT64.
The NAT CLI nodes will fail to activate (be brought out of the no shutdown state), unless a valid NAT policy is referenced in the router/vprn>nat>inside context.
Each subscriber using multiple policies is counted as 1 subscriber for the inside resources scaling limits (such as the number of subscribers per MS-ISA), and counted as 1 subscriber per (subscriber + policy combination) for the outside limits (subscriber-limit subscribers per IP; port-reservation port/block reservations per subscriber).
Any given Static Port Forward (SPF) can be created only in one pool. This pool, which is referenced through the NAT policy, has to be specified at the SPF creation time, either explicitly through the configuration request or implicitly via defaults.
Explicit request will be submitted either via SAM or via CLI:
In the absence of the NAT policy referenced in the SPF creation request, the default nat-policy command under the vprn/router>nat>inside context will be used.
The consequence of this is that the operator must know the NAT policy in which the SPF is to be created. The SPF cannot be created via PCP outside of the pool referenced by the default NAT policy, since PCP does not provide the means to communicate a NAT policy name in the SPF creation request.
The static port forward creation and their use by the subscriber types must follow these rules:
When the last relevant policy for a certain subscriber type is removed from the virtual router, the associated port forwards are automatically deleted.
Figure 61 and Figure 62 describe certain scenarios that are more theoretical and are less likely to occur in reality. However, they are described here for the purpose of completeness.
Figure 61 represents the case where traffic from the WEB server 1.1.1.1 is initiated toward the destination network 11.0.0.0/8. Such traffic is translated in Pool B and forwarded to the 11.0.0.0/8 network even though the static port forward was created in Pool A. In this case, the NAT policy rule (dest 11.0.0.0/8 pool B) determines the pool selection in the upstream direction (even though the SPF for the WEB server already exists in Pool A).
The next example in Figure 62 shows a case where the Flow 1 is initiated from the outside. Since the partial mapping matching this flow already exist (created by SPF) and there is no more specific match (FQF) present, the downstream traffic will be mapped according to the SPF (through Pool A to the Web server). At the same time, a more specific entry (FQF) will be created (initiated by the very same outside traffic). This FQF will now determine the forwarding path for all traffic originating from the inside that is matching this flow. This means that the Flow 2 (reverse of the Flow 1) will not be mapped to an IP address from the pool B (as the policy dictates) but instead to the Pool A which has a more specific match.
A more specific match would be in this case fully qualified flows (FQF) that contains information about the foreign host: <host, inside IP/port, outside IP/port, foreign IP address/port, protocol>.
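The precedence between a fully qualified flow and a partial SPF mapping described above can be sketched as a two-step lookup; the table layout and names are hypothetical simplifications:

```python
# Illustrative sketch of the lookup order described above: a fully qualified
# flow (FQF, which includes the foreign host) is a more specific match than
# a partial SPF mapping, so it wins when both exist.

def select_mapping(fqf_table, spf_table, flow):
    # flow: (inside_ip, inside_port, foreign_ip, foreign_port, protocol)
    inside_ip, inside_port, foreign_ip, foreign_port, proto = flow
    # 1. Most specific first: fully qualified flow including the foreign host.
    fqf = fqf_table.get(flow)
    if fqf is not None:
        return fqf
    # 2. Fallback: partial mapping created by the static port forward.
    return spf_table.get((inside_ip, inside_port, proto))

spf = {("1.1.1.1", 80, "tcp"): "pool-A"}
fqf = {}
flow = ("1.1.1.1", 80, "9.9.9.9", 4000, "tcp")
print(select_mapping(fqf, spf, flow))  # no FQF yet -> SPF mapping, pool-A
fqf[flow] = "pool-A"                   # FQF created by outside-initiated traffic
print(select_mapping(fqf, spf, flow))  # FQF now takes precedence for this flow
```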
When multiple NAT policies per inside routing context are deployed, a new policy-id parameter is added to certain syslog messages. The format of the policy-id is:
where XX is an arbitrary unique number per inside routing context assigned by the router. This number represents the corresponding NAT policy. Since the maximum number of NAT policies in an inside routing context is 8, the policy-id value is also a numerical value in the range 1 — 8.
Introduction of the policy-id in logs is necessary due to the bulk operations associated with multiple NAT policies per inside routing context. A bulk operation, for example, represents the removal of the nat-policy from the configuration, shutting down the NAT pool, or removing an IP address range from the pool. Removing a NAT accounting policy in case of RADIUS NAT logging will not trigger a summarization log since an acct-off message is sent. Such operations have a tendency to be heavy on NAT logging since they affect a large number of NAT subscribers at once. Summarization logs are introduced to prevent excessive logging during bulk operations. For example, the NAT policy deletion can be logged with a single (summarized) entry containing the policy-id of the NAT policy that was removed and the inside srvc-id. Since all logs contain the policy-id, a single summarization free log can be compared to all map logs containing the same policy-id to determine for which subscribers the NAT mappings have ceased. Map and Free logs are generated when port-blocks for the subscribers are allocated and de-allocated.
Summarization log is always generated on the CPM, regardless of whether the RADIUS logging is enabled or not. A summarization log simply cannot be generated via RADIUS logging since the RADIUS accounting message streams (start/interim-updates/stop) are always generated per subscriber. In other words, for RADIUS logging, the summarization log would need to be sent to each subscriber, which defeats the purpose of the summarization logs.
A summarization log on the CPM is generated:
With multiple NAT policies per inside routing context, the inside srvc-id and the policy-id are included in the summarization log (no outside IPs, outside srvc-id, port-block or source IP).
A log search based on the policy-id and inside srvc-id should reveal all subscribers whose mappings were affected by the NAT policy removal.
A log search based on the outside IP address and outside srvc-id should reveal all subscribers for which the NAT mappings have ceased.
A log search based on the outside IP addresses in the range and the outside srvc id should reveal all subscribers for which the NAT mappings have ceased.
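The log searches described above can be sketched as a simple correlation between one summarization free log and the per-subscriber map logs; the record format here is a hypothetical dictionary, whereas a real deployment would parse the actual log lines:

```python
# Sketch: correlating a single summarization free log with per-subscriber
# map logs. The summarization free log carries only the policy-id and the
# inside srvc-id; matching map logs identify the affected subscribers.

def subscribers_affected(map_logs, summary_free):
    return sorted({log["subscriber"] for log in map_logs
                   if log["policy_id"] == summary_free["policy_id"]
                   and log["inside_svc"] == summary_free["inside_svc"]})

maps = [
    {"subscriber": "10.0.0.1", "policy_id": 2, "inside_svc": 10},
    {"subscriber": "10.0.0.2", "policy_id": 2, "inside_svc": 10},
    {"subscriber": "10.0.0.3", "policy_id": 5, "inside_svc": 10},
]
print(subscribers_affected(maps, {"policy_id": 2, "inside_svc": 10}))
# -> ['10.0.0.1', '10.0.0.2']
```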
Summarization logs in RADIUS logging
The summarization log for a bulk operation while RADIUS logging is in effect is generated only on the CPM (syslog). This means that for bulk operations with RADIUS logging, the operator has to rely on RADIUS logging as well as on CPM logging.
An open log sequence in RADIUS, for example a map for the <inside IP 1, outside IP 1,port-block 1> followed at some later time with a map for <inside IP 2, outside IP 1, port-block 1>, is an indication that the free log for <inside IP 1, outside IP 1,port-block 1> is missing. This means that either the free log for <inside IP 1, outside IP 1,port-block 1> was lost or that a policy/pool/address-range was removed from the configuration. In the latter case, the operator should look in the CPM log for the summarization message.
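The open-sequence check described above can be sketched as follows; the record format is hypothetical, and a real implementation would parse the actual RADIUS accounting records:

```python
# Sketch of the consistency check described above: a second map log for the
# same <outside IP, port-block> without an intervening free log indicates
# either a lost free log or a bulk configuration change (in the latter case
# the operator should look for the CPM summarization message).

def find_open_sequences(records):
    """records: ordered list of ("map"|"free", inside_ip, outside_ip, block)."""
    active = {}      # (outside_ip, block) -> inside_ip currently holding it
    suspicious = []
    for kind, inside_ip, outside_ip, block in records:
        key = (outside_ip, block)
        if kind == "map":
            if key in active and active[key] != inside_ip:
                # Block re-mapped while still held: a free log is missing.
                suspicious.append((active[key],) + key)
            active[key] = inside_ip
        else:  # "free"
            active.pop(key, None)
    return suspicious

logs = [
    ("map", "10.0.0.1", "80.0.0.1", "1024-1055"),
    # free log for 10.0.0.1 missing here
    ("map", "10.0.0.2", "80.0.0.1", "1024-1055"),
]
print(find_open_sequences(logs))
# -> [('10.0.0.1', '80.0.0.1', '1024-1055')]
```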
The summarization logs are enabled via the event control 2021 tmnxNatLsnSubBlksFree, which is suppressed by default. Event control 2021 is also used to report when all blocks for the subscriber are freed.
Multiple NAT policies for a L2-aware subscriber can be selected based on the destination IP address of the packet. This allows the operator to assign different NAT pools and outside routing contexts based on the traffic destinations.
The mapping between the destination IP prefix and the NAT policy is defined in a nat-prefix-list. This nat-prefix-list is applied to the L2-aware subscriber via a sub-profile. Once the subscriber traffic arrives at the MS-ISA where NAT is performed, an additional lookup based on the destination IP address of the packet is executed to select the specific NAT policy (and consequently the outside NAT pool). Failure to find a specific NAT policy based on the destination IP address lookup results in selection of the default NAT policy referenced in the sub-profile.
CLI example:
As displayed in the example, multiple IP prefixes can be mapped to the same NAT policy.
The NAT prefix list cannot reference the default NAT policy. The default NAT policy is the one that is referenced directly under the sub-profile.
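The destination-based selection described above can be sketched as a prefix match with fallback to the default NAT policy. A longest-prefix match is assumed here (the text only states a destination lookup), and the prefixes and policy names are illustrative:

```python
# Sketch: select a NAT policy based on the packet's destination IP, falling
# back to the default NAT policy when no prefix in the list matches.
import ipaddress

def select_nat_policy(prefix_list, default_policy, dest_ip):
    dest = ipaddress.ip_address(dest_ip)
    best = None
    for prefix, policy in prefix_list.items():
        net = ipaddress.ip_network(prefix)
        # Keep the most specific matching prefix (assumption: longest match).
        if dest in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, policy)
    return best[1] if best else default_policy

prefix_list = {"10.10.0.0/16": "nat-pol-2", "10.0.0.0/8": "nat-pol-3"}
print(select_nat_policy(prefix_list, "nat-pol-1", "10.10.1.1"))  # nat-pol-2
print(select_nat_policy(prefix_list, "nat-pol-1", "192.0.2.1"))  # default
```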
In L2-aware NAT with multiple nat-policies, the NAT resources are allocated in each pool associated with the subscriber. This NAT resource allocation is performed at the time when the ESM subscriber is instantiated. Each NAT resource allocation will be followed by log generation.
For example, if RADIUS logging is enabled, one Alc-NAT-Port-Range VSA per NAT policy will be included in the acct START/STOP message.
Alc-Nat-Port-Range = "192.168.20.1 1024-1055 router base nat-pol-1"
Alc-Nat-Port-Range = "193.168.20.1 1024-1055 router base nat-pol-2"
Alc-Nat-Port-Range = "194.168.20.1 1024-1055 router base nat-pol-3"
Nat-policy change for L2-aware NAT is supported via a sub-profile change triggered by CoA. However, a change of sub-profile alone via CoA will not trigger generation of a new RADIUS accounting message, and thus NAT events related to the NAT policy change will not be promptly logged. For this reason, each CoA initiating a sub-profile change in a NAT environment should:
Note that the sla-profile must be changed and not just refreshed. In other words, replacing the existing sla-profile with the same one will not trigger a new accounting message.
Both of the above events will trigger an accounting update at the time when CoA is processed. This will keep NAT logging current. The information about NAT resources for logging purposes is conveyed in the following RADIUS attributes:
NAT logging behavior due to CoA will depend on the deployed accounting mode of operation. This is described in Table 1. Note that interim-update keyword must be configured for host/session accounting in order for Interim-Update messages to be triggered:
Table Legend:
AATR — Alc-Acct-Triggered-Reason VSA. This VSA is optionally carried in Interim-Update messages that are triggered by CoA.
ATAI — Alc-Trigger-Acct-Interim VSA. This VSA can be carried in CoA to trigger an Interim-Update message. The string carried in this VSA is reflected in the triggered Interim-Update message.
I-U – Interim-Update Message
 | Host or session accounting | Queue-instance accounting | Comments |
CoA Sub-prof change + ATAI VSA | Single I-U with: — released NAT info — unchanged NAT info — new NAT info — AATR — ATAI | Single I-U with: — released NAT info — unchanged NAT info — new NAT info — AATR — ATAI | Single I-U message is triggered by CoA. |
CoA Sub-profile change + Sla-profile change | First I-U: — released NAT info — unchanged NAT info — new NAT info Second I-U: — unchanged NAT info — new NAT info | Acct Stop: — released NAT info — unchanged NAT info — new NAT info Acct Start: — unchanged NAT info — new NAT info | Two accounting messages are triggered in succession. |
CoA Sub-profile change | — | — | No accounting messages are triggered by CoA. The next regular I-U messages will contain: — old (released) NAT info — unchanged NAT info — new NAT info. |
CoA Sub-profile change + Sla-profile-change + ATAI VSA | First I-U: — released NAT info — unchanged NAT info — new NAT info Second I-U: — unchanged NAT info — new NAT info — AATR — ATAI | Acct Stop: — released NAT info — unchanged NAT info — new NAT info Acct Start: — unchanged NAT info — new NAT info | Two accounting messages are triggered in succession. |
For example, the second CoA row describes the outcome triggered by a CoA carrying new sub- and sla-profiles. In host/session accounting mode this creates two Interim-Update messages. The first Interim-Update message carries information about:
The second Interim-Update message will carry information about the NAT resources that are in use (existing and new) once CoA is activated.
From this, the operator can infer which NAT resources are released by CoA and which NAT resources continue to be in use once CoA is activated.
Nat-policy change induced by CoA will trigger immediate log generation (for example acct STOP or INTERIM-UPDATE) indicating that the nat resources have been released. However, the NAT resources (outside IP addresses and port-blocks) in SR OS will not be released for another five seconds. This delay is needed to facilitate proper termination of traffic flow between the NAT user and the outside server during the NAT policy transition. A typical example of this scenario is the following:
Stale port forwards, similar to other stale dynamic mappings, are released after five seconds. Note that static port forwards are kept on the CPM. New CoAs related to NAT will be rejected (NAK'd) in case the previous change is still in progress (during the 5-second interval until the stale mappings are purged).
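The transition behavior just described (immediate logging, a 5-second hold on stale mappings, and a NAK for overlapping changes) can be sketched as a small state machine; timings and return codes are deliberately simplified:

```python
# Sketch: a NAT-related CoA is ACK'd and starts a 5-second purge window for
# stale mappings; a further NAT-related CoA within that window is NAK'd.
import time

class NatPolicyTransition:
    HOLD_SECONDS = 5

    def __init__(self):
        self._pending_until = 0.0

    def apply_coa(self, now=None):
        now = time.monotonic() if now is None else now
        if now < self._pending_until:
            return "NAK"   # previous change still purging stale mappings
        self._pending_until = now + self.HOLD_SECONDS
        return "ACK"       # stale flows cleared after HOLD_SECONDS

t = NatPolicyTransition()
print(t.apply_coa(now=100.0))  # ACK: change accepted
print(t.apply_coa(now=102.0))  # NAK: within the 5-second purge window
print(t.apply_coa(now=106.0))  # ACK: stale mappings already purged
```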
Unless a specific NAT policy is provided during Static Port Forward (SPF) creation (SPF creation command), the port forward is created in the pool referenced in the default NAT policy. A nat-policy can be part of the command used to modify or delete an SPF. If the NAT policy is not provided, the behavior is as follows:
A match is considered when at least these parameters from the modify/delete command are matched (mandatory parameters in the spf command):
UPnP will use the default NAT policy.
RADIUS Change of Authorization (CoA) can be used in subscriber management (ESM) to modify the NAT behavior of the subscriber. This can be performed by:
The behavior for NAT policy changes via CoA for LSN and L2-Aware NAT is summarized in Table 35.
Action | Outcome | Remarks | |
L2-Aware | LSN | ||
CoA - replacing NAT policy | Stale flows using the old NAT policy are cleared after 5 seconds. New flows immediately start using a new NAT policy. Restrictions:
| Stale flows using the old NAT policy continue to exist and are used for traffic forwarding until they naturally time out or are TCP-terminated. The exception to this is when the reference to the NAT policy in the filter was the last one for the inside VRF. In this scenario, the flows from the removed NAT policy are cleared immediately. New flows immediately start using the new NAT policy. | A NAT policy change via CoA is performed by changing the sub-profile for the ESM subscriber or by changing the ESM subscriber filter in the LSN case. 1 A sub-profile change alone does not trigger accounting messages in L2-aware NAT and consequently the logging information is lost. To ensure timely RADIUS logging of the NAT policy change in L2-aware NAT, each CoA must, in addition to the sub-profile change, also:
Both of the above events will trigger an accounting update at the time when CoA is processed. This keeps NAT logging current. |
1. In non-ESM environments, the NAT policy can be changed by replacing the interface filter via CLI for LSN case.
2. The SLA profile will have to be changed and not just refreshed. In other words, replacing the existing SLA profile with the same one will not trigger a new accounting message.
Adding, removing, or replacing DNAT parameters in LSN44 can be achieved through NAT policy manipulation in an IP filter for an ESM subscriber. The rules for NAT policy manipulation via CoA are given in Table 35. In L2-Aware NAT, CoA can be used to:
Once the DNAT configuration is modified via CoA (enable or disable DNAT, or change the default DNAT IP address), the existing flows affected by the change remain active for 5 more seconds, while new flows are created in accordance with the new configuration. After the 5-second timeout, the stale flows are cleared from the system.
The RADIUS attribute used to perform DNAT modifications is a composite attribute with the following format:
Alc-DNAT-Override (234) = “{<DNAT_state> | <DNAT-ip-addr>},[nat-policy]”
where: DNAT-state = none | disable — mutually exclusive with the DNAT-ip-addr parameter.
DNAT-ip-addr = the destination IPv4 address in dotted format (a.b.c.d); provides an implicit enable — mutually exclusive with the DNAT-state parameter.
nat-policy = NAT policy name — an optional parameter. If it is not present, the default NAT policy is assumed.
For example:
Alc-DNAT-Override = none — re-enables DNAT functionality in the default NAT policy, assuming that DNAT was previously disabled via the Alc-DNAT-Override=disable attribute, submitted either in an Access-Accept message or in a previous CoA. If the none value is received while DNAT is already enabled, a CoA ACK is sent back to the originator.
Alc-DNAT-Override = none,nat-pol-1 — re-enables DNAT functionality in the specific NAT policy named nat-pol-1.
Alc-DNAT-Override = none,1.1.1.1 — a CoA NAK is returned. The DNAT-state and DNAT-ip-addr parameters are mutually exclusive within the same Alc-DNAT-Override attribute.
Alc-DNAT-Override = 1.1.1.1 — changes the default DNAT IP address to 1.1.1.1 in the default NAT policy. If DNAT was disabled prior to receiving this CoA, it is implicitly enabled.
Alc-DNAT-Override = 1.1.1.1,nat-pol-1 — changes the default DNAT IP address to 1.1.1.1 in the specific NAT policy named nat-pol-1. DNAT is implicitly enabled if it was disabled prior to receiving this CoA.
The combination of sub-fields with the Alc-DNAT-Override RADIUS attribute and the corresponding actions are shown in Table 36.
DNAT-State | DNAT-ip-addr | NAT Policy | DNAT Action in L2-Aware NAT |
none | - | - | Re-enable DNAT in the default NAT policy. If DNAT was enabled prior to receiving this CoA, then no specific action will be carried out by the SR OS with the exception of sending the CoA ACK back to the CoA server. If a DNAT classifier is not present in the default NAT policy when this CoA is received, then CoA NAK (error) will be sent to the CoA server. |
none | - | nat-pol-name | Re-enable DNAT in the referenced NAT policy. If DNAT was enabled prior to receiving this CoA, no specific action will be carried out by the SR OS with the exception of sending the CoA ACK back to the CoA server. If the DNAT classifier is not present in the referenced NAT policy, a CoA NAK (error) will be sent to the CoA server. |
none | a.b.c.d | - | CoA NAK (error) is generated. These two parameters are mutually exclusive in the same Alc-DNAT-Override attribute. |
none | a.b.c.d | nat-pol-name | CoA NAK (error) is generated. DNAT-state and DNAT-ip-address parameters are mutually exclusive in the same Alc-DNAT-Override attribute. |
disable | - | - | Disable DNAT in the default NAT policy. If a DNAT classifier is not present in the default NAT policy, a CoA NAK (error) will be sent to the CoA server. |
disable | - | nat-pol-name | Disable DNAT in the referenced NAT policy. If a DNAT classifier is not present in the referenced NAT policy, a CoA NAK (error) will be sent to the CoA server. |
disable | a.b.c.d | - | CoA NAK (error) is generated. DNAT-state and DNAT-ip-address parameters are mutually exclusive in the same Alc-DNAT-Override attribute. |
disable | a.b.c.d | nat-pol-name | CoA NAK (error) is generated. DNAT-state and DNAT-ip-address parameters are mutually exclusive in the same Alc-DNAT-Override attribute. |
- | a.b.c.d | - | The default destination IP address is changed in the default NAT policy. |
- | a.b.c.d | nat-pol-name | The default destination IP address is changed in the referenced NAT policy. |
- | - | - or nat-pol-name | A CoA NAK (error) is generated. Either DNAT-state or DNAT-ip-address parameters must be present in the Alc-DNAT-Override attribute. |
If multiple Alc-DNAT-Override attributes with conflicting actions are received in the same CoA or Access-Accept, the action that occurred last will take precedence.
For example, if the following two Alc-DNAT-Override attributes are received in the same CoA, the last one takes effect and consequently DNAT will be disabled in the default NAT policy:
Alc-DNAT-Override = “1.1.1.1“
Alc-DNAT-Override = “disable“
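The parsing rules above (state and IP mutually exclusive, optional policy name, last conflicting action wins) can be sketched as follows; the tuple representation and "NAK" return value are illustrative, not the actual implementation:

```python
# Sketch: parse Alc-DNAT-Override values and apply them with last-wins
# semantics, per the rules in Table 36 above.
import ipaddress

def _is_ip(s):
    try:
        ipaddress.ip_address(s)
        return True
    except ValueError:
        return False

def parse_dnat_override(value):
    parts = [p.strip() for p in value.split(",")]
    if len(parts) > 2 or not parts[0]:
        return "NAK"
    policy = parts[1] if len(parts) == 2 else "default"
    if len(parts) == 2 and _is_ip(policy):
        return "NAK"   # DNAT-state/IP and a second IP are mutually exclusive
    if parts[0] == "none":
        return (policy, "enable", None)
    if parts[0] == "disable":
        return (policy, "disable", None)
    if _is_ip(parts[0]):
        return (policy, "set-dnat-ip", parts[0])   # implicit enable
    return "NAK"

def apply_overrides(values):
    result = {}
    for v in values:
        parsed = parse_dnat_override(v)
        if parsed == "NAK":
            return "NAK"
        policy, action, ip = parsed
        result[policy] = (action, ip)   # last attribute wins per policy
    return result

print(apply_overrides(["1.1.1.1", "disable"]))  # last action wins: disabled
print(apply_overrides(["none,1.1.1.1"]))        # state + IP together -> NAK
```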
The following table describes the outcome when the active NAT prefix list or NAT classifier is modified via CLI.
Action | Outcome | Remarks | |
L2-Aware | LSN | ||
CLI – Modifying prefix in the NAT prefix list | Existing flows are always checked whether they comply with the NAT prefix list that is currently applied in the subscriber profile for the subscriber. If the flows do not comply with the current NAT prefix list, they are cleared after 5 seconds. The new flows immediately start using the updated settings. | Changing the prefix in the NAT prefix list will internally re-subnet the outside IP address space. | A NAT prefix list is used with multiple NAT policies in L2-aware NAT and for downstream Internal subnet in DNAT-only scenario for LSN. The prefix can be modified (added, removed, remapped) at any time in the NAT prefix list, while the NAT policy must be first shut down via CLI. |
CLI – Modifying or replacing the NAT classifier | Existing flows are always checked whether they comply with the NAT classifier that is currently applied in the active NAT policy for the subscriber. If the flows do not comply with the current NAT classifier, they are cleared after 5 seconds. The new flows immediately start using the updated settings. | Changing the NAT classifier has the same effect as in L2-aware NAT; all existing flows using the NAT classifier are checked whether they comply with this classifier or not. | The NAT classifier is used for DNAT. The NAT classifier is referenced in the NAT policy. |
CLI - Removing or adding NAT policy in NAT prefix list | Blocked | Not applicable | |
CLI - Removing or adding NAT policy in the subscriber profile | Blocked | Not applicable | |
CLI - Removing, adding or replacing NAT prefix list under the rtr/nat/inside/DNAT-only | Not applicable | This action triggers internal re-subnetting of the source address space according to the new NAT prefix list. However, the current flows in the MS-ISA are not affected by this change. In other words, they are not removed if the associated prefix is removed from the prefix list. |
PCP is a protocol that operates between subscribers and the NAT directly. This makes the protocol similar to DHCP or PPP in that the subscriber has a limited but direct control over the NAT behavior.
PCP is designed to allow the configuration of static port forwards, to obtain information about existing port forwards, and to obtain the outside IP address from software running in the home network or on the CPE.
PCP runs on each MS-ISA as its own process and makes use of the same source-IP hash algorithm as the NAT mappings themselves. The protocol itself is UDP-based and request/response in nature, in some ways similar to UPnP.
PCP operates on a specified loopback interface in a similar way to the local DHCP server. It operates over UDP on a port specified in the CLI. As the Epoch is used to help recover mappings, a unique PCP service must be configured for each NAT group.
When the epoch is lowered, there is no mechanism to inform the clients to refresh their mappings en masse. External synchronization of mappings is possible between two chassis (the epoch does not need to be synchronized). If the epoch is unsynchronized, the result is that clients re-create their mappings on the next communication with the PCP server.
The R-bit indicates a request (0) or a response (1). As this is a request, it is set to (0).
OpCode defined as:
Requested Lifetime: Lifetime 0 means delete.
As this is a response, R = (1).
The Epoch field increments by 1 every second and can be used by the client to determine whether state needs to be restored. On any failure of the PCP server or the NAT with which it is associated, the Epoch must restart from zero (0).
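The client-side epoch check implied above can be sketched as follows: the Epoch ticks once per second, so a value lower than what local elapsed time predicts means the server (or its NAT) restarted from zero and the mappings must be re-created. The tolerance handling here is a simplification, not the exact protocol rule:

```python
# Sketch: detect PCP server state loss by comparing the received Epoch with
# the value expected from locally elapsed time.

class EpochTracker:
    def __init__(self):
        self.last_epoch = None
        self.last_local = None

    def server_lost_state(self, epoch, local_time):
        """Return True if the client should re-create its mappings."""
        lost = False
        if self.last_epoch is not None:
            elapsed = local_time - self.last_local
            # Epoch should advance roughly in step with local time; a value
            # well below the expectation (e.g. restart from 0) means loss.
            lost = epoch < self.last_epoch + elapsed - 1
        self.last_epoch, self.last_local = epoch, local_time
        return lost

t = EpochTracker()
print(t.server_lost_state(1000, 0))   # first observation: nothing to compare
print(t.server_lost_state(1010, 10))  # advanced as expected: state intact
print(t.server_lost_state(3, 20))     # restarted near 0: restore mappings
```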
Result Codes:
0 SUCCESS, success.
1 UNSUPP_VERSION, unsupported version.
2 MALFORMED_REQUEST, a general catch-all error.
3 UNSUPP_OPCODE, unsupported OpCode.
4 UNSUPP_OPTION, unsupported option. Only if the Option was mandatory.
5 MALFORMED_OPTION, malformed option.
6 UNSPECIFIED_ERROR, server encountered an error.
7 MISORDERED_OPTIONS, options not in correct order.
Creating a Mapping
Client Sends
MAP4 opcode is (1). Protocols: 0 – all; 1 – ICMP; 6 – TCP; 17 – UDP.
MAP4 (1), PEER4 (3) and PREFER_FAILURE are supported. FILTER and THIRD_PARTY are not supported.
Universal Plug and Play (UPnP) is a set of specifications defined by the UPnP Forum. One specification, called Internet Gateway Device (IGD), defines a protocol for clients to automatically configure port mappings on a NAT device. Today, many gaming, P2P, and VoIP applications support the UPnP IGD protocol. The SR OS supports the following UPnP version 1 InternetGatewayDevice version 1 features:
PPTP is defined in RFC 2637, Point-to-Point Tunneling Protocol (PPTP), and is used to provide a VPN connection for home/mobile users to gain secure access to the enterprise network. The encrypted payload is transported over a GRE tunnel that is negotiated over a TCP control channel. In order for PPTP traffic to pass through the NAT, the NAT device must correlate the TCP control channel with the corresponding GRE tunnel. This mechanism is referred to as a PPTP ALG.
There are two components of PPTP:
The control connection is established from the PPTP clients (for example, home users behind the NAT) to the PPTP server, which is located on the outside of the NAT. Each session that carries data between the two endpoints can be referred to as a call. Multiple sessions (or calls) can carry data in a multiplexed fashion over a tunnel. The tunnel protocol is defined by a modified version of GRE. The Call ID in the GRE header is used to multiplex sessions over the tunnel. The Call ID is negotiated during the session/call establishment phase.
Control Connection Management — The following messages are used to maintain the control connection.
The remaining control message types are sent over the established TCP session to open/maintain sessions and to convey information about the link state:
Call Management — Call management messages are used to establish/terminate a session/call and to exchange information about the multiplexing field (Call-id). Call-IDs must be captured and translated by the NAT. The call management messages are:
Error Reporting — This message is sent by the client to indicate WAN error conditions that occur on the interface supporting PPP.
PPP Session Control — This message is sent in both directions to setup PPP-negotiated options.
Once the Call ID is negotiated by both endpoints, it is inserted in the GRE header and used as the multiplexing field in the tunnel that carries data traffic.
A GRE tunnel is used to transport data between two PPTP endpoints. The packet transmitted over this tunnel has the following general structure:
The GRE header contains the Call ID of the peer for the session for which the GRE packet belongs.
The PPTP ALG is aware of the control session (Start-Control-Connection-Request/Reply) and consequently captures the Call ID field in all PPTP messages that carry that field. In addition to translating the inside IP and TCP port, the PPTP ALG processes data beyond the TCP header in order to extract the Call ID field and translate it inside the Outgoing-Call-Request messages initiated from the inside of the NAT.
The GRE packets with corresponding Call IDs are translated through the NAT as follows:
In addition, the following applies:
The basic principle of PPTP NAT ALG is shown in Figure 63.
The scenario where multiple clients behind the NAT are terminated to the same PPTP server is shown in Figure 64. In this case, it is possible that the source IP addresses of the two PPTP clients are mapped to the same outside address of the NAT. Since the endpoints of the GRE tunnel from the NAT to the PPTP server will be the same for both PPTP clients (although their real source IP addresses are different), the NAT must ensure the uniqueness of the Call-IDs in the outbound data connection. This is where Call-ID translation in the NAT becomes crucial.
The router supports a deployment scenario where multiple calls (or tunnels) are established from a single PPTP node within a single control connection. In this case, there is only one set of Start-Control-Connection-Request/Reply messages (one control channel) and multiple sets of Outgoing-Call-Request/Reply messages.
Call IDs are taken from the same pool as the ICMP port ranges. Port ranges and Call IDs are both 16-bit values. The Call ID selection mechanism is the same as the outside TCP/UDP port selection mechanism (random with parity).
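The uniqueness requirement described above (two inside clients sharing one outside IP toward the same server must get distinct outside Call-IDs) can be sketched as follows. Preserving the parity of the inside Call-ID mirrors the "random with parity" port selection mentioned above, but the exact selection algorithm here is an assumption:

```python
# Sketch: allocate a unique 16-bit outside Call-ID, random with parity,
# so that GRE sessions sharing an outside IP remain distinguishable.
import random

def allocate_call_id(inside_call_id, in_use, rng=random):
    # Assumption: the outside Call-ID keeps the parity of the inside one.
    parity = inside_call_id & 1
    while True:
        candidate = (rng.randrange(1, 0x10000) & ~1) | parity
        if candidate and candidate not in in_use:
            in_use.add(candidate)
            return candidate

used = set()
a = allocate_call_id(100, used)   # client 1, inside Call-ID 100
b = allocate_call_id(100, used)   # client 2 picked the same inside Call-ID
print(a != b, a % 2 == 0, b % 2 == 0)   # distinct outside IDs, parity kept
```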
LSN logging is extremely important to Service Providers (SPs), who are required by government agencies to track the source of suspicious Internet activities back to the users hidden behind the LSN device.
The 7750 SR supports several modes of logging for LSN applications. Choosing the right logging model will depend on the required scale, simplicity of deployment and granularity of the logged data.
For most purposes, logging the allocation/de-allocation of outside port blocks and outside IP addresses, along with the corresponding LSN subscriber and inside service-id, will suffice.
In certain cases port-block based logging is not satisfactory and per flow logging is required.
The simplest form of LSN and L2-Aware NAT logging is via the logging facility in the 7750 SR, commonly called the logger. Each port-block allocation/de-allocation event is recorded and sent to the system logging facility (logger). Such an event can be:
In this mode of logging, all applications in the system share the same logger.
Syslog/SNMP/Local-File logging on LSN is mutually exclusive with NAT RADIUS-based logging.
Syslog/SNMP/local-file logging must be separately enabled for LSN and L2-Aware NAT in log event-control. The following displays relevant MIB events:
In this example, a single port block [1884-1888] is allocated/de-allocated for the inside IP address 5.5.5.5, which is mapped to the outside IP address 80.0.0.1. Consequently, the event is logged in memory.
Once the desired LSN events are enabled for logging via event-control configuration, they can be logged to memory via standard log-id 99 or be filtered via a custom log-id, such as in this example (log-id 5):
Configuration:
The event description is given below:
In this case, the destination of log-id 5 in the following example would be a local file instead of memory:
The events will be logged to a local file on the compact flash cf3 in a file under the /log directory.
In the case of SNMP logging to a remote node, the log destination should be set to the SNMP destination. Allocation/de-allocation of each port block triggers an SNMP trap message to the trap destination.
NAT logs can be sent to a syslog remote facility. A separate syslog message is generated for every port-block allocation/de-allocation.
Severity level for this event can be changed via CLI:
LSN RADIUS logging (or accounting) is based on RADIUS accounting messages as defined in RFC 2866. It requires an operator to have RADIUS accounting infrastructure in place. For that reason, LSN RADIUS logging and LSN RADIUS accounting terms can be used interchangeably.
This mode of logging operation is introduced so that the shared logging infrastructure in 7750 SR can be offloaded by disabling syslog/SNMP/Local-file LSN logging. The result is increased performance and higher scale, particularly in cases when multiple BB-ISA cards within the same system are deployed to perform aggregated LSN functions.
An additional benefit of LSN RADIUS logging over syslog/SNMP/local-file logging is reliable transport. Although RADIUS accounting relies on unreliable UDP transport, each accounting message from the RADIUS client must be acknowledged on the application level by the receiving end (accounting server).
Each port-block allocation or de-allocation is reported to an external accounting (logging) server in the form of start, interim-update or stop messages. The type of accounting messages generated depends on the mode of operation:
The accounting messages are generated and reported directly from the BB-ISA card, therefore bypassing accounting infrastructure residing on the Control Plane Module (CPM).
LSN RADIUS logging is enabled per nat-group. To achieve the required scale, each BB-ISA card in the nat-group group with LSN RADIUS logging enabled runs a RADIUS client with its own IP address. Accounting messages can be distributed to up to five accounting servers that can be accessed in round-robin fashion. Alternatively, in direct access mode, only one accounting server in the list is used. When this server fails, the next one in the list is used.
Configuration steps:
Each BB-ISA card is assigned one IPv4 address from the source-address-range command, and this IPv4 address must be accessible from the accounting server. In the following example there is only one BB-ISA card in nat-group 1. Its source IP address is 114.0.1.20.
It is possible to load-balance accounting messages over multiple logging servers by configuring the access-algorithm to round-robin mode. After the LSN RADIUS accounting policy is defined, it must be applied to a nat-group:
The RADIUS accounting messages for the case where a Large Scale NAT44 subscriber has allocated two port blocks, in a logging mode where an accounting start/stop is generated per port block, are shown below.
Port-block allocation for the NAT44 subscriber:
Port-block de-allocation:
The inclusion of acct-multi-session-id in the NAT accounting policy enables generation of start/stop messages for each allocation/de-allocation of a port block within the subscriber. Otherwise, only the first and last port blocks for the subscriber generate a pair of start/stop messages; all port blocks in between trigger interim-update messages.
The User-Name attribute in accounting messages is set to app-name@inside-ip-address, where the app-name can be any of the following: LSN44, DS-Lite or NAT64.
Currently-allocated NAT resources (such as a public IP address and a port block for a NAT subscriber) can be periodically refreshed via Interim-Update (I-U) accounting messages. This functionality is enabled by the periodic RADIUS logging facility. Its primary purpose is to preserve logging information for long-lived sessions in environments where NAT logs are periodically and deliberately deleted from the service provider’s network. This is typically the case in countries where privacy laws impose a limit on the amount of time that information about a customer’s traffic can be retained in the service provider’s network.
Periodic RADIUS logging for NAT is enabled by the following command:
The configurable interval dictates the frequency of I-U messages that are generated for each currently allocated NAT resource (such as a public IP address and a port block).
By default, the I-U messages are sent in rapid succession for a subscriber, without any intentional delay inserted by SR OS. For example, a NAT subscriber with 8 NAT policies, each configured with 40 port ranges, generates 320 consecutive I-U messages at the expiration of the configured interval. This can create a surge in I-U message generation when intervals are synchronized for multiple NAT subscribers, which can have adverse effects on logging behavior; for example, the logging server can drop messages because it cannot process the high rate of incoming I-U messages.
To prevent this, the rate of I-U message generation can be controlled by a rate-limit CLI parameter.
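The burst arithmetic and the effect of pacing can be sketched as follows. This is an illustrative Python calculation using the example numbers from the text; the rate-limit value of 50 messages per second is an assumption, not a default:

```python
def iu_burst(nat_policies, port_ranges_per_policy):
    """I-U messages generated per subscriber per interval:
    one message per currently allocated NAT resource (port range)."""
    return nat_policies * port_ranges_per_policy

def drain_time_s(messages, rate_limit_per_s):
    """Seconds needed to pace a burst at a given rate-limit."""
    return messages / rate_limit_per_s

burst = iu_burst(8, 40)  # the example from the text: 320 messages
print(burst, drain_time_s(burst, 50))  # paced over 6.4 seconds at 50 msg/s
```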
The periodic logging is applicable to both modes of RADIUS logging in NAT:
Periodic I-U message output can be paced in order to avoid congestion at the logging server. Pacing is controlled by the rate-limit option of the periodic-update command. As an example, consider the following hypothetical case:
In the case of an MS-ISA switchover or a NAT multi-chassis redundancy switchover, a large number of subscribers may become active at approximately the same time on the newly active MS-ISA (or chassis). This causes a large number of logs to be sent in a relatively short amount of time, which may overwhelm the logging server. The rate-limit parameter is designed to help in such situations.
Logging of L2-Aware NAT is supported via accounting policy associated with the ESM subscriber (outside of NAT). In addition to ESM subscriber specific attributes, the NAT port-ranges and outside IP address (nat-port-range command in regular ESM accounting policy) are reported in the same accounting messages.
RADIUS accounting initiated by BB-ISA card is not supported for L2-Aware NAT.
Syslog/SNMP/Local-file logging can be enabled simultaneously with L2-aware NAT RADIUS accounting (which is in this case regular ESM RADIUS accounting).
LSN and L2-Aware NAT Flow logging is a facility that allows each BB-ISA card to export the creation and deletion of NAT flows to an external server. A NAT flow or a Fully Qualified Flow consists of the following parameters: Inside IP, inside port, outside IP, outside port, foreign IP, foreign port, protocol (UDP, TCP, ICMP).
In addition, the inside/outside service ID and subscriber string are added to the flow record.
Flow logging can be deployed as an alternative to the port-range logging or can be complementary (providing a more granular log for offline reporting or compliance). Certain operators have legal and compliance requirements that require extremely detailed logs, created per flow, to be exportable from the NAT node.
Because the setup rate of new flows is excessive, logging to an internal facility (such as compact flash) is not possible except in a debugging mode (which must specify match criteria down to the inside-IP and service level).
Flow logging can be enabled per NAT policy and consequently it is initiated from each BB-ISA card independently as a UDP stream, unlike a centralized Netflow/Cflowd application.
Flows are formatted according to IETF RFC 5101, Specification of the IP Flow Information Export (IPFIX) Protocol for the Exchange of IP Traffic Flow Information. Data structures are defined in RFC 5102, Information Model for IP Flow Information Export. NAT flow logging can be sent to up to two different IP addresses, both of which must be unicast IPv4 destinations. These UDP streams are stateless due to the significant volume of transactions; however, they contain sequence numbers so that packet loss can be identified. They egress the chassis at FC NC.
IPFIX defines two different types of messages that are sent from the IPFIX exporter (the 7750 SR NAT node). The first contains a Template Set: an IPFIX message that defines fields for subsequent IPFIX messages but contains no actual data of its own. The second type contains Data Sets: here, the data is passed using a previous Template Set message to define the fields. This means an IPFIX message is not passed as a set of TLVs; instead, data is encoded with a scheme defined through the Template Set message.
While an IPFIX message can contain both a Template Set and a Data Set, the 7750 SR sends Template Set messages periodically without any data, whereas Data Set messages are sent on demand and as required. When IPFIX is used over UDP, the retransmission frequency of Template Set messages defaults to 10 minutes. The retransmission interval is configurable in the CLI, with a minimum of 1 minute and a maximum of 10 minutes. When the exporter first initializes, or when a configuration change occurs, the Template Set is sent out three times, one second apart. Templates are sent before any Data Sets (assuming the collector is enabled) so that an IPFIX collector can establish the data template set.
Although the UDP transport is unreliable, the IPFIX Sequence Number is a 32-bit number that contains the total number of IPFIX Data Records sent for the UDP transport session prior to the receipt of the new IPFIX message. The sequence number starts at 0 and wraps around at 2^32 (modulo 4,294,967,296).
The default packet size is 1500 bytes unless another value is configured (the range is 512 to 9212 bytes inclusive). Traffic is originated from a random high port to the collector on port 4739. Multiple create/delete flow records are packed into a single IPFIX packet (although mapping creates are not delayed) until stuffing an additional data record would exceed the MTU or a timer expires. The timer is not configurable and is set to 250 ms; that is, should any mapping occur, a packet is sent within 250 ms of that mapping being created.
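The record-stuffing logic described above can be approximated with the following Python sketch. The function name and the byte-counting model are our illustration; the 250 ms timer is fixed in the implementation, while the packet size is the configurable value:

```python
DEFAULT_PACKET_SIZE = 1500  # bytes; configurable between 512 and 9212
FLUSH_TIMER_MS = 250        # fixed, not configurable

def should_flush(buffered_bytes, next_record_bytes,
                 packet_size=DEFAULT_PACKET_SIZE, elapsed_ms=0):
    """Send the pending IPFIX packet when the next data record would
    no longer fit, or when the flush timer has expired."""
    return (buffered_bytes + next_record_bytes > packet_size
            or elapsed_ms >= FLUSH_TIMER_MS)

print(should_flush(1400, 200))                # record does not fit: flush
print(should_flush(100, 100))                 # still room and time: keep buffering
print(should_flush(100, 100, elapsed_ms=250)) # timer expired: flush
```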
Each collector has buffering space for 50 packets. If excessive logging exhausts the buffering space, new flows are denied and the deletion of flows is delayed until buffering space becomes available.
Two collector nodes can be defined in the same IPFIX export policy for redundancy purposes.
This section provides an example of how to configure large scale NAT44 flow logging.
Define a collector node along with other local transport parameters through an IPFIX export-policy.
To export flow records via a UDP stream, the BB-ISA card must be configured with an appropriate IPv4 address within a designated VPRN. This address (/32) acts as the source for sending all IPFIX records and is shared by all ISAs.
After the IPFIX export policy is defined, apply it within the NAT policy:
Captures of an IPFIX packet for an ICMP flow creation and deletion are shown in the following examples.
Flow creation:
Flow deletion:
Table 39 lists the values and descriptions of the fields in the example flow creation and deletion templates.
Field | Size (B) | Description |
Export Timestamp | N/A | Timestamp derived from chassis NTP, per RFC 5101 |
Sequence Id | N/A | Total number of IPFIX data records sent for the UDP transport session prior to the receipt of the new IPFIX message (modulo 2^32), per RFC 5101 |
Observation Domain ID | N/A | Unique ID set per ISA in the 7750 SR chassis |
FlowID | 8 | Unique ID (per observation domain ID) for this flow used for tracking purposes only (opaque value); flow ID in a create and a delete mapping record must be the same for a specific NAT mapping |
IP_SRC_ADDR | 4 | Outside IP address used in the NAT mapping |
IP_DST_ADDR | 4 | Destination or remote IP address used in the NAT mapping |
L4_SRC_PORT | 2 | Outside source port used in the NAT mapping |
L4_DST_PORT | 2 | Destination or remote port used in the NAT mapping |
flowStartMilliseconds 1 | 8 | Timestamp when the flow was created (chassis NTP derived) in milliseconds from epoch, per RFC 5102 |
flowEndMilliseconds 2 | 8 | Timestamp when the flow was destroyed (chassis NTP derived) in milliseconds from epoch, per RFC 5102 |
PROTOCOL | 1 | Protocol ID (TCP, UDP or ICMP), per RFC 5102 |
PADDING | 1 | N/A |
flowEndReason | 1 | Supported flow end reasons:
|
aluInsideServiceID | 2 | 16-bit service ID representing the inside service ID |
aluOutsideServiceID | 2 | 16-bit service ID representing the outside service ID |
aluNatSubString | var | A variable, 8-byte aligned string that represents the NAT subscriber construct (as currently used in the tools dump service nat session commands) |
Notes:
In general, fragmentation functionality is invoked when the size of a fragmentation eligible packet exceeds the size of the MTU of the egress interface/tunnel. Packets eligible for fragmentation are:
The best practice is to avoid fragmentation in the network by ensuring an adequate MTU size on the transit/source nodes. Drawbacks of fragmentation are:
Fragmentation can be particularly deceiving in a tunneled environment whereby the tunnel encapsulation adds extra overhead to the original packet. This extra overhead could tip the size of the resulting packet over the egress MTU limit.
Fragmentation can be a solution in cases where a restriction in the MTU size on the packet’s path from the source to the destination cannot be avoided. Routers support IPv6 fragmentation in DS-Lite and NAT64 with some enriched capabilities, such as optional IPv6 fragmentation even in cases where the DF bit in the corresponding IPv4 packet is set.
In general, the lengths of the fragments must be chosen so that the resulting fragment packets fit within the MTU of the path to the packets’ destination(s).
In the downstream direction, fragmentation can be implemented in two ways:
In the upstream direction, IPv4 packets can be fragmented after they are de-capsulated in DS-Lite or translated in NAT64. The fragmentation occurs in the IOM.
In the downstream direction, the IPv6 packet carrying the IPv4 packet (IPv4-in-IPv6) is fragmented in the ISA when the configured DS-Lite tunnel-mtu is smaller than the size of the IPv4 packet that is to be tunneled inside the IPv6 packet. The maximum IPv6 fragment size is 48 bytes larger than the value set by the tunnel-mtu. The additional 48 bytes are added by the IPv6 header fields: 40 bytes for the basic IPv6 header plus 8 bytes for the IPv6 fragmentation extension header. The NAT implementation in the routers does not insert any IPv6 extension headers other than the fragmentation header.
If the IPv4 packet is larger than the value set by the tunnel-mtu, the fragmentation action depends on the configuration options and the DF bit setting in the received IPv4 header:
If the IPv4 packet is dropped because fragmentation is not allowed, an ICMPv4 Datagram Too Big message is returned to the source. This message carries information about the supported MTU size, in essence notifying the source to reduce its MTU size to the requested value (tunnel-mtu).
The maximum number of supported fragments per IPv6 packet is 8. Considering that the minimum standards-based size for an IPv6 packet is 1280 bytes, 8 fragments are enough to cover jumbo Ethernet frames.
Downstream fragmentation in NAT64 works in a similar fashion. The difference from DS-Lite is that in NAT64 the configured ipv6-mtu represents the MTU size of the IPv6 packet (as opposed to the payload of the IPv6 tunnel in DS-Lite). In addition, the IPv4 packet in NAT64 is not tunneled; instead, the IPv4/v6 headers are translated. Consequently, the fragmented IPv6 packet is 28 bytes larger than the translated IPv4 packet: a 20-byte difference in basic IP header sizes (40-byte IPv6 header versus 20-byte IPv4 header) plus 8 bytes for the IPv6 fragmentation extension header. The only extension IPv6 header that NAT64 generates is the fragmentation header.
If the IPv4 packet is dropped because fragmentation is not allowed, the returned ICMP message contains an MTU size of ipv6-mtu minus 28 bytes.
Otherwise the fragmentation options are the same as in DS-lite.
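The header-overhead arithmetic above can be summarized in a short Python sketch. The constants follow directly from the header sizes stated in the text; the function names are ours:

```python
IPV6_HEADER = 40      # basic IPv6 header
FRAG_EXT_HEADER = 8   # IPv6 fragmentation extension header
IPV4_HEADER = 20      # basic IPv4 header

def dslite_max_fragment(tunnel_mtu):
    """DS-Lite downstream: each IPv6 fragment is at most
    tunnel-mtu plus 48 bytes of IPv6 headers."""
    return tunnel_mtu + IPV6_HEADER + FRAG_EXT_HEADER

def nat64_icmp_mtu(ipv6_mtu):
    """NAT64: the ICMP Datagram Too Big message advertises
    ipv6-mtu minus 28 bytes (20-byte header-size delta plus
    the 8-byte fragmentation extension header)."""
    return ipv6_mtu - ((IPV6_HEADER - IPV4_HEADER) + FRAG_EXT_HEADER)

print(dslite_max_fragment(1500))  # 1548-byte IPv6 fragments
print(nat64_icmp_mtu(1500))       # 1472 bytes advertised to the source
```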
The NAT command histogram displays compartmentalized port distribution per protocol for an aggregated number of subscribers. This allows operators to trend port usage over time and consequently adjust the configuration as the port demand per subscriber increases or decreases. For example, an operator may find that the port usage in a pool has increased over a period of time. Accordingly, the operator may plan to increase the number of ports per port block.
The feature is not applicable to pools which operate in one-to-one mode.
The output is organized in port buckets with the number of subscribers in each bucket.
For example:
The output of the histogram command can be periodically exported to an external destination via cron. The following is an example:
The nat-histogram.txt file contains the command execution line. For example:
This command is executed every 10 minutes (600 seconds), and the output of the command is written into a set of files on an external FTP server:
The output of this command displays the port usage in a given pool per protocol per subscriber. The output is organized in a configurable number of port-buckets.
In the following example there is 1 subscriber that is using between 20 and 39 UDP ports in the pool named det. The pool is configured in the Base routing instance.
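The bucketing used in the output above can be illustrated with a small Python sketch. The bucket width of 20 ports matches the [20-39] bucket in the example; the function name and input format are our assumptions:

```python
from collections import Counter

def port_histogram(ports_in_use_per_sub, bucket_width=20):
    """Group subscribers into port-usage buckets, returning a mapping
    of bucket start (e.g. 20 for the [20-39] bucket) to the number of
    subscribers whose port usage falls in that bucket."""
    return Counter(
        (used // bucket_width) * bucket_width
        for used in ports_in_use_per_sub
    )

# One subscriber using 25 UDP ports lands in the [20-39] bucket,
# as in the example output above
print(port_histogram([25]))
```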
Multi-chassis stateless NAT redundancy is based on a switchover of the NAT pool that can assume active (master) or standby state. The inside/outside routes that attract traffic to the NAT pool are always advertised from the active node (the node on which the pool is active).
This dual-homed redundancy based on the pool mastership state works well in scenarios where each inside routing context is configured with a single NAT policy (NATed traffic within this inside routing context will be mapped to a single NAT pool).
However, when inside traffic is mapped to multiple pools (with deterministic NAT, or when multiple NAT policies are configured per inside routing context), the basic per-pool multi-chassis redundancy mode can cause inside traffic within the same routing instance to fail, because some pools referenced from the routing instance might be active on one node while other pools are active on the other node.
Consider a case where traffic ingressing the same inside routing instance is mapped as follows (this mapping can be achieved via filters):
Traffic for the same destination is normally attracted only to one NAT node (the destination route is advertised only from a single NAT node). Let us assume that this node is Node 1 in our example. When the traffic arrives at the NAT node, it is mapped to the corresponding pool according to the mapping criteria (routing-based or filter-based). But if active pools are not co-located, traffic destined for a pool that is active on the neighboring node fails. In our example, traffic from source IP address B arrives at Node 1 while the corresponding Pool 2 is inactive on that node; consequently, the traffic forwarding fails.
To remedy this situation, a group of pools targeted from the same inside routing context must be active on the same node simultaneously. In other words, the active pools referenced from the same inside routing instance must be co-located. This group of pools is referred to as a Pool Fate Sharing Group (PFSG). A PFSG is defined as the group of all NAT pools referenced by inside routing contexts whereby at least one of those pools is shared by those inside routing contexts. This is shown in Figure 66.
Even though only Pool 2 is shared between subscribers in VRF 1 and VRF 2, the remaining pools in VRF 1 and VRF 2 must be made part of PFSG 1 as well.
This ensures that the inside traffic is always mapped to pools that are active on a single node.
There is always one lead pool in a PFSG. The lead pool is the only pool that exports/monitors routes. The other pools in the PFSG reference the lead pool and inherit its activity state. If any of the pools in the PFSG fails, all the pools in the PFSG switch activity; in other words, they share the fate of the lead pool (active/standby/disabled).
There is one lead pool per PFSG per node in a dual-homed environment. Each lead pool in a PFSG has its own export route that must match the monitoring route of the lead pool in the corresponding PFSG on the peering node.
PFSG is implicitly enabled by configuring multiple pools to follow the same lead pool.
Attracting traffic to the active NAT node (from inside and outside) is based on the routing.
On the outside, the active pool address range is advertised. On the inside, the destination prefix or steering route (in the case of filter-based diversion to the NAT function) is advertised by the node with the active pool.
The advertisement of the routes is driven by the activity of the pools in the pool fate sharing group:
For example:
A pool can be one of the following:
Both sets of options are thus mutually exclusive.
For a lead pool, redundancy is enabled only when the redundancy node is in a no shutdown state. For a following pool, the administrative state has no effect; redundancy is enabled only when the lead pool is enabled.
Before a lead pool is enabled, a consistency check is performed to make sure that the PFSG is properly configured and that all pools in the given PFSG belong to the same NAT isa-group. A PFSG is implicitly enabled by configuring multiple pools to follow the same lead pool. Adding or removing pools from the fate-share-group is only possible when the lead pool is disabled.
For example, in the following case, the consistency check fails because pool 1 is not part of PFSG 1 (where it should be).
The following command displays the state of the leading pool (dual-homing section towards the bottom of the command output):
The following command displays the state of the follower pool (dual-homing section towards the bottom of the command output):
The following command lists all the pools that are configured along with the NAT inside/outside routing context.
NAT ISA redundancy helps protect against Integrated Service Adapter (ISA) failures. This protection mechanism relies on the CPM maintaining a configuration copy of each ISA. If an ISA fails, the CPM restores the NAT configuration from the failed ISA to the remaining ISAs in the system. The NAT configuration copy of each ISA, as maintained by the CPM, is concerned with the configuration of outside IP addresses and port forwards on each ISA. However, the CPM does not maintain the state of dynamically created translations on each ISA. This causes an interruption in traffic until the translations are re-initiated by the devices behind the NAT.
Two modes of operation are supported:
By reserving memory resources, it can be ensured that failed traffic is recovered by the remaining ISAs, potentially with some bandwidth reduction if the remaining ISAs operated at or close to full speed before the failure occurred. The active-active ISA redundancy model is shown in Figure 69.
If an ISA fails, the member-id of the member ISA that failed is contained in the FREE log. This information is used to find the corresponding MAP log, which also contains the member-id field.
In the case of RADIUS logging, a CPM summarization trap is generated (because the RADIUS log is sent from the ISA, which has failed).
In active-active ISA redundancy, each ISA is subdivided into multiple logical ISAs. These logical sub-entities are referred to as members. The NAT configuration of each member is saved in the CPM. If any one ISA fails, its members are downloaded by the CPM to the remaining active ISAs. Memory resources on each ISA are reserved to accommodate additional traffic from the failed ISAs. The amount of resources reserved per ISA depends on the number of ISAs in the system and the number of simultaneously supported ISA failures. The number of simultaneous ISA failures per system is configurable. Memory reservation affects NAT scale per ISA.
Traffic received on the inside is forwarded by the ingress forwarding complex to a predetermined member ISA for further NAT processing. Each ingress forwarding complex maintains an internal link per member. The number of these internal links determines, among other factors, the maximum number of members per system and, with this, the granularity of traffic distribution over the remaining ISAs in case of an ISA failure. The segmentation of ISAs into members for a single-failure scenario is shown in Figure 70. The protection mechanism in this example is designed to cover one physical ISA failure. Each ISA is divided into four members. Three of those carry traffic during normal operation, while the fourth has resources reserved to accommodate traffic from one of the members in case of failure. When an ISA failure occurs, three members are delegated to the remaining ISAs. Each member from the failed ISA is mapped to a corresponding reserved member on the remaining ISAs.
The active-active ISA redundancy model supports multiple simultaneous failures. The protection mechanism shown in Figure 71 is designed to protect against two simultaneous ISA failures. In this case, each ISA is divided into six members, three of which carry traffic under normal circumstances, while the remaining three members have reserved memory resources.
Table 40 shows resource utilization for a single ISA failure in relation to the total number of ISAs in the system. The resource utilization affects only the scale of each ISA. However, bandwidth per ISA is not reserved, and each ISA can operate at full speed at any given time (with or without failures).
Number of Physical ISAs per System | Number of Member ISAs per Physical ISA (active/reserved) | Resource Utilization Per System in Non-Failed Condition | Resource Utilization Per System With One Failed ISA |
2 | 1A 1R | 50% | 100% |
3 | 2A 1R | 67% | 100% |
4 | 3A 1R | 75% | 100% |
5 | 3A 1R | 75% | 95% |
6 | 2A 1R | 66% | 83% |
7 | 2A 1R | 66% | 80% |
8 | 2A 1R | 66% | 79% |
9 | 1A 1R | 50% | 61% |
10 | 1A 1R | 50% | 60% |
11 | 1A 1R | 50% | 59% |
12 | 1A 1R | 50% | 58% |
13 | 1A 1R | 50% | 58% |
14 | 1A 1R | 50% | 57% |
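The simple arithmetic behind the table can be sketched as follows. For small systems these formulas reproduce the table rows exactly; for larger systems the published values differ slightly, presumably due to rounding and member-distribution granularity, so treat this as an approximation. The function names are ours:

```python
def utilization_no_failure(active, reserved):
    """Fraction of each ISA's resources usable in a non-failed state:
    only the active members carry traffic."""
    return active / (active + reserved)

def utilization_one_failure(num_isas, active, reserved):
    """After one ISA fails, its active members are re-homed onto the
    reserved members of the surviving ISAs."""
    return (num_isas * active) / ((num_isas - 1) * (active + reserved))

# 4 ISAs with 3 active + 1 reserved members each:
# 75% utilization normally, 100% after a single failure (as in Table 40)
print(utilization_no_failure(3, 1), utilization_one_failure(4, 3, 1))
```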
During the first five minutes of system boot-up or nat-group activation, the system behaves as if all ISAs are operational. Consequently, ISAs are segmented into their members according to the configured maximum number of supported failures.
Upon expiration of this initial five-minute interval, the system is re-evaluated. If one or more ISAs are found in a faulty state during re-evaluation, the members of the failed ISAs are distributed to the remaining operational ISAs.
When a failed ISA recovers, the system automatically accepts it and traffic is assigned to it. Traffic that is moved to the recovered ISA is interrupted.
Adding ISAs to an operational nat-group requires reconfiguration of the active mda-limit for the nat-group (or the failed mda-limit, for that matter). This is only possible when the nat-group is in an administratively shutdown state.
This section describes the interaction between MS-ISA applications and other system features.
All MS-ISA applications support service mirroring with no feature interactions or impacts. For example, any service diverted to AA, IPsec, NAT, LNS, or supported combinations of MS-ISA applications also supports service mirroring simultaneously.
Multiple uses of MS-ISAs can be combined at one time by daisy-chaining the MS-ISAs. Services and subscribers terminated on the LNS ISA are fully supported by Application Assurance per-AA-subscriber and service capabilities, and by the full NAT capabilities.
When Application Assurance and NAT are used in combination (for both ESM and SAP service contexts):
Subscriber aware Large Scale NAT44 attempts to combine the positive attributes of Large Scale NAT44 and L2-Aware NAT, namely:
Subscriber awareness in Large Scale NAT44 facilitates the release of NAT resources immediately after the BNG subscriber is terminated, without having to wait for the last flow of the subscriber to expire on its own (the TCP timeout is 4 hours by default).
The subscriber-aware Large Scale NAT44 function leverages the RADIUS accounting proxy built into the 7750 SR. The RADIUS accounting proxy allows the 7750 SR to inform the Large Scale NAT44 application about individual BNG subscribers from the RADIUS accounting messages generated by a remote BNG, and to use this information in the management of Large Scale NAT44 subscribers. The combination of the two allows, for example, the 7750 SR running as a Large Scale NAT44 to correlate the BNG subscriber (represented in the Large Scale NAT44 by the inside IP address) with RADIUS attributes such as User-Name, Alc-Sub-Ident-String, Calling-Station-Id or Class. These attributes can subsequently be used either for management of the Large Scale NAT44 subscriber or in the NAT RADIUS accounting messages generated by the 7750 SR Large Scale NAT44 application. Doing so simplifies both the administration of the Large Scale NAT44 and the logging function for port-range blocks.
As BNG subscribers authenticate and come online, the RADIUS accounting messages are snooped by the RADIUS accounting proxy, which creates a cache of attributes for each BNG subscriber. BNG subscribers are correlated with NAT subscribers via the framed IP address and one of the following attributes, which must be present in the accounting messages generated by the BNG:
The framed IP address must also be present in the accounting messages generated by the BNG.
The Large Scale NAT44 subscriber-aware application receives a number of cached attributes, which are then used for appropriate management of Large Scale NAT44 subscribers, for example:
Creation and removal of RADIUS accounting proxy cache entries related to a BNG subscriber are triggered by the receipt of accounting start/stop messages sourced by the BNG subscriber. Modification of entries can be triggered by interim-update messages carrying updated attributes. Cached entries can also be purged via the CLI.
In addition to passing one of the above attributes in Large Scale NAT44 RADIUS accounting messages, a set of opaque BNG subscriber RADIUS attributes can optionally be passed in Large Scale NAT44 RADIUS accounting messages. Up to 128 bytes of such opaque attributes are accepted; the remaining attributes are truncated.
Large Scale NAT44 subscriber instantiation can optionally be denied if the corresponding BNG subscriber cannot be identified in Large Scale NAT44 via the RADIUS accounting proxy.
Configuration guidelines:
Configure the RADIUS accounting proxy functionality in a routing instance that receives accounting messages from the remote or local BNG. Optionally, forward accounting messages received by the RADIUS accounting proxy to the final accounting destination (the accounting server).
Point the BNG RADIUS accounting destination to the RADIUS accounting proxy; this way, the RADIUS accounting proxy receives and snoops the BNG RADIUS accounting data.
A BNG subscriber can be associated with two accounting policies and can therefore point to two different accounting destinations: for example, one to the RADIUS accounting proxy and the other to the real accounting server.
Configure subscriber-aware Large Scale NAT44. From the Large Scale NAT44 subscriber-aware application, reference the RADIUS accounting proxy server and define the string that is used to correlate the BNG subscriber with the Large Scale NAT44 subscriber.
Optionally, enable NAT RADIUS accounting that includes BNG subscriber relevant data.
(1)
The RADIUS accounting proxy listens for accounting messages on the interface ‘rad-proxy-loopback’.
The name ‘proxy-acct’, as defined by the server command, is used to reference this proxy accounting server from Large Scale NAT44.
Received accounting messages can be relayed further from the RADIUS accounting proxy to the accounting server, which can be indirectly referenced in the default-accounting-policy ‘lsn-policy’.
This lsn-policy can then reference an external RADIUS accounting server with its own security credentials. This external accounting server can be configured in any routing instance.
(2)
Two RADIUS accounting policies can be configured in the BNG: one pointing to the real RADIUS server and the other to the RADIUS accounting proxy.
(3)
Sub-aware Large Scale NAT44 references the RADIUS accounting proxy server ‘proxy-acct’ and defines the calling-station-id attribute from the BNG subscriber as the matching attribute:
(4)
Optionally, RADIUS NAT accounting can be enabled:
Such a setup produces a stream of Large Scale NAT44 RADIUS accounting messages such as the following:
The matching accounting stream generated on the BNG is given below:
Note: The MAP-T feature and commands described in this section apply to the Nokia Virtualized Service Router (VSR) only.
MAP-T is a NAT technique defined in RFC 7599. Its key advantage is the decentralization of stateful NAT while enabling the sharing of public IPv4 addresses among the customer edge (CE) devices. In a nutshell, the CE performs the stateful NAT44 function and translates the resulting IPv4 packet into an IPv6 packet. The IPv6 packet is transported over the IPv6 network to the Border Router (BR), which then translates the IPv6 packet to IPv4 and sends it into the public domain.
As multiple CEs can share a single public IPv4 address, MAP-T must rely on an algorithm (the A+P algorithm running on the CEs and the BR) to ensure that each CE is assigned a unique port range on a shared public IPv4 address. In this way, each CE can be uniquely identified at the BR by the combination of the shared public IPv4 address and its unique port range. A set of CEs and a BR that share a common set of MAP algorithm rules constitutes a MAP domain. Figure 72 shows a network-level view of MAP-T.
MAP-T offers the following advantages mainly as a result of its stateless BR operation:
Mapping of address and port (MAP) is a generic function, regardless of the underlying transport mechanism (MAP-T or MAP-E) used. Each MAP CE is assigned as follows:
The CE and BR perform the following functions in the MAP-T domain:
Each device (CE and BR) is also responsible for fragmentation handling and ICMP error reporting (MTU too small, TTL expired, and so on).
MAP-T rules control the address translation in a MAP-T domain. The mapping rules can be delivered to the devices in the MAP domain using RADIUS or DHCP, or be statically provisioned.
The MAP-T rules are:
The public IPv4 address and the port-range information of the CE is encoded in its assigned IPv6 delegated prefix (IA-PD). The BMR holds the key to decode this information from the IA-PD of the CE. The BMR identifies the portion of bits of the IA-PD that contain the suffix of the IPv4 address and the port-set ID (PSID). These bits are called the EA bits. The PSID represents the port-range assigned to the CE.
The public IPv4 address of the CE is constructed by concatenating the IPv4 prefix carried in the BMR and the suffix, which is extracted from the EA bits within the IA-PD. The port-range is identified by the remaining EA bits (PSID portion). The EA bits uniquely identify the CE within the IPv6 network in a given MAP domain.
The psid-offset value must be set to a value greater than 0. It represents ports that are omitted from the mapping (for example, well-known ports).
An IPv4 address and port on the private side of the CE must be statefully translated to the public IPv4 address and a port within the port-set assigned to the public side of the CE. This ensures that the BR, using the same MAP rules, can extrapolate the IPv4 source of the packet for verification (anti-spoofing) purposes in the upstream direction and, conversely, determine the destination IPv6 MAP address (CE address) in the downstream direction (based on the destination IPv4 address and port).
The IPv6 MAP address is constructed by setting the subnet-id in the delegated IPv6 prefix to 0. In this way, the subnet-id of 0 is reserved for the MAP function. The remaining subnets can be delegated on the LAN side of the CE.
The interface-id is set to the IPv4 public address and PSID. This is described in RFC 7599, §6.
In this way, the IPv4 and IPv6 addresses of the CE are defined and easily converted to each other based on the BMR and the port information in the packet. Figure 73 shows the A+P mapping algorithm.
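The address construction above can be shown with a short Python sketch. This is a hypothetical helper following RFC 7597, §5.2 (subnet-id 0 in the delegated prefix, interface-id carrying the public IPv4 address and the left-zero-padded 16-bit PSID); the input values are illustrative.

```python
import ipaddress

def map_ce_ipv6(delegated_prefix, public_v4, psid):
    """Build the CE's MAP IPv6 address: subnet-id 0, interface-id =
    16 zero bits | 32-bit public IPv4 address | 16-bit PSID."""
    pd = ipaddress.IPv6Network(delegated_prefix)
    v4 = int(ipaddress.IPv4Address(public_v4))
    iid = (v4 << 16) | psid    # IPv4 and PSID occupy the low 48 bits
    return ipaddress.IPv6Address(int(pd.network_address) | iid)

# A CE with IA-PD 2001:db8:0:10::/60, public address 11.11.11.5, PSID 5:
print(map_ce_ipv6('2001:db8:0:10::/60', '11.11.11.5', 5))
# -> 2001:db8:0:10:0:b0b:b05:5
```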
Figure 74 shows a MAP-T deployment scenario.
The routes related to MAP-T are:
Routes related to MAP-T are advertised through IPv4 and IPv6 routing protocols. MAP-T routes in the VSR are owned by “protocol nat” with a metric of 50. This can be used to configure an export routing policy when advertising MAP-T routes in IGP or BGP.
Multiple MAP-T domains can be supported in the same routing context.
Note: IPv6 IA-PD end-user prefixes are carved out of the IPv6 rule prefix. Aside from MAP-T, the IA-PD is used for native IPv6 end-to-end traffic outside of MAP-T. Although the IPv6 rule prefix is not marked as a NAT route in the routing table, it is nonetheless advertised in the upstream direction.
In the upstream direction, when the BR receives an IPv6 packet destined for the BR prefix, a source-based IPv6 address lookup (anti-spoofing) is performed to verify that the packet was sent by a legitimate CE.
In the downstream direction, a destination-based IPv4 lookup is performed. This leads to the MAP-T rule entry, which provides the information necessary to derive the IPv6 address of the destination CE.
The MAP-T forwarding function in the VSR is also responsible for:
In address-sharing scenarios, address translation is performed for TCP/UDP and a subset of ICMP traffic; everything else is dropped. In contrast, 1:1 and prefix-sharing scenarios are protocol agnostic.
An IPv6 address of the MAP-T node is constructed according to RFC 7597, §5.2 and RFC 7599, §6. Figure 75 shows the IPv6 address of the MAP-T node.
The subnet ID for a MAP node (CE) is set to 0. Figure 76 shows the node interface (PSID is left-padded with zeros to create a 16-bit field and the IPv4 address is the public IPv4 address assigned to the CE).
This constructed IPv6 address represents the source IPv6 address of traffic sent from the CE to BR (upstream direction), and the destination IPv6 address in the opposite direction (downstream traffic sent from the BR to the CE).
The source IPv6 address in the downstream direction is a combination of the BR IPv6 prefix and the source IPv4 address (per RFC 7599, §5.1) received in the original packet.
The destination IPv6 address in the upstream direction is a combination of the BR IPv6 prefix and the IPv4 destination address (RFC 7599, §5.1) in the original packet.
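The BR-side address mapping can be sketched the same way. RFC 7599 refers to RFC 6052 for embedding an IPv4 address in the BR prefix; the /64 case is shown below, and the prefix and address values are illustrative.

```python
import ipaddress

def br_map_addr(br_prefix, v4):
    """Embed an IPv4 address in a /64 BR prefix (RFC 6052 embedding, as
    used for the BR prefix in MAP-T): the IPv4 address occupies bits
    72-103 and the 'u' octet (bits 64-71) is zero."""
    p = ipaddress.IPv6Network(br_prefix)
    assert p.prefixlen == 64    # other prefix lengths shift the IPv4 bits
    return ipaddress.IPv6Address(
        int(p.network_address) | (int(ipaddress.IPv4Address(v4)) << 24))

# Downstream source / upstream destination for IPv4 host 192.0.2.1:
print(br_map_addr('2001:db8:ffff::/64', '192.0.2.1'))
# -> 2001:db8:ffff:0:c0:2:100:0
```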
1:1 translations refer to the case in which each CE is assigned a distinct public IPv4 address; that is, there is no public IPv4 address sharing between the CEs. In this case, the PSID field is 0 and the sum of lengths for the IPv4 rule prefix and EA bits is 32. In other words, all the EA bits represent the IPv4 suffix. The public IPv4 address of the CE is created by concatenating the Rule IPv4 Prefix and the EA bits.
IPv4 Prefix translations refer to the case where an IPv4 prefix is assigned to a CE. In this case, the PSID field is 0 and the sum of the lengths for IPv4 rule prefix and EA bits is less than 32.
In both preceding cases, the translations are protocol agnostic; all protocols, not just TCP/UDP and ICMP, are translated.
The BR supports hub-and-spoke topology, which means that the BR facilitates communication between MAP-T CEs.
Rule prefix overlap is not supported because it can cause lookup ambiguity. Figure 77 shows a rule prefix overlap example.
When rule IPv6 prefix 1 is a subset of rule IPv6 prefix 2, the overlap between the EA bits in end-user prefix 2 and rule prefix 1 (the shaded sections in Figure 77) could render end-user prefixes 1 and 2 indistinguishable (everything else being equal) when the anti-spoof lookup is performed in the upstream direction, resulting in an incorrect anti-spoofing decision.
Similar logic applies to overlapping IPv4 prefixes in the downstream direction, where the longest prefix match always leads to the same CE, while the shorter match (leading to a different CE) is never evaluated.
This section examines an example MAP-T deployment with three MAP rules. The deployment assumes the following:
The 12,000 private IPv4 addresses (CEs) in a 16:1 sharing scenario can be covered using three /24 subnets as follows:
(3 * 2^8 * 16 = 12,288)
The IPv4 rule prefix and EA bits length per rule in this scenario are:
With a psid-offset of 6, ports whose first 6 bits of the 16-bit port number are all 0 are excluded from the mapping (ports 0 to 1023 are reserved); therefore, the user-allocated port space per CE is calculated as follows:
65,536 / 16 - 64 = 4096 - 64 = 4032 ports
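The sizing above can be checked with simple arithmetic (Python; the values come from this example):

```python
# Three /24 IPv4 rule prefixes at a 16:1 sharing ratio cover the
# 12,000 CEs in the example:
assert 3 * 2 ** 8 * 16 == 12_288

# Per-CE port space: 65,536 ports / 16 CEs = 4,096, minus the 64 ports
# per CE that fall in the excluded psid-offset range (ports 0-1023):
assert 65536 // 16 - 64 == 4032
```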
The IPv6 rule prefix is the next parameter in the MAP rule. Figure 78 shows the relevant bits in the IPv6 address: only bits /32 to /64 are considered; the irrelevant bits of the IPv6 addresses are ignored in this example.
The following three rules are created in this example:
In each of the three cases, the EA bits occupy the bits between the IPv6 rule prefix length (/48) and the PD length (/60).
The IPv6 rule prefix length is determined for each of the three rules. However, the IPv6 rule prefixes must not overlap (see section 7.22.4.4 for more information). Non-overlapping IPv6 rule prefixes ensure that each CE is assigned a unique IA-PD. Table 41 describes the rules.
| Rule 1 | Rule 2 | Rule 3
IPv6 rule prefix | 2001:db8:0000::/48 | 2001:db8:0001::/48 | 2001:db8:0002::/48
IPv4 rule prefix | 11.11.11.0/24 | 12.12.12.0/24 | 13.13.13.0/24
EA bits | 12 | 12 | 12
psid-offset | 6 | 6 | 6
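Decoding an IA-PD back to its public IPv4 address and PSID under rule 1 can be sketched as follows. This is a hypothetical helper, not SR OS code; the IA-PD value is illustrative, while the rule values come from this example.

```python
import ipaddress

def decode_ia_pd(ia_pd, rule_v6, rule_v4):
    """Extract the public IPv4 address and PSID from a delegated prefix
    using the BMR: the EA bits sit between the IPv6 rule prefix length
    and the IA-PD length, with the IPv4 suffix first and the PSID last."""
    pd = ipaddress.IPv6Network(ia_pd)
    r6 = ipaddress.IPv6Network(rule_v6)
    r4 = ipaddress.IPv4Network(rule_v4)
    ea_len = pd.prefixlen - r6.prefixlen             # 60 - 48 = 12 EA bits
    ea = (int(pd.network_address) >> (128 - pd.prefixlen)) & ((1 << ea_len) - 1)
    psid_len = ea_len - (32 - r4.prefixlen)          # 12 - 8 = 4-bit PSID
    suffix, psid = ea >> psid_len, ea & ((1 << psid_len) - 1)
    return str(ipaddress.IPv4Address(int(r4.network_address) | suffix)), psid

# IA-PD 2001:db8:0:150::/60 under rule 1 (EA bits = 0x015):
print(decode_ia_pd('2001:db8:0:150::/60', '2001:db8::/48', '11.11.11.0/24'))
# -> ('11.11.11.1', 5)
```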
The final step is to ensure that the DHCPv6 server hands out proper end-user prefixes (IA-PD) and that the MAP rules are also delivered.
In this example, each /48 IPv6 rule prefix supports 4,000 MAP-T CEs, where each CE can further delegate 15 IPv6 “subnets” on the LAN side and each CE is allocated about 4,000 ports to use in stateful NAT44.
Note: The VSR-BR supports only IPv6 rule prefixes of the same length within a given domain. To accommodate a different prefix length assignment for the IA-PD (for example, /56), create another domain with a different IPv6 rule prefix length (/44 instead of /48).
The following ICMPv4 messages are supported in MAP-T on the VSR; other types of ICMP messages are not supported:
The ICMP Query messages and ICMP Error messages are supported regardless of whether they are just passing through a VSR (transit messages), or are terminated or generated in or from a VSR.
The NAT-related ICMPv4 behavior is described in RFC 5508. The following NAT messages are supported in the MAP-T VSR (RFC 5508, §7, Requirement 10a):
The IPv6 header of the IPv4-translated packet in MAP-T can be up to 28 bytes larger than the IPv4 header (40-byte IPv6 header plus 8-byte fragmentation header versus a 20-byte IPv4 header). When the IPv4-to-IPv6 translated packet would be larger than the IPv6 MTU, the original IPv4 packet is fragmented so that each translated IPv6 packet stays within the IPv6 MTU. IPv6 packets are never fragmented themselves, although they may contain the fragmentation header that carries fragmentation information related to the original IPv4 packet or fragment.
The IPv6 MTU in the VSR is configurable for each MAP-T domain. The L2 header is excluded from the IPv6 MTU.
All fragments of the same IPv4 packet are translated and sent toward the same CE. Because the second and subsequent fragments do not contain any port information, their translation is based on <SA, DA, Prot, Ident> flow records cached from the IPv4 header.
Note that the VSR may further fragment an IPv4 fragment that it has received in order to fit it within the IPv6 MTU.
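The downstream flow-record cache described above can be modeled minimally. This is illustrative Python only; the function name and field values are assumptions, not SR OS internals.

```python
# Only the first IPv4 fragment carries the destination port, so the BR
# caches the translation decision keyed on <SA, DA, Prot, Ident> and
# applies it to the remaining fragments of the same packet.
flow_cache = {}

def classify_fragment(sa, da, prot, ident, dport=None):
    key = (sa, da, prot, ident)
    if dport is not None:           # first fragment: learn the flow
        flow_cache[key] = dport
    return flow_cache.get(key)      # later fragments reuse the record

# The first fragment carries destination port 1344; the second fragment
# has no ports but still resolves to the same CE:
classify_fragment('198.51.100.7', '11.11.11.1', 17, 0x1234, dport=1344)
assert classify_fragment('198.51.100.7', '11.11.11.1', 17, 0x1234) == 1344
```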
Figure 79 shows downstream fragmentation scenarios.
In the upstream direction, the received IPv6 fragments are artifacts of IPv4 packets that were fragmented on the CE side before being translated into IPv6. No flow caching is performed in the upstream direction. The BR performs an anti-spoof check on each fragment and, if the check succeeds, translates the fragment to IPv4. Figure 80 shows the upstream fragmentation scenario.
Fragmentation statistics can be cleared using the clear nat map frag-stats command. The following fragmentation statistics are available:
The MSS adjust feature is used to prevent fragmentation of TCP traffic. TCP synchronize (SYN) packets are intercepted and their MSS option is inspected to ensure that it conforms to the configured MSS value. If the inspected value is greater than the value configured in the VSR BR, the MSS value in the packet is lowered to the configured value before the TCP SYN packet is forwarded.
As the end nodes governing the MSS value are IPv4 nodes, this feature is supported for IPv4 packets only.
An MSS adjust is performed in both the upstream and downstream directions.
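The adjustment itself reduces to clamping the SYN's MSS option, as the sketch below shows. The 1432-byte figure is an assumed example: a 1500-byte IPv6 MTU minus the 40-byte IPv6 header, 8-byte fragmentation header, and 20-byte TCP header.

```python
def clamp_mss(advertised_mss, configured_mss):
    """MSS adjust: lower the MSS option in a TCP SYN only when it
    exceeds the configured ceiling; smaller values pass unchanged."""
    return min(advertised_mss, configured_mss)

assert clamp_mss(1460, 1432) == 1432   # rewritten down to the ceiling
assert clamp_mss(1400, 1432) == 1400   # already conformant, untouched
```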
The VSR BR maintains a count of forwarded and dropped packets and octets per MAP-T domain, per direction. The statistics are collected on ingress (upstream IPv6 and downstream IPv4) and stored in 64-bit counters.
As with any NAT operation where the identity of the user is hidden behind the NAT identity, logging of the NAT translation information is required. In the MAP-T domain, NAT logging is based on configuration changes because the user identity can be derived from the configured rules.
A system can have a large number of rules and each configured MAP rule generates a separate log. As a result, the amount of logs generated can be substantial. Logging is explicitly enabled using a log event.
A NAT log contains information about the following:
A MAP rule log is generated when both of the following conditions are met:
Example:
A valid MAP-T license is required to enable the MAP-T functionality in the VSR BR. A MAP-T domain can only be instantiated with the appropriate license, which enables the following CLI command:
The MAP-T configuration consists of defining MAP-T parameters within a template. The MAP-T domain is then instantiated by applying (referencing) this template within a routing (router or VPRN) context.
Defining a MAP Domain Template
MAP-T Domain Instantiation
MAP Domain Example Template
The following example shows the MAP domain template for the BMRs defined in section 7.22.5.
MAP-T Domain Instantiation Example
The following example shows the MAP-T domain instantiation for the BMRs defined in section 7.22.5.
You can add new rules to an existing MAP-T domain while the MAP-T domain is instantiated and forwarding traffic. However, each rule must be in the shutdown state before any of its parameters are modified.
A MAP-T domain must be in the shutdown state to modify the dmr-prefix parameter. The remaining parameters (tcp-mss-adjust, mtu, ip-fragmentation) can be modified while the domain is active.
A MAP domain does not have to be in a shutdown state when rule modification is in progress.
MAP natively provides multi-chassis redundancy through the use of the anycast BR prefix that is advertised from multiple nodes.
Because no state is maintained in the MAP-T BR, any BR node can process traffic for the same domain at any time. The only traffic interrupted during a switchover is downstream fragmented traffic being handled at the moment of the switchover (the flow record cache is not synchronized between nodes).