This chapter provides information about Network Address Translation (NAT) and implementation notes.
Topics in this chapter include:
BNG Subscriber — A broader term than the ESM Subscriber, independent of the platform on which the subscriber is instantiated. It includes ESM subscribers on the 7750 SR as well as subscribers instantiated on third-party BNGs. Some NAT functions, such as Subscriber Aware Large Scale NAT44 utilizing standard RADIUS attributes, work with subscribers independently of the platform on which they are instantiated.
Deterministic NAT — A mode of operation where mappings between the NAT subscriber and the outside IP address and port range are allocated at the time of configuration. Each subscriber is permanently mapped to an outside IP address and a dedicated port block, referred to as the deterministic port block. Logging is not needed because the reverse mapping can be obtained using a known formula. The subscriber’s ports can be expanded by allocating a dynamic port block if all ports in the deterministic port block are exhausted. In that case, logging of the dynamic port block allocation and de-allocation is required.
Enhanced Subscriber Management (ESM) subscriber — A host or a collection of hosts instantiated in 7750 SR Broadband Network Gateway (BNG). The ESM subscriber represents a household or a business entity for which various services with committed Service Level Agreements (SLA) can be delivered. NAT function is not part of basic ESM functionality.
L2-Aware NAT — In the context of the 7750 SR platform, L2-Aware NAT combines the Enhanced Subscriber Management (ESM) subscriber-id and the inside IP address to perform translation into a unique outside IP address and outside port. This is in contrast with the classical NAT technique, where only the inside IP address is considered for address translation. Because the subscriber-id alone is sufficient to make the address translation unique, L2-Aware NAT allows many ESM subscribers to share the same inside IP address. The scalability, performance and reliability requirements are the same as in LSN.
Large Scale NAT (LSN) — Refers to a collection of network address translation techniques used in service provider networks and implemented on highly scalable, high-performance hardware that facilitates various intra-node and inter-node redundancy mechanisms. The purpose of the LSN term is to distinguish the high-scale, high-performance NAT functions found in service provider networks from enterprise NAT, which usually serves a much smaller customer base at lower speeds. The following NAT techniques can be grouped under the LSN name:
Each distinct NAT technique is referred to by its corresponding name (Large Scale NAT44 [or CGN], DS-Lite and NAT64) with the understanding that, in the context of the 7750 SR platform, they are all part of LSN (and not enterprise-based NAT).
The term Large Scale NAT44 can be used interchangeably with the term Carrier Grade NAT (CGN), whose name implies high reliability, high scale and high performance. These are, again, typical requirements found in service provider (carrier) networks.
The term L2-Aware NAT refers to a separate category of NAT defined outside of LSN.
NAT RADIUS accounting — Reporting (or logging) of address translation related events (port-block allocation/de-allocation) via the RADIUS accounting facility. NAT RADIUS accounting is facilitated via regular RADIUS accounting messages (start/interim-update/stop) as defined in RFC 2866, RADIUS Accounting, with NAT-specific VSAs.
The term NAT RADIUS accounting can be used interchangeably with the term NAT RADIUS logging.
NAT Subscriber — In NAT terminology, a NAT subscriber is an inside entity whose true identity is hidden from the outside. There are a few types of NAT implementation in the 7750 SR, and subscribers for each implementation are defined as follows:
Non-deterministic NAT — A mode of operation where all outside IP address and port block allocations are made dynamically at the time of subscriber instantiation. Logging is required in this case.
Port block — A collection of ports that is assigned to a subscriber. A deterministic LSN subscriber can have only one deterministic port block, which can be extended by multiple dynamic port blocks. A non-deterministic LSN subscriber can be assigned only dynamic port blocks. All port blocks for an LSN subscriber must be allocated from a single outside IP address.
Port range — A collection of ports that can span multiple port blocks of the same type. For example, the deterministic port range includes all ports that are reserved for deterministic consumption. Similarly, the dynamic port range is the total collection of ports that can be allocated in the form of dynamic port blocks. Other types of port ranges are well-known ports and static port forwards.
The 7750 SR supports Network Address (and port) Translation (NAPT) to provide continuity of legacy IPv4 services during the migration to native IPv6. By equipping the multi-service ISA (MS ISA) in an IOM3-XP, the 7750 SR can operate in two different modes, known as:
Both modes perform source address and port translation as commonly deployed for shared Internet access. The 7750 SR with NAT is used to provide consumer broadband or business Internet customers with access to IPv4 Internet resources through a shared pool of IPv4 addresses, such as may occur around the forecast IPv4 exhaustion. During this time, it is expected that native IPv6 services will still be growing and a significant amount of Internet content will remain IPv4.
Network Address Translation devices modify the IP headers of packets between a host and server, changing some or all of the source address, destination address, source port (TCP/UDP), destination port (TCP/UDP), or ICMP query ID (for ping). The 7750 SR in both NAT modes performs Source Network Address and Port Translation (S-NAPT). S-NAPT devices are commonly deployed in residential gateways and enterprise firewalls to allow multiple hosts to share one or more public IPv4 addresses to access the Internet. The common terms of inside and outside in the context of NAT refer to devices inside the NAT (that is behind or masqueraded by the NAT) and outside the NAT, on the public Internet.
TCP/UDP connections use ports for multiplexing, with 65536 ports available for every IP address. Whenever many hosts are trying to share a single public IP address there is a chance of port collision where two different hosts may use the same source port for a connection. The resultant collision is avoided in S-NAPT devices by translating the source port and tracking this in a stateful manner. All S-NAPT devices are stateful in nature and must monitor connection establishment and traffic to maintain translation mappings. The 7750 SR NAT implementation does not use the well-known port range (1..1023).
In most circumstances, S-NAPT requires the inside host to establish a connection to the public Internet host or server before a mapping and translation will occur. With the initial outbound IP packet, the S-NAPT knows the inside IP, inside port, remote IP, remote port and protocol. With this information the S-NAPT device can select an IP and port combination (referred to as outside IP and outside port) from its pool of addresses and create a unique mapping for this flow of data.
Any traffic returned from the server will use the outside IP and outside port in the destination IP/port fields – matching the unique NAT mapping. The mapping then provides the inside IP and inside port for translation.
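The mapping creation and return-traffic lookup described above can be illustrated with a minimal sketch. The class name, the sequential port selection and the example addresses are assumptions made for illustration only; they do not describe the 7750 SR implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Flow key as described in the text: protocol, inside IP/port, remote IP/port.
FlowKey = Tuple[str, str, int, str, int]


@dataclass
class SnaptTable:
    """Toy S-NAPT table (illustrative only, hypothetical names)."""
    outside_ip: str
    next_port: int = 1024  # the well-known port range is not used
    forward: Dict[FlowKey, Tuple[str, int]] = field(default_factory=dict)
    reverse: Dict[Tuple[str, str, int], Tuple[str, int]] = field(default_factory=dict)

    def outbound(self, proto, in_ip, in_port, remote_ip, remote_port):
        """Return (outside IP, outside port), creating the mapping on first use."""
        key = (proto, in_ip, in_port, remote_ip, remote_port)
        if key not in self.forward:
            if self.next_port > 65535:
                raise RuntimeError("outside ports exhausted")
            mapping = (self.outside_ip, self.next_port)
            self.next_port += 1
            self.forward[key] = mapping
            self.reverse[(proto, *mapping)] = (in_ip, in_port)
        return self.forward[key]

    def inbound(self, proto, dst_ip, dst_port) -> Optional[Tuple[str, int]]:
        """Return (inside IP, inside port) for return traffic, or None if unmapped."""
        return self.reverse.get((proto, dst_ip, dst_port))
```

Two inside hosts that pick the same source port receive different outside ports, which is how the port collision is avoided; an inbound packet with no matching entry returns None, reflecting that unsolicited traffic from the outside has no mapping to follow.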
The requirement to create a mapping with inside port and IP, outside port and IP, and protocol will generally prevent new connections from being established from the outside to the inside, as may occur when an inside host wishes to be a server.
Applications which operate as servers (such as HTTP, SMTP, etc) or peer-to-peer applications can have difficulty when operating behind an S-NAPT because traffic from the Internet can reach the NAT without a mapping in place.
Different methods can be employed to overcome this, including:
The 7750 SR supports all three methods following the best-practice RFC for TCP (RFC 5382, NAT Behavioral Requirements for TCP) and UDP (RFC 4787, Network Address Translation (NAT) Behavioral Requirements for Unicast UDP). Port Forwarding is supported on the 7750 SR to allow servers which operate on well-known ports <1024 (such as HTTP and SMTP) to request the appropriate outside port for permanent allocation.
STUN is facilitated by the support of Endpoint-Independent Filtering and Endpoint-Independent Mapping (RFC 4787) in the NAT device, allowing STUN-capable applications to detect the NAT and allow inbound P2P connections for that specific application. Many new SIP clients and IM chat applications are STUN capable.
Application Layer Gateways (ALG) allow the NAT to monitor the application running over TCP or UDP and make appropriate changes in the NAT translations to suit. The 7750 SR has an FTP ALG enabled following the recommendation of the IETF BEHAVE RFC for NAT (RFC 5382).
Even with these three mechanisms, some applications will still experience difficulty operating behind a NAT. As this is an industry-wide issue, forums like UPnP, the IETF, and the operator and vendor communities are seeking technical alternatives for application developers to traverse NAT (including STUN support). In many cases the alternative of an IPv6-capable application will give better long-term support without the cost or complexity associated with NAT.
Large Scale NAT represents the most common deployment of S-NAPT in carrier networks today; it is already employed by mobile operators around the world for handset access to the Internet.
A Large Scale NAT is typically deployed in a central network location with two interfaces, the inside towards the customers, and the outside towards the Internet. A Large Scale NAT functions as an IP router and is located between two routed network segments (the ISP network and the Internet).
Traffic can be sent to the Large Scale NAT function on the 7750 SR using IP filters (ACL) applied to SAPs or by installing static routes with a next-hop of the NAT application. These two methods allow for increased flexibility in deploying the Large Scale NAT, especially in environments where IP MPLS VPNs are used. In that case, the NAT function can be deployed on a single PE and perform NAT for any number of other PEs by simply exporting the default route.
The 7750 SR NAT implementation supports NAT in the base routing instance and VPRN and, through NAT, traffic may originate in one VPRN (the inside) and leave through another VPRN or the base routing instance (the outside). This technique can be employed to provide customers of IP MPLS VPNs with Internet access by introducing a default static route in the customer VPRN and NATing it into the Internet routing instance.
As Large Scale NAT is deployed between two routed segments, the IP addresses allocated to hosts on the inside must be unique to each host within the VPRN. While RFC 1918 private addresses have typically been used for this in enterprise or mobile environments, challenges can occur in fixed residential environments where a subscriber has existing S-NAPT in their residential gateway. In these cases the RFC 1918 private address in the home network may conflict with the address space assigned to the residential gateway WAN interface. Some of these issues are documented in draft-shirasaki-nat444-isp-shared-addr-02. Should a conflict occur, many residential gateways will fail to forward IP traffic.
The S-NAPT service on the 7750 SR BNG incorporates a port range block feature to address the scalability of a NAT mapping solution. With a single BNG capable of hundreds of thousands of NAT mappings every second, logging each mapping as it is created and destroyed for later retrieval (as may be required by law enforcement) could quickly overwhelm the fastest of databases and messaging protocols. Port range blocks address the issue of logging and customer location functions by allocating a block of contiguous outside ports to a single subscriber. Rather than logging each NAT mapping, a single log entry is created when the first mapping is created for a subscriber and a final log entry when the last mapping is destroyed. This can reduce the number of log entries by 5000x or more. An added benefit is that, because the range is allocated on the first mapping, external applications or customer location functions may be populated with this data to perform real-time subscriber identification, rather than having to query the NAT for the subscriber identity in real time and possibly delay applications.
Port range blocks are configurable as part of outside pool configuration, allowing the operator to specify the number of ports allocated to each subscriber when a mapping is created. Once a range is allocated to the subscriber, these ports are used for all outbound dynamic mappings and are assigned in a random manner to minimise the predictability of port allocations (draft-ietf-tsvwg-port-randomization-05).
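The following sketch illustrates the port range block idea: one contiguous block per subscriber, random port selection inside the block, and one log entry per block allocation or release rather than per mapping. The class, its method names and the log tuple format are hypothetical and chosen for illustration; the actual allocation logic of the NAT application is not described here.

```python
import random


class PortBlockPool:
    """Hands out contiguous outside port blocks, one per subscriber (toy model)."""

    def __init__(self, first_port, last_port, block_size):
        self.block_size = block_size
        # First port of every block that fits completely in [first_port, last_port].
        self.free_blocks = list(range(first_port, last_port - block_size + 2, block_size))
        self.blocks = {}   # subscriber -> first port of its block
        self.in_use = {}   # subscriber -> set of ports currently mapped
        self.log = []      # one entry per block allocation or release

    def map_port(self, subscriber, rng=random):
        """Create a mapping; return the outside port, or None if the block is full."""
        if subscriber not in self.blocks:
            start = self.free_blocks.pop(0)  # raises IndexError if the pool is empty
            self.blocks[subscriber] = start
            self.in_use[subscriber] = set()
            self.log.append(("alloc", subscriber, start, start + self.block_size - 1))
        start = self.blocks[subscriber]
        used = self.in_use[subscriber]
        free = [p for p in range(start, start + self.block_size) if p not in used]
        if not free:
            return None  # block exhausted: further mappings are refused
        port = rng.choice(free)  # random choice reduces port predictability
        used.add(port)
        return port

    def unmap_port(self, subscriber, port):
        """Remove a mapping; release and log the block when it was the last one."""
        self.in_use[subscriber].discard(port)
        if not self.in_use[subscriber]:
            start = self.blocks.pop(subscriber)
            del self.in_use[subscriber]
            self.free_blocks.append(start)
            self.log.append(("free", subscriber, start, start + self.block_size - 1))
```

However many mappings a subscriber creates and destroys while the block is held, the log contains only the allocation entry and, once the last mapping is removed, the release entry.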
Port range blocks also serve another useful function in a Large Scale NAT environment, and that is to manage the fair allocation of the shared IP resources among different subscribers.
When a subscriber exhausts all ports in their block, further mappings will be prohibited. As with any enforcement system, some exceptions are allowed: the NAT application can be configured with reserved ports so that high-priority applications retain access to outside port resources even when the block has been exhausted by low-priority applications.
Reserved ports allow an operator to configure a small number of ports to be reserved for designated applications should a port range block be exhausted. Such a scenario may occur when a subscriber is unwittingly subjected to a virus or engaged in extreme cases of P2P file transfers. In these situations, rather than block all new mappings indiscriminately, the 7750 SR NAT application allows operators to nominate a number of reserved ports and then assign a 7750 SR forwarding class as containing high-priority traffic for the NAT application. Whenever traffic that matches a priority session forwarding class reaches the NAT application, reserved ports will be consumed to improve the chances of success. Priority sessions could be used by the operator for services such as DNS, web portal, e-mail, VoIP, etc. to permit these applications even when a subscriber has exhausted their ports.
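A minimal sketch of the admission decision follows. It assumes that the reserved ports are counted inside the subscriber's port block and that the forwarding-class match has already been reduced to a boolean; both are simplifying assumptions, and the function name is hypothetical.

```python
def admit_mapping(ports_in_use, block_size, reserved_ports, is_priority):
    """Return True if a new mapping may be created in the subscriber's block.

    Assumption: reserved_ports are part of the block, so ordinary sessions
    may only use block_size - reserved_ports ports.
    """
    if is_priority:
        # Priority sessions may consume every port, including reserved ones.
        return ports_in_use < block_size
    return ports_in_use < block_size - reserved_ports
```

With a 128-port block and 8 reserved ports, an ordinary session is refused once 120 ports are in use, while a priority session (for example DNS) is still admitted until all 128 ports are consumed.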
The outside IP address is always shared for the subscriber with a port forward (static or via PCP) and the dynamically allocated port block, insofar as the port from the port forward is in the range >1023. This behavior can lead to starvation of dynamic port blocks for the subscriber. An example for this scenario is shown in Figure 50.
Eventually the PCs in Home 1 come to life and they try to connect to the Internet. Due to the dynamic port block exhaustion for the IP address 3.3.3.1 (that is mandated by static port forward – Web Server), the dynamic port block allocation will fail and consequently the PCs will not be able to access the Internet. There will be no additional attempt within CGN to allocate another outside IP address. In the CGN there is no distinction between the PCs in Home 1 and the Web Server when it comes to source IP address. They both share the same source IP address 2.2.2.1 on the CPE.
To prevent starvation of dynamic port blocks for subscribers that utilize port forwards, a dynamic port block (or blocks) will be reserved during the lifetime of the port forward. Those reserved dynamic port blocks will be associated with the same subscriber that created the port forward. However, a log will not be generated until the dynamic port block is actually used and mappings within that block are created.
At the time of the port forward creation, the dynamic port block will be reserved in the following fashion:
The reserved dynamic port block (even without any mapping) will continue to be associated with the subscriber as long as the port forward for the subscriber is present. The log (syslog or RADIUS) will be generated only when there is no active mapping within the dynamic port block AND all port forwards for the subscriber are deleted.
Additional considerations with dynamic port block reservation:
Creating a NAT mapping is only one half of the problem – removing a NAT mapping at the appropriate time maximizes the shared port resource. Having ports mapped when an application is no longer active reduces solution scale and may impact the customer experience should they exhaust their port range block. The NAT application provides timeout configuration for TCP, UDP and ICMP.
TCP state is tracked for all TCP connections, supporting both three-way handshake and simultaneous TCP SYN connections. Separate and configurable timeouts exist for TCP SYN, TCP transition (between SYN and Open), established and time-wait state. Time-wait assassination is supported and enabled by default to quickly remove TCP mappings in the TIME WAIT state.
UDP does not have the concept of connection state and is subject to a simple inactivity timer. Company-sponsored research into applications and NAT behavior suggested that some applications, such as the BitTorrent Distributed Hash Table (DHT), can make a large number of outbound UDP connections that are unsuccessful. Rather than wait the default five (5) minutes to time these out, the 7750 SR NAT application supports a udp-initial timeout, which defaults to 15 seconds. When the first outbound UDP packet is sent, the 15-second timer starts; only after subsequent packets (inbound or outbound) does the default UDP timer become active, greatly reducing the number of UDP mappings.
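The two-stage UDP timer can be sketched as follows. The default values come from the text above; the class and its method names are hypothetical, and time is passed in explicitly as seconds so the behavior is easy to follow.

```python
UDP_INITIAL_TIMEOUT = 15   # seconds, udp-initial default stated in the text
UDP_TIMEOUT = 5 * 60       # seconds, default UDP timeout stated in the text


class UdpMapping:
    """Inactivity tracking for one UDP mapping (illustrative only)."""

    def __init__(self, now):
        self.packets = 1       # the first outbound packet created the mapping
        self.last_seen = now

    def on_packet(self, now):
        """Record a subsequent inbound or outbound packet."""
        self.packets += 1
        self.last_seen = now

    def expired(self, now):
        """Short timer until a second packet is seen, regular timer afterwards."""
        timeout = UDP_INITIAL_TIMEOUT if self.packets == 1 else UDP_TIMEOUT
        return now - self.last_seen >= timeout
```

A mapping created by a single unanswered packet is removed after 15 seconds, whereas a mapping that has seen further traffic is kept until it has been idle for five minutes.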
It is possible to define watermarks to monitor the actual usage of sessions and/or ports.
For each watermark, a high and a low value must be set. Once the high value is reached, a notification will be sent. As soon as the usage drops below the low watermark, another notification will be sent.
Watermarks can be defined on nat-group, pool and policy level.
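The high/low watermark behavior amounts to a simple hysteresis, sketched below. It assumes that the low notification is only sent after a high notification has been raised; the class name and return values are hypothetical.

```python
class Watermark:
    """High/low watermark with hysteresis (illustrative only)."""

    def __init__(self, low, high):
        if not low < high:
            raise ValueError("low watermark must be below high watermark")
        self.low, self.high = low, high
        self.raised = False  # True between a high and the following low notification

    def update(self, usage):
        """Return 'high', 'low' or None for the notification triggered by usage."""
        if not self.raised and usage >= self.high:
            self.raised = True
            return "high"
        if self.raised and usage < self.low:
            self.raised = False
            return "low"
        return None
```

Because the two thresholds differ, usage that hovers around the high value produces a single notification instead of a stream of them.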
NAT is supported on DHCP, PPPoE and L2TP hosts; there is no support for static and ARP hosts.
In an effort to address issues of conflicting address space raised in draft-shirasaki-nat444-isp-shared-addr-02, an enhancement to Large Scale NAT was co-developed to give every broadband subscriber their own NAT mapping table, yet still share a common outside pool of IPs.
Layer-2 Aware (or subscriber aware) NAT is combined with Enhanced Subscriber Management on the 7750 SR BNG to overcome the issues of colliding address space between home networks and the inside routed network between the customer and Large Scale NAT.
Layer-2 Aware NAT permits every broadband subscriber to be allocated the exact same IPv4 address on their residential gateway WAN link and then proceeds to translate this into a public IP through the NAT application. In doing so, L2-Aware NAT avoids the issues of colliding address space raised in draft-shirasaki without any change to the customer gateway or CPE.
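The effect of including the subscriber identity in the translation key can be shown with a small sketch. The class, the key layout and the sequential port selection are assumptions made for illustration only.

```python
class L2AwareNat:
    """Toy translation table keyed on subscriber-id plus inside address."""

    def __init__(self, outside_ip, first_port=1024):
        self.outside_ip = outside_ip
        self.next_port = first_port
        # (subscriber_id, inside_ip, inside_port, proto) -> (outside_ip, outside_port)
        self.mappings = {}

    def translate(self, subscriber_id, inside_ip, inside_port, proto="udp"):
        """Return the outside (IP, port) for this subscriber's flow."""
        key = (subscriber_id, inside_ip, inside_port, proto)
        if key not in self.mappings:
            self.mappings[key] = (self.outside_ip, self.next_port)
            self.next_port += 1
        return self.mappings[key]
```

Two subscribers that were both assigned the inside address 10.0.0.1 and use the same source port still obtain distinct outside ports, because the subscriber-id makes the key unique.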
Layer-2-Aware NAT is supported on any of the ESM access technologies, including PPPoE, IPoE (DHCP) and L2TP LNS. For IPoE, both the n:1 (VLAN per service) and 1:1 (VLAN per subscriber) models are supported. A subscriber device operating with L2-Aware NAT needs no modification or enhancement, because the existing address mechanisms (DHCP or PPP/IPCP) are identical to those of a public IP service. The 7750 SR BNG simply translates all IPv4 traffic into a pool of IPv4 addresses, allowing many L2-Aware NAT subscribers to share the same IPv4 address.
More information on L2-Aware NAT can be found in draft-miles-behave-l2nat-00.
In 1:1 NAT, each source IP address is translated in 1:1 fashion to a corresponding outside IP address. However, the source ports are passed transparently without translation.
The mapping between the inside IP addresses and outside IP addresses in 1:1 NAT supports two modes:
The dynamic version of 1:1 NAT is protocol dependent. Only the TCP, UDP and ICMP protocols are allowed to traverse such a NAT. All other protocols are discarded, with the exception of PPTP with ALG. In this case, only GRE traffic associated with PPTP is allowed through dynamic 1:1 NAT.
The static version of 1:1 NAT is protocol agnostic. This means that all IP-based protocols are allowed to traverse static 1:1 NAT.
The following is applicable to 1:1 NAT:
In static 1:1 NAT, inside IP addresses are statically mapped to the outside IP addresses. In this fashion, devices on the outside can predictably initiate traffic to the devices on the inside.
Static configuration is based on the CLI concepts used in deterministic NAT. For example:
Static mappings are configured according to the map statements. The map statement can be configured manually by the operator or automatically by the system. IP addresses from the automatically generated map statements are sequentially mapped to the available outside IP addresses in the pool:
The following mappings apply to the example from above:
Although static 1:1 NAT is protocol agnostic, the state maintenance for TCP and UDP traffic is still required in order to support ALGs. Because of that, the existing scaling limits related to the number of supported flows still apply.
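The sequential assignment of inside addresses to outside addresses in static 1:1 NAT, with source ports passed through untranslated, can be sketched as follows. The function names and the example outside addresses are hypothetical; the real mapping is driven by the map statements in the configuration.

```python
import ipaddress


def static_one_to_one(inside_prefix, outside_start):
    """Map every address of inside_prefix to consecutive outside addresses."""
    inside = ipaddress.ip_network(inside_prefix)
    base = ipaddress.ip_address(outside_start)
    return {str(addr): str(base + i) for i, addr in enumerate(inside)}


def translate(packet, mapping):
    """Rewrite only the source IP; the source port is left unchanged."""
    src_ip, src_port = packet
    return mapping[src_ip], src_port
```

For the inside prefix 10.10.10.0/30 and a first outside address of 192.0.2.10, the four inside addresses map to 192.0.2.10 through 192.0.2.13, and a packet from 10.10.10.1 port 5000 leaves with source 192.0.2.11 port 5000.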
Protocol agnostic behavior in 1:1 NAT is a property of a NAT pool:
The application agnostic command is a pool create-time parameter. This command will automatically pre-set the following pool parameters:
Once pre-set, these parameters cannot be changed while the pool is operating in protocol agnostic mode.
The deterministic port-reservation 65536 command configures the pool to operate in static (or deterministic) mode.
Parameters in static 1:1 NAT can be changed according to the following rules:
For best traffic distribution over ISAs, the value of the classic-lsn-max-subscriber-limit parameter should be set to 1.
This means that traffic is load balanced over ISAs based on inside IP addresses. In static 1:1 NAT this is certainly possible, since the subscriber-limit parameter at the pool level is preset to a fixed value of 1.
However, if 1:1 static NAT is simultaneously used with regular (many-to-one) deterministic NAT where the subscriber-limit parameter can be set to a value greater than 1, then the classic-lsn-max-subscriber-limit parameter will also have to be set to a value that is greater than 1. The consequence of this is that the traffic will be load balanced based on the consecutive blocks of IP addresses (subnets) rather than individual IP addresses. Further information on this topic is provided in sections describing Deterministic NAT behavior.
The traffic match criteria used to select a specific nat-policy in static 1:1 NAT (the deterministic part of the configuration) must not overlap with the traffic match criteria used to select a specific nat-policy in filters or in a destination-prefix statement (these are used for traffic diversion to NAT). Otherwise, traffic will be dropped in the ISA.
A specific nat-policy in this context refers to a non-default nat-policy, or a nat-policy that is directly referenced in a filter, in a destination-prefix command or in a deterministic prefix command.
The following example is used to clarify this point further:
Traffic is diverted to NAT using the specific nat-policy pol-2:
The deterministic (source) prefix 10.10.10.0/30 is configured to be mapped to the specific nat-policy pol-1, which points to a protocol agnostic 1:1 NAT pool.
Packet received in the ISA has srcIP 10.10.10.1 and destIP 192.168.10.10.
If no NAT mapping for this traffic exists in the ISA, a nat-policy (and with it the NAT pool) needs to be determined in order to create the mapping. Traffic is diverted to NAT using nat-policy pol-2, while the deterministic mapping says that nat-policy pol-1 should be used (and thus a different pool from the one referenced in nat-policy pol-2). Due to this conflict between specific nat-policies, traffic will be dropped in the ISA.
In order to successfully pass traffic between two subnets through NAT while simultaneously using static 1:1 NAT and regular LSN44, a default (non-specific) nat-policy can be used for regular LSN44.
For example:
In this case, the four hosts from the prefix 10.10.10.0/30 will be mapped in 1:1 fashion to 4 IP addresses from the pool referenced in the specific nat-policy pol-1, while all other hosts from the 10.10.10.0/24 network will be mapped to the NAPT pool referenced by the default nat-policy pol-2. In this fashion, nat-policy conflict is avoided.
In summary, a specific nat-policy (in a filter, destination-prefix command or deterministic prefix command) will always take precedence over the default nat-policy. However, traffic that matches classification criteria (in a filter, destination-prefix command or deterministic prefix command) leading to multiple specific nat-policies will be dropped.
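The precedence rule summarized above can be expressed as a short decision function. This is only a sketch of the rule as stated in the text; the function name and its inputs (the list of specific nat-policies matched by the packet and the optional default nat-policy) are hypothetical.

```python
def select_nat_policy(specific_matches, default_policy=None):
    """Return the nat-policy name to use, or None if no policy can be chosen.

    specific_matches: names of specific nat-policies matched via a filter,
    a destination-prefix command or a deterministic prefix command.
    """
    specific = set(specific_matches)
    if len(specific) > 1:
        return None                  # conflicting specific policies: traffic dropped
    if specific:
        return next(iter(specific))  # a specific policy beats the default
    return default_policy            # may be None when no default is configured
```

In the conflicting example above, a packet that matches both pol-1 and pol-2 as specific policies yields None (dropped), whereas with pol-2 configured as the default nat-policy the same packet is handled by pol-1.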
Static 1:1 NAT mappings are explicitly configured, and therefore their lifetime is tied to the configuration.
The logging mechanism for static mapping is the same as in Deterministic NAT - configuration changes are logged via syslog enhanced with reverse querying on the system.
Static 1:1 NAT is supported only for LSN44 (no support for DS-Lite/NAT64 or L2-aware NAT).
In 1:1 NAT, certain ICMP messages contain an additional IP header that is embedded in the ICMP header. For example, when an ICMP message is sent to the source because a datagram could not be delivered to its destination, the node generating the ICMP message will include the original IP header of the packet plus 64 bits of the original datagram. This information helps the source node match the ICMP message to the process that originated the datagram.
When such messages are received in the downstream direction (on the outside), 1:1 NAT will recognize them and change the destination IP address not only in the outside header but also in the ICMP header. In other words, a lookup in the downstream direction will be performed in the ISA to determine whether the packet is an ICMP packet of a specific type. Depending on the outcome, the destination IP address in the ICMP header will be changed (reverted to the original source IP address).
Messages which carry original IP header within ICMP header are:
In deterministic NAT the subscriber is deterministically mapped to an outside IP address and a port block. The algorithm that performs this deterministic mapping is reversible, which means that a NAT subscriber can be uniquely derived from the outside IP address and the outside port (and the routing instance). Thus, logging in deterministic NAT is not needed.
The deterministic [subscriber <-> outside-ip, deterministic-port-block] mapping can be automatically extended by a dynamic port block if the deterministic port block runs out of ports. Extending the original deterministic port block of the NAT subscriber with a dynamic port block yields a satisfactory compromise between deterministic NAT and non-deterministic NAT. There will be no logging as long as the translations are in the domain of the deterministic NAT. Once a dynamic port block is allocated for port extension, logging will be automatically activated.
NAT subscribers in deterministic NAT are not assigned an outside IP address and deterministic port block on a first-come, first-served basis. Instead, deterministic mappings are pre-created at the time of configuration regardless of whether the NAT subscriber is active or not. In other words, overbooking of the outside address pool is not supported in deterministic NAT. Consequently, all configured deterministic subscribers (for example, inside IP addresses in LSN44 or IPv6 addresses/prefixes in DS-Lite) are guaranteed access to NAT resources.
The routers support Deterministic LSN44 and Deterministic DS-Lite. The basic deterministic NAT principle is applied equally to both NAT flavors. The difference between the two stems from the difference in interpretation of the subscriber: in LSN44 a subscriber is an IPv4 address, whereas in DS-Lite the subscriber is an IPv6 address or prefix (configuration dependent).
With the exception of classic-lsn-max-subscriber-limit and dslite-max-subscriber-limit commands in the inside routing context, the deterministic NAT configuration blocks are for the most part common to LSN44 and DS-Lite.
The Deterministic DS-Lite section at the end of this section focuses on the features specific to DS-Lite.
The outside pools in deterministic NAT can contain an arbitrary number of address ranges, where each address range can contain an arbitrary number of IP addresses (up to the ISA maximum).
The maximum number of NAT subscribers that can be mapped to a single outside IP address is configurable using the subscriber-limit command under the pool hierarchy. For Deterministic NAT, this number is restricted to a power of 2 (2^n). Consequently, NAT subscribers must be organized in the configuration in ranges whose boundaries fall on a power of 2.
For example, in LSN44 where the NAT subscriber is an IP address, the deterministic subscribers would be configured with prefixes (for example, 10.10.10.0/24 – 256 subscribers) rather than an IP address range that would contain an arbitrary number of addresses (e.g. 10.10.10.10 – 10.10.10.50).
On the other hand, in DS-Lite the deterministic subscribers are for the most part already determined by the prefix with the subscriber-prefix-length command under the DS-Lite configuration node.
The number of subscribers per outside IP (the subscriber-limit command [2^n]) multiplied by the number of IP addresses across all address ranges in an outside pool determines the maximum number of subscribers that a deterministic pool can support.
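The capacity calculation can be written out as a short sketch. The function name and the representation of address ranges as (first, last) pairs are assumptions for illustration; the example addresses are documentation addresses, not values from a real configuration.

```python
import ipaddress


def pool_capacity(subscriber_limit, address_ranges):
    """Maximum number of deterministic subscribers an outside pool can hold.

    address_ranges: iterable of (first_ip, last_ip) pairs, both inclusive.
    """
    if subscriber_limit < 1 or subscriber_limit & (subscriber_limit - 1):
        raise ValueError("subscriber-limit must be a power of 2")
    total_ips = sum(
        int(ipaddress.ip_address(last)) - int(ipaddress.ip_address(first)) + 1
        for first, last in address_ranges
    )
    return subscriber_limit * total_ips
```

For example, a subscriber-limit of 128 with two address ranges of 10 and 6 addresses supports 128 x 16 = 2048 deterministic subscribers.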
In deterministic NAT, the outside pool can be shared amongst subscribers from multiple routing instances. Also, NAT subscribers from a single routing instance can be selectively mapped to different outside pools.
The number of deterministic mappings that a single outside IP address can sustain is determined through the configuration of the outside pool.
The port allocation per an outside IP is shown in Figure 52.
The well-known ports are predetermined and are in the range 0 — 1023.
The upper limit of the port range for static port forwards (wildcard range) is determined by the existing port-forwarding-range command.
The range of ports allocated for deterministic mappings (DetP) is determined by multiplying the number of subscribers per outside IP (subscriber-limit command) by the number of ports per deterministic block (deterministic>port-reservation command). The number of subscribers per outside IP in deterministic NAT must be a power of 2 (2^n).
The remaining ports, extending from the end of the deterministic port range to the end of the total port range (65,535) are used for dynamic port allocation. The size of each dynamic port block is determined with the existing port-reservation command.
The deterministic>port-reservation command enables the deterministic mode of operation for the pool.
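The port layout per outside IP address, and the idea that a deterministic mapping can be reversed without logs, can be sketched as follows. The layout follows the description above; the forward and reverse formulas are a simple linear scheme assumed purely for illustration and are not the algorithm used by the router. All class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolLayout:
    port_forwarding_end: int  # last port of the static port forward range
    subscriber_limit: int     # subscribers per outside IP, must be 2^n
    det_ports: int            # ports per deterministic block
    dyn_ports: int            # ports per dynamic block

    def __post_init__(self):
        n = self.subscriber_limit
        if n < 1 or n & (n - 1):
            raise ValueError("subscriber-limit must be a power of 2")
        if self.dyn_start > 65536:
            raise ValueError("deterministic range does not fit")

    @property
    def det_start(self):
        """First port of the deterministic range."""
        return self.port_forwarding_end + 1

    @property
    def dyn_start(self):
        """First port after the deterministic range (start of the dynamic range)."""
        return self.det_start + self.subscriber_limit * self.det_ports

    @property
    def dyn_blocks(self):
        """Number of whole dynamic blocks that fit up to port 65535."""
        return (65536 - self.dyn_start) // self.dyn_ports

    def forward(self, index):
        """Subscriber index -> (outside IP index, first port, last port)."""
        ip_index, slot = divmod(index, self.subscriber_limit)
        first = self.det_start + slot * self.det_ports
        return ip_index, first, first + self.det_ports - 1

    def reverse(self, ip_index, port):
        """(outside IP index, deterministic port) -> subscriber index."""
        if not self.det_start <= port < self.dyn_start:
            raise ValueError("port is outside the deterministic range")
        slot = (port - self.det_start) // self.det_ports
        return ip_index * self.subscriber_limit + slot
```

With a static port forward range ending at 4023, 128 subscribers per outside IP, 300 deterministic ports and 180 dynamic ports per block, the deterministic range covers ports 4024 to 42423 and 128 dynamic blocks fit in the remainder. Any port in the deterministic range identifies its subscriber by arithmetic alone, which is why no logging is needed for deterministic translations.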
Examples:
The following shows three examples with deterministic Large Scale NAT44 where the requirements are:
In the first case, the ideal case will be examined where an arbitrary number of subscribers per outside IP address is allocated according to our requirements outlined above. Then the limitation of the number of subscribers being power of 2 will be factored in.
Well-Known Ports* | Static Port Range* | Number of Ports in Deterministic Block* | Number of Deterministic Blocks | Number of Ports in Dynamic Block* | Number of Dynamic Blocks | Number of Inside IP Addresses per Outside IP Address* | Block Limit per Inside IP Address* | Wasted Ports |
0-1023 | 1024-4023 | 300 | 153 | 100 | 153 | 153 | 5 | 312 |
0-1023 | 1024-4023 | 500 | 102 | 100 | 102 | 102 | 5 | 312 |
0-1023 | 1024-4023 | 700 | 76 | 100 | 76 | 76 | 5 | 712 |
The example in Table 28 shows how port ranges would be carved out in an ideal scenario.
* — Signifies the fixed parameters (requirements).
The other values are calculated according to the fixed requirements.
port-block-limit includes the deterministic port block plus all dynamic port-blocks.
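The calculated columns of the ideal case can be reproduced with simple integer arithmetic. The sketch below assumes, as the table does, one dynamic block per subscriber when sizing the range; the function name is illustrative only.

```python
TOTAL_PORTS = 65536


def ideal_dimensioning(forwarding_range_end, det_ports, dyn_ports):
    """Return (subscribers per outside IP, wasted ports) for the ideal case."""
    # Ports left after the well-known and static port-forward ranges.
    available = TOTAL_PORTS - (forwarding_range_end + 1)
    per_subscriber = det_ports + dyn_ports
    subscribers = available // per_subscriber
    wasted = available - subscribers * per_subscriber
    return subscribers, wasted


rows = [ideal_dimensioning(4023, det, 100) for det in (300, 500, 700)]
```

Here `rows` holds the subscriber count and wasted ports for the 300, 500 and 700 port deterministic blocks.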
Next, a more realistic example is considered, in which the number of subscribers is equal to 2^n. The ratios between the deterministic ports and the dynamic ports per port block are preserved from the example above: 3/1, 5/1 and 7/1. In this case, the number of ports per port block is dictated by the number of subscribers per outside IP address.
Well-Known Ports* | Static Port Range* | Number of Ports in Deterministic Block* | Number of Deterministic Blocks | Number of Ports in Dynamic Block* | Number of Dynamic Blocks | Number of Inside IP Addresses per Outside IP Address* | Block Limit per Inside IP Address* | Wasted Ports |
0-1023 | 1024-4023 | 180 | 256 | 60 | 256 | 256 | 5 | 72 |
0-1023 | 1024-4023 | 400 | 128 | 80 | 128 | 128 | 5 | 72 |
0-1023 | 1024-4023 | 840 | 64 | 120 | 64 | 64 | 5 | 72 |
* — Signifies the fixed parameters (requirements).
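When the subscriber count is fixed at 2^n and only the deterministic/dynamic ratio is preserved, the block sizes follow from the ports available per subscriber. A minimal sketch, with a hypothetical function name and the assumption that block sizes are rounded down so that the ratio stays exact:

```python
def ratio_dimensioning(forwarding_range_end, subscriber_limit, ratio):
    """Return (det ports, dyn ports, wasted ports) for a det:dyn ratio of ratio:1."""
    available = 65536 - (forwarding_range_end + 1)
    per_subscriber = available // subscriber_limit
    # Split the per-subscriber share into (ratio + 1) equal units.
    dyn_ports = per_subscriber // (ratio + 1)
    det_ports = ratio * dyn_ports
    wasted = available - subscriber_limit * (det_ports + dyn_ports)
    return det_ports, dyn_ports, wasted


rows = [
    ratio_dimensioning(4023, 256, 3),
    ratio_dimensioning(4023, 128, 5),
    ratio_dimensioning(4023, 64, 7),
]
```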
The final example is similar to Table 28, with the difference that the number of ports in the deterministic block is kept fixed, as in the original example (300, 500 and 700).
Well-Known Ports | Static Port Range | Number of Ports in Deterministic Block | Number of Deterministic Blocks | Number of Ports in Dynamic Block | Number of Dynamic Blocks | Number of Inside IP Addresses per Outside IP Address | Block Limit per Inside IP Address | Wasted Ports |
0-1023 | 1024-4023 | 300 | 128 | 180 | 128 | 128 | 5 | 72 |
0-1023 | 1024-4023 | 500 | 64 | 461 | 64 | 64 | 5 | 8 |
0-1023 | 1024-4023 | 700 | 64 | 261 | 64 | 64 | 5 | 8 |
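The values in this last table can be derived as follows. The sketch assumes that the subscriber-limit is obtained by rounding the ideal subscriber count down to the nearest power of 2, and that all remaining ports per subscriber go to the dynamic block; both are assumptions that happen to reproduce the table, not a documented procedure.

```python
def fixed_det_dimensioning(forwarding_range_end, det_ports, ideal_dyn_ports):
    """Return (subscriber-limit, dyn ports, wasted ports) with det_ports kept fixed."""
    available = 65536 - (forwarding_range_end + 1)
    ideal_subscribers = available // (det_ports + ideal_dyn_ports)
    # Round down to a power of 2, as required for the subscriber-limit.
    subscriber_limit = 1 << (ideal_subscribers.bit_length() - 1)
    dyn_ports = available // subscriber_limit - det_ports
    wasted = available - subscriber_limit * (det_ports + dyn_ports)
    return subscriber_limit, dyn_ports, wasted


rows = [fixed_det_dimensioning(4023, det, 100) for det in (300, 500, 700)]
```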
The three examples above give a perspective on the size of deterministic and dynamic port blocks in relation to the number of subscribers (2^n) per outside IP address. Operators should run a similar dimensioning exercise before they start configuring their deterministic NAT.
The CLI for the highlighted case in Table 28 is displayed:
Where:
128 subscribers * 300 ports = 38,400 ports in the deterministic port range
128 subscribers * 180 ports = 23,040 ports in the dynamic port range
Deterministic + dynamic available ports = 65,536 - 4,024 = 61,512
Deterministic + dynamic usable ports = 128 * 300 + 128 * 180 = 61,440
72 ports per outside IP address are wasted.
This configuration will allow 128 subscribers (inside IP addresses in LSN44) for each outside address (compression ratio is 128:1) with each subscriber being assigned up to 1020 ports (300 deterministic and 720 dynamic ports over 4 dynamic port blocks).
The outside IP addresses in the pool and their corresponding port ranges are organized as shown in Figure 53.
Assuming that Figure 53 depicts an outside deterministic pool, the number of subscribers that this pool can accommodate is represented by the purple squares (number of IP addresses in the outside pool * subscriber-limit). The number of subscribers across all configured inside prefixes that are mapped to the same deterministic pool must not exceed the number that the outside pool can accommodate. In other words, an outside address pool in deterministic NAT cannot be oversubscribed.
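A quick capacity check along these lines can be scripted before configuring a pool. The helper names below are hypothetical. The check is necessary but not sufficient, because a prefix smaller than the subscriber-limit still consumes a whole outside IP address.

```python
import ipaddress


def pool_capacity(address_ranges, subscriber_limit):
    """Subscribers a deterministic pool can hold: outside IPs * subscriber-limit."""
    total_ips = 0
    for first, last in address_ranges:
        total_ips += int(ipaddress.IPv4Address(last)) - int(ipaddress.IPv4Address(first)) + 1
    return total_ips * subscriber_limit


def fits(inside_prefixes, address_ranges, subscriber_limit):
    """True if the inside prefixes do not oversubscribe the pool."""
    subscribers = sum(ipaddress.ip_network(p).num_addresses for p in inside_prefixes)
    return subscribers <= pool_capacity(address_ranges, subscriber_limit)


pool = [("128.251.0.1", "128.251.0.10")]  # example outside range, 10 addresses
```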
The following is a CLI representation of a deterministic pool definition including the outside IP ranges:
The common building block on the inside in the deterministic LSN44 configuration is an IPv4 prefix. The NAT subscribers (inside IPv4 addresses) from the configured prefix are deterministically mapped to the outside IP addresses and corresponding deterministic port blocks. Any inside prefix in any routing instance can be mapped to any pool in any routing instance (including the one in which the inside prefix is defined).
The mapping between the inside prefix and the deterministic pool is achieved through a nat-policy that can be referenced per each individual inside IPv4 prefix. IPv4 addresses from the prefixes on the inside will be distributed over the IP addresses defined in the outside pool referenced by the nat-policy.
The mapping itself is represented by the map command under the prefix hierarchy:
The purpose of the map statement is to split the number of subscribers within the configured prefix over available sequences of outside IP addresses. The key parameter that governs mappings between the inside IPv4 addresses and outside IPv4 addresses in deterministic LSN44 is defined by the outside>pool>subscriber-limit command. This parameter must be a power of 2, and it limits the maximum number of NAT subscribers that can be mapped to the same outside IP address.
The following rules govern the configuration of the map statement:
If the number of subscribers (IP addresses in LSN44) in the map statement is larger than the subscriber-limit per outside IP, the subscribers must be split over a block of consecutive outside IP addresses, where the outside-ip-address in the map statement represents only the first outside IP address in that block.
The number of subscribers (range of inside IP addresses in LSN44) in the map statement does not have to be a power of 2. Rather it has to be a multiple of a power of two m * 2^n, where m is the number of consecutive outside IP addresses to which the subscribers are mapped and the 2^n is the subscriber-limit per outside IP.
An example of the map statement is given below:
In this case, the configured 10.0.0.0/24 prefix is represented by the range of IP addresses in the map statement (10.0.0.0-10.0.0.255). Since the range of 256 IP addresses in the map statement cannot be mapped into a single outside IP address (subscriber-limit=128), this range must be further implicitly split within the system and mapped into multiple outside IP addresses. The implicit split will create two IP address ranges, each with 128 IP addresses (10.0.0.0/25 and 10.0.0.128/25) so that addresses from each IP range are mapped to one outside IP address. The hosts from the range 10.0.0.0-10.0.0.127 will be mapped to the first IP address in the pool (128.251.0.1) as explicitly stated in the map statement (to statement). The hosts from the second range, 10.0.0.128-10.0.0.255 will be implicitly mapped to the next consecutive IP address (128.251.0.2).
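The implicit split described above can be illustrated with a short Python sketch. The function `expand_map` is a hypothetical helper, not a router command; it assumes consecutive outside addresses are used in ascending order, as in the example.

```python
import ipaddress


def expand_map(start, end, outside_ip, subscriber_limit):
    """Split one map statement into (inside first, inside last, outside IP) blocks."""
    first = int(ipaddress.IPv4Address(start))
    last = int(ipaddress.IPv4Address(end))
    outside = int(ipaddress.IPv4Address(outside_ip))
    count = last - first + 1
    if count > subscriber_limit and count % subscriber_limit:
        raise ValueError("range must be a multiple of the subscriber-limit")
    blocks = []
    for i in range(max(1, count // subscriber_limit)):
        lo = first + i * subscriber_limit
        hi = min(lo + subscriber_limit - 1, last)
        blocks.append((
            str(ipaddress.IPv4Address(lo)),
            str(ipaddress.IPv4Address(hi)),
            str(ipaddress.IPv4Address(outside + i)),  # next consecutive outside IP
        ))
    return blocks


blocks = expand_map("10.0.0.0", "10.0.0.255", "128.251.0.1", 128)
```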
Alternatively, the map statement can be configured as:
In this case the IP address range in the map statement is split into two non-consecutive outside IP addresses. This gives the operator more freedom in configuring the mappings.
However, the following configuration is not supported:
Considering that the subscriber-limit = 128 (2^n, where n=7), the lower n bits of the start address in the second map statement (map start 10.0.0.64 end 10.0.0.127 to 128.251.0.3) are not 0. This violates rule #1 that governs the provisioning of the map statement.
Assuming that we use the same pool with 128 subscribers per outside IP address, the following scenario is also not supported (configured prefix in this example is different than in previous example):
Although the lower n bits in both map statements are 0, both statements reference the same outside IP (128.251.0.1). This violates rule #2 that governs the provisioning of the map statement. Each of the prefixes in this case has to be mapped to a different outside IP address, which leads to underutilization of outside IP addresses (half of the deterministic port blocks in each of the two outside IP addresses will not be utilized).
In conclusion, considering that the number of subscribers per outside IP (subscriber-limit) must be 2^n, the inside IP addresses from the configured prefix are split on the 2^n boundary so that every deterministic port block of an outside IP is utilized. If the originally configured prefix contains fewer subscribers (IP addresses in LSN44) than an outside IP address can accommodate (2^n), all subscribers from that prefix are mapped to a single outside IP. Since the outside IP cannot be shared with NAT subscribers from other prefixes, some of the deterministic port blocks for this particular outside IP address will not be utilized.
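The two violations discussed above can be caught with a small pre-check. The sketch below encodes the rules only as they are described in this text (start address aligned to the subscriber-limit, and no outside IP shared between map statements). The function name and the sample map statements other than `10.0.0.64-10.0.0.127 -> 128.251.0.3` are hypothetical.

```python
import ipaddress


def check_map_statements(maps, subscriber_limit):
    """maps: list of (start, end, outside_ip). Return a list of violations."""
    problems = []
    used_outside = set()
    for start, end, outside_ip in maps:
        first = int(ipaddress.IPv4Address(start))
        # Rule #1: the lower n bits of the start address must be 0.
        if first & (subscriber_limit - 1):
            problems.append(f"{start}: start not aligned to {subscriber_limit}")
        count = int(ipaddress.IPv4Address(end)) - first + 1
        blocks = max(1, -(-count // subscriber_limit))  # ceiling division
        base = int(ipaddress.IPv4Address(outside_ip))
        # Rule #2: an outside IP may be referenced by one map statement only.
        for i in range(blocks):
            ip = str(ipaddress.IPv4Address(base + i))
            if ip in used_outside:
                problems.append(f"{start}: outside IP {ip} already used")
            used_outside.add(ip)
    return problems


misaligned = check_map_statements(
    [("10.0.0.0", "10.0.0.63", "128.251.0.1"), ("10.0.0.64", "10.0.0.127", "128.251.0.3")], 128)
shared = check_map_statements(
    [("10.0.0.0", "10.0.0.63", "128.251.0.1"), ("10.0.1.0", "10.0.1.63", "128.251.0.1")], 128)
```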
Each configured prefix can evaluate into multiple map commands. The number of map commands will depend on the length of the configured prefix, the subscriber-limit command and fragmentation of outside address-range within the pool with which the prefix is associated.
Support for multiple MS-ISAs in the nat-group calls for traffic hashing on the inside in the ingress direction. This ensures fair load balancing of the traffic among multiple MS-ISAs. While hashing in non-deterministic LSN44 can be performed per source IP address, hashing in deterministic LSN44 is based on subnets instead of individual IP addresses. The length of the hashing subnet is common to all configured prefixes within an inside routing instance. If prefixes from an inside routing instance reference multiple pools, the common hashing prefix length is chosen according to the pool with the highest number of subscribers per outside IP address. This ensures that subscribers mapped to the same outside IP address are always hashed to the same MS-ISA.
In general, load distribution based on hashing depends on the sample. A larger and more diverse sample ensures better load balancing. Therefore, the efficiency of load distribution between the MS-ISAs depends on the number and diversity of subnets that the hashing algorithm takes into consideration within the inside routing context.
A simple rule for good load balancing is to configure a large number of subscribers relative to the largest subscriber-limit parameter in any pool that is referenced from this inside routing instance.
The configuration example shown in Figure 54 depicts a case in which prefixes from multiple routing instances are mapped to the same outside pool and, at the same time, the prefixes from a single inside routing instance are mapped to different pools (the latter is not supported with non-deterministic NAT).
Note: In this example, the inside prefix 10.10.10.0/24 is present in both VPRN 1 and VPRN 2. In both VPRNs, this prefix is mapped to the same pool, pool-1, with a subscriber-limit of 64. Four outside IP addresses per prefix per VPRN (eight in total) are allocated to accommodate the mappings for all hosts in prefix 10.10.10.0/24. However, the hashing prefix length in VPRN 1 is based on the subscriber-limit of 64 (VPRN 1 references only pool-1), while the hashing prefix length in VPRN 2 is based on the subscriber-limit of 256 in pool-2 (VPRN 2 references both pool-1 and pool-2, and the larger subscriber-limit must be selected). Consequently, the traffic from subnet 10.10.10.0/24 in VPRN 1 can be load balanced over four MS-ISAs (hashing prefix length is 26), while traffic from the subnet 10.10.10.0/24 in VPRN 2 is always sent to the same MS-ISA (hashing prefix length is 24).
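The hashing prefix lengths quoted in the note follow directly from the largest subscriber-limit referenced in each routing instance. A minimal sketch with illustrative function names:

```python
import ipaddress


def hashing_prefix_length(subscriber_limits):
    """Common IPv4 hashing prefix length for one inside routing instance."""
    largest = max(subscriber_limits)       # pool with the most subscribers per outside IP
    n = largest.bit_length() - 1           # log2 of a power of 2
    return 32 - n


def hashing_subnets(prefix, subscriber_limits):
    """Subnets of `prefix` that can be hashed independently to MS-ISAs."""
    net = ipaddress.ip_network(prefix)
    hash_len = hashing_prefix_length(subscriber_limits)
    if hash_len <= net.prefixlen:
        return [net]                       # the whole prefix lands on one MS-ISA
    return list(net.subnets(new_prefix=hash_len))


vprn1 = hashing_subnets("10.10.10.0/24", [64])        # references pool-1 only
vprn2 = hashing_subnets("10.10.10.0/24", [64, 256])   # references pool-1 and pool-2
```

In this sketch `vprn1` contains four /26 subnets, while `vprn2` contains the single /24.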
Distribution of outside IP addresses across the MS-ISAs is dependent on the ingress hashing algorithm. Since traffic from the same subscriber is always pre-hashed to the same MS-ISA, the corresponding outside IP address also must reside on the same ISA. CPM runs the hashing algorithm in advance to determine on which MS-ISA the traffic from particular inside subnet will land and then the corresponding outside IP address (according to deterministic NAT mapping algorithm) will be configured in that particular MS-ISA.
Sharing of the deterministic pools between LSN44 and DS-Lite is supported.
Deterministic and non-deterministic NAT are supported simultaneously within the same routing instance. However, an outside pool can be only deterministic (although expandable by dynamic port blocks) or non-deterministic at any given time.
Ingress hashing for all NATed traffic within the VRF will in this case be performed based on the subnets driven by the classic-lsn-max-subscriber-limit parameter.
Deterministic NAT does not change how traffic is selected for the NAT function; it only defines a predictable way of translating subscribers into outside IP addresses and port blocks.
Traffic is still diverted to NAT using the existing methods:
The inverse mapping can be performed with a MIB locally on the node or externally via a script sourced in the router. In both cases, the input parameters are <outside routing instance, outside IP, outside port>. The output from the mapping is the subscriber and the inside routing context in which the subscriber resides.
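To illustrate why a formula is enough for the inverse mapping, the sketch below uses a simplified allocation order: subscribers are assumed to fill outside IP addresses in ascending order, and port blocks within an outside IP follow the same order. This ordering, the `DetMapping` class and the function names are assumptions for illustration; the algorithm actually used by the router is not described here and may differ.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class DetMapping:
    inside_start: str      # first inside address of the map statement
    outside_start: str     # first outside address of the map statement
    subscriber_limit: int  # subscribers per outside IP (2^n)
    det_start: int         # first port of the deterministic range
    det_ports: int         # ports per deterministic block


def forward(m, inside_ip):
    """Inside IP -> (outside IP, first port, last port) of its deterministic block."""
    offset = int(ipaddress.IPv4Address(inside_ip)) - int(ipaddress.IPv4Address(m.inside_start))
    ip_index, block = divmod(offset, m.subscriber_limit)
    outside = ipaddress.IPv4Address(int(ipaddress.IPv4Address(m.outside_start)) + ip_index)
    first_port = m.det_start + block * m.det_ports
    return str(outside), first_port, first_port + m.det_ports - 1


def reverse(m, outside_ip, port):
    """(outside IP, outside port) -> inside IP, without any log lookup."""
    ip_index = int(ipaddress.IPv4Address(outside_ip)) - int(ipaddress.IPv4Address(m.outside_start))
    block = (port - m.det_start) // m.det_ports
    if ip_index < 0 or not 0 <= block < m.subscriber_limit:
        raise ValueError("not inside a deterministic port block of this mapping")
    inside = int(ipaddress.IPv4Address(m.inside_start)) + ip_index * m.subscriber_limit + block
    return str(ipaddress.IPv4Address(inside))


m = DetMapping("10.0.0.0", "128.251.0.1", 128, 4024, 300)
```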
Reverse mapping information can be obtained using the following command:
Example:
Output:
Inside router 10 ip 20.0.5.171 -- outside router Base ip 85.0.0.2 port 2333 at Mon Jan 7 10:02:02 PST 2013
Instead of querying the system directly, a Python script can be generated on the router and exported to an external node. This Python script contains the mapping logic for the deterministic NAT configured in the router. The script can then be queried offline to obtain mappings in either direction. The external node must have Python installed with the following modules: getopt, math, os, socket and sys.
The purpose of this offline approach is to provide fast queries without accessing the router. Exporting the Python script for reverse querying is a manual operation that must be repeated every time the deterministic NAT configuration changes.
The script is exported outside of the box to a remote location (assuming that writing permissions on the external node are correctly set). The remote location is specified with the following command:
The status of the script is shown using the following command:
Once the script location is specified, the script can be exported to that location with the following command:
This needs to be repeated manually every time the configuration affecting deterministic NAT changes.
The script itself can be run to obtain mapping in forward or backward direction:
The following displays an example in which source addresses are mapped in the following manner:
The forward query for this example will be performed as:
user@external-server:/home/ftp/pub/det-nat-script$ ./det-nat.py -f -s 10 -a 20.0.5.10
Output:
The reverse query for this example will be performed as:
Output:
Every configuration change concerning the deterministic pool will be logged and the script (if configured for export) will be automatically updated (although not exported). This is needed to keep current track of deterministic mappings. In addition, every time a deterministic port-block is extended by a dynamic block, the dynamic block will be logged just as it is today in non-deterministic NAT. The same logic is followed when the dynamic block is de-allocated.
All static port forwards (including PCP) are also logged.
PCP allocates static port forwards from the wildcard-port range.
A subscriber in non-deterministic DS-Lite is defined as an IPv6 prefix, with the prefix length being configured under the DS-Lite NAT node:
All incoming IPv6 traffic with source IPv6 addresses falling under a unique v6 prefix that is configured with subscriber-prefix-length command will be considered as a single subscriber. As a result, all source IPv4 addresses carried within that IPv6 prefix will be mapped to the same outside IPv4 address.
The concept of deterministic DS-Lite is very similar to deterministic LSN44. The DS-lite subscribers (IPv6 addresses/prefixes) are deterministically mapped to outside IPv4 addresses and corresponding deterministic port-blocks.
Although the subscriber in DS-Lite is considered to be either a B4 element (IPv6 address) or the aggregation of B4 elements (IPv6 prefix determined by the subscriber-prefix-length command), only the IPv4 source addresses and ports carried inside of the IPv6 tunnel are actually translated.
The prefix statement for deterministic DS-lite remains under the same deterministic CLI node as for the deterministic LSN44. However, the prefix statement parameters for deterministic DS-Lite differ from the one for deterministic LSN44 in the following fashion:
Example:
In this case, 16 IPv6 prefixes (from ABCD:FF::/60 to ABCD:FF:00:F0::/60) are considered DS-Lite subscribers. The source IPv4 addresses and ports inside the IPv6 tunnels are mapped into the respective deterministic port blocks within an outside IPv4 address according to the map statement.
The map statement contains minor modifications as well. It maps DS-Lite subscribers (IPv6 address or prefix) to corresponding outside IPv4 addresses. Continuing on the previous example:
map start ABCD:FF::/60 end ABCD:FF:00:F0::/60 to 128.251.1.1
The prefix length (/60) in this case MUST be the same as configured subscriber-prefix-length. If we assume that the subscriber-limit in the corresponding pool is set to 8 and outside IP address range is 128.251.1.1 - 128.251.1.10, then the actual mapping is the following:
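One way to enumerate such a mapping is sketched below. The helper `dslite_map` is hypothetical and assumes that subscribers are assigned to consecutive outside IPv4 addresses in ascending prefix order, `subscriber_limit` prefixes per address.

```python
import ipaddress


def dslite_map(start, end, outside_ip, subscriber_limit):
    """Map each DS-Lite subscriber prefix to its outside IPv4 address."""
    first = ipaddress.IPv6Network(start)
    last = ipaddress.IPv6Network(end)
    step = 1 << (128 - first.prefixlen)    # distance between adjacent prefixes
    count = (int(last.network_address) - int(first.network_address)) // step + 1
    base = int(ipaddress.IPv4Address(outside_ip))
    mapping = {}
    for i in range(count):
        prefix = ipaddress.IPv6Network((int(first.network_address) + i * step, first.prefixlen))
        mapping[str(prefix)] = str(ipaddress.IPv4Address(base + i // subscriber_limit))
    return mapping


mapping = dslite_map("ABCD:FF::/60", "ABCD:FF:00:F0::/60", "128.251.1.1", 8)
```

Under these assumptions the first eight /60 prefixes share 128.251.1.1 and the next eight share 128.251.1.2.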
The ingress hashing and load distribution between the ISAs in Deterministic DS-Lite is governed by the highest number of configured subscribers per outside IP address in any pool referenced within the given inside routing context.
This limit is configured under:
While ingress hashing in non-deterministic DS-Lite is governed by the subscriber-prefix-length command, in deterministic DS-Lite the ingress hashing is governed by the combination of dslite-max-subscriber-limit and subscriber-prefix-length commands. This is to ensure that all DS-Lite subscribers that are mapped to a single outside IP address are always sent to the same MS-ISA (on which that outside IPv4 address resides). In essence, as soon as deterministic DS-Lite is enabled, the ingress hashing is performed on an aggregated set of n = log2(dslite-max-subscriber-limit) contiguous subscribers. n is the number of bits used to represent the largest number of subscribers within an inside routing context, that is mapped to the same outside IP address in any pool referenced from this inside routing context (referenced through the nat-policy).
Once the deterministic DS-lite is enabled (a prefix command under the deterministic CLI node is configured), the ingress hashing influenced by the dslite-max-subscriber-limit will be in effect for both flavors of DS-Lite (deterministic AND non-deterministic) within the inside routing context assuming that both flavors are configured simultaneously.
With introduction of deterministic DS-lite, the configuration of the subscriber-prefix-length must adhere to the following rule:
This can be clarified by the two following examples:
This means that 64 DS-Lite subscribers will be mapped to the same outside IP address. Consequently the prefix length of those subscribers must be reduced by 6 bits for hashing purposes (so that chunks of 64 subscribers are always hashed to the same ISA).
According to our rule, the prefix of those subscribers (subscriber-prefix-length) can be only in the range of [38..64], and no longer in the range [32..64, 128].
This means that each DS-lite subscriber will be mapped to its own outside IPv4 address. Consequently there is no need for the aggregation of the subscribers for hashing purposes, since each DS-lite subscriber is mapped to an entire outside IPv4 address (with all ports). Since the subscriber prefix length will not be contracted in this case, the prefix length can be configured in the range [32..64, 128].
In other words, the shortest prefix length that can be configured for the deterministic DS-Lite subscriber is 32+n, where n = log2(dslite-max-subscriber-limit). The subscriber prefix length can extend up to 64 bits. Beyond 64 bits, only one value is allowed for the subscriber prefix length: 128. In that case n must be 0, which means that the mapping between B4 elements (IPv6 addresses) and the outside IPv4 addresses is in a 1:1 ratio (no sharing of outside IPv4 addresses).
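The prefix-length rule and the resulting hashing granularity can be expressed compactly. The function names are illustrative, and the sketch assumes dslite-max-subscriber-limit is a power of 2:

```python
def allowed_subscriber_prefix_lengths(dslite_max_subscriber_limit):
    """Valid subscriber-prefix-length values for deterministic DS-Lite."""
    n = dslite_max_subscriber_limit.bit_length() - 1   # log2 of a power of 2
    allowed = list(range(32 + n, 65))                  # [32+n .. 64]
    if n == 0:
        allowed.append(128)                            # 128 only with 1:1 mapping
    return allowed


def dslite_hashing_prefix_length(subscriber_prefix_length, dslite_max_subscriber_limit):
    """Prefix length used for ingress hashing: n bits shorter than the subscriber prefix."""
    n = dslite_max_subscriber_limit.bit_length() - 1
    return subscriber_prefix_length - n


shared = allowed_subscriber_prefix_lengths(64)   # 64 subscribers per outside IP
one_to_one = allowed_subscriber_prefix_lengths(1)
```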
The dependency between the subscriber definition in DS-Lite (based on the subscriber-prefix-length) and the subscriber hashing mechanism on ingress (based on the dslite-max-subscriber-limit value), will influence the order in which deterministic DS-lite is configured.
Configure deterministic DS-Lite in the following order.
Modifying the dslite-max-subscriber-limit requires that all nat-policies be removed from the inside routing context.
To migrate a non-deterministic DS-Lite configuration to a deterministic DS-Lite configuration, the non-deterministic DS-Lite configuration must be first removed from the system. The following steps should be followed:
NAT Pool
NAT Policy
NAT Group
Deterministic Mappings (prefix and map statements)
Similarly, the map statements can be added or removed only if the prefix node is in a shutdown state.
The outside-ip-address in the map statements must be unique amongst all map statements referencing the same pool. In other words, two map statements cannot reference the same <outside-ip-address> in a pool.
Configuration Parameters
Miscellaneous
Destination NAT (DNAT) in SR OS is supported for LSN44 and L2-Aware NAT. DNAT can be used for traffic steering where the destination IP address of the packet is rewritten. In this fashion traffic can be redirected to an appliance or set of servers that are in control of the operator, without the need for a separate transport service (for example, PBR plus LSP). Applications utilizing traffic steering via DNAT normally require some form of inline traffic processing, such as inline content filtering (parental control, antivirus/spam, firewalling), video caching, and so on.
Once the destination IP address of the packet is translated, traffic is naturally routed based on the destination IP address lookup. DNAT translates the destination IP address in the packet while leaving the original destination port untranslated.
Similar to source based NAT (Source Network Address and Port Translation (SNAPT)), the SR OS node maintains the state of DNAT translations so that the source IP address in the return (downstream) packet is translated back to the original address. Traffic selection for DNAT processing in MS-ISA is performed via a NAT classifier.
In certain cases SNAPT is required along with DNAT. In other cases only DNAT is required without SNAPT. The following table shows the supported combinations of SNAPT and DNAT in SR OS.
NAT type | SNAPT | DNAT-Only | SNAPT + DNAT |
LSN44 | X | X | X |
L2-Aware | X | | X |
The SNAPT/DNAT address translations are shown in Figure 55.
NAT forwarding in SR OS is implemented in two stages:
As part of the NAT state maintenance, the SR OS maintains the following fields for each DNATed flow:
<inside host/port, outside IP/port, foreign IP address/port, destination IP address/port, protocol (TCP, UDP, ICMP)>. Note that the inside host in LSN is the inside IP address, and in L2-Aware NAT it is the <inside IP address + subscriber-index>. The subscriber index is carried in the session-id of the L2TP.
The foreign IP address represents the destination IP address in the original packet, while the destination IP address represents the DNAT address (translated destination IP address).
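The per-flow state and the two translation directions can be modeled as follows. This is a conceptual sketch for the SNAPT + DNAT case only; the `DnatFlow` class, the function names and the sample addresses are hypothetical and do not reflect the internal data structures of the MS-ISA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DnatFlow:
    inside: tuple        # (inside host, port)
    outside: tuple       # (outside IP, port) selected by SNAPT
    foreign: tuple       # (IP, port): destination in the original packet
    destination: tuple   # (IP, port): translated (DNAT) destination
    protocol: str        # "tcp", "udp" or "icmp"


def translate_upstream(flow, src, dst):
    """Inside -> outside: rewrite the source (SNAPT) and the destination (DNAT)."""
    if (src, dst) != (flow.inside, flow.foreign):
        raise ValueError("packet does not belong to this flow")
    return flow.outside, flow.destination


def translate_downstream(flow, src, dst):
    """Outside -> inside: restore the foreign address as source and the inside host."""
    if (src, dst) != (flow.destination, flow.outside):
        raise ValueError("packet does not belong to this flow")
    return flow.foreign, flow.inside


# The destination port (80) is the same before and after DNAT.
flow = DnatFlow(("10.0.0.5", 40000), ("192.0.2.1", 5000),
                ("198.51.100.7", 80), ("203.0.113.9", 80), "tcp")
```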
Traffic intended for DNAT processing is selected via a nat classifier. The nat classifier has configurable protocol and destination ports. The inclusion of the classifier in the nat-policy is the trigger for performing DNAT. The configuration of the nat classifier determines whether:
The classifier cannot drop traffic (there is no drop action). However, a non-reachable destination IP address in DNAT causes traffic to be black-holed.
DNAT is enabled in the config>service>nat>nat-policy context.
The DNAT function is triggered by the presence of the nat classifier (nat-classifier command) referenced in the nat-policy.
The DNAT-only option is configured where SNAPT is not required. This command is necessary in order to determine the outside routing context and the nat-group when SNAPT is not configured. The pool (relevant to SNAPT) and DNAT-only configuration options within the nat-policy are mutually exclusive.
DNAT traffic selection is performed via a nat-classifier. Nat-classifier is defined under config>service>nat hierarchy and is referenced within the nat-policy.
default-dnat-ip-address is used in all match criteria that contain DNAT action without specific destination IP address. However, the default-dnat-ip-address is ignored in cases where IP address is explicitly configured as part of the action within the match criteria.
default-action is applied to all packets that do not satisfy any match criteria.
forward (forwarding action) has no effect on the packets and will transparently forward packets through the nat-classifier. By default, packets that do not match any matching criteria are transparently passed through the classifier.
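The interaction between match entries, default-dnat-ip-address and default-action can be modeled in a few lines. This is a behavioral sketch, not the CLI: the class layout, field names and the set of action values ("dnat" and "forward") are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Entry:
    protocol: str                  # for example "tcp" or "udp"
    dst_port: int
    action: str                    # "dnat" or "forward"
    dnat_ip: Optional[str] = None  # explicit address overrides the default


@dataclass
class Classifier:
    entries: List[Entry] = field(default_factory=list)
    default_dnat_ip: Optional[str] = None
    default_action: str = "forward"

    def classify(self, protocol, dst_ip, dst_port):
        """Return the destination IP the packet leaves the classifier with."""
        for e in self.entries:
            if e.protocol == protocol and e.dst_port == dst_port:
                if e.action == "dnat":
                    return e.dnat_ip or self.default_dnat_ip
                return dst_ip              # forward: packet passes unchanged
        if self.default_action == "dnat":
            return self.default_dnat_ip
        return dst_ip                      # no match: transparently forwarded


c = Classifier(
    entries=[Entry("tcp", 80, "dnat"), Entry("udp", 53, "dnat", "203.0.113.53")],
    default_dnat_ip="203.0.113.80",
)
```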
In order to forward upstream and downstream traffic for the same NAT binding to the same MS-ISA, the original source IP address space must be known in advance and consequently hashed on the inside ingress toward the MS-ISAs and micro-netted on the outside. This is performed with the following CLI:
The classic-lsn-max-subscriber-limit parameter was introduced by deterministic NAT and it is reused here. This parameter affects the distribution of the traffic across multiple MS-ISAs in the upstream direction. A hashing mechanism based on source IPv4 addresses/prefixes is used to distribute incoming traffic on the inside (private side) across the MS-ISAs. Hashing based on the entire IPv4 address produces the most granular traffic distribution, while hashing based on the IPv4 prefix (determined by prefix length) produces less granular hashing. For further details about this command, consult the CLI command description. The source IP prefix is defined in the nat-prefix-list and then applied under the DNAT-only node in the inside routing context. This instructs the SR OS node to create micro-nets in the outside routing context. The number of routes installed in this fashion is limited by the following configuration:
The configurable range is 1-128K with the default value of 32K.
The DNAT provisioning concept is shown in Figure 56.
The following restrictions apply to multiple NAT policies per inside routing context:
The selection of the NAT pool and the outside routing context is performed through the NAT policy. Multiple NAT policies can be used within an inside routing context. This feature effectively allows selective mapping of the incoming traffic within an inside routing context to different NAT pools (with different mapping properties, such as port-block size, subscriber-limit per pool, address-range, port-forwarding-range, deterministic vs non-deterministic behavior, port-block watermarks, etc.) and to different outside routing contexts. NAT policies can be configured:
The concept of the NAT pool selection mechanism based on the destination of the traffic via routing is shown in Figure 57.
Diversion of the traffic to NAT based on the source of the traffic is shown in Figure 58.
Only the filter-based diversion solution is supported for this case. The filter-based solution can be extended to a 5-tuple matching criteria.
The following considerations must be taken into account when deploying multiple NAT policies per inside routing context:
The routing approach relies on upstream traffic being directed (or diverted) to the NAT function based on the destination-prefix command in the configure>service>vprn/router>nat>inside CLI context. In other words, the upstream traffic will be NATed only if it matches a preconfigured destination IP prefix. The destination-prefix command creates a static route in the routing table of the inside routing context. This static route will divert all traffic with the destination IP address that matches the created entry, towards the MS-ISA. The NAT function itself will be performed once the traffic is in the proper context in the MS-ISA.
The CLI for multiple NAT policies per inside routing context with routing based diversion to NAT is the following:
or, for example:
Different destination prefixes can reference a single NAT policy (policy-1 in this case).
If the destination-prefix does not directly reference the NAT policy, the default NAT policy is used. The default nat-policy is configured directly in the vprn/router>nat>inside context.
Once the destination-prefix command referencing the nat-policy is configured, an entry is created in the routing table that directs the traffic to the MS-ISA.
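The selection logic of routing-based diversion can be sketched as a longest-prefix match over the configured destination prefixes, falling back to the default nat-policy when a prefix carries no explicit reference. The function name, the prefixes and the policy names below are examples, not configuration taken from this guide.

```python
import ipaddress


def select_nat_policy(dst_ip, destination_prefixes, default_policy):
    """Return the nat-policy for an upstream packet, or None if it is not NATed.

    destination_prefixes maps a prefix to a policy name, or to None when the
    destination-prefix does not reference a policy explicitly.
    """
    addr = ipaddress.ip_address(dst_ip)
    best = None
    for prefix, policy in destination_prefixes.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, policy)           # keep the most specific match
    if best is None:
        return None                        # no static route: normal routing
    return best[1] or default_policy


prefixes = {
    "20.0.0.0/8": "policy-1",
    "30.0.0.0/8": "policy-1",   # several prefixes may share one policy
    "40.0.0.0/8": None,         # no explicit policy: default applies
}
```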
A filter-based approach will divert traffic to NAT based on the ip matching criteria shown in the CLI below.
The CLI for the filter-based diversion in conjunction with multiple NAT policies is shown below:
The association with the NAT policy is made once the filter is applied to the SAP.
DS-Lite and NAT64 diversion to NAT with multiple nat-policies is supported only through IPv6 filters:
Where the nat-type parameter can be either dslite or nat64.
The DS-Lite AFTR address and NAT64 destination prefix configuration under the corresponding (DS-Lite or NAT64) router/vprn>nat>inside context is mandatory. This is even in the case when only filters are desired for traffic diversion to NAT.
For example, every AFTR address and NAT64 prefix that is configured as a match criteria in the filter, must also be duplicated in the router/vprn>nat>inside context. However, the opposite is not required.
IPv6 traffic with the destination address outside of the AFTR/NAT64 address/prefix will follow normal IPv6 routing path within the 7750 SR.
The default nat-policy is always mandatory and must be configured under the router/vprn>nat>inside context. This default NAT policy can reference any configured pool in the desired ISA group. The pool referenced in the default nat-policy can be then overridden by the nat-policy associated with the destination-prefix in LSN44 or by the nat-policy referenced in the ipv4/ipv6-filter used for NAT diversion in LSN44/DS-Lite/NAT64.
The NAT CLI nodes will fail to activate (be brought into the no shutdown state) unless a valid nat-policy is referenced in the router/vprn>nat>inside context.
Each subscriber using multiple policies is counted as 1 subscriber for the inside resources scaling limits (such as the number of subscribers per MS-ISA), and counted as 1 subscriber per (subscriber + policy combination) for the outside limits (subscriber-limit -> subscribers per IP; port-reservation -> port/block reservations per subscriber).
Any given Static Port Forward (SPF) can be created only in one pool. This pool, which is referenced through the nat-policy, has to be specified at the SPF creation time, either explicitly through the configuration request or implicitly via defaults.
Explicit request will be submitted either via SAM or via CLI:
In the absence of the nat-policy referenced in the SPF creation request, the default nat-policy under the vprn/router>nat>inside context will be used.
The consequence of this is that the operator must know the nat-policy in which the SPF is to be created. The SPF cannot be created via PCP outside of the pool referenced by the default nat-policy, since PCP does not provide means to communicate nat-policy name in the SPF creation request.
The static port forward creation and their use by the subscriber types must follow these rules:
When the last relevant policy for a certain subscriber type is removed from the virtual router, the associated port forwards are automatically deleted.
Figure 59 and Figure 60 describe certain scenarios that are more theoretical and are less likely to occur in reality. However, they are described here for the purpose of completeness.
Figure 59 represents the case where traffic from the WEB server 1.1.1.1 is initiated toward the destination network 11.0.0.0/8. Such traffic ends up translated in Pool B and forwarded to the 11.0.0.0/8 network even though the static port forward has been created in Pool A. In this case the nat-policy rule (dest 11.0.0.0/8 -> pool B) determines the pool selection in the upstream direction (even though the SPF for the WEB server already exists in Pool A).
The next example in Figure 60 shows a case where Flow 1 is initiated from the outside. Since the partial mapping matching this flow already exists (created by the SPF) and there is no more specific match (FQF) present, the downstream traffic is mapped according to the SPF (through Pool A to the Web server). At the same time, a more specific entry (FQF) is created (initiated by the very same outside traffic). This FQF now determines the forwarding path for all traffic originating from the inside that matches this flow. This means that Flow 2 (the reverse of Flow 1) is not mapped to an IP address from Pool B (as the policy dictates) but instead to Pool A, which has a more specific match.
A more specific match in this case is a fully qualified flow (FQF) that contains information about the foreign host: <host, inside IP/port, outside IP/port, foreign IP address/port, protocol>.
When multiple NAT policies per inside routing context are deployed, a new policy-id parameter is added to certain syslog messages. The format of the policy-id is:
where XX is an arbitrary unique number per inside routing context assigned by the router. This number represents the corresponding nat-policy. Since the maximum number of NAT policies in the inside routing context is 8, the policy-id is a numerical value in the range 1 to 8.
Introduction of the policy-id in logs is necessary because of the bulk operations associated with multiple NAT policies per inside routing context. Examples of bulk operations are the removal of a nat-policy from the configuration, the shutdown of a NAT pool, or the removal of an IP address range from a pool. Removing a NAT accounting policy when RADIUS NAT logging is used does not trigger a summarization log, since an acct-off message is sent instead. Bulk operations tend to be heavy on NAT logging since they affect a large number of NAT subscribers at once. Summarization logs are introduced to prevent excessive logging during bulk operations. For example, the nat-policy deletion can be logged with a single (summarized) entry containing the policy-id of the nat-policy that was removed and the inside srvc-id. Since all logs contain the policy-id, a single summarization free log can be compared to all map logs containing the same policy-id to determine for which subscribers the NAT mappings have ceased. Map and free logs are generated when the port-blocks for the subscribers are allocated and de-allocated.
Summarization log is always generated on the CPM, regardless of whether the RADIUS logging is enabled or not. A summarization log simply cannot be generated via RADIUS logging since the RADIUS accounting message streams (start/interim-updates/stop) are always generated per subscriber. In other words, for RADIUS logging, the summarization log would need to be sent to each subscriber, which defeats the purpose of the summarization logs.
A summarization log on the CPM is generated:
With multiple NAT policies per inside routing context, the inside srvc-id and the policy-id are included in the summarization log (no outside IPs, outside srvc-id, port-block or source IP).
A log search based on the policy-id and inside srvc-id should reveal all subscribers whose mappings were affected by the nat-policy removal.
A log search based on the outside IP address and outside srvc-id should reveal all subscribers for which the NAT mappings have ceased.
A log search based on the outside IP addresses in the range and the outside srvc id should reveal all subscribers for which the NAT mappings have ceased.
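For the nat-policy removal case, the log search described above can be sketched as follows. This is only an illustration: it assumes the map logs have already been parsed into dictionaries, and the field names (`subscriber`, `policy_id`, `inside_svc`) are hypothetical rather than the actual syslog format.

```python
def subscribers_freed_by_summary(map_logs, summary):
    """Return the subscribers whose map logs match a summarization free log.

    map_logs: iterable of dicts with the hypothetical keys
              'subscriber', 'policy_id' and 'inside_svc'.
    summary:  dict with 'policy_id' and 'inside_svc', as carried in the
              summarization log generated for a nat-policy removal.
    """
    return sorted({
        entry["subscriber"]
        for entry in map_logs
        if entry["policy_id"] == summary["policy_id"]
        and entry["inside_svc"] == summary["inside_svc"]
    })


# Illustrative data only.
map_logs = [
    {"subscriber": "10.0.0.1", "policy_id": 1, "inside_svc": 10},
    {"subscriber": "10.0.0.2", "policy_id": 2, "inside_svc": 10},
    {"subscriber": "10.0.0.3", "policy_id": 1, "inside_svc": 10},
]
affected = subscribers_freed_by_summary(map_logs, {"policy_id": 1, "inside_svc": 10})
```

The same filter applies to the pool shutdown and address-range removal cases if the match keys are replaced by the outside IP address (or range) and the outside srvc-id.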
Summarization logs in RADIUS logging
The summarization log for bulk operation while RADIUS logging is in effect will be generated only in the CPM (syslog). This means that for bulk operations with RADIUS logging, the operator will have to rely on RADIUS logging as well as on the CPM logging.
An open log sequence in RADIUS, for example a map for the <inside IP 1, outside IP 1,port-block 1> followed at some later time with a map for <inside IP 2, outside IP 1, port-block 1>, is an indication that the free log for <inside IP 1, outside IP 1,port-block 1> is missing. This means that either the free log for <inside IP 1, outside IP 1,port-block 1> was lost or that a policy/pool/address-range was removed from the configuration. In the latter case, the operator should look in the CPM log for the summarization message.
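A collector-side check for such open log sequences could look like the sketch below. It assumes the RADIUS records have been reduced to chronologically ordered tuples; the tuple layout and the function name are hypothetical.

```python
def find_missing_free_logs(events):
    """Detect port-blocks that were re-mapped without an intervening free log.

    events: chronologically ordered tuples
            (kind, inside_ip, outside_ip, port_block), kind is 'map' or 'free'.
    Returns (inside_ip, outside_ip, port_block) tuples whose free log is missing.
    """
    owner = {}    # (outside_ip, port_block) -> inside IP currently mapped
    missing = []
    for kind, inside_ip, outside_ip, port_block in events:
        key = (outside_ip, port_block)
        if kind == "map":
            previous = owner.get(key)
            if previous is not None and previous != inside_ip:
                # The block was handed to a new inside IP while the old
                # mapping was still open: the free log is missing.
                missing.append((previous, outside_ip, port_block))
            owner[key] = inside_ip
        elif kind == "free" and owner.get(key) == inside_ip:
            del owner[key]
    return missing


open_sequence = [
    ("map", "10.0.0.1", "198.51.100.1", "1024-1055"),
    ("map", "10.0.0.2", "198.51.100.1", "1024-1055"),
]
gaps = find_missing_free_logs(open_sequence)
```

Each reported gap is a candidate for a lookup in the CPM log, where a summarization message would explain the missing free log.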
The summarization logs are enabled via event control 2021 tmnxNatLsnSubBlksFree, which is suppressed by default. Event control 2021 is also used to report when all blocks for the subscriber are freed.
Multiple NAT policies for a L2-aware subscriber can be selected based on the destination IP address of the packet. This allows the operator to assign different NAT pools and outside routing contexts based on the traffic destinations. The mapping between the destination IP prefix and the nat-policy is defined in a nat-prefix-list. This nat-prefix-list is applied to the L2-aware subscriber via a sub-profile. Once the subscriber traffic arrives to the MS-ISA where NAT is performed, an additional lookup based on the destination IP address of the packet will be executed to select the specific nat-policy (and consequently the outside NAT pool). Failure to find the specific nat-policy based on the destination IP address lookup will result in selection of the default nat-policy referenced in the sub-profile. CLI example:
As displayed in the example, multiple IP prefixes can be mapped to the same nat-policy.
The nat-prefix-list cannot reference the default nat-policy. The Default nat-policy is the one that is referenced directly under the sub-profile.
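The destination-based policy selection can be modeled with the following sketch. The policy names are hypothetical, and the use of longest-prefix match for overlapping prefixes is an assumption made for illustration; the text above only states that a failed lookup falls back to the default nat-policy.

```python
import ipaddress


def select_nat_policy(dst_ip, prefix_list, default_policy):
    """Pick the nat-policy for a destination IP address.

    prefix_list: list of (prefix, policy_name) pairs modeling a nat-prefix-list.
    default_policy: the nat-policy referenced directly in the sub-profile.
    """
    dst = ipaddress.ip_address(dst_ip)
    best = None
    for prefix, policy in prefix_list:
        net = ipaddress.ip_network(prefix)
        # Assumption: the most specific matching prefix wins.
        if dst in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, policy)
    return best[1] if best else default_policy


# Several prefixes may map to the same (hypothetical) policy.
prefix_list = [
    ("11.0.0.0/8", "pol-b"),
    ("12.1.0.0/16", "pol-b"),
    ("11.1.0.0/16", "pol-c"),
]
```

For example, `select_nat_policy("8.8.8.8", prefix_list, "pol-default")` finds no matching prefix and returns the default policy.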
In L2-aware NAT with multiple nat-policies, the NAT resources are allocated in each pool associated with the subscriber. This NAT resource allocation is performed at the time when the ESM subscriber is instantiated. Each NAT resource allocation is followed by log generation. For example, if RADIUS logging is enabled, one Alc-Nat-Port-Range VSA per nat-policy is included in the acct START/STOP message:
Alc-Nat-Port-Range = "192.168.20.1 1024-1055 router base nat-pol-1"
Alc-Nat-Port-Range = "193.168.20.1 1024-1055 router base nat-pol-2"
Alc-Nat-Port-Range = "194.168.20.1 1024-1055 router base nat-pol-3"
Nat-policy change for L2-aware NAT is supported via a sub-profile change triggered by CoA. However, a change of sub-profile alone via CoA will not trigger generation of a new RADIUS accounting message, and thus NAT events related to the nat-policy change will not be promptly logged. For this reason, each CoA initiating the sub-profile change in a NAT environment should:
Note that the sla-profile will have to be changed and not just refreshed. In other words replacing the existing sla-profile with the same one will not trigger a new accounting message.
Both of the above events will trigger an accounting update at the time when CoA is processed. This will keep NAT logging current. The information about NAT resources for logging purposes is conveyed in the following RADIUS attributes:
NAT logging behavior due to CoA will depend on the deployed accounting mode of operation. This is described in Table 1. Note that interim-update keyword must be configured for host/session accounting in order for Interim-Update messages to be triggered:
Table legend:
AATR - Alc-Acct-Triggered-Reason VSA. This VSA is optionally carried in Interim-Update messages that are triggered by CoA.
ATAI - Alc-Trigger-Acct-Interim VSA. This VSA can be carried in CoA to trigger an Interim-Update message. The string carried in this VSA is reflected in the triggered Interim-Update message.
I-U - Interim-Update message.
CoA content | Host or session accounting | Queue-instance accounting | Comments |
CoA Sub-profile change + ATAI VSA | Single I-U with: released NAT info; unchanged NAT info; new NAT info; AATR; ATAI | Single I-U with: released NAT info; unchanged NAT info; new NAT info; AATR; ATAI | Single I-U message is triggered by CoA. |
CoA Sub-profile change + Sla-profile change | First I-U: released NAT info; unchanged NAT info; new NAT info. Second I-U: unchanged NAT info; new NAT info | Acct Stop: released NAT info; unchanged NAT info; new NAT info. Acct Start: unchanged NAT info; new NAT info | Two accounting messages are triggered in succession. |
CoA Sub-profile change | No message | No message | No accounting messages are triggered by CoA. The next regular I-U messages will contain: old (released) NAT info; unchanged NAT info; new NAT info. |
CoA Sub-profile change + Sla-profile change + ATAI VSA | First I-U: released NAT info; unchanged NAT info; new NAT info. Second I-U: unchanged NAT info; new NAT info; AATR; ATAI | Acct Stop: released NAT info; unchanged NAT info; new NAT info. Acct Start: unchanged NAT info; new NAT info | Two accounting messages are triggered in succession. |
For example, the second CoA row describes the outcome triggered by a CoA carrying new sub and sla profiles. In host/session accounting mode this creates two Interim-Update messages. The first Interim-Update message carries information about:
The second Interim-Update message will carry information about the NAT resources that are in use (existing and new) once CoA is activated.
From this, the operator can infer which NAT resources are released by CoA and which NAT resources continue to be in use once CoA is activated.
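The inference amounts to a set difference between the NAT information carried in the two messages. A minimal sketch, assuming the Alc-Nat-Port-Range values of each message have been collected into lists (the function name and sample values are illustrative):

```python
def classify_nat_resources(first_update, second_update):
    """Split NAT resources into released and still-in-use sets.

    first_update:  Alc-Nat-Port-Range values from the first message
                   (released + unchanged + new NAT info).
    second_update: values from the second message (unchanged + new NAT info).
    """
    first, second = set(first_update), set(second_update)
    return {"released": first - second, "in_use": second}


first_iu = [
    "192.168.20.1 1024-1055 router base nat-pol-1",   # released by the CoA
    "193.168.20.1 1024-1055 router base nat-pol-2",   # unchanged
    "194.168.20.1 1024-1055 router base nat-pol-3",   # new
]
second_iu = [
    "193.168.20.1 1024-1055 router base nat-pol-2",
    "194.168.20.1 1024-1055 router base nat-pol-3",
]
result = classify_nat_resources(first_iu, second_iu)
```

Separating unchanged from new resources would additionally require the NAT information reported before the CoA, which this sketch does not model.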
Nat-policy change induced by CoA will trigger immediate log generation (for example acct STOP or INTERIM-UPDATE) indicating that the NAT resources have been released. However, the NAT resources (outside IP addresses and port-blocks) in the SR OS node will not be released for another five seconds. This delay is needed to facilitate proper termination of the traffic flow between the NAT user and the outside server during the nat-policy transition. A typical example of this scenario is the following:
Stale port forwards, like other stale dynamic mappings, are released after five seconds. Note that static port forwards are kept on the CPM. New CoAs related to NAT are rejected (NAKed) while the previous change is still in progress (during the five-second interval until the stale mappings are purged).
Unless the specific nat-policy is provided during Static Port Forward (SPF) creation (SPF creation command), the port forward will be created in the pool referenced in the default nat-policy. Nat-policy can be part of the command used to modify or delete SPF. If the nat-policy is not provided, then the behavior will be:
A match is considered when at least these parameters from the modify/delete command are matched (mandatory parameters in the spf command):
UPnP will use the default nat-policy.
PCP is a protocol that operates between subscribers and the NAT directly. This makes the protocol similar to DHCP or PPP in that the subscriber has a limited but direct control over the NAT behavior.
PCP is designed to allow the configuration of static port forwards, to obtain information about existing port forwards, and to obtain the outside IP address from software running in the home network or on the CPE.
PCP runs on each MS-ISA as its own process and makes use of the same source-IP hash algorithm as the NAT mappings themselves. The protocol itself is UDP based and is request/response in nature, in some ways similar to UPnP.
PCP operates on a specified loopback interface in a similar way to the local DHCP server. It operates on UDP and a specified (in CLI) port. As Epoch is used to help recover mappings, a unique PCP service must be configured for each NAT group.
When epoch is lowered, there is no mechanism to inform the clients to refresh their mappings en-masse. External synchronization of mappings is possible between two chassis (epoch does not need to be synchronized). If epoch is unsynchronized then the result will be clients re-creating their mapping on next communication with the PCP server.
The R-bit (0) indicates request and (1) indicates response. This is a request so (0).
OpCode defined as:
Requested Lifetime: Lifetime 0 means delete.
As this is a response, R = (1).
The Epoch field increments by 1 every second and can be used by the client to determine if state needs to be restored. On any failure of the PCP server or the NAT to which it is associated Epoch must restart from zero (0).
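A client-side check based on this behavior might look like the sketch below. It assumes the client remembers the last Epoch value and the local time at which it was received; the tolerance parameter is a hypothetical allowance for clock drift and delay, not a value taken from this guide.

```python
def server_state_lost(prev_epoch, prev_time, new_epoch, new_time, tolerance=2):
    """Return True if the Epoch value suggests the PCP server restarted.

    prev_epoch/new_epoch: Epoch values from two consecutive responses.
    prev_time/new_time:   local receive times of those responses, in seconds.
    tolerance:            hypothetical slack in seconds.
    """
    # Epoch advances by 1 every second, so it should have grown by
    # roughly the locally measured elapsed time.
    expected = prev_epoch + (new_time - prev_time)
    return new_epoch < expected - tolerance
```

For example, an Epoch of 5 received 60 seconds after an Epoch of 1000 indicates a restart from zero, so the client would re-create its mappings.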
Result Codes:
0 SUCCESS, success.
1 UNSUPP_VERSION, unsupported version.
2 MALFORMED_REQUEST, a general catch-all error.
3 UNSUPP_OPCODE, unsupported OpCode.
4 UNSUPP_OPTION, unsupported option. Only if the Option was mandatory.
5 MALFORMED_OPTION, malformed option.
6 UNSPECIFIED_ERROR, server encountered an error
7 MISORDERED_OPTIONS, options not in correct order
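For client or test tooling, the result codes listed above can be captured in an enumeration. The class and helper names below are illustrative; only the numeric values and code names come from the list.

```python
from enum import IntEnum


class PcpResult(IntEnum):
    """PCP result codes as listed in this section."""
    SUCCESS = 0
    UNSUPP_VERSION = 1
    MALFORMED_REQUEST = 2
    UNSUPP_OPCODE = 3
    UNSUPP_OPTION = 4
    MALFORMED_OPTION = 5
    UNSPECIFIED_ERROR = 6
    MISORDERED_OPTIONS = 7


def describe_result(code):
    """Map a numeric result code to its name, tolerating unknown values."""
    try:
        return PcpResult(code).name
    except ValueError:
        return f"UNKNOWN({code})"
```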
Creating a Mapping
Client Sends
MAP4 opcode is (1). Protocols: 0 – all; 1 – ICMP; 6 – TCP; 17 – UDP.
MAP4 (1), PEER4 (3) and PREFER_FAILURE are supported. FILTER and THIRD_PARTY are not supported.
Universal Plug and Play (UPnP) is a set of specifications defined by the UPnP Forum. One specification, Internet Gateway Device (IGD), defines a protocol for clients to automatically configure port mappings on a NAT device. Today, many gaming, P2P, and VoIP applications support the UPnP IGD protocol. The SR OS supports the following UPnP version 1 InternetGatewayDevice version 1 features:
PPTP is defined in RFC 2637, Point-to-Point Tunneling Protocol (PPTP), and is used to provide VPN connection for home/mobile users to gain secure access to the enterprise network. Encrypted payload is transported over GRE tunnel that is negotiated over TCP control channel. In order for PPTP traffic to pass through NAT, the NAT device must correlate the TCP control channel with the corresponding GRE tunnel. This mechanism is referred to as PPTP ALG.
There are two components of PPTP:
The control connection is established from the PPTP clients (for example, home users behind the NAT) to the PPTP server which is located on the outside of the NAT. Each session that carries data between the two endpoints can be referred as call. Multiple sessions (or calls) can carry data in a multiplexed fashion over a tunnel. The tunnel protocol is defined by a modified version of GRE. Call ID in the GRE header is used to multiplex sessions over the tunnel. The Call-ID is negotiated during the session/call establishment phase.
Control Connection Management — The following messages are used to maintain the control connection.
The remaining control message types are sent over the established TCP session to open/maintain sessions and to convey information about the link state:
Call Management — Call management messages are used to establish/terminate a session/call and to exchange information about the multiplexing field (Call-id). Call-IDs must be captured and translated by the NAT. The call management messages are:
Error Reporting — This message is sent by the client to indicate WAN error conditions that occur on the interface supporting PPP.
PPP Session Control — This message is sent in both directions to setup PPP-negotiated options.
Once the Call-ID is negotiated by both endpoints, it is inserted in the GRE header and used as the multiplexing field in the tunnel that carries data traffic.
A GRE tunnel is used to transport data between two PPTP endpoints. The packet transmitted over this tunnel has the following general structure:
The GRE header contains the Call ID of the peer for the session for which the GRE packet belongs.
PPTP ALG is aware of the control session (Start Control Connection Request/Reply) and consequently captures the Call ID field in all PPTP messages that carry that field. In addition to translating the inside IP address and TCP port, the PPTP ALG processes data beyond the TCP header in order to extract the Call ID field and translate it inside the Outgoing Call Request messages initiated from the inside of the NAT.
The GRE packets with corresponding Call IDs are translated through the NAT as follows:
In addition, the following applies:
The basic principle of PPTP NAT ALG is shown in Figure 61.
The scenario where multiple clients behind the NAT are terminated to the same PPTP server is shown in Figure 62. In this case, it is possible that the source IP addresses of the two PPTP clients are mapped to the same outside address of the NAT. Since the endpoints of the GRE tunnel from the NAT to the PPTP server will be the same for both PPTP clients (although their real source IP addresses are different), the NAT must ensure the uniqueness of the Call-IDs in the outbound data connection. This is where Call-ID translation in the NAT becomes crucial.
The router supports a deployment scenario where multiple calls (or tunnels) are established from a single PPTP node within a single control connection. In this case, there is only one set of Start-Control-Connection-Req/Reply messages (one control channel) and multiple sets of Outgoing-Call-Req/Reply messages.
Call-IDs are taken from the same pool as the ICMP port ranges. Port ranges and Call-IDs are both 16-bit values. The Call-ID selection mechanism is the same as the outside TCP/UDP port selection mechanism (random with parity).
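The need for Call-ID uniqueness per outside IP address can be illustrated with the following sketch. It is a simplified model, not the router's allocator: the class is hypothetical, it ignores the sharing of the value space with ICMP port ranges, and it assumes that "random with parity" means the outside value keeps the parity of the inside value.

```python
import random


class CallIdTranslator:
    """Allocates unique outside Call-IDs per outside IP address."""

    def __init__(self, rng=None):
        self._rng = rng if rng is not None else random.Random()
        self._mapping = {}   # (inside_ip, inside_call_id, outside_ip) -> outside Call-ID
        self._in_use = {}    # outside_ip -> set of allocated outside Call-IDs

    def translate(self, inside_ip, inside_call_id, outside_ip):
        key = (inside_ip, inside_call_id, outside_ip)
        if key in self._mapping:
            return self._mapping[key]          # existing call, reuse mapping
        used = self._in_use.setdefault(outside_ip, set())
        parity = inside_call_id & 1
        # Assumption: pick randomly among free 16-bit values of equal parity.
        candidates = [c for c in range(parity, 65536, 2) if c not in used]
        if not candidates:
            raise RuntimeError("no free Call-ID left for this outside IP")
        outside_call_id = self._rng.choice(candidates)
        used.add(outside_call_id)
        self._mapping[key] = outside_call_id
        return outside_call_id
```

Two PPTP clients that choose the same inside Call-ID and share one outside IP address therefore receive different outside Call-IDs, which keeps their GRE data connections distinguishable at the PPTP server.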
LSN logging is extremely important to Service Providers (SPs) who are required by government agencies to track the source of suspicious Internet activities back to the users hidden behind the LSN device.
The 7750 SR supports several modes of logging for LSN applications. Choosing the right logging model will depend on the required scale, simplicity of deployment and granularity of the logged data.
For most purposes logging of allocation/de-allocation of outside port-blocks and outside IP address along with the corresponding LSN subscriber and inside service-id will suffice.
In certain cases port-block based logging is not satisfactory and per flow logging is required.
The simplest form of LSN and L2-Aware NAT logging is via the logging facility in the 7750 SR, commonly called logger. Each port-block allocation/de-allocation event will be recorded and sent to the system logging facility (logger). Such an event can be:
In this mode of logging, all applications in the system share the same logger.
Syslog/SNMP/Local-File logging on LSN is mutually exclusive with NAT RADIUS-based logging.
Syslog/SNMP/local-file logging must be separately enabled for LSN and L2-Aware NAT in log event-control. The following displays relevant MIB events:
In this example a single port-block [1884-1888] is allocated/de-allocated for the inside IP address 5.5.5.5, which is mapped to the outside IP address 80.0.0.1. Consequently the event is logged in memory as:
Once the desired LSN events are enabled for logging via event-control configuration, they can be logged to memory via standard log-id 99 or be filtered via a custom log-id, such as in this example (log-id 5):
Configuration:
The event description is given below:
In this case, the destination of log-id 5 in the following example would be a local file instead of memory:
The events will be logged to a local file on the compact flash cf3 in a file under the /log directory.
In case of SNMP logging to a remote node, the log destination should be set to an SNMP destination. Allocation or de-allocation of each port block will trigger sending an SNMP trap message to the trap destination.
NAT logs can be sent to a syslog remote facility. A separate syslog message is generated for every port-block allocation/de-allocation.
Severity level for this event can be changed via CLI:
LSN RADIUS logging (or accounting) is based on RADIUS accounting messages as defined in RFC 2866. It requires an operator to have RADIUS accounting infrastructure in place. For that reason, LSN RADIUS logging and LSN RADIUS accounting terms can be used interchangeably.
This mode of logging operation is introduced so that the shared logging infrastructure in 7750 SR can be offloaded by disabling syslog/SNMP/Local-file LSN logging. The result is increased performance and higher scale, particularly in cases when multiple BB-ISA cards within the same system are deployed to perform aggregated LSN functions.
An additional benefit of LSN RADIUS logging over syslog/SNMP/local-file logging is reliable transport. Although RADIUS accounting relies on unreliable UDP transport, each accounting message from the RADIUS client must be acknowledged on the application level by the receiving end (accounting server).
Each port-block allocation or de-allocation is reported to an external accounting (logging) server in the form of start, interim-update or stop messages. The type of accounting messages generated depends on the mode of operation:
The accounting messages are generated and reported directly from the BB-ISA card, therefore bypassing accounting infrastructure residing on the Control Plane Module (CPM).
LSN RADIUS logging is enabled per nat-group. To achieve the required scale, each BB-ISA card in the nat-group with LSN RADIUS logging enabled runs a RADIUS client with its own IP address. Accounting messages can be distributed to up to five accounting servers that can be accessed in round-robin fashion. Alternatively, in direct access mode, only one accounting server in the list is used. When this server fails, the next one in the list is used.
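The difference between the two access modes can be sketched as follows. The function and its parameters are hypothetical, and skipping unreachable servers in round-robin mode is an assumption; the text above only describes failover for direct access mode.

```python
def pick_server(servers, algorithm, message_index, down=frozenset()):
    """Choose the accounting server for one message.

    servers:       ordered list of up to five server names or addresses.
    algorithm:     'round-robin' or 'direct'.
    message_index: running counter of accounting messages sent.
    down:          servers currently considered failed.
    """
    if not 1 <= len(servers) <= 5:
        raise ValueError("between one and five accounting servers expected")
    if algorithm == "round-robin":
        start = message_index % len(servers)
        candidates = servers[start:] + servers[:start]
    else:
        # Direct mode: always prefer the first usable server in list order.
        candidates = list(servers)
    for server in candidates:
        if server not in down:
            return server
    return None


servers = ["acct-1", "acct-2", "acct-3"]
rotation = [pick_server(servers, "round-robin", i) for i in range(4)]
```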
Configuration steps:
Each BB-ISA card is assigned one IPv4 address from the source-address-range command, and this IPv4 address must be accessible from the accounting server. In the following example there is only one BB-ISA card in nat-group 1. Its source IP address is 114.0.1.20.
It is possible to load-balance accounting messages over multiple logging servers by configuring the access-algorithm to round-robin mode. Once the LSN RADIUS accounting policy is defined, it will have to be applied to a nat-group:
The RADIUS accounting messages for the case where a Large Scale NAT44 subscriber has allocated two port blocks in a logging mode where acct start/stop is generated per port-block is shown below.
Port-blocks allocation for the NAT44 subscriber:
Port-blocks de-allocation
The inclusion of acct-multi-session-id in the NAT accounting policy enables generation of start/stop messages for each allocation/de-allocation of a port-block within the subscriber. Otherwise, only the first and last port-block for the subscriber generate a pair of start/stop messages, and all port-blocks in between trigger interim-update messages.
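The two behaviors can be expressed as a small decision function. This is an illustrative model only; the function name and arguments are hypothetical.

```python
def accounting_message(event, blocks_before, multi_session_id):
    """Return the accounting message type for a port-block event.

    event:            'alloc' or 'free'.
    blocks_before:    port-blocks the subscriber held before the event.
    multi_session_id: True if acct-multi-session-id is included in the
                      NAT accounting policy.
    """
    if multi_session_id:
        # Every port-block gets its own start/stop pair.
        return "start" if event == "alloc" else "stop"
    if event == "alloc":
        return "start" if blocks_before == 0 else "interim-update"
    return "stop" if blocks_before == 1 else "interim-update"


# A subscriber allocating two port-blocks and then releasing both.
events = [("alloc", 0), ("alloc", 1), ("free", 2), ("free", 1)]
without_id = [accounting_message(e, n, False) for e, n in events]
with_id = [accounting_message(e, n, True) for e, n in events]
```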
The User-Name attribute in accounting messages is set to app-name@inside-ip-address, where app-name can be any of the following: LSN44, DS-Lite or NAT64.
Logging of L2-Aware NAT is supported via accounting policy associated with the ESM subscriber (outside of NAT). In addition to ESM subscriber specific attributes, the NAT port-ranges and outside IP address (nat-port-range command in regular ESM accounting policy) are reported in the same accounting messages.
RADIUS accounting initiated by BB-ISA card is not supported for L2-Aware NAT.
Syslog/SNMP/Local-file logging can be enabled simultaneously with L2-aware NAT RADIUS accounting (which is in this case regular ESM RADIUS accounting).
LSN and L2-Aware NAT Flow logging is a facility that allows each BB-ISA card to export the creation and deletion of NAT flows to an external server. A NAT flow or a Fully Qualified Flow consists of the following parameters: Inside IP, inside port, outside IP, outside port, foreign IP, foreign port, protocol (UDP, TCP, ICMP).
In addition, the inside/outside service-id and subscriber string will be added to a flow record.
Flow logging can be deployed as an alternative to the port-range logging or can be complementary (providing a more granular log for offline reporting or compliance). Certain operators have legal and compliance requirements that require extremely detailed logs, created per flow, to be exportable from the NAT node.
Because the setup rate of new flows can be very high, logging to an internal facility (like compact flash) is not possible except in a debugging mode (which must specify match criteria down to the inside-IP and service level).
Flow logging can be enabled per nat-policy and consequently it is initiated from each BB-ISA card independently as a UDP stream, unlike a centralized Netflow/Cflowd application.
Flows are formatted according to IETF IPFIX RFC 5101, Specification of the IP Flow Information Export (IPFIX) Protocol for the Exchange of IP Traffic Flow Information. Data structures are defined in RFC 5102, Information Model for IP Flow Information Export. NAT flow logging is sent to up to two different IP addresses, both of which must be unicast IPv4 destinations. These UDP streams are stateless due to the significant volume of transactions. However, they do contain sequence numbers so that packet loss can be identified. They egress the chassis at FC NC.
IPFIX defines two different types of messages that are sent from the IPFIX exporter (7750 SR NAT node). The first contains a Template Set: an IPFIX message that defines fields for subsequent IPFIX messages but contains no actual data of its own. The second contains Data Sets: here the data is passed using the previous Template Set message to define the fields. This means that an IPFIX message is not passed as sets of TLVs; instead, the data is encoded with a scheme defined through the Template Set message.
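A collector can tell the two message types apart by walking the Set headers inside each IPFIX message. The sketch below relies on the RFC 5101 wire format (16-byte message header with version 10, 4-byte Set headers, Set ID 2 for Template Sets, 3 for Options Template Sets, and 256 or higher for Data Sets); the function name and return structure are illustrative.

```python
import struct

IPFIX_VERSION = 10
MESSAGE_HEADER = struct.Struct("!HHIII")  # version, length, export time, sequence, domain
SET_HEADER = struct.Struct("!HH")         # set ID, set length


def parse_ipfix_sets(message):
    """Return header fields and the kind of each Set in an IPFIX message."""
    version, length, export_time, sequence, domain = MESSAGE_HEADER.unpack_from(message, 0)
    if version != IPFIX_VERSION:
        raise ValueError("not an IPFIX message")
    sets = []
    offset = MESSAGE_HEADER.size
    while offset + SET_HEADER.size <= length:
        set_id, set_len = SET_HEADER.unpack_from(message, offset)
        if set_len < SET_HEADER.size:
            raise ValueError("malformed Set length")
        if set_id == 2:
            kind = "template"
        elif set_id == 3:
            kind = "options-template"
        elif set_id >= 256:
            kind = "data"          # Set ID equals the Template ID describing it
        else:
            kind = "reserved"
        sets.append((kind, set_id, set_len))
        offset += set_len
    return {"sequence": sequence, "observation_domain": domain, "sets": sets}
```

A Data Set can only be decoded after the Template Set with the matching Template ID has been received, which is why the exporter sends templates first and repeats them periodically.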
While an IPFIX message can contain both a Template Set and a Data Set, the 7750 SR sends Template Set messages periodically without any data, whereas the Data Set messages are sent on demand and as required. When IPFIX is used over UDP, the retransmission interval of the Template Set messages defaults to 10 minutes. The interval is configurable in CLI with a minimum of 1 minute and a maximum of 10 minutes. When the exporter first initializes, or when a configuration change occurs, the Template Set is sent out three times, one second apart. Templates are sent before any data sets, assuming that the collector is enabled, so that an IPFIX collector can establish the data template set.
Although the UDP transport is unreliable, packet loss can be detected with the IPFIX Sequence Number. This 32-bit number contains the total number of IPFIX Data Records sent for the UDP transport session prior to the receipt of the new IPFIX message. The sequence number starts at 0 and is counted modulo 2^32, so it rolls over to 0 after reaching 4,294,967,295.
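On the collector side, loss detection reduces to modulo 2^32 arithmetic on consecutive messages. A minimal sketch, with a hypothetical function name:

```python
MODULUS = 2 ** 32


def lost_records(prev_sequence, prev_record_count, new_sequence):
    """Number of Data Records lost between two consecutive IPFIX messages.

    prev_sequence:     Sequence Number of the previously received message.
    prev_record_count: Data Records carried in that message.
    new_sequence:      Sequence Number of the message just received.
    """
    expected = (prev_sequence + prev_record_count) % MODULUS
    # The modulo keeps the result correct across the 32-bit rollover.
    return (new_sequence - expected) % MODULUS
```

For example, `lost_records(0, 5, 8)` reports 3 missing records. A reordered (late) message shows up as a very large value, so a real collector would treat such values separately.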
The default packet size is 1500B unless another value has been defined in the configuration (the range is 512B through 9212B inclusive). Traffic is originated from a random high port to the collector on port 4739. Multiple create/delete flow records are packed into a single IPFIX packet (although the mapping creations are not delayed) until adding another data record would exceed the MTU or a timer expires. The timer is not configurable and is set to 250 ms (that is, when a mapping is created, a packet is sent within 250 ms of that mapping being created).
Each collector has a buffering space of 50 packets. If excessive logging exhausts the buffering space, new flows are denied and the deletion of flows is delayed until buffering space becomes available.
Two collector nodes can be defined in the same IPFIX export policy for redundancy purposes.
This section provides an example of how to configure large scale NAT44 flow logging.
Define a collector node along with other local transport parameters through an IPFIX export-policy.
To export flow records via a UDP stream, the BB-ISA card must be configured with an appropriate IPv4 address within a designated VPRN. This address (/32) will act as the source for sending all IPFIX records and is shared by all ISAs.
After the IPFIX export policy is defined, apply it within the NAT policy:
The capture of IPFIX packet for an ICMP flow creation and deletion is shown in the following examples.
Flow creation:
Flow deletion:
Table 33 lists the values and descriptions of the fields in the example flow creation and deletion templates.
Field | Size (B) | Description |
Export Timestamp | n/a | Timestamp derived from chassis NTP, per RFC 5101 |
Sequence Id | n/a | Total number of IPFIX data records sent for the UDP transport session prior to the receipt of the new IPFIX message (modulo 2^32), per RFC 5101 |
Observation Domain ID | n/a | Unique ID set per ISA in the 7750 SR chassis |
FlowID | 8 | Unique ID (per observation domain ID) for this flow used for tracking purposes only (opaque value); flow ID in a create and a delete mapping record must be the same for a specific NAT mapping |
IP_SRC_ADDR | 4 | Outside IP address used in the NAT mapping |
IP_DST_ADDR | 4 | Destination or remote IP address used in the NAT mapping |
L4_SRC_PORT | 2 | Outside source port used in the NAT mapping |
L4_DST_PORT | 2 | Destination port used in the NAT mapping |
flowStartMilliseconds 1 | 8 | Timestamp when the flow was created (chassis NTP derived) in milliseconds from epoch, per RFC 5102 |
flowEndMilliseconds 2 | 8 | Timestamp when the flow was destroyed (chassis NTP derived) in milliseconds from epoch, per RFC 5102 |
PROTOCOL | 1 | Protocol ID (TCP, UDP or ICMP), per RFC 5102 |
PADDING | 1 | n/a |
flowEndReason | 1 | Supported flow end reasons: |
aluInsideServiceID | 2 | 16-bit service ID representing the inside service ID |
aluOutsideServiceID | 2 | 16-bit service ID representing the outside service ID |
aluNatSubString | var | A variable 8B aligned string that represents the NAT subscriber construct (as currently used in the tools dump service nat session commands) |
Notes:
In general, fragmentation functionality is invoked when the size of a fragmentation eligible packet exceeds the size of the MTU of the egress interface/tunnel. Packets eligible for fragmentation are:
The best practice is to avoid fragmentation in the network by ensuring adequate MTU size on the transit/source nodes. Drawbacks of fragmentation are:
Fragmentation can be particularly deceiving in a tunneled environment whereby the tunnel encapsulation adds extra overhead to the original packet. This extra overhead could tip the size of the resulting packet over the egress MTU limit.
Fragmentation could be one solution in cases where a restriction in the MTU size on the packet's path from source to destination cannot be avoided. Routers support IPv6 fragmentation in DS-Lite and NAT64 with some enriched capabilities, such as optional IPv6 packet fragmentation even in cases where the DF-bit in the corresponding IPv4 packet is set.
In general, the lengths of the fragments must be chosen such that resulting fragment packets fit within the MTU of the path to the packets destination(s).
In downstream direction fragmentation can be implemented in two ways:
In upstream direction, IPv4 packets can be fragmented once they are de-capsulated in DS-lite or translated in NAT64. The fragmentation will occur in the IOM.
In the downstream direction, the IPv6 packet carrying the IPv4 packet (IPv4-in-IPv6) is fragmented in the ISA if the configured DS-lite tunnel-mtu is smaller than the size of the IPv4 packet that is to be tunneled inside the IPv6 packet. The maximum IPv6 fragment size will be 48 bytes larger than the value set by the tunnel-mtu. The additional 48 bytes are added by the IPv6 header fields: 40 bytes for the basic IPv6 header plus 8 bytes for the IPv6 fragmentation extension header. The NAT implementation in the routers does not insert any IPv6 extension headers other than the fragmentation header.
In case that the IPv4 packet is larger than the value set by the tunnel-mtu, the fragmentation action will depend on the configuration options and the DF bit setting in the header of the received IPv4 header:
In case that the IPv4 packet is dropped due to fragmentation not being allowed, an ICMPv4 Datagram Too Big message will be returned to the source. This message will carry the information about the size of the MTU that is supported, in essence notifying the source to reduce its MTU size to the requested value (tunnel-mtu).
The maximum number of supported fragments per IPv6 packet is 8. Considering that the minimum standards-based MTU for IPv6 is 1280 bytes, 8 fragments (8 × 1280 = 10240 bytes) are enough to cover jumbo Ethernet frames.
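The DS-Lite sizing rules above can be illustrated with a short Python sketch. It is not the router's implementation: the function name is hypothetical, and the sketch assumes that each fragment carries at most tunnel-mtu bytes of the IPv4 packet, rounded down to a multiple of 8 bytes as IPv6 fragment offsets require.

```python
import math

IPV6_HEADER = 40    # basic IPv6 header, in bytes
FRAG_HEADER = 8     # IPv6 fragmentation extension header, in bytes
MAX_FRAGMENTS = 8   # per-packet fragment limit described in the text


def dslite_fragment_sizes(ipv4_len, tunnel_mtu):
    """Return the sizes of the IPv6 packets that carry one IPv4 packet.

    Hypothetical helper: assumes every fragment carries up to tunnel_mtu
    bytes of the IPv4 packet, aligned down to an 8-byte boundary.
    """
    if tunnel_mtu < 8:
        raise ValueError("tunnel_mtu too small for this sketch")
    if ipv4_len <= tunnel_mtu:
        # The packet fits in the tunnel: one IPv4-in-IPv6 packet,
        # no fragmentation header is added.
        return [ipv4_len + IPV6_HEADER]
    chunk = tunnel_mtu - (tunnel_mtu % 8)
    count = math.ceil(ipv4_len / chunk)
    if count > MAX_FRAGMENTS:
        raise ValueError("packet would need more than 8 fragments")
    sizes = []
    remaining = ipv4_len
    while remaining > 0:
        payload = min(chunk, remaining)
        # Each fragment adds 40 + 8 = 48 bytes of IPv6 headers.
        sizes.append(payload + IPV6_HEADER + FRAG_HEADER)
        remaining -= payload
    return sizes
```

For example, under these assumptions a 1500-byte IPv4 packet with a tunnel-mtu of 1280 yields two fragments, the larger of which is 1280 + 48 = 1328 bytes.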
Downstream fragmentation in NAT64 works in a similar fashion. The difference from DS-Lite is that in NAT64 the configured ipv6-mtu represents the MTU size of the IPv6 packet (as opposed to the payload of the IPv6 tunnel in DS-Lite). In addition, the IPv4 packet in NAT64 is not tunneled; instead, the IPv4/IPv6 headers are translated. Consequently, the fragmented IPv6 packet will be 28 bytes larger than the translated IPv4 packet: a 20-byte difference in basic IP header sizes (40-byte IPv6 header versus 20-byte IPv4 header) plus 8 bytes for the IPv6 fragmentation extension header. The only IPv6 extension header that NAT64 generates is the fragmentation header.
In case that the IPv4 packet is dropped due to the fragmentation not being allowed, the returned ICMP message will contain MTU size of ipv6-mtu minus 28 bytes.
Otherwise the fragmentation options are the same as in DS-lite.
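The difference between the MTU values reported to the IPv4 source in the two cases can be summarized in a minimal Python sketch. The function names are hypothetical; the constants are the header sizes quoted in the text.

```python
IPV4_HEADER = 20   # basic IPv4 header, in bytes
IPV6_HEADER = 40   # basic IPv6 header, in bytes
FRAG_HEADER = 8    # IPv6 fragmentation extension header, in bytes

# NAT64 translates the header rather than tunneling the packet, so the
# overhead is the header size difference plus the fragmentation header.
NAT64_OVERHEAD = (IPV6_HEADER - IPV4_HEADER) + FRAG_HEADER


def nat64_icmp_mtu(ipv6_mtu):
    """MTU returned in the ICMP message when a NAT64 packet is dropped."""
    return ipv6_mtu - NAT64_OVERHEAD


def dslite_icmp_mtu(tunnel_mtu):
    """In DS-Lite the tunnel-mtu already describes the tunneled IPv4 size."""
    return tunnel_mtu
```

With an ipv6-mtu of 1500, for instance, the sketch reports 1500 - 28 = 1472 bytes to the IPv4 source.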
The NAT histogram command displays compartmentalized port distribution per protocol for an aggregated number of subscribers. This allows operators to trend port usage over time and adjust the configuration as the port demand per subscriber increases or decreases. For example, an operator may find that the port usage in a pool has increased over a period of time. Accordingly, the operator may plan to increase the number of ports per port block.
The feature is not applicable to pools which operate in one-to-one mode.
The output is organized in port buckets with the number of subscribers in each bucket.
For example:
The output of the histogram command can be periodically exported to an external destination via cron. The following is an example:
The nat-histogram.txt file contains the command execution line. For example:
This command will be executed every 10 minutes (600 seconds) and the output of the command will be written into a set of files on an external FTP server:
The output of this command displays the port usage in a given pool per protocol per subscriber. The output is organized in a configurable number of port-buckets.
In the following example there is 1 subscriber that is using between 20 and 39 UDP ports in the pool named det. The pool is configured in the Base routing instance.
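The bucketing behind such output can be sketched in Python. This is only an illustration of the idea, not the router's algorithm: the function name and the input dictionary are hypothetical, and the bucket width of 20 ports is inferred from the 20 to 39 range mentioned above.

```python
from collections import Counter


def port_histogram(ports_per_subscriber, bucket_size=20):
    """Count subscribers per port-usage bucket for one protocol.

    ports_per_subscriber maps a subscriber identifier to the number of
    ports that subscriber currently uses (hypothetical input format).
    """
    counts = Counter()
    for ports in ports_per_subscriber.values():
        low = (ports // bucket_size) * bucket_size
        counts[(low, low + bucket_size - 1)] += 1
    # Return the buckets ordered by their lower bound.
    return dict(sorted(counts.items()))
```

A subscriber using 25 UDP ports would be counted in the (20, 39) bucket.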
Multi-chassis stateless NAT redundancy is based on a switchover of the NAT pool that can assume active (master) or standby state. The inside/outside routes that attract traffic to the NAT pool are always advertised from the active node (the node on which the pool is active).
This dual-homed redundancy based on the pool mastership state works well in scenarios where each inside routing context is configured with a single nat-policy (NATed traffic within this inside routing context will be mapped to a single NAT pool).
However, in cases where the inside traffic is mapped to multiple pools (with deterministic NAT and in case when multiple NAT policies are configured per inside routing context), the basic per pool multi-chassis redundancy mode can cause the inside traffic within the same routing instance to fail since some pools referenced from the routing instance might be active on one node while other pools might be active on the other node.
Imagine a case where traffic ingressing the same inside routing instance is mapped as follows (this mapping can be achieved via filters):
Traffic for the same destination is normally attracted to only one NAT node (the destination route is advertised only from a single NAT node). Assume that this node is Node 1 in our example. Once the traffic arrives at the NAT node, it is mapped to the corresponding pool according to the mapping criteria (routing based or filter based). But if the active pools are not co-located, traffic destined to a pool that is active on the neighboring node fails. In our example, traffic from source IP address B arrives at Node 1, while the corresponding Pool 2 is inactive on that node. Consequently, traffic forwarding fails.
To remedy this situation, a group of pools targeted from the same inside routing context must be active on the same node simultaneously. In other words, the active pools referenced from the same inside routing instance must be co-located. This group of pools is referred to as Pool Fate Sharing Group (PFSG). The PFSG is defined as a group of all NAT pools referenced by inside routing contexts whereby at least one of those pools is shared by those inside routing contexts. This is shown in Figure 64.
Even though only Pool 2 is shared between subscribers in VRF 1 and VRF 2, the remaining pools in VRF 1 and VRF 2 must be made part of PFSG 1 as well.
This will ensure that the inside traffic will be always mapped to pools that are active in a single box.
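The PFSG definition amounts to grouping pools transitively: any two inside routing contexts that share a pool pull all of their pools into the same group. A minimal Python sketch of that grouping follows; the function name, the context names and the pool names are hypothetical and do not reflect the contents of Figure 64.

```python
def fate_sharing_groups(pools_by_context):
    """Group pools so that contexts sharing any pool end up in one group.

    pools_by_context maps an inside routing context name to the list of
    NAT pools it references (hypothetical input format).
    """
    groups = []  # pairwise disjoint sets of pool names
    for pools in pools_by_context.values():
        merged = set(pools)
        untouched = []
        for group in groups:
            if group & merged:
                # This context shares a pool with an existing group.
                merged |= group
            else:
                untouched.append(group)
        groups = untouched + [merged]
    return groups


contexts = {
    "VRF 1": ["Pool 1", "Pool 2"],
    "VRF 2": ["Pool 2", "Pool 3"],
    "VRF 3": ["Pool 4"],
}
groups = fate_sharing_groups(contexts)
```

Here Pool 1, Pool 2 and Pool 3 form one group because VRF 1 and VRF 2 share Pool 2, while Pool 4 forms a group of its own.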
There is always one lead pool in a PFSG. The lead pool is the only pool that exports and monitors routes. Other pools in the PFSG reference the lead pool and inherit its (activity) state. If any of the pools in the PFSG fails, all the pools in the PFSG switch activity; in other words, they share the fate of the lead pool (active/standby/disabled).
There is one lead pool per PFSG per node in a dual-homed environment. Each lead pool in a PFSG has its own export route, which must match the monitoring route of the lead pool in the corresponding PFSG on the peering node.
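The relationship between lead pools, follower pools and the export/monitoring routes can be modeled with a small Python sketch. The class, its fields and the function name are hypothetical and exist only to illustrate the rules described above.

```python
from dataclasses import dataclass, field


@dataclass
class LeadPool:
    name: str
    export_route: str
    monitor_route: str
    state: str = "standby"  # active, standby or disabled
    followers: list = field(default_factory=list)

    def follower_states(self):
        """Followers have no state of their own; they mirror the lead pool."""
        return {name: self.state for name in self.followers}


def routes_paired(local, peer):
    """Each node's export route must be the route its peer monitors."""
    return (local.export_route == peer.monitor_route
            and peer.export_route == local.monitor_route)
```

When the state of a lead pool changes, every follower reported by follower_states changes with it, which is the fate-sharing behavior described in the text.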
PFSG is implicitly enabled by configuring multiple pools to follow the same lead pool.
Attracting traffic to the active NAT node (from inside and outside) is based on the routing.
On the outside, the active pool address range will be advertised. On the inside, the destination prefix or steering route (in case of filter based diversion to the NAT function) will be advertised by the node with the active pool.
The advertisement of the routes will be driven by the activity of the pools in the pool fate sharing group:
For example:
A pool can be one of the following:
Both sets of options are thus mutually exclusive.
For a leading pool, redundancy is only enabled when the redundancy node is in the no shutdown state. For a following pool, the administrative state has no effect, and redundancy is only enabled when the leading pool is enabled.
Before a lead pool is enabled, a consistency check is performed to make sure that the PFSG is properly configured and that all pools in the given PFSG belong to the same NAT isa-group. Adding or removing pools from the fate-share-group is only possible when the leading pool is disabled.
For example in the following case, the consistency check would fail since pool 1 is not part of the PFSG 1 (where it should be).
The following command displays the state of the leading pool (dual-homing section towards the bottom of the command output):
The following command displays the state of the follower pool (dual-homing section towards the bottom of the command output):
The following command lists all the pools that are configured along with the NAT inside/outside routing context.
NAT ISA redundancy helps protect against Integrated Service Adapter (ISA) failures. This protection mechanism relies on the CPM maintaining a configuration copy of each ISA. If an ISA fails, the CPM restores the NAT configuration from the failed ISA to the remaining ISAs in the system. The NAT configuration copy of each ISA, as maintained by the CPM, covers the configuration of outside IP addresses and port forwards on each ISA. However, the CPM does not maintain the state of dynamically created translations on each ISA. This causes an interruption in traffic until the translations are re-initiated by the devices behind the NAT.
Two modes of operation are supported:
By reserving memory resources it can be assured that failed traffic can be recovered by remaining ISAs, potentially with some bandwidth reduction in case that remaining ISAs operated at full or close to full speed before the failure occurred. Active-active ISA redundancy model is shown in Figure 67.
In case of an ISA failure, the member-id of the member ISA that failed is contained in the FREE log. This info is used to find the corresponding MAP log which also contains the member-id field.
In case of RADIUS logging, a CPM summarization trap is generated (since the RADIUS log is sent from the ISA, which has failed).
In active-active ISA redundancy, each ISA is subdivided into multiple logical ISAs. These logical sub-entities are referred to as members. NAT configuration of each member is saved in the CPM. In case that any one ISA fails, its members will be downloaded by the CPM to the remaining active ISAs. Memory resources on each ISA will be reserved in order to accommodate additional traffic from the failed ISAs. The amount of resources reserved per ISA will depend on the number of ISAs in the system and the number of simultaneously supported ISA failures. The number of simultaneous ISA failures per system is configurable. Memory reservation will affect NAT scale per ISA.
Traffic received on the inside is forwarded by the ingress forwarding complex to a predetermined member ISA for further NAT processing. Each ingress forwarding complex maintains an internal link per member. The number of these internal links, among other factors, determines the maximum number of members per system and with this, the granularity of traffic distribution over the remaining ISAs in case of an ISA failure. The segmentation of ISAs into members for a single failure scenario is shown in Figure 68. The protection mechanism in this example is designed to cover one physical ISA failure. Each ISA is divided into four members. Three of those carry traffic during normal operation, while the fourth one has resources reserved to accommodate traffic from one of the members in case of failure. When an ISA failure occurs, three members are delegated to the remaining ISAs. Each member from the failed ISA is mapped to a corresponding reserved member on the remaining ISAs.
The active-active ISA redundancy model supports multiple simultaneous failures. The protection mechanism shown in Figure 69 is designed to protect against two simultaneous ISA failures. In this case each ISA is divided into six members: as in the previous case, three of them carry traffic under normal circumstances, while the remaining three have reserved memory resources.
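The delegation of members after a failure can be sketched in Python. The function name and ISA identifiers are hypothetical, the four-ISA layout is an assumption consistent with the single-failure description, and the round-robin choice of target ISA is an assumption of this sketch rather than documented router behavior.

```python
def redistribute(isas, failed, active_per_isa, reserved_per_isa):
    """Map each active member of the failed ISA to a surviving ISA.

    Hypothetical sketch: members are handed out round-robin, and the
    reserved members on the survivors must be able to absorb them.
    """
    survivors = [isa for isa in isas if isa != failed]
    if len(survivors) * reserved_per_isa < active_per_isa:
        raise ValueError("not enough reserved members on surviving ISAs")
    mapping = {}
    for member in range(active_per_isa):
        target = survivors[member % len(survivors)]
        mapping[f"{failed}/member-{member + 1}"] = target
    return mapping


# Four ISAs, each with three active members and one reserved member.
plan = redistribute(["isa-1", "isa-2", "isa-3", "isa-4"], "isa-2", 3, 1)
```

In this layout each of the three surviving ISAs receives exactly one member from the failed ISA, filling its single reserved member.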
Table 34 shows resource utilization for a single ISA failure in relation to the total number of ISAs in the system. The resource utilization affects only the scale of each ISA. Bandwidth per ISA is not reserved, and each ISA can operate at full speed at any given time (with or without failures).
Number of Physical ISAs per System | Number of Member ISAs per Physical ISA (active/reserved) | Resource Utilization Per System in Non-Failed Condition | Resource Utilization Per System With One Failed ISA |
2 | 1A 1R | 50% | 100% |
3 | 2A 1R | 67% | 100% |
4 | 3A 1R | 75% | 100% |
5 | 3A 1R | 75% | 95% |
6 | 2A 1R | 66% | 83% |
7 | 2A 1R | 66% | 80% |
8 | 2A 1R | 66% | 79% |
9 | 1A 1R | 50% | 61% |
10 | 1A 1R | 50% | 60% |
11 | 1A 1R | 50% | 59% |
12 | 1A 1R | 50% | 58% |
13 | 1A 1R | 50% | 58% |
14 | 1A 1R | 50% | 57% |
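The non-failed column of the table appears to be the share of active members among all members of an ISA. The following Python sketch reproduces that column under this assumption; the function name is hypothetical, and the sketch does not attempt to derive the column for the failed condition.

```python
def non_failed_utilization(active, reserved):
    """Percentage of members carrying traffic when no ISA has failed.

    Assumption: utilization equals active / (active + reserved).
    """
    return round(100 * active / (active + reserved))
```

For a 2A 1R layout this gives 67%, which the table lists as either 67% or 66% depending on the row.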
During the first five minutes of system boot-up or nat-group activation, the system behaves as if all ISAs are operational. Consequently, ISAs are segmented into their members according to the configured maximum number of supported failures.
Upon expiration of this initial five-minute interval, the system is re-evaluated. If one or more ISAs are found in a faulty state during re-evaluation, the members of the failed ISAs are distributed to the remaining operational ISAs.
Once a failed ISA is recovered, the system will automatically accept it and traffic will be assigned to it. Traffic that is moved to the recovered ISA will be interrupted.
Adding ISAs to an operational nat-group requires reconfiguration of the active mda-limit for the nat-group (or the failed mda-limit for that matter). This is only possible when the nat-group is in an administratively shutdown state.
This section describes the interaction between MS-ISA applications and other system features.
All MS-ISA uses include support for service mirroring running with no feature interactions or impacts. For example, any service diverted to AA, IPsec, NAT, LNS, or supported combinations of MS-ISA application also supports service mirroring simultaneously.
Multiple uses of MS-ISAs can be combined at one time by daisy-chaining the MS-ISAs. Services and subscribers terminated on the LNS ISA are fully supported by Application Assurance per AA subscriber and service capabilities, and by the full NAT capabilities.
When Application Assurance and NAT are used in combination (for both ESM and SAP service contexts):
Subscriber aware Large Scale NAT44 attempts to combine the positive attributes of Large Scale NAT44 and L2-Aware NAT, namely:
Subscriber awareness in Large Scale NAT44 facilitates the release of NAT resources immediately after the BNG subscriber is terminated, without having to wait for the last flow of the subscriber to expire on its own (the TCP timeout is 4 hours by default).
The subscriber aware Large Scale NAT44 function leverages the RADIUS accounting proxy built into the 7750 SR. The RADIUS accounting proxy allows the 7750 SR to inform the Large Scale NAT44 application about individual BNG subscribers from the RADIUS accounting messages generated by a remote BNG and to use this information in the management of Large Scale NAT44 subscribers. The combination of the two allows, for example, the 7750 SR running as a Large Scale NAT44 to make the correlation between the BNG subscriber (represented in the Large Scale NAT44 by the inside IP address) and RADIUS attributes such as User-Name, Alc-Sub-Ident-String, Calling-Station-Id or Class. These attributes can subsequently be used either for management of the Large Scale NAT44 subscriber, or in the NAT RADIUS accounting messages generated by the 7750 SR Large Scale NAT44 application. Doing so simplifies both the administration of the Large Scale NAT44 and the logging function for port-range blocks.
As BNG subscribers authenticate and come online, the RADIUS accounting messages are ‘snooped’ via RADIUS accounting proxy which creates a cache of attributes from the BNG subscriber. BNG subscribers are correlated with the NAT subscriber via framed-ip address, and one of the following attributes that must be present in the accounting messages generated by BNG:
Framed-ip address must also be present in the accounting messages generated by BNG.
Large Scale NAT44 Subscriber Aware application will receive a number of cached attributes which will then be used for appropriate management of Large Scale NAT44 subscribers, for example:
Creation and removal of RADIUS accounting proxy cache entries related to BNG subscriber is triggered by the receipt of accounting start/stop messages sourced by the BNG subscriber. Modification of entries can be triggered by interim-update messages carrying updated attributes. Cached entries can also be purged via CLI.
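The cache lifecycle described above can be sketched in Python. The class and its methods are hypothetical, accounting messages are modeled as plain dictionaries keyed by RADIUS attribute name, and the cache is keyed by Framed-IP-Address, which is an assumption about how the correlation with the inside IP address could be modeled.

```python
class AccountingProxyCache:
    """Toy model of a cache fed by snooped RADIUS accounting messages."""

    def __init__(self):
        self._entries = {}  # Framed-IP-Address -> cached attributes

    def handle(self, message):
        ip = message["Framed-IP-Address"]
        status = message["Acct-Status-Type"]
        attrs = {k: v for k, v in message.items()
                 if k not in ("Framed-IP-Address", "Acct-Status-Type")}
        if status == "Start":
            self._entries[ip] = attrs           # create the entry
        elif status == "Interim-Update":
            if ip in self._entries:
                self._entries[ip].update(attrs)  # refresh attributes
        elif status == "Stop":
            self._entries.pop(ip, None)          # remove the entry

    def lookup(self, inside_ip):
        """Return cached attributes for a NAT subscriber, or None."""
        return self._entries.get(inside_ip)

    def purge(self):
        """Equivalent of clearing cached entries by operator action."""
        self._entries.clear()
```

A lookup by inside IP address returns the cached BNG attributes while the subscriber is online and returns None once the accounting stop has been processed.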
In addition to passing one of the above attributes in Large Scale NAT44 RADIUS accounting messages, a set of opaque BNG subscriber RADIUS attributes can optionally be passed in Large Scale NAT44 RADIUS accounting messages. Up to 128 bytes of such opaque attributes are accepted. The remaining attributes are truncated.
Large Scale NAT44 subscriber instantiation can optionally be denied in case that corresponding BNG subscriber cannot be identified in Large Scale NAT44 via RADIUS accounting proxy.
Configuration guidelines:
Configure RADIUS accounting proxy functionality in a routing instance that will receive accounting messages from the remote or local BNG. Optionally forward the accounting messages received by the RADIUS accounting proxy to the final accounting destination (accounting server).
Point the BNG RADIUS accounting destination to the RADIUS accounting proxy – this way RADIUS accounting proxy will receive and ‘snoop’ BNG RADIUS accounting data.
BNG subscriber can be associated with two accounting policies, therefore pointing to two different accounting destinations. For example, one to the RADIUS accounting proxy, the other one to the real accounting server.
Configure subscriber aware Large Scale NAT44. From the Large Scale NAT44 Subscriber Aware application, reference the RADIUS accounting proxy server and define the string that will be used to correlate the BNG subscriber with the Large Scale NAT44 subscriber.
Optionally enable NAT RADIUS accounting that will include BNG subscriber relevant data.
(1)
RADIUS accounting proxy will listen to accounting messages on interface ‘rad-proxy-loopback’.
The name ‘proxy-acct’ as defined by the server command will be used to reference this proxy accounting server from Large Scale NAT44.
Received accounting messages can be relayed further from RADIUS accounting proxy to the accounting server which can be indirectly referenced in the default-accounting-policy ‘lsn-policy’.
This lsn-policy can then reference an external RADIUS accounting server with its own security credentials. This external accounting server can be configured in any routing instance.
(2)
Two RADIUS accounting policies can be configured in the BNG: one pointing to the real RADIUS server, the other to the RADIUS accounting proxy.
(3)
Sub-aware Large Scale NAT44 references the RADIUS accounting proxy server ‘proxy-acct’ and defines the calling-station-id attribute from the BNG subscriber as the matching attribute:
(4)
Optionally RADIUS NAT accounting can be enabled:
Such a setup would produce the following stream of Large Scale NAT44 RADIUS accounting messages:
The matching accounting stream generated on the BNG is given below: