(dpdk-with-open-vswitch)=
# How to use Open vSwitch with DPDK

Since [DPDK is *just* a library](https://ubuntu.com/server/docs/network-dpdk), it does little on its own and depends on emerging projects that make use of it. One consumer of the library that is already part of Ubuntu is Open vSwitch with DPDK (OvS-DPDK) support, shipped in the package `openvswitch-switch-dpdk`.

Here is a brief example of how to install and configure a basic Open vSwitch using DPDK for later use via `libvirt`/`qemu-kvm`.

```
sudo apt-get install openvswitch-switch-dpdk
sudo update-alternatives --set ovs-vswitchd /usr/lib/openvswitch-switch-dpdk/ovs-vswitchd-dpdk
ovs-vsctl set Open_vSwitch . "other_config:dpdk-init=true"
# run on core 0 only
ovs-vsctl set Open_vSwitch . "other_config:dpdk-lcore-mask=0x1"
# Allocate 2G huge pages (not NUMA node aware)
ovs-vsctl set Open_vSwitch . "other_config:dpdk-alloc-mem=2048"
# limit to one whitelisted device
ovs-vsctl set Open_vSwitch . "other_config:dpdk-extra=--pci-whitelist=0000:04:00.0"
sudo service openvswitch-switch restart
```

> **Remember**:
> You need to assign devices to DPDK-compatible drivers before restarting -- see the DPDK section on [unassigning the default kernel drivers](https://ubuntu.com/server/docs/network-dpdk/#heading--unassign-default-kernel-drivers).

Please note that the `dpdk-alloc-mem=2048` setting in the above example is the most basic non-uniform memory access (NUMA) setup, suitable for a single-socket system. If you have multiple sockets, you may want to define how the memory should be split among them. More details about these options are outlined in [Open vSwitch setup](http://docs.openvswitch.org/en/latest/intro/install/dpdk/#setup-ovs).

## Attach DPDK ports to Open vSwitch

The Open vSwitch you started above supports all the same port types as Open vSwitch usually does, *plus* DPDK port types. The following example shows how to create a bridge and -- instead of a normal external port -- add an external DPDK port to it.
When doing so you can specify the associated device.

```
ovs-vsctl add-br ovsdpdkbr0 -- set bridge ovsdpdkbr0 datapath_type=netdev
ovs-vsctl add-port ovsdpdkbr0 dpdk0 -- set Interface dpdk0 type=dpdk "options:dpdk-devargs=${OVSDEV_PCIID}"
```

You can tune this further by setting options:

```
ovs-vsctl set Interface dpdk0 "options:n_rxq=2"
```

## Open vSwitch DPDK to KVM guests

If you are not building some sort of software-defined networking (SDN) switch or NFV on top of DPDK, it is very likely that you want to forward traffic to KVM guests. The good news is that, with the new `qemu`/`libvirt`/`dpdk`/`openvswitch` versions in Ubuntu, this no longer requires manually appending a command line string. This section demonstrates a basic setup to connect a KVM guest to an Open vSwitch DPDK instance.

The recommended way to get to a KVM guest is using `vhost_user_client`. This will cause OvS-DPDK to connect to a socket created by QEMU. In this way, we can avoid old issues like "guest failures on OvS restart". Here is an example of how to add such a port to the bridge you created above.

```
ovs-vsctl add-port ovsdpdkbr0 vhost-user-1 -- set Interface vhost-user-1 type=dpdkvhostuserclient "options:vhost-server-path=/var/run/vhostuserclient/vhost-user-client-1"
```

This will connect to the specified path, which has to be created by a guest listening on it.

To let `libvirt`/`kvm` consume this socket and create a guest VirtIO network device for it, add the following snippet to your guest definition as the network definition.

```
```

## Tuning Open vSwitch-DPDK

DPDK has plenty of options -- in combination with Open vSwitch-DPDK the two most commonly used are:

```
ovs-vsctl set Open_vSwitch . other_config:n-dpdk-rxqs=2
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x6
```

The first line selects how many Rx Queues are to be used for each DPDK interface, while the second controls how many poll mode driver (PMD) threads to run (and where to run them).
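
As a rough illustration of how a mask such as `0x6` relates to CPU numbers, the sketch below sets one bit per CPU ID, so CPUs 1 and 2 give `0x6`. The helper name `cpus_to_mask` is hypothetical and not part of Open vSwitch or DPDK; it only assumes a POSIX shell.

```shell
# Hypothetical helper (not shipped with OVS or DPDK): turn a list of CPU IDs
# into the hexadecimal bitmask format used by options such as pmd-cpu-mask.
cpus_to_mask() {
    mask=0
    for cpu in "$@"; do
        # Set the bit whose position equals the CPU ID.
        mask=$(( mask | (1 << cpu) ))
    done
    printf '0x%x\n' "$mask"
}

# CPUs 1 and 2 -> bits 1 and 2 -> binary 110.
cpus_to_mask 1 2
```

Under the same assumptions, the result could be passed on directly, for instance `ovs-vsctl set Open_vSwitch . "other_config:pmd-cpu-mask=$(cpus_to_mask 1 2)"`.
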
The `ovs-vsctl` example above uses two Rx Queues and runs PMD threads on CPUs 1 and 2.

> **See also**:
> Check the links to "EAL Command-line Options" and "Open vSwitch DPDK installation" at the end of this document for more information.

As usual with tunings, you need to know your system and workload really well -- so please verify any tunings with workloads matching your real use case.

## Support and troubleshooting

DPDK is a fast-evolving project. In any search for support and/or further guides, we highly recommend first checking whether they apply to the current version.

You can check if your issue is known on:

- [DPDK Mailing Lists](http://dpdk.org/ml)
- For Open vSwitch-DPDK, the [Open vSwitch Mailing Lists](http://openvswitch.org/mlists)
- Known issues in [DPDK Launchpad Area](https://bugs.launchpad.net/ubuntu/+source/dpdk)
- Join the IRC channels \#DPDK or \#openvswitch on freenode.

Issues are often due to missing small details in the general setup. Later on, these missing details cause problems which can be hard to track down to their root cause.

A common case seems to be the "could not open network device dpdk0 (No such device)" issue. This occurs rather late when setting up a port in Open vSwitch with DPDK, but the root cause (most of the time) lies very early in the setup and initialisation. Here is an example of what proper initialisation of a device looks like -- this can be found in the `syslog/journal` when starting Open vSwitch with DPDK enabled.

```
ovs-ctl[3560]: EAL: PCI device 0000:04:00.1 on NUMA socket 0
ovs-ctl[3560]: EAL: probe driver: 8086:1528 rte_ixgbe_pmd
ovs-ctl[3560]: EAL: PCI memory mapped at 0x7f2140000000
ovs-ctl[3560]: EAL: PCI memory mapped at 0x7f2140200000
```

If this is missing, whether because of ignored cards, failed initialisation or other reasons, there will be no DPDK device to refer to later on. Unfortunately, the logging is spread across `syslog/journal` and the `openvswitch` log.
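
When scanning the journal for the EAL probe messages described above, a small filter can help. The sketch below is a hypothetical helper (`dpdk_probe_summary` is not part of any package) that reads log text on standard input and reports, per PCI address, whether the EAL memory-mapped the device or skipped it. It assumes the message wording shown in this document, which may differ between DPDK versions.

```shell
# Hypothetical helper: summarise EAL probe results per PCI device.
# Assumes the EAL message wording quoted in this document.
dpdk_probe_summary() {
    awk '
        /EAL: +PCI device/ {
            # The PCI address is the field right after the word "device".
            for (i = 1; i < NF; i++) if ($i == "device") dev = $(i + 1)
        }
        # Device is still bound to a kernel driver that DPDK cannot use.
        /EAL: +Not managed by a supported kernel driver/ && dev != "" { state[dev] = "skipped" }
        # Device was picked up and its memory was mapped.
        /EAL: +PCI memory mapped/ && dev != "" { state[dev] = "mapped" }
        END { for (d in state) print d, state[d] }
    ' | sort
}
```

It could be fed from the journal, for example with `journalctl -b --no-pager | dpdk_probe_summary`; a device that never shows up as `mapped` will not be available as a DPDK port later.
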
To enable some cross-checking, here is an example of what can be found in these logs, relative to the entered command.

```
#Note: This log was taken with dpdk 2.2 and openvswitch 2.5 but still looks quite similar (a bit extended) these days
Captions:
CMD: that you enter
SYSLOG: (Including EAL and OVS Messages)
OVS-LOG: (Openvswitch messages)

#PREPARATION
Bind an interface to DPDK UIO drivers, make Hugepages available, enable DPDK on OVS

CMD: sudo service openvswitch-switch restart

SYSLOG:
2016-01-22T08:58:31.372Z|00003|daemon_unix(monitor)|INFO|pid 3329 died, killed (Terminated), exiting
2016-01-22T08:58:33.377Z|00002|vlog|INFO|opened log file /var/log/openvswitch/ovs-vswitchd.log
2016-01-22T08:58:33.381Z|00003|ovs_numa|INFO|Discovered 12 CPU cores on NUMA node 0
2016-01-22T08:58:33.381Z|00004|ovs_numa|INFO|Discovered 1 NUMA nodes and 12 CPU cores
2016-01-22T08:58:33.381Z|00005|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connecting...
2016-01-22T08:58:33.383Z|00006|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connected
2016-01-22T08:58:33.386Z|00007|bridge|INFO|ovs-vswitchd (Open vSwitch) 2.5.0

OVS-LOG:
systemd[1]: Stopping Open vSwitch...
systemd[1]: Stopped Open vSwitch.
systemd[1]: Stopping Open vSwitch Internal Unit...
ovs-ctl[3541]: * Killing ovs-vswitchd (3329)
ovs-ctl[3541]: * Killing ovsdb-server (3318)
systemd[1]: Stopped Open vSwitch Internal Unit.
systemd[1]: Starting Open vSwitch Internal Unit...
ovs-ctl[3560]: * Starting ovsdb-server
ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --no-wait -- init -- set Open_vSwitch . db-version=7.12.1
ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --no-wait set Open_vSwitch . ovs-version=2.5.0 "external-ids:system-id=\"e7c5ba80-bb14-45c1-b8eb-628f3ad03903\"" "system-type=\"Ubuntu\"" "system-version=\"16.04-xenial\""
ovs-ctl[3560]: * Configuring Open vSwitch system IDs
ovs-ctl[3560]: 2016-01-22T08:58:31Z|00001|dpdk|INFO|No -vhost_sock_dir provided - defaulting to /var/run/openvswitch
ovs-vswitchd: ovs|00001|dpdk|INFO|No -vhost_sock_dir provided - defaulting to /var/run/openvswitch
ovs-ctl[3560]: EAL: Detected lcore 0 as core 0 on socket 0
ovs-ctl[3560]: EAL: Detected lcore 1 as core 1 on socket 0
ovs-ctl[3560]: EAL: Detected lcore 2 as core 2 on socket 0
ovs-ctl[3560]: EAL: Detected lcore 3 as core 3 on socket 0
ovs-ctl[3560]: EAL: Detected lcore 4 as core 4 on socket 0
ovs-ctl[3560]: EAL: Detected lcore 5 as core 5 on socket 0
ovs-ctl[3560]: EAL: Detected lcore 6 as core 0 on socket 0
ovs-ctl[3560]: EAL: Detected lcore 7 as core 1 on socket 0
ovs-ctl[3560]: EAL: Detected lcore 8 as core 2 on socket 0
ovs-ctl[3560]: EAL: Detected lcore 9 as core 3 on socket 0
ovs-ctl[3560]: EAL: Detected lcore 10 as core 4 on socket 0
ovs-ctl[3560]: EAL: Detected lcore 11 as core 5 on socket 0
ovs-ctl[3560]: EAL: Support maximum 128 logical core(s) by configuration.
ovs-ctl[3560]: EAL: Detected 12 lcore(s)
ovs-ctl[3560]: EAL: VFIO modules not all loaded, skip VFIO support...
ovs-ctl[3560]: EAL: Setting up physically contiguous memory...
ovs-ctl[3560]: EAL: Ask a virtual area of 0x100000000 bytes
ovs-ctl[3560]: EAL: Virtual area found at 0x7f2040000000 (size = 0x100000000)
ovs-ctl[3560]: EAL: Requesting 4 pages of size 1024MB from socket 0
ovs-ctl[3560]: EAL: TSC frequency is ~2397202 KHz
ovs-vswitchd[3592]: EAL: TSC frequency is ~2397202 KHz
ovs-vswitchd[3592]: EAL: Master lcore 0 is ready (tid=fc6cbb00;cpuset=[0])
ovs-vswitchd[3592]: EAL: PCI device 0000:04:00.0 on NUMA socket 0
ovs-vswitchd[3592]: EAL: probe driver: 8086:1528 rte_ixgbe_pmd
ovs-vswitchd[3592]: EAL: Not managed by a supported kernel driver, skipped
ovs-vswitchd[3592]: EAL: PCI device 0000:04:00.1 on NUMA socket 0
ovs-vswitchd[3592]: EAL: probe driver: 8086:1528 rte_ixgbe_pmd
ovs-vswitchd[3592]: EAL: PCI memory mapped at 0x7f2140000000
ovs-vswitchd[3592]: EAL: PCI memory mapped at 0x7f2140200000
ovs-ctl[3560]: EAL: Master lcore 0 is ready (tid=fc6cbb00;cpuset=[0])
ovs-ctl[3560]: EAL: PCI device 0000:04:00.0 on NUMA socket 0
ovs-ctl[3560]: EAL: probe driver: 8086:1528 rte_ixgbe_pmd
ovs-ctl[3560]: EAL: Not managed by a supported kernel driver, skipped
ovs-ctl[3560]: EAL: PCI device 0000:04:00.1 on NUMA socket 0
ovs-ctl[3560]: EAL: probe driver: 8086:1528 rte_ixgbe_pmd
ovs-ctl[3560]: EAL: PCI memory mapped at 0x7f2140000000
ovs-ctl[3560]: EAL: PCI memory mapped at 0x7f2140200000
ovs-vswitchd[3592]: PMD: eth_ixgbe_dev_init(): MAC: 4, PHY: 3
ovs-vswitchd[3592]: PMD: eth_ixgbe_dev_init(): port 0 vendorID=0x8086 deviceID=0x1528
ovs-ctl[3560]: PMD: eth_ixgbe_dev_init(): MAC: 4, PHY: 3
ovs-ctl[3560]: PMD: eth_ixgbe_dev_init(): port 0 vendorID=0x8086 deviceID=0x1528
ovs-ctl[3560]: Zone 0: name:, phys:0x83fffdec0, len:0x2080, virt:0x7f213fffdec0, socket_id:0, flags:0
ovs-ctl[3560]: Zone 1: name:, phys:0x83fd73d40, len:0x28a0c0, virt:0x7f213fd73d40, socket_id:0, flags:0
ovs-ctl[3560]: Zone 2: name:, phys:0x83fd43380, len:0x2f700, virt:0x7f213fd43380, socket_id:0, flags:0
ovs-ctl[3560]: * Starting ovs-vswitchd
ovs-ctl[3560]: * Enabling remote OVSDB managers
systemd[1]: Started Open vSwitch Internal Unit.
systemd[1]: Starting Open vSwitch...
systemd[1]: Started Open vSwitch.

CMD: sudo ovs-vsctl add-br ovsdpdkbr0 -- set bridge ovsdpdkbr0 datapath_type=netdev

SYSLOG:
2016-01-22T08:58:56.344Z|00008|memory|INFO|37256 kB peak resident set size after 24.5 seconds
2016-01-22T08:58:56.346Z|00009|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports recirculation
2016-01-22T08:58:56.346Z|00010|ofproto_dpif|INFO|netdev@ovs-netdev: MPLS label stack length probed as 3
2016-01-22T08:58:56.346Z|00011|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports unique flow ids
2016-01-22T08:58:56.346Z|00012|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath does not support ct_state
2016-01-22T08:58:56.346Z|00013|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath does not support ct_zone
2016-01-22T08:58:56.346Z|00014|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath does not support ct_mark
2016-01-22T08:58:56.346Z|00015|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath does not support ct_label
2016-01-22T08:58:56.360Z|00016|bridge|INFO|bridge ovsdpdkbr0: added interface ovsdpdkbr0 on port 65534
2016-01-22T08:58:56.361Z|00017|bridge|INFO|bridge ovsdpdkbr0: using datapath ID 00005a4a1ed0a14d
2016-01-22T08:58:56.361Z|00018|connmgr|INFO|ovsdpdkbr0: added service controller "punix:/var/run/openvswitch/ovsdpdkbr0.mgmt"

OVS-LOG:
ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl add-br ovsdpdkbr0 -- set bridge ovsdpdkbr0 datapath_type=netdev
systemd-udevd[3607]: Could not generate persistent MAC address for ovs-netdev: No such file or directory
kernel: [50165.886554] device ovs-netdev entered promiscuous mode
kernel: [50165.901261] device ovsdpdkbr0 entered promiscuous mode

CMD: sudo ovs-vsctl add-port ovsdpdkbr0 dpdk0 -- set Interface dpdk0 type=dpdk

SYSLOG:
2016-01-22T08:59:06.369Z|00019|memory|INFO|peak resident set size grew 155% in last 10.0 seconds, from 37256 kB to 95008 kB
2016-01-22T08:59:06.369Z|00020|memory|INFO|handlers:4 ports:1 revalidators:2 rules:5
2016-01-22T08:59:30.989Z|00021|dpdk|INFO|Port 0: 8c:dc:d4:b3:6d:e9
2016-01-22T08:59:31.520Z|00022|dpdk|INFO|Port 0: 8c:dc:d4:b3:6d:e9
2016-01-22T08:59:31.521Z|00023|dpif_netdev|INFO|Created 1 pmd threads on numa node 0
2016-01-22T08:59:31.522Z|00001|dpif_netdev(pmd16)|INFO|Core 0 processing port 'dpdk0'
2016-01-22T08:59:31.522Z|00024|bridge|INFO|bridge ovsdpdkbr0: added interface dpdk0 on port 1
2016-01-22T08:59:31.522Z|00025|bridge|INFO|bridge ovsdpdkbr0: using datapath ID 00008cdcd4b36de9
2016-01-22T08:59:31.523Z|00002|dpif_netdev(pmd16)|INFO|Core 0 processing port 'dpdk0'

OVS-LOG:
ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl add-port ovsdpdkbr0 dpdk0 -- set Interface dpdk0 type=dpdk
ovs-vswitchd[3595]: PMD: ixgbe_dev_tx_queue_setup(): sw_ring=0x7f211a79ebc0 hw_ring=0x7f211a7a6c00 dma_addr=0x81a7a6c00
ovs-vswitchd[3595]: PMD: ixgbe_set_tx_function(): Using simple tx code path
ovs-vswitchd[3595]: PMD: ixgbe_set_tx_function(): Vector tx enabled.
ovs-vswitchd[3595]: PMD: ixgbe_dev_rx_queue_setup(): sw_ring=0x7f211a78a6c0 sw_sc_ring=0x7f211a786580 hw_ring=0x7f211a78e800 dma_addr=0x81a78e800
ovs-vswitchd[3595]: PMD: ixgbe_set_rx_function(): Vector rx enabled, please make sure RX burst size no less than 4 (port=0).
ovs-vswitchd[3595]: PMD: ixgbe_dev_tx_queue_setup(): sw_ring=0x7f211a79ebc0 hw_ring=0x7f211a7a6c00 dma_addr=0x81a7a6c00
...

CMD: sudo ovs-vsctl add-port ovsdpdkbr0 vhost-user-1 -- set Interface vhost-user-1 type=dpdkvhostuser

OVS-LOG:
2016-01-22T09:00:35.145Z|00026|dpdk|INFO|Socket /var/run/openvswitch/vhost-user-1 created for vhost-user port vhost-user-1
2016-01-22T09:00:35.145Z|00003|dpif_netdev(pmd16)|INFO|Core 0 processing port 'dpdk0'
2016-01-22T09:00:35.145Z|00004|dpif_netdev(pmd16)|INFO|Core 0 processing port 'vhost-user-1'
2016-01-22T09:00:35.145Z|00027|bridge|INFO|bridge ovsdpdkbr0: added interface vhost-user-1 on port 2

SYSLOG:
ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl add-port ovsdpdkbr0 vhost-user-1 -- set Interface vhost-user-1 type=dpdkvhostuser
ovs-vswitchd[3595]: VHOST_CONFIG: socket created, fd:46
ovs-vswitchd[3595]: VHOST_CONFIG: bind to /var/run/openvswitch/vhost-user-1

Eventually we can see the poll thread in top
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3595 root 10 -10 4975344 103936 9916 S 100.0 0.3 33:13.56 ovs-vswitchd
```

## Resources

- [DPDK documentation](http://dpdk.org/doc)
- [Release Notes matching the version packages in Ubuntu 16.04](http://dpdk.org/doc/guides/rel_notes/release_2_2.html)
- [Linux DPDK user getting started](http://dpdk.org/doc/guides/linux_gsg/index.html)
- [EAL command-line options](http://dpdk.org/doc/guides/testpmd_app_ug/run_app.html)
- [DPDK API documentation](http://dpdk.org/doc/api/)
- [Open vSwitch DPDK installation](https://github.com/openvswitch/ovs/blob/branch-2.5/INSTALL.DPDK.md)
- [Wikipedia's definition of DPDK](https://en.wikipedia.org/wiki/Data_Plane_Development_Kit)