Changelog for openvswitch2.16-2.16.0-80.el8s.x86_64.rpm:
* Tue Jun 07 2022 Open vSwitch CI - 2.16.0-80
- Merging upstream branch-2.16 [RH git: 45dcf738b0]
    Commit list:
    87922569f3 ofproto-dpif-xlate: Fix internal CT state for non-recirc traffic.
    51aa8dd106 classifier: Adjust segment boundary to execute prerequisite processing. (#2081773)

* Tue May 31 2022 Open vSwitch CI - 2.16.0-79
- Merging upstream branch-2.16 [RH git: c224775aed]
    Commit list:
    840c3fcb12 ofproto-dpif: Fix meter use-after-free.
    77c89b0d25 ovs-rcu: Add ovsrcu_barrier.

* Thu May 26 2022 Ilya Maximets - 2.16.0-78
- Merging upstream branch-2.16 [RH git: d7d5f09849]
    Commit list:
    c8c78a76e5 ovsdb: raft: Fix transaction double commit due to lost leadership. (#2046340)
    2809af022a Revert "odp-util: Always report ODP_FIT_TOO_LITTLE for IGMP."
    90e31552be ofproto-dpif: Trigger revalidation if ct tp changes.

* Wed May 25 2022 Open vSwitch CI - 2.16.0-77
- Merging upstream branch-2.16 [RH git: 3e3d3725d3]
    Commit list:
    72bad27674 Carefully release NBL in Windows

* Wed May 18 2022 Open vSwitch CI - 2.16.0-76
- Merging upstream branch-2.16 [RH git: 72426100fe]
    Commit list:
    6a304c7866 tests: Properly kill ovsdb test processes.
    44dfae2991 ovs-save: Get highest ofp version error.
    63754ac391 netdev-linux: Properly access 32-bit aligned rtnl_link_stats64 structs.

* Wed May 04 2022 Open vSwitch CI - 2.16.0-75
- Merging upstream branch-2.16 [RH git: 0c22edcd05]
    Commit list:
    df77b74438 ofproto-dpif-xlate: Remove mirror assert.
    c81571d602 netdev-dpdk: Fix tx drops statistic for a down netdev.

* Thu Apr 28 2022 Timothy Redaelli - 2.16.0-74
- vhost: fix queue number check when setting inflight FD [RH git: d084ce15a7]
    [ upstream commit 6442c329b9d2ded0f44b27d2016aaba8ba5844c5 ]

    In function vhost_user_set_inflight_fd, queue number in inflight message is used to access virtqueue. However, queue number could be larger than VHOST_MAX_VRING and cause write OOB as this number will be used to write inflight info in virtqueue structure.
    This patch checks the queue number to avoid the issue and also make sure virtqueues are allocated before setting inflight information.

    Fixes: ad0a4ae491fe ("vhost: checkout resubmit inflight information")
    Reported-by: Wenxiang Qian
    Signed-off-by: Chenbo Xia
    Reviewed-by: Maxime Coquelin

* Thu Apr 28 2022 Timothy Redaelli - 2.16.0-73
- vhost: fix FD leak with inflight messages [RH git: fafbd8f642]
    [ upstream commit af74f7db384ed149fe42b21dbd7975f8a54ef227 ]

    Even if unlikely, a buggy vhost-user master might attach fds to inflight messages. Add checks like for other types of vhost-user messages.

    Fixes: d87f1a1cb7b6 ("vhost: support inflight info sharing")
    Signed-off-by: David Marchand
    Reviewed-by: Maxime Coquelin

* Wed Apr 27 2022 Open vSwitch CI - 2.16.0-72
- Merging upstream branch-2.16 [RH git: 1c2e3ff275]
    Commit list:
    a51dd4685d ofproto-dpif-xlate: Clear out vlan flow fields while processing native tunnel. (#393566 2060552)

* Tue Apr 26 2022 Open vSwitch CI - 2.16.0-71
- Merging upstream branch-2.16 [RH git: a0490a292c]
    Commit list:
    271bea0ee0 ofproto-xlate: Fix crash when forwarding packet between legacy_l3 tunnels.
    9f9d59aeae system-traffic: Fix fragment reassembly with L3 L4 protocol information.

* Thu Apr 21 2022 Timothy Redaelli - 2.16.0-70
- Really set RTE_ETH_MAXPORTS to 1024 [RH git: 104da44ad6]
    Fixes: 81ff7c5a60f0 ("Change RTE_ETH_MAXPORTS to 1024")

* Mon Apr 18 2022 Open vSwitch CI - 2.16.0-69
- Merging upstream branch-2.16 [RH git: c9969bac2f]
    Commit list:
    2afa9d2285 cirrus: Update FreeBSD versions.

* Fri Apr 08 2022 Open vSwitch CI - 2.16.0-68
- Merging upstream branch-2.16 [RH git: 2ee98fa0ff]
    Commit list:
    be8b35fddf Prepare for 2.16.4.
    d8639f81c1 Set release date for 2.16.3.
    71a5a38c83 NEWS: Highlight libopenvswitch API change caused by UB fixes.

* Wed Apr 06 2022 Open vSwitch CI - 2.16.0-67
- Merging upstream branch-2.16 [RH git: 4936a7194b]
    Commit list:
    2c666b9791 netdev-offload-tc: Check for ct_state flag combinations that are not offloadable.

* Mon Apr 04 2022 Open vSwitch CI - 2.16.0-66
- Merging upstream branch-2.16 [RH git: 1418edaf18]
    Commit list:
    26189fd264 dpif-netdev: Fix dp_netdev_get_pmd() function getting correct core_id.
    a5af081bc6 alb.at: Add tests for cross-numa polling.
    78c8f8a7f6 dpif-netdev: Fix PMD auto load balance with pmd-rxq-isolate.
    6731e581c4 pmd.at: Add tests for multi non-local numa pmds.
    60652bb3eb dpif-netdev: Fix non-local numa selection for more than two numas.
    c113039503 ofproto-dpif-xlate: Fix NULL pointer dereference in xlate_normal().

* Wed Mar 30 2022 Open vSwitch CI - 2.16.0-65
- Merging upstream branch-2.16 [RH git: b4c45acc47]
    Commit list:
    7644c924e8 sparse: bump recommended version and include headers.
    20b87feba9 rculist: use multi-variable helpers for loop macros.
    05a440fafb hindex: use multi-variable iterators.
    04dca15004 cmap: use multi-variable iterators.
    80e64f712d hmap: implement UB-safe hmap pop iterator.
    3b4b0af690 hmap: use multi-variable helpers for hmap loops.
    05e899ea8f list: use multi-variable helpers for list loops.
    d2406399ae util: add helpers to overload SAFE macro.
    f22f9d947a util: add safe multi-variable iterators.
    72c3e8627c util: add multi-variable loop iterator macros.

* Wed Mar 30 2022 Open vSwitch CI - 2.16.0-64
- Merging upstream branch-2.16 [RH git: 32008eb008]
    Commit list:
    1570924c3f ovsdb: raft: Fix inability to read the database with DNS host names. (#2055097)

* Mon Mar 28 2022 Open vSwitch CI - 2.16.0-63
- Merging upstream branch-2.16 [RH git: a3c48a5aeb]
    Commit list:
    c50a0f080d system-traffic.at: Fix flaky DNAT load balancing test.
    9928344ea7 dpif-netdev: Keep orig_in_port as a field of the flow.
    aee2e66287 tests: Fix incorrect usage of OVS_WAIT_UNTIL.
    5881545bd0 odp-util: Fix output for tc to be equal to kernel.
    4a80c322f9 netdev-offload-tc: Fix IP and port ranges in flower returns.
    49e0bb72bc netdev-offload-tc: Fix use of ICMP values instead of masks defines.
    0fb545c7d9 netdev-offload-tc: Always include conntrack information to tc.
    13a3f57976 netdev-offload-tc: Check for valid netdev ifindex in flow_put.
    6e72fd96d3 netdev-offload-tc: Set the correct VLAN_VID and VLAN_PCP masks.
    e43157f303 netdev-offload-tc: Add debug logs on tc rule verify failures.
    37297e7ee6 tc: Keep header rewrite actions order.
    823be413ec dpdk: Use DPDK 20.11.4 release

* Fri Mar 11 2022 Open vSwitch CI - 2.16.0-62
- Merging upstream branch-2.16 [RH git: 561b178a3d]
    Commit list:
    47b5374280 system-dpdk: Fix mfex autovalidator tests.
    98a74bd487 ofp-prop: Silence the 'may be uninitialized' warning.
    ab4f30e02b ovsdb-cluster.at: Avoid test failures due to different hashing.

* Mon Mar 07 2022 Open vSwitch CI - 2.16.0-61
- Merging upstream branch-2.16 [RH git: 0e0cf86cf5]
    Commit list:
    d5d2bd3c09 ofproto: Use xlate map for uuid lookups.
    d158b29fb6 ofproto: Add refcount to ofproto to fix ofproto use-after-free.

* Sat Mar 05 2022 Open vSwitch CI - 2.16.0-60
- Merging upstream branch-2.16 [RH git: 67312d8bee]
    Commit list:
    43882d8372 ofproto-dpif: Trigger revalidation when ipfix config set.
    218bb05fb2 system-tso: Skip encap tests when userspace TSO is enabled.

* Fri Mar 04 2022 Open vSwitch CI - 2.16.0-59
- Merging upstream branch-2.16 [RH git: 832e52bea7]
    Commit list:
    1515e085b9 tc: Fix stats byte count on fragmented packets.
    7a3b46d517 compat: Add gen_stats include to define tc hw stats.

* Tue Mar 01 2022 Timothy Redaelli - 2.16.0-58
- Change RTE_ETH_MAXPORTS to 1024 [RH git: 81ff7c5a60] (#2059758)
    Resolves: #2059758

* Fri Feb 25 2022 Open vSwitch CI - 2.16.0-57
- Merging upstream branch-2.16 [RH git: 897937f6d3]
    Commit list:
    9598f0529c ovsdb: raft: Fix inability to join the cluster after interrupted attempt. (#2033514)

* Fri Feb 25 2022 Open vSwitch CI - 2.16.0-56
- Merging upstream branch-2.16 [RH git: e4d6d108a3]
    Commit list:
    fb4767b472 dpif-netdev: Fix a race condition in deletion of offloaded flows.
    3e72eae031 dpif-netdev: Move port flush after datapath reconfiguration.

* Thu Feb 24 2022 Open vSwitch CI - 2.16.0-55
- Merging upstream branch-2.16 [RH git: 970214133d]
    Commit list:
    0168e7989d reconnect: Fix broken inactivity probe if there is no other reason to wake up.

* Thu Feb 24 2022 Open vSwitch CI - 2.16.0-54
- Merging upstream branch-2.16 [RH git: ac5da61d03]
    Commit list:
    dee52795e6 datapath-windows: Fix NXM_OF_IP_TOS issue

* Wed Feb 16 2022 Open vSwitch CI - 2.16.0-53
- Merging upstream branch-2.16 [RH git: b2df459e49]
    Commit list:
    dcde9771c5 ovsdb-idl: Fix use-after-free when destroying an IDL loop.

* Wed Feb 16 2022 Open vSwitch CI - 2.16.0-52
- Merging upstream branch-2.16 [RH git: bba08b5363]
    Commit list:
    8e23c06f24 dpif-netdev-dpcls: Make subtable reprobe thread-safe.
    ac0e3dd3ba ci: Fix typo in variable name.
    fc25e0397a dp-packet: Ensure packet base is always non-NULL.
    dbae56e702 bfd: lldp: stp: Fix misaligned packet field access.
    ee17b06cf9 ovsdb-idlc: Avoid accessing member within NULL idl index cursors.
    1d799a5d17 stopwatch: Fix buffer underflow when computing percentiles.

* Wed Feb 09 2022 Open vSwitch CI - 2.16.0-51
- Merging upstream branch-2.16 [RH git: 7b6570c65f]
    Commit list:
    0954c2911d ofproto: Fix ipfix not always sampling on egress. (#2016346)

* Wed Feb 09 2022 Open vSwitch CI - 2.16.0-50
- Merging upstream branch-2.16 [RH git: c5ad7f71c5]
    Commit list:
    867e586b45 tc: Fix incorrect TC rule for decap+encap datapath flow.

* Tue Feb 08 2022 Open vSwitch CI - 2.16.0-49
- Merging upstream branch-2.16 [RH git: 4541c91b99]
    Commit list:
    418e6a0b8e dpif-netdev: fix vlan and ipv4 parsing in avx512

* Mon Feb 07 2022 Michael Santana - 2.16.0-48
- Merging upstream branch-2.16 [RH git: 9d51785142]
    Commit list:
    1ec567a752 ci: Install wheel before installing any other python packages.
    031a99cef0 odp-util: Fix tunnel key attr for GTP-U.
    558699c73c ovsdb-idl: Only process successful txn in ovsdb_idl_loop_run.

* Wed Feb 02 2022 Open vSwitch CI - 2.16.0-47
- Merging upstream branch-2.16 [RH git: 6e6f66ffd0]
    Commit list:
    0276bdb30a ofproto-dpif-upcall: Fix n_revalidators on upcall show.

* Wed Feb 02 2022 Open vSwitch CI - 2.16.0-46
- Merging upstream branch-2.16 [RH git: 513117cbb0]
    Commit list:
    16575362dc acinclude: Detect avx512 vpopcntdq compiler support.

* Tue Feb 01 2022 Ilya Maximets - 2.16.0-45
- ovsdb: transaction: Keep one entry in the transaction history. [RH git: 7665f42d12] (#2044621)
    commit 6e13565dd32fb2cf5517f51ca06956e2052c4bba
    Author: Ilya Maximets
    Date: Sun Dec 19 15:09:38 2021 +0100

    ovsdb: transaction: Keep one entry in the transaction history.

    If a single transaction exceeds the size of the whole database (e.g., a lot of rows got removed and new ones added), transaction history will be drained. This leads to sending UUID_ZERO to the clients as the last transaction id in the next monitor update, because monitor doesn't know what was the actual last transaction id. In case of a re-connect that will cause re-downloading of the whole database, since the client's last_id will be out of sync.

    One solution would be to store the last transaction ID separately from the actual transactions, but that will require a careful management in cases where database gets reset and the history needs to be cleared. Keeping the one last transaction instead to avoid the problem. That should not be a big concern in terms of memory consumption, because this last transaction will be removed from the history once the next transaction appeared. This is also not a concern for a fast re-sync, because this last transaction will not be used for the monitor reply; it's either client already has it, so no need to send, or it's a history miss.

    The test updated to not check the number of atoms if there is only one transaction in the history.
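
    The history rule described in the 2.16.0-45 entry above can be illustrated with a minimal Python sketch. It is not the ovsdb-server implementation: the class name, method names, and the (txn_id, n_atoms) bookkeeping are hypothetical. The depth limit of 100 and the rule that the history may not hold more atoms than the database come from the 2.16.0-27 entry of this changelog; the rule that the newest transaction is always kept comes from the entry above.

```python
from collections import deque


class TxnHistory:
    """Hypothetical, simplified model of a bounded transaction history."""

    def __init__(self, max_depth=100):
        self.entries = deque()   # (txn_id, n_atoms), oldest first
        self.n_atoms = 0         # total atoms held by the history
        self.max_depth = max_depth

    def add(self, txn_id, txn_atoms, db_atoms):
        """Record a transaction, then trim the oldest entries.

        `db_atoms` is the number of atoms currently in the database.
        """
        self.entries.append((txn_id, txn_atoms))
        self.n_atoms += txn_atoms
        # Trim while the history is too deep or larger than the database,
        # but never drop the newest entry, so the last id stays known.
        while len(self.entries) > 1 and (
            len(self.entries) > self.max_depth or self.n_atoms > db_atoms
        ):
            _, dropped = self.entries.popleft()
            self.n_atoms -= dropped

    def last_id(self):
        """Return the id of the newest transaction, or None if empty."""
        return self.entries[-1][0] if self.entries else None
```

    In this model a transaction larger than the whole database still leaves one entry behind, so a monitor update can report a real last transaction id rather than a zero UUID.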

    Fixes: 317b1bfd7dd3 ("ovsdb: Don't let transaction history grow larger than the database.")
    Acked-by: Mike Pattrick
    Acked-by: Han Zhou
    Signed-off-by: Ilya Maximets
    Reported-at: https://bugzilla.redhat.com/2044621
    Signed-off-by: Ilya Maximets

* Mon Jan 31 2022 Open vSwitch CI - 2.16.0-44
- Merging upstream branch-2.16 [RH git: d202cd6da1]
    Commit list:
    34c830c540 ovsdb-idl: ovsdb_idl_loop_destroy must also destroy the committing txn.
    13009736b2 ovsdb-cs: Clear last_id on reconnect if condition changes in-flight.
    017e2ae50e ofp-flow: Skip flow reply if it exceeds the maximum message size.
    e0c6f92a95 ovsdb-cs: Fix ignoring of the last id from the initial monitor reply. (#2044624)

* Fri Jan 28 2022 Ilya Maximets - 2.16.0-43
- ovsdb: storage: Randomize should_snapshot checks when the minimum time passed. [RH git: abe61535ca] (#2044614)
    commit 339f97044e3c2312fbb65b932fa14a181acf40d5
    Author: Ilya Maximets
    Date: Mon Dec 13 16:43:33 2021 +0100

    ovsdb: storage: Randomize should_snapshot checks when the minimum time passed.

    Snapshots are scheduled for every 10-20 minutes. It's a random value in this interval for each server. Once the time is up, but the maximum time (24 hours) not reached yet, ovsdb will start checking if the log grew a lot on every iteration. Once the growth is detected, compaction is triggered.

    OTOH, it's very common for an OVSDB cluster to not have the log growing very fast. If the log didn't grow 2x in 20 minutes, the randomness of the initial scheduled time is gone and all the servers are checking if they need to create snapshot on every iteration. And since all of them are part of the same cluster, their logs are growing with the same speed. Once the critical mass is reached, all the servers will start creating snapshots at the same time. If the database is big enough, that might leave the cluster unresponsive for an extended period of time (e.g.
    10-15 seconds for OVN_Southbound database in a larger scale OVN deployment) until the compaction completed.

    Fix that by re-scheduling a quick retry if the minimal time already passed. Effectively, this will work as a randomized 1-2 min delay between checks, so the servers will not synchronize.

    Scheduling function updated to not change the upper limit on quick reschedules to avoid delaying the snapshot creation indefinitely. Currently quick re-schedules are only used for the error cases, and there is always a 'slow' re-schedule after the successful compaction. So, the change of a scheduling function doesn't change the current behavior much.

    Signed-off-by: Ilya Maximets
    Acked-by: Han Zhou
    Acked-by: Dumitru Ceara
    Reported-at: https://bugzilla.redhat.com/2044614
    Signed-off-by: Ilya Maximets

* Fri Jan 28 2022 Ilya Maximets - 2.16.0-42
- raft: Only allow followers to snapshot. [RH git: 915efc8c00] (#2044614)
    commit bf07cc9cdb2f37fede8c0363937f1eb9f4cfd730
    Author: Dumitru Ceara
    Date: Mon Dec 13 20:46:03 2021 +0100

    raft: Only allow followers to snapshot.

    Commit 3c2d6274bcee ("raft: Transfer leadership before creating snapshots.") made it such that raft leaders transfer leadership before snapshotting. However, there's still the case when the next leader to be is in the process of snapshotting. To avoid delays in that case too, we now explicitly allow snapshots only on followers. Cluster members will have to wait until the current election is settled before snapshotting.

    Given the following logs taken from an OVN_Southbound 3-server cluster during a scale test:

    S1 (old leader):
      19:07:51.226Z|raft|INFO|Transferring leadership to write a snapshot.
      19:08:03.830Z|ovsdb|INFO|OVN_Southbound: Database compaction took 12601ms
      19:08:03.940Z|raft|INFO|server 8b8d is leader for term 43

    S2 (follower):
      19:08:00.870Z|raft|INFO|server 8b8d is leader for term 43

    S3 (new leader):
      19:07:51.242Z|raft|INFO|received leadership transfer from f5c9 in term 42
      19:07:51.244Z|raft|INFO|term 43: starting election
      19:08:00.805Z|ovsdb|INFO|OVN_Southbound: Database compaction took 9559ms
      19:08:00.869Z|raft|INFO|term 43: elected leader by 2+ of 3 servers

    We see that the leader to be (S3) receives the leadership transfer, initiates the election and immediately after starts a snapshot that takes ~9.5 seconds. During this time, S2 votes for S3 electing it as cluster leader but S3 doesn't effectively become leader until it finishes snapshotting, essentially keeping the cluster without a leader for up to ~9.5 seconds.

    With the current change, S3 will delay compaction and snapshotting until the election is finished.

    The only exception is the case of single-node clusters for which we allow the node to snapshot regardless of role.

    Acked-by: Han Zhou
    Signed-off-by: Dumitru Ceara
    Signed-off-by: Ilya Maximets
    Reported-at: https://bugzilla.redhat.com/2044614
    Signed-off-by: Ilya Maximets

* Wed Jan 26 2022 Open vSwitch CI - 2.16.0-41
- Merging upstream branch-2.16 [RH git: f1ca7b8ac3]
    Commit list:
    2571b1a464 ofproto-dpif: Fix issue with non-reversible actions on a patch ports.

* Fri Jan 21 2022 Open vSwitch CI - 2.16.0-40
- Merging upstream branch-2.16 [RH git: 60b19f443c]
    Commit list:
    07a115f7d9 ovs-monitor-ipsec: Fix generated strongSwan ipsec.conf for IPv6.

* Thu Jan 20 2022 Open vSwitch CI - 2.16.0-39
- Merging upstream branch-2.16 [RH git: 349d687673]
    Commit list:
    f2ee013f73 datapath-windows: Pickup Ct tuple as CT lookup key in function OvsCtSetupLookupCtx

* Tue Jan 18 2022 Open vSwitch CI - 2.16.0-38
- Merging upstream branch-2.16 [RH git: e370e283cf]
    Commit list:
    bd8ebcd10c Documentation: Fix Rx/Tx queue configuration section.

* Mon Jan 17 2022 Open vSwitch CI - 2.16.0-37
- Merging upstream branch-2.16 [RH git: c9297f5ef7]
    Commit list:
    29936a853f ofproto-dpif: Fix memory leak in dpif/show-dp-features appctl.

* Thu Jan 13 2022 Open vSwitch CI - 2.16.0-36
- Merging upstream branch-2.16 [RH git: edae801e00]
    Commit list:
    ba7fffb832 dpif-netdev: Improve loading of packet data for undersized packets.

* Sat Dec 18 2021 Open vSwitch CI - 2.16.0-35
- Merging upstream branch-2.16 [RH git: 6ad0375ff5]
    Commit list:
    2595b7b3d1 Prepare for 2.16.3.
    6caaae525c Set release date for 2.16.2.
    443e3657d7 ofproto-dpif-xlate: Snoop ingress packets and update neigh cache if needed.
    75d2ef9a60 tnl-neigh-cache: Do not refresh the entry while revalidating.
    5d88836566 tnl-neigh-cache: Read/write expires atomically.
    fb42c99c15 dpif-netdev: Improve handling of IP/TCP in avx512 mfex.

* Thu Dec 09 2021 Open vSwitch CI - 2.16.0-34
- Merging upstream branch-2.16 [RH git: 07b9bf085a]
    Commit list:
    f42c484445 compat: handle NF_REPEAT error on nf_conntrack_in.

* Mon Dec 06 2021 Open vSwitch CI - 2.16.0-33
- Merging upstream branch-2.16 [RH git: 8708b55152]
    Commit list:
    3e527f21cf flow: Consider dataofs when parsing TCP packets.
    b537e049ad tests/flowgen: Fix packet data endianness.
    35244b4980 ofproto: Fix resource usage explosion due to removal of large number of flows.
    a201297639 ofproto: Fix resource usage explosion while processing bundled FLOW_MOD.
    cd0133402c tests/flowgen: Fix length field of 802.2 data link header.
    2d65b8ffd2 ovs-lib: Backup and remove existing DB when joining cluster.
    ab01177637 docs/dpdk: Fix install doc.
    38a2129524 ovs-save: Save igmp flows in ofp_parse syntax.
    dc77857ce2 faq: Update OVS/DPDK version table for OVS 2.13/2.14.

* Thu Nov 18 2021 Open vSwitch CI - 2.16.0-32
- Merging upstream branch-2.16 [RH git: e90e06a818]
    Commit list:
    1d8e0f861f ofproto-dpif-xlate: Fix check_pkt_larger incomplete translation.

* Mon Nov 15 2021 Open vSwitch CI - 2.16.0-31
- Merging upstream branch-2.16 [RH git: 77a249d38b]
    Commit list:
    f8f2f7c9cb datapath-windows: Reset flow key after Ipv4 fragments are reassembled

* Wed Nov 10 2021 Timothy Redaelli - 2.16.0-30
- python: Replace pyOpenSSL with ssl. [RH git: 0cd5867531] (#1988429)
    Currently, pyOpenSSL is half-deprecated upstream and so it's removed on some distributions (for example on CentOS Stream 9, https://issues.redhat.com/browse/CS-336), but since OVS only supports Python 3 it's possible to replace pyOpenSSL with "import ssl" included in base Python 3.

    Stream recv and send had to be split as _recv and _send, since SSLError is a subclass of socket.error and so it was not possible to except for SSLWantReadError and SSLWantWriteError in recv and send of SSLStream.

    TCPstream._open cannot be used in SSLStream, since Python ssl module requires the SSL socket to be created before connecting it, so SSLStream._open needs to create the socket, create SSL socket and then connect the SSL socket.

    Reported-by: Timothy Redaelli
    Reported-at: https://bugzilla.redhat.com/1988429
    Signed-off-by: Timothy Redaelli
    Acked-by: Terry Wilson
    Tested-by: Terry Wilson
    Signed-off-by: Ilya Maximets
    Signed-off-by: Timothy Redaelli

* Wed Nov 10 2021 Timothy Redaelli - 2.16.0-29
- python: socket-util: Split inet_open_active function and use connect_ex. [RH git: 2e704b371c]
    In an upcoming patch, PyOpenSSL will be replaced with Python ssl module, but in order to do an async connection with Python ssl module the ssl socket must be created when the socket is created, but before the socket is connected.

    So, inet_open_active function is split in 3 parts:
    - inet_create_socket_active: creates the socket and returns the family and the socket, or (error, None) if some error needs to be returned.
    - inet_connect_active: connect the socket and returns the errno (it returns 0 if errno is EINPROGRESS or EWOULDBLOCK).
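
    The create/connect split described in the list above can be sketched in a few lines of Python. This is a simplified illustration, not the code from python/ovs/socket_util.py: the function names create_socket_active and connect_active are hypothetical stand-ins, the address-family check is deliberately naive, and DSCP handling is omitted.

```python
import errno
import socket


def create_socket_active(style, address):
    """Create an unconnected socket for `address`, a (host, port) pair.

    Returns (family, sock) on success or (errno_value, None) on failure,
    mirroring the return convention described in the changelog entry.
    """
    # Naive family detection for the sketch: a colon means IPv6.
    family = socket.AF_INET6 if ":" in address[0] else socket.AF_INET
    try:
        sock = socket.socket(family, style)
    except OSError as e:
        return e.errno, None
    return family, sock


def connect_active(sock, address):
    """Start a non-blocking connect; return 0 or a positive errno value."""
    sock.setblocking(False)
    # connect_ex() returns an error code instead of raising an exception.
    error = sock.connect_ex(address)
    # A connect that is still in progress is not treated as a failure.
    if error in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
        return 0
    sock.close()
    return error
```

    Keeping the two steps separate leaves room to wrap the unconnected socket (for example with ssl.SSLContext.wrap_socket()) between the calls, which is the ordering the entry says the ssl module requires.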

    connect is replaced by connect_ex, since Python suggests using it for asynchronous connects and it's also cleaner since inet_connect_active returns errno that connect_ex already returns, moreover due to a Python limitation connect cannot be used with ssl module.

    inet_open_active function is changed in order to use the new functions inet_create_socket_active and inet_connect_active.

    Signed-off-by: Timothy Redaelli
    Acked-by: Terry Wilson
    Tested-by: Terry Wilson
    Signed-off-by: Ilya Maximets
    Signed-off-by: Timothy Redaelli

* Wed Nov 10 2021 Timothy Redaelli - 2.16.0-28
- redhat: remove mlx4 support [RH git: 4c846afd24] (#1998122)
    Resolves: #1998122

* Tue Nov 09 2021 Ilya Maximets - 2.16.0-27
- ovsdb: Don't let transaction history grow larger than the database. [RH git: 93d1fa0bdf] (#2012949)
    commit 317b1bfd7dd315e241c158e6d4095002ff391ee3
    Author: Ilya Maximets
    Date: Tue Sep 28 13:17:21 2021 +0200

    ovsdb: Don't let transaction history grow larger than the database.

    If user frequently changes a lot of rows in a database, transaction history could grow way larger than the database itself. This wastes a lot of memory and also makes monitor_cond_since slower than usual monitor_cond if the transaction id is old enough, because re-construction of the changes from a history is slower than just creation of initial database snapshot. This is also the case if user deleted a lot of data, so transaction history still holds all of it while the database itself doesn't.

    In case of current lb-per-service model in ovn-kubernetes, each load-balancer is added to every logical switch/router. Such a transaction touches more than a half of a OVN_Northbound database. And each of these transactions is added to the transaction history. Since transaction history depth is 100, in worst case scenario, it will hold 100 copies of a database increasing memory consumption dramatically.
    In tests with 3000 LBs and 120 LSs, memory goes up to 3 GB, while holding at 30 MB if transaction history disabled in the code.

    Fixing that by keeping count of the number of ovsdb_atom's in the database and not allowing the total number of atoms in transaction history to grow larger than this value. Counting atoms is fairly cheap because we don't need to iterate over them, so it doesn't have significant performance impact. It would be ideal to measure the size of individual atoms, but that will hit the performance. Counting cells instead of atoms is not sufficient, because OVN users are adding hundreds or thousands of atoms to a single cell, so they are largely different in size.

    Signed-off-by: Ilya Maximets
    Acked-by: Han Zhou
    Acked-by: Dumitru Ceara
    Reported-at: https://bugzilla.redhat.com/2012949
    Signed-off-by: Ilya Maximets

* Tue Nov 09 2021 Ilya Maximets - 2.16.0-26
- ovsdb: transaction: Incremental reassessment of weak refs. [RH git: e8a363db49] (#2005958)
    commit 4dbff9f0a68579241ac1a040726be3906afb8fe9
    Author: Ilya Maximets
    Date: Sat Oct 16 03:20:23 2021 +0200

    ovsdb: transaction: Incremental reassessment of weak refs.

    The main idea is to not store list of weak references in the source row, so they all don't need to be re-checked/updated on every modification of that source row. The point is that source row already knows UUIDs of all destination rows stored in the data, so there is not much profit in storing this information somewhere else. If needed, destination row can be looked up and reference can be looked up in the destination row. For the fast lookup, destination row now stores references in a hash map.

    Weak reference structure now contains the table and uuid of a source row instead of a direct pointer. This allows to replace/update the source row without breaking any weak references stored in destination rows. Structure also now contains the key-value pair of atoms that triggered creation of this reference.
    These atoms can be used to quickly subtract removed references from a source row. During reassessment, ovsdb now only needs to care about new added or removed atoms, and atoms that got removed due to removal of the destination rows, but these are marked for reassessment by the destination row.

    ovsdb_datum_subtract() is used to remove atoms that points to removed or incorrect rows, so there is no need to re-sort datum in the end.

    Results of an OVN load-balancer benchmark that adds 3K load-balancers to each of 120 logical switches and 120 logical routers in the OVN sandbox with clustered Northbound database and then removes them:

    Before:

      %CPU  CPU Time  CMD
      86.8  00:16:05  ovsdb-server nb1.db
      44.1  00:08:11  ovsdb-server nb2.db
      43.2  00:08:00  ovsdb-server nb3.db

    After:

      %CPU  CPU Time  CMD
      54.9  00:02:58  ovsdb-server nb1.db
      33.3  00:01:48  ovsdb-server nb2.db
      32.2  00:01:44  ovsdb-server nb3.db

    So, on a cluster leader the processing time dropped by 5.4x, on followers - by 4.5x. More load-balancers - larger the performance difference. There is a slight increase of memory usage, because new reference structure is larger, but the difference is not significant.

    Signed-off-by: Ilya Maximets
    Acked-by: Dumitru Ceara
    Reported-at: https://bugzilla.redhat.com/2005958
    Signed-off-by: Ilya Maximets

* Thu Oct 28 2021 Open vSwitch CI - 2.16.0-25
- Merging upstream branch-2.16 [RH git: f5366890c5]
    Commit list:
    c221c8e613 datapath-windows:Reset PseudoChecksum value only for TX direction offload case

* Wed Oct 27 2021 Open vSwitch CI - 2.16.0-24
- Merging upstream branch-2.16 [RH git: 4682b76694]
    Commit list:
    b79f0369f2 ci: Make linux-prepare trust system installs.

* Mon Oct 25 2021 Open vSwitch CI - 2.16.0-23
- Merging upstream branch-2.16 [RH git: cce913794e]
    Commit list:
    2a4c87f300 Prepare for 2.16.2.
    aaa1439b8e Set release date for 2.16.1.

* Thu Oct 21 2021 Open vSwitch CI - 2.16.0-22
- Merging upstream branch-2.16 [RH git: 29f01c4fdb]
    Commit list:
    108176ab5a github: Stick to python 3.9.

* Tue Oct 19 2021 Open vSwitch CI - 2.16.0-21
- Merging upstream branch-2.16 [RH git: 2546fa9646]
    Commit list:
    5c5e34603b datapath-windows: add layers when adding the deferred actions

* Thu Oct 14 2021 Open vSwitch CI - 2.16.0-20
- Merging upstream branch-2.16 [RH git: d572c95f69]
    Commit list:
    458a4f75f3 ofproto-dpif-xlate: Fix zone set from non-frozen-metadata fields.

* Wed Oct 13 2021 Open vSwitch CI - 2.16.0-19
- Merging upstream branch-2.16 [RH git: 557ca689f7]
    Commit list:
    6d8190584a dpif-netdev: Fix use-after-free on PACKET_OUT of IP fragments.
    44a66cc1d0 tunnel-push-pop.at: Mask source port in tunnel header.

* Tue Oct 12 2021 Open vSwitch CI - 2.16.0-18
- Merging upstream branch-2.16 [RH git: a6c4770398]
    Commit list:
    27a5848a33 ovs-ctl: Add missing description for --ovs-vswitchd-options and --ovsdb-server-options to usage().
    0300d0c0c2 dpdk-stub: Change the ERR log to DBG.
    cdd6dd821d dpif-netlink: Fix feature negotiation for older kernels.
    c2682c42cb dpif-netdev: Fix pmd thread comments to include SMC.
    9377f4a465 python: idl: Avoid sending transactions when the DB is not synced up.

* Tue Oct 12 2021 Open vSwitch CI - 2.16.0-17
- Merging upstream branch-2.16 [RH git: c1145b5236]
    Commit list:
    0fd17fbb09 ipf: release unhandled packets from the batch

* Thu Sep 30 2021 Open vSwitch CI - 2.16.0-16
- Merging upstream branch-2.16 [RH git: 5c05133179]
    Commit list:
    3f692fba98 datapath-windows:adjust Offset when processing packet in POP_VLAN action

* Wed Sep 29 2021 Dumitru Ceara - 2.16.0-15
- ovsdb-data: Deduplicate string atoms. [RH git: 24e7d1140e] (#2006839)
    commit 429b114c5aadee24ccfb16ad7d824f45cdcea75a
    Author: Ilya Maximets
    Date: Wed Sep 22 09:28:50 2021 +0200

    ovsdb-server spends a lot of time cloning atoms for various reasons, e.g. to create a diff of two rows or to clone a row to the transaction. All atoms, except for strings, contain a simple value that could be copied in efficient way, but duplicating strings every time has a significant performance impact.

    Introducing a new reference-counted structure 'ovsdb_atom_string' that allows to not copy strings every time, but just increase a reference counter.

    This change allows to increase transaction throughput in benchmarks up to 2x for standalone databases and 3x for clustered databases, i.e. number of transactions that ovsdb-server can handle per second. It also noticeably reduces memory consumption of ovsdb-server.

    Next step will be to consolidate this structure with json strings, so we will not need to duplicate strings while converting database objects to json and back.

    Signed-off-by: Ilya Maximets
    Acked-by: Dumitru Ceara
    Acked-by: Mark D. Gray
    Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2006839
    Signed-off-by: Dumitru Ceara

* Wed Sep 29 2021 Dumitru Ceara - 2.16.0-14
- ovsdb-data: Add function to apply diff in-place. [RH git: df0e4bda98] (#2006851)
    commit 32b51326ef9c307b4acd0bacafb0218dd1372f3d
    Author: Ilya Maximets
    Date: Thu Sep 23 01:47:24 2021 +0200

    ovsdb_datum_apply_diff() is heavily used in ovsdb transactions, but it's linear in terms of number of comparisons. And it also clones all the atoms along the way. In most cases size of a diff is much smaller than the size of the original datum, this allows to perform the same operation in-place with only O(diff->n * log2(old->n)) comparisons and O(old->n + diff->n) memory copies with memcpy.

    Using this function while applying diffs read from the storage gives a significant performance boost and allows to execute much more transactions per second.

    Signed-off-by: Ilya Maximets
    Acked-by: Mark D. Gray
    Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2006851
    Signed-off-by: Dumitru Ceara

* Wed Sep 29 2021 Dumitru Ceara - 2.16.0-13
- ovsdb-data: Optimize subtraction of sets.
  [RH git: 5bace82405] (#2005483)
    commit bb12b63176389e516ddfefce20dfa165f24430fb
    Author: Ilya Maximets
    Date: Thu Sep 23 01:47:23 2021 +0200

    Current algorithm for ovsdb_datum_subtract looks like this:

      for-each atom in a:
          if atom in b:
              swap(atom, )
              destroy(atom)
      quicksort(a)

    Complexity:

      Na * log2(Nb)  +  (Na - Nb) * log2(Na - Nb)
         Search         Comparisons for quicksort

    It's not optimal, especially because Nb << Na in a vast majority of cases.

    Reversing the search phase to look up atoms from 'b' in 'a', and closing gaps from deleted elements in 'a' by plain memory copy to avoid quicksort.

    Resulted complexity:

      Nb * log2(Na)  +  (Na - Nb)
         Search         Memory copies

    Subtraction is heavily used while executing database transactions. For example, to remove one port from a logical switch in OVN. Complexity of such operation if original logical switch had 100 ports goes down from

      100 * log2(1)   = 100 comparisons for search and
       99 * log2(99)  = 656 comparisons for quicksort
                        ------------------------------
                        756 comparisons in total

    to only

        1 * log2(100) = 7 comparisons for search
          + memory copy of 99 * sizeof (union ovsdb_atom) bytes.

    We could use memmove to close the gaps after removing atoms, but it will lead to 2 memory copies inside the call, while we can perform only one to the temporary 'result' and swap pointers.

    Performance in cases, where sizes of 'a' and 'b' are comparable, should not change. Cases with Nb >> Na should not happen in practice.

    All in all, this change allows ovsdb-server to perform several times more transactions, that removes elements from sets, per second.

    Signed-off-by: Ilya Maximets
    Acked-by: Han Zhou
    Acked-by: Mark D. Gray
    Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2005483
    Signed-off-by: Dumitru Ceara

* Wed Sep 29 2021 Dumitru Ceara - 2.16.0-12
- ovsdb-data: Optimize union of sets.
[RH git: e2a4c7d794] (#2005483) commit 51946d22274cd591dc061358fb507056fbd91420 Author: Ilya Maximets Date: Thu Sep 23 01:47:22 2021 +0200 Current algorithm of ovsdb_datum_union looks like this:

  for-each atom in b:
      if not bin_search(a, atom):
          push(a, clone(atom))
  quicksort(a)

So, the complexity looks like this:

  Nb * log2(Na)  +  Nb    +  (Na + Nb) * log2(Na + Nb)
  Comparisons      clones    Comparisons for quicksort
  for search

ovsdb_datum_union() is heavily used in database transactions while new element is added to a set. For example, if new logical switch port is added to a logical switch in OVN. This is a very common use case where CMS adds one new port to an existing switch that already has, let's say, 100 ports. For this case ovsdb-server will have to perform:

  1 * log2(100)  +  1 clone  +  101 * log2(101)
  Comparisons                   Comparisons for
  for search                    quicksort.
      ~7              1             ~707

Roughly 714 comparisons of atoms and 1 clone. Since binary search can give us position, where new atom should go (it's the 'low' index after the search completion) for free, the logic can be re-worked like this:

  copied = 0
  for-each atom in b:
      desired_position = bin_search(a, atom)
      push(result, a[ copied : desired_position - 1 ])
      copied = desired_position
      push(result, clone(atom))
  push(result, a[ copied : Na ])
  swap(a, result)

Complexity of this schema:

  Nb * log2(Na)  +  Nb    +  Na
  Comparisons      clones    memory copy on push
  for search

'swap' is just a swap of a few pointers. 'push' is not a 'clone', but a simple memory copy of 'union ovsdb_atom'. In general, this schema substitutes complexity of a quicksort with complexity of a memory copy of Na atom structures, where we're not even copying strings that these atoms are pointing to. Complexity in the example above goes down from 714 comparisons to 7 comparisons and memcpy of 100 * sizeof (union ovsdb_atom) bytes. 
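The re-worked union described above can be sketched in Python as follows. This is an illustration only, not the C code from ovsdb-data: the function name `sorted_union` is hypothetical, plain sorted lists stand in for datums, and the explicit check that skips atoms already present in 'a' is an assumption added here to keep the result duplicate-free.

```python
from bisect import bisect_left


def sorted_union(a, b):
    """Return the sorted union of two sorted, duplicate-free lists.

    Hypothetical illustration of the 'search, then bulk copy' schema;
    list slices play the role of the bulk memory copies.
    """
    result = []
    copied = 0  # how many elements of 'a' are already in 'result'
    for atom in b:
        # Binary search gives the position where 'atom' belongs in 'a'.
        # Because 'b' is sorted, the search can start at 'copied'.
        pos = bisect_left(a, atom, copied)
        # Bulk-copy the untouched run of 'a' that precedes that position.
        result.extend(a[copied:pos])
        copied = pos
        # Assumption: atoms already present in 'a' are not added twice.
        if pos == len(a) or a[pos] != atom:
            result.append(atom)
    # Copy whatever is left of 'a' after the last insertion point.
    result.extend(a[copied:])
    return result


if __name__ == "__main__":
    ports = list(range(0, 200, 2))   # 100 existing "ports"
    merged = sorted_union(ports, [51])
    print(len(merged))
```

No sort is performed at any point: the only comparisons are the ones made by the binary search for each element of 'b'.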
General complexity of a memory copy should always be lower than complexity of a quicksort, especially because these copies are usually performed in bulk, so this new schema should work faster for any input. All in all, this change allows to execute several times more transactions per second for transactions that add new entries to sets. Alternatively, union can be implemented as a linear merge of two sorted arrays, but this will result in O(Na) comparisons, which is more than Nb * log2(Na) in common case, since Na is usually far bigger than Nb. Linear merge will also mean per-atom memory copies instead of copying in bulk. 'replace' functionality of ovsdb_datum_union() had no users, so it was just removed. But it can easily be added back if needed in the future. Signed-off-by: Ilya Maximets Acked-by: Han Zhou Acked-by: Mark D. Gray Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2005483 Signed-off-by: Dumitru Ceara * Wed Sep 29 2021 Dumitru Ceara - 2.16.0-11- ovsdb: transaction: Use diffs for strong reference counting. [RH git: 85da133eaa] (#2003203) commit b2712d026eae2d9a5150c2805310eaf506e1f162 Author: Ilya Maximets Date: Tue Sep 14 00:19:57 2021 +0200 Currently, even if one reference is added to the set of strong references or removed from it, ovsdb-server will walk through the whole set and re-count references to other rows. These referenced rows will also be added to the transaction in order to re-count their references. For example, every time a Logical Switch Port is added to a Logical Switch, OVN Northbound database server will walk through all ports of this Logical Switch, clone their rows, and re-count references. This is not very efficient. Instead, it can only increase reference counters for added references and reduce for removed ones. In many cases this will be only one row affected in the Logical_Switch_Port table. 
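The idea of adjusting counters only for the references that changed can be sketched in Python as below. This is a simplified model, not the ovsdb transaction code: the function name `apply_reference_diff`, the use of Python sets for reference columns, and the `Counter` of per-row reference counts are all assumptions made for illustration.

```python
from collections import Counter


def apply_reference_diff(refcounts, old_refs, new_refs):
    """Update per-row reference counts from the difference of two sets.

    Hypothetical model: 'old_refs' and 'new_refs' are the sets of row
    identifiers referenced before and after the transaction. Rows that
    appear in both sets are never visited.
    """
    added = new_refs - old_refs
    removed = old_refs - new_refs
    for row in added:
        refcounts[row] += 1
    for row in removed:
        refcounts[row] -= 1
    return sorted(added), sorted(removed)


if __name__ == "__main__":
    counts = Counter({"port-%d" % i: 1 for i in range(100)})
    before = set(counts)
    after = before | {"port-100"}
    print(apply_reference_diff(counts, before, after))
```

Adding one port to a switch with 100 ports touches a single counter here, instead of re-counting all 101 references.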
Introducing new function that generates a diff of two datum objects, but stores added and removed atoms separately, so they can be used to increase or decrease row reference counters accordingly. This change allows to perform several times more transactions that adds or removes strong references to/from sets per second, because ovsdb-server no longer clones and re-counts rows that are irrelevant to current transaction. Acked-by: Dumitru Ceara Signed-off-by: Ilya Maximets Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2003203 Signed-off-by: Dumitru Ceara * Mon Sep 27 2021 Open vSwitch CI - 2.16.0-10- Merging upstream branch-2.16 [RH git: 2114714012] Commit list: 547371ecdb cirrus: Reduce memory requirements for FreeBSD VMs. * Thu Sep 23 2021 Timothy Redaelli - 2.16.0-9- redhat: use hugetlbfs group for /var/log/openvswitch when dpdk is enabled [RH git: 4e5928b671] (#2004543) Resolves: #2004543 * Thu Sep 16 2021 Open vSwitch CI - 2.16.0-8- Merging upstream branch-2.16 [RH git: 7332b410fc] Commit list: facaf5bc71 netdev-linux: Fix a null pointer dereference in netdev_linux_notify_sock(). 6e203d4873 pcap-file: Fix memory leak in ovs_pcap_open(). f50da0b267 odp-util: Fix a null pointer dereference in odp_flow_format(). 7da752e43f odp-util: Fix a null pointer dereference in odp_nsh_key_from_attr__(). bc22b01459 netdev-dpdk: Fix RSS configuration for virtio. 81706c5d43 ipf: Fix only nat the first fragment in the reass process. * Wed Sep 08 2021 Open vSwitch CI - 2.16.0-7- Merging upstream branch-2.16 [RH git: e71f31dfd6] Commit list: 242c280f0e dpif-netdev: Fix crash when PACKET_OUT is metered. * Tue Aug 31 2021 Ilya Maximets - 2.16.0-6- ovsdb: monitor: Store serialized json in a json cache. [RH git: bc20330c85] (#1996152) commit 43e66fc27659af2a5c976bdd27fe747b442b5554 Author: Ilya Maximets Date: Tue Aug 24 21:00:39 2021 +0200 Same json from a json cache is typically sent to all the clients, e.g., in case of OVN deployment with ovn-monitor-all=true. 
There could be hundreds or thousands connected clients and ovsdb will serialize the same json object for each of them before sending. Serializing it once before storing into json cache to speed up processing. This change allows to save a lot of CPU cycles and a bit of memory since we need to store in memory only a string and not the full json object. Testing with ovn-heater on 120 nodes using density-heavy scenario shows reduction of the total CPU time used by Southbound DB processes from 256 minutes to 147. Duration of unreasonably long poll intervals also reduced dramatically from 7 to 2 seconds:

            Count   Min    Max    Median   Mean     95 percentile
  ---------------------------------------------------------------
  Before    1934    1012   7480   4302.5   4875.3   7034.3
  After     1909    1004   2730   1453.0   1532.5   2053.6

Acked-by: Dumitru Ceara Acked-by: Han Zhou Signed-off-by: Ilya Maximets Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1996152 Signed-off-by: Ilya Maximets * Tue Aug 31 2021 Ilya Maximets - 2.16.0-5- raft: Don't keep full json objects in memory if no longer needed. [RH git: 4606423e8b] (#1990058) commit 0de882954032aa37dc943bafd72c33324aa0c95a Author: Ilya Maximets Date: Tue Aug 24 21:00:38 2021 +0200 raft: Don't keep full json objects in memory if no longer needed. Raft log entries (and raft database snapshot) contains json objects of the data. Follower receives append requests with data that gets parsed and added to the raft log. Leader receives execution requests, parses data out of them and adds to the log. In both cases, later ovsdb-server reads the log with ovsdb_storage_read(), constructs transaction and updates the database. On followers these json objects in common case are never used again. Leader may use them to send append requests or snapshot installation requests to followers. However, all these operations (except for ovsdb_storage_read()) are just serializing the json in order to send it over the network. 
Json objects are significantly larger than their serialized string representation. For example, the snapshot of the database from one of the ovn-heater scale tests takes 270 MB as a string, but 1.6 GB as a json object from the total 3.8 GB consumed by ovsdb-server process. ovsdb_storage_read() for a given raft entry happens only once in a lifetime, so after this call, we can serialize the json object, store the string representation and free the actual json object that ovsdb will never need again. This can save a lot of memory and can also save serialization time, because each raft entry for append requests and snapshot installation requests serialized only once instead of doing that every time such request needs to be sent. JSON_SERIALIZED_OBJECT can be used in order to seamlessly integrate pre-serialized data into raft_header and similar json objects. One major special case is creation of a database snapshot. Snapshot installation request received over the network will be parsed and read by ovsdb-server just like any other raft log entry. However, snapshots created locally with raft_store_snapshot() will never be read back, because they reflect the current state of the database, hence already applied. For this case we can free the json object right after writing snapshot on disk. Tests performed with ovn-heater on 60 node density-light scenario, where on-disk database goes up to 97 MB, shows average memory consumption of ovsdb-server Southbound DB processes decreased by 58% (from 602 MB to 256 MB per process) and peak memory consumption decreased by 40% (from 1288 MB to 771 MB). Test with 120 nodes on density-heavy scenario with 270 MB on-disk database shows 1.5 GB memory consumption decrease as expected. Also, total CPU time consumed by the Southbound DB process reduced from 296 to 256 minutes. Number of unreasonably long poll intervals reduced from 2896 down to 1934. Deserialization is also implemented just in case. 
I didn't see this function being invoked in practice. Acked-by: Dumitru Ceara Acked-by: Han Zhou Signed-off-by: Ilya Maximets Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1990058 Signed-off-by: Ilya Maximets * Tue Aug 31 2021 Ilya Maximets - 2.16.0-4- json: Add support for partially serialized json objects. [RH git: 885e5ce1b5] (#1990058) commit b0bca6f27aae845c3ca8b48d66a7dbd3d978162a Author: Ilya Maximets Date: Tue Aug 24 21:00:37 2021 +0200 json: Add support for partially serialized json objects. Introducing a new json type JSON_SERIALIZED_OBJECT. It's not an actual type that can be seen in a json message on a wire, but internal type that is intended to hold a serialized version of some other json object. For this reason it's defined after the JSON_N_TYPES to not confuse parsers and other parts of the code that relies on compliance with RFC 4627. With this JSON type internal users may construct large JSON objects, parts of which are already serialized. This way, while serializing the larger object, data from JSON_SERIALIZED_OBJECT can be added directly to the result, without additional processing. This will be used by next commits to add pre-serialized JSON data to the raft_header structure, that can be converted to a JSON before writing the file transaction on disk or sending to other servers. Same technique can also be used to pre-serialize json_cache for ovsdb monitors, this should allow to not perform serialization for every client and will save some more memory. Since serialized JSON is just a string, reusing the 'json->string' pointer for it. Acked-by: Dumitru Ceara Acked-by: Han Zhou Signed-off-by: Ilya Maximets Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1990058 Signed-off-by: Ilya Maximets * Tue Aug 31 2021 Ilya Maximets - 2.16.0-3- json: Optimize string serialization. 
[RH git: bb1654da63] (#1990069) commit 748010ff304b7cd2c43f4eb98a554433f0df07f9 Author: Ilya Maximets Date: Tue Aug 24 23:07:22 2021 +0200 json: Optimize string serialization. Current string serialization code puts all characters one by one. This is slow because dynamic string needs to perform length checks on every ds_put_char() and it also doesn't allow compiler to use better memory copy operations, i.e. doesn't allow copying few bytes at once. Special symbols are rare in a typical database. Quotes are frequent, but not too frequent. In databases created by ovn-kubernetes, for example, usually there are at least 10 to 50 chars between quotes. So, it's better to count characters that don't require escaping and use fast data copy for the whole sequential block. Testing with a synthetic benchmark (included) on my laptop shows following performance improvement:

  Size        Q   S       Before        After      Diff
  ------------------------------------------------------
  100000      0   0 :     0.227 ms     0.142 ms   -37.4 %
  100000      2   1 :     0.277 ms     0.186 ms   -32.8 %
  100000     10   1 :     0.361 ms     0.309 ms   -14.4 %
  10000000    0   0 :    22.720 ms    12.160 ms   -46.4 %
  10000000    2   1 :    27.470 ms    19.300 ms   -29.7 %
  10000000   10   1 :    37.950 ms    31.250 ms   -17.6 %
  100000000   0   0 :   239.600 ms   126.700 ms   -47.1 %
  100000000   2   1 :   292.400 ms   188.600 ms   -35.4 %
  100000000  10   1 :   387.700 ms   321.200 ms   -17.1 %

Here Q - probability (%) for a character to be a '\"' and S - probability (%) to be a special character ( < 32). Testing with a closer to real world scenario shows overall decrease of the time needed for database compaction by ~5-10 %. And this change also decreases CPU consumption in general, because string serialization is used in many different places including ovsdb monitors and raft. 
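The block-copy approach described above can be sketched in Python as follows. This is only an illustration of the technique, not the C code from lib/json.c: the function name `serialize_json_string` and the `ESCAPES` table are hypothetical, and Python string slices stand in for the bulk memory copies.

```python
# Hypothetical escape table for the short-form JSON escapes.
ESCAPES = {
    '"': '\\"',
    '\\': '\\\\',
    '\b': '\\b',
    '\f': '\\f',
    '\n': '\\n',
    '\r': '\\r',
    '\t': '\\t',
}


def serialize_json_string(s):
    """Serialize 's' as a JSON string, copying unescaped runs in bulk."""
    out = ['"']
    start = 0  # beginning of the current run that needs no escaping
    for i, ch in enumerate(s):
        if ch == '"' or ch == '\\' or ord(ch) < 32:
            # Flush the whole run before the special character at once.
            out.append(s[start:i])
            # Other control characters use the \uXXXX form.
            out.append(ESCAPES.get(ch) or '\\u%04x' % ord(ch))
            start = i + 1
    # Flush the trailing run and close the string.
    out.append(s[start:])
    out.append('"')
    return ''.join(out)


if __name__ == "__main__":
    print(serialize_json_string('name: "lsp-1"\n'))
```

With 10 to 50 ordinary characters between quotes, most of the input is handled by a few slice copies rather than one append per character.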
Signed-off-by: Ilya Maximets Acked-by: Numan Siddique Acked-by: Dumitru Ceara Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1990069 Signed-off-by: Ilya Maximets * Fri Aug 20 2021 Open vSwitch CI - 2.16.0-2- Merging upstream branch-2.16 [RH git: 7d7567e339] Commit list: 0991ea8d19 Prepare for 2.16.1. * Wed Aug 18 2021 Flavio Leitner - 2.16.0-1- redhat: First 2.16.0 release. [RH git: 0a1c4276cc]