Commit message | Author | Age | Files | Lines
* build: Bump version to 1.1.1 (tag: v1.1.1) | Pablo Neira Ayuso | 2024-10-03 | 1 | -3/+3
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: initialize filter when fetching implicit chains | Pablo Neira Ayuso | 2024-09-17 | 1 | -5/+4
    ASAN reports:

        src/cache.c:734:25: runtime error: load of value 189, which is not a valid value for type '_Bool'

    because filter->reset.rule remains uninitialized.

    Initialize filter and replace existing construct to initialize table and chain which leaves remaining fields uninitialized.

    Fixes: dbff26bfba83 ("cache: consolidate reset command")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* doc: tproxy is non-terminal in nftables | Pablo Neira Ayuso | 2024-09-17 | 1 | -7/+38
    iptables TPROXY issues NF_ACCEPT while nftables tproxy allows for post-processing. Update examples.

    For more info, see:
    http://lore.kernel.org/netfilter-devel/ZuSh_Io3Yt8LkyUh@orbyte.nwl.cc/T/

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: support for timeout never in elements | Pablo Neira Ayuso | 2024-09-17 | 4 | -9/+47
    Allow specifying elements that never expire in sets with global timeout.

        set x {
            typeof ip saddr
            timeout 1m
            elements = { 1.1.1.1 timeout never,
                         2.2.2.2,
                         3.3.3.3 timeout 2m }
        }

    In the example above:

    - 1.1.1.1 is a permanent element
    - 2.2.2.2 expires after 1 minute (uses default set timeout)
    - 3.3.3.3 expires after 2 minutes (uses specified timeout override)

    Use internal NFT_NEVER_TIMEOUT marker as UINT64_MAX to differentiate between use default set timeout and timeout never if "timeout N" is used in set declaration. Maximum supported timeout in milliseconds which is conveyed within a netlink attribute is 0x10c6f7a0b5ec.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
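The timeout resolution described in the commit above can be modelled roughly as follows. This is a minimal Python sketch, not the nftables implementation: the function name `effective_timeout_ms` and the use of `None` for "no per-element timeout" are assumptions made for illustration. Only the `NFT_NEVER_TIMEOUT` marker (as `UINT64_MAX`) and the maximum value 0x10c6f7a0b5ec come from the commit message.

```python
# Illustrative model only; names other than NFT_NEVER_TIMEOUT are hypothetical.
UINT64_MAX = 2**64 - 1
NFT_NEVER_TIMEOUT = UINT64_MAX       # internal marker named in the commit message
MAX_TIMEOUT_MS = 0x10C6F7A0B5EC      # maximum timeout (ms) stated in the commit message


def effective_timeout_ms(set_timeout_ms, elem_timeout_ms=None):
    """Return the timeout an element ends up with, or None if it never expires."""
    if elem_timeout_ms is None:
        # No per-element timeout: fall back to the set default (which may be None).
        return set_timeout_ms
    if elem_timeout_ms == NFT_NEVER_TIMEOUT:
        # "timeout never": the element is permanent.
        return None
    if elem_timeout_ms > MAX_TIMEOUT_MS:
        raise ValueError("timeout exceeds the maximum supported value")
    return elem_timeout_ms


MINUTE_MS = 60 * 1000
elements = {
    "1.1.1.1": NFT_NEVER_TIMEOUT,   # timeout never
    "2.2.2.2": None,                # no override, uses the set default
    "3.3.3.3": 2 * MINUTE_MS,       # timeout 2m
}
resolved = {addr: effective_timeout_ms(MINUTE_MS, t) for addr, t in elements.items()}
```

Using a reserved sentinel value keeps "no timeout given" and "never expire" distinguishable, which is the point the commit message makes about sets that declare a default timeout.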
* tests: shell: more randomization for timeout parameter | Florian Westphal | 2024-09-15 | 1 | -8/+34
    Either pass no timeout argument, pass timeout+expires or omit timeout (uses default timeout, if any).

    This should not expose further kernel code to run at this time, but unlike the existing (deterministic) element-update test case this script does have live traffic and different set types, including rhashtable which has async gc.

    Signed-off-by: Florian Westphal <fw@strlen.de>
* tests: py: fix up udp csum fixup output | Florian Westphal | 2024-09-11 | 1 | -2/+2
    Preceding commit switched udp to use the in-kernel csum parser, so tests warn:

        WARNING: line 7: 'add rule ip test-ip4 input iif "lo" udp checksum set 0': '[ payload write reg 1 => 2b @ transport header + 6 csum_type 1 csum_off 6 csum_flags 0x0 ]' mismatches '[ payload write reg 1 => 2b @ transport header + 6 csum_type 0 csum_off 0 csum_flags 0x1 ]'

    Fixes: f89abfb4068d ("proto: use NFT_PAYLOAD_L4CSUM_PSEUDOHDR flag to mangle UDP checksum")
    Signed-off-by: Florian Westphal <fw@strlen.de>
* proto: use NFT_PAYLOAD_L4CSUM_PSEUDOHDR flag to mangle UDP checksum | Pablo Neira Ayuso | 2024-09-10 | 3 | -34/+99
    There are two mechanisms to update the UDP checksum field:

    1) _CSUM_TYPE and _CSUM_OFFSET which specify the type of checksum (e.g. inet) and offset where it is located.
    2) use NFT_PAYLOAD_L4CSUM_PSEUDOHDR flag to use layer 4 kernel protocol parser.

    The problem with 1) is that it is unconditional, that is, csum_type and csum_offset cannot deal with zero UDP checksum.

    Use NFT_PAYLOAD_L4CSUM_PSEUDOHDR flag instead since it relies on the layer 4 kernel parser which skips updating zero UDP checksum.

    Extend test coverage for the UDP mangling with and without zero checksum.

    Fixes: e6c9174e13b2 ("proto: add checksum key information to struct proto_desc")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
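Why a zero UDP checksum needs special handling can be illustrated with a small Python sketch. This is not the kernel parser or nftables code: `ones_complement_sum16`, `update_csum16` and `mangle_udp_word` are hypothetical helpers written for this illustration. The incremental update follows RFC 1624 (equation 3), and the zero-checksum conventions are those of RFC 768 (zero means "no checksum computed"; a computed zero is sent as 0xffff).

```python
def ones_complement_sum16(data: bytes) -> int:
    """Fold the 16-bit big-endian words of data into a one's-complement sum."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def update_csum16(csum: int, old_word: int, new_word: int) -> int:
    """Incrementally update a checksum after one 16-bit word changes (RFC 1624, eq. 3)."""
    total = (~csum & 0xFFFF) + (~old_word & 0xFFFF) + new_word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def mangle_udp_word(csum: int, old_word: int, new_word: int) -> int:
    """Hypothetical helper: leave a zero (absent) UDP checksum untouched."""
    if csum == 0:
        return 0  # sender did not compute a checksum; updating it would corrupt it
    new = update_csum16(csum, old_word, new_word)
    return new or 0xFFFF  # a computed zero is transmitted as all ones


old_payload = bytes.fromhex("123456789abc")
new_payload = bytes.fromhex("123400019abc")   # middle word 0x5678 -> 0x0001
csum_old = ~ones_complement_sum16(old_payload) & 0xFFFF
csum_new = ~ones_complement_sum16(new_payload) & 0xFFFF
```

An unconditional update, as in mechanism 1), would turn an absent checksum into a non-zero and therefore wrong one; the conditional path corresponds to what the commit message says the layer 4 parser does.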
* tests: shell: stabilize packetpath/payload | Pablo Neira Ayuso | 2024-09-10 | 1 | -30/+34
    - Add sleep calls after setting up container topology.
    - Extend TCP connect timeout to 4 seconds. Test has no listener, this is just sending SYN packets that are rejected but it works to test the payload mangling ruleset.
    - Fix incorrect logic to check for 0 matching packets through grep.

    Fixes: 84da729e067a ("tests: shell: add test to cover payload transport match and mangle")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* tests: shell: add test case for timeout updates | Florian Westphal | 2024-09-10 | 4 | -0/+195
    Needs a feature check file, so add one: Add element with 1m timeout, then update expiry to 1ms. If element still exists after 1ms, update request was ignored.

    Test case checks timeouts can both be incremented and decremented, checks error recovery (update request but transaction fails) and that expiry is restored in addition to timeout.

    Signed-off-by: Florian Westphal <fw@strlen.de>
* tests: shell: extend vmap test with updates | Florian Westphal | 2024-09-10 | 1 | -3/+45
    It won't validate that the update is actually effective, but it will trigger relevant update logic in kernel.

    This means the updated test works even if the kernel doesn't support updates.

    A dedicated test will be added to check timeout updates work.

    Signed-off-by: Florian Westphal <fw@strlen.de>
* tests: shell: add test for kernel stack recursion bug | Florian Westphal | 2024-09-10 | 2 | -0/+39
    Validate that such ruleset updates get rejected.

    Signed-off-by: Florian Westphal <fw@strlen.de>
* libnftables: Zero ctx->vars after freeing it | Phil Sutter | 2024-09-03 | 1 | -0/+1
    Leaving the invalid pointer value in place will cause a double-free when users call nft_ctx_clear_vars() first, then nft_ctx_free(). Moreover, nft_ctx_add_var() passes the pointer to mrealloc() and thus assumes it to be either NULL or valid.

    Closes: http://bugzilla.netfilter.org/show_bug.cgi?id=1772
    Fixes: 9edaa6a51eab4 ("src: add --define key=value")
    Signed-off-by: Phil Sutter <phil@nwl.cc>
* tests: shell: extend coverage for meta l4proto netdev/egress matching | Pablo Neira Ayuso | 2024-09-02 | 1 | -0/+149
    Extend coverage to match on small UDP packets from netdev/egress.

    While at it, cover bridge/input and bridge/output hooks too.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: position does not require full cache | Pablo Neira Ayuso | 2024-08-30 | 1 | -2/+1
    Position refers to the rule handle and has cache requirements similar to those of the replace rule command, so relax the cache requirements.

    Commit e5382c0d08e3 ("src: Support intra-transaction rule references") uses position.id for index support which requires a full cache, but only in such case.

    Fixes: 01e5c6f0ed03 ("src: add cache level flags")
    Tested-by: Eric Garver <eric@garver.life>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: relax requirement for replace rule command | Pablo Neira Ayuso | 2024-08-30 | 4 | -6/+66
    No need for full cache, this command relies on the rule handle which is not validated from userspace. Cache requirements are similar to those of add/create/delete rule commands.

    This speeds up incremental updates with large rulesets.

    Extend tests/coverage for rule replacement.

    Fixes: 01e5c6f0ed03 ("src: add cache level flags")
    Tested-by: Eric Garver <eric@garver.life>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: remove full cache requirement when echo flag is set on | Pablo Neira Ayuso | 2024-08-30 | 1 | -2/+0
    The echo flag does not use the cache infrastructure yet, it relies on the monitor cache which follows the netlink_echo_callback() path.

    Fixes: 01e5c6f0ed03 ("src: add cache level flags")
    Tested-by: Eric Garver <eric@garver.life>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: clean up evaluate_cache_del() | Pablo Neira Ayuso | 2024-08-30 | 1 | -2/+1
    Move NFT_CACHE_TABLE flag to default case to disentangle this.

    Tested-by: Eric Garver <eric@garver.life>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: assert filter when calling nft_cache_evaluate() | Pablo Neira Ayuso | 2024-08-30 | 1 | -9/+7
    nft_cache_evaluate() always takes a non-null filter, remove superfluous checks when calculating cache requirements via flags.

    Note that filter is still optional from netlink dump path, since this can be called from error path to provide hints.

    Fixes: 08725a9dc14c ("cache: filter out rules by chain")
    Fixes: b3ed8fd8c9f3 ("cache: missing family in cache filtering")
    Fixes: 635ee1cad8aa ("cache: filter out sets and maps that are not requested")
    Fixes: 3f1d3912c3a6 ("cache: filter out tables that are not requested")
    Tested-by: Eric Garver <eric@garver.life>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* tests: shell: cover reset command with counter and quota | Pablo Neira Ayuso | 2024-08-26 | 1 | -0/+104
    Tested-by: Eric Garver <eric@garver.life>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* tests: shell: cover anonymous set with reset command | Pablo Neira Ayuso | 2024-08-26 | 1 | -0/+21
    Extend existing test to reset counters for rules with anonymous set.

    Closes: http://bugzilla.netfilter.org/show_bug.cgi?id=1763
    Tested-by: Eric Garver <eric@garver.life>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: consolidate reset command | Pablo Neira Ayuso | 2024-08-26 | 9 | -171/+78
    Reset command does not utilize the cache infrastructure.

    This implicitly fixes a crash with anonymous sets because elements are not fetched. I initially tried to fix it by toggling the missing cache flags, but then ASAN reports memleaks.

    To address these issues, rely on Phil's list filtering infrastructure, which is expanded to accommodate the filtering requirements of the reset commands, such as 'reset table ip' where only the family is sent to the kernel.

    After this update, tests/shell reports a few inconsistencies between reset and list commands:

    - reset rules chain t c2
      displays sets, but it should only list the given chain.
    - reset rules table t
      reset rules ip
      do not list elements in the set. In both cases, these are fully listing a given table and family, elements should be included.

    The consolidation also ensures list and reset will not differ.

    A few more notes:

    - CMD_OBJ_TABLE is used for:
        rules family table
      from the parser, due to the lack of a better enum, same applies to CMD_OBJ_CHAIN.
    - CMD_OBJ_ELEMENTS still does not use the cache, but same occurs in the CMD_GET command case which needs to be consolidated.

    Closes: http://bugzilla.netfilter.org/show_bug.cgi?id=1763
    Fixes: 83e0f4402fb7 ("Implement 'reset {set,map,element}' commands")
    Fixes: 1694df2de79f ("Implement 'reset rule' and 'reset rules' commands")
    Tested-by: Eric Garver <eric@garver.life>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: only dump rules for the given table | Pablo Neira Ayuso | 2024-08-26 | 1 | -1/+1
    Only family is set in the dump request. Set table and chain too, otherwise rules for the given family are fetched for each existing table.

    Fixes: afbd102211dc ("src: do not use the nft_cache_filter object from mnl.c")
    Tested-by: Eric Garver <eric@garver.life>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: add filtering support for objects | Pablo Neira Ayuso | 2024-08-26 | 2 | -13/+89
    Currently, full ruleset flag is set on to fetch objects.

    Follow a similar approach to these patches from Phil:

        de961b930660 ("cache: Filter set list on server side") and
        cb4b07d0b628 ("cache: Support filtering for a specific flowtable")

    in preparation to update the reset command to use the cache infrastructure.

    Tested-by: Eric Garver <eric@garver.life>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: accumulate flags in batch | Pablo Neira Ayuso | 2024-08-26 | 1 | -5/+7
    Recent updates are relaxing cache requirements:

        babc6ee8773c ("cache: populate chains on demand from error path")

    Flags describe cache requirements for a given batch, accumulate flags that are inferred from commands in this batch.

    Fixes: 7df42800cf89 ("src: single cache_update() call to build cache before evaluation")
    Tested-by: Eric Garver <eric@garver.life>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
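The idea of accumulating per-command cache requirements into one set of flags for the whole batch can be sketched as below. This is a hedged Python illustration only: `CacheFlag` and its members are hypothetical stand-ins, not the actual NFT_CACHE_* definitions, and `batch_cache_flags` is not a function from the nftables source.

```python
from enum import IntFlag


class CacheFlag(IntFlag):
    # Hypothetical flag names for illustration; not the NFT_CACHE_* values.
    NONE = 0
    TABLE = 1
    CHAIN = 2
    SET = 4
    RULE = 8


def batch_cache_flags(per_command_flags):
    """OR together the requirements inferred from every command in a batch."""
    flags = CacheFlag.NONE
    for cmd_flags in per_command_flags:
        flags |= cmd_flags  # accumulate rather than overwrite
    return flags


# Three commands with different requirements end up as one combined requirement.
batch = [CacheFlag.TABLE, CacheFlag.TABLE | CacheFlag.SET, CacheFlag.CHAIN]
required = batch_cache_flags(batch)
```

If the last command's flags simply replaced the earlier ones, a relaxed requirement late in the batch could drop something an earlier command still needs.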
* cache: reset filter for each command | Pablo Neira Ayuso | 2024-08-26 | 1 | -2/+6
    Unconditionally reset filter for each command in the batch, this is safer.

    Fixes: 3f1d3912c3a6 ("cache: filter out tables that are not requested")
    Tested-by: Eric Garver <eric@garver.life>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* parser_json: fix crash in json_parse_set_stmt_list | Sebastian Walz (sivizius) | 2024-08-21 | 1 | -4/+9
    Due to missing `NULL`-check, there will be a segfault for invalid statements.

    Fixes: 07958ec53830 ("json: add set statement list support")
    Signed-off-by: Sebastian Walz (sivizius) <sebastian.walz@secunet.com>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* parser_json: fix handle memleak from error path | Pablo Neira Ayuso | 2024-08-21 | 1 | -46/+47
    Based on patch from Sebastian Walz.

    Fixes: 586ad210368b ("libnftables: Implement JSON parser")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* parser_json: fix several expression memleaks from error path | Sebastian Walz (sivizius) | 2024-08-21 | 1 | -0/+4
    Fixes: 586ad210368b ("libnftables: Implement JSON parser")
    Signed-off-by: Sebastian Walz (sivizius) <sebastian.walz@secunet.com>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* parser_json: release buffer returned by json_dumps | Sebastian Walz (sivizius) | 2024-08-21 | 1 | -3/+8
    The signature of `json_dumps` is:

        `char *json_dumps(const json_t *json, size_t flags)`:

    It will return a pointer to an owned string, the caller must free it. However, `json_error` just borrows the string to format it as `%s`, but after printing the formatted error message, the pointer to the string is lost and thus never freed.

    Fixes: 586ad210368b ("libnftables: Implement JSON parser")
    Signed-off-by: Sebastian Walz (sivizius) <sebastian.walz@secunet.com>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* datatype: replace DTYPE_F_ALLOC by bitfield | Pablo Neira Ayuso | 2024-08-21 | 2 | -15/+7
    Only user of the datatype flags field is DTYPE_F_ALLOC, replace it by bitfield, squash byteorder to 8 bits which is sufficient.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: remove DTYPE_F_PREFIX | Pablo Neira Ayuso | 2024-08-21 | 4 | -7/+9
    Only the ipv4 and ipv6 datatypes support this, add datatype_prefix_notation() helper function to report that datatype prefers prefix notation, if possible.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* src: mnl: always dump all netdev hooks if no interface name was given | Florian Westphal | 2024-08-21 | 4 | -11/+45
    Instead of not returning any results for

        nft list hooks netdev

    iterate all interfaces and then query all of them.

    Signed-off-by: Florian Westphal <fw@strlen.de>
* src: mnl: prepare for listing all device netdev device hooks | Florian Westphal | 2024-08-21 | 1 | -3/+26
    Change output format slightly so device name is included for netdev family.

        % nft list hooks netdev device eth0
        family netdev {
                hook ingress device eth0 {
                         0000000000 chain inet ingress in_public [nf_tables]
                         0000000000 chain netdev ingress in_public [nf_tables]
                }
                hook egress device eth0 {
                         0000000000 chain netdev ingress out_public [nf_tables]
                }
        }

    Signed-off-by: Florian Westphal <fw@strlen.de>
* doc: update outdated route and pkttype info | 谢致邦 (XIE Zhibang) | 2024-08-20 | 2 | -2/+2
    inet family supports route type.

    unicast pkttype changed to host pkttype.

    Signed-off-by: 谢致邦 (XIE Zhibang) <Yeking@Red54.com>
    Signed-off-by: Florian Westphal <fw@strlen.de>
* parser_bison: allow 0 burst in limit rate byte mode | Pablo Neira Ayuso | 2024-08-19 | 3 | -5/+24
    Unbreak restoring elements in set with rate limit that fail with:

    > /dev/stdin:3618:61-61: Error: limit burst must be > 0
    >                  elements = { 1.2.3.4 limit rate over 1000 kbytes/second timeout 1s,

    No need for burst != 0 for limit rate byte mode.

    Add tests/shell too.

    Fixes: 702eff5b5b74 ("src: allow burst 0 for byte ratelimit and use it as default")
    Fixes: 285baccfea46 ("src: disallow burst 0 in ratelimits")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
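The rule this commit restores can be written down as a tiny check. The Python below is only a sketch of the intended behaviour as described by the commit message and the two commits it references; `check_limit_burst` and its parameters are hypothetical names, not part of the bison parser.

```python
def check_limit_burst(burst: int, byte_mode: bool) -> int:
    """Reject a zero burst only for packet-based rate limits."""
    if burst == 0 and not byte_mode:
        # Same wording as the error quoted in the commit message.
        raise ValueError("limit burst must be > 0")
    return burst


# A byte-mode limit such as "limit rate over 1000 kbytes/second" carries no
# explicit burst, so a burst of 0 has to be accepted when it is restored.
restored_burst = check_limit_burst(0, byte_mode=True)
```

Keeping the check conditional on the mode means the packet-mode restriction stays in place while rulesets with byte-mode limits can be restored.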
* cache: do not fetch set inconditionally on delete | Pablo Neira Ayuso | 2024-08-19 | 2 | -3/+7
    This is only required to remove elements, relax cache requirements for anything else.

    Tested-by: Eric Garver <eric@garver.life>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: populate flowtables on demand from error path | Pablo Neira Ayuso | 2024-08-19 | 2 | -6/+7
    Flowtables are only required for error reporting hints if kernel reports ENOENT. Populate the cache from this error path only.

    Tested-by: Eric Garver <eric@garver.life>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: populate objects on demand from error path | Pablo Neira Ayuso | 2024-08-19 | 2 | -5/+5
    Objects are only required for error reporting hints if kernel reports ENOENT. Populate the cache from this error path only.

    Tested-by: Eric Garver <eric@garver.life>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* cache: populate chains on demand from error path | Pablo Neira Ayuso | 2024-08-19 | 3 | -5/+11
    Updates on verdict maps that require many non-base chains are slowed down due to fetching existing non-base chains into the cache.

    Chains are only required for error reporting hints if kernel reports ENOENT. Populate the cache from this error path only.

    Similar approach already exists from rule ENOENT error path since:

        deb7c5927fad ("cmd: add misspelling suggestions for rule commands")

    However, NFT_CACHE_CHAIN was toggled unconditionally for rule commands, rendering this on-demand cache population useless.

    Before this patch, running Neels' nft_slew benchmark (peak values):

        created idx 4992 in 52587950 ns (128 in 7122 ms)
        ...
        deleted idx 128 in 43542500 ns (127 in 6187 ms)

    After this patch:

        created idx 4992 in 11361299 ns (128 in 1612 ms)
        ...
        deleted idx 1664 in 5239633 ns (128 in 733 ms)

    Tested-by: Eric Garver <eric@garver.life>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
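The pattern behind this commit, fetching data only when an error hint is actually needed, can be sketched in Python as follows. Everything here is a toy model under stated assumptions: `ChainCache`, `run_command` and the prefix-based hint are hypothetical and only stand in for the kernel dump and the misspelling suggestions mentioned above.

```python
import errno


class ChainCache:
    """Toy cache that is filled only when an error hint is needed."""

    def __init__(self, fetch_chains):
        self._fetch_chains = fetch_chains  # callable standing in for a kernel dump
        self._chains = None
        self.fetches = 0                   # counts how often the dump ran

    def chains(self):
        if self._chains is None:
            self._chains = list(self._fetch_chains())
            self.fetches += 1
        return self._chains


def run_command(chain_name, kernel_chains, cache):
    """Return (errno, hint). The cache is consulted only on ENOENT."""
    if chain_name in kernel_chains:
        return 0, None  # success path never touches the cache
    # Error path: populate the cache now to look for a similarly named chain.
    # Prefix matching is a crude placeholder for a real misspelling heuristic.
    candidates = [c for c in cache.chains() if c.startswith(chain_name[:3])]
    return errno.ENOENT, (candidates[0] if candidates else None)


kernel = {"input", "forward", "output"}
cache = ChainCache(lambda: sorted(kernel))
ok_result = run_command("input", kernel, cache)
fetches_after_success = cache.fetches
err_result = run_command("inptu", kernel, cache)
```

The successful command costs no dump at all, which is where the benchmark improvement in the commit message would come from when many chains exist.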
* cache: rule by index requires full cache | Pablo Neira Ayuso | 2024-08-19 | 1 | -1/+1
    In preparation for on-demand cache population with errors, set on NFT_CACHE_FULL if rule index is used since this requires a full cache with rules.

    This is not a fix, index is already fetching a full cache before this patch.

    But follow up patches relax cache requirements, so add this patch in first place to make sure index does not break.

    Tested-by: Eric Garver <eric@garver.life>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* tests: shell: add a few tests for nft -i | Pablo Neira Ayuso | 2024-08-19 | 4 | -0/+35
    Eric Garver recently provided a few tests for nft -i that helped identify issues that resulted in reverting:

        e791dbe109b6 ("cache: recycle existing cache with incremental updates")

    Add these tests to tests/shell.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* datatype: improve error reporting when time unit is not correct | Pablo Neira Ayuso | 2024-08-19 | 1 | -1/+1
    Display:

        Wrong unit format, expecting bytes or kbytes or mbytes

    instead of:

        Wrong rate format

    Fixes: 6615676d825e ("src: add per-bytes limit")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* datatype: reject rate in quota statement | Pablo Neira Ayuso | 2024-08-19 | 1 | -7/+13
    Bail out if rate is used:

        ruleset.nft:5:77-106: Error: Wrong rate format, expecting bytes or kbytes or mbytes
        add rule netdev firewall PROTECTED_IPS update @quota_temp_before { ip daddr quota over 45000 mbytes/second } add @quota_trigger { ip daddr }
                                                                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    Improve error reporting while at this.

    Fixes: 6615676d825e ("src: add per-bytes limit")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* tests: shell: skip vlan mangling testcase if egress is not support | Pablo Neira Ayuso | 2024-08-19 | 1 | -0/+2
    Add dependency on egress hook to skip this test in older kernels.

    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* doc: add documentation about list hooks feature | Florian Westphal | 2024-08-19 | 3 | -62/+118
    Add a brief segment about 'nft list hooks' and a summary of the output format.

    As nft.txt is quite large, split the additional commands into their own file.

    The existing listing section is removed; list subcommand is already mentioned in the relevant statement sections.

    Reported-by: Phil Sutter <phil@nwl.cc>
    Signed-off-by: Florian Westphal <fw@strlen.de>
* src: add egress support for 'list hooks' | Florian Westphal | 2024-08-19 | 1 | -3/+4
    This was missing: Also include the egress hooks when listing the netdev family (or unspec).

    Signed-off-by: Florian Westphal <fw@strlen.de>
* src: drop obsolete hook argument form hook dump functions | Florian Westphal | 2024-08-19 | 3 | -19/+15
    Since commit b98fee20bfe2 ("mnl: revisit hook listing"), handle.chain is never set in this path, so 'hook' is always set to -1, so the hook arg can be dropped.

    Signed-off-by: Florian Westphal <fw@strlen.de>
* src: mnl: make family specification more strict when listing | Florian Westphal | 2024-08-19 | 1 | -29/+24
    Make "nft list hooks <family>" more strict.

    nft list hooks: query/list all NFPROTO_XXX values, i.e. arp, bridge, ipv4, ipv6. If a device is also given, then do include the netdev family for the given device as well.

    "nft list hooks arp" will only dump the hooks registered for NFPROTO_ARP (or nothing at all if none are active).

    "bridge", "ip", "ip6" will list the pre/in/forward/output/postrouting hooks for these families, if any.

    "inet" serves as an alias for "ip" and "ip6".

    Link: http://lore.kernel.org/netfilter-devel/20240729153211.GA26048@breakpoint.cc/
    Signed-off-by: Florian Westphal <fw@strlen.de>
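The family selection rules spelled out in this commit message can be summarised in a few lines of Python. This is a sketch of the described behaviour at this commit only (a later commit in this log changes how netdev is handled when no interface name is given); `families_to_query` is a hypothetical name and the real logic lives in C in the mnl code.

```python
def families_to_query(family=None, device=None):
    """Return the families a 'list hooks' request covers under the rules above."""
    if family is None:
        # Plain "nft list hooks": all families, plus netdev when a device is given.
        families = ["arp", "bridge", "ip", "ip6"]
        if device is not None:
            families.append("netdev")
        return families
    if family == "inet":
        return ["ip", "ip6"]  # inet serves as an alias for ip and ip6
    return [family]           # strict: only the family that was asked for


unspec = families_to_query()
unspec_with_device = families_to_query(device="eth0")
inet_only = families_to_query("inet")
```

For example, `families_to_query("arp")` yields only the arp family, matching the statement that "nft list hooks arp" dumps nothing but NFPROTO_ARP hooks.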
* src: mnl: clean up hook listing code | Florian Westphal | 2024-08-19 | 1 | -63/+35
    mnl_nft_dump_nf_hooks() can call itself for the UNSPEC case, this avoids the second switch/case to handle printing for inet/unspec.

    As for the error handling, 'nft list hooks' should not print an error, even if nothing is printed, UNLESS there was also a low-level (syscall) error from the kernel.

    We don't want to indicate failure just because e.g. kernel doesn't support NFPROTO_ARP.

    This also fixes a display bug, 'nft list hooks device foo' would show hooks registered for that device as 'bridge' family instead of the expected 'netdev' family.

    This was because UNSPEC handling did not query 'netdev' family and did pass the device name to the low-level function. Add it, and pass NULL device name for those families that don't support device attachment.

    The low-level function currently always queries NFPROTO_NETDEV to handle the 'inet' ingress case.

    This is dubious, as 'inet ingress' is a pseudo-alias to netdev family (inet itself is a pseudo-family that ends up registering for both ipv4 and ipv6 hooks).

    This is resolved in next patch.

    Signed-off-by: Florian Westphal <fw@strlen.de>
* tests: shell: Extend table persist flag test a bit | Phil Sutter | 2024-08-14 | 3 | -11/+42
    Using a co-process, assert owner flag is effective.

    Signed-off-by: Phil Sutter <phil@nwl.cc>