Add neighbor state #3191

Open
wants to merge 7 commits into
base: master

@eugercek eugercek commented Nov 23, 2024

Implements part of #535

Some things we can discuss:

  • We could omit labels like state="stale" and instead expose the raw state value as the gauge value (IMO the current implementation is simpler). I did see the guidance below in CONTRIBUTING.md, but I didn't find many magic numbers in the existing metrics and labels, so I added human-readable names for the states.

For example a proc file contains the magic number 42 as some identifier, the Node Exporter should expose it as it is and not keep a mapping in code to make this human readable.

  • For code quality, I can split the Update function to make it more readable (again, IMO it's simple enough to keep in one function).

This PR adds the metrics below:

# HELP node_arp_states ARP states by device
# TYPE node_arp_states gauge
node_arp_states{device="eth0",state="stale"} 1
node_arp_states{device="eth0",state="failed"} 2
# ...

Old comments

I changed the design in the second commit to avoid exploding cardinality, but I'm keeping the old comments in case the maintainers prefer that design.

If you want metrics like the ones below, read on:

node_arp_states{device="eth0",state="stale",ip="1.1.1.1"} 1
node_arp_states{device="eth0",state="stale",ip="1.1.1.2"} 1
node_arp_states{device="eth0",state="failed",ip="1.1.1.3"} 1

Some things we can discuss:

  • I could remove the neighborState struct and just use a string slice, but this is easier to follow IMO.
  • This change may create many metrics. As an estimate: the default ARP cache sizes are min 128 and max 1024, so we may expect up to 1024*count(net_int) series at most. However, in some "fine tuning" guides I saw that people generally increase this limit. Although it is bounded by the size of your subnet, a subnet can contain many machines. Maybe I should make a counter for each state instead?

@SuperQ it's ready for your review, when you have time 🙏🏻

Signed-off-by: Emin Umut Gercek <[email protected]>
Signed-off-by: Emin Umut Gercek <[email protected]>
@eugercek
Author

Also fixed the merge conflict.


 for _, n := range neighbors {
 	// Skip entries which have state NUD_NOARP to conform to output of /proc/net/arp.
-	if n.State&unix.NUD_NOARP == 0 {
-		entries[n.Interface.Name]++
+	if n.State&unix.NUD_NOARP != unix.NUD_NOARP {
Author

Maybe I could write it like this instead?

Suggested change
-if n.State&unix.NUD_NOARP != unix.NUD_NOARP {
+if n.State != unix.NUD_NOARP {

Contributor

@dswarbrick dswarbrick Jan 13, 2025

No, n.State is a bit mask. The bitwise AND is necessary to isolate the NUD_NOARP flag that we wish to test for. Other bits may be set in n.State, so you cannot just do a simple comparison as you suggest.
cf. man 7 rtnetlink:

              ndm_state is a bit mask of the following states:
              NUD_INCOMPLETE   a currently resolving cache entry
              NUD_REACHABLE    a confirmed working cache entry
              NUD_STALE        an expired cache entry
              NUD_DELAY        an entry waiting for a timer
              NUD_PROBE        a cache entry that is currently reprobed
              NUD_FAILED       an invalid cache entry
              NUD_NOARP        a device with no destination cache
              NUD_PERMANENT    a static entry
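
To make the difference between the two checks concrete, here is a minimal sketch. The constants are redeclared locally with the values from linux/neighbour.h so that the example runs without golang.org/x/sys/unix, the helper name isNoARP is hypothetical, and the combined state is only an illustrative assumption, not something observed in this PR.

```go
package main

import "fmt"

// NUD_* values as defined in linux/neighbour.h, redeclared here so the
// sketch does not depend on golang.org/x/sys/unix.
const (
	nudNoARP     uint16 = 0x40
	nudPermanent uint16 = 0x80
)

// isNoARP (hypothetical helper) reports whether the NUD_NOARP bit is set,
// regardless of which other bits are set in the mask.
func isNoARP(state uint16) bool {
	return state&nudNoARP != 0
}

func main() {
	// Assumed example of a state with more than one flag set.
	combined := nudNoARP | nudPermanent

	fmt.Println("bitwise test:", isNoARP(combined))    // detects the flag
	fmt.Println("equality test:", combined == nudNoARP) // misses the flag
}
```

The bitwise test isolates the one bit of interest, while the equality test only succeeds when no other bit is set.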

Author

Thanks for the explanation, that makes sense. Then the current implementation looks OK. I changed n.State&unix.NUD_NOARP == 0 to allow an early return; I can revert that if you wish.

Author

By the way, I fixed the bug I introduced when I changed this line.

@discordianfish
Member

LGTM in general but I think this needs some tests

@discordianfish
Member

Also see lint error:

  Error: collector/arp_linux.go:54:6: type `neighborState` is unused (unused)
  type neighborState struct {
       ^

Signed-off-by: Emin Umut Gercek <[email protected]>
Signed-off-by: Emin Umut Gercek <[email protected]>
@eugercek
Author

eugercek commented Feb 2, 2025

Hi, there was a bug in the n.State&unix.NUD_NOARP == unix.NUD_NOARP check. I fixed that and also fixed the linter error.

On the testing side, I'm not sure how to test RTNL via fixtures. Maybe we can test /proc/net/arp via fixtures, but this PR doesn't touch that part.

To test it, I wrote a reproducible but hacky Bash script. I don't recommend adding it as a test in the repository, but it demonstrates that the functionality is implemented correctly. I would appreciate guidance on how to add proper tests for the RTNL part. (You can run the script on your machine. It only needs Docker, and it must be run from the node_exporter repository root.)

Script
#!/bin/bash
# Builds node_exporter, runs it in a container, and compares its ARP
# metrics with the output of `ip neigh` as neighbor states change.

set -eu

export DOCKER_CLI_HINTS=false

# Remove containers left over from a previous run.
docker rm -f arp-test nginx1 &> /dev/null || true

# Write a throwaway Dockerfile and delete it when the script exits.
tempFile=$(mktemp)
trap 'rm -f "$tempFile"' EXIT
cat <<EOT > "$tempFile"
FROM ubuntu

COPY node_exporter /opt/node_exporter

RUN apt update 1>/dev/null && apt install -y curl iproute2
USER nobody
ENTRYPOINT ["/opt/node_exporter"]
EOT

# Build the Linux binary that the Dockerfile copies into the image.
GOOS=linux GOARCH=amd64 go build .

docker build -t arp-test -f "$tempFile" . &>/dev/null

docker run --name arp-test \
   -p 9100:9100 \
   --rm -d \
   arp-test &>/dev/null

docker exec arp-test ip neigh

# Start a second container so that there is a reachable neighbor to talk to.
docker run -d --name nginx1 nginx  &>/dev/null

nginxIp=$(docker inspect \
 -f '{{range.NetworkSettings.Networks}}{{.IPAddress}}{{end}}'  nginx1)

# Print the ARP metrics from node_exporter next to the kernel's view.
cur_state() {
   echo "====="
   echo "From node_exporter:"
   curl -s localhost:9100/metrics | grep -E ^node_arp
   echo "From ip neigh:"
   docker exec arp-test ip neigh
   echo "====="
}

echo "Initial state"
cur_state

echo "After curl nginx1 (NUD_DELAY or NUD_REACHABLE)"
docker exec arp-test curl -s -o /dev/null "$nginxIp"
cur_state
sleep 2
echo "Wait arp cache (NUD_REACHABLE)"
cur_state

# 172.17.0.10 is expected to be unused, so resolution should not succeed.
echo "After curl unknown ip (NUD_INCOMPLETE or NUD_FAILED)"
docker exec arp-test curl -m 1 -s -o /dev/null 172.17.0.10 || true
cur_state

echo "Sleep 30 second for net.ipv4.neigh.default.base_reachable_time_ms (NUD_FAILED)"
sleep 30
cur_state
Output
Initial state
=====
From node_exporter:
node_arp_entries{device="eth0"} 1
node_arp_states{device="eth0",state="reachable"} 1
From ip neigh:
172.17.0.1 dev eth0 lladdr 02:42:6a:74:ba:4a REACHABLE
=====
After curl nginx1 (NUD_DELAY or NUD_REACHABLE)
=====
From node_exporter:
node_arp_entries{device="eth0"} 2
node_arp_states{device="eth0",state="reachable"} 2
From ip neigh:
172.17.0.3 dev eth0 lladdr 02:42:ac:11:00:03 REACHABLE
172.17.0.1 dev eth0 lladdr 02:42:6a:74:ba:4a REACHABLE
=====
Wait arp cache (NUD_REACHABLE)
=====
From node_exporter:
node_arp_entries{device="eth0"} 2
node_arp_states{device="eth0",state="reachable"} 2
From ip neigh:
172.17.0.3 dev eth0 lladdr 02:42:ac:11:00:03 REACHABLE
172.17.0.1 dev eth0 lladdr 02:42:6a:74:ba:4a REACHABLE
=====
After curl unknown ip (NUD_INCOMPLETE or NUD_FAILED)
=====
From node_exporter:
node_arp_entries{device="eth0"} 3
node_arp_states{device="eth0",state="incomplete"} 1
node_arp_states{device="eth0",state="reachable"} 2
From ip neigh:
172.17.0.10 dev eth0  INCOMPLETE
172.17.0.3 dev eth0 lladdr 02:42:ac:11:00:03 REACHABLE
172.17.0.1 dev eth0 lladdr 02:42:6a:74:ba:4a REACHABLE
=====
Sleep 30 second for net.ipv4.neigh.default.base_reachable_time_ms (NUD_FAILED)
=====
From node_exporter:
node_arp_entries{device="eth0"} 3
node_arp_states{device="eth0",state="failed"} 1
node_arp_states{device="eth0",state="reachable"} 2
From ip neigh:
172.17.0.10 dev eth0  FAILED
172.17.0.3 dev eth0 lladdr 02:42:ac:11:00:03 REACHABLE
172.17.0.1 dev eth0 lladdr 02:42:6a:74:ba:4a REACHABLE
=====

Signed-off-by: Emin Umut Gercek <[email protected]>
}

// Map of interface name to ARP neighbor count.
entries := make(map[string]uint32)
// Map of interface name to a map of neighbor state name to neighbor count.
states := make(map[string]map[string]uint32)
Contributor

I think that if I were to implement this, I would keep the neighbor states as their native uint16 until ready to emit the metrics. This would be more memory efficient and also result in fewer map lookups. In other words, make this a map[string]map[uint16]uint32 and only resolve the states to string labels when emitting the metrics to the channel.
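
A rough sketch of that suggestion, under stated assumptions: the neighbor type is a hypothetical stand-in for the netlink neighbor struct, the numeric keys are the NUD_* values from linux/neighbour.h, and the label names other than those shown in this PR's output are guesses.

```go
package main

import (
	"fmt"
	"sort"
)

// neighbor is a hypothetical stand-in for the netlink neighbor type.
type neighbor struct {
	Iface string
	State uint16
}

// stateNames maps NUD_* values (linux/neighbour.h) to label values.
// Names not shown in the PR output are assumptions.
var stateNames = map[uint16]string{
	0x01: "incomplete",
	0x02: "reachable",
	0x04: "stale",
	0x08: "delay",
	0x10: "probe",
	0x20: "failed",
	0x80: "permanent",
}

// countStates tallies neighbors per interface, keyed by the raw uint16
// state, so no strings are stored or looked up while counting.
func countStates(neighbors []neighbor) map[string]map[uint16]uint32 {
	states := make(map[string]map[uint16]uint32)
	for _, n := range neighbors {
		if states[n.Iface] == nil {
			states[n.Iface] = make(map[uint16]uint32)
		}
		states[n.Iface][n.State]++
	}
	return states
}

func main() {
	states := countStates([]neighbor{
		{Iface: "eth0", State: 0x02},
		{Iface: "eth0", State: 0x02},
		{Iface: "eth0", State: 0x20},
	})

	// Resolve state names only when emitting; sort for stable output.
	for dev, byState := range states {
		raw := make([]int, 0, len(byState))
		for s := range byState {
			raw = append(raw, int(s))
		}
		sort.Ints(raw)
		for _, s := range raw {
			fmt.Printf("node_arp_states{device=%q,state=%q} %d\n",
				dev, stateNames[uint16(s)], byState[uint16(s)])
		}
	}
}
```

Keeping the counting step in a function that takes a plain slice also makes it testable without a netlink connection.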

Author

Very good idea, we won't store all those strings now, also fewer lookups as you said. Thanks for your careful review.

Author

You can resolve the conversation if the new commit is OK.

@dswarbrick
Contributor

I have a few concerns remaining about this PR.

Firstly, this has the potential to inflate cardinality quite considerably on hosts with a lot of interfaces. The simple ARP entry count is easy to predict, as it won't be more than one metric per interface. However, adding enum states where there could potentially be three or four different non-zero state counts per interface might be a concern for some users. I know for a fact that some users run node_exporter on hosts with hundreds of interfaces. Perhaps this new feature should be gated by a command line flag.
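
As an illustration of the gating idea only, here is a sketch that uses Go's standard flag package so that it is self-contained (node_exporter itself defines its flags with kingpin). The flag name collector.arp.states and the function parseStatesEnabled are hypothetical and not part of this PR.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// parseStatesEnabled parses a hypothetical opt-in flag that gates the
// per-state metrics. It defaults to false so cardinality does not grow
// unless the user asks for it.
func parseStatesEnabled(args []string) (bool, error) {
	fs := flag.NewFlagSet("node_exporter", flag.ContinueOnError)
	enabled := fs.Bool("collector.arp.states", false,
		"Expose per-state ARP neighbor counts (hypothetical flag).")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	return *enabled, nil
}

func main() {
	enabled, err := parseStatesEnabled(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	if enabled {
		fmt.Println("per-state ARP metrics enabled")
	} else {
		fmt.Println("per-state ARP metrics disabled; only entry counts are exposed")
	}
}
```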

Secondly, as with any metric involving enum states, if only non-zero enum counters are exposed, this will lead to stale metrics when a state changes. For example, if node_arp_states{device="eth0",state="incomplete"} 1 becomes node_arp_states{device="eth0",state="failed"} 1, the state="incomplete" metric will still be returned by queries up to the Prometheus query.lookback-delta (default 5 mins). If that metric changes state again within those 5 minutes (which is quite possible, since Linux neighbor cache gc_stale_time is 60s by default), this will effectively make this metric unreliable and useless. Most enum state metrics strive to expose all the states, even when the counters are zero, to avoid this phenomenon. However this comes at a cardinality price.
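
A minimal sketch of the "expose every state, even when zero" approach, assuming a fixed list of state labels. The sample type and the withZeros helper are hypothetical, and the label names beyond those shown in this PR's output are guesses.

```go
package main

import "fmt"

// allStates lists every state label in a fixed order. The names are
// assumptions modelled on the label values shown in this PR.
var allStates = []string{
	"incomplete", "reachable", "stale", "delay", "probe", "failed", "permanent",
}

// sample is a hypothetical (state, value) pair standing in for a gauge sample.
type sample struct {
	State string
	Value uint32
}

// withZeros returns one sample per known state, using 0 for states that
// were not observed, so that no series disappears between scrapes.
func withZeros(observed map[string]uint32) []sample {
	out := make([]sample, 0, len(allStates))
	for _, s := range allStates {
		out = append(out, sample{State: s, Value: observed[s]})
	}
	return out
}

func main() {
	observed := map[string]uint32{"reachable": 2, "failed": 1}
	for _, s := range withZeros(observed) {
		fmt.Printf("node_arp_states{device=\"eth0\",state=%q} %d\n", s.State, s.Value)
	}
}
```

With this shape, a neighbor moving from incomplete to failed shows up as one series dropping to 0 and another rising to 1, at the cost of a fixed number of series per interface.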

Finally, as I already mentioned, the neighbor state is a bit mask. This PR in its current state assumes that the neighbor state will only ever be a single NUD state, and that a simple map lookup for that value is acceptable. Are you sure that there will never be entries that are a combination of state flags? If this occurs, the neighborStatesMap lookup would return an empty string due to no matching key. And if there are multiple neighbor entries with different state flag combinations, this would result in the collector attempting to emit multiple metrics with identical label sets, i.e. node_arp_states{device="eth0",state=""} - which would cause a panic.
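
One possible way to guard against that failure mode, sketched under assumptions: decode the mask bit by bit into a label that is never empty, then merge counts by label before emitting. The names stateLabel and countsByLabel, the "+" separator, and the "none" and "unknown" labels are all hypothetical choices; the bit values come from linux/neighbour.h.

```go
package main

import (
	"fmt"
	"strings"
)

// nudFlags lists the NUD_* bits from linux/neighbour.h in ascending order.
var nudFlags = []struct {
	bit  uint16
	name string
}{
	{0x01, "incomplete"}, {0x02, "reachable"}, {0x04, "stale"}, {0x08, "delay"},
	{0x10, "probe"}, {0x20, "failed"}, {0x40, "noarp"}, {0x80, "permanent"},
}

// stateLabel turns a state bit mask into a non-empty label. Combined flags
// are joined with "+", zero becomes "none", and masks that contain only
// unrecognised bits become "unknown".
func stateLabel(state uint16) string {
	if state == 0 {
		return "none"
	}
	var names []string
	for _, f := range nudFlags {
		if state&f.bit != 0 {
			names = append(names, f.name)
		}
	}
	if len(names) == 0 {
		return "unknown"
	}
	return strings.Join(names, "+")
}

// countsByLabel merges raw-state counts that resolve to the same label, so
// each label is emitted at most once per device.
func countsByLabel(raw map[uint16]uint32) map[string]uint32 {
	out := make(map[string]uint32)
	for state, n := range raw {
		out[stateLabel(state)] += n
	}
	return out
}

func main() {
	// Hypothetical counts, including a combined mask and unrecognised bits.
	raw := map[uint16]uint32{0x02: 2, 0x04 | 0x08: 1, 0x100: 1, 0x200: 3}
	fmt.Println(countsByLabel(raw))
}
```

Joined labels avoid duplicate label sets but add label values of their own, so counting each set bit separately would be an alternative with bounded cardinality.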
