Securing a Kepler Deployment #75

Open
Clint-Chester opened this issue Jan 30, 2025 · 0 comments

Clint-Chester (Contributor) commented Jan 30, 2025

Hi there, I thought I'd write up my experience of deploying Kepler without privileged mode, in case it helps others or there are tweaks that could improve it further. As we're not allowed to run containers in privileged mode, I used the following container security context, which seems to be working.

securityContext:
  allowPrivilegeEscalation: false
  capabilities:
    add:
      - BPF
      - PERFMON
    drop:
      - ALL
  privileged: false
  readOnlyRootFilesystem: true
  runAsNonRoot: false
  runAsUser: 0
  seccompProfile:
    type: RuntimeDefault 
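
For context, here is a minimal sketch of where that fragment would sit in a DaemonSet manifest. The resource name, namespace, labels, and image reference are hypothetical placeholders, not values from this deployment; only the securityContext itself is taken from above.

```yaml
# Sketch only: every name below is a placeholder, not the actual deployment.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kepler-exporter            # hypothetical name
  namespace: kepler                # hypothetical namespace
spec:
  selector:
    matchLabels:
      app: kepler-exporter         # hypothetical label
  template:
    metadata:
      labels:
        app: kepler-exporter
    spec:
      containers:
        - name: kepler-exporter
          image: "example.invalid/kepler:placeholder"   # substitute your image
          securityContext:         # container-level, as in the fragment above
            allowPrivilegeEscalation: false
            capabilities:
              add:
                - BPF              # CAP_BPF, available on Linux 5.8+
                - PERFMON          # CAP_PERFMON, available on Linux 5.8+
              drop:
                - ALL
            privileged: false
            readOnlyRootFilesystem: true
            runAsNonRoot: false
            runAsUser: 0
            seccompProfile:
              type: RuntimeDefault
```

The capabilities are set at the container level because the pod-level securityContext does not accept a capabilities field.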

Running on Azure Kubernetes Service (AKS) with both AMD and ARM nodes, Kepler logs the following warnings (not in order):

AMD

WARNING: failed to read int from file: open /sys/devices/system/cpu/cpu0/online: no such file or directory

1 rapl_msr_util.go:129] failed to open path /dev/cpu/0/msr: no such file or directory

1 exporter.go:135] failed to attach tp/writeback/writeback_dirty_page: opening tracepoint perf event: permission denied. Kepler will not collect page cache write events. This will affect the DRAM power model estimation on VMs.
1 exporter.go:299] Failed to open perf event for CPU cycles: failed to open bpf perf event on cpu 0: permission denied

ARM

1 rapl_msr_util.go:129] failed to open path /dev/cpu/0/msr: no such file or directory

1 exporter.go:135] failed to attach tp/writeback/writeback_dirty_page: opening tracepoint perf event: permission denied. Kepler will not collect page cache write events. This will affect the DRAM power model estimation on VMs.
1 exporter.go:145] failed to attach fentry/mark_page_accessed: create raw tracepoint: not supported. Kepler will not collect page cache read events. This will affect the DRAM power model estimation on VMs.
1 exporter.go:299] Failed to open perf event for CPU cycles: failed to open bpf perf event on cpu 0: permission denied

getCPUArch failure: open /sys/devices/cpu/caps/pmu_name: no such file or directory

Are there any other tweaks that would make the deployment more secure while still avoiding privileged mode? Thanks!
