
[BUG] salt-ssh does not respect --wipe #61083

Closed
lkubb opened this issue Oct 20, 2021 · 8 comments · Fixed by #63037
Labels: Bug (broken, incorrect, or confusing behavior), Salt-SSH, severity-medium (3rd level, incorrect or bad functionality, confusing and lacks a workaround)

@lkubb
Contributor

lkubb commented Oct 20, 2021

Description
Running salt-ssh --wipe target-id state.apply does not wipe the Salt execution environment on the minion. After the command finishes, a directory /var/tmp/.root_<hash>_salt/ remains on the target, containing the whole Salt environment:

  • code-checksum
  • ext_version
  • minion
  • py3
  • pyall
  • running_data
  • salt-call
  • supported-versions
  • .thin-gen-py-version
  • version

Setup
I'm running salt-ssh on Qubes OS. I debugged this extensively and, from what I grokked, it is a bug in salt-ssh itself. For the Qubes-specific setup, I temporarily moved /srv/salt/top.sls to top.bak and created a new TemplateVM with qvm-clone debian-11-minimal someid to minimize interference. The following files produce this behavior:

[/srv/salt/top.sls]

base:
  someid:
    - teststate

[/srv/salt/teststate.sls]

packages-installed:
  pkg.installed:
    - names:
      - curl

Steps to Reproduce the behavior

$ sudo qubesctl --skip-dom0 --target=someid state.apply
$ qvm-run --pass-io someid -- ls -lha /var/tmp
total 20K
drwxrwxrwt  5 root root 4.0K Oct 20 12:34 .
drwxr-xr-x  5 root root 4.0K Oct 20 12:34 ..
drwx------  5 root root 4.0K Oct 20 12:34 .root_<hash>_salt
[...]

salt-ssh will be called with the -w option, as seen here. It is also possible to pause the execution once the TemplateVM is started (Ctrl+Z) and to log into the Salt master VM's console by issuing qvm-console-dispvm disp-mgmt-someid. This let me verify the Qubes-specific behavior by manually debugging the execution and running salt-ssh by hand.
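For reference, the debugging flow just described looks roughly like this (a sketch; disp-mgmt-someid is the management dispVM that Qubes spawns for the target):

$ sudo qubesctl --skip-dom0 --target=someid state.apply
# press Ctrl+Z once the management dispVM is up to pause the run
$ qvm-console-dispvm disp-mgmt-someid
# inside that console, salt-ssh can be re-run by hand (with -w) to inspect the wipe handling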

Expected behavior
No leftover .root_<hash>_salt directory after the run.

Versions Report
(ran on disp-mgmt-someid with salt-ssh --versions-report)


Salt Version:
           Salt: 3003.3
 
Dependency Versions:
           cffi: 1.14.1
       cherrypy: Not Installed
       dateutil: 2.8.1
      docker-py: Not Installed
          gitdb: Not Installed
      gitpython: Not Installed
         Jinja2: 2.11.3
        libgit2: Not Installed
       M2Crypto: Not Installed
           Mako: Not Installed
        msgpack: 1.0.0
   msgpack-pure: Not Installed
   mysql-python: Not Installed
      pycparser: 2.20
       pycrypto: Not Installed
   pycryptodome: 3.10.4
         pygit2: Not Installed
         Python: 3.9.7 (default, Aug  30 2021, 00:00:00)
   python-gnupg: Not Installed
         PyYAML: 5.4.1
          PyZMQ: 19.0.0
          smmap: Not Installed
        timelib: Not Installed
        Tornado: 4.5.3
            ZMQ: 4.3.4
 
System Versions:
           dist: fedora 33
         locale: utf-8
        machine: x86_64
        release: 5.12.14-1.fc32.qubes.x86_64
         system: Linux
        version: Fedora 33

Additional context
I tried to trace what happens (from scratch, so this might be quite inaccurate and skip over details) and suspect I found the problem: the opts package that Single builds and ships to the target (opts_pkg) never includes the ssh_wipe setting, so the wipe flag is lost before the cleanup would run.

A simple fix would be to set opts_pkg["ssh_wipe"] = self.opts.get("ssh_wipe", False) here, like this:

@@ -1134,6 +1134,8 @@ class Single:
             opts_pkg["thin_dir"] = self.opts["thin_dir"]
             opts_pkg["master_tops"] = self.opts["master_tops"]
             opts_pkg["__master_opts__"] = self.context["master_opts"]
+            opts_pkg["ssh_wipe"] = self.opts.get("ssh_wipe", False)
+
             if "known_hosts_file" in self.opts:
                 opts_pkg["known_hosts_file"] = self.opts["known_hosts_file"]
             if "_caller_cachedir" in self.opts:

This works in my situation, but I can't rule out knock-on effects elsewhere since I'm not familiar with the architecture as a whole.
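For what it's worth, a minimal manual check of the patched behavior might look like this (a sketch; the roster entry name target is an assumption):

$ salt-ssh -w target state.apply
# with the fix applied, no .<user>_<hash>_salt directory should remain on the target
$ ssh target 'ls -a /var/tmp'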

@lkubb added the Bug (broken, incorrect, or confusing behavior) and needs-triage labels on Oct 20, 2021
@welcome

welcome bot commented Oct 20, 2021

Hi there! Welcome to the Salt Community! Thank you for making your first contribution. We have a lengthy process for issues and PRs. Someone from the Core Team will follow up as soon as possible. In the meantime, here’s some information that may help as you continue your Salt journey.
Please be sure to review our Code of Conduct, and check out some of our community resources.

There are lots of ways to get involved in our community. Every month, there are around a dozen opportunities to meet with other contributors and the Salt Core team and collaborate in real time. The best way to keep track is by subscribing to the Salt Community Events Calendar.
If you have additional questions, email us at [email protected]. We’re glad you’ve joined our community and look forward to doing awesome things with you!

@s-at-ik

s-at-ik commented Jun 21, 2022

Hi,

Just to add that this bug means the salt-ssh option -W / --rand-thin-dir will slowly but surely clutter /var/tmp, as internally it simply sets the wipe option:

        if self.opts.get("rand_thin_dir"):
            self.defaults["thin_dir"] = os.path.join(
                "/var/tmp", ".{}".format(uuid.uuid4().hex[:6])
            )
            self.opts["ssh_wipe"] = "True"

@klausfiend

Can confirm @s-at-ik's observation that /var/tmp gradually fills up when using the -W option (which we do with all of our Salt-SSH runs). For us, this bug also affects Rocky and Ubuntu.
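Until this is fixed, a rough cleanup over the same roster might look like the following (a sketch; -r/--raw-shell runs the command directly over SSH, and the globs are assumptions that should be double-checked against the actual directory names before deleting anything):

# default thin dirs follow the .<user>_<hash>_salt naming
$ salt-ssh '*' -r 'rm -rf /var/tmp/.*_salt'
# with -W the leftovers are random /var/tmp/.<6 hex chars> directories; match them
# conservatively, e.g. by age, to avoid removing unrelated hidden entries
$ salt-ssh '*' -r 'find /var/tmp -maxdepth 1 -name ".??????" -mtime +1 -exec rm -rf {} +'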

@vladaurosh

vladaurosh commented Nov 2, 2022

@s0undt3ch
Would you be able to implement the change suggested by @lkubb (adding opts_pkg["ssh_wipe"] = self.opts.get("ssh_wipe", False))?
It looks to be working pretty well.

@lkubb
Contributor Author

lkubb commented Nov 3, 2022

@vladaurosh If you can confirm this works for you as expected, I'm also willing to come up with a PR + tests, since this can be a very nasty issue in some cases. I remember that my fix used to work nicely, but when I later tried to write a test for a possible PR, it no longer did. Maybe my test was faulty, though.

@vladaurosh

Hi @lkubb
I'll try to test more. I've run salt-ssh several times with wipe set to both true and false, and it worked fine.
What kind of issues did you have in your tests?

@lkubb
Contributor Author

lkubb commented Nov 4, 2022

As far as I remember, the fix did not seem to cause the thin dir to be absent after execution. I did not put much effort in though, so it's likely I made a mistake somewhere, since this small fix worked very nicely before. Which Salt version are you testing with?

@vladaurosh

vladaurosh commented Nov 4, 2022

Hi @lkubb
I just did another quick test.

  1. With ssh_wipe: False, I ran test.ping.
    I saw /var/tmp/.ec2-user_a311e5_salt on the target server after test.ping returned True.
  2. I changed to ssh_wipe: True and ran test.ping again.
    Before test.ping returned True I was watching /var/tmp on the target server and saw .ec2-user_a311e5_salt there, but once salt-ssh returned True, .ec2-user_a311e5_salt had been deleted from the target server.

Not sure what else I can test.
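One more check that might be worth doing (a sketch; the minion id target and direct SSH access to it are assumptions) is to watch /var/tmp from a second shell while the job runs, and to repeat the run with state.apply, since that is what the original report used:

# shell 1: watch the thin dir on the target
$ ssh target "watch -n1 'ls -a /var/tmp'"
# shell 2: with ssh_wipe: True the .<user>_<hash>_salt directory should appear during
# the run and be gone once the command returns
$ salt-ssh target test.ping
$ salt-ssh target state.apply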

As for version, I am using salt-ssh 3004.1

marmarek added a commit to QubesOS/qubes-core-admin that referenced this issue Nov 15, 2022