
salt.states.service fails to recognize init.d/sysv services on systemd systems #11900


Closed
uvsmtid opened this issue Apr 10, 2014 · 13 comments
Labels
Bug (broken, incorrect, or confusing behavior), Execution-Module, P3 (Priority 3), Platform (relates to OS, containers, platform-based utilities like FS, system-based apps), severity-medium (3rd level, incorrect or bad functionality, confusing and lacks a work around), stale

Comments

@uvsmtid
Contributor

uvsmtid commented Apr 10, 2014

Problem/Example

This is a simple Salt state to enable and start jenkins service:

# jenkins.sls
activate_jenkins_service:
    service.running:
        - name: jenkins
        - enable: True

Official Jenkins installation on RedHat/CentOS/Fedora uses init.d/sysv scripts.

Manually enabling and starting the service through init.d/sysv scripts works perfectly, even on systemd-based Fedora 20:

systemctl enable jenkins                                                                                                                                                                             
jenkins.service is not a native service, redirecting to /sbin/chkconfig.
Executing /sbin/chkconfig jenkins on

systemctl start jenkins

On the other hand, Salt fails to execute the state:

salt-call -l all state.sls jenkins
...
          ID: activate_jenkins_service
    Function: service.running
        Name: jenkins
      Result: False
     Comment: The named service jenkins is not available
     Changes:   
...

Cause

The problem stems from the fact that Salt executes the systemctl list-unit-files command, which lists only systemd unit files and excludes init.d/sysv scripts:

...
[INFO    ] Executing state service.running for jenkins
[INFO    ] Executing command 'systemctl --full list-unit-files | col -b' in directory '/root'
...

Because Salt doesn't see the required jenkins service in the list of unit files, it never passes the subsequent enable/start/... commands to systemctl, and it doesn't let systemctl report "authoritatively" on the actual existence of the service.
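The failing pre-validation can be illustrated with a small sketch (an illustration of the behavior only, not Salt's actual code; the sample output and function names are hypothetical):

```python
def parse_unit_files(systemctl_output):
    """Extract service names from `systemctl --full list-unit-files` output."""
    units = set()
    for line in systemctl_output.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0].endswith('.service'):
            units.add(parts[0][:-len('.service')])
    return units

def service_available(name, systemctl_output):
    """Mimic the failing check: a service 'exists' only if systemd lists
    a unit file for it. Sysv init scripts never appear in this list."""
    return name in parse_unit_files(systemctl_output)

# Only native units show up in list-unit-files; the sysv-managed jenkins
# service is absent even though `systemctl start jenkins` works fine.
sample = """UNIT FILE                STATE
sshd.service             enabled
crond.service            enabled"""

print(service_available('sshd', sample))     # True
print(service_available('jenkins', sample))  # False
```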

Proposal

This issue is very closely related to issue #8444 (as far as the proposed solution is concerned) and is described in this comment.

Rather than executing any pre-validation logic (i.e. looking the service name up somewhere), Salt should rely on systemd (and its systemctl command) to determine whether a state to enable/start/... the service failed or succeeded. In other words, Salt should optimistically execute systemctl with the given service name and report the result of that execution instead of trying to predict its outcome.
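The proposed "optimistic" approach could be sketched like this (a sketch only, not Salt's API; the function names and the injectable runner are assumptions made for illustration):

```python
import subprocess

def _real_run(argv):
    """Execute a command and return (exit_code, stderr)."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    return proc.returncode, proc.stderr

def apply_service_state(action, service, run_cmd=_real_run):
    """Optimistically run `systemctl <action> <service>` and report the
    outcome based solely on the exit code, instead of pre-validating the
    name against list-unit-files. Returns (succeeded, message)."""
    code, err = run_cmd(['systemctl', action, service])
    if code == 0:
        return True, '%s %s: OK' % (action, service)
    # systemd itself tells us authoritatively that the service failed.
    return False, 'systemctl %s %s failed (rc=%d): %s' % (
        action, service, code, err.strip())

# Fake runner standing in for a host where jenkins is a sysv script:
# systemd redirects enable to chkconfig, so the state succeeds.
def fake_runner(argv):
    if argv[2] == 'jenkins':
        return 0, ''
    return 5, 'Failed to start nosuch.service: Unit nosuch.service not found.'

print(apply_service_state('enable', 'jenkins', fake_runner))
print(apply_service_state('start', 'nosuch', fake_runner)[0])  # False
```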

Workaround

Again, see it in issue #8444.

Versions

Master and minion are the same host with Fedora 20 x86_64:

 salt --versions-report
           Salt: 2014.1.1
         Python: 2.7.5 (default, Feb 19 2014, 13:47:28)
         Jinja2: 2.7.1
       M2Crypto: 0.21.1
 msgpack-python: 0.1.13
   msgpack-pure: Not Installed
       pycrypto: 2.6.1
         PyYAML: 3.10
          PyZMQ: 13.0.2
            ZMQ: 3.2.4
@cachedout
Contributor

I agree with what you're saying here. Let's try and get this in.

@cachedout
Contributor

@mtorromeo says this should be fixed by #11921. @uvsmtid can you verify?

@uvsmtid
Contributor Author

uvsmtid commented Apr 16, 2014

@cachedout and @mtorromeo Thanks for the updates!

I cherry-picked both 90bece1 and 9617d33 on top of 2014.1 (latest develop had some unrelated issues) in my virtualenv.

#8444 looks fixed

I used a state similar to the one mentioned there in its example:

activate_vpn_service:
    service.running:
        - name: [email protected]
        - enable: True

Indeed, commit 9617d33 handles @ in systemd unit names to make it work.
And while it still uses the systemctl --full list-units command (see the problems for init.d/sysv services next), parameterized services were listed in all my attempts.

#11900 (this issue) still has problems

See the example of the Jenkins service state at the beginning of this issue.
After trying variations of enable/disable and start/stop, I can conclude that it doesn't work in the general case. And here is why...

The code after commit 90bece1 still uses the command systemctl --full list-units, which simply does not list init.d/sysv services until they are started on the system (only while they are running; enable/disable won't affect anything).
For example, start jenkins service manually and try to list it:

sudo systemctl start jenkins
systemctl --full list-units | grep jenkins
jenkins.service
# OK

Then stop jenkins service manually and execute:

sudo systemctl stop jenkins
systemctl --full list-units | grep jenkins
# ERROR: no output captured by grep

Although this seems more like an issue with systemd (I have even updated it here), the fastest fix is still possible through Salt alone. The argument is that systemctl --full list-units is not required to manage a service.
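Since list-units cannot be trusted for stopped sysv services, another way to detect them is to scan the init script directory directly. A minimal sketch of that idea (not Salt's implementation; the function names and the assumption of a conventional /etc/init.d layout are mine):

```python
import os

def sysv_services(initd_dir='/etc/init.d'):
    """List sysv services by scanning the init script directory directly,
    since `systemctl --full list-units` omits stopped sysv services."""
    if not os.path.isdir(initd_dir):
        return set()
    return {name for name in os.listdir(initd_dir)
            if os.path.isfile(os.path.join(initd_dir, name))
            and os.access(os.path.join(initd_dir, name), os.X_OK)}

def service_available(name, systemd_units, initd_dir='/etc/init.d'):
    """A service is available if systemd knows a unit for it OR an
    executable init script of that name exists on disk."""
    return name in systemd_units or name in sysv_services(initd_dir)
```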

@cachedout
Contributor

@uvsmtid This is great feedback, thank you! I'll go ahead and close #8444 then and we'll keep working on this one.

@smithjm

smithjm commented Aug 20, 2014

This is still broken in 2014.1.10 on Fedora-20. While there is a kludgy workaround, this really does need to be fixed. The workaround, for those in a CI/CD environment who need to clear out any blocks in their pipeline, is ugly but works (this example is for Centrify, which also uses sysv init-style files but is manageable with systemd under FC20):

centrify-service:
  service.running:
    - name: centrifydc
    - enable: True
    - reload: True
    - watch:
      - file: /etc/centrifydc/centrifydc.conf
    - require:
      - pkg: centrify-packages
      - file: centrify-config
      - cmd: centrify-adjoin
{%- if salt['grains.get']('osfinger', 'undefined') == 'Fedora-20' %}
    - provider: service
{%- endif %}

@LordFPL

LordFPL commented Apr 3, 2015

Hello,

A little update on a strange behavior:

salt-call service.available registrator.service
[INFO    ] Executing command 'systemctl --all --full --no-legend --no-pager list-units | col -b' in directory '/root'
[INFO    ] Executing command 'systemctl --full --no-legend --no-pager list-unit-files | col -b' in directory '/root'
[INFO    ] Legacy init script: "README".
[INFO    ] Legacy init script: "functions".
[INFO    ] Legacy init script: "netconsole".
[INFO    ] Legacy init script: "network".
local:
    False

But :

salt-call service.available registrator
[INFO    ] Executing command 'systemctl --all --full --no-legend --no-pager list-units | col -b' in directory '/root'
[INFO    ] Executing command 'systemctl --full --no-legend --no-pager list-unit-files | col -b' in directory '/root'
[INFO    ] Legacy init script: "README".
[INFO    ] Legacy init script: "functions".
[INFO    ] Legacy init script: "netconsole".
[INFO    ] Legacy init script: "network".
local:
    True

Why isn't the ".service" suffix supported? With systemd both forms work :/

(and it gave me quite a headache to track this down...)
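The asymmetry above suggests the availability check compares raw names; normalizing the ".service" suffix before comparing would make both forms behave the same. A hypothetical sketch (not Salt's code; names are for illustration):

```python
def canonical(name):
    """Treat 'foo' and 'foo.service' as the same unit, the way systemctl
    itself does for the default .service unit type."""
    return name[:-len('.service')] if name.endswith('.service') else name

def service_available(name, known_units):
    """Compare canonicalized names so 'registrator' and
    'registrator.service' give the same answer."""
    known = {canonical(u) for u in known_units}
    return canonical(name) in known

units = ['registrator', 'sshd.service']
print(service_available('registrator.service', units))  # True
print(service_available('registrator', units))          # True
print(service_available('sshd', units))                 # True
```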

@jfindlay jfindlay added the Platform Relates to OS, containers, platform-based utilities like FS, system based apps label May 26, 2015
@blbradley
Contributor

This happens to me when using hadoop-formula's hadoop.hdfs state. It starts three different services. The first service started by the highstate during a fresh run is not found. The rest of the services are found and function as normal. A second highstate run proceeds normally. This possibly indicates that Salt is reloading systemd later in the process than needed.

State:

{% if hdfs.is_namenode or hdfs.is_datanode %}
hdfs-services:
  service.running:
    - enable: True
    - names:
{% if hdfs.is_namenode %}
      - hadoop-secondarynamenode
      - hadoop-namenode
{% endif %}
{% if hdfs.is_datanode %}
      - hadoop-datanode
{% endif %}

{% endif %}

I also extend hdfs-services with provider: debian_service. I've tried it with the default for Debian Jessie (provider: systemd) with the same results.

/var/log/salt/minion:

[INFO    ] Executing command 'service hadoop-namenode status' in directory '/root'
[ERROR   ] Command 'service hadoop-namenode status' failed with return code: 3
[ERROR   ] output: * hadoop-namenode.service
   Loaded: not-found (Reason: No such file or directory)
   Active: inactive (dead)
[INFO    ] Executing command 'service hadoop-namenode start' in directory '/root'
[ERROR   ] Command 'service hadoop-namenode start' failed with return code: 6
[ERROR   ] output: Failed to start hadoop-namenode.service: Unit hadoop-namenode.service failed to load: No such file or directory.

Versions report:

                  Salt: 2015.5.0
                Python: 2.7.9 (default, Mar  1 2015, 12:57:24)
                Jinja2: 2.7.3
              M2Crypto: 0.21.1
        msgpack-python: 0.4.2
          msgpack-pure: Not Installed
              pycrypto: 2.6.1
               libnacl: Not Installed
                PyYAML: 3.11
                 ioflo: Not Installed
                 PyZMQ: 14.4.0
                  RAET: Not Installed
                   ZMQ: 4.0.5
                  Mako: 1.0.0
 Debian source package: 2015.5.0+ds-1~bpo8+1

I would like to debug this further but haven't debugged Salt much since I switched from Salt SSH to Master/Minion setup. Suggestions?

@jfindlay jfindlay added severity-medium 3rd level, incorrect or bad functionality, confusing and lacks a work around P3 Priority 3 and removed severity-low 4th level, cosmetic problems, work around exists labels Jul 28, 2015
@thequailman

Running CentOS 7, Salt version 2015.8.8.2. Cassandra is affected by this as well. As a workaround, this kludge works:

cassandra_kludge:
  cmd.run:
    - name: systemctl enable cassandra
    - unless: systemctl -a | grep cassandra

cassandra_service:
  service.running:
    - name: cassandra
    - init_delay: 10
    - require:
        - cmd: cassandra_kludge

@uvsmtid
Contributor Author

uvsmtid commented Apr 28, 2016

This even made me update the bug in systemd again.

My tests still confirm that systemd provides no known way to list disabled services based on init.d/sysv scripts. The best current solution would be enabling/starting/stopping/disabling the service and checking the error code returned by systemctl: it will fail if there is no such service, but succeed if there is one, without needing to know about it upfront.

@Talkless
Contributor

I have discovered a somewhat similar problem on Debian Jessie when I deploy a new sysv script and try to use the service.running state. I get:

2016-11-28 13:59:10,206 [salt.state       ][INFO    ][1092] Running state [pgbouncer-web-login] at time 13:59:10.205771
2016-11-28 13:59:10,207 [salt.state       ][INFO    ][1092] Executing state service.running for pgbouncer-web-login
2016-11-28 13:59:10,209 [salt.loaded.int.module.cmdmod][INFO    ][1092] Executing command ['systemctl', 'status', 'pgbouncer-web-login.service', '-n', '0'] in directory '/root'
2016-11-28 13:59:10,229 [salt.loaded.int.module.cmdmod][DEBUG   ][1092] output: * pgbouncer-web-login.service
   Loaded: not-found (Reason: No such file or directory)
   Active: inactive (dead)
2016-11-28 13:59:10,230 [salt.state       ][ERROR   ][1092] The named service pgbouncer-web-login is not available

The whole idea is to create an /etc/init.d/pgbouncer-web-login daemon, which is a modified copy of /etc/init.d/pgbouncer (pgbouncer does not yet support systemd) with different ports, configs, etc., because of the need to have multiple pgbouncer pools, but those are details.

I had no problem on Wheezy, but on Jessie with systemd it seems I have to execute systemctl daemon-reload (using module.wait -> cmd.run) to make the new init.d script "visible" and service.running work.

But does that mean that service.running should always reload the systemd configuration? Would that be... "bad" in any case?
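One conservative answer to the daemon-reload question can be sketched as: reload only when systemd claims "not-found" for a name that does have an init script on disk. This is only an illustration of the idea (the function and the status-string heuristic are hypothetical, not Salt behavior):

```python
import os

def needs_daemon_reload(status_output, name, initd_dir='/etc/init.d'):
    """Reload systemd only when it reports the unit as not-found even
    though an init.d script for it exists on disk (i.e. a freshly
    deployed sysv script systemd has not yet picked up), rather than
    paying for an unconditional daemon-reload on every state run."""
    not_found = 'Loaded: not-found' in status_output
    has_script = os.path.exists(os.path.join(initd_dir, name))
    return not_found and has_script

# Status text shaped like the `systemctl status` output in the log above.
status = """* pgbouncer-web-login.service
   Loaded: not-found (Reason: No such file or directory)
   Active: inactive (dead)"""
```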

@seanjnkns
Contributor

Still see this same issue:
Salt Version:
Salt: 2016.3.5

Dependency Versions:
cffi: 0.8.6
cherrypy: Not Installed
dateutil: 1.5
gitdb: Not Installed
gitpython: Not Installed
ioflo: Not Installed
Jinja2: 2.7.2
libgit2: Not Installed
libnacl: Not Installed
M2Crypto: 0.21.1
Mako: 0.8.1
msgpack-pure: Not Installed
msgpack-python: 0.4.8
mysql-python: 1.2.5
pycparser: 2.14
pycrypto: 2.6.1
pygit2: Not Installed
Python: 2.7.5 (default, Sep 15 2016, 22:37:39)
python-gnupg: Not Installed
PyYAML: 3.11
PyZMQ: 15.3.0
RAET: Not Installed
smmap: Not Installed
timelib: Not Installed
Tornado: 4.2.1
ZMQ: 4.1.4

System Versions:
dist: centos 7.2.1511 Core
machine: x86_64
release: 4.4.52-2.el7.centos.x86_64
system: Linux
version: CentOS Linux 7.2.1511 Core

Using a very simple file.managed + service.running/enable

vxlan SysV service file:

/etc/init.d/vxlan:
  file.managed:
    - source: salt://services/vxlan/vxlan
    - user: root
    - group: root
    - mode: 755
    - require_in:
      - service: vxlan

vxlan:
  service.running:
    - enable: True

If I run chkconfig --add vxlan and then re-run these states, there's no problem. BTW, this appears to be a regression, as I don't recall having this issue in 2016.3.4. I haven't tested 2016.11.3, which came out today, as we're not quite ready to move to that yet. Although, I'm inclined to just change this to a systemd service, given I have full control over this one, regardless of the bug in salt.

@devopsprosiva

I ran into the same issue with the cassandra init service on CentOS 7. @gtmanfred suggested using the provider option for service.running, which fixed the issue for me.

https://docs.saltstack.com/en/latest/ref/states/providers.html

start cassandra:
  service.running:
    - name: cassandra
    - provider: rh_service

@stale

stale bot commented May 8, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.

@stale stale bot added the stale label May 8, 2019
@stale stale bot closed this as completed May 15, 2019