
uv run not stopping ray serve with ctrl c #10952


Closed
ds-kylance opened this issue Jan 25, 2025 · 6 comments · Fixed by #11009
Labels
bug Something isn't working

Comments

@ds-kylance

Summary

Hello, I'm trying to use uv in conjunction with Ray locally. When I start a Ray Serve application through uv, I'm unable to stop the program with Ctrl-C as I normally would.

Simple example

# /// script
# dependencies = [
#   "transformers",
#   "torch",
#   "ray[serve]",
#   "starlette",
# ]
# ///

import ray
from ray import serve
from starlette.requests import Request
from transformers import pipeline


@serve.deployment(num_replicas=4)
class Translator:
    def __init__(self):
        # Load model
        self.model = pipeline("translation_en_to_fr", model="t5-small")

    def translate(self, text: str) -> str:
        # Run inference
        model_output = self.model(text)

        # Post-process output to return only the translation text
        translation = model_output[0]["translation_text"]

        return translation

    async def __call__(self, http_request: Request) -> str:
        english_text: str = await http_request.json()
        return self.translate(english_text)


translator_app = Translator.bind()

uv run -- serve run ray_serve:translator_app

Pressing Ctrl-C here does nothing.

serve run ray_serve:translator_app

Run this way, without uv, Ctrl-C shuts down the server as normal.

Ray version: 2.41.0

Platform

macOS Darwin 23.5.0 arm64

Version

0.5.24

Python version

Python 3.12.7

ds-kylance added the bug label Jan 25, 2025
@ericmarkmartin
Contributor

I was unable to reproduce this; Ctrl-C shut down the server for me:

❯ uv run -- serve run ray_serve:translator_app
2025-01-25 18:19:50,384 INFO scripts.py:494 -- Running import path: 'ray_serve:translator_app'.
2025-01-25 18:19:53,885 INFO worker.py:1832 -- Started a local Ray instance. View the dashboard at http://127.0.0.1:8265 
(ProxyActor pid=4327) INFO 2025-01-25 18:19:55,424 proxy 172.23.90.18 -- Proxy starting on node 12a56510c1f603fdc1009842a74b55df8e107c0d92ec226aef78d459 (HTTP port: 8000).
INFO 2025-01-25 18:19:55,477 serve 3970 -- Started Serve in namespace "serve".
INFO 2025-01-25 18:19:55,478 serve 3970 -- Connecting to existing Serve app in namespace "serve". New http options will not be applied.
(ProxyActor pid=4327) INFO 2025-01-25 18:19:55,470 proxy 172.23.90.18 -- Got updated endpoints: {}.
(ServeController pid=4330) INFO 2025-01-25 18:19:55,573 controller 4330 -- Deploying new version of Deployment(name='Translator', app='default') (initial target replicas: 4).
(ProxyActor pid=4327) INFO 2025-01-25 18:19:55,575 proxy 172.23.90.18 -- Got updated endpoints: {Deployment(name='Translator', app='default'): EndpointInfo(route='/', app_is_cross_language=False)}.
(ServeController pid=4330) INFO 2025-01-25 18:19:55,675 controller 4330 -- Adding 4 replicas to Deployment(name='Translator', app='default').
(ServeReplica:default:Translator pid=4328) Device set to use cpu
INFO 2025-01-25 18:19:59,594 serve 3970 -- Application 'default' is ready at http://127.0.0.1:8000/.
INFO 2025-01-25 18:19:59,594 serve 3970 -- Deployed app 'default' successfully.
^CWARNING 2025-01-25 18:20:16,487 serve 3970 -- Got KeyboardInterrupt, exiting...
2025-01-25 18:20:16,487 INFO scripts.py:580 -- Got KeyboardInterrupt, shutting down...
(ServeController pid=4330) INFO 2025-01-25 18:20:16,584 controller 4330 -- Removing 4 replicas from Deployment(name='Translator', app='default').
(ServeReplica:default:Translator pid=4334) Device set to use cpu [repeated 3x across cluster] (Ray deduplicates logs by default. Set RAY_DEDUP_LOGS=0 to disable log deduplication, or see https://docs.ray.io/en/master/ray-observability/user-guides/configure-logging.html#log-deduplication for more options.)
(ServeController pid=4330) INFO 2025-01-25 18:20:18,598 controller 4330 -- Replica(id='vcnq1kqr', deployment='Translator', app='default') is stopped.
(ServeController pid=4330) INFO 2025-01-25 18:20:18,599 controller 4330 -- Replica(id='jq9626ji', deployment='Translator', app='default') is stopped.
(ServeController pid=4330) INFO 2025-01-25 18:20:18,599 controller 4330 -- Replica(id='q9w3cboh', deployment='Translator', app='default') is stopped.
(ServeController pid=4330) INFO 2025-01-25 18:20:18,600 controller 4330 -- Replica(id='i6778k4r', deployment='Translator', app='default') is stopped.

@zanieb
Member

zanieb commented Jan 26, 2025

I can reproduce

❯ uv venv && uv export --script example.py | uv pip install -r -
❯ uv run -- serve run example:translator_app
2025-01-26 10:41:20,765	INFO scripts.py:494 -- Running import path: 'example:translator_app'.
2025-01-26 10:41:22,824	INFO worker.py:1832 -- Started a local Ray instance. View the dashboard at http://127.0.0.1:8265 
(ProxyActor pid=48202) INFO 2025-01-26 10:41:23,705 proxy 127.0.0.1 -- Proxy starting on node 424fb7191c53a76ae27c105419e4b553cae9045dee1dca40ee49f330 (HTTP port: 8000).
INFO 2025-01-26 10:41:23,759 serve 48167 -- Started Serve in namespace "serve".
INFO 2025-01-26 10:41:23,759 serve 48167 -- Connecting to existing Serve app in namespace "serve". New http options will not be applied.
(ServeController pid=48198) INFO 2025-01-26 10:41:23,798 controller 48198 -- Deploying new version of Deployment(name='Translator', app='default') (initial target replicas: 4).
(ProxyActor pid=48202) INFO 2025-01-26 10:41:23,756 proxy 127.0.0.1 -- Got updated endpoints: {}.
(ProxyActor pid=48202) INFO 2025-01-26 10:41:23,798 proxy 127.0.0.1 -- Got updated endpoints: {Deployment(name='Translator', app='default'): EndpointInfo(route='/', app_is_cross_language=False)}.
(ServeController pid=48198) INFO 2025-01-26 10:41:23,899 controller 48198 -- Adding 4 replicas to Deployment(name='Translator', app='default').
(ServeReplica:default:Translator pid=48203) Device set to use mps:0
INFO 2025-01-26 10:41:36,964 serve 48167 -- Application 'default' is ready at http://127.0.0.1:8000/.
INFO 2025-01-26 10:41:36,966 serve 48167 -- Deployed app 'default' successfully.
^C^C

@ericmarkmartin are you on macOS?

@ericmarkmartin
Contributor

> @ericmarkmartin are you on macOS?

No, WSL. Sorry, should have mentioned up front

@Vethorm

Vethorm commented Jan 27, 2025

Same poster as the OP (personal account).
I was unable to reproduce this on WSL either. I assume it is specific to macOS and potentially to ARM chips, but I don't have a non-Apple-silicon laptop to check.

@zanieb
Member

zanieb commented Jan 27, 2025

Doing some debugging...

The process tree looks like

\-+= 98153 zb uv run -- serve run example:translator_app
  \-+= 98156 zb /Users/zb/workspace/uv/.venv/bin/python3 /Users/zb/workspace/uv/.venv/bin/serve run example:translator_app
    |--- 98170 zb /Users/zb/workspace/uv/.venv/bin/python3 -u /Users/zb/workspace/uv/.venv/lib/python3.12/site-packages/ray/_private/ray_process_reaper.py
    |--- 98171 zb /Users/zb/workspace/uv/.venv/lib/python3.12/site-packages/ray/core/src/ray/gcs/gcs_server --log_dir=/tmp/ray/session_2025-01-27_08-56-13_690459_98156/logs --config_list=eyJvYmplY3Rfc3BpbGxpbmdfY29uZmlnIjogIntcInR5cGVcIjogXCJmaWx
    |--- 98172 zb /Users/zb/workspace/uv/.venv/bin/python3 -u /Users/zb/workspace/uv/.venv/lib/python3.12/site-packages/ray/autoscaler/_private/monitor.py --logs-dir=/tmp/ray/session_2025-01-27_08-56-13_690459_98156/logs --logging-rotate-bytes=53
    |--- 98173 zb /Users/zb/workspace/uv/.venv/bin/python3 /Users/zb/workspace/uv/.venv/lib/python3.12/site-packages/ray/dashboard/dashboard.py --host=127.0.0.1 --port=8265 --port-retries=50 --temp-dir=/tmp/ray --log-dir=/tmp/ray/session_2025-01-
    |-+- 98180 zb /Users/zb/workspace/uv/.venv/lib/python3.12/site-packages/ray/core/src/ray/raylet/raylet --raylet_socket_name=/tmp/ray/session_2025-01-27_08-56-13_690459_98156/sockets/raylet --store_socket_name=/tmp/ray/session_2025-01-27_08-56
    | |--- 98182 zb /Users/zb/workspace/uv/.venv/bin/python3 -u /Users/zb/workspace/uv/.venv/lib/python3.12/site-packages/ray/dashboard/agent.py --node-ip-address=127.0.0.1 --metrics-export-port=60895 --dashboard-agent-port=64528 --listen-port=52
    | |--- 98183 zb /Users/zb/workspace/uv/.venv/bin/python3 -u /Users/zb/workspace/uv/.venv/lib/python3.12/site-packages/ray/_private/runtime_env/agent/main.py --node-ip-address=127.0.0.1 --runtime-env-agent-port=62780 --gcs-address=127.0.0.1:64
    | |--- 98184 zb ray::ServeReplica:default:Translator                    
    | |--- 98185 zb ray::IDLE                    
    | |--- 98186 zb ray::ProxyActor                    
    | |--- 98187 zb ray::ServeReplica:default:Translator                    
    | |--- 98188 zb ray::ServeReplica:default:Translator                    
    | |--- 98189 zb ray::IDLE                    
    | |--- 98190 zb ray::ServeReplica:default:Translator                    
    | |--- 98191 zb ray::ServeController.listen_for_change                    
    | |--- 98192 zb ray::IDLE                    
    | |--- 98193 zb ray::IDLE                    
    | |--- 98194 zb ray::IDLE                    
    | |--- 98195 zb ray::IDLE                    
    | |--- 98196 zb ray::IDLE                    
    | |--- 98197 zb ray::IDLE                    
    | |--- 98198 zb ray::IDLE                    
    | \--- 98199 zb ray::IDLE                    
    \--- 98181 zb /Users/zb/workspace/uv/.venv/bin/python3 -u /Users/zb/workspace/uv/.venv/lib/python3.12/site-packages/ray/_private/log_monitor.py --session-dir=/tmp/ray/session_2025-01-27_08-56-13_690459_98156 --logs-dir=/tmp/ray/session_2025-0
❯ kill -2 98153
no effect
❯ kill -2 98156
shuts down the server

Since a SIGINT to the immediate child works, I'm surprised that Ctrl-C is not working as intended. My understanding is that a shell usually sends a SIGINT to children.
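
For context on the point above: on POSIX systems, Ctrl-C makes the terminal deliver SIGINT to every process in the terminal's foreground process group, rather than the shell signalling each child individually. A minimal sketch for checking which group that is; `foreground_pgid` is a hypothetical helper name used only for illustration, not part of uv or Ray.

```python
import os
from typing import Optional


def foreground_pgid() -> Optional[int]:
    """Return the foreground process group of the controlling terminal, if any."""
    try:
        fd = os.open("/dev/tty", os.O_RDONLY)
    except OSError:
        return None  # no controlling terminal (for example, under CI)
    try:
        return os.tcgetpgrp(fd)
    except OSError:
        return None
    finally:
        os.close(fd)


if __name__ == "__main__":
    # A process only receives the Ctrl-C SIGINT if these two values match.
    print("own process group:        ", os.getpgrp())
    print("terminal foreground group:", foreground_pgid())
```

Run from an interactive shell, the two numbers should normally be equal for a foreground job; a process that has moved itself into another group would report a different value for its own group.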

@zanieb
Member

zanieb commented Jan 27, 2025

Ah, interesting: Ray is creating a new process group. For comparison, a plain subprocess stays in its parent's group:

❯ uv run python -c 'import subprocess; subprocess.run(["sleep", "100"])'
...
❯ ps j 08753
USER   PID  PPID  PGID   SESS JOBC STAT   TT       TIME COMMAND
zb    8753  8750  8750      0    1 S<+  s023    0:00.02 /Users/zb/workspace/.venv/bin/python3 -c import subprocess; subproce
❯ ps j 08754
USER   PID  PPID  PGID   SESS JOBC STAT   TT       TIME COMMAND
zb    8754  8753  8750      0    1 S<+  s023    0:00.00 sleep 100
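
The same check as the `ps j` output above can be sketched from Python: a child started with `subprocess` and no special options inherits its parent's process group. `child_shares_group` is a hypothetical helper name; the sketch assumes a POSIX system with a `sleep` binary on `PATH`.

```python
import os
import subprocess


def child_shares_group() -> bool:
    """Spawn `sleep` and report whether it stays in our process group."""
    child = subprocess.Popen(["sleep", "100"])
    try:
        # The PGID is inherited at fork time, so it can be read immediately.
        return os.getpgid(child.pid) == os.getpgrp()
    finally:
        child.kill()
        child.wait()


if __name__ == "__main__":
    print("child shares our process group:", child_shares_group())
```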

❯ uv run -- serve run example:translator_app
...
❯ pstree 07972
-+= 07972 zb /Users/zb/workspace/uv/target/debug/uv run -- serve run example:translator_app
 \-+= 07973 zb /Users/zb/workspace/uv/.venv/bin/python3 /Users/zb/workspace/uv/.venv/bin/serve run example:translator_app
   |--- 07984 zb /Users/zb/workspace/uv/.venv/bin/python3 -u /Users/zb/workspace/uv/.venv/lib/python3.12/site-packages/ray/_
   |--- 07985 zb /Users/zb/workspace/uv/.venv/lib/python3.12/site-packages/ray/core/src/ray/gcs/gcs_server --log_dir=/tmp/ra
   |--- 07986 zb /Users/zb/workspace/uv/.venv/bin/python3 -u /Users/zb/workspace/uv/.venv/lib/python3.12/site-packages/ray/a
   |--- 07987 zb /Users/zb/workspace/uv/.venv/bin/python3 /Users/zb/workspace/uv/.venv/lib/python3.12/site-packages/ray/dash
   |-+- 07989 zb /Users/zb/workspace/uv/.venv/lib/python3.12/site-packages/ray/core/src/ray/raylet/raylet --raylet_socket_na
   | |--- 07991 zb /Users/zb/workspace/uv/.venv/bin/python3 -u /Users/zb/workspace/uv/.venv/lib/python3.12/site-packages/ray
   | |--- 07992 zb /Users/zb/workspace/uv/.venv/bin/python3 -u /Users/zb/workspace/uv/.venv/lib/python3.12/site-packages/ray
   | |--- 07993 zb ray::IDLE                    
   | |--- 07994 zb ray::IDLE                    
   | |--- 07995 zb ray::IDLE                    
   | |--- 07996 zb ray::IDLE                    
   | |--- 07997 zb ray::ServeReplica:default:Translator                    
   | |--- 07998 zb ray::IDLE                    
   | |--- 07999 zb ray::IDLE                    
   | |--- 08000 zb ray::ServeReplica:default:Translator                    
   | |--- 08001 zb ray::ServeReplica:default:Translator                    
   | |--- 08002 zb ray::IDLE                    
   | |--- 08003 zb ray::ServeReplica:default:Translator                    
   | |--- 08004 zb ray::IDLE                    
   | |--- 08005 zb ray::ServeController                    
   | |--- 08006 zb ray::IDLE                    
   | |--- 08007 zb ray::ProxyActor                    
   | \--- 08008 zb ray::IDLE                    
   \--- 07990 zb /Users/zb/workspace/uv/.venv/bin/python3 -u /Users/zb/workspace/uv/.venv/lib/python3.12/site-packages/ray/_
❯ ps j 07972
USER   PID  PPID  PGID   SESS JOBC STAT   TT       TIME COMMAND
zb    7972 91593  7972      0    1 S<+  s011    0:00.02 /Users/zb/workspace/uv/target/debug/uv run -- serve run example:tran
❯ ps j 07973
USER   PID  PPID  PGID   SESS JOBC STAT   TT       TIME COMMAND
zb    7973  7972  7973      0    1 S<   s011    0:01.77 /Users/zb/workspace/uv/.venv/bin/python3 /Users/zb/workspace/uv/.ven
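
The behaviour in the two transcripts can be reproduced without uv or Ray. This is a rough sketch under stated assumptions, not how either tool is implemented: `WRAPPER` stands in for `uv run` and ignores SIGINT (modelling the observation that `kill -2` on the uv process had no effect), `WORKER` stands in for `serve` and optionally calls `os.setpgrp()` to move into its own process group, and `os.killpg` plays the role of Ctrl-C. All names here are hypothetical, and the code is POSIX-only.

```python
import os
import signal
import subprocess
import sys

# Stand-in for the server: optionally leaves its parent's process group,
# reports its PID, then waits until it is interrupted.
WORKER = """
import os, signal, sys, time
signal.signal(signal.SIGINT, signal.default_int_handler)
if sys.argv[1] == "detach":
    os.setpgrp()
try:
    print(os.getpid(), flush=True)
    time.sleep(30)
except KeyboardInterrupt:
    pass
"""

# Stand-in for the wrapper: ignores SIGINT itself and waits for the worker.
WRAPPER = """
import signal, subprocess, sys
signal.signal(signal.SIGINT, signal.SIG_IGN)
subprocess.run([sys.executable, "-c", sys.argv[1], sys.argv[2]])
"""


def group_sigint_stops_worker(detach: bool, timeout: float = 2.0) -> bool:
    """Signal the wrapper's process group and report whether everything exits."""
    mode = "detach" if detach else "stay"
    wrapper = subprocess.Popen(
        [sys.executable, "-c", WRAPPER, WORKER, mode],
        stdout=subprocess.PIPE,
        text=True,
        start_new_session=True,  # fresh group, so the signal cannot hit this process
    )
    try:
        worker_pid = int(wrapper.stdout.readline())
        os.killpg(wrapper.pid, signal.SIGINT)  # group-wide, like Ctrl-C
        try:
            wrapper.wait(timeout=timeout)
            return True
        except subprocess.TimeoutExpired:
            # Direct signal to the worker, like `kill -2 <child pid>`.
            os.kill(worker_pid, signal.SIGINT)
            wrapper.wait()
            return False
    finally:
        wrapper.stdout.close()


if __name__ == "__main__":
    print("worker stays in group: stopped =", group_sigint_stops_worker(False))
    print("worker detaches:       stopped =", group_sigint_stops_worker(True))
```

When the worker stays in the wrapper's group, the group-wide SIGINT reaches it and both processes exit. When it detaches, the signal never arrives and only the direct `os.kill` on the worker's PID stops it, which matches the `kill -2` results earlier in the thread.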

4 participants