Skip to content

os.link(..., follow_symlinks=True) broken on Linux #81793

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
jo-he mannequin opened this issue Jul 17, 2019 · 6 comments
Closed

os.link(..., follow_symlinks=True) broken on Linux #81793

jo-he mannequin opened this issue Jul 17, 2019 · 6 comments
Assignees
Labels
extension-modules C modules in the Modules dir type-bug An unexpected behavior, bug, or error

Comments

@jo-he
Copy link
Mannequin

jo-he mannequin commented Jul 17, 2019

BPO 37612
Nosy @larryhastings, @takluyver, @serhiy-storchaka, @eryksun, @jo-he
PRs
  • bpo-37612: always call linkat() from os.link(), if available #14843
  • gh-81793: always call linkat() from os.link(), if available #24997
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2019-07-17.23:04:55.849>
    labels = ['extension-modules', 'type-bug', '3.8', '3.9', '3.10']
    title = 'os.link(..., follow_symlinks=True) broken on Linux'
    updated_at = <Date 2021-03-23.17:43:01.277>
    user = 'https://github.com/jo-he'

    bugs.python.org fields:

    activity = <Date 2021-03-23.17:43:01.277>
    actor = 'takluyver'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Extension Modules']
    creation = <Date 2019-07-17.23:04:55.849>
    creator = 'jo-he'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 37612
    keywords = ['patch']
    message_count = 4.0
    messages = ['348086', '348103', '348110', '348111']
    nosy_count = 5.0
    nosy_names = ['larry', 'takluyver', 'serhiy.storchaka', 'eryksun', 'jo-he']
    pr_nums = ['14843', '24997']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue37612'
    versions = ['Python 3.8', 'Python 3.9', 'Python 3.10']

    Linked PRs

    @jo-he
    Copy link
    Mannequin Author

    jo-he mannequin commented Jul 17, 2019

    Regarding link() POSIX states (https://pubs.opengroup.org/onlinepubs/9699919799/functions/link.html):

    "If path1 names a symbolic link, it is implementation-defined whether link() follows the symbolic link, or creates a new link to the symbolic link itself."

    In Linux, link() does _not_ follow symlinks (http://man7.org/linux/man-pages/man2/link.2.html):

    "By default, linkat(), does not dereference oldpath if it is a symbolic link (like link())."

    But Python 3 assumes the opposite to be always the case:

    (!follow_symlinks))

    ...which suits e.g. NetBSD (https://netbsd.gw.com/cgi-bin/man-cgi?link+2):

    "When operating on a symlink, link() resolves the symlink and creates a hard link on the target."

    Therefore, I recommend to always call linkat(), if the platform provides it. That's the modern superset of link() with clearly defined behavior.

    Here are some commands to reproduce the issue on Linux (should hard link 'file' -> 'link', but tries '/tmp/symlink' -> 'link'):

    ~$ : >file
    ~$ ln -s "$PWD/file" /tmp/symlink
    ~$ strace -e link,linkat python -c 'import os; os.link("/tmp/symlink", "link", follow_symlinks=True)'
    link("/tmp/symlink", "link")            = -1 EXDEV (Cross-device link)
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
    OSError: [Errno 18] Cross-device link: '/tmp/symlink' -> 'link'
    +++ exited with 1 +++

    For comparison, calling linkat() without AT_SYMLINK_FOLLOW results in the same error:

    ~$ strace -e link,linkat python -c 'import os; os.link("/tmp/symlink", "link", follow_symlinks=False)'
    linkat(AT_FDCWD, "/tmp/symlink", AT_FDCWD, "link", 0) = -1 EXDEV (Cross-device link)
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
    OSError: [Errno 18] Cross-device link: '/tmp/symlink' -> 'link'
    +++ exited with 1 +++

    Currently, the only way to call linkat() with AT_SYMLINK_FOLLOW from Python, is to provide a directory file descriptor != AT_FDCWD:

    ~$ strace -e link,linkat python -c 'import os; d=os.open(".", 0); os.link("/tmp/symlink", "link", dst_dir_fd=d, follow_symlinks=True)'
    linkat(AT_FDCWD, "/tmp/symlink", 3, "link", AT_SYMLINK_FOLLOW) = 0
    +++ exited with 0 +++

    @jo-he jo-he mannequin added stdlib Python modules in the Lib dir 3.7 (EOL) end of life 3.8 (EOL) end of life 3.9 only security fixes type-bug An unexpected behavior, bug, or error labels Jul 17, 2019
    @serhiy-storchaka
    Copy link
    Member

    Not always linkat() can be used instead of link(). But on Windows (where there is no linkat()) os.link() creates a new link to the symbolic link itself. This is yet one argument for making follow_symlinks=False by default and changing the default behavior of os.link() on NetBSD.

    @eryksun
    Copy link
    Contributor

    eryksun commented Jul 18, 2019

    on Windows (where there is no linkat()) os.link() creates a new link
    to the symbolic link itself.

    Yes, CreateHardLinkW opens the source file by calling NtOpenFile with the option FILE_OPEN_REPARSE_POINT. So the behavior is follow_symlinks=False.

    Note, however, that Windows has distinct file and directory reparse points, so we can't hardlink to a directory symlink, or any other type of directory reparse point such as a junction mountpoint. In Unix, follow_symlinks=False (if implemented) allows creating a hardlink to a symlink that targets a directory.

    Also, I noticed that I can pass follow_symlinks=False in Windows, but this should raise NotImplementedError. It's supposed to be checked via follow_symlinks_specified().

    @jo-he
    Copy link
    Mannequin Author

    jo-he mannequin commented Jul 18, 2019

    The problem that POSIX does not define the behavior of link() regarding symlinks (and that Unix implementations differ indeed), is independent from Python's os.link() defaults.

    Since it makes no sense to call link(), when linkat() is available, I propose this change:

    --- a/Modules/posixmodule.c
    +++ b/Modules/posixmodule.c
    @@ -3512,15 +3512,11 @@ os_link_impl(PyObject *module, path_t *src, path_t *dst, int src_dir_fd,
     #else
         Py_BEGIN_ALLOW_THREADS
     #ifdef HAVE_LINKAT
    -    if ((src_dir_fd != DEFAULT_DIR_FD) ||
    -        (dst_dir_fd != DEFAULT_DIR_FD) ||
    -        (!follow_symlinks))
    -        result = linkat(src_dir_fd, src->narrow,
    -            dst_dir_fd, dst->narrow,
    -            follow_symlinks ? AT_SYMLINK_FOLLOW : 0);
    -    else
    +    result = linkat(src_dir_fd, src->narrow, dst_dir_fd, dst->narrow,
    +                    follow_symlinks ? AT_SYMLINK_FOLLOW : 0);
    +#else
    +    result = link(src->narrow, dst->narrow);
     #endif /* HAVE_LINKAT */
    -        result = link(src->narrow, dst->narrow);
         Py_END_ALLOW_THREADS
     if (result)
    

    This fix also simplifies the code, and should be safe for back-porting.

    Whether Python's defaults should be changed regarding Windows and those (mostly obsolete) Unix platforms that do not provide linkat(), is another question.

    @eryksun eryksun added extension-modules C modules in the Modules dir 3.10 only security fixes and removed stdlib Python modules in the Lib dir 3.7 (EOL) end of life labels Mar 12, 2021
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    openstack-mirroring pushed a commit to openstack/openstack that referenced this issue Jan 20, 2024
    * Update ironic from branch 'master'
      to a42f23f47581b6b7e32a6e591e48baa65cbe8adf
      - Merge "Don't create a hardlink to a symlink when handling file:// URLs"
      - Don't create a hardlink to a symlink when handling file:// URLs
        
        While os.link is supposed to follow symlinks, it's actually broken [1]
        on Linux. As a result, Ironic may end up creating a hard link to
        a symlink. If the symlink is relative, chances are high accessing the
        resulting file will cause a FileNotFoundError.
        
        [1] python/cpython#81793
        
        Change-Id: Ic52f0ddb0c94410dd854ee525e3c57b2e78ea84d
    openstack-mirroring pushed a commit to openstack/ironic that referenced this issue Jan 20, 2024
    While os.link is supposed to follow symlinks, it's actually broken [1]
    on Linux. As a result, Ironic may end up creating a hard link to
    a symlink. If the symlink is relative, chances are high accessing the
    resulting file will cause a FileNotFoundError.
    
    [1] python/cpython#81793
    
    Change-Id: Ic52f0ddb0c94410dd854ee525e3c57b2e78ea84d
    openstack-mirroring pushed a commit to openstack/ironic that referenced this issue Jan 25, 2024
    While os.link is supposed to follow symlinks, it's actually broken [1]
    on Linux. As a result, Ironic may end up creating a hard link to
    a symlink. If the symlink is relative, chances are high accessing the
    resulting file will cause a FileNotFoundError.
    
    [1] python/cpython#81793
    
    Change-Id: Ic52f0ddb0c94410dd854ee525e3c57b2e78ea84d
    (cherry picked from commit 0b3ed09)
    openstack-mirroring pushed a commit to openstack/ironic that referenced this issue Jan 29, 2024
    While os.link is supposed to follow symlinks, it's actually broken [1]
    on Linux. As a result, Ironic may end up creating a hard link to
    a symlink. If the symlink is relative, chances are high accessing the
    resulting file will cause a FileNotFoundError.
    
    [1] python/cpython#81793
    
    Change-Id: Ic52f0ddb0c94410dd854ee525e3c57b2e78ea84d
    (cherry picked from commit 0b3ed09)
    serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue Apr 14, 2025
    This fixes os.link() on platforms (like Linux and OpenIndiana) where the
    system link() function does not follow symlinks.
    
    * On Linux and OpenIndiana, it now follows symlinks by default and if
      follow_symlinks=True is specified.
    * On Windows, it now raises error if follow_symlinks=True is passed.
    * On macOS, it now raises error if follow_symlinks=False is passed and
      the system linkat() function is not available at runtime.
    * On other platforms, it now raises error if follow_symlinks is passed
      with a value that does not match the system link() function behavior
      if if the behavior is not known.
    
    Co-authored-by: Joachim Henke <[email protected]>
    Co-authored-by: Thomas Kluyver <[email protected]>
    @picnixz picnixz removed 3.10 only security fixes 3.9 only security fixes 3.8 (EOL) end of life labels Apr 14, 2025
    @serhiy-storchaka
    Copy link
    Member

    I created an alternate PR #132517 based on work by @jo-he and @takluyver. The code is now clearer and all special cases are handled.

    • On Linux, os.link() without optional arguments now follows symlinks. This is a breaking change, but I think that this is unavoidable.
    • On Linux, os.link() with follow_symlinks=True now follows symlinks, and with follow_symlinks=False does not follow.
    • On OpenIndiana (and perhaps on Solaris), the same changes as on Linux.
    • On Windows, os.link() with follow_symlinks=True now raises an error.
    • On macOS, os.link() with follow_symlinks=False now raises an error if linkat() is not available at runtime.
    • On other platforms, os.link() with follow_symlinks now raises an error if link() does not have the specified behavior.

    @serhiy-storchaka
    Copy link
    Member

    Interesting, on OpenIndiana the link() system function follows symlinks, but the ln command does not and have no option to follow. This is the only platform with such difference, and I wonder why. If ln was implemented via link(), it would follow symplink. If ln was implemented via linkat(), it would have options to follow on not follow symplink.

    @serhiy-storchaka serhiy-storchaka self-assigned this Apr 15, 2025
    serhiy-storchaka added a commit that referenced this issue May 4, 2025
    This fixes os.link() on platforms (like Linux and OpenIndiana) where the
    system link() function does not follow symlinks.
    
    * On Linux, it now follows symlinks by default and if
      follow_symlinks=True is specified.
    * On Windows, it now raises error if follow_symlinks=True is passed.
    * On macOS, it now raises error if follow_symlinks=False is passed and
      the system linkat() function is not available at runtime.
    * On other platforms, it now raises error if follow_symlinks is passed
      with a value that does not match the system link() function behavior
      if if the behavior is not known.
    
    Co-authored-by: Joachim Henke <[email protected]>
    Co-authored-by: Thomas Kluyver <[email protected]>
    serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue May 4, 2025
    serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue May 4, 2025
    diegorusso added a commit to diegorusso/cpython that referenced this issue May 4, 2025
    * origin/main: (111 commits)
      pythongh-91048: Add filename and line number to external inspection routines (pythonGH-133385)
      pythongh-131178: Add tests for `ast` command-line interface (python#133329)
      Regenerate pcbuild.sln in Visual Studio 2022 (python#133394)
      pythongh-133042: disable HACL* HMAC on Emscripten (python#133064)
      pythongh-133351: Fix remote PDB's multi-line block tab completion (python#133387)
      pythongh-109700: Improve stress tests for interpreter creation (pythonGH-109946)
      pythongh-81793: Skip tests for os.link() to symlink on Android (pythonGH-133388)
      pythongh-126835: Rename `ast_opt.c` to `ast_preprocess.c` and related stuff after moving const folding to the peephole optimizier (python#131830)
      pythongh-91048: Relax test_async_global_awaited_by to fix flakyness (python#133368)
      pythongh-132457: make staticmethod and classmethod generic (python#132460)
      pythongh-132805: annotationlib: Fix handling of non-constant values in FORWARDREF (python#132812)
      pythongh-132426: Add get_annotate_from_class_namespace replacing get_annotate_function (python#132490)
      pythongh-81793: Always call linkat() from os.link(), if available (pythonGH-132517)
      pythongh-122559: Synchronize C and Python implementation of the io module about pickling (pythonGH-122628)
      pythongh-69605: Add PyREPL import autocomplete feature to 'What's New' (python#133358)
      bpo-44172: Keep reference to original window in curses subwindow objects (pythonGH-26226)
      pythonGH-133231: Changes to executor management to support proposed `sys._jit` module (pythonGH-133287)
      pythongh-133363: Fix Cmd completion for lines beginning with `! ` (python#133364)
      pythongh-132983: Introduce `_zstd` bindings module (pythonGH-133027)
      pythonGH-91048: Add utils for printing the call stack for asyncio tasks (python#133284)
      ...
    zware added a commit to zware/cpython that referenced this issue May 6, 2025
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    extension-modules C modules in the Modules dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants