Skip to content

zfs_cmd: reorganise zfs_cmd_t to match original size #17367

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

robn
Copy link
Member

@robn robn commented May 22, 2025

Motivation and Context

2aa3fbe extended zinject_record_t, and in doing so inadvertently extended zfs_cmd_t, which broke compatibility with userspace tools without the change.

Description

This fixes that by using some of the unused space in zfs_cmd_t for the extra fields. We also add an assert to trigger a compile error if the size ever changes.

With this commit:

$ pahole -C zfs_cmd module/zfs.ko
struct zfs_cmd {
	char                       zc_name[4096];        /*     0  4096 */
	/* --- cacheline 64 boundary (4096 bytes) --- */
	uint64_t                   zc_nvlist_src;        /*  4096     8 */
	uint64_t                   zc_nvlist_src_size;   /*  4104     8 */
	uint64_t                   zc_nvlist_dst;        /*  4112     8 */
	uint64_t                   zc_nvlist_dst_size;   /*  4120     8 */
	boolean_t                  zc_nvlist_dst_filled; /*  4128     4 */
	int                        zc_pad2;              /*  4132     4 */
	uint64_t                   zc_history;           /*  4136     8 */
	char                       zc_value[8192];       /*  4144  8192 */
	/* --- cacheline 192 boundary (12288 bytes) was 48 bytes ago --- */
	char                       zc_string[256];       /* 12336   256 */
	/* --- cacheline 196 boundary (12544 bytes) was 48 bytes ago --- */
	uint64_t                   zc_guid;              /* 12592     8 */
	uint64_t                   zc_nvlist_conf;       /* 12600     8 */
	/* --- cacheline 197 boundary (12608 bytes) --- */
	uint64_t                   zc_nvlist_conf_size;  /* 12608     8 */
	uint64_t                   zc_cookie;            /* 12616     8 */
	uint64_t                   zc_objset_type;       /* 12624     8 */
	uint64_t                   zc_perm_action;       /* 12632     8 */
	uint64_t                   zc_history_len;       /* 12640     8 */
	uint64_t                   zc_history_offset;    /* 12648     8 */
	uint64_t                   zc_obj;               /* 12656     8 */
	uint64_t                   zc_iflags;            /* 12664     8 */
	/* --- cacheline 198 boundary (12672 bytes) --- */
	zfs_share_t                zc_share;             /* 12672    32 */
	dmu_objset_stats_t         zc_objset_stats;      /* 12704   288 */
	/* --- cacheline 203 boundary (12992 bytes) --- */
	struct drr_begin           zc_begin_record;      /* 12992   304 */
	/* --- cacheline 207 boundary (13248 bytes) was 48 bytes ago --- */
	union {
		zinject_record_t   zc_inject_record;     /* 13296   368 */
		struct {
			char       zc_pad1[352];         /* 13296   352 */
			/* --- cacheline 213 boundary (13632 bytes) was 16 bytes ago --- */
			uint32_t   zc_defer_destroy;     /* 13648     4 */
			uint32_t   zc_flags;             /* 13652     4 */
			uint64_t   zc_action_handle;     /* 13656     8 */
		};                                       /* 13296   368 */
	};                                               /* 13296   368 */
	/* --- cacheline 213 boundary (13632 bytes) was 32 bytes ago --- */
	int                        zc_cleanup_fd;        /* 13664     4 */
	uint8_t                    zc_simple;            /* 13668     1 */
	uint8_t                    zc_pad[3];            /* 13669     3 */
	uint64_t                   zc_sendobj;           /* 13672     8 */
	uint64_t                   zc_fromobj;           /* 13680     8 */
	uint64_t                   zc_createtxg;         /* 13688     8 */
	/* --- cacheline 214 boundary (13696 bytes) --- */
	zfs_stat_t                 zc_stat;              /* 13696    40 */
	uint64_t                   zc_zoneid;            /* 13736     8 */

	/* size: 13744, cachelines: 215, members: 32 */
	/* last cacheline: 48 bytes */
};

Compare with 2.2.7:

$ pahole -C zfs_cmd ~/code/zfs-2.2/module/zfs.ko
struct zfs_cmd {
	char                       zc_name[4096];        /*     0  4096 */
	/* --- cacheline 64 boundary (4096 bytes) --- */
	uint64_t                   zc_nvlist_src;        /*  4096     8 */
	uint64_t                   zc_nvlist_src_size;   /*  4104     8 */
	uint64_t                   zc_nvlist_dst;        /*  4112     8 */
	uint64_t                   zc_nvlist_dst_size;   /*  4120     8 */
	boolean_t                  zc_nvlist_dst_filled; /*  4128     4 */
	int                        zc_pad2;              /*  4132     4 */
	uint64_t                   zc_history;           /*  4136     8 */
	char                       zc_value[8192];       /*  4144  8192 */
	/* --- cacheline 192 boundary (12288 bytes) was 48 bytes ago --- */
	char                       zc_string[256];       /* 12336   256 */
	/* --- cacheline 196 boundary (12544 bytes) was 48 bytes ago --- */
	uint64_t                   zc_guid;              /* 12592     8 */
	uint64_t                   zc_nvlist_conf;       /* 12600     8 */
	/* --- cacheline 197 boundary (12608 bytes) --- */
	uint64_t                   zc_nvlist_conf_size;  /* 12608     8 */
	uint64_t                   zc_cookie;            /* 12616     8 */
	uint64_t                   zc_objset_type;       /* 12624     8 */
	uint64_t                   zc_perm_action;       /* 12632     8 */
	uint64_t                   zc_history_len;       /* 12640     8 */
	uint64_t                   zc_history_offset;    /* 12648     8 */
	uint64_t                   zc_obj;               /* 12656     8 */
	uint64_t                   zc_iflags;            /* 12664     8 */
	/* --- cacheline 198 boundary (12672 bytes) --- */
	zfs_share_t                zc_share;             /* 12672    32 */
	dmu_objset_stats_t         zc_objset_stats;      /* 12704   288 */
	/* --- cacheline 203 boundary (12992 bytes) --- */
	struct drr_begin           zc_begin_record;      /* 12992   304 */
	/* --- cacheline 207 boundary (13248 bytes) was 48 bytes ago --- */
	zinject_record_t           zc_inject_record;     /* 13296   352 */
	/* --- cacheline 213 boundary (13632 bytes) was 16 bytes ago --- */
	uint32_t                   zc_defer_destroy;     /* 13648     4 */
	uint32_t                   zc_flags;             /* 13652     4 */
	uint64_t                   zc_action_handle;     /* 13656     8 */
	int                        zc_cleanup_fd;        /* 13664     4 */
	uint8_t                    zc_simple;            /* 13668     1 */
	uint8_t                    zc_pad[3];            /* 13669     3 */
	uint64_t                   zc_sendobj;           /* 13672     8 */
	uint64_t                   zc_fromobj;           /* 13680     8 */
	uint64_t                   zc_createtxg;         /* 13688     8 */
	/* --- cacheline 214 boundary (13696 bytes) --- */
	zfs_stat_t                 zc_stat;              /* 13696    40 */
	uint64_t                   zc_zoneid;            /* 13736     8 */

	/* size: 13744, cachelines: 215, members: 35 */
	/* last cacheline: 48 bytes */
};

Importantly, each individual field has the same alignment.

There's other ways to do this, eg by having a zinject_record_ext_t as a union in a different position, but it's just different ways to arrange the deckchairs.

I don't really love this, but it seems like an ok stopgap. The other options are to revert 2aa3fbe on master too, or push forward to convert ZFS_IOC_INJECT to nvlists or something equally hefty. Good things to do, and we will, when we're done with all these other things!

How Has This Been Tested?

Just basic local testing. I've not done interop testing, but pahole suggests everything lines up nice.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Quality assurance (non-breaking change which makes the code more robust against bugs)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@robn robn force-pushed the zfs-cmd-enforce-size branch from bee4337 to 8b87b57 Compare May 22, 2025 13:16
@amotin
Copy link
Member

amotin commented May 22, 2025

On FreeBSD:

  /home/zfs/zfs/include/sys/zfs_ioctl.h:568:16: error: static assertion failed due to requirement 'sizeof(struct zfs_cmd) == 13744': zfs_cmd_t must be 13744 bytes
    568 | _Static_assert(sizeof (zfs_cmd_t) == 13744, "zfs_cmd_t must be 13744 bytes");
        |                ^~~~~~~~~~~~~~~~~~~~~~~~~~~
  /home/zfs/zfs/include/sys/zfs_ioctl.h:568:35: note: expression evaluates to '4528 == 13744'
    568 | _Static_assert(sizeof (zfs_cmd_t) == 13744, "zfs_cmd_t must be 13744 bytes");
        |                ~~~~~~~~~~~~~~~~~~~^~~~~~~~

@robn robn force-pushed the zfs-cmd-enforce-size branch from 8b87b57 to ce35a6c Compare May 22, 2025 21:52
@robn
Copy link
Member Author

robn commented May 22, 2025

Ugh, I forgot about MAXPATHLEN and MAXNAMELEN being so different.

Alright, I've pushed another version that uses those defines directly in the size check. I like this even less because we won't detect if those change, but they're even harder to pin for just this struct if they do change, so idk. I mostly want a signal that we've screwed up, not a comprehensive protection, so I hope this is enough.

2aa3fbe extended zinject_record_t, and in doing so inadvertently
extended zfs_cmd_t, which broke compatibility with userspace tools
without the change.

This fixes that by using some of the unused space in zfs_cmd_t for the
extra fields. We also add an assert to trigger a compile error if the
size ever changes.

Sponsored-by: https://despairlabs.com/sponsor/
Signed-off-by: Rob Norris <[email protected]>
@robn robn force-pushed the zfs-cmd-enforce-size branch from ce35a6c to 85b2f8e Compare May 23, 2025 01:11
@amotin amotin added the Status: Code Review Needed Ready for review and testing label May 23, 2025
Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an additional compatibility check why don't we use of a copy of struct zfs_cmd from the 2.2.0 release in the tests/zfs-tests/cmd/libzfs_input_check.c test. This way we'll always be testing the basic libzfs compatibility.

@robn
Copy link
Member Author

robn commented May 24, 2025

As an additional compatibility check why don't we use of a copy of struct zfs_cmd from the 2.2.0 release in the tests/zfs-tests/cmd/libzfs_input_check.c test. This way we'll always be testing the basic libzfs compatibility.

@behlendorf I like the idea, but I'm not sure it helps without some rework. The test only tests the nvlist ioctls. In f414ff5 I make a very small zfs_cmd_nv_t with just the nvlist members, and that passes ok, but that doesn't exercise the members further down in the struct, which is what caused the break for things using fields after zc_inject_record.

Unless you meant something different that has gone straight over my head?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Code Review Needed Ready for review and testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants