Convert README verification script to a Rust test #1563

jackcooper20 · 2025-03-25T23:36:57Z

This PR addresses #636 by replacing the bash script (scripts/update_example_READMEs.sh) with a Rust-based test that verifies the examples/README.md contents are in sync with the actual examples in the codebase.

The test:

Walks through all example categories (general, supplementary, restriction, experimental, and testing)
Extracts example names and descriptions from each Cargo.toml file
Generates a temporary README.md with the expected content
Verifies that the actual README.md contains all necessary example links and descriptions

This change makes the README verification process part of the standard test suite, runnable via cargo test. By keeping this verification in Rust rather than bash, we ensure better cross-platform compatibility and integration with the rest of the testing infrastructure.

CLAassistant · 2025-03-25T23:37:04Z

All committers have signed the CLA.

jackcooper20 · 2025-03-25T23:40:54Z

hey @smoelius, here is the PR. Please check and let me know if any thing is needed. Thank You!

smoelius · 2025-03-26T00:13:44Z

Hey, thanks @jackcooper20! I will try to look at this no later than this weekend.

smoelius · 2025-03-27T11:27:31Z

@jackcooper20 Do you think you could try to get CI to pass?

…ing error fixed

jackcooper20 · 2025-03-27T16:48:09Z

@smoelius completed the CI

smoelius

Thanks a lot for working on this, @jackcooper20.

My two main concerns right now are that the test encodes the entire README, and that the test does not simply compare the expected README with the actual README. The former is elaborated on below; the latter is more of a question, so if you could please explain. 🙏

smoelius · 2025-03-29T22:52:54Z

examples/tests/lib.rs

@@ -0,0 +1,4 @@
+// This file ensures that the tests directory is recognized as a test directory
+// and that the test_readme.rs file is included in the test suite.


Admittedly, I don't understand this comment.

But, either way, could we please move this test into https://github.com/trailofbits/dylint/blob/master/cargo-dylint/tests/ci.rs?

smoelius · 2025-03-29T22:53:57Z

examples/tests/test_readme.rs

+
+#[test]
+fn test_readme_contents() {
+    let examples_dir = find_examples_dir();


Did you have a reason to prefer this approach over simply hardcoding the path to examples?

Hardcoding would be simpler and more explicit. The main reason I chose the current approach was to ensure the test works regardless of where it's run

Once you move this test to https://github.com/trailofbits/dylint/blob/master/cargo-dylint/tests/ci.rs, it will always execute from within cargo-dylint's subdirectory. So there will be no ambiguity.

smoelius · 2025-03-29T22:57:43Z

examples/tests/test_readme.rs

+
+
+    // Instead we use a more flexible approach that allows for formatting differences
+    // while ensuring all examples are properly documented


It sounds like you have a reason to not just compare expected_content and actual_content. What is that reason?

Basically to ensures the README is functionally correct while being resilient to formatting changes.

I think it would be fine to compare them outright. That should also make the code simpler.

smoelius · 2025-03-29T23:23:20Z

examples/tests/test_readme.rs

+
+    if category == "restriction" {
+        // Handle restriction directory differently since it seems to have directories with slashes in names
+        for entry in read_dir(&category_dir).unwrap() {


Can't you use read_dir for both restriction and non-restriction cases?

Yeah, read_dir could be used for both restriction and non-restriction cases. I'll try that

smoelius · 2025-03-29T23:33:52Z

examples/tests/test_readme.rs

+        }
+    } else {
+        return None;
+    }


Could you use fs::read_to_string here? https://doc.rust-lang.org/std/fs/fn.read_to_string.html

Thanks for the suggestion. After looking at it again, I realized the current approach is a bit verbose than necessary. Using fs::read_to_string would make the code more concise and readable.

smoelius · 2025-03-29T23:46:34Z

examples/tests/test_readme.rs

+    let mut content = String::new();
+
+    // Add the header
+    content.push_str("# Example Dylint libraries\n\n");


I don't think I like the idea of storing the entire README contents in the test. Because then the test will have to change whenever the README changes.

I think I would prefer that the test find where the lint descriptions should go and generate just that part, taking everything else from the existing README.

I may not be obvious, but the existing shell script worked this way:

dylint/scripts/update_example_READMEs.sh

Line 24 in 349807f

if [[ "$X" =~ ^\| ]]; then

But I would be open to making things even easier for the test. For example, the README could contain markers like / to help the test know where the lint descriptions should go.

Yeah you're right that hardcoding the entire README is brittle. I'll refactor the test to generate and verify only the table sections.

jackcooper20 · 2025-04-01T09:36:11Z

@smoelius thank you for the feedback. I’ve answered your queries, and it looks like a few changes are needed. Do you think anything else needs to be changed other than the mentioned ones? Also, let me know if you have any further questions.

smoelius

Do you think anything else needs to be changed other than the mentioned ones?

I honestly won't know until I see the code.

I really appreciate you working on this.

smoelius · 2025-04-01T23:57:58Z

examples/tests/test_readme.rs

+
+#[test]
+fn test_readme_contents() {
+    let examples_dir = find_examples_dir();


Once you move this test to https://github.com/trailofbits/dylint/blob/master/cargo-dylint/tests/ci.rs, it will always execute from within cargo-dylint's subdirectory. So there will be no ambiguity.

smoelius · 2025-04-01T23:58:56Z

examples/tests/test_readme.rs

+
+
+    // Instead we use a more flexible approach that allows for formatting differences
+    // while ensuring all examples are properly documented


I think it would be fine to compare them outright. That should also make the code simpler.

jackcooper20 · 2025-04-06T07:00:09Z

Hello @smoelius,

I've tried to fix the issues you mentioned in the review. However, after pushing the changes, one of the CI tests failed, which is related to formatting. I attempted to resolve the issue, but I encountered a potential bug in the update_example_READMEs.sh script.

In particular, the problem seems to be with this line

sed 's/| A lint to check for \([^|]*\) |$/| \u\1 |/'

The intent of this line is to capitalize text (e.g., changing abc to Abc). However, the \u syntax isn't supported in GNU sed, which results in an extra u being introduced before the text (e.g., uLint description instead of Lint description). This issue is likely causing a mismatch between the expected and actual content in README.md.

Could you please check if this is the root cause of the issue, or if I might be overlooking something? If it’s not the cause, I'd appreciate your guidance in resolving the CI failure.

jackcooper20 · 2025-04-07T12:03:56Z

@smoelius ,please take a look at this at your convenience

smoelius · 2025-04-07T12:11:19Z

Hey, sorry for not replying, @jackcooper20. I will try to look at this this week.

smoelius

@jackcooper20 This is looking pretty good. Thanks a lot for working on this!

smoelius · 2025-04-08T09:50:57Z

cargo-dylint/tests/ci.rs

@@ -133,6 +133,165 @@ fn cargo_dylint_and_dylint_readmes_are_equal() {
    compare_lines(&cargo_dylint_readme, &dylint_readme);
 }

+#[test]
+fn examples_readme_contents() {
+    let examples_dir = find_examples_dir();


Suggested change

let examples_dir = find_examples_dir();

let examples_dir = Path::new(env!("CARGO_MANIFEST_DIR")).join("../examples");

I think it is fine to hardcode the path to the examples directory. The likelihood of it moving relative the cargo-dylint package is small.

Please also remove the find_examples_dir function.

smoelius · 2025-04-08T09:53:24Z

examples/README.md

+<!-- lint descriptions start -->
 ## General


Suggested change



## General



## General

The lack of an intervening blank line is why I think the current CI is complaining.

smoelius · 2025-04-08T09:53:50Z

examples/README.md

 | [`straggler`](./testing/straggler) | A lint that uses an old toolchain for testing purposes |
+<!-- lint descriptions end -->


smoelius · 2025-04-08T09:55:39Z

examples/Cargo.toml

+regex = { workspace = true }
+tempfile = { workspace = true }


Suggested change

regex = { workspace = true }

tempfile = { workspace = true }

Since no code in the examples directory is modified, these additions should not be necessary.

smoelius · 2025-04-08T10:01:17Z

cargo-dylint/tests/ci.rs

+    // Extract the description using regex
+    let re = Regex::new(r#"description\s*=\s*"([^"]*)""#).unwrap();
+    let description = if let Some(caps) = re.captures(&content) {
+        if let Some(desc) = caps.get(1) {


Suggested change

if let Some(desc) = caps.get(1) {

let desc = caps.get(1).unwrap();

If the captures call succeeds, then this call should succeed as well.

A failure of this get would likely indicate a bug in the definition of re, which I think is worth a test panic.

Note that removing this if will require other changes.

smoelius · 2025-04-08T10:11:11Z

cargo-dylint/tests/ci.rs

+    content
+}
+
+fn extract_between_markers(content: &str, start_marker: &str, end_marker: &str) -> Option<String> {


Suggested change

fn extract_between_markers(content: &str, start_marker: &str, end_marker: &str) -> Option<String> {

fn extract_between_markers(content: &str) -> Option<String> {

const START_MARKER: &str = "";

const END_MARKER: &str = "";

Since extract_between_markers is called only once, could we make start_marker and end_marker constants defined within this function?

smoelius · 2025-04-08T10:14:20Z

cargo-dylint/tests/ci.rs

+    let readme_content = read_to_string(&readme_path).unwrap();
+
+    // Generate just the lint description tables
+    let tables = generate_lint_tables(&examples_dir, &categories);


Suggested change

let tables = generate_lint_tables(&examples_dir, &categories);

let expected_tables = generate_lint_tables(&examples_dir, &categories);

Nit.

smoelius · 2025-04-08T10:15:19Z

cargo-dylint/tests/ci.rs

+        expected_content,
+        actual_content,


Suggested change

expected_content,

actual_content,

expected_tables,

actual_tables,

I would prefer this more stringent check. Please also remove the extract_meaningful_content function.

I understand why you wrote the extract_meaningful_content function: to avoid having to worry about spacing.

But generating a table with correct spacing will make it easier to replace an existing, incorrect table.

smoelius · 2025-04-08T10:15:51Z

cargo-dylint/tests/ci.rs

+    let tables = generate_lint_tables(&examples_dir, &categories);
+
+    // Extract the current tables section from README using markers
+    let current_tables = extract_between_markers(&readme_content, 


Suggested change

let current_tables = extract_between_markers(&readme_content,

let actual_tables = extract_between_markers(&readme_content,

Nit to follow the convention that "expected" is often paired with "actual".

smoelius · 2025-04-08T10:38:53Z

cargo-dylint/tests/ci.rs

+    result
+}
+
+fn generate_lint_tables(examples_dir: &Path, categories: &[&str]) -> String {


As per the comment above regarding assert_eq!, this function will need to be adjusted to get the spacing right.

I recommend computing all of the formatted "Example" and "Description/check" entries, and then determining the columns' widths from that.

Something similar to this:

dylint/dylint/src/lib.rs

Lines 182 to 193 in 67eddaf

let name_width = name_toolchain_map

.keys()

.map(String::len)

.max()

.unwrap_or_default();

let toolchain_width = name_toolchain_map

.values()

.flat_map(LazyToolchainMap::keys)

.map(String::len)

.max()

.unwrap_or_default();

jackcooper20 · 2025-04-08T17:05:01Z

@smoelius I've fixed the things you mentioned in comments, but looks like the CI is failing again for the reason I mentioned previously. I am writing it here again. Please have a look

There might be a potential bug in the update_example_READMEs.sh script.

In particular, the problem seems to be with this line

sed 's/| A lint to check for \([^|]*\) |$/| \u\1 |/'

The intent of this line is to capitalize text (e.g., changing abc to Abc). However, the \u syntax isn't supported in GNU sed, which results in an extra u being introduced before the text (e.g., uLint description instead of Lint description).

This issue is likely causing a mismatch between the expected and actual content in README.md.

examples/README.md

jackcooper20 · 2025-04-12T14:00:54Z

Hey @smoelius, I've managed to pass the CI. Apparently there were some new lines that was causing the problem. Please check and let me know if you have any questions.

Thank You!

smoelius

Please forgive me for saying this, but there are changes all over the place. It is difficult to review a PR when seemingly random changes are made.

For example. there is now code that calls preserves_cleanliness, but it's conditioned on "bless_mode". This doesn't really make sense because preserves_cleanliness is for checking that things don't change, while "blessing" typically means intentionally updating something.

When I commented on 3589ae5, I thought we were close to a solution. Is there any chance you could roll back to that commit and make smaller, more focused changes?

smoelius · 2025-04-13T14:12:20Z

.github/workflows/ci.yml

-      - name: Format example READMEs
-        run: ./scripts/update_example_READMEs.sh && git diff --exit-code
-


You are right that this will eventually have to go away. But, for now, could you please put it back in? I have found its success/failure useful to compare to.

jackcooper20 · 2025-04-14T04:49:26Z

Hello @smoelius, thanks for the feedback.

Apologies for making a mess of it. To be honest, after all the CI failures, I started getting a bit nervous and ended up trying all kinds of approaches just to get things working. That led to a few missteps, and I agree the recent changes added more confusion than value.

I'll revert back to 3589ae5 and work from there with a more simplified approach.

Thanks again for your patience — I'll follow up soon with a cleaner version of the PR.

…used dependencies

Converted shell script to rust test script

8783f84

jackcooper20 requested a review from smoelius as a code owner March 25, 2025 23:36

jackcooper20 added 3 commits March 27, 2025 07:27

lint CI fix

1a82f8c

write_with_newline, inconsistent_qualification and abs_home_path lint…

27272f1

…ing error fixed

inconsistent_qualification lint fix

349c54e

smoelius reviewed Mar 29, 2025

View reviewed changes

smoelius reviewed Apr 2, 2025

View reviewed changes

refactor(tests): improve examples README verification

2cf32c5

jackcooper20 force-pushed the convert-to-rust-test branch from b4f041b to 2cf32c5 Compare April 5, 2025 12:45

Merge branch 'master' into convert-to-rust-test

3589ae5

smoelius reviewed Apr 8, 2025

View reviewed changes

smoelius reviewed Apr 10, 2025

View reviewed changes

examples/README.md Show resolved Hide resolved

smoelius reviewed Apr 13, 2025

View reviewed changes

jackcooper20 force-pushed the convert-to-rust-test branch from 6d658ed to 3589ae5 Compare April 19, 2025 09:56

jackcooper20 added 2 commits April 19, 2025 04:56

refactor(tests): streamline README example verification and remove un…

cf63112

…used dependencies

fix(tests): clippy error fix

d12f8ee

		@@ -0,0 +1,4 @@
		// This file ensures that the tests directory is recognized as a test directory
		// and that the test_readme.rs file is included in the test suite.



		// Instead we use a more flexible approach that allows for formatting differences
		// while ensuring all examples are properly documented

	let examples_dir = find_examples_dir();
	let examples_dir = Path::new(env!("CARGO_MANIFEST_DIR")).join("../examples");

		\| [`straggler`](./testing/straggler) \| A lint that uses an old toolchain for testing purposes \|
		<!-- lint descriptions end -->

	if let Some(desc) = caps.get(1) {
	let desc = caps.get(1).unwrap();

-fn extract_between_markers(content: &str, start_marker: &str, end_marker: &str) -> Option<String> {
+fn extract_between_markers(content: &str) -> Option<String> {
+    const START_MARKER: &str = "<!-- lint descriptions start -->";
+    const END_MARKER: &str = "<!-- lint descriptions end -->";

	let tables = generate_lint_tables(&examples_dir, &categories);
	let expected_tables = generate_lint_tables(&examples_dir, &categories);

	let current_tables = extract_between_markers(&readme_content,
	let actual_tables = extract_between_markers(&readme_content,

	let name_width = name_toolchain_map
	.keys()
	.map(String::len)
	.max()
	.unwrap_or_default();

	let toolchain_width = name_toolchain_map
	.values()
	.flat_map(LazyToolchainMap::keys)
	.map(String::len)
	.max()
	.unwrap_or_default();

		- name: Format example READMEs
		run: ./scripts/update_example_READMEs.sh && git diff --exit-code

Convert README verification script to a Rust test #1563

Are you sure you want to change the base?

Convert README verification script to a Rust test #1563

Conversation

jackcooper20 commented Mar 25, 2025

CLAassistant commented Mar 25, 2025 • edited Loading

jackcooper20 commented Mar 25, 2025

smoelius commented Mar 26, 2025

smoelius commented Mar 27, 2025

jackcooper20 commented Mar 27, 2025

smoelius left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jackcooper20 Apr 1, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jackcooper20 Apr 1, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jackcooper20 commented Apr 1, 2025 • edited Loading

smoelius left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jackcooper20 commented Apr 6, 2025 • edited Loading

jackcooper20 commented Apr 7, 2025

smoelius commented Apr 7, 2025

smoelius left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jackcooper20 commented Apr 8, 2025

jackcooper20 commented Apr 12, 2025

smoelius left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jackcooper20 commented Apr 14, 2025

CLAassistant commented Mar 25, 2025 •

edited

Loading

jackcooper20 Apr 1, 2025 •

edited

Loading

jackcooper20 Apr 1, 2025 •

edited

Loading

jackcooper20 commented Apr 1, 2025 •

edited

Loading

jackcooper20 commented Apr 6, 2025 •

edited

Loading