-
Notifications
You must be signed in to change notification settings - Fork 36
Convert README verification script to a Rust test #1563
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Convert README verification script to a Rust test #1563
Conversation
hey @smoelius, here is the PR. Please check and let me know if any thing is needed. Thank You! |
Hey, thanks @jackcooper20! I will try to look at this no later than this weekend. |
@jackcooper20 Do you think you could try to get CI to pass? |
@smoelius completed the CI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for working on this, @jackcooper20.
My two main concerns right now are that the test encodes the entire README, and that the test does not simply compare the expected README with the actual README. The former is elaborated on below; the latter is more of a question, so if you could please explain. 🙏
examples/tests/lib.rs
Outdated
@@ -0,0 +1,4 @@ | |||
// This file ensures that the tests directory is recognized as a test directory | |||
// and that the test_readme.rs file is included in the test suite. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Admittedly, I don't understand this comment.
But, either way, could we please move this test into https://github.com/trailofbits/dylint/blob/master/cargo-dylint/tests/ci.rs?
examples/tests/test_readme.rs
Outdated
|
||
#[test] | ||
fn test_readme_contents() { | ||
let examples_dir = find_examples_dir(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you have a reason to prefer this approach over simply hardcoding the path to examples
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hardcoding would be simpler and more explicit. The main reason I chose the current approach was to ensure the test works regardless of where it's run
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once you move this test to https://github.com/trailofbits/dylint/blob/master/cargo-dylint/tests/ci.rs, it will always execute from within cargo-dylint
's subdirectory. So there will be no ambiguity.
examples/tests/test_readme.rs
Outdated
|
||
|
||
// Instead we use a more flexible approach that allows for formatting differences | ||
// while ensuring all examples are properly documented |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It sounds like you have a reason to not just compare expected_content
and actual_content
. What is that reason?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically to ensures the README is functionally correct while being resilient to formatting changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be fine to compare them outright. That should also make the code simpler.
examples/tests/test_readme.rs
Outdated
|
||
if category == "restriction" { | ||
// Handle restriction directory differently since it seems to have directories with slashes in names | ||
for entry in read_dir(&category_dir).unwrap() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't you use read_dir
for both restriction and non-restriction cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, read_dir
could be used for both restriction and non-restriction cases. I'll try that
examples/tests/test_readme.rs
Outdated
} | ||
} else { | ||
return None; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you use fs::read_to_string
here? https://doc.rust-lang.org/std/fs/fn.read_to_string.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion. After looking at it again, I realized the current approach is a bit verbose than necessary. Using fs::read_to_string
would make the code more concise and readable.
examples/tests/test_readme.rs
Outdated
let mut content = String::new(); | ||
|
||
// Add the header | ||
content.push_str("# Example Dylint libraries\n\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think I like the idea of storing the entire README contents in the test. Because then the test will have to change whenever the README changes.
I think I would prefer that the test find where the lint descriptions should go and generate just that part, taking everything else from the existing README.
I may not be obvious, but the existing shell script worked this way:
dylint/scripts/update_example_READMEs.sh
Line 24 in 349807f
if [[ "$X" =~ ^\| ]]; then |
But I would be open to making things even easier for the test. For example, the README could contain markers like <!-- lint descriptions start -->
/<!-- lint descriptions end -->
to help the test know where the lint descriptions should go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah you're right that hardcoding the entire README is brittle. I'll refactor the test to generate and verify only the table sections.
@smoelius thank you for the feedback. I’ve answered your queries, and it looks like a few changes are needed. Do you think anything else needs to be changed other than the mentioned ones? Also, let me know if you have any further questions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think anything else needs to be changed other than the mentioned ones?
I honestly won't know until I see the code.
I really appreciate you working on this.
examples/tests/test_readme.rs
Outdated
|
||
#[test] | ||
fn test_readme_contents() { | ||
let examples_dir = find_examples_dir(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once you move this test to https://github.com/trailofbits/dylint/blob/master/cargo-dylint/tests/ci.rs, it will always execute from within cargo-dylint
's subdirectory. So there will be no ambiguity.
examples/tests/test_readme.rs
Outdated
|
||
|
||
// Instead we use a more flexible approach that allows for formatting differences | ||
// while ensuring all examples are properly documented |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be fine to compare them outright. That should also make the code simpler.
b4f041b
to
2cf32c5
Compare
Hello @smoelius, I've tried to fix the issues you mentioned in the review. However, after pushing the changes, one of the CI tests failed, which is related to formatting. I attempted to resolve the issue, but I encountered a potential bug in the In particular, the problem seems to be with this line sed 's/| A lint to check for \([^|]*\) |$/| \u\1 |/' The intent of this line is to capitalize text (e.g., changing Could you please check if this is the root cause of the issue, or if I might be overlooking something? If it’s not the cause, I'd appreciate your guidance in resolving the CI failure. |
@smoelius ,please take a look at this at your convenience |
Hey, sorry for not replying, @jackcooper20. I will try to look at this this week. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jackcooper20 This is looking pretty good. Thanks a lot for working on this!
cargo-dylint/tests/ci.rs
Outdated
@@ -133,6 +133,165 @@ fn cargo_dylint_and_dylint_readmes_are_equal() { | |||
compare_lines(&cargo_dylint_readme, &dylint_readme); | |||
} | |||
|
|||
#[test] | |||
fn examples_readme_contents() { | |||
let examples_dir = find_examples_dir(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let examples_dir = find_examples_dir(); | |
let examples_dir = Path::new(env!("CARGO_MANIFEST_DIR")).join("../examples"); |
I think it is fine to hardcode the path to the examples directory. The likelihood of it moving relative the cargo-dylint
package is small.
Please also remove the find_examples_dir
function.
examples/README.md
Outdated
<!-- lint descriptions start --> | ||
## General |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<!-- lint descriptions start --> | |
## General | |
<!-- lint descriptions start --> | |
## General |
The lack of an intervening blank line is why I think the current CI is complaining.
examples/README.md
Outdated
| [`straggler`](./testing/straggler) | A lint that uses an old toolchain for testing purposes | | ||
<!-- lint descriptions end --> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| [`straggler`](./testing/straggler) | A lint that uses an old toolchain for testing purposes | | |
<!-- lint descriptions end --> | |
| [`straggler`](./testing/straggler) | A lint that uses an old toolchain for testing purposes | | |
<!-- lint descriptions end --> |
Same comment as above.
examples/Cargo.toml
Outdated
regex = { workspace = true } | ||
tempfile = { workspace = true } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
regex = { workspace = true } | |
tempfile = { workspace = true } |
Since no code in the examples directory is modified, these additions should not be necessary.
cargo-dylint/tests/ci.rs
Outdated
// Extract the description using regex | ||
let re = Regex::new(r#"description\s*=\s*"([^"]*)""#).unwrap(); | ||
let description = if let Some(caps) = re.captures(&content) { | ||
if let Some(desc) = caps.get(1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if let Some(desc) = caps.get(1) { | |
let desc = caps.get(1).unwrap(); |
If the captures
call succeeds, then this call should succeed as well.
A failure of this get
would likely indicate a bug in the definition of re
, which I think is worth a test panic.
Note that removing this if
will require other changes.
cargo-dylint/tests/ci.rs
Outdated
content | ||
} | ||
|
||
fn extract_between_markers(content: &str, start_marker: &str, end_marker: &str) -> Option<String> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fn extract_between_markers(content: &str, start_marker: &str, end_marker: &str) -> Option<String> { | |
fn extract_between_markers(content: &str) -> Option<String> { | |
const START_MARKER: &str = "<!-- lint descriptions start -->"; | |
const END_MARKER: &str = "<!-- lint descriptions end -->"; |
Since extract_between_markers
is called only once, could we make start_marker
and end_marker
constants defined within this function?
cargo-dylint/tests/ci.rs
Outdated
let readme_content = read_to_string(&readme_path).unwrap(); | ||
|
||
// Generate just the lint description tables | ||
let tables = generate_lint_tables(&examples_dir, &categories); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let tables = generate_lint_tables(&examples_dir, &categories); | |
let expected_tables = generate_lint_tables(&examples_dir, &categories); |
Nit.
cargo-dylint/tests/ci.rs
Outdated
expected_content, | ||
actual_content, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
expected_content, | |
actual_content, | |
expected_tables, | |
actual_tables, |
I would prefer this more stringent check. Please also remove the extract_meaningful_content
function.
I understand why you wrote the extract_meaningful_content
function: to avoid having to worry about spacing.
But generating a table with correct spacing will make it easier to replace an existing, incorrect table.
cargo-dylint/tests/ci.rs
Outdated
let tables = generate_lint_tables(&examples_dir, &categories); | ||
|
||
// Extract the current tables section from README using markers | ||
let current_tables = extract_between_markers(&readme_content, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let current_tables = extract_between_markers(&readme_content, | |
let actual_tables = extract_between_markers(&readme_content, |
Nit to follow the convention that "expected" is often paired with "actual".
result | ||
} | ||
|
||
fn generate_lint_tables(examples_dir: &Path, categories: &[&str]) -> String { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per the comment above regarding assert_eq!
, this function will need to be adjusted to get the spacing right.
I recommend computing all of the formatted "Example" and "Description/check" entries, and then determining the columns' widths from that.
Something similar to this:
Lines 182 to 193 in 67eddaf
let name_width = name_toolchain_map | |
.keys() | |
.map(String::len) | |
.max() | |
.unwrap_or_default(); | |
let toolchain_width = name_toolchain_map | |
.values() | |
.flat_map(LazyToolchainMap::keys) | |
.map(String::len) | |
.max() | |
.unwrap_or_default(); |
@smoelius I've fixed the things you mentioned in comments, but looks like the CI is failing again for the reason I mentioned previously. I am writing it here again. Please have a look There might be a potential bug in the In particular, the problem seems to be with this line sed 's/| A lint to check for \([^|]*\) |$/| \u\1 |/' The intent of this line is to capitalize text (e.g., changing This issue is likely causing a mismatch between the expected and actual content in |
Hey @smoelius, I've managed to pass the CI. Apparently there were some new lines that was causing the problem. Please check and let me know if you have any questions. Thank You! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please forgive me for saying this, but there are changes all over the place. It is difficult to review a PR when seemingly random changes are made.
For example. there is now code that calls preserves_cleanliness
, but it's conditioned on "bless_mode
". This doesn't really make sense because preserves_cleanliness
is for checking that things don't change, while "blessing" typically means intentionally updating something.
When I commented on 3589ae5, I thought we were close to a solution. Is there any chance you could roll back to that commit and make smaller, more focused changes?
.github/workflows/ci.yml
Outdated
- name: Format example READMEs | ||
run: ./scripts/update_example_READMEs.sh && git diff --exit-code | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right that this will eventually have to go away. But, for now, could you please put it back in? I have found its success/failure useful to compare to.
Hello @smoelius, thanks for the feedback. Apologies for making a mess of it. To be honest, after all the CI failures, I started getting a bit nervous and ended up trying all kinds of approaches just to get things working. That led to a few missteps, and I agree the recent changes added more confusion than value. I'll revert back to Thanks again for your patience — I'll follow up soon with a cleaner version of the PR. |
6d658ed
to
3589ae5
Compare
This PR addresses #636 by replacing the bash script (
scripts/update_example_READMEs.sh
) with a Rust-based test that verifies the examples/README.md contents are in sync with the actual examples in the codebase.The test:
This change makes the README verification process part of the standard test suite, runnable via
cargo test
. By keeping this verification in Rust rather than bash, we ensure better cross-platform compatibility and integration with the rest of the testing infrastructure.