Skip to content

Files with CRLF have whole file highlighted as modified #195

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
mjmeli opened this issue Mar 1, 2019 · 19 comments
Closed

Files with CRLF have whole file highlighted as modified #195

mjmeli opened this issue Mar 1, 2019 · 19 comments
Labels

Comments

@mjmeli
Copy link

mjmeli commented Mar 1, 2019

VS v15.8.8
GitDiffMargin v3.9.2.42

After a recent update, I'm seeing that the whole file is often getting highlighted as modified. This is similar to #82 and #164 but is a new instance as its happening after a recent update, plus those are closed, so I'd thought I'd open a new issue. I see a recent comment in #82 that this started happening after v3.9.1.

image

image

It's not happening in all files. The common thread I see between the files that it does happen in is that they have CRLF endings. This is also seen in the related issues.

I've noticed this happens when core.autocrlf is true. If I set it false, the issue goes away. Perhaps the extension is picking git normalizing to LFs and the extension shows a diff from the CRLF in Visual Studio?

@laurentkempe
Copy link
Owner

Thanks @mjmeli for reporting the issue!
Would you have a public repo which would show the issue?

@sharwell
Copy link
Collaborator

sharwell commented Mar 2, 2019

@mjmeli Can you post the contents of your .gitattributes file? Also, can you try running the following command from a clean checkout of your code (no changes in the working directory)?

git add --all --renormalize

After running this command, are there any files staged for commit?

@gnudym
Copy link

gnudym commented Mar 3, 2019

I have the same issue.
Running git add --all --renormalize staged a few thousand .cs and .csproj files to for commit.
My .gitattributes only has *.pdf binary

@TeaDrivenDev
Copy link

I'm seeing this too (in VS 2019 Preview 4), also only in some files, but a significant number. We have no .gitattributes file (I actually thought we did, but we don't), and as soon as I set autocrlf = false in my .gitconfig and reopen the document, I'm seeing the correct diff. Git itself, as well as Visual Studio etc. are always showing the correct diff.

git add --all --renormalize does indeed result in a lot of staged files, specifically some where I know I have seen this new different behavior in Git Diff Margin.

@gnudym
Copy link

gnudym commented Mar 3, 2019

@TeaDrivenDev I was about to add a similar comment about the issue being fixed after changing autocrlf from true to false but it seems like the issue came back again in some files.
I am on VS 2017.4 with Git Diff Margin 3.9.3.49 and git 2.19.1

@sharwell
Copy link
Collaborator

sharwell commented Mar 3, 2019

@gnudym You should be able to permanently resolve the issue regardless of the autocrlf setting by the following two steps:

  1. Add * text=auto to your .gitattributes, similar to the one used in this repository
  2. Run the renormalize command above, and commit the changed files

Without the first step, the repository will be sensitive to individual developer configurations, leading to various anomalies including but not limited to the one you observed here. Step 2 applies the new configuration to the full repository at once, which avoids a lengthy and very frustrating period where Git wants to renormalize files every time they are changed.

@mjmeli
Copy link
Author

mjmeli commented Mar 4, 2019

I appreciate the workaround of normalizing line endings, but unfortunately this workaround won't work in my situation. To give some context, we regretfully have a large monolithic repository shared by the entire engineering department, so it wouldn't be appropriate for me to change the line endings in every file. Some developers also intentionally use different line endings. It's a mess but it is something I have to work around...

Unfortunately I haven't been able to reproduce this in a public repo yet. It doesn't appear as simple as having a file with CRLFs in it. Hopefully someone else is able to provide an example repo.

@duncansmart
Copy link
Contributor

Been looking at GitCommands.GetGitDiffFor under the debugger - it seems this line:

var newBlob = repo.ObjectDatabase.CreateBlob(currentContent, relativeFilepath);

The currentContent that contains CRLF line-endings (as expected) gets converted to newBlob that contains just LF line-endings (?!), hence the diff shows that all lines have changed. So looks like LibGit2Sharp is to blame currently?

@TeaDrivenDev
Copy link

@duncansmart That, or libgit2 itself.

@sharwell
Copy link
Collaborator

sharwell commented Mar 4, 2019

The library is behaving the same way Git does for a given user configuration. The only reason we are forced to use CreateBlob instead of direct diff is there are some really strange edge cases for when Git changes line endings in content. It could be a libgit2 bug, but more often when I observe this I've found an inconsistency between the way the repository is configured for normalization and the way files are normalized in the commit itself. --renormalize is a command that identifies these discrepancies, and as indicated above Git is pointing out differences between the two for the repository in question.

@mjmeli
Copy link
Author

mjmeli commented Mar 5, 2019

Yes, it's surely due to such an inconsistency. Looks like there were some changes recently to libgit2 with how it handles line endings in v0.28.0, and this issue started happening in GitDiffMargin v3.9.1 when libgit2sharp was updated. libgit2 is certainly behaving differently than Git. I'm not sure if it is ultimately a bug or correct behavior on libgit2's part, and I wasn't capable to dig deep enough to confirm, but as a user I see there is a discrepancy.

I understand that much of this is stemming from having such a messy repository, but unfortunately since renormalizing can't work for me I've had to revert to an older version. For others with this issue (as I'm sure bad line endings are not rare...), v3.9.0 is the latest version that will not have this behavior. Hopefully this works for you all.

@duncansmart
Copy link
Contributor

It seems like an issue with LibGit2Sharp not exposing a diff API that allows the paths of blobs to be passed through to allow git to retrieve the content with the appropriate line-endings ...

e.g If you do a blob.GetContentStream(new FilteringOptions(relativeFilepath)); it calls git_blob_filtered_content_stream which normalises the line-endings. 👍

Whereas, if you call blob.GetContentStream() it calls git_blob_rawcontent_stream which doesn't normalise the line-endings 👎 ... which is why some files are OK and others are not depending on how there were originally stored.

@duncansmart
Copy link
Contributor

Hmmmnn, I've tried updating LibGit2Sharp to pass relativeFilepath down from repo.Diff.Compare through to libgit git_diff_blobs (note old_as_path and new_as_path params) to no avail 😢

@sharwell
Copy link
Collaborator

sharwell commented Mar 5, 2019

@duncansmart The CreateBlob call has a rather substantial performance impact. If you do find a way to implement things without that call, it would be a big help. I just wasn't able to find one that didn't change the behavior. 😄

@mjmeli
Copy link
Author

mjmeli commented Mar 5, 2019

I took an idea from the fix for #193 (9caca99) and applied this on the relativeFilepath passed to CreateBlob. @duncansmart mentioned that this call was converting CRLF line endings to LF line endings, which is why the diff between with this blob shows the whole file changing.

relativeFilepath = relativeFilepath.Replace("\\", "/");

This causes the CreateBlob call to return content with the correct line endings and the diff is comes out as expected. I'm not sure why this is working (which seemed to be the conclusion for #193 too 😄).

Anyone see any problems with doing this?

@duncansmart
Copy link
Contributor

That works on my machine too 👍 - you'd need to change relativeFilepath.Split(Path.DirectorySeparatorChar) to relativeFilepath.Split('/') later on too for the case-insensitive comparison (use Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar consts possibly)

@laurentkempe
Copy link
Owner

@mjmeli Would you provide a PR so that we can have a look?

@Samonitari
Copy link

I would also appreciate the hotfix.

Just ase you @mjmeli, I also work on a monolithic repo, with 17k (>20% of all) files having CRLF.
I will propose the we mix in some MacOS (only CR) and old QNX (RS) line endings as well, just for wider covarage 😉

@duncansmart
Copy link
Contributor

Hope you don't mind that I've submitted a PR @mjmeli

laurentkempe added a commit that referenced this issue Mar 12, 2019
Fix #195: Files with CRLF have whole file highlighted as modified.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

7 participants