[New Algorithm] Adding Myers Difference Algorithm #693


Merged
merged 10 commits into from
Mar 10, 2019

Conversation

horita-yuya
Description

It seems that this repository already includes some algorithms for finding a shortest edit script. The Myers algorithm solves the same kind of problem. It improves on both running time and memory use, yet it is still simple.

Android's DiffUtil is implemented with the Myers algorithm, so it is practical.
Understanding how it works is worthwhile, and I hope this PR helps with that.

I also added a description of the edit graph, which is useful for understanding the Myers algorithm.

Thank you.

@horita-yuya
Author

This is a reference.
http://www.xmailserver.org/diff2.pdf

Thank you 🙏

Member

@kelvinlauKL kelvinlauKL left a comment

I love the well-commented code 👍

So far, I've done a partial read through and left a comment on how the MDA generates the edit graph. Could you address that?


Myers Difference Algorithm is an algorithm that finds a longest common subsequence or shortest edit script (the LCS/SES dual problem) of two sequences. MDA can accomplish this in O(ND) time, where N is the sum of the lengths of the two sequences and D is the size of the shortest edit script. The common subsequence of two sequences is the sequence of elements that appear in the same order in both sequences. The edit scripts will be discussed below.

For example, assuming that sequence `A = ["1", "2", "3"]` and sequence `B = ["2", "3", "4"]`, `["2"], ["2", "3"]` are common sequences. Furthermore, the latter `["2", "3"]` is the longest common subsequence. But `["1", "2"], ["3", "2"]` are not. Because, `["1", "2"]` contains `"1"` that is not included in `B`, `["3", "2"]` has elements are included in both, but the appearing order is not correct.
Member

This paragraph doesn't make sense. Could you reword it?

Y = [C, B, A, B, A, C]
```

MDA generates the edit graph through the following steps:
Member

These steps are hard to understand. It seems worthwhile to explain what the blue line, green dots, dotted lines, and red numbers represent, just to give the reader more context before diving into the steps.

Author

Thank you for reviewing.

Yes, that is necessary. Understanding the edit graph is essential to understanding MDA.
I'll add more explanation 👍

@kelvinlauKL
Member

It just occurred to me that we have an article on finding the longest common subsequence already: https://github.com/raywenderlich/swift-algorithm-club/tree/master/Longest%20Common%20Subsequence

Could you check it out to see if it makes sense to add your ideas to the current topic?

@horita-yuya
Author

I'm aware that there is already an article on finding the LCS. (https://github.com/raywenderlich/swift-algorithm-club/tree/master/Longest%20Common%20Subsequence)

My topic, MDA, is faster than that algorithm, which runs two nested for loops like this:

for i in self.characters {
    for j in other.characters {
    }
}

It takes O(N * M) time, where N and M are the lengths of the two strings 😱
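As a rough illustration of where that cost comes from, here is a minimal sketch of the classic dynamic-programming LCS length computation. It is not the code from the existing article; `lcsLength` is a hypothetical name, and the sketch assumes the inputs are plain arrays of `Equatable` elements.

```swift
// Hypothetical helper (not taken from the repository):
// classic dynamic-programming LCS length.
func lcsLength<T: Equatable>(_ a: [T], _ b: [T]) -> Int {
    // table[i][j] holds the LCS length of the first i elements of `a`
    // and the first j elements of `b`.
    var table = [[Int]](
        repeating: [Int](repeating: 0, count: b.count + 1),
        count: a.count + 1
    )
    // Every pair (i, j) is visited once, so the loop body runs
    // a.count * b.count times regardless of how similar the inputs are.
    for i in 0..<a.count {
        for j in 0..<b.count {
            if a[i] == b[j] {
                table[i + 1][j + 1] = table[i][j] + 1
            } else {
                table[i + 1][j + 1] = max(table[i][j + 1], table[i + 1][j])
            }
        }
    }
    return table[a.count][b.count]
}
```

For `["1", "2", "3"]` and `["2", "3", "4"]` this returns 2, the length of `["2", "3"]`.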

@horita-yuya
Author

One more thing: the existing LCS algorithm only produces the LCS.
But once someone understands MDA and the edit graph, it can be extended to produce diff commands such as insert and delete, not only the LCS.

This could make it easier to, for example, apply incremental updates to a UITableView or UICollectionView.
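To make the idea concrete, here is a minimal sketch of the greedy forward pass described in Myers' paper, returning only the size D of the shortest edit script. It is not the implementation in this PR; `shortestEditDistance(from:to:)` is an illustrative name. A real diff would also record the path taken, so that each step right becomes a delete and each step down becomes an insert.

```swift
// Illustrative sketch (names are hypothetical, not from this PR).
// Returns D, the number of inserts plus deletes needed to turn `a` into `b`.
func shortestEditDistance<T: Equatable>(from a: [T], to b: [T]) -> Int {
    let n = a.count
    let m = b.count
    let maxD = n + m
    // v[offset + k] stores the furthest x reached so far on diagonal k = x - y.
    let offset = maxD
    var v = [Int](repeating: 0, count: 2 * maxD + 2)

    for d in 0...maxD {
        for k in stride(from: -d, through: d, by: 2) {
            var x: Int
            if k == -d || (k != d && v[offset + k - 1] < v[offset + k + 1]) {
                // Step down from diagonal k + 1: insert an element of `b`.
                x = v[offset + k + 1]
            } else {
                // Step right from diagonal k - 1: delete an element of `a`.
                x = v[offset + k - 1] + 1
            }
            var y = x - k
            // Follow the diagonal for free while the elements match.
            while x < n && y < m && a[x] == b[y] {
                x += 1
                y += 1
            }
            v[offset + k] = x
            if x >= n && y >= m {
                return d
            }
        }
    }
    return maxD // Not reached in practice: d == n + m always suffices.
}
```

With the sequences from Myers' paper, `shortestEditDistance(from: Array("ABCABBA"), to: Array("CBABAC"))` returns 5. The LCS length follows from D as (N + M - D) / 2, which is 4 here.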

@horita-yuya
Author

@kelvinlauKL
I've described the difference between my article and LongestCommonSubsequence.
The major difference is computational efficiency.

I also added some explanation of the figure.
Thank you 🙏

@horita-yuya
Author

Please 🙏

@horita-yuya
Author

Is anything else required?

Member

@kelvinlauKL kelvinlauKL left a comment

Sorry for the wait! I took a look at this and ran this code on Swift 5. Runs as expected on Xcode 10.2.

Merging!

@kelvinlauKL kelvinlauKL merged commit c724cdb into kodecocodes:master Mar 10, 2019