Skip to content

RCOCOA-2285: Wire up the new progress notifications #8492

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
May 2, 2024

Conversation

nirinchev
Copy link
Member

@nirinchev nirinchev commented Feb 19, 2024

This is based on a branch that is the result of merging #8539 and master. Will rebase once #8539 is merged. It depends on an unreleased Core version, so most of the workflows will fail.

Fixes #8476

Copy link
Contributor

@dianaafanador3 dianaafanador3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall everything looks pretty good on initial review. I left some comments and some questions. I guess this is missing some more testing but I do think the API is so simple there is not much to test here, but to test it against flexible sync.


Whenever the progress reporting mode is `forCurrentlyOutstandingWork`, that value
will monotonically increase until it reaches 1.0. If the progress mode is `reportIndefinitely`, the
value will monotonically increase until it reaches 1.0, but may subsequently decrease if new server data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if there is a download/upload while another download/upload happens?, wouldn't that mean that in some cases the value will not increase monotonically?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm.. good question, I'll try and investigate this.

@nirinchev nirinchev force-pushed the ni/progress-notifications branch 2 times, most recently from 2c4e1ff to bec6712 Compare April 23, 2024 11:21
@nirinchev nirinchev changed the base branch from master to ni/master+tg-app-services-split April 23, 2024 11:21
@nirinchev nirinchev marked this pull request as ready for review April 23, 2024 15:33
@nirinchev nirinchev force-pushed the ni/progress-notifications branch from f7c035b to 3be4e19 Compare April 29, 2024 10:05
@nirinchev nirinchev changed the base branch from ni/master+tg-app-services-split to tg/app-services-split April 29, 2024 10:05
Base automatically changed from tg/app-services-split to master April 29, 2024 15:07
@nirinchev nirinchev force-pushed the ni/progress-notifications branch from f02a75a to 45667bf Compare April 30, 2024 20:34
@nirinchev nirinchev requested a review from tgoyne April 30, 2024 20:34
Copy link
Contributor

@dianaafanador3 dianaafanador3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Just some minor comments

@nirinchev
Copy link
Member Author

The CI failures seem to be github related (502s).

@nirinchev nirinchev merged commit 452588e into master May 2, 2024
3 checks passed
@nirinchev nirinchev deleted the ni/progress-notifications branch May 2, 2024 10:03
@duncangroenewald
Copy link

We are currently migrating from partition based synced realm to flexible sync realm and the initial client bootstrap code no longer reflects the bootstrap sync progress.

Does this update provide progress notification of the bootstrap progress.

If so do you have any sample code for asyncOpen() progress notifications during client initialisation and what version of RealmSwift is this supported in ?

Can you also update the SDK documentation if this is now supported because there don't appear to be any asyncOpen examples.

Thanks

    func asyncRealmI(completion: @escaping (Realm?, Error?)->Void)->Realm.AsyncOpenTask {
        
        self.setSyncErrorHandler()
        
        let task = Realm.asyncOpen(configuration: self.realmSyncConfig, callbackQueue: .main) { result in
            
            switch result {
            case .failure(let error):
                DebugLog("MongoRealm asyncOpen failed: \(error.localizedDescription)")
                completion(nil, error)
                
            case .success(let realm):
                DebugLog("MongoRealm asyncOpen successfull")
                completion(realm, nil)
                return
                
            }
        }
        
        task.addProgressNotification(block: { syncProgress in
            DispatchQueue.main.async {
                let prog = syncProgress.fractionTransferred * 100.0
                DebugLog("Sync: \(prog.formatted0)%")
                GlobalVars.shared.downloadSyncProgress = prog
            }
        })
        
        return task
    }

@bdkjones
Copy link

@nirinchev I'd also like to confirm that this new progress API reports progress of the BOOTSTRAP process, not just progress of the initial download from the cloud. It's the bootstrapping that takes a lifetime, not the initial download.

@nirinchev
Copy link
Member Author

It does - every time you change the subscriptions or reconnect after being disconnected, you can get progress notifications.

@duncangroenewald
Copy link

@bdkjones - what is the difference between the BOOTSTRAP process and the initial download, seems to be the same to me and it doesn't appear to take any more time that it does with partitioned sync. It just doesn't provide any progress (bytes transferred) - so the user has no idea how long it might take. Ours is around 150mb and takes around 20 seconds depending on the network performance - so long enough to need to display something to the user. Subsequent application restarts the download of new data is usually sub-second.

@bdkjones
Copy link

@duncangroenewald I don't know the technical details. What I can say is that, in my experience, the data gets to the client quickly and then "bootstrapping changesets" takes almost 8 minutes. The app in question is massive: about 1.7 million objects.

As I understand, the bootstrapping sets up the change stream history so that new writes can be properly merged. But I wish this process behaved differently:

FIRST give me the very latest state of the Realm. Let me show that in the UI so the user can start browsing at least. Let the realm be read-only at this point.

THEN bootstrap all the cruft in the background and notify me that writes are now allowed.

Get. Stuff. On. Screen. Should be the priority.

@tgoyne
Copy link
Member

tgoyne commented May 20, 2024

The bootstrapping phase has three steps: the server runs the queries for each of the subscriptions, sends the data to the client, and then the client populates the local Realm. The first two steps happen concurrently (which is why the server can't send precise progress information), but applying the changesets to the local Realm doesn't begin until they've all been downloaded. Currently we do not provide progress information for applying the changesets (1.0 merely indicates that the download is complete). Setting up changestreams and such is not part of the bootstrap step, and the time between the download completing and bootstrap completing is entirely client-side work to get the local Realm into a usable state.

With partition-based sync the download and application steps were interleaved (although they still didn't actually happen in parallel), so the download completing also meant that the downloaded changesets had been applied.

@bdkjones
Copy link

@tgoyne so it sounds like this does not report progress for the third phase? That's really where progress reports are needed because that third phase is what takes a geologic era.

@duncangroenewald
Copy link

@tgoyne - thanks for the explanation - this understanding will be useful when we migrate to flexible sync.

@nirinchev
Copy link
Member Author

The third phase is rather quick. We haven't really seen very slow application times in the wild - it's just writing data that is already available in memory, which compared to downloading the data should be fast.

@duncangroenewald
Copy link

@duncangroenewald I don't know the technical details. What I can say is that, in my experience, the data gets to the client quickly and then "bootstrapping changesets" takes almost 8 minutes. The app in question is massive: about 1.7 million objects.

As I understand, the bootstrapping sets up the change stream history so that new writes can be properly merged. But I wish this process behaved differently:

FIRST give me the very latest state of the Realm. Let me show that in the UI so the user can start browsing at least. Let the realm be read-only at this point.

THEN bootstrap all the cruft in the background and notify me that writes are now allowed.

Get. Stuff. On. Screen. Should be the priority.

Hmm - sounds like you should just be opening the Realm and not waiting for the database to be synced.

We only use openAsync and block waiting for the initial change sets to be loaded when the client is initialised the first time. Every subsequent time the client application starts up it opens the realm without waiting for any sync to complete. Sync can happen in the background.

But - we have yet to fully test Flexible Sync - we don't need it because we sync the entire database anyway.

@bdkjones
Copy link

@duncangroenewald - if I had a dollar for every time someone told me "you should't be opening async" I wouldn't have to spend my time writing software any more.

The alternative is a non-starter. If I just opened the Realm and had nothing in the UI for 8 minutes, the user would start creating the objects that are "missing" and then when the background sync finally finished, I'd have all sorts of duplicates and conflicts to resolve because the "missing" objects were never really missing; they just weren't available yet.

I can't seem to get that point across to anyone. Initial downloads in the background are a non-starter for anything other than simple, single-user apps that have small datasets.

@bdkjones
Copy link

To show you what I see, here's Realm "bootstrapping" the database on the first initial open. As you can see, my network traffic is essentially zero, so the download seems to have completed and Realm is now taking FOREVER to actually integrate/bootstrap everything:

https://www.youtube.com/watch?v=tOIn6r80Tms

It's literally 10+ minutes of "bootstrapping" before the Realm is opened. @tgoyne mentioned that Mongo had a way to cut this by 10-20% but they haven't done it because it's "not material". I beg to disagree. Anything that can be done to speed this up should be done.

@duncangroenewald
Copy link

@duncangroenewald - if I had a dollar for every time someone told me "you should't be opening async" I wouldn't have to spend my time writing software any more.

The alternative is a non-starter. If I just opened the Realm and had nothing in the UI for 8 minutes, the user would start creating the objects that are "missing" and then when the background sync finally finished, I'd have all sorts of duplicates and conflicts to resolve because the "missing" objects were never really missing; they just weren't available yet.

I can't seem to get that point across to anyone. Initial downloads in the background are a non-starter for anything other than simple, single-user apps that have small datasets.

In that case your use case sounds the same as ours - we have to load a whole bunch of base data before the user can do anything. But this is only required the very first time the App is started - and as a result there is no 'merging' going on because the local database is empty - so performance isn't a problem. Maybe 20 seconds to download 150MB - network performance is the main bottleneck.

But every subsequence startup by the application does not use openAsync so the UI is instantly ready. However our use case is such that the changesets, apart from the initial seeding of the local database, are relatively small since they are only changes that have been made by users since the application was last running. Live updates made by other users are downloaded and merged in in the background.

Why wouldn't you adopt the same approach - where only the initial startup of the application uses openAsync so the user needs to wait while the initial data is populated. Subsequently restarts of the application then don't use openAsync because there is already data in the database.

You may have a problem if every subsequent restart of the application needs to download a new copy of all the data before the user can do anything - then you would have little choice but to use openAsync every time the app starts. That's a different use case that what we have.

@bdkjones
Copy link

bdkjones commented May 23, 2024

Why wouldn't you adopt the same approach - where only the initial startup of the application uses openAsync so the user needs to wait while the initial data is populated.

This !!!!--->>> IS <<<---!!!! the initial startup of the app. The video above is what happens during a 100% clean initial launch. No realm files on disk at all, no logged-in user, etc.

Opening the realm on subsequent app launches (after this initial bootstrapping is done) is basically instant.

@duncangroenewald
Copy link

Sorry I missed your post with the video. How many MB is your database ?

Just a thought - are you running in DEBUG mode using SPM ? If so try using the built production binary Realm and RealmSwift frameworks or use a release build to test with.

I found that there is a very big performance hit when running the SPM version of Realm/RealmSwift in debug mode. It's basically unusable because it is so slow.

If you see the same performance issue with your release builds then that's likely not the problem

@bdkjones
Copy link

@duncangroenewald the video is taken from a debug build and I do find that release versions complete this process about 35% faster, but it's still many minutes of bootstrapping even for a release build.

The realm file on disk right after the initial open is 771.8MB, as reported by Finder.

@duncangroenewald
Copy link

@duncangroenewald the video is taken from a debug build and I do find that release versions complete this process about 35% faster, but it's still many minutes of bootstrapping even for a release build.

The realm file on disk right after the initial open is 771.8MB, as reported by Finder.

Interesting - ours is 150mb and about 35 different Object classes with quite complex relationships and takes maybe 20 seconds for the initial downloads. So if it was roughly linear then one might expect yours to be around 2 minutes.

Ours is a desktop application so even if the user has to wait 5 minutes during initial setup it's not the end of the world.

I would think that downloading 5000 changesets has some overhead - even if the actual size of each changeset is quite small - that's still 5000 individual downloads that have to be initiated. Presumably not downloaded concurrently either.

@tgoyne
Copy link
Member

tgoyne commented May 24, 2024

Release builds only being 35% faster is mildly suspicious. One of the "fun" parts about C++ is that idiomatic C++ leans so hard on the optimizer eliminating abstractions that debug builds tend to be very slow and we're definitely not an exception to that. Suggests it might be i/o bound rather than cpu bound for you I guess?

@bdkjones
Copy link

@tgoyne I'm happy to record a new video with the release build, but as you can see in the video above, there is no network I/O during this bootstrapping process. I deleted the entire realm file from disk and performed a clean start of my app. The download portion (where Activity Monitor shows inbound network packets) takes a bit, but shows data coming in. During this time, NO BOOTSTRAPPING OCCURS.

Screenshot 2024-05-23 at 20 10 30

The "bootstrapping" messages don't show up until after the download completes and the network activity drops to ~zero:

Screenshot 2024-05-23 at 20 13 20

Disk IO

If you're talking about disk I/O, that's very unlikely: the machine is an M2 Max 16" MacBook Pro with Apple's internal SSDs that reach 7,000 MB/s. The bootstrapping hardly hits my CPU at all:

Screenshot 2024-05-23 at 20 15 12

The Mac shows about 75-80% idle CPU usage during the entire bootstrapping process, which drags on forever.

@bdkjones
Copy link

Here's another shot of the CPU usage during bootstrapping. If this is something that can be parallelized, it looks like Realm might currently be assuming there's always a max of 4 CPU cores instead of asking the system how many cores it has available (12, in my case)?

Screenshot 2024-05-23 at 20 19 54

@nirinchev
Copy link
Member Author

Hey @bdkjones, the application of changesets taking a long time is unexpected and it's possible you're hitting some pathological case. We'd be willing to investigate that, but this PR is probably the wrong place for it. Can you open a support ticket and share trace-level logs from a clean run of your application. Feel free to reference this ticket to have the support engineer immediately loop in engineering.

@bdkjones
Copy link

@nirinchev absolutely; I'd be happy to do that. I've long suspected this app is an unanticipated edge case. Is there a way to file a support ticket without the Advanced Enterprise Support subscription? I can justify a reasonable one-time support ticket fee to the client, but a permanent subscription is probably off the table (and overkill for our needs). Mongo's support website won't allow me to do anything until I'm subscribed.

@nirinchev
Copy link
Member Author

@bdkjones hm, interesting - I don't know too much about all of that, unfortunately. I've pinged our best support gal and expect she'll get back to me Monday with some instructions for what we can do there.

@duncangroenewald
Copy link

@bdkjones - interesting that you don't see the changesets merging until the network traffic stops. That's not what I thought I observed but I will have to rebuild/load the flexible sync environment to check that again. Other than not reporting download status correctly the time to become available seemed as expected so I didn't pay close attention.

I will try and do that in the next few days to see if the behaviour is different to yours.

Please post the findings of any investigations so we can avoid any potential bottlenecks.

@nirinchev
Copy link
Member Author

@bdkjones if you've never had a support plan, you can activate a 30-day free trial and cancel the subscription before the trial ends. You won't lose access to your ticket after the 30-day trial, though I do hope we'll be able to resolve it faster than that.

@bdkjones
Copy link

bdkjones commented May 29, 2024

@nirinchev I've opened that case and its assigned number is: 01321469

I included a 24-minute walkthrough demoing the issue and showing my code, database structure, etc. Hopefully we can find what the bottleneck is and eliminate it.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jun 28, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for progress notifications during bootstrap
5 participants