Clustering-related fixes. #140

guusdk · 2020-12-03T16:08:06Z

No description provided.

Stopped using deprecated API of PluginManager. Applied various IDE suggestions.

As the senior node is responsible for processing the related cluster events, that node needs to have all available data to operate on. Making the original stanza available allows it to be persisted in the database. This in turn prevents an imprecise reproduction from being used when the corresponding message archive is consulted at some later point.

…chivedMessage (almost) immutable No functional changes intended.

Reduce code complexity by removing code that is not used.

…ceManager PersistenceManager had a lot of definitions that weren't used. They either had no implementation, or the little implementation that was there went unreferenced. This commit removes it all, reducing code complexity by a significant factor. It appears to me that all removed code was replaced by code in the implementation of ConversationManager, although I'm not 100% sure.

Removing code that is not referenced / used anywhere. This reduces complexity, and improves maintainability.

This adds documentation that helps define how ArchivedMessage is to be used.

When a previously archived message is being retrieved, the optional 'stanza' database value is eventually parsed as XMPP. This commit brings this process forward (to a place immediately after the data is obtained from the database), replacing the existing mechanism that did it just before the message was routed to the intended recipient. The benefit is that the code is more fail-fast, and operations that follow the database retrieval can work with a proper stanza instead of its String representation. This brings type-safety. Lastly, there's less ambiguity around where parsing occurs (and at what point the parsing can be expected to have been completed).
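A minimal sketch of the "parse right after retrieval" idea described above. The class and method names are hypothetical, and the JDK's built-in DOM parser stands in for whatever XML library the plugin actually uses; the point is only that a malformed 'stanza' value is rejected at load time and that later code receives a parsed document rather than a String.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical illustration, not the plugin's actual classes.
final class EagerStanzaParsing {

    /** Row holder that only ever exposes the stanza in parsed form. */
    static final class ArchivedRow {
        final long id;
        final Document stanza; // null when the optional 'stanza' column was empty

        ArchivedRow(long id, Document stanza) {
            this.id = id;
            this.stanza = stanza;
        }
    }

    /** Parses the raw column value immediately, so bad data fails here and not at routing time. */
    static ArchivedRow fromDatabase(long id, String rawStanza) {
        if (rawStanza == null || rawStanza.isEmpty()) {
            return new ArchivedRow(id, null);
        }
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document parsed = builder.parse(new InputSource(new StringReader(rawStanza)));
            return new ArchivedRow(id, parsed);
        } catch (Exception e) {
            throw new IllegalArgumentException("Unparseable stanza for archived message " + id, e);
        }
    }
}
```

Callers that receive an ArchivedRow never need to ask whether parsing has happened yet: it either succeeded when the row was loaded or the load failed.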

…rection' The concept of 'direction' of an ArchivedMessage is, to some, counter-intuitive. The mechanism that determines the direction of a particular stanza should be centralized to reduce confusion.
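As an illustration of what a single, central place for the 'direction' decision could look like, here is a small sketch. All names are hypothetical, JIDs are plain strings, and the rule itself is an assumption made for the example: TO means the archive owner sent the message, FROM means the owner received it. Case-insensitive comparison of bare JIDs is a simplification.

```java
// Hypothetical illustration, not the plugin's actual API.
final class DirectionResolver {

    enum Direction { TO, FROM }

    /**
     * Decides the direction from the archive owner's point of view.
     * Assumption for this sketch: TO = sent by the owner, FROM = received by the owner.
     */
    static Direction resolve(String ownerBareJid, String senderJid) {
        return ownerBareJid.equalsIgnoreCase(bare(senderJid)) ? Direction.TO : Direction.FROM;
    }

    /** Strips the resource part ("/phone") from a full JID, if present. */
    static String bare(String jid) {
        int slash = jid.indexOf('/');
        return slash < 0 ? jid : jid.substring(0, slash);
    }
}
```

With one method owning this rule, every caller agrees on what each direction means, which is the confusion the commit sets out to reduce.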

When in 2015 the XEP-0313 implementation was added to the code, this API of ConversationManager got expanded to take in the complete 'stanza' element. The ConversationEvent did not have that, which is why an empty string was used instead. Later (in 2016), the stanza was made available in ConversationEvent, but that empty string never got replaced. This commit fixes that.

When processing an archived message in MAM context, using the 'with' attribute (that would otherwise go unused) to reflect the nickname of the sender of the message allows us to avoid unneeded stanza parsing.

…tanza IDs Code duplication (and thus complexity) is reduced by centralizing where an appropriate stanza ID is obtained.

…ersationEvent

Fishbowler · 2020-12-03T22:02:55Z

So far:

ofMessageArchive now contains all of the stanzas that it should, regardless of type or origin cluster node ("MUC messages duplicated as DMs" #137 and "Stanzas not always stored for one-to-one messages whilst clustered" #138)
I can retrieve an accurate message archive for myself and for the MUC

Todo:

Test variants of MAM to see if "Implementation contains unneeded code" #139 has had any ill effects

…cluster It is important to assign a message ID, which is used for ordering messages in a conversation, soon after the message arrived, as opposed to just before the message gets written to the database. In the latter scenario, the message ID values might no longer reflect the order of the messages in a conversation, as database writes are batched up together for performance reasons. Using these batches won't affect the database-insertion order (as compared to the order of messages in the conversation) on a single Openfire server, but when running in a cluster, these batches do have a good chance to mess up the order of things.
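The ordering argument above can be sketched as follows. This is a simplified, single-process illustration with hypothetical names: an AtomicLong stands in for whatever cluster-wide sequence the server provides, and the "database write" is just a list. What it shows is that an ID taken on arrival still reflects conversation order even when the batch is written in a different order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration, not the plugin's actual archiver.
final class ArrivalOrderIds {

    static final class Archived {
        final long id;
        final String body;

        Archived(long id, String body) {
            this.id = id;
            this.body = body;
        }
    }

    // Stand-in for a cluster-wide sequence (assumption for this sketch).
    private final AtomicLong sequence = new AtomicLong();
    private final List<Archived> queue = new ArrayList<>();

    /** Assigns the ID as soon as the message arrives, then queues it for a later batched write. */
    synchronized Archived onArrival(String body) {
        Archived archived = new Archived(sequence.incrementAndGet(), body);
        queue.add(archived);
        return archived;
    }

    /** Hands over everything queued so far; the caller may write it in any order. */
    synchronized List<Archived> drainBatch() {
        List<Archived> batch = new ArrayList<>(queue);
        queue.clear();
        return batch;
    }

    /** Sorting by ID recovers conversation order regardless of insertion order. */
    static List<Archived> inConversationOrder(List<Archived> rows) {
        List<Archived> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparingLong((Archived a) -> a.id));
        return sorted;
    }
}
```

Had the ID been assigned inside the write step instead, two nodes flushing their batches at different moments could hand out IDs that no longer match the order in which the messages were exchanged.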

These are the new properties and their defaults:
conversation.archiver.conversation.max-work-queue-size default: 500
conversation.archiver.conversation.max-purge-interval default: 1000 (one second)
conversation.archiver.conversation.grace-period default: 50
conversation.archiver.message.max-work-queue-size default: 500
conversation.archiver.message.max-purge-interval default: 1000 (one second)
conversation.archiver.message.grace-period default: 50
conversation.archiver.participant.max-work-queue-size default: 500
conversation.archiver.participant.max-purge-interval default: 1000 (one second)
conversation.archiver.participant.grace-period default: 50
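For illustration, the property names and defaults listed above could be read along these lines. This sketch uses java.util.Properties as a stand-in for the server's own property store, and the class and field names are hypothetical. The unit of the grace period is not stated above, so the sketch keeps it as a plain number.

```java
import java.util.Properties;

// Hypothetical illustration; java.util.Properties stands in for the real property store.
final class ArchiverSettings {

    final int maxWorkQueueSize;
    final long maxPurgeIntervalMillis; // default 1000, i.e. one second
    final long gracePeriod;            // unit not stated in the commit message

    private ArchiverSettings(int maxWorkQueueSize, long maxPurgeIntervalMillis, long gracePeriod) {
        this.maxWorkQueueSize = maxWorkQueueSize;
        this.maxPurgeIntervalMillis = maxPurgeIntervalMillis;
        this.gracePeriod = gracePeriod;
    }

    /** @param kind one of "conversation", "message" or "participant" */
    static ArchiverSettings load(Properties props, String kind) {
        String prefix = "conversation.archiver." + kind + ".";
        return new ArchiverSettings(
            Integer.parseInt(props.getProperty(prefix + "max-work-queue-size", "500")),
            Long.parseLong(props.getProperty(prefix + "max-purge-interval", "1000")),
            Long.parseLong(props.getProperty(prefix + "grace-period", "50")));
    }
}
```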

…gniterealtime#137) With issue igniterealtime#137 fixed, issue igniterealtime#19 can be closed.

Fishbowler · 2020-12-04T22:29:55Z

MAM and RSM tests completed to the same standard as the 2.0.0 release in January (I really must automate these, or at least publish them as examples).

LGTM, but will await @guusdk to check this last change over before merging.

guusdk · 2020-12-07T10:07:21Z

src/java/com/reucon/openfire/plugin/archive/impl/MucMamPersistenceManager.java

@@ -290,7 +290,7 @@ static Long getMessageIdForStableId( final MUCRoom room, final String value )
            connection = DbConnectionManager.getConnection();
            pstmt = connection.prepareStatement( "SELECT messageId, stanza FROM ofMucConversationLog WHERE messageId IS NOT NULL AND roomID=? AND stanza LIKE ? AND stanza LIKE ?" );
            pstmt.setLong( 1, room.getID() );
-            pstmt.setString( 2, "%"+value+"%" );
+            pstmt.setString( 2, "%id=\""+value+"\"%" );


This will fail consistently when ' is used instead of ":

<bar id="foo" />

versus

<bar id='foo' />

Think I've fixed it, but it's untested.

We're well into the territory of database optimization here, which I'm not very experienced with, but my concern is that for most rows (which are not going to match), this would duplicate the text-based search effort. Is that something we should be concerned about?

After chatting to @guusdk, changed this to be more defensive against false positives rather than more precise (and more expensive) in the database query.

…lt ID The implementation looks for a stanza-id and falls back to a database ID. XEP-0313 says a client shouldn't rely on the stanza-id as the UID of a message in the archive, but OF is trending towards making that reliable. In the meantime, the result ID is still the database ID. Without this change, a search for the database ID is performed as free text within the stanza. Because the database ID is a short number in amongst many GUIDs, false positives are more likely than not. This change adds to the defense against false positives and ensures that a match for a stanza-id is more thoroughly checked to be the one we were looking for.
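One way to picture the "defensive" approach is a cheap check in Java on each candidate row that the broad LIKE query returns. The sketch below is an assumption about the general shape, not the code from this pull request: the class and method names are hypothetical, and a regular expression stands in for proper XML inspection. It accepts both quote styles raised in the review and only matches an id attribute on a stanza-id element.

```java
import java.util.regex.Pattern;

// Hypothetical illustration, not the code from this pull request.
final class StanzaIdCheck {

    /**
     * Returns true only if the stanza contains a stanza-id element whose id attribute
     * equals the given value, quoted with either " or '.
     */
    static boolean hasStanzaId(String stanzaXml, String value) {
        Pattern pattern = Pattern.compile(
            "<stanza-id\\b[^>]*\\bid=([\"'])" + Pattern.quote(value) + "\\1");
        return pattern.matcher(stanzaXml).find();
    }
}
```

A row whose stanza merely contains the digits of the value somewhere, for example inside a longer GUID or in the message's own id attribute, is rejected by this check even though a free-text LIKE would have matched it.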

guusdk and others added 15 commits December 1, 2020 17:33

Minor tweaks

94790f3

Stopped using deprecated API of PluginManager. Applied various IDE suggestions.

fixes igniterealtime#139: refactoring: reduce complexity by making Ar…

05bb1f5

…chivedMessage (almost) immutable No functional changes intended.

fixes igniterealtime#139: Refactor: remove unused ArchiveManager

f30b0fd

Reduce code complexity by removing code that is not used.

fixes igniterealtime#139: Removing unused code

93ec9a4

Removing code that is not referenced / used anywhere. This reduces complexity, and improves maintainability.

fixes igniterealtime#139: Removing unused code

cf557a7

Removing code that is not referenced / used anywhere. This reduces complexity, and improves maintainability.

fixes igniterealtime#139: Refactoring: annotating ArchivedMessage

9834cfc

This adds documentation that helps define how ArchivedMessage is to be used.

fixes igniterealtime#139: Refactor: centralize logic to determine 'di…

7160997

…rection' The concept of 'direction' of an ArchivedMessage is, to some, counter-intuitive. The mechanism that determines the direction of a particular stanza should be centralized to reduce confusion.

fixes igniterealtime#139: refactoring: repurposing 'with' in MUC context

bca04d3

When processing an archived message in MAM context, using the 'with' attribute (that would otherwise go unused) to reflect the nickname of the sender of the message allows us to avoid unneeded stanza parsing.

fixes igniterealtime#139: Reduce code duplication for Stable/Unique s…

2bfc5a5

…tanza IDs Code duplication (and thus complexity) is reduced by centralizing where an appropriate stanza ID is obtained.

Add recent changes to changelog

36f58d5

fixes igniterealtime#138: Actually add the 1-to-1 stanzas to the Conv…

4a37114

…ersationEvent

guusdk assigned Fishbowler Dec 3, 2020

guusdk added 2 commits December 4, 2020 11:14

guusdk mentioned this pull request Dec 4, 2020

fix issue 19 #129

Closed

fixes igniterealtime#19: Adding issue 19 to changelog (duplicate of i…

250b6e9

…gniterealtime#137) With issue igniterealtime#137 fixed, issue igniterealtime#19 can be closed.


Fishbowler force-pushed the 137_muc-messages-as-oneonone-rebased branch from 11057c4 to a2f6bb0 Compare December 7, 2020 15:06

Fishbowler force-pushed the 137_muc-messages-as-oneonone-rebased branch from a2f6bb0 to 5822cda Compare December 8, 2020 13:45

guusdk merged commit 5c13f63 into igniterealtime:master Dec 8, 2020

Fishbowler mentioned this pull request Dec 8, 2020

Hide 'stanza reconstruction' behind configuration #117

Open
