
Wagtail search index issue | list index out of range #12996

Open
Aalian123 opened this issue Mar 27, 2025 · 4 comments · May be fixed by #13024
Labels
status:Needs Info Needs more information from the person who reported the issue or opened the PR status:Unconfirmed Issue, usually a bug, that has not yet been validated as a confirmed problem. type:Bug

Comments

@Aalian123

Aalian123 commented Mar 27, 2025

Issue Summary

When adding a locale and synchronizing pages with an existing locale, I receive a "list index out of range" error on some pages, while others are indexed correctly. This started after I updated my Python, Django, and Wagtail versions.

Older versions:
Python: 3.8.9
Django==3.2.13
wagtail==3.0.1

New Version:
Python: 3.9
Django==4.2
wagtail==5.2


Current code is below:
from wagtail.models import Page
from wagtail.fields import StreamField
from wagtail.admin.panels import FieldPanel
from applications.components.plugins import *
from wagtail.api import APIField
from itertools import chain
from taggit.models import TaggedItemBase
from django.db import models
from modelcluster.contrib.taggit import ClusterTaggableManager
from modelcluster.fields import ParentalKey
from wagtail.admin.panels import TabbedInterface, ObjectList
from wagtail.search import index
from wagtail_headless_preview.models import HeadlessMixin
from django.utils.http import urlencode
from ebug.settings.base import FRONTEND_URL

from django.db.models.signals import post_save
from wagtail.search.signal_handlers import post_save_signal_handler

class TaggedPage(TaggedItemBase):
    content_object = ParentalKey(
        to='home.HomePage',
        on_delete=models.CASCADE,
        related_name='tagged_objects'
    )


class HomePage(HeadlessMixin, Page):
    localize_default_translation_mode = "simple"

    content = StreamField([('heading', blocks.CharBlock(form_classname="full title")),
                           ('paragraph', blocks.RichTextBlock()),
                           ('banner_with_static_image', BannerWithStaticImageBlock()),
                           ('banner_with_video', BannerWithVideoBlock()),
                           ('featured_page', FeaturedPagesBlock()),
                           ('game_page', GamePageBlock()),
                           ('highlighted_content', HighlightingContentBlock()),
                           ('image_carousel', ImageCarouselBlock()),
                           ('image_left_text_right', ImageWithSideTextBlock()),
                           ('in_page_banner', InPageBannerBlock()),
                           ('in_page_carousel', InPageCarouselBlock()),
                           ('information_for_leaders', InformationForLeader()),
                           ('individual_lesson', IndividualLessonBlock()),
                           ('lesson_landing_block', LessonsLandingBlock()),
                           ('linked_content_list_block', LinkedContentListBlock()),
                           ('news_box', NewsBoxesBlock()),
                           ('newsletter', NewsletterBlock()),
                           ('partnering_project_item', PartneringProjectBlock()),
                           ('research_item', ResearchItemBlock()),
                           ('split_content', SplitContentBlock()),
                           ('standard_text_full_width', StandardTextFullWidthBlock()),
                           ('standard_text_two_column', StandardTextTwoColumnBlock()),
                           ('standard_text_with_sidebar', StandardTextWithSidebarBlock()),
                           ('tag_container', TagsPageBlock()),
                           ('tabbed_content', TabbedContentBlock()),
                           ('timeline', TimelineBlock()),
                           ('content_with_background', PageWithBackgroundBlock()),
                           ('contact_form', ContactFormBlock()),
                           ('spacing', SpacingBlock()),
                           ('map', MapBlock()),
                           ], use_json_field=True, null=True, blank=True)

    content_panels = Page.content_panels + [
        # FieldPanel('tags'),
        FieldPanel('content'),
    ]

    edit_handler = TabbedInterface([
        ObjectList(content_panels, heading='Content'),
        ObjectList(Page.promote_panels, heading='Promote'),
        ObjectList(Page.settings_panels, heading='Settings', classname="settings"),
    ])

    tags = ClusterTaggableManager(through=TaggedPage, blank=True)
    api_fields = [
        APIField('title'),
        APIField('content'),
        APIField('slug'),
    ]

    search_fields = Page.search_fields + [
        index.SearchField('content')
    ]

    @property
    def get_tags(self):
        tags = []

        for block in self.content or []:
            # For tag_container
            if block.block_type == 'tag_container':
                for item in block.value.get('content', []):
                    raw = item.get('tags', '')
                    if raw:
                        # Remove surrounding quotes, then split on commas
                        cleaned = [t.strip().strip('"') for t in raw.split(',')]
                        tags.append(cleaned)

            # For individual_lesson
            elif block.block_type == 'individual_lesson':
                supporting_materials = block.value.get('supporting_materials', {})
                raw = supporting_materials.get('tags', '')
                if raw:
                    cleaned = [t.strip().strip('"') for t in raw.split(',')]
                    tags.append(cleaned)
        # Flatten everything and remove duplicates
        return list(set(chain.from_iterable(tags)))

    @property
    def searchable_content(self):
        """
        Flatten the stream blocks into a single text string
        that is safe for the indexer.
        """
        text_bits = []
        for block in self.content or []:
            block_text = block.block.get_searchable_content(block.value)
            text_bits.extend(block_text)
        return " ".join(text_bits)

    def save(self, clean=True, user=None, log_action=False, **kwargs):
        post_save.disconnect(
            receiver=post_save_signal_handler,
            sender=HomePage
        )

        try:
            # 2. Perform your save logic
            # If you still want to set tags in one go, do that:
            self.tags.set(self.get_tags)
            super().save(clean, user, log_action, **kwargs)

            try:
                index.insert_or_update_object(self)
                print("Indexing page with id :", self.id)
            except IndexError:
                import logging

                logger = logging.getLogger(__name__)
                logger.warning(f"Skipping indexing for page {self.id} due to index error")
        except Exception:
            self._skip_indexing = True
            super().save(clean, user, log_action, **kwargs)
            del self._skip_indexing
        finally:
            # 3. Reconnect the post_save handler for future saves
            post_save.connect(
                receiver=post_save_signal_handler,
                sender=HomePage
            )
        # for tag in self.get_tags:
        #     self.tags.add(tag)
        # super(HomePage, self).save(clean, user, log_action, **kwargs)

    def get_preview_url(self, token):
        return f"{FRONTEND_URL}{self.locale.language_code}/{self.slug}?" + urlencode({"content_type": self.get_content_type_str(), "token": token})

Older code:
import markupsafe
from wagtail.core.models import Page
from wagtail.core.fields import StreamField
from wagtail.admin.edit_handlers import StreamFieldPanel
from applications.components.plugins import *
from wagtail.api import APIField
from itertools import chain
from taggit.models import TaggedItemBase
from django.db import models
from modelcluster.contrib.taggit import ClusterTaggableManager
from modelcluster.fields import ParentalKey
from wagtail.admin.edit_handlers import TabbedInterface, ObjectList
from wagtail.search import index
from wagtail_headless_preview.models import HeadlessMixin
from django.utils.http import urlencode
from ebug.settings.base import FRONTEND_URL


class TaggedPage(TaggedItemBase):
    content_object = ParentalKey(
        to='home.HomePage',
        on_delete=models.CASCADE,
        related_name='tagged_objects'
    )


class HomePage(HeadlessMixin, Page):
    localize_default_translation_mode = "simple"

    content = StreamField([('heading', blocks.CharBlock(form_classname="full title")),
                           ('paragraph', blocks.RichTextBlock()),
                           ('banner_with_static_image', BannerWithStaticImageBlock()),
                           ('banner_with_video', BannerWithVideoBlock()),
                           ('featured_page', FeaturedPagesBlock()),
                           ('game_page', GamePageBlock()),
                           ('highlighted_content', HighlightingContentBlock()),
                           ('image_carousel', ImageCarouselBlock()),
                           ('image_left_text_right', ImageWithSideTextBlock()),
                           ('in_page_banner', InPageBannerBlock()),
                           ('in_page_carousel', InPageCarouselBlock()),
                           ('information_for_leaders', InformationForLeader()),
                           ('individual_lesson', IndividualLessonBlock()),
                           ('lesson_landing_block', LessonsLandingBlock()),
                           ('linked_content_list_block', LinkedContentListBlock()),
                           ('news_box', NewsBoxesBlock()),
                           ('newsletter', NewsletterBlock()),
                           ('partnering_project_item', PartneringProjectBlock()),
                           ('research_item', ResearchItemBlock()),
                           ('split_content', SplitContentBlock()),
                           ('standard_text_full_width', StandardTextFullWidthBlock()),
                           ('standard_text_two_column', StandardTextTwoColumnBlock()),
                           ('standard_text_with_sidebar', StandardTextWithSidebarBlock()),
                           ('tag_container', TagsPageBlock()),
                           ('tabbed_content', TabbedContentBlock()),
                           ('timeline', TimelineBlock()),
                           ('content_with_background', PageWithBackgroundBlock()),
                           ('contact_form', ContactFormBlock()),
                           ('spacing', SpacingBlock()),
                           ('map', MapBlock()),
                           ], null=True, blank=True)

    content_panels = Page.content_panels + [
        # FieldPanel('tags'),
        StreamFieldPanel('content'),
    ]

    edit_handler = TabbedInterface([
        ObjectList(content_panels, heading='Content'),
        ObjectList(Page.promote_panels, heading='Promote'),
        ObjectList(Page.settings_panels, heading='Settings', classname="settings"),
    ])

    tags = ClusterTaggableManager(through=TaggedPage, blank=True)
    api_fields = [
        APIField('title'),
        APIField('content'),
        APIField('slug'),
    ]

    search_fields = Page.search_fields + [
        index.SearchField('content')
    ]

    @property
    def get_tags(self):
        tags = []
        for block in self.content:
            if block.block_type == 'tag_container':
                tags_ = list(chain.from_iterable([x.get('tags').split(',') for x in block.value.get('content', '')]))
                tags.append(tags_)
            elif block.block_type == 'individual_lesson':
                tags_ = block.value.get('supporting_materials', '').get('tags').split(',')
                tags.append(tags_)
            else:
                continue

        return list(set(chain.from_iterable(tags)))

    @property
    def description(self):
        result = []
        for _ in self.content:
            searchable_content = _.block.get_searchable_content(_.value)
            result.extend(searchable_content)

        return ' '.join(result)

    def save(self, clean=True, user=None, log_action=False, **kwargs):
        for tag in self.get_tags:
            self.tags.add(tag)
        super(HomePage, self).save(clean, user, log_action, **kwargs)

    def get_preview_url(self, token):
        return f"{FRONTEND_URL}{self.locale.language_code}/{self.slug}?" + urlencode({"content_type": self.get_content_type_str(), "token": token})

Any other relevant information. For example, why do you consider this a bug and what did you expect to happen instead?

  • In the older version, everything worked fine, but in the newer version, I encounter errors on the same pages when trying to synchronize a newly created locale with the existing locales.

Technical details

  • Python version: 3.9
  • Django==4.2
  • wagtail==5.2
  • Browser version: Chrome 134


@Aalian123 Aalian123 added status:Unconfirmed Issue, usually a bug, that has not yet been validated as a confirmed problem. type:Bug labels Mar 27, 2025
@zerolab
Contributor

zerolab commented Mar 27, 2025

Can you post the full traceback of the error? This would help determine whether this is a Wagtail or wagtail-localize issue.

@gasman gasman added the status:Needs Info Needs more information from the person who reported the issue or opened the PR label Mar 27, 2025
@Rish-it

Rish-it commented Mar 28, 2025

@Aalian123 Can you provide the exact steps to reproduce this issue? Are you adding the locale through the Wagtail admin or via a script? Also, does this happen for all page types or only specific ones?
And as @zerolab mentioned, can you post the full traceback of the error? This would help determine whether this is a Wagtail core issue or related to Wagtail-localize.
This will help narrow down the issue.

@Aalian123
Author

Aalian123 commented Mar 28, 2025

@zerolab @Rish-it

I am adding the locale through the Wagtail admin, and as far as I know the failures are random. Steps to reproduce are below:

  • Add a new entry to WAGTAIL_CONTENT_LANGUAGES; it will then appear in the dropdown of the Wagtail locale model
  • Click on add new locale, select the new one from the dropdown, and then click on "Synchronise content from another locale"
  • Select an existing locale and click save
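
The first step corresponds to a settings change along the following lines. This is a minimal sketch, assuming English is the existing locale and French is the newly added one; the language codes and labels are illustrative and are not taken from the report.

```python
# settings/base.py (illustrative; the language codes below are assumptions)

# Enable Django's and Wagtail's internationalisation support.
USE_I18N = True
WAGTAIL_I18N_ENABLED = True

# Languages offered in the locale dropdown of the Wagtail admin.
# Adding a new (code, label) pair here makes it selectable as a new locale.
WAGTAIL_CONTENT_LANGUAGES = LANGUAGES = [
    ("en", "English"),  # existing locale (assumed)
    ("fr", "French"),   # newly added locale (assumed)
]
```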

Because content is registered as a search field and contains many different block types, Wagtail tries to index it, and that is where the error occurs. The indexing goes through Wagtail's built-in index.insert_or_update_object(self).
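
Since only some pages fail, it may help to find out exactly which ones. The helper below is a hypothetical diagnostic sketch, not part of Wagtail: it calls an indexing function on each object and collects the failures instead of stopping at the first one. The name `collect_index_failures` is an assumption made for illustration.

```python
import logging

logger = logging.getLogger(__name__)


def collect_index_failures(objects, index_func):
    """Call index_func on each object and return the ones that raise.

    Returns a list of (object, exception) pairs rather than stopping
    at the first failure.
    """
    failures = []
    for obj in objects:
        try:
            index_func(obj)
        except Exception as exc:  # broad on purpose: this is a diagnostic
            logger.warning("Indexing failed for %r: %r", obj, exc)
            failures.append((obj, exc))
    return failures
```

In a Django shell this could be called as `collect_index_failures(HomePage.objects.all(), index.insert_or_update_object)`, assuming the indexing call propagates the exception, as it does in the traceback below.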

Traceback
Exception raised while adding <HomePage: Oral Hygiene> into the 'default' search backend
Traceback (most recent call last):
  File "venv3.9/lib/python3.9/site-packages/wagtail/search/index.py", line 174, in insert_or_update_object
    backend.add(indexed_instance)
  File "venv3.9/lib/python3.9/site-packages/wagtail/search/backends/database/postgres/postgres.py", line 774, in add
    self.get_index_for_object(obj).add_item(obj)
  File "venv3.9/lib/python3.9/site-packages/wagtail/search/backends/database/postgres/postgres.py", line 238, in add_item
    self.add_items(obj._meta.model, [obj])
  File "venv3.9/lib/python3.9/site-packages/wagtail/search/backends/database/postgres/postgres.py", line 353, in add_items
    update_method(content_type_pk, indexers)
  File "venv3.9/lib/python3.9/site-packages/wagtail/search/backends/database/postgres/postgres.py", line 282, in add_items_upsert
    cursor.execute(
  File "venv3.9/lib/python3.9/site-packages/django/db/backends/utils.py", line 102, in execute
    return super().execute(sql, params)
  File "venv3.9/lib/python3.9/site-packages/django/db/backends/utils.py", line 67, in execute
    return self._execute_with_wrappers(
  File "venv3.9/lib/python3.9/site-packages/django/db/backends/utils.py", line 80, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "venv3.9/lib/python3.9/site-packages/django/db/backends/utils.py", line 89, in _execute
    return self.cursor.execute(sql, params)
IndexError: list index out of range
Internal Server Error: /admin/locales/new/
Traceback (most recent call last):
  File "venv3.9/lib/python3.9/site-packages/django/core/handlers/exception.py", line 55, in inner
    response = get_response(request)
  File "venv3.9/lib/python3.9/site-packages/django/core/handlers/base.py", line 197, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "venv3.9/lib/python3.9/site-packages/django/views/decorators/cache.py", line 62, in _wrapper_view_func
    response = view_func(request, *args, **kwargs)
  File "venv3.9/lib/python3.9/site-packages/wagtail/admin/urls/__init__.py", line 173, in wrapper
    return view_func(request, *args, **kwargs)
  File "venv3.9/lib/python3.9/site-packages/wagtail/admin/auth.py", line 167, in decorated_view
    response = view_func(request, *args, **kwargs)
  File "venv3.9/lib/python3.9/site-packages/django/views/generic/base.py", line 104, in view
    return self.dispatch(request, *args, **kwargs)
  File "venv3.9/lib/python3.9/site-packages/wagtail/admin/views/generic/permissions.py", line 32, in dispatch
    return super().dispatch(request, *args, **kwargs)
  File "venv3.9/lib/python3.9/site-packages/wagtail/admin/views/generic/mixins.py", line 91, in dispatch
    return super().dispatch(*args, **kwargs)
  File "venv3.9/lib/python3.9/site-packages/django/views/generic/base.py", line 143, in dispatch
    return handler(request, *args, **kwargs)
  File "venv3.9/lib/python3.9/site-packages/wagtail_localize/locales/views.py", line 114, in post
    return self.form_valid(form)
  File "/home/aalian/.pyenv/versions/3.9.21/lib/python3.9/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "venv3.9/lib/python3.9/site-packages/wagtail_localize/locales/views.py", line 121, in form_valid
    self.components.save(self.object)
  File "venv3.9/lib/python3.9/site-packages/wagtail_localize/locales/views.py", line 69, in save
    component_instance.save()
  File "venv3.9/lib/python3.9/site-packages/django/db/models/base.py", line 814, in save
    self.save_base(
  File "venv3.9/lib/python3.9/site-packages/django/db/models/base.py", line 892, in save_base
    post_save.send(
  File "venv3.9/lib/python3.9/site-packages/django/dispatch/dispatcher.py", line 176, in send
    return [
  File "venv3.9/lib/python3.9/site-packages/django/dispatch/dispatcher.py", line 177, in <listcomp>
    (receiver, receiver(signal=self, sender=sender, **named))
  File "venv3.9/lib/python3.9/site-packages/wagtail_localize/models.py", line 2347, in sync_trees_on_locale_sync_save
    instance.sync_trees()
  File "venv3.9/lib/python3.9/site-packages/wagtail_localize/models.py", line 2338, in sync_trees
    background.enqueue(
  File "venv3.9/lib/python3.9/site-packages/wagtail_localize/tasks.py", line 24, in enqueue
    func(*args, **kwargs)
  File "venv3.9/lib/python3.9/site-packages/wagtail_localize/synctree.py", line 195, in synchronize_tree
    source_page.copy_for_translation(
  File "venv3.9/lib/python3.9/site-packages/wagtail/models/__init__.py", line 2381, in copy_for_translation
    return CopyPageForTranslationAction(
  File "venv3.9/lib/python3.9/site-packages/wagtail/actions/copy_for_translation.py", line 153, in execute
    translated_page = self._copy_for_translation(
  File "/home/aalian/.pyenv/versions/3.9.21/lib/python3.9/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "venv3.9/lib/python3.9/site-packages/wagtail/actions/copy_for_translation.py", line 119, in _copy_for_translation
    return page.create_alias(
  File "venv3.9/lib/python3.9/site-packages/wagtail/models/__init__.py", line 2362, in create_alias
    return CreatePageAliasAction(
  File "venv3.9/lib/python3.9/site-packages/wagtail/actions/create_alias.py", line 258, in execute
    return self._create_alias(
  File "venv3.9/lib/python3.9/site-packages/wagtail/actions/create_alias.py", line 171, in _create_alias
    alias = parent.add_child(instance=alias)
  File "venv3.9/lib/python3.9/site-packages/treebeard/mp_tree.py", line 1091, in add_child
    return MP_AddChildHandler(self, **kwargs).process()
  File "venv3.9/lib/python3.9/site-packages/treebeard/mp_tree.py", line 393, in process
    newobj.save()
  File "home/models.py", line 133, in save
    index.insert_or_update_object(self)
  File "venv3.9/lib/python3.9/site-packages/wagtail/search/index.py", line 174, in insert_or_update_object
    backend.add(indexed_instance)
  File "venv3.9/lib/python3.9/site-packages/wagtail/search/backends/database/postgres/postgres.py", line 774, in add
    self.get_index_for_object(obj).add_item(obj)
  File "venv3.9/lib/python3.9/site-packages/wagtail/search/backends/database/postgres/postgres.py", line 238, in add_item
    self.add_items(obj._meta.model, [obj])
  File "venv3.9/lib/python3.9/site-packages/wagtail/search/backends/database/postgres/postgres.py", line 353, in add_items
    update_method(content_type_pk, indexers)
  File "venv3.9/lib/python3.9/site-packages/wagtail/search/backends/database/postgres/postgres.py", line 282, in add_items_upsert
    cursor.execute(
  File "venv3.9/lib/python3.9/site-packages/django/db/backends/utils.py", line 102, in execute
    return super().execute(sql, params)
  File "venv3.9/lib/python3.9/site-packages/django/db/backends/utils.py", line 67, in execute
    return self._execute_with_wrappers(
  File "venv3.9/lib/python3.9/site-packages/django/db/backends/utils.py", line 80, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "venv3.9/lib/python3.9/site-packages/django/db/backends/utils.py", line 89, in _execute
    return self.cursor.execute(sql, params)
IndexError: list index out of range

@Rish-it

Rish-it commented Apr 7, 2025

Based on the error stack trace and the code I've examined, ObjectIndexer.as_vector() returns EMPTY_VECTOR when there are no texts to index, which causes a mismatch in list lengths during the INSERT query in prepare_value(), leading to a broken SQL statement.

Probable fix: ensure all lists are of equal length to prevent the IndexError, adding empty SQL entries if needed.
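
The failure mode described in this comment can be illustrated in plain Python. This is a sketch only, not Wagtail's backend code: `build_rows`, `pad_to`, and the `EMPTY_VECTOR` stand-in value are hypothetical names chosen for illustration. It shows how indexing parallel lists of unequal length raises an IndexError, and how padding the shorter list avoids it.

```python
EMPTY_VECTOR = "''"  # hypothetical stand-in for an empty search vector


def build_rows(ids, titles, bodies):
    """Combine three parallel lists into rows, indexing by position."""
    return [(ids[i], titles[i], bodies[i]) for i in range(len(ids))]


def pad_to(values, length, fill=EMPTY_VECTOR):
    """Return a copy of values extended with fill up to the given length."""
    return list(values) + [fill] * (length - len(values))


ids = [1, 2, 3]
titles = ["'home'", "'about'", "'contact'"]
bodies = ["'welcome'", "'team'"]  # one entry short: the third page had no text

try:
    build_rows(ids, titles, bodies)
except IndexError as exc:
    # The unpadded lists fail as soon as the short list is exhausted.
    print("unpadded:", exc)

# Padding the short list to the common length lets every row be built.
rows = build_rows(ids, titles, pad_to(bodies, len(ids)))
print(rows[-1])
```

Whether padding with an empty value is the right behaviour for the real backend is a separate question; the sketch only demonstrates why equal lengths matter.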

Rish-it pushed a commit to Rish-it/wagtail that referenced this issue Apr 9, 2025
… by testing various scenarios with unequal list lengths that previously caused IndexError.
Rish-it pushed a commit to Rish-it/wagtail that referenced this issue Apr 12, 2025
Rish-it pushed a commit to Rish-it/wagtail that referenced this issue Apr 12, 2025