Problem getting list of "wanted" items when there are none #92

Closed
rwv37 opened this issue Jan 25, 2022 · 4 comments

rwv37 commented Jan 25, 2022

I'm trying to get Wantedtemplates, Wantedpages, etc. It's working fine for me for some of them, but an exception is being thrown by others (from within WikiClientLibrary). I believe I have narrowed it down to "works fine" = "there are some such wanted things" and "exception thrown" = "there are no such wanted things".

I'm not sure if this is because I'm doing something wrong or if it's perhaps a bug in WCL or in something at a deeper level than that. If it's not simply because I'm doing something wrong, I suspect it might have something to do with the issue outlined here: Apparently "HasValues" should be called on a Newtonsoft object before trying to access a child object. Disclaimer: I know nothing about Newtonsoft beyond what it is and that it's popular.

Am I doing something wrong? If not, is there a workaround for this? Any help would be appreciated.

I am getting the error when I try to do an "await foreach" on the items, and also if I simply try to get the count of them.

Here's the part of my code that's failing:

    private static async Task AddToDictionaryAsync
        (Dictionary<string, WikiPage> dictionary, 
        Func<CancellationToken, Task<IAsyncEnumerable<WikiPage>>> itemsToAdd,
        CancellationToken cancellationToken)
    {
        var items = await itemsToAdd(cancellationToken).ConfigureAwait(false);

        // EXCEPTION IS THROWN HERE
        var count = await items.CountAsync().ConfigureAwait(false);

        Console.WriteLine($"{count} items");

        // EXCEPTION IS THROWN HERE (if I comment out the "CountAsync" call above)
        await foreach (var item in items.WithCancellation(cancellationToken).ConfigureAwait(false))
        {
            dictionary[item.Title] = item;
        }
    }

And here's the exception (in the "CountAsync" case):

fail: Microsoft.Extensions.Hosting.Internal.Host[9]
      BackgroundService failed
      System.InvalidOperationException: Cannot access child value on Newtonsoft.Json.Linq.JValue.
         at Newtonsoft.Json.Linq.JToken.get_Item(Object key)
         at WikiClientLibrary.Pages.WikiPage.<>c.<FromJsonQueryResult>b__0_0(JProperty page)
         at System.Linq.EnumerableSorter`2.ComputeKeys(TElement[] elements, Int32 count)
         at System.Linq.EnumerableSorter`1.ComputeMap(TElement[] elements, Int32 count)
         at System.Linq.EnumerableSorter`1.Sort(TElement[] elements, Int32 count)
         at System.Linq.OrderedEnumerable`1.GetEnumerator()+MoveNext()
         at System.Linq.Enumerable.SelectIPartitionIterator`2.ToList()
         at WikiClientLibrary.Pages.WikiPage.FromJsonQueryResult(WikiSite site, JObject jpages, IWikiPageQueryProvider options)
         at WikiClientLibrary.Generators.Primitive.WikiPageGenerator`1.<>c__DisplayClass8_0.<EnumPagesAsync>b__1(JObject jquery)
         at System.Linq.AsyncEnumerable.SelectManyAsyncIterator`2.MoveNextCore()
         at System.Linq.AsyncIteratorBase`1.MoveNextAsync() in d:\a\1\s\Ix.NET\Source\System.Linq.Async\System\Linq\AsyncIterator.cs:line 70
         at System.Linq.AsyncIteratorBase`1.MoveNextAsync() in d:\a\1\s\Ix.NET\Source\System.Linq.Async\System\Linq\AsyncIterator.cs:line 75
         at System.Linq.AsyncEnumerablePartition`1.SkipAndCountAsync(UInt32 index, IAsyncEnumerator`1 en) in d:\a\1\s\Ix.NET\Source\System.Linq.Async\System\Linq\AsyncEnumerablePartition.cs:line 377
         at System.Linq.AsyncEnumerablePartition`1.<>c__DisplayClass11_0.<<GetCountAsync>g__Core|0>d.MoveNext() in d:\a\1\s\Ix.NET\Source\System.Linq.Async\System\Linq\AsyncEnumerablePartition.cs:line 95
      --- End of stack trace from previous location ---
         at System.Linq.AsyncEnumerablePartition`1.<>c__DisplayClass11_0.<<GetCountAsync>g__Core|0>d.MoveNext() in d:\a\1\s\Ix.NET\Source\System.Linq.Async\System\Linq\AsyncEnumerablePartition.cs:line 101
      --- End of stack trace from previous location ---
         at Rwv37.MediaWiki.Api.WikiSiteExtensions.AddToDictionaryAsync(Dictionary`2 dictionary, Func`2 itemsToAdd, CancellationToken cancellationToken) in C:\Users\bob\Bob\trunk\Dev\DotNet\Rwv37\MediaWiki\Rwv37.MediaWiki.Api\WikiSiteExtensions.cs:line 112
         at Rwv37.MediaWiki.Api.WikiSiteExtensions.GetAllWantedAsync(WikiSite site, CancellationToken cancellationToken) in C:\Users\bob\Bob\trunk\Dev\DotNet\Rwv37\MediaWiki\Rwv37.MediaWiki.Api\WikiSiteExtensions.cs:line 38
         at Rwv37.MediaWiki.SiteSetup.SiteSetupService.DoThatFunkyThingAsync(CancellationToken stoppingToken) in C:\Users\bob\Bob\trunk\Dev\DotNet\Rwv37\MediaWiki\Rwv37.MediaWiki.SiteSetup\SiteSetupService.cs:line 40
         at Rwv37.MediaWiki.SiteSetup.SiteSetupService.ExecuteAsync(CancellationToken stoppingToken) in C:\Users\bob\Bob\trunk\Dev\DotNet\Rwv37\MediaWiki\Rwv37.MediaWiki.SiteSetup\SiteSetupService.cs:line 30
         at Microsoft.Extensions.Hosting.Internal.Host.TryExecuteBackgroundServiceAsync(BackgroundService backgroundService)

rwv37 commented Jan 25, 2022

> If not, is there a workaround for this?

I can just catch the System.InvalidOperationException and return, but I mean something more natural.

CXuesong (Owner) commented

I think this error occurs because page.Value is a JValue (not a JObject) on L31, the OrderBy line in the snippet below:

/// </summary>
/// <param name="site">A <see cref="Site"/> object.</param>
/// <param name="jpages">The <c>[root].qurey.pages</c> node value object of JSON result.</param>
/// <param name="options"></param>
/// <returns>Retrieved pages.</returns>
internal static IList<WikiPage> FromJsonQueryResult(WikiSite site, JObject jpages, IWikiPageQueryProvider options)
{
    if (site == null) throw new ArgumentNullException(nameof(site));
    if (jpages == null) throw new ArgumentNullException(nameof(jpages));
    // If query.pages.xxx.index exists, sort the pages by the given index.
    // This is specifically used with SearchGenerator, to keep the search result in order.
    // For other generators, this property simply does not exist.
    // See https://www.mediawiki.org/wiki/API_talk:Query#On_the_order_of_titles_taken_out_of_generator .
    return jpages.Properties().OrderBy(page => (int?)page.Value["index"])
        .Select(page =>
        {
            var newInst = new WikiPage(site, 0);
            MediaWikiHelper.PopulatePageFromJson(newInst, (JObject)page.Value, options);
            return newInst;
        }).ToList();
}

I'm not sure how this happens. Could you share the code that produces the IAsyncEnumerable<WikiPage> returned by the itemsToAdd delegate? Which wiki site are you querying? Is it a public wiki?
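
The exception at the top of the stack trace can be reproduced with Newtonsoft.Json alone, which may help when narrowing the problem down. The sketch below indexes a scalar JValue the same way the OrderBy key selector does, then shows a type check that avoids the throw. The class name JValueIndexDemo and the helper TryGetIndex are hypothetical names for illustration only. They are not part of WikiClientLibrary, and this is not necessarily how the library should be fixed.

```csharp
using System;
using Newtonsoft.Json.Linq;

public static class JValueIndexDemo
{
    // Hypothetical helper: reads the "index" child only when the token is a
    // JSON object; any other token kind (e.g. a scalar JValue) yields null.
    public static int? TryGetIndex(JToken token)
    {
        return token is JObject obj ? (int?)obj["index"] : null;
    }

    public static void Main()
    {
        JToken scalar = new JValue("not an object");

        try
        {
            // Same access pattern as page.Value["index"] in the snippet above.
            _ = scalar["index"];
        }
        catch (InvalidOperationException ex)
        {
            // Indexing a JValue throws instead of returning null.
            Console.WriteLine(ex.Message);
        }

        // The guarded version returns null for the scalar and 3 for the object.
        Console.WriteLine(TryGetIndex(scalar) is null);
        Console.WriteLine(TryGetIndex(JObject.Parse("{\"index\": 3}")));
    }
}
```

Whether the server response really contains a scalar where an object is expected in the empty-list case is an assumption here. The thread does not include the raw JSON.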

rwv37 commented Feb 7, 2022

Sorry for the late reply. Unfortunately, the exact code that was giving this exact error is gone. However, I later worked around a seemingly similar error in the same way as I previously worked around the earlier error, and I can show you the new one:

Differences between then and now

Difference in behavior

The difference in behavior is that whereas in the previous case, an exception was being thrown, in this case, it behaves as if the await foreach on the IAsyncEnumerable never finishes. To be clear, just like in the original report, the buggy behavior happens if and only if there are no pages to be returned.

Difference in code

I don't know exactly what the difference in my code is that led to this change in behavior, but I suspect that it's this: Previously (when the problem was that an exception was being thrown), I was doing something like...

// generator is a WikiClientLibrary.Generators.QueryPageGenerator
return generator.EnumPagesAsync().Take(100);

... then doing an await on that return value, and then an await foreach on the awaited return. Whereas the new problem is occurring while I am instead doing:

// generator is still a WikiClientLibrary.Generators.QueryPageGenerator
var pages = new KludgyWikiPageEnumerable(generator.EnumPagesAsync());
await foreach (var page in pages.ConfigureAwait(false).WithCancellation(cancellationToken))
{
   yield return page;
}

... and then doing an await foreach directly on the awaited return. So I'm guessing that maybe the change in behavior comes down to using Take() versus using yield return. I don't know, though.

Workaround

I worked around both this error and the previous one by creating the following two classes:

Class KludgyWikiPageEnumerable

using System.Collections.Generic;
using System.Threading;
using WikiClientLibrary.Pages;

namespace Rwv37.MediaWiki.Api
{
    /// <summary>
    /// <para>
    /// A kludgy class to deal with an apparent issue in WikiClientLibrary
    /// that happens when you attempt to retrieve a list that has no members
    /// (like, you try to get the items that are on Special:WantedFiles, but
    /// there are no such items).
    /// </para><para>
    /// See: <a href="https://github.com/CXuesong/WikiClientLibrary/issues/92">Problem getting list of "wanted" items when there are none</a>
    /// </para>
    /// </summary>
    internal class KludgyWikiPageEnumerable : IAsyncEnumerable<WikiPage>
    {
        private IAsyncEnumerable<WikiPage> WikiPages { get; init; }

        /// <summary>
        /// Initializes a new instance of the <see cref="T:Rwv37.MediaWiki.Api.KludgyWikiPageEnumerable" /> class.
        /// </summary>
        /// <param name="wikiPages">
        /// The wiki pages.
        /// </param>
        internal KludgyWikiPageEnumerable(IAsyncEnumerable<WikiPage> wikiPages)
        {
            this.WikiPages = wikiPages;
        }

        /// <summary>
        /// Returns an enumerator that iterates asynchronously through the collection.
        /// </summary>
        /// <param name="cancellationToken">
        /// A <see cref="T:System.Threading.CancellationToken">CancellationToken</see>
        /// that may be used to cancel the asynchronous iteration.
        /// </param>
        /// <returns>
        /// An enumerator that can be used to iterate asynchronously through the collection.
        /// </returns>
        public IAsyncEnumerator<WikiPage> GetAsyncEnumerator(
            CancellationToken cancellationToken = default)
        {
            return new KludgyWikiPageEnumerator(
                this.WikiPages.GetAsyncEnumerator(cancellationToken),
                this.WikiPages.GetAsyncEnumerator(cancellationToken));
        }
    }
}

Class KludgyWikiPageEnumerator

using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using WikiClientLibrary.Pages;

namespace Rwv37.MediaWiki.Api
{
    /// <summary>
    /// <para>
    /// A kludgy class to deal with an apparent issue in WikiClientLibrary
    /// that happens when you attempt to retrieve a list that has no members
    /// (like, you try to get the items that are on Special:WantedFiles, but
    /// there are no such items).
    /// </para><para>
    /// See: <a href="https://github.com/CXuesong/WikiClientLibrary/issues/92">Problem getting list of "wanted" items when there are none</a>
    /// </para>
    /// </summary>
    internal class KludgyWikiPageEnumerator : IAsyncEnumerator<WikiPage>
    {
        private IAsyncEnumerator<WikiPage> UnderlyingReal { get; init; }
        private IAsyncEnumerator<WikiPage> UnderlyingTester { get; init; }
        private bool WackinessTested { get; set; } = false;
        private bool IsWacky { get; set; } = false;

        /// <summary>
        /// Initializes a new instance of the <see cref="T:Rwv37.MediaWiki.Api.KludgyWikiPageEnumerator" /> class.
        /// </summary>
        /// <param name="underlyingReal">
        /// The underlying enumerator to "really" use.
        /// </param>
        /// <param name="underlyingTester">
        /// The underlying enumerator to use to test if we're gonna have a problem.
        /// </param>
        internal KludgyWikiPageEnumerator(IAsyncEnumerator<WikiPage> underlyingReal,
            IAsyncEnumerator<WikiPage> underlyingTester)
        {
            this.UnderlyingReal = underlyingReal;
            this.UnderlyingTester = underlyingTester;
        }

        private async ValueTask<bool> TestWackinessAsync(CancellationToken cancellationToken = default)
        {
            try
            {
                if (!this.WackinessTested)
                {
                    _ = await this.UnderlyingTester.MoveNextAsync(cancellationToken).ConfigureAwait(false);
                    this.IsWacky = false;
                }
            }
            catch (InvalidOperationException)
            {
                this.IsWacky = true;
            }
            finally
            {
                this.WackinessTested = true;
            }

            return this.IsWacky;
        }

        /// <summary>
        /// Gets the element in the collection at the current position of the enumerator.
        /// </summary>
        public WikiPage Current
        {
            get
            {
                return this.UnderlyingReal.Current;
            }
        }

        /// <summary>
        /// Dispose as an asynchronous operation.
        /// </summary>
        /// <returns>
        /// A Task&lt;ValueTask&gt; representing the asynchronous operation.
        /// </returns>
        [System.Diagnostics.CodeAnalysis.SuppressMessage(
            "IDisposableAnalyzers.Correctness",
            "IDISP007:Don't dispose injected",
            Justification = "Uh... I think this is cool? Errr... or \"hope\", at least?")]
        public async ValueTask DisposeAsync()
        {
            await this.UnderlyingReal.DisposeAsync().ConfigureAwait(false);
            await this.UnderlyingTester.DisposeAsync().ConfigureAwait(false);
        }

        /// <summary>
        /// Move next as an asynchronous operation.
        /// </summary>
        /// <returns>
        /// A Task&lt;System.Boolean&gt; representing the asynchronous operation.
        /// </returns>
        public async ValueTask<bool> MoveNextAsync()
        {
            if (!this.WackinessTested)
            {
                _ = await this.TestWackinessAsync().ConfigureAwait(false);
            }

            return !this.IsWacky && await this.UnderlyingReal.MoveNextAsync().ConfigureAwait(false);
        }
    }
}

Usage

Usage is just wrapping the IAsyncEnumerable returned by WikiClientLibrary in a KludgyWikiPageEnumerable. For example, the following code works fine, but before I wrapped the generator.EnumPagesAsync() within a new KludgyWikiPageEnumerable(), my program was acting as if the await foreach on the IAsyncEnumerable never completed:

public async IAsyncEnumerable<WikiPage> QueryAsync(
    int paginationSize = 100,
    [EnumeratorCancellation] CancellationToken cancellationToken = default)
{
    // TODO: Think about pagination size

    var generator = new QueryPageGenerator(this.Site, this.QueryPageName)
    {
        PaginationSize = paginationSize,
    };

    var pages = new KludgyWikiPageEnumerable(generator.EnumPagesAsync());
    await foreach (var page in pages.ConfigureAwait(false).WithCancellation(cancellationToken))
    {
        yield return page;
    }
}
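
For comparison, the same idea can be sketched with a single underlying enumerator and an async iterator, so the source is not enumerated twice. This is only a sketch under the same premise as the classes above: that an InvalidOperationException from the first MoveNextAsync means "the list is empty". The names EmptyListWorkaround, EmptyIfFirstMoveThrows and FailingSource are hypothetical. FailingSource merely stands in for the WikiClientLibrary generator so the example runs on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

public static class EmptyListWorkaround
{
    // Hypothetical wrapper: treats an InvalidOperationException thrown by the
    // FIRST MoveNextAsync as an empty sequence; later exceptions propagate.
    public static async IAsyncEnumerable<T> EmptyIfFirstMoveThrows<T>(
        this IAsyncEnumerable<T> source,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        await using var enumerator = source.GetAsyncEnumerator(cancellationToken);

        bool hasItem;
        try
        {
            hasItem = await enumerator.MoveNextAsync().ConfigureAwait(false);
        }
        catch (InvalidOperationException)
        {
            // Assumption: this is the "no wanted items" case from the issue.
            hasItem = false;
        }

        while (hasItem)
        {
            yield return enumerator.Current;
            hasItem = await enumerator.MoveNextAsync().ConfigureAwait(false);
        }
    }

    // Stand-in for the library generator: throws on the first move when asked.
    public static async IAsyncEnumerable<int> FailingSource(bool fail = true)
    {
        await Task.Yield();
        if (fail)
        {
            throw new InvalidOperationException("simulated empty-list failure");
        }

        yield return 1;
    }

    public static async Task Main()
    {
        var count = 0;
        await foreach (var item in FailingSource().EmptyIfFirstMoveThrows())
        {
            count++;
        }

        Console.WriteLine($"{count} items"); // prints "0 items"
    }
}
```

Like the two-enumerator version, this would also hide an unrelated InvalidOperationException raised on the first move, so it is a stopgap rather than a fix.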

CXuesong self-assigned this Feb 8, 2022
CXuesong added this to the v0.8.0 milestone Feb 8, 2022

CXuesong commented Feb 9, 2022

Actually, QueryPageGenerator.EnumItemsAsync has never worked before... Please try the latest release instead.

Released v0.8.0-int.6.
