admin管理员组

文章数量:1406943

From within my ASP.NET application (.NET Framework 4.8) I invoke an external process via Anonymous Pipes. Theoretically, it could happen that the external process hangs (for whatever reason) when my application writes to or reads from the stream. Hence I try to implement a timeout mechanism so that my application does not hang as well:

public static class TaskExtensions
{
    public static async Task<T> WithTimeout<T>([NotNull] this Task<T> task, int millisecondsTimeout, [CanBeNull] Func<T> onTimeout = null)
    {
        using (var timeoutCancellation = new CancellationTokenSource())
        {
            var delayTask = Task.Delay(millisecondsTimeout, timeoutCancellation.Token);
            var completedTask = await Task.WhenAny(task, delayTask).ConfigureAwait(false);

            if (completedTask == task)
            {
                // original task completed
                timeoutCancellation.Cancel();
                return await task;
            }

            // timeout
            return onTimeout != null ? onTimeout() : throw new TimeoutException();
        }
    }
}

Whenever I debug my application and execute this code everything works like a charm. Then I created some tests which at first looked fine, too. But when I ran them some more times, the test became flaky... Sometimes they succeed, sometimes not...

[TestFixture]
public class TaskExtensionsTests
{
    [Repeat(25)]
    [TestCase(10, 100, false)]
    [TestCase(100, 10, true)]
    public async Task WithTimeout(int longRunningTaskInMs, int waitTimeoutInMs, bool expectedWaitCallbackFired)
    {
        var timeOutCallbackFired = false;
        var actual = await Task.Run(async () =>
        {
            Console.WriteLine($"{DateTime.Now:O} long running task started...");
            await Task.Delay(longRunningTaskInMs);
            Console.WriteLine($"{DateTime.Now:O} long running task finished.");
            return "returned in time";
        }).WithTimeout(waitTimeoutInMs, () =>
        {
            Console.WriteLine("timeout!");
            timeOutCallbackFired = true;
            return "fired";
        });

        Assert.That(actual, Is.EqualTo(expectedWaitCallbackFired ? "fired" : "returned in time"));
        Assert.That(timeOutCallbackFired, Is.EqualTo(expectedWaitCallbackFired));
    }
}

You can see, that I am testing two scenarios:

  • The API-call (longRunningTaskInMs) does not hang an returns rapidly (10ms) with a "large" waitTimeoutInMs of 100ms. In this case I expect that no timeout occurs.
  • The API-call (longRunningTaskInMs) hangs for 100ms and therefore exceeds the short waitTimeoutInMs of 10ms. In this case I expect a timeout to occur.

On my development machine which is quite performant I never saw these tests failing. But after I pushed the code and my CI server (on-prem TeamCity) started doing its job I noticed the flakiness. The build agents are way less performent.

I then locally lowered the values from 10/100ms and 100/10ms to 10/25ms and 25/10ms. After doing that the test becomes flaky on my development machine as well. But I think the reason why the tests are now failing is something else. Namely the Windows clock resolution of approx 15.625ms. Therefore, in my opinion, it is not surprising that the tests with these low values become unreliable.

What I can't explain, or rather my real question, is why the tests on the build agents with the 10/100 and 100/10 values fail?! These values are 90ms apart, which is like an eternity...

I created a .NET fiddle to easily reproduce the issue: If you run the fiddle multiple times the tests will eventually fail. Please note: For the fiddle I used .NET 9 instead of .NET Fx 4.8 and MSTest instead of NUnit. But this seems to be irrelevant since the tests are flaky with these components as well...


EDIT:

In the meantime I tried Stephen's suggestion to use TimeProvider for a more precise timing. This uses a high-performance time stamp based on Stopwatch.

My extension method no looks like this:

public static async Task<T> WithTimeoutWaitAsync<T>(this Task<T> task, int millisecondsTimeout, Func<T> onTimeout = null)
{
    using (var timeoutCancellation = new CancellationTokenSource())
    {
        try
        {
            return await task
                .WaitAsync(TimeSpan.FromMilliseconds(millisecondsTimeout), TimeProvider.System, timeoutCancellation.Token)
                .ConfigureAwait(false);
        }
        catch (TimeoutException timeoutException)
        {
            return onTimeout != null ? onTimeout() : throw timeoutException;
        }
    }
}

I curiously pushed this code and at first it looked very, very promising. Multiple TeamCity builds succeeded. But then... Every now and then single builds failed the same way as before

本文标签: